WorldWideScience

Sample records for genomic study genes

  1. A salmonid EST genomic study: genes, duplications, phylogeny and microarrays

    Directory of Open Access Journals (Sweden)

    Brahmbhatt Sonal

    2008-11-01

    Full Text Available. Abstract. Background: Salmonids are of interest because of their relatively recent genome duplication, and their extensive use in wild fisheries and aquaculture. A comprehensive gene list and a comparison of genes in some of the different species provide valuable genomic information for one of the most widely studied groups of fish. Results: 298,304 expressed sequence tags (ESTs) from Atlantic salmon (69% of the total), 11,664 chinook, 10,813 sockeye, 10,051 brook trout, 10,975 grayling, 8,630 lake whitefish, and 3,624 northern pike ESTs were obtained in this study and have been deposited into the public databases. Contigs were built and putative full-length Atlantic salmon clones have been identified. A database containing ESTs, assemblies, consensus sequences, open reading frames, gene predictions and putative annotation is available. The overall similarity between Atlantic salmon ESTs and those of rainbow trout, chinook, sockeye, brook trout, grayling, lake whitefish, northern pike and rainbow smelt is 93.4, 94.2, 94.6, 94.4, 92.5, 91.7, 89.6, and 86.2% respectively. An analysis of 78 transcript sets shows Salmo as a sister group to Oncorhynchus and Salvelinus within Salmoninae, and Thymallinae as a sister group to Salmoninae and Coregoninae within Salmonidae. Extensive gene duplication is consistent with a genome duplication in the common ancestor of salmonids. Using all of the available EST data, a new expanded salmonid cDNA microarray of 32,000 features was created. Cross-species hybridizations to this cDNA microarray indicate that this resource will be useful for studies of all 68 salmonid species. Conclusion: An extensive collection and analysis of salmonid RNA putative transcripts indicate that Pacific salmon, Atlantic salmon and charr are 94–96% similar while the more distant whitefish, grayling, pike and smelt are 93, 92, 89 and 86% similar to salmon. The salmonid transcriptome reveals a complex history of gene duplication that is

  2. Evolution of a microbial nitrilase gene family: a comparative and environmental genomics study

    Directory of Open Access Journals (Sweden)

    Eads Jonathan R

    2005-08-01

    Full Text Available. Abstract. Background: Completed genomes and environmental genomic sequences are bringing a significant contribution to understanding the evolution of gene families, microbial metabolism and community eco-physiology. Here, we used comparative genomics and phylogenetic analyses in conjunction with enzymatic data to probe the evolution and functions of a microbial nitrilase gene family. Nitrilases are relatively rare in bacterial genomes, their biological function being unclear. Results: We examined the genetic neighborhood of the different subfamily genes and discovered conserved gene clusters or operons associated with specific nitrilase clades. The inferred evolutionary transitions that separate nitrilases which belong to different gene clusters correlated with changes in their enzymatic properties. We present evidence that Darwinian adaptation acted during one of those transitions and identified sites in the enzyme that may have been under positive selection. Conclusion: Changes in the observed biochemical properties of the nitrilases associated with the different gene clusters are consistent with a hypothesis that those enzymes have been recruited to a novel metabolic pathway following gene duplication and neofunctionalization. These results demonstrate the benefits of combining environmental genomic sampling and completed genomes data with evolutionary and biochemical analyses in the study of gene families. They also open new directions for studying the functions of nitrilases and the genes they are associated with.

  3. The multiple facets of homology and their use in comparative genomics to study the evolution of genes, genomes, and species.

    Science.gov (United States)

    Descorps-Declère, Stéphane; Lemoine, Frédéric; Sculo, Quentin; Lespinet, Olivier; Labedan, Bernard

    2008-04-01

    The incredible development of comparative genomics during the last decade has required a correct use of the concept of homology that was previously utilized only by evolutionary biologists. Unfortunately, this concept has often been misunderstood and thus misused when exploited outside its evolutionary context. This review returns to the correct definition of homology and explains how this definition has been progressively refined in order to adapt it to the various new kinds of analysis of gene properties and of their products that have appeared with the progress of comparative genomics. Then, we illustrate the power and the proficiency of such a concept when using the available genomics data in order to study the evolution of individual genes, of entire genomes and of species, respectively. After explaining how we detect homologues by an exhaustive comparison of a hundred complete proteomes, we describe three main lines of research we have developed in recent years. The first one exploits synteny and gene context data to better understand the mechanisms of genome evolution in prokaryotes. The second one is based on phylogenomics approaches to reconstruct the tree of life. The last one is devoted to recalling that protein homology is often limited to structural segments (SOH = segment of homology, or module). Detecting and numbering modules allows tracing back protein history by identifying the events of gene duplication and gene fusion. We insist that one of the main present difficulties in such studies is the lack of a reliable method to identify genuine orthologues. Finally, we show how these homology studies are helpful to annotate genes and genomes and to study the complexity of the relationships between sequence and function of a gene.

  4. The function genomics study

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Genomics is a biological term that appeared ten years ago, used to describe research on genomic mapping, sequencing, structure analysis, etc. Genomics, the first journal for publishing papers on genomics research, was born in 1986. In the past decade, the concept of genomics has been widely accepted by scientists engaged in biology research. Meanwhile, the research scope of genomics has been extended continuously, from simple gene mapping and sequencing to functional genomics studies. To reflect the change, genomics is now divided into two parts, structural genomics and functional genomics.

  5. [A review of the genomic and gene cloning studies in trees].

    Science.gov (United States)

    Yin, Tong-Ming

    2010-07-01

    Supported by the U.S. Department of Energy (DOE), the first tree genome, that of black cottonwood (Populus trichocarpa), has been completely sequenced and publicly released. This milestone marks the beginning of the post-genome era for forest trees. Identifying and cloning genes underlying important traits is one of the main tasks for post-genome-era tree genomic studies. Recently, great achievements have been made in cloning genes coordinating important domestication traits in some crops, such as rice, tomato, and maize. Molecular breeding has been applied in practical breeding programs for many crops. By contrast, molecular studies in trees are lagging behind. Trees possess some characteristics that make them difficult organisms in which to locate and clone genes. With advances in techniques, and given the fast growth of tree genomic resources, great achievements are desirable in cloning unknown genes from trees, which will facilitate tree improvement programs by means of molecular breeding. In this paper, the author reviews the progress in tree genomic and gene cloning studies and considers prospects for future achievements, in order to provide a useful reference for researchers working in this area.

  6. Genome-wide selection of superior reference genes for expression studies in Ganoderma lucidum.

    Science.gov (United States)

    Xu, Zhichao; Xu, Jiang; Ji, Aijia; Zhu, Yingjie; Zhang, Xin; Hu, Yuanlei; Song, Jingyuan; Chen, Shilin

    2015-12-15

    Quantitative real-time polymerase chain reaction (qRT-PCR) is widely used for the accurate analysis of gene expression. However, high homology among gene families might result in unsuitability of reference genes, which leads to the inaccuracy of qRT-PCR analysis. The release of the Ganoderma lucidum genome has triggered numerous studies on the homology among gene families with the purpose of selecting reliable reference genes. Based on the G. lucidum genome and transcriptome database, 38 candidate reference genes including 28 novel genes were systematically selected and evaluated for qRT-PCR normalization. The result indicated that commonly used polyubiquitin (PUB), beta-actin (BAT), and glyceraldehyde 3-phosphate dehydrogenase (GAPDH) were unsuitable reference genes because of the high sequence similarity and low primer specificity. According to the evaluation of RefFinder, cyclophilin 5 (CYP5) was ranked as the most stable reference gene for 27 tested samples under all experimental conditions and eighteen mycelial samples. Based on sequence analysis and expression analysis, our study suggested that gene characteristics, primer specificity of highly homologous genes, allele-specific expression of candidate genes and under-evaluation of reference genes influenced the accuracy and sensitivity of qRT-PCR analysis. This investigation not only revealed potential factors influencing the unsuitability of reference genes but also selected the superior reference genes from more candidate genes and testing samples than those used in the previous study. Furthermore, our study established a model for reference gene analysis by using the genomic sequence.

  7. Detection of gene x gene interactions in genome-wide association studies of human population data

    National Research Council Canada - National Science Library

    Musani, Solomon K; Shriner, Daniel; Liu, Nianjun; Feng, Rui; Coffey, Christopher S; Yi, Nengjun; Tiwari, Hemant K; Allison, David B

    2007-01-01

    Empirical evidence supporting the commonality of gene x gene interactions, coupled with frequent failure to replicate results from previous association studies, has prompted statisticians to develop...

  8. A large-scale zebrafish gene knockout resource for the genome-wide study of gene function.

    Science.gov (United States)

    Varshney, Gaurav K; Lu, Jing; Gildea, Derek E; Huang, Haigen; Pei, Wuhong; Yang, Zhongan; Huang, Sunny C; Schoenfeld, David; Pho, Nam H; Casero, David; Hirase, Takashi; Mosbrook-Davis, Deborah; Zhang, Suiyuan; Jao, Li-En; Zhang, Bo; Woods, Ian G; Zimmerman, Steven; Schier, Alexander F; Wolfsberg, Tyra G; Pellegrini, Matteo; Burgess, Shawn M; Lin, Shuo

    2013-04-01

    With the completion of the zebrafish genome sequencing project, it becomes possible to analyze the function of zebrafish genes in a systematic way. The first step in such an analysis is to inactivate each protein-coding gene by targeted or random mutation. Here we describe a streamlined pipeline using proviral insertions coupled with high-throughput sequencing and mapping technologies to widely mutagenize genes in the zebrafish genome. We also report the first 6144 mutagenized and archived F1's predicted to carry up to 3776 mutations in annotated genes. Using in vitro fertilization, we have rescued and characterized ~0.5% of the predicted mutations, showing mutation efficacy and a variety of phenotypes relevant to both developmental processes and human genetic diseases. Mutagenized fish lines are being made freely available to the public through the Zebrafish International Resource Center. These fish lines establish an important milestone for zebrafish genetics research and should greatly facilitate systematic functional studies of the vertebrate genome.

  9. A cricket Gene Index: a genomic resource for studying neurobiology, speciation, and molecular evolution

    Directory of Open Access Journals (Sweden)

    Quackenbush John

    2007-04-01

    Full Text Available. Abstract. Background: As the developmental costs of genomic tools decline, genomic approaches to non-model systems are becoming more feasible. Many of these systems may lack advanced genetic tools but are extremely valuable models in other biological fields. Here we report the development of expressed sequence tags (ESTs) in an orthopteroid insect, a model for the study of neurobiology, speciation, and evolution. Results: We report the sequencing of 14,502 ESTs from clones derived from a nerve cord cDNA library, and the subsequent construction of a Gene Index from these sequences, from the Hawaiian trigonidiine cricket Laupala kohalensis. The Gene Index contains 8607 unique sequences comprised of 2575 tentative consensus (TC) sequences and 6032 singletons. For each of the unique sequences, an attempt was made to assign a provisional annotation and to categorize its function using a Gene Ontology-based classification through a sequence-based comparison to known proteins. In addition, a set of unique 70 base pair oligomers that can be used for DNA microarrays was developed. All Gene Index information is posted at the DFCI Gene Indices web page. Conclusion: Orthopterans are models used to understand the neurophysiological basis of complex motor patterns such as flight and stridulation. The sequences presented in the cricket Gene Index will provide neurophysiologists with many genetic tools that have been largely absent in this field. The cricket Gene Index is one of only two gene indices to be developed in an evolutionary model system. Species within the genus Laupala have speciated recently, rapidly, and extensively. Therefore, the genes identified in the cricket Gene Index can be used to study the genomics of speciation. Furthermore, this gene index represents a significant EST resource for basal insects. As such, this resource is a valuable comparative tool for the understanding of invertebrate molecular evolution. The sequences presented here will

  10. Network properties of complex human disease genes identified through genome-wide association studies.

    Directory of Open Access Journals (Sweden)

    Fredrik Barrenas

    Full Text Available. BACKGROUND: Previous studies of network properties of human disease genes have mainly focused on monogenic diseases or cancers and have suffered from discovery bias. Here we investigated the network properties of complex disease genes identified by genome-wide association studies (GWAs), thereby eliminating discovery bias. PRINCIPAL FINDINGS: We derived a network of complex diseases (n = 54) and complex disease genes (n = 349) to explore the shared genetic architecture of complex diseases. We evaluated the centrality measures of complex disease genes in comparison with essential and monogenic disease genes in the human interactome. The complex disease network showed that diseases belonging to the same disease class do not always share common disease genes. A possible explanation could be that the variants with higher minor allele frequency and larger effect size identified using GWAs constitute disjoint parts of the allelic spectra of similar complex diseases. The complex disease gene network showed high modularity with the size of the largest component being smaller than expected from a randomized null-model. This is consistent with limited sharing of genes between diseases. Complex disease genes are less central than the essential and monogenic disease genes in the human interactome. Genes associated with the same disease, compared to genes associated with different diseases, more often tend to share a protein-protein interaction and a Gene Ontology Biological Process. CONCLUSIONS: This indicates that network neighbors of known disease genes form an important class of candidates for identifying novel genes for the same disease.

  11. Network properties of complex human disease genes identified through genome-wide association studies.

    Science.gov (United States)

    Barrenas, Fredrik; Chavali, Sreenivas; Holme, Petter; Mobini, Reza; Benson, Mikael

    2009-11-30

    Previous studies of network properties of human disease genes have mainly focused on monogenic diseases or cancers and have suffered from discovery bias. Here we investigated the network properties of complex disease genes identified by genome-wide association studies (GWAs), thereby eliminating discovery bias. We derived a network of complex diseases (n = 54) and complex disease genes (n = 349) to explore the shared genetic architecture of complex diseases. We evaluated the centrality measures of complex disease genes in comparison with essential and monogenic disease genes in the human interactome. The complex disease network showed that diseases belonging to the same disease class do not always share common disease genes. A possible explanation could be that the variants with higher minor allele frequency and larger effect size identified using GWAs constitute disjoint parts of the allelic spectra of similar complex diseases. The complex disease gene network showed high modularity with the size of the largest component being smaller than expected from a randomized null-model. This is consistent with limited sharing of genes between diseases. Complex disease genes are less central than the essential and monogenic disease genes in the human interactome. Genes associated with the same disease, compared to genes associated with different diseases, more often tend to share a protein-protein interaction and a Gene Ontology Biological Process. This indicates that network neighbors of known disease genes form an important class of candidates for identifying novel genes for the same disease.

  12. Current status and prospects for the study of Nicotiana genomics, genetics, and nicotine biosynthesis genes.

    Science.gov (United States)

    Wang, Xuewen; Bennetzen, Jeffrey L

    2015-02-01

    Nicotiana, a member of the Solanaceae family, is one of the most important research model plants, and of high agricultural and economic value worldwide. To better understand the substantial and rapid research progress with Nicotiana in recent years, its genomics, genetics, and nicotine gene studies are summarized, with useful web links. Several important genetic maps, including a high-density map of N. tabacum consisting of ~2,000 markers published in 2012, provide tools for genetics research. Four whole genome sequences are from allotetraploid species, including N. benthamiana in 2012, and three N. tabacum cultivars (TN90, K326, and BX) in 2014. Three whole genome sequences are from diploids, including progenitors N. sylvestris and N. tomentosiformis in 2013 and N. otophora in 2014. These and additional studies provide numerous insights into genome evolution after polyploidization, including changes in gene composition and transcriptome expression in N. tabacum. The major genes involved in the nicotine biosynthetic pathway have been identified and the genetic basis of the differences in nicotine levels among Nicotiana species has been revealed. In addition, other progress on chloroplast, mitochondrial, and NCBI-registered projects on Nicotiana are discussed. The challenges and prospects for genomic, genetic and application research are addressed. Hence, this review provides important resources and guidance for current and future research and application in Nicotiana.

  13. A Gene Gravity Model for the Evolution of Cancer Genomes: A Study of 3,000 Cancer Genomes across 9 Cancer Types

    Science.gov (United States)

    Lin, Chen-Ching; Zhao, Junfei; Jia, Peilin; Li, Wen-Hsiung; Zhao, Zhongming

    2015-01-01

    Cancer development and progression result from somatic evolution by an accumulation of genomic alterations. The effects of those alterations on the fitness of somatic cells lead to evolutionary adaptations such as increased cell proliferation, angiogenesis, and altered anticancer drug responses. However, there are few general mathematical models to quantitatively examine how perturbations of a single gene shape subsequent evolution of the cancer genome. In this study, we proposed the gene gravity model to study the evolution of cancer genomes by incorporating the genome-wide transcription and somatic mutation profiles of ~3,000 tumors across 9 cancer types from The Cancer Genome Atlas into a broad gene network. We found that somatic mutations of a cancer driver gene may drive cancer genome evolution by inducing mutations in other genes. This functional consequence is often generated by the combined effect of genetic and epigenetic (e.g., chromatin regulation) alterations. By quantifying cancer genome evolution using the gene gravity model, we identified six putative cancer genes (AHNAK, COL11A1, DDX3X, FAT4, STAG2, and SYNE1). The tumor genomes harboring the nonsynonymous somatic mutations in these genes had a higher mutation density at the genome level compared to the wild-type groups. Furthermore, we provided statistical evidence that hypermutation of cancer driver genes on inactive X chromosomes is a general feature in female cancer genomes. In summary, this study sheds light on the functional consequences and evolutionary characteristics of somatic mutations during tumorigenesis by propelling adaptive cancer genome evolution, which would provide new perspectives for cancer research and therapeutics. PMID:26352260

  14. A Gene Gravity Model for the Evolution of Cancer Genomes: A Study of 3,000 Cancer Genomes across 9 Cancer Types.

    Directory of Open Access Journals (Sweden)

    Feixiong Cheng

    2015-09-01

    Full Text Available. Cancer development and progression result from somatic evolution by an accumulation of genomic alterations. The effects of those alterations on the fitness of somatic cells lead to evolutionary adaptations such as increased cell proliferation, angiogenesis, and altered anticancer drug responses. However, there are few general mathematical models to quantitatively examine how perturbations of a single gene shape subsequent evolution of the cancer genome. In this study, we proposed the gene gravity model to study the evolution of cancer genomes by incorporating the genome-wide transcription and somatic mutation profiles of ~3,000 tumors across 9 cancer types from The Cancer Genome Atlas into a broad gene network. We found that somatic mutations of a cancer driver gene may drive cancer genome evolution by inducing mutations in other genes. This functional consequence is often generated by the combined effect of genetic and epigenetic (e.g., chromatin regulation) alterations. By quantifying cancer genome evolution using the gene gravity model, we identified six putative cancer genes (AHNAK, COL11A1, DDX3X, FAT4, STAG2, and SYNE1). The tumor genomes harboring the nonsynonymous somatic mutations in these genes had a higher mutation density at the genome level compared to the wild-type groups. Furthermore, we provided statistical evidence that hypermutation of cancer driver genes on inactive X chromosomes is a general feature in female cancer genomes. In summary, this study sheds light on the functional consequences and evolutionary characteristics of somatic mutations during tumorigenesis by propelling adaptive cancer genome evolution, which would provide new perspectives for cancer research and therapeutics.

  15. Advances in plant cell type-specific genome-wide studies of gene expression

    Institute of Scientific and Technical Information of China (English)

    Ying WANG; Yuling JIAO

    2011-01-01

    The cell is the functional unit of life. To study the complex interactions of systems of biological molecules, it is crucial to dissect these molecules at the cell level. In recent years, major progress has been made by plant biologists in profiling gene expression in specific cell types at the genome-wide level. Approaches based on the isolation of cells, polysomes or nuclei have been developed and successfully used for studying the cell types from distinct organs of several plant species. These cell-level data sets revealed previously unrecognized cellular properties, such as cell-specific gene expression modules and hormone response centers, and should serve as essential resources for functional genomic analyses. Newly developed technologies are more affordable to many laboratories and should help to provide new insights at the cellular resolution in the near future.

  16. A large-scale zebrafish gene knockout resource for the genome-wide study of gene function

    Science.gov (United States)

    Varshney, Gaurav K.; Lu, Jing; Gildea, Derek E.; Huang, Haigen; Pei, Wuhong; Yang, Zhongan; Huang, Sunny C.; Schoenfeld, David; Pho, Nam H.; Casero, David; Hirase, Takashi; Mosbrook-Davis, Deborah; Zhang, Suiyuan; Jao, Li-En; Zhang, Bo; Woods, Ian G.; Zimmerman, Steven; Schier, Alexander F.; Wolfsberg, Tyra G.; Pellegrini, Matteo; Burgess, Shawn M.; Lin, Shuo

    2013-01-01

    With the completion of the zebrafish genome sequencing project, it becomes possible to analyze the function of zebrafish genes in a systematic way. The first step in such an analysis is to inactivate each protein-coding gene by targeted or random mutation. Here we describe a streamlined pipeline using proviral insertions coupled with high-throughput sequencing and mapping technologies to widely mutagenize genes in the zebrafish genome. We also report the first 6144 mutagenized and archived F1's predicted to carry up to 3776 mutations in annotated genes. Using in vitro fertilization, we have rescued and characterized ∼0.5% of the predicted mutations, showing mutation efficacy and a variety of phenotypes relevant to both developmental processes and human genetic diseases. Mutagenized fish lines are being made freely available to the public through the Zebrafish International Resource Center. These fish lines establish an important milestone for zebrafish genetics research and should greatly facilitate systematic functional studies of the vertebrate genome. PMID:23382537

  17. Gene expression levels as endophenotypes in genome-wide association studies of Alzheimer disease

    Science.gov (United States)

    Zou, F.; Carrasquillo, M. M.; Pankratz, V. S.; Belbin, O.; Morgan, K.; Allen, M.; Wilcox, S. L.; Ma, L.; Walker, L. P.; Kouri, N.; Burgess, J. D.; Younkin, L. H.; Younkin, Samuel G.; Younkin, C. S.; Bisceglio, G. D.; Crook, J. E.; Dickson, D. W.; Petersen, R. C.; Graff-Radford, N.; Younkin, Steven G.; Ertekin-Taner, N.

    2010-01-01

    Background: Late-onset Alzheimer disease (LOAD) is a common disorder with a substantial genetic component. We postulate that many disease susceptibility variants act by altering gene expression levels. Methods: We measured messenger RNA (mRNA) expression levels of 12 LOAD candidate genes in the cerebella of 200 subjects with LOAD. Using the genotypes from our LOAD genome-wide association study for the cis-single nucleotide polymorphisms (SNPs) (n = 619) of these 12 LOAD candidate genes, we tested for associations with expression levels as endophenotypes. The strongest expression cis-SNP was tested for AD association in 7 independent case-control series (2,280 AD and 2,396 controls). Results: We identified 3 SNPs that associated significantly with IDE (insulin degrading enzyme) expression levels. A single copy of the minor allele for each significant SNP was associated with ∼twofold higher IDE expression levels. The most significant SNP, rs7910977, is 4.2 kb beyond the 3′ end of IDE. The association observed with this SNP was significant even at the genome-wide level (p = 2.7 × 10⁻⁸). Furthermore, the minor allele of rs7910977 associated significantly (p = 0.0046) with reduced LOAD risk (OR = 0.81 with a 95% CI of 0.70-0.94), as expected biologically from its association with elevated IDE expression. Conclusions: These results provide strong evidence that IDE is a late-onset Alzheimer disease (LOAD) gene with variants that modify risk of LOAD by influencing IDE expression. They also suggest that the use of expression levels as endophenotypes in genome-wide association studies may provide a powerful approach for the identification of disease susceptibility alleles. GLOSSARY: AD = Alzheimer disease; CI = confidence interval; GWAS = genome-wide association study; LOAD = late-onset Alzheimer disease; mRNA = messenger RNA; OR = odds ratio; SNP = single nucleotide polymorphism. PMID:20142614
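As a reading aid for the odds ratio reported in the record above, the following minimal sketch shows how an approximate two-sided p-value can be recovered from an odds ratio and its 95% confidence interval. It assumes a Wald-type interval that is symmetric on the log scale, which the abstract does not state, and the function name `p_from_or_ci` is hypothetical. Because the published limits are rounded, the result should land near the reported p = 0.0046 but is not expected to match it exactly.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, z, and two-sided p from an OR and its 95% CI.

    Assumption (not stated in the abstract): the CI is a Wald interval,
    i.e. log(OR) +/- z_crit * SE on the log-odds scale.
    """
    # The width of the CI on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    # Wald z statistic for the null hypothesis OR = 1 (log OR = 0).
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values taken from the abstract: OR = 0.81, 95% CI 0.70-0.94.
se, z, p = p_from_or_ci(0.81, 0.70, 0.94)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, two-sided p = {p:.4f}")
```

A negative z here simply reflects an odds ratio below 1, that is, a protective association for the minor allele.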

  18. Comparative genomic analysis of eutherian kallikrein genes

    Directory of Open Access Journals (Sweden)

    Marko Premzl

    2017-03-01

    Full Text Available. The present study attempted to update and revise eutherian kallikrein genes implicated in major physiological and pathological processes and in medical molecular diagnostics. Using the eutherian comparative genomic analysis protocol and freely available genomic sequence assemblies, the tests of reliability of eutherian public genomic sequences annotated the most comprehensive curated third-party gene data set of eutherian kallikrein genes, including 121 complete coding sequences among 335 potential coding sequences. The present analysis first described 13 major gene clusters of eutherian kallikrein genes, and explained their differential gene expansion patterns. An updated classification and nomenclature of eutherian kallikrein genes was proposed as a new framework for future experiments.

  19. Comparison of three summary statistics for ranking genes in genome-wide association studies.

    Science.gov (United States)

    Freytag, Saskia; Bickeböller, Heike

    2014-05-20

    Problems associated with insufficient power have haunted the analysis of genome-wide association studies and are likely to be the main challenge for the analysis of next-generation sequencing data. Ranking genes according to their strength of association with the investigated phenotype is one solution. To obtain rankings for genes, researchers can draw from a wide range of statistics summarizing the relationships between variants mapped to a gene and the phenotype. Hence, it is of interest to explore the performance of these statistics in the context of rankings. To this end, we conducted a simulation study (limited to genes of equal sizes) of three different summary statistics examining the ability to rank genes in a meaningful order. The weighted sum of squared marginal score test (Pan, 2009), RareCover algorithm (Bhatia et al., 2010) and the elastic net regularization (Zou and Hastie, 2005) were chosen, because they can handle common as well as rare variants. The test based on the score statistic outperformed both other methods in almost all investigated scenarios. It was the only measure to consistently detect genes with interacting causal variants. However, the RareCover algorithm proved better at identifying genes including causal variants with small effect sizes and low minor allele frequency than the weighted sum of squared marginal score test. The performance of the elastic net regularization was unimpressive for all but the simplest scenarios. Copyright © 2013 John Wiley & Sons, Ltd.

  20. An effective virus-based gene silencing method for functional genomics studies in common bean

    Directory of Open Access Journals (Sweden)

    Kachroo Aardra

    2011-06-01

    Full Text Available. Abstract. Background: Common bean (Phaseolus vulgaris L.) is a crop of economic and nutritional importance in many parts of the world. The lack of genomic resources has impeded the advancement of common bean genomics and thereby crop improvement. Although concerted efforts from the "Phaseomics" consortium have resulted in the development of several genomic resources, functional studies have continued to lag due to the recalcitrance of this crop for genetic transformation. Results: Here we describe the use of a bean pod mottle virus (BPMV)-based vector for silencing of endogenous genes in common bean as well as for protein expression. This BPMV-based vector was originally developed for use in soybean. It has been successfully employed for both protein expression and gene silencing in this species. We tested this vector for applications in common bean by targeting common bean genes encoding nodulin 22 and stearoyl-acyl carrier protein desaturase for silencing. Our results indicate that the BPMV vector can indeed be employed for reverse genetics studies of diverse biological processes in common bean. We also used the BPMV-based vector for expressing the green fluorescent protein (GFP) in common bean and demonstrate stable GFP expression in all common bean tissues where BPMV was detected. Conclusions: The availability of this vector is an important advance for the common bean research community not only because it provides a rapid means for functional studies in common bean, but also because it does so without generating genetically modified plants. Here we describe the detailed methodology and provide essential guidelines for the use of this vector for both gene silencing and protein expression in common bean. The entire VIGS procedure can be completed in 4-5 weeks.

  1. Genomics of local adaptation with gene flow.

    Science.gov (United States)

    Tigano, Anna; Friesen, Vicki L

    2016-05-01

    Gene flow is a fundamental evolutionary force in adaptation that is especially important to understand as humans are rapidly changing both the natural environment and natural levels of gene flow. Theory proposes a multifaceted role for gene flow in adaptation, but it focuses mainly on the disruptive effect that gene flow has on adaptation when selection is not strong enough to prevent the loss of locally adapted alleles. The role of gene flow in adaptation is now better understood due to the recent development of both genomic models of adaptive evolution and genomic techniques, which both point to the importance of genetic architecture in the origin and maintenance of adaptation with gene flow. In this review, we discuss three main topics on the genomics of adaptation with gene flow. First, we investigate selection on migration and gene flow. Second, we discuss the three potential sources of adaptive variation in relation to the role of gene flow in the origin of adaptation. Third, we explain how local adaptation is maintained despite gene flow: we provide a synthesis of recent genomic models of adaptation, discuss the genomic mechanisms and review empirical studies on the genomics of adaptation with gene flow. Despite predictions on the disruptive effect of gene flow in adaptation, an increasing number of studies show that gene flow can promote adaptation, that local adaptations can be maintained despite high gene flow, and that genetic architecture plays a fundamental role in the origin and maintenance of local adaptation with gene flow.

  2. Identification of genes related to intramuscular fat content of pig using genome-wide association study.

    Science.gov (United States)

    Won, Sohyoung; Jung, Jaehoon; Park, Eungwoo; Kim, H B

    2017-06-27

    The aim of this study is to identify SNPs and genes related to pig IMF and estimate the heritability of IMF. A genome-wide association study (GWAS) on 704 inbred Berkshires was performed for intramuscular fat content (IMF). To account for the inbreeding among samples, associations of the SNPs with IMF were tested as random effects in a mixed linear model using the genetic relationship matrix by GEMMA. Significant genes were compared with reported pig IMF QTL regions, and functional classification of the identified genes was also performed. Heritability of IMF was estimated with the GCTA tool. Total 365 SNPs were found to be significant from a cutoff of p-value IMF QTL regions. BMPER, FOXO1, EDAR, RNF149, CD40, PTPN1, SOX9, MYC, and MIF were related to the mitogen-activated protein kinase (MAPK) pathway, which regulates the differentiation to adipocytes. These genes and the genes mapped on QTLs could be the candidate genes affecting IMF. Heritability of IMF was estimated as 0.52, which was relatively high, suggesting that a considerable portion of the total variance of IMF is explained by the SNP information. Our results can contribute to breeding pigs with better IMF and, therefore, to producing pork with better sensory qualities.

  3. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies.

    Science.gov (United States)

    Kofler, Robert; Schlötterer, Christian

    2012-08-01

    An analysis of gene set [e.g. Gene Ontology (GO)] enrichment assumes that all genes are sampled independently from each other with the same probability. These assumptions are violated in genome-wide association (GWA) studies since (i) longer genes typically have more single-nucleotide polymorphisms resulting in a higher probability of being sampled and (ii) overlapping genes are sampled in clusters. Herein, we introduce Gowinda, a software specifically designed to test for enrichment of gene sets in GWA studies. We show that GO tests on GWA data could result in a substantial number of false-positive GO terms. Permutation tests implemented in Gowinda eliminate these biases, but maintain sufficient power to detect enrichment of GO terms. Since sufficient resolution for large datasets requires millions of permutations, we use multi-threading to keep computation times reasonable. Gowinda is implemented in Java (v1.6) and freely available at http://code.google.com/p/gowinda/. Contact: christian.schloetterer@vetmeduni.ac.at. Manual: http://code.google.com/p/gowinda/wiki/Manual. Test data and tutorial: http://code.google.com/p/gowinda/wiki/Tutorial. Validation: http://code.google.com/p/gowinda/wiki/VALIDATION.
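
    The length bias described in this abstract can be illustrated with a small permutation sketch. This is not Gowinda's implementation (Gowinda is a Java tool); it is a minimal Python illustration that assumes enrichment is judged by drawing random SNPs rather than random genes, so that long genes are hit in proportion to their SNP count. The function name, the toy data, and the choice to count each gene once per draw are all assumptions made for the example.

```python
import random


def snp_permutation_enrichment(snp_to_gene, gene_to_sets, candidate_snps,
                               n_perm=2000, seed=0):
    """Empirical p-value per gene set, obtained by sampling SNPs, not genes."""
    rng = random.Random(seed)
    all_snps = sorted(snp_to_gene)

    def set_counts(snps):
        # Count each gene once per draw, however many of its SNPs were hit.
        genes = {snp_to_gene[s] for s in snps}
        counts = {}
        for gene in genes:
            for gene_set in gene_to_sets.get(gene, ()):
                counts[gene_set] = counts.get(gene_set, 0) + 1
        return counts

    observed = set_counts(candidate_snps)
    exceed = {gene_set: 0 for gene_set in observed}
    for _ in range(n_perm):
        permuted = set_counts(rng.sample(all_snps, len(candidate_snps)))
        for gene_set, obs in observed.items():
            if permuted.get(gene_set, 0) >= obs:
                exceed[gene_set] += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return {gs: (exceed[gs] + 1) / (n_perm + 1) for gs in observed}


# Hypothetical toy data: one long gene with 50 SNPs, ten short genes with 1 SNP.
snp_to_gene = {f"L_{i}": "longGene" for i in range(50)}
snp_to_gene.update({f"s_{i}": f"short{i}" for i in range(10)})
gene_to_sets = {"longGene": ["setA"]}
gene_to_sets.update({f"short{i}": ["setB"] for i in range(5)})

pvals = snp_permutation_enrichment(snp_to_gene, gene_to_sets,
                                   ["L_0", "L_1", "L_2"])
```

    In this toy case all three candidate SNPs fall in the long gene, yet the empirical p-value for its gene set stays close to 1, because almost any random draw of three SNPs also hits that gene. A test that sampled genes uniformly would not see this.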

  4. Behavior of QQ-plots and genomic control in studies of gene-environment interaction.

    Directory of Open Access Journals (Sweden)

    Arend Voorman

    Full Text Available Genome-wide association studies of gene-environment interaction (GxE GWAS) are becoming popular. As with main effects GWAS, quantile-quantile plots (QQ-plots) and Genomic Control are being used to assess and correct for population substructure. However, in G x E work these approaches can be seriously misleading, as we illustrate; QQ-plots may give strong indications of substructure when absolutely none is present. Using simulation and theory, we show how and why spurious QQ-plot inflation occurs in G x E GWAS, and how this differs from main-effects analyses. We also explain how simple adjustments to standard regression-based methods used in G x E GWAS can alleviate this problem.
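
    For readers unfamiliar with Genomic Control, the standard main-effects calculation can be sketched as follows. This is a generic illustration, not the adjustment proposed in the abstract: it assumes 1-degree-of-freedom chi-square test statistics and the usual median-based inflation factor, and the function names are chosen for this example only. The abstract's point is precisely that an inflated value of this factor in G x E analyses need not indicate substructure.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Inflation factor: observed median statistic over its null expectation."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Divide every statistic by the inflation factor (never inflate them)."""
    lam = max(genomic_inflation(chi2_stats), 1.0)
    return [x / lam for x in chi2_stats]


# Hypothetical statistics whose median is twice the null median.
stats = [0.1, 2 * CHI2_1DF_MEDIAN, 5.0]
lam = genomic_inflation(stats)
corrected = genomic_control(stats)
```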

  5. Pichia stipitis genomics, transcriptomics, and gene clusters

    Science.gov (United States)

    Thomas W. Jeffries; Jennifer R. Headman Van Vleet

    2009-01-01

    Genome sequencing and subsequent global gene expression studies have advanced our understanding of the lignocellulose-fermenting yeast Pichia stipitis. These studies have provided an insight into its central carbon metabolism, and analysis of its genome has revealed numerous functional gene clusters and tandem repeats. Specialized physiological traits are often the...

  6. In vivo functional genomic studies of sterol carrier protein-2 gene in the yellow fever mosquito.

    Science.gov (United States)

    Peng, Rong; Maklokova, Vilena I; Chandrashekhar, Jayadevi H; Lan, Que

    2011-03-18

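    A simple and efficient DNA delivery method to introduce extrachromosomal DNA into mosquito embryos would significantly aid functional genomic studies. The conventional method for delivery of DNA into insects is to inject the DNA directly into the embryos. Taking advantage of the unique aspects of mosquito reproductive physiology during vitellogenesis and an in vivo transfection reagent that mediates DNA uptake in cells via endocytosis, we have developed a new method to introduce DNA into mosquito embryos vertically via microinjection of DNA vectors in vitellogenic females without directly manipulating the embryos. Our method was able to introduce inducible gene expression vectors transiently into F0 mosquitoes to perform functional studies in vivo without transgenic lines. The high efficiency of expression knockdown was reproducible, with more than 70% of the F0 individuals showing sufficient gene expression suppression (<30% of the controls' levels). At the cohort level, AeSCP-2 expression knockdown in early instar larvae resulted in detectable phenotypes of the expression deficiency such as high mortality, lowered fertility, and distorted sex ratio after induction of AeSCP-2 siRNA expression in vivo. The results further confirmed the important role of AeSCP-2 in the development and reproduction of A. aegypti. In this study, we proved that extrachromosomal transient expression of an inducible gene from a DNA vector vertically delivered via vitellogenic females can be used to manipulate gene expression in F0 generation. This new method will be a simple and efficient tool for in vivo functional genomic studies in mosquitoes.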

  7. In vivo functional genomic studies of sterol carrier protein-2 gene in the yellow fever mosquito.

    Directory of Open Access Journals (Sweden)

    Rong Peng

    Full Text Available A simple and efficient DNA delivery method to introduce extrachromosomal DNA into mosquito embryos would significantly aid functional genomic studies. The conventional method for delivery of DNA into insects is to inject the DNA directly into the embryos. Taking advantage of the unique aspects of mosquito reproductive physiology during vitellogenesis and an in vivo transfection reagent that mediates DNA uptake in cells via endocytosis, we have developed a new method to introduce DNA into mosquito embryos vertically via microinjection of DNA vectors in vitellogenic females without directly manipulating the embryos. Our method was able to introduce inducible gene expression vectors transiently into F0 mosquitoes to perform functional studies in vivo without transgenic lines. The high efficiency of expression knockdown was reproducible, with more than 70% of the F0 individuals showing sufficient gene expression suppression (<30% of the controls' levels). At the cohort level, AeSCP-2 expression knockdown in early instar larvae resulted in detectable phenotypes of the expression deficiency such as high mortality, lowered fertility, and distorted sex ratio after induction of AeSCP-2 siRNA expression in vivo. The results further confirmed the important role of AeSCP-2 in the development and reproduction of A. aegypti. In this study, we proved that extrachromosomal transient expression of an inducible gene from a DNA vector vertically delivered via vitellogenic females can be used to manipulate gene expression in F0 generation. This new method will be a simple and efficient tool for in vivo functional genomic studies in mosquitoes.

  8. Genome-wide association study identifies candidate genes for starch content regulation in maize kernels

    Directory of Open Access Journals (Sweden)

    Na Liu

    2016-07-01

    Full Text Available Kernel starch content is an important trait in maize (Zea mays L.) as it accounts for 65% to 75% of the dry kernel weight and positively correlates with seed yield. A number of starch synthesis-related genes have been identified in maize in recent years. However, many loci underlying variation in starch content among maize inbred lines still remain to be identified. The current study is a genome-wide association study that used a set of 263 maize inbred lines. In this panel, the average kernel starch content was 66.99%, ranging from 60.60% to 71.58% over the three study years. These inbred lines were genotyped with the SNP50 BeadChip maize array, which is comprised of 56,110 evenly spaced, random SNPs. Population structure was controlled by a mixed linear model (MLM) as implemented in the software package TASSEL. After the statistical analyses, four SNPs were identified as significantly associated with starch content (P ≤ 0.0001), among which one each are located on chromosomes 1 and 5 and two are on chromosome 2. Furthermore, 77 candidate genes associated with starch synthesis were found within the 100-kb intervals containing these four QTLs, and four highly associated genes were within 20-kb intervals of the associated SNPs. Among the four genes, Glucose-1-phosphate adenylyltransferase (APS1; Gene ID GRMZM2G163437) is known as an important regulator of kernel starch content. The identified SNPs, QTLs, and candidate genes may not only be readily used for germplasm improvement by marker-assisted selection in breeding, but can also elucidate the genetic basis of starch content. Further studies on these identified candidate genes may help determine the molecular mechanisms regulating kernel starch content in maize and other important cereal crops.

  9. High-throughput genomic mapping of vector integration sites in gene therapy studies.

    Science.gov (United States)

    Beard, Brian C; Adair, Jennifer E; Trobridge, Grant D; Kiem, Hans-Peter

    2014-01-01

    Gene therapy has enormous potential to treat a variety of infectious and genetic diseases. To date hundreds of patients worldwide have received hematopoietic cell products that have been gene-modified with retrovirus vectors carrying therapeutic transgenes, and many patients have been cured or demonstrated disease stabilization as a result (Adair et al., Sci Transl Med 4:133ra57, 2012; Biffi et al., Science 341:1233158, 2013; Aiuti et al., Science 341:1233151, 2013; Fischer et al., Gene 525:170-173, 2013). Unfortunately, for some patients the provirus integration dysregulated the expression of nearby genes leading to clonal outgrowth and, in some cases, cancer. Thus, the unwanted side effect of insertional mutagenesis has become a major concern for retrovirus gene therapy. The careful study of retrovirus integration sites (RIS) and the contribution of individual gene-modified clones to hematopoietic repopulating cells is of crucial importance for all gene therapy studies. Supporting this, the US Food and Drug Administration (FDA) has mandated the careful monitoring of RIS in all clinical trials of gene therapy. An invaluable method was developed: linear amplification mediated-polymerase chain reaction (LAM-PCR) capable of analyzing in vitro and complex in vivo samples, capturing valuable genomic information directly flanking the site of provirus integration. Linking this method and similar methods to high-throughput sequencing has now made possible an unprecedented understanding of the integration profile of various retrovirus vectors, and allows for sensitive monitoring of their safety. It also allows for a detailed comparison of improved safety-enhanced gene therapy vectors. An important readout of safety is the relative contribution of individual gene-modified repopulating clones. 
    One limitation of LAM-PCR is that the ability to capture the relative contribution of individual clones is compromised because of the initial linear PCR common to all current methods

  10. Nucleotide Excision Repair in Cellular Chromatin: Studies with Yeast from Nucleotide to Gene to Genome

    Directory of Open Access Journals (Sweden)

    Simon Reed

    2012-09-01

    Full Text Available Here we review our development of, and results with, high resolution studies on global genome nucleotide excision repair (GGNER) in Saccharomyces cerevisiae. We have focused on how GGNER relates to histone acetylation for its functioning and we have identified the histone acetyl transferase Gcn5 and acetylation at lysines 9/14 of histone H3 as a major factor in enabling efficient repair. We consider results employing primarily MFA2 as a model gene, but also those with URA3 located at subtelomeric sequences. In the latter case we also see a role for acetylation at histone H4. We then go on to outline the development of a high resolution genome-wide approach that enables one to examine correlations between histone modifications and the nucleotide excision repair (NER) of UV-induced cyclobutane pyrimidine dimers throughout entire genomes. This is an approach that will enable rapid advances in understanding the complexities of how compacted chromatin in chromosomes is processed to access DNA damage and then returned to its pre-damaged status to maintain epigenetic codes.

  11. Gene-environment interaction effects on lung function- a genome-wide association study within the Framingham heart study

    Science.gov (United States)

    2013-01-01

    Background Previous studies in occupational exposure and lung function have focused only on the main effect of occupational exposure or genetics on lung function. Some disease-susceptible genes may be missed due to their low marginal effects, despite potential involvement in the disease process through interactions with the environment. Through comprehensive genome-wide gene-environment interaction studies, we can uncover these susceptibility genes. Our objective in this study was to explore gene by occupational exposure interaction effects on lung function using both the individual SNPs approach and the genetic network approach. Methods The study population comprised the Offspring Cohort and the Third Generation from the Framingham Heart Study. We used forced expiratory volume in one second (FEV1) and ratio of FEV1 to forced vital capacity (FVC) as outcomes. Occupational exposures were classified using a population-specific job exposure matrix. We performed genome-wide gene-environment interaction analysis, using the Affymetrix 550 K mapping array for genotyping. A linear regression-based generalized estimating equation was applied to account for within-family relatedness. Network analysis was conducted using results from single-nucleotide polymorphism (SNP)-level analyses and from gene expression study results. Results There were 4,785 participants in total. SNP-level analysis and network analysis identified SNP rs9931086 (P for interaction = 1.16 × 10^-7) in gene SLC38A8, which may significantly modify the effects of occupational exposure on FEV1. Genes identified from the network analysis included CTLA-4, HDAC, and PPAR-alpha. Conclusions Our study implies that SNP rs9931086 in SLC38A8 and genes CTLA-4, HDAC, and PPAR-alpha, which are related to inflammatory processes, may modify the effect of occupational exposure on lung function. PMID:24289273

  12. Gene finding in novel genomes

    Directory of Open Access Journals (Sweden)

    Korf Ian

    2004-05-01

    Full Text Available Abstract Background Computational gene prediction continues to be an important problem, especially for genomes with little experimental data. Results I introduce the SNAP gene finder which has been designed to be easily adaptable to a variety of genomes. In novel genomes without an appropriate gene finder, I demonstrate that employing a foreign gene finder can produce highly inaccurate results, and that the most compatible parameters may not come from the nearest phylogenetic neighbor. I find that foreign gene finders are more usefully employed to bootstrap parameter estimation and that the resulting parameters can be highly accurate. Conclusion Since gene prediction is sensitive to species-specific parameters, every genome needs a dedicated gene finder.

  13. Genome Wide Association Study: Searching for Genes Underlying Body Mass Index in the Chinese

    Institute of Scientific and Technical Information of China (English)

    YANG Fang; CHEN Xiang Ding; TAN Li Jun; SHEN Jie; LI Ding You; ZHANG Fang; SHA Bao Yong; DENG Hong Wen

    2014-01-01

    Objective Obesity is becoming a worldwide health problem. The genome wide association (GWA) study particularly for body mass index (BMI) has not been successfully conducted in the Chinese. In order to identify novel genes for BMI variation in the Chinese, an initial GWA study and a follow up replication study were performed. Methods Affymetrix 500K SNPs were genotyped for initial GWA of 597 Northern Chinese. After quality control, 281 533 SNPs were included in the association analysis. Three SNPs were genotyped in a Southern Chinese replication sample containing 2 955 Chinese Han subjects. Association analyses were performed by Plink software. Results Eight SNPs were significantly associated with BMI variation after false discovery rate (FDR) correction (P = 5.45×10^-7 to 7.26×10^-6, FDR q = 0.033-0.048). Two adjacent SNPs (rs4432245 and rs711906) in the eukaryotic translation initiation factor 2 alpha kinase 4 (EIF2AK4) gene were significantly associated with BMI (P = 6.38×10^-6 and 4.39×10^-6, FDR q = 0.048). In the follow-up replication study, we confirmed the associations between BMI and rs4432245 and rs711906 in the EIF2AK4 gene (P = 0.03 and 0.01, respectively). Conclusion Our study suggests novel mechanisms for BMI, where EIF2AK4 has exerted a profound effect on the synthesis and storage of triglycerides and may impact on overall energy homeostasis associated with obesity. The minor allele frequencies for the two SNPs in the EIF2AK4 gene have marked ethnic differences between Caucasians and the Chinese. The association of the EIF2AK4 gene with BMI is suggested to be ‘ethnic specific’ in the Chinese.
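
    The FDR q-values quoted in this abstract come from a multiple-testing correction whose exact procedure the abstract does not name. As a hedged illustration, the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment; the function name and the input p-values are hypothetical and are not data from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), input order kept."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical p-values for four tests.
qvals = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

    A SNP is then declared significant when its q-value falls below the chosen FDR threshold, for example 0.05.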

  14. Genome association study through nonlinear mixed models revealed new candidate genes for pig growth curves

    Directory of Open Access Journals (Sweden)

    Fabyano Fonseca e Silva

    Full Text Available ABSTRACT: Genome association analyses have been successful in identifying quantitative trait loci (QTLs) for pig body weights measured at a single age. However, when considering the whole weight trajectories over time in the context of genome association analyses, it is important to look at the markers that affect growth curve parameters. The easiest way to consider them is via the two-step method, in which the growth curve parameters and marker effects are estimated separately, thereby resulting in a reduction of the statistical power and the precision of estimates. One efficient solution is to adopt nonlinear mixed models (NMM), which enable a joint modeling of the individual growth curves and marker effects. Our aim was to propose a genome association analysis for growth curves in pigs based on NMM as well as to compare it with the traditional two-step method. In addition, we also aimed to identify the nearest candidate genes related to significant SNP (single nucleotide polymorphism) markers. The NMM presented a higher number of significant SNPs for adult weight (A) and maturity rate (K), and provided a direct way to test SNP significance simultaneously for both the A and K parameters. Furthermore, all significant SNPs from the two-step method were also reported in the NMM analysis. The ontology of the three candidate genes (SH3BGRL2, MAPK14, and MYL9) derived from significant SNPs (simultaneously affecting A and K) allows us to make inferences with regard to their contribution to the pig growth process in the population studied.

  15. A new experimental approach for studying bacterial genomic island evolution identifies island genes with bacterial host-specific expression patterns

    Directory of Open Access Journals (Sweden)

    Nickerson Cheryl A

    2006-01-01

    Full Text Available Abstract Background Genomic islands are regions of bacterial genomes that have been acquired by horizontal transfer and often contain blocks of genes that function together for specific processes. Recently, it has become clear that the impact of genomic islands on the evolution of different bacterial species is significant and represents a major force in establishing bacterial genomic variation. However, the study of genomic island evolution has been mostly performed at the sequence level using computer software or hybridization analysis to compare different bacterial genomic sequences. We describe here a novel experimental approach to study the evolution of species-specific bacterial genomic islands that identifies island genes that have evolved in such a way that they are differentially-expressed depending on the bacterial host background into which they are transferred. Results We demonstrate this approach by using a "test" genomic island that we have cloned from the Salmonella typhimurium genome (island 4305) and transferred to a range of Gram negative bacterial hosts of differing evolutionary relationships to S. typhimurium. Systematic analysis of the expression of the island genes in the different hosts compared to proper controls allowed identification of genes with genera-specific expression patterns. The data from the analysis can be arranged in a matrix to give an expression "array" of the island genes in the different bacterial backgrounds. A conserved 19-bp DNA site was found upstream of at least two of the differentially-expressed island genes. To our knowledge, this is the first systematic analysis of horizontally-transferred genomic island gene expression in a broad range of Gram negative hosts. We also present evidence in this study that the IS200 element found in island 4305 in S. typhimurium strain LT2 was inserted after the island had already been acquired by the S. typhimurium lineage and that this element is likely not

  16. Candidate genes for obesity-susceptibility show enriched association within a large genome-wide association study for BMI

    Science.gov (United States)

    Vimaleswaran, Karani S.; Tachmazidou, Ioanna; Zhao, Jing Hua; Hirschhorn, Joel N.; Dudbridge, Frank; Loos, Ruth J.F.

    2012-01-01

    Before the advent of genome-wide association studies (GWASs), hundreds of candidate genes for obesity-susceptibility had been identified through a variety of approaches. We examined whether those obesity candidate genes are enriched for associations with body mass index (BMI) compared with non-candidate genes by using data from a large-scale GWAS. A thorough literature search identified 547 candidate genes for obesity-susceptibility based on evidence from animal studies, Mendelian syndromes, linkage studies, genetic association studies and expression studies. Genomic regions were defined to include the genes ±10 kb of flanking sequence around candidate and non-candidate genes. We used summary statistics publicly available from the discovery stage of the genome-wide meta-analysis for BMI performed by the genetic investigation of anthropometric traits consortium in 123 564 individuals. Hypergeometric, rank tail-strength and gene-set enrichment analysis tests were used to test for the enrichment of association in candidate compared with non-candidate genes. The hypergeometric test of enrichment was not significant at the 5% P-value quantile (P = 0.35), but was nominally significant at the 25% quantile (P = 0.015). The rank tail-strength and gene-set enrichment tests were nominally significant for the full set of genes and borderline significant for the subset without SNPs at P < 10^-7. Taken together, the observed evidence for enrichment suggests that the candidate gene approach retains some value. However, the degree of enrichment is small despite the extensive number of candidate genes and the large sample size. Studies that focus on candidate genes have only slightly increased chances of detecting associations, and are likely to miss many true effects in non-candidate genes, at least for obesity-related traits. PMID:22791748
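
    A hypergeometric enrichment test of the kind mentioned in this abstract can be sketched in a few lines. The sketch assumes the usual setup: N genes in total, K of them candidates, n genes falling in the top P-value quantile, and k candidates among those n. The function name and the small numbers in the usage line are illustrative only and are not taken from the study.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K are candidates."""
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when asked for more items than are available,
    # so impossible splits contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical example: 10 genes, 3 candidates, 2 genes in the top quantile,
# at least 1 of which is a candidate.
p_enrich = hypergeom_upper_tail(k=1, K=3, n=2, N=10)
```

    A small tail probability indicates that candidate genes are over-represented among the top-ranked genes beyond what random sampling would produce.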

  17. Gene ontology analysis of pairwise genetic associations in two genome-wide studies of sporadic ALS

    Directory of Open Access Journals (Sweden)

    Kim Nora

    2012-07-01

    Full Text Available Abstract Background It is increasingly clear that common human diseases have a complex genetic architecture characterized by both additive and nonadditive genetic effects. The goal of the present study was to determine whether patterns of both additive and nonadditive genetic associations aggregate in specific functional groups as defined by the Gene Ontology (GO). Results We first estimated all pairwise additive and nonadditive genetic effects using the multifactor dimensionality reduction (MDR) method that makes few assumptions about the underlying genetic model. Statistical significance was evaluated using permutation testing in two genome-wide association studies of ALS. The detection data consisted of 276 subjects with ALS and 271 healthy controls while the replication data consisted of 221 subjects with ALS and 211 healthy controls. Both studies included genotypes from approximately 550,000 single-nucleotide polymorphisms (SNPs). Each SNP was mapped to a gene if it was within 500 kb of the start or end. Each SNP was assigned a p-value based on its strongest joint effect with the other SNPs. We then used the Exploratory Visual Analysis (EVA) method and software to assign a p-value to each gene based on the overabundance of significant SNPs at the α = 0.05 level in the gene. We also used EVA to assign p-values to each GO group based on the overabundance of significant genes at the α = 0.05 level. A GO category was determined to replicate if that category was significant at the α = 0.05 level in both studies. We found two GO categories that replicated in both studies. The first, ‘Regulation of Cellular Component Organization and Biogenesis’, a GO Biological Process, had p-values of 0.010 and 0.014 in the detection and replication studies, respectively. The second, ‘Actin Cytoskeleton’, a GO Cellular Component, had p-values of 0.040 and 0.046 in the detection and replication studies, respectively. Conclusions Pathway
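
    The SNP-to-gene mapping step in this abstract (a SNP belongs to a gene if it lies within 500 kb of the gene's start or end) can be sketched as a simple interval check. The sketch reads the rule as "inside the gene or within 500 kb of either boundary", which is an interpretation rather than something the abstract spells out; the function name, identifiers, and coordinates are hypothetical.

```python
def map_snps_to_genes(snps, genes, flank=500_000):
    """Map each SNP to every gene whose span, padded by `flank` bp, contains it.

    snps:  {snp_id: (chromosome, position)}
    genes: {gene_name: (chromosome, start, end)}
    """
    mapping = {}
    for snp_id, (chrom, pos) in snps.items():
        mapping[snp_id] = [
            name
            for name, (g_chrom, start, end) in genes.items()
            if g_chrom == chrom and start - flank <= pos <= end + flank
        ]
    return mapping


# Hypothetical coordinates for illustration only.
genes = {"G1": ("1", 1_000_000, 1_050_000), "G2": ("1", 3_000_000, 3_010_000)}
snps = {"rsA": ("1", 600_000), "rsB": ("1", 1_600_000), "rsC": ("2", 1_000_000)}
snp_genes = map_snps_to_genes(snps, genes)
```

    With a flank this wide, one SNP can map to several neighboring genes, so gene-level p-values built on this mapping are not independent of one another.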

  18. Recent advances in globin research using genome-wide association studies and gene editing.

    Science.gov (United States)

    Orkin, Stuart H

    2016-03-01

    A long-sought goal in the hemoglobin field has been an improved understanding of the mechanisms that regulate the switch from fetal (HbF) to adult (HbA) hemoglobin during development. With such knowledge, the hope is that strategies for directed reactivation of HbF in adults could be devised as an approach to therapy for the β-hemoglobinopathies thalassemia and sickle cell disease. Recent genome-wide association studies (GWAS) led to identification of three loci (BCL11A, HBS1L-MYB, and the β-globin cluster itself) in which natural genetic variation is correlated with different HbF levels in populations. Here, the central role of BCL11A in control of HbF is reviewed from the perspective of how findings may be translated to gene therapy in the not-too-distant future. This summary traces the evolution of recent studies from the initial recognition of BCL11A through GWAS to identification of critical sequences in an enhancer required for its erythroid-specific expression, thereby highlighting an Achilles heel for genome editing.

  19. Genome structure drives patterns of gene family evolution in ciliates, a case study using Chilodonella uncinata (Protista, Ciliophora, Phyllopharyngea).

    Science.gov (United States)

    Gao, Feng; Song, Weibo; Katz, Laura A

    2014-08-01

    In most lineages, diversity among gene family members results from gene duplication followed by sequence divergence. Because of the genome rearrangements during the development of somatic nuclei, gene family evolution in ciliates involves more complex processes. Previous work on the ciliate Chilodonella uncinata revealed that macronuclear β-tubulin gene family members are generated by alternative processing, in which germline regions are alternatively used in multiple macronuclear chromosomes. To further study genome evolution in this ciliate, we analyzed its transcriptome and found that (1) alternative processing is extensive among gene families; and (2) such gene families are likely to be C. uncinata specific. We characterized additional macronuclear and micronuclear copies of one candidate alternatively processed gene family, a protein kinase domain-containing protein (PKc), from two C. uncinata strains. Analysis of the PKc sequences reveals that (1) multiple PKc gene family members in the macronucleus share some identical regions flanked by divergent regions; and (2) the shared identical regions are processed from a single micronuclear chromosome. We discuss analogous processes in lineages across the eukaryotic tree of life to provide further insights on the impact of genome structure on gene family evolution in eukaryotes. © 2014 The Author(s). Evolution © 2014 The Society for the Study of Evolution.

  20. Genome-wide studies highlight indirect links between human replication origins and gene regulation.

    Science.gov (United States)

    Cadoret, Jean-Charles; Meisch, Françoise; Hassan-Zadeh, Vahideh; Luyten, Isabelle; Guillet, Claire; Duret, Laurent; Quesneville, Hadi; Prioleau, Marie-Noëlle

    2008-10-14

    To get insights into the regulation of replication initiation, we systematically mapped replication origins along 1% of the human genome in HeLa cells. We identified 283 origins, 10 times more than previously known. Origin density is strongly correlated with genomic landscapes, with clusters of closely spaced origins in GC-rich regions and no origins in large GC-poor regions. Origin sequences are evolutionarily conserved, and half of them map within or near CpG islands. Most of the origins overlap transcriptional regulatory elements, providing further evidence of a connection with gene regulation. Moreover, we identify c-JUN and c-FOS as important regulators of origin selection. Half of the identified replication initiation sites do not have an open chromatin configuration, showing the absence of a direct link with gene regulation. Replication timing analyses coupled with our origin mapping suggest that a relatively strict origin-timing program regulates the replication of the human genome.

  1. Clustering of gene ontology terms in genomes.

    Science.gov (United States)

    Tiirikka, Timo; Siermala, Markku; Vihinen, Mauno

    2014-10-25

    Although protein-coding genes occupy only a small fraction of genomes in higher species, they are not randomly distributed within or between chromosomes. Clustering of genes with related function(s) and/or characteristics has been evident at several different levels. To study how common the clustering of functionally related genes is and what kinds of functions the end products of these genes are involved in, we collected gene ontology (GO) terms for complete genomes and developed a method to detect previously undefined gene clustering. Exhaustive analysis was performed for seven widely studied species ranging from human to Escherichia coli. To overcome problems related to varying gene lengths and densities, a novel method was developed and a fixed number of genes were analyzed irrespective of the genome span covered. Statistically very significant GO term clustering was apparent in all the investigated genomes. The analysis window, which ranged from 5 to 50 consecutive genes, revealed extensive GO term clusters for genes with widely varying functions. Here, the most interesting and significant results are discussed and the complete dataset for each analyzed species is available at the GOme database at http://bioinf.uta.fi/GOme. The results indicated that clusters of genes with related functions are very common, not only in bacteria, in which operons are frequent, but also in all the studied species irrespective of how complex they are. There are some differences between species but in all of them GO term clusters are common and of widely differing sizes. The presented method can be applied to analyze any genome or part of a genome for which descriptive features are available, and thus is not restricted to ontology terms. This method can also be applied to investigate gene and protein expression patterns. The results pave the way for further studies of mechanisms that shape genome structure and evolutionary forces related to them. Copyright © 2014 Elsevier B.V. All
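
The fixed-gene-count window described in this abstract can be illustrated with a minimal sketch. The abstract does not state which statistical test the authors used, so the hypergeometric tail test below is an assumption, and the function names and toy GO identifiers are hypothetical. The sketch slides a window of a fixed number of consecutive genes along the gene order and asks how surprising the count of genes carrying a given term is.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N genes, K of which carry the term.

    Assumption: a hypergeometric test stands in for whatever test the
    original study applied.
    """
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def scan_windows(gene_terms, term, window):
    """Slide a window of `window` consecutive genes over the gene order.

    gene_terms: list of sets of GO terms, one set per gene, in genome order.
    Returns a list of (start_index, genes_with_term, p_value) tuples.
    """
    total = len(gene_terms)
    carriers = sum(term in terms for terms in gene_terms)
    results = []
    for start in range(total - window + 1):
        k = sum(term in terms for terms in gene_terms[start:start + window])
        results.append((start, k, hypergeom_tail(k, window, carriers, total)))
    return results


# Toy genome: ten genes, the first three share a (hypothetical) term.
toy = [{"GO:1"}, {"GO:1"}, {"GO:1"}] + [set() for _ in range(7)]
for start, k, p in scan_windows(toy, "GO:1", window=3):
    print(start, k, round(p, 4))
```

Because the window counts genes rather than base pairs, gene length and gene density do not affect the result, which is the property the abstract emphasizes. A real analysis would also need a multiple-testing correction across windows and terms, which this sketch omits.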

  2. Gene set analyses of genome-wide association studies on 49 quantitative traits measured in a single genetic epidemiology dataset.

    Science.gov (United States)

    Kim, Jihye; Kwon, Ji-Sun; Kim, Sangsoo

    2013-09-01

    Gene set analysis is a powerful tool for interpreting genome-wide association study results and is gaining popularity. Comparison of the gene sets obtained for a variety of traits measured from a single genetic epidemiology dataset may give insights into the biological mechanisms underlying these traits. Based on the previously published single nucleotide polymorphism (SNP) genotype data on 8,842 individuals enrolled in the Korea Association Resource project, we performed a series of systematic genome-wide association analyses for 49 quantitative traits of basic epidemiological, anthropometric, or blood chemistry parameters. Each analysis result was subjected to subsequent gene set analyses based on Gene Ontology (GO) terms using gene set analysis software, GSA-SNP, identifying a set of GO terms significantly associated with each trait (pcorr neuronal or nerve systems.

  3. Genome-wide association study of metabolic traits reveals novel gene-metabolite-disease links.

    Directory of Open Access Journals (Sweden)

    Rico Rueedi

    2014-02-01

    Full Text Available. Metabolic traits are molecular phenotypes that can drive clinical phenotypes and may predict disease progression. Here, we report results from a metabolome- and genome-wide association study on 1H-NMR urine metabolic profiles. The study was conducted within an untargeted approach, employing a novel method for compound identification. From our discovery cohort of 835 Caucasian individuals who participated in the CoLaus study, we identified 139 suggestively significant (P<5×10(-8)) and independent associations between single nucleotide polymorphisms (SNP) and metabolome features. Fifty-six of these associations replicated in the TasteSensomics cohort, comprising 601 individuals from São Paulo of vastly diverse ethnic background. They correspond to eleven gene-metabolite associations, six of which had been previously identified in the urine metabolome and three in the serum metabolome. Our key novel findings are the associations of two SNPs with NMR spectral signatures pointing to fucose (rs492602, P = 6.9×10(-44)) and lysine (rs8101881, P = 1.2×10(-33)), respectively. Fine-mapping of the first locus pinpointed the FUT2 gene, which encodes a fucosyltransferase enzyme and has previously been associated with Crohn's disease. This implicates fucose as a potential prognostic disease marker, for which there is already published evidence from a mouse model. The second SNP lies within the SLC7A9 gene, rare mutations of which have been linked to severe kidney damage. The replication of previous associations and our new discoveries demonstrate the potential of untargeted metabolomics GWAS to robustly identify molecular disease markers.

  4. Genome-Wide Association Study of Metabolic Traits Reveals Novel Gene-Metabolite-Disease Links

    Science.gov (United States)

    Nicholls, Andrew W.; Salek, Reza M.; Marques-Vidal, Pedro; Morya, Edgard; Sameshima, Koichi; Montoliu, Ivan; Da Silva, Laeticia; Collino, Sebastiano; Martin, François-Pierre; Rezzi, Serge; Steinbeck, Christoph; Waterworth, Dawn M.; Waeber, Gérard; Vollenweider, Peter; Beckmann, Jacques S.; Le Coutre, Johannes; Mooser, Vincent; Bergmann, Sven; Genick, Ulrich K.; Kutalik, Zoltán

    2014-01-01

    Metabolic traits are molecular phenotypes that can drive clinical phenotypes and may predict disease progression. Here, we report results from a metabolome- and genome-wide association study on 1H-NMR urine metabolic profiles. The study was conducted within an untargeted approach, employing a novel method for compound identification. From our discovery cohort of 835 Caucasian individuals who participated in the CoLaus study, we identified 139 suggestively significant (P<5×10(-8)) and independent associations between single nucleotide polymorphisms (SNP) and metabolome features. Fifty-six of these associations replicated in the TasteSensomics cohort, comprising 601 individuals from São Paulo of vastly diverse ethnic background. They correspond to eleven gene-metabolite associations, six of which had been previously identified in the urine metabolome and three in the serum metabolome. Our key novel findings are the associations of two SNPs with NMR spectral signatures pointing to fucose (rs492602, P = 6.9×10(-44)) and lysine (rs8101881, P = 1.2×10(-33)), respectively. Fine-mapping of the first locus pinpointed the FUT2 gene, which encodes a fucosyltransferase enzyme and has previously been associated with Crohn's disease. This implicates fucose as a potential prognostic disease marker, for which there is already published evidence from a mouse model. The second SNP lies within the SLC7A9 gene, rare mutations of which have been linked to severe kidney damage. The replication of previous associations and our new discoveries demonstrate the potential of untargeted metabolomics GWAS to robustly identify molecular disease markers. PMID:24586186
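
The P<5×10(-8) cutoff quoted in this abstract is commonly motivated as a Bonferroni correction of a 0.05 significance level over roughly one million independent tests. The sketch below shows that arithmetic and applies the cutoff to a small list of associations. The one-million figure is the conventional assumption rather than a number taken from this study, the helper names are hypothetical, and the third SNP entry is a made-up example; only the two reported P values come from the abstract.

```python
ALPHA = 0.05
# Assumption: about one million independent common-variant tests.
ASSUMED_INDEPENDENT_TESTS = 1_000_000


def bonferroni_threshold(alpha=ALPHA, n_tests=ASSUMED_INDEPENDENT_TESTS):
    """Per-test significance cutoff that keeps the family-wise error at alpha."""
    return alpha / n_tests


def passing(associations, threshold):
    """Keep the (snp_id, p_value) pairs whose p-value is below the threshold."""
    return [(snp, p) for snp, p in associations if p < threshold]


associations = [
    ("rs492602", 6.9e-44),    # fucose signal reported in the abstract
    ("rs8101881", 1.2e-33),   # lysine signal reported in the abstract
    ("rs_example", 1.0e-6),   # hypothetical SNP, for illustration only
]

cutoff = bonferroni_threshold()
print(f"cutoff = {cutoff:.1e}")
for snp, p in passing(associations, cutoff):
    print(snp, p)
```

Both reported SNPs pass the cutoff by many orders of magnitude, while the hypothetical third entry does not.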

  5. A refined study of FCRL genes from a genome-wide association study for Graves' disease

    National Research Council Canada - National Science Library

    Zhao, Shuang-Xia; Liu, Wei; Zhan, Ming; Song, Zhi-Yi; Yang, Shao-Ying; Xue, Li-Qiong; Pan, Chun-Ming; Gu, Zhao-Hui; Liu, Bing-Li; Wang, Hai-Ning; Liang, Liming; Liang, Jun; Zhang, Xiao-Mei; Yuan, Guo-Yue; Li, Chang-Gui; Chen, Ming-Dao; Chen, Jia-Lun; Gao, Guan-Qi; Song, Huai-Dong

    2013-01-01

    To pinpoint the exact location of the etiological variant/s present at 1q21.1 harboring FCRL1-5 and CD5L genes, we carried out a refined association study in the entire FCRL region in 1,536 patients with Graves' disease (GD...

  6. Comparative Study of Apoptosis-related Gene Loci in Human, Mouse and Rat Genomes

    Institute of Scientific and Technical Information of China (English)

    Yan-Bin YIN; Yong ZHANG; Peng YU; Jing-Chu LUO; Ying JIANG; Song-Gang LI

    2005-01-01

    Many genes are involved in mammalian cell apoptosis pathway. These apoptosis genes often contain characteristic functional domains, and can be classified into at least 15 functional groups, according to previous reports. Using an integrated bioinformatics platform for motif or domain search from three public mammalian proteomes (International Protein Index database for human, mouse, and rat), we systematically cataloged all of the proteins involved in mammalian apoptosis pathway. By localizing those proteins onto the genomes, we obtained a gene locus centric apoptosis gene catalog for human, mouse and rat.Further phylogenetic analysis showed that most of the apoptosis related gene loci are conserved among these three mammals. Interestingly, about one-third of apoptosis gene loci form gene clusters on mammal chromosomes, and exist in the three species, which indicated that mammalian apoptosis gene orders are also conserved. In addition, some tandem duplicated gene loci were revealed by comparing gene loci clusters in the three species. All data produced in this work were stored in a relational database and may be viewed at http://pcas.cbi.pku.edu.cn/database/apd.php.

  7. Genome-Wide association study identifies candidate genes for Parkinson's disease in an Ashkenazi Jewish population

    Directory of Open Access Journals (Sweden)

    Liu Xinmin

    2011-08-01

    Full Text Available. Background: To date, nine Parkinson disease (PD) genome-wide association studies in North American, European and Asian populations have been published. The majority of studies have confirmed the association of the previously identified genetic risk factors, SNCA and MAPT, and two studies have identified three new PD susceptibility loci/genes (PARK16, BST1 and HLA-DRB5). In a recent meta-analysis of datasets from five of the published PD GWAS an additional 6 novel candidate genes (SYT11, ACMSD, STK39, MCCC1/LAMP3, GAK and CCDC62/HIP1R) were identified. Collectively the associations identified in these GWAS account for only a small proportion of the estimated total heritability of PD suggesting that an 'unknown' component of the genetic architecture of PD remains to be identified. Methods: We applied a GWAS approach to a relatively homogeneous Ashkenazi Jewish (AJ) population from New York to search for both 'rare' and 'common' genetic variants that confer risk of PD by examining any SNPs with allele frequencies exceeding 2%. We have focused on a genetic isolate, the AJ population, as a discovery dataset since this cohort has a higher sharing of genetic background and historically experienced a significant bottleneck. We also conducted a replication study using two publicly available datasets from dbGaP. The joint analysis dataset had a combined sample size of 2,050 cases and 1,836 controls. Results: We identified the top 57 SNPs showing the strongest evidence of association in the AJ dataset (p -5. Six SNPs located within gene regions had positive signals in at least one other independent dbGaP dataset: LOC100505836 (Chr3p24), LOC153328/SLC25A48 (Chr5q31.1), UNC13B (9p13.3), SLCO3A1 (15q26.1), WNT3 (17q21.3) and NSF (17q21.3). We also replicated published associations for the gene regions SNCA (Chr4q21; rs3775442, p = 0.037), PARK16 (Chr1q32.1; rs823114 (NUCKS1), p = 6.12 × 10(-4)), BST1 (Chr4p15; rs12502586, p = 0.027), STK39 (Chr2q24.3; rs3754775, p = 0

  8. JGI Plant Genomics Gene Annotation Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Shu, Shengqiang; Rokhsar, Dan; Goodstein, David; Hayes, David; Mitros, Therese

    2014-07-14

    Plant genomes vary in size and are highly complex, with a high proportion of repeats and extensive genome and tandem duplication. Genes encode a wealth of information useful for studying an organism, so high-quality, stable gene annotation is critical. Thanks to advances in sequencing technology, the genomes and transcriptomes of many plant species have been sequenced. An automatic pipeline is needed to turn these very large amounts of sequence data into gene annotations or re-annotations in a timely fashion. The JGI plant genomics gene annotation pipeline, called integrated gene call (IGC), is our effort toward this aim, aided by an RNA-seq transcriptome assembly pipeline. It utilizes several gene predictors based on homologous peptides and transcript ORFs. See Methods for details. Here we present the genome annotations of JGI flagship green plants produced by this pipeline, plus Arabidopsis and rice; the exception is chlamy (Chlamydomonas), which was annotated by a third party. The genome annotations of these species and others are used in our gene family build pipeline and are accessible via the JGI Phytozome portal, whose URL and front page snapshot are shown below.

  9. Brief Guide to Genomics: DNA, Genes and Genomes

    Science.gov (United States)

    ... Deoxyribonucleic acid (DNA) is ... genetic basis for health and disease. Implications of Genomics for Medical Science Virtually every human ailment has ...

  10. Genome-wide association study identifies candidate genes for Parkinson's disease in an Ashkenazi Jewish population.

    Science.gov (United States)

    Liu, Xinmin; Cheng, Rong; Verbitsky, Miguel; Kisselev, Sergey; Browne, Andrew; Mejia-Sanatana, Helen; Louis, Elan D; Cote, Lucien J; Andrews, Howard; Waters, Cheryl; Ford, Blair; Frucht, Steven; Fahn, Stanley; Marder, Karen; Clark, Lorraine N; Lee, Joseph H

    2011-08-03

    To date, nine Parkinson disease (PD) genome-wide association studies in North American, European and Asian populations have been published. The majority of studies have confirmed the association of the previously identified genetic risk factors, SNCA and MAPT, and two studies have identified three new PD susceptibility loci/genes (PARK16, BST1 and HLA-DRB5). In a recent meta-analysis of datasets from five of the published PD GWAS an additional 6 novel candidate genes (SYT11, ACMSD, STK39, MCCC1/LAMP3, GAK and CCDC62/HIP1R) were identified. Collectively the associations identified in these GWAS account for only a small proportion of the estimated total heritability of PD suggesting that an 'unknown' component of the genetic architecture of PD remains to be identified. We applied a GWAS approach to a relatively homogeneous Ashkenazi Jewish (AJ) population from New York to search for both 'rare' and 'common' genetic variants that confer risk of PD by examining any SNPs with allele frequencies exceeding 2%. We have focused on a genetic isolate, the AJ population, as a discovery dataset since this cohort has a higher sharing of genetic background and historically experienced a significant bottleneck. We also conducted a replication study using two publicly available datasets from dbGaP. The joint analysis dataset had a combined sample size of 2,050 cases and 1,836 controls. We identified the top 57 SNPs showing the strongest evidence of association in the AJ dataset (p dataset: LOC100505836 (Chr3p24), LOC153328/SLC25A48 (Chr5q31.1), UNC13B (9p13.3), SLCO3A1(15q26.1), WNT3(17q21.3) and NSF (17q21.3). We also replicated published associations for the gene regions SNCA (Chr4q21; rs3775442, p = 0.037), PARK16 (Chr1q32.1; rs823114 (NUCKS1), p = 6.12 × 10(-4)), BST1 (Chr4p15; rs12502586, p = 0.027), STK39 (Chr2q24.3; rs3754775, p = 0.005), and LAMP3 (Chr3; rs12493050, p = 0.005) in addition to the two most common PD susceptibility genes in the AJ population LRRK2 (Chr12q12

  11. Derivation of consensus inactivation status for X-linked genes from genome-wide studies.

    Science.gov (United States)

    Balaton, Bradley P; Cotton, Allison M; Brown, Carolyn J

    2015-01-01

    X chromosome inactivation (XCI) is the epigenetic silencing of the majority of the genes on one of the X chromosomes in XX therian mammals. In humans, approximately 15% of genes consistently escape from this inactivation and another 15% of genes vary between individuals or tissues in whether they are subject to, or escape from, inactivation. Multiple studies have provided inactivation status calls for a large subset of the genes on the X chromosome; however, these studies vary in which genes they were able to make calls for and, in some cases, in which call they give a specific gene. This analysis aggregated three published studies that have examined the X chromosome inactivation status of genes across the X chromosome, generating consensus calls and identifying discordancies. The impact of expression level and chromosomal location on X chromosome inactivation status was also assessed. Overall, we assigned a consensus XCI status to 639 genes, including 78% of protein-coding genes expressed outside of the testes, with a lower frequency for non-coding RNA and testis-specific genes. Study-specific discordancies suggest that there may be instability of XCI during cell culture and also highlight study-specific variations in call type. We observe an enrichment of discordant genes at boundaries between genes subject to and escaping from inactivation. This study has compiled a comprehensive list of X-chromosome inactivation statuses for genes and also discovered some biases which will help guide future studies examining X-chromosome inactivation.
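
Aggregating per-study status calls into a consensus, as this abstract describes, can be sketched as a simple voting rule. The abstract does not give the authors' actual rules, so the labels ("subject", "escape", "variable"), the majority logic, and the gene names below are all hypothetical and chosen only for illustration. Studies that made no call for a gene are represented by `None`.

```python
from collections import Counter


def consensus_call(calls):
    """Combine per-study XCI calls for one gene into a single label.

    calls: one entry per study, either a status string or None if that
    study made no call. The rules here are illustrative assumptions.
    """
    made = [c for c in calls if c is not None]
    if not made:
        return "no call"
    top, votes = Counter(made).most_common(1)[0]
    if votes == len(made):
        return top                 # every calling study agrees
    if votes > len(made) / 2:
        return f"mostly {top}"     # strict majority, with some disagreement
    return "discordant"            # no majority among the calling studies


# Hypothetical genes and calls from three studies.
study_calls = {
    "GENE_A": ["subject", "subject", None],
    "GENE_B": ["escape", "subject", "escape"],
    "GENE_C": ["escape", "subject", None],
    "GENE_D": [None, None, None],
}

for gene, calls in study_calls.items():
    print(gene, consensus_call(calls))
```

Keeping "discordant" as its own category, rather than forcing a call, preserves the disagreements that the abstract treats as informative.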

  12. Comparative genomic analysis of soybean flowering genes.

    Directory of Open Access Journals (Sweden)

    Chol-Hee Jung

    Full Text Available. Flowering is an important agronomic trait that determines crop yield. Soybean is a major oilseed legume crop used for human and animal feed. Legumes have unique vegetative and floral complexities. Our understanding of the molecular basis of flower initiation and development in legumes is limited. Here, we address this by using a computational approach to examine flowering regulatory genes in the soybean genome in comparison to the most studied model plant, Arabidopsis. For this comparison, a genome-wide analysis of orthologue groups was performed, followed by an in silico gene expression analysis of the identified soybean flowering genes. Phylogenetic analyses of the gene families highlighted the evolutionary relationships among these candidates. Our study identified key flowering genes in soybean and indicates that the vernalisation and the ambient-temperature pathways seem to be the most variant in soybean. A comparison of the orthologue groups containing flowering genes indicated that, on average, each Arabidopsis flowering gene has 2-3 orthologous copies in soybean. Our analysis highlighted that the CDF3, VRN1, SVP, AP3 and PIF3 genes are paralogue-rich genes in soybean. Furthermore, the genome mapping of the soybean flowering genes showed that these genes are scattered randomly across the genome. A paralogue comparison indicated that the soybean genes comprising the largest orthologue group are clustered in a 1.4 Mb region on chromosome 16 of soybean. Furthermore, a comparison with the undomesticated soybean (Glycine soja) revealed that there are hundreds of SNPs that are associated with putative soybean flowering genes and that there are structural variants that may affect the genes of the light-signalling and ambient-temperature pathways in soybean. Our study provides a framework for the soybean flowering pathway and insights into the relationship and evolution of flowering genes between a short-day soybean and the long-day plant

  13. Genome-wide gene-environment study identifies glutamate receptor gene GRIN2A as a Parkinson's disease modifier gene via interaction with coffee.

    Science.gov (United States)

    Hamza, Taye H; Chen, Honglei; Hill-Burns, Erin M; Rhodes, Shannon L; Montimurro, Jennifer; Kay, Denise M; Tenesa, Albert; Kusel, Victoria I; Sheehan, Patricia; Eaaswarkhanth, Muthukrishnan; Yearout, Dora; Samii, Ali; Roberts, John W; Agarwal, Pinky; Bordelon, Yvette; Park, Yikyung; Wang, Liyong; Gao, Jianjun; Vance, Jeffery M; Kendler, Kenneth S; Bacanu, Silviu-Alin; Scott, William K; Ritz, Beate; Nutt, John; Factor, Stewart A; Zabetian, Cyrus P; Payami, Haydeh

    2011-08-01

    Our aim was to identify genes that influence the inverse association of coffee with the risk of developing Parkinson's disease (PD). We used genome-wide genotype data and lifetime caffeinated-coffee-consumption data on 1,458 persons with PD and 931 without PD from the NeuroGenetics Research Consortium (NGRC), and we performed a genome-wide association and interaction study (GWAIS), testing each SNP's main-effect plus its interaction with coffee, adjusting for sex, age, and two principal components. We then stratified subjects as heavy or light coffee-drinkers and performed genome-wide association study (GWAS) in each group. We replicated the most significant SNP. Finally, we imputed the NGRC dataset, increasing genomic coverage to examine the region of interest in detail. The primary analyses (GWAIS, GWAS, Replication) were performed using genotyped data. In GWAIS, the most significant signal came from rs4998386 and the neighboring SNPs in GRIN2A. GRIN2A encodes an NMDA-glutamate-receptor subunit and regulates excitatory neurotransmission in the brain. Achieving P(2df) = 10(-6), GRIN2A surpassed all known PD susceptibility genes in significance in the GWAIS. In stratified GWAS, the GRIN2A signal was present in heavy coffee-drinkers (OR = 0.43; P = 6×10(-7)) but not in light coffee-drinkers. The a priori Replication hypothesis that "Among heavy coffee-drinkers, rs4998386_T carriers have lower PD risk than rs4998386_CC carriers" was confirmed: OR(Replication) = 0.59, P(Replication) = 10(-3); OR(Pooled) = 0.51, P(Pooled) = 7×10(-8). Compared to light coffee-drinkers with rs4998386_CC genotype, heavy coffee-drinkers with rs4998386_CC genotype had 18% lower risk (P = 3×10(-3)), whereas heavy coffee-drinkers with rs4998386_TC genotype had 59% lower risk (P = 6×10(-13)). Imputation revealed a block of SNPs that achieved P(2df)coffee-drinkers. This study is proof of concept that inclusion of environmental factors can help identify

  14. Genome-Wide Gene-Environment Study Identifies Glutamate Receptor Gene GRIN2A as a Parkinson's Disease Modifier Gene via Interaction with Coffee

    Science.gov (United States)

    Hamza, Taye H.; Chen, Honglei; Hill-Burns, Erin M.; Rhodes, Shannon L.; Montimurro, Jennifer; Kay, Denise M.; Tenesa, Albert; Kusel, Victoria I.; Sheehan, Patricia; Eaaswarkhanth, Muthukrishnan; Yearout, Dora; Samii, Ali; Roberts, John W.; Agarwal, Pinky; Bordelon, Yvette; Park, Yikyung; Wang, Liyong; Gao, Jianjun; Vance, Jeffery M.; Kendler, Kenneth S.; Bacanu, Silviu-Alin; Scott, William K.; Ritz, Beate; Nutt, John; Factor, Stewart A.; Zabetian, Cyrus P.; Payami, Haydeh

    2011-01-01

    Our aim was to identify genes that influence the inverse association of coffee with the risk of developing Parkinson's disease (PD). We used genome-wide genotype data and lifetime caffeinated-coffee-consumption data on 1,458 persons with PD and 931 without PD from the NeuroGenetics Research Consortium (NGRC), and we performed a genome-wide association and interaction study (GWAIS), testing each SNP's main-effect plus its interaction with coffee, adjusting for sex, age, and two principal components. We then stratified subjects as heavy or light coffee-drinkers and performed genome-wide association study (GWAS) in each group. We replicated the most significant SNP. Finally, we imputed the NGRC dataset, increasing genomic coverage to examine the region of interest in detail. The primary analyses (GWAIS, GWAS, Replication) were performed using genotyped data. In GWAIS, the most significant signal came from rs4998386 and the neighboring SNPs in GRIN2A. GRIN2A encodes an NMDA-glutamate-receptor subunit and regulates excitatory neurotransmission in the brain. Achieving P(2df) = 10(-6), GRIN2A surpassed all known PD susceptibility genes in significance in the GWAIS. In stratified GWAS, the GRIN2A signal was present in heavy coffee-drinkers (OR = 0.43; P = 6×10(-7)) but not in light coffee-drinkers. The a priori Replication hypothesis that “Among heavy coffee-drinkers, rs4998386_T carriers have lower PD risk than rs4998386_CC carriers” was confirmed: OR(Replication) = 0.59, P(Replication) = 10(-3); OR(Pooled) = 0.51, P(Pooled) = 7×10(-8). Compared to light coffee-drinkers with rs4998386_CC genotype, heavy coffee-drinkers with rs4998386_CC genotype had 18% lower risk (P = 3×10(-3)), whereas heavy coffee-drinkers with rs4998386_TC genotype had 59% lower risk (P = 6×10(-13)). Imputation revealed a block of SNPs that achieved P2dfcoffee-drinkers. This study is proof of concept that inclusion of environmental factors can help identify genes that

  15. A survey of genomic studies supports association of circadian clock genes with bipolar disorder spectrum illnesses and lithium response.

    Directory of Open Access Journals (Sweden)

    Michael J McCarthy

    Full Text Available. Circadian rhythm abnormalities in bipolar disorder (BD) have led to a search for genetic abnormalities in circadian "clock genes" associated with BD. However, no significant clock gene findings have emerged from genome-wide association studies (GWAS). At least three factors could account for this discrepancy: complex traits are polygenic, the organization of the clock is more complex than previously recognized, and/or genetic risk for BD may be shared across multiple illnesses. To investigate these issues, we considered the clock gene network at three levels: essential "core" clock genes, upstream circadian clock modulators, and downstream clock controlled genes. Using relaxed thresholds for GWAS statistical significance, we determined the rates of clock vs. control genetic associations with BD, and four additional illnesses that share clinical features and/or genetic risk with BD (major depression, schizophrenia, attention deficit/hyperactivity). Then we compared the results to a set of lithium-responsive genes. Associations with BD-spectrum illnesses and lithium-responsiveness were both enriched among core clock genes but not among upstream clock modulators. Associations with BD-spectrum illnesses and lithium-responsiveness were also enriched among pervasively rhythmic clock-controlled genes but not among genes that were less pervasively rhythmic or non-rhythmic. Our analysis reveals previously unrecognized associations between clock genes and BD-spectrum illnesses, partly reconciling previously discordant results from past GWAS and candidate gene studies.

  16. Comparative genomics study of polyhydroxyalkanoates (PHA) and ectoine relevant genes from Halomonas sp. TD01 revealed extensive horizontal gene transfer events and co-evolutionary relationships

    Directory of Open Access Journals (Sweden)

    Cai Lei

    2011-11-01

    Full Text Available. Background: Halophilic bacteria have shown their significance in the industrial production of polyhydroxyalkanoates (PHA) and are gaining more attention for genetic engineering modification. Yet little information on the genomics and PHA-related genes of halophilic bacteria has been disclosed so far. Results: The draft genome of the moderately halophilic bacterium Halomonas sp. TD01, a strain of great potential for industrial production of short-chain-length polyhydroxyalkanoates (PHA), was analyzed through computational methods to reveal the osmoregulation mechanism and the evolutionary relationship of the enzymes relevant to PHA and ectoine syntheses. Genes involved in the metabolism of PHA and osmolytes were annotated and studied in silico. Although PHA synthase, depolymerase, regulator/repressor and phasin were all involved in PHA metabolic pathways, they demonstrated different horizontal gene transfer (HGT) events between the genomes of different strains. In contrast, co-occurrence of ectoine genes in the same genome was more frequently observed, and ectoine genes were more likely under coincidental horizontal gene transfer than PHA-related genes. In addition, the adjacent organization of the homologues of PHA synthase phaC1 and PHA granule binding protein phaP was conserved in the strain TD01, which was also observed in some halophiles and non-halophiles exclusively from γ-proteobacteria. In contrast to haloarchaea, the proteome of Halomonas sp. TD01 did not show obvious inclination towards acidity relative to non-halophilic Escherichia coli MG1655, which signified that Halomonas sp. TD01 preferred the accumulation of organic osmolytes to ions in order to balance the intracellular osmotic pressure with the environment. Conclusions: The accessibility of genome information would facilitate research on the genetic engineering of halophilic bacteria including Halomonas sp. TD01.

  17. Genome-wide gene expression study indicates the anti-inflammatory effect of polarized light in recurrent childhood respiratory disease.

    Science.gov (United States)

    Falus, A; Fenyo, M; Éder, K; Madarasi, A

    2011-10-01

    The clinical and molecular effects of whole-body polarized light treatment on children suffering from recurrent respiratory infection were studied. The incidence and duration of respiratory symptoms as well as the length of appropriate antibiotic therapy were measured. Simultaneously, the genome-wide gene expression pattern was examined by whole genome cDNA microarray in peripheral lymphocytes of the children. Twenty of 25 children showed a marked clinical improvement, while five of 25 had a poor response or no change. The gene expression pattern of the patients' peripheral lymphocytes was compared between favorable and poor responders. The lymphocytes of the children with a documented improved clinical response to polarized light therapy showed a decrease in the expression of chemokine genes, such as CXCL1, CXCL2, CXCL3, and IL-8, and in that of the TNFα gene. In contrast, a rapid elevation was found in the expression of the gene encoding CYP4F2, a leukotriene B4-metabolizing enzyme. In children with a poor clinical response to polarized light therapy, no similar changes were detected in the gene expression pattern of the lymphocytes. The improved clinical symptoms and modified gene expression profile of lymphocytes reveal an anti-inflammatory effect of whole-body polarized light irradiation.

  18. Genome-Wide Association Study of Intelligence: Additive Effects of Novel Brain Expressed Genes

    Science.gov (United States)

    Loo, Sandra K.; Shtir, Corina; Doyle, Alysa E.; Mick, Eric; McGough, James J.; McCracken, James; Biederman, Joseph; Smalley, Susan L.; Cantor, Rita M.; Faraone, Stephen V.; Nelson, Stanley F.

    2012-01-01

    Objective: The purpose of the present study was to identify common genetic variants that are associated with human intelligence or general cognitive ability. Method: We performed a genome-wide association analysis with a dense set of 1 million single-nucleotide polymorphisms (SNPs) and quantitative intelligence scores within an ancestrally…

  20. Genomic disorders: A window into human gene and genome evolution

    Science.gov (United States)

    Carvalho, Claudia M. B.; Zhang, Feng; Lupski, James R.

    2010-01-01

    Gene duplications alter the genetic constitution of organisms and can be a driving force of molecular evolution in humans and the great apes. In this context, the study of genomic disorders has uncovered the essential role played by the genomic architecture, especially low copy repeats (LCRs) or segmental duplications (SDs). In fact, regardless of the mechanism, LCRs can mediate or stimulate rearrangements, inciting genomic instability and generating dynamic and unstable regions prone to rapid molecular evolution. In humans, copy-number variation (CNV) has been implicated in common traits such as neuropathy, hypertension, color blindness, infertility, and behavioral traits including autism and schizophrenia, as well as disease susceptibility to HIV, lupus nephritis, and psoriasis among many other clinical phenotypes. The same mechanisms implicated in the origin of genomic disorders may also play a role in the emergence of segmental duplications and the evolution of new genes by means of genomic and gene duplication and triplication, exon shuffling, exon accretion, and fusion/fission events. PMID:20080665

  1. Gene-Environment Interactions in Genome-Wide Association Studies: Current Approaches and New Directions

    Science.gov (United States)

    Winham, Stacey J.; Biernacka, Joanna M.

    2013-01-01

    Background: Complex psychiatric traits have long been thought to be the result of a combination of genetic and environmental factors, and gene-environment interactions are thought to play a crucial role in behavioral phenotypes and the susceptibility and progression of psychiatric disorders. Candidate gene studies to investigate hypothesized…

  2. Gene discovery in the Entamoeba invadens genome.

    Science.gov (United States)

    Wang, Zheng; Samuelson, John; Clark, C Graham; Eichinger, Daniel; Paul, Jaishree; Van Dellen, Katrina; Hall, Neil; Anderson, Iain; Loftus, Brendan

    2003-06-01

    Entamoeba invadens, a parasite of reptiles, is a model for the study of encystation by the human enteric pathogen Entamoeba histolytica, because E. invadens form cysts in axenic culture. With approximately 0.5-fold sequence coverage of the genome, we were able to get insights into E. invadens gene and genome features. Overall, the E. invadens genome displays many of the features that are emerging from ongoing genome sequencing efforts in E. histolytica. At the nucleotide level the E. invadens genome has on average 60% sequence identity with that of E. histolytica. The presence of introns in E. invadens was predicted with similar consensus (GTTTGT em leader A/TAG) sequences to those identified in E. histolytica and Entamoeba dispar. Sequences highly repeated in the genome of E. histolytica (rRNAs, tRNAs, CXXC-rich proteins, and Leu-rich repeat proteins) were found to be highly repeated in the E. invadens genome. Numerous proteins homologous to those implicated in amoebic virulence, (Gal/GalNAc lectins, amoebapores, and cysteine proteinases) and drug resistance (p-glycoproteins) were identified. Homologs of proteins involved in cell cycle, vesicular trafficking and signal transduction were identified, which may be involved in en/excystation and cell growth of E. invadens. Finally, multiple copies of a number of E. invadens genes coding for predicted enzymes involved in core metabolism and the targets of anti-amoebic drugs were identified.

  3. Integrating genome-wide association study and expression quantitative trait loci data identifies multiple genes and gene set associated with neuroticism.

    Science.gov (United States)

    Fan, Qianrui; Wang, Wenyu; Hao, Jingcan; He, Awen; Wen, Yan; Guo, Xiong; Wu, Cuiyan; Ning, Yujie; Wang, Xi; Wang, Sen; Zhang, Feng

    2017-08-01

    Neuroticism is a fundamental personality trait with a significant genetic determinant. To identify novel susceptibility genes for neuroticism, we conducted an integrative analysis of genomic and transcriptomic data from genome-wide association study (GWAS) and expression quantitative trait locus (eQTL) studies. GWAS summary data were derived from published studies of neuroticism, involving a total of 170,906 subjects. An eQTL dataset containing 927,753 eQTLs was obtained from an eQTL meta-analysis of 5311 samples. Integrative analysis of GWAS and eQTL data was conducted with summary data-based Mendelian randomization (SMR) analysis software. To identify neuroticism-associated gene sets, the SMR analysis results were further subjected to gene set enrichment analysis (GSEA). The gene set annotation dataset (containing 13,311 annotated gene sets) of the GSEA Molecular Signatures Database was used. SMR single gene analysis identified 6 significant genes for neuroticism, including MSRA (p value=2.27×10⁻¹⁰), MGC57346 (p value=6.92×10⁻⁷), BLK (p value=1.01×10⁻⁶), XKR6 (p value=1.11×10⁻⁶), C17ORF69 (p value=1.12×10⁻⁶) and KIAA1267 (p value=4.00×10⁻⁶). Gene set enrichment analysis observed a significant association for the Chr8p23 gene set (false discovery rate=0.033). Our results provide novel clues for the genetic mechanism studies of neuroticism. Copyright © 2017. Published by Elsevier Inc.

  4. LegumeIP 2.0--a platform for the study of gene function and genome evolution in legumes.

    Science.gov (United States)

    Li, Jun; Dai, Xinbin; Zhuang, Zhaohong; Zhao, Patrick X

    2016-01-04

    The LegumeIP 2.0 database hosts large-scale genomics and transcriptomics data and provides integrative bioinformatics tools for the study of gene function and evolution in legumes. Our recent updates in LegumeIP 2.0 include gene and protein sequences, gene models and annotations, syntenic regions, protein families and phylogenetic trees for six legume species: Medicago truncatula, Glycine max (soybean), Lotus japonicus, Phaseolus vulgaris (common bean), Cicer arietinum (chickpea) and Cajanus cajan (pigeon pea) and two outgroup reference species: Arabidopsis thaliana and Populus trichocarpa. Moreover, LegumeIP 2.0 features the following new data resources and bioinformatics tools: (i) an integrative gene expression atlas for four model legumes that includes 550 array hybridizations from M. truncatula, 962 gene expression profiles of G. max, 276 array hybridizations from L. japonicus and 56 RNA-Seq-based gene expression profiles for C. arietinum. These datasets were manually curated and hierarchically organized based on Experimental Ontology and Plant Ontology so that users can browse, search, and retrieve data for their selected experiments. (ii) New functions/analytical tools to query, mine and visualize large-scale gene sequences, annotations and transcriptome profiles. Users may select a subset of expression experiments and visualize and compare expression profiles for multiple genes. The LegumeIP 2.0 database is freely available to the public at http://plantgrn.noble.org/LegumeIP/.

  5. Gene-based meta-analysis of genome-wide association studies implicates new loci involved in obesity

    DEFF Research Database (Denmark)

    Hägg, Sara; Ganna, Andrea; Van Der Laan, Sander W

    2015-01-01

    To date, genome-wide association studies (GWASs) have identified >100 loci with single variants associated with body mass index (BMI). This approach may miss loci with high allelic heterogeneity; therefore, the aim of the present study was to use gene-based meta-analysis to identify regions with high allelic heterogeneity to discover additional obesity susceptibility loci. We included GWAS data from 123 865 individuals of European descent from 46 cohorts in Stage 1 and Metabochip data from additional 103 046 individuals from 43 cohorts in Stage 2, all within the Genetic Investigation of ANthropometric Traits (GIANT) consortium. Each cohort was tested for association between ∼2.4 million (Stage 1) or ∼200 000 (Stage 2) imputed or genotyped single variants and BMI, and summary statistics were subsequently meta-analyzed in 17 941 genes. We used the ‘VErsatile Gene-based Association Study’ (VEGAS...

  6. Dissecting inflammatory complications in critically injured patients by within-patient gene expression changes: a longitudinal clinical genomics study.

    Directory of Open Access Journals (Sweden)

    Keyur H Desai

    2011-09-01

    Full Text Available BACKGROUND: Trauma is the number one killer of individuals 1-44 y of age in the United States. The prognosis and treatment of inflammatory complications in critically injured patients continue to be challenging, with a history of failed clinical trials and poorly understood biology. New approaches are therefore needed to improve our ability to diagnose and treat this clinical condition. METHODS AND FINDINGS: We conducted a large-scale study on 168 blunt-force trauma patients over 28 d, measuring ∼400 clinical variables and longitudinally profiling leukocyte gene expression with ∼800 microarrays. Marshall MOF (multiple organ failure) clinical score trajectories were first utilized to organize the patients into five categories of increasingly poor outcomes. We then developed an analysis framework modeling early within-patient expression changes to produce a robust characterization of the genomic response to trauma. A quarter of the genome shows early expression changes associated with longer-term post-injury complications, captured by at least five dynamic co-expression modules of functionally related genes. In particular, early down-regulation of MHC-class II genes and up-regulation of the p38 MAPK signaling pathway were found to strongly associate with longer-term post-injury complications, providing discrimination among patient outcomes from expression changes during the 40-80 h window post-injury. CONCLUSIONS: The genomic characterization provided here substantially expands the scope by which the molecular response to trauma may be characterized and understood. These results may be instrumental in furthering our understanding of the disease process and identifying potential targets for therapeutic intervention. Additionally, the quantitative approach we have introduced is potentially applicable to future genomics studies of rapidly progressing clinical conditions. TRIAL REGISTRATION: ClinicalTrials.gov NCT00257231

  7. Genome-Wide Association Study with Sequence Variants Identifies Candidate Genes for Mastitis Resistance in Dairy Cattle

    DEFF Research Database (Denmark)

    Sahana, Goutam; Guldbrandtsen, Bernt; Bendixen, Christian

    Six genomic regions affecting clinical mastitis were identified through a GWAS study with imputed BovineHD chip genotype data in the Nordic Holstein cattle population. The association analyses were carried out using a SNP-by-SNP analysis by fitting the regression of allele dosage and a polygenic...... Effect Predictor (VEP) vers. 2.6 using ENSEMBL vers. 67 databases. Candidate polymorphisms affecting clinical mastitis were selected based on their association with the traits and functional annotations. A strong positional candidate gene for mastitis resistance on chromosome-6 is the NPFFR2 which...... Factor Receptor Alpha (LIFR) emerged as a strong candidate gene for mastitis resistance. The LIFR gene is involved in acute phase response and is expressed in saliva and mammary gland....

  8. Genomic variation in Salmonella enterica core genes for epidemiological typing

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Lukjancenko, Oksana; Rundsten, Carsten Friis

    2012-01-01

    Background: Technological advances in high throughput genome sequencing are making whole genome sequencing (WGS) available as a routine tool for bacterial typing. Standardized procedures for identification of relevant genes and of variation are needed to enable comparison between studies and over...... genomes and evaluate their value as typing targets, comparing whole genome typing and traditional methods such as 16S and MLST. A consensus tree based on variation of core genes gives much better resolution than 16S and MLST; the pan-genome family tree is similar to the consensus tree, but with higher...... that there is a positive selection towards mutations leading to amino acid changes. Conclusions: Genomic variation within the core genome is useful for investigating molecular evolution and providing candidate genes for bacterial genome typing. Identification of genes with different degrees of variation is important...

  9. Assessment of osteoarthritis candidate genes in a meta-analysis of nine genome-wide association studies

    NARCIS (Netherlands)

    C. Rodriguez-Fontenla (Cristina); M. Calaza (Manuel); E. Evangelou (Evangelos); A.M. Valdes (Ana Maria); N.K. Arden (Nigel); F.J. Blanco; A.J. Carr (Andrew Jonathan); K. Chapman (Kay); P. Deloukas (Panagiotis); M. Doherty (Michael); T. Esko (Tõnu); C.M. Garcés Aletá (Carlos); J.J. Gomez-Reino Carnota (Juan); H.T. Helgadottir (Hafdis); A. Hofman (Albert); I. Jonsdottir (Ingileif); J.M. Kerkhof (Hanneke); M. Kloppenburg (Margreet); A. McCaskie (Andrew); E.E. Ntzani (Evangelia); W.E.R. Ollier (William); N. Oreiro (Natividad); K. Panoutsopoulou (Kalliope); S.H. Ralston (Stuart); Y.F.M. Ramos (Yolande); J.A. Riancho (José); F. Rivadeneira Ramirez (Fernando); P.E. Slagboom (Eline); U. Styrkarsdottir (Unnur); U. Thorsteinsdottir (Unnur); G. Thorleifsson (Gudmar); A. Tsezou (Aspasia); A.G. Uitterlinden (André); G.A. Wallis (Gillian); J.M. Wilkinson (Mark); G. Zhai (Guangju); Y. Zhu (Yanyan); D. Felson; J.P.A. Ioannidis (John); J. Loughlin (John); A. Metspalu (Andres); I. Meulenbelt (Ingrid); J-A. Zwart (John-Anker); J.B.J. van Meurs (Joyce); E. Zeggini (Eleftheria); T.D. Spector (Timothy); A. Gonzalez (Antonio)

    2014-01-01

    Objective To assess candidate genes for association with osteoarthritis (OA) and identify promising genetic factors and, secondarily, to assess the candidate gene approach in OA. Methods A total of 199 candidate genes for association with OA were identified using Human Genome Epidemiolog

  10. Bivariate genome-wide association study suggests that the DARC gene influences lean body mass and age at menarche.

    Science.gov (United States)

    Hai, Rong; Zhang, Lei; Pei, Yufang; Zhao, Lanjuan; Ran, Shu; Han, Yingying; Zhu, Xuezhen; Shen, Hui; Tian, Qing; Deng, Hongwen

    2012-06-01

    Lean body mass (LBM) and age at menarche (AAM) are two important complex traits for human health. The aim of this study was to identify pleiotropic genes for both traits using a powerful bivariate genome-wide association study (GWAS). Two studies, a discovery study and a replication study, were performed. In the discovery study, 909622 single nucleotide polymorphisms (SNPs) were genotyped in 801 unrelated female Han Chinese subjects using the Affymetrix human genome-wide SNP array 6.0 platform. Then, a bivariate GWAS was performed to identify the SNPs that may be important for LBM and AAM. In the replication study, significant findings from the discovery study were validated in 1692 unrelated Caucasian female subjects. One SNP, rs3027009, was bivariately associated with left arm lean mass and AAM in both the discovery samples (P=7.26×10⁻⁶) and the replication samples (P=0.005). The SNP is located upstream of the DARC (Duffy antigen receptor for chemokines) gene, suggesting that DARC may play an important role in regulating both LBM and AAM.

  11. Hippocampal atrophy as a quantitative trait in a genome-wide association study identifying novel susceptibility genes for Alzheimer's disease.

    Directory of Open Access Journals (Sweden)

    Steven G Potkin

    Full Text Available BACKGROUND: With the exception of the APOE epsilon4 allele, the common genetic risk factors for sporadic Alzheimer's Disease (AD) are unknown. METHODS AND FINDINGS: We completed a genome-wide association study on 381 participants in the ADNI (Alzheimer's Disease Neuroimaging Initiative) study. Samples were genotyped using the Illumina Human610-Quad BeadChip. 516,645 unique Single Nucleotide Polymorphisms (SNPs) were included in the analysis following quality control measures. The genotype data and raw genetic data are freely available for download (LONI, http://www.loni.ucla.edu/ADNI/Data/). Two analyses were completed: a standard case-control analysis, and a novel approach using hippocampal atrophy measured on MRI as an objectively defined, quantitative phenotype. A General Linear Model was applied to identify SNPs for which there was an interaction between the genotype and diagnosis on the quantitative trait. The case-control analysis identified APOE and a new risk gene, TOMM40 (translocase of outer mitochondrial membrane 40), at a genome-wide significance level of ≤10⁻⁶ (10⁻¹¹ for a haplotype). TOMM40 risk alleles were approximately twice as frequent in AD subjects as in controls. The quantitative trait analysis identified 21 genes or chromosomal areas with at least one SNP with a p-value ≤10⁻⁶, which can be considered potential "new" candidate loci to explore in the etiology of sporadic AD. These candidates included EFNA5, CAND1, MAGI2, ARSB, and PRUNE2, genes involved in the regulation of protein degradation, apoptosis, neuronal loss and neurodevelopment. Thus, we identified common genetic variants associated with the increased risk of developing AD in the ADNI cohort, and present publicly available genome-wide data. Supportive evidence based on case-control studies and biological plausibility by gene annotation is provided. Currently, no sample with both imaging and genetic data is available for replication. CONCLUSIONS: Using

  12. Genomic variation in Salmonella enterica core genes for epidemiological typing

    Directory of Open Access Journals (Sweden)

    Leekitcharoenphon Pimlapas

    2012-03-01

    Full Text Available Abstract Background Technological advances in high throughput genome sequencing are making whole genome sequencing (WGS) available as a routine tool for bacterial typing. Standardized procedures for identification of relevant genes and of variation are needed to enable comparison between studies and over time. The core genes--the genes that are conserved in all (or most) members of a genus or species--are potentially good candidates for investigating genomic variation in phylogeny and epidemiology. Results We identify a set of 2,882 core gene clusters based on 73 publicly available Salmonella enterica genomes and evaluate their value as typing targets, comparing whole genome typing and traditional methods such as 16S and MLST. A consensus tree based on variation of core genes gives much better resolution than 16S and MLST; the pan-genome family tree is similar to the consensus tree, but with higher confidence. The core genes can be divided into two categories: a few highly variable genes and a larger set of conserved core genes, with low variance. For the most variable core genes, the variance in amino acid sequences is higher than for the corresponding nucleotide sequences, suggesting that there is a positive selection towards mutations leading to amino acid changes. Conclusions Genomic variation within the core genome is useful for investigating molecular evolution and providing candidate genes for bacterial genome typing. Identification of genes with different degrees of variation is important especially in trend analysis.
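
    The definition of core genes used above (gene clusters conserved in all, or most, genomes) can be illustrated with a minimal sketch. The function name, the `min_fraction` parameter and the toy presence data below are hypothetical and are not taken from the study, which does not describe its implementation here.

```python
from collections import Counter


def core_genes(genomes, min_fraction=1.0):
    """Return gene clusters present in at least `min_fraction` of the genomes.

    `genomes` maps a genome name to the set of gene clusters found in it.
    min_fraction=1.0 gives the strict core (present in every genome).
    """
    if not genomes:
        return set()
    counts = Counter()
    for clusters in genomes.values():
        counts.update(set(clusters))  # count each cluster once per genome
    total = len(genomes)
    return {gene for gene, n in counts.items() if n / total >= min_fraction}


# Hypothetical toy presence data, for illustration only.
toy_genomes = {
    "strain_A": {"gyrB", "rpoB", "fliC", "invA"},
    "strain_B": {"gyrB", "rpoB", "invA"},
    "strain_C": {"gyrB", "rpoB", "fliC"},
}

strict_core = core_genes(toy_genomes)           # present in all three genomes
relaxed_core = core_genes(toy_genomes, 2 / 3)   # present in at least two of three
```

    Lowering `min_fraction` corresponds to the "or most" part of the definition; the threshold actually used for the 2,882 clusters is not stated in this record.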

  13. Synaptotagmin gene content of the sequenced genomes

    Directory of Open Access Journals (Sweden)

    Craxton Molly

    2004-07-01

    Full Text Available Abstract Background Synaptotagmins exist as a large gene family in mammals. There is much interest in the function of certain family members which act crucially in the regulated synaptic vesicle exocytosis required for efficient neurotransmission. Knowledge of the functions of other family members is relatively poor and the presence of Synaptotagmin genes in plants indicates a role for the family as a whole which is wider than neurotransmission. Identification of the Synaptotagmin genes within completely sequenced genomes can provide the entire Synaptotagmin gene complement of each sequenced organism. Defining the detailed structures of all the Synaptotagmin genes and their encoded products can provide a useful resource for functional studies and a deeper understanding of the evolution of the gene family. The current rapid increase in the number of sequenced genomes from different branches of the tree of life, together with the public deposition of evolutionarily diverse transcript sequences make such studies worthwhile. Results I have compiled a detailed list of the Synaptotagmin genes of Caenorhabditis, Anopheles, Drosophila, Ciona, Danio, Fugu, Mus, Homo, Arabidopsis and Oryza by examining genomic and transcript sequences from public sequence databases together with some transcript sequences obtained by cDNA library screening and RT-PCR. I have compared all of the genes and investigated the relationship between plant Synaptotagmins and their non-Synaptotagmin counterparts. Conclusions I have identified and compared 98 Synaptotagmin genes from 10 sequenced genomes. Detailed comparison of transcript sequences reveals abundant and complex variation in Synaptotagmin gene expression and indicates the presence of Synaptotagmin genes in all animals and land plants. Amino acid sequence comparisons indicate patterns of conservation and diversity in function. Phylogenetic analysis shows the origin of Synaptotagmins in multicellular eukaryotes and their

  14. Genome-wide association study for acute otitis media in children identifies FNDC1 as disease contributing gene

    Science.gov (United States)

    van Ingen, Gijs; Li, Jin; Goedegebure, André; Pandey, Rahul; Li, Yun Rose; March, Michael E.; Jaddoe, Vincent W. V.; Bakay, Marina; Mentch, Frank D.; Thomas, Kelly; Wei, Zhi; Chang, Xiao; Hain, Heather S.; Uitterlinden, André G.; Moll, Henriette A.; van Duijn, Cornelia M.; Rivadeneira, Fernando; Raat, Hein; Baatenburg de Jong, Robert J.; Sleiman, Patrick M.; van der Schroeff, Marc P.; Hakonarson, Hakon

    2016-01-01

    Acute otitis media (AOM) is among the most common pediatric diseases, and the most frequent reason for antibiotic treatment in children. Risk of AOM is dependent on environmental and host factors, as well as a significant genetic component. We identify genome-wide significance at a locus on 6q25.3 (rs2932989, Pmeta=2.15 × 10⁻⁹), and show that the associated variants are correlated with the methylation status of the FNDC1 gene (cg05678571, P=1.43 × 10⁻⁶), and further show it is an eQTL for FNDC1 (P=9.3 × 10⁻⁵). The mouse homologue, Fndc1, is expressed in middle ear tissue and its expression is upregulated upon lipopolysaccharide treatment. In this first GWAS of AOM and the largest OM genetic study to date, we identify the first genome-wide significant locus associated with AOM. PMID:27677580

  15. Putative essential and core-essential genes in Mycoplasma genomes.

    Science.gov (United States)

    Lin, Yan; Zhang, Randy Ren

    2011-01-01

    Mycoplasma, which was used to create the first "synthetic life", has been an important genus in the emerging field of synthetic biology. However, essential genes, an important concept of synthetic biology, for both M. mycoides and M. capricolum, as well as 14 other Mycoplasma with available genomes, are still unknown. We have developed a gene essentiality prediction algorithm that incorporates information on biased gene strand distribution, homologous search and codon adaptation index. The algorithm, which achieved an accuracy of 80.8% and 78.9% in self-consistency and cross-validation tests, respectively, predicted 5880 essential genes in the 16 Mycoplasma genomes. The intersection set of essential genes in available Mycoplasma genomes consists of 153 core essential genes. The predicted essential genes (available from pDEG, tubic.tju.edu.cn/pdeg) and the proposed algorithm can be helpful for studying minimal Mycoplasma genomes as well as essential genes in other genomes.
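
    One of the features named above, the codon adaptation index (CAI), is conventionally defined as the geometric mean of the relative adaptiveness values of the codons in a coding sequence. The sketch below follows that standard definition only; the abstract does not say how the authors computed it, and the function name and the weight table are hypothetical toy values.

```python
import math


def codon_adaptation_index(cds, weights):
    """Geometric mean of per-codon relative adaptiveness values.

    `cds` is a coding sequence; `weights` maps a codon to its relative
    adaptiveness (0 < w <= 1). Codons missing from `weights` are skipped.
    """
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Hypothetical weights for two alanine and two lysine codons.
toy_weights = {"GCT": 1.0, "GCC": 0.25, "AAA": 1.0, "AAG": 0.5}

cai_mixed = codon_adaptation_index("GCTGCCAAAAAG", toy_weights)
cai_optimal = codon_adaptation_index("GCTAAA", toy_weights)
```

    In practice the weights would be estimated from a reference set of highly expressed genes of the organism; here they are fixed by hand so the arithmetic is easy to follow.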

  16. Gene-based meta-analysis of genome-wide association studies implicates new loci involved in obesity

    Science.gov (United States)

    Hägg, Sara; Ganna, Andrea; Van Der Laan, Sander W.; Esko, Tonu; Pers, Tune H.; Locke, Adam E.; Berndt, Sonja I.; Justice, Anne E.; Kahali, Bratati; Siemelink, Marten A.; Pasterkamp, Gerard; Strachan, David P.; Speliotes, Elizabeth K.; North, Kari E.; Loos, Ruth J.F.; Hirschhorn, Joel N.; Pawitan, Yudi; Ingelsson, Erik

    2015-01-01

    To date, genome-wide association studies (GWASs) have identified >100 loci with single variants associated with body mass index (BMI). This approach may miss loci with high allelic heterogeneity; therefore, the aim of the present study was to use gene-based meta-analysis to identify regions with high allelic heterogeneity to discover additional obesity susceptibility loci. We included GWAS data from 123 865 individuals of European descent from 46 cohorts in Stage 1 and Metabochip data from additional 103 046 individuals from 43 cohorts in Stage 2, all within the Genetic Investigation of ANthropometric Traits (GIANT) consortium. Each cohort was tested for association between ∼2.4 million (Stage 1) or ∼200 000 (Stage 2) imputed or genotyped single variants and BMI, and summary statistics were subsequently meta-analyzed in 17 941 genes. We used the ‘VErsatile Gene-based Association Study’ (VEGAS) approach to assign variants to genes and to calculate gene-based P-values based on simulations. The VEGAS method was applied to each cohort separately before a gene-based meta-analysis was performed. In Stage 1, two known (FTO and TMEM18) and six novel (PEX2, MTFR2, SSFA2, IARS2, CEP295 and TXNDC12) loci were associated with BMI (P gene tests). We confirmed all loci, and six of them were gene-wide significant in Stage 2 alone. We provide biological support for the loci by pathway, expression and methylation analyses. Our results indicate that gene-based meta-analysis of GWAS provides a useful strategy to find loci of interest that were not identified in standard single-marker analyses due to high allelic heterogeneity. PMID:26376864
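
    The abstract reports 17 941 gene-based tests but the significance threshold is not legible in this record. As a worked illustration only, assuming a Bonferroni correction at a family-wise error rate of 0.05 (an assumption, not a statement of what the authors did), a gene-wide threshold can be computed as follows; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 17 941 genes is the number of gene-based tests stated in the abstract;
# alpha = 0.05 is an assumed family-wise error rate.
gene_wide_threshold = bonferroni_threshold(0.05, 17_941)
```

    Under these assumptions the threshold is on the order of 10⁻⁶, far less stringent than the thresholds typical of single-variant tests over millions of markers, which is one reason gene-based tests can recover loci that single-marker analyses miss.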

  17. Functional analysis of seven genes linked to body mass index and adiposity by genome-wide association studies: a review.

    Science.gov (United States)

    Speakman, John R

    2013-01-01

    Genome-wide association studies (GWAS) have identified a total of about 40 single nucleotide polymorphisms (SNPs) that show significant linkage to body mass index, a widely utilised surrogate measure of adiposity. However, only 8 of these associations have been confirmed by follow-up GWAS using more sophisticated measures of adiposity (computed tomography). Among these 8, there is a SNP close to the gene FTO which has been the subject of considerable work to diagnose its function. The remaining 7 SNPs are adjacent to, or within, the genes NEGR1, TMEM18, ETV5, FLJ35779, LINGO2, SH2B1 and GIPR, most of which are less well studied than FTO, particularly in the context of obesity. This article reviews the available data on the functions of these genes, including information gleaned from studies in humans and animal models. At present, we have virtually no information on the putative mechanism associating the genes FLJ35779 and LINGO2 to obesity. All of these genes are expressed in the brain, and for 2 of them (SH2B1 and GIPR), a direct link to the appetite regulation system is known. SH2B1 is an enhancer of intracellular signalling in the JAK-STAT pathway, and GIPR is the receptor for an appetite-linked hormone (GIP) produced by the alimentary tract. NEGR1, ETV5 and SH2B1 all have suggested roles in neurite outgrowth, and hence SNPs adjacent to these genes may affect development of the energy balance circuitry. Although the genes have central patterns of gene expression, implying a central neuronal connection to energy balance, for at least 4 of them (NEGR1, TMEM18, SH2B1 and GIPR), there are also significant peripheral functions related to adipose tissue biology. These functions may contribute to their effects on the obese phenotype. © 2013 S. Karger AG, Basel.

  18. Genes involved in the osteoarthritis process identified through genome wide expression analysis in articular cartilage; the RAAK study.

    Directory of Open Access Journals (Sweden)

    Yolande F M Ramos

    Full Text Available Identify gene expression profiles associated with OA processes in articular cartilage and determine pathways changing during the disease process. Genome wide gene expression was determined in paired samples of OA affected and preserved cartilage of the same joint using microarray analysis for 33 patients of the RAAK study. Results were replicated in independent samples by RT-qPCR and immunohistochemistry. Profiles were analyzed with the online analysis tools DAVID and STRING to identify enrichment for specific pathways and protein-protein interactions. Among the 1717 genes that were significantly differently expressed between OA affected and preserved cartilage we found significant enrichment for genes involved in skeletal development (e.g. TNFRSF11B and FRZB). Also several inflammatory genes such as CD55, PTGES and TNFAIP6, previously identified in within-joint analyses as well as in analyses comparing preserved cartilage from OA affected joints versus healthy cartilage, were among the top genes. Of note was the high up-regulation of NGF in OA cartilage. RT-qPCR confirmed differential expression for 18 out of 19 genes with expression changes of 2-fold or higher, and immunohistochemistry of selected genes showed a concordant change in protein expression. Most of these changes associated with OA severity (Mankin score) but were independent of joint-site or sex. We provide further insights into the ongoing OA pathophysiological processes in cartilage, in particular into differences in macroscopically intact cartilage compared to OA affected cartilage, which seem relatively consistent and independent of sex or joint. We advocate that development of treatment could benefit by focusing on these similarities in gene expression changes and/or pathways.

  19. Accounting for Population Structure in Gene-by-Environment Interactions in Genome-Wide Association Studies Using Mixed Models.

    Science.gov (United States)

    Sul, Jae Hoon; Bilow, Michael; Yang, Wen-Yun; Kostem, Emrah; Furlotte, Nick; He, Dan; Eskin, Eleazar

    2016-03-01

    Although genome-wide association studies (GWASs) have discovered numerous novel genetic variants associated with many complex traits and diseases, those genetic variants typically explain only a small fraction of phenotypic variance. Factors that account for phenotypic variance include environmental factors and gene-by-environment interactions (GEIs). Recently, several studies have conducted genome-wide gene-by-environment association analyses and demonstrated important roles of GEIs in complex traits. One of the main challenges in these association studies is to control effects of population structure that may cause spurious associations. Many studies have analyzed how population structure influences statistics of genetic variants and developed several statistical approaches to correct for population structure. However, the impact of population structure on GEI statistics in GWASs has not been extensively studied, nor have methods been designed to correct for population structure on GEI statistics. In this paper, we show both analytically and empirically that population structure may cause spurious GEIs and use both simulation and two GWAS datasets to support our finding. We propose a statistical approach based on mixed models to account for population structure on GEI statistics. We find that our approach effectively controls population structure on statistics for GEIs as well as for genetic variants.

  20. Promising Loci and Genes for Yolk and Ovary Weight in Chickens Revealed by a Genome-Wide Association Study.

    Directory of Open Access Journals (Sweden)

    Congjiao Sun

    Full Text Available Because it serves as the cytoplasm of the oocyte and provides a large amount of reserves, the egg yolk has biological significance for developing embryos. The ovary and its hierarchy of follicles are the main reproductive organs responsible for yolk deposition in chickens. However, the genetic architecture underlying the yolk and ovarian follicle weights remains elusive. Here, we measured the yolk weight (YW) at 11 age points from onset of egg laying to 72 weeks of age and measured the follicle weight (FW) and ovary weight (OW) at 73 weeks as part of a comprehensive genome-wide association study (GWAS) in 1,534 F2 hens derived from reciprocal crosses between White Leghorn (WL) and Dongxiang (DX) chickens. For all ages, YWs exhibited moderate single nucleotide polymorphism (SNP)-based heritability estimates (0.25-0.38), while the estimates for FW (0.16) and OW (0.20) were relatively low. Independent univariate genome-wide screens for each trait identified 12, 3, and 31 novel significant associations with YW, FW, and OW, respectively. Candidate genes such as ZAR1, STARD13, ACER1b, ACSBG2, and DHRS12 were identified as having a plausible function in yolk and follicle development. These genes are important to the initiation of embryogenesis, lipid transport, lipoprotein synthesis, lipid droplet promotion, and steroid hormone metabolism, respectively. Our study provides for the first time a genome-wide association (GWA) analysis for follicle and ovary weight. Identification of the promising loci as well as potential candidate genes will greatly advance our understanding of the genetic basis underlying dynamic yolk weight and ovarian follicle development and has practical significance in breeding programs for the alteration of yolk weight at different age points.

  1. META-GSA: Combining Findings from Gene-Set Analyses across Several Genome-Wide Association Studies.

    Directory of Open Access Journals (Sweden)

    Albert Rosenberger

    Full Text Available Gene-set analysis (GSA) methods are used as complementary approaches to genome-wide association studies (GWASs). The single marker association estimates of a predefined set of genes are either contrasted with those of all remaining genes or with a null non-associated background. To pool the p-values from several GSAs, it is important to take into account the concordance of the observed patterns resulting from single marker association point estimates across any given gene set. Here we propose an enhanced version of Fisher's inverse χ2-method, META-GSA, which weights each study to account for imperfect correlation between association patterns. We investigated the performance of META-GSA by simulating GWASs with 500 cases and 500 controls at 100 diallelic markers in 20 different scenarios, simulating different relative risks between 1 and 1.5 in gene sets of 10 genes. Wilcoxon's rank sum test was applied as GSA for each study. We found that META-GSA has greater power to discover truly associated gene sets than simple pooling of the p-values, by e.g. 59% versus 37%, when the true relative risk for 5 of 10 genes was assumed to be 1.5. Under the null hypothesis of no difference in the true association pattern between the gene set of interest and the set of remaining genes, the results of both approaches are almost uncorrelated. We recommend not relying on p-values alone when combining the results of independent GSAs. We applied META-GSA to pool the results of four case-control GWASs of lung cancer risk (Central European Study and Toronto/Lunenfeld-Tanenbaum Research Institute Study; German Lung Cancer Study and MD Anderson Cancer Center Study), which had already been analyzed separately with four different GSA methods (EASE; SLAT, mSUMSTAT and GenGen). This application revealed the pathway GO0015291 "transmembrane transporter activity" as significantly enriched with associated genes (GSA-method: EASE, p = 0.0315 corrected for multiple testing). Similar
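
    As a point of reference for the "simple pooling of the p-values" baseline mentioned above, the following is a minimal sketch of the classical, unweighted Fisher inverse chi-square method. It is not META-GSA: the study weights that META-GSA adds are not specified in this abstract, so they are deliberately left out. The function name `fisher_combined_p` is hypothetical, and the p-values are assumed to be independent.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's inverse chi-square method.

    Unweighted baseline only; the META-GSA weighting is not reproduced here.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    # Test statistic X = -2 * sum(ln p_i); under the null it follows a
    # chi-square distribution with 2k degrees of freedom. We keep X / 2.
    half_x = -sum(math.log(p) for p in p_values)
    # For even degrees of freedom the survival function has a closed form:
    # P(X >= x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total


# Illustrative input: two studies that each report p = 0.05 for a gene set.
pooled = fisher_combined_p([0.05, 0.05])
```

    With a single study the function returns that study's p-value unchanged, which is a quick sanity check on the closed-form tail probability.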

  2. Multidimensional gene set analysis of genomic data.

    Directory of Open Access Journals (Sweden)

    David Montaner

    Full Text Available Understanding the functional implications of changes in gene expression, mutations, etc., is the aim of most genomic experiments. To achieve this, several functional profiling methods have been proposed. Such methods study the behaviour of different gene modules (e.g. gene ontology terms) in response to one particular variable (e.g. differential gene expression). In spite of the wealth of information provided by functional profiling methods, a common limitation to all of them is their inherent unidimensional nature. In order to overcome this restriction we present a multidimensional logistic model that allows studying the relationship of gene modules with different genome-scale measurements (e.g. differential expression, genotyping association, methylation, copy number alterations, heterozygosity, etc.) simultaneously. Moreover, the relationship of such functional modules with the interactions among the variables can also be studied, which produces novel results that cannot be derived from the conventional unidimensional functional profiling methods. We report sound results of gene set associations that remained undetected by the conventional one-dimensional gene set analysis in several examples. Our findings demonstrate the potential of the proposed approach for the discovery of new cell functionalities with complex dependences on more than one variable.

  3. Genome-wide association study using extreme truncate selection identifies novel genes affecting bone mineral density and fracture risk.

    Directory of Open Access Journals (Sweden)

    Emma L Duncan

    2011-04-01

    Full Text Available Osteoporotic fracture is a major cause of morbidity and mortality worldwide. Low bone mineral density (BMD) is a major predisposing factor to fracture and is known to be highly heritable. Site-, gender-, and age-specific genetic effects on BMD are thought to be significant, but have largely not been considered in the design of genome-wide association studies (GWAS) of BMD to date. We report here a GWAS using a novel study design focusing on women of a specific age (postmenopausal women, age 55-85 years), with either extreme high or low hip BMD (age- and gender-adjusted BMD z-scores of +1.5 to +4.0, n = 1055, or -4.0 to -1.5, n = 900), with replication in cohorts of women drawn from the general population (n = 20,898). The study replicates 21 of 26 known BMD-associated genes. Additionally, we report suggestive evidence for six further new genetic associations in or around the genes CLCN7, GALNT3, IBSP, LTBP3, RSPO3, and SOX4, with replication in two independent datasets. A novel mouse model with a loss-of-function mutation in GALNT3 is also reported, which has high bone mass, supporting the involvement of this gene in BMD determination. In addition to identifying further genes associated with BMD, this study confirms the efficiency of extreme-truncate selection designs for quantitative trait association studies.
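
    The extreme-truncate selection step described above can be sketched as a simple filter on z-scores. This is only an illustration under stated assumptions: the function name `select_extremes`, the subject IDs, and the toy z-scores are hypothetical, and treating the interval bounds as inclusive is an assumption the abstract does not settle. Only the two z-score windows come from the abstract.

```python
def select_extremes(z_scores, low=(-4.0, -1.5), high=(1.5, 4.0)):
    """Split subject IDs into low- and high-extreme groups by BMD z-score.

    z_scores maps a subject ID to an age- and gender-adjusted z-score.
    Bounds are treated as inclusive (an assumption for this sketch).
    """
    low_group = [sid for sid, z in z_scores.items() if low[0] <= z <= low[1]]
    high_group = [sid for sid, z in z_scores.items() if high[0] <= z <= high[1]]
    return low_group, high_group


# Toy cohort: s2 is too close to the mean, s4 lies beyond the +4.0 cap.
cohort = {"s1": -2.1, "s2": 0.3, "s3": 1.8, "s4": 4.6, "s5": -1.5}
low_group, high_group = select_extremes(cohort)
```

    Subjects in the middle of the distribution are discarded, so genotyping effort is concentrated on the individuals who carry the most information about a quantitative trait.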

  4. Uses of antimicrobial genes from microbial genome

    Science.gov (United States)

    Sorek, Rotem; Rubin, Edward M.

    2013-08-20

    We describe a method for mining microbial genomes to discover antimicrobial genes and proteins having a broad spectrum of activity. Also described are antimicrobial genes and their expression products from various microbial genomes that were found using this method. The products of such genes can be used as antimicrobial agents or as tools for molecular biology.

  5. Comparative genomic analysis of sixty mycobacteriophage genomes: Genome clustering, gene acquisition and gene size

    Science.gov (United States)

    Hatfull, Graham F.; Jacobs-Sera, Deborah; Lawrence, Jeffrey G.; Pope, Welkin H.; Russell, Daniel A.; Ko, Ching-Chung; Weber, Rebecca J.; Patel, Manisha C.; Germane, Katherine L.; Edgar, Robert H.; Hoyte, Natasha N.; Bowman, Charles A.; Tantoco, Anthony T.; Paladin, Elizabeth C.; Myers, Marlana S.; Smith, Alexis L.; Grace, Molly S.; Pham, Thuy T.; O'Brien, Matthew B.; Vogelsberger, Amy M.; Hryckowian, Andrew J.; Wynalek, Jessica L.; Donis-Keller, Helen; Bogel, Matt W.; Peebles, Craig L.; Cresawn, Steve G.; Hendrix, Roger W.

    2010-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts. Expansion of a collection of sequenced phage genomes to a total of sixty – all infecting a common bacterial host – provides further insight into their diversity and evolution. Of the sixty phage genomes, 55 can be grouped into nine clusters according to their nucleotide sequence similarities, five of which can be further divided into subclusters; five genomes do not cluster with other phages. The sequence diversity between genomes within a cluster varies greatly; for example, the six genomes in cluster D share more than 97.5% average nucleotide similarity with each other. In contrast, similarity between the two genomes in Cluster I is barely detectable by diagonal plot analysis. The total of 6,858 predicted ORFs have been grouped into 1523 phamilies (phams) of related sequences, 46% of which possess only a single member. Only 18.8% of the phams have sequence similarity to non-mycobacteriophage database entries and fewer than 10% of all phams can be assigned functions based on database searching or synteny. Genome clustering facilitates the identification of genes that are in greatest genetic flux and are more likely to have been exchanged horizontally in relatively recent evolutionary time. Although mycobacteriophage genes exhibit smaller average size than genes of their host (205 residues compared to 315), phage genes in higher flux average only ∼100 amino acids, suggesting that the primary units of genetic exchange correspond to single protein domains. PMID:20064525

  6. Studying Genes

    Science.gov (United States)

    What are genes? Genes are segments of DNA that contain instructions ...

  7. Genome-wide association study identifies loci and candidate genes for meat quality traits in Simmental beef cattle.

    Science.gov (United States)

    Xia, Jiangwei; Qi, Xin; Wu, Yang; Zhu, Bo; Xu, Lingyang; Zhang, Lupei; Gao, Xue; Chen, Yan; Li, Junya; Gao, Huijiang

    2016-06-01

    Improving meat quality is the best way to enhance profitability and strengthen competitiveness in the beef industry. Identification of genetic variants that control beef quality traits can help breeders design optimal breeding programs to achieve this goal. We carried out a genome-wide association study for meat quality traits in 1141 Simmental cattle using the Illumina Bovine HD 770K SNP array to identify the candidate genes and genomic regions associated with meat quality traits in beef cattle, including fat color, meat color, marbling score, longissimus muscle area, and shear force. In our study, we identified twenty significant single-nucleotide polymorphisms (SNPs) (p meat quality traits. Notably, we observed that several SNPs were in or near eleven genes which have been reported previously, including TMEM236, SORL1, TRDN, S100A10, AP2S1, KCTD16, LOC506594, DHX15, LAMA4, PREX1, and BRINP3. We identified a haplotype block on BTA13 containing five significant SNPs associated with the fat color trait. We also found that one of 19 SNPs was associated with multiple traits (shear force and longissimus muscle area) on BTA7. Our results offer valuable insights to further explore the potential mechanism of meat quality traits in Simmental beef cattle.

  8. European genome-wide association study identifies SLC14A1 as a new urinary bladder cancer susceptibility gene.

    Science.gov (United States)

    Rafnar, Thorunn; Vermeulen, Sita H; Sulem, Patrick; Thorleifsson, Gudmar; Aben, Katja K; Witjes, J Alfred; Grotenhuis, Anne J; Verhaegh, Gerald W; Hulsbergen-van de Kaa, Christina A; Besenbacher, Soren; Gudbjartsson, Daniel; Stacey, Simon N; Gudmundsson, Julius; Johannsdottir, Hrefna; Bjarnason, Hjordis; Zanon, Carlo; Helgadottir, Hafdis; Jonasson, Jon Gunnlaugur; Tryggvadottir, Laufey; Jonsson, Eirikur; Geirsson, Gudmundur; Nikulasson, Sigfus; Petursdottir, Vigdis; Bishop, D Timothy; Chung-Sak, Sei; Choudhury, Ananya; Elliott, Faye; Barrett, Jennifer H; Knowles, Margaret A; de Verdier, Petra J; Ryk, Charlotta; Lindblom, Annika; Rudnai, Peter; Gurzau, Eugene; Koppova, Kvetoslava; Vineis, Paolo; Polidoro, Silvia; Guarrera, Simonetta; Sacerdote, Carlotta; Panadero, Angeles; Sanz-Velez, José I; Sanchez, Manuel; Valdivia, Gabriel; Garcia-Prats, Maria D; Hengstler, Jan G; Selinski, Silvia; Gerullis, Holger; Ovsiannikov, Daniel; Khezri, Abdolaziz; Aminsharifi, Alireza; Malekzadeh, Mahyar; van den Berg, Leonard H; Ophoff, Roel A; Veldink, Jan H; Zeegers, Maurice P; Kellen, Eliane; Fostinelli, Jacopo; Andreoli, Daniele; Arici, Cecilia; Porru, Stefano; Buntinx, Frank; Ghaderi, Abbas; Golka, Klaus; Mayordomo, José I; Matullo, Giuseppe; Kumar, Rajiv; Steineck, Gunnar; Kiltie, Anne E; Kong, Augustine; Thorsteinsdottir, Unnur; Stefansson, Kari; Kiemeney, Lambertus A

    2011-11-01

    Three genome-wide association studies in Europe and the USA have reported eight urinary bladder cancer (UBC) susceptibility loci. Using extended case and control series and 1000 Genomes imputations of 5 340 737 single-nucleotide polymorphisms (SNPs), we searched for additional loci in the European GWAS. The discovery sample set consisted of 1631 cases and 3822 controls from the Netherlands and 603 cases and 37 781 controls from Iceland. For follow-up, we used 3790 cases and 7507 controls from 13 sample sets of European and Iranian ancestry. Based on the discovery analysis, we followed up signals in the urea transporter (UT) gene SLC14A1. The strongest signal at this locus was represented by a SNP in intron 3, rs17674580, that reached genome-wide significance in the overall analysis of the discovery and follow-up groups: odds ratio = 1.17, P = 7.6 × 10^-11. SLC14A1 codes for UTs that define the Kidd blood group and are crucial for the maintenance of a constant urea concentration gradient in the renal medulla and, through this, the kidney's ability to concentrate urine. It is speculated that rs17674580, or other sequence variants in LD with it, indirectly modifies UBC risk by affecting urine production. If confirmed, this would support the 'urogenous contact hypothesis' that urine production and voiding frequency modify the risk of UBC.

  9. Genome-Wide Association Study to Identify Genes Related to Renal Mercury Concentrations in Mice

    DEFF Research Database (Denmark)

    Alkaissi, Hammoudi; Ekstrand, Jimmy; Jawad, Aksa

    2016-01-01

    BACKGROUND: Following human mercury (Hg) exposure, the metal accumulates with considerable concentrations in kidney, liver, and brain. Although the toxicokinetics of Hg has been studied extensively, factors responsible for inter-individual variation in humans are largely unknown. Differences...... in accumulation of renal Hg between inbred mouse strains suggest a genetic inter-strain variation regulating retention or/and excretion of Hg. A.SW, DBA/2 and BALB/C mouse strains accumulate higher amounts of Hg than B10.S. OBJECTIVES: To find candidate genes associated with regulation of renal Hg concentrations...... enhanced by the Pprc1 (Nrf1 and Nrf2) were included for gene expression analysis. RESULTS: Renal Hg concentrations differed significantly between A.SW and B10.S mice and between males and females within each strain. QTL analysis showed a peak logarithm of odds ratio score 5.78 on chromosome 19 (p = 0...

  10. A genome-wide association study of the maize hypersensitive defense response identifies genes that cluster in related pathways.

    Directory of Open Access Journals (Sweden)

    Bode A Olukolu

    2014-08-01

    Full Text Available Much remains unknown of molecular events controlling the plant hypersensitive defense response (HR), a rapid localized cell death that limits pathogen spread and is mediated by resistance (R-) genes. Genetic control of the HR is hard to quantify due to its microscopic and rapid nature. Natural modifiers of the ectopic HR phenotype induced by an aberrant auto-active R-gene (Rp1-D21) were mapped in a population of 3,381 recombinant inbred lines from the maize nested association mapping population. Joint linkage analysis was conducted to identify 32 additive but no epistatic quantitative trait loci (QTL) using a linkage map based on more than 7000 single nucleotide polymorphisms (SNPs). Genome-wide association (GWA) analysis of 26.5 million SNPs was conducted after adjusting for background QTL. GWA identified associated SNPs that colocalized with 44 candidate genes. Thirty-six of these genes colocalized within 23 of the 32 QTL identified by joint linkage analysis. The candidate genes included genes predicted to be involved in programmed cell death, defense response, ubiquitination, redox homeostasis, autophagy, calcium signalling, lignin biosynthesis and cell wall modification. Twelve of the candidate genes showed significant differential expression between isogenic lines differing for the presence of Rp1-D21. Low but significant correlations between HR-related traits and several previously-measured disease resistance traits suggested that the genetic control of these traits was substantially, though not entirely, independent. This study provides the first system-wide analysis of natural variation that modulates the HR response in plants.

  11. Genome-wide study of gene variants associated with differential cardiovascular event reduction by pravastatin therapy.

    Directory of Open Access Journals (Sweden)

    Dov Shiffman

    Full Text Available Statin therapy reduces the risk of coronary heart disease (CHD); however, the person-to-person variability in response to statin therapy is not well understood. We have investigated the effect of genetic variation on the reduction of CHD events by pravastatin. First, we conducted a genome-wide association study of 682 CHD cases from the Cholesterol and Recurrent Events (CARE) trial and 383 CHD cases from the West of Scotland Coronary Prevention Study (WOSCOPS), two randomized, placebo-controlled studies of pravastatin. In a combined case-only analysis, 79 single nucleotide polymorphisms (SNPs) were associated with differential CHD event reduction by pravastatin according to genotype (P<0.0001), and these SNPs were analyzed in a second stage that included cases as well as non-cases from CARE and WOSCOPS and patients from the PROspective Study of Pravastatin in the Elderly at Risk/PHArmacogenomic study of Statins in the Elderly at risk for cardiovascular disease (PROSPER/PHASE), a randomized placebo controlled study of pravastatin in the elderly. We found that one of these SNPs (rs13279522) was associated with differential CHD event reduction by pravastatin therapy in all 3 studies: P = 0.002 in CARE, P = 0.01 in WOSCOPS, P = 0.002 in PROSPER/PHASE. In a combined analysis of CARE, WOSCOPS, and PROSPER/PHASE, the hazard ratio for CHD when comparing pravastatin with placebo decreased by a factor of 0.63 (95% CI: 0.52 to 0.75) for each extra copy of the minor allele (P = 4.8 × 10^-7). This SNP is located in DnaJ homolog subfamily C member 5B (DNAJC5B) and merits investigation in additional randomized studies of pravastatin and other statins.

  12. Identification of PLCL1 gene for hip bone size variation in females in a genome-wide association study.

    Directory of Open Access Journals (Sweden)

    Yao-Zhong Liu

    Full Text Available Osteoporosis, the most prevalent metabolic bone disease among older people, increases risk for low trauma hip fractures (HF) that are associated with high morbidity and mortality. Hip bone size (BS) has been identified as one of the key measurable risk factors for HF. Although hip BS is highly genetically determined, genetic factors underlying the trait are still poorly defined. Here, we performed the first genome-wide association study (GWAS) of hip BS interrogating approximately 380,000 SNPs on the Affymetrix platform in 1,000 homogeneous unrelated Caucasian subjects, including 501 females and 499 males. We identified a gene, PLCL1 (phospholipase c-like 1), that had four SNPs associated with hip BS at, or approaching, a genome-wide significance level in our female subjects; the most significant SNP, rs7595412, achieved a p value of 3.72 × 10^-7. The gene's importance to hip BS was replicated using the Illumina genotyping platform in an independent UK cohort containing 1,216 Caucasian females. Two SNPs of the PLCL1 gene, rs892515 and rs9789480, surrounded by the four SNPs identified in our GWAS, achieved p values of 8.62 × 10^-3 and 2.44 × 10^-3, respectively, for association with hip BS. Imputation analyses on our GWAS and the UK samples further confirmed the replication signals; eight SNPs of the gene achieved combined imputed p values < 10^-5 in the two samples. The PLCL1 gene's relevance to HF was also observed in a Chinese sample containing 403 females, including 266 with HF and 177 control subjects. A SNP of the PLCL1 gene, rs3771362, which is only approximately 0.6 kb apart from the most significant SNP detected in our GWAS (rs7595412), achieved a p value of 7.66 × 10^-3 (odds ratio = 0.26) for association with HF. Additional biological support for the role of PLCL1 in BS comes from previous demonstrations that the PLCL1 protein inhibits IP3 (inositol 1,4,5-trisphosphate)-mediated calcium signaling, an important pathway regulating mechanical sensing of

  13. InSilico DB genomic datasets hub: an efficient starting point for analyzing genome-wide studies in GenePattern, Integrative Genomics Viewer, and R/Bioconductor.

    Science.gov (United States)

    Coletta, Alain; Molter, Colin; Duqué, Robin; Steenhoff, David; Taminau, Jonatan; de Schaetzen, Virginie; Meganck, Stijn; Lazar, Cosmin; Venet, David; Detours, Vincent; Nowé, Ann; Bersini, Hugues; Weiss Solís, David Y

    2012-11-18

    Genomics datasets are increasingly useful for gaining biomedical insights, with adoption in the clinic underway. However, multiple hurdles related to data management stand in the way of their efficient large-scale utilization. The solution proposed is a web-based data storage hub. Having clear focus, flexibility and adaptability, InSilico DB seamlessly connects genomics dataset repositories to state-of-the-art and free GUI and command-line data analysis tools. The InSilico DB platform is a powerful collaborative environment, with advanced capabilities for biocuration, dataset sharing, and dataset subsetting and combination. InSilico DB is available from https://insilicodb.org.

  14. Genetic variations related to maternal whole blood mitochondrial DNA copy number: a genome-wide and candidate gene study.

    Science.gov (United States)

    Workalemahu, Tsegaselassie; Enquobahrie, Daniel A; Tadesse, Mahlet G; Hevner, Karin; Gelaye, Bizu; Sanchez, Sixto E; Williams, Michelle A

    2017-10-01

    We conducted genome-wide (GWAS) and candidate gene association studies of maternal mitochondrial DNA copy number. Maternal peripheral blood was collected during labor and delivery admission from 471 participants of a placental abruption case-control study conducted in Lima, Peru. Single nucleotide polymorphism (SNP) genotyping was performed using the Illumina Cardio-Metabo Chip. Whole blood mitochondrial DNA (mtDNA) copy number was measured using qRT-PCR techniques. We evaluated 119,629 SNPs in the GWAS and 161 SNPs (in 29 mitochondrial biogenesis and oxidative phosphorylation genes) in the candidate association study. Top hits from GWAS and the candidate gene study were selected to compute weighted genetic risk scores (wGRS). Linear regression models were used to calculate effect size estimates and related nominal p values. The top hit in our GWAS was chr19:51063065 in FOXA3 (empirical p value = 2.20e-6). A total of 134 SNPs had p values copy number (p values copy number was significantly associated with wGRS based on top GWAS hits (β = 0.49, 95% CI:0.38-0.60, p copy number.
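
    A weighted genetic risk score of the kind mentioned above is commonly computed as a weighted sum of risk-allele counts. The sketch below assumes that convention; the abstract does not state which weights the authors used, and the function name `weighted_grs`, the SNP labels, and all numbers are hypothetical.

```python
def weighted_grs(allele_counts, weights):
    """Weighted genetic risk score: sum over SNPs of weight * allele count.

    allele_counts maps SNP ID -> number of effect alleles (0, 1 or 2).
    weights maps SNP ID -> per-allele weight (assumed here to be a
    regression effect size; that choice is an assumption of this sketch).
    """
    missing = set(weights) - set(allele_counts)
    if missing:
        raise KeyError(f"no genotype for SNPs: {sorted(missing)}")
    return sum(weights[snp] * allele_counts[snp] for snp in weights)


# Hypothetical two-SNP example; the score is about 0.3.
weights = {"snpA": 0.2, "snpB": -0.1}
subject = {"snpA": 2, "snpB": 1}
score = weighted_grs(subject, weights)
```

    The per-subject scores can then be used as a single predictor in a linear regression, which is how an estimate such as the β reported above would be obtained.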

  15. A genome-wide association study on androstenone levels in pigs reveals a cluster of candidate genes on chromosome 6

    Directory of Open Access Journals (Sweden)

    Groenen Martien AM

    2010-05-01

    Full Text Available Abstract Background In many countries, male piglets are castrated shortly after birth because a proportion of un-castrated male pigs produce meat with an unpleasant flavour and odour. Main compounds of boar taint are androstenone and skatole. The aim of this high-density genome-wide association study was to identify single nucleotide polymorphisms (SNPs) associated with androstenone levels in a commercial sire line of pigs. The identification of major genetic effects causing boar taint would accelerate the reduction of boar taint through breeding to finally eliminate the need for castration. Results The Illumina Porcine 60K+SNP Beadchip was genotyped on 987 pigs divergent for androstenone concentration from a commercial Duroc-based sire line. The association analysis with 47,897 SNPs revealed that androstenone levels in fat tissue were significantly affected by 37 SNPs on pig chromosomes SSC1 and SSC6. Among them, the 5 most significant SNPs explained together 13.7% of the genetic variance in androstenone. On SSC6, a larger region of 10 Mb was shown to be associated with androstenone covering several candidate genes potentially involved in the synthesis and metabolism of androgens. Besides known candidate genes, such as cytochrome P450 A19 (CYP2A19), sulfotransferases SULT2A1, and SULT2B1, also new members of the cytochrome P450 CYP2 gene subfamilies and of the hydroxysteroid-dehydrogenases (HSD17B14) were found. In addition, the gene encoding the ß-chain of the luteinizing hormone (LHB), which induces steroid synthesis in the Leydig cells of the testis at onset of puberty, maps to this area on SSC6. Interestingly, the gene encoding the α-chain of LH is also located in one of the highly significant areas on SSC1. Conclusions This study reveals several areas of the genome at high resolution responsible for variation of androstenone levels in intact boars. Major genetic factors on SSC1 and SSC6 showing moderate to large effects on androstenone

  16. BiForce Toolbox: powerful high-throughput computational analysis of gene-gene interactions in genome-wide association studies.

    Science.gov (United States)

    Gyenesei, Attila; Moody, Jonathan; Laiho, Asta; Semple, Colin A M; Haley, Chris S; Wei, Wen-Hua

    2012-07-01

    Genome-wide association studies (GWAS) have discovered many loci associated with common disease and quantitative traits. However, most GWAS have not studied the gene-gene interactions (epistasis) that could be important in complex trait genetics. A major challenge in analysing epistasis in GWAS is the enormous computational demands of analysing billions of SNP combinations. Several methods have been developed recently to address this, some using computers equipped with particular graphical processing units, most restricted to binary disease traits and all poorly suited to general usage on the most widely used operating systems. We have developed the BiForce Toolbox to address the demand for high-throughput analysis of pairwise epistasis in GWAS of quantitative and disease traits across all commonly used computer systems. BiForce Toolbox is a stand-alone Java program that integrates bitwise computing with multithreaded parallelization and thus allows rapid full pairwise genome scans via a graphical user interface or the command line. Furthermore, BiForce Toolbox incorporates additional tests of interactions involving SNPs with significant marginal effects, potentially increasing the power of detection of epistasis. BiForce Toolbox is easy to use and has been applied in multiple studies of epistasis in large GWAS data sets, identifying interesting interaction signals and pathways.
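
    The general idea behind bitwise computing for pairwise scans can be sketched as follows: each SNP is stored as three bitsets (one per genotype class), and each cell of the 3 x 3 two-SNP genotype table is obtained with one bitwise AND plus a bit count. This is an illustration of the technique only, written in Python with arbitrary-precision integers as bitsets; it is not BiForce Toolbox's Java implementation, and the function names are hypothetical.

```python
def genotype_bitsets(genotypes):
    """Encode genotypes (0, 1 or 2 per individual) as three integer bitsets.

    Bit i of masks[g] is set when individual i carries genotype g.
    """
    masks = [0, 0, 0]
    for i, g in enumerate(genotypes):
        if g not in (0, 1, 2):
            raise ValueError(f"unexpected genotype {g!r} for individual {i}")
        masks[g] |= 1 << i
    return masks


def pairwise_table(snp_a, snp_b):
    """3 x 3 table of joint genotype counts for two SNPs.

    Each cell needs a single AND followed by a population count, instead of
    a pass over all individuals.
    """
    a, b = genotype_bitsets(snp_a), genotype_bitsets(snp_b)
    return [[bin(a[i] & b[j]).count("1") for j in range(3)] for i in range(3)]


# Six individuals genotyped at two SNPs (toy data).
table = pairwise_table([0, 1, 2, 1, 0, 2], [0, 0, 2, 1, 1, 2])
```

    In a real scan the bitsets would be built once per SNP and reused across all pairs, so the per-pair cost is nine AND-and-count operations regardless of how the interaction statistic is then computed from the table.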

  17. Identification of Candidate Genes for Reactivity in Guzerat (Bos indicus) Cattle: A Genome-Wide Association Study

    Science.gov (United States)

    Fonseca, Pablo Augusto de Souza; Pires, Maria de Fátima Ávila; Ventura, Ricardo Vieira; Rosse, Izinara da Cruz; Bruneli, Frank Angelo Tomita; Machado, Marco Antonio; Carvalho, Maria Raquel Santos

    2017-01-01

    Temperament is fundamental to animal production due to its direct influence on the animal-herdsman relationship. When compared to calm animals, the aggressive, anxious or fearful ones exhibit less weight gain, lower reproductive efficiency, decreased milk production and higher herd maintenance costs, all of which contribute to reduced profits. However, temperament is a trait that is complex and difficult to assess. Recently, a new quantitative system, REATEST®, for assessing reactivity, a phenotype of temperament, was developed. Herein, we describe the results of a genome-wide association study for reactivity, assessed using REATEST® with a sample of 754 females from five dual-purpose (milk and meat production) Guzerat (Bos indicus) herds. Genotyping was performed using a 50k SNP chip, and a two-step mixed model approach (Grammar-Gamma) with a one-by-one marker regression was used to identify QTLs. QTLs for reactivity were identified on chromosomes BTA1, BTA5, BTA14, and BTA25. Five intronic and two intergenic markers were significantly associated with reactivity. POU1F1, DRD3, VWA3A, ZBTB20, EPHA6, SNRPF and NTN4 were identified as candidate genes. Previous QTL reports for temperament traits, covering areas surrounding the SNPs/genes identified here, further corroborate these associations. The seven genes identified in the present study explain 20.5% of reactivity variance and give a better understanding of temperament biology. PMID:28125592

  18. A genome-wide association study reveals a novel candidate gene for sperm motility in pigs

    NARCIS (Netherlands)

    Diniz, D.B.; Lopes, M.S.; Broekhuijse, M.L.W.J.; Lopes, P.S.; Harlizius, B.; Guimaraes, S.E.F.; Duijvesteijn, N.; Knol, E.F.; Silva, F.F.

    2014-01-01

    Sperm motility is one of the most widely used parameters in order to evaluate boar semen quality. However, this trait can only be measured after puberty. Thus, the use of genomic information appears as an appealing alternative to evaluate and improve selection for boar fertility traits earlier in li

  19. Post genome-wide association studies of novel genes associated with type 2 diabetes show gene-gene interaction and high predictive value.

    Directory of Open Access Journals (Sweden)

    Stéphane Cauchi

    Full Text Available BACKGROUND: Recently, several Genome Wide Association (GWA) studies in populations of European descent have identified and validated novel single nucleotide polymorphisms (SNPs), highly associated with type 2 diabetes (T2D). Our aims were to validate these markers in other European and non-European populations, then to assess their combined effect in a large French study comparing T2D and normal glucose tolerant (NGT) individuals. METHODOLOGY/PRINCIPAL FINDINGS: In the same French population analyzed in our previous GWA study (3,295 T2D and 3,595 NGT), strong associations with T2D were found for CDKAL1 (OR(rs7756992) = 1.30 [1.19-1.42], P = 2.3 × 10^-9), CDKN2A/2B (OR(rs10811661) = 0.74 [0.66-0.82], P = 3.5 × 10^-8) and more modestly for IGFBP2 (OR(rs1470579) = 1.17 [1.07-1.27], P = 0.0003) SNPs. These results were replicated in both Israeli Ashkenazi (577 T2D and 552 NGT) and Austrian (504 T2D and 753 NGT) populations (except for CDKAL1) but not in the Moroccan population (521 T2D and 423 NGT). In the overall group of French subjects (4,232 T2D and 4,595 NGT), IGFBP2 and CXCR4 synergistically interacted with (LOC38776, SLC30A8, HHEX) and (NGN3, CDKN2A/2B), respectively, encoding for proteins presumably regulating pancreatic endocrine cell development and function. The T2D risk increased strongly when risk alleles, including the previously discovered T2D-associated TCF7L2 rs7903146 SNP, were combined (8.68-fold for the 14% of French individuals carrying 18 to 30 risk alleles) with an allelic OR of 1.24. With an area under the ROC curve of 0.86, only 15 novel loci were necessary to discriminate French individuals susceptible to develop T2D. CONCLUSIONS/SIGNIFICANCE: In addition to TCF7L2, SLC30A8 and HHEX, initially identified by the French GWA scan, CDKAL1, IGFBP2 and CDKN2A/2B strongly associate with T2D in French individuals, and mostly in populations of Central European descent but not in Moroccan subjects. Genes expressed in the pancreas interact together and their

  20. KEGG: Kyoto Encyclopedia of Genes and Genomes.

    Science.gov (United States)

    Kanehisa, M; Goto, S

    2000-01-01

    KEGG (Kyoto Encyclopedia of Genes and Genomes) is a knowledge base for systematic analysis of gene functions, linking genomic information with higher order functional information. The genomic information is stored in the GENES database, which is a collection of gene catalogs for all the completely sequenced genomes and some partial genomes with up-to-date annotation of gene functions. The higher order functional information is stored in the PATHWAY database, which contains graphical representations of cellular processes, such as metabolism, membrane transport, signal transduction and cell cycle. The PATHWAY database is supplemented by a set of ortholog group tables for the information about conserved subpathways (pathway motifs), which are often encoded by positionally coupled genes on the chromosome and which are especially useful in predicting gene functions. A third database in KEGG is LIGAND for the information about chemical compounds, enzyme molecules and enzymatic reactions. KEGG provides Java graphics tools for browsing genome maps, comparing two genome maps and manipulating expression maps, as well as computational tools for sequence comparison, graph comparison and path computation. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg/).

  1. Pathway analysis of genome-wide association study data highlights pancreatic development genes as susceptibility factors for pancreatic cancer

    Science.gov (United States)

    Duell, Eric J.; Yu, Kai; Risch, Harvey A.; Olson, Sara H.; Kooperberg, Charles; Wolpin, Brian M.; Jiao, Li; Dong, Xiaoqun; Wheeler, Bill; Arslan, Alan A.; Bueno-de-Mesquita, H. Bas; Fuchs, Charles S.; Gallinger, Steven; Gross, Myron; Hartge, Patricia; Hoover, Robert N.; Holly, Elizabeth A.; Jacobs, Eric J.; Klein, Alison P.; LaCroix, Andrea; Mandelson, Margaret T.; Petersen, Gloria; Zheng, Wei; Agalliu, Ilir; Albanes, Demetrius; Boutron-Ruault, Marie-Christine; Bracci, Paige M.; Buring, Julie E.; Canzian, Federico; Chang, Kenneth; Chanock, Stephen J.; Cotterchio, Michelle; Gaziano, J.Michael; Giovannucci, Edward L.; Goggins, Michael; Hallmans, Göran; Hankinson, Susan E.; Hoffman Bolton, Judith A.; Hunter, David J.; Hutchinson, Amy; Jacobs, Kevin B.; Jenab, Mazda; Khaw, Kay-Tee; Kraft, Peter; Krogh, Vittorio; Kurtz, Robert C.; McWilliams, Robert R.; Mendelsohn, Julie B.; Patel, Alpa V.; Rabe, Kari G.; Riboli, Elio; Shu, Xiao-Ou; Tjønneland, Anne; Tobias, Geoffrey S.; Trichopoulos, Dimitrios; Virtamo, Jarmo; Visvanathan, Kala; Watters, Joanne; Yu, Herbert; Zeleniuch-Jacquotte, Anne; Stolzenberg-Solomon, Rachael Z.

    2012-01-01

    Four loci have been associated with pancreatic cancer through genome-wide association studies (GWAS). Pathway-based analysis of GWAS data is a complementary approach to identify groups of genes or biological pathways enriched with disease-associated single-nucleotide polymorphisms (SNPs) whose individual effect sizes may be too small to be detected by standard single-locus methods. We used the adaptive rank truncated product method in a pathway-based analysis of GWAS data from 3851 pancreatic cancer cases and 3934 control participants pooled from 12 cohort studies and 8 case–control studies (PanScan). We compiled 23 biological pathways hypothesized to be relevant to pancreatic cancer and observed a nominal association between pancreatic cancer and five pathways (P < 0.05), i.e. pancreatic development, Helicobacter pylori lacto/neolacto, hedgehog, Th1/Th2 immune response and apoptosis (P = 2.0 × 10(-6), 1.6 × 10(-5), 0.0019, 0.019 and 0.023, respectively). After excluding previously identified genes from the original GWAS in three pathways (NR5A2, ABO and SHH), the pancreatic development pathway remained significant (P = 8.3 × 10(-5)), whereas the others did not. The most significant genes (P < 0.01) in the five pathways were NR5A2, HNF1A, HNF4G and PDX1 for pancreatic development; ABO for H. pylori lacto/neolacto; SHH for hedgehog; TGFBR2 and CCL18 for Th1/Th2 immune response and MAPK8 and BCL2L11 for apoptosis. Our results provide a link between inherited variation in genes important for pancreatic development and cancer and show that pathway-based approaches to analysis of GWAS data can yield important insights into the collective role of genetic risk variants in cancer. PMID:22523087
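
    The pathway test named in the abstract above builds on the rank truncated product (RTP) idea: combine the k smallest p-values in a gene set and calibrate the product against a null distribution. The following is a minimal sketch of the plain RTP statistic only, not the authors' adaptive procedure, which as we understand it also searches over several truncation points and calibrates by permuting the real data. The function names and the example p-values are hypothetical, and the null is simulated under the simplifying assumption of independent uniform p-values.

```python
import math
import random


def rtp_statistic(pvalues, k):
    """Log of the product of the k smallest p-values."""
    return sum(math.log(p) for p in sorted(pvalues)[:k])


def rtp_pvalue(pvalues, k, n_sim=2000, seed=1):
    """Monte Carlo p-value for the RTP statistic.

    ASSUMPTION: null p-values are independent and uniform on (0, 1].
    Real GWAS data are correlated, so a permutation null would be needed.
    """
    rng = random.Random(seed)
    observed = rtp_statistic(pvalues, k)
    m = len(pvalues)
    hits = 0
    for _ in range(n_sim):
        null = [1.0 - rng.random() for _ in range(m)]  # values in (0, 1]
        if rtp_statistic(null, k) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)


# Hypothetical gene-level p-values for one pathway (illustration only).
example_pathway = [1e-6, 1e-5, 0.002, 0.3, 0.5, 0.7, 0.9, 0.4]
print(rtp_pvalue(example_pathway, k=3))
```

    Working on the log scale avoids numerical underflow when many small p-values are multiplied.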

  2. Genome-Wide Association Studies Suggest Limited Immune Gene Enrichment in Schizophrenia Compared to 5 Autoimmune Diseases.

    Science.gov (United States)

    Pouget, Jennie G; Gonçalves, Vanessa F; Spain, Sarah L; Finucane, Hilary K; Raychaudhuri, Soumya; Kennedy, James L; Knight, Jo

    2016-09-01

    There has been intense debate over the immunological basis of schizophrenia, and the potential utility of adjunct immunotherapies. The major histocompatibility complex is consistently the most powerful region of association in genome-wide association studies (GWASs) of schizophrenia and has been interpreted as strong genetic evidence supporting the immune hypothesis. However, global pathway analyses provide inconsistent evidence of immune involvement in schizophrenia, and it remains unclear whether genetic data support an immune etiology per se. Here we empirically test the hypothesis that variation in immune genes contributes to schizophrenia. We show that there is no enrichment of immune loci outside of the MHC region in the largest genetic study of schizophrenia conducted to date, in contrast to 5 diseases of known immune origin. Among 108 regions of the genome previously associated with schizophrenia, we identify 6 immune candidates (DPP4, HSPD1, EGR1, CLU, ESAM, NFATC3) encoding proteins with alternative, nonimmune roles in the brain. While our findings do not refute evidence that has accumulated in support of the immune hypothesis, they suggest that genetically mediated alterations in immune function may not play a major role in schizophrenia susceptibility. Instead, there may be a role for pleiotropic effects of a small number of immune genes that also regulate brain development and plasticity. Whether immune alterations drive schizophrenia progression is an important question to be addressed by future research, especially in light of the growing interest in applying immunotherapies in schizophrenia. © The Author 2016. Published by Oxford University Press on behalf of the Maryland Psychiatric Research Center.

  3. A PLSPM-based test statistic for detecting gene-gene co-association in genome-wide association study with case-control design.

    Science.gov (United States)

    Zhang, Xiaoshuai; Yang, Xiaowei; Yuan, Zhongshang; Liu, Yanxun; Li, Fangyu; Peng, Bin; Zhu, Dianwen; Zhao, Jinghua; Xue, Fuzhong

    2013-01-01

    In genome-wide association data analysis, two genes in a pathway, two SNPs in two linked gene regions, or two SNPs in two linked exons within one gene are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which covers not only the effect of traditional interaction under near-independence but also the effect of correlation between the two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic performs better than the single SNP-based logistic model, the PCA-based logistic model, and other gene-based methods.

  4. Allowing for population stratification in case-only studies of gene-environment interaction, using genomic control.

    Science.gov (United States)

    Yadav, Pankaj; Freitag-Wolf, Sandra; Lieb, Wolfgang; Dempfle, Astrid; Krawczak, Michael

    2015-10-01

    Gene-environment interactions (G × E) have attracted considerable research interest in the past owing to their scientific and public health implications, but powerful statistical methods are required to successfully track down G × E, particularly at a genome-wide level. Previously, a case-only (CO) design has been proposed as a means to identify G × E with greater efficiency than traditional case-control or cohort studies. However, as with genotype-phenotype association studies themselves, hidden population stratification (PS) can impact the validity of G × E studies using a CO design. Since this problem has been subject to little research to date, we used comprehensive simulation to systematically assess the type I error rate, power and effect size bias of CO studies of G × E in the presence of PS. Three types of PS were considered, namely genetic-only (PSG), environment-only (PSE), and joint genetic and environmental stratification (PSGE). Our results reveal that the type I error rate of an unadjusted Wald test, appropriate for the CO design, would be close to its nominal level (0.05 in our study) as long as PS involves only one interaction partner (i.e., either PSG or PSE). In contrast, if the study population is stratified with respect to both G and E (i.e., if there is PSGE), then the type I error rate is seriously inflated and estimates of the underlying G × E interaction are biased. Comparison of CO to a family-based case-parents design confirmed that the latter is more robust against PSGE, as expected. However, case-parent trios may be particularly unsuitable for G × E studies in view of the fact that they require genotype data from parents and that many diseases with an environmental component are likely to be of late onset. An alternative approach to adjusting for PS is principal component analysis (PCA), which has been widely used for this very purpose in past genome-wide association studies (GWAS). However, resolving genetic PS properly by PCA

  5. HOX Gene Promoter Prediction and Inter-genomic Comparison: An Evo-Devo Study

    Directory of Open Access Journals (Sweden)

    Marla A. Endriga

    2010-10-01

    Homeobox genes direct the anterior-posterior axis of the body plan in eukaryotic organisms. Promoter regions upstream of the Hox genes jumpstart the transcription process. CpG islands found within the promoter regions can cause silencing of these promoters. The locations of the promoter regions and the CpG islands of Homo sapiens sapiens (human), Pan troglodytes (chimpanzee), Mus musculus (mouse), and Rattus norvegicus (brown rat) are compared and related to the possible influence on the specification of the mammalian body plan. The sequence of each gene in Hox clusters A-D of the mammals considered was retrieved from Ensembl, and the locations of promoter regions and CpG islands were predicted using Exon Finder. The predicted promoter sequences were confirmed via BLAST and verified against the Eukaryotic Promoter Database. The significance of the locations was determined using the Kruskal-Wallis test. Among the four clusters, only promoter locations in cluster B showed a significant difference. HOX B genes have been linked with the control of genes that direct the development of axial morphology, particularly of the vertebral column bones. The magnitude of variation among the body plans of closely related species can thus be partially attributed to the promoter kind, location and number, and gene inactivation via CpG methylation.
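
    The Kruskal-Wallis test mentioned above compares several groups by pooling all values, ranking them, and checking whether the rank sums differ more than chance would allow. The sketch below computes only the H statistic, using mid-ranks for ties and no further tie correction. The grouping and the numbers are hypothetical, since the abstract does not say which values were compared.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of non-empty groups.

    Ties receive mid-ranks; the tie-correction divisor is not applied.
    """
    pooled = sorted((v, g) for g, vals in enumerate(groups) for v in vals)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += mid_rank
        i = j
    weighted = sum(rs * rs / len(vals) for rs, vals in zip(rank_sums, groups))
    return 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)


# Hypothetical promoter offsets for three species (illustration only).
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

    For moderately sized groups, H is usually compared with a chi-square distribution with (number of groups - 1) degrees of freedom.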

  6. Gene enrichment in plant genomic shotgun libraries.

    Science.gov (United States)

    Rabinowicz, Pablo D; McCombie, W Richard; Martienssen, Robert A

    2003-04-01

    The Arabidopsis genome (about 130 Mbp) has been completely sequenced, whereas a draft sequence of the rice genome (about 430 Mbp) is now available and the sequencing of this genome will be completed in the near future. The much larger genomes of several important crop species, such as wheat (about 16,000 Mbp) or maize (about 2500 Mbp), may not be fully sequenced with current technology. Instead, sequencing-analysis strategies are being developed to obtain sequencing and mapping information selectively for the genic fraction (gene space) of complex plant genomes.

  7. Generalised Anxiety Disorder--A Twin Study of Genetic Architecture, Genome-Wide Association and Differential Gene Expression.

    Directory of Open Access Journals (Sweden)

    Matthew N Davies

    Generalised Anxiety Disorder (GAD) is a common anxiety-related diagnosis, affecting approximately 5% of the adult population. One characteristic of GAD is a high degree of anxiety sensitivity (AS), a personality trait which describes the fear of arousal-related sensations. Here we present a genome-wide association study of AS using a cohort of 730 MZ and DZ female twins. The GWAS showed a significant association for a variant within the RBFOX1 gene. A heritability analysis of the same cohort also confirmed a significant genetic component with h2 of 0.42. Additionally, a subset of the cohort (25 MZ twins discordant for AS) was studied for evidence of differential expression using RNA-seq data. Significant differential expression of two exons with the ITM2B gene within the discordant MZ subset was observed, a finding that was replicated in an independent cohort. While previous research has shown that anxiety has a high comorbidity with a variety of psychiatric and neurodegenerative disorders, our analysis suggests a novel etiology specific to AS.

  8. Generalised Anxiety Disorder--A Twin Study of Genetic Architecture, Genome-Wide Association and Differential Gene Expression.

    Science.gov (United States)

    Davies, Matthew N; Verdi, Serena; Burri, Andrea; Trzaskowski, Maciej; Lee, Minyoung; Hettema, John M; Jansen, Rick; Boomsma, Dorret I; Spector, Tim D

    2015-01-01

    Generalised Anxiety Disorder (GAD) is a common anxiety-related diagnosis, affecting approximately 5% of the adult population. One characteristic of GAD is a high degree of anxiety sensitivity (AS), a personality trait which describes the fear of arousal-related sensations. Here we present a genome-wide association study of AS using a cohort of 730 MZ and DZ female twins. The GWAS showed a significant association for a variant within the RBFOX1 gene. A heritability analysis of the same cohort also confirmed a significant genetic component with h2 of 0.42. Additionally, a subset of the cohort (25 MZ twins discordant for AS) was studied for evidence of differential expression using RNA-seq data. Significant differential expression of two exons with the ITM2B gene within the discordant MZ subset was observed, a finding that was replicated in an independent cohort. While previous research has shown that anxiety has a high comorbidity with a variety of psychiatric and neurodegenerative disorders, our analysis suggests a novel etiology specific to AS.
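
    The abstract reports a twin-based heritability (h2) of 0.42 but does not state how it was estimated; twin studies commonly fit structural equation models. As a rough illustration of the underlying idea, the classical Falconer approximation doubles the difference between the MZ and DZ twin correlations. The correlations below are hypothetical, chosen only so that the result matches the reported value.

```python
def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Falconer's approximation: h2 = 2 * (r_MZ - r_DZ)."""
    return 2.0 * (r_mz - r_dz)


# Hypothetical twin correlations (not taken from the study).
print(round(falconer_h2(r_mz=0.45, r_dz=0.24), 2))
```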

  9. Gene and genome duplication in Acanthamoeba polyphaga Mimivirus.

    Science.gov (United States)

    Suhre, Karsten

    2005-11-01

    Gene duplication is key to molecular evolution in all three domains of life and may be the first step in the emergence of new gene function. It is a well-recognized feature in large DNA viruses but has not been studied extensively in the largest known virus to date, the recently discovered Acanthamoeba polyphaga Mimivirus. Here, I present a systematic analysis of gene and genome duplication events in the mimivirus genome. I found that one-third of the mimivirus genes are related to at least one other gene in the mimivirus genome, either through a large segmental genome duplication event that occurred in the more remote past or through more recent gene duplication events, which often occur in tandem. This shows that gene and genome duplication played a major role in shaping the mimivirus genome. Using multiple alignments, together with remote-homology detection methods based on Hidden Markov Model comparison, I assign putative functions to some of the paralogous gene families. I suggest that a large part of the duplicated mimivirus gene families are likely to interfere with important host cell processes, such as transcription control, protein degradation, and cell regulatory processes. My findings support the view that large DNA viruses are complex evolving organisms, possibly deeply rooted within the tree of life, and oppose the paradigm that viral evolution is dominated by lateral gene acquisition, at least in regard to large DNA viruses.

  10. Genome-Wide Association Study Identifies NBS-LRR-Encoding Genes Related with Anthracnose and Common Bacterial Blight in the Common Bean.

    Science.gov (United States)

    Wu, Jing; Zhu, Jifeng; Wang, Lanfen; Wang, Shumin

    2017-01-01

    Nucleotide-binding site and leucine-rich repeat (NBS-LRR) genes represent the largest and most important disease resistance genes in plants. The genome sequence of the common bean (Phaseolus vulgaris L.) provides valuable data for determining the genomic organization of NBS-LRR genes. However, data on the NBS-LRR genes in the common bean are limited. In total, 178 NBS-LRR-type genes and 145 partial genes (with or without an NBS) located on 11 common bean chromosomes were identified from the genome sequence database. Furthermore, 30 NBS-LRR genes were classified into Toll/interleukin-1 receptor (TIR)-NBS-LRR (TNL) types, and 148 NBS-LRR genes were classified into coiled-coil (CC)-NBS-LRR (CNL) types. Moreover, the phylogenetic tree supported the division of these PvNBS genes into two obvious groups, TNL types and CNL types. We also built expression profiles of NBS genes in response to anthracnose and common bacterial blight using qRT-PCR. Finally, we detected nine disease resistance loci for anthracnose (ANT) and seven for common bacterial blight (CBB) using the developed NBS-SSR markers. Among these loci, NSSR24, NSSR73, and NSSR265 may be located at new regions for ANT resistance, while NSSR65 and NSSR260 may be located at new regions for CBB resistance. Furthermore, we validated NSSR24, NSSR65, NSSR73, NSSR260, and NSSR265 using a new natural population. Our results provide useful information regarding the function of the NBS-LRR proteins and will accelerate the functional genomics and evolutionary studies of NBS-LRR genes in food legumes. NBS-SSR markers represent a wide-reaching resource for molecular breeding in the common bean and other food legumes. Collectively, our results should be of broad interest to bean scientists and breeders.

  11. Gene conversion in the rice genome

    DEFF Research Database (Denmark)

    Xu, Shuqing; Clark, Terry; Zheng, Hongkun;

    2008-01-01

    BACKGROUND: Gene conversion causes a non-reciprocal transfer of genetic information between similar sequences. Gene conversion can both homogenize genes and recruit point mutations thereby shaping the evolution of multigene families. In the rice genome, the large number of duplicated genes...... is not tightly linked to natural selection in the rice genome. To assess the contribution of segmental duplication on gene conversion statistics, we determined locations of conversion partners with respect to inter-chromosomal segment duplication. The number of conversions associated with segmentation is less...

  12. European genome-wide association study identifies SLC14A1 as a new urinary bladder cancer susceptibility gene

    NARCIS (Netherlands)

    Rafnar, T.; Vermeulen, H.H.M.; Sulem, P.; Thorleifsson, G.; Aben, K.K.H.; Witjes, J.A.; Grotenhuis, A.J.; Verhaegh, G.W.C.T.; Hulsbergen- van de Kaa, C.A.; Besenbacher, S.; Gudbjartsson, D.; Stacey, S.N.; Gudmundsson, J.; Johannsdottir, H.; Bjarnason, H.; Zanon, C.; Helgadottir, H.; Jonasson, J.G.; Tryggvadottir, L.; Jonsson, E.; Geirsson, G.; Nikulasson, S.; Petursdottir, V.; Bishop, D.T.; Chung-Sak, S.; Choudhury, A.; Elliott, F.; Barrett, J.H.; Knowles, M.A.; Verdier, P. de; Ryk, C.; Lindblom, A.; Rudnai, P.; Gurzau, E.; Koppova, K.; Vineis, P.; Polidoro, S.; Guarrera, S.; Sacerdote, C.; Panadero, A.; Sanz-Velez, J.I.; Sanchez, M.; Valdivia, G.; Garcia-Prats, M.D.; Hengstler, J.G.; Selinski, S.; Gerullis, H.; Ovsiannikov, D.; Khezri, A.; Aminsharifi, A.; Malekzadeh, M.; Berg, L.H. van den; Ophoff, R.A.; Veldink, J.H.; Zeegers, M.P.; Kellen, E.; Fostinelli, J.; Andreoli, D.; Arici, C.; Porru, S.; Buntinx, F.; Ghaderi, A.; Golka, K.; Mayordomo, J.I.; Matullo, G.; Kumar, R.; Steineck, G.; Kiltie, A.E.; Kong, A.; Thorsteinsdottir, U.; Stefansson, K.; Kiemeney, L.A.L.M.

    2011-01-01

    Three genome-wide association studies in Europe and the USA have reported eight urinary bladder cancer (UBC) susceptibility loci. Using extended case and control series and 1000 Genomes imputations of 5 340 737 single-nucleotide polymorphisms (SNPs), we searched for additional loci in the European G

  13. Evidence for gene-environment interaction in a genome wide study of nonsyndromic cleft palate

    DEFF Research Database (Denmark)

    Beaty, Terri H; Ruczinski, Ingo; Murray, Jeffrey C

    2011-01-01

    consortium. Family-based association tests of single nucleotide polymorphisms (SNP) and three common maternal exposures (maternal smoking, alcohol consumption, and multivitamin supplementation) were used in a combined 2 df test for gene (G) and gene-environment (G × E) interaction simultaneously, plus...... G × E interaction was included. Among these, MLLT3 and SMC2 on chromosome 9 showed multiple SNPs resulting in an increased risk if the mother consumed alcohol during the peri-conceptual period (3 months prior to conception through the first trimester). TBK1 on chr. 12 and ZNF236 on chr. 18 showed...

  14. The use of multiple hierarchically independent gene ontology terms in gene function prediction and genome annotation

    NARCIS (Netherlands)

    Kourmpetis, Y.I.A.; Burgt, van der A.; Bink, M.C.A.M.; Braak, ter C.J.F.; Ham, van R.C.H.J.

    2007-01-01

    The Gene Ontology (GO) is a widely used controlled vocabulary for the description of gene function. In this study we quantify the usage of multiple and hierarchically independent GO terms in the curated genome annotations of seven well-studied species. In most genomes, significant proportions (6 -

  15. Confluence of genes, environment, development, and behavior in a post Genome-Wide Association Study world

    DEFF Research Database (Denmark)

    Vrieze, S. I.; Iacono, W. G.; McGue, M.

    2012-01-01

    , and expected payoffs. Using substance use and abuse as our driving example, we then turn to the importance of etiological psychological theory in guiding genetic, environmental, and developmental research, as well as the utility of refined phenotypic measures, such as endophenotypes, in the pursuit...... of etiological understanding and focused tests of genetic and environmental associations. Phenotypic measurement has received considerable attention in the history of psychology and is informed by psychometrics, whereas the environment remains relatively poorly measured and is often confounded with genetic...... variation, most of which remains to be leveraged in genetic association tests. Although the genetic data can be massive and burdensome (tens of millions of variants per person), we argue that improved understanding of genomic structure and function will provide investigators with new tools to test specific...

  16. Identification of a common variant affecting human episodic memory performance using a pooled genome-wide association approach: a case study of disease gene identification.

    Science.gov (United States)

    Pawlowski, Traci L; Huentelman, Matthew J

    2011-01-01

    Genome-wide association studies (GWAS) are an important tool for discovering novel genes associated with disease or traits. Careful design of case-control groups greatly facilitates the efficacy of these studies. Here we describe a pooled GWAS study undertaken to find novel genes associated with human episodic memory performance. A genomic locus for the WW and C2 domain-containing 1 protein, KIBRA (also known as WWC1), was found to be associated with memory performance in three cognitively normal cohorts from Switzerland and the USA. This result was further supported by correlation of KIBRA genotype and differences in hippocampal activation as measured by functional magnetic resonance imaging (fMRI). These findings provide an excellent example of the application of GWAS using a pooled genomic DNA approach to successfully identify a locus with strong effects on human memory.

  17. An efficient viral vector for functional genomic studies of Prunus fruit trees and its induced resistance to Plum pox virus via silencing of a host factor gene

    OpenAIRE

    Cui, Hongguang; Wang, Aiming

    2016-01-01

    Summary RNA silencing is a powerful technology for molecular characterization of gene functions in plants. A commonly used approach to the induction of RNA silencing is through genetic transformation. A potent alternative is to use a modified viral vector for virus-induced gene silencing (VIGS) to degrade RNA molecules sharing similar nucleotide sequence. Unfortunately, genomic studies in many allogamous woody perennials such as peach are severely hindered because they have a long juvenile pe...

  18. Genome classification by gene distribution: An overlapping subspace clustering approach

    Directory of Open Access Journals (Sweden)

    Halgamuge Saman K

    2008-04-01

    BACKGROUND: Genomes of lower organisms have been observed with a large amount of horizontal gene transfers, which cause difficulties in their evolutionary study. Bacteriophage genomes are a typical example. One recent approach that addresses this problem is the unsupervised clustering of genomes based on gene order and genome position, which helps to reveal species relationships that may not be apparent from traditional phylogenetic methods. RESULTS: We propose the use of an overlapping subspace clustering algorithm for such genome classification problems. The advantage of subspace clustering over traditional clustering is that it can associate clusters with gene arrangement patterns, preserving genomic information in the clusters produced. Additionally, overlapping capability is desirable for the discovery of multiple conserved patterns within a single genome, such as those acquired from different species via horizontal gene transfers. The proposed method involves a novel strategy to vectorize genomes based on their gene distribution. A number of existing subspace clustering and biclustering algorithms were evaluated to identify the best framework upon which to develop our algorithm; we extended a generic subspace clustering algorithm called HARP to incorporate overlapping capability. The proposed algorithm was assessed and applied on bacteriophage genomes. The phage grouping results are consistent overall with the Phage Proteomic Tree and showed common genomic characteristics among the TP901-like, Sfi21-like and sk1-like phage groups. Among 441 phage genomes, we identified four significantly conserved distribution patterns structured by the terminase, portal, integrase, holin and lysin genes. We also observed a subgroup of Sfi21-like phages comprising a distinctive divergent genome organization and identified nine new phage members to the Sfi21-like genus: Staphylococcus 71, phiPVL108, Listeria A118, 2389, Lactobacillus phi AT3, A2

  19. Use of genome-wide expression data to mine the "gray zone" of GWA studies leads to novel candidate obesity genes

    NARCIS (Netherlands)

    J. Naukkarinen (Jussi); I. Surakka (Ida); K.H. Pietilainen (Kirsi Hannele); A. Rissanen (Aila); V. Salomaa (Veikko); S. Ripatti (Samuli); H. Yki-Jarvinen (Hannele); C.M. van Duijn (Cock); H.E. Wichmann (Heinz Erich); J. Kaprio (Jaakko); M. Taskinen (Marja Riitta); L. Peltonen (Leena Johanna)

    2010-01-01

    To get beyond the "low-hanging fruits" so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity.

  20. Integrating pathway analysis and genetics of gene expression for genome-wide association study of basal cell carcinoma.

    Science.gov (United States)

    Zhang, Mingfeng; Liang, Liming; Morar, Nilesh; Dixon, Anna L; Lathrop, G Mark; Ding, Jun; Moffatt, Miriam F; Cookson, William O C; Kraft, Peter; Qureshi, Abrar A; Han, Jiali

    2012-04-01

    Genome-wide association studies (GWASs) have primarily focused on marginal effects for individual markers and have incorporated external functional information only after identifying robust statistical associations. We applied a new approach combining the genetics of gene expression and functional classification of genes to the GWAS of basal cell carcinoma (BCC) to identify potential biological pathways associated with BCC. We first identified 322,324 expression-associated single-nucleotide polymorphisms (eSNPs) from two existing GWASs of global gene expression in lymphoblastoid cell lines (n = 955), and evaluated the association of these functionally annotated SNPs with BCC among 2,045 BCC cases and 6,013 controls in Caucasians. We then grouped them into 99 KEGG pathways for pathway analysis and identified two pathways associated with BCC with p value <0.05 and false discovery rate (FDR) <0.5: the autoimmune thyroid disease pathway (mainly HLA class I and II antigens, p < 0.001, FDR = 0.24) and Janus kinase-signal transducer and activator of transcription (JAK-STAT) signaling pathway (p = 0.02, FDR = 0.49). Seventy-nine (25.7%) out of 307 significant eSNPs in the JAK-STAT pathway were associated with BCC risk (p < 0.05) in an independent replication set of 278 BCC cases and 1,262 controls. In addition, the association of JAK-STAT signaling pathway was marginally validated using 16,691 eSNPs identified from 110 normal skin samples (p = 0.08). Based on the evidence of biological functions of the JAK-STAT pathway on oncogenesis, it is plausible that this pathway is involved in BCC pathogenesis.
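
    The abstract above screens pathways by both a p value and a false discovery rate (FDR) but does not name the FDR procedure. A common choice is the Benjamini-Hochberg adjustment, sketched here as an assumption rather than as the authors' method. The function name and input values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical pathway-level p-values (illustration only).
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

    A pathway would then be retained when its adjusted value falls below the chosen FDR threshold, such as the 0.5 cutoff quoted in the abstract.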

  1. Genomic evidence for adaptation by gene duplication.

    Science.gov (United States)

    Qian, Wenfeng; Zhang, Jianzhi

    2014-08-01

    Gene duplication is widely believed to facilitate adaptation, but unambiguous evidence for this hypothesis has been found in only a small number of cases. Although gene duplication may increase the fitness of the involved organisms by doubling gene dosage or neofunctionalization, it may also result in a simple division of ancestral functions into daughter genes, which need not promote adaptation. Hence, the general validity of the adaptation by gene duplication hypothesis remains uncertain. Indeed, a genome-scale experiment found similar fitness effects of deleting pairs of duplicate genes and deleting individual singleton genes from the yeast genome, leading to the conclusion that duplication rarely results in adaptation. Here we contend that the above comparison is unfair because of a known duplication bias among genes with different fitness contributions. To rectify this problem, we compare homologous genes from the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe. We discover that simultaneously deleting a duplicate gene pair in S. cerevisiae reduces fitness significantly more than deleting their singleton counterpart in S. pombe, revealing post-duplication adaptation. The duplicates-singleton difference in fitness effect is not attributable to a potential increase in gene dose after duplication, suggesting that the adaptation is owing to neofunctionalization, which we find to be explicable by acquisitions of binary protein-protein interactions rather than gene expression changes. These results provide genomic evidence for the role of gene duplication in organismal adaptation and are important for understanding the genetic mechanisms of evolutionary innovation.

  2. Gene finding in the chicken genome

    Directory of Open Access Journals (Sweden)

    Antonarakis Stylianos E

    2005-05-01

    BACKGROUND: Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method. RESULTS: We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end. CONCLUSIONS: De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods.

  3. Genome-wide association study identifies SESTD1 as a novel risk gene for lithium-responsive bipolar disorder.

    Science.gov (United States)

    Song, J; Bergen, S E; Di Florio, A; Karlsson, R; Charney, A; Ruderfer, D M; Stahl, E A; Chambert, K D; Moran, J L; Gordon-Smith, K; Forty, L; Green, E K; Jones, I; Jones, L; Scolnick, E M; Sklar, P; Smoller, J W; Lichtenstein, P; Hultman, C; Craddock, N; Landén, M; Smoller, Jordan W; Perlis, Roy H; Lee, Phil Hyoun; Castro, Victor M; Hoffnagle, Alison G; Sklar, Pamela; Stahl, Eli A; Purcell, Shaun M; Ruderfer, Douglas M; Charney, Alexander W; Roussos, Panos; Michele Pato, Carlos Pato; Medeiros, Helen; Sobel, Janet; Craddock, Nick; Jones, Ian; Forty, Liz; Florio, Arianna Di; Green, Elaine; Jones, Lisa; Gordon-Smith, Katherine; Landen, Mikael; Hultman, Christina; Jureus, Anders; Bergen, Sarah; McCarroll, Steven; Moran, Jennifer; Smoller, Jordan W; Chambert, Kimberly; Belliveau, Richard A

    2016-09-01

    Lithium is the mainstay prophylactic treatment for bipolar disorder (BD), but treatment response varies considerably across individuals. Patients who respond well to lithium treatment might represent a relatively homogeneous subtype of this genetically and phenotypically diverse disorder. Here, we performed genome-wide association studies (GWAS) to identify (i) specific genetic variations influencing lithium response and (ii) genetic variants associated with risk for lithium-responsive BD. Patients with BD and controls were recruited from Sweden and the United Kingdom. GWAS were performed on 2698 patients with subjectively defined (self-reported) lithium response and 1176 patients with objectively defined (clinically documented) lithium response. We next conducted GWAS comparing lithium responders with healthy controls (1639 subjective responders and 8899 controls; 323 objective responders and 6684 controls). Meta-analyses of Swedish and UK results revealed no significant associations with lithium response within the bipolar subjects. However, when comparing lithium-responsive patients with controls, two imputed markers attained genome-wide significant associations, among which one was validated in confirmatory genotyping (rs116323614, P = 2.74 × 10^-8). It is an intronic single-nucleotide polymorphism (SNP) on chromosome 2q31.2 in the gene SEC14 and spectrin domains 1 (SESTD1), which encodes a protein involved in regulation of phospholipids. Phospholipids have been strongly implicated as lithium treatment targets. Furthermore, we estimated the proportion of variance for lithium-responsive BD explained by common variants ('SNP heritability') as 0.25 and 0.29 using two definitions of lithium response. Our results revealed a genetic variant in SESTD1 associated with risk for lithium-responsive BD, suggesting that the understanding of BD etiology could be furthered by focusing on this subtype of BD.
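The abstract above does not state which significance threshold was applied. As a minimal sketch, assuming the conventional genome-wide threshold of 5 × 10^-8 (a Bonferroni correction of 0.05 for roughly one million independent common-variant tests), the reported P-value for rs116323614 can be checked as follows. The function name and constants are illustrative and are not taken from the study.

```python
# Assumed convention, not stated in the abstract: alpha = 0.05 corrected for
# about one million independent tests gives a threshold of 5e-8.
ALPHA = 0.05
N_INDEPENDENT_TESTS = 1_000_000
GENOME_WIDE_THRESHOLD = ALPHA / N_INDEPENDENT_TESTS


def is_genome_wide_significant(p_value, threshold=GENOME_WIDE_THRESHOLD):
    """Return True when a P-value falls below the chosen threshold."""
    return p_value < threshold


# P-value reported in the abstract for the validated marker rs116323614.
reported_p = 2.74e-8
validated_marker_passes = is_genome_wide_significant(reported_p)
```

Under this assumed threshold the validated marker passes, while a marker with P = 1 × 10^-7 would not.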

  4. Analysis of pan-genome to identify the core genes and essential genes of Brucella spp.

    Science.gov (United States)

    Yang, Xiaowen; Li, Yajie; Zang, Juan; Li, Yexia; Bie, Pengfei; Lu, Yanli; Wu, Qingmin

    2016-04-01

    Brucella spp. are facultative intracellular pathogens that cause a contagious zoonotic disease, which can result in abortion or sterility in susceptible animal hosts and grave, debilitating illness in humans. To decipher the in vivo survival mechanisms of Brucella spp., 42 complete Brucella genomes from NCBI were analyzed to determine the composition and function of the pan-genome and core genome. The 132,143 protein-coding genes in these genomes were divided into 5369 clusters. Among these, 1710 clusters were associated with the core genome, 1182 clusters with strain-specific genes and 2477 clusters with the dispensable genome. COG analysis indicated that 44% of the core genes were devoted to metabolism, mainly energy production and conversion (COG category C) and amino acid transport and metabolism (COG category E). Meanwhile, approximately 35% of the core genes were under positive selection. In addition, 1252 potential essential genes were predicted in the core genome by comparison with a prokaryote database of essential genes. The results suggest that the core genes in Brucella genomes are relatively conserved, and that energy and amino acid metabolism play a more important role in the growth and reproduction of Brucella spp. This study might help us to better understand the mechanisms of persistent Brucella infection and provide some clues for further exploring the gene modules involved in the intracellular survival of Brucella spp.
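The partition of gene clusters described above can be illustrated with a short sketch. It assumes the usual pan-genome definitions, which the abstract does not spell out: a cluster is "core" if present in every genome, "strain-specific" if present in exactly one, and "dispensable" otherwise. The function name and toy data are hypothetical; only the three reported totals come from the abstract.

```python
from collections import Counter


def classify_clusters(presence, n_genomes):
    """Label each gene cluster by how many genomes contain it.

    presence maps a cluster id to the set of genome ids that carry it.
    """
    labels = {}
    for cluster, genomes in presence.items():
        k = len(genomes)
        if k == 0:
            raise ValueError(f"cluster {cluster!r} is present in no genome")
        if k == n_genomes:
            labels[cluster] = "core"
        elif k == 1:
            labels[cluster] = "strain-specific"
        else:
            labels[cluster] = "dispensable"
    return labels


# Toy example with three genomes (made-up data, not from the study).
toy_presence = {
    "c1": {"g1", "g2", "g3"},
    "c2": {"g1"},
    "c3": {"g2", "g3"},
    "c4": {"g1", "g2", "g3"},
}
toy_counts = Counter(classify_clusters(toy_presence, n_genomes=3).values())

# Consistency check on the totals reported in the abstract.
reported = {"core": 1710, "strain-specific": 1182, "dispensable": 2477}
total_reported = sum(reported.values())
```

The three reported categories sum to 5369, matching the total number of clusters given in the abstract.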

  5. A Refined Study of FCRL Genes from a Genome-Wide Association Study for Graves Disease: e57758

    National Research Council Canada - National Science Library

    Shuang-Xia Zhao; Wei Liu; Ming Zhan; Zhi-Yi Song; Shao-Ying Yang; Li-Qiong Xue; Chun-Ming Pan; Zhao-Hui Gu; Bing-Li Liu; Hai-Ning Wang; Liming Liang; Jun Liang; Xiao-Mei Zhang; Guo-Yue Yuan; Chang-Gui Li; Ming-Dao Chen; Jia-Lun Chen; Guan-Qi Gao; Huai-Dong Song; of Autoimmune Thyroid Disease

    2013-01-01

      To pinpoint the exact location of the etiological variant/s present at 1q21.1 harboring FCRL1-5 and CD5L genes, we carried out a refined association study in the entire FCRL region in 1,536 patients with Graves' disease (GD...

  6. Gene and genome parameters of mammalian liver circadian genes (LCGs).

    Directory of Open Access Journals (Sweden)

    Gang Wu

    Full Text Available The mammalian circadian system controls various physiological processes and behavioral responses by regulating thousands of circadian genes with rhythmic expression. In this study, we redefined circadian-regulated genes based on published results in the mouse liver and compared them with other gene groups defined relative to circadian regulation, especially the non-circadian-regulated genes expressed in liver, at multiple molecular levels from gene position to protein expression, based on integrative analyses of different datasets from the literature. Based on the intra-tissue analysis, the liver circadian genes or LCGs show unique features when compared to other gene groups. First, LCGs in general have fewer neighboring genes and are longer in both genomic and 3'-UTR lengths but shorter in CDS (coding sequence) lengths. Second, LCGs have higher mRNA and protein abundance, higher temporal expression variations, and shorter mRNA half-life. Third, more than 60% of LCGs form major co-expression clusters centered in four temporal windows: dawn, day, dusk, and night. In addition, larger and smaller LCGs are found mainly expressed in the day and night temporal windows, respectively, and we believe that LCGs are well-partitioned into the gene expression regulatory network that takes advantage of gene size, expression constraint, and chromosomal architecture. Based on inter-tissue analysis, more than half of LCGs are ubiquitously expressed in multiple tissues but show rhythmic expression in only one or a limited number of tissues. LCGs show at least three-fold lower expression variations across the temporal windows than among different tissues, and this observation suggests that temporal expression variations regulated by the circadian system are relatively subtle compared with the tissue expression variations formed during development. Taken together, we suggest that the circadian system selects gene parameters in a cost effective way to improve tissue

  7. Maximum likelihood for genome phylogeny on gene content.

    Science.gov (United States)

    Zhang, Hongmei; Gu, Xun

    2004-01-01

    With the rapid growth of whole-genome data, reconstructing the phylogenetic relationships among different genomes has become a hot topic in comparative genomics. The maximum likelihood approach is one of several available approaches and has been very successful. However, no study has reported its application to genome tree-making, mainly because of the lack of an analytical form of a probability model and/or the heavy computational burden. In this paper we studied the mathematical structure of the stochastic model of genome evolution, and then developed a simplified likelihood function for observing a specific phylogenetic pattern in the four-genome case using gene content information. We use the maximum likelihood approach to identify phylogenetic trees. Simulation results indicate that the proposed method works well and identifies the correct tree at a high rate. Application to real data gives satisfactory results. The approach developed in this paper can serve as the basis for reconstructing phylogenies of more than four genomes.

  8. Genomic studies of envelope gene sequences from mosquito and human samples from Bangkok, Thailand.

    Science.gov (United States)

    Pitaksajjakul, Pannamthip; Benjathummarak, Surachet; Son, Hyun Ngoc; Thongrungkiat, Supatra; Ramasoota, Pongrama

    2016-01-01

    Dengue virus (DENV) is an RNA virus showing a high degree of genetic variation as a consequence of its proofreading inability. This variation plays an important role in virus evolution and pathogenesis. Although levels of within-host genetic variation are similar following equilibrium, variation among different hosts is frequently different. To identify dengue quasispecies present among two hosts, we collected patient samples from six acute DENV cases and two pools of Aedes aegypti mosquitoes and analyzed the genetic variation of regions of the viral envelope gene. Among human and mosquito samples, we found three major clusters originating from two subpopulations. Although several shared lineages were observed in the two hosts, only one lineage showing evidence of neutral selection was observed among two hosts. Taken together, our data provide evidence for the existence of a DENV quasispecies, with less genetic variation observed in mosquitoes than humans and with circulating lineages found in both host types.

  9. Genomics-Based Discovery of Plant Genes for Synthetic Biology of Terpenoid Fragrances: A Case Study in Sandalwood oil Biosynthesis.

    Science.gov (United States)

    Celedon, J M; Bohlmann, J

    2016-01-01

    Terpenoid fragrances are powerful mediators of ecological interactions in nature and have a long history of traditional and modern industrial applications. Plants produce a great diversity of fragrant terpenoid metabolites, which make them a superb source of biosynthetic genes and enzymes. Advances in fragrance gene discovery have enabled new approaches in synthetic biology of high-value speciality molecules toward applications in the fragrance and flavor, food and beverage, cosmetics, and other industries. Rapid developments in transcriptome and genome sequencing of nonmodel plant species have accelerated the discovery of fragrance biosynthetic pathways. In parallel, advances in metabolic engineering of microbial and plant systems have established platforms for synthetic biology applications of some of the thousands of plant genes that underlie fragrance diversity. While many fragrance molecules (eg, simple monoterpenes) are abundant in readily renewable plant materials, some highly valuable fragrant terpenoids (eg, santalols, ambroxides) are rare in nature and interesting targets for synthetic biology. As a representative example for genomics/transcriptomics enabled gene and enzyme discovery, we describe a strategy used successfully for elucidation of a complete fragrance biosynthetic pathway in sandalwood (Santalum album) and its reconstruction in yeast (Saccharomyces cerevisiae). We address questions related to the discovery of specific genes within large gene families and recovery of rare gene transcripts that are selectively expressed in recalcitrant tissues. To substantiate the validity of the approaches, we describe the combination of methods used in the gene and enzyme discovery of a cytochrome P450 in the fragrant heartwood of tropical sandalwood, responsible for the fragrance defining, final step in the biosynthesis of (Z)-santalols. © 2016 Elsevier Inc. All rights reserved.

  10. Coelacanth genome sequence reveals the evolutionary history of vertebrate genes.

    Science.gov (United States)

    Noonan, James P; Grimwood, Jane; Danke, Joshua; Schmutz, Jeremy; Dickson, Mark; Amemiya, Chris T; Myers, Richard M

    2004-12-01

    The coelacanth is one of the nearest living relatives of tetrapods. However, a teleost species such as zebrafish or Fugu is typically used as the outgroup in current tetrapod comparative sequence analyses. Such studies are complicated by the fact that teleost genomes have undergone a whole-genome duplication event, as well as individual gene-duplication events. Here, we demonstrate the value of coelacanth genome sequence by complete sequencing and analysis of the protocadherin gene cluster of the Indonesian coelacanth, Latimeria menadoensis. We found that coelacanth has 49 protocadherin cluster genes organized in the same three ordered subclusters, alpha, beta, and gamma, as the 54 protocadherin cluster genes in human. In contrast, whole-genome and tandem duplications have generated two zebrafish protocadherin clusters comprised of at least 97 genes. Additionally, zebrafish protocadherins are far more prone to homogenizing gene conversion events than coelacanth protocadherins, suggesting that recombination- and duplication-driven plasticity may be a feature of teleost genomes. Our results indicate that coelacanth provides the ideal outgroup sequence against which tetrapod genomes can be measured. We therefore present L. menadoensis as a candidate for whole-genome sequencing.

  11. Use of Genome-Wide Expression Data to Mine the “Gray Zone” of GWA Studies Leads to Novel Candidate Obesity Genes

    Science.gov (United States)

    Naukkarinen, Jussi; Surakka, Ida; Pietiläinen, Kirsi H.; Rissanen, Aila; Salomaa, Veikko; Ripatti, Samuli; Yki-Järvinen, Hannele; van Duijn, Cornelia M.; Wichmann, H.-Erich; Kaprio, Jaakko; Taskinen, Marja-Riitta; Peltonen, Leena

    2010-01-01

    To get beyond the “low-hanging fruits” so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity. Here we describe a novel integrative method for complex disease gene identification utilizing both genome-wide transcript profiling of adipose tissue samples and consequent analysis of genome-wide association data generated in large SNP scans. We infer causality of genes with obesity by employing a unique set of monozygotic twin pairs discordant for BMI (n = 13 pairs, age 24–28 years, 15.4 kg mean weight difference) and contrast the transcript profiles with those from a larger sample of non-related adult individuals (N = 77). Using this approach, we were able to identify 27 genes with possibly causal roles in determining the degree of human adiposity. Testing for association of SNP variants in these 27 genes in the population samples of the large ENGAGE consortium (N = 21,000) revealed a significant deviation of P-values from the expected (P = 4 × 10^-4). A total of 13 genes contained SNPs nominally associated with BMI. The top finding was blood coagulation factor F13A1 identified as a novel obesity gene also replicated in a second GWA set of ∼2,000 individuals. This study presents a new approach to utilizing gene expression studies for informing choice of candidate genes for complex human phenotypes, such as obesity. PMID:20532202

  12. Use of genome-wide expression data to mine the "Gray Zone" of GWA studies leads to novel candidate obesity genes.

    Directory of Open Access Journals (Sweden)

    Jussi Naukkarinen

    2010-06-01

    Full Text Available To get beyond the "low-hanging fruits" so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity. Here we describe a novel integrative method for complex disease gene identification utilizing both genome-wide transcript profiling of adipose tissue samples and consequent analysis of genome-wide association data generated in large SNP scans. We infer causality of genes with obesity by employing a unique set of monozygotic twin pairs discordant for BMI (n = 13 pairs, age 24-28 years, 15.4 kg mean weight difference) and contrast the transcript profiles with those from a larger sample of non-related adult individuals (N = 77). Using this approach, we were able to identify 27 genes with possibly causal roles in determining the degree of human adiposity. Testing for association of SNP variants in these 27 genes in the population samples of the large ENGAGE consortium (N = 21,000) revealed a significant deviation of P-values from the expected (P = 4 × 10^-4). A total of 13 genes contained SNPs nominally associated with BMI. The top finding was blood coagulation factor F13A1 identified as a novel obesity gene also replicated in a second GWA set of approximately 2,000 individuals. This study presents a new approach to utilizing gene expression studies for informing choice of candidate genes for complex human phenotypes, such as obesity.

  13. Recent Achievement in Gene Cloning and Functional Genomics in Soybean

    Directory of Open Access Journals (Sweden)

    Zhengjun Xia

    2013-01-01

    Full Text Available Soybean is a model plant for photoperiodism as well as for symbiotic nitrogen fixation. However, a rather low efficiency in soybean transformation hampers functional analysis of genes isolated from soybean. In comparison, rapid development and progress in flowering time and photoperiodic response have been achieved in Arabidopsis and rice. As the soybean genomic information has been released since 2008, gene cloning and functional genomic studies have been revived as indicated by successfully characterizing genes involved in maturity and nematode resistance. Here, we review some major achievements in the cloning of some important genes and some specific features at genetic or genomic levels revealed by the analysis of functional genomics of soybean.

  14. Genes but not genomes reveal bacterial domestication of Lactococcus lactis.

    Directory of Open Access Journals (Sweden)

    Delphine Passerini

    Full Text Available BACKGROUND: The population structure and diversity of Lactococcus lactis subsp. lactis, a major industrial bacterium involved in milk fermentation, was determined at both gene and genome level. Seventy-six lactococcal isolates of various origins were studied by different genotyping methods and thirty-six strains displaying unique macrorestriction fingerprints were analyzed by a new multilocus sequence typing (MLST) scheme. This gene-based analysis was compared to genomic characteristics determined by pulsed-field gel electrophoresis (PFGE). METHODOLOGY/PRINCIPAL FINDINGS: The MLST analysis revealed that L. lactis subsp. lactis is essentially clonal with infrequent intra- and intergenic recombination; also, despite its taxonomical classification as a subspecies, it displays a genetic diversity as substantial as that within several other bacterial species. Genome-based analysis revealed a genome size variability of 20%, a value typical of bacteria inhabiting different ecological niches, and that suggests a large pan-genome for this subspecies. However, the genomic characteristics (macrorestriction pattern, genome or chromosome size, plasmid content) did not correlate with the MLST-based phylogeny, with strains from the same sequence type (ST) differing by up to 230 kb in genome size. CONCLUSION/SIGNIFICANCE: The gene-based phylogeny was not fully consistent with the traditional classification into dairy and non-dairy strains but supported a new classification based on ecological separation between "environmental" strains, the main contributors to the genetic diversity within the subspecies, and "domesticated" strains, subject to recent genetic bottlenecks. Comparison between gene- and genome-based analyses revealed little relationship between core and dispensable genome phylogenies, indicating that clonal diversification and phenotypic variability of the "domesticated" strains essentially arose through substantial genomic flux within the dispensable

  15. Genome-wide gene expression analysis of anguillid herpesvirus 1

    NARCIS (Netherlands)

    Beurden, van S.J.; Peeters, B.P.H.; Rottier, P.J.M.; Davison, A.A.; Engelsma, M.Y.

    2013-01-01

    Background Whereas temporal gene expression in mammalian herpesviruses has been studied extensively, little is known about gene expression in fish herpesviruses. Here we report a genome-wide transcription analysis of a fish herpesvirus, anguillid herpesvirus 1, in cell culture, studied during the

  16. Genomics technologies to study structural variations in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Cardone Maria Francesca

    2016-01-01

    Full Text Available Grapevine is one of the most important crop plants in the world. Genomic resources for grapevine have recently expanded greatly, supporting growing efforts in molecular breeding. Current cultivars display a high level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences and to find the responsible genes selected by cross-breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs) in plant genomes, few data are available on copy number variation (CNV). Furthermore, association between structural variations and phenotypes has been described in only a few cases. We combined high-throughput biotechnologies and bioinformatics tools to reveal the first inter-varietal atlas of structural variation (SV) for the grapevine genome. We sequenced four table grape cultivars and compared them with the genome of the Pinot noir inbred line PN40024 as the reference. We found that roughly 8% of the grapevine genome is affected by genomic variations. Taking into account the phenotypic differences among the studied varieties, we compared SVs among them and the reference, and then performed an in-depth analysis of the gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

  17. Reproduction-related genes in the pearl oyster genome.

    Science.gov (United States)

    Matsumoto, Toshie; Masaoka, Tetsuji; Fujiwara, Atsushi; Nakamura, Yoji; Satoh, Nori; Awaji, Masahiko

    2013-10-01

    Molluscan reproduction has been a target of biological research because of the various reproductive strategies that have evolved in this phylum. It has also been studied for the development of fisheries technologies, particularly aquaculture. Although fundamental processes of reproduction in other phyla, such as vertebrates and arthropods, have been well studied, information on the molecular mechanisms of molluscan reproduction remains limited. The recently released draft genome of the pearl oyster Pinctada fucata provides a novel and powerful platform for obtaining structural information on the genes and proteins involved in bivalve reproduction. In the present study, we analyzed the pearl oyster draft genome to screen reproduction-related genes. Analysis was mainly conducted for genes reported from other molluscs for encoding orthologs of reproduction-related proteins in other phyla. The gene search in the P. fucata gene models (version 1.1) and genome assembly (version 1.0) were performed using Genome Browser and BLAST software. The obtained gene models were then BLASTP searched against a public database to confirm the best-hit sequences. As a result, more than 40 gene models were identified with high accuracy to encode reproduction-related genes reported for P. fucata and other molluscs. These include vasa, nanos, doublesex- and mab-3-related transcription factor, 5-hydroxytryptamine (5-HT) receptors, vitellogenin, estrogen receptor, and others. The set of reproduction-related genes of P. fucata identified in the present study constitute a new tool for research on bivalve reproduction at the molecular level.

  18. Genome-wide Analysis of Gene Regulation

    DEFF Research Database (Denmark)

    Chen, Yun

    cells are capable of regulating their gene expression, so that each cell can only express a particular set of genes yielding limited numbers of proteins with specialized functions. Therefore a rigid control of differential gene expression is necessary for cellular diversity. On the other hand, aberrant...... gene regulation will disrupt the cell’s fundamental processes, which in turn can cause disease. Hence, understanding gene regulation is essential for deciphering the code of life. Along with the development of high throughput sequencing (HTS) technology and the subsequent large-scale data analysis......, genome-wide assays have increased our understanding of gene regulation significantly. This thesis describes the integration and analysis of HTS data across different important aspects of gene regulation. Gene expression can be regulated at different stages when the genetic information is passed from gene...

  19. Genome-wide association studies and epistasis analyses of candidate genes related to age at menarche and age at natural menopause in a Korean population.

    Science.gov (United States)

    Pyun, Jung-A; Kim, Sunshin; Cho, Nam H; Koh, InSong; Lee, Jong-Young; Shin, Chol; Kwack, KyuBum

    2014-05-01

    The aim of this study was to identify polymorphisms and gene-gene interactions that are significantly associated with age at menarche and age at menopause in a Korean population. A total of 3,452 and 1,827 women participated in studies of age at menarche and age at natural menopause, respectively. Linear regression analyses adjusted for residence area were used to perform genome-wide association studies (GWAS), candidate gene association studies, and interactions between the candidate genes for age at menarche and age at natural menopause. In GWAS, four single nucleotide polymorphisms (SNPs; rs7528241, rs1324329, rs11597068, and rs6495785) were strongly associated with age at natural menopause (lowest P = 9.66 × 10). However, GWAS of age at menarche did not reveal any strong associations. In candidate gene association studies, SNPs with P menopause, there was a significant interaction between intronic SNPs on ADAM metallopeptidase with thrombospondin type I motif 9 (ADAMTS9) and SMAD family member 3 (SMAD3) genes (P = 9.52 × 10). For age at menarche, there were three significant interactions between three intronic SNPs on follicle-stimulating hormone receptor (FSHR) gene and one SNP located at the 3' flanking region of insulin-like growth factor 2 receptor (IGF2R) gene (lowest P = 1.95 × 10). Novel SNPs and synergistic interactions between candidate genes are significantly associated with age at menarche and age at natural menopause in a Korean population.

  20. Ab initio gene identification: prokaryote genome annotation with GeneScan and GLIMMER

    Indian Academy of Sciences (India)

    Gautam Aggarwal; Ramakrishna Ramaswamy

    2002-02-01

    We compare the annotation of three complete genomes using the ab initio methods of gene identification GeneScan and GLIMMER. The annotation given in GenBank, the standard against which these are compared, has been made using GeneMark. We find a number of novel genes which are predicted by both methods used here, as well as a number of genes that are predicted by GeneMark, but are not identified by either of the nonconsensus methods that we have used. The three organisms studied here are all prokaryotic species with fairly compact genomes. The Fourier measure forms the basis for an efficient non-consensus method for gene prediction, and the algorithm GeneScan exploits this measure. We have bench-marked this program as well as GLIMMER using 3 complete prokaryotic genomes. An effort has also been made to study the limitations of these techniques for complete genome analysis. GeneScan and GLIMMER are of comparable accuracy insofar as gene-identification is concerned, with sensitivities and specificities typically greater than 0.9. The number of false predictions (both positive and negative) is higher for GeneScan as compared to GLIMMER, but in a significant number of cases, similar results are provided by the two techniques. This suggests that there could be some as-yet unidentified additional genes in these three genomes, and also that some of the putative identifications made hitherto might require re-evaluation. All these cases are discussed in detail.
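The sensitivity and specificity figures quoted above can be made concrete with a small sketch. It assumes the convention common in gene-finding benchmarks, where sensitivity is TP / (TP + FN) and "specificity" is TP / (TP + FP); the abstract does not state its exact definitions. It also assumes a prediction counts as correct only on an exact coordinate match. The function name and toy coordinates are hypothetical.

```python
def gene_prediction_accuracy(annotated, predicted):
    """Compare predicted genes with a reference annotation.

    Genes are (start, end, strand) tuples; a prediction is a true positive
    only if it matches a reference gene exactly (an assumption of this sketch).
    Returns (sensitivity, specificity) with specificity = TP / (TP + FP).
    """
    annotated, predicted = set(annotated), set(predicted)
    tp = len(annotated & predicted)   # predicted and in the reference
    fn = len(annotated - predicted)   # in the reference but missed
    fp = len(predicted - annotated)   # predicted but not in the reference
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, specificity


# Toy data: four reference genes, four predictions, three in common.
reference = [(100, 400, "+"), (500, 900, "-"), (1200, 1500, "+"), (2000, 2600, "+")]
predictions = [(100, 400, "+"), (500, 900, "-"), (1200, 1500, "+"), (3000, 3300, "-")]
sens, spec = gene_prediction_accuracy(reference, predictions)
```

In this toy case both measures equal 0.75. Predictions absent from the reference lower the specificity here, although, as the abstract notes, some of them may be genuine genes missing from the annotation.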

  1. Genome editing for human gene therapy.

    Science.gov (United States)

    Meissner, Torsten B; Mandal, Pankaj K; Ferreira, Leonardo M R; Rossi, Derrick J; Cowan, Chad A

    2014-01-01

    The rapid advancement of genome-editing techniques holds much promise for the field of human gene therapy. From bacteria to model organisms and human cells, genome editing tools such as zinc-finger nucleases (ZFNs), TALENs, and CRISPR/Cas9 have been successfully used to manipulate the respective genomes with unprecedented precision. With regard to human gene therapy, it is of great interest to test the feasibility of genome editing in primary human hematopoietic cells that could potentially be used to treat a variety of human genetic disorders such as hemoglobinopathies, primary immunodeficiencies, and cancer. In this chapter, we explore the use of the CRISPR/Cas9 system for the efficient ablation of genes in two clinically relevant primary human cell types, CD4+ T cells and CD34+ hematopoietic stem and progenitor cells. By using two guide RNAs directed at a single locus, we achieve highly efficient and predictable deletions that ablate gene function. The use of a Cas9-2A-GFP fusion protein allows FACS-based enrichment of the transfected cells. The ease of designing, constructing, and testing guide RNAs makes this dual guide strategy an attractive approach for the efficient deletion of clinically relevant genes in primary human hematopoietic stem and effector cells and enables the use of CRISPR/Cas9 for gene therapy.

  2. A Genetic Predictive Model for Canine Hip Dysplasia: Integration of Genome Wide Association Study (GWAS) and Candidate Gene Approaches

    Science.gov (United States)

    Bartolomé, Nerea; Segarra, Sergi; Artieda, Marta; Francino, Olga; Sánchez, Elisenda; Szczypiorska, Magdalena; Casellas, Joaquim; Tejedor, Diego; Cerdeira, Joaquín; Martínez, Antonio; Velasco, Alfonso; Sánchez, Armand

    2015-01-01

    Canine hip dysplasia is one of the most prevalent developmental orthopedic diseases in dogs worldwide. Unfortunately, the success of eradication programs against this disease based on radiographic diagnosis is low. Adding the use of diagnostic genetic tools to the current phenotype-based approach might be beneficial. The aim of this study was to develop a genetic prognostic test for early diagnosis of hip dysplasia in Labrador Retrievers. To develop our DNA test, 775 Labrador Retrievers were recruited. For each dog, a blood sample and a ventrodorsal hip radiograph were taken. Dogs were divided into two groups according to their FCI hip score: control (A/B) and case (D/E). C dogs were not included in the sample. Genetic characterization combining a GWAS and a candidate gene strategy using SNPs allowed a case-control population association study. A mathematical model which included 7 SNPs was developed using logistic regression. The model showed a good accuracy (Area under the ROC curve = 0.85) and was validated in an independent population of 114 dogs. This prognostic genetic test represents a useful tool for choosing the most appropriate therapeutic approach once genetic predisposition to hip dysplasia is known. Therefore, it allows a more individualized management of the disease. It is also applicable during genetic selection processes, since breeders can benefit from the information given by this test as soon as a blood sample can be collected, and act accordingly. In the authors’ opinion, a shift towards genomic screening might importantly contribute to reducing canine hip dysplasia in the future. In conclusion, based on genetic and radiographic information from Labrador Retrievers with hip dysplasia, we developed an accurate predictive genetic test for early diagnosis of hip dysplasia in Labrador Retrievers. However, further research is warranted in order to evaluate the validity of this genetic test in other dog breeds. PMID:25874693
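The abstract above reports a logistic regression model over 7 SNPs, evaluated by the area under the ROC curve. The sketch below shows the general shape of such a calculation. The coefficients, intercept and genotypes are invented for illustration, since the abstract gives none of them, and the pairwise AUC computation is a standard textbook method rather than the authors' pipeline.

```python
import math


def predict_risk(genotypes, weights, intercept):
    """Logistic model: genotypes are risk-allele counts (0, 1 or 2) per SNP."""
    z = intercept + sum(w * g for w, g in zip(weights, genotypes))
    return 1.0 / (1.0 + math.exp(-z))


def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control.

    Ties count as half a win; this equals the area under the ROC curve.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical coefficients for 7 SNPs (not taken from the study).
WEIGHTS = [0.8, 0.5, -0.3, 0.6, 0.4, -0.2, 0.7]
INTERCEPT = -2.0

cases = [[2, 1, 0, 2, 1, 0, 2], [1, 2, 0, 1, 2, 1, 1]]
controls = [[0, 0, 2, 0, 1, 2, 0], [1, 0, 1, 0, 0, 1, 0]]

case_scores = [predict_risk(g, WEIGHTS, INTERCEPT) for g in cases]
control_scores = [predict_risk(g, WEIGHTS, INTERCEPT) for g in controls]
toy_auc = auc(case_scores, control_scores)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls, which gives a scale for reading the reported value of 0.85.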

  3. A genetic predictive model for canine hip dysplasia: integration of Genome Wide Association Study (GWAS) and candidate gene approaches.

    Directory of Open Access Journals (Sweden)

    Nerea Bartolomé

    Full Text Available Canine hip dysplasia is one of the most prevalent developmental orthopedic diseases in dogs worldwide. Unfortunately, the success of eradication programs against this disease based on radiographic diagnosis is low. Adding the use of diagnostic genetic tools to the current phenotype-based approach might be beneficial. The aim of this study was to develop a genetic prognostic test for early diagnosis of hip dysplasia in Labrador Retrievers. To develop our DNA test, 775 Labrador Retrievers were recruited. For each dog, a blood sample and a ventrodorsal hip radiograph were taken. Dogs were divided into two groups according to their FCI hip score: control (A/B) and case (D/E). C dogs were not included in the sample. Genetic characterization combining a GWAS and a candidate gene strategy using SNPs allowed a case-control population association study. A mathematical model which included 7 SNPs was developed using logistic regression. The model showed a good accuracy (Area under the ROC curve = 0.85) and was validated in an independent population of 114 dogs. This prognostic genetic test represents a useful tool for choosing the most appropriate therapeutic approach once genetic predisposition to hip dysplasia is known. Therefore, it allows a more individualized management of the disease. It is also applicable during genetic selection processes, since breeders can benefit from the information given by this test as soon as a blood sample can be collected, and act accordingly. In the authors' opinion, a shift towards genomic screening might importantly contribute to reducing canine hip dysplasia in the future. In conclusion, based on genetic and radiographic information from Labrador Retrievers with hip dysplasia, we developed an accurate predictive genetic test for early diagnosis of hip dysplasia in Labrador Retrievers. However, further research is warranted in order to evaluate the validity of this genetic test in other dog breeds.

  4. A genetic predictive model for canine hip dysplasia: integration of Genome Wide Association Study (GWAS) and candidate gene approaches.

    Science.gov (United States)

    Bartolomé, Nerea; Segarra, Sergi; Artieda, Marta; Francino, Olga; Sánchez, Elisenda; Szczypiorska, Magdalena; Casellas, Joaquim; Tejedor, Diego; Cerdeira, Joaquín; Martínez, Antonio; Velasco, Alfonso; Sánchez, Armand

    2015-01-01

    Canine hip dysplasia is one of the most prevalent developmental orthopedic diseases in dogs worldwide. Unfortunately, the success of eradication programs against this disease based on radiographic diagnosis is low. Adding the use of diagnostic genetic tools to the current phenotype-based approach might be beneficial. The aim of this study was to develop a genetic prognostic test for early diagnosis of hip dysplasia in Labrador Retrievers. To develop our DNA test, 775 Labrador Retrievers were recruited. For each dog, a blood sample and a ventrodorsal hip radiograph were taken. Dogs were divided into two groups according to their FCI hip score: control (A/B) and case (D/E). C dogs were not included in the sample. Genetic characterization combining a GWAS and a candidate gene strategy using SNPs allowed a case-control population association study. A mathematical model which included 7 SNPs was developed using logistic regression. The model showed a good accuracy (Area under the ROC curve = 0.85) and was validated in an independent population of 114 dogs. This prognostic genetic test represents a useful tool for choosing the most appropriate therapeutic approach once genetic predisposition to hip dysplasia is known. Therefore, it allows a more individualized management of the disease. It is also applicable during genetic selection processes, since breeders can benefit from the information given by this test as soon as a blood sample can be collected, and act accordingly. In the authors' opinion, a shift towards genomic screening might importantly contribute to reducing canine hip dysplasia in the future. In conclusion, based on genetic and radiographic information from Labrador Retrievers with hip dysplasia, we developed an accurate predictive genetic test for early diagnosis of hip dysplasia in Labrador Retrievers. However, further research is warranted in order to evaluate the validity of this genetic test in other dog breeds.
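The abstract above summarizes model accuracy as an area under the ROC curve (AUC) of 0.85. As a minimal sketch of what that number measures, the block below computes AUC as the fraction of (case, control) pairs in which the case receives the higher predicted risk, with ties counted as one half. The function name `roc_auc` and the scores are hypothetical and chosen only for illustration; they are not data from the study.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    A pair is "correct" when the case has the higher score; ties count 0.5.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks from some fitted model (not study data).
cases = [0.9, 0.8, 0.6, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]
auc = roc_auc(cases, controls)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of cases from controls, which is why a value of 0.85 is described as good accuracy.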

  5. Correlation of microsynteny conservation and disease gene distribution in mammalian genomes

    Directory of Open Access Journals (Sweden)

    Li Xiting

    2009-11-01

    Full Text Available Abstract Background With the completion of the whole genome sequence for many organisms, investigations into genomic structure have revealed that gene distribution is variable, and that genes with similar function or expression are located within clusters. This clustering suggests that there are evolutionary constraints that determine genome architecture. However, as most of the evidence for constraints on genome evolution comes from studies on yeast, it is unclear how much of this prior work can be extrapolated to mammalian genomes. Therefore, in this work we wished to examine the constraints on regions of the mammalian genome containing conserved gene clusters. Results We first identified regions of the mouse genome with microsynteny conservation by comparing gene arrangement in the mouse genome to the human, rat, and dog genomes. We then asked if any particular gene types were found preferentially in conserved regions. We found a significant correlation between conserved microsynteny and the density of mouse orthologs of human disease genes, suggesting that disease genes are clustered in genomic regions of increased microsynteny conservation. Conclusion The correlation between microsynteny conservation and disease gene locations indicates that regions of the mouse genome with microsynteny conservation may contain undiscovered human disease genes. This study not only demonstrates that gene function constrains mammalian genome organization, but also identifies regions of the mouse genome that can be experimentally examined to produce mouse models of human disease.

  6. Genome-Wide Detection and Analysis of Multifunctional Genes

    Science.gov (United States)

    Pritykin, Yuri; Ghersi, Dario; Singh, Mona

    2015-01-01

    Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655

  7. Genome-wide copy number variation study associates metabotropic glutamate receptor gene networks with attention deficit hyperactivity disorder.

    NARCIS (Netherlands)

    Elia, J.; Glessner, J.T.; Wang, K.; Takahashi, N.; Shtir, C.J.; Hadley, D.; Sleiman, P.M.; Zhang, H.; Kim, C.E.; Robison, R.; Lyon, G.J.; Flory, J.H.; Bradfield, J.P.; Imielinski, M.; Hou, C.; Frackelton, E.C.; Chiavacci, R.M.; Sakurai, T.; Rabin, C.; Middleton, F.A.; Thomas, K.A.; Garris, M.; Mentch, F.; Freitag, C.M.; Steinhausen, H.C.; Todorov, A.A.; Reif, A.; Rothenberger, A.; Franke, B.; Mick, E.O.; Roeyers, H.; Buitelaar, J.K.; Lesch, K.P.; Banaschewski, T.; Ebstein, R.P.; Mulas, F.; Oades, R.D.; Sergeant, J.A.; Sonuga-Barke, E.J.S.; Renner, T.J.; Romanos, M.; Romanos, J.; Warnke, A.; Walitza, S.; Meyer, J.; Palmason, H.; Seitz, C.; Loo, S.K.; Smalley, S.L.; Biederman, J.; Kent, L.; Asherson, P.; Anney, R.J.; Gaynor, J.W.; Shaw, P.; Devoto, M.; White, P.S.; Grant, S.F.; Buxbaum, J.D.; Rapoport, J.L.; Williams, N.M.; Nelson, S.F.; Faraone, S.V.; Hakonarson, H.

    2011-01-01

    Attention deficit hyperactivity disorder (ADHD) is a common, heritable neuropsychiatric disorder of unknown etiology. We performed a whole-genome copy number variation (CNV) study on 1,013 cases with ADHD and 4,105 healthy children of European ancestry using 550,000 SNPs. We evaluated statistically

  8. Tandemly Arrayed Genes in Vertebrate Genomes

    Directory of Open Access Journals (Sweden)

    Deng Pan

    2008-01-01

    Full Text Available Tandemly arrayed genes (TAGs) are duplicated genes that are linked as neighbors on a chromosome, many of which have important physiological and biochemical functions. Here we performed a survey of these genes in 11 available vertebrate genomes. TAGs account for an average of about 14% of all genes in these vertebrate genomes, and about 25% of all duplications. The majority of TAGs (72–94%) have parallel transcription orientation (i.e., they are encoded on the same strand) in contrast to the genome, which has about 50% of its genes in parallel transcription orientation. The majority of tandem arrays have only two members. In all species, the proportion of genes that belong to TAGs tends to be higher in large gene families than in small ones; together with our recent finding that tandem duplication played a more important role than retroposition in large families, this fact suggests that among all types of duplication mechanisms, tandem duplication is the predominant mechanism of duplication, especially in large families. Finally, several species have a higher proportion of large tandem arrays that are species-specific than random expectation.
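The definition of tandemly arrayed genes given above (duplicates that sit next to each other on a chromosome) can be sketched in a few lines. The block assumes that gene family membership is already known and that "tandem" means directly adjacent with no intervening genes; the study itself may use a different criterion. The gene names, families and strands are hypothetical, and orientation is measured here per neighbouring pair, which may differ from how the authors counted it.

```python
from itertools import groupby


def tandem_arrays(genes):
    """Find runs of two or more adjacent genes from the same family.

    genes: list of (name, family, strand) tuples in chromosomal order.
    Genes with family None are treated as unassigned and never grouped.
    """
    arrays = []
    for family, run in groupby(genes, key=lambda g: g[1]):
        run = list(run)
        if family is not None and len(run) >= 2:
            arrays.append(run)
    return arrays


def parallel_fraction(arrays):
    """Fraction of neighbouring gene pairs within arrays on the same strand."""
    pairs = [(a, b) for arr in arrays for a, b in zip(arr, arr[1:])]
    if not pairs:
        return 0.0
    same = sum(1 for a, b in pairs if a[2] == b[2])
    return same / len(pairs)


# Hypothetical chromosome: (gene name, family, strand)
chrom = [
    ("g1", "famA", "+"), ("g2", "famA", "+"),
    ("g3", "famB", "-"),
    ("g4", "famC", "+"), ("g5", "famC", "-"), ("g6", "famC", "-"),
    ("g7", "famA", "+"),
]
arrays = tandem_arrays(chrom)
print([[g[0] for g in arr] for arr in arrays])
print(parallel_fraction(arrays))
```

In this toy input, g7 belongs to famA but is not adjacent to g1 and g2, so it is a dispersed duplicate rather than a member of a tandem array.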

  9. Impact of gene dosage on gene expression, biological processes and survival in cervical cancer: a genome-wide follow-up study.

    Science.gov (United States)

    Medina-Martinez, Ingrid; Barrón, Valeria; Roman-Bassaure, Edgar; Juárez-Torres, Eligia; Guardado-Estrada, Mariano; Espinosa, Ana María; Bermudez, Miriam; Fernández, Fernando; Venegas-Vega, Carlos; Orozco, Lorena; Zenteno, Edgar; Kofman, Susana; Berumen, Jaime

    2014-01-01

    We investigated the role of tumor copy number (CN)-altered genome (CN-AG) in the carcinogenesis of cervical cancer (CC), especially its effect on gene expression, biological processes, and patient survival. Fifty-nine human papillomavirus 16 (HPV16)-positive CCs were investigated with microarrays-31 for mapping CN-AG and 55 for global gene expression, with 27 CCs in common. Five-year survival was investigated in 55 patients. Deletions and amplifications >2.5 Mb were defined as CN alterations. The %CN-AG varied from 0 to 32.2% (mean = 8.1±8.9). Tumors were classified as low (mean = 0.5±0.6, n = 11), medium (mean = 5.4±2.4, n = 10), or high (mean = 19.2±6.6, n = 10) CN. The highest %CN-AG was found in 3q, which contributed an average of 55% of all CN alterations. Genome-wide, only 5.3% of CN-altered genes were deregulated directly by gene dosage. In contrast, the rate in fully duplicated 3q was twice as high. Amplification of 3q explained 23.2% of deregulated genes in whole tumors (r2 = 0.232, p = 0.006; analysis of variance), including genes located in 3q and other chromosomes. A total of 862 genes were deregulated exclusively in high-CN tumors, but only 22.9% were CN altered. This suggests that the remaining genes are not deregulated directly by gene dosage, but by mechanisms induced in trans by CN-altered genes. Anaphase-promoting complex/cyclosome (APC/C)-dependent proteasome proteolysis, glycolysis, and apoptosis were upregulated, whereas cell adhesion and angiogenesis were downregulated exclusively in high-CN tumors. The high %CN-AG and upregulated gene expression profile of APC/C-dependent proteasome proteolysis were associated with poor patient survival (p0.38, p<0.01, Spearman test). Therefore, inhibition of APC/C-dependent proteasome proteolysis and glycolysis could be useful for CC treatment. However, whether they are indispensable for tumor growth remains to be demonstrated.

  10. Impact of gene dosage on gene expression, biological processes and survival in cervical cancer: a genome-wide follow-up study.

    Directory of Open Access Journals (Sweden)

    Ingrid Medina-Martinez

    Full Text Available We investigated the role of tumor copy number (CN)-altered genome (CN-AG) in the carcinogenesis of cervical cancer (CC), especially its effect on gene expression, biological processes, and patient survival. Fifty-nine human papillomavirus 16 (HPV16)-positive CCs were investigated with microarrays-31 for mapping CN-AG and 55 for global gene expression, with 27 CCs in common. Five-year survival was investigated in 55 patients. Deletions and amplifications >2.5 Mb were defined as CN alterations. The %CN-AG varied from 0 to 32.2% (mean = 8.1±8.9). Tumors were classified as low (mean = 0.5±0.6, n = 11), medium (mean = 5.4±2.4, n = 10), or high (mean = 19.2±6.6, n = 10) CN. The highest %CN-AG was found in 3q, which contributed an average of 55% of all CN alterations. Genome-wide, only 5.3% of CN-altered genes were deregulated directly by gene dosage. In contrast, the rate in fully duplicated 3q was twice as high. Amplification of 3q explained 23.2% of deregulated genes in whole tumors (r2 = 0.232, p = 0.006; analysis of variance), including genes located in 3q and other chromosomes. A total of 862 genes were deregulated exclusively in high-CN tumors, but only 22.9% were CN altered. This suggests that the remaining genes are not deregulated directly by gene dosage, but by mechanisms induced in trans by CN-altered genes. Anaphase-promoting complex/cyclosome (APC/C)-dependent proteasome proteolysis, glycolysis, and apoptosis were upregulated, whereas cell adhesion and angiogenesis were downregulated exclusively in high-CN tumors. The high %CN-AG and upregulated gene expression profile of APC/C-dependent proteasome proteolysis were associated with poor patient survival (p0.38, p<0.01, Spearman test). Therefore, inhibition of APC/C-dependent proteasome proteolysis and glycolysis could be useful for CC treatment. However, whether they are indispensable for tumor growth remains to be demonstrated.

  11. Pinpointing disease genes through phenomic and genomic data fusion.

    Science.gov (United States)

    Jiang, Rui; Wu, Mengmeng; Li, Lianshuo

    2015-01-01

    Pinpointing genes involved in inherited human diseases remains a great challenge in the post-genomics era. Although approaches have been proposed either based on the guilt-by-association principle or making use of disease phenotype similarities, the low coverage of both diseases and genes in existing methods has been preventing the scan of causative genes for a significant proportion of diseases at the whole-genome level. To overcome this limitation, we proposed a rigorous statistical method called pgFusion to prioritize candidate genes by integrating one type of disease phenotype similarity derived from the Unified Medical Language System (UMLS) and seven types of gene functional similarities calculated from gene expression, gene ontology, pathway membership, protein sequence, protein domain, protein-protein interaction and regulation pattern, respectively. Our method covered a total of 7,719 diseases and 20,327 genes, achieving the highest coverage thus far for both diseases and genes. We performed leave-one-out cross-validation experiments to demonstrate the superior performance of our method and applied it to a real exome sequencing dataset of epileptic encephalopathies, showing the capability of this approach in finding causative genes for complex diseases. We further provided the standalone software and online services of pgFusion at http://bioinfo.au.tsinghua.edu.cn/jianglab/pgfusion. pgFusion not only provided an effective way for prioritizing candidate genes, but also demonstrated feasible solutions to two fundamental questions in the analysis of big genomic data: the comparability of heterogeneous data and the integration of multiple types of data. Applications of this method in exome or whole genome sequencing studies would accelerate the finding of causative genes for human diseases. Other research fields in genomics could also benefit from the incorporation of our data fusion methodology.

  12. Comparative genomics of the bacterial genus Listeria: Genome evolution is characterized by limited gene acquisition and limited gene loss.

    Science.gov (United States)

    den Bakker, Henk C; Cummings, Craig A; Ferreira, Vania; Vatta, Paolo; Orsi, Renato H; Degoricija, Lovorka; Barker, Melissa; Petrauskene, Olga; Furtado, Manohar R; Wiedmann, Martin

    2010-12-02

    The bacterial genus Listeria contains pathogenic and non-pathogenic species, including the pathogens L. monocytogenes and L. ivanovii, both of which carry homologous virulence gene clusters such as the prfA cluster and clusters of internalin genes. Initial evidence for multiple deletions of the prfA cluster during the evolution of Listeria indicates that this genus provides an interesting model for studying the evolution of virulence and also presents practical challenges with regard to definition of pathogenic strains. To better understand genome evolution and evolution of virulence characteristics in Listeria, we used a next generation sequencing approach to generate draft genomes for seven strains representing Listeria species or clades for which genome sequences were not available. Comparative analyses of these draft genomes and six publicly available genomes, which together represent the main Listeria species, showed evidence for (i) a pangenome with 2,032 core and 2,918 accessory genes identified to date, (ii) a critical role of gene loss events in transition of Listeria species from facultative pathogen to saprotroph, even though a consistent pattern of gene loss seemed to be absent, and a number of isolates representing non-pathogenic species still carried some virulence associated genes, and (iii) divergence of modern pathogenic and non-pathogenic Listeria species and strains, most likely circa 47 million years ago, from a pathogenic common ancestor that contained key virulence genes. Genome evolution in Listeria involved limited gene loss and acquisition as supported by (i) a relatively high coverage of the predicted pan-genome by the observed pan-genome, (ii) conserved genome size (between 2.8 and 3.2 Mb), and (iii) a highly syntenic genome. Limited gene loss in Listeria did include loss of virulence associated genes, likely associated with multiple transitions to a saprotrophic lifestyle. The genus Listeria thus provides an example of a group of

  13. Comparative genomics of the bacterial genus Listeria: Genome evolution is characterized by limited gene acquisition and limited gene loss

    Directory of Open Access Journals (Sweden)

    Barker Melissa

    2010-12-01

    Full Text Available Abstract Background The bacterial genus Listeria contains pathogenic and non-pathogenic species, including the pathogens L. monocytogenes and L. ivanovii, both of which carry homologous virulence gene clusters such as the prfA cluster and clusters of internalin genes. Initial evidence for multiple deletions of the prfA cluster during the evolution of Listeria indicates that this genus provides an interesting model for studying the evolution of virulence and also presents practical challenges with regard to definition of pathogenic strains. Results To better understand genome evolution and evolution of virulence characteristics in Listeria, we used a next generation sequencing approach to generate draft genomes for seven strains representing Listeria species or clades for which genome sequences were not available. Comparative analyses of these draft genomes and six publicly available genomes, which together represent the main Listeria species, showed evidence for (i) a pangenome with 2,032 core and 2,918 accessory genes identified to date, (ii) a critical role of gene loss events in transition of Listeria species from facultative pathogen to saprotroph, even though a consistent pattern of gene loss seemed to be absent, and a number of isolates representing non-pathogenic species still carried some virulence associated genes, and (iii) divergence of modern pathogenic and non-pathogenic Listeria species and strains, most likely circa 47 million years ago, from a pathogenic common ancestor that contained key virulence genes. Conclusions Genome evolution in Listeria involved limited gene loss and acquisition as supported by (i) a relatively high coverage of the predicted pan-genome by the observed pan-genome, (ii) conserved genome size (between 2.8 and 3.2 Mb), and (iii) a highly syntenic genome. Limited gene loss in Listeria did include loss of virulence associated genes, likely associated with multiple transitions to a saprotrophic lifestyle. The genus

  14. Genomics of the human carnitine acyltransferase genes

    NARCIS (Netherlands)

    van der Leij, FR; Huijkman, NCA; Boomsma, C; Kuipers, JRG; Bartelds, B

    2000-01-01

    Five genes in the human genome are known to encode different active forms of related carnitine acyltransferases: CPT1A for liver-type carnitine palmitoyltransferase I, CPT1B for muscle-type carnitine palmitoyltransferase I, CPT2 for carnitine palmitoyltransferase II, CROT for carnitine octanoyltrans

  15. Genetical Genomics for Evolutionary Studies

    NARCIS (Netherlands)

    Prins, J.C.P.; Smant, G.; Jansen, R.C.

    2012-01-01

    Genetical genomics combines acquired high-throughput genomic data with genetic analysis. In this chapter, we discuss the application of genetical genomics for evolutionary studies, where new high-throughput molecular technologies are combined with mapping quantitative trait loci (QTL) on the genome

  16. Multiple Genes Related to Muscle Identified through a Joint Analysis of a Two-stage Genome-wide Association Study for Racing Performance of 1,156 Thoroughbreds.

    Science.gov (United States)

    Shin, Dong-Hyun; Lee, Jin Woo; Park, Jong-Eun; Choi, Ik-Young; Oh, Hee-Seok; Kim, Hyeon Jeong; Kim, Heebal

    2015-06-01

    Thoroughbred, a relatively recent horse breed, is best known for its use in horse racing. Although myostatin (MSTN) variants have been reported to be highly associated with horse racing performance, the trait is more likely to be polygenic in nature. The purpose of this study was to identify genetic variants strongly associated with racing performance by using estimated breeding value (EBV) for race time as a phenotype. We conducted a two-stage genome-wide association study to search for genetic variants associated with the EBV. In the first stage of genome-wide association study, a relatively large number of markers (~54,000 single-nucleotide polymorphisms, SNPs) were evaluated in a small number of samples (240 horses). In the second stage, a relatively small number of markers identified to have large effects (170 SNPs) were evaluated in a much larger number of samples (1,156 horses). We also validated the SNPs related to MSTN known to have large effects on racing performance and found significant associations in the stage two analysis, but not in stage one. We identified 28 significant SNPs related to 17 genes. Among these, six genes have a function related to myogenesis and five genes are involved in muscle maintenance. To our knowledge, these genes are newly reported for the genetic association with racing performance of Thoroughbreds. It complements a recent horse genome-wide association studies of racing performance that identified other SNPs and genes as the most significant variants. These results will help to expand our knowledge of the polygenic nature of racing performance in Thoroughbreds.

  17. Multiple Genes Related to Muscle Identified through a Joint Analysis of a Two-stage Genome-wide Association Study for Racing Performance of 1,156 Thoroughbreds

    Directory of Open Access Journals (Sweden)

    Dong-Hyun Shin

    2015-06-01

    Full Text Available Thoroughbred, a relatively recent horse breed, is best known for its use in horse racing. Although myostatin (MSTN) variants have been reported to be highly associated with horse racing performance, the trait is more likely to be polygenic in nature. The purpose of this study was to identify genetic variants strongly associated with racing performance by using estimated breeding value (EBV) for race time as a phenotype. We conducted a two-stage genome-wide association study to search for genetic variants associated with the EBV. In the first stage of genome-wide association study, a relatively large number of markers (~54,000 single-nucleotide polymorphisms, SNPs) were evaluated in a small number of samples (240 horses). In the second stage, a relatively small number of markers identified to have large effects (170 SNPs) were evaluated in a much larger number of samples (1,156 horses). We also validated the SNPs related to MSTN known to have large effects on racing performance and found significant associations in the stage two analysis, but not in stage one. We identified 28 significant SNPs related to 17 genes. Among these, six genes have a function related to myogenesis and five genes are involved in muscle maintenance. To our knowledge, these genes are newly reported for the genetic association with racing performance of Thoroughbreds. It complements a recent horse genome-wide association studies of racing performance that identified other SNPs and genes as the most significant variants. These results will help to expand our knowledge of the polygenic nature of racing performance in Thoroughbreds.

  18. From trees to the forest: genes to genomics.

    Science.gov (United States)

    Mullighan, Charles; Petersdorf, Effie; Davies, Stella M; DiPersio, John

    2011-01-01

    Crick, Watson, and colleagues revealed the genetic code in 1953, and since that time, remarkable progress has been made in understanding what makes each of us who we are. Identification of single genes important in disease, and the development of a mechanistic understanding of genetic elements that regulate gene function, have cast light on the pathophysiology of many heritable and acquired disorders. In 1990, the human genome project commenced, with the goal of sequencing the entire human genome, and a "first draft" was published with astonishing speed in 2001. The first draft, although an extraordinary achievement, reported essentially an imaginary haploid mix of alleles rather than a true diploid genome. In the years since 2001, technology has further improved, and efforts have been focused on filling in the gaps in the initial genome and starting the huge task of looking at normal variation in the human genome. This work is the beginning of understanding human genetics in the context of the structure of the genome as a complete entity, and as more than simply the sum of a series of genes. We present 3 studies in this review that apply genomic approaches to leukemia and to transplantation to improve and extend therapies.

  19. GEGEINTOOL: A Computer-Based Tool for Automated Analysis of Gene-Gene Interactions in Large Epidemiological Studies in Cardiovascular Genomics

    Directory of Open Access Journals (Sweden)

    Oscar Coltell

    2013-06-01

    Full Text Available Current methods of data analysis of gene-gene interactions in complex diseases, after taking into account environmental factors using traditional approaches, are inefficient. High-throughput methods of analysis in large scale studies including thousands of subjects and hundreds of SNPs should be implemented. We developed an integrative computer tool, GEGEINTOOL (GEne-GEne INTeraction tOOL), for large-scale analysis of gene-gene interactions, in human studies of complex diseases including a large number of subjects, SNPs, as well as environmental factors. That resource uses standard statistical packages (SPSS, etc.) to build and fit the gene-gene interaction models by means of syntax scripts in predicting one or more continuous or dichotomic phenotypes. Codominant, dominant and recessive genetic interaction models including control for covariates are automatically created for each SNP in order to test the best model. From the standard outputs, GEGEINTOOL extracts a selected set of parameters (regression coefficients, p-values, adjusted means, etc.), and groups them in a single MS Excel Spreadsheet. The tool allows editing the set of filter parameters, filtering the selected results depending on p-values, as well as plotting the selected gene-gene interactions to check consistency. In conclusion, GEGEINTOOL is a useful and friendly tool for exploring and identifying gene-gene interactions in complex diseases.

  20. Genome-wide association study identifies Loci and candidate genes for body composition and meat quality traits in Beijing-You chickens.

    Directory of Open Access Journals (Sweden)

    Ranran Liu

    Full Text Available Body composition and meat quality traits are important economic traits of chickens. The development of high-throughput genotyping platforms and relevant statistical methods has enabled genome-wide association studies in chickens. In order to identify molecular markers and candidate genes associated with body composition and meat quality traits, genome-wide association studies were conducted using the Illumina 60 K SNP Beadchip to genotype 724 Beijing-You chickens. For each bird, a total of 16 traits were measured, including carcass weight (CW), eviscerated weight (EW), dressing percentage, breast muscle weight (BrW) and percentage (BrP), thigh muscle weight and percentage, abdominal fat weight and percentage, dry matter and intramuscular fat contents of breast and thigh muscle, ultimate pH, and shear force of the pectoralis major muscle at 100 d of age. The SNPs that were significantly associated with the phenotypic traits were identified using both simple (GLM) and compressed mixed linear (MLM) models. For nine of ten body composition traits studied, SNPs showing genome wide significance (P<2.59E-6) have been identified. A consistent region on chicken (Gallus gallus) chromosome 4 (GGA4), including seven significant SNPs and four candidate genes (LCORL, LAP3, LDB2, TAPT1), was found to be associated with CW and EW. Another 0.65 Mb region on GGA3 for BrW and BrP was identified. After measuring the mRNA content in breast muscle for five genes located in this region, the changes in GJA1 expression were found to be consistent with that of breast muscle weight across development. It is highly possible that GJA1 is a functional gene for breast muscle development in chickens. For meat quality traits, several SNPs reaching suggestive association were identified and possible candidate genes with their functions were discussed.

  1. Genomic organization and sequence analysis of the vomeronasal receptor V2R genes in mouse genome

    Institute of Scientific and Technical Information of China (English)

    YANG Hui; Zhang YaPing

    2007-01-01

    Two multigene superfamilies, named V1R and V2R, encoding seven-transmembrane-domain G-protein coupled receptors (GPCRs) have been identified as pheromone receptors in mammals. Three V2R gene families have been described in mouse and rat. Here we screened the updated mouse genome sequence database and retrieved 63 putative functional V2R genes, including three newly identified genes that form an additional family. We describe the genomic organization of these genes and characterize the conservation of mouse V2R protein sequences. This genomic and sequence information provides evidence for inferring the functional domains of V2Rs and should aid future functional studies.

  2. Family-based Genome-wide Association Study of Frontal Theta Oscillations Identifies Potassium Channel Gene KCNJ6

    OpenAIRE

    Kang, Sun J.; Rangaswamy, Madhavi; Manz, Niklas; Wang, Jen-Chyong; Wetherill, Leah; Hinrichs, Tony; Almasy, Laura; Brooks, Andy; Chorlian, David B.; Dick, Danielle; Hesselbrock, Victor; Kramer, John; Kuperman, Sam; Nurnberger, John; Rice, John

    2012-01-01

    Event-related oscillations (EROs) represent highly heritable neuroelectric correlates of cognitive processes that manifest deficits in alcoholics and in offspring at high risk to develop alcoholism. Theta ERO to targets in the visual oddball task has been shown to be an endophenotype for alcoholism. A family-based genome-wide association study was performed for the frontal theta ERO phenotype using 634583 autosomal single nucleotide polymorphisms (SNPs) genotyped in 1560 family members from 1...

  3. Genome engineering using a synthetic gene circuit in Bacillus subtilis.

    Science.gov (United States)

    Jeong, Da-Eun; Park, Seung-Hwan; Pan, Jae-Gu; Kim, Eui-Joong; Choi, Soo-Keun

    2015-03-31

    Genome engineering without leaving foreign DNA behind requires an efficient counter-selectable marker system. Here, we developed a genome engineering method in Bacillus subtilis using a synthetic gene circuit as a counter-selectable marker system. The system contained two repressible promoters (B. subtilis xylA (Pxyl) and spac (Pspac)) and two repressor genes (lacI and xylR). Pxyl-lacI was integrated into the B. subtilis genome with a target gene containing a desired mutation. The xylR gene and the Pspac-driven chloramphenicol resistance gene (cat) were located on a helper plasmid. In the presence of xylose, inactivation of XylR by xylose induced LacI expression; LacI then repressed the Pspac promoter and the cells became chloramphenicol sensitive. Thus, to survive in the presence of chloramphenicol, the cell must delete Pxyl-lacI by recombination between the wild-type and mutated target genes. The recombination leads to mutation of the target gene. The remaining helper plasmid was easily removed in the absence of chloramphenicol. In this study, we showed base insertion, deletion and point mutation of the B. subtilis genome without leaving any foreign DNA behind. Additionally, we successfully deleted a 2-kb gene (amyE) and a 38-kb operon (ppsABCDE). This method will be useful to construct designer Bacillus strains for various industrial applications.
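
    The counter-selection logic described in this abstract can be read as a small Boolean model. The following is only an illustrative sketch in Python, not the authors' code: the `Cell` class and the function names are hypothetical, and the model assumes that XylR is present only when the helper plasmid is carried and that expression is strictly on or off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical genotype record used only for this sketch."""
    has_pxyl_laci: bool       # Pxyl-lacI cassette still integrated in the genome
    has_helper_plasmid: bool  # helper plasmid carrying xylR and Pspac-cat


def laci_expressed(cell: Cell, xylose: bool) -> bool:
    # LacI needs the Pxyl-lacI cassette; XylR blocks Pxyl unless xylose is present.
    if not cell.has_pxyl_laci:
        return False
    xylr_active = cell.has_helper_plasmid and not xylose
    return not xylr_active


def cat_expressed(cell: Cell, xylose: bool) -> bool:
    # cat sits behind Pspac on the helper plasmid; LacI represses Pspac.
    return cell.has_helper_plasmid and not laci_expressed(cell, xylose)


def survives(cell: Cell, xylose: bool, chloramphenicol: bool) -> bool:
    # Without the antibiotic every cell survives; with it, cat must be expressed.
    return (not chloramphenicol) or cat_expressed(cell, xylose)


unresolved = Cell(has_pxyl_laci=True, has_helper_plasmid=True)
recombinant = Cell(has_pxyl_laci=False, has_helper_plasmid=True)

for name, cell in [("unresolved", unresolved), ("recombinant", recombinant)]:
    print(name, survives(cell, xylose=True, chloramphenicol=True))
```

    Under these assumptions, only cells that have lost Pxyl-lacI survive on xylose plus chloramphenicol, which is the selection step the abstract describes.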

  4. Expression of a transferred nuclear gene in a mitochondrial genome

    Directory of Open Access Journals (Sweden)

    Yichun Qiu

    2014-08-01

    Full Text Available Transfer of mitochondrial genes to the nucleus, and subsequent gain of regulatory elements for expression, is an ongoing evolutionary process in plants. Many examples have been characterized, which in some cases have revealed sources of mitochondrial targeting sequences and cis-regulatory elements. In contrast, there have been no reports of a nuclear gene that has undergone intracellular transfer to the mitochondrial genome and become expressed. Here we show that the orf164 gene in the mitochondrial genome of several Brassicaceae species, including Arabidopsis, is derived from the nuclear ARF17 gene that codes for an auxin responsive protein and is present across flowering plants. Orf164 corresponds to a portion of ARF17, and the nucleotide and amino acid sequences are 79% and 81% identical, respectively. Orf164 is transcribed in several organ types of Arabidopsis thaliana, as detected by RT-PCR. In addition, orf164 is transcribed in five other Brassicaceae within the tribes Camelineae, Erysimeae and Cardamineae, but the gene is not present in Brassica or Raphanus. This study shows that nuclear genes can be transferred to the mitochondrial genome and become expressed, providing a new perspective on the movement of genes between the genomes of subcellular compartments.

  5. An efficient viral vector for functional genomic studies of Prunus fruit trees and its induced resistance to Plum pox virus via silencing of a host factor gene.

    Science.gov (United States)

    Cui, Hongguang; Wang, Aiming

    2017-03-01

    RNA silencing is a powerful technology for molecular characterization of gene functions in plants. A commonly used approach to the induction of RNA silencing is through genetic transformation. A potent alternative is to use a modified viral vector for virus-induced gene silencing (VIGS) to degrade RNA molecules sharing similar nucleotide sequence. Unfortunately, genomic studies in many allogamous woody perennials such as peach are severely hindered because they have a long juvenile period and are recalcitrant to genetic transformation. Here, we report the development of a viral vector derived from Prunus necrotic ringspot virus (PNRSV), a widespread fruit tree virus that is endemic in all Prunus fruit production countries and regions in the world. We show that the modified PNRSV vector, harbouring the sense-orientated target gene sequence of 100-200 bp in length in genomic RNA3, could efficiently trigger the silencing of a transgene or an endogenous gene in the model plant Nicotiana benthamiana. We further demonstrate that the PNRSV-based vector could be manipulated to silence endogenous genes in peach such as eukaryotic translation initiation factor 4E isoform (eIF(iso)4E), a host factor of many potyviruses including Plum pox virus (PPV). Moreover, the eIF(iso)4E-knocked down peach plants were resistant to PPV. This work opens a potential avenue for the control of virus diseases in perennial trees via viral vector-mediated silencing of host factors, and the PNRSV vector may serve as a powerful molecular tool for functional genomic studies of Prunus fruit trees.

  6. An integration of genome-wide association study and gene expression profiling to prioritize the discovery of novel susceptibility Loci for osteoporosis-related traits.

    Science.gov (United States)

    Hsu, Yi-Hsiang; Zillikens, M Carola; Wilson, Scott G; Farber, Charles R; Demissie, Serkalem; Soranzo, Nicole; Bianchi, Estelle N; Grundberg, Elin; Liang, Liming; Richards, J Brent; Estrada, Karol; Zhou, Yanhua; van Nas, Atila; Moffatt, Miriam F; Zhai, Guangju; Hofman, Albert; van Meurs, Joyce B; Pols, Huibert A P; Price, Roger I; Nilsson, Olle; Pastinen, Tomi; Cupples, L Adrienne; Lusis, Aldons J; Schadt, Eric E; Ferrari, Serge; Uitterlinden, André G; Rivadeneira, Fernando; Spector, Timothy D; Karasik, David; Kiel, Douglas P

    2010-06-10

    Osteoporosis is a complex disorder and commonly leads to fractures in elderly persons. Genome-wide association studies (GWAS) have become an unbiased approach to identify variations in the genome that potentially affect health. However, the genetic variants identified so far only explain a small proportion of the heritability for complex traits. Due to the modest genetic effect size and inadequate power, true association signals may not be revealed based on a stringent genome-wide significance threshold. Here, we take advantage of SNP and transcript arrays and integrate GWAS and expression signature profiling relevant to the skeletal system in cellular and animal models to prioritize the discovery of novel candidate genes for osteoporosis-related traits, including bone mineral density (BMD) at the lumbar spine (LS) and femoral neck (FN), as well as geometric indices of the hip (femoral neck-shaft angle, NSA; femoral neck length, NL; and narrow-neck width, NW). A two-stage meta-analysis of GWAS from 7,633 Caucasian women and 3,657 men, revealed three novel loci associated with osteoporosis-related traits, including chromosome 1p13.2 (RAP1A, p = 3.6x10(-8)), 2q11.2 (TBC1D8), and 18q11.2 (OSBPL1A), and confirmed a previously reported region near TNFRSF11B/OPG gene. We also prioritized 16 suggestive genome-wide significant candidate genes based on their potential involvement in skeletal metabolism. Among them, 3 candidate genes were associated with BMD in women. Notably, 2 out of these 3 genes (GPR177, p = 2.6x10(-13); SOX6, p = 6.4x10(-10)) associated with BMD in women have been successfully replicated in a large-scale meta-analysis of BMD, but none of the non-prioritized candidates (associated with BMD) did. Our results support the concept of our prioritization strategy. 
    In the absence of direct biological support for identified genes, we highlighted the efficiency of subsequent functional characterization using publicly available expression profiling relevant to the...

  7. An integration of genome-wide association study and gene expression profiling to prioritize the discovery of novel susceptibility Loci for osteoporosis-related traits.

    Directory of Open Access Journals (Sweden)

    Yi-Hsiang Hsu

    2010-06-01

    Full Text Available Osteoporosis is a complex disorder and commonly leads to fractures in elderly persons. Genome-wide association studies (GWAS) have become an unbiased approach to identify variations in the genome that potentially affect health. However, the genetic variants identified so far only explain a small proportion of the heritability for complex traits. Due to the modest genetic effect size and inadequate power, true association signals may not be revealed based on a stringent genome-wide significance threshold. Here, we take advantage of SNP and transcript arrays and integrate GWAS and expression signature profiling relevant to the skeletal system in cellular and animal models to prioritize the discovery of novel candidate genes for osteoporosis-related traits, including bone mineral density (BMD) at the lumbar spine (LS) and femoral neck (FN), as well as geometric indices of the hip (femoral neck-shaft angle, NSA; femoral neck length, NL; and narrow-neck width, NW). A two-stage meta-analysis of GWAS from 7,633 Caucasian women and 3,657 men revealed three novel loci associated with osteoporosis-related traits, including chromosome 1p13.2 (RAP1A, p = 3.6x10(-8)), 2q11.2 (TBC1D8), and 18q11.2 (OSBPL1A), and confirmed a previously reported region near TNFRSF11B/OPG gene. We also prioritized 16 suggestive genome-wide significant candidate genes based on their potential involvement in skeletal metabolism. Among them, 3 candidate genes were associated with BMD in women. Notably, 2 out of these 3 genes (GPR177, p = 2.6x10(-13); SOX6, p = 6.4x10(-10)) associated with BMD in women have been successfully replicated in a large-scale meta-analysis of BMD, but none of the non-prioritized candidates (associated with BMD) did. Our results support the concept of our prioritization strategy.
    In the absence of direct biological support for identified genes, we highlighted the efficiency of subsequent functional characterization using publicly available expression profiling relevant...

  8. Evolutionary genomics of LysM genes in land plants

    Directory of Open Access Journals (Sweden)

    Stacey Gary

    2009-08-01

    Full Text Available Background: The ubiquitous LysM motif recognizes peptidoglycan, chitooligosaccharides (chitin) and, presumably, other structurally-related oligosaccharides. LysM-containing proteins were first shown to be involved in bacterial cell wall degradation and, more recently, were implicated in perceiving chitin (one of the established pathogen-associated molecular patterns) and lipo-chitin (nodulation factors) in flowering plants. However, the majority of LysM genes in plants remain functionally uncharacterized and the evolutionary history of complex LysM genes remains elusive. Results: We show that LysM-containing proteins display a wide range of complex domain architectures. However, only a simple core architecture is conserved across kingdoms. Each individual kingdom appears to have evolved a distinct array of domain architectures. We show that early plant lineages acquired four characteristic architectures and progressively lost several primitive architectures. We report plant LysM phylogenies and associated gene, protein and genomic features, and infer the relative timing of duplications of LYK genes. Conclusion: We report a domain architecture catalogue of LysM proteins across all kingdoms. The unique pattern of LysM protein domain architectures indicates the presence of distinctive evolutionary paths in individual kingdoms. We describe a comparative and evolutionary genomics study of LysM genes in the plant kingdom. One of the two groups of tandemly arrayed plant LYK genes likely resulted from an ancient genome duplication followed by local genomic rearrangement, while the origin of the other groups of tandemly arrayed LYK genes remains obscure. Given the fact that no animal LysM motif-containing genes have been functionally characterized, this study provides clues to functional characterization of plant LysM genes and is also informative with regard to evolutionary and functional studies of animal LysM genes.

  9. Evolutionary genomics of LysM genes in land plants.

    Science.gov (United States)

    Zhang, Xue-Cheng; Cannon, Steven B; Stacey, Gary

    2009-08-03

    The ubiquitous LysM motif recognizes peptidoglycan, chitooligosaccharides (chitin) and, presumably, other structurally-related oligosaccharides. LysM-containing proteins were first shown to be involved in bacterial cell wall degradation and, more recently, were implicated in perceiving chitin (one of the established pathogen-associated molecular patterns) and lipo-chitin (nodulation factors) in flowering plants. However, the majority of LysM genes in plants remain functionally uncharacterized and the evolutionary history of complex LysM genes remains elusive. We show that LysM-containing proteins display a wide range of complex domain architectures. However, only a simple core architecture is conserved across kingdoms. Each individual kingdom appears to have evolved a distinct array of domain architectures. We show that early plant lineages acquired four characteristic architectures and progressively lost several primitive architectures. We report plant LysM phylogenies and associated gene, protein and genomic features, and infer the relative timing of duplications of LYK genes. We report a domain architecture catalogue of LysM proteins across all kingdoms. The unique pattern of LysM protein domain architectures indicates the presence of distinctive evolutionary paths in individual kingdoms. We describe a comparative and evolutionary genomics study of LysM genes in plant kingdom. One of the two groups of tandemly arrayed plant LYK genes likely resulted from an ancient genome duplication followed by local genomic rearrangement, while the origin of the other groups of tandemly arrayed LYK genes remains obscure. Given the fact that no animal LysM motif-containing genes have been functionally characterized, this study provides clues to functional characterization of plant LysM genes and is also informative with regard to evolutionary and functional studies of animal LysM genes.

  10. The genomic environment around the Aromatase gene: evolutionary insights

    Directory of Open Access Journals (Sweden)

    Reis-Henriques Maria A

    2005-08-01

    Full Text Available Background: The cytochrome P450 aromatase (CYP19) catalyses the aromatisation of androgens to estrogens, a key mechanism in vertebrate reproductive physiology. A current evolutionary hypothesis suggests that the CYP19 gene arose at the origin of vertebrates, given that it has not been found outside this clade. The human CYP19 gene is located in one of the proposed MHC-paralogon regions (HSA15q). At present it is unclear whether this genomic location is ancestral (which would suggest an invertebrate origin for CYP19) or derived (genomic location with no evolutionary meaning). The distinction between these possibilities should help to clarify the timing of the CYP19 emergence and which taxa should be investigated. Results: Here we determine the "genomic environment" around CYP19 in three vertebrate species: Homo sapiens, Tetraodon nigroviridis and Xenopus tropicalis. Paralogy studies and phylogenetic analysis of six gene families suggest that the CYP19 gene region was structured through "en bloc" genomic duplication (as part of the MHC-paralogon formation). Four gene families have specifically duplicated in the vertebrate lineage. Moreover, the mapping location of the different paralogues is consistent with a model of "en bloc" duplication. Furthermore, we also determine that this region has retained the same gene content since the divergence of Actinopterygii and Tetrapods. A single inversion in gene order has taken place, probably in the mammalian lineage. Finally, we describe the first invertebrate CYP19 sequence, from Branchiostoma floridae. Conclusion: Contrary to previous suggestions, our data indicate an invertebrate origin for the aromatase gene, given the striking conservation pattern in both gene order and gene content, and the presence of aromatase in amphioxus. We propose that CYP19 duplicated in the vertebrate lineage to yield four paralogues, followed by the subsequent loss of all but one gene in vertebrate evolution. Finally, we...

  11. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    Energy Technology Data Exchange (ETDEWEB)

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric; Abernathy, Jason; Waldbieser, Geoff; Lindquist, Erika; Richardson, Paul; Lucas, Susan; Wang, Mei; Li, Ping; Thimmapuram, Jyothi; Liu, Lei; Vullaganti, Deepika; Kucuktas, Huseyin; Murdock, Christopher; Small, Brian C; Wilson, Melanie; Liu, Hong; Jiang, Yanliang; Lee, Yoona; Chen, Fei; Lu, Jianguo; Wang, Wenqi; Xu, Peng; Somridhivej, Benjaporn; Baoprasertkul, Puttharat; Quilang, Jonas; Sha, Zhenxia; Bao, Baolong; Wang, Yaping; Wang, Qun; Takano, Tomokazu; Nandi, Samiran; Liu, Shikai; Wong, Lilian; Kaltenboeck, Ludmilla; Quiniou, Sylvie; Bengten, Eva; Miller, Norman; Trant, John; Rokhsar, Daniel; Liu, Zhanjiang

    2010-03-23

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out as a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35 percent of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and with the minor allele present in at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from the assembly of the full catfish EST dataset will greatly benefit the catfish introgression breeding program and whole genome association studies.
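
    The high-quality SNP criterion stated in this abstract (a contig of at least four sequences, with the minor allele seen in at least two of them) can be expressed as a simple filter. The sketch below is a hypothetical Python illustration, not the consortium's pipeline: the function name is invented, and it assumes each contig sequence contributes exactly one base call at the candidate position and that the minor allele is the second most frequent base.

```python
from collections import Counter


def is_high_quality_snp(bases: str, min_depth: int = 4, min_minor: int = 2) -> bool:
    """Apply the depth and minor-allele thresholds to one candidate position.

    `bases` holds one base call per sequence in the contig (an assumption
    of this sketch), for example "AAGG" for four aligned ESTs.
    """
    counts = Counter(bases).most_common()
    if len(counts) < 2:
        return False  # only one allele observed, so not a SNP
    depth = sum(n for _, n in counts)
    minor_count = counts[1][1]  # second most frequent allele
    return depth >= min_depth and minor_count >= min_minor


for column in ["AAGG", "AAAG", "AGG", "AAAA"]:
    print(column, is_high_quality_snp(column))
```

    Requiring the minor allele in at least two sequences reduces the chance that a single sequencing error is reported as a polymorphism.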

  12. Genome-scale study of the importance of binding site context for transcription factor binding and gene regulation

    Directory of Open Access Journals (Sweden)

    Ronne Hans

    2008-11-01

    Full Text Available Background: The rate of mRNA transcription is controlled by transcription factors that bind to specific DNA motifs in promoter regions upstream of protein coding genes. Recent results indicate that not only the presence of a motif but also motif context (for example the orientation of a motif or its location relative to the coding sequence) is important for gene regulation. Results: In this study we present ContextFinder, a tool that is specifically aimed at identifying cases where motif context is likely to affect gene regulation. We used ContextFinder to examine the role of motif context in S. cerevisiae both for DNA binding by transcription factors and for effects on gene expression. For DNA binding we found significant patterns of motif location bias, whereas motif orientations did not seem to matter. Motif context appears to affect gene expression even more than it affects DNA binding, as biases in both motif location and orientation were more frequent in promoters of co-expressed genes. We validated our results against data on nucleosome positioning, and found a negative correlation between preferred motif locations and nucleosome occupancy. Conclusion: We conclude that the requirement for stable binding of transcription factors to DNA and their subsequent function in gene regulation can impose constraints on motif context.

  13. A conserved BDNF, glutamate- and GABA-enriched gene module related to human depression identified by coexpression meta-analysis and DNA variant genome-wide association studies.

    Directory of Open Access Journals (Sweden)

    Lun-Ching Chang

    Full Text Available Large scale gene expression (transcriptome) analysis and genome-wide association studies (GWAS) for single nucleotide polymorphisms have generated a considerable amount of gene- and disease-related information, but heterogeneity and various sources of noise have limited the discovery of disease mechanisms. As systematic dataset integration is becoming essential, we developed methods and performed meta-clustering of gene coexpression links in 11 transcriptome studies from postmortem brains of human subjects with major depressive disorder (MDD) and non-psychiatric control subjects. We next sought enrichment in the top 50 meta-analyzed coexpression modules for genes otherwise identified by GWAS for various sets of disorders. One coexpression module of 88 genes was consistently and significantly associated with GWAS for MDD, other neuropsychiatric disorders and brain functions, and for medical illnesses with elevated clinical risk of depression, but not for other diseases. In support of the superior discriminative power of this novel approach, we observed no significant enrichment for GWAS-related genes in coexpression modules extracted from single studies or in meta-modules using gene expression data from non-psychiatric control subjects. Genes in the identified module encode proteins implicated in neuronal signaling and structure, including glutamate metabotropic receptors (GRM1, GRM7), GABA receptors (GABRA2, GABRA4), and neurotrophic and development-related proteins [BDNF, reelin (RELN), Ephrin receptors (EPHA3, EPHA5)]. These results are consistent with the current understanding of molecular mechanisms of MDD and provide a set of putative interacting molecular partners, potentially reflecting components of a functional module across cells and biological pathways that are synchronously recruited in MDD, other brain disorders and MDD-related illnesses. Collectively, this study demonstrates the importance of integrating transcriptome data, gene...

  14. A conserved BDNF, glutamate- and GABA-enriched gene module related to human depression identified by coexpression meta-analysis and DNA variant genome-wide association studies.

    Science.gov (United States)

    Chang, Lun-Ching; Jamain, Stephane; Lin, Chien-Wei; Rujescu, Dan; Tseng, George C; Sibille, Etienne

    2014-01-01

    Large scale gene expression (transcriptome) analysis and genome-wide association studies (GWAS) for single nucleotide polymorphisms have generated a considerable amount of gene- and disease-related information, but heterogeneity and various sources of noise have limited the discovery of disease mechanisms. As systematic dataset integration is becoming essential, we developed methods and performed meta-clustering of gene coexpression links in 11 transcriptome studies from postmortem brains of human subjects with major depressive disorder (MDD) and non-psychiatric control subjects. We next sought enrichment in the top 50 meta-analyzed coexpression modules for genes otherwise identified by GWAS for various sets of disorders. One coexpression module of 88 genes was consistently and significantly associated with GWAS for MDD, other neuropsychiatric disorders and brain functions, and for medical illnesses with elevated clinical risk of depression, but not for other diseases. In support of the superior discriminative power of this novel approach, we observed no significant enrichment for GWAS-related genes in coexpression modules extracted from single studies or in meta-modules using gene expression data from non-psychiatric control subjects. Genes in the identified module encode proteins implicated in neuronal signaling and structure, including glutamate metabotropic receptors (GRM1, GRM7), GABA receptors (GABRA2, GABRA4), and neurotrophic and development-related proteins [BDNF, reelin (RELN), Ephrin receptors (EPHA3, EPHA5)]. These results are consistent with the current understanding of molecular mechanisms of MDD and provide a set of putative interacting molecular partners, potentially reflecting components of a functional module across cells and biological pathways that are synchronously recruited in MDD, other brain disorders and MDD-related illnesses. 
    Collectively, this study demonstrates the importance of integrating transcriptome data, gene coexpression modules...

  15. Genome-Wide Study of the Tomato SlMLO Gene Family and Its Functional Characterization in Response to the Powdery Mildew Fungus Oidium neolycopersici.

    Science.gov (United States)

    Zheng, Zheng; Appiano, Michela; Pavan, Stefano; Bracuto, Valentina; Ricciardi, Luigi; Visser, Richard G F; Wolters, Anne-Marie A; Bai, Yuling

    2016-01-01

    The MLO (Mildew Locus O) gene family encodes plant-specific proteins containing seven transmembrane domains and likely acting in signal transduction in a calcium- and calmodulin-dependent manner. Some members of the MLO family are susceptibility factors toward fungi causing the powdery mildew disease. In tomato, for example, loss of function of the MLO gene SlMLO1 leads to a particular form of powdery mildew resistance, called ol-2, which almost completely arrests fungal penetration. This type of penetration resistance is characterized by the apposition of papillae at the sites of plant-pathogen interaction. Other MLO homologs in Arabidopsis regulate root response to mechanical stimuli (AtMLO4 and AtMLO11) and pollen tube reception by the female gametophyte (AtMLO7). However, the role of most MLO genes remains unknown. In this work, we provide a genome-wide study of the tomato SlMLO gene family. Besides SlMLO1, 15 other SlMLO homologs were identified and characterized with respect to their structure, genomic organization, phylogenetic relationship, and expression profile. In addition, by analysis of transgenic plants, we demonstrated that simultaneous silencing of SlMLO1 and two of its closely related homologs, SlMLO5 and SlMLO8, confers a higher level of resistance than the one associated with the ol-2 mutation. The outcome of this study provides evidence for functional redundancy among tomato homolog genes involved in powdery mildew susceptibility. Moreover, we developed a series of transgenic lines silenced for individual SlMLO homologs, which lay the foundation for further investigations aimed at assigning new biological functions to the MLO gene family.

  16. Gene duplication in the genome of parasitic Giardia lamblia

    Directory of Open Access Journals (Sweden)

    Flores Roberto

    2010-02-01

    Full Text Available Background: Giardia are a group of widespread intestinal protozoan parasites in a number of vertebrates. Much evidence from G. lamblia indicated they might be the most primitive extant eukaryotes. When and how such a group of the earliest branching unicellular eukaryotes developed the ability to successfully parasitize the latest branching higher eukaryotes (vertebrates) is an intriguing question. Gene duplication has long been thought to be the most common mechanism in the production of primary resources for the origin of evolutionary novelties. In order to parse the evolutionary trajectory of the Giardia parasitic lifestyle, here we carried out a genome-wide analysis of gene duplication patterns in G. lamblia. Results: Although genomic comparison showed that in G. lamblia the contents of many fundamental biological pathways are simplified and the whole genome is very compact, in our study 40% of its genes were identified as duplicated genes. Evolutionary distance analyses of these duplicated genes indicated that two rounds of large-scale duplication events had occurred in the G. lamblia genome. Functional annotation further showed that the majority of recently duplicated genes are VSPs (Variant-specific Surface Proteins), which are essential for the successful parasitic life of Giardia in hosts. Based on evolutionary comparison with their hosts, it was found that the rapid expansion of VSPs in G. lamblia is consistent with the evolutionary radiation of placental mammals. Conclusions: Based on the genome-wide analysis of duplicated genes in G. lamblia, we found that gene duplication was essential for the origin and evolution of the Giardia parasitic lifestyle. The recent expansion of VSPs uniquely occurring in G. lamblia is consistent with the increase in its hosts. Therefore we propose the hypothesis that the increase in Giardia hosts might be the driving force for the rapid expansion of VSPs.

  17. A functional genomics approach using radiation-induced changes in gene expression to study low dose radiation effects in vitro and in vivo

    Energy Technology Data Exchange (ETDEWEB)

    Fornace, Jr, A J

    2007-03-03

    Abstract for the final report for the project entitled "A functional genomics approach using radiation-induced changes in gene expression to study low dose radiation effects in vitro and in vivo", which has been supported by the DOE Low Dose Radiation Research Program for approximately 7 years. This project has encompassed two sequential awards, ER62683 and then ER63308, in the Gene Response Section in the Center for Cancer Research at the National Cancer Institute. The project was temporarily suspended during the relocation of the Principal Investigator's laboratory to the Dept. of Genetics and Complex Diseases at Harvard School of Public Health at the end of 2004. Remaining support for the final year was transferred to this new site later in 2005 and was assigned the DOE Award Number ER64065. The major aims of this project have been 1) to characterize changes in gene expression in response to low-dose radiation; this includes responses in human cell lines, peripheral blood lymphocytes (PBL), and in vivo after human or murine exposures, as well as the effect of dose-rate on gene responses; 2) to characterize changes in gene expression that may be involved in bystander effects, such as may be mediated by cytokines and other intercellular signaling proteins; and 3) to characterize responses in transgenic mouse models with relevance to genomic stability. A variety of approaches have been used to study transcriptional events including microarray hybridization, quantitative single-probe hybridization which was developed in this laboratory, quantitative RT-PCR, and promoter microarray analysis using genomic regulatory motifs. Considering the frequent responsiveness of genes encoding cytokines and related signaling proteins that can affect cellular metabolism, initial efforts were made to study radiation responses at the metabolomic level and to correlate them with radiation-responsive gene expression.
Productivity includes twenty-four published and in press manuscripts

  18. Conservation of ribosomal protein gene ordering in 16 complete genomes

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    The organization of ribosomal proteins in 16 prokaryotic genomes was studied as an example of comparative genome analyses of gene systems. Hypothetical ribosomal protein-containing operons were constructed. These operons also contained putative genes and other non-ribosomal genes. The correspondences among these genes across different organisms were clarified by sequence homology computations. In this way a cross tabulation of 70 ribosomal protein genes was constructed. On average, these were organized into 9-14 operons in each genome. There were also 25 non-ribosomal or putative genes in these mainly ribosomal protein operons. Hence the table contains 95 genes in total. It was found that: (i) the conservation of the block of about 20 r-proteins in the L3 and L4 operons across almost the entire eubacteria and archaebacteria is remarkable; (ii) some operons belong only to eubacteria or archaebacteria; (iii) although the ribosomal protein operons are highly conserved within a domain, there are fine variations in some operons across different organisms within each domain, and these variations are informative about the evolutionary relations among the organisms. This method offers new potential for studying the origin and evolution of old species.

  19. A Probabilistic Genome-Wide Gene Reading Frame Sequence Model

    DEFF Research Database (Denmark)

    Have, Christian Theil; Mørk, Søren

    We introduce a new type of probabilistic sequence model that models the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome level, effectively producing a sequential genome annotation...... and are evaluated by the effect on prediction performance. Since bacterial gene finding is to a large extent a solved problem, it forms an ideal proving ground for evaluating the explicit modeling of larger scale gene sequence composition of genomes. We conclude that the sequential composition of gene reading frames...... as output. The model can be used to obtain the most probable genome annotation based on a combination of (i) a gene finder score of each gene candidate and (ii) the sequence of the reading frames of gene candidates through a genome. The model, as well as a higher order variant, is developed and tested...
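The combination described in this abstract (a per-candidate gene finder score plus a model of the order of reading frames) can be illustrated with a small dynamic program. This is only a minimal sketch and not the authors' model: the function name `best_annotation`, the first-order transition table `log_trans`, and the decision to ignore overlaps between candidates are all assumptions made for illustration.

```python
import math


def best_annotation(candidates, log_trans, log_start=0.0):
    """Pick the highest-scoring chain of gene candidates.

    candidates: list of (frame, score) tuples ordered by genome position,
                where score is a log-scale gene finder score (assumed).
    log_trans:  nested dict, log_trans[a][b] = log probability that a gene
                in frame a is followed by a gene in frame b (assumed).
    Returns (indices of selected candidates, total log score).
    """
    n = len(candidates)
    if n == 0:
        return [], 0.0

    best = [0.0] * n   # best[i]: best total score of a chain ending at i
    prev = [-1] * n    # back-pointer to the previous selected candidate

    for i, (frame_i, score_i) in enumerate(candidates):
        # Option 1: candidate i starts the chain.
        best[i] = log_start + score_i
        # Option 2: candidate i follows an earlier selected candidate j.
        for j in range(i):
            frame_j = candidates[j][0]
            total = best[j] + log_trans[frame_j][frame_i] + score_i
            if total > best[i]:
                best[i] = total
                prev[i] = j

    # Trace back from the best chain end.
    end = max(range(n), key=lambda i: best[i])
    path = []
    k = end
    while k != -1:
        path.append(k)
        k = prev[k]
    path.reverse()
    return path, best[end]
```

In this sketch a weakly scored candidate in an unexpected reading frame is skipped when the frame transition penalty outweighs its score, which is the intuition behind adding frame order to a gene finder.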

  20. Family-based genome-wide association study of frontal θ oscillations identifies potassium channel gene KCNJ6.

    Science.gov (United States)

    Kang, S J; Rangaswamy, M; Manz, N; Wang, J-C; Wetherill, L; Hinrichs, T; Almasy, L; Brooks, A; Chorlian, D B; Dick, D; Hesselbrock, V; Kramer, J; Kuperman, S; Nurnberger, J; Rice, J; Schuckit, M; Tischfield, J; Bierut, L J; Edenberg, H J; Goate, A; Foroud, T; Porjesz, B

    2012-08-01

    Event-related oscillations (EROs) represent highly heritable neuroelectric correlates of cognitive processes that manifest deficits in alcoholics and in offspring at high risk of developing alcoholism. Theta ERO to targets in the visual oddball task has been shown to be an endophenotype for alcoholism. A family-based genome-wide association study was performed for the frontal theta ERO phenotype using 634,583 autosomal single nucleotide polymorphisms (SNPs) genotyped in 1560 family members from 117 families densely affected by alcohol use disorders, recruited in the Collaborative Study on the Genetics of Alcoholism. Genome-wide significant association was found with several SNPs on chromosome 21 in KCNJ6 (a potassium inward rectifier channel; KIR3.2/GIRK2), with the most significant SNP at P = 4.7 × 10⁻¹⁰. The same SNPs were also associated with EROs from central and parietal electrodes, but with less significance, suggesting that the association is frontally focused. One imputed synonymous SNP in exon four, highly correlated with our top three SNPs, was significantly associated with the frontal theta ERO phenotype. These results suggest that KCNJ6 or its product GIRK2 accounts for some of the variation in frontal theta band oscillations. GIRK2 receptor activation contributes to slow inhibitory postsynaptic potentials that modulate neuronal excitability, and therefore influence neuronal networks.

  1. Genome-Wide Association Study Reveals Candidate Genes for Control of Plant Height, Branch Initiation Height and Branch Number in Rapeseed (Brassica napus L.)

    Directory of Open Access Journals (Sweden)

    Ming Zheng

    2017-07-01

    Full Text Available Plant architecture is crucial for rapeseed yield and is determined by plant height (PH), branch initiation height (BIH), branch number (BN) and leaf and inflorescence morphology. In this study, we measured three major factors (PH, BIH, and BN) in a panel of 333 rapeseed accessions across 4 years. A genome-wide association study (GWAS) was performed via a Q + K model and the panel was genotyped using the 60 k Brassica Infinium SNP array. We identified seven loci for PH, four for BIH, and five for BN. Subsequently, by determining linkage disequilibrium (LD) decay associated with 38 significant SNPs, we obtained 31, 15, and 17 candidate genes for these traits, respectively. We also showed that PH is significantly correlated with BIH, while no other correlation was revealed. Notably, a GA signaling gene (BnRGA) and a flowering gene (BnFT) located on chromosome A02 were identified as the most likely candidate genes associated with PH regulation. Furthermore, a meristem initiation gene (BnLOF2) and a NAC domain transcriptional factor (BnCUC3) that may be associated with BN were identified on chromosome A07. This study provides novel insight into the genetic control of plant architecture and may facilitate marker-based breeding for rapeseed.

  2. Gene mutations of acute myeloid leukemia in the genome era.

    Science.gov (United States)

    Naoe, Tomoki; Kiyoi, Hitoshi

    2013-02-01

    Ten years ago, gene mutations found in acute myeloid leukemia (AML) were conceptually grouped into class I mutations, which cause constitutive activation of intracellular signals that contribute to growth and survival, and class II mutations, which block differentiation and/or enhance self-renewal through altered transcription factors. A cooperative model between the two classes of mutations has been suggested by murine experiments and partly supported by epidemiological findings. In the last 5 years, comprehensive genomic analysis has uncovered new gene mutations in epigenome-associated enzymes and in molecules not previously implicated. These new mutations apparently increase the complexity and heterogeneity of AML. Although a long list of gene mutations may have been compiled, the entire picture of molecular pathogenesis in AML remains to be elucidated because gene rearrangement, gene copy number, DNA methylation and expression profiles have not been fully studied in conjunction with gene mutations. Comprehensive genome research will deepen the understanding of AML and promote the development of new classifications and treatments. This review focuses on gene mutations that were recently discovered by genome sequencing.

  3. Conservation of ribosomal protein gene ordering in 16 complete genomes

    Institute of Scientific and Technical Information of China (English)

    王宁; 陈润生; 王永雄

    2000-01-01

    The organization of ribosomal proteins in 16 prokaryotic genomes was studied as an example of comparative genome analyses of gene systems. Hypothetical ribosomal protein-containing operons were constructed. These operons also contained putative genes and other non-ribosomal genes. The correspondences among these genes across different organisms were clarified by sequence homology computations. In this way a cross tabulation of 70 ribosomal protein genes was constructed. On average, these were organized into 9-14 operons in each genome. There were also 25 non-ribosomal or putative genes in these mainly ribosomal protein operons. Hence the table contains 95 genes in total. It was found that: (i) the conservation of the block of about 20 r-proteins in the L3 and L4 operons across almost the entire eubacteria and archaebacteria is remarkable; (ii) some operons only belong to eubacteria or archaebacteria; (iii) although the ribosomal protein operons are highly conserved within domain, there are fine variat

  4. Comparative genomics of Mycoplasma: analysis of conserved essential genes and diversity of the pan-genome.

    Directory of Open Access Journals (Sweden)

    Wei Liu

    Full Text Available Mycoplasma, the smallest self-replicating organism with a minimal metabolism and little genomic redundancy, is expected to be a close approximation to the minimal set of genes needed to sustain bacterial life. This study employs comparative evolutionary analysis of twenty Mycoplasma genomes to gain an improved understanding of essential genes. By analyzing the core genome of mycoplasmas, we identified the conserved essential gene set for mycoplasma survival. Further analysis showed that the core genome set has many characteristics in common with experimentally identified essential genes. Several key genes, which are related to DNA replication and repair and can be disrupted in transposon mutagenesis studies, may be critical for bacterial survival, especially over long periods of natural selection. Phylogenomic reconstructions based on 3,355 homologous groups allowed robust estimation of phylogenetic relatedness among mycoplasma strains. To obtain deeper insight into the relative roles of molecular evolution in pathogen adaptation to their hosts, we also analyzed the positive selection pressures on particular sites and lineages. There appears to be an approximate correlation between the divergence of species and the level of positive selection detected in corresponding lineages.

  5. Floral gene resources from basal angiosperms for comparative genomics research

    Directory of Open Access Journals (Sweden)

    Zhang Xiaohong

    2005-03-01

    Full Text Available Abstract Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage

  6. Floral gene resources from basal angiosperms for comparative genomics research

    Science.gov (United States)

    Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; dePamphilis, Claude W; Leebens-Mack, James H

    2005-01-01

    Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and

  7. Bioinformatics Assisted Gene Discovery and Annotation of Human Genome

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    As the sequencing stage of the Human Genome Project nears its end, work has begun on discovering novel genes from genome sequences and annotating their biological functions. This article reviews the major bioinformatics tools and technologies currently available for large-scale gene discovery and annotation from human genome sequences. Some ideas about possible future developments are also provided.

  8. Genomic Prediction of Gene Bank Wheat Landraces

    Directory of Open Access Journals (Sweden)

    José Crossa

    2016-07-01

    Full Text Available This study examines genomic prediction within 8416 Mexican landrace accessions and 2403 Iranian landrace accessions stored in gene banks. The Mexican and Iranian collections were evaluated in separate field trials, including an optimum environment for several traits, and in two separate environments (drought, D and heat, H) for the highly heritable traits, days to heading (DTH), and days to maturity (DTM). Analyses accounting and not accounting for population structure were performed. Genomic prediction models include genotype × environment interaction (G × E). Two alternative prediction strategies were studied: (1) random cross-validation of the data in 20% training (TRN) and 80% testing (TST) (TRN20-TST80) sets, and (2) two types of core sets, “diversity” and “prediction”, including 10% and 20%, respectively, of the total collections. Accounting for population structure decreased prediction accuracy by 15–20% as compared to prediction accuracy obtained when not accounting for population structure. Accounting for population structure gave prediction accuracies for traits evaluated in one environment for TRN20-TST80 that ranged from 0.407 to 0.677 for Mexican landraces, and from 0.166 to 0.662 for Iranian landraces. Prediction accuracy of the 20% diversity core set was similar to accuracies obtained for TRN20-TST80, ranging from 0.412 to 0.654 for Mexican landraces, and from 0.182 to 0.647 for Iranian landraces. The predictive core set gave similar prediction accuracy as the diversity core set for Mexican collections, but slightly lower for Iranian collections. Prediction accuracy when incorporating G × E for DTH and DTM for Mexican landraces for TRN20-TST80 was around 0.60, which is greater than without the G × E term. For Iranian landraces, accuracies were 0.55 for the G × E model with TRN20-TST80. Results show promising prediction accuracies for potential use in germplasm enhancement and rapid introgression of exotic germplasm
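The TRN20-TST80 strategy in this abstract (random assignment of 20% of accessions to training and 80% to testing) can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the prediction model is left to a caller-supplied `fit_predict` function rather than the models used in the study, and reporting accuracy as a Pearson correlation between predicted and observed values is an assumption based on common practice in genomic prediction.

```python
import random
from statistics import mean


def split_trn_tst(n, train_fraction=0.2, seed=0):
    """Randomly split indices 0..n-1 into training and testing sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train_fraction)
    return sorted(idx[:n_train]), sorted(idx[n_train:])


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cv_accuracy(genotypes, phenotypes, fit_predict,
                train_fraction=0.2, seed=0):
    """Train on the TRN fraction, predict the TST fraction, and return
    the correlation between predicted and observed phenotypes.

    fit_predict(train_genotypes, train_phenotypes, test_genotypes) is a
    hypothetical caller-supplied model returning one prediction per
    test accession.
    """
    trn, tst = split_trn_tst(len(phenotypes), train_fraction, seed)
    predicted = fit_predict(
        [genotypes[i] for i in trn],
        [phenotypes[i] for i in trn],
        [genotypes[i] for i in tst],
    )
    observed = [phenotypes[i] for i in tst]
    return pearson(predicted, observed)
```

Repeating `cv_accuracy` over several seeds and averaging the results would give a more stable estimate than a single random partition.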

  9. The genome BLASTatlas-a GeneWiz extension for visualization of whole-genome homology.

    Science.gov (United States)

    Hallin, Peter F; Binnewies, Tim T; Ussery, David W

    2008-05-01

    The development of fast and inexpensive methods for sequencing bacterial genomes has led to a wealth of data, often with many genomes being sequenced of the same species or closely related organisms. Thus, there is a need for visualization methods that will allow easy comparison of many sequenced genomes to a defined reference strain. The BLASTatlas is one such tool that is useful for mapping and visualizing whole genome homology of genes and proteins within a reference strain compared to other strains or species of one or more prokaryotic organisms. We provide examples of BLASTatlases, including the Clostridium tetani plasmid p88, where homologues for toxin genes can be easily visualized in other sequenced Clostridium genomes, and for a Clostridium botulinum genome, compared to 14 other Clostridium genomes. DNA structural information is also included in the atlas to visualize the DNA chromosomal context of regions. Additional information can be added to these plots, and as an example we have added circles showing the probability of the DNA helix opening up under superhelical tension. The tool is SOAP compliant and WSDL (web services description language) files are located on our website: (http://www.cbs.dtu.dk/ws/BLASTatlas), where programming examples are available in Perl. By providing an interoperable method to carry out whole genome visualization of homology, this service offers bioinformaticians as well as biologists an easy-to-adopt workflow that can be directly called from the programming language of the user, hence enabling automation of repeated tasks. This tool can be relevant in many pangenomic as well as in metagenomic studies, by giving a quick overview of clusters of insertion sites, genomic islands and overall homology between a reference sequence and a data set.

  10. A genome-wide linkage and association study of musical aptitude identifies loci containing genes related to inner ear development and neurocognitive functions

    Science.gov (United States)

    Oikkonen, J.; Huang, Y.; Onkamo, P.; Ukkola-Vuoti, L.; Raijas, P.; Karma, K.; Vieland, V. J.; Järvelä, I.

    2014-01-01

    Humans have developed the perception, production and processing of sounds into the art of music. A genetic contribution to these skills of musical aptitude has long been suggested. We performed a genome-wide scan in 76 pedigrees (767 individuals) characterized for the ability to discriminate pitch (SP), duration (ST) and sound patterns (KMT), which are primary capacities for music perception. Using the Bayesian linkage and association approach implemented in program package KELVIN, especially designed for complex pedigrees, several SNPs near genes affecting the functions of the auditory pathway and neurocognitive processes were identified. The strongest association was found at 3q21.3 (rs9854612) with combined SP, ST and KMT test scores (COMB). This region is located a few dozen kilobases upstream of the GATA binding protein 2 (GATA2) gene. GATA2 regulates the development of cochlear hair cells and the inferior colliculus (IC), which are important in tonotopic mapping. The highest probability of linkage was obtained for phenotype SP at 4p14, located next to the region harboring the protocadherin 7 gene, PCDH7. Two SNPs rs13146789 and rs13109270 of PCDH7 showed strong association. PCDH7 has been suggested to play a role in cochlear and amygdaloid complexes. Functional class analysis showed that inner ear and schizophrenia related genes were enriched inside the linked regions. This study is the first to show the importance of auditory pathway genes in musical aptitude. PMID:24614497

  11. Replication of 6 obesity genes in a meta-analysis of genome-wide association studies from diverse ancestries.

    Directory of Open Access Journals (Sweden)

    Li-Jun Tan

    Full Text Available Obesity is a major public health problem with a significant genetic component. Multiple DNA polymorphisms/genes have been shown to be strongly associated with obesity, typically in populations of European descent. The aim of this study was to verify the extent to which 6 confirmed obesity genes (FTO, CTNNBL1, ADRB2, LEPR, PPARG and UCP2 genes) could be replicated in 8 different samples (n = 11,161) and to explore whether the same genes contribute to obesity-susceptibility in populations of different ancestries (five Caucasian, one Chinese, one African-American and one Hispanic population). GWAS-based data sets with 1000 G imputed variants were tested for association with obesity phenotypes individually in each population, and subsequently combined in a meta-analysis. Multiple variants at the FTO locus showed significant associations with BMI, fat mass (FM) and percentage of body fat (PBF) in meta-analysis. The strongest association was detected at rs7185735 (P-value = 1.01×10⁻⁷ for BMI, 1.80×10⁻⁶ for FM, and 5.29×10⁻⁴ for PBF). Variants at the CTNNBL1, LEPR and PPARG loci demonstrated nominal association with obesity phenotypes (meta-analysis P-values ranging from 1.15×10⁻³ to 4.94×10⁻²). There was no evidence of association with variants at ADRB2 and UCP2 genes. When stratified by sex and ethnicity, FTO variants showed sex-specific and ethnic-specific effects on obesity traits. Thus, it is likely that FTO has an important role in the sex- and ethnic-specific risk of obesity. Our data confirmed the role of FTO, CTNNBL1, LEPR and PPARG in obesity predisposition. These findings enhanced our knowledge of genetic associations between these genes and obesity-related phenotypes, and provided further justification for pursuing functional studies of these genes in the pathophysiology of obesity. Sex and ethnic differences in genetic susceptibility across populations of diverse ancestries may contribute to a more targeted prevention and

  12. Replication of 6 Obesity Genes in a Meta-Analysis of Genome-Wide Association Studies from Diverse Ancestries

    Science.gov (United States)

    Tan, Li-Jun; Zhu, Hu; He, Hao; Wu, Ke-Hao; Li, Jian; Chen, Xiang-Ding; Zhang, Ji-Gang; Shen, Hui; Tian, Qing; Krousel-Wood, Marie; Papasian, Christopher J.; Bouchard, Claude; Pérusse, Louis; Deng, Hong-Wen

    2014-01-01

    Obesity is a major public health problem with a significant genetic component. Multiple DNA polymorphisms/genes have been shown to be strongly associated with obesity, typically in populations of European descent. The aim of this study was to verify the extent to which 6 confirmed obesity genes (FTO, CTNNBL1, ADRB2, LEPR, PPARG and UCP2 genes) could be replicated in 8 different samples (n = 11,161) and to explore whether the same genes contribute to obesity-susceptibility in populations of different ancestries (five Caucasian, one Chinese, one African-American and one Hispanic population). GWAS-based data sets with 1000 G imputed variants were tested for association with obesity phenotypes individually in each population, and subsequently combined in a meta-analysis. Multiple variants at the FTO locus showed significant associations with BMI, fat mass (FM) and percentage of body fat (PBF) in meta-analysis. The strongest association was detected at rs7185735 (P-value = 1.01×10⁻⁷ for BMI, 1.80×10⁻⁶ for FM, and 5.29×10⁻⁴ for PBF). Variants at the CTNNBL1, LEPR and PPARG loci demonstrated nominal association with obesity phenotypes (meta-analysis P-values ranging from 1.15×10⁻³ to 4.94×10⁻²). There was no evidence of association with variants at ADRB2 and UCP2 genes. When stratified by sex and ethnicity, FTO variants showed sex-specific and ethnic-specific effects on obesity traits. Thus, it is likely that FTO has an important role in the sex- and ethnic-specific risk of obesity. Our data confirmed the role of FTO, CTNNBL1, LEPR and PPARG in obesity predisposition. These findings enhanced our knowledge of genetic associations between these genes and obesity-related phenotypes, and provided further justification for pursuing functional studies of these genes in the pathophysiology of obesity. Sex and ethnic differences in genetic susceptibility across populations of diverse ancestries may contribute to a more targeted prevention and customized

  13. Replication of 6 obesity genes in a meta-analysis of genome-wide association studies from diverse ancestries.

    Science.gov (United States)

    Tan, Li-Jun; Zhu, Hu; He, Hao; Wu, Ke-Hao; Li, Jian; Chen, Xiang-Ding; Zhang, Ji-Gang; Shen, Hui; Tian, Qing; Krousel-Wood, Marie; Papasian, Christopher J; Bouchard, Claude; Pérusse, Louis; Deng, Hong-Wen

    2014-01-01

    Obesity is a major public health problem with a significant genetic component. Multiple DNA polymorphisms/genes have been shown to be strongly associated with obesity, typically in populations of European descent. The aim of this study was to verify the extent to which 6 confirmed obesity genes (FTO, CTNNBL1, ADRB2, LEPR, PPARG and UCP2 genes) could be replicated in 8 different samples (n = 11,161) and to explore whether the same genes contribute to obesity-susceptibility in populations of different ancestries (five Caucasian, one Chinese, one African-American and one Hispanic population). GWAS-based data sets with 1000 G imputed variants were tested for association with obesity phenotypes individually in each population, and subsequently combined in a meta-analysis. Multiple variants at the FTO locus showed significant associations with BMI, fat mass (FM) and percentage of body fat (PBF) in meta-analysis. The strongest association was detected at rs7185735 (P-value = 1.01×10⁻⁷ for BMI, 1.80×10⁻⁶ for FM, and 5.29×10⁻⁴ for PBF). Variants at the CTNNBL1, LEPR and PPARG loci demonstrated nominal association with obesity phenotypes (meta-analysis P-values ranging from 1.15×10⁻³ to 4.94×10⁻²). There was no evidence of association with variants at ADRB2 and UCP2 genes. When stratified by sex and ethnicity, FTO variants showed sex-specific and ethnic-specific effects on obesity traits. Thus, it is likely that FTO has an important role in the sex- and ethnic-specific risk of obesity. Our data confirmed the role of FTO, CTNNBL1, LEPR and PPARG in obesity predisposition. These findings enhanced our knowledge of genetic associations between these genes and obesity-related phenotypes, and provided further justification for pursuing functional studies of these genes in the pathophysiology of obesity. Sex and ethnic differences in genetic susceptibility across populations of diverse ancestries may contribute to a more targeted prevention and customized

  14. GENOME-ENABLED DISCOVERY OF CARBON SEQUESTRATION GENES IN POPLAR

    Energy Technology Data Exchange (ETDEWEB)

    DAVIS J M

    2007-10-11

    Plants utilize carbon by partitioning the reduced carbon obtained through photosynthesis into different compartments and into different chemistries within a cell and subsequently allocating such carbon to sink tissues throughout the plant. Since the phytohormones auxin and cytokinin are known to influence sink strength in tissues such as roots (Skoog & Miller 1957, Nordstrom et al. 2004), we hypothesized that altering the expression of genes that regulate auxin-mediated (e.g., AUX/IAA or ARF transcription factors) or cytokinin-mediated (e.g., RR transcription factors) control of root growth and development would impact carbon allocation and partitioning belowground (Fig. 1 - Renewal Proposal). Specifically, the ARF, AUX/IAA and RR transcription factor gene families mediate the effects of the growth regulators auxin and cytokinin on cell expansion, cell division and differentiation into root primordia. Invertases (IVR), whose transcript abundance is enhanced by both auxin and cytokinin, are critical components of carbon movement and therefore of carbon allocation. Thus, we initiated comparative genomic studies to identify the AUX/IAA, ARF, RR and IVR gene families in the Populus genome that could impact carbon allocation and partitioning. Bioinformatics searches using Arabidopsis gene sequences as queries identified regions with high degrees of sequence similarities in the Populus genome. These Populus sequences formed the basis of our transgenic experiments. Transgenic modification of gene expression involving members of these gene families was hypothesized to have profound effects on carbon allocation and partitioning.

  15. A genome-wide MeSH-based literature mining system predicts implicit gene-to-gene relationships and networks.

    Science.gov (United States)

    Xiang, Zuoshuang; Qin, Tingting; Qin, Zhaohui S; He, Yongqun

    2013-10-16

    The large amount of literature in the post-genomics era enables the study of gene interactions and networks using all available articles published for a specific organism. MeSH is a controlled vocabulary of medical and scientific terms that is used by biomedical scientists to manually index articles in the PubMed literature database. We hypothesized that genome-wide gene-MeSH term associations from the PubMed literature database could be used to predict implicit gene-to-gene relationships and networks. While the gene-MeSH associations have been used to detect gene-gene interactions in some studies, different methods have not been well compared, and such a strategy has not been evaluated for a genome-wide literature analysis. Genome-wide literature mining of gene-to-gene interactions allows ranking of the best gene interactions and investigation of comprehensive biological networks at a genome level. The genome-wide GenoMesh literature mining algorithm was developed by sequentially generating a gene-article matrix, a normalized gene-MeSH term matrix, and a gene-gene matrix. The gene-gene matrix relies on the calculation of pairwise gene dissimilarities based on gene-MeSH relationships. An optimized dissimilarity score was identified from six well-studied functions based on a receiver operating characteristic (ROC) analysis. Based on the studies with well-studied Escherichia coli and less-studied Brucella spp., GenoMesh was found to accurately identify gene functions using weighted MeSH terms, predict gene-gene interactions not reported in the literature, and cluster all the genes studied from an organism using the MeSH-based gene-gene matrix. A web-based GenoMesh literature mining program is also available at: http://genomesh.hegroup.org. GenoMesh also predicts gene interactions and networks among genes associated with specific MeSH terms or user-selected gene lists. 
The GenoMesh algorithm and web program provide the first genome-wide, MeSH-based literature mining
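    The three-step pipeline described in this record (gene-article matrix, normalized gene-MeSH term matrix, gene-gene dissimilarity matrix) can be illustrated with a minimal sketch. The toy articles, the unit-length normalization, and the cosine dissimilarity are assumptions made for illustration only; the abstract does not say which of the six candidate functions was selected or how the weights were computed.

```python
import math
from collections import defaultdict

# Hypothetical toy input: article id -> (genes mentioned, MeSH terms indexed).
articles = {
    "a1": ({"geneA", "geneB"}, {"DNA Repair", "Escherichia coli"}),
    "a2": ({"geneA"}, {"DNA Repair"}),
    "a3": ({"geneC"}, {"Virulence"}),
    "a4": ({"geneB", "geneC"}, {"Virulence", "Escherichia coli"}),
}


def gene_mesh_matrix(articles):
    """Count, for every gene, how many of its articles carry each MeSH term."""
    counts = defaultdict(lambda: defaultdict(float))
    for genes, terms in articles.values():
        for gene in genes:
            for term in terms:
                counts[gene][term] += 1.0
    return counts


def normalize(counts):
    """Scale each gene's MeSH vector to unit length (an assumed normalization)."""
    profiles = {}
    for gene, vec in counts.items():
        norm = math.sqrt(sum(v * v for v in vec.values()))
        profiles[gene] = {term: v / norm for term, v in vec.items()}
    return profiles


def cosine_dissimilarity(u, v):
    """1 - cosine similarity of two unit-length sparse vectors."""
    shared = set(u) & set(v)
    return 1.0 - sum(u[t] * v[t] for t in shared)


def gene_gene_matrix(profiles):
    """Pairwise dissimilarity for every unordered gene pair."""
    genes = sorted(profiles)
    return {
        (a, b): cosine_dissimilarity(profiles[a], profiles[b])
        for i, a in enumerate(genes)
        for b in genes[i + 1:]
    }


profiles = normalize(gene_mesh_matrix(articles))
distances = gene_gene_matrix(profiles)
for pair, d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(pair, round(d, 3))
```

    In this toy data geneA and geneB share more MeSH context than geneA and geneC, so the first pair receives the smaller dissimilarity; ranking pairs by this score is the kind of output the record describes.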

  16. Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes.

    Directory of Open Access Journals (Sweden)

    Yunsheng Wang

    Full Text Available In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes were shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention.

  17. Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes.

    Science.gov (United States)

    Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K; Duan, Yongping; Luo, Feng

    2015-01-01

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes were shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention.

  18. Genome-Wide Linkage and Positional Association Analyses Identify Associations of Novel AFF3 and NTM Genes with Triglycerides: The GenSalt Study

    Science.gov (United States)

    Li, Changwei; Bazzano, Lydia A.L.; Rao, Dabeeru C.; Hixson, James E.; He, Jiang; Gu, Dongfeng; Gu, Charles C.; Shimmin, Lawrence C.; Jaquish, Cashell E.; Schwander, Karen; Liu, De-Pei; Huang, Jianfeng; Lu, Fanghong; Cao, Jie; Chong, Shen; Lu, Xiangfeng; Kelly, Tanika N.

    2016-01-01

    We conducted a genome-wide linkage scan and positional association study to identify genes and variants influencing blood lipid levels among participants of the Genetic Epidemiology Network of Salt-Sensitivity (GenSalt) study. The GenSalt study was conducted among 1906 participants from 633 Han Chinese families. Lipids were measured from overnight fasting blood samples using standard methods. Multipoint quantitative trait genome-wide linkage scans were performed on the high-density lipoprotein, low-density lipoprotein, and log-transformed triglyceride phenotypes. Using dense panels of single nucleotide polymorphisms (SNPs), single-marker and gene-based association analyses were conducted to follow-up on promising linkage signals. Additive associations between each SNP and lipid phenotypes were tested using mixed linear regression models. Gene-based analyses were performed by combining P-values from single-marker analyses within each gene using the truncated product method (TPM). Significant associations were assessed for replication among 777 Asian participants of the Multi-ethnic Study of Atherosclerosis (MESA). Bonferroni correction was used to adjust for multiple testing. In the GenSalt study, suggestive linkage signals were identified at 2p11.2–2q12.1 [maximum multipoint LOD score (MML) = 2.18 at 2q11.2] and 11q24.3–11q25 (MML = 2.29 at 11q25) for the log-transformed triglyceride phenotype. Follow-up analyses of these two regions revealed gene-based associations of charged multivesicular body protein 3 (CHMP3), ring finger protein 103 (RNF103), AF4/FMR2 family, member 3 (AFF3), and neurotrimin (NTM) with triglycerides (P = 4 × 10⁻⁴, 1.00 × 10⁻⁵, 2.00 × 10⁻⁵, and 1.00 × 10⁻⁷, respectively). Both the AFF3 and NTM triglyceride associations were replicated among MESA study participants (P = 1.00 × 10⁻⁷ and 8.00 × 10⁻⁵, respectively). Furthermore, NTM explained the linkage signal on chromosome 11. In conclusion, we identified novel genes
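    The gene-based step in this record combines single-marker P-values with the truncated product method (TPM). A minimal sketch of the idea follows: the statistic is the product of all P-values at or below a truncation point tau, and its significance is estimated here by Monte Carlo simulation. The simulation assumes independent tests, which real SNPs in linkage disequilibrium violate, and the value tau = 0.05, the function names, and the example P-values are illustrative assumptions, not the settings used in the study.

```python
import random


def truncated_product(p_values, tau=0.05):
    """Product of the P-values that fall at or below tau (1.0 if none do)."""
    w = 1.0
    for p in p_values:
        if p <= tau:
            w *= p
    return w


def tpm_p_value(p_values, tau=0.05, n_sim=20000, seed=0):
    """Monte Carlo gene-level P-value for the truncated product statistic.

    Null P-values are drawn as independent uniforms; this is an assumption
    that ignores correlation between markers within a gene.
    """
    rng = random.Random(seed)
    observed = truncated_product(p_values, tau)
    n = len(p_values)
    hits = 0
    for _ in range(n_sim):
        null = [rng.random() for _ in range(n)]
        if truncated_product(null, tau) <= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_sim + 1)


# Hypothetical single-marker P-values for the SNPs of one gene.
snp_p_values = [0.001, 0.002, 0.5, 0.8]
gene_p = tpm_p_value(snp_p_values)

# Bonferroni threshold for a hypothetical number of genes tested.
n_genes_tested = 50
print(gene_p, gene_p < 0.05 / n_genes_tested)
```

    When markers are correlated, the usual remedy is to replace the uniform draws with permutations of the phenotype, so that the null distribution retains the correlation structure of the genotypes.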

  19. Genomic Evidence Reveals the Extreme Diversity and Wide Distribution of the Arsenic-Related Genes in Burkholderiales

    OpenAIRE

    Xiangyang Li; Linshuang Zhang; Gejiao Wang

    2014-01-01

    So far, numerous genes have been found to associate with various strategies to resist and transform the toxic metalloid arsenic (here, we denote these genes as "arsenic-related genes"). However, our knowledge of the distribution, redundancies and organization of these genes in bacteria is still limited. In this study, we analyzed the 188 Burkholderiales genomes and found that 95% genomes harbored arsenic-related genes, with an average of 6.6 genes per genome. The results indicated: a) compare...

  20. Pathway Analysis Using Genome-Wide Association Study Data for Coronary Restenosis – A Potential Role for the PARVB Gene

    Science.gov (United States)

    Verschuren, Jeffrey J. W.; Trompet, Stella; Sampietro, M. Lourdes; Heijmans, Bastiaan T.; Koch, Werner; Kastrati, Adnan; Houwing-Duistermaat, Jeanine J.; Slagboom, P. Eline; Quax, Paul H. A.; Jukema, J. Wouter

    2013-01-01

    Background Coronary restenosis after percutaneous coronary intervention (PCI) still remains a significant limitation of the procedure. The causative mechanisms of restenosis have not yet been fully identified. The goal of the current study was to perform gene-set analysis of biological pathways related to inflammation, proliferation, vascular function and transcriptional regulation on coronary restenosis to identify novel genes and pathways related to this condition. Methods The GENetic DEterminants of Restenosis (GENDER) databank contains genotypic data of 556,099 SNPs of 295 cases with restenosis and 571 matched controls. Fifty-four pathways, related to known restenosis-related processes, were selected. Gene-set analysis was performed using PLINK, GRASS and ALIGATOR software. Pathways with a p<0.01 were fine-mapped and significantly associated SNPs were analyzed in an independent replication cohort. Results Six pathways (cell-extracellular matrix (ECM) interactions pathway, IL2 signaling pathway, IL6 signaling pathway, platelet derived growth factor pathway, vitamin D receptor pathway and the mitochondria pathway) were significantly associated in one or two of the software packages. Two SNPs in the cell-ECM interactions pathway were replicated in an independent restenosis cohort. No replication was obtained for the other pathways. Conclusion With these results we demonstrate a potential role of the cell-ECM interactions pathway in the development of coronary restenosis. These findings contribute to the increasing knowledge of the genetic etiology of restenosis formation and could serve as a hypothesis-generating effort for further functional studies. PMID:23950981

  1. Genome-wide analysis of homeobox gene family in legumes: identification, gene duplication and expression profiling.

    Science.gov (United States)

    Bhattacharjee, Annapurna; Ghangal, Rajesh; Garg, Rohini; Jain, Mukesh

    2015-01-01

    Homeobox genes encode transcription factors that are known to play a major role in different aspects of plant growth and development. In the present study, we identified homeobox genes belonging to 14 different classes in five legume species, including chickpea, soybean, Medicago, Lotus and pigeonpea. The characteristic differences within homeodomain sequences among various classes of homeobox gene family were quite evident. Genome-wide expression analysis using publicly available datasets (RNA-seq and microarray) indicated that homeobox genes are differentially expressed in various tissues/developmental stages and under stress conditions in different legumes. We validated the differential expression of selected chickpea homeobox genes via quantitative reverse transcription polymerase chain reaction. Genome duplication analysis in soybean indicated that segmental duplication has significantly contributed in the expansion of homeobox gene family. The Ka/Ks ratio of duplicated homeobox genes in soybean showed that several members of this family have undergone purifying selection. Moreover, expression profiling indicated that duplicated genes might have been retained due to sub-functionalization. The genome-wide identification and comprehensive gene expression profiling of homeobox gene family members in legumes will provide opportunities for functional analysis to unravel their exact role in plant growth and development.

  2. Identification of IGF1, SLC4A4, WWOX, and SFMBT1 as hypertension susceptibility genes in Han Chinese with a genome-wide gene-based association study.

    Directory of Open Access Journals (Sweden)

    Hsin-Chou Yang

    Full Text Available Hypertension is a complex disorder with high prevalence rates all over the world. We conducted the first genome-wide gene-based association scan for hypertension in a Han Chinese population. By analyzing genome-wide single-nucleotide-polymorphism data of 400 matched pairs of young-onset hypertensive patients and normotensive controls genotyped with the Illumina HumanHap550-Duo BeadChip, 100 susceptibility genes for hypertension were identified and also validated with permutation tests. Seventeen of the 100 genes exhibited differential allelic and expression distributions between patient and control groups. These genes provided a good molecular signature for classifying hypertensive patients and normotensive controls. Among the 17 genes, IGF1, SLC4A4, WWOX, and SFMBT1 were not only identified by our gene-based association scan and gene expression analysis but were also replicated by a gene-based association analysis of the Hong Kong Hypertension Study. Moreover, cis-acting expression quantitative trait loci associated with the differentially expressed genes were found and linked to hypertension. IGF1, which encodes insulin-like growth factor 1, is associated with cardiovascular disorders, metabolic syndrome, decreased body weight/size, and changes of insulin levels in mice. SLC4A4, which encodes the electrogenic sodium bicarbonate cotransporter 1, is associated with decreased body weight/size and abnormal ion homeostasis in mice. WWOX, which encodes the WW domain-containing protein, is related to hypoglycemia and hyperphosphatemia. SFMBT1, which encodes the scm-like with four MBT domains protein 1, is a novel hypertension gene. GRB14, TMEM56 and KIAA1797 exhibited highly significant differential allelic and expressed distributions between hypertensive patients and normotensive controls. GRB14 was also found relevant to blood pressure in a previous genetic association study in East Asian populations. TMEM56 and KIAA1797 may be specific to

  3. [The application of genome editing in identification of plant gene function and crop breeding].

    Science.gov (United States)

    Xiangchun, Zhou; Yongzhong, Xing

    2016-03-01

    Plant genomes can be modified with current biotechnology with high specificity and excellent efficiency. Zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALEN) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) system are the key engineered nucleases used in genome editing. Genome editing techniques enable targeted gene mutagenesis, gene knock-out, and gene insertion or replacement at target sites during the endogenous DNA repair processes, including non-homologous end joining (NHEJ) and homologous recombination (HR), that are triggered by the induction of a DNA double-strand break (DSB). Genome editing has been successfully applied to the genome modification of diverse plant species, such as Arabidopsis thaliana, Oryza sativa, and Nicotiana tabacum. In this review, we summarize the application of genome editing in the identification of plant gene function and in crop breeding. We also discuss aspects of genome editing that need improvement for the precise genetic improvement of crops.

  4. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions

    Science.gov (United States)

    Zhang, Yan-Cong; Lin, Kui

    2015-01-01

    Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms. PMID:26715828

  5. Genome-Wide Association Studies Candidate Gene to Dual Modifier of Nonalcoholic Steatohepatitis and Atherosclerosis

    Directory of Open Access Journals (Sweden)

    Clint L. Miller, PhD

    2016-12-01

    Full Text Available Nonalcoholic steatohepatitis is a common disease involving chronic accumulation of fat and inflammation in the liver, often leading to advanced fibrosis, cirrhosis, and cancer. It is known that nonalcoholic steatohepatitis shares many features with atherosclerosis; however, there are still no effective therapeutics. In a recent study published in Nature, investigators demonstrated that mice lacking a high-density lipoprotein–associated gene were surprisingly protected from both steatohepatitis and atherosclerosis through the stabilization of the liver X receptor. This work reveals a timely candidate target for 2 highly prevalent cardiovascular diseases.

  6. Alpha tubulin genes from Leishmania braziliensis: genomic organization, gene structure and insights on their expression.

    Science.gov (United States)

    Ramírez, César A; Requena, José M; Puerta, Concepción J

    2013-07-06

    Alpha tubulin is a fundamental component of the cytoskeleton which is responsible for cell shape and is involved in cell division, ciliary and flagellar motility and intracellular transport. Alpha tubulin gene expression varies according to the morphological changes undergone by Leishmania during its life cycle. However, studying the mechanisms responsible for this differential expression has proved difficult because of the complex genome organization of tubulin genes and the non-conventional mechanisms of gene regulation operating in Leishmania. We started this work by analyzing the genomic organization of α-tubulin genes in the Leishmania braziliensis genome database. The genomic organization of L. braziliensis α-tubulin genes differs from that existing in the L. major and L. infantum genomes. Two loci containing α-tubulin genes were found in chromosomes 13 and 29, even though the existence of sequence gaps prevents determination of the exact number of genes at each locus. Southern blot assays showed that the α-tubulin locus at chromosome 13 contains at least 8 gene copies, which are tandemly organized with a 2.08-kb repetition unit; the locus at chromosome 29 seems to contain a sole α-tubulin gene. In addition, it was found that the L. braziliensis α-tubulin locus at chromosome 13 contains two types of α-tubulin genes differing in their 3' UTR, each one presumably containing different regulatory motifs. It was also determined that the mRNA expression levels of these genes are controlled by post-transcriptional mechanisms tightly linked to the growth temperature. Moreover, the decrease in the α-tubulin mRNA abundance observed when promastigotes were cultured at 35°C was accompanied by parasite morphology alterations, similar to those occurring during the promastigote to amastigote differentiation. Information found in the genome databases indicates that α-tubulin genes have been reorganized in a drastic manner along Leishmania

  7. Genome-wide annotation, expression profiling, and protein interaction studies of the core cell-cycle genes in Phalaenopsis aphrodite.

    Science.gov (United States)

    Lin, Hsiang-Yin; Chen, Jhun-Chen; Wei, Miao-Ju; Lien, Yi-Chen; Li, Huang-Hsien; Ko, Swee-Suak; Liu, Zin-Huang; Fang, Su-Chiung

    2014-01-01

    Orchidaceae is one of the most abundant and diverse families in the plant kingdom and its unique developmental patterns have drawn the attention of many evolutionary biologists. Particular areas of interest have included the co-evolution of pollinators and distinct floral structures, and symbiotic relationships with mycorrhizal flora. However, comprehensive studies to decipher the molecular basis of growth and development in orchids remain scarce. Cell proliferation governed by cell-cycle regulation is fundamental to growth and development of the plant body. We took advantage of recently released transcriptome information to systematically isolate and annotate the core cell-cycle regulators in the moth orchid Phalaenopsis aphrodite. Our data verified that Phalaenopsis cyclin-dependent kinase A (CDKA) is an evolutionarily conserved CDK. Expression profiling studies suggested that core cell-cycle genes functioning during the G1/S, S, and G2/M stages were preferentially enriched in the meristematic tissues that have high proliferation activity. In addition, subcellular localization and pairwise interaction analyses of various combinations of CDKs and cyclins, and of E2 promoter-binding factors and dimerization partners confirmed interactions of the functional units. Furthermore, our data showed that expression of the core cell-cycle genes was coordinately regulated during pollination-induced reproductive development. The data obtained establish a fundamental framework for study of the cell-cycle machinery in Phalaenopsis orchids.

  8. Global Metabolic Reconstruction and Metabolic Gene Evolution in the Cattle Genome.

    Science.gov (United States)

    Kim, Woonsu; Park, Hyesun; Seo, Seongwon

    2016-01-01

    The sequence of the cattle genome provided a valuable opportunity to systematically link genetic and metabolic traits of cattle. The objectives of this study were 1) to reconstruct genome-scale cattle-specific metabolic pathways based on the most recent and updated cattle genome build and 2) to identify duplicated metabolic genes in the cattle genome for better understanding of metabolic adaptations in cattle. A bioinformatic pipeline for amalgamating an organism's genomic annotations from multiple sources was updated. Using this, an amalgamated cattle genome database based on UMD_3.1 was created. The amalgamated cattle genome database is composed of a total of 33,292 genes: 19,123 consensus genes between NCBI and Ensembl databases, 8,410 and 5,493 genes only found in NCBI or Ensembl, respectively, and 266 genes from NCBI scaffolds. A metabolic reconstruction of the cattle genome and cattle pathway genome database (PGDB) was also developed using Pathway Tools, followed by an intensive manual curation. The manual curation filled or revised 68 pathway holes, deleted 36 metabolic pathways, and added 23 metabolic pathways. Consequently, the curated cattle PGDB contains 304 metabolic pathways, 2,460 reactions including 2,371 enzymatic reactions, and 4,012 enzymes. Furthermore, this study identified eight duplicated genes in 12 metabolic pathways in the cattle genome compared to human and mouse. Some of these duplicated genes are related to specific hormone biosynthesis and detoxification. The updated genome-scale metabolic reconstruction is a useful tool for understanding biology and metabolic characteristics in cattle. There have been significant improvements in the quality of cattle genome annotations and the MetaCyc database. The duplicated metabolic genes in the cattle genome compared to human and mouse imply evolutionary changes in the cattle genome and provide useful information for further research on understanding metabolic adaptations of cattle.

  9. A replication study for genome-wide gene expression levels in two layer lines elucidates differentially expressed genes of pathways involved in bone remodeling and immune responsiveness.

    Directory of Open Access Journals (Sweden)

    Christin Habig

    Full Text Available The current replication study confirmed significant differences in gene expression profiles of the cerebrum among the two commercial layer lines Lohmann Selected Leghorn (LSL) and Lohmann Brown (LB). Microarray analyses were performed for 30 LSL and another 30 LB laying hens kept in the small group housing system Eurovent German. A total of 14,103 microarray probe sets using customized Affymetrix ChiGene-1_0-st Arrays with 20,399 probe sets were differentially expressed among the two layer lines LSL and LB (FDR adjusted P-value <0.05). An at least 2-fold change in expression levels could be observed for 388 of these probe sets. In LSL, 214 of the 388 probe sets were down- and 174 were up-regulated and vice versa for the LB layer line. Among the 174 up-regulated probe sets in LSL, we identified 51 significantly enriched Gene ontology (GO) terms of the biological process category. A total of 63 enriched GO-terms could be identified for the 214 down-regulated probe sets of the layer line LSL. We identified nine genes significantly differentially expressed between the two layer lines in both microarray experiments. These genes play a crucial role in protection of neuronal cells from oxidative stress, bone mineral density and immune response among the two layer lines LSL and LB. Thus, the different regulation of these genes may significantly contribute to phenotypic trait differences among these layer lines. In conclusion, these novel findings provide a basis for further research to improve animal welfare in laying hens and these layer lines may be of general interest as an animal model.
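    The two selection criteria named in this record, an FDR-adjusted P-value below 0.05 and an at least 2-fold change, can be sketched as follows. The Benjamini-Hochberg procedure is assumed here as the FDR method because it is the common default; the abstract does not name the procedure, and the probe-set values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_probe_sets(log2_fold_changes, p_values, alpha=0.05, min_abs_log2fc=1.0):
    """Indices of probe sets passing both the FDR and the fold-change filter.

    A 2-fold change in either direction corresponds to |log2 FC| >= 1.
    """
    adjusted = benjamini_hochberg(p_values)
    return [
        i
        for i, (lfc, q) in enumerate(zip(log2_fold_changes, adjusted))
        if q < alpha and abs(lfc) >= min_abs_log2fc
    ]


# Hypothetical values for four probe sets.
log2_fc = [1.5, 2.0, -0.2, 3.0]
raw_p = [0.01, 0.04, 0.03, 0.5]
print(benjamini_hochberg(raw_p))
print(select_probe_sets(log2_fc, raw_p))
```

    The sign of the log2 fold change then separates the selected probe sets into up- and down-regulated groups for one line relative to the other.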

  10. Genome-wide association study identifies nox3 as a critical gene for susceptibility to noise-induced hearing loss.

    Directory of Open Access Journals (Sweden)

    Joel Lavinsky

    2015-04-01

    Full Text Available In the United States, roughly 10% of the population is exposed daily to hazardous levels of noise in the workplace. Twin studies estimate heritability for noise-induced hearing loss (NIHL) of approximately 36%, and strain specific variation in sensitivity has been demonstrated in mice. Based upon the difficulties inherent to the study of NIHL in humans, we have turned to the study of this complex trait in mice. We exposed 5 week-old mice from the Hybrid Mouse Diversity Panel (HMDP) to a 10 kHz octave band noise at 108 dB for 2 hours and assessed the permanent threshold shift 2 weeks post exposure using frequency specific stimuli. These data were then used in a genome-wide association study (GWAS) using the Efficient Mixed Model Analysis (EMMA) to control for population structure. In this manuscript we describe our GWAS, with an emphasis on a significant peak for susceptibility to NIHL on chromosome 17 within a haplotype block containing NADPH oxidase-3 (Nox3). Our peak was detected after an 8 kHz tone burst stimulus. Nox3 mutants and heterozygotes were then tested to validate our GWAS. The mutants and heterozygotes demonstrated a greater susceptibility to NIHL specifically at 8 kHz both on measures of distortion product otoacoustic emissions (DPOAE) and on auditory brainstem response (ABR). We demonstrate that this sensitivity resides within the synaptic ribbons of the cochlea in the mutant animals specifically at 8 kHz. Our work is the first GWAS for NIHL in mice and elucidates the power of our approach to identify tonotopic genetic susceptibility to NIHL.

  11. Quantification of amylose, amylopectin and β-glucan in the search for genes controlling the three major quality traits in barley using genome-wide association studies

    Directory of Open Access Journals (Sweden)

    Soren K Rasmussen

    2014-05-01

    Full Text Available Genome-wide association studies (GWAS) for amylose, amylopectin and β-glucan concentration in a collection of 254 European spring barley varieties identified 20, 17 and 21 single nucleotide polymorphism (SNP) markers, respectively, associated with these important grain quality traits. Negative correlations between the content of amylose and β-glucan (R=-0.62, P<0.01) and amylopectin and β-glucan (R=-0.487, P<0.01) were found in this large collection of spring barley varieties. Besides HvCslF6, amo1 and AGPL2, sex6 and waxy were identified among the major genes responsible for β-glucan, amylose and amylopectin content, respectively. Several minor genes like HvGSL4, HvGSL3 and HvCesA6, PWD were also detected by GWAS for the first time. Furthermore, the gene encoding β-fructofuranosidase, located on the short arm of chromosome 7H at 1.49 cM, and SRF6, encoding ‘leucine-rich repeat receptor kinase protein’ on chromosome 2H, are proposed to be new candidate genes for amylopectin formation in barley endosperm. Several of the associated SNPs on chromosomes 1H, 5H, 6H and 7H mapped to overlapping regions containing QTLs and genes controlling the three grain constituents. In particular, chromosomes 5H and 7H carry many QTLs controlling barley grain quality. Amylose, amylopectin and β-glucan interact with one another through a metabolic network connected by UDP, showing pleiotropic effects. Taken together, these results show that cereal quality traits are related to each other and regulated through an interaction network; the identified major genes and genetic regions for amylose, amylopectin and β-glucan are helpful for further research on carbohydrates and barley breeding.

  12. Comparison of methods for genomic localization of gene trap sequences

    Directory of Open Access Journals (Sweden)

    Ferrin Thomas E

    2006-09-01

    Full Text Available Abstract Background Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. Results In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.

  13. A MAGIC population-based genome-wide association study reveals functional association of GhRBB1_A07 gene with superior fiber quality in cotton.

    Science.gov (United States)

    Islam, Md Sariful; Thyssen, Gregory N; Jenkins, Johnie N; Zeng, Linghe; Delhom, Christopher D; McCarty, Jack C; Deng, Dewayne D; Hinchliffe, Doug J; Jones, Don C; Fang, David D

    2016-11-09

    Cotton supplies a great majority of natural fiber for the global textile industry. The negative correlation between yield and fiber quality has hindered breeders' ability to improve these traits simultaneously. A multi-parent advanced generation inter-cross (MAGIC) population developed through random-mating of multiple diverse parents has the ability to break this negative correlation. Genotyping-by-sequencing (GBS) is a method that can rapidly identify and genotype a large number of single nucleotide polymorphisms (SNP). Genotyping a MAGIC population using GBS technologies will enable us to identify marker-trait associations with high resolution. An Upland cotton MAGIC population was developed through random-mating of 11 diverse cultivars for five generations. In this study, fiber quality data obtained from four environments and 6071 SNP markers generated via GBS and 223 microsatellite markers of 547 recombinant inbred lines (RILs) of the MAGIC population were used to conduct a genome wide association study (GWAS). By employing a mixed linear model, GWAS enabled us to identify markers significantly associated with fiber quantitative trait loci (QTL). We identified and validated one QTL cluster associated with four fiber quality traits [short fiber content (SFC), strength (STR), length (UHM) and uniformity (UI)] on chromosome A07. We further identified candidate genes related to fiber quality attributes in this region. Gene expression and amino acid substitution analysis suggested that a regeneration of bulb biogenesis 1 (GhRBB1_A07) gene is a candidate for superior fiber quality in Upland cotton. The DNA marker CFBid0004 designed from an 18 bp deletion in the coding sequence of GhRBB1_A07 in Acala Ultima is associated with the improved fiber quality in the MAGIC RILs and 105 additional commercial Upland cotton cultivars. Using GBS and a MAGIC population enabled more precise fiber QTL mapping in Upland cotton. The fiber QTL and associated markers identified in

  14. Whole genome amplification of DNA for genotyping pharmacogenetics candidate genes.

    Directory of Open Access Journals (Sweden)

    Santosh ePhilips

    2012-03-01

    Full Text Available Whole genome amplification (WGA) technologies can be used to amplify genomic DNA when only small amounts of DNA are available. The Multiple Displacement Amplification Phi polymerase based amplification has been shown to accurately amplify DNA for a variety of genotyping assays; however, it has not been tested for genotyping many of the clinically relevant genes important for pharmacogenetic studies, such as the cytochrome P450 genes, that are typically difficult to genotype due to multiple pseudogenes, copy number variations, and high similarity to other related genes. We evaluated whole genome amplified samples for Taqman™ genotyping of SNPs in a variety of pharmacogenetic genes. In 24 DNA samples from the Coriell human diversity panel, the call rates and concordance between amplified (~200-fold amplification) and unamplified samples were 100% for two SNPs in CYP2D6 and one in ESR1. In samples from a breast cancer clinical trial (Trial 1), we compared the genotyping results in samples before and after WGA for four SNPs in CYP2D6, one SNP in CYP2C19, one SNP in CYP19A1, two SNPs in ESR1, and two SNPs in ESR2. The concordance rates were all >97%. Finally, we compared the allele frequencies of 143 SNPs determined in Trial 1 (whole genome amplified DNA) to the allele frequencies determined in unamplified DNA samples from a separate trial (Trial 2) that enrolled a similar population. The call rates and allele frequencies between the two trials were 98% and 99.7%, respectively. We conclude that the whole genome amplified DNA is suitable for Taqman™ genotyping for a wide variety of pharmacogenetically relevant SNPs.
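    The call rate and concordance figures quoted in this record can be illustrated with a minimal sketch. It assumes genotypes are stored as strings with `None` for a failed assay, and that concordance is counted only over samples called in both runs; the function names and the sample genotypes are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence

Genotype = Optional[str]  # e.g. "AG"; None marks a failed (no-call) assay


def call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of samples that produced a genotype call."""
    if not calls:
        raise ValueError("no samples")
    return sum(c is not None for c in calls) / len(calls)


def concordance(before: Sequence[Genotype], after: Sequence[Genotype]) -> float:
    """Fraction of samples called in both runs whose genotypes agree."""
    if len(before) != len(after):
        raise ValueError("runs must cover the same samples in the same order")
    # Keep only samples that were successfully called in both runs.
    paired = [(b, a) for b, a in zip(before, after)
              if b is not None and a is not None]
    if not paired:
        raise ValueError("no sample was called in both runs")
    return sum(b == a for b, a in paired) / len(paired)


# Hypothetical genotypes for one SNP in four samples (not data from the study).
unamplified = ["AG", "GG", "AA", None]
amplified = ["AG", "GG", "AG", "AA"]

print(call_rate(unamplified))
print(call_rate(amplified))
print(concordance(unamplified, amplified))
```

    Restricting concordance to samples called in both runs keeps assay failures from being counted as disagreements; whether the study used the same convention is not stated in the abstract.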

  15. Genome-Wide Associations of Gene Expression Variation in Humans.

    Directory of Open Access Journals (Sweden)

    2005-12-01

    Full Text Available The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs) with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis-) to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I) HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.

  16. Genome-wide associations of gene expression variation in humans.

    Directory of Open Access Journals (Sweden)

    Barbara E Stranger

    2005-12-01

    Full Text Available The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs) with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis-) to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I) HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.
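    Two of the three multiple-test corrections named in this abstract, Bonferroni and false discovery rate, can be sketched in a few lines. The sketch below assumes the Benjamini-Hochberg step-up procedure as the false discovery rate method, which the abstract does not specify; the permutation approach is not shown. The function names and p-values are hypothetical.

```python
from typing import List, Sequence


def bonferroni(pvalues: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Reject a test only if its p-value is at most alpha / (number of tests)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up procedure (assumed FDR method)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every test ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical SNP-expression association p-values (not from the study).
pvals = [0.039, 0.001, 0.6, 0.012, 0.025]
print(bonferroni(pvals))
print(benjamini_hochberg(pvals))
```

    On these made-up values Bonferroni keeps a single association while the step-up procedure keeps four, which reflects the usual trade-off: Bonferroni controls the chance of any false positive, whereas the false discovery rate controls the expected proportion of false positives among the reported hits.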

  17. Genome-wide and follow-up studies identify CEP68 gene variants associated with risk of aspirin-intolerant asthma.

    Science.gov (United States)

    Kim, Jeong-Hyun; Park, Byung-Lae; Cheong, Hyun Sub; Bae, Joon Seol; Park, Jong Sook; Jang, An Soo; Uh, Soo-Taek; Choi, Jae-Sung; Kim, Yong-Hoon; Kim, Mi-Kyeong; Choi, Inseon S; Cho, Sang Heon; Choi, Byoung Whui; Park, Choon-Sik; Shin, Hyoung Doo

    2010-11-03

    Aspirin-intolerant asthma (AIA) is a rare condition that is characterized by the development of bronchoconstriction in asthmatic patients after ingestion of non-steroidal anti-inflammatory drugs including aspirin. However, the underlying mechanisms of AIA occurrence are still not fully understood. To identify the genetic variations associated with aspirin intolerance in asthmatics, the first stage of genome-wide association study with 109,365 single nucleotide polymorphisms (SNPs) was undertaken in a Korean AIA (n = 80) cohort and aspirin-tolerant asthma (ATA, n = 100) subjects as controls. For the second stage of follow-up study, 150 common SNPs from 11 candidate genes were genotyped in 163 AIA patients including intermediate AIA (AIA-I) subjects and 429 ATA controls. Among 11 candidate genes, multivariate logistic analyses showed that SNPs of CEP68 gene showed the most significant association with aspirin intolerance (P values of co-dominant for CEP68, 6.0×10(-5) to 4.0×10(-5)). All seven SNPs of the CEP68 gene showed linkage disequilibrium (LD), and the haplotype of CEP68_ht4 (T-G-A-A-A-C-G) showed a highly significant association with aspirin intolerance (OR= 2.63; 95% CI= 1.64-4.21; P = 6.0×10(-5)). Moreover, the nonsynonymous CEP68 rs7572857G>A variant that replaces glycine with serine showed a higher decline of forced expiratory volume in 1s (FEV(1)) by aspirin provocation than other variants (P = 3.0×10(-5)). Our findings imply that CEP68 could be a susceptible gene for aspirin intolerance in asthmatics, suggesting that the nonsynonymous Gly74Ser could affect the polarity of the protein structure.

  18. Genome-wide and follow-up studies identify CEP68 gene variants associated with risk of aspirin-intolerant asthma.

    Directory of Open Access Journals (Sweden)

    Jeong-Hyun Kim

    Full Text Available Aspirin-intolerant asthma (AIA) is a rare condition that is characterized by the development of bronchoconstriction in asthmatic patients after ingestion of non-steroidal anti-inflammatory drugs including aspirin. However, the underlying mechanisms of AIA occurrence are still not fully understood. To identify the genetic variations associated with aspirin intolerance in asthmatics, the first stage of genome-wide association study with 109,365 single nucleotide polymorphisms (SNPs) was undertaken in a Korean AIA (n = 80) cohort and aspirin-tolerant asthma (ATA, n = 100) subjects as controls. For the second stage of follow-up study, 150 common SNPs from 11 candidate genes were genotyped in 163 AIA patients including intermediate AIA (AIA-I) subjects and 429 ATA controls. Among 11 candidate genes, multivariate logistic analyses showed that SNPs of CEP68 gene showed the most significant association with aspirin intolerance (P values of co-dominant for CEP68, 6.0×10(-5) to 4.0×10(-5)). All seven SNPs of the CEP68 gene showed linkage disequilibrium (LD), and the haplotype of CEP68_ht4 (T-G-A-A-A-C-G) showed a highly significant association with aspirin intolerance (OR= 2.63; 95% CI= 1.64-4.21; P = 6.0×10(-5)). Moreover, the nonsynonymous CEP68 rs7572857G>A variant that replaces glycine with serine showed a higher decline of forced expiratory volume in 1s (FEV(1)) by aspirin provocation than other variants (P = 3.0×10(-5)). Our findings imply that CEP68 could be a susceptible gene for aspirin intolerance in asthmatics, suggesting that the nonsynonymous Gly74Ser could affect the polarity of the protein structure.

  19. GeneViTo: Visualizing gene-product functional and structural features in genomic datasets

    Directory of Open Access Journals (Sweden)

    Promponas Vasilis J

    2003-10-01

    Full Text Available Abstract Background The availability of increasing amounts of sequence data from completely sequenced genomes boosts the development of new computational methods for automated genome annotation and comparative genomics. Therefore, there is a need for tools that facilitate the visualization of raw data and results produced by bioinformatics analysis, providing new means for interactive genome exploration. Visual inspection can be used as a basis to assess the quality of various analysis algorithms and to aid in-depth genomic studies. Results GeneViTo is a JAVA-based computer application that serves as a workbench for genome-wide analysis through visual interaction. The application deals with various experimental information concerning both DNA and protein sequences (derived from public sequence databases or proprietary data sources) and meta-data obtained by various prediction algorithms, classification schemes or user-defined features. Interaction with a Graphical User Interface (GUI) allows easy extraction of genomic and proteomic data referring to the sequence itself, sequence features, or general structural and functional features. Emphasis is laid on the potential comparison between annotation and prediction data in order to offer a supplement to the provided information, especially in cases of "poor" annotation, or an evaluation of available predictions. Moreover, desired information can be output in high quality JPEG image files for further elaboration and scientific use. A compilation of properly formatted GeneViTo input data for demonstration is available to interested readers for two completely sequenced prokaryotes, Chlamydia trachomatis and Methanococcus jannaschii. Conclusions GeneViTo offers an inspectional view of genomic functional elements, concerning data stemming both from database annotation and analysis tools for an overall analysis of existing genomes. The application is compatible with Linux or Windows ME-2000-XP operating

  20. Insular organization of gene space in grass genomes.

    Science.gov (United States)

    Gottlieb, Andrea; Müller, Hans-Georg; Massa, Alicia N; Wanjugi, Humphrey; Deal, Karin R; You, Frank M; Xu, Xiangyang; Gu, Yong Q; Luo, Ming-Cheng; Anderson, Olin D; Chan, Agnes P; Rabinowicz, Pablo; Devos, Katrien M; Dvorak, Jan

    2013-01-01

    Wheat and maize genes were hypothesized to be clustered into islands but the hypothesis was not statistically tested. The hypothesis is statistically tested here in four grass species differing in genome size, Brachypodium distachyon, Oryza sativa, Sorghum bicolor, and Aegilops tauschii. Density functions obtained under a model where gene locations follow a homogeneous Poisson process and thus are not clustered are compared with a model-free situation quantified through a non-parametric density estimate. A simple homogeneous Poisson model for gene locations is not rejected for the small O. sativa and B. distachyon genomes, indicating that genes are distributed largely uniformly in those species, but is rejected for the larger S. bicolor and Ae. tauschii genomes, providing evidence for clustering of genes into islands. It is proposed to call the gene islands "gene insulae" to distinguish them from other types of gene clustering that have been proposed. An average S. bicolor and Ae. tauschii insula is estimated to contain 3.7 and 3.9 genes with an average intergenic distance within an insula of 2.1 and 16.5 kb, respectively. Inter-insular distances are greater than 8 and 81 kb and average 15.1 and 205 kb, in S. bicolor and Ae. tauschii, respectively. A greater gene density observed in the distal regions of the Ae. tauschii chromosomes is shown to be primarily caused by shortening of inter-insular distances. The comparison of the four grass genomes suggests that gene locations are largely a function of a homogeneous Poisson process in small genomes. Nonrandom insertions of LTR retroelements during genome expansion creates gene insulae, which become less dense and further apart with the increase in genome size. High concordance in relative lengths of orthologous intergenic distances among the investigated genomes including the maize genome suggests functional constraints on gene distribution in the grass genomes.
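    The null model in this abstract has a simple consequence that can be checked numerically: if gene locations follow a homogeneous Poisson process, the gaps between neighbouring genes are exponentially distributed, so their coefficient of variation (standard deviation divided by mean) is close to 1. The sketch below uses that coefficient as a rough clustering indicator. It is a simplified stand-in and not the density-function comparison the authors describe; the function names and all positions are hypothetical.

```python
import itertools
import random
import statistics
from typing import List, Sequence


def intergenic_gaps(positions: Sequence[float]) -> List[float]:
    """Distances between consecutive gene positions along one chromosome."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def gap_cv(positions: Sequence[float]) -> float:
    """Coefficient of variation of the gaps: near 1 under a homogeneous
    Poisson process, above 1 when genes are clustered, below 1 when they
    are spaced more evenly than random."""
    gaps = intergenic_gaps(positions)
    if len(gaps) < 2:
        raise ValueError("need at least three gene positions")
    return statistics.pstdev(gaps) / statistics.mean(gaps)


# Simulated Poisson-process gene positions (made-up mean gap of 10 kb).
rng = random.Random(0)
poisson_positions = list(
    itertools.accumulate(rng.expovariate(1 / 10.0) for _ in range(5000))
)

evenly_spaced = [0, 10, 20, 30]
clustered = [0, 1, 2, 100, 101, 102]  # two tight groups far apart

print(gap_cv(poisson_positions))
print(gap_cv(evenly_spaced))
print(gap_cv(clustered))
```

    A single summary number cannot replace a formal test, but it shows the direction of the effect reported for the larger genomes: tight groups of genes separated by long stretches push the variability of the gaps above what a homogeneous Poisson process would produce.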

  1. Insular organization of gene space in grass genomes.

    Directory of Open Access Journals (Sweden)

    Andrea Gottlieb

    Full Text Available Wheat and maize genes were hypothesized to be clustered into islands but the hypothesis was not statistically tested. The hypothesis is statistically tested here in four grass species differing in genome size, Brachypodium distachyon, Oryza sativa, Sorghum bicolor, and Aegilops tauschii. Density functions obtained under a model where gene locations follow a homogeneous Poisson process and thus are not clustered are compared with a model-free situation quantified through a non-parametric density estimate. A simple homogeneous Poisson model for gene locations is not rejected for the small O. sativa and B. distachyon genomes, indicating that genes are distributed largely uniformly in those species, but is rejected for the larger S. bicolor and Ae. tauschii genomes, providing evidence for clustering of genes into islands. It is proposed to call the gene islands "gene insulae" to distinguish them from other types of gene clustering that have been proposed. An average S. bicolor and Ae. tauschii insula is estimated to contain 3.7 and 3.9 genes with an average intergenic distance within an insula of 2.1 and 16.5 kb, respectively. Inter-insular distances are greater than 8 and 81 kb and average 15.1 and 205 kb, in S. bicolor and Ae. tauschii, respectively. A greater gene density observed in the distal regions of the Ae. tauschii chromosomes is shown to be primarily caused by shortening of inter-insular distances. The comparison of the four grass genomes suggests that gene locations are largely a function of a homogeneous Poisson process in small genomes. Nonrandom insertions of LTR retroelements during genome expansion creates gene insulae, which become less dense and further apart with the increase in genome size. High concordance in relative lengths of orthologous intergenic distances among the investigated genomes including the maize genome suggests functional constraints on gene distribution in the grass genomes.

  2. Comparative genomics of Neisseria meningitidis: core genome, islands of horizontal transfer and pathogen-specific genes.

    Science.gov (United States)

    Dunning Hotopp, Julie C; Grifantini, Renata; Kumar, Nikhil; Tzeng, Yih Ling; Fouts, Derrick; Frigimelica, Elisabetta; Draghi, Monia; Giuliani, Marzia Monica; Rappuoli, Rino; Stephens, David S; Grandi, Guido; Tettelin, Hervé

    2006-12-01

    To better understand Neisseria meningitidis genomes and virulence, microarray comparative genome hybridization (mCGH) data were collected from one Neisseria cinerea, two Neisseria lactamica, two Neisseria gonorrhoeae and 48 Neisseria meningitidis isolates. For N. meningitidis, these isolates are from diverse clonal complexes, invasive and carriage strains, and all major serogroups. The microarray platform represented N. meningitidis strains MC58, Z2491 and FAM18, and N. gonorrhoeae FA1090. By comparing hybridization data to genome sequences, the core N. meningitidis genome and insertions/deletions (e.g. capsule locus, type I secretion system) related to pathogenicity were identified, including further characterization of the capsule locus, bioinformatics analysis of a type I secretion system, and identification of some metabolic pathways associated with intracellular survival in pathogens. Hybridization data clustered meningococcal isolates from similar clonal complexes that were distinguished by the differential presence of six distinct islands of horizontal transfer. Several of these islands contained prophage or other mobile elements, including a novel prophage and a transposon carrying portions of a type I secretion system. Acquisition of some genetic islands appears to have occurred in multiple lineages, including transfer between N. lactamica and N. meningitidis. However, island acquisition occurs infrequently, such that the genomic-level relationship is not obscured within clonal complexes. The N. meningitidis genome is characterized by the horizontal acquisition of multiple genetic islands; the study of these islands reveals important sets of genes varying between isolates and likely to be related to pathogenicity.

  3. Genome-wide study reveals an important role of spontaneous autoimmunity, cardiomyocyte differentiation defect and antiangiogenic activities in gender-specific gene expression in Keshan disease

    Institute of Scientific and Technical Information of China (English)

    He Shulan; Tan Wuhong; Wang Sen; Wu Cuiyan; Wang Pan; Wang Bin; Su Xiaohui

    2014-01-01

    Background Keshan disease (KD) is an endemic cardiomyopathy in China. The etiology of KD is still under debate and there is no effective approach to preventing and curing this disease. Young women of child-bearing age are the most frequent victims in rural areas. The aim of this study was to determine the differences between molecular pathogenic mechanisms in male and female KD sufferers. Methods We extracted RNA from the peripheral blood mononuclear cells of KD patients (12 women and 4 men) and controls (12 women and 4 men). Then the isolated RNA was amplified, labeled and hybridized to Agilent human 4×44k whole genome microarrays. Gene expression was examined using oligonucleotide microarray analysis. A quantitative polymerase chain reaction assay was also performed to validate our microarray results. Results Among the genes differentially expressed in female KD patients we identified: HLA-DOA, HLA-DRA, and HLA-DQA1 associated with spontaneous autoimmunity; BMP5 and BMP7, involved in cardiomyocyte differentiation defect; and ADAMTS 8, CCL23, and TNFSF15, implicated in anti-angiogenic activities. These genes are involved in the canonical pathways and networks recognized for the female KD sufferers and might be related to the pathogenic mechanism of KD. Conclusion Our results might help to explain the higher susceptibility of women to this disease.

  4. The complete mitochondrial genome sequence and gene organization of Tridentiger trigonocephalus (Gobiidae: Gobionellinae) with phylogenetic consideration.

    Science.gov (United States)

    Wei, Hongqing; Ma, Hongyu; Ma, Chunyan; Zhang, Fengying; Wang, Wei; Chen, Wei; Ma, Lingbo

    2016-09-01

    The complete mitochondrial genome plays an important role in studies of genome-level characteristics and phylogenetic relationships. Here we determined the complete mitogenome sequence of Tridentiger trigonocephalus (Perciformes, Gobiidae), and discovered its phylogenetic relationship. This circular genome was 16 662 bp in length, and consisted of 37 typical genes, including 13 protein-coding genes, 22 tRNA genes, and two rRNA genes. The gene order of T. trigonocephalus mitochondrial genome was identical to those observed in most other vertebrates. Of 37 genes, 28 were encoded by heavy strand, while the others were encoded by light strand. The phylogenetic tree constructed by 13 concatenated protein-coding genes showed that T. trigonocephalus was closest to T. bifasciatus, and then to T. barbatus among the 20 species within suborder Gobioidei. This work should facilitate the studies on population genetic diversity, and molecular evolution in Gobioidei fishes.

  5. Evaluation of candidate nephropathy susceptibility genes in a genome-wide association study of African American diabetic kidney disease.

    Directory of Open Access Journals (Sweden)

    Nicholette D Palmer

    Full Text Available Type 2 diabetes (T2D)-associated end-stage kidney disease (ESKD) is a complex disorder resulting from the combined influence of genetic and environmental factors. This study contains a comprehensive genetic analysis of putative nephropathy loci in 965 African American (AA) cases with T2D-ESKD and 1029 AA population-based controls extending prior findings. Analysis was based on 4,341 directly genotyped and imputed single nucleotide polymorphisms (SNPs) in 22 nephropathy candidate genes. After admixture adjustment and correction for multiple comparisons, 37 SNPs across eight loci were significantly associated (1.6E-05 [...] genes is shared across populations of African and European ancestry.

  6. Classical Oncogenes and Tumor Suppressor Genes: A Comparative Genomics Perspective

    Directory of Open Access Journals (Sweden)

    Oxana K. Pickeral

    2000-05-01

    Full Text Available We have curated a reference set of cancer-related genes and reanalyzed their sequences in the light of molecular information and resources that have become available since they were first cloned. Homology studies were carried out for human oncogenes and tumor suppressors, compared with the complete proteome of the nematode, Caenorhabditis elegans, and partial proteomes of mouse and rat and the fruit fly, Drosophila melanogaster. Our results demonstrate that simple, semi-automated bioinformatics approaches to identifying putative functionally equivalent gene products in different organisms may often be misleading. An electronic supplement to this article provides an integrated view of our comparative genomics analysis as well as mapping data, physical cDNA resources and links to published literature and reviews, thus creating a “window” into the genomes of humans and other organisms for cancer biology.

  7. Comparative genomics of the relationship between gene structure and expression

    NARCIS (Netherlands)

    Ren, X.

    2006-01-01

    The relationship between the structure of genes and their expression is a relatively new aspect of genome organization and regulation. With more genome sequences and expression data becoming available, bioinformatics approaches can help the further elucidation of the relationships between gene struc

  8. Weeding out the genes: the Arabidopsis genome project.

    Science.gov (United States)

    Martienssen, R A

    2000-05-01

    The Arabidopsis genome sequence is scheduled for completion at the end of this year (December 2000). It will be the first higher plant genome to be sequenced, and will allow a detailed comparison with bacterial, yeast and animal genomes. Already, two of the five chromosomes have been sequenced, and we have had our first glimpse of higher eukaryotic centromeres, and the structure of heterochromatin. The implications for understanding plant gene function, genome structure and genome organization are profound. In this review, the lessons learned for future genome projects are reviewed as well as a summary of the initial findings in Arabidopsis.

  9. The genome BLASTatlas - a GeneWiz extension for visualization of whole-genome homology

    DEFF Research Database (Denmark)

    Hallin, Peter Fischer; Binnewies, Tim Terence; Ussery, David

    2008-01-01

    the Clostridium tetani plasmid p88, where homologues for toxin genes can be easily visualized in other sequenced Clostridium genomes, and for a Clostridium botulinum genome, compared to 14 other Clostridium genomes. DNA structural information is also included in the atlas to visualize the DNA chromosomal context...

  10. High-Diversity Genes in the Arabidopsis Genome

    OpenAIRE

    Cork, Jennifer M.; Purugganan, Michael D.

    2005-01-01

    High-diversity genes represent an important class of loci in organismal genomes. Since elevated levels of nucleotide variation are a key component of the molecular signature for balancing selection or local adaptation, high-diversity genes may represent loci whose alleles are selectively maintained as balanced polymorphisms. Comparison of 4300 random shotgun sequence fragments of the Arabidopsis thaliana Ler ecotype genome with the whole genomic sequence of the Col-0 ecotype identified 60 gen...

  11. Regulation of methane genes and genome expression

    Energy Technology Data Exchange (ETDEWEB)

    John N. Reeve

    2009-09-09

    At the start of this project, it was known that methanogens were Archaebacteria (now Archaea) and were therefore predicted to have gene expression and regulatory systems different from Bacteria, but few of the molecular biology details were established. The goals were then to establish the structures and organizations of genes in methanogens, and to develop the genetic technologies needed to investigate and dissect methanogen gene expression and regulation in vivo. By cloning and sequencing, we established the gene and operon structures of all of the “methane” genes that encode the enzymes that catalyze methane biosynthesis from carbon dioxide and hydrogen. This work identified unique sequences in the methane gene that we designated mcrA, which encodes the largest subunit of methyl-coenzyme M reductase, that could be used to identify methanogen DNA and establish methanogen phylogenetic relationships. McrA sequences are now the accepted standard and are used extensively as hybridization probes to identify and quantify methanogens in environmental research. With the methane genes in hand, we used northern blot and then later whole-genome microarray hybridization analyses to establish how growth phase and substrate availability regulated methane gene expression in Methanobacterium thermautotrophicus ΔH (now Methanothermobacter thermautotrophicus). Isoenzymes or pairs of functionally equivalent enzymes catalyze several steps in the hydrogen-dependent reduction of carbon dioxide to methane. We established that hydrogen availability determines which of these pairs of methane genes is expressed and therefore which of the alternative enzymes is employed to catalyze methane biosynthesis under different environmental conditions. As we were unable to establish a reliable genetic system for M. thermautotrophicus, we developed in vitro transcription as an alternative system to investigate methanogen gene expression and regulation. This led to the discovery that an archaeal protein

  13. Sampling Daphnia's expressed genes: preservation, expansion and invention of crustacean genes with reference to insect genomes

    Directory of Open Access Journals (Sweden)

    Bauer Darren J

    2007-07-01

    Full Text Available Abstract Background Functional and comparative studies of insect genomes have shed light on the complement of genes, which in part, account for shared morphologies, developmental programs and life-histories. Contrasting the gene inventories of insects to those of the nematodes provides insight into the genomic changes responsible for their diversification. However, nematodes have weak relationships to insects, as each belongs to separate animal phyla. A better outgroup to distinguish lineage specific novelties would include other members of Arthropoda. For example, crustaceans are close allies to the insects (together forming Pancrustacea) and their fascinating aquatic lifestyle provides an important comparison for understanding the genetic basis of adaptations to life on land versus life in water. Results This study reports on the first characterization of cDNA libraries and sequences for the model crustacean Daphnia pulex. We analyzed 1,546 ESTs of which 1,414 represent approximately 787 nuclear genes, by measuring their sequence similarities with insect and nematode proteomes. The provisional annotation of genes is supported by expression data from microarray studies described in companion papers. Loci expected to be shared between crustaceans and insects because of their mutual biological features are identified, including genes for reproduction, regulation and cellular processes. We identify genes that are likely derived within Pancrustacea or lost within the nematodes. Moreover, lineage specific gene family expansions are identified, which suggest certain biological demands associated with their ecological setting. In particular, up to seven distinct ferritin loci are found in Daphnia compared to three in most insects. Finally, a substantial fraction of the sampled gene transcripts shares no sequence similarity with those from other arthropods. Genes functioning during development and reproduction are comparatively well conserved between

  14. Regulatory Features for Odorant Receptor Genes in the Mouse Genome.

    Science.gov (United States)

    Degl'Innocenti, Andrea; D'Errico, Anna

    2017-01-01

    The odorant receptor genes, seven transmembrane receptor genes constituting the vastest mammalian gene multifamily, are expressed monogenically and monoallelically in each sensory neuron in the olfactory epithelium. This characteristic, often referred to as the one neuron-one receptor rule, is driven by mostly uncharacterized molecular dynamics, generally named odorant receptor gene choice. Much attention has been paid by the scientific community to the identification of sequences regulating the expression of odorant receptor genes within their loci, where related genes are usually arranged in genomic clusters. A number of studies identified transcription factor binding sites on odorant receptor promoter sequences. Similar binding sites were also found on a number of enhancers that regulate in cis their transcription, but have been proposed to form interchromosomal networks. Odorant receptor gene choice seems to occur via the local removal of strongly repressive epigenetic markings, put in place during the maturation of the sensory neuron on each odorant receptor locus. Here we review the fast-changing state of the art for the study of regulatory features for odorant receptor genes.

  15. Evolution of closely linked gene pairs in vertebrate genomes

    NARCIS (Netherlands)

    Franck, E.; Hulsen, T.; Huynen, M.A.; Jong, de W.W.; Lunsen, N.H.; Madsen, O.

    2008-01-01

    The orientation of closely linked genes in mammalian genomes is not random: there are more head-to-head (h2h) gene pairs than expected. To understand the origin of this enrichment in h2h gene pairs, we have analyzed the phylogenetic distribution of gene pairs separated by less than 600 bp of interge

  16. The Aspergillus Genome Database, a curated comparative genomics resource for gene, protein and sequence information for the Aspergillus research community.

    Science.gov (United States)

    Arnaud, Martha B; Chibucos, Marcus C; Costanzo, Maria C; Crabtree, Jonathan; Inglis, Diane O; Lotia, Adil; Orvis, Joshua; Shah, Prachi; Skrzypek, Marek S; Binkley, Gail; Miyasato, Stuart R; Wortman, Jennifer R; Sherlock, Gavin

    2010-01-01

    The Aspergillus Genome Database (AspGD) is an online genomics resource for researchers studying the genetics and molecular biology of the Aspergilli. AspGD combines high-quality manual curation of the experimental scientific literature examining the genetics and molecular biology of Aspergilli, cutting-edge comparative genomics approaches to iteratively refine and improve structural gene annotations across multiple Aspergillus species, and web-based research tools for accessing and exploring the data. All of these data are freely available at http://www.aspgd.org. We welcome feedback from users and the research community at aspergillus-curator@genome.stanford.edu.

  17. Plant DNA barcoding: from gene to genome.

    Science.gov (United States)

    Li, Xiwen; Yang, Yang; Henry, Robert J; Rossetto, Maurizio; Wang, Yitao; Chen, Shilin

    2015-02-01

    DNA barcoding is currently a widely used and effective tool that enables rapid and accurate identification of plant species; however, none of the available loci work across all species. Because single-locus DNA barcodes lack adequate variations in closely related taxa, recent barcoding studies have placed high emphasis on the use of whole-chloroplast genome sequences which are now more readily available as a consequence of improving sequencing technologies. While chloroplast genome sequencing can already deliver a reliable barcode for accurate plant identification it is not yet resource-effective and does not yet offer the speed of analysis provided by single-locus barcodes to unspecialized laboratory facilities. Here, we review the development of candidate barcodes and discuss the feasibility of using the chloroplast genome as a super-barcode. We advocate a new approach for DNA barcoding that, for selected groups of taxa, combines the best use of single-locus barcodes and super-barcodes for efficient plant identification. Specific barcodes might enhance our ability to distinguish closely related plants at the species and population levels.

  18. The impact of genome triplication on tandem gene evolution in Brassica rapa

    Directory of Open Access Journals (Sweden)

    Lu eFang

    2012-11-01

    Full Text Available Whole genome duplication (WGD) and tandem duplication (TD) are both important modes of gene expansion. However, how whole genome duplication influences tandemly duplicated genes is not well studied. We used Brassica rapa, which has undergone an additional genome triplication (WGT) and shares a common ancestor with Arabidopsis thaliana, Arabidopsis lyrata and Thellungiella parvula, to investigate the impact of genome triplication on tandem gene evolution. We identified 2,137, 1,569, 1,751 and 1,135 tandem gene arrays in B. rapa, A. thaliana, A. lyrata and T. parvula respectively. Among them, 414 conserved tandem arrays are shared by the 3 species without WGT, which were also considered as existing in the diploid ancestor of B. rapa. Thus, after genome triplication, B. rapa should have 1,242 tandem arrays according to the 414 conserved tandems. Here, we found 400 out of the 414 tandems had at least one syntenic ortholog in the genome of B. rapa. Furthermore, 294 out of the 400 shared syntenic orthologs maintain tandem arrays (more than one gene for each syntenic hit) in B. rapa. For the 294 tandem arrays, we obtained 426 copies of syntenic paralogous tandems in the triplicated genome of B. rapa. In this study, we demonstrated that tandem arrays in B. rapa were dramatically fractionated after WGT when compared either to non-tandem genes in the B. rapa genome or to the tandem arrays in closely related species that have not experienced a recent whole-genome polyploidization event.
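
    The expectation quoted in the abstract above (414 conserved arrays, tripled to 1,242) can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and the ratios it derives are not reported in the abstract. In particular, the 426 observed copies refer to the 294 arrays that stayed tandem, so dividing by 1,242 gives only a rough indication of fractionation.

```python
# Counts taken from the abstract; all variable names are illustrative assumptions.
conserved_ancestral_arrays = 414  # tandem arrays shared by the three species without WGT
subgenome_count = 3               # whole-genome triplication yields three copies
arrays_with_ortholog = 400        # ancestral arrays with at least one syntenic ortholog in B. rapa
arrays_still_tandem = 294         # of those, arrays that remain tandem in B. rapa
observed_tandem_copies = 426      # syntenic paralogous tandem copies found in B. rapa

# Expected number of tandem arrays if every copy had survived triplication.
expected_copies = conserved_ancestral_arrays * subgenome_count

# Derived ratios (not stated in the abstract; computed here for illustration only).
retained_fraction = observed_tandem_copies / expected_copies
ortholog_fraction = arrays_with_ortholog / conserved_ancestral_arrays
tandem_fraction = arrays_still_tandem / arrays_with_ortholog

print(f"expected copies after WGT: {expected_copies}")
print(f"observed / expected: {retained_fraction:.1%}")
print(f"arrays with a syntenic ortholog: {ortholog_fraction:.1%}")
print(f"of those, still tandem: {tandem_fraction:.1%}")
```

    Under these assumptions, roughly a third of the copies expected after triplication are still present as tandem arrays, which is in line with the fractionation the authors describe.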

  19. Islet Amyloid Polypeptide Gene Variation (IAPP) and the Risk of Incident Type 2 Diabetes Mellitus: The Women’s Genome Health Study

    Science.gov (United States)

    Zee, Robert Y.L.; Pulido-Perez, Patricia; Perez-Fuentes, Ricardo; Ridker, Paul M; Chasman, Daniel I.; Romero, Jose R.

    2011-01-01

    Background Islet amyloid polypeptide (IAPP) gene variation has recently been implicated in type 2 diabetes mellitus (T2D). However, to date, no prospective epidemiological data are available. Methods The association between 10 IAPP tag-single nucleotide polymorphisms (tSNPs) and incident T2D was investigated in 22,715 Caucasian participants of the prospective Women’s Genome Health Study. All were free of known cardiovascular disease, cancer, and diabetes at baseline. During a 13-year follow-up period, 1,445 participants developed incident T2D. Multivariable Cox regression analysis was performed to investigate the relationship between genotypes and T2D risk. Haplotype-based analysis was also performed. Results No evidence was found for an association of any of the tSNPs tested, or haplotypes thereof, with T2D risk. Conclusions If corroborated in other large, prospective studies, the present findings further suggest that the IAPP gene locus may not be a useful predictor for T2D risk assessment. PMID:21219896

  20. Missing genes in the annotation of prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Feng Wu-chun

    2010-03-01

    Full Text Available Abstract Background Protein-coding gene detection in prokaryotic genomes is considered a much simpler problem than in intron-containing eukaryotic genomes. However, there have been reports that prokaryotic gene finder programs have problems with small genes (either over-predicting or under-predicting). Therefore the question arises as to whether current genome annotations have systematically missing, small genes. Results We have developed a high-performance computing methodology to investigate this problem. In this methodology we compare all ORFs larger than or equal to 33 aa from all fully-sequenced prokaryotic replicons. Based on that comparison, and using conservative criteria requiring a minimum taxonomic diversity between conserved ORFs in different genomes, we have discovered 1,153 candidate genes that are missing from current genome annotations. These missing genes are similar only to each other and do not have any strong similarity to gene sequences in public databases, with the implication that these ORFs belong to missing gene families. We also uncovered 38,895 intergenic ORFs, readily identified as putative genes by similarity to currently annotated genes (we call these absent annotations). The vast majority of the missing genes found are small (less than 100 aa). A comparison of select examples with GeneMark, EasyGene and Glimmer predictions yields evidence that some of these genes are escaping detection by these programs. Conclusions Prokaryotic gene finders and prokaryotic genome annotations require improvement for accurate prediction of small genes. The number of missing gene families found is likely a lower bound on the actual number, due to the conservative criteria used to determine whether an ORF corresponds to a real gene.
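
    To make the "all ORFs larger than or equal to 33 aa" step concrete, here is a minimal six-frame ORF scan. It is a sketch under stated assumptions, not the authors' high-performance pipeline: only ATG is treated as a start codon (prokaryotes also use alternatives such as GTG and TTG), the three standard stop codons are used, and the function names are hypothetical.

```python
# Minimal six-frame ORF scan; an illustrative sketch, not the method from the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_aa=33):
    """List ORFs of at least min_aa codons (start codon included, stop excluded).

    Each hit is (strand, frame, start, end, aa_length). Coordinates are 0-based,
    end-exclusive, and given on the strand that was scanned, so hits on the
    "-" strand are relative to the reverse complement.
    Assumption: ATG is the only start codon considered.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start since the last stop
                elif start is not None and codon in STOP_CODONS:
                    aa_length = (i - start) // 3
                    if aa_length >= min_aa:
                        hits.append((strand, frame, start, i + 3, aa_length))
                    start = None  # look for the next ORF in this frame
    return hits


# Toy sequences: a 41-codon ORF passes the 33 aa cutoff, an 11-codon ORF does not.
long_orf = "ATG" + "GCT" * 40 + "TAA"
short_orf = "ATG" + "GCT" * 10 + "TAA"
print(find_orfs(long_orf))
print(find_orfs(short_orf))
```

    The 33 aa threshold is exposed as the `min_aa` parameter, so the effect of the cutoff on the number of candidate small genes can be explored directly.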

  1. Evolution of paralogous genes: Reconstruction of genome rearrangements through comparison of multiple genomes within Staphylococcus aureus.

    Science.gov (United States)

    Tsuru, Takeshi; Kawai, Mikihiko; Mizutani-Ui, Yoko; Uchiyama, Ikuo; Kobayashi, Ichizo

    2006-06-01

    Analysis of evolution of paralogous genes in a genome is central to our understanding of genome evolution. Comparison of closely related bacterial genomes, which has provided clues as to how genome sequences evolve under natural conditions, would help in such an analysis. With species Staphylococcus aureus, whole-genome sequences have been decoded for seven strains. We compared their DNA sequences to detect large genome polymorphisms and to deduce mechanisms of genome rearrangements that have formed each of them. We first compared strains N315 and Mu50, which make one of the most closely related strain pairs, at the single-nucleotide resolution to catalogue all the middle-sized (more than 10 bp) to large genome polymorphisms such as indels and substitutions. These polymorphisms include two paralogous gene sets, one in a tandem paralogue gene cluster for toxins in a genomic island and the other in a ribosomal RNA operon. We also focused on two other tandem paralogue gene clusters and type I restriction-modification (RM) genes on the genomic islands. Then we reconstructed rearrangement events responsible for these polymorphisms, in the paralogous genes and the others, with reference to the other five genomes. For the tandem paralogue gene clusters, we were able to infer sequences for homologous recombination generating the change in the repeat number. These sequences were conserved among the repeated paralogous units likely because of their functional importance. The sequence specificity (S) subunit of type I RM systems showed recombination, likely at the homology of a conserved region, between the two variable regions for sequence specificity. We also noticed novel alleles in the ribosomal RNA operons and suggested a role for illegitimate recombination in their formation. These results revealed importance of recombination involving long conserved sequence in the evolution of paralogous genes in the genome.

  2. Comparative Genomic Analysis Reveals Organization, Function and Evolution of ars Genes in Pantoea spp.

    OpenAIRE

    Wang, Liying; Wang, Jin; Jing, Chuanyong

    2017-01-01

    Numerous genes are involved in various strategies to resist toxic arsenic (As). However, the As resistance strategy in genus Pantoea is poorly understood. In this study, a comparative genome analysis of 23 Pantoea genomes was conducted. Two vertical genetic arsC-like genes without any contribution to As resistance were found to exist in the 23 Pantoea strains. Besides the two arsC-like genes, As resistance gene clusters arsRBC or arsRBCH were found in 15 Pantoea genomes. These ars clusters we...

  3. An integration of genome-wide association study and gene expression profiling to prioritize the discovery of novel susceptibility loci for osteoporosis-related traits

    NARCIS (Netherlands)

    C.Y. Hsu (Chao); M.C. Zillikens (Carola); S.G. Wilson (Scott); C.R. Farber (Charles); S. Demissie (Serkalem); N. Soranzo (Nicole); E.N. Bianchi (Estelle); E. Grundberg (Elin); L. Liang (Liming); J.B. Richards (Brent); K. Estrada Gil (Karol); Y. Zhou (Yanhua); A. van Nas (Atila); M.F. Moffatt (Miriam); G. Zhai (Guangju); A. Hofman (Albert); J.B.J. van Meurs (Joyce); H.A.P. Pols (Huib); R.I. Price (Roger Ian); O. Nilsson (Ola); T. Pastinen (Tomi); L.A. Cupples (Adrienne); A.J. Lusis (Aldons Jake); E.E. Schadt (Eric); A.G. Uitterlinden (André); D.P. Kiel (Douglas); F. Rivadeneira Ramirez (Fernando); T.D. Spector (Timothy); D. Karasik (David); S.L. Ferrari (Serge)

    2010-01-01

    Osteoporosis is a complex disorder and commonly leads to fractures in elderly persons. Genome-wide association studies (GWAS) have become an unbiased approach to identify variations in the genome that potentially affect health. However, the genetic variants identified so far only explain

  4. Genomic organization and evolution of the ULBP genes in cattle.

    Science.gov (United States)

    Larson, Joshua H; Marron, Brandy M; Beever, Jonathan E; Roe, Bruce A; Lewin, Harris A

    2006-09-05

    The cattle UL16-binding protein 1 (ULBP1) and ULBP2 genes encode members of the MHC Class I superfamily that have homology to the human ULBP genes. Human ULBP1 and ULBP2 interact with the NKG2D receptor to activate effector cells in the immune system. The human cytomegalovirus UL16 protein is known to disrupt the ULBP-NKG2D interaction, thereby subverting natural killer cell-mediated responses. Previous Southern blotting experiments identified evidence of increased ULBP copy number within the genomes of ruminant artiodactyls. On the basis of these observations we hypothesized that the cattle ULBPs evolved by duplication and sequence divergence to produce a sufficient number and diversity of ULBP molecules to deliver an immune activation signal in the presence of immunogenic peptides. Given the importance of the ULBPs in antiviral immunity in other species, our goal was to determine the copy number and genomic organization of the ULBP genes in the cattle genome. Sequencing of cattle bacterial artificial chromosome genomic inserts resulted in the identification of 30 cattle ULBP loci existing in two gene clusters. Evidence of extensive segmental duplication and approximately 14 Kbp of novel repetitive sequences were identified within the major cluster. Ten ULBPs are predicted to be expressed at the cell surface. Substitution analysis revealed 11 outwardly directed residues in the predicted extracellular domains that show evidence of positive Darwinian selection. These positively selected residues have only one residue that overlaps with those proposed to interact with NKG2D, thus suggesting the interaction with molecules other than NKG2D. The ULBP loci in the cattle genome apparently arose by gene duplication and subsequent sequence divergence. Substitution analysis of the ULBP proteins provided convincing evidence for positive selection on extracellular residues that may interact with peptide ligands. These results support our hypothesis that the cattle ULBPs

  5. Genomic organization and evolution of the ULBP genes in cattle

    Directory of Open Access Journals (Sweden)

    Lewin Harris A

    2006-09-01

    Full Text Available Abstract Background The cattle UL16-binding protein 1 (ULBP1) and ULBP2 genes encode members of the MHC Class I superfamily that have homology to the human ULBP genes. Human ULBP1 and ULBP2 interact with the NKG2D receptor to activate effector cells in the immune system. The human cytomegalovirus UL16 protein is known to disrupt the ULBP-NKG2D interaction, thereby subverting natural killer cell-mediated responses. Previous Southern blotting experiments identified evidence of increased ULBP copy number within the genomes of ruminant artiodactyls. On the basis of these observations we hypothesized that the cattle ULBPs evolved by duplication and sequence divergence to produce a sufficient number and diversity of ULBP molecules to deliver an immune activation signal in the presence of immunogenic peptides. Given the importance of the ULBPs in antiviral immunity in other species, our goal was to determine the copy number and genomic organization of the ULBP genes in the cattle genome. Results Sequencing of cattle bacterial artificial chromosome genomic inserts resulted in the identification of 30 cattle ULBP loci existing in two gene clusters. Evidence of extensive segmental duplication and approximately 14 Kbp of novel repetitive sequences were identified within the major cluster. Ten ULBPs are predicted to be expressed at the cell surface. Substitution analysis revealed 11 outwardly directed residues in the predicted extracellular domains that show evidence of positive Darwinian selection. These positively selected residues have only one residue that overlaps with those proposed to interact with NKG2D, thus suggesting the interaction with molecules other than NKG2D. Conclusion The ULBP loci in the cattle genome apparently arose by gene duplication and subsequent sequence divergence. Substitution analysis of the ULBP proteins provided convincing evidence for positive selection on extracellular residues that may interact with peptide ligands. These

  6. Microfluidic gene arrays for rapid genomic profiling

    Science.gov (United States)

    West, Jay A.; Hukari, Kyle W.; Hux, Gary A.; Shepodd, Timothy J.

    2004-12-01

    Genomic analysis tools have recently become indispensable for the evaluation of gene expression in a variety of experimental protocols. Two of the main drawbacks to this technology are the labor- and time-intensive process for sample preparation and the relatively long times required for target/probe hybridization. In order to overcome these two technological barriers we have developed a microfluidic chip to perform on-chip sample purification and labeling, integrated with a high-density gene array. Sample purification was performed using a porous polymer monolithic material functionalized with an oligo dT nucleotide sequence for the isolation of high-purity mRNA. These purified mRNAs can then be rapidly labeled using a fluorescent molecule which forms a selective covalent bond at the N7 position of guanine residues. These labeled mRNAs can then be released from the polymer monolith to allow for direct hybridization with oligonucleotide probes deposited in a microfluidic channel. To allow for rapid target/probe hybridization, high-density microarrays were printed in microchannels. The channels can accommodate array densities as high as 4000 probes. When oligonucleotide deposition is complete, these channels are sealed using a polymer film which forms a pressure-tight seal to allow sample reagent flow to the arrayed probes. This process will allow for real-time monitoring of target-to-probe hybridization using a top-mounted CCD fiber bundle combination. Using this process we have been able to perform a multi-step sample preparation to labeled target/probe hybridization in less than 30 minutes. These results demonstrate the capability to perform rapid genomic screening on a high-density microfluidic microarray of oligonucleotides.

  7. XRCC5 as a risk gene for alcohol dependence: evidence from a genome-wide gene-set-based analysis and follow-up studies in Drosophila and humans.

    Science.gov (United States)

    Juraeva, Dilafruz; Treutlein, Jens; Scholz, Henrike; Frank, Josef; Degenhardt, Franziska; Cichon, Sven; Ridinger, Monika; Mattheisen, Manuel; Witt, Stephanie H; Lang, Maren; Sommer, Wolfgang H; Hoffmann, Per; Herms, Stefan; Wodarz, Norbert; Soyka, Michael; Zill, Peter; Maier, Wolfgang; Jünger, Elisabeth; Gaebel, Wolfgang; Dahmen, Norbert; Scherbaum, Norbert; Schmäl, Christine; Steffens, Michael; Lucae, Susanne; Ising, Marcus; Smolka, Michael N; Zimmermann, Ulrich S; Müller-Myhsok, Bertram; Nöthen, Markus M; Mann, Karl; Kiefer, Falk; Spanagel, Rainer; Brors, Benedikt; Rietschel, Marcella

    2015-01-01

    Genetic factors have as large a role as environmental factors in the etiology of alcohol dependence (AD). Although genome-wide association studies (GWAS) enable systematic searches for loci not hitherto implicated in the etiology of AD, many true findings may be missed owing to correction for multiple testing. The aim of the present study was to circumvent this limitation by searching for biological system-level differences, and then following up these findings in humans and animals. Gene-set-based analysis of GWAS data from 1333 cases and 2168 controls identified 19 significantly associated gene-sets, of which 5 could be replicated in an independent sample. Clustered in these gene-sets were novel and previously identified susceptibility genes. The most frequently present gene, i.e. in 6 out of 19 gene-sets, was X-ray repair complementing defective repair in Chinese hamster cells 5 (XRCC5). Previous human and animal studies have implicated XRCC5 in alcohol sensitivity. This phenotype is inversely correlated with the development of AD, presumably as more alcohol is required to achieve the desired effects. In the present study, the functional role of XRCC5 in AD was further validated in animals and humans. Drosophila mutants with reduced function of Ku80 (the homolog of mammalian XRCC5) due to RNAi silencing showed reduced sensitivity to ethanol. In humans with free access to intravenous ethanol self-administration in the laboratory, the maximum achieved blood alcohol concentration was influenced in an allele-dose-dependent manner by genetic variation in XRCC5. In conclusion, our convergent approach identified new candidates and generated independent evidence for the involvement of XRCC5 in alcohol dependence.

  8. FGF: a web tool for Fishing Gene Family in a whole genome database

    DEFF Research Database (Denmark)

    Zheng, Hongkun; Shi, Junjie; Fang, Xiaodong

    2007-01-01

    Gene duplication is an important process in evolution. The availability of genome sequences of a number of organisms has made it possible to conduct comprehensive searches for duplicated genes enabling informative studies of their evolution. We have established the FGF (Fishing Gene Family) program...... to efficiently search for and identify gene families. The FGF output displays the results as visual phylogenetic trees including information on gene structure, chromosome position, duplication fate and selective pressure. It is particularly useful to identify pseudogenes and detect changes in gene structure. FGF...... is freely available on a web server at http://fgf.genomics.org.cn/...

  9. FGF: A web tool for Fishing Gene Family in a whole genome database

    DEFF Research Database (Denmark)

    Zheng, Hongkun; Shi, Junjie; Fang, Xiaodong

    2007-01-01

    Gene duplication is an important process in evolution. The availability of genome sequences of a number of organisms has made it possible to conduct comprehensive searches for duplicated genes enabling informative studies of their evolution. We have established the FGF (Fishing Gene Family) program...... to efficiently search for and identify gene families. The FGF output displays the results as visual phylogenetic trees including information on gene structure, chromosome position, duplication fate and selective pressure. It is particularly useful to identify pseudogenes and detect changes in gene structure. FGF...... is freely available on a web server at http://fgf.genomics.org.cn/...

  10. Concept, design and implementation of a cardiovascular gene-centric 50 k SNP array for large-scale genomic association studies.

    Directory of Open Access Journals (Sweden)

    Brendan J Keating

    Full Text Available A wealth of genetic associations for cardiovascular and metabolic phenotypes in humans has been accumulating over the last decade, in particular a large number of loci derived from recent genome wide association studies (GWAS). True complex disease-associated loci often exert modest effects, so their delineation currently requires integration of diverse phenotypic data from large studies to ensure robust meta-analyses. We have designed a gene-centric 50 K single nucleotide polymorphism (SNP) array to assess potentially relevant loci across a range of cardiovascular, metabolic and inflammatory syndromes. The array utilizes a "cosmopolitan" tagging approach to capture the genetic diversity across approximately 2,000 loci in populations represented in the HapMap and SeattleSNPs projects. The array content is informed by GWAS of vascular and inflammatory disease, expression quantitative trait loci implicated in atherosclerosis, pathway based approaches and comprehensive literature searching. The custom flexibility of the array platform facilitated interrogation of loci at differing stringencies, according to a gene prioritization strategy that allows saturation of high priority loci with a greater density of markers than the existing GWAS tools, particularly in African HapMap samples. We also demonstrate that the IBC array can be used to complement GWAS, increasing coverage in high priority CVD-related loci across all major HapMap populations. DNA from over 200,000 extensively phenotyped individuals will be genotyped with this array with a significant portion of the generated data being released into the academic domain facilitating in silico replication attempts, analyses of rare variants and cross-cohort meta-analyses in diverse populations. These datasets will also facilitate more robust secondary analyses, such as explorations with alternative genetic models, epistasis and gene-environment interactions.

  11. A novel nonsense mutation in the DMP1 gene identified by a genome-wide association study is responsible for inherited rickets in Corriedale sheep.

    Directory of Open Access Journals (Sweden)

    Xia Zhao

    Full Text Available Inherited rickets of Corriedale sheep is characterized by decreased growth rate, thoracic lordosis and angular limb deformities. Previous outcross and backcross studies implicate inheritance as a simple autosomal recessive disorder. A genome wide association study was conducted using the Illumina OvineSNP50 BeadChip on 20 related sheep comprising 17 affected and 3 carriers. A homozygous region of 125 consecutive single-nucleotide polymorphism (SNP) loci was identified in all affected sheep, covering a region of 6 Mb on ovine chromosome 6. Among 35 candidate genes in this region, the dentin matrix protein 1 gene (DMP1) was sequenced to reveal a nonsense mutation 250C/T on exon 6. This mutation introduced a stop codon (R145X) and could truncate C-terminal amino acids. Genotyping by PCR-RFLP for this mutation showed all 17 affected sheep were "T T" genotypes; the 3 carriers were "C T"; 24 phenotypically normal related sheep were either "C T" or "C C"; and 46 unrelated normal control sheep from other breeds were all "C C". The other SNPs in DMP1 were not concordant with the disease and can all be ruled out as candidates. Previous research has shown that mutations in the DMP1 gene are responsible for autosomal recessive hypophosphatemic rickets in humans. Dmp1 knockout mice exhibit rickets phenotypes. We believe the R145X mutation to be responsible for the inherited rickets found in Corriedale sheep. A simple diagnostic test can be designed to identify carriers with the defective "T" allele. Affected sheep could be used as animal models for this form of human rickets, and for further investigation of the role of DMP1 in phosphate homeostasis.

  12. Genome-Wide Association Study with Sequence Variants Identifies Candidate Genes for Mastitis Resistance in Dairy Cattle

    DEFF Research Database (Denmark)

    Sahana, Goutam; Guldbrandtsen, Bernt; Bendixen, Christian;

    Effect Predictor (VEP) vers. 2.6 using ENSEMBL vers. 67 databases. Candidate polymorphisms affecting clinical mastitis were selected based on their association with the traits and functional annotations. A strong positional candidate gene for mastitis resistance on chromosome-6 is the NPFFR2 which...

  13. Genome-wide study of KNOX regulatory network reveals brassinosteroid catabolic genes important for shoot meristem function in rice

    Science.gov (United States)

    In flowering plants, knotted1-like homeobox (KNOX) transcription factors play crucial roles in establishment and maintenance of the shoot apical meristem (SAM), from which aerial organs such as leaves, stems, and flowers initiate. We report that a rice (Oryza sativa) KNOX gene Oryza sativa homeobox1...

  14. Identification and Categorization of Horizontally Transferred Genes in Prokaryotic Genomes

    Institute of Scientific and Technical Information of China (English)

    Shuo-Yong SHI; Xiao-Hui CAI; Da-fu DING

    2005-01-01

    Horizontal gene transfer (HGT), a process through which genomes acquire genetic materials from distantly related organisms, is believed to be one of the major forces in prokaryotic genome evolution. However, systematic investigation is still scarce to clarify two basic issues about HGT: (1) what types of genes are transferred; and (2) what influence HGT events have on the organization and evolution of biological pathways. Genome-scale investigations of these two issues will advance the systematic understanding of HGT in the context of prokaryotic genome evolution. Having investigated 82 genomes, we constructed an HGT database across broad evolutionary timescales. We identified four function categories containing a high proportion of horizontally transferred genes: cell envelope, energy metabolism, regulatory functions, and transport/binding proteins. Such biased function distribution indicates that HGT is not completely random; instead, it is under high selective pressure, required by function restraints in organisms. Furthermore, we mapped the transferred genes onto the connectivity structure map of organism-specific pathways listed in Kyoto Encyclopedia of Genes and Genomes (KEGG). Our results suggest that recruitment of transferred genes into pathways is also selectively constrained because of the tuned interaction between original pathway members. Pathway organization structures remain well conserved through evolution even with the recruitment of horizontally transferred genes. Interestingly, in pathways whose organization was significantly affected by HGT events, the operon-like arrangement of transferred genes was found to be prevalent. Such results suggest that operons play an essential and directional role in the integration of alien genes into pathways.

  15. Finding the missing honey bee genes: Lessons learned from a genome upgrade

    KAUST Repository

    Elsik, Christine G

    2014-01-30

    Background: The first generation of genome sequence assemblies and annotations have had a significant impact upon our understanding of the biology of the sequenced species, the phylogenetic relationships among species, the study of populations within and across species, and have informed the biology of humans. As only a few Metazoan genomes are approaching finished quality (human, mouse, fly and worm), there is room for improvement of most genome assemblies. The honey bee (Apis mellifera) genome, published in 2006, was noted for its bimodal GC content distribution that affected the quality of the assembly in some regions and for fewer genes in the initial gene set (OGSv1.0) compared to what would be expected based on other sequenced insect genomes. Results: Here, we report an improved honey bee genome assembly (Amel_4.5) with a new gene annotation set (OGSv3.2), and show that the honey bee genome contains a number of genes similar to that of other insect genomes, contrary to what was suggested in OGSv1.0. The new genome assembly is more contiguous and complete and the new gene set includes ~5000 more protein-coding genes, 50% more than previously reported. About 1/6 of the additional genes were due to improvements to the assembly, and the remaining were inferred based on new RNAseq and protein data. Conclusions: Lessons learned from this genome upgrade have important implications for future genome sequencing projects. Furthermore, the improvements significantly enhance genomic resources for the honey bee, a key model for social behavior and essential to global ecology through pollination. 2014 Elsik et al.; licensee BioMed Central Ltd.

  16. Genome-wide association study and gene expression analysis identifies CD84 as a predictor of response to etanercept therapy in rheumatoid arthritis.

    Directory of Open Access Journals (Sweden)

    Jing Cui

    2013-03-01

    Anti-tumor necrosis factor alpha (anti-TNF) biologic therapy is a widely used treatment for rheumatoid arthritis (RA). It is unknown why some RA patients fail to respond adequately to anti-TNF therapy, which limits the development of clinical biomarkers to predict response or new drugs to target refractory cases. To understand the biological basis of response to anti-TNF therapy, we conducted a genome-wide association study (GWAS) meta-analysis of more than 2 million common variants in 2,706 RA patients from 13 different collections. Patients were treated with one of three anti-TNF medications: etanercept (n = 733), infliximab (n = 894), or adalimumab (n = 1,071). We identified a SNP (rs6427528) at the 1q23 locus that was associated with change in disease activity score (ΔDAS) in the etanercept subset of patients (P = 8 × 10⁻⁸), but not in the infliximab or adalimumab subsets (P > 0.05). The SNP is predicted to disrupt transcription factor binding site motifs in the 3' UTR of an immune-related gene, CD84, and the allele associated with better response to etanercept was associated with higher CD84 gene expression in peripheral blood mononuclear cells (P = 1 × 10⁻¹¹ in 228 non-RA patients and P = 0.004 in 132 RA patients). Consistent with the genetic findings, higher CD84 gene expression correlated with lower cross-sectional DAS (P = 0.02, n = 210) and showed a non-significant trend for better ΔDAS in a subset of RA patients with gene expression data (n = 31, etanercept-treated). A small, multi-ethnic replication showed a non-significant trend towards an association among etanercept-treated RA patients of Portuguese ancestry (n = 139, P = 0.4), but no association among patients of Japanese ancestry (n = 151, P = 0.8). Our study demonstrates that an allele associated with response to etanercept therapy is also associated with CD84 gene expression, and further that CD84 expression correlates with disease activity. These findings support a model in which CD84

  17. A GeneTrek analysis of the maize genome.

    Science.gov (United States)

    Liu, Renyi; Vitte, Clémentine; Ma, Jianxin; Mahama, A Assibi; Dhliwayo, Thanda; Lee, Michael; Bennetzen, Jeffrey L

    2007-07-10

    Analysis of the sequences of 74 randomly selected BACs demonstrated that the maize nuclear genome contains approximately 37,000 candidate genes with homologues in other plant species. An additional approximately 5,500 predicted genes are severely truncated and probably pseudogenes. The distribution of genes is uneven, with approximately 30% of BACs containing no genes. BAC gene density varies from 0 to 7.9 per 100 kb, whereas most gene islands contain only one gene. The average number of genes per gene island is 1.7. Only 72% of these genes show collinearity with the rice genome. Particular LTR retrotransposon families (e.g., Gyma) are enriched on gene-free BACs, most of which do not come from pericentromeres or other large heterochromatic regions. Gene-containing BACs are relatively enriched in different families of LTR retrotransposons (e.g., Ji). Two major bursts of LTR retrotransposon activity in the last 2 million years are responsible for the large size of the maize genome, but only the more recent of these is well represented in gene-containing BACs, suggesting that LTR retrotransposons are more efficiently removed in these domains. The results demonstrate that sample sequencing and careful annotation of a few randomly selected BACs can provide a robust description of a complex plant genome.

  18. A Method for Identification of Selenoprotein Genes in Archaeal Genomes

    Institute of Scientific and Technical Information of China (English)

    Mingfeng Li; Yanzhao Huang; Yi Xiao

    2009-01-01

    The genetic codon UGA has a dual function: serving as a terminator and encoding selenocysteine. However, most popular gene annotation programs only take it as a stop signal, resulting in misannotation or completely missing selenoprotein genes. We developed a computational method named Asec-Prediction that is specific for the prediction of archaeal selenoprotein genes. To evaluate its effectiveness, we first applied it to 14 archaeal genomes with previously known selenoprotein genes, and Asec-Prediction identified all reported selenoprotein genes without redundant results. When we applied it to 12 archaeal genomes that had not been researched for selenoprotein genes, Asec-Prediction detected a novel selenoprotein gene in Methanosarcina acetivorans. Further evidence was also collected to support that the predicted gene should be a real selenoprotein gene. The result shows that Asec-Prediction is effective for the prediction of archaeal selenoprotein genes.
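
    The annotation problem described above can be illustrated with a minimal sketch. It is not the Asec-Prediction algorithm, whose details the abstract does not give; the function name `orf_lengths` and the toy sequence are hypothetical. The sketch only shows how the same reading frame is cut short when TGA (UGA in the mRNA) is treated as a stop signal, and continues when TGA is read as a selenocysteine codon.

```python
# Toy illustration (not Asec-Prediction): compare how far a reading frame
# extends when TGA is treated as a stop codon versus as a sense codon.
STOPS = {"TAA", "TAG", "TGA"}


def orf_lengths(seq, start=0):
    """Return (codons read with TGA as stop, codons read with TGA as Sec).

    `seq` is a DNA string; reading begins at `start` and proceeds in steps
    of three until a stop codon (under each rule) or the end of the sequence.
    """
    def read(stops):
        n = 0
        for i in range(start, len(seq) - 2, 3):
            if seq[i:i + 3] in stops:
                break
            n += 1
        return n

    return read(STOPS), read(STOPS - {"TGA"})


# Hypothetical sequence: ATG GCT TGA GGT CCT TAA
standard, readthrough = orf_lengths("ATGGCTTGAGGTCCTTAA")
print(standard, readthrough)
```

    A program that uses only the first rule would report the truncated frame, which is the kind of misannotation the abstract describes.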

  19. Genomic analysis of primordial dwarfism reveals novel disease genes.

    Science.gov (United States)

    Shaheen, Ranad; Faqeih, Eissa; Ansari, Shinu; Abdel-Salam, Ghada; Al-Hassnan, Zuhair N; Al-Shidi, Tarfa; Alomar, Rana; Sogaty, Sameera; Alkuraya, Fowzan S

    2014-02-01

    Primordial dwarfism (PD) is a disease in which severely impaired fetal growth persists throughout postnatal development and results in stunted adult size. The condition is highly heterogeneous clinically, but the use of certain phenotypic aspects such as head circumference and facial appearance has proven helpful in defining clinical subgroups. In this study, we present the results of clinical and genomic characterization of 16 new patients in whom a broad definition of PD was used (e.g., 3M syndrome was included). We report a novel PD syndrome with distinct facies in two unrelated patients, each with a different homozygous truncating mutation in CRIPT. Our analysis also reveals, in addition to mutations in known PD disease genes, the first instance of biallelic truncating BRCA2 mutation causing PD with normal bone marrow analysis. In addition, we have identified a novel locus for Seckel syndrome based on a consanguineous multiplex family and identified a homozygous truncating mutation in DNA2 as the likely cause. An additional novel PD disease candidate gene XRCC4 was identified by autozygome/exome analysis, and the knockout mouse phenotype is highly compatible with PD. Thus, we add a number of novel genes to the growing list of PD-linked genes, including one which we show to be linked to a novel PD syndrome with a distinct facial appearance. PD is extremely heterogeneous genetically and clinically, and genomic tools are often required to reach a molecular diagnosis.

  20. Genome-wide identification of KANADI1 target genes.

    Directory of Open Access Journals (Sweden)

    Paz Merelo

    Plant organ development and polarity establishment are mediated by the action of several transcription factors. Among these, the KANADI (KAN) subclade of the GARP protein family plays important roles in polarity-associated processes during embryo, shoot and root patterning. In this study, we have identified a set of potential direct target genes of KAN1 through a combination of chromatin immunoprecipitation/DNA sequencing (ChIP-Seq) and genome-wide transcriptional profiling using tiling arrays. Target genes are over-represented for genes involved in the regulation of organ development as well as in the response to auxin. KAN1 directly affects the expression of several genes previously shown to be important in the establishment of polarity during lateral organ and vascular tissue development. We also show that KAN1 controls, through its target genes, auxin effects on organ development at different levels: transport and its regulation, and signaling. In addition, KAN1 regulates genes involved in the response to abscisic acid, jasmonic acid, brassinosteroids, ethylene, cytokinins and gibberellins. The role of KAN1 in organ polarity is antagonized by HD-ZIPIII transcription factors, including REVOLUTA (REV). A comparison of their target genes reveals that the REV/KAN1 module acts in organ patterning through opposite regulation of shared targets. Evidence of mutual repression between closely related family members is also shown.

  1. A genomic survey of the fish parasite Spironucleus salmonicida indicates genomic plasticity among diplomonads and significant lateral gene transfer in eukaryote genome evolution

    Directory of Open Access Journals (Sweden)

    Logsdon John M

    2007-02-01

    Background: Comparative genomic studies of the mitochondrion-lacking protist group Diplomonadida (diplomonads) have been lacking, although Giardia lamblia has been intensively studied. We have performed a sequence survey project resulting in 2341 expressed sequence tags (ESTs) corresponding to 853 unique clones, 5275 genome survey sequences (GSS), and eleven finished contigs from the diplomonad fish parasite Spironucleus salmonicida (previously described as S. barkhanus). Results: The analyses revealed a compact genome with few, if any, introns and very short 3' untranslated regions. Strikingly different patterns of codon usage were observed in genes corresponding to frequently sampled ESTs versus genes poorly sampled, indicating that translational selection is influencing the codon usage of highly expressed genes. Rigorous phylogenomic analyses identified 84 genes – mostly encoding metabolic proteins – that have been acquired by diplomonads or their relatively close ancestors via lateral gene transfer (LGT). Although most acquisitions were from prokaryotes, more than a dozen represent likely transfers of genes between eukaryotic lineages. Many genes that provide novel insights into the genetic basis of the biology and pathogenicity of this parasitic protist were identified, including 149 that putatively encode variant-surface cysteine-rich proteins, which are candidate virulence factors. A number of genomic properties that distinguish S. salmonicida from its human parasitic relative G. lamblia were identified, such as nineteen putative lineage-specific gene acquisitions, distinct mutational biases and codon usage, and distinct polyadenylation signals. Conclusion: Our results highlight the power of comparative genomic studies to yield insights into the biology of parasitic protists and the evolution of their genomes, and suggest that genetic exchange between distantly-related protist lineages may be occurring at an appreciable rate in eukaryote

  2. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    Directory of Open Access Journals (Sweden)

    Kaas Rolf S

    2012-10-01

    Background: Escherichia coli exists in commensal and pathogenic forms. By measuring the variation of individual genes across more than a hundred sequenced genomes, gene variation can be studied in detail, including the number of mutations found for any given gene. This knowledge will be useful for creating better phylogenies, for determination of molecular clocks and for improved typing techniques. Results: We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment, and a pan-genome tree, based on gene presence/absence, map the relatedness of the 186 sequenced E. coli genomes. The core-gene tree displays high confidence and divides the E. coli strains into the observed MLST type clades and also separates defined phylotypes. Conclusion: The results of comparing a large and diverse E. coli dataset support the theory that reliable and good resolution phylogenies can be inferred from the core-genome. The results further suggest that the resolution at the isolate level may subsequently be improved by targeting more variable genes. The use of whole genome sequencing will make it possible to eliminate, or at least reduce, the need for several typing steps used in traditional epidemiology.
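
    The three counts reported above (strict core at 100%, 'soft core' at 95% or more, and pan-genome) can be sketched from a presence/absence table. This is a minimal illustration under assumptions, not the authors' pipeline: the function name `summarize_clusters`, the input layout (cluster name mapped to the set of genomes containing it) and the toy data are all hypothetical.

```python
# Minimal sketch: count pan-genome, soft-core and strict-core gene clusters
# from a mapping of cluster name -> set of genomes in which it is present.
def summarize_clusters(presence, soft_core_fraction=0.95):
    """Return counts of pan, soft-core (>= fraction) and strict-core clusters."""
    genomes = set().union(*presence.values())
    n = len(genomes)
    strict = [c for c, g in presence.items() if len(g) == n]
    soft = [c for c, g in presence.items() if len(g) >= soft_core_fraction * n]
    return {
        "pan": len(presence),        # every cluster seen in any genome
        "soft_core": len(soft),      # present in at least 95% of genomes
        "strict_core": len(strict),  # present in 100% of genomes
    }


# Hypothetical toy data with 20 genomes labelled 0..19.
toy = {
    "clusterA": set(range(20)),  # in all genomes
    "clusterB": set(range(19)),  # in 19 of 20 genomes (95%)
    "clusterC": {0, 1, 2},       # accessory cluster
}
print(summarize_clusters(toy))
```

    With draft-quality assemblies a gene can be missed in a few genomes, which is why a threshold below 100% may be the more informative definition, as the abstract argues.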

  3. Revisiting the protein-coding gene catalog of Drosophila melanogaster using 12 fly genomes.

    Science.gov (United States)

    Lin, Michael F; Carlson, Joseph W; Crosby, Madeline A; Matthews, Beverley B; Yu, Charles; Park, Soo; Wan, Kenneth H; Schroeder, Andrew J; Gramates, L Sian; St Pierre, Susan E; Roark, Margaret; Wiley, Kenneth L; Kulathinal, Rob J; Zhang, Peili; Myrick, Kyl V; Antone, Jerry V; Celniker, Susan E; Gelbart, William M; Kellis, Manolis

    2007-12-01

    The availability of sequenced genomes from 12 Drosophila species has enabled the use of comparative genomics for the systematic discovery of functional elements conserved within this genus. We have developed quantitative metrics for the evolutionary signatures specific to protein-coding regions and applied them genome-wide, resulting in 1193 candidate new protein-coding exons in the D. melanogaster genome. We have reviewed these predictions by manual curation and validated a subset by directed cDNA screening and sequencing, revealing both new genes and new alternative splice forms of known genes. We also used these evolutionary signatures to evaluate existing gene annotations, resulting in the validation of 87% of genes lacking descriptive names and identifying 414 poorly conserved genes that are likely to be spurious predictions, noncoding, or species-specific genes. Furthermore, our methods suggest a variety of refinements to hundreds of existing gene models, such as modifications to translation start codons and exon splice boundaries. Finally, we performed directed genome-wide searches for unusual protein-coding structures, discovering 149 possible examples of stop codon readthrough, 125 new candidate ORFs of polycistronic mRNAs, and several candidate translational frameshifts. These results affect >10% of annotated fly genes and demonstrate the power of comparative genomics to enhance our understanding of genome organization, even in a model organism as intensively studied as Drosophila melanogaster.

  4. A comprehensive evaluation of rodent malaria parasite genomes and gene expression

    KAUST Repository

    Otto, Thomas D

    2014-10-30

    Background: Rodent malaria parasites (RMP) are used extensively as models of human malaria. Draft RMP genomes have been published for Plasmodium yoelii, P. berghei ANKA (PbA) and P. chabaudi AS (PcAS). Although availability of these genomes made a significant impact on recent malaria research, these genomes were highly fragmented and were annotated with little manual curation. The fragmented nature of the genomes has hampered genome wide analysis of Plasmodium gene regulation and function. Results: We have greatly improved the genome assemblies of PbA and PcAS, newly sequenced the virulent parasite P. yoelii YM genome, sequenced additional RMP isolates/lines and have characterized genotypic diversity within RMP species. We have produced RNA-seq data and utilized it to improve gene-model prediction and to provide quantitative, genome-wide, data on gene expression. Comparison of the RMP genomes with the genome of the human malaria parasite P. falciparum and RNA-seq mapping permitted gene annotation at base-pair resolution. Full-length chromosomal annotation permitted a comprehensive classification of all subtelomeric multigene families including the 'Plasmodium interspersed repeat genes' (pir). Phylogenetic classification of the pir family, combined with pir expression patterns, indicates functional diversification within this family. Conclusions: Complete RMP genomes, RNA-seq and genotypic diversity data are excellent and important resources for gene-function and post-genomic analyses and to better interrogate Plasmodium biology. Genotypic diversity between P. chabaudi isolates makes this species an excellent parasite to study genotype-phenotype relationships. The improved classification of multigene families will enhance studies on the role of (variant) exported proteins in virulence and immune evasion/modulation.

  5. Gene discovery in the Acanthamoeba castellanii genome

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain J.; Watkins, Russell F.; Samuelson, John; Spencer, David F.; Majoros, William H.; Gray, Michael W.; Loftus, Brendan J.

    2005-08-01

    Acanthamoeba castellanii is a free-living amoeba found in soil, freshwater, and marine environments and an important predator of bacteria. Acanthamoeba castellanii is also an opportunistic pathogen of clinical interest, responsible for several distinct diseases in humans. In order to provide a genomic platform for the study of this ubiquitous and important protist, we generated a sequence survey of approximately 0.5x coverage of the genome. The data predict that A. castellanii exhibits a greater biosynthetic capacity than the free-living Dictyostelium discoideum and the parasite Entamoeba histolytica, providing an explanation for the ability of A. castellanii to inhabit a diversity of environments. Alginate lyase may provide access to bacteria within biofilms by breaking down the biofilm matrix, and polyhydroxybutyrate depolymerase may facilitate utilization of the bacterial storage compound polyhydroxybutyrate as a food source. Enzymes for the synthesis and breakdown of cellulose were identified, and they likely participate in encystation and excystation as in D. discoideum. Trehalose-6-phosphate synthase is present, suggesting that trehalose plays a role in stress adaptation. Detection and response to a number of stress conditions is likely accomplished with a large set of signal transduction histidine kinases and a set of putative receptor serine/threonine kinases similar to those found in E. histolytica. Serine, cysteine and metalloproteases were identified, some of which are likely involved in pathogenicity.

  6. The complete chloroplast genome sequence of Podocarpus lambertii: genome structure, evolutionary aspects, gene content and SSR detection.

    Directory of Open Access Journals (Sweden)

    Leila do Nascimento Vieira

    BACKGROUND: Podocarpus lambertii (Podocarpaceae) is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp) genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. METHODOLOGY/PRINCIPAL FINDINGS: The P. lambertii cp genome is 133,734 bp in length, and similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR). It contains 118 unique genes and one duplicated tRNA (trnN-GUU), which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi) and Araucariaceae (Agathis dammara). Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. CONCLUSION: The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparent loss of the rps16 gene in the Podocarpaceae cp genome. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of

  7. The Complete Chloroplast Genome Sequence of Podocarpus lambertii: Genome Structure, Evolutionary Aspects, Gene Content and SSR Detection

    Science.gov (United States)

    Vieira, Leila do Nascimento; Faoro, Helisson; Rogalski, Marcelo; Fraga, Hugo Pacheco de Freitas; Cardoso, Rodrigo Luis Alves; de Souza, Emanuel Maltempi; de Oliveira Pedrosa, Fábio; Nodari, Rubens Onofre; Guerra, Miguel Pedro

    2014-01-01

    Background: Podocarpus lambertii (Podocarpaceae) is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp) genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. Methodology/Principal Findings: The P. lambertii cp genome is 133,734 bp in length, and similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR). It contains 118 unique genes and one duplicated tRNA (trnN-GUU), which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi) and Araucariaceae (Agathis dammara). Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. Conclusion: The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparent loss of the rps16 gene in the Podocarpaceae cp genome. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of this genus. PMID

  8. A genome-wide association search for type 2 diabetes genes in African Americans

    DEFF Research Database (Denmark)

    Palmer, Nicholette D; McDonough, Caitrin W; Hicks, Pamela J

    2012-01-01

    African Americans are disproportionately affected by type 2 diabetes (T2DM) yet few studies have examined T2DM using genome-wide association approaches in this ethnicity. The aim of this study was to identify genes associated with T2DM in the African American population. We performed a Genome Wid...

  9. A Genome-Wide Association Search for Type 2 Diabetes Genes in African Americans

    NARCIS (Netherlands)

    Palmer, Nicholette D.; McDonough, Caitrin W.; Hicks, Pamela J.; Roh, Bong H.; Wing, Maria R.; An, S. Sandy; Hester, Jessica M.; Cooke, Jessica N.; Bostrom, Meredith A.; Rudock, Megan E.; Talbert, Matthew E.; Lewis, Joshua P.; Ferrara, Assiamira; Lu, Lingyi; Ziegler, Julie T.; Sale, Michele M.; Divers, Jasmin; Shriner, Daniel; Adeyemo, Adebowale; Rotimi, Charles N.; Ng, Maggie C. Y.; Langefeld, Carl D.; Freedman, Barry I.; Bowden, Donald W.

    2012-01-01

    African Americans are disproportionately affected by type 2 diabetes (T2DM) yet few studies have examined T2DM using genome-wide association approaches in this ethnicity. The aim of this study was to identify genes associated with T2DM in the African American population. We performed a Genome Wide A

  10. Genome-editing Technologies for Gene and Cell Therapy

    Science.gov (United States)

    Maeder, Morgan L; Gersbach, Charles A

    2016-01-01

    Gene therapy has historically been defined as the addition of new genes to human cells. However, the recent advent of genome-editing technologies has enabled a new paradigm in which the sequence of the human genome can be precisely manipulated to achieve a therapeutic effect. This includes the correction of mutations that cause disease, the addition of therapeutic genes to specific sites in the genome, and the removal of deleterious genes or genome sequences. This review presents the mechanisms of different genome-editing strategies and describes each of the common nuclease-based platforms, including zinc finger nucleases, transcription activator-like effector nucleases (TALENs), meganucleases, and the CRISPR/Cas9 system. We then summarize the progress made in applying genome editing to various areas of gene and cell therapy, including antiviral strategies, immunotherapies, and the treatment of monogenic hereditary disorders. The current challenges and future prospects for genome editing as a transformative technology for gene and cell therapy are also discussed. PMID:26755333

  11. Genome-editing Technologies for Gene and Cell Therapy.

    Science.gov (United States)

    Maeder, Morgan L; Gersbach, Charles A

    2016-03-01

    Gene therapy has historically been defined as the addition of new genes to human cells. However, the recent advent of genome-editing technologies has enabled a new paradigm in which the sequence of the human genome can be precisely manipulated to achieve a therapeutic effect. This includes the correction of mutations that cause disease, the addition of therapeutic genes to specific sites in the genome, and the removal of deleterious genes or genome sequences. This review presents the mechanisms of different genome-editing strategies and describes each of the common nuclease-based platforms, including zinc finger nucleases, transcription activator-like effector nucleases (TALENs), meganucleases, and the CRISPR/Cas9 system. We then summarize the progress made in applying genome editing to various areas of gene and cell therapy, including antiviral strategies, immunotherapies, and the treatment of monogenic hereditary disorders. The current challenges and future prospects for genome editing as a transformative technology for gene and cell therapy are also discussed.

  12. A unified gene catalog for the laboratory mouse reference genome.

    Science.gov (United States)

    Zhu, Y; Richardson, J E; Hale, P; Baldarelli, R M; Reed, D J; Recla, J M; Sinclair, R; Reddy, T B K; Bult, C J

    2015-08-01

    We report here a semi-automated process by which mouse genome feature predictions and curated annotations (i.e., genes, pseudogenes, functional RNAs, etc.) from Ensembl, NCBI and Vertebrate Genome Annotation database (Vega) are reconciled with the genome features in the Mouse Genome Informatics (MGI) database (http://www.informatics.jax.org) into a comprehensive and non-redundant catalog. Our gene unification method employs an algorithm (fjoin, for "feature join") for efficient detection of genome coordinate overlaps among features represented in two annotation data sets. Following the analysis with fjoin, genome features are binned into six possible categories (1:1, 1:0, 0:1, 1:n, n:1, n:m) based on coordinate overlaps. These categories are subsequently prioritized for assessment of annotation equivalencies and differences. The version of the unified catalog reported here contains more than 59,000 entries, including 22,599 protein-coding genes, 12,455 pseudogenes, and 24,007 other feature types (e.g., microRNAs, lincRNAs, etc.). More than 23,000 of the entries in the MGI gene catalog have equivalent gene models in the annotation files obtained from NCBI, Vega, and Ensembl. 12,719 of the features are unique to NCBI relative to Ensembl/Vega; 11,957 are unique to Ensembl/Vega relative to NCBI, and 3095 are unique to MGI. More than 4000 genome features fall into categories that require manual inspection to resolve structural differences in the gene models from different annotation sources. Using the MGI unified gene catalog, researchers can easily generate a comprehensive report of mouse genome features from a single source and compare the details of gene and transcript structure using MGI's mouse genome browser.
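
    The six overlap categories named above can be sketched as follows. This is an illustrative sketch only and does not reproduce fjoin: it uses a simple all-against-all comparison rather than an efficient overlap algorithm, it ignores chromosome and strand, and the names `bin_features`, `label`, and the feature sets "A" and "B" are hypothetical. Features are grouped into connected clusters of mutually overlapping intervals, and each cluster is labelled by how many features it contains from each set.

```python
from collections import defaultdict


def overlaps(a, b):
    """True if closed intervals a=(start, end) and b=(start, end) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def label(n_a, n_b):
    """Map feature counts from the two sets to 1:1, 1:0, 0:1, 1:n, n:1 or n:m."""
    left = str(n_a) if n_a <= 1 else "n"
    right = str(n_b) if n_b <= 1 else ("m" if n_a > 1 else "n")
    return f"{left}:{right}"


def bin_features(set_a, set_b):
    """Group overlapping features and label each group.

    set_a, set_b: dict of feature name -> (start, end), assumed to lie on
    the same chromosome. Returns a list of (label, names_a, names_b).
    """
    adj = defaultdict(set)
    for name_a, iv_a in set_a.items():
        for name_b, iv_b in set_b.items():
            if overlaps(iv_a, iv_b):
                adj[("A", name_a)].add(("B", name_b))
                adj[("B", name_b)].add(("A", name_a))

    nodes = [("A", n) for n in set_a] + [("B", n) for n in set_b]
    seen, result = set(), []
    for node in nodes:
        if node in seen:
            continue
        # Depth-first walk over the overlap graph to collect one cluster.
        seen.add(node)
        stack, component = [node], []
        while stack:
            current = stack.pop()
            component.append(current)
            for nxt in adj[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        names_a = sorted(n for src, n in component if src == "A")
        names_b = sorted(n for src, n in component if src == "B")
        result.append((label(len(names_a), len(names_b)), names_a, names_b))
    return result


# Hypothetical coordinates for two annotation sets.
a = {"a1": (100, 200), "a2": (300, 400), "a3": (1000, 1100),
     "a4": (2000, 2100), "a5": (2150, 2300)}
b = {"b1": (150, 250), "b2": (350, 360), "b3": (380, 420),
     "b4": (5000, 5100), "b5": (2050, 2200)}
for row in bin_features(a, b):
    print(row)
```

    In a scheme like this, 1:1 clusters are the natural candidates for automatic equivalence, while 1:n, n:1 and n:m clusters are the ones most likely to need manual inspection.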

  13. Genome-wide association study of theta band event-related oscillations identifies serotonin receptor gene HTR7 influencing risk of alcohol dependence.

    Science.gov (United States)

    Zlojutro, Mark; Manz, Niklas; Rangaswamy, Madhavi; Xuei, Xiaoling; Flury-Wetherill, Leah; Koller, Daniel; Bierut, Laura J; Goate, Alison; Hesselbrock, Victor; Kuperman, Samuel; Nurnberger, John; Rice, John P; Schuckit, Marc A; Foroud, Tatiana; Edenberg, Howard J; Porjesz, Bernice; Almasy, Laura

    2011-01-01

    Event-related brain oscillations (EROs) represent highly heritable neuroelectrical correlates of human perception and cognitive performance that exhibit marked deficits in patients with various psychiatric disorders. We report the results of the first genome-wide association study (GWAS) of an ERO endophenotype, the frontal theta ERO evoked by visual oddball targets during the P300 response, in 1,064 unrelated individuals drawn from a study of alcohol dependence. Forty-two SNPs of the Illumina HumanHap 1M microarray were selected from the theta ERO GWAS for replication in family-based samples (N = 1,095), with four markers revealing nominally significant association. The most significant marker from the two-stage study is rs4907240, located within the ARID protein 5A gene (ARID5A) on chromosome 2q11 (unadjusted, Fisher's combined P = 3.68 × 10⁻⁶). However, the most intriguing association to emerge is with rs7916403 in the serotonin receptor gene HTR7 on chromosome 10q23 (combined P = 1.53 × 10⁻⁴), implicating the serotonergic system in the neurophysiological underpinnings of theta EROs. Moreover, promising SNPs were tested for association with diagnoses of alcohol dependence (DSM-IV), revealing a significant relationship with the HTR7 polymorphism among GWAS case-controls (P = 0.008). Significant recessive genetic effects were also detected for alcohol dependence in both case-control and family-based samples (P = 0.031 and 0.042, respectively), with the HTR7 risk allele corresponding to theta ERO reductions among homozygotes. These results suggest a role of the serotonergic system in the biological basis of alcohol dependence and underscore the utility of analyzing brain oscillations as a powerful approach to understanding complex genetic psychiatric disorders.
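
    The abstract reports Fisher's combined P-values for the two-stage design. As a hedged illustration of how Fisher's method combines independent P-values in general, the sketch below computes the statistic X = -2 Σ ln(p_i) and its upper-tail probability under a chi-square distribution with 2k degrees of freedom, using the closed form available for even degrees of freedom. The function name `fisher_combined_p` and the example inputs are hypothetical; the per-stage P-values of the study are not given in the abstract and are not reproduced here.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P-values with Fisher's method.

    X = -2 * sum(ln p_i) follows a chi-square distribution with 2k degrees
    of freedom under the null. For even degrees of freedom the upper tail is
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!, so no external library is needed.
    """
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical example: a discovery-stage and a replication-stage P-value.
print(fisher_combined_p([0.01, 0.01]))
```

    For two P-values the formula reduces to p1·p2·(1 − ln(p1·p2)), which makes clear that the combined value is larger than the simple product of the two.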

  14. A genome-wide association and gene-environment interaction study for serum triglycerides levels in a healthy Chinese male population.

    Science.gov (United States)

    Tan, Aihua; Sun, Jielin; Xia, Ning; Qin, Xue; Hu, Yanling; Zhang, Shijun; Tao, Sha; Gao, Yong; Yang, Xiaobo; Zhang, Haiying; Kim, Seong-Tae; Peng, Tao; Lin, Xiaoling; Li, Li; Mo, Linjian; Liang, Zhengjia; Shi, Deyi; Huang, Zhang; Huang, Xianghua; Liu, Ming; Ding, Qiang; Trent, Jeffrey M; Zheng, S Lilly; Mo, Zengnan; Xu, Jianfeng

    2012-04-01

    Triglyceride (TG) is a complex phenotype influenced by both genetic and environmental factors. Recent genome-wide association studies (GWAS) have identified genes or loci affecting lipid levels; however, such studies in Chinese populations are limited. A two-stage GWAS was conducted to identify genetic variants that were associated with TG in a Chinese population of 3495 men. Gene-environment interactions on serum TG levels were further investigated for the seven single nucleotide polymorphisms (SNPs) that were studied in both stages. Two previously reported SNPs (rs651821 in APOA5, rs328 in LPL) were replicated in the second stage, and the combined P-values were 9.19 × 10⁻²⁶ and 1.41 × 10⁻⁹ for rs651821 and rs328, respectively. More importantly, a significant interaction between aldehyde dehydrogenase 2 (ALDH2) rs671 and alcohol consumption on serum TG levels was observed (P = 3.34 × 10⁻⁵). Rs671 was significantly associated with serum TG levels in drinkers (P = 1.90 × 10⁻¹⁰), while no association was observed in non-drinkers (P > 0.05). Among drinkers, men carrying the AA/AG genotype had significantly lower serum TG levels than men carrying the GG genotype. For men with the GG genotype, serum TG levels increased with the quantity of alcohol intake (P = 1.28 × 10⁻⁸ for trend test). We identified a novel, significant interaction effect between alcohol consumption and the ALDH2 rs671 polymorphism on TG levels, which suggests that the effect of alcohol intake on TG occurs in a two-faceted manner. Just one drink can increase TG level in susceptible individuals who carry the GG genotype, while individuals carrying AA/AG genotypes may actually benefit from moderate drinking.

  15. Genome wide association (GWA) study for early onset extreme obesity supports the role of fat mass and obesity associated gene (FTO) variants.

    Directory of Open Access Journals (Sweden)

    Anke Hinney

    Full Text Available BACKGROUND: Obesity is a major health problem. Although heritability is substantial, genetic mechanisms predisposing to obesity are not very well understood. We have performed a genome wide association study (GWA) for early onset (extreme) obesity. METHODOLOGY/PRINCIPAL FINDINGS: a) GWA (Genome-Wide Human SNP Array 5.0 comprising 440,794 single nucleotide polymorphisms) for early onset extreme obesity based on 487 extremely obese young German individuals and 442 healthy lean German controls; b) confirmatory analyses on 644 independent families with at least one obese offspring and both parents. We aimed to identify and subsequently confirm the 15 SNPs (minor allele frequency ≥10%) with the lowest p-values of the GWA by four genetic models: additive, recessive, dominant and allelic. Six single nucleotide polymorphisms (SNPs) in FTO (fat mass and obesity associated gene) within one linkage disequilibrium (LD) block including the GWA SNP rendering the lowest p-value (rs1121980; log-additive model: nominal p = 1.13 x 10(-7), corrected p = 0.0494; odds ratio (OR)(CT) 1.67, 95% confidence interval (CI) 1.22-2.27; OR(TT) 2.76, 95% CI 1.88-4.03) belonged to the 15 SNPs showing the strongest evidence for association with obesity. For confirmation we genotyped 11 of these in the 644 independent families (of the six FTO SNPs we chose only two representing the LD block). For both FTO SNPs the initial association was confirmed (both Bonferroni corrected p<0.01). However, none of the nine non-FTO SNPs revealed significant transmission disequilibrium. CONCLUSIONS/SIGNIFICANCE: Our GWA for extreme early onset obesity substantiates that variation in FTO strongly contributes to early onset obesity. This is a further proof of concept for GWA to detect genes relevant for highly complex phenotypes. We concurrently show that nine additional SNPs with initially low p-values in the GWA were not confirmed in our family study, thus suggesting that of the best 15 SNPs in the GWA only

  16. Genome-wide analysis of regions similar to promoters of histone genes

    KAUST Repository

    Chowdhary, Rajesh

    2010-05-28

    Background: The purpose of this study is to: i) develop a computational model of promoters of human histone-encoding genes (histone genes for short), an important class of genes that participate in various critical cellular processes, ii) use the model so developed to identify regions across the human genome that have a structure similar to that of promoters of histone genes; such regions could represent potential genomic regulatory regions, e.g. promoters, of genes that may be coregulated with histone genes, and iii) identify in this way genes that have a high likelihood of being coregulated with the histone genes. Results: We successfully developed a histone promoter model using a comprehensive collection of histone genes. Based on a leave-one-out cross-validation test, the model produced good prediction accuracy (94.1% sensitivity, 92.6% specificity, and 92.8% positive predictive value). We used this model to predict across the genome a number of genes that shared similar promoter structures with the histone gene promoters. We thus hypothesize that these predicted genes could be coregulated with histone genes. This hypothesis matches well with the available gene expression, gene ontology, and pathways data. Jointly with promoters of the above-mentioned genes, we found a large number of intergenic regions with a structure similar to that of histone promoters. Conclusions: This study represents one of the most comprehensive computational analyses conducted thus far on a genome-wide scale of promoters of human histone genes. Our analysis suggests a number of other human genes that share a high similarity of promoter structure with the histone genes and thus are highly likely to be coregulated, and consequently coexpressed, with the histone genes. We also found that there are a large number of intergenic regions across the genome with structures similar to promoters of histone genes. These regions may be promoters of yet unidentified genes, or may represent remote control regions that
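
    The three accuracy figures quoted in this abstract follow from a confusion matrix accumulated over leave-one-out folds. The sketch below shows only that bookkeeping. The threshold classifier, the scores and the labels are hypothetical stand-ins, not the histone promoter model or its data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, positive predictive value)."""
    sensitivity = tp / (tp + fn)   # share of true positives recovered
    specificity = tn / (tn + fp)   # share of true negatives recovered
    ppv = tp / (tp + fp)           # share of positive calls that are correct
    return sensitivity, specificity, ppv


def leave_one_out_counts(samples, labels, train, predict):
    """Hold out each sample once, train on the rest, and tally the outcomes."""
    tp = fp = tn = fn = 0
    for i in range(len(samples)):
        model = train(samples[:i] + samples[i + 1:], labels[:i] + labels[i + 1:])
        guess = predict(model, samples[i])
        if guess == 1 and labels[i] == 1:
            tp += 1
        elif guess == 1 and labels[i] == 0:
            fp += 1
        elif guess == 0 and labels[i] == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Hypothetical toy classifier: threshold halfway between the two class means.
def train_threshold(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0


# Invented promoter-similarity scores and labels, for illustration only.
scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
is_histone_like = [1, 1, 1, 0, 0, 0]
counts = leave_one_out_counts(scores, is_histone_like, train_threshold, predict_threshold)
metrics = classification_metrics(*counts)
```

    Each sample is scored by a model that never saw it, so the resulting metrics estimate performance on unseen promoters rather than fit to the training set.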

  17. Effect of Genome Position on Heterologous Gene Expression in Bacillus subtilis: An Unbiased Analysis

    NARCIS (Netherlands)

    Sauer, C.; Syvertsson, S.; Bohorquez, L.C.; Cruz, R.; Harwood, C.R.; van Rij, T.; Hamoen, L.W.

    2016-01-01

    A fixed gene copy number is important for the in silico construction of engineered synthetic networks. However, the copy number of integrated genes depends on their genomic location. This gene dosage effect is rarely addressed in synthetic biology. Two studies in Escherichia coli presented conflicti

  18. Prevalent role of gene features in determining evolutionary fates of whole-genome duplication duplicated genes in flowering plants.

    Science.gov (United States)

    Jiang, Wen-kai; Liu, Yun-long; Xia, En-hua; Gao, Li-zhi

    2013-04-01

    The evolution of genes and genomes after polyploidization has been the subject of extensive studies in evolutionary biology and plant sciences. While a significant number of duplicated genes are rapidly removed during a process called fractionation, which operates after the whole-genome duplication (WGD), another considerable number of genes are retained preferentially, leading to the phenomenon of biased gene retention. However, the evolutionary mechanisms underlying gene retention after WGD remain largely unknown. Through genome-wide analyses of sequence and functional data, we comprehensively investigated the relationships between gene features and the retention probability of duplicated genes after WGDs in six plant genomes, Arabidopsis (Arabidopsis thaliana), poplar (Populus trichocarpa), soybean (Glycine max), rice (Oryza sativa), sorghum (Sorghum bicolor), and maize (Zea mays). The results showed that multiple gene features were correlated with the probability of gene retention. Using a logistic regression model based on principal component analysis, we resolved evolutionary rate, structural complexity, and GC3 content as the three major contributors to gene retention. Cluster analysis of these features further classified retained genes into three distinct groups in terms of gene features and evolutionary behaviors. Type I genes are more prone to be selected by dosage balance; type II genes are possibly subject to subfunctionalization; and type III genes may serve as potential targets for neofunctionalization. This study highlights that gene features are able to act jointly as primary forces when determining the retention and evolution of WGD-derived duplicated genes in flowering plants. These findings thus may help to provide a resolution to the debate on different evolutionary models of gene fates after WGDs.

  19. Re-evaluation of ABO gene polymorphisms detected in a genome-wide association study and risk of pancreatic ductal adenocarcinoma in a Chinese population

    Institute of Scientific and Technical Information of China (English)

    Hong-Li Xu; Jia-Rong Cheng; Wei Zhang; Jing Wang; Herbert Yu; Quan-Xing Ni; Harvey A. Risch; Yu-Tang Gao

    2014-01-01

    Pancreatic cancer is a fatal malignancy with an increasing incidence in Shanghai, China. A genome-wide association study (GWAS) and other work have shown that ABO alleles are associated with pancreatic cancer risk. We conducted a population-based case-control study involving 256 patients with pathologically confirmed pancreatic ductal adenocarcinoma (PDAC) and 548 healthy controls in Shanghai, China, to assess the relationships between GWAS-identified ABO alleles and risk of PDAC. Carriers of the C allele of rs505922 had an increased cancer risk [adjusted odds ratio (OR) = 1.42, 95% confidence interval (CI): 1.02-1.98] compared to TT carriers. The T alleles of rs495828 and rs657152 were also significantly associated with an elevated cancer risk (adjusted OR = 1.58, 95% CI: 1.17-2.14; adjusted OR = 1.51, 95% CI: 1.09-2.10). The rs630014 variant was not associated with risk. We did not find any significant gene-environment interaction with cancer risk using a multifactor dimensionality reduction (MDR) method. Haplotype analysis also showed that the haplotype CTTC was associated with an increased risk of PDAC (adjusted OR = 1.46, 95% CI: 1.12-1.91) compared with haplotype TGGT. GWAS-identified ABO variants are thus also associated with risk of PDAC in the Chinese population.
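
    For readers unfamiliar with the statistic, an unadjusted odds ratio and its 95% confidence interval can be computed from a 2x2 case-control table as sketched below, using the standard log-scale (Woolf) interval. The counts are invented for illustration. The abstract reports adjusted ORs, which come from a covariate-adjusted model and are not reproduced by this calculation.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval."""
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical carrier counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(60, 40, 50, 50)
```

    An interval that excludes 1, as for the rs505922 result quoted above, is what marks an association as significant at the 5% level.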

  20. A genome wide association study for backfat thickness in Italian Large White pigs highlights new regions affecting fat deposition including neuronal genes

    Directory of Open Access Journals (Sweden)

    Fontanesi Luca

    2012-11-01

    Full Text Available Abstract Background Carcass fatness is an important trait in most pig breeding programs. Following market requests, breeding plans for fresh pork consumption are usually designed to reduce carcass fat content and increase lean meat deposition. However, the Italian pig industry is mainly devoted to the production of Protected Designation of Origin dry cured hams: pigs are slaughtered at around 160 kg of live weight and the breeding goal aims at maintaining fat coverage, measured as backfat thickness to avoid excessive desiccation of the hams. This objective has shaped the genetic pool of Italian heavy pig breeds for a few decades. In this study we applied a selective genotyping approach within a population of ~ 12,000 performance tested Italian Large White pigs. Within this population, we selectively genotyped 304 pigs with extreme and divergent backfat thickness estimated breeding value by the Illumina PorcineSNP60 BeadChip and performed a genome wide association study to identify loci associated to this trait. Results We identified 4 single nucleotide polymorphisms with P≤5.0E-07 and additional 119 ones with 5.0E-07 Conclusions Further investigations are needed to evaluate the effects of the identified single nucleotide polymorphisms associated with backfat thickness on other traits as a pre-requisite for practical applications in breeding programs. Reported results could improve our understanding of the biology of fat metabolism and deposition that could also be relevant for other mammalian species including humans, confirming the role of neuronal genes on obesity.

  1. A genome-wide gene-by-trauma interaction study of alcohol misuse in two independent cohorts identifies PRKG1 as a risk locus.

    Science.gov (United States)

    Polimanti, R; Kaufman, J; Zhao, H; Kranzler, H R; Ursano, R J; Kessler, R C; Gelernter, J; Stein, M B

    2017-03-07

    Traumatic life experiences are associated with alcohol use problems, an association that is likely to be moderated by genetic predisposition. To understand these interactions, we conducted a gene-by-environment genome-wide interaction study (GEWIS) of alcohol use problems in two independent samples, the Army STARRS (STARRS, N=16 361) and the Yale-Penn (N=8084) cohorts. Because the two cohorts were assessed using different instruments, we derived separate dimensional alcohol misuse scales and applied a proxy-phenotype study design. In African-American subjects, we identified an interaction of PRKG1 rs1729578 with trauma exposure in the STARRS cohort and replicated its interaction with trauma exposure in the Yale-Penn cohort (discovery-replication meta-analysis: z=5.64, P=1.69 × 10(-8)). PRKG1 encodes cyclic GMP-dependent protein kinase 1, which is involved in learning, memory and circadian rhythm regulation. Considering the loci identified in stage-1 that showed same effect directions in stage-2, the gene ontology (GO) enrichment analysis showed several significant results, including calcium-activated potassium channels (GO:0016286; P=2.30 × 10(-5)), cognition (GO:0050890; P=1.90 × 10(-6)), locomotion (GO:0040011; P=6.70 × 10(-5)) and Stat3 protein regulation (GO:0042517; P=6.4 × 10(-5)). To our knowledge, this is the largest GEWIS performed in psychiatric genetics, and the first GEWIS examining risk for alcohol misuse. Our results add to a growing body of literature highlighting the dynamic impact of experience on individual genetic risk. Molecular Psychiatry advance online publication, 7 March 2017; doi:10.1038/mp.2017.24.

  2. Genome Variability and Gene Content in Chordopoxviruses: Dependence on Microsatellites

    Science.gov (United States)

    Hatcher, Eneida L.; Wang, Chunlin; Lefkowitz, Elliot J.

    2015-01-01

    To investigate gene loss in poxviruses belonging to the Chordopoxvirinae subfamily, we assessed the gene content of representative members of the subfamily, and determined whether individual genes present in each genome were intact, truncated, or fragmented. When nonintact genes were identified, the early stop mutations (ESMs) leading to gene truncation or fragmentation were analyzed. Of all the ESMs present in these poxvirus genomes, over 65% co-localized with microsatellites—simple sequence nucleotide repeats. On average, microsatellites comprise 24% of the nucleotide sequence of these poxvirus genomes. These simple repeats have been shown to exhibit high rates of variation, and represent a target for poxvirus protein variation, gene truncation, and reductive evolution. PMID:25912716
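
    The co-localization figure above depends on how microsatellites are located in a sequence. The sketch below shows one way to find simple repeats with a regular expression and to count how many mutation positions fall inside them. The unit-length range (1 to 6 nt) and the minimum of three tandem copies are assumptions made for illustration, not the thresholds used in the study, and the sequence and positions are invented.

```python
import re


def find_microsatellites(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return (start, end, unit) for each tandem repeat of a short unit.

    The lazy quantifier prefers the shortest repeating unit; the
    backreference then requires at least min_repeats - 1 further copies.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq)]


def colocalized_fraction(positions, repeats):
    """Fraction of positions lying inside any half-open repeat interval."""
    if not positions:
        return 0.0
    inside = sum(
        1 for p in positions if any(start <= p < end for start, end, _ in repeats)
    )
    return inside / len(positions)


# Invented sequence and mutation coordinates (0-based).
repeats = find_microsatellites("GGATATATATCC")
fraction = colocalized_fraction([3, 11], repeats)
```

    Results from a scan like this are sensitive to the repeat definition, so any comparison with the reported percentage would need the study's own criteria.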

  3. Genome engineering and gene expression control for bacterial strain development.

    Science.gov (United States)

    Song, Chan Woo; Lee, Joungmin; Lee, Sang Yup

    2015-01-01

    In recent years, a number of techniques and tools have been developed for genome engineering and gene expression control to achieve desired phenotypes of various bacteria. Here we review and discuss the recent advances in bacterial genome manipulation and gene expression control techniques, and their actual uses with accompanying examples. Genome engineering has been commonly performed based on homologous recombination. During such genome manipulation, the counterselection systems employing SacB or nucleases have mainly been used for the efficient selection of desired engineered strains. The recombineering technology enables simple and more rapid manipulation of the bacterial genome. The group II intron-mediated genome engineering technology is another option for some bacteria that are difficult to be engineered by homologous recombination. Due to the increasing demands on high-throughput screening of bacterial strains having the desired phenotypes, several multiplex genome engineering techniques have recently been developed and validated in some bacteria. Another approach to achieve desired bacterial phenotypes is the repression of target gene expression without the modification of genome sequences. This can be performed by expressing antisense RNA, small regulatory RNA, or CRISPR RNA to repress target gene expression at the transcriptional or translational level. All of these techniques allow efficient and rapid development and screening of bacterial strains having desired phenotypes, and more advanced techniques are expected to be seen.

  4. Genome Diversification Mechanism of Rodent and Lagomorpha Chemokine Genes

    Directory of Open Access Journals (Sweden)

    Kanako Shibata

    2013-01-01

    Full Text Available Chemokines are a large family of small cytokines that are involved in host defence and body homeostasis through recruitment of cells expressing their receptors. Their genes are known to undergo rapid evolution. Therefore, the number and content of chemokine genes can be quite diverse among the different species, making the orthologous relationships often ambiguous even between closely related species. Given that rodents and rabbit are useful experimental models in medicine and drug development, we have deduced the chemokine genes from the genome sequences of several rodent species and rabbit and compared them with those of human and mouse to determine the orthologous relationships. The interspecies differences should be taken into consideration when experimental results from animal models are extrapolated into humans. The chemokine gene lists and their orthologous relationships presented here will be useful for studies using these animal models. Our analysis also enables us to reconstruct possible gene duplication processes that generated the different sets of chemokine genes in these species.

  5. The Symbiodinium kawagutii genome illuminates dinoflagellate gene expression and coral symbiosis

    DEFF Research Database (Denmark)

    Lin, Senjie; Cheng, Shifeng; Song, Bo

    2015-01-01

    Dinoflagellates are important components of marine ecosystems and essential coral symbionts, yet little is known about their genomes. We report here on the analysis of a high-quality assembly from the 1180-megabase genome of Symbiodinium kawagutii. We annotated protein-coding genes and identified...... Symbiodinium-specific gene families. No whole-genome duplication was observed, but instead we found active (retro) transposition and gene family expansion, especially in processes important for successful symbiosis with corals. We also documented genes potentially governing sexual reproduction and cyst...... formation, novel promoter elements, and a microRNA system potentially regulating gene expression in both symbiont and coral. We found biochemical complementarity between genomes of S. kawagutii and the anthozoan Acropora, indicative of host-symbiont coevolution, providing a resource for studying...

  6. Inheritable and precise large genomic deletions of non-coding RNA genes in zebrafish using TALENs.

    Directory of Open Access Journals (Sweden)

    Yun Liu

    Full Text Available Transcription activator-like effector nucleases (TALENs) have so far been applied to disrupt protein-coding genes which constitute only 2-3% of the genome in animals. The majority (70-90%) of the animal genome is actually transcribed as non-coding RNAs (ncRNAs), yet the lack of efficient tools to knockout ncRNA genes hinders studies on their in vivo functions. Here we have developed novel strategies using TALENs to achieve precise and inheritable large genomic deletions and knockout of ncRNA genes in zebrafish. We have demonstrated that individual miRNA genes could be disrupted using one pair of TALENs, whereas large microRNA (miRNA) gene clusters and long non-coding RNA (lncRNA) genes could be precisely deleted using two pairs of TALENs. We have generated large genomic deletions of two miRNA clusters (the 1.2 kb miR-17-92 cluster and the 79.8 kb miR-430 cluster) and one long non-coding RNA (lncRNA) gene (the 9.0 kb malat1), and the deletions are transmitted through the germline. Taken together, our results establish TALENs as a robust tool to engineer large genomic deletions and knockout of ncRNA genes, thus opening up new avenues in the application of TALENs to study the genome in vivo.

  7. Comparative genomics of rhizobia nodulating soybean suggests extensive recruitment of lineage-specific genes in adaptations.

    Science.gov (United States)

    Tian, Chang Fu; Zhou, Yuan Jie; Zhang, Yan Ming; Li, Qin Qin; Zhang, Yun Zeng; Li, Dong Fang; Wang, Shuang; Wang, Jun; Gilbert, Luz B; Li, Ying Rui; Chen, Wen Xin

    2012-05-29

    The rhizobium-legume symbiosis has been widely studied as the model of mutualistic evolution and the essential component of sustainable agriculture. Extensive genetic and recent genomic studies have led to the hypothesis that many distinct strategies, regardless of rhizobial phylogeny, contributed to the varied rhizobium-legume symbiosis. We sequenced 26 genomes of Sinorhizobium and Bradyrhizobium nodulating soybean to test this hypothesis. The Bradyrhizobium core genome is disproportionally enriched in lipid and secondary metabolism, whereas several gene clusters known to be involved in osmoprotection and adaptation to alkaline pH are specific to the Sinorhizobium core genome. These features are consistent with biogeographic patterns of these bacteria. Surprisingly, no genes are specifically shared by these soybean microsymbionts compared with other legume microsymbionts. On the other hand, phyletic patterns of 561 known symbiosis genes of rhizobia reflected the species phylogeny of these soybean microsymbionts and other rhizobia. Similar analyses with 887 known functional genes or the whole pan genome of rhizobia revealed that only the phyletic distribution of functional genes was consistent with the species tree of rhizobia. Further evolutionary genetics revealed that recombination dominated the evolution of core genome. Taken together, our results suggested that faithfully vertical genes were rare compared with those with history of recombination including lateral gene transfer, although rhizobial adaptations to symbiotic interactions and other environmental conditions extensively recruited lineage-specific shell genes under direct or indirect control through the speciation process.

  8. Genomic sequence around butterfly wing development genes: annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Inês C Conceição

    Full Text Available BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high

  9. GenePRIMP: A GENE PRediction IMprovement Pipeline for Prokaryotic genomes

    Energy Technology Data Exchange (ETDEWEB)

    Pati, Amrita; Ivanova, Natalia N.; Mikhailova, Natalia; Ovchinnikova, Galina; Hooper, Sean D.; Lykidis, Athanasios; Kyrpides, Nikos C.

    2010-04-01

    We present 'gene prediction improvement pipeline' (GenePRIMP; http://geneprimp.jgi-psf.org/), a computational process that performs evidence-based evaluation of gene models in prokaryotic genomes and reports anomalies including inconsistent start sites, missed genes and split genes. We found that manual curation of gene models using the anomaly reports generated by GenePRIMP improved their quality, and demonstrate the applicability of GenePRIMP in improving finishing quality and comparing different genome-sequencing and annotation technologies.

  10. Common genetic variation near the phospholamban gene is associated with cardiac repolarisation: meta-analysis of three genome-wide association studies.

    Directory of Open Access Journals (Sweden)

    Ilja M Nolte

    Full Text Available To identify loci affecting the electrocardiographic QT interval, a measure of cardiac repolarisation associated with risk of ventricular arrhythmias and sudden cardiac death, we conducted a meta-analysis of three genome-wide association studies (GWAS) including 3,558 subjects from the TwinsUK and BRIGHT cohorts in the UK and the DCCT/EDIC cohort from North America. Five loci were significantly associated with QT interval at P<1x10(-6). To validate these findings we performed an in silico comparison with data from two QT consortia: QTSCD (n = 15,842) and QTGEN (n = 13,685). Analysis confirmed the association between common variants near NOS1AP (P = 1.4x10(-83)) and the phospholamban (PLN) gene (P = 1.9x10(-29)). The most associated SNP near NOS1AP (rs12143842) explains 0.82% variance; the SNP near PLN (rs11153730) explains 0.74% variance of QT interval duration. We found no evidence for interaction between these two SNPs (P = 0.99). PLN is a key regulator of cardiac diastolic function and is involved in regulating intracellular calcium cycling; it has only recently been identified as a susceptibility locus for QT interval. These data offer further mechanistic insights into genetic influence on the QT interval which may predispose to life threatening arrhythmias and sudden cardiac death.

  11. Genomic location and characterisation of MIC genes in cattle.

    Science.gov (United States)

    Birch, James; De Juan Sanjuan, Cristina; Guzman, Efrain; Ellis, Shirley A

    2008-08-01

    Major histocompatibility complex (MHC) class I chain-related (MIC) genes have been previously identified and characterised in human. They encode polymorphic class I-like molecules that are stress-inducible, and constitute one of the ligands of the activating natural killer cell receptor NKG2D. We have identified three MIC genes within the cattle genome, located close to three non-classical MHC class I genes. The genomic position relative to other genes is very similar to the arrangement reported in the pig MHC region. Analysis of MIC cDNA sequences derived from a range of cattle cell lines suggest there may be four MIC genes in total. We have investigated the presence of the genes in distinct and well-defined MHC haplotypes, and show that one gene is consistently present, while configuration of the other three genes appears variable.

  12. Distinct gene number-genome size relationships for eukaryotes and non-eukaryotes: gene content estimation for dinoflagellate genomes.

    Directory of Open Access Journals (Sweden)

    Yubo Hou

    Full Text Available The ability to predict gene content is highly desirable for characterization of not-yet sequenced genomes like those of dinoflagellates. Using data from completely sequenced and annotated genomes from phylogenetically diverse lineages, we investigated the relationship between gene content and genome size using regression analyses. Distinct relationships between log(10)-transformed protein-coding gene number (Y') versus log(10)-transformed genome size (X', genome size in kbp) were found for eukaryotes and non-eukaryotes. Eukaryotes best fit a logarithmic model, Y' = ln(-46.200+22.678X'), whereas non-eukaryotes a linear model, Y' = 0.045+0.977X', both with high significance (p0.91. Total gene number shows similar trends in both groups to their respective protein coding regressions. The distinct correlations reflect lower and decreasing gene-coding percentages as genome size increases in eukaryotes (82%-1%) compared to higher and relatively stable percentages in prokaryotes and viruses (97%-47%). The eukaryotic regression models project that the smallest dinoflagellate genome (3x10(6) kbp) contains 38,188 protein-coding (40,086 total) genes and the largest (245x10(6) kbp) 87,688 protein-coding (92,013 total) genes, corresponding to 1.8% and 0.05% gene-coding percentages. These estimates do not likely represent extraordinarily high functional diversity of the encoded proteome but rather highly redundant genomes as evidenced by high gene copy numbers documented for various dinoflagellate species.
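
    The two regression models quoted in this abstract can be evaluated directly, as in the sketch below. It assumes that "ln" denotes the natural logarithm, that X' and Y' are base-10 logarithms as the abstract states, and it uses the coefficients as printed there. Because those coefficients are rounded, the outputs land near the abstract's projected gene counts but need not match them to the digit. The function names are illustrative.

```python
import math


def eukaryote_protein_genes(genome_kbp):
    """Logarithmic model: Y' = ln(-46.200 + 22.678 X'), with X' = log10(kbp)."""
    x = math.log10(genome_kbp)
    inner = -46.200 + 22.678 * x
    if inner <= 0:
        # The model is undefined for very small genomes (roughly below 110 kbp).
        raise ValueError("genome size outside the range of the logarithmic model")
    return 10 ** math.log(inner)


def non_eukaryote_protein_genes(genome_kbp):
    """Linear model: Y' = 0.045 + 0.977 X', with X' = log10(kbp)."""
    x = math.log10(genome_kbp)
    return 10 ** (0.045 + 0.977 * x)


# Genome sizes in kbp: the two dinoflagellate extremes named in the abstract,
# plus a hypothetical 4,600 kbp bacterial genome for the linear model.
smallest = eukaryote_protein_genes(3e6)
largest = eukaryote_protein_genes(245e6)
bacterial = non_eukaryote_protein_genes(4600)
```

    The logarithm inside the eukaryotic model is what makes predicted gene number level off as genomes grow, consistent with the falling gene-coding percentages described above.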

  13. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes

    Directory of Open Access Journals (Sweden)

    Yang Yi-Fan

    2007-03-01

    Full Text Available Abstract Background Despite a remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents further elucidation of differences among genomes. It continues to be interesting to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. Results This paper describes a new prokaryotic genefinding algorithm based on a comprehensive statistical model of protein coding Open Reading Frames (ORFs) and Translation Initiation Sites (TISs). The former is based on a linguistic "Entropy Density Profile" (EDP) model of coding DNA sequence and the latter comprises several relevant features related to the translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED) algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure, before making the prediction of genes. Conclusion Results of extensive tests show that MED 2.0 achieves a competitive high performance in the gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of the MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match with the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.

  14. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    Science.gov (United States)

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/.

  15. Evolution of genes and genomes on the Drosophila phylogeny.

    Science.gov (United States)

    Clark, Andrew G; Eisen, Michael B; Smith, Douglas R; Bergman, Casey M; Oliver, Brian; Markow, Therese A; Kaufman, Thomas C; Kellis, Manolis; Gelbart, William; Iyer, Venky N; Pollard, Daniel A; Sackton, Timothy B; Larracuente, Amanda M; Singh, Nadia D; Abad, Jose P; Abt, Dawn N; Adryan, Boris; Aguade, Montserrat; Akashi, Hiroshi; Anderson, Wyatt W; Aquadro, Charles F; Ardell, David H; Arguello, Roman; Artieri, Carlo G; Barbash, Daniel A; Barker, Daniel; Barsanti, Paolo; Batterham, Phil; Batzoglou, Serafim; Begun, Dave; Bhutkar, Arjun; Blanco, Enrico; Bosak, Stephanie A; Bradley, Robert K; Brand, Adrianne D; Brent, Michael R; Brooks, Angela N; Brown, Randall H; Butlin, Roger K; Caggese, Corrado; Calvi, Brian R; Bernardo de Carvalho, A; Caspi, Anat; Castrezana, Sergio; Celniker, Susan E; Chang, Jean L; Chapple, Charles; Chatterji, Sourav; Chinwalla, Asif; Civetta, Alberto; Clifton, Sandra W; Comeron, Josep M; Costello, James C; Coyne, Jerry A; Daub, Jennifer; David, Robert G; Delcher, Arthur L; Delehaunty, Kim; Do, Chuong B; Ebling, Heather; Edwards, Kevin; Eickbush, Thomas; Evans, Jay D; Filipski, Alan; Findeiss, Sven; Freyhult, Eva; Fulton, Lucinda; Fulton, Robert; Garcia, Ana C L; Gardiner, Anastasia; Garfield, David A; Garvin, Barry E; Gibson, Greg; Gilbert, Don; Gnerre, Sante; Godfrey, Jennifer; Good, Robert; Gotea, Valer; Gravely, Brenton; Greenberg, Anthony J; Griffiths-Jones, Sam; Gross, Samuel; Guigo, Roderic; Gustafson, Erik A; Haerty, Wilfried; Hahn, Matthew W; Halligan, Daniel L; Halpern, Aaron L; Halter, Gillian M; Han, Mira V; Heger, Andreas; Hillier, LaDeana; Hinrichs, Angie S; Holmes, Ian; Hoskins, Roger A; Hubisz, Melissa J; Hultmark, Dan; Huntley, Melanie A; Jaffe, David B; Jagadeeshan, Santosh; Jeck, William R; Johnson, Justin; Jones, Corbin D; Jordan, William C; Karpen, Gary H; Kataoka, Eiko; Keightley, Peter D; Kheradpour, Pouya; Kirkness, Ewen F; Koerich, Leonardo B; Kristiansen, Karsten; Kudrna, Dave; Kulathinal, Rob J; Kumar, Sudhir; Kwok, 
Roberta; Lander, Eric; Langley, Charles H; Lapoint, Richard; Lazzaro, Brian P; Lee, So-Jeong; Levesque, Lisa; Li, Ruiqiang; Lin, Chiao-Feng; Lin, Michael F; Lindblad-Toh, Kerstin; Llopart, Ana; Long, Manyuan; Low, Lloyd; Lozovsky, Elena; Lu, Jian; Luo, Meizhong; Machado, Carlos A; Makalowski, Wojciech; Marzo, Mar; Matsuda, Muneo; Matzkin, Luciano; McAllister, Bryant; McBride, Carolyn S; McKernan, Brendan; McKernan, Kevin; Mendez-Lago, Maria; Minx, Patrick; Mollenhauer, Michael U; Montooth, Kristi; Mount, Stephen M; Mu, Xu; Myers, Eugene; Negre, Barbara; Newfeld, Stuart; Nielsen, Rasmus; Noor, Mohamed A F; O'Grady, Patrick; Pachter, Lior; Papaceit, Montserrat; Parisi, Matthew J; Parisi, Michael; Parts, Leopold; Pedersen, Jakob S; Pesole, Graziano; Phillippy, Adam M; Ponting, Chris P; Pop, Mihai; Porcelli, Damiano; Powell, Jeffrey R; Prohaska, Sonja; Pruitt, Kim; Puig, Marta; Quesneville, Hadi; Ram, Kristipati Ravi; Rand, David; Rasmussen, Matthew D; Reed, Laura K; Reenan, Robert; Reily, Amy; Remington, Karin A; Rieger, Tania T; Ritchie, Michael G; Robin, Charles; Rogers, Yu-Hui; Rohde, Claudia; Rozas, Julio; Rubenfield, Marc J; Ruiz, Alfredo; Russo, Susan; Salzberg, Steven L; Sanchez-Gracia, Alejandro; Saranga, David J; Sato, Hajime; Schaeffer, Stephen W; Schatz, Michael C; Schlenke, Todd; Schwartz, Russell; Segarra, Carmen; Singh, Rama S; Sirot, Laura; Sirota, Marina; Sisneros, Nicholas B; Smith, Chris D; Smith, Temple F; Spieth, John; Stage, Deborah E; Stark, Alexander; Stephan, Wolfgang; Strausberg, Robert L; Strempel, Sebastian; Sturgill, David; Sutton, Granger; Sutton, Granger G; Tao, Wei; Teichmann, Sarah; Tobari, Yoshiko N; Tomimura, Yoshihiko; Tsolas, Jason M; Valente, Vera L S; Venter, Eli; Venter, J Craig; Vicario, Saverio; Vieira, Filipe G; Vilella, Albert J; Villasante, Alfredo; Walenz, Brian; Wang, Jun; Wasserman, Marvin; Watts, Thomas; Wilson, Derek; Wilson, Richard K; Wing, Rod A; Wolfner, Mariana F; Wong, Alex; Wong, Gane Ka-Shu; Wu, Chung-I; Wu, 
Gabriel; Yamamoto, Daisuke; Yang, Hsiao-Pei; Yang, Shiaw-Pyng; Yorke, James A; Yoshida, Kiyohito; Zdobnov, Evgeny; Zhang, Peili; Zhang, Yu; Zimin, Aleksey V; Baldwin, Jennifer; Abdouelleil, Amr; Abdulkadir, Jamal; Abebe, Adal; Abera, Brikti; Abreu, Justin; Acer, St Christophe; Aftuck, Lynne; Alexander, Allen; An, Peter; Anderson, Erica; Anderson, Scott; Arachi, Harindra; Azer, Marc; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Berlin, Aaron; Bessette, Daniel; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Bourzgui, Imane; Brown, Adam; Cahill, Patrick; Channer, Sheridon; Cheshatsang, Yama; Chuda, Lisa; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Costello, Maura; D'Aco, Katie; Daza, Riza; De Haan, Georgius; DeGray, Stuart; DeMaso, Christina; Dhargay, Norbu; Dooley, Kimberly; Dooley, Erin; Doricent, Missole; Dorje, Passang; Dorjee, Kunsang; Dupes, Alan; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Fisher, Sheila; Foley, Chelsea D; Franke, Alicia; Friedrich, Dennis; Gadbois, Loryn; Gearin, Gary; Gearin, Christina R; Giannoukos, Georgia; Goode, Tina; Graham, Joseph; Grandbois, Edward; Grewal, Sharleen; Gyaltsen, Kunsang; Hafez, Nabil; Hagos, Birhane; Hall, Jennifer; Henson, Charlotte; Hollinger, Andrew; Honan, Tracey; Huard, Monika D; Hughes, Leanne; Hurhula, Brian; Husby, M Erii; Kamat, Asha; Kanga, Ben; Kashin, Seva; Khazanovich, Dmitry; Kisner, Peter; Lance, Krista; Lara, Marcia; Lee, William; Lennon, Niall; Letendre, Frances; LeVine, Rosie; Lipovsky, Alex; Liu, Xiaohong; Liu, Jinlei; Liu, Shangtao; Lokyitsang, Tashi; Lokyitsang, Yeshi; Lubonja, Rakela; Lui, Annie; MacDonald, Pen; Magnisalis, Vasilia; Maru, Kebede; Matthews, Charles; McCusker, William; McDonough, Susan; Mehta, Teena; Meldrim, James; Meneus, Louis; Mihai, Oana; Mihalev, Atanas; Mihova, Tanya; Mittelman, Rachel; Mlenga, Valentine; Montmayeur, Anna; Mulrain, Leonidas; Navidi, Adam; Naylor, Jerome; Negash, Tamrat; Nguyen, 
Thu; Nguyen, Nga; Nicol, Robert; Norbu, Choe; Norbu, Nyima; Novod, Nathaniel; O'Neill, Barry; Osman, Sahal; Markiewicz, Eva; Oyono, Otero L; Patti, Christopher; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Raghuraman, Sujaa; Rege, Filip; Reyes, Rebecca; Rise, Cecil; Rogov, Peter; Ross, Keenan; Ryan, Elizabeth; Settipalli, Sampath; Shea, Terry; Sherpa, Ngawang; Shi, Lu; Shih, Diana; Sparrow, Todd; Spaulding, Jessica; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Strader, Christopher; Tesfaye, Senait; Thomson, Talene; Thoulutsang, Yama; Thoulutsang, Dawa; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Young, Geneva; Yu, Qing; Zembek, Lisa; Zhong, Danni; Zimmer, Andrew; Zwirko, Zac; Jaffe, David B; Alvarez, Pablo; Brockman, Will; Butler, Jonathan; Chin, CheeWhye; Gnerre, Sante; Grabherr, Manfred; Kleber, Michael; Mauceli, Evan; MacCallum, Iain

    2007-11-08

    Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the first time (sechellia, simulans, yakuba, erecta, ananassae, persimilis, willistoni, mojavensis, virilis and grimshawi), illustrate how rates and patterns of sequence divergence across taxa can illuminate evolutionary processes on a genomic scale. These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila species, we identified many putatively non-neutral changes in protein-coding genes, non-coding RNA genes, and cis-regulatory regions. These may prove to underlie differences in the ecology and behaviour of these diverse species.

  16. Testing the infinitely many genes model for the evolution of the bacterial core genome and pangenome.

    Science.gov (United States)

    Collins, R Eric; Higgs, Paul G

    2012-11-01

    When groups of related bacterial genomes are compared, the number of core genes found in all genomes is usually much less than the mean genome size, whereas the size of the pangenome (the set of genes found on at least one of the genomes) is much larger than the mean size of one genome. We analyze 172 complete genomes of Bacilli and compare the properties of the pangenomes and core genomes of monophyletic subsets taken from this group. We then assess the capabilities of several evolutionary models to predict these properties. The infinitely many genes (IMG) model is based on the assumption that each new gene can arise only once. The predictions of the model depend on the shape of the evolutionary tree that underlies the divergence of the genomes. We calculate results for coalescent trees, star trees, and arbitrary phylogenetic trees of predefined fixed branch length. On a star tree, the pangenome size increases linearly with the number of genomes, as has been suggested in some previous studies, whereas on a coalescent tree, it increases logarithmically. The coalescent tree gives a better fit to the data, for all the examples we consider. In some cases, a fixed phylogenetic tree proved better than the coalescent tree at reproducing structure in the gene frequency spectrum, but little improvement was gained in predictions of the core and pangenome sizes. Most of the data are well explained by a model with three classes of gene: an essential class that is found in all genomes, a slow class whose rate of origination and deletion is slow compared with the time of divergence of the genomes, and a fast class showing rapid origination and deletion. Although the majority of genes originating in a genome are in the fast class, these genes are not retained for long periods, and the majority of genes present in a genome are in the slow or essential classes. 
In general, we show that the IMG model is useful for comparison with experimental genome data both for species level and
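
    The core genome, pangenome and gene frequency spectrum discussed in this record can be computed directly from gene presence/absence data. The sketch below is a minimal illustration on made-up data; the function names and the toy genomes are hypothetical, and it does not implement the IMG model itself.

```python
from collections import Counter


def core_and_pangenome(genomes):
    """genomes maps a genome name to the collection of gene families it carries.

    Returns (core, pan): genes present in every genome, and genes
    present in at least one genome.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        raise ValueError("at least one genome is required")
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


def gene_frequency_spectrum(genomes):
    """Map k to the number of gene families found in exactly k genomes."""
    occurrences = Counter()
    for genes in genomes.values():
        occurrences.update(set(genes))
    return dict(Counter(occurrences.values()))


if __name__ == "__main__":
    toy = {
        "genome_A": {"g1", "g2", "g3"},
        "genome_B": {"g1", "g2", "g4"},
        "genome_C": {"g1", "g5"},
    }
    core, pan = core_and_pangenome(toy)
    print(len(core), len(pan), gene_frequency_spectrum(toy))
```

    Repeating the calculation on growing subsets of genomes gives the pangenome-size curve whose shape (linear or logarithmic) the abstract compares between tree models.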

  17. Genome-wide experimental determination of barriers to horizontal gene transfer

    Energy Technology Data Exchange (ETDEWEB)

    Rubin, Edward; Sorek, Rotem; Zhu, Yiwen; Creevey, Christopher J.; Francino, M. Pilar; Bork, Peer; Rubin, Edward M.

    2007-09-24

    Horizontal gene transfer, in which genetic material is transferred from the genome of one organism to another, has been investigated in microbial species mainly through computational sequence analyses. To address the lack of experimental data, we studied the attempted movement of 246,045 genes from 79 prokaryotic genomes into E. coli and identified genes that consistently fail to transfer. We studied the mechanisms underlying transfer inhibition by placing coding regions from different species under the control of inducible promoters. Their toxicity to the host inhibited transfer regardless of the species of origin and our data suggest that increased gene dosage and associated increased expression is a predominant cause for transfer failure. While these experimental studies examined transfer solely into E. coli, a computational analysis of gene transfer rates across available bacterial and archaeal genomes indicates that the barriers observed in our study are general across the tree of life.

  18. Comparative genomics of Geobacter chemotaxis genes reveals diverse signaling function

    Directory of Open Access Journals (Sweden)

    Antommattei Frances M

    2008-10-01

    Full Text Available Abstract Background Geobacter species are δ-Proteobacteria and are often the predominant species in a variety of sedimentary environments where Fe(III) reduction is important. Their ability to remediate contaminated environments and produce electricity makes them attractive for further study. Cell motility, biofilm formation, and type IV pili all appear important for the growth of Geobacter in changing environments and for electricity production. Recent studies in other bacteria have demonstrated that signaling pathways homologous to the paradigm established for Escherichia coli chemotaxis can regulate type IV pili-dependent motility, the synthesis of flagella and type IV pili, the production of extracellular matrix material, and biofilm formation. The classification of these pathways by comparative genomics improves the ability to understand how Geobacter thrives in natural environments and to improve their use in microbial fuel cells. Results The genomes of G. sulfurreducens, G. metallireducens, and G. uraniireducens contain multiple (~70) homologs of chemotaxis genes arranged in several major clusters (six, seven, and seven, respectively). Unlike the single gene cluster of E. coli, the Geobacter clusters are not all located near the flagellar genes. The probable functions of some Geobacter clusters are assignable by homology to known pathways; others appear to be unique to the Geobacter sp. and contain genes of unknown function. We identified large numbers of methyl-accepting chemotaxis protein (MCP) homologs that have diverse sensing domain architectures and generate a potential for sensing a great variety of environmental signals. We discuss mechanisms for class-specific segregation of the MCPs in the cell membrane, which serve to maintain pathway specificity and diminish crosstalk. Finally, the regulation of gene expression in Geobacter differs from E. coli. The sequences of predicted promoter elements suggest that the alternative sigma factors

  19. Comparative studies of genome-wide maps of nucleosomes between deletion mutants of elp3 and hos2 genes of Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Takashi Matsumoto

    Full Text Available In order to elucidate the influence of histone acetylation upon nucleosomal DNA length and nucleosome position, we compared nucleosome maps of the following three yeast strains: strain BY4741 (control), the elp3 (one of the histone acetyltransferase genes) deletion mutant, and the hos2 (one of the histone deacetylase genes) deletion mutant of Saccharomyces cerevisiae. We sequenced mononucleosomal DNA fragments after treatment with micrococcal nuclease. After mapping the DNA fragments to the genome, we identified the nucleosome positions. We showed that the distributions of the nucleosomal DNA lengths of the control and the hos2 disruptant were similar. On the other hand, the distribution of the nucleosomal DNA lengths of the elp3 disruptant was shifted toward shorter lengths than that of the control. This strongly suggests that inhibition of Elp3-induced histone acetylation causes the nucleosomal DNA length reduction. Next, we compared the profiles of nucleosome mapping numbers in gene promoter regions between the control and the disruptants. We detected 24 genes with a low conservation level of nucleosome positions in promoters between the control and the elp3 disruptant as well as between the control and the hos2 disruptant. This indicates that both Elp3-induced acetylation and Hos2-induced deacetylation influence the nucleosome positions in the promoters of those 24 genes. Interestingly, in 19 of the 24 genes, the profiles of nucleosome mapping numbers were similar between the two disruptants.

  20. The cavefish genome reveals candidate genes for eye loss

    Science.gov (United States)

    McGaugh, Suzanne E.; Gross, Joshua B.; Aken, Bronwen; Blin, Maryline; Borowsky, Richard; Chalopin, Domitille; Hinaux, Hélène; Jeffery, William R.; Keene, Alex; Ma, Li; Minx, Patrick; Murphy, Daniel; O’Quin, Kelly E.; Rétaux, Sylvie; Rohner, Nicolas; Searle, Steve M. J.; Stahl, Bethany A.; Tabin, Cliff; Volff, Jean-Nicolas; Yoshizawa, Masato; Warren, Wesley C.

    2014-01-01

    Natural populations subjected to strong environmental selection pressures offer a window into the genetic underpinnings of evolutionary change. Cavefish populations, Astyanax mexicanus (Teleostei: Characiphysi), exhibit repeated, independent evolution for a variety of traits including eye degeneration, pigment loss, increased size and number of taste buds and mechanosensory organs, and shifts in many behavioural traits. Surface and cave forms are interfertile, making this system amenable to genetic interrogation; however, lack of a reference genome has hampered efforts to identify genes responsible for changes in cave forms of A. mexicanus. Here we present the first de novo genome assembly for Astyanax mexicanus cavefish, contrast repeat elements to other teleost genomes, identify candidate genes underlying quantitative trait loci (QTL), and assay these candidate genes for potential functional and expression differences. We expect the cavefish genome to advance understanding of the evolutionary process, as well as of analogous human diseases, including retinal dysfunction. PMID:25329095

  1. Genomic organization, annotation, and ligand-receptor inferences of chicken chemokines and chemokine receptor genes based on comparative genomics

    Directory of Open Access Journals (Sweden)

    Sze Sing-Hoi

    2005-03-01

    Full Text Available Abstract Background Chemokines and their receptors play important roles in host defense, organogenesis, hematopoiesis, and neuronal communication. Forty-two chemokines and 19 cognate receptors have been found in the human genome. Prior to this report, only 11 chicken chemokines and 7 receptors had been reported. The objectives of this study were to systematically identify chicken chemokines and their cognate receptor genes in the chicken genome and to annotate these genes and ligand-receptor binding by a comparative genomics approach. Results Twenty-three chemokine and 14 chemokine receptor genes were identified in the chicken genome. All of the chicken chemokines contained a conserved CC, CXC, CX3C, or XC motif, whereas all the chemokine receptors had seven conserved transmembrane helices, four extracellular domains with a conserved cysteine, and a conserved DRYLAIV sequence in the second intracellular domain. The number of coding exons in these genes and the syntenies are highly conserved between human, mouse, and chicken although the amino acid sequence homologies are generally low between mammalian and chicken chemokines. Chicken genes were named with the systematic nomenclature used in humans and mice based on phylogeny, synteny, and sequence homology. Conclusion The independent nomenclature of chicken chemokines and chemokine receptors suggests that the chicken may have ligand-receptor pairings similar to mammals. All identified chicken chemokines and their cognate receptors were identified in the chicken genome except CCR9, whose ligand was not identified in this study. The organization of these genes suggests that there were a substantial number of these genes present before divergence between aves and mammals and more gene duplications of CC, CXC, CCR, and CXCR subfamilies in mammals than in aves after the divergence.

  2. Evidence-based gene models for structural and functional annotations of the oil palm genome.

    Science.gov (United States)

    Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie

    2017-09-08

    biosynthesis and disease resistance. The study demonstrated the advantages of having an integrated approach to gene prediction and developed a computational framework for combining multiple genome annotations. These results, available in the oil palm annotation database ( http://palmxplore.mpob.gov.my ), will provide important resources for studies on the genomes of oil palm and related crops. This article was reviewed by Alexander Kel, Igor Rogozin, and Vladimir A. Kuznetsov.

  3. Genome-Wide Comparative Analysis Reveals Similar Types of NBS Genes in Hybrid Citrus sinensis Genome and Original Citrus clementine Genome and Provides New Insights into Non-TIR NBS Genes

    OpenAIRE

    Yunsheng Wang; Lijuan Zhou; Dazhi Li; Liangying Dai; Amy Lawton-Rauh; Pradip K. Srimani; Yongping Duan; Feng Luo

    2015-01-01

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of proteins contain the Coiled Coil (CC) domain. Motif anal...

  4. Gene loss and horizontal gene transfer contributed to the genome evolution of the extreme acidophile Ferrovum

    Directory of Open Access Journals (Sweden)

    Sophie Roxana Ullrich

    2016-05-01

    Full Text Available Acid mine drainage (AMD), associated with active and abandoned mining sites, is a habitat for acidophilic microorganisms that gain energy from the oxidation of reduced sulfur compounds and ferrous iron and that thrive at pH below 4. Members of the recently proposed genus Ferrovum are the first acidophilic iron oxidizers to be described within the Betaproteobacteria. Although they have been detected as typical community members in AMD habitats worldwide, knowledge of their phylogenetic and metabolic diversity is scarce. Genomics approaches appear to be most promising in addressing this lacuna since isolation and cultivation of Ferrovum has proven to be extremely difficult and has so far only been successful for the designated type strain Ferrovum myxofaciens P3G. In this study, the genomes of two novel strains of Ferrovum (PN-J185 and Z-31) derived from water samples of a mine water treatment plant were sequenced. These genomes were compared with those of Ferrovum sp. JA12, which also originated from the mine water treatment plant, and of the type strain (P3G). Phylogenomic scrutiny suggests that the four strains represent three Ferrovum species that cluster in two groups (1 and 2). Comprehensive analysis of their predicted metabolic pathways revealed that these groups harbor characteristic metabolic profiles, notably with respect to motility, chemotaxis, nitrogen metabolism, biofilm formation and their potential strategies to cope with the acidic environment. For example, while the F. myxofaciens strains (group 1) appear to be motile and diazotrophic, the non-motile group 2 strains have the predicted potential to use a greater variety of fixed nitrogen sources. Furthermore, analysis of their genome synteny provides first insights into their genome evolution, suggesting that horizontal gene transfer and genome reduction in the group 2 strains by loss of genes encoding complete metabolic pathways or physiological features contributed to the observed

  5. Functional and evolutionary correlates of gene constellations in the Drosophila melanogaster genome that deviate from the stereotypical gene architecture

    Directory of Open Access Journals (Sweden)

    Kohn Michael H

    2010-05-01

    Full Text Available Abstract Background The biological dimensions of genes are manifold. These include genomic properties (e.g., X/autosomal linkage, recombination) and functional properties (e.g., expression level, tissue specificity). Multiple properties, each generally of subtle influence individually, may affect the evolution of genes or merely be (auto-)correlates. Results of multidimensional analyses may reveal the relative importance of these properties for the evolution of genes, and therefore help evaluate whether these properties should be considered during analyses. While numerous properties are now considered during studies, most work still assumes the stereotypical solitary gene as commonly depicted in textbooks. Here, we investigate the Drosophila melanogaster genome to determine whether deviations from the stereotypical gene architecture correlate with other properties of genes. Results Deviations from the stereotypical gene architecture were classified as the following gene constellations: Overlapping genes were defined as those that overlap in the 5-prime, exonic, or intronic regions. Chromatin co-clustering genes were defined as genes that co-clustered within 20 kb of transcriptional territories. If this scheme is applied, the stereotypical gene emerges as a rare occurrence (7.5%); slightly varied schemes yielded between ~1% and 50%. Moreover, when following our scheme, paired-overlapping genes and chromatin co-clustering genes accounted for 50.1 and 42.4% of the genes analyzed, respectively. Gene constellation was a correlate of a number of functional and evolutionary properties of genes, but its statistical effect was ~1-2 orders of magnitude lower than the effects of recombination, chromosome linkage and protein function. Analysis of datasets on male reproductive proteins showed these were biased in their representation of gene constellations and evolutionary rate Ka/Ks estimates, but these biases did not overwhelm the biologically meaningful
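
    The first step of a classification like the one in this record, flagging genes whose coordinates overlap another gene, can be sketched as follows. This is a simplified illustration under stated assumptions: the tuple layout, the function name and the example coordinates are hypothetical, coordinates are treated as inclusive, and strand and the 5-prime/exonic/intronic distinction used by the authors are ignored.

```python
def find_overlapping_genes(genes):
    """Return the names of genes that overlap at least one other gene.

    genes: iterable of (name, chromosome, start, end) with inclusive
    coordinates and start <= end. Only genes on the same chromosome
    are compared.
    """
    by_chrom = {}
    for name, chrom, start, end in genes:
        by_chrom.setdefault(chrom, []).append((start, end, name))

    overlapping = set()
    for intervals in by_chrom.values():
        intervals.sort()
        # Sweep left to right, remembering the gene that reaches furthest right.
        max_end, max_name = None, None
        for start, end, name in intervals:
            if max_end is not None and start <= max_end:
                overlapping.add(name)
                overlapping.add(max_name)
            if max_end is None or end > max_end:
                max_end, max_name = end, name
    return overlapping


if __name__ == "__main__":
    example = [
        ("geneA", "2L", 100, 500),
        ("geneB", "2L", 400, 900),    # overlaps geneA
        ("geneC", "2L", 2000, 3000),  # solitary
        ("geneD", "X", 100, 500),     # same coordinates, other chromosome
    ]
    print(sorted(find_overlapping_genes(example)))
```

    A distance-based rule such as the 20 kb co-clustering criterion could be approximated by padding each interval before the sweep, although the abstract does not say how the authors computed it.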

  6. Whole genome homology-based identification of candidate genes ...

    African Journals Online (AJOL)

    Josephine Erhiakporeh

    2016-07-06

    Jul 6, 2016 ... identification of a set of 75 candidate genes (42, 22 and 11 from Arabidopsis, potato and tomato, ... understanding on the genetic basis of drought tolerance by using the .... Comparative genomics and genes expression assay ... Primer code ... physiological and molecular responses to drought stress.

  7. Genic regions of a large salamander genome contain long introns and novel genes

    Directory of Open Access Journals (Sweden)

    Bryant Susan V

    2009-01-01

    Full Text Available Abstract Background The basis of genome size variation remains an outstanding question because DNA sequence data are lacking for organisms with large genomes. Sixteen BAC clones from the Mexican axolotl (Ambystoma mexicanum: c-value = 32 × 10^9 bp) were isolated and sequenced to characterize the structure of genic regions. Results Annotation of genes within BACs showed that axolotl introns are on average 10× longer than orthologous vertebrate introns and they are predicted to contain more functional elements, including miRNAs and snoRNAs. Loci were discovered within BACs for two novel EST transcripts that are differentially expressed during spinal cord regeneration and skin metamorphosis. Unexpectedly, a third novel gene was also discovered while manually annotating BACs. Analysis of human-axolotl protein-coding sequences suggests there are 2% more lineage specific genes in the axolotl genome than the human genome, but the great majority (86%) of genes between axolotl and human are predicted to be 1:1 orthologs. Considering that axolotl genes are on average 5× larger than human genes, the genic component of the salamander genome is estimated to be incredibly large, approximately 2.8 gigabases! Conclusion This study shows that a large salamander genome has a correspondingly large genic component, primarily because genes have incredibly long introns. These intronic sequences may harbor novel coding and non-coding sequences that regulate biological processes that are unique to salamanders.

  8. Comparative Inference of Duplicated Genes Produced by Polyploidization in Soybean Genome

    Directory of Open Access Journals (Sweden)

    Yanmei Yang

    2013-01-01

    Full Text Available Soybean (Glycine max) is one of the most important crop plants for providing protein and oil. It is important to investigate the soybean genome for its economic and scientific value. Polyploidy is a widespread and recursive phenomenon during plant evolution, and it can generate massive numbers of duplicated genes, which are an important resource for genetic innovation. Improved sequence alignment criteria and statistical analysis are used to identify and characterize duplicated genes produced by polyploidization in soybean. Based on the collinearity method, duplicated genes by whole genome duplication account for 70.3% in soybean. From the statistical analysis of the molecular distances between duplicated genes, our study indicates that the whole genome duplication event occurred more than once in the genome evolution of soybean, which is often distributed near the ends of chromosomes.

  9. Gene calling and bacterial genome annotation with BG7.

    Science.gov (United States)

    Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo

    2015-01-01

    New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa, but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key point to investigate in bacteria. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of nonprotein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and imperfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences, which are the elements directly related to biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. The tool is tolerant of sequence errors and retains its capabilities when annotating highly fragmented genomes or mixed sequences coming from several genomes (such as those obtained from metagenomic samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).

  10. LATERAL GENE TRANSFER AND THE HISTORY OF BACTERIAL GENOMES

    Energy Technology Data Exchange (ETDEWEB)

    Howard Ochman

    2006-02-22

    The aims of this research were to elucidate the role and extent of lateral transfer in the differentiation of bacterial strains and species, and to assess the impact of gene transfer on the evolution of bacterial genomes. The ultimate goal of the project is to examine the dynamics of a core set of protein-coding genes (i.e., those that are distributed universally among Bacteria) by developing conserved primers that would allow their amplification and sequencing in any bacterial taxon. In addition, we adopted a bioinformatic approach to elucidate the extent of lateral gene transfer in sequenced genomes.

  11. Building phylogenetic trees by using gene Nucleotide Genomic Signals.

    Science.gov (United States)

    Cristea, Paul Dan

    2012-01-01

    Nucleotide genomic signal (NuGS) methodology allows a molecular level approach to determine distances between homologous genes or between conserved equivalent non-coding genome regions in various species or individuals of the same species. Therefore, distances between the genes of species or individuals can be computed and phylogenetic trees can be built. The paper illustrates the use of the nucleotide imbalance (N) and nucleotide pair imbalance (P) signals to determine the distances between the genes of several Hominidae. The results are in accordance with those of other genetic or phylogenetic approaches to establish distances between Hominidae species.

  12. Genome-wide characterization of the Pectate Lyase-like (PLL) genes in Brassica rapa.

    Science.gov (United States)

    Jiang, Jingjing; Yao, Lina; Miao, Ying; Cao, Jiashu

    2013-11-01

    Pectate lyases (PL) depolymerize demethylated pectin (pectate, EC 4.2.2.2) by catalyzing the eliminative cleavage of α-1,4-glycosidic linked galacturonan. Pectate Lyase-like (PLL) genes are one of the largest and most complex families in plants. However, studies on the phylogeny, gene structure, and expression of PLL genes are limited. To understand the potential functions of PLL genes in plants, we characterized their intron-exon structure, phylogenetic relationships, and protein structures, and measured their expression patterns in various tissues, specifically the reproductive tissues in Brassica rapa. Sequence alignments revealed two characteristic motifs in PLL genes. The chromosome location analysis indicated that 18 of the 46 PLL genes were located in the least fractionated sub-genome (LF) of B. rapa, while 16 were located in the medium fractionated sub-genome (MF1) and 12 in the more fractionated sub-genome (MF2). Quantitative RT-PCR analysis showed that BrPLL genes were expressed in various tissues, with most of them being expressed in flowers. Detailed qRT-PCR analysis identified 11 pollen specific PLL genes and several other genes with unique spatial expression patterns. In addition, some duplicated genes showed similar expression patterns. The phylogenetic analysis identified three PLL gene subfamilies in plants, among which subfamily II might have evolved from gene neofunctionalization or subfunctionalization. Therefore, this study opens the possibility for exploring the roles of PLL genes during plant development.

  13. Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis

    Directory of Open Access Journals (Sweden)

    Ueki Masao

    2012-05-01

    Abstract Background Genome-wide gene-gene interaction analysis using single nucleotide polymorphisms (SNPs) is an attractive way to identify genetic components that confer susceptibility to human complex diseases. Individual hypothesis testing for SNP-SNP pairs, as in a common genome-wide association study (GWAS), however, involves difficulty in setting an overall p-value due to the complicated correlation structure, namely, the multiple testing problem that causes unacceptable false negative results. The number of SNP-SNP pairs being larger than the sample size, the so-called large p small n problem, precludes simultaneous analysis using multiple regression. A method that overcomes the above issues is thus needed. Results We adopt an up-to-date method for ultrahigh-dimensional variable selection termed sure independence screening (SIS) for appropriate handling of the large number of SNP-SNP interactions by including them as predictor variables in logistic regression. We propose a ranking strategy using promising dummy coding methods and a subsequent variable selection procedure in the SIS method suitably modified for gene-gene interaction analysis. We also implemented the procedures in a software program, EPISIS, using cost-effective GPGPU (general-purpose computing on graphics processing units) technology. EPISIS can complete an exhaustive search for SNP-SNP interactions in a standard GWAS dataset within several hours. The proposed method works successfully in simulation experiments and in application to real WTCCC (Wellcome Trust Case–control Consortium) data. Conclusions Based on the machine-learning principle, the proposed method gives a powerful and flexible genome-wide search for various patterns of gene-gene interaction.

  14. A critical assessment of cross-species detection of gene duplicates using comparative genomic hybridization

    Directory of Open Access Journals (Sweden)

    Renn Suzy CP

    2010-05-01

    Abstract Background Comparison of genomic DNA among closely related strains or species is a powerful approach for identifying variation in evolutionary processes. One potent source of genomic variation is gene duplication, which is prevalent among individuals and species. Array comparative genomic hybridization (aCGH) has been successfully utilized to detect this variation among lineages. Here, beyond the demonstration that gene duplicates among species can be quantified with aCGH, we consider the effect of sequence divergence on the ability to detect gene duplicates. Results Using the X chromosome genomic content difference between male D. melanogaster and female D. yakuba and D. simulans, we describe a decrease in the ability to accurately measure genomic content (copy number) for orthologs that are only 90% identical. We demonstrate that genome characteristics (e.g. chromatin environment and non-orthologous sequence similarity) can also affect the ability to accurately measure genomic content. We describe a normalization strategy and statistical criteria to be used for the identification of gene duplicates among any species group for which an array platform is available from a closely related species. Conclusions Array CGH can be used to effectively identify gene duplication and genome content; however, certain biases are present due to sequence divergence and other genome characteristics resulting from the divergence between lineages. Highly conserved gene duplicates will be more readily recovered by aCGH. Duplicates that have been retained for a selective advantage due to directional selection acting on many loci in one or both gene copies are likely to be under-represented. The results of this study should inform the interpretation of both previously published and future work that employs this powerful technique.

  15. Genome-wide analysis of Aux/IAA and ARF gene families in Populus trichocarpa

    Energy Technology Data Exchange (ETDEWEB)

    Kalluri, Udaya C [ORNL; DiFazio, Stephen P [West Virginia University; Brunner, A. [Virginia Polytechnic Institute and State University (Virginia Tech); Tuskan, Gerald A [ORNL

    2007-01-01

    Auxin/Indole-3-Acetic Acid (Aux/IAA) and Auxin Response Factor (ARF) transcription factors are key regulators of auxin responses in plants. A total of 35 Aux/IAA and 39 ARF genes were identified in the Populus genome. Comparative phylogenetic analysis revealed that the subgroups PoptrARF2, 6, 9 and 16 and PoptrIAA3, 16, 27 and 29 have differentially expanded in Populus relative to Arabidopsis. Activator ARFs were found to be two-fold overrepresented in the Populus genome. PoptrIAA and PoptrARF gene families appear to have expanded due to high segmental and low tandem duplication events. Furthermore, expression studies showed that genes in the expanded PoptrIAA3 subgroup display differential expression. The gene-family analysis reported here will be useful in conducting future functional genomics studies to understand how the molecular roles of these large gene families translate into a diversity of biologically meaningful auxin effects.

  16. Whole genome phylogeny of Prochlorococcus marinus group of cyanobacteria: genome alignment and overlapping gene approach.

    Science.gov (United States)

    Prabha, Ratna; Singh, Dhananjaya P; Gupta, Shailendra K; Rai, Anil

    2014-06-01

    Prochlorococcus is the smallest known oxygenic phototrophic marine cyanobacterium dominating the mid-latitude oceans. Physiologically and genetically distinct P. marinus isolates from many oceans in the world were assigned to two different groups, a tightly clustered high-light (HL)-adapted clade and a divergent low-light (LL)-adapted clade. Phylogenetic analysis of this cyanobacterium on the basis of 16S rRNA and other conserved genes did not show consistency with its phenotypic behavior. We analyzed the phylogeny of this genus on the basis of complete genome sequences through genome alignment, overlapping-gene content and gene-order approaches. The phylogenetic tree of P. marinus obtained by comparing whole genome sequences, in contrast to that based on the 16S rRNA gene, corresponded well with the HL/LL ecotypic distinction of twelve strains and showed consistency with the phenotypic classification of P. marinus. Evidence for the horizontal descent and acquisition of genes within and across the genus was observed. Many genes involved in metabolic functions were found to be conserved across these genomes and many were continuously gained by different strains as per their needs during the course of their evolution. Consistency in the physiological and genetic phylogeny based on whole genome sequence is established. These observations improve our understanding of the adaptation and diversification of these organisms under evolutionary pressure.

  17. Putative essential and core-essential genes in Mycoplasma genomes

    OpenAIRE

    Lin, Yan; Zhang, Randy Ren

    2011-01-01

    Mycoplasma, which was used to create the first “synthetic life”, has been an important species in the emerging field, synthetic biology. However, essential genes, an important concept of synthetic biology, for both M. mycoides and M. capricolum, as well as 14 other Mycoplasma with available genomes, are still unknown. We have developed a gene essentiality prediction algorithm that incorporates information of biased gene strand distribution, homologous search and codon adaptation index. The al...

  18. Strigolactone biology: genes, functional genomics, epigenetics and applications.

    Science.gov (United States)

    Makhzoum, Abdullah; Yousefzadi, Morteza; Malik, Sonia; Gantet, Pascal; Tremouillaux-Guiller, Jocelyne

    2017-03-01

    Strigolactones (SLs) represent an important new plant hormone class marked by their multifunctional role in plant and rhizosphere interactions. These compounds stimulate hyphal branching in arbuscular mycorrhizal fungi (AMF) and seed germination of root parasitic plants. In addition, they are involved in the control of plant architecture by inhibiting bud outgrowth, as well as in many other morphological and developmental processes, together with other plant hormones such as auxins and cytokinins. The biosynthetic pathway of SLs, which are derived from carotenoids, was partially deciphered based on the identification of mutants from a variety of plant species. Only a few SL biosynthetic and SL-regulated genes and related regulatory transcription factors have been identified. However, functional genomics and epigenetic studies have started to provide initial insight into how SL-related genes are regulated. Since they control plant architecture and plant-rhizosphere interaction, SLs are beginning to be used for agronomical and biotechnological applications. Furthermore, the genes involved in the SL biosynthetic pathway and genes regulated by SLs constitute interesting targets for plant breeding. Therefore, it is necessary to decipher and better understand the genetic determinants of their regulation at different levels.

  19. Evaluation of the utility of gene expression and metabolic information for genomic prediction in maize.

    Science.gov (United States)

    Guo, Zhigang; Magwire, Michael M; Basten, Christopher J; Xu, Zhanyou; Wang, Daolong

    2016-12-01

    Predictive ability derived from gene expression and metabolic information was evaluated using genomic prediction methods based on datasets from a public maize panel. With the rapid development of high throughput biological technologies, information from gene expression and metabolites has received growing attention in plant genetics and breeding. In this study, we evaluated the utility of gene expression and metabolic information for genomic prediction using data obtained from a maize diversity panel. Our results show that, when used as predictor variables, gene expression levels and metabolite abundances provided reasonable predictive abilities relative to those based on genetic markers, although these values were not as large as those with genetic markers. Integrating gene expression levels and metabolite abundances with genetic markers significantly improved predictive abilities in comparison to the benchmark genomic best linear unbiased prediction model using genome-wide markers only. Predictive abilities based on gene expression and metabolites were trait-specific and were affected by the time of measurement and tissue samples as well as the number of genes and metabolites included in the model. In general, our results suggest that, rather than being conventionally used as intermediate phenotypes, gene expression and metabolic information can be used as predictors for genomic prediction and help improve genetic gains for complex traits in breeding programs.

  20. Common genetic variation near the phospholamban gene is associated with cardiac repolarisation : Meta-analysis of three genome-wide association studies

    NARCIS (Netherlands)

    I.M. Nolte (Ilja); C. Wallace (Chris); S.J. Newhouse (Stephen); D. Waggott (Daryl); J. Fu (Jingyuan); N. Soranzo (Nicole); R. Gwilliam (Rhian); S. Demissie (Serkalem); I. Savelieva (Irina); D. Zheng (Dongling); C. Dalageorgou (Chrysoula); M. Farrall (Martin); N.J. Samani (Nilesh); J. Connell (John); M.J. Brown (Morris); A. Dominiczak (Anna); M. Lathrop (Mark); E. Zeggini (Eleftheria); L.V. Wain (Louise); C. Newton-Cheh (Christopher); M. Eijgelsheim (Mark); K. Rice (Kenneth); P.I.W. de Bakker (Paul); A. Pfeufer (Arne); S. Sanna (Serena); D.E. Arking (Dan); F.W. Asselbergs (Folkert); T.D. Spector (Tim); N.D. Carter (Nicholas); S. Jeffery (Steve); M. Tobin (Martin); M. Caulfield (Mark); H. Snieder (Harold); A.D. Paterson (Andrew); P. Munroe (Patricia); Y. Jamshidi (Yalda)

    2009-01-01

    To identify loci affecting the electrocardiographic QT interval, a measure of cardiac repolarisation associated with risk of ventricular arrhythmias and sudden cardiac death, we conducted a meta-analysis of three genome-wide association studies (GWAS) including 3,558 subjects from the Tw

  1. Common Genetic Variation Near the Phospholamban Gene Is Associated with Cardiac Repolarisation : Meta-Analysis of Three Genome-Wide Association Studies

    NARCIS (Netherlands)

    Nolte, Ilja M.; Wallace, Chris; Newhouse, Stephen J.; Waggott, Daryl; Fu, Jingyuan; Soranzo, Nicole; Gwilliam, Rhian; Deloukas, Panos; Savelieva, Irina; Zheng, Dongling; Dalageorgou, Chrysoula; Farrall, Martin; Samani, Nilesh J.; Connell, John; Brown, Morris; Dominiczak, Anna; Lathrop, Mark; Zeggini, Eleftheria; Wain, Louise V.; Newton-Cheh, Christopher; Eijgelsheim, Mark; Rice, Kenneth; de Bakker, Paul I. W.; Pfeufer, Arne; Sanna, Serena; Arking, Dan E.; Asselbergs, Folkert W.; Spector, Tim D.; Carter, Nicholas D.; Jeffery, Steve; Tobin, Martin; Caulfield, Mark; Snieder, Harold; Paterson, Andrew D.; Munroe, Patricia B.; Jamshidi, Yalda

    2009-01-01

    To identify loci affecting the electrocardiographic QT interval, a measure of cardiac repolarisation associated with risk of ventricular arrhythmias and sudden cardiac death, we conducted a meta-analysis of three genome-wide association studies (GWAS) including 3,558 subjects from the TwinsUK and BR

  2. Genome-wide association and genetic functional studies identify autism susceptibility candidate 2 gene (AUTS2) in the regulation of alcohol consumption

    NARCIS (Netherlands)

    G. Schumann (Gunter); L. Coin (Lachlan); A. Lourdusamy (Anbarasu); P. Charoen (Pimphen); K.H. Berger (Karen); D. Stacey (David); S. Desrivières (Sylvane); F.A. Aliev (Fazil); A.A. Khan (Anokhi); N. Amin (Najaf); G. Bakalkin (Georgy); B. Balkau (Beverley); J.W.J. Beulens (Joline); A. Bilbao (Ainhoa); R.A. de Boer (Rudolf); D. Beury (Delphine); M.L. Bots (Michiel); E.J. Breetvelt (Elemi); S. Cauchi (Stephane); C. Cavalcanti-Proença (Christine); J.C. Chambers (John); T.-K. Clarke; N. Dahmen (N.); E.J.C. de Geus (Eco); D. Dick (Danielle); F. Ducci (Francesca); A. Easton (Alanna); H.J. Edenberg (Howard); T. Esk (Tõnu); A. Fernández-Medarde (Alberto); T. Foroud (Tatiana); N.B. Freimer (Nelson); J.-A. Girault; D.E. Grobbee (Diederick); S. Guarrera (Simonetta); D.F. Gudbjartsson (Daniel); A.L. Hartikainen; A.C. Heath (Andrew); V. Hesselbrock (Victor); A. Hofman (Albert); J.J. Hottenga (Jouke Jan); M.K. Isohanni (Matti); J. Kaprio (Jaakko); K-T. Khaw (Kay-Tee); B. Kuehnel (Brigitte); J. Laitinen (Jaana); S. Lobbens (Stéphane); J. Luan; M. Mangino (Massimo); M. Maroteaux (Matthieu); G. Matullo (Giuseppe); M.I. McCarthy (Mark); C. Mueller (Christian); G. Navis (Gerjan); M.E. Numans (Mattijs); A.M. Núñez (Alejandro); D.R. Nyholt (Dale); C.N. Onland-Moret (Charlotte); B.A. Oostra (Ben); P.F. O'Reilly (Paul); M. Palkovits (Miklos); B.W.J.H. Penninx (Brenda); S. Polidoro (Silvia); A. Pouta (Anneli); I. Prokopenko (Inga); F. Ricceri (Fulvio); E. Santos (Eugenio); J.H. Smit (Johannes); N. Soranzo (Nicole); K. Song (Kijoung); U. Sovio (Ulla); M. Stumvoll (Michael); I. Surakk (Ida); T.E. Thorgeirsson (Thorgeir); U. Thorsteinsdottir (Unnur); C. Troakes (Claire); T. Tyrfingsson (Thorarinn); A. Tönjes (Anke); C.S.P.M. Uiterwaal (Cuno); A.G. Uitterlinden (André); P. van der Harst (Pim); Y.T. van der Schouw (Yvonne); O. Staehlin (Oliver); N. Vogelzangs (Nicole); P. Vollenweider (Peter); G. Waeber (Gérard); N.J. Wareham (Nick); D. Waterworth (Dawn); J.B. Whitfield (John); E.H. 
Wichmann (Erich); G.A.H.M. Willemsen (Gonneke); J.C.M. Witteman (Jacqueline); X. Yuan (Xin); G. Zhai (Guangju); J.H. Zhao (Jing); W. Zhang (Weihua); N.G. Martin (Nicholas); A. Metspalu (Andres); A. Doering (Angela); J. Scott (James); T.D. Spector (Timothy); R.J.F. Loos (Ruth); D.I. Boomsma (Dorret); V. Mooser (Vincent); L. Peltonen (Leena Johanna); K. Stefansson (Kari); P. Tikka-Kleemola (Päivi); P. Vineis (Paolo); W.H. Sommer (Wolfgang); J.S. Kooner (Jaspal); R. Spanagel (Rainer); U.A. Heberlein (Ulrike); M.R. Järvelin; P. Elliott (Paul); Y.S. Aulchenko (Yurii); S.J.L. Bakker (Stephan)

    2011-01-01

    Alcohol consumption is a moderately heritable trait, but the genetic basis in humans is largely unknown, despite its clinical and societal importance. We report a genome-wide association study meta-analysis of ∼2.5 million directly genotyped or imputed SNPs with alcohol consumption (gram

  3. Genome-wide association and genetic functional studies identify autism susceptibility candidate 2 gene (AUTS2) in the regulation of alcohol consumption

    NARCIS (Netherlands)

    Schumann, Gunter; Coin, Lachlan J.; Lourdusamy, Anbarasu; Charoen, Pimphen; Berger, Karen H.; Stacey, David; Desrivieres, Sylvane; Aliev, Fazil A.; Khan, Anokhi A.; Amin, Najaf; Aulchenko, Yurii S.; Bakalkin, Georgy; Bakker, Stephan J.; Balkau, Beverley; Beulens, Joline W.; Bilbao, Ainhoa; de Boer, Rudolf A.; Beury, Delphine; Bots, Michiel L.; Breetvelt, Elemi J.; Cauchi, Stephane; Cavalcanti-Proenca, Christine; Chambers, John C.; Clarke, Toni-Kim; Dahmen, Norbert; de Geus, Eco J.; Dick, Danielle; Ducci, Francesca; Easton, Alanna; Edenberg, Howard J.; Esk, Tonu; Fernandez-Medarde, Alberto; Foroud, Tatiana; Freimer, Nelson B.; Girault, Jean-Antoine; Grobbee, Diederick E.; Guarrera, Simonetta; Gudbjartsson, Daniel F.; Hartikainen, Anna-Liisa; Heath, Andrew C.; Hesselbrock, Victor; Hofman, Albert; Hottenga, Jouke-Jan; Isohanni, Matti K.; Kaprio, Jaakko; Khaw, Kay-Tee; Kuehnel, Brigitte; Laitinen, Jaana; Lobbens, Stephane; Luan, Jian'an; Mangino, Massimo; Maroteaux, Matthieu; Matullo, Giuseppe; McCarthy, Mark I.; Mueller, Christian; Navis, Gerjan; Numans, Mattijs E.; Nunez, Alejandro; Nyholt, Dale R.; Onland-Moret, Charlotte N.; Oostra, Ben A.; O'Reilly, Paul F.; Palkovits, Miklos; Penninx, Brenda W.; Polidoro, Silvia; Pouta, Anneli; Prokopenko, Inga; Ricceri, Fulvio; Santos, Eugenio; Smit, Johannes H.; Soranzo, Nicole; Song, Kijoung; Sovio, Ulla; Stumvoll, Michael; Surakk, Ida; Thorgeirsson, Thorgeir E.; Thorsteinsdottir, Unnur; Troakes, Claire; Tyrfingsson, Thorarinn; Toenjes, Anke; Uiterwaal, Cuno S.; Uitterlinden, Andre G.; van der Harst, Pim; van der Schouw, Yvonne T.; Staehlin, Oliver; Vogelzangs, Nicole; Vollenweider, Peter; Waeber, Gerard; Wareham, Nicholas J.; Waterworth, Dawn M.; Whitfield, John B.; Wichmann, Erich H.; Willemsen, Gonneke; Witteman, Jacqueline C.; Yuan, Xin; Zhai, Guangju; Zhao, Jing H.; Zhang, Weihua; Martin, Nicholas G.; Metspalu, Andres; Doering, Angela; Scott, James; Spector, Tim D.; Loos, Ruth J.; Boomsma, Dorret I.; Mooser, Vincent; 
Peltonen, Leena; Stefansson, Kari; van Duijn, Cornelia M.; Vineis, Paolo; Sommer, Wolfgang H.; Kooner, Jaspal S.; Spanagel, Rainer; Heberlein, Ulrike A.; Jarvelin, Marjo-Riitta; Elliott, Paul

    2011-01-01

    Alcohol consumption is a moderately heritable trait, but the genetic basis in humans is largely unknown, despite its clinical and societal importance. We report a genome-wide association study meta-analysis of ∼2.5 million directly genotyped or imputed SNPs with alcohol consumption (gram p

  4. Genome-wide association study identifies a sequence variant within the DAB2IP gene conferring susceptibility to abdominal aortic aneurysm

    DEFF Research Database (Denmark)

    Gretarsdottir, Solveig; Baas, Annette F; Thorleifsson, Gudmar

    2010-01-01

    We performed a genome-wide association study on 1,292 individuals with abdominal aortic aneurysms (AAAs) and 30,503 controls from Iceland and The Netherlands, with a follow-up of top markers in up to 3,267 individuals with AAAs and 7,451 controls. The A allele of rs7025486 on 9q33 was found to as...

  5. Genome-wide association study identifies a sequence variant within the DAB2IP gene conferring susceptibility to abdominal aortic aneurysm

    NARCIS (Netherlands)

    Gretarsdottir, Solveig; Baas, Annette F.; Thorleifsson, Gudmar; Holm, Hilma; den Heijer, Martin; de Vries, Jean-Paul P. M.; Kranendonk, Steef E.; Zeebregts, Clark J. A. M.; van Sterkenburg, Steven M.; Geelkerken, Robert H.; van Rij, Andre M.; Williams, Michael J. A.; Boll, Albert P. M.; Kostic, Jelena P.; Jonasdottir, Adalbjorg; Jonasdottir, Aslaug; Walters, G. Bragi; Masson, Gisli; Sulem, Patrick; Saemundsdottir, Jona; Mouy, Magali; Magnusson, Kristinn P.; Tromp, Gerard; Elmore, James R.; Sakalihasan, Natzi; Limet, Raymond; Defraigne, Jean-Olivier; Ferrell, Robert E.; Ronkainen, Antti; Ruigrok, Ynte M.; Wijmenga, Cisca; Grobbee, Diederick E.; Shah, Svati H.; Granger, Christopher B.; Quyyumi, Arshed A.; Vaccarino, Viola; Patel, Riyaz S.; Zafari, A. Maziar; Levey, Allan I.; Austin, Harland; Girelli, Domenico; Pignatti, Pier Franco; Olivieri, Oliviero; Martinelli, Nicola; Malerba, Giovanni; Trabetti, Elisabetta; Becker, Lewis C.; Becker, Diane M.; Reilly, Muredach P.; Rader, Daniel J.; Mueller, Thomas; Dieplinger, Benjamin; Haltmayer, Meinhard; Urbonavicius, Sigitas; Lindblad, Bengt; Gottsater, Anders; Gaetani, Eleonora; Pola, Roberto; Wells, Philip; Rodger, Marc; Forgie, Melissa; Langlois, Nicole; Corral, Javier; Vicente, Vicente; Fontcuberta, Jordi; Espana, Francisco; Grarup, Niels; Jorgensen, Torben; Witte, Daniel R.; Hansen, Torben; Pedersen, Oluf; Aben, Katja K.; de Graaf, Jacqueline; Holewijn, Suzanne; Folkersen, Lasse; Franco-Cereceda, Anders; Eriksson, Per; Collier, David A.; Stefansson, Hreinn; Steinthorsdottir, Valgerdur; Rafnar, Thorunn; Valdimarsson, Einar M.; Magnadottir, Hulda B.; Sveinbjornsdottir, Sigurlaug; Olafsson, Isleifur; Magnusson, Magnus Karl; Palmason, Robert; Haraldsdottir, Vilhelmina; Andersen, Karl; Onundarson, Pall T.; Thorgeirsson, Gudmundur; Kiemeney, Lambertus A.; Powell, Janet T.; Carey, David J.; Kuivaniemi, Helena; Lindholt, Jes S.; Jones, Gregory T.; Kong, Augustine; Blankensteijn, Jan D.; Matthiasson, Stefan E.; 
Thorsteinsdottir, Unnur; Stefansson, Kari

    2010-01-01

    We performed a genome-wide association study on 1,292 individuals with abdominal aortic aneurysms (AAAs) and 30,503 controls from Iceland and The Netherlands, with a follow-up of top markers in up to 3,267 individuals with AAAs and 7,451 controls. The A allele of rs7025486 on 9q33 was found to assoc

  8. Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology.

    Science.gov (United States)

    Hu, Yang; Zhou, Wenyang; Ren, Jun; Dong, Lixiang; Wang, Yadong; Jin, Shuilin; Cheng, Liang

    2016-01-01

    Increasing evidence indicates that functional annotation of the human genome at the molecular level and the phenotype level is very important for systematic analysis of genes. In this study, we presented a framework named Gene2Function to annotate Gene Reference into Functions (GeneRIFs), in which each functional description of GeneRIFs could be annotated by a text mining tool, Open Biomedical Annotator (OBA), and each Entrez gene could be mapped to a Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all the records about human genes of GeneRIFs, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource of the human genome. High consistency of term frequency of individual gene (Pearson correlation = 0.6401, p = 2.2e-16) and gene frequency of individual term (Pearson correlation = 0.1298, p = 3.686e-14) in GeneRIFs and GOA shows our annotation resource is very reliable.

  9. Cross-family translational genomics of abiotic stress-responsive genes between Arabidopsis and Medicago truncatula.

    Directory of Open Access Journals (Sweden)

    Daejin Hyung

    Cross-species translation of genomic information may play a pivotal role in applying biological knowledge gained from relatively simple model systems to other less studied, but related, genomes. The information of abiotic stress (ABS)-responsive genes in Arabidopsis was identified and translated into the legume model system, Medicago truncatula. Various data resources, such as TAIR/AtGI DB, expression profiles and literature, were used to build a genome-wide list of ABS genes. tBlastX/BlastP similarity search tools and manual inspection of alignments were used to identify orthologous genes between the two genomes. A total of 1,377 genes were finally collected and classified into 18 functional criteria of gene ontology (GO). The data analysis according to the expression cues showed that there was a substantial level of interaction among three major types (i.e., drought, salinity and cold stress) of abiotic stresses. In an attempt to translate the ABS genes between these two species, genomic locations for each gene were mapped using an in-house-developed comparative analysis platform. The comparative analysis revealed that fragmental colinearity, represented by only 37 synteny blocks, existed between Arabidopsis and M. truncatula. Based on the combination of E-value and alignment remarks, the estimated translation rate was 60.2% for this cross-family translation. As a prelude to the functional comparative genomic approaches, in-silico gene network/interactome analyses were conducted to predict key components in the ABS responses, and one of the sub-networks was integrated with the corresponding comparative map. The results demonstrated that core members of the sub-network were well aligned with previously reported ABS regulatory networks. Taken together, the results indicate that network-based integrative approaches of comparative and functional genomics are important to interpret and translate genomic information for complex traits such as abiotic stresses.

  10. Meta-Analysis of Genome-Wide Association Studies and Network Analysis-Based Integration with Gene Expression Data Identify New Suggestive Loci and Unravel a Wnt-Centric Network Associated with Dupuytren’s Disease

    Science.gov (United States)

    Becker, Kerstin; Siegert, Sabine; Toliat, Mohammad Reza; Du, Juanjiangmeng; Casper, Ramona; Dolmans, Guido H.; Werker, Paul M.; Tinschert, Sigrid; Franke, Andre; Gieger, Christian; Strauch, Konstantin; Nothnagel, Michael; Nürnberg, Peter; Hennies, Hans Christian

    2016-01-01

    Dupuytren's disease, a fibromatosis of the connective tissue in the palm, is a common complex disease with a strong genetic component. To date, nine genetic loci have been found to be associated with the disease. Six of these loci contain genes that code for Wnt signalling proteins. In spite of this striking first insight into the genetic factors in Dupuytren's disease, much of the inherited risk in Dupuytren's disease still needs to be discovered. The already identified loci jointly explain ~1% of the heritability in this disease. To further elucidate the genetic basis of Dupuytren's disease, we performed a genome-wide meta-analysis combining three genome-wide association study (GWAS) data sets, comprising 1,580 cases and 4,480 controls. We corroborated all nine previously identified loci, six of these with genome-wide significance (p-value [...] Dupuytren's disease. PMID:27467239

  11. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns

    Science.gov (United States)

    Jansen, Robert K.; Cai, Zhengqiu; Raubeson, Linda A.; Daniell, Henry; dePamphilis, Claude W.; Leebens-Mack, James; Müller, Kai F.; Guisinger-Bellian, Mary; Haberle, Rosemarie C.; Hansen, Anne K.; Chumley, Timothy W.; Lee, Seung-Bum; Peery, Rhiannon; McNeal, Joel R.; Kuehl, Jennifer V.; Boore, Jeffrey L.

    2007-01-01

    Angiosperms are the largest and most successful clade of land plants with >250,000 species distributed in nearly every terrestrial habitat. Many phylogenetic studies have been based on DNA sequences of one to several genes, but, despite decades of intensive efforts, relationships among early diverging lineages and several of the major clades remain either incompletely resolved or weakly supported. We performed phylogenetic analyses of 81 plastid genes in 64 sequenced genomes, including 13 new genomes, to estimate relationships among the major angiosperm clades, and the resulting trees are used to examine the evolution of gene and intron content. Phylogenetic trees from multiple methods, including model-based approaches, provide strong support for the position of Amborella as the earliest diverging lineage of flowering plants, followed by Nymphaeales and Austrobaileyales. The plastid genome trees also provide strong support for a sister relationship between eudicots and monocots, and this group is sister to a clade that includes Chloranthales and magnoliids. Resolution of relationships among the major clades of angiosperms provides the necessary framework for addressing numerous evolutionary questions regarding the rapid diversification of angiosperms. Gene and intron content are highly conserved among the early diverging angiosperms and basal eudicots, but 62 independent gene and intron losses are limited to the more derived monocot and eudicot clades. Moreover, a lineage-specific correlation was detected between rates of nucleotide substitutions, indels, and genomic rearrangements. PMID:18048330

  12. Bacterial Cellular Engineering by Genome Editing and Gene Silencing

    Directory of Open Access Journals (Sweden)

    Nobutaka Nakashima

    2014-02-01

    Genome editing is an important technology for bacterial cellular engineering, which is commonly conducted by homologous recombination-based procedures, including gene knockout (disruption), knock-in (insertion), and allelic exchange. In addition, some new recombination-independent approaches have emerged that utilize catalytic RNAs, artificial nucleases, nucleic acid analogs, and peptide nucleic acids. Apart from these methods, which directly modify the genomic structure, an alternative approach is to conditionally modify the gene expression profile at the posttranscriptional level without altering the genomes. This is performed by expressing antisense RNAs to knock down (silence) target mRNAs in vivo. This review describes the features and recent advances on methods used in genomic engineering and silencing technologies that are advantageously used for bacterial cellular engineering.

  13. Identification of Ohnolog Genes Originating from Whole Genome Duplication in Early Vertebrates, Based on Synteny Comparison across Multiple Genomes.

    Science.gov (United States)

    Singh, Param Priya; Arora, Jatin; Isambert, Hervé

    2015-07-01

    Whole genome duplications (WGD) have now been firmly established in all major eukaryotic kingdoms. In particular, all vertebrates descend from two rounds of WGDs, that occurred in their jawless ancestor some 500 MY ago. Paralogs retained from WGD, also coined 'ohnologs' after Susumu Ohno, have been shown to be typically associated with development, signaling and gene regulation. Ohnologs, which amount to about 20 to 35% of genes in the human genome, have also been shown to be prone to dominant deleterious mutations and frequently implicated in cancer and genetic diseases. Hence, identifying ohnologs is central to better understand the evolution of vertebrates and their susceptibility to genetic diseases. Early computational analyses to identify vertebrate ohnologs relied on content-based synteny comparisons between the human genome and a single invertebrate outgroup genome or within the human genome itself. These approaches are thus limited by lineage specific rearrangements in individual genomes. We report, in this study, the identification of vertebrate ohnologs based on the quantitative assessment and integration of synteny conservation between six amniote vertebrates and six invertebrate outgroups. Such a synteny comparison across multiple genomes is shown to enhance the statistical power of ohnolog identification in vertebrates compared to earlier approaches, by overcoming lineage specific genome rearrangements. Ohnolog gene families can be browsed and downloaded for three statistical confidence levels or recompiled for specific, user-defined, significance criteria at http://ohnologs.curie.fr/. In the light of the importance of WGD on the genetic makeup of vertebrates, our analysis provides a useful resource for researchers interested in gaining further insights on vertebrate evolution and genetic diseases.

  14. Identification of Ohnolog Genes Originating from Whole Genome Duplication in Early Vertebrates, Based on Synteny Comparison across Multiple Genomes.

    Directory of Open Access Journals (Sweden)

    Param Priya Singh

    2015-07-01

    Whole genome duplications (WGD) have now been firmly established in all major eukaryotic kingdoms. In particular, all vertebrates descend from two rounds of WGDs, that occurred in their jawless ancestor some 500 MY ago. Paralogs retained from WGD, also coined 'ohnologs' after Susumu Ohno, have been shown to be typically associated with development, signaling and gene regulation. Ohnologs, which amount to about 20 to 35% of genes in the human genome, have also been shown to be prone to dominant deleterious mutations and frequently implicated in cancer and genetic diseases. Hence, identifying ohnologs is central to better understand the evolution of vertebrates and their susceptibility to genetic diseases. Early computational analyses to identify vertebrate ohnologs relied on content-based synteny comparisons between the human genome and a single invertebrate outgroup genome or within the human genome itself. These approaches are thus limited by lineage specific rearrangements in individual genomes. We report, in this study, the identification of vertebrate ohnologs based on the quantitative assessment and integration of synteny conservation between six amniote vertebrates and six invertebrate outgroups. Such a synteny comparison across multiple genomes is shown to enhance the statistical power of ohnolog identification in vertebrates compared to earlier approaches, by overcoming lineage specific genome rearrangements. Ohnolog gene families can be browsed and downloaded for three statistical confidence levels or recompiled for specific, user-defined, significance criteria at http://ohnologs.curie.fr/. In the light of the importance of WGD on the genetic makeup of vertebrates, our analysis provides a useful resource for researchers interested in gaining further insights on vertebrate evolution and genetic diseases.

  15. Identification of genes and genomic islands correlated with high pathogenicity in Streptococcus suis using whole genome tiling microarrays.

    Directory of Open Access Journals (Sweden)

    Xiao Zheng

    Streptococcus suis is an important zoonotic pathogen that can cause meningitis and sepsis in both pigs and humans. Infections in humans have been sporadic worldwide but two severe outbreaks occurred in China in recent years, while infections in pigs are a major problem in the swine industry. Some S. suis strains are more pathogenic than others, with 2 sequence types (STs), ST1 and ST7, being well recognized as highly pathogenic. We analyzed 31 isolates from 23 serotypes and 25 STs by NimbleGen tiling microarray using the genome of a high pathogenicity (HP) ST1 strain, GZ1, as reference and a new algorithm to detect gene content difference. The number of genes absent in a strain ranged from 49 to 225 with a total of 632 genes absent in at least one strain, while 1346 genes were found to be invariably present in all strains as the core genome of S. suis, accounting for 68% of the GZ1 genome. The majority of genes are located in chromosomal blocks with two or more contiguous genes. Sixty-two blocks are absent in two or more strains and defined as regions of difference (RDs), among which 26 are putative genomic islands (GIs). Clustering and statistical analyses revealed that 8 RDs including 6 putative GIs and 21 genes within these RDs are significantly associated with HP. Three RDs encode known virulence related factors including the extracellular factor, the capsular polysaccharide and a SrtF pilus. The strains were divided into 5 groups based on population genetic analysis of multilocus sequence typing data and the distribution of the RDs among the groups revealed gain and loss of RDs in different groups. Our study elucidated the gene content diversity of S. suis and identified genes that potentially promote HP.
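The core-genome bookkeeping described in this abstract (genes absent in at least one strain versus genes present in all strains) can be sketched with plain set operations. This is a minimal illustration only, not the authors' microarray algorithm: the function name, the input layout (a mapping from strain name to the set of reference genes called absent) and the toy gene identifiers are all assumptions made for the example.

```python
def summarize_gene_content(reference_genes, absent_by_strain):
    """Split a reference gene set into core and variable genes.

    reference_genes: iterable of gene IDs in the reference genome (hypothetical IDs).
    absent_by_strain: dict mapping strain name -> set of reference genes called absent.
    """
    reference = set(reference_genes)
    # A gene is "variable" if it is absent in at least one strain.
    variable = set().union(*absent_by_strain.values()) & reference
    # The core genome is everything that is never called absent.
    core = reference - variable
    core_fraction = len(core) / len(reference) if reference else 0.0
    return {"core": core, "variable": variable, "core_fraction": core_fraction}


# Toy data: four reference genes, two strains.
toy = summarize_gene_content(
    ["g1", "g2", "g3", "g4"],
    {"strain_A": {"g2"}, "strain_B": {"g2", "g3"}},
)
print(sorted(toy["core"]), toy["core_fraction"])
```

With the counts quoted in the abstract, 1346 core genes and 632 genes absent in at least one strain give 1346 / (1346 + 632), roughly 0.68, which matches the reported 68% if those two sets together make up the reference gene list.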

  16. Development of genome-specific primers for homoeologous genes in allopolyploid species: the waxy and starch synthase II genes in allohexaploid wheat (Triticum aestivum L.) as examples

    Directory of Open Access Journals (Sweden)

    Brûlé-Babel Anita

    2010-05-01

    Background: In allopolyploid crops, homoeologous genes in different genomes exhibit a very high sequence similarity, especially in the coding regions of genes. This makes it difficult to design genome-specific primers to amplify individual genes from different genomes. Development of genome-specific primers for agronomically important genes in allopolyploid crops is very important and useful not only for the study of sequence diversity and association mapping of genes in natural populations, but also for the development of gene-based functional markers for marker-assisted breeding. Here we report on a useful approach for the development of genome-specific primers in allohexaploid wheat. Findings: In the present study, three genome-specific primer sets for the waxy (Wx) genes and four genome-specific primer sets for the starch synthase II (SSII) genes were developed mainly from single nucleotide polymorphisms (SNPs) and/or insertions or deletions (Indels) in introns and intron-exon junctions. The size of a single PCR product ranged from 750 bp to 1657 bp. The total length of amplified PCR products by these genome-specific primer sets accounted for 72.6%-87.0% of the Wx genes and 59.5%-61.6% of the SSII genes. Five genome-specific primer sets for the Wx genes (one for Wx-7A, three for Wx-4A and one for Wx-7D) could distinguish the wild type wheat and partial waxy wheat lines. These genome-specific primer sets for the Wx and SSII genes produced amplifications in hexaploid wheat, cultivated durum wheat, and Aegilops tauschii accessions, but failed to generate amplification in the majority of wild diploid and tetraploid accessions. Conclusions: For the first time, we report on the development of genome-specific primers from three homoeologous Wx and SSII genes covering the majority of the genes in allohexaploid wheat. These genome-specific primers are being used for the study of sequence diversity and association mapping of the three homoeologous Wx

  17. Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments

    LENUS (Irish Health Repository)

    OhEigeartaigh, Sean S

    2011-07-26

    Background: In standard BLAST searches, no information other than the sequences of the query and the database entries is considered. However, in situations where two genes from different species have only borderline similarity in a BLAST search, the discovery that the genes are located within a region of conserved gene order (synteny) can provide additional evidence that they are orthologs. Thus, for interpreting borderline search results, it would be useful to know whether the syntenic context of a database hit is similar to that of the query. This principle has often been used in investigations of particular genes or genomic regions, but to our knowledge it has never been implemented systematically. Results: We made use of the synteny information contained in the Yeast Gene Order Browser database for 11 yeast species to carry out a systematic search for protein-coding genes that were overlooked in the original annotations of one or more yeast genomes but which are syntenic with their orthologs. Such genes tend to have been overlooked because they are short, highly divergent, or contain introns. The key features of our software, called SearchDOGS, are that the database entries are classified into sets of genomic segments that are already known to be orthologous, and that very weak BLAST hits are retained for further analysis if their genomic location is similar to that of the query. Using SearchDOGS we identified 595 additional protein-coding genes among the 11 yeast species, including two new genes in Saccharomyces cerevisiae. We found additional genes for the mating pheromone a-factor in six species including Kluyveromyces lactis. Conclusions: SearchDOGS has proven highly successful for identifying overlooked genes in the yeast genomes. We anticipate that our approach can be adapted for study of further groups of species, such as bacterial genomes. More generally, the concept of doing sequence similarity searches against databases to which external

  18. Identification of putative noncoding RNA genes in the Burkholderia cenocepacia J2315 genome

    DEFF Research Database (Denmark)

    Coenye, T.; Drevinek, P.; Mahenthiralingam, E.

    2007-01-01

    Noncoding RNA (ncRNA) genes are not involved in the production of mRNA and proteins, but produce transcripts that function directly as structural or regulatory RNAs. In the present study, the presence of ncRNA genes in the genome of Burkholderia cenocepacia J2315 was evaluated by combining compar...

  19. Genome-wide identification of structural variants in genes encoding drug targets

    DEFF Research Database (Denmark)

    Rasmussen, Henrik Berg; Dahmcke, Christina Mackeprang

    2012-01-01

    The objective of the present study was to identify structural variants of drug target-encoding genes on a genome-wide scale. We also aimed at identifying drugs that are potentially amenable for individualization of treatments based on knowledge about structural variation in the genes encoding the...

  20. Construction of gene targeting vectors from lambda KOS genomic libraries.

    Science.gov (United States)

    Wattler, S; Kelly, M; Nehls, M

    1999-06-01

    We describe a highly redundant murine genomic library in a new lambda phage, lambda knockout shuttle (lambda KOS) that facilitates the very rapid construction of replacement-type gene targeting vectors. The library consists of 94 individually amplified subpools, each containing an average of 40,000 independent genomic clones. The subpools are arrayed into a 96-well format that allows a PCR-based efficient recovery of independent genomic clones. The lambda KOS vector backbone permits the CRE-mediated conversion into high-copy number pKOS plasmids, wherein the genomic inserts are automatically flanked by negative-selection cassettes. The lambda KOS vector system exploits the yeast homologous recombination machinery to simplify the construction of replacement-type gene targeting vectors independent of restriction sites within the genomic insert. We outline procedures that allow the generation of simple and more sophisticated conditional gene targeting vectors within 3-4 weeks, beginning with the screening of the lambda KOS genomic library.

  1. BACs as tools for the study of genomic imprinting.

    Science.gov (United States)

    Tunster, S J; Van De Pette, M; John, R M

    2011-01-01

    Genomic imprinting in mammals results in the expression of genes from only one parental allele. Imprinting occurs as a consequence of epigenetic marks set down either in the father's or the mother's germ line and affects a very specific category of mammalian gene. A greater understanding of this distinctive phenomenon can be gained from studies using large genomic clones, called bacterial artificial chromosomes (BACs). Here, we review the important applications of BACs to imprinting research, covering physical mapping studies and the use of BACs as transgenes in mice to study gene expression patterns, to identify imprinting centres, and to isolate the consequences of altered gene dosage. We also highlight the significant and unique advantages that rapid BAC engineering brings to genomic imprinting research.

  2. BACs as Tools for the Study of Genomic Imprinting

    Directory of Open Access Journals (Sweden)

    S. J. Tunster

    2011-01-01

    Genomic imprinting in mammals results in the expression of genes from only one parental allele. Imprinting occurs as a consequence of epigenetic marks set down either in the father's or the mother's germ line and affects a very specific category of mammalian gene. A greater understanding of this distinctive phenomenon can be gained from studies using large genomic clones, called bacterial artificial chromosomes (BACs). Here, we review the important applications of BACs to imprinting research, covering physical mapping studies and the use of BACs as transgenes in mice to study gene expression patterns, to identify imprinting centres, and to isolate the consequences of altered gene dosage. We also highlight the significant and unique advantages that rapid BAC engineering brings to genomic imprinting research.

  3. Identification of neural outgrowth genes using genome-wide RNAi.

    Directory of Open Access Journals (Sweden)

    Katharine J Sepp

    2008-07-01

    While genetic screens have identified many genes essential for neurite outgrowth, they have been limited in their ability to identify neural genes that also have earlier critical roles in the gastrula, or neural genes for which maternally contributed RNA compensates for gene mutations in the zygote. To address this, we developed methods to screen the Drosophila genome using RNA-interference (RNAi) on primary neural cells and present the results of the first full-genome RNAi screen in neurons. We used live-cell imaging and quantitative image analysis to characterize the morphological phenotypes of fluorescently labelled primary neurons and glia in response to RNAi-mediated gene knockdown. From the full genome screen, we focused our analysis on 104 evolutionarily conserved genes that, when downregulated by RNAi, have morphological defects such as reduced axon extension, excessive branching, loss of fasciculation, and blebbing. To assist in the phenotypic analysis of the large data sets, we generated image analysis algorithms that could assess the statistical significance of the mutant phenotypes. The algorithms were essential for the analysis of the thousands of images generated by the screening process and will become a valuable tool for future genome-wide screens in primary neurons. Our analysis revealed unexpected, essential roles in neurite outgrowth for genes representing a wide range of functional categories including signalling molecules, enzymes, channels, receptors, and cytoskeletal proteins. We also found that genes known to be involved in protein and vesicle trafficking showed similar RNAi phenotypes. We confirmed phenotypes of the protein trafficking genes Sec61alpha and Ran GTPase using Drosophila embryo and mouse embryonic cerebral cortical neurons, respectively. Collectively, our results showed that RNAi phenotypes in primary neural culture can parallel in vivo phenotypes, and the screening technique can be used to identify many new

  4. Genomic Characterization of Phenylalanine Ammonia Lyase Gene in Buckwheat.

    Directory of Open Access Journals (Sweden)

    Karthikeyan Thiyagarajan

    The phenylalanine ammonia lyase (PAL) gene, which plays a key role in the biosynthesis of the medicinally important compounds rutin and quercetin, was sequence-characterized for efficient genomics application. These compounds possess anti-diabetic and anti-cancer properties and are predominantly produced by Fagopyrum spp. In the present study, the PAL gene was sequenced from three Fagopyrum spp. (F. tataricum, F. esculentum and F. dibotrys) and showed the presence of three SNPs and four insertions/deletions at the intra- and interspecific level. Among them, the potential SNP (position 949 bp, G>C) with a parsimony-informative site was selected and successfully used to determine the zygosity/allelic variation of 16 F. tataricum varieties. Insertion mutations were identified in the coding region, which resulted in the change of a stretch of 39 amino acids in the putative protein. Our study revealed that the autogamous species (F. tataricum) has a lower frequency of observed SNPs than the allogamous species (F. dibotrys and F. esculentum). The identified SNPs in F. tataricum did not result in amino acid changes, while in the other two species they caused both conservative and non-conservative variations. The consistent pattern of SNPs across the species revealed their phylogenetic importance. We found two groups of F. tataricum, one of which was closely related to F. dibotrys. The sequence characterization of the PAL gene reported in the present investigation can be utilized in the genetic improvement of buckwheat with reference to its medicinal value.

  5. FGF: A web tool for Fishing Gene Family in a whole genome database

    DEFF Research Database (Denmark)

    Zheng, Hongkun; Shi, Junjie; Fang, Xiaodong

    2007-01-01

    Gene duplication is an important process in evolution. The availability of genome sequences of a number of organisms has made it possible to conduct comprehensive searches for duplicated genes enabling informative studies of their evolution. We have established the FGF (Fishing Gene Family) program to efficiently search for and identify gene families. The FGF output displays the results as visual phylogenetic trees including information on gene structure, chromosome position, duplication fate and selective pressure. It is particularly useful to identify pseudogenes and detect changes in gene structure. FGF...

  6. Genome-wide gene-gene interaction analysis for next-generation sequencing.

    Science.gov (United States)

    Zhao, Jinying; Zhu, Yun; Xiong, Momiao

    2016-03-01

    The critical barrier in interaction analysis for next-generation sequencing (NGS) data is that the traditional pairwise interaction analysis that is suitable for common variants is difficult to apply to rare variants because of their prohibitive computational time, large number of tests and low power. The great challenges for successful detection of interactions with NGS data are (1) the demands in the paradigm of changes in interaction analysis; (2) severe multiple testing; and (3) heavy computations. To meet these challenges, we shift the paradigm of interaction analysis between two SNPs to interaction analysis between two genomic regions. In other words, we take a gene as a unit of analysis and use functional data analysis techniques as dimensional reduction tools to develop a novel statistic to collectively test interaction between all possible pairs of SNPs within two genome regions. By intensive simulations, we demonstrate that the functional logistic regression for interaction analysis has the correct type 1 error rates and higher power to detect interaction than the currently used methods. The proposed method was applied to a coronary artery disease dataset from the Wellcome Trust Case Control Consortium (WTCCC) study and the Framingham Heart Study (FHS) dataset, and the early-onset myocardial infarction (EOMI) exome sequence datasets with European origin from the NHLBI's Exome Sequencing Project. We discovered that 6 of 27 pairs of significantly interacted genes in the FHS were replicated in the independent WTCCC study and 24 pairs of significantly interacted genes after applying Bonferroni correction in the EOMI study.
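The multiple-testing burden mentioned in this abstract can be made concrete with a small worked example. The sketch below is an assumption-laden illustration, not the authors' functional logistic regression: it only counts the unordered gene pairs that a gene-level scan would test and applies a plain Bonferroni threshold; the function names and the default alpha of 0.05 are choices made here.

```python
def count_gene_pairs(n_genes):
    """Number of unordered pairs among n_genes, i.e. n * (n - 1) / 2."""
    return n_genes * (n_genes - 1) // 2


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value as significant under a Bonferroni-corrected threshold."""
    n_tests = len(p_values)
    if n_tests == 0:
        return []
    # Bonferroni: divide the family-wise error rate by the number of tests.
    threshold = alpha / n_tests
    return [p <= threshold for p in p_values]


print(count_gene_pairs(1000))                        # pairs to test for 1000 genes
print(bonferroni_significant([0.001, 0.02, 0.04]))   # threshold is 0.05 / 3
```

Because the number of pairs grows quadratically with the number of units, moving from SNP pairs to gene pairs reduces the number of tests sharply, which is the motivation the abstract gives for taking the gene as the unit of analysis.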

  7. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group...

  8. Genome-wide identification and characterization of WRKY gene family in peanut

    Directory of Open Access Journals (Sweden)

    Hui Song

    2016-04-01

    WRKY, an important transcription factor family, is widely distributed in the plant kingdom. Many reports have focused on the analysis of the phylogenetic relationships and biological functions of WRKY proteins at the whole genome level in different plant species. However, little is known about WRKY proteins in the genome of Arachis species and their response to salicylic acid (SA) and jasmonic acid (JA) treatment. In this study, we identified 77 and 75 WRKY proteins from the two wild ancestral diploid genomes of cultivated tetraploid peanut, Arachis duranensis and Arachis ipaënsis, using bioinformatics approaches. Most peanut WRKY coding genes were located on A. duranensis chromosome A6 and A. ipaënsis chromosome B3, while the fewest WRKY genes were found on chromosome 9. The WRKY orthologous gene pairs in A. duranensis and A. ipaënsis chromosomes were highly syntenic. Our analysis indicated that segmental duplication events played a major role in AdWRKY and AiWRKY genes, and strong purifying selection was observed in gene duplication pairs. Furthermore, we translated the knowledge gained from the genome-wide analysis of wild ancestral peanut to cultivated peanut, revealing that the activities of specific cultivated peanut WRKY genes changed in response to SA and JA treatment. Peanut WRKY7, 8 and 13 genes were down-regulated, whereas WRKY1 and 12 genes were up-regulated with SA and JA treatment. These results could provide valuable information for peanut improvement.

  9. Advancing Eucalyptus genomics: identification and sequencing of lignin biosynthesis genes from deep-coverage BAC libraries

    Directory of Open Access Journals (Sweden)

    Kudrna David

    2011-03-01

    Background: Eucalyptus species are among the most planted hardwoods in the world because of their rapid growth, adaptability and valuable wood properties. The development and integration of genomic resources into breeding practice will be increasingly important in the decades to come. Bacterial artificial chromosome (BAC) libraries are key genomic tools that enable positional cloning of important traits, synteny evaluation, and the development of genome framework physical maps for genetic linkage and genome sequencing. Results: We describe the construction and characterization of two deep-coverage BAC libraries EG_Ba and EG_Bb obtained from nuclear DNA fragments of E. grandis (clone BRASUZ1) digested with HindIII and BstYI, respectively. Genome coverages of 17 and 15 haploid genome equivalents were estimated for EG_Ba and EG_Bb, respectively. Both libraries contained large inserts, with average sizes ranging from 135 Kb (Eg_Bb) to 157 Kb (Eg_Ba), and very low extra-nuclear genome contamination, providing a probability of finding a single copy gene ≥ 99.99%. Libraries were screened for the presence of several genes of interest via hybridizations to high-density BAC filters followed by PCR validation. Five selected BAC clones were sequenced and assembled using the Roche GS FLX technology providing the whole sequence of the E. grandis chloroplast genome, and complete genomic sequences of important lignin biosynthesis genes. Conclusions: The two E. grandis BAC libraries described in this study represent an important milestone for the advancement of Eucalyptus genomics and forest tree research. These BAC resources have a highly redundant genome coverage (> 15×), contain large average inserts and have a very low percentage of clones with organellar DNA or empty vectors. These publicly available BAC libraries are thus suitable for a broad range of applications in genetic and genomic research in Eucalyptus and possibly in related species of Myrtaceae
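The link between library coverage and the quoted probability of recovering a single-copy gene can be illustrated with the standard Clarke-Carbon estimate. This is a sketch under assumptions: the abstract does not say which formula the authors used, and the function names and the toy library dimensions below are invented for illustration. Only the coverage values of 15 and 17 genome equivalents come from the abstract.

```python
import math


def clarke_carbon_probability(n_clones, insert_bp, genome_bp):
    """P(at least one clone carries a given locus) = 1 - (1 - insert/genome)^N."""
    return 1.0 - (1.0 - insert_bp / genome_bp) ** n_clones


def probability_from_coverage(coverage):
    """Poisson approximation of the same quantity: 1 - exp(-coverage)."""
    return 1.0 - math.exp(-coverage)


# Coverage values reported in the abstract.
for coverage in (15, 17):
    print(coverage, probability_from_coverage(coverage))

# Toy library (hypothetical numbers): 1000 clones of 100 kb in a 10 Mb genome,
# i.e. 10 genome equivalents.
print(clarke_carbon_probability(1000, 100_000, 10_000_000))
```

Under this approximation, 15 genome equivalents already gives a recovery probability well above 99.99%, which is consistent with the figure quoted in the abstract.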

  10. Snf2 family gene distribution in higher plant genomes reveals DRD1 expansion and diversification in the tomato genome.

    Directory of Open Access Journals (Sweden)

    Joachim W Bargsten

    As part of large protein complexes, Snf2 family ATPases are responsible for energy supply during chromatin remodeling, but the precise mechanism of action of many of these proteins is largely unknown. They influence many processes in plants, such as the response to environmental stress. This analysis is the first comprehensive study of Snf2 family ATPases in plants. We here present a comparative analysis of 1159 candidate plant Snf2 genes in 33 complete and annotated plant genomes, including two green algae. The number of Snf2 ATPases shows considerable variation across plant genomes (17-63 genes). The DRD1, Rad5/16 and Snf2 subfamily members occur most often. Detailed analysis of the plant-specific DRD1 subfamily in related plant genomes shows the occurrence of a complex series of evolutionary events. Notably, tomato carries unexpected gene expansions of DRD1 gene members. Most of these genes are expressed in tomato, although at low levels and with distinct tissue or organ specificity. In contrast, the Snf2 subfamily genes tend to be expressed constitutively in tomato. The results underpin and extend the Snf2 subfamily classification, which could help to determine the various functional roles of Snf2 ATPases and to target environmental stress tolerance and yield in future breeding.

  11. Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis.

    Science.gov (United States)

    Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi

    2015-11-20

    The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, and the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled.
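The N50 statistic mentioned in this abstract can be computed in a few lines. The sketch below uses the common definition (the length of the contig at which the running total of contigs, sorted from longest to shortest, first reaches half of the total assembly length); the function name and the toy contig lengths are assumptions for illustration and are not taken from the marmoset assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length


# Toy assembly: total length 54, so the halfway point is 27.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

A doubling of N50, as reported here, means that half of the assembled bases now sit in contigs at least twice as long as before, although N50 alone says nothing about the correctness of the joins.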

  12. Genomic analysis and gene structure of the plant carotenoid dioxygenase 4 family: a deeper study in Crocus sativus and its allies.

    Science.gov (United States)

    Ahrazem, Oussama; Trapero, Almudena; Gómez, M Dolores; Rubio-Moraga, Angela; Gómez-Gómez, Lourdes

    2010-10-01

    The plastoglobule-targeted enzyme carotenoid cleavage dioxygenase (CCD4) mediates the formation of volatile C13 ketones, such as β-ionone, by cleaving the C9-C10 and C9'-C10' double bonds of cyclic carotenoids. Here, we report the isolation and analysis of CCD4 genomic DNA regions in Crocus sativus. Different CCD4 alleles have been identified: CsCCD4a which is found with and without an intron and CsCCD4b that showed the presence of a unique intron. The presence of different CCD4 alleles was also observed in other Crocus species. Furthermore, comparison of the locations of CCD4 introns within the coding region with CCD4 genes from other plant species suggests that independent gain/losses have occurred. The comparison of the promoter region of CsCCD4a and CsCCD4b with available CCD4 gene promoters from other plant species highlighted the conservation of cis-elements involved in light response, heat stress, as well as the absence and unique presence of cis-elements involved in circadian regulation and low temperature responses, respectively. Functional characterization of the Crocus sativus CCD4a promoter using Arabidopsis plants stably transformed with a DNA fragment of 1400 base pairs (P-CsCCD4a) fused to the β-glucuronidase (GUS) reporter gene showed that this sequence was sufficient to drive GUS expression in the flower, in particular high levels were detected in pollen.

  13. Mining Bacterial Genomes for Secondary Metabolite Gene Clusters.

    Science.gov (United States)

    Adamek, Martina; Spohn, Marius; Stegmann, Evi; Ziemert, Nadine

    2017-01-01

    With the emergence of bacterial resistance against frequently used antibiotics, novel antibacterial compounds are urgently needed. Traditional bioactivity-guided drug discovery strategies involve laborious screening efforts and display high rediscovery rates. With the progress in next generation sequencing methods and the knowledge that the majority of antibiotics in clinical use are produced as secondary metabolites by bacteria, mining bacterial genomes for secondary metabolites with antimicrobial activity is a promising approach, which can guide a more time and cost-effective identification of novel compounds. However, what sounds easy to accomplish comes with several challenges. To date, several tools for the prediction of secondary metabolite gene clusters are available, some of which are based on the detection of signature genes, while others search for specific patterns in gene content or regulation. Apart from the mere identification of gene clusters, several other factors such as determining cluster boundaries and assessing the novelty of the detected cluster are important. For this purpose, comparison of the predicted secondary metabolite genes with different cluster and compound databases is necessary. Furthermore, it is advisable to classify detected clusters into gene cluster families. So far, there is no standardized procedure for genome mining; however, different approaches to overcome all of these challenges exist and are addressed in this chapter. We give practical guidance on the workflow for secondary metabolite gene cluster identification, which includes the determination of gene cluster boundaries, addresses problems occurring with the use of draft genomes, and gives an outlook on the different methods for gene cluster classification. Based on comprehensible examples a protocol is set, which should enable the readers to mine their own genome data for interesting secondary metabolites.

  14. [Evolution of gene orders in genomes of cyanobacteria].

    Science.gov (United States)

    Markov, A V; Zakharov, I A

    2009-08-01

    Genomes of 23 strains of cyanobacteria were comparatively analyzed using quantitative methods of estimation of gene order similarity. It has been found that reconstructions of phylogenesis of cyanobacteria based on the comparison of the orders of genes in chromosomes and nucleotide sequences appear to be similar. This confirms the applicability of quantitative measures of similarity of gene orders for phylogenetic reconstructions. In the evolution of marine unicellular plankton cyanobacteria, genome rearrangements are fixed with a low rate (about 3% of gene order changes per 1% of 16S rRNA changes), whereas in other groups of cyanobacteria the gene order can change several times more rapidly. The gene orders in genomes of cyanobacteria and chloroplasts preserve a considerable degree of similarity. The closest relatives of chloroplasts among the analyzed cyanobacteria are likely to be strains from hot springs belonging to the genus Synechococcus. Comparative analysis of gene orders and nucleotide sequences strongly suggests that Synechococcus strains from different environments (sea, fresh waters, hot springs) are not related and belong to evolutionally distant lines.
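
    The abstract does not say which quantitative measure of gene order similarity was used. As an illustration only, the sketch below uses one simple and common choice: the Jaccard index of gene adjacencies (unordered pairs of neighbouring genes) over the genes two genomes share. The function names and gene labels are hypothetical, and chromosomes are treated as linear and unsigned for simplicity.

```python
def adjacencies(order):
    """Set of unordered neighbouring gene pairs in a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def adjacency_similarity(order_a, order_b):
    """Jaccard index of gene adjacencies, restricted to shared genes."""
    shared = set(order_a) & set(order_b)
    a = [g for g in order_a if g in shared]
    b = [g for g in order_b if g in shared]
    adj_a, adj_b = adjacencies(a), adjacencies(b)
    union = adj_a | adj_b
    if not union:
        return 0.0
    return len(adj_a & adj_b) / len(union)


# Hypothetical genomes: genome_b carries an inversion of the block g3..g5.
genome_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
genome_b = ["g1", "g2", "g5", "g4", "g3", "g6"]
print(adjacency_similarity(genome_a, genome_b))  # 3 shared of 7 distinct adjacencies
```

    Pairwise values of this kind (converted to distances as one minus the similarity) could be fed to a distance-based tree-building method, which is the general idea behind comparing gene-order trees with sequence trees.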

  15. GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences.

    Science.gov (United States)

    Antonov, Ivan; Baranov, Pavel; Borodovsky, Mark

    2013-01-01

    Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively little attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) could be caused by sequencing errors or indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events (that evolved to control expression of some genes). Earlier, we developed an algorithm and software program, GeneTack, for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at http://topaz.gatech.edu/GeneTack/db.html) containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206 991 fs-genes from 1106 complete prokaryotic genomes and 45 295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position and frameshift direction (-1, +1). The fs-genes can be retrieved by similarity search to a given query sequence via a web interface, by fs-gene cluster browsing, etc. Clusters of fs-genes are characterized with respect to their likely origin, such as pseudogenization, phase variation, etc. The largest clusters contain fs-genes with programmed frameshifts (related to recoding events).

  16. Genome-wide association study suggests common variants within RP11-634B7.4 gene influencing severe pre-treatment pain in head and neck cancer patients

    Science.gov (United States)

    Reyes-Gibby, Cielito C.; Wang, Jian; Silvas, Mary Rose T.; Yu, Robert K.; Hanna, Ehab Y.; Shete, Sanjay

    2016-01-01

    Pain is often one of the first signs of squamous cell carcinoma of the head and neck (HNSCC). Pain at diagnosis is an important prognostic marker for the development of chronic pain, and importantly, for the overall survival time. To identify variants influencing severe pre-treatment pain in 1,368 patients newly diagnosed with HNSCC, we conducted a genome-wide association study based on 730,525 tagging SNPs. The patients were all previously untreated for cancer. About 15% of the patients had severe pre-treatment pain, defined as pain score ≥7 (0 = “no pain” and 10 = “worst pain”). We identified 3 common genetic variants in high linkage disequilibrium for severe pre-treatment pain, representing one genomic region at 1q44 (rs3862188, P = 3.45 × 10^-8; rs880143, P = 3.45 × 10^-8; and rs7526880, P = 4.92 × 10^-8), which maps to the RP11-634B7.4 gene, a novel antisense gene to three olfactory receptor genes. Olfactory receptor genes, upstream effectors of the MAPK signaling cascade, might be novel target genes for pain in HNSCC patients. Future experimental validation to explore biological mechanisms will be key to defining the role of the intronic variants and non-coding RNA for pain in patients with HNSCC. PMID:27670397

  17. Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters.

    Science.gov (United States)

    Schorn, Michelle A; Alanjary, Mohammad M; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R; Ziemert, Nadine; Moore, Bradley S

    2016-12-01

    Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites.

  18. Candidate genes revealed by a genome scan for mosquito resistance to a bacterial insecticide: sequence and gene expression variations

    Directory of Open Access Journals (Sweden)

    David Jean-Philippe

    2009-11-01

    Full Text Available Abstract Background Genome scans are becoming an increasingly popular approach to study the genetic basis of adaptation and speciation, but on their own, they are often helpless at identifying the specific gene(s) or mutation(s) targeted by selection. This shortcoming is hopefully bound to disappear in the near future, thanks to the wealth of new genomic resources that are currently being developed for many species. In this article, we provide a foretaste of this exciting new era by conducting a genome scan in the mosquito Aedes aegypti with the aim to look for candidate genes involved in resistance to Bacillus thuringiensis subsp. israelensis (Bti) insecticidal toxins. Results The genomes of a Bti-resistant and a Bti-susceptible strain were surveyed using about 500 MITE-based molecular markers, and the loci showing the highest inter-strain genetic differentiation were sequenced and mapped on the Aedes aegypti genome sequence. Several good candidate genes for Bti-resistance were identified in the vicinity of these highly differentiated markers. Two of them, coding for a cadherin and a leucine aminopeptidase, were further examined at the sequence and gene expression levels. In the resistant strain, the cadherin gene displayed patterns of nucleotide polymorphisms consistent with the action of positive selection (e.g. an excess of high compared to intermediate frequency mutations), as well as a significant under-expression compared to the susceptible strain. Conclusion Both sequence and gene expression analyses agree to suggest a role for positive selection in the evolution of this cadherin gene in the resistant strain. However, it is unlikely that resistance to Bti is conferred by this gene alone, and further investigation will be needed to characterize other genes significantly associated with Bti resistance in Ae. aegypti. Beyond these results, this article illustrates how genome scans can build on the body of new genomic information (here, full

  19. Bacterial genes in the aphid genome: absence of functional gene transfer from Buchnera to its host.

    Directory of Open Access Journals (Sweden)

    Naruo Nikoh

    2010-02-01

    Full Text Available Genome reduction is typical of obligate symbionts. In cellular organelles, this reduction partly reflects transfer of ancestral bacterial genes to the host genome, but little is known about gene transfer in other obligate symbioses. Aphids harbor anciently acquired obligate mutualists, Buchnera aphidicola (Gammaproteobacteria), which have highly reduced genomes (420-650 kb), raising the possibility of gene transfer from ancestral Buchnera to the aphid genome. In addition, aphids often harbor other bacteria that also are potential sources of transferred genes. Previous limited sampling of genes expressed in bacteriocytes, the specialized cells that harbor Buchnera, revealed that aphids acquired at least two genes from bacteria. The newly sequenced genome of the pea aphid, Acyrthosiphon pisum, presents the first opportunity for a complete inventory of genes transferred from bacteria to the host genome in the context of an ancient obligate symbiosis. Computational screening of the entire A. pisum genome, followed by phylogenetic and experimental analyses, provided strong support for the transfer of 12 genes or gene fragments from bacteria to the aphid genome: three LD-carboxypeptidases (LdcA1, LdcA2, psiLdcA), five rare lipoprotein As (RlpA1-5), N-acetylmuramoyl-L-alanine amidase (AmiD), 1,4-beta-N-acetylmuramidase (bLys), DNA polymerase III alpha chain (psiDnaE), and ATP synthase delta chain (psiAtpH). Buchnera was the apparent source of two highly truncated pseudogenes (psiDnaE and psiAtpH). Most other transferred genes were closely related to genes from relatives of Wolbachia (Alphaproteobacteria). At least eight of the transferred genes (LdcA1, AmiD, RlpA1-5, bLys) appear to be functional, and expression of seven (LdcA1, AmiD, RlpA1-5) is highly upregulated in bacteriocytes. The LdcAs and RlpAs appear to have been duplicated after transfer. Our results excluded the hypothesis that genome reduction in Buchnera has been accompanied by gene transfer to the

  20. Genome-wide Analysis of Gene Regulation

    DEFF Research Database (Denmark)

    Chen, Yun

    IP-seq and small RNA-seq, we delineated the landscape of the promoters with bidirectional transcription that yields steady-state RNA in only one direction (Paper III). A subsequent motif analysis enabled us to uncover specific DNA signals – early polyA sites – that make RNA on the reverse strand sensitive...... they regulated or if the sites had global elevated usage rates by multiple TFs. Using RNA-seq, 5’end-seq in combination with depletion of 5’exonuclease as well as nonsense-mediated decay (NMD) factors, we systematically analyzed NMD substrates as well as their degradation intermediates in human cells (Paper V......). Gene enrichment analysis on the detected NMD substrates revealed an unappreciated NMD-based regulatory mechanism of the genes hosting multiple intronic snoRNAs, which can facilitate differential expression of individual snoRNAs from a single host gene locus. Finally, supported by RNA-seq and small RNA-seq...

  1. Discovery of germline-related genes in Cephalochordate amphioxus: A genome wide survey using genome annotation and transcriptome data.

    Science.gov (United States)

    Yue, Jia-Xing; Li, Kun-Lung; Yu, Jr-Kai

    2015-12-01

    The generation of germline cells is a critical process in the reproduction of multicellular organisms. Studies in animal models have identified a common repertoire of genes that play essential roles in primordial germ cell (PGC) formation. However, comparative studies also indicate that the timing and regulation of this core genetic program vary considerably in different animals, raising the intriguing questions regarding the evolution of PGC developmental mechanisms in metazoans. Cephalochordates (commonly called amphioxus or lancelets) represent one of the invertebrate chordate groups and can provide important information about the evolution of developmental mechanisms in the chordate lineage. In this study, we used genome and transcriptome data to identify germline-related genes in two distantly related cephalochordate species, Branchiostoma floridae and Asymmetron lucayanum. Branchiostoma and Asymmetron diverged more than 120 MYA, and the most conspicuous difference between them is their gonadal morphology. We used important germline developmental genes in several model animals to search the amphioxus genome and transcriptome dataset for conserved homologs. We also annotated the assembled transcriptome data using Gene Ontology (GO) terms to facilitate the discovery of putative genes associated with germ cell development and reproductive functions in amphioxus. We further confirmed the expression of 14 genes in developing oocytes or mature eggs using whole mount in situ hybridization, suggesting their potential functions in amphioxus germ cell development. The results of this global survey provide a useful resource for testing potential functions of candidate germline-related genes in cephalochordates and for investigating differences in gonad developmental mechanisms between Branchiostoma and Asymmetron species.

  2. Genome-wide patterns of Arabidopsis gene expression in nature.

    Directory of Open Access Journals (Sweden)

    Christina L Richards

    Full Text Available Organisms in the wild are subject to multiple, fluctuating environmental factors, and it is in complex natural environments that genetic regulatory networks actually function and evolve. We assessed genome-wide gene expression patterns in the wild in two natural accessions of the model plant Arabidopsis thaliana and examined the nature of transcriptional variation throughout its life cycle and gene expression correlations with natural environmental fluctuations. We grew plants in a natural field environment and measured genome-wide time-series gene expression from the plant shoot every three days, spanning the seedling to reproductive stages. We find that 15,352 genes were expressed in the A. thaliana shoot in the field, and accession and flowering status (vegetative versus flowering) were strong components of transcriptional variation in this plant. We identified between ∼110 and 190 time-varying gene expression clusters in the field, many of which were significantly overrepresented by genes regulated by abiotic and biotic environmental stresses. The two main principal components of vegetative shoot gene expression (PC(veg)) correlate to temperature and precipitation occurrence in the field. The largest PC(veg) axis included thermoregulatory genes while the second major PC(veg) was associated with precipitation and contained drought-responsive genes. By exposing A. thaliana to natural environments in an open field, we provide a framework for further understanding the genetic networks that are deployed in natural environments, and we connect plant molecular genetics in the laboratory to plant organismal ecology in the wild.

  3. Comparative genomic analysis of Drosophila melanogaster and vector mosquito developmental genes.

    Directory of Open Access Journals (Sweden)

    Susanta K Behura

    Full Text Available Genome sequencing projects have presented the opportunity for analysis of developmental genes in three vector mosquito species: Aedes aegypti, Culex quinquefasciatus, and Anopheles gambiae. A comparative genomic analysis of developmental genes in Drosophila melanogaster and these three important vectors of human disease was performed in this investigation. While the study was comprehensive, special emphasis centered on genes that (1) are components of developmental signaling pathways, (2) regulate fundamental developmental processes, (3) are critical for the development of tissues of vector importance, (4) function in developmental processes known to have diverged within insects, and (5) encode microRNAs (miRNAs) that regulate developmental transcripts in Drosophila. While most fruit fly developmental genes are conserved in the three vector mosquito species, several genes known to be critical for Drosophila development were not identified in one or more mosquito genomes. In other cases, mosquito lineage-specific gene gains with respect to D. melanogaster were noted. Sequence analyses also revealed that numerous repetitive sequences are a common structural feature of Drosophila and mosquito developmental genes. Finally, analysis of predicted miRNA binding sites in fruit fly and mosquito developmental genes suggests that the repertoire of developmental genes targeted by miRNAs is species-specific. The results of this study provide insight into the evolution of developmental genes and processes in dipterans and other arthropods, serve as a resource for those pursuing analysis of mosquito development, and will promote the design and refinement of functional analysis experiments.

  4. Genomic imprinting and maternal effect genes in haplodiploid sex determination.

    Science.gov (United States)

    van de Zande, L; Verhulst, E C

    2014-01-01

    The research into the Drosophila melanogaster sex-determining system has been at the basis of all further research on insect sex determination. This further research has made it clear that, for most insect species, the presence of sufficient functional Transformer (TRA) protein in the early embryonic stage is essential for female sexual development. In Hymenoptera, functional analysis of sex determination by knockdown studies of sex-determining genes has only been performed for 2 species. The first is the social insect species Apis mellifera, the honeybee, which has single-locus complementary sex determination (CSD). The other species is the parasitoid Nasonia vitripennis, the jewel wasp. Nasonia has a non-CSD sex-determining system, described as the maternal effect genomic imprinting sex determination system (MEGISD). Here, we describe the arguments that eventually led to the formulation of MEGISD and the experimental data that supported and refined this model. We evaluate the possibility that DNA methylation lies at the basis of MEGISD and briefly address the role of genomic imprinting in non-CSD sex determination in other Hymenoptera.

  5. Genome-wide analysis of homeobox genes from Mesobuthus martensii reveals Hox gene duplication in scorpions.

    Science.gov (United States)

    Di, Zhiyong; Yu, Yao; Wu, Yingliang; Hao, Pei; He, Yawen; Zhao, Huabin; Li, Yixue; Zhao, Guoping; Li, Xuan; Li, Wenxin; Cao, Zhijian

    2015-06-01

    Homeobox genes belong to a large gene group, which encodes the famous DNA-binding homeodomain that plays a key role in development and cellular differentiation during embryogenesis in animals. Here, one hundred forty-nine homeobox genes were identified from the Asian scorpion, Mesobuthus martensii (Chelicerata: Arachnida: Scorpiones: Buthidae) based on our newly assembled genome sequence with approximately 248 × coverage. The identified homeobox genes were categorized into eight classes including 82 families: 67 ANTP class genes, 33 PRD genes, 11 LIM genes, five POU genes, six SINE genes, 14 TALE genes, five CUT genes, two ZF genes and six unclassified genes. Transcriptome data confirmed that more than half of the genes were expressed in adults. The homeobox gene diversity of the eight classes is similar to the previously analyzed Mandibulata arthropods. Interestingly, it is hypothesized that the scorpion M. martensii may have two Hox clusters. The first complete genome-wide analysis of homeobox genes in Chelicerata not only reveals the repertoire of scorpion, arachnid and chelicerate homeobox genes, but also shows some insights into the evolution of arthropod homeobox genes.

  6. Genome-wide search for gene-gene interactions in colorectal cancer.

    Directory of Open Access Journals (Sweden)

    Shuo Jiao

    Full Text Available Genome-wide association studies (GWAS) have successfully identified a number of single-nucleotide polymorphisms (SNPs) associated with colorectal cancer (CRC) risk. However, these susceptibility loci known today explain only a small fraction of the genetic risk. Gene-gene interaction (GxG) is considered to be one source of the missing heritability. To address this, we performed a genome-wide search for pair-wise GxG associated with CRC risk using 8,380 cases and 10,558 controls in the discovery phase and 2,527 cases and 2,658 controls in the replication phase. We developed a simple, but powerful method for testing interaction, which we term the Average Risk Due to Interaction (ARDI). With this method, we conducted a genome-wide search to identify SNPs showing evidence for GxG with previously identified CRC susceptibility loci from 14 independent regions. We also conducted a genome-wide search for GxG using the marginal association screening and examining interaction among SNPs that pass the screening threshold (p<10^-4). For the known locus rs10795668 (10p14), we found an interacting SNP rs367615 (5q21) with replication p = 0.01 and combined p = 4.19×10^-8. Among the top marginal SNPs after LD pruning (n = 163), we identified an interaction between rs1571218 (20p12.3) and rs10879357 (12q21.1) (nominal combined p = 2.51×10^-6; Bonferroni adjusted p = 0.03). Our study represents the first comprehensive search for GxG in CRC, and our results may provide new insight into the genetic etiology of CRC.
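
    The Bonferroni-adjusted value quoted in this abstract can be roughly checked with a few lines of arithmetic. The abstract does not state how many tests were corrected for; the sketch below assumes, as a guess, that every unordered pair among the 163 LD-pruned SNPs was tested. The function name is hypothetical.

```python
from math import comb


def bonferroni(p_value, n_tests):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)


n_snps = 163            # top marginal SNPs after LD pruning (from the abstract)
nominal_p = 2.51e-6     # nominal combined p-value (from the abstract)

# Assumption: one interaction test per unordered SNP pair.
n_pairs = comb(n_snps, 2)
adjusted_p = bonferroni(nominal_p, n_pairs)

print(n_pairs)               # prints 13203
print(f"{adjusted_p:.2f}")   # prints 0.03
```

    Under that assumption the result agrees with the reported adjusted p = 0.03, which suggests, but does not prove, that the correction was over all 13,203 pairs.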

  7. In-silico human genomics with GeneCards

    Directory of Open Access Journals (Sweden)

    Stelzer Gil

    2011-10-01

    Full Text Available Abstract Since 1998, the bioinformatics, systems biology, genomics and medical communities have enjoyed a synergistic relationship with the GeneCards database of human genes (http://www.genecards.org). This human gene compendium was created to help to introduce order into the increasing chaos of information flow. As a consequence of viewing details and deep links related to specific genes, users have often requested enhanced capabilities, such that, over time, GeneCards has blossomed into a suite of tools (including GeneDecks, GeneALaCart, GeneLoc, GeneNote and GeneAnnot) for a variety of analyses of both single human genes and sets thereof. In this paper, we focus on in-house and external research activities which have been enabled, enhanced, complemented and, in some cases, motivated by GeneCards. In turn, such interactions have often inspired and propelled improvements in GeneCards. We describe here the evolution and architecture of this project, including examples of synergistic applications in diverse areas such as synthetic lethality in cancer, the annotation of genetic variations in disease, omics integration in a systems biology approach to kidney disease, and bioinformatics tools.

  8. Identification of new genes in Sinorhizobium meliloti using the Genome Sequencer FLX system

    Directory of Open Access Journals (Sweden)

    Jensen Roderick V

    2008-05-01

    Full Text Available Abstract Background Sinorhizobium meliloti is an agriculturally important model symbiont. There is an ongoing need to update and improve its genome annotation. In this study, we used a high-throughput pyrosequencing approach to sequence the transcriptome of S. meliloti, and search for new bacterial genes missed in the previous genome annotation. This is the first report of sequencing a bacterial transcriptome using the pyrosequencing technology. Results Our pilot sequencing run generated 19,005 reads with an average length of 136 nucleotides per read. From these data, we identified 20 new genes. These new gene transcripts were confirmed by RT-PCR and their possible functions were analyzed. Conclusion Our results indicate that high-throughput sequence analysis of bacterial transcriptomes is feasible and next-generation sequencing technologies will greatly facilitate the discovery of new genes and improve genome annotation.

  9. Chiropteran types I and II interferon genes inferred from genome sequencing traces by a statistical gene-family assembler

    Directory of Open Access Journals (Sweden)

    Haines Albert

    2010-07-01

    Full Text Available Abstract Background The rate of emergence of human pathogens is steadily increasing; most of these novel agents originate in wildlife. Bats, remarkably, are the natural reservoirs of many of the most pathogenic viruses in humans. There are two bat genome projects currently underway, a circumstance that promises to speed the discovery of host factors important in the coevolution of bats with their viruses. These genomes, however, are not yet assembled and one of them will provide only low coverage, making the inference of most genes of immunological interest error-prone. Many more wildlife genome projects are underway and intend to provide only shallow coverage. Results We have developed a statistical method for the assembly of gene families from partial genomes. The method takes full advantage of the quality scores generated by base-calling software, incorporating them into a complete probabilistic error model, to overcome the limitation inherent in the inference of gene family members from partial sequence information. We validated the method by inferring the human IFNA genes from the genome trace archives, and used it to infer 61 type-I interferon genes, and single type-II interferon genes in the bats Pteropus vampyrus and Myotis lucifugus. We confirmed our inferences by direct cloning and sequencing of IFNA, IFNB, IFND, and IFNK in P. vampyrus, and by demonstrating transcription of some of the inferred genes by known interferon-inducing stimuli. Conclusion The statistical trace assembler described here provides a reliable method for extracting information from the many available and forthcoming partial or shallow genome sequencing projects, thereby facilitating the study of a wider variety of organisms with ecological and biomedical significance to humans than would otherwise be possible.
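
    The abstract does not describe its probabilistic error model in detail. As a minimal illustration of how base-caller quality scores can enter such a model, the sketch below assumes Phred-scaled scores (Q = -10 log10 of the error probability) and a deliberately simple error model in which a miscalled base is equally likely to be any of the other three nucleotides. All function names are hypothetical and this is not the authors' method.

```python
from math import log


def phred_to_error_prob(q):
    """Convert a Phred quality score to the probability that the call is wrong."""
    return 10 ** (-q / 10)


def base_likelihood(called, quality, true_base):
    """P(called base | true base) under a uniform-substitution error model."""
    err = phred_to_error_prob(quality)
    return 1 - err if called == true_base else err / 3


def read_log_likelihood(read, qualities, candidate):
    """Log-likelihood of a read given a candidate gene sequence of equal length."""
    if not (len(read) == len(qualities) == len(candidate)):
        raise ValueError("read, qualities and candidate must have equal length")
    return sum(
        log(base_likelihood(r, q, c))
        for r, q, c in zip(read, qualities, candidate)
    )


# Hypothetical read with one low-quality base at the third position.
read, quals = "ACGT", [40, 40, 5, 40]
print(read_log_likelihood(read, quals, "ACGT"))
print(read_log_likelihood(read, quals, "ACTT"))
```

    A mismatch at a low-quality position costs far less likelihood than one at a high-quality position, which is the basic reason quality scores help when assigning shallow-coverage reads to closely related gene family members.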

  10. Daysleeper : from genomic parasite to indispensable gene

    NARCIS (Netherlands)

    Knip, Marijn

    2012-01-01

    In this thesis the evolutionary background, function and localization of the domesticated transposase DAYSLEEPER are described. We found that DAYSLEEPER-like genes can be found in angiosperms, but not in lower plants. We also found that DAYSLEEPER interacts with several proteins and is probably

  11. Breaking the 1000-gene barrier for Mimivirus using ultra-deep genome and transcriptome sequencing

    Directory of Open Access Journals (Sweden)

    Claverie Jean-Michel

    2011-03-01

    Full Text Available Abstract Background Mimivirus, a giant dsDNA virus infecting Acanthamoeba, is the prototype of the Mimiviridae family, the latest addition to the family of the nucleocytoplasmic large DNA viruses (NCLDVs). Its 1.2 Mb-genome was initially predicted to encode 917 genes. A subsequent RNA-Seq analysis precisely mapped many transcript boundaries and identified 75 new genes. Findings We now report a much deeper analysis using the SOLiD™ technology combining RNA-Seq of the Mimivirus transcriptome during the infectious cycle (202.4 million reads), and a complete genome re-sequencing (45.3 million reads). This study corrected the genome sequence and identified several single nucleotide polymorphisms. Our results also provided clear evidence of previously overlooked transcription units, including an important RNA polymerase subunit distantly related to Euryarchea homologues. The total Mimivirus gene count is now 1018, 11% greater than the original annotation. Conclusions This study highlights the huge progress brought about by ultra-deep sequencing for the comprehensive annotation of virus genomes, opening the door to a complete one-nucleotide resolution level description of their transcriptional activity, and to the realistic modeling of the viral genome expression at the ultimate molecular level. This work also illustrates the need to go beyond bioinformatics-only approaches for the annotation of short protein and non-coding genes in viral genomes.

  12. Genomic analysis reveals extensive gene duplication within the bovine TRB locus

    Directory of Open Access Journals (Sweden)

    Law Andy

    2009-04-01

    Full Text Available Abstract Background Diverse TR and IG repertoires are generated by V(D)J somatic recombination. Genomic studies have been pivotal in cataloguing the V, D, J and C genes present in the various TR/IG loci and describing how duplication events have expanded the number of these genes. Such studies have also provided insights into the evolution of these loci and the complex mechanisms that regulate TR/IG expression. In this study we analyze the sequence of the third bovine genome assembly to characterize the germline repertoire of bovine TRB genes and compare the organization, evolution and regulatory structure of the bovine TRB locus with that of humans and mice. Results The TRB locus in the third bovine genome assembly is distributed over 5 scaffolds, extending to ~730 Kb. The available sequence contains 134 TRBV genes, assigned to 24 subgroups, and 3 clusters of DJC genes, each comprising a single TRBD gene, 5–7 TRBJ genes and a single TRBC gene. Seventy-nine of the TRBV genes are predicted to be functional. Comparison with the human and murine TRB loci shows that the gene order, as well as the sequences of non-coding elements that regulate TRB expression, are highly conserved in the bovine. Dot-plot analyses demonstrate that expansion of the genomic TRBV repertoire has occurred via a complex and extensive series of duplications, predominantly involving DNA blocks containing multiple genes. These duplication events have resulted in massive expansion of several TRBV subgroups, most notably TRBV6, 9 and 21 which contain 40, 35 and 16 members respectively. Similarly, duplication has led to the generation of a third DJC cluster. Analyses of cDNA data confirm the diversity of the TRBV genes and, in addition, identify a substantial number of TRBV genes, predominantly from the larger subgroups, which are still absent from the genome assembly. The observed gene duplication within the bovine TRB locus has created a repertoire of phylogenetically
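
    The dot-plot analyses mentioned in this record can be illustrated with a toy sketch. It reports every position pair at which two sequences share an exact word of a given length; when a sequence is compared with itself, duplicated blocks show up as runs of dots parallel to the main diagonal. The function name, word length and sequence are hypothetical, and real analyses of a locus of this size would use dedicated alignment software rather than exact word matching.

```python
def dot_plot(seq_a, seq_b, word=4):
    """Return (i, j) pairs where seq_a[i:i+word] == seq_b[j:j+word]."""
    # Index every word of seq_b by its start positions.
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    # Look up each word of seq_a in that index.
    return [
        (i, j)
        for i in range(len(seq_a) - word + 1)
        for j in index.get(seq_a[i:i + word], [])
    ]


# Hypothetical sequence containing a tandem duplication of "GATTACA".
seq = "GATTACAGATTACA"
dots = dot_plot(seq, seq)

# Dots above the main diagonal reveal the duplicated block.
off_diagonal = [(i, j) for i, j in dots if j > i]
print(off_diagonal)  # prints [(0, 7), (1, 8), (2, 9), (3, 10)]
```

    The constant offset of 7 between the two coordinates of each off-diagonal dot corresponds to the distance between the two copies of the duplicated block.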

  13. Structural features of conopeptide genes inferred from partial sequences of the Conus tribblei genome.

    Science.gov (United States)

    Barghi, Neda; Concepcion, Gisela P; Olivera, Baldomero M; Lluisma, Arturo O

    2016-02-01

    The evolvability of venom components (in particular, the gene-encoded peptide toxins) in venomous species serves as an adaptive strategy allowing them to target new prey types or respond to changes in the prey field. The structure, organization, and expression of the venom peptide genes may provide insights into the molecular mechanisms that drive the evolution of such genes. Conus is a particularly interesting group given the high chemical diversity of their venom peptides, and the rapid evolution of the conopeptide-encoding genes. Conus genomes, however, are large and characterized by a high proportion of repetitive sequences. As a result, the structure and organization of conopeptide genes have remained poorly known. In this study, a survey of the genome of Conus tribblei was undertaken to address this gap. A partial assembly of the C. tribblei genome was generated; the assembly, though consisting of a large number of fragments, accounted for 2160.5 Mb of sequence. A large number of repetitive genomic elements consisting of 642.6 Mb of retrotransposable elements, simple repeats, and novel interspersed repeats were observed. We characterized the structural organization and distribution of conotoxin genes in the genome. A significant number of conopeptide genes (estimated to be between 148 and 193) belonging to different superfamilies with complete or nearly complete exon regions were observed, ~60% of which were expressed. The unexpressed conopeptide genes represent hidden but significant conotoxin diversity. The conotoxin genes also differed in the frequency and length of the introns. The interruption of exons by long introns in the conopeptide genes and the presence of repeats in the introns may indicate the importance of introns in facilitating recombination, evolution and diversification of conotoxins. These findings advance our understanding of the structural framework that promotes the gene-level molecular evolution of venom peptides.

  14. Genome Binding and Gene Regulation by Stem Cell Transcription Factors

    NARCIS (Netherlands)

    J.H. Brandsma (Johan)

    2016-01-01

    Nearly all cells of an individual organism contain the same genome. However, each cell type transcribes a different set of genes due to the presence of different sets of cell type-specific transcription factors. Such transcription factors bind to regulatory regions such as promoters

  15. Gene hunting : molecular analysis of the chicken genome

    NARCIS (Netherlands)

    Crooijmans, R.P.M.A.

    2000-01-01

    This dissertation describes the development of molecular tools to identify genes that are involved in production and health traits in poultry. To unravel the chicken genome, fluorescent molecular markers (microsatellite markers) were developed and optimized to perform high throughput screening of re

  16. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.

    2004-01-01

    in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H. influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions....

  17. Re-Examining the Gene in Personalized Genomics

    Science.gov (United States)

    Bartol, Jordan

    2013-01-01

    Personalized genomics companies (PG; also called "direct-to-consumer genetics") are businesses marketing genetic testing to consumers over the Internet. While much has been written about these new businesses, little attention has been given to their roles in science communication. This paper provides an analysis of the gene concept…

  18. Evolution of the mitochondrial genome in snakes: Gene rearrangements and phylogenetic relationships

    Directory of Open Access Journals (Sweden)

    Zhou Kaiya

    2008-11-01

    Full Text Available Abstract Background: Snakes as a major reptile group display a variety of morphological characteristics pertaining to their diverse behaviours. Despite abundant analyses of morphological characters, molecular studies using mitochondrial and nuclear genes are limited. As a result, the phylogeny of snakes remains controversial. Previous studies on mitochondrial genomes of snakes have demonstrated duplication of the control region and translocation of trnL to be two notable features of the alethinophidian (all serpents except blindsnakes and threadsnakes) mtDNAs. Our purpose is to further investigate the gene organizations, evolution of the snake mitochondrial genome, and phylogenetic relationships among several major snake families. Results: The mitochondrial genomes were sequenced for four taxa representing four different families, and each had a different gene arrangement. Comparative analyses with other snake mitochondrial genomes allowed us to summarize six types of mitochondrial gene arrangement in snakes. Phylogenetic reconstruction with commonly used methods of phylogenetic inference (BI, ML, MP, NJ) arrived at a similar topology, which was used to reconstruct the evolution of mitochondrial gene arrangements in snakes. Conclusion: The phylogenetic relationships among the major families of snakes are in accordance with the mitochondrial genomes in terms of gene arrangements. The gene arrangement in Ramphotyphlops braminus mtDNA is inferred to be ancestral for snakes. After the divergence of the early Ramphotyphlops lineage, three types of rearrangements occurred. These changes involve translocations within the IQM tRNA gene cluster and the duplication of the CR. All phylogenetic methods support the placement of Enhydris plumbea outside of the (Colubridae + Elapidae) cluster, providing mitochondrial genomic evidence for the familial rank of Homalopsidae.

  19. Mapping and annotating obesity-related genes in pig and human genomes.

    Science.gov (United States)

    Martelli, Pier Luigi; Fontanesi, Luca; Piovesan, Damiano; Fariselli, Piero; Casadio, Rita

    2014-01-01

    Background. Obesity is a major health problem in both developed and emerging countries. Obesity is a complex disease whose etiology involves genetic factors in strong interplay with environmental determinants and lifestyle. The discovery of genetic factors and biological pathways underlying human obesity is hampered by the difficulty in controlling the genetic background of human cohorts. Animal models are therefore necessary to further dissect the genetics of obesity. The pig has emerged as one of the most attractive models, because of its similarity with humans in the mechanisms regulating fat deposition. Results. We collected the genes related to obesity in humans and to fat deposition traits in pig. We localized them on both human and pig genomes, building a map useful to interpret comparative studies on obesity. We characterized the collected genes structurally and functionally with BAR+ and mapped them on KEGG pathways and on the STRING protein interaction network. Conclusions. The collected set consists of 361 obesity-related genes in human and pig genomes. All genes were mapped on the human genome, and 54 could not be localized on the pig genome (release 2012). Only 3 human genes have no counterpart in pig, confirming that this animal is a good model for human obesity studies. Obesity-related genes are mostly involved in regulation and signaling processes/pathways, and relevant connections emerge between obesity-related genes and diseases such as cancer and infectious diseases.

  20. Evolutionary Fates and Dynamic Functionalization of Young Duplicate Genes in Arabidopsis Genomes

    Science.gov (United States)

    Wang, Jun; Tao, Feng; Marowsky, Nicholas C.; Fan, Chuanzhu

    2016-01-01

    Gene duplication is a primary means to generate genomic novelties, playing an essential role in speciation and adaptation. Particularly in plants, a high abundance of duplicate genes has been maintained for significantly long periods of evolutionary time. To address the manner in which young duplicate genes were derived primarily from small-scale gene duplication and preserved in plant genomes and to determine the underlying driving mechanisms, we generated transcriptomes to produce the expression profiles of five tissues in Arabidopsis thaliana and the closely related species Arabidopsis lyrata and Capsella rubella. Based on the quantitative analysis metrics, we investigated the evolutionary processes of young duplicate genes in Arabidopsis. We determined that conservation, neofunctionalization, and specialization are three main evolutionary processes for Arabidopsis young duplicate genes. We explicitly demonstrated the dynamic functionalization of duplicate genes along the evolutionary time scale. Upon origination, duplicates tend to maintain their ancestral functions; but as they survive longer, they might be likely to develop distinct and novel functions. The temporal evolutionary processes and functionalization of plant duplicate genes are associated with their ancestral functions, dynamic DNA methylation levels, and histone modification abundances. Furthermore, duplicate genes tend to be initially expressed in pollen and then to gain more interaction partners over time. Altogether, our study provides novel insights into the dynamic retention processes of young duplicate genes in plant genomes. PMID:27485883

  1. Gene-based and pathway-based genome-wide association study of alcohol dependence

    Institute of Scientific and Technical Information of China (English)

    Zuo LJ; Zhang CK; Sayward FG; Cheung KH; Wang KS; Krystal JH; Zhao HY; Luo XG

    2015-01-01

      Background: The organization of risk genes within signaling pathways may provide clues about the converging neurobiological effects of risk genes for alcohol dependence. Aims: Identify risk genes and risk gene pathways for alcohol dependence. Methods: We conducted a pathway-based genome-wide association study (GWAS) of alcohol dependence using a gene-set-rich analytic approach. Approximately one million genetic markers were tested in the discovery sample, which included 1409 European-American (EA) alcohol dependent individuals and 1518 EA healthy comparison subjects. An additional 681 African-American (AA) cases and 508 AA healthy subjects served as the replication sample. Results: We identified several genome-wide replicable risk genes and risk pathways that were significantly associated with alcohol dependence. After applying the Bonferroni correction for multiple testing, the ‘cell-extracellular matrix interactions’ pathway (p Conclusions: These findings provide new evidence highlighting several genes and biological signaling processes that may be related to the risk for alcohol dependence.

  2. Plasma fatty acid ratios affect blood gene expression profiles--a cross-sectional study of the Norwegian Women and Cancer Post-Genome Cohort.

    Science.gov (United States)

    Olsen, Karina Standahl; Fenton, Christopher; Frøyland, Livar; Waaseth, Marit; Paulssen, Ruth H; Lund, Eiliv

    2013-01-01

    High blood concentrations of n-6 fatty acids (FAs) relative to n-3 FAs may lead to a "physiological switch" towards permanent low-grade inflammation, potentially influencing the onset of cardiovascular and inflammatory diseases, as well as cancer. To explore the potential effects of FA ratios prior to disease onset, we measured blood gene expression profiles and plasma FA ratios (linoleic acid/alpha-linolenic acid, LA/ALA; arachidonic acid/eicosapentaenoic acid, AA/EPA; and total n-6/n-3) in a cross-section of middle-aged Norwegian women (n = 227). After arranging samples from the highest values to the lowest for all three FA ratios (LA/ALA, AA/EPA and total n-6/n-3), the highest and lowest deciles of samples were compared. Differences in gene expression profiles were assessed by single-gene and pathway-level analyses. The LA/ALA ratio had the largest impact on gene expression profiles, with 135 differentially expressed genes, followed by the total n-6/n-3 ratio (125 genes) and the AA/EPA ratio (72 genes). All FA ratios were associated with genes related to immune processes, with a tendency for increased pro-inflammatory signaling in the highest FA ratio deciles. Lipid metabolism related to peroxisome proliferator-activated receptor γ (PPARγ) signaling was modified, with possible implications for foam cell formation and development of cardiovascular diseases. We identified higher expression levels of several autophagy marker genes, mainly in the lowest LA/ALA decile. This finding may point to the regulation of autophagy as a novel aspect of FA biology which warrants further study. Lastly, all FA ratios were associated with gene sets that included targets of specific microRNAs, and gene sets containing common promoter motifs that did not match any known transcription factors. We conclude that plasma FA ratios are associated with differences in blood gene expression profiles in this free-living population, and that affected genes and pathways may influence the

  3. Plasma fatty acid ratios affect blood gene expression profiles--a cross-sectional study of the Norwegian Women and Cancer Post-Genome Cohort.

    Directory of Open Access Journals (Sweden)

    Karina Standahl Olsen

    Full Text Available High blood concentrations of n-6 fatty acids (FAs) relative to n-3 FAs may lead to a "physiological switch" towards permanent low-grade inflammation, potentially influencing the onset of cardiovascular and inflammatory diseases, as well as cancer. To explore the potential effects of FA ratios prior to disease onset, we measured blood gene expression profiles and plasma FA ratios (linoleic acid/alpha-linolenic acid, LA/ALA; arachidonic acid/eicosapentaenoic acid, AA/EPA; and total n-6/n-3) in a cross-section of middle-aged Norwegian women (n = 227). After arranging samples from the highest values to the lowest for all three FA ratios (LA/ALA, AA/EPA and total n-6/n-3), the highest and lowest deciles of samples were compared. Differences in gene expression profiles were assessed by single-gene and pathway-level analyses. The LA/ALA ratio had the largest impact on gene expression profiles, with 135 differentially expressed genes, followed by the total n-6/n-3 ratio (125 genes) and the AA/EPA ratio (72 genes). All FA ratios were associated with genes related to immune processes, with a tendency for increased pro-inflammatory signaling in the highest FA ratio deciles. Lipid metabolism related to peroxisome proliferator-activated receptor γ (PPARγ) signaling was modified, with possible implications for foam cell formation and development of cardiovascular diseases. We identified higher expression levels of several autophagy marker genes, mainly in the lowest LA/ALA decile. This finding may point to the regulation of autophagy as a novel aspect of FA biology which warrants further study. Lastly, all FA ratios were associated with gene sets that included targets of specific microRNAs, and gene sets containing common promoter motifs that did not match any known transcription factors. We conclude that plasma FA ratios are associated with differences in blood gene expression profiles in this free-living population, and that affected genes and pathways may

  4. Evaluating Phylostratigraphic Evidence for Widespread De Novo Gene Birth in Genome Evolution.

    Science.gov (United States)

    Moyers, Bryan A; Zhang, Jianzhi

    2016-05-01

    The source of genetic novelty is an area of wide interest and intense investigation. Although gene duplication is conventionally thought to dominate the production of new genes, this view was recently challenged by a proposal of widespread de novo gene origination in eukaryotic evolution. Specifically, distributions of various gene properties such as coding sequence length, expression level, codon usage, and probability of being subject to purifying selection among groups of genes with different estimated ages were reported to support a model in which new protein-coding proto-genes arise from noncoding DNA and gradually integrate into cellular networks. Here we show that the genomic patterns asserted to support widespread de novo gene origination are largely attributable to biases in gene age estimation by phylostratigraphy, because such patterns are also observed in phylostratigraphic analysis of simulated genes bearing identical ages. Furthermore, there is no evidence of purifying selection on very young de novo genes previously claimed to show such signals. Together, these findings are consistent with the prevailing view that de novo gene birth is a relatively minor contributor to new genes in genome evolution. They also illustrate the danger of using phylostratigraphy in the study of new gene origination without considering its inherent bias. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  5. GeneChip Resequencing of the Smallpox Virus Genome Can Identify Novel Strains: a Biodefense Application

    Science.gov (United States)

    Sulaiman, Irshad M.; Tang, Kevin; Osborne, John; Sammons, Scott; Wohlhueter, Robert M.

    2007-01-01

    We developed a set of seven resequencing GeneChips, based on the complete genome sequences of 24 strains of smallpox virus (variola virus), for rapid characterization of this human-pathogenic virus. Each GeneChip was designed to analyze a divergent segment of approximately 30,000 bases of the smallpox virus genome. This study includes the hybridization results of 14 smallpox virus strains. Of the 14 smallpox virus strains hybridized, only 7 had sequence information included in the design of the smallpox virus resequencing GeneChips; similar information for the remaining strains was not tiled as a reference in these GeneChips. By use of variola virus-specific primers and long-range PCR, 22 overlapping amplicons were amplified to cover nearly the complete genome and hybridized with the smallpox virus resequencing GeneChip set. These GeneChips were successful in generating nucleotide sequences for all 14 of the smallpox virus strains hybridized. Analysis of the data indicated that the GeneChip resequencing by hybridization was fast and reproducible and that the smallpox virus resequencing GeneChips could differentiate the 14 smallpox virus strains characterized. This study also suggests that high-density resequencing GeneChips have potential biodefense applications and may be used as an alternate tool for rapid identification of smallpox virus in the future. PMID:17182757

  6. Identification of a novel gene by whole human genome tiling array.

    Science.gov (United States)

    Ishida, Hirokazu; Yagi, Tomohito; Tanaka, Masami; Tokuda, Yuichi; Kamoi, Kazumi; Hongo, Fumiya; Kawauchi, Akihiro; Nakano, Masakazu; Miki, Tsuneharu; Tashiro, Kei

    2013-03-01

    When the whole human genome sequence was determined by the Human Genome Project, the number of identified genes was fewer than expected. However, recent studies suggest that undiscovered transcripts still exist in the human genome. Furthermore, a new technology, the DNA microarray, which can simultaneously characterize huge amounts of genome sequence data, has become a useful tool for analyzing genetic changes in various diseases. A version of this tool, the tiling DNA microarray, was designed to search all the transcripts of the entire human genome, and provides huge amounts of data, including both exon and intron sequences, by a simple process. Although some previous studies using tiling DNA microarray analysis have indicated that numerous novel transcripts can be found in the human genome, none of them has reported any novel full-length human genes. Here, to find novel genes, we analyzed all the transcripts expressed in normal human prostate cells using this microarray. Because the optimal analytical parameters for using tiling DNA microarray data for this purpose had not been established, we established parameters for extracting the most likely regions for novel transcripts. The three parameters we optimized were the threshold for positive signal intensity, the Max gap, and the Min run, which we set to detect all transcriptional regions that were above the average length of known exons and had a signal intensity in the top 5%. We succeeded in obtaining the full-length sequence of one novel gene, located on chromosome 12q24.13. We named the novel gene "POTAGE". Its 5841-bp mRNA consists of 26 exons. We detected part of exon 2 in the tiling data analysis. The full-length sequence was then obtained by RT-PCR and RACE. Although the function of POTAGE is unclear, its sequence showed high homology with genes in other species, suggesting it might have an important or essential function. This study demonstrates that the tiling DNA microarray can be useful for

  7. The evolution of chloroplast genes and genomes in ferns.

    Science.gov (United States)

    Wolf, Paul G; Der, Joshua P; Duffy, Aaron M; Davidson, Jacob B; Grusz, Amanda L; Pryer, Kathleen M

    2011-07-01

    Most of the publicly available data on chloroplast (plastid) genes and genomes come from seed plants, with relatively little information from their sister group, the ferns. Here we describe several broad evolutionary patterns and processes in fern plastid genomes (plastomes), and we include some new plastome sequence data. We review what we know about the evolutionary history of plastome structure across the fern phylogeny and we compare plastome organization and patterns of evolution in ferns to those in seed plants. A large clade of ferns is characterized by a plastome that has been reorganized with respect to the ancestral gene order (a similar order that is ancestral in seed plants). We review the sequence of inversions that gave rise to this organization. We also explore global nucleotide substitution patterns in ferns versus those found in seed plants across plastid genes, and we review the high levels of RNA editing observed in fern plastomes.

  8. Phylogeny of bacterial and archaeal genomes using conserved genes: supertrees and supermatrices.

    Directory of Open Access Journals (Sweden)

    Jenna Morgan Lang

    Full Text Available Over 3000 microbial (bacterial and archaeal) genomes have been made publicly available to date, providing an unprecedented opportunity to examine evolutionary genomic trends and offering valuable reference data for a variety of other studies such as metagenomics. The utility of these genome sequences is greatly enhanced when we have an understanding of how they are phylogenetically related to each other. Therefore, we here describe our efforts to reconstruct the phylogeny of all available bacterial and archaeal genomes. We identified 24 single-copy, ubiquitous genes suitable for this phylogenetic analysis. We used two approaches to combine the data for the 24 genes. First, we concatenated alignments of all genes into a single alignment from which a Maximum Likelihood (ML) tree was inferred using RAxML. Second, we used a relatively new approach to combining gene data, Bayesian Concordance Analysis (BCA), as implemented in the BUCKy software, in which the results of 24 single-gene phylogenetic analyses are used to generate a "primary concordance" tree. A comparison of the concatenated ML tree and the primary concordance (BUCKy) tree reveals that the two approaches give similar results, relative to a phylogenetic tree inferred from the 16S rRNA gene. After comparing the results and the methods used, we conclude that the current best approach for generating a single phylogenetic tree, suitable for use as a reference phylogeny for comparative analyses, is to perform a maximum likelihood analysis of a concatenated alignment of conserved, single-copy genes.

  9. Characterization and distribution of repetitive elements in association with genes in the human genome.

    Science.gov (United States)

    Liang, Kai-Chiang; Tseng, Joseph T; Tsai, Shaw-Jenq; Sun, H Sunny

    2015-08-01

    Repetitive elements constitute more than 50% of the human genome. Recent studies implied that the complexity of living organisms is not just a direct outcome of the number of coding sequences; the repetitive elements, which do not encode proteins, may also play a significant role. Though scattered studies showed that repetitive elements in the regulatory regions of a gene control gene expression, no systematic survey has been done to report the characterization and distribution of various types of these repetitive elements in the human genome. Sequences from 5' and 3' untranslated regions and upstream and downstream of a gene were downloaded from the Ensembl database. The repetitive elements in the neighborhood of each gene were identified and classified using cross-matching implemented in RepeatMasker. The annotation and distribution of distinct classes of repetitive elements associated with individual genes were collected to characterize genes in association with different types of repetitive elements using a systems biology program. We identified a total of 1,068,400 repetitive elements which belong to 37-class families and 1235 subclasses that are associated with 33,761 genes and 57,365 transcripts. In addition, we found that tandem repeats are preferentially located proximal to the transcription start site (TSS) of genes and that these genes are mainly involved in developmental processes. On the other hand, interspersed repetitive elements showed a tendency to accumulate distal to the TSS, and interspersed repeat-containing genes took part in catabolic/metabolic processes. Results from the distribution analysis were collected and used to construct a gene-based repetitive element database (GBRED; http://www.binfo.ncku.edu.tw/GBRED/index.html). A user-friendly web interface was designed to provide the information of repetitive elements associated with any particular gene(s). This is the first study focusing on the gene

  10. Improvement of genome assembly completeness and identification of novel full-length protein-coding genes by RNA-seq in the giant panda genome.

    Science.gov (United States)

    Chen, Meili; Hu, Yibo; Liu, Jingxing; Wu, Qi; Zhang, Chenglin; Yu, Jun; Xiao, Jingfa; Wei, Fuwen; Wu, Jiayan

    2015-12-11

    High-quality and complete gene models are the basis of whole genome analyses. The giant panda (Ailuropoda melanoleuca) genome was the first genome sequenced on the basis of solely short reads, but the genome annotation had lacked the support of transcriptomic evidence. In this study, we applied RNA-seq to globally improve the genome assembly completeness and to detect novel expressed transcripts in 12 tissues from giant pandas, by using a transcriptome reconstruction strategy that combined reference-based and de novo methods. Several aspects of genome assembly completeness in the transcribed regions were effectively improved by the de novo assembled transcripts, including genome scaffolding, the detection of small-size assembly errors, the extension of scaffold/contig boundaries, and gap closure. Through expression and homology validation, we detected three groups of novel full-length protein-coding genes. A total of 12.62% of the novel protein-coding genes were validated by proteomic data. GO annotation analysis showed that some of the novel protein-coding genes were involved in pigmentation, anatomical structure formation and reproduction, which might be related to the development and evolution of the black-white pelage, pseudo-thumb and delayed embryonic implantation of giant pandas. The updated genome annotation will help further giant panda studies from both structural and functional perspectives.

  11. An enigmatic fourth runt domain gene in the fugu genome: ancestral gene loss versus accelerated evolution

    Directory of Open Access Journals (Sweden)

    Hood Leroy

    2004-11-01

    Full Text Available Abstract Background: The runt domain transcription factors are key regulators of developmental processes in bilaterians, involved both in cell proliferation and differentiation, and their disruption usually leads to disease. Three runt domain genes have been described in each vertebrate genome (the RUNX gene family), but only one in other chordates. Therefore, the common ancestor of vertebrates has been thought to have had a single runt domain gene. Results: Analysis of the genome draft of the fugu pufferfish (Takifugu rubripes) reveals the existence of a fourth runt domain gene, FrRUNT, in addition to the orthologs of human RUNX1, RUNX2 and RUNX3. The tiny FrRUNT packs six exons and two putative promoters in just 3 kb of genomic sequence. The first exon is located within an intron of FrSUPT3H, the ortholog of human SUPT3H, and the first exon of FrSUPT3H resides within the first intron of FrRUNT. The two gene structures are therefore "interlocked". In the human genome, SUPT3H is instead interlocked with RUNX2. FrRUNT has no detectable ortholog in the genomes of mammals, birds or amphibians. We consider alternative explanations for an apparent contradiction between the phylogenetic data and the comparison of the genomic neighborhoods of human and fugu runt domain genes. We hypothesize that an ancient RUNT locus was lost in the tetrapod lineage, together with FrFSTL6, a member of a novel family of follistatin-like genes. Conclusions: Our results suggest that the runt domain family may have started expanding in chordates much earlier than previously thought, and exemplify the importance of detailed analysis of whole-genome draft sequence to provide new insights into gene evolution.

  12. Genome-wide analysis of chimpanzee genes with premature termination codons

    Directory of Open Access Journals (Sweden)

    Cavelier Lucia

    2009-01-01

    Full Text Available Abstract Background: Premature termination codons (PTCs) cause mRNA degradation or a truncated protein and thereby contribute to the transcriptome and proteome divergence between species. Here we present the first genome-wide study of PTCs in the chimpanzee. By comparing the human and chimpanzee genome sequences we identify and characterize genes with PTCs, in order to understand the contribution of these mutations to the transcriptome diversity between the species. Results: We have studied a total of 13,487 human-chimpanzee gene pairs and found that ~8% were affected by PTCs in the chimpanzee. A majority (764/1,109) of PTCs were caused by insertions or deletions and the remaining part was caused by substitutions. The distribution of PTC genes varied between chromosomes, with Y having the highest proportion. Furthermore, the density of PTC genes varied on a megabasepair scale within chromosomes and we found the density to be correlated both with indel divergence and proximity to the telomere. Within genes, PTCs were more common close to the 5' and 3' ends of the amino acid sequence. Gene Ontology classification revealed that olfactory receptor genes were overrepresented among the PTC genes. Conclusion: Our results showed that the density of PTC genes fluctuated across the genome depending on the local genomic context. PTCs were preferentially located in the terminal parts of the transcript, which generally have a lower frequency of functional domains, indicating that selection was operating against PTCs at sites central to protein function. The enrichment of GO terms associated with olfaction suggests that PTCs may have influenced the difference in the repertoire of olfactory genes between humans and chimpanzees. In summary, 8% of the chimpanzee genes were affected by PTCs and this type of variation is likely to have an important effect on the transcript and proteomic divergence between humans and chimpanzees.

  13. Whole-Genome Microarray and Gene Deletion Studies Reveal Regulation of the Polyhydroxyalkanoate Production Cycle by the Stringent Response in Ralstonia eutropha H16

    Energy Technology Data Exchange (ETDEWEB)

    Brigham, CJ; Speth, DR; Rha, C; Sinskey, AJ

    2012-10-22

    Poly(3-hydroxybutyrate) (PHB) production and mobilization in Ralstonia eutropha are well studied, but in only a few instances has PHB production been explored in relation to other cellular processes. We examined the global gene expression of wild-type R. eutropha throughout the PHB cycle: growth on fructose, PHB production using fructose following ammonium depletion, and PHB utilization in the absence of exogenous carbon after ammonium was resupplied. Our results confirm or lend support to previously reported results regarding the expression of PHB-related genes and enzymes. Additionally, genes for many different cellular processes, such as DNA replication, cell division, and translation, are selectively repressed during PHB production. In contrast, the expression levels of genes under the control of the alternative sigma factor sigma(54) increase sharply during PHB production and are repressed again during PHB utilization. Global gene regulation during PHB production is strongly reminiscent of the gene expression pattern observed during the stringent response in other species. Furthermore, a ppGpp synthase deletion mutant did not show an accumulation of PHB, and the chemical induction of the stringent response with DL-norvaline caused an increased accumulation of PHB in the presence of ammonium. These results indicate that the stringent response is required for PHB accumulation in R. eutropha, helping to elucidate a thus-far-unknown physiological basis for this process.

  14. Genome-wide analysis reveals diverged patterns of codon bias, gene expression, and rates of sequence evolution in picea gene families.

    Science.gov (United States)

    De La Torre, Amanda R; Lin, Yao-Cheng; Van de Peer, Yves; Ingvarsson, Pär K

    2015-03-05

    The recent sequencing of several gymnosperm genomes has greatly facilitated studying the evolution of their genes and gene families. In this study, we examine the evidence for expression-mediated selection in the first two fully sequenced representatives of the gymnosperm plant clade (Picea abies and Picea glauca). We use genome-wide estimates of gene expression (>50,000 expressed genes) to study the relationship between gene expression, codon bias, rates of sequence divergence, protein length, and gene duplication. We found that gene expression is correlated with rates of sequence divergence and codon bias, suggesting that natural selection is acting on Picea protein-coding genes for translational efficiency. Gene expression, rates of sequence divergence, and codon bias are correlated with the size of gene families, with large multicopy gene families having, on average, a lower expression level and breadth, lower codon bias, and higher rates of sequence divergence than single-copy gene families. Tissue-specific patterns of gene expression were more common in large gene families with large gene expression divergence than in single-copy families. Recent family expansions combined with large gene expression variation in paralogs and increased rates of sequence evolution suggest that some Picea gene families are rapidly evolving to cope with biotic and abiotic stress. Our study highlights the importance of gene expression and natural selection in shaping the evolution of protein-coding genes in Picea species, and sets the ground for further studies investigating the evolution of individual gene families in gymnosperms.

  15. RNAi screening in primary human hepatocytes of genes implicated in genome-wide association studies for roles in type 2 diabetes identifies roles for CAMK1D and CDKAL1, among others, in hepatic glucose regulation.

    Directory of Open Access Journals (Sweden)

    Steven Haney

    Full Text Available Genome-wide association (GWA) studies have described a large number of new candidate genes that contribute to Type 2 Diabetes (T2D). In some cases, small clusters of genes are implicated, rather than a single gene, and in all cases, the genetic contribution is not defined through the effects on a specific organ, such as the pancreas or liver. There is a significant need to develop and use human cell-based models to examine the effects these genes may have on glucose regulation. We describe the development of a primary human hepatocyte model that adjusts glucose disposition according to hormonal signals. This model was used to determine whether candidate genes identified in GWA studies regulate hepatic glucose disposition through siRNAs corresponding to the list of identified genes. We find that several genes affect the storage of glucose as glycogen (glycolytic response) and/or affect the utilization of pyruvate, the critical step in gluconeogenesis. Of the genes that affect both of these processes, CAMK1D, TSPAN8 and KIF11 affect the localization of a mediator of both gluconeogenesis and glycolysis regulation, CRTC2, to the nucleus in response to glucagon. In addition, the gene CDKAL1 was observed to affect glycogen storage, and molecular experiments using mutant forms of CDK5, a putative target of CDKAL1, in HepG2 cells show that this is mediated by coordinate regulation of CDK5 and PKA on MEK, which ultimately regulates the phosphorylation of ribosomal protein S6, a critical step in the insulin signaling pathway.

  16. RNAi screening in primary human hepatocytes of genes implicated in genome-wide association studies for roles in type 2 diabetes identifies roles for CAMK1D and CDKAL1, among others, in hepatic glucose regulation.

    Science.gov (United States)

    Haney, Steven; Zhao, Juan; Tiwari, Shiwani; Eng, Kurt; Guey, Lin T; Tien, Eric

    2013-01-01

    Genome-wide association (GWA) studies have described a large number of new candidate genes that contribute to Type 2 Diabetes (T2D). In some cases, small clusters of genes are implicated, rather than a single gene, and in all cases, the genetic contribution is not defined through the effects on a specific organ, such as the pancreas or liver. There is a significant need to develop and use human cell-based models to examine the effects these genes may have on glucose regulation. We describe the development of a primary human hepatocyte model that adjusts glucose disposition according to hormonal signals. This model was used to determine whether candidate genes identified in GWA studies regulate hepatic glucose disposition through siRNAs corresponding to the list of identified genes. We find that several genes affect the storage of glucose as glycogen (glycolytic response) and/or affect the utilization of pyruvate, the critical step in gluconeogenesis. Of the genes that affect both of these processes, CAMK1D, TSPAN8 and KIF11 affect the localization of a mediator of both gluconeogenesis and glycolysis regulation, CRTC2, to the nucleus in response to glucagon. In addition, the gene CDKAL1 was observed to affect glycogen storage, and molecular experiments using mutant forms of CDK5, a putative target of CDKAL1, in HepG2 cells show that this is mediated by coordinate regulation of CDK5 and PKA on MEK, which ultimately regulates the phosphorylation of ribosomal protein S6, a critical step in the insulin signaling pathway.

  17. The Complete Mitochondrial Genome of Aleurocanthus camelliae: Insights into Gene Arrangement and Genome Organization within the Family Aleyrodidae.

    Science.gov (United States)

    Chen, Shi-Chun; Wang, Xiao-Qing; Li, Pin-Wu; Hu, Xiang; Wang, Jin-Jun; Peng, Ping

    2016-11-07

    Numerous gene rearrangements and transfer RNA gene absences occur in the mitochondrial (mt) genomes of Aleyrodidae species. To understand how mt genomes evolved in the family Aleyrodidae, we have sequenced the complete mt genome of Aleurocanthus camelliae and comparatively analyzed all reported whitefly mt genomes. The mt genome of A. camelliae is 15,188 bp long, and consists of 13 protein-coding genes, two rRNA genes, 21 tRNA genes and a putative control region (GenBank: KU761949). The tRNA gene, trnI, has not been observed in this genome. The mt genome has a unique gene order and shares most gene boundaries with Tetraleurodes acaciae. Nineteen of 21 tRNA genes have the conventional cloverleaf shaped secondary structure and two (trnS₁ and trnS₂) lack the dihydrouridine (DHU) arm. Using ARWEN and homologous sequence alignment, we have identified five tRNA genes and revised the annotation for three whitefly mt genomes. This result suggests that most absent genes exist in the genomes and have not been identified, due to the lack of technology and inference sequences. The phylogenetic relationships among 11 whiteflies and Drosophila melanogaster were inferred by maximum likelihood and Bayesian inference methods. Aleurocanthus camelliae and T. acaciae form a sister group, and all three Bemisia tabaci and two Bemisia afer strains gather together. These results are identical to the relationships inferred from gene order. We inferred that gene rearrangement plays an important role in the evolution of whitefly mt genomes.

  18. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    DEFF Research Database (Denmark)

    Kaas, Rolf Sommer; Rundsten, Carsten Friis; Ussery, David

    2012-01-01

    more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment and a pan-genome tree based on gene presence/absence, maps the relatedness...

  19. Evolutionary maintenance of filovirus-like genes in bat genomes

    Directory of Open Access Journals (Sweden)

    Taylor Derek J

    2011-11-01

    Full Text Available Abstract Background Little is known of the biological significance and evolutionary maintenance of integrated non-retroviral RNA virus genes in eukaryotic host genomes. Here, we isolated novel filovirus-like genes from bat genomes and tested for evolutionary maintenance. We also estimated the age of filovirus VP35-like gene integrations and tested the phylogenetic hypotheses that there is a eutherian mammal clade and a marsupial/ebolavirus/Marburgvirus dichotomy for filoviruses. Results We detected homologous copies of VP35-like and NP-like gene integrations in both Old World and New World species of Myotis (bats). We also detected previously unknown VP35-like genes in rodents that are positionally homologous. Comprehensive phylogenetic estimates for filovirus NP-like and VP35-like loci support two main clades with a marsupial and a rodent grouping within the ebolavirus/Lloviu virus/Marburgvirus clade. The concordance of VP35-like, NP-like and mitochondrial gene trees with the expected species tree supports the notion that the copies we examined are orthologs that predate the global spread and radiation of the genus Myotis. Parametric simulations were consistent with selective maintenance for the open reading frame (ORF) of VP35-like genes in Myotis. The ORF of the filovirus-like VP35 gene has been maintained in bat genomes for an estimated 13.4 MY. ORFs were disrupted for the NP-like genes in Myotis. Likelihood ratio tests revealed that a model that accommodates positive selection is a significantly better fit to the data than a model that does not allow for positive selection for VP35-like sequences. Moreover, site-by-site analysis of selection using two methods indicated at least 25 sites in the VP35-like alignment are under positive selection in Myotis. Conclusions Our results indicate that filovirus-like elements have significance beyond genomic imprints of prior infection. That is, there appears to be, or have been, functionally maintained

  20. Genome-Wide Architecture of Disease Resistance Genes in Lettuce.

    Science.gov (United States)

    Christopoulou, Marilena; Wo, Sebastian Reyes-Chin; Kozik, Alex; McHale, Leah K; Truco, Maria-Jose; Wroblewski, Tadeusz; Michelmore, Richard W

    2015-10-08

    Genome-wide motif searches identified 1134 genes in the lettuce reference genome of cv. Salinas that are potentially involved in pathogen recognition, of which 385 were predicted to encode nucleotide binding-leucine rich repeat receptor (NLR) proteins. Using a maximum-likelihood approach, we grouped the NLRs into 25 multigene families and 17 singletons. Forty-one percent of these NLR-encoding genes belong to three families, the largest being RGC16 with 62 genes in cv. Salinas. The majority of NLR-encoding genes are located in five major resistance clusters (MRCs) on chromosomes 1, 2, 3, 4, and 8 and cosegregate with multiple disease resistance phenotypes. Most MRCs contain primarily members of a single NLR gene family but a few are more complex. MRC2 spans 73 Mb and contains 61 NLRs of six different gene families that cosegregate with nine disease resistance phenotypes. MRC3, which is 25 Mb, contains 22 RGC21 genes and colocates with Dm13. A library of 33 transgenic RNA interference tester stocks was generated for functional analysis of NLR-encoding genes that cosegregated with disease resistance phenotypes in each of the MRCs. Members of four NLR-encoding families, RGC1, RGC2, RGC21, and RGC12, were shown to be required for 16 disease resistance phenotypes in lettuce. The general composition of MRCs is conserved across different genotypes; however, the specific repertoire of NLR-encoding genes varied, particularly for the rapidly evolving Type I genes. These tester stocks are valuable resources for future analyses of additional resistance phenotypes.

  1. Genomic Analyses of Bacterial Porin-Cytochrome Gene Clusters

    Directory of Open Access Journals (Sweden)

    Liang Shi

    2014-11-01

    Full Text Available The porin-cytochrome (Pcc) protein complex is responsible for trans-outer membrane electron transfer during extracellular reduction of Fe(III) by the dissimilatory metal-reducing bacterium Geobacter sulfurreducens PCA. The identified and characterized Pcc complex of G. sulfurreducens PCA consists of a porin-like outer-membrane protein, a periplasmic 8-heme c-type cytochrome (c-Cyt) and an outer-membrane 12-heme c-Cyt, and the genes encoding the Pcc proteins are clustered in the same regions of genome (i.e., the pcc gene clusters) of G. sulfurreducens PCA. A survey of additional microbial genomes has identified the pcc gene clusters in all sequenced Geobacter spp. and other bacteria from six different phyla, including Anaeromyxobacter dehalogenans 2CP-1, A. dehalogenans 2CP-C, Anaeromyxobacter sp. K, Candidatus Kuenenia stuttgartiensis, Denitrovibrio acetiphilus DSM 12809, Desulfurispirillum indicum S5, Desulfurivibrio alkaliphilus AHT2, Desulfurobacterium thermolithotrophum DSM 11699, Desulfuromonas acetoxidans DSM 684, Ignavibacterium album JCM 16511, and Thermovibrio ammonificans HB-1. The numbers of genes in the pcc gene clusters vary, ranging from two to nine. Similar to the metal-reducing (Mtr) gene clusters of other Fe(III)-reducing bacteria, such as Shewanella spp., additional genes that encode putative c-Cyts with predicted cellular localizations at the cytoplasmic membrane, periplasm and outer membrane often associate with the pcc gene clusters. This suggests that the Pcc-associated c-Cyts may be part of the pathways for extracellular electron transfer reactions. The presence of pcc gene clusters in the microorganisms that do not reduce solid-phase Fe(III) and Mn(IV) oxides, such as D. alkaliphilus AHT2 and I. album JCM 16511, also suggests that some of the pcc gene clusters may be involved in extracellular electron transfer reactions with the substrates other than Fe(III) and Mn(IV) oxides.

  2. Use of quantitative real time PCR for a genome-wide study of AYWB phytoplasma gene expression in plant and insect hosts

    DEFF Research Database (Denmark)

    Makarova, Olga; MacLean, Allyson M.; Hogenhout, Saskia A.

    2011-01-01

    Phytoplasmas are obligate parasites of plants and insects and cause significant crop yield losses worldwide. A number of microarray gene expression studies have been performed to understand better the effects of phytoplasma infection on plant physiology. However, little effort has been made...... this technique for reliable gene expression quantification of phytoplasmas on a large scale. In our experimental setup, 242 genes of aster yellows phytoplasma strain witches' broom (AY-WB) were tested for differences in expression in plant and insect host environments, and were shown to be predominantly...... expressed in the plant or insect hosts. In silico operon prediction corroborated the experimental data. Our findings suggest that the delta delta Ct method can be used to study the physiology of this pathogen...

  3. Extensive loss of translational genes in the structurally dynamic mitochondrial genome of the angiosperm Silene latifolia

    Directory of Open Access Journals (Sweden)

    Sloan Daniel B

    2010-09-01

    Full Text Available Abstract Background Mitochondrial gene loss and functional transfer to the nucleus is an ongoing process in many lineages of plants, resulting in substantial variation across species in mitochondrial gene content. The Caryophyllaceae represents one lineage that has experienced a particularly high rate of mitochondrial gene loss relative to other angiosperms. Results In this study, we report the first complete mitochondrial genome sequence from a member of this family, Silene latifolia. The genome can be mapped as a 253,413 bp circle, but its structure is complicated by a large repeated region that is present in 6 copies. Active recombination among these copies produces a suite of alternative genome configurations that appear to be at or near "recombinational equilibrium". The genome contains the fewest genes of any angiosperm mitochondrial genome sequenced to date, with intact copies of only 25 of the 41 protein genes inferred to be present in the common ancestor of angiosperms. As observed more broadly in angiosperms, ribosomal proteins have been especially prone to gene loss in the S. latifolia lineage. The genome has also experienced a major reduction in tRNA gene content, including loss of functional tRNAs of both native and chloroplast origin. Even assuming expanded wobble-pairing rules, the mitochondrial genome can support translation of only 17 of the 61 sense codons, which code for only 9 of the 20 amino acids. In addition, genes encoding 18S and, especially, 5S rRNA exhibit exceptional sequence divergence relative to other plants. Divergence in one region of 18S rRNA appears to be the result of a gene conversion event, in which recombination with a homologous gene of chloroplast origin led to the complete replacement of a helix in this ribosomal RNA. Conclusions These findings suggest a markedly expanded role for nuclear gene products in the translation of mitochondrial genes in S. latifolia and raise the possibility of altered

  4. Involvement of plastid, mitochondrial and nuclear genomes in plant-to-plant horizontal gene transfer

    Directory of Open Access Journals (Sweden)

    Maria Virginia Sanchez-Puerta

    2014-12-01

    Full Text Available This review focuses on plant-to-plant horizontal gene transfer (HGT) involving the three DNA-containing cellular compartments. It highlights the great incidence of HGT in the mitochondrial genome (mtDNA) of angiosperms, the increasing number of examples in plant nuclear genomes, and the lack of any convincing evidence for HGT in the well-studied plastid genome of land plants. Most of the foreign mitochondrial genes are non-functional, generally found as pseudogenes in the recipient plant mtDNA that maintains its functional native genes. The few exceptions involve chimeric HGT, in which foreign and native copies recombine leading to a functional and single copy of the gene. Maintenance of foreign genes in plant mitochondria is probably the result of genetic drift, but a possible evolutionary advantage may be conferred through the generation of genetic diversity by gene conversion between native and foreign copies. Conversely, a few cases of nuclear HGT in plants involve functional transfers of novel genes that resulted in adaptive evolution. Direct cell-to-cell contact between plants (e.g. host-parasite relationships or natural grafting) facilitates the exchange of genetic material, in which HGT has been reported for both nuclear and mitochondrial genomes, and in the form of genomic DNA, instead of RNA. A thorough review of the literature indicates that HGT in mitochondrial and nuclear genomes of angiosperms is much more frequent than previously expected and that the evolutionary impact and mechanisms underlying plant-to-plant HGT remain to be uncovered.

  5. Genome-Wide Distribution, Organisation and Functional Characterization of Disease Resistance and Defence Response Genes across Rice Species

    Science.gov (United States)

    Singh, Sangeeta; Chand, Suresh; Singh, N. K.; Sharma, Tilak Raj

    2015-01-01

    The resistance (R) genes and defense response (DR) genes have become very important resources for the development of disease resistant cultivars. In the present investigation, genome-wide identification, expression, phylogenetic and synteny analysis was done for R and DR-genes across three species of rice viz: Oryza sativa ssp indica cv 93-11, Oryza sativa ssp japonica and wild rice species, Oryza brachyantha. We used the in silico approach to identify and map 786 R-genes and 167 DR-genes, 672 R-genes and 142 DR-genes, 251 R-genes and 86 DR-genes in the japonica, indica and O. brachyantha genomes, respectively. Our analysis showed that 60.5% and 55.6% of the R-genes are tandemly repeated within clusters and distributed over all the rice chromosomes in indica and japonica genomes, respectively. The phylogenetic analysis along with motif distribution shows high degree of conservation of R- and DR-genes in clusters. In silico expression analysis of R-genes and DR-genes showed more than 85% were expressed genes showing corresponding EST matches in the databases. This study gave special emphasis on mechanisms of gene evolution and duplication for R and DR genes across species. Analysis of paralogs across rice species indicated 17% and 4.38% R-genes, 29% and 11.63% DR-genes duplication in indica and Oryza brachyantha, as compared to 20% and 26% duplication of R-genes and DR-genes in japonica respectively. We found that during the course of duplication only 9.5% of R- and DR-genes changed their function and rest of the genes have maintained their identity. Syntenic relationship across three genomes inferred that more orthology is shared between indica and japonica genomes as compared to brachyantha genome. Genome wide identification of R-genes and DR-genes in the rice genome will help in allele mining and functional validation of these genes, and to understand molecular mechanism of disease resistance and their evolution in rice and related species. PMID:25902056

  6. Modeling chromosomes in mouse to explore the function of genes, genomic disorders, and chromosomal organization.

    Directory of Open Access Journals (Sweden)

    Véronique Brault

    2006-07-01

    Full Text Available One of the challenges of genomic research after the completion of the human genome project is to assign a function to all the genes and to understand their interactions and organizations. Among the various techniques, the emergence of chromosome engineering tools with the aim to manipulate large genomic regions in the mouse model offers a powerful way to accelerate the discovery of gene functions and provides more mouse models to study normal and pathological developmental processes associated with aneuploidy. The combination of gene targeting in ES cells, recombinase technology, and other techniques makes it possible to generate new chromosomes carrying specific and defined deletions, duplications, inversions, and translocations that are accelerating functional analysis. This review presents the current status of chromosome engineering techniques and discusses the different applications as well as the implication of these new techniques in future research to better understand the function of chromosomal organization and structures.

  7. Systematically fragmented genes in a multipartite mitochondrial genome

    Science.gov (United States)

    Vlcek, Cestmir; Marande, William; Teijeiro, Shona; Lukeš, Julius; Burger, Gertraud

    2011-01-01

    Arguably, the most bizarre mitochondrial DNA (mtDNA) is that of the euglenozoan eukaryote Diplonema papillatum. The genome consists of numerous small circular chromosomes none of which appears to encode a complete gene. For instance, the cox1 coding sequence is spread out over nine different chromosomes in non-overlapping pieces (modules), which are transcribed separately and joined to a contiguous mRNA by trans-splicing. Here, we examine how many genes are encoded by Diplonema mtDNA and whether all are fragmented and their transcripts trans-spliced. Module identification is challenging due to the sequence divergence of Diplonema mitochondrial genes. By employing most sensitive protein profile search algorithms and comparing genomic with cDNA sequence, we recognize a total of 11 typical mitochondrial genes. The 10 protein-coding genes are systematically chopped up into three to 12 modules of 60–350 bp length. The corresponding mRNAs are all trans-spliced. Identification of ribosomal RNAs is most difficult. So far, we only detect the 3′-module of the large subunit ribosomal RNA (rRNA); it does not trans-splice with other pieces. The small subunit rRNA gene remains elusive. Our results open new intriguing questions about the biochemistry and evolution of mitochondrial trans-splicing in Diplonema. PMID:20935050

  8. A genome-wide association study of anorexia nervosa

    NARCIS (Netherlands)

    Boraska, V; Franklin, C S; Floyd, J A B; Thornton, L M; Huckins, L M; Southam, L; Rayner, N W; Tachmazidou, I; Klump, K L; Treasure, J; Lewis, C M; Schmidt, U; Tozzi, F; Kiezebrink, K; Hebebrand, J; Gorwood, P; Adan, R A H; Kas, M J H; Favaro, A; Santonastaso, P; Fernández-Aranda, F; Gratacos, M; Rybakowski, F; Dmitrzak-Weglarz, M; Kaprio,