WorldWideScience

Sample records for genome structure expression

  1. Comparative genomics of the relationship between gene structure and expression

    NARCIS (Netherlands)

    Ren, X.

    2006-01-01

    The relationship between the structure of genes and their expression is a relatively new aspect of genome organization and regulation. With more genome sequences and expression data becoming available, bioinformatics approaches can help the further elucidation of the relationships between gene

  2. cDNA structure, genomic organization and expression patterns of ...

    African Journals Online (AJOL)

    Visfatin is a newly identified adipocytokine that is involved in various physiologic and pathologic processes of organisms. The cDNA structure, genomic organization and expression patterns of silver Prussian carp visfatin are described in this report. The silver Prussian carp visfatin cDNA cloned from the liver was ...

  3. Effects of aneuploidy on genome structure, expression, and interphase organization in Arabidopsis thaliana.

    Directory of Open Access Journals (Sweden)

    Bruno Huettel

    2008-10-01

    Aneuploidy refers to losses and/or gains of individual chromosomes from the normal chromosome set. The resulting gene dosage imbalance has a noticeable effect on the phenotype, as illustrated by aneuploid syndromes, including Down syndrome in humans, and by human solid tumor cells, which are highly aneuploid. Although the phenotypic manifestations of aneuploidy are usually apparent, information about the underlying alterations in structure, expression, and interphase organization of unbalanced chromosome sets is still sparse. Plants generally tolerate aneuploidy better than animals, and, through colchicine treatment and breeding strategies, it is possible to obtain inbred sibling plants with different numbers of chromosomes. This possibility, combined with the genetic and genomics tools available for Arabidopsis thaliana, provides a powerful means to assess systematically the molecular and cytological consequences of aberrant numbers of specific chromosomes. Here, we report on the generation of Arabidopsis plants in which chromosome 5 is present in triplicate. We compare the global transcript profiles of normal diploids and chromosome 5 trisomics, and assess genome integrity using array comparative genome hybridization. We use live cell imaging to determine the interphase 3D arrangement of transgene-encoded fluorescent tags on chromosome 5 in trisomic and triploid plants. The results indicate that trisomy 5 disrupts gene expression throughout the genome and supports the production and/or retention of truncated copies of chromosome 5. Although trisomy 5 does not grossly distort the interphase arrangement of fluorescent-tagged sites on chromosome 5, it may somewhat enhance associations between transgene alleles. Our analysis reveals the complex genomic changes that can occur in aneuploids and underscores the importance of using multiple experimental approaches to investigate how chromosome numerical changes condition abnormal phenotypes and

  4. Genomic structure, expression and association study of the porcine FSD2.

    Science.gov (United States)

    Lim, Kyu-Sang; Lee, Kyung-Tai; Lee, Si-Woo; Chai, Han-Ha; Jang, Gulwon; Hong, Ki-Chang; Kim, Tae-Hun

    2016-09-01

    The fibronectin type III and SPRY domain containing 2 (FSD2) gene on porcine chromosome 7 is considered a candidate gene for pork quality because of its two domains, which are also present in fibronectin and the ryanodine receptor. The fibronectin type III and SPRY domains were first identified in fibronectin and ryanodine receptor, respectively, which are candidate genes for meat quality. The aim of this study was to elucidate the genomic structure of FSD2 and functions of single nucleotide polymorphisms (SNPs) within FSD2 that are related to meat quality in pigs. Using a bacterial artificial chromosome clone sequence, we revealed that porcine FSD2 consisted of 13 exons encoding 750 amino acids. In addition, FSD2 was expressed in heart, longissimus dorsi muscle, psoas muscle, and tendon among 23 kinds of porcine tissues tested. A total of ten SNPs, including four missense mutations, were identified in the exonic region of FSD2, and two major haplotypes were obtained based on the SNP genotypes of 633 Berkshire pigs. Both haplotypes were associated significantly with intramuscular fat content (IMF, P meat color, affecting yellowness (P = 0.002). These haplotype effects were further supported by the alteration of putative protein structures with amino acid substitutions. Taken together, our results suggest that FSD2 haplotypes are involved in regulating meat quality including IMF, MP, and meat color in pigs, and may be used as meaningful molecular markers to identify pigs with preferable pork quality.

  5. Gene expression in chicken reveals correlation with structural genomic features and conserved patterns of transcription in the terrestrial vertebrates.

    Directory of Open Access Journals (Sweden)

    Haisheng Nie

    BACKGROUND: The chicken is an important agricultural and avian-model species. A survey of gene expression in a range of different tissues will provide a benchmark for understanding expression levels under normal physiological conditions in birds. With expression data for birds being very scant, this benchmark is of particular interest for comparative expression analysis among various terrestrial vertebrates. METHODOLOGY/PRINCIPAL FINDINGS: We carried out a gene expression survey in eight major chicken tissues using whole genome microarrays. A global picture of gene expression is presented for the eight tissues, and tissue specific as well as common gene expression were identified. A Gene Ontology (GO) term enrichment analysis showed that tissue-specific genes are enriched with GO terms reflecting the physiological functions of the specific tissue, and housekeeping genes are enriched with GO terms related to essential biological functions. Comparisons of structural genomic features between tissue-specific genes and housekeeping genes show that housekeeping genes are more compact. Specifically, coding sequence and particularly introns are shorter than in genes that display more variation in expression between tissues, and in addition intergenic space was also shorter. Meanwhile, housekeeping genes are more likely to co-localize with other abundantly or highly expressed genes on the same chromosomal regions. Furthermore, comparisons of gene expression in a panel of five common tissues between birds, mammals and amphibians showed that the expression patterns across tissues are highly similar for orthologous genes compared to random gene pairs within each pair-wise comparison, indicating a high degree of functional conservation in gene expression among terrestrial vertebrates. 
CONCLUSIONS: The housekeeping genes identified in this study have shorter gene length, shorter coding sequence length, shorter introns, and shorter intergenic regions, there seems

  6. Informational laws of genome structures

    Science.gov (United States)

    Bonnici, Vincenzo; Manca, Vincenzo

    2016-06-01

    In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.
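
    The k-mer entropy idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that lg2(n) can be read as the base-2 logarithm of the genome length rounded to the nearest integer, and the function names `suggested_k` and `kmer_entropy` are hypothetical. It computes only the plain Shannon entropy of the empirical k-mer distribution, not the specific informational indexes defined in the paper.

```python
import math
from collections import Counter


def suggested_k(genome: str) -> int:
    """Word length k taken as round(log2(n)); the rounding rule is an assumption."""
    if not genome:
        raise ValueError("genome must be non-empty")
    return max(1, round(math.log2(len(genome))))


def kmer_entropy(genome: str, k: int) -> float:
    """Shannon entropy (in bits) of the empirical distribution of overlapping k-mers."""
    if k < 1 or k > len(genome):
        raise ValueError("k must satisfy 1 <= k <= len(genome)")
    # Count every overlapping window of length k.
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return abs(entropy)  # avoids returning -0.0 for a single repeated k-mer


if __name__ == "__main__":
    toy = "ACGT" * 4  # toy sequence, n = 16
    k = suggested_k(toy)
    print(k, kmer_entropy(toy, k))
```

    A real genome would be read from a FASTA file and filtered for ambiguous bases before counting; the toy string only shows the mechanics.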

  7. Genomic survey, gene expression analysis and structural modeling suggest diverse roles of DNA methyltransferases in legumes.

    Directory of Open Access Journals (Sweden)

    Rohini Garg

    DNA methylation plays a crucial role in development through inheritable gene silencing. Plants possess three types of DNA methyltransferases (MTases), namely Methyltransferase (MET), Chromomethylase (CMT) and Domains Rearranged Methyltransferase (DRM), which maintain methylation at CG, CHG and CHH sites. DNA MTases have not been studied in legumes so far. Here, we report the identification and analysis of putative DNA MTases in five legumes, including chickpea, soybean, pigeonpea, Medicago and Lotus. MTases in legumes could be classified in known MET, CMT, DRM and DNA nucleotide methyltransferases (DNMT2) subfamilies based on their domain organization. First three MTases represent DNA MTases, whereas DNMT2 represents a transfer RNA (tRNA) MTase. Structural comparison of all the MTases in plants with known MTases in mammalian and plant systems has been reported to assign structural features in context of biological functions of these proteins. The structure analysis clearly specified regions crucial for protein-protein interactions and regions important for nucleosome binding in various domains of CMT and MET proteins. In addition, structural model of DRM suggested that circular permutation of motifs does not have any effect on overall structure of DNA methyltransferase domain. These results provide valuable insights into role of various domains in molecular recognition and should facilitate mechanistic understanding of their function in mediating specific methylation patterns. Further, the comprehensive gene expression analyses of MTases in legumes provided evidence of their role in various developmental processes throughout the plant life cycle and response to various abiotic stresses. Overall, our study will be very helpful in establishing the specific functions of DNA MTases in legumes.

  8. Structure, sequence and expression of the hepatitis delta (δ) viral genome

    Science.gov (United States)

    Wang, Kang-Sheng; Choo, Qui-Lim; Weiner, Amy J.; Ou, Jing-Hsiung; Najarian, Richard C.; Thayer, Richard M.; Mullenbach, Guy T.; Denniston, Katherine J.; Gerin, John L.; Houghton, Michael

    1986-10-01

    Biochemical and electron microscopic data indicate that the human hepatitis δ viral agent contains a covalently closed circular and single-stranded RNA genome that has certain similarities with viroid-like agents from plants. The sequence of the viral genome (1,678 nucleotides) has been determined and an open reading frame within the complementary strand has been shown to encode an antigen that binds specifically to antisera from patients with chronic hepatitis δ viral infections.

  9. Comparative study of four interleukin 17 cytokines of tongue sole Cynoglossus semilaevis: Genomic structure, expression pattern, and promoter activity.

    Science.gov (United States)

    Chi, Heng; Sun, Li

    2015-11-01

    The interleukin (IL)-17 cytokine family participates in the regulation of many cellular functions. In the present study, we analyzed the genomic structure, expression, and promoter activity of four IL-17 members from the teleost fish tongue sole (Cynoglossus semilaevis), i.e. CsIL-17C, CsIL-17D, CsIL-17F, and IL-17F like (IL-17Fl). We found that CsIL-17C, CsIL-17D, CsIL-17F, and CsIL-17Fl share 21.2%-28.6% overall sequence identities among themselves and 31.5%-71.2% overall sequence identities with their counterparts in other teleosts. All four CsIL-17 members possess an IL-17 domain and four conserved cysteine residues. Phylogenetic analysis classified the four CsIL-17 members into three clusters. Under normal physiological conditions, the four CsIL-17 were expressed in multiple tissues, especially non-immune tissues. Bacterial infection upregulated the expression of all four CsIL-17, while viral infection upregulated the expression of CsIL-17D and CsIL-17Fl but downregulated the expression of CsIL-17C and CsIL-17F. The 1.2 kb 5'-flanking regions of the four CsIL-17 exhibited apparent promoter activity and contain a number of putative transcription factor-binding sites. Furthermore, the promoter activities of CsIL-17C, CsIL-17D, and CsIL-17F, but not CsIL-17Fl, were modulated to significant extents by lipopolysaccharide, PolyI:C, and PMA. This study provides the first evidence that in teleosts, different IL-17 members differ in expression pattern and promoter activity. Copyright © 2015 Elsevier Ltd. All rights reserved.

  10. Gene design, cloning and protein-expression methods for high-value targets at the Seattle Structural Genomics Center for Infectious Disease

    International Nuclear Information System (INIS)

    Raymond, Amy; Haffner, Taryn; Ng, Nathan; Lorimer, Don; Staker, Bart; Stewart, Lance

    2011-01-01

    An overview of one salvage strategy for high-value SSGCID targets is given. Any structural genomics endeavor, particularly ambitious ones such as the NIAID-funded Seattle Structural Genomics Center for Infectious Disease (SSGCID) and Center for Structural Genomics of Infectious Disease (CSGID), faces technical challenges at all points of the production pipeline. One salvage strategy employed by SSGCID is combined gene engineering and structure-guided construct design to overcome challenges at the levels of protein expression and protein crystallization. Multiple constructs of each target are cloned in parallel using Polymerase Incomplete Primer Extension cloning and small-scale expressions of these are rapidly analyzed by capillary electrophoresis. Using the methods reported here, which have proven particularly useful for high-value targets, otherwise intractable targets can be resolved.

  11. National Academy of Sciences and Academy of Sciences of the USSR workshop on structure of the eucaryotic genome and regulation of its expression. Final report

    Energy Technology Data Exchange (ETDEWEB)

    1990-12-31

    This report provides a brief overview of the Workshop on Structure of the Eukaryotic Genome and Regulation of its Expression held in Tbilisi, Georgia, USSR. The report describes the presentations made at the meeting but also goes on to describe the state of molecular biology and genetics research in the Soviet Union and makes recommendations on how to improve future such meetings.

  12. National Academy of Sciences and Academy of Sciences of the USSR workshop on structure of the eucaryotic genome and regulation of its expression

    Energy Technology Data Exchange (ETDEWEB)

    1990-01-01

    This report provides a brief overview of the Workshop on Structure of the Eukaryotic Genome and Regulation of its Expression held in Tbilisi, Georgia, USSR. The report describes the presentations made at the meeting but also goes on to describe the state of molecular biology and genetics research in the Soviet Union and makes recommendations on how to improve future such meetings.

  13. A novel rat genomic simple repeat DNA with RNA-homology shows triplex (H-DNA)-like structure and tissue-specific RNA expression

    International Nuclear Information System (INIS)

    Dey, Indranil; Rath, Pramod C.

    2005-01-01

    The mammalian genome contains a wide variety of repetitive DNA sequences of relatively unknown function. We report a novel 227 bp simple repeat DNA (3.3 DNA) with a d{(GA)7A(AG)7} dinucleotide mirror repeat from the rat (Rattus norvegicus) genome. 3.3 DNA showed 75-85% homology with several eukaryotic mRNAs due to (GA/CU)n dinucleotide repeats by nBlast search and a dispersed distribution in the rat genome by Southern blot hybridization with [32P]3.3 DNA. The d{(GA)7A(AG)7} mirror repeat formed a triplex (H-DNA)-like structure in vitro. Two large RNAs of 9.1 and 7.5 kb were detected by [32P]3.3 DNA in rat brain by Northern blot hybridization indicating expression of such simple sequence repeats at RNA level in vivo. Further, several cDNAs were isolated from a rat cDNA library by [32P]3.3 DNA probe. Three such cDNAs showed tissue-specific RNA expression in rat. pRT 4.1 cDNA showed strong expression of a 2.39 kb RNA in brain and spleen, pRT 5.5 cDNA showed strong expression of a 2.8 kb RNA in brain and a 3.9 kb RNA in lungs, and pRT 11.4 cDNA showed weak expression of a 2.4 kb RNA in lungs. Thus, genomic simple sequence repeats containing d(GA/CT)n dinucleotides are transcriptionally expressed and regulated in rat tissues. Such d(GA/CT)n dinucleotide repeats may form structural elements (e.g., triplex) which may be sites for functional regulation of genomic coding sequences as well as RNAs. This may be a general function of such transcriptionally active simple sequence repeats widely dispersed in the mammalian genome.

  14. Structure and expression strategy of the genome of Culex pipiens densovirus, a mosquito densovirus with an ambisense organization.

    Science.gov (United States)

    Baquerizo-Audiot, Elizabeth; Abd-Alla, Adly; Jousset, Françoise-Xavière; Cousserans, François; Tijssen, Peter; Bergoin, Max

    2009-07-01

    The genome of all densoviruses (DNVs) so far isolated from mosquitoes or mosquito cell lines consists of a 4-kb single-stranded DNA molecule with a monosense organization (genus Brevidensovirus, subfamily Densovirinae). We previously reported the isolation of a Culex pipiens DNV (CpDNV) that differs significantly from brevidensoviruses by (i) having an approximately 6-kb genome, (ii) lacking sequence homology, and (iii) lacking antigenic cross-reactivity with Brevidensovirus capsid polypeptides. We report here the sequence organization and transcription map of this virus. The cloned genome of CpDNV is 5,759 nucleotides (nt) long, and it possesses an inverted terminal repeat (ITR) of 285 nt and an ambisense organization of its genes. The nonstructural (NS) proteins NS-1, NS-2, and NS-3 are located in the 5' half of one strand and are organized into five open reading frames (ORFs) due to the split of both NS-1 and NS-2 into two ORFs. The ORF encoding capsid polypeptides is located in the 5' half of the complementary strand. The expression of NS proteins is controlled by two promoters, P7 and P17, driving the transcription of a 2.4-kb mRNA encoding NS-3 and of a 1.8-kb mRNA encoding NS-1 and NS-2, respectively. The two NS mRNAs species are spliced off a 53-nt sequence. Capsid proteins are translated from an unspliced 2.3-kb mRNA driven by the P88 promoter. CpDNV thus appears as a new type of mosquito DNV, and based on the overall organization and expression modalities of its genome, it may represent the prototype of a new genus of DNV.

  15. Insights into structural variations and genome rearrangements in prokaryotic genomes.

    Science.gov (United States)

    Periwal, Vinita; Scaria, Vinod

    2015-01-01

    Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  16. Structure, expression profile and phylogenetic inference of chalcone isomerase-like genes from the narrow-leafed lupin (Lupinus angustifolius L.) genome

    Directory of Open Access Journals (Sweden)

    Łucja Przysiecka

    2015-04-01

    Lupins, like other legumes, have a unique biosynthesis scheme of 5-deoxy-type flavonoids and isoflavonoids. A key enzyme in this pathway is chalcone isomerase (CHI), a member of CHI-fold protein family, encompassing subfamilies of CHI1, CHI2, CHI-like (CHIL), and fatty acid-binding (FAP) proteins. Here, two Lupinus angustifolius (narrow-leafed lupin) CHILs, LangCHIL1 and LangCHIL2, were identified and characterized using DNA fingerprinting, cytogenetic and linkage mapping, sequencing and expression profiling. Clones carrying CHIL sequences were assembled into two contigs. Full gene sequences were obtained from these contigs, and mapped in two L. angustifolius linkage groups by gene-specific markers. Bacterial artificial chromosome fluorescence in situ hybridization approach confirmed the localization of two LangCHIL genes in distinct chromosomes. The expression profiles of both LangCHIL isoforms were very similar. The highest level of transcription was in the roots of the third week of plant growth; thereafter, expression declined. The expression of both LangCHIL genes in leaves and stems was similar and low. Comparative mapping to reference legume genome sequences revealed strong syntenic links; however, LangCHIL2 contig had a much more conserved structure than LangCHIL1. LangCHIL2 is assumed to be an ancestor gene, whereas LangCHIL1 probably appeared as a result of duplication. As both copies are transcriptionally active, questions arise concerning their hypothetical functional divergence. Screening of the narrow-leafed lupin genome and transcriptome with CHI-fold protein sequences, followed by Bayesian inference of phylogeny and cross-genera synteny survey, identified representatives of all but one (CHI1) main subfamilies. They are as follows: two copies of CHI2, FAPa2 and CHIL, and single copies of FAPb and FAPa1. 
Duplicated genes are remnants of whole genome duplication which is assumed to have occurred after the divergence of Lupinus, Arachis

  17. Structural genomics in endocrinology

    NARCIS (Netherlands)

    Smit, J. W.; Romijn, J. A.

    2001-01-01

    Traditionally, endocrine research evolved from the phenotypical characterisation of endocrine disorders to the identification of underlying molecular pathophysiology. This approach has been, and still is, extremely successful. The introduction of genomics and proteomics has resulted in a reversal of

  18. The Pekin duck programmed death-ligand 1: cDNA cloning, genomic structure, molecular characterization and mRNA expression analysis.

    Science.gov (United States)

    Yao, Q; Fischer, K P; Tyrrell, D L; Gutfreund, K S

    2015-04-01

    Programmed death ligand-1 (PD-L1) plays an important role in the attenuation of adaptive immune responses in higher vertebrates. Here, we describe the identification of the Pekin duck PD-L1 orthologue (duPD-L1) and its gene structure. The duPD-L1 cDNA encodes a 311-amino acid protein that has an amino acid identity of 78% and 42% with chicken and human PD-L1, respectively. Mapping of the duPD-L1 cDNA with duck genomic sequences revealed an exonic structure of its coding sequence similar to those of other vertebrates but lacked a noncoding exon 1. Homology modelling of the duPD-L1 extracellular domain was compatible with the tandem IgV-like and IgC-like IgSF domain structure of human PD-L1 (PDB ID: 3BIS). Residues known to be important for receptor binding of human PD-L1 were mostly conserved in duPD-L1 within the N-terminus and the G sheet, and partially conserved within the F sheet but not within sheets C and C'. DuPD-L1 mRNA was constitutively expressed in all tissues examined with highest expression levels in lung and spleen and very low levels of expression in muscle, kidney and brain. Mitogen stimulation of duck peripheral blood mononuclear cells transiently increased duPD-L1 mRNA expression. Our observations demonstrate evolutionary conservation of the exonic structure of its coding sequence, the extracellular domain structure and residues implicated in receptor binding, but the role of the longer cytoplasmic tail in avian PD-L1 proteins remains to be determined. © 2014 John Wiley & Sons Ltd.

  19. Genome structures and halophyte-specific gene expression of the extremophile Thellungiella parvula in comparison with Thellungiella salsuginea (Thellungiella halophila) and Arabidopsis

    KAUST Repository

    Oh, Dongha

    2010-09-10

    The genome of Thellungiella parvula, a halophytic relative of Arabidopsis (Arabidopsis thaliana), is being assembled using Roche-454 sequencing. Analyses of a 10-Mb scaffold revealed synteny with Arabidopsis, with recombination and inversion and an uneven distribution of repeat sequences. T. parvula genome structure and DNA sequences were compared with orthologous regions from Arabidopsis and publicly available bacterial artificial chromosome sequences from Thellungiella salsuginea (previously Thellungiella halophila). The three-way comparison of sequences, from one abiotic stress-sensitive species and two tolerant species, revealed extensive sequence conservation and microcolinearity, but grouping Thellungiella species separately from Arabidopsis. However, the T. parvula segments are distinguished from their T. salsuginea counterparts by a pronounced paucity of repeat sequences, resulting in a 30% shorter DNA segment with essentially the same gene content in T. parvula. Among the genes is SALT OVERLY SENSITIVE1 (SOS1), a sodium/proton antiporter, which represents an essential component of plant salinity stress tolerance. Although the SOS1 coding region is highly conserved among all three species, the promoter regions show conservation only between the two Thellungiella species. Comparative transcript analyses revealed higher levels of basal as well as salt-induced SOS1 expression in both Thellungiella species as compared with Arabidopsis. The Thellungiella species and other halophytes share conserved pyrimidine-rich 5' untranslated region proximal regions of SOS1 that are missing in Arabidopsis. Completion of the genome structure of T. parvula is expected to highlight distinctive genetic elements underlying the extremophile lifestyle of this species. © American Society of Plant Biologists.

  20. Genome structures and halophyte-specific gene expression of the extremophile Thellungiella parvula in comparison with Thellungiella salsuginea (Thellungiella halophila) and Arabidopsis

    KAUST Repository

    Oh, Dongha; Dassanayake, Maheshi; Haas, Jeffrey S.; Kropornika, Anna; Wright, Chris L.; D'Urzo, Matilde Paino; Hong, Hyewon; Ali, Shahjahan; Hernández, Álvaro Gonzalez; Lambert, Georgina M.; Inan, Günsu; Galbraith, David; Bressan, Ray Anthony; Yun, Daejin; Zhu, Jian-Kang; Cheeseman, John McP; Bohnert, Hans Jürgen

    2010-01-01

    and an uneven distribution of repeat sequences. T. parvula genome structure and DNA sequences were compared with orthologous regions from Arabidopsis and publicly available bacterial artificial chromosome sequences from Thellungiella salsuginea (previously

  1. Genome-wide decoding of hierarchical modular structure of transcriptional regulation by cis-element and expression clustering.

    Science.gov (United States)

    Leyfer, Dmitriy; Weng, Zhiping

    2005-09-01

    A holistic approach to the study of cellular processes is identifying both gene-expression changes and regulatory elements promoting such changes. Cellular regulatory processes can be viewed as transcriptional modules (TMs), groups of coexpressed genes regulated by groups of transcription factors (TFs). We set out to devise a method that would identify TMs while avoiding arbitrary thresholds on TM sizes and number. Assuming that gene expression is determined by TFs that bind to the gene's promoter, clustering of genes based on TF binding sites (cis-elements) should create gene groups similar to those obtained by gene expression clustering. Intersections between the expression and cis-element-based gene clusters reveal TMs. Statistical significance assigned to each TM allows identification of regulatory units of any size. Our method correctly identifies the number and sizes of TMs on simulated datasets. We demonstrate that yeast experimental TMs are biologically relevant by comparing them with MIPS and GO categories. Our modules are in statistically significant agreement with TMs from other research groups. This work suggests that there is no preferential division of biological processes into regulatory units; each degree of partitioning exhibits a slice of biological network revealing hierarchical modular organization of transcriptional regulation.
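
    The intersection step described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the published method: the abstract does not say which statistic is assigned to each module, so a one-sided hypergeometric test on the overlap is swapped in here. The function names, the `alpha` cutoff, and the toy gene identifiers are all hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def overlap_p_value(n_genes: int, size_a: int, size_b: int, overlap: int) -> float:
    """P(X >= overlap) when size_b genes are drawn at random from n_genes,
    of which size_a belong to the other cluster (hypergeometric upper tail)."""
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, x) * comb(n_genes - size_a, size_b - x)
        for x in range(overlap, upper + 1)
    )
    return tail / comb(n_genes, size_b)


def candidate_modules(expr_clusters, cis_clusters, n_genes, alpha=0.01):
    """Intersect every expression cluster with every cis-element cluster and
    keep the overlaps whose hypergeometric p-value is at most alpha."""
    modules = []
    for e_name, e_genes in expr_clusters.items():
        for c_name, c_genes in cis_clusters.items():
            shared = e_genes & c_genes
            if not shared:
                continue
            p = overlap_p_value(n_genes, len(e_genes), len(c_genes), len(shared))
            if p <= alpha:
                modules.append((e_name, c_name, shared, p))
    return modules


if __name__ == "__main__":
    # Toy input: 10 genes in total, one expression cluster, two cis-element clusters.
    expr = {"E1": {"g1", "g2", "g3", "g4", "g5"}}
    cis = {"C1": {"g1", "g2", "g3", "g4", "g5"}, "C2": {"g6", "g7"}}
    for module in candidate_modules(expr, cis, n_genes=10):
        print(module)
```

    Because each overlap is scored on its own, modules of any size can pass the cutoff, which mirrors the abstract's point about avoiding arbitrary thresholds on module size and number.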

  2. Porcine EEF1A1 and EEF1A2 genes: genomic structure, polymorphism, mapping and expression

    Czech Academy of Sciences Publication Activity Database

    Svobodová, K.; Horák, Pavel; Stratil, Antonín; Bartenschlager, H.; Van Poucke, M.; Chalupová, P.; Dvořáková, Věra; Knorr, Ch.; Stupka, R.; Čítek, J.; Šprysl, M.; Palánová, Anna; Peelman, L. J.; Geldermann, H.; Knoll, A.

    2015-01-01

    Vol. 42, No. 8 (2015), pp. 1257-1264. ISSN 0301-4851. R&D Projects: GA ČR(CZ) GA523/06/1302; GA ČR GA523/09/0844. Institutional support: RVO:67985904. Keywords: EEF1A1; EEF1A2; gene expression. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 1.698, year: 2015

  3. Regulation of methane genes and genome expression

    Energy Technology Data Exchange (ETDEWEB)

    John N. Reeve

    2009-09-09

    At the start of this project, it was known that methanogens were Archaebacteria (now Archaea) and were therefore predicted to have gene expression and regulatory systems different from Bacteria, but few of the molecular biology details were established. The goals were then to establish the structures and organizations of genes in methanogens, and to develop the genetic technologies needed to investigate and dissect methanogen gene expression and regulation in vivo. By cloning and sequencing, we established the gene and operon structures of all of the “methane” genes that encode the enzymes that catalyze methane biosynthesis from carbon dioxide and hydrogen. This work identified unique sequences in the methane gene that we designated mcrA, that encodes the largest subunit of methyl-coenzyme M reductase, that could be used to identify methanogen DNA and establish methanogen phylogenetic relationships. McrA sequences are now the accepted standard and used extensively as hybridization probes to identify and quantify methanogens in environmental research. With the methane genes in hand, we used northern blot and then later whole-genome microarray hybridization analyses to establish how growth phase and substrate availability regulated methane gene expression in Methanobacterium thermautotrophicus ΔH (now Methanothermobacter thermautotrophicus). Isoenzymes or pairs of functionally equivalent enzymes catalyze several steps in the hydrogen-dependent reduction of carbon dioxide to methane. We established that hydrogen availability determines which of these pairs of methane genes is expressed and therefore which of the alternative enzymes is employed to catalyze methane biosynthesis under different environmental conditions. As we were unable to establish a reliable genetic system for M. thermautotrophicus, we developed in vitro transcription as an alternative system to investigate methanogen gene expression and regulation. This led to the discovery that an archaeal protein

  4. Functional Insights from Structural Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Forouhar, F.; Kuzin, A.; Seetharaman, J.; Lee, I.; Zhou, W.; Abashidze, M.; Chen, Y.; Montelione, G.; Tong, L.; et al

    2007-01-01

    Structural genomics efforts have produced structural information, either directly or by modeling, for thousands of proteins over the past few years. While many of these proteins have known functions, a large percentage of them have not been characterized at the functional level. The structural information has provided valuable functional insights on some of these proteins, through careful structural analyses, serendipity, and structure-guided functional screening. Some of the success stories based on structures solved at the Northeast Structural Genomics Consortium (NESG) are reported here. These include a novel methyl salicylate esterase with an important role in plant innate immunity, a novel RNA methyltransferase (H. influenzae yggJ (HI0303)), a novel spermidine/spermine N-acetyltransferase (B. subtilis PaiA), a novel methyltransferase or AdoMet binding protein (A. fulgidus AF_0241), an ATP:cob(I)alamin adenosyltransferase (B. subtilis YvqK), a novel carboxysome pore (E. coli EutN), a proline racemase homolog with a disrupted active site (B. melitensis BME11586), an FMN-dependent enzyme (S. pneumoniae SP_1951), and a 12-stranded β-barrel with a novel fold (V. parahaemolyticus VPA1032).

  5. 2004 Structural, Function and Evolutionary Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Douglas L. Brutlag; Nancy Ryan Gray

    2005-03-23

    This Gordon conference will cover the areas of structural, functional and evolutionary genomics. It will take a systematic approach to genomics, examining the evolution of proteins, protein functional sites, protein-protein interactions, regulatory networks, and metabolic networks. Emphasis will be placed on what we can learn from comparative genomics and entire genomes and proteomes.

  6. Gene Composer in a structural genomics environment

    International Nuclear Information System (INIS)

    Lorimer, Don; Raymond, Amy; Mixon, Mark; Burgin, Alex; Staker, Bart; Stewart, Lance

    2011-01-01

    For structural biology applications, protein-construct engineering is guided by comparative sequence analysis and structural information, which allow the researcher to better define domain boundaries for terminal deletions and nonconserved regions for surface mutants. A database software application called Gene Composer has been developed to facilitate construct design. The structural genomics effort at the Seattle Structural Genomics Center for Infectious Disease (SSGCID) requires the manipulation of large numbers of amino-acid sequences and the underlying DNA sequences which are to be cloned into expression vectors. To improve efficiency in high-throughput protein structure determination, a database software package, Gene Composer, has been developed which facilitates the information-rich design of protein constructs and their underlying gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bioinformatics steps used in modern structure-guided protein engineering and synthetic gene engineering. An example of the structure determination of H1N1 RNA-dependent RNA polymerase PB2 subunit is given

  7. Structure and transcription of the Helicoverpa armigera densovirus (HaDV2) genome and its expression strategy in LD652 cells.

    Science.gov (United States)

    Xu, Pengjun; Graham, Robert I; Wilson, Kenneth; Wu, Kongming

    2017-02-07

    Densoviruses (DVs) are highly pathogenic to their hosts. However, we previously reported a mutualistic DV (HaDV2). Very little was known about the characteristics of this virus, so herein we undertook a series of experiments to explore the molecular biology of HaDV2 further. Phylogenetic analysis showed that HaDV2 was similar to members of the genus Iteradensovirus. However, compared to current members of the genus Iteradensovirus, the sequence identity of HaDV2 is less than 44% at the nucleotide-level, and lower than 36, 28 and 19% at the amino-acid-level of VP, NS1 and NS2 proteins, respectively. Moreover, NS1 and NS2 proteins from HaDV2 were smaller than those from other iteradensoviruses due to their shorter N-terminal sequences. Two transcripts of about 2.2 kb coding for the NS proteins and the VP proteins were identified by Northern Blot and RACE analysis. Using specific anti-NS1 and anti-NS2 antibodies, Western Blot analysis revealed a 78 kDa and a 48 kDa protein, respectively. Finally, the localization of both NS1 and NS2 proteins within the cell nucleus was determined by using Green Fluorescent Protein (GFP) labelling. The genome organization, terminal hairpin structure, transcription and expression strategies as well as the mutualistic relationship with its host, suggested that HaDV2 was a novel member of the genus Iteradensovirus within the subfamily Densovirinae.

  8. Genomic hypomethylation in the human germline associates with selective structural mutability in the human genome.

    Directory of Open Access Journals (Sweden)

    Jian Li

    Full Text Available The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ~1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR-mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease.

  9. Structural Genomics of Minimal Organisms: Pipeline and Results

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung-Hou; Shin, Dong-Hae; Kim, Rosalind; Adams, Paul; Chandonia, John-Marc

    2007-09-14

    The initial objective of the Berkeley Structural Genomics Center was to obtain near-complete three-dimensional (3D) structural information for all soluble proteins of two minimal organisms, the closely related pathogens Mycoplasma genitalium and M. pneumoniae. The former has fewer than 500 genes and the latter has fewer than 700 genes. A semiautomated structural genomics pipeline was set up, running from target selection through cloning, expression, and purification to structure determination. At the time of this writing, structural information for more than 93% of all soluble proteins of M. genitalium is available. This chapter summarizes the approaches taken by the authors' center.

  10. Identification of the major structural and nonstructural proteins encoded by human parvovirus B19 and mapping of their genes by procaryotic expression of isolated genomic fragments

    Energy Technology Data Exchange (ETDEWEB)

    Cotmore, S.F.; McKie, V.C.; Anderson, L.J.; Astell, C.R.; Tattersall, P.

    1986-11-01

    Plasma from a child with homozygous sickle-cell disease, sampled during the early phase of an aplastic crisis, contained human parvovirus B19 virions. Plasma taken 10 days later (during the convalescent phase) contained both immunoglobulin M and immunoglobulin G antibodies directed against two viral polypeptides with apparent molecular weights of 83,000 and 58,000 which were present exclusively in the particulate fraction of the plasma taken during the acute phase. These two protein species comigrated at 110S on neutral sucrose velocity gradients with the B19 viral DNA and thus appear to constitute the viral capsid polypeptides. The B19 genome was molecularly cloned into a bacterial plasmid vector. Two expression constructs containing B19 sequences from different halves of the viral genome were obtained, which directed the synthesis, in bacteria, of segments of virally encoded protein. These polypeptide fragments were then purified and used to immunize rabbits. Antibodies against a protein sequence specified between nucleotides 2897 and 3749 recognized both the 83- and 58-kilodalton capsid polypeptides in aplastic plasma taken during the acute phase and detected similar proteins in the tissues of a stillborn fetus which had been infected transplacentally with B19. Antibodies against a protein sequence encoded in the other half of the B19 genome (nucleotides 1072 through 2044) did not react specifically with any protein in plasma taken during the acute phase but recognized three nonstructural polypeptides of 71, 63, and 52 kilodaltons present in the liver and, at lower levels, in some other tissues of the transplacentally infected fetus.

  11. Identification of the major structural and nonstructural proteins encoded by human parvovirus B19 and mapping of their genes by procaryotic expression of isolated genomic fragments

    International Nuclear Information System (INIS)

    Cotmore, S.F.; McKie, V.C.; Anderson, L.J.; Astell, C.R.; Tattersall, P.

    1986-01-01

    Plasma from a child with homozygous sickle-cell disease, sampled during the early phase of an aplastic crisis, contained human parvovirus B19 virions. Plasma taken 10 days later (during the convalescent phase) contained both immunoglobulin M and immunoglobulin G antibodies directed against two viral polypeptides with apparent molecular weights of 83,000 and 58,000 which were present exclusively in the particulate fraction of the plasma taken during the acute phase. These two protein species comigrated at 110S on neutral sucrose velocity gradients with the B19 viral DNA and thus appear to constitute the viral capsid polypeptides. The B19 genome was molecularly cloned into a bacterial plasmid vector. Two expression constructs containing B19 sequences from different halves of the viral genome were obtained, which directed the synthesis, in bacteria, of segments of virally encoded protein. These polypeptide fragments were then purified and used to immunize rabbits. Antibodies against a protein sequence specified between nucleotides 2897 and 3749 recognized both the 83- and 58-kilodalton capsid polypeptides in aplastic plasma taken during the acute phase and detected similar proteins in the tissues of a stillborn fetus which had been infected transplacentally with B19. Antibodies against a protein sequence encoded in the other half of the B19 genome (nucleotides 1072 through 2044) did not react specifically with any protein in plasma taken during the acute phase but recognized three nonstructural polypeptides of 71, 63, and 52 kilodaltons present in the liver and, at lower levels, in some other tissues of the transplacentally infected fetus

  12. Genomics technologies to study structural variations in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Cardone Maria Francesca

    2016-01-01

    Full Text Available Grapevine is one of the most important crop plants in the world. Genomic resources for the grapevine genome have expanded greatly in recent years, supporting growing efforts in molecular breeding. Current cultivars display a great level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences, and to find the responsible genes selected by cross breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs) in plant genomes, few data are available on copy number variation (CNV). Furthermore, an association between structural variations and phenotypes has been described in only a few cases. We combined high-throughput biotechnologies and bioinformatics tools to reveal the first inter-varietal atlas of structural variation (SV) for the grapevine genome. We sequenced four table grape cultivars and compared them with the genome of the Pinot noir inbred line PN40024 as the reference. We detected genomic variations affecting roughly 8% of the grapevine genome. Taking into account the phenotypic differences among the studied varieties, we compared SVs among them and the reference, and then performed an in-depth analysis of the gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

  13. Genome-wide expression profiling of complex regional pain syndrome.

    Directory of Open Access Journals (Sweden)

    Eun-Heui Jin

    Full Text Available Complex regional pain syndrome (CRPS) is a chronic, progressive, and devastating pain syndrome characterized by spontaneous pain, hyperalgesia, allodynia, altered skin temperature, and motor dysfunction. Although previous gene expression profiling studies have been conducted in animal pain models, genome-wide expression profiling in the whole blood of CRPS patients has not been reported yet. Here, we successfully identified certain pain-related genes through genome-wide expression profiling in the blood from CRPS patients. We found that 80 genes were differentially expressed between 4 CRPS patients (2 CRPS I and 2 CRPS II) and 5 controls (cut-off value: 1.5-fold change and p<0.05). Most of those genes were associated with signal transduction, developmental processes, cell structure and motility, and immunity and defense. The expression levels of major histocompatibility complex class I A subtype (HLA-A29.1), matrix metalloproteinase 9 (MMP9), alanine aminopeptidase N (ANPEP), l-histidine decarboxylase (HDC), granulocyte colony-stimulating factor 3 receptor (G-CSF3R), and signal transducer and activator of transcription 3 (STAT3) genes selected from the microarray were confirmed in 24 CRPS patients and 18 controls by quantitative reverse transcription-polymerase chain reaction (qRT-PCR). We focused on the MMP9 gene that, by qRT-PCR, showed a statistically significant difference in expression in CRPS patients compared to controls with the highest relative fold change (4.0±1.23 times and p = 1.4×10⁻⁴). The up-regulation of the MMP9 gene in the blood may be related to the pain progression in CRPS patients. Our findings, which offer a valuable contribution to the understanding of the differential gene expression in CRPS, may help in the understanding of the pathophysiology of CRPS pain progression.
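
    The selection rule stated in this abstract (a 1.5-fold change together with p<0.05) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the treatment of the fold change as symmetric (up- or down-regulation), and the gene labels and values are all hypothetical and serve only to show how such a two-part cut-off behaves.

```python
def passes_cutoff(mean_patient, mean_control, p_value,
                  min_fold=1.5, max_p=0.05):
    """Return True when a gene meets both the fold-change and p-value cut-offs.

    Assumes strictly positive expression values. The fold change is treated
    symmetrically, so a 2-fold decrease counts the same as a 2-fold increase
    (an assumption; the abstract does not say how direction was handled).
    """
    ratio = mean_patient / mean_control
    fold = ratio if ratio >= 1.0 else 1.0 / ratio
    return fold >= min_fold and p_value < max_p


def select_genes(records, min_fold=1.5, max_p=0.05):
    """Filter a {gene: (mean_patient, mean_control, p_value)} mapping."""
    return sorted(
        gene for gene, (patient, control, p) in records.items()
        if passes_cutoff(patient, control, p, min_fold, max_p)
    )


# Hypothetical values, for illustration only (not data from the study).
example = {
    "GENE_A": (200.0, 100.0, 0.01),   # 2.0-fold up, significant
    "GENE_B": (120.0, 100.0, 0.001),  # 1.2-fold, below the fold cut-off
    "GENE_C": (50.0, 100.0, 0.03),    # 2.0-fold down, significant
    "GENE_D": (300.0, 100.0, 0.20),   # 3.0-fold up, not significant
}
selected = select_genes(example)
print(selected)
```

    Requiring both conditions excludes genes with large but statistically unsupported changes (GENE_D in the sketch) as well as genes with significant but small changes (GENE_B).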

  14. Structural genomics of infectious disease drug targets: the SSGCID

    International Nuclear Information System (INIS)

    Stacy, Robin; Begley, Darren W.; Phan, Isabelle; Staker, Bart L.; Van Voorhis, Wesley C.; Varani, Gabriele; Buchko, Garry W.; Stewart, Lance J.; Myler, Peter J.

    2011-01-01

    An introduction and overview of the focus, goals and overall mission of the Seattle Structural Genomics Center for Infectious Disease (SSGCID) is given. The Seattle Structural Genomics Center for Infectious Disease (SSGCID) is a consortium of researchers at Seattle BioMed, Emerald BioStructures, the University of Washington and Pacific Northwest National Laboratory that was established to apply structural genomics approaches to drug targets from infectious disease organisms. The SSGCID is currently funded over a five-year period by the National Institute of Allergy and Infectious Diseases (NIAID) to determine the three-dimensional structures of 400 proteins from a variety of Category A, B and C pathogens. Target selection engages the infectious disease research and drug-therapy communities to identify drug targets, essential enzymes, virulence factors and vaccine candidates of biomedical relevance to combat infectious diseases. The protein-expression systems, purified proteins, ligand screens and three-dimensional structures produced by SSGCID constitute a valuable resource for drug-discovery research, all of which is made freely available to the greater scientific community. This issue of Acta Crystallographica Section F, entirely devoted to the work of the SSGCID, covers the details of the high-throughput pipeline and presents a series of structures from a broad array of pathogenic organisms. Here, a background is provided on the structural genomics of infectious disease, the essential components of the SSGCID pipeline are discussed and a survey of progress to date is presented

  15. Structural genomic variation in ischemic stroke

    Science.gov (United States)

    Matarin, Mar; Simon-Sanchez, Javier; Fung, Hon-Chung; Scholz, Sonja; Gibbs, J. Raphael; Hernandez, Dena G.; Crews, Cynthia; Britton, Angela; Wavrant De Vrieze, Fabienne; Brott, Thomas G.; Brown, Robert D.; Worrall, Bradford B.; Silliman, Scott; Case, L. Douglas; Hardy, John A.; Rich, Stephen S.; Meschia, James F.; Singleton, Andrew B.

    2008-01-01

    Technological advances in molecular genetics allow rapid and sensitive identification of genomic copy number variants (CNVs). This, in turn, has sparked interest in the function such variation may play in disease. While a role for copy number mutations as a cause of Mendelian disorders is well established, it is unclear whether CNVs may affect risk for common complex disorders. We sought to investigate whether CNVs may modulate risk for ischemic stroke (IS) and to provide a catalog of CNVs in patients with this disorder by analyzing copy number metrics produced as a part of our previous genome-wide single-nucleotide polymorphism (SNP)-based association study of ischemic stroke in a North American white population. We examined CNVs in 263 patients with ischemic stroke (IS). Each identified CNV was compared with changes identified in 275 neurologically normal controls. Our analysis identified 247 CNVs, corresponding to 187 insertions (76%; 135 heterozygous; 25 homozygous duplications or triplications; 2 heterosomic) and 60 deletions (24%; 40 heterozygous deletions; 3 homozygous deletions; 14 heterosomic deletions). Most alterations (81%) were the same as, or overlapped with, previously reported CNVs. We report here the first genome-wide analysis of CNVs in IS patients. In summary, our study did not detect any common genomic structural variation unequivocally linked to IS, although we cannot exclude that smaller CNVs or CNVs in genomic regions poorly covered by this methodology may confer risk for IS. The application of genome-wide SNP arrays now facilitates the evaluation of structural changes through the entire genome as part of a genome-wide genetic association study. PMID:18288507
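
    The headline percentages in this abstract follow directly from the stated totals (187 insertions and 60 deletions among 247 CNVs). A minimal Python sketch of that arithmetic is given below; the function name is hypothetical, and only the three totals quoted in the abstract are used.

```python
def percent_share(count, total):
    """Return count as a whole-number percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total)


# Totals as stated in the abstract.
total_cnvs = 247
insertions = 187
deletions = 60

insertion_pct = percent_share(insertions, total_cnvs)
deletion_pct = percent_share(deletions, total_cnvs)
print(insertion_pct, deletion_pct)
```

    The two classes together account for all 247 CNVs, so the rounded shares sum to 100%.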

  16. Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models.

    Directory of Open Access Journals (Sweden)

    2005-08-01

    Full Text Available The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB), it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

  17. Ligninolytic peroxidase genes in the oyster mushroom genome: heterologous expression, molecular structure, catalytic and stability properties, and lignin-degrading ability

    Science.gov (United States)

    Elena Fernández-Fueyo; Francisco J Ruiz-Dueñas; María Jesús Martinez; Antonio Romero; Kenneth E Hammel; Francisco Javier Medrano; Angel T. Martínez

    2014-01-01

    Background: The genome of Pleurotus ostreatus, an important edible mushroom and a model ligninolytic organism of interest in lignocellulose biorefineries due to its ability to delignify agricultural wastes, was sequenced with the purpose of identifying and characterizing the enzymes responsible for lignin degradation. ...

  18. Genome-Wide Analyses of the NAC Transcription Factor Gene Family in Pepper (Capsicum annuum L.): Chromosome Location, Phylogeny, Structure, Expression Patterns, Cis-Elements in the Promoter, and Interaction Network

    Directory of Open Access Journals (Sweden)

    Weiping Diao

    2018-03-01

    Full Text Available The NAM, ATAF1/2, and CUC2 (NAC) transcription factors form a large plant-specific gene family, which is involved in the regulation of tissue development in response to biotic and abiotic stress. To date, there have been no comprehensive studies investigating chromosomal location, gene structure, gene phylogeny, conserved motifs, or gene expression of NAC in pepper (Capsicum annuum L.). The recent release of the complete genome sequence of pepper allowed us to perform a genome-wide investigation of Capsicum annuum L. NAC (CaNAC) proteins. In the present study, a comprehensive analysis of the CaNAC gene family in pepper was performed, and a total of 104 CaNAC genes were identified. Genome mapping analysis revealed that CaNAC genes were enriched on four chromosomes (chromosomes 1, 2, 3, and 6). In addition, phylogenetic analysis of the NAC domains from pepper, potato, Arabidopsis, and rice showed that CaNAC genes could be clustered into three groups (I, II, and III). Group III, which contained 24 CaNAC genes, was exclusive to the Solanaceae plant family. Gene structure and protein motif analyses showed that these genes were relatively conserved within each subgroup. The number of introns in CaNAC genes varied from 0 to 8, with 83 (78.9%) of CaNAC genes containing two or fewer introns. Promoter analysis confirmed that CaNAC genes are involved in pepper growth, development, and biotic or abiotic stress responses. Further, the expression of 22 selected CaNAC genes in response to seven different biotic and abiotic stresses [salt, heat shock, drought, Phytophthora capsici, abscisic acid, salicylic acid (SA), and methyl jasmonate (MeJA)] was evaluated by quantitative RT-PCR to determine their stress-related expression patterns. Several putative stress-responsive CaNAC genes, including CaNAC72 and CaNAC27, which are orthologs of the known stress-responsive Arabidopsis gene ANAC055 and potato gene StNAC30, respectively, were highly regulated by treatment with

  19. Using Genomics for Natural Product Structure Elucidation.

    Science.gov (United States)

    Tietz, Jonathan I; Mitchell, Douglas A

    2016-01-01

    Natural products (NPs) are the most historically bountiful source of chemical matter for drug development, especially for anti-infectives. With insights gleaned from genome mining, interest in natural product discovery has been reinvigorated. An essential stage in NP discovery is structural elucidation, which sheds light not only on the chemical composition of a molecule but also its novelty, properties, and derivatization potential. The history of structure elucidation is replete with technique-based revolutions: combustion analysis, crystallography, UV, IR, MS, and NMR have each provided game-changing advances; the latest such advance is genomics. All natural products have a genetic basis, and the ability to obtain and interpret genomic information for structure elucidation is increasingly available at low cost to non-specialists. In this review, we describe the value of genomics as a structural elucidation technique, especially from the perspective of the natural product chemist approaching an unknown metabolite. Herein we first introduce the databases and programs of interest to the natural products chemist, with an emphasis on those currently most suited for general usability. We describe strategies for linking observed natural product-linked phenotypes to their corresponding gene clusters. We then discuss techniques for extracting structural information from genes, illustrated with numerous case examples. We also provide an analysis of the biases and limitations of the field with recommendations for future development. Our overview is not only aimed at biologically-oriented researchers already at ease with bioinformatic techniques, but also, in particular, at natural product, organic, and/or medicinal chemists not previously familiar with genomic techniques.

  20. Interrogating the druggable genome with structural informatics.

    Science.gov (United States)

    Hambly, Kevin; Danzer, Joseph; Muskal, Steven; Debe, Derek A

    2006-08-01

    Structural genomics projects are producing protein structure data at an unprecedented rate. In this paper, we present the Target Informatics Platform (TIP), a novel structural informatics approach for amplifying the rapidly expanding body of experimental protein structure information to enhance the discovery and optimization of small molecule protein modulators on a genomic scale. In TIP, existing experimental structure information is augmented using a homology modeling approach, and binding sites across multiple target families are compared using a clique detection algorithm. We report here a detailed analysis of the structural coverage for the set of druggable human targets, highlighting drug target families where the level of structural knowledge is currently quite high, as well as those areas where structural knowledge is sparse. Furthermore, we demonstrate the utility of TIP's intra- and inter-family binding site similarity analysis using a series of retrospective case studies. Our analysis underscores the utility of a structural informatics infrastructure for extracting drug discovery-relevant information from structural data, aiding researchers in the identification of lead discovery and optimization opportunities as well as potential "off-target" liabilities.

  1. Genomic structure and expression pattern of MHC IIα and IIβ genes reveal an unusual immune trait in lined seahorse Hippocampus erectus.

    Science.gov (United States)

    Luo, Wei; Wang, Xin; Qu, Hongyue; Qin, Geng; Zhang, Huixian; Lin, Qiang

    2016-11-01

    The major histocompatibility complex (MHC) genes are crucial in the adaptive immune system, and the gene duplication of MHC in animals can generally result in immune flexibility. In this study, we found that the lined seahorse (Hippocampus erectus) has only one gene copy number (GCN) of MHC IIα and IIβ, which is different from that in other teleosts. Together with the lack of spleen and gut-associated lymphatic tissue (GALT), the seahorse may be referred to as having a partial but natural "immunodeficiency". Highly variable amino acid residues were found in the IIα and IIβ domains, especially in the α1 and β1 domains with 9.62% and 8.43% allelic variation, respectively. Site models revealed seven and ten positively selected positions in the α1 and β1 domains, respectively. Real-time PCR experiments showed high expression levels of the MHC II genes in intestine (In), gill (Gi) and trunk kidney (TK) and medium levels in muscle (Mu) and brood pouch (BP), and the expression levels were significantly up-regulated after bacterial infection. Notably, relatively higher expression levels of both MHC IIα and IIβ were found in Mu and BP when compared with other fish species, in which MHC II is expressed negligibly in Mu. These results indicate that apart from TK, Gi and In, Mu and BP play an important role in the immune response against pathogens in the seahorse. In conclusion, high allelic variation and strong positive selection in PBR and relatively higher expression in Mu and BP are speculated to partly compensate for the immunodeficiency. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. Genome Wide Identification, Phylogeny, and Expression of Aquaporin Genes in Common Carp (Cyprinus carpio).

    Directory of Open Access Journals (Sweden)

    Chuanju Dong

    Full Text Available Aquaporins (Aqps) are integral membrane proteins that facilitate the transport of water and small solutes across cell membranes. Among vertebrate species, Aqps are highly conserved in both gene structure and amino acid sequence. These proteins are vital for maintaining water homeostasis in living organisms, especially for aquatic animals such as teleost fish. Studies on teleost Aqps are mainly limited to several model species with diploid genomes. Common carp, which has a tetraploidized genome, is one of the most common aquaculture species and is adapted to a wide range of aquatic environments. The complete common carp genome has recently been released, making it possible to study the evolution of the aqp gene family after whole genome duplication. In this study, we identified a total of 37 aqp genes from the common carp genome. Phylogenetic analysis revealed that most of the aqps are highly conserved. Comparative analysis was performed across five typical vertebrate genomes. We found that almost all of the aqp genes in common carp were duplicated in the evolution of the gene family. We postulated that the expansion of the aqp gene family in common carp was the result of an additional whole genome duplication event and that duplicated aqp genes in other teleosts were lost during their evolutionary history because their functions were redundant and conserved. Expression patterns were assessed in various tissues, including brain, heart, spleen, liver, intestine, gill, muscle, and skin, which demonstrated the comprehensive expression profiles of aqp genes in the tetraploidized genome. Significant gene expression divergences have been observed, revealing substantial expression divergences or functional divergences in those duplicated aqp genes after the latest WGD event. To some extent, the gene families are also considered as a unique source for evolutionary studies. Moreover, the whole set of common carp aqp gene family provides an

  3. Two duplicated chicken-type lysozyme genes in disc abalone Haliotis discus discus: molecular aspects in relevance to structure, genomic organization, mRNA expression and bacteriolytic function.

    Science.gov (United States)

    Umasuthan, Navaneethaiyer; Bathige, S D N K; Kasthuri, Saranya Revathy; Wan, Qiang; Whang, Ilson; Lee, Jehee

    2013-08-01

    Lysozymes are crucial antibacterial proteins that catalyze the cleavage of peptidoglycan and subsequent bacteriolysis. The present study describes the identification of two lysozyme genes from disc abalone Haliotis discus discus and their characterization at the sequence, genomic, transcriptional and functional levels. Two cDNAs and BAC clones bearing lysozyme genes were isolated from abalone transcriptome and BAC genomic libraries, respectively, and their sequences were determined. The corresponding deduced amino acid sequences harbored a chicken-type lysozyme (LysC) family profile and exhibited conserved characteristics of LysC family members, including active residues (Glu and Asp) and a GS(S/T)DYGIFQINS motif, suggesting that they are LysC counterparts in disc abalone; they were designated abLysC1 and abLysC2. While abLysC1 represented the homolog recently reported in Ezo abalone [1], abLysC2 shared significant identity with LysC homologs. Unlike vertebrate LysCs, the coding sequences of the abLysCs were distributed across five exons interrupted by four introns. Both abLysCs showed a broad mRNA distribution, with the highest levels in mantle (abLysC1) and hepatopancreas (abLysC2), suggesting likely main roles in defense and digestion, respectively. Investigation of temporal transcriptional profiles after LPS and pathogen challenges revealed induced responses of abLysCs in gills and hemocytes. The in vitro muramidase activity of the purified recombinant (r)abLysC proteins was evaluated, and the findings indicated that they are active in an acidic pH range (3.5-6.5) and over a broad temperature range (20-60 °C), and are influenced by ionic strength. When the antibacterial spectra of the (r)abLysCs were examined, they displayed differential activities against both Gram-positive and Gram-negative strains, providing evidence for their involvement in bacteriolytic function in abalone physiology. Copyright © 2013 Elsevier Ltd. All rights reserved.

  4. Structural genomic variations and Parkinson's disease.

    Science.gov (United States)

    Bandrés-Ciga, Sara; Ruz, Clara; Barrero, Francisco J; Escamilla-Sevilla, Francisco; Pelegrina, Javier; Vives, Francisco; Duran, Raquel

    2017-10-01

    Parkinson's disease (PD) is the second most common neurodegenerative disease, whose prevalence is projected to be between 8.7 and 9.3 million by 2030. Until about 20 years ago, PD was considered to be the textbook example of a "non-genetic" disorder. Nowadays, PD is generally considered a multifactorial disorder that arises from the combination and complex interaction of genes and environmental factors. To date, a total of 7 genes, including SNCA, LRRK2, PARK2, DJ-1, PINK1, VPS35 and ATP13A2, have been shown unequivocally to cause Mendelian PD. Also, variants with incomplete penetrance in the genes LRRK2 and GBA are considered to be strong risk factors for PD worldwide. Although genetic studies have provided valuable insights into the pathogenic mechanisms underlying PD, the role of structural variation in PD has been understudied in comparison with other genomic variations. Structural genomic variations might substantially account for the genetic substrates yet to be discovered. The present review aims to provide an overview of the structural genomic variants implicated in the pathogenesis of PD.

  5. Expression of a transferred nuclear gene in a mitochondrial genome

    Directory of Open Access Journals (Sweden)

    Yichun Qiu

    2014-08-01

    Transfer of mitochondrial genes to the nucleus, and subsequent gain of regulatory elements for expression, is an ongoing evolutionary process in plants. Many examples have been characterized, which in some cases have revealed sources of mitochondrial targeting sequences and cis-regulatory elements. In contrast, there have been no reports of a nuclear gene that has undergone intracellular transfer to the mitochondrial genome and become expressed. Here we show that the orf164 gene in the mitochondrial genome of several Brassicaceae species, including Arabidopsis, is derived from the nuclear ARF17 gene that codes for an auxin responsive protein and is present across flowering plants. Orf164 corresponds to a portion of ARF17, and the nucleotide and amino acid sequences are 79% and 81% identical, respectively. Orf164 is transcribed in several organ types of Arabidopsis thaliana, as detected by RT-PCR. In addition, orf164 is transcribed in five other Brassicaceae within the tribes Camelineae, Erysimeae and Cardamineae, but the gene is not present in Brassica or Raphanus. This study shows that nuclear genes can be transferred to the mitochondrial genome and become expressed, providing a new perspective on the movement of genes between the genomes of subcellular compartments.

  6. Genome-Wide Expression Profiling of Complex Regional Pain Syndrome

    Science.gov (United States)

    Jin, Eun-Heui; Zhang, Enji; Ko, Youngkwon; Sim, Woo Seog; Moon, Dong Eon; Yoon, Keon Jung; Hong, Jang Hee; Lee, Won Hyung

    2013-01-01

    Complex regional pain syndrome (CRPS) is a chronic, progressive, and devastating pain syndrome characterized by spontaneous pain, hyperalgesia, allodynia, altered skin temperature, and motor dysfunction. Although previous gene expression profiling studies have been conducted in animal pain models, genome-wide expression profiling in the whole blood of CRPS patients has not yet been reported. Here, we successfully identified certain pain-related genes through genome-wide expression profiling in the blood from CRPS patients. We found that 80 genes were differentially expressed between 4 CRPS patients (2 CRPS I and 2 CRPS II) and 5 controls (cut-off value: 1.5-fold change and p ... CRPS patients and 18 controls by quantitative reverse transcription-polymerase chain reaction (qRT-PCR). We focused on the MMP9 gene, which, by qRT-PCR, showed a statistically significant difference in expression in CRPS patients compared to controls, with the highest relative fold change (4.0 ± 1.23-fold, p = 1.4×10^−4). The up-regulation of the MMP9 gene in the blood may be related to pain progression in CRPS patients. Our findings, which offer a valuable contribution to the understanding of differential gene expression in CRPS, may help in understanding the pathophysiology of CRPS pain progression. PMID:24244504

  7. Elucidation of Operon Structures across Closely Related Bacterial Genomes

    Science.gov (United States)

    Li, Guojun

    2014-01-01

    About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. With the evolution of genomes, operon structures are undergoing changes which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG) and a pair of nodes will be connected if any two genes, from the corresponding two OGGs respectively, are located in the same operon as immediate neighbors in any of the considered genomes. Through identifying the connected components in the above graph, we found that genes in a connected component are likely to be functionally related and these identified components tend to form treelike topology, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation as follows. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional roles of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components. PMID:24959722
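
    The graph construction described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the input layout (a dictionary mapping each genome to a list of operons, each operon an ordered list of OGG identifiers) and the gene labels in the usage note are assumptions made for the example. The path/star labelling is a simple degree-based heuristic.

```python
from collections import defaultdict

def build_ogg_graph(operons_by_genome):
    """Link two OGGs when their genes are immediate neighbors in any operon."""
    graph = defaultdict(set)
    for operons in operons_by_genome.values():
        for operon in operons:
            for ogg in operon:
                graph[ogg]  # register the node even if it has no neighbor
            for a, b in zip(operon, operon[1:]):
                if a != b:
                    graph[a].add(b)
                    graph[b].add(a)
    return dict(graph)

def connected_components(graph):
    """Return the connected components of the graph as a list of node sets."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components

def classify_component(graph, component):
    """Label a component 'path', 'star' or 'other' from its degree sequence."""
    n = len(component)
    degrees = sorted(len(graph[node] & component) for node in component)
    if sum(degrees) // 2 != n - 1:
        return "other"  # a connected component is a tree only with n - 1 edges
    if n <= 2 or degrees == [1, 1] + [2] * (n - 2):
        return "path"   # note: a 3-node tree is both a path and a star
    if degrees == [1] * (n - 1) + [n - 1]:
        return "star"
    return "other"
```

    For instance, with hypothetical operons ["rpsJ", "rplC", "rplD"] and ["rplD", "rplW"] the four ribosomal OGGs form a path, while operons pairing "permease" with "sbp1", "sbp2" and "sbp3" form a star centered on the permease.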

  8. In vitro analysis of integrated global high-resolution DNA methylation profiling with genomic imbalance and gene expression in osteosarcoma.

    Directory of Open Access Journals (Sweden)

    Bekim Sadikovic

    Genetic and epigenetic changes contribute to deregulation of gene expression and development of human cancer. Changes in DNA methylation are key epigenetic factors regulating gene expression and genomic stability. Recent progress in microarray technologies has resulted in the development of high-resolution platforms for profiling genetic, epigenetic and gene expression changes. Osteosarcoma (OS) is a pediatric bone tumor with a characteristically high level of numerical and structural chromosomal changes. Furthermore, little is known about DNA methylation changes in OS. Our objective was to develop an integrative approach for analysis of high-resolution epigenomic, genomic, and gene expression profiles in order to identify functional epi/genomic differences between OS cell lines and normal human osteoblasts. A combination of Affymetrix Promoter Tiling Arrays for DNA methylation, the Agilent array-CGH platform for genomic imbalance and the Affymetrix Gene 1.0 platform for gene expression analysis was used. As a result, an integrative high-resolution approach for interrogation of genome-wide tumour-specific changes in DNA methylation was developed. This approach was used to provide the first genomic DNA methylation maps, and to identify and validate genes with aberrant DNA methylation in OS cell lines. This first integrative analysis of global cancer-related changes in DNA methylation, genomic imbalance, and gene expression has provided comprehensive evidence of the cumulative roles of epigenetic and genetic mechanisms in deregulation of gene expression networks.

  9. Genome-wide associations of gene expression variation in humans.

    Directory of Open Access Journals (Sweden)

    Barbara E Stranger

    2005-12-01

    The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest, including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction: Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs) with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis-) to the genes of interest is more abundant and more stable than distal and trans signals across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I) HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.
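
    Two of the three multiple-test corrections named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis code: the abstract does not say which false discovery rate procedure was used, so the Benjamini-Hochberg step-up procedure is assumed here, and the function names and the 0.05 level are choices made for the example. The permutation approach is not shown.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject a test when its p-value is at most alpha divided by the number of tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]

def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed FDR method for this sketch)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    # Find the largest rank k whose p-value satisfies p_(k) <= alpha * k / m.
    for rank, index in enumerate(order, start=1):
        if pvalues[index] <= alpha * rank / m:
            cutoff_rank = rank
    reject = [False] * m
    # Every test ranked at or below the cutoff is rejected.
    for index in order[:cutoff_rank]:
        reject[index] = True
    return reject
```

    With many SNP-expression tests, the Bonferroni threshold shrinks in proportion to the number of tests, whereas the Benjamini-Hochberg threshold grows with the rank of each p-value, so it typically retains more associations at the cost of controlling a less strict error rate.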

  11. Deep transcriptome sequencing provides new insights into the structural and functional organization of the wheat genome.

    Science.gov (United States)

    Pingault, Lise; Choulet, Frédéric; Alberti, Adriana; Glover, Natasha; Wincker, Patrick; Feuillet, Catherine; Paux, Etienne

    2015-02-10

    Because of its size, allohexaploid nature, and high repeat content, the bread wheat genome is a good model to study the impact of the genome structure on gene organization, function, and regulation. However, because of the lack of a reference genome sequence, such studies have long been hampered and our knowledge of the wheat gene space is still limited. The access to the reference sequence of the wheat chromosome 3B provided us with an opportunity to study the wheat transcriptome and its relationships to genome and gene structure at a level that has never been reached before. By combining this sequence with RNA-seq data, we construct a fine transcriptome map of the chromosome 3B. More than 8,800 transcription sites are identified, that are distributed throughout the entire chromosome. Expression level, expression breadth, alternative splicing as well as several structural features of genes, including transcript length, number of exons, and cumulative intron length are investigated. Our analysis reveals a non-monotonic relationship between gene expression and structure and leads to the hypothesis that gene structure is determined by its function, whereas gene expression is subject to energetic cost. Moreover, we observe a recombination-based partitioning at the gene structure and function level. Our analysis provides new insights into the relationships between gene and genome structure and function. It reveals mechanisms conserved with other plant species as well as superimposed evolutionary forces that shaped the wheat gene space, likely participating in wheat adaptation.

  12. Integrated genomic and gene expression profiling identifies two major genomic circuits in urothelial carcinoma.

    Directory of Open Access Journals (Sweden)

    David Lindgren

    Similar to other malignancies, urothelial carcinoma (UC) is characterized by specific recurrent chromosomal aberrations and gene mutations. However, the interconnection between specific genomic alterations, and how patterns of chromosomal alterations adhere to different molecular subgroups of UC, is less clear. We applied tiling resolution array CGH to 146 cases of UC and identified a number of regions harboring recurrent focal genomic amplifications and deletions. Several potential oncogenes were included in the amplified regions, including known oncogenes like E2F3, CCND1, and CCNE1, as well as new candidate genes, such as SETDB1 (1q21) and BCL2L1 (20q11). We next combined genome profiling with global gene expression, gene mutation, and protein expression data and identified two major genomic circuits operating in urothelial carcinoma. The first circuit was characterized by FGFR3 alterations, overexpression of CCND1, and 9q and CDKN2A deletions. The second circuit was defined by E2F3 amplifications and RB1 deletions, as well as gains of 5p, deletions at PTEN and 2q36, 16q, 20q, and elevated CDKN2A levels. TP53/MDM2 alterations were common for advanced tumors within the two circuits. Our data also suggest a possible RAS/RAF circuit. The tumors with the worst prognosis showed a gene expression profile that indicated a keratinized phenotype. Taken together, our integrative approach revealed at least two separate networks of genomic alterations linked to the molecular diversity seen in UC, and suggests that these circuits may reflect distinct pathways of tumor development.

  13. GenRGenS: Software for Generating Random Genomic Sequences and Structures

    OpenAIRE

    Ponty, Yann; Termier, Michel; Denise, Alain

    2006-01-01

    GenRGenS is a software tool dedicated to randomly generating genomic sequences and structures. It handles several classes of models useful for sequence analysis, such as Markov chains, hidden Markov models, weighted context-free grammars, regular expressions and PROSITE expressions. GenRGenS is the only program that can handle weighted context-free grammars, thus allowing the user to model and to generate structured objects (such as RNA secondary structures) of any giv...
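
    As an illustration of the simplest model class listed above, the sketch below draws a random sequence from a first-order Markov chain. It does not use or reproduce the GenRGenS interface: the function name, its parameters and the probability tables are hypothetical and serve only to show the idea.

```python
import random

def generate_markov_sequence(length, initial, transitions, seed=None):
    """Sample a sequence of the given length from a first-order Markov chain.

    initial:     dict mapping each symbol to its starting probability
    transitions: dict mapping each symbol to a dict of next-symbol probabilities
    """
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    symbols = list(initial)
    # Draw the first symbol from the initial distribution.
    sequence = rng.choices(symbols, weights=[initial[s] for s in symbols])
    # Each following symbol depends only on the symbol just before it.
    while len(sequence) < length:
        row = transitions[sequence[-1]]
        candidates = list(row)
        weights = [row[s] for s in candidates]
        sequence.append(rng.choices(candidates, weights=weights)[0])
    return "".join(sequence[:length])
```

    In practice the transition table would be estimated from observed sequences, for example from dinucleotide counts, before sampling.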

  14. A genome-wide map of aberrantly expressed chromosomal islands in colorectal cancer

    Directory of Open Access Journals (Sweden)

    Castanos-Velez Esmeralda

    2006-09-01

    Background: Cancer development is accompanied by genetic phenomena like deletion and amplification of chromosome parts or alterations of chromatin structure. It is expected that these mechanisms have a strong effect on regional gene expression. Results: We investigated genome-wide gene expression in colorectal carcinoma (CRC) and normal epithelial tissues from 25 patients using oligonucleotide arrays. This allowed us to identify 81 distinct chromosomal islands with aberrant gene expression. Of these, 38 islands show a gain in expression and 43 a loss of expression. In total, 7,892 genes (25.3% of all human genes) are located in aberrantly expressed islands. Many chromosomal regions that are linked to hereditary colorectal cancer show deregulated expression. Also, many known tumor genes localize to chromosomal islands of misregulated expression in CRC. Conclusion: An extensive comparison with published CGH data suggests that chromosomal regions known for frequent deletions in colon cancer tend to show reduced expression. In contrast, regions that are often amplified in colorectal tumors exhibit heterogeneous expression patterns: some even show a decrease of mRNA expression. Because chromosomal aberrations have never been observed for several islands of deregulated expression, we speculate that additional mechanisms (like abnormal states of regional chromatin) also have a substantial impact on the formation of co-expression islands in colorectal carcinoma.

  15. Genomic expression patterns of cardiac tissues from dogs with dilated cardiomyopathy.

    Science.gov (United States)

    Oyama, Mark A; Chittur, Sridar

    2005-07-01

    To evaluate global genome expression patterns of left ventricular tissues from dogs with dilated cardiomyopathy (DCM). Tissues obtained from the left ventricle of 2 Doberman Pinschers with end-stage DCM and 5 healthy control dogs. Transcriptional activities of 23,851 canine DNA sequences were determined by use of an oligonucleotide microarray. Genome expression patterns of DCM tissue were evaluated by measuring the relative amount of complementary RNA hybridization to the microarray probes and comparing it with gene expression for tissues from 5 healthy control dogs. 478 transcripts were differentially expressed (≥ 2.5-fold change). In DCM tissue, expression of 173 transcripts was upregulated and expression of 305 transcripts was downregulated, compared with expression for control tissues. Of the 478 transcripts, 167 genes could be specifically identified. These genes were grouped into 1 of 8 categories on the basis of their primary physiologic function. Grouping revealed that pathways involving cellular energy production, signaling and communication, and cell structure were generally downregulated, whereas pathways involving cellular defense and stress responses were upregulated. Many previously unreported genes that may contribute to the pathophysiologic aspects of heart disease were identified. Evaluation of global expression patterns provides a molecular portrait of heart failure, yields insights into the pathophysiologic aspects of DCM, and identifies intriguing genes and pathways for further study.
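
    The fold-change cut-off mentioned in this abstract can be sketched as a simple filter. This is an assumption-laden illustration rather than the study's method: it treats the threshold as symmetric (a ratio of at least 2.5 counts as upregulated, a ratio of at most 1/2.5 as downregulated), and the function name, the input layout and the transcript labels in the usage note are hypothetical.

```python
def classify_fold_changes(disease, control, threshold=2.5):
    """Split transcripts into up- and downregulated lists by fold change.

    disease, control: dicts mapping transcript IDs to mean expression signals
    threshold:        minimum fold change in either direction (assumed symmetric)
    """
    upregulated, downregulated = [], []
    for transcript, disease_signal in disease.items():
        control_signal = control[transcript]
        if disease_signal <= 0 or control_signal <= 0:
            continue  # a ratio is undefined without positive signals
        ratio = disease_signal / control_signal
        if ratio >= threshold:
            upregulated.append(transcript)
        elif ratio <= 1 / threshold:
            downregulated.append(transcript)
    return upregulated, downregulated
```

    For example, a transcript "t1" with signals 500 (disease) and 200 (control) has a ratio of 2.5 and is reported as upregulated, while "t3" with signals 30 and 100 has a ratio of 0.3 and is reported as downregulated.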

  16. Genome-wide association analyses of expression phenotypes.

    Science.gov (United States)

    Chen, Gary K; Zheng, Tian; Witte, John S; Goode, Ellen L; Gao, Lei; Hu, Pingzhao; Suh, Young Ju; Suktitipat, Bhoom; Szymczak, Silke; Woo, Jung Hoon; Zhang, Wei

    2007-01-01

    A number of issues arise when analyzing the large amount of data from high-throughput genotype and expression microarray experiments, including design and interpretation of genome-wide association studies of expression phenotypes. These issues were considered by contributions submitted to Group 1 of the Genetic Analysis Workshop 15 (GAW15), which focused on the association of quantitative expression data. These contributions evaluated diverse hypotheses, including those relevant to cancer and obesity research, and used various analytic techniques, many of which were derived from information theory. Several observations from these reports stand out. First, one needs to consider the genetic model of the trait of interest and carefully select which single nucleotide polymorphisms and individuals are included early in the design stage of a study. Second, by targeting specific pathways when analyzing genome-wide data, one can generate more interpretable results than agnostic approaches. Finally, for datasets with small sample sizes but a large number of features like the Genetic Analysis Workshop 15 dataset, machine learning approaches may be more practical than traditional parametric approaches. (c) 2007 Wiley-Liss, Inc.

  17. Child Development and Structural Variation in the Human Genome

    Science.gov (United States)

    Zhang, Ying; Haraksingh, Rajini; Grubert, Fabian; Abyzov, Alexej; Gerstein, Mark; Weissman, Sherman; Urban, Alexander E.

    2013-01-01

    Structural variation of the human genome sequence is the insertion, deletion, or rearrangement of stretches of DNA sequence sized from around 1,000 to millions of base pairs. Over the past few years, structural variation has been shown to be far more common in human genomes than previously thought. Very little is currently known about the effects…

  18. Genomes

    National Research Council Canada - National Science Library

    Brown, T. A. (Terence A.)

    2002-01-01

    ... of genome expression and replication processes, and transcriptomics and proteomics. This text is richly illustrated with clear, easy-to-follow, full color diagrams, which are downloadable from the book's website...

  19. Cognitive endophenotypes inform genome-wide expression profiling in schizophrenia.

    Science.gov (United States)

    Zheutlin, Amanda B; Viehman, Rachael W; Fortgang, Rebecca; Borg, Jacqueline; Smith, Desmond J; Suvisaari, Jaana; Therman, Sebastian; Hultman, Christina M; Cannon, Tyrone D

    2016-01-01

    We performed a whole-genome expression study to clarify the nature of the biological processes mediating between inherited genetic variations and cognitive dysfunction in schizophrenia. Gene expression was assayed from peripheral blood mononuclear cells using Illumina Human WG6 v3.0 chips in twins discordant for schizophrenia or bipolar disorder and control twins. After quality control, expression levels of 18,559 genes were screened for association with California Verbal Learning Test (CVLT) performance, and any memory-related probes were then evaluated for variation by diagnostic status in the discovery sample (N = 190) and in an independent replication sample (N = 73). Heritability of gene expression using the twin design was also assessed. After Bonferroni correction (p ... schizophrenia patients, with comparable effect sizes in the same direction in the replication sample. For 41 of these 43 transcripts, expression levels were heritable. Nearly all identified genes contain common or de novo mutations associated with schizophrenia in prior studies. Genes increasing risk for schizophrenia appear to do so in part via effects on signaling cascades influencing memory. The genes implicated in these processes are enriched for those related to RNA processing and DNA replication and include genes influencing G-protein coupled signal transduction, cytokine signaling, and oligodendrocyte function. (c) 2015 APA, all rights reserved.

  20. BarleyBase—an expression profiling database for plant genomics

    Science.gov (United States)

    Shen, Lishuang; Gong, Jian; Caldo, Rico A.; Nettleton, Dan; Cook, Dianne; Wise, Roger P.; Dickerson, Julie A.

    2005-01-01

    BarleyBase (BB) (www.barleybase.org) is an online database for plant microarrays with integrated tools for data visualization and statistical analysis. BB houses raw and normalized expression data from the two publicly available Affymetrix genome arrays, Barley1 and Arabidopsis ATH1 with plans to include the new Affymetrix 61K wheat, maize, soybean and rice arrays, as they become available. BB contains a broad set of query and display options at all data levels, ranging from experiments to individual hybridizations to probe sets down to individual probes. Users can perform cross-experiment queries on probe sets based on observed expression profiles and/or based on known biological information. Probe set queries are integrated with visualization and analysis tools such as the R statistical toolbox, data filters and a large variety of plot types. Controlled vocabularies for gene and plant ontologies, as well as interconnecting links to physical or genetic map and other genomic data in PlantGDB, Gramene and GrainGenes, allow users to perform EST alignments and gene function prediction using Barley1 exemplar sequences, thus, enhancing cross-species comparison. PMID:15608273

  1. Genome-wide identification, characterization, and expression profile of aquaporin gene family in flax (Linum usitatissimum).

    Science.gov (United States)

    Shivaraj, S M; Deshmukh, Rupesh K; Rai, Rhitu; Bélanger, Richard; Agrawal, Pawan K; Dash, Prasanta K

    2017-04-27

    Membrane intrinsic proteins (MIPs) form transmembrane channels and facilitate transport of myriad substrates across the cell membrane in many organisms. The majority of plant MIPs have water-transporting ability and are commonly referred to as aquaporins (AQPs). In the present study, we identified aquaporin-coding genes in flax by genome-wide analysis and examined their structure, function and expression pattern by pan-genome exploration. Cross-genera phylogenetic analysis with known aquaporins from rice, Arabidopsis, and poplar showed five subgroups of flax aquaporins, representing 16 plasma membrane intrinsic proteins (PIPs), 17 tonoplast intrinsic proteins (TIPs), 13 NOD26-like intrinsic proteins (NIPs), 2 small basic intrinsic proteins (SIPs), and 3 uncharacterized intrinsic proteins (XIPs). Among the aquaporins, PIPs contained a hydrophilic aromatic/arginine (ar/R) selectivity filter, whereas the TIP, NIP, SIP and XIP subfamilies mostly contained a hydrophobic ar/R selectivity filter. Analysis of RNA-seq and microarray data revealed high expression of PIPs in multiple tissues, low expression of NIPs, and seed-specific expression of TIP3 in flax. Exploration of aquaporin homologs in three closely related Linum species, L. bienne, L. grandiflorum and L. leonii, revealed the presence of 49, 39 and 19 AQPs, respectively. This genome-wide identification of aquaporins, the first in flax, provides insight for elucidating their physiological and developmental roles in flax.

  2. Structural development of child's artistic expression

    OpenAIRE

    Sanja Filipović; Milica Vojvodić

    2017-01-01

    Structural development implies control and capability of the expression usage in terms of independent creative expression and making. Understanding the structural development of a child's artistic expression as a phenomenon (appropriate to the child's age) has implications for the teaching methods used in the artistic education of children and young people. Therefore, it is of exceptional importance to know these laws as well as the methodical acts which encourage the structural develop...

  3. Genome-Wide Identification and Expression Analysis of WRKY Gene Family in Capsicum annuum L.

    Science.gov (United States)

    Diao, Wei-Ping; Snyder, John C; Wang, Shu-Bin; Liu, Jin-Bing; Pan, Bao-Gui; Guo, Guang-Jun; Wei, Ge

    2016-01-01

    The WRKY family of transcription factors is one of the most important families of plant transcriptional regulators, with members regulating multiple biological processes, especially defense against biotic and abiotic stresses. However, little information is available about WRKYs in pepper (Capsicum annuum L.). The recent release of completely assembled genome sequences of pepper allowed us to perform a genome-wide investigation of pepper WRKY proteins. In the present study, a total of 71 WRKY genes were identified in the pepper genome. According to structural features of their encoded proteins, the pepper WRKY genes (CaWRKY) were classified into three main groups, with the second group further divided into five subgroups. Genome mapping analysis revealed that CaWRKY genes were enriched on four chromosomes, especially on chromosome 1, and 15.5% of the family members were tandemly duplicated genes. A phylogenetic tree was constructed based on WRKY domain sequences derived from pepper and Arabidopsis. The expression of 21 selected CaWRKY genes in response to seven different biotic and abiotic stresses (salt, heat shock, drought, Phytophthora capsici, SA, MeJA, and ABA) was evaluated by quantitative RT-PCR. Some CaWRKYs were highly expressed and up-regulated by stress treatment. Our results will provide a platform for functional identification and molecular breeding studies of WRKY genes in pepper.

  4. Structural biology at York Structural Biology Laboratory; laboratory information management systems for structural genomics

    Czech Academy of Sciences Publication Activity Database

    Dohnálek, Jan

    2005-01-01

    Vol. 12, No. 1 (2005), p. 3. ISSN 1211-5894. [Meeting of Structural Biologists /4./, 10.03.2005-12.03.2005, Nové Hrady]. R&D Projects: GA MŠk(CZ) 1K05008. Keywords: structural biology; LIMS; structural genomics. Subject RIV: CD - Macromolecular Chemistry

  5. Genome-wide study of correlations between genomic features and their relationship with the regulation of gene expression.

    Science.gov (United States)

    Kravatsky, Yuri V; Chechetkin, Vladimir R; Tchurikov, Nikolai A; Kravatskaya, Galina I

    2015-02-01

    The broad class of tasks in genetics and epigenetics can be reduced to the study of various features that are distributed over the genome (genome tracks). The rapid and efficient processing of the huge amount of data stored in the genome-scale databases cannot be achieved without the software packages based on the analytical criteria. However, strong inhomogeneity of genome tracks hampers the development of relevant statistics. We developed the criteria for the assessment of genome track inhomogeneity and correlations between two genome tracks. We also developed a software package, Genome Track Analyzer, based on this theory. The theory and software were tested on simulated data and were applied to the study of correlations between CpG islands and transcription start sites in the Homo sapiens genome, between profiles of protein-binding sites in chromosomes of Drosophila melanogaster, and between DNA double-strand breaks and histone marks in the H. sapiens genome. Significant correlations between transcription start sites on the forward and the reverse strands were observed in genomes of D. melanogaster, Caenorhabditis elegans, Mus musculus, H. sapiens, and Danio rerio. The observed correlations may be related to the regulation of gene expression in eukaryotes. Genome Track Analyzer is freely available at http://ancorr.eimb.ru/. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  6. Mapping Determinants of Gene Expression Plasticity by Genetical Genomics in C. elegans

    NARCIS (Netherlands)

    Li, Y.; Alda Alvarez, O.; Gutteling, E.W.; Tijsterman, M.; Fu, J.; Riksen, J.A.G.; Hazendonk, E.; Prins, J.C.P.; Plasterk, R.H.A.; Jansen, R.C.; Breitling, R.; Kammenga, J.E.

    2006-01-01

    Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic

  7. Mapping determinants of gene expression plasticity by genetical genomics in C. elegans.

    NARCIS (Netherlands)

    Li, Y.; Alvarez, O.A.; Gutteling, E.W.; Tijsterman, M.; Fu, J.; Riksen, J.A.; Hazendonk, M.G.A.; Prins, P.; Plasterk, R.H.A.; Jansen, R.C.; Breitling, R.; Kammenga, J.E.

    2006-01-01

    Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic

  8. Effects of high hydrostatic pressure on genomic expression profiling of porcine parthenogenetic activated and cloned embryos

    DEFF Research Database (Denmark)

    Lin, Lin; Luo, Yonglun; Sørensen, Peter

    2014-01-01

    derived by PA or HMC. Hierarchical clustering depicted stage-specific genomic expression profiling. At the 4-cell and blastocyst stages, 103 and 163 transcripts were differentially expressed between the HMC and PA embryos, respectively (P

  9. Visualization of RNA structure models within the Integrative Genomics Viewer.

    Science.gov (United States)

    Busan, Steven; Weeks, Kevin M

    2017-07-01

    Analyses of the interrelationships between RNA structure and function are increasingly important components of genomic studies. The SHAPE-MaP strategy enables accurate RNA structure probing and realistic structure modeling of kilobase-length noncoding RNAs and mRNAs. Existing tools for visualizing RNA structure models are not suitable for efficient analysis of long, structurally heterogeneous RNAs. In addition, structure models are often advantageously interpreted in the context of other experimental data and gene annotation information, for which few tools currently exist. We have developed a module within the widely used and well supported open-source Integrative Genomics Viewer (IGV) that allows visualization of SHAPE and other chemical probing data, including raw reactivities, data-driven structural entropies, and data-constrained base-pair secondary structure models, in context with linear genomic data tracks. We illustrate the usefulness of visualizing RNA structure in the IGV by exploring structure models for a large viral RNA genome, comparing bacterial mRNA structure in cells with its structure under cell- and protein-free conditions, and comparing a noncoding RNA structure modeled using SHAPE data with a base-pairing model inferred through sequence covariation analysis. © 2017 Busan and Weeks; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  10. Pathgroups, a dynamic data structure for genome reconstruction problems.

    Science.gov (United States)

    Zheng, Chunfang

    2010-07-01

    Ancestral gene order reconstruction problems, including the median problem, quartet construction, small phylogeny, guided genome halving and genome aliquoting, are NP hard. Available heuristics dedicated to each of these problems are computationally costly for even small instances. We present a data structure enabling rapid heuristic solution to all these ancestral genome reconstruction problems. A generic greedy algorithm with look-ahead based on an automatically generated priority system suffices for all the problems using this data structure. The efficiency of the algorithm is due to fast updating of the structure during run time and to the simplicity of the priority scheme. We illustrate with the first rapid algorithm for quartet construction and apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny. Availability: http://albuquerque.bioinformatics.uottawa.ca/pathgroup/Quartet.html Contact: chunfang313@gmail.com Supplementary data are available at Bioinformatics online.

  11. Genome-wide gene expression regulation as a function of genotype and age in C. elegans

    NARCIS (Netherlands)

    Viñuela Rodriguez, A.; Snoek, L.B.; Riksen, J.A.G.; Kammenga, J.E.

    2010-01-01

    Gene expression becomes more variable with age, and it is widely assumed that this is due to a decrease in expression regulation. But currently there is no understanding how gene expression regulatory patterns progress with age. Here we explored genome-wide gene expression variation and regulatory

  12. Structural dynamics of retroviral genome and the packaging.

    Science.gov (United States)

    Miyazaki, Yasuyuki; Miyake, Ariko; Nomaguchi, Masako; Adachi, Akio

    2011-01-01

    Retroviruses can cause diseases such as AIDS, leukemia, and tumors, but are also used as vectors for human gene therapy. All retroviruses, except foamy viruses, package two copies of unspliced genomic RNA into their progeny viruses. Understanding the molecular mechanisms of retroviral genome packaging will aid the design of new anti-retroviral drugs targeting the packaging process and improve the efficacy of retroviral vectors. Retroviral genomes have to be specifically recognized by the cognate nucleocapsid domain of the Gag polyprotein from among an excess of cellular and spliced viral mRNA. Extensive virological and structural studies have revealed how retroviral genomic RNA is selectively packaged into the viral particles. The genomic area responsible for the packaging is generally located in the 5' untranslated region (5' UTR), and contains dimerization site(s). Recent studies have shown that retroviral genome packaging is modulated by structural changes of RNA at the 5' UTR accompanied by the dimerization. In this review, we focus on three representative retroviruses, Moloney murine leukemia virus, human immunodeficiency virus type 1 and 2, and describe the molecular mechanism of retroviral genome packaging.

  13. Structural dynamics of retroviral genome and the packaging

    Directory of Open Access Journals (Sweden)

    Yasuyuki Miyazaki

    2011-12-01

    Full Text Available Retroviruses can cause diseases such as AIDS, leukemia and tumors, but are also used as vectors for human gene therapy. All retroviruses, except foamy viruses, package two copies of unspliced genomic RNA into their progeny viruses. Understanding the molecular mechanisms of retroviral genome packaging will aid the design of new anti-retroviral drugs targeting the packaging process and improve the efficacy of retroviral vectors. Retroviral genomes have to be specifically recognized by the cognate nucleocapsid (NC) domain of the Gag polyprotein from among an excess of cellular and spliced viral mRNA. Extensive virological and structural studies have revealed how retroviral genomic RNA is selectively packaged into the viral particles. The genomic area responsible for the packaging is generally located in the 5’ untranslated region (5’ UTR), and contains dimerization site(s). Recent studies have shown that retroviral genome packaging is modulated by structural changes of RNA at the 5’ UTR accompanied by the dimerization. In this review, we focus on three representative retroviruses, Moloney murine leukemia virus (MoMLV), human immunodeficiency virus type 1 (HIV-1) and 2 (HIV-2), and describe the molecular mechanism of retroviral genome packaging.

  14. The genome and structural proteome of an ocean siphovirus: a new window into the cyanobacterial 'mobilome'.

    Science.gov (United States)

    Sullivan, Matthew B; Krastins, Bryan; Hughes, Jennifer L; Kelly, Libusha; Chase, Michael; Sarracino, David; Chisholm, Sallie W

    2009-11-01

    Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The approximately 108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element

  15. Functional RNA structures throughout the Hepatitis C Virus genome.

    Science.gov (United States)

    Adams, Rebecca L; Pirakitikulr, Nathan; Pyle, Anna Marie

    2017-06-01

    The single-stranded Hepatitis C Virus (HCV) genome adopts a set of elaborate RNA structures that are involved in every stage of the viral lifecycle. Recent advances in chemical probing, sequencing, and structural biology have facilitated analysis of RNA folding on a genome-wide scale, revealing novel structures and networks of interactions. These studies have underscored the active role played by RNA in every function of HCV and they open the door to new types of RNA-targeted therapeutics. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    International Nuclear Information System (INIS)

    Chechetkin, V.R.; Lobzin, V.V.

    2004-01-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions
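
    The quantity described in this record can be illustrated with a minimal Python sketch. The abstract gives no formulas, so the details here are assumptions: each nucleotide is turned into a 0/1 indicator track, the structure factors are taken as squared discrete Fourier amplitudes of those tracks, the four spectra are summed and normalized to unit total, and the entropy is the Shannon entropy of the result. The function names are hypothetical and the authors' own normalization may differ.

```python
import cmath
import math


def structure_factors(seq, base):
    """Squared DFT amplitudes of the 0/1 indicator track of `base`
    for harmonics n = 1 .. len(seq) // 2 (assumed definition)."""
    m = len(seq)
    indicator = [1.0 if c == base else 0.0 for c in seq]
    factors = []
    for n in range(1, m // 2 + 1):
        q = 2.0 * math.pi * n / m
        amp = sum(x * cmath.exp(-1j * q * k) for k, x in enumerate(indicator))
        factors.append(abs(amp) ** 2 / m)
    return factors


def spectral_entropy(seq, bases="ACGT"):
    """Shannon entropy of the normalized, base-summed spectrum.
    Low values indicate ordering concentrated in a few harmonics."""
    total = [0.0] * (len(seq) // 2)
    for b in bases:
        for i, f in enumerate(structure_factors(seq, b)):
            total[i] += f
    norm = sum(total)
    if norm == 0.0:
        return 0.0
    probs = [f / norm for f in total]
    return -sum(p * math.log(p) for p in probs if p > 0.0)
```

    Under these assumptions a strictly periodic sequence such as "ACG" repeated many times puts all spectral weight into one harmonic and gives an entropy near zero, while an irregular sequence spreads weight over many harmonics and gives a larger value. The direct sum costs O(M^2) operations; an FFT would be the natural replacement for long sequences.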

  17. Complete mitochondrial genome of Concholepas concholepas inferred by 454 pyrosequencing and mtDNA expression in two mollusc populations.

    Science.gov (United States)

    Núñez-Acuña, Gustavo; Aguilar-Espinoza, Andrea; Gallardo-Escárate, Cristian

    2013-03-01

    Despite the great relevance of mitochondrial genome analysis in evolutionary studies, there is scarce information on how the transcripts associated with the mitogenome are expressed and their role in the genetic structuring of populations. This work reports the complete mitochondrial genome of the marine gastropod Concholepas concholepas, obtained by 454 pyrosequencing, and an analysis of mitochondrial transcripts of two populations 1000 km apart along the Chilean coast. The mitochondrion of C. concholepas is 15,495 base pairs (bp) in size and contains the 37 subunits characteristic of metazoans, as well as a non-coding region of 330 bp. In silico analysis of mitochondrial gene variability showed significant differences among populations. In terms of levels of relative abundance of transcripts associated with mitochondrion in the two populations (assessed by qPCR), the genes associated with complexes III and IV of the mitochondrial genome had the highest levels of expression in the northern population while transcripts associated with the ATP synthase complex had the highest levels of expression in the southern population. Moreover, fifteen polymorphic SNPs were identified in silico between the mitogenomes of the two populations. Four of these markers implied different amino acid substitutions (non-synonymous SNPs). This work contributes novel information regarding the mitochondrial genome structure and mRNA expression levels of C. concholepas. Copyright © 2012 Elsevier Inc. All rights reserved.

  18. Structural Genomics and Drug Discovery for Infectious Diseases

    International Nuclear Information System (INIS)

    Anderson, W.F.

    2009-01-01

    The application of structural genomics methods and approaches to proteins from organisms causing infectious diseases is making available the three dimensional structures of many proteins that are potential drug targets and laying the groundwork for structure aided drug discovery efforts. There are a number of structural genomics projects with a focus on pathogens that have been initiated worldwide. The Center for Structural Genomics of Infectious Diseases (CSGID) was recently established to apply state-of-the-art high throughput structural biology technologies to the characterization of proteins from the National Institute for Allergy and Infectious Diseases (NIAID) category A-C pathogens and organisms causing emerging, or re-emerging infectious diseases. The target selection process emphasizes potential biomedical benefits. Selected proteins include known drug targets and their homologs, essential enzymes, virulence factors and vaccine candidates. The Center also provides a structure determination service for the infectious disease scientific community. The ultimate goal is to generate a library of structures that are available to the scientific community and can serve as a starting point for further research and structure aided drug discovery for infectious diseases. To achieve this goal, the CSGID will determine protein crystal structures of 400 proteins and protein-ligand complexes using proven, rapid, highly integrated, and cost-effective methods for such determination, primarily by X-ray crystallography. High throughput crystallographic structure determination is greatly aided by frequent, convenient access to high-performance beamlines at third-generation synchrotron X-ray sources.

  19. Genomic structural variation contributes to phenotypic change of industrial bioethanol yeast Saccharomyces cerevisiae.

    Science.gov (United States)

    Zhang, Ke; Zhang, Li-Jie; Fang, Ya-Hong; Jin, Xin-Na; Qi, Lei; Wu, Xue-Chang; Zheng, Dao-Qiong

    2016-03-01

    Genomic structural variation (GSV) is a ubiquitous phenomenon observed in the genomes of Saccharomyces cerevisiae strains with different genetic backgrounds; however, the physiological and phenotypic effects of GSV are not well understood. Here, we first revealed the genetic characteristics of a widely used industrial S. cerevisiae strain, ZTW1, by whole genome sequencing. ZTW1 was identified as an aneuploidy strain and a large-scale GSV was observed in the ZTW1 genome compared with the genome of a diploid strain YJS329. These GSV events led to copy number variations (CNVs) in many chromosomal segments as well as one whole chromosome in the ZTW1 genome. Changes in the DNA dosage of certain functional genes directly affected their expression levels and the resultant ZTW1 phenotypes. Moreover, CNVs of large chromosomal regions triggered an aneuploidy stress in ZTW1. This stress decreased the proliferation ability and tolerance of ZTW1 to various stresses, while aneuploidy response stress may also provide some benefits to the fermentation performance of the yeast, including increased fermentation rates and decreased byproduct generation. This work reveals genomic characters of the bioethanol S. cerevisiae strain ZTW1 and suggests that GSV is an important kind of mutation that changes the traits of industrial S. cerevisiae strains. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  20. Long-Range Order and Fractality in the Structure and Organization of Eukaryotic Genomes

    Science.gov (United States)

    Polychronopoulos, Dimitris; Tsiagkas, Giannis; Athanasopoulou, Labrini; Sellis, Diamantis; Almirantis, Yannis

    2014-12-01

    The late Professor J.S. Nicolis always emphasized, both in his writings and in presentations and discussions with students and friends, the relevance of a dynamical systems approach to biology. In particular, viewing the genome as a "biological text" captures the dynamical character of both the evolution and function of the organisms in the form of correlations indicating the presence of a long-range order. This genomic structure can be expressed in forms reminiscent of natural languages and several temporal and spatial traces left by the functioning of dynamical systems: Zipf laws, self-similarity and fractality. Here we review several works of our group and recent unpublished results, focusing on the chromosomal distribution of biologically active genomic components: genes and protein-coding segments, CpG islands, transposable elements belonging to all major classes and several types of conserved non-coding genomic elements. We report the systematic appearance of power-laws in the size distribution of the distances between elements belonging to each of these types of functional genomic elements. Moreover, fractality is also found in several cases, using box-counting and entropic scaling. We present here, for the first time in a unified way, an aggregative model of the genomic dynamics which can explain the observed patterns on the grounds of known phenomena accompanying genome evolution. Our results comply with recent findings about a "fractal globule" geometry of chromatin in the eukaryotic nucleus.
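
    The size distribution of distances between genomic elements mentioned in this record can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: elements are given as (start, end) coordinate pairs on one chromosome, a distance is the gap between the end of one element and the start of the next, and the power-law exponent is estimated crudely as the least-squares slope of log P(X >= x) against log x. Both function names are hypothetical.

```python
import math


def spacer_lengths(intervals):
    """Gaps between consecutive non-overlapping elements,
    given as (start, end) pairs in any order."""
    gaps = []
    last_end = None
    for start, end in sorted(intervals):
        if last_end is not None and start > last_end:
            gaps.append(start - last_end)
        last_end = end if last_end is None else max(last_end, end)
    return gaps


def loglog_slope(lengths):
    """Least-squares slope of log P(X >= x) versus log x.
    A roughly constant negative slope is the signature of a power law."""
    data = sorted(lengths)
    n = len(data)
    xs, ys = [], []
    for i, x in enumerate(data):
        if i > 0 and x == data[i - 1]:
            continue  # for ties, keep only the first (largest) tail count
        xs.append(math.log(x))
        ys.append(math.log((n - i) / n))
    if len(xs) < 2:
        raise ValueError("need at least two distinct lengths")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx
```

    A straight-line fit on a log-log plot is only a first look; maximum-likelihood estimation over a chosen lower cutoff is the more careful way to test whether a power law really holds.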

  1. Structural determinants and mechanism of HIV-1 genome packaging.

    Science.gov (United States)

    Lu, Kun; Heng, Xiao; Summers, Michael F

    2011-07-22

    Like all retroviruses, the human immunodeficiency virus selectively packages two copies of its unspliced RNA genome, both of which are utilized for strand-transfer-mediated recombination during reverse transcription, a process that enables rapid evolution under environmental and chemotherapeutic pressures. The viral RNA appears to be selected for packaging as a dimer, and there is evidence that dimerization and packaging are mechanistically coupled. Both processes are mediated by interactions between the nucleocapsid domains of a small number of assembling viral Gag polyproteins and RNA elements within the 5'-untranslated region of the genome. A number of secondary structures have been predicted for regions of the genome that are responsible for packaging, and high-resolution structures have been determined for a few small RNA fragments and protein-RNA complexes. However, major questions regarding the RNA structures (and potentially the structural changes) that are responsible for dimeric genome selection remain unanswered. Here, we review efforts that have been made to identify the molecular determinants and mechanism of human immunodeficiency virus type 1 genome packaging. Copyright © 2011 Elsevier Ltd. All rights reserved.

  2. On the expression strategy of the tospoviral genome

    NARCIS (Netherlands)

    Poelwijk, van F.

    1996-01-01


    The work described in this thesis was aimed at the unravelling of the molecular biology of tospoviruses, with special emphasis on the process of replication of the tripartite RNA genome.

    At the onset of the research the complete genome sequence of tomato spotted wilt virus (TSWV),

  3. Visual Comparison of Multiple Gene Expression Datasets in a Genomic Context

    Directory of Open Access Journals (Sweden)

    Borowski Krzysztof

    2008-06-01

    Full Text Available The need for novel methods of visualizing microarray data is growing. New perspectives are beneficial to finding patterns in expression data. The Bluejay genome browser provides an integrative way of visualizing gene expression datasets in a genomic context. We have now developed the functionality to display multiple microarray datasets simultaneously in Bluejay, in order to provide researchers with a comprehensive view of their datasets linked to a graphical representation of gene function. This will enable biologists to obtain valuable insights on expression patterns, by allowing them to analyze the expression values in relation to the gene locations as well as to compare expression profiles of related genomes or of different experiments for the same genome.

  4. In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features.

    Science.gov (United States)

    Ding, Yiliang; Tang, Yin; Kwok, Chun Kit; Zhang, Yu; Bevilacqua, Philip C; Assmann, Sarah M

    2014-01-30

    RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5' splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.

  5. Multi-scale structural community organisation of the human genome.

    Science.gov (United States)

    Boulos, Rasha E; Tremblay, Nicolas; Arneodo, Alain; Borgnat, Pierre; Audit, Benjamin

    2017-04-11

    Structural interaction frequency matrices between all genome loci are now experimentally achievable thanks to high-throughput chromosome conformation capture technologies. This raises a new methodological challenge for computational biology, which consists in objectively extracting from these data the structural motifs characteristic of genome organisation. We deployed the fast multi-scale community mining algorithm based on spectral graph wavelets to characterise the networks of intra-chromosomal interactions in human cell lines. We observed that there exist structural domains of all sizes up to chromosome length and demonstrated that the set of structural communities forms a hierarchy of chromosome segments. Hence, at all scales, chromosome folding predominantly involves interactions between neighbouring sites rather than the formation of links between distant loci. Multi-scale structural decomposition of human chromosomes provides an original framework to question structural organisation and its relationship to functional regulation across the scales. By construction the proposed methodology is independent of the precise assembly of the reference genome and is thus directly applicable to genomes whose assembly is not fully determined.

  6. The Impact of Structural Genomics: Expectations and Outcomes

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2005-12-21

    Structural Genomics (SG) projects aim to expand our structural knowledge of biological macromolecules, while lowering the average costs of structure determination. We quantitatively analyzed the novelty, cost, and impact of structures solved by SG centers, and contrasted these results with traditional structural biology. The first structure from a protein family is particularly important to reveal the fold and ancient relationships to other proteins. In the last year, approximately half of such structures were solved at a SG center rather than in a traditional laboratory. Furthermore, the cost of solving a structure at the most efficient U.S. center has now dropped to one-quarter the estimated cost of solving a structure by traditional methods. However, top structural biology laboratories are much more efficient than the average, and comparable to SG centers despite working on very challenging structures. Moreover, traditional structural biology papers are cited significantly more often, suggesting greater current impact.

  7. High throughput platforms for structural genomics of integral membrane proteins.

    Science.gov (United States)

    Mancia, Filippo; Love, James

    2011-08-01

    Structural genomics approaches on integral membrane proteins have been postulated for over a decade, yet specific efforts are lagging years behind their soluble counterparts. Indeed, high throughput methodologies for production and characterization of prokaryotic integral membrane proteins are only now emerging, while large-scale efforts for eukaryotic ones are still in their infancy. Presented here is a review of recent literature on ongoing structural genomics initiatives for membrane proteins, with a focus on those implementing techniques intended to increase the rate of success for this class of macromolecules. Copyright © 2011 Elsevier Ltd. All rights reserved.

  8. Megabase replication domains along the human genome: relation to chromatin structure and genome organisation.

    Science.gov (United States)

    Audit, Benjamin; Zaghloul, Lamia; Baker, Antoine; Arneodo, Alain; Chen, Chun-Long; d'Aubenton-Carafa, Yves; Thermes, Claude

    2013-01-01

    In higher eukaryotes, the absence of specific sequence motifs marking the origins of replication has been a serious hindrance to the understanding of (i) the mechanisms that regulate the spatio-temporal replication program, and (ii) the links between origin activation, chromatin structure and transcription. In this chapter, we review the partitioning of the human genome into megabase-sized replication domains delineated as N-shaped motifs in the strand compositional asymmetry profiles. They collectively span 28.3% of the genome and are bordered by more than 1,000 putative replication origins. We recapitulate the comparison of this partition of the human genome with high-resolution experimental data that confirms that replication domain borders are likely to be preferential replication initiation zones in the germline. In addition, we highlight the specific distribution of experimental and numerical chromatin marks along replication domains. Domain borders correspond to particular open chromatin regions, possibly encoded in the DNA sequence, and around which replication and transcription are highly coordinated. These regions also present a high evolutionary breakpoint density, suggesting that susceptibility to breakage might be linked to local open chromatin fiber state. Altogether, this chapter presents a compartmentalization of the human genome into replication domains that are landmarks of the human genome organization and are likely to play a key role in genome dynamics during evolution and in pathological situations.
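
    A strand compositional asymmetry profile of the kind referred to in this record can be sketched as follows. The abstract does not define the profile, so this Python example assumes a commonly used form: the total skew S = (T - A)/(T + A) + (G - C)/(G + C), evaluated in consecutive non-overlapping windows along one strand. The function name and the window parameter are hypothetical.

```python
def window_skew(seq, window):
    """Total skew S = (T-A)/(T+A) + (G-C)/(G+C) per non-overlapping
    window of `window` bases; a trailing partial window is dropped."""
    profile = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window].upper()
        a, c, g, t = (chunk.count(b) for b in "ACGT")
        # Guard against windows lacking the relevant base pair entirely.
        s_ta = (t - a) / (t + a) if t + a else 0.0
        s_gc = (g - c) / (g + c) if g + c else 0.0
        profile.append(s_ta + s_gc)
    return profile
```

    With such a profile, an N-shaped motif would appear as an abrupt upward jump in skew at each domain border followed by a gradual decrease across the domain. Detecting those motifs requires a segmentation step that this sketch does not attempt.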

  9. Structural development of child's artistic expression

    Directory of Open Access Journals (Sweden)

    Sanja Filipović

    2017-03-01

    Full Text Available Structural development implies control of, and capability in, the use of expressive means for independent creative expression and making. Understanding the structural development of a child's artistic expression as an age-appropriate phenomenon has implications for teaching methods in the artistic education of children and young people. It is therefore of exceptional importance to know these regularities, as well as the teaching methods that encourage the structural development of artistic capabilities from an early age. Various experts have dealt with this phenomenon, particularly Bogomil Karlavaris, who devoted an exceptional part of his methodological research to this problem. His research is the starting point for the analysis of certain methodological questions addressed in this work.

  10. Evolutionary genomics and population structure of Entamoeba histolytica

    Directory of Open Access Journals (Sweden)

    Koushik Das

    2014-11-01

    Full Text Available Amoebiasis caused by the gastrointestinal parasite Entamoeba histolytica has diverse disease outcomes. Study of genome and evolution of this fascinating parasite will help us to understand the basis of its virulence and explain why, when and how it causes diseases. In this review, we have summarized current knowledge regarding evolutionary genomics of E. histolytica and discussed their association with parasite phenotypes and its differential pathogenic behavior. How genetic diversity reveals parasite population structure has also been discussed. Queries concerning their evolution and population structure which were required to be addressed have also been highlighted. This significantly large amount of genomic data will improve our knowledge about this pathogenic species of Entamoeba.

  11. Analysis of genomic imbalances and gene expression changes in transformed follicular lymphoma (FL)

    DEFF Research Database (Denmark)

    Obel, G.; Farinha, P.; Lam, W.

    2005-01-01

    American patients with transformed FL. Methods: High-resolution BAC-array comparative genomic hybridisation (CGH) was used to detect genomic imbalances. Gene expression profiling was performed using cDNA microarrays (Affymetrix). Results: Of 9 biopsy pairs identified so far, analysis results of the first 4...

  12. House spider genome uncovers evolutionary shifts in the diversity and expression of black widow venom proteins associated with extreme toxicity.

    Science.gov (United States)

    Gendreau, Kerry L; Haney, Robert A; Schwager, Evelyn E; Wierschin, Torsten; Stanke, Mario; Richards, Stephen; Garb, Jessica E

    2017-02-16

    Black widow spiders are infamous for their neurotoxic venom, which can cause extreme and long-lasting pain. This unusual venom is dominated by latrotoxins and latrodectins, two protein families virtually unknown outside of the black widow genus Latrodectus, that are difficult to study given the paucity of spider genomes. Using tissue-, sex- and stage-specific expression data, we analyzed the recently sequenced genome of the house spider (Parasteatoda tepidariorum), a close relative of black widows, to investigate latrotoxin and latrodectin diversity, expression and evolution. We discovered at least 47 latrotoxin genes in the house spider genome, many of which are tandem-arrayed. Latrotoxins vary extensively in predicted structural domains and expression, implying their significant functional diversification. Phylogenetic analyses show latrotoxins have substantially duplicated after the Latrodectus/Parasteatoda split and that they are also related to proteins found in endosymbiotic bacteria. Latrodectin genes are less numerous than latrotoxins, but analyses show their recruitment for venom function from neuropeptide hormone genes following duplication, inversion and domain truncation. While latrodectins and other peptides are highly expressed in house spider and black widow venom glands, latrotoxins account for a far smaller percentage of house spider venom gland expression. The house spider genome sequence provides novel insights into the evolution of venom toxins once considered unique to black widows. Our results greatly expand the size of the latrotoxin gene family, reinforce its narrow phylogenetic distribution, and provide additional evidence for the lateral transfer of latrotoxins between spiders and bacterial endosymbionts. Moreover, we strengthen the evidence for the evolution of latrodectin venom genes from the ecdysozoan Ion Transport Peptide (ITP)/Crustacean Hyperglycemic Hormone (CHH) neuropeptide superfamily. The lower expression of latrotoxins in

  13. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    DEFF Research Database (Denmark)

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their precise...

  14. Structured RNAs and synteny regions in the pig genome

    DEFF Research Database (Denmark)

    Anthon, Christian; Tafer, Hakim; Havgaard, Jakob H

    2014-01-01

    BACKGROUND: Annotating mammalian genomes for noncoding RNAs (ncRNAs) is nontrivial since far from all ncRNAs are known and the computational models are resource demanding. Currently, the human genome holds the best mammalian ncRNA annotation, a result of numerous efforts by several groups. However......, a more direct strategy is desired for the increasing number of sequenced mammalian genomes of which some, such as the pig, are relevant as disease models and production animals. RESULTS: We present a comprehensive annotation of structured RNAs in the pig genome. Combining sequence and structure...... lncRNA loci, 11 conflicts of annotation, and 3,183 ncRNA genes. The ncRNA genes comprise 359 miRNAs, 8 ribozymes, 185 rRNAs, 638 snoRNAs, 1,030 snRNAs, 810 tRNAs and 153 ncRNA genes not belonging to the aforementioned classes. When running the pipeline on a local shuffled version of the genome...

  15. Genome-wide Identification and Expression Analysis of the CDPK Gene Family in Grape, Vitis spp.

    Science.gov (United States)

    Zhang, Kai; Han, Yong-Tao; Zhao, Feng-Li; Hu, Yang; Gao, Yu-Rong; Ma, Yan-Fei; Zheng, Yi; Wang, Yue-Jin; Wen, Ying-Qiang

    2015-06-30

    Calcium-dependent protein kinases (CDPKs) play vital roles in plant growth and development, biotic and abiotic stress responses, and hormone signaling. Little is known about the CDPK gene family in grapevine. In this study, we performed a genome-wide analysis of the 12X grape genome (Vitis vinifera) and identified nineteen CDPK genes. Comparison of the structures of grape CDPK genes allowed us to examine their functional conservation and differentiation. Segmentally duplicated grape CDPK genes showed high structural conservation and contributed to gene family expansion. Additional comparisons between grape and Arabidopsis thaliana demonstrated that several grape CDPK genes occurred in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes arose before the divergence of grapevine and Arabidopsis. Phylogenetic analysis divided the grape CDPK genes into four groups. Furthermore, we examined the expression of the corresponding nineteen homologous CDPK genes in the Chinese wild grape (Vitis pseudoreticulata) under various conditions, including biotic stress, abiotic stress, and hormone treatments. The expression profiles derived from reverse transcription and quantitative PCR suggested that a large number of VpCDPKs responded to various stimuli on the transcriptional level, indicating their versatile roles in the responses to biotic and abiotic stresses. Moreover, we examined the subcellular localization of VpCDPKs by transiently expressing six VpCDPK-GFP fusion proteins in Arabidopsis mesophyll protoplasts; this revealed high variability consistent with potential functional differences. Taken as a whole, our data provide significant insights into the evolution and function of grape CDPKs and a framework for future investigation of grape CDPK genes.

  16. Genomic variation and its impact on gene expression in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Andreas Massouras

    Full Text Available Understanding the relationship between genetic and phenotypic variation is one of the great outstanding challenges in biology. To meet this challenge, comprehensive genomic variation maps of human as well as of model organism populations are required. Here, we present a nucleotide resolution catalog of single-nucleotide, multi-nucleotide, and structural variants in 39 Drosophila melanogaster Genetic Reference Panel inbred lines. Using an integrative, local assembly-based approach for variant discovery, we identify more than 3.6 million distinct variants, among which were more than 800,000 unique insertions, deletions (indels), and complex variants (1 to 6,000 bp). While the SNP density is higher near other variants, we find that variants themselves are not mutagenic, nor are regions with high variant density particularly mutation-prone. Rather, our data suggest that the elevated SNP density around variants is mainly due to population-level processes. We also provide insights into the regulatory architecture of gene expression variation in adult flies by mapping cis-expression quantitative trait loci (cis-eQTLs) for more than 2,000 genes. Indels comprise around 10% of all cis-eQTLs and show larger effects than SNP cis-eQTLs. In addition, we identified two-fold more gene associations in males as compared to females and found that most cis-eQTLs are sex-specific, revealing a partial decoupling of the genomic architecture between the sexes as well as the importance of genetic factors in mediating sex-biased gene expression. Finally, we performed RNA-seq-based allelic expression imbalance analyses in the offspring of crosses between sequenced lines, which revealed that the majority of strong cis-eQTLs can be validated in heterozygous individuals.

  17. Genomic mid-range inhomogeneity correlates with an abundance of RNA secondary structures

    Directory of Open Access Journals (Sweden)

    Song Jun

    2008-06-01

    Full Text Available Abstract Background: Genomes possess different levels of non-randomness, in particular, an inhomogeneity in their nucleotide composition. Inhomogeneity is manifest from the short-range where neighboring nucleotides influence the choice of base at a site, to the long-range, commonly known as isochores, where a particular base composition can span millions of nucleotides. A separate genomic issue that has yet to be thoroughly elucidated is the role that RNA secondary structure (SS) plays in gene expression. Results: We present novel data and approaches that show that a mid-range inhomogeneity (~30 to 1000 nt) not only exists in mammalian genomes but is also significantly associated with strong RNA SS. A whole-genome bioinformatics investigation of local SS in a set of 11,315 non-redundant human pre-mRNA sequences has been carried out. Four distinct components of these molecules (5'-UTRs, exons, introns and 3'-UTRs) were considered separately, since they differ in overall nucleotide composition, sequence motifs and periodicities. For each pre-mRNA component, the abundance of strong local SS ( Conclusion: We demonstrate that the excess of strong local SS in pre-mRNAs is linked to the little explored phenomenon of genomic mid-range inhomogeneity (MRI). MRI is an interdependence between nucleotide choice and base composition over a distance of 20–1000 nt. Additionally, we have created a public computational resource to support further study of genomic MRI.

  18. Improved methods and resources for paramecium genomics: transcription units, gene annotation and gene expression.

    Science.gov (United States)

    Arnaiz, Olivier; Van Dijk, Erwin; Bétermier, Mireille; Lhuillier-Akakpo, Maoussi; de Vanssay, Augustin; Duharcourt, Sandra; Sallet, Erika; Gouzy, Jérôme; Sperling, Linda

    2017-06-26

    The 15 sibling species of the Paramecium aurelia cryptic species complex emerged after a whole genome duplication that occurred tens of millions of years ago. Given extensive knowledge of the genetics and epigenetics of Paramecium acquired over the last century, this species complex offers a uniquely powerful system to investigate the consequences of whole genome duplication in a unicellular eukaryote as well as the genetic and epigenetic mechanisms that drive speciation. High quality Paramecium gene models are important for research using this system. The major aim of the work reported here was to build an improved gene annotation pipeline for the Paramecium lineage. We generated oriented RNA-Seq transcriptome data across the sexual process of autogamy for the model species Paramecium tetraurelia. We determined, for the first time in a ciliate, candidate P. tetraurelia transcription start sites using an adapted Cap-Seq protocol. We developed TrUC, multi-threaded Perl software that in conjunction with TopHat mapping of RNA-Seq data to a reference genome, predicts transcription units for the annotation pipeline. We used EuGene software to combine annotation evidence. The high quality gene structural annotations obtained for P. tetraurelia were used as evidence to improve published annotations for 3 other Paramecium species. The RNA-Seq data were also used for differential gene expression analysis, providing a gene expression atlas that is more sensitive than the previously established microarray resource. We have developed a gene annotation pipeline tailored for the compact genomes and tiny introns of Paramecium species. A novel component of this pipeline, TrUC, predicts transcription units using Cap-Seq and oriented RNA-Seq data. TrUC could prove useful beyond Paramecium, especially in the case of high gene density. Accurate predictions of 3' and 5' UTR will be particularly valuable for studies of gene expression (e.g. nucleosome positioning, identification of cis

  19. Family emotional expressiveness and family structure

    Directory of Open Access Journals (Sweden)

    Čotar-Konrad Sonja

    2016-01-01

    Full Text Available The present paper scrutinizes the relationship between family emotional expressiveness (i.e., the tendency to express dominant and/or submissive positive and negative emotions) and components of family structure as proposed in Olson’s Circumplex model (i.e., cohesion and flexibility, family communication, and satisfaction) in families with adolescents. The study was conducted on a sample of 514 Slovenian adolescents, who filled out two questionnaires: the Slovenian version of Family Emotional Expressiveness - FEQ and FACES IV. The results revealed that all four basic dimensions of family functioning were significantly associated with higher/more frequent expressions of positive submissive emotions, as well as with lower/less frequent expressions of negative dominant emotions. Moreover, expressions of negative submissive emotions explained a small, but significant amount of variance in three out of four family functioning variables (satisfaction, flexibility, and communication). The importance of particular aspects of emotional expressiveness for family cohesion, flexibility, communication, and satisfaction is discussed, and the relevance of present findings for family counselling is outlined.

  20. A comprehensive evaluation of rodent malaria parasite genomes and gene expression

    KAUST Repository

    Otto, Thomas D

    2014-10-30

    Background: Rodent malaria parasites (RMP) are used extensively as models of human malaria. Draft RMP genomes have been published for Plasmodium yoelii, P. berghei ANKA (PbA) and P. chabaudi AS (PcAS). Although availability of these genomes made a significant impact on recent malaria research, these genomes were highly fragmented and were annotated with little manual curation. The fragmented nature of the genomes has hampered genome wide analysis of Plasmodium gene regulation and function. Results: We have greatly improved the genome assemblies of PbA and PcAS, newly sequenced the virulent parasite P. yoelii YM genome, sequenced additional RMP isolates/lines and have characterized genotypic diversity within RMP species. We have produced RNA-seq data and utilized it to improve gene-model prediction and to provide quantitative, genome-wide, data on gene expression. Comparison of the RMP genomes with the genome of the human malaria parasite P. falciparum and RNA-seq mapping permitted gene annotation at base-pair resolution. Full-length chromosomal annotation permitted a comprehensive classification of all subtelomeric multigene families including the 'Plasmodium interspersed repeat genes' (pir). Phylogenetic classification of the pir family, combined with pir expression patterns, indicates functional diversification within this family. Conclusions: Complete RMP genomes, RNA-seq and genotypic diversity data are excellent and important resources for gene-function and post-genomic analyses and to better interrogate Plasmodium biology. Genotypic diversity between P. chabaudi isolates makes this species an excellent parasite to study genotype-phenotype relationships. The improved classification of multigene families will enhance studies on the role of (variant) exported proteins in virulence and immune evasion/modulation.

  1. Genome-wide identification, functional analysis and expression ...

    African Journals Online (AJOL)

    The plant pleiotropic drug resistance (PDR) family of ATP-binding cassette (ABC) transporters has comprehensively been researched in relation to transport of antifungal agents and resistant pathogens. In our study, analyses of the whole family of PDR genes present in the potato genome were provided. This analysis ...

  2. Genomic and expression analysis of the vanG-like gene cluster of Clostridium difficile.

    Science.gov (United States)

    Peltier, Johann; Courtin, Pascal; El Meouche, Imane; Catel-Ferreira, Manuella; Chapot-Chartier, Marie-Pierre; Lemée, Ludovic; Pons, Jean-Louis

    2013-07-01

    Primary antibiotic treatment of Clostridium difficile intestinal diseases requires metronidazole or vancomycin therapy. A cluster of genes homologous to enterococcal glycopeptides resistance vanG genes was found in the genome of C. difficile 630, although this strain remains sensitive to vancomycin. This vanG-like gene cluster was found to consist of five ORFs: the regulatory region consisting of vanR and vanS and the effector region consisting of vanG, vanXY and vanT. We found that 57 out of 83 C. difficile strains, representative of the main lineages of the species, harbour this vanG-like cluster. The cluster is expressed as an operon and, when present, is found at the same genomic location in all strains. The vanG, vanXY and vanT homologues in C. difficile 630 are co-transcribed and expressed to a low level throughout the growth phases in the absence of vancomycin. Conversely, the expression of these genes is strongly induced in the presence of subinhibitory concentrations of vancomycin, indicating that the vanG-like operon is functional at the transcriptional level in C. difficile. Hydrophilic interaction liquid chromatography (HILIC-HPLC) and MS analysis of cytoplasmic peptidoglycan precursors of C. difficile 630 grown without vancomycin revealed the exclusive presence of a UDP-MurNAc-pentapeptide with an alanine at the C terminus. UDP-MurNAc-pentapeptide [d-Ala] was also the only peptidoglycan precursor detected in C. difficile grown in the presence of vancomycin, corroborating the lack of vancomycin resistance. Peptidoglycan structures of a vanG-like mutant strain and of a strain lacking the vanG-like cluster did not differ from the C. difficile 630 strain, indicating that the vanG-like cluster also has no impact on cell-wall composition.

  3. [Genome-wide identification and expression analysis of the WRKY gene family in peach].

    Science.gov (United States)

    Gu, Yan-bing; Ji, Zhi-rui; Chi, Fu-mei; Qiao, Zhuang; Xu, Cheng-nan; Zhang, Jun-xiang; Zhou, Zong-shan; Dong, Qing-long

    2016-03-01

    The WRKY transcription factors are one of the largest families of transcriptional regulators and play diverse regulatory roles in biotic and abiotic stresses, plant growth and development processes. In this study, the WRKY DNA-binding domain (Pfam Database number: PF03106) downloaded from the Pfam protein families database was used to identify WRKY genes in the peach (Prunus persica 'Lovell') genome using HMMER 3.0. The obtained amino acid sequences were analyzed with the DNAMAN 5.0, WebLogo 3, MEGA 5.1, MapInspect and MEME bioinformatics software. A total of 61 WRKY genes were found in the peach genome. Our phylogenetic analysis revealed that peach WRKY genes were classified into three groups: Ⅰ, Ⅱ and Ⅲ. The WRKY N-terminal and C-terminal domains of Group Ⅰ (group Ⅰ-N and group Ⅰ-C) were monophyletic. Group Ⅱ was sub-divided into five distinct clades (group Ⅱ-a, Ⅱ-b, Ⅱ-c, Ⅱ-d and Ⅱ-e). Our domain analysis indicated that the WRKY regions contained a highly conserved heptapeptide stretch, WRKYGQK, at the N-terminus followed by a zinc-finger motif. The chromosome mapping analysis showed that peach WRKY genes were distributed with different densities over 8 chromosomes. The intron-exon structure analysis revealed that the structures of the WRKY genes were highly conserved in peach. The conserved motif analysis showed that conserved motifs 1, 2 and 3, which specify the WRKY domain, were observed in all peach WRKY proteins; motif 5, an unknown domain, was observed in group Ⅱ-d; and two WRKY domains were assigned to Group Ⅰ. SqRT-PCR and qRT-PCR results indicated that 16 PpWRKY genes were expressed in roots, stems, leaves, flowers and fruits at various expression levels. Our analysis thus identified the PpWRKY gene family, and future functional studies are needed to reveal its specific roles.
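
    The conserved heptapeptide described above lends itself to a small illustration. The sketch below is not the profile-HMM search the authors ran with HMMER; it is a much cruder exact-match scan for the WRKYGQK core, and the function name and example sequences are hypothetical.

```python
import re

# Exact-match pattern for the conserved WRKY heptapeptide named in the abstract.
WRKY_CORE = re.compile(r"WRKYGQK")


def find_wrky_cores(protein_seq):
    """Return the 0-based start position of every WRKYGQK occurrence."""
    return [match.start() for match in WRKY_CORE.finditer(protein_seq.upper())]


# Hypothetical toy sequences: one core, then two cores in a single protein.
print(find_wrky_cores("MAAWRKYGQKTT"))
print(find_wrky_cores("WRKYGQKAAAWRKYGQK"))
```

    An exact match misses variant heptapeptides and ignores the zinc-finger motif, which is one reason a profile-based search is the more usual choice for family-wide identification.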

  4. Chromatin structure and evolution in the human genome

    Directory of Open Access Journals (Sweden)

    Dunlop Malcolm G

    2007-05-01

    Full Text Available Abstract Background: Evolutionary rates are not constant across the human genome but genes in close proximity have been shown to experience similar levels of divergence and selection. The higher-order organisation of chromosomes has often been invoked to explain such phenomena but previously there has been insufficient data on chromosome structure to investigate this rigorously. Using the results of a recent genome-wide analysis of open and closed human chromatin structures we have investigated the global association between divergence, selection and chromatin structure for the first time. Results: In this study we have shown that, paradoxically, synonymous site divergence (dS) at non-CpG sites is highest in regions of open chromatin, primarily as a result of an increased number of transitions, while the rates of other traditional measures of mutation (intergenic, intronic and ancient repeat divergence as well as SNP density) are highest in closed regions of the genome. Analysis of human-chimpanzee divergence across intron-exon boundaries indicates that although genes in relatively open chromatin generally display little selection at their synonymous sites, those in closed regions show markedly lower divergence at their fourfold degenerate sites than in neighbouring introns and intergenic regions. Exclusion of known Exonic Splice Enhancer hexamers has little effect on the divergence observed at fourfold degenerate sites across chromatin categories; however, we show that closed chromatin is enriched with certain classes of ncRNA genes whose RNA secondary structure may be particularly important. Conclusion: We conclude that, overall, non-CpG mutation rates are lowest in open regions of the genome and that regions of the genome with a closed chromatin structure have the highest background mutation rate. This might reflect lower rates of DNA damage or enhanced DNA repair processes in regions of open chromatin. Our results also indicate that dS is a poor

  5. The genomic structure of the DMBT1 gene

    DEFF Research Database (Denmark)

    Mollenhauer, J; Holmskov, U; Wiemann, S

    1999-01-01

    Increasing evidence has accumulated for an involvement of the inactivation of tumour suppressor genes at chromosome 10q in the carcinogenesis of brain tumours, melanomas, and carcinomas of the lung, the prostate, the pancreas, and the endometrium. The gene DMBT1 (Deleted in Malignant Brain Tumours...... 1) is located at chromosome 10q25.3-q26.1, within one of the putative intervals for tumour suppressor genes. DMBT1 is a member of the scavenger-receptor cysteine-rich (SRCR) superfamily and displays homozygous deletions or lack of expression in glioblastoma multiforme, medulloblastoma......, and in gastrointestinal and lung cancers. Based on these properties, DMBT1 has been proposed to be a candidate tumour suppressor gene. We have determined the genomic sequence of DMBT1 to allow analyses of mutations. The gene has at least 54 exons that span a genomic region of about 80 kb. We have identified a putative...

  6. Genomic organization, expression, and chromosome localization of a third aurora-related kinase gene, Aie1.

    Science.gov (United States)

    Hu, H M; Chuang, C K; Lee, M J; Tseng, T C; Tang, T K

    2000-11-01

    We previously reported two novel testis-specific serine/threonine kinases, Aie1 (mouse) and AIE2 (human), that share high amino acid identities with the kinase domains of fly aurora and yeast Ipl1. Here, we report the entire intron-exon organization of the Aie1 gene and analyze the expression patterns of Aie1 mRNA during testis development. The mouse Aie1 gene spans approximately 14 kb and contains seven exons. The sequences of the exon-intron boundaries of the Aie1 gene conform to the consensus sequences (GT/AG) of the splicing donor and acceptor sites of most eukaryotic genes. Comparative genomic sequencing revealed that the gene structure is highly conserved between mouse Aie1 and human AIE2. However, much less homology was found in the sequence outside the kinase-coding domains. The Aie1 locus was mapped to mouse chromosome 7A2-A3 by fluorescent in situ hybridization. Northern blot analysis indicates that Aie1 mRNA likely is expressed at a low level on day 14 and reaches its plateau on day 21 in the developing postnatal testis. RNA in situ hybridization indicated that the expression of the Aie1 transcript was restricted to meiotically active germ cells, with the highest levels detected in spermatocytes at the late pachytene stage. These findings suggest that Aie1 plays a role in spermatogenesis.

  7. Genome-wide identification and expression analysis of the WRKY gene family in cassava

    Directory of Open Access Journals (Sweden)

    Yunxie eWei

    2016-02-01

    Full Text Available The WRKY family, a large family of transcription factors (TFs) found in higher plants, plays central roles in many aspects of physiological processes and adaption to environment. However, little information is available regarding the WRKY family in cassava (Manihot esculenta). In the present study, 85 WRKY genes were identified from the cassava genome and classified into three groups according to conserved WRKY domains and zinc-finger structure. Conserved motif analysis showed that all of the identified MeWRKYs had the conserved WRKY domain. Gene structure analysis suggested that the number of introns in MeWRKY genes varied from 1 to 5, with the majority of MeWRKY genes containing 3 exons. Expression profiles of MeWRKY genes in different tissues and in response to drought stress were analyzed using the RNA-seq technique. The results showed that 72 MeWRKY genes had differential expression in their transcript abundance and 78 MeWRKY genes were differentially expressed in response to drought stresses in different accessions, indicating their contribution to plant developmental processes and drought stress resistance in cassava. Finally, the expression of 9 WRKY genes was analyzed by qRT-PCR under osmotic, salt, ABA, H2O2, and cold treatments, indicating that MeWRKYs may be involved in different signaling pathways. Taken together, this systematic analysis identifies some tissue-specific and abiotic stress-responsive candidate MeWRKY genes for further functional assays in planta, and provides a solid foundation for understanding of abiotic stress responses and signal transduction mediated by WRKYs in cassava.

  8. Genome-Wide Identification and Expression Analysis of the WRKY Gene Family in Cassava.

    Science.gov (United States)

    Wei, Yunxie; Shi, Haitao; Xia, Zhiqiang; Tie, Weiwei; Ding, Zehong; Yan, Yan; Wang, Wenquan; Hu, Wei; Li, Kaimian

    2016-01-01

    The WRKY family, a large family of transcription factors (TFs) found in higher plants, plays central roles in many aspects of physiological processes and adaption to environment. However, little information is available regarding the WRKY family in cassava (Manihot esculenta). In the present study, 85 WRKY genes were identified from the cassava genome and classified into three groups according to conserved WRKY domains and zinc-finger structure. Conserved motif analysis showed that all of the identified MeWRKYs had the conserved WRKY domain. Gene structure analysis suggested that the number of introns in MeWRKY genes varied from 1 to 5, with the majority of MeWRKY genes containing three exons. Expression profiles of MeWRKY genes in different tissues and in response to drought stress were analyzed using the RNA-seq technique. The results showed that 72 MeWRKY genes had differential expression in their transcript abundance and 78 MeWRKY genes were differentially expressed in response to drought stresses in different accessions, indicating their contribution to plant developmental processes and drought stress resistance in cassava. Finally, the expression of 9 WRKY genes was analyzed by qRT-PCR under osmotic, salt, ABA, H2O2, and cold treatments, indicating that MeWRKYs may be involved in different signaling pathways. Taken together, this systematic analysis identifies some tissue-specific and abiotic stress-responsive candidate MeWRKY genes for further functional assays in planta, and provides a solid foundation for understanding of abiotic stress responses and signal transduction mediated by WRKYs in cassava.

  9. A hidden Markov model approach for determining expression from genomic tiling micro arrays

    Directory of Open Access Journals (Sweden)

    Krogh Anders

    2006-05-01

    Full Text Available Abstract Background: Genomic tiling micro arrays have great potential for identifying previously undiscovered coding as well as non-coding transcription. To date, however, analyses of these data have been performed in an ad hoc fashion. Results: We present a probabilistic procedure, ExpressHMM, that adaptively models tiling data prior to predicting expression on genomic sequence. A hidden Markov model (HMM) is used to model the distributions of tiling array probe scores in expressed and non-expressed regions. The HMM is trained on sets of probes mapped to regions of annotated expression and non-expression. Subsequently, prediction of transcribed fragments is made on tiled genomic sequence. The prediction is accompanied by an expression probability curve for visual inspection of the supporting evidence. We test ExpressHMM on data from the Cheng et al. (2005) tiling array experiments on ten human chromosomes [1]. Results can be downloaded and viewed from our web site [2]. Conclusion: The value of adaptive modelling of fluorescence scores prior to categorisation into expressed and non-expressed probes is demonstrated. Our results indicate that our adaptive approach is superior to the previous analysis in terms of nucleotide sensitivity and transfrag specificity.
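
    The general idea of labelling probes with a two-state HMM can be sketched as follows. This is not ExpressHMM: the Gaussian emission model, the parameter values and the function names are all assumptions made for illustration, whereas the published method trains its score distributions on annotated regions.

```python
import math


def gaussian_logpdf(x, mean, sd):
    """Log density of a normal distribution, used here as the emission model."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_two_state(scores, means=(0.0, 2.0), sds=(1.0, 1.0), stay=0.9):
    """Label each probe score 0 (non-expressed) or 1 (expressed).

    All parameters are illustrative assumptions: state 0 emits scores around
    means[0], state 1 around means[1], and `stay` is the probability of
    remaining in the same state from one probe to the next.
    """
    if not scores:
        return []
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)
    # Uniform start probabilities over the two states.
    best = [math.log(0.5) + gaussian_logpdf(scores[0], means[s], sds[s]) for s in (0, 1)]
    back = []
    for x in scores[1:]:
        new_best, pointers = [], []
        for s in (0, 1):
            cand = [best[p] + (log_stay if p == s else log_switch) for p in (0, 1)]
            prev = 0 if cand[0] >= cand[1] else 1
            pointers.append(prev)
            new_best.append(cand[prev] + gaussian_logpdf(x, means[s], sds[s]))
        best = new_best
        back.append(pointers)
    # Trace back the most probable state path from the best final state.
    state = 0 if best[0] >= best[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

    The sticky transition probability is what separates this from per-probe thresholding: a single high score surrounded by low ones stays labelled non-expressed, while a run of high scores is called as one transcribed fragment.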

  10. Genome-wide identification and comparative expression analysis of LEA genes in watermelon and melon genomes.

    Science.gov (United States)

    Celik Altunoglu, Yasemin; Baloglu, Mehmet Cengiz; Baloglu, Pinar; Yer, Esra Nurten; Kara, Sibel

    2017-01-01

    Late embryogenesis abundant (LEA) proteins are a large and diverse group of polypeptides which were first identified during seed dehydration and then in vegetative plant tissues during different stress responses. Gene family members of LEA proteins have now been detected in various organisms. However, there had been no report on this protein family in watermelon and melon until this study. A total of 73 LEA genes from watermelon (ClLEA) and 61 LEA genes from melon (CmLEA) were identified in this comprehensive study. They were classified into four and three distinct clusters in watermelon and melon, respectively. There was a correlation between gene structure and motif composition within each LEA group. Segmental duplication played an important role in LEA gene expansion in watermelon. Maximum gene ontology of LEA genes was observed with poplar LEA genes. To evaluate tissue-specific expression patterns of ClLEA and CmLEA genes, publicly available RNA-seq data were analyzed. The expression of selected LEA genes in root and leaf tissues of drought-stressed watermelon and melon was examined using qRT-PCR. Among them, the ClLEA-12, -17 and -46 genes were quickly induced after drought application. Therefore, they might be considered early response genes for water limitation conditions in watermelon. In addition, the CmLEA-42 and -43 genes were found to be up-regulated in both tissues of melon under drought stress. Our results can open up new frontiers in understanding the functions of these important family members under normal developmental stages and stress conditions by bioinformatics and transcriptomic approaches.

  11. Genome polymorphism markers and stress genes expression for ...

    African Journals Online (AJOL)

    SAM

    2014-06-11

    Jun 11, 2014 ... RNA extraction and purification for SOD and PAL gene expression. Fresh leaf tissues (100 mg), from ... Data analysis. Gelquant program for quantification of protein, DNA and RNA gel. (version 1.8.2) was used for .... by reprogramming the expression of endogenous genes. Higher level of these antioxidant ...

  12. Genome organization and expression of the rat ACBP gene family

    DEFF Research Database (Denmark)

    Mandrup, S; Andreasen, P H; Knudsen, J

    1993-01-01

    pool former. We have molecularly cloned and characterized the rat ACBP gene family which comprises one expressed and four processed pseudogenes. One of these was shown to exist in two allelic forms. A comprehensive computer-aided analysis of the promoter region of the expressed ACBP gene revealed...

  13. Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2004-07-14

    The structural genomics project is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy which is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the Pfam5000 strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These include complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random selection of approximately 5000 targets from sequenced genomes. We measure the impact that successful implementation of these strategies would have upon structural interpretation of the proteins in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of eukaryotes) from the Proteome Analysis database at EBI. Solving the structures of proteins from the 5000 largest Pfam families would allow accurate fold assignment for approximately 68 percent of all prokaryotic proteins (covering 59 percent of residues) and 61 percent of eukaryotic proteins (40 percent of residues). More fine-grained coverage which would allow accurate modeling of these proteins would require an order of magnitude more targets. The Pfam5000 strategy may be modified in several ways, for example to focus on larger families, bacterial sequences, or eukaryotic sequences; as long as secondary consideration is given to large families within Pfam, coverage results vary only slightly. In contrast, focusing structural genomics on a single tractable genome would have only a limited impact on structural knowledge of other proteomes: a significant fraction (about 30-40 percent of the proteins, and 40-60 percent of the residues) of each proteome is classified in small

  14. Genome-wide expressions in autologous eutopic and ectopic endometrium of fertile women with endometriosis

    OpenAIRE

    Khan, Meraj A; Sengupta, Jayasree; Mittal, Suneeta; Ghosh, Debabrata

    2012-01-01

    Background: In order to obtain a lead on the pathophysiology of endometriosis, genome-wide expressional analyses of eutopic and ectopic endometrium have been reported earlier; however, the effects of stages of severity and phases of menstrual cycle on expressional profiles have not been examined. The effect of genetic heterogeneity and fertility history on transcriptional activity was also not considered. In the present study, a genome-wide expression analysis of autologous, paired eu...

  15. Regulation of gene expression in Mycoplasmas: contribution from Mycoplasma hyopneumoniae and Mycoplasma synoviae genome sequences

    Directory of Open Access Journals (Sweden)

    Humberto Maciel França Madeira

    2007-01-01

    This report describes the transcription apparatus of Mycoplasma hyopneumoniae (strains J and 7448) and Mycoplasma synoviae, using a comparative genomics approach to summarize the main features related to transcription and control of gene expression in mycoplasmas. Most of the transcription-related genes present in the three strains are well conserved among mycoplasmas. Some unique aspects of transcription in mycoplasmas and the scarcity of regulatory proteins in mycoplasma genomes are discussed.

  16. The Fanconi anemia/BRCA gene network in zebrafish: Embryonic expression and comparative genomics

    OpenAIRE

    Titus, Tom A.; Yan, Yi-Lin; Wilson, Catherine; Starks, Amber M.; Frohnmayer, Jonathan D.; Canestro, Cristian; Rodriguez-Mari, Adriana; He, Xinjun; Postlethwait, John H.

    2008-01-01

    Fanconi anemia (FA) is a genic disease resulting in bone marrow failure, high cancer risks, infertility, and developmental anomalies including microphthalmia, microcephaly, hypoplastic radius and thumb. Here we present cDNA sequences, genetic mapping, and genomic analyses for the four previously undescribed zebrafish FA genes (fanci, fancj, fancm, and fancn), and show that they reverted to single copy after the teleost genome duplication. We tested the hypothesis that FA genes are expresse...

  17. Ectopic Expression of Testis Germ Cell Proteins in Cancer and Its Potential Role in Genomic Instability

    Directory of Open Access Journals (Sweden)

    Aaraby Yoheswaran Nielsen

    2016-06-01

    Genomic instability is a hallmark of human cancer and an enabling factor for the genetic alterations that drive cancer development. The processes involved in genomic instability resemble those of meiosis, where genetic material is interchanged between homologous chromosomes. In most types of human cancer, epigenetic changes, including hypomethylation of gene promoters, lead to the ectopic expression of a large number of proteins normally restricted to the germ cells of the testis. Due to the similarities between meiosis and genomic instability, it has been proposed that activation of meiotic programs may drive genomic instability in cancer cells. Some germ cell proteins with ectopic expression in cancer cells indeed seem to promote genomic instability, while others reduce polyploidy and maintain mitotic fidelity. Furthermore, oncogenic germ cell proteins may indirectly contribute to genomic instability through induction of replication stress, similar to classic oncogenes. Thus, current evidence suggests that testis germ cell proteins are implicated in cancer development by regulating genomic instability during tumorigenesis, and these proteins therefore represent promising targets for novel therapeutic strategies.

  18. Genome-wide screens for expressed hypothetical proteins

    DEFF Research Database (Denmark)

    Madsen, Claus Desler; Durhuus, Jon Ambæk; Rasmussen, Lene Juel

    2012-01-01

    A hypothetical protein (HP) is defined as a protein that is predicted to be expressed from an open reading frame, but for which there is no experimental evidence of translation. HPs constitute a substantial fraction of proteomes of human as well as of other organisms. With the general belief that the majority of HPs are the product of pseudogenes, it is essential to have a tool with the ability of pinpointing the minority of HPs with a high probability of being expressed....

  19. Structured Matrix Completion with Applications to Genomic Data Integration.

    Science.gov (United States)

    Cai, Tianxi; Cai, T Tony; Zhang, Anru

    2016-01-01

    Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics and electrical engineering. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. We provide theoretical justification for the proposed SMC method and derive a lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite samples under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extents of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival.
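
To make the setting concrete, the sketch below recovers a missing block from observed rows and columns in the idealised case where the matrix is exactly low-rank, noise-free, and the fully observed corner block A11 is square and invertible. Under those assumptions the missing block satisfies A22 = A21 A11^{-1} A12. This is only the underlying algebraic identity, not the SMC estimator of the paper, which targets approximately low-rank matrices; all function names here are hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    a is an r x r list of lists, b is an r x m list of lists.
    """
    r = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(r)]
    for col in range(r):
        pivot = max(range(col, r), key=lambda i: abs(aug[i][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("observed block A11 is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for i in range(r):
            if i != col:
                f = aug[i][col]
                aug[i] = [vi - f * vc for vi, vc in zip(aug[i], aug[col])]
    return [row[r:] for row in aug]

def matmul(x, y):
    """Plain matrix product of two lists of lists."""
    return [[sum(xi * yj for xi, yj in zip(row, col)) for col in zip(*y)]
            for row in x]

def complete_block(a11, a12, a21):
    """Recover the unobserved block as A21 @ inv(A11) @ A12."""
    return matmul(a21, solve(a11, a12))

# Rank-2 toy matrix, partitioned as [[A11, A12], [A21, A22]] with A22 hidden:
#   [ 1  2 |  0  1 ]
#   [ 0  1 |  1  3 ]
#   [ 1  3 |  1  4 ]
#   [ 2  3 | -1 -1 ]
a11 = [[1.0, 2.0], [0.0, 1.0]]
a12 = [[0.0, 1.0], [1.0, 3.0]]
a21 = [[1.0, 3.0], [2.0, 3.0]]
a22_hat = complete_block(a11, a12, a21)
```

With noisy or only approximately low-rank data, inverting A11 directly is unstable, which is one reason a dedicated estimator with error guarantees is needed.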

  20. Cloning, production, and purification of proteins for a medium-scale structural genomics project.

    Science.gov (United States)

    Quevillon-Cheruel, Sophie; Collinet, Bruno; Trésaugues, Lionel; Minard, Philippe; Henckes, Gilles; Aufrère, Robert; Blondeau, Karine; Zhou, Cong-Zhao; Liger, Dominique; Bettache, Nabila; Poupon, Anne; Aboulfath, Ilham; Leulliot, Nicolas; Janin, Joël; van Tilbeurgh, Herman

    2007-01-01

    The South-Paris Yeast Structural Genomics Pilot Project (http://www.genomics.eu.org) aims at systematically expressing, purifying, and determining the three-dimensional structures of Saccharomyces cerevisiae proteins. We have already cloned 240 yeast open reading frames in the Escherichia coli pET system. Eighty-two percent of the targets can be expressed in E. coli, and 61% yield soluble protein. We have currently purified 58 proteins. Twelve X-ray structures have been solved, six are in progress, and six other proteins gave crystals. In this chapter, we present the general experimental flowchart applied for this project. One of the main difficulties encountered in this pilot project was the low solubility of a great number of target proteins. We have developed parallel strategies to recover these proteins from inclusion bodies, including refolding, coexpression with chaperones, and an in vitro expression system. A limited proteolysis protocol, developed to localize flexible regions in proteins that could hinder crystallization, is also described.

  1. Co-Expression of Neighboring Genes in the Zebrafish (Danio rerio) Genome

    Directory of Open Access Journals (Sweden)

    Daryi Wang

    2009-08-01

    Neighboring genes in the eukaryotic genome have a tendency to express concurrently, and the proximity of two adjacent genes is often considered a possible explanation for their co-expression behavior. However, the actual contribution of the physical distance between two genes to their co-expression behavior has yet to be defined. To further investigate this issue, we studied the co-expression of neighboring genes in zebrafish, which has a compact genome and has experienced a whole genome duplication event. Our analysis shows that the proportion of highly co-expressed neighboring pairs (Pearson’s correlation coefficient R>0.7) is low (0.24% ~ 0.67%); however, it is still significantly higher than that of random pairs. In particular, the statistical result implies that the co-expression tendency of neighboring pairs is negatively correlated with their physical distance. Our findings therefore suggest that physical distance may play an important role in the co-expression of neighboring genes. Possible mechanisms related to the neighboring genes’ co-expression are also discussed.
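
The comparison described above (fraction of neighboring gene pairs with Pearson's R > 0.7 versus the same fraction among random pairs) can be sketched as follows. The data layout, function names, and the toy profiles are assumptions for illustration; only the 0.7 cutoff comes from the abstract.

```python
import math
import random

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile has no defined correlation
    return sxy / math.sqrt(sxx * syy)

def neighbor_pairs(n_genes):
    """Adjacent gene pairs, assuming genes are indexed in chromosomal order."""
    return [(i, i + 1) for i in range(n_genes - 1)]

def random_pairs(n_genes, n_pairs, rng):
    """Randomly drawn non-adjacent pairs, used as the background set."""
    pairs = []
    while len(pairs) < n_pairs:
        i, j = rng.sample(range(n_genes), 2)
        if abs(i - j) > 1:
            pairs.append((i, j))
    return pairs

def fraction_coexpressed(profiles, pairs, threshold=0.7):
    """Share of pairs whose expression profiles correlate above threshold."""
    hits = sum(1 for i, j in pairs
               if pearson(profiles[i], profiles[j]) > threshold)
    return hits / len(pairs)

# Toy expression profiles (rows = genes in chromosomal order).
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 3.0, 2.0, 4.0],
]
neighbor_fraction = fraction_coexpressed(profiles, neighbor_pairs(len(profiles)))
background = random_pairs(len(profiles), 5, random.Random(0))
```

On real data the two fractions would be compared with a significance test, and the neighbor pairs could additionally be binned by intergenic distance to examine the distance trend.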

  2. Orthogonal control of expression mean and variance by epigenetic features at different genomic loci.

    Science.gov (United States)

    Dey, Siddharth S; Foley, Jonathan E; Limsirichai, Prajit; Schaffer, David V; Arkin, Adam P

    2015-05-05

    While gene expression noise has been shown to drive dramatic phenotypic variations, the molecular basis for this variability in mammalian systems is not well understood. Gene expression has been shown to be regulated by promoter architecture and the associated chromatin environment. However, the exact contribution of these two factors in regulating expression noise has not been explored. Using a dual-reporter lentiviral model system, we deconvolved the influence of the promoter sequence to systematically study the contribution of the chromatin environment at different genomic locations in regulating expression noise. By integrating a large-scale analysis to quantify mRNA levels by smFISH and protein levels by flow cytometry in single cells, we found that mean expression and noise are uncorrelated across genomic locations. Furthermore, we showed that this independence could be explained by the orthogonal control of mean expression by the transcript burst size and noise by the burst frequency. Finally, we showed that genomic locations displaying higher expression noise are associated with more repressed chromatin, thereby indicating the contribution of the chromatin environment in regulating expression noise. © 2015 The Authors. Published under the terms of the CC BY 4.0 license.
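
How burst size and burst frequency can act separately on mean and noise can be seen in a standard textbook model of bursty transcription, sketched below. It assumes bursts arrive at a constant rate and have geometrically distributed sizes, which gives a negative binomial steady state. This is an illustrative model, not necessarily the analysis used by the authors, and the parameter values are invented.

```python
def burst_moments(burst_frequency, burst_size):
    """Steady-state mean and noise (CV^2) of transcript counts.

    burst_frequency: bursts per mRNA lifetime (burst rate / degradation rate)
    burst_size:      mean number of transcripts produced per burst
    Under the assumed model: mean = f * b and variance = f * b * (1 + b),
    so CV^2 = 1 / (f * b) + 1 / f.
    """
    mean = burst_frequency * burst_size
    variance = mean * (1.0 + burst_size)
    cv2 = variance / mean ** 2
    return mean, cv2

# Doubling burst size doubles the mean but barely changes the noise.
mean_a, cv2_a = burst_moments(burst_frequency=2.0, burst_size=50.0)
mean_b, cv2_b = burst_moments(burst_frequency=2.0, burst_size=100.0)

# Doubling burst frequency at fixed burst size halves the noise.
mean_c, cv2_c = burst_moments(burst_frequency=4.0, burst_size=50.0)
```

For large burst sizes CV^2 approaches 1 / burst_frequency, so in this model noise is set almost entirely by burst frequency while burst size scales the mean.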

  3. Evidence-based gene models for structural and functional annotations of the oil palm genome.

    Science.gov (United States)

    Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie

    2017-09-08

    Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years), led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools. Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures. We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA
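
GC3, as defined in the abstract, is the fraction of G or C among third codon positions. A minimal sketch of the calculation follows; the function names are hypothetical, the input is assumed to be an in-frame coding sequence, and whether the stop codon is counted is a choice the abstract does not specify (here it is counted). The 0.75286 cutoff is the one quoted in the abstract.

```python
GC3_RICH_CUTOFF = 0.75286  # threshold quoted in the abstract

def gc3(cds):
    """Fraction of G/C at third codon positions of an in-frame coding sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    third = [seq[i] for i in range(2, usable, 3)]
    if not third:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in third) / len(third)

def is_gc3_rich(cds):
    """True if the sequence meets the GC3-rich cutoff."""
    return gc3(cds) >= GC3_RICH_CUTOFF

# Codons ATG GCC GAG TAA have third bases G, C, G, A, so GC3 = 3/4.
example_gc3 = gc3("ATGGCCGAGTAA")
```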

  4. SINEs, evolution and genome structure in the opossum.

    Science.gov (United States)

    Gu, Wanjun; Ray, David A; Walker, Jerilyn A; Barnes, Erin W; Gentles, Andrew J; Samollow, Paul B; Jurka, Jerzy; Batzer, Mark A; Pollock, David D

    2007-07-01

    Short INterspersed Elements (SINEs) are non-autonomous retrotransposons, usually between 100 and 500 base pairs (bp) in length, which are ubiquitous components of eukaryotic genomes. Their activity, distribution, and evolution can be highly informative on genomic structure and evolutionary processes. To determine recent activity, we amplified more than one hundred SINE1 loci in a panel of 43 M. domestica individuals derived from five diverse geographic locations. The SINE1 family has expanded recently enough that many loci were polymorphic, and the SINE1 insertion-based genetic distances among populations reflected geographic distance. Genome-wide comparisons of SINE1 densities and GC content revealed that high SINE1 density is associated with high GC content in a few long and many short spans. Young SINE1s, whether fixed or polymorphic, showed an unbiased GC content preference for insertion, indicating that the GC preference accumulates over long time periods, possibly in periodic bursts. SINE1 evolution is thus broadly similar to human Alu evolution, although it has an independent origin. High GC content adjacent to SINE1s is strongly correlated with bias towards higher AT to GC substitutions and lower GC to AT substitutions. This is consistent with biased gene conversion, and also indicates that like chickens, but unlike eutherian mammals, GC content heterogeneity (isochore structure) is reinforced by substitution processes in the M. domestica genome. Nevertheless, both high and low GC content regions are apparently headed towards lower GC content equilibria, possibly due to a relative shift to lower recombination rates in the recent Monodelphis ancestral lineage. Like eutherians, metatherian (marsupial) mammals have evolved high CpG substitution rates, but this is apparently a convergence in process rather than a shared ancestral state.

  5. A hidden Markov model approach for determining expression from genomic tiling micro arrays

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Gardner, P. P.; Arctander, Peter

    2006-01-01

    Background: Genomic tiling micro arrays have great potential for identifying previously undiscovered coding as well as non-coding transcription. To date, however, analyses of these data have been performed in an ad hoc fashion. Results: We present a probabilistic procedure, ExpressHMM, that adaptiv...

  6. CHESS (CgHExpreSS): a comprehensive analysis tool for the analysis of genomic alterations and their effects on the expression profile of the genome.

    Science.gov (United States)

    Lee, Mikyung; Kim, Yangseok

    2009-12-16

    Genomic alterations frequently occur in many cancer patients and play important mechanistic roles in the pathogenesis of cancer. Furthermore, they can modify the expression level of genes due to altered copy number in the corresponding region of the chromosome. An accumulating body of evidence supports the possibility that strong genome-wide correlation exists between DNA content and gene expression. Therefore, more comprehensive analysis is needed to quantify the relationship between genomic alteration and gene expression. A well-designed bioinformatics tool is essential to perform this kind of integrative analysis. A few programs have already been introduced for integrative analysis. However, published software is limited in its ability to perform comprehensive integrated analysis because of limitations in the implemented algorithms and visualization modules. To address this issue, we have implemented the Java-based program CHESS to allow integrative analysis of two experimental data sets: genomic alteration and genome-wide expression profile. CHESS is composed of a genomic alteration analysis module and an integrative analysis module. The genomic alteration analysis module detects genomic alteration by applying a threshold-based method or the SW-ARRAY algorithm and investigates whether the detected alteration is phenotype specific or not. The integrative analysis module, in turn, measures the genomic alteration's influence on gene expression. It is divided into two separate parts. The first part calculates overall correlation between comparative genomic hybridization ratio and gene expression level by applying the following three statistical methods: simple linear regression, Spearman rank correlation and Pearson's correlation. In the second part, CHESS detects the genes that are differentially expressed according to the genomic alteration pattern with three alternative statistical approaches: Student's t-test, Fisher's exact test and Chi square
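
Two of the steps named above, threshold-based alteration calling and a Student's t comparison of expression between altered and unaltered samples, can be sketched as below. CHESS itself is a Java program and its internals are not given in the abstract, so this is a hypothetical illustration: the 0.3 log2-ratio cutoff, the function names, and the toy data are all assumptions.

```python
import math

def call_alteration(log2_ratio, threshold=0.3):
    """Label one CGH log2 ratio as gain, loss or normal (assumed cutoff)."""
    if log2_ratio >= threshold:
        return "gain"
    if log2_ratio <= -threshold:
        return "loss"
    return "normal"

def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    pooled = (ssa + ssb) / (na + nb - 2)
    return (ma - mb) / math.sqrt(pooled * (1.0 / na + 1.0 / nb))

# One gene measured in six samples: CGH log2 ratios and expression levels.
cgh = [0.5, 0.6, 0.0, -0.1, 0.45, 0.05]
expr = [8.0, 8.4, 6.0, 6.2, 8.2, 5.8]

calls = [call_alteration(r) for r in cgh]
gained = [e for e, c in zip(expr, calls) if c == "gain"]
normal = [e for e, c in zip(expr, calls) if c == "normal"]
t_stat = student_t(gained, normal)
```

A large positive `t_stat` indicates higher expression in samples carrying the gain; converting it to a p-value would additionally require the t distribution with `len(gained) + len(normal) - 2` degrees of freedom.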

  7. Genome-wide identification and expression analysis of aquaporins in tomato.

    Science.gov (United States)

    Reuscher, Stefan; Akiyama, Masahito; Mori, Chiharu; Aoki, Koh; Shibata, Daisuke; Shiratake, Katsuhiro

    2013-01-01

    The family of aquaporins, also called water channels or major intrinsic proteins, is characterized by six transmembrane domains that together facilitate the transport of water and a variety of low molecular weight solutes. They are found in all domains of life, but show their highest diversity in plants. Numerous studies identified aquaporins as important targets for improving plant performance under drought stress. The phylogeny of aquaporins is well established based on model species like Arabidopsis thaliana, which can be used as a template to investigate aquaporins in other species. In this study we comprehensively identified aquaporin encoding genes in tomato (Solanum lycopersicum), which is an important vegetable crop and also serves as a model for fleshy fruit development. We found 47 aquaporin genes in the tomato genome and analyzed their structural features. Based on a phylogenetic analysis of the deduced amino acid sequences the aquaporin genes were assigned to five subfamilies (PIPs, TIPs, NIPs, SIPs and XIPs) and their substrate specificity was assessed on the basis of key amino acid residues. As ESTs were available for 32 genes, expression of these genes was analyzed in 13 different tissues and developmental stages of tomato. We detected tissue-specific and development-specific expression of tomato aquaporin genes, which is a first step towards revealing the contribution of aquaporins to water and solute transport in leaves and during fruit development.

  8. Genome-wide identification and expression analysis of aquaporins in tomato.

    Directory of Open Access Journals (Sweden)

    Stefan Reuscher

    The family of aquaporins, also called water channels or major intrinsic proteins, is characterized by six transmembrane domains that together facilitate the transport of water and a variety of low molecular weight solutes. They are found in all domains of life, but show their highest diversity in plants. Numerous studies identified aquaporins as important targets for improving plant performance under drought stress. The phylogeny of aquaporins is well established based on model species like Arabidopsis thaliana, which can be used as a template to investigate aquaporins in other species. In this study we comprehensively identified aquaporin encoding genes in tomato (Solanum lycopersicum), which is an important vegetable crop and also serves as a model for fleshy fruit development. We found 47 aquaporin genes in the tomato genome and analyzed their structural features. Based on a phylogenetic analysis of the deduced amino acid sequences the aquaporin genes were assigned to five subfamilies (PIPs, TIPs, NIPs, SIPs and XIPs) and their substrate specificity was assessed on the basis of key amino acid residues. As ESTs were available for 32 genes, expression of these genes was analyzed in 13 different tissues and developmental stages of tomato. We detected tissue-specific and development-specific expression of tomato aquaporin genes, which is a first step towards revealing the contribution of aquaporins to water and solute transport in leaves and during fruit development.

  9. Genome engineering for improved recombinant protein expression in Escherichia coli.

    Science.gov (United States)

    Mahalik, Shubhashree; Sharma, Ashish K; Mukherjee, Krishna J

    2014-12-19

    A metabolic engineering perspective that views recombinant protein expression as a multistep pathway allows us to move beyond vector design and identify the downstream rate-limiting steps in expression. In E. coli these are typically at the translational level and in the supply of precursors in the form of energy, amino acids and nucleotides. Further, recombinant protein production triggers a global cellular stress response that feedback-inhibits both growth and product formation. Countering this requires a system-level analysis followed by rational host cell engineering to sustain expression for longer time periods. Another strategy to increase protein yields could be to divert the metabolic flux away from biomass formation and towards recombinant protein production. This would require a growth stoppage mechanism that does not affect the metabolic activity of the cell or the transcriptional or translational efficiencies. Finally, cells have to be designed for efficient export to prevent buildup of proteins inside the cytoplasm and also to simplify downstream processing. The rational and high-throughput strategies that can be used for the construction of such improved host cell platforms for recombinant protein expression are the focus of this review.

  10. Central genomic regulation of the expression of oestrous behaviour in dairy cows: a review.

    Science.gov (United States)

    Woelders, H; van der Lende, T; Kommadath, A; te Pas, M F W; Smits, M A; Kaal, L M T E

    2014-05-01

    The expression of oestrous behaviour in Holstein Friesian dairy cows has progressively decreased over the past 50 years. Reduced oestrus expression is one of the factors contributing to the current suboptimal reproductive efficiency in dairy farming. Variation between and within cows in the expression of oestrous behaviour is associated with variation in peripheral blood oestradiol concentrations during oestrus. In addition, there is evidence for a priming role of progesterone for the full display of oestrous behaviour. A higher rate of metabolic clearance of ovarian steroids could be one of the factors leading to lower peripheral blood concentrations of oestradiol and progesterone in high-producing dairy cows. Oestradiol acts on the brain by genomic, non-genomic and growth factor-dependent mechanisms. A firm base of understanding of the ovarian steroid-driven central genomic regulation of female sexual behaviour has been obtained from studies on rodents. These studies have resulted in the definition of five modules of oestradiol-activated genes in the brain, referred to as the GAPPS modules. In a recent series of studies, gene expression in the anterior pituitary and four brain areas (amygdala, hippocampus, dorsal hypothalamus and ventral hypothalamus) in oestrous and luteal phase cows, respectively, has been measured, and the relation with oestrous behaviour of these cows was analysed. These studies identified a number of genes of which the expression was associated with the intensity of oestrous behaviour. These genes could be grouped according to the GAPPS modules, suggesting close similarity of the regulation of oestrous behaviour in cows and female sexual behaviour in rodents. A better understanding of the central genomic regulation of the expression of oestrous behaviour in dairy cows may in due time contribute to improved (genomic) selection strategies for appropriate oestrus expression in high-producing dairy cows.

  11. The diurnal logic of the expression of the chloroplast genome in Chlamydomonas reinhardtii.

    Directory of Open Access Journals (Sweden)

    Adam D Idoine

    Chloroplasts are derived from cyanobacteria and have retained a bacterial-type genome and gene expression machinery. The chloroplast genome encodes many of the core components of the photosynthetic apparatus in the thylakoid membranes. To avoid photooxidative damage and production of harmful reactive oxygen species (ROS) by incompletely assembled thylakoid protein complexes, chloroplast gene expression must be tightly regulated and co-ordinated with gene expression in the nucleus. Little is known about the control of chloroplast gene expression at the genome-wide level in response to internal rhythms and external cues. To obtain a comprehensive picture of organelle transcript levels in the unicellular model alga Chlamydomonas reinhardtii in diurnal conditions, a qRT-PCR platform was developed and used to quantify 68 chloroplast, 21 mitochondrial as well as 71 nuclear transcripts in cells grown in highly controlled 12 h light/12 h dark cycles. Interestingly, in anticipation of dusk, chloroplast transcripts from genes involved in transcription reached peak levels first, followed by transcripts from genes involved in translation, and finally photosynthesis gene transcripts. This pattern matches perfectly the theoretical demands of a cell "waking up" from the night. A similar trend was observed in the nuclear transcripts. These results suggest a striking internal logic in the expression of the chloroplast genome and a previously unappreciated complexity in the regulation of chloroplast genes.

  12. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes

    DEFF Research Database (Denmark)

    Parker, Brian John; Moltke, Ida; Roth, Adam

    2011-01-01

    a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein...

  13. Comprehensive genome-wide survey, genomic constitution and expression profiling of the NAC transcription factor family in foxtail millet (Setaria italica L.).

    Directory of Open Access Journals (Sweden)

    Swati Puranik

    The NAC proteins represent a major plant-specific transcription factor family that has established enormously diverse roles in various plant processes. Aided by the availability of complete genomes, several members of this family have been identified in Arabidopsis, rice, soybean and poplar. However, no comprehensive investigation has been presented for the recently sequenced, naturally stress tolerant crop, Setaria italica (foxtail millet), which is famed as a model crop for bioenergy research. In this study, we identified 147 putative NAC domain-encoding genes from foxtail millet by systematic sequence analysis and physically mapped them onto nine chromosomes. Genomic organization suggested that inter-chromosomal duplications may have been responsible for expansion of this gene family in foxtail millet. Phylogenetically, they were arranged into 11 distinct sub-families (I-XI), with duplicated genes fitting into one cluster and possessing conserved motif compositions. Comparative mapping with other grass species revealed some orthologous relationships and chromosomal rearrangements including duplication, inversion and deletion of genes. The evolutionary significance of the duplication and divergence of NAC genes was assessed from their amino acid substitution rates. Expression profiling against various stresses and phytohormones provides novel insights into specific and/or overlapping expression patterns of SiNAC genes, which may be responsible for functional divergence among individual members in this crop. Further, we performed structure modeling and molecular simulation of a stress-responsive protein, SiNAC128, proffering an initial framework for understanding its molecular function. Taken together, this genome-wide identification and expression profiling unlocks new avenues for systematic functional analysis of novel NAC gene family candidates which may be applied for improving stress adaptation in plants.

  14. Comprehensive genome-wide survey, genomic constitution and expression profiling of the NAC transcription factor family in foxtail millet (Setaria italica L.).

    Science.gov (United States)

    Puranik, Swati; Sahu, Pranav Pankaj; Mandal, Sambhu Nath; B, Venkata Suresh; Parida, Swarup Kumar; Prasad, Manoj

    2013-01-01

    The NAC proteins represent a major plant-specific transcription factor family that has established enormously diverse roles in various plant processes. Aided by the availability of complete genomes, several members of this family have been identified in Arabidopsis, rice, soybean and poplar. However, no comprehensive investigation has been presented for the recently sequenced, naturally stress tolerant crop, Setaria italica (foxtail millet), that is famed as a model crop for bioenergy research. In this study, we identified 147 putative NAC domain-encoding genes from foxtail millet by systematic sequence analysis and physically mapped them onto nine chromosomes. Genomic organization suggested that inter-chromosomal duplications may have been responsible for expansion of this gene family in foxtail millet. Phylogenetically, they were arranged into 11 distinct sub-families (I-XI), with duplicated genes fitting into one cluster and possessing conserved motif compositions. Comparative mapping with other grass species revealed some orthologous relationships and chromosomal rearrangements including duplication, inversion and deletion of genes. The evolutionary significance of the duplication and divergence of NAC genes was inferred from their amino acid substitution rates. Expression profiling against various stresses and phytohormones provides novel insights into specific and/or overlapping expression patterns of SiNAC genes, which may be responsible for functional divergence among individual members in this crop. Further, we performed structure modeling and molecular simulation of a stress-responsive protein, SiNAC128, proffering an initial framework for understanding its molecular function. Taken together, this genome-wide identification and expression profiling unlocks new avenues for systematic functional analysis of novel NAC gene family candidates which may be applied for improving stress adaptation in plants.

  15. The E4 protein; structure, function and patterns of expression

    Energy Technology Data Exchange (ETDEWEB)

    Doorbar, John, E-mail: jdoorba@nimr.mrc.ac.uk

    2013-10-15

    The papillomavirus E4 open reading frame (ORF) is contained within the E2 ORF, with the primary E4 gene-product (E1^E4) being translated from a spliced mRNA that includes the E1 initiation codon and adjacent sequences. E4 is located centrally within the E2 gene, in a region that encodes the E2 protein's flexible hinge domain. Although a number of minor E4 transcripts have been reported, it is the product of the abundant E1^E4 mRNA that has been most extensively analysed. During the papillomavirus life cycle, the E1^E4 gene products generally become detectable at the onset of vegetative viral genome amplification as the late stages of infection begin. E4 contributes to genome amplification success and virus synthesis, with its high level of expression suggesting additional roles in virus release and/or transmission. In general, E4 is easily visualised in biopsy material by immunostaining, and can be detected in lesions caused by diverse papillomavirus types, including those of dogs, rabbits and cattle as well as humans. The E4 protein can serve as a biomarker of active virus infection, and in the case of high-risk human types also disease severity. In some cutaneous lesions, E4 can be expressed at higher levels than the virion coat proteins, and can account for as much as 30% of total lesional protein content. The E4 proteins of the Beta, Gamma and Mu HPV types assemble into distinctive cytoplasmic, and sometimes nuclear, inclusion granules. In general, the E4 proteins are expressed before L2 and L1, with their structure and function being modified, first by kinases as the infected cell progresses through the S and G2 cell cycle phases, but also by proteases as the cell exits the cell cycle and undergoes true terminal differentiation. The kinases that regulate E4 also affect other viral proteins simultaneously, and include protein kinase A, Cyclin-dependent kinase, members of the MAP Kinase family and protein kinase C. For HPV16 E1^...

  16. Structural analysis of a set of proteins resulting from a bacterial genomics project.

    Science.gov (United States)

    Badger, J; Sauder, J M; Adams, J M; Antonysamy, S; Bain, K; Bergseid, M G; Buchanan, S G; Buchanan, M D; Batiyenko, Y; Christopher, J A; Emtage, S; Eroshkina, A; Feil, I; Furlong, E B; Gajiwala, K S; Gao, X; He, D; Hendle, J; Huber, A; Hoda, K; Kearins, P; Kissinger, C; Laubert, B; Lewis, H A; Lin, J; Loomis, K; Lorimer, D; Louie, G; Maletic, M; Marsh, C D; Miller, I; Molinari, J; Muller-Dieckmann, H J; Newman, J M; Noland, B W; Pagarigan, B; Park, F; Peat, T S; Post, K W; Radojicic, S; Ramos, A; Romero, R; Rutter, M E; Sanderson, W E; Schwinn, K D; Tresser, J; Winhoven, J; Wright, T A; Wu, L; Xu, J; Harris, T J R

    2005-09-01

    The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the Protein Data Bank, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the Protein Data Bank over the intervening 3.5 years since the SGX target list had been compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB. Copyright 2005 Wiley-Liss, Inc.

  17. Refining the structure and content of clinical genomic reports.

    Science.gov (United States)

    Dorschner, Michael O; Amendola, Laura M; Shirts, Brian H; Kiedrowski, Lesli; Salama, Joseph; Gordon, Adam S; Fullerton, Stephanie M; Tarczy-Hornoch, Peter; Byers, Peter H; Jarvik, Gail P

    2014-03-01

    To effectively articulate the results of exome and genome sequencing we refined the structure and content of molecular test reports. To communicate results of a randomized control trial aimed at the evaluation of exome sequencing for clinical medicine, we developed a structured narrative report. With feedback from genetics and non-genetics professionals, we developed separate indication-specific and incidental findings reports. Standard test report elements were supplemented with research study-specific language, which highlighted the limitations of exome sequencing and provided detailed, structured results, and interpretations. The report format we developed to communicate research results can easily be transformed for clinical use by removal of research-specific statements and disclaimers. The development of clinical reports for exome sequencing has shown that accurate and open communication between the clinician and laboratory is ideally an ongoing process to address the increasing complexity of molecular genetic testing. © 2014 Wiley Periodicals, Inc.

  18. Macromolecular structure determination in the post-genome era

    CERN Document Server

    Kuhn, P

    2001-01-01

    Recent advances in genetics, molecular biology and crystallographic instrumentation and methodology have led to a revolution in the field of Structural Molecular Biology (SMB). These combined advances have paved the way to a more complete and detailed understanding of the biological macromolecules that make up an organism, both in terms of their individual functions and also the interactions between them. In this paper we describe a large-scale, genomic approach to the three-dimensional structure determination of macromolecules and their complexes, using high-throughput methodology to streamline all aspects of the process. This task requires the development of automated high-intensity synchrotron beam lines for X-ray diffraction data collection from single crystal samples. Furthermore, these beam lines must be operated within a sophisticated software and hardware environment, which is capable of delivering a completely automated structure determination pipeline. The SMB resource at SSRL is developing a system...

  19. Genome-wide classification and expression analysis of MYB transcription factor families in rice and Arabidopsis

    Science.gov (United States)

    2012-01-01

    Background: The MYB gene family comprises one of the richest groups of transcription factors in plants. Plant MYB proteins are characterized by a highly conserved MYB DNA-binding domain. MYB proteins are classified into four major groups, namely 1R-MYB, 2R-MYB, 3R-MYB and 4R-MYB, based on the number and position of MYB repeats. MYB transcription factors are involved in plant development, secondary metabolism, hormone signal transduction, disease resistance and abiotic stress tolerance. A comparative analysis of MYB family genes in rice and Arabidopsis will help reveal the evolution and function of MYB genes in plants. Results: A genome-wide analysis identified at least 155 and 197 MYB genes in rice and Arabidopsis, respectively. Gene structure analysis revealed that MYB family genes possess relatively more introns in the middle as compared with C- and N-terminal regions of the predicted genes. Intronless MYB genes are highly conserved both in rice and Arabidopsis. MYB genes encoding R2R3 repeat MYB proteins retained conserved gene structure with three exons and two introns, whereas genes encoding R1R2R3 repeat containing proteins consist of six exons and five introns. The splicing pattern is similar among R1R2R3 MYB genes in Arabidopsis. In contrast, variation in splicing pattern was observed among R1R2R3 MYB members of rice. Consensus motif analysis of the 1 kb upstream region (5′ to translation initiation codon) of MYB gene ORFs led to the identification of conserved and over-represented cis-motifs in both rice and Arabidopsis. Real-time quantitative RT-PCR analysis showed that several members of MYBs are up-regulated by various abiotic stresses both in rice and Arabidopsis. Conclusion: A comprehensive genome-wide analysis of chromosomal distribution, tandem repeats and phylogenetic relationship of MYB family genes in rice and Arabidopsis suggested their evolution via duplication. Genome-wide comparative analysis of MYB genes and their expression analysis
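
    The motif analysis in the record above works on the 1 kb region immediately 5′ of each translation initiation codon. The following is a minimal sketch of how such a region might be sliced out of a chromosome sequence. The function names, the 0-based coordinate convention and the clipping at chromosome ends are assumptions made for illustration; the abstract does not describe the authors' actual procedure.

```python
# Illustrative sketch only: names and coordinate conventions are assumptions.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq, start_codon_edge, strand, length=1000):
    """Return up to `length` bases 5' of the translation initiation codon.

    chrom_seq        forward-strand chromosome sequence
    start_codon_edge 0-based forward-strand index of the boundary next to the
                     start codon: the index of the 'A' of ATG for '+' genes,
                     or the index just past the start codon for '-' genes
    strand           '+' or '-'
    The result is always written 5'->3' relative to the gene and is clipped
    at the chromosome ends.
    """
    if strand == "+":
        begin = max(0, start_codon_edge - length)
        return chrom_seq[begin:start_codon_edge]
    if strand == "-":
        end = min(len(chrom_seq), start_codon_edge + length)
        return reverse_complement(chrom_seq[start_codon_edge:end])
    raise ValueError("strand must be '+' or '-'")
```

    Returning the minus-strand region as a reverse complement keeps every promoter in the same 5′ to 3′ orientation, which is what a motif scanner would normally expect.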

  20. Genomic and Expression Profiling of Benign and Malignant Nerve Sheath Tumors in Neurofibromatosis Patients

    Science.gov (United States)

    2008-05-01

    DAMD17-03-1-0297 Title: Genomic and Expression Profiling of Benign and Malignant Nerve Sheath Tumors in Neurofibromatosis Patients...have determined the gene expression signature for benign and malignant peripheral nerve sheath tumors and found that the major trend in transformation...However, EGFR data in soft tissue neoplasms is limited. Using a variety of benign and malignant spindle cell neoplasms, we assessed EGFR status by

  1. A genomic perspective on protein tyrosine phosphatases: gene structure, pseudogenes, and genetic disease linkage

    DEFF Research Database (Denmark)

    Andersen, Jannik N; Jansen, Peter G; Echwald, Søren M

    2004-01-01

    sequence databases, we discovered one novel human PTP gene and defined chromosomal loci and exon structure of the additional 37 genes encoding known PTP transcripts. Direct orthologs were present in the mouse genome for all 38 human PTP genes. In addition, we identified 12 PTP pseudogenes unique to humans...... that have probably contaminated previous bioinformatics analysis of this gene family. PCR amplification and transcript sequencing indicate that some PTP pseudogenes are expressed, but their function (if any) is unknown. Furthermore, we analyzed the enhanced diversity generated by alternative splicing...

  2. Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci

    NARCIS (Netherlands)

    Keurentjes, Joost J.B.; Fu, Jingyuan; Terpstra, Inez R.; Garcia, Juan M.; Ackerveken, Guido van den; Snoek, L. Basten; Peeters, Anton J.M.; Vreugdenhil, Dick; Koornneef, Maarten; Jansen, Ritsert C.

    2007-01-01

    Accessions of a plant species can show considerable genetic differences that are analyzed effectively by using recombinant inbred line (RIL) populations. Here we describe the results of genome-wide expression variation analysis in an RIL population of Arabidopsis thaliana. For many genes, variation

  3. Natural variation of histone modification and its impact on gene expression in the rat genome

    NARCIS (Netherlands)

    Rintisch, Carola; Heinig, Matthias; Bauerfeind, Anja; Schafer, Sebastian; Mieth, Christin; Patone, Giannino; Hummel, Oliver; Chen, Wei; Cook, Stuart; Cuppen, Edwin; Colomé-Tatché, Maria; Johannes, Frank; Jansen, Ritsert C; Neil, Helen; Werner, Michel; Pravenec, Michal; Vingron, Martin; Hubner, Norbert

    Histone modifications are epigenetic marks that play fundamental roles in many biological processes including the control of chromatin-mediated regulation of gene expression. Little is known about interindividual variability of histone modification levels across the genome and to what extent they

  4. Functional Associations by Response Overlap (FARO), a functional genomics approach matching gene expression phenotypes

    DEFF Research Database (Denmark)

    Nielsen, Henrik Bjørn; Mundy, J.; Willenbrock, Hanni

    2007-01-01

    The systematic comparison of transcriptional responses of organisms is a powerful tool in functional genomics. For example, mutants may be characterized by comparing their transcript profiles to those obtained in other experiments querying the effects on gene expression of many experimental facto...

  5. Porcine UCHL1: genomic organization, chromosome localization and expression analysis

    DEFF Research Database (Denmark)

    Larsen, Knud; Madsen, Lone Bruhn; Bendixen, Christian

    2012-01-01

    to and protection from Parkinson’s disease. Here we report cloning, characterization, expression analysis and mapping of porcine UCHL1. The UCHL1 cDNA was amplified by reverse transcriptase polymerase chain reaction (RT-PCR) using oligonucleotide primers derived from in silico sequences. The porcine cDNA codes...... in developing porcine embryos. UCHL1 transcript was detected as early as 40 days of gestation. A significant decrease in UCHL1 transcript was detected in basal ganglia from day 60 to day 115 of gestation...

  6. Comprehensive Genomic Identification and Expression Analysis of the Phosphate Transporter (PHT) Gene Family in Apple.

    Science.gov (United States)

    Sun, Tingting; Li, Mingjun; Shao, Yun; Yu, Lingyan; Ma, Fengwang

    2017-01-01

    Elemental phosphorus (Pi) is essential to plant growth and development. The family of phosphate transporters (PHTs) mediates the uptake and translocation of Pi inside the plants. Members include five sub-cellular phosphate transporters that play different roles in Pi uptake and transport. We searched the Genome Database for Rosaceae and identified five clusters of phosphate transporters in apple (Malus domestica), including 37 putative genes. The MdPHT1 family contains 14 genes while MdPHT2 has two, MdPHT3 has seven, MdPHT4 has 11, and MdPHT5 has three. Our overview of this gene family focused on structure, chromosomal distribution and localization, phylogenies, and motifs. These genes displayed differential expression patterns in various tissues. For example, expression was high for MdPHT1;12, MdPHT3;6, and MdPHT3;7 in the roots, and was also increased in response to low-phosphorus conditions. In contrast, MdPHT4;1, MdPHT4;4, and MdPHT4;10 were expressed only in the leaves while transcript levels of MdPHT1;4, MdPHT1;12, and MdPHT5;3 were highest in flowers. In general, these 37 genes were regulated significantly in either roots or leaves in response to the imposition of phosphorus and/or drought stress. The results suggest that members of the PHT family function in plant adaptations to adverse growing environments. Our study will lay a foundation for better understanding the PHT family evolution and exploring genes of interest for genetic improvement in apple.

  7. The complete chloroplast genome sequence of Podocarpus lambertii: genome structure, evolutionary aspects, gene content and SSR detection.

    Directory of Open Access Journals (Sweden)

    Leila do Nascimento Vieira

    BACKGROUND: Podocarpus lambertii (Podocarpaceae) is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp) genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. METHODOLOGY/PRINCIPAL FINDINGS: The P. lambertii cp genome is 133,734 bp in length, and similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR). It contains 118 unique genes and one duplicated tRNA (trnN-GUU), which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi) and Araucariaceae (Agathis dammara). Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. CONCLUSION: The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparent loss of the rps16 gene in the Podocarpaceae cp genome. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of
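
    The SSR counts in the record above (homo- through hexapolymers) come from scanning the sequence for short motifs repeated in tandem. The following is a minimal regex-based sketch of that kind of scan. The minimum repeat thresholds and the function name are assumptions chosen for illustration; the abstract does not state which tool or thresholds the authors used.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed thresholds,
# not taken from the abstract).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "ATAT"),
            # so each SSR is reported once under its shortest motif.
            is_redundant = any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if not is_redundant:
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)
```

    Grouping the returned hits by `len(motif)` gives the homo-, di-, tri- (and so on) tallies of the kind the abstract reports.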

  8. Genome-Wide Identification, Phylogenetic and Expression Analyses of the Ubiquitin-Conjugating Enzyme Gene Family in Maize

    Science.gov (United States)

    Jue, Dengwei; Sang, Xuelian; Lu, Shengqiao; Dong, Chen; Zhao, Qiufang; Chen, Hongliang; Jia, Liqiang

    2015-01-01

    Background: Ubiquitination is a post-translational modification where ubiquitin is attached to a substrate. Ubiquitin-conjugating enzymes (E2s) play a major role in the ubiquitin transfer pathway, as well as a variety of functions in plant biological processes. To date, no genome-wide characterization of this gene family has been conducted in maize (Zea mays). Methodology/Principal Findings: In the present study, a total of 75 putative ZmUBC genes have been identified and located in the maize genome. Phylogenetic analysis revealed that ZmUBC proteins could be divided into 15 subfamilies, which include 13 ubiquitin-conjugating enzymes (ZmE2s) and two independent ubiquitin-conjugating enzyme variant (UEV) groups. The predicted ZmUBC genes were distributed across 10 chromosomes at different densities. In addition, analysis of exon-intron junctions and sequence motifs in each candidate gene has revealed high levels of conservation within and between phylogenetic groups. Tissue expression analysis indicated that most ZmUBC genes were expressed in at least one of the tissues, indicating that these are involved in various physiological and developmental processes in maize. Moreover, expression profile analyses of ZmUBC genes under different stress treatments (4°C, 20% PEG6000, and 200 mM NaCl) and various expression patterns indicated that these may play crucial roles in the response of plants to stress. Conclusions: Genome-wide identification, chromosome organization, gene structure, evolutionary and expression analyses of ZmUBC genes have facilitated the characterization of this gene family, and have indicated its potential involvement in growth, development, and stress responses. This study provides valuable information for better understanding the classification and putative functions of the UBC-encoding genes of maize. PMID:26606743

  9. Analysis of Epstein-Barr Virus Genomes and Expression Profiles in Gastric Adenocarcinoma.

    Science.gov (United States)

    Borozan, Ivan; Zapatka, Marc; Frappier, Lori; Ferretti, Vincent

    2018-01-15

    Epstein-Barr virus (EBV) is a causative agent of a variety of lymphomas, nasopharyngeal carcinoma (NPC), and ∼9% of gastric carcinomas (GCs). An important question is whether particular EBV variants are more oncogenic than others, but conclusions are currently hampered by the lack of sequenced EBV genomes. Here, we contribute to this question by mining whole-genome sequences of 201 GCs to identify 13 EBV-positive GCs and by assembling 13 new EBV genome sequences, almost doubling the number of available GC-derived EBV genome sequences and providing the first non-Asian EBV genome sequences from GC. Whole-genome sequence comparisons of all EBV isolates sequenced to date (85 from tumors and 57 from healthy individuals) showed that most GC and NPC EBV isolates were closely related although American Caucasian GC samples were more distant, suggesting a geographical component. However, EBV GC isolates were found to contain some consistent changes in protein sequences regardless of geographical origin. In addition, transcriptome data available for eight of the EBV-positive GCs were analyzed to determine which EBV genes are expressed in GC. In addition to the expected latency proteins (EBNA1, LMP1, and LMP2A), specific subsets of lytic genes were consistently expressed that did not reflect a typical lytic or abortive lytic infection, suggesting a novel mechanism of EBV gene regulation in the context of GC. These results are consistent with a model in which a combination of specific latent and lytic EBV proteins promotes tumorigenesis. IMPORTANCE Epstein-Barr virus (EBV) is a widespread virus that causes cancer, including gastric carcinoma (GC), in a small subset of individuals. An important question is whether particular EBV variants are more cancer associated than others, but more EBV sequences are required to address this question. Here, we have generated 13 new EBV genome sequences from GC, almost doubling the number of EBV sequences from GC isolates and providing the

  10. Genomic Survey and Expression Profiling of the MYB Gene Family in Watermelon

    Directory of Open Access Journals (Sweden)

    Qing XU

    2018-01-01

    Myeloblastosis (MYB) proteins constitute one of the largest transcription factor (TF) families in plants. They are functionally diverse in regulating plant development, metabolism, and multiple stress responses. However, the function of watermelon MYB proteins remains elusive to date. Here, a genome-wide identification of watermelon MYB TFs was performed by bioinformatics analysis. A total of 162 MYB genes were identified from watermelon (ClaMYB). A comprehensive overview of the ClaMYB genes was undertaken, including the gene structures, chromosomal distribution, gene duplication, conserved protein motif, and phylogenetic relationship. According to the analyses, the watermelon MYB genes were categorized into three groups (R1R2R3-MYB, R2R3-MYB, and MYB-related). Amino acid alignments for all MYB motifs of ClaMYBs demonstrated high conservation. Investigation of their chromosomal localization revealed that these ClaMYB genes are distributed across the 11 watermelon chromosomes. Gene duplication analyses showed that tandem duplication events contributed predominantly to the expansion of the MYB gene family in the watermelon genome. Phylogenetic comparison of the ClaMYB proteins with Arabidopsis MYB proteins revealed that watermelon MYB proteins underwent a more diverse evolution after divergence from Arabidopsis. Some watermelon MYBs were found to cluster into the functional clades of Arabidopsis MYB proteins. Expression analysis under different stress conditions identified a group of watermelon MYB proteins implicated in the plant stress responses. The comprehensive investigation of watermelon MYB genes in this study provides a useful reference for future cloning and functional analysis of watermelon MYB proteins. Keywords: watermelon, MYB transcription factor, abiotic stress, phylogenetic analysis

  11. The genome and structural proteome of an ocean siphovirus: a new window into the cyanobacterial ‘mobilome’

    Science.gov (United States)

    Sullivan, Matthew B; Krastins, Bryan; Hughes, Jennifer L; Kelly, Libusha; Chase, Michael; Sarracino, David; Chisholm, Sallie W

    2009-01-01

    Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The ∼108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element

  12. Structure and expression of thyroglobulin gene

    Energy Technology Data Exchange (ETDEWEB)

    Vassart, G; Brocas, H; Christophe, D; de Martynoff, G; Leriche, A; Mercken, L; Pohl, V; van Heuverswyn, B [Institut de Recherche Interdisciplinaire en Biologie Humaine et Nucleaire (IRIBHN), Faculte de Medecine, Universite libre de Bruxelles, Campus Hopital Erasme, Brussels (Belgium)

    1982-01-01

    Thyroglobulin is composed of two 300,000-dalton polypeptide chains, translated from an 8000-base mRNA. Preparation of a full-length cDNA and its cloning in E. coli have led to the demonstration that the polypeptides of thyroglobulin protomers are identical. Used as molecular probes, the cloned cDNA allowed the isolation of a fragment of the thyroglobulin gene. Electron microscopic studies have demonstrated that this gene contains more than 90% intronic material separating small exons (<200 bp). Sequencing of the bovine thyroglobulin structural gene is in progress. Preliminary results show evidence for the existence of repetitive segments. Availability of cloned DNA complementary to bovine and human thyroglobulin mRNA allows the study of genetic defects of thyroglobulin gene expression in the human and in various animal models.

  13. Recognizing genes and other components of genomic structure

    Energy Technology Data Exchange (ETDEWEB)

    Burks, C. (Los Alamos National Lab., NM (USA)); Myers, E. (Arizona Univ., Tucson, AZ (USA). Dept. of Computer Science); Stormo, G.D. (Colorado Univ., Boulder, CO (USA). Dept. of Molecular, Cellular and Developmental Biology)

    1991-01-01

    The Aspen Center for Physics (ACP) sponsored a three-week workshop, with 26 scientists participating, from 28 May to 15 June, 1990. The workshop, entitled Recognizing Genes and Other Components of Genomic Structure, focussed on discussion of current needs and future strategies for developing the ability to identify and predict the presence of complex functional units on sequenced, but otherwise uncharacterized, genomic DNA. We addressed the need for computationally-based, automatic tools for synthesizing available data about individual consensus sequences and local compositional patterns into the composite objects (e.g., genes) that are -- as composite entities -- the true object of interest when scanning DNA sequences. The workshop was structured to promote sustained informal contact and exchange of expertise between molecular biologists, computer scientists, and mathematicians. No participant stayed for less than one week, and most attended for two or three weeks. Computers, software, and databases were available for use as "electronic blackboards" and as the basis for collaborative exploration of ideas being discussed and developed at the workshop. 23 refs., 2 tabs.

  14. Structural constraints in the packaging of bluetongue virus genomic segments.

    Science.gov (United States)

    Burkhardt, Christiane; Sung, Po-Yu; Celma, Cristina C; Roy, Polly

    2014-10-01

    The mechanism used by bluetongue virus (BTV) to ensure the sorting and packaging of its 10 genomic segments is still poorly understood. In this study, we investigated the packaging constraints for two BTV genomic segments from two different serotypes. Segment 4 (S4) of BTV serotype 9 was mutated sequentially and packaging of mutant ssRNAs was investigated by two newly developed RNA packaging assay systems, one in vivo and the other in vitro. Modelling of the mutated ssRNA followed by biochemical data analysis suggested that a conformational motif formed by interaction of the 5' and 3' ends of the molecule was necessary and sufficient for packaging. A similar structural signal was also identified in S8 of BTV serotype 1. Furthermore, the same conformational analysis of secondary structures for positive-sense ssRNAs was used to generate a chimeric segment that maintained the putative packaging motif but contained unrelated internal sequences. This chimeric segment was packaged successfully, confirming that the motif identified directs the correct packaging of the segment. © 2014 The Authors.

  15. Identification of genomic indels and structural variations using split reads

    Directory of Open Access Journals (Sweden)

    Urban Alexander E

    2011-07-01

    Full Text Available Abstract Background Recent studies have demonstrated the genetic significance of insertions, deletions, and other more complex structural variants (SVs) in the human population. With the development of the next-generation sequencing technologies, high-throughput surveys of SVs on the whole-genome level have become possible. Here we present split-read identification, calibrated (SRiC), a sequence-based method for SV detection. Results We start by mapping each read to the reference genome in standard fashion using gapped alignment. Then to identify SVs, we score each of the many initial mappings with an assessment strategy designed to take into account both sequencing and alignment errors (e.g. scoring more highly events gapped in the center of a read). All current SV calling methods have multilevel biases in their identifications due to both experimental and computational limitations (e.g. calling more deletions than insertions). A key aspect of our approach is that we calibrate all our calls against synthetic data sets generated from simulations of high-throughput sequencing (with realistic error models). This allows us to calculate sensitivity and the positive predictive value under different parameter-value scenarios and for different classes of events (e.g. long deletions vs. short insertions). We run our calculations on representative data from the 1000 Genomes Project. Coupling the observed numbers of events on chromosome 1 with the calibrations gleaned from the simulations (for different length events) allows us to construct a relatively unbiased estimate for the total number of SVs in the human genome across a wide range of length scales. We estimate in particular that an individual genome contains ~670,000 indels/SVs. Conclusions Compared with the existing read-depth and read-pair approaches for SV identification, our method can pinpoint the exact breakpoints of SV events, reveal the actual sequence content of insertions, and cover the whole
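
    The calibration idea in this abstract (sensitivity and positive predictive value measured on simulated data, then used to correct observed call counts) can be illustrated with a minimal sketch. The function names, the counts, and the simple correction formula below are assumptions for illustration only; the abstract does not specify how SRiC combines these quantities.

```python
def calibrate(true_positives, false_positives, false_negatives):
    """Return (sensitivity, ppv) from counts obtained on simulated reads."""
    sensitivity = true_positives / (true_positives + false_negatives)
    ppv = true_positives / (true_positives + false_positives)
    return sensitivity, ppv


def corrected_count(observed_calls, sensitivity, ppv):
    """Estimate the true number of events from a raw call count.

    observed_calls * ppv gives the calls expected to be real;
    dividing by sensitivity scales up for real events that were missed.
    This simple correction is an assumption, not the published method.
    """
    return observed_calls * ppv / sensitivity


# Hypothetical simulation outcome for one event class (e.g. short insertions)
sens, ppv = calibrate(true_positives=90, false_positives=10, false_negatives=30)
print(f"sensitivity={sens:.2f}, ppv={ppv:.2f}")
print(f"corrected estimate for 1000 observed calls: {corrected_count(1000, sens, ppv):.0f}")
```

    In practice the calibration would be repeated per event class and length bin, since the abstract notes that biases differ between, for example, long deletions and short insertions.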

  16. Comparative analysis of codon usage patterns and identification of predicted highly expressed genes in five Salmonella genomes

    Directory of Open Access Journals (Sweden)

    Mondal U

    2008-01-01

    Full Text Available Purpose: To analyse codon usage patterns of five complete genomes of Salmonella, predict highly expressed genes, examine horizontally transferred pathogenicity-related genes to detect their presence in the strains, and scrutinize the nature of highly expressed genes to draw inferences about their lifestyle. Methods: Protein coding genes, ribosomal protein genes, and pathogenicity-related genes were analysed with Codon W and CAI (codon adaptation index) Calculator. Results: Translational efficiency plays a role in codon usage variation in Salmonella genes. Low bias was noticed in most of the genes. GC3 (guanine cytosine at third position) composition does not influence codon usage variation in the genes of these Salmonella strains. Among the cluster of orthologous groups (COGs), translation, ribosomal structure and biogenesis [J], and energy production and conversion [C] contained the highest number of potentially highly expressed (PHX) genes. Correspondence analysis reveals the conserved nature of the genes. Highly expressed genes were detected. Conclusions: Selection for translational efficiency is the major source of variation of codon usage in the genes of Salmonella. Evolution of pathogenicity-related genes as a unit suggests their ability to infect and exist as a pathogen. Presence of a lot of PHX genes in the information and storage-processing category of COGs indicated their lifestyle and revealed that they were not subjected to genome reduction.
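
    For readers unfamiliar with the codon adaptation index mentioned in the Methods, the following is a minimal sketch of the usual definition: each codon gets a weight relative to the most frequent synonymous codon in a reference set of highly expressed genes, and the CAI of a gene is the geometric mean of the weights of its codons. The toy codon families, the pseudo-count of 0.5, and all names are illustrative assumptions; this is not the code of Codon W or CAI Calculator.

```python
import math
from collections import Counter

# Toy synonymous-codon families (assumption: a real analysis covers the full code).
SYNONYMOUS = {
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
    "F": ["TTT", "TTC"],
}


def codons(seq):
    """Split a coding sequence into complete codons, dropping any trailing bases."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seqs):
    """Weight of each codon = its count / count of the commonest synonym."""
    counts = Counter(c for s in reference_seqs for c in codons(s))
    weights = {}
    for family in SYNONYMOUS.values():
        best = max(counts[c] for c in family)
        if best == 0:
            continue  # amino acid absent from the reference set
        for c in family:
            # Pseudo-count of 0.5 for unseen codons avoids log(0) later.
            weights[c] = max(counts[c], 0.5) / best
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of the codons that have a weight."""
    logs = [math.log(weights[c]) for c in codons(seq) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Hypothetical reference set: AAA x3, AAG x1, GAA x2, GAG x2
reference = ["AAAAAAAAAAAGGAAGAAGAGGAG"]
w = relative_adaptiveness(reference)
print(cai("AAAGAA", w), cai("AAGAAG", w))
```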

  17. The Genomic Pattern of tDNA Operon Expression in E. coli.

    Directory of Open Access Journals (Sweden)

    2005-06-01

    Full Text Available In fast-growing microorganisms, a tRNA concentration profile enriched in major isoacceptors selects for the biased usage of cognate codons. This optimizes translational rate for the least mass invested in the translational apparatus. Such translational streamlining is thought to be growth-regulated, but its genetic basis is poorly understood. First, we found in reanalysis of the E. coli tRNA profile that the degree to which it is translationally streamlined is nearly invariant with growth rate. Then, using least squares multiple regression, we partitioned tRNA isoacceptor pools to predicted tDNA operons from the E. coli K12 genome. Co-expression of tDNAs in operons explains the tRNA profile significantly better than tDNA gene dosage alone. Also, operon expression increases significantly with proximity to the origin of replication, oriC, at all growth rates. Genome location explains about 15% of expression variation in a form, at a given growth rate, that is consistent with replication-dependent gene concentration effects. Yet the change in the tRNA profile with growth rate is less than would be expected from such effects. We estimated per-copy expression rates for all tDNA operons that were consistent with independent estimates for rDNA operons. We also found that tDNA operon location, and the location dependence of expression, were significantly different in the leading and lagging strands. The operonic organization and genomic location of tDNA operons are significant factors influencing their expression. Nonrandom patterns of location and strandedness shown by tDNA operons in E. coli suggest that their genomic architecture may be under selection to satisfy physiological demand for tRNA expression at high growth rates.

  18. Secure web book to store structural genomics research data.

    Science.gov (United States)

    Manjasetty, Babu A; Höppner, Klaus; Mueller, Uwe; Heinemann, Udo

    2003-01-01

    Recently established collaborative structural genomics programs aim at significantly accelerating the crystal structure analysis of proteins. These large-scale projects require efficient data management systems to ensure seamless collaboration between different groups of scientists working towards the same goal. Within the Berlin-based Protein Structure Factory, the synchrotron X-ray data collection and the subsequent crystal structure analysis tasks are located at BESSY, a third-generation synchrotron source. To organize file-based communication and data transfer at the BESSY site of the Protein Structure Factory, we have developed the web-based BCLIMS, the BESSY Crystallography Laboratory Information Management System. BCLIMS is a relational data management system which is powered by MySQL as the database engine and Apache HTTP as the web server. The database interface routines are written in the Python programming language. The software is freely available to academic users. Here we describe the storage, retrieval and manipulation of laboratory information, mainly pertaining to the synchrotron X-ray diffraction experiments and the subsequent protein structure analysis, using BCLIMS.

  19. Training set optimization under population structure in genomic selection.

    Science.gov (United States)

    Isidro, Julio; Jannink, Jean-Luc; Akdemir, Deniz; Poland, Jesse; Heslot, Nicolas; Sorrells, Mark E

    2015-01-01

    Population structure must be evaluated before optimization of the training set population. Maximizing the phenotypic variance captured by the training set is important for optimal performance. The optimization of the training set (TRS) in genomic selection has received much interest in both animal and plant breeding, because it is critical to the accuracy of the prediction models. In this study, five different TRS sampling algorithms, stratified sampling, mean of the coefficient of determination (CDmean), mean of predictor error variance (PEVmean), stratified CDmean (StratCDmean) and random sampling, were evaluated for prediction accuracy in the presence of different levels of population structure. In the presence of population structure, the most phenotypic variation captured by a sampling method in the TRS is desirable. The wheat dataset showed mild population structure, and CDmean and stratified CDmean methods showed the highest accuracies for all the traits except for test weight and heading date. The rice dataset had strong population structure and the approach based on stratified sampling showed the highest accuracies for all traits. In general, CDmean minimized the relationship between genotypes in the TRS, maximizing the relationship between TRS and the test set. This makes it suitable as an optimization criterion for long-term selection. Our results indicated that the best selection criterion used to optimize the TRS seems to depend on the interaction of trait architecture and population structure.

  20. Combining Functional and Structural Genomics to Sample the Essential Burkholderia Structome

    Science.gov (United States)

    Baugh, Loren; Gallagher, Larry A.; Patrapuvich, Rapatbhorn; Clifton, Matthew C.; Gardberg, Anna S.; Edwards, Thomas E.; Armour, Brianna; Begley, Darren W.; Dieterich, Shellie H.; Dranow, David M.; Abendroth, Jan; Fairman, James W.; Fox, David; Staker, Bart L.; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W.; Stacy, Robin; Myler, Peter J.; Stewart, Lance J.; Manoil, Colin; Van Voorhis, Wesley C.

    2013-01-01

    Background The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. Methodology/Principal Findings We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infection Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an “ortholog rescue” strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. Conclusions/Significance This collection of structures, solubility and experimental essentiality data

  1. Combining functional and structural genomics to sample the essential Burkholderia structome.

    Directory of Open Access Journals (Sweden)

    Loren Baugh

    Full Text Available The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an "ortholog rescue" strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against

  2. Combining functional and structural genomics to sample the essential Burkholderia structome.

    Science.gov (United States)

    Baugh, Loren; Gallagher, Larry A; Patrapuvich, Rapatbhorn; Clifton, Matthew C; Gardberg, Anna S; Edwards, Thomas E; Armour, Brianna; Begley, Darren W; Dieterich, Shellie H; Dranow, David M; Abendroth, Jan; Fairman, James W; Fox, David; Staker, Bart L; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W; Stacy, Robin; Myler, Peter J; Stewart, Lance J; Manoil, Colin; Van Voorhis, Wesley C

    2013-01-01

    The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an "ortholog rescue" strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against infections and diseases

  3. TIGER: Toolbox for integrating genome-scale metabolic models, expression data, and transcriptional regulatory networks

    Directory of Open Access Journals (Sweden)

    Jensen Paul A

    2011-09-01

    Full Text Available Abstract Background Several methods have been developed for analyzing genome-scale models of metabolism and transcriptional regulation. Many of these methods, such as Flux Balance Analysis, use constrained optimization to predict relationships between metabolic flux and the genes that encode and regulate enzyme activity. Recently, mixed integer programming has been used to encode these gene-protein-reaction (GPR) relationships into a single optimization problem, but these techniques are often of limited generality and lack a tool for automating the conversion of rules to a coupled regulatory/metabolic model. Results We present TIGER, a Toolbox for Integrating Genome-scale Metabolism, Expression, and Regulation. TIGER converts a series of generalized, Boolean or multilevel rules into a set of mixed integer inequalities. The package also includes implementations of existing algorithms to integrate high-throughput expression data with genome-scale models of metabolism and transcriptional regulation. We demonstrate how TIGER automates the coupling of a genome-scale metabolic model with GPR logic and models of transcriptional regulation, thereby serving as a platform for algorithm development and large-scale metabolic analysis. Additionally, we demonstrate how TIGER's algorithms can be used to identify inconsistencies and improve existing models of transcriptional regulation with examples from the reconstructed transcriptional regulatory network of Saccharomyces cerevisiae. Conclusion The TIGER package provides a consistent platform for algorithm development and extending existing genome-scale metabolic models with regulatory networks and high-throughput data.
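
    The conversion of Boolean rules into mixed integer inequalities can be illustrated with the textbook linearizations of AND and OR over binary variables. The sketch below checks by brute force that each set of inequalities admits exactly the assignments allowed by the Boolean rule. It is a generic illustration and an assumption about the kind of encoding involved; TIGER's own formulation, which also handles multilevel rules, may differ.

```python
from itertools import product


def and_constraints(x, y, z):
    """Inequalities that hold exactly when z = x AND y (all variables binary)."""
    return z <= x and z <= y and z >= x + y - 1


def or_constraints(x, y, z):
    """Inequalities that hold exactly when z = x OR y (all variables binary)."""
    return z >= x and z >= y and z <= x + y


def encodes(constraint, truth, n_inputs):
    """Check every binary assignment: feasible iff z equals the Boolean value."""
    for bits in product((0, 1), repeat=n_inputs + 1):
        *inputs, z = bits
        if constraint(*inputs, z) != (z == truth(*inputs)):
            return False
    return True


# A hypothetical GPR rule "reaction active if geneA AND geneB" would use and_constraints.
print(encodes(and_constraints, lambda x, y: x & y, 2))
print(encodes(or_constraints, lambda x, y: x | y, 2))
```

    Because the inequalities are linear in the binary variables, they can be appended directly to the constraint set of a flux balance problem, which is what makes a single coupled optimization possible.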

  4. DNA microarrays of baculovirus genomes: differential expression of viral genes in two susceptible insect cell lines.

    Science.gov (United States)

    Yamagishi, J; Isobe, R; Takebuchi, T; Bando, H

    2003-03-01

    We describe, for the first time, the generation of a viral DNA chip for simultaneous expression measurements of nearly all known open reading frames (ORFs) in the best-studied members of the family Baculoviridae, Autographa californica multiple nucleopolyhedrovirus (AcMNPV) and Bombyx mori nucleopolyhedrovirus (BmNPV). In this study, a viral DNA chip (Ac-BmNPV chip) was fabricated and used to characterize the viral gene expression profile for AcMNPV in different cell types. The viral chip is composed of microarrays of viral DNA prepared by robotic deposition of PCR-amplified viral DNA fragments on glass for ORFs in the NPV genome. Viral gene expression was monitored by hybridization to the DNA fragment microarrays with fluorescently labeled cDNAs prepared from infected Spodoptera frugiperda, Sf9 cells and Trichoplusia ni, TnHigh-Five cells, the latter a major producer of baculovirus and recombinant proteins. A comparison of expression profiles of known ORFs in AcMNPV elucidated six genes (ORF150, p10, pk2, and three late gene expression factor genes lef-3, p35 and lef-6) the expression of each of which was regulated differently in the two cell lines. Most of these genes are known to be closely involved in the viral life cycle such as in DNA replication, late gene expression and the release of polyhedra from infected cells. These results imply that the differential expression of these viral genes accounts for the differences in viral replication between these two cell lines. Thus, these fabricated microarrays of NPV DNA which allow a rapid analysis of gene expression at the viral genome level should greatly speed the functional analysis of large genomes of NPV.

  5. Further statistical analysis for genome-wide expression evolution in primate brain/liver/fibroblast tissue

    Directory of Open Access Journals (Sweden)

    Gu Jianying

    2004-05-01

    Full Text Available Abstract In spite of only a 1-2 per cent genomic DNA sequence difference, humans and chimpanzees differ considerably in behaviour and cognition. Affymetrix microarray technology provides a novel approach to addressing a long-term debate on whether the difference between humans and chimpanzees results from the alteration of gene expressions. Here, we used several statistical methods (distance method, two-sample t-tests, regularised t-tests, ANOVA and bootstrapping) to detect the differential expression pattern between humans and great apes. Our analysis shows that the pattern we observed before is robust against various statistical methods; that is, the pronounced expression changes occurred on the human lineage after the split from chimpanzees, and that the dramatic brain expression alterations in humans may be mainly driven by a set of genes with increased expression (up-regulated) rather than decreased expression (down-regulated).
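
    Two of the methods listed in this abstract, the two-sample t-test and bootstrapping, can be sketched for a single gene as follows. The abstract does not say which t-test variant or bootstrap scheme was used, so the Welch statistic, the percentile interval, the function names, and the expression values below are all assumptions chosen for illustration.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances assumed)."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    return (mean(a) - mean(b)) / math.sqrt(va + vb)


def bootstrap_diff(a, b, n_boot=2000, seed=0):
    """Percentile 95% interval for the difference in means, by resampling."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resampled_a = [rng.choice(a) for _ in a]
        resampled_b = [rng.choice(b) for _ in b]
        diffs.append(mean(resampled_a) - mean(resampled_b))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Hypothetical log-expression values for one gene in two species
human = [8.1, 8.4, 8.0, 8.6]
chimp = [7.2, 7.5, 7.1, 7.4]
print(welch_t(human, chimp))
print(bootstrap_diff(human, chimp))
```

    A gene would be flagged as differentially expressed when the statistic is large in magnitude or the bootstrap interval excludes zero; with thousands of genes, a multiple-testing correction would also be needed.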

  6. Simultaneous Structural Variation Discovery in Multiple Paired-End Sequenced Genomes

    Science.gov (United States)

    Hormozdiari, Fereydoun; Hajirasouliha, Iman; McPherson, Andrew; Eichler, Evan E.; Sahinalp, S. Cenk

    Next generation sequencing technologies have been decreasing the costs and increasing the world-wide capacity for sequence production at an unprecedented rate, enabling the initiation of large-scale projects that aim to sequence almost 2000 genomes [1]. Structural variation detection promises to be one of the key diagnostic tools for cancer and other diseases with genomic origin. In this paper, we study the problem of detecting structural variation events in two or more sequenced genomes through high throughput sequencing. We propose to move from the current model of (1) detecting genomic variations in single next generation sequenced (NGS) donor genomes independently, and (2) checking whether two or more donor genomes indeed agree or disagree on the variations (in this paper we name this framework Independent Structural Variation Discovery and Merging - ISV&M), to a new model in which we detect structural variation events among multiple genomes simultaneously.

  7. Genome-wide expression in veterans with schizophrenia further validates the immune hypothesis for schizophrenia.

    Science.gov (United States)

    Fries, Gabriel R; Dimitrov, Dimitre H; Lee, Shuko; Braida, Nicole; Yantis, Jesse; Honaker, Craig; Cuellar, Joe; Walss-Bass, Consuelo

    2018-02-01

    This study aimed to test whether a dysregulation of gene expression may be the underlying cause of previously reported elevated levels of inflammatory cytokines in veterans with schizophrenia. We performed a genome-wide expression analysis in peripheral blood mononuclear cells from veterans with schizophrenia and controls, and our results show that 167 genes and putative loci were differently expressed between groups. These genes were enriched primarily for pathways related to inflammatory mechanisms and formed networks related to cell death and survival, immune cell trafficking, among others, which is in line with previous reports and further validates the inflammatory hypothesis of schizophrenia. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Morphological, Genome and Gene Expression Changes in Newly Induced Autopolyploid Chrysanthemum lavandulifolium (Fisch. ex Trautv.) Makino

    Directory of Open Access Journals (Sweden)

    Ri Gao

    2016-10-01

    Full Text Available Autopolyploidy is widespread in higher plants and plays an important role in the process of evolution. The present study successfully induced autotetraploids from Chrysanthemum lavandulifolium by colchicine. The morphological, genomic, transcriptomic, and epigenetic changes between tetraploid and diploid plants were investigated. Ligulate flowers, tubular flowers and leaves of tetraploid plants were larger than those of the diploid plants. Compared with diploid plants, the genome changed as a consequence of polyploidization in tetraploid plants, namely, 1.1% lost fragments and 1.6% novel fragments occurred. In addition, DNA methylation increased after genome doubling in tetraploid plants. Among 485 common transcript-derived fragments (TDFs), which existed in tetraploid and diploid progenitors, 62 fragments were detected as differentially expressed TDFs, 6.8% of TDFs exhibited up-regulated gene expression in the tetraploid plants and 6.0% exhibited down-regulation. The present study provides a reference for further studying the autopolyploidization role in the evolution of C. lavandulifolium. In conclusion, the autopolyploid C. lavandulifolium showed a global change in morphology, genome and gene expression compared with corresponding diploid.

  9. The RNAPII-CTD Maintains Genome Integrity through Inhibition of Retrotransposon Gene Expression and Transposition.

    Directory of Open Access Journals (Sweden)

    Maria J Aristizabal

    2015-10-01

    Full Text Available RNA polymerase II (RNAPII) contains a unique C-terminal domain that is composed of heptapeptide repeats and which plays important regulatory roles during gene expression. RNAPII is responsible for the transcription of most protein-coding genes, a subset of non-coding genes, and retrotransposons. Retrotransposon transcription is the first step in their multiplication cycle, given that the RNA intermediate is required for the synthesis of cDNA, the material that is ultimately incorporated into a new genomic location. Retrotransposition can have grave consequences to genome integrity, as integration events can change the gene expression landscape or lead to alteration or loss of genetic information. Given that RNAPII transcribes retrotransposons, we sought to investigate if the RNAPII-CTD played a role in the regulation of retrotransposon gene expression. Importantly, we found that the RNAPII-CTD functioned to maintain genome integrity through inhibition of retrotransposon gene expression, as reducing CTD length significantly increased expression and transposition rates of Ty1 elements. Mechanistically, the increased Ty1 mRNA levels in the rpb1-CTD11 mutant were partly due to Cdk8-dependent alterations to the RNAPII-CTD phosphorylation status. In addition, Cdk8 alone contributed to Ty1 gene expression regulation by altering the occupancy of the gene-specific transcription factor Ste12. Loss of STE12 and TEC1 suppressed growth phenotypes of the RNAPII-CTD truncation mutant. Collectively, our results implicate Ste12 and Tec1 as general and important contributors to the Cdk8, RNAPII-CTD regulatory circuitry as it relates to the maintenance of genome integrity.

  10. Expressed Peptide Tags: An additional layer of data for genome annotation

    Energy Technology Data Exchange (ETDEWEB)

    Savidor, Alon [ORNL; Donahoo, Ryan S [ORNL; Hurtado-Gonzales, Oscar [University of Tennessee, Knoxville (UTK); Verberkmoes, Nathan C [ORNL; Shah, Manesh B [ORNL; Lamour, Kurt H [ORNL; McDonald, W Hayes [ORNL

    2006-01-01

    While genome sequencing is becoming ever more routine, genome annotation remains a challenging process. Identification of the coding sequences within the genomic milieu presents a tremendous challenge, especially for eukaryotes with their complex gene architectures. Here we present a method to assist the annotation process through the use of proteomic data and bioinformatics. Mass spectra of digested protein preparations of the organism of interest were acquired and searched against a protein database created by a six frame translation of the genome. The identified peptides were mapped back to the genome, compared to the current annotation, and then categorized as supporting or extending the current genome annotation. We named the classified peptides Expressed Peptide Tags (EPTs). The well annotated bacterium Rhodopseudomonas palustris was used as a control for the method and showed high degree of correlation between EPT mapping and the current annotation, with 86% of the EPTs confirming existing gene calls and less than 1% of the EPTs expanding on the current annotation. The eukaryotic plant pathogens Phytophthora ramorum and Phytophthora sojae, whose genomes have been recently sequenced and are much less well annotated, were also subjected to this method. A series of algorithmic steps were taken to increase the confidence of EPT identification for these organisms, including generation of smaller sub-databases to be searched against, and definition of EPT criteria that accommodates the more complex eukaryotic gene architecture. As expected, the analysis of the Phytophthora species showed less correlation between EPT mapping and their current annotation. While ~77% of Phytophthora EPTs supported the current annotation, a portion of them (7.2% and 12.6% for P. ramorum and P. sojae, respectively) suggested modification to current gene calls or identified novel genes that were missed by the current genome annotation of these organisms.
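
    The six frame translation used to build the protein search database in this abstract can be sketched as below: the genome is translated in three reading frames on the forward strand and three on the reverse complement. The sketch assumes the standard genetic code and uses hypothetical function names; it does not reproduce the authors' pipeline, which also split the database and applied EPT criteria.

```python
from itertools import product

# Standard genetic code, codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(seq):
    """Map frame labels (+1..+3, -1..-3) to translated protein strings."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


for frame, protein in six_frame_translate("ATGGCCTAA").items():
    print(frame, protein)
```

    Peptides identified from mass spectra against such translations can then be mapped back to genomic coordinates using the frame label and the codon offset.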

  11. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    International Nuclear Information System (INIS)

    Arneodo, Alain; Vaillant, Cedric; Audit, Benjamin; Argoul, Francoise; D'Aubenton-Carafa, Yves; Thermes, Claude

    2011-01-01

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  12. Fast identification of folded human protein domains expressed in E. coli suitable for structural analysis

    Directory of Open Access Journals (Sweden)

    Schlegel Brigitte

    2004-03-01

    Full Text Available Abstract Background High-throughput protein structure analysis of individual protein domains requires analysis of large numbers of expression clones to identify suitable constructs for structure determination. For this purpose, methods need to be implemented for fast and reliable screening of the expressed proteins as early as possible in the overall process from cloning to structure determination. Results 88 different E. coli expression constructs for 17 human protein domains were analysed using high-throughput cloning, purification and folding analysis to obtain candidates suitable for structural analysis. After 96 deep-well microplate expression and automated protein purification, protein domains were directly analysed using 1D 1H-NMR spectroscopy. In addition, analytical hydrophobic interaction chromatography (HIC) was used to detect natively folded protein. With these two analytical methods, six constructs (representing two domains) were quickly identified as being well folded and suitable for structural analysis. Conclusion The described approach facilitates high-throughput structural analysis. Clones expressing natively folded proteins suitable for NMR structure determination were quickly identified upon small scale expression screening using 1D 1H-NMR and/or analytical HIC. This procedure is especially effective as a fast and inexpensive screen for the 'low hanging fruits' in structural genomics.

  13. Expression of homing endonuclease gene and insertion-like element in sea anemone mitochondrial genomes: Lesson learned from Anemonia viridis.

    Science.gov (United States)

    Chi, Sylvia Ighem; Urbarova, Ilona; Johansen, Steinar D

    2018-04-30

    The mitochondrial genomes of sea anemones are dynamic in structure. Invasion by genetic elements, such as self-catalytic group I introns or insertion-like sequences, contributes to sea anemone mitochondrial genome expansion and complexity. By using next generation sequencing we investigated the complete mtDNAs and corresponding transcriptomes of the temperate sea anemone Anemonia viridis and its closer tropical relative Anemonia majano. Two versions of fused homing endonuclease gene (HEG) organization were observed among the Actiniidae sea anemones: in-frame gene fusion and pseudo-gene fusion. We provided support for the pseudo-gene fusion organization in Anemonia species, resulting in a repressed HEG from the COI-884 group I intron. orfA, a putative protein-coding gene with insertion-like features, was present in both Anemonia species. Interestingly, orfA and COI expression were significantly up-regulated upon long-term environmental stress corresponding to low seawater pH conditions. This study provides new insights into the dynamics of sea anemone mitochondrial genome structure and function. Copyright © 2018 Elsevier B.V. All rights reserved.

  14. XTHs from Fragaria vesca: genomic structure and transcriptomic analysis in ripening fruit and other tissues.

    Science.gov (United States)

    Opazo, María Cecilia; Lizana, Rodrigo; Stappung, Yazmina; Davis, Thomas M; Herrera, Raúl; Moya-León, María Alejandra

    2017-11-07

    Fragaria vesca or 'woodland strawberry' has emerged as an attractive model for the study of ripening of non-climacteric fruit. It has several advantages, such as its small genome and its diploidy. The recent availability of the complete sequence of its genome opens the possibility for further analysis and its use as a reference species. Fruit softening is a physiological event and involves many biochemical changes that take place at the final stages of fruit development; among them, the remodeling of cell walls by the action of a set of enzymes. Xyloglucan endotransglycosylase/hydrolase (XTH) is a cell wall-associated enzyme, which is encoded by a multigene family. Its action modifies the structure of xyloglucans, a diverse group of polysaccharides that crosslink with cellulose microfibrils, thereby affecting the functional structure of the cell wall. The aim of this work is to identify the XTH-encoding genes present in F. vesca and to determine their transcription levels in ripening fruit. The search resulted in the identification of 26 XTH-encoding genes, named FvXTHs. Genetic structure and phylogenetic analyses were performed, allowing the classification of FvXTH genes into three phylogenetic groups: 17 in group I/II, 2 in group IIIA and 4 in group IIIB. Two sequences were included in the ancestral group. Through a comparative analysis, characteristic structural protein domains were found in FvXTH protein sequences. In addition, expression analyses of FvXTHs by qPCR were performed in fruit at different developmental and ripening stages, as well as in other tissues. The results showed a diverse expression pattern of FvXTHs in several tissues, although most of them are highly expressed in roots. Their expression patterns are not related to their respective phylogenetic groups. In addition, most FvXTHs are expressed in ripe fruit, and interestingly, some of them (FvXTH 18 and 20, belonging to phylogenetic group I/II, and FvXTH 25 and 26 to group IIIB) display an

  15. Systematic drug safety evaluation based on public genomic expression (Connectivity Map) data: myocardial and infectious adverse reactions as application cases.

    Science.gov (United States)

    Wang, Kejian; Weng, Zuquan; Sun, Liya; Sun, Jiazhi; Zhou, Shu-Feng; He, Lin

    2015-02-13

    Adverse drug reaction (ADR) is of great importance to both regulatory agencies and the pharmaceutical industry. Various techniques, such as quantitative structure-activity relationship (QSAR) and animal toxicology, are widely used to identify potential risks during the preclinical stage of drug development. Despite these efforts, drugs with safety liabilities can still pass through safety checkpoints and enter the market. This situation raises the concern that conventional chemical structure analysis and phenotypic screening are not sufficient to avoid all clinical adverse events. Genomic expression data following in vitro drug treatments characterize drug actions and thus have become widely used in drug repositioning. In the present study, we explored prediction of ADRs based on the drug-induced gene-expression profiles from cultured human cells in the Connectivity Map (CMap) database. The results showed that drugs inducing comparable ADRs generally lead to similar CMap expression profiles. Based on such ADR-gene expression association, we established prediction models for various ADRs, including severe myocardial and infectious events. Drugs with FDA boxed warnings of safety liability were effectively identified. We therefore suggest that drug-induced gene expression change, in combination with effective computational methods, may provide a new dimension of information to facilitate systematic drug safety evaluation. Copyright © 2015 Elsevier Inc. All rights reserved.

  16. RNA structural constraints in the evolution of the influenza A virus genome NP segment

    NARCIS (Netherlands)

    A.P. Gultyaev (Alexander); A. Tsyganov-Bodounov (Anton); M.I. Spronken (Monique); S. Van Der Kooij (Sander); R.A.M. Fouchier (Ron); R.C.L. Olsthoorn (René)

    2014-01-01

    Conserved RNA secondary structures were predicted in the nucleoprotein (NP) segment of the influenza A virus genome using comparative sequence and structure analysis. A number of structural elements exhibiting nucleotide covariations were identified over the whole segment length,

  17. Macromolecular structure determination in the post-genome era

    International Nuclear Information System (INIS)

    Kuhn, P.; Soltis, S.M.

    2001-01-01

    Recent advances in genetics, molecular biology and crystallographic instrumentation and methodology have led to a revolution in the field of Structural Molecular Biology (SMB). These combined advances have paved the way to a more complete and detailed understanding of the biological macromolecules that make up an organism, both in terms of their individual functions and also the interactions between them. In this paper we describe a large-scale, genomic approach to the three-dimensional structure determination of macromolecules and their complexes, using high-throughput methodology to streamline all aspects of the process. This task requires the development of automated high-intensity synchrotron beam lines for X-ray diffraction data collection from single crystal samples. Furthermore, these beam lines must be operated within a sophisticated software and hardware environment, which is capable of delivering a completely automated structure determination pipeline. The SMB resource at SSRL is developing a system for the structure determination steps of this process, starting with the initial characterization of the frozen sample, followed by data collection, data reduction, phase determination, and model building. This paper focuses on the data collection elements of this high-throughput system

  18. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain

    DEFF Research Database (Denmark)

    Sükösd, Zsuzsanna; Andersen, Ebbe Sloth; Seemann, Ernst Stefan

    2015-01-01

    of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interesting, a set of long distance interactions form a core organizing structure (COS) that organize the genome into three major structural domains. Despite overlapping...

  19. Genomic expression analysis of rat chromosome 4 for skeletal traits at femoral neck.

    Science.gov (United States)

    Alam, Imranul; Sun, Qiwei; Liu, Lixiang; Koller, Daniel L; Liu, Yunlong; Edenberg, Howard J; Econs, Michael J; Foroud, Tatiana; Turner, Charles H

    2008-10-08

    Hip fracture is the most devastating osteoporotic fracture type with significant morbidity and mortality. Several studies in humans and animal models identified chromosomal regions linked to hip size and bone mass. Previously, we identified that the region of 4q21-q41 on rat chromosome (Chr) 4 harbors multiple femoral neck quantitative trait loci (QTLs) in inbred Fischer 344 (F344) and Lewis (LEW) rats. The purpose of this study is to identify the candidate genes for femoral neck structure and density by correlating gene expression in the proximal femur with the femoral neck phenotypes linked to the QTLs on Chr 4. RNA was extracted from proximal femora of 4-wk-old rats from F344 and LEW strains, and two other strains, Copenhagen 2331 and Dark Agouti, were used as a negative control. Microarray analysis was performed using Affymetrix Rat Genome 230 2.0 arrays. A total of 99 genes in the 4q21-q41 region were differentially expressed (P ... level of the gene in that strain. A total of 18 candidate genes were strongly correlated (r² > 0.50) with femoral neck width and prioritized for further analysis. Quantitative PCR analysis confirmed 14 of 18 of the candidate genes. Ingenuity pathway analysis revealed several direct or indirect relationships among the candidate genes related to angiogenesis (VEGF), bone growth (FGF2), bone formation (IGF2 and IGF2BP3), and resorption (TNF). This study provides a shortened list of genetic determinants of skeletal traits at the hip and may lead to novel approaches for prevention and treatment of hip fracture.
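
    The correlation screen described in this record (keeping genes whose expression has a squared correlation above 0.50 with femoral neck width) can be illustrated with a minimal sketch. The gene names and numbers below are hypothetical toy data, and the use of a plain Pearson correlation is an assumption; the record does not state which estimator the authors used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear signal
    return sxy / sqrt(sxx * syy)


def strongly_correlated(expression_by_gene, phenotype, r2_cutoff=0.50):
    """Return {gene: r^2} for genes whose r^2 with the phenotype exceeds the cutoff."""
    selected = {}
    for gene, values in expression_by_gene.items():
        r2 = pearson_r(values, phenotype) ** 2
        if r2 > r2_cutoff:
            selected[gene] = r2
    return selected


# Hypothetical toy data: one value per strain (not taken from the study).
neck_width = [1.0, 2.0, 3.0, 4.0]
expression = {
    "gene_a": [2.0, 4.1, 5.9, 8.0],    # tracks the phenotype closely
    "gene_b": [1.0, -1.0, -1.0, 1.0],  # no linear relationship
}
hits = strongly_correlated(expression, neck_width)
print(sorted(hits))
```

    With only a handful of strains, a cutoff on r² is a ranking device rather than a significance test, which is consistent with the record describing the genes as "prioritized for further analysis".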

  20. Rapid Identification of Potential Drugs for Diabetic Nephropathy Using Whole-Genome Expression Profiles of Glomeruli

    Directory of Open Access Journals (Sweden)

    Jingsong Shi

    2016-01-01

    Full Text Available Objective. To investigate potential drugs for diabetic nephropathy (DN) using whole-genome expression profiles and the Connectivity Map (CMAP). Methodology. Eighteen Chinese Han DN patients and six normal controls were included in this study. Whole-genome expression profiles of microdissected glomeruli were measured using the Affymetrix human U133 plus 2.0 chip. Differentially expressed genes (DEGs) between late stage and early stage DN samples and the CMAP database were used to identify potential drugs for DN using bioinformatics methods. Results. (1) A total of 1065 DEGs (FDR ... 1.5) were found in late stage DN patients compared with early stage DN patients. (2) Piperlongumine, 15d-PGJ2 (15-delta prostaglandin J2), vorinostat, and trichostatin A were predicted to be the most promising potential drugs for DN, acting as NF-κB inhibitors, histone deacetylase inhibitors (HDACIs), PI3K pathway inhibitors, or PPARγ agonists, respectively. Conclusion. Using whole-genome expression profiles and the CMAP database, we rapidly predicted potential DN drugs, and therapeutic potential was confirmed by previously published studies. Animal experiments and clinical trials are needed to confirm both the safety and efficacy of these drugs in the treatment of DN.

  1. Predicting effects of structural stress in a genome-reduced model bacterial metabolism

    Science.gov (United States)

    Güell, Oriol; Sagués, Francesc; Serrano, M. Ángeles

    2012-08-01

    Mycoplasma pneumoniae is a human pathogen recently proposed as a genome-reduced model for bacterial systems biology. Here, we study the response of its metabolic network to different forms of structural stress, including removal of individual and pairs of reactions and knockout of genes and clusters of co-expressed genes. Our results reveal a network architecture as robust as that of other model bacteria regarding multiple failures, although less robust against individual reaction inactivation. Interestingly, metabolite motifs associated with reactions can predict the propagation of inactivation cascades and damage amplification effects arising in double knockouts. We also detect a significant correlation between gene essentiality and damage produced by single gene knockouts, and find that genes controlling high-damage reactions tend to be expressed independently of each other, a functional switch mechanism that, simultaneously, acts as a genetic firewall to protect metabolism. Prediction of failure propagation is crucial for metabolic engineering or disease treatment.
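
    The idea of an inactivation cascade following a reaction knockout can be sketched on a toy network. This is a simplified illustration under stated assumptions, not the authors' algorithm: a reaction is treated as inactive once any of its substrates is neither an external nutrient nor produced by a still-active reaction, and the reaction names, metabolites and the `cascade` helper are all hypothetical. Self-sustaining cycles are not handled specially.

```python
def cascade(reactions, nutrients, knocked_out):
    """Return the set of reactions inactive after the knockout propagates.

    reactions: {name: (set_of_substrates, set_of_products)}
    nutrients: metabolites assumed to be always available from the medium
    knocked_out: names of the reactions removed initially
    """
    inactive = set(knocked_out)
    changed = True
    while changed:
        changed = False
        # Metabolites still available given the currently active reactions.
        available = set(nutrients)
        for name, (_, products) in reactions.items():
            if name not in inactive:
                available |= products
        # A reaction missing any substrate is switched off (assumed rule).
        for name, (substrates, _) in reactions.items():
            if name not in inactive and not substrates <= available:
                inactive.add(name)
                changed = True
    return inactive


# Hypothetical toy network: glc is the only nutrient.
toy = {
    "r1": ({"glc"}, {"a"}),
    "r2": ({"a"}, {"b"}),
    "r3": ({"glc"}, {"c"}),
    "r4": ({"b", "c"}, {"d"}),
}
single = cascade(toy, {"glc"}, {"r1"})        # r1 -> r2 -> r4 are lost
other = cascade(toy, {"glc"}, {"r3"})         # r3 -> r4 are lost
double = cascade(toy, {"glc"}, {"r1", "r3"})  # pair knockout
print(len(single), len(other), len(double))
```

    The size of the returned set serves as a crude "damage" measure; comparing the damage of a pair knockout with that of its two single knockouts is one way to look for the amplification effects the record mentions.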

  2. Genome-Wide Tuning of Protein Expression Levels to Rapidly Engineer Microbial Traits.

    Science.gov (United States)

    Freed, Emily F; Winkler, James D; Weiss, Sophie J; Garst, Andrew D; Mutalik, Vivek K; Arkin, Adam P; Knight, Rob; Gill, Ryan T

    2015-11-20

    The reliable engineering of biological systems requires quantitative mapping of predictable and context-independent expression over a broad range of protein expression levels. However, current techniques for modifying expression levels are cumbersome and are not amenable to high-throughput approaches. Here we present major improvements to current techniques through the design and construction of E. coli genome-wide libraries using synthetic DNA cassettes that can tune expression over a ∼10⁴ range. The cassettes also contain molecular barcodes that are optimized for next-generation sequencing, enabling rapid and quantitative tracking of alleles that have the highest fitness advantage. We show these libraries can be used to determine which genes and expression levels confer greater fitness to E. coli under different growth conditions.

  3. Genome-wide Identification and Expression Analysis of Half-size ABCG Genes in Malus × domestica

    Directory of Open Access Journals (Sweden)

    Juanjuan MA

    2018-03-01

    Full Text Available Half-size adenosine triphosphate-binding cassette transporter subgroup G (ABCG) genes play crucial roles in regulating the movements of a variety of substrates and have been well studied in several plants. However, half-size ABCGs have not been characterized in detail in apple (Malus × domestica Borkh.). Here, we performed a genome-wide identification and expression analysis of the half-size ABCG gene family in apple. A total of 46 apple half-size ABCGs were identified and divided into six clusters according to the phylogenetic analysis. A gene structural analysis showed that most half-size ABCGs in the same cluster shared a similar exon–intron organization. A gene duplication analysis showed that segmental, tandem and whole-genome duplications could account for the expansion of half-size ABCG transporters in M. domestica. Moreover, a promoter scan, digital expression analysis and RNA-seq revealed that MdABCG21 may be involved in cytokinin transport in roots and that ABCG17 may be involved in the lateral bud development of M. spectabilis ‘Bly114’ by mediating cytokinin transport. The data presented here lay the foundation for further investigations into the biological and physiological processes and functions of half-size ABCG genes in apple. Keywords: apple, ABCG gene, duplication, gene expression

  4. Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Hastie, Alex R.; Cao, Dandan

    2014-01-01

    mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger...... fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides...

  5. Mapping determinants of gene expression plasticity by genetical genomics in C. elegans.

    Directory of Open Access Journals (Sweden)

    Yang Li

    2006-12-01

    Full Text Available Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic response of gene expression also shows heritable difference has not yet been studied. Here we show that differential expression induced by temperatures of 16 °C and 24 °C has a strong genetic component in Caenorhabditis elegans recombinant inbred strains derived from a cross between strains CB4856 (Hawaii) and N2 (Bristol). No less than 59% of 308 trans-acting genes showed a significant eQTL-by-environment interaction, here termed plasticity quantitative trait loci. In contrast, only 8% of an estimated 188 cis-acting genes showed such interaction. This indicates that heritable differences in plastic responses of gene expression are largely regulated in trans. This regulation is spread over many different regulators. However, for one group of trans-genes we found prominent evidence for a common master regulator: a transband of 66 coregulated genes appeared at 24 °C. Our results suggest widespread genetic variation of differential expression responses to environmental impacts and demonstrate the potential of genetical genomics for mapping the molecular determinants of phenotypic plasticity.

  6. Decoherence in yeast cell populations and its implications for genome-wide expression noise.

    Science.gov (United States)

    Briones, M R S; Bosco, F

    2009-01-20

    Gene expression "noise" is commonly defined as the stochastic variation of gene expression levels in different cells of the same population under identical growth conditions. Here, we tested whether this "noise" is amplified with time, as a consequence of decoherence in global gene expression profiles (genome-wide microarrays) of synchronized cells. The stochastic component of transcription causes fluctuations that tend to be amplified as time progresses, leading to a decay of correlations of expression profiles, in perfect analogy with elementary relaxation processes. Measuring decoherence, defined here as a decay in the auto-correlation function of yeast genome-wide expression profiles, we found a slowdown in the decay of correlations, opposite to what would be expected if, as in mixing systems, correlations decay exponentially as the equilibrium state is reached. Our results indicate that the populational variation in gene expression (noise) is a consequence of temporal decoherence, in which the slow decay of correlations is a signature of strong interdependence of the transcription dynamics of different genes.
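
    The decay of correlations described in this record can be illustrated with a small sketch that compares each genome-wide profile with the profile at the first time point. The definition used here (Pearson correlation across genes between time 0 and time t) is an assumption made for illustration; the record does not give the authors' exact auto-correlation estimator, and the profiles below are hypothetical toy values.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no linear signal
    return sxy / sqrt(sxx * syy)


def correlation_decay(profiles):
    """Correlate every time point's profile with the first one.

    profiles: list of expression vectors, one per time point,
              each holding one value per gene in the same gene order.
    """
    reference = profiles[0]
    return [pearson_r(reference, profile) for profile in profiles]


# Hypothetical profiles for four genes at three time points.
toy_profiles = [
    [1.0, 2.0, 3.0, 4.0],  # t = 0 (synchronized population)
    [1.5, 1.5, 3.5, 3.5],  # t = 1
    [2.0, 1.0, 4.0, 3.0],  # t = 2
]
curve = correlation_decay(toy_profiles)
print([round(c, 3) for c in curve])
```

    Plotting such a curve against time and comparing it with an exponential fit is one way to see whether correlations decay more slowly than a simple relaxation process would predict.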

  7. A genome-wide gene expression signature of environmental geography in leukocytes of Moroccan Amazighs.

    Directory of Open Access Journals (Sweden)

    Youssef Idaghdour

    2008-04-01

    Full Text Available The different environments that humans experience are likely to impact physiology and disease susceptibility. In order to estimate the magnitude of the impact of environment on transcript abundance, we examined gene expression in peripheral blood leukocyte samples from 46 desert nomadic, mountain agrarian and coastal urban Moroccan Amazigh individuals. Despite great expression heterogeneity in humans, as much as one third of the leukocyte transcriptome was found to be associated with differences among regions. Genome-wide polymorphism analysis indicates that genetic differentiation in the total sample is limited and is unlikely to explain the expression divergence. Methylation profiling of 1,505 CpG sites suggests limited contribution of methylation to the observed differences in gene expression. Genetic network analysis further implies that specific aspects of immune function are strongly affected by regional factors and may influence susceptibility to respiratory and inflammatory disease. Our results show a strong genome-wide gene expression signature of regional population differences that presumably include lifestyle, geography, and biotic factors, implying that these can play at least as great a role as genetic divergence in modulating gene expression variation in humans.

  8. Insular Celtic population structure and genomic footprints of migration.

    Directory of Open Access Journals (Sweden)

    Ross P Byrne

    2018-01-01

    Full Text Available Previous studies of the genetic landscape of Ireland have suggested homogeneity, with population substructure undetectable using single-marker methods. Here we have harnessed the haplotype-based method fineSTRUCTURE in an Irish genome-wide SNP dataset, identifying 23 discrete genetic clusters which segregate with geographical provenance. Cluster diversity is pronounced in the west of Ireland but reduced in the east where older structure has been eroded by historical migrations. Accordingly, when populations from the neighbouring island of Britain are included, a west-east cline of Celtic-British ancestry is revealed along with a particularly striking correlation between haplotypes and geography across both islands. A strong relationship is revealed between subsets of Northern Irish and Scottish populations, where discordant genetic and geographic affinities reflect major migrations in recent centuries. Additionally, Irish genetic proximity of all Scottish samples likely reflects older strata of communication across the narrowest inter-island crossing. Using GLOBETROTTER we detected Irish admixture signals from Britain and Europe and estimated dates for events consistent with the historical migrations of the Norse-Vikings, the Anglo-Normans and the British Plantations. The influence of the former is greater than previously estimated from Y chromosome haplotypes. In all, we paint a new picture of the genetic landscape of Ireland, revealing structure which should be considered in the design of studies examining rare genetic variation and its association with traits.

  9. Gene structure, phylogeny and expression profile of the sucrose ...

    Indian Academy of Sciences (India)

    Gene structure, phylogeny and expression profile of the sucrose synthase gene family in ....

  10. Integrating genome-wide genetic variations and monocyte expression data reveals trans-regulated gene modules in humans.

    Directory of Open Access Journals (Sweden)

    Maxime Rotival

    2011-12-01

    Full Text Available One major expectation from the transcriptome in humans is to characterize the biological basis of associations identified by genome-wide association studies. So far, few cis expression quantitative trait loci (eQTLs) have been reliably related to disease susceptibility. Trans-regulating mechanisms may play a more prominent role in disease susceptibility. We analyzed 12,808 genes detected in at least 5% of circulating monocyte samples from a population-based sample of 1,490 European unrelated subjects. We applied a method of extraction of expression patterns, independent component analysis, to identify sets of co-regulated genes. These patterns were then related to 675,350 SNPs to identify major trans-acting regulators. We detected three genomic regions significantly associated with co-regulated gene modules. Association of these loci with multiple expression traits was replicated in Cardiogenics, an independent study in which expression profiles of monocytes were available in 758 subjects. The locus 12q13 (lead SNP rs11171739), previously identified as a type 1 diabetes locus, was associated with a pattern including two cis eQTLs, RPS26 and SUOX, and 5 trans eQTLs, one of which (MADCAM1) is a potential candidate for mediating T1D susceptibility. The locus 12q24 (lead SNP rs653178), which has demonstrated extensive disease pleiotropy, including type 1 diabetes, hypertension, and celiac disease, was associated with a pattern strongly correlating to blood pressure level. The strongest trans eQTL in this pattern was CRIP1, a known marker of cellular proliferation in cancer. The locus 12q15 (lead SNP rs11177644) was associated with a pattern driven by two cis eQTLs, LYZ and YEATS4, and including 34 trans eQTLs, several of them tumor-related genes. This study shows that a method exploiting the structure of co-expressions among genes can help identify genomic regions involved in trans regulation of sets of genes and can provide clues for understanding the

  11. Functional Genome Mining for Metabolites Encoded by Large Gene Clusters through Heterologous Expression of a Whole-Genome Bacterial Artificial Chromosome Library in Streptomyces spp.

    Science.gov (United States)

    Xu, Min; Wang, Yemin; Zhao, Zhilong; Gao, Guixi; Huang, Sheng-Xiong; Kang, Qianjin; He, Xinyi; Lin, Shuangjun; Pang, Xiuhua; Deng, Zixin

    2016-01-01

    ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. 
The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including

  12. The Fanconi anemia/BRCA gene network in zebrafish: Embryonic expression and comparative genomics

    Energy Technology Data Exchange (ETDEWEB)

    Titus, Tom A.; Yan Yilin; Wilson, Catherine; Starks, Amber M.; Frohnmayer, Jonathan D.; Bremiller, Ruth A.; Canestro, Cristian; Rodriguez-Mari, Adriana; He Xinjun [Institute of Neuroscience, University of Oregon, 1425 E. 13th Avenue, Eugene, OR 97403 (United States); Postlethwait, John H., E-mail: jpostle@uoneuro.uoregon.edu [Institute of Neuroscience, University of Oregon, 1425 E. 13th Avenue, Eugene, OR 97403 (United States)

    2009-07-31

    Fanconi anemia (FA) is a genetic disease resulting in bone marrow failure, high cancer risks, and infertility, and developmental anomalies including microphthalmia, microcephaly, hypoplastic radius and thumb. Here we present cDNA sequences, genetic mapping, and genomic analyses for the four previously undescribed zebrafish FA genes (fanci, fancj, fancm, and fancn), and show that they reverted to single copy after the teleost genome duplication. We tested the hypothesis that FA genes are expressed during embryonic development in tissues that are disrupted in human patients by investigating fanc gene expression patterns. We found fanc gene maternal message, which can provide Fanc proteins to repair DNA damage encountered in rapid cleavage divisions. Zygotic expression was broad but especially strong in eyes, central nervous system and hematopoietic tissues. In the pectoral fin bud at hatching, fanc genes were expressed specifically in the apical ectodermal ridge, a signaling center for fin/limb development that may be relevant to the radius/thumb anomaly of FA patients. Hatching embryos expressed fanc genes strongly in the oral epithelium, a site of squamous cell carcinomas in FA patients. Larval and adult zebrafish expressed fanc genes in proliferative regions of the brain, which may be related to microcephaly in FA. Mature ovaries and testes expressed fanc genes in specific stages of oocyte and spermatocyte development, which may be related to DNA repair during homologous recombination in meiosis and to infertility in human patients. The intestine strongly expressed some fanc genes specifically in proliferative zones. Our results show that zebrafish has a complete complement of fanc genes in single copy and that these genes are expressed in zebrafish embryos and adults in proliferative tissues that are often affected in FA patients. 
These results support the notion that zebrafish offers an attractive experimental system to help unravel mechanisms relevant not only

  13. The Fanconi anemia/BRCA gene network in zebrafish: embryonic expression and comparative genomics.

    Science.gov (United States)

    Titus, Tom A; Yan, Yi-Lin; Wilson, Catherine; Starks, Amber M; Frohnmayer, Jonathan D; Bremiller, Ruth A; Cañestro, Cristian; Rodriguez-Mari, Adriana; He, Xinjun; Postlethwait, John H

    2009-07-31

    Fanconi anemia (FA) is a genetic disease resulting in bone marrow failure, high cancer risks, and infertility, and developmental anomalies including microphthalmia, microcephaly, hypoplastic radius and thumb. Here we present cDNA sequences, genetic mapping, and genomic analyses for the four previously undescribed zebrafish FA genes (fanci, fancj, fancm, and fancn), and show that they reverted to single copy after the teleost genome duplication. We tested the hypothesis that FA genes are expressed during embryonic development in tissues that are disrupted in human patients by investigating fanc gene expression patterns. We found fanc gene maternal message, which can provide Fanc proteins to repair DNA damage encountered in rapid cleavage divisions. Zygotic expression was broad but especially strong in eyes, central nervous system and hematopoietic tissues. In the pectoral fin bud at hatching, fanc genes were expressed specifically in the apical ectodermal ridge, a signaling center for fin/limb development that may be relevant to the radius/thumb anomaly of FA patients. Hatching embryos expressed fanc genes strongly in the oral epithelium, a site of squamous cell carcinomas in FA patients. Larval and adult zebrafish expressed fanc genes in proliferative regions of the brain, which may be related to microcephaly in FA. Mature ovaries and testes expressed fanc genes in specific stages of oocyte and spermatocyte development, which may be related to DNA repair during homologous recombination in meiosis and to infertility in human patients. The intestine strongly expressed some fanc genes specifically in proliferative zones. Our results show that zebrafish has a complete complement of fanc genes in single copy and that these genes are expressed in zebrafish embryos and adults in proliferative tissues that are often affected in FA patients. 
These results support the notion that zebrafish offers an attractive experimental system to help unravel mechanisms relevant not only

  14. The Fanconi anemia/BRCA gene network in zebrafish: Embryonic expression and comparative genomics

    International Nuclear Information System (INIS)

    Titus, Tom A.; Yan Yilin; Wilson, Catherine; Starks, Amber M.; Frohnmayer, Jonathan D.; Bremiller, Ruth A.; Canestro, Cristian; Rodriguez-Mari, Adriana; He Xinjun; Postlethwait, John H.

    2009-01-01

    Fanconi anemia (FA) is a genetic disease resulting in bone marrow failure, high cancer risks, and infertility, and developmental anomalies including microphthalmia, microcephaly, hypoplastic radius and thumb. Here we present cDNA sequences, genetic mapping, and genomic analyses for the four previously undescribed zebrafish FA genes (fanci, fancj, fancm, and fancn), and show that they reverted to single copy after the teleost genome duplication. We tested the hypothesis that FA genes are expressed during embryonic development in tissues that are disrupted in human patients by investigating fanc gene expression patterns. We found fanc gene maternal message, which can provide Fanc proteins to repair DNA damage encountered in rapid cleavage divisions. Zygotic expression was broad but especially strong in eyes, central nervous system and hematopoietic tissues. In the pectoral fin bud at hatching, fanc genes were expressed specifically in the apical ectodermal ridge, a signaling center for fin/limb development that may be relevant to the radius/thumb anomaly of FA patients. Hatching embryos expressed fanc genes strongly in the oral epithelium, a site of squamous cell carcinomas in FA patients. Larval and adult zebrafish expressed fanc genes in proliferative regions of the brain, which may be related to microcephaly in FA. Mature ovaries and testes expressed fanc genes in specific stages of oocyte and spermatocyte development, which may be related to DNA repair during homologous recombination in meiosis and to infertility in human patients. The intestine strongly expressed some fanc genes specifically in proliferative zones. Our results show that zebrafish has a complete complement of fanc genes in single copy and that these genes are expressed in zebrafish embryos and adults in proliferative tissues that are often affected in FA patients. 
These results support the notion that zebrafish offers an attractive experimental system to help unravel mechanisms relevant not only

  15. Initiation of genome instability and preneoplastic processes through loss of Fhit expression.

    Directory of Open Access Journals (Sweden)

    Joshua C Saldivar

    Full Text Available Genomic instability drives tumorigenesis, but how it is initiated in sporadic neoplasias is unknown. In early preneoplasias, alterations at chromosome fragile sites arise due to DNA replication stress. A frequent, perhaps earliest, genetic alteration in preneoplasias is deletion within the fragile FRA3B/FHIT locus, leading to loss of Fhit protein expression. Because common chromosome fragile sites are exquisitely sensitive to replication stress, it has been proposed that their clonal alterations in cancer cells are due to stress sensitivity rather than to a selective advantage imparted by loss of expression of fragile gene products. Here, we show in normal, transformed, and cancer-derived cell lines that Fhit-depletion causes replication stress-induced DNA double-strand breaks. Using DNA combing, we observed a defect in replication fork progression in Fhit-deficient cells that stemmed primarily from fork stalling and collapse. The likely mechanism for the role of Fhit in replication fork progression is through regulation of Thymidine kinase 1 expression and thymidine triphosphate pool levels; notably, restoration of nucleotide balance rescued DNA replication defects and suppressed DNA breakage in Fhit-deficient cells. Depletion of Fhit did not activate the DNA damage response nor cause cell cycle arrest, allowing continued cell proliferation and ongoing chromosomal instability. This finding was in accord with in vivo studies, as Fhit knockout mouse tissue showed no evidence of cell cycle arrest or senescence yet exhibited numerous somatic DNA copy number aberrations at replication stress-sensitive loci. Furthermore, cells established from Fhit knockout tissue showed rapid immortalization and selection of DNA deletions and amplifications, including amplification of the Mdm2 gene, suggesting that Fhit loss-induced genome instability facilitates transformation. 
We propose that loss of Fhit expression in precancerous lesions is the first step in the

  16. Multi-targeted priming for genome-wide gene expression assays

    Directory of Open Access Journals (Sweden)

    Adomas Aleksandra B

    2010-08-01

    Full Text Available Abstract Background Complementary approaches to assaying global gene expression are needed to assess gene expression in regions that are poorly assayed by current methodologies. A key component of nearly all gene expression assays is the reverse transcription of transcribed sequences that has traditionally been performed by priming the poly-A tails on many of the transcribed genes in eukaryotes with oligo-dT, or by priming RNA indiscriminately with random hexamers. We designed an algorithm to find common sequence motifs that were present within most protein-coding genes of Saccharomyces cerevisiae and of Neurospora crassa, but that were not present within their ribosomal RNA or transfer RNA genes. We then experimentally tested whether degenerately priming these motifs with multi-targeted primers improved the accuracy and completeness of transcriptomic assays. Results We discovered two multi-targeted primers that would prime a preponderance of genes in the genomes of Saccharomyces cerevisiae and Neurospora crassa while avoiding priming ribosomal RNA or transfer RNA. Examining the response of Saccharomyces cerevisiae to nitrogen deficiency and profiling Neurospora crassa early sexual development, we demonstrated that using multi-targeted primers in reverse transcription led to superior performance of microarray profiling and next-generation RNA tag sequencing. Priming with multi-targeted primers in addition to oligo-dT resulted in higher sensitivity, a larger number of well-measured genes and greater power to detect differences in gene expression. Conclusions Our results provide the most complete and detailed expression profiles of the yeast nitrogen starvation response and N. crassa early sexual development to date. Furthermore, our multi-targeting priming methodology for genome-wide gene expression assays provides selective targeting of multiple sequences and counter-selection against undesirable sequences, facilitating a more complete and

  17. Genome-wide identification and expression profiling of serine proteases and homologs in the diamondback moth, Plutella xylostella (L.).

    Science.gov (United States)

    Lin, Hailan; Xia, Xiaofeng; Yu, Liying; Vasseur, Liette; Gurr, Geoff M; Yao, Fengluan; Yang, Guang; You, Minsheng

    2015-12-10

    Serine proteases (SPs) are crucial proteolytic enzymes responsible for digestion and other processes including signal transduction and immune responses in insects. Serine protease homologs (SPHs) lack catalytic activity but are involved in innate immunity. This study presents a genome-wide investigation of SPs and SPHs in the diamondback moth, Plutella xylostella (L.), a globally-distributed destructive pest of cruciferous crops. A total of 120 putative SPs and 101 putative SPHs were identified in the P. xylostella genome by bioinformatics analysis. Based on the features of trypsin, 38 SPs were putatively designated as trypsin genes. The distribution, transcription orientation, exon-intron structure and sequence alignments suggested that the majority of trypsin genes evolved from tandem duplications. Among the 221 SP/SPH genes, ten SP and three SPH genes with one or more clip domains were predicted and designated as PxCLIPs. Phylogenetic analysis of CLIPs in P. xylostella, two other Lepidoptera species (Bombyx mori and Manduca sexta), and two more distantly related insects (Drosophila melanogaster and Apis mellifera) showed that seven of the 13 PxCLIPs were clustered with homologs of the Lepidoptera rather than other species. Expression profiling of the P. xylostella SP and SPH genes in different developmental stages and tissues showed diverse expression patterns, suggesting high functional diversity with roles in digestion and development. This is the first genome-wide investigation on the SP and SPH genes in P. xylostella. The characterized features and profiled expression patterns of the P. xylostella SPs and SPHs suggest their involvement in digestion, development and immunity of this species. Our findings provide a foundation for further research on the functions of this gene family in P. xylostella, and a better understanding of its capacity to rapidly adapt to a wide range of environmental variables including host plants and insecticides.

  18. Histone deacetylase inhibitors reduce the number of herpes simplex virus-1 genomes initiating expression in individual cells

    Directory of Open Access Journals (Sweden)

    Lev Shapira

    2016-12-01

    Full Text Available Although many viral particles can enter a single cell, the number of viral genomes per cell that establish infection is limited. However, mechanisms underlying this restriction were not explored in depth. For herpesviruses, one of the possible mechanisms suggested is chromatinization and silencing of the incoming genomes. To test this hypothesis, we followed infection with three herpes simplex virus 1 (HSV-1) fluorescence-expressing recombinants in the presence or absence of histone deacetylase inhibitors (HDACi’s). Unexpectedly, a lower number of viral genomes initiated expression in the presence of these inhibitors. This phenomenon was observed using several HDACi: Trichostatin A (TSA), Suberohydroxamic Acid (SBX), Valproic Acid (VPA) and Suberoylanilide Hydroxamic Acid (SAHA). We found that HDACi presence did not change the progeny outcome from the infected cells but did alter the kinetics of gene expression from the viral genomes. Different cell types (HFF, Vero and U2OS), which vary in their capability to activate intrinsic and innate immunity, show a cell-specific basal average number of viral genomes establishing infection. Importantly, in all cell types, treatment with TSA reduced the number of viral genomes. ND10 nuclear bodies are known to interact with the incoming herpes genomes and repress viral replication. The viral immediate early protein, ICP0, is known to disassemble the ND10 bodies and to induce degradation of some of the host proteins in these domains. HDACi-treated cells expressed higher levels of some of the host ND10 proteins (PML and ATRX), which may explain the lower number of viral genomes initiating expression per cell. Corroborating this hypothesis, infection with three HSV-1 recombinants carrying a deletion in the gene coding for ICP0 showed a reduction in the number of genomes being expressed in U2OS cells. We suggest that alterations in the levels of host proteins involved in intrinsic antiviral defense may result in

  19. Systematic determination of the mosaic structure of bacterial genomes: species backbone versus strain-specific loops

    Directory of Open Access Journals (Sweden)

    Gendrault-Jacquemard A

    2005-07-01

    Full Text Available Abstract Background Public databases now contain a multitude of complete bacterial genomes, including several genomes of the same species. The available data offer new opportunities to address questions about bacterial genome evolution, a task that requires reliable fine comparison data of closely related genomes. Recent analyses have shown, using pairwise whole genome alignments, that it is possible to segment bacterial genomes into a common conserved backbone and strain-specific sequences called loops. Results Here, we generalize this approach and propose a strategy that allows systematic and non-biased genome segmentation based on multiple genome alignments. Segmentation analyses, as applied to 13 different bacterial species, confirmed the feasibility of our approach to discern the 'mosaic' organization of bacterial genomes. Segmentation results are available through a Web interface permitting functional analysis, extraction and visualization of the backbone/loops structure of documented genomes. To illustrate the potential of this approach, we performed a precise analysis of the mosaic organization of three E. coli strains and functional characterization of the loops. Conclusion The segmentation results including the backbone/loops structure of 13 bacterial species genomes are new and available for use by the scientific community at the URL: http://genome.jouy.inra.fr/mosaic.

  20. Genome-wide survey of flavonoid biosynthesis genes and gene expression analysis between black- and yellow-seeded Brassica napus

    Directory of Open Access Journals (Sweden)

    Cunmin Qu

    2016-12-01

    Full Text Available Flavonoids, the compounds that impart color to fruits, flowers, and seeds, are the most widespread secondary metabolites in plants. However, a systematic analysis of these loci has not been performed in Brassicaceae. In this study, we isolated 649 nucleotide sequences related to flavonoid biosynthesis, i.e., the Transparent Testa (TT) genes, and their associated amino acid sequences in 17 Brassicaceae species, grouped into Arabidopsis or Brassicaceae subgroups. Moreover, 36 copies of 21 genes of the flavonoid biosynthesis pathway were identified in A. thaliana, 53 were identified in B. rapa, 50 in B. oleracea, and 95 in B. napus, followed by analysis of their genomic distribution, collinearity and gene triplication among Brassicaceae species. The results showed that extensive gene loss, whole genome triplication, and diploidization occurred after divergence from the common ancestor. Using qRT-PCR methods, we analyzed the expression of eighteen flavonoid biosynthesis genes in 6 yellow- and black-seeded B. napus inbred lines with different genetic backgrounds, and found that 12 of them were preferentially expressed during seed development, whereas the remaining genes were expressed in all B. napus tissues examined. Moreover, fourteen of these genes showed significant differences in expression level during seed development, and all but four of these (i.e., BnTT5, BnTT7, BnTT10, and BnTTG1) had similar expression patterns among the yellow- and black-seeded B. napus. Results showed that the structural genes (BnTT3, BnTT18 and BnBAN), regulatory genes (BnTTG2 and BnTT16) and three genes encoding transfer proteins (BnTT12, BnTT19, and BnAHA10) might play crucial roles in the formation of different seed coat colors in B. napus. These data will be helpful for illustrating the molecular mechanisms of flavonoid biosynthesis in Brassicaceae species.

  1. Cell-of-Origin-Specific 3D Genome Structure Acquired during Somatic Cell Reprogramming

    NARCIS (Netherlands)

    Krijger, Peter Hugo Lodewijk; Di Stefano, Bruno; de Wit, Elzo; Limone, Francesco; van Oevelen, Chris; de Laat, Wouter; Graf, Thomas

    2016-01-01

    Forced expression of reprogramming factors can convert somatic cells into induced pluripotent stem cells (iPSCs). Here we studied genome topology dynamics during reprogramming of different somatic cell types with highly distinct genome conformations. We find large-scale topologically associated

  2. Genome-Wide Expression of MicroRNAs Is Regulated by DNA Methylation in Hepatocarcinogenesis

    Directory of Open Access Journals (Sweden)

    Jing Shen

    2015-01-01

    Full Text Available Background. Previous studies, including ours, have examined the regulation of microRNAs (miRNAs) by DNA methylation, but whether this regulation occurs at a genome-wide level in hepatocellular carcinoma (HCC) is unclear. Subjects/Methods. Using a two-phase study design, we conducted genome-wide screening for DNA methylation and miRNA expression to explore the potential role of methylation alterations in miRNA regulation. Results. We found that expressions of 25 miRNAs were statistically significantly different between tumor and nontumor tissues and perfectly differentiated HCC tumor from nontumor. Six miRNAs were overexpressed, and 19 were repressed in tumors. Among 133 miRNAs with inverse correlations between methylation and expression, 8 miRNAs (6%) showed statistically significant differences in expression between tumor and nontumor tissues. Six miRNAs were validated in 56 additional paired HCC tissues, and significant inverse correlations were observed for miR-125b and miR-199a, which is consistent with the inactive chromatin pattern found in HepG2 cells. Conclusion. These data suggest that the expressions of miR-125b and miR-199a are dramatically regulated by DNA hypermethylation that plays a key role in hepatocarcinogenesis.

  3. cDNA structure, genomic organization and expression patterns of ...

    African Journals Online (AJOL)

    2011-11-23

    Nov 23, 2011 ... adenine dinucleotide (NAD) intermediate (Rongvaux et al., 2002). Thereupon ... in house mouse, Norway rat and human. It was not difficult to ... species in freshwater regions, and has been a new model organism in aquatic ...

  4. Effects of in ovo electroporation on endogenous gene expression: genome-wide analysis

    Directory of Open Access Journals (Sweden)

    Chambers David

    2011-04-01

    Full Text Available Abstract Background In ovo electroporation is a widely used technique to study gene function in developmental biology. Despite the widespread acceptance of this technique, no genome-wide analysis of the effects of in ovo electroporation, principally the current applied across the tissue and exogenous vector DNA introduced, on endogenous gene expression has been undertaken. Here, the effects of electric current and expression of a GFP-containing construct, via electroporation into the midbrain of Hamburger-Hamilton stage 10 chicken embryos, are analysed by microarray. Results Both current alone and in combination with exogenous DNA expression have a small but reproducible effect on endogenous gene expression, changing the expression of the genes represented on the array by less than 0.1% (current) and less than 0.5% (current + DNA), respectively. The subset of genes regulated by electric current and exogenous DNA spans a disparate set of cellular functions. However, no genes involved in the regional identity were affected. In sharp contrast to this, electroporation of a known transcription factor, Dmrt5, caused a much greater change in gene expression. Conclusions These findings represent the first systematic genome-wide analysis of the effects of in ovo electroporation on gene expression during embryonic development. The analysis reveals that this process has minimal impact on the genetic basis of cell fate specification. Thus, the study demonstrates the validity of the in ovo electroporation technique to study gene function and expression during development. Furthermore, the data presented here can be used as a resource to refine the set of transcriptional responders in future in ovo electroporation studies of specific gene function.

  5. Disturbance of gene expression in primary human hepatocytes by hepatotoxic pyrrolizidine alkaloids: A whole genome transcriptome analysis.

    Science.gov (United States)

    Luckert, Claudia; Hessel, Stefanie; Lenze, Dido; Lampen, Alfonso

    2015-10-01

    1,2-unsaturated pyrrolizidine alkaloids (PA) are plant metabolites predominantly occurring in the plant families Asteraceae and Boraginaceae. Acute and chronic PA poisoning causes severe hepatotoxicity. So far, the molecular mechanisms of PA toxicity are not well understood. To analyze its mode of action, primary human hepatocytes were exposed to a non-cytotoxic dose of 100 μM of four structurally different PA: echimidine, heliotrine, senecionine, senkirkine. Changes in mRNA expression were analyzed by a whole genome microarray. Employing cut-off values with a |fold change| of 2 and a q-value of 0.01, data analysis revealed numerous changes in gene expression. In total, 4556, 1806, 3406 and 8623 genes were regulated by echimidine, heliotrine, senecionine and senkirkine, respectively. 1304 genes were identified as commonly regulated. PA affected pathways related to cell cycle regulation, cell death and cancer development. The transcription factors TP53, MYC, NFκB and NUPR1 were predicted to be activated upon PA treatment. Furthermore, gene expression data showed a considerable interference with lipid metabolism and bile acid flow. The associated transcription factors FXR, LXR, SREBF1/2, and PPARα/γ/δ were predicted to be inhibited. In conclusion, though structurally different, all four PA significantly regulated a great number of genes in common. This suggests similar molecular mechanisms, although the extent seems to differ between the analyzed PA as reflected by the potential hepatotoxicity and individual PA structure. Copyright © 2015 Elsevier Ltd. All rights reserved.
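
    The cut-off step mentioned in this abstract (|fold change| of 2 and q-value of 0.01) can be illustrated with a minimal sketch. The gene names and values below are invented, and the choices of a treated/control ratio, inclusive comparisons and the helper name `is_regulated` are assumptions; the abstract does not state how the authors implemented the filter.

```python
import math

def is_regulated(ratio, q_value, fc_cutoff=2.0, q_cutoff=0.01):
    """Return True when a gene passes both cut-offs.

    ratio is an assumed treated/control expression ratio (> 0).
    Taking the absolute log2 ratio treats 2-fold up and 2-fold down alike.
    Inclusive comparisons are an assumption, not taken from the abstract.
    """
    return abs(math.log2(ratio)) >= math.log2(fc_cutoff) and q_value <= q_cutoff

# Hypothetical genes, invented for illustration: name -> (ratio, q-value)
genes = {
    "geneA": (4.0, 0.001),   # up 4-fold, significant
    "geneB": (0.25, 0.005),  # down 4-fold, significant
    "geneC": (1.5, 0.0001),  # significant, but change too small
    "geneD": (3.0, 0.2),     # large change, but not significant
}

regulated = sorted(name for name, (r, q) in genes.items() if is_regulated(r, q))
print(regulated)
```

    Only genes that satisfy both conditions are kept, which is why geneC and geneD drop out of this invented example.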

  6. Epigenetic changes of Arabidopsis genome associated with altered DNA methyltransferase and demethylase expressions after gamma irradiation

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Ji Eun; Cho, Eun Ju; Kim, Ji Hong; Chung, Byung Yeoup; Kim, Jin Hong [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)

    2012-05-15

    DNA methylation at carbon 5 of cytosines is a hallmark of epigenetic inactivation and heterochromatin in both plants and mammals. In Arabidopsis, DNA methylation has two roles: protecting the genome from selfish DNA elements and regulating gene expression. The plant genome has three types of DNA methyltransferase, METHYLTRANSFERASE 1 (MET1), DOMAINS REARRANGED METHYLASE (DRM) and CHROMOMETHYLASE 3 (CMT3) that are capable of methylating CG, CHG (where H is A, T, or C) and CHH sites, respectively. MET1 is a maintenance DNA methyltransferase that controls CG methylation. Two members of the DRM family, DRM1 and DRM2, are responsible for de novo methylation of CG, CHG, and CHH sites but show a preference for CHH sites. Finally, CMT3 principally carries out CHG methylation and is involved in both de novo methylation and maintenance. Alternatively, active DNA demethylation may occur through glycosylase activity that removes methylcytosines from DNA. It may have essential roles in preventing transcriptional silencing of transgenes and endogenous genes and in activating the expression of imprinted genes. DNA demethylation in Arabidopsis is mediated by the DEMETER (DME) family of bifunctional DNA glycosylases. Three targets of DME are MEA (MEDEA), FWA (FLOWERING WAGENINGEN), and FIS2 (FERTILIZATION INDEPENDENT SEED 2). The DME family contains DEMETER-LIKE 2 (DML2), DML3, and REPRESSOR OF SILENCING 1 (ROS1). DNA demethylation by ROS1, DML2, and DML3 protects specific genome loci from hypermethylation. ROS1 is necessary to suppress the promoter methylation and the silencing of endogenous genes. In contrast, the function of DML2 and DML3 has not been reported. Several recent studies have suggested that epigenetic alterations such as changes in DNA methylation and histone modification are likely to be induced in plant genomes upon exposure to ionizing radiation. However, there is a lack of data exploring the underlying mechanisms. Therefore, the present study aims to characterize and

  7. Epigenetic changes of Arabidopsis genome associated with altered DNA methyltransferase and demethylase expressions after gamma irradiation

    International Nuclear Information System (INIS)

    Kim, Ji Eun; Cho, Eun Ju; Kim, Ji Hong; Chung, Byung Yeoup; Kim, Jin Hong

    2012-01-01

    DNA methylation at carbon 5 of cytosines is a hallmark of epigenetic inactivation and heterochromatin in both plants and mammals. In Arabidopsis, DNA methylation has two roles: protecting the genome from selfish DNA elements and regulating gene expression. The plant genome has three types of DNA methyltransferase, METHYLTRANSFERASE 1 (MET1), DOMAINS REARRANGED METHYLASE (DRM) and CHROMOMETHYLASE 3 (CMT3) that are capable of methylating CG, CHG (where H is A, T, or C) and CHH sites, respectively. MET1 is a maintenance DNA methyltransferase that controls CG methylation. Two members of the DRM family, DRM1 and DRM2, are responsible for de novo methylation of CG, CHG, and CHH sites but show a preference for CHH sites. Finally, CMT3 principally carries out CHG methylation and is involved in both de novo methylation and maintenance. Alternatively, active DNA demethylation may occur through glycosylase activity that removes methylcytosines from DNA. It may have essential roles in preventing transcriptional silencing of transgenes and endogenous genes and in activating the expression of imprinted genes. DNA demethylation in Arabidopsis is mediated by the DEMETER (DME) family of bifunctional DNA glycosylases. Three targets of DME are MEA (MEDEA), FWA (FLOWERING WAGENINGEN), and FIS2 (FERTILIZATION INDEPENDENT SEED 2). The DME family contains DEMETER-LIKE 2 (DML2), DML3, and REPRESSOR OF SILENCING 1 (ROS1). DNA demethylation by ROS1, DML2, and DML3 protects specific genome loci from hypermethylation. ROS1 is necessary to suppress the promoter methylation and the silencing of endogenous genes. In contrast, the function of DML2 and DML3 has not been reported. Several recent studies have suggested that epigenetic alterations such as changes in DNA methylation and histone modification are likely to be induced in plant genomes upon exposure to ionizing radiation. However, there is a lack of data exploring the underlying mechanisms. Therefore, the present study aims to characterize and

  8. Inferring causal genomic alterations in breast cancer using gene expression data

    Science.gov (United States)

    2011-01-01

    Background One of the primary objectives in cancer research is to identify causal genomic alterations, such as somatic copy number variation (CNV) and somatic mutations, during tumor development. Many valuable studies lack genomic data to detect CNV; therefore, methods that are able to infer CNVs from gene expression data would help maximize the value of these studies. Results We developed a framework for identifying recurrent regions of CNV and distinguishing the cancer driver genes from the passenger genes in the regions. By inferring CNV regions across many datasets we were able to identify 109 recurrent amplified/deleted CNV regions. Many of these regions are enriched for genes involved in many important processes associated with tumorigenesis and cancer progression. Genes in these recurrent CNV regions were then examined in the context of gene regulatory networks to prioritize putative cancer driver genes. The cancer driver genes uncovered by the framework include not only well-known oncogenes but also a number of novel cancer susceptibility genes validated via siRNA experiments. Conclusions To our knowledge, this is the first effort to systematically identify and validate drivers for expression based CNV regions in breast cancer. The framework where the wavelet analysis of copy number alteration based on expression coupled with the gene regulatory network analysis, provides a blueprint for leveraging genomic data to identify key regulatory components and gene targets. This integrative approach can be applied to many other large-scale gene expression studies and other novel types of cancer data such as next-generation sequencing based expression (RNA-Seq) as well as CNV data. PMID:21806811

  9. Characterization of gene expression on genomic segment 7 of infectious salmon anaemia virus

    Directory of Open Access Journals (Sweden)

    Qian Biao

    2007-03-01

    Full Text Available Abstract Background Infectious salmon anaemia (ISA) virus (ISAV), an important pathogen of fish that causes disease accompanied by high mortality in marine-farmed Atlantic salmon, is the only species in the genus Isavirus, one of the five genera of the Orthomyxoviridae family. The Isavirus genome consists of eight single-stranded RNA species, and the virions have two surface glycoproteins: haemagglutinin-esterase (HE) protein encoded on segment 6 and fusion (F) protein encoded on segment 5. Based on the initial demonstration of two 5'-coterminal mRNA transcripts by RT-PCR, ISAV genomic segment 7 was suggested to share a similar coding strategy with segment 7 of influenza A virus, encoding two proteins. However, there appears to be confusion as to the protein sizes predicted from the two open reading frames (ORFs) of ISAV segment 7, which has in turn led to confusion of the predicted protein functions. The primary goal of the present work was to clone and express these two ORFs in order to assess whether the predicted protein sizes match those of the expressed proteins so as to clarify the coding assignments, and thereby identify any additional structural proteins of ISAV. Results In the present study we show that ISAV segment 7 encodes 3 proteins with estimated molecular masses of 32, 18, and 9.5 kDa. The 18-kDa and 9.5-kDa products are based on removal of an intron each from the primary transcript (7-ORF1) so that the translation continues in the +2 and +3 reading frames, respectively. The segment 7-ORF1/3 product is variably truncated in the sequence of ISAV isolates of the European genotype. All three proteins are recognized by rabbit antiserum against the 32-kDa product of the primary transcript, as they all share the N-terminal 22 amino acids. This antiserum detected a single 35-kDa protein in Western blots of purified virus, and immunoprecipitated a 32-kDa protein in ISAV-infected TO cells. 
Immunofluorescence staining of infected cells with the

  10. Producing genome structure populations with the dynamic and automated PGS software.

    Science.gov (United States)

    Hua, Nan; Tjong, Harianto; Shin, Hanjun; Gong, Ke; Zhou, Xianghong Jasmine; Alber, Frank

    2018-05-01

    Chromosome conformation capture technologies such as Hi-C are widely used to investigate the spatial organization of genomes. Because genome structures can vary considerably between individual cells of a population, interpreting ensemble-averaged Hi-C data can be challenging, in particular for long-range and interchromosomal interactions. We pioneered a probabilistic approach for the generation of a population of distinct diploid 3D genome structures consistent with all the chromatin-chromatin interaction probabilities from Hi-C experiments. Each structure in the population is a physical model of the genome in 3D. Analysis of these models yields new insights into the causes and the functional properties of the genome's organization in space and time. We provide a user-friendly software package, called PGS, which runs on local machines (for practice runs) and high-performance computing platforms. PGS takes a genome-wide Hi-C contact frequency matrix, along with information about genome segmentation, and produces an ensemble of 3D genome structures entirely consistent with the input. The software automatically generates an analysis report, and provides tools to extract and analyze the 3D coordinates of specific domains. Basic Linux command-line knowledge is sufficient for using this software. A typical running time of the pipeline is ∼3 d with 300 cores on a computer cluster to generate a population of 1,000 diploid genome structures at topological-associated domain (TAD)-level resolution.

  11. Transcriptome sequencing and whole genome expression profiling of chrysanthemum under dehydration stress

    Science.gov (United States)

    2013-01-01

    Background Chrysanthemum is one of the most important ornamental crops in the world and drought stress seriously limits its production and distribution. In order to generate a functional genomics resource and obtain a deeper understanding of the molecular mechanisms regarding chrysanthemum responses to dehydration stress, we performed large-scale transcriptome sequencing of chrysanthemum plants under dehydration stress using the Illumina sequencing technology. Results Two cDNA libraries constructed from mRNAs of control and dehydration-treated seedlings were sequenced by Illumina technology. A total of more than 100 million reads were generated and de novo assembled into 98,180 unique transcripts which were further extensively annotated by comparing their sequencing to different protein databases. Biochemical pathways were predicted from these transcript sequences. Furthermore, we performed gene expression profiling analysis upon dehydration treatment in chrysanthemum and identified 8,558 dehydration-responsive unique transcripts, including 307 transcription factors and 229 protein kinases and many well-known stress responsive genes. Gene ontology (GO) term enrichment and biochemical pathway analyses showed that dehydration stress caused changes in hormone response, secondary and amino acid metabolism, and light and photoperiod response. These findings suggest that drought tolerance of chrysanthemum plants may be related to the regulation of hormone biosynthesis and signaling, reduction of oxidative damage, stabilization of cell proteins and structures, and maintenance of energy and carbon supply. Conclusions Our transcriptome sequences can provide a valuable resource for chrysanthemum breeding and research and novel insights into chrysanthemum responses to dehydration stress and offer candidate genes or markers that can be used to guide future studies attempting to breed drought tolerant chrysanthemum cultivars. PMID:24074255

  12. Genomic DNA-based absolute quantification of gene expression in Vitis.

    Science.gov (United States)

    Gambetta, Gregory A; McElrone, Andrew J; Matthews, Mark A

    2013-07-01

    Many studies in which gene expression is quantified by polymerase chain reaction represent the expression of a gene of interest (GOI) relative to that of a reference gene (RG). Relative expression is founded on the assumptions that RG expression is stable across samples, treatments, organs, etc., and that reaction efficiencies of the GOI and RG are equal; assumptions which are often faulty. The true variability in RG expression and actual reaction efficiencies are seldom determined experimentally. Here we present a rapid and robust method for absolute quantification of expression in Vitis where varying concentrations of genomic DNA were used to construct GOI standard curves. This methodology was utilized to absolutely quantify and determine the variability of the previously validated RG ubiquitin (VvUbi) across three test studies in three different tissues (roots, leaves and berries). In addition, in each study a GOI was absolutely quantified. Data sets resulting from relative and absolute methods of quantification were compared and the differences were striking. VvUbi expression was significantly different in magnitude between test studies and variable among individual samples. Absolute quantification consistently reduced the coefficients of variation of the GOIs by more than half, often resulting in differences in statistical significance and in some cases even changing the fundamental nature of the result. Utilizing genomic DNA-based absolute quantification is fast and efficient. Through eliminating error introduced by assuming RG stability and equal reaction efficiencies between the RG and GOI this methodology produces less variation, increased accuracy and greater statistical power. © 2012 Scandinavian Plant Physiology Society.
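    The standard-curve idea in the abstract above can be sketched as follows. This is a minimal illustration only, not the authors' protocol: the dilution series, the Cq values, and all function names are hypothetical. It fits Cq against log10 of the template copy number, derives the reaction efficiency from the slope, and reads an unknown sample off the curve.

```python
import math

def fit_standard_curve(copies, cq):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0

def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate absolute template copies."""
    return 10 ** ((cq - intercept) / slope)

# Hypothetical genomic DNA dilution series (copies per reaction) and made-up Cq values.
standard_copies = [1e2, 1e3, 1e4, 1e5]
standard_cq = [30.0, 26.68, 23.36, 20.04]

slope, intercept = fit_standard_curve(standard_copies, standard_cq)
efficiency = amplification_efficiency(slope)
unknown_copies = copies_from_cq(25.0, slope, intercept)
print(f"slope={slope:.2f} efficiency={efficiency:.3f} copies={unknown_copies:.0f}")
```

    Because the efficiency is measured from the curve rather than assumed, the estimate does not rely on a reference gene or on equal efficiencies between reactions, which is the point the abstract makes.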

  13. Genome-wide expression analysis of salt-stressed diploid and autotetraploid Paulownia tomentosa.

    Directory of Open Access Journals (Sweden)

    Zhenli Zhao

    Full Text Available Paulownia tomentosa is a fast-growing tree species with multiple uses. It is grown worldwide, but is native to China, where it is widely cultivated in saline regions. We previously confirmed that autotetraploid P. tomentosa plants are more stress-tolerant than the diploid plants. However, the molecular mechanism underlying P. tomentosa salinity tolerance has not been fully characterized. Using the complete Paulownia fortunei genome as a reference, we applied next-generation RNA-sequencing technology to analyze the effects of salt stress on diploid and autotetraploid P. tomentosa plants. We generated 175 million clean reads and identified 15,873 differentially expressed genes (DEGs) from four P. tomentosa libraries (two diploid and two autotetraploid). Functional annotations of the differentially expressed genes using the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes databases revealed that plant hormone signal transduction and photosynthetic activities are vital for plant responses to high-salt conditions. We also identified several transcription factors, including members of the AP2/EREBP, bHLH, MYB, and NAC families. Quantitative real-time PCR analysis validated the expression patterns of eight differentially expressed genes. Our findings and the generated transcriptome data may help to accelerate the genetic improvement of cultivated P. tomentosa and other plant species for enhanced growth in saline soils.

  14. Replication, gene expression and particle production by a consensus Merkel Cell Polyomavirus (MCPyV) genome.

    Directory of Open Access Journals (Sweden)

    Friederike Neumann

    Full Text Available Merkel Cell Polyomavirus (MCPyV) genomes are clonally integrated in tumor tissues of approximately 85% of all Merkel cell carcinoma (MCC) cases, a highly aggressive tumor of the skin which predominantly afflicts elderly and immunosuppressed patients. All integrated viral genomes recovered from MCC tissue or MCC cell lines harbor signature mutations in the early gene transcript encoding the large T-Antigen (LT-Ag). These mutations selectively abrogate the ability of LT-Ag to support viral replication while still maintaining its Rb-binding activity, suggesting a continuous requirement for LT-Ag mediated cell cycle deregulation during MCC pathogenesis. To gain a better understanding of MCPyV biology, in vitro MCPyV replication systems are required. We have generated a synthetic MCPyV genomic clone (MCVSyn) based on the consensus sequence of MCC-derived sequences deposited in the NCBI database. Here, we demonstrate that transfection of recircularized MCVSyn DNA into some human cell lines recapitulates efficient replication of the viral genome, early and late gene expression together with virus particle formation. However, serial transmission of infectious virus was not observed. This in vitro culturing system allows the study of viral replication and will facilitate the molecular dissection of important aspects of the MCPyV lifecycle.

  15. Inferring network structure in non-normal and mixed discrete-continuous genomic data.

    Science.gov (United States)

    Bhadra, Anindya; Rao, Arvind; Baladandayuthapani, Veerabhadran

    2018-03-01

    Inferring dependence structure through undirected graphs is crucial for uncovering the major modes of multivariate interaction among high-dimensional genomic markers that are potentially associated with cancer. Traditionally, conditional independence has been studied using sparse Gaussian graphical models for continuous data and sparse Ising models for discrete data. However, there are two clear situations when these approaches are inadequate. The first occurs when the data are continuous but display non-normal marginal behavior such as heavy tails or skewness, rendering an assumption of normality inappropriate. The second occurs when a part of the data is ordinal or discrete (e.g., presence or absence of a mutation) and the other part is continuous (e.g., expression levels of genes or proteins). In this case, the existing Bayesian approaches typically employ a latent variable framework for the discrete part that precludes inferring conditional independence among the data that are actually observed. The current article overcomes these two challenges in a unified framework using Gaussian scale mixtures. Our framework is able to handle continuous data that are not normal and data that are of mixed continuous and discrete nature, while still being able to infer a sparse conditional sign independence structure among the observed data. Extensive performance comparison in simulations with alternative techniques and an analysis of a real cancer genomics data set demonstrate the effectiveness of the proposed approach. © 2017, The International Biometric Society.

  16. Genomic and Expression Profiling of Benign and Malignant Nerve Sheath Tumors in Neurofibromatosis Patients

    Science.gov (United States)

    2007-05-01

    Annual report; dates covered: 1 May 2006 – 30 Apr 2007. Title: Genomic and Expression Profiling of Benign and Malignant Nerve Sheath Tumors in Neurofibromatosis Patients. Principal Investigator: Matt van de Rijn, M.D., Ph.D. Torsten... Award Number: DAMD17-03-1-0297.

  17. Genomic analysis of the hierarchical structure of regulatory networks

    Science.gov (United States)

    Yu, Haiyuan; Gerstein, Mark

    2006-01-01

    A fundamental question in biology is how the cell uses transcription factors (TFs) to coordinate the expression of thousands of genes in response to various stimuli. The relationships between TFs and their target genes can be modeled in terms of directed regulatory networks. These relationships, in turn, can be readily compared with commonplace “chain-of-command” structures in social networks, which have characteristic hierarchical layouts. Here, we develop algorithms for identifying generalized hierarchies (allowing for various loop structures) and use these approaches to illuminate extensive pyramid-shaped hierarchical structures existing in the regulatory networks of representative prokaryotes (Escherichia coli) and eukaryotes (Saccharomyces cerevisiae), with most TFs at the bottom levels and only a few master TFs on top. These masters are situated near the center of the protein–protein interaction network, a different type of network from the regulatory one, and they receive most of the input for the whole regulatory hierarchy through protein interactions. Moreover, they have maximal influence over other genes, in terms of affecting expression-level changes. Surprisingly, however, TFs at the bottom of the regulatory hierarchy are more essential to the viability of the cell. Finally, one might think master TFs achieve their wide influence through directly regulating many targets, but TFs with most direct targets are in the middle of the hierarchy. We find, in fact, that these midlevel TFs are “control bottlenecks” in the hierarchy, and this great degree of control for “middle managers” has parallels in efficient social structures in various corporate and governmental settings. PMID:17003135
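    To make the notion of hierarchy levels concrete, here is a simplified sketch. It is not the authors' algorithm for generalized hierarchies: it assumes, for illustration, that a TF's level is its shortest distance to a bottom-level TF (one that regulates no other TF), found by breadth-first search over reversed edges. The function name and the toy network are hypothetical.

```python
from collections import deque

def hierarchy_levels(edges):
    """Assign a level to each TF from (regulator, target) pairs.

    Level 1 is the bottom (TFs that regulate no other TF). Every other TF gets
    1 + the level of the nearest TF it regulates. Autoregulatory loops are
    ignored; TFs that cannot reach a bottom TF are left unassigned.
    """
    regulators_of = {}
    out_degree = {}
    for regulator, target in edges:
        for node in (regulator, target):
            regulators_of.setdefault(node, set())
            out_degree.setdefault(node, 0)
        if regulator != target and regulator not in regulators_of[target]:
            regulators_of[target].add(regulator)
            out_degree[regulator] += 1

    levels = {node: 1 for node, degree in out_degree.items() if degree == 0}
    queue = deque(levels)
    while queue:
        node = queue.popleft()
        for regulator in regulators_of[node]:
            if regulator not in levels:
                levels[regulator] = levels[node] + 1
                queue.append(regulator)
    return levels

# Hypothetical toy network: one master TF, two mid-level TFs, two bottom TFs.
toy_edges = [("M", "A"), ("M", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("C", "C")]
toy_levels = hierarchy_levels(toy_edges)
print(sorted(toy_levels.items()))
```

    On this toy input the master TF lands on the top level and the TF with the most direct targets sits in the middle, mirroring the pyramid shape the abstract describes.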

  18. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung-Hou; Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  19. Genome-wide identification and expression analysis of MAPK and MAPKK gene family in Malus domestica.

    Science.gov (United States)

    Zhang, Shizhong; Xu, Ruirui; Luo, Xiaocui; Jiang, Zesheng; Shu, Huairui

    2013-12-01

    MAPK signal transduction modules, which are composed of three classes of hierarchically organized protein kinases (MAPKKKs, MAPKKs, and MAPKs), play crucial roles in regulating many biological processes in plants. Although genome-wide analysis of this family has been carried out in some species, little is known about MAPK and MAPKK genes in apple (Malus domestica). In this study, a total of 26 putative apple MAPK genes (MdMPKs) and 9 putative apple MAPKK genes (MdMKKs) were identified and located within the apple genome. Phylogenetic analysis revealed that MdMAPKs and MdMAPKKs could each be divided into 4 subfamilies (groups A, B, C and D). The predicted MdMAPKs and MdMAPKKs were distributed across 13 out of 17 chromosomes with different densities. In addition, analysis of exon-intron junctions and of intron phase inside the predicted coding region of each candidate gene revealed high levels of conservation within and between phylogenetic groups. According to the microarray and expressed sequence tag (EST) analysis, the different expression patterns indicate that they may play different roles during fruit development and the rootstock-scion interaction process. Moreover, expression profile analyses of MAPK and MAPKK genes were performed in different tissues (root, stem, leaf, flower and fruit), and all of the selected genes were expressed in at least one of the tissues tested, indicating that the MAPKs and MAPKKs are involved in various aspects of physiological and developmental processes of apple. To our knowledge, this is the first report of a genome-wide analysis of the apple MAPK and MAPKK gene family. This study provides valuable information for understanding the classification and putative functions of the MAPK signal in apple. © 2013.

  20. Gene discovery and transcript analyses in the corn smut pathogen Ustilago maydis: expressed sequence tag and genome sequence comparison

    Directory of Open Access Journals (Sweden)

    Saville Barry J

    2007-09-01

    Full Text Available Abstract Background Ustilago maydis is the basidiomycete fungus responsible for common smut of corn and is a model organism for the study of fungal phytopathogenesis. To aid in the annotation of the genome sequence of this organism, several expressed sequence tag (EST) libraries were generated from a variety of U. maydis cell types. In addition to utility in the context of gene identification and structure annotation, the ESTs were analyzed to identify differentially abundant transcripts and to detect evidence of alternative splicing and anti-sense transcription. Results Four cDNA libraries were constructed using RNA isolated from U. maydis diploid teliospores (U. maydis strains 518 × 521) and haploid cells of strain 521 grown under nutrient rich, carbon starved, and nitrogen starved conditions. Using the genome sequence as a scaffold, the 15,901 ESTs were assembled into 6,101 contiguous expressed sequences (contigs); among these, 5,482 corresponded to predicted genes in the MUMDB (MIPS Ustilago maydis database), while 619 aligned to regions of the genome not yet designated as genes in MUMDB. A comparison of EST abundance identified numerous genes that may be regulated in a cell type or starvation-specific manner. The transcriptional response to nitrogen starvation was assessed using RT-qPCR. The results suggest that there may be cross-talk between the nitrogen and carbon signalling pathways in U. maydis. Bioinformatic analysis identified numerous examples of alternative splicing and anti-sense transcription. While intron retention was the predominant form of alternative splicing in U. maydis, other varieties were also evident (e.g. exon skipping). Selected instances of both alternative splicing and anti-sense transcription were independently confirmed using RT-PCR. Conclusion Through this work: 1) substantial sequence information has been provided for U. maydis genome annotation; 2) new genes were identified through the discovery of 619

  1. The eukaryotic genome is structurally and functionally more like a social insect colony than a book.

    Science.gov (United States)

    Qiu, Guo-Hua; Yang, Xiaoyan; Zheng, Xintian; Huang, Cuiqin

    2017-11-01

    Traditionally, the genome has been described as the 'book of life'. However, the metaphor of a book may not reflect the dynamic nature of the structure and function of the genome. In the eukaryotic genome, the number of centrally located protein-coding sequences is relatively constant across species, but the amount of noncoding DNA increases considerably with the increase of organismal evolutional complexity. Therefore, it has been hypothesized that the abundant peripheral noncoding DNA protects the genome and the central protein-coding sequences in the eukaryotic genome. Upon comparison with the habitation, sociality and defense mechanisms of a social insect colony, it is found that the genome is similar to a social insect colony in various aspects. A social insect colony may thus be a better metaphor than a book to describe the spatial organization and physical functions of the genome. The potential implications of the metaphor are also discussed.

  2. Ultra high-resolution gene centric genomic structural analysis of a non-syndromic congenital heart defect, Tetralogy of Fallot.

    Directory of Open Access Journals (Sweden)

    Douglas C Bittel

    Full Text Available Tetralogy of Fallot (TOF) is one of the most common severe congenital heart malformations. Great progress has been made in identifying key genes that regulate heart development, yet approximately 70% of TOF cases are sporadic and nonsyndromic with no known genetic cause. We created an ultra high-resolution gene centric comparative genomic hybridization (gcCGH) microarray based on 591 genes with a validated association with cardiovascular development or function. We used our gcCGH array to analyze the genomic structure of 34 infants with sporadic TOF without a deletion on chromosome 22q11.2 (n male = 20; n female = 14; age range of 2 to 10 months). Using our custom-made gcCGH microarray platform, we identified a total of 613 copy number variations (CNVs) ranging in size from 78 base pairs to 19.5 Mb. We identified 16 subjects with 33 CNVs that contained 13 different genes which are known to be directly associated with heart development. Additionally, there were 79 genes from the broader list of genes that were partially or completely contained in a CNV. All 34 individuals examined had at least one CNV involving these 79 genes. Furthermore, we had available whole genome exon arrays from right ventricular tissue in 13 of our subjects. We analyzed these for correlations between copy number and gene expression level. Surprisingly, we could detect only one clear association between CNVs and expression (GSTT1) for any of the 591 focal genes on the gcCGH array. The expression levels of GSTT1 were correlated with copy number in all cases examined (r = 0.95, p = 0.001). We identified a large number of small CNVs in genes with varying associations with heart development. Our results illustrate the complexity of human genome structural variation and underscore the need for multifactorial assessment of potential genetic/genomic factors that contribute to congenital heart defects.
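    The copy number versus expression comparison reported above rests on a correlation coefficient. A minimal sketch of the Pearson r calculation follows; the per-subject values are invented for illustration and are not the study's data, and the abstract does not state which correlation statistic or software the authors used.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-subject gene copy number and normalized expression values.
copy_number = [0, 1, 1, 2, 2, 2]
expression = [0.1, 1.0, 1.2, 2.1, 1.9, 2.3]

r = pearson_r(copy_number, expression)
print(f"r = {r:.2f}")
```

    A value of r close to 1 indicates that expression rises roughly in proportion to copy number, which is the pattern the abstract reports for GSTT1 alone.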

  3. MYB Transcription Factors in Chinese Pear (Pyrus bretschneideri Rehd.): Genome-Wide Identification, Classification and Expression Profiling during Fruit Development

    Directory of Open Access Journals (Sweden)

    Yun Peng Cao

    2016-04-01

    Full Text Available The MYB family is one of the largest families of transcription factors in plants. Although some MYBs have been reported to play roles in secondary metabolism, no comprehensive study of the MYB family in Chinese pear (Pyrus bretschneideri Rehd.) has been reported. In the present study, we performed genome-wide analysis of MYB genes in Chinese pear, designated as PbMYBs, including analyses of their phylogenetic relationships, structures, chromosomal locations, promoter regions, GO annotations and collinearity. A total of 129 PbMYB genes were identified in the pear genome and were divided into 31 subgroups based on phylogenetic analysis. These PbMYBs were unevenly distributed among 16 chromosomes (total of 17 chromosomes). The occurrence of gene duplication events indicated that whole-genome duplication and segmental duplication likely played key roles in expansion of the PbMYB gene family. Ka/Ks analysis suggested that the duplicated PbMYBs mainly experienced purifying selection with restrictive functional divergence after the duplication events. Interspecies microsynteny analysis revealed maximum orthology between pear and peach, followed by plum and strawberry. Subsequently, the expression patterns of 20 PbMYB genes that may be involved in lignin biosynthesis according to their phylogenetic relationships were examined throughout fruit development. Among the twenty genes examined, PbMYB25 and PbMYB52 exhibited expression patterns consistent with the typical variations in the lignin content previously reported. Moreover, sub-cellular localization analysis revealed that the two proteins PbMYB25 and PbMYB52 were localized to the nucleus. Altogether, PbMYB25 and PbMYB52 were inferred to be candidate genes involved in the regulation of lignin biosynthesis during the development of pear fruit. This study provides useful information for further functional analysis of the MYB gene family in pear.
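    The Ka/Ks reasoning mentioned above follows a standard rule of thumb: a ratio below 1 points to purifying selection, a ratio near 1 to neutral evolution, and a ratio above 1 to positive selection. The sketch below only applies that rule to precomputed Ka and Ks values; the function name, the tolerance band, and the example numbers are hypothetical, and estimating Ka and Ks from aligned sequences is out of scope here.

```python
def classify_selection(ka, ks, tolerance=0.05):
    """Classify a duplicated gene pair by its Ka/Ks ratio.

    ka: nonsynonymous substitutions per nonsynonymous site
    ks: synonymous substitutions per synonymous site
    tolerance: hypothetical band around 1.0 treated as neutral
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio < 1 - tolerance:
        return "purifying"
    if ratio > 1 + tolerance:
        return "positive"
    return "neutral"

# Hypothetical Ka and Ks values for three duplicated gene pairs.
pairs = {"pair_1": (0.10, 0.50), "pair_2": (0.50, 0.50), "pair_3": (0.60, 0.50)}
calls = {name: classify_selection(ka, ks) for name, (ka, ks) in pairs.items()}
print(calls)
```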

  4. Systematic drug safety evaluation based on public genomic expression (Connectivity Map) data: Myocardial and infectious adverse reactions as application cases

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Kejian, E-mail: kejian.wang.bio@gmail.com [Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, Shanghai (China); Weng, Zuquan [Japan National Institute of Occupational Safety and Health, Kawasaki (Japan); Sun, Liya [Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, Shanghai (China); Sun, Jiazhi; Zhou, Shu-Feng [Department of Pharmaceutical Sciences, College of Pharmacy, University of South Florida, Tampa, FL (United States); He, Lin, E-mail: helin@Bio-X.com [Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, Shanghai (China)

    2015-02-13

    Adverse drug reaction (ADR) is of great importance to both regulatory agencies and the pharmaceutical industry. Various techniques, such as quantitative structure–activity relationship (QSAR) and animal toxicology, are widely used to identify potential risks during the preclinical stage of drug development. Despite these efforts, drugs with safety liabilities can still pass through safety checkpoints and enter the market. This situation raises the concern that conventional chemical structure analysis and phenotypic screening are not sufficient to avoid all clinical adverse events. Genomic expression data following in vitro drug treatments characterize drug actions and thus have become widely used in drug repositioning. In the present study, we explored prediction of ADRs based on the drug-induced gene-expression profiles from cultured human cells in the Connectivity Map (CMap) database. The results showed that drugs inducing comparable ADRs generally lead to similar CMap expression profiles. Based on such ADR-gene expression association, we established prediction models for various ADRs, including severe myocardial and infectious events. Drugs with FDA boxed warnings of safety liability were effectively identified. We therefore suggest that drug-induced gene expression change, in combination with effective computational methods, may provide a new dimension of information to facilitate systematic drug safety evaluation. - Highlights: • Drugs causing common toxicity lead to similar in vitro gene expression changes. • We built a model to predict drug toxicity with drug-specific expression profiles. • Drugs with FDA black box warnings were effectively identified by our model. • In vitro assay can detect severe toxicity in the early stage of drug development.

  5. Systematic drug safety evaluation based on public genomic expression (Connectivity Map) data: Myocardial and infectious adverse reactions as application cases

    International Nuclear Information System (INIS)

    Wang, Kejian; Weng, Zuquan; Sun, Liya; Sun, Jiazhi; Zhou, Shu-Feng; He, Lin

    2015-01-01

    Adverse drug reaction (ADR) is of great importance to both regulatory agencies and the pharmaceutical industry. Various techniques, such as quantitative structure–activity relationship (QSAR) and animal toxicology, are widely used to identify potential risks during the preclinical stage of drug development. Despite these efforts, drugs with safety liabilities can still pass through safety checkpoints and enter the market. This situation raises the concern that conventional chemical structure analysis and phenotypic screening are not sufficient to avoid all clinical adverse events. Genomic expression data following in vitro drug treatments characterize drug actions and thus have become widely used in drug repositioning. In the present study, we explored prediction of ADRs based on the drug-induced gene-expression profiles from cultured human cells in the Connectivity Map (CMap) database. The results showed that drugs inducing comparable ADRs generally lead to similar CMap expression profiles. Based on such ADR-gene expression association, we established prediction models for various ADRs, including severe myocardial and infectious events. Drugs with FDA boxed warnings of safety liability were effectively identified. We therefore suggest that drug-induced gene expression change, in combination with effective computational methods, may provide a new dimension of information to facilitate systematic drug safety evaluation. - Highlights: • Drugs causing common toxicity lead to similar in vitro gene expression changes. • We built a model to predict drug toxicity with drug-specific expression profiles. • Drugs with FDA black box warnings were effectively identified by our model. • In vitro assay can detect severe toxicity in the early stage of drug development

  6. Integration of expression data in genome-scale metabolic network reconstructions

    Directory of Open Access Journals (Sweden)

    Anna S. Blazier

    2012-08-01

    Full Text Available With the advent of high-throughput technologies, the field of systems biology has amassed an abundance of omics data, quantifying thousands of cellular components across a variety of scales, ranging from mRNA transcript levels to metabolite quantities. Methods are needed not only to integrate these omics data but also to use them to heighten the predictive capabilities of computational models. Several recent studies have successfully demonstrated how flux balance analysis (FBA), a constraint-based modeling approach, can be used to integrate transcriptomic data into genome-scale metabolic network reconstructions to generate predictive computational models. In this review, we summarize such FBA-based methods for integrating expression data into genome-scale metabolic network reconstructions, highlighting their advantages as well as their limitations.
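    For orientation, the standard FBA problem can be written as a linear program. Here S is the stoichiometric matrix, v the vector of reaction fluxes, c the objective weights (for example, a biomass reaction), and l and u the flux bounds. The last line is only one illustrative way to bring expression data in, by scaling an upper bound with a normalized expression score e_j in [0, 1]; that scaling and the symbol u^max are assumptions made for this sketch, and the abstract does not say which integration schemes the review covers.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{s.t.}\quad & S v = 0, \\
& l_j \le v_j \le u_j, \qquad j = 1, \dots, n, \\
& u_j = e_j \, u_j^{\max} \quad \text{(illustrative expression-scaled bound)}
\end{aligned}
```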

  7. Prepatterning of developmental gene expression by modified histones before zygotic genome activation

    DEFF Research Database (Denmark)

    Lindeman, Leif C.; Andersen, Ingrid S.; Reiner, Andrew H.

    2011-01-01

    A hallmark of anamniote vertebrate development is a window of embryonic transcription-independent cell divisions before onset of zygotic genome activation (ZGA). Chromatin determinants of ZGA are unexplored; however, marking of developmental genes by modified histones in sperm suggests a predictive...... role of histone marks for ZGA. In zebrafish, pre-ZGA development for ten cell cycles provides an opportunity to examine whether genomic enrichment in modified histones is present before initiation of transcription. By profiling histone H3 trimethylation on all zebrafish promoters before and after ZGA......, we demonstrate here an epigenetic prepatterning of developmental gene expression. This involves pre-ZGA marking of transcriptionally inactive genes involved in homeostatic and developmental regulation by permissive H3K4me3 with or without repressive H3K9me3 or H3K27me3. Our data suggest that histone...

  8. Genome-wide gene expression dataset used to identify potential therapeutic targets in androgenetic alopecia

    Directory of Open Access Journals (Sweden)

    R. Dey-Rao

    2017-08-01

    Full Text Available The microarray dataset attached to this report is related to the research article with the title: “A genomic approach to susceptibility and pathogenesis leads to identifying potential novel therapeutic targets in androgenetic alopecia” (Dey-Rao and Sinha, 2017 [1]). Male-pattern hair loss that is induced by androgens (testosterone) in genetically predisposed individuals is known as androgenetic alopecia (AGA). The raw dataset is being made publicly available to enable critical and/or extended analyses. Our related research paper utilizes the attached raw dataset for genome-wide gene-expression associated investigations. Combined with several in silico bioinformatics-based analyses, we were able to delineate five strategic molecular elements as potential novel targets towards future AGA therapy.

  9. Genome-wide analysis of gene expression in primate taste buds reveals links to diverse processes.

    Directory of Open Access Journals (Sweden)

    Peter Hevezi

    Full Text Available Efforts to unravel the mechanisms underlying taste sensation (gustation) have largely focused on rodents. Here we present the first comprehensive characterization of gene expression in primate taste buds. Our findings reveal unique new insights into the biology of taste buds. We generated a taste bud gene expression database using laser capture microdissection (LCM) procured fungiform (FG) and circumvallate (CV) taste buds from primates. We also used LCM to collect the top and bottom portions of CV taste buds. Affymetrix genome wide arrays were used to analyze gene expression in all samples. Known taste receptors are preferentially expressed in the top portion of taste buds. Genes associated with the cell cycle and stem cells are preferentially expressed in the bottom portion of taste buds, suggesting that precursor cells are located there. Several chemokines including CXCL14 and CXCL8 are among the highest expressed genes in taste buds, indicating that immune system related processes are active in taste buds. Several genes expressed specifically in endocrine glands including growth hormone releasing hormone and its receptor are also strongly expressed in taste buds, suggesting a link between metabolism and taste. Cell type-specific expression of transcription factors and signaling molecules involved in cell fate, including KIT, reveals the taste bud as an active site of cell regeneration, differentiation, and development. IKBKAP, a gene mutated in familial dysautonomia, a disease that results in loss of taste buds, is expressed in taste cells that communicate with afferent nerve fibers via synaptic transmission. This database highlights the power of LCM coupled with transcriptional profiling to dissect the molecular composition of normal tissues, represents the most comprehensive molecular analysis of primate taste buds to date, and provides a foundation for further studies in diverse aspects of taste biology.

  10. Transcriptional interference networks coordinate the expression of functionally related genes clustered in the same genomic loci.

    Science.gov (United States)

    Boldogköi, Zsolt

    2012-01-01

    The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organization, transcription, various post-transcriptional processes, and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighboring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronized cascade of gene expression in functionally linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular organisms too.

  11. Whole-Genome Analysis of a Novel Fish Reovirus (MsReV) Discloses Aquareovirus Genomic Structure Relationship with Host in Saline Environments

    Directory of Open Access Journals (Sweden)

    Zhong-Yuan Chen

    2015-08-01

    Aquareoviruses are serious pathogens of aquatic animals. Here, genome characterization and functional gene analysis of a novel aquareovirus, largemouth bass Micropterus salmoides reovirus (MsReV), are described. It comprises 11 dsRNA segments (S1–S11) covering 24,024 bp, and encodes 12 putative proteins including the inclusion forming-related protein NS87 and the fusion-associated small transmembrane (FAST) protein NS22. The function of NS22 was confirmed by expression in fish cells. Subsequently, MsReV was compared with two representative aquareoviruses, saltwater fish turbot Scophthalmus maximus reovirus (SMReV) and freshwater fish grass carp reovirus strain 109 (GCReV-109). MsReV NS87 and NS22 genes have the same structure and function as those of SMReV, whereas GCReV-109 is either missing the coiled-coil region in NS79 or the gene encoding NS22. Significant similarities are also revealed among equivalent genome segments between MsReV and SMReV, but a difference is found between MsReV and GCReV-109. Furthermore, phylogenetic analysis showed that 13 aquareoviruses could be divided into freshwater and saline environment subgroups, and MsReV was closely related to SMReV in saline environments. Consequently, these viruses from hosts in saline environments have more genomic structural similarities than the viruses from hosts in freshwater. This is the first study of the relationships between aquareovirus genomic structure and their host environments.

  12. Comparative Genomics in Switchgrass Using 61,585 High-Quality Expressed Sequence Tags

    Directory of Open Access Journals (Sweden)

    Christian M. Tobias

    2008-11-01

    The development of genomic resources for switchgrass (Panicum virgatum L.), a perennial NAD-malic enzyme type C4 grass, is required to enable molecular breeding and biotechnological approaches for improving its value as a forage and bioenergy crop. Expressed sequence tag (EST) sequencing is one method that can quickly sample gene inventories and produce data suitable for marker development or analysis of tissue-specific patterns of expression. Toward this goal, three cDNA libraries from callus, crown, and seedling tissues of ‘Kanlow’ switchgrass were end-sequenced to generate a total of 61,585 high-quality ESTs from 36,565 separate clones. Seventy-three percent of the assembled consensus sequences could be aligned with the sorghum [Sorghum bicolor (L.) Moench] genome at an E-value of <1 × 10, indicating a high degree of similarity. Sixty-five percent of the ESTs matched with gene ontology molecular terms, and 3.3% of the sequences were matched with genes that play potential roles in cell-wall biogenesis. The representation in the three libraries of gene families known to be associated with C4 photosynthesis, cellulose and β-glucan synthesis, phenylpropanoid biosynthesis, and peroxidase activity indicated likely roles for individual family members. Pairwise comparisons of synonymous codon substitutions were used to assess genome sequence diversity and indicated an overall similarity between the two genome copies present in the tetraploid. Identification of EST–simple sequence repeat markers and amplification on two individual parents of a mapping population yielded an average of 2.18 amplicons per individual, and 35% of the markers produced fragment length polymorphisms.

  13. Alignment-free comparative genomic screen for structured RNAs using coarse-grained secondary structure dot plots

    DEFF Research Database (Denmark)

    Kato, Yuki; Gorodkin, Jan; Havgaard, Jakob Hull

    2017-01-01

    . Methods: Here we present a fast and efficient method, DotcodeR, for detecting structurally similar RNAs in genomic sequences by comparing their corresponding coarse-grained secondary structure dot plots at string level. This allows us to perform an all-against-all scan of all window pairs from two genomes...... without alignment. Results: Our computational experiments with simulated data and real chromosomes demonstrate that the presented method has good sensitivity. Conclusions: DotcodeR can be useful as a pre-filter in a genomic comparative scan for structured RNAs....

  14. From structure prediction to genomic screens for novel non-coding RNAs

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.

    2011-01-01

    Abstract: Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction....... This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early...... upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other....

  15. Impact of antenatal glucocorticosteroids on whole-genome expression in preterm babies.

    Science.gov (United States)

    Saugstad, Ola Didrik; Kwinta, Przemko; Wollen, Embjørg Julianne; Bik-Multanowski, Mirosław; Madetko-Talowska, Anna; Jagła, Mateusz; Tomasik, Tomasz; Pietrzyk, Jacek Józef

    2013-04-01

    To study the impact that using antenatal steroid to treat threatened preterm delivery has on whole-genome expression. A prospective whole-genome expression study was carried out on 50 newborn infants, delivered before 32 weeks gestation, who had been exposed to antenatal steroids, including 40 who had received a full antenatal steroid course. Seventy infants not exposed to antenatal steroids formed the control group. Microarray analyses were performed five and 28 days after delivery, and the results were validated by real-time PCR. The study was conducted between September 2008 and November 2010. Twenty thousand six hundred and ninety-three genes were studied in the infants' leucocytes. Thirteen were differentially expressed 5 days after delivery, but there were no differences at day 28. Four genes related to cancer or inflammation were up-regulated. Nine genes were down-regulated: six were Y-linked and associated with malignancies, graft-versus-host disease, male infertility and cell differentiation and three were associated with pre-eclampsia, oxidative stress and chloride/bicarbonate exchange. Seven gene pathways were up-regulated at day five and only one at day 28. These were associated with cell growth, cell cycle regulation, metabolism and apoptosis. Antenatal steroid therapy affects a limited number of genes and gene pathways in leucocytes in preterm babies at day five of life. The effect is short-lived, but long-term effects cannot be ruled out. ©2013 The Author(s)/Acta Paediatrica ©2013 Foundation Acta Paediatrica.

  16. Genome-wide analysis of the expansin gene superfamily reveals grapevine-specific structural and functional characteristics.

    Directory of Open Access Journals (Sweden)

    Silvia Dal Santo

    BACKGROUND: Expansins are proteins that loosen plant cell walls in a pH-dependent manner, probably by increasing the relative movement among polymers thus causing irreversible expansion. The expansin superfamily (EXP) comprises four distinct families: expansin A (EXPA), expansin B (EXPB), expansin-like A (EXLA) and expansin-like B (EXLB). There is experimental evidence that EXPA and EXPB proteins are required for cell expansion and developmental processes involving cell wall modification, whereas the exact functions of EXLA and EXLB remain unclear. The complete grapevine (Vitis vinifera) genome sequence has allowed the characterization of many gene families, but an exhaustive genome-wide analysis of expansin gene expression has not been attempted thus far. METHODOLOGY/PRINCIPAL FINDINGS: We identified 29 EXP superfamily genes in the grapevine genome, representing all four EXP families. Members of the same EXP family shared the same exon-intron structure, and phylogenetic analysis confirmed a closer relationship between EXP genes from woody species, i.e. grapevine and poplar (Populus trichocarpa), compared to those from Arabidopsis thaliana and rice (Oryza sativa). We also identified grapevine-specific duplication events involving the EXLB family. Global gene expression analysis confirmed a strong correlation among EXP genes expressed in mature and green/vegetative samples, respectively, as reported for other gene families in the recently-published grapevine gene expression atlas. We also observed the specific co-expression of EXLB genes in woody organs, and the involvement of certain grapevine EXP genes in berry development and post-harvest withering. CONCLUSION: Our comprehensive analysis of the grapevine EXP superfamily confirmed and extended current knowledge about the structural and functional characteristics of this gene family, and also identified properties that are currently unique to grapevine expansin genes. Our data provide a model for the

  17. Strategies used for genetically modifying bacterial genome: site-directed mutagenesis, gene inactivation, and gene over-expression

    Science.gov (United States)

    Xu, Jian-zhong; Zhang, Wei-guo

    2016-01-01

    With the availability of the whole genome sequence of Escherichia coli or Corynebacterium glutamicum, strategies for directed DNA manipulation have developed rapidly. DNA manipulation plays an important role in understanding the function of genes and in constructing novel engineering bacteria according to requirement. DNA manipulation involves modifying the autologous genes and expressing the heterogenous genes. Two alternative approaches, using electroporation linear DNA or recombinant suicide plasmid, allow a wide variety of DNA manipulation. However, the over-expression of the desired gene is generally executed via plasmid-mediation. The current review summarizes the common strategies used for genetically modifying E. coli and C. glutamicum genomes, and discusses the technical problem of multi-layered DNA manipulation. Strategies for gene over-expression via integrating into genome are proposed. This review is intended to be an accessible introduction to DNA manipulation within the bacterial genome for novices and a source of the latest experimental information for experienced investigators. PMID:26834010

  18. Use of whole genome expression analysis in the toxicity screening of nanoparticles

    International Nuclear Information System (INIS)

    Fröhlich, Eleonore; Meindl, Claudia; Wagner, Karin; Leitinger, Gerd; Roblegg, Eva

    2014-01-01

    The use of nanoparticles (NPs) offers exciting new options in technical and medical applications provided they do not cause adverse cellular effects. Cellular effects of NPs depend on particle parameters and exposure conditions. In this study, whole genome expression arrays were employed to identify the influence of particle size, cytotoxicity, protein coating, and surface functionalization of polystyrene particles as model particles and for short carbon nanotubes (CNTs) as particles with potential interest in medical treatment. Another aim of the study was to find out whether screening by microarray would identify other or additional targets than commonly used cell-based assays for NP action. Whole genome expression analysis and assays for cell viability, interleukin secretion, oxidative stress, and apoptosis were employed. Similar to conventional assays, microarray data identified inflammation, oxidative stress, and apoptosis as affected by NP treatment. Application of lower particle doses and presence of protein decreased the total number of regulated genes but did not markedly influence the top regulated genes. Cellular effects of CNTs were small; only carboxyl-functionalized single-walled CNTs caused appreciable regulation of genes. It can be concluded that regulated functions correlated well with results in cell-based assays. Presence of protein mitigated cytotoxicity but did not cause a different pattern of regulated processes. - Highlights: • Regulated functions were screened using whole genome expression assays. • Polystyrene particles regulated more genes than short carbon nanotubes. • Protein coating of polystyrene particles did not change regulation pattern. • Functions regulated by microarray were confirmed by cell-based assay

  19. Use of whole genome expression analysis in the toxicity screening of nanoparticles

    Energy Technology Data Exchange (ETDEWEB)

    Fröhlich, Eleonore, E-mail: eleonore.froehlich@medunigraz.at [Center for Medical Research, Medical University of Graz, Stiftingtalstr. 24, 8010 Graz (Austria); Meindl, Claudia; Wagner, Karin [Center for Medical Research, Medical University of Graz, Stiftingtalstr. 24, 8010 Graz (Austria); Leitinger, Gerd [Center for Medical Research, Medical University of Graz, Stiftingtalstr. 24, 8010 Graz (Austria); Institute for Cell Biology, Histology and Embryology, Medical University of Graz, Harrachgasse 21, 8010 Graz (Austria); Roblegg, Eva [Institute of Pharmaceutical Sciences, Department of Pharmaceutical Technology, Karl-Franzens-University of Graz, Universitätsplatz 1, 8010 Graz (Austria)

    2014-10-15

    The use of nanoparticles (NPs) offers exciting new options in technical and medical applications provided they do not cause adverse cellular effects. Cellular effects of NPs depend on particle parameters and exposure conditions. In this study, whole genome expression arrays were employed to identify the influence of particle size, cytotoxicity, protein coating, and surface functionalization of polystyrene particles as model particles and for short carbon nanotubes (CNTs) as particles with potential interest in medical treatment. Another aim of the study was to find out whether screening by microarray would identify other or additional targets than commonly used cell-based assays for NP action. Whole genome expression analysis and assays for cell viability, interleukin secretion, oxidative stress, and apoptosis were employed. Similar to conventional assays, microarray data identified inflammation, oxidative stress, and apoptosis as affected by NP treatment. Application of lower particle doses and presence of protein decreased the total number of regulated genes but did not markedly influence the top regulated genes. Cellular effects of CNTs were small; only carboxyl-functionalized single-walled CNTs caused appreciable regulation of genes. It can be concluded that regulated functions correlated well with results in cell-based assays. Presence of protein mitigated cytotoxicity but did not cause a different pattern of regulated processes. - Highlights: • Regulated functions were screened using whole genome expression assays. • Polystyrene particles regulated more genes than short carbon nanotubes. • Protein coating of polystyrene particles did not change regulation pattern. • Functions regulated by microarray were confirmed by cell-based assay.

  20. Long-term in vitro, cell-type-specific genome-wide reprogramming of gene expression

    International Nuclear Information System (INIS)

    Hakelien, Anne-Mari; Gaustad, Kristine G.; Taranger, Christel K.; Skalhegg, Bjorn S.; Kuentziger, Thomas; Collas, Philippe

    2005-01-01

    We demonstrate a cell extract-based, genome-wide and heritable reprogramming of gene expression in vitro. Kidney epithelial 293T cells have previously been shown to take on T cell properties following a brief treatment with an extract of Jurkat T cells. We show here that 293T cells exposed for 1 h to a Jurkat cell extract undergo genome-wide, target cell-type-specific and long-lasting transcriptional changes. Microarray analyses indicate that on any given week after extract treatment, ∼2500 genes are upregulated >3-fold, of which ∼900 are also expressed in Jurkat cells. Concomitantly, ∼1500 genes are downregulated or repressed, of which ∼500 are also downregulated in Jurkat cells. Gene expression changes persist for over 30 passages (∼80 population doublings) in culture. Target cell-type specificity of these changes is shown by the lack of activation or repression of Jurkat-specific genes by extracts of 293T cells or carcinoma cells. Quantitative RT-PCR analysis confirms the long-term transcriptional activation of genes involved in key T cell functions. Additionally, growth of cells in suspended aggregates, expression of CD3 and CD28 T cell surface markers, and interleukin-2 secretion by 293T cells treated with extract of adult peripheral blood T cells illustrate a functional nuclear reprogramming. Therefore, target cell-type-specific and heritable changes in gene expression, and alterations in cell function, can be promoted by extracts derived from transformed cells as well as from adult primary cells
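    The counting described in this abstract (genes upregulated more than 3-fold, and the subset also expressed in the reference cell type) can be illustrated with a minimal sketch. The gene names, expression values, and the helper `upregulated_genes` below are hypothetical and are not taken from the study; the sketch only shows the thresholding and set-overlap logic.

```python
def upregulated_genes(treated, control, min_fold=3.0):
    """Return genes whose treated/control ratio exceeds min_fold.

    Genes missing from the control, or with a non-positive control
    level, are skipped because no ratio can be formed for them.
    """
    hits = set()
    for gene, treated_level in treated.items():
        control_level = control.get(gene)
        if control_level is None or control_level <= 0:
            continue
        if treated_level / control_level > min_fold:
            hits.add(gene)
    return hits


# Hypothetical expression levels (arbitrary units), for illustration only.
control = {"geneA": 10.0, "geneB": 20.0, "geneC": 5.0, "geneD": 8.0}
treated = {"geneA": 45.0, "geneB": 30.0, "geneC": 20.0, "geneD": 24.0}

# Hypothetical set of genes expressed in the reference (donor) cell type.
reference_expressed = {"geneA", "geneD"}

up = upregulated_genes(treated, control)
shared = up & reference_expressed
print(sorted(up), sorted(shared))
```

    Note that the threshold is strict: a gene at exactly 3-fold (geneD here) is not counted, matching the ">3-fold" wording.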

  1. Single Amplified Genomes as Source for Novel Extremozymes: Annotation, Expression and Functional Assessment

    KAUST Repository

    Grötzinger, Stefan

    2017-12-01

    Enzymes, as nature’s catalysts, show remarkable abilities that can revolutionize the chemical, biotechnological, bioremediation, agricultural and pharmaceutical industries. However, the narrow range of stability of the majority of described biocatalysts limits their use for many applications. To overcome these restrictions, extremozymes derived from microorganisms thriving under harsh conditions can be used. Extremophiles living in high salinity are especially interesting as they operate at low water activity, which is similar to conditions used in standard chemical applications. Because only about 0.1 % of all microorganisms can be cultured, the traditional way of culture-based enzyme function determination needs to be overcome. The rise of high-throughput next-generation-sequencing technologies allows for deep insight into nature’s variety. Single amplified genomes (SAGs) specifically allow for whole genome assemblies from small sample volumes with low cell yields, as are typical for extreme environments. Although these technologies have been available for years, the expected boost in biotechnology has held off. One of the main reasons is the lack of reliable functional annotation of the genomic data, which is caused by the low amount (0.15 %) of experimentally described genes. Here, we present a novel annotation algorithm, designed to annotate the enzymatic function of genomes from microorganisms with low homologies to described microorganisms. The algorithm was established on SAGs from the extreme environment of selected hypersaline Red Sea brine pools with 4.3 M salinity and temperatures up to 68°C. Additionally, a novel consensus pattern for the identification of γ-carbonic anhydrases was created and applied in the algorithm. To verify the annotation, selected genes were expressed in the hypersaline expression system Halobacterium salinarum. This expression system was established and optimized in a continuously stirred tank reactor, leading to

  2. Distributed probing of chromatin structure in vivo reveals pervasive chromatin accessibility for expressed and non-expressed genes during tissue differentiation in C. elegans

    Directory of Open Access Journals (Sweden)

    Sha Ky

    2010-08-01

    Background: Tissue differentiation is accompanied by genome-wide changes in the underlying chromatin structure and dynamics, or epigenome. By controlling when, where, and what regulatory factors have access to the underlying genomic DNA, the epigenome influences the cell's transcriptome and ultimately its function. Existing genomic methods for analyzing cell-type-specific changes in chromatin generally involve two elements: (i) a source for purified cells (or nuclei) of distinct types, and (ii) a specific treatment that partitions or degrades chromatin by activity or structural features. For many cell types of great interest, such assays are limited by our inability to isolate the relevant cell populations in an organism or complex tissue containing an intertwined mixture of other cells. This limitation has confined available knowledge of chromatin dynamics to a narrow range of biological systems (cell types that can be sorted/separated/dissected in large numbers and tissue culture models) or to amalgamations of diverse cell types (tissue chunks, whole organisms). Results: Transgene-driven expression of DNA/chromatin modifying enzymes provides one opportunity to query chromatin structures in expression-defined cell subsets. In this work we combine in vivo expression of a bacterial DNA adenine methyltransferase (DAM) with high throughput sequencing to sample tissue-specific chromatin accessibility on a genome-wide scale. We have applied the method (DALEC: Direct Asymmetric Ligation End Capture) towards mapping a cell-type-specific view of genome accessibility as a function of differentiated state. Taking advantage of C. elegans strains expressing the DAM enzyme in diverse tissues (body wall muscle, gut, and hypodermis), our efforts yield a genome-wide dataset measuring chromatin accessibility at each of 538,000 DAM target sites in the C. elegans (diploid) genome. Conclusions: Validating the DALEC mapping results, we observe a strong association

  3. Genome-wide gene expression profiling of low-dose, long-term exposure of human osteosarcoma cells to bisphenol A and its analogs bisphenols AF and S.

    Science.gov (United States)

    Fic, A; Mlakar, S Jurković; Juvan, P; Mlakar, V; Marc, J; Dolenc, M Sollner; Broberg, K; Mašič, L Peterlin

    2015-08-01

    The bisphenols AF (BPAF) and S (BPS) are structural analogs of the endocrine disruptor bisphenol A (BPA), and are used in common products as a replacement for BPA. To elucidate genome-wide gene expression responses, estrogen-dependent osteosarcoma cells were cultured with 10 nM BPA, BPAF, or BPS, for 8 h and 3 months. Genome-wide gene expression was analyzed using the Illumina Expression BeadChip. Three months exposure had significant effects on gene expression, particularly for BPS, followed by BPAF and BPA, according to the number of differentially expressed genes (1980, 778, 60, respectively), the magnitude of changes in gene expression, and the number of enriched biological processes (800, 415, 33, respectively) and pathways (77, 52, 6, respectively). 'Embryonic skeletal system development' was the most enriched bone-related process, which was affected only by BPAF and BPS. Interestingly, all three bisphenols showed highest down-regulation of genes related to the cardiovascular system (e.g., NPPB, NPR3, TXNIP). BPA only and BPA/BPAF/BPS also affected genes related to the immune system and fetal development, respectively. For BPAF and BPS, the 'isoprenoid biosynthetic process' was enriched (up-regulated genes: HMGCS1, PDSS1, ACAT2, RCE1, DHDDS). Compared to BPA, BPAF and BPS had more effects on gene expression after long-term exposure. These findings stress the need for careful toxicological characterization of BPA analogs in the future. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. Neuropilin-2 genomic elements drive cre recombinase expression in primitive blood, vascular and neuronal lineages.

    Science.gov (United States)

    Wiszniak, Sophie; Scherer, Michaela; Ramshaw, Hayley; Schwarz, Quenten

    2015-11-01

    We have established a novel Cre mouse line, using genomic elements encompassing the Nrp2 locus, present within a bacterial artificial chromosome clone. By crossing this Cre driver line to R26R LacZ reporter mice, we have documented the temporal expression and lineage traced tissues in which Cre is expressed. Nrp2-Cre drives expression in primitive blood cells arising from the yolk sac, venous and lymphatic endothelial cells, peripheral sensory ganglia, and the lung bud. This mouse line will provide a new tool to researchers wishing to study the development of various tissues and organs in which this Cre driver is expressed, as well as allow tissue-specific knockout of genes of interest to study protein function. This work also presents the first evidence for expression of Nrp2 protein in a mesodermal progenitor with restricted hematopoietic potential, which will significantly advance the study of primitive erythropoiesis. genesis 53:709-717, 2015. © 2015 Wiley Periodicals, Inc.

  5. Statistical properties of thermodynamically predicted RNA secondary structures in viral genomes

    Science.gov (United States)

    Spanò, M.; Lillo, F.; Miccichè, S.; Mantegna, R. N.

    2008-10-01

    By performing a comprehensive study on 1832 segments of 1212 complete genomes of viruses, we show that in viral genomes the hairpin structures of thermodynamically predicted RNA secondary structures are more abundant than expected under a simple random null hypothesis. The detected hairpin structures of RNA secondary structures are present both in coding and in noncoding regions for the four groups of viruses categorized as dsDNA, dsRNA, ssDNA and ssRNA. For all groups, hairpin structures of RNA secondary structures are detected more frequently than expected for a random null hypothesis in noncoding rather than in coding regions. However, potential RNA secondary structures are also present in coding regions of the dsDNA group. In fact, we detect evolutionarily conserved RNA secondary structures in conserved coding and noncoding regions of a large set of complete genomes of dsDNA herpesviruses.
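    The idea of testing hairpin abundance against a random null hypothesis can be sketched as a shuffle test. This is only an illustration under stated assumptions: the study uses thermodynamic structure prediction, whereas the sketch below swaps in a much cruder proxy (counting exact reverse-complement stems separated by a short loop) and a mononucleotide shuffle as the null model. The function names, stem and loop lengths, and number of shuffles are all hypothetical choices.

```python
import random

# Watson-Crick complements for RNA; any other symbol maps to "-",
# which never matches a base, so ambiguous positions cannot pair.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def count_hairpin_candidates(seq, stem=4, min_loop=3, max_loop=8):
    """Count start positions whose stem-length segment is followed,
    after a loop of min_loop..max_loop bases, by its reverse complement."""
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    count = 0
    for i in range(n - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = "".join(COMPLEMENT.get(b, "-") for b in reversed(left))
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > n:
                break
            if seq[j:j + stem] == target:
                count += 1
                break  # count each start position at most once
    return count


def shuffle_test(seq, n_shuffles=200, seed=0, **kwargs):
    """Compare the observed count with counts from shuffled sequences.

    Returns (observed, mean_null, empirical_p), where empirical_p is the
    fraction of shuffles with a count >= observed (with +1 smoothing).
    """
    rng = random.Random(seed)
    observed = count_hairpin_candidates(seq, **kwargs)
    bases = list(seq.upper().replace("T", "U"))
    null_counts = []
    for _ in range(n_shuffles):
        rng.shuffle(bases)  # preserves base composition only
        null_counts.append(count_hairpin_candidates("".join(bases), **kwargs))
    exceed = sum(c >= observed for c in null_counts)
    p_value = (exceed + 1) / (n_shuffles + 1)
    return observed, sum(null_counts) / n_shuffles, p_value


# Toy sequence built from a repeated perfect hairpin (GGGG-AAA-CCCC).
toy = "GGGGAAACCCC" * 5
observed, mean_null, p_value = shuffle_test(toy, n_shuffles=50)
print(observed, mean_null, p_value)
```

    A mononucleotide shuffle preserves only base composition; a null model that also preserves dinucleotide frequencies or codon structure would be a stricter comparison, particularly for coding regions.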

  6. The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis.

    Science.gov (United States)

    Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

    2015-01-01

    Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var. deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5' portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16 taxa angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids.

  7. Computational biology of genome expression and regulation--a review of microarray bioinformatics.

    Science.gov (United States)

    Wang, Junbai

    2008-01-01

    Microarray technology is widely used in various biomedical research areas, and the corresponding microarray data analysis is an essential step toward making the best use of array technologies. Here we review two components of microarray data analysis: low-level analysis, which emphasizes the design, quality control, and preprocessing of microarray experiments, and high-level analysis, which focuses on domain-specific microarray applications such as tumor classification, biomarker prediction, analysis of array CGH experiments, and reverse engineering of gene expression networks. Additionally, we review recent developments in building predictive models in genome expression and regulation studies. This review may help biologists grasp the basics of microarray bioinformatics as well as its potential impact on the future development of biomedical research fields.

  8. In silico method for modelling metabolism and gene product expression at genome scale

    Energy Technology Data Exchange (ETDEWEB)

    Lerman, Joshua A.; Hyduke, Daniel R.; Latif, Haythem; Portnoy, Vasiliy A.; Lewis, Nathan E.; Orth, Jeffrey D.; Rutledge, Alexandra C.; Smith, Richard D.; Adkins, Joshua N.; Zengler, Karsten; Palsson, Bernard O.

    2012-07-03

    Transcription and translation use raw materials and energy generated metabolically to create the macromolecular machinery responsible for all cellular functions, including metabolism. A biochemically accurate model of molecular biology and metabolism will facilitate comprehensive and quantitative computations of an organism's molecular constitution as a function of genetic and environmental parameters. Here we formulate a model of metabolism and macromolecular expression. Prototyping it using the simple microorganism Thermotoga maritima, we show our model accurately simulates variations in cellular composition and gene expression. Moreover, through in silico comparative transcriptomics, the model allows the discovery of new regulons and the improvement of genome and transcription unit annotations. Our method presents a framework for investigating molecular biology and cellular physiology in silico and may allow quantitative interpretation of multi-omics data sets in the context of an integrated biochemical description of an organism.

  9. A sequence-based survey of the complex structural organization of tumor genomes

    Energy Technology Data Exchange (ETDEWEB)

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  10. Genome-wide expressions in autologous eutopic and ectopic endometrium of fertile women with endometriosis.

    Science.gov (United States)

    Khan, Meraj A; Sengupta, Jayasree; Mittal, Suneeta; Ghosh, Debabrata

    2012-09-24

    To gain insight into the pathophysiology of endometriosis, genome-wide expression analyses of eutopic and ectopic endometrium have been reported previously; however, the effects of stages of severity and phases of the menstrual cycle on expression profiles have not been examined. The effect of genetic heterogeneity and fertility history on transcriptional activity was also not considered. In the present study, a genome-wide expression analysis of autologous, paired eutopic and ectopic endometrial samples obtained from fertile women (n=18) suffering from moderate (stage 3; n=8) or severe (stage 4; n=10) ovarian endometriosis during proliferative (n=13) and secretory (n=5) phases of the menstrual cycle was performed. Individual pure RNA samples were subjected to Agilent's Whole Human Genome 44K microarray experiments. Microarray data were validated (P ... copy numbers by performing real-time RT-PCR of seven (7) arbitrarily selected genes in all samples. The data obtained were subjected to differential expression (DE) and differential co-expression (DC) analyses followed by network and enrichment analyses, and gene set enrichment analysis (GSEA). The reproducibility of prediction based on GSEA implementation of DC results was assessed by examining the relative expression of twenty-eight (28) selected genes in RNA samples obtained from a fresh pool of eutopic and ectopic samples from confirmed ovarian endometriosis patients with stages 3 and 4 (n=4/each) during proliferative and secretory (n=4/each) phases. A higher clustering effect of pairing (cluster distance, cd=0.1) in samples from the same individuals on expression arrays among eutopic and ectopic samples was observed as compared to that of clinical stages of severity (cd=0.5) and phases of the menstrual cycle (cd=0.6). Post hoc analysis revealed anomalies in the expression profiles of several genes associated with immunological, neuracrine and endocrine functions and gynecological cancers, however with no overt oncogenic

  11. Whole-genome gene expression profiling of formalin-fixed, paraffin-embedded tissue samples.

    Directory of Open Access Journals (Sweden)

    Craig April

    2009-12-01

    Full Text Available We have developed a gene expression assay (Whole-Genome DASL), capable of generating whole-genome gene expression profiles from degraded samples such as formalin-fixed, paraffin-embedded (FFPE) specimens. We demonstrated a similar level of sensitivity in gene detection between matched fresh-frozen (FF) and FFPE samples, with the number and overlap of probes detected in the FFPE samples being approximately 88% and 95% of that in the corresponding FF samples, respectively; 74% of the differentially expressed probes overlapped between the FF and FFPE pairs. The WG-DASL assay is also able to detect 1.3-1.5-fold and 1.5-2-fold changes in intact and FFPE samples, respectively. The dynamic range for the assay is approximately 3 logs. Comparing the WG-DASL assay with an in vitro transcription-based labeling method yielded fold-change correlations of R² approximately 0.83, while fold-change comparisons with quantitative RT-PCR assays yielded R² approximately 0.86 and R² approximately 0.55 for intact and FFPE samples, respectively. Additionally, the WG-DASL assay yielded high self-correlations (R² > 0.98) with low intact RNA inputs ranging from 1 ng to 100 ng; reproducible expression profiles were also obtained with 250 pg total RNA (R² approximately 0.92), with approximately 71% of the probes detected in 100 ng total RNA also detected at the 250 pg level. When FFPE samples were assayed, 1 ng total RNA yielded self-correlations of R² approximately 0.80, while still maintaining a correlation of R² approximately 0.75 with standard FFPE inputs (200 ng). Taken together, these results show that the WG-DASL assay provides a reliable platform for genome-wide expression profiling in archived materials. It also possesses utility within clinical settings where only limited quantities of samples may be available (e.g. microdissected material) or when minimally invasive procedures are performed (e.g. biopsied specimens).

  12. Genome-wide differential gene expression in children exposed to air pollution in the Czech Republic

    DEFF Research Database (Denmark)

    van Leeuwen, D M; van Herwijnen, M H M; Pedersen, Marie

    2006-01-01

    The Teplice area in the Czech Republic is a mining district where elevated levels of air pollution, including airborne carcinogens, have been demonstrated, especially during winter time. This environmental exposure can impact human health; in particular children may be more vulnerable. To study the impact of air pollution in children at the transcriptional level, peripheral blood cells were subjected to whole genome response analysis, in order to identify significantly modulated biological pathways and processes as a result of exposure. Using genome-wide oligonucleotide microarrays, we investigated... This suggests an effect of air pollution on the primary structural unit of the condensed DNA. In addition, several other pathways were modulated. Based on the results of this study, we suggest that transcriptomic analysis represents a promising biomarker for environmental carcinogenesis.

  13. Genomic Features That Predict Allelic Imbalance in Humans Suggest Patterns of Constraint on Gene Expression Variation

    Science.gov (United States)

    Fédrigo, Olivier; Haygood, Ralph; Mukherjee, Sayan; Wray, Gregory A.

    2009-01-01

    Variation in gene expression is an important contributor to phenotypic diversity within and between species. Although this variation often has a genetic component, identification of the genetic variants driving this relationship remains challenging. In particular, measurements of gene expression usually do not reveal whether the genetic basis for any observed variation lies in cis or in trans to the gene, a distinction that has direct relevance to the physical location of the underlying genetic variant, and which may also impact its evolutionary trajectory. Allelic imbalance measurements identify cis-acting genetic effects by assaying the relative contribution of the two alleles of a cis-regulatory region to gene expression within individuals. Identification of patterns that predict commonly imbalanced genes could therefore serve as a useful tool and also shed light on the evolution of cis-regulatory variation itself. Here, we show that sequence motifs, polymorphism levels, and divergence levels around a gene can be used to predict commonly imbalanced genes in a human data set. Reduction of this feature set to four factors revealed that only one factor significantly differentiated between commonly imbalanced and nonimbalanced genes. We demonstrate that these results are consistent between the original data set and a second published data set in humans obtained using different technical and statistical methods. Finally, we show that variation in the single allelic imbalance-associated factor is partially explained by the density of genes in the region of a target gene (allelic imbalance is less probable for genes in gene-dense regions), and, to a lesser extent, the evenness of expression of the gene across tissues and the magnitude of negative selection on putative regulatory regions of the gene. These results suggest that the genomic distribution of functional cis-regulatory variants in the human genome is nonrandom, perhaps due to local differences in evolutionary
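
    The allelic imbalance measurement described in this record compares the contribution of the two alleles to expression within one individual. A minimal sketch of how such a comparison could be tested, assuming allele-specific read counts and an exact two-sided binomial test against a 50:50 expectation. The abstract does not name the statistical method used, so the test, the function name `binomial_two_sided_p` and the read counts are illustrative assumptions, not the authors' procedure.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probability of every outcome that is no more likely than
    the observed one (a common definition of the two-sided exact test).
    """
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so ties are not lost to rounding.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts: reads carrying allele A versus allele B at one
# heterozygous site in one individual (illustrative numbers only).
reads_a, reads_b = 42, 18
p_value = binomial_two_sided_p(reads_a, reads_a + reads_b)
print(f"allele A fraction = {reads_a / (reads_a + reads_b):.2f}, p = {p_value:.4g}")
```

    A small p-value here would only indicate departure from equal allelic contribution at that site; real analyses typically also account for mapping bias and overdispersion, which this sketch ignores.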

  14. Structure and expression of human dihydropteridine reductase

    International Nuclear Information System (INIS)

    Lockyer, J.; Cook, R.G.; Milstien, S.; Kaufman, S.; Woo, S.L.C.; Ledley, F.D.

    1987-01-01

    Dihydropteridine reductase (DHPR) catalyzes the NADH-mediated reduction of quinonoid dihydrobiopterin and is an essential component of the pterin-dependent aromatic amino acid hydroxylating systems. A cDNA for human DHPR was isolated from a human liver cDNA library in the vector λgt11 using a monospecific antibody against sheep DHPR. The nucleic acid sequence and amino acid sequence of human DHPR were determined from a full-length clone. A 112 amino acid sequence of sheep DHPR was obtained by sequencing purified sheep DHPR. This sequence is highly homologous to the predicted amino acid sequence of the human protein. Gene transfer of the recombinant human DHPR into COS cells leads to expression of DHPR enzymatic activity. These results indicate that the cDNA clone identified by antibody screening is an authentic and full-length cDNA for human DHPR.

  15. Genome-wide identification of WRKY family genes in peach and analysis of WRKY expression during bud dormancy.

    Science.gov (United States)

    Chen, Min; Tan, Qiuping; Sun, Mingyue; Li, Dongmei; Fu, Xiling; Chen, Xiude; Xiao, Wei; Li, Ling; Gao, Dongsheng

    2016-06-01

    Bud dormancy in deciduous fruit trees is an important adaptive mechanism for their survival in cold climates. The WRKY genes participate in several developmental and physiological processes, including dormancy. However, the dormancy mechanisms of WRKY genes have not been studied in detail. We conducted a genome-wide analysis and identified 58 WRKY genes in peach. These putative genes were located on all eight chromosomes. In bioinformatics analyses, we compared the sequences of WRKY genes from peach, rice, and Arabidopsis. In a cluster analysis, the gene sequences formed three groups, of which group II was further divided into five subgroups. Gene structure was highly conserved within each group, especially in groups IId and III. Gene expression analyses by qRT-PCR showed that WRKY genes showed different expression patterns in peach buds during dormancy. The mean expression levels of six WRKY genes (Prupe.6G286000, Prupe.1G393000, Prupe.1G114800, Prupe.1G071400, Prupe.2G185100, and Prupe.2G307400) increased during endodormancy and decreased during ecodormancy, indicating that these six WRKY genes may play a role in dormancy in a perennial fruit tree. This information will be useful for selecting fruit trees with desirable dormancy characteristics or for manipulating dormancy in genetic engineering programs.

  16. Genome-wide analysis and expression profiling of the GRF gene family in oilseed rape (Brassica napus L.).

    Science.gov (United States)

    Ma, Jin-Qi; Jian, Hong-Ju; Yang, Bo; Lu, Kun; Zhang, Ao-Xiang; Liu, Pu; Li, Jia-Na

    2017-07-15

    Growth-regulating factors (GRFs) are plant-specific transcription factors that help regulate plant growth and development. Genome-wide identification and evolutionary analyses of GRF gene families have been performed in Arabidopsis thaliana, Zea mays, Oryza sativa, and Brassica rapa, but a comprehensive analysis of the GRF gene family in oilseed rape (Brassica napus) has not yet been reported. In the current study, we identified 35 members of the BnGRF family in B. napus. We analyzed the chromosomal distribution, phylogenetic relationships (Bayesian Inference and Neighbor Joining methods), gene structures, and motifs of the BnGRF family members, as well as the cis-acting regulatory elements in their promoters. We also analyzed the expression patterns of 15 randomly selected BnGRF genes in various tissues and in plant varieties with different harvest indices and gibberellic acid (GA) responses. The expression levels of BnGRFs under GA treatment suggested the presence of possible negative feedback regulation. The evolutionary patterns and expression profiles of BnGRFs uncovered in this study increase our understanding of the important roles played by these genes in oilseed rape. Copyright © 2017. Published by Elsevier B.V.

  17. Genome-Wide Analysis, Classification, Evolution, and Expression Analysis of the Cytochrome P450 93 Family in Land Plants.

    Directory of Open Access Journals (Sweden)

    Hai Du

    Full Text Available Cytochrome P450 93 family (CYP93), belonging to the cytochrome P450 superfamily, plays important roles in diverse plant processes. However, no previous studies have investigated the evolution and expression of the members of this family. In this study, we performed comprehensive genome-wide analysis to identify CYP93 genes in 60 green plants. In all, 214 CYP93 proteins were identified; they were specifically found in flowering plants and could be classified into ten subfamilies, CYP93A-K, with the last two being identified first. CYP93A is the ancestor that was derived in flowering plants, and the remaining showed lineage-specific distribution: CYP93B and CYP93C are present in dicots; CYP93F is distributed only in Poaceae; CYP93G and CYP93J are monocot-specific; CYP93E is unique to legumes; CYP93H and CYP93K are only found in Aquilegia coerulea, and CYP93D is Brassicaceae-specific. Each subfamily generally has conserved gene numbers, structures, and characteristics, indicating functional conservation during evolution. Synonymous nucleotide substitution (dN/dS) analysis showed that CYP93 genes are under strong negative selection. Comparative expression analyses of CYP93 genes in dicots and monocots revealed that they are preferentially expressed in the roots and tend to be induced by biotic and/or abiotic stresses, in accordance with their well-known functions in plant secondary biosynthesis.
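
    The dN/dS analysis mentioned in this record rests on a simple ratio: the nonsynonymous substitution rate (dN) divided by the synonymous rate (dS), with values well below 1 read as negative (purifying) selection. A minimal sketch of that interpretation step, assuming dN and dS have already been estimated by some other tool. The function names, the tolerance band around 1 and the example rates are hypothetical and are not taken from the study.

```python
def omega(dn, ds):
    """Return the dN/dS ratio for pre-computed substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    return dn / ds


def selection_regime(dn, ds, tol=0.05):
    """Classify a gene by its dN/dS ratio.

    tol is an illustrative band around 1 treated as 'neutral'; real
    analyses use likelihood-based tests rather than a fixed cutoff.
    """
    w = omega(dn, ds)
    if w < 1 - tol:
        return "negative (purifying)"
    if w > 1 + tol:
        return "positive"
    return "neutral"


# Hypothetical rates for three gene pairs (illustrative numbers only).
examples = {"gene_1": (0.02, 0.40), "gene_2": (0.30, 0.30), "gene_3": (0.90, 0.45)}
for name, (dn, ds) in examples.items():
    print(name, round(omega(dn, ds), 3), selection_regime(dn, ds))
```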

  18. Local chromatin structure of heterochromatin regulates repeated DNA stability, nucleolus structure, and genome integrity

    Energy Technology Data Exchange (ETDEWEB)

    Peng, Jamy C. [Univ. of California, Berkeley, CA (United States)

    2007-01-01

    Heterochromatin constitutes a significant portion of the genome in higher eukaryotes; approximately 30% in Drosophila and human. Heterochromatin contains a high repeat DNA content and a low density of protein-encoding genes. In contrast, euchromatin is composed mostly of unique sequences and contains the majority of single-copy genes. Genetic and cytological studies demonstrated that heterochromatin exhibits regulatory roles in chromosome organization, centromere function and telomere protection. As an epigenetically regulated structure, heterochromatin formation is not defined by any DNA sequence consensus. Heterochromatin is characterized by its association with nucleosomes containing methylated-lysine 9 of histone H3 (H3K9me), heterochromatin protein 1 (HP1) that binds H3K9me, and Su(var)3-9, which methylates H3K9 and binds HP1. Heterochromatin formation and functions are influenced by HP1, Su(var)3-9, and the RNA interference (RNAi) pathway. My thesis project investigates how heterochromatin formation and function impact nuclear architecture, repeated DNA organization, and genome stability in Drosophila melanogaster. H3K9me-based chromatin reduces extrachromosomal DNA formation; most likely by restricting the access of repair machineries to repeated DNAs. Reducing extrachromosomal ribosomal DNA stabilizes rDNA repeats and the nucleolus structure. H3K9me-based chromatin also inhibits DNA damage in heterochromatin. Cells with compromised heterochromatin structure, due to Su(var)3-9 or dcr-2 (a component of the RNAi pathway) mutations, display severe DNA damage in heterochromatin compared to wild type. In these mutant cells, accumulated DNA damage leads to chromosomal defects such as translocations, defective DNA repair response, and activation of the G2-M DNA repair and mitotic checkpoints that ensure cellular and animal viability. My thesis research suggests that DNA replication, repair, and recombination mechanisms in heterochromatin differ from those in

  19. Genomic analysis of expressed sequence tags in American black bear Ursus americanus.

    Science.gov (United States)

    Zhao, Sen; Shao, Chunxuan; Goropashnaya, Anna V; Stewart, Nathan C; Xu, Yichi; Tøien, Øivind; Barnes, Brian M; Fedorov, Vadim B; Yan, Jun

    2010-03-26

    Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.

  20. Genomic analysis of expressed sequence tags in American black bear Ursus americanus

    Science.gov (United States)

    2010-01-01

    Background: Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Results: Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. Conclusion: We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes. PMID:20338065

  1. The Genome of Tolypocladium inflatum: Evolution, Organization, and Expression of the Cyclosporin Biosynthetic Gene Cluster

    Science.gov (United States)

    Bushley, Kathryn E.; Raja, Rajani; Jaiswal, Pankaj; Cumbie, Jason S.; Nonogaki, Mariko; Boyd, Alexander E.; Owensby, C. Alisha; Knaus, Brian J.; Elser, Justin; Miller, Daniel; Di, Yanming; McPhail, Kerry L.; Spatafora, Joseph W.

    2013-01-01

    The ascomycete fungus Tolypocladium inflatum, a pathogen of beetle larvae, is best known as the producer of the immunosuppressant drug cyclosporin. The draft genome of T. inflatum strain NRRL 8044 (ATCC 34921), the isolate from which cyclosporin was first isolated, is presented along with comparative analyses of the biosynthesis of cyclosporin and other secondary metabolites in T. inflatum and related taxa. Phylogenomic analyses reveal previously undetected and complex patterns of homology between the nonribosomal peptide synthetase (NRPS) that encodes for cyclosporin synthetase (simA) and those of other secondary metabolites with activities against insects (e.g., beauvericin, destruxins, etc.), and demonstrate the roles of module duplication and gene fusion in diversification of NRPSs. The secondary metabolite gene cluster responsible for cyclosporin biosynthesis is described. In addition to genes necessary for cyclosporin biosynthesis, it harbors a gene for a cyclophilin, which is a member of a family of immunophilins known to bind cyclosporin. Comparative analyses support a lineage specific origin of the cyclosporin gene cluster rather than horizontal gene transfer from bacteria or other fungi. RNA-Seq transcriptome analyses in a cyclosporin-inducing medium delineate the boundaries of the cyclosporin cluster and reveal high levels of expression of the gene cluster cyclophilin. In medium containing insect hemolymph, weaker but significant upregulation of several genes within the cyclosporin cluster, including the highly expressed cyclophilin gene, was observed. T. inflatum also represents the first reference draft genome of Ophiocordycipitaceae, a third family of insect pathogenic fungi within the fungal order Hypocreales, and supports parallel and qualitatively distinct radiations of insect pathogens. The T. inflatum genome provides additional insight into the evolution and biosynthesis of cyclosporin and lays a foundation for further investigations of the role

  2. Genome-wide expressions in autologous eutopic and ectopic endometrium of fertile women with endometriosis

    Directory of Open Access Journals (Sweden)

    Khan Meraj A

    2012-09-01

    Full Text Available Background: To gain insight into the pathophysiology of endometriosis, genome-wide expression analyses of eutopic and ectopic endometrium have been reported previously; however, the effects of stages of severity and phases of the menstrual cycle on expression profiles have not been examined. The effect of genetic heterogeneity and fertility history on transcriptional activity was also not considered. In the present study, a genome-wide expression analysis of autologous, paired eutopic and ectopic endometrial samples obtained from fertile women (n = 18) suffering from moderate (stage 3; n = 8) or severe (stage 4; n = 10) ovarian endometriosis during proliferative (n = 13) and secretory (n = 5) phases of the menstrual cycle was performed. Methods: Individual pure RNA samples were subjected to Agilent’s Whole Human Genome 44K microarray experiments. Microarray data were validated (P ... Results: A higher clustering effect of pairing (cluster distance, cd = 0.1) in samples from the same individuals on expression arrays among eutopic and ectopic samples was observed as compared to that of clinical stages of severity (cd = 0.5) and phases of the menstrual cycle (cd = 0.6). Post hoc analysis revealed anomalies in the expression profiles of several genes associated with immunological, neuracrine and endocrine functions and gynecological cancers, however with no overt oncogenic potential in endometriotic tissue. Dysregulation of three major transcription factors (CLOCK, ESR1, and MYC) appeared to be a significant causative factor in the pathogenesis of ovarian endometriosis. A novel cohort of twenty-eight (28) genes representing potential markers for ovarian endometriosis in fertile women was discovered. Conclusions: Dysfunctional expression of immuno-neuro-endocrine behaviour in endometrium appeared critical to endometriosis. Although no overt oncogenic potential was evident, several genes associated with gynecological cancers were

  3. Genomic Organization and Expression of Iron Metabolism Genes in the Emerging Pathogenic Mold Scedosporium apiospermum

    Directory of Open Access Journals (Sweden)

    Yohann Le Govic

    2018-04-01

    Full Text Available The ubiquitous mold Scedosporium apiospermum is increasingly recognized as an emerging pathogen, especially among patients with underlying disorders such as immunodeficiency or cystic fibrosis (CF). Indeed, it ranks second among the filamentous fungi colonizing the respiratory tract of CF patients. However, our knowledge about virulence factors of this fungus is still limited. The role of iron-uptake systems may be critical for establishment of Scedosporium infections, notably in the iron-rich environment of the CF lung. Two main strategies are employed by fungi to efficiently acquire iron from their host or from their ecological niche: siderophore production and reductive iron assimilation (RIA) systems. The aim of this study was to assess the existence of orthologous genes involved in iron metabolism in the recently sequenced genome of S. apiospermum. At first, a tBLASTn analysis using A. fumigatus iron-related proteins as query revealed orthologs of almost all relevant loci in the S. apiospermum genome. Whereas the genes putatively involved in RIA were randomly distributed, siderophore biosynthesis and transport genes were organized in two clusters, each containing a non-ribosomal peptide synthetase (NRPS) whose orthologs in A. fumigatus have been described to catalyze hydroxamate siderophore synthesis. Nevertheless, comparative genomic analysis of siderophore-related clusters showed greater similarity between S. apiospermum and phylogenetically close molds than with Aspergillus species. The expression level of these genes was then evaluated by exposing conidia to iron starvation and iron excess. The expression of several orthologs of A. fumigatus genes involved in siderophore-based iron uptake or RIA was significantly induced during iron starvation, and conversely repressed in iron excess conditions. Altogether, these results indicate that S. apiospermum possesses the genetic information required for efficient and competitive iron uptake

  4. A combined analysis of genome-wide expression profiling of bipolar disorder in human prefrontal cortex.

    Science.gov (United States)

    Wang, Jinglu; Qu, Susu; Wang, Weixiao; Guo, Liyuan; Zhang, Kunlin; Chang, Suhua; Wang, Jing

    2016-11-01

    A number of gene expression profiling studies of bipolar disorder have been published. Besides differences in array chips and tissues, the variety of data-processing procedures across cohorts has aggravated the inconsistency of results among these genome-wide gene expression profiling studies. By searching the gene expression databases, we obtained six data sets for prefrontal cortex (PFC) of bipolar disorder with raw data and combinable platforms. We used standardized pre-processing and quality control procedures to analyze each data set separately and then combined them into a large gene expression matrix with 101 bipolar disorder subjects and 106 controls. A standard linear mixed-effects model was used to calculate the differentially expressed genes (DEGs). Multiple levels of sensitivity analyses and cross validation with genetic data were conducted. Functional and network analyses were carried out on the basis of the DEGs. As a result, we identified 198 unique differentially expressed genes in the PFC of bipolar disorder subjects versus controls. Among them, 115 DEGs were robust to at least three leave-one-out tests or different pre-processing methods; 51 DEGs were validated with genetic association signals. Pathway enrichment analysis showed these DEGs were related to regulation of neurological system, cell death and apoptosis, and several basic binding processes. Protein-protein interaction network analysis further identified one key hub gene. We have contributed the most comprehensive integrated analysis of bipolar disorder expression profiling studies in PFC to date. The DEGs, especially those with multiple validations, may denote a common signature of bipolar disorder and contribute to the pathogenesis of disease. Copyright © 2016 Elsevier Ltd. All rights reserved.
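
    The robustness criterion in this record (DEGs that survive at least three leave-one-out tests) can be illustrated with a small counting step. This is a minimal sketch under stated assumptions: each leave-one-out run is taken to have already produced its own DEG set, the alternative criterion of different pre-processing methods is not modelled, and the function name `robust_genes` and the gene labels are hypothetical.

```python
from collections import Counter


def robust_genes(full_degs, loo_deg_sets, min_support=3):
    """Return DEGs from the full analysis that recur in leave-one-out runs.

    full_degs     -- set of genes called differentially expressed overall
    loo_deg_sets  -- one DEG set per run, each run omitting one data set
    min_support   -- minimum number of runs in which a gene must reappear
    """
    support = Counter(gene for degs in loo_deg_sets for gene in set(degs))
    return {gene for gene in full_degs if support[gene] >= min_support}


# Hypothetical DEG calls (placeholder labels, not genes from the study).
full = {"G1", "G2", "G3", "G4"}
leave_one_out = [
    {"G1", "G2"},
    {"G1", "G3"},
    {"G1", "G2", "G4"},
    {"G2"},
]
print(sorted(robust_genes(full, leave_one_out)))
```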

  5. Large-scale trends in the evolution of gene structures within 11 animal genomes.

    Directory of Open Access Journals (Sweden)

    Mark Yandell

    2006-03-01

    Full Text Available We have used the annotations of six animal genomes (Homo sapiens, Mus musculus, Ciona intestinalis, Drosophila melanogaster, Anopheles gambiae, and Caenorhabditis elegans) together with the sequences of five unannotated Drosophila genomes to survey changes in protein sequence and gene structure over a variety of timescales, from the less than 5 million years since the divergence of D. simulans and D. melanogaster to the more than 500 million years that have elapsed since the Cambrian explosion. To do so, we have developed a new open-source software library called CGL (for "Comparative Genomics Library"). Our results demonstrate that change in intron-exon structure is gradual, clock-like, and largely independent of coding-sequence evolution. This means that genome annotations can be used in new ways to inform, corroborate, and test conclusions drawn from comparative genomics analyses that are based upon protein and nucleotide sequence similarities.

  6. Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome

    NARCIS (Netherlands)

    Collins, Ryan L; Brand, Harrison; Redin, Claire E.; Hanscom, Carrie; Antolik, Caroline; Stone, Matthew R; Glessner, Joseph T.; Mason, Tamara; Pregno, Giulia; Dorrani, Naghmeh; Mandrile, Giorgia; Giachino, Daniela; Perrin, Danielle; Walsh, Cole; Cipicchio, Michelle; Costello, Maura; Stortchevoi, Alexei; An, Joon Yong; Currall, Benjamin B; Seabra, Catarina M; Ragavendran, Ashok; Margolin, Lauren; Martinez-Agosto, Julian A.; Lucente, Diane; Levy, Brynn; Sanders, Jan-Stephan; Wapner, Ronald J.; Quintero-Rivera, Fabiola; Kloosterman, Wigard; Talkowski, Michael E.

    2017-01-01

    Background: Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies. Results: We sequenced 689 participants with autism spectrum disorder (ASD) and other

  7. From structure prediction to genomic screens for novel non-coding RNAs.

    Science.gov (United States)

    Gorodkin, Jan; Hofacker, Ivo L

    2011-08-01

    Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.
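
    The single-sequence folding idea in this record can be sketched with a simplified stand-in. The code below does not implement the energy-directed folding the abstract refers to; it uses the simpler Nussinov-style dynamic programme, which maximizes the number of base pairs, and it allows exactly the pair types the abstract names (Watson-Crick and GU wobble). The minimum hairpin loop length of 3 and the example sequences are assumptions made for illustration.

```python
# Allowed pairs: Watson-Crick (A-U, G-C) plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in an RNA sequence.

    dp[i][j] holds the best pair count for the subsequence seq[i..j].
    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


# Two toy hairpins; the second needs a G-U wobble pair to close the stem.
for rna in ("GGGAAACCC", "GGGAAAUCC"):
    print(rna, max_base_pairs(rna))
```

    Counting pairs is only a proxy for stability; the comparative methods discussed in the abstract add evidence from structure-preserving base-pair changes across aligned sequences, which this single-sequence sketch does not attempt.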

  8. Genome-wide identification and structure-function studies of proteases and protease inhibitors in Cicer arietinum (chickpea).

    Science.gov (United States)

    Sharma, Ranu; Suresh, C G

    2015-01-01

    Proteases are a family of enzymes present in almost all living organisms. In plants they are involved in many biological processes requiring stress response in situations such as water deficiency, pathogen attack, maintaining protein content of the cell, programmed cell death, senescence, reproduction and many more. Similarly, protease inhibitors (PIs) are involved in various important functions like suppression of invasion by pathogenic nematodes, inhibition of spore germination and mycelium growth of Alternaria alternata, and response to wounding and fungal attack. To our knowledge, no genome-wide study of proteases together with proteinaceous PIs has been reported for any of the sequenced genomes to date. Phylogenetic studies and domain analysis of proteases were carried out to understand the molecular evolution as well as gene and protein features. Structural analysis was carried out to explore the binding mode and affinity of PIs for cognate proteases and of prolyl oligopeptidase protease with an inhibitor ligand. In the study reported here, a significant number of proteases and PIs were identified in the chickpea genome. The gene expression profiles of proteases and PIs in five different plant tissues revealed a differential expression pattern in more than one plant tissue. Molecular dynamics studies revealed the formation of a stable complex owing to an increased number of protein-ligand and inter- and intramolecular protein-protein hydrogen bonds. The genome-wide identification, characterization, evolutionary understanding, gene expression, and structural analysis of proteases and PIs provide a framework for future analysis when defining their roles in stress response and developing a more stress-tolerant variety of chickpea. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. Structural constraints in the packaging of bluetongue virus genomic segments

    OpenAIRE

    Burkhardt, Christiane; Sung, Po-Yu; Celma, Cristina C.; Roy, Polly

    2014-01-01

    The mechanism used by bluetongue virus (BTV) to ensure the sorting and packaging of its 10 genomic segments is still poorly understood. In this study, we investigated the packaging constraints for two BTV genomic segments from two different serotypes. Segment 4 (S4) of BTV serotype 9 was mutated sequentially and packaging of mutant ssRNAs was investigated by two newly developed RNA packaging assay systems, one in vivo and the other in vitro. Modelling of the mutated ssRNA followed by bioche...

  10. Leishmania naiffi and Leishmania guyanensis reference genomes highlight genome structure and gene evolution in the Viannia subgenus.

    Science.gov (United States)

    Coughlan, Simone; Taylor, Ali Shirley; Feane, Eoghan; Sanders, Mandy; Schonian, Gabriele; Cotton, James A; Downing, Tim

    2018-04-01

    The unicellular protozoan parasite Leishmania causes the neglected tropical disease leishmaniasis, affecting 12 million people in 98 countries. In South America, where the Viannia subgenus predominates, so far only L. (Viannia) braziliensis and L. (V.) panamensis have been sequenced, assembled and annotated as reference genomes. Addressing this deficit in molecular information can inform species typing, epidemiological monitoring and clinical treatment. Here, L. (V.) naiffi and L. (V.) guyanensis genomic DNA was sequenced to assemble these two genomes as draft references from short sequence reads. The methods used were tested using short sequence reads for L. braziliensis M2904 against its published reference as a comparison. This assembly and annotation pipeline identified 70 additional genes not annotated on the original M2904 reference. Phylogenetic and evolutionary comparisons of L. guyanensis and L. naiffi with 10 other Viannia genomes revealed four traits common to all Viannia: aneuploidy, 22 orthologous groups of genes absent in other Leishmania subgenera, elevated TATE transposon copies and a high NADH-dependent fumarate reductase gene copy number. Within the Viannia, there were limited structural changes in genome architecture specific to individual species: a 45 Kb amplification on chromosome 34 was present in all bar L. lainsoni, L. naiffi had a higher copy number of the virulence factor leishmanolysin, and laboratory isolate L. shawi M8408 had a possible minichromosome derived from the 3' end of chromosome 34. This combination of genome assembly, phylogenetics and comparative analysis across an extended panel of diverse Viannia has uncovered new insights into the origin and evolution of this subgenus and can help improve diagnostics for leishmaniasis surveillance.

  11. Whole-genome expression analyses of type 2 diabetes in human skin reveal altered immune function and burden of infection.

    Science.gov (United States)

    Wu, Chun; Chen, Xiaopan; Shu, Jing; Lee, Chun-Ting

    2017-05-23

    Skin disorders are among the most common complications associated with type 2 diabetes mellitus (T2DM). Although T2DM patients are known to have increased risk of infections and other T2DM-related skin disorders, their molecular mechanisms are largely unknown. This study aims to identify dysregulated genes and gene networks that are associated with T2DM in human skin. We compared the expression profiles of 56,318 transcribed genes on 74 T2DM cases and 148 gender-, age-, and race-matched non-diabetes controls from the Genotype-Tissue Expression (GTEx) database. RNA-Sequencing data indicate that diabetic skin is characterized by increased expression of genes that are related to immune responses (CCL20, CXCL9, CXCL10, CXCL11, CXCL13, and CCL18), JAK/STAT signaling pathway (JAK3, STAT1, and STAT2), tumor necrosis factor superfamily (TNFSF10 and TNFSF15), and infectious disease pathways (OAS1, OAS2, OAS3, and IFIH1). Genes in cell adhesion molecules pathway (NCAM1 and L1CAM) and collagen family (PCOLCE2 and COL9A3) are downregulated, suggesting structural changes in the skin of T2DM. For the first time, to the best of our knowledge, this pioneer analytic study reports comprehensive unbiased gene expression changes and dysregulated pathways in the non-diseased skin of T2DM patients. This comprehensive understanding derived from whole-genome expression profiles could advance our knowledge in determining molecular targets for the prevention and treatment of T2DM-associated skin disorders.

  12. Whole genome expression and biochemical correlates of extreme constitutional types defined in Ayurveda.

    Science.gov (United States)

    Prasher, Bhavana; Negi, Sapna; Aggarwal, Shilpi; Mandal, Amit K; Sethi, Tav P; Deshmukh, Shailaja R; Purohit, Sudha G; Sengupta, Shantanu; Khanna, Sangeeta; Mohammad, Farhan; Garg, Gaurav; Brahmachari, Samir K; Mukerji, Mitali

    2008-09-09

    Ayurveda is an ancient system of personalized medicine documented and practiced in India since 1500 B.C. According to this system an individual's basic constitution to a large extent determines predisposition and prognosis to diseases as well as therapy and life-style regime. Ayurveda describes seven broad constitution types (Prakritis) each with a varying degree of predisposition to different diseases. Amongst these, three most contrasting types, Vata, Pitta, Kapha, are the most vulnerable to diseases. In the realm of modern predictive medicine, efforts are being directed towards capturing disease phenotypes with greater precision for successful identification of markers for prospective disease conditions. In this study, we explore whether the different constitution types as described in Ayurveda have molecular correlates. Normal individuals of the three most contrasting constitutional types were identified following phenotyping criteria described in Ayurveda in an Indian population of Indo-European origin. The peripheral blood samples of these individuals were analysed for genome-wide expression levels, biochemical and hematological parameters. Gene Ontology (GO) and pathway-based analysis was carried out on differentially expressed genes to explore if there were significant enrichments of functional categories among Prakriti types. Individuals from the three most contrasting constitutional types exhibit striking differences with respect to biochemical and hematological parameters and at genome-wide expression levels. Biochemical profiles like liver function tests, lipid profiles, and hematological parameters like haemoglobin exhibited differences between Prakriti types. Functional categories of genes showing differential expression among Prakriti types were significantly enriched in core biological processes like transport, regulation of cyclin dependent protein kinase activity, immune response and regulation of blood coagulation. A significant enrichment of

  13. Genome-wide expression analysis in fibroblast cell lines from probands with Pallister Killian syndrome.

    Directory of Open Access Journals (Sweden)

    Maninder Kaur

    Full Text Available Pallister Killian syndrome (OMIM: #601803) is a rare multisystem disorder typically caused by tissue limited mosaic tetrasomy of chromosome 12p (isochromosome 12p). The clinical manifestations of Pallister Killian syndrome are variable with the most common findings including craniofacial dysmorphia, hypotonia, cognitive impairment, hearing loss, skin pigmentary differences and epilepsy. Isochromosome 12p is identified primarily in skin fibroblast cultures and in chorionic villus and amniotic fluid cell samples and may be identified in blood lymphocytes during the neonatal and early childhood period. We performed genomic expression profiling correlated with interphase fluorescent in situ hybridization and single nucleotide polymorphism array quantification of degree of mosaicism in fibroblasts from 17 Caucasian probands with Pallister Killian syndrome and 9 healthy age, gender and ethnicity matched controls. We identified a characteristic profile of 354 (180 up- and 174 down-regulated) differentially expressed genes in Pallister Killian syndrome probands and supportive evidence for a Pallister Killian syndrome critical region on 12p13.31. The differentially expressed genes were enriched for developmentally important genes such as homeobox genes. Among the differentially expressed genes, we identified several genes whose misexpression may be associated with the clinical phenotype of Pallister Killian syndrome such as downregulation of ZFPM2, GATA6 and SOX9, and overexpression of IGFBP2.

  14. Large clusters of co-expressed genes in the Drosophila genome.

    Science.gov (United States)

    Boutanaev, Alexander M; Kalmykova, Alla I; Shevelyov, Yuri Y; Nurminsky, Dmitry I

    2002-12-12

    Clustering of co-expressed, non-homologous genes on chromosomes implies their co-regulation. In lower eukaryotes, co-expressed genes are often found in pairs. Clustering of genes that share aspects of transcriptional regulation has also been reported in higher eukaryotes. To advance our understanding of the mode of coordinated gene regulation in multicellular organisms, we performed a genome-wide analysis of the chromosomal distribution of co-expressed genes in Drosophila. We identified a total of 1,661 testes-specific genes, one-third of which are clustered on chromosomes. The number of clusters of three or more genes is much higher than expected by chance. We observed a similar trend for genes upregulated in the embryo and in the adult head, although the expression pattern of individual genes cannot be predicted on the basis of chromosomal position alone. Our data suggest that the prevalent mechanism of transcriptional co-regulation in higher eukaryotes operates with extensive chromatin domains that comprise multiple genes.
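    The comparison against chance described in this abstract can be illustrated with a permutation test. The sketch below is only an illustration under stated assumptions, not the authors' method: it treats a cluster as a run of at least three strictly adjacent co-expressed genes along one chromosome, and the names `count_clusters` and `permutation_p_value` are hypothetical.

```python
import random


def count_clusters(flags, min_size=3):
    """Count runs of at least min_size consecutive co-expressed genes.

    flags is an ordered sequence (chromosomal order) of truthy/falsy
    values marking whether each gene is in the co-expressed set.
    """
    clusters, run = 0, 0
    for flag in flags:
        if flag:
            run += 1
        else:
            if run >= min_size:
                clusters += 1
            run = 0
    if run >= min_size:  # close a run that reaches the chromosome end
        clusters += 1
    return clusters


def permutation_p_value(flags, n_perm=1000, min_size=3, seed=0):
    """Estimate how often shuffled gene order yields as many clusters."""
    rng = random.Random(seed)
    observed = count_clusters(flags, min_size)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_clusters(shuffled, min_size) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

    Shuffling the flags keeps the number of co-expressed genes fixed and randomizes only their positions, which is the null hypothesis of no positional clustering.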

  15. Genome-Wide Expression Profiling of Five Mouse Models Identifies Similarities and Differences with Human Psoriasis

    Science.gov (United States)

    Swindell, William R.; Johnston, Andrew; Carbajal, Steve; Han, Gangwen; Wohn, Christian; Lu, Jun; Xing, Xianying; Nair, Rajan P.; Voorhees, John J.; Elder, James T.; Wang, Xiao-Jing; Sano, Shigetoshi; Prens, Errol P.; DiGiovanni, John; Pittelkow, Mark R.; Ward, Nicole L.; Gudjonsson, Johann E.

    2011-01-01

    Development of a suitable mouse model would facilitate the investigation of pathomechanisms underlying human psoriasis and would also assist in development of therapeutic treatments. However, while many psoriasis mouse models have been proposed, no single model recapitulates all features of the human disease, and standardized validation criteria for psoriasis mouse models have not been widely applied. In this study, whole-genome transcriptional profiling is used to compare gene expression patterns manifested by human psoriatic skin lesions with those that occur in five psoriasis mouse models (K5-Tie2, imiquimod, K14-AREG, K5-Stat3C and K5-TGFbeta1). While the cutaneous gene expression profiles associated with each mouse phenotype exhibited statistically significant similarity to the expression profile of psoriasis in humans, each model displayed distinctive sets of similarities and differences in comparison to human psoriasis. For all five models, correspondence to the human disease was strong with respect to genes involved in epidermal development and keratinization. Immune and inflammation-associated gene expression, in contrast, was more variable between models as compared to the human disease. These findings support the value of all five models as research tools, each with identifiable areas of convergence to and divergence from the human disease. Additionally, the approach used in this paper provides an objective and quantitative method for evaluation of proposed mouse models of psoriasis, which can be strategically applied in future studies to score strengths of mouse phenotypes relative to specific aspects of human psoriasis. PMID:21483750

  16. Genomic and expression analysis of the flax (Linum usitatissimum) family of glycosyl hydrolase 35 genes.

    Science.gov (United States)

    Hobson, Neil; Deyholos, Michael K

    2013-05-23

    Several β-galactosidases of the Glycosyl Hydrolase 35 (GH35) family have been characterized, and many of these modify cell wall components, including pectins, xyloglucans, and arabinogalactan proteins. The phloem fibres of flax (Linum usitatissimum) have gelatinous-type cell walls that are rich in crystalline cellulose and depend on β-galactosidase activity for their normal development. In this study, we investigate the transcript expression patterns and inferred evolutionary relationships of the complete set of flax GH35 genes, to better understand the functions of these genes in flax and other species. Using the recently published flax genome assembly, we identified 43 β-galactosidase-like (BGAL) genes, based on the presence of a GH35 domain. Phylogenetic analyses of their protein sequences clustered them into eight sub-families. Sub-family B, whose members in other species were known to be expressed in developing flowers and pollen, was greatly underrepresented in flax (p-value < 0.01). Sub-family A5, whose sole member from Arabidopsis has been described as its primary xyloglucan BGAL, was greatly expanded in flax (p-value < 0.01). A number of flax BGALs were also observed to contain non-consensus GH35 active sites. Expression patterns of the flax BGALs were investigated using qRT-PCR and publicly available microarray data. All predicted flax BGALs showed evidence of expression in at least one tissue. Flax has a large number of BGAL genes, which display a distinct distribution among the BGAL sub-families, in comparison to other closely related species with available whole genome assemblies. Almost every flax BGAL was expressed in fibres, and the majority were expressed predominantly in fibres as compared to other tissues, suggesting an important role for the expansion of this gene family in the development of this species as a fibre crop. Variations displayed in the canonical GH35 active site suggest a variety of roles unique to flax, which will require

  17. Structural genomic variation as risk factor for idiopathic recurrent miscarriage

    DEFF Research Database (Denmark)

    Nagirnaja, Liina; Palta, Priit; Kasak, Laura

    2014-01-01

    Recurrent miscarriage (RM) is a multifactorial disorder with acknowledged genetic heritability that affects ∼3% of couples aiming at childbirth. As copy number variants (CNVs) have been shown to contribute to reproductive disease susceptibility, we aimed to describe genome-wide profile of CNVs an...

  18. Genome-wide expression patterns associated with oncogenesis and sarcomatous transdifferentiation of cholangiocarcinoma

    International Nuclear Information System (INIS)

    Seol, Min-A; Kim, Dae-Ghon; Chu, In-Sun; Lee, Mi-Jin; Yu, Goung-Ran; Cui, Xiang-Dan; Cho, Baik-Hwan; Ahn, Eun-Kyung; Leem, Sun-Hee; Kim, In-Hee

    2011-01-01

    The molecular mechanisms of CC (cholangiocarcinoma) oncogenesis and progression are poorly understood. This study aimed to determine the genome-wide expression of genes related to CC oncogenesis and sarcomatous transdifferentiation. Genes that were differentially expressed between CC cell lines or tissues and cultured normal biliary epithelial (NBE) cells were identified using DNA microarray technology. Expression was validated in human CC tissues and cells. Using unsupervised hierarchical clustering analysis of the cell line and tissue samples, we identified a set of 342 commonly regulated (>2-fold change) genes. Of these, 53, including tumor-related genes, were upregulated, and 289, including tumor suppressor genes, were downregulated (<0.5-fold change). Expression of SPP1, EFNB2, E2F2, IRX3, PTTG1, PPARγ, KRT17, UCHL1, IGFBP7 and SPARC proteins was immunohistochemically verified in human and hamster CC tissues. Additional unsupervised hierarchical clustering analysis of sarcomatoid CC cells compared to three adenocarcinomatous CC cell lines revealed 292 differentially upregulated genes (>4-fold change), and 267 differentially downregulated genes (<0.25-fold change). The expression of 12 proteins was validated in the CC cell lines by immunoblot analysis and immunohistochemical staining. Of the proteins analyzed, we found upregulation of the expression of the epithelial-mesenchymal transition (EMT)-related proteins VIM and TWIST1, and restoration of the methylation-silenced proteins LDHB, BNIP3, UCHL1, and NPTX2 during sarcomatoid transdifferentiation of CC. The deregulation of oncogenes, tumor suppressor genes, and methylation-related genes may be useful in identifying molecular targets for CC diagnosis and prognosis.
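    The fold-change cutoffs quoted in this abstract (>2 and <0.5 for the first comparison, >4 and <0.25 for the sarcomatoid comparison) amount to a simple threshold rule. A minimal sketch follows, assuming fold change is expressed as a tumor/normal ratio; the function name `classify_fold_change` is hypothetical and the abstract does not describe the authors' actual implementation.

```python
def classify_fold_change(fold, up=2.0, down=0.5):
    """Label a gene by its expression ratio (assumed tumor / normal).

    Defaults mirror the >2-fold and <0.5-fold cutoffs in the abstract;
    pass up=4.0, down=0.25 for the sarcomatoid comparison.
    """
    if fold > up:
        return "up"
    if fold < down:
        return "down"
    return "unchanged"


# Hypothetical ratios, for illustration only.
labels = {gene: classify_fold_change(ratio)
          for gene, ratio in {"geneA": 3.1, "geneB": 0.2, "geneC": 1.4}.items()}
```

    The comparisons are strict, so a ratio of exactly 2.0 or 0.5 is left unclassified as differentially expressed.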

  19. Functional Associations by Response Overlap (FARO), a functional genomics approach matching gene expression phenotypes.

    Directory of Open Access Journals (Sweden)

    Henrik Bjørn Nielsen

    2007-08-01

    Full Text Available The systematic comparison of transcriptional responses of organisms is a powerful tool in functional genomics. For example, mutants may be characterized by comparing their transcript profiles to those obtained in other experiments querying the effects on gene expression of many experimental factors including treatments, mutations and pathogen infections. Similarly, drugs may be discovered by the relationship between the transcript profiles effectuated or impacted by a candidate drug and by the target disease. The integration of such data enables systems biology to predict the interplay between experimental factors affecting a biological system. Unfortunately, direct comparisons of gene expression profiles obtained in independent, publicly available microarray experiments are typically compromised by substantial, experiment-specific biases. Here we suggest a novel yet conceptually simple approach for deriving 'Functional Association(s) by Response Overlap' (FARO) between microarray gene expression studies. The transcriptional response is defined by the set of differentially expressed genes independent from the magnitude or direction of the change. This approach overcomes the limited comparability between studies that is typical for methods that rely on correlation in gene expression. We apply FARO to a compendium of 242 diverse Arabidopsis microarray experimental factors, including phyto-hormones, stresses and pathogens, growth conditions/stages, tissue types and mutants. We also use FARO to confirm and further delineate the functions of Arabidopsis MAP kinase 4 in disease and stress responses. Furthermore, we find that a large, well-defined set of genes responds in opposing directions to different stress conditions and predict the effects of different stress combinations. This demonstrates the usefulness of our approach for exploiting public microarray data to derive biologically meaningful associations between experimental factors. Finally, our
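    The idea of scoring two experiments by the overlap of their differentially expressed gene sets, ignoring magnitude and direction, can be sketched as below. This is an illustration under assumptions rather than the published FARO procedure: the significance measure used here is a one-sided hypergeometric tail probability, and the names `response_overlap` and `hypergeom_tail` are hypothetical.

```python
from math import comb


def response_overlap(set_a, set_b):
    """Number of genes responding in both experiments.

    Sets hold gene identifiers only, so up- and down-regulated genes
    are treated alike (direction-independent overlap).
    """
    return len(set(set_a) & set(set_b))


def hypergeom_tail(n_genes, size_a, size_b, overlap):
    """P(X >= overlap) when size_b genes are drawn at random from
    n_genes, of which size_a belong to the first response set."""
    total = comb(n_genes, size_b)
    upper = min(size_a, size_b)
    favourable = sum(
        comb(size_a, k) * comb(n_genes - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total
```

    For example, with two hypothetical response sets drawn from a 10-gene array, `hypergeom_tail(10, 5, 5, response_overlap(a, b))` gives the chance of seeing at least that much overlap if the two responses were unrelated.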

  20. Genome-wide identification, evolutionary and expression analysis of the aspartic protease gene superfamily in grape

    Science.gov (United States)

    2013-01-01

    Background: Aspartic proteases (APs) are a large family of proteolytic enzymes found in almost all organisms. In plants, they are involved in many biological processes, such as senescence, stress responses, programmed cell death, and reproduction. Prior to the present study, no grape AP gene(s) had been reported, and research on APs in woody species was very limited. Results: In this study, a total of 50 AP genes (VvAP) were identified in the grape genome, among which 30 contained the complete ASP domain. Synteny analysis within grape indicated that segmental and tandem duplication events contributed to the expansion of the grape AP family. Additional analysis between grape and Arabidopsis demonstrated that several grape AP genes were found in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes arose before the divergence of grape and Arabidopsis. Phylogenetic relationships of the 30 VvAPs with the complete ASP domain and their Arabidopsis orthologs, as well as their gene and protein features, were analyzed and their cellular localization was predicted. Moreover, expression profiles of VvAP genes in six different tissues were determined, and their transcript abundance under various stresses and hormone treatments was measured. Twenty-seven VvAP genes were expressed in at least one of the six tissues examined; nineteen VvAPs responded to at least one abiotic stress, 12 VvAPs responded to powdery mildew infection, and most of the VvAPs responded to SA and ABA treatments. Furthermore, integrated synteny and phylogenetic analysis identified orthologous AP genes between grape and Arabidopsis, providing a unique starting point for investigating the function of grape AP genes. Conclusions: The genome-wide identification, evolutionary and expression analyses of grape AP genes provide a framework for future analysis of AP genes in defining their roles during stress response. Integrated synteny and phylogenetic analyses provide novel insight into the

  1. Genome-wide haplotype analysis of cis expression quantitative trait loci in monocytes.

    Directory of Open Access Journals (Sweden)

    Sophie Garnier

    Full Text Available In order to assess whether gene expression variability could be influenced by several SNPs acting in cis, either through additive or more complex haplotype effects, a systematic genome-wide search for cis haplotype expression quantitative trait loci (eQTL) was conducted in a sample of 758 individuals, part of the Cardiogenics Transcriptomic Study, for which genome-wide monocyte expression and GWAS data were available. 19,805 RNA probes were assessed for cis haplotypic regulation through investigation of ~2.1 × 10^9 haplotypic combinations. 2,650 probes demonstrated haplotypic p-values >10^4-fold smaller than the best single SNP p-value. Replication of significant haplotype effects was tested for 412 probes for which SNPs (or proxies) that defined the detected haplotypes were available in the Gutenberg Health Study composed of 1,374 individuals. At the Bonferroni correction level of 1.2 × 10^-4 (~0.05/412), 193 haplotypic signals replicated. 1000 G imputation was then conducted, and 105 haplotypic signals still remained more informative than imputed SNPs. In-depth analysis of these 105 cis eQTL revealed that at 76 loci genetic associations were compatible with additive effects of several SNPs, while for the 29 remaining regions data could be compatible with a more complex haplotypic pattern. As 24 of the 105 cis eQTL have previously been reported to be disease-associated loci, this work highlights the need for conducting haplotype-based and 1000 G imputed cis eQTL analysis before commencing functional studies at disease-associated loci.
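    The two numerical criteria in this abstract, the Bonferroni replication level of roughly 0.05/412 and the requirement that a haplotypic p-value be more than 10^4-fold smaller than the best single-SNP p-value, can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical and the example p-values are invented.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under Bonferroni correction."""
    return alpha / n_tests


def haplotype_more_informative(hap_p, best_snp_p, fold=1e4):
    """True when the haplotypic p-value is more than `fold` times
    smaller than the best single-SNP p-value."""
    return hap_p * fold < best_snp_p


# 412 probes were carried forward to replication in the abstract.
threshold = bonferroni_threshold(0.05, 412)

# Invented p-values, for illustration only.
passes_filter = haplotype_more_informative(hap_p=1e-12, best_snp_p=1e-6)
replicates = 3e-5 < threshold
```

    Dividing 0.05 by 412 gives about 1.21 × 10^-4, consistent with the rounded level of 1.2 × 10^-4 stated in the abstract.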

  2. Genome-wide identification and expression analysis of the CIPK gene family in cassava

    Directory of Open Access Journals (Sweden)

    Wei Hu

    2015-10-01

    Full Text Available Cassava is an important food and potential biofuel crop that is tolerant to multiple abiotic stressors. The mechanisms underlying these tolerances are currently poorly understood. CBL-interacting protein kinases (CIPKs) have been shown to play crucial roles in plant developmental processes, hormone signaling transduction, and in the response to abiotic stress. However, no data are currently available about the CIPK family in cassava. In this study, a total of 25 CIPK genes were identified from the cassava genome based on our previous genome sequencing data. Phylogenetic analysis suggested that the 25 MeCIPKs could be classified into four subfamilies, which was supported by exon-intron organizations and the architectures of conserved protein motifs. Transcriptomic analysis of a wild subspecies and two cultivated varieties showed that most MeCIPKs had different expression patterns between the wild subspecies and cultivars in different tissues or in response to drought stress. Some orthologous genes involved in CIPK interaction networks were identified between Arabidopsis and cassava. The interaction networks and co-expression patterns of these orthologous genes revealed that the crucial pathways controlled by CIPK networks may be involved in the differential response to drought stress in different accessions of cassava. Nine MeCIPK genes were selected to investigate their transcriptional response to various stimuli, and the results showed the comprehensive response of the tested MeCIPK genes to osmotic, salt, cold, oxidative stressors, and ABA signaling. The identification and expression analysis of the CIPK family suggested that CIPK genes are important components of development and multiple signal transduction pathways in cassava. The findings of this study will help lay a foundation for the functional characterization of the CIPK gene family and provide an improved understanding of abiotic stress responses and signaling transduction in cassava.

  3. Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution.

    Science.gov (United States)

    Yap, Jia-Yee S; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y H; Wilton, Alan; Wilkins, Marc R; Rossetto, Maurizio; Delaney, Sven K

    2015-01-01

    The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine.

  4. A genomic overview of the population structure of Salmonella.

    Directory of Open Access Journals (Sweden)

    Nabil-Fareed Alikhan

    2018-04-01

    Full Text Available For many decades, Salmonella enterica has been subdivided by serological properties into serovars or further subdivided for epidemiological tracing by a variety of diagnostic tests with higher resolution. Recently, it has been proposed that so-called eBurst groups (eBGs) based on the alleles of seven housekeeping genes (legacy multilocus sequence typing [MLST]) corresponded to natural populations and could replace serotyping. However, this approach lacks the resolution needed for epidemiological tracing and the existence of natural populations had not been independently validated by independent criteria. Here, we describe EnteroBase, a web-based platform that assembles draft genomes from Illumina short reads in the public domain or that are uploaded by users. EnteroBase implements legacy MLST as well as ribosomal gene MLST (rMLST), core genome MLST (cgMLST), and whole genome MLST (wgMLST) and currently contains over 100,000 assembled genomes from Salmonella. It also provides graphical tools for visual interrogation of these genotypes and those based on core single nucleotide polymorphisms (SNPs). eBGs based on legacy MLST are largely consistent with eBGs based on rMLST, thus demonstrating that these correspond to natural populations. rMLST also facilitated the selection of representative genotypes for SNP analyses of the entire breadth of diversity within Salmonella. In contrast, cgMLST provides the resolution needed for epidemiological investigations. These observations show that genomic genotyping, with the assistance of EnteroBase, can be applied at all levels of diversity within the Salmonella genus.

  5. A genomic overview of the population structure of Salmonella.

    Science.gov (United States)

    Alikhan, Nabil-Fareed; Zhou, Zhemin; Sergeant, Martin J; Achtman, Mark

    2018-04-01

    For many decades, Salmonella enterica has been subdivided by serological properties into serovars or further subdivided for epidemiological tracing by a variety of diagnostic tests with higher resolution. Recently, it has been proposed that so-called eBurst groups (eBGs) based on the alleles of seven housekeeping genes (legacy multilocus sequence typing [MLST]) corresponded to natural populations and could replace serotyping. However, this approach lacks the resolution needed for epidemiological tracing and the existence of natural populations had not been independently validated by independent criteria. Here, we describe EnteroBase, a web-based platform that assembles draft genomes from Illumina short reads in the public domain or that are uploaded by users. EnteroBase implements legacy MLST as well as ribosomal gene MLST (rMLST), core genome MLST (cgMLST), and whole genome MLST (wgMLST) and currently contains over 100,000 assembled genomes from Salmonella. It also provides graphical tools for visual interrogation of these genotypes and those based on core single nucleotide polymorphisms (SNPs). eBGs based on legacy MLST are largely consistent with eBGs based on rMLST, thus demonstrating that these correspond to natural populations. rMLST also facilitated the selection of representative genotypes for SNP analyses of the entire breadth of diversity within Salmonella. In contrast, cgMLST provides the resolution needed for epidemiological investigations. These observations show that genomic genotyping, with the assistance of EnteroBase, can be applied at all levels of diversity within the Salmonella genus.
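    MLST-style genotyping, as summarized in this abstract, represents each genome by the allele number it carries at a fixed set of loci, so two isolates can be compared by counting the loci at which their alleles differ. The sketch below illustrates that comparison under assumptions: the locus names and allele numbers are placeholders, `allelic_distance` is a hypothetical helper, and nothing here reflects how EnteroBase itself is implemented.

```python
def allelic_distance(profile_a, profile_b):
    """Count loci, among those typed in both profiles, whose allele
    numbers differ. Profiles map locus name -> allele number."""
    shared = profile_a.keys() & profile_b.keys()
    return sum(1 for locus in shared if profile_a[locus] != profile_b[locus])


# Placeholder seven-locus profiles (legacy MLST uses seven housekeeping
# genes; these names and numbers are invented for illustration).
isolate_1 = {"locus1": 2, "locus2": 7, "locus3": 9, "locus4": 9,
             "locus5": 5, "locus6": 9, "locus7": 12}
isolate_2 = {"locus1": 2, "locus2": 7, "locus3": 3, "locus4": 9,
             "locus5": 5, "locus6": 4, "locus7": 12}

distance = allelic_distance(isolate_1, isolate_2)
```

    The same function applies unchanged to cgMLST profiles; with thousands of loci instead of seven, closely related isolates can be told apart, which is the gain in resolution the abstract attributes to cgMLST.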

  6. CGI: Java software for mapping and visualizing data from array-based comparative genomic hybridization and expression profiling.

    Science.gov (United States)

    Gu, Joyce Xiuweu-Xu; Wei, Michael Yang; Rao, Pulivarthi H; Lau, Ching C; Behl, Sanjiv; Man, Tsz-Kwong

    2007-10-06

    With the increasing application of various genomic technologies in biomedical research, there is a need to integrate these data to correlate candidate genes/regions that are identified by different genomic platforms. Although there are tools that can analyze data from individual platforms, essential software for integration of genomic data is still lacking. Here, we present a novel Java-based program called CGI (Cytogenetics-Genomics Integrator) that matches the BAC clones from array-based comparative genomic hybridization (aCGH) to genes from RNA expression profiling datasets. The matching is computed via a fast, backend MySQL database containing UCSC Genome Browser annotations. This program also provides an easy-to-use graphical user interface for visualizing and summarizing the correlation of DNA copy number changes and RNA expression patterns from a set of experiments. In addition, CGI uses a Java applet to display the copy number values of a specific BAC clone in aCGH experiments side by side with the expression levels of genes that are mapped back to that BAC clone from the microarray experiments. The CGI program is built on top of extensible, reusable graphic components specifically designed for biologists. It is cross-platform compatible and the source code is freely available under the General Public License.
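    Matching BAC clones to the genes they contain is, at its core, a genomic interval overlap query. The sketch below shows that query in memory, under assumptions: coordinates are half-open (start inclusive, end exclusive), the clone and gene names are invented, and the helpers `overlaps` and `genes_in_clone` are hypothetical. CGI itself is described as a Java program backed by a MySQL database of UCSC Genome Browser annotations, which this sketch does not reproduce.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    name: str
    chrom: str
    start: int  # inclusive
    end: int    # exclusive


def overlaps(a, b):
    """True when two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def genes_in_clone(clone, genes):
    """Names of the genes whose coordinates overlap the BAC clone."""
    return [gene.name for gene in genes if overlaps(clone, gene)]


# Invented coordinates, for illustration only.
clone = Interval("BAC_example", "chr8", 1_000_000, 1_150_000)
genes = [
    Interval("geneA", "chr8", 990_000, 1_010_000),    # overlaps clone start
    Interval("geneB", "chr8", 1_150_000, 1_160_000),  # begins where clone ends
    Interval("geneC", "chr9", 1_050_000, 1_060_000),  # different chromosome
]
mapped = genes_in_clone(clone, genes)
```

    A linear scan is adequate for a handful of clones; for genome-wide sets, an indexed database query or an interval tree would be the more usual choice.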

  7. CGI: Java Software for Mapping and Visualizing Data from Array-based Comparative Genomic Hybridization and Expression Profiling

    Directory of Open Access Journals (Sweden)

    Joyce Xiuweu-Xu Gu

    2007-01-01

    Full Text Available With the increasing application of various genomic technologies in biomedical research, there is a need to integrate these data to correlate candidate genes/regions that are identified by different genomic platforms. Although there are tools that can analyze data from individual platforms, essential software for integration of genomic data is still lacking. Here, we present a novel Java-based program called CGI (Cytogenetics-Genomics Integrator) that matches the BAC clones from array-based comparative genomic hybridization (aCGH) to genes from RNA expression profiling datasets. The matching is computed via a fast, backend MySQL database containing UCSC Genome Browser annotations. This program also provides an easy-to-use graphical user interface for visualizing and summarizing the correlation of DNA copy number changes and RNA expression patterns from a set of experiments. In addition, CGI uses a Java applet to display the copy number values of a specific BAC clone in aCGH experiments side by side with the expression levels of genes that are mapped back to that BAC clone from the microarray experiments. The CGI program is built on top of extensible, reusable graphic components specifically designed for biologists. It is cross-platform compatible and the source code is freely available under the General Public License.

  8. Development and validation of new SSR markers from expressed regions in the garlic genome

    Directory of Open Access Journals (Sweden)

    Meryem Ipek

    2015-02-01

    Full Text Available Only a limited number of simple sequence repeat (SSR) markers is available for the genome of garlic (Allium sativum L.) despite the fact that SSR markers have become one of the most preferred DNA marker systems. To develop new SSR markers for the garlic genome, garlic expressed sequence tags (ESTs) at the publicly available GarlicEST database were screened for SSR motifs and a total of 132 SSR motifs were identified. Primer pairs were designed for 50 SSR motifs and 24 of these primer pairs were selected as SSR markers based on their consistent amplification patterns and polymorphisms. In addition, two SSR markers were developed from the sequences of garlic cDNA-AFLP fragments. The use of 26 EST-SSR markers for the assessment of genetic relationships was tested using 31 garlic genotypes. Twenty-six EST-SSR markers amplified 130 polymorphic DNA fragments and the number of polymorphic alleles per SSR marker ranged from 2 to 13 with an average of 5 alleles. Observed heterozygosity and polymorphism information content (PIC) of the SSR markers were between 0.23 and 0.88, and 0.20 and 0.87, respectively. Twenty-one of the 31 garlic genotypes had been analyzed in a previous study using AFLP markers, and the garlic genotypes that clustered together with AFLP markers were also grouped together with EST-SSR markers, demonstrating high concordance between the AFLP and EST-SSR marker systems and the possible immediate application of EST-SSR markers for fingerprinting of garlic clones. EST-SSR markers could be used in genetic studies such as genetic mapping, association mapping, genetic diversity and comparison of the genomes of Allium species.
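
    The polymorphism information content quoted above is conventionally computed from allele frequencies. The sketch below assumes the widely used formula PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, which the abstract does not state explicitly. It also computes expected heterozygosity, which is a different quantity from the observed heterozygosity reported in the study. The function names and the example frequencies are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def _check(freqs: Sequence[float]) -> None:
    """Allele frequencies at one marker must be non-negative and sum to 1."""
    if any(p < 0 for p in freqs) or abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must be non-negative and sum to 1")


def expected_heterozygosity(freqs: Sequence[float]) -> float:
    """Expected heterozygosity: 1 - sum of squared allele frequencies."""
    _check(freqs)
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs: Sequence[float]) -> float:
    """Polymorphism information content under the assumed formula."""
    _check(freqs)
    homozygous = sum(p * p for p in freqs)
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


# Invented frequencies: a two-allele marker and a five-allele marker.
print(pic([0.5, 0.5]))
print(pic([0.4, 0.3, 0.1, 0.1, 0.1]))
```

    For two equally frequent alleles the heterozygosity is 0.5 and the PIC is 0.375, so PIC values close to the upper end of the reported range require markers with many alleles.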

  9. Genome-Wide Classification and Evolutionary and Expression Analyses of Citrus MYB Transcription Factor Families in Sweet Orange

    Science.gov (United States)

    Hou, Xiao-Jin; Li, Si-Bei; Liu, Sheng-Rui; Hu, Chun-Gen; Zhang, Jin-Zhi

    2014-01-01

    MYB family genes are widely distributed in plants and comprise one of the largest transcription factor families involved in various developmental processes and defense responses of plants. To date, few MYB genes and little expression profiling have been reported for citrus. Here, we describe and classify 177 members of the sweet orange MYB gene (CsMYB) family in terms of their genomic gene structures and similarity to their putative Arabidopsis orthologs. According to these analyses, these CsMYBs were categorized into four groups (4R-MYB, 3R-MYB, 2R-MYB and 1R-MYB). Gene structure analysis revealed that 1R-MYB genes possess relatively more introns as compared with 2R-MYB genes. Investigation of their chromosomal localizations revealed that these CsMYBs are distributed across nine chromosomes. Sweet orange includes a relatively small number of MYB genes compared with the 198 members in Arabidopsis, presumably due to a paralog reduction related to repetitive sequence insertion into promoter and non-coding transcribed region of the genes. Comparative studies of CsMYBs and Arabidopsis showed that CsMYBs had fewer gene duplication events. Expression analysis revealed that the MYB gene family has a wide expression profile in sweet orange development and plays important roles in development and stress responses. In addition, 337 new putative microsatellites with flanking sequences sufficient for primer design were also identified from the 177 CsMYBs. These results provide a useful reference for the selection of candidate MYB genes for cloning and further functional analysis for citrus. PMID:25375352

  10. Application of Whole Genome Expression Analysis to Assess Bacterial Responses to Environmental Conditions

    Science.gov (United States)

    Vukanti, R. V.; Mintz, E. M.; Leff, L. G.

    2005-05-01

    Bacterial responses to environmental signals are multifactorial and are coupled to changes in gene expression. An understanding of bacterial responses to environmental conditions is possible using microarray expression analysis. In this study, the utility of microarrays for examining changes in gene expression in Escherichia coli under different environmental conditions was assessed. RNA was isolated, hybridized to Affymetrix E. coli Genome 2.0 chips and analyzed using Affymetrix GCOS and Genespring software. Major limiting factors were obtaining enough quality RNA (10^7-10^8 cells to get 10 μg RNA) and accounting for differences in growth rates under different conditions. Stabilization of RNA prior to isolation and taking extreme precautions while handling RNA were crucial. In addition, use of this method in ecological studies is limited by availability and cost of commercial arrays, choice of primers for cDNA synthesis, reproducibility, complexity of results generated and need to validate findings. This method may be more widely applicable with the development of better approaches for RNA recovery from environmental samples and increased number of available strain-specific arrays. Diligent experimental design and verification of results with real-time PCR or northern blots are needed. Overall, there is a great potential for use of this technology to discover mechanisms underlying organisms' responses to environmental conditions.

  11. Genome-Wide Identification of Histone Modifiers and Their Expression Patterns during Fruit Abscission in Litchi

    Directory of Open Access Journals (Sweden)

    Jianguo Li

    2017-04-01

    Full Text Available Modifications to histones, including acetylation and methylation processes, play crucial roles in the regulation of gene expression in plant development as well as in stress responses. However, limited information on the enzymes catalyzing histone acetylation and methylation in non-model plants is currently available. In this study, several histone modifier (HM) types, including six histone acetyltransferases (HATs), 11 histone deacetylases (HDACs), 48 histone methyltransferases (HMTs), and 22 histone demethylases (HDMs), are identified in litchi (Litchi chinensis Sonn. cv. Feizixiao) based on similarities in their sequences to homologs in Arabidopsis (A. thaliana), tomato (Solanum lycopersicum), and rice (Oryza sativa). Phylogenetic analyses reveal that HM enzymes can be grouped into four HAT, two HDAC, two HMT, and two HDM subfamilies, respectively, while further expression profile analyses demonstrate that 17 HMs were significantly altered during fruit abscission in two field treatments. Analyses reveal that these genes exhibit four distinct patterns of expression in response to fruit abscission, while an in vitro assay was used to confirm the HDAC activity of LcHDA2, LcHDA6, and LcSRT2. Our findings are the first in-depth analysis of HMs in the litchi genome, and imply that some are likely to play important roles in fruit abscission in this commercially important plant.

  12. A tiling microarray for global analysis of chloroplast genome expression in cucumber and other plants

    Directory of Open Access Journals (Sweden)

    Pląder Wojciech

    2011-09-01

    Full Text Available Abstract Plastids are small organelles equipped with their own genomes (plastomes). Although these organelles are involved in numerous plant metabolic pathways, current knowledge about the transcriptional activity of plastomes is limited. To solve this problem, we constructed a plastid tiling microarray (PlasTi-microarray) consisting of 1629 oligonucleotide probes. The oligonucleotides were designed based on the cucumber chloroplast genomic sequence and targeted both strands of the plastome in a non-contiguous arrangement. Up to 4 specific probes were designed for each gene/exon, and the intergenic regions were covered regularly, with 70-nt intervals. We also developed a protocol for direct chemical labeling and hybridization of as little as 2 micrograms of chloroplast RNA. We used this protocol for profiling the expression of the cucumber chloroplast plastome on the PlasTi-microarray. Owing to the high sequence similarity of plant plastomes, the newly constructed microarray can be used to study plants other than cucumber. Comparative hybridization of chloroplast transcriptomes from cucumber, Arabidopsis, tomato and spinach showed that the PlasTi-microarray is highly versatile.

  13. Reframed Genome-Scale Metabolic Model to Facilitate Genetic Design and Integration with Expression Data.

    Science.gov (United States)

    Gu, Deqing; Jian, Xingxing; Zhang, Cheng; Hua, Qiang

    2017-01-01

    Genome-scale metabolic network models (GEMs) have played important roles in the design of genetically engineered strains and helped biologists to decipher metabolism. However, due to the complex gene-reaction relationships that exist in model systems, most algorithms have limited capabilities with respect to directly predicting accurate genetic design for metabolic engineering. In particular, methods that predict reaction knockout strategies leading to overproduction are often impractical in terms of gene manipulations. Recently, we proposed a method named logical transformation of model (LTM) to simplify the gene-reaction associations by introducing intermediate pseudo reactions, which makes it possible to generate genetic design. Here, we propose an alternative method to relieve researchers from deciphering complex gene-reactions by adding pseudo gene controlling reactions. In comparison to LTM, this new method introduces fewer pseudo reactions and generates a much smaller model system named as gModel. We showed that gModel allows two seldom reported applications: identification of minimal genomes and design of minimal cell factories within a modified OptKnock framework. In addition, gModel could be used to integrate expression data directly and improve the performance of the E-Fmin method for predicting fluxes. In conclusion, the model transformation procedure will facilitate genetic research based on GEMs, extending their applications.
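
    The difficulty mentioned above, that a reaction knockout does not translate directly into a gene manipulation, comes from Boolean gene-reaction rules. The sketch below only illustrates how such a rule is evaluated under gene knockouts; it is not the LTM or gModel transformation. It assumes rules are given as nested tuples of `"and"`/`"or"` operators over gene identifiers, and the rule and gene names are hypothetical.

```python
from typing import Set, Tuple, Union

# A rule is either a gene identifier or an operator followed by sub-rules.
Rule = Union[str, Tuple]


def reaction_active(rule: Rule, knocked_out: Set[str]) -> bool:
    """Evaluate a gene-reaction rule: 'and' models enzyme complexes, 'or' models isozymes."""
    if isinstance(rule, str):
        return rule not in knocked_out
    operator, *operands = rule
    results = [reaction_active(operand, knocked_out) for operand in operands]
    if operator == "and":
        return all(results)
    if operator == "or":
        return any(results)
    raise ValueError(f"unknown operator: {operator!r}")


# Hypothetical rule: a two-subunit complex (g1 and g2) with an isozyme g3.
rule = ("or", ("and", "g1", "g2"), "g3")

print(reaction_active(rule, {"g1"}))        # isozyme g3 still carries the reaction
print(reaction_active(rule, {"g1", "g3"}))  # both routes are now blocked
```

    In this example no single gene deletion removes the reaction, which is the kind of mismatch between reaction-level and gene-level designs that the abstract describes.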

  14. Mobile Genome Express (MGE): A comprehensive automatic genetic analyses pipeline with a mobile device.

    Directory of Open Access Journals (Sweden)

    Jun-Hee Yoon

    Full Text Available The development of next-generation sequencing (NGS) technology allows the sequencing of whole exomes or genomes. However, data analysis is still the biggest bottleneck for its wide implementation. Most laboratories still depend on manual procedures for data handling and analyses, which translates into a delay and decreased efficiency in the delivery of NGS results to doctors and patients. Thus, there is high demand for developing an automatic and easy-to-use NGS data analysis system. We developed a comprehensive, automatic genetic analyses controller named Mobile Genome Express (MGE) that works on smartphones or other mobile devices. MGE can handle all the steps for genetic analyses, such as sample information submission, sequencing run quality check from the sequencer, secured data transfer and results review. We sequenced an Actrometrix control DNA containing multiple proven human mutations using a targeted sequencing panel, and the whole analysis was managed by MGE and its data reviewing program called ELECTRO. All steps were processed automatically except for the final sequencing review procedure with ELECTRO to confirm mutations. The data analysis process was completed within several hours. We confirmed that the mutations we identified were consistent with our previous results obtained by using multi-step, manual pipelines.

  15. Mobile Genome Express (MGE): A comprehensive automatic genetic analyses pipeline with a mobile device.

    Science.gov (United States)

    Yoon, Jun-Hee; Kim, Thomas W; Mendez, Pedro; Jablons, David M; Kim, Il-Jin

    2017-01-01

    The development of next-generation sequencing (NGS) technology allows the sequencing of whole exomes or genomes. However, data analysis is still the biggest bottleneck for its wide implementation. Most laboratories still depend on manual procedures for data handling and analyses, which translates into a delay and decreased efficiency in the delivery of NGS results to doctors and patients. Thus, there is high demand for developing an automatic and easy-to-use NGS data analysis system. We developed a comprehensive, automatic genetic analyses controller named Mobile Genome Express (MGE) that works on smartphones or other mobile devices. MGE can handle all the steps for genetic analyses, such as sample information submission, sequencing run quality check from the sequencer, secured data transfer and results review. We sequenced an Actrometrix control DNA containing multiple proven human mutations using a targeted sequencing panel, and the whole analysis was managed by MGE and its data reviewing program called ELECTRO. All steps were processed automatically except for the final sequencing review procedure with ELECTRO to confirm mutations. The data analysis process was completed within several hours. We confirmed that the mutations we identified were consistent with our previous results obtained by using multi-step, manual pipelines.

  16. Controlling Citrate Synthase Expression by CRISPR/Cas9 Genome Editing for n-Butanol Production in Escherichia coli

    DEFF Research Database (Denmark)

    Heo, Min-Ji; Jung, Hwi-Min; Um, Jaeyong

    2017-01-01

    Genome editing using CRISPR/Cas9 was successfully demonstrated in Escherichia coli to effectively produce n-butanol in a defined medium under microaerobic condition. The butanol synthetic pathway genes including those encoding oxygen-tolerant alcohol dehydrogenase were overexpressed in metabolically...... prediction program, UTR designer, and modified using the CRISPR/Cas9 genome editing method to reduce its expression level. E. coli strains with decreased citrate synthase expression produced more butanol and the citrate synthase activity was correlated with butanol production. These results demonstrate...

  17. Expression and genomic organization of zonadhesin-like genes in three species of fish give insight into the evolutionary history of a mosaic protein

    Directory of Open Access Journals (Sweden)

    Davidson William S

    2005-11-01

    Full Text Available Abstract Background: The mosaic sperm protein zonadhesin (ZAN) has been characterized in mammals and is implicated in species-specific egg-sperm binding interactions. The genomic structure and testes-specific expression of zonadhesin is known for many mammalian species. All zonadhesin genes characterized to date consist of meprin A5 antigen receptor tyrosine phosphatase mu (MAM) domains, mucin tandem repeats, and von Willebrand (VWD) adhesion domains. Here we investigate the genomic structure and expression of zonadhesin-like genes in three species of fish. Results: The cDNA and corresponding genomic locus of a zonadhesin-like gene (zlg) in Atlantic salmon (Salmo salar) were sequenced. Zlg is similar in adhesion domain content to mammalian zonadhesin; however, the domain order is altered. Analysis of puffer fish (Takifugu rubripes) and zebrafish (Danio rerio) sequence data identified zonadhesin (zan) genes that share the same domain order, content, and a conserved syntenic relationship with mammalian zonadhesin. A zonadhesin-like gene in D. rerio was also identified. Unlike mammalian zonadhesin, D. rerio zan and S. salar zlg were expressed in the gut and not in the testes. Conclusion: We characterized likely orthologs of zonadhesin in both T. rubripes and D. rerio and uncovered zonadhesin-like genes in S. salar and D. rerio. Each of these genes contains MAM, mucin, and VWD domains. While these domains are associated with several proteins that show prominent gut expression, their combination is unique to zonadhesin and zonadhesin-like genes in vertebrates. The expression patterns of fish zonadhesin and zonadhesin-like genes suggest that the reproductive role of zonadhesin evolved later in the mammalian lineage.

  18. RNA 3D modules in genome-wide predictions of RNA 2D structure

    DEFF Research Database (Denmark)

    Theis, Corinna; Zirbel, Craig L; Zu Siederdissen, Christian Höner

    2015-01-01

    Recent experimental and computational progress has revealed a large potential for RNA structure in the genome. This has been driven by computational strategies that exploit multiple genomes of related organisms to identify common sequences and secondary structures. However, these computational approaches have two main challenges: they are computationally expensive and they have a relatively high false discovery rate (FDR). Simultaneously, RNA 3D structure analysis has revealed modules composed of non-canonical base pairs which occur in non-homologous positions, apparently by independent evolution. These modules can, for example, occur inside structural elements which in RNA 2D predictions appear as internal loops. Hence one question is if the use of such RNA 3D information can improve the prediction accuracy of RNA secondary structure at a genome-wide level. Here, we use RNAz in combination with 3D...

  19. Expression and genomic analysis of midasin, a novel and highly conserved AAA protein distantly related to dynein

    Directory of Open Access Journals (Sweden)

    Gibbons I R

    2002-07-01

    Full Text Available Abstract Background: The largest open reading frame in the Saccharomyces genome encodes midasin (MDN1p, YLR106p), an AAA ATPase of 560 kDa that is essential for cell viability. Orthologs of midasin have been identified in the genome projects for Drosophila, Arabidopsis, and Schizosaccharomyces pombe. Results: Midasin is present as a single-copy gene encoding a well-conserved protein of ~600 kDa in all eukaryotes for which data are available. In humans, the gene maps to 6q15 and encodes a predicted protein of 5596 residues (632 kDa). Sequence alignments of midasin from humans, yeast, Giardia and Encephalitozoon indicate that its domain structure comprises an N-terminal domain (35 kDa), followed by an AAA domain containing six tandem AAA protomers (~30 kDa each), a linker domain (260 kDa), an acidic domain (~70 kDa) containing 35–40% aspartate and glutamate, and a carboxy-terminal M-domain (30 kDa) that possesses MIDAS sequence motifs and is homologous to the I-domain of integrins. Expression of hemagglutinin-tagged midasin in yeast demonstrates a polypeptide of the anticipated size that is localized principally in the nucleus. Conclusions: The highly conserved structure of midasin in eukaryotes, taken in conjunction with its nuclear localization in yeast, suggests that midasin may function as a nuclear chaperone and be involved in the assembly/disassembly of macromolecular complexes in the nucleus. The AAA domain of midasin is evolutionarily related to that of dynein, but it appears to lack a microtubule-binding site.

  20. Genomic organization, evolution, and expression of photoprotein and opsin genes in Mnemiopsis leidyi: a new view of ctenophore photocytes

    Directory of Open Access Journals (Sweden)

    Schnitzler Christine E

    2012-12-01

    Full Text Available Abstract Background: Calcium-activated photoproteins are luciferase variants found in photocyte cells of bioluminescent jellyfish (Phylum Cnidaria) and comb jellies (Phylum Ctenophora). The complete genomic sequence from the ctenophore Mnemiopsis leidyi, a representative of the earliest branch of animals that emit light, provided an opportunity to examine the genome of an organism that uses this class of luciferase for bioluminescence and to look for genes involved in light reception. To determine when photoprotein genes first arose, we examined the genomic sequence from other early-branching taxa. We combined our genomic survey with gene trees, developmental expression patterns, and functional protein assays of photoproteins and opsins to provide a comprehensive view of light production and light reception in Mnemiopsis. Results: The Mnemiopsis genome has 10 full-length photoprotein genes situated within two genomic clusters with high sequence conservation that are maintained due to strong purifying selection and concerted evolution. Photoprotein-like genes were also identified in the genomes of the non-luminescent sponge Amphimedon queenslandica and the non-luminescent cnidarian Nematostella vectensis, and phylogenomic analysis demonstrated that photoprotein genes arose at the base of all animals. Photoprotein gene expression in Mnemiopsis embryos begins during gastrulation in migrating precursors to photocytes and persists throughout development in the canals where photocytes reside. We identified three putative opsin genes in the Mnemiopsis genome and show that they do not group with well-known bilaterian opsin subfamilies. Interestingly, photoprotein transcripts are co-expressed with two of the putative opsins in developing photocytes. Opsin expression is also seen in the apical sensory organ. We present evidence that one opsin functions as a photopigment in vitro, absorbing light at wavelengths that overlap with peak photoprotein light

  1. Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression

    DEFF Research Database (Denmark)

    Ma, Ding; Yang, Laurence; Fleming, Ronan M. T.

    2017-01-01

    Constraint-Based Reconstruction and Analysis (COBRA) is currently the only methodology that permits integrated modeling of Metabolism and macromolecular Expression (ME) at genome-scale. Linear optimization computes steady-state flux solutions to ME models, but flux values are spread over many orders of magnitude. Data values also have greatly varying magnitudes. Standard double-precision solvers may return inaccurate solutions or report that no solution exists. Exact simplex solvers based on rational arithmetic require a near-optimal warm start to be practical on large problems (current ME models have 70,000 constraints and variables and will grow larger). We have developed a quadruple-precision version of our linear and nonlinear optimizer MINOS, and a solution procedure (DQQ) involving Double and Quad MINOS that achieves reliability and efficiency for ME models and other challenging...
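
    The precision problem described in this record can be shown with a toy calculation. The sketch below is only an illustration of why values spread over many orders of magnitude strain double precision; it is not the DQQ procedure or MINOS. It accumulates three invented flux contributions once in double precision and once in exact rational arithmetic using Python's standard `fractions` module.

```python
from fractions import Fraction
from typing import Sequence


def net_flux_double(fluxes: Sequence[float]) -> float:
    """Accumulate contributions left to right in IEEE double precision."""
    total = 0.0
    for value in fluxes:
        total += value
    return total


def net_flux_exact(fluxes: Sequence[float]) -> Fraction:
    """Accumulate the same contributions in exact rational arithmetic."""
    total = Fraction(0)
    for value in fluxes:
        total += Fraction(value)
    return total


# Invented contributions spanning sixteen orders of magnitude.
fluxes = [1e16, 1.0, -1e16]

print(net_flux_double(fluxes))  # 0.0: the small contribution is rounded away
print(net_flux_exact(fluxes))   # 1: exact arithmetic keeps it
```

    A balance that should leave a residual of 1 evaluates to 0 in double precision, which is the kind of error that can make a solver misjudge feasibility. Exact arithmetic avoids the error at a much higher computational cost, and the abstract presents higher-precision floating point as a practical alternative.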

  2. The SGC beyond structural genomics: redefining the role of 3D structures by coupling genomic stratification with fragment-based discovery.

    Science.gov (United States)

    Bradley, Anthony R; Echalier, Aude; Fairhead, Michael; Strain-Damerell, Claire; Brennan, Paul; Bullock, Alex N; Burgess-Brown, Nicola A; Carpenter, Elisabeth P; Gileadi, Opher; Marsden, Brian D; Lee, Wen Hwa; Yue, Wyatt; Bountra, Chas; von Delft, Frank

    2017-11-08

    The ongoing explosion in genomics data has long since outpaced the capacity of conventional biochemical methodology to verify the large number of hypotheses that emerge from the analysis of such data. In contrast, it is still a gold-standard for early phenotypic validation towards small-molecule drug discovery to use probe molecules (or tool compounds), notwithstanding the difficulty and cost of generating them. Rational structure-based approaches to ligand discovery have long promised the efficiencies needed to close this divergence; in practice, however, this promise remains largely unfulfilled, for a host of well-rehearsed reasons and despite the huge technical advances spearheaded by the structural genomics initiatives of the noughties. Therefore the current, fourth funding phase of the Structural Genomics Consortium (SGC), building on its extensive experience in structural biology of novel targets and design of protein inhibitors, seeks to redefine what it means to do structural biology for drug discovery. We developed the concept of a Target Enabling Package (TEP) that provides, through reagents, assays and data, the missing link between genetic disease linkage and the development of usefully potent compounds. There are multiple prongs to the ambition: rigorously assessing targets' genetic disease linkages through crowdsourcing to a network of collaborating experts; establishing a systematic approach to generate the protocols and data that comprise each target's TEP; developing new, X-ray-based fragment technologies for generating high quality chemical matter quickly and cheaply; and exploiting a stringently open access model to build multidisciplinary partnerships throughout academia and industry. By learning how to scale these approaches, the SGC aims to make structures finally serve genomics, as originally intended, and demonstrate how 3D structures systematically allow new modes of druggability to be discovered for whole classes of targets. © 2017 The

  3. Genome-Wide Identification and Expression Analyses of Aquaporin Gene Family during Development and Abiotic Stress in Banana

    Science.gov (United States)

    Hu, Wei; Hou, Xiaowan; Huang, Chao; Yan, Yan; Tie, Weiwei; Ding, Zehong; Wei, Yunxie; Liu, Juhua; Miao, Hongxia; Lu, Zhiwei; Li, Meiying; Xu, Biyu; Jin, Zhiqiang

    2015-01-01

    Aquaporins (AQPs) function to selectively control the flow of water and other small molecules through biological membranes, playing crucial roles in various biological processes. However, little information is available on the AQP gene family in bananas. In this study, we identified 47 banana AQP genes based on the banana genome sequence. Evolutionary analysis of AQPs from banana, Arabidopsis, poplar, and rice indicated that banana AQPs (MaAQPs) were clustered into four subfamilies. Conserved motif analysis showed that all banana AQPs contained the typical AQP-like or major intrinsic protein (MIP) domain. Gene structure analysis suggested the majority of MaAQPs had two to four introns with a highly specific number and length for each subfamily. Expression analysis of MaAQP genes during fruit development and postharvest ripening showed that some MaAQP genes exhibited high expression levels during these stages, indicating the involvement of MaAQP genes in banana fruit development and ripening. Additionally, some MaAQP genes showed strong induction after stress treatment and therefore, may represent potential candidates for improving banana resistance to abiotic stress. Taken together, this study identified some excellent tissue-specific, fruit development- and ripening-dependent, and abiotic stress-responsive candidate MaAQP genes, which could lay a solid foundation for genetic improvement of banana cultivars. PMID:26307965

  4. Genome wide identification and expression analysis of Homeodomain leucine zipper subfamily IV (HDZ IV) gene family from Musa accuminata

    Directory of Open Access Journals (Sweden)

    Ashutosh Pandey

    2016-02-01

    Full Text Available The homeodomain zipper family (HD-ZIP) of transcription factors is present only in plants and plays an important role in the regulation of plant-specific processes. The subfamily IV of HDZ transcription factors (HD-ZIP IV) has primarily been implicated in the regulation of epidermal structure development. Though this gene family is present in all lineages of land plants, members of this gene family have not been identified in banana, which is one of the major staple fruit crops. In the present work, we identified 21 HDZIV genes in banana by computational analysis of the banana genome resource. Our analysis suggested that these genes putatively encode proteins having all the characteristic domains of HDZIV transcription factors. The phylogenetic analysis of the banana HDZIV family genes further confirmed that after separation from a common ancestor, the banana and Poales lineages might have followed distinct evolutionary paths. Further, we conclude that segmental duplication played a major role in the evolution of banana HDZIV genes. All the identified banana HDZIV genes are expressed in different banana tissues, although at varying levels. The transcript levels of some of the banana HDZIV genes were also detected in banana fruit pulp, suggesting their putative role in fruit attributes. A large number of genes of this family showed modulated expression under drought and salinity stress. Taken together, the present work lays a foundation for elucidation of functional aspects of the banana HDZIV genes and for their possible use in the banana improvement programs.

  5. Genome-wide identification, phylogeny and expression analyses of SCARECROW-LIKE (SCL) genes in millet (Setaria italica).

    Science.gov (United States)

    Liu, Hongyun; Qin, Jiajia; Fan, Hui; Cheng, Jinjin; Li, Lin; Liu, Zheng

    2017-07-01

    As members of the GRAS gene family, SCARECROW-LIKE (SCL) genes encode transcriptional regulators that are involved in plant information transmission and signal transduction. In this study, 44 SCL genes including two SCARECROW genes in millet were identified to be distributed on eight chromosomes, except chromosome 6. All the millet genes contain motifs 6-8, indicating that these motifs are conserved during evolution. SCL genes of millet were divided into eight groups based on the phylogenetic relationship and classification of Arabidopsis SCL genes. Several putative millet orthologous genes in Arabidopsis, maize and rice were identified. High throughput RNA sequencing revealed that the expression of millet SCL genes in root, stem, leaf, spica, and along the leaf gradient varied greatly. Analyses combining the gene expression patterns, gene structures, motif compositions, promoter cis-element identification, alternative splicing of transcripts and phylogenetic relationship of SCL genes indicate that these genes may perform diverse functions. Functionally characterized SCL genes in maize, rice and Arabidopsis would provide us some clues for future characterization of their homologues in millet. To the best of our knowledge, this is the first study of millet SCL genes at the genome wide level. Our work provides a useful platform for functional analysis of SCL genes in millet, a model crop for C4 photosynthesis and bioenergy studies.

  6. Genome-wide identification, phylogenetic analysis, and expression profiling of polyamine synthesis gene family members in tomato.

    Science.gov (United States)

    Liu, Taibo; Huang, Binbin; Chen, Lin; Xian, Zhiqiang; Song, Shiwei; Chen, Riyuan; Hao, Yanwei

    2018-06-30

    Polyamines (PAs), including putrescine (Put), spermidine (Spd), spermine (Spm), and thermospermine (T-Spm), play key roles in plant development, including fruit setting and ripening, morphogenesis, and abiotic/biotic stress. Their functions appear to be intimately related to their synthesis, which occurs via arginine/ornithine decarboxylase (ADC/ODC), Spd synthase (SPDS), Spm synthase (SPMS), and Acaulis5 (ACL5), respectively. Unfortunately, the expression and function of these PA synthesis-related genes during specific developmental processes or under stress have not been fully elucidated. Here, we present the results of a genome-wide analysis of the PA synthesis genes (ADC, ODC, SPDS, SPMS, ACL5) in the tomato (Solanum lycopersicum). In total, 14 PA synthesis-related genes were identified. Their structures, conserved domains, phylogenetic trees, predicted subcellular localization, and promoter cis-regulatory elements were further analyzed. Furthermore, we also performed experiments to evaluate their expression patterns across tissues and under hormone and various stress treatments. To our knowledge, this is the first study to elucidate the mechanisms underlying PA function in this variety of tomato. Taken together, these data provide valuable information for future functional characterization of specific genes in the PA synthesis pathway in this and other plant species. Although additional research is required, the insight gained by this and similar studies can be used to improve our understanding of PA metabolism ultimately leading to more effective and consistent plant cultivation. Copyright © 2018 Elsevier B.V. All rights reserved.

  7. Structure and genome organization of AFV2, a novel archaeal lipothrixvirus with unusual terminal and core structures

    DEFF Research Database (Denmark)

    Häring, Monika; Vestergaard, Gisle Alberg; Brügger, Kim

    2005-01-01

    A novel filamentous virus, AFV2, from the hyperthermophilic archaeal genus Acidianus shows structural similarity to lipothrixviruses but differs from them in its unusual terminal and core structures. The double-stranded DNA genome contains 31,787 bp and carries eight open reading frames homologous...

  8. Genetic variability in MCF-7 sublines: evidence of rapid genomic and RNA expression profile modifications

    International Nuclear Information System (INIS)

    Nugoli, Mélanie; Theillet, Charles; Chuchana, Paul; Vendrell, Julie; Orsetti, Béatrice; Ursule, Lisa; Nguyen, Catherine; Birnbaum, Daniel; Douzery, Emmanuel JP; Cohen, Pascale

    2003-01-01

    Both phenotypic and cytogenetic variability have been reported for clones of breast carcinoma cell lines but have not been comprehensively studied. Despite this, cell lines such as MCF-7 cells are extensively used as model systems. In this work we documented, using CGH and RNA expression profiles, the genetic variability at the genomic and RNA expression levels of MCF-7 cells of different origins. Eight MCF-7 sublines collected from different sources were studied as well as 3 subclones isolated from one of the sublines by limit dilution. MCF-7 sublines showed important differences in copy number alteration (CNA) profiles. Overall numbers of events ranged from 28 to 41. Involved chromosomal regions varied greatly from one subline to another. A total of 62 chromosomal regions were affected by either gains or losses in the 11 sublines studied. We performed a phylogenetic analysis of CGH profiles using maximum parsimony in order to reconstruct the putative filiation of the 11 MCF-7 sublines. The phylogenetic tree obtained showed that the MCF-7 clade was characterized by a restricted set of 8 CNAs and that the most divergent subline occupied the position closest to the common ancestor. Expression profiles of 8 MCF-7 sublines were analyzed along with those of 19 unrelated breast cancer cell lines using home-made cDNA arrays comprising 720 genes. Hierarchical clustering analysis of the expression data showed that 7/8 MCF-7 sublines were grouped forming a cluster while the remaining subline clustered with unrelated breast cancer cell lines. These data thus showed that MCF-7 sublines differed at both the genomic and phenotypic levels. The analysis of CGH profiles of the parent subline and its three subclones supported the heteroclonal nature of MCF-7 cells. This strongly suggested that the genetic plasticity of MCF-7 cells was related to their intrinsic capacity to generate clonal heterogeneity. We propose that MCF-7, and possibly the breast tumor it was derived from, evolved

  9. G2S: A web-service for annotating genomic variants on 3D protein structures.

    Science.gov (United States)

    Wang, Juexin; Sheridan, Robert; Sumer, S Onur; Schultz, Nikolaus; Xu, Dong; Gao, Jianjiong

    2018-01-27

    Accurately mapping and annotating genomic locations on 3D protein structures is a key step in structure-based analysis of genomic variants detected by recent large-scale sequencing efforts. There are several mapping resources currently available, but none of them provides a web API (Application Programming Interface) that supports programmatic access. We present G2S, a real-time web API that provides automated mapping of genomic variants on 3D protein structures. G2S can align genomic locations of variants, protein locations, or protein sequences to protein structures and retrieve the mapped residues from structures. The G2S API uses a REST-inspired design and can be used by various clients such as web browsers, command terminals, programming languages and other bioinformatics tools for bringing 3D structures into genomic variant analysis. The webserver and source code are freely available at https://g2s.genomenexus.org. Contact: g2s@genomenexus.org. Supplementary data are available at Bioinformatics online. © The Author (2018). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
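
The residue-mapping step described in this abstract can be illustrated with a minimal sketch. It is not the G2S implementation and does not call the G2S API; it only assumes that a pairwise alignment between a query protein sequence and a structure chain is already available as two gapped strings. The function name `map_positions`, its parameters, and the example alignment are hypothetical.

```python
from typing import Dict


def map_positions(aligned_seq: str, aligned_struct: str,
                  seq_start: int = 1, struct_start: int = 1) -> Dict[int, int]:
    """Map query sequence positions to structure residue numbers.

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    Only columns where both rows hold a residue produce a mapping.
    (Hypothetical helper for illustration, not part of G2S.)
    """
    if len(aligned_seq) != len(aligned_struct):
        raise ValueError("aligned strings must have equal length")
    mapping: Dict[int, int] = {}
    seq_pos, struct_pos = seq_start, struct_start
    for seq_char, struct_char in zip(aligned_seq, aligned_struct):
        if seq_char != "-" and struct_char != "-":
            mapping[seq_pos] = struct_pos
        # Advance each counter only when its row consumed a residue.
        if seq_char != "-":
            seq_pos += 1
        if struct_char != "-":
            struct_pos += 1
    return mapping


# Made-up alignment: query residue 3 faces a gap in the structure row,
# so it has no mapped structure residue.
example = map_positions("MKT-AYI", "MK-GAYI", struct_start=101)
print(example)
```

A variant at a query position that is absent from the returned dictionary falls in a region that the structure does not cover, which is the case a caller would need to report rather than silently skip.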

  10. Three-dimensional Structure of a Viral Genome-delivery Portal Vertex

    Energy Technology Data Exchange (ETDEWEB)

    A Olia; P Prevelige Jr.; J Johnson; G Cingolani

    2011-12-31

    DNA viruses such as bacteriophages and herpesviruses deliver their genome into and out of the capsid through large proteinaceous assemblies, known as portal proteins. Here, we report two snapshots of the dodecameric portal protein of bacteriophage P22. The 3.25-Å-resolution structure of the portal-protein core bound to 12 copies of gene product 4 (gp4) reveals a ~1.1-MDa assembly formed by 24 proteins. Unexpectedly, a lower-resolution structure of the full-length portal protein unveils the unique topology of the C-terminal domain, which forms a ~200-Å-long α-helical barrel. This domain inserts deeply into the virion and is highly conserved in the Podoviridae family. We propose that the barrel domain facilitates genome spooling onto the interior surface of the capsid during genome packaging and, in analogy to a rifle barrel, increases the accuracy of genome ejection into the host cell.

  11. Self-Organization of Genome Expression from Embryo to Terminal Cell Fate: Single-Cell Statistical Mechanics of Biological Regulation

    Directory of Open Access Journals (Sweden)

    Alessandro Giuliani

    2017-12-01

    A statistical mechanical mean-field approach to the temporal development of biological regulation provides a phenomenological, but basic description of the dynamical behavior of genome expression in terms of autonomous self-organization with a critical transition (Self-Organized Criticality: SOC). This approach reveals the basis of self-regulation/organization of genome expression, where the extreme complexity of living matter precludes any strict mechanistic approach. The self-organization in SOC involves two critical behaviors: scaling-divergent behavior (genome avalanche) and sandpile-type critical behavior. Genome avalanche patterns—competition between order (scaling) and disorder (divergence)—reflect the opposite sequence of events characterizing the self-organization process in embryo development and helper T17 terminal cell differentiation, respectively. On the other hand, the temporal development of sandpile-type criticality (the degree of SOC control) in mouse embryo suggests the existence of an SOC control landscape with a critical transition state (i.e., the erasure of zygote-state criticality). This indicates that a phase transition of the mouse genome before and after reprogramming (immediately after the late 2-cell state) occurs through a dynamical change in a control parameter. This result provides a quantitative open-thermodynamic appreciation of the still largely qualitative notion of the epigenetic landscape. Our results suggest: (i) the existence of coherent waves of condensation/de-condensation in chromatin, which are transmitted across regions of different gene-expression levels along the genome; and (ii) essentially the same critical dynamics we observed for cell-differentiation processes exist in overall RNA expression during embryo development, which is particularly relevant because it gives further proof of SOC control of overall expression as a universal feature.

  12. Rice sHsp genes: genomic organization and expression profiling under stress and development

    Directory of Open Access Journals (Sweden)

    Grover Anil

    2009-08-01

    Background: Heat shock proteins (Hsps) constitute an important component in the heat shock response of all living systems. Among the various plant Hsps (i.e. Hsp100, Hsp90, Hsp70 and Hsp20), Hsp20 or small Hsps (sHsps) are expressed in maximal amounts under high temperature stress. The characteristic feature of the sHsps is the presence of an α-crystallin domain (ACD) at the C-terminus. sHsps cooperate with Hsp100/Hsp70 and co-chaperones in an ATP-dependent manner in preventing aggregation of cellular proteins and in their subsequent refolding. A database search was performed to investigate the sHsp gene family across the rice genome sequence, followed by comprehensive expression analysis of these genes. Results: We identified 40 α-crystallin domain containing genes in rice. Phylogenetic analysis showed that 23 out of these 40 genes constitute sHsps. The additional 17 genes containing ACD clustered with Acd proteins of Arabidopsis. Detailed scrutiny of 23 sHsp sequences enabled us to categorize these proteins in a revised scheme of classification consisting of 16 cytoplasmic/nuclear, 2 ER, 3 mitochondrial, 1 plastid and 1 peroxisomal genes. In the new classification proposed herein, the nucleo-cytoplasmic class of sHsps with 9 subfamilies is more complex in rice than in Arabidopsis. Strikingly, 17 of 23 rice sHsp genes were noted to be intronless. Expression analysis based on microarray and RT-PCR showed that 19 sHsp genes were upregulated by high temperature stress. Besides heat stress, expression of sHsp genes was up or downregulated by other abiotic and biotic stresses. In addition to stress regulation, various sHsp genes were differentially upregulated at different developmental stages of the rice plant. The majority of sHsp genes were expressed in seed. Conclusion: We identified twenty three sHsp genes and seventeen Acd genes in rice. Three nucleocytoplasmic sHsp genes were found only in monocots. Analysis of expression profiling of sHsp genes revealed

  13. Enforcing Co-expression Within a Brain-Imaging Genomics Regression Framework.

    Science.gov (United States)

    Zille, Pascal; Calhoun, Vince D; Wang, Yu-Ping

    2017-06-28

    Among the challenges arising in brain imaging genetic studies, estimating the potential links between neurological and genetic variability within a population is key. In this work, we propose a multivariate, multimodal formulation for variable selection that leverages co-expression patterns across various data modalities. Our approach is based on an intuitive combination of two widely used statistical models: sparse regression and canonical correlation analysis (CCA). While the former seeks multivariate linear relationships between a given phenotype and associated observations, the latter searches to extract co-expression patterns between sets of variables belonging to different modalities. In the following, we propose to rely on a 'CCA-type' formulation in order to regularize the classical multimodal sparse regression problem (essentially incorporating both CCA and regression models within a unified formulation). The underlying motivation is to extract discriminative variables that are also co-expressed across modalities. We first show that the simplest formulation of such model can be expressed as a special case of collaborative learning methods. After discussing its limitation, we propose an extended, more flexible formulation, and introduce a simple and efficient alternating minimization algorithm to solve the associated optimization problem. We explore the parameter space and provide some guidelines regarding parameter selection. Both the original and extended versions are then compared on a simple toy dataset and a more advanced simulated imaging genomics dataset in order to illustrate the benefits of the latter. Finally, we validate the proposed formulation using single nucleotide polymorphisms (SNP) data and functional magnetic resonance imaging (fMRI) data from a population of adolescents (n = 362 subjects, age 16.9 ± 1.9 years from the Philadelphia Neurodevelopmental Cohort) for the study of learning ability. Furthermore, we carry out a significance

  14. Structured RNAs in the ENCODE selected regions of the human genome

    DEFF Research Database (Denmark)

    Washietl, Stefan; Pedersen, Jakob Skou; Korbel, Jan O

    2007-01-01

    Functional RNA structures play an important role both in the context of noncoding RNA transcripts as well as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack...... with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70% many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz...

  15. Terminal structures of West Nile virus genomic RNA and their interactions with viral NS5 protein

    International Nuclear Information System (INIS)

    Dong Hongping; Zhang Bo; Shi Peiyong

    2008-01-01

    Genome cyclization is essential for flavivirus replication. We used RNases to probe the structures formed by the 5'-terminal 190 nucleotides and the 3'-terminal 111 nucleotides of the West Nile virus (WNV) genomic RNA. When analyzed individually, the two RNAs adopt stem-loop structures as predicted by the thermodynamic-folding program. However, when mixed together, the two RNAs form a duplex that is mediated through base-pairings of two sets of RNA elements (5'CS/3'CSI and 5'UAR/3'UAR). Formation of the RNA duplex facilitates a conformational change that leaves the 3'-terminal nucleotides of the genome (positions -8 to -16) single-stranded. Viral NS5 binds specifically to the 5'-terminal stem-loop (SL1) of the genomic RNA. The 5'SL1 RNA structure is essential for WNV replication. The study has provided further evidence to suggest that flavivirus genome cyclization and NS5/5'SL1 RNA interaction facilitate NS5 binding to the 3' end of the genome for the initiation of viral minus-strand RNA synthesis.

  16. Primary structure, gene organization and polypeptide expression of poliovirus RNA

    Energy Technology Data Exchange (ETDEWEB)

    Kitamura, N. (State Univ. of New York, Stony Brook); Semler, B.L.; Rothberg, P.G.

    1981-06-18

    The primary structure of the poliovirus genome has been determined. The RNA molecule is 7433 nucleotides long, polyadenylated at the 3' terminus, and covalently linked to a small protein (VPg) at the 5' terminus. An open reading frame of 2207 consecutive triplets spans over 89% of the nucleotide sequence and codes for the viral polyprotein NCVP00. Twelve viral polypeptides have been mapped by amino acid sequence analysis and were found to be proteolytic cleavage products of the polyprotein, cleavages occurring predominantly at Gln-Gly pairs.
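
The coverage figure quoted in this abstract can be checked with a short calculation. The sketch below uses only the two numbers given in the record (7433 nucleotides, 2207 triplets); the function and constant names are illustrative.

```python
GENOME_LENGTH_NT = 7433  # genome length stated in the abstract
ORF_TRIPLETS = 2207      # consecutive triplets in the open reading frame


def orf_coverage(genome_length_nt: int, orf_triplets: int):
    """Return the ORF length in nucleotides and its share of the genome."""
    orf_length_nt = orf_triplets * 3  # three nucleotides per triplet
    return orf_length_nt, orf_length_nt / genome_length_nt


orf_nt, fraction = orf_coverage(GENOME_LENGTH_NT, ORF_TRIPLETS)
print(f"ORF length: {orf_nt} nt ({fraction:.1%} of the genome)")
# prints "ORF length: 6621 nt (89.1% of the genome)"
```

The result, 6621 of 7433 nucleotides, agrees with the statement that the reading frame spans over 89% of the sequence.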

  17. Genome-wide identification of structural variants in genes encoding drug targets

    DEFF Research Database (Denmark)

    Rasmussen, Henrik Berg; Dahmcke, Christina Mackeprang

    2012-01-01

    The objective of the present study was to identify structural variants of drug target-encoding genes on a genome-wide scale. We also aimed at identifying drugs that are potentially amenable for individualization of treatments based on knowledge about structural variation in the genes encoding...

  18. Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.

    Directory of Open Access Journals (Sweden)

    Jiang Du

    2009-07-01

    The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mappability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of

  19. Expression induction of P450 genes by imidacloprid in Nilaparvata lugens: A genome-scale analysis.

    Science.gov (United States)

    Zhang, Jianhua; Zhang, Yixi; Wang, Yunchao; Yang, Yuanxue; Cang, Xinzhu; Liu, Zewen

    2016-09-01

    The overexpression of P450 monooxygenase genes is a main mechanism of resistance to imidacloprid, a representative neonicotinoid insecticide, in Nilaparvata lugens (brown planthopper, BPH). However, only two P450 genes (CYP6AY1 and CYP6ER1), among the fifty-four P450 genes identified from the BPH genome database, have been reported to play important roles in imidacloprid resistance until now. In this study, after the important roles of P450s in imidacloprid resistance were confirmed by synergism analysis, the expression induction by imidacloprid was determined for all P450 genes. In the susceptible (Sus) strain, eight P450 genes in Clade4, eight in Clade3 and two in Clade2 were up-regulated by imidacloprid, among which three genes (CYP6CS1, CYP6CW1 and CYP6ER1, all in Clade3) were increased to above 4.0-fold and eight genes to above 2.0-fold. In contrast, no P450 genes were induced in the Mito clade. Eight genes induced to above 2.0-fold were selected to determine their expression and induced levels in the Huzhou population, in which piperonyl butoxide showed the biggest effects on imidacloprid toxicity among eight field populations. The expression levels of seven P450 genes were higher in the Huzhou population than in the Sus strain, with the biggest differences for CYP6CS1 (9.8-fold), CYP6ER1 (7.7-fold) and CYP6AY1 (5.1-fold). The induction levels for all tested genes except CYP425B1 were greater in the Sus strain than in the Huzhou population. Screening the induction of P450 genes by imidacloprid at the genome scale will provide an overall view of the possible metabolic factors in the resistance to neonicotinoid insecticides. Further work, such as the functional study of recombinant proteins, will be performed to validate the roles of these P450s in imidacloprid resistance. Copyright © 2015 Elsevier B.V. All rights reserved.

  20. Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains.

    Science.gov (United States)

    Lewis, Tony E; Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Chothia, Cyrus; Cuff, Alison; Dana, Jose M; Filippis, Ioannis; Gough, Julian; Hunter, Sarah; Jones, David T; Kelley, Lawrence A; Kleywegt, Gerard J; Minneci, Federico; Mitchell, Alex; Murzin, Alexey G; Ochoa-Montaño, Bernardo; Rackham, Owen J L; Smith, James; Sternberg, Michael J E; Velankar, Sameer; Yeats, Corin; Orengo, Christine

    2013-01-01

    Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).

  1. Transfer of Genomics Information to Flow Cytometry: Expression of CD27 and CD44 Discriminates Subtypes of Acute Lymphoblastic Leukemia

    Czech Academy of Sciences Publication Activity Database

    Vášková, M.; Mejstříková, E.; Kalina, T.; Martinková, Patrícia; Omelka, M.; Trka, J.; Starý, J.; Hrušák, O.

    2005-01-01

    Vol. 19, No. 5 (2005), pp. 876-878 ISSN 0887-6924 Source of funding: V - other public sources Keywords: transfer * genomics * information * cytometry * expression * discriminates * subtypes * acute * lymphoblastic * leukemia Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 6.612, year: 2005

  2. Molecular cloning and expression of full-length DNA copies of the genomic RNAs of cowpea mosaic virus

    NARCIS (Netherlands)

    Vos, P.

    1987-01-01

    The experiments described in this thesis were designed to unravel various aspects of the mechanism of gene expression of cowpea mosaic virus (CPMV). For this purpose full-length DNA copies of both genomic RNAs of CPMV were constructed. Using powerful in vitro

  3. Long-Term Protective Immune Response Elicited by Vaccination with an Expression Genomic Library of Toxoplasma gondii

    OpenAIRE

    Fachado, Alberto; Rodriguez, Alexandro; Molina, Judith; Silvério, Jaline C.; Marino, Ana P. M. P.; Pinto, Luzia M. O.; Angel, Sergio O.; Infante, Juan F.; Traub-Cseko, Yara; Amendoeira, Regina R.; Lannes-Vieira, Joseli

    2003-01-01

    Immunization of BALB/c mice with an expression genomic library of Toxoplasma gondii induces a Th1-type immune response, with recognition of several T. gondii proteins (21 to 117 kDa) and long-term protective immunity against a lethal challenge. These results support further investigations to achieve a multicomponent anti-T. gondii DNA vaccine.

  4. Genome-wide discovery of putative sRNAs in Paracoccus denitrificans expressed under nitrous oxide emitting conditions.

    Directory of Open Access Journals (Sweden)

    Hannah Gaimster

    2016-11-01

    Nitrous oxide (N2O) is a stable, ozone depleting greenhouse gas. Emissions of N2O into the atmosphere continue to rise, primarily due to the use of nitrogen-containing fertilizers by soil denitrifying microbes. It is clear more effective mitigation strategies are required to reduce emissions. One way to help develop future mitigation strategies is to address the currently poor understanding of transcriptional regulation of the enzymes used to produce and consume N2O. With this ultimate aim in mind we performed RNA-seq on a model soil denitrifier, Paracoccus denitrificans, cultured anaerobically under high N2O and low N2O emitting conditions, and aerobically under zero N2O emitting conditions to identify small RNAs (sRNAs) with potential regulatory functions transcribed under these conditions. sRNAs are short (∼40–500 nucleotides) non-coding RNAs that regulate a wide range of activities in many bacteria. 167 sRNAs were identified throughout the P. denitrificans genome which are either present in intergenic regions or located antisense to ORFs. Furthermore, many of these sRNAs are differentially expressed under high N2O and low N2O emitting conditions respectively, suggesting they may play a role in production or reduction of N2O. Expression of 16 of these sRNAs has been confirmed by RT-PCR. 90% of the sRNAs are predicted to form secondary structures. Predicted targets include transporters and a number of transcriptional regulators. A number of sRNAs were conserved in other members of the α-proteobacteria. Better understanding of the sRNA factors which contribute to expression of the machinery required to reduce N2O will, in turn, help to inform strategies for mitigation of N2O emissions.
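
The distinction drawn in this abstract between sRNAs in intergenic regions and sRNAs located antisense to ORFs can be sketched as a coordinate comparison. This is an illustrative assumption about how such a classification might be coded, not the procedure used in the study; the `Feature` type, the `classify_srna` function, and all coordinates are hypothetical.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def classify_srna(srna: Feature, orfs: List[Feature]) -> str:
    """Label an sRNA by its position relative to annotated ORFs."""
    # Two closed intervals overlap when each starts before the other ends.
    overlapping = [orf for orf in orfs
                   if orf.start <= srna.end and srna.start <= orf.end]
    if not overlapping:
        return "intergenic"
    if all(orf.strand != srna.strand for orf in overlapping):
        return "antisense"
    return "sense-overlap"


# Made-up coordinates, for illustration only.
orfs = [Feature(100, 400, "+"), Feature(900, 1500, "-")]
print(classify_srna(Feature(500, 620, "+"), orfs))    # intergenic
print(classify_srna(Feature(150, 260, "-"), orfs))    # antisense
print(classify_srna(Feature(1000, 1100, "-"), orfs))  # sense-overlap
```

The third label is included so that a transcript overlapping an ORF on the same strand is not misreported as one of the two categories named in the abstract.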

  5. Genomic and expression profiling of human spermatocytic seminomas: primary spermatocyte as tumorigenic precursor and DMRT1 as candidate chromosome 9 gene.

    NARCIS (Netherlands)

    Looijenga, L.H.J.; Hersmus, R.; Gillis, A.J.M.; Pfundt, R.; Stoop, H.J.; Gurp, R.J.H.L.M. van; Veltman, J.; Beverloo, H.B.; Drunen, E. van; Geurts van Kessel, A.H.M.; Pera, R.R.; Schneider, D.T.; Summersgill, B.; Shipley, J.; McIntyre, A.; Spek, P. van der; Schoenmakers, E.F.P.M.; Oosterhuis, J.W.

    2006-01-01

    Spermatocytic seminomas are solid tumors found solely in the testis of predominantly elderly individuals. We investigated these tumors using a genome-wide analysis for structural and numerical chromosomal changes through conventional karyotyping, spectral karyotyping, and array comparative genomic

  6. Genome-wide organization and expression profiling of the R2R3-MYB transcription factor family in pineapple (Ananas comosus).

    Science.gov (United States)

    Liu, Chaoyang; Xie, Tao; Chen, Chenjie; Luan, Aiping; Long, Jianmei; Li, Chuhao; Ding, Yaqi; He, Yehua

    2017-07-01

    The MYB proteins comprise one of the largest families of plant transcription factors, which are involved in various plant physiological and biochemical processes. Pineapple (Ananas comosus) is one of the three most important tropical fruits worldwide. The completion of pineapple genome sequencing provides a great opportunity to investigate the organization and evolutionary traits of pineapple MYB genes at the genome-wide level. In the present study, a total of 94 pineapple R2R3-MYB genes were identified and further phylogenetically classified into 26 subfamilies, as supported by the conserved gene structures and motif composition. Collinearity analysis indicated that segmental duplication events played a crucial role in the expansion of the pineapple MYB gene family. Further comparative phylogenetic analysis suggested that there have been functional divergences of the MYB gene family during plant evolution. RNA-seq data from different tissues and developmental stages revealed distinct temporal and spatial expression profiles of the AcMYB genes. Further quantitative expression analysis showed the specific expression patterns of the selected putative stress-related AcMYB genes in response to distinct abiotic stress and hormonal treatments. The comprehensive expression analysis of the pineapple MYB genes, especially the tissue-preferential and stress-responsive genes, could provide valuable clues for further functional characterization. In this work, we systematically identified AcMYB genes by analyzing the pineapple genome sequence using a set of bioinformatics approaches. Our findings provide a global insight into the organization, phylogeny and expression patterns of the pineapple R2R3-MYB genes, and hence contribute to the greater understanding of their biological roles in pineapple.

  7. From structure prediction to genomic screens for novel non-coding RNAs.

    Directory of Open Access Journals (Sweden)

    Jan Gorodkin

    2011-08-01

    Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.

  8. Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens

    Energy Technology Data Exchange (ETDEWEB)

    Condon, Bradford J.; Leng, Yueqiang; Wu, Dongliang; Bushley, Kathryn E.; Ohm, Robin A.; Otillar, Robert; Martin, Joel; Schackwitz, Wendy; Grimwood, Jane; MohdZainudin, NurAinlzzati; Xue, Chunsheng; Wang, Rui; Manning, Viola A.; Dhillon, Braham; Tu, Zheng Jin; Steffenson, Brian J.; Salamov, Asaf; Sun, Hui; Lowry, Steve; LaButti, Kurt; Han, James; Copeland, Alex; Lindquist, Erika; Barry, Kerrie; Schmutz, Jeremy; Baker, Scott E.; Ciuffetti, Lynda M.; Grigoriev, Igor V.; Zhong, Shaobin; Turgeon, B. Gillian

    2013-01-24

    The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25 higher than those between inbred lines and 50 lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveals a strong correlation with a role in virulence.

  9. Global Gene Expression Profiling of Human Genome Following Exposure to Sarin and Soman

    International Nuclear Information System (INIS)

    Gopalakrishnakone, P.; Pachiappan, A.; Srinivasan, K. N.; Loke, W. K.; Lee, F. K.

    2007-01-01

    Toxicogenomics, which merges genomics with toxicology, is a rapidly expanding field based on the assumption that the transcriptional responses of cells to different toxic exposures are sufficiently distinct, robust and reproducible to discriminate toxins from different families/classes; these responses can be called 'fingerprints' or 'atlases'. In this study, the chemical weapon sarin was studied in a time- and dose-dependent manner after exposure of a human neuroblastoma cell line. Sarin (GB) exerts its effect through inhibition of acetylcholinesterase activity and induction of delayed neurotoxicity in a dose- [EC50 50 ppm (around 372.4 μM)] and time-dependent manner. The effect and/or the mechanism of single or repeated exposures to GB, however, are less clear and yet to be explored at the cellular level. The present study aims to scrutinize the global gene expression profile following sarin toxicity in neuronal cells using Affymetrix GeneChips. A time-course study on the effect of a single (3 or 24 h) or repeated (24 or 48 h) doses of sarin (5 ppm) on SHSY5Y cells was carried out. Using GeneSpring (PCA) analysis, 550 genes whose expression was significantly (p less than 0.01) altered by at least 2.5-fold were selected. The results indicate that low-level single-dose exposure does not always parallel acute toxicity, but can cause a reversible down-regulation of genes and a range of anti-cholinesterase effects. In contrast, repeated doses produced persistent irreversible down-regulation of genes related to neurodegenerative mechanisms at 48 h. Real-time PCR and western blot analysis confirmed the reduced expression of presenilin 1 (TMP21), 2 and dopa decarboxylase (DDC) mRNA and proteins. Besides providing an in vitro experimental model for studies on the neuropathophysiology and brain cells, this investigation indicates possible mechanisms by which sarin could mediate neurodegeneration. A comparison will be made with a similar study with soman. (author)
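
    The gene selection step in the record above (p below 0.01 and at least a 2.5-fold change) can be sketched as a simple filter. This is a minimal illustration only: the record structure, the gene names and the expression values are hypothetical, and it does not reproduce the GeneSpring workflow used in the study.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str        # hypothetical gene label
    control: float   # mean expression in untreated cells (assumed > 0)
    treated: float   # mean expression in exposed cells (assumed > 0)
    p_value: float   # significance of the difference


def fold_change(control: float, treated: float) -> float:
    """Symmetric fold change: >= 1 for both up- and down-regulation."""
    ratio = treated / control
    return ratio if ratio >= 1 else 1 / ratio


def select_altered(results, min_fold=2.5, max_p=0.01):
    """Keep genes that pass both the significance and the fold-change cut-off."""
    return [
        r.gene
        for r in results
        if r.p_value < max_p and fold_change(r.control, r.treated) >= min_fold
    ]


# Made-up values for illustration, not data from the study.
demo = [
    GeneResult("geneA", 100.0, 30.0, 0.001),   # about 3.3-fold down, significant
    GeneResult("geneB", 100.0, 260.0, 0.004),  # 2.6-fold up, significant
    GeneResult("geneC", 100.0, 180.0, 0.0005), # significant but only 1.8-fold
    GeneResult("geneD", 100.0, 20.0, 0.2),     # 5-fold but not significant
]
print(select_altered(demo))
```

    Treating up- and down-regulation symmetrically is a design choice of this sketch; a real analysis would normally work on log-transformed values and correct for multiple testing.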

  10. Evidence-based annotation of the malaria parasite's genome using comparative expression profiling.

    Directory of Open Access Journals (Sweden)

    Yingyao Zhou

    2008-02-01

    A fundamental problem in systems biology and whole genome sequence analysis is how to infer functions for the many uncharacterized proteins that are identified, whether they are conserved across organisms of different phyla or are phylum-specific. This problem is especially acute in pathogens, such as malaria parasites, where genetic and biochemical investigations are likely to be more difficult. Here we perform comparative expression analysis on Plasmodium parasite life cycle data derived from P. falciparum blood, sporozoite, zygote and ookinete stages, and P. yoelii mosquito oocyst and salivary gland sporozoites, blood and liver stages and show that type II fatty acid biosynthesis genes are upregulated in liver and insect stages relative to asexual blood stages. We also show that some universally uncharacterized genes with orthologs in Plasmodium species, Saccharomyces cerevisiae and humans show coordinated transcription patterns in large collections of human and yeast expression data and that the function of the uncharacterized genes can sometimes be predicted based on the expression patterns across these diverse organisms. We also use a comprehensive and unbiased literature mining method to predict which uncharacterized parasite-specific genes are likely to have roles in processes such as gliding motility, host-cell interactions, sporozoite stage, or rhoptry function. These analyses, together with protein-protein interaction data, provide probabilistic models that predict the function of 926 uncharacterized malaria genes and also suggest that malaria parasites may provide a simple model system for the study of some human processes. These data also provide a foundation for further studies of transcriptional regulation in malaria parasites.

  11. Genome-Wide Identification and Transcriptome-Based Expression Profiling of the Sox Gene Family in the Nile Tilapia (Oreochromis niloticus).

    Science.gov (United States)

    Wei, Ling; Yang, Chao; Tao, Wenjing; Wang, Deshou

    2016-02-23

    The Sox transcription factor family is characterized by the presence of a Sry-related high-mobility group (HMG) box and plays important roles in various biological processes in animals, including sex determination and differentiation, and the development of multiple organs. In this study, 27 Sox genes were identified in the genome of the Nile tilapia (Oreochromis niloticus), and were classified into seven groups. The members of each group of the tilapia Sox genes exhibited a relatively conserved exon-intron structure. Comparative analysis showed that the Sox gene family has undergone an expansion in tilapia and other teleost fishes following their whole genome duplication, and group K only exists in teleosts. Transcriptome-based analysis demonstrated that most of the tilapia Sox genes presented stage-specific and/or sex-dimorphic expressions during gonadal development, and six of the group B Sox genes were specifically expressed in the adult brain. Our results provide a better understanding of gene structure and spatio-temporal expression of the Sox gene family in tilapia, and will be useful for further deciphering the roles of the Sox genes during sex determination and gonadal development in teleosts.

  12. Genome-Wide Identification and Transcriptome-Based Expression Profiling of the Sox Gene Family in the Nile Tilapia (Oreochromis niloticus)

    Directory of Open Access Journals (Sweden)

    Ling Wei

    2016-02-01

    The Sox transcription factor family is characterized by the presence of a Sry-related high-mobility group (HMG) box and plays important roles in various biological processes in animals, including sex determination and differentiation, and the development of multiple organs. In this study, 27 Sox genes were identified in the genome of the Nile tilapia (Oreochromis niloticus), and were classified into seven groups. The members of each group of the tilapia Sox genes exhibited a relatively conserved exon-intron structure. Comparative analysis showed that the Sox gene family has undergone an expansion in tilapia and other teleost fishes following their whole genome duplication, and group K only exists in teleosts. Transcriptome-based analysis demonstrated that most of the tilapia Sox genes presented stage-specific and/or sex-dimorphic expressions during gonadal development, and six of the group B Sox genes were specifically expressed in the adult brain. Our results provide a better understanding of gene structure and spatio-temporal expression of the Sox gene family in tilapia, and will be useful for further deciphering the roles of the Sox genes during sex determination and gonadal development in teleosts.

  13. Exploring the role of genome and structural ions in preventing viral capsid collapse during dehydration

    Science.gov (United States)

    Martín-González, Natalia; Guérin Darvas, Sofía M.; Durana, Aritz; Marti, Gerardo A.; Guérin, Diego M. A.; de Pablo, Pedro J.

    2018-03-01

    Even though viruses evolve mainly in liquid milieu, their horizontal transmission routes often include episodes of dry environment. Along their life cycle, some insect viruses, such as viruses from the Dicistroviridae family, withstand dehydrated conditions with presently unknown consequences to their structural stability. Here, we use atomic force microscopy to monitor the structural changes of viral particles of Triatoma virus (TrV) after desiccation. Our results demonstrate that TrV capsids preserve their genome inside, conserving their height after exposure to dehydrating conditions, which is in stark contrast with other viruses that expel their genome when desiccated. Moreover, empty capsids (without genome) resulted in collapsed particles after desiccation. We also explored the role of structural ions in the dehydration process of the virions (capsid containing genome) by chelating the accessible cations from the external solvent milieu. We observed that ion suppression helps to keep the virus height upon desiccation. Our results show that under drying conditions, the genome of TrV prevents the capsid from collapsing during dehydration, while the structural ions are responsible for promoting solvent exchange through the virion wall.

  14. Genome Wide Identification, Evolutionary, and Expression Analysis of VQ Genes from Two Pyrus Species.

    Science.gov (United States)

    Cao, Yunpeng; Meng, Dandan; Abdullah, Muhammad; Jin, Qing; Lin, Yi; Cai, Yongping

    2018-04-23

    The VQ motif-containing gene, a member of the plant-specific genes, is involved in the plant developmental process and various stress responses. The VQ motif-containing gene family has been studied in several plants, such as rice (Oryza sativa), maize (Zea mays), and Arabidopsis (Arabidopsis thaliana). However, no systematic study has been performed in Pyrus species, which have important economic value. In our study, we identified 41 and 28 VQ motif-containing genes in Pyrus bretschneideri and Pyrus communis, respectively. Phylogenetic trees were calculated using A. thaliana and O. sativa VQ motif-containing genes as a template, allowing us to categorize these genes into nine subfamilies. Thirty-two and eight paralogs of VQ motif-containing genes were found in P. bretschneideri and P. communis, respectively, showing that the VQ motif-containing genes had a more remarkable expansion in P. bretschneideri than in P. communis. A total of 31 orthologous pairs were identified from the P. bretschneideri and P. communis VQ motif-containing genes. Additionally, among the paralogs, we found that these duplicated gene pairs probably derived from segmental duplication/whole-genome duplication (WGD) events in the genomes of P. bretschneideri and P. communis, respectively. The gene expression profiles in both P. bretschneideri and P. communis fruits suggested functional redundancy for some orthologous gene pairs derived from a common ancestry, and sub-functionalization or neo-functionalization for some of them. Our study provided the first systematic evolutionary analysis of the VQ motif-containing genes in Pyrus, and highlighted the diversification and duplication of VQ motif-containing genes in both P. bretschneideri and P. communis.

  15. Genome-wide expression profiling during protection from colitis by regulatory T cells

    DEFF Research Database (Denmark)

    Kristensen, Nanna Ny; Olsen, Jørgen; Gad, Monika

    2008-01-01

    BACKGROUND: In the adoptive transfer model of colitis it has been shown that regulatory T cells (Treg) can hinder disease development and cure already existing mild colitis. The mechanisms underlying this regulatory effect of CD4(+)CD25(+) Tregs are not well understood. METHODS: To identify......Chip Mouse Genome 430 2.0 Array), which enabled an analysis of a complete set of RNA transcript levels in each sample. Array results were confirmed by real-time reverse-transcriptase polymerase chain reaction (RT-PCR). RESULTS: Data were analyzed using combined projections to latent structures and functional...... annotation analysis. The colitic samples were clearly distinguishable from samples from normal mice by a vast number of inflammation- and growth factor-related transcripts. In contrast, the Treg-protected animals could not be distinguished from either the normal BALB/c mice or the normal SCID mice. mRNA...

  16. Generation and analysis of expressed sequence tags in the extreme large genomes Lilium and Tulipa

    Directory of Open Access Journals (Sweden)

    Shahin Arwa

    2012-11-01

    Background: Bulbous flowers such as lily and tulip (Liliaceae family) are monocot perennial herbs that are economically very important ornamental plants worldwide. However, there are hardly any genetic studies performed and genomic resources are lacking. To build genomic resources and develop tools to speed up the breeding in both crops, next generation sequencing was implemented. We sequenced and assembled transcriptomes of four lily and five tulip genotypes using 454 pyro-sequencing technology. Results: Successfully, we developed the first set of 81,791 contigs with an average length of 514 bp for tulip, and enriched the very limited number of 3,329 available ESTs (Expressed Sequence Tags) for lily with 52,172 contigs with an average length of 555 bp. The contigs together with singletons covered on average 37% of lily and 39% of tulip estimated transcriptome. Mining lily and tulip sequence data for SSRs (Simple Sequence Repeats) showed that di-nucleotide repeats were twice more abundant in UTRs (UnTranslated Regions) compared to coding regions, while tri-nucleotide repeats were equally spread over coding and UTR regions. Two sets of single nucleotide polymorphism (SNP) markers suitable for high throughput genotyping were developed. In the first set, no SNPs flanking the target SNP (50 bp on either side) were allowed. In the second set, one SNP in the flanking regions was allowed, which resulted in a 2 to 3 fold increase in SNP marker numbers compared with the first set. Orthologous groups between the two flower bulbs: lily and tulip (12,017 groups) and among the three monocot species: lily, tulip, and rice (6,900 groups) were determined using OrthoMCL. Orthologous groups were screened for common SNP markers and EST-SSRs to study synteny between lily and tulip, which resulted in 113 common SNP markers and 292 common EST-SSR. Lily and tulip contigs generated were annotated and described according to Gene Ontology terminology. Conclusions

  17. Generation and analysis of expressed sequence tags in the extreme large genomes Lilium and Tulipa.

    Science.gov (United States)

    Shahin, Arwa; van Kaauwen, Martijn; Esselink, Danny; Bargsten, Joachim W; van Tuyl, Jaap M; Visser, Richard G F; Arens, Paul

    2012-11-20

    Bulbous flowers such as lily and tulip (Liliaceae family) are monocot perennial herbs that are economically very important ornamental plants worldwide. However, there are hardly any genetic studies performed and genomic resources are lacking. To build genomic resources and develop tools to speed up the breeding in both crops, next generation sequencing was implemented. We sequenced and assembled transcriptomes of four lily and five tulip genotypes using 454 pyro-sequencing technology. Successfully, we developed the first set of 81,791 contigs with an average length of 514 bp for tulip, and enriched the very limited number of 3,329 available ESTs (Expressed Sequence Tags) for lily with 52,172 contigs with an average length of 555 bp. The contigs together with singletons covered on average 37% of lily and 39% of tulip estimated transcriptome. Mining lily and tulip sequence data for SSRs (Simple Sequence Repeats) showed that di-nucleotide repeats were twice more abundant in UTRs (UnTranslated Regions) compared to coding regions, while tri-nucleotide repeats were equally spread over coding and UTR regions. Two sets of single nucleotide polymorphism (SNP) markers suitable for high throughput genotyping were developed. In the first set, no SNPs flanking the target SNP (50 bp on either side) were allowed. In the second set, one SNP in the flanking regions was allowed, which resulted in a 2 to 3 fold increase in SNP marker numbers compared with the first set. Orthologous groups between the two flower bulbs: lily and tulip (12,017 groups) and among the three monocot species: lily, tulip, and rice (6,900 groups) were determined using OrthoMCL. Orthologous groups were screened for common SNP markers and EST-SSRs to study synteny between lily and tulip, which resulted in 113 common SNP markers and 292 common EST-SSR. Lily and tulip contigs generated were annotated and described according to Gene Ontology terminology. Two transcriptome sets were built that are valuable
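
    The two SNP marker sets described above differ only in how many neighbouring SNPs are tolerated within 50 bp of the target SNP. A minimal sketch of that selection rule follows; the function name, the input format (a list of SNP positions on one contig) and the example positions are assumptions made for illustration, and the authors' actual pipeline is not described in the abstract.

```python
from bisect import bisect_left, bisect_right


def select_markers(snp_positions, flank=50, max_flanking=0):
    """Return target SNPs with at most `max_flanking` other SNPs
    within `flank` bp on either side (positions on a single contig).

    Assumes positions are unique; contig ends are ignored in this sketch.
    """
    positions = sorted(snp_positions)
    selected = []
    for pos in positions:
        lo = bisect_left(positions, pos - flank)
        hi = bisect_right(positions, pos + flank)
        neighbours = hi - lo - 1  # subtract the target SNP itself
        if neighbours <= max_flanking:
            selected.append(pos)
    return selected


# Hypothetical SNP positions on one contig.
contig_snps = [100, 130, 300, 420, 445, 460, 700]

strict = select_markers(contig_snps, max_flanking=0)   # first set: no flanking SNPs
relaxed = select_markers(contig_snps, max_flanking=1)  # second set: one flanking SNP
print(strict, relaxed)
```

    With these made-up positions the relaxed rule keeps twice as many markers as the strict one, which mirrors the direction (though not necessarily the size) of the increase reported in the abstract.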

  18. Combining genetical genomics and bulked segregant analysis differential expression: an approach to gene localization

    NARCIS (Netherlands)

    Chen, Xinwei; Hedley, P.E.; Morris, J.; Liu, Hui; Niks, R.E.; Waugh, R.

    2011-01-01

    Positional gene isolation in unsequenced species generally requires either a reference genome sequence or an inference of gene content based on conservation of synteny with a genomic model. In the large unsequenced genomes of the Triticeae cereals the latter, i.e. conservation of synteny with the

  19. Insights into the genome structure and copy-number variation of Eimeria tenella

    Directory of Open Access Journals (Sweden)

    Lim Lik-Sin

    2012-08-01

    method to improve the assembly of the genome of E. tenella from shotgun data, and to help reveal its overall structure. A preliminary assessment of copy-number variation (extra or missing copies of genomic segments) between strains of E. tenella was also carried out. The emerging picture is of a very unusual genome architecture displaying inter-strain copy-number variation. We suggest that these features may be related to the known ability of this parasite to rapidly develop drug resistance.

  20. Hybrid logic on linear structures: expressivity and complexity

    NARCIS (Netherlands)

    Franceschet, M.; de Rijke, M.; Schlingoff, B.-H.

    2003-01-01

    We investigate expressivity and complexity of hybrid logics on linear structures. Hybrid logics are an enrichment of modal logics with certain first-order features which are algorithmically well behaved. Therefore, they are well suited for the specification of certain properties of computational

  1. Genome-wide identification of WRKY transcription factors in kiwifruit (Actinidia spp.) and analysis of WRKY expression in responses to biotic and abiotic stresses.

    Science.gov (United States)

    Jing, Zhaobin; Liu, Zhande

    2018-04-01

    As one of the largest transcriptional factor families in plants, WRKY transcription factors play important roles in various biotic and abiotic stress responses. To date, WRKY genes in kiwifruit (Actinidia spp.) remain poorly understood. In our study, a total of 97 AcWRKY genes have been identified in the kiwifruit genome. An overview of these AcWRKY genes is analyzed, including the phylogenetic relationships, exon-intron structures, synteny and expression profiles. The 97 AcWRKY genes were divided into three groups based on the conserved WRKY domain. Synteny analysis indicated that segmental duplication events contributed to the expansion of the kiwifruit AcWRKY family. In addition, the synteny analysis between kiwifruit and Arabidopsis suggested that some of the AcWRKY genes were derived from common ancestors before the divergence of these two species. Conserved motifs outside the AcWRKY domain may reflect their functional conservation. Genome-wide segmental and tandem duplication were found, which may contribute to the expansion of AcWRKY genes. Furthermore, the analysis of selected AcWRKY genes showed a variety of expression patterns in five different organs as well as during biotic and abiotic stresses. The genome-wide identification and characterization of kiwifruit WRKY transcription factors provides insight into the evolutionary history and is a useful resource for further functional analyses of kiwifruit.

  2. Genomic Survey, Characterization, and Expression Profile Analysis of the SBP Genes in Pineapple (Ananas comosus L.).

    Science.gov (United States)

    Ali, Hina; Liu, Yanhui; Azam, Syed Muhammad; Rahman, Zia Ur; Priyadarshani, S V G N; Li, Weimin; Huang, Xinyu; Hu, Bingyan; Xiong, Junjie; Ali, Umair; Qin, Yuan

    2017-01-01

    Gene expression is regulated by transcription factors, which play significant roles in many developmental processes. SQUAMOSA promoter-binding proteins (SBP) perform a variety of regulatory functions in leaf, flower, and fruit development, plant architecture, and sporogenesis. Sixteen SBP genes were identified in pineapple and were divided into four groups on the basis of phylogenetic analysis. Five paralogs of SBP genes were identified in pineapple, with Ka/Ks ratios varying from 0.20 for AcSBP14 and AcSBP15 to 0.36 for AcSBP6 and AcSBP16. The 16 SBP genes were located on 12 of the 25 pineapple chromosomes and had highly conserved protein sequence structures. The isoionic points of SBP ranged from 6.05 to 9.57, while molecular weight varied from 22.7 to 121.9 kD. Expression profiles of SBP genes revealed expression of AcSBP7 and AcSBP15 in leaf; AcSBP13, AcSBP12, AcSBP8, AcSBP16, AcSBP9, and AcSBP11 in sepal; AcSBP6, AcSBP4, and AcSBP10 in stamen; and AcSBP14, AcSBP1, and AcSBP5 in fruit, while the rest of the genes showed low expression in the studied tissues. Four genes, that is, AcSBP11, AcSBP6, AcSBP4, and AcSBP12, were highly expressed at 4°C, while AcSBP16 was upregulated at 45°C. RNA-Seq was validated through qRT-PCR for some genes. Salt stress induced expression of two genes, that is, AcSBP7 and AcSBP14, while in drought stress, AcSBP12 and AcSBP15 were highly expressed. Our study lays a foundation for further gene function and expression studies of SBP genes in pineapple.

  3. GeneViTo: Visualizing gene-product functional and structural features in genomic datasets

    Directory of Open Access Journals (Sweden)

    Promponas Vasilis J

    2003-10-01

    Background: The availability of increasing amounts of sequence data from completely sequenced genomes boosts the development of new computational methods for automated genome annotation and comparative genomics. Therefore, there is a need for tools that facilitate the visualization of raw data and results produced by bioinformatics analysis, providing new means for interactive genome exploration. Visual inspection can be used as a basis to assess the quality of various analysis algorithms and to aid in-depth genomic studies. Results: GeneViTo is a JAVA-based computer application that serves as a workbench for genome-wide analysis through visual interaction. The application deals with various experimental information concerning both DNA and protein sequences (derived from public sequence databases or proprietary data sources) and meta-data obtained by various prediction algorithms, classification schemes or user-defined features. Interaction with a Graphical User Interface (GUI) allows easy extraction of genomic and proteomic data referring to the sequence itself, sequence features, or general structural and functional features. Emphasis is laid on the potential comparison between annotation and prediction data in order to offer a supplement to the provided information, especially in cases of "poor" annotation, or an evaluation of available predictions. Moreover, desired information can be output in high quality JPEG image files for further elaboration and scientific use. A compilation of properly formatted GeneViTo input data for demonstration is available to interested readers for two completely sequenced prokaryotes, Chlamydia trachomatis and Methanococcus jannaschii. Conclusions: GeneViTo offers an inspectional view of genomic functional elements, concerning data stemming both from database annotation and analysis tools for an overall analysis of existing genomes. The application is compatible with Linux or Windows ME-2000-XP operating

  4. Structure and expression of sulfatase and sulfatase modifying factor genes in the diamondback moth, Plutella xylostella.

    Science.gov (United States)

    Ma, Xiao-Li; He, Wei-Yi; Chen, Wei; Xu, Xue-Jiao; Qi, Wei-Ping; Zou, Ming-Min; You, Yan-Chun; Baxter, Simon W; Wang, Ping; You, Min-Sheng

    2017-06-01

    The diamondback moth, Plutella xylostella (L.), uses sulfatases (SULF) to counteract the glucosinolate-myrosinase defensive system that cruciferous plants have evolved to deter insect feeding. Sulfatase activity is regulated by post-translational modification of a cysteine residue by sulfatase modifying factor 1 (SUMF1). We identified 12 SULF genes (PxylSulfs) and two SUMF1 genes (PxylSumf1s) in the P. xylostella genome. Phylogenetic analysis of SULFs and SUMFs from P. xylostella, Bombyx mori, Manduca sexta, Heliconius melpomene, Danaus plexippus, Drosophila melanogaster, Tetranychus urticae and Homo sapiens showed that the SULFs were clustered into five groups, and the SUMFs could be divided into two groups. Profiling of the expression of PxylSulfs and PxylSumfs by RNA-seq and by quantitative real-time polymerase chain reaction showed that two glucosinolate sulfatase genes (GSS), PxylSulf2 and PxylSulf3, were primarily expressed in the midgut of 3rd- and 4th-instar larvae. Moreover, expression of the sulfatases PxylSulf2, PxylSulf3 and PxylSulf4 was correlated with expression of the sulfatase modifying factor PxylSumf1a. The findings from this study provide new insights into the structure and expression of SUMF1 and PxylSulf genes that are considered to be key factors for the evolutionary success of P. xylostella as a specialist herbivore of cruciferous plants. © 2017 Institute of Zoology, Chinese Academy of Sciences.

  5. Genome-wide identification and expression profiling of tomato Hsp20 gene family in response to biotic and abiotic stresses

    Directory of Open Access Journals (Sweden)

    Jiahong Yu

    2016-08-01

    The Hsp20 genes are involved in the response of plants to environmental stresses including heat shock and also play a vital role in plant growth and development. They represent the most abundant small heat shock proteins (sHsps) in plants, but little is known about this family in tomato (Solanum lycopersicum), an important vegetable crop in the world. Here, we characterized heat shock protein 20 (SlHsp20) gene family in tomato through integration of gene structure, chromosome location, phylogenetic relationship and expression profile. Using bioinformatics-based methods, we identified at least 42 putative SlHsp20 genes in tomato. Sequence analysis revealed that most of SlHsp20 genes possessed no intron or a relatively short intron in length. Chromosome mapping indicated that inter-arm and intra-chromosome duplication events contributed remarkably to the expansion of SlHsp20 genes. Phylogenetic tree of Hsp20 genes from tomato and other plant species revealed that SlHsp20 genes were grouped into 13 subfamilies, indicating that these genes may have a common ancestor that generated diverse subfamilies prior to the mono-dicot split. In addition, expression analysis using RNA-seq in various tissues and developmental stages of cultivated tomato and the wild relative Solanum pimpinellifolium revealed that most of these genes (83%) were expressed in at least one stage from at least one genotype. Out of 42 genes, 4 genes were expressed constitutively in almost all the tissues analyzed, implying that these genes might have specific housekeeping function in tomato cell under normal growth conditions. Two SlHsp20 genes displayed differential expression levels between cultivated tomato and S. pimpinellifolium in vegetative (leaf and root) and reproductive organs (floral bud and flower), suggesting inter-species diversification for functional specialization during the process of domestication. Based on genome-wide microarray analysis, we showed that the transcript

  6. Genome-Wide Identification and Expression Profiling of Tomato Hsp20 Gene Family in Response to Biotic and Abiotic Stresses.

    Science.gov (United States)

    Yu, Jiahong; Cheng, Yuan; Feng, Kun; Ruan, Meiying; Ye, Qingjing; Wang, Rongqing; Li, Zhimiao; Zhou, Guozhi; Yao, Zhuping; Yang, Yuejian; Wan, Hongjian

    2016-01-01

    The Hsp20 genes are involved in the response of plants to environmental stresses including heat shock and also play a vital role in plant growth and development. They represent the most abundant small heat shock proteins (sHsps) in plants, but little is known about this family in tomato (Solanum lycopersicum), an important vegetable crop in the world. Here, we characterized heat shock protein 20 (SlHsp20) gene family in tomato through integration of gene structure, chromosome location, phylogenetic relationship, and expression profile. Using bioinformatics-based methods, we identified at least 42 putative SlHsp20 genes in tomato. Sequence analysis revealed that most of SlHsp20 genes possessed no intron or a relatively short intron in length. Chromosome mapping indicated that inter-arm and intra-chromosome duplication events contributed remarkably to the expansion of SlHsp20 genes. Phylogenetic tree of Hsp20 genes from tomato and other plant species revealed that SlHsp20 genes were grouped into 13 subfamilies, indicating that these genes may have a common ancestor that generated diverse subfamilies prior to the mono-dicot split. In addition, expression analysis using RNA-seq in various tissues and developmental stages of cultivated tomato and the wild relative Solanum pimpinellifolium revealed that most of these genes (83%) were expressed in at least one stage from at least one genotype. Out of 42 genes, 4 genes were expressed constitutively in almost all the tissues analyzed, implying that these genes might have specific housekeeping function in tomato cell under normal growth conditions. Two SlHsp20 genes displayed differential expression levels between cultivated tomato and S. pimpinellifolium in vegetative (leaf and root) and reproductive organs (floral bud and flower), suggesting inter-species diversification for functional specialization during the process of domestication. Based on genome-wide microarray analysis, we showed that the transcript levels of SlHsp20

  7. CpG preconditioning regulates miRNA expression that modulates genomic reprogramming associated with neuroprotection against ischemic injury

    Science.gov (United States)

    Vartanian, Keri B; Mitchell, Hugh D; Stevens, Susan L; Conrad, Valerie K; McDermott, Jason E; Stenzel-Poore, Mary P

    2015-01-01

    Cytosine-phosphate-guanine (CpG) preconditioning reprograms the genomic response to stroke to protect the brain against ischemic injury. The mechanisms underlying genomic reprogramming are incompletely understood. MicroRNAs (miRNAs) regulate gene expression; however, their role in modulating gene responses produced by CpG preconditioning is unknown. We evaluated brain miRNA expression in response to CpG preconditioning before and after stroke using microarray. Importantly, we have data from previous gene microarrays under the same conditions, which allowed integration of miRNA and gene expression data to specifically identify regulated miRNA gene targets. CpG preconditioning did not significantly alter miRNA expression before stroke, indicating that miRNA regulation is not critical for the initiation of preconditioning-induced neuroprotection. However, after stroke, differentially regulated miRNAs between CpG- and saline-treated animals associated with the upregulation of several neuroprotective genes, implicating these miRNAs in genomic reprogramming that increases neuroprotection. Statistical analysis revealed that the miRNA targets were enriched in the gene population regulated in the setting of stroke, implying that miRNAs likely orchestrate this gene expression. These data suggest that miRNAs regulate endogenous responses to stroke and that manipulation of these miRNAs may have the potential to acutely activate novel neuroprotective processes that reduce damage. PMID:25388675

  8. Mutations in Cytosine-5 tRNA Methyltransferases Impact Mobile Element Expression and Genome Stability at Specific DNA Repeats

    Directory of Open Access Journals (Sweden)

    Bianca Genenncher

    2018-02-01

    The maintenance of eukaryotic genome stability is ensured by the interplay of transcriptional as well as post-transcriptional mechanisms that control recombination of repeat regions and the expression and mobility of transposable elements. We report here that mutations in two (cytosine-5) RNA methyltransferases, Dnmt2 and NSun2, impact the accumulation of mobile element-derived sequences and DNA repeat integrity in Drosophila. Loss of Dnmt2 function caused moderate effects under standard conditions, while heat shock exacerbated these effects. In contrast, NSun2 function affected mobile element expression and genome integrity in a heat shock-independent fashion. Reduced tRNA stability in both RCMT mutants indicated that tRNA-dependent processes affected mobile element expression and DNA repeat stability. Importantly, further experiments indicated that complex formation with RNA could also contribute to the impact of RCMT function on gene expression control. These results thus uncover a link between tRNA modification enzymes, the expression of repeat DNA, and genomic integrity.

  9. Recurrent RECQL4 Imbalance and Increased Gene Expression Levels Are Associated with Structural Chromosomal Instability in Sporadic Osteosarcoma

    Directory of Open Access Journals (Sweden)

    Georges Maire

    2009-03-01

    Full Text Available Osteosarcoma (OS) is an aggressive bone tumor with complex abnormal karyotypes and a highly unstable genome, exhibiting both numerical and structural chromosomal instability (N- and S-CIN). Chromosomal rearrangements and genomic imbalances affecting 8q24 are frequent in OS. The RECQL4 gene maps to this cytoband and encodes a putative helicase involved in the fidelity of DNA replication and repair. This protective genomic function of the protein is relevant because patients with Rothmund-Thomson syndrome often have constitutional mutations of RECQL4 and carry a very high risk of developing OS. To determine the relative level of expression of RECQL4 in OS, 18 sporadic tumors were studied by reverse transcription–polymerase chain reaction. All tumors overexpressed RECQL4 in comparison to control osteoblasts, and fluorescence in situ hybridization analysis of tumor DNA showed that expression levels were strongly copy number–dependent. Relative N- and S-CIN levels were determined by classifying copy number transitions within array comparative genomic hybridization profiles and by enumerating the frequency of break-apart fluorescence in situ hybridization within 8q24 using region-specific and control probes. Although there was no evidence that disruption of 8q24 in OS led to an elevated expression of RECQL4, there was a marked association between increased overall levels of S-CIN, determined by copy number transition frequency, and higher levels of RECQL4.

  10. SL1 revisited: functional analysis of the structure and conformation of HIV-1 genome RNA.

    Science.gov (United States)

    Sakuragi, Sayuri; Yokoyama, Masaru; Shioda, Tatsuo; Sato, Hironori; Sakuragi, Jun-Ichi

    2016-11-11

    The dimer initiation site/dimer linkage sequence (DIS/DLS) region of HIV is located at the 5' end of the viral genome and is suggested to form complex secondary/tertiary structures. Within this structure, stem-loop 1 (SL1) is believed to be the most important element and essential for dimerization, since the sequence and predicted secondary structure of SL1 are highly stable and conserved among various virus subtypes. In particular, a six-base palindromic sequence is always present at the hairpin loop of SL1, and the formation of a kissing-loop structure at this position between the two strands of genomic RNA is suggested to trigger dimerization. Although the higher-order structure model of SL1 is well accepted and lately perhaps even unquestioned, there could still be room to reconsider how the functional SL1 structure looks in vivo (in the virion or cell). In this study, we performed several analyses to identify the nucleotides and/or base pairing within SL1 that are necessary for HIV-1 genome dimerization, encapsidation, recombination and infectivity. We unexpectedly found that some nucleotides that are believed to contribute to the formation of the stem do not impact dimerization or infectivity. On the other hand, we found that one G-C base pair involved in stem formation may serve as an alternative dimer interactive site. We also report on our further investigation of the roles of the palindromic sequences in viral replication. Collectively, we aim to assemble a more comprehensive functional map of SL1 in the HIV-1 viral life cycle. We discovered several possibilities for a novel structure of SL1 in the HIV-1 DLS. The newly proposed structure model suggested that the hairpin loop of SL1 appeared larger, and that the genome dimerization process might involve a more complicated mechanism than previously understood. Further investigations are still required to fully understand the genome packaging and dimerization of HIV.

  11. Embryonic stem cell-like features of testicular carcinoma in situ revealed by genome-wide gene expression profiling

    DEFF Research Database (Denmark)

    Almstrup, Kristian; Hoei-Hansen, Christina E; Wirkner, Ute

    2004-01-01

    Carcinoma in situ (CIS) is the common precursor of histologically heterogeneous testicular germ cell tumors (TGCTs), which in recent decades have markedly increased and now are the most common malignancy of young men. Using genome-wide gene expression profiling, we identified >200 genes highly... in their stoichiometry on progression into embryonic carcinoma. We compared the CIS expression profile with patterns reported in embryonic stem cells (ESCs), which revealed a substantial overlap that may be as high as 50%. We also demonstrated an over-representation of expressed genes in regions of 17q and 12, reported...

  12. Gene order data from a model amphibian (Ambystoma): new perspectives on vertebrate genome structure and evolution

    Directory of Open Access Journals (Sweden)

    Voss S Randal

    2006-08-01

    Full Text Available Background: Because amphibians arise from a branch of the vertebrate evolutionary tree that is juxtaposed between fishes and amniotes, they provide an important comparative perspective for reconstructing character changes that have occurred during vertebrate evolution. Here, we report the first comparative study of vertebrate genome structure that includes a representative amphibian. We used 491 transcribed sequences from a salamander (Ambystoma) genetic map and whole genome assemblies for human, mouse, rat, dog, chicken, zebrafish, and the freshwater pufferfish Tetraodon nigroviridis to compare gene orders and rearrangement rates. Results: Ambystoma has experienced a rate of genome rearrangement that is substantially lower than that of mammalian species but similar to that of chicken and fish. Overall, we found greater conservation of genome structure between Ambystoma and tetrapod vertebrates; nevertheless, 57% of Ambystoma-fish orthologs are found in conserved syntenies of four or more genes. Comparisons between Ambystoma and amniotes reveal extensive conservation of segmental homology for 57% of the presumptive Ambystoma-amniote orthologs. Conclusion: Our analyses suggest relatively constant interchromosomal rearrangement rates from the euteleost ancestor to the origin of mammals and illustrate the utility of amphibian mapping data in establishing ancestral amniote and tetrapod gene orders. Comparisons between Ambystoma and amniotes reveal some of the key events that have structured the human genome since diversification of the ancestral amniote lineage.

  13. The genomic structure of the human UFO receptor.

    Science.gov (United States)

    Schulz, A S; Schleithoff, L; Faust, M; Bartram, C R; Janssen, J W

    1993-02-01

    Using a DNA transfection-tumorigenicity assay, we have recently identified the UFO oncogene. It encodes a tyrosine kinase receptor characterized by the juxtaposition of two immunoglobulin-like and two fibronectin type III repeats in its extracellular domain. Here we describe the genomic organization of the human UFO locus. The UFO receptor is encoded by 20 exons that are distributed over a region of 44 kb. Different isoforms of UFO mRNA are generated by alternative splicing of exon 10 and differential usage of two imperfect polyadenylation sites, resulting in the presence or absence of 1.5-kb 3' untranslated sequences. Primer extension and S1 nuclease analyses revealed multiple transcriptional initiation sites, including a major site 169 bp upstream of the translation start site. The promoter region is GC rich, lacks TATA and CAAT boxes, but contains potential recognition sites for a variety of trans-acting factors, including Sp1, AP-2 and the cyclic AMP response element-binding protein. Proto-UFO and its oncogenic counterpart exhibit identical cDNA and promoter region sequences. Possible modes of UFO activation are discussed.

  14. Alternative splicing and differential gene expression in colon cancer detected by a whole genome exon array

    Directory of Open Access Journals (Sweden)

    Sugnet Charles

    2006-12-01

    Full Text Available Background: Alternative splicing is a mechanism for increasing protein diversity by excluding or including exons during post-transcriptional processing. Alternatively spliced proteins are particularly relevant in oncology since they may contribute to the etiology of cancer, provide selective drug targets, or serve as a marker set for cancer diagnosis. While conventional identification of splice variants generally targets individual genes, we present here a new exon-centric array (GeneChip Human Exon 1.0 ST) that allows genome-wide identification of differential splice variation, and concurrently provides a flexible and inclusive analysis of gene expression. Results: We analyzed 20 paired tumor-normal colon cancer samples using a microarray designed to detect over one million putative exons that can be virtually assembled into potential gene-level transcripts according to various levels of prior supporting evidence. Analysis of high confidence (empirically supported) transcripts identified 160 differentially expressed genes, with 42 genes occupying a network impacting cell proliferation and another 29 genes with unknown functions. A more speculative analysis, including transcripts based solely on computational prediction, produced another 160 differentially expressed genes, three-fourths of which have no previous annotation. We also present a comparison of gene signal estimations from the Exon 1.0 ST and the U133 Plus 2.0 arrays. Novel splicing events were predicted by experimental algorithms that compare the relative contribution of each exon to the cognate transcript intensity in each tissue. The resulting candidate splice variants were validated with RT-PCR. We found nine genes that were differentially spliced between colon tumors and normal colon tissues, several of which have not been previously implicated in cancer. Top scoring candidates from our analysis were also found to substantially overlap with EST-based bioinformatic

  15. Protein structure similarity clustering (PSSC) and natural product structure as inspiration sources for drug development and chemical genomics

    NARCIS (Netherlands)

    Dekker, Frank J; Koch, Marcus A; Waldmann, Herbert; Dekker, Frans

    Finding small molecules that modulate protein function is of primary importance in drug development and in the emerging field of chemical genomics. To facilitate the identification of such molecules, we developed a novel strategy making use of structural conservatism found in protein domain

  16. A comparative genomics screen identifies a Sinorhizobium meliloti 1021 sodM-like gene strongly expressed within host plant nodules

    Directory of Open Access Journals (Sweden)

    Queiroux Clothilde

    2012-05-01

    Full Text Available Background: We have used the genomic data in the Integrated Microbial Genomes system of the Department of Energy’s Joint Genome Institute to make predictions about rhizobial open reading frames (ORFs) that play a role in nodulation of host plants. The genomic data were screened by searching for ORFs conserved in α-proteobacterial rhizobia, but not conserved in closely related non-nitrogen-fixing α-proteobacteria. Results: Using this approach, we identified many genes known to be involved in nodulation or nitrogen fixation, as well as several new candidate genes. We knocked out selected new genes and assayed for the presence of nodulation phenotypes and/or nodule-specific expression. One of these genes, SMc00911, is strongly expressed by bacterial cells within host plant nodules, but is expressed minimally by free-living bacterial cells. A strain carrying an insertion mutation in SMc00911 is not defective in the symbiosis with host plants, but, contrary to expectations, this mutant strain is able to out-compete the S. meliloti 1021 wild type strain for nodule occupancy in co-inoculation experiments. The SMc00911 ORF is predicted to encode a “SodM-like” (superoxide dismutase-like) protein containing a rhodanese sulfurtransferase domain at the N-terminus and a chromate-resistance superfamily domain at the C-terminus. Several other ORFs (SMb20360, SMc01562, SMc01266, SMc03964, and the SMc01424-22 operon) identified in the screen are expressed at a moderate level by bacteria within nodules, but not by free-living bacteria. Conclusions: Based on the analysis of ORFs identified in this study, we conclude that this comparative genomics approach can identify rhizobial genes involved in the nitrogen-fixing symbiosis with host plants, although none of the newly identified genes were found to be essential for this process.

  17. Sublethal Concentrations of Carbapenems Alter Cell Morphology and Genomic Expression of Klebsiella pneumoniae Biofilms

    Science.gov (United States)

    Van Laar, Tricia A.; Chen, Tsute; You, Tao

    2015-01-01

    Klebsiella pneumoniae, a Gram-negative bacterium, is normally associated with pneumonia in patients with weakened immune systems. However, it is also a prevalent nosocomial infectious agent that can be found in infected surgical sites and combat wounds. Many of these clinical strains display multidrug resistance. We have worked with a clinical strain of K. pneumoniae that was initially isolated from a wound of an injured soldier. This strain demonstrated resistance to many commonly used antibiotics but sensitivity to carbapenems. This isolate was capable of forming biofilms in vitro, contributing to its increased antibiotic resistance and impaired clearance. We were interested in determining how sublethal concentrations of carbapenem treatment specifically affect K. pneumoniae biofilms both in morphology and in genomic expression. Scanning electron microscopy showed striking morphological differences between untreated and treated biofilms, including rounding, blebbing, and dimpling of treated cells. Comparative transcriptome analysis using RNA sequencing (RNA-Seq) technology identified a large number of open reading frames (ORFs) differentially regulated in response to carbapenem treatment at 2 and 24 h. ORFs upregulated with carbapenem treatment included genes involved in resistance, as well as those coding for antiporters and autoinducers. ORFs downregulated included those coding for metal transporters, membrane biosynthesis proteins, and motility proteins. Quantitative real-time PCR validated the general trend of some of these differentially regulated ORFs. Treatment of K. pneumoniae biofilms with sublethal concentrations of carbapenems induced a wide range of phenotypic and gene expression changes. This study reveals some of the mechanisms underlying how sublethal amounts of carbapenems could affect the overall fitness and pathogenic potential of K. pneumoniae biofilm cells. PMID:25583711

  18. Photon-induced cell migration and integrin expression promoted by DNA integration of HPV16 genome

    International Nuclear Information System (INIS)

    Rieken, Stefan; Simon, Florian; Habermehl, Daniel; Dittmar, Jan Oliver; Combs, Stephanie E.; Weber, Klaus; Debus, Juergen; Lindel, Katja

    2014-01-01

    Persistent human papilloma virus 16 (HPV16) infections are a major cause of cervical cancer. The integration of the viral DNA into the host genome causes E2 gene disruption, which prevents apoptosis and increases host cell motility. In cervical cancer patients, survival is limited by local infiltration and systemic dissemination. Surgical control rates are poor in cases of parametrial infiltration. In these patients, radiotherapy (RT) is administered to enhance local control. However, photon irradiation itself has been reported to increase cell motility. In cases of E2-disrupted cervical cancers, this phenomenon would impose an additional risk of enhanced tumor cell motility. Here, we analyze mechanisms underlying photon-increased migration in keratinocytes with differential E2 gene status. Isogenic W12 (intact E2 gene status) and S12 (disrupted E2 gene status) keratinocytes were analyzed in fibronectin-based and serum-stimulated migration experiments following single photon doses of 0, 2, and 10 Gy. Quantitative FACS analyses of integrin expression were performed. Migration and adhesion are increased in E2 gene-disrupted keratinocytes. E2 gene disruption promotes attraction by serum components, thereby increasing the risk of local infiltration and systemic dissemination. In S12 cells, migration is further increased by photon RT, which leads to enhanced expression of fibronectin receptor integrins. HPV16-associated E2 gene disruption is a main predictor of treatment-refractory cancer virulence. E2 gene disruption promotes cell motility. Following photon RT, E2-disrupted tumors bear the risk of integrin-related infiltration and dissemination.

  19. Analysis of genomic aberrations and gene expression profiling identifies novel lesions and pathways in myeloproliferative neoplasms

    International Nuclear Information System (INIS)

    Rice, K L; Lin, X; Wolniak, K; Ebert, B L; Berkofsky-Fessler, W; Buzzai, M; Sun, Y; Xi, C; Elkin, P; Levine, R; Golub, T; Gilliland, D G; Crispino, J D; Licht, J D; Zhang, W

    2011-01-01

    Polycythemia vera (PV), essential thrombocythemia and primary myelofibrosis are myeloproliferative neoplasms (MPNs) with distinct clinical features and are associated with the JAK2V617F mutation. To identify genomic anomalies involved in the pathogenesis of these disorders, we profiled 87 MPN patients using Affymetrix 250K single-nucleotide polymorphism (SNP) arrays. Aberrations affecting chr9 were the most frequently observed and included 9pLOH (n=16), trisomy 9 (n=6) and amplifications of 9p13.3–23.3 (n=1), 9q33.1–34.13 (n=1) and 9q34.13 (n=6). Trisomy 9 was associated with elevated JAK2V617F mutant allele burden, suggesting that gain of chr9 represents an alternative mechanism for increasing JAK2V617F dosage. Gene expression profiling of patients with and without chr9 abnormalities (+9, 9pLOH) identified genes potentially involved in disease pathogenesis, including JAK2, STAT5B and MAPK14. We also observed recurrent gains of 1p36.31–36.33 (n=6), 17q21.2–q21.31 (n=5) and 17q25.1–25.3 (n=5) and deletions affecting 18p11.31–11.32 (n=8). Combined SNP and gene expression analysis identified aberrations affecting components of a non-canonical PRC2 complex (EZH1, SUZ12 and JARID2) and genes comprising an 'HSC signature' (MLLT3, SMARCA2 and PBX1). We show that NFIB, which is amplified in 7/87 MPN patients and upregulated in PV CD34+ cells, protects cells from apoptosis induced by cytokine withdrawal.

  20. Genome-wide survey and developmental expression mapping of zebrafish SET domain-containing genes.

    Directory of Open Access Journals (Sweden)

    Xiao-Jian Sun

    Full Text Available SET domain-containing proteins represent an evolutionarily conserved family of epigenetic regulators, which are responsible for most histone lysine methylation. Since some of these genes have been revealed to be essential for embryonic development, we propose that the zebrafish, a vertebrate model organism possessing many advantages for developmental studies, can be utilized to study the biological functions of these genes and the related epigenetic mechanisms during early development. To this end, we have performed a genome-wide survey of zebrafish SET domain genes. A total of 58 genes have been identified. Although gene duplication events give rise to several lineage-specific paralogs, clear reciprocal orthologous relationships reveal high conservation between zebrafish and human SET domain genes. These data were further subjected to an evolutionary analysis ranging from yeast to human, leading to the identification of putative clusters of orthologous groups (COGs) of this gene family. By means of a whole-mount mRNA in situ hybridization strategy, we have also carried out a developmental expression mapping of these genes. A group of maternal SET domain genes, which are implicated in the programming of histone modification states in early development, have been identified and predicted to be responsible for all known sites of SET domain-mediated histone methylation. Furthermore, some genes show specific expression patterns in certain tissues at certain stages, suggesting the involvement of epigenetic mechanisms in the development of these systems. These results provide a global view of zebrafish SET domain histone methyltransferases in evolutionary and developmental dimensions and pave the way for using zebrafish to systematically study the roles of these genes during development.

  1. Tree decomposition based fast search of RNA structures including pseudoknots in genomes.

    Science.gov (United States)

    Song, Yinglei; Liu, Chunmei; Malmberg, Russell; Pan, Fangfang; Cai, Liming

    2005-01-01

    Searching genomes for RNA secondary structure with computational methods has become an important approach to the annotation of non-coding RNAs. However, due to the lack of efficient algorithms for accurate RNA structure-sequence alignment, computer programs capable of searching genomes for RNA secondary structures quickly and effectively have not been available. In this paper, a novel RNA structure profiling model is introduced based on the notion of a conformational graph to specify the consensus structure of an RNA family. Tree decomposition yields a small tree width t for such conformational graphs (e.g., t = 2 for stem loops and only a slight increase for pseudoknots). Within this modelling framework, the optimal alignment of a sequence to the structure model corresponds to finding a maximum valued isomorphic subgraph and consequently can be accomplished through dynamic programming on the tree decomposition of the conformational graph in time O(k^t N^2), where k is a small parameter and N is the size of the profiled RNA structure. Experiments show that the application of the alignment algorithm to search in genomes yields the same search accuracy as methods based on a covariance model with a significant reduction in computation time. In particular, very accurate searches of tmRNAs in bacterial genomes and of telomerase RNAs in yeast genomes can be accomplished in days, as opposed to the months required by other methods. The tree decomposition based searching tool is free upon request and can be downloaded at our site http://w.uga.edu/RNA-informatics/software/index.php.

  2. Genome-wide characterization of pectin methyl esterase genes reveals members differentially expressed in tolerant and susceptible wheats in response to Fusarium graminearum.

    Science.gov (United States)

    Zega, Alessandra; D'Ovidio, Renato

    2016-11-01

    Pectin methyl esterase (PME) genes code for enzymes that are involved in structural modifications of the plant cell wall during plant growth and development. They are also involved in plant-pathogen interaction. PME genes belong to a multigene family, and in this study we report the first comprehensive analysis of the PME gene family in bread wheat (Triticum aestivum L.). As in other species, the members of the TaPME family are dispersed throughout the genome and their encoded products retain the typical structural features of PMEs. qRT-PCR analysis showed variation in the expression pattern of TaPME genes in different tissues and revealed that these genes are mainly expressed in flowering spikes. In our attempt to identify putative TaPME genes involved in wheat defense, we revealed a strong variation in the expression of the TaPME genes following infection with Fusarium graminearum, the causal agent of Fusarium head blight (FHB). Particularly interesting was the finding that the expression profile of some PME genes was markedly different between the FHB-resistant wheat cultivar Sumai3 and the FHB-susceptible cultivar Bobwhite, suggesting a possible involvement of these PME genes in FHB resistance. Moreover, the expression analysis of the TaPME genes during F. graminearum progression within the spike revealed those genes that responded more promptly to pathogen invasion. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  3. A genomic portrait of the genetic architecture and regulatory impact of microRNA expression in response to infection.

    Science.gov (United States)

    Siddle, Katherine J; Deschamps, Matthieu; Tailleux, Ludovic; Nédélec, Yohann; Pothlichet, Julien; Lugo-Villarino, Geanncarlo; Libri, Valentina; Gicquel, Brigitte; Neyrolles, Olivier; Laval, Guillaume; Patin, Etienne; Barreiro, Luis B; Quintana-Murci, Lluís

    2014-05-01

    MicroRNAs (miRNAs) are critical regulators of gene expression, and their role in a wide variety of biological processes, including host antimicrobial defense, is increasingly well described. Consistent with their diverse functional effects, miRNA expression is highly context dependent and shows marked changes upon cellular activation. However, the genetic control of miRNA expression in response to external stimuli and the impact of such perturbations on miRNA-mediated regulatory networks at the population level remain to be determined. Here we assessed changes in miRNA expression upon Mycobacterium tuberculosis infection and mapped expression quantitative trait loci (eQTL) in dendritic cells from a panel of healthy individuals. Genome-wide expression profiling revealed that ∼40% of miRNAs are differentially expressed upon infection. We find that the expression of 3% of miRNAs is controlled by proximate genetic factors, which are enriched in a promoter-specific histone modification associated with active transcription. Notably, we identify two infection-specific response eQTLs, for miR-326 and miR-1260, providing an initial assessment of the impact of genotype-environment interactions on miRNA molecular phenotypes. Furthermore, we show that infection coincides with a marked remodeling of the genome-wide relationships between miRNA and mRNA expression levels. This observation, supplemented by experimental data using the model of miR-29a, sheds light on the role of a set of miRNAs in cellular responses to infection. Collectively, this study increases our understanding of the genetic architecture of miRNA expression in response to infection, and highlights the wide-reaching impact of altering miRNA expression on the transcriptional landscape of a cell.

  4. Evolution of the Exon-Intron Structure in Ciliate Genomes.

    Directory of Open Access Journals (Sweden)

    Vladyslav S Bondarenko

    Full Text Available A typical eukaryotic gene is comprised of alternating stretches of regions, exons and introns, retained in and spliced out of a mature mRNA, respectively. Although the length of introns may vary substantially among organisms, a large fraction of genes contains short introns in many species. Notably, some Ciliates (Paramecium and Nyctotherus) possess only ultra-short introns, around 25 bp long. In Paramecium, ultra-short introns with length divisible by three (3n) are under strong evolutionary pressure and have a high frequency of in-frame stop codons, which, in the case of intron retention, cause premature termination of mRNA translation and consequent degradation of the mis-spliced mRNA by the nonsense-mediated decay mechanism. Here, we analyzed introns in five genera of Ciliates: Paramecium, Tetrahymena, Ichthyophthirius, Oxytricha, and Stylonychia. Introns can be classified into two length classes in Tetrahymena and Ichthyophthirius (with means 48 bp, 69 bp, and 55 bp, 64 bp, respectively), but, surprisingly, comprise three distinct length classes in Oxytricha and Stylonychia (with means 33-35 bp, 47-51 bp, and 78-80 bp). In most ranges of the intron lengths, 3n introns are underrepresented and have a high frequency of in-frame stop codons in all studied species. Introns of Paramecium, Tetrahymena, and Ichthyophthirius are preferentially located at the 5' and 3' ends of genes, whereas introns of Oxytricha and Stylonychia are strongly skewed towards the 5' end. Analysis of evolutionary conservation shows that, in each studied genome, a significant fraction of intron positions is conserved between the orthologs, but intron lengths are not correlated between the species. In summary, our study provides a detailed characterization of introns in several genera of Ciliates and highlights some of their distinctive properties, which, together, indicate that splicing spellchecking is a universal and evolutionarily conserved process in the biogenesis of short
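
    The two intron properties discussed in the abstract above (length divisible by three, and an in-frame stop codon that would terminate translation if the intron were retained) can be checked with a few lines of code. The sketch below is illustrative only and is not the authors' pipeline: the function names and the `phase` convention are assumptions, codons that straddle the exon-intron boundary are ignored, and the default stop set assumes the ciliate nuclear genetic code, in which TGA is the only stop codon.

```python
def in_frame_stop_positions(intron, phase=0, stop_codons=("TGA",)):
    """Return 0-based offsets of stop codons read in the upstream frame.

    phase: number of bases of a split codon contributed by the upstream
           exon (0, 1 or 2); hypothetical convention for this sketch.
    stop_codons: defaults to TGA only (assumed ciliate nuclear code);
           pass ("TAA", "TAG", "TGA") for the standard code.
    """
    seq = intron.upper()
    start = (3 - phase) % 3  # first full codon inside the intron
    return [
        i for i in range(start, len(seq) - 2, 3)
        if seq[i:i + 3] in stop_codons
    ]


def classify_intron(intron, phase=0, stop_codons=("TGA",)):
    """Summarise the properties relevant to retention of one intron."""
    return {
        "length": len(intron),
        "is_3n": len(intron) % 3 == 0,
        "has_in_frame_stop": bool(
            in_frame_stop_positions(intron, phase, stop_codons)
        ),
    }
```

    A 3n intron without an in-frame stop is the risky case: if retained, it would neither shift the frame nor terminate translation, so the mis-spliced transcript could escape nonsense-mediated decay.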

  5. Chloroplast genomes of Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea: Structures and comparative analysis.

    Science.gov (United States)

    Asaf, Sajjad; Khan, Abdul Latif; Khan, Muhammad Aaqil; Waqas, Muhammad; Kang, Sang-Mo; Yun, Byung-Wook; Lee, In-Jung

    2017-08-08

    We investigated the complete chloroplast (cp) genomes of non-model Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea using Illumina paired-end sequencing to understand their genetic organization and structure. Detailed bioinformatics analysis revealed genome sizes of both subspecies between 154.4 and 154.5 kbp, with a large single-copy region (84,197~84,158 bp), a small single-copy region (17,738~17,813 bp) and a pair of inverted repeats (IRa/IRb; 26,264~26,259 bp). Both cp genomes encode 130 genes, including 85 protein-coding genes, eight ribosomal RNA genes and 37 transfer RNA genes. Whole cp genome comparison of A. halleri ssp. gemmifera and A. lyrata ssp. petraea, along with ten other Arabidopsis species, showed an overall high degree of sequence similarity, with divergence among some intergenic spacers. The location and distribution of repeat sequences were determined, and sequence divergences of shared genes were calculated among related species. Comparative phylogenetic analysis of the entire genomic data set and 70 shared genes between both cp genomes confirmed the previous phylogeny and generated phylogenetic trees with the same topologies. The sister species of A. halleri ssp. gemmifera is A. umezawana, whereas the closest relative of A. lyrata ssp. petraea is A. arenicola.
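
    As a consistency check on the region sizes quoted above, the total length of a quadripartite chloroplast genome is the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The sketch below assumes that the first value of each quoted range belongs to one genome and the second value to the other; the abstract does not state this pairing, and the function and label names are illustrative.

```python
def cp_genome_length(lsc_bp, ssc_bp, ir_bp):
    """Total length of a quadripartite cp genome: LSC + SSC + IRa + IRb."""
    return lsc_bp + ssc_bp + 2 * ir_bp


# Region sizes in bp as quoted in the abstract; pairing is an assumption.
genomes = {
    "first values": (84_197, 17_738, 26_264),
    "second values": (84_158, 17_813, 26_259),
}

totals = {name: cp_genome_length(*parts) for name, parts in genomes.items()}
for name, total in totals.items():
    print(f"{name}: {total} bp ({total / 1000:.1f} kbp)")
```

    Under that pairing the totals are 154,463 bp and 154,489 bp, both within the 154.4 to 154.5 kbp range given in the abstract.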

  6. Elucidating the triplicated ancestral genome structure of radish based on chromosome-level comparison with the Brassica genomes.

    Science.gov (United States)

    Jeong, Young-Min; Kim, Namshin; Ahn, Byung Ohg; Oh, Mijin; Chung, Won-Hyong; Chung, Hee; Jeong, Seongmun; Lim, Ki-Byung; Hwang, Yoon-Jung; Kim, Goon-Bo; Baek, Seunghoon; Choi, Sang-Bong; Hyung, Dae-Jin; Lee, Seung-Won; Sohn, Seong-Han; Kwon, Soo-Jin; Jin, Mina; Seol, Young-Joo; Chae, Won Byoung; Choi, Keun Jin; Park, Beom-Seok; Yu, Hee-Ju; Mun, Jeong-Hwan

    2016-07-01

    This study presents a chromosome-scale draft genome sequence of radish that is assembled into nine chromosomal pseudomolecules. A comprehensive comparative genome analysis with the Brassica genomes provides genomic evidence on the evolution of the mesohexaploid radish genome. Radish (Raphanus sativus L.) is an agronomically important root vegetable crop, and its origin and phylogenetic position in the tribe Brassiceae are controversial. Here we present a comprehensive analysis of the radish genome based on the chromosome sequences of R. sativus cv. WK10039. The radish genome was sequenced and assembled into 426.2 Mb spanning >98% of the gene space, of which 344.0 Mb were integrated into nine chromosome pseudomolecules. Approximately 36% of the genome consisted of repetitive sequences, and 46,514 protein-coding genes were predicted and annotated. Comparative mapping of the tPCK-like ancestral genome revealed that the radish genome has intermediate characteristics between the Brassica A/C and B genomes in the triplicated segments, suggesting an internal origin from the genus Brassica. The evolutionary characteristics shared between radish and other Brassica species provided genomic evidence that the current form of nine chromosomes in radish was rearranged from the chromosomes of a hexaploid progenitor. Overall, this study provides a chromosome-scale draft genome sequence of radish as well as novel insight into the evolution of the mesohexaploid genomes in the tribe Brassiceae.

  7. Genome-Wide Identification of the Alba Gene Family in Plants and Stress-Responsive Expression of the Rice Alba Genes.

    Science.gov (United States)

    Verma, Jitendra Kumar; Wardhan, Vijay; Singh, Deepali; Chakraborty, Subhra; Chakraborty, Niranjan

    2018-03-28

    Architectural proteins play key roles in genome construction and regulate the expression of many genes, albeit the modulation of genome plasticity by these proteins is largely unknown. A critical screening of the architectural proteins in five crop species, viz., Oryza sativa, Zea mays, Sorghum bicolor, Cicer arietinum, and Vitis vinifera, and in the model plant Arabidopsis thaliana along with evolutionary relevant species such as Chlamydomonas reinhardtii, Physcomitrella patens, and Amborella trichopoda, revealed 9, 20, 10, 7, 7, 6, 1, 4, and 4 Alba (acetylation lowers binding affinity) genes, respectively. A phylogenetic analysis of the genes and of their counterparts in other plant species indicated evolutionary conservation and diversification. In each group, the structural components of the genes and motifs showed significant conservation. The chromosomal location of the Alba genes of rice (OsAlba) showed an unequal distribution on 8 of its 12 chromosomes. The expression profiles of the OsAlba genes indicated a distinct tissue-specific expression in the seedling, vegetative, and reproductive stages. The quantitative real-time PCR (qRT-PCR) analysis of the OsAlba genes confirmed their stress-inducible expression under multivariate environmental conditions and phytohormone treatments. The evaluation of the regulatory elements in 68 Alba genes from the 9 species studied led to the identification of conserved motifs and overlapping microRNA (miRNA) target sites, suggesting the conservation of their function in related proteins and a divergence in their biological roles across species. The 3D structure and the prediction of putative ligands and their binding sites for OsAlba proteins offered a key insight into the structure-function relationship. These results provide a comprehensive overview of the subtle genetic diversification of the OsAlba genes, which will help in elucidating their functional role in plants.

  8. Structural and sequence diversity of the transposon Galileo in the Drosophila willistoni genome.

    Science.gov (United States)

    Gonçalves, Juliana W; Valiati, Victor Hugo; Delprat, Alejandra; Valente, Vera L S; Ruiz, Alfredo

    2014-09-13

    Galileo is one of three members of the P superfamily of DNA transposons. It was originally discovered in Drosophila buzzatii, in which three segregating chromosomal inversions were shown to have been generated by ectopic recombination between Galileo copies. Subsequently, Galileo was identified in six of 12 sequenced Drosophila genomes, indicating its widespread distribution within this genus. Galileo is strikingly abundant in Drosophila willistoni, a neotropical species that is highly polymorphic for chromosomal inversions, suggesting a role for this transposon in the evolution of its genome. We carried out a detailed characterization of all Galileo copies present in the D. willistoni genome. A total of 191 copies, including 133 with two terminal inverted repeats (TIRs), were classified according to structure in six groups. The TIRs exhibited remarkable variation in their length and structure compared to the most complete copy. Three copies showed extended TIRs due to internal tandem repeats, the insertion of other transposable elements (TEs), or the incorporation of non-TIR sequences into the TIRs. Phylogenetic analyses of the transposase (TPase)-encoding and TIR segments yielded two divergent clades, which we termed Galileo subfamilies V and W. Target-site duplications (TSDs) in D. willistoni Galileo copies were 7- or 8-bp in length, with the consensus sequence GTATTAC. Analysis of the region around the TSDs revealed a target site motif (TSM) with a 15-bp palindrome that may give rise to a stem-loop secondary structure. There is a remarkable abundance and diversity of Galileo copies in the D. willistoni genome, although no functional copies were found. The TIRs in particular have a dynamic structure and extend in different ways, but their ends (required for transposition) are more conserved than the rest of the element. The D. willistoni genome harbors two Galileo subfamilies (V and W) that diverged ~9 million years ago and may have descended from an ancestral

  9. Structural genomic variation in childhood epilepsies with complex phenotypes

    DEFF Research Database (Denmark)

    Helbig, Ingo; Swinkels, Marielle E M; Aten, Emmelien

    2014-01-01

    of CNVs in patients with unclassified epilepsies and complex phenotypes. A total of 222 patients from three European countries, including patients with structural lesions on magnetic resonance imaging (MRI), dysmorphic features, and multiple congenital anomalies, were clinically evaluated and screened.......9%). Segregation of all identified variants could be assessed in 42 patients, 11 of which were de novo. The frequency of all structural variants and de novo variants was not statistically different between patients with or without MRI abnormalities or MRI subcategories. Patients with dysmorphic features were more...

  10. Genome-wide identification, characterisation and expression analysis of the MADS-box gene family in Prunus mume.

    Science.gov (United States)

    Xu, Zongda; Zhang, Qixiang; Sun, Lidan; Du, Dongliang; Cheng, Tangren; Pan, Huitang; Yang, Weiru; Wang, Jia

    2014-10-01

    MADS-box genes encode transcription factors that play crucial roles in plant development, especially in flower and fruit development. To gain insight into this gene family in Prunus mume, an important ornamental and fruit plant in East Asia, and to elucidate their roles in flower organ determination and fruit development, we performed a genome-wide identification, characterisation and expression analysis of MADS-box genes in this Rosaceae tree. In this study, 80 MADS-box genes were identified in P. mume and categorised into MIKC, Mα, Mβ, Mγ and Mδ groups based on gene structures and phylogenetic relationships. The MIKC group could be further classified into 12 subfamilies. The FLC subfamily was absent in P. mume and the six tandemly arranged DAM genes might experience a species-specific evolution process in P. mume. The MADS-box gene family might experience an evolution process from MIKC genes to Mδ genes to Mα, Mβ and Mγ genes. The expression analysis suggests that P. mume MADS-box genes have diverse functions in P. mume development and the functions of duplicated genes diverged after the duplication events. In addition to its involvement in the development of female gametophytes, type I genes also play roles in male gametophytes development. In conclusion, this study adds to our understanding of the roles that the MADS-box genes played in flower and fruit development and lays a foundation for selecting candidate genes for functional studies in P. mume and other species. Furthermore, this study also provides a basis to study the evolution of the MADS-box family.

  11. The poplar phi class glutathione transferase: expression, activity and structure of GSTF1

    Directory of Open Access Journals (Sweden)

    Henri Pégeot

    2014-12-01

    Glutathione transferases (GSTs) constitute a superfamily of enzymes with essential roles in cellular detoxification and secondary metabolism in plants as in other organisms. Several plant GSTs, including those of the Phi class (GSTFs), require a conserved catalytic serine residue to perform glutathione (GSH)-conjugation reactions. Genomic analyses revealed that terrestrial plants have around 10 GSTFs, 8 in the Populus trichocarpa genome, but their physiological functions and substrates are mostly unknown. Transcript expression analyses showed a predominant expression of all genes both in reproductive (female flowers, fruits, floral buds) and vegetative organs (leaves, petioles). Here, we show that the recombinant poplar GSTF1 (PttGSTF1) possesses peroxidase activity towards cumene hydroperoxide and GSH-conjugation activity towards model substrates such as 2,4-dinitrochlorobenzene, benzyl and phenethyl isothiocyanate, 4-nitrophenyl butyrate and 4-hydroxy-2-nonenal but interestingly not on previously identified GSTF-class substrates. In accordance with analytical gel filtration data, the crystal structure of PttGSTF1 showed a canonical dimeric organization with bound GSH or MES molecules. The structure of these protein-substrate complexes allowed delineating the residues contributing to both the G and H sites that form the active site cavity. In sum, the presence of GSTF1 transcripts and proteins in most poplar organs, especially those rich in secondary metabolites such as flowers and fruits, together with its GSH-conjugation activity and its documented stress-responsive expression, suggests that its function is associated with the catalytic transformation of metabolites and/or peroxide removal rather than with ligandin properties as previously reported for other GSTFs.

  12. Deciphering the genomic structure, function and evolution of carotenogenesis related phytoene synthases in grasses

    Directory of Open Access Journals (Sweden)

    Dibari Bianca

    2012-06-01

    Background: Carotenoids are isoprenoid pigments, essential for photosynthesis and photoprotection in plants. The enzyme phytoene synthase (PSY) plays an essential role in mediating condensation of two geranylgeranyl diphosphate molecules, the first committed step in carotenogenesis. PSY are nuclear enzymes encoded by a small gene family consisting of three paralogous genes (PSY1-3) that have been widely characterized in rice, maize and sorghum. Results: In wheat, for which yellow pigment content is extremely important for flour colour, only PSY1 has been extensively studied because of its association with QTLs reported for yellow pigment, whereas PSY2 has been partially characterized. Here, we report the isolation of bread wheat PSY3 genes from a Renan BAC library using Brachypodium as a model genome for the Triticeae to develop Conserved Orthologous Set markers prior to gene cloning and sequencing. Wheat PSY3 homoeologous genes were sequenced and annotated, unravelling their novel structure associated with intron-loss events and consequent exonic fusions. A wheat PSY3 promoter region was also investigated for the presence of cis-acting elements involved in the response to abscisic acid (ABA), since carotenoids also play an important role as precursors of signalling molecules devoted to plant development and biotic/abiotic stress responses. Expression of wheat PSYs in leaves and roots was investigated during ABA treatment to confirm the up-regulation of PSY3 during abiotic stress. Conclusions: We investigated the structural and functional determinisms of PSY genes in wheat. More generally, among eudicots and monocots, the PSY gene family was found to be associated with differences in gene copy numbers, allowing us to propose an evolutionary model for the entire PSY gene family in Grasses.

  13. Integrative genome-wide expression profiling identifies three distinct molecular subgroups of renal cell carcinoma with different patient outcome

    Directory of Open Access Journals (Sweden)

    Beleut Manfred

    2012-07-01

    Background: Renal cell carcinoma (RCC) is characterized by a number of diverse molecular aberrations that differ among individuals. Recent approaches to molecularly classify RCC were based on clinical, pathological as well as on single molecular parameters. As a consequence, gene expression patterns reflecting the sum of genetic aberrations in individual tumors may not have been recognized. In an attempt to uncover such molecular features in RCC, we used a novel, unbiased and integrative approach. Methods: We integrated gene expression data from 97 primary RCC of different pathologic parameters, 15 RCC metastases as well as 34 cancer cell lines for two-way nonsupervised hierarchical clustering using gene groups suggested by the PANTHER Classification System. We depicted the genomic landscape of the resulting tumor groups by means of single nucleotide polymorphism (SNP) technology. Finally, the achieved results were immunohistochemically analyzed using a tissue microarray (TMA) composed of 254 RCC. Results: We found robust, genome-wide expression signatures, which split RCC into three distinct molecular subgroups. These groups remained stable even if randomly selected gene sets were clustered. Notably, the pattern obtained from RCC cell lines was clearly distinguishable from that of primary tumors. SNP array analysis demonstrated differing frequencies of chromosomal copy number alterations among RCC subgroups. TMA analysis with group-specific markers showed a prognostic significance of the different groups. Conclusion: We propose the existence of characteristic and histologically independent genome-wide expression outputs in RCC with potential biological and clinical relevance.

  14. Dissecting inflammatory complications in critically injured patients by within-patient gene expression changes: a longitudinal clinical genomics study.

    Directory of Open Access Journals (Sweden)

    Keyur H Desai

    2011-09-01

    Trauma is the number one killer of individuals 1-44 y of age in the United States. The prognosis and treatment of inflammatory complications in critically injured patients continue to be challenging, with a history of failed clinical trials and poorly understood biology. New approaches are therefore needed to improve our ability to diagnose and treat this clinical condition. We conducted a large-scale study on 168 blunt-force trauma patients over 28 d, measuring ∼400 clinical variables and longitudinally profiling leukocyte gene expression with ∼800 microarrays. Marshall MOF (multiple organ failure) clinical score trajectories were first utilized to organize the patients into five categories of increasingly poor outcomes. We then developed an analysis framework modeling early within-patient expression changes to produce a robust characterization of the genomic response to trauma. A quarter of the genome shows early expression changes associated with longer-term post-injury complications, captured by at least five dynamic co-expression modules of functionally related genes. In particular, early down-regulation of MHC-class II genes and up-regulation of p38 MAPK signaling pathway were found to strongly associate with longer-term post-injury complications, providing discrimination among patient outcomes from expression changes during the 40-80 h window post-injury. The genomic characterization provided here substantially expands the scope by which the molecular response to trauma may be characterized and understood. These results may be instrumental in furthering our understanding of the disease process and identifying potential targets for therapeutic intervention. Additionally, the quantitative approach we have introduced is potentially applicable to future genomics studies of rapidly progressing clinical conditions. ClinicalTrials.gov: NCT00257231

  15. Structured RNAs and synteny regions in the pig genome

    DEFF Research Database (Denmark)

    Anthon, Christian; Tafer, Hakim; Havgaard, Jakob Hull

    2014-01-01

    annotation. To further enhance the reliability, 571 of the 3,556 structured RNAs were manually curated by methods depending on the RNA class while 1,581 were declared as pseudogenes. We further created a multiple alignment of pig against 20 representative vertebrates, from which RNAz predicted 83,859 de novo...

  16. Distinct gene subsets in pterygia formation and recurrence: dissecting complex biological phenomenon using genome wide expression data

    Directory of Open Access Journals (Sweden)

    Ang Leonard PK

    2009-03-01

    Background: Pterygium is a common ocular surface disease characterized by fibrovascular invasion of the cornea and is sight-threatening due to astigmatism, tear film disturbance, or occlusion of the visual axis. However, the mechanisms for formation and post-surgical recurrence of pterygium are not understood, and a valid animal model does not exist. Here, we investigated the possible mechanisms of pterygium pathogenesis and recurrence. Methods: First we performed a genome-wide expression analysis (human Affymetrix Genechip, >22000 genes) with principal component analysis and clustering techniques, and validated expression of key molecules with PCR. The controls for this study were the un-involved conjunctival tissue of the same eye obtained during the surgical resection of the lesions. Interesting molecules were further investigated with immunohistochemistry, Western blots, and comparison with tear proteins from pterygium patients. Results: Principal component analysis in pterygium indicated a signature of matrix-related structural proteins, including fibronectin-1 (both splice-forms), collagen-1A2, keratin-12 and small proline rich protein-1. Immunofluorescence showed strong expression of keratin-6A in all layers, especially the superficial layers, of pterygium epithelium, but absent in the control, with up-regulation and nuclear accumulation of the cell adhesion molecule CD24 in the pterygium epithelium. Western blot showed increased protein expression of beta-microseminoprotein, a protein up-regulated in human cutaneous squamous cell carcinoma. Gene products of 22 up-regulated genes in pterygium have also been found by us in human tears using nano-electrospray-liquid chromatography/mass spectrometry after pterygium surgery. Recurrent disease was associated with up-regulation of sialophorin, a negative regulator of cell adhesion, and never in mitosis a-5, known to be involved in cell motility. Conclusion: Aberrant wound healing is therefore

  17. Genomes in Turmoil: Frugality Drives Microbial Community Structure in Extremely Acidic Environments

    Science.gov (United States)

    Holmes, D. S.

    2016-12-01

    Extremely acidic environments (To gain insight into these issues, we have conducted deep bioinformatic analyses, including metabolic reconstruction of key assimilatory pathways, phylogenomics and network scrutiny of >160 genomes of acidophiles, including representatives from Archaea, Bacteria and Eukarya and at least ten metagenomes of acidic environments [Cardenas JP, et al. pp 179-197 in Acidophiles, eds R. Quatrini and D. B. Johnson, Caister Academic Press, UK (2016)]. Results yielded valuable insights into cellular processes, including carbon and nitrogen management and energy production, linking biogeochemical processes to organismal physiology. They also provided insight into the evolutionary forces that shape the genomic structure of members of acidophile communities. Niche partitioning can explain diversity patterns in rapidly changing acidic environments such as bioleaching heaps. However, in spatially and temporally homogeneous acidic environments genome flux appears to provide deeper insight into the composition and evolution of acidic consortia. Acidophiles have undergone genome streamlining by gene loss promoting mutual coexistence of species that exploit complementarity use of scarce resources consistent with the Black Queen hypothesis [Morris JJ et al. mBio 3: e00036-12 (2012)]. Acidophiles also have a large pool of accessory genes (the microbial super-genome) that can be accessed by horizontal gene transfer. This further promotes dependency relationships as drivers of community structure and the evolution of keystone species. Acknowledgements: Fondecyt 1130683; Basal CCTE PFB16

  18. Candidate genes revealed by a genome scan for mosquito resistance to a bacterial insecticide: sequence and gene expression variations

    Directory of Open Access Journals (Sweden)

    David Jean-Philippe

    2009-11-01

    Background: Genome scans are becoming an increasingly popular approach to study the genetic basis of adaptation and speciation, but on their own, they are often unable to identify the specific gene(s) or mutation(s) targeted by selection. This shortcoming is hopefully bound to disappear in the near future, thanks to the wealth of new genomic resources that are currently being developed for many species. In this article, we provide a foretaste of this exciting new era by conducting a genome scan in the mosquito Aedes aegypti with the aim to look for candidate genes involved in resistance to Bacillus thuringiensis subsp. israelensis (Bti) insecticidal toxins. Results: The genomes of a Bti-resistant and a Bti-susceptible strain were surveyed using about 500 MITE-based molecular markers, and the loci showing the highest inter-strain genetic differentiation were sequenced and mapped on the Aedes aegypti genome sequence. Several good candidate genes for Bti-resistance were identified in the vicinity of these highly differentiated markers. Two of them, coding for a cadherin and a leucine aminopeptidase, were further examined at the sequence and gene expression levels. In the resistant strain, the cadherin gene displayed patterns of nucleotide polymorphisms consistent with the action of positive selection (e.g. an excess of high compared to intermediate frequency mutations), as well as a significant under-expression compared to the susceptible strain. Conclusion: Both sequence and gene expression analyses agree to suggest a role for positive selection in the evolution of this cadherin gene in the resistant strain. However, it is unlikely that resistance to Bti is conferred by this gene alone, and further investigation will be needed to characterize other genes significantly associated with Bti resistance in Ae. aegypti. Beyond these results, this article illustrates how genome scans can build on the body of new genomic information (here, full

  19. Genome-Wide Identification and Expression Analysis of the Biotin Carboxyl Carrier Subunits of Heteromeric Acetyl-CoA Carboxylase in Gossypium

    Directory of Open Access Journals (Sweden)

    Jinping Hua

    2017-05-01

    Acetyl-CoA carboxylase is an important enzyme that catalyzes the carboxylation of acetyl-CoA to produce malonyl-CoA, the committed step of de novo fatty acid biosynthesis in plastids. In this study, 24 putative cotton BCCP genes were identified based on the recently published genome data in Gossypium. Among them, 4, 4, 8, and 8 BCCP homologs were identified in Gossypium raimondii, G. arboreum, G. hirsutum, and G. barbadense, respectively. These genes were divided into two classes based on a phylogenetic analysis. In each class, these homologs were relatively conserved in gene structure and motifs. The chromosomal distribution pattern revealed that all the BCCP genes were distributed equally on corresponding chromosomes or scaffolds in the four cotton species. Segmental duplication was the predominant duplication event in both G. hirsutum and G. barbadense. The analysis of the expression profile showed that 8 GhBCCP genes were expressed in all the tested tissues at varying expression levels, and GhBCCP genes belonging to class II were predominantly expressed in developing ovules. Meanwhile, the expression analysis for the 16 cotton BCCP genes from G. raimondii, G. arboreum and G. hirsutum showed that they were induced or suppressed by cold or salt stress, and their expression patterns varied among different tissues. These findings will help to determine the functional and evolutionary characteristics of the BCCP genes in Gossypium species.

  20. A high-quality human reference panel reveals the complexity and distribution of genomic structural variants

    NARCIS (Netherlands)

    Hehir-Kwa, J.Y.; Marschall, T.; Kloosterman, W.P.; Francioli, L.C.; Baaijens, J.A.; Dijkstra, L.J.; Abdellaoui, A.; Koval, V.; Thung, D.T.; Wardenaar, R.; Renkens, I.; Coe, B.P.; Deelen, P.; de Ligt, J.; Lameijer, E.W.; Dijk, F.; Hormozdiari, F.; Uitterlinden, A.G.; van Duijn, C.M.; Eichler, E.E.; Bakker, P.I.W.; Swertz, M.A.; Wijmenga, C.; van Ommen, G.J.B; Slagboom, P.E.; Boomsma, D.I.; Schönhuth, A.; Ye, K.; Guryev, V.

    2016-01-01

    Structural variation (SV) represents a major source of differences between individual human genomes and has been linked to disease phenotypes. However, the majority of studies provide neither a global view of the full spectrum of these variants nor integrate them into reference panels of genetic

  1. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes.

    Science.gov (United States)

    Parker, Brian J; Moltke, Ida; Roth, Adam; Washietl, Stefan; Wen, Jiayu; Kellis, Manolis; Breaker, Ronald; Pedersen, Jakob Skou

    2011-11-01

    Regulatory RNA structures are often members of families with multiple paralogous instances across the genome. Family members share functional and structural properties, which allow them to be studied as a whole, facilitating both bioinformatic and experimental characterization. We have developed a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we identify potential new regulatory networks, including large families of short hairpins enriched in immunity-related genes, e.g., TNF, FOS, and CTLA4, which include known transcript destabilizing elements. Our findings exemplify the diversity of post-transcriptional regulation and provide a resource for further characterization of new regulatory mechanisms and families of noncoding RNAs.

  2. Genome wide gene expression regulation by HIP1 Protein Interactor, HIPPI: Prediction and validation

    Directory of Open Access Journals (Sweden)

    Lahiri Ansuman

    2011-09-01

    Background: HIP1 Protein Interactor (HIPPI) is a pro-apoptotic protein that induces Caspase8-mediated apoptosis in cells. We have shown earlier that HIPPI could interact with a specific 9 bp sequence motif, defined as the HIPPI binding site (HBS), present in the upstream promoter of the Caspase1 gene and regulate its expression. We have also shown that HIPPI, without any known nuclear localization signal, could be transported to the nucleus by HIP1, an NLS-containing nucleo-cytoplasmic shuttling protein. Thus our present work aims at the investigation of the role of HIPPI as a global transcription regulator. Results: We carried out a genome-wide search for the presence of HBS in the upstream sequences of genes. Our result suggests that HBS was predominantly located within 2 Kb upstream from the transcription start site. Transcription factors like CREBP1, TBP, OCT1, EVI1 and P53 half site were significantly enriched in the 100 bp vicinity of HBS, indicating that they might co-operate with HIPPI for transcription regulation. To illustrate the role of HIPPI on the transcriptome, we performed gene expression profiling by microarray. Exogenous expression of HIPPI in HeLa cells resulted in up-regulation of 580 genes (p HIP1 was knocked down. HIPPI-P53 interaction was necessary for HIPPI-mediated up-regulation of the Caspase1 gene. Finally, we analyzed published microarray data obtained with post mortem brains of Huntington's disease (HD) patients to investigate the possible involvement of HIPPI in HD pathogenesis. We observed that along with the transcription factors like CREB, P300, SREBP1, Sp1 etc. which are already known to be involved in HD, the HIPPI binding site was also significantly over-represented in the upstream sequences of genes altered in HD. Conclusions: Taken together, the results suggest that HIPPI could act as an important transcription regulator in cell regulating a vast array of genes, particularly transcription factors and at least, in part, play a

  3. Ectopic Expression of O Antigen in Bordetella pertussis by a Novel Genomic Integration System.

    Science.gov (United States)

    Ishigaki, Keisuke; Shinzawa, Naoaki; Nishikawa, Sayaka; Suzuki, Koichiro; Fukui-Miyazaki, Aya; Horiguchi, Yasuhiko

    2018-01-01

    We describe a novel genome integration system that enables the introduction of DNA fragments as large as 50 kbp into the chromosomes of recipient bacteria. This system, named BPI, comprises a bacterial artificial chromosome vector and phage-derived gene integration machinery. We introduced the wbm locus of Bordetella bronchiseptica, which is required for O antigen biosynthesis, into the chromosome of B. pertussis, which intrinsically lacks O antigen, using the BPI system. After the introduction of the wbm locus, B. pertussis presented an additional substance in the lipooligosaccharide fraction that was specifically recognized by the anti-B. bronchiseptica antibody but not the anti-B. pertussis antibody, indicating that B. pertussis expressed O antigen corresponding to that of B. bronchiseptica. O antigen-expressing B. pertussis was less sensitive to the bactericidal effects of serum and polymyxin B than the isogenic parental strain. In addition, an in vivo competitive infection assay showed that O antigen-expressing B. pertussis dominantly colonized the mouse respiratory tract over the parental strain. These results indicate that the BPI system provides a means to alter the phenotypes of bacteria by introducing large exogenous DNA fragments. IMPORTANCE Some bacterial phenotypes emerge through the cooperative functions of a number of genes residing within a large genetic locus. To transfer the phenotype of one bacterium to another, a means to introduce the large genetic locus into the recipient bacterium is needed. Therefore, we developed a novel system by combining the advantages of a bacterial artificial chromosome vector and phage-derived gene integration machinery. In this study, we succeeded for the first time in introducing a gene locus involved in O antigen biosynthesis of Bordetella bronchiseptica into the chromosome of B. pertussis, which intrinsically lacks O antigen, and using this system we analyzed phenotypic alterations in the resultant

  4. Genome-wide Annotation, Identification, and Global Transcriptomic Analysis of Regulatory or Small RNA Gene Expression in Staphylococcus aureus.

    Science.gov (United States)

    Carroll, Ronan K; Weiss, Andy; Broach, William H; Wiemels, Richard E; Mogen, Austin B; Rice, Kelly C; Shaw, Lindsey N

    2016-02-09

    In Staphylococcus aureus, hundreds of small regulatory or small RNAs (sRNAs) have been identified, yet this class of molecule remains poorly understood and severely understudied. sRNA genes are typically absent from genome annotation files, and as a consequence, their existence is often overlooked, particularly in global transcriptomic studies. To facilitate improved detection and analysis of sRNAs in S. aureus, we generated updated GenBank files for three commonly used S. aureus strains (MRSA252, NCTC 8325, and USA300), in which we added annotations for >260 previously identified sRNAs. These files, the first to include genome-wide annotation of sRNAs in S. aureus, were then used as a foundation to identify novel sRNAs in the community-associated methicillin-resistant strain USA300. This analysis led to the discovery of 39 previously unidentified sRNAs. Investigating the genomic loci of the newly identified sRNAs revealed a surprising degree of inconsistency in genome annotation in S. aureus, which may be hindering the analysis and functional exploration of these elements. Finally, using our newly created annotation files as a reference, we perform a global analysis of sRNA gene expression in S. aureus and demonstrate that the newly identified tsr25 is the most highly upregulated sRNA in human serum. This study provides an invaluable resource to the S. aureus research community in the form of our newly generated annotation files, while at the same time presenting the first examination of differential sRNA expression in pathophysiologically relevant conditions. Despite a large number of studies identifying regulatory or small RNA (sRNA) genes in Staphylococcus aureus, their annotation is notably lacking in available genome files. In addition to this, there has been a considerable lack of cross-referencing in the wealth of studies identifying these elements, often leading to the same sRNA being identified multiple times and bearing multiple names. In this work

  5. Population Structure Analysis of Bull Genomes of European and Western Ancestry

    DEFF Research Database (Denmark)

    Chung, Neo Christopher; Szyda, Joanna; Frąszczak, Magdalena

    2017-01-01

    Since domestication, population bottlenecks, breed formation, and selective breeding have radically shaped the genealogy and genetics of Bos taurus. In turn, characterization of population structure among diverse bull (males of Bos taurus) genomes enables detailed assessment of genetic resources...... and origins. By analyzing 432 unrelated bull genomes from 13 breeds and 16 countries, we demonstrate genetic diversity and structural complexity among the European/Western cattle population. Importantly, we relaxed a strong assumption of discrete or admixed population, by adapting latent variable models...... harboring largest genetic differentiation suggest positive selection underlying population structure. We carried out gene set analysis using SNP annotations to identify enriched functional categories such as energy-related processes and multiple development stages. Our population structure analysis of bull...

  6. Structure-related clustering of gene expression fingerprints of thp-1 cells exposed to smaller polycyclic aromatic hydrocarbons.

    Science.gov (United States)

    Wan, B; Yarbrough, J W; Schultz, T W

    2008-01-01

    This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 microM for 24 hours and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. 1-Methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, significance analysis of microarrays (SAM) identified 598 genes modulated by the tested chemicals, with a variety of biological processes (such as cell cycle, metabolism, and protein binding) and KEGG pathways being significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.

  7. Gene Structures, Classification, and Expression Models of the DREB Transcription Factor Subfamily in Populus trichocarpa

    Directory of Open Access Journals (Sweden)

    Yunlin Chen

    2013-01-01

    Full Text Available We identified 75 dehydration-responsive element-binding (DREB) protein genes in Populus trichocarpa. We analyzed gene structures, phylogenies, domain duplications, genome localizations, and expression profiles. The phylogenetic reconstruction suggests that the PtrDREB gene subfamily can be classified broadly into six subtypes (DREB A-1 to A-6) in Populus. The chromosomal localizations of the PtrDREB genes indicated 18 segmental duplication events involving 36 genes, and six redundant PtrDREB genes were involved in tandem duplication events. There were fewer introns in the PtrDREB subfamily. The motif composition of PtrDREB was highly conserved in the same subtype. We investigated expression profiles of this gene subfamily from different tissues and/or developmental stages. Sixteen genes present in the digital expression analysis had high levels of transcript accumulation. The microarray results suggest that 18 genes were upregulated. We further examined the stress responsiveness of 15 genes by qRT-PCR. A digital northern analysis showed that the PtrDREB17, 18, and 32 genes were highly induced in leaves under cold stress, and the same expression trends were shown by qRT-PCR. Taken together, these observations may lay the foundation for future functional analyses to unravel the biological roles of Populus’ DREB genes.

  8. Genome-Wide Identification of R2R3-MYB Genes and Expression Analyses During Abiotic Stress in Gossypium raimondii

    Science.gov (United States)

    He, Qiuling; Jones, Don C.; Li, Wei; Xie, Fuliang; Ma, Jun; Sun, Runrun; Wang, Qinglian; Zhu, Shuijin; Zhang, Baohong

    2016-01-01

    The R2R3-MYB family is one of the largest families of transcription factors, which have been implicated in multiple biological processes. There is great diversity in the number of R2R3-MYB genes in different plants. However, there is no report on genome-wide characterization of this gene family in cotton. In the present study, a total of 205 putative R2R3-MYB genes were identified in the cotton D genome (Gossypium raimondii), a number much larger than that found in other cash crops with fully sequenced genomes. These GrMYBs were classified into 13 groups with the R2R3-MYB genes from Arabidopsis and rice. The amino acid motifs and phylogenetic tree were predicted and analyzed. The sequences of GrMYBs were distributed across 13 chromosomes at various densities. The results showed that the expansion of the G. raimondii R2R3-MYB family was mainly attributable to whole genome duplication and segmental duplication. Moreover, the expression patterns of 52 selected GrMYBs and 46 GaMYBs were tested in roots and leaves under different abiotic stress conditions. The results revealed that the MYB genes in cotton were differentially expressed under salt and drought stress treatment. Our results will be useful for determining the precise role of the MYB genes during stress responses with crop improvement. PMID:27009386

  9. Platform comparison for evaluation of ALK protein immunohistochemical expression, genomic copy number and hotspot mutation status in neuroblastomas.

    Directory of Open Access Journals (Sweden)

    Benedict Yan

    Full Text Available ALK is an established causative oncogenic driver in neuroblastoma, and is likely to emerge as a routine biomarker in neuroblastoma diagnostics. At present, the optimal strategy for clinical diagnostic evaluation of ALK protein, genomic and hotspot mutation status is not well-studied. We evaluated ALK immunohistochemical (IHC) protein expression using three different antibodies (ALK1, 5A4 and D5F3 clones), ALK genomic status using single-color chromogenic in situ hybridization (CISH), and ALK hotspot mutation status using conventional Sanger sequencing and a next-generation sequencing platform (Ion Torrent Personal Genome Machine (IT-PGM)), in archival formalin-fixed, paraffin-embedded neuroblastoma samples. We found a significant difference in IHC results using the three different antibodies, with the highest percentage of positive cases seen on D5F3 immunohistochemistry. Correlation with ALK genomic and hotspot mutational status revealed that the majority of D5F3 ALK-positive cases did not possess either ALK genomic amplification or hotspot mutations. Comparison of sequencing platforms showed a perfect correlation between conventional Sanger and IT-PGM sequencing. Our findings suggest that D5F3 immunohistochemistry, single-color CISH and IT-PGM sequencing are suitable assays for evaluation of ALK status in future neuroblastoma clinical trials.

  10. Gene expression profile and genomic alterations in colonic tumours induced by 1,2-dimethylhydrazine (DMH) in rats

    International Nuclear Information System (INIS)

    Femia, Angelo Pietro; Luceri, Cristina; Toti, Simona; Giannini, Augusto; Dolara, Piero; Caderni, Giovanna

    2010-01-01

    Azoxymethane (AOM) or 1,2-dimethylhydrazine (DMH)-induced colon carcinogenesis in rats shares many phenotypical similarities with human sporadic colon cancer and is a reliable model for identifying chemopreventive agents. Genetic mutations relevant to human colon cancer have been described in this model, but comprehensive gene expression and genomic analysis have not been reported so far. Therefore, we applied genome-wide technologies to study variations in gene expression and genomic alterations in DMH-induced colon cancer in F344 rats. For gene expression analysis, 9 tumours (TUM) and their paired normal mucosa (NM) were hybridized on 4 × 44K Whole rat arrays (Agilent) and selected genes were validated by semi-quantitative RT-PCR. Functional analysis on microarray data was performed by GenMAPP/MappFinder analysis. Array-comparative genomic hybridization (a-CGH) was performed on 10 paired TUM-NM samples hybridized on Rat genome arrays 2 × 105K (Agilent) and the results were analyzed by CGH Analytics (Agilent). Microarray gene expression analysis showed that Defcr4, Igfbp5, Mmp7, Nos2, S100A8 and S100A9 were among the most up-regulated genes in tumours (Fold Change (FC) compared with NM: 183, 48, 39, 38, 36 and 32, respectively), while Slc26a3, Mptx, Retlna and Muc2 were strongly down-regulated (FC: -500; -376, -167, -79, respectively). Functional analysis showed that pathways controlling cell cycle, protein synthesis, matrix metalloproteinases, TNFα/NFkB, and inflammatory responses were up-regulated in tumours, while Krebs cycle, the electron transport chain, and fatty acid beta oxidation were down-regulated. a-CGH analysis showed that four TUM out of ten had one or two chromosomal aberrations. Importantly, one sample showed a deletion on chromosome 18 including Apc. The results showed complex gene expression alterations in adenocarcinomas encompassing many altered pathways. While a-CGH analysis showed a low degree of genomic imbalance, it is interesting to

  11. Genetic profiles of gastroesophageal cancer: combined analysis using expression array and tiling array--comparative genomic hybridization

    DEFF Research Database (Denmark)

    Isinger-Ekstrand, Anna; Johansson, Jan; Ohlsson, Mattias

    2010-01-01

    We aimed to characterize the genomic profiles of adenocarcinomas in the gastroesophageal junction in relation to cancers in the esophagus and the stomach. Profiles of gains/losses as well as gene expression profiles were obtained from 27 gastroesophageal adenocarcinomas by means of 32k high-resolution array-based comparative genomic hybridization and 27k oligo gene expression arrays, and putative target genes were validated in an extended series. Adenocarcinomas in the distal esophagus and the gastroesophageal junction showed strong similarities with the most common gains at 20q13, 8q24, 1q21-23, 5p15, 13q34, and 12q13, whereas different profiles with gains at 5p15, 7p22, 2q35, and 13q34 characterized gastric cancers. CDK6 and EGFR were identified as putative target genes in cancers of the esophagus and the gastroesophageal junction, with upregulation in one quarter of the tumors. Gains...

  12. The Drosophila Helicase MLE Targets Hairpin Structures in Genomic Transcripts.

    Directory of Open Access Journals (Sweden)

    Simona Cugusi

    2016-01-01

    Full Text Available RNA hairpins are a common type of secondary structure that plays a role in every aspect of RNA biochemistry including RNA editing, mRNA stability, localization and translation of transcripts, and in the activation of the RNA interference (RNAi) and microRNA (miRNA) pathways. Participation in these functions often requires restructuring the RNA molecules by the association of single-strand (ss) RNA-binding proteins or by the action of helicases. The Drosophila MLE helicase has long been identified as a member of the MSL complex responsible for dosage compensation. The complex includes one of two long non-coding RNAs and MLE was shown to remodel the roX RNA hairpin structures in order to initiate assembly of the complex. Here we report that this function of MLE may apply to the hairpins present in the primary RNA transcripts that generate the small molecules responsible for RNA interference. Using stocks from the Transgenic RNAi Project and the Vienna Drosophila Research Center, we show that MLE specifically targets hairpin RNAs at their site of transcription. The association of MLE at these sites is independent of sequence and chromosome location. We use two functional assays to test the biological relevance of this association and determine that MLE participates in the RNAi pathway.

  13. Susceptibilities to DNA Structural Transitions within Eukaryotic Genomes

    Science.gov (United States)

    Zhabinskaya, Dina; Benham, Craig; Madden, Sally

    2012-02-01

    We analyze the competitive transitions to alternate secondary DNA structures in a negatively supercoiled DNA molecule of kilobase length and specified base sequence. We use statistical mechanics to calculate the competition among all regions within the sequence that are susceptible to transitions to alternate structures. We use an approximate numerical method since the calculation of an exact partition function is numerically cumbersome for DNA molecules of lengths longer than hundreds of base pairs. This method yields accurate results in reasonable computational times. We implement algorithms that calculate the competition between transitions to denatured states and to Z-form DNA. We analyze these transitions near the transcription start sites (TSS) of a set of eukaryotic genes. We find an enhancement of Z-forming regions upstream of the TSS and a depletion of denatured regions around the start sites. We confirm that these findings are statistically significant by comparing our results to a set of randomized genes with preserved base composition at each position relative to the gene start sites. When we study the correlation of these transitions in orthologous mouse and human genes, we find a clear evolutionary conservation of both types of transitions around the TSS.

  14. Plum pox virus (PPV) genome expression in genetically engineered RNAi plants

    Science.gov (United States)

    An important approach to controlling sharka disease caused by Plum pox virus (PPV) is the development of PPV resistant plants using small interfering RNAs (siRNA) technology. In order to evaluate siRNA induced gene silencing, we studied, based on knowledge of the PPV genome sequence, virus genome t...

  15. Genomic Imprinting and the Expression of Affect in Angelman Syndrome: What's in the Smile?

    Science.gov (United States)

    Oliver, Chris; Horsler, Kate; Berg, Katy; Bellamy, Gail; Dick, Katie; Griffiths, Emily

    2007-01-01

    Background: Kinship theory (or the genomic conflict hypothesis) proposes that the phenotypic effects of genomic imprinting arise from conflict between paternally and maternally inherited alleles. A prediction arising for social behaviour from this theory is that imbalance in this conflict resulting from a deletion of a maternally imprinted gene,…

  16. tigaR: integrative significance analysis of temporal differential gene expression induced by genomic abnormalities

    NARCIS (Netherlands)

    Miok, V.; Wilting, S.M.; van de Wiel, M.A.; Jaspers, A.; van Noort, P.I.; Brakenhoff, R.H.; Snijders, P.J.F.; Steenbergen, R.D.M.; van Wieringen, W.N.

    2014-01-01

    Background: To determine which changes in the host cell genome are crucial for cervical carcinogenesis, a longitudinal in vitro model system of HPV-transformed keratinocytes was profiled in a genome-wide manner. Four cell lines affected with either HPV16 or HPV18 were assayed at 8 sequential time

  17. Impact of delay to cryopreservation on RNA integrity and genome-wide expression profiles in resected tumor samples.

    Directory of Open Access Journals (Sweden)

    Elodie Caboux

    Full Text Available The quality of tissue samples and extracted mRNA is a major source of variability in tumor transcriptome analysis using genome-wide expression microarrays. During and immediately after surgical tumor resection, tissues are exposed to metabolic, biochemical and physical stresses characterized as "warm ischemia". Current practice advocates cryopreservation of biosamples within 30 minutes of resection, but this recommendation has not been systematically validated by measurements of mRNA decay over time. Using Illumina HumanHT-12 v3 Expression BeadChips, providing a genome-wide coverage of over 24,000 genes, we have analyzed gene expression variation in samples of 3 hepatocellular carcinomas (HCC) and 3 lung carcinomas (LC) cryopreserved at times up to 2 hours after resection. RNA Integrity Numbers (RIN) revealed no significant deterioration of mRNA up to 2 hours after resection. Genome-wide transcriptome analysis detected non-significant gene expression variations of -3.5%/hr (95% CI: -7.0%/hr to 0.1%/hr; p = 0.054). In LC, no consistent gene expression pattern was detected in relation with warm ischemia. In HCC, a signature of 6 up-regulated genes (CYP2E1, IGLL1, CABYR, CLDN2, NQO1, SCL13A5) and 6 down-regulated genes (MT1G, MT1H, MT1E, MT1F, HABP2, SPINK1) was identified (FDR <0.05). Overall, our observations support current recommendation of time to cryopreservation of up to 30 minutes and emphasize the need for identifying tissue-specific genes deregulated following resection to avoid misinterpreting expression changes induced by warm ischemia as pathologically significant changes.

  18. Genome-wide characterization, evolution, and expression analysis of the leucine-rich repeat receptor-like protein kinase (LRR-RLK) gene family in Rosaceae genomes.

    Science.gov (United States)

    Sun, Jiangmei; Li, Leiting; Wang, Peng; Zhang, Shaoling; Wu, Juyou

    2017-10-10

    Leucine-rich repeat receptor-like protein kinase (LRR-RLK) is the largest gene family of receptor-like protein kinases (RLKs) and actively participates in regulating the growth, development, signal transduction, immunity, and stress responses of plants. However, the patterns of LRR-RLK gene family evolution in the five main Rosaceae species for which genome sequences are available have not yet been reported. In this study, we performed a comprehensive analysis of LRR-RLK genes for five Rosaceae species: Fragaria vesca (strawberry), Malus domestica (apple), Pyrus bretschneideri (Chinese white pear), Prunus mume (mei), and Prunus persica (peach), which contained 201, 244, 427, 267, and 258 LRR-RLK genes, respectively. All LRR-RLK genes were further grouped into 23 subfamilies based on the hidden Markov models approach. RLK-Pelle_LRR-XII-1, RLK-Pelle_LRR-XI-1, and RLK-Pelle_LRR-III were the three largest subfamilies. Synteny analysis indicated that there were 236 tandem duplicated genes in the five Rosaceae species, among which subfamilies XII-1 (82 genes) and XI-1 (80 genes) comprised 68.6%. Our results indicate that tandem duplication made a large contribution to the expansion of the subfamilies. The gene expression, tissue-specific expression, and subcellular localization data revealed that LRR-RLK genes were differentially expressed in various organs and tissues, and the largest subfamily XI-1 was highly expressed in all five Rosaceae species, suggesting that LRR-RLKs play important roles in each stage of plant growth and development. Taken together, our results provide an overview of the LRR-RLK family in Rosaceae genomes and the basis for further functional studies.
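
    The 68.6% share quoted in the abstract above can be reproduced from the counts it reports. The short Python sketch below is only an arithmetic check on those published figures; the variable names are illustrative and do not come from the study.

```python
# Counts quoted in the abstract (tandem duplicated LRR-RLK genes in five Rosaceae species).
tandem_total = 236
subfamily_counts = {"XII-1": 82, "XI-1": 80}  # labels as given in the abstract

# Share of all tandem duplicated genes contributed by the two subfamilies.
combined = sum(subfamily_counts.values())
share_percent = 100 * combined / tandem_total

print(f"{combined} of {tandem_total} genes = {share_percent:.1f}%")
```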

  19. Peripheral blood gene expression as a novel genomic biomarker in complicated sarcoidosis.

    Directory of Open Access Journals (Sweden)

    Tong Zhou

    Full Text Available Sarcoidosis, a systemic granulomatous syndrome invariably affecting the lung, typically spontaneously remits but in ~20% of cases progresses with severe lung dysfunction or cardiac and neurologic involvement (complicated sarcoidosis). Unfortunately, current biomarkers fail to distinguish patients with remitting (uncomplicated) sarcoidosis from other fibrotic lung disorders, and fail to identify individuals at risk for complicated sarcoidosis. We utilized genome-wide peripheral blood gene expression analysis to identify a 20-gene sarcoidosis biomarker signature distinguishing sarcoidosis (n = 39) from healthy controls (n = 35, 86% classification accuracy) and which served as a molecular signature for complicated sarcoidosis (n = 17). As aberrancies in T cell receptor (TCR) signaling, JAK-STAT (JS) signaling, and cytokine-cytokine receptor (CCR) signaling are implicated in sarcoidosis pathogenesis, a 31-gene signature comprised of T cell signaling pathway genes associated with sarcoidosis (TCR/JS/CCR) was compared to the unbiased 20-gene biomarker signature but proved inferior in prediction accuracy in distinguishing complicated from uncomplicated sarcoidosis. Additional validation strategies included significant association of single nucleotide polymorphisms (SNPs) in signature genes with sarcoidosis susceptibility and severity (unbiased signature genes - CX3CR1, FKBP1A, NOG, RBM12B, SENS3, TSHZ2; T cell/JAK-STAT pathway genes such as AKT3, CBLB, DLG1, IFNG, IL2RA, IL7R, ITK, JUN, MALT1, NFATC2, PLCG1, SPRED1). In summary, this validated peripheral blood molecular gene signature appears to be a valuable biomarker in identifying cases with sarcoidosis and predicting risk for complicated sarcoidosis.

  20. The complete mitochondrial genome structure of the jaguar (Panthera onca).

    Science.gov (United States)

    Caragiulo, Anthony; Dougherty, Eric; Soto, Sofia; Rabinowitz, Salisa; Amato, George

    2016-01-01

    The jaguar (Panthera onca) is the largest felid in the Western hemisphere, and the only member of the Panthera genus in the New World. The jaguar inhabits most countries within Central and South America, and is considered near threatened by the International Union for the Conservation of Nature. This study represents the first sequence of the entire jaguar mitogenome, which was the only Panthera mitogenome that had not been sequenced. The jaguar mitogenome is 17,049 bases and possesses the same molecular structure as other felid mitogenomes. Bayesian inference (BI) and maximum likelihood (ML) were used to determine the phylogenetic placement of the jaguar within the Panthera genus. Both BI and ML analyses revealed the jaguar to be sister to the tiger/leopard/snow leopard clade.

  1. Germline Cas9 expression yields highly efficient genome engineering in a major worldwide disease vector, Aedes aegypti.

    Science.gov (United States)

    Li, Ming; Bui, Michelle; Yang, Ting; Bowman, Christian S; White, Bradley J; Akbari, Omar S

    2017-12-05

    The development of CRISPR/Cas9 technologies has dramatically increased the accessibility and efficiency of genome editing in many organisms. In general, in vivo germline expression of Cas9 results in substantially higher activity than embryonic injection. However, no transgenic lines expressing Cas9 have been developed for the major mosquito disease vector Aedes aegypti. Here, we describe the generation of multiple stable, transgenic Ae. aegypti strains expressing Cas9 in the germline, resulting in dramatic improvements in both the consistency and efficiency of genome modifications using CRISPR. Using these strains, we disrupted numerous genes important for normal morphological development, and even generated triple mutants from a single injection. We have also managed to increase the rates of homology-directed repair by more than an order of magnitude. Given the exceptional mutagenic efficiency and specificity of the Cas9 strains we engineered, they can be used for high-throughput reverse genetic screens to help functionally annotate the Ae. aegypti genome. Additionally, these strains represent a step toward the development of novel population control technologies targeting Ae. aegypti that rely on Cas9-based gene drives. Copyright © 2017 the Author(s). Published by PNAS.

  2. Genome-wide evolutionary characterization and expression analyses of major latex protein (MLP) family genes in Vitis vinifera.

    Science.gov (United States)

    Zhang, Ningbo; Li, Ruimin; Shen, Wei; Jiao, Shuzhen; Zhang, Junxiang; Xu, Weirong

    2018-04-27

    The major latex protein/ripening-related protein (MLP/RRP) subfamily is known to be involved in a wide range of biological processes of plant development and various stress responses. However, the biological function of MLP/RRP proteins is still far from being clear and identification of them may provide important clues for understanding their roles. Here, we report a genome-wide evolutionary characterization and gene expression analysis of the MLP family in European Vitis species. A total of 14 members were found in the grape genome, all of which are located on chromosome 1, where they are predominantly arranged in tandem clusters. We have noticed, most surprisingly, promoter-sharing by several non-identical but highly similar gene members to a greater extent than expected by chance. Synteny analysis between the grape and Arabidopsis thaliana genomes suggested that 3 grape MLP genes arose before the divergence of the two species. Phylogenetic analysis provided further insights into the evolutionary relationship between the genes, as well as their putative functions, and tissue-specific expression analysis suggested distinct biological roles for different members. Our expression data suggested a couple of candidate genes involved in abiotic stresses and phytohormone responses. The present work provides new insight into the evolution and regulation of Vitis MLP genes, which represent targets for future studies and inclusion in tolerance-related molecular breeding programs.

  3. Epigenomics of Total Acute Sleep Deprivation in Relation to Genome-Wide DNA Methylation Profiles and RNA Expression.

    Science.gov (United States)

    Nilsson, Emil K; Boström, Adrian E; Mwinyi, Jessica; Schiöth, Helgi B

    2016-06-01

    Despite an established link between sleep deprivation and epigenetic processes in humans, it remains unclear to what extent sleep deprivation modulates DNA methylation. We performed a within-subject randomized blinded study with 16 healthy subjects to examine the effect of one night of total sleep deprivation (TSD) on the genome-wide methylation profile in blood compared with that in normal sleep. Genome-wide differences in methylation between both conditions were assessed by applying a paired regression model that corrected for monocyte subpopulations. In addition, the correlations between the methylation of genes detected to be modulated by TSD and gene expression were examined in a separate, publicly available cohort of 10 healthy male donors (E-GEOD-49065). Sleep deprivation significantly affected the DNA methylation profile both independently of and in dependence on shifts in monocyte composition. Our study detected differential methylation of 269 probes. Notably, one CpG site was located 69 bp upstream of ING5, which has been shown to be differentially expressed after sleep deprivation. Gene set enrichment analysis detected the Notch and Wnt signaling pathways to be enriched among the differentially methylated genes. These results provide evidence that total acute sleep deprivation alters the methylation profile in healthy human subjects. This is, to our knowledge, the first study that systematically investigated the impact of total acute sleep deprivation on genome-wide DNA methylation profiles in blood and related the epigenomic findings to the expression data.

  4. Genome-Wide Identification and Expression Analysis of WRKY Transcription Factors under Multiple Stresses in Brassica napus.

    Science.gov (United States)

    He, Yajun; Mao, Shaoshuai; Gao, Yulong; Zhu, Liying; Wu, Daoming; Cui, Yixin; Li, Jiana; Qian, Wei

    2016-01-01

    WRKY transcription factors play important roles in responses to environmental stress stimuli. Using a genome-wide domain analysis, we identified 287 WRKY genes with 343 WRKY domains in the sequenced genome of Brassica napus, 139 in the A sub-genome and 148 in the C sub-genome. These genes were classified into eight groups based on phylogenetic analysis. In the 343 WRKY domains, a total of 26 members showed divergence in the WRKY domain, and 21 belonged to group I. This finding suggested that WRKY genes in group I are more active and variable compared with genes in other groups. Using genome-wide identification and analysis of the WRKY gene family in Brassica napus, we observed genome duplication, chromosomal/segmental duplications and tandem duplication. All of these duplications contributed to the expansion of the WRKY gene family. The duplicate segments that were detected indicated that genome duplication events occurred in the two diploid progenitors B. rapa and B. oleracea before they combined to form B. napus. Analysis of the public microarray database and EST database for B. napus indicated that 74 WRKY genes were induced or preferentially expressed under stress conditions. According to the public QTL data, we identified 77 WRKY genes in 31 QTL regions related to tolerance to various stresses. We further evaluated the expression of 26 BnaWRKY genes under multiple stresses by qRT-PCR. Most of the genes were induced by low temperature, salinity and drought stress, indicating that the WRKYs play important roles in B. napus stress responses. Further, three BnaWRKY genes were strongly responsive to all three stresses simultaneously, which suggests that these three WRKY genes may have multi-functional roles in stress tolerance and can potentially be used in breeding new rapeseed cultivars. We also found six tandem repeat pairs exhibiting similar expression profiles under the various stress conditions, and three pairs were mapped in the stress related QTL regions

  5. Genome-Wide Identification and Expression Analysis of WRKY Transcription Factors under Multiple Stresses in Brassica napus.

    Directory of Open Access Journals (Sweden)

    Yajun He

    Full Text Available WRKY transcription factors play important roles in responses to environmental stress stimuli. Using a genome-wide domain analysis, we identified 287 WRKY genes with 343 WRKY domains in the sequenced genome of Brassica napus, 139 in the A sub-genome and 148 in the C sub-genome. These genes were classified into eight groups based on phylogenetic analysis. In the 343 WRKY domains, a total of 26 members showed divergence in the WRKY domain, and 21 belonged to group I. This finding suggested that WRKY genes in group I are more active and variable compared with genes in other groups. Using genome-wide identification and analysis of the WRKY gene family in Brassica napus, we observed genome duplication, chromosomal/segmental duplications and tandem duplication. All of these duplications contributed to the expansion of the WRKY gene family. The duplicate segments that were detected indicated that genome duplication events occurred in the two diploid progenitors B. rapa and B. oleracea before they combined to form B. napus. Analysis of the public microarray database and EST database for B. napus indicated that 74 WRKY genes were induced or preferentially expressed under stress conditions. According to the public QTL data, we identified 77 WRKY genes in 31 QTL regions related to tolerance to various stresses. We further evaluated the expression of 26 BnaWRKY genes under multiple stresses by qRT-PCR. Most of the genes were induced by low temperature, salinity and drought stress, indicating that the WRKYs play important roles in B. napus stress responses. Further, three BnaWRKY genes were strongly responsive to all three stresses simultaneously, which suggests that these three WRKY genes may have multi-functional roles in stress tolerance and can potentially be used in breeding new rapeseed cultivars. We also found six tandem repeat pairs exhibiting similar expression profiles under the various stress conditions, and three pairs were mapped in the stress related

  6. Genome-wide identification of sweet orange (Citrus sinensis) histone modification gene families and their expression analysis during the fruit development and fruit-blue mold infection process.

    Science.gov (United States)

    Xu, Jidi; Xu, Haidan; Liu, Yuanlong; Wang, Xia; Xu, Qiang; Deng, Xiuxin

    2015-01-01

    In eukaryotes, histone acetylation and methylation are known to be involved in regulating diverse developmental processes and plant defense. These histone modification events are controlled by a series of histone modification gene families. To date, there has been no genome-wide characterization of histone modification related genes in citrus species. Based on the two recently sequenced sweet orange genome databases, a total of 136 CsHMs (Citrus sinensis histone modification genes), including 47 CsHMTs (histone methyltransferase genes), 23 CsHDMs (histone demethylase genes), 50 CsHATs (histone acetyltransferase genes), and 16 CsHDACs (histone deacetylase genes) were identified. These genes were categorized into 11 gene families. A comprehensive analysis of these 11 gene families was performed with chromosome locations, phylogenetic comparison, gene structures, and conserved domain compositions of proteins. In order to gain an insight into the potential roles of these genes in citrus fruit development, 42 CsHMs with high mRNA abundance in fruit tissues were selected to further analyze their expression profiles at six stages of fruit development. Interestingly, a number of genes were highly expressed in the flesh of ripening fruit, and some of them showed increasing expression levels as fruit development progressed. Furthermore, we analyzed the expression patterns of all 136 CsHMs in response to infection by blue mold (Penicillium digitatum), which is the most devastating pathogen in the citrus post-harvest process. The results indicated that 20 of them showed strong alterations in their expression levels during the fruit-pathogen infection. In conclusion, this study presents a comprehensive analysis of the histone modification gene families in sweet orange and further elucidates their behaviors during fruit development and the blue mold infection response.

  7. Complete Chloroplast Genomes of Papaver rhoeas and Papaver orientale: Molecular Structures, Comparative Analysis, and Phylogenetic Analysis

    Directory of Open Access Journals (Sweden)

    Jianguo Zhou

    2018-02-01

    Full Text Available Papaver rhoeas L. and P. orientale L., which belong to the family Papaveraceae, are used as ornamental and medicinal plants. The chloroplast genome has been used for molecular markers, evolutionary biology, and barcoding identification. In this study, the complete chloroplast genome sequences of P. rhoeas and P. orientale are reported. Results show that the complete chloroplast genomes of P. rhoeas and P. orientale have typical quadripartite structures, which are comprised of circular 152,905 and 152,799-bp-long molecules, respectively. A total of 130 genes were identified in each genome, including 85 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Sequence divergence analysis of four species from Papaveraceae indicated that the most divergent regions are found in the non-coding spacers with minimal differences among three Papaver species. These differences include the ycf1 gene and intergenic regions, such as rpoB-trnC, trnD-trnT, petA-psbJ, psbE-petL, and ccsA-ndhD. These regions are hypervariable regions, which can be used as specific DNA barcodes. This finding suggested that the chloroplast genome could be used as a powerful tool to resolve the phylogenetic positions and relationships of Papaveraceae. These results offer valuable information for future research in the identification of Papaver species and will benefit further investigations of these species.

  8. Complete plastid genomes from Ophioglossum californicum, Psilotum nudum, and Equisetum hyemale reveal an ancestral land plant genome structure and resolve the position of Equisetales among monilophytes

    Directory of Open Access Journals (Sweden)

    Grewe Felix

    2013-01-01

    Full Text Available Abstract. Background: Plastid genome structure and content is remarkably conserved in land plants. This widespread conservation has facilitated taxon-rich phylogenetic analyses that have resolved organismal relationships among many land plant groups. However, the relationships among major fern lineages, especially the placement of Equisetales, remain enigmatic. Results: In order to understand the evolution of plastid genomes and to establish phylogenetic relationships among ferns, we sequenced the plastid genomes from three early diverging species: Equisetum hyemale (Equisetales), Ophioglossum californicum (Ophioglossales), and Psilotum nudum (Psilotales). A comparison of fern plastid genomes showed that some lineages have retained inverted repeat (IR) boundaries originating from the common ancestor of land plants, while other lineages have experienced multiple IR changes including expansions and inversions. Genome content has remained stable throughout ferns, except for a few lineage-specific losses of genes and introns. Notably, the losses of the rps16 gene and the rps12i346 intron are shared among Psilotales, Ophioglossales, and Equisetales, while the gain of a mitochondrial atp1 intron is shared between Marattiales and Polypodiopsida. These genomic structural changes support the placement of Equisetales as sister to Ophioglossales + Psilotales and Marattiales as sister to Polypodiopsida. This result is augmented by some molecular phylogenetic analyses that recover the same relationships, whereas others suggest a relationship between Equisetales and Polypodiopsida. Conclusions: Although molecular analyses were inconsistent with respect to the position of Marattiales and Equisetales, several genomic structural changes have for the first time provided a clear placement of these lineages within the ferns. These results further demonstrate the power of using rare genomic structural changes in cases where molecular data fail to provide strong phylogenetic

  9. Reference-quality genome sequence of Aegilops tauschii, the source of wheat D genome, shows that recombination shapes genome structure and evolution

    Science.gov (United States)

    Aegilops tauschii is the diploid progenitor of the D genome of hexaploid wheat and an important genetic resource for wheat. A reference-quality sequence for the Ae. tauschii genome was produced with a combination of ordered-clone sequencing, whole-genome shotgun sequencing, and BioNano optical geno...

  10. Structure and expression of the chicken calmodulin I gene

    DEFF Research Database (Denmark)

    Ye, Q; Berchtold, M W

    1997-01-01

    The chicken calmodulin I (CaMI) gene has been isolated and characterized on the level of cDNA and genomic DNA. The deduced amino acid (aa) sequence is identical to the one of chicken CaMII which consists of 148 aa. The CaMI gene contains six exons. Its intron/exon organization is identical...... to that of the chicken CaMII and the CaMI and CaMIII genes of rat and human. Expression of the CaMI gene was detected in all chicken tissues examined, although at varying levels. The gene is transcribed into four mRNAs of 0.8, 1.4, 1.7 and 4.4 kb as determined by Northern blot analysis. Our results demonstrate...... that the "multigene-one-protein" principle of CaM synthesis is not only applicable to mammals whose CaM is encoded by three different genes, but also to chickens....

  11. Structure and Expression Analyses of SVA Elements in Relation to Functional Genes

    Directory of Open Access Journals (Sweden)

    Yun-Jeong Kwon

    2013-09-01

    Full Text Available SINE-VNTR-Alu (SVA) elements are present in hominoid primates, are divided into 6 subfamilies (SVA-A to SVA-F), and are active in the human population. Using a bioinformatic tool, 22 SVA element-associated genes are identified in the human genome. In an analysis of genomic structure, SVA elements are detected in the 5' untranslated region (UTR) of the HGSNAT (SVA-B), MRGPRX3 (SVA-D), HYAL1 (SVA-F), TCHH (SVA-F), and ATXN2L (SVA-F) genes, while some elements are observed in the 3'UTR of the SPICE1 (SVA-B), TDRKH (SVA-C), GOSR1 (SVA-D), BBS5 (SVA-D), NEK5 (SVA-D), ABHD2 (SVA-F), C1QTNF7 (SVA-F), ORC6L (SVA-F), TMEM69 (SVA-F), and CCDC137 (SVA-F) genes. They could contribute to exon extension or supply poly A signals. The LEPR (SVA-C), ALOX5 (SVA-D), PDS5B (SVA-D), and ABCA10 (SVA-F) genes also showed alternative transcripts produced by SVA exonization events. Dominant expression of HYAL1_SVA appeared in lung tissues, while HYAL1_noSVA showed ubiquitous expression in various human tissues. Expression of both transcripts (TDRKH_SVA and TDRKH_noSVA) of the TDRKH gene appeared to be ubiquitous. Taken together, these data suggest that SVA elements cause transcript isoforms that contribute to modulation of gene regulation in various human tissues.

  12. De novo prediction of human chromosome structures: Epigenetic marking patterns encode genome architecture.

    Science.gov (United States)

    Di Pierro, Michele; Cheng, Ryan R; Lieberman Aiden, Erez; Wolynes, Peter G; Onuchic, José N

    2017-11-14

    Inside the cell nucleus, genomes fold into organized structures that are characteristic of cell type. Here, we show that this chromatin architecture can be predicted de novo using epigenetic data derived from chromatin immunoprecipitation-sequencing (ChIP-Seq). We exploit the idea that chromosomes encode a 1D sequence of chromatin structural types. Interactions between these chromatin types determine the 3D structural ensemble of chromosomes through a process similar to phase separation. First, a neural network is used to infer the relation between the epigenetic marks present at a locus, as assayed by ChIP-Seq, and the genomic compartment in which those loci reside, as measured by DNA-DNA proximity ligation (Hi-C). Next, types inferred from this neural network are used as an input to an energy landscape model for chromatin organization [Minimal Chromatin Model (MiChroM)] to generate an ensemble of 3D chromosome conformations at a resolution of 50 kilobases (kb). After training the model, dubbed Maximum Entropy Genomic Annotation from Biomarkers Associated to Structural Ensembles (MEGABASE), on odd-numbered chromosomes, we predict the sequences of chromatin types and the subsequent 3D conformational ensembles for the even chromosomes. We validate these structural ensembles by using ChIP-Seq tracks alone to predict Hi-C maps, as well as distances measured using 3D fluorescence in situ hybridization (FISH) experiments. Both sets of experiments support the hypothesis of phase separation being the driving process behind compartmentalization. These findings strongly suggest that epigenetic marking patterns encode sufficient information to determine the global architecture of chromosomes and that de novo structure prediction for whole genomes may be increasingly possible. Copyright © 2017 the Author(s). Published by PNAS.
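
    The abstract above describes training on odd-numbered chromosomes and predicting on even-numbered ones. A minimal sketch of such a held-out split follows; the function name, the "chr" naming scheme, and the choice to skip non-numeric chromosomes are illustrative assumptions, not part of MEGABASE or MiChroM.

```python
def split_chromosomes(names):
    """Split chromosome names into odd-numbered (training) and
    even-numbered (held-out) lists. Hypothetical helper for illustration."""
    train, held_out = [], []
    for name in names:
        label = name[3:] if name.startswith("chr") else name
        if not label.isdigit():
            continue  # assumption: sex/mitochondrial chromosomes are left out
        if int(label) % 2 == 1:
            train.append(name)
        else:
            held_out.append(name)
    return train, held_out


# Human autosomes chr1..chr22 under the assumed naming scheme.
autosomes = [f"chr{i}" for i in range(1, 23)]
train_chroms, test_chroms = split_chromosomes(autosomes)
```

    Splitting by whole chromosome, rather than by random loci, keeps neighbouring and therefore correlated loci from appearing on both sides of the split.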

  13. Association of HLA-DR with susceptibility to and clinical expression of rheumatoid arthritis: re-evaluation by means of genomic tissue typing

    NARCIS (Netherlands)

    van Jaarsveld, C. H.; Otten, H. G.; Jacobs, J. W.; Kruize, A. A.; Brus, H. L.; Bijlsma, J. W.

    1998-01-01

    The clinical expression of rheumatoid arthritis (RA) varies considerably among individual patients. Genetic variations in human leucocyte antigen (HLA) may influence clinical expression. We re-examined the association of HLA-DR with susceptibility to and clinical expression of RA using genomic

  14. Potential Impact on Clinical Decision Making via a Genome-Wide Expression Profiling: A Case Report

    Directory of Open Access Journals (Sweden)

    Hyun Kim

    2016-11-01

    Full Text Available Management of men with prostate cancer is fraught with uncertainty as physicians and patients balance efficacy with potential toxicity and diminished quality of life. Utilization of genomics as a prognostic biomarker has improved the informed decision-making process by enabling more rational treatment choices. Recently, investigations have begun to determine whether genomic information from tumor transcriptome data can be used to impact clinical decision-making beyond prognosis. Here we discuss the potential of genomics to alter management of a patient who presented with high-risk prostate adenocarcinoma. We suggest that this information may help in selecting patients for advanced imaging, chemotherapies, or clinical trials.

  15. Optimized paired-sgRNA/Cas9 cloning and expression cassette triggers high-efficiency multiplex genome editing in kiwifruit.

    Science.gov (United States)

    Wang, Zupeng; Wang, Shuaibin; Li, Dawei; Zhang, Qiong; Li, Li; Zhong, Caihong; Liu, Yifei; Huang, Hongwen

    2018-01-13

    Kiwifruit is an important fruit crop; however, technologies for its functional genomic and molecular improvement are limited. The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein (Cas) system has been successfully applied to genetic improvement in many crops, but its editing capability varies depending on the combination of synthetic guide RNA (sgRNA) and Cas9 protein expression devices. Optimizing conditions for its use within a particular species is therefore needed to achieve highly efficient genome editing. In this study, we developed a new cloning strategy for generating paired-sgRNA/Cas9 vectors containing four sgRNAs targeting the kiwifruit phytoene desaturase gene (AcPDS). Compared with the previous method of paired-sgRNA cloning, our strategy only requires the synthesis of two gRNA-containing primers, which largely reduces the cost. We further compared efficiencies of paired-sgRNA/Cas9 vectors containing different sgRNA expression devices, including both the polycistronic tRNA-sgRNA cassette (PTG) and the traditional CRISPR expression cassette. We found the mutagenesis frequency of the PTG/Cas9 system was 10-fold higher than that of the CRISPR/Cas9 system, coinciding with the relative expression of sgRNAs in the two different expression cassettes. In particular, we identified large chromosomal fragment deletions induced by the paired-sgRNAs of the PTG/Cas9 system. Finally, as expected, we found both systems can successfully induce the albino phenotype of kiwifruit plantlets regenerated from the G418-resistant callus lines. We conclude that the PTG/Cas9 system is a more powerful system than the traditional CRISPR/Cas9 system for kiwifruit genome editing, which provides valuable clues for optimizing the CRISPR/Cas9 editing system in other plants. © 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons

  16. Plastid genome structure and loss of photosynthetic ability in the parasitic genus Cuscuta.

    Science.gov (United States)

    Revill, Meredith J W; Stanley, Susan; Hibberd, Julian M

    2005-09-01

    The genus Cuscuta (dodder) is composed of parasitic plants, some species of which appear to be losing the ability to photosynthesize. A molecular phylogeny was constructed using 15 species of Cuscuta in order to assess whether changes in photosynthetic ability and alterations in structure of the plastid genome relate to phylogenetic position within the genus. The molecular phylogeny provides evidence for four major clades within Cuscuta. Although DNA blot analysis showed that Cuscuta species have smaller plastid genomes than tobacco, and that plastome size varied significantly even within one Cuscuta clade, dot blot analysis indicated that the dodders possess homologous sequence to 101 genes from the tobacco plastome. Evidence is provided for significant rates of DNA transfer from plastid to nucleus in Cuscuta. Size and structure of Cuscuta plastid genomes, as well as photosynthetic ability, appear to vary independently of position within the phylogeny, thus supporting the hypothesis that within Cuscuta photosynthetic ability and organization of the plastid genome are changing in an unco-ordinated manner.

  17. Conserved structure and expression of hsp70 paralogs in teleost fishes

    DEFF Research Database (Denmark)

    Metzger, David C.H.; Hansen, Jakob Hemmer; Schulte, Patricia M.

    2016-01-01

    present in the F. heteroclitus genome. Comparison of expression patterns in F. heteroclitus and Gasterosteus aculeatus demonstrates that hsp70-2 has a higher fold increase than hsp70-1 following heat shock in gill but not in muscle tissue, revealing a conserved difference in expression patterns between...

  18. Effects of immunostimulation on social behavior, chemical communication and genome-wide gene expression in honey bee workers (Apis mellifera)

    Directory of Open Access Journals (Sweden)

    Richard Freddie-Jeanne

    2012-10-01

    Full Text Available Abstract. Background: Social insects, such as honey bees, use molecular, physiological and behavioral responses to combat pathogens and parasites. The honey bee genome contains all of the canonical insect immune response pathways, and several studies have demonstrated that pathogens can activate expression of immune effectors. Honey bees also use behavioral responses, termed social immunity, to collectively defend their hives from pathogens and parasites. These responses include hygienic behavior (where workers remove diseased brood) and allo-grooming (where workers remove ectoparasites from nestmates). We have previously demonstrated that immunostimulation causes changes in the cuticular hydrocarbon profiles of workers, which results in altered worker-worker social interactions. Thus, cuticular hydrocarbons may enable workers to identify sick nestmates, and adjust their behavior in response. Here, we test the specificity of behavioral, chemical and genomic responses to immunostimulation by challenging workers with a panel of different immune stimulants (saline, Sephadex beads and Gram-negative bacteria E. coli). Results: While only bacteria-injected bees elicited altered behavioral responses from healthy nestmates compared to controls, all treatments resulted in significant changes in cuticular hydrocarbon profiles. Immunostimulation caused significant changes in expression of hundreds of genes, the majority of which have not been identified as members of the canonical immune response pathways. Furthermore, several new candidate genes that may play a role in cuticular hydrocarbon biosynthesis were identified. Effects of immune challenge on expression of several genes involved in immune response, cuticular hydrocarbon biosynthesis, and the Notch signaling pathway were confirmed using quantitative real-time PCR. Finally, we identified common genes regulated by pathogen challenge in honey bees and other insects. Conclusions: These results demonstrate that

  19. Ageing, chronic alcohol consumption and folate are determinants of genomic DNA methylation, p16 promoter methylation and the expression of p16 in the mouse colon

    Science.gov (United States)

    Elder age and chronic alcohol consumption are important risk factors for the development of colon cancer. Each factor can alter genomic and gene-specific DNA methylation. This study examined the effects of aging and chronic alcohol consumption on genomic and p16-specific methylation, and p16 express...

  20. Aging and chronic alcohol consumption are determinants of p16 gene expression, genomic DNA methylation and p16 promoter methylation in the mouse colon

    Science.gov (United States)

    Elder age and chronic alcohol consumption are important risk factors for the development of colon cancer. Each factor can alter genomic and gene-specific DNA methylation. This study examined the effects of aging and chronic alcohol consumption on genomic and p16-specific methylation, and p16 express...

  1. A genome wide survey of SNP variation reveals the genetic structure of sheep breeds.

    Directory of Open Access Journals (Sweden)

    James W Kijas

    Full Text Available The genetic structure of sheep reflects their domestication and subsequent formation into discrete breeds. Understanding genetic structure is essential for achieving genetic improvement through genome-wide association studies, genomic selection and the dissection of quantitative traits. After identifying the first genome-wide set of SNP for sheep, we report on levels of genetic variability both within and between a diverse sample of ovine populations. Then, using cluster analysis and the partitioning of genetic variation, we demonstrate sheep are characterised by weak phylogeographic structure, overlapping genetic similarity and generally low differentiation which is consistent with their short evolutionary history. The degree of population substructure was, however, sufficient to cluster individuals based on geographic origin and known breed history. Specifically, African and Asian populations clustered separately from breeds of European origin sampled from Australia, New Zealand, Europe and North America. Furthermore, we demonstrate the presence of stratification within some, but not all, ovine breeds. The results emphasize that careful documentation of genetic structure will be an essential prerequisite when mapping the genetic basis of complex traits. Furthermore, the identification of a subset of SNP able to assign individuals into broad groupings demonstrates even a small panel of markers may be suitable for applications such as traceability.

  2. Morphology, genome sequence, and structural proteome of type phage P335 from Lactococcus lactis

    DEFF Research Database (Denmark)

    Labrie, Simon J.; Josephsen, Jytte; Neve, Horst

    2008-01-01

    for a shorter tail and a different collar/whisker structure. Its 33,613-bp double-stranded DNA genome had 50 open reading frames. Putative functions were assigned to 29 of them. Unlike other sequenced genomes from lactococcal phages belonging to this species, P335 did not have a lysogeny module. However, it did...... genome. The genetic diversity of the P335 species indicates that they are exceptional models for studying the modular theory of phage evolution....

  3. Datasets in Gene Expression Omnibus used in the study ORD-020969: Genomic effects of androstenedione and sex-specific liver cancer susceptibility in mice

    Data.gov (United States)

    U.S. Environmental Protection Agency — Datasets in Gene Expression Omnibus used in the study ORD-020969: Genomic effects of androstenedione and sex-specific liver cancer susceptibility in mice. This...

  4. Synthesizing genome-wide association studies and expression microarray reveals novel genes that act in the human growth plate to modulate height.

    Science.gov (United States)

    Lui, Julian C; Nilsson, Ola; Chan, Yingleong; Palmer, Cameron D; Andrade, Anenisia C; Hirschhorn, Joel N; Baron, Jeffrey

    2012-12-01

    Previous meta-analysis of genome-wide association (GWA) studies has identified 180 loci that influence adult height. However, each GWA locus typically comprises a set of contiguous genes, only one of which presumably modulates height. We reasoned that many of the causative genes within these loci influence height because they are expressed in and function in the growth plate, a cartilaginous structure that causes bone elongation and thus determines stature. Therefore, we used expression microarray studies of mouse and rat growth plate, human disease databases and a mouse knockout phenotype database to identify genes within the GWAS loci that are likely required for normal growth plate function. Each of these approaches identified significantly more genes within the GWA height loci than at random genomic locations (P analysis strongly implicates 78 genes in growth plate function, including multiple genes that participate in PTHrP-IHH, BMP and CNP signaling, and many genes that have not previously been implicated in the growth plate. Thus, this analysis reveals a large number of novel genes that regulate human growth plate chondrogenesis and thereby contribute to the normal variations in human adult height. The analytic approach developed for this study may be applied to GWA studies for other common polygenic traits and diseases, thus providing a new general strategy to identify causative genes within GWA loci and to translate genetic associations into mechanistic biological insights.

  5. Genome scan identifies a locus affecting gamma-globin expression in human beta-cluster YAC transgenic mice

    Energy Technology Data Exchange (ETDEWEB)

    Lin, S.D.; Cooper, P.; Fung, J.; Weier, H.U.G.; Rubin, E.M.

    2000-03-01

    Genetic factors affecting postnatal γ-globin expression, a major modifier of the severity of both β-thalassemia and sickle cell anemia, have been difficult to study. This is especially so in mice, an organism lacking a globin gene with an expression pattern equivalent to that of human γ-globin. To model the human β-cluster in mice, with the goal of screening for loci affecting human γ-globin expression in vivo, we introduced a human β-globin cluster YAC transgene into the genome of FVB mice. The β-cluster contained a Greek hereditary persistence of fetal hemoglobin (HPFH) γ allele resulting in postnatal expression of human γ-globin in transgenic mice. The level of human γ-globin for various F1 hybrids derived from crosses between the FVB transgenics and other inbred mouse strains was assessed. The γ-globin level of the C3HeB/FVB transgenic mice was noted to be significantly elevated. To map genes affecting postnatal γ-globin expression, a 20 centiMorgan (cM) genome scan of a C3HeB/FVB transgenics × FVB backcross was performed, followed by high-resolution marker analysis of promising loci. From this analysis we mapped a locus within a 2.2 cM interval of mouse chromosome 1 at a LOD score of 4.2 that contributes 10.4% of variation in γ-globin expression level. Combining transgenic modeling of the human β-globin gene cluster with quantitative trait analysis, we have identified and mapped a murine locus that impacts on human γ-globin expression in vivo.
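
    For readers unfamiliar with the statistic quoted above: a LOD score is the base-10 logarithm of a likelihood ratio, here (in the usual QTL-mapping sense) the likelihood of the data with a locus at the tested position against the likelihood with no such locus. The short sketch below only converts between the two scales; the function names are illustrative, and nothing here reproduces the study's actual analysis.

```python
import math


def lod_from_likelihoods(likelihood_with_locus, likelihood_null):
    """LOD = log10 of the likelihood ratio (hypothetical helper)."""
    return math.log10(likelihood_with_locus / likelihood_null)


def lod_to_odds(lod):
    """Convert a LOD score back to the likelihood ratio it encodes."""
    return 10.0 ** lod


# The reported LOD of 4.2 corresponds to a likelihood ratio of
# roughly 1.6e4 in favour of a locus at that position.
odds = lod_to_odds(4.2)
```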

  6. Identification of balanced chromosomal rearrangements previously unknown among participants in the 1000 Genomes Project: implications for interpretation of structural variation in genomes and the future of clinical cytogenetics.

    Science.gov (United States)

    Dong, Zirui; Wang, Huilin; Chen, Haixiao; Jiang, Hui; Yuan, Jianying; Yang, Zhenjun; Wang, Wen-Jing; Xu, Fengping; Guo, Xiaosen; Cao, Ye; Zhu, Zhenzhen; Geng, Chunyu; Cheung, Wan Chee; Kwok, Yvonne K; Yang, Huanming; Leung, Tak Yeung; Morton, Cynthia C; Cheung, Sau Wai; Choy, Kwong Wai

    2017-11-02

    Purpose: Recent studies demonstrate that whole-genome sequencing enables detection of cryptic rearrangements in apparently balanced chromosomal rearrangements (also known as balanced chromosomal abnormalities, BCAs) previously identified by conventional cytogenetic methods. We aimed to assess our analytical tool for detecting BCAs in the 1000 Genomes Project without knowing which bands were affected. Methods: The 1000 Genomes Project provides an unprecedented integrated map of structural variants in phenotypically normal subjects, but there is no information on potential inclusion of subjects with apparent BCAs akin to those traditionally detected in diagnostic cytogenetics laboratories. We applied our analytical tool to 1,166 genomes from the 1000 Genomes Project with sufficient physical coverage (8.25-fold). Results: With this approach, we detected four reciprocal balanced translocations and four inversions, ranging in size from 57.9 kb to 13.3 Mb, all of which were confirmed by cytogenetic methods and polymerase chain reaction studies. One of these DNAs has a subtle translocation that is not readily identified by chromosome analysis because of the similarity of the banding patterns and size of exchanged segments, and another results in disruption of all transcripts of an OMIM gene. Conclusion: Our study demonstrates the extension of utilizing low-pass whole-genome sequencing for unbiased detection of BCAs including translocations and inversions previously unknown in the 1000 Genomes Project. Genetics in Medicine advance online publication, 2 November 2017; doi:10.1038/gim.2017.170.

  7. Integrative genomic approaches to dissect clinically-significant relationships between the VDR cistrome and gene expression in primary colon cancer.

    Science.gov (United States)

    Long, Mark D; Campbell, Moray J

    2017-10-01

    Recently, we undertook a pan-cancer analysis of the nuclear hormone receptor (NR) superfamily in The Cancer Genome Atlas (TCGA), and revealed that the vitamin D receptor (NR1I1/VDR) was commonly and significantly down-regulated specifically in the colon adenocarcinoma cohort (COAD). To examine the consequence of down-regulated VDR expression we re-analyzed VDR chromatin immunoprecipitation sequencing (ChIP-Seq) data from LS180 colon cancer cells (GSE31939). This analysis identified 1809 loci that displayed significant (p.adjcolon tumor suppressor, Galectin 4) had significantly shortened disease-free survival. These analyses suggest that reduced expression of VDR in colon cancer (but neither loss nor mutation) changes the actions of the VDR by both dampening the expression of tumor suppressors (e.g. LGALS4) whilst either stabilizing or not down-regulating expression of oncogenes (e.g. Carbonic Anhydrase 9 (CA9)). These integrative genomic approaches are relatively generic and applicable to the study of any transcription factor. Copyright © 2016. Published by Elsevier Ltd.

  8. Whole Genome and Global Gene Expression Analyses of the Model Mushroom Flammulina velutipes Reveal a High Capacity for Lignocellulose Degradation

    Science.gov (United States)

    Park, Young-Jin; Baek, Jeong Hun; Lee, Seonwook; Kim, Changhoon; Rhee, Hwanseok; Kim, Hyungtae; Seo, Jeong-Sun; Park, Hae-Ran; Yoon, Dae-Eun; Nam, Jae-Young; Kim, Hong-Il; Kim, Jong-Guk; Yoon, Hyeokjun; Kang, Hee-Wan; Cho, Jae-Yong; Song, Eun-Sung; Sung, Gi-Ho; Yoo, Young-Bok; Lee, Chang-Soo; Lee, Byoung-Moo; Kong, Won-Sik

    2014-01-01

    Flammulina velutipes is a fungus with health and medicinal benefits that has been used for consumption and cultivation in East Asia. F. velutipes is also known to degrade lignocellulose and produce ethanol. The overlapping interests of mushroom production and wood bioconversion make F. velutipes an attractive new model for fungal wood related studies. Here, we present the complete sequence of the F. velutipes genome. This is the first sequenced genome for a commercially produced edible mushroom that also degrades wood. The 35.6-Mb genome contained 12,218 predicted protein-encoding genes and 287 tRNA genes assembled into 11 scaffolds corresponding with the 11 chromosomes of strain KACC42780. The 88.4-kb mitochondrial genome contained 35 genes. Well-developed wood degrading machinery with strong potential for lignin degradation (69 auxiliary activities, formerly FOLymes) and carbohydrate degradation (392 CAZymes), along with 58 alcohol dehydrogenase genes were highly expressed in the mycelium, demonstrating the potential application of this organism to bioethanol production. Thus, the newly uncovered wood degrading capacity and sequential nature of this process in F. velutipes, offer interesting possibilities for more detailed studies on either lignin or (hemi-) cellulose degradation in complex wood substrates. The mutual interest in wood degradation by the mushroom industry and (ligno-)cellulose biomass related industries further increase the significance of F. velutipes as a new model. PMID:24714189

  9. A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

    OpenAIRE

    Zuccolo, Andrea; Bowers, John E; Estill, James C; Xiong, Zhiyong; Luo, Meizhong; Sebastian, Aswathy; Goicoechea, José Luis; Collura, Kristi; Yu, Yeisoo; Jiao, Yuannian; Duarte, Jill; Tang, Haibao; Ayyampalayam, Saravanaraj; Rounsley, Steve; Kudrna, Dave

    2011-01-01

    Background Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome....

  10. Software for computing and annotating genomic ranges.

    Directory of Open Access Journals (Sweden)

    Michael Lawrence

    Full Text Available We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

  11. Software for computing and annotating genomic ranges.

    Science.gov (United States)

    Lawrence, Michael; Huber, Wolfgang; Pagès, Hervé; Aboyoun, Patrick; Carlson, Marc; Gentleman, Robert; Morgan, Martin T; Carey, Vincent J

    2013-01-01

    We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
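
    The overlap detection and coverage calculation mentioned in this abstract can be illustrated with a small sketch. It is written in plain Python rather than R and does not use or reproduce the IRanges or GenomicRanges API; the `Range` type, the function names and the toy coordinates are all assumptions made for illustration, and the naive pairwise scan stands in for the efficient algorithms the abstract refers to.

```python
from collections import namedtuple

# Hypothetical minimal range type: 1-based, closed intervals on one sequence.
Range = namedtuple("Range", ["seqname", "start", "end"])


def overlaps(a, b):
    """True if two ranges share at least one position on the same sequence."""
    return a.seqname == b.seqname and a.start <= b.end and b.start <= a.end


def find_overlaps(query, subject):
    """Return (query_index, subject_index) pairs for all overlapping ranges.

    Naive O(n*m) scan kept for clarity; real tools use interval trees
    or sorted sweeps instead.
    """
    return [
        (i, j)
        for i, q in enumerate(query)
        for j, s in enumerate(subject)
        if overlaps(q, s)
    ]


def coverage(ranges, seqname, length):
    """Per-position depth along one sequence, computed with a difference array.

    Assumes every range on `seqname` lies within 1..length.
    """
    diff = [0] * (length + 2)
    for r in ranges:
        if r.seqname != seqname:
            continue
        diff[r.start] += 1      # depth rises at the start position
        diff[r.end + 1] -= 1    # and falls just past the end position
    depth, running = [], 0
    for pos in range(1, length + 1):
        running += diff[pos]
        depth.append(running)
    return depth


# Toy data (hypothetical): three reads and two annotated features.
reads = [Range("chr1", 1, 4), Range("chr1", 3, 6), Range("chr2", 2, 5)]
genes = [Range("chr1", 4, 8), Range("chr2", 6, 9)]

hits = find_overlaps(reads, genes)   # [(0, 0), (1, 0)]
cov = coverage(reads, "chr1", 8)     # [1, 1, 2, 2, 1, 1, 0, 0]
```

    The difference-array approach touches each range twice and each position once, which is one common way to keep coverage calculation linear in the sequence length.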

  12. Evaluation of K-ras and p53 expression in pancreatic adenocarcinoma using the cancer genome atlas.

    Directory of Open Access Journals (Sweden)

    Liming Lu

    Full Text Available Genetic alterations in K-ras and p53 are thought to be critical in pancreatic cancer development and progression. However, K-ras and p53 expression in pancreatic adenocarcinoma has not been systematically examined in The Cancer Genome Atlas (TCGA) Data Portal. Information regarding K-ras and p53 alterations, mRNA expression data, and protein/protein phosphorylation abundance was retrieved from The Cancer Genome Atlas (TCGA) databases, and analyses were performed by the cBioPortal for Cancer Genomics. The mutual exclusivity analysis showed that events in K-ras and p53 were likely to co-occur in pancreatic adenocarcinoma (log odds ratio = 1.599, P = 0.006). The graphical summary of the mutations showed that there were hotspots for protein activation. In the network analysis, no solid association between K-ras and p53 was observed in pancreatic adenocarcinoma. In the survival analysis, neither K-ras nor p53 was associated with either of the two survival events. As a data mining study of the TCGA databases, our study provides a new perspective to understand the genetic features of K-ras and p53 in pancreatic adenocarcinoma.
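
    As a rough illustration of the log odds ratio reported above, the sketch below computes it from a 2x2 table of samples altered in both genes, in only one, or in neither. The counts are hypothetical and are not taken from the study, the function name is an assumption, and the natural logarithm is assumed because the abstract does not state which base was used.

```python
import math


def log_odds_ratio(both, only_a, only_b, neither):
    """Natural-log odds ratio for a 2x2 co-alteration table.

    Positive values suggest co-occurrence, negative values mutual exclusivity.
    """
    if 0 in (both, only_a, only_b, neither):
        raise ValueError("log odds ratio is undefined when any cell count is zero")
    return math.log((both * neither) / (only_a * only_b))


# Hypothetical counts, not taken from the study.
lor = log_odds_ratio(both=40, only_a=20, only_b=10, neither=30)  # ln(6)
```

    A significance test such as Fisher's exact test would normally accompany the ratio; it is left out here to keep the sketch short.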

  13. Gene Expression Analysis of Escherichia coli Grown in Miniaturized Bioreactor Platforms for High-Throughput Analysis of Growth and Genomic Data

    DEFF Research Database (Denmark)

    Boccazzi, P.; Zanzotto, A.; Szita, Nicolas

    2005-01-01

    Combining high-throughput growth physiology and global gene expression data analysis is of significant value for integrating metabolism and genomics. We compared global gene expression using 500 ng of total RNA from Escherichia coli cultures grown in rich or defined minimal media in a miniaturize...... cultures using just 500 ng of total RNA indicate that high-throughput integration of growth physiology and genomics will be possible with novel biochemical platforms and improved detection technologies....

  14. Improvisation in evolution of genes and genomes: whose structure is it anyway?

    Science.gov (United States)

    Shakhnovich, Boris E; Shakhnovich, Eugene I

    2008-06-01

    Significant progress has been made in recent years in a variety of seemingly unrelated fields such as sequencing, protein structure prediction, and high-throughput transcriptomics and metabolomics. At the same time, new microscopic models have been developed that made it possible to analyze the evolution of genes and genomes from first principles. The results from these efforts enable, for the first time, a comprehensive insight into the evolution of complex systems and organisms on all scales--from sequences to organisms and populations. Every newly sequenced genome uncovers new genes, families, and folds. Where do these new genes come from? How do gene duplication and subsequent divergence of sequence and structure affect the fitness of the organism? What role does regulation play in the evolution of proteins and folds? Emerging synergism between data and modeling provides first robust answers to these questions.

  15. Comparative Annotation of Viral Genomes with Non-Conserved Gene Structure

    DEFF Research Database (Denmark)

    de Groot, Saskia; Mailund, Thomas; Hein, Jotun

    2007-01-01

    Motivation: Detecting genes in viral genomes is a complex task. Due to the biological necessity of them being constrained in length, RNA viruses in particular tend to code in overlapping reading frames. Since one amino acid is encoded by a triplet of nucleic acids, up to three genes may be coded...... allows for coding in unidirectional nested and overlapping reading frames, to annotate two homologous aligned viral genomes. Our method does not insist on conserved gene structure between the two sequences, thus making it applicable for the pairwise comparison of more distantly related sequences. Results...... and HIV2, as well as of two different Hepatitis Viruses, attaining results of ~87% sensitivity and ~98.5% specificity. We subsequently incorporate prior knowledge by "knowing" the gene structure of one sequence and annotating the other conditional on it. Boosting accuracy close to perfect we demonstrate...

  16. Genome-Wide Identification, Characterization and Expression Analysis of the Solute Carrier 6 Gene Family in Silkworm (Bombyx mori).

    Science.gov (United States)

    Tang, Xin; Liu, Huawei; Chen, Quanmei; Wang, Xin; Xiong, Ying; Zhao, Ping

    2016-10-03

    The solute carrier 6 (SLC6) gene family, initially known as the neurotransmitter transporters, plays vital roles in the regulation of neurotransmitter signaling, nutrient absorption and motor behavior. In this study, a total of 16 candidate genes were identified as SLC6 family gene homologs in the silkworm (Bombyx mori) genome. Spatio-temporal expression patterns of silkworm SLC6 gene transcripts indicated that these genes were highly and specifically expressed in midgut, brain and gonads; moreover, these genes were expressed primarily at the feeding stage or adult stage. Levels of expression for most midgut-specific and midgut-enriched gene transcripts were down-regulated after starvation but up-regulated after re-feeding. In addition, we observed that expression levels of these genes except for BmSLC6-15 and BmGT1 were markedly up-regulated by a juvenile hormone analog. Moreover, brain-enriched genes showed differential expression patterns during wandering and mating processes, suggesting that these genes may be involved in modulating wandering and mating behaviors. Our results improve our understanding of the expression patterns and potential physiological functions of the SLC6 gene family, and provide valuable information for the comprehensive functional analysis of the SLC6 gene family.

  17. Embryonic stem cell-like features of testicular carcinoma in situ revealed by genome-wide gene expression profiling.

    Science.gov (United States)

    Almstrup, Kristian; Hoei-Hansen, Christina E; Wirkner, Ute; Blake, Jonathon; Schwager, Christian; Ansorge, Wilhelm; Nielsen, John E; Skakkebaek, Niels E; Rajpert-De Meyts, Ewa; Leffers, Henrik

    2004-07-15

    Carcinoma in situ (CIS) is the common precursor of histologically heterogeneous testicular germ cell tumors (TGCTs), which in recent decades have markedly increased and now are the most common malignancy of young men. Using genome-wide gene expression profiling, we identified >200 genes highly expressed in testicular CIS, including many never reported in testicular neoplasms. Expression was further verified by semiquantitative reverse transcription-PCR and in situ hybridization. Among the highest expressed genes were NANOG and POU5F1, and reverse transcription-PCR revealed possible changes in their stoichiometry on progression into embryonic carcinoma. We compared the CIS expression profile with patterns reported in embryonic stem cells (ESCs), which revealed a substantial overlap that may be as high as 50%. We also demonstrated an over-representation of expressed genes in regions of 17q and 12, reported as unstable in cultured ESCs. The close similarity between CIS and ESCs explains the pluripotency of CIS. Moreover, the findings are consistent with an early prenatal origin of TGCTs and thus suggest that etiologic factors operating in utero are of primary importance for the incidence trends of TGCTs. Finally, some of the highly expressed genes identified in this study are promising candidates for new diagnostic markers for CIS and/or TGCTs.

  18. High-throughput SHAPE analysis reveals structures in HIV-1 genomic RNA strongly conserved across distinct biological states.

    Directory of Open Access Journals (Sweden)

    Kevin A Wilkinson

    2008-04-01

    Full Text Available Replication and pathogenesis of the human immunodeficiency virus (HIV) are tightly linked to the structure of its RNA genome, but genome structure in infectious virions is poorly understood. We invent high-throughput SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) technology, which uses many of the same tools as DNA sequencing, to quantify RNA backbone flexibility at single-nucleotide resolution and from which robust structural information can be immediately derived. We analyze the structure of HIV-1 genomic RNA in four biologically instructive states, including the authentic viral genome inside native particles. Remarkably, given the large number of plausible local structures, the first 10% of the HIV-1 genome exists in a single, predominant conformation in all four states. We also discover that noncoding regions functioning in a regulatory role have significantly lower (p-value < 0.0001) SHAPE reactivities, and hence more structure, than do viral coding regions that function as the template for protein synthesis. By directly monitoring protein binding inside virions, we identify the RNA recognition motif for the viral nucleocapsid protein. Seven structurally homologous binding sites occur in a well-defined domain in the genome, consistent with a role in directing specific packaging of genomic RNA into nascent virions. In addition, we identify two distinct motifs that are targets for the duplex destabilizing activity of this same protein. The nucleocapsid protein destabilizes local HIV-1 RNA structure in ways likely to facilitate initial movement both of the retroviral reverse transcriptase from its tRNA primer and of the ribosome in coding regions. Each of the three nucleocapsid interaction motifs falls in a specific genome domain, indicating that local protein interactions can be organized by the long-range architecture of an RNA. High-throughput SHAPE reveals a comprehensive view of HIV-1 RNA genome structure, and further

  19. Genome-wide identification and characterization of NB-ARC resistant genes in wheat (Triticum aestivum L.) and their expression during leaf rust infection.

    Science.gov (United States)

    Chandra, Saket; Kazmi, Andaleeb Z; Ahmed, Zainab; Roychowdhury, Gargi; Kumari, Veena; Kumar, Manish; Mukhopadhyay, Kunal

    2017-07-01

    NB-ARC domain-containing resistance genes from the wheat genome were identified, characterized and localized on chromosome arms that displayed differential yet positive response during incompatible and compatible leaf rust interactions. Wheat (Triticum aestivum L.) is an important cereal crop; however, its production is affected severely by numerous diseases including rusts. An efficient, cost-effective and ecologically viable approach to control pathogens is through host resistance. In wheat, high numbers of resistance loci are present but only a few have been identified and cloned. A comprehensive analysis of the NB-ARC-containing genes in the complete wheat genome was accomplished in this study. Complete NB-ARC encoding genes were mined from the Ensembl Plants database to predict 604 NB-ARC containing sequences using the HMM approach. Genome-wide analysis of orthologous clusters in the NB-ARC-containing sequences of wheat and other members of the Poaceae family revealed maximum homology with Oryza sativa indica and Brachypodium distachyon. The identification of overlap between orthologous clusters enabled the elucidation of the function and evolution of resistance proteins. The distributions of the NB-ARC domain-containing sequences were found to be balanced among the three wheat sub-genomes. Wheat chromosome arms 4AL and 7BL had the most NB-ARC domain-containing contigs. The spatio-temporal expression profiling studies exemplified the positive role of these genes in resistant and susceptible wheat plants during incompatible and compatible interaction in response to the leaf rust pathogen Puccinia triticina. Two NB-ARC domain-containing sequences were modelled in silico, cloned and sequenced to analyze their fine structures. The data obtained in this study will augment isolation, characterization and application of NB-ARC resistance genes in marker-assisted selection-based breeding programs for improving rust resistance in wheat.

  20. Genome-wide analysis of the sox family in the calcareous sponge Sycon ciliatum: multiple genes with unique expression patterns

    Directory of Open Access Journals (Sweden)

    Fortunato Sofia

    2012-07-01

    Full Text Available Abstract Background: Sox genes are HMG-domain containing transcription factors with important roles in developmental processes in animals; many of them appear to have conserved functions among eumetazoans. Demosponges have fewer Sox genes than eumetazoans, but their roles remain unclear. The aim of this study is to gain insight into the early evolutionary history of the Sox gene family by identification and expression analysis of Sox genes in the calcareous sponge Sycon ciliatum. Methods: Calcaronean Sox-related sequences were retrieved by searching recently generated genomic and transcriptome sequence resources and analyzed using a variety of phylogenetic methods and identification of conserved motifs. Expression was studied by whole mount in situ hybridization. Results: We have identified seven Sox genes and four Sox-related genes in the complete genome of Sycon ciliatum. Phylogenetic and conserved motif analyses showed that five of the Sycon Sox genes represent groups B, C, E, and F present in cnidarians and bilaterians. Two additional genes are classified as Sox genes but cannot be assigned to specific subfamilies, and four genes are more similar to Sox genes than to other HMG-containing genes. Thus, the repertoire of Sox genes is larger in this representative of calcareous sponges than in the demosponge Amphimedon queenslandica. It remains unclear whether this is due to the expansion of the gene family in Sycon or a secondary reduction in the Amphimedon genome. In situ hybridization of Sycon Sox genes revealed a variety of expression patterns during embryogenesis and in specific cell types of adult sponges. Conclusions: In this study, we describe a large family of Sox genes in Sycon ciliatum with dynamic expression patterns, indicating that Sox genes are regulators in development and cell type determination in sponges, as observed in higher animals. The revealed differences between the demosponge and calcisponge Sox gene repertoires highlight the need to

  1. Genome-wide identification, molecular characterization and expression analysis of the ROP GTPase family in pepper (Capsicum annuum)

    International Nuclear Information System (INIS)

    Huang, D.; Li, M.; He, S.

    2015-01-01

    ROP/RAC GTPases are a plant-specific subfamily of Rho GTPases that play versatile roles in the regulation of plant growth and development, hormone signal transduction and responses to the environment. Prior to the present study, only one Rop gene had been described in pepper. The recent release of the draft genome sequence of pepper, however, allows a genome-wide search to identify how many Rop family members exist in the pepper genome. We carried out bioinformatic analysis to establish the conserved and divergent regions of the protein sequences. Phylogenetic analysis shows that the CaROPs can be distributed into four groups, as described in the literature for their homologs in Arabidopsis. To understand the function of the nine Rop genes in pepper, we studied the expression patterns of CaRop genes across tissues, fruit development and ripening using RNA-seq data obtained from a public database. Our analysis shows that the expression of CaRop genes is not entirely tissue-specific or development-specific. Furthermore, gene expression profiles of CaRop in response to environmental stresses and hormone treatments, namely inoculation with Ralstonia solanacearum, heat stress and treatment with four phytohormones, were evaluated with real-time RT-PCR. The potential involvement of specific CaRop genes in growth, fruit development, ripening, environmental stress and hormone responses is discussed and may lay the foundation for future functional analysis to unravel their biological roles. (author)

  2. Western environment/lifestyle is associated with increased genome methylation and decreased gene expression in Chinese immigrants living in Australia.

    Science.gov (United States)

    Zhang, Guicheng; Wang, Kui; Schultz, Ennee; Khoo, Siew-Kim; Zhang, Xiaopeng; Annamalay, Alicia; Laing, Ingrid A; Hales, Belinda J; Goldblatt, Jack; Le Souëf, Peter N

    2016-01-01

    Several human diseases and conditions are disproportionately distributed in the world with a significant "Western-developed" vs. "Eastern-developing" gradient. We compared genome-wide DNA methylation of peripheral blood mononuclear cells in 25 newly arrived Chinese immigrants living in a Western environment for less than 6 months ("Newly arrived") with 23 Chinese immigrants living in the Western environment for more than two years ("Long-term") with a mean of 8.7 years, using the Infinium HumanMethylation450 BeadChip. In a sub-group of both subject groups (n = 12 each) we also investigated genome-wide gene expression using a Human HT-12 v4 Expression BeadChip. Of the 382,250 valid CpG sites, 62.5% of probes had a greater mean Beta (β) in "Long-term" than in "Newly arrived". In the regions of CpG islands and gene promoters, compared with the CpG sites in all other regions, lower percentages of CpG sites with mean methylation levels in "Long-term" greater than "Newly arrived" were observed, but still >50%. The increase of methylation was associated with a general decrease of gene expression in Chinese immigrants living in the Western environment for a longer period of time. After adjusting for age, gender and other confounding factors the findings remained. Chinese immigrants living in Australia for a longer period of time have increased overall genome methylation and decreased overall gene expression compared with newly arrived immigrants. © 2015 Wiley Periodicals, Inc.

  3. Whole genome expression array profiling highlights differences in mucosal defense genes in Barrett's esophagus and esophageal adenocarcinoma.

    Directory of Open Access Journals (Sweden)

    Derek J Nancarrow

    Full Text Available Esophageal adenocarcinoma (EAC) has become a major concern in Western countries due to rapid rises in incidence coupled with very poor survival rates. One of the key risk factors for the development of this cancer is the presence of Barrett's esophagus (BE), which is believed to form in response to repeated gastro-esophageal reflux. In this study we performed comparative, genome-wide expression profiling (using Illumina whole-genome Beadarrays) on total RNA extracted from esophageal biopsy tissues from individuals with EAC, BE (in the absence of EAC) and those with normal squamous epithelium. We combined these data with publicly accessible raw data from three similar studies to investigate key gene and ontology differences between these three tissue states. The results support the deduction that BE is a tissue with enhanced glycoprotein synthesis machinery (DPP4, ATP2A3, AGR2) designed to provide strong mucosal defenses aimed at resisting gastro-esophageal reflux. EAC exhibits the enhanced extracellular matrix remodeling (collagens, IGFBP7, PLAU) effects expected in an aggressive form of cancer, as well as evidence of reduced expression of genes associated with mucosal (MUC6, CA2, TFF1) and xenobiotic (AKR1C2, AKR1B10) defenses. When our results are compared to previous whole-genome expression profiling studies, keratin, mucin, annexin and trefoil factor gene groups are the most frequently represented differentially expressed gene families. Eleven genes identified here are also represented in at least 3 other profiling studies. We used these genes to discriminate between squamous epithelium, BE and EAC within the two largest cohorts using a support vector machine leave one out cross validation (LOOCV) analysis. While this method was satisfactory for discriminating squamous epithelium and BE, it demonstrates the need for more detailed investigations into profiling changes between BE and EAC.
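
    The leave-one-out cross-validation (LOOCV) procedure named in this abstract can be sketched as follows. To stay dependency-free, the sketch swaps the study's support vector machine for a simple nearest-centroid classifier, so it illustrates only the validation loop, not the authors' model. The function names and the two-gene expression values are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to sample.

    Stand-in classifier (not an SVM), chosen only to keep the sketch short.
    """
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, v in enumerate(x):
            acc[k] += v
        counts[y] = counts.get(y, 0) + 1

    def sq_dist(label):
        # Squared Euclidean distance from the sample to the class centroid.
        return sum((s / counts[label] - v) ** 2
                   for s, v in zip(sums[label], sample))

    # Sorting the labels makes tie-breaking deterministic.
    return min(sorted(sums), key=sq_dist)


def loocv_accuracy(xs, ys, predict):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical two-gene expression values for two well-separated tissue classes.
xs = [[1.0, 0.2], [1.2, 0.1], [0.9, 0.3],
      [5.0, 4.8], [5.2, 5.1], [4.9, 5.0]]
ys = ["squamous"] * 3 + ["BE"] * 3

accuracy = loocv_accuracy(xs, ys, nearest_centroid_predict)  # 1.0 on this toy data
```

    Because `loocv_accuracy` takes the prediction function as a parameter, any other classifier with the same call signature could be dropped in without changing the validation loop.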

  4. Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure.

    Science.gov (United States)

    Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K

    2017-04-01

    There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule r