WorldWideScience

Sample records for earthworm genomes genes

  1. Complete mitochondrial genome of four pheretimoid earthworms (Clitellata: Oligochaeta) and their phylogenetic reconstruction.

    Science.gov (United States)

    Zhang, Liangliang; Jiang, Jibao; Dong, Yan; Qiu, Jiangping

    2015-12-15

    Among oligochaetes, the Pheretima complex within the Megascolecidae is a major earthworm group. Recently, however, the systematics of the Pheretima complex based on morphology are challenged by molecular studies. Since little comparative analysis of earthworm complete mitochondrial genomes has been reported yet, we sequenced mitogenomes of four pheretimoid earthworm species to explore their phylogenetic relationships. The general earthworm genomic features are also found in four earthworms: all genes transcribed from the same strand, the same initiation codon ATG for each PCGs, and conserved structures of RNA genes. Interestingly we find an extra potential tRNA-leucine (CUN) in Amynthas longisiphonus. The earthworm mitochondrial ATP8 exhibits the highest evolutionary rate, while the gene CO1 evolves slowest. Phylogenetic analysis based on protein-coding genes (PCGs) strongly supports the monophyly of the Clitellata, Hirudinea, Oligochaeta, Megascolecidae and Pheretima complex. Our analysis, however, reveals non-monophyly within the genara Amynthas and Metaphire. Thus the generic divisions based on morphology in the Pheretima complex should be reconsidered. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Comparative Genomics of Symbiotic Bacteria in Earthworm Nephridia

    DEFF Research Database (Denmark)

    Kjeldsen, Kasper Urup; Pinel, Nicolas; Lund, Marie Braad

    The excretory and osmoregulatory organs (nephridia) of lumbricid earthworms are densely colonized by extracellular bacterial symbionts belonging to the newly established betaproteobacterial genus Verminephrobacter. The nephridial symbiont of the earthworm Eisenia fetida was subjected to full geno...

  3. Pheromone evolution, reproductive genes, and comparative transcriptomics in mediterranean earthworms (annelida, oligochaeta, hormogastridae).

    Science.gov (United States)

    Novo, Marta; Riesgo, Ana; Fernández-Guerra, Antoni; Giribet, Gonzalo

    2013-07-01

    Animals inhabiting cryptic environments are often subjected to morphological stasis due to the lack of obvious agents driving selection, and hence chemical cues may be important drivers of sexual selection and individual recognition. Here, we provide a comparative analysis of de novo-assembled transcriptomes in two Mediterranean earthworm species with the objective to detect pheromone proteins and other reproductive genes that could be involved in cryptic speciation processes, as recently characterized in other earthworm species. cDNA libraries of unspecific tissue of Hormogaster samnitica and three different tissues of H. elisae were sequenced in an Illumina Genome Analyzer II or Hi-Seq. Two pheromones, Attractin and Temptin were detected in all tissue samples and both species. Attractin resulted in a reliable marker for phylogenetic inference. Temptin contained multiple paralogs and was slightly overexpressed in the digestive tissue, suggesting that these pheromones could be released with the casts. Genes involved in sexual determination and fertilization were highly expressed in reproductive tissue. This is thus the first detailed analysis of the molecular machinery of sexual reproduction in earthworms.

  4. C60 exposure induced tissue damage and gene expression alterations in the earthworm Lumbricus rubellus

    NARCIS (Netherlands)

    Ploeg, van der M.J.C.; Handy, R.D.; Heckmann, L.H.; Hout, van der A.; Brink, van den N.W.

    2013-01-01

    Effects of C60 exposure (0, 15 or 154 mg/kg soil) on the earthworm Lumbricus rubellus were assessed at the tissue and molecular level, in two experiments. In the first experiment, earthworms were exposed for four weeks, and in the second lifelong. In both experiments, gene expression of heat shock

  5. Identification and optimization of classifier genes from multi-class earthworm microarray dataset.

    Directory of Open Access Journals (Sweden)

    Ying Li

    Full Text Available Monitoring, assessment and prediction of environmental risks that chemicals pose demand rapid and accurate diagnostic assays. A variety of toxicological effects have been associated with explosive compounds TNT and RDX. One important goal of microarray experiments is to discover novel biomarkers for toxicity evaluation. We have developed an earthworm microarray containing 15,208 unique oligo probes and have used it to profile gene expression in 248 earthworms exposed to TNT, RDX or neither. We assembled a new machine learning pipeline consisting of several well-established feature filtering/selection and classification techniques to analyze the 248-array dataset in order to construct classifier models that can separate earthworm samples into three groups: control, TNT-treated, and RDX-treated. First, a total of 869 genes differentially expressed in response to TNT or RDX exposure were identified using a univariate statistical algorithm of class comparison. Then, decision tree-based algorithms were applied to select a subset of 354 classifier genes, which were ranked by their overall weight of significance. A multiclass support vector machine (MC-SVM method and an unsupervised K-mean clustering method were applied to independently refine the classifier, producing a smaller subset of 39 and 30 classifier genes, separately, with 11 common genes being potential biomarkers. The combined 58 genes were considered the refined subset and used to build MC-SVM and clustering models with classification accuracy of 83.5% and 56.9%, respectively. This study demonstrates that the machine learning approach can be used to identify and optimize a small subset of classifier/biomarker genes from high dimensional datasets and generate classification models of acceptable precision for multiple classes.

  6. Oxidative stress and gene expression of earthworm (Eisenia fetida) to clothianidin.

    Science.gov (United States)

    Liu, Tong; Wang, Xiuguo; You, Xiangwei; Chen, Dan; Li, Yiqiang; Wang, Fenglong

    2017-08-01

    Neonicotinoid insecticides have become the most widely used pesticides in the world. Clothianidin is a novel neonicotinoid insecticide with a thiazolyl ring that exhibits excellent biological efficacy against a variety of pests. In the present study, the oxidative stress and genotoxicity of clothianidin on earthworms were evaluated. Moreover, the effective concentrations of clothianidin in artificial soil were monitored during the whole exposure period. The results showed that clothianidin was stable in artificial soil and that the residue concentrations were 0.094, 0.476, and 0.941mg/kg after 28 d of exposure, which represented changes no more than 10% compared to the concentrations on the 0th day. Additionally, both the concentration of and exposure time to clothianidin had a substantial influence on biomarkers in earthworms. At 0.5mg/kg and 1.0mg/kg, the reactive oxygen species (ROS) levels were greatly enhanced, causing changes in antioxidant enzyme activities, damage to biological macromolecules and abnormal expression of functional genes. Additionally, the present results showed that superoxide dismutase (SOD), DNA damage and heat shock protein 70 (HSP70) may be good indicators for environmental risk assessment of clothianidin to earthworms. Copyright © 2017. Published by Elsevier Inc.

  7. Endocrine disruptors in soil: Effects of bisphenol A on gene expression of the earthworm Eisenia fetida.

    Science.gov (United States)

    Novo, M; Verdú, I; Trigo, D; Martínez-Guitarte, J L

    2018-04-15

    Xenobiotics such as bisphenol A (BPA), are present in biosolids, which are applied as organic fertilizers in agricultural fields. Their effects on soil life have been poorly assessed, and this is particularly important in the case of earthworms, which represent the main animal biomass in this medium. In the present work we study the impacts of BPA on gene expression of Eisenia fetida, a widely used ecotoxicological model. Chronic soil tests and acute contact tests were performed, and gene expression was analyzed in total tissue and in masculine reproductive organs of the earthworms. The genes studied in this research played a role in endocrine pathways, detoxification mechanisms, stress response, epigenetics, and genotoxicity. Most of the genes were identified for the first time, providing potentially useful biomarkers for future assessments. For chronic exposures, no changes were detected in whole-body tissue; however, masculine reproductive organs showed changes in the expression of genes related to endocrine function (EcR, MAPR, AdipoR), epigenetic mechanisms (DNMTs), genotoxicity (PARP1), and stress responses (HSC70 4). For acute exposures, the expression of one epigenetic-related gene was altered for both whole-body tissues and male reproductive organs (Piwi2). Further changes were detected for whole-body tissues involved in detoxification (Metallothionein), stress (HSC70 4), and genotoxicity (PARP1) mechanisms. Acute exposure effects were also tested in whole-body tissues of juveniles, showing changes in the expression of Metallothionein and Piwi2. The molecular changes found in the analyzed earthworms indicate that exposure to BPA may have negative implications in their populations. Particularly interesting are the alterations related to epigenetic mechanisms, which suggest that future generations may be impacted. This study is the first to evaluate the molecular effects of BPA on soil organisms, and further assays will be necessary to better characterize

  8. Variation in gene expression within clones of the earthworm Dendrobaena octaedra.

    Directory of Open Access Journals (Sweden)

    Marina Mustonen

    Full Text Available Gene expression is highly plastic, which can help organisms to both acclimate and adapt to changing environments. Possible variation in gene expression among individuals with the same genotype (among clones is not widely considered, even though it could impact the results of studies that focus on gene expression phenotypes, for example studies using clonal lines. We examined the extent of within and between clone variation in gene expression in the earthworm Dendrobaena octaedra, which reproduces through apomictic parthenogenesis. Five microsatellite markers were developed and used to confirm that offspring are genetic clones of their parent. After that, expression of 12 genes was measured from five individuals each from six clonal lines after exposure to copper contaminated soil. Variation in gene expression was higher over all genotypes than within genotypes, as initially assumed. A subset of the genes was also examined in the offspring of exposed individuals in two of the clonal lines. In this case, variation in gene expression within genotypes was as high as that observed over all genotypes. One gene in particular (chymotrypsin inhibitor also showed significant differences in the expression levels among genetically identical individuals. Gene expression can vary considerably, and the extent of variation may depend on the genotypes and genes studied. Ensuring a large sample, with many different genotypes, is critical in studies comparing gene expression phenotypes. Researchers should be especially cautious inferring gene expression phenotypes when using only a single clonal or inbred line, since the results might be specific to only certain genotypes.

  9. Antioxidant gene expression and metabolic responses of earthworms (Eisenia fetida) after exposure to various concentrations of hexabromocyclododecane.

    Science.gov (United States)

    Shi, Yajuan; Xu, Xiangbo; Chen, Juan; Liang, Ruoyu; Zheng, Xiaoqi; Shi, Yajing; Wang, Yurong

    2018-01-01

    Hexabromocyclododecane (HBCD), a ubiquitous suspected contaminant, is one of the world's most prominent brominated flame retardants (BFRs). In the present study, earthworms (Eisenia fetida) were exposed to HBCD. The expression of selected antioxidant enzyme genes was measured, and the metabolic responses were assessed using nuclear magnetic resonance (NMR) to identify the molecular mechanism of the antioxidant stress reaction and the metabolic reactions of earthworms to HBCD. A significant up-regulation (p  0.05). Principal component analysis (PCA) of the metabolic responses showed that all groups could be clearly differentiated, and the highest concentration dose group was the most distant from the control group. Except for fumarate, the measured metabolites, which included adenosine triphosphate (ATP), valine, lysine, glycine, betaine and lactate, revealed significant (p earthworm exposure studies. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Responses of growth inhibition and antioxidant gene expression in earthworms (Eisenia fetida) exposed to tetrabromobisphenol A, hexabromocyclododecane and decabromodiphenyl ether.

    Science.gov (United States)

    Shi, Ya-juan; Xu, Xiang-bo; Zheng, Xiao-qi; Lu, Yong-long

    2015-01-01

    Tetrabromobisphenol A (TBBPA), hexabromocyclododecane (HBCD) and decabromodiphenyl ether (BDE 209), suspected ubiquitous contaminants, account for the largest volume of brominated flame retardants (BFRs) since penta-BDE and octa-BDE have been phased out globally. In this paper, the growth inhibition and gene transcript levels of antioxidant enzymes (superoxide dismutase (SOD), catalase (CAT)) and the stress-response gene involved in the prevention of oxidative stress (Hsp70) of earthworms (Eisenia fetida) exposed to TBBPA, HBCD and BDE 209 were measured to identify the toxicity effects of selected BFRs on earthworms. The growth of earthworms treated by TBBPA at 200 and 400 mg/kg dw were inhibited at rate of 13.7% and 22.0% respectively, while there was no significant growth inhibition by HBCD and BDE 209. A significant (Pearthworms exposed to TBBPA at 50 mg/kg dw (1.77-fold) and to HBCD at 400 mg/kg dw (2.06-fold). The transcript level of Hsp70 gene was significantly up-regulated (Pearthworms exposed to TBBPA at concentration of 50-200 mg/kg (2.16-2.19-fold) and HBCD at 400 mg/kg (2.61-fold). No significant variation of CAT gene expression in all the BFRs treatments was observed, neither does all the target gene expression level exposed to BDE 209. Assessed by growth inhibition and the changes at mRNA levels of encoding genes in earthworms, TBBPA showed the greatest toxicity, followed by HBCD and BDE 209, consistent with trends in molecular properties. The results help to understand the molecular mechanism of antioxidant defense. Copyright © 2015 Elsevier Inc. All rights reserved.

  11. Toxic responses of Sox2 gene in the regeneration of the earthworm Eisenia foetida exposed to Retnoic acid.

    Science.gov (United States)

    Tao, Jing; Rong, Wei; Diao, Xiaoping; Zhou, Hailong

    2018-01-01

    Exogenous retinoic acid delays and disturbs the regeneration of Eisenia foetida. The stem cell pluripotency factor, Sox2, can play a crucial role in cell reprogramming and dedifferentiation. In this study, we compared the regeneration of Eisenia foetida in different segments after amputation and the effects of retinoic acid on the regeneration of different segments. The results showed that the regeneration speed of the head and tail was slightly faster than the middle part, and retinoic acid disrupted and delayed the regeneration of the earthworm. The qRT-PCR and Western blot analysis showed that the expression of the Sox2 gene and Sox2 protein was highest on the seventh day in different segments (pregeneration of earthworms and the formation of blastema are related to the expression of the Sox2 gene and protein. Retinoic acid delays and interferes with the regeneration of the earthworm by affecting the expression levels of the Sox2 gene and protein. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. Identification of a cytochrome P450 gene in the earthworm Eisenia fetida and its mRNA expression under enrofloxacin stress.

    Science.gov (United States)

    Li, Yinsheng; Zhao, Chun; Lu, Xiaoxu; Ai, Xiaojie; Qiu, Jiangping

    2018-04-15

    Cytochrome P450 (CYP450) enzymes are a family of hemoproteins primarily responsible for detoxification functions. Earthworms have been used as a bioindicator of soil pollution in numerous studies, but no CYP450 gene has so far been cloned. RT-PCR and RACE-PCR were employed to construct and sequence the CYP450 gene DNA from the extracted mRNA in the earthworm Eisenia fetida. The cloned gene (EW1) has an open reading frame of 477bp. The 3'-terminal region contained both the consensus and the signature sequences characteristic of CYP450. It was closely related to the CYP450 gene from the flatworm genus Opisthorchis felineus with 87% homology. The predicted structure of the putative protein was 97% homologous to human CYP450 family 27. This gene has been deposited in GenBank (accession no. KM881474). Earthworms (E. fetida) were then exposed to 1, 10, 100, and 500mgkg -1 enrofloxacin in soils to explore the mRNA expression by real time qPCR. The effect of enrofloxacin on mRNA expression levels of EW1 exhibited a marked hormesis pattern across the enrofloxacin dose range tested. This is believed to be the first reported CYP450 gene in earthworms, with reference value for molecular studies on detoxification processes in earthworms. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. Genome position and gene amplification

    Czech Academy of Sciences Publication Activity Database

    Jirsová, Pavla; Snijders, A.M.; Kwek, S.; Roydasgupta, R.; Fridlyand, J.; Tokuyasu, T.; Pinkel, D.; Albertson, D. G.

    2007-01-01

    Roč. 8, č. 6 (2007), r120 ISSN 1474-760X Institutional research plan: CEZ:AV0Z50040507; CEZ:AV0Z50040702 Keywords : gene amplification * array comparative genomic hybridization * oncogene Subject RIV: BO - Biophysics Impact factor: 6.589, year: 2007

  14. Earthworm Protease

    Directory of Open Access Journals (Sweden)

    Rong Pan

    2010-01-01

    Full Text Available The alimentary tract of earthworm secretes a group of proteases with a relative wide substrate specificity. In 1983, six isozymes were isolated from earthworm with fibrinolytic activities and called fibriniolytic enzymes. So far, more isozymes have been found from different earthworm species such as Lumbricus rubellus and Eisenia fetida. For convenience, the proteases are named on the basis of the earthworm species and the protein function, for instance, Eisenia fetida protease (EfP. The proteases have the abilities not only to hydrolyze fibrin and other protein, but also activate proenzymes such as plasminogen and prothrombin. In the light of recent studies, eight of the EfPs contain oligosaccharides chains which are thought to support the enzyme structure. Interestingly, EfP-II has a broader substrate specificity presenting alkaline trypsin, chymotrypsin and elastase activities, but EfP-III-1 has a stricter specificity. The protein crystal structures show the characteristics in their specificities. Earthworm proteases have been applied in several areas such as clinical treatment of clotting diseases, anti-tumor study, environmental protection and nutritional production. The current clinical utilizations and some potential new applications of the earthworm protease will be discussed in this paper.

  15. Earthworm Protease

    International Nuclear Information System (INIS)

    Pan, R.; Zhang, Z.; He, R.

    2010-01-01

    The alimentary tract of earthworm secretes a group of proteases with a relative wide substrate specificity. In 1983, six isozymes were isolated from earthworm with fibrinolytic activities and called fibrinolytic enzymes. So far, more isozymes have been found from different earthworm species such as Lumbricus rubellus and Eisenia fetida. For convenience, the proteases are named on the basis of the earthworm species and the protein function, for instance, Eisenia fetida protease (EfP). The proteases have the abilities not only to hydrolyze fibrin and other protein, but also activate pro enzymes such as plasminogen and prothrombin. In the light of recent studies, eight of the EfPs contain oligosaccharides chains which are thought to support the enzyme structure. Interestingly, EfP-II has a broader substrate specificity presenting alkaline trypsin, chymotrypsin and elastase activities, but EfP-III-1 has a stricter specificity. The protein crystal structures show the characteristics in their specificities. Earthworm proteases have been applied in several areas such as clinical treatment of clotting diseases, anti-tumor study, environmental protection and nutritional production. The current clinical utilizations and some potential new applications of the earthworm protease will be discussed in this paper.

  16. Uses of antimicrobial genes from microbial genome

    Science.gov (United States)

    Sorek, Rotem; Rubin, Edward M.

    2013-08-20

    We describe a method for mining microbial genomes to discover antimicrobial genes and proteins having broad spectrum of activity. Also described are antimicrobial genes and their expression products from various microbial genomes that were found using this method. The products of such genes can be used as antimicrobial agents or as tools for molecular biology.

  17. Impacts of BDE209 addition on Pb uptake, subcellular partitioning and gene toxicity in earthworm (Eisenia fetida)

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Wei, E-mail: wzhang@ecust.edu.cn [State Environmental Protection Key Laboratory of Environmental Risk Assessment and Control on Chemical Process, Shanghai 200237 (China); School of Resource and Environmental Engineering, East China University of Science and Technology, Shanghai 200237 (China); Liu, Kou; Li, Jing; Liang, Jun; Lin, Kuangfei [State Environmental Protection Key Laboratory of Environmental Risk Assessment and Control on Chemical Process, Shanghai 200237 (China); School of Resource and Environmental Engineering, East China University of Science and Technology, Shanghai 200237 (China)

    2015-12-30

    Highlights: • 10 or 100 μg g{sup −1} BDE209 addition caused histological changes in Pb-exposed earthworms’ body wall. • Strong histopathological effects with BDE209 addition confirmed the enhanced Pb bioavailability. • The presence of higher levels of BDE209 altered subcellular partitioning of Pb in earthworm. • Co-exposure to Pb and BDE209 declined SOD and CAT gene transcripts synergistically. • BDE209 addition elicited up-regulation of Hsp90 gene expression compared to Pb exposure alone. - Abstract: Lead (Pb) and decabromodiphenyl ether (BDE209) are the mainly co-existed contaminants at e-waste recycling sites. The potential toxicity of Pb (250 μg g{sup −1}) to earthworm Eisenia fetida in the presence of BDE209 (1, 10 and 100 μg g{sup −1}) was determined during 14-d incubation period. Compared to Pb treatment alone, the co-exposure with 1 μg g{sup −1} BDE209 barely affected Pb uptake, subcellular partitioning and gene expression; however, histopathological changes in earthworms’ body wall (epidermal, circular and longitudinal muscles) demonstrated that 10 and 100 μg g{sup −1} BDE209 additions enhanced Pb uptake and altered its subcellular partitioning, indicating that Pb redistributed from fractions E (cell debris) and D (metal-rich granules) to fraction C (cytosols); Additionally, BDE209 supply significantly inhibited (p < 0.05) the induction of SOD (superoxide dismutase) and CAT (catalase) gene expressions (maximum down-regulation 59% for SOD gene at Pb + 100 μg g{sup −1} BDE209 and 89% for CAT gene at Pb + 10 μg g{sup −1} BDE209), while facilitated (p < 0.05) Hsp90 (heat shock protein 90) gene expression with maximum induction rate of 120% after exposure to Pb + 10 μg g{sup −1} BDE209. These findings illustrate the importance of considering environmental BDE209 co-exposure when assessing Pb bioaccumulation and toxicity in multi-contaminated soil ecosystems.

  18. Impacts of BDE209 addition on Pb uptake, subcellular partitioning and gene toxicity in earthworm (Eisenia fetida)

    International Nuclear Information System (INIS)

    Zhang, Wei; Liu, Kou; Li, Jing; Liang, Jun; Lin, Kuangfei

    2015-01-01

    Highlights: • 10 or 100 μg g −1 BDE209 addition caused histological changes in Pb-exposed earthworms’ body wall. • Strong histopathological effects with BDE209 addition confirmed the enhanced Pb bioavailability. • The presence of higher levels of BDE209 altered subcellular partitioning of Pb in earthworm. • Co-exposure to Pb and BDE209 declined SOD and CAT gene transcripts synergistically. • BDE209 addition elicited up-regulation of Hsp90 gene expression compared to Pb exposure alone. - Abstract: Lead (Pb) and decabromodiphenyl ether (BDE209) are the mainly co-existed contaminants at e-waste recycling sites. The potential toxicity of Pb (250 μg g −1 ) to earthworm Eisenia fetida in the presence of BDE209 (1, 10 and 100 μg g −1 ) was determined during 14-d incubation period. Compared to Pb treatment alone, the co-exposure with 1 μg g −1 BDE209 barely affected Pb uptake, subcellular partitioning and gene expression; however, histopathological changes in earthworms’ body wall (epidermal, circular and longitudinal muscles) demonstrated that 10 and 100 μg g −1 BDE209 additions enhanced Pb uptake and altered its subcellular partitioning, indicating that Pb redistributed from fractions E (cell debris) and D (metal-rich granules) to fraction C (cytosols); Additionally, BDE209 supply significantly inhibited (p < 0.05) the induction of SOD (superoxide dismutase) and CAT (catalase) gene expressions (maximum down-regulation 59% for SOD gene at Pb + 100 μg g −1 BDE209 and 89% for CAT gene at Pb + 10 μg g −1 BDE209), while facilitated (p < 0.05) Hsp90 (heat shock protein 90) gene expression with maximum induction rate of 120% after exposure to Pb + 10 μg g −1 BDE209. These findings illustrate the importance of considering environmental BDE209 co-exposure when assessing Pb bioaccumulation and toxicity in multi-contaminated soil ecosystems.

  19. Pichia stipitis genomics, transcriptomics, and gene clusters

    Science.gov (United States)

    Thomas W. Jeffries; Jennifer R. Headman Van Vleet

    2009-01-01

    Genome sequencing and subsequent global gene expression studies have advanced our understanding of the lignocellulose-fermenting yeast Pichia stipitis. These studies have provided an insight into its central carbon metabolism, and analysis of its genome has revealed numerous functional gene clusters and tandem repeats. Specialized physiological traits are often the...

  20. Persistence drives gene clustering in bacterial genomes

    Directory of Open Access Journals (Sweden)

    Rocha Eduardo PC

    2008-01-01

    Full Text Available Abstract Background Gene clustering plays an important role in the organization of the bacterial chromosome and several mechanisms have been proposed to explain its extent. However, the controversies raised about the validity of each of these mechanisms remind us that the cause of this gene organization remains an open question. Models proposed to explain clustering did not take into account the function of the gene products nor the likely presence or absence of a given gene in a genome. However, genomes harbor two very different categories of genes: those genes present in a majority of organisms – persistent genes – and those present in very few organisms – rare genes. Results We show that two classes of genes are significantly clustered in bacterial genomes: the highly persistent and the rare genes. The clustering of rare genes is readily explained by the selfish operon theory. Yet, genes persistently present in bacterial genomes are also clustered and we try to understand why. We propose a model accounting specifically for such clustering, and show that indispensability in a genome with frequent gene deletion and insertion leads to the transient clustering of these genes. The model describes how clusters are created via the gene flux that continuously introduces new genes while deleting others. We then test if known selective processes, such as co-transcription, physical interaction or functional neighborhood, account for the stabilization of these clusters. Conclusion We show that the strong selective pressure acting on the function of persistent genes, in a permanent state of flux of genes in bacterial genomes, maintaining their size fairly constant, that drives persistent genes clustering. A further selective stabilization process might contribute to maintaining the clustering.

  1. Gene conversion in the rice genome

    DEFF Research Database (Denmark)

    Xu, Shuqing; Clark, Terry; Zheng, Hongkun

    2008-01-01

    -chromosomal conversions distributed between chromosome 1 and 5, 2 and 6, and 3 and 5 are more frequent than genome average (Z-test, P ... is not tightly linked to natural selection in the rice genome. To assess the contribution of segmental duplication on gene conversion statistics, we determined locations of conversion partners with respect to inter-chromosomal segment duplication. The number of conversions associated with segmentation is less...... involved in conversion events. CONCLUSION: The evolution of gene families in the rice genome may have been accelerated by conversion with pseudogenes. Our analysis suggests a possible role for gene conversion in the evolution of pathogen-response genes....

  2. Genome-Wide Comparative Gene Family Classification

    Science.gov (United States)

    Frech, Christian; Chen, Nansheng

    2010-01-01

    Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done with the same high standard. This project has been designed to develop a strategy to effectively and accurately classify gene families across genomes. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter setting, i.e. different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species. PMID:20976221

  3. Visualizing conserved gene location across microbe genomes

    Science.gov (United States)

    Shaw, Chris D.

    2009-01-01

    This paper introduces an analysis-based zoomable visualization technique for displaying the location of genes across many related species of microbes. The purpose of this visualizatiuon is to enable a biologist to examine the layout of genes in the organism of interest with respect to the gene organization of related organisms. During the genomic annotation process, the ability to observe gene organization in common with previously annotated genomes can help a biologist better confirm the structure and function of newly analyzed microbe DNA sequences. We have developed a visualization and analysis tool that enables the biologist to observe and examine gene organization among genomes, in the context of the primary sequence of interest. This paper describes the visualization and analysis steps, and presents a case study using a number of Rickettsia genomes.

  4. JGI Plant Genomics Gene Annotation Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Shu, Shengqiang; Rokhsar, Dan; Goodstein, David; Hayes, David; Mitros, Therese

    2014-07-14

    Plant genomes vary in size and are highly complex with a high amount of repeats, genome duplication and tandem duplication. Gene encodes a wealth of information useful in studying organism and it is critical to have high quality and stable gene annotation. Thanks to advancement of sequencing technology, many plant species genomes have been sequenced and transcriptomes are also sequenced. To use these vastly large amounts of sequence data to make gene annotation or re-annotation in a timely fashion, an automatic pipeline is needed. JGI plant genomics gene annotation pipeline, called integrated gene call (IGC), is our effort toward this aim with aid of a RNA-seq transcriptome assembly pipeline. It utilizes several gene predictors based on homolog peptides and transcript ORFs. See Methods for detail. Here we present genome annotation of JGI flagship green plants produced by this pipeline plus Arabidopsis and rice except for chlamy which is done by a third party. The genome annotations of these species and others are used in our gene family build pipeline and accessible via JGI Phytozome portal whose URL and front page snapshot are shown below.

  5. Simultaneous gene finding in multiple genomes.

    Science.gov (United States)

    König, Stefanie; Romoth, Lars W; Gerischer, Lizzy; Stanke, Mario

    2016-11-15

    As the tree of life is populated with sequenced genomes ever more densely, the new challenge is the accurate and consistent annotation of entire clades of genomes. We address this problem with a new approach to comparative gene finding that takes a multiple genome alignment of closely related species and simultaneously predicts the location and structure of protein-coding genes in all input genomes, thereby exploiting negative selection and sequence conservation. The model prefers potential gene structures in the different genomes that are in agreement with each other, or-if not-where the exon gains and losses are plausible given the species tree. We formulate the multi-species gene finding problem as a binary labeling problem on a graph. The resulting optimization problem is NP hard, but can be efficiently approximated using a subgradient-based dual decomposition approach. The proposed method was tested on whole-genome alignments of 12 vertebrate and 12 Drosophila species. The accuracy was evaluated for human, mouse and Drosophila melanogaster and compared to competing methods. Results suggest that our method is well-suited for annotation of (a large number of) genomes of closely related species within a clade, in particular, when RNA-Seq data are available for many of the genomes. The transfer of existing annotations from one genome to another via the genome alignment is more accurate than previous approaches that are based on protein-spliced alignments, when the genomes are at close to medium distances. The method is implemented in C ++ as part of Augustus and available open source at http://bioinf.uni-greifswald.de/augustus/ CONTACT: stefaniekoenig@ymail.com or mario.stanke@uni-greifswald.deSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  6. Comparative Genomic Analysis of Soybean Flowering Genes

    Science.gov (United States)

    Jung, Chol-Hee; Wong, Chui E.; Singh, Mohan B.; Bhalla, Prem L.

    2012-01-01

    Flowering is an important agronomic trait that determines crop yield. Soybean is a major oilseed legume crop used for human and animal feed. Legumes have unique vegetative and floral complexities. Our understanding of the molecular basis of flower initiation and development in legumes is limited. Here, we address this by using a computational approach to examine flowering regulatory genes in the soybean genome in comparison to the most studied model plant, Arabidopsis. For this comparison, a genome-wide analysis of orthologue groups was performed, followed by an in silico gene expression analysis of the identified soybean flowering genes. Phylogenetic analyses of the gene families highlighted the evolutionary relationships among these candidates. Our study identified key flowering genes in soybean and indicates that the vernalisation and the ambient-temperature pathways seem to be the most variant in soybean. A comparison of the orthologue groups containing flowering genes indicated that, on average, each Arabidopsis flowering gene has 2-3 orthologous copies in soybean. Our analysis highlighted that the CDF3, VRN1, SVP, AP3 and PIF3 genes are paralogue-rich genes in soybean. Furthermore, the genome mapping of the soybean flowering genes showed that these genes are scattered randomly across the genome. A paralogue comparison indicated that the soybean genes comprising the largest orthologue group are clustered in a 1.4 Mb region on chromosome 16 of soybean. Furthermore, a comparison with the undomesticated soybean (Glycine soja) revealed that there are hundreds of SNPs that are associated with putative soybean flowering genes and that there are structural variants that may affect the genes of the light-signalling and ambient-temperature pathways in soybean. Our study provides a framework for the soybean flowering pathway and insights into the relationship and evolution of flowering genes between a short-day soybean and the long-day plant, Arabidopsis. PMID:22679494

  7. Cloning, expression and characterization of a gene from earthworm Eisenia fetida encoding a blood-clot dissolving protein.

    Directory of Open Access Journals (Sweden)

    GangQiang Li

    Full Text Available A lumbrokinase gene encoding a blood-clot dissolving protein was cloned from earthworm (Eisenia fetida by RT-PCR amplification. The gene designated as CST1 (GenBank No. AY840996 was sequence analyzed. The cDNA consists of 888 bp with an open reading frame of 729 bp, which encodes 242 amino acid residues. Multiple sequence alignments revealed that CST1 shares similarities and conserved amino acids with other reported lumbrokinases. The amino acid sequence of CST1 exhibits structural features similar to those found in other serine proteases, including human tissue-type (tPA, urokinase (uPA, and vampire bat (DSPAα1 plasminogen activators. CST1 has a conserved catalytic triad, found in the active sites of protease enzymes, which are important residues involved in polypeptide catalysis. CST1 was expressed as inclusion bodies in Escherichia coli BL21(DE3. The molecular mass of recombinant CST1 (rCST was 25 kDa as estimated by SDS-PAGE, and further confirmed by Western Blot analysis. His-tagged rCST1 was purified and renatured using nickel-chelating resin with a recovery rate of 50% and a purity of 95%. The purified, renatured rCST1 showed fibrinolytic activity evaluated by both a fibrin plate and a blood clot lysis assay. rCST1 degraded fibrin on the fibrin plate. A significant percentage (65.7% of blood clot lysis was observed when blood clot was treated with 80 mg/mL of rCST1 in vitro. The antithrombotic activity of rCST1 was 912 units/mg calculated by comparison with the activity of a lumbrokinase standard. These findings indicate that rCST1 has potential as a potent blood-clot treatment. Therefore, the expression and purification of a single lumbrokinase represents an important improvement in the use of lumbrokinases.

  8. Novel [NiFe]- and [FeFe]-hydrogenase gene transcripts indicative of active facultative aerobes and obligate anaerobes in earthworm gut contents.

    Science.gov (United States)

    Schmidt, Oliver; Wüst, Pia K; Hellmuth, Susanne; Borst, Katharina; Horn, Marcus A; Drake, Harold L

    2011-09-01

    The concomitant occurrence of molecular hydrogen (H(2)) and organic acids along the alimentary canal of the earthworm is indicative of ongoing fermentation during gut passage. Fermentative H(2) production is catalyzed by [FeFe]-hydrogenases and group 4 [NiFe]-hydrogenases in obligate anaerobes (e.g., Clostridiales) and facultative aerobes (e.g., Enterobacteriaceae), respectively, functional groups that might respond differently to contrasting redox conditions. Thus, the objectives of this study were to assess the redox potentials of the alimentary canal of Lumbricus terrestris and analyze the hydrogenase transcript diversities of H(2) producers in glucose-supplemented gut content microcosms. Although redox potentials in the core of the alimentary canal were variable on an individual worm basis, average redox potentials were similar. The lowest redox potentials occurred in the foregut and midgut regions, averaging 40 and 110 mV, respectively. Correlation plots between hydrogenase amino acid sequences and 16S rRNA gene sequences indicated that closely related hydrogenases belonged to closely related taxa, whereas distantly related hydrogenases did not necessarily belong to distantly related taxa. Of 178 [FeFe]-hydrogenase gene transcripts, 177 clustered in 12 Clostridiales-affiliated operational taxonomic units, the majority of which were indicative of heretofore unknown hydrogenases. Of 86 group 4 [NiFe]-hydrogenase gene transcripts, 79% and 21% were affiliated with organisms in the Enterobacteriaceae and Aeromonadaceae, respectively. The collective results (i) suggest that fermenters must cope with variable and moderately oxidative redox conditions along the alimentary canal, (ii) demonstrate that heretofore undetected hydrogenases are present in the earthworm gut, and (iii) corroborate previous findings implicating Clostridiaceae and Enterobacteriaceae as active fermentative taxa in earthworm gut content.

  9. Tandemly Arrayed Genes in Vertebrate Genomes

    Directory of Open Access Journals (Sweden)

    Deng Pan

    2008-01-01

    Full Text Available Tandemly arrayed genes (TAGs are duplicated genes that are linked as neighbors on a chromosome, many of which have important physiological and biochemical functions. Here we performed a survey of these genes in 11 available vertebrate genomes. TAGs account for an average of about 14% of all genes in these vertebrate genomes, and about 25% of all duplications. The majority of TAGs (72–94% have parallel transcription orientation (i.e., they are encoded on the same strand in contrast to the genome, which has about 50% of its genes in parallel transcription orientation. The majority of tandem arrays have only two members. In all species, the proportion of genes that belong to TAGs tends to be higher in large gene families than in small ones; together with our recent finding that tandem duplication played a more important role than retroposition in large families, this fact suggests that among all types of duplication mechanisms, tandem duplication is the predominant mechanism of duplication, especially in large families. Finally, several species have a higher proportion of large tandem arrays that are species-specific than random expectation.

  10. Synaptotagmin gene content of the sequenced genomes

    Directory of Open Access Journals (Sweden)

    Craxton Molly

    2004-07-01

    Full Text Available Abstract Background Synaptotagmins exist as a large gene family in mammals. There is much interest in the function of certain family members which act crucially in the regulated synaptic vesicle exocytosis required for efficient neurotransmission. Knowledge of the functions of other family members is relatively poor and the presence of Synaptotagmin genes in plants indicates a role for the family as a whole which is wider than neurotransmission. Identification of the Synaptotagmin genes within completely sequenced genomes can provide the entire Synaptotagmin gene complement of each sequenced organism. Defining the detailed structures of all the Synaptotagmin genes and their encoded products can provide a useful resource for functional studies and a deeper understanding of the evolution of the gene family. The current rapid increase in the number of sequenced genomes from different branches of the tree of life, together with the public deposition of evolutionarily diverse transcript sequences make such studies worthwhile. Results I have compiled a detailed list of the Synaptotagmin genes of Caenorhabditis, Anopheles, Drosophila, Ciona, Danio, Fugu, Mus, Homo, Arabidopsis and Oryza by examining genomic and transcript sequences from public sequence databases together with some transcript sequences obtained by cDNA library screening and RT-PCR. I have compared all of the genes and investigated the relationship between plant Synaptotagmins and their non-Synaptotagmin counterparts. Conclusions I have identified and compared 98 Synaptotagmin genes from 10 sequenced genomes. Detailed comparison of transcript sequences reveals abundant and complex variation in Synaptotagmin gene expression and indicates the presence of Synaptotagmin genes in all animals and land plants. Amino acid sequence comparisons indicate patterns of conservation and diversity in function. Phylogenetic analysis shows the origin of Synaptotagmins in multicellular eukaryotes and their

  11. earthworm Eisenia foetida (Oligochaeta)

    African Journals Online (AJOL)

    Influence of temperature on the reproduction of the earthworm Eisenia foetida ... feeding supplements for poultry, fish and other livestock ... of earthworm reproduction. ..... invertebrate populations in artificial soil made of sewage sludge and.

  12. Earthworm symbiont Verminephrobacter eiseniae mediates natural transformation within the host egg capsules using type IV pili

    Directory of Open Access Journals (Sweden)

    SEANA Kelyn DAVIDSON

    2014-10-01

    Full Text Available The dense microbial communities commonly associated with plants and animals should offer many opportunities for horizontal gene transfer (HGT through described mechanisms of DNA exchange including natural transformation. However, studies of the significance of natural transformation have focused primarily on pathogens. The study presented here demonstrates highly efficient DNA exchange by natural transformation in a common symbiont of earthworms. The obligate bacterial symbiont Verminephrobacter eiseniae is a member of a microbial consortium of the earthworm Eisenia fetida that is transmitted into the egg capsules to colonize the embryonic worms. In the study presented here, by testing for transformants under different conditions in culture, we demonstrate that V. eiseniae can incorporate free DNA from the environment, that competency is regulated by environmental factors, and that it is sequence specific. Mutations in the type IV pili of V. eiseniae resulted in loss of DNA uptake, implicating the type IV pilus (TFP apparatus in DNA uptake. Furthermore, injection of DNA carrying antibiotic-resistance genes into egg capsules resulted in transformants within the capsule, demonstrating the relevance of DNA uptake within the earthworm system. The ability to take up species-specific DNA from the environment may explain the maintenance of the relatively large, intact genome of this long-associated obligate symbiont, and provides a mechanism for acquisition of foreign genes within the earthworm system.

  13. Teacher's Guide for Earthworms.

    Science.gov (United States)

    Bruno, Merle S.; And Others

    This teacher's guide on earthworms includes four major sections: (1) introduction, (2) caring for earthworms in the classroom, (3) classroom activities, and (4) the appendix. The introduction includes information concerning grade level, scheduling, materials, obtaining earthworms, field study, classroom clean-up, and records. Caring for earthworms…

  14. Earthworms and Soil Pollutants

    Directory of Open Access Journals (Sweden)

    Kazuyoshi Tamae

    2011-11-01

    Full Text Available Although the toxicity of metal contaminated soils has been assessed with various bioassays, more information is needed about the biochemical responses, which may help to elucidate the mechanisms involved in metal toxicity. We previously reported that the earthworm, Eisenia fetida, accumulates cadmium in its seminal vesicles. The bio-accumulative ability of earthworms is well known, and thus the earthworm could be a useful living organism for the bio-monitoring of soil pollution. In this short review, we describe recent studies concerning the relationship between earthworms and soil pollutants, and discuss the possibility of using the earthworm as a bio-monitoring organism for soil pollution.

  15. Whole genome DNA methylation: beyond genes silencing

    OpenAIRE

    Tirado-Magallanes, Roberto; Rebbani, Khadija; Lim, Ricky; Pradhan, Sriharsa; Benoukraf, Touati

    2016-01-01

    The combination of DNA bisulfite treatment with high-throughput sequencing technologies has enabled investigation of genome-wide DNA methylation at near base pair level resolution, far beyond that of the kilobase-long canonical CpG islands that initially revealed the biological relevance of this covalent DNA modification. The latest high-resolution studies have revealed a role for very punctual DNA methylation in chromatin plasticity, gene regulation and splicing. Here, we aim to outline the ...

  16. Gene Composer in a structural genomics environment

    International Nuclear Information System (INIS)

    Lorimer, Don; Raymond, Amy; Mixon, Mark; Burgin, Alex; Staker, Bart; Stewart, Lance

    2011-01-01

    For structural biology applications, protein-construct engineering is guided by comparative sequence analysis and structural information, which allow the researcher to better define domain boundaries for terminal deletions and nonconserved regions for surface mutants. A database software application called Gene Composer has been developed to facilitate construct design. The structural genomics effort at the Seattle Structural Genomics Center for Infectious Disease (SSGCID) requires the manipulation of large numbers of amino-acid sequences and the underlying DNA sequences which are to be cloned into expression vectors. To improve efficiency in high-throughput protein structure determination, a database software package, Gene Composer, has been developed which facilitates the information-rich design of protein constructs and their underlying gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bioinformatics steps used in modern structure-guided protein engineering and synthetic gene engineering. An example of the structure determination of H1N1 RNA-dependent RNA polymerase PB2 subunit is given

  17. A Probabilistic Genome-Wide Gene Reading Frame Sequence Model

    DEFF Research Database (Denmark)

    Have, Christian Theil; Mørk, Søren

    We introduce a new type of probabilistic sequence model, that model the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome-level -- effectively producing a sequential genome annotation...... as output. The model can be used to obtain the most probable genome annotation based on a combination of i: a gene finder score of each gene candidate and ii: the sequence of the reading frames of gene candidates through a genome. The model --- as well as a higher order variant --- is developed and tested...... and are evaluated by the effect on prediction performance. Since bacterial gene finding to a large extent is a solved problem it forms an ideal proving ground for evaluating the explicit modeling of larger scale gene sequence composition of genomes. We conclude that the sequential composition of gene reading frames...

  18. Earthworm Collections of the World

    OpenAIRE

    Sherlock, Emma; Livermore, Laurence; Scott, Ben

    2013-01-01

    A poster presenting "Earthworm Collections of the World" This site provides a central hub for researchers and students to locate earthworm collections and specimens, along with useful information on the various earthworm families and species.

  19. Genomic variation in Salmonella enterica core genes for epidemiological typing

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Lukjancenko, Oksana; Rundsten, Carsten Friis

    2012-01-01

    Background: Technological advances in high throughput genome sequencing are making whole genome sequencing (WGS) available as a routine tool for bacterial typing. Standardized procedures for identification of relevant genes and of variation are needed to enable comparison between studies and over...... genomes and evaluate their value as typing targets, comparing whole genome typing and traditional methods such as 16S and MLST. A consensus tree based on variation of core genes gives much better resolution than 16S and MLST; the pan-genome family tree is similar to the consensus tree, but with higher...... that there is a positive selection towards mutations leading to amino acid changes. Conclusions: Genomic variation within the core genome is useful for investigating molecular evolution and providing candidate genes for bacterial genome typing. Identification of genes with different degrees of variation is important...

  20. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea

    OpenAIRE

    Wolf Yuri I; Novichkov Pavel S; Sorokin Alexander V; Makarova Kira S; Koonin Eugene V

    2007-01-01

    Abstract Background An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt on such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs ...

  1. The zebrafish genome: a review and msx gene case study.

    Science.gov (United States)

    Postlethwait, J H

    2006-01-01

    Zebrafish is one of several important teleost models for understanding principles of vertebrate developmental, molecular, organismal, genetic, evolutionary, and genomic biology. Efficient investigation of the molecular genetic basis of induced mutations depends on knowledge of the zebrafish genome. Principles of zebrafish genomic analysis, including gene mapping, ortholog identification, conservation of syntenies, genome duplication, and evolution of duplicate gene function are discussed here using as a case study the zebrafish msxa, msxb, msxc, msxd, and msxe genes, which together constitute zebrafish orthologs of tetrapod Msx1, Msx2, and Msx3. Genomic analysis suggests orthologs for this difficult to understand group of paralogs.

  2. Comparative genomics on Norrie disease gene.

    Science.gov (United States)

    Katoh, Masuko; Katoh, Masaru

    2005-05-01

    DAND1 (NBL1), DAND2 (CKTSF1B1 or GREM1 or GREMLIN), DAND3 (CKTSF1B2 or GREM2 or PRDC), DAND4 (CER1), DAND5 (CKTSF1B3 or GREM3 or DANTE), MUC2, MUC5AC, MUC5B, MUC6, MUC19, WISP1, WISP2, WISP3, VWF, NOV and Norrie disease (NDP or NORRIN) genes encode proteins with cysteine knot domain. Cysteine-knot superfamily proteins regulate ligand-receptor interactions for a variety of signaling pathways implicated in embryogenesis, homeostasis, and carcinogenesis. Although Ndp is unrelated to Wnt family members, Ndp is claimed to function as a ligand for Fzd4. Here, we identified and characterized rat Ndp, cow Ndp, chicken ndp and zebrafish ndp genes by using bioinformatics. Rat Ndp gene, consisting of three exons, was located within AC105563.4 genome sequence. Cow Ndp and chicken ndp complete CDS were derived from CB467544.1 EST and BX932859.2 cDNA, respectively. Zebrafish ndp gene was located within BX572627.5 genome sequence. Rat Ndp (131 aa) was a secreted protein with C-terminal cysteine knot-like (CTCK) domain. Rat Ndp showed 100, 96.9, 95.4, 87.8 and 66.4 total-amino-acid identity with mouse Ndp, cow Ndp, human NDP, chicken ndp and zebrafish ndp, respectively. Exon-intron structure of mammalian Ndp orthologs was well conserved. FOXA2, CUTL1 (CCAAT displacement protein), LMO2, CEBPA (C/EBPalpha)-binding sites and triple POU2F1 (OCT1)-binding sites were conserved among promoters of mammalian Ndp orthologs.

  3. Widespread of horizontal gene transfer in the human genome.

    Science.gov (United States)

    Huang, Wenze; Tsai, Lillian; Li, Yulong; Hua, Nan; Sun, Chen; Wei, Chaochun

    2017-04-04

    A fundamental concept in biology is that heritable material is passed from parents to offspring, a process called vertical gene transfer. An alternative mechanism of gene acquisition is through horizontal gene transfer (HGT), which involves movement of genetic materials between different species. Horizontal gene transfer has been found prevalent in prokaryotes but very rare in eukaryote. In this paper, we investigate horizontal gene transfer in the human genome. From the pair-wise alignments between human genome and 53 vertebrate genomes, 1,467 human genome regions (2.6 M bases) from all chromosomes were found to be more conserved with non-mammals than with most mammals. These human genome regions involve 642 known genes, which are enriched with ion binding. Compared to known horizontal gene transfer regions in the human genome, there were few overlapping regions, which indicated horizontal gene transfer is more common than we expected in the human genome. Horizontal gene transfer impacts hundreds of human genes and this study provided insight into potential mechanisms of HGT in the human genome.

  4. Brief Guide to Genomics: DNA, Genes and Genomes

    Science.gov (United States)

    ... clinic. Most new drugs based on genome-based research are estimated to be at least 10 to 15 years away, though recent genome-driven efforts in lipid-lowering therapy have considerably shortened that interval. According ...

  5. Genome-wide Analysis of Gene Regulation

    DEFF Research Database (Denmark)

    Chen, Yun

    to protein: through epigenetic modifications, transcription regulators or post-transcriptional controls. The following papers concern several layers of gene regulation with questions answered by different HTS approaches. Genome-wide screening of epigenetic changes by ChIP-seq allowed us to study both spatial...... and temporal alterations of histone modifications (Papers I and II). Coupling the data with machine learning approaches, we established a prediction framework to assess the most informative histone marks as well as their most influential nucleosome positions in predicting the promoter usages. (Papers I...... they regulated or if the sites had global elevated usage rates by multiple TFs. Using RNA-seq, 5’end-seq in combination with depletion of 5’exonuclease as well as nonsensemediated decay (NMD) factors, we systematically analyzed NMD substrates as well as their degradation intermediates in human cells (Paper V...

  6. New Genome Similarity Measures based on Conserved Gene Adjacencies.

    Science.gov (United States)

    Doerr, Daniel; Kowada, Luis Antonio B; Araujo, Eloi; Deshpande, Shachi; Dantas, Simone; Moret, Bernard M E; Stoye, Jens

    2017-06-01

    Many important questions in molecular biology, evolution, and biomedicine can be addressed by comparative genomic approaches. One of the basic tasks when comparing genomes is the definition of measures of similarity (or dissimilarity) between two genomes, for example, to elucidate the phylogenetic relationships between species. The power of different genome comparison methods varies with the underlying formal model of a genome. The simplest models impose the strong restriction that each genome under study must contain the same genes, each in exactly one copy. More realistic models allow several copies of a gene in a genome. One speaks of gene families, and comparative genomic methods that allow this kind of input are called gene family-based. The most powerful-but also most complex-models avoid this preprocessing of the input data and instead integrate the family assignment within the comparative analysis. Such methods are called gene family-free. In this article, we study an intermediate approach between family-based and family-free genomic similarity measures. Introducing this simpler model, called gene connections, we focus on the combinatorial aspects of gene family-free genome comparison. While in most cases, the computational costs to the general family-free case are the same, we also find an instance where the gene connections model has lower complexity. Within the gene connections model, we define three variants of genomic similarity measures that have different expression powers. We give polynomial-time algorithms for two of them, while we show NP-hardness for the third, most powerful one. We also generalize the measures and algorithms to make them more robust against recent local disruptions in gene order. Our theoretical findings are supported by experimental results, proving the applicability and performance of our newly defined similarity measures.

  7. Comparative genomics of the relationship between gene structure and expression

    NARCIS (Netherlands)

    Ren, X.

    2006-01-01

    The relationship between the structure of genes and their expression is a relatively new aspect of genome organization and regulation. With more genome sequences and expression data becoming available, bioinformatics approaches can help the further elucidation of the relationships between gene

  8. Regulation of methane genes and genome expression

    Energy Technology Data Exchange (ETDEWEB)

    John N. Reeve

    2009-09-09

    At the start of this project, it was known that methanogens were Archaeabacteria (now Archaea) and were therefore predicted to have gene expression and regulatory systems different from Bacteria, but few of the molecular biology details were established. The goals were then to establish the structures and organizations of genes in methanogens, and to develop the genetic technologies needed to investigate and dissect methanogen gene expression and regulation in vivo. By cloning and sequencing, we established the gene and operon structures of all of the “methane” genes that encode the enzymes that catalyze methane biosynthesis from carbon dioxide and hydrogen. This work identified unique sequences in the methane gene that we designated mcrA, that encodes the largest subunit of methyl-coenzyme M reductase, that could be used to identify methanogen DNA and establish methanogen phylogenetic relationships. McrA sequences are now the accepted standard and used extensively as hybridization probes to identify and quantify methanogens in environmental research. With the methane genes in hand, we used northern blot and then later whole-genome microarray hybridization analyses to establish how growth phase and substrate availability regulated methane gene expression in Methanobacterium thermautotrophicus ΔH (now Methanothermobacter thermautotrophicus). Isoenzymes or pairs of functionally equivalent enzymes catalyze several steps in the hydrogen-dependent reduction of carbon dioxide to methane. We established that hydrogen availability determine which of these pairs of methane genes is expressed and therefore which of the alternative enzymes is employed to catalyze methane biosynthesis under different environmental conditions. As were unable to establish a reliable genetic system for M. thermautotrophicus, we developed in vitro transcription as an alternative system to investigate methanogen gene expression and regulation. This led to the discovery that an archaeal protein

  9. Evolution of closely linked gene pairs in vertebrate genomes

    NARCIS (Netherlands)

    Franck, E.; Hulsen, T.; Huynen, M.A.; Jong, de W.W.; Lunsen, N.H.; Madsen, O.

    2008-01-01

    The orientation of closely linked genes in mammalian genomes is not random: there are more head-to-head (h2h) gene pairs than expected. To understand the origin of this enrichment in h2h gene pairs, we have analyzed the phylogenetic distribution of gene pairs separated by less than 600 bp of

  10. Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing.

    Science.gov (United States)

    Seoane-Zonjic, Pedro; Cañas, Rafael A; Bautista, Rocío; Gómez-Maldonado, Josefa; Arrillaga, Isabel; Fernández-Pozo, Noé; Claros, M Gonzalo; Cánovas, Francisco M; Ávila, Concepción

    2016-02-27

    In the era of DNA throughput sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequence were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members. The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies. A bioinformatics algorithm (GeneAssembler) is also provided as a

  11. Descriptions of Earthworms

    NARCIS (Netherlands)

    Horst, R.

    1890-01-01

    Among the animals, collected by Dr. A. Vorderman in the island of Billiton 1) and presented to our Museum, I met with some earthworms belonging to the genus Perichaeta, which appear to be hitherto undescribed.

  12. Nutrition Studies with Earthworms.

    Science.gov (United States)

    Tobaga, Leandro

    1980-01-01

    Describes experiments which demonstrate how different diets affect the growth rate of earthworms. Procedures for feeding baby worms are outlined, the analysis of results are discussed, and various modifications of the exercise are provided. (CS)

  13. Functional validation of candidate genes detected by genomic feature models

    DEFF Research Database (Denmark)

    Rohde, Palle Duun; Østergaard, Solveig; Kristensen, Torsten Nygaard

    2018-01-01

    to investigate locomotor activity, and applied genomic feature prediction models to identify gene ontology (GO) cate- gories predictive of this phenotype. Next, we applied the covariance association test to partition the genomic variance of the predictive GO terms to the genes within these terms. We...... then functionally assessed whether the identified candidate genes affected locomotor activity by reducing gene expression using RNA interference. In five of the seven candidate genes tested, reduced gene expression altered the phenotype. The ranking of genes within the predictive GO term was highly correlated......Understanding the genetic underpinnings of complex traits requires knowledge of the genetic variants that contribute to phenotypic variability. Reliable statistical approaches are needed to obtain such knowledge. In genome-wide association studies, variants are tested for association with trait...

  14. Widespread of horizontal gene transfer in the human genome

    OpenAIRE

    Huang, Wenze; Tsai, Lillian; Li, Yulong; Hua, Nan; Sun, Chen; Wei, Chaochun

    2017-01-01

    Background A fundamental concept in biology is that heritable material is passed from parents to offspring, a process called vertical gene transfer. An alternative mechanism of gene acquisition is through horizontal gene transfer (HGT), which involves movement of genetic materials between different species. Horizontal gene transfer has been found prevalent in prokaryotes but very rare in eukaryote. In this paper, we investigate horizontal gene transfer in the human genome. Results From the pa...

  15. Gene copy number variation throughout the Plasmodium falciparum genome

    Directory of Open Access Journals (Sweden)

    Stewart Lindsay B

    2009-08-01

    Full Text Available Abstract Background Gene copy number variation (CNV is responsible for several important phenotypes of the malaria parasite Plasmodium falciparum, including drug resistance, loss of infected erythrocyte cytoadherence and alteration of receptor usage for erythrocyte invasion. Despite the known effects of CNV, little is known about its extent throughout the genome. Results We performed a whole-genome survey of CNV genes in P. falciparum using comparative genome hybridisation of a diverse set of 16 laboratory culture-adapted isolates to a custom designed high density Affymetrix GeneChip array. Overall, 186 genes showed hybridisation signals consistent with deletion or amplification in one or more isolate. There is a strong association of CNV with gene length, genomic location, and low orthology to genes in other Plasmodium species. Sub-telomeric regions of all chromosomes are strongly associated with CNV genes independent from members of previously described multigene families. However, ~40% of CNV genes were located in more central regions of the chromosomes. Among the previously undescribed CNV genes, several that are of potential phenotypic relevance are identified. Conclusion CNV represents a major form of genetic variation within the P. falciparum genome; the distribution of gene features indicates the involvement of highly non-random mutational and selective processes. Additional studies should be directed at examining CNV in natural parasite populations to extend conclusions to clinical settings.

  16. Annotation of nerve cord transcriptome in earthworm Eisenia fetida

    Directory of Open Access Journals (Sweden)

    Vasanthakumar Ponesakki

    2017-12-01

    Full Text Available In annelid worms, the nerve cord serves as a crucial organ to control the sensory and behavioral physiology. The inadequate genome resource of earthworms has prioritized the comprehensive analysis of their transcriptome dataset to monitor the genes express in the nerve cord and predict their role in the neurotransmission and sensory perception of the species. The present study focuses on identifying the potential transcripts and predicting their functional features by annotating the transcriptome dataset of nerve cord tissues prepared by Gong et al., 2010 from the earthworm Eisenia fetida. Totally 9762 transcripts were successfully annotated against the NCBI nr database using the BLASTX algorithm and among them 7680 transcripts were assigned to a total of 44,354 GO terms. The conserve domain analysis indicated the over representation of P-loop NTPase domain and calcium binding EF-hand domain. The COG functional annotation classified 5860 transcript sequences into 25 functional categories. Further, 4502 contig sequences were found to map with 124 KEGG pathways. The annotated contig dataset exhibited 22 crucial neuropeptides having considerable matches to the marine annelid Platynereis dumerilii, suggesting their possible role in neurotransmission and neuromodulation. In addition, 108 human stem cell marker homologs were identified including the crucial epigenetic regulators, transcriptional repressors and cell cycle regulators, which may contribute to the neuronal and segmental regeneration. The complete functional annotation of this nerve cord transcriptome can be further utilized to interpret genetic and molecular mechanisms associated with neuronal development, nervous system regeneration and nerve cord function.

  17. Mapping earthworm communities in Europe

    NARCIS (Netherlands)

    Rutgers, M.; Orgiazzi, A.; Gardi, C.; Römbke, J.; Jansch, S.; Keith, A.; Neilson, R.; Boag, B.; Schmidt, O.; Murchie, A.K.; Blackshaw, R.P.; Pérès, G.; Cluzeau, D.; Guernion, M.; Briones, M.J.I.; Rodeiro, J.; Pineiro, R.; Diaz Cosin, D.J.; Sousa, J.P.; Suhadolc, M.; Kos, I.; Krogh, P.H.; Faber, J.H.; Mulder, C.; Bogte, J.J.; Wijnen, van H.J.; Schouten, A.J.; Zwart, de D.

    2016-01-01

    Existing data sets on earthworm communities in Europe were collected, harmonized, collated, modelled and depicted on a soil biodiversity map. Digital Soil Mapping was applied using multiple regressions relating relatively low density earthworm community data to soil characteristics, land use,

  18. Predictions of Gene Family Distributions in Microbial Genomes: Evolution by Gene Duplication and Modification

    International Nuclear Information System (INIS)

    Yanai, Itai; Camacho, Carlos J.; DeLisi, Charles

    2000-01-01

    A universal property of microbial genomes is the considerable fraction of genes that are homologous to other genes within the same genome. The process by which these homologues are generated is not well understood, but sequence analysis of 20 microbial genomes unveils a recurrent distribution of gene family sizes. We show that a simple evolutionary model based on random gene duplication and point mutations fully accounts for these distributions and permits predictions for the number of gene families in genomes not yet complete. Our findings are consistent with the notion that a genome evolves from a set of precursor genes to a mature size by gene duplications and increasing modifications. (c) 2000 The American Physical Society

  19. Predictions of Gene Family Distributions in Microbial Genomes: Evolution by Gene Duplication and Modification

    Energy Technology Data Exchange (ETDEWEB)

    Yanai, Itai; Camacho, Carlos J.; DeLisi, Charles

    2000-09-18

    A universal property of microbial genomes is the considerable fraction of genes that are homologous to other genes within the same genome. The process by which these homologues are generated is not well understood, but sequence analysis of 20 microbial genomes unveils a recurrent distribution of gene family sizes. We show that a simple evolutionary model based on random gene duplication and point mutations fully accounts for these distributions and permits predictions for the number of gene families in genomes not yet complete. Our findings are consistent with the notion that a genome evolves from a set of precursor genes to a mature size by gene duplications and increasing modifications. (c) 2000 The American Physical Society.

  20. The earthworm Aporrectodea caliginosa stimulates abundance and activity of phenoxyalkanoic acid herbicide degraders

    Science.gov (United States)

    Liu, Ya-Jun; Zaprasis, Adrienne; Liu, Shuang-Jiang; Drake, Harold L; Horn, Marcus A

    2011-01-01

    2-Methyl-4-chlorophenoxyacetic acid (MCPA) is a widely used phenoxyalkanoic acid (PAA) herbicide. Earthworms represent the dominant macrofauna and enhance microbial activities in many soils. Thus, the effect of the model earthworm Aporrectodea caliginosa (Oligochaeta, Lumbricidae) on microbial MCPA degradation was assessed in soil columns with agricultural soil. MCPA degradation was quicker in soil with earthworms than without earthworms. Quantitative PCR was inhibition-corrected per nucleic acid extract and indicated that copy numbers of tfdA-like and cadA genes (both encoding oxygenases initiating aerobic PAA degradation) in soil with earthworms were up to three and four times higher than without earthworms, respectively. tfdA-like and 16S rRNA gene transcript copy numbers in soil with earthworms were two and six times higher than without earthworms, respectively. Most probable numbers (MPNs) of MCPA degraders approximated 4 × 105 gdw−1 in soil before incubation and in soil treated without earthworms, whereas MPNs of earthworm-treated soils were approximately 150 × higher. The aerobic capacity of soil to degrade MCPA was higher in earthworm-treated soils than in earthworm-untreated soils. Burrow walls and 0–5 cm depth bulk soil displayed higher capacities to degrade MCPA than did soil from 5–10 cm depth bulk soil, expression of tfdA-like genes in burrow walls was five times higher than in bulk soil and MCPA degraders were abundant in burrow walls (MPNs of 5 × 107 gdw−1). The collective data indicate that earthworms stimulate abundance and activity of MCPA degraders endogenous to soil by their burrowing activities and might thus be advantageous for enhancing PAA degradation in soil. PMID:20740027

  1. Genomic Prediction of Gene Bank Wheat Landraces

    Directory of Open Access Journals (Sweden)

    José Crossa

    2016-07-01

    Full Text Available This study examines genomic prediction within 8416 Mexican landrace accessions and 2403 Iranian landrace accessions stored in gene banks. The Mexican and Iranian collections were evaluated in separate field trials, including an optimum environment for several traits, and in two separate environments (drought, D and heat, H for the highly heritable traits, days to heading (DTH, and days to maturity (DTM. Analyses accounting and not accounting for population structure were performed. Genomic prediction models include genotype × environment interaction (G × E. Two alternative prediction strategies were studied: (1 random cross-validation of the data in 20% training (TRN and 80% testing (TST (TRN20-TST80 sets, and (2 two types of core sets, “diversity” and “prediction”, including 10% and 20%, respectively, of the total collections. Accounting for population structure decreased prediction accuracy by 15–20% as compared to prediction accuracy obtained when not accounting for population structure. Accounting for population structure gave prediction accuracies for traits evaluated in one environment for TRN20-TST80 that ranged from 0.407 to 0.677 for Mexican landraces, and from 0.166 to 0.662 for Iranian landraces. Prediction accuracy of the 20% diversity core set was similar to accuracies obtained for TRN20-TST80, ranging from 0.412 to 0.654 for Mexican landraces, and from 0.182 to 0.647 for Iranian landraces. The predictive core set gave similar prediction accuracy as the diversity core set for Mexican collections, but slightly lower for Iranian collections. Prediction accuracy when incorporating G × E for DTH and DTM for Mexican landraces for TRN20-TST80 was around 0.60, which is greater than without the G × E term. For Iranian landraces, accuracies were 0.55 for the G × E model with TRN20-TST80. Results show promising prediction accuracies for potential use in germplasm enhancement and rapid introgression of exotic germplasm

  2. Genes but not genomes reveal bacterial domestication of Lactococcus lactis.

    Directory of Open Access Journals (Sweden)

    Delphine Passerini

    Full Text Available BACKGROUND: The population structure and diversity of Lactococcus lactis subsp. lactis, a major industrial bacterium involved in milk fermentation, was determined at both gene and genome level. Seventy-six lactococcal isolates of various origins were studied by different genotyping methods and thirty-six strains displaying unique macrorestriction fingerprints were analyzed by a new multilocus sequence typing (MLST scheme. This gene-based analysis was compared to genomic characteristics determined by pulsed-field gel electrophoresis (PFGE. METHODOLOGY/PRINCIPAL FINDINGS: The MLST analysis revealed that L. lactis subsp. lactis is essentially clonal with infrequent intra- and intergenic recombination; also, despite its taxonomical classification as a subspecies, it displays a genetic diversity as substantial as that within several other bacterial species. Genome-based analysis revealed a genome size variability of 20%, a value typical of bacteria inhabiting different ecological niches, and that suggests a large pan-genome for this subspecies. However, the genomic characteristics (macrorestriction pattern, genome or chromosome size, plasmid content did not correlate to the MLST-based phylogeny, with strains from the same sequence type (ST differing by up to 230 kb in genome size. CONCLUSION/SIGNIFICANCE: The gene-based phylogeny was not fully consistent with the traditional classification into dairy and non-dairy strains but supported a new classification based on ecological separation between "environmental" strains, the main contributors to the genetic diversity within the subspecies, and "domesticated" strains, subject to recent genetic bottlenecks. Comparison between gene- and genome-based analyses revealed little relationship between core and dispensable genome phylogenies, indicating that clonal diversification and phenotypic variability of the "domesticated" strains essentially arose through substantial genomic flux within the dispensable

  3. Whole genome homology-based identification of candidate genes ...

    African Journals Online (AJOL)

    Josephine Erhiakporeh

    2016-07-06

    Jul 6, 2016 ... candidate genes for drought tolerance in sesame. (Sesamum ... Our results provided genomic resources for further functional analysis and genetic engineering .... reverse transcribed using the Reverse Transcription System.

  4. Analysis of pan-genome to identify the core genes and essential genes of Brucella spp.

    Science.gov (United States)

    Yang, Xiaowen; Li, Yajie; Zang, Juan; Li, Yexia; Bie, Pengfei; Lu, Yanli; Wu, Qingmin

    2016-04-01

    Brucella spp. are facultative intracellular pathogens, that cause a contagious zoonotic disease, that can result in such outcomes as abortion or sterility in susceptible animal hosts and grave, debilitating illness in humans. For deciphering the survival mechanism of Brucella spp. in vivo, 42 Brucella complete genomes from NCBI were analyzed for the pan-genome and core genome by identification of their composition and function of Brucella genomes. The results showed that the total 132,143 protein-coding genes in these genomes were divided into 5369 clusters. Among these, 1710 clusters were associated with the core genome, 1182 clusters with strain-specific genes and 2477 clusters with dispensable genomes. COG analysis indicated that 44 % of the core genes were devoted to metabolism, which were mainly responsible for energy production and conversion (COG category C), and amino acid transport and metabolism (COG category E). Meanwhile, approximately 35 % of the core genes were in positive selection. In addition, 1252 potential essential genes were predicted in the core genome by comparison with a prokaryote database of essential genes. The results suggested that the core genes in Brucella genomes are relatively conservation, and the energy and amino acid metabolism play a more important role in the process of growth and reproduction in Brucella spp. This study might help us to better understand the mechanisms of Brucella persistent infection and provide some clues for further exploring the gene modules of the intracellular survival in Brucella spp.

  5. Distinct gene number-genome size relationships for eukaryotes and non-eukaryotes: gene content estimation for dinoflagellate genomes.

    Directory of Open Access Journals (Sweden)

    Yubo Hou

    Full Text Available The ability to predict gene content is highly desirable for characterization of not-yet sequenced genomes like those of dinoflagellates. Using data from completely sequenced and annotated genomes from phylogenetically diverse lineages, we investigated the relationship between gene content and genome size using regression analyses. Distinct relationships between log(10-transformed protein-coding gene number (Y' versus log(10-transformed genome size (X', genome size in kbp were found for eukaryotes and non-eukaryotes. Eukaryotes best fit a logarithmic model, Y' = ln(-46.200+22.678X', whereas non-eukaryotes a linear model, Y' = 0.045+0.977X', both with high significance (p0.91. Total gene number shows similar trends in both groups to their respective protein coding regressions. The distinct correlations reflect lower and decreasing gene-coding percentages as genome size increases in eukaryotes (82%-1% compared to higher and relatively stable percentages in prokaryotes and viruses (97%-47%. The eukaryotic regression models project that the smallest dinoflagellate genome (3x10(6 kbp contains 38,188 protein-coding (40,086 total genes and the largest (245x10(6 kbp 87,688 protein-coding (92,013 total genes, corresponding to 1.8% and 0.05% gene-coding percentages. These estimates do not likely represent extraordinarily high functional diversity of the encoded proteome but rather highly redundant genomes as evidenced by high gene copy numbers documented for various dinoflagellate species.

  6. Transmission of Nephridial Bacteria of the Earthworm Eisenia fetida

    OpenAIRE

    Davidson, Seana K.; Stahl, David A.

    2006-01-01

    The lumbricid earthworms (annelid family Lumbricidae) harbor gram-negative bacteria in their excretory organs, the nephridia. Comparative 16S rRNA gene sequencing of bacteria associated with the nephridia of several earthworm species has shown that each species of worm harbors a distinct bacterial species and that the bacteria from different species form a monophyletic cluster within the genus Acidovorax, suggesting that there is a specific association resulting from radiation from a common b...

  7. Genome-Wide Detection and Analysis of Multifunctional Genes

    Science.gov (United States)

    Pritykin, Yuri; Ghersi, Dario; Singh, Mona

    2015-01-01

    Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655

  8. The invasive MED/Q Bemisia tabaci genome: a tale of gene loss and gene gain

    Science.gov (United States)

    Whiteflies are a group of invasive crop pests that impact global agriculture. An analysis was conducted to compare draft genomes of two whitefly strains, which demonstrated the relative conserved gene order, but a number of genes were either novel (added) or omitted (deleted) between genomes. This...

  9. Conserved genomic organisation of Group B Sox genes in insects.

    Directory of Open Access Journals (Sweden)

    Woerfel Gertrud

    2005-05-01

    Full Text Available Abstract Background Sox domain containing genes are important metazoan transcriptional regulators implicated in a wide rage of developmental processes. The vertebrate B subgroup contains the Sox1, Sox2 and Sox3 genes that have early functions in neural development. Previous studies show that Drosophila Group B genes have been functionally conserved since they play essential roles in early neural specification and mutations in the Drosophila Dichaete and SoxN genes can be rescued with mammalian Sox genes. Despite their importance, the extent and organisation of the Group B family in Drosophila has not been fully characterised, an important step in using Drosophila to examine conserved aspects of Group B Sox gene function. Results We have used the directed cDNA sequencing along with the output from the publicly-available genome sequencing projects to examine the structure of Group B Sox domain genes in Drosophila melanogaster, Drosophila pseudoobscura, Anopheles gambiae and Apis mellifora. All of the insect genomes contain four genes encoding Group B proteins, two of which are intronless, as is the case with vertebrate group B genes. As has been previously reported and unusually for Group B genes, two of the insect group B genes, Sox21a and Sox21b, contain introns within their DNA-binding domains. We find that the highly unusual multi-exon structure of the Sox21b gene is common to the insects. In addition, we find that three of the group B Sox genes are organised in a linked cluster in the insect genomes. By in situ hybridisation we show that the pattern of expression of each of the four group B genes during embryogenesis is conserved between D. melanogaster and D. pseudoobscura. Conclusion The DNA-binding domain sequences and genomic organisation of the group B genes have been conserved over 300 My of evolution since the last common ancestor of the Hymenoptera and the Diptera. Our analysis suggests insects have two Group B1 genes, SoxN and

  10. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea

    Directory of Open Access Journals (Sweden)

    Wolf Yuri I

    2007-11-01

    Full Text Available Abstract Background An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt on such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs. Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. Results New Archaeal Clusters of Orthologous Genes (arCOGs were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover ~88% of the genes in a genome compared to a ~76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; ~40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genome consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. It is inferred that LACA was a chemoautotrophic hyperthermophile

  11. LATERAL GENE TRANSFER AND THE HISTORY OF BACTERIAL GENOMES

    Energy Technology Data Exchange (ETDEWEB)

    Howard Ochman

    2006-02-22

    The aims of this research were to elucidate the role and extent of lateral transfer in the differentiation of bacterial strains and species, and to assess the impact of gene transfer on the evolution of bacterial genomes. The ultimate goal of the project is to examine the dynamics of a core set of protein-coding genes (i.e., those that are distributed universally among Bacteria) by developing conserved primers that would allow their amplification and sequencing in any bacterial taxa. In addition, we adopted a bioinformatic approach to elucidate the extent of lateral gene transfer in sequenced genome.

  12. Gene calling and bacterial genome annotation with BG7.

    Science.gov (United States)

    Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo

    2015-01-01

    New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key point to investigate in bacteria. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of nonprotein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and not perfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences that are the elements directly related with biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. This tool is sequence error tolerant maintaining their capabilities for the annotation of highly fragmented genomes or for annotating mixed sequences coming from several genomes (as those obtained through metagenomics samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).

  13. From Genomics to Gene Therapy: Induced Pluripotent Stem Cells Meet Genome Editing.

    Science.gov (United States)

    Hotta, Akitsu; Yamanaka, Shinya

    2015-01-01

    The advent of induced pluripotent stem (iPS) cells has opened up numerous avenues of opportunity for cell therapy, including the initiation in September 2014 of the first human clinical trial to treat dry age-related macular degeneration. In parallel, advances in genome-editing technologies by site-specific nucleases have dramatically improved our ability to edit endogenous genomic sequences at targeted sites of interest. In fact, clinical trials have already begun to implement this technology to control HIV infection. Genome editing in iPS cells is a powerful tool and enables researchers to investigate the intricacies of the human genome in a dish. In the near future, the groundwork laid by such an approach may expand the possibilities of gene therapy for treating congenital disorders. In this review, we summarize the exciting progress being made in the utilization of genomic editing technologies in pluripotent stem cells and discuss remaining challenges toward gene therapy applications.

  14. Expression of a transferred nuclear gene in a mitochondrial genome

    Directory of Open Access Journals (Sweden)

    Yichun Qiu

    2014-08-01

    Full Text Available Transfer of mitochondrial genes to the nucleus, and subsequent gain of regulatory elements for expression, is an ongoing evolutionary process in plants. Many examples have been characterized, which in some cases have revealed sources of mitochondrial targeting sequences and cis-regulatory elements. In contrast, there have been no reports of a nuclear gene that has undergone intracellular transfer to the mitochondrial genome and become expressed. Here we show that the orf164 gene in the mitochondrial genome of several Brassicaceae species, including Arabidopsis, is derived from the nuclear ARF17 gene that codes for an auxin responsive protein and is present across flowering plants. Orf164 corresponds to a portion of ARF17, and the nucleotide and amino acid sequences are 79% and 81% identical, respectively. Orf164 is transcribed in several organ types of Arabidopsis thaliana, as detected by RT-PCR. In addition, orf164 is transcribed in five other Brassicaceae within the tribes Camelineae, Erysimeae and Cardamineae, but the gene is not present in Brassica or Raphanus. This study shows that nuclear genes can be transferred to the mitochondrial genome and become expressed, providing a new perspective on the movement of genes between the genomes of subcellular compartments.

  15. Genome engineering using a synthetic gene circuit in Bacillus subtilis.

    Science.gov (United States)

    Jeong, Da-Eun; Park, Seung-Hwan; Pan, Jae-Gu; Kim, Eui-Joong; Choi, Soo-Keun

    2015-03-31

    Genome engineering without leaving foreign DNA behind requires an efficient counter-selectable marker system. Here, we developed a genome engineering method in Bacillus subtilis using a synthetic gene circuit as a counter-selectable marker system. The system contained two repressible promoters (B. subtilis xylA (Pxyl) and spac (Pspac)) and two repressor genes (lacI and xylR). Pxyl-lacI was integrated into the B. subtilis genome with a target gene containing a desired mutation. The xylR and Pspac-chloramphenicol resistant genes (cat) were located on a helper plasmid. In the presence of xylose, repression of XylR by xylose induced LacI expression, the LacIs repressed the Pspac promoter and the cells become chloramphenicol sensitive. Thus, to survive in the presence of chloramphenicol, the cell must delete Pxyl-lacI by recombination between the wild-type and mutated target genes. The recombination leads to mutation of the target gene. The remaining helper plasmid was removed easily under the chloramphenicol absent condition. In this study, we showed base insertion, deletion and point mutation of the B. subtilis genome without leaving any foreign DNA behind. Additionally, we successfully deleted a 2-kb gene (amyE) and a 38-kb operon (ppsABCDE). This method will be useful to construct designer Bacillus strains for various industrial applications. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. Functional validation of candidate genes detected by genomic feature models

    DEFF Research Database (Denmark)

    Rohde, Palle Duun; Østergaard, Solveig; Kristensen, Torsten Nygaard

    2018-01-01

    Understanding the genetic underpinnings of complex traits requires knowledge of the genetic variants that contribute to phenotypic variability. Reliable statistical approaches are needed to obtain such knowledge. In genome-wide association studies, variants are tested for association with trait...... then functionally assessed whether the identified candidate genes affected locomotor activity by reducing gene expression using RNA interference. In five of the seven candidate genes tested, reduced gene expression altered the phenotype. The ranking of genes within the predictive GO term was highly correlated...

  17. Structure and earthworms

    Science.gov (United States)

    Earthworms are an important part of the soil ecosystem and an indicator of soil quality. Sometimes referred to as ecosystem engineers, they play a pivotal role in maintaining soil productivity. Their burrowing, feeding, and casting activities alter the physical, chemical, and biological properties o...

  18. The duplicated genes database: identification and functional annotation of co-localised duplicated genes across genomes.

    Directory of Open Access Journals (Sweden)

    Marion Ouedraogo

    Full Text Available BACKGROUND: There has been a surge in studies linking genome structure and gene expression, with special focus on duplicated genes. Although initially duplicated from the same sequence, duplicated genes can diverge strongly over evolution and take on different functions or regulated expression. However, information on the function and expression of duplicated genes remains sparse. Identifying groups of duplicated genes in different genomes and characterizing their expression and function would therefore be of great interest to the research community. The 'Duplicated Genes Database' (DGD was developed for this purpose. METHODOLOGY: Nine species were included in the DGD. For each species, BLAST analyses were conducted on peptide sequences corresponding to the genes mapped on a same chromosome. Groups of duplicated genes were defined based on these pairwise BLAST comparisons and the genomic location of the genes. For each group, Pearson correlations between gene expression data and semantic similarities between functional GO annotations were also computed when the relevant information was available. CONCLUSIONS: The Duplicated Gene Database provides a list of co-localised and duplicated genes for several species with the available gene co-expression level and semantic similarity value of functional annotation. Adding these data to the groups of duplicated genes provides biological information that can prove useful to gene expression analyses. The Duplicated Gene Database can be freely accessed through the DGD website at http://dgd.genouest.org.

  19. The genomic structure of the DMBT1 gene

    DEFF Research Database (Denmark)

    Mollenhauer, J; Holmskov, U; Wiemann, S

    1999-01-01

    Increasing evidence has accumulated for an involvement of the inactivation of tumour suppressor genes at chromosome 10q in the carcinogenesis of brain tumours, melanomas, and carcinomas of the lung, the prostate, the pancreas, and the endometrium. The gene DMBT1 (Deleted in Malignant Brain Tumours...... 1) is located at chromosome 10q25.3-q26.1, within one of the putative intervals for tumour suppressor genes. DMBT1 is a member of the scavenger-receptor cysteine-rich (SRCR) superfamily and displays homozygous deletions or lack of expression in glioblastoma multiforme, medulloblastoma......, and in gastrointestinal and lung cancers. Based on these properties, DMBT1 has been proposed to be a candidate tumour suppressor gene. We have determined the genomic sequence of DMBT1 to allow analyses of mutations. The gene has at least 54 exons that span a genomic region of about 80 kb. We have identified a putative...

  20. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...... in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H. influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions....

  1. Identification of neural outgrowth genes using genome-wide RNAi.

    Directory of Open Access Journals (Sweden)

    Katharine J Sepp

    2008-07-01

    Full Text Available While genetic screens have identified many genes essential for neurite outgrowth, they have been limited in their ability to identify neural genes that also have earlier critical roles in the gastrula, or neural genes for which maternally contributed RNA compensates for gene mutations in the zygote. To address this, we developed methods to screen the Drosophila genome using RNA-interference (RNAi on primary neural cells and present the results of the first full-genome RNAi screen in neurons. We used live-cell imaging and quantitative image analysis to characterize the morphological phenotypes of fluorescently labelled primary neurons and glia in response to RNAi-mediated gene knockdown. From the full genome screen, we focused our analysis on 104 evolutionarily conserved genes that when downregulated by RNAi, have morphological defects such as reduced axon extension, excessive branching, loss of fasciculation, and blebbing. To assist in the phenotypic analysis of the large data sets, we generated image analysis algorithms that could assess the statistical significance of the mutant phenotypes. The algorithms were essential for the analysis of the thousands of images generated by the screening process and will become a valuable tool for future genome-wide screens in primary neurons. Our analysis revealed unexpected, essential roles in neurite outgrowth for genes representing a wide range of functional categories including signalling molecules, enzymes, channels, receptors, and cytoskeletal proteins. We also found that genes known to be involved in protein and vesicle trafficking showed similar RNAi phenotypes. We confirmed phenotypes of the protein trafficking genes Sec61alpha and Ran GTPase using Drosophila embryo and mouse embryonic cerebral cortical neurons, respectively. Collectively, our results showed that RNAi phenotypes in primary neural culture can parallel in vivo phenotypes, and the screening technique can be used to identify many new

  2. Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies.

    Science.gov (United States)

    Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M

    2012-01-01

    Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provide guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.

  3. On Computing Breakpoint Distances for Genomes with Duplicate Genes.

    Science.gov (United States)

    Shao, Mingfu; Moret, Bernard M E

    2017-06-01

    A fundamental problem in comparative genomics is to compute the distance between two genomes in terms of its higher level organization (given by genes or syntenic blocks). For two genomes without duplicate genes, we can easily define (and almost always efficiently compute) a variety of distance measures, but the problem is NP-hard under most models when genomes contain duplicate genes. To tackle duplicate genes, three formulations (exemplar, maximum matching, and any matching) have been proposed, all of which aim to build a matching between homologous genes so as to minimize some distance measure. Of the many distance measures, the breakpoint distance (the number of nonconserved adjacencies) was the first one to be studied and remains of significant interest because of its simplicity and model-free property. The three breakpoint distance problems corresponding to the three formulations have been widely studied. Although we provided last year a solution for the exemplar problem that runs very fast on full genomes, computing optimal solutions for the other two problems has remained challenging. In this article, we describe very fast, exact algorithms for these two problems. Our algorithms rely on a compact integer-linear program that we further simplify by developing an algorithm to remove variables, based on new results on the structure of adjacencies and matchings. Through extensive experiments using both simulations and biological data sets, we show that our algorithms run very fast (in seconds) on mammalian genomes and scale well beyond. We also apply these algorithms (as well as the classic orthology tool MSOAR) to create orthology assignment, then compare their quality in terms of both accuracy and coverage. We find that our algorithm for the "any matching" formulation significantly outperforms other methods in terms of accuracy while achieving nearly maximum coverage.

  4. The genome BLASTatlas - a GeneWiz extension for visualization of whole-genome homology

    DEFF Research Database (Denmark)

    Hallin, Peter Fischer; Binnewies, Tim Terence; Ussery, David

    2008-01-01

    ://www.cbs.dtu.dk/ws/BLASTatlas), where programming examples are available in Perl. By providing an interoperable method to carry out whole genome visualization of homology, this service offers bioinformaticians as well as biologists an easy-to-adopt workflow that can be directly called from the programming language of the user, hence......The development of fast and inexpensive methods for sequencing bacterial genomes has led to a wealth of data, often with many genomes being sequenced of the same species or closely related organisms. Thus, there is a need for visualization methods that will allow easy comparison of many sequenced...... genomes to a defined reference strain. The BLASTatlas is one such tool that is useful for mapping and visualizing whole genome homology of genes and proteins within a reference strain compared to other strains or species of one or more prokaryotic organisms. We provide examples of BLASTatlases, including...

  5. Sugar Lego: gene composition of bacterial carbohydrate metabolism genomic loci.

    Science.gov (United States)

    Kaznadzey, Anna; Shelyakin, Pavel; Gelfand, Mikhail S

    2017-11-25

    Bacterial carbohydrate metabolism is extremely diverse, since carbohydrates serve as a major energy source and are involved in a variety of cellular processes. Bacterial genes belonging to same metabolic pathway are often co-localized in the chromosome, but it is not a strict rule. Gene co-localization in linked to co-evolution and co-regulation. This study focuses on a large-scale analysis of bacterial genomic loci related to the carbohydrate metabolism. We demonstrate that only 53% of 148,000 studied genes from over six hundred bacterial genomes are co-localized in bacterial genomes with other carbohydrate metabolism genes, which points to a significant role of singleton genes. Co-localized genes form cassettes, ranging in size from two to fifteen genes. Two major factors influencing the cassette-forming tendency are gene function and bacterial phylogeny. We have obtained a comprehensive picture of co-localization preferences of genes for nineteen major carbohydrate metabolism functional classes, over two hundred gene orthologous clusters, and thirty bacterial classes, and characterized the cassette variety in size and content among different species, highlighting a significant role of short cassettes. The preference towards co-localization of carbohydrate metabolism genes varies between 40 and 76% for bacterial taxa. Analysis of frequently co-localized genes yielded forty-five significant pairwise links between genes belonging to different functional classes. The number of such links per class range from zero to eight, demonstrating varying preferences of respective genes towards a specific chromosomal neighborhood. Genes from eleven functional classes tend to co-localize with genes from the same class, indicating an important role of clustering of genes with similar functions. At that, in most cases such co-localization does not originate from local duplication events. Overall, we describe a complex web formed by evolutionary relationships of bacterial

  6. Gene organization inside replication domains in mammalian genomes

    Science.gov (United States)

    Zaghloul, Lamia; Baker, Antoine; Audit, Benjamin; Arneodo, Alain

    2012-11-01

    We investigate the large-scale organization of human genes with respect to "master" replication origins that were previously identified as bordering nucleotide compositional skew domains. We separate genes in two categories depending on their CpG enrichment at the promoter which can be considered as a marker of germline DNA methylation. Using expression data in mouse, we confirm that CpG-rich genes are highly expressed in germline whereas CpG-poor genes are in a silent state. We further show that, whether tissue-specific or broadly expressed (housekeeping genes), the CpG-rich genes are over-represented close to the replication skew domain borders suggesting some coordination of replication and transcription. We also reveal that the transcription of the longest CpG-rich genes is co-oriented with replication fork progression so that the promoter of these transcriptionally active genes be located into the accessible open chromatin environment surrounding the master replication origins that border the replication skew domains. The observation of a similar gene organization in the mouse genome confirms the interplay of replication, transcription and chromatin structure as the cornerstone of mammalian genome architecture.

  7. Differential retention of metabolic genes following whole-genome duplication.

    Science.gov (United States)

    Gout, Jean-François; Duret, Laurent; Kahn, Daniel

    2009-05-01

    Classical studies in Metabolic Control Theory have shown that metabolic fluxes usually exhibit little sensitivity to changes in individual enzyme activity, yet remain sensitive to global changes of all enzymes in a pathway. Therefore, little selective pressure is expected on the dosage or expression of individual metabolic genes, yet entire pathways should still be constrained. However, a direct estimate of this selective pressure had not been evaluated. Whole-genome duplications (WGDs) offer a good opportunity to address this question by analyzing the fates of metabolic genes during the massive gene losses that follow. Here, we take advantage of the successive rounds of WGD that occurred in the Paramecium lineage. We show that metabolic genes exhibit different gene retention patterns than nonmetabolic genes. Contrary to what was expected for individual genes, metabolic genes appeared more retained than other genes after the recent WGD, which was best explained by selection for gene expression operating on entire pathways. Metabolic genes also tend to be less retained when present at high copy number before WGD, contrary to other genes that show a positive correlation between gene retention and preduplication copy number. This is rationalized on the basis of the classical concave relationship relating metabolic fluxes with enzyme expression.

  8. In-silico human genomics with GeneCards

    Directory of Open Access Journals (Sweden)

    Stelzer Gil

    2011-10-01

    Full Text Available Abstract Since 1998, the bioinformatics, systems biology, genomics and medical communities have enjoyed a synergistic relationship with the GeneCards database of human genes (http://www.genecards.org. This human gene compendium was created to help to introduce order into the increasing chaos of information flow. As a consequence of viewing details and deep links related to specific genes, users have often requested enhanced capabilities, such that, over time, GeneCards has blossomed into a suite of tools (including GeneDecks, GeneALaCart, GeneLoc, GeneNote and GeneAnnot for a variety of analyses of both single human genes and sets thereof. In this paper, we focus on inhouse and external research activities which have been enabled, enhanced, complemented and, in some cases, motivated by GeneCards. In turn, such interactions have often inspired and propelled improvements in GeneCards. We describe here the evolution and architecture of this project, including examples of synergistic applications in diverse areas such as synthetic lethality in cancer, the annotation of genetic variations in disease, omics integration in a systems biology approach to kidney disease, and bioinformatics tools.

  9. Genetic addiction: selfish gene's strategy for symbiosis in the genome.

    Science.gov (United States)

    Mochizuki, Atsushi; Yahara, Koji; Kobayashi, Ichizo; Iwasa, Yoh

    2006-02-01

    The evolution and maintenance of the phenomenon of postsegregational host killing or genetic addiction are paradoxical. In this phenomenon, a gene complex, once established in a genome, programs death of a host cell that has eliminated it. The intact form of the gene complex would survive in other members of the host population. It is controversial as to why these genetic elements are maintained, due to the lethal effects of host killing, or perhaps some other properties are beneficial to the host. We analyzed their population dynamics by analytical methods and computer simulations. Genetic addiction turned out to be advantageous to the gene complex in the presence of a competitor genetic element. The advantage is, however, limited in a population without spatial structure, such as that in a well-mixed liquid culture. In contrast, in a structured habitat, such as the surface of a solid medium, the addiction gene complex can increase in frequency, irrespective of its initial density. Our demonstration that genomes can evolve through acquisition of addiction genes has implications for the general question of how a genome can evolve as a community of potentially selfish genes.

  10. GENOME-ENABLED DISCOVERY OF CARBON SEQUESTRATION GENES IN POPLAR

    Energy Technology Data Exchange (ETDEWEB)

    DAVIS J M

    2007-10-11

    Plants utilize carbon by partitioning the reduced carbon obtained through photosynthesis into different compartments and into different chemistries within a cell and subsequently allocating such carbon to sink tissues throughout the plant. Since the phytohormones auxin and cytokinin are known to influence sink strength in tissues such as roots (Skoog & Miller 1957, Nordstrom et al. 2004), we hypothesized that altering the expression of genes that regulate auxin-mediated (e.g., AUX/IAA or ARF transcription factors) or cytokinin-mediated (e.g., RR transcription factors) control of root growth and development would impact carbon allocation and partitioning belowground (Fig. 1 - Renewal Proposal). Specifically, the ARF, AUX/IAA and RR transcription factor gene families mediate the effects of the growth regulators auxin and cytokinin on cell expansion, cell division and differentiation into root primordia. Invertases (IVR), whose transcript abundance is enhanced by both auxin and cytokinin, are critical components of carbon movement and therefore of carbon allocation. Thus, we initiated comparative genomic studies to identify the AUX/IAA, ARF, RR and IVR gene families in the Populus genome that could impact carbon allocation and partitioning. Bioinformatics searches using Arabidopsis gene sequences as queries identified regions with high degrees of sequence similarities in the Populus genome. These Populus sequences formed the basis of our transgenic experiments. Transgenic modification of gene expression involving members of these gene families was hypothesized to have profound effects on carbon allocation and partitioning.

  11. Identification of candidate new cancer susceptibility genes using yeast genomics

    International Nuclear Information System (INIS)

    Brown, M.; Brown, J.A.; Game, J.C.

    2003-01-01

    A large proportion of cancer susceptibility syndromes are the result of mutations in genes in DNA repair or in cell-cycle checkpoints in response to DNA damage, such as ataxia telangiectasia (AT), Fanconi's anemia (FA), Bloom's syndrome (BS), Nijmegen breakage syndrome (NBS), and xeroderma pigmentosum (XP). Mutations in these genes often cause gross chromosomal instability leading to an increased mutation rate of all genes including those directly responsible for cancer. We have proposed that because the orthologs of these genes in budding yeast, S. cerevisiae, confer protection against killing by DNA damaging agents it should be possible to identify new cancer susceptibility genes by identifying yeast genes whose deletion causes sensitivity to DNA damage. We therefore screened the recently completed collection of individual gene deletion mutants to identify genes that affect sensitivity to DNA-damaging agents. Screening for sensitivity in this obtained up to now with the F98 glioma model othe fact that each deleted gene is replaced by a cassette containing two molecular 'barcodes', or 20-mers, that uniquely identify the strain when DNA from a pool of strains is hybridized to an oligonucleotide array containing the complementary sequences of the barcodes. We performed the screen with UV, IR, H 2 0 2 and other DNA damaging agents. In addition to identifying genes already known to confer resistance to DNA damaging agents we have identified, and individually confirmed, several genes not previously associated with resistance. Several of these are of unknown function. We have also examined the chromosomal stability of selected strains and found that IR sensitive strains often but not always exhibit genomic instability. We are presently constructing a yeast artificial chromosome to globally interrogate all the genes in the deletion pool for their involvement in genomic stability. This work shows that budding yeast is a valuable eukaryotic model organism to identify

  12. Genome Binding and Gene Regulation by Stem Cell Transcription Factors

    NARCIS (Netherlands)

    J.H. Brandsma (Johan)

    2016-01-01

    markdownabstractNearly all cells of an individual organism contain the same genome. However, each cell type transcribes a different set of genes due to the presence of different sets of cell type-specific transcription factors. Such transcription factors bind to regulatory regions such as promoters

  13. Gene therapy and genome surgery in the retina.

    Science.gov (United States)

    DiCarlo, James E; Mahajan, Vinit B; Tsang, Stephen H

    2018-06-01

    Precision medicine seeks to treat disease with molecular specificity. Advances in genome sequence analysis, gene delivery, and genome surgery have allowed clinician-scientists to treat genetic conditions at the level of their pathology. As a result, progress in treating retinal disease using genetic tools has advanced tremendously over the past several decades. Breakthroughs in gene delivery vectors, both viral and nonviral, have allowed the delivery of genetic payloads in preclinical models of retinal disorders and have paved the way for numerous successful clinical trials. Moreover, the adaptation of CRISPR-Cas systems for genome engineering have enabled the correction of both recessive and dominant pathogenic alleles, expanding the disease-modifying power of gene therapies. Here, we highlight the translational progress of gene therapy and genome editing of several retinal disorders, including RPE65-, CEP290-, and GUY2D-associated Leber congenital amaurosis, as well as choroideremia, achromatopsia, Mer tyrosine kinase- (MERTK-) and RPGR X-linked retinitis pigmentosa, Usher syndrome, neovascular age-related macular degeneration, X-linked retinoschisis, Stargardt disease, and Leber hereditary optic neuropathy.

  14. Genomic dissection and prioritizing of candidate genes of QTL for ...

    Indian Academy of Sciences (India)

    of Anatomy and Neurobiology, University of Tennessee Health Science Center, Memphis, TN 38163, USA. 5Mudanjiang ..... Fragile X mental retardation gene 1,. −2.1 ... stimulus/stress and signalling associated with acute-phase response were .... This work was supported by the Center of Genomics and Bioinfor- matics and ...

  15. Re-Examining the Gene in Personalized Genomics

    Science.gov (United States)

    Bartol, Jordan

    2013-01-01

    Personalized genomics companies (PG; also called "direct-to-consumer genetics") are businesses marketing genetic testing to consumers over the Internet. While much has been written about these new businesses, little attention has been given to their roles in science communication. This paper provides an analysis of the gene concept…

  16. Gene hunting: molecular analysis of the chicken genome

    NARCIS (Netherlands)

    Crooijmans, R.P.M.A.

    2000-01-01

    This dissertation describes the development of molecular tools to identify genes that are involved in production and health traits in poultry. To unravel the chicken genome, fluorescent molecular markers (microsatellite markers) were developed and optimized to perform high throughput

  17. Genomic dissection and prioritizing of candidate genes of QTL for ...

    Indian Academy of Sciences (India)

    Genomic dissection and prioritizing of candidate genes of QTL for regulating spontaneous arthritis on chromosome 1 in mice deficient for interleukin-1 receptor antagonist. Yanhong Cao, Jifei Zhang, Yan Jiao, Jian Yan, Feng Jiao, XiaoYun Liu, Robert W. Williams, Karen A. Hasty,. John M. Stuart and Weikuan Gu. J. Genet.

  18. Invasion of exotic earthworms into ecosystems inhabited by native earthworms

    Science.gov (United States)

    P. F. Hendrix; G. H. Baker; M. A. Callaham Jr; G. A. Damoff; Fragoso C.; G. Gonzalez; S. W. James; S. L. Lachnicht; T. Winsome; X. Zou

    2006-01-01

    The most conspicuous biological invasions in terrestrial ecosystems have been by exotic plants, insects and vertebrates. Invasions by exotic earthworms, although not as well studied, may be increasing with global commerce in agriculture, waste management and bioremediation. A number of cases has documented where invasive earthworms have caused significant changes in...

  19. Comparative genomics of Mycoplasma: analysis of conserved essential genes and diversity of the pan-genome.

    Directory of Open Access Journals (Sweden)

    Wei Liu

    Full Text Available Mycoplasma, the smallest self-replicating organism with a minimal metabolism and little genomic redundancy, is expected to be a close approximation to the minimal set of genes needed to sustain bacterial life. This study employs comparative evolutionary analysis of twenty Mycoplasma genomes to gain an improved understanding of essential genes. By analyzing the core genome of mycoplasmas, we finally revealed the conserved essential genes set for mycoplasma survival. Further analysis showed that the core genome set has many characteristics in common with experimentally identified essential genes. Several key genes, which are related to DNA replication and repair and can be disrupted in transposon mutagenesis studies, may be critical for bacteria survival especially over long period natural selection. Phylogenomic reconstructions based on 3,355 homologous groups allowed robust estimation of phylogenetic relatedness among mycoplasma strains. To obtain deeper insight into the relative roles of molecular evolution in pathogen adaptation to their hosts, we also analyzed the positive selection pressures on particular sites and lineages. There appears to be an approximate correlation between the divergence of species and the level of positive selection detected in corresponding lineages.

  20. Comparative genomics and transcriptomics of trait-gene association

    Directory of Open Access Journals (Sweden)

    Pierlé Sebastián

    2012-11-01

    Full Text Available Abstract Background The Order Rickettsiales includes important tick-borne pathogens, from Rickettsia rickettsii, which causes Rocky Mountain spotted fever, to Anaplasma marginale, the most prevalent vector-borne pathogen of cattle. Although most pathogens in this Order are transmitted by arthropod vectors, little is known about the microbial determinants of transmission. A. marginale provides unique tools for studying the determinants of transmission, with multiple strain sequences available that display distinct and reproducible transmission phenotypes. The closed core A. marginale genome suggests that any phenotypic differences are due to single nucleotide polymorphisms (SNPs. We combined DNA/RNA comparative genomic approaches using strains with different tick transmission phenotypes and identified genes that segregate with transmissibility. Results Comparison of seven strains with different transmission phenotypes generated a list of SNPs affecting 18 genes and nine promoters. Transcriptional analysis found two candidate genes downstream from promoter SNPs that were differentially transcribed. To corroborate the comparative genomics approach we used three RNA-seq platforms to analyze the transcriptomes from two A. marginale strains with different transmission phenotypes. RNA-seq analysis confirmed the comparative genomics data and found 10 additional genes whose transcription between strains with distinct transmission efficiencies was significantly different. Six regions of the genome that contained no annotation were found to be transcriptionally active, and two of these newly identified transcripts were differentially transcribed. Conclusions This approach identified 30 genes and two novel transcripts potentially involved in tick transmission. We describe the transcriptome of an obligate intracellular bacterium in depth, while employing massive parallel sequencing to dissect an important trait in bacterial pathogenesis.

  1. Soft rot erwiniae: from genes to genomes.

    Science.gov (United States)

    Toth, Ian K; Bell, Kenneth S; Holeva, Maria C; Birch, Paul R J

    2003-01-01

    SUMMARY The soft rot erwiniae, Erwinia carotovora ssp. atroseptica (Eca), E. carotovora ssp. carotovora (Ecc) and E. chrysanthemi (Ech) are major bacterial pathogens of potato and other crops world-wide. We currently understand much about how these bacteria attack plants and protect themselves against plant defences. However, the processes underlying the establishment of infection, differences in host range and their ability to survive when not causing disease, largely remain a mystery. This review will focus on our current knowledge of pathogenesis in these organisms and discuss how modern genomic approaches, including complete genome sequencing of Eca and Ech, may open the door to a new understanding of the potential subtlety and complexity of soft rot erwiniae and their interactions with plants. The soft rot erwiniae are members of the Enterobacteriaceae, along with other plant pathogens such as Erwinia amylovora and human pathogens such as Escherichia coli, Salmonella spp. and Yersinia spp. Although the genus name Erwinia is most often used to describe the group, an alternative genus name Pectobacterium was recently proposed for the soft rot species. Ech mainly affects crops and other plants in tropical and subtropical regions and has a wide host range that includes potato and the important model host African violet (Saintpaulia ionantha). Ecc affects crops and other plants in subtropical and temperate regions and has probably the widest host range, which also includes potato. Eca, on the other hand, has a host range limited almost exclusively to potato in temperate regions only. Disease symptoms: Soft rot erwiniae cause general tissue maceration, termed soft rot disease, through the production of plant cell wall degrading enzymes. Environmental factors such as temperature, low oxygen concentration and free water play an essential role in disease development. On potato, and possibly other plants, disease symptoms may differ, e.g. blackleg disease is associated

  2. Digestive enzymes of some earthworms.

    Science.gov (United States)

    Mishra, P C; Dash, M C

    1980-10-15

    4 species of tropical earthworms differed with regard to enzyme activity. The maximum activity of protease and of cellulase occurred in the posterior region of the gut of the earthworms. On the average Octochaetona surensis shows maximum activity and Drawida calebi shows minimum activity for all the enzymes studied.

  3. Mapping earthworm communities in Europe

    DEFF Research Database (Denmark)

    Rutgers, Michiel; Orgiazzi, Alberto; Gardi, Ciro

    Existing data sets on earthworm communities in Europe were collected, harmonized, modelled and depicted on a soil biodiversity map of Europe. Digital Soil Mapping was applied using multiple regressions relating relatively low density earthworm community data to soil characteristics, land use...

  4. Genome-wide identification of key modulators of gene-gene interaction networks in breast cancer.

    Science.gov (United States)

    Chiu, Yu-Chiao; Wang, Li-Ju; Hsiao, Tzu-Hung; Chuang, Eric Y; Chen, Yidong

    2017-10-03

    With the advances in high-throughput gene profiling technologies, a large volume of gene interaction maps has been constructed. A higher-level layer of gene-gene interaction, namely modulate gene interaction, is composed of gene pairs of which interaction strengths are modulated by (i.e., dependent on) the expression level of a key modulator gene. Systematic investigations into the modulation by estrogen receptor (ER), the best-known modulator gene, have revealed the functional and prognostic significance in breast cancer. However, a genome-wide identification of key modulator genes that may further unveil the landscape of modulated gene interaction is still lacking. We proposed a systematic workflow to screen for key modulators based on genome-wide gene expression profiles. We designed four modularity parameters to measure the ability of a putative modulator to perturb gene interaction networks. Applying the method to a dataset of 286 breast tumors, we comprehensively characterized the modularity parameters and identified a total of 973 key modulator genes. The modularity of these modulators was verified in three independent breast cancer datasets. ESR1, the encoding gene of ER, appeared in the list, and abundant novel modulators were illuminated. For instance, a prognostic predictor of breast cancer, SFRP1, was found the second modulator. Functional annotation analysis of the 973 modulators revealed involvements in ER-related cellular processes as well as immune- and tumor-associated functions. Here we present, as far as we know, the first comprehensive analysis of key modulator genes on a genome-wide scale. The validity of filtering parameters as well as the conservativity of modulators among cohorts were corroborated. Our data bring new insights into the modulated layer of gene-gene interaction and provide candidates for further biological investigations.

  5. Comparison of methods for genomic localization of gene trap sequences

    Directory of Open Access Journals (Sweden)

    Ferrin Thomas E

    2006-09-01

    Full Text Available Abstract Background Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences were used to evaluate localization results. Results In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.

  6. Comparative genome analysis of PHB gene family reveals deep evolutionary origins and diverse gene function.

    Science.gov (United States)

    Di, Chao; Xu, Wenying; Su, Zhen; Yuan, Joshua S

    2010-10-07

    PHB (Prohibitin) gene family is involved in a variety of functions important for different biological processes. PHB genes are ubiquitously present in divergent species from prokaryotes to eukaryotes. Human PHB genes have been found to be associated with various diseases. Recent studies by our group and others have shown diverse function of PHB genes in plants for development, senescence, defence, and others. Despite the importance of the PHB gene family, no comprehensive gene family analysis has been carried to evaluate the relatedness of PHB genes across different species. In order to better guide the gene function analysis and understand the evolution of the PHB gene family, we therefore carried out the comparative genome analysis of the PHB genes across different kingdoms. The relatedness, motif distribution, and intron/exon distribution all indicated that PHB genes is a relatively conserved gene family. The PHB genes can be classified into 5 classes and each class have a very deep evolutionary origin. The PHB genes within the class maintained the same motif patterns during the evolution. With Arabidopsis as the model species, we found that PHB gene intron/exon structure and domains are also conserved during the evolution. Despite being a conserved gene family, various gene duplication events led to the expansion of the PHB genes. Both segmental and tandem gene duplication were involved in Arabidopsis PHB gene family expansion. However, segmental duplication is predominant in Arabidopsis. Moreover, most of the duplicated genes experienced neofunctionalization. The results highlighted that PHB genes might be involved in important functions so that the duplicated genes are under the evolutionary pressure to derive new function. PHB gene family is a conserved gene family and accounts for diverse but important biological functions based on the similar molecular mechanisms. The highly diverse biological function indicated that more research needs to be carried out

  7. Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.

    Science.gov (United States)

    Hiscock, D; Upton, C

    2000-05-01

    The Viral Genome DataBase (VGDB) contains detailed information of the genes and predicted protein sequences from 15 completely sequenced genomes of large (&100 kb) viruses (2847 genes). The data that is stored includes DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a mySQL database with a user-friendly JAVA GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html .

  8. Gene Conversion in Angiosperm Genomes with an Emphasis on Genes Duplicated by Polyploidization

    Directory of Open Access Journals (Sweden)

    Xi-Yin Wang

    2011-01-01

    Full Text Available Angiosperm genomes differ from those of mammals by extensive and recursive polyploidizations. The resulting gene duplication provides opportunities both for genetic innovation, and for concerted evolution. Though most genes may escape conversion by their homologs, concerted evolution of duplicated genes can last for millions of years or longer after their origin. Indeed, paralogous genes on two rice chromosomes duplicated an estimated 60–70 million years ago have experienced gene conversion in the past 400,000 years. Gene conversion preserves similarity of paralogous genes, but appears to accelerate their divergence from orthologous genes in other species. The mutagenic nature of recombination coupled with the buffering effect provided by gene redundancy, may facilitate the evolution of novel alleles that confer functional innovations while insulating biological fitness of affected plants. A mixed evolutionary model, characterized by a primary birth-and-death process and occasional homoeologous recombination and gene conversion, may best explain the evolution of multigene families.

  9. Genome-wide associations of gene expression variation in humans.

    Directory of Open Access Journals (Sweden)

    Barbara E Stranger

    2005-12-01

    Full Text Available The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis- to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.

  10. Genome-Wide Associations of Gene Expression Variation in Humans.

    Directory of Open Access Journals (Sweden)

    2005-12-01

    Full Text Available The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis- to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.

  11. Evolutionary maintenance of filovirus-like genes in bat genomes

    Directory of Open Access Journals (Sweden)

    Taylor Derek J

    2011-11-01

    Full Text Available Abstract Background Little is known of the biological significance and evolutionary maintenance of integrated non-retroviral RNA virus genes in eukaryotic host genomes. Here, we isolated novel filovirus-like genes from bat genomes and tested for evolutionary maintenance. We also estimated the age of filovirus VP35-like gene integrations and tested the phylogenetic hypotheses that there is a eutherian mammal clade and a marsupial/ebolavirus/Marburgvirus dichotomy for filoviruses. Results We detected homologous copies of VP35-like and NP-like gene integrations in both Old World and New World species of Myotis (bats. We also detected previously unknown VP35-like genes in rodents that are positionally homologous. Comprehensive phylogenetic estimates for filovirus NP-like and VP35-like loci support two main clades with a marsupial and a rodent grouping within the ebolavirus/Lloviu virus/Marburgvirus clade. The concordance of VP35-like, NP-like and mitochondrial gene trees with the expected species tree supports the notion that the copies we examined are orthologs that predate the global spread and radiation of the genus Myotis. Parametric simulations were consistent with selective maintenance for the open reading frame (ORF of VP35-like genes in Myotis. The ORF of the filovirus-like VP35 gene has been maintained in bat genomes for an estimated 13. 4 MY. ORFs were disrupted for the NP-like genes in Myotis. Likelihood ratio tests revealed that a model that accommodates positive selection is a significantly better fit to the data than a model that does not allow for positive selection for VP35-like sequences. Moreover, site-by-site analysis of selection using two methods indicated at least 25 sites in the VP35-like alignment are under positive selection in Myotis. Conclusions Our results indicate that filovirus-like elements have significance beyond genomic imprints of prior infection. That is, there appears to be, or have been, functionally maintained

  12. Integrated genomic and gene expression profiling identifies two major genomic circuits in urothelial carcinoma.

    Directory of Open Access Journals (Sweden)

    David Lindgren

    Full Text Available Similar to other malignancies, urothelial carcinoma (UC is characterized by specific recurrent chromosomal aberrations and gene mutations. However, the interconnection between specific genomic alterations, and how patterns of chromosomal alterations adhere to different molecular subgroups of UC, is less clear. We applied tiling resolution array CGH to 146 cases of UC and identified a number of regions harboring recurrent focal genomic amplifications and deletions. Several potential oncogenes were included in the amplified regions, including known oncogenes like E2F3, CCND1, and CCNE1, as well as new candidate genes, such as SETDB1 (1q21, and BCL2L1 (20q11. We next combined genome profiling with global gene expression, gene mutation, and protein expression data and identified two major genomic circuits operating in urothelial carcinoma. The first circuit was characterized by FGFR3 alterations, overexpression of CCND1, and 9q and CDKN2A deletions. The second circuit was defined by E3F3 amplifications and RB1 deletions, as well as gains of 5p, deletions at PTEN and 2q36, 16q, 20q, and elevated CDKN2A levels. TP53/MDM2 alterations were common for advanced tumors within the two circuits. Our data also suggest a possible RAS/RAF circuit. The tumors with worst prognosis showed a gene expression profile that indicated a keratinized phenotype. Taken together, our integrative approach revealed at least two separate networks of genomic alterations linked to the molecular diversity seen in UC, and that these circuits may reflect distinct pathways of tumor development.

  13. Gene Discovery through Genomic Sequencing of Brucella abortus

    Science.gov (United States)

    Sánchez, Daniel O.; Zandomeni, Ruben O.; Cravero, Silvio; Verdún, Ramiro E.; Pierrou, Ester; Faccio, Paula; Diaz, Gabriela; Lanzavecchia, Silvia; Agüero, Fernán; Frasch, Alberto C. C.; Andersson, Siv G. E.; Rossetti, Osvaldo L.; Grau, Oscar; Ugalde, Rodolfo A.

    2001-01-01

    Brucella abortus is the etiological agent of brucellosis, a disease that affects bovines and human. We generated DNA random sequences from the genome of B. abortus strain 2308 in order to characterize molecular targets that might be useful for developing immunological or chemotherapeutic strategies against this pathogen. The partial sequencing of 1,899 clones allowed the identification of 1,199 genomic sequence surveys (GSSs) with high homology (BLAST expect value < 10−5) to sequences deposited in the GenBank databases. Among them, 925 represent putative novel genes for the Brucella genus. Out of 925 nonredundant GSSs, 470 were classified in 15 categories based on cellular function. Seven hundred GSSs showed no significant database matches and remain available for further studies in order to identify their function. A high number of GSSs with homology to Agrobacterium tumefaciens and Rhizobium meliloti proteins were observed, thus confirming their close phylogenetic relationship. Among them, several GSSs showed high similarity with genes related to nodule nitrogen fixation, synthesis of nod factors, nodulation protein symbiotic plasmid, and nodule bacteroid differentiation. We have also identified several B. abortus homologs of virulence and pathogenesis genes from other pathogens, including a homolog to both the Shda gene from Salmonella enterica serovar Typhimurium and the AidA-1 gene from Escherichia coli. Other GSSs displayed significant homologies to genes encoding components of the type III and type IV secretion machineries, suggesting that Brucella might also have an active type III secretion machinery. PMID:11159979

  14. Identification of DNA repair genes in the human genome

    International Nuclear Information System (INIS)

    Hoeijmakers, J.H.J.; van Duin, M.; Westerveld, A.; Yasui, A.; Bootsma, D.

    1986-01-01

    To identify human DNA repair genes we have transfected human genomic DNA ligated to a dominant marker to excision repair deficient xeroderma pigmentosum (XP) and CHO cells. This resulted in the cloning of a human gene, ERCC-1, that complements the defect of a UV- and mitomycin-C sensitive CHO mutant 43-3B. The ERCC-1 gene has a size of 15 kb, consists of 10 exons and is located in the region 19q13.2-q13.3. Its primary transcript is processed into two mRNAs by alternative splicing of an internal coding exon. One of these transcripts encodes a polypeptide of 297 aminoacids. A putative DNA binding protein domain and nuclear location signal could be identified. Significant AA-homology is found between ERCC-1 and the yeast excision repair gene RAD10. 58 references, 6 figures, 1 table

  15. Re-examining the Gene in Personalized Genomics

    Science.gov (United States)

    Bartol, Jordan

    2013-10-01

    Personalized genomics companies (PG; also called `direct-to-consumer genetics') are businesses marketing genetic testing to consumers over the Internet. While much has been written about these new businesses, little attention has been given to their roles in science communication. This paper provides an analysis of the gene concept presented to customers and the relation between the information given and the science behind PG. Two quite different gene concepts are present in company rhetoric, but only one features in the science. To explain this, we must appreciate the delicate tension between PG, academic science, public expectation, and market forces.

  16. GeneViTo: Visualizing gene-product functional and structural features in genomic datasets

    Directory of Open Access Journals (Sweden)

    Promponas Vasilis J

    2003-10-01

    Full Text Available Abstract Background The availability of increasing amounts of sequence data from completely sequenced genomes boosts the development of new computational methods for automated genome annotation and comparative genomics. Therefore, there is a need for tools that facilitate the visualization of raw data and results produced by bioinformatics analysis, providing new means for interactive genome exploration. Visual inspection can be used as a basis to assess the quality of various analysis algorithms and to aid in-depth genomic studies. Results GeneViTo is a JAVA-based computer application that serves as a workbench for genome-wide analysis through visual interaction. The application deals with various experimental information concerning both DNA and protein sequences (derived from public sequence databases or proprietary data sources and meta-data obtained by various prediction algorithms, classification schemes or user-defined features. Interaction with a Graphical User Interface (GUI allows easy extraction of genomic and proteomic data referring to the sequence itself, sequence features, or general structural and functional features. Emphasis is laid on the potential comparison between annotation and prediction data in order to offer a supplement to the provided information, especially in cases of "poor" annotation, or an evaluation of available predictions. Moreover, desired information can be output in high quality JPEG image files for further elaboration and scientific use. A compilation of properly formatted GeneViTo input data for demonstration is available to interested readers for two completely sequenced prokaryotes, Chlamydia trachomatis and Methanococcus jannaschii. Conclusions GeneViTo offers an inspectional view of genomic functional elements, concerning data stemming both from database annotation and analysis tools for an overall analysis of existing genomes. The application is compatible with Linux or Windows ME-2000-XP operating

  17. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    DEFF Research Database (Denmark)

    Kaas, Rolf Sommer; Rundsten, Carsten Friis; Ussery, David

    2012-01-01

    Background Escherichia coli exists in commensal and pathogenic forms. By measuring the variation of individual genes across more than a hundred sequenced genomes, gene variation can be studied in detail, including the number of mutations found for any given gene. This knowledge will be useful...... for creating better phylogenies, for determination of molecular clocks and for improved typing techniques. Results We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps...... more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment and a pan-genome tree based on gene presence/absence, maps the relatedness...

  18. New Markov Model Approaches to Deciphering Microbial Genome Function and Evolution: Comparative Genomics of Laterally Transferred Genes

    Energy Technology Data Exchange (ETDEWEB)

    Borodovsky, M.

    2013-04-11

    Algorithmic methods for gene prediction have been developed and successfully applied to many different prokaryotic genome sequences. As the set of genes in a particular genome is not homogeneous with respect to DNA sequence composition features, the GeneMark.hmm program utilizes two Markov models representing distinct classes of protein coding genes denoted "typical" and "atypical". Atypical genes are those whose DNA features deviate significantly from those classified as typical and they represent approximately 10% of any given genome. In addition to the inherent interest of more accurately predicting genes, the atypical status of these genes may also reflect their separate evolutionary ancestry from other genes in that genome. We hypothesize that atypical genes are largely comprised of those genes that have been relatively recently acquired through lateral gene transfer (LGT). If so, what fraction of atypical genes are such bona fide LGTs? We have made atypical gene predictions for all fully completed prokaryotic genomes; we have been able to compare these results to other "surrogate" methods of LGT prediction.

  19. Conditions for the evolution of gene clusters in bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Sara Ballouz

    2010-02-01

    Full Text Available Genes encoding proteins in a common pathway are often found near each other along bacterial chromosomes. Several explanations have been proposed to account for the evolution of these structures. For instance, natural selection may directly favour gene clusters through a variety of mechanisms, such as increased efficiency of coregulation. An alternative and controversial hypothesis is the selfish operon model, which asserts that clustered arrangements of genes are more easily transferred to other species, thus improving the prospects for survival of the cluster. According to another hypothesis (the persistence model, genes that are in close proximity are less likely to be disrupted by deletions. Here we develop computational models to study the conditions under which gene clusters can evolve and persist. First, we examine the selfish operon model by re-implementing the simulation and running it under a wide range of conditions. Second, we introduce and study a Moran process in which there is natural selection for gene clustering and rearrangement occurs by genome inversion events. Finally, we develop and study a model that includes selection and inversion, which tracks the occurrence and fixation of rearrangements. Surprisingly, gene clusters fail to evolve under a wide range of conditions. Factors that promote the evolution of gene clusters include a low number of genes in the pathway, a high population size, and in the case of the selfish operon model, a high horizontal transfer rate. The computational analysis here has shown that the evolution of gene clusters can occur under both direct and indirect selection as long as certain conditions hold. Under these conditions the selfish operon model is still viable as an explanation for the evolution of gene clusters.

  20. Conditions for the Evolution of Gene Clusters in Bacterial Genomes

    Science.gov (United States)

    Ballouz, Sara; Francis, Andrew R.; Lan, Ruiting; Tanaka, Mark M.

    2010-01-01

    Genes encoding proteins in a common pathway are often found near each other along bacterial chromosomes. Several explanations have been proposed to account for the evolution of these structures. For instance, natural selection may directly favour gene clusters through a variety of mechanisms, such as increased efficiency of coregulation. An alternative and controversial hypothesis is the selfish operon model, which asserts that clustered arrangements of genes are more easily transferred to other species, thus improving the prospects for survival of the cluster. According to another hypothesis (the persistence model), genes that are in close proximity are less likely to be disrupted by deletions. Here we develop computational models to study the conditions under which gene clusters can evolve and persist. First, we examine the selfish operon model by re-implementing the simulation and running it under a wide range of conditions. Second, we introduce and study a Moran process in which there is natural selection for gene clustering and rearrangement occurs by genome inversion events. Finally, we develop and study a model that includes selection and inversion, which tracks the occurrence and fixation of rearrangements. Surprisingly, gene clusters fail to evolve under a wide range of conditions. Factors that promote the evolution of gene clusters include a low number of genes in the pathway, a high population size, and in the case of the selfish operon model, a high horizontal transfer rate. The computational analysis here has shown that the evolution of gene clusters can occur under both direct and indirect selection as long as certain conditions hold. Under these conditions the selfish operon model is still viable as an explanation for the evolution of gene clusters. PMID:20168992

  1. Recombinant Protein Production of Earthworm Lumbrokinase for Potential Antithrombotic Application

    Directory of Open Access Journals (Sweden)

    Kevin Yueju Wang

    2013-01-01

    Full Text Available Earthworms have been used as a traditional medicine in China, Japan, and other Far East countries for thousands of years. Oral administration of dry earthworm powder is considered as a potent and effective supplement for supporting healthy blood circulation. Lumbrokinases are a group of enzymes that were isolated and purified from different species of earthworms. These enzymes are recognized as fibrinolytic agents that can be used to treat various conditions associated with thrombosis. Many lumbrokinase (LK genes have been cloned and characterized. Advances in genetic technology have provided the ability to produce recombinant LK and have made it feasible to purify a single lumbrokinase enzyme for potential antithrombotic application. In this review, we focus on expression systems that can be used for lumbrokinase production. In particular, the advantages of using a transgenic plant system to produce edible lumbrokinase are described.

  2. Diversity and host specificity of the Verminephrobacter–earthworm symbiosis

    DEFF Research Database (Denmark)

    Lund, Marie Braad; Davidson, Seana; Holmstrup, Martin

    2010-01-01

    Symbiotic bacteria of the genus Verminephrobacter (Betaproteobacteria) were detected in the nephridia of 19 out of 23 investigated earthworm species (Oligochaeta: Lumbricidae) by 16S rRNA gene sequence analysis and fluorescence in situ hybridization (FISH). While all four Lumbricus species and th...

  3. A salmonid EST genomic study: genes, duplications, phylogeny and microarrays

    Directory of Open Access Journals (Sweden)

    Brahmbhatt Sonal

    2008-11-01

    Full Text Available Abstract Background Salmonids are of interest because of their relatively recent genome duplication, and their extensive use in wild fisheries and aquaculture. A comprehensive gene list and a comparison of genes in some of the different species provide valuable genomic information for one of the most widely studied groups of fish. Results 298,304 expressed sequence tags (ESTs from Atlantic salmon (69% of the total, 11,664 chinook, 10,813 sockeye, 10,051 brook trout, 10,975 grayling, 8,630 lake whitefish, and 3,624 northern pike ESTs were obtained in this study and have been deposited into the public databases. Contigs were built and putative full-length Atlantic salmon clones have been identified. A database containing ESTs, assemblies, consensus sequences, open reading frames, gene predictions and putative annotation is available. The overall similarity between Atlantic salmon ESTs and those of rainbow trout, chinook, sockeye, brook trout, grayling, lake whitefish, northern pike and rainbow smelt is 93.4, 94.2, 94.6, 94.4, 92.5, 91.7, 89.6, and 86.2% respectively. An analysis of 78 transcript sets show Salmo as a sister group to Oncorhynchus and Salvelinus within Salmoninae, and Thymallinae as a sister group to Salmoninae and Coregoninae within Salmonidae. Extensive gene duplication is consistent with a genome duplication in the common ancestor of salmonids. Using all of the available EST data, a new expanded salmonid cDNA microarray of 32,000 features was created. Cross-species hybridizations to this cDNA microarray indicate that this resource will be useful for studies of all 68 salmonid species. Conclusion An extensive collection and analysis of salmonid RNA putative transcripts indicate that Pacific salmon, Atlantic salmon and charr are 94–96% similar while the more distant whitefish, grayling, pike and smelt are 93, 92, 89 and 86% similar to salmon. The salmonid transcriptome reveals a complex history of gene duplication that is

  4. Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes.

    Science.gov (United States)

    Biankin, Andrew V; Waddell, Nicola; Kassahn, Karin S; Gingras, Marie-Claude; Muthuswamy, Lakshmi B; Johns, Amber L; Miller, David K; Wilson, Peter J; Patch, Ann-Marie; Wu, Jianmin; Chang, David K; Cowley, Mark J; Gardiner, Brooke B; Song, Sarah; Harliwong, Ivon; Idrisoglu, Senel; Nourse, Craig; Nourbakhsh, Ehsan; Manning, Suzanne; Wani, Shivangi; Gongora, Milena; Pajic, Marina; Scarlett, Christopher J; Gill, Anthony J; Pinho, Andreia V; Rooman, Ilse; Anderson, Matthew; Holmes, Oliver; Leonard, Conrad; Taylor, Darrin; Wood, Scott; Xu, Qinying; Nones, Katia; Fink, J Lynn; Christ, Angelika; Bruxner, Tim; Cloonan, Nicole; Kolle, Gabriel; Newell, Felicity; Pinese, Mark; Mead, R Scott; Humphris, Jeremy L; Kaplan, Warren; Jones, Marc D; Colvin, Emily K; Nagrial, Adnan M; Humphrey, Emily S; Chou, Angela; Chin, Venessa T; Chantrill, Lorraine A; Mawson, Amanda; Samra, Jaswinder S; Kench, James G; Lovell, Jessica A; Daly, Roger J; Merrett, Neil D; Toon, Christopher; Epari, Krishna; Nguyen, Nam Q; Barbour, Andrew; Zeps, Nikolajs; Kakkar, Nipun; Zhao, Fengmei; Wu, Yuan Qing; Wang, Min; Muzny, Donna M; Fisher, William E; Brunicardi, F Charles; Hodges, Sally E; Reid, Jeffrey G; Drummond, Jennifer; Chang, Kyle; Han, Yi; Lewis, Lora R; Dinh, Huyen; Buhay, Christian J; Beck, Timothy; Timms, Lee; Sam, Michelle; Begley, Kimberly; Brown, Andrew; Pai, Deepa; Panchal, Ami; Buchner, Nicholas; De Borja, Richard; Denroche, Robert E; Yung, Christina K; Serra, Stefano; Onetto, Nicole; Mukhopadhyay, Debabrata; Tsao, Ming-Sound; Shaw, Patricia A; Petersen, Gloria M; Gallinger, Steven; Hruban, Ralph H; Maitra, Anirban; Iacobuzio-Donahue, Christine A; Schulick, Richard D; Wolfgang, Christopher L; Morgan, Richard A; Lawlor, Rita T; Capelli, Paola; Corbo, Vincenzo; Scardoni, Maria; Tortora, Giampaolo; Tempero, Margaret A; Mann, Karen M; Jenkins, Nancy A; Perez-Mancera, Pedro A; Adams, David J; Largaespada, David A; Wessels, Lodewyk F A; Rust, Alistair G; Stein, Lincoln D; Tuveson, David A; Copeland, Neal G; Musgrove, Elizabeth A; Scarpa, Aldo; Eshleman, James R; Hudson, Thomas J; Sutherland, Robert L; Wheeler, David A; Pearson, John V; McPherson, John D; Gibbs, Richard A; Grimmond, Sean M

    2012-11-15

    Pancreatic cancer is a highly lethal malignancy with few effective therapies. We performed exome sequencing and copy number analysis to define genomic aberrations in a prospectively accrued clinical cohort (n = 142) of early (stage I and II) sporadic pancreatic ductal adenocarcinoma. Detailed analysis of 99 informative tumours identified substantial heterogeneity with 2,016 non-silent mutations and 1,628 copy-number variations. We define 16 significantly mutated genes, reaffirming known mutations (KRAS, TP53, CDKN2A, SMAD4, MLL3, TGFBR2, ARID1A and SF3B1), and uncover novel mutated genes including additional genes involved in chromatin modification (EPC1 and ARID2), DNA damage repair (ATM) and other mechanisms (ZIM2, MAP2K4, NALCN, SLC16A4 and MAGEA6). Integrative analysis with in vitro functional data and animal models provided supportive evidence for potential roles for these genetic aberrations in carcinogenesis. Pathway-based analysis of recurrently mutated genes recapitulated clustering in core signalling pathways in pancreatic ductal adenocarcinoma, and identified new mutated genes in each pathway. We also identified frequent and diverse somatic aberrations in genes described traditionally as embryonic regulators of axon guidance, particularly SLIT/ROBO signalling, which was also evident in murine Sleeping Beauty transposon-mediated somatic mutagenesis models of pancreatic cancer, providing further supportive evidence for the potential involvement of axon guidance genes in pancreatic carcinogenesis.

  5. Genome-wide identification of KANADI1 target genes.

    Directory of Open Access Journals (Sweden)

    Paz Merelo

    Full Text Available Plant organ development and polarity establishment is mediated by the action of several transcription factors. Among these, the KANADI (KAN subclade of the GARP protein family plays important roles in polarity-associated processes during embryo, shoot and root patterning. In this study, we have identified a set of potential direct target genes of KAN1 through a combination of chromatin immunoprecipitation/DNA sequencing (ChIP-Seq and genome-wide transcriptional profiling using tiling arrays. Target genes are over-represented for genes involved in the regulation of organ development as well as in the response to auxin. KAN1 affects directly the expression of several genes previously shown to be important in the establishment of polarity during lateral organ and vascular tissue development. We also show that KAN1 controls through its target genes auxin effects on organ development at different levels: transport and its regulation, and signaling. In addition, KAN1 regulates genes involved in the response to abscisic acid, jasmonic acid, brassinosteroids, ethylene, cytokinins and gibberellins. The role of KAN1 in organ polarity is antagonized by HD-ZIPIII transcription factors, including REVOLUTA (REV. A comparison of their target genes reveals that the REV/KAN1 module acts in organ patterning through opposite regulation of shared targets. Evidence of mutual repression between closely related family members is also shown.

  6. Genomic analysis of primordial dwarfism reveals novel disease genes.

    Science.gov (United States)

    Shaheen, Ranad; Faqeih, Eissa; Ansari, Shinu; Abdel-Salam, Ghada; Al-Hassnan, Zuhair N; Al-Shidi, Tarfa; Alomar, Rana; Sogaty, Sameera; Alkuraya, Fowzan S

    2014-02-01

    Primordial dwarfism (PD) is a disease in which severely impaired fetal growth persists throughout postnatal development and results in stunted adult size. The condition is highly heterogeneous clinically, but the use of certain phenotypic aspects such as head circumference and facial appearance has proven helpful in defining clinical subgroups. In this study, we present the results of clinical and genomic characterization of 16 new patients in whom a broad definition of PD was used (e.g., 3M syndrome was included). We report a novel PD syndrome with distinct facies in two unrelated patients, each with a different homozygous truncating mutation in CRIPT. Our analysis also reveals, in addition to mutations in known PD disease genes, the first instance of biallelic truncating BRCA2 mutation causing PD with normal bone marrow analysis. In addition, we have identified a novel locus for Seckel syndrome based on a consanguineous multiplex family and identified a homozygous truncating mutation in DNA2 as the likely cause. An additional novel PD disease candidate gene XRCC4 was identified by autozygome/exome analysis, and the knockout mouse phenotype is highly compatible with PD. Thus, we add a number of novel genes to the growing list of PD-linked genes, including one which we show to be linked to a novel PD syndrome with a distinct facial appearance. PD is extremely heterogeneous genetically and clinically, and genomic tools are often required to reach a molecular diagnosis.

  7. Microbial environment affects innate immunity in two closely related earthworm species Eisenia andrei and Eisenia fetida.

    Directory of Open Access Journals (Sweden)

    Jiří Dvořák

    Full Text Available Survival of earthworms in the environment depends on their ability to recognize and eliminate potential pathogens. This work is aimed to compare the innate defense mechanisms of two closely related earthworm species, Eisenia andrei and Eisenia fetida, that inhabit substantially different ecological niches. While E. andrei lives in a compost and manure, E. fetida can be found in the litter layer in forests. Therefore, the influence of environment-specific microbiota on the immune response of both species was followed. Firstly, a reliable method to discern between E. andrei and E. fetida based on species-specific primers for cytochrome c oxidase I (COI and stringent PCR conditions was developed. Secondly, to analyze the immunological profile in both earthworm species, the activity and expression of lysozyme, pattern recognition protein CCF, and antimicrobial proteins with hemolytic function, fetidin and lysenins, have been assessed. Whereas, CCF and lysozyme showed only slight differences in the expression and activity, fetidin/lysenins expression as well as the hemolytic activity was considerably higher in E. andrei as compared to E. fetida. The expression of fetidin/lysenins in E. fetida was not affected upon the challenge with compost microbiota, suggesting more substantial changes in the regulation of the gene expression. Genomic DNA analyses revealed significantly higher level of fetidin/lysenins (determined using universal primer pairs in E. andrei compared to E. fetida. It can be hypothesized that E. andrei colonizing compost as a new habitat acquired an evolutionary selection advantage resulting in a higher expression of antimicrobial proteins.

  8. Genomic and gene variation in Mycoplasma hominis strains

    DEFF Research Database (Denmark)

    Christiansen, Gunna; Andersen, H; Birkelund, Svend

    1987-01-01

    DNAs from 14 strains of Mycoplasma hominis isolated from various habitats, including strain PG21, were analyzed for genomic heterogeneity. DNA-DNA filter hybridization values were from 51 to 91%. Restriction endonuclease digestion patterns, analyzed by agarose gel electrophoresis, revealed...... no identity or cluster formation between strains. Variation within M. hominis rRNA genes was analyzed by Southern hybridization of EcoRI-cleaved DNA hybridized with a cloned fragment of the rRNA gene from the mycoplasma strain PG50. Five of the M. hominis strains showed identical hybridization patterns....... These hybridization patterns were compared with those of 12 other mycoplasma species, which showed a much more complex band pattern. Cloned nonribosomal RNA gene fragments of M. hominis PG21 DNA were analyzed, and the fragments were used to demonstrate heterogeneity among the strains. A monoclonal antibody against...

  9. Combining genetical genomics and bulked segregant analysis differential expression: an approach to gene localization

    NARCIS (Netherlands)

    Chen, Xinwei; Hedley, P.E.; Morris, J.; Liu, Hui; Niks, R.E.; Waugh, R.

    2011-01-01

    Positional gene isolation in unsequenced species generally requires either a reference genome sequence or an inference of gene content based on conservation of synteny with a genomic model. In the large unsequenced genomes of the Triticeae cereals the latter, i.e. conservation of synteny with the

  10. Mapping our genes: The genome projects: How big, how fast

    Energy Technology Data Exchange (ETDEWEB)

    none,

    1988-04-01

    For the past 2 years, scientific and technical journals in biology and medicine have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part of the public agenda. The debate involves science, technology, and politics. Congress is responsible for /open quotes/writing the rules/close quotes/ of what various federal agencies do and for funding their work. This report surveys the points made so far in the debate, focusing on those that most directly influence the policy options facing the US Congress. Congressional interest focused on how to assess the rationales for conducting human genome projects, how to fund human genome projects (at what level and through which mechanisms), how to coordinate the scientific and technical programs of the several federal agencies and private interests already supporting various genome projects, and how to strike a balance regarding the impact of genome projects on international scientific cooperation and international economic competition in biotechnology. OTA prepared this report with the assistance of several hundred experts throughout the world. 342 refs., 26 figs., 11 tabs.

  11. Mapping Our Genes: The Genome Projects: How Big, How Fast

    Science.gov (United States)

    1988-04-01

    For the past 2 years, scientific and technical journals in biology and medicine have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part of the public agenda. The debate involves science, technology, and politics. Congress is responsible for ?writing the rules? of what various federal agencies do and for funding their work. This report surveys the points made so far in the debate, focusing on those that most directly influence the policy options facing the US Congress. Congressional interest focused on how to assess the rationales for conducting human genome projects, how to fund human genome projects (at what level and through which mechanisms), how to coordinate the scientific and technical programs of the several federal agencies and private interests already supporting various genome projects, and how to strike a balance regarding the impact of genome projects on international scientific cooperation and international economic competition in biotechnology. The Office of Technology Assessment (OTA) prepared this report with the assistance of several hundred experts throughout the world.

  12. Census of solo LuxR genes in prokaryotic genomes.

    Science.gov (United States)

    Hudaiberdiev, Sanjarbek; Choudhary, Kumari S; Vera Alvarez, Roberto; Gelencsér, Zsolt; Ligeti, Balázs; Lamba, Doriano; Pongor, Sándor

    2015-01-01

    luxR genes encode transcriptional regulators that control acyl homoserine lactone-based quorum sensing (AHL QS) in Gram negative bacteria. On the bacterial chromosome, luxR genes are usually found next or near to a luxI gene encoding the AHL signal synthase. Recently, a number of luxR genes were described that have no luxI genes in their vicinity on the chromosome. These so-called solo luxR genes may either respond to internal AHL signals produced by a non-adjacent luxI in the chromosome, or can respond to exogenous signals. Here we present a survey of solo luxR genes found in complete and draft bacterial genomes in the NCBI databases using HMMs. We found that 2698 of the 3550 luxR genes found are solos, which is an unexpectedly high number even if some of the hits may be false positives. We also found that solo LuxR sequences form distinct clusters that are different from the clusters of LuxR sequences that are part of the known luxR-luxI topological arrangements. We also found a number of cases that we termed twin luxR topologies, in which two adjacent luxR genes were in tandem or divergent orientation. Many of the luxR solo clusters were devoid of the sequence motifs characteristic of AHL binding LuxR proteins so there is room to speculate that the solos may be involved in sensing hitherto unknown signals. It was noted that only some of the LuxR clades are rich in conserved cysteine residues. Molecular modeling suggests that some of the cysteines may be involved in disulfide formation, which makes us speculate that some LuxR proteins, including some of the solos may be involved in redox regulation.

  13. Rapid genome reshaping by multiple-gene loss after whole-genome duplication in teleost fish suggested by mathematical modeling

    Science.gov (United States)

    Sato, Yukuto; Tsukamoto, Katsumi; Nishida, Mutsumi

    2015-01-01

    Whole-genome duplication (WGD) is believed to be a significant source of major evolutionary innovation. Redundant genes resulting from WGD are thought to be lost or acquire new functions. However, the rates of gene loss and thus temporal process of genome reshaping after WGD remain unclear. The WGD shared by all teleost fish, one-half of all jawed vertebrates, was more recent than the two ancient WGDs that occurred before the origin of jawed vertebrates, and thus lends itself to analysis of gene loss and genome reshaping. Using a newly developed orthology identification pipeline, we inferred the post–teleost-specific WGD evolutionary histories of 6,892 protein-coding genes from nine phylogenetically representative teleost genomes on a time-calibrated tree. We found that rapid gene loss did occur in the first 60 My, with a loss of more than 70–80% of duplicated genes, and produced similar genomic gene arrangements within teleosts in that relatively short time. Mathematical modeling suggests that rapid gene loss occurred mainly by events involving simultaneous loss of multiple genes. We found that the subsequent 250 My were characterized by slow and steady loss of individual genes. Our pipeline also identified about 1,100 shared single-copy genes that are inferred to have become singletons before the divergence of clupeocephalan teleosts. Therefore, our comparative genome analysis suggests that rapid gene loss just after the WGD reshaped teleost genomes before the major divergence, and provides a useful set of marker genes for future phylogenetic analysis. PMID:26578810

  14. Genome sequencing and comparative genomics reveal a repertoire of putative pathogenicity genes in chilli anthracnose fungus Colletotrichum truncatum.

    Science.gov (United States)

    Rao, Soumya; Nandineni, Madhusudan R

    2017-01-01

    Colletotrichum truncatum, a major fungal phytopathogen, causes the anthracnose disease on an economically important spice crop chilli (Capsicum annuum), resulting in huge economic losses in tropical and sub-tropical countries. It follows a subcuticular intramural infection strategy on chilli with a short, asymptomatic, endophytic phase, which contrasts with the intracellular hemibiotrophic lifestyle adopted by most of the Colletotrichum species. However, little is known about the molecular determinants and the mechanism of pathogenicity in this fungus. A high quality whole genome sequence and gene annotation based on transcriptome data of an Indian isolate of C. truncatum from chilli has been obtained. Analysis of the genome sequence revealed a rich repertoire of pathogenicity genes in C. truncatum encoding secreted proteins, effectors, plant cell wall degrading enzymes, secondary metabolism associated proteins, with potential roles in the host-specific infection strategy, placing it next only to the Fusarium species. The size of genome assembly, number of predicted genes and some of the functional categories were similar to other sequenced Colletotrichum species. The comparative genomic analyses with other species and related fungi identified some unique genes and certain highly expanded gene families of CAZymes, proteases and secondary metabolism associated genes in the genome of C. truncatum. The draft genome assembly and functional annotation of potential pathogenicity genes of C. truncatum provide an important genomic resource for understanding the biology and lifestyle of this important phytopathogen and will pave the way for designing efficient disease control regimens.

  15. Genes encoding calmodulin-binding proteins in the Arabidopsis genome

    Science.gov (United States)

    Reddy, Vaka S.; Ali, Gul S.; Reddy, Anireddy S N.

    2002-01-01

    Analysis of the recently completed Arabidopsis genome sequence indicates that approximately 31% of the predicted genes could not be assigned to functional categories, as they do not show any sequence similarity with proteins of known function from other organisms. Calmodulin (CaM), a ubiquitous and multifunctional Ca(2+) sensor, interacts with a wide variety of cellular proteins and modulates their activity/function in regulating diverse cellular processes. However, the primary amino acid sequence of the CaM-binding domain in different CaM-binding proteins (CBPs) is not conserved. One way to identify most of the CBPs in the Arabidopsis genome is by protein-protein interaction-based screening of expression libraries with CaM. Here, using a mixture of radiolabeled CaM isoforms from Arabidopsis, we screened several expression libraries prepared from flower meristem, seedlings, or tissues treated with hormones, an elicitor, or a pathogen. Sequence analysis of 77 positive clones that interact with CaM in a Ca(2+)-dependent manner revealed 20 CBPs, including 14 previously unknown CBPs. In addition, by searching the Arabidopsis genome sequence with the newly identified and known plant or animal CBPs, we identified a total of 27 CBPs. Among these, 16 CBPs are represented by families with 2-20 members in each family. Gene expression analysis revealed that CBPs and CBP paralogs are expressed differentially. Our data suggest that Arabidopsis has a large number of CBPs including several plant-specific ones. Although CaM is highly conserved between plants and animals, only a few CBPs are common to both plants and animals. Analysis of Arabidopsis CBPs revealed the presence of a variety of interesting domains. Our analyses identified several hypothetical proteins in the Arabidopsis genome as CaM targets, suggesting their involvement in Ca(2+)-mediated signaling networks.

  16. The compact Selaginella genome identifies changes in gene content associated with the evolution of vascular plants

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor V.; Banks, Jo Ann; Nishiyama, Tomoaki; Hasebe, Mitsuyasu; Bowman, John L.; Gribskov, Michael; dePamphilis, Claude; Albert, Victor A.; Aono, Naoki; Aoyama, Tsuyoshi; Ambrose, Barbara A.; Ashton, Neil W.; Axtell, Michael J.; Barker, Elizabeth; Barker, Michael S.; Bennetzen, Jeffrey L.; Bonawitz, Nicholas D.; Chapple, Clint; Cheng, Chaoyang; Correa, Luiz Gustavo Guedes; Dacre, Michael; DeBarry, Jeremy; Dreyer, Ingo; Elias, Marek; Engstrom, Eric M.; Estelle, Mark; Feng, Liang; Finet, Cedric; Floyd, Sandra K.; Frommer, Wolf B.; Fujita, Tomomichi; Gramzow, Lydia; Gutensohn, Michael; Harholt, Jesper; Hattori, Mitsuru; Heyl, Alexander; Hirai, Tadayoshi; Hiwatashi, Yuji; Ishikawa, Masaki; Iwata, Mineko; Karol, Kenneth G.; Koehler, Barbara; Kolukisaoglu, Uener; Kubo, Minoru; Kurata, Tetsuya; Lalonde, Sylvie; Li, Kejie; Li, Ying; Litt, Amy; Lyons, Eric; Manning, Gerard; Maruyama, Takeshi; Michael, Todd P.; Mikami, Koji; Miyazaki, Saori; Morinaga, Shin-ichi; Murata, Takashi; Mueller-Roeber, Bernd; Nelson, David R.; Obara, Mari; Oguri, Yasuko; Olmstead, Richard G.; Onodera, Naoko; Petersen, Bent Larsen; Pils, Birgit; Prigge, Michael; Rensing, Stefan A.; Riano-Pachon, Diego Mauricio; Roberts, Alison W.; Sato, Yoshikatsu; Scheller, Henrik Vibe; Schulz, Burkhard; Schulz, Christian; Shakirov, Eugene V.; Shibagaki, Nakako; Shinohara, Naoki; Shippen, Dorothy E.; Sorensen, Iben; Sotooka, Ryo; Sugimoto, Nagisa; Sugita, Mamoru; Sumikawa, Naomi; Tanurdzic, Milos; Theilsen, Gunter; Ulvskov, Peter; Wakazuki, Sachiko; Weng, Jing-Ke; Willats, William W.G.T.; Wipf, Daniel; Wolf, Paul G.; Yang, Lixing; Zimmer, Andreas D.; Zhu, Qihui; Mitros, Therese; Hellsten, Uffe; Loque, Dominique; Otillar, Robert; Salamov, Asaf; Schmutz, Jeremy; Shapiro, Harris; Lindquist, Erika; Lucas, Susan; Rokhsar, Daniel

    2011-04-28

    We report the genome sequence of the nonseed vascular plant, Selaginella moellendorffii, and by comparative genomics identify genes that likely played important roles in the early evolution of vascular plants and their subsequent evolution

  17. Effects of earthworms on nitrogen mineralization.

    NARCIS (Netherlands)

    Willems, J.J.G.M.; Marinissen, J.C.Y.; Blair, J.

    1996-01-01

    The influence of earthworms (Lumbricus terrestris and Aporrectodea tuberculata) on the rate of net N mineralization was studied, both in soil with intact soil structure (partly influenced by past earthworm activity) and in columns with sieved soil

  18. For Better Soil, Let Earthworms Toil.

    Science.gov (United States)

    Swinehart, Rebecca, Ed.

    1995-01-01

    This activity involves elementary students in investigating how earthworms affect soil fertility. An introduction discusses topsoil loss and the connections between soil and earthworm ecology. Materials needed and step-by-step procedure are provided. (LZ)

  19. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions

    Science.gov (United States)

    Zhang, Yan-Cong; Lin, Kui

    2015-01-01

    Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms. PMID:26715828

  20. Genomic Copy Number Dictates a Gene-Independent Cell Response to CRISPR/Cas9 Targeting | Office of Cancer Genomics

    Science.gov (United States)

    The CRISPR/Cas9 system enables genome editing and somatic cell genetic screens in mammalian cells. We performed genome-scale loss-of-function screens in 33 cancer cell lines to identify genes essential for proliferation/survival and found a strong correlation between increased gene copy number and decreased cell viability after genome editing. Within regions of copy-number gain, CRISPR/Cas9 targeting of both expressed and unexpressed genes, as well as intergenic loci, led to significantly decreased cell proliferation through induction of a G2 cell-cycle arrest.

  1. Gene prediction and RFX transcriptional regulation analysis using comparative genomics

    OpenAIRE

    Chu, Jeffrey Shih Chieh

    2011-01-01

    Regulatory Factor X (RFX) is a family of transcription factors (TF) that is conserved in all metazoans, in some fungi, and in only a few single-cellular organisms. Seven members are found in mammals, nine in fishes, three in fruit flies, and a single member in nematodes and fungi. RFX is involved in many different roles in humans, but a particular function that is conserved in many metazoans is its regulation of ciliogenesis. Probing over 150 genomes for the presence of RFX and ciliary genes ...

  2. Cognitive genomics: Linking genes to behavior in the human brain

    Directory of Open Access Journals (Sweden)

    Genevieve Konopka

    2017-02-01

    Full Text Available Correlations of genetic variation in DNA with functional brain activity have already provided a starting point for delving into human cognitive mechanisms. However, these analyses do not provide the specific genes driving the associations, which are complicated by intergenic localization as well as tissue-specific epigenetics and expression. The use of brain-derived expression datasets could build upon the foundation of these initial genetic insights and yield genes and molecular pathways for testing new hypotheses regarding the molecular bases of human brain development, cognition, and disease. Thus, coupling these human brain gene expression data with measurements of brain activity may provide genes with critical roles in brain function. However, these brain gene expression datasets have their own set of caveats, most notably a reliance on postmortem tissue. In this perspective, I summarize and examine the progress that has been made in this realm to date, and discuss the various frontiers remaining, such as the inclusion of cell-type-specific information, additional physiological measurements, and genomic data from patient cohorts.

  3. Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure.

    Science.gov (United States)

    Gordon, Sean P; Contreras-Moreira, Bruno; Woods, Daniel P; Des Marais, David L; Burgess, Diane; Shu, Shengqiang; Stritt, Christoph; Roulin, Anne C; Schackwitz, Wendy; Tyler, Ludmila; Martin, Joel; Lipzen, Anna; Dochy, Niklas; Phillips, Jeremy; Barry, Kerrie; Geuten, Koen; Budak, Hikmet; Juenger, Thomas E; Amasino, Richard; Caicedo, Ana L; Goodstein, David; Davidson, Patrick; Mur, Luis A J; Figueroa, Melania; Freeling, Michael; Catalan, Pilar; Vogel, John P

    2017-12-19

    While prokaryotic pan-genomes have been shown to contain many more genes than any individual organism, the prevalence and functional significance of differentially present genes in eukaryotes remains poorly understood. Whole-genome de novo assembly and annotation of 54 lines of the grass Brachypodium distachyon yield a pan-genome containing nearly twice the number of genes found in any individual genome. Genes present in all lines are enriched for essential biological functions, while genes present in only some lines are enriched for conditionally beneficial functions (e.g., defense and development), display faster evolutionary rates, lie closer to transposable elements and are less likely to be syntenic with orthologous genes in other grasses. Our data suggest that differentially present genes contribute substantially to phenotypic variation within a eukaryote species, these genes have a major influence in population genetics, and transposable elements play a key role in pan-genome evolution.

  4. Gene disruptions using P transposable elements: an integral component of the Drosophila genome project.

    OpenAIRE

    Spradling, A C; Stern, D M; Kiss, I; Roote, J; Laverty, T; Rubin, G M

    1995-01-01

    Biologists require genetic as well as molecular tools to decipher genomic information and ultimately to understand gene function. The Berkeley Drosophila Genome Project is addressing these needs with a massive gene disruption project that uses individual, genetically engineered P transposable elements to target open reading frames throughout the Drosophila genome. DNA flanking the insertions is sequenced, thereby placing an extensive series of genetic markers on the physical genomic map and a...

  5. Comparative genomics of Geobacter chemotaxis genes reveals diverse signaling function

    Directory of Open Access Journals (Sweden)

    Antommattei Frances M

    2008-10-01

    Full Text Available Abstract Background Geobacter species are δ-Proteobacteria and are often the predominant species in a variety of sedimentary environments where Fe(III reduction is important. Their ability to remediate contaminated environments and produce electricity makes them attractive for further study. Cell motility, biofilm formation, and type IV pili all appear important for the growth of Geobacter in changing environments and for electricity production. Recent studies in other bacteria have demonstrated that signaling pathways homologous to the paradigm established for Escherichia coli chemotaxis can regulate type IV pili-dependent motility, the synthesis of flagella and type IV pili, the production of extracellular matrix material, and biofilm formation. The classification of these pathways by comparative genomics improves the ability to understand how Geobacter thrives in natural environments and better their use in microbial fuel cells. Results The genomes of G. sulfurreducens, G. metallireducens, and G. uraniireducens contain multiple (~70 homologs of chemotaxis genes arranged in several major clusters (six, seven, and seven, respectively. Unlike the single gene cluster of E. coli, the Geobacter clusters are not all located near the flagellar genes. The probable functions of some Geobacter clusters are assignable by homology to known pathways; others appear to be unique to the Geobacter sp. and contain genes of unknown function. We identified large numbers of methyl-accepting chemotaxis protein (MCP homologs that have diverse sensing domain architectures and generate a potential for sensing a great variety of environmental signals. We discuss mechanisms for class-specific segregation of the MCPs in the cell membrane, which serve to maintain pathway specificity and diminish crosstalk. Finally, the regulation of gene expression in Geobacter differs from E. coli. The sequences of predicted promoter elements suggest that the alternative sigma factors

  6. FGF: A web tool for Fishing Gene Family in a whole genome database

    DEFF Research Database (Denmark)

    Zheng, Hongkun; Shi, Junjie; Fang, Xiaodong

    2007-01-01

    Gene duplication is an important process in evolution. The availability of genome sequences of a number of organisms has made it possible to conduct comprehensive searches for duplicated genes enabling informative studies of their evolution. We have established the FGF (Fishing Gene Family) progr...... is freely available on a web server at http://fgf.genomics.org.cn/...

  7. Improved methods and resources for paramecium genomics: transcription units, gene annotation and gene expression.

    Science.gov (United States)

    Arnaiz, Olivier; Van Dijk, Erwin; Bétermier, Mireille; Lhuillier-Akakpo, Maoussi; de Vanssay, Augustin; Duharcourt, Sandra; Sallet, Erika; Gouzy, Jérôme; Sperling, Linda

    2017-06-26

    The 15 sibling species of the Paramecium aurelia cryptic species complex emerged after a whole genome duplication that occurred tens of millions of years ago. Given extensive knowledge of the genetics and epigenetics of Paramecium acquired over the last century, this species complex offers a uniquely powerful system to investigate the consequences of whole genome duplication in a unicellular eukaryote as well as the genetic and epigenetic mechanisms that drive speciation. High quality Paramecium gene models are important for research using this system. The major aim of the work reported here was to build an improved gene annotation pipeline for the Paramecium lineage. We generated oriented RNA-Seq transcriptome data across the sexual process of autogamy for the model species Paramecium tetraurelia. We determined, for the first time in a ciliate, candidate P. tetraurelia transcription start sites using an adapted Cap-Seq protocol. We developed TrUC, multi-threaded Perl software that in conjunction with TopHat mapping of RNA-Seq data to a reference genome, predicts transcription units for the annotation pipeline. We used EuGene software to combine annotation evidence. The high quality gene structural annotations obtained for P. tetraurelia were used as evidence to improve published annotations for 3 other Paramecium species. The RNA-Seq data were also used for differential gene expression analysis, providing a gene expression atlas that is more sensitive than the previously established microarray resource. We have developed a gene annotation pipeline tailored for the compact genomes and tiny introns of Paramecium species. A novel component of this pipeline, TrUC, predicts transcription units using Cap-Seq and oriented RNA-Seq data. TrUC could prove useful beyond Paramecium, especially in the case of high gene density. Accurate predictions of 3' and 5' UTR will be particularly valuable for studies of gene expression (e.g. nucleosome positioning, identification of cis

  8. The genome of Chelonid herpesvirus 5 harbors atypical genes

    Science.gov (United States)

    Ackermann, Mathias; Koriabine, Maxim; Hartmann-Fritsch, Fabienne; de Jong, Pieter J.; Lewis, Teresa D.; Schetle, Nelli; Work, Thierry M.; Dagenais, Julie; Balazs, George H.; Leong, Jo-Ann C.

    2012-01-01

    The Chelonid fibropapilloma-associated herpesvirus (CFPHV; ChHV5) is believed to be the causative agent of fibropapillomatosis (FP), a neoplastic disease of marine turtles. While clinical signs and pathology of FP are well known, research on ChHV5 has been impeded because no cell culture system for its propagation exists. We have cloned a BAC containing ChHV5 in pTARBAC2.1 and determined its nucleotide sequence. Accordingly, ChHV5 has a type D genome and its predominant gene order is typical for the varicellovirus genus within thealphaherpesvirinae. However, at least four genes that are atypical for an alphaherpesvirus genome were also detected, i.e. two members of the C-type lectin-like domain superfamily (F-lec1, F-lec2), an orthologue to the mouse cytomegalovirus M04 (F-M04) and a viral sialyltransferase (F-sial). Four lines of evidence suggest that these atypical genes are truly part of the ChHV5 genome: (1) the pTARBAC insertion interrupted the UL52 ORF, leaving parts of the gene to either side of the insertion and suggesting that an intact molecule had been cloned. (2) Using FP-associated UL52 (F-UL52) as an anchor and the BAC-derived sequences as a means to generate primers, overlapping PCR was performed with tumor-derived DNA as template, which confirmed the presence of the same stretch of “atypical” DNA in independent FP cases. (3) Pyrosequencing of DNA from independent tumors did not reveal previously undetected viral sequences, suggesting that no apparent loss of viral sequence had happened due to the cloning strategy. (4) The simultaneous presence of previously known ChHV5 sequences and F-sial as well as F-M04 sequences was also confirmed in geographically distinct Australian cases of FP. Finally, transcripts of F-sial and F-M04 but not transcripts of lytic viral genes were detected in tumors from Hawaiian FP-cases. Therefore, we suggest that F-sial and F-M04 may play a role in FP pathogenesis

  9. Detection of earthworm prey by Ruff

    NARCIS (Netherlands)

    Onrust, J.; Loonstra, A.H.J.; Schmaltz, L.E.; Verkuil, Y.I.; Hooijmeijer, J.C.E.W.; Piersma, T.

    2017-01-01

    Ruff Philomachus pugnax staging in the Netherlands forage in agricultural grasslands,where they mainly eat earthworms (Lumbricidae). Food intake and the surface availabilityof earthworms were studied in dairy farmland of southwest Friesland in March–April2011. Daily changes in earthworm availability

  10. Antimicrobial activity of earthworm ( Eudrilus eugeniae ) paste ...

    African Journals Online (AJOL)

    Earthworm plays a major role in the proper functioning of the soil ecosystem. It acts as scavenger and helps in recycling of dead and decayed plant material by feeding on them. Earthworm increases the soil fertility and is often referred to as a farmer's friend. Earthworms have been used in medicine for various remedies.

  11. Ecological functions of earthworms in soil

    NARCIS (Netherlands)

    Andriuzzi, W.S.

    2015-01-01

    Ecological functions of earthworms in soil

    Walter S. Andriuzzi

    Abstract

    Earthworms are known to play an important role in soil structure and fertility, but there are still big knowledge gaps on the functional ecology of distinct earthworm species, on their

  12. Genome Enabled Discovery of Carbon Sequestration Genes in Poplar

    Energy Technology Data Exchange (ETDEWEB)

    Filichkin, Sergei; Etherington, Elizabeth; Ma, Caiping; Strauss, Steve

    2007-02-22

    The goals of the S.H. Strauss laboratory portion of 'Genome-enabled discovery of carbon sequestration genes in poplar' are (1) to explore the functions of candidate genes using Populus transformation by inserting genes provided by Oakridge National Laboratory (ORNL) and the University of Florida (UF) into poplar; (2) to expand the poplar transformation toolkit by developing transformation methods for important genotypes; and (3) to allow induced expression, and efficient gene suppression, in roots and other tissues. As part of the transformation improvement effort, OSU developed transformation protocols for Populus trichocarpa 'Nisqually-1' clone and an early flowering P. alba clone, 6K10. Complete descriptions of the transformation systems were published (Ma et. al. 2004, Meilan et. al 2004). Twenty-one 'Nisqually-1' and 622 6K10 transgenic plants were generated. To identify root predominant promoters, a set of three promoters were tested for their tissue-specific expression patterns in poplar and in Arabidopsis as a model system. A novel gene, ET304, was identified by analyzing a collection of poplar enhancer trap lines generated at OSU (Filichkin et. al 2006a, 2006b). Other promoters include the pGgMT1 root-predominant promoter from Casuarina glauca and the pAtPIN2 promoter from Arabidopsis root specific PIN2 gene. OSU tested two induction systems, alcohol- and estrogen-inducible, in multiple poplar transgenics. Ethanol proved to be the more efficient when tested in tissue culture and greenhouse conditions. Two estrogen-inducible systems were evaluated in transgenic Populus, neither of which functioned reliably in tissue culture conditions. GATEWAY-compatible plant binary vectors were designed to compare the silencing efficiency of homologous (direct) RNAi vs. heterologous (transitive) RNAi inverted repeats. A set of genes was targeted for post transcriptional silencing in the model Arabidopsis system; these include the floral

  13. Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes.

    Directory of Open Access Journals (Sweden)

    Yunsheng Wang

    Full Text Available In this study, we identified and compared nucleotide-binding site (NBS domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China. Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR domain and two different Non-TIR groups in which most of proteins contain the Coiled Coil (CC domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes were shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention.

  14. Recognizing genes and other components of genomic structure

    Energy Technology Data Exchange (ETDEWEB)

    Burks, C. (Los Alamos National Lab., NM (USA)); Myers, E. (Arizona Univ., Tucson, AZ (USA). Dept. of Computer Science); Stormo, G.D. (Colorado Univ., Boulder, CO (USA). Dept. of Molecular, Cellular and Developmental Biology)

    1991-01-01

    The Aspen Center for Physics (ACP) sponsored a three-week workshop, with 26 scientists participating, from 28 May to 15 June, 1990. The workshop, entitled Recognizing Genes and Other Components of Genomic Structure, focussed on discussion of current needs and future strategies for developing the ability to identify and predict the presence of complex functional units on sequenced, but otherwise uncharacterized, genomic DNA. We addressed the need for computationally-based, automatic tools for synthesizing available data about individual consensus sequences and local compositional patterns into the composite objects (e.g., genes) that are -- as composite entities -- the true object of interest when scanning DNA sequences. The workshop was structured to promote sustained informal contact and exchange of expertise between molecular biologists, computer scientists, and mathematicians. No participant stayed for less than one week, and most attended for two or three weeks. Computers, software, and databases were available for use as electronic blackboards'' and as the basis for collaborative exploration of ideas being discussed and developed at the workshop. 23 refs., 2 tabs.

  15. Comparative genome analysis and resistance gene mapping in grain legumes

    International Nuclear Information System (INIS)

    Young, N.D.

    1998-01-01

    Using, DNA markers and genome organization, several important disease resistance genes have been analyzed in mungbean (Vigna radiata), cowpea (Vigna unguiculata), common bean (Phaseolus vulgaris), and soybean (Glycine max). In the process, medium-density linkage maps consisting of restriction fragment length polymorphism (RFLP) markers were constructed for both mungbean and cowpea. Comparisons between these maps, as well as the maps of soybean and common bean, indicate that there is significant conservation of DNA marker order, though the conserved blocks in soybean are much shorter than in the others. DNA mapping results also indicate that a gene for seed weight may be conserved between mungbean and cowpea. Using the linkage maps, genes that control bruchid (genus Callosobruchus) and powdery mildew (Erysiphe polygoni) resistance in mungbean, aphid resistance in cowpea (Aphis craccivora), and cyst nematode (Heterodera glycines) resistance in soybean have all been mapped and characterized. For some of these traits resistance was found to be oligogenic and DNA mapping uncovered multiple genes involved in the phenotype. (author)

  16. Plant ion channels: gene families, physiology, and functional genomics analyses.

    Science.gov (United States)

    Ward, John M; Mäser, Pascal; Schroeder, Julian I

    2009-01-01

    Distinct potassium, anion, and calcium channels in the plasma membrane and vacuolar membrane of plant cells have been identified and characterized by patch clamping. Primarily owing to advances in Arabidopsis genetics and genomics, and yeast functional complementation, many of the corresponding genes have been identified. Recent advances in our understanding of ion channel genes that mediate signal transduction and ion transport are discussed here. Some plant ion channels, for example, ALMT and SLAC anion channel subunits, are unique. The majority of plant ion channel families exhibit homology to animal genes; such families include both hyperpolarization- and depolarization-activated Shaker-type potassium channels, CLC chloride transporters/channels, cyclic nucleotide-gated channels, and ionotropic glutamate receptor homologs. These plant ion channels offer unique opportunities to analyze the structural mechanisms and functions of ion channels. Here we review gene families of selected plant ion channel classes and discuss unique structure-function aspects and their physiological roles in plant cell signaling and transport.

  17. On the total number of genes and their length distribution in complete microbial genomes

    DEFF Research Database (Denmark)

    Skovgaard, Marie; Jensen, L.J.; Brunak, Søren

    2001-01-01

    In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length...... distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300...... genes, we show that it probably has only similar to 3800 genes, and that a similar discrepancy exists for almost all published genomes....

  18. OxyGene: an innovative platform for investigating oxidative-response genes in whole prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Barloy-Hubler Frédérique

    2008-12-01

    Full Text Available Abstract Background Oxidative stress is a common stress encountered by living organisms and is due to an imbalance between intracellular reactive oxygen and nitrogen species (ROS, RNS and cellular antioxidant defence. To defend themselves against ROS/RNS, bacteria possess a subsystem of detoxification enzymes, which are classified with regard to their substrates. To identify such enzymes in prokaryotic genomes, different approaches based on similarity, enzyme profiles or patterns exist. Unfortunately, several problems persist in the annotation, classification and naming of these enzymes due mainly to some erroneous entries in databases, mistake propagation, absence of updating and disparity in function description. Description In order to improve the current annotation of oxidative stress subsystems, an innovative platform named OxyGene has been developed. It integrates an original database called OxyDB, holding thoroughly tested anchor-based signatures associated to subfamilies of oxidative stress enzymes, and a new anchor-driven annotator, for ab initio detection of ROS/RNS response genes. All complete Bacterial and Archaeal genomes have been re-annotated, and the results stored in the OxyGene repository can be interrogated via a Graphical User Interface. Conclusion OxyGene enables the exploration and comparative analysis of enzymes belonging to 37 detoxification subclasses in 664 microbial genomes. It proposes a new classification that improves both the ontology and the annotation of the detoxification subsystems in prokaryotic whole genomes, while discovering new ORFs and attributing precise function to hypothetical annotated proteins. OxyGene is freely available at: http://www.umr6026.univ-rennes1.fr/english/home/research/basic/software

  19. Population genomics of the Arabidopsis thaliana flowering time gene network.

    Science.gov (United States)

    Flowers, Jonathan M; Hanzawa, Yoshie; Hall, Megan C; Moore, Richard C; Purugganan, Michael D

    2009-11-01

    The time to flowering is a key component of the life-history strategy of the model plant Arabidopsis thaliana that varies quantitatively among genotypes. A significant problem for evolutionary and ecological genetics is to understand how natural selection may operate on this ecologically significant trait. Here, we conduct a population genomic study of resequencing data from 52 genes in the flowering time network. McDonald-Kreitman tests of neutrality suggested a strong excess of amino acid polymorphism when pooling across loci. This excess of replacement polymorphism across the flowering time network and a skewed derived frequency spectrum toward rare alleles for both replacement and noncoding polymorphisms relative to synonymous changes is consistent with a large class of deleterious polymorphisms segregating in these genes. Assuming selective neutrality of synonymous changes, we estimate that approximately 30% of amino acid polymorphisms are deleterious. Evidence of adaptive substitution is less prominent in our analysis. The photoperiod regulatory gene, CO, and a gibberellic acid transcription factor, AtMYB33, show evidence of adaptive fixation of amino acid mutations. A test for extended haplotypes revealed no examples of flowering time alleles with haplotypes comparable in length to those associated with the null fri(Col) allele reported previously. This suggests that the FRI gene likely has a uniquely intense or recent history of selection among the flowering time genes considered here. Although there is some evidence for adaptive evolution in these life-history genes, it appears that slightly deleterious polymorphisms are a major component of natural molecular variation in the flowering time network of A. thaliana.

  20. Constraints on genome dynamics revealed from gene distribution among the Ralstonia solanacearum species.

    Directory of Open Access Journals (Sweden)

    Pierre Lefeuvre

    Full Text Available Because it is suspected that gene content may partly explain host adaptation and ecology of pathogenic bacteria, it is important to study factors affecting genome composition and its evolution. While recent genomic advances have revealed extremely large pan-genomes for some bacterial species, it remains difficult to predict to what extent gene pool is accessible within or transferable between populations. As genomes bear imprints of the history of the organisms, gene distribution pattern analyses should provide insights into the forces and factors at play in the shaping and maintaining of bacterial genomes. In this study, we revisited the data obtained from a previous CGH microarrays analysis in order to assess the genomic plasticity of the R. solanacearum species complex. Gene distribution analyses demonstrated the remarkably dispersed genome of R. solanacearum with more than half of the genes being accessory. From the reconstruction of the ancestral genomes compositions, we were able to infer the number of gene gain and loss events along the phylogeny. Analyses of gene movement patterns reveal that factors associated with gene function, genomic localization and ecology delineate gene flow patterns. While the chromosome displayed lower rates of movement, the megaplasmid was clearly associated with hot-spots of gene gain and loss. Gene function was also confirmed to be an essential factor in gene gain and loss dynamics with significant differences in movement patterns between different COG categories. Finally, analyses of gene distribution highlighted possible highways of horizontal gene transfer. Due to sampling and design bias, we can only speculate on factors at play in this gene movement dynamic. Further studies examining precise conditions that favor gene transfer would provide invaluable insights in the fate of bacteria, species delineation and the emergence of successful pathogens.

  1. Complete genome sequence of Brachyspira intermedia reveals unique genomic features in Brachyspira species and phage-mediated horizontal gene transfer

    Science.gov (United States)

    2011-01-01

    Background Brachyspira spp. colonize the intestines of some mammalian and avian species and show different degrees of enteropathogenicity. Brachyspira intermedia can cause production losses in chickens and strain PWS/AT now becomes the fourth genome to be completed in the genus Brachyspira. Results 15 classes of unique and shared genes were analyzed in B. intermedia, B. murdochii, B. hyodysenteriae and B. pilosicoli. The largest number of unique genes was found in B. intermedia and B. murdochii. This indicates the presence of larger pan-genomes. In general, hypothetical protein annotations are overrepresented among the unique genes. A 3.2 kb plasmid was found in B. intermedia strain PWS/AT. The plasmid was also present in the B. murdochii strain but not in nine other Brachyspira isolates. Within the Brachyspira genomes, genes had been translocated and also frequently switched between leading and lagging strands, a process that can be followed by different AT-skews in the third positions of synonymous codons. We also found evidence that bacteriophages were being remodeled and genes incorporated into them. Conclusions The accessory gene pool shapes species-specific traits. It is also influenced by reductive genome evolution and horizontal gene transfer. Gene-transfer events can cross both species and genus boundaries and bacteriophages appear to play an important role in this process. A mechanism for horizontal gene transfer appears to be gene translocations leading to remodeling of bacteriophages in combination with broad tropism. PMID:21816042

  2. Analysis of the Genome and Chromium Metabolism-Related Genes of Serratia sp. S2.

    Science.gov (United States)

    Dong, Lanlan; Zhou, Simin; He, Yuan; Jia, Yan; Bai, Qunhua; Deng, Peng; Gao, Jieying; Li, Yingli; Xiao, Hong

    2018-05-01

    This study is to investigate the genome sequence of Serratia sp. S2. The genomic DNA of Serratia sp. S2 was extracted and the sequencing library was constructed. The sequencing was carried out by Illumina 2000 and complete genomic sequences were obtained. Gene function annotation and bioinformatics analysis were performed by comparing with the known databases. The genome size of Serratia sp. S2 was 5,604,115 bp and the G+C content was 57.61%. There were 5373 protein coding genes, and 3732, 3614, and 3942 genes were respectively annotated into the GO, KEGG, and COG databases. There were 12 genes related to chromium metabolism in the Serratia sp. S2 genome. The whole genome sequence of Serratia sp. S2 is submitted to the GenBank database with gene accession number of LNRP00000000. Our findings may provide theoretical basis for the subsequent development of new biotechnology to repair environmental chromium pollution.

  3. Global Metabolic Reconstruction and Metabolic Gene Evolution in the Cattle Genome

    Science.gov (United States)

    Kim, Woonsu; Park, Hyesun; Seo, Seongwon

    2016-01-01

    The sequence of cattle genome provided a valuable opportunity to systematically link genetic and metabolic traits of cattle. The objectives of this study were 1) to reconstruct genome-scale cattle-specific metabolic pathways based on the most recent and updated cattle genome build and 2) to identify duplicated metabolic genes in the cattle genome for better understanding of metabolic adaptations in cattle. A bioinformatic pipeline of an organism for amalgamating genomic annotations from multiple sources was updated. Using this, an amalgamated cattle genome database based on UMD_3.1, was created. The amalgamated cattle genome database is composed of a total of 33,292 genes: 19,123 consensus genes between NCBI and Ensembl databases, 8,410 and 5,493 genes only found in NCBI or Ensembl, respectively, and 266 genes from NCBI scaffolds. A metabolic reconstruction of the cattle genome and cattle pathway genome database (PGDB) was also developed using Pathway Tools, followed by an intensive manual curation. The manual curation filled or revised 68 pathway holes, deleted 36 metabolic pathways, and added 23 metabolic pathways. Consequently, the curated cattle PGDB contains 304 metabolic pathways, 2,460 reactions including 2,371 enzymatic reactions, and 4,012 enzymes. Furthermore, this study identified eight duplicated genes in 12 metabolic pathways in the cattle genome compared to human and mouse. Some of these duplicated genes are related with specific hormone biosynthesis and detoxifications. The updated genome-scale metabolic reconstruction is a useful tool for understanding biology and metabolic characteristics in cattle. There has been significant improvements in the quality of cattle genome annotations and the MetaCyc database. The duplicated metabolic genes in the cattle genome compared to human and mouse implies evolutionary changes in the cattle genome and provides a useful information for further research on understanding metabolic adaptations of cattle. PMID

  4. Soybean (Glycine max) SWEET gene family: insights through comparative genomics, transcriptome profiling and whole genome re-sequence analysis.

    Science.gov (United States)

    Patil, Gunvant; Valliyodan, Babu; Deshmukh, Rupesh; Prince, Silvas; Nicander, Bjorn; Zhao, Mingzhe; Sonah, Humira; Song, Li; Lin, Li; Chaudhary, Juhi; Liu, Yang; Joshi, Trupti; Xu, Dong; Nguyen, Henry T

    2015-07-11

    SWEET (MtN3_saliva) domain proteins, a recently identified group of efflux transporters, play an indispensable role in sugar efflux, phloem loading, plant-pathogen interaction and reproductive tissue development. The SWEET gene family is predominantly studied in Arabidopsis and members of the family are being investigated in rice. To date, no transcriptome or genomics analysis of soybean SWEET genes has been reported. In the present investigation, we explored the evolutionary aspect of the SWEET gene family in diverse plant species including primitive single cell algae to angiosperms with a major emphasis on Glycine max. Evolutionary features showed expansion and duplication of the SWEET gene family in land plants. Homology searches with BLAST tools and Hidden Markov Model-directed sequence alignments identified 52 SWEET genes that were mapped to 15 chromosomes in the soybean genome as tandem duplication events. Soybean SWEET (GmSWEET) genes showed a wide range of expression profiles in different tissues and developmental stages. Analysis of public transcriptome data and expression profiling using quantitative real time PCR (qRT-PCR) showed that a majority of the GmSWEET genes were confined to reproductive tissue development. Several natural genetic variants (non-synonymous SNPs, premature stop codons and haplotype) were identified in the GmSWEET genes using whole genome re-sequencing data analysis of 106 soybean genotypes. A significant association was observed between SNP-haplogroup and seed sucrose content in three gene clusters on chromosome 6. Present investigation utilized comparative genomics, transcriptome profiling and whole genome re-sequencing approaches and provided a systematic description of soybean SWEET genes and identified putative candidates with probable roles in the reproductive tissue development. Gene expression profiling at different developmental stages and genomic variation data will aid as an important resource for the soybean research

  5. Tree Species Identity Shapes Earthworm Communities

    DEFF Research Database (Denmark)

    Schelfhout, Stephanie; Mertens, Jan; Verheyen, Kris

    2017-01-01

    Earthworms are key organisms in forest ecosystems because they incorporate organic material into the soil and affect the activity of other soil organisms. Here, we investigated how tree species affect earthworm communities via litter and soil characteristics. In a 36-year old common garden...... of soil and foliar litter, and determined the forest floor turnover rate and the density and biomass of the earthworm species occurring in the stands. Tree species significantly affected earthworm communities via leaf litter and/or soil characteristics. Anecic earthworms were abundant under Fraxinus, Acer...

  6. Gene loss and horizontal gene transfer contributed to the genome evolution of the extreme acidophile Ferrovum

    Directory of Open Access Journals (Sweden)

    Sophie Roxana Ullrich

    2016-05-01

    Full Text Available Acid mine drainage (AMD, associated with active and abandoned mining sites, is a habitat for acidophilic microorganisms that gain energy from the oxidation of reduced sulfur compounds and ferrous iron and that thrive at pH below 4. Members of the recently proposed genus Ferrovum are the first acidophilic iron oxidizers to be described within the Betaproteobacteria. Although they have been detected as typical community members in AMD habitats worldwide, knowledge of their phylogenetic and metabolic diversity is scarce. Genomics approaches appear to be most promising in addressing this lacuna since isolation and cultivation of Ferrovum has proven to be extremely difficult and has so far only been successful for the designated type strain Ferrovum myxofaciens P3G. In this study, the genomes of two novel strains of Ferrovum (PN-J185 and Z-31 derived from water samples of a mine water treatment plant were sequenced. These genomes were compared with those of Ferrovum sp. JA12 that also originated from the mine water treatment plant, and of the type strain (P3G. Phylogenomic scrutiny suggests that the four strains represent three Ferrovum species that cluster in two groups (1 and 2. Comprehensive analysis of their predicted metabolic pathways revealed that these groups harbor characteristic metabolic profiles, notably with respect to motility, chemotaxis, nitrogen metabolism, biofilm formation and their potential strategies to cope with the acidic environment. For example, while the F. myxofaciens strains (group 1 appear to be motile and diazotrophic, the non-motile group 2 strains have the predicted potential to use a greater variety of fixed nitrogen sources. Furthermore, analysis of their genome synteny provides first insights into their genome evolution, suggesting that horizontal gene transfer and genome reduction in the group 2 strains by loss of genes encoding complete metabolic pathways or physiological features contributed to the observed

  7. Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger.

    Science.gov (United States)

    Wright, James C; Sugden, Deana; Francis-McIntyre, Sue; Riba-Garcia, Isabel; Gaskell, Simon J; Grigoriev, Igor V; Baker, Scott E; Beynon, Robert J; Hubbard, Simon J

    2009-02-04

    Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate over a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institutes (JGI) and another predicted protein set from another A.niger sequence. Tandem mass spectra (MS/MS) were acquired from 1d gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS) and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR). 405 identified peptide sequences were mapped to 214 different A.niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6%) of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST) data has been. A comparison of the published genome from another strain of A.niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method.

  8. Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis

    Directory of Open Access Journals (Sweden)

    Ueki Masao

    2012-05-01

    Full Text Available Abstract Background Genome-wide gene-gene interaction analysis using single nucleotide polymorphisms (SNPs is an attractive way for identification of genetic components that confers susceptibility of human complex diseases. Individual hypothesis testing for SNP-SNP pairs as in common genome-wide association study (GWAS however involves difficulty in setting overall p-value due to complicated correlation structure, namely, the multiple testing problem that causes unacceptable false negative results. A large number of SNP-SNP pairs than sample size, so-called the large p small n problem, precludes simultaneous analysis using multiple regression. The method that overcomes above issues is thus needed. Results We adopt an up-to-date method for ultrahigh-dimensional variable selection termed the sure independence screening (SIS for appropriate handling of numerous number of SNP-SNP interactions by including them as predictor variables in logistic regression. We propose ranking strategy using promising dummy coding methods and following variable selection procedure in the SIS method suitably modified for gene-gene interaction analysis. We also implemented the procedures in a software program, EPISIS, using the cost-effective GPGPU (General-purpose computing on graphics processing units technology. EPISIS can complete exhaustive search for SNP-SNP interactions in standard GWAS dataset within several hours. The proposed method works successfully in simulation experiments and in application to real WTCCC (Wellcome Trust Case–control Consortium data. Conclusions Based on the machine-learning principle, the proposed method gives powerful and flexible genome-wide search for various patterns of gene-gene interaction.

  9. Can earthworms survive fire retardants?

    Science.gov (United States)

    Beyer, W.N.; Olson, A.

    1996-01-01

    Most common fire retardants are foams or are similar to common agricultural fertilizers, such as ammonium sulfate and ammonium phosphate. Although fire retardants are widely applied to soils, we lack basic information about their toxicities to soil organisms. We measured the toxicity of five fire retardants (Firetrol LCG-R, Firetrol GTS-R, Silv-Ex Foam Concentrate, Phos-chek D-75, and Phos-chek WD-881) to earthworms using the pesticide toxicity test developed for earthworms by the European Economic Community. None was lethal at 1,000 ppm in the soil, which was suggested as a relatively high exposure under normal applications. We concluded that the fire retardants tested are relatively nontoxic to soil organisms compared with other environmental chemicals and that they probably do not reduce earthworm populations when applied under usual firefighting conditions.

  10. Mapping Determinants of Gene Expression Plasticity by Genetical Genomics in C. elegans

    NARCIS (Netherlands)

    Li, Y.; Alda Alvarez, O.; Gutteling, E.W.; Tijsterman, M.; Fu, J.; Riksen, J.A.G.; Hazendonk, E.; Prins, J.C.P.; Plasterk, R.H.A.; Jansen, R.C.; Breitling, R.; Kammenga, J.E.

    2006-01-01

    Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic

  11. Mapping determinants of gene expression plasticity by genetical genomics in C. elegans.

    NARCIS (Netherlands)

    Li, Y.; Alvarez, O.A.; Gutteling, E.W.; Tijsterman, M.; Fu, J.; Riksen, J.A.; Hazendonk, M.G.A.; Prins, P.; Plasterk, R.H.A.; Jansen, R.C.; Breitling, R.; Kammenga, J.E.

    2006-01-01

    Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic

  12. Genome-wide analysis of regions similar to promoters of histone genes

    KAUST Repository

    Chowdhary, Rajesh; Bajic, Vladimir B.; Dong, Difeng; Wong, Limsoon; Liu, Jun S

    2010-01-01

    of histone and histone-coregulated gene transcription initiation. While these hypotheses still remain to be verified, we believe that these form a useful resource for researchers to further explore regulation of human histone genes and human genome

  13. Important Issues in Ecotoxicological Investigations Using Earthworms.

    Science.gov (United States)

    Velki, Mirna; Ečimović, Sandra

    The importance and beneficial effects of earthworms on soil structure and quality is well-established. In addition, earthworms have proved to be important model organisms for investigation of pollutant effects on soil ecosystems. In ecotoxicological investigations effects of various pollutants on earthworms were assessed. But some important issues regarding the effects of pollutants on earthworms still need to be comprehensively addressed. In this review several issues relevant to soil ecotoxicological investigations using earthworms are emphasized and guidelines that should be adopted in ecotoxicological investigations using earthworms are given. The inclusion of these guidelines in ecotoxicological studies will contribute to the better quantification of impacts of pollutants and will allow more accurate prediction of the real field effects of pollutants to earthworms.

  14. Molecular Assemblies, Genes and Genomics Integrated Efficiently (MAGGIE)

    Energy Technology Data Exchange (ETDEWEB)

    Baliga, Nitin S

    2011-05-26

    Final report on MAGGIE. We set ambitious goals to model the functions of individual organisms and their community from molecular to systems scale. These scientific goals are driving the development of sophisticated algorithms to analyze large amounts of experimental measurements made using high throughput technologies to explain and predict how the environment influences biological function at multiple scales and how the microbial systems in turn modify the environment. By experimentally evaluating predictions made using these models we will test the degree to which our quantitative multiscale understanding wilt help to rationally steer individual microbes and their communities towards specific tasks. Towards this end we have made substantial progress towards understanding evolution of gene families, transcriptional structures, detailed structures of keystone molecular assemblies (proteins and complexes), protein interactions, biological networks, microbial interactions, and community structure. Using comparative analysis we have tracked the evolutionary history of gene functions to understand how novel functions evolve. One level up, we have used proteomics data, high-resolution genome tiling microarrays, and 5' RNA sequencing to revise genome annotations, discover new genes including ncRNAs, and map dynamically changing operon structures of five model organisms: For Desulfovibrio vulgaris Hildenborough, Pyrococcus furiosis, Sulfolobus solfataricus, Methanococcus maripaludis and Haiobacterium salinarum NROL We have developed machine learning algorithms to accurately identify protein interactions at a near-zero false positive rate from noisy data generated using tagfess complex purification, TAP purification, and analysis of membrane complexes. Combining other genome-scale datasets produced by ENIGMA (in particular, microarray data) and available from literature we have been able to achieve a true positive rate as high as 65% at almost zero false positives

  15. Genome-wide study of correlations between genomic features and their relationship with the regulation of gene expression.

    Science.gov (United States)

    Kravatsky, Yuri V; Chechetkin, Vladimir R; Tchurikov, Nikolai A; Kravatskaya, Galina I

    2015-02-01

    The broad class of tasks in genetics and epigenetics can be reduced to the study of various features that are distributed over the genome (genome tracks). The rapid and efficient processing of the huge amount of data stored in the genome-scale databases cannot be achieved without the software packages based on the analytical criteria. However, strong inhomogeneity of genome tracks hampers the development of relevant statistics. We developed the criteria for the assessment of genome track inhomogeneity and correlations between two genome tracks. We also developed a software package, Genome Track Analyzer, based on this theory. The theory and software were tested on simulated data and were applied to the study of correlations between CpG islands and transcription start sites in the Homo sapiens genome, between profiles of protein-binding sites in chromosomes of Drosophila melanogaster, and between DNA double-strand breaks and histone marks in the H. sapiens genome. Significant correlations between transcription start sites on the forward and the reverse strands were observed in genomes of D. melanogaster, Caenorhabditis elegans, Mus musculus, H. sapiens, and Danio rerio. The observed correlations may be related to the regulation of gene expression in eukaryotes. Genome Track Analyzer is freely available at http://ancorr.eimb.ru/. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  16. Genomic Characterization of Phenylalanine Ammonia Lyase Gene in Buckwheat.

    Directory of Open Access Journals (Sweden)

    Karthikeyan Thiyagarajan

    Full Text Available Phenylalanine Ammonia Lyase (PAL gene which plays a key role in bio-synthesis of medicinally important compounds, Rutin/quercetin was sequence characterized for its efficient genomics application. These compounds possessing anti-diabetic and anti-cancer properties and are predominantly produced by Fagopyrum spp. In the present study, PAL gene was sequenced from three Fagopyrum spp. (F. tataricum, F. esculentum and F. dibotrys and showed the presence of three SNPs and four insertion/deletions at intra and inter specific level. Among them, the potential SNP (position 949th bp G>C with Parsimony Informative Site was selected and successfully utilised to individuate the zygosity/allelic variation of 16 F. tataricum varieties. Insertion mutations were identified in coding region, which resulted the change of a stretch of 39 amino acids on the putative protein. Our Study revealed that autogamous species (F. tataricum has lower frequency of observed SNPs as compared to allogamous species (F. dibotrys and F. esculentum. The identified SNPs in F. tataricum didn't result to amino acid change, while in other two species it caused both conservative and non-conservative variations. Consistent pattern of SNPs across the species revealed their phylogenetic importance. We found two groups of F. tataricum and one of them was closely related with F. dibotrys. Sequence characterization information of PAL gene reported in present investigation can be utilized in genetic improvement of buckwheat in reference to its medicinal value.

  17. Genome Wide Identification, Phylogeny, and Expression of Aquaporin Genes in Common Carp (Cyprinus carpio.

    Directory of Open Access Journals (Sweden)

    Chuanju Dong

    Full Text Available Aquaporins (Aqps are integral membrane proteins that facilitate the transport of water and small solutes across cell membranes. Among vertebrate species, Aqps are highly conserved in both gene structure and amino acid sequence. These proteins are vital for maintaining water homeostasis in living organisms, especially for aquatic animals such as teleost fish. Studies on teleost Aqps are mainly limited to several model species with diploid genomes. Common carp, which has a tetraploidized genome, is one of the most common aquaculture species being adapted to a wide range of aquatic environments. The complete common carp genome has recently been released, providing us the possibility for gene evolution of aqp gene family after whole genome duplication.In this study, we identified a total of 37 aqp genes from common carp genome. Phylogenetic analysis revealed that most of aqps are highly conserved. Comparative analysis was performed across five typical vertebrate genomes. We found that almost all of the aqp genes in common carp were duplicated in the evolution of the gene family. We postulated that the expansion of the aqp gene family in common carp was the result of an additional whole genome duplication event and that the aqp gene family in other teleosts has been lost in their evolution history with the reason that the functions of genes are redundant and conservation. Expression patterns were assessed in various tissues, including brain, heart, spleen, liver, intestine, gill, muscle, and skin, which demonstrated the comprehensive expression profiles of aqp genes in the tetraploidized genome. Significant gene expression divergences have been observed, revealing substantial expression divergences or functional divergences in those duplicated aqp genes post the latest WGD event.To some extent, the gene families are also considered as a unique source for evolutionary studies. Moreover, the whole set of common carp aqp gene family provides an

  18. Comparative analysis of genome maintenance genes in naked mole rat, mouse, and human

    NARCIS (Netherlands)

    S.L. Macrae (Sheila L.); Q. Zhang (Quanwei); C. Lemetre (Christophe); I. Seim (Inge); R.B. Calder (Robert B.); J.H.J. Hoeijmakers (Jan); Y. Suh (Yousin); V.N. Gladyshev (Vadim N.); A. Seluanov (Andrei); V. Gorbunova (Vera); J. Vijg (Jan); Z.D. Zhang (Zhengdong D.)

    2015-01-01

    textabstractGenome maintenance (GM) is an essential defense system against aging and cancer, as both are characterized by increased genome instability. Here, we compared the copy number variation and mutation rate of 518 GM-associated genes in the naked mole rat (NMR), mouse, and human genomes. GM

  19. Theories of Population Variation in Genes and Genomes

    DEFF Research Database (Denmark)

    Christiansen, Freddy

    This textbook provides an authoritative introduction to both classical and coalescent approaches to population genetics. Written for graduate students and advanced undergraduates by one of the world’s leading authorities in the field, the book focuses on the theoretical background of population...... genetics, while emphasizing the close interplay between theory and empiricism. Traditional topics such as genetic and phenotypic variation, mutation, migration, and linkage are covered and advanced by contemporary coalescent theory, which describes the genealogy of genes in a population, ultimately...... connecting them to a single common ancestor. Effects of selection, particularly genomic effects, are discussed with reference to molecular genetic variation. The book is designed for students of population genetics, bioinformatics, evolutionary biology, molecular evolution, and theoretical biology—as well...

  20. Genome wide analyses of metal responsive genes in Caenorhabditis elegans

    Directory of Open Access Journals (Sweden)

    Michael eAschner

    2012-04-01

    Full Text Available Metals are major contaminants that influence human health. Many metals have physiologic roles, but excessive levels can be harmful. Advances in technology have made toxicogenomic analyses possible to characterize the effects of metal exposure on the entire genome. Much of what is known about cellular responses to metals has come from mammalian systems; however the use of non-mammalian species is gaining wider attention. Caenorhabditis elegans (C. elegans is a small round worm whose genome has been fully sequenced and its development from egg to adult is well characterized. It is an attractive model for high throughput screens due to its short lifespan, ease of genetic mutability, low cost and high homology with humans. Research performed in C. elegans has led to insights in apoptosis, gene expression and neurodegeneration, all of which can be altered by metal exposure. Additionally, by using worms one can potentially study how the mechanisms that underline differential responses to metals in nematodes and humans, allowing for identification of novel pathways and therapeutic targets. In this review, toxicogenomic studies performed in C. elegans exposed to various metals will be discussed, highlighting how this non-mammalian system can be utilized to study cellular processes and pathways induced by metals. Recent work focusing on neurodegeneration in Parkinson’s disease will be discussed as an example of the usefulness of genetic screens in C. elegans and the novel findings that can be produced.

  1. Meta genome-wide network from functional linkages of genes in human gut microbial ecosystems.

    Science.gov (United States)

    Ji, Yan; Shi, Yixiang; Wang, Chuan; Dai, Jianliang; Li, Yixue

    2013-03-01

    The human gut microbial ecosystem (HGME) exerts an important influence on the human health. In recent researches, meta-genomics provided deep insights into the HGME in terms of gene contents, metabolic processes and genome constitutions of meta-genome. Here we present a novel methodology to investigate the HGME on the basis of a set of functionally coupled genes regardless of their genome origins when considering the co-evolution properties of genes. By analyzing these coupled genes, we showed some basic properties of HGME significantly associated with each other, and further constructed a protein interaction map of human gut meta-genome to discover some functional modules that may relate with essential metabolic processes. Compared with other studies, our method provides a new idea to extract basic function elements from meta-genome systems and investigate complex microbial environment by associating its biological traits with co-evolutionary fingerprints encoded in it.

  2. GeneDig: a web application for accessing genomic and bioinformatics knowledge.

    Science.gov (United States)

    Suciu, Radu M; Aydin, Emir; Chen, Brian E

    2015-02-28

    With the exponential increase and widespread availability of genomic, transcriptomic, and proteomic data, accessing these '-omics' data is becoming increasingly difficult. The current resources for accessing and analyzing these data have been created to perform highly specific functions intended for specialists, and thus typically emphasize functionality over user experience. We have developed a web-based application, GeneDig.org, that allows any general user access to genomic information with ease and efficiency. GeneDig allows for searching and browsing genes and genomes, while a dynamic navigator displays genomic, RNA, and protein information simultaneously for co-navigation. We demonstrate that our application allows more than five times faster and efficient access to genomic information than any currently available methods. We have developed GeneDig as a platform for bioinformatics integration focused on usability as its central design. This platform will introduce genomic navigation to broader audiences while aiding the bioinformatics analyses performed in everyday biology research.

  3. Hypothesis: Gene-rich plastid genomes in red algae may be an outcome of nuclear genome reduction.

    Science.gov (United States)

    Qiu, Huan; Lee, Jun Mo; Yoon, Hwan Su; Bhattacharya, Debashish

    2017-06-01

    Red algae (Rhodophyta) putatively diverged from the eukaryote tree of life >1.2 billion years ago and are the source of plastids in the ecologically important diatoms, haptophytes, and dinoflagellates. In general, red algae contain the largest plastid gene inventory among all such organelles derived from primary, secondary, or additional rounds of endosymbiosis. In contrast, their nuclear gene inventory is reduced when compared to their putative sister lineage, the Viridiplantae, and other photosynthetic lineages. The latter is thought to have resulted from a phase of genome reduction that occurred in the stem lineage of Rhodophyta. A recent comparative analysis of a taxonomically broad collection of red algal and Viridiplantae plastid genomes demonstrates that the red algal ancestor encoded ~1.5× more plastid genes than Viridiplantae. This difference is primarily explained by more extensive endosymbiotic gene transfer (EGT) in the stem lineage of Viridiplantae, when compared to red algae. We postulate that limited EGT in Rhodophytes resulted from the countervailing force of ancient, and likely recurrent, nuclear genome reduction. In other words, the propensity for nuclear gene loss led to the retention of red algal plastid genes that would otherwise have undergone intracellular gene transfer to the nucleus. This hypothesis recognizes the primacy of nuclear genome evolution over that of plastids, which have no inherent control of their gene inventory and can change dramatically (e.g., secondarily non-photosynthetic eukaryotes, dinoflagellates) in response to selection acting on the host lineage. © 2017 Phycological Society of America.

  4. Strategies used for genetically modifying bacterial genome: ite-directed mutagenesis, gene inactivation, and gene over-expression*

    Science.gov (United States)

    Xu, Jian-zhong; Zhang, Wei-guo

    2016-01-01

    With the availability of the whole genome sequence of Escherichia coli or Corynebacterium glutamicum, strategies for directed DNA manipulation have developed rapidly. DNA manipulation plays an important role in understanding the function of genes and in constructing novel engineering bacteria according to requirement. DNA manipulation involves modifying the autologous genes and expressing the heterogenous genes. Two alternative approaches, using electroporation linear DNA or recombinant suicide plasmid, allow a wide variety of DNA manipulation. However, the over-expression of the desired gene is generally executed via plasmid-mediation. The current review summarizes the common strategies used for genetically modifying E. coli and C. glutamicum genomes, and discusses the technical problem of multi-layered DNA manipulation. Strategies for gene over-expression via integrating into genome are proposed. This review is intended to be an accessible introduction to DNA manipulation within the bacterial genome for novices and a source of the latest experimental information for experienced investigators. PMID:26834010

  5. Draft Genome Sequence and Gene Annotation of the Entomopathogenic Fungus Verticillium hemipterigenum

    OpenAIRE

    Horn, Fabian; Habel, Andreas; Scharf, Daniel H.; Dworschak, Jan; Brakhage, Axel A.; Guthke, Reinhard; Hertweck, Christian; Linde, J?rg

    2015-01-01

    Verticillium hemipterigenum (anamorph Torrubiella hemipterigena) is an entomopathogenic fungus and produces a broad range of secondary metabolites. Here, we present the draft genome sequence of the fungus, including gene structure and functional annotation. Genes were predicted incorporating RNA-Seq data and functionally annotated to provide the basis for further genome studies.

  6. Congruent Deep Relationships in the Grape Family (Vitaceae) Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming.

    Science.gov (United States)

    Zhang, Ning; Wen, Jun; Zimmer, Elizabeth A

    2015-01-01

    Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study,next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina NextSeq 500 instrument [corrected]. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data: especially from chloroplast and mitochondrial DNAs.

  7. Congruent Deep Relationships in the Grape Family (Vitaceae Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming.

    Directory of Open Access Journals (Sweden)

    Ning Zhang

    Full Text Available Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera. The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study,next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina NextSeq 500 instrument [corrected]. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data: especially from chloroplast and mitochondrial DNAs.

  8. Reconstruction of Ancestral Genomes in Presence of Gene Gain and Loss.

    Science.gov (United States)

    Avdeyev, Pavel; Jiang, Shuai; Aganezov, Sergey; Hu, Fei; Alekseyev, Max A

    2016-03-01

    Since most dramatic genomic changes are caused by genome rearrangements as well as gene duplications and gain/loss events, it becomes crucial to understand their mechanisms and reconstruct ancestral genomes of the given genomes. This problem was shown to be NP-complete even in the "simplest" case of three genomes, thus calling for heuristic rather than exact algorithmic solutions. At the same time, a larger number of input genomes may actually simplify the problem in practice as it was earlier illustrated with MGRA, a state-of-the-art software tool for reconstruction of ancestral genomes of multiple genomes. One of the key obstacles for MGRA and other similar tools is presence of breakpoint reuses when the same breakpoint region is broken by several different genome rearrangements in the course of evolution. Furthermore, such tools are often limited to genomes composed of the same genes with each gene present in a single copy in every genome. This limitation makes these tools inapplicable for many biological datasets and degrades the resolution of ancestral reconstructions in diverse datasets. We address these deficiencies by extending the MGRA algorithm to genomes with unequal gene contents. The developed next-generation tool MGRA2 can handle gene gain/loss events and shares the ability of MGRA to reconstruct ancestral genomes uniquely in the case of limited breakpoint reuse. Furthermore, MGRA2 employs a number of novel heuristics to cope with higher breakpoint reuse and process datasets inaccessible for MGRA. In practical experiments, MGRA2 shows superior performance for simulated and real genomes as compared to other ancestral genome reconstruction tools.

  9. Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger

    OpenAIRE

    Wright, James C.; Sugden, Deana; Francis-McIntyre, Sue; Riba Garcia, Isabel; Gaskell, Simon J.; Grigoriev, Igor V.; Baker, Scott E.; Beynon, Robert J.; Hubbard, Simon J.

    2009-01-01

    Abstract Background Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate over a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institutes (JGI) and another predicted protein set from another A.niger sequence. Tandem mass spectra (MS/MS) were ac...

  10. Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes

    Science.gov (United States)

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approxima...

  11. The First Myriapod Genome Sequence Reveals Conservative Arthropod Gene Content and Genome Organisation in the Centipede Strigamia maritima

    Science.gov (United States)

    Chipman, Ariel D.; Ferrier, David E. K.; Brena, Carlo; Qu, Jiaxin; Hughes, Daniel S. T.; Schröder, Reinhard; Torres-Oliva, Montserrat; Znassi, Nadia; Jiang, Huaiyang; Almeida, Francisca C.; Alonso, Claudio R.; Apostolou, Zivkos; Aqrawi, Peshtewani; Arthur, Wallace; Barna, Jennifer C. J.; Blankenburg, Kerstin P.; Brites, Daniela; Capella-Gutiérrez, Salvador; Coyle, Marcus; Dearden, Peter K.; Du Pasquier, Louis; Duncan, Elizabeth J.; Ebert, Dieter; Eibner, Cornelius; Erikson, Galina; Evans, Peter D.; Extavour, Cassandra G.; Francisco, Liezl; Gabaldón, Toni; Gillis, William J.; Goodwin-Horn, Elizabeth A.; Green, Jack E.; Griffiths-Jones, Sam; Grimmelikhuijzen, Cornelis J. P.; Gubbala, Sai; Guigó, Roderic; Han, Yi; Hauser, Frank; Havlak, Paul; Hayden, Luke; Helbing, Sophie; Holder, Michael; Hui, Jerome H. L.; Hunn, Julia P.; Hunnekuhl, Vera S.; Jackson, LaRonda; Javaid, Mehwish; Jhangiani, Shalini N.; Jiggins, Francis M.; Jones, Tamsin E.; Kaiser, Tobias S.; Kalra, Divya; Kenny, Nathan J.; Korchina, Viktoriya; Kovar, Christie L.; Kraus, F. Bernhard; Lapraz, François; Lee, Sandra L.; Lv, Jie; Mandapat, Christigale; Manning, Gerard; Mariotti, Marco; Mata, Robert; Mathew, Tittu; Neumann, Tobias; Newsham, Irene; Ngo, Dinh N.; Ninova, Maria; Okwuonu, Geoffrey; Ongeri, Fiona; Palmer, William J.; Patil, Shobha; Patraquim, Pedro; Pham, Christopher; Pu, Ling-Ling; Putman, Nicholas H.; Rabouille, Catherine; Ramos, Olivia Mendivil; Rhodes, Adelaide C.; Robertson, Helen E.; Robertson, Hugh M.; Ronshaugen, Matthew; Rozas, Julio; Saada, Nehad; Sánchez-Gracia, Alejandro; Scherer, Steven E.; Schurko, Andrew M.; Siggens, Kenneth W.; Simmons, DeNard; Stief, Anna; Stolle, Eckart; Telford, Maximilian J.; Tessmar-Raible, Kristin; Thornton, Rebecca; van der Zee, Maurijn; von Haeseler, Arndt; Williams, James M.; Willis, Judith H.; Wu, Yuanqing; Zou, Xiaoyan; Lawson, Daniel; Muzny, Donna M.; Worley, Kim C.; Gibbs, Richard A.; Akam, Michael; Richards, Stephen

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific

  12. On the total number of genes and their length distribution in complete microbial genomes

    DEFF Research Database (Denmark)

    Skovgaard, M; Jensen, L J; Brunak, S

    2001-01-01

    In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribut......In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length...... distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300...... genes, we show that it probably has only approximately 3800 genes, and that a similar discrepancy exists for almost all published genomes....

  13. The impact of genome triplication on tandem gene evolution in Brassica rapa

    Directory of Open Access Journals (Sweden)

    Lu eFang

    2012-11-01

    Full Text Available Whole genome duplication (WGD and tandem duplication (TD are both important modes of gene expansion. However, how whole genome duplication influences tandemly duplicated genes is not well studied. We used Brassica rapa, which has undergone an additional genome triplication (WGT and shares a common ancestor with Arabidopsis thaliana, Arabidopsis lyrata and Thellungiella parvula, to investigate the impact of genome triplication on tandem gene evolution. We identified 2,137, 1,569, 1,751 and 1,135 tandem gene arrays in B. rapa, A. thaliana, A. lyrata and T. parvula respectively. Among them, 414 conserved tandem arrays are shared by the 3 species without WGT, which were also considered as existing in the diploid ancestor of B. rapa. Thus, after genome triplication, B. rapa should have 1,242 tandem arrays according to the 414 conserved tandems. Here, we found 400 out of the 414 tandems had at least one syntenic ortholog in the genome of B. rapa. Furthermore, 294 out of the 400 shared syntenic orthologs maintain tandem arrays (more than one gene for each syntenic hit in B. rapa. For the 294 tandem arrays, we obtained 426 copies of syntenic paralogous tandems in the triplicated genome of B. rapa. In this study, we demonstrated that tandem arrays in B. rapa were dramatically fractionated after WGT when compared either to non-tandem genes in the B. rapa genome or to the tandem arrays in closely related species that have not experienced a recent whole-genome polyploidization event.

  14. Inter-genomic DNA Exchanges and Homeologous Gene Silencing Shaped the Nascent Allopolyploid Coffee Genome (Coffea arabica L.

    Directory of Open Access Journals (Sweden)

    Philippe Lashermes

    2016-09-01

    Full Text Available Allopolyploidization is a biological process that has played a major role in plant speciation and evolution. Genomic changes are common consequences of polyploidization, but their dynamics over time are still poorly understood. Coffea arabica, a recently formed allotetraploid, was chosen to study genetic changes that accompany allopolyploid formation. Both RNA-seq and DNA-seq data were generated from two genetically distant C. arabica accessions. Genomic structural variation was investigated using C. canephora, one of its diploid progenitors, as reference genome. The fate of 9047 duplicate homeologous genes was inferred and compared between the accessions. The pattern of SNP density along the reference genome was consistent with the allopolyploid structure. Large genomic duplications or deletions were not detected. Two homeologous copies were retained and expressed in 96% of the genes analyzed. Nevertheless, duplicated genes were found to be affected by various genomic changes leading to homeolog loss or silencing. Genetic and epigenetic changes were evidenced that could have played a major role in the stabilization of the unique ancestral allotetraploid and its subsequent diversification. While the early evolution of C. arabica mainly involved homeologous crossover exchanges, the later stage appears to have relied on more gradual evolution involving gene conversion and homeolog silencing.

  15. The complete chloroplast genome sequence of Podocarpus lambertii: genome structure, evolutionary aspects, gene content and SSR detection.

    Directory of Open Access Journals (Sweden)

    Leila do Nascimento Vieira

    Full Text Available BACKGROUND: Podocarpus lambertii (Podocarpaceae is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. METHODOLOGY/PRINCIPAL FINDINGS: The P. lambertii cp genome is 133,734 bp in length, and similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR. It contains 118 unique genes and one duplicated tRNA (trnN-GUU, which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi and Araucariaceae (Agathis dammara. Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. CONCLUSION: The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparently loss of rps16 gene in Podocarpaceae cp genome. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of

  16. Convergent functional genomics in addiction research - a translational approach to study candidate genes and gene networks.

    Science.gov (United States)

    Spanagel, Rainer

    2013-01-01

    Convergent functional genomics (CFG) is a translational methodology that integrates in a Bayesian fashion multiple lines of evidence from studies in human and animal models to get a better understanding of the genetics of a disease or pathological behavior. Here the integration of data sets that derive from forward genetics in animals and genetic association studies including genome wide association studies (GWAS) in humans is described for addictive behavior. The aim of forward genetics in animals and association studies in humans is to identify mutations (e.g. SNPs) that produce a certain phenotype; i.e. "from phenotype to genotype". Most powerful in terms of forward genetics is combined quantitative trait loci (QTL) analysis and gene expression profiling in recombinant inbreed rodent lines or genetically selected animals for a specific phenotype, e.g. high vs. low drug consumption. By Bayesian scoring genomic information from forward genetics in animals is then combined with human GWAS data on a similar addiction-relevant phenotype. This integrative approach generates a robust candidate gene list that has to be functionally validated by means of reverse genetics in animals; i.e. "from genotype to phenotype". It is proposed that studying addiction relevant phenotypes and endophenotypes by this CFG approach will allow a better determination of the genetics of addictive behavior.

  17. Complementary Information Derived from CRISPR Cas9 Mediated Gene Deletion and Suppression. | Office of Cancer Genomics

    Science.gov (United States)

    CRISPR-Cas9 provides the means to perform genome editing and facilitates loss-of-function screens. However, we and others demonstrated that expression of the Cas9 endonuclease induces a gene-independent response that correlates with the number of target sequences in the genome. An alternative approach to suppressing gene expression is to block transcription using a catalytically inactive Cas9 (dCas9). Here we directly compare genome editing by CRISPR-Cas9 (cutting, CRISPRc) and gene suppression using KRAB-dCas9 (CRISPRi) in loss-of-function screens to identify cell essential genes.

  18. Finding the missing honey bee genes: Lessons learned from a genome upgrade

    KAUST Repository

    Elsik, Christine G; Worley, Kim C; Bennett, Anna K; Beye, Martin; Camara, Francisco; Childers, Christopher P; de Graaf, Dirk C; Debyser, Griet; Deng, Jixin; Devreese, Bart; Elhaik, Eran; Evans, Jay D; Foster, Leonard J; Graur, Dan; Guigo, Roderic; Hoff, Katharina Jasmin; Holder, Michael E; Hudson, Matthew E; Hunt, Greg J; Jiang, Huaiyang; Joshi, Vandita; Khetani, Radhika S; Kosarev, Peter; Kovar, Christie L; Ma, Jian; Maleszka, Ryszard; Moritz, Robin F A; Munoz-Torres, Monica C; Murphy, Terence D; Muzny, Donna M; Newsham, Irene F; Reese, Justin T; Robertson, Hugh M; Robinson, Gene E; Rueppell, Olav; Solovyev, Victor; Stanke, Mario; Stolle, Eckart; Tsuruda, Jennifer M; Vaerenbergh, Matthias Van; Waterhouse, Robert M; Weaver, Daniel B; Whitfield, Charles W; Wu, Yuanqing; Zdobnov, Evgeny M; Zhang, Lan; Zhu, Dianhui; Gibbs, Richard A; Patil, S.; Gubbala, S.; Aqrawi, P.; Arias, F.; Bess, C.; Blankenburg, K. B.; Brocchini, M.; Buhay, C.; Challis, D.; Chang, K.; Chen, D.; Coleman, P.; Drummond, J.; English, A.; Evani, U.; Francisco, L.; Fu, Q.; Goodspeed, R.; Haessly, T. H.; Hale, W.; Han, H.; Hu, Y.; Jackson, L.; Jakkamsetti, A.; Jayaseelan, J. C.; Kakkar, N.; Kalra, D.; Kandadi, H.; Lee, S.; Li, H.; Liu, Y.; Macmil, S.; Mandapat, C. M.; Mata, R.; Mathew, T.; Matskevitch, T.; Munidasa, M.; Nagaswamy, U.; Najjar, R.; Nguyen, N.; Niu, J.; Opheim, D.; Palculict, T.; Paul, S.; Pellon, M.; Perales, L.; Pham, C.; Pham, P.

    2014-01-01

    Background: The first generation of genome sequence assemblies and annotations have had a significant impact upon our understanding of the biology of the sequenced species, the phylogenetic relationships among species, the study of populations within and across species, and have informed the biology of humans. As only a few Metazoan genomes are approaching finished quality (human, mouse, fly and worm), there is room for improvement of most genome assemblies. The honey bee (Apis mellifera) genome, published in 2006, was noted for its bimodal GC content distribution that affected the quality of the assembly in some regions and for fewer genes in the initial gene set (OGSv1.0) compared to what would be expected based on other sequenced insect genomes. Results: Here, we report an improved honey bee genome assembly (Amel_4.5) with a new gene annotation set (OGSv3.2), and show that the honey bee genome contains a number of genes similar to that of other insect genomes, contrary to what was suggested in OGSv1.0. The new genome assembly is more contiguous and complete and the new gene set includes ~5000 more protein-coding genes, 50% more than previously reported. About 1/6 of the additional genes were due to improvements to the assembly, and the remaining were inferred based on new RNAseq and protein data. Conclusions: Lessons learned from this genome upgrade have important implications for future genome sequencing projects. Furthermore, the improvements significantly enhance genomic resources for the honey bee, a key model for social behavior and essential to global ecology through pollination. 2014 Elsik et al.; licensee BioMed Central Ltd.

  19. Finding the missing honey bee genes: lessons learned from a genome upgrade.

    Science.gov (United States)

    Elsik, Christine G; Worley, Kim C; Bennett, Anna K; Beye, Martin; Camara, Francisco; Childers, Christopher P; de Graaf, Dirk C; Debyser, Griet; Deng, Jixin; Devreese, Bart; Elhaik, Eran; Evans, Jay D; Foster, Leonard J; Graur, Dan; Guigo, Roderic; Hoff, Katharina Jasmin; Holder, Michael E; Hudson, Matthew E; Hunt, Greg J; Jiang, Huaiyang; Joshi, Vandita; Khetani, Radhika S; Kosarev, Peter; Kovar, Christie L; Ma, Jian; Maleszka, Ryszard; Moritz, Robin F A; Munoz-Torres, Monica C; Murphy, Terence D; Muzny, Donna M; Newsham, Irene F; Reese, Justin T; Robertson, Hugh M; Robinson, Gene E; Rueppell, Olav; Solovyev, Victor; Stanke, Mario; Stolle, Eckart; Tsuruda, Jennifer M; Vaerenbergh, Matthias Van; Waterhouse, Robert M; Weaver, Daniel B; Whitfield, Charles W; Wu, Yuanqing; Zdobnov, Evgeny M; Zhang, Lan; Zhu, Dianhui; Gibbs, Richard A

    2014-01-30

    The first generation of genome sequence assemblies and annotations have had a significant impact upon our understanding of the biology of the sequenced species, the phylogenetic relationships among species, the study of populations within and across species, and have informed the biology of humans. As only a few Metazoan genomes are approaching finished quality (human, mouse, fly and worm), there is room for improvement of most genome assemblies. The honey bee (Apis mellifera) genome, published in 2006, was noted for its bimodal GC content distribution that affected the quality of the assembly in some regions and for fewer genes in the initial gene set (OGSv1.0) compared to what would be expected based on other sequenced insect genomes. Here, we report an improved honey bee genome assembly (Amel_4.5) with a new gene annotation set (OGSv3.2), and show that the honey bee genome contains a number of genes similar to that of other insect genomes, contrary to what was suggested in OGSv1.0. The new genome assembly is more contiguous and complete and the new gene set includes ~5000 more protein-coding genes, 50% more than previously reported. About 1/6 of the additional genes were due to improvements to the assembly, and the remaining were inferred based on new RNAseq and protein data. Lessons learned from this genome upgrade have important implications for future genome sequencing projects. Furthermore, the improvements significantly enhance genomic resources for the honey bee, a key model for social behavior and essential to global ecology through pollination.

  20. Finding the missing honey bee genes: Lessons learned from a genome upgrade

    KAUST Repository

    Elsik, Christine G

    2014-01-30

    Background: The first generation of genome sequence assemblies and annotations have had a significant impact upon our understanding of the biology of the sequenced species, the phylogenetic relationships among species, the study of populations within and across species, and have informed the biology of humans. As only a few Metazoan genomes are approaching finished quality (human, mouse, fly and worm), there is room for improvement of most genome assemblies. The honey bee (Apis mellifera) genome, published in 2006, was noted for its bimodal GC content distribution that affected the quality of the assembly in some regions and for fewer genes in the initial gene set (OGSv1.0) compared to what would be expected based on other sequenced insect genomes. Results: Here, we report an improved honey bee genome assembly (Amel_4.5) with a new gene annotation set (OGSv3.2), and show that the honey bee genome contains a number of genes similar to that of other insect genomes, contrary to what was suggested in OGSv1.0. The new genome assembly is more contiguous and complete and the new gene set includes ~5000 more protein-coding genes, 50% more than previously reported. About 1/6 of the additional genes were due to improvements to the assembly, and the remaining were inferred based on new RNAseq and protein data. Conclusions: Lessons learned from this genome upgrade have important implications for future genome sequencing projects. Furthermore, the improvements significantly enhance genomic resources for the honey bee, a key model for social behavior and essential to global ecology through pollination. 2014 Elsik et al.; licensee BioMed Central Ltd.

  1. Extensive error in the number of genes inferred from draft genome assemblies.

    Directory of Open Access Journals (Sweden)

    James F Denton

    2014-12-01

    Full Text Available Current sequencing methods produce large amounts of data, but genome assemblies based on these data are often woefully incomplete. These incomplete and error-filled assemblies result in many annotation errors, especially in the number of genes present in a genome. In this paper we investigate the magnitude of the problem, both in terms of total gene number and the number of copies of genes in specific families. To do this, we compare multiple draft assemblies against higher-quality versions of the same genomes, using several new assemblies of the chicken genome based on both traditional and next-generation sequencing technologies, as well as published draft assemblies of chimpanzee. We find that upwards of 40% of all gene families are inferred to have the wrong number of genes in draft assemblies, and that these incorrect assemblies both add and subtract genes. Using simulated genome assemblies of Drosophila melanogaster, we find that the major cause of increased gene numbers in draft genomes is the fragmentation of genes onto multiple individual contigs. Finally, we demonstrate the usefulness of RNA-Seq in improving the gene annotation of draft assemblies, largely by connecting genes that have been fragmented in the assembly process.

  2. Genome-wide search for gene-gene interactions in colorectal cancer.

    Directory of Open Access Journals (Sweden)

    Shuo Jiao

    Full Text Available Genome-wide association studies (GWAS have successfully identified a number of single-nucleotide polymorphisms (SNPs associated with colorectal cancer (CRC risk. However, these susceptibility loci known today explain only a small fraction of the genetic risk. Gene-gene interaction (GxG is considered to be one source of the missing heritability. To address this, we performed a genome-wide search for pair-wise GxG associated with CRC risk using 8,380 cases and 10,558 controls in the discovery phase and 2,527 cases and 2,658 controls in the replication phase. We developed a simple, but powerful method for testing interaction, which we term the Average Risk Due to Interaction (ARDI. With this method, we conducted a genome-wide search to identify SNPs showing evidence for GxG with previously identified CRC susceptibility loci from 14 independent regions. We also conducted a genome-wide search for GxG using the marginal association screening and examining interaction among SNPs that pass the screening threshold (p<10(-4. For the known locus rs10795668 (10p14, we found an interacting SNP rs367615 (5q21 with replication p = 0.01 and combined p = 4.19×10(-8. Among the top marginal SNPs after LD pruning (n = 163, we identified an interaction between rs1571218 (20p12.3 and rs10879357 (12q21.1 (nominal combined p = 2.51×10(-6; Bonferroni adjusted p = 0.03. Our study represents the first comprehensive search for GxG in CRC, and our results may provide new insight into the genetic etiology of CRC.

  3. A comprehensive evaluation of rodent malaria parasite genomes and gene expression

    KAUST Repository

    Otto, Thomas D

    2014-10-30

    Background: Rodent malaria parasites (RMP) are used extensively as models of human malaria. Draft RMP genomes have been published for Plasmodium yoelii, P. berghei ANKA (PbA) and P. chabaudi AS (PcAS). Although availability of these genomes made a significant impact on recent malaria research, these genomes were highly fragmented and were annotated with little manual curation. The fragmented nature of the genomes has hampered genome wide analysis of Plasmodium gene regulation and function. Results: We have greatly improved the genome assemblies of PbA and PcAS, newly sequenced the virulent parasite P. yoelii YM genome, sequenced additional RMP isolates/lines and have characterized genotypic diversity within RMP species. We have produced RNA-seq data and utilized it to improve gene-model prediction and to provide quantitative, genome-wide, data on gene expression. Comparison of the RMP genomes with the genome of the human malaria parasite P. falciparum and RNA-seq mapping permitted gene annotation at base-pair resolution. Full-length chromosomal annotation permitted a comprehensive classification of all subtelomeric multigene families including the `Plasmodium interspersed repeat genes\\' (pir). Phylogenetic classification of the pir family, combined with pir expression patterns, indicates functional diversification within this family. Conclusions: Complete RMP genomes, RNA-seq and genotypic diversity data are excellent and important resources for gene-function and post-genomic analyses and to better interrogate Plasmodium biology. Genotypic diversity between P. chabaudi isolates makes this species an excellent parasite to study genotype-phenotype relationships. The improved classification of multigene families will enhance studies on the role of (variant) exported proteins in virulence and immune evasion/modulation.

  4. StereoGene: rapid estimation of genome-wide correlation of continuous or interval feature data.

    Science.gov (United States)

    Stavrovskaya, Elena D; Niranjan, Tejasvi; Fertig, Elana J; Wheelan, Sarah J; Favorov, Alexander V; Mironov, Andrey A

    2017-10-15

    Genomics features with similar genome-wide distributions are generally hypothesized to be functionally related, for example, colocalization of histones and transcription start sites indicate chromatin regulation of transcription factor activity. Therefore, statistical algorithms to perform spatial, genome-wide correlation among genomic features are required. Here, we propose a method, StereoGene, that rapidly estimates genome-wide correlation among pairs of genomic features. These features may represent high-throughput data mapped to reference genome or sets of genomic annotations in that reference genome. StereoGene enables correlation of continuous data directly, avoiding the data binarization and subsequent data loss. Correlations are computed among neighboring genomic positions using kernel correlation. Representing the correlation as a function of the genome position, StereoGene outputs the local correlation track as part of the analysis. StereoGene also accounts for confounders such as input DNA by partial correlation. We apply our method to numerous comparisons of ChIP-Seq datasets from the Human Epigenome Atlas and FANTOM CAGE to demonstrate its wide applicability. We observe the changes in the correlation between epigenomic features across developmental trajectories of several tissue types consistent with known biology and find a novel spatial correlation of CAGE clusters with donor splice sites and with poly(A) sites. These analyses provide examples for the broad applicability of StereoGene for regulatory genomics. The StereoGene C ++ source code, program documentation, Galaxy integration scripts and examples are available from the project homepage http://stereogene.bioinf.fbb.msu.ru/. favorov@sensi.org. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  5. Genomic organization, annotation, and ligand-receptor inferences of chicken chemokines and chemokine receptor genes based on comparative genomics

    Directory of Open Access Journals (Sweden)

    Sze Sing-Hoi

    2005-03-01

    Full Text Available Abstract Background Chemokines and their receptors play important roles in host defense, organogenesis, hematopoiesis, and neuronal communication. Forty-two chemokines and 19 cognate receptors have been found in the human genome. Prior to this report, only 11 chicken chemokines and 7 receptors had been reported. The objectives of this study were to systematically identify chicken chemokines and their cognate receptor genes in the chicken genome and to annotate these genes and ligand-receptor binding by a comparative genomics approach. Results Twenty-three chemokine and 14 chemokine receptor genes were identified in the chicken genome. All of the chicken chemokines contained a conserved CC, CXC, CX3C, or XC motif, whereas all the chemokine receptors had seven conserved transmembrane helices, four extracellular domains with a conserved cysteine, and a conserved DRYLAIV sequence in the second intracellular domain. The number of coding exons in these genes and the syntenies are highly conserved between human, mouse, and chicken although the amino acid sequence homologies are generally low between mammalian and chicken chemokines. Chicken genes were named with the systematic nomenclature used in humans and mice based on phylogeny, synteny, and sequence homology. Conclusion The independent nomenclature of chicken chemokines and chemokine receptors suggests that the chicken may have ligand-receptor pairings similar to mammals. All identified chicken chemokines and their cognate receptors were identified in the chicken genome except CCR9, whose ligand was not identified in this study. The organization of these genes suggests that there were a substantial number of these genes present before divergence between aves and mammals and more gene duplications of CC, CXC, CCR, and CXCR subfamilies in mammals than in aves after the divergence.

  6. Mammalian-specific genomic functions: Newly acquired traits generated by genomic imprinting and LTR retrotransposon-derived genes in mammals.

    Science.gov (United States)

    Kaneko-Ishino, Tomoko; Ishino, Fumitoshi

    2015-01-01

    Mammals, including human beings, have evolved a unique viviparous reproductive system and a highly developed central nervous system. How did these unique characteristics emerge in mammalian evolution, and what kinds of changes did occur in the mammalian genomes as evolution proceeded? A key conceptual term in approaching these issues is "mammalian-specific genomic functions", a concept covering both mammalian-specific epigenetics and genetics. Genomic imprinting and LTR retrotransposon-derived genes are reviewed as the representative, mammalian-specific genomic functions that are essential not only for the current mammalian developmental system, but also mammalian evolution itself. First, the essential roles of genomic imprinting in mammalian development, especially related to viviparous reproduction via placental function, as well as the emergence of genomic imprinting in mammalian evolution, are discussed. Second, we introduce the novel concept of "mammalian-specific traits generated by mammalian-specific genes from LTR retrotransposons", based on the finding that LTR retrotransposons served as a critical driving force in the mammalian evolution via generating mammalian-specific genes.

  7. Genome-wide analysis of regions similar to promoters of histone genes

    KAUST Repository

    Chowdhary, Rajesh

    2010-05-28

    Background: The purpose of this study is to: i) develop a computational model of promoters of human histone-encoding genes (shortly histone genes), an important class of genes that participate in various critical cellular processes, ii) use the model so developed to identify regions across the human genome that have similar structure as promoters of histone genes; such regions could represent potential genomic regulatory regions, e.g. promoters, of genes that may be coregulated with histone genes, and iii/ identify in this way genes that have high likelihood of being coregulated with the histone genes.Results: We successfully developed a histone promoter model using a comprehensive collection of histone genes. Based on leave-one-out cross-validation test, the model produced good prediction accuracy (94.1% sensitivity, 92.6% specificity, and 92.8% positive predictive value). We used this model to predict across the genome a number of genes that shared similar promoter structures with the histone gene promoters. We thus hypothesize that these predicted genes could be coregulated with histone genes. This hypothesis matches well with the available gene expression, gene ontology, and pathways data. Jointly with promoters of the above-mentioned genes, we found a large number of intergenic regions with similar structure as histone promoters.Conclusions: This study represents one of the most comprehensive computational analyses conducted thus far on a genome-wide scale of promoters of human histone genes. Our analysis suggests a number of other human genes that share a high similarity of promoter structure with the histone genes and thus are highly likely to be coregulated, and consequently coexpressed, with the histone genes. We also found that there are a large number of intergenic regions across the genome with their structures similar to promoters of histone genes. These regions may be promoters of yet unidentified genes, or may represent remote control regions that

  8. Flexibility and symmetry of prokaryotic genome rearrangement reveal lineage-associated core-gene-defined genome organizational frameworks.

    Science.gov (United States)

    Kang, Yu; Gu, Chaohao; Yuan, Lina; Wang, Yue; Zhu, Yanmin; Li, Xinna; Luo, Qibin; Xiao, Jingfa; Jiang, Daquan; Qian, Minping; Ahmed Khan, Aftab; Chen, Fei; Zhang, Zhang; Yu, Jun

    2014-11-25

    The prokaryotic pangenome partitions genes into core and dispensable genes. The order of core genes, albeit assumed to be stable under selection in general, is frequently interrupted by horizontal gene transfer and rearrangement, but how a core-gene-defined genome maintains its stability or flexibility remains to be investigated. Based on data from 30 species, including 425 genomes from six phyla, we grouped core genes into syntenic blocks in the context of a pangenome according to their stability across multiple isolates. A subset of the core genes, often species specific and lineage associated, formed a core-gene-defined genome organizational framework (cGOF). Such cGOFs are either single segmental (one-third of the species analyzed) or multisegmental (the rest). Multisegment cGOFs were further classified into symmetric or asymmetric according to segment orientations toward the origin-terminus axis. The cGOFs in Gram-positive species are exclusively symmetric and often reversible in orientation, as opposed to those of the Gram-negative bacteria, which are all asymmetric and irreversible. Meanwhile, all species showing strong strand-biased gene distribution contain symmetric cGOFs and often specific DnaE (α subunit of DNA polymerase III) isoforms. Furthermore, functional evaluations revealed that cGOF genes are hub associated with regard to cellular activities, and the stability of cGOF provides efficient indexes for scaffold orientation as demonstrated by assembling virtual and empirical genome drafts. cGOFs show species specificity, and the symmetry of multisegmental cGOFs is conserved among taxa and constrained by DNA polymerase-centric strand-biased gene distribution. The definition of species-specific cGOFs provides powerful guidance for genome assembly and other structure-based analysis. Prokaryotic genomes are frequently interrupted by horizontal gene transfer (HGT) and rearrangement. To know whether there is a set of genes not only conserved in position

  9. Virtual Genome Walking across the 32 Gb Ambystoma mexicanum genome; assembling gene models and intronic sequence.

    Science.gov (United States)

    Evans, Teri; Johnson, Andrew D; Loose, Matthew

    2018-01-12

    Large repeat rich genomes present challenges for assembly using short read technologies. The 32 Gb axolotl genome is estimated to contain ~19 Gb of repetitive DNA making an assembly from short reads alone effectively impossible. Indeed, this model species has been sequenced to 20× coverage but the reads could not be conventionally assembled. Using an alternative strategy, we have assembled subsets of these reads into scaffolds describing over 19,000 gene models. We call this method Virtual Genome Walking as it locally assembles whole genome reads based on a reference transcriptome, identifying exons and iteratively extending them into surrounding genomic sequence. These assemblies are then linked and refined to generate gene models including upstream and downstream genomic, and intronic, sequence. Our assemblies are validated by comparison with previously published axolotl bacterial artificial chromosome (BAC) sequences. Our analyses of axolotl intron length, intron-exon structure, repeat content and synteny provide novel insights into the genic structure of this model species. This resource will enable new experimental approaches in axolotl, such as ChIP-Seq and CRISPR and aid in future whole genome sequencing efforts. The assembled sequences and annotations presented here are freely available for download from https://tinyurl.com/y8gydc6n . The software pipeline is available from https://github.com/LooseLab/iterassemble .

  10. Do alterations in mesofauna community affect earthworms?

    Science.gov (United States)

    Uvarov, Alexei V; Karaban, Kamil

    2015-11-01

    Interactions between the saprotrophic animal groups that strongly control soil microbial activities and the functioning of detrital food webs, such as earthworms and mesofauna, are not well understood. Earthworm trophic and engineering activities strongly affect mesofauna abundance and diversity through various direct and indirect pathways. In contrast, mesofauna effects on earthworm populations are less evident; however, their importance may be high, considering the keystone significance of earthworms for the functioning of the soil system. We studied effects of a diverse mesofauna community of a deciduous forest on two earthworm species representing epigeic (Lumbricus rubellus) and endogeic (Aporrectodea caliginosa) ecological groups. In microcosms, the density of total mesofauna or its separate groups (enchytraeids, collembolans, gamasid mites) was manipulated (increased) and responses of earthworms and soil systems were recorded. A rise in mesofauna density resulted in a decrease of biomass and an increased mortality in L. rubellus, presumably due to competition with mesofauna for litter resources. In contrast, similar mesofauna manipulations promoted reproduction of A. caliginosa, suggesting a facilitated exploitation of litter resources due to increased mesofauna activities. Changes of microcosm respiration rates, litter organic matter content and microbial activities across the manipulation treatments indicate that mesofauna modify responses of soil systems in the presence of earthworms. However, similar mesofauna manipulations could induce different responses in soil systems with either epigeic or endogeic lumbricids, which suggests that earthworm/mesofauna interactions are species-specific. Thus, mesofauna impacts should be treated as a factor affecting the engineering activities of epigeic and endogeic earthworms in the soil.

  11. Evolution of the tripartite symbiosis between earthworms, Verminephrobacter and Flexibacter-like bacteria

    Directory of Open Access Journals (Sweden)

    Peter eMøller

    2015-05-01

    Full Text Available Nephridial (excretory organ symbionts are widespread in lumbricid earthworms and the complexity of the nephridial symbiont communities varies greatly between earthworm species. The two most common symbionts are the well-described Verminephrobacter and less well-known Flexibacter-like bacteria. Verminephrobacter are present in almost all lumbricid earthworms, they are species-specific, vertically transmitted, and have presumably been associated with their hosts since the origin of lumbricids. Flexibacter-like symbionts have been reported from about half the investigated earthworms; they are also vertically transmitted. To investigate the evolution of this tri-partite symbiosis, phylogenies for 18 lumbricid earthworm species were constructed based on two mitochondrial genes, NADH dehydrogenase subunit 2 (ND2 and cytochrome c oxidase subunit I (COI, and compared to their symbiont phylogenies based on RNA polymerase subunit B (rpoB and 16S rRNA genes.The two nephridial symbionts showed markedly different evolutionary histories with their hosts. For Verminephrobacter, clear signs of long-term host-symbiont co-evolution with rare host switching events confirmed its ancient association with lumbricid earthworms, likely dating back to their last common ancestor about 100 million years (MY ago. In contrast, phylogenies for the Flexibacter-like symbionts suggested an ability to switch to new hosts, to which they adapted and subsequently became species-specific. Putative co-speciation events were only observed with closely related host species; on that basis, this secondary symbiosis was estimated to be minimum 45 MY old. Based on the monophyletic clustering of the Flexibacter-like symbionts, the low 16S rRNA gene sequence similarity to the nearest described species (<92% and environmental sequences (<94.2 %, and the specific habitat in the earthworm nephridia, we propose a new candidate genus for this group, Candidatus Nephrothrix.

  12. Whole genome duplications and expansion of the vertebrate GATA transcription factor gene family

    Directory of Open Access Journals (Sweden)

    Bowerman Bruce

    2009-08-01

    Full Text Available Abstract Background GATA transcription factors influence many developmental processes, including the specification of embryonic germ layers. The GATA gene family has significantly expanded in many animal lineages: whereas diverse cnidarians have only one GATA transcription factor, six GATA genes have been identified in many vertebrates, five in many insects, and eleven to thirteen in Caenorhabditis nematodes. All bilaterian animal genomes have at least one member each of two classes, GATA123 and GATA456. Results We have identified one GATA123 gene and one GATA456 gene from the genomic sequence of two invertebrate deuterostomes, a cephalochordate (Branchiostoma floridae and a hemichordate (Saccoglossus kowalevskii. We also have confirmed the presence of six GATA genes in all vertebrate genomes, as well as additional GATA genes in teleost fish. Analyses of conserved sequence motifs and of changes to the exon-intron structure, and molecular phylogenetic analyses of these deuterostome GATA genes support their origin from two ancestral deuterostome genes, one GATA 123 and one GATA456. Comparison of the conserved genomic organization across vertebrates identified eighteen paralogous gene families linked to multiple vertebrate GATA genes (GATA paralogons, providing the strongest evidence yet for expansion of vertebrate GATA gene families via genome duplication events. Conclusion From our analysis, we infer the evolutionary birth order and relationships among vertebrate GATA transcription factors, and define their expansion via multiple rounds of whole genome duplication events. As the genomes of four independent invertebrate deuterostome lineages contain single copy GATA123 and GATA456 genes, we infer that the 0R (pre-genome duplication invertebrate deuterostome ancestor also had two GATA genes, one of each class. Synteny analyses identify duplications of paralogous chromosomal regions (paralogons, from single ancestral vertebrate GATA123 and GATA456

  13. Genomic sequence around butterfly wing development genes: annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Inês C Conceição

    Full Text Available BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes. CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1 the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2 the high

  14. Identification of Ohnolog Genes Originating from Whole Genome Duplication in Early Vertebrates, Based on Synteny Comparison across Multiple Genomes.

    Science.gov (United States)

    Singh, Param Priya; Arora, Jatin; Isambert, Hervé

    2015-07-01

    Whole genome duplications (WGD) have now been firmly established in all major eukaryotic kingdoms. In particular, all vertebrates descend from two rounds of WGDs, that occurred in their jawless ancestor some 500 MY ago. Paralogs retained from WGD, also coined 'ohnologs' after Susumu Ohno, have been shown to be typically associated with development, signaling and gene regulation. Ohnologs, which amount to about 20 to 35% of genes in the human genome, have also been shown to be prone to dominant deleterious mutations and frequently implicated in cancer and genetic diseases. Hence, identifying ohnologs is central to better understand the evolution of vertebrates and their susceptibility to genetic diseases. Early computational analyses to identify vertebrate ohnologs relied on content-based synteny comparisons between the human genome and a single invertebrate outgroup genome or within the human genome itself. These approaches are thus limited by lineage specific rearrangements in individual genomes. We report, in this study, the identification of vertebrate ohnologs based on the quantitative assessment and integration of synteny conservation between six amniote vertebrates and six invertebrate outgroups. Such a synteny comparison across multiple genomes is shown to enhance the statistical power of ohnolog identification in vertebrates compared to earlier approaches, by overcoming lineage specific genome rearrangements. Ohnolog gene families can be browsed and downloaded for three statistical confidence levels or recompiled for specific, user-defined, significance criteria at http://ohnologs.curie.fr/. In the light of the importance of WGD on the genetic makeup of vertebrates, our analysis provides a useful resource for researchers interested in gaining further insights on vertebrate evolution and genetic diseases.

  15. PanCoreGen - Profiling, detecting, annotating protein-coding genes in microbial genomes.

    Science.gov (United States)

    Paul, Sandip; Bhardwaj, Archana; Bag, Sumit K; Sokurenko, Evgeni V; Chattopadhyay, Sujay

    2015-12-01

    A large amount of genomic data, especially from multiple isolates of a single species, has opened new vistas for microbial genomics analysis. Analyzing the pan-genome (i.e. the sum of genetic repertoire) of microbial species is crucial in understanding the dynamics of molecular evolution, where virulence evolution is of major interest. Here we present PanCoreGen - a standalone application for pan- and core-genomic profiling of microbial protein-coding genes. PanCoreGen overcomes key limitations of the existing pan-genomic analysis tools, and develops an integrated annotation-structure for a species-specific pan-genomic profile. It provides important new features for annotating draft genomes/contigs and detecting unidentified genes in annotated genomes. It also generates user-defined group-specific datasets within the pan-genome. Interestingly, analyzing an example-set of Salmonella genomes, we detect potential footprints of adaptive convergence of horizontally transferred genes in two human-restricted pathogenic serovars - Typhi and Paratyphi A. Overall, PanCoreGen represents a state-of-the-art tool for microbial phylogenomics and pathogenomics study. Copyright © 2015 Elsevier Inc. All rights reserved.

  16. PanCoreGen – profiling, detecting, annotating protein-coding genes in microbial genomes

    Science.gov (United States)

    Bhardwaj, Archana; Bag, Sumit K; Sokurenko, Evgeni V.

    2015-01-01

    A large amount of genomic data, especially from multiple isolates of a single species, has opened new vistas for microbial genomics analysis. Analyzing pan-genome (i.e. the sum of genetic repertoire) of microbial species is crucial in understanding the dynamics of molecular evolution, where virulence evolution is of major interest. Here we present PanCoreGen – a standalone application for pan- and core-genomic profiling of microbial protein-coding genes. PanCoreGen overcomes key limitations of the existing pan-genomic analysis tools, and develops an integrated annotation-structure for species-specific pan-genomic profile. It provides important new features for annotating draft genomes/contigs and detecting unidentified genes in annotated genomes. It also generates user-defined group-specific datasets within the pan-genome. Interestingly, analyzing an example-set of Salmonella genomes, we detect potential footprints of adaptive convergence of horizontally transferred genes in two human-restricted pathogenic serovars – Typhi and Paratyphi A. Overall, PanCoreGen represents a state-of-the-art tool for microbial phylogenomics and pathogenomics study. PMID:26456591

  17. Genome-wide gene expression regulation as a function of genotype and age in C. elegans

    NARCIS (Netherlands)

    Viñuela Rodriguez, A.; Snoek, L.B.; Riksen, J.A.G.; Kammenga, J.E.

    2010-01-01

    Gene expression becomes more variable with age, and it is widely assumed that this is due to a decrease in expression regulation. But currently there is no understanding how gene expression regulatory patterns progress with age. Here we explored genome-wide gene expression variation and regulatory

  18. The Symbiodinium kawagutii genome illuminates dinoflagellate gene expression and coral symbiosis

    DEFF Research Database (Denmark)

    Lin, Senjie; Cheng, Shifeng; Song, Bo

    2015-01-01

    Symbiodinium-specific gene families. No whole-genome duplication was observed, but instead we found active (retro) transposition and gene family expansion, especially in processes important for successful symbiosis with corals. We also documented genes potentially governing sexual reproduction and cyst...... the molecular basis and evolution of coral symbiosis....

  19. CGUG: in silico proteome and genome parsing tool for the determination of "core" and unique genes in the analysis of genomes up to ca. 1.9 Mb

    Directory of Open Access Journals (Sweden)

    Mahadevan Padmanabhan

    2009-08-01

    Full Text Available Abstract Background Viruses and small-genome bacteria (~2 megabases and smaller comprise a considerable population in the biosphere and are of interest to many researchers. These genomes are now sequenced at an unprecedented rate and require complementary computational tools to analyze. "CoreGenesUniqueGenes" (CGUG is an in silico genome data mining tool that determines a "core" set of genes from two to five organisms with genomes in this size range. Core and unique genes may reflect similar niches and needs, and may be used in classifying organisms. Findings CGUG is available at http://binf.gmu.edu/geneorder.html as a web-based on-the-fly tool that performs iterative BLASTP analyses using a reference genome and up to four query genomes to provide a table of genes common to these genomes. The result is an in silico display of genomes and their proteomes, allowing for further analysis. CGUG can be used for "genome annotation by homology", as demonstrated with Chlamydophila and Francisella genomes. Conclusion CGUG is used to reanalyze the ICTV-based classifications of bacteriophages, to reconfirm long-standing relationships and to explore new classifications. These genomes have been problematic in the past, due largely to horizontal gene transfers. CGUG is validated as a tool for reannotating small genome bacteria using more up-to-date annotations by similarity or homology. These serve as an entry point for wet-bench experiments to confirm the functions of these "hypothetical" and "unknown" proteins.

  20. Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger

    Directory of Open Access Journals (Sweden)

    Grigoriev Igor V

    2009-02-01

    Full Text Available Abstract Background Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate over a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institutes (JGI and another predicted protein set from another A.niger sequence. Tandem mass spectra (MS/MS were acquired from 1d gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR. Results 405 identified peptide sequences were mapped to 214 different A.niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6% of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. Conclusion This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST data has been. A comparison of the published genome from another strain of A.niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method.

  1. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes

    Directory of Open Access Journals (Sweden)

    Yang Yi-Fan

    2007-03-01

    Full Text Available Abstract Background Despite a remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents from further elucidation of differences among genomes. It continues to be interesting to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. Results This paper describes a new prokaryotic genefinding algorithm based on a comprehensive statistical model of protein coding Open Reading Frames (ORFs and Translation Initiation Sites (TISs. The former is based on a linguistic "Entropy Density Profile" (EDP model of coding DNA sequence and the latter comprises several relevant features related to the translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure, before making the prediction of genes. Conclusion Results of extensive tests show that MED 2.0 achieves a competitive high performance in the gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of the MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match with the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.

  2. Transmission of nephridial bacteria of the earthworm Eisenia fetida.

    Science.gov (United States)

    Davidson, Seana K; Stahl, David A

    2006-01-01

    The lumbricid earthworms (annelid family Lumbricidae) harbor gram-negative bacteria in their excretory organs, the nephridia. Comparative 16S rRNA gene sequencing of bacteria associated with the nephridia of several earthworm species has shown that each species of worm harbors a distinct bacterial species and that the bacteria from different species form a monophyletic cluster within the genus Acidovorax, suggesting that there is a specific association resulting from radiation from a common bacterial ancestor. Previous microscopy and culture studies revealed the presence of bacteria within the egg capsules and on the surface of embryos but did not demonstrate that the bacteria within the egg capsule were the same bacteria that colonized the nephridia. We present evidence, based on curing experiments, in situ hybridizations with Acidovorax-specific probes, and 16S rRNA gene sequence analysis, that the egg capsules contain high numbers of the bacterial symbiont and that juveniles are colonized during development within the egg capsule. Studies exposing aposymbiotic hatchlings to colonized adults and their bedding material suggested that juvenile earthworms do not readily acquire bacteria from the soil after hatching but must be colonized during development by bacteria deposited in the egg capsule. Whether this is due to the developmental stage of the host or the physiological state of the symbiont remains to be investigated.

  3. Network graph analysis of gene-gene interactions in genome-wide association study data.

    Science.gov (United States)

    Lee, Sungyoung; Kwon, Min-Seok; Park, Taesung

    2012-12-01

    Most common complex traits, such as obesity, hypertension, diabetes, and cancers, are known to be associated with multiple genes, environmental factors, and their epistasis. Recently, the development of advanced genotyping technologies has allowed us to perform genome-wide association studies (GWASs). For detecting the effects of multiple genes on complex traits, many approaches have been proposed for GWASs. Multifactor dimensionality reduction (MDR) is one of the powerful and efficient methods for detecting high-order gene-gene (GxG) interactions. However, the biological interpretation of GxG interactions identified by MDR analysis is not easy. In order to aid the interpretation of MDR results, we propose a network graph analysis to elucidate the meaning of identified GxG interactions. The proposed network graph analysis consists of three steps. The first step is for performing GxG interaction analysis using MDR analysis. The second step is to draw the network graph using the MDR result. The third step is to provide biological evidence of the identified GxG interaction using external biological databases. The proposed method was applied to Korean Association Resource (KARE) data, containing 8838 individuals with 327,632 single-nucleotide polymorphisms, in order to perform GxG interaction analysis of body mass index (BMI). Our network graph analysis successfully showed that many identified GxG interactions have known biological evidence related to BMI. We expect that our network graph analysis will be helpful to interpret the biological meaning of GxG interactions.

  4. Mitochondrial Genomes of Kinorhyncha: trnM Duplication and New Gene Orders within Animals.

    Science.gov (United States)

    Popova, Olga V; Mikhailov, Kirill V; Nikitin, Mikhail A; Logacheva, Maria D; Penin, Aleksey A; Muntyan, Maria S; Kedrova, Olga S; Petrov, Nikolai B; Panchin, Yuri V; Aleoshin, Vladimir V

    2016-01-01

    Many features of mitochondrial genomes of animals, such as patterns of gene arrangement, nucleotide content and substitution rate variation are extensively used in evolutionary and phylogenetic studies. Nearly 6,000 mitochondrial genomes of animals have already been sequenced, covering the majority of animal phyla. One of the groups that escaped mitogenome sequencing is phylum Kinorhyncha-an isolated taxon of microscopic worm-like ecdysozoans. The kinorhynchs are thought to be one of the early-branching lineages of Ecdysozoa, and their mitochondrial genomes may be important for resolving evolutionary relations between major animal taxa. Here we present the results of sequencing and analysis of mitochondrial genomes from two members of Kinorhyncha, Echinoderes svetlanae (Cyclorhagida) and Pycnophyes kielensis (Allomalorhagida). Their mitochondrial genomes are circular molecules approximately 15 Kbp in size. The kinorhynch mitochondrial gene sequences are highly divergent, which precludes accurate phylogenetic inference. The mitogenomes of both species encode a typical metazoan complement of 37 genes, which are all positioned on the major strand, but the gene order is distinct and unique among Ecdysozoa or animals as a whole. We predict four types of start codons for protein-coding genes in E. svetlanae and five in P. kielensis with a consensus DTD in single letter code. The mitochondrial genomes of E. svetlanae and P. kielensis encode duplicated methionine tRNA genes that display compensatory nucleotide substitutions. Two distant species of Kinorhyncha demonstrate similar patterns of gene arrangements in their mitogenomes. Both genomes have duplicated methionine tRNA genes; the duplication predates the divergence of two species. The kinorhynchs share a few features pertaining to gene order that align them with Priapulida. Gene order analysis reveals that gene arrangement specific of Priapulida may be ancestral for Scalidophora, Ecdysozoa, and even Protostomia.

  5. Mitochondrial Genomes of Kinorhyncha: trnM Duplication and New Gene Orders within Animals.

    Directory of Open Access Journals (Sweden)

    Olga V Popova

    Full Text Available Many features of mitochondrial genomes of animals, such as patterns of gene arrangement, nucleotide content and substitution rate variation are extensively used in evolutionary and phylogenetic studies. Nearly 6,000 mitochondrial genomes of animals have already been sequenced, covering the majority of animal phyla. One of the groups that escaped mitogenome sequencing is phylum Kinorhyncha-an isolated taxon of microscopic worm-like ecdysozoans. The kinorhynchs are thought to be one of the early-branching lineages of Ecdysozoa, and their mitochondrial genomes may be important for resolving evolutionary relations between major animal taxa. Here we present the results of sequencing and analysis of mitochondrial genomes from two members of Kinorhyncha, Echinoderes svetlanae (Cyclorhagida and Pycnophyes kielensis (Allomalorhagida. Their mitochondrial genomes are circular molecules approximately 15 Kbp in size. The kinorhynch mitochondrial gene sequences are highly divergent, which precludes accurate phylogenetic inference. The mitogenomes of both species encode a typical metazoan complement of 37 genes, which are all positioned on the major strand, but the gene order is distinct and unique among Ecdysozoa or animals as a whole. We predict four types of start codons for protein-coding genes in E. svetlanae and five in P. kielensis with a consensus DTD in single letter code. The mitochondrial genomes of E. svetlanae and P. kielensis encode duplicated methionine tRNA genes that display compensatory nucleotide substitutions. Two distant species of Kinorhyncha demonstrate similar patterns of gene arrangements in their mitogenomes. Both genomes have duplicated methionine tRNA genes; the duplication predates the divergence of two species. The kinorhynchs share a few features pertaining to gene order that align them with Priapulida. Gene order analysis reveals that gene arrangement specific of Priapulida may be ancestral for Scalidophora, Ecdysozoa, and even

  6. Genome-wide comparative analysis of NBS-encoding genes between Brassica species and Arabidopsis thaliana.

    Science.gov (United States)

    Yu, Jingyin; Tehrim, Sadia; Zhang, Fengqi; Tong, Chaobo; Huang, Junyan; Cheng, Xiaohui; Dong, Caihua; Zhou, Yanqiu; Qin, Rui; Hua, Wei; Liu, Shengyi

    2014-01-03

    Plant disease resistance (R) genes with the nucleotide binding site (NBS) play an important role in offering resistance to pathogens. The availability of complete genome sequences of Brassica oleracea and Brassica rapa provides an important opportunity for researchers to identify and characterize NBS-encoding R genes in Brassica species and to compare with analogues in Arabidopsis thaliana based on a comparative genomics approach. However, little is known about the evolutionary fate of NBS-encoding genes in the Brassica lineage after split from A. thaliana. Here we present genome-wide analysis of NBS-encoding genes in B. oleracea, B. rapa and A. thaliana. Through the employment of HMM search and manual curation, we identified 157, 206 and 167 NBS-encoding genes in B. oleracea, B. rapa and A. thaliana genomes, respectively. Phylogenetic analysis among 3 species classified NBS-encoding genes into 6 subgroups. Tandem duplication and whole genome triplication (WGT) analyses revealed that after WGT of the Brassica ancestor, NBS-encoding homologous gene pairs on triplicated regions in Brassica ancestor were deleted or lost quickly, but NBS-encoding genes in Brassica species experienced species-specific gene amplification by tandem duplication after divergence of B. rapa and B. oleracea. Expression profiling of NBS-encoding orthologous gene pairs indicated the differential expression pattern of retained orthologous gene copies in B. oleracea and B. rapa. Furthermore, evolutionary analysis of CNL type NBS-encoding orthologous gene pairs among 3 species suggested that orthologous genes in B. rapa species have undergone stronger negative selection than those in B .oleracea species. But for TNL type, there are no significant differences in the orthologous gene pairs between the two species. This study is first identification and characterization of NBS-encoding genes in B. rapa and B. oleracea based on whole genome sequences. Through tandem duplication and whole genome

  7. Evidence-based gene models for structural and functional annotations of the oil palm genome.

    Science.gov (United States)

    Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie

    2017-09-08

    Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years) has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools. Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC 3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC 3 -rich genes (GC 3  ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures. We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC 3 -rich and intronless), as well as those associated with important functions, such as FA

  8. A massive incorporation of microbial genes into the genome of Tetranychus urticae, a polyphagous arthropod herbivore.

    Science.gov (United States)

    Wybouw, N; Van Leeuwen, T; Dermauw, W

    2018-06-01

    A number of horizontal gene transfers (HGTs) have been identified in the spider mite Tetranychus urticae, a chelicerate herbivore. However, the genome of this mite species has at present not been thoroughly mined for the presence of HGT genes. Here, we performed a systematic screen for HGT genes in the T. urticae genome using the h-index metric. Our results not only validated previously identified HGT genes but also uncovered 25 novel HGT genes. In addition to HGT genes with a predicted biochemical function in carbohydrate, lipid and folate metabolism, we also identified the horizontal transfer of a ketopantoate hydroxymethyltransferase and a pantoate β-alanine ligase gene. In plants and bacteria, both genes are essential for vitamin B5 biosynthesis and their presence in the mite genome strongly suggests that spider mites, similar to Bemisia tabaci and nematodes, can synthesize their own vitamin B5. We further show that HGT genes were physically embedded within the mite genome and were expressed in different life stages. By screening chelicerate genomes and transcriptomes, we were able to estimate the evolutionary histories of these HGTs during chelicerate evolution. Our study suggests that HGT has made a significant and underestimated impact on the metabolic repertoire of plant-feeding spider mites. © 2018 The Royal Entomological Society.

  9. RGmatch: matching genomic regions to proximal genes in omics data integration

    Directory of Open Access Journals (Sweden)

    Pedro Furió-Tarí

    2016-11-01

    Full Text Available Abstract Background The integrative analysis of multiple genomics data often requires that genome coordinates-based signals have to be associated with proximal genes. The relative location of a genomic region with respect to the gene (gene area is important for functional data interpretation; hence algorithms that match regions to genes should be able to deliver insight into this information. Results In this work we review the tools that are publicly available for making region-to-gene associations. We also present a novel method, RGmatch, a flexible and easy-to-use Python tool that computes associations either at the gene, transcript, or exon level, applying a set of rules to annotate each region-gene association with the region location within the gene. RGmatch can be applied to any organism as long as genome annotation is available. Furthermore, we qualitatively and quantitatively compare RGmatch to other tools. Conclusions RGmatch simplifies the association of a genomic region with its closest gene. At the same time, it is a powerful tool because the rules used to annotate these associations are very easy to modify according to the researcher’s specific interests. Some important differences between RGmatch and other similar tools already in existence are RGmatch’s flexibility, its wide range of user options, compatibility with any annotatable organism, and its comprehensive and user-friendly output.

  10. Genic regions of a large salamander genome contain long introns and novel genes

    Directory of Open Access Journals (Sweden)

    Bryant Susan V

    2009-01-01

    Full Text Available Abstract Background The basis of genome size variation remains an outstanding question because DNA sequence data are lacking for organisms with large genomes. Sixteen BAC clones from the Mexican axolotl (Ambystoma mexicanum: c-value = 32 × 109 bp were isolated and sequenced to characterize the structure of genic regions. Results Annotation of genes within BACs showed that axolotl introns are on average 10× longer than orthologous vertebrate introns and they are predicted to contain more functional elements, including miRNAs and snoRNAs. Loci were discovered within BACs for two novel EST transcripts that are differentially expressed during spinal cord regeneration and skin metamorphosis. Unexpectedly, a third novel gene was also discovered while manually annotating BACs. Analysis of human-axolotl protein-coding sequences suggests there are 2% more lineage specific genes in the axolotl genome than the human genome, but the great majority (86% of genes between axolotl and human are predicted to be 1:1 orthologs. Considering that axolotl genes are on average 5× larger than human genes, the genic component of the salamander genome is estimated to be incredibly large, approximately 2.8 gigabases! Conclusion This study shows that a large salamander genome has a correspondingly large genic component, primarily because genes have incredibly long introns. These intronic sequences may harbor novel coding and non-coding sequences that regulate biological processes that are unique to salamanders.

  11. Snf2 family gene distribution in higher plant genomes reveals DRD1 expansion and diversification in the tomato genome.

    Science.gov (United States)

    Bargsten, Joachim W; Folta, Adam; Mlynárová, Ludmila; Nap, Jan-Peter

    2013-01-01

    As part of large protein complexes, Snf2 family ATPases are responsible for energy supply during chromatin remodeling, but the precise mechanism of action of many of these proteins is largely unknown. They influence many processes in plants, such as the response to environmental stress. This analysis is the first comprehensive study of Snf2 family ATPases in plants. We here present a comparative analysis of 1159 candidate plant Snf2 genes in 33 complete and annotated plant genomes, including two green algae. The number of Snf2 ATPases shows considerable variation across plant genomes (17-63 genes). The DRD1, Rad5/16 and Snf2 subfamily members occur most often. Detailed analysis of the plant-specific DRD1 subfamily in related plant genomes shows the occurrence of a complex series of evolutionary events. Notably tomato carries unexpected gene expansions of DRD1 gene members. Most of these genes are expressed in tomato, although at low levels and with distinct tissue or organ specificity. In contrast, the Snf2 subfamily genes tend to be expressed constitutively in tomato. The results underpin and extend the Snf2 subfamily classification, which could help to determine the various functional roles of Snf2 ATPases and to target environmental stress tolerance and yield in future breeding.

  12. Snf2 family gene distribution in higher plant genomes reveals DRD1 expansion and diversification in the tomato genome.

    Directory of Open Access Journals (Sweden)

    Joachim W Bargsten

    Full Text Available As part of large protein complexes, Snf2 family ATPases are responsible for energy supply during chromatin remodeling, but the precise mechanism of action of many of these proteins is largely unknown. They influence many processes in plants, such as the response to environmental stress. This analysis is the first comprehensive study of Snf2 family ATPases in plants. We here present a comparative analysis of 1159 candidate plant Snf2 genes in 33 complete and annotated plant genomes, including two green algae. The number of Snf2 ATPases shows considerable variation across plant genomes (17-63 genes. The DRD1, Rad5/16 and Snf2 subfamily members occur most often. Detailed analysis of the plant-specific DRD1 subfamily in related plant genomes shows the occurrence of a complex series of evolutionary events. Notably tomato carries unexpected gene expansions of DRD1 gene members. Most of these genes are expressed in tomato, although at low levels and with distinct tissue or organ specificity. In contrast, the Snf2 subfamily genes tend to be expressed constitutively in tomato. The results underpin and extend the Snf2 subfamily classification, which could help to determine the various functional roles of Snf2 ATPases and to target environmental stress tolerance and yield in future breeding.

  13. Genome-wide identification of SAUR genes in watermelon (Citrullus lanatus).

    Science.gov (United States)

    Zhang, Na; Huang, Xing; Bao, Yaning; Wang, Bo; Zeng, Hongxia; Cheng, Weishun; Tang, Mi; Li, Yuhua; Ren, Jian; Sun, Yuhong

    2017-07-01

    The early auxin responsive SAUR family is an important gene family in auxin signal transduction. We here present the first report of a genome-wide identification of SAUR genes in watermelon genome. We successfully identified 65 ClaSAURs and provide a genomic framework for future study on these genes. Phylogenetic result revealed a Cucurbitaceae-specific SAUR subfamily and contribute to understanding of the evolutionary pattern of SAUR genes in plants. Quantitative RT-PCR analysis demonstrates the existed expression of 11 randomly selected SAUR genes in watermelon tissues. ClaSAUR36 was highly expressed in fruit, for which further study might bring a new prospective for watermelon fruit development. Moreover, correlation analysis revealed the similar expression profiles of SAUR genes between watermelon and Arabidopsis during shoot organogenesis. This work gives us a new support for the conserved auxin machinery in plants.

  14. Evolution of genes and genomes on the Drosophila phylogeny

    DEFF Research Database (Denmark)

    Clark, Andrew G; Eisen, Michael B; Smith, Douglas R

    2007-01-01

    Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the ......Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here...... tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila...

  15. In silico exploration of Red Sea Bacillus genomes for natural product biosynthetic gene clusters

    KAUST Repository

    Othoum, Ghofran K; Bougouffa, Salim; Razali, Rozaimi; Bokhari, Ameerah; Alamoudi, Soha; Antunes, André ; Gao, Xin; Hoehndorf, Robert; Arold, Stefan T.; Gojobori, Takashi; Hirt, Heribert; Mijakovic, Ivan; Bajic, Vladimir B.; Lafi, Feras Fawzi; Essack, Magbubah

    2018-01-01

    are better potential sources for novel antibiotics. Moreover, the genome of the Red Sea strain B. paralicheniformis Bac48 is more enriched in modular PKS genes compared to B. licheniformis strains and other B. paralicheniformis strains. This may be linked

  16. Comparative inference of duplicated genes produced by polyploidization in soybean genome.

    Science.gov (United States)

    Yang, Yanmei; Wang, Jinpeng; Di, Jianyong

    2013-01-01

    Soybean (Glycine max) is one of the most important crop plants for providing protein and oil. It is important to investigate soybean genome for its economic and scientific value. Polyploidy is a widespread and recursive phenomenon during plant evolution, and it could generate massive duplicated genes which is an important resource for genetic innovation. Improved sequence alignment criteria and statistical analysis are used to identify and characterize duplicated genes produced by polyploidization in soybean. Based on the collinearity method, duplicated genes by whole genome duplication account for 70.3% in soybean. From the statistical analysis of the molecular distances between duplicated genes, our study indicates that the whole genome duplication event occurred more than once in the genome evolution of soybean, which is often distributed near the ends of chromosomes.

  17. Site-Specific Integration of Exogenous Genes Using Genome Editing Technologies in Zebrafish

    Directory of Open Access Journals (Sweden)

    Atsuo Kawahara

    2016-05-01

    Full Text Available The zebrafish (Danio rerio is an ideal vertebrate model to investigate the developmental molecular mechanism of organogenesis and regeneration. Recent innovation in genome editing technologies, such as zinc finger nucleases (ZFNs, transcription activator-like effector nucleases (TALENs and the clustered regularly interspaced short palindromic repeats (CRISPR/CRISPR associated protein 9 (Cas9 system, have allowed researchers to generate diverse genomic modifications in whole animals and in cultured cells. The CRISPR/Cas9 and TALEN techniques frequently induce DNA double-strand breaks (DSBs at the targeted gene, resulting in frameshift-mediated gene disruption. As a useful application of genome editing technology, several groups have recently reported efficient site-specific integration of exogenous genes into targeted genomic loci. In this review, we provide an overview of TALEN- and CRISPR/Cas9-mediated site-specific integration of exogenous genes in zebrafish.

  18. Analysis of genomic imbalances and gene expression changes in transformed follicular lymphoma (FL)

    DEFF Research Database (Denmark)

    Obel, G.; Farinha, P.; Lam, W.

    2005-01-01

    American patients with transformed FL. Methods: High-resolution BAC-array comparative genomic hybridisation (CGH) was used to detect genomic imbalances. Gene expression profiling was performed using cDNA microarrays (Affymetrix). Results: Of 9 biopsy pairs identified so far, analysis results of the first 4...

  19. Genome-wide methylation analysis identifies genes silenced in non-seminoma cell lines.

    Science.gov (United States)

    Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J

    2016-01-01

    Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours' biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina infinium HumanMethylome450 bead chip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2-deoxycytidine and reverse transcription-quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes.

  20. Comparative Genomics of Non-TNL Disease Resistance Genes from Six Plant Species.

    Science.gov (United States)

    Nepal, Madhav P; Andersen, Ethan J; Neupane, Surendra; Benson, Benjamin V

    2017-09-30

    Disease resistance genes (R genes), as part of the plant defense system, have coevolved with corresponding pathogen molecules. The main objectives of this project were to identify non-Toll interleukin receptor, nucleotide-binding site, leucine-rich repeat (nTNL) genes and elucidate their evolutionary divergence across six plant genomes. Using reference sequences from Arabidopsis , we investigated nTNL orthologs in the genomes of common bean, Medicago , soybean, poplar, and rice. We used Hidden Markov Models for sequence identification, performed model-based phylogenetic analyses, visualized chromosomal positioning, inferred gene clustering, and assessed gene expression profiles. We analyzed 908 nTNL R genes in the genomes of the six plant species, and classified them into 12 subgroups based on the presence of coiled-coil (CC), nucleotide binding site (NBS), leucine rich repeat (LRR), resistance to Powdery mildew 8 (RPW8), and BED type zinc finger domains. Traditionally classified CC-NBS-LRR (CNL) genes were nested into four clades (CNL A-D) often with abundant, well-supported homogeneous subclades of Type-II R genes. CNL-D members were absent in rice, indicating a unique R gene retention pattern in the rice genome. Genomes from Arabidopsis , common bean, poplar and soybean had one chromosome without any CNL R genes. Medicago and Arabidopsis had the highest and lowest number of gene clusters, respectively. Gene expression analyses suggested unique patterns of expression for each of the CNL clades. Differential gene expression patterns of the nTNL genes were often found to correlate with number of introns and GC content, suggesting structural and functional divergence.

  1. Candidate genes revealed by a genome scan for mosquito resistance to a bacterial insecticide: sequence and gene expression variations

    Directory of Open Access Journals (Sweden)

    David Jean-Philippe

    2009-11-01

    Full Text Available Abstract Background Genome scans are becoming an increasingly popular approach to study the genetic basis of adaptation and speciation, but on their own, they are often helpless at identifying the specific gene(s or mutation(s targeted by selection. This shortcoming is hopefully bound to disappear in the near future, thanks to the wealth of new genomic resources that are currently being developed for many species. In this article, we provide a foretaste of this exciting new era by conducting a genome scan in the mosquito Aedes aegypti with the aim to look for candidate genes involved in resistance to Bacillus thuringiensis subsp. israelensis (Bti insecticidal toxins. Results The genome of a Bti-resistant and a Bti-susceptible strains was surveyed using about 500 MITE-based molecular markers, and the loci showing the highest inter-strain genetic differentiation were sequenced and mapped on the Aedes aegypti genome sequence. Several good candidate genes for Bti-resistance were identified in the vicinity of these highly differentiated markers. Two of them, coding for a cadherin and a leucine aminopeptidase, were further examined at the sequence and gene expression levels. In the resistant strain, the cadherin gene displayed patterns of nucleotide polymorphisms consistent with the action of positive selection (e.g. an excess of high compared to intermediate frequency mutations, as well as a significant under-expression compared to the susceptible strain. Conclusion Both sequence and gene expression analyses agree to suggest a role for positive selection in the evolution of this cadherin gene in the resistant strain. However, it is unlikely that resistance to Bti is conferred by this gene alone, and further investigation will be needed to characterize other genes significantly associated with Bti resistance in Ae. aegypti. Beyond these results, this article illustrates how genome scans can build on the body of new genomic information (here, full

  2. Construction of the BAC Library of Small Abalone (Haliotis diversicolor) for Gene Screening and Genome Characterization.

    Science.gov (United States)

    Jiang, Likun; You, Weiwei; Zhang, Xiaojun; Xu, Jian; Jiang, Yanliang; Wang, Kai; Zhao, Zixia; Chen, Baohua; Zhao, Yunfeng; Mahboob, Shahid; Al-Ghanim, Khalid A; Ke, Caihuan; Xu, Peng

    2016-02-01

    The small abalone (Haliotis diversicolor) is one of the most important aquaculture species in East Asia. To facilitate gene cloning and characterization, genome analysis, and genetic breeding of it, we constructed a large-insert bacterial artificial chromosome (BAC) library, which is an important genetic tool for advanced genetics and genomics research. The small abalone BAC library includes 92,610 clones with an average insert size of 120 Kb, equivalent to approximately 7.6× of the small abalone genome. We set up three-dimensional pools and super pools of 18,432 BAC clones for target gene screening using PCR method. To assess the approach, we screened 12 target genes in these 18,432 BAC clones and identified 16 positive BAC clones. Eight positive BAC clones were then sequenced and assembled with the next generation sequencing platform. The assembled contigs representing these 8 BAC clones spanned 928 Kb of the small abalone genome, providing the first batch of genome sequences for genome evaluation and characterization. The average GC content of small abalone genome was estimated as 40.33%. A total of 21 protein-coding genes, including 7 target genes, were annotated into the 8 BACs, which proved the feasibility of PCR screening approach with three-dimensional pools in small abalone BAC library. One hundred fifty microsatellite loci were also identified from the sequences for marker development in the future. The BAC library and clone pools provided valuable resources and tools for genetic breeding and conservation of H. diversicolor.

  3. Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology.

    Science.gov (United States)

    Hu, Yang; Zhou, Wenyang; Ren, Jun; Dong, Lixiang; Wang, Yadong; Jin, Shuilin; Cheng, Liang

    2016-01-01

    Increasing evidences indicated that function annotation of human genome in molecular level and phenotype level is very important for systematic analysis of genes. In this study, we presented a framework named Gene2Function to annotate Gene Reference into Functions (GeneRIFs), in which each functional description of GeneRIFs could be annotated by a text mining tool Open Biomedical Annotator (OBA), and each Entrez gene could be mapped to Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all the records about human genes of GeneRIFs, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource of human genome. High consistency of term frequency of individual gene (Pearson correlation = 0.6401, p = 2.2e - 16) and gene frequency of individual term (Pearson correlation = 0.1298, p = 3.686e - 14) in GeneRIFs and GOA shows our annotation resource is very reliable.

  4. Regulation of gene expression in Mycoplasmas: contribution from Mycoplasma hyopneumoniae and Mycoplasma synoviae genome sequences

    Directory of Open Access Journals (Sweden)

    Humberto Maciel França Madeira

    2007-01-01

    Full Text Available This report describes the transcription apparatus of Mycoplasma hyopneumoniae (strains J and 7448 and Mycoplasma synoviae, using a comparative genomics approach to summarize the main features related to transcription and control of gene expression in mycoplasmas. Most of the transcription-related genes present in the three strains are well conserved among mycoplasmas. Some unique aspects of transcription in mycoplasmas and the scarcity of regulatory proteins in mycoplasma genomes are discussed.

  5. The Fanconi anemia/BRCA gene network in zebrafish: Embryonic expression and comparative genomics

    OpenAIRE

    Titus, Tom A.; Yan, Yi-Lin; Wilson, Catherine; Starks, Amber M.; Frohnmayer, Jonathan D.; Canestro, Cristian; Rodriguez-Mari, Adriana; He, Xinjun; Postlethwait, John H.

    2008-01-01

    Fanconi anemia (FA) is a genic disease resulting in bone marrow failure, high cancer risks, and infertility, and developmental anomalies including microphthalmia, microcephaly, hypoplastic radius and thumb. Here we present cDNA sequences, genetic mapping, and genomic analyses for the four previously undescribed zebrafish FA genes (fanci, fancj, fancm, and fancn, and show that they reverted to single copy after the teleost genome duplication. We tested the hypothesis that FA genes are expresse...

  6. Cross-family translational genomics of abiotic stress-responsive genes between Arabidopsis and Medicago truncatula.

    Directory of Open Access Journals (Sweden)

    Daejin Hyung

    Full Text Available Cross-species translation of genomic information may play a pivotal role in applying biological knowledge gained from relatively simple model system to other less studied, but related, genomes. The information of abiotic stress (ABS-responsive genes in Arabidopsis was identified and translated into the legume model system, Medicago truncatula. Various data resources, such as TAIR/AtGI DB, expression profiles and literatures, were used to build a genome-wide list of ABS genes. tBlastX/BlastP similarity search tools and manual inspection of alignments were used to identify orthologous genes between the two genomes. A total of 1,377 genes were finally collected and classified into 18 functional criteria of gene ontology (GO. The data analysis according to the expression cues showed that there was substantial level of interaction among three major types (i.e., drought, salinity and cold stress of abiotic stresses. In an attempt to translate the ABS genes between these two species, genomic locations for each gene were mapped using an in-house-developed comparative analysis platform. The comparative analysis revealed that fragmental colinearity, represented by only 37 synteny blocks, existed between Arabidopsis and M. truncatula. Based on the combination of E-value and alignment remarks, estimated translation rate was 60.2% for this cross-family translation. As a prelude of the functional comparative genomic approaches, in-silico gene network/interactome analyses were conducted to predict key components in the ABS responses, and one of the sub-networks was integrated with corresponding comparative map. The results demonstrated that core members of the sub-network were well aligned with previously reported ABS regulatory networks. Taken together, the results indicate that network-based integrative approaches of comparative and functional genomics are important to interpret and translate genomic information for complex traits such as abiotic stresses.

  7. Genome-scale analysis of positional clustering of mouse testis-specific genes

    Directory of Open Access Journals (Sweden)

    Lee Bernett TK

    2005-01-01

    Full Text Available Abstract Background Genes are not randomly distributed on a chromosome as they were thought even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and recently reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.

  8. Earthworms and the soil greenhouse gas balance

    NARCIS (Netherlands)

    Lubbers, I.M.

    2014-01-01

    Earthworms play an essential part in determining the greenhouse gas (GHG) balance of soils worldwide. Their activity affects both biotic and abiotic soil properties, which in turn influence soil GHG emissions, carbon (C) sequestration and plant growth. Yet, the balance of earthworms

  9. Abundance of earthworms in Nigerian ecological zones ...

    African Journals Online (AJOL)

    ... that if the illegal annual bush burning is prevented, the soil surface will be naturally mulched, earthworms protected, and by their function in the soil, the need for soil mechanization and fertilization could be replaced by earthworms to produce natural foods. Keywords: Annelida, Oligochaeta, leaf litter breakdown, soil fauna ...

  10. Prospects: the tomato genome as a cornerstone for gene discovery

    Science.gov (United States)

    Those involved in the international tomato genome sequencing effort contributed to not only the development of an important genome sequence relevant to a major economic and nutritional crop, but also to the tomato experimental system as a model for plant biology. Without question, prior seminal work...

  11. Genome Mutational and Transcriptional Hotspots Are Traps for Duplicated Genes and Sources of Adaptations.

    Science.gov (United States)

    Fares, Mario A; Sabater-Muñoz, Beatriz; Toft, Christina

    2017-05-01

    Gene duplication generates new genetic material, which has been shown to lead to major innovations in unicellular and multicellular organisms. A whole-genome duplication occurred in the ancestor of Saccharomyces yeast species but 92% of duplicates returned to single-copy genes shortly after duplication. The persisting duplicated genes in Saccharomyces led to the origin of major metabolic innovations, which have been the source of the unique biotechnological capabilities in the Baker's yeast Saccharomyces cerevisiae. What factors have determined the fate of duplicated genes remains unknown. Here, we report the first demonstration that the local genome mutation and transcription rates determine the fate of duplicates. We show, for the first time, a preferential location of duplicated genes in the mutational and transcriptional hotspots of S. cerevisiae genome. The mechanism of duplication matters, with whole-genome duplicates exhibiting different preservation trends compared to small-scale duplicates. Genome mutational and transcriptional hotspots are rich in duplicates with large repetitive promoter elements. Saccharomyces cerevisiae shows more tolerance to deleterious mutations in duplicates with repetitive promoter elements, which in turn exhibit higher transcriptional plasticity against environmental perturbations. Our data demonstrate that the genome traps duplicates through the accelerated regulatory and functional divergence of their gene copies providing a source of novel adaptations in yeast. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  12. Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments

    LENUS (Irish Health Repository)

    OhEigeartaigh, Sean S

    2011-07-26

    Abstract Background In standard BLAST searches, no information other than the sequences of the query and the database entries is considered. However, in situations where two genes from different species have only borderline similarity in a BLAST search, the discovery that the genes are located within a region of conserved gene order (synteny) can provide additional evidence that they are orthologs. Thus, for interpreting borderline search results, it would be useful to know whether the syntenic context of a database hit is similar to that of the query. This principle has often been used in investigations of particular genes or genomic regions, but to our knowledge it has never been implemented systematically. Results We made use of the synteny information contained in the Yeast Gene Order Browser database for 11 yeast species to carry out a systematic search for protein-coding genes that were overlooked in the original annotations of one or more yeast genomes but which are syntenic with their orthologs. Such genes tend to have been overlooked because they are short, highly divergent, or contain introns. The key features of our software - called SearchDOGS - are that the database entries are classified into sets of genomic segments that are already known to be orthologous, and that very weak BLAST hits are retained for further analysis if their genomic location is similar to that of the query. Using SearchDOGS we identified 595 additional protein-coding genes among the 11 yeast species, including two new genes in Saccharomyces cerevisiae. We found additional genes for the mating pheromone a-factor in six species including Kluyveromyces lactis. Conclusions SearchDOGS has proven highly successful for identifying overlooked genes in the yeast genomes. We anticipate that our approach can be adapted for study of further groups of species, such as bacterial genomes. More generally, the concept of doing sequence similarity searches against databases to which external

  13. Metal accumulation in earthworms inhabiting floodplain soils

    International Nuclear Information System (INIS)

    Vijver, Martina G.; Vink, Jos P.M.; Miermans, Cornelis J.H.; Gestel, Cornelis A.M. van

    2007-01-01

    The main factors contributing to variation in metal concentrations in earthworms inhabiting floodplain soils were investigated in three floodplains differing in inundation frequency and vegetation type. Metal concentrations in epigeic earthworms showed larger seasonal variations than endogeic earthworms. Variation in internal levels between sampling intervals were largest in earthworms from floodplain sites frequently inundated. High and low frequency flooding did not result in consistent changes in internal metal concentrations. Vegetation types of the floodplains did not affect metal levels in Lumbricus rubellus, except for internal Cd levels, which were positively related to the presence of organic litter. Internal levels of most essential metals were higher in spring. In general, no clear patterns in metal uptake were found and repetition of the sampling campaign will probably yield different results. - Metal levels in earthworms show large variation among sites, among seasons and among epigeic and endogeic species

  14. Profiling of gene duplication patterns of sequenced teleost genomes: evidence for rapid lineage-specific genome expansion mediated by recent tandem duplications.

    Science.gov (United States)

    Lu, Jianguo; Peatman, Eric; Tang, Haibao; Lewis, Joshua; Liu, Zhanjiang

    2012-06-15

    Gene duplication has had a major impact on genome evolution. Localized (or tandem) duplication resulting from unequal crossing over and whole genome duplication are believed to be the two dominant mechanisms contributing to vertebrate genome evolution. While much scrutiny has been directed toward discerning patterns indicative of whole-genome duplication events in teleost species, less attention has been paid to the continuous nature of gene duplications and their impact on the size, gene content, functional diversity, and overall architecture of teleost genomes. Here, using a Markov clustering algorithm directed approach we catalogue and analyze patterns of gene duplication in the four model teleost species with chromosomal coordinates: zebrafish, medaka, stickleback, and Tetraodon. Our analyses based on set size, duplication type, synonymous substitution rate (Ks), and gene ontology emphasize shared and lineage-specific patterns of genome evolution via gene duplication. Most strikingly, our analyses highlight the extraordinary duplication and retention rate of recent duplicates in zebrafish and their likely role in the structural and functional expansion of the zebrafish genome. We find that the zebrafish genome is remarkable in its large number of duplicated genes, small duplicate set size, biased Ks distribution toward minimal mutational divergence, and proportion of tandem and intra-chromosomal duplicates when compared with the other teleost model genomes. The observed gene duplication patterns have played significant roles in shaping the architecture of teleost genomes and appear to have contributed to the recent functional diversification and divergence of important physiological processes in zebrafish. We have analyzed gene duplication patterns and duplication types among the available teleost genomes and found that a large number of genes were tandemly and intrachromosomally duplicated, suggesting their origin of independent and continuous duplication

  15. The FUN of identifying gene function in bacterial pathogens; insights from Salmonella functional genomics.

    Science.gov (United States)

    Hammarlöf, Disa L; Canals, Rocío; Hinton, Jay C D

    2013-10-01

    The availability of thousands of genome sequences of bacterial pathogens poses a particular challenge because each genome contains hundreds of genes of unknown function (FUN). How can we easily discover which FUN genes encode important virulence factors? One solution is to combine two different functional genomic approaches. First, transcriptomics identifies bacterial FUN genes that show differential expression during the process of mammalian infection. Second, global mutagenesis identifies individual FUN genes that the pathogen requires to cause disease. The intersection of these datasets can reveal a small set of candidate genes most likely to encode novel virulence attributes. We demonstrate this approach with the Salmonella infection model, and propose that a similar strategy could be used for other bacterial pathogens. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. Collective Dynamics of Specific Gene Ensembles Crucial for Neutrophil Differentiation: The Existence of Genome Vehicles Revealed

    Science.gov (United States)

    Giuliani, Alessandro; Tomita, Masaru

    2010-01-01

    Cell fate decision remarkably generates specific cell differentiation path among the multiple possibilities that can arise through the complex interplay of high-dimensional genome activities. The coordinated action of thousands of genes to switch cell fate decision has indicated the existence of stable attractors guiding the process. However, origins of the intracellular mechanisms that create “cellular attractor” still remain unknown. Here, we examined the collective behavior of genome-wide expressions for neutrophil differentiation through two different stimuli, dimethyl sulfoxide (DMSO) and all-trans-retinoic acid (atRA). To overcome the difficulties of dealing with single gene expression noises, we grouped genes into ensembles and analyzed their expression dynamics in correlation space defined by Pearson correlation and mutual information. The standard deviation of correlation distributions of gene ensembles reduces when the ensemble size is increased following the inverse square root law, for both ensembles chosen randomly from whole genome and ranked according to expression variances across time. Choosing the ensemble size of 200 genes, we show the two probability distributions of correlations of randomly selected genes for atRA and DMSO responses overlapped after 48 hours, defining the neutrophil attractor. Next, tracking the ranked ensembles' trajectories, we noticed that only certain, not all, fall into the attractor in a fractal-like manner. The removal of these genome elements from the whole genomes, for both atRA and DMSO responses, destroys the attractor providing evidence for the existence of specific genome elements (named “genome vehicle”) responsible for the neutrophil attractor. Notably, within the genome vehicles, genes with low or moderate expression changes, which are often considered noisy and insignificant, are essential components for the creation of the neutrophil attractor. Further investigations along with our findings might

  17. Advancing Eucalyptus genomics: identification and sequencing of lignin biosynthesis genes from deep-coverage BAC libraries

    Directory of Open Access Journals (Sweden)

    Kudrna David

    2011-03-01

    Full Text Available Abstract Background Eucalyptus species are among the most planted hardwoods in the world because of their rapid growth, adaptability and valuable wood properties. The development and integration of genomic resources into breeding practice will be increasingly important in the decades to come. Bacterial artificial chromosome (BAC libraries are key genomic tools that enable positional cloning of important traits, synteny evaluation, and the development of genome framework physical maps for genetic linkage and genome sequencing. Results We describe the construction and characterization of two deep-coverage BAC libraries EG_Ba and EG_Bb obtained from nuclear DNA fragments of E. grandis (clone BRASUZ1 digested with HindIII and BstYI, respectively. Genome coverages of 17 and 15 haploid genome equivalents were estimated for EG_Ba and EG_Bb, respectively. Both libraries contained large inserts, with average sizes ranging from 135 Kb (Eg_Bb to 157 Kb (Eg_Ba, very low extra-nuclear genome contamination providing a probability of finding a single copy gene ≥ 99.99%. Libraries were screened for the presence of several genes of interest via hybridizations to high-density BAC filters followed by PCR validation. Five selected BAC clones were sequenced and assembled using the Roche GS FLX technology providing the whole sequence of the E. grandis chloroplast genome, and complete genomic sequences of important lignin biosynthesis genes. Conclusions The two E. grandis BAC libraries described in this study represent an important milestone for the advancement of Eucalyptus genomics and forest tree research. These BAC resources have a highly redundant genome coverage (> 15×, contain large average inserts and have a very low percentage of clones with organellar DNA or empty vectors. These publicly available BAC libraries are thus suitable for a broad range of applications in genetic and genomic research in Eucalyptus and possibly in related species of Myrtaceae

  18. Genome-wide identification and characterization of WRKY gene family in peanut

    Directory of Open Access Journals (Sweden)

    Hui eSong

    2016-04-01

    Full Text Available WRKY, an important transcription factor family, is widely distributed in the plant kingdom. Many reports focused on analysis of phylogenetic relationship and biological function of WRKY protein at the whole genome level in different plant species. However, little is known about WRKY proteins in the genome of Arachis species and their response to salicylic acid (SA and jasmonic acid (JA treatment. In this study, we identified 77 and 75 WRKY proteins from the two wild ancestral diploid genomes of cultivated tetraploid peanut, Arachis duranensis and Arachis ipaënsis, using bioinformatics approaches. Most peanut WRKY coding genes were located on A. duranensis chromosome A6 and A. ipaënsis chromosome B3, while the least number of WRKY genes was found in chromosome 9. The WRKY orthologous gene pairs in A. duranensis and A. ipaënsis chromosomes were highly syntenic. Our analysis indicated that segmental duplication events played a major role in AdWRKY and AiWRKY genes, and strong purifying selection was observed in gene duplication pairs. Furthermore, we translate the knowledge gained from the genome-wide analysis result of wild ancestral peanut to cultivated peanut to reveal that gene activities of specific cultivated peanut WRKY gene were changed due to SA and JA treatment. Peanut WRKY7, 8 and 13 genes were down-regulated, whereas WRKY1 and 12 genes were up-regulated with SA and JA treatment. These results could provide valuable information for peanut improvement.

  19. Genome-Wide Identification and Characterization of WRKY Gene Family in Peanut.

    Science.gov (United States)

    Song, Hui; Wang, Pengfei; Lin, Jer-Young; Zhao, Chuanzhi; Bi, Yuping; Wang, Xingjun

    2016-01-01

    WRKY, an important transcription factor family, is widely distributed in the plant kingdom. Many reports focused on analysis of phylogenetic relationship and biological function of WRKY protein at the whole genome level in different plant species. However, little is known about WRKY proteins in the genome of Arachis species and their response to salicylic acid (SA) and jasmonic acid (JA) treatment. In this study, we identified 77 and 75 WRKY proteins from the two wild ancestral diploid genomes of cultivated tetraploid peanut, Arachis duranensis and Arachis ipaënsis, using bioinformatics approaches. Most peanut WRKY coding genes were located on A. duranensis chromosome A6 and A. ipaënsis chromosome B3, while the least number of WRKY genes was found in chromosome 9. The WRKY orthologous gene pairs in A. duranensis and A. ipaënsis chromosomes were highly syntenic. Our analysis indicated that segmental duplication events played a major role in AdWRKY and AiWRKY genes, and strong purifying selection was observed in gene duplication pairs. Furthermore, we translate the knowledge gained from the genome-wide analysis result of wild ancestral peanut to cultivated peanut to reveal that gene activities of specific cultivated peanut WRKY gene were changed due to SA and JA treatment. Peanut WRKY7, 8 and 13 genes were down-regulated, whereas WRKY1 and 12 genes were up-regulated with SA and JA treatment. These results could provide valuable information for peanut improvement.

  20. Leishmania naiffi and Leishmania guyanensis reference genomes highlight genome structure and gene evolution in the Viannia subgenus.

    Science.gov (United States)

    Coughlan, Simone; Taylor, Ali Shirley; Feane, Eoghan; Sanders, Mandy; Schonian, Gabriele; Cotton, James A; Downing, Tim

    2018-04-01

    The unicellular protozoan parasite Leishmania causes the neglected tropical disease leishmaniasis, affecting 12 million people in 98 countries. In South America, where the Viannia subgenus predominates, so far only L. ( Viannia ) braziliensis and L. ( V. ) panamensis have been sequenced, assembled and annotated as reference genomes. Addressing this deficit in molecular information can inform species typing, epidemiological monitoring and clinical treatment. Here, L. ( V. ) naiffi and L. ( V. ) guyanensis genomic DNA was sequenced to assemble these two genomes as draft references from short sequence reads. The methods used were tested using short sequence reads for L. braziliensis M2904 against its published reference as a comparison. This assembly and annotation pipeline identified 70 additional genes not annotated on the original M2904 reference. Phylogenetic and evolutionary comparisons of L. guyanensis and L. naiffi with 10 other Viannia genomes revealed four traits common to all Viannia : aneuploidy, 22 orthologous groups of genes absent in other Leishmania subgenera, elevated TATE transposon copies and a high NADH-dependent fumarate reductase gene copy number. Within the Viannia , there were limited structural changes in genome architecture specific to individual species: a 45 Kb amplification on chromosome 34 was present in all bar L. lainsoni , L. naiffi had a higher copy number of the virulence factor leishmanolysin, and laboratory isolate L. shawi M8408 had a possible minichromosome derived from the 3' end of chromosome 34 . This combination of genome assembly, phylogenetics and comparative analysis across an extended panel of diverse Viannia has uncovered new insights into the origin and evolution of this subgenus and can help improve diagnostics for leishmaniasis surveillance.

  1. PSP: rapid identification of orthologous coding genes under positive selection across multiple closely related prokaryotic genomes.

    Science.gov (United States)

    Su, Fei; Ou, Hong-Yu; Tao, Fei; Tang, Hongzhi; Xu, Ping

    2013-12-27

    With genomic sequences of many closely related bacterial strains made available by deep sequencing, it is now possible to investigate trends in prokaryotic microevolution. Positive selection is a sub-process of microevolution, in which a particular mutation is favored, causing the allele frequency to continuously shift in one direction. Wide scanning of prokaryotic genomes has shown that positive selection at the molecular level is much more frequent than expected. Genes with significant positive selection may play key roles in bacterial adaption to different environmental pressures. However, selection pressure analyses are computationally intensive and awkward to configure. Here we describe an open access web server, which is designated as PSP (Positive Selection analysis for Prokaryotic genomes) for performing evolutionary analysis on orthologous coding genes, specially designed for rapid comparison of dozens of closely related prokaryotic genomes. Remarkably, PSP facilitates functional exploration at the multiple levels by assignments and enrichments of KO, GO or COG terms. To illustrate this user-friendly tool, we analyzed Escherichia coli and Bacillus cereus genomes and found that several genes, which play key roles in human infection and antibiotic resistance, show significant evidence of positive selection. PSP is freely available to all users without any login requirement at: http://db-mml.sjtu.edu.cn/PSP/. PSP ultimately allows researchers to do genome-scale analysis for evolutionary selection across multiple prokaryotic genomes rapidly and easily, and identify the genes undergoing positive selection, which may play key roles in the interactions of host-pathogen and/or environmental adaptation.

  2. Network Graph Analysis of Gene-Gene Interactions in Genome-Wide Association Study Data

    Directory of Open Access Journals (Sweden)

    Sungyoung Lee

    2012-12-01

    Full Text Available Most common complex traits, such as obesity, hypertension, diabetes, and cancers, are known to be associated with multiple genes, environmental factors, and their epistasis. Recently, the development of advanced genotyping technologies has allowed us to perform genome-wide association studies (GWASs. For detecting the effects of multiple genes on complex traits, many approaches have been proposed for GWASs. Multifactor dimensionality reduction (MDR is one of the powerful and efficient methods for detecting high-order gene-gene (GxG interactions. However, the biological interpretation of GxG interactions identified by MDR analysis is not easy. In order to aid the interpretation of MDR results, we propose a network graph analysis to elucidate the meaning of identified GxG interactions. The proposed network graph analysis consists of three steps. The first step is for performing GxG interaction analysis using MDR analysis. The second step is to draw the network graph using the MDR result. The third step is to provide biological evidence of the identified GxG interaction using external biological databases. The proposed method was applied to Korean Association Resource (KARE data, containing 8838 individuals with 327,632 single-nucleotide polymorphisms, in order to perform GxG interaction analysis of body mass index (BMI. Our network graph analysis successfully showed that many identified GxG interactions have known biological evidence related to BMI. We expect that our network graph analysis will be helpful to interpret the biological meaning of GxG interactions.

  3. Directed evolution combined with synthetic biology strategies expedite semi-rational engineering of genes and genomes.

    Science.gov (United States)

    Kang, Zhen; Zhang, Junli; Jin, Peng; Yang, Sen

    2015-01-01

    Owing to our limited understanding of the relationship between sequence and function and the interaction between intracellular pathways and regulatory systems, the rational design of enzyme-coding genes and de novo assembly of a brand-new artificial genome for a desired functionality or phenotype are difficult to achieve. As an alternative approach, directed evolution has been widely used to engineer genomes and enzyme-coding genes. In particular, significant developments toward DNA synthesis, DNA assembly (in vitro or in vivo), recombination-mediated genetic engineering, and high-throughput screening techniques in the field of synthetic biology have been matured and widely adopted, enabling rapid semi-rational genome engineering to generate variants with desired properties. In this commentary, these novel tools and their corresponding applications in the directed evolution of genomes and enzymes are discussed. Moreover, the strategies for genome engineering and rapid in vitro enzyme evolution are also proposed.

  4. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

    DEFF Research Database (Denmark)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.

    2005-01-01

    years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences......We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each...... between the species-but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence...

  5. Visual Comparison of Multiple Gene Expression Datasets in a Genomic Context

    Directory of Open Access Journals (Sweden)

    Borowski Krzysztof

    2008-06-01

    Full Text Available The need for novel methods of visualizing microarray data is growing. New perspectives are beneficial to finding patterns in expression data. The Bluejay genome browser provides an integrative way of visualizing gene expression datasets in a genomic context. We have now developed the functionality to display multiple microarray datasets simultaneously in Bluejay, in order to provide researchers with a comprehensive view of their datasets linked to a graphical representation of gene function. This will enable biologists to obtain valuable insights on expression patterns, by allowing them to analyze the expression values in relation to the gene locations as well as to compare expression profiles of related genomes or of di erent experiments for the same genome.

  6. [A review of the genomic and gene cloning studies in trees].

    Science.gov (United States)

    Yin, Tong-Ming

    2010-07-01

    Supported by the Department of Energy (DOE) of U.S., the first tree genome, black cottonwood (Populus trichocarpa), has been completely sequenced and publicly release. This is the milestone that indicates the beginning of post-genome era for forest trees. Identification and cloning genes underlying important traits are one of the main tasks for the post-genome-era tree genomic studies. Recently, great achievements have been made in cloning genes coordinating important domestication traits in some crops, such as rice, tomato, maize and so on. Molecular breeding has been applied in the practical breeding programs for many crops. By contrast, molecular studies in trees are lagging behind. Trees possess some characteristics that make them as difficult organisms for studying on locating and cloning of genes. With the advances in techniques, given also the fast growth of tree genomic resources, great achievements are desirable in cloning unknown genes from trees, which will facilitate tree improvement programs by means of molecular breeding. In this paper, the author reviewed the progress in tree genomic and gene cloning studies, and prospected the future achievements in order to provide a useful reference for researchers working in this area.

  7. Characterization of genomic sequence of a drought-resistant gene ...

    Indian Academy of Sciences (India)

    to study the genomics of polyploid plants, as most pro- genitors have been ... had been shown to constitute significant stress in pilot exper- iments. Untreated ... Southern blotting, real-time quantitative PCR and total soluble sugar analysis.

  8. Comparative Genomics Reveals the Core Gene Toolbox for the Fungus-Insect Symbiosis

    Science.gov (United States)

    Stata, Matt; Wang, Wei; White, Merlin M.; Moncalvo, Jean-Marc

    2018-01-01

    ABSTRACT Modern genomics has shed light on many entomopathogenic fungi and expanded our knowledge widely; however, little is known about the genomic features of the insect-commensal fungi. Harpellales are obligate commensals living in the digestive tracts of disease-bearing insects (black flies, midges, and mosquitoes). In this study, we produced and annotated whole-genome sequences of nine Harpellales taxa and conducted the first comparative analyses to infer the genomic diversity within the members of the Harpellales. The genomes of the insect gut fungi feature low (26% to 37%) GC content and large genome size variations (25 to 102 Mb). Further comparisons with insect-pathogenic fungi (from both Ascomycota and Zoopagomycota), as well as with free-living relatives (as negative controls), helped to identify a gene toolbox that is essential to the fungus-insect symbiosis. The results not only narrow the genomic scope of fungus-insect interactions from several thousands to eight core players but also distinguish host invasion strategies employed by insect pathogens and commensals. The genomic content suggests that insect commensal fungi rely mostly on adhesion protein anchors that target digestive system, while entomopathogenic fungi have higher numbers of transmembrane helices, signal peptides, and pathogen-host interaction (PHI) genes across the whole genome and enrich genes as well as functional domains to inactivate the host inflammation system and suppress the host defense. Phylogenomic analyses have revealed that genome sizes of Harpellales fungi vary among lineages with an integer-multiple pattern, which implies that ancient genome duplications may have occurred within the gut of insects. PMID:29764946

  9. Gene Discovery through Genomic Sequencing of Brucella abortus

    OpenAIRE

    Sánchez, Daniel O.; Zandomeni, Ruben O.; Cravero, Silvio; Verdún, Ramiro E.; Pierrou, Ester; Faccio, Paula; Diaz, Gabriela; Lanzavecchia, Silvia; Agüero, Fernán; Frasch, Alberto C. C.; Andersson, Siv G. E.; Rossetti, Osvaldo L.; Grau, Oscar; Ugalde, Rodolfo A.

    2001-01-01

    Brucella abortus is the etiological agent of brucellosis, a disease that affects bovines and human. We generated DNA random sequences from the genome of B. abortus strain 2308 in order to characterize molecular targets that might be useful for developing immunological or chemotherapeutic strategies against this pathogen. The partial sequencing of 1,899 clones allowed the identification of 1,199 genomic sequence surveys (GSSs) with high homology (BLAST expect value < 10−5) to sequences deposit...

  10. Genome-wide analysis of WRKY gene family in the sesame genome and identification of the WRKY genes involved in responses to abiotic stresses.

    Science.gov (United States)

    Li, Donghua; Liu, Pan; Yu, Jingyin; Wang, Linhai; Dossa, Komivi; Zhang, Yanxin; Zhou, Rong; Wei, Xin; Zhang, Xiurong

    2017-09-11

    Sesame (Sesamum indicum L.) is one of the world's most important oil crops. However, it is susceptible to abiotic stresses in general, and to waterlogging and drought stresses in particular. The molecular mechanisms of abiotic stress tolerance in sesame have not yet been elucidated. The WRKY domain transcription factors play significant roles in plant growth, development, and responses to stresses. However, little is known about the number, location, structure, molecular phylogenetics, and expression of the WRKY genes in sesame. We performed a comprehensive study of the WRKY gene family in sesame and identified 71 SiWRKYs. In total, 65 of these genes were mapped to 15 linkage groups within the sesame genome. A phylogenetic analysis was performed using a related species (Arabidopsis thaliana) to investigate the evolution of the sesame WRKY genes. Tissue expression profiles of the WRKY genes demonstrated that six SiWRKY genes were highly expressed in all organs, suggesting that these genes may be important for plant growth and organ development in sesame. Analysis of the SiWRKY gene expression patterns revealed that 33 and 26 SiWRKYs respond strongly to waterlogging and drought stresses, respectively. Changes in the expression of 12 SiWRKY genes were observed at different times after the waterlogging and drought treatments had begun, demonstrating that sesame gene expression patterns vary in response to abiotic stresses. In this study, we analyzed the WRKY family of transcription factors encoded by the sesame genome. Insight was gained into the classification, evolution, and function of the SiWRKY genes, revealing their putative roles in a variety of tissues. Responses to abiotic stresses in different sesame cultivars were also investigated. The results of our study provide a better understanding of the structures and functions of sesame WRKY genes and suggest that manipulating these WRKYs could enhance resistance to waterlogging and drought.

  11. Local coexpression domains of two to four genes in the genome of Arabidopsis

    NARCIS (Netherlands)

    Ren, X.Y.; Fiers, M.W.E.J.; Stiekema, W.J.; Nap, J.P.H.

    2005-01-01

    Expression of genes in eukaryotic genomes is known to cluster, but cluster size is generally loosely defined and highly variable. We have here taken a very strict definition of cluster as sets of physically adjacent genes that are highly coexpressed and form so-called local coexpression domains. The

  12. Genome-wide identification of structural variants in genes encoding drug targets

    DEFF Research Database (Denmark)

    Rasmussen, Henrik Berg; Dahmcke, Christina Mackeprang

    2012-01-01

    The objective of the present study was to identify structural variants of drug target-encoding genes on a genome-wide scale. We also aimed at identifying drugs that are potentially amenable for individualization of treatments based on knowledge about structural variation in the genes encoding...

  13. Genome-wide SNP association-based localization of a dwarfism gene in Friesian dwarf horses

    NARCIS (Netherlands)

    Orr, J.L.; Back, W.; Gu, J.; Leegwater, P.H.; Govindarajan, P.; Conroy, J.; Ducro, B.J.; Arendonk, van J.A.M.

    2010-01-01

    The recent completion of the horse genome and commercial availability of an equine SNP genotyping array has facilitated the mapping of disease genes. We report putative localization of the gene responsible for dwarfism, a trait in Friesian horses that is thought to have a recessive mode of

  14. A systematic genome-wide analysis of zebrafish protein-coding gene function

    NARCIS (Netherlands)

    Kettleborough, R.N.; Busch-Nentwich, E.M.; Harvey, S.A.; Dooley, C.M.; de Bruijn, E.; van Eeden, F.; Sealy, I.; White, R.J.; Herd, C.; Nijman, I.J.; Fenyes, F.; Mehroke, S.; Scahill, C.; Gibbons, R.; Wali, N.; Carruthers, S.; Hall, A.; Yen, J.; Cuppen, E.; Stemple, D.L.

    2013-01-01

    Since the publication of the human reference genome, the identities of specific genes associated with human diseases are being discovered at a rapid rate. A central problem is that the biological activity of these genes is often unclear. Detailed investigations in model vertebrate organisms,

  15. Isolation and characterization of the genomic region from Drosophila kuntzei containing the Adh and Adhr genes

    NARCIS (Netherlands)

    Oppentocht, JE; van Delden, W; van de Zande, L

    The nucleotide sequences of the Adh and Adhr genes of Drosophila kuntzei were derived from combined overlapping sequences of clones isolated from a genomic library and from cloned PCR and inverse-PCR fragments. Only a proximal promoter was detected upstream of the Adh gene, indicating that D.

  16. Gene finding with a hidden Markov model of genome structure and evolution

    DEFF Research Database (Denmark)

    Pedersen, Jakob Skou; Hein, Jotun

    2003-01-01

    the model are linear in alignment length and genome number. The model is applied to the problem of gene finding. The benefit of modelling sequence evolution is demonstrated both in a range of simulations and on a set of orthologous human/mouse gene pairs. AVAILABILITY: Free availability over the Internet...

  17. Confluence of genes, environment, development, and behavior in a post Genome-Wide Association Study world

    DEFF Research Database (Denmark)

    Vrieze, S. I.; Iacono, W. G.; McGue, M.

    2012-01-01

    This article serves to outline a research paradigm to investigate main effects and interactions of genes, environment, and development on behavior and psychiatric illness. We provide a historical context for candidate gene studies and genome-wide association studies, including benefits, limitations...

  18. Identification of putative noncoding RNA genes in the Burkholderia cenocepacia J2315 genome

    DEFF Research Database (Denmark)

    Coenye, T.; Drevinek, P.; Mahenthiralingam, E.

    2007-01-01

    Noncoding RNA (ncRNA) genes are not involved in the production of mRNA and proteins, but produce transcripts that function directly as structural or regulatory RNAs. In the present study, the presence of ncRNA genes in the genome of Burkholderia cenocepacia J2315 was evaluated by combining...

  19. Nutrition-gene interactions (post-genomics). Changes in gene expression through nutritional manipulations

    International Nuclear Information System (INIS)

    Harper, G.S.; Lehnert, S.A.; Greenwood, P.L.

    2005-01-01

    This paper discusses the effects of severe nutritional restriction, both pre- and post-weaning, on development of skeletal muscle in food animals. Given recent predictions about growth in demand for muscle-foods in developing countries, the global community will need to face the food-feed dilemma, and balance efficiency of production against the quality-of-life aspects of local livestock husbandry. It is likely that production animals will be grown in successively more marginal environments and at higher stocking rates on unimproved pastures. Understanding the nutritional limits to animal growth at the level of muscle gene networks will help us find optima for nutrition, growth rate and meat yield. Genomic approaches give us unprecedented capacity to map the networks of control under nutritionally restricted conditions, though the challenges remain of identifying steps that regulate substrate flux. The paper describes some approaches currently being taken to understanding muscle development, and concludes that the genes contributing to two ruminant phenotypes should be mapped and characterized. These are: the capacity to depress metabolic rate in response to nutritional restriction; and the capacity to exhibit compensatory growth after restriction is relieved. (author)

  20. Earthworm introduction on calcareous minesoils

    International Nuclear Information System (INIS)

    Vimmerstedt, J.P.; Kost, D.A.

    1994-01-01

    Burrowing activity of the nightcrawler, Lumbricus terrestis (L.t.), incorporates organic matter into mineral soil while creating long-lasting macropores. Thus L.t. has potential as a biological means of improving physical and chemical properties of surface mined areas. Efforts to establish L.t. population on forested acidic or calcareous minesoils have been successful, but thus far have not been able to establish L.t. in grassland ecosystems on calcareous minesoils. In May, 1989, the authors put 11 clitellate L.t. under sphagnum moss on calcareous gray cast overburden on standard graded topsoil, or on ripped and disked topsoil. All soils had cover of agronomic grasses and legumes. They found no L.t. at the 24 points of inoculation during sampling in fall of 1990 with formalin extractant, although smaller species, Lumbricus rubellus and Dendrobaena spp., were found. At another location, in May, 1990, they put 25 clitellate L.t. at 16 points in grasslands growing on gray cast overburden. Using formalin extraction, they found no L.t. in May 1992 at these locations. Working in this same area in November, 1992, they released 10 clitellate L.t. at 16 points under 10 cm of moist Alnus glutinosa leaf litter. Careful examination of the surface inoculation points in spring and fall of 1993 did not show obvious signs of earthworm activity. Their next step will be to use Earthworm Inoculation Units (earthworm-minesoil microcosms containing L.t. adults, immatures, and cocoons) as the source of the new populations

  1. Integrative analysis of genome-wide gene copy number changes and gene expression in non-small cell lung cancer.

    Directory of Open Access Journals (Sweden)

    Verena Jabs

    Full Text Available Non-small cell lung cancer (NSCLC represents a genomically unstable cancer type with extensive copy number aberrations. The relationship of gene copy number alterations and subsequent mRNA levels has only fragmentarily been described. The aim of this study was to conduct a genome-wide analysis of gene copy number gains and corresponding gene expression levels in a clinically well annotated NSCLC patient cohort (n = 190 and their association with survival. While more than half of all analyzed gene copy number-gene expression pairs showed statistically significant correlations (10,296 of 18,756 genes, high correlations, with a correlation coefficient >0.7, were obtained only in a subset of 301 genes (1.6%, including KRAS, EGFR and MDM2. Higher correlation coefficients were associated with higher copy number and expression levels. Strong correlations were frequently based on few tumors with high copy number gains and correspondingly increased mRNA expression. Among the highly correlating genes, GO groups associated with posttranslational protein modifications were particularly frequent, including ubiquitination and neddylation. In a meta-analysis including 1,779 patients we found that survival associated genes were overrepresented among highly correlating genes (61 of the 301 highly correlating genes, FDR adjusted p<0.05. Among them are the chaperone CCT2, the core complex protein NUP107 and the ubiquitination and neddylation associated protein CAND1. In conclusion, in a comprehensive analysis we described a distinct set of highly correlating genes. These genes were found to be overrepresented among survival-associated genes based on gene expression in a large collection of publicly available datasets.

  2. Genomic sequence and organization of two members of a human lectin gene family

    International Nuclear Information System (INIS)

    Gitt, M.A.; Barondes, S.H.

    1991-01-01

    The authors have isolated and sequenced the genomic DNA encoding a human dimeric soluble lactose-binding lectin. The gene has four exons, and its upstream region contains sequences that suggest control by glucocorticoids, heat (environmental) shock, metals, and other factors. They have also isolated and sequenced three exons of the gene encoding another human putative lectin, the existence of which was first indicated by isolation of its cDNA. Comparisons suggest a general pattern of genomic organization of members of this lectin gene family

  3. BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS.

    Science.gov (United States)

    Hoff, Katharina J; Lange, Simone; Lomsadze, Alexandre; Borodovsky, Mark; Stanke, Mario

    2016-03-01

    Gene finding in eukaryotic genomes is notoriously difficult to automate. The task is to design a work flow with a minimal set of tools that would reach state-of-the-art performance across a wide range of species. GeneMark-ET is a gene prediction tool that incorporates RNA-Seq data into unsupervised training and subsequently generates ab initio gene predictions. AUGUSTUS is a gene finder that usually requires supervised training and uses information from RNA-Seq reads in the prediction step. Complementary strengths of GeneMark-ET and AUGUSTUS provided motivation for designing a new combined tool for automatic gene prediction. We present BRAKER1, a pipeline for unsupervised RNA-Seq-based genome annotation that combines the advantages of GeneMark-ET and AUGUSTUS. As input, BRAKER1 requires a genome assembly file and a file in bam-format with spliced alignments of RNA-Seq reads to the genome. First, GeneMark-ET performs iterative training and generates initial gene structures. Second, AUGUSTUS uses predicted genes for training and then integrates RNA-Seq read information into final gene predictions. In our experiments, we observed that BRAKER1 was more accurate than MAKER2 when it is using RNA-Seq as sole source for training and prediction. BRAKER1 does not require pre-trained parameters or a separate expert-prepared training step. BRAKER1 is available for download at http://bioinf.uni-greifswald.de/bioinf/braker/ and http://exon.gatech.edu/GeneMark/ katharina.hoff@uni-greifswald.de or borodovsky@gatech.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  4. Comparative genomic and transcriptomic analysis of selected fatty acid biosynthesis genes and CNL disease resistance genes in oil palm

    Science.gov (United States)

    Rosli, Rozana; Amiruddin, Nadzirah; Ab Halim, Mohd Amin; Chan, Pek-Lan; Chan, Kuang-Lim; Azizi, Norazah; Morris, Priscilla E.; Leslie Low, Eng-Ti; Ong-Abdullah, Meilina; Sambanthamurthi, Ravigadevi; Singh, Rajinder

    2018-01-01

    Comparative genomics and transcriptomic analyses were performed on two agronomically important groups of genes from oil palm versus other major crop species and the model organism, Arabidopsis thaliana. The first analysis was of two gene families with key roles in regulation of oil quality and in particular the accumulation of oleic acid, namely stearoyl ACP desaturases (SAD) and acyl-acyl carrier protein (ACP) thioesterases (FAT). In both cases, these were found to be large gene families with complex expression profiles across a wide range of tissue types and developmental stages. The detailed classification of the oil palm SAD and FAT genes has enabled the updating of the latest version of the oil palm gene model. The second analysis focused on disease resistance (R) genes in order to elucidate possible candidates for breeding of pathogen tolerance/resistance. Ortholog analysis showed that 141 out of the 210 putative oil palm R genes had homologs in banana and rice. These genes formed 37 clusters with 634 orthologous genes. Classification of the 141 oil palm R genes showed that the genes belong to the Kinase (7), CNL (95), MLO-like (8), RLK (3) and Others (28) categories. The CNL R genes formed eight clusters. Expression data for selected R genes also identified potential candidates for breeding of disease resistance traits. Furthermore, these findings can provide information about the species evolution as well as the identification of agronomically important genes in oil palm and other major crops. PMID:29672525

  5. Comparative genomic and transcriptomic analysis of selected fatty acid biosynthesis genes and CNL disease resistance genes in oil palm.

    Science.gov (United States)

    Rosli, Rozana; Amiruddin, Nadzirah; Ab Halim, Mohd Amin; Chan, Pek-Lan; Chan, Kuang-Lim; Azizi, Norazah; Morris, Priscilla E; Leslie Low, Eng-Ti; Ong-Abdullah, Meilina; Sambanthamurthi, Ravigadevi; Singh, Rajinder; Murphy, Denis J

    2018-01-01

    Comparative genomics and transcriptomic analyses were performed on two agronomically important groups of genes from oil palm versus other major crop species and the model organism, Arabidopsis thaliana. The first analysis was of two gene families with key roles in regulation of oil quality and in particular the accumulation of oleic acid, namely stearoyl ACP desaturases (SAD) and acyl-acyl carrier protein (ACP) thioesterases (FAT). In both cases, these were found to be large gene families with complex expression profiles across a wide range of tissue types and developmental stages. The detailed classification of the oil palm SAD and FAT genes has enabled the updating of the latest version of the oil palm gene model. The second analysis focused on disease resistance (R) genes in order to elucidate possible candidates for breeding of pathogen tolerance/resistance. Ortholog analysis showed that 141 out of the 210 putative oil palm R genes had homologs in banana and rice. These genes formed 37 clusters with 634 orthologous genes. Classification of the 141 oil palm R genes showed that the genes belong to the Kinase (7), CNL (95), MLO-like (8), RLK (3) and Others (28) categories. The CNL R genes formed eight clusters. Expression data for selected R genes also identified potential candidates for breeding of disease resistance traits. Furthermore, these findings can provide information about the species evolution as well as the identification of agronomically important genes in oil palm and other major crops.

  6. A Gene Gravity Model for the Evolution of Cancer Genomes: A Study of 3,000 Cancer Genomes across 9 Cancer Types

    Science.gov (United States)

    Lin, Chen-Ching; Zhao, Junfei; Jia, Peilin; Li, Wen-Hsiung; Zhao, Zhongming

    2015-01-01

    Cancer development and progression result from somatic evolution by an accumulation of genomic alterations. The effects of those alterations on the fitness of somatic cells lead to evolutionary adaptations such as increased cell proliferation, angiogenesis, and altered anticancer drug responses. However, there are few general mathematical models to quantitatively examine how perturbations of a single gene shape subsequent evolution of the cancer genome. In this study, we proposed the gene gravity model to study the evolution of cancer genomes by incorporating the genome-wide transcription and somatic mutation profiles of ~3,000 tumors across 9 cancer types from The Cancer Genome Atlas into a broad gene network. We found that somatic mutations of a cancer driver gene may drive cancer genome evolution by inducing mutations in other genes. This functional consequence is often generated by the combined effect of genetic and epigenetic (e.g., chromatin regulation) alterations. By quantifying cancer genome evolution using the gene gravity model, we identified six putative cancer genes (AHNAK, COL11A1, DDX3X, FAT4, STAG2, and SYNE1). The tumor genomes harboring the nonsynonymous somatic mutations in these genes had a higher mutation density at the genome level compared to the wild-type groups. Furthermore, we provided statistical evidence that hypermutation of cancer driver genes on inactive X chromosomes is a general feature in female cancer genomes. In summary, this study sheds light on the functional consequences and evolutionary characteristics of somatic mutations during tumorigenesis by propelling adaptive cancer genome evolution, which would provide new perspectives for cancer research and therapeutics. PMID:26352260

  7. Comparative Genomic Analysis of Neutrophilic Iron(II Oxidizer Genomes for Candidate Genes in Extracellular Electron Transfer

    Directory of Open Access Journals (Sweden)

    Shaomei He

    2017-08-01

    Full Text Available Extracellular electron transfer (EET is recognized as a key biochemical process in circumneutral pH Fe(II-oxidizing bacteria (FeOB. In this study, we searched for candidate EET genes in 73 neutrophilic FeOB genomes, among which 43 genomes are complete or close-to-complete and the rest have estimated genome completeness ranging from 5 to 91%. These neutrophilic FeOB span members of the microaerophilic, anaerobic phototrophic, and anaerobic nitrate-reducing FeOB groups. We found that many microaerophilic and several anaerobic FeOB possess homologs of Cyc2, an outer membrane cytochrome c originally identified in Acidithiobacillus ferrooxidans. The “porin-cytochrome c complex” (PCC gene clusters homologous to MtoAB/PioAB are present in eight FeOB, accounting for 19% of complete and close-to-complete genomes examined, whereas PCC genes homologous to OmbB-OmaB-OmcB in Geobacter sulfurreducens are absent. Further, we discovered gene clusters that may potentially encode two novel PCC types. First, a cluster (tentatively named “PCC3” encodes a porin, an extracellular and a periplasmic cytochrome c with remarkably large numbers of heme-binding motifs. Second, a cluster (tentatively named “PCC4” encodes a porin and three periplasmic multiheme cytochromes c. A conserved inner membrane protein (IMP encoded in PCC3 and PCC4 gene clusters might be responsible for translocating electrons across the inner membrane. Other bacteria possessing PCC3 and PCC4 are mostly Proteobacteria isolated from environments with a potential niche for Fe(II oxidation. In addition to cytochrome c, multicopper oxidase (MCO genes potentially involved in Fe(II oxidation were also identified. Notably, candidate EET genes were not found in some FeOB, especially the anaerobic ones, probably suggesting EET genes or Fe(II oxidation mechanisms are different from the searched models. Overall, based on current EET models, the search extends our understanding of bacterial EET and

  8. Tree Species Identity Shapes Earthworm Communities

    Directory of Open Access Journals (Sweden)

    Stephanie Schelfhout

    2017-03-01

    Full Text Available Earthworms are key organisms in forest ecosystems because they incorporate organic material into the soil and affect the activity of other soil organisms. Here, we investigated how tree species affect earthworm communities via litter and soil characteristics. In a 36-year old common garden experiment, replicated six times over Denmark, six tree species were planted in blocks: sycamore maple (Acer pseudoplatanus, beech (Fagus sylvatica, ash (Fraxinus excelsior, Norway spruce (Picea abies, pedunculate oak (Quercus robur and lime (Tilia cordata. We studied the chemical characteristics of soil and foliar litter, and determined the forest floor turnover rate and the density and biomass of the earthworm species occurring in the stands. Tree species significantly affected earthworm communities via leaf litter and/or soil characteristics. Anecic earthworms were abundant under Fraxinus, Acer and Tilia, which is related to calcium-rich litter and low soil acidification. Epigeic earthworms were indifferent to calcium content in leaf litter and were shown to be mainly related to soil moisture content and litter C:P ratios. Almost no earthworms were found in Picea stands, likely because of the combined effects of recalcitrant litter, low pH and low soil moisture content.

  9. Earthworms – good indicators for forest disturbance

    Directory of Open Access Journals (Sweden)

    YAHYA KOOCH

    2014-08-01

    Full Text Available In temperate forests, formation of canopy gaps by windthrow is a characteristic natural disturbance event. Little work has been done on the effects of canopy gaps on soil properties and fauna, especially earthworms as ecosystem engineers. We conducted a study to examine the reaction of earthworms (density/biomass and different soil properties (i.e., soil moisture, pH, organic matter, total N, and available Ca to different canopy gap areas in 25-ha areas of Liresar district beech forest located in a temperate forest of Mazandaran province in the north of Iran. Soil samples were taken at 0-15, 15-30 and 30-45 cm depths from gap center, gap edge and closed canopy using core soil sampler with 81 cm2 cross section. The earthworms were collected simultaneously with the soil sampling by hand sorting method. Our study supports that the canopy gap will create a mosaic of environmental conditions. Earthworm's density and biomass tended to be higher in small canopy gaps compared with the other canopy gap areas. Earthworm's population showed decreasing trend from closed canopy to disturbed sites (gap edge and gap center. The top soil was more appropriate to presence of earthworms although ecological groups have occupied different soil layers. As a conclusion, earthworms can be introduced as good bio-indicator of environmental changes that occur by disturbance.

  10. Evolutionary Fates and Dynamic Functionalization of Young Duplicate Genes in Arabidopsis Genomes.

    Science.gov (United States)

    Wang, Jun; Tao, Feng; Marowsky, Nicholas C; Fan, Chuanzhu

    2016-09-01

    Gene duplication is a primary means to generate genomic novelties, playing an essential role in speciation and adaptation. Particularly in plants, a high abundance of duplicate genes has been maintained for significantly long periods of evolutionary time. To address the manner in which young duplicate genes were derived primarily from small-scale gene duplication and preserved in plant genomes and to determine the underlying driving mechanisms, we generated transcriptomes to produce the expression profiles of five tissues in Arabidopsis thaliana and the closely related species Arabidopsis lyrata and Capsella rubella Based on the quantitative analysis metrics, we investigated the evolutionary processes of young duplicate genes in Arabidopsis. We determined that conservation, neofunctionalization, and specialization are three main evolutionary processes for Arabidopsis young duplicate genes. We explicitly demonstrated the dynamic functionalization of duplicate genes along the evolutionary time scale. Upon origination, duplicates tend to maintain their ancestral functions; but as they survive longer, they might be likely to develop distinct and novel functions. The temporal evolutionary processes and functionalization of plant duplicate genes are associated with their ancestral functions, dynamic DNA methylation levels, and histone modification abundances. Furthermore, duplicate genes tend to be initially expressed in pollen and then to gain more interaction partners over time. Altogether, our study provides novel insights into the dynamic retention processes of young duplicate genes in plant genomes. © 2016 American Society of Plant Biologists. All rights reserved.

  11. Evolutionary Fates and Dynamic Functionalization of Young Duplicate Genes in Arabidopsis Genomes1[OPEN

    Science.gov (United States)

    Wang, Jun; Tao, Feng; Marowsky, Nicholas C.; Fan, Chuanzhu

    2016-01-01

    Gene duplication is a primary means to generate genomic novelties, playing an essential role in speciation and adaptation. Particularly in plants, a high abundance of duplicate genes has been maintained for significantly long periods of evolutionary time. To address the manner in which young duplicate genes were derived primarily from small-scale gene duplication and preserved in plant genomes and to determine the underlying driving mechanisms, we generated transcriptomes to produce the expression profiles of five tissues in Arabidopsis thaliana and the closely related species Arabidopsis lyrata and Capsella rubella. Based on the quantitative analysis metrics, we investigated the evolutionary processes of young duplicate genes in Arabidopsis. We determined that conservation, neofunctionalization, and specialization are three main evolutionary processes for Arabidopsis young duplicate genes. We explicitly demonstrated the dynamic functionalization of duplicate genes along the evolutionary time scale. Upon origination, duplicates tend to maintain their ancestral functions; but as they survive longer, they might be likely to develop distinct and novel functions. The temporal evolutionary processes and functionalization of plant duplicate genes are associated with their ancestral functions, dynamic DNA methylation levels, and histone modification abundances. Furthermore, duplicate genes tend to be initially expressed in pollen and then to gain more interaction partners over time. Altogether, our study provides novel insights into the dynamic retention processes of young duplicate genes in plant genomes. PMID:27485883

  12. Genome-wide investigation and transcriptome analysis of the WRKY gene family in Gossypium.

    Science.gov (United States)

    Ding, Mingquan; Chen, Jiadong; Jiang, Yurong; Lin, Lifeng; Cao, YueFen; Wang, Minhua; Zhang, Yuting; Rong, Junkang; Ye, Wuwei

    2015-02-01

    WRKY transcription factors play important roles in various stress responses in diverse plant species. In cotton, this family has not been well studied, especially in relation to fiber development. Here, the genomes and transcriptomes of Gossypium raimondii and Gossypium arboreum were investigated to identify fiber development related WRKY genes. This represents the first comprehensive comparative study of WRKY transcription factors in both diploid A and D cotton species. In total, 112 G. raimondii and 109 G. arboreum WRKY genes were identified. No significant gene structure or domain alterations were detected between the two species, but many SNPs distributed unequally in exon and intron regions. Physical mapping revealed that the WRKY genes in G. arboreum were not located in the corresponding chromosomes of G. raimondii, suggesting great chromosome rearrangement in the diploid cotton genomes. The cotton WRKY genes, especially subgroups I and II, have expanded through multiple whole genome duplications and tandem duplications compared with other plant species. Sequence comparison showed many functionally divergent sites between WRKY subgroups, while the genes within each group are under strong purifying selection. Transcriptome analysis suggested that many WRKY genes participate in specific fiber development processes such as fiber initiation, elongation and maturation with different expression patterns between species. Complex WRKY gene expression such as differential Dt and At allelic gene expression in G. hirsutum and alternative splicing events were also observed in both diploid and tetraploid cottons during fiber development process. In conclusion, this study provides important information on the evolution and function of WRKY gene family in cotton species.

  13. Comparative genomic analysis of Drosophila melanogaster and vector mosquito developmental genes.

    Directory of Open Access Journals (Sweden)

    Susanta K Behura

    Full Text Available Genome sequencing projects have presented the opportunity for analysis of developmental genes in three vector mosquito species: Aedes aegypti, Culex quinquefasciatus, and Anopheles gambiae. A comparative genomic analysis of developmental genes in Drosophila melanogaster and these three important vectors of human disease was performed in this investigation. While the study was comprehensive, special emphasis centered on genes that 1 are components of developmental signaling pathways, 2 regulate fundamental developmental processes, 3 are critical for the development of tissues of vector importance, 4 function in developmental processes known to have diverged within insects, and 5 encode microRNAs (miRNAs that regulate developmental transcripts in Drosophila. While most fruit fly developmental genes are conserved in the three vector mosquito species, several genes known to be critical for Drosophila development were not identified in one or more mosquito genomes. In other cases, mosquito lineage-specific gene gains with respect to D. melanogaster were noted. Sequence analyses also revealed that numerous repetitive sequences are a common structural feature of Drosophila and mosquito developmental genes. Finally, analysis of predicted miRNA binding sites in fruit fly and mosquito developmental genes suggests that the repertoire of developmental genes targeted by miRNAs is species-specific. The results of this study provide insight into the evolution of developmental genes and processes in dipterans and other arthropods, serve as a resource for those pursuing analysis of mosquito development, and will promote the design and refinement of functional analysis experiments.

  14. Mitochondrial genome evolution in Alismatales: Size reduction and extensive loss of ribosomal protein genes

    DEFF Research Database (Denmark)

    Petersen, Gitte; Cuenca, Argelia; Zervas, Athanasios

    2017-01-01

    The order Alismatales is a hotspot for evolution of plant mitochondrial genomes characterized by remarkable differences in genome size, substitution rates, RNA editing, retrotranscription, gene loss and intron loss. Here we have sequenced the complete mitogenomes of Zostera marina and Stratiotes...... aloides, which together with previously sequenced mitogenomes from Butomus and Spirodela, provide new evolutionary evidence of genome size reduction, gene loss and transfer to the nucleus. The Zostera mitogenome includes a large portion of DNA transferred from the plastome, yet it is the smallest known...... mitogenome from a non-parasitic plant. Using a broad sample of the Alismatales, the evolutionary history of ribosomal protein gene loss is analyzed. In Zostera almost all ribosomal protein genes are lost from the mitogenome, but only some can be found in the nucleus....

  15. prokaryote genome annotation with GeneScan and GLIMMER

    Indian Academy of Sciences (India)

    Unknown

    The number of false predictions (both positive and negative) is higher for GeneScan as compared to GLIMMER, but in a ... on whether they need to be trained on a set of genes in order to ..... FP has partial matches to the kdpA gene in C. jejuni.

  16. A scan for positively selected genes in the genomes of humans and chimpanzees.

    Directory of Open Access Journals (Sweden)

    Rasmus Nielsen

    2005-06-01

    Full Text Available Since the divergence of humans and chimpanzees about 5 million years ago, these species have undergone a remarkable evolution with drastic divergence in anatomy and cognitive abilities. At the molecular level, despite the small overall magnitude of DNA sequence divergence, we might expect such evolutionary changes to leave a noticeable signature throughout the genome. We here compare 13,731 annotated genes from humans to their chimpanzee orthologs to identify genes that show evidence of positive selection. Many of the genes that present a signature of positive selection tend to be involved in sensory perception or immune defenses. However, the group of genes that show the strongest evidence for positive selection also includes a surprising number of genes involved in tumor suppression and apoptosis, and of genes involved in spermatogenesis. We hypothesize that positive selection in some of these genes may be driven by genomic conflict due to apoptosis during spermatogenesis. Genes with maximal expression in the brain show little or no evidence for positive selection, while genes with maximal expression in the testis tend to be enriched with positively selected genes. Genes on the X chromosome also tend to show an elevated tendency for positive selection. We also present polymorphism data from 20 Caucasian Americans and 19 African Americans for the 50 annotated genes showing the strongest evidence for positive selection. The polymorphism analysis further supports the presence of positive selection in these genes by showing an excess of high-frequency derived nonsynonymous mutations.

  17. Earthworm in the 21st century

    Science.gov (United States)

    Friberg, Paul; Lisowski, Stefan; Dricker, Ilya; Hellman, Sidney

    2010-05-01

    Earthworm (Johnson et al., 1995) is a fully open-source earthquake data acquisition and processing package that is in widespread use through out the world. Earthworm includes basic seismic data acquistion for the majority of the dataloggers currently available and provides network transport mechanisms and common formats as output for data transferral. In addition, it comes with network seismology tools to compute network detections, perform automated arrival picking, and automated hypocentral and magnitude estimations. More importantly it is an open and free framework in the C-programming language that can be used to create new modules that process waveform and earthquake data in near real time. The number of Earthworm installations is growing annually as are the number of contributions to the system. Furthermore its growth into other areas of waveform data acquistion (namely Geomagnetic observatories and Infrasound arrays) show its adaptability to other waveform technologies and processing strategies. In this presentation we discuss the coming challenges to growing Earthworm and new developments in its use; namely the open source add-ons that have become interfaces to Earthworm's core. These add-ons include GlowWorm, MagWorm, Hydra, SWARM, Winston, EarlyBird, Iworm, and most importantly, AQMS (formerly known as CHEETAH). The AQMS, ANSS Quake Monitoring System, is the Earthworm system created in California which has now been installed in the majority of Regional Seismic Networks (RSNs) in the United States. AQMS allows additional real-time and post-processing of Earthworm generated data to be stored and manipulated in a database using numerous database oriented tools. The use of a relational database for persistence provides users with the ability to implement configuration control and research capabilities not available in earlier Earthworm add-ons. By centralizing on AQMS, the RSNs will be able to leverage new developments by easily sharing Earthworm and AQMS

  18. Radioactive elements and earthworms in contaminated soils

    International Nuclear Information System (INIS)

    Suleymanova, A.S.; Abdullayev, A.S.; Ahmadov, G.S.; Naghiyev, J.A.; Samadov, P.A.

    2010-11-01

    Earthworms are one of the indispensable soil animals which treat soil with letting it through their gut and help increasing soil fertility. The effect of radioactive elements and comparative effect of heavy metals to the vital functions of earthworms were determined in laboratory conditions. Experiments were continued for a month, and first of all, each soil type, grey-brown soil from Ramana iodine plant territory of Baku city, brown soil from Aluminum plant territory of Ganja city, aborigine grey-brown soil of Absheron peninsula, treated with Ra and U salts as model variants and brown soil of Ganja city was analyzed by gamma-spectrometer for radionuclide determining at the beginning and at the end of the experiment. Earthworms (Nicodrilus Caliginosus Sav.trapezoides) aboriginal for Absheron peninsula and plant residues were added to the soil. At the end of the month the biomass, survival value, coprolite allocation value, food activity and catalase value in earthworms and in soil were determined. The gamma-spectrometric analysis results gave interesting values in coprolites, soils which had been treated through the earthworms' gut. In comparison with the initial variants in experimental results more percent of radioactivity was gathered in coprolites. By this way earthworms absorbed most of radioactive elements and allocated them as coprogenous substances on the upper layer of soil. During absorbing, some percents of radioactive elements were also gathered in gut cells of the earthworms. Thereby determination of some vital functions of earthworms was expedient. Thus, by the instrumentality of these experiments we can use earthworms for biodiagnosis and for bioremediation of contaminated soils with radionuclides and heavy metals.

  19. Accumulation of chlorinated benzenes in earthworms

    Science.gov (United States)

    Beyer, W.N.

    1996-01-01

    Chlorinated benzenes are widespread in the environment. Hexachlorobenzene, pentachlorobenzene and all isomers of dichlorobenzenes, trichlorobenzenes, and tetrachlorobenzenes, have been detected in fish, water, and sediments from the Great Lakes. This paper describes a long-term (26 week) experiment relating the concentrations of chlorinated benzenes in earthworms to 1) the length of exposure, and it describes three 8-week experiments relating concentrations of chlorinated benzenes in earthworms to 2) their concentration in soil 3) the soil organic matter content and, 4) the degree of chlorination. In the 26-week experiment, the concentration of 1,2,4 - trichlorobenzene in earthworms fluctuated only slightly about a mean of 0.63 ppm (Fig. 1). Although a statistically significant decrease can be demonstrated over the test (Pearson correlation coefficient, r = -0.62 p earthworms showed a cyclical trend that coincided with replacement of the media, and a slight but statistically significant tendency to increase from about 2 to 3 ppm over the 26 weeks (r = 0.55, p earthworms increased as the concentrations in the soil increased (Fig. 2), but leveled off at the highest soil concentrations. The most surprising result of this study was the relatively low concentrations in earthworms compared to those in soils. The average concentration of each of the six isomers of trichlorobenzene and tetrachlorobenzene in earthworms was only about 1 ppm (Table 2); the isomeric structure did not affect accumulation. The concentration of organic matter in soil had a prominent effect on hexachlorobenzene concentrations in earthworms (Fig. 3). Hexachlorobenzene concentrations decreased steadily from 9.3 ppm in earthworms kept in soil without any peat moss added to about 1 ppm in soil containing 16 or 32% organic matter.

  20. An evolvable oestrogen receptor activity sensor: development of a modular system for integrating multiple genes into the yeast genome

    NARCIS (Netherlands)

    Fox, J.E.; Bridgham, J.T.; Bovee, T.F.H.; Thornton, J.W.

    2007-01-01

    To study a gene interaction network, we developed a gene-targeting strategy that allows efficient and stable genomic integration of multiple genetic constructs at distinct target loci in the yeast genome. This gene-targeting strategy uses a modular plasmid with a recyclable selectable marker and a

  1. Constraints on genes shape long-term conservation of macro-synteny in metazoan genomes

    Directory of Open Access Journals (Sweden)

    Putnam Nicholas H

    2011-10-01

    Full Text Available Abstract Background Many metazoan genomes conserve chromosome-scale gene linkage relationships (“macro-synteny” from the common ancestor of multicellular animal life 1234, but the biological explanation for this conservation is still unknown. Double cut and join (DCJ is a simple, well-studied model of neutral genome evolution amenable to both simulation and mathematical analysis 5, but as we show here, it is not sufficent to explain long-term macro-synteny conservation. Results We examine a family of simple (one-parameter extensions of DCJ to identify models and choices of parameters consistent with the levels of macro- and micro-synteny conservation observed among animal genomes. Our software implements a flexible strategy for incorporating genomic context into the DCJ model to incorporate various types of genomic context (“DCJ-[C]”, and is available as open source software from http://github.com/putnamlab/dcj-c. Conclusions A simple model of genome evolution, in which DCJ moves are allowed only if they maintain chromosomal linkage among a set of constrained genes, can simultaneously account for the level of macro-synteny conservation and for correlated conservation among multiple pairs of species. Simulations under this model indicate that a constraint on approximately 7% of metazoan genes is sufficient to constrain genome rearrangement to an average rate of 25 inversions and 1.7 translocations per million years.

  2. Genome mining of Streptomyces scabrisporus NF3 reveals symbiotic features including genes related to plant interactions

    Science.gov (United States)

    Rodríguez-Luna, Stefany Daniela; Cruz Vázquez, Angélica Patricia; Jiménez Suárez, Verónica; Rodríguez-Sanoja, Romina; Alvarez-Buylla, Elena R.; Sánchez, Sergio

    2018-01-01

    Endophytic bacteria are wide-spread and associated with plant physiological benefits, yet their genomes and secondary metabolites remain largely unidentified. In this study, we explored the genome of the endophyte Streptomyces scabrisporus NF3 for discovery of potential novel molecules as well as genes and metabolites involved in host interactions. The complete genomes of seven Streptomyces and three other more distantly related bacteria were used to define the functional landscape of this unique microbe. The S. scabrisporus NF3 genome is larger than the average Streptomyces genome and not structured for an obligate endosymbiotic lifestyle; this and the fact that can grow in R2YE media implies that it could include a soil-living stage. The genome displays an enrichment of genes associated with amino acid production, protein secretion, secondary metabolite and antioxidants production and xenobiotic degradation, indicating that S. scabrisporus NF3 could contribute to the metabolic enrichment of soil microbial communities and of its hosts. Importantly, besides its metabolic advantages, the genome showed evidence for differential functional specificity and diversification of plant interaction molecules, including genes for the production of plant hormones, stress resistance molecules, chitinases, antibiotics and siderophores. Given the diversity of S. scabrisporus mechanisms for host upkeep, we propose that these strategies were necessary for its adaptation to plant hosts and to face changes in environmental conditions. PMID:29447216

  3. Genome mining of Streptomyces scabrisporus NF3 reveals symbiotic features including genes related to plant interactions.

    Directory of Open Access Journals (Sweden)

    Corina Diana Ceapă

    Full Text Available Endophytic bacteria are wide-spread and associated with plant physiological benefits, yet their genomes and secondary metabolites remain largely unidentified. In this study, we explored the genome of the endophyte Streptomyces scabrisporus NF3 for discovery of potential novel molecules as well as genes and metabolites involved in host interactions. The complete genomes of seven Streptomyces and three other more distantly related bacteria were used to define the functional landscape of this unique microbe. The S. scabrisporus NF3 genome is larger than the average Streptomyces genome and not structured for an obligate endosymbiotic lifestyle; this and the fact that can grow in R2YE media implies that it could include a soil-living stage. The genome displays an enrichment of genes associated with amino acid production, protein secretion, secondary metabolite and antioxidants production and xenobiotic degradation, indicating that S. scabrisporus NF3 could contribute to the metabolic enrichment of soil microbial communities and of its hosts. Importantly, besides its metabolic advantages, the genome showed evidence for differential functional specificity and diversification of plant interaction molecules, including genes for the production of plant hormones, stress resistance molecules, chitinases, antibiotics and siderophores. Given the diversity of S. scabrisporus mechanisms for host upkeep, we propose that these strategies were necessary for its adaptation to plant hosts and to face changes in environmental conditions.

  4. Molecular genetic differentiation in earthworms inhabiting a heterogeneous Pb-polluted landscape

    Energy Technology Data Exchange (ETDEWEB)

    Andre, J., E-mail: Andrej@cardiff.ac.u [Cardiff School of Biosciences, Cardiff University, BIOSI 1, Museum Avenue, Cardiff CF10 3TL (United Kingdom); Department of Soil Science, School of Human and Environmental Sciences, University of Reading, Whiteknights, Reading RG6 6DW (United Kingdom); King, R.A. [Cardiff School of Biosciences, Cardiff University, BIOSI 1, Museum Avenue, Cardiff CF10 3TL (United Kingdom); Stuerzenbaum, S.R. [King' s College London, School of Biomedical and Health Sciences, Pharmaceutical Sciences Division, London SE1 9NH (United Kingdom); Kille, P. [Cardiff School of Biosciences, Cardiff University, BIOSI 1, Museum Avenue, Cardiff CF10 3TL (United Kingdom); Hodson, M.E. [Department of Soil Science, School of Human and Environmental Sciences, University of Reading, Whiteknights, Reading RG6 6DW (United Kingdom); Morgan, A.J. [Cardiff School of Biosciences, Cardiff University, BIOSI 1, Museum Avenue, Cardiff CF10 3TL (United Kingdom)

    2010-03-15

    A Pb-mine site situated on acidic soil, but comprising of Ca-enriched islands around derelict buildings was used to study the spatial pattern of genetic diversity in Lumbricus rubellus. Two distinct genetic lineages ('A' and 'B'), differentiated at both the mitochondrial (mtDNA COII) and nuclear level (AFLPs) were revealed with a mean inter-lineage mtDNA sequence divergence of approximately 13%, indicative of a cryptic species complex. AFLP analysis indicates that lineage A individuals within one central 'ecological island' site are uniquely clustered, with little genetic overlap with lineage A individuals at the two peripheral sites. FTIR microspectroscopy of Pb-sequestering chloragocytes revealed different phosphate profiles in residents of adjacent acidic and calcareous islands. Bioinformatics found over-representation of Ca pathway genes in EST{sub Pb} libraries. Subsequent sequencing of a Ca-transport gene, SERCA, revealed mutations in the protein's cytosolic domain. We recommend the mandatory genotyping of all individuals prior to field-based ecotoxicological assays, particularly those using discriminating genomic technologies. - Landscapes punctuated by Pb-polluted islands have engendered local genetic differentiation in resident earthworms.

  5. Molecular genetic differentiation in earthworms inhabiting a heterogeneous Pb-polluted landscape

    International Nuclear Information System (INIS)

    Andre, J.; King, R.A.; Stuerzenbaum, S.R.; Kille, P.; Hodson, M.E.; Morgan, A.J.

    2010-01-01

    A Pb-mine site situated on acidic soil, but comprising of Ca-enriched islands around derelict buildings was used to study the spatial pattern of genetic diversity in Lumbricus rubellus. Two distinct genetic lineages ('A' and 'B'), differentiated at both the mitochondrial (mtDNA COII) and nuclear level (AFLPs) were revealed with a mean inter-lineage mtDNA sequence divergence of approximately 13%, indicative of a cryptic species complex. AFLP analysis indicates that lineage A individuals within one central 'ecological island' site are uniquely clustered, with little genetic overlap with lineage A individuals at the two peripheral sites. FTIR microspectroscopy of Pb-sequestering chloragocytes revealed different phosphate profiles in residents of adjacent acidic and calcareous islands. Bioinformatics found over-representation of Ca pathway genes in EST Pb libraries. Subsequent sequencing of a Ca-transport gene, SERCA, revealed mutations in the protein's cytosolic domain. We recommend the mandatory genotyping of all individuals prior to field-based ecotoxicological assays, particularly those using discriminating genomic technologies. - Landscapes punctuated by Pb-polluted islands have engendered local genetic differentiation in resident earthworms.

  6. Sampling the genomic pool of protein tyrosine kinase genes using the polymerase chain reaction with genomic DNA.

    Science.gov (United States)

    Oates, A C; Wollberg, P; Achen, M G; Wilks, A F

    1998-08-28

    The polymerase chain reaction (PCR), with cDNA as template, has been widely used to identify members of protein families from many species. A major limitation of using cDNA in PCR is that detection of a family member is dependent on temporal and spatial patterns of gene expression. To circumvent this restriction, and in order to develop a technique that is broadly applicable we have tested the use of genomic DNA as PCR template to identify members of protein families in an expression-independent manner. This test involved amplification of DNA encoding protein tyrosine kinase (PTK) genes from the genomes of three animal species that are well known development models; namely, the mouse Mus musculus, the fruit fly Drosophila melanogaster, and the nematode worm Caenorhabditis elegans. Ten PTK genes were identified from the mouse, 13 from the fruit fly, and 13 from the nematode worm. Among these kinases were 13 members of the PTK family that had not been reported previously. Selected PTKs from this screen were shown to be expressed during development, demonstrating that the amplified fragments did not arise from pseudogenes. This approach will be useful for the identification of many novel members of gene families in organisms of agricultural, medical, developmental and evolutionary significance and for analysis of gene families from any species, or biological sample whose habitat precludes the isolation of mRNA. Furthermore, as a tool to hasten the discovery of members of gene families that are of particular interest, this method offers an opportunity to sample the genome for new members irrespective of their expression pattern.

  7. Rhizome of life, catastrophes, sequence exchanges, gene creations and giant viruses: How microbial genomics challenges Darwin

    Directory of Open Access Journals (Sweden)

    Vicky eMerhej

    2012-08-01

    Full Text Available Darwin’s theory about the evolution of species has been the object of considerable dispute. In this review, we have described seven key principles in Darwin’s book The Origin of Species and tried to present how genomics challenge each of these concepts and improve our knowledge about evolution. Darwin believed that species evolution consists on a positive directional selection ensuring the survival of the fittest. The most developed state of the species is characterized by increasing complexity. Darwin proposed the theory of descent with modification according to which all species evolve from a single common ancestor through a gradual process of small modification of their vertical inheritance. Finally, the process of evolution can be depicted in the form of a tree. However, microbial genomics showed that evolution is better described as the biological changes over time." The mode of change is not unidirectional and does not necessarily favors advantageous mutations to increase fitness it is rather subject to random selection as a result of catastrophic stochastic processes. Complexity is not necessarily the completion of development: several complex organisms have gone extinct and many microbes including bacteria with intracellular lifestyle have streamlined highly effective genomes. Genomes evolve through large events of gene deletions, duplications, insertions and genomes rearrangements rather than a gradual adaptative process. Genomes are dynamic and chimeric entities with gene repertoires that result from vertical and horizontal acquisitions as well as de novo gene creation. The chimeric character of microbial genomes excludes the possibility of finding a single common ancestor for all the genes recorded currently. Genomes are collections of genes with different evolutionary histories that cannot be represented by a single tree of life. A forest, a network or a rhizome of life may be more accurate to represent evolutionary relationships

  8. Genomics of crop wild relatives: expanding the gene pool for crop improvement.

    Science.gov (United States)

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert J

    2016-04-01

    Plant breeders require access to new genetic diversity to satisfy the demands of a growing human population for more food that can be produced in a variable or changing climate and to deliver the high-quality food with nutritional and health benefits demanded by consumers. The close relatives of domesticated plants, crop wild relatives (CWRs), represent a practical gene pool for use by plant breeders. Genomics of CWR generates data that support the use of CWR to expand the genetic diversity of crop plants. Advances in DNA sequencing technology are enabling the efficient sequencing of CWR and their increased use in crop improvement. As the sequencing of genomes of major crop species is completed, attention has shifted to analysis of the wider gene pool of major crops including CWR. A combination of de novo sequencing and resequencing is required to efficiently explore useful genetic variation in CWR. Analysis of the nuclear genome, transcriptome and maternal (chloroplast and mitochondrial) genome of CWR is facilitating their use in crop improvement. Genome analysis results in discovery of useful alleles in CWR and identification of regions of the genome in which diversity has been lost in domestication bottlenecks. Targeting of high priority CWR for sequencing will maximize the contribution of genome sequencing of CWR. Coordination of global efforts to apply genomics has the potential to accelerate access to and conservation of the biodiversity essential to the sustainability of agriculture and food production. © 2015 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.

  9. Microarray-based genomic surveying of gene polymorphisms in Chlamydia trachomatis

    OpenAIRE

    Brunelle, Brian W; Nicholson, Tracy L; Stephens, Richard S

    2004-01-01

    By comparing two fully sequenced genomes of Chlamydia trachomatis using competitive hybridization on DNA microarrays, a logarithmic correlation was demonstrated between the signal ratio of the arrays and the 75-99% range of nucleotide identities of the genes. Variable genes within 14 uncharacterized strains of C. trachomatis were identified by array analysis and verified by DNA sequencing. These genes may be crucial for understanding chlamydial virulence and pathogenesis.

  10. The genomic view of genes responsive to the antagonistic phytohormones, abscisic acid, and gibberellin.

    Science.gov (United States)

    Yazaki, Junshi; Kikuchi, Shoshi

    2005-01-01

    We now have the various genomics tools for monocot (Oryza sativa) and a dicot (Arabidopsis thaliana) plant. Plant is not only a very important agricultural resource but also a model organism for biological research. It is important that the interaction between ABA and GA is investigated for controlling the transition from embryogenesis to germination in seeds using genomics tools. These studies have investigated the relationship between dormancy and germination using genomics tools. Genomics tools identified genes that had never before been annotated as ABA- or GA-responsive genes in plant, detected new interactions between genes responsive to the two hormones, comprehensively characterized cis-elements of hormone-responsive genes, and characterized cis-elements of rice and Arabidopsis. In these research, ABA- and GA-regulated genes have been classified as functional proteins (proteins that probably function in stress or PR tolerance) and regulatory proteins (protein factors involved in further regulation of signal transduction). Comparison between ABA and/or GA-responsive genes in rice and those in Arabidopsis has shown that the cis-element has specificity in each species. cis-Elements for the dehydration-stress response have been specified in Arabidopsis but not in rice. cis-Elements for protein storage are remarkably richer in the upstream regions of the rice gene than in those of Arabidopsis.

  11. Mapping and annotating obesity-related genes in pig and human genomes.

    Science.gov (United States)

    Martelli, Pier Luigi; Fontanesi, Luca; Piovesan, Damiano; Fariselli, Piero; Casadio, Rita

    2014-01-01

    Background. Obesity is a major health problem in both developed and emerging countries. Obesity is a complex disease whose etiology involves genetic factors in strong interplay with environmental determinants and lifestyle. The discovery of genetic factors and biological pathways underlying human obesity is hampered by the difficulty in controlling the genetic background of human cohorts. Animal models are then necessary to further dissect the genetics of obesity. Pig has emerged as one of the most attractive models, because of the similarity with humans in the mechanisms regulating the fat deposition. Results. We collected the genes related to obesity in humans and to fat deposition traits in pig. We localized them on both human and pig genomes, building a map useful to interpret comparative studies on obesity. We characterized the collected genes structurally and functionally with BAR+ and mapped them on KEGG pathways and on STRING protein interaction network. Conclusions. The collected set consists of 361 obesity related genes in human and pig genomes. All genes were mapped on the human genome, and 54 could not be localized on the pig genome (release 2012). Only for 3 human genes there is no counterpart in pig, confirming that this animal is a good model for human obesity studies. Obesity related genes are mostly involved in regulation and signaling processes/pathways and relevant connection emerges between obesity-related genes and diseases such as cancer and infectious diseases.

  12. The human genome and sport, including epigenetics, gene doping, and athleticogenomics.

    Science.gov (United States)

    Sharp, N C Craig

    2010-03-01

    Hugh Montgomery's discovery of the first of more than 239 fitness genes together with rapid advances in human gene therapy have created a prospect of using genes, genetic elements, and cells that have the capacity to enhance athletic performance (to paraphrase the World Anti-Doping Agency's definition of gene doping). This brief overview covers the main areas of interface between genetics and sport, attempts to provide a context against which gene doping may be viewed, and predicts a futuristic legitimate use of genomic (and possibly epigenetic) information in sport. Copyright 2010 Elsevier Inc. All rights reserved.

  13. Sox genes in grass carp (Ctenopharyngodon idella) with their implications for genome duplication and evolution

    OpenAIRE

    Zhong , Lei; Yu , Xiaomu; Tong , Jingou

    2006-01-01

    Abstract The Sox gene family is found in a broad range of animal taxa and encodes important gene regulatory proteins involved in a variety of developmental processes. We have obtained clones representing the HMG boxes of twelve Sox genes from grass carp (Ctenopharyngodon idella), one of the four major domestic carps in China. The cloned Sox genes belong to group B1, B2 and C. Our analyses show that whereas the human genome contains a single copy of Sox4, Sox11 and Sox14, each of these genes h...

  14. A protocol for assessment of direct effects of RNAi to earthworms

    DEFF Research Database (Denmark)

    de Pinto, Roberta; Strandberg, Morten Tune; Kostov, Kaloyan

    both how to study fate and effects of the RNAi molecules. If COI or another gene apt for silencing will be successful in the earthworm L. terrestris, this could become a suggested positive control agent for future non-target studies of RNAi. The testing of environmental effects of NBT require...

  15. Integrative Functional Genomics for Systems Genetics in GeneWeaver.org.

    Science.gov (United States)

    Bubier, Jason A; Langston, Michael A; Baker, Erich J; Chesler, Elissa J

    2017-01-01

    The abundance of existing functional genomics studies permits an integrative approach to interpreting and resolving the results of diverse systems genetics studies. However, a major challenge lies in assembling and harmonizing heterogeneous data sets across species for facile comparison to the positional candidate genes and coexpression networks that come from systems genetic studies. GeneWeaver is an online database and suite of tools at www.geneweaver.org that allows for fast aggregation and analysis of gene set-centric data. GeneWeaver contains curated experimental data together with resource-level data such as GO annotations, MP annotations, and KEGG pathways, along with persistent stores of user entered data sets. These can be entered directly into GeneWeaver or transferred from widely used resources such as GeneNetwork.org. Data are analyzed using statistical tools and advanced graph algorithms to discover new relations, prioritize candidate genes, and generate function hypotheses. Here we use GeneWeaver to find genes common to multiple gene sets, prioritize candidate genes from a quantitative trait locus, and characterize a set of differentially expressed genes. Coupling a large multispecies repository curated and empirical functional genomics data to fast computational tools allows for the rapid integrative analysis of heterogeneous data for interpreting and extrapolating systems genetics results.

  16. Genome-Wide Identification and Evolution of HECT Genes in Soybean

    Directory of Open Access Journals (Sweden)

    Xianwen Meng

    2015-04-01

    Full Text Available Proteins containing domains homologous to the E6-associated protein (E6-AP carboxyl terminus (HECT are an important class of E3 ubiquitin ligases involved in the ubiquitin proteasome pathway. HECT-type E3s play crucial roles in plant growth and development. However, current understanding of plant HECT genes and their evolution is very limited. In this study, we performed a genome-wide analysis of the HECT domain-containing genes in soybean. Using high-quality genome sequences, we identified 19 soybean HECT genes. The predicted HECT genes were distributed unevenly across 15 of 20 chromosomes. Nineteen of these genes were inferred to be segmentally duplicated gene pairs, suggesting that in soybean, segmental duplications have made a significant contribution to the expansion of the HECT gene family. Phylogenetic analysis showed that these HECT genes can be divided into seven groups, among which gene structure and domain architecture was relatively well-conserved. The Ka/Ks ratios show that after the duplication events, duplicated HECT genes underwent purifying selection. Moreover, expression analysis reveals that 15 of the HECT genes in soybean are differentially expressed in 14 tissues, and are often highly expressed in the flowers and roots. In summary, this work provides useful information on which further functional studies of soybean HECT genes can be based.

  17. Approximating the edit distance for genomes with duplicate genes under DCJ, insertion and deletion

    Directory of Open Access Journals (Sweden)

    Shao Mingfu

    2012-12-01

    Full Text Available Abstract Computing the edit distance between two genomes under certain operations is a basic problem in the study of genome evolution. The double-cut-and-join (DCJ model has formed the basis for most algorithmic research on rearrangements over the last few years. The edit distance under the DCJ model can be easily computed for genomes without duplicate genes. In this paper, we study the edit distance for genomes with duplicate genes under a model that includes DCJ operations, insertions and deletions. We prove that computing the edit distance is equivalent to finding the optimal cycle decomposition of the corresponding adjacency graph, and give an approximation algorithm with an approximation ratio of 1.5 + ∈.

  18. Characterization and distribution of repetitive elements in association with genes in the human genome.

    Science.gov (United States)

    Liang, Kai-Chiang; Tseng, Joseph T; Tsai, Shaw-Jenq; Sun, H Sunny

    2015-08-01

    Repetitive elements constitute more than 50% of the human genome. Recent studies implied that the complexity of living organisms is not just a direct outcome of a number of coding sequences; the repetitive elements, which do not encode proteins, may also play a significant role. Though scattered studies showed that repetitive elements in the regulatory regions of a gene control gene expression, no systematic survey has been done to report the characterization and distribution of various types of these repetitive elements in the human genome. Sequences from 5' and 3' untranslated regions and upstream and downstream of a gene were downloaded from the Ensembl database. The repetitive elements in the neighboring of each gene were identified and classified using cross-matching implemented in the RepeatMasker. The annotation and distribution of distinct classes of repetitive elements associated with individual gene were collected to characterize genes in association with different types of repetitive elements using systems biology program. We identified a total of 1,068,400 repetitive elements which belong to 37-class families and 1235 subclasses that are associated with 33,761 genes and 57,365 transcripts. In addition, we found that the tandem repeats preferentially locate proximal to the transcription start site (TSS) of genes and the major function of these genes are involved in developmental processes. On the other hand, interspersed repetitive elements showed a tendency to be accumulated at distal region from the TSS and the function of interspersed repeat-containing genes took part in the catabolic/metabolic processes. Results from the distribution analysis were collected and used to construct a gene-based repetitive element database (GBRED; http://www.binfo.ncku.edu.tw/GBRED/index.html). A user-friendly web interface was designed to provide the information of repetitive elements associated with any particular gene(s). This is the first study focusing on the gene

  19. Gene fragmentation: a key to mitochondrial genome evolution in Euglenozoa?

    Czech Academy of Sciences Publication Activity Database

    Flegontov, Pavel; Gray, M.W.; Burger, G.; Lukeš, Julius

    2011-01-01

    Roč. 57, č. 4 (2011), 225-232 ISSN 0172-8083 Institutional research plan: CEZ:AV0Z60220518 Keywords : Euglena * Diplonema * Mitochondrial genome * RNA editing * Constructive neutral evolution Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 2.556, year: 2011

  20. Gene prediction in the fathead minnow [Pimephales promelas] genome

    Science.gov (United States)

    The fathead minnow is a well-established model organism which has been widely used for regulatory ecotoxicity testing and research for over half century. While much information has been gathered on the organism over the years, the fathead minnow genome, a critical source of infor...

  1. A reference gene set for sex pheromone biosynthesis and degradation genes from the diamondback moth, Plutella xylostella, based on genome and transcriptome digital gene expression analyses

    OpenAIRE

    He, Peng; Zhang, Yun-Fei; Hong, Duan-Yang; Wang, Jun; Wang, Xing-Liang; Zuo, Ling-Hua; Tang, Xian-Fu; Xu, Wei-Ming; He, Ming

    2017-01-01

    Background Female moths synthesize species-specific sex pheromone components and release them to attract male moths, which depend on precise sex pheromone chemosensory system to locate females. Two types of genes involved in the sex pheromone biosynthesis and degradation pathways play essential roles in this important moth behavior. To understand the function of genes in the sex pheromone pathway, this study investigated the genome-wide and digital gene expression of sex pheromone biosynthesi...

  2. Mitochondrial Genomes of Kinorhyncha: trnM Duplication and New Gene Orders within Animals

    OpenAIRE

    Popova, Olga V.; Mikhailov, Kirill V.; Nikitin, Mikhail A.; Logacheva, Maria D.; Penin, Aleksey A.; Muntyan, Maria S.; Kedrova, Olga S.; Petrov, Nikolai B.; Panchin, Yuri V.; Aleoshin, Vladimir V.

    2016-01-01

    Many features of mitochondrial genomes of animals, such as patterns of gene arrangement, nucleotide content and substitution rate variation are extensively used in evolutionary and phylogenetic studies. Nearly 6,000 mitochondrial genomes of animals have already been sequenced, covering the majority of animal phyla. One of the groups that escaped mitogenome sequencing is phylum Kinorhyncha-an isolated taxon of microscopic worm-like ecdysozoans. The kinorhynchs are thought to be one of the earl...

  3. Comparative genome sequencing of drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    Energy Technology Data Exchange (ETDEWEB)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  4. Analysis and prediction of gene splice sites in four Aspergillus genomes

    DEFF Research Database (Denmark)

    Wang, Kai; Ussery, David; Brunak, Søren

    2009-01-01

    Several Aspergillus fungal genomic sequences have been published, with many more in progress. Obviously, it is essential to have high-quality, consistently annotated sets of proteins from each of the genomes, in order to make meaningful comparisons. We have developed a dedicated, publicly available......, splice site prediction program called NetAspGene, for the genus Aspergillus. Gene sequences from Aspergillus fumigatus, the most common mould pathogen, were used to build and test our model. Compared to many animals and plants, Aspergillus contains smaller introns; thus we have applied a larger window...... better splice site prediction than other available tools. NetAspGene will be very helpful for the study in Aspergillus splice sites and especially in alternative splicing. A webpage for NetAspGene is publicly available at http://www.cbs.dtu.dk/services/NetAspGene....

  5. Large-scale trends in the evolution of gene structures within 11 animal genomes.

    Directory of Open Access Journals (Sweden)

    Mark Yandell

    2006-03-01

    Full Text Available We have used the annotations of six animal genomes (Homo sapiens, Mus musculus, Ciona intestinalis, Drosophila melanogaster, Anopheles gambiae, and Caenorhabditis elegans together with the sequences of five unannotated Drosophila genomes to survey changes in protein sequence and gene structure over a variety of timescales--from the less than 5 million years since the divergence of D. simulans and D. melanogaster to the more than 500 million years that have elapsed since the Cambrian explosion. To do so, we have developed a new open-source software library called CGL (for "Comparative Genomics Library". Our results demonstrate that change in intron-exon structure is gradual, clock-like, and largely independent of coding-sequence evolution. This means that genome annotations can be used in new ways to inform, corroborate, and test conclusions drawn from comparative genomics analyses that are based upon protein and nucleotide sequence similarities.

  6. Genome-Wide Analysis of the RNA Helicase Gene Family in Gossypium raimondii

    Directory of Open Access Journals (Sweden)

    Jie Chen

    2014-03-01

    Full Text Available The RNA helicases, which help to unwind stable RNA duplexes, and have important roles in RNA metabolism, belong to a class of motor proteins that play important roles in plant development and responses to stress. Although this family of genes has been the subject of systematic investigation in Arabidopsis, rice, and tomato, it has not yet been characterized in cotton. In this study, we identified 161 putative RNA helicase genes in the genome of the diploid cotton species Gossypium raimondii. We classified these genes into three subfamilies, based on the presence of either a DEAD-box (51 genes, DEAH-box (52 genes, or DExD/H-box (58 genes in their coding regions. Chromosome location analysis showed that the genes that encode RNA helicases are distributed across all 13 chromosomes of G. raimondii. Syntenic analysis revealed that 62 of the 161 G. raimondii helicase genes (38.5% are within the identified syntenic blocks. Sixty-six (40.99% helicase genes from G. raimondii have one or several putative orthologs in tomato. Additionally, GrDEADs have more conserved gene structures and more simple domains than GrDEAHs and GrDExD/Hs. Transcriptome sequencing data demonstrated that many of these helicases, especially GrDEADs, are highly expressed at the fiber initiation stage and in mature leaves. To our knowledge, this is the first report of a genome-wide analysis of the RNA helicase gene family in cotton.

  7. Genomic resources for gene discovery, functional genome annotation, and evolutionary studies of maize and its close relatives.

    Science.gov (United States)

    Wang, Chao; Shi, Xue; Liu, Lin; Li, Haiyan; Ammiraju, Jetty S S; Kudrna, David A; Xiong, Wentao; Wang, Hao; Dai, Zhaozhao; Zheng, Yonglian; Lai, Jinsheng; Jin, Weiwei; Messing, Joachim; Bennetzen, Jeffrey L; Wing, Rod A; Luo, Meizhong

    2013-11-01

    Maize is one of the most important food crops and a key model for genetics and developmental biology. A genetically anchored and high-quality draft genome sequence of maize inbred B73 has been obtained to serve as a reference sequence. To facilitate evolutionary studies in maize and its close relatives, much like the Oryza Map Alignment Project (OMAP) (www.OMAP.org) bacterial artificial chromosome (BAC) resource did for the rice community, we constructed BAC libraries for maize inbred lines Zheng58, Chang7-2, and Mo17 and maize wild relatives Zea mays ssp. parviglumis and Tripsacum dactyloides. Furthermore, to extend functional genomic studies to maize and sorghum, we also constructed binary BAC (BIBAC) libraries for the maize inbred B73 and the sorghum landrace Nengsi-1. The BAC/BIBAC vectors facilitate transfer of large intact DNA inserts from BAC clones to the BIBAC vector and functional complementation of large DNA fragments. These seven Zea Map Alignment Project (ZMAP) BAC/BIBAC libraries have average insert sizes ranging from 92 to 148 kb, organellar DNA from 0.17 to 2.3%, empty vector rates between 0.35 and 5.56%, and genome equivalents of 4.7- to 8.4-fold. The usefulness of the Parviglumis and Tripsacum BAC libraries was demonstrated by mapping clones to the reference genome. Novel genes and alleles present in these ZMAP libraries can now be used for functional complementation studies and positional or homology-based cloning of genes for translational genomics.

  8. Genome sequencing of herb Tulsi (Ocimum tenuiflorum) unravels key genes behind its strong medicinal properties.

    Science.gov (United States)

    Upadhyay, Atul K; Chacko, Anita R; Gandhimathi, A; Ghosh, Pritha; Harini, K; Joseph, Agnel P; Joshi, Adwait G; Karpe, Snehal D; Kaushik, Swati; Kuravadi, Nagesh; Lingu, Chandana S; Mahita, J; Malarini, Ramya; Malhotra, Sony; Malini, Manoharan; Mathew, Oommen K; Mutt, Eshita; Naika, Mahantesha; Nitish, Sathyanarayanan; Pasha, Shaik Naseer; Raghavender, Upadhyayula S; Rajamani, Anantharamanan; Shilpa, S; Shingate, Prashant N; Singh, Heikham Russiachand; Sukhwal, Anshul; Sunitha, Margaret S; Sumathi, Manojkumar; Ramaswamy, S; Gowda, Malali; Sowdhamini, Ramanathan

    2015-08-28

    Krishna Tulsi, a member of Lamiaceae family, is a herb well known for its spiritual, religious and medicinal importance in India. The common name of this plant is 'Tulsi' (or 'Tulasi' or 'Thulasi') and is considered sacred by Hindus. We present the draft genome of Ocimum tenuiflurum L (subtype Krishna Tulsi) in this report. The paired-end and mate-pair sequence libraries were generated for the whole genome sequenced with the Illumina Hiseq 1000, resulting in an assembled genome of 374 Mb, with a genome coverage of 61 % (612 Mb estimated genome size). We have also studied transcriptomes (RNA-Seq) of two subtypes of O. tenuiflorum, Krishna and Rama Tulsi and report the relative expression of genes in both the varieties. The pathways leading to the production of medicinally-important specialized metabolites have been studied in detail, in relation to similar pathways in Arabidopsis thaliana and other plants. Expression levels of anthocyanin biosynthesis-related genes in leaf samples of Krishna Tulsi were observed to be relatively high, explaining the purple colouration of Krishna Tulsi leaves. The expression of six important genes identified from genome data were validated by performing q-RT-PCR in different tissues of five different species, which shows the high extent of urosolic acid-producing genes in young leaves of the Rama subtype. In addition, the presence of eugenol and ursolic acid, implied as potential drugs in the cure of many diseases including cancer was confirmed using mass spectrometry. The availability of the whole genome of O.tenuiflorum and our sequence analysis suggests that small amino acid changes at the functional sites of genes involved in metabolite synthesis pathways confer special medicinal properties to this herb.

  9. Herbicide Glyphosate Impact to Earthworm (E. fetida

    Directory of Open Access Journals (Sweden)

    Greta Dajoraitė

    2016-10-01

    Full Text Available Glyphosate is a broad spectrum weed resistant herbicide. Glyphosate may pose negative impact on land ecosystems because of wide broad usage and hydrofilic characteristic. The aim of this study was to investigate negative effects of glyphosate on soil invertebrate organisms (earthworm Eisenia fetida. The duration of experiment was 8 weeks. The range of the test concentrations of glyphosate were: 0,1, 1, 5, 10, 20 mg/kg. To investigate the glyphosate impact on earthworm Eisenia fetida the following endpoints were measured: survival, reproduction and weight. The exposure to 20 mg/kg glyphosate has led to the 100% mortality of earthworms. Glyphosate has led to decreased E. fetida reproduction, the cocoons were observed only in the lowest concentration (0,1 mg/kg. In general: long-term glyphosate toxicity to earthworms (E. fetida may be significant.

  10. Antimicrobial activity of earthworm (Eudrilus eugeniae) paste

    African Journals Online (AJOL)

    kavi pradeep

    2013-08-01

    Aug 1, 2013 ... Earthworm plays a major role in the proper functioning of the soil ecosystem. ... Minimum inhibitory concentration (MIC) was determined using micro dilution ... worms were kept in plastic troughs, covered tightly with polythene.

  11. Genome-wide analysis of WRKY gene family in Cucumis sativus.

    Science.gov (United States)

    Ling, Jian; Jiang, Weijie; Zhang, Ying; Yu, Hongjun; Mao, Zhenchuan; Gu, Xingfang; Huang, Sanwen; Xie, Bingyan

    2011-09-28

    WRKY proteins are a large family of transcriptional regulators in higher plant. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stresses (cold, drought or salinity). The expression profile of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes.

  12. Genome-Wide Analysis of Syntenic Gene Deletion in the Grasses

    Science.gov (United States)

    Schnable, James C.; Freeling, Michael; Lyons, Eric

    2012-01-01

    The grasses, Poaceae, are one of the largest and most successful angiosperm families. Like many radiations of flowering plants, the divergence of the major grass lineages was preceded by a whole-genome duplication (WGD), although these events are not rare for flowering plants. By combining identification of syntenic gene blocks with measures of gene pair divergence and different frequencies of ancient gene loss, we have separated the two subgenomes present in modern grasses. Reciprocal loss of duplicated genes or genomic regions has been hypothesized to reproductively isolate populations and, thus, speciation. However, in contrast to previous studies in yeast and teleost fishes, we found very little evidence of reciprocal loss of homeologous genes between the grasses, suggesting that post-WGD gene loss may not be the cause of the grass radiation. The sets of homeologous and orthologous genes and predicted locations of deleted genes identified in this study, as well as links to the CoGe comparative genomics web platform for analyzing pan-grass syntenic regions, are provided along with this paper as a resource for the grass genetics community. PMID:22275519

  13. Generating Genome-Scale Candidate Gene Lists for Pharmacogenomics

    DEFF Research Database (Denmark)

    Hansen, Niclas Tue; Brunak, Søren; Altman, R. B.

    2009-01-01

    A critical task in pharmacogenomics is identifying genes that may be important modulators of drug response. High-throughput experimental methods are often plagued by false positives and do not take advantage of existing knowledge. Candidate gene lists can usefully summarize existing knowledge...

  14. Genome polymorphism markers and stress genes expression for ...

    African Journals Online (AJOL)

    SAM

    2014-06-11

    Jun 11, 2014 ... RNA extraction and purification for SOD and PAL gene expression. Fresh leaf tissues (100 mg), from ... Data analysis. Gelquant program for quantification of protein, DNA and RNA gel. (version 1.8.2) was used for .... by reprogramming the expression of endogenous genes. Higher level of these antioxidant ...

  15. Genome organization and expression of the rat ACBP gene family

    DEFF Research Database (Denmark)

    Mandrup, S; Andreasen, P H; Knudsen, J

    1993-01-01

    pool former. We have molecularly cloned and characterized the rat ACBP gene family which comprises one expressed and four processed pseudogenes. One of these was shown to exist in two allelic forms. A comprehensive computer-aided analysis of the promoter region of the expressed ACBP gene revealed...

  16. Genome scanning for identification of resistance gene analogs (RGAs)

    African Journals Online (AJOL)

    Disease resistance in plants is a desirable economic trait. Many disease resistance genes from various plants have been cloned so far. The gene products of some of these can be distinguished by the presence of an N terminal nucleotide binding site and a C-terminal stretch of leucine-rich repeats. Oligonucleotides already ...

  17. Genome-wide analysis of cell wall-related genes in Tuber melanosporum.

    Science.gov (United States)

    Balestrini, Raffaella; Sillo, Fabiano; Kohler, Annegret; Schneider, Georg; Faccio, Antonella; Tisserant, Emilie; Martin, Francis; Bonfante, Paola

    2012-06-01

    A genome-wide inventory of proteins involved in cell wall synthesis and remodeling has been obtained by taking advantage of the recently released genome sequence of the ectomycorrhizal Tuber melanosporum black truffle. Genes that encode cell wall biosynthetic enzymes, enzymes involved in cell wall polysaccharide synthesis or modification, GPI-anchored proteins and other cell wall proteins were identified in the black truffle genome. As a second step, array data were validated and the symbiotic stage was chosen as the main focus. Quantitative RT-PCR experiments were performed on 29 selected genes to verify their expression during ectomycorrhizal formation. The results confirmed the array data, and this suggests that cell wall-related genes are required for morphogenetic transition from mycelium growth to the ectomycorrhizal branched hyphae. Labeling experiments were also performed on T. melanosporum mycelium and ectomycorrhizae to localize cell wall components.

  18. Comparative analysis of genome maintenance genes in naked mole rat, mouse, and human.

    Science.gov (United States)

    MacRae, Sheila L; Zhang, Quanwei; Lemetre, Christophe; Seim, Inge; Calder, Robert B; Hoeijmakers, Jan; Suh, Yousin; Gladyshev, Vadim N; Seluanov, Andrei; Gorbunova, Vera; Vijg, Jan; Zhang, Zhengdong D

    2015-04-01

    Genome maintenance (GM) is an essential defense system against aging and cancer, as both are characterized by increased genome instability. Here, we compared the copy number variation and mutation rate of 518 GM-associated genes in the naked mole rat (NMR), mouse, and human genomes. GM genes appeared to be strongly conserved, with copy number variation in only four genes. Interestingly, we found NMR to have a higher copy number of CEBPG, a regulator of DNA repair, and TINF2, a protector of telomere integrity. NMR, as well as human, was also found to have a lower rate of germline nucleotide substitution than the mouse. Together, the data suggest that the long-lived NMR, as well as human, has more robust GM than mouse and identifies new targets for the analysis of the exceptional longevity of the NMR. © 2015 The Authors. Aging Cell published by the Anatomical Society and John Wiley & Sons Ltd.

  19. Genomic Prediction and Association Mapping of Curd-Related Traits in Gene Bank Accessions of Cauliflower.

    Science.gov (United States)

    Thorwarth, Patrick; Yousef, Eltohamy A A; Schmid, Karl J

    2018-02-02

    Genetic resources are an important source of genetic variation for plant breeding. Genome-wide association studies (GWAS) and genomic prediction greatly facilitate the analysis and utilization of useful genetic diversity for improving complex phenotypic traits in crop plants. We explored the potential of GWAS and genomic prediction for improving curd-related traits in cauliflower ( Brassica oleracea var. botrytis ) by combining 174 randomly selected cauliflower gene bank accessions from two different gene banks. The collection was genotyped with genotyping-by-sequencing (GBS) and phenotyped for six curd-related traits at two locations and three growing seasons. A GWAS analysis based on 120,693 single-nucleotide polymorphisms identified a total of 24 significant associations for curd-related traits. The potential for genomic prediction was assessed with a genomic best linear unbiased prediction model and BayesB. Prediction abilities ranged from 0.10 to 0.66 for different traits and did not differ between prediction methods. Imputation of missing genotypes only slightly improved prediction ability. Our results demonstrate that GWAS and genomic prediction in combination with GBS and phenotyping of highly heritable traits can be used to identify useful quantitative trait loci and genotypes among genetically diverse gene bank material for subsequent utilization as genetic resources in cauliflower breeding. Copyright © 2018 Thorwarth et al.

  20. Genomic Prediction and Association Mapping of Curd-Related Traits in Gene Bank Accessions of Cauliflower

    Directory of Open Access Journals (Sweden)

    Patrick Thorwarth

    2018-02-01

    Full Text Available Genetic resources are an important source of genetic variation for plant breeding. Genome-wide association studies (GWAS and genomic prediction greatly facilitate the analysis and utilization of useful genetic diversity for improving complex phenotypic traits in crop plants. We explored the potential of GWAS and genomic prediction for improving curd-related traits in cauliflower (Brassica oleracea var. botrytis by combining 174 randomly selected cauliflower gene bank accessions from two different gene banks. The collection was genotyped with genotyping-by-sequencing (GBS and phenotyped for six curd-related traits at two locations and three growing seasons. A GWAS analysis based on 120,693 single-nucleotide polymorphisms identified a total of 24 significant associations for curd-related traits. The potential for genomic prediction was assessed with a genomic best linear unbiased prediction model and BayesB. Prediction abilities ranged from 0.10 to 0.66 for different traits and did not differ between prediction methods. Imputation of missing genotypes only slightly improved prediction ability. Our results demonstrate that GWAS and genomic prediction in combination with GBS and phenotyping of highly heritable traits can be used to identify useful quantitative trait loci and genotypes among genetically diverse gene bank material for subsequent utilization as genetic resources in cauliflower breeding.

  1. Characterisation of genome-wide PLZF/RARA target genes.

    Directory of Open Access Journals (Sweden)

    Salvatore Spicuglia

    Full Text Available The PLZF/RARA fusion protein generated by the t(11;17(q23;q21 translocation in acute promyelocytic leukaemia (APL is believed to act as an oncogenic transcriptional regulator recruiting epigenetic factors to genes important for its transforming potential. However, molecular mechanisms associated with PLZF/RARA-dependent leukaemogenesis still remain unclear.We searched for specific PLZF/RARA target genes by ChIP-on-chip in the haematopoietic cell line U937 conditionally expressing PLZF/RARA. By comparing bound regions found in U937 cells expressing endogenous PLZF with PLZF/RARA-induced U937 cells, we isolated specific PLZF/RARA target gene promoters. We next analysed gene expression profiles of our identified target genes in PLZF/RARA APL patients and analysed DNA sequences and epigenetic modification at PLZF/RARA binding sites. We identify 413 specific PLZF/RARA target genes including a number encoding transcription factors involved in the regulation of haematopoiesis. Among these genes, 22 were significantly down regulated in primary PLZF/RARA APL cells. In addition, repressed PLZF/RARA target genes were associated with increased levels of H3K27me3 and decreased levels of H3K9K14ac. Finally, sequence analysis of PLZF/RARA bound sequences reveals the presence of both consensus and degenerated RAREs as well as enrichment for tissue-specific transcription factor motifs, highlighting the complexity of targeting fusion protein to chromatin. Our study suggests that PLZF/RARA directly targets genes important for haematopoietic development and supports the notion that PLZF/RARA acts mainly as an epigenetic regulator of its direct target genes.

  2. Origin of the Y genome in Elymus and its relationship to other genomes in Triticeae based on evidence from elongation factor G (EF-G) gene sequences.

    Science.gov (United States)

    Sun, Genlou; Komatsuda, Takao

    2010-08-01

    It is well known that Elymus arose through hybridization between representatives of different genera. Cytogenetic analyses show that all its members include the St genome in combination with one or more of four other genomes, the H, Y, P, and W genomes. The origins of the H, P, and W genomes are known, but not for the Y genome. We analyzed the single copy nuclear gene coding for elongation factor G (EF-G) from 28 accessions of polyploid Elymus species and 45 accessions of diploid Triticeae species in order to investigate origin of the Y genome and its relationship to other genomes in the tribe Triticeae. Sequence comparisons among the St, H, Y, P, W, and E genomes detected genome-specific polymorphisms at 66 nucleotide positions. The St and Y genomes are relatively dissimilar. The phylogeny of the Y genome sequences was investigated for the first time. They were most similar to the W genome sequences. The Y genome sequences were placed in two different groups. These two groups were included in an unresolved clade that included the W and E sequences as well as sequences from many annual species. The H genomes sequences were in a clade with the F, P, and Ns genome sequences as sister groups. These two clades were more closely related to each other and to the L and Xp genomes than they were to the St genome sequences. These data support the hypothesis that the Y genome evolved in a diploid species and has a different origin from the St genome. Copyright 2010 Elsevier Inc. All rights reserved.

  3. Earthworms – good indicators for forest disturbance

    OpenAIRE

    YAHYA KOOCH; KATAYOUN HAGHVERDI

    2014-01-01

    In temperate forests, formation of canopy gaps by windthrow is a characteristic natural disturbance event. Little work has been done on the effects of canopy gaps on soil properties and fauna, especially earthworms as ecosystem engineers. We conducted a study to examine the reaction of earthworms (density/biomass) and different soil properties (i.e., soil moisture, pH, organic matter, total N, and available Ca) to different canopy gap areas in 25-ha areas of Liresar district beech forest loca...

  4. Genomic characterisation of Wongabel virus reveals novel genes within the Rhabdoviridae.

    Science.gov (United States)

    Gubala, Aneta J; Proll, David F; Barnard, Ross T; Cowled, Chris J; Crameri, Sandra G; Hyatt, Alex D; Boyle, David B

    2008-06-20

    Viruses belonging to the family Rhabdoviridae infect a variety of different hosts, including insects, vertebrates and plants. Currently, there are approximately 200 ICTV-recognised rhabdoviruses isolated around the world. However, the majority remain poorly characterised and only a fraction have been definitively assigned to genera. The genomic and transcriptional complexity displayed by several of the characterised rhabdoviruses indicates large diversity and complexity within this family. To enable an improved taxonomic understanding of this family, it is necessary to gain further information about the poorly characterised members of this family. Here we present the complete genome sequence and predicted transcription strategy of Wongabel virus (WONV), a previously uncharacterised rhabdovirus isolated from biting midges (Culicoides austropalpalis) collected in northern Queensland, Australia. The 13,196 nucleotide genome of WONV encodes five typical rhabdovirus genes N, P, M, G and L. In addition, the WONV genome contains three genes located between the P and M genes (U1, U2, U3) and two open reading frames overlapping with the N and G genes (U4, U5). These five additional genes and their putative protein products appear to be novel, and their functions are unknown. Predictive analysis of the U5 gene product revealed characteristics typical of viroporins, and indicated structural similarities with the alpha-1 protein (putative viroporin) of viruses in the genus Ephemerovirus. Phylogenetic analyses of the N and G proteins of WONV indicated closest similarity with the avian-associated Flanders virus; however, the genomes of these two viruses are significantly diverged. WONV displays a novel and unique genome structure that has not previously been described for any animal rhabdovirus.

  5. Transcriptional interference networks coordinate the expression of functionally related genes clustered in the same genomic loci.

    Science.gov (United States)

    Boldogköi, Zsolt

    2012-01-01

    The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organization, transcription, various post-transcriptional processes, and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighboring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronized cascade of gene expression in functionally linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular organisms too.

  6. Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium.

    Science.gov (United States)

    Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole; Zupan, Blaž; Shaulsky, Gad

    2016-09-01

    Whole-genome sequencing is a useful approach for identification of chemical-induced lesions, but previous applications involved tedious genetic mapping to pinpoint the causative mutations. We propose that saturation mutagenesis under low mutagenic loads, followed by whole-genome sequencing, should allow direct implication of genes by identifying multiple independent alleles of each relevant gene. We tested the hypothesis by performing three genetic screens with chemical mutagenesis in the social soil amoeba Dictyostelium discoideum Through genome sequencing, we successfully identified mutant genes with multiple alleles in near-saturation screens, including resistance to intense illumination and strong suppressors of defects in an allorecognition pathway. We tested the causality of the mutations by comparison to published data and by direct complementation tests, finding both dominant and recessive causative mutations. Therefore, our strategy provides a cost- and time-efficient approach to gene discovery by integrating chemical mutagenesis and whole-genome sequencing. The method should be applicable to many microbial systems, and it is expected to revolutionize the field of functional genomics in Dictyostelium by greatly expanding the mutation spectrum relative to other common mutagenesis methods. © 2016 Li et al.; Published by Cold Spring Harbor Laboratory Press.

  7. The Genome Biology of Effector Gene Evolution in Filamentous Plant Pathogens.

    Science.gov (United States)

    Sánchez-Vallet, Andrea; Fouché, Simone; Fudal, Isabelle; Hartmann, Fanny E; Soyer, Jessica L; Tellier, Aurélien; Croll, Daniel

    2018-05-16

    Filamentous pathogens, including fungi and oomycetes, pose major threats to global food security. Crop pathogens cause damage by secreting effectors that manipulate the host to the pathogen's advantage. Genes encoding such effectors are among the most rapidly evolving genes in pathogen genomes. Here, we review how the major characteristics of the emergence, function, and regulation of effector genes are tightly linked to the genomic compartments where these genes are located in pathogen genomes. The presence of repetitive elements in these compartments is associated with elevated rates of point mutations and sequence rearrangements with a major impact on effector diversification. The expression of many effectors converges on an epigenetic control mediated by the presence of repetitive elements. Population genomics analyses showed that rapidly evolving pathogens show high rates of turnover at effector loci and display a mosaic in effector presence-absence polymorphism among strains. We conclude that effective pathogen containment strategies require a thorough understanding of the effector genome biology and the pathogen's potential for rapid adaptation. Expected final online publication date for the Annual Review of Phytopathology Volume 56 is August 25, 2018. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

  8. Genes, race, and psychology in the genome era: an introduction.

    Science.gov (United States)

    Anderson, Norman B; Nickerson, Kim J

    2005-01-01

    The mapping of the human genome has reawakened interest in the topic of race and genetics, especially the use of genetic technology to examine racial differences in complex outcomes such as health and intelligence. Advances in genomic research challenge psychology to address the myriad conceptual, methodological, and analytical issues associated with research on genetics and race. In addition, the field needs to understand the numerous social, ethical, legal, clinical, and policy implications of research in this arena. Addressing these issues should not only benefit psychology but could also serve to guide such thought in other fields, including molecular biology. The purpose of this special issue is to begin a discussion of this issue of race and genetics within the field of psychology. Several scholars who work in the fields of genetics, race, or related areas were invited to write (or had previously submitted) articles sharing their perspectives. (c) 2005 APA

  9. Genome-wide identification and comparative expression analysis of LEA genes in watermelon and melon genomes.

    Science.gov (United States)

    Celik Altunoglu, Yasemin; Baloglu, Mehmet Cengiz; Baloglu, Pinar; Yer, Esra Nurten; Kara, Sibel

    2017-01-01

    Late embryogenesis abundant (LEA) proteins are large and diverse group of polypeptides which were first identified during seed dehydration and then in vegetative plant tissues during different stress responses. Now, gene family members of LEA proteins have been detected in various organisms. However, there is no report for this protein family in watermelon and melon until this study. A total of 73 LEA genes from watermelon ( ClLEA ) and 61 LEA genes from melon ( CmLEA ) were identified in this comprehensive study. They were classified into four and three distinct clusters in watermelon and melon, respectively. There was a correlation between gene structure and motif composition among each LEA groups. Segmental duplication played an important role for LEA gene expansion in watermelon. Maximum gene ontology of LEA genes was observed with poplar LEA genes. For evaluation of tissue specific expression patterns of ClLEA and CmLEA genes, publicly available RNA-seq data were analyzed. The expression analysis of selected LEA genes in root and leaf tissues of drought-stressed watermelon and melon were examined using qRT-PCR. Among them, ClLEA - 12 - 17 - 46 genes were quickly induced after drought application. Therefore, they might be considered as early response genes for water limitation conditions in watermelon. In addition, CmLEA - 42 - 43 genes were found to be up-regulated in both tissues of melon under drought stress. Our results can open up new frontiers about understanding of functions of these important family members under normal developmental stages and stress conditions by bioinformatics and transcriptomic approaches.

  10. Of genes and microbes: solving the intricacies in host genomes.

    Science.gov (United States)

    Wang, Jun; Chen, Liang; Zhao, Na; Xu, Xizhan; Xu, Yakun; Zhu, Baoli

    2018-05-01

    Microbiome research is a quickly developing field in biomedical research, and we have witnessed its potential in understanding the physiology, metabolism and immunology, its critical role in understanding the health and disease of the host, and its vast capacity in disease prediction, intervention and treatment. However, many of the fundamental questions still need to be addressed, including the shaping forces of microbial diversity between individuals and across time. Microbiome research falls into the classical nature vs. nurture scenario, such that host genetics shape part of the microbiome, while environmental influences change the original course of microbiome development. In this review, we focus on the nature, i.e., the genetic part of the equation, and summarize the recent efforts in understanding which parts of the genome, especially the human and mouse genome, play important roles in determining the composition and functions of microbial communities, primarily in the gut but also on the skin. We aim to present an overview of different approaches in studying the intricate relationships between host genetic variations and microbes, its underlying philosophy and methodology, and we aim to highlight a few key discoveries along this exploration, as well as current pitfalls. More evidence and results will surely appear in upcoming studies, and the accumulating knowledge will lead to a deeper understanding of what we could finally term a "hologenome", that is, the organized, closely interacting genome of the host and the microbiome.

  11. Genome-wide analysis of the MYB gene family in physic nut (Jatropha curcas L.).

    Science.gov (United States)

    Zhou, Changpin; Chen, Yanbo; Wu, Zhenying; Lu, Wenjia; Han, Jinli; Wu, Pingzhi; Chen, Yaping; Li, Meiru; Jiang, Huawu; Wu, Guojiang

    2015-11-01

    The MYB proteins comprise one of the largest transcription factor families in plants, and play key roles in regulatory networks controlling development, metabolism, and stress responses. A total of 125 MYB genes (JcMYB) have been identified in the physic nut (Jatropha curcas L.) genome, including 120 2R-type MYB, 4 3R-MYB, and 1 4R-MYB genes. Based on exon-intron arrangement of MYBs from both lower (Physcomitrella patens) and higher (physic nut, Arabidopsis, and rice) plants, we can classify plant MYB genes into ten groups (MI-X), except for MIX genes which are nonexistent in higher plants. We also observed that MVIII genes may be one of the most ancient MYB types which consist of both R2R3- and 3R-MYB genes. Most MYB genes (76.8% in physic nut) belong to the MI group which can be divided into 34 subgroups. The JcMYB genes were nonrandomly distributed on its 11 linkage groups (LGs). The expansion of MYB genes across several subgroups was observed and resulted from genome triplication of ancient dicotyledons and from both ancient and recent tandem duplication events in the physic nut genome. The expression patterns of several MYB duplicates in the physic nut showed differences in four tissues (root, stem, leaf, and seed), and 34 MYB genes responded to at least one abiotic stressor (drought, salinity, phosphate starvation, and nitrogen starvation) in leaves and/or roots based on the data analysis of digital gene expression tags. Overexpression of the JcMYB001 gene in Arabidopsis increased its sensitivity to drought and salinity stresses. Copyright © 2015 Elsevier B.V. All rights reserved.

  12. Inferring causal genomic alterations in breast cancer using gene expression data

    Science.gov (United States)

    2011-01-01

    Background One of the primary objectives in cancer research is to identify causal genomic alterations, such as somatic copy number variation (CNV) and somatic mutations, during tumor development. Many valuable studies lack genomic data to detect CNV; therefore, methods that are able to infer CNVs from gene expression data would help maximize the value of these studies. Results We developed a framework for identifying recurrent regions of CNV and distinguishing the cancer driver genes from the passenger genes in the regions. By inferring CNV regions across many datasets we were able to identify 109 recurrent amplified/deleted CNV regions. Many of these regions are enriched for genes involved in many important processes associated with tumorigenesis and cancer progression. Genes in these recurrent CNV regions were then examined in the context of gene regulatory networks to prioritize putative cancer driver genes. The cancer driver genes uncovered by the framework include not only well-known oncogenes but also a number of novel cancer susceptibility genes validated via siRNA experiments. Conclusions To our knowledge, this is the first effort to systematically identify and validate drivers for expression based CNV regions in breast cancer. The framework where the wavelet analysis of copy number alteration based on expression coupled with the gene regulatory network analysis, provides a blueprint for leveraging genomic data to identify key regulatory components and gene targets. This integrative approach can be applied to many other large-scale gene expression studies and other novel types of cancer data such as next-generation sequencing based expression (RNA-Seq) as well as CNV data. PMID:21806811

  13. Development of earthworm burrow systems and the influence of earthworms on soil hydrology

    NARCIS (Netherlands)

    Ligthart, T.N.

    1996-01-01


    Inoculation of earthworms can help to restore or ameliorate land qualities. Earthworms create burrows and alter the structure of the soil matrix, which influence the water infiltration, drainage, water retention and the aeration of the soil. The way and rate of the development of

  14. Earthworm invasion in North America: Food resource competition affects native millipede survival and invasive earthworm reproduction

    Science.gov (United States)

    Bruce Snyder; Mac Callaham; Christopher Lowe; Paul Hendrix

    2013-01-01

    The invasive non-native earthworm Amynthas agrestis (Goto and Hatai, 1899) has recently been documented invading forests of the Appalachian Mountains in the southeastern United States. This epigeic earthworm decreases the depth of organic soil horizons, and this may play a role in the decrease of millipede richness and abundance associated with A. agrestis invasion. To...

  15. Assembling large genomes: analysis of the stick insect (Clitarchus hookeri) genome reveals a high repeat content and sex-biased genes associated with reproduction.

    Science.gov (United States)

    Wu, Chen; Twort, Victoria G; Crowhurst, Ross N; Newcomb, Richard D; Buckley, Thomas R

    2017-11-16

    Stick insects (Phasmatodea) have a high incidence of parthenogenesis and other alternative reproductive strategies, yet the genetic basis of reproduction is poorly understood. Phasmatodea includes nearly 3000 species, yet only the genome of Timema cristinae has been published to date. Clitarchus hookeri is a geographical parthenogenetic stick insect distributed across New Zealand. Sexual reproduction dominates in northern habitats but is replaced by parthenogenesis in the south. Here, we present a de novo genome assembly of a female C. hookeri and use it to detect candidate genes associated with gamete production and development in females and males. We also explore the factors underlying large genome size in stick insects. The C. hookeri genome assembly was 4.2 Gb, similar to the flow cytometry estimate, making it the second largest insect genome sequenced and assembled to date. Like the large genome of Locusta migratoria, the genome of C. hookeri is also highly repetitive and the predicted gene models are much longer than those from most other sequenced insect genomes, largely due to longer introns. Miniature inverted repeat transposable elements (MITEs), absent in the much smaller T. cristinae genome, is the most abundant repeat type in the C. hookeri genome assembly. Mapping RNA-Seq reads from female and male gonadal transcriptomes onto the genome assembly resulted in the identification of 39,940 gene loci, 15.8% and 37.6% of which showed female-biased and male-biased expression, respectively. The genes that were over-expressed in females were mostly associated with molecular transportation, developmental process, oocyte growth and reproductive process; whereas, the male-biased genes were enriched in rhythmic process, molecular transducer activity and synapse. Several genes involved in the juvenile hormone synthesis pathway were also identified. The evolution of large insect genomes such as L. migratoria and C. hookeri genomes is most likely due to the

  16. Gene cloning: exploring cotton functional genomics and genetic improvement

    Institute of Scientific and Technical Information of China (English)

    Diqiu LIU; Xianlong ZHANG

    2008-01-01

    Cotton is the most important natural fiber plant in the world. The genetic improvement of the quality of the cotton fiber and agricultural productivity is imperative under the situation of increasing consumption and rapid development of textile technology. Recently, the study of cotton molecular biology has progressed greatly. A lot of specifically or preferentially expressed cotton fiber genes were cloned and analyzed. On the other hand, identification of stress response genes expressed in cotton was performed by other research groups. The major stress factors were studied including the wilt pathogens Verticillium dahliae, Fusarium oxy-sporum f. sp. vasinfectum, bacterial blight, root-knot nematode, drought, and salt stress. What is more, a few genes related to the biosynthesis of gossypol, other sesquiterpene phytoalexins and the major seed oil fatty acids were isolated from cotton. In the present review, we focused on the major advances in cotton gene cloning and expression profiling in the recent years.

  17. Genome and transcriptome sequencing characterises the gene space of Macadamia integrifolia (Proteaceae).

    Science.gov (United States)

    Nock, Catherine J; Baten, Abdul; Barkla, Bronwyn J; Furtado, Agnelo; Henry, Robert J; King, Graham J

    2016-11-17

    The large Gondwanan plant family Proteaceae is an early-diverging eudicot lineage renowned for its morphological, taxonomic and ecological diversity. Macadamia is the most economically important Proteaceae crop and represents an ancient rainforest-restricted lineage. The family is a focus for studies of adaptive radiation due to remarkable species diversification in Mediterranean-climate biodiversity hotspots, and numerous evolutionary transitions between biomes. Despite a long history of research, comparative analyses in the Proteaceae and macadamia breeding programs are restricted by a paucity of genetic information. To address this, we sequenced the genome and transcriptome of the widely grown Macadamia integrifolia cultivar 741. Over 95 gigabases of DNA and RNA-seq sequence data were de novo assembled and annotated. The draft assembly has a total length of 518 Mb and spans approximately 79% of the estimated genome size. Following annotation, 35,337 protein-coding genes were predicted of which over 90% were expressed in at least one of the leaf, shoot or flower tissues examined. Gene family comparisons with five other eudicot species revealed 13,689 clusters containing macadamia genes and 1005 macadamia-specific clusters, and provides evidence for linage-specific expansion of gene families involved in pathogen recognition, plant defense and monoterpene synthesis. Cyanogenesis is an important defense strategy in the Proteaceae, and a detailed analysis of macadamia gene homologues potentially involved in cyanogenic glycoside biosynthesis revealed several highly expressed candidate genes. The gene space of macadamia provides a foundation for comparative genomics, gene discovery and the acceleration of molecular-assisted breeding. This study presents the first available genomic resources for the large basal eudicot family Proteaceae, access to most macadamia genes and opportunities to uncover the genetic basis of traits of importance for adaptation and crop

  18. Major soybean maturity gene haplotypes revealed by SNPViz analysis of 72 sequenced soybean genomes.

    Directory of Open Access Journals (Sweden)

    Tiffany Langewisch

    Full Text Available In this Genomics Era, vast amounts of next-generation sequencing data have become publicly available for multiple genomes across hundreds of species. Analyses of these large-scale datasets can become cumbersome, especially when comparing nucleotide polymorphisms across many samples within a dataset and among different datasets or organisms. To facilitate the exploration of allelic variation and diversity, we have developed and deployed an in-house computer software to categorize and visualize these haplotypes. The SNPViz software enables users to analyze region-specific haplotypes from single nucleotide polymorphism (SNP datasets for different sequenced genomes. The examination of allelic variation and diversity of important soybean [Glycine max (L. Merr.] flowering time and maturity genes may provide additional insight into flowering time regulation and enhance researchers' ability to target soybean breeding for particular environments. For this study, we utilized two available soybean genomic datasets for a total of 72 soybean genotypes encompassing cultivars, landraces, and the wild species Glycine soja. The major soybean maturity genes E1, E2, E3, and E4 along with the Dt1 gene for plant growth architecture were analyzed in an effort to determine the number of major haplotypes for each gene, to evaluate the consistency of the haplotypes with characterized variant alleles, and to identify evidence of artificial selection. The results indicated classification of a small number of predominant haplogroups for each gene and important insights into possible allelic diversity for each gene within the context of known causative mutations. The software has both a stand-alone and web-based version and can be used to analyze other genes, examine additional soybean datasets, and view similar genome sequence and SNP datasets from other species.

  19. Codon usage is associated with the evolutionary age of genes in metazoan genomes

    Directory of Open Access Journals (Sweden)

    Linial Nathan

    2009-12-01

    Full Text Available Abstract Background Codon usage may vary significantly between different organisms and between genes within the same organism. Several evolutionary processes have been postulated to be the predominant determinants of codon usage: selection, mutation, and genetic drift. However, the relative contribution of each of these factors in different species remains debatable. The availability of complete genomes for tens of multicellular organisms provides an opportunity to inspect the relationship between codon usage and the evolutionary age of genes. Results We assign an evolutionary age to a gene based on the relative positions of its identified homologues in a standard phylogenetic tree. This yields a classification of all genes in a genome to several evolutionary age classes. The present study starts from the observation that each age class of genes has a unique codon usage and proceeds to provide a quantitative analysis of the codon usage in these classes. This observation is made for the genomes of Homo sapiens, Mus musculus, and Drosophila melanogaster. It is even more remarkable that the differences between codon usages in different age groups exhibit similar and consistent behavior in various organisms. While we find that GC content and gene length are also associated with the evolutionary age of genes, they can provide only a partial explanation for the observed codon usage. Conclusion While factors such as GC content, mutational bias, and selection shape the codon usage in a genome, the evolutionary history of an organism over hundreds of millions of years is an overlooked property that is strongly linked to GC content, protein length, and, even more significantly, to the codon usage of metazoan genomes.

  20. Ancient signals: comparative genomics of plant MAPK and MAPKK gene families

    DEFF Research Database (Denmark)

    Hamel, Louis-Philippe; Nicole, Marie-Claude; Sritubtim, Somrudee

    2006-01-01

    MAPK signal transduction modules play crucial roles in regulating many biological processes in plants, and their components are encoded by highly conserved genes. The recent availability of genome sequences for rice and poplar now makes it possible to examine how well the previously described...... Arabidopsis MAPK and MAPKK gene family structures represent the broader evolutionary situation in plants, and analysis of gene expression data for MPK and MKK genes in all three species allows further refinement of those families, based on functionality. The Arabidopsis MAPK nomenclature appears sufficiently...

  1. A genomic perspective on protein tyrosine phosphatases: gene structure, pseudogenes, and genetic disease linkage

    DEFF Research Database (Denmark)

    Andersen, Jannik N; Jansen, Peter G; Echwald, Søren M

    2004-01-01

    sequence databases, we discovered one novel human PTP gene and defined chromosomal loci and exon structure of the additional 37 genes encoding known PTP transcripts. Direct orthologs were present in the mouse genome for all 38 human PTP genes. In addition, we identified 12 PTP pseudogenes unique to humans...... that have probably contaminated previous bioinformatics analysis of this gene family. PCR amplification and transcript sequencing indicate that some PTP pseudogenes are expressed, but their function (if any) is unknown. Furthermore, we analyzed the enhanced diversity generated by alternative splicing...

  2. Annotating novel genes by integrating synthetic lethals and genomic information

    Directory of Open Access Journals (Sweden)

    Faty Mahamadou

    2008-01-01

    Full Text Available Abstract Background Large scale screening for synthetic lethality serves as a common tool in yeast genetics to systematically search for genes that play a role in specific biological processes. Often the amounts of data resulting from a single large scale screen far exceed the capacities of experimental characterization of every identified target. Thus, there is need for computational tools that select promising candidate genes in order to reduce the number of follow-up experiments to a manageable size. Results We analyze synthetic lethality data for arp1 and jnm1, two spindle migration genes, in order to identify novel members in this process. To this end, we use an unsupervised statistical method that integrates additional information from biological data sources, such as gene expression, phenotypic profiling, RNA degradation and sequence similarity. Different from existing methods that require large amounts of synthetic lethal data, our method merely relies on synthetic lethality information from two single screens. Using a Multivariate Gaussian Mixture Model, we determine the best subset of features that assign the target genes to two groups. The approach identifies a small group of genes as candidates involved in spindle migration. Experimental testing confirms the majority of our candidates and we present she1 (YBL031W as a novel gene involved in spindle migration. We applied the statistical methodology also to TOR2 signaling as another example. Conclusion We demonstrate the general use of Multivariate Gaussian Mixture Modeling for selecting candidate genes for experimental characterization from synthetic lethality data sets. For the given example, integration of different data sources contributes to the identification of genetic interaction partners of arp1 and jnm1 that play a role in the same biological process.

  3. Machine Learning Leveraging Genomes from Metagenomes Identifies Influential Antibiotic Resistance Genes in the Infant Gut Microbiome

    Science.gov (United States)

    Olm, Matthew R.; Morowitz, Michael J.

    2018-01-01

    ABSTRACT Antibiotic resistance in pathogens is extensively studied, and yet little is known about how antibiotic resistance genes of typical gut bacteria influence microbiome dynamics. Here, we leveraged genomes from metagenomes to investigate how genes of the premature infant gut resistome correspond to the ability of bacteria to survive under certain environmental and clinical conditions. We found that formula feeding impacts the resistome. Random forest models corroborated by statistical tests revealed that the gut resistome of formula-fed infants is enriched in class D beta-lactamase genes. Interestingly, Clostridium difficile strains harboring this gene are at higher abundance in formula-fed infants than C. difficile strains lacking this gene. Organisms with genes for major facilitator superfamily drug efflux pumps have higher replication rates under all conditions, even in the absence of antibiotic therapy. Using a machine learning approach, we identified genes that are predictive of an organism’s direction of change in relative abundance after administration of vancomycin and cephalosporin antibiotics. The most accurate results were obtained by reducing annotated genomic data to five principal components classified by boosted decision trees. Among the genes involved in predicting whether an organism increased in relative abundance after treatment are those that encode subclass B2 beta-lactamases and transcriptional regulators of vancomycin resistance. This demonstrates that machine learning applied to genome-resolved metagenomics data can identify key genes for survival after antibiotics treatment and predict how organisms in the gut microbiome will respond to antibiotic administration. IMPORTANCE The process of reconstructing genomes from environmental sequence data (genome-resolved metagenomics) allows unique insight into microbial systems. We apply this technique to investigate how the antibiotic resistance genes of bacteria affect their ability to

  4. Genome analysis and identification of gelatinase encoded gene in Enterobacter aerogenes

    Science.gov (United States)

    Shahimi, Safiyyah; Mutalib, Sahilah Abdul; Khalid, Rozida Abdul; Repin, Rul Aisyah Mat; Lamri, Mohd Fadly; Bakar, Mohd Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, bioinformatic analysis towards genome sequence of E. aerogenes was done to determine gene encoded for gelatinase. Enterobacter aerogenes was isolated from hot spring water and gelatinase species-specific bacterium to porcine and fish gelatin. This bacterium offers the possibility of enzymes production which is specific to both species gelatine, respectively. Enterobacter aerogenes was partially genome sequenced resulting in 5.0 mega basepair (Mbp) total size of sequence. From pre-process pipeline, 87.6 Mbp of total reads, 68.8 Mbp of total high quality reads and 78.58 percent of high quality percentage was determined. Genome assembly produced 120 contigs with 67.5% of contigs over 1 kilo base pair (kbp), 124856 bp of N50 contig length and 55.17 % of GC base content percentage. About 4705 protein gene was identified from protein prediction analysis. Two candidate genes selected have highest similarity identity percentage against gelatinase enzyme available in Swiss-Prot and NCBI online database. They were NODE_9_length_26866_cov_148.013245_12 containing 1029 base pair (bp) sequence with 342 amino acid sequence and NODE_24_length_155103_cov_177.082458_62 which containing 717 bp sequence with 238 amino acid sequence, respectively. Thus, two paired of primers (forward and reverse) were designed, based on the open reading frame (ORF) of selected genes. Genome analysis of E. aerogenes resulting genes encoded gelatinase were identified.

  5. Genome-Wide Identification and Expression Analysis of WRKY Gene Family in Capsicum annuum L.

    Science.gov (United States)

    Diao, Wei-Ping; Snyder, John C; Wang, Shu-Bin; Liu, Jin-Bing; Pan, Bao-Gui; Guo, Guang-Jun; Wei, Ge

    2016-01-01

    The WRKY family of transcription factors is one of the most important families of plant transcriptional regulators with members regulating multiple biological processes, especially in regulating defense against biotic and abiotic stresses. However, little information is available about WRKYs in pepper (Capsicum annuum L.). The recent release of completely assembled genome sequences of pepper allowed us to perform a genome-wide investigation for pepper WRKY proteins. In the present study, a total of 71 WRKY genes were identified in the pepper genome. According to structural features of their encoded proteins, the pepper WRKY genes (CaWRKY) were classified into three main groups, with the second group further divided into five subgroups. Genome mapping analysis revealed that CaWRKY were enriched on four chromosomes, especially on chromosome 1, and 15.5% of the family members were tandemly duplicated genes. A phylogenetic tree was constructed depending on WRKY domain' sequences derived from pepper and Arabidopsis. The expression of 21 selected CaWRKY genes in response to seven different biotic and abiotic stresses (salt, heat shock, drought, Phytophtora capsici, SA, MeJA, and ABA) was evaluated by quantitative RT-PCR; Some CaWRKYs were highly expressed and up-regulated by stress treatment. Our results will provide a platform for functional identification and molecular breeding studies of WRKY genes in pepper.

  6. Outbred genome sequencing and CRISPR/Cas9 gene editing in butterflies

    Science.gov (United States)

    Li, Xueyan; Fan, Dingding; Zhang, Wei; Liu, Guichun; Zhang, Lu; Zhao, Li; Fang, Xiaodong; Chen, Lei; Dong, Yang; Chen, Yuan; Ding, Yun; Zhao, Ruoping; Feng, Mingji; Zhu, Yabing; Feng, Yue; Jiang, Xuanting; Zhu, Deying; Xiang, Hui; Feng, Xikan; Li, Shuaicheng; Wang, Jun; Zhang, Guojie; Kronforst, Marcus R.; Wang, Wen

    2015-01-01

    Butterflies are exceptionally diverse but their potential as an experimental system has been limited by the difficulty of deciphering heterozygous genomes and a lack of genetic manipulation technology. Here we use a hybrid assembly approach to construct high-quality reference genomes for Papilio xuthus (contig and scaffold N50: 492 kb, 3.4 Mb) and Papilio machaon (contig and scaffold N50: 81 kb, 1.15 Mb), highly heterozygous species that differ in host plant affiliations, and adult and larval colour patterns. Integrating comparative genomics and analyses of gene expression yields multiple insights into butterfly evolution, including potential roles of specific genes in recent diversification. To functionally test gene function, we develop an efficient (up to 92.5%) CRISPR/Cas9 gene editing method that yields obvious phenotypes with three genes, Abdominal-B, ebony and frizzled. Our results provide valuable genomic and technological resources for butterflies and unlock their potential as a genetic model system. PMID:26354079

  7. Evolutionary genomics of plant genes encoding N-terminal-TM-C2 domain proteins and the similar FAM62 genes and synaptotagmin genes of metazoans

    Directory of Open Access Journals (Sweden)

    Craxton Molly

    2007-07-01

    Full Text Available Abstract Background Synaptotagmin genes are found in animal genomes and are known to function in the nervous system. Genes with a similar domain architecture as well as sequence similarity to synaptotagmin C2 domains have also been found in plant genomes. The plant genes share an additional region of sequence similarity with a group of animal genes named FAM62. FAM62 genes also have a similar domain architecture. Little is known about the functions of the plant genes and animal FAM62 genes. Indeed, many members of the large and diverse Syt gene family await functional characterization. Understanding the evolutionary relationships among these genes will help to realize the full implications of functional studies and lead to improved genome annotation. Results I collected and compared plant Syt-like sequences from the primary nucleotide sequence databases at NCBI. The collection comprises six groups of plant genes conserved in embryophytes: NTMC2Type1 to NTMC2Type6. I collected and compared metazoan FAM62 sequences and identified some similar sequences from other eukaryotic lineages. I found evidence of RNA editing and alternative splicing. I compared the intron patterns of Syt genes. I also compared Rabphilin and Doc2 genes. Conclusion Genes encoding proteins with N-terminal-transmembrane-C2 domain architectures resembling synaptotagmins, are widespread in eukaryotes. A collection of these genes is presented here. The collection provides a resource for studies of intron evolution. I have classified the collection into homologous gene families according to distinctive patterns of sequence conservation and intron position. The evolutionary histories of these gene families are traceable through the appearance of family members in different eukaryotic lineages. Assuming an intron-rich eukaryotic ancestor, the conserved intron patterns distinctive of individual gene families, indicate independent origins of Syt, FAM62 and NTMC2 genes. Resemblances

  8. Detailed analysis of putative genes encoding small proteins in legume genomes

    Directory of Open Access Journals (Sweden)

    Gabriel eGuillén

    2013-06-01

    Full Text Available Diverse plant genome sequencing projects coupled with powerful bioinformatics tools have facilitated massive data analysis to construct specialized databases classified according to cellular function. However, there are still a considerable number of genes encoding proteins whose function has not yet been characterized. Included in this category are small proteins (SPs, 30-150 amino acids encoded by short open reading frames (sORFs. SPs play important roles in plant physiology, growth, and development. Unfortunately, protocols focused on the genome-wide identification and characterization of sORFs are scarce or remain poorly implemented. As a result, these genes are underrepresented in many genome annotations. In this work, we exploited publicly available genome sequences of Phaseolus vulgaris, Medicago truncatula, Glycine max and Lotus japonicus to analyze the abundance of annotated SPs in plant legumes. Our strategy to uncover bona fide sORFs at the genome level was centered in bioinformatics analysis of characteristics such as evidence of expression (transcription, presence of known protein regions or domains, and identification of orthologous genes in the genomes explored. We collected 6170, 10461, 30521, and 23599 putative sORFs from P. vulgaris, G. max, M. truncatula, and L. japonicus genomes, respectively. Expressed sequence tags (ESTs available in the DFCI Gene Index database provided evidence that ~one-third of the predicted legume sORFs are expressed. Most potential SPs have a counterpart in a different plant species and counterpart regions or domains in larger proteins. Potential functional sORFs were also classified according to a reduced set of GO categories, and the expression of 13 of them during P. vulgaris nodule ontogeny was confirmed by qPCR. This analysis provides a collection of sORFs that potentially encode for meaningful SPs, and offers the possibility of their further functional evaluation.

  9. Inter-replicon Gene Flow Contributes to Transcriptional Integration in the Sinorhizobium meliloti Multipartite Genome

    Directory of Open Access Journals (Sweden)

    George C. diCenzo

    2018-05-01

    Full Text Available Integration of newly acquired genes into existing regulatory networks is necessary for successful horizontal gene transfer (HGT. Ten percent of bacterial species contain at least two DNA replicons over 300 kilobases in size, with the secondary replicons derived predominately through HGT. The Sinorhizobium meliloti genome is split between a 3.7 Mb chromosome, a 1.7 Mb chromid consisting largely of genes acquired through ancient HGT, and a 1.4 Mb megaplasmid consisting primarily of recently acquired genes. Here, RNA-sequencing is used to examine the transcriptional consequences of massive, synthetic genome reduction produced through the removal of the megaplasmid and/or the chromid. Removal of the pSymA megaplasmid influenced the transcription of only six genes. In contrast, removal of the chromid influenced expression of ∼8% of chromosomal genes and ∼4% of megaplasmid genes. This was mediated in part by the loss of the ETR DNA region whose presence on pSymB is due to a translocation from the chromosome. No obvious functional bias among the up-regulated genes was detected, although genes with putative homologs on the chromid were enriched. Down-regulated genes were enriched in motility and sensory transduction pathways. Four transcripts were examined further, and in each case the transcriptional change could be traced to loss of specific pSymB regions. In particularly, a chromosomal transporter was induced due to deletion of bdhA likely mediated through 3-hydroxybutyrate accumulation. These data provide new insights into the evolution of the multipartite bacterial genome, and more generally into the integration of horizontally acquired genes into the transcriptome.

  10. Inter-replicon Gene Flow Contributes to Transcriptional Integration in the Sinorhizobium meliloti Multipartite Genome.

    Science.gov (United States)

    diCenzo, George C; Wellappili, Deelaka; Golding, G Brian; Finan, Turlough M

    2018-05-04

    Integration of newly acquired genes into existing regulatory networks is necessary for successful horizontal gene transfer (HGT). Ten percent of bacterial species contain at least two DNA replicons over 300 kilobases in size, with the secondary replicons derived predominately through HGT. The Sinorhizobium meliloti genome is split between a 3.7 Mb chromosome, a 1.7 Mb chromid consisting largely of genes acquired through ancient HGT, and a 1.4 Mb megaplasmid consisting primarily of recently acquired genes. Here, RNA-sequencing is used to examine the transcriptional consequences of massive, synthetic genome reduction produced through the removal of the megaplasmid and/or the chromid. Removal of the pSymA megaplasmid influenced the transcription of only six genes. In contrast, removal of the chromid influenced expression of ∼8% of chromosomal genes and ∼4% of megaplasmid genes. This was mediated in part by the loss of the ETR DNA region whose presence on pSymB is due to a translocation from the chromosome. No obvious functional bias among the up-regulated genes was detected, although genes with putative homologs on the chromid were enriched. Down-regulated genes were enriched in motility and sensory transduction pathways. Four transcripts were examined further, and in each case the transcriptional change could be traced to loss of specific pSymB regions. In particularly, a chromosomal transporter was induced due to deletion of bdhA likely mediated through 3-hydroxybutyrate accumulation. These data provide new insights into the evolution of the multipartite bacterial genome, and more generally into the integration of horizontally acquired genes into the transcriptome. Copyright © 2018 diCenzo, et al.

  11. Ranking of Prokaryotic Genomes Based on Maximization of Sortedness of Gene Lengths.

    Science.gov (United States)

    Bolshoy, A; Salih, B; Cohen, I; Tatarinova, T

    How variations of gene lengths (some genes become longer than their predecessors, while other genes become shorter and the sizes of these factions are randomly different from organism to organism) depend on organismal evolution and adaptation is still an open question. We propose to rank the genomes according to lengths of their genes, and then find association between the genome rank and variousproperties, such as growth temperature, nucleotide composition, and pathogenicity. This approach reveals evolutionary driving factors. The main purpose of this study is to test effectiveness and robustness of several ranking methods. The selected method of evaluation is measuring of overall sortedness of the data. We have demonstrated that all considered methods give consistent results and Bubble Sort and Simulated Annealing achieve the highest sortedness. Also, Bubble Sort is considerably faster than the Simulated Annealing method.

  12. Improvisation in evolution of genes and genomes: whose structure is it anyway?

    Science.gov (United States)

    Shakhnovich, Boris E; Shakhnovich, Eugene I

    2008-06-01

    Significant progress has been made in recent years in a variety of seemingly unrelated fields such as sequencing, protein structure prediction, and high-throughput transcriptomics and metabolomics. At the same time, new microscopic models have been developed that made it possible to analyze the evolution of genes and genomes from first principles. The results from these efforts enable, for the first time, a comprehensive insight into the evolution of complex systems and organisms on all scales--from sequences to organisms and populations. Every newly sequenced genome uncovers new genes, families, and folds. Where do these new genes come from? How do gene duplication and subsequent divergence of sequence and structure affect the fitness of the organism? What role does regulation play in the evolution of proteins and folds? Emerging synergism between data and modeling provides first robust answers to these questions.

  13. Comparative Annotation of Viral Genomes with Non-Conserved Gene Structure

    DEFF Research Database (Denmark)

    de Groot, Saskia; Mailund, Thomas; Hein, Jotun

    2007-01-01

    Motivation: Detecting genes in viral genomes is a complex task. Due to the biological necessity of them being constrained in length, RNA viruses in particular tend to code in overlapping reading frames. Since one amino acid is encoded by a triplet of nucleic acids, up to three genes may be coded...... allows for coding in unidirectional nested and overlapping reading frames, to annotate two homologous aligned viral genomes. Our method does not insist on conserved gene structure between the two sequences, thus making it applicable for the pairwise comparison of more distantly related sequences. Results...... and HIV2, as well as of two different Hepatitis Viruses, attaining results of ~87% sensitivity and ~98.5% specificity. We subsequently incorporate prior knowledge by "knowing" the gene structure of one sequence and annotating the other conditional on it. Boosting accuracy close to perfect we demonstrate...

  14. Targeted and genome-scale methylomics reveals gene body signatures in human cell lines

    Science.gov (United States)

    Ball, Madeleine Price; Li, Jin Billy; Gao, Yuan; Lee, Je-Hyuk; LeProust, Emily; Park, In-Hyun; Xie, Bin; Daley, George Q.; Church, George M.

    2012-01-01

    Cytosine methylation, an epigenetic modification of DNA, is a target of growing interest for developing high throughput profiling technologies. Here we introduce two new, complementary techniques for cytosine methylation profiling utilizing next generation sequencing technology: bisulfite padlock probes (BSPPs) and methyl sensitive cut counting (MSCC). In the first method, we designed a set of ~10,000 BSPPs distributed over the ENCODE pilot project regions to take advantage of existing expression and chromatin immunoprecipitation data. We observed a pattern of low promoter methylation coupled with high gene body methylation in highly expressed genes. Using the second method, MSCC, we gathered genome-scale data for 1.4 million HpaII sites and confirmed that gene body methylation in highly expressed genes is a consistent phenomenon over the entire genome. Our observations highlight the usefulness of techniques which are not inherently or intentionally biased in favor of only profiling particular subsets like CpG islands or promoter regions. PMID:19329998

  15. Proviral HIV-genome-wide and pol-gene specific zinc finger nucleases: usability for targeted HIV gene therapy.

    Science.gov (United States)

    Wayengera, Misaki

    2011-07-22

    Infection with HIV, which culminates in the establishment of a latent proviral reservoir, presents formidable challenges for ultimate cure. Building on the hypothesis that ex-vivo or even in-vivo abolition or disruption of HIV-gene/genome-action by target mutagenesis or excision can irreversibly abrogate HIV's innate fitness to replicate and survive, we previously identified the isoschizomeric bacteria restriction enzymes (REases) AcsI and ApoI as potent cleavers of the HIV-pol gene (11 and 9 times in HIV-1 and 2, respectively). However, both enzymes, along with others found to cleave across the entire HIV-1 genome, slice (SX) at palindromic sequences that are prevalent within the human genome and thereby pose the risk of host genome toxicity. A long-term goal in the field of R-M enzymatic therapeutics has thus been to generate synthetic restriction endonucleases with longer recognition sites limited in specificity to HIV. We aimed (i) to assemble and construct zinc finger arrays and nucleases (ZFN) with either proviral-HIV-pol gene or proviral-HIV-1 whole-genome specificity respectively, and (ii) to advance a model for pre-clinically testing lentiviral vectors (LV) that deliver and transduce either ZFN genotype. First, we computationally generated the consensus sequences of (a) 114 dsDNA-binding zinc finger (Zif) arrays (ZFAs or ZifHIV-pol) and (b) two zinc-finger nucleases (ZFNs) which, unlike the AcsI and ApoI homeodomains, possess specificity to >18 base-pair sequences uniquely present within the HIV-pol gene (ZifHIV-polFN). Another 15 ZFNs targeting >18 bp sequences within the complete HIV-1 proviral genome were constructed (ZifHIV-1FN). Second, a model for constructing lentiviral vectors (LVs) that deliver and transduce a diploid copy of either ZifHIV-polFN or ZifHIV-1FN chimeric genes (termed LV- 2xZifHIV-polFN and LV- 2xZifHIV-1FN, respectively) is proposed. Third, two preclinical models for controlled testing of the safety and efficacy of either of these

  16. Genome Wide Identification, Evolutionary, and Expression Analysis of VQ Genes from Two Pyrus Species.

    Science.gov (United States)

    Cao, Yunpeng; Meng, Dandan; Abdullah, Muhammad; Jin, Qing; Lin, Yi; Cai, Yongping

    2018-04-23

    The VQ motif-containing gene, a member of the plant-specific genes, is involved in the plant developmental process and various stress responses. The VQ motif-containing gene family has been studied in several plants, such as rice ( Oryza sativa ), maize ( Zea mays ), and Arabidopsis ( Arabidopsis thaliana ). However, no systematic study has been performed in Pyrus species, which have important economic value. In our study, we identified 41 and 28 VQ motif-containing genes in Pyrus bretschneideri and Pyrus communis , respectively. Phylogenetic trees were calculated using A. thaliana and O. sativa VQ motif-containing genes as a template, allowing us to categorize these genes into nine subfamilies. Thirty-two and eight paralogous of VQ motif-containing genes were found in P. bretschneideri and P. communis , respectively, showing that the VQ motif-containing genes had a more remarkable expansion in P. bretschneideri than in P. communis . A total of 31 orthologous pairs were identified from the P. bretschneideri and P. communis VQ motif-containing genes. Additionally, among the paralogs, we found that these duplication gene pairs probably derived from segmental duplication/whole-genome duplication (WGD) events in the genomes of P. bretschneideri and P. communis , respectively. The gene expression profiles in both P. bretschneideri and P. communis fruits suggested functional redundancy for some orthologous gene pairs derived from a common ancestry, and sub-functionalization or neo-functionalization for some of them. Our study provided the first systematic evolutionary analysis of the VQ motif-containing genes in Pyrus , and highlighted the diversification and duplication of VQ motif-containing genes in both P. bretschneideri and P. communis .

  17. Genomic Features That Predict Allelic Imbalance in Humans Suggest Patterns of Constraint on Gene Expression Variation

    Science.gov (United States)

    Fédrigo, Olivier; Haygood, Ralph; Mukherjee, Sayan; Wray, Gregory A.

    2009-01-01

    Variation in gene expression is an important contributor to phenotypic diversity within and between species. Although this variation often has a genetic component, identification of the genetic variants driving this relationship remains challenging. In particular, measurements of gene expression usually do not reveal whether the genetic basis for any observed variation lies in cis or in trans to the gene, a distinction that has direct relevance to the physical location of the underlying genetic variant, and which may also impact its evolutionary trajectory. Allelic imbalance measurements identify cis-acting genetic effects by assaying the relative contribution of the two alleles of a cis-regulatory region to gene expression within individuals. Identification of patterns that predict commonly imbalanced genes could therefore serve as a useful tool and also shed light on the evolution of cis-regulatory variation itself. Here, we show that sequence motifs, polymorphism levels, and divergence levels around a gene can be used to predict commonly imbalanced genes in a human data set. Reduction of this feature set to four factors revealed that only one factor significantly differentiated between commonly imbalanced and nonimbalanced genes. We demonstrate that these results are consistent between the original data set and a second published data set in humans obtained using different technical and statistical methods. Finally, we show that variation in the single allelic imbalance-associated factor is partially explained by the density of genes in the region of a target gene (allelic imbalance is less probable for genes in gene-dense regions), and, to a lesser extent, the evenness of expression of the gene across tissues and the magnitude of negative selection on putative regulatory regions of the gene. These results suggest that the genomic distribution of functional cis-regulatory variants in the human genome is nonrandom, perhaps due to local differences in evolutionary

  18. New genes expressed in human brains: implications for annotating evolving genomes.

    Science.gov (United States)

    Zhang, Yong E; Landback, Patrick; Vibranovski, Maria; Long, Manyuan

    2012-11-01

    New genes have frequently formed and spread to fixation in a wide variety of organisms, constituting abundant sets of lineage-specific genes. It was recently reported that an excess of primate-specific and human-specific genes were upregulated in the brains of fetuses and infants, and especially in the prefrontal cortex, which is involved in cognition. These findings reveal the prevalent addition of new genetic components to the transcriptome of the human brain. More generally, these findings suggest that genomes are continually evolving in both sequence and content, eroding the conservation endowed by common ancestry. Despite increasing recognition of the importance of new genes, we highlight here that these genes are still seriously under-characterized in functional studies and that new gene annotation is inconsistent in current practice. We propose an integrative approach to annotate new genes, taking advantage of functional and evolutionary genomic methods. We finally discuss how the refinement of new gene annotation will be important for the detection of evolutionary forces governing new gene origination. Copyright © 2012 WILEY Periodicals, Inc.

  19. GBOOST: a GPU-based tool for detecting gene-gene interactions in genome-wide case control studies.

    Science.gov (United States)

    Yung, Ling Sing; Yang, Can; Wan, Xiang; Yu, Weichuan

    2011-05-01

    Collecting millions of genetic variations is feasible with the advanced genotyping technology. With a huge amount of genetic variations data in hand, developing efficient algorithms to carry out the gene-gene interaction analysis in a timely manner has become one of the key problems in genome-wide association studies (GWAS). Boolean operation-based screening and testing (BOOST), a recent work in GWAS, completes gene-gene interaction analysis in 2.5 days on a desktop computer. Compared with central processing units (CPUs), graphic processing units (GPUs) are highly parallel hardware and provide massive computing resources. We are, therefore, motivated to use GPUs to further speed up the analysis of gene-gene interactions. We implement the BOOST method based on a GPU framework and name it GBOOST. GBOOST achieves a 40-fold speedup compared with BOOST. It completes the analysis of Wellcome Trust Case Control Consortium Type 2 Diabetes (WTCCC T2D) genome data within 1.34 h on a desktop computer equipped with Nvidia GeForce GTX 285 display card. GBOOST code is available at http://bioinformatics.ust.hk/BOOST.html#GBOOST.

  20. The RNAPII-CTD Maintains Genome Integrity through Inhibition of Retrotransposon Gene Expression and Transposition.

    Directory of Open Access Journals (Sweden)

    Maria J Aristizabal

    2015-10-01

    Full Text Available RNA polymerase II (RNAPII contains a unique C-terminal domain that is composed of heptapeptide repeats and which plays important regulatory roles during gene expression. RNAPII is responsible for the transcription of most protein-coding genes, a subset of non-coding genes, and retrotransposons. Retrotransposon transcription is the first step in their multiplication cycle, given that the RNA intermediate is required for the synthesis of cDNA, the material that is ultimately incorporated into a new genomic location. Retrotransposition can have grave consequences to genome integrity, as integration events can change the gene expression landscape or lead to alteration or loss of genetic information. Given that RNAPII transcribes retrotransposons, we sought to investigate if the RNAPII-CTD played a role in the regulation of retrotransposon gene expression. Importantly, we found that the RNAPII-CTD functioned to maintaining genome integrity through inhibition of retrotransposon gene expression, as reducing CTD length significantly increased expression and transposition rates of Ty1 elements. Mechanistically, the increased Ty1 mRNA levels in the rpb1-CTD11 mutant were partly due to Cdk8-dependent alterations to the RNAPII-CTD phosphorylation status. In addition, Cdk8 alone contributed to Ty1 gene expression regulation by altering the occupancy of the gene-specific transcription factor Ste12. Loss of STE12 and TEC1 suppressed growth phenotypes of the RNAPII-CTD truncation mutant. Collectively, our results implicate Ste12 and Tec1 as general and important contributors to the Cdk8, RNAPII-CTD regulatory circuitry as it relates to the maintenance of genome integrity.

  1. Diversity of 23S rRNA genes within individual prokaryotic genomes.

    Directory of Open Access Journals (Sweden)

    Anna Pei

    Full Text Available BACKGROUND: The concept of ribosomal constraints on rRNA genes is deduced primarily based on the comparison of consensus rRNA sequences between closely related species, but recent advances in whole-genome sequencing allow evaluation of this concept within organisms with multiple rRNA operons. METHODOLOGY/PRINCIPAL FINDINGS: Using the 23S rRNA gene as an example, we analyzed the diversity among individual rRNA genes within a genome. Of 184 prokaryotic species containing multiple 23S rRNA genes, diversity was observed in 113 (61.4% genomes (mean 0.40%, range 0.01%-4.04%. Significant (1.17%-4.04% intragenomic variation was found in 8 species. In 5 of the 8 species, the diversity in the primary structure had only minimal effect on the secondary structure (stem versus loop transition. In the remaining 3 species, the diversity significantly altered local secondary structure, but the alteration appears minimized through complex rearrangement. Intervening sequences (IVS, ranging between 9 and 1471 nt in size, were found in 7 species. IVS in Deinococcus radiodurans and Nostoc sp. encode transposases. T. tengcongensis was the only species in which intragenomic diversity >3% was observed among 4 paralogous 23S rRNA genes. CONCLUSIONS/SIGNIFICANCE: These findings indicate tight ribosomal constraints on individual 23S rRNA genes within a genome. Although classification using primary 23S rRNA sequences could be erroneous, significant diversity among paralogous 23S rRNA genes was observed only once in the 184 species analyzed, indicating little overall impact on the mainstream of 23S rRNA gene-based prokaryotic taxonomy.

  2. Characteristics of the mouse genomic histamine H1 receptor gene

    Energy Technology Data Exchange (ETDEWEB)

    Inoue, Isao; Taniuchi, Ichiro; Kitamura, Daisuke [Kyushu Univ., Fukuoka (Japan)] [and others

    1996-08-15

    We report here the molecular cloning of a mouse histamine H1 receptor gene. The protein deduced from the nucleotide sequence is composed of 488 amino acid residues with characteristic properties of GTP binding protein-coupled receptors. Our results suggest that the mouse histamine H1 receptor gene is a single locus, and no related sequences were detected. Interspecific backcross analysis indicated that the mouse histamine H1 receptor gene (Hrh1) is located in the central region of mouse Chromosome 6 linked to microphthalmia (Mitfmi), ras-related fibrosarcoma oncogene 1 (Raf1), and ret proto-oncogene (Ret) in a region of homology with human chromosome 3p. 12 refs., 3 figs.

  3. Identifying candidate driver genes by integrative ovarian cancer genomics data

    Science.gov (United States)

    Lu, Xinguo; Lu, Jibo

    2017-08-01

    Integrative analysis of molecular mechanics underlying cancer can distinguish interactions that cannot be revealed based on one kind of data for the appropriate diagnosis and treatment of cancer patients. Tumor samples exhibit heterogeneity in omics data, such as somatic mutations, Copy Number Variations CNVs), gene expression profiles and so on. In this paper we combined gene co-expression modules and mutation modulators separately in tumor patients to obtain the candidate driver genes for resistant and sensitive tumor from the heterogeneous data. The final list of modulators identified are well known in biological processes associated with ovarian cancer, such as CCL17, CACTIN, CCL16, CCL22, APOB, KDF1, CCL11, HNF1B, LRG1, MED1 and so on, which can help to facilitate the discovery of biomarkers, molecular diagnostics, and drug discovery.

  4. The Most Developmentally Truncated Fishes Show Extensive Hox Gene Loss and Miniaturized Genomes

    Science.gov (United States)

    Malmstrøm, Martin; Britz, Ralf; Matschiner, Michael; Tørresen, Ole K; Hadiaty, Renny Kurnia; Yaakob, Norsham; Tan, Heok Hui; Jakobsen, Kjetill Sigurd; Salzburger, Walter; Rüber, Lukas

    2018-01-01

    Abstract The world’s smallest fishes belong to the genus Paedocypris. These miniature fishes are endemic to an extreme habitat: the peat swamp forests in Southeast Asia, characterized by highly acidic blackwater. This threatened habitat is home to a large array of fishes, including a number of miniaturized but also developmentally truncated species. Especially the genus Paedocypris is characterized by profound, organism-wide developmental truncation, resulting in sexually mature individuals of <8 mm in length with a larval phenotype. Here, we report on evolutionary simplification in the genomes of two species of the dwarf minnow genus Paedocypris using whole-genome sequencing. The two species feature unprecedented Hox gene loss and genome reduction in association with their massive developmental truncation. We also show how other genes involved in the development of musculature, nervous system, and skeleton have been lost in Paedocypris, mirroring its highly progenetic phenotype. Further, our analyses suggest two mechanisms responsible for the genome streamlining in Paedocypris in relation to other Cypriniformes: severe intron shortening and reduced repeat content. As the first report on the genomic sequence of a vertebrate species with organism-wide developmental truncation, the results of our work enhance our understanding of genome evolution and how genotypes are translated to phenotypes. In addition, as a naturally simplified system closely related to zebrafish, Paedocypris provides novel insights into vertebrate development. PMID:29684203

  5. The Most Developmentally Truncated Fishes Show Extensive Hox Gene Loss and Miniaturized Genomes.

    Science.gov (United States)

    Malmstrøm, Martin; Britz, Ralf; Matschiner, Michael; Tørresen, Ole K; Hadiaty, Renny Kurnia; Yaakob, Norsham; Tan, Heok Hui; Jakobsen, Kjetill Sigurd; Salzburger, Walter; Rüber, Lukas

    2018-04-01

    The world's smallest fishes belong to the genus Paedocypris. These miniature fishes are endemic to an extreme habitat: the peat swamp forests in Southeast Asia, characterized by highly acidic blackwater. This threatened habitat is home to a large array of fishes, including a number of miniaturized but also developmentally truncated species. Especially the genus Paedocypris is characterized by profound, organism-wide developmental truncation, resulting in sexually mature individuals of <8 mm in length with a larval phenotype. Here, we report on evolutionary simplification in the genomes of two species of the dwarf minnow genus Paedocypris using whole-genome sequencing. The two species feature unprecedented Hox gene loss and genome reduction in association with their massive developmental truncation. We also show how other genes involved in the development of musculature, nervous system, and skeleton have been lost in Paedocypris, mirroring its highly progenetic phenotype. Further, our analyses suggest two mechanisms responsible for the genome streamlining in Paedocypris in relation to other Cypriniformes: severe intron shortening and reduced repeat content. As the first report on the genomic sequence of a vertebrate species with organism-wide developmental truncation, the results of our work enhance our understanding of genome evolution and how genotypes are translated to phenotypes. In addition, as a naturally simplified system closely related to zebrafish, Paedocypris provides novel insights into vertebrate development.

  6. Genomic evidence of geographically widespread effect of gene flow from polar bears into brown bears.

    Science.gov (United States)

    Cahill, James A; Stirling, Ian; Kistler, Logan; Salamzade, Rauf; Ersmark, Erik; Fulton, Tara L; Stiller, Mathias; Green, Richard E; Shapiro, Beth

    2015-03-01

    Polar bears are an arctic, marine adapted species that is closely related to brown bears. Genome analyses have shown that polar bears are distinct and genetically homogeneous in comparison to brown bears. However, these analyses have also revealed a remarkable episode of polar bear gene flow into the population of brown bears that colonized the Admiralty, Baranof and Chichagof islands (ABC islands) of Alaska. Here, we present an analysis of data from a large panel of polar bear and brown bear genomes that includes brown bears from the ABC islands, the Alaskan mainland and Europe. Our results provide clear evidence that gene flow between the two species had a geographically wide impact, with polar bear DNA found within the genomes of brown bears living both on the ABC islands and in the Alaskan mainland. Intriguingly, while brown bear genomes contain up to 8.8% polar bear ancestry, polar bear genomes appear to be devoid of brown bear ancestry, suggesting the presence of a barrier to gene flow in that direction. © 2014 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

  7. Gene order data from a model amphibian (Ambystoma: new perspectives on vertebrate genome structure and evolution

    Directory of Open Access Journals (Sweden)

    Voss S Randal

    2006-08-01

    Full Text Available Abstract Background Because amphibians arise from a branch of the vertebrate evolutionary tree that is juxtaposed between fishes and amniotes, they provide important comparative perspective for reconstructing character changes that have occurred during vertebrate evolution. Here, we report the first comparative study of vertebrate genome structure that includes a representative amphibian. We used 491 transcribed sequences from a salamander (Ambystoma genetic map and whole genome assemblies for human, mouse, rat, dog, chicken, zebrafish, and the freshwater pufferfish Tetraodon nigroviridis to compare gene orders and rearrangement rates. Results Ambystoma has experienced a rate of genome rearrangement that is substantially lower than mammalian species but similar to that of chicken and fish. Overall, we found greater conservation of genome structure between Ambystoma and tetrapod vertebrates, nevertheless, 57% of Ambystoma-fish orthologs are found in conserved syntenies of four or more genes. Comparisons between Ambystoma and amniotes reveal extensive conservation of segmental homology for 57% of the presumptive Ambystoma-amniote orthologs. Conclusion Our analyses suggest relatively constant interchromosomal rearrangement rates from the euteleost ancestor to the origin of mammals and illustrate the utility of amphibian mapping data in establishing ancestral amniote and tetrapod gene orders. Comparisons between Ambystoma and amniotes reveal some of the key events that have structured the human genome since diversification of the ancestral amniote lineage.

  8. Native and exotic earthworms affect orchid seed loss

    OpenAIRE

    McCormick, Melissa K.; Parker, Kenneth L.; Szlavecz, Katalin; Whigham, Dennis F.

    2013-01-01

    Non-native earthworms have invaded ecosystems around the world but have recently received increased attention as they invaded previously earthworm-free habitats in northern North America. Earthworms can affect plants by ingesting seeds and burying them in the soil. These effects can be negative or positive but are expected to become increasingly negative with decreasing seed size. Orchids have some of the smallest seeds of any plants, so we hypothesized that earthworm consumption of seeds wou...

  9. MVisAGe Identifies Concordant and Discordant Genomic Alterations of Driver Genes in Squamous Tumors.

    Science.gov (United States)

    Walter, Vonn; Du, Ying; Danilova, Ludmila; Hayward, Michele C; Hayes, D Neil

    2018-06-15

    Integrated analyses of multiple genomic datatypes are now common in cancer profiling studies. Such data present opportunities for numerous computational experiments, yet analytic pipelines are limited. Tools such as the cBioPortal and Regulome Explorer, although useful, are not easy to access programmatically or to implement locally. Here, we introduce the MVisAGe R package, which allows users to quantify gene-level associations between two genomic datatypes to investigate the effect of genomic alterations (e.g., DNA copy number changes on gene expression). Visualizing Pearson/Spearman correlation coefficients according to the genomic positions of the underlying genes provides a powerful yet novel tool for conducting exploratory analyses. We demonstrate its utility by analyzing three publicly available cancer datasets. Our approach highlights canonical oncogenes in chr11q13 that displayed the strongest associations between expression and copy number, including CCND1 and CTTN , genes not identified by copy number analysis in the primary reports. We demonstrate highly concordant usage of shared oncogenes on chr3q, yet strikingly diverse oncogene usage on chr11q as a function of HPV infection status. Regions of chr19 that display remarkable associations between methylation and gene expression were identified, as were previously unreported miRNA-gene expression associations that may contribute to the epithelial-to-mesenchymal transition. Significance: This study presents an important bioinformatics tool that will enable integrated analyses of multiple genomic datatypes. Cancer Res; 78(12); 3375-85. ©2018 AACR . ©2018 American Association for Cancer Research.

  10. The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system.

    Science.gov (United States)

    Vonk, Freek J; Casewell, Nicholas R; Henkel, Christiaan V; Heimberg, Alysha M; Jansen, Hans J; McCleary, Ryan J R; Kerkkamp, Harald M E; Vos, Rutger A; Guerreiro, Isabel; Calvete, Juan J; Wüster, Wolfgang; Woods, Anthony E; Logan, Jessica M; Harrison, Robert A; Castoe, Todd A; de Koning, A P Jason; Pollock, David D; Yandell, Mark; Calderon, Diego; Renjifo, Camila; Currier, Rachel B; Salgado, David; Pla, Davinia; Sanz, Libia; Hyder, Asad S; Ribeiro, José M C; Arntzen, Jan W; van den Thillart, Guido E E J M; Boetzer, Marten; Pirovano, Walter; Dirks, Ron P; Spaink, Herman P; Duboule, Denis; McGlinn, Edwina; Kini, R Manjunatha; Richardson, Michael K

    2013-12-17

    Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection.

  11. Population genomics of the immune evasion (var genes of Plasmodium falciparum.

    Directory of Open Access Journals (Sweden)

    Alyssa E Barry

    2007-03-01

    Full Text Available Var genes encode the major surface antigen (PfEMP1 of the blood stages of the human malaria parasite Plasmodium falciparum. Differential expression of up to 60 diverse var genes in each parasite genome underlies immune evasion. We compared the diversity of the DBLalpha domain of var genes sampled from 30 parasite isolates from a malaria endemic area of Papua New Guinea (PNG and 59 from widespread geographic origins (global. Overall, we obtained over 8,000 quality-controlled DBLalpha sequences. Within our sampling frame, the global population had a total of 895 distinct DBLalpha "types" and negligible overlap among repertoires. This indicated that var gene diversity on a global scale is so immense that many genomes would need to be sequenced to capture its true extent. In contrast, we found a much lower diversity in PNG of 185 DBLalpha types, with an average of approximately 7% overlap among repertoires. While we identify marked geographic structuring, nearly 40% of types identified in PNG were also found in samples from different countries showing a cosmopolitan distribution for much of the diversity. We also present evidence to suggest that recombination plays a key role in maintaining the unprecedented levels of polymorphism found in these immune evasion genes. This population genomic framework provides a cost effective molecular epidemiological tool to rapidly explore the geographic diversity of var genes.

  12. The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system

    Science.gov (United States)

    Vonk, Freek J.; Casewell, Nicholas R.; Henkel, Christiaan V.; Heimberg, Alysha M.; Jansen, Hans J.; McCleary, Ryan J. R.; Kerkkamp, Harald M. E.; Vos, Rutger A.; Guerreiro, Isabel; Calvete, Juan J.; Wüster, Wolfgang; Woods, Anthony E.; Logan, Jessica M.; Harrison, Robert A.; Castoe, Todd A.; de Koning, A. P. Jason; Pollock, David D.; Yandell, Mark; Calderon, Diego; Renjifo, Camila; Currier, Rachel B.; Salgado, David; Pla, Davinia; Sanz, Libia; Hyder, Asad S.; Ribeiro, José M. C.; Arntzen, Jan W.; van den Thillart, Guido E. E. J. M.; Boetzer, Marten; Pirovano, Walter; Dirks, Ron P.; Spaink, Herman P.; Duboule, Denis; McGlinn, Edwina; Kini, R. Manjunatha; Richardson, Michael K.

    2013-01-01

    Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection. PMID:24297900

  13. Evolutionary trajectories of snake genes and genomes revealed by comparative analyses of five-pacer viper

    Science.gov (United States)

    Yin, Wei; Wang, Zong-ji; Li, Qi-ye; Lian, Jin-ming; Zhou, Yang; Lu, Bing-zheng; Jin, Li-jun; Qiu, Peng-xin; Zhang, Pei; Zhu, Wen-bo; Wen, Bo; Huang, Yi-jun; Lin, Zhi-long; Qiu, Bi-tao; Su, Xing-wen; Yang, Huan-ming; Zhang, Guo-jie; Yan, Guang-mei; Zhou, Qi

    2016-01-01

    Snakes have numerous features distinctive from other tetrapods and a rich history of genome evolution that is still obscure. Here, we report the high-quality genome of the five-pacer viper, Deinagkistrodon acutus, and comparative analyses with other representative snake and lizard genomes. We map the evolutionary trajectories of transposable elements (TEs), developmental genes and sex chromosomes onto the snake phylogeny. TEs exhibit dynamic lineage-specific expansion, and many viper TEs show brain-specific gene expression along with their nearby genes. We detect signatures of adaptive evolution in olfactory, venom and thermal-sensing genes and also functional degeneration of genes associated with vision and hearing. Lineage-specific relaxation of functional constraints on respective Hox and Tbx limb-patterning genes supports fossil evidence for a successive loss of forelimbs then hindlimbs during snake evolution. Finally, we infer that the ZW sex chromosome pair had undergone at least three recombination suppression events in the ancestor of advanced snakes. These results altogether forge a framework for our deep understanding into snakes' history of molecular evolution. PMID:27708285

  14. Exploiting a Reference Genome in Terms of Duplications: The Network of Paralogs and Single Copy Genes in Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Mara Sangiovanni

    2013-12-01

    Full Text Available Arabidopsis thaliana became the model organism for plant studies because of its small diploid genome, rapid lifecycle and short adult size. Its genome was the first among plants to be sequenced, becoming the reference in plant genomics. However, the Arabidopsis genome is characterized by an inherently complex organization, since it has undergone ancient whole genome duplications, followed by gene reduction, diploidization events and extended rearrangements, which relocated and split up the retained portions. These events, together with probable chromosome reductions, dramatically increased the genome complexity, limiting its role as a reference. The identification of paralogs and single copy genes within a highly duplicated genome is a prerequisite to understand its organization and evolution and to improve its exploitation in comparative genomics. This is still controversial, even in the widely studied Arabidopsis genome. This is also due to the lack of a reference bioinformatics pipeline that could exhaustively identify paralogs and singleton genes. We describe here a complete computational strategy to detect both duplicated and single copy genes in a genome, discussing all the methodological issues that may strongly affect the results, their quality and their reliability. This approach was used to analyze the organization of Arabidopsis nuclear protein coding genes, and besides classifying computationally defined paralogs into networks and single copy genes into different classes, it unraveled further intriguing aspects concerning the genome annotation and the gene relationships in this reference plant species. Since our results may be useful for comparative genomics and genome functional analyses, we organized a dedicated web interface to make them accessible to the scientific community.

  15. The Genome of Tolypocladium inflatum: Evolution, Organization, and Expression of the Cyclosporin Biosynthetic Gene Cluster

    Science.gov (United States)

    Bushley, Kathryn E.; Raja, Rajani; Jaiswal, Pankaj; Cumbie, Jason S.; Nonogaki, Mariko; Boyd, Alexander E.; Owensby, C. Alisha; Knaus, Brian J.; Elser, Justin; Miller, Daniel; Di, Yanming; McPhail, Kerry L.; Spatafora, Joseph W.

    2013-01-01

    The ascomycete fungus Tolypocladium inflatum, a pathogen of beetle larvae, is best known as the producer of the immunosuppressant drug cyclosporin. The draft genome of T. inflatum strain NRRL 8044 (ATCC 34921), the isolate from which cyclosporin was first isolated, is presented along with comparative analyses of the biosynthesis of cyclosporin and other secondary metabolites in T. inflatum and related taxa. Phylogenomic analyses reveal previously undetected and complex patterns of homology between the nonribosomal peptide synthetase (NRPS) that encodes for cyclosporin synthetase (simA) and those of other secondary metabolites with activities against insects (e.g., beauvericin, destruxins, etc.), and demonstrate the roles of module duplication and gene fusion in diversification of NRPSs. The secondary metabolite gene cluster responsible for cyclosporin biosynthesis is described. In addition to genes necessary for cyclosporin biosynthesis, it harbors a gene for a cyclophilin, which is a member of a family of immunophilins known to bind cyclosporin. Comparative analyses support a lineage specific origin of the cyclosporin gene cluster rather than horizontal gene transfer from bacteria or other fungi. RNA-Seq transcriptome analyses in a cyclosporin-inducing medium delineate the boundaries of the cyclosporin cluster and reveal high levels of expression of the gene cluster cyclophilin. In medium containing insect hemolymph, weaker but significant upregulation of several genes within the cyclosporin cluster, including the highly expressed cyclophilin gene, was observed. T. inflatum also represents the first reference draft genome of Ophiocordycipitaceae, a third family of insect pathogenic fungi within the fungal order Hypocreales, and supports parallel and qualitatively distinct radiations of insect pathogens. The T. inflatum genome provides additional insight into the evolution and biosynthesis of cyclosporin and lays a foundation for further investigations of the role

  16. From genes to milk: genomic organization and epigenetic regulation of the mammary transcriptome.

    Science.gov (United States)

    Lemay, Danielle G; Pollard, Katherine S; Martin, William F; Freeman Zadrowski, Courtneay; Hernandez, Joseph; Korf, Ian; German, J Bruce; Rijnkels, Monique

    2013-01-01

    Even in genomes lacking operons, a gene's position in the genome influences its potential for expression. The mechanisms by which adjacent genes are co-expressed are still not completely understood. Using lactation and the mammary gland as a model system, we explore the hypothesis that chromatin state contributes to the co-regulation of gene neighborhoods. The mammary gland represents a unique evolutionary model, due to its recent appearance, in the context of vertebrate genomes. An understanding of how the mammary gland is regulated to produce milk is also of biomedical and agricultural importance for human lactation and dairying. Here, we integrate epigenomic and transcriptomic data to develop a comprehensive regulatory model. Neighborhoods of mammary-expressed genes were determined using expression data derived from pregnant and lactating mice and a neighborhood scoring tool, G-NEST. Regions of open and closed chromatin were identified by ChIP-Seq of histone modifications H3K36me3, H3K4me2, and H3K27me3 in the mouse mammary gland and liver tissue during lactation. We found that neighborhoods of genes in regions of uniquely active chromatin in the lactating mammary gland, compared with liver tissue, were extremely rare. Rather, genes in most neighborhoods were suppressed during lactation as reflected in their expression levels and their location in regions of silenced chromatin. Chromatin silencing was largely shared between the liver and mammary gland during lactation, and what distinguished the mammary gland was mainly a small tissue-specific repertoire of isolated, expressed genes. These findings suggest that an advantage of the neighborhood organization is in the collective repression of groups of genes via a shared mechanism of chromatin repression. Genes essential to the mammary gland's uniqueness are isolated from neighbors, and likely have less tolerance for variation in expression, properties they share with genes responsible for an organism's survival.

  17. Multi-targeted priming for genome-wide gene expression assays

    Directory of Open Access Journals (Sweden)

    Adomas Aleksandra B

    2010-08-01

    Full Text Available Abstract Background Complementary approaches to assaying global gene expression are needed to assess gene expression in regions that are poorly assayed by current methodologies. A key component of nearly all gene expression assays is the reverse transcription of transcribed sequences that has traditionally been performed by priming the poly-A tails on many of the transcribed genes in eukaryotes with oligo-dT, or by priming RNA indiscriminately with random hexamers. We designed an algorithm to find common sequence motifs that were present within most protein-coding genes of Saccharomyces cerevisiae and of Neurospora crassa, but that were not present within their ribosomal RNA or transfer RNA genes. We then experimentally tested whether degenerately priming these motifs with multi-targeted primers improved the accuracy and completeness of transcriptomic assays. Results We discovered two multi-targeted primers that would prime a preponderance of genes in the genomes of Saccharomyces cerevisiae and Neurospora crassa while avoiding priming ribosomal RNA or transfer RNA. Examining the response of Saccharomyces cerevisiae to nitrogen deficiency and profiling Neurospora crassa early sexual development, we demonstrated that using multi-targeted primers in reverse transcription led to superior performance of microarray profiling and next-generation RNA tag sequencing. Priming with multi-targeted primers in addition to oligo-dT resulted in higher sensitivity, a larger number of well-measured genes and greater power to detect differences in gene expression. Conclusions Our results provide the most complete and detailed expression profiles of the yeast nitrogen starvation response and N. crassa early sexual development to date. Furthermore, our multi-targeting priming methodology for genome-wide gene expression assays provides selective targeting of multiple sequences and counter-selection against undesirable sequences, facilitating a more complete and

  18. Improvement of genome assembly completeness and identification of novel full-length protein-coding genes by RNA-seq in the giant panda genome.

    Science.gov (United States)

    Chen, Meili; Hu, Yibo; Liu, Jingxing; Wu, Qi; Zhang, Chenglin; Yu, Jun; Xiao, Jingfa; Wei, Fuwen; Wu, Jiayan

    2015-12-11

    High-quality and complete gene models are the basis of whole genome analyses. The giant panda (Ailuropoda melanoleuca) genome was the first genome sequenced on the basis of solely short reads, but the genome annotation had lacked the support of transcriptomic evidence. In this study, we applied RNA-seq to globally improve the genome assembly completeness and to detect novel expressed transcripts in 12 tissues from giant pandas, by using a transcriptome reconstruction strategy that combined reference-based and de novo methods. Several aspects of genome assembly completeness in the transcribed regions were effectively improved by the de novo assembled transcripts, including genome scaffolding, the detection of small-size assembly errors, the extension of scaffold/contig boundaries, and gap closure. Through expression and homology validation, we detected three groups of novel full-length protein-coding genes. A total of 12.62% of the novel protein-coding genes were validated by proteomic data. GO annotation analysis showed that some of the novel protein-coding genes were involved in pigmentation, anatomical structure formation and reproduction, which might be related to the development and evolution of the black-white pelage, pseudo-thumb and delayed embryonic implantation of giant pandas. The updated genome annotation will help further giant panda studies from both structural and functional perspectives.

  19. Integration of Genome-Wide TF Binding and Gene Expression Data to Characterize Gene Regulatory Networks in Plant Development.

    Science.gov (United States)

    Chen, Dijun; Kaufmann, Kerstin

    2017-01-01

    Key transcription factors (TFs) controlling the morphogenesis of flowers and leaves have been identified in the model plant Arabidopsis thaliana. Recent genome-wide approaches based on chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) enable systematic identification of genome-wide TF binding sites (TFBSs) of these regulators. Here, we describe a computational pipeline for analyzing ChIP-seq data to identify TFBSs and to characterize gene regulatory networks (GRNs) with applications to the regulatory studies of flower development. In particular, we provide step-by-step instructions on how to download, analyze, visualize, and integrate genome-wide data in order to construct GRNs for beginners of bioinformatics. The practical guide presented here is ready to apply to other similar ChIP-seq datasets to characterize GRNs of interest.

  20. Genome-wide identification and comparative expression analysis of LEA genes in watermelon and melon genomes

    OpenAIRE

    Celik Altunoglu, Yasemin; Baloglu, Mehmet Cengiz; Baloglu, Pinar; Yer, Esra Nurten; Kara, Sibel

    2017-01-01

    Late embryogenesis abundant (LEA) proteins are large and diverse group of polypeptides which were first identified during seed dehydration and then in vegetative plant tissues during different stress responses. Now, gene family members of LEA proteins have been detected in various organisms. However, there is no report for this protein family in watermelon and melon until this study. A total of 73 LEA genes from watermelon (ClLEA) and 61 LEA genes from melon (CmLEA) were identified in this co...

  1. Genome-wide survey of flavonoid biosynthesis genes and gene expression analysis between black- and yellow-seeded Brassica napus

    Directory of Open Access Journals (Sweden)

    Cunmin Qu

    2016-12-01

    Full Text Available Flavonoids, the compounds that impart color to fruits, flowers, and seeds, are the most widespread secondary metabolites in plants. However, a systematic analysis of these loci has not been performed in Brassicaceae. In this study, we isolated 649 nucleotide sequences related to flavonoid biosynthesis, i.e., the Transparent Testa (TT genes, and their associated amino acid sequences in 17 Brassicaceae species, grouped into Arabidopsis or Brassicaceae subgroups. Moreover, 36 copies of 21 genes of the flavonoid biosynthesis pathway were identified in A. thaliana, 53 were identified in B. rapa, 50 in B. oleracea, and 95 in B. napus, followed the genomic distribution, collinearity analysis and genes triplication of them among Brassicaceae species. The results showed that the extensive gene loss, whole genome triplication, and diploidization that occurred after divergence from the common ancestor. Using qRT-PCR methods, we analyzed the expression of eighteen flavonoid biosynthesis genes in 6 yellow- and black-seeded B. napus inbred lines with different genetic background, found that 12 of which were preferentially expressed during seed development, whereas the remaining genes were expressed in all B. napus tissues examined. Moreover, fourteen of these genes showed significant differences in expression level during seed development, and all but four of these (i.e., BnTT5, BnTT7, BnTT10, and BnTTG1 had similar expression patterns among the yellow- and black-seeded B. napus. Results showed that the structural genes (BnTT3, BnTT18 and BnBAN, regulatory genes (BnTTG2 and BnTT16 and three encoding transfer proteins (BnTT12, BnTT19, and BnAHA10 might play an crucial roles in the formation of different seed coat colors in B. napus. These data will be helpful for illustrating the molecular mechanisms of flavonoid biosynthesis in Brassicaceae species.

  2. Transcriptional responses of earthworm (Eisenia fetida) exposed to naphthenic acids in soil

    International Nuclear Information System (INIS)

    Wang, Jie; Cao, Xiaofeng; Sun, Jinhua; Chai, Liwei; Huang, Yi; Tang, Xiaoyan

    2015-01-01

    In this study, earthworms (Eisenia fetida) were exposed to commercial NAs contaminated soil, and changes in the levels of reactive oxygen species (ROS) and gene expressions of their defense system were monitored. The effects on the gene expression involved in reproduction and carcinogenesis were also evaluated. Significant increases in ROS levels was observed in NAs exposure groups, and the superoxide dismutase (SOD) and catalase (CAT) genes were both up-regulated at low and medium exposure doses, which implied NAs might exert toxicity by oxidative stress. The transcription of CRT and HSP70 coincided with oxidative stress, which implied both chaperones perform important functions in the protection against oxidative toxicity. The upregulation of TCTP gene indicated a potential adverse effect of NAs to terrestrial organisms through induction of carcinogenesis, and the downregulation of ANN gene indicated that NAs might potentially result in deleterious reproduction effects. - Highlights: • The first attempt to study gene ecotoxicity of NAs in terrestrial environment. • NAs exert toxicity by oxidative stress on earthworm. • NAs might cause carcinogenesis and reproductive disruption to earthworm. - NAs induced oxidative stress and altered transcriptions of genes involved in defense, reproduction, and carcinogenesis

  3. The genome of Nectria haematococca: contribution of supernumerary chromosomes to gene expansion

    Energy Technology Data Exchange (ETDEWEB)

    Coleman, J.J.; Rounsley, S.D.; Rodriguez-Carres, M.; Kuo, A.; Wasmann, C.c.; Grimwood, J.; Schmutz, J.; Taga, M.; White, G.J.; Zhuo, S.; Schwartz, D.C.; Freitag, M.; Ma, L.-J.; Danchin, E.G.J.; Henrissat, B.; Cutinho, P.M.; Nelson, D.R.; Straney, D.; Napoli, C.A.; Baker, B.M.; Gribskov, M.; Rep, M.; Kroken, S.; Molnar, I.; Rensing, C.; Kennell, J.C.; Zamora, J.; Farman, M.L.; Selker, E.U.; Salamov, A.; Shapiro, H.; Pangilinan, J.; Lindquist, E.; Lamers, C.; Grigoriev, I.V.; Geiser, D.M.; Covert, S.F.; Temporini, S.; VanEtten, H.D.

    2009-04-20

    The ascomycetous fungus Nectria haematococca, (asexual name Fusarium solani), is a member of a group of .50 species known as the"Fusarium solani species complex". Members of this complex have diverse biological properties including the ability to cause disease on .100 genera of plants and opportunistic infections in humans. The current research analyzed the most extensively studied member of this complex, N. haematococca mating population VI (MPVI). Several genes controlling the ability of individual isolates of this species to colonize specific habitats are located on supernumerary chromosomes. Optical mapping revealed that the sequenced isolate has 17 chromosomes ranging from 530 kb to 6.52 Mb and that the physical size of the genome, 54.43 Mb, and the number of predicted genes, 15,707, are among the largest reported for ascomycetes. Two classes of genes have contributed to gene expansion: specific genes that are not found in other fungi including its closest sequenced relative, Fusarium graminearum; and genes that commonly occur as single copies in other fungi but are present as multiple copies in N. haematococca MPVI. Some of these additional genes appear to have resulted from gene duplication events, while others may have been acquired through horizontal gene transfer. The supernumerary nature of three chromosomes, 14, 15, and 17, was confirmed by their absence in pulsed field gel electrophoresis experiments of some isolates and by demonstrating that these isolates lacked chromosome-specific sequences found on the ends of these chromosomes. These supernumerary chromosomes contain more repeat sequences, are enriched in unique and duplicated genes, and have a lower G+C content in comparison to the other chromosomes. Although the origin(s) of the extra genes and the supernumerary chromosomes is not known, the gene expansion and its large genome size are consistent with this species' diverse range of habitats. Furthermore, the presence of unique genes on

  4. Genome-wide identification, evolutionary and expression analysis of the aspartic protease gene superfamily in grape

    Science.gov (United States)

    2013-01-01

    Background Aspartic proteases (APs) are a large family of proteolytic enzymes found in almost all organisms. In plants, they are involved in many biological processes, such as senescence, stress responses, programmed cell death, and reproduction. Prior to the present study, no grape AP gene(s) had been reported, and their research on woody species was very limited. Results In this study, a total of 50 AP genes (VvAP) were identified in the grape genome, among which 30 contained the complete ASP domain. Synteny analysis within grape indicated that segmental and tandem duplication events contributed to the expansion of the grape AP family. Additional analysis between grape and Arabidopsis demonstrated that several grape AP genes were found in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes arose before the divergence of grape and Arabidopsis. Phylogenetic relationships of the 30 VvAPs with the complete ASP domain and their Arabidopsis orthologs, as well as their gene and protein features were analyzed and their cellular localization was predicted. Moreover, expression profiles of VvAP genes in six different tissues were determined, and their transcript abundance under various stresses and hormone treatments were measured. Twenty-seven VvAP genes were expressed in at least one of the six tissues examined; nineteen VvAPs responded to at least one abiotic stress, 12 VvAPs responded to powdery mildew infection, and most of the VvAPs responded to SA and ABA treatments. Furthermore, integrated synteny and phylogenetic analysis identified orthologous AP genes between grape and Arabidopsis, providing a unique starting point for investigating the function of grape AP genes. Conclusions The genome-wide identification, evolutionary and expression analyses of grape AP genes provide a framework for future analysis of AP genes in defining their roles during stress response. Integrated synteny and phylogenetic analyses provide novel insight into the

  5. The genome of Nectria haematococca: contribution of supernumerary chromosomes to gene expansion.

    Directory of Open Access Journals (Sweden)

    Jeffrey J Coleman

    2009-08-01

    Full Text Available The ascomycetous fungus Nectria haematococca, (asexual name Fusarium solani, is a member of a group of >50 species known as the "Fusarium solani species complex". Members of this complex have diverse biological properties including the ability to cause disease on >100 genera of plants and opportunistic infections in humans. The current research analyzed the most extensively studied member of this complex, N. haematococca mating population VI (MPVI. Several genes controlling the ability of individual isolates of this species to colonize specific habitats are located on supernumerary chromosomes. Optical mapping revealed that the sequenced isolate has 17 chromosomes ranging from 530 kb to 6.52 Mb and that the physical size of the genome, 54.43 Mb, and the number of predicted genes, 15,707, are among the largest reported for ascomycetes. Two classes of genes have contributed to gene expansion: specific genes that are not found in other fungi including its closest sequenced relative, Fusarium graminearum; and genes that commonly occur as single copies in other fungi but are present as multiple copies in N. haematococca MPVI. Some of these additional genes appear to have resulted from gene duplication events, while others may have been acquired through horizontal gene transfer. The supernumerary nature of three chromosomes, 14, 15, and 17, was confirmed by their absence in pulsed field gel electrophoresis experiments of some isolates and by demonstrating that these isolates lacked chromosome-specific sequences found on the ends of these chromosomes. These supernumerary chromosomes contain more repeat sequences, are enriched in unique and duplicated genes, and have a lower G+C content in comparison to the other chromosomes. Although the origin(s of the extra genes and the supernumerary chromosomes is not known, the gene expansion and its large genome size are consistent with this species' diverse range of habitats. Furthermore, the presence of unique

  6. Whole genome homology-based identification of candidate genes ...

    African Journals Online (AJOL)

    Sesame (Sesamum indicum L.) is one of the most important oilseed crops. It is mainly grown in arid and semi-arid regions with occurrence of unpredictable drought which is one of the major constraints of its production. However, the lack of gene resources associated with drought tolerance hinders sesame genetic ...

  7. Genomics of Human Pulmonary Tuberculosis: from Genes to Pathways

    DEFF Research Database (Denmark)

    Stein, Catherine M.; Sausville, Lindsay; Wejse, Christian

    2017-01-01

    Purpose of Review Tuberculosis (TB), caused by Mycobacterium tuberculosis (MTB), remains a major public health threat globally. Several lines of evidence support a role for host genetic factors in resistance/susceptibility to TB disease and MTB infection. However, results across candidate gene...

  8. Genomic consequences of selection on self-incompatibility genes

    DEFF Research Database (Denmark)

    Schierup, Mikkel Heide; Vekemans, Xavier

    2008-01-01

    closely related species. We review recent empirical studies demonstrating these features and relate the empirical findings to theoretical predictions. We show how these features are being exploited in searches for other genes under multi-allelic balancing selection and for inference on recent breakdown...

  9. A Novel Phytophthora sojae Resistance Rps12 Gene Mapped to a Genomic Region That Contains Several Rps Genes.

    Science.gov (United States)

    Sahoo, Dipak K; Abeysekara, Nilwala S; Cianzio, Silvia R; Robertson, Alison E; Bhattacharyya, Madan K

    2017-01-01

    Phytophthora sojae Kaufmann and Gerdemann, which causes Phytophthora root rot, is a widespread pathogen that limits soybean production worldwide. Development of Phytophthora resistant cultivars carrying Phytophthora resistance Rps genes is a cost-effective approach in controlling this disease. For this mapping study of a novel Rps gene, 290 recombinant inbred lines (RILs) (F7 families) were developed by crossing the P. sojae resistant cultivar PI399036 with the P. sojae susceptible AR2 line, and were phenotyped for responses to a mixture of three P. sojae isolates that overcome most of the known Rps genes. Of these 290 RILs, 130 were homozygous resistant, 12 heterzygous and segregating for Phytophthora resistance, and 148 were recessive homozygous and susceptible. From this population, 59 RILs homozygous for Phytophthora sojae resistance and 61 susceptible to a mixture of P. sojae isolates R17 and Val12-11 or P7074 that overcome resistance encoded by known Rps genes mapped to Chromosome 18 were selected for mapping novel Rps gene. A single gene accounted for the 1:1 segregation of resistance and susceptibility among the RILs. The gene encoding the Phytophthora resistance mapped to a 5.8 cM interval between the SSR markers BARCSOYSSR_18_1840 and Sat_064 located in the lower arm of Chromosome 18. The gene is mapped 2.2 cM proximal to the NBSRps4/6-like sequence that was reported to co-segregate with the Phytophthora resistance genes Rps4 and Rps6. The gene is mapped to a highly recombinogenic, gene-rich genomic region carrying several nucleotide binding site-leucine rich repeat (NBS-LRR)-like genes. We named this novel gene as Rps12, which is expected to be an invaluable resource in breeding soybeans for Phytophthora resistance.

  10. A Novel Phytophthora sojae Resistance Rps12 Gene Mapped to a Genomic Region That Contains Several Rps Genes.

    Directory of Open Access Journals (Sweden)

    Dipak K Sahoo

    Full Text Available Phytophthora sojae Kaufmann and Gerdemann, which causes Phytophthora root rot, is a widespread pathogen that limits soybean production worldwide. Development of Phytophthora resistant cultivars carrying Phytophthora resistance Rps genes is a cost-effective approach in controlling this disease. For this mapping study of a novel Rps gene, 290 recombinant inbred lines (RILs (F7 families were developed by crossing the P. sojae resistant cultivar PI399036 with the P. sojae susceptible AR2 line, and were phenotyped for responses to a mixture of three P. sojae isolates that overcome most of the known Rps genes. Of these 290 RILs, 130 were homozygous resistant, 12 heterzygous and segregating for Phytophthora resistance, and 148 were recessive homozygous and susceptible. From this population, 59 RILs homozygous for Phytophthora sojae resistance and 61 susceptible to a mixture of P. sojae isolates R17 and Val12-11 or P7074 that overcome resistance encoded by known Rps genes mapped to Chromosome 18 were selected for mapping novel Rps gene. A single gene accounted for the 1:1 segregation of resistance and susceptibility among the RILs. The gene encoding the Phytophthora resistance mapped to a 5.8 cM interval between the SSR markers BARCSOYSSR_18_1840 and Sat_064 located in the lower arm of Chromosome 18. The gene is mapped 2.2 cM proximal to the NBSRps4/6-like sequence that was reported to co-segregate with the Phytophthora resistance genes Rps4 and Rps6. The gene is mapped to a highly recombinogenic, gene-rich genomic region carrying several nucleotide binding site-leucine rich repeat (NBS-LRR-like genes. We named this novel gene as Rps12, which is expected to be an invaluable resource in breeding soybeans for Phytophthora resistance.

  11. Cartilage-selective genes identified in genome-scale analysis of non-cartilage and cartilage gene expression

    Directory of Open Access Journals (Sweden)

    Cohn Zachary A

    2007-06-01

    Full Text Available Abstract Background Cartilage plays a fundamental role in the development of the human skeleton. Early in embryogenesis, mesenchymal cells condense and differentiate into chondrocytes to shape the early skeleton. Subsequently, the cartilage anlagen differentiate to form the growth plates, which are responsible for linear bone growth, and the articular chondrocytes, which facilitate joint function. However, despite the multiplicity of roles of cartilage during human fetal life, surprisingly little is known about its transcriptome. To address this, a whole genome microarray expression profile was generated using RNA isolated from 18–22 week human distal femur fetal cartilage and compared with a database of control normal human tissues aggregated at UCLA, termed Celsius. Results 161 cartilage-selective genes were identified, defined as genes significantly expressed in cartilage with low expression and little variation across a panel of 34 non-cartilage tissues. Among these 161 genes were cartilage-specific genes such as cartilage collagen genes and 25 genes which have been associated with skeletal phenotypes in humans and/or mice. Many of the other cartilage-selective genes do not have established roles in cartilage or are novel, unannotated genes. Quantitative RT-PCR confirmed the unique pattern of gene expression observed by microarray analysis. Conclusion Defining the gene expression pattern for cartilage has identified new genes that may contribute to human skeletogenesis as well as provided further candidate genes for skeletal dysplasias. The data suggest that fetal cartilage is a complex and transcriptionally active tissue and demonstrate that the set of genes selectively expressed in the tissue has been greatly underestimated.

  12. Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci

    NARCIS (Netherlands)

    Keurentjes, Joost J.B.; Fu, Jingyuan; Terpstra, Inez R.; Garcia, Juan M.; Ackerveken, Guido van den; Snoek, L. Basten; Peeters, Anton J.M.; Vreugdenhil, Dick; Koornneef, Maarten; Jansen, Ritsert C.

    2007-01-01

    Accessions of a plant species can show considerable genetic differences that are analyzed effectively by using recombinant inbred line (RIL) populations. Here we describe the results of genome-wide expression variation analysis in an RIL population of Arabidopsis thaliana. For many genes, variation

  13. Natural variation of histone modification and its impact on gene expression in the rat genome

    NARCIS (Netherlands)

    Rintisch, Carola; Heinig, Matthias; Bauerfeind, Anja; Schafer, Sebastian; Mieth, Christin; Patone, Giannino; Hummel, Oliver; Chen, Wei; Cook, Stuart; Cuppen, Edwin; Colomé-Tatché, Maria; Johannes, Frank; Jansen, Ritsert C; Neil, Helen; Werner, Michel; Pravenec, Michal; Vingron, Martin; Hubner, Norbert

    Histone modifications are epigenetic marks that play fundamental roles in many biological processes including the control of chromatin-mediated regulation of gene expression. Little is known about interindividual variability of histone modification levels across the genome and to what extent they

  14. Genome-wide Analyses Identify KIF5A as a Novel ALS Gene

    NARCIS (Netherlands)

    Nicolas, Aude; Kenna, Kevin P.; Renton, Alan E.; Ticozzi, Nicola; Faghri, Faraz; Chia, Ruth; Dominov, Janice A.; Kenna, Brendan J.; Nalls, Mike A.; Keagle, Pamela; Rivera, Alberto M.; van Rheenen, Wouter; Murphy, Natalie A.; van Vugt, Joke J.F.A.; Geiger, Joshua T.; van der Spek, Rick; Pliner, Hannah A.; Smith, Bradley N.; Marangi, Giuseppe; Topp, Simon D.; Abramzon, Yevgeniya; Gkazi, Athina Soragia; Eicher, John D.; Kenna, Aoife; Logullo, Francesco O.; Simone, Isabella L.; Logroscino, Giancarlo; Salvi, Fabrizio; Bartolomei, Ilaria; Borghero, Giuseppe; Murru, Maria Rita; Costantino, Emanuela; Pani, Carla; Puddu, Roberta; Caredda, Carla; Piras, Valeria; Tranquilli, Stefania; Cuccu, Stefania; Corongiu, Daniela; Melis, Maurizio; Milia, Antonio; Marrosu, Francesco; Marrosu, Maria Giovanna; Floris, Gianluca; Cannas, Antonino; Capasso, Margherita; Caponnetto, Claudia; Mancardi, Gianluigi; Origone, Paola; Mandich, Paola; Conforti, Francesca L.; Cavallaro, Sebastiano; Mora, Gabriele; Marinou, Kalliopi; Sideri, Riccardo; Penco, Silvana; Mosca, Lorena; Lunetta, Christian; Pinter, Giuseppe Lauria; Corbo, Massimo; Riva, Nilo; Carrera, Paola; Volanti, Paolo; Mandrioli, Jessica; Fini, Nicola; Fasano, Antonio; Tremolizzo, Lucio; Arosio, Alessandro; Ferrarese, Carlo; Trojsi, Francesca; Tedeschi, Gioacchino; Monsurrò, Maria Rosaria; Piccirillo, Giovanni; Femiano, Cinzia; Ticca, Anna; Ortu, Enzo; La Bella, Vincenzo; Spataro, Rossella; Colletti, Tiziana; Sabatelli, Mario; Zollino, Marcella; Conte, Amelia; Luigetti, Marco; Lattante, Serena; Marangi, Giuseppe; Santarelli, Marialuisa; Petrucci, Antonio; Pugliatti, Maura; Pirisi, Angelo; Parish, Leslie D.; Occhineri, Patrizia; Giannini, Fabio; Battistini, Stefania; Ricci, Claudia; Benigni, Michele; Cau, Tea B.; Loi, Daniela; Calvo, Andrea; Moglia, Cristina; Brunetti, Maura; Barberis, Marco; Restagno, Gabriella; Casale, Federico; Marrali, Giuseppe; Fuda, Giuseppe; Ossola, Irene; Cammarosano, Stefania; Canosa, Antonio; Ilardi, Antonio; Manera, Umberto; Grassano, Maurizio; Tanel, Raffaella; Pisano, Fabrizio; Mora, Gabriele; Calvo, Andrea; Mazzini, Letizia; Riva, Nilo; Mandrioli, Jessica; Caponnetto, Claudia; Battistini, Stefania; Volanti, Paolo; La Bella, Vincenzo; Conforti, Francesca L.; Borghero, Giuseppe; Messina, Sonia; Simone, Isabella L.; Trojsi, Francesca; Salvi, Fabrizio; Logullo, Francesco O.; D'Alfonso, Sandra; Corrado, Lucia; Capasso, Margherita; Ferrucci, Luigi; Harms, Matthew B.; Goldstein, David B.; Shneider, Neil A.; Goutman, Stephen A.; Simmons, Zachary; Miller, Timothy M.; Chandran, Siddharthan; Pal, Suvankar; Manousakis, George; Appel, Stanley H.; Simpson, Ericka; Wang, Leo; Baloh, Robert H.; Gibson, Summer B.; Bedlack, Richard; Lacomis, David; Sareen, Dhruv; Sherman, Alexander; Bruijn, Lucie; Penny, Michelle; Moreno, Cristiane de Araujo Martins; Kamalakaran, Sitharthan; Goldstein, David B.; Allen, Andrew S.; Appel, Stanley; Baloh, Robert H.; Bedlack, Richard S.; Boone, Braden E.; Brown, Robert; Carulli, John P.; Chesi, Alessandra; Chung, Wendy K.; Cirulli, Elizabeth T.; Cooper, Gregory M.; Couthouis, Julien; Day-Williams, Aaron G.; Dion, Patrick A.; Gibson, Summer B.; Gitler, Aaron D.; Glass, Jonathan D.; Goldstein, David B.; Han, Yujun; Harms, Matthew B.; Harris, Tim; Hayes, Sebastian D.; Jones, Angela L.; Keebler, Jonathan; Krueger, Brian J.; Lasseigne, Brittany N.; Levy, Shawn E.; Lu, Yi Fan; Maniatis, Tom; McKenna-Yasek, Diane; Miller, Timothy M.; Myers, Richard M.; Petrovski, Slavé; Pulst, Stefan M.; Raphael, Alya R.; Ravits, John M.; Ren, Zhong; Rouleau, Guy A.; Sapp, Peter C.; Shneider, Neil A.; Simpson, Ericka; Sims, Katherine B.; Staropoli, John F.; Waite, Lindsay L.; Wang, Quanli; Wimbish, Jack R.; Xin, Winnie W.; Gitler, Aaron D.; Harris, Tim; Myers, Richard M.; Phatnani, Hemali; Kwan, Justin; Sareen, Dhruv; Broach, James R.; Simmons, Zachary; Arcila-Londono, Ximena; Lee, Edward B.; Van Deerlin, Vivianna M.; Shneider, Neil A.; Fraenkel, Ernest; Ostrow, Lyle W.; Baas, Frank; Zaitlen, Noah; Berry, James D.; Malaspina, Andrea; Fratta, Pietro; Cox, Gregory A.; Thompson, Leslie M.; Finkbeiner, Steve; Dardiotis, Efthimios; Miller, Timothy M.; Chandran, Siddharthan; Pal, Suvankar; Hornstein, Eran; MacGowan, Daniel J.L.; Heiman-Patterson, Terry D.; Hammell, Molly G.; Patsopoulos, Nikolaos A.; Dubnau, Joshua; Nath, Avindra; Phatnani, Hemali; Musunuri, Rajeeva Lochan; Evani, Uday Shankar; Abhyankar, Avinash; Zody, Michael C.; Kaye, Julia; Finkbeiner, Steven; Wyman, Stacia K.; LeNail, Alexander; Lima, Leandro; Fraenkel, Ernest; Rothstein, Jeffrey D.; Svendsen, Clive N.; Thompson, Leslie M.; Van Eyk, Jenny; Maragakis, Nicholas J.; Berry, James D.; Glass, Jonathan D.; Miller, Timothy M.; Kolb, Stephen J.; Baloh, Robert H.; Cudkowicz, Merit; Baxi, Emily; Kaye, Julia; Finkbeiner, Steven; Wyman, Stacia K.; Finkbeiner, Steven; LeNail, Alex; Lima, Leandro; Fraenkel, Ernest; Fraenkel, Ernest; Svendsen, Clive N.; Svendsen, Clive N.; Thompson, Leslie M.; Thompson, Leslie M.; Van Eyk, Jennifer E.; Berry, James D.; Berry, James D.; Miller, Timothy M.; Kolb, Stephen J.; Cudkowicz, Merit; Cudkowicz, Merit; Baxi, Emily; Benatar, Michael; Taylor, J. Paul; Wu, Gang; Rampersaud, Evadnie; Wuu, Joanne; Rademakers, Rosa; Züchner, Stephan; Schule, Rebecca; McCauley, Jacob; Hussain, Sumaira; Cooley, Anne; Wallace, Marielle; Clayman, Christine; Barohn, Richard; Statland, Jeffrey; Ravits, John M.; Swenson, Andrea; Jackson, Carlayne; Trivedi, Jaya; Khan, Shaida; Katz, Jonathan; Jenkins, Liberty; Burns, Ted; Gwathmey, Kelly; Caress, James; McMillan, Corey; Elman, Lauren; Pioro, Erik P.; Heckmann, Jeannine; So, Yuen; Walk, David; Maiser, Samuel; Zhang, Jinghui; Benatar, Michael; Taylor, J. Paul; Taylor, J. Paul; Rampersaud, Evadnie; Wu, Gang; Wuu, Joanne; Silani, Vincenzo; Ticozzi, Nicola; Gellera, Cinzia; Ratti, Antonia; Taroni, Franco; Lauria, Giuseppe; Verde, Federico; Fogh, Isabella; Tiloca, Cinzia; Comi, Giacomo P.; Sorarù, Gianni; Cereda, Cristina; D'Alfonso, Sandra; Corrado, Lucia; De Marchi, Fabiola; Corti, Stefania; Ceroni, Mauro; Mazzini, Letizia; Siciliano, Gabriele; Filosto, Massimiliano; Inghilleri, Maurizio; Peverelli, Silvia; Colombrita, Claudia; Poletti, Barbara; Maderna, Luca; Del Bo, Roberto; Gagliardi, Stella; Querin, Giorgia; Bertolin, Cinzia; Pensato, Viviana; Castellotti, Barbara; Lauria, Giuseppe; Verde, Federico; Fogh, Isabella; Tiloca, Cinzia; Fogh, Isabella; Comi, Giacomo P.; Sorarù, Gianni; Cereda, Cristina; Camu, William; Mouzat, Kevin; Lumbroso, Serge; Corcia, Philippe; Meininger, Vincent; Besson, Gérard; Lagrange, Emmeline; Clavelou, Pierre; Guy, Nathalie; Couratier, Philippe; Vourch, Patrick; Danel, Véronique; Bernard, Emilien; Lemasson, Gwendal; Corcia, Philippe; Laaksovirta, Hannu; Myllykangas, Liisa; Jansson, Lilja; Valori, Miko; Ealing, John; Hamdalla, Hisham; Rollinson, Sara; Pickering-Brown, Stuart; Orrell, Richard W.; Sidle, Katie C.; Malaspina, Andrea; Hardy, John; Singleton, Andrew B.; Johnson, Janel O.; Arepalli, Sampath; Sapp, Peter C.; McKenna-Yasek, Diane; Polak, Meraida; Asress, Seneshaw; Al-Sarraj, Safa; King, Andrew; Troakes, Claire; Vance, Caroline; de Belleroche, Jacqueline; Baas, Frank; ten Asbroek, Anneloor L.M.A.; Muñoz-Blanco, José Luis; Hernandez, Dena G.; Ding, Jinhui; Gibbs, J. Raphael; Scholz, Sonja W.; Scholz, Sonja W.; Floeter, Mary Kay; Campbell, Roy H.; Landi, Francesco; Bowser, Robert; Pulst, Stefan M.; Ravits, John M.; MacGowan, Daniel J.L.; Kirby, Janine; Pioro, Erik P.; Pamphlett, Roger; Broach, James; Gerhard, Glenn; Dunckley, Travis L.; Brady, Christopher B.; Brady, Christopher B.; Kowall, Neil W.; Troncoso, Juan C.; Le Ber, Isabelle; Mouzat, Kevin; Lumbroso, Serge; Mouzat, Kevin; Lumbroso, Serge; Heiman-Patterson, Terry D.; Heiman-Patterson, Terry D.; Kamel, Freya; Van Den Bosch, Ludo; Van Den Bosch, Ludo; Baloh, Robert H.; Strom, Tim M.; Meitinger, Thomas; Strom, Tim M.; Shatunov, Aleksey; Van Eijk, Kristel R.; de Carvalho, Mamede; de Carvalho, Mamede; Kooyman, Maarten; Middelkoop, Bas; Moisse, Matthieu; McLaughlin, Russell; Van Es, Michael A.; Weber, Markus; Boylan, Kevin B.; Van Blitterswijk, Marka; Rademakers, Rosa; Morrison, Karen; Basak, A. Nazli; Mora, Jesús S.; Drory, Vivian; Shaw, Pamela; Turner, Martin R.; Talbot, Kevin; Hardiman, Orla; Williams, Kelly L.; Fifita, Jennifer A.; Nicholson, Garth A.; Blair, Ian P.; Nicholson, Garth A.; Rouleau, Guy A.; Esteban-Pérez, Jesús; García-Redondo, Alberto; Al-Chalabi, Ammar; Al Kheifat, Ahmad; Al-Chalabi, Ammar; Andersen, Peter M.; Basak, A. Nazli; Blair, Ian P.; Chio, Adriano; Cooper-Knock, Jonathan; Corcia, Philippe; Couratier, Philippe; de Carvalho, Mamede; Dekker, Annelot; Drory, Vivian; Redondo, Alberto Garcia; Gotkine, Marc; Hardiman, Orla; Hide, Winston; Iacoangeli, Alfredo; Glass, Jonathan D.; Kenna, Kevin P.; Kiernan, Matthew; Kooyman, Maarten; Landers, John E.; McLaughlin, Russell; Middelkoop, Bas; Mill, Jonathan; Neto, Miguel Mitne; Moisse, Matthieu; Pardina, Jesus Mora; Morrison, Karen; Newhouse, Stephen; Pinto, Susana; Pulit, Sara; Robberecht, Wim; Shatunov, Aleksey; Shaw, Pamela; Shaw, Chris; Silani, Vincenzo; Sproviero, William; Tazelaar, Gijs; Ticozzi, Nicola; Van Damme, Philip; van den Berg, Leonard; van der Spek, Rick; Van Eijk, Kristel R.; Van Es, Michael A.; van Rheenen, Wouter; van Vugt, Joke J.F.A.; Veldink, Jan H.; Weber, Markus; Williams, Kelly L.; Van Damme, Philip; Robberecht, Wim; Zatz, Mayana; Robberecht, Wim; Bauer, Denis C.; Twine, Natalie A.; Rogaeva, Ekaterina; Zinman, Lorne; Ostrow, Lyle W.; Maragakis, Nicholas J.; Rothstein, Jeffrey D.; Simmons, Zachary; Cooper-Knock, Johnathan; Brice, Alexis; Goutman, Stephen A.; Feldman, Eva L.; Gibson, Summer B.; Taroni, Franco; Ratti, Antonia; Ratti, Antonia; Gellera, Cinzia; Van Damme, Philip; Robberecht, Wim; Fratta, Pietro; Sabatelli, Mario; Lunetta, Christian; Ludolph, Albert C.; Andersen, Peter M.; Weishaupt, Jochen H.; Camu, William; Trojanowski, John Q.; Van Deerlin, Vivianna M.; Brown, Robert H.; van den Berg, Leonard; Veldink, Jan H.; Harms, Matthew B.; Glass, Jonathan D.; Stone, David J.; Tienari, Pentti; Silani, Vincenzo; Silani, Vincenzo; Chiò, Adriano; Shaw, Christopher E.; Chiò, Adriano; Traynor, Bryan J.; Landers, John E.; Traynor, Bryan J.

    2018-01-01

    To identify novel genes associated with ALS, we undertook two lines of investigation. We carried out a genome-wide association study comparing 20,806 ALS cases and 59,804 controls. Independently, we performed a rare variant burden analysis comparing 1,138 index familial ALS cases and 19,494

  15. Facilitating genome navigation : survey sequencing and dense radiation-hybrid gene mapping

    NARCIS (Netherlands)

    Hitte, C; Madeoy, J; Kirkness, EF; Priat, C; Lorentzen, TD; Senger, F; Thomas, D; Derrien, T; Ramirez, C; Scott, C; Evanno, G; Pullar, B; Cadieu, E; Oza, [No Value; Lourgant, K; Jaffe, DB; Tacher, S; Dreano, S; Berkova, N; Andre, C; Deloukas, P; Fraser, C; Lindblad-Toh, K; Ostrander, EA; Galibert, F

    Accurate and comprehensive sequence coverage for large genomes has been restricted to only a few species of specific interest. Lower sequence coverage (survey sequencing) of related species can yield a wealth of information about gene content and putative regulatory elements. But survey sequences

  16. Functional annotation of rare gene aberration drivers of pancreatic cancer | Office of Cancer Genomics

    Science.gov (United States)

    As we enter the era of precision medicine, characterization of cancer genomes will directly influence therapeutic decisions in the clinic. Here we describe a platform enabling functionalization of rare gene mutations through their high-throughput construction, molecular barcoding and delivery to cancer models for in vivo tumour driver screens. We apply these technologies to identify oncogenic drivers of pancreatic ductal adenocarcinoma (PDAC).

  17. Functional Associations by Response Overlap (FARO), a functional genomics approach matching gene expression phenotypes

    DEFF Research Database (Denmark)

    Nielsen, Henrik Bjørn; Mundy, J.; Willenbrock, Hanni

    2007-01-01

    The systematic comparison of transcriptional responses of organisms is a powerful tool in functional genomics. For example, mutants may be characterized by comparing their transcript profiles to those obtained in other experiments querying the effects on gene expression of many experimental facto...

  18. Localization of candidate allergen genes on the apple (Malus domestica) genome and their putative allergenicity

    NARCIS (Netherlands)

    Gao Zhongshan,

    2005-01-01

    Apple is generally considered as a healthy food, but 2-3% European people can not eat this fruit because it provokes allergy reaction. Four classes of apple allergen genes have been identified, they are Mal d 1, Mal d 2, Mal d 3 and Mal d 4 . This thesis focuses on the genomic characterization of

  19. Earthworms and post agricultural succession in the Neotropics

    Science.gov (United States)

    Grizelle Gonzalez; C.Y. Huang; S.C. Chang

    2008-01-01

    Earthworms are classified into endogeic, anecic, and epigeic species to represent soil, soil and litter, and litter feeders, respectively (Bouché 1977). Earthworms can alter soil physical properties and biogeochemical processes (e.g., Ewards and Bohlen 1996) according to their functionality. Endogeic earthworms alter soil properties primarily through changing soil...

  20. Earthworms, Dirt, and Rotten Leaves: An Exploration in Ecology.

    Science.gov (United States)

    McLaughlin, Molly

    1994-01-01

    This article provides a model for inviting children to "an exploration in ecology" by observing earthworms. It gives reasons to explore earthworms and guides the investigator through a detailed examination of the worms to answer 21 observation questions. Explores the ways in which earthworms interact with their environment. (LZ)

  1. Identification of a Divided Genome for VSH-1, the Prophage-Like Gene Transfer Agent of Brachyspira hyodysenteriae

    Science.gov (United States)

    The Brachyspira hyodysenteriae B204 genome sequence revealed three VSH-1 tail genes hvp31, hvp60, and hvp37, in a 3.6 kb cluster. The location and transcription direction of these genes relative to the previously described VSH-1 16.3 kb gene operon indicate that the gene transfer agent VSH-1 has a ...

  2. Genome-wide functional genomic and transcriptomic analyses for genes regulating sensitivity to vorinostat.

    Science.gov (United States)

    Falkenberg, Katrina J; Gould, Cathryn M; Johnstone, Ricky W; Simpson, Kaylene J

    2014-01-01

    Identification of mechanisms of resistance to histone deacetylase inhibitors, such as vorinostat, is important in order to utilise these anticancer compounds more efficiently in the clinic. Here, we present a dataset containing multiple tiers of stringent siRNA screening for genes that when knocked down conferred sensitivity to vorinostat-induced cell death. We also present data from a miRNA overexpression screen for miRNAs contributing to vorinostat sensitivity. Furthermore, we provide transcriptomic analysis using massively parallel sequencing upon knockdown of 14 validated vorinostat-resistance genes. These datasets are suitable for analysis of genes and miRNAs involved in cell death in the presence and absence of vorinostat as well as computational biology approaches to identify gene regulatory networks.

  3. In vitro analysis of integrated global high-resolution DNA methylation profiling with genomic imbalance and gene expression in osteosarcoma.

    Directory of Open Access Journals (Sweden)

    Bekim Sadikovic

    Full Text Available Genetic and epigenetic changes contribute to deregulation of gene expression and development of human cancer. Changes in DNA methylation are key epigenetic factors regulating gene expression and genomic stability. Recent progress in microarray technologies resulted in developments of high resolution platforms for profiling of genetic, epigenetic and gene expression changes. OS is a pediatric bone tumor with characteristically high level of numerical and structural chromosomal changes. Furthermore, little is known about DNA methylation changes in OS. Our objective was to develop an integrative approach for analysis of high-resolution epigenomic, genomic, and gene expression profiles in order to identify functional epi/genomic differences between OS cell lines and normal human osteoblasts. A combination of Affymetrix Promoter Tilling Arrays for DNA methylation, Agilent array-CGH platform for genomic imbalance and Affymetrix Gene 1.0 platform for gene expression analysis was used. As a result, an integrative high-resolution approach for interrogation of genome-wide tumour-specific changes in DNA methylation was developed. This approach was used to provide the first genomic DNA methylation maps, and to identify and validate genes with aberrant DNA methylation in OS cell lines. This first integrative analysis of global cancer-related changes in DNA methylation, genomic imbalance, and gene expression has provided comprehensive evidence of the cumulative roles of epigenetic and genetic mechanisms in deregulation of gene expression networks.

  4. Investigation of mutations in the HBB gene using the 1,000 genomes database.

    Science.gov (United States)

    Carlice-Dos-Reis, Tânia; Viana, Jaime; Moreira, Fabiano Cordeiro; Cardoso, Greice de Lemos; Guerreiro, João; Santos, Sidney; Ribeiro-Dos-Santos, Ândrea

    2017-01-01

    Mutations in the HBB gene are responsible for several serious hemoglobinopathies, such as sickle cell anemia and β-thalassemia. Sickle cell anemia is one of the most common monogenic diseases worldwide. Due to its prevalence, diverse strategies have been developed for a better understanding of its molecular mechanisms. In silico analysis has been increasingly used to investigate the genotype-phenotype relationship of many diseases, and the sequences of healthy individuals deposited in the 1,000 Genomes database appear to be an excellent tool for such analysis. The objective of this study is to analyze the variations in the HBB gene in the 1,000 Genomes database, to describe the mutation frequencies in the different population groups, and to investigate the pattern of pathogenicity. The computational tool SNPEFF was used to align the data from 2,504 samples of the 1,000 Genomes database with the HG19 genome reference. The pathogenicity of each amino acid change was investigated using the databases CLINVAR, dbSNP and HbVar and five different predictors. Twenty different mutations were found in 209 healthy individuals. The African group had the highest number of individuals with mutations, and the European group had the lowest number. Thus, it is concluded that approximately 8.3% of phenotypically healthy individuals from the 1,000 Genomes database have some mutation in the HBB gene. The frequency of mutated genes was estimated at 0.042, so that the expected frequency of being homozygous or compound heterozygous for these variants in the next generation is approximately 0.002. In total, 193 subjects had a non-synonymous mutation, which 186 (7.4%) have a deleterious mutation. Considering that the 1,000 Genomes database is representative of the world's population, it can be estimated that fourteen out of every 10,000 individuals in the world will have a hemoglobinopathy in the next generation.

  5. Ultra Large Gene Families: A Matter of Adaptation or Genomic Parasites?

    Directory of Open Access Journals (Sweden)

    Philipp H. Schiffer

    2016-08-01

    Full Text Available Gene duplication is an important mechanism of molecular evolution. It offers a fast track to modification, diversification, redundancy or rescue of gene function. However, duplication may also be neutral or (slightly deleterious, and often ends in pseudo-geneisation. Here, we investigate the phylogenetic distribution of ultra large gene families on long and short evolutionary time scales. In particular, we focus on a family of NACHT-domain and leucine-rich-repeat-containing (NLR-genes, which we previously found in large numbers to occupy one chromosome arm of the zebrafish genome. We were interested to see whether such a tight clustering is characteristic for ultra large gene families. Our data reconfirm that most gene family inflations are lineage-specific, but we can only identify very few gene clusters. Based on our observations we hypothesise that, beyond a certain size threshold, ultra large gene families continue to proliferate in a mechanism we term “run-away evolution”. This process might ultimately lead to the failure of genomic integrity and drive species to extinction.

  6. Mapping determinants of gene expression plasticity by genetical genomics in C. elegans.

    Directory of Open Access Journals (Sweden)

    Yang Li

    2006-12-01

    Full Text Available Recent genetical genomics studies have provided intimate views on gene regulatory networks. Gene expression variations between genetically different individuals have been mapped to the causal regulatory regions, termed expression quantitative trait loci. Whether the environment-induced plastic response of gene expression also shows heritable difference has not yet been studied. Here we show that differential expression induced by temperatures of 16 degrees C and 24 degrees C has a strong genetic component in Caenorhabditis elegans recombinant inbred strains derived from a cross between strains CB4856 (Hawaii and N2 (Bristol. No less than 59% of 308 trans-acting genes showed a significant eQTL-by-environment interaction, here termed plasticity quantitative trait loci. In contrast, only 8% of an estimated 188 cis-acting genes showed such interaction. This indicates that heritable differences in plastic responses of gene expression are largely regulated in trans. This regulation is spread over many different regulators. However, for one group of trans-genes we found prominent evidence for a common master regulator: a transband of 66 coregulated genes appeared at 24 degrees C. Our results suggest widespread genetic variation of differential expression responses to environmental impacts and demonstrate the potential of genetical genomics for mapping the molecular determinants of phenotypic plasticity.

  7. Gain and loss of phototrophic genes revealed by comparison of two Citromicrobium bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Qiang Zheng

    Full Text Available Proteobacteria are thought to have diverged from a phototrophic ancestor, according to the scattered distribution of phototrophy throughout the proteobacterial clade, and so the occurrence of numerous closely related phototrophic and chemotrophic microorganisms may be the result of the loss of genes for phototrophy. A widespread form of bacterial phototrophy is based on the photochemical reaction center, encoded by puf and puh operons that typically are in a 'photosynthesis gene cluster' (abbreviated as the PGC with pigment biosynthesis genes. Comparison of two closely related Citromicrobial genomes (98.1% sequence identity of complete 16S rRNA genes, Citromicrobium sp. JL354, which contains two copies of reaction center genes, and Citromicrobium strain JLT1363, which is chemotrophic, revealed evidence for the loss of phototrophic genes. However, evidence of horizontal gene transfer was found in these two bacterial genomes. An incomplete PGC (pufLMC-puhCBA in strain JL354 was located within an integrating conjugative element, which indicates a potential mechanism for the horizontal transfer of genes for phototrophy.

  8. Earthworm Effects without Earthworms: Inoculation of Raw Organic Matter with Worm-Worked Substrates Alters Microbial Community Functioning

    OpenAIRE

    Aira, Manuel; Domínguez, Jorge

    2011-01-01

    BACKGROUND: Earthworms are key organisms in organic matter decomposition because of the interactions they establish with soil microorganisms. They enhance decomposition rates through the joint action of direct effects (i.e. effects due to direct earthworm activity such as digestion, burrowing, etc) and indirect effects (i.e. effects derived from earthworm activities such as cast ageing). Here we test whether indirect earthworm effects affect microbial community functioning in the substrate, a...

  9. Biological consequences of ancient gene acquisition and duplication in the large genome soil bacterium, ""solibacter usitatus"" strain Ellin6076

    Energy Technology Data Exchange (ETDEWEB)

    Challacombe, Jean F [Los Alamos National Laboratory; Eichorst, Stephanie A [Los Alamos National Laboratory; Xie, Gary [Los Alamos National Laboratory; Kuske, Cheryl R [Los Alamos National Laboratory; Hauser, Loren [ORNL; Land, Miriam [ORNL

    2009-01-01

    Bacterial genome sizes range from ca. 0.5 to 10Mb and are influenced by gene duplication, horizontal gene transfer, gene loss and other evolutionary processes. Sequenced genomes of strains in the phylum Acidobacteria revealed that 'Solibacter usistatus' strain Ellin6076 harbors a 9.9 Mb genome. This large genome appears to have arisen by horizontal gene transfer via ancient bacteriophage and plasmid-mediated transduction, as well as widespread small-scale gene duplications. This has resulted in an increased number of paralogs that are potentially ecologically important (ecoparalogs). Low amino acid sequence identities among functional group members and lack of conserved gene order and orientation in the regions containing similar groups of paralogs suggest that most of the paralogs were not the result of recent duplication events. The genome sizes of cultured subdivision 1 and 3 strains in the phylum Acidobacteria were estimated using pulsed-field gel electrophoresis to determine the prevalence of the large genome trait within the phylum. Members of subdivision 1 were estimated to have smaller genome sizes ranging from ca. 2.0 to 4.8 Mb, whereas members of subdivision 3 had slightly larger genomes, from ca. 5.8 to 9.9 Mb. It is hypothesized that the large genome of strain Ellin6076 encodes traits that provide a selective metabolic, defensive and regulatory advantage in the variable soil environment.

  10. Genes Important for Schizosaccharomyces pombe Meiosis Identified Through a Functional Genomics Screen

    Science.gov (United States)

    Blyth, Julie; Makrantoni, Vasso; Barton, Rachael E.; Spanos, Christos; Rappsilber, Juri; Marston, Adele L.

    2018-01-01

    Meiosis is a specialized cell division that generates gametes, such as eggs and sperm. Errors in meiosis result in miscarriages and are the leading cause of birth defects; however, the molecular origins of these defects remain unknown. Studies in model organisms are beginning to identify the genes and pathways important for meiosis, but the parts list is still poorly defined. Here we present a comprehensive catalog of genes important for meiosis in the fission yeast, Schizosaccharomyces pombe. Our genome-wide functional screen surveyed all nonessential genes for roles in chromosome segregation and spore formation. Novel genes important at distinct stages of the meiotic chromosome segregation and differentiation program were identified. Preliminary characterization implicated three of these genes in centrosome/spindle pole body, centromere, and cohesion function. Our findings represent a near-complete parts list of genes important for meiosis in fission yeast, providing a valuable resource to advance our molecular understanding of meiosis. PMID:29259000

  11. Genome-Wide Distribution, Organisation and Functional Characterization of Disease Resistance and Defence Response Genes across Rice Species

    Science.gov (United States)

    Singh, Sangeeta; Chand, Suresh; Singh, N. K.; Sharma, Tilak Raj

    2015-01-01

    The resistance (R) genes and defense response (DR) genes have become very important resources for the development of disease resistant cultivars. In the present investigation, genome-wide identification, expression, phylogenetic and synteny analysis was done for R and DR-genes across three species of rice viz: Oryza sativa ssp indica cv 93-11, Oryza sativa ssp japonica and wild rice species, Oryza brachyantha. We used the in silico approach to identify and map 786 R -genes and 167 DR-genes, 672 R-genes and 142 DR-genes, 251 R-genes and 86 DR-genes in the japonica, indica and O. brachyanth a genomes, respectively. Our analysis showed that 60.5% and 55.6% of the R-genes are tandemly repeated within clusters and distributed over all the rice chromosomes in indica and japonica genomes, respectively. The phylogenetic analysis along with motif distribution shows high degree of conservation of R- and DR-genes in clusters. In silico expression analysis of R-genes and DR-genes showed more than 85% were expressed genes showing corresponding EST matches in the databases. This study gave special emphasis on mechanisms of gene evolution and duplication for R and DR genes across species. Analysis of paralogs across rice species indicated 17% and 4.38% R-genes, 29% and 11.63% DR-genes duplication in indica and Oryza brachyantha, as compared to 20% and 26% duplication of R-genes and DR-genes in japonica respectively. We found that during the course of duplication only 9.5% of R- and DR-genes changed their function and rest of the genes have maintained their identity. Syntenic relationship across three genomes inferred that more orthology is shared between indica and japonica genomes as compared to brachyantha genome. Genome wide identification of R-genes and DR-genes in the rice genome will help in allele mining and functional validation of these genes, and to understand molecular mechanism of disease resistance and their evolution in rice and related species. PMID:25902056

  12. DNA microarrays of baculovirus genomes: differential expression of viral genes in two susceptible insect cell lines.

    Science.gov (United States)

    Yamagishi, J; Isobe, R; Takebuchi, T; Bando, H

    2003-03-01

    We describe, for the first time, the generation of a viral DNA chip for simultaneous expression measurements of nearly all known open reading frames (ORFs) in the best-studied members of the family Baculoviridae, Autographa californica multiple nucleopolyhedrovirus (AcMNPV) and Bombyx mori nucleopolyhedrovirus (BmNPV). In this study, a viral DNA chip (Ac-BmNPV chip) was fabricated and used to characterize the viral gene expression profile for AcMNPV in different cell types. The viral chip is composed of microarrays of viral DNA prepared by robotic deposition of PCR-amplified viral DNA fragments on glass for ORFs in the NPV genome. Viral gene expression was monitored by hybridization to the DNA fragment microarrays with fluorescently labeled cDNAs prepared from infected Spodoptera frugiperda, Sf9 cells and Trichoplusia ni, TnHigh-Five cells, the latter a major producer of baculovirus and recombinant proteins. A comparison of expression profiles of known ORFs in AcMNPV elucidated six genes (ORF150, p10, pk2, and three late gene expression factor genes lef-3, p35 and lef- 6) the expression of each of which was regulated differently in the two cell lines. Most of these genes are known to be closely involved in the viral life cycle such as in DNA replication, late gene expression and the release of polyhedra from infected cells. These results imply that the differential expression of these viral genes accounts for the differences in viral replication between these two cell lines. Thus, these fabricated microarrays of NPV DNA which allow a rapid analysis of gene expression at the viral genome level should greatly speed the functional analysis of large genomes of NPV.

  13. Homology of yeast photoreactivating gene fragment with human genomic digests

    International Nuclear Information System (INIS)

    Meechan, P.J.; Milam, K.M.; Cleaver, J.E.

    1984-01-01

    Enzymatic photoreactivation of UV-induced DNA lesions has been demonstrated for a variety of prokaryotic and eukaryotic organisms. Its presence in placental mammals, however, has not been clearly established. The authors attempted to resolve this question by assaying for the presence (or absence) of sequences in human DNA complimentary to a fragment of the photoreactivating gene from S. cerevisiae that has recently been cloned. In another study, DNA from human, chick E. coli and yeast cells was digested with either HindIII of BglII, electrophoresed on a 0.5% agarose gel, transferred (Southern blot) to a nylon membrane and probed for homology against a Sau3A restriction fragment from S. cerevisiae that compliments phr/sup -/ cells. Hybridization to human DNA digests was observed only under relatively non-stringent conditions indicating the gene is not conserved in placental mammals. These results are correlated with current literature data concerning photoreactivating enzymes

  14. Genome-wide analysis of the GRAS gene family in Prunus mume.

    Science.gov (United States)

    Lu, Jiuxing; Wang, Tao; Xu, Zongda; Sun, Lidan; Zhang, Qixiang

    2015-02-01

    Prunus mume is an ornamental flower and fruit tree in Rosaceae. We investigated the GRAS gene family to improve the breeding and cultivation of P. mume and other Rosaceae fruit trees. The GRAS gene family encodes transcriptional regulators that have diverse functions in plant growth and development, such as gibberellin and phytochrome A signal transduction, root radial patterning, and axillary meristem formation and gametogenesis in the P. mume genome. Despite the important roles of these genes in plant growth regulation, no findings on the GRAS genes of P. mume have been reported. In this study, we discerned phylogenetic relationships of P. mume GRAS genes, and their locations, structures in the genome and expression levels of different tissues. Out of 46 identified GRAS genes, 45 were located on the 8 P. mume chromosomes. Phylogenetic results showed that these genes could be classified into 11 groups. We found that Group X was P. mume-specific, and three genes of Group IX clustered with the rice-specific gene Os4. We speculated that these genes existed before the divergence of dicotyledons and monocotyledons and were lost in Arabidopsis. Tissue expression analysis indicated that 13 genes showed high expression levels in roots, stems, leaves, flowers and fruits, and were related to plant growth and development. Functional analysis of 24 GRAS genes and an orthologous relationship analysis indicated that many functioned during plant growth and flower and fruit development. Our bioinformatics analysis provides valuable information to improve the economic, agronomic and ecological benefits of P. mume and other Rosaceae fruit trees.

  15. Genome-wide analysis of the WRKY gene family in cotton.

    Science.gov (United States)

    Dou, Lingling; Zhang, Xiaohong; Pang, Chaoyou; Song, Meizhen; Wei, Hengling; Fan, Shuli; Yu, Shuxun

    2014-12-01

    WRKY proteins are major transcription factors involved in regulating plant growth and development. Although many studies have focused on the functional identification of WRKY genes, our knowledge concerning many areas of WRKY gene biology is limited. For example, in cotton, the phylogenetic characteristics, global expression patterns, molecular mechanisms regulating expression, and target genes/pathways of WRKY genes are poorly characterized. Therefore, in this study, we present a genome-wide analysis of the WRKY gene family in cotton (Gossypium raimondii and Gossypium hirsutum). We identified 116 WRKY genes in G. raimondii from the completed genome sequence, and we cloned 102 WRKY genes in G. hirsutum. Chromosomal location analysis indicated that WRKY genes in G. raimondii evolved mainly from segmental duplication followed by tandem amplifications. Phylogenetic analysis of alga, bryophyte, lycophyta, monocot and eudicot WRKY domains revealed family member expansion with increasing complexity of the plant body. Microarray, expression profiling and qRT-PCR data revealed that WRKY genes in G. hirsutum may regulate the development of fibers, anthers, tissues (roots, stems, leaves and embryos), and are involved in the response to stresses. Expression analysis showed that most group II and III GhWRKY genes are highly expressed under diverse stresses. Group I members, representing the ancestral form, seem to be insensitive to abiotic stress, with low expression divergence. Our results indicate that cotton WRKY genes might have evolved by adaptive duplication, leading to sensitivity to diverse stresses. This study provides fundamental information to inform further analysis and understanding of WRKY gene functions in cotton species.

  16. Functional Genome Mining for Metabolites Encoded by Large Gene Clusters through Heterologous Expression of a Whole-Genome Bacterial Artificial Chromosome Library in Streptomyces spp.

    Science.gov (United States)

    Xu, Min; Wang, Yemin; Zhao, Zhilong; Gao, Guixi; Huang, Sheng-Xiong; Kang, Qianjin; He, Xinyi; Lin, Shuangjun; Pang, Xiuhua; Deng, Zixin

    2016-01-01

    ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including

  17. The mitochondrial gene encoding ribosomal protein S12 has been translocated to the nuclear genome in Oenothera.

    Science.gov (United States)

    Grohmann, L; Brennicke, A; Schuster, W

    1992-01-01

    The Oenothera mitochondrial genome contains only a gene fragment for ribosomal protein S12 (rps12), while other plants encode a functional gene in the mitochondrion. The complete Oenothera rps12 gene is located in the nucleus. The transit sequence necessary to target this protein to the mitochondrion is encoded by a 5'-extension of the open reading frame. Comparison of the amino acid sequence encoded by the nuclear gene with the polypeptides encoded by edited mitochondrial cDNA and genomic sequences of other plants suggests that gene transfer between mitochondrion and nucleus started from edited mitochondrial RNA molecules. Mechanisms and requirements of gene transfer and activation are discussed. Images PMID:1454526

  18. A Scan for Positively Selected Genes in the Genomes of Humans and Chimpanzees

    DEFF Research Database (Denmark)

    Nielsen, Rasmus; Bustamente, Carlos; Clark, Andrew G.

    2005-01-01

    Since the divergence of humans and chimpanzees about 5 million years ago, these species have undergone a remarkable evolution with drastic divergence in anatomy and cognitive abilities. At the molecular level, despite the small overall magnitude of DNA sequence divergence, we might expect...... such evolutionary changes to leave a noticeable signature throughout the genome. We here compare 13,731 annotated genes from humans to their chimpanzee orthologs to identify genes that show evidence of positive selection. Many of the genes that present a signature of positive selection tend to be involved...

  19. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    Directory of Open Access Journals (Sweden)

    Borui Pi

    Full Text Available Secondary metabolites (SMs produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic.

  20. Co-Expression of Neighboring Genes in the Zebrafish (Danio rerio Genome

    Directory of Open Access Journals (Sweden)

    Daryi Wang

    2009-08-01

    Full Text Available Neighboring genes in the eukaryotic genome have a tendency to express concurrently, and the proximity of two adjacent genes is often considered a possible explanation for their co-expression behavior. However, the actual contribution of the physical distance between two genes to their co-expression behavior has yet to be defined. To further investigate this issue, we studied the co-expression of neighboring genes in zebrafish, which has a compact genome and has experienced a whole genome duplication event. Our analysis shows that the proportion of highly co-expressed neighboring pairs (Pearson’s correlation coefficient R>0.7 is low (0.24% ~ 0.67%; however, it is still significantly higher than that of random pairs. In particular, the statistical result implies that the co-expression tendency of neighboring pairs is negatively correlated with their physical distance. Our findings therefore suggest that physical distance may play an important role in the co-expression of neighboring genes. Possible mechanisms related to the neighboring genes’ co-expression are also discussed.

  1. Dynamic evolution of Geranium mitochondrial genomes through multiple horizontal and intracellular gene transfers.

    Science.gov (United States)

    Park, Seongjun; Grewe, Felix; Zhu, Andan; Ruhlman, Tracey A; Sabir, Jamal; Mower, Jeffrey P; Jansen, Robert K

    2015-10-01

    The exchange of genetic material between cellular organelles through intracellular gene transfer (IGT) or between species by horizontal gene transfer (HGT) has played an important role in plant mitochondrial genome evolution. The mitochondrial genomes of Geraniaceae display a number of unusual phenomena including highly accelerated rates of synonymous substitutions, extensive gene loss and reduction in RNA editing. Mitochondrial DNA sequences assembled for 17 species of Geranium revealed substantial reduction in gene and intron content relative to the ancestor of the Geranium lineage. Comparative analyses of nuclear transcriptome data suggest that a number of these sequences have been functionally relocated to the nucleus via IGT. Evidence for rampant HGT was detected in several Geranium species containing foreign organellar DNA from diverse eudicots, including many transfers from parasitic plants. One lineage has experienced multiple, independent HGT episodes, many of which occurred within the past 5.5 Myr. Both duplicative and recapture HGT were documented in Geranium lineages. The mitochondrial genome of Geranium brycei contains at least four independent HGT tracts that are absent in its nearest relative. Furthermore, G. brycei mitochondria carry two copies of the cox1 gene that differ in intron content, providing insight into contrasting hypotheses on cox1 intron evolution. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  2. Confluence of genes, environment, development, and behavior in a post Genome-Wide Association Study world.

    Science.gov (United States)

    Vrieze, Scott I; Iacono, William G; McGue, Matt

    2012-11-01

    This article serves to outline a research paradigm to investigate main effects and interactions of genes, environment, and development on behavior and psychiatric illness. We provide a historical context for candidate gene studies and genome-wide association studies, including benefits, limitations, and expected payoffs. Using substance use and abuse as our driving example, we then turn to the importance of etiological psychological theory in guiding genetic, environmental, and developmental research, as well as the utility of refined phenotypic measures, such as endophenotypes, in the pursuit of etiological understanding and focused tests of genetic and environmental associations. Phenotypic measurement has received considerable attention in the history of psychology and is informed by psychometrics, whereas the environment remains relatively poorly measured and is often confounded with genetic effects (i.e., gene-environment correlation). Genetically informed designs, which are no longer limited to twin and adoption studies thanks to ever-cheaper genotyping, are required to understand environmental influences. Finally, we outline the vast amount of individual difference in structural genomic variation, most of which remains to be leveraged in genetic association tests. Although the genetic data can be massive and burdensome (tens of millions of variants per person), we argue that improved understanding of genomic structure and function will provide investigators with new tools to test specific a priori hypotheses derived from etiological psychological theory, much like current candidate gene research but with less confusion and more payoff than candidate gene research has to date.

  3. A Genomics Based Discovery of Secondary Metabolite Biosynthetic Gene Clusters in Aspergillus ustus

    Science.gov (United States)

    Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong

    2015-01-01

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic. PMID:25706180

  4. Tibrogargan and Coastal Plains rhabdoviruses: genomic characterization, evolution of novel genes and seroprevalence in Australian livestock.

    Science.gov (United States)

    Gubala, Aneta; Davis, Steven; Weir, Richard; Melville, Lorna; Cowled, Chris; Boyle, David

    2011-09-01

    Tibrogargan virus (TIBV) and Coastal Plains virus (CPV) were isolated from cattle in Australia and TIBV has also been isolated from the biting midge Culicoides brevitarsis. Complete genomic sequencing revealed that the viruses share a novel genome structure within the family Rhabdoviridae, each virus containing two additional putative genes between the matrix protein (M) and glycoprotein (G) genes and one between the G and viral RNA polymerase (L) genes. The predicted novel protein products are highly diverged at the sequence level but demonstrate clear conservation of secondary structure elements, suggesting conservation of biological functions. Phylogenetic analyses showed that TIBV and CPV form an independent group within the 'dimarhabdovirus supergroup'. Although no disease has been observed in association with these viruses, antibodies were detected at high prevalence in cattle and buffalo in northern Australia, indicating the need for disease monitoring and further study of this distinctive group of viruses.

  5. Field of genes: the politics of science and identity in the Estonian Genome Project.

    Science.gov (United States)

    Fletcher, Amy L

    2004-04-01

    This case study of the Estonian Genome Project (EGP) analyses the Estonian policy decision to construct a national human gene bank. Drawing upon qualitative data from newspaper articles and public policy documents, it focuses on how proponents use discourse to link the EGP to the broader political goal of securing Estonia's position within the Western/European scientific and cultural space. This dominant narrative is then situated within the analytical notion of the "brand state", which raises potentially negative political consequences for this type of market-driven genomic research. Considered against the increasing number of countries engaging in gene bank and/or gene database projects, this analysis of Estonia elucidates issues that cross national boundaries, while also illuminating factors specific to this small, post-Soviet state as it enters the global biocybernetic economy.

  6. Prepatterning of developmental gene expression by modified histones before zygotic genome activation

    DEFF Research Database (Denmark)

    Lindeman, Leif C.; Andersen, Ingrid S.; Reiner, Andrew H.

    2011-01-01

    A hallmark of anamniote vertebrate development is a window of embryonic transcription-independent cell divisions before onset of zygotic genome activation (ZGA). Chromatin determinants of ZGA are unexplored; however, marking of developmental genes by modified histones in sperm suggests a predictive...... role of histone marks for ZGA. In zebrafish, pre-ZGA development for ten cell cycles provides an opportunity to examine whether genomic enrichment in modified histones is present before initiation of transcription. By profiling histone H3 trimethylation on all zebrafish promoters before and after ZGA......, we demonstrate here an epigenetic prepatterning of developmental gene expression. This involves pre-ZGA marking of transcriptionally inactive genes involved in homeostatic and developmental regulation by permissive H3K4me3 with or without repressive H3K9me3 or H3K27me3. Our data suggest that histone...

  7. Genomic instability of osteosarcoma cell lines in culture: impact on the prediction of metastasis relevant genes.

    Science.gov (United States)

    Muff, Roman; Rath, Prisni; Ram Kumar, Ram Mohan; Husmann, Knut; Born, Walter; Baudis, Michael; Fuchs, Bruno

    2015-01-01

    Osteosarcoma is a rare but highly malignant cancer of the bone. As a consequence, the number of established cell lines used for experimental in vitro and in vivo osteosarcoma research is limited and the value of these cell lines relies on their stability during culture. Here we investigated the stability in gene expression by microarray analysis and array genomic hybridization of three low metastatic cell lines and derivatives thereof with increased metastatic potential using cells of different passages. The osteosarcoma cell lines showed altered gene expression during in vitro culture, and it was more pronounced in two metastatic cell lines compared to the respective parental cells. Chromosomal instability contributed in part to the altered gene expression in SAOS and LM5 cells with low and high metastatic potential. To identify metastasis-relevant genes in a background of passage-dependent altered gene expression, genes involved in "Pathways in cancer" that were consistently regulated under all passage comparisons were evaluated. Genes belonging to "Hedgehog signaling pathway" and "Wnt signaling pathway" were significantly up-regulated, and IHH, WNT10B and TCF7 were found up-regulated in all three metastatic compared to the parental cell lines. Considerable instability during culture in terms of gene expression and chromosomal aberrations was observed in osteosarcoma cell lines. The use of cells from different passages and a search for genes consistently regulated in early and late passages allows the analysis of metastasis-relevant genes despite the observed instability in gene expression in osteosarcoma cell lines during culture.

  8. Data on the genome-wide identification of CNL R-genes in Setaria italica (L.) P. Beauv.

    Science.gov (United States)

    Andersen, Ethan J; Nepal, Madhav P

    2017-08-01

    We report data associated with the identification of 242 disease resistance genes (R-genes) in the genome of Setaria italica as presented in "Genetic diversity of disease resistance genes in foxtail millet ( Setaria italica L.)" (Andersen and Nepal, 2017) [1]. Our data describe the structure and evolution of the Coiled-coil, Nucleotide-binding site, Leucine-rich repeat (CNL) R-genes in foxtail millet. The CNL genes were identified through rigorous extraction and analysis of recently available plant genome sequences using cutting-edge analytical software. Data visualization includes gene structure diagrams, chromosomal syntenic maps, a chromosomal density plot, and a maximum-likelihood phylogenetic tree comparing Sorghum bicolor , Panicum virgatum , Setaria italica , and Arabidopsis thaliana . Compilation of InterProScan annotations, Gene Ontology (GO) annotations, and Basic Local Alignment Search Tool (BLAST) results for the 242 R-genes identified in the foxtail millet genome are also included in tabular format.

  9. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets.

    Science.gov (United States)

    Khan, Aziz; Mathelier, Anthony

    2017-05-31

    A common task for scientists relies on comparing lists of genes or genomic regions derived from high-throughput sequencing experiments. While several tools exist to intersect and visualize sets of genes, similar tools dedicated to the visualization of genomic region sets are currently limited. To address this gap, we have developed the Intervene tool, which provides an easy and automated interface for the effective intersection and visualization of genomic region or list sets, thus facilitating their analysis and interpretation. Intervene contains three modules: venn to generate Venn diagrams of up to six sets, upset to generate UpSet plots of multiple sets, and pairwise to compute and visualize intersections of multiple sets as clustered heat maps. Intervene, and its interactive web ShinyApp companion, generate publication-quality figures for the interpretation of genomic region and list sets. Intervene and its web application companion provide an easy command line and an interactive web interface to compute intersections of multiple genomic and list sets. They have the capacity to plot intersections using easy-to-interpret visual approaches. Intervene is developed and designed to meet the needs of both computer scientists and biologists. The source code is freely available at https://bitbucket.org/CBGR/intervene , with the web application available at https://asntech.shinyapps.io/intervene .

  10. Evolutionary genomics and adaptive evolution of the Hedgehog gene family (Shh, Ihh and Dhh in vertebrates.

    Directory of Open Access Journals (Sweden)

    Joana Pereira

    Full Text Available The Hedgehog (Hh gene family codes for a class of secreted proteins composed of two active domains that act as signalling molecules during embryo development, namely for the development of the nervous and skeletal systems and the formation of the testis cord. While only one Hh gene is found typically in invertebrate genomes, most vertebrates species have three (Sonic hedgehog--Shh; Indian hedgehog--Ihh; and Desert hedgehog--Dhh, each with different expression patterns and functions, which likely helped promote the increasing complexity of vertebrates and their successful diversification. In this study, we used comparative genomic and adaptive evolutionary analyses to characterize the evolution of the Hh genes in vertebrates following the two major whole genome duplication (WGD events. To overcome the lack of Hh-coding sequences on avian publicly available databases, we used an extensive dataset of 45 avian and three non-avian reptilian genomes to show that birds have all three Hh paralogs. We find suggestions that following the WGD events, vertebrate Hh paralogous genes evolved independently within similar linkage groups and under different evolutionary rates, especially within the catalytic domain. The structural regions around the ion-binding site were identified to be under positive selection in the signaling domain. These findings contrast with those observed in invertebrates, where different lineages that experienced gene duplication retained similar selective constraints in the Hh orthologs. Our results provide new insights on the evolutionary history of the Hh gene family, the functional roles of these paralogs in vertebrate species, and on the location of mutational hotspots.

  11. Evolutionary genomics and adaptive evolution of the Hedgehog gene family (Shh, Ihh and Dhh) in vertebrates.

    Science.gov (United States)

    Pereira, Joana; Johnson, Warren E; O'Brien, Stephen J; Jarvis, Erich D; Zhang, Guojie; Gilbert, M Thomas P; Vasconcelos, Vitor; Antunes, Agostinho

    2014-01-01

    The Hedgehog (Hh) gene family codes for a class of secreted proteins composed of two active domains that act as signalling molecules during embryo development, namely for the development of the nervous and skeletal systems and the formation of the testis cord. While only one Hh gene is found typically in invertebrate genomes, most vertebrates species have three (Sonic hedgehog--Shh; Indian hedgehog--Ihh; and Desert hedgehog--Dhh), each with different expression patterns and functions, which likely helped promote the increasing complexity of vertebrates and their successful diversification. In this study, we used comparative genomic and adaptive evolutionary analyses to characterize the evolution of the Hh genes in vertebrates following the two major whole genome duplication (WGD) events. To overcome the lack of Hh-coding sequences on avian publicly available databases, we used an extensive dataset of 45 avian and three non-avian reptilian genomes to show that birds have all three Hh paralogs. We find suggestions that following the WGD events, vertebrate Hh paralogous genes evolved independently within similar linkage groups and under different evolutionary rates, especially within the catalytic domain. The structural regions around the ion-binding site were identified to be under positive selection in the signaling domain. These findings contrast with those observed in invertebrates, where different lineages that experienced gene duplication retained similar selective constraints in the Hh orthologs. Our results provide new insights on the evolutionary history of the Hh gene family, the functional roles of these paralogs in vertebrate species, and on the location of mutational hotspots.

  12. Genome-Wide DNA Methylation Indicates Silencing of Tumor Suppressor Genes in Uterine Leiomyoma

    Science.gov (United States)

    Navarro, Antonia; Yin, Ping; Monsivais, Diana; Lin, Simon M.; Du, Pan; Wei, Jian-Jun; Bulun, Serdar E.

    2012-01-01

    Background Uterine leiomyomas, or fibroids, represent the most common benign tumor of the female reproductive tract. Fibroids become symptomatic in 30% of all women and up to 70% of African American women of reproductive age. Epigenetic dysregulation of individual genes has been demonstrated in leiomyoma cells; however, the in vivo ge