WorldWideScience

Sample records for comparative genomics based

  1. Comparative genomics beyond sequence-based alignments

    DEFF Research Database (Denmark)

    Þórarinsson, Elfar; Yao, Zizhen; Wiklund, Eric D.;

    2008-01-01

    Recent computational scans for non-coding RNAs (ncRNAs) in multiple organisms have relied on existing multiple sequence alignments. However, as sequence similarity drops, a key signal of RNA structure--frequent compensating base changes--is increasingly likely to cause sequence-based alignment me...

  2. GenColors-based comparative genome databases for small eukaryotic genomes.

    Science.gov (United States)

    Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

    2013-01-01

    Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.

  3. CFGP: a web-based, comparative fungal genomics platform.

    Science.gov (United States)

    Park, Jongsun; Park, Bongsoo; Jung, Kyongyong; Jang, Suwang; Yu, Kwangyul; Choi, Jaeyoung; Kong, Sunghyung; Park, Jaejin; Kim, Seryun; Kim, Hyojeong; Kim, Soonok; Kim, Jihyun F; Blair, Jaime E; Lee, Kwangwon; Kang, Seogchan; Lee, Yong-Hwan

    2008-01-01

    Since the completion of the Saccharomyces cerevisiae genome sequencing project in 1996, the genomes of over 80 fungal species have been sequenced or are currently being sequenced. Resulting data provide opportunities for studying and comparing fungal biology and evolution at the genome level. To support such studies, the Comparative Fungal Genomics Platform (CFGP; http://cfgp.snu.ac.kr), a web-based multifunctional informatics workbench, was developed. The CFGP comprises three layers, including the basal layer, middleware and the user interface. The data warehouse in the basal layer contains standardized genome sequences of 65 fungal species. The middleware processes queries via six analysis tools, including BLAST, ClustalW, InterProScan, SignalP 3.0, PSORT II and a newly developed tool named BLASTMatrix. The BLASTMatrix permits the identification and visualization of genes homologous to a query across multiple species. The Data-driven User Interface (DUI) of the CFGP was built on a new concept of pre-collecting data and post-executing analysis instead of the 'fill-in-the-form-and-press-SUBMIT' user interfaces utilized by most bioinformatics sites. A tool termed Favorite, which supports the management of encapsulated sequence data and provides a personalized data repository to users, is another novel feature in the DUI.

  4. A Web-Based Comparative Genomics Tutorial for Investigating Microbial Genomes

    Directory of Open Access Journals (Sweden)

    Michael Strong

    2009-12-01

    Full Text Available As the number of completely sequenced microbial genomes continues to rise at an impressive rate, it is important to prepare students with the skills necessary to investigate microorganisms at the genomic level. As a part of the core curriculum for first-year graduate students in the biological sciences, we have implemented a web-based tutorial to introduce students to the fields of comparative and functional genomics. The tutorial focuses on recent computational methods for identifying functionally linked genes and proteins on a genome-wide scale and was used to introduce students to the Rosetta Stone, Phylogenetic Profile, conserved Gene Neighbor, and Operon computational methods. Students learned to use a number of publicly available web servers and databases to identify functionally linked genes in the Escherichia coli genome, with emphasis on genome organization and operon structure. The overall effectiveness of the tutorial was assessed based on student evaluations and homework assignments. The tutorial is available to other educators at http://www.doe-mbi.ucla.edu/~strong/m253.php.

  5. Ebolavirus comparative genomics

    DEFF Research Database (Denmark)

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms...

  6. Genomic profiling of oral squamous cell carcinoma by array-based comparative genomic hybridization.

    Directory of Open Access Journals (Sweden)

    Shunichi Yoshioka

    Full Text Available We designed a study to investigate genetic relationships between primary tumors of oral squamous cell carcinoma (OSCC and their lymph node metastases, and to identify genomic copy number aberrations (CNAs related to lymph node metastasis. For this purpose, we collected a total of 42 tumor samples from 25 patients and analyzed their genomic profiles by array-based comparative genomic hybridization. We then compared the genetic profiles of metastatic primary tumors (MPTs with their paired lymph node metastases (LNMs, and also those of LNMs with non-metastatic primary tumors (NMPTs. Firstly, we found that although there were some distinctive differences in the patterns of genomic profiles between MPTs and their paired LNMs, the paired samples shared similar genomic aberration patterns in each case. Unsupervised hierarchical clustering analysis grouped together 12 of the 15 MPT-LNM pairs. Furthermore, similarity scores between paired samples were significantly higher than those between non-paired samples. These results suggested that MPTs and their paired LNMs are composed predominantly of genetically clonal tumor cells, while minor populations with different CNAs may also exist in metastatic OSCCs. Secondly, to identify CNAs related to lymph node metastasis, we compared CNAs between grouped samples of MPTs and LNMs, but were unable to find any CNAs that were more common in LNMs. Finally, we hypothesized that subpopulations carrying metastasis-related CNAs might be present in both the MPT and LNM. Accordingly, we compared CNAs between NMPTs and LNMs, and found that gains of 7p, 8q and 17q were more common in the latter than in the former, suggesting that these CNAs may be involved in lymph node metastasis of OSCC. In conclusion, our data suggest that in OSCCs showing metastasis, the primary and metastatic tumors share similar genomic profiles, and that cells in the primary tumor may tend to metastasize after acquiring metastasis-associated CNAs.

  7. Array-based comparative genomic hybridization for genome-wide screening of DNA copy number in bladder tumors.

    NARCIS (Netherlands)

    Veltman, J.A.; Fridlyand, J.; Pejavar, S.; Olshen, A.B.; Korkola, J.E.; Vries, S. de; Carroll, P.; Kuo, W.L.; Pinkel, D.; Albertson, D.; Cordon-Cardo, C.; Jain, A.N.; Waldman, F.M.

    2003-01-01

    Genome-wide copy number profiles were characterized in 41 primary bladder tumors using array-based comparative genomic hybridization (array CGH). In addition to previously identified alterations in large chromosomal regions, alterations were identified in many small genomic regions, some with high-l

  8. YersiniaBase: a genomic resource and analysis platform for comparative analysis of Yersinia.

    Science.gov (United States)

    Tan, Shi Yang; Dutta, Avirup; Jakubovics, Nicholas S; Ang, Mia Yang; Siow, Cheuk Chuen; Mutha, Naresh Vr; Heydari, Hamed; Wee, Wei Yee; Wong, Guat Jah; Choo, Siew Woh

    2015-01-16

    Yersinia is a Gram-negative bacteria that includes serious pathogens such as the Yersinia pestis, which causes plague, Yersinia pseudotuberculosis, Yersinia enterocolitica. The remaining species are generally considered non-pathogenic to humans, although there is evidence that at least some of these species can cause occasional infections using distinct mechanisms from the more pathogenic species. With the advances in sequencing technologies, many genomes of Yersinia have been sequenced. However, there is currently no specialized platform to hold the rapidly-growing Yersinia genomic data and to provide analysis tools particularly for comparative analyses, which are required to provide improved insights into their biology, evolution and pathogenicity. To facilitate the ongoing and future research of Yersinia, especially those generally considered non-pathogenic species, a well-defined repository and analysis platform is needed to hold the Yersinia genomic data and analysis tools for the Yersinia research community. Hence, we have developed the YersiniaBase, a robust and user-friendly Yersinia resource and analysis platform for the analysis of Yersinia genomic data. YersiniaBase has a total of twelve species and 232 genome sequences, of which the majority are Yersinia pestis. In order to smooth the process of searching genomic data in a large database, we implemented an Asynchronous JavaScript and XML (AJAX)-based real-time searching system in YersiniaBase. Besides incorporating existing tools, which include JavaScript-based genome browser (JBrowse) and Basic Local Alignment Search Tool (BLAST), YersiniaBase also has in-house developed tools: (1) Pairwise Genome Comparison tool (PGC) for comparing two user-selected genomes; (2) Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomics analysis of Yersinia genomes; (3) YersiniaTree for constructing phylogenetic tree of Yersinia. We ran analyses based on the tools and genomic data in YersiniaBase and the

  9. Ebolavirus comparative genomics

    Science.gov (United States)

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). PMID:26175035

  10. Genome-based comparative analyses of Antarctic and temperate species of Paenibacillus.

    Directory of Open Access Journals (Sweden)

    Melissa Dsouza

    Full Text Available Antarctic soils represent a unique environment characterised by extremes of temperature, salinity, elevated UV radiation, low nutrient and low water content. Despite the harshness of this environment, members of 15 bacterial phyla have been identified in soils of the Ross Sea Region (RSR. However, the survival mechanisms and ecological roles of these phyla are largely unknown. The aim of this study was to investigate whether strains of Paenibacillus darwinianus owe their resilience to substantial genomic changes. For this, genome-based comparative analyses were performed on three P. darwinianus strains, isolated from gamma-irradiated RSR soils, together with nine temperate, soil-dwelling Paenibacillus spp. The genome of each strain was sequenced to over 1,000-fold coverage, then assembled into contigs totalling approximately 3 Mbp per genome. Based on the occurrence of essential, single-copy genes, genome completeness was estimated at approximately 88%. Genome analysis revealed between 3,043-3,091 protein-coding sequences (CDSs, primarily associated with two-component systems, sigma factors, transporters, sporulation and genes induced by cold-shock, oxidative and osmotic stresses. These comparative analyses provide an insight into the metabolic potential of P. darwinianus, revealing potential adaptive mechanisms for survival in Antarctic soils. However, a large proportion of these mechanisms were also identified in temperate Paenibacillus spp., suggesting that these mechanisms are beneficial for growth and survival in a range of soil environments. These analyses have also revealed that the P. darwinianus genomes contain significantly fewer CDSs and have a lower paralogous content. Notwithstanding the incompleteness of the assemblies, the large differences in genome sizes, determined by the number of genes in paralogous clusters and the CDS content, are indicative of genome content scaling. Finally, these sequences are a resource for further

  11. Application of Microarray-Based Comparative Genomic Hybridization in Prenatal and Postnatal Settings: Three Case Reports

    Directory of Open Access Journals (Sweden)

    Jing Liu

    2011-01-01

    Full Text Available Microarray-based comparative genomic hybridization (array CGH is a newly emerged molecular cytogenetic technique for rapid evaluation of the entire genome with sub-megabase resolution. It allows for the comprehensive investigation of thousands and millions of genomic loci at once and therefore enables the efficient detection of DNA copy number variations (a.k.a, cryptic genomic imbalances. The development and the clinical application of array CGH have revolutionized the diagnostic process in patients and has provided a clue to many unidentified or unexplained diseases which are suspected to have a genetic cause. In this paper, we present three clinical cases in both prenatal and postnatal settings. Among all, array CGH played a major discovery role to reveal the cryptic and/or complex nature of chromosome arrangements. By identifying the genetic causes responsible for the clinical observation in patients, array CGH has provided accurate diagnosis and appropriate clinical management in a timely and efficient manner.

  12. Comprehensive genome characterization of solitary fibrous tumors using high-resolution array-based comparative genomic hybridization.

    Science.gov (United States)

    Bertucci, François; Bouvier-Labit, Corinne; Finetti, Pascal; Adélaïde, José; Metellus, Philippe; Mokhtari, Karima; Decouvelaere, Anne-Valérie; Miquel, Catherine; Jouvet, Anne; Figarella-Branger, Dominique; Pedeutour, Florence; Chaffanet, Max; Birnbaum, Daniel

    2013-02-01

    Solitary fibrous tumors (SFTs) are rare spindle cell tumors with limited therapeutic options. Their molecular basis is poorly known. No consistent cytogenetic abnormality has been reported. We used high-resolution whole-genome array-based comparative genomic hybridization (Agilent 244K oligonucleotide chips) to profile 47 samples, meningeal in >75% of cases. Few copy number aberrations (CNAs) were observed. Sixty-eight percent of samples did not show any gene CNA after exclusion of probes located in regions with referenced copy number variation (CNV). Only low-level CNAs were observed. The genomic profiles were very homogeneous among samples. No molecular class was revealed by clustering of DNA copy numbers. All cases displayed a "simplex" profile. No recurrent CNA was identified. Imbalances occurring in >20%, such as the gain of 8p11.23-11.22 region, contained known CNVs. The 13q14.11-13q31.1 region (lost in 4% of cases) was the largest altered region and contained the lowest percentage of genes with referenced CNVs. A total of 425 genes without CNV showed copy number transition in at least one sample, but only but only 1 in at least 10% of samples. The genomic profiles of meningeal and extra-meningeal cases did not show any differences.

  13. Genomic organization, annotation, and ligand-receptor inferences of chicken chemokines and chemokine receptor genes based on comparative genomics

    Directory of Open Access Journals (Sweden)

    Sze Sing-Hoi

    2005-03-01

    Full Text Available Abstract Background Chemokines and their receptors play important roles in host defense, organogenesis, hematopoiesis, and neuronal communication. Forty-two chemokines and 19 cognate receptors have been found in the human genome. Prior to this report, only 11 chicken chemokines and 7 receptors had been reported. The objectives of this study were to systematically identify chicken chemokines and their cognate receptor genes in the chicken genome and to annotate these genes and ligand-receptor binding by a comparative genomics approach. Results Twenty-three chemokine and 14 chemokine receptor genes were identified in the chicken genome. All of the chicken chemokines contained a conserved CC, CXC, CX3C, or XC motif, whereas all the chemokine receptors had seven conserved transmembrane helices, four extracellular domains with a conserved cysteine, and a conserved DRYLAIV sequence in the second intracellular domain. The number of coding exons in these genes and the syntenies are highly conserved between human, mouse, and chicken although the amino acid sequence homologies are generally low between mammalian and chicken chemokines. Chicken genes were named with the systematic nomenclature used in humans and mice based on phylogeny, synteny, and sequence homology. Conclusion The independent nomenclature of chicken chemokines and chemokine receptors suggests that the chicken may have ligand-receptor pairings similar to mammals. All identified chicken chemokines and their cognate receptors were identified in the chicken genome except CCR9, whose ligand was not identified in this study. The organization of these genes suggests that there were a substantial number of these genes present before divergence between aves and mammals and more gene duplications of CC, CXC, CCR, and CXCR subfamilies in mammals than in aves after the divergence.

  14. An HMM-based comparative genomic framework for detecting introgression in eukaryotes.

    Directory of Open Access Journals (Sweden)

    Kevin J Liu

    2014-06-01

    Full Text Available One outcome of interspecific hybridization and subsequent effects of evolutionary forces is introgression, which is the integration of genetic material from one species into the genome of an individual in another species. The evolution of several groups of eukaryotic species has involved hybridization, and cases of adaptation through introgression have been already established. In this work, we report on PhyloNet-HMM-a new comparative genomic framework for detecting introgression in genomes. PhyloNet-HMM combines phylogenetic networks with hidden Markov models (HMMs to simultaneously capture the (potentially reticulate evolutionary history of the genomes and dependencies within genomes. A novel aspect of our work is that it also accounts for incomplete lineage sorting and dependence across loci. Application of our model to variation data from chromosome 7 in the mouse (Mus musculus domesticus genome detected a recently reported adaptive introgression event involving the rodent poison resistance gene Vkorc1, in addition to other newly detected introgressed genomic regions. Based on our analysis, it is estimated that about 9% of all sites within chromosome 7 are of introgressive origin (these cover about 13 Mbp of chromosome 7, and over 300 genes. Further, our model detected no introgression in a negative control data set. We also found that our model accurately detected introgression and other evolutionary processes from synthetic data sets simulated under the coalescent model with recombination, isolation, and migration. Our work provides a powerful framework for systematic analysis of introgression while simultaneously accounting for dependence across sites, point mutations, recombination, and ancestral polymorphism.

  15. Array comparative genomic hybridization-based characterization of genetic alterations in pulmonary neuroendocrine tumors

    OpenAIRE

    Voortman, Johannes; Lee, Jih-Hsiang; Killian, Jonathan Keith; Suuriniemi, Miia; Wang, Yonghong; Lucchi, Marco; Smith, William I; Meltzer, Paul; Wang, Yisong; Giaccone, Giuseppe

    2010-01-01

    The goal of this study was to characterize and classify pulmonary neuroendocrine tumors based on array comparative genomic hybridization (aCGH). Using aCGH, we performed karyotype analysis of 33 small cell lung cancer (SCLC) tumors, 13 SCLC cell lines, 19 bronchial carcinoids, and 9 gastrointestinal carcinoids. In contrast to the relatively conserved karyotypes of carcinoid tumors, the karyotypes of SCLC tumors and cell lines were highly aberrant. High copy number (CN) gains were detected in ...

  16. Comparative genomics of Eukaryotes

    NARCIS (Netherlands)

    Noort, Vera van

    2007-01-01

    This thesis focuses on developing comparative genomics methods in eukaryotes, with an emphasis on applications for gene function prediction and regulatory element detection. In the past, methods have been developed to predict functional associations between gene pairs in prokaryotes. The challenge

  17. UniPrimer: A Web-Based Primer Design Tool for Comparative Analyses of Primate Genomes

    Directory of Open Access Journals (Sweden)

    Nomin Batnyam

    2012-01-01

    Full Text Available Whole genome sequences of various primates have been released due to advanced DNA-sequencing technology. A combination of computational data mining and the polymerase chain reaction (PCR assay to validate the data is an excellent method for conducting comparative genomics. Thus, designing primers for PCR is an essential procedure for a comparative analysis of primate genomes. Here, we developed and introduced UniPrimer for use in those studies. UniPrimer is a web-based tool that designs PCR- and DNA-sequencing primers. It compares the sequences from six different primates (human, chimpanzee, gorilla, orangutan, gibbon, and rhesus macaque and designs primers on the conserved region across species. UniPrimer is linked to RepeatMasker, Primer3Plus, and OligoCalc softwares to produce primers with high accuracy and UCSC In-Silico PCR to confirm whether the designed primers work. To test the performance of UniPrimer, we designed primers on sample sequences using UniPrimer and manually designed primers for the same sequences. The comparison of the two processes showed that UniPrimer was more effective than manual work in terms of saving time and reducing errors.

  18. A RAD-based linkage map and comparative genomics in the gudgeons (genus Gnathopogon, Cyprinidae

    Directory of Open Access Journals (Sweden)

    Kakioka Ryo

    2013-01-01

    Full Text Available Abstract Background The construction of linkage maps is a first step in exploring the genetic basis for adaptive phenotypic divergence in closely related species by quantitative trait locus (QTL analysis. Linkage maps are also useful for comparative genomics in non-model organisms. Advances in genomics technologies make it more feasible than ever to study the genetics of adaptation in natural populations. Restriction-site associated DNA (RAD sequencing in next-generation sequencers facilitates the development of many genetic markers and genotyping. We aimed to construct a linkage map of the gudgeons of the genus Gnathopogon (Cyprinidae for comparative genomics with the zebrafish Danio rerio (a member of the same family as gudgeons and for the future QTL analysis of the genetic architecture underlying adaptive phenotypic evolution of Gnathopogon. Results We constructed the first genetic linkage map of Gnathopogon using a 198 F2 interspecific cross between two closely related species in Japan: river-dwelling Gnathopogon elongatus and lake-dwelling Gnathopogon caerulescens. Based on 1,622 RAD-tag markers, a linkage map spanning 1,390.9 cM with 25 linkage groups and an average marker interval of 0.87 cM was constructed. We also identified a region involving female-specific transmission ratio distortion (TRD. Synteny and collinearity were extensively conserved between Gnathopogon and zebrafish. Conclusions The dense SNP-based linkage map presented here provides a basis for future QTL analysis. It will also be useful for transferring genomic information from a “traditional” model fish species, zebrafish, to screen candidate genes underlying ecologically important traits of the gudgeons.

  19. A RAD-based linkage map and comparative genomics in the gudgeons (genus Gnathopogon, Cyprinidae)

    Science.gov (United States)

    2013-01-01

    Background The construction of linkage maps is a first step in exploring the genetic basis for adaptive phenotypic divergence in closely related species by quantitative trait locus (QTL) analysis. Linkage maps are also useful for comparative genomics in non-model organisms. Advances in genomics technologies make it more feasible than ever to study the genetics of adaptation in natural populations. Restriction-site associated DNA (RAD) sequencing in next-generation sequencers facilitates the development of many genetic markers and genotyping. We aimed to construct a linkage map of the gudgeons of the genus Gnathopogon (Cyprinidae) for comparative genomics with the zebrafish Danio rerio (a member of the same family as gudgeons) and for the future QTL analysis of the genetic architecture underlying adaptive phenotypic evolution of Gnathopogon. Results We constructed the first genetic linkage map of Gnathopogon using a 198 F2 interspecific cross between two closely related species in Japan: river-dwelling Gnathopogon elongatus and lake-dwelling Gnathopogon caerulescens. Based on 1,622 RAD-tag markers, a linkage map spanning 1,390.9 cM with 25 linkage groups and an average marker interval of 0.87 cM was constructed. We also identified a region involving female-specific transmission ratio distortion (TRD). Synteny and collinearity were extensively conserved between Gnathopogon and zebrafish. Conclusions The dense SNP-based linkage map presented here provides a basis for future QTL analysis. It will also be useful for transferring genomic information from a “traditional” model fish species, zebrafish, to screen candidate genes underlying ecologically important traits of the gudgeons. PMID:23324215

  20. Carotenoid Biosynthesis in Cyanobacteria: Structural and Evolutionary Scenarios Based on Comparative Genomics

    Directory of Open Access Journals (Sweden)

    Chengwei Liang , Fangqing Zhao , Wei Wei , Zhangxiao Wen , Song Qin

    2006-01-01

    Full Text Available Carotenoids are widely distributed pigments in nature and their biosynthetic pathway has been extensively studied in various organisms. The recent access to the overwhelming amount genomic data of cyanobacteria has given birth to a novel approach called comparative genomics. The putative enzymes involved in the carotenoid biosynthesis among the cyanobacteria were determined by similarity-based tools. The reconstruction of biosynthetic pathway was based on the related enzymes. It is interesting to find that nearly all the cyanobacteria share quite similar pathway to synthesize β-carotene except for Gloeobacter violaceus PCC 7421. The enzymes, crtE-B-P-Qb-L, involved in the upstream pathway are more conserved than the subsequent ones (crtW-R. In addition, many carotenoid synthesis enzymes exhibit diversity in structure and function. Such examples in the families of ζ –carotene desaturase, lycopene cylases and carotene ketolases were described in this article. When we mapped these crt genes to the cyanobacterial genomes, the crt genes showed great structural variation among species. All of them are dispersed on the whole chromosome in contrast to the linear adjacent distribution of the crt gene cluster in other eubacteria. Moreover, in unicellular cyanobacteria, each step of the carotenogenic pathway is usually catalyzed by one gene product, whereas multiple ketolase genes are found in filamentous cyanobacteria. Such increased numbers of crt genes and their correlation to the ecological adaptation were carefully discussed.

  1. Rodent malaria parasites : genome organization & comparative genomics

    NARCIS (Netherlands)

    Kooij, Taco W.A.

    2006-01-01

    The aim of the studies described in this thesis was to investigate the genome organization of rodent malaria parasites (RMPs) and compare the organization and gene content of the genomes of RMPs and the human malaria parasite P. falciparum. The release of the complete genome sequence of P. falciparu

  2. Rodent malaria parasites : genome organization & comparative genomics

    NARCIS (Netherlands)

    Kooij, Taco W.A.

    2006-01-01

    The aim of the studies described in this thesis was to investigate the genome organization of rodent malaria parasites (RMPs) and compare the organization and gene content of the genomes of RMPs and the human malaria parasite P. falciparum. The release of the complete genome sequence of P.

  3. 1-Mb resolution array-based comparative genomic hybridization using a BAC clone set optimized for cancer gene analysis

    NARCIS (Netherlands)

    Greshock, J; Naylor, TL; Margolin, A; Diskin, S; Cleaver, SH; Futreal, PA; deJong, PJ; Zhao, SY; Liebman, M; Weber, BL

    2004-01-01

    Array-based comparative genomic hybridization (aCGH) is a recently developed tool for genome-wide determination of DNA copy number alterations. This technology has tremendous potential for disease-gene discovery in cancer and developmental disorders as well as numerous other applications. However, w

  4. Phytozome Comparative Plant Genomics Portal

    Energy Technology Data Exchange (ETDEWEB)

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: 1. Understand and accelerate the improvement (domestication) of bioenergy crops 2. Characterize and moderate plant response to climate change 3. Use comparative genomics to identify constrained elements and infer gene function 4. Build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work 5. Expand functional genomic resources for Plant Flagship genomes

  5. Enhancer Identification through Comparative Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Visel, Axel; Bristow, James; Pennacchio, Len A.

    2006-10-01

    With the availability of genomic sequence from numerousvertebrates, a paradigm shift has occurred in the identification ofdistant-acting gene regulatory elements. In contrast to traditionalgene-centric studies in which investigators randomly scanned genomicfragments that flank genes of interest in functional assays, the modernapproach begins electronically with publicly available comparativesequence datasets that provide investigators with prioritized lists ofputative functional sequences based on their evolutionary conservation.However, although a large number of tools and resources are nowavailable, application of comparative genomic approaches remains far fromtrivial. In particular, it requires users to dynamically consider thespecies and methods for comparison depending on the specific biologicalquestion under investigation. While there is currently no single generalrule to this end, it is clear that when applied appropriately,comparative genomic approaches exponentially increase our power ingenerating biological hypotheses for subsequent experimentaltesting.

  6. Genome Mapping in Plant Comparative Genomics.

    Science.gov (United States)

    Chaney, Lindsay; Sharp, Aaron R; Evans, Carrie R; Udall, Joshua A

    2016-09-01

    Genome mapping produces fingerprints of DNA sequences to construct a physical map of the whole genome. It provides contiguous, long-range information that complements and, in some cases, replaces sequencing data. Recent advances in genome-mapping technology will better allow researchers to detect large (>1kbp) structural variations between plant genomes. Some molecular and informatics complications need to be overcome for this novel technology to achieve its full utility. This technology will be useful for understanding phenotype responses due to DNA rearrangements and will yield insights into genome evolution, particularly in polyploids. In this review, we outline recent advances in genome-mapping technology, including the processes required for data collection and analysis, and applications in plant comparative genomics.

  7. Array comparative genomic hybridization-based characterization of genetic alterations in pulmonary neuroendocrine tumors.

    Science.gov (United States)

    Voortman, Johannes; Lee, Jih-Hsiang; Killian, Jonathan Keith; Suuriniemi, Miia; Wang, Yonghong; Lucchi, Marco; Smith, William I; Meltzer, Paul; Wang, Yisong; Giaccone, Giuseppe

    2010-07-20

    The goal of this study was to characterize and classify pulmonary neuroendocrine tumors based on array comparative genomic hybridization (aCGH). Using aCGH, we performed karyotype analysis of 33 small cell lung cancer (SCLC) tumors, 13 SCLC cell lines, 19 bronchial carcinoids, and 9 gastrointestinal carcinoids. In contrast to the relatively conserved karyotypes of carcinoid tumors, the karyotypes of SCLC tumors and cell lines were highly aberrant. High copy number (CN) gains were detected in SCLC tumors and cell lines in cytogenetic bands encoding JAK2, FGFR1, and MYC family members. In some of those samples, the CN of these genes exceeded 100, suggesting that they could represent driver alterations and potential drug targets in subgroups of SCLC patients. In SCLC tumors, as well as bronchial carcinoids and carcinoids of gastrointestinal origin, recurrent CN alterations were observed in 203 genes, including the RB1 gene and 59 microRNAs of which 51 locate in the DLK1-DIO3 domain. These findings suggest the existence of partially shared CN alterations in these tumor types. In contrast, CN alterations of the TP53 gene and the MYC family members were predominantly observed in SCLC. Furthermore, we demonstrated that the aCGH profile of SCLC cell lines highly resembles that of clinical SCLC specimens. Finally, by analyzing potential drug targets, we provide a genomics-based rationale for targeting the AKT-mTOR and apoptosis pathways in SCLC.

  8. Application of Array-based Comparative Genome Hybridization in Children with Developmental Delay or Mental Retardation

    Directory of Open Access Journals (Sweden)

    Jao-Shwann Liang

    2008-12-01

    Full Text Available Children with developmental delay or mental retardation (DD/MR are commonly en countered in child neurology clinics, and establishing an etiologic diagnosis is a challenge for child neurologists. Among the etiologies, chromosomal imbalance is one of the most important causes. However, many of these chromosomal imbalances are submicroscopic and cannot be detected by conventional cytogenetic methods. Microarray-based comparative genomic hybridization (array CGH is considered to be superior in the investigation of chromosomal deletions or duplications in children with DD/MR, and has been demonstrated to improve the diagnostic detection rate for these small chromosomal abnormalities. Here, we review the recent studies of array CGH in the evaluation of patients with idiopathic DD/MR.

  9. Gene expression profiles in squamous cell cervical carcinoma using array-based comparative genomic hybridization analysis.

    Science.gov (United States)

    Choi, Y-W; Bae, S M; Kim, Y-W; Lee, H N; Kim, Y W; Park, T C; Ro, D Y; Shin, J C; Shin, S J; Seo, J-S; Ahn, W S

    2007-01-01

    Our aim was to identify novel genomic regions of interest and provide highly dynamic range information on correlation between squamous cell cervical carcinoma and its related gene expression patterns by a genome-wide array-based comparative genomic hybridization (array-CGH). We analyzed 15 cases of cervical cancer from KangNam St Mary's Hospital of the Catholic University of Korea. Microdissection assay was performed to obtain DNA samples from paraffin-embedded cervical tissues of cancer as well as of the adjacent normal tissues. The bacterial artificial chromosome (BAC) array used in this study consisted of 1440 human BACs and the space among the clones was 2.08 Mb. All the 15 cases of cervical cancer showed the differential changes of the cervical cancer-associated genetic alterations. The analysis limit of average gains and losses was 53%. A significant positive correlation was found in 8q24.3, 1p36.32, 3q27.1, 7p21.1, 11q13.1, and 3p14.2 changes through the cervical carcinogenesis. The regions of high level of gain were 1p36.33-1p36.32, 8q24.3, 16p13.3, 1p36.33, 3q27.1, and 7p21.1. And the regions of homozygous loss were 2q12.1, 22q11.21, 3p14.2, 6q24.3, 7p15.2, and 11q25. In the high level of gain regions, GSDMDC1, RECQL4, TP73, ABCF3, ALG3, HDAC9, ESRRA, and RPS6KA4 were significantly correlated with cervical cancer. The genes encoded by frequently lost clones were PTPRG, GRM7, ZDHHC3, EXOSC7, LRP1B, and NR3C2. Therefore, array-CGH analyses showed that specific genomic alterations were maintained in cervical cancer that were critical to the malignant phenotype and may give a chance to find out possible target genes present in the gained or lost clones.

  10. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available Few studies investigated the donkey (Equus asinus at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca. The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing and Ion Torrent (RRL runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  11. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Science.gov (United States)

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  12. CFGP 2.0: a versatile web-based platform for supporting comparative and evolutionary genomics of fungi and Oomycetes.

    Science.gov (United States)

    Choi, Jaeyoung; Cheong, Kyeongchae; Jung, Kyongyong; Jeon, Jongbum; Lee, Gir-Won; Kang, Seogchan; Kim, Sangsoo; Lee, Yin-Won; Lee, Yong-Hwan

    2013-01-01

    In 2007, Comparative Fungal Genomics Platform (CFGP; http://cfgp.snu.ac.kr/) was publicly open with 65 genomes corresponding to 58 fungal and Oomycete species. The CFGP provided six bioinformatics tools, including a novel tool entitled BLASTMatrix that enables search homologous genes to queries in multiple species simultaneously. CFGP also introduced Favorite, a personalized virtual space for data storage and analysis with these six tools. Since 2007, CFGP has grown to archive 283 genomes corresponding to 152 fungal and Oomycete species as well as 201 genomes that correspond to seven bacteria, 39 plants and 105 animals. In addition, the number of tools in Favorite increased to 27. The Taxonomy Browser of CFGP 2.0 allows users to interactively navigate through a large number of genomes according to their taxonomic positions. The user interface of BLASTMatrix was also improved to facilitate subsequent analyses of retrieved data. A newly developed genome browser, Seoul National University Genome Browser (SNUGB), was integrated into CFGP 2.0 to support graphical presentation of diverse genomic contexts. Based on the standardized genome warehouse of CFGP 2.0, several systematic platforms designed to support studies on selected gene families have been developed. Most of them are connected through Favorite to allow of sharing data across the platforms.

  13. Using Array-Based Comparative Genomic Hybridization to Diagnose Pallister-Killian Syndrome.

    Science.gov (United States)

    Lee, Mi Na; Lee, Jiwon; Yu, Hee Joon; Lee, Jeehun; Kim, Sun Hee

    2017-01-01

    Pallister-Killian syndrome (PKS) is a rare multisystem disorder characterized by isochromosome 12p and tissue-limited mosaic tetrasomy 12p. In this study, we diagnosed three pediatric patients who were suspicious of having PKS using array-based comparative genomic hybridization (array CGH) and FISH analyses performed on peripheral lymphocytes. Patients 1 and 2 presented with craniofacial dysmorphic features, hypotonia, and a developmental delay. Array CGH revealed two to three copies of 12p in patient 1 and three copies in patient 2. FISH analysis showed trisomy or tetrasomy 12p. Patient 3, who had clinical features comparable to those of patients 1 and 2, was diagnosed by using FISH analysis alone. Here, we report three patients with mosaic tetrasomy 12p. There have been only reported cases diagnosed by chromosome analysis and FISH analysis on skin fibroblast or amniotic fluid. To our knowledge, patient 1 was the first case diagnosed by using array CGH performed on peripheral lymphocytes in Korea.

  14. Stochastic segmentation models for array-based comparative genomic hybridization data analysis.

    Science.gov (United States)

    Lai, Tze Leung; Xing, Haipeng; Zhang, Nancy

    2008-04-01

    Array-based comparative genomic hybridization (array-CGH) is a high throughput, high resolution technique for studying the genetics of cancer. Analysis of array-CGH data typically involves estimation of the underlying chromosome copy numbers from the log fluorescence ratios and segmenting the chromosome into regions with the same copy number at each location. We propose for the analysis of array-CGH data, a new stochastic segmentation model and an associated estimation procedure that has attractive statistical and computational properties. An important benefit of this Bayesian segmentation model is that it yields explicit formulas for posterior means, which can be used to estimate the signal directly without performing segmentation. Other quantities relating to the posterior distribution that are useful for providing confidence assessments of any given segmentation can also be estimated by using our method. We propose an approximation method whose computation time is linear in sequence length which makes our method practically applicable to the new higher density arrays. Simulation studies and applications to real array-CGH data illustrate the advantages of the proposed approach.

  15. Application of Array-Based Comparative Genomic Hybridization to Pediatric Neurologic Diseases

    OpenAIRE

    2013-01-01

    Purpose Array comparative genomic hybridization (array-CGH) is a technique used to analyze quantitative increase or decrease of chromosomes by competitive DNA hybridization of patients and controls. This study aimed to evaluate the benefits and yield of array-CGH in comparison with conventional karyotyping in pediatric neurology patients. Materials and Methods We included 87 patients from the pediatric neurology clinic with at least one of the following features: developmental delay, mental r...

  16. Identification of genetic bases of vibrio fluvialis species-specific biochemical pathways and potential virulence factors by comparative genomic analysis.

    Science.gov (United States)

    Lu, Xin; Liang, Weili; Wang, Yunduan; Xu, Jialiang; Zhu, Jun; Kan, Biao

    2014-03-01

    Vibrio fluvialis is an important food-borne pathogen that causes diarrheal illness and sometimes extraintestinal infections in humans. In this study, we sequenced the genome of a clinical V. fluvialis strain and determined its phylogenetic relationships with other Vibrio species by comparative genomic analysis. We found that the closest relationship was between V. fluvialis and V. furnissii, followed by those with V. cholerae and V. mimicus. Moreover, based on genome comparisons and gene complementation experiments, we revealed genetic mechanisms of the biochemical tests that differentiate V. fluvialis from closely related species. Importantly, we identified a variety of genes encoding potential virulence factors, including multiple hemolysins, transcriptional regulators, and environmental survival and adaptation apparatuses, and the type VI secretion system, which is indicative of complex regulatory pathways modulating pathogenesis in this organism. The availability of V. fluvialis genome sequences may promote our understanding of pathogenic mechanisms for this emerging pathogen.

  17. Comparative Microbial Genomics and Forensics.

    Science.gov (United States)

    Massey, Steven E

    2016-08-01

    Forensic science concerns the application of scientific techniques to questions of a legal nature and may also be used to address questions of historical importance. Forensic techniques are often used in legal cases that involve crimes against persons or property, and they increasingly may involve cases of bioterrorism, crimes against nature, medical negligence, or tracing the origin of food- and crop-borne disease. Given the rapid advance of genome sequencing and comparative genomics techniques, we ask how these might be used to address cases of a forensic nature, focusing on the use of microbial genome sequence analysis. Such analyses rely on the increasingly large numbers of microbial genomes present in public databases, the ability of individual investigators to rapidly sequence whole microbial genomes, and an increasing depth of understanding of their evolution and function. Suggestions are made as to how comparative microbial genomics might be applied forensically and may represent possibilities for the future development of forensic techniques. A particular emphasis is on the nascent field of genomic epidemiology, which utilizes rapid whole-genome sequencing to identify the source and spread of infectious outbreaks. Also discussed is the application of comparative microbial genomics to the study of historical epidemics and deaths and how the approaches developed may also be applicable to more recent and actionable cases.

  18. Cloud computing for comparative genomics.

    Science.gov (United States)

    Wall, Dennis P; Kudtarkar, Parul; Fusaro, Vincent A; Pivovarov, Rimma; Patil, Prasad; Tonellato, Peter J

    2010-05-18

    Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems.

  19. Association between chromosomal aberration of COX8C and tethered spinal cord syndrome: array-based comparative genomic hybridization analysis

    Directory of Open Access Journals (Sweden)

    Qiu-jiong Zhao

    2016-01-01

    Full Text Available Copy number variations have been found in patients with neural tube abnormalities. In this study, we performed genome-wide screening using high-resolution array-based comparative genomic hybridization in three children with tethered spinal cord syndrome and two healthy parents. Of eight copy number variations, four were non-polymorphic. These non-polymorphic copy number variations were associated with Angelman and Prader-Willi syndromes, and microcephaly. Gene function enrichment analysis revealed that COX8C, a gene associated with metabolic disorders of the nervous system, was located in the copy number variation region of Patient 1. Our results indicate that array-based comparative genomic hybridization can be used to diagnose tethered spinal cord syndrome. Our results may help determine the pathogenesis of tethered spinal cord syndrome and prevent occurrence of this disease.

  20. Comparative genomic hybridization: practical guidelines.

    NARCIS (Netherlands)

    Jeuken, J.W.M.; Sprenger, S.H.; Wesseling, P.

    2002-01-01

    Comparative genomic hybridization (CGH) is a technique used to identify copy number changes throughout a genome. Until now, hundreds of CGH studies have been published reporting chromosomal imbalances in a large variety of human neoplasms. Additionally, technical improvements of specific steps in a

  1. Evidence-based annotation of the malaria parasite's genome using comparative expression profiling.

    Directory of Open Access Journals (Sweden)

    Yingyao Zhou

    Full Text Available A fundamental problem in systems biology and whole genome sequence analysis is how to infer functions for the many uncharacterized proteins that are identified, whether they are conserved across organisms of different phyla or are phylum-specific. This problem is especially acute in pathogens, such as malaria parasites, where genetic and biochemical investigations are likely to be more difficult. Here we perform comparative expression analysis on Plasmodium parasite life cycle data derived from P. falciparum blood, sporozoite, zygote and ookinete stages, and P. yoelii mosquito oocyst and salivary gland sporozoites, blood and liver stages and show that type II fatty acid biosynthesis genes are upregulated in liver and insect stages relative to asexual blood stages. We also show that some universally uncharacterized genes with orthologs in Plasmodium species, Saccharomyces cerevisiae and humans show coordinated transcription patterns in large collections of human and yeast expression data and that the function of the uncharacterized genes can sometimes be predicted based on the expression patterns across these diverse organisms. We also use a comprehensive and unbiased literature mining method to predict which uncharacterized parasite-specific genes are likely to have roles in processes such as gliding motility, host-cell interactions, sporozoite stage, or rhoptry function. These analyses, together with protein-protein interaction data, provide probabilistic models that predict the function of 926 uncharacterized malaria genes and also suggest that malaria parasites may provide a simple model system for the study of some human processes. These data also provide a foundation for further studies of transcriptional regulation in malaria parasites.

  2. [Study on an inquiry-based teaching case in genomics curriculum: identifying virulence factors of Escherichia coli by using comparative genomics].

    Science.gov (United States)

    Jidong, Zhou; Yudong, Li

    2015-02-01

    Genomics is the core subject of various "omics" and it also becomes a topic of increasing interest in undergraduate curricula of biological sciences. However, the study on teaching methodology of genomics courses was very limited so far. Here we report an application of inquiry-based teaching in genomics courses by using virulence factors of Escherichia coli as an example of comparative genomics study. Specially, students first built a multiple-genome alignment of different E. coli strains to investigate the gene conservation using the Mauve tool; then putative virulence factor genes were identified by using BLAST tool to obtain gene annotations. The teaching process was divided into five modules: situation, resources, task, process and evaluation. Learning-assessment results revealed that students had acquired the knowledge and skills of genomics, and their learning interest and ability of self-study were also motivated. Moreover, the special teaching case can be applied to other related courses, such as microbiology, bioinformatics, molecular biology and food safety detection technology.

  3. Comparative genomics of Listeria species.

    Science.gov (United States)

    Glaser, P; Frangeul, L; Buchrieser, C; Rusniok, C; Amend, A; Baquero, F; Berche, P; Bloecker, H; Brandt, P; Chakraborty, T; Charbit, A; Chetouani, F; Couvé, E; de Daruvar, A; Dehoux, P; Domann, E; Domínguez-Bernal, G; Duchaud, E; Durant, L; Dussurget, O; Entian, K D; Fsihi, H; García-del Portillo, F; Garrido, P; Gautier, L; Goebel, W; Gómez-López, N; Hain, T; Hauf, J; Jackson, D; Jones, L M; Kaerst, U; Kreft, J; Kuhn, M; Kunst, F; Kurapkat, G; Madueno, E; Maitournam, A; Vicente, J M; Ng, E; Nedjari, H; Nordsiek, G; Novella, S; de Pablos, B; Pérez-Diaz, J C; Purcell, R; Remmel, B; Rose, M; Schlueter, T; Simoes, N; Tierrez, A; Vázquez-Boland, J A; Voss, H; Wehland, J; Cossart, P

    2001-10-26

    Listeria monocytogenes is a food-borne pathogen with a high mortality rate that has also emerged as a paradigm for intracellular parasitism. We present and compare the genome sequences of L. monocytogenes (2,944,528 base pairs) and a nonpathogenic species, L. innocua (3,011,209 base pairs). We found a large number of predicted genes encoding surface and secreted proteins, transporters, and transcriptional regulators, consistent with the ability of both species to adapt to diverse environments. The presence of 270 L. monocytogenes and 149 L. innocua strain-specific genes (clustered in 100 and 63 islets, respectively) suggests that virulence in Listeria results from multiple gene acquisition and deletion events.

  4. Comparative genomics-based investigation of resequencing targets in Vibrio fischeri: Focus on point miscalls and artefactual expansions

    Directory of Open Access Journals (Sweden)

    Ruby Edward G

    2008-03-01

    Full Text Available Abstract Background Sequence closure often represents the end-point of a genome project, without a system in place for subsequent improvement and refinement. Building on the genome project of Vibrio fischeri ES114, we used a comparative approach to identify and investigate genes that had a high likelihood of sequence error. Results Comparison of the V. fischeri ES114 genome with that of conspecific strain MJ11 identified 82 target loci in ES114 as containing likely errors, and thus of high-priority for resequencing. Analysis of the targets identified 75 loci in which an error had occurred, resulting in the correction of 10,457 base pairs to generate the new ES114 genomic sequence. A majority of the inaccurate loci involved frameshift errors, correction of which fused adjacent ORFs. Although insertions/deletions are thought to be rare in microbial genome assemblies, fourteen of the loci contained extraneous sequence of over 300 bp, likely due to imperfect contig ends that were misassembled in tandem rather than as overlapping segments. Additionally we updated the entire genome annotation with 113 new features including previously uncalled protein-coding genes, regulatory RNA genes and operon leader peptides, and we analyzed the transcriptional apparatus encoded by ES114. Conclusion We demonstrate that errors in microbial genome sequences, thought to largely be confined to point mutations, may also consist of other prevalent large-scale rearrangements such as insertions. Ongoing genome quality control and annotation programs are necessary to accompany technological advancements in data generation. These updates further advance V. fischeri as an important model for understanding intercellular communication and colonization of animal tissue.

  5. dbCRY: a Web-based comparative and evolutionary genomics platform for blue-light receptors.

    Science.gov (United States)

    Kim, Yong-Min; Choi, Jaeyoung; Lee, Hye-Young; Lee, Gir-Won; Lee, Yong-Hwan; Choi, Doil

    2014-01-01

    Cryptochromes are flavoproteins that play a central role in the circadian oscillations of all living organisms except archaea. Cryptochromes are clustered into three subfamilies: plant-type cryptochromes, animal-type cryptochromes and cryptochrome-DASH proteins. These subfamilies are composed of photolyase/cryptochrome superfamily with 6-4 photolyase and cyclobutane pyrimidine dimer photolyase. Cryptochromes have conserved domain architectures with two distinct domains, an N-terminal photolyase-related domain and a C-terminal domain. Although the molecular function and domain architecture of cryptochromes are conserved, their molecular mechanisms differ between plants and animals. Thus, cryptochromes are one of the best candidates for comparative and evolutionary studies. Here, we have developed a Web-based platform for comparative and evolutionary studies of cryptochromes, dbCRY (http://www.dbcryptochrome.org/). A pipeline built upon the consensus domain profile was applied to 1438 genomes and identified 1309 genes. To support comparative and evolutionary genomics studies, the Web interface provides diverse functions such as (i) browsing by species, (ii) protein domain analysis, (iii) multiple sequence alignment, (iv) homology search and (v) extended analysis opportunities through the implementation of 'Favorite Browser' powered by the Comparative Fungal Genomics Platform 2.0 (CFGP 2.0; http://cfgp.snu.ac.kr/). dbCRY would serve as a standardized and systematic solution for cryptochrome genomics studies. Database URL: http://www.dbcryptochrome.org/

  6. Comparative genomics of Helicobacter pylori

    Institute of Scientific and Technical Information of China (English)

    Quan-Jiang Dong; Qing Wang; Ying-Nin Xin; Ni Li; Shi-Ying Xuan

    2009-01-01

    Genomic sequences have been determined for a number of strains of Helicobacter pylori (H pylori) and related bacteria.With the development of microarray analysis and the wide use of subtractive hybridization techniques,comparative studies have been carried out with respect to the interstrain differences between H pylori and inter-species differences in the genome of related bacteria.It was found that the core genome of H pylori constitutes 1111 genes that are determinants of the species properties.A great pool of auxillary genes are mainly from the categories of cag pathogenicity islands,outer membrane proteins,restriction-modification system and hypothetical proteins of unknown function.Persistence of H pylori in the human stomach leads to the diversification of the genome.Comparative genomics suggest that a host jump has occurs from humans to felines.Candidate genes specific for the development of the gastric diseases were identified.With the aid of proteomics,population genetics and other molecular methods,future comparative genomic studies would dramatically promote our understanding of the evolution,pathogenesis and microbiology of H pylori.

  7. Comparative BAC-based mapping in the white-throated sparrow, a novel behavioral genomics model, using interspecies overgo hybridization

    Directory of Open Access Journals (Sweden)

    Gonser Rusty A

    2011-06-01

    Full Text Available Abstract Background The genomics era has produced an arsenal of resources from sequenced organisms allowing researchers to target species that do not have comparable mapping and sequence information. These new "non-model" organisms offer unique opportunities to examine environmental effects on genomic patterns and processes. Here we use comparative mapping as a first step in characterizing the genome organization of a novel animal model, the white-throated sparrow (Zonotrichia albicollis, which occurs as white or tan morphs that exhibit alternative behaviors and physiology. Morph is determined by the presence or absence of a complex chromosomal rearrangement. This species is an ideal model for behavioral genomics because the association between genotype and phenotype is absolute, making it possible to identify the genomic bases of phenotypic variation. Findings We initiated a genomic study in this species by characterizing the white-throated sparrow BAC library via filter hybridization with overgo probes designed for the chicken, turkey, and zebra finch. Cross-species hybridization resulted in 640 positive sparrow BACs assigned to 77 chicken loci across almost all macro-and microchromosomes, with a focus on the chromosomes associated with morph. Out of 216 overgos, 36% of the probes hybridized successfully, with an average number of 3.0 positive sparrow BACs per overgo. Conclusions These data will be utilized for determining chromosomal architecture and for fine-scale mapping of candidate genes associated with phenotypic differences. Our research confirms the utility of interspecies hybridization for developing comparative maps in other non-model organisms.

  8. Detection of genomic imbalances in microdissected Hodgkin and Reed-Sternberg cells of classical Hodgkin's lymphoma by array-based comparative genomic hybridization.

    Science.gov (United States)

    Hartmann, Sylvia; Martin-Subero, José I; Gesk, Stefan; Hüsken, Julia; Giefing, Maciej; Nagel, Inga; Riemke, Jennifer; Chott, Andreas; Klapper, Wolfram; Parrens, Marie; Merlio, Jean-Philippe; Küppers, Ralf; Bräuninger, Andreas; Siebert, Reiner; Hansmann, Martin-Leo

    2008-09-01

    Cytogenetic analysis of classical Hodgkin's lymphoma is limited by the low content of the neoplastic Hodgkin-Reed-Sternberg cells in the affected tissues. However, available cytogenetic data point to an extreme karyotype complexity. To obtain insights into chromosomal imbalances in classical Hodgkin's lymphoma, we applied array-based comparative genomic hybridization (array comparative genomic hybridization) using DNA from microdissected Hodgkin-Reed-Sternberg cells. To avoid biases introduced by DNA amplification for array comparative genomic hybridization, cHL cases rich in Hodgkin-Reed-Sternberg cells were selected. DNA obtained from approximately 100,000 microdissected Hodgkin-Reed-Sternberg cells of each of ten classical Hodgkin's lymphoma cases was hybridized onto commercial 105 K oligonucleotide comparative genomic hybridization microarrays. Selected imbalances were confirmed by interphase cytogenetics and quantitative polymerase chain reaction analysis and further studied in an independent series of classical Hodgkin's lymphoma. Gains identified in at least five cHL affected 2p12-16, 5q15-23, 6p22, 8q13, 8q24, 9p21-24, 9q34, 12q13-14, 17q12, 19p13, 19q13 and 20q11 whereas losses recurrent in at least five cases involved Xp21, 6q23-24 and 13q22. Copy number changes of selected genes and a small deletion (156 kb) of the CDKN2B (p15) gene were confirmed by interphase cytogenetics and polymerase chain reaction analysis, respectively. Several gained regions included genes constitutively expressed in cHL. Among these, gains of STAT6 (12q13), NOTCH1 (9q34) and JUNB (19p13) were present in additional cHL with the usual low Hodgkin-Reed-Sternberg cell content. The present study demonstrates that array comparative genomic hybridization of microdissected Hodgkin-Reed-Sternberg cells is suitable for identifying and characterizing chromosomal imbalances. Regions affected by genomic changes in Hodgkin-Reed-Sternberg cells recurrently include genes constitutively

  9. Comparative genomics of Dothideomycete fungi

    NARCIS (Netherlands)

    Burgt, van der A.

    2014-01-01

    Fungi are a diverse group of eukaryotic micro-organisms particularly suited for comparative genomics analyses. Fungi are important to industry, fundamental science and many of them are notorious pathogens of crops, thereby endangering global food supply. Dozens of fungi have been sequenced in the la

  10. Microarray-based Comparative Genomic Indexing of the Cronobacter genus (Enterobacter sakazakii)

    Science.gov (United States)

    Cronobacter is a recently defined genus synonymous with Enterobacter sakazakii. This new genus currently comprises 6 genomospecies. To extend our understanding of the genetic relationship between Cronobacter sakazakii BAA-894 and the other species of this genus, microarray-based comparative genomi...

  11. Pyrosequencing-based comparative genome analysis of the nosocomial pathogen Enterococcus faecium and identification of a large transferable pathogenicity island

    Directory of Open Access Journals (Sweden)

    Bonten Marc JM

    2010-04-01

    Full Text Available Abstract Background The Gram-positive bacterium Enterococcus faecium is an important cause of nosocomial infections in immunocompromized patients. Results We present a pyrosequencing-based comparative genome analysis of seven E. faecium strains that were isolated from various sources. In the genomes of clinical isolates several antibiotic resistance genes were identified, including the vanA transposon that confers resistance to vancomycin in two strains. A functional comparison between E. faecium and the related opportunistic pathogen E. faecalis based on differences in the presence of protein families, revealed divergence in plant carbohydrate metabolic pathways and oxidative stress defense mechanisms. The E. faecium pan-genome was estimated to be essentially unlimited in size, indicating that E. faecium can efficiently acquire and incorporate exogenous DNA in its gene pool. One of the most prominent sources of genomic diversity consists of bacteriophages that have integrated in the genome. The CRISPR-Cas system, which contributes to immunity against bacteriophage infection in prokaryotes, is not present in the sequenced strains. Three sequenced isolates carry the esp gene, which is involved in urinary tract infections and biofilm formation. The esp gene is located on a large pathogenicity island (PAI, which is between 64 and 104 kb in size. Conjugation experiments showed that the entire esp PAI can be transferred horizontally and inserts in a site-specific manner. Conclusions Genes involved in environmental persistence, colonization and virulence can easily be aquired by E. faecium. This will make the development of successful treatment strategies targeted against this organism a challenge for years to come.

  12. Comparative genomics of Lactobacillus and other LAB

    DEFF Research Database (Denmark)

    Wassenaar, Trudy M.; Lukjancenko, Oksana

    2014-01-01

    The genomes of 66 LABs, belonging to five different genera, were compared for genome size and gene content. The analyzed genomes included 37 Lactobacillus genomes of 17 species, six Lactococcus lactis genomes, four Leuconostoc genomes of three species, six Streptococcus genomes of two species......, twelve Enterococcus genomes of four species and a single Weissella genome. Genomes of pathogenic strains or species were not included. Since the gene density in these genomes is relatively constant, genome size is a measure of gene content. The genomes of Enterococcus were significantly larger than...... that of the others, with the two Streptococcus species having the shortest genomes. The widest distribution in genome content was observed for Lactobacillus. The number of tRNA and rRNA gene copies varied considerably, with exceptional high numbers observed for Lb. delbrueckii, while these numbers were relatively...

  13. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2002-01-01

    This thesis described a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene trasnfer, even to the extent that we can use the gene content to make genome phylogenies. Usi

  14. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2003-01-01

    This thesis described a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene trasnfer, even to the extent that we can use the gene content to make genome phylogenies. Usi

  15. Gramene database: Navigating plant comparative genomics resources

    Directory of Open Access Journals (Sweden)

    Parul Gupta

    2016-11-01

    Full Text Available Gramene (http://www.gramene.org is an online, open source, curated resource for plant comparative genomics and pathway analysis designed to support researchers working in plant genomics, breeding, evolutionary biology, system biology, and metabolic engineering. It exploits phylogenetic relationships to enrich the annotation of genomic data and provides tools to perform powerful comparative analyses across a wide spectrum of plant species. It consists of an integrated portal for querying, visualizing and analyzing data for 44 plant reference genomes, genetic variation data sets for 12 species, expression data for 16 species, curated rice pathways and orthology-based pathway projections for 66 plant species including various crops. Here we briefly describe the functions and uses of the Gramene database.

  16. A white paper on nematode comparative genomics.

    Science.gov (United States)

    Bird, David McK; Blaxter, Mark L; McCarter, James P; Mitreva, Makedonka; Sternberg, Paul W; Thomas, W Kelley

    2005-12-01

    In response to the new opportunities for genome sequencing and comparative genomics, the Society of Nematology (SON) formed a committee to develop a white paper in support of the broad scientific needs associated with this phylum and interests of SON members. Although genome sequencing is expensive, the data generated are unique in biological systems in that genomes have the potential to be complete (every base of the genome can be accounted for), accurate (the data are digital and not subject to stochastic variation), and permanent (once obtained, the genome of a species does not need to be experimentally re-sampled). The availability of complete, accurate, and permanent genome sequences from diverse nematode species will underpin future studies into the biology and evolution of this phylum and the ecological associations (particularly parasitic) nematodes have with other organisms. We anticipate that upwards of 100 nematode genomes will be solved to varying levels of completion in the coming decade and suggest biological and practical considerations to guide the selection of the most informative taxa for sequencing.

  17. YersiniaBase: a genomic resource and analysis platform for comparative analysis of Yersinia

    OpenAIRE

    Tan, Shi Yang; Dutta, Avirup; Jakubovics, Nicholas S.; Ang, Mia Yang; Siow, Cheuk Chuen; Mutha, Naresh VR; Heydari, Hamed; Wee, Wei Yee; Wong, Guat Jah; Choo, Siew Woh

    2015-01-01

    Background Yersinia is a Gram-negative bacteria that includes serious pathogens such as the Yersinia pestis, which causes plague, Yersinia pseudotuberculosis, Yersinia enterocolitica. The remaining species are generally considered non-pathogenic to humans, although there is evidence that at least some of these species can cause occasional infections using distinct mechanisms from the more pathogenic species. With the advances in sequencing technologies, many genomes of Yersinia have been sequ...

  18. Aquerium: A web application for comparative exploration of domain-based protein occurrences on the taxonomically clustered genome tree.

    Science.gov (United States)

    Adebali, Ogun; Zhulin, Igor B

    2017-01-01

    Gene duplication and loss are major driving forces in evolution. While many important genomic resources provide information on gene presence, there is a lack of tools giving equal importance to presence and absence information as well as web platforms enabling easy visual comparison of multiple domain-based protein occurrences at once. Here, we present Aquerium, a platform for visualizing genomic presence and absence of biomolecules with a focus on protein domain architectures. The web server offers advanced domain organization querying against the database of pre-computed domains for ∼26,000 organisms and it can be utilized for identification of evolutionary events, such as fusion, disassociation, duplication, and shuffling of protein domains. The tool also allows alternative inputs of custom entries or BLASTP results for visualization. Aquerium will be a useful tool for biologists who perform comparative genomic and evolutionary analyses. The web server is freely accessible at http://aquerium.utk.edu. Proteins 2016; 85:72-77. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.

  19. Genetic characterization of dogs via chromosomal analysis and array-based comparative genomic hybridization (aCGH).

    Science.gov (United States)

    Müller, M H; Reimann-Berg, N; Bullerdiek, J; Murua Escobar, H

    2012-01-01

    The results of cytogenetic and molecular cytogenetic investigations revealed similarities in genetic background and biological behaviour between tumours and genetic diseases of humans and dogs. These findings classify the dog a good and accepted model for human cancers such as osteosarcomas, mammary carcinomas, oral melanomas and others. With the appearance of new studies and advances in canine genome sequencing, the number of known homologies in diseases between these species raised and still is expected to increase. In this context, array-based comparative genomic hybridization (aCGH) provides a novel tool to rapidly characterize numerical aberrations in canine tumours or to detect copy number aberrations between different breeds. As it is possible to spot probes covering the whole genome on each chip to discover copy number aberrations of all chromosomes simultaneously, this method is time-saving and cost-effective - considering the relation of costs and the amount of data obtained. Complemented with traditional methods like karyotyping and fluorescence in situ hybridization (FISH) analyses, the aCGH is able to provide new insights into the underlying causes of canine carcinogenesis.

  20. Comparative study of de novo assembly and genome-guided assembly strategies for transcriptome reconstruction based on RNA-Seq.

    Science.gov (United States)

    Lu, Bingxin; Zeng, Zhenbing; Shi, Tieliu

    2013-02-01

    Transcriptome reconstruction is an important application of RNA-Seq, providing critical information for further analysis of transcriptome. Although RNA-Seq offers the potential to identify the whole picture of transcriptome, it still presents special challenges. To handle these difficulties and reconstruct transcriptome as completely as possible, current computational approaches mainly employ two strategies: de novo assembly and genome-guided assembly. In order to find the similarities and differences between them, we firstly chose five representative assemblers belonging to the two classes respectively, and then investigated and compared their algorithm features in theory and real performances in practice. We found that all the methods can be reduced to graph reduction problems, yet they have different conceptual and practical implementations, thus each assembly method has its specific advantages and disadvantages, performing worse than others in certain aspects while outperforming others in anther aspects at the same time. Finally we merged assemblies of the five assemblers and obtained a much better assembly. Additionally we evaluated an assembler using genome-guided de novo assembly approach, and achieved good performance. Based on these results, we suggest that to obtain a comprehensive set of recovered transcripts, it is better to use a combination of de novo assembly and genome-guided assembly.

  1. Structure, evolution, and comparative genomics of tetraploid cotton based on a high-density genetic linkage map.

    Science.gov (United States)

    Li, Ximei; Jin, Xin; Wang, Hantao; Zhang, Xianlong; Lin, Zhongxu

    2016-06-01

    A high-density linkage map was constructed using 1,885 newly obtained loci and 3,747 previously published loci, which included 5,152 loci with 4696.03 cM in total length and 0.91 cM in mean distance. Homology analysis in the cotton genome further confirmed the 13 expected homologous chromosome pairs and revealed an obvious inversion on Chr10 or Chr20 and repeated inversions on Chr07 or Chr16. In addition, two reciprocal translocations between Chr02 and Chr03 and between Chr04 and Chr05 were confirmed. Comparative genomics between the tetraploid cotton and the diploid cottons showed that no major structural changes exist between DT and D chromosomes but rather between AT and A chromosomes. Blast analysis between the tetraploid cotton genome and the mixed genome of two diploid cottons showed that most AD chromosomes, regardless of whether it is from the AT or DT genome, preferentially matched with the corresponding homologous chromosome in the diploid A genome, and then the corresponding homologous chromosome in the diploid D genome, indicating that the diploid D genome underwent converted evolution by the diploid A genome to form the DT genome during polyploidization. In addition, the results reflected that a series of chromosomal translocations occurred among Chr01/Chr15, Chr02/Chr14, Chr03/Chr17, Chr04/Chr22, and Chr05/Chr19.

  2. The CGView Server: a comparative genomics tool for circular genomes.

    Science.gov (United States)

    Grant, Jason R; Stothard, Paul

    2008-07-01

    The CGView Server generates graphical maps of circular genomes that show sequence features, base composition plots, analysis results and sequence similarity plots. Sequences can be supplied in raw, FASTA, GenBank or EMBL format. Additional feature or analysis information can be submitted in the form of GFF (General Feature Format) files. The server uses BLAST to compare the primary sequence to up to three comparison genomes or sequence sets. The BLAST results and feature information are converted to a graphical map showing the entire sequence, or an expanded and more detailed view of a region of interest. Several options are included to control which types of features are displayed and how the features are drawn. The CGView Server can be used to visualize features associated with any bacterial, plasmid, chloroplast or mitochondrial genome, and can aid in the identification of conserved genome segments, instances of horizontal gene transfer, and differences in gene copy number. Because a collection of sequences can be used in place of a comparison genome, maps can also be used to visualize regions of a known genome covered by newly obtained sequence reads. The CGView Server can be accessed at http://stothard.afns.ualberta.ca/cgview_server/

  3. Sequencing and comparing whole mitochondrial genomes ofanimals

    Energy Technology Data Exchange (ETDEWEB)

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning,sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  4. Approaches for Comparative Genomics in Aspergillus and Penicillium

    DEFF Research Database (Denmark)

    Rasmussen, Jane Lind Nybo; Theobald, Sebastian; Brandl, Julian

    2016-01-01

    The number of available genomes in the closely related fungal genera Aspergillus and Penicillium is rapidly increasing. At the time of writing, the genomes of 62 species are available, and an even higher number is being prepared. Fungal comparative genomics is thus becoming steadily more powerful...... and applicable for many types of studies. In this chapter, we provide an overview of the state-of-the-art of comparative genomics in these fungi, along with recommended methods. The chapter describes databases for fungal comparative genomics. Based on experience, we suggest strategies for multiple types...... of comparative genomics, ranging from analysis of single genes, over gene clusters and CaZymes to genome-scale comparative genomics. Furthermore, we have examined published comparative genomics papers to summarize the preferred bioinformatic methods and parameters for a given type of analysis, highly useful...

  5. Comparative genomics of bifidobacterium, lactobacillus and related probiotic genera

    DEFF Research Database (Denmark)

    Lukjancenko, Oksana; Ussery, David; Wassenaar, Trudy M.

    2012-01-01

    Six bacterial genera containing species commonly used as probiotics for human consumption or starter cultures for food fermentation were compared and contrasted, based on publicly available complete genome sequences. The analysis included 19 Bifidobacterium genomes, 21 Lactobacillus genomes, 4...... Lactococcus and 3 Leuconostoc genomes, as well as a selection of Enterococcus (11) and Streptococcus (23) genomes. The latter two genera included genomes from probiotic or commensal as well as pathogenic organisms to investigate if their non-pathogenic members shared more genes with the other probiotic......- and core genome of each genus were compared. In addition, it was investigated whether pathogenic genomes contain different COG classes compared to the probiotic or fermentative organisms, again comparing their pan- and core genomes. The obtained results were compared with published data from the literature...

  6. Microarray-based comparative genomic profiling of reference strains and selected Canadian field isolates of Actinobacillus pleuropneumoniae

    Directory of Open Access Journals (Sweden)

    MacInnes Janet I

    2009-02-01

    Full Text Available Abstract Background Actinobacillus pleuropneumoniae, the causative agent of porcine pleuropneumonia, is a highly contagious respiratory pathogen that causes severe losses to the swine industry worldwide. Current commercially-available vaccines are of limited value because they do not induce cross-serovar immunity and do not prevent development of the carrier state. Microarray-based comparative genomic hybridizations (M-CGH were used to estimate whole genomic diversity of representative Actinobacillus pleuropneumoniae strains. Our goal was to identify conserved genes, especially those predicted to encode outer membrane proteins and lipoproteins because of their potential for the development of more effective vaccines. Results Using hierarchical clustering, our M-CGH results showed that the majority of the genes in the genome of the serovar 5 A. pleuropneumoniae L20 strain were conserved in the reference strains of all 15 serovars and in representative field isolates. Fifty-eight conserved genes predicted to encode for outer membrane proteins or lipoproteins were identified. As well, there were several clusters of diverged or absent genes including those associated with capsule biosynthesis, toxin production as well as genes typically associated with mobile elements. Conclusion Although A. pleuropneumoniae strains are essentially clonal, M-CGH analysis of the reference strains of the fifteen serovars and representative field isolates revealed several classes of genes that were divergent or absent. Not surprisingly, these included genes associated with capsule biosynthesis as the capsule is associated with sero-specificity. Several of the conserved genes were identified as candidates for vaccine development, and we conclude that M-CGH is a valuable tool for reverse vaccinology.

  7. A comparative genomic study in schizophrenic and in bipolar disorder patients, based on microarray expression profiling meta-analysis.

    Science.gov (United States)

    Logotheti, Marianthi; Papadodima, Olga; Venizelos, Nikolaos; Chatziioannou, Aristotelis; Kolisis, Fragiskos

    2013-01-01

    Schizophrenia affecting almost 1% and bipolar disorder affecting almost 3%-5% of the global population constitute two severe mental disorders. The catecholaminergic and the serotonergic pathways have been proved to play an important role in the development of schizophrenia, bipolar disorder, and other related psychiatric disorders. The aim of the study was to perform and interpret the results of a comparative genomic profiling study in schizophrenic patients as well as in healthy controls and in patients with bipolar disorder and try to relate and integrate our results with an aberrant amino acid transport through cell membranes. In particular we have focused on genes and mechanisms involved in amino acid transport through cell membranes from whole genome expression profiling data. We performed bioinformatic analysis on raw data derived from four different published studies. In two studies postmortem samples from prefrontal cortices, derived from patients with bipolar disorder, schizophrenia, and control subjects, have been used. In another study we used samples from postmortem orbitofrontal cortex of bipolar subjects while the final study was performed based on raw data from a gene expression profiling dataset in the postmortem superior temporal cortex of schizophrenics. The data were downloaded from NCBI's GEO datasets.

  8. A Comparative Genomic Study in Schizophrenic and in Bipolar Disorder Patients, Based on Microarray Expression Profiling Meta-Analysis

    Directory of Open Access Journals (Sweden)

    Marianthi Logotheti

    2013-01-01

    Full Text Available Schizophrenia affecting almost 1% and bipolar disorder affecting almost 3%–5% of the global population constitute two severe mental disorders. The catecholaminergic and the serotonergic pathways have been proved to play an important role in the development of schizophrenia, bipolar disorder, and other related psychiatric disorders. The aim of the study was to perform and interpret the results of a comparative genomic profiling study in schizophrenic patients as well as in healthy controls and in patients with bipolar disorder and try to relate and integrate our results with an aberrant amino acid transport through cell membranes. In particular we have focused on genes and mechanisms involved in amino acid transport through cell membranes from whole genome expression profiling data. We performed bioinformatic analysis on raw data derived from four different published studies. In two studies postmortem samples from prefrontal cortices, derived from patients with bipolar disorder, schizophrenia, and control subjects, have been used. In another study we used samples from postmortem orbitofrontal cortex of bipolar subjects while the final study was performed based on raw data from a gene expression profiling dataset in the postmortem superior temporal cortex of schizophrenics. The data were downloaded from NCBI's GEO datasets.

  9. Comparative genomics of Shiga toxin encoding bacteriophages

    Directory of Open Access Journals (Sweden)

    Smith Darren L

    2012-07-01

    Full Text Available Abstract Background Stx bacteriophages are responsible for driving the dissemination of Stx toxin genes (stx across their bacterial host range. Lysogens carrying Stx phages can cause severe, life-threatening disease and Stx toxin is an integral virulence factor. The Stx-bacteriophage vB_EcoP-24B, commonly referred to as Ф24B, is capable of multiply infecting a single bacterial host cell at a high frequency, with secondary infection increasing the rate at which subsequent bacteriophage infections can occur. This is biologically unusual, therefore determining the genomic content and context of Ф24B compared to other lambdoid Stx phages is important to understanding the factors controlling this phenomenon and determining whether they occur in other Stx phages. Results The genome of the Stx2 encoding phage, Ф24B was sequenced and annotated. The genomic organisation and general features are similar to other sequenced Stx bacteriophages induced from Enterohaemorrhagic Escherichia coli (EHEC, however Ф24B possesses significant regions of heterogeneity, with implications for phage biology and behaviour. The Ф24B genome was compared to other sequenced Stx phages and the archetypal lambdoid phage, lambda, using the Circos genome comparison tool and a PCR-based multi-loci comparison system. Conclusions The data support the hypothesis that Stx phages are mosaic, and recombination events between the host, phages and their remnants within the same infected bacterial cell will continue to drive the evolution of Stx phage variants and the subsequent dissemination of shigatoxigenic potential.

  10. Comparative genome-scale analysis of niche-based stress-responsive genes in Lactobacillus helveticus strains.

    Science.gov (United States)

    Senan, Suja; Prajapati, Jashbhai B; Joshi, Chaitanya G

    2014-04-01

    Next generation sequencing technologies with advanced bioinformatic tools present a unique opportunity to compare genomes from diverse niches. The identification of niche-specific stress-responsive genes can help in characterizing robust strains for multiple applications. In this study, we attempted to compare the stress-responsive genes of a potential probiotic strain, Lactobacillus helveticus MTCC 5463, and a cheese starter strain, Lactobacillus helveticus DPC 4571, from a gut and dairy niche, respectively. Sequencing of MTCC 5463 was done using 454 GS FLX, and contigs were assembled using GS Assembler software. Genome analysis was done using BLAST hits and the prokaryotic annotation server RAST. The MTCC 5463 genome carried multiple orthologs of genes governing stress responses, whereas the DPC 4571 genome lacked in the number of major stress-response proteins. The absence of the bile salt hydrolase gene in DPC 4571 and its presence in MTCC 5463 clearly indicated niche adaptation. Further, MTCC 5463 carried higher copy numbers of genes contributing towards heat, cold, osmotic, and oxidative stress resistance as compared with DPC 4571. Through comparative genomics, we could thus identify stress-responsive gene sets required to adapt to gut and dairy niches.

  11. Gramene database: navigating plant comparative genomics resources

    Science.gov (United States)

    Gramene (http://www.gramene.org) is an online, open source, curated resource for plant comparative genomics and pathway analysis designed to support researchers working in plant genomics, breeding, evolutionary biology, system biology, and metabolic engineering. It exploits phylogenetic relationship...

  12. Cocoa/Cotton Comparative Genomics

    Science.gov (United States)

    With genome sequence from two members of the Malvaceae family recently made available, we are exploring syntenic relationships, gene content, and evolutionary trajectories between the cacao and cotton genomes. An assembly of cacao (Theobroma cacao) using Illumina and 454 sequence technology yielded ...

  13. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    DEFF Research Database (Denmark)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand...... the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two...... misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan-and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity...

  14. Comparative genome analysis of Basidiomycete fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Henrissat, Bernard; Nagy, Laszlo; Brown, Daren; Held, Benjamin; Baker, Scott; Blanchette, Robert; Boussau, Bastien; Doty, Sharon L.; Fagnan, Kirsten; Floudas, Dimitris; Levasseur, Anthony; Manning, Gerard; Martin, Francis; Morin, Emmanuelle; Otillar, Robert; Pisabarro, Antonio; Walton, Jonathan; Wolfe, Ken; Hibbett, David; Grigoriev, Igor

    2013-08-07

    Fungi of the phylum Basidiomycota (basidiomycetes), make up some 37percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs including the majority of wood decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum we compared the genomes of 35 basidiomycetes including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48percent of basidiomycete proteins are unique to the phylum with nearly half of those (22percent) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically-informed PCA analysis of wood decay genes, we predict that that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.

  15. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    Science.gov (United States)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  16. Microarray based comparative genomic hybridization testing in deletion bearing patients with Angelman syndrome: genotype-phenotype correlations.

    Science.gov (United States)

    Sahoo, T; Peters, S U; Madduri, N S; Glaze, D G; German, J R; Bird, L M; Barbieri-Welge, R; Bichell, T J; Beaudet, A L; Bacino, C A

    2006-06-01

    Angelman syndrome (AS) is a neurodevelopmental disorder characterised by severe mental retardation, dysmorphic features, ataxia, seizures, and typical behavioural characteristics, including a happy sociable disposition. AS is caused by maternal deficiency of UBE3A (E6 associated protein ubiquitin protein ligase 3A gene), located in an imprinted region on chromosome 15q11-q13. Although there are four different molecular types of AS, deletions of the 15q11-q13 region account for approximately 70% of the AS patients. These deletions are usually detected by fluorescence in situ hybridisation studies. The deletions can also be subclassified based on their size into class I and class II, with the former being larger and encompassing the latter. We studied 22 patients with AS due to microdeletions using a microarray based comparative genomic hybridisation (array CGH) assay to define the deletions and analysed their phenotypic severity, especially expression of the autism phenotype, in order to establish clinical correlations. Overall, children with larger, class I deletions were significantly more likely to meet criteria for autism, had lower cognitive scores, and lower expressive language scores compared with children with smaller, class II deletions. Children with class I deletions also required more medications to control their seizures than did those in the class II group. There are four known genes (NIPA1, NIPA2, CYFIP1, & GCP5) that are affected by class I but not class II deletions, thus raising the possibility of a role for these genes in autism as well as the development of expressive language skills.

  17. A three-step workflow procedure for the interpretation of array-based comparative genome hybridization results in patients with idiopathic mental retardation and congenital anomalies.

    Science.gov (United States)

    Poot, Martin; Hochstenbach, Ron

    2010-08-01

    One of the aims of clinical genetics is to identify gene mutations or genomic rearrangements that may underlie complex presentations of phenotypic features, such as multiple congenital malformations and mental retardation. During the decade after publication of the first article on array-based comparative genome hybridization, this technique has supplemented karyotyping as the prime genome-wide screening method in patients with idiopathic multiple congenital malformations and mental retardation. The use of this novel, discovery-based, approach has dramatically increased the detection rate of genomic imbalances. Array-based comparative genome hybridization detects copy number changes in the genome of patients and healthy subjects, some of which may represent phenotypically neutral copy number variations. This prompts the need for properly distinguishing between those copy number changes that may contribute to the clinical phenotype amid a pool of neutral copy number variations. We briefly review the characteristics of copy number changes in relation to their clinical relevance. Second, we discuss several published workflow schemes to identify copy number changes putatively contributing to the phenotype, and third, we propose a three-step procedure aiming to rapidly evaluate copy number changes on a case-by-case basis as to their potential contribution to the phenotype of patients with idiopathic multiple congenital malformations and mental retardation. This workflow is gene-centered and should aid in identification of disease-related candidate genes and in estimating the recurrence risk for the disorder in the family.

  18. 3D Genome Tuner: Compare Multiple Circular Genomes in a 3D Context

    Institute of Scientific and Technical Information of China (English)

    Qi Wang; Qun Liang; Xiuqing Zhang

    2009-01-01

    Circular genomes, being the largest proportion of sequenced genomes, play an important role in genome analysis. However, traditional 2D circular map only provides an overview and annotations of genome but does not offer feature-based comparison. For remedying these shortcomings, we developed 3D Genome Tuner, a hybrid of circular map and comparative map tools. Its capability of viewing comparisons between multiple circular maps in a 3D space offers great benefits to the study of comparative genomics. The program is freely available(under an LGPL licence)at http://sourceforge.net/projects/dgenometuner.

  19. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    OpenAIRE

    Henrique Machado; Lone Gram

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationship...

  20. Microarray and comparative genomics-based identification of genes and gene regulatory regions of the mouse immune system

    Directory of Open Access Journals (Sweden)

    Katz Jonathan D

    2004-10-01

    Full Text Available Abstract Background In this study we have built and mined a gene expression database composed of 65 diverse mouse tissues for genes preferentially expressed in immune tissues and cell types. Using expression pattern criteria, we identified 360 genes with preferential expression in thymus, spleen, peripheral blood mononuclear cells, lymph nodes (unstimulated or stimulated, or in vitro activated T-cells. Results Gene clusters, formed based on similarity of expression-pattern across either all tissues or the immune tissues only, had highly significant associations both with immunological processes such as chemokine-mediated response, antigen processing, receptor-related signal transduction, and transcriptional regulation, and also with more general processes such as replication and cell cycle control. Within-cluster gene correlations implicated known associations of known genes, as well as immune process-related roles for poorly described genes. To characterize regulatory mechanisms and cis-elements of genes with similar patterns of expression, we used a new version of a comparative genomics-based cis-element analysis tool to identify clusters of cis-elements with compositional similarity among multiple genes. Several clusters contained genes that shared 5–6 cis-elements that included ETS and zinc-finger binding sites. cis-Elements AP2 EGRF ETSF MAZF SP1F ZF5F and AREB ETSF MZF1 PAX5 STAT were shared in a thymus-expressed set; AP4R E2FF EBOX ETSF MAZF SP1F ZF5F and CREB E2FF MAZF PCAT SP1F STAT cis-clusters occurred in activated T-cells; CEBP CREB NFKB SORY and GATA NKXH OCT1 RBIT occurred in stimulated lymph nodes. Conclusion This study demonstrates a series of analytic approaches that have allowed the implication of genes and regulatory elements that participate in the differentiation, maintenance, and function of the immune system. Polymorphism or mutation of these could adversely impact immune system functions.

  1. Comparative Reannotation of 21 Aspergillus Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Salamov, Asaf; Riley, Robert; Kuo, Alan; Grigoriev, Igor

    2013-03-08

    We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain some errors of different nature, e.g. missing genes, incorrect exon-intron structures, 'chimeras', which fuse 2 or more real genes or alternatively splitting some real genes into 2 or more models. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models, the one which most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models ( ~;;2percent per genome), supported by comparative analysis, additionally correcting ~;;18percent of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~;;3percent increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.

  2. Comparative genomics and evolution of eukaryotic phospholipidbiosynthesis

    Energy Technology Data Exchange (ETDEWEB)

    Lykidis, Athanasios

    2006-12-01

    Phospholipid biosynthetic enzymes produce diverse molecular structures and are often present in multiple forms encoded by different genes. This work utilizes comparative genomics and phylogenetics for exploring the distribution, structure and evolution of phospholipid biosynthetic genes and pathways in 26 eukaryotic genomes. Although the basic structure of the pathways was formed early in eukaryotic evolution, the emerging picture indicates that individual enzyme families followed unique evolutionary courses. For example, choline and ethanolamine kinases and cytidylyltransferases emerged in ancestral eukaryotes, whereas, multiple forms of the corresponding phosphatidyltransferases evolved mainly in a lineage specific manner. Furthermore, several unicellular eukaryotes maintain bacterial-type enzymes and reactions for the synthesis of phosphatidylglycerol and cardiolipin. Also, base-exchange phosphatidylserine synthases are widespread and ancestral enzymes. The multiplicity of phospholipid biosynthetic enzymes has been largely generated by gene expansion in a lineage specific manner. Thus, these observations suggest that phospholipid biosynthesis has been an actively evolving system. Finally, comparative genomic analysis indicates the existence of novel phosphatidyltransferases and provides a candidate for the uncharacterized eukaryotic phosphatidylglycerol phosphate phosphatase.

  3. Genome-Based Construction of the Metabolic Pathways of Orientia tsutsugamushi and Comparative Analysis within the Rickettsiales Order

    Directory of Open Access Journals (Sweden)

    Chan-Ki Min

    2008-01-01

    Full Text Available Orientia tsutsugamushi, the causative agent of scrub typhus, is an obligate intracellular bacterium that belongs to the order of Rickettsiales. Recently, we have reported that O. tsutsugamushi has a unique genomic structure, consisting of highly repetitive sequences, and suggested that it may provide valuable insight into the evolution of intracellular bacteria. Here, we have used genomic information to construct the major metabolic pathways of O. tsutsugamushi and performed a comparative analysis of the metabolic genes and pathways of O. tsutsugamushi with other members of the Rickettsiales order. While O. tsutsugamushi has the largest genome among the members of this order, mainly due to the presence of repeated sequences, its metabolic pathways have been highly streamlined. Overall, the metabolic pathways of O. tsutsugamushi were similar to Rickettsia but there were notable differences in several pathways including carbohydrate metabolism, the TCA cycle, and the synthesis of cell wall components as well as in the transport systems. Our results will provide a useful guide to the postgenomic analysis of O. tsutsugamushi and lead to a better understanding of the virulence and physiology of this intracellular pathogen.

  4. Comparative genomic analysis of sixty mycobacteriophage genomes: Genome clustering, gene acquisition and gene size

    Science.gov (United States)

    Hatfull, Graham F.; Jacobs-Sera, Deborah; Lawrence, Jeffrey G.; Pope, Welkin H.; Russell, Daniel A.; Ko, Ching-Chung; Weber, Rebecca J.; Patel, Manisha C.; Germane, Katherine L.; Edgar, Robert H.; Hoyte, Natasha N.; Bowman, Charles A.; Tantoco, Anthony T.; Paladin, Elizabeth C.; Myers, Marlana S.; Smith, Alexis L.; Grace, Molly S.; Pham, Thuy T.; O'Brien, Matthew B.; Vogelsberger, Amy M.; Hryckowian, Andrew J.; Wynalek, Jessica L.; Donis-Keller, Helen; Bogel, Matt W.; Peebles, Craig L.; Cresawn, Steve G.; Hendrix, Roger W.

    2010-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts. Expansion of a collection of sequenced phage genomes to a total of sixty – all infecting a common bacterial host – provides further insight into their diversity and evolution. Of the sixty phage genomes, 55 can be grouped into nine clusters according to their nucleotide sequence similarities, five of which can be further divided into subclusters; five genomes do not cluster with other phages. The sequence diversity between genomes within a cluster varies greatly; for example, the six genomes in cluster D share more than 97.5% average nucleotide similarity with each other. In contrast, similarity between the two genomes in Cluster I is barely detectable by diagonal plot analysis. The total of 6,858 predicted ORFs have been grouped into 1523 phamilies (phams) of related sequences, 46% of which possess only a single member. Only 18.8% of the phams have sequence similarity to non-mycobacteriophage database entries and fewer than 10% of all phams can be assigned functions based on database searching or synteny. Genome clustering facilitates the identification of genes that are in greatest genetic flux and are more likely to have been exchanged horizontally in relatively recent evolutionary time. Although mycobacteriophage genes exhibit smaller average size than genes of their host (205 residues compared to 315), phage genes in higher flux average only ∼100 amino acids, suggesting that the primary units of genetic exchange correspond to single protein domains. PMID:20064525

  5. Comparative Genomics of Green Sulfur Bacteria

    DEFF Research Database (Denmark)

    Ussery, David; Davenport, C; Tümmler, B

    2010-01-01

    Eleven completely sequenced Chlorobi genomes were compared in oligonucleotide usage, gene contents, and synteny. The green sulfur bacteria (GSB) are equipped with a core genome that sustains their anoxygenic phototrophic lifestyle by photosynthesis, sulfur oxidation, and CO(2) fixation. Whole...... weight of 10(6), and are probably instrumental for the bacteria to generate their own intimate (micro)environment....

  6. Comparative genomic analysis of esophageal cancers.

    Science.gov (United States)

    Caygill, Christine P J; Gatenby, Piers A C; Herceg, Zdenko; Lima, Sheila C S; Pinto, Luis F R; Watson, Anthony; Wu, Ming-Shiang

    2014-09-01

    The following, from the 12th OESO World Conference: Cancers of the Esophagus, includes commentaries on comparative genomic analysis of esophageal cancers: genomic polymorphisms, the genetic and epigenetic drivers in esophageal cancers, and the collection of data in the UK Barrett's Oesophagus Registry.

  7. Detection of chromosomal abnormalities by comparative genomic hybridization.

    Science.gov (United States)

    Lapierre, Jean-Michel; Tachdjian, Gérard

    2005-04-01

    Comparative genomic hybridization (CGH) is a modified in-situ hybridization technique. In this type of analysis, two differentially labeled genomic DNAs (study and reference) are cohybridized to normal metaphase spreads or to microarray. Chromosomal locations of copy number changes in the DNA segments of the study genome are revealed by a variable fluorescence intensity ratio along each target chromosome. Thus, CGH allows detection and mapping of DNA sequence copy differences between two genomes in a single experiment. Since its development, comparative genomic hybridization has been applied mostly as a research tool in the field of cancer cytogenetics to identify genetic changes in many previously unknown regions. It is also a powerful tool for detection and identification of unbalanced chromosomal abnormalities in prenatal, postnatal and preimplantation diagnostics. The development of comparative genomic hybridization and increase in resolution analysis by using the microarray-based technique offer new information on chromosomal pathologies and thus better management of patients.

  8. Homology-independent metrics for comparative genomics.

    Science.gov (United States)

    Coutinho, Tarcisio José Domingos; Franco, Glória Regina; Lobo, Francisco Pereira

    2015-01-01

    A mainstream procedure to analyze the wealth of genomic data available nowadays is the detection of homologous regions shared across genomes, followed by the extraction of biological information from the patterns of conservation and variation observed in such regions. Although of pivotal importance, comparative genomic procedures that rely on homology inference are obviously not applicable if no homologous regions are detectable. This fact excludes a considerable portion of "genomic dark matter" with no significant similarity - and, consequently, no inferred homology to any other known sequence - from several downstream comparative genomic methods. In this review we compile several sequence metrics that do not rely on homology inference and can be used to compare nucleotide sequences and extract biologically meaningful information from them. These metrics comprise several compositional parameters calculated from sequence data alone, such as GC content, dinucleotide odds ratio, and several codon bias metrics. They also share other interesting properties, such as pervasiveness (patterns persist on smaller scales) and phylogenetic signal. We also cite examples where these homology-independent metrics have been successfully applied to support several bioinformatics challenges, such as taxonomic classification of biological sequences without homology inference. They where also used to detect higher-order patterns of interactions in biological systems, ranging from detecting coevolutionary trends between the genomes of viruses and their hosts to characterization of gene pools of entire microbial communities. We argue that, if correctly understood and applied, homology-independent metrics can add important layers of biological information in comparative genomic studies without prior homology inference.

  9. DNA microarrays for comparative genomic hybridization based on DOP-PCR amplification of BAC and PAC clones.

    Science.gov (United States)

    Fiegler, Heike; Carr, Philippa; Douglas, Eleanor J; Burford, Deborah C; Hunt, Sarah; Scott, Carol E; Smith, James; Vetrie, David; Gorman, Patricia; Tomlinson, Ian P M; Carter, Nigel P

    2003-04-01

    We have designed DOP-PCR primers specifically for the amplification of large insert clones for use in the construction of DNA microarrays. A bioinformatic approach was used to construct primers that were efficient in the general amplification of human DNA but were poor at amplifying E. coli DNA, a common contaminant of DNA preparations from large insert clones. We chose the three most selective primers for use in printing DNA microarrays. DNA combined from the amplification of large insert clones by use of these three primers and spotted onto glass slides showed more than a sixfold increase in the human to E. coli hybridization ratio when compared to the standard DOP-PCR primer, 6MW. The microarrays reproducibly delineated previously characterized gains and deletions in a cancer cell line and identified a small gain not detected by use of conventional CGH. We also describe a method for the bulk testing of the hybridization characteristics of chromosome-specific clones spotted on microarrays by use of DNA amplified from flow-sorted chromosomes. Finally, we describe a set of clones selected from the publicly available Golden Path of the human genome at 1-Mb intervals and a view in the Ensembl genome browser from which data required for the use of these clones in array CGH and other experiments can be downloaded across the Internet.

  10. Complete Genome Sequence and Comparative Genomics of a Novel Myxobacterium Myxococcus hansupus.

    Science.gov (United States)

    Sharma, Gaurav; Narwani, Tarun; Subramanian, Srikrishna

    2016-01-01

    Myxobacteria, a group of Gram-negative aerobes, belong to the class δ-proteobacteria and order Myxococcales. Unlike anaerobic δ-proteobacteria, they exhibit several unusual physiogenomic properties like gliding motility, desiccation-resistant myxospores and large genomes with high coding density. Here we report a 9.5 Mbp complete genome of Myxococcus hansupus that encodes 7,753 proteins. Phylogenomic and genome-genome distance based analysis suggest that Myxococcus hansupus is a novel member of the genus Myxococcus. Comparative genome analysis with other members of the genus Myxococcus was performed to explore their genome diversity. The variation in number of unique proteins observed across different species is suggestive of diversity at the genus level while the overrepresentation of several Pfam families indicates the extent and mode of genome expansion as compared to non-Myxococcales δ-proteobacteria.

  11. Microalterations of inherently unstable genomic regions in rat mammary carcinomas as revealed by long oligonucleotide array-based comparative genomic hybridization

    NARCIS (Netherlands)

    Adamovic, T.; McAllister, D.; Guryev, V.; Wang, X.; Andrae, J.W.; Cuppen, E.; Jacob, H.; Sugg, S.L.

    2009-01-01

    The presence of copy number variants in normal genomes poses a challenge to identify small genuine somatic copy number changes in high-resolution cancer genome profiling studies due to the use of unpaired reference DNA. Another problem is the well-known rearrangements of immunoglobulin and T-cell

  12. Microalterations of inherently unstable genomic regions in rat mammary carcinomas as revealed by long oligonucleotide array-based comparative genomic hybridization

    NARCIS (Netherlands)

    Adamovic, T.; McAllister, D.; Guryev, V.; Wang, X.; Andrae, J.W.; Cuppen, E.; Jacob, H.; Sugg, S.L.

    2009-01-01

    The presence of copy number variants in normal genomes poses a challenge to identify small genuine somatic copy number changes in high-resolution cancer genome profiling studies due to the use of unpaired reference DNA. Another problem is the well-known rearrangements of immunoglobulin and T-cell re

  13. Microalterations of Inherently Unstable Genomic Regions in Rat Mammary Carcinomas as Revealed by Long Oligonucleotide Array-Based Comparative Genomic Hybridization

    NARCIS (Netherlands)

    Adamovic, Tatjana; McAllister, Donna; Guryev, Victor; Wang, Xujing; Andrae, Jaime Wendt; Cuppen, Edwin; Jacob, Howard J.; Sugg, Sonia L.

    2009-01-01

    The presence of copy number variants in normal genomes poses a challenge to identify small genuine somatic copy number changes in high-resolution cancer genome profiling studies due to the use of unpaired reference DNA. Another problem is the well-known rearrangements of immunoglobulin and T-cell re

  14. Comparative genomic analysis of eutherian kallikrein genes

    Directory of Open Access Journals (Sweden)

    Marko Premzl

    2017-03-01

    Full Text Available The present study made attempts to update and revise eutherian kallikrein genes implicated in major physiological and pathological processes and in medical molecular diagnostics. Using eutherian comparative genomic analysis protocol and free available genomic sequence assemblies, the tests of reliability of eutherian public genomic sequences annotated most comprehensive curated third party data gene data set of eutherian kallikrein genes including 121 complete coding sequences among 335 potential coding sequences. The present analysis first described 13 major gene clusters of eutherian kallikrein genes, and explained their differential gene expansion patterns. One updated classification and nomenclature of eutherian kallikrein genes was proposed, as new framework of future experiments.

  15. Comparative genomic data of the Avian Phylogenomics Project

    DEFF Research Database (Denmark)

    Zhang, Guojie; Li, Bo; Li, Cai;

    2014-01-01

    , which include 38 newly sequenced avian genomes plus previously released or simultaneously released genomes of Chicken, Zebra finch, Turkey, Pigeon, Peregrine falcon, Duck, Budgerigar, Adelie penguin, Emperor penguin and the Medium Ground Finch. We hope that this resource will serve future efforts...... in an average N50 scaffold size of about 50 kb. Repetitive elements comprised 4%-22% of the bird genomes. The assembled scaffolds allowed the homology-based annotation of 13,000 ~ 17000 protein coding genes in each avian genome relative to chicken, zebra finch and human, as well as comparative and sequence...

  16. Using Comparative Genomics for Inquiry-Based Learning to Dissect Virulence of "Escherichia coli" O157:H7 and "Yersinia pestis"

    Science.gov (United States)

    Baumler, David J.; Banta, Lois M.; Hung, Kai F.; Schwarz, Jodi A.; Cabot, Eric L.; Glasner, Jeremy D.; Perna, Nicole T.

    2012-01-01

    Genomics and bioinformatics are topics of increasing interest in undergraduate biological science curricula. Many existing exercises focus on gene annotation and analysis of a single genome. In this paper, we present two educational modules designed to enable students to learn and apply fundamental concepts in comparative genomics using examples…

  17. Using Comparative Genomics for Inquiry-Based Learning to Dissect Virulence of "Escherichia coli" O157:H7 and "Yersinia pestis"

    Science.gov (United States)

    Baumler, David J.; Banta, Lois M.; Hung, Kai F.; Schwarz, Jodi A.; Cabot, Eric L.; Glasner, Jeremy D.; Perna, Nicole T.

    2012-01-01

    Genomics and bioinformatics are topics of increasing interest in undergraduate biological science curricula. Many existing exercises focus on gene annotation and analysis of a single genome. In this paper, we present two educational modules designed to enable students to learn and apply fundamental concepts in comparative genomics using examples…

  18. GenoSets: visual analytic methods for comparative genomics.

    Directory of Open Access Journals (Sweden)

    Aurora A Cain

    Full Text Available Many important questions in biology are, fundamentally, comparative, and this extends to our analysis of a growing number of sequenced genomes. Existing genomic analysis tools are often organized around literal views of genomes as linear strings. Even when information is highly condensed, these views grow cumbersome as larger numbers of genomes are added. Data aggregation and summarization methods from the field of visual analytics can provide abstracted comparative views, suitable for sifting large multi-genome datasets to identify critical similarities and differences. We introduce a software system for visual analysis of comparative genomics data. The system automates the process of data integration, and provides the analysis platform to identify and explore features of interest within these large datasets. GenoSets borrows techniques from business intelligence and visual analytics to provide a rich interface of interactive visualizations supported by a multi-dimensional data warehouse. In GenoSets, visual analytic approaches are used to enable querying based on orthology, functional assignment, and taxonomic or user-defined groupings of genomes. GenoSets links this information together with coordinated, interactive visualizations for both detailed and high-level categorical analysis of summarized data. GenoSets has been designed to simplify the exploration of multiple genome datasets and to facilitate reasoning about genomic comparisons. Case examples are included showing the use of this system in the analysis of 12 Brucella genomes. GenoSets software and the case study dataset are freely available at http://genosets.uncc.edu. We demonstrate that the integration of genomic data using a coordinated multiple view approach can simplify the exploration of large comparative genomic data sets, and facilitate reasoning about comparisons and features of interest.

  19. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    Science.gov (United States)

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/.

  20. VISTA - computational tools for comparative genomics

    Energy Technology Data Exchange (ETDEWEB)

    Frazer, Kelly A.; Pachter, Lior; Poliakov, Alexander; Rubin,Edward M.; Dubchak, Inna

    2004-01-01

    Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/VISTA/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, submit their own sequences of interest to several VISTA servers for various types of comparative analysis, and obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate capabilities of the VISTA site by the analysis of a 180 kilobase (kb) interval on human chromosome 5 that encodes for the kinesin family member3A (KIF3A) protein.

  1. An evaluation of Comparative Genome Sequencing (CGS by comparing two previously-sequenced bacterial genomes

    Directory of Open Access Journals (Sweden)

    Herring Christopher D

    2007-08-01

    Full Text Available Abstract Background With the development of new technology, it has recently become practical to resequence the genome of a bacterium after experimental manipulation. It is critical though to know the accuracy of the technique used, and to establish confidence that all of the mutations were detected. Results In order to evaluate the accuracy of genome resequencing using the microarray-based Comparative Genome Sequencing service provided by Nimblegen Systems Inc., we resequenced the E. coli strain W3110 Kohara using MG1655 as a reference, both of which have been completely sequenced using traditional sequencing methods. CGS detected 7 of 8 small sequence differences, one large deletion, and 9 of 12 IS element insertions present in W3110, but did not detect a large chromosomal inversion. In addition, we confirmed that CGS also detected 2 SNPs, one deletion and 7 IS element insertions that are not present in the genome sequence, which we attribute to changes that occurred after the creation of the W3110 lambda clone library. The false positive rate for SNPs was one per 244 Kb of genome sequence. Conclusion CGS is an effective way to detect multiple mutations present in one bacterium relative to another, and while highly cost-effective, is prone to certain errors. Mutations occurring in repeated sequences or in sequences with a high degree of secondary structure may go undetected. It is also critical to follow up on regions of interest in which SNPs were not called because they often indicate deletions or IS element insertions.

  2. DNAVis: interactive visualization of comparative genome annotations

    NARCIS (Netherlands)

    Fiers, M.W.E.J.; Wetering, van de H.; Peeters, T.H.J.M.; Wijk, van J.J.; Nap, J.P.H.

    2006-01-01

    The software package DNAVis offers a fast, interactive and real-time visualization of DNA sequences and their comparative genome annotations. DNAVis implements advanced methods of information visualization such as linked views, perspective walls and semantic zooming, in addition to the display of he

  3. The perennial ryegrass GenomeZipper: targeted use of genome resources for comparative grass genomics.

    Science.gov (United States)

    Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F X; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

    2013-02-01

    Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species.

  4. Comparative genomics of chondrichthyan Hoxa clusters.

    Science.gov (United States)

    Mulley, John F; Zhong, Ying-Fu; Holland, Peter Wh

    2009-09-02

    The chondrichthyan or cartilaginous fish (chimeras, sharks, skates and rays) occupy an important phylogenetic position as the sister group to all other jawed vertebrates and as an early lineage to diverge from the vertebrate lineage following two whole genome duplication events in vertebrate evolution. There have been few comparative genomic analyses incorporating data from chondrichthyan fish and none comparing genomic information from within the group. We have sequenced the complete Hoxa cluster of the Little Skate (Leucoraja erinacea) and compared to the published Hoxa cluster of the Horn Shark (Heterodontus francisci) and to available data from the Elephant Shark (Callorhinchus milii) genome project. A BAC clone containing the full Little Skate Hoxa cluster was fully sequenced and assembled. Analyses of coding sequences and conserved non-coding elements reveal a strikingly high level of conservation across the cartilaginous fish, with twenty ultraconserved elements (100%,100 bp) found between Skate and Horn Shark, compared to three between human and marsupials. We have also identified novel potential non-coding RNAs in the Skate BAC clone, some of which are conserved to other species. We find that the Little Skate Hoxa cluster is remarkably similar to the previously published Horn Shark Hoxa cluster with respect to sequence identity, gene size and intergenic distance despite over 180 million years of separation between the two lineages. We suggest that the genomes of cartilaginous fish are more highly conserved than those of tetrapods or teleost fish and so are more likely to have retained ancestral non-coding elements. While useful for isolating homologous DNA, this complicates bioinformatic approaches to identify chondrichthyan-specific non-coding DNA elements.

  5. Comparative genomics of chondrichthyan Hoxa clusters

    Directory of Open Access Journals (Sweden)

    Zhong Ying-Fu

    2009-09-01

    Full Text Available Abstract Background The chondrichthyan or cartilaginous fish (chimeras, sharks, skates and rays occupy an important phylogenetic position as the sister group to all other jawed vertebrates and as an early lineage to diverge from the vertebrate lineage following two whole genome duplication events in vertebrate evolution. There have been few comparative genomic analyses incorporating data from chondrichthyan fish and none comparing genomic information from within the group. We have sequenced the complete Hoxa cluster of the Little Skate (Leucoraja erinacea and compared to the published Hoxa cluster of the Horn Shark (Heterodontus francisci and to available data from the Elephant Shark (Callorhinchus milii genome project. Results A BAC clone containing the full Little Skate Hoxa cluster was fully sequenced and assembled. Analyses of coding sequences and conserved non-coding elements reveal a strikingly high level of conservation across the cartilaginous fish, with twenty ultraconserved elements (100%,100 bp found between Skate and Horn Shark, compared to three between human and marsupials. We have also identified novel potential non-coding RNAs in the Skate BAC clone, some of which are conserved to other species. Conclusion We find that the Little Skate Hoxa cluster is remarkably similar to the previously published Horn Shark Hoxa cluster with respect to sequence identity, gene size and intergenic distance despite over 180 million years of separation between the two lineages. We suggest that the genomes of cartilaginous fish are more highly conserved than those of tetrapods or teleost fish and so are more likely to have retained ancestral non-coding elements. While useful for isolating homologous DNA, this complicates bioinformatic approaches to identify chondrichthyan-specific non-coding DNA elements

  6. What constitutes an Arabian Helicobacter pylori? Lessons from comparative genomics.

    Science.gov (United States)

    Kumar, Narender; Albert, M John; Al Abkal, Hanan; Siddique, Iqbal; Ahmed, Niyaz

    2017-02-01

    Helicobacter pylori, the human gastric pathogen, causes a variety of gastric diseases ranging from mild gastritis to gastric cancer. While the studies on H. pylori are dominated by those based on either East Asian or Western strains, information regarding H. pylori strains prevalent in the Middle East remains scarce. Therefore, we carried out whole-genome sequencing and comparative analysis of three H. pylori strains isolated from three native Arab, Kuwaiti patients. H. pylori strains were sequenced using Illumina platform. The sequence reads were filtered and draft genomes were assembled and annotated. Various pathogenicity-associated regions and phages present within the genomes were identified. Phylogenetic analysis was carried out to determine the genetic relatedness of Kuwaiti strains to various lineages of H. pylori. The core genome content and virulence-related genes were analyzed to assess the pathogenic potential. The three genomes clustered along with HpEurope strains in the phylogenetic tree comprising various H. pylori lineages. A total of 1187 genes spread among various functional classes were identified in the core genome analysis. The three genomes possessed a complete cagPAI and also retained most of the known outer membrane proteins as well as virulence-related genes. The cagA gene in all three strains consisted of an AB-C type EPIYA motif. The comparative genomic analysis of Kuwaiti H. pylori strains revealed a European ancestry and a high pathogenic potential. © 2016 John Wiley & Sons Ltd.

  7. High-resolution genome-wide array-based comparative genome hybridization reveals cryptic chromosome changes in AML and MDS cases with trisomy 8 as the sole cytogenetic aberration.

    Science.gov (United States)

    Paulsson, K; Heidenblad, M; Strömbeck, B; Staaf, J; Jönsson, G; Borg, A; Fioretos, T; Johansson, B

    2006-05-01

    Although trisomy 8 as the sole chromosome aberration is the most common numerical abnormality in acute myeloid leukemia (AML) and myelodysplastic syndromes (MDS), little is known about its pathogenetic effects. Considering that +8 is a frequent secondary change in AML/MDS, cryptic--possibly primary--genetic aberrations may occur in cases with trisomy 8 as the apparently single anomaly. However, no such hidden anomalies have been reported. We performed a high-resolution genome-wide array-based comparative genome hybridization (array CGH) analysis of 10 AML/MDS cases with isolated +8, utilizing a 32K bacterial artificial chromosome array set, providing >98% coverage of the genome with a resolution of 100 kb. Array CGH revealed intrachromosomal imbalances, not corresponding to known genomic copy number polymorphisms, in 4/10 cases, comprising nine duplications and hemizygous deletions ranging in size from 0.5 to 2.2 Mb. A 1.8 Mb deletion at 7p14.1, which had occurred prior to the +8, was identified in MDS transforming to AML. Furthermore, a deletion including ETV6 was present in one case. The remaining seven imbalances involved more than 40 genes. The present results show that cryptic genetic abnormalities are frequent in trisomy 8-positive AML/MDS cases and that +8 as the sole cytogenetic aberration is not always the primary genetic event.

  8. Comparative genomics of brain size evolution

    Directory of Open Access Journals (Sweden)

    Wolfgang eEnard

    2014-05-01

    Full Text Available Which genetic changes took place during mammalian, primate and human evolution to build a larger brain? To answer this question, one has to correlate genetic changes with brain size changes across a phylogeny. Such a comparative genomics approach provides unique information to better understand brain evolution and brain development. However, its statistical power is limited for example due to the limited number of species, the presumably complex genetics of brain size evolution and the large search space of mammalian genomes. Hence, it is crucial to add functional information, for example by limiting the search space to genes and regulatory elements known to play a role in the relevant cell types during brain development. Similarly, it is crucial to experimentally follow up on hypotheses generated by such a comparative approach. Recent progress in understanding the molecular and cellular mechanisms of mammalian brain development, in genome sequencing and in genome editing, promises to make a close integration of evolutionary and experimental methods a fruitful approach to better understand the genetics of mammalian brain size evolution.

  9. Whole-genome sequencing for comparative genomics and de novo genome assembly.

    Science.gov (United States)

    Benjak, Andrej; Sala, Claudia; Hartkoorn, Ruben C

    2015-01-01

    Next-generation sequencing technologies for whole-genome sequencing of mycobacteria are rapidly becoming an attractive alternative to more traditional sequencing methods. In particular this technology is proving useful for genome-wide identification of mutations in mycobacteria (comparative genomics) as well as for de novo assembly of whole genomes. Next-generation sequencing however generates a vast quantity of data that can only be transformed into a usable and comprehensible form using bioinformatics. Here we describe the methodology one would use to prepare libraries for whole-genome sequencing, and the basic bioinformatics to identify mutations in a genome following Illumina HiSeq or MiSeq sequencing, as well as de novo genome assembly following sequencing using Pacific Biosciences (PacBio).

  10. Mycobacterial species as case-study of comparative genome analysis.

    Science.gov (United States)

    Zakham, F; Belayachi, L; Ussery, D; Akrim, M; Benjouad, A; El Aouad, R; Ennaji, M M

    2011-02-08

    The genus Mycobacterium represents more than 120 species including important pathogens of human and cause major public health problems and illnesses. Further, with more than 100 genome sequences from this genus, comparative genome analysis can provide new insights for better understanding the evolutionary events of these species and improving drugs, vaccines, and diagnostics tools for controlling Mycobacterial diseases. In this present study we aim to outline a comparative genome analysis of fourteen Mycobacterial genomes: M. avium subsp. paratuberculosis K—10, M. bovis AF2122/97, M. bovis BCG str. Pasteur 1173P2, M. leprae Br4923, M. marinum M, M. sp. KMS, M. sp. MCS, M. tuberculosis CDC1551, M. tuberculosis F11, M. tuberculosis H37Ra, M. tuberculosis H37Rv, M. tuberculosis KZN 1435 , M. ulcerans Agy99,and M. vanbaalenii PYR—1, For this purpose a comparison has been done based on their length of genomes, GC content, number of genes in different data bases (Genbank, Refseq, and Prodigal). The BLAST matrix of these genomes has been figured to give a lot of information about the similarity between species in a simple scheme. As a result of multiple genome analysis, the pan and core genome have been defined for twelve Mycobacterial species. We have also introduced the genome atlas of the reference strain M. tuberculosis H37Rv which can give a good overview of this genome. And for examining the phylogenetic relationships among these bacteria, a phylogenic tree has been constructed from 16S rRNA gene for tuberculosis and non tuberculosis Mycobacteria to understand the evolutionary events of these species.

  11. An Orthology-Based Analysis of Pathogenic Protozoa Impacting Global Health: An Improved Comparative Genomics Approach with Prokaryotes and Model Eukaryote Orthologs

    Science.gov (United States)

    Cuadrat, Rafael R. C.; da Serra Cruz, Sérgio Manuel; Tschoeke, Diogo Antônio; Silva, Edno; Tosta, Frederico; Jucá, Henrique; Jardim, Rodrigo; Campos, Maria Luiza M.; Mattoso, Marta

    2014-01-01

    Abstract A key focus in 21st century integrative biology and drug discovery for neglected tropical and other diseases has been the use of BLAST-based computational methods for identification of orthologous groups in pathogenic organisms to discern orthologs, with a view to evaluate similarities and differences among species, and thus allow the transfer of annotation from known/curated proteins to new/non-annotated ones. We used here a profile-based sensitive methodology to identify distant homologs, coupled to the NCBI's COG (Unicellular orthologs) and KOG (Eukaryote orthologs), permitting us to perform comparative genomics analyses on five protozoan genomes. OrthoSearch was used in five protozoan proteomes showing that 3901 and 7473 orthologs can be identified by comparison with COG and KOG proteomes, respectively. The core protozoa proteome inferred was 418 Protozoa-COG orthologous groups and 704 Protozoa-KOG orthologous groups: (i) 31.58% (132/418) belongs to the category J (translation, ribosomal structure, and biogenesis), and 9.81% (41/418) to the category O (post-translational modification, protein turnover, chaperones) using COG; (ii) 21.45% (151/704) belongs to the categories J, and 13.92% (98/704) to the O using KOG. The phylogenomic analysis showed four well-supported clades for Eukarya, discriminating Multicellular [(i) human, fly, plant and worm] and Unicellular [(ii) yeast, (iii) fungi, and (iv) protozoa] species. These encouraging results attest to the usefulness of the profile-based methodology for comparative genomics to accelerate semi-automatic re-annotation, especially of the protozoan proteomes. This approach may also lend itself for applications in global health, for example, in the case of novel drug target discovery against pathogenic organisms previously considered difficult to research with traditional drug discovery tools. PMID:24960463

  12. SNUGB: a versatile genome browser supporting comparative and functional fungal genomics

    Directory of Open Access Journals (Sweden)

    Kim Seungill

    2008-12-01

    Full Text Available Abstract Background Since the full genome sequences of Saccharomyces cerevisiae were released in 1996, genome sequences of over 90 fungal species have become publicly available. The heterogeneous formats of genome sequences archived in different sequencing centers hampered the integration of the data for efficient and comprehensive comparative analyses. The Comparative Fungal Genomics Platform (CFGP was developed to archive these data via a single standardized format that can support multifaceted and integrated analyses of the data. To facilitate efficient data visualization and utilization within and across species based on the architecture of CFGP and associated databases, a new genome browser was needed. Results The Seoul National University Genome Browser (SNUGB integrates various types of genomic information derived from 98 fungal/oomycete (137 datasets and 34 plant and animal (38 datasets species, graphically presents germane features and properties of each genome, and supports comparison between genomes. The SNUGB provides three different forms of the data presentation interface, including diagram, table, and text, and six different display options to support visualization and utilization of the stored information. Information for individual species can be quickly accessed via a new tool named the taxonomy browser. In addition, SNUGB offers four useful data annotation/analysis functions, including 'BLAST annotation.' The modular design of SNUGB makes its adoption to support other comparative genomic platforms easy and facilitates continuous expansion. Conclusion The SNUGB serves as a powerful platform supporting comparative and functional genomics within the fungal kingdom and also across other kingdoms. All data and functions are available at the web site http://genomebrowser.snu.ac.kr/.

  13. Genomic alterations detected by comparative genomic hybridization in ovarian endometriomas

    Directory of Open Access Journals (Sweden)

    L.C. Veiga-Castelli

    2010-08-01

    Full Text Available Endometriosis is a complex and multifactorial disease. Chromosomal imbalance screening in endometriotic tissue can be used to detect hot-spot regions in the search for a possible genetic marker for endometriosis. The objective of the present study was to detect chromosomal imbalances by comparative genomic hybridization (CGH in ectopic tissue samples from ovarian endometriomas and eutopic tissue from the same patients. We evaluated 10 ovarian endometriotic tissues and 10 eutopic endometrial tissues by metaphase CGH. CGH was prepared with normal and test DNA enzymatically digested, ligated to adaptors and amplified by PCR. A second PCR was performed for DNA labeling. Equal amounts of both normal and test-labeled DNA were hybridized in human normal metaphases. The Isis FISH Imaging System V 5.0 software was used for chromosome analysis. In both eutopic and ectopic groups, 4/10 samples presented chromosomal alterations, mainly chromosomal gains. CGH identified 11q12.3-q13.1, 17p11.1-p12, 17q25.3-qter, and 19p as critical regions. Genomic imbalances in 11q, 17p, 17q, and 19p were detected in normal eutopic and/or ectopic endometrium from women with ovarian endometriosis. These regions contain genes such as POLR2G, MXRA7 and UBA52 involved in biological processes that may lead to the establishment and maintenance of endometriotic implants. This genomic imbalance may affect genes in which dysregulation impacts both eutopic and ectopic endometrium.

  14. Comparative genomics of brain size evolution

    OpenAIRE

    Enard, Wolfgang

    2014-01-01

    Which genetic changes took place during mammalian, primate and human evolution to build a larger brain? To answer this question, one has to correlate genetic changes with brain size changes across a phylogeny. Such a comparative genomics approach provides unique information to better understand brain evolution and brain development. However, its statistical power is limited for example due to the limited number of species, the presumably complex genetics of brain size evolution and the large ...

  15. Comparative genomics of brain size evolution

    OpenAIRE

    2014-01-01

    Which genetic changes took place during mammalian, primate and human evolution to build a larger brain? To answer this question, one has to correlate genetic changes with brain size changes across a phylogeny. Such a comparative genomics approach provides unique information to better understand brain evolution and brain development. However, its statistical power is limited for example due to the limited number of species, the presumably complex genetics of brain size evolution and the large ...

  16. Genomic selection models double the accuracy of predicted breeding values for bacterial cold water disease resistance compared to a traditional pedigree-based model in rainbow trout aquaculture.

    Science.gov (United States)

    Vallejo, Roger L; Leeds, Timothy D; Gao, Guangtu; Parsons, James E; Martin, Kyle E; Evenhuis, Jason P; Fragomeni, Breno O; Wiens, Gregory D; Palti, Yniv

    2017-02-01

    Previously, we have shown that bacterial cold water disease (BCWD) resistance in rainbow trout can be improved using traditional family-based selection, but progress has been limited to exploiting only between-family genetic variation. Genomic selection (GS) is a new alternative that enables exploitation of within-family genetic variation. We compared three GS models [single-step genomic best linear unbiased prediction (ssGBLUP), weighted ssGBLUP (wssGBLUP), and BayesB] to predict genomic-enabled breeding values (GEBV) for BCWD resistance in a commercial rainbow trout population, and compared the accuracy of GEBV to traditional estimates of breeding values (EBV) from a pedigree-based BLUP (P-BLUP) model. We also assessed the impact of sampling design on the accuracy of GEBV predictions. For these comparisons, we used BCWD survival phenotypes recorded on 7893 fish from 102 families, of which 1473 fish from 50 families had genotypes [57 K single nucleotide polymorphism (SNP) array]. Naïve siblings of the training fish (n = 930 testing fish) were genotyped to predict their GEBV and mated to produce 138 progeny testing families. In the following generation, 9968 progeny were phenotyped to empirically assess the accuracy of GEBV predictions made on their non-phenotyped parents. The accuracy of GEBV from all tested GS models were substantially higher than the P-BLUP model EBV. The highest increase in accuracy relative to the P-BLUP model was achieved with BayesB (97.2 to 108.8%), followed by wssGBLUP at iteration 2 (94.4 to 97.1%) and 3 (88.9 to 91.2%) and ssGBLUP (83.3 to 85.3%). Reducing the training sample size to n = ~1000 had no negative impact on the accuracy (0.67 to 0.72), but with n = ~500 the accuracy dropped to 0.53 to 0.61 if the training and testing fish were full-sibs, and even substantially lower, to 0.22 to 0.25, when they were not full-sibs. Using progeny performance data, we showed that the accuracy of genomic predictions is substantially higher

  17. Genomic and comparative genomic analyses of Rickettsia heilongjiangensis provide insight into its evolution and pathogenesis.

    Science.gov (United States)

    Duan, Changsong; Xiong, Xiaolu; Qi, Yong; Gong, Wenping; Jiao, Jun; Wen, Bohai

    2014-08-01

    Rickettsia heilongjiangensis, the causative agent of far eastern spotted fever, is an obligate intracellular gram-negative bacterium that belongs to the spotted fever group rickettsiae. To understand the evolution and pathogenesis of R. heilongjiangensis, we analyzed its genome and compared it with other rickettsial genomes available in GenBank. The R. heilongjiangensis chromosome contains 1333 genes, including 1297 protein coding genes and 36 RNA coding genes. The genome also contains 121 pseudogenes, 54 insertion sequences, and 39 tandem repeats. Sixteen genes encoding the major components of the type IV secretion systems were identified in the R. heilongjiangensis genome. In total, 37 β-barrel outer membrane proteins were predicted in the genome, eight of which have been previously confirmed to be outer membrane proteins. In addition, 266 potential virulence factor genes, seven partially deleted antibiotic resistance genes, and a genomic island were identified in the genome. The codon usage in the genome is compatible with its low GC content, and the amino acid usage shows apparent bias. A comparative genomic analysis showed that R. heilongjiangensis and R. japonica share one unique fragment that may be a target sequence for a diagnostic assay. The orthologs of 37 genes of R. heilongjiangensis were found in pathogenic R. rickettsii str. Sheila Smith but not in non-pathogenic R. rickettsii str. Iowa, which may explain why R. heilongjiangensis is pathogenic. Pan-genome analysis showed that R. heilongjiangensis and 42 other rickettsiae strains share 693 core genes with a pan-genome size of 4837 genes. The pan-genome-based phylogeny showed that R. heilongjiangensis was closely related to R. japonica.

  18. Lactobacillus paracasei comparative genomics: towards species pan-genome definition and exploitation of diversity.

    Science.gov (United States)

    Smokvina, Tamara; Wels, Michiel; Polka, Justyna; Chervaux, Christian; Brisse, Sylvain; Boekhorst, Jos; van Hylckama Vlieg, Johan E T; Siezen, Roland J

    2013-01-01

    Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its "pan-genome". We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800-3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon) are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25-53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison was used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis, in order to link

  19. A combinatorial approach of comprehensive QTL-based comparative genome mapping and transcript profiling identified a seed weight-regulating candidate gene in chickpea

    Science.gov (United States)

    Bajaj, Deepak; Upadhyaya, Hari D.; Khan, Yusuf; Das, Shouvik; Badoni, Saurabh; Shree, Tanima; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. L.; Singh, Sube; Sharma, Shivali; Tyagi, Akhilesh K.; Chattopdhyay, Debasis; Parida, Swarup K.

    2015-01-01

    High experimental validation/genotyping success rate (94–96%) and intra-specific polymorphic potential (82–96%) of 1536 SNP and 472 SSR markers showing in silico polymorphism between desi ICC 4958 and kabuli ICC 12968 chickpea was obtained in a 190 mapping population (ICC 4958 × ICC 12968) and 92 diverse desi and kabuli genotypes. A high-density 2001 marker-based intra-specific genetic linkage map comprising of eight LGs constructed is comparatively much saturated (mean map-density: 0.94 cM) in contrast to existing intra-specific genetic maps in chickpea. Fifteen robust QTLs (PVE: 8.8–25.8% with LOD: 7.0–13.8) associated with pod and seed number/plant (PN and SN) and 100 seed weight (SW) were identified and mapped on 10 major genomic regions of eight LGs. One of 126.8 kb major genomic region harbouring a strong SW-associated robust QTL (Caq'SW1.1: 169.1–171.3 cM) has been delineated by integrating high-resolution QTL mapping with comprehensive marker-based comparative genome mapping and differential expression profiling. This identified one potential regulatory SNP (G/A) in the cis-acting element of candidate ERF (ethylene responsive factor) TF (transcription factor) gene governing seed weight in chickpea. The functionally relevant molecular tags identified have potential to be utilized for marker-assisted genetic improvement of chickpea. PMID:25786576

  20. Comparative Genomic and Phylogenomic Analyses Reveal a Conserved Core Genome Shared by Estuarine and Oceanic Cyanopodoviruses

    Science.gov (United States)

    Huang, Sijun; Zhang, Si; Jiao, Nianzhi; Chen, Feng

    2015-01-01

    Podoviruses are among the major viral groups that infect marine picocyanobacteria Prochlorococcus and Synechococcus. Here, we reported the genome sequences of five Synechococcus podoviruses isolated from the estuarine environment, and performed comparative genomic and phylogenomic analyses based on a total of 20 cyanopodovirus genomes. The genomes of all the known marine cyanopodoviruses are highly syntenic. A pan-genome of 349 clustered orthologous groups was determined, among which 15 were core genes. These core genes make up nearly half of each genome in length, reflecting the high level of genome conservation among this cyanophage type. The whole genome phylogenies based on concatenated core genes and gene content were highly consistent and confirmed the separation of two discrete marine cyanopodovirus clusters MPP-A and MPP-B. The genomes within cluster MPP-B grouped into subclusters mainly corresponding to Prochlorococcus or Synechococcus host types. Auxiliary metabolic genes tend to occur in a specific phylogenetic group of these cyanopodoviruses. All the MPP-B phages analyzed here encode the photosynthesis gene psbA, which are absent in all the MPP-A genomes thus far. Interestingly, all the MPP-B and two MPP-A Synechococcus podoviruses encode the thymidylate synthase gene thyX, while at the same genome locus all the MPP-B Prochlorococcus podoviruses encode the transaldolase gene talC. Both genes are hypothesized to have the potential to facilitate the biosynthesis of deoxynucleotide for phage replication. Inheritance of specific functional genes could be important to the evolution and ecological fitness of certain cyanophage genotypes. Our analyses demonstrate that cyanopodoviruses of estuarine and oceanic origins share a conserved core genome and suggest that accessory genes may be related to environmental adaptation. PMID:26569403

  1. Comparative Genomic and Phylogenomic Analyses Reveal a Conserved Core Genome Shared by Estuarine and Oceanic Cyanopodoviruses.

    Directory of Open Access Journals (Sweden)

    Sijun Huang

    Full Text Available Podoviruses are among the major viral groups that infect marine picocyanobacteria Prochlorococcus and Synechococcus. Here, we reported the genome sequences of five Synechococcus podoviruses isolated from the estuarine environment, and performed comparative genomic and phylogenomic analyses based on a total of 20 cyanopodovirus genomes. The genomes of all the known marine cyanopodoviruses are highly syntenic. A pan-genome of 349 clustered orthologous groups was determined, among which 15 were core genes. These core genes make up nearly half of each genome in length, reflecting the high level of genome conservation among this cyanophage type. The whole genome phylogenies based on concatenated core genes and gene content were highly consistent and confirmed the separation of two discrete marine cyanopodovirus clusters MPP-A and MPP-B. The genomes within cluster MPP-B grouped into subclusters mainly corresponding to Prochlorococcus or Synechococcus host types. Auxiliary metabolic genes tend to occur in a specific phylogenetic group of these cyanopodoviruses. All the MPP-B phages analyzed here encode the photosynthesis gene psbA, which are absent in all the MPP-A genomes thus far. Interestingly, all the MPP-B and two MPP-A Synechococcus podoviruses encode the thymidylate synthase gene thyX, while at the same genome locus all the MPP-B Prochlorococcus podoviruses encode the transaldolase gene talC. Both genes are hypothesized to have the potential to facilitate the biosynthesis of deoxynucleotide for phage replication. Inheritance of specific functional genes could be important to the evolution and ecological fitness of certain cyanophage genotypes. Our analyses demonstrate that cyanopodoviruses of estuarine and oceanic origins share a conserved core genome and suggest that accessory genes may be related to environmental adaptation.

  2. Comparative genomics of toxigenic and non-toxigenic Staphylococcus hyicus

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Pamp, Sünje Johanna; Andresen, Lars Ole;

    2016-01-01

    The most common causative agent of exudative epidermitis (EE) in pigs is Staphylococcus hyicus. S. hyicus can be grouped into toxigenic and non-toxigenic strains based on their ability to cause EE in pigs and specific virulence genes have been identified. A genome wide comparison between non......-toxigenic and toxigenic strains has never been performed. In this study, we sequenced eleven toxigenic and six non-toxigenic S. hyicus strains and performed comparative genomic and phylogenetic analysis. Our analyses revealed two genomic regions encoding genes that were predominantly found in toxigenic strains...... (polymorphic toxin) and was associated with the gene encoding ExhA. A clear differentiation between toxigenic and non-toxigenic strains based on genomic and phylogenetic analyses was not apparent. The results of this study support the observation that exfoliative toxins of S. hyicus and S. aureus are located...

  3. PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Wasnick Michael

    2008-03-01

    Full Text Available Abstract Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any

  4. Determining and comparing protein function in Bacterial genome sequences

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla

    annotation of genes – the descriptions assigned to genes that describe the likely function of the encoded proteins. This process is limited by several factors, including the definition of a function which can be more or less specific as well as how many genes can actually be assigned a function based...... of this class have very little homology to other known genomes making functional annotation based on sequence similarity very difficult. Inspired in part by this analysis, an approach for comparative functional annotation was created based public sequenced genomes, CMGfunc. Functionally related groups...

  5. Reference Based Genome Compression

    CERN Document Server

    Chern, Bobbie; Manolakos, Alexandros; No, Albert; Venkat, Kartik; Weissman, Tsachy

    2012-01-01

    DNA sequencing technology has advanced to a point where storage is becoming the central bottleneck in the acquisition and mining of more data. Large amounts of data are vital for genomics research, and generic compression tools, while viable, cannot offer the same savings as approaches tuned to inherent biological properties. We propose an algorithm to compress a target genome given a known reference genome. The proposed algorithm first generates a mapping from the reference to the target genome, and then compresses this mapping with an entropy coder. As an illustration of the performance: applying our algorithm to James Watson's genome with hg18 as a reference, we are able to reduce the 2991 megabyte (MB) genome down to 6.99 MB, while Gzip compresses it to 834.8 MB.

  6. Reference Based Genome Compression

    OpenAIRE

    Chern, Bobbie; Ochoa, Idoia; Manolakos, Alexandros; No, Albert; Venkat, Kartik; Weissman, Tsachy

    2012-01-01

    DNA sequencing technology has advanced to a point where storage is becoming the central bottleneck in the acquisition and mining of more data. Large amounts of data are vital for genomics research, and generic compression tools, while viable, cannot offer the same savings as approaches tuned to inherent biological properties. We propose an algorithm to compress a target genome given a known reference genome. The proposed algorithm first generates a mapping from the reference to the target gen...

  7. Supervised Lowess normalization of comparative genome hybridization data - application to lactococcal strain comparisons

    NARCIS (Netherlands)

    van Hijum, Sacha A. F. T.; Baerends, Richard J. S.; Zomer, Aldert L.; Karsens, Harma A.; Martin-Requena, Victoria; Trelles, Oswaldo; Kok, Jan; Kuipers, Oscar P.

    2008-01-01

    Background: Array-based comparative genome hybridization (aCGH) is commonly used to determine the genomic content of bacterial strains. Since prokaryotes in general have less conserved genome sequences than eukaryotes, sequence divergences between the genes in the genomes used for an aCGH experiment

  8. Comparative genomics and phylogenetic analysis of S. dysenteriae subgroup

    Institute of Scientific and Technical Information of China (English)

    YANG; E; BIN; Wen; PENG; Junping; ZHANG; Xiaobing; WANG; Ji

    2005-01-01

    Genomic compositions of representatives of thirteen S. Dysenteriae serotypes were investigated by performing comparative genomic hybridization (CGH) with microarray containing the whole genomic ORFs (open reading frames, ORFs) of E. Coli K12 strain MG1655 and specific ORFs of S. Dysenteriae A1 strain Sd51197. The CGH results indicated the genomes of the serotypes contain 2654 conserved ORFs originating from E. Coli. However, 219 intrinsic genes of E. Coli including those prophage genes, molecular chaperones, synthesis of specific O antigen and so on were absent. Moreover, some specific genes such as type II secretion system associated components, iron transport related genes and some others as well were acquired through horizontal transfer. According to phylogenic trees based on genetic composition, it was demonstrated that A1, A2, A8, A10 were distinct from the other S. Dysenteriae serotypes. Our results in this report may provide new insights into the physiological process, pathogenicity and evolution of S. Dysenteriae.

  9. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    Science.gov (United States)

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.

  10. Comparative genomics using data mining tools

    Indian Academy of Sciences (India)

    Tannistha Nandi; Chandrika B-Rao; Srinivasan Ramachandran

    2002-02-01

    We have analysed the genomes of representatives of three kingdoms of life, namely, archaea, eubacteria and eukaryota using data mining tools based on compositional analyses of the protein sequences. The representatives chosen in this analysis were Methanococcus jannaschii, Haemophilus influenzae and Saccharomyces cerevisiae. We have identified the common and different features between the three genomes in the protein evolution patterns. M. jannaschii has been seen to have a greater number of proteins with more charged amino acids whereas S. cerevisiae has been observed to have a greater number of hydrophilic proteins. Despite the differences in intrinsic compositional characteristics between the proteins from the different genomes we have also identified certain common characteristics. We have carried out exploratory Principal Component Analysis of the multivariate data on the proteins of each organism in an effort to classify the proteins into clusters. Interestingly, we found that most of the proteins in each organism cluster closely together, but there are a few ‘outliers’. We focus on the outliers for the functional investigations, which may aid in revealing any unique features of the biology of the respective organisms.

  11. Comparative Genomics of Ten Solanaceous Plastomes

    Directory of Open Access Journals (Sweden)

    Harpreet Kaur

    2014-01-01

    Full Text Available Availability of complete plastid genomes of ten solanaceous species, Atropa belladonna, Capsicum annuum, Datura stramonium, Nicotiana sylvestris, Nicotiana tabacum, Nicotiana tomentosiformis, Nicotiana undulata, Solanum bulbocastanum, Solanum lycopersicum, and Solanum tuberosum provided us with an opportunity to conduct their in silico comparative analysis in depth. The size of complete chloroplast genomes and LSC and SSC regions of three species of Solanum is comparatively smaller than that of any other species studied till date (exception: SSC region of A. belladonna. AT content of coding regions was found to be less than noncoding regions. A duplicate copy of trnH gene in C. annuum and two alternative tRNA genes for proline in D. stramonium were observed for the first time in this analysis. Further, homology search revealed the presence of rps19 pseudogene and infA genes in A. belladonna and D. stramonium, a region identical to rps19 pseudogene in C. annum and orthologues of sprA gene in another six species. Among the eighteen intron-containing genes, 3 genes have two introns and 15 genes have one intron. The longest insertion was found in accD gene in C. annuum. Phylogenetic analysis using concatenated protein coding sequences gave two clades, one for Nicotiana species and another for Solanum, Capsicum, Atropa, and Datura.

  12. [Sotos syndrome diagnosed by comparative genomic hybridisation].

    Science.gov (United States)

    Saldarriaga, Wilmar; Molina-Barrera, Laura Camila; Ramírez-Cheyne, Julián

    2016-01-01

    Sotos Syndrome (SS) is a genetic disease with an autosomal dominant pattern caused by haplo-insufficiency of NSD1 gene secondary to point mutations or microdeletion of the 5q35 locus where the gene is located. It is a rare syndrome, occurring in 7 out of every 100,000 births. The objective of this report is to present the case of a 4 year-old patient with a global developmental delay, as well as specific physical findings suggesting a syndrome of genetic origin. Female patient, 4 years of age, thinning hair, triangular facie, long palpebral fissure, arched palate, prominent jaw, winged scapula and clinodactilia of the fifth finger both hands. The molecular test comparative genomic hybridisation test by microarray was subsequently performed, with the result showing 5q35.2 q35.3 region microdeletion of 2,082 MB, including the NSD1 gene. Finally, this article also proposes the performing of comparative genomic hybridisation as the first diagnostic option in cases where clinical findings are suggestive of SS. Copyright © 2015 Sociedad Chilena de Pediatría. Publicado por Elsevier España, S.L.U. All rights reserved.

  13. Comparative Genome Analysis in the Integrated Microbial Genomes(IMG) System

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effectiveexploration of a rapidly growing number of complete and draft sequencesfor microbial genomes. The Integrated Microbial Genomes (IMG) system(img.jgi.doe.gov) has been developed as a community resource thatprovides support for comparative analysis of microbial genomes in anintegrated context. IMG allows users to navigate the multidimensionalmicrobial genome data space and focus their analysis on a subset ofgenes, genomes, and functions of interest. IMG provides graphicalviewers, summaries and occurrence profile tools for comparing genes,pathways and functions (terms) across specific genomes. Genes can befurther examined using gene neighborhoods and compared with sequencealignment tools.

  14. Comparative genomics and proteomics of 13 Porphyromonas gingivalis strains

    Directory of Open Access Journals (Sweden)

    Tsute Chen

    2015-09-01

    Full Text Available At the current time, genome sequences of a total of 13 Porphyromonas gingivalis strains are available, including five completed genomes (strains ATCC 33277, HG66, TDC60, JCVISC001, and W83 and eight high-coverage draft sequences (F0185, F0566, F0568, F0569, F0570, SJD2, W4087, and W50 that are assembled into fewer than 300 contigs. This study compared these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. There are four copies of 16S rRNA gene sequences in each of the strains of ATCC 33277, HG66, TDC60, and W83 and one copy in the other nine genomes. These 25 16S rRNA sequences represent only 13 unique sequences. The five copies in W83 and W50 are identical and the three copies in HG66 are identical to the four copies in ATCC 33277, suggesting close evolutionary lineage between W83 and W50, as well as HG66 and ATCC 33277. Genome-wide comparison based on “Rapid Annotation using Subsystem Technology” (RAST also showed that for the overall biological functions of the genomes, W83 is closer to W50, and HG66 to ATCC33277, than to other genomes. The comparison of the RAST subsystems identified biological functions that are unique to individual, shared by some, or by all genomes. Functions unique to individual genomes include: a tetracycline resistance protein TetQ, DNA metabolism gene YcfH, and DNA repair gene exonuclease SbcC (only in SJD2; very-short-patch mismatch repair endonuclease and a phage packaging terminase similar to Bacteroides phage B124-14 (in W4087; an internalin similar to a Listeria surface virulence protein (W83; a Type I restriction-modification system (F0569; an iron acquisition/heme transport protein (F0566; colicin I receptor and carbamoylputrescine amidase (W50; L-serine dehydratase (TDC60; and spermidine synthase and ribokinase (JCVISC001. The results also identified biological functions that are missing in individual or several genomes. For

  15. Comparative Whole-Genome Mapping To Determine Staphylococcus aureus Genome Size, Virulence Motifs, and Clonality

    Science.gov (United States)

    Pantrang, Madhulatha; Stahl, Buffy; Briska, Adam M.; Stemper, Mary E.; Wagner, Trevor K.; Zentz, Emily B.; Callister, Steven M.; Lovrich, Steven D.; Henkhaus, John K.; Dykes, Colin W.

    2012-01-01

    Despite being a clonal pathogen, Staphylococcus aureus continues to acquire virulence and antibiotic-resistant genes located on mobile genetic elements such as genomic islands, prophages, pathogenicity islands, and the staphylococcal chromosomal cassette mec (SCCmec) by horizontal gene transfer from other staphylococci. The potential virulence of a S. aureus strain is often determined by comparing its pulsed-field gel electrophoresis (PFGE) or multilocus sequence typing profiles to that of known epidemic or virulent clones and by PCR of the toxin genes. Whole-genome mapping (formerly optical mapping), which is a high-resolution ordered restriction mapping of a bacterial genome, is a relatively new genomic tool that allows comparative analysis across entire bacterial genomes to identify regions of genomic similarities and dissimilarities, including small and large insertions and deletions. We explored whether whole-genome maps (WGMs) of methicillin-resistant S. aureus (MRSA) could be used to predict the presence of methicillin resistance, SCCmec type, and Panton-Valentine leukocidin (PVL)-producing genes on an S. aureus genome. We determined the WGMs of 47 diverse clinical isolates of S. aureus, including well-characterized reference MRSA strains, and annotated the signature restriction pattern in SCCmec types, arginine catabolic mobile element (ACME), and PVL-carrying prophage, PhiSa2 or PhiSa2-like regions on the genome. WGMs of these isolates accurately characterized them as MRSA or methicillin-sensitive S. aureus based on the presence or absence of the SCCmec motif, ACME and the unique signature pattern for the prophage insertion that harbored the PVL genes. Susceptibility to methicillin resistance and the presence of mecA, SCCmec types, and PVL genes were confirmed by PCR. A WGM clustering approach was further able to discriminate isolates within the same PFGE clonal group. These results showed that WGMs could be used not only to genotype S. aureus but also to

  16. Comparative genomics reveals insights into avian genome evolution and adaptation

    DEFF Research Database (Denmark)

    Zhang, Guojie; Li, Cai; Li, Qiye

    2014-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, ...

  17. Approaches for Comparative Genomics in Aspergillus and Penicillium

    DEFF Research Database (Denmark)

    Rasmussen, Jane Lind Nybo; Theobald, Sebastian; Brandl, Julian

    2016-01-01

    of comparative genomics, ranging from analysis of single genes, over gene clusters and CaZymes to genome-scale comparative genomics. Furthermore, we have examined published comparative genomics papers to summarize the preferred bioinformatic methods and parameters for a given type of analysis, highly useful...... for new fungal geneticists. Moreover, the chapter contains a detailed overview of comparative genomics studies of key fungal traits such as primary metabolism, secondary metabolism, and secretome analysis. Finally, we gaze into a possible future of the field by comparing the current state of fungal......The number of available genomes in the closely related fungal genera Aspergillus and Penicillium is rapidly increasing. At the time of writing, the genomes of 62 species are available, and an even higher number is being prepared. Fungal comparative genomics is thus becoming steadily more powerful...

  18. Comparative genomics of emerging human ehrlichiosis agents.

    Directory of Open Access Journals (Sweden)

    Julie C Dunning Hotopp

    2006-02-01

    Full Text Available Anaplasma (formerly Ehrlichia phagocytophilum, Ehrlichia chaffeensis, and Neorickettsia (formerly Ehrlichia sennetsu are intracellular vector-borne pathogens that cause human ehrlichiosis, an emerging infectious disease. We present the complete genome sequences of these organisms along with comparisons to other organisms in the Rickettsiales order. Ehrlichia spp. and Anaplasma spp. display a unique large expansion of immunodominant outer membrane proteins facilitating antigenic variation. All Rickettsiales have a diminished ability to synthesize amino acids compared to their closest free-living relatives. Unlike members of the Rickettsiaceae family, these pathogenic Anaplasmataceae are capable of making all major vitamins, cofactors, and nucleotides, which could confer a beneficial role in the invertebrate vector or the vertebrate host. Further analysis identified proteins potentially involved in vacuole confinement of the Anaplasmataceae, a life cycle involving a hematophagous vector, vertebrate pathogenesis, human pathogenesis, and lack of transovarial transmission. These discoveries provide significant insights into the biology of these obligate intracellular pathogens.

  19. Array-based comparative genomic hybridization is more informative than conventional karyotyping and fluorescence in situ hybridization in the analysis of first-trimester spontaneous abortion

    Directory of Open Access Journals (Sweden)

    Gao Jinsong

    2012-07-01

    Full Text Available Abstract Background Array-based comparative genomic hybridization (aCGH is a new technique for detecting submicroscopic deletions and duplications, and can overcome many of the limitations associated with classic cytogenetic analysis. However, its clinical use in spontaneous abortion needs comprehensive evaluation. We used aCGH to investigate chromosomal imbalances in 100 spontaneous abortions and compared the results with G-banding karyotyping and fluorescence in situ hybridization (FISH. Inconsistent results were verified by quantitative fluorescence PCR. Results Abnormalities were detected in 61 cases. aCGH achieved the highest detection rate (93.4%, 57/61 compared with traditional karyotyping (77%, 47/61 and FISH analysis (68.9%, 42/61. aCGH identified all chromosome abnormalities reported by traditional karyotyping and interphase FISH analysis, with the exception of four triploids. It also detected three additional aneuploidy cases in 37 specimens with ‘normal’ karyotypes, one mosaicism and 10 abnormalities in 14 specimens that failed to grow in vitro. Conclusions aCGH analysis circumvents many limitations in traditional karyotyping or FISH. The accuracy and efficiency of aCGH in spontaneous abortions highlights its clinical usefulness for the future. As aborted tissues have the potential to be contaminated with maternal cells, the threshold value of detection in aCGH should be lowered to avoid false negatives.

  20. Whole genome comparative studies between chicken and turkey and their implications for avian genome evolution

    Directory of Open Access Journals (Sweden)

    Carré Wilfrid

    2008-04-01

    Full Text Available Abstract Background Comparative genomics is a powerful means of establishing inter-specific relationships between gene function/location and allows insight into genomic rearrangements, conservation and evolutionary phylogeny. The availability of the complete sequence of the chicken genome has initiated the development of detailed genomic information in other birds including turkey, an agriculturally important species where mapping has hitherto focused on linkage with limited physical information. No molecular study has yet examined conservation of avian microchromosomes, nor differences in copy number variants (CNVs between birds. Results We present a detailed comparative cytogenetic map between chicken and turkey based on reciprocal chromosome painting and mapping of 338 chicken BACs to turkey metaphases. Two inter-chromosomal changes (both involving centromeres and three pericentric inversions have been identified between chicken and turkey; and array CGH identified 16 inter-specific CNVs. Conclusion This is the first study to combine the modalities of zoo-FISH and array CGH between different avian species. The first insight into the conservation of microchromosomes, the first comparative cytogenetic map of any bird and the first appraisal of CNVs between birds is provided. Results suggest that avian genomes have remained relatively stable during evolution compared to mammalian equivalents.

  1. The bonobo genome compared with the chimpanzee and human genomes

    Science.gov (United States)

    Prüfer, Kay; Munch, Kasper; Hellmann, Ines; Akagi, Keiko; Miller, Jason R.; Walenz, Brian; Koren, Sergey; Sutton, Granger; Kodira, Chinnappa; Winer, Roger; Knight, James R.; Mullikin, James C.; Meader, Stephen J.; Ponting, Chris P.; Lunter, Gerton; Higashino, Saneyuki; Hobolth, Asger; Dutheil, Julien; Karakoç, Emre; Alkan, Can; Sajjadian, Saba; Catacchio, Claudia Rita; Ventura, Mario; Marques-Bonet, Tomas; Eichler, Evan E.; André, Claudine; Atencia, Rebeca; Mugisha, Lawrence; Junhold, Jörg; Patterson, Nick; Siebauer, Michael; Good, Jeffrey M.; Fischer, Anne; Ptak, Susan E.; Lachmann, Michael; Symer, David E.; Mailund, Thomas; Schierup, Mikkel H.; Andrés, Aida M.; Kelso, Janet; Pääbo, Svante

    2012-01-01

    Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours1–4, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other. PMID:22722832

  2. The bonobo genome compared with the chimpanzee and human genomes.

    Science.gov (United States)

    Prüfer, Kay; Munch, Kasper; Hellmann, Ines; Akagi, Keiko; Miller, Jason R; Walenz, Brian; Koren, Sergey; Sutton, Granger; Kodira, Chinnappa; Winer, Roger; Knight, James R; Mullikin, James C; Meader, Stephen J; Ponting, Chris P; Lunter, Gerton; Higashino, Saneyuki; Hobolth, Asger; Dutheil, Julien; Karakoç, Emre; Alkan, Can; Sajjadian, Saba; Catacchio, Claudia Rita; Ventura, Mario; Marques-Bonet, Tomas; Eichler, Evan E; André, Claudine; Atencia, Rebeca; Mugisha, Lawrence; Junhold, Jörg; Patterson, Nick; Siebauer, Michael; Good, Jeffrey M; Fischer, Anne; Ptak, Susan E; Lachmann, Michael; Symer, David E; Mailund, Thomas; Schierup, Mikkel H; Andrés, Aida M; Kelso, Janet; Pääbo, Svante

    2012-06-28

    Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other.

  3. Phylogeny and comparative genome analysis of a Basidiomycete fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor; Hibbett, David

    2011-03-14

    Fungi of the phylum Basidiomycota, make up some 37percent of the described fungi, and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a `core proteome? based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein families that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.

  4. A web server for mining Comparative Genomic Hybridization (CGH) data

    Science.gov (United States)

    Liu, Jun; Ranka, Sanjay; Kahveci, Tamer

    2007-11-01

    Advances in cytogenetics and molecular biology has established that chromosomal alterations are critical in the pathogenesis of human cancer. Recurrent chromosomal alterations provide cytological and molecular markers for the diagnosis and prognosis of disease. They also facilitate the identification of genes that are important in carcinogenesis, which in the future may help in the development of targeted therapy. A large amount of publicly available cancer genetic data is now available and it is growing. There is a need for public domain tools that allow users to analyze their data and visualize the results. This chapter describes a web based software tool that will allow researchers to analyze and visualize Comparative Genomic Hybridization (CGH) datasets. It employs novel data mining methodologies for clustering and classification of CGH datasets as well as algorithms for identifying important markers (small set of genomic intervals with aberrations) that are potentially cancer signatures. The developed software will help in understanding the relationships between genomic aberrations and cancer types.

  5. Lactobacillus paracasei comparative genomics: towards species pan-genome definition and exploitation of diversity.

    Directory of Open Access Journals (Sweden)

    Tamara Smokvina

    Full Text Available Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its "pan-genome". We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800-3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25-53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison was used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis

  6. Streptococcus thermophilus core genome: comparative genome hybridization study of 47 strains.

    Science.gov (United States)

    Rasmussen, Thomas Bovbjerg; Danielsen, Morten; Valina, Ondrej; Garrigues, Christel; Johansen, Eric; Pedersen, Martin Bastian

    2008-08-01

    A DNA microarray platform based on 2,200 genes from publicly available sequences was designed for Streptococcus thermophilus. We determined how single-nucleotide polymorphisms in the 65- to 75-mer oligonucleotide probe sequences affect the hybridization signals. The microarrays were then used for comparative genome hybridization (CGH) of 47 dairy S. thermophilus strains. An analysis of the exopolysaccharide genes in each strain confirmed previous findings that this class of genes is indeed highly variable. A phylogenetic tree based on the CGH data showed similar distances for most strains, indicating frequent recombination or gene transfer within S. thermophilus. By comparing genome sizes estimated from the microarrays and pulsed-field gel electrophoresis, the amount of unknown DNA in each strain was estimated. A core genome comprised of 1,271 genes detected in all 47 strains was identified. Likewise, a set of noncore genes detected in only some strains was identified. The concept of an industrial core genome is proposed. This is comprised of the genes in the core genome plus genes that are necessary in an applied industrial context.

  7. Genomic selection models double the accuracy of predicted breeding values for bacterial cold water disease resistance compared to a traditional pedigree-based model in rainbow trout aquaculture

    Science.gov (United States)

    Previously we have shown that bacterial cold water disease (BCWD) resistance in rainbow trout can be improved using traditional family-based selection, but progress has been limited to exploiting only between-family genetic variation. Genomic selection (GS) is a new alternative enabling exploitation...

  8. Genome-enabled selection doubles the accuracy of predicted breeding values for bacterial cold water disease resistance compared to traditional family-based selection in rainbow trout aquaculture

    Science.gov (United States)

    We have shown previously that bacterial cold water disease (BCWD) resistance in rainbow trout can be improved using traditional family-based selection, but progress has been limited to exploiting only between-family genetic variation. Genomic selection (GS) is a new alternative enabling exploitation...

  9. Circos: an information aesthetic for comparative genomics.

    Science.gov (United States)

    Krzywinski, Martin; Schein, Jacqueline; Birol, Inanç; Connors, Joseph; Gascoyne, Randy; Horsman, Doug; Jones, Steven J; Marra, Marco A

    2009-09-01

    We created a visualization tool called Circos to facilitate the identification and analysis of similarities and differences arising from comparisons of genomes. Our tool is effective in displaying variation in genome structure and, generally, any other kind of positional relationships between genomic intervals. Such data are routinely produced by sequence alignments, hybridization arrays, genome mapping, and genotyping studies. Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements. Circos is capable of displaying data as scatter, line, and histogram plots, heat maps, tiles, connectors, and text. Bitmap or vector images can be created from GFF-style data inputs and hierarchical configuration files, which can be easily generated by automated tools, making Circos suitable for rapid deployment in data analysis and reporting pipelines.

  10. Comparative analysis of the base compositions of the pre-mRNA 3' cleaved-off region and the mRNA 3' untranslated region relative to the genomic base composition in animals and plants.

    Science.gov (United States)

    Li, Xiu-Qing

    2014-01-01

    The precursor messenger RNA (pre-mRNA) three-prime cleaved-off region (3'COR) and the mRNA three-prime untranslated region (3'UTR) play critical roles in regulating gene expression. The differences in base composition between these regions and the corresponding genomes are still largely uncharacterized in animals and plants. In this study, the base compositions of non-redundant 3'CORs and 3'UTRs were compared with the corresponding whole genomes of eleven animals, four dicotyledonous plants, and three monocotyledonous (cereal) plants. Among the four bases (A, C, G, and U for adenine, cytosine, guanine, and uracil, respectively), U (which corresponds to T, for thymine, in DNA) was the most frequent, A the second most frequent, G the third most frequent, and C the least frequent in most of the species in both the 3'COR and 3'UTR regions. In comparison with the whole genomes, in both regions the U content was usually the most overrepresented (particularly in the monocotyledonous plants), and the C content was the most underrepresented. The order obtained for the species groups, when ranked from high to low according to the U contents in the 3'COR and 3'UTR was as follows: dicotyledonous plants, monocotyledonous plants, non-mammal animals, and mammals. In contrast, the genomic T content was highest in dicotyledonous plants, lowest in monocotyledonous plants, and intermediate in animals. These results suggest the following: 1) there is a mechanism operating in both animals and plants which is biased toward U and against C in the 3'COR and 3'UTR; 2) the 3'UTR and 3'COR, as functional units, minimized the difference between dicotyledonous and monocotyledonous plants, while the dicotyledonous and monocotyledonous genomes evolved into two extreme groups in terms of base composition.

  11. Comparative genomic hybridization in clinical cytogenetics

    Energy Technology Data Exchange (ETDEWEB)

    Bryndorf, T.; Kirchhoff, M.; Rose, H. [and others

    1995-11-01

    We report the results of applying comparative genomic hybridization (CGH) in a cytogenetic service laboratory for (1) determination of the origin of extra and missing chromosomal material in intricate cases of unbalanced aberrations and (2) detection of common prenatal numerical chromosome aberrations. A total of 11 fetal samples were analyzed. Seven cases of complex unbalanced aberrations that could not be identified reliably by conventional cytogenetics were successfully resolved by CGH analysis. CGH results were validated by using FISH with chromosome-specific probes. Four cases representing common prenatal numerical aberrations (trisomy 21, 18, and 13 and monosomy X) were also successfully diagnosed by CGH. We conclude that CGH is a powerful adjunct to traditional cytogenetic techniques that makes it possible to solve clinical cases of intricate unbalanced aberrations in a single hybridization. CGH may also be a useful adjunct to screen for euchromatic involvement in marker chromosomes. Further technical development may render CGH applicable for routine aberration screening. 16 refs., 4 figs., 2 tabs.

  12. Human and mouse genome analysis using array comparative genomic hybridization

    NARCIS (Netherlands)

    Snijders, Antoine Maria

    2004-01-01

    Almost all human cancers as well as developmental abnormalities are characterized by the presence of genetic alterations, most of which target a gene or a particular genomic locus resulting in altered gene expression and ultimately an altered phenotype. Different types of genetic alterations include

  13. DNA Microarrays in Comparative Genomics and Transcriptomics

    DEFF Research Database (Denmark)

    Willenbrock, Hanni

    2007-01-01

    of each method’s ability to analyze DNA copy number data. Moreover, our study shows that analysis methods developed for cancer research may also successfully be applied to DNA copy number profiles from bacterial genomes. However, here the purpose is to characterize variations in the gene content...... to verify predictions of highly expressed genes. Moreover, the codon bias of microbial genomes was found to constitute an environmental signature. For example, soil bacteria have very similar codon bias....

  14. Phytozome: a comparative platform for green plant genomics

    OpenAIRE

    Goodstein, David M.; Shu, Shengqiang; Howson, Russell; Neupane, Rochak; Hayes, Richard D.; Fazo, Joni; Mitros, Therese; Dirks, William; Hellsten, Uffe; Putnam, Nicholas ; Rokhsar, Daniel S.

    2011-01-01

    The number of sequenced plant genomes and associated genomic resources is growing rapidly with the advent of both an increased focus on plant genomics from funding agencies, and the application of inexpensive next generation sequencing. To interact with this increasing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level ...

  15. Comparative genomic hybridization: Detection of segmental aneusomies

    Energy Technology Data Exchange (ETDEWEB)

    Cronin, J.E.; Magrane, G.G.; Gray, J.W. [Univ. of California, San Francisco, CA (United States)] [and others

    1994-09-01

    Comparative genomic hybridization (CGH) has been used successfully to detect whole chromosome and segmental aneusomies. However, its sensitivity for detection of segmental aneusomies is still not well known. We present here an analysis of CGH sensitivity with emphasis on detection of abnormalities commonly found during pre-and neo-natal diagnosis. CGH is performed by hybridizing green and red fluorescing test and normal DNA samples, respectively, to normal metaphase spreads and measuring green:red fluorescence ratios along all chromosomes. The ratios are normalized such that 2 copies of a normal chromosome region in the test sample gives a ratio of 1.0. Alterations in test vs. control gene copy number range from 1.5 [trisomy] to 0.5 [monosomy]. Clinical samples analyzed included Wolf Hirschhorn (4p-), Cri du Chat (5p-) and DiGeorge (22q-). In addition, 7 cell lines with chromosome 21 segmental aneusomies were analyzed. These included 3 with terminal duplications, 1 with a terminal deletion, 1 with an interstitial deletion and 2 with interstitial amplifications. The DiGeorge deletion was the only deletion not deleted by CGH. This is not surprising as standard G banding does not routinely detect this 1-2 megabase deletion. The 4p- and 5p- monosomies were detected and breakpoints correctly assigned prospectively. Proximal alterations involving 21q22.11 are unambiguously defined. Specifically, two interstitial aneusomies involving this region are detected. Studies involving late prophase chromosome normal spreads gave identical breakpoints. Thus, analysis of extended chromosomes did not improve the sensitivity of the technique. Taken together, these data suggest that CGH can detect segmental aneusomies greater than 8 megabases in extent. Smaller aneusomies can, at times, be detected. Work is now underway to modify the analysis software to increase sensitivity and to decrease the amount of material needed for analysis.

  16. Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper

    LENUS (Irish Health Repository)

    Potnis, Neha

    2011-03-11

    Abstract Background Bacterial spot of tomato and pepper is caused by four Xanthomonas species and is a major plant disease in warm humid climates. The four species are distinct from each other based on physiological and molecular characteristics. The genome sequence of strain 85-10, a member of one of the species, Xanthomonas euvesicatoria (Xcv) has been previously reported. To determine the relationship of the four species at the genome level and to investigate the molecular basis of their virulence and differing host ranges, draft genomic sequences of members of the other three species were determined and compared to strain 85-10. Results We sequenced the genomes of X. vesicatoria (Xv) strain 1111 (ATCC 35937), X. perforans (Xp) strain 91-118 and X. gardneri (Xg) strain 101 (ATCC 19865). The genomes were compared with each other and with the previously sequenced Xcv strain 85-10. In addition, the molecular features were predicted that may be required for pathogenicity including the type III secretion apparatus, type III effectors, other secretion systems, quorum sensing systems, adhesins, extracellular polysaccharide, and lipopolysaccharide determinants. Several novel type III effectors from Xg strain 101 and Xv strain 1111 genomes were computationally identified and their translocation was validated using a reporter gene assay. A homolog to Ax21, the elicitor of XA21-mediated resistance in rice, and a functional Ax21 sulfation system were identified in Xcv. Genes encoding proteins with functions mediated by type II and type IV secretion systems have also been compared, including enzymes involved in cell wall deconstruction, as contributors to pathogenicity. Conclusions Comparative genomic analyses revealed considerable diversity among bacterial spot pathogens, providing new insights into differences and similarities that may explain the diverse nature of these strains. Genes specific to pepper pathogens, such as the O-antigen of the lipopolysaccharide cluster

  17. Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper

    Directory of Open Access Journals (Sweden)

    Koebnik Ralf

    2011-03-01

    Full Text Available Abstract Background Bacterial spot of tomato and pepper is caused by four Xanthomonas species and is a major plant disease in warm humid climates. The four species are distinct from each other based on physiological and molecular characteristics. The genome sequence of strain 85-10, a member of one of the species, Xanthomonas euvesicatoria (Xcv has been previously reported. To determine the relationship of the four species at the genome level and to investigate the molecular basis of their virulence and differing host ranges, draft genomic sequences of members of the other three species were determined and compared to strain 85-10. Results We sequenced the genomes of X. vesicatoria (Xv strain 1111 (ATCC 35937, X. perforans (Xp strain 91-118 and X. gardneri (Xg strain 101 (ATCC 19865. The genomes were compared with each other and with the previously sequenced Xcv strain 85-10. In addition, the molecular features were predicted that may be required for pathogenicity including the type III secretion apparatus, type III effectors, other secretion systems, quorum sensing systems, adhesins, extracellular polysaccharide, and lipopolysaccharide determinants. Several novel type III effectors from Xg strain 101 and Xv strain 1111 genomes were computationally identified and their translocation was validated using a reporter gene assay. A homolog to Ax21, the elicitor of XA21-mediated resistance in rice, and a functional Ax21 sulfation system were identified in Xcv. Genes encoding proteins with functions mediated by type II and type IV secretion systems have also been compared, including enzymes involved in cell wall deconstruction, as contributors to pathogenicity. Conclusions Comparative genomic analyses revealed considerable diversity among bacterial spot pathogens, providing new insights into differences and similarities that may explain the diverse nature of these strains. Genes specific to pepper pathogens, such as the O-antigen of the

  18. Comparative genomics and genome biology of invasive Campylobacter jejuni.

    Science.gov (United States)

    Skarp, C P A; Akinrinade, O; Nilsson, A J E; Ellström, P; Myllykangas, S; Rautelin, H

    2015-11-25

    Campylobacter jejuni is a major pathogen in bacterial gastroenteritis worldwide and can cause bacteremia in severe cases. C. jejuni is highly structured into clonal lineages of which the ST677CC lineage has been overrepresented among C. jejuni isolates derived from blood. In this study, we characterized the genomes of 31 C. jejuni blood isolates and 24 faecal isolates belonging to ST677CC in order to study the genome biology related to C. jejuni invasiveness. We combined the genome analyses with phenotypical evidence on serum resistance which was associated with phase variation of wcbK; a GDP-mannose 4,6-dehydratase involved in capsular biosynthesis. We also describe the finding of a Type III restriction-modification system unique to the ST-794 sublineage. However, features previously considered to be related to pathogenesis of C. jejuni were either absent or disrupted among our strains. Our results refine the role of capsule features associated with invasive disease and accentuate the possibility of methylation and restriction enzymes in the potential of C. jejuni to establish invasive infections. Our findings underline the importance of studying clinically relevant well-characterized bacterial strains in order to understand pathogenesis mechanisms important in human infections.

  19. Comparative genomics of Cluster O mycobacteriophages.

    Directory of Open Access Journals (Sweden)

    Steven G Cresawn

    Full Text Available Mycobacteriophages--viruses of mycobacterial hosts--are genetically diverse but morphologically are all classified in the Caudovirales with double-stranded DNA and tails. We describe here a group of five closely related mycobacteriophages--Corndog, Catdawg, Dylan, Firecracker, and YungJamal--designated as Cluster O with long flexible tails but with unusual prolate capsids. Proteomic analysis of phage Corndog particles, Catdawg particles, and Corndog-infected cells confirms expression of half of the predicted gene products and indicates a non-canonical mechanism for translation of the Corndog tape measure protein. Bioinformatic analysis identifies 8-9 strongly predicted SigA promoters and all five Cluster O genomes contain more than 30 copies of a 17 bp repeat sequence with dyad symmetry located throughout the genomes. Comparison of the Cluster O phages provides insights into phage genome evolution including the processes of gene flux by horizontal genetic exchange.

  20. Comparative Omics-Driven Genome Annotation Refinement: Application across Yersiniae

    Energy Technology Data Exchange (ETDEWEB)

    Rutledge, Alexandra C.; Jones, Marcus B.; Chauhan, Sadhana; Purvine, Samuel O.; Sanford, James; Monroe, Matthew E.; Brewer, Heather M.; Payne, Samuel H.; Ansong, Charles; Frank, Bryan C.; Smith, Richard D.; Peterson, Scott; Motin, Vladimir L.; Adkins, Joshua N.

    2012-03-27

    Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. To date, the perceived value of manual curation for genome annotations is not offset by the real cost and time associated with the process. In order to balance the large number of sequences generated, the annotation process is now performed almost exclusively in an automated fashion for most genome sequencing projects. One possible way to reduce errors inherent to automated computational annotations is to apply data from 'omics' measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. This approach does require additional experimental and bioinformatics methods to include omics technologies; however, the approach is readily automatable and can benefit from rapid developments occurring in those research domains as well. The annotation process can be improved by experimental validation of transcription and translation and aid in the discovery of annotation errors. Here the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species, as is becoming common in sequencing efforts. Transcriptomic and proteomic data derived from three highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis pestoides F, and Y. pseudotuberculosis PB1/+) was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 previously incorrect protein-coding sequences (e.g., observed frameshifts, extended start sites, and translated pseudogenes) within the three current Yersinia genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent

  1. The Whole Genome Assembly and Comparative Genomic Research of Thellungiella parvula (Extremophile Crucifer Mitochondrion

    Directory of Open Access Journals (Sweden)

    Xuelin Wang

    2016-01-01

    Full Text Available The complete nucleotide sequences of the mitochondrial (mt genome of an extremophile species Thellungiella parvula (T. parvula have been determined with the lengths of 255,773 bp. T. parvula mt genome is a circular sequence and contains 32 protein-coding genes, 19 tRNA genes, and three ribosomal RNA genes with a 11.5% coding sequence. The base composition of 27.5% A, 27.5% T, 22.7% C, and 22.3% G in descending order shows a slight bias of 55% AT. Fifty-three repeats were identified in the mitochondrial genome of T. parvula, including 24 direct repeats, 28 tandem repeats (TRs, and one palindromic repeat. Furthermore, a total of 199 perfect microsatellites have been mined with a high A/T content (83.1% through simple sequence repeat (SSR analysis and they were distributed unevenly within this mitochondrial genome. We also analyzed other plant mitochondrial genomes’ evolution in general, providing clues for the understanding of the evolution of organelles genomes in plants. Comparing with other Brassicaceae species, T. parvula is related to Arabidopsis thaliana whose characters of low temperature resistance have been well documented. This study will provide important genetic tools for other Brassicaceae species research and improve yields of economically important plants.

  2. Comparative analysis of Acinetobacters: three genomes for three lifestyles.

    Directory of Open Access Journals (Sweden)

    David Vallenet

    Full Text Available Acinetobacter baumannii is the source of numerous nosocomial infections in humans and therefore deserves close attention as multidrug or even pandrug resistant strains are increasingly being identified worldwide. Here we report the comparison of two newly sequenced genomes of A. baumannii. The human isolate A. baumannii AYE is multidrug resistant whereas strain SDF, which was isolated from body lice, is antibiotic susceptible. As reference for comparison in this analysis, the genome of the soil-living bacterium A. baylyi strain ADP1 was used. The most interesting dissimilarities we observed were that i whereas strain AYE and A. baylyi genomes harbored very few Insertion Sequence elements which could promote expression of downstream genes, strain SDF sequence contains several hundred of them that have played a crucial role in its genome reduction (gene disruptions and simple DNA loss; ii strain SDF has low catabolic capacities compared to strain AYE. Interestingly, the latter has even higher catabolic capacities than A. baylyi which has already been reported as a very nutritionally versatile organism. This metabolic performance could explain the persistence of A. baumannii nosocomial strains in environments where nutrients are scarce; iii several processes known to play a key role during host infection (biofilm formation, iron uptake, quorum sensing, virulence factors were either different or absent, the best example of which is iron uptake. Indeed, strain AYE and A. baylyi use siderophore-based systems to scavenge iron from the environment whereas strain SDF uses an alternate system similar to the Haem Acquisition System (HAS. Taken together, all these observations suggest that the genome contents of the 3 Acinetobacters compared are partly shaped by life in distinct ecological niches: human (and more largely hospital environment, louse, soil.

  3. Sequencing and comparative analysis of the gorilla MHC genomic sequence.

    Science.gov (United States)

    Wilming, Laurens G; Hart, Elizabeth A; Coggill, Penny C; Horton, Roger; Gilbert, James G R; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L

    2013-01-01

    Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC.

  4. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics.

    Directory of Open Access Journals (Sweden)

    Lincoln D Stein

    2003-11-01

    Full Text Available The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp and C. elegans (100.3 Mbp genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C

  5. Comparative genomics of the Bifidobacterium breve taxon

    NARCIS (Netherlands)

    Bottacini, Francesca; O'Connell Motherway, Mary; Kuczynski, Justin; O'Connell, Kerry Joan; Serafini, Fausta; Duranti, Sabrina; Milani, Christian; Turroni, Francesca; Lugli, Gabriele Andrea; Zomer, Aldert; Zhurina, Daria; Riedel, Christian; Ventura, Marco; van Sinderen, Douwe

    2014-01-01

    BACKGROUND: Bifidobacteria are commonly found as part of the microbiota of the gastrointestinal tract (GIT) of a broad range of hosts, where their presence is positively correlated with the host's health status. In this study, we assessed the genomes of thirteen representatives of Bifidobacterium br

  6. Floral gene resources from basal angiosperms for comparative genomics research

    Directory of Open Access Journals (Sweden)

    Zhang Xiaohong

    2005-03-01

    Full Text Available Abstract Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04 generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii many known floral gene homologues have been captured, and (iii phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage

  7. Floral gene resources from basal angiosperms for comparative genomics research

    Science.gov (United States)

    Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; dePamphilis, Claude W; Leebens-Mack, James H

    2005-01-01

    Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and

  8. A process for analysis of microarray comparative genomics hybridisation studies for bacterial genomes

    Directory of Open Access Journals (Sweden)

    Woodward Martin J

    2008-01-01

    Full Text Available Abstract Background Microarray based comparative genomic hybridisation (CGH experiments have been used to study numerous biological problems including understanding genome plasticity in pathogenic bacteria. Typically such experiments produce large data sets that are difficult for biologists to handle. Although there are some programmes available for interpretation of bacterial transcriptomics data and CGH microarray data for looking at genetic stability in oncogenes, there are none specifically to understand the mosaic nature of bacterial genomes. Consequently a bottle neck still persists in accurate processing and mathematical analysis of these data. To address this shortfall we have produced a simple and robust CGH microarray data analysis process that may be automated in the future to understand bacterial genomic diversity. Results The process involves five steps: cleaning, normalisation, estimating gene presence and absence or divergence, validation, and analysis of data from test against three reference strains simultaneously. Each stage of the process is described and we have compared a number of methods available for characterising bacterial genomic diversity, for calculating the cut-off between gene presence and absence or divergence, and shown that a simple dynamic approach using a kernel density estimator performed better than both established, as well as a more sophisticated mixture modelling technique. We have also shown that current methods commonly used for CGH microarray analysis in tumour and cancer cell lines are not appropriate for analysing our data. Conclusion After carrying out the analysis and validation for three sequenced Escherichia coli strains, CGH microarray data from 19 E. coli O157 pathogenic test strains were used to demonstrate the benefits of applying this simple and robust process to CGH microarray studies using bacterial genomes.

  9. Microbes Online: an integrated portal for comparative functional genomics

    OpenAIRE

    Arkin, Adam P.

    2014-01-01

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi, as well as 1000s of viruses and plasmids. In addition to standard comparative genomic analysis, including gene prediction, sequence homology, domain identification, gene family assignments and functional annotations from E.C. and GO, MicrobesOnline integrates data from ...

  10. Comparative Genomics of Symbiotic Bacteria in Earthworm Nephridia

    DEFF Research Database (Denmark)

    Kjeldsen, Kasper Urup; Pinel, Nicolas; Lund, Marie Braad

    sequencing along with two of its closest relatives; the plant pathogenic Acidovorax avena subsp. citrulli and the free-living Acidovorax sp. JS42. In addition, the genome of the nephridial symbiont of the earthworm Aporrectodea tuberculata was partially sequenced. In order to resolve the functional...... and evolutionary basis of the symbiosis we annotated and compared the genomes. The genomes ranged in size from 4.4 to 5.6 Mbp, the E. fetida symbiont genome being the largest. The symbiont genomes showed no evidence of gene-loss related to any particular type of functional gene category. In contrast, genes...

  11. The Aspergillus Genome Database, a curated comparative genomics resource for gene, protein and sequence information for the Aspergillus research community.

    Science.gov (United States)

    Arnaud, Martha B; Chibucos, Marcus C; Costanzo, Maria C; Crabtree, Jonathan; Inglis, Diane O; Lotia, Adil; Orvis, Joshua; Shah, Prachi; Skrzypek, Marek S; Binkley, Gail; Miyasato, Stuart R; Wortman, Jennifer R; Sherlock, Gavin

    2010-01-01

    The Aspergillus Genome Database (AspGD) is an online genomics resource for researchers studying the genetics and molecular biology of the Aspergilli. AspGD combines high-quality manual curation of the experimental scientific literature examining the genetics and molecular biology of Aspergilli, cutting-edge comparative genomics approaches to iteratively refine and improve structural gene annotations across multiple Aspergillus species, and web-based research tools for accessing and exploring the data. All of these data are freely available at http://www.aspgd.org. We welcome feedback from users and the research community at aspergillus-curator@genome.stanford.edu.

  12. Whole genome comparative studies between chicken and turkey and their implications for avian genome evolution

    NARCIS (Netherlands)

    Griffin, D.K.; Robertson, L.B.; Tempest, H.G.; Vignal, A.; Fillon, V.; Crooijmans, R.P.M.A.; Groenen, M.A.M.; Deryusheva, S.; Gaginskaya, E.; Carre, W.; Waddington, D.; Talbot, R.; Völker, M.; Masabanda, J.S.; Burt, D.W.

    2008-01-01

    Background Comparative genomics is a powerful means of establishing inter-specific relationships between gene function/location and allows insight into genomic rearrangements, conservation and evolutionary phylogeny. The availability of the complete sequence of the chicken genome has initiated the d

  13. Human-mouse comparative genomics: successes and failures to reveal functional regions of the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Pennacchio, Len A.; Baroukh, Nadine; Rubin, Edward M.

    2003-05-15

    Deciphering the genetic code embedded within the human genome remains a significant challenge despite the human genome consortium's recent success at defining its linear sequence (Lander et al. 2001; Venter et al. 2001). While useful strategies exist to identify a large percentage of protein encoding regions, efforts to accurately define functional sequences in the remaining {approx}97 percent of the genome lag. Our primary interest has been to utilize the evolutionary relationship and the universal nature of genomic sequence information in vertebrates to reveal functional elements in the human genome. This has been achieved through the combined use of vertebrate comparative genomics to pinpoint highly conserved sequences as candidates for biological activity and transgenic mouse studies to address the functionality of defined human DNA fragments. Accordingly, we describe strategies and insights into functional sequences in the human genome through the use of comparative genomics coupled wit h functional studies in the mouse.

  14. AcCNET (Accessory Genome Constellation Network): comparative genomics software for accessory genome analysis using bipartite networks.

    Science.gov (United States)

    Lanza, Val F; Baquero, Fernando; de la Cruz, Fernando; Coque, Teresa M

    2017-01-15

    AcCNET (Accessory genome Constellation Network) is a Perl application that aims to compare accessory genomes of a large number of genomic units, both at qualitative and quantitative levels. Using the proteomes extracted from the analysed genomes, AcCNET creates a bipartite network compatible with standard network analysis platforms. AcCNET allows merging phylogenetic and functional information about the concerned genomes, thus improving the capability of current methods of network analysis. The AcCNET bipartite network opens a new perspective to explore the pangenome of bacterial species, focusing on the accessory genome behind the idiosyncrasy of a particular strain and/or population.

  15. Comparative genomics of the lactic acid bacteria

    Energy Technology Data Exchange (ETDEWEB)

    Makarova, K.; Slesarev, A.; Wolf, Y.; Sorokin, A.; Mirkin, B.; Koonin, E.; Pavlov, A.; Pavlova, N.; Karamychev, V.; Polouchine, N.; Shakhova, V.; Grigoriev, I.; Lou, Y.; Rokhsar, D.; Lucas, S.; Huang, K.; Goodstein, D. M.; Hawkins, T.; Plengvidhya, V.; Welker, D.; Hughes, J.; Goh, Y.; Benson, A.; Baldwin, K.; Lee, J. -H.; Diaz-Muniz, I.; Dosti, B.; Smeianov, V; Wechter, W.; Barabote, R.; Lorca, G.; Altermann, E.; Barrangou, R.; Ganesan, B.; Xie, Y.; Rawsthorne, H.; Tamir, D.; Parker, C.; Breidt, F.; Broadbent, J.; Hutkins, R.; O' Sullivan, D.; Steele, J.; Unlu, G.; Saier, M.; Klaenhammer, T.; Richardson, P.; Kozyavkin, S.; Weimer, B.; Mills, D.

    2006-06-01

    Lactic acid-producing bacteria are associated with various plant and animal niches and play a key role in the production of fermented foods and beverages. We report nine genome sequences representing the phylogenetic and functional diversity of these bacteria. The small genomes of lactic acid bacteria encode a broad repertoire of transporters for efficient carbon and nitrogen acquisition from the nutritionally rich environments they inhabit and reflect a limited range of biosynthetic capabilities that indicate both prototrophic and auxotrophic strains. Phylogenetic analyses, comparison of gene content across the group, and reconstruction of ancestral gene sets indicate a combination of extensive gene loss and key gene acquisitions via horizontal gene transfer during the coevolution of lactic acid bacteria with their habitats.

  16. Comparative genomic analysis of soybean flowering genes.

    Directory of Open Access Journals (Sweden)

    Chol-Hee Jung

    Full Text Available Flowering is an important agronomic trait that determines crop yield. Soybean is a major oilseed legume crop used for human and animal feed. Legumes have unique vegetative and floral complexities. Our understanding of the molecular basis of flower initiation and development in legumes is limited. Here, we address this by using a computational approach to examine flowering regulatory genes in the soybean genome in comparison to the most studied model plant, Arabidopsis. For this comparison, a genome-wide analysis of orthologue groups was performed, followed by an in silico gene expression analysis of the identified soybean flowering genes. Phylogenetic analyses of the gene families highlighted the evolutionary relationships among these candidates. Our study identified key flowering genes in soybean and indicates that the vernalisation and the ambient-temperature pathways seem to be the most variant in soybean. A comparison of the orthologue groups containing flowering genes indicated that, on average, each Arabidopsis flowering gene has 2-3 orthologous copies in soybean. Our analysis highlighted that the CDF3, VRN1, SVP, AP3 and PIF3 genes are paralogue-rich genes in soybean. Furthermore, the genome mapping of the soybean flowering genes showed that these genes are scattered randomly across the genome. A paralogue comparison indicated that the soybean genes comprising the largest orthologue group are clustered in a 1.4 Mb region on chromosome 16 of soybean. Furthermore, a comparison with the undomesticated soybean (Glycine soja revealed that there are hundreds of SNPs that are associated with putative soybean flowering genes and that there are structural variants that may affect the genes of the light-signalling and ambient-temperature pathways in soybean. Our study provides a framework for the soybean flowering pathway and insights into the relationship and evolution of flowering genes between a short-day soybean and the long-day plant

  17. A hybrid computational grid architecture for comparative genomics.

    Science.gov (United States)

    Singh, Aarti; Chen, Chen; Liu, Weiguo; Mitchell, Wayne; Schmidt, Bertil

    2008-03-01

    Comparative genomics provides a powerful tool for studying evolutionary changes among organisms, helping to identify genes that are conserved among species, as well as genes that give each organism its unique characteristics. However, the huge datasets involved makes this approach impractical on traditional computer architectures leading to prohibitively long runtimes. In this paper, we present a new computational grid architecture based on a hybrid computing model to significantly accelerate comparative genomics applications. The hybrid computing model consists of two types of parallelism: coarse grained and fine grained. The coarse-grained parallelism uses a volunteer computing infrastructure for job distribution, while the fine-grained parallelism uses commodity computer graphics hardware for fast sequence alignment. We present the deployment and evaluation of this approach on our grid test bed for the all-against-all comparison of microbial genomes. The results of this comparison are then used by phenotype--genotype explorer (PheGee). PheGee is a new tool that nominates candidate genes responsible for a given phenotype.

  18. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis

    OpenAIRE

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch, Gordon; Liolios, Konstantinos; Grechkin, Yuri

    2005-01-01

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis sub spp israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-...

  19. Identification of gene clusters associated with host adaptation and antibiotic resistance in Chinese Staphylococcus aureus isolates by microarray-based comparative genomics.

    Directory of Open Access Journals (Sweden)

    Henan Li

    Full Text Available A comparative genomic microarray comprising 2,457 genes from two whole genomes of S. aureus was employed for the comparative genome hybridization analysis of 50 strains of divergent clonal lineages, including methicillin-resistant S. aureus (MRSA, methicillin-susceptible S. aureus (MSSA, and swine strains in China. Large-scale validation was confirmed via polymerase chain reaction in 160 representative clinical strains. All of the 50 strains were clustered into seven different complexes by phylogenetic tree analysis. Thirteen gene clusters were specific to different S. aureus clones. Ten gene clusters, including seven known (vSa3, vSa4, vSaα, vSaβ, Tn5801, and phage ϕSa3 and three novel (C8, C9, and C10 gene clusters, were specific to human MRSA. Notably, two global regulators, sarH2 and sarH3, at cluster C9 were specific to human MRSA, and plasmid pUB110 at cluster C10 was specific to swine MRSA. Three clusters known to be part of SCCmec, vSa4 or Tn5801, and vSaα as well as one novel gene cluster C12 with homology with Tn554 of S. epidermidis were identified as MRSA-specific gene clusters. The replacement of ST239-spa t037 with ST239-spa t030 in Beijing may be a result of its acquisition of vSa4, phage ϕSa1, and ϕSa3. In summary, thirteen critical gene clusters were identified to be contributors to the evolution of host specificity and antibiotic resistance in Chinese S. aureus.

  20. Comparative genomics of Serratia spp.: two paths towards endosymbiotic life.

    Directory of Open Access Journals (Sweden)

    Alejandro Manzano-Marín

    Full Text Available Symbiosis is a widespread phenomenon in nature, in which insects show a great number of these associations. Buchnera aphidicola, the obligate endosymbiont of aphids, coexists in some species with another intracellular bacterium, Serratia symbiotica. Of particular interest is the case of the cedar aphid Cinara cedri, where B. aphidicola BCc and S. symbiotica SCc need each other to fulfil their symbiotic role with the insect. Moreover, various features seem to indicate that S. symbiotica SCc is closer to an obligate endosymbiont than to other facultative S. symbiotica, such as the one described for the aphid Acirthosyphon pisum (S. symbiotica SAp. This work is based on the comparative genomics of five strains of Serratia, three free-living and two endosymbiotic ones (one facultative and one obligate which should allow us to dissect the genome reduction taking place in the adaptive process to an intracellular life-style. Using a pan-genome approach, we have identified shared and strain-specific genes from both endosymbiotic strains and gained insight into the different genetic reduction both S. symbiotica have undergone. We have identified both retained and reduced functional categories in S. symbiotica compared to the Free-Living Serratia (FLS that seem to be related with its endosymbiotic role in their specific host-symbiont systems. By means of a phylogenomic reconstruction we have solved the position of both endosymbionts with confidence, established the probable insect-pathogen origin of the symbiotic clade as well as the high amino-acid substitution rate in S. symbiotica SCc. Finally, we were able to quantify the minimal number of rearrangements suffered in the endosymbiotic lineages and reconstruct a minimal rearrangement phylogeny. All these findings provide important evidence for the existence of at least two distinctive S. symbiotica lineages that are characterized by different rearrangements, gene content, genome size and branch lengths.

  1. New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits

    Directory of Open Access Journals (Sweden)

    Feltus Frank A

    2011-07-01

    Full Text Available Abstract Background Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for genomic tools development. Establishment of integrated physical and genetic maps for switchgrass will accelerate mapping of value added traits useful to breeding programs and to isolate important target genes using map based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18 to duodecaploid (12X = 108. Like in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. Results A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective. Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. Conclusions The construction of the first switchgrass BAC library and comparative analysis of

  2. Comparative genomics of Mycoplasma: analysis of conserved essential genes and diversity of the pan-genome.

    Directory of Open Access Journals (Sweden)

    Wei Liu

    Full Text Available Mycoplasma, the smallest self-replicating organism with a minimal metabolism and little genomic redundancy, is expected to be a close approximation to the minimal set of genes needed to sustain bacterial life. This study employs comparative evolutionary analysis of twenty Mycoplasma genomes to gain an improved understanding of essential genes. By analyzing the core genome of mycoplasmas, we finally revealed the conserved essential genes set for mycoplasma survival. Further analysis showed that the core genome set has many characteristics in common with experimentally identified essential genes. Several key genes, which are related to DNA replication and repair and can be disrupted in transposon mutagenesis studies, may be critical for bacteria survival especially over long period natural selection. Phylogenomic reconstructions based on 3,355 homologous groups allowed robust estimation of phylogenetic relatedness among mycoplasma strains. To obtain deeper insight into the relative roles of molecular evolution in pathogen adaptation to their hosts, we also analyzed the positive selection pressures on particular sites and lineages. There appears to be an approximate correlation between the divergence of species and the level of positive selection detected in corresponding lineages.

  3. Genomic array as compared to karyotyping in myelodysplastic syndromes in a prospective clinical trial

    NARCIS (Netherlands)

    Stevens-Kroef, Marian J; Olde Weghuis, Daniel; ElIdrissi-Zaynoun, Najat; van der Reijden, Bert; Cremers, Eline M P; Alhan, Canan; Westers, Theresia M; Visser-Wisselaar, Heleen A; Chitu, Dana A; Cunha, Sonia M; Vellenga, Edo; Klein, Saskia K; Wijermans, Pierre; de Greef, Georgine E; Schaafsma, M Ron; Muus, Petra; Ossenkoppele, Gert J; van de Loosdrecht, Arjan A; Jansen, Joop H

    2017-01-01

    Karyotyping is considered as the gold standard in the genetic subclassification of myelodysplastic syndrome (MDS). Oligo/SNP-based genomic array profiling is a high-resolution tool that also enables genome wide analysis. We compared karyotyping with oligo/SNP-based array profiling in 104 MDS patient

  4. PromBase: a web resource for various genomic features and predicted promoters in prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Bansal Manju

    2011-07-01

    Full Text Available Abstract Background As more and more genomes are being sequenced, an overview of their genomic features and annotation of their functional elements, which control the expression of each gene or transcription unit of the genome, is a fundamental challenge in genomics and bioinformatics. Findings Relative stability of DNA sequence has been used to predict promoter regions in 913 microbial genomic sequences with GC-content ranging from 16.6% to 74.9%. Irrespective of the genome GC-content the relative stability based promoter prediction method has already been proven to be robust in terms of recall and precision. The predicted promoter regions for the 913 microbial genomes have been accumulated in a database called PromBase. Promoter search can be carried out in PromBase either by specifying the gene name or the genomic position. Each predicted promoter region has been assigned to a reliability class (low, medium, high, very high and highest based on the difference between its average free energy and the downstream region. The recall and precision values for each class are shown graphically in PromBase. In addition, PromBase provides detailed information about base composition, CDS and CG/TA skews for each genome and various DNA sequence dependent structural properties (average free energy, curvature and bendability in the vicinity of all annotated translation start sites (TLS. Conclusion PromBase is a database, which contains predicted promoter regions and detailed analysis of various genomic features for 913 microbial genomes. PromBase can serve as a valuable resource for comparative genomics study and help the experimentalist to rapidly access detailed information on various genomic features and putative promoter regions in any given genome. This database is freely accessible for academic and non- academic users via the worldwide web http://nucleix.mbu.iisc.ernet.in/prombase/.

  5. Comparative genomic characterization of citrus-associated Xylella fastidiosa strains

    Directory of Open Access Journals (Sweden)

    Nunes Luiz R

    2007-12-01

    Full Text Available Abstract Background The xylem-inhabiting bacterium Xylella fastidiosa (Xf is the causal agent of Pierce's disease (PD in vineyards and citrus variegated chlorosis (CVC in orange trees. Both of these economically-devastating diseases are caused by distinct strains of this complex group of microorganisms, which has motivated researchers to conduct extensive genomic sequencing projects with Xf strains. This sequence information, along with other molecular tools, have been used to estimate the evolutionary history of the group and provide clues to understand the capacity of Xf to infect different hosts, causing a variety of symptoms. Nonetheless, although significant amounts of information have been generated from Xf strains, a large proportion of these efforts has concentrated on the study of North American strains, limiting our understanding about the genomic composition of South American strains – which is particularly important for CVC-associated strains. Results This paper describes the first genome-wide comparison among South American Xf strains, involving 6 distinct citrus-associated bacteria. Comparative analyses performed through a microarray-based approach allowed identification and characterization of large mobile genetic elements that seem to be exclusive to South American strains. Moreover, a large-scale sequencing effort, based on Suppressive Subtraction Hybridization (SSH, identified 290 new ORFs, distributed in 135 Groups of Orthologous Elements, throughout the genomes of these bacteria. Conclusion Results from microarray-based comparisons provide further evidence concerning activity of horizontally transferred elements, reinforcing their importance as major mediators in the evolution of Xf. Moreover, the microarray-based genomic profiles showed similarity between Xf strains 9a5c and Fb7, which is unexpected, given the geographical and chronological differences associated with the isolation of these microorganisms. The newly

  6. Microbial NAD metabolism: lessons from comparative genomics.

    Science.gov (United States)

    Gazzaniga, Francesca; Stebbins, Rebecca; Chang, Sheila Z; McPeek, Mark A; Brenner, Charles

    2009-09-01

    NAD is a coenzyme for redox reactions and a substrate of NAD-consuming enzymes, including ADP-ribose transferases, Sir2-related protein lysine deacetylases, and bacterial DNA ligases. Microorganisms that synthesize NAD from as few as one to as many as five of the six identified biosynthetic precursors have been identified. De novo NAD synthesis from aspartate or tryptophan is neither universal nor strictly aerobic. Salvage NAD synthesis from nicotinamide, nicotinic acid, nicotinamide riboside, and nicotinic acid riboside occurs via modules of different genes. Nicotinamide salvage genes nadV and pncA, found in distinct bacteria, appear to have spread throughout the tree of life via horizontal gene transfer. Biochemical, genetic, and genomic analyses have advanced to the point at which the precursors and pathways utilized by a microorganism can be predicted. Challenges remain in dissecting regulation of pathways.

  7. Comparative population genomics: power and principles for the inference of functionality.

    Science.gov (United States)

    Lawrie, David S; Petrov, Dmitri A

    2014-04-01

    The availability of sequenced genomes from multiple related organisms allows the detection and localization of functional genomic elements based on the idea that such elements evolve more slowly than neutral sequences. Although such comparative genomics methods have proven useful in discovering functional elements and ascertaining levels of functional constraint in the genome as a whole, here we outline limitations intrinsic to this approach that cannot be overcome by sequencing more species. We argue that it is essential to supplement comparative genomics with ultra-deep sampling of populations from closely related species to enable substantially more powerful genomic scans for functional elements. The convergence of sequencing technology and population genetics theory has made such projects feasible and has exciting implications for functional genomics. Copyright © 2014 Elsevier Ltd. All rights reserved.

  8. High resolution microarray comparative genomic hybridisation analysis using spotted oligonucleotides.

    NARCIS (Netherlands)

    Carvalho, B; Ouwerkerk, E; Meijer, G.A.; Ylstra, B.

    2004-01-01

    BACKGROUND: Currently, comparative genomic hybridisation array (array CGH) is the method of choice for studying genome wide DNA copy number changes. To date, either amplified representations of bacterial artificial chromosomes (BACs)/phage artificial chromosomes (PACs) or cDNAs have been spotted as

  9. Phytozome: a comparative platform for green plant genomics.

    Science.gov (United States)

    Goodstein, David M; Shu, Shengqiang; Howson, Russell; Neupane, Rochak; Hayes, Richard D; Fazo, Joni; Mitros, Therese; Dirks, William; Hellsten, Uffe; Putnam, Nicholas; Rokhsar, Daniel S

    2012-01-01

    The number of sequenced plant genomes and associated genomic resources is growing rapidly with the advent of both an increased focus on plant genomics from funding agencies, and the application of inexpensive next generation sequencing. To interact with this increasing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number (currently 25) of complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute, as well as selected species sequenced elsewhere. Through a comprehensive plant genome database and web portal, these data and analyses are available to the broader plant science research community, providing powerful comparative genomics tools that help to link model systems with other plants of economic and ecological importance.

  10. Array-based comparative genomic hybridization analysis reveals chromosomal copy number aberrations associated with clinical outcome in canine diffuse large B-cell lymphoma.

    Directory of Open Access Journals (Sweden)

    Arianna Aricò

    Full Text Available Canine Diffuse Large B-cell Lymphoma (cDLBCL is an aggressive cancer with variable clinical response. Despite recent attempts by gene expression profiling to identify the dog as a potential animal model for human DLBCL, this tumor remains biologically heterogeneous with no prognostic biomarkers to predict prognosis. The aim of this work was to identify copy number aberrations (CNAs by high-resolution array comparative genomic hybridization (aCGH in 12 dogs with newly diagnosed DLBCL. In a subset of these dogs, the genetic profiles at the end of therapy and at relapse were also assessed. In primary DLBCLs, 90 different genomic imbalances were counted, consisting of 46 gains and 44 losses. Two gains in chr13 were significantly correlated with clinical stage. In addition, specific regions of gains and losses were significantly associated to duration of remission. In primary DLBCLs, individual variability was found, however 14 recurrent CNAs (>30% were identified. Losses involving IGK, IGL and IGH were always found, and gains along the length of chr13 and chr31 were often observed (>41%. In these segments, MYC, LDHB, HSF1, KIT and PDGFRα are annotated. At the end of therapy, dogs in remission showed four new CNAs, whereas three new CNAs were observed in dogs at relapse compared with the previous profiles. One ex novo CNA, involving TCR, was present in dogs in remission after therapy, possibly induced by the autologous vaccine. Overall, aCGH identified small CNAs associated with outcome, which, along with future expression studies, may reveal target genes relevant to cDLBCL.

  11. Dyneins across eukaryotes: a comparative genomic analysis.

    Science.gov (United States)

    Wickstead, Bill; Gull, Keith

    2007-12-01

    Dyneins are large minus-end-directed microtubule motors. Each dynein contains at least one dynein heavy chain (DHC) and a variable number of intermediate chains (IC), light intermediate chains (LIC) and light chains (LC). Here, we used genome sequence data from 24 diverse eukaryotes to assess the distribution of DHCs, ICs, LICs and LCs across Eukaryota. Phylogenetic inference identified nine DHC families (two cytoplasmic and seven axonemal) and six IC families (one cytoplasmic). We confirm that dyneins have been lost from higher plants and show that this is most likely because of a single loss of cytoplasmic dynein 1 from the ancestor of Rhodophyta and Viridiplantae, followed by lineage-specific losses of other families. Independent losses in Entamoeba mean that at least three extant eukaryotic lineages are entirely devoid of dyneins. Cytoplasmic dynein 2 is associated with intraflagellar transport (IFT), but in two chromalveolate organisms, we find an IFT footprint without the retrograde motor. The distribution of one family of outer-arm dyneins accounts for 2-headed or 3-headed outer-arm ultrastructures observed in different organisms. One diatom species builds motile axonemes without any inner-arm dyneins (IAD), and the unexpected conservation of IAD I1 in non-flagellate algae and LC8 (DYNLL1/2) in all lineages reveals a surprising fluidity to dynein function.

  12. Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium

    NARCIS (Netherlands)

    Ma, L.-J.; van der Does, H.C.; Borkovich, K.A.; Coleman, J.J.; Daboussi, M.J.; Di Pietro, A.; Dufresne, M.; Freitag, M.; Grabherr, M.; Henrissat, B.; Houterman, P.M.; Kang, S.; Shim, W.B.; Woloshuk, C.; Xie, X.; Xu, J.-R; Antoniw, J.; Baker, S.E.; Bluhm, B.H.; Breakspear, A.; Brown, D.W.; Butchko, R.A.E.; Chapman, S.; Coulson, R.; Coutinho, P.M.; Danchin, E.G.J.; Diener, A.; Gale, L.R.; Gardiner, D.M.; Goff, S.; Hammond-Kosack, K.E.; Hilburn, K.; Hua-Van, A.; Jonkers, W.; Kazan, K.; Kodira, C.D.; Koehrsen, M.; Kumar, L.; Lee, Y.H.; Li, L.; Manners, J.M.; Miranda-Saavedra, D.; Mukherjee, M.; Park, G.; Park, J.; Park, S.Y.; Proctor, R.H.; Regev, A.; Ruiz-Roldan, M.C.; Sain, D.; Sakthikumar, S.; Sykes, S.; Schwartz, D.C.; Gillian Turgeon, B.; Wapinski, I.; Yoder, O.; Young, S.; Zeng, Q.; Zhou, S.; Galagan, J.; Cuomo, C.A.; Kistler, H.C.; Rep, M.

    2010-01-01

    Fusarium species are among the most important phytopathogenic and toxigenic fungi. To understand the molecular underpinnings of pathogenicity in the genus Fusarium, we compared the genomes of three phenotypically diverse species: Fusarium graminearum, Fusarium verticillioides and Fusarium oxysporum

  13. Genome-based Taxonomic Classification of Bacteroidetes

    Directory of Open Access Journals (Sweden)

    Richard L. Hahnke

    2016-12-01

    Full Text Available The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.

  14. Genome-Based Taxonomic Classification of Bacteroidetes.

    Science.gov (United States)

    Hahnke, Richard L; Meier-Kolthoff, Jan P; García-López, Marina; Mukherjee, Supratim; Huntemann, Marcel; Ivanova, Natalia N; Woyke, Tanja; Kyrpides, Nikos C; Klenk, Hans-Peter; Göker, Markus

    2016-01-01

    The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.

  15. Comparative Genomics Reveals the Core and Accessory Genomes of Streptomyces Species.

    Science.gov (United States)

    Kim, Ji-Nu; Kim, Yeonbum; Jeong, Yujin; Roe, Jung-Hye; Kim, Byung-Gee; Cho, Byung-Kwan

    2015-10-01

    The development of rapid and efficient genome sequencing methods has enabled us to study the evolutionary background of bacterial genetic information. Here, we present comparative genomic analysis of 17 Streptomyces species, for which the genome has been completely sequenced, using the pan-genome approach. The analysis revealed that 34,592 ortholog clusters constituted the pan-genome of these Streptomyces species, including 2,018 in the core genome, 11,743 in the dispensable genome, and 20,831 in the unique genome. The core genome was converged to a smaller number of genes than reported previously, with 3,096 gene families. Functional enrichment analysis showed that genes involved in transcription were most abundant in the Streptomyces pan-genome. Finally, we investigated core genes for the sigma factors, mycothiol biosynthesis pathway, and secondary metabolism pathways; our data showed that many genes involved in stress response and morphological differentiation were commonly expressed in Streptomyces species. Elucidation of the core genome offers a basis for understanding the functional evolution of Streptomyces species and provides insights into target selection for the construction of industrial strains.

  16. Comparative Genomics of the Extreme Acidophile Acidithiobacillus thiooxidans Reveals Intraspecific Divergence and Niche Adaptation

    OpenAIRE

    Zhang, Xian; Feng, Xue; Tao, Jiemeng; Ma, Liyuan; Xiao, Yunhua; Liang, Yili; Liu, Xueduan; Yin, Huaqun

    2016-01-01

    Acidithiobacillus thiooxidans known for its ubiquity in diverse acidic and sulfur-bearing environments worldwide was used as the research subject in this study. To explore the genomic fluidity and intraspecific diversity of Acidithiobacillus thiooxidans (A. thiooxidans) species, comparative genomics based on nine draft genomes was performed. Phylogenomic scrutiny provided first insights into the multiple groupings of these strains, suggesting that genetic diversity might be potentially correl...

  17. IMGD: an integrated platform supporting comparative genomics and phylogenetics of insect mitochondrial genomes

    Directory of Open Access Journals (Sweden)

    Jung Kyongyong

    2009-04-01

    Full Text Available Abstract Background Sequences and organization of the mitochondrial genome have been used as markers to investigate evolutionary history and relationships in many taxonomic groups. The rapidly increasing mitochondrial genome sequences from diverse insects provide ample opportunities to explore various global evolutionary questions in the superclass Hexapoda. To adequately support such questions, it is imperative to establish an informatics platform that facilitates the retrieval and utilization of available mitochondrial genome sequence data. Results The Insect Mitochondrial Genome Database (IMGD is a new integrated platform that archives the mitochondrial genome sequences from 25,747 hexapod species, including 112 completely sequenced and 20 nearly completed genomes and 113,985 partially sequenced mitochondrial genomes. The Species-driven User Interface (SUI of IMGD supports data retrieval and diverse analyses at multi-taxon levels. The Phyloviewer implemented in IMGD provides three methods for drawing phylogenetic trees and displays the resulting trees on the web. The SNP database incorporated to IMGD presents the distribution of SNPs and INDELs in the mitochondrial genomes of multiple isolates within eight species. A newly developed comparative SNU Genome Browser supports the graphical presentation and interactive interface for the identified SNPs/INDELs. Conclusion The IMGD provides a solid foundation for the comparative mitochondrial genomics and phylogenetics of insects. All data and functions described here are available at the web site http://www.imgd.org/.

  18. Comparative genomic paleontology across plant kingdom reveals the dynamics of TE-driven genome evolution.

    Science.gov (United States)

    El Baidouri, Moaine; Panaud, Olivier

    2013-01-01

    Long terminal repeat-retrotransposons (LTR-RTs) are the most abundant class of transposable elements (TEs) in plants. They strongly impact the structure, function, and evolution of their host genome, and, in particular, their role in genome size variation has been clearly established. However, the dynamics of the process through which LTR-RTs have differentially shaped plant genomes is still poorly understood because of a lack of comparative studies. Using a new robust and automated family classification procedure, we exhaustively characterized the LTR-RTs in eight plant genomes for which a high-quality sequence is available (i.e., Arabidopsis thaliana, A. lyrata, grapevine, soybean, rice, Brachypodium dystachion, sorghum, and maize). This allowed us to perform a comparative genome-wide study of the retrotranspositional landscape in these eight plant lineages from both monocots and dicots. We show that retrotransposition has recurrently occurred in all plant genomes investigated, regardless their size, and through bursts, rather than a continuous process. Moreover, in each genome, only one or few LTR-RT families have been active in the recent past, and the difference in genome size among the species studied could thus mostly be accounted for by the extent of the latest transpositional burst(s). Following these bursts, LTR-RTs are efficiently eliminated from their host genomes through recombination and deletion, but we show that the removal rate is not lineage specific. These new findings lead us to propose a new model of TE-driven genome evolution in plants.

  19. Two-component signal transduction in Agaricus bisporus: a comparative genomic analysis with other basidiomycetes through the web-based tool BASID2CS.

    Science.gov (United States)

    Lavín, José L; García-Yoldi, Alberto; Ramírez, Lucía; Pisabarro, Antonio G; Oguiza, José A

    2013-06-01

    Two-component systems (TCSs) are signal transduction mechanisms present in many eukaryotes, including fungi that play essential roles in the regulation of several cellular functions and responses. In this study, we carry out a genomic analysis of the TCS proteins in two varieties of the white button mushroom Agaricus bisporus. The genomes of both A. bisporus varieties contain eight genes coding for TCS proteins, which include four hybrid Histidine Kinases (HKs), a single histidine-containing phosphotransfer (HPt) protein and three Response Regulators (RRs). Comparison of the TCS proteins among A. bisporus and the sequenced basidiomycetes showed a conserved core complement of five TCS proteins including the Tco1/Nik1 hybrid HK, HPt protein and Ssk1, Skn7 and Rim15-like RRs. In addition, Dual-HKs, unusual hybrid HKs with 2 HK and 2 RR domains, are absent in A. bisporus and are limited to various species of basidiomycetes. Differential expression analysis showed no significant up- or down-regulation of the Agaricus TCS genes in the conditions/tissue analyzed with the exception of the Skn7-like RR gene (Agabi_varbisH97_2|198669) that is significantly up-regulated on compost compared to cultured mycelia. Furthermore, the pipeline web server BASID2CS (http://bioinformatics.unavarra.es:1000/B2CS/BASID2CS.htm) has been specifically designed for the identification, classification and functional annotation of putative TCS proteins from any predicted proteome of basidiomycetes using a combination of several bioinformatic approaches. Copyright © 2012 Elsevier Inc. All rights reserved.

  20. Comparative rates of evolution in endosymbiotic nuclear genomes

    Directory of Open Access Journals (Sweden)

    Keeling Patrick J

    2006-06-01

    Full Text Available Abstract Background The nucleomorphs associated with secondary plastids of cryptomonads and chlorarachniophytes are the sole examples of organelles with eukaryotic nuclear genomes. Although not as widespread as their prokaryotic equivalents in mitochondria and plastids, nucleomorph genomes share similarities in terms of reduction and compaction. They also differ in several aspects, not least in that they encode proteins that target to the plastid, and so function in a different compartment from that in which they are encoded. Results Here, we test whether the phylogenetically distinct nucleomorph genomes of the cryptomonad, Guillardia theta, and the chlorarachniophyte, Bigelowiella natans, have experienced similar evolutionary pressures during their transformation to reduced organelles. We compared the evolutionary rates of genes from nuclear, nucleomorph, and plastid genomes, all of which encode proteins that function in the same cellular compartment, the plastid, and are thus subject to similar selection pressures. Furthermore, we investigated the divergence of nucleomorphs within cryptomonads by comparing G. theta and Rhodomonas salina. Conclusion Chlorarachniophyte nucleomorph genes have accumulated errors at a faster rate than other genomes within the same cell, regardless of the compartment where the gene product functions. In contrast, most nucleomorph genes in cryptomonads have evolved faster than genes in other genomes on average, but genes for plastid-targeted proteins are not overly divergent, and it appears that cryptomonad nucleomorphs are not presently evolving rapidly and have therefore stabilized. Overall, these analyses suggest that the forces at work in the two lineages are different, despite the similarities between the structures of their genomes.

  1. Comparative genomics of vesicomyid clam (Bivalvia: Mollusca chemosynthetic symbionts

    Directory of Open Access Journals (Sweden)

    Girguis Peter R

    2008-12-01

    Full Text Available Abstract Background The Vesicomyidae (Bivalvia: Mollusca are a family of clams that form symbioses with chemosynthetic gamma-proteobacteria. They exist in environments such as hydrothermal vents and cold seeps and have a reduced gut and feeding groove, indicating a large dependence on their endosymbionts for nutrition. Recently, two vesicomyid symbiont genomes were sequenced, illuminating the possible nutritional contributions of the symbiont to the host and making genome-wide evolutionary analyses possible. Results To examine the genomic evolution of the vesicomyid symbionts, a comparative genomics framework, including the existing genomic data combined with heterologous microarray hybridization results, was used to analyze conserved gene content in four vesicomyid symbiont genomes. These four symbionts were chosen to include a broad phylogenetic sampling of the vesicomyid symbionts and represent distinct chemosynthetic environments: cold seeps and hydrothermal vents. Conclusion The results of this comparative genomics analysis emphasize the importance of the symbionts' chemoautotrophic metabolism within their hosts. The fact that these symbionts appear to be metabolically capable autotrophs underscores the extent to which the host depends on them for nutrition and reveals the key to invertebrate colonization of these challenging environments.

  2. Comparative Genomics of a Parthenogenesis-Inducing Wolbachia Symbiont

    Directory of Open Access Journals (Sweden)

    Amelia R. I. Lindsey

    2016-07-01

    Full Text Available Wolbachia is an intracellular symbiont of invertebrates responsible for inducing a wide variety of phenotypes in its host. These host-Wolbachia relationships span the continuum from reproductive parasitism to obligate mutualism, and provide a unique system to study genomic changes associated with the evolution of symbiosis. We present the genome sequence from a parthenogenesis-inducing Wolbachia strain (wTpre infecting the minute parasitoid wasp Trichogramma pretiosum. The wTpre genome is the most complete parthenogenesis-inducing Wolbachia genome available to date. We used comparative genomics across 16 Wolbachia strains, representing five supergroups, to identify a core Wolbachia genome of 496 sets of orthologous genes. Only 14 of these sets are unique to Wolbachia when compared to other bacteria from the Rickettsiales. We show that the B supergroup of Wolbachia, of which wTpre is a member, contains a significantly higher number of ankyrin repeat-containing genes than other supergroups. In the wTpre genome, there is evidence for truncation of the protein coding sequences in 20% of ORFs, mostly as a result of frameshift mutations. The wTpre strain represents a conversion from cytoplasmic incompatibility to a parthenogenesis-inducing lifestyle, and is required for reproduction in the Trichogramma host it infects. We hypothesize that the large number of coding frame truncations has accompanied the change in reproductive mode of the wTpre strain.

  3. Comparative genomic analysis of eutherian interferon-γ-inducible GTPases.

    Science.gov (United States)

    Premzl, Marko

    2012-11-01

    The interferon-γ-inducible GTPases, IFGGs, are intracellular proteins involved in immune response against pathogens. A comprehensive comparative genomic review and analysis of eutherian IFGGs was carried out using public genomic sequences. The 64 eutherian IFGG genes were examined in detail and annotated. The eutherian IFGG promoter types were first catalogued followed by a phylogenetic analysis of eutherian IFGGs, which described five major IFGG clusters. The patterns of differential gene expansions and protein regions that may regulate IFGG catalytic features suggested a new classification of eutherian IFGGs. This mini-review has also provided new tests of reliability of public genomic sequences as well as tests of protein molecular evolution.

  4. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

    Science.gov (United States)

    Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G

    2000-12-15

    The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.

  5. Comparative Genome Analysis of Basidiomycete Fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Morin, Emmanuelle; Nagy, Laszlo; Manning, Gerard; Baker, Scott; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Hibbett, David; Martin, Francis; Grigoriev, Igor

    2012-03-19

    Fungi of the phylum Basidiomycota (basidiomycetes), make up some 37percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, symbionts, and plant and animal pathogens. To better understand the diversity of phenotypes in basidiomycetes, we performed a comparative analysis of 35 basidiomycete fungi spanning the diversity of the phylum. Phylogenetic patterns of lignocellulose degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay. Patterns of secondary metabolic enzymes give additional insight into the broad array of phenotypes found in the basidiomycetes. We suggest that the profile of an organism in lignocellulose-targeting genes can be used to predict its nutritional mode, and predict Dacryopinax sp. as a brown rot; Botryobasidium botryosum and Jaapia argillacea as white rots.

  6. Comparative Genomics via Wavelet Analysis for Closely Related Bacteria

    Directory of Open Access Journals (Sweden)

    Jiuzhou Song

    2004-01-01

    Full Text Available Comparative genomics has been a valuable method for extracting and extrapolating genome information among closely related bacteria. The efficiency of the traditional methods is extremely influenced by the software method used. To overcome the problem here, we propose using wavelet analysis to perform comparative genomics. First, global comparison using wavelet analysis gives the difference at a quantitative level. Then local comparison using keto-excess or purine-excess plots shows precise positions of inversions, translocations, and horizontally transferred DNA fragments. We firstly found that the level of energy spectra difference is related to the similarity of bacteria strains; it could be a quantitative index to describe the similarities of genomes. The strategy is described in detail by comparisons of closely related strains: S.typhi CT18, S.typhi Ty2, S.typhimurium LT2, H.pylori 26695, and H.pylori J99.

  7. Comparative Genomics via Wavelet Analysis for Closely Related Bacteria

    Science.gov (United States)

    Song, Jiuzhou; Ware, Tony; Liu, Shu-Lin; Surette, M.

    2004-12-01

    Comparative genomics has been a valuable method for extracting and extrapolating genome information among closely related bacteria. The efficiency of the traditional methods is extremely influenced by the software method used. To overcome the problem here, we propose using wavelet analysis to perform comparative genomics. First, global comparison using wavelet analysis gives the difference at a quantitative level. Then local comparison using keto-excess or purine-excess plots shows precise positions of inversions, translocations, and horizontally transferred DNA fragments. We firstly found that the level of energy spectra difference is related to the similarity of bacteria strains; it could be a quantitative index to describe the similarities of genomes. The strategy is described in detail by comparisons of closely related strains: S.typhi CT18, S.typhi Ty2, S.typhimurium LT2, H.pylori 26695, and H.pylori J99.

  8. Sputnik: a database platform for comparative plant genomics.

    Science.gov (United States)

    Rudd, Stephen; Mewes, Hans-Werner; Mayer, Klaus F X

    2003-01-01

    Two million plant ESTs, from 20 different plant species, and totalling more than one 1000 Mbp of DNA sequence, represents a formidable transcriptomic resource. Sputnik uses the potential of this sequence resource to fill some of the information gap in the un-sequenced plant genomes and to serve as the foundation for in silicio comparative plant genomics. The complexity of the individual EST collections has been reduced using optimised EST clustering techniques. Annotation of cluster sequences is performed by exploiting and transferring information from the comprehensive knowledgebase already produced for the completed model plant genome (Arabidopsis thaliana) and by performing additional state of-the-art sequence analyses relevant to today's plant biologist. Functional predictions, comparative analyses and associative annotations for 500 000 plant EST derived peptides make Sputnik (http://mips.gsf.de/proj/sputnik/) a valid platform for contemporary plant genomics.

  9. GenomePixelizer--a visualization program for comparative genomics within and between species.

    Science.gov (United States)

    Kozik, A; Kochetkova, E; Michelmore, R

    2002-02-01

    GenomePixelizer is a visualization tool that generates custom images of the physical or genetic positions of specified sets of genes in whole genomes or parts of genomes. Multiple sets of genes can be shown simultaneously with user-defined characteristics displayed. It allows the analysis of duplication events within and between species based on sequence similarities. The program is written in Tcl/Tk and works on any platform that supports the Tcl/Tk toolkit. GenomePixelizer generates HTML ImageMap tags for each gene in the image allowing links to databases. Images can be saved and presented on web pages.

  10. MicrobesOnline: an integrated portal for comparative functional genomics

    OpenAIRE

    Joachimiak, Marcin P.

    2014-01-01

    The Virtual Institute for Microbial Stress and Survival (VIMSS, http://vimss.lbl.gov/) funded by the Dept. of Energy's Genomics:GTL Program, is dedicated to using integrated environmental, functional genomic, and comparative sequence and phylogeny data to understand mechanisms by which microbes and microbial communities survive in uncertain environments while carrying out processes of interest for bioremediation and energy generation. To support this work, VIMSS has developed a Web portal, al...

  11. Establishing a framework for comparative analysis of genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied on protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.

  12. The Aspergillus Genome Database (AspGD): recent developments in comprehensive multispecies curation, comparative genomics and community resources.

    Science.gov (United States)

    Arnaud, Martha B; Cerqueira, Gustavo C; Inglis, Diane O; Skrzypek, Marek S; Binkley, Jonathan; Chibucos, Marcus C; Crabtree, Jonathan; Howarth, Clinton; Orvis, Joshua; Shah, Prachi; Wymore, Farrell; Binkley, Gail; Miyasato, Stuart R; Simison, Matt; Sherlock, Gavin; Wortman, Jennifer R

    2012-01-01

    The Aspergillus Genome Database (AspGD; http://www.aspgd.org) is a freely available, web-based resource for researchers studying fungi of the genus Aspergillus, which includes organisms of clinical, agricultural and industrial importance. AspGD curators have now completed comprehensive review of the entire published literature about Aspergillus nidulans and Aspergillus fumigatus, and this annotation is provided with streamlined, ortholog-based navigation of the multispecies information. AspGD facilitates comparative genomics by providing a full-featured genomics viewer, as well as matched and standardized sets of genomic information for the sequenced aspergilli. AspGD also provides resources to foster interaction and dissemination of community information and resources. We welcome and encourage feedback at aspergillus-curator@lists.stanford.edu.

  13. The Rice Genome Knowledgebase (RGKbase): an annotation database for rice comparative genomics and evolutionary biology.

    Science.gov (United States)

    Wang, Dapeng; Xia, Yan; Li, Xinna; Hou, Lixia; Yu, Jun

    2013-01-01

    Over the past 10 years, genomes of cultivated rice cultivars and their wild counterparts have been sequenced although most efforts are focused on genome assembly and annotation of two major cultivated rice (Oryza sativa L.) subspecies, 93-11 (indica) and Nipponbare (japonica). To integrate information from genome assemblies and annotations for better analysis and application, we now introduce a comparative rice genome database, the Rice Genome Knowledgebase (RGKbase, http://rgkbase.big.ac.cn/RGKbase/). RGKbase is built to have three major components: (i) integrated data curation for rice genomics and molecular biology, which includes genome sequence assemblies, transcriptomic and epigenomic data, genetic variations, quantitative trait loci (QTLs) and the relevant literature; (ii) User-friendly viewers, such as Gbrowse, GeneBrowse and Circos, for genome annotations and evolutionary dynamics and (iii) Bioinformatic tools for compositional and synteny analyses, gene family classifications, gene ontology terms and pathways and gene co-expression networks. RGKbase current includes data from five rice cultivars and species: Nipponbare (japonica), 93-11 (indica), PA64s (indica), the African rice (Oryza glaberrima) and a wild rice species (Oryza brachyantha). We are also constantly introducing new datasets from variety of public efforts, such as two recent releases-sequence data from ∼1000 rice varieties, which are mapped into the reference genome, yielding ample high-quality single-nucleotide polymorphisms and insertions-deletions.

  14. Characterization of Mycobacterium chelonae-Like Strains by Comparative Genomics

    Directory of Open Access Journals (Sweden)

    Christiane L. Nogueira

    2017-05-01

    Full Text Available Isolates of the Mycobacterium chelonae-M. abscessus complex are subdivided into four clusters (CHI to CHIV in the INNO-LiPA® Mycobacterium spp DNA strip assay. A considerable phenotypic variability was observed among isolates of the CHII cluster. In this study, we examined the diversity of 26 CHII cluster isolates by phenotypic analysis, drug susceptibility testing, whole genome sequencing and single-gene analysis. Pairwise genome comparisons were performed using several approaches, including average nucleotide identity (ANI and genome-to-genome distance (GGD among others. Based on ANI and GGD the isolates were identified as M. chelonae (14 isolates, M. franklinii (2 isolates and M. salmoniphium (1 isolate. The remaining 9 isolates were subdivided into three novel putative genomospecies. Phenotypic analyses including drug susceptibility testing, as well as whole genome comparison by TETRA and delta differences, were not helpful in separating the groups revealed by ANI and GGD. The analysis of standard four conserved genomic regions showed that rpoB alone and the concatenated sequences clearly distinguished the taxonomic groups delimited by whole genome analyses. In conclusion, the CHII INNO-LiPa is not a homogeneous cluster; on the contrary, it is composed of closely related different species belonging to the M. chelonae-M. abscessus complex and also several unidentified isolates. The detection of these isolates, putatively novel species, indicates a wider inner variability than the presently known in this complex.

  15. DCODE.ORG Anthology of Comparative Genomic Tools

    Energy Technology Data Exchange (ETDEWEB)

    Loots, G G; Ovcharenko, I

    2005-01-11

    Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the noncoding encryption of gene regulation across genomes. To facilitate the use of comparative genomics to practical applications in genetics and genomics we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools: zPicture and Mulan; a phylogenetic shadowing tool: eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools: rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, CREME; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ web site.

  16. Comparative analysis of catfish BAC end sequences with the zebrafish genome

    Directory of Open Access Journals (Sweden)

    Abernathy Jason

    2009-12-01

    Full Text Available Abstract Background Comparative mapping is a powerful tool to transfer genomic information from sequenced genomes to closely related species for which whole genome sequence data are not yet available. However, such an approach is still very limited in catfish, the most important aquaculture species in the United States. This project was initiated to generate additional BAC end sequences and demonstrate their applications in comparative mapping in catfish. Results We reported the generation of 43,000 BAC end sequences and their applications for comparative genome analysis in catfish. Using these and the additional 20,000 existing BAC end sequences as a resource along with linkage mapping and existing physical map, conserved syntenic regions were identified between the catfish and zebrafish genomes. A total of 10,943 catfish BAC end sequences (17.3% had significant BLAST hits to the zebrafish genome (cutoff value ≤ e-5, of which 3,221 were unique gene hits, providing a platform for comparative mapping based on locations of these genes in catfish and zebrafish. Genetic linkage mapping of microsatellites associated with contigs allowed identification of large conserved genomic segments and construction of super scaffolds. Conclusion BAC end sequences and their associated polymorphic markers are great resources for comparative genome analysis in catfish. Highly conserved chromosomal regions were identified to exist between catfish and zebrafish. However, it appears that the level of conservation at local genomic regions are high while a high level of chromosomal shuffling and rearrangements exist between catfish and zebrafish genomes. Orthologous regions established through comparative analysis should facilitate both structural and functional genome analysis in catfish.

  17. Comparative bacterial proteomics: analysis of the core genome concept.

    Directory of Open Access Journals (Sweden)

    Stephen J Callister

    Full Text Available While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.

  18. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    Energy Technology Data Exchange (ETDEWEB)

    Callister, Stephen J.; McCue, Lee Ann; Turse, Josh E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-02-06

    Comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry. Experimental validation of the existence of this core genome requires extensive measurement and is not typically undertaken. Enabled by an extensive proteome database development over a six year period, we experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. While genomic studies establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.

  19. Cytogenetic analysis from DNA by comparative genomic hybridization.

    Science.gov (United States)

    Tachdjian, G; Aboura, A; Lapierre, J M; Viguié, F

    2000-01-01

    Comparative genomic hybridization (CGH) is a modified in situ hybridization technique which allows detection and mapping of DNA sequence copy differences between two genomes in a single experiment. In CGH analysis, two differentially labelled genomic DNA (study and reference) are co-hybridized to normal metaphase spreads. Chromosomal locations of copy number changes in the DNA segments of the study genome are revealed by a variable fluorescence intensity ratio along each target chromosome. Since its development, CGH has been applied mostly as a research tool in the field of cancer cytogenetics to identify genetic changes in many previously unknown regions. CGH may also have a role in clinical cytogenetics for detection and identification of unbalanced chromosomal abnormalities.

  20. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    Science.gov (United States)

    Callister, Stephen J.; McCue, Lee Ann; Turse, Joshua E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-01-01

    While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits. PMID:18253490

  1. BrucellaBase: Genome information resource.

    Science.gov (United States)

    Sankarasubramanian, Jagadesan; Vishnu, Udayakumar S; Khader, L K M Abdul; Sridhar, Jayavel; Gunasekaran, Paramasamy; Rajendhran, Jeyaprakash

    2016-09-01

    Brucella sp. causes a major zoonotic disease, brucellosis. Brucella belongs to the family Brucellaceae under the order Rhizobiales of Alphaproteobacteria. We present BrucellaBase, a web-based platform, providing features of a genome database together with unique analysis tools. We have developed a web version of the multilocus sequence typing (MLST) (Whatmore et al., 2007) and phylogenetic analysis of Brucella spp. BrucellaBase currently contains genome data of 510 Brucella strains along with the user interfaces for BLAST, VFDB, CARD, pairwise genome alignment and MLST typing. Availability of these tools will enable the researchers interested in Brucella to get meaningful information from Brucella genome sequences. BrucellaBase will regularly be updated with new genome sequences, new features along with improvements in genome annotations. BrucellaBase is available online at http://www.dbtbrucellosis.in/brucellabase.html or http://59.99.226.203/brucellabase/homepage.html.

  2. BLAT-Based Comparative Analysis for Transposable Elements: BLATCAT

    Directory of Open Access Journals (Sweden)

    Sangbum Lee

    2014-01-01

    Full Text Available The availability of several whole genome sequences makes comparative analyses possible. In primate genomes, the priority of transposable elements (TEs is significantly increased because they account for ~45% of the primate genomes, they can regulate the gene expression level, and they are associated with genomic fluidity in their host genomes. Here, we developed the BLAST-like alignment tool (BLAT based comparative analysis for transposable elements (BLATCAT program. The BLATCAT program can compare specific regions of six representative primate genome sequences (human, chimpanzee, gorilla, orangutan, gibbon, and rhesus macaque on the basis of BLAT and simultaneously carry out RepeatMasker and/or Censor functions, which are widely used Windows-based web-server functions to detect TEs. All results can be stored as a HTML file for manual inspection of a specific locus. BLATCAT will be very convenient and efficient for comparative analyses of TEs in various primate genomes.

  3. BLAT-based comparative analysis for transposable elements: BLATCAT.

    Science.gov (United States)

    Lee, Sangbum; Oh, Sumin; Kang, Keunsoo; Han, Kyudong

    2014-01-01

    The availability of several whole genome sequences makes comparative analyses possible. In primate genomes, the priority of transposable elements (TEs) is significantly increased because they account for ~45% of the primate genomes, they can regulate the gene expression level, and they are associated with genomic fluidity in their host genomes. Here, we developed the BLAST-like alignment tool (BLAT) based comparative analysis for transposable elements (BLATCAT) program. The BLATCAT program can compare specific regions of six representative primate genome sequences (human, chimpanzee, gorilla, orangutan, gibbon, and rhesus macaque) on the basis of BLAT and simultaneously carry out RepeatMasker and/or Censor functions, which are widely used Windows-based web-server functions to detect TEs. All results can be stored as a HTML file for manual inspection of a specific locus. BLATCAT will be very convenient and efficient for comparative analyses of TEs in various primate genomes.

  4. Low-pass sequencing for microbial comparative genomics

    Directory of Open Access Journals (Sweden)

    Kennedy Sean

    2004-01-01

    Full Text Available Abstract Background We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1 the metabolically versatile Haloarcula marismortui; (2 the non-pigmented Natrialba asiatica; (3 the psychrophile Halorubrum lacusprofundi and (4 the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI for their predicted proteins. Multiple insertion sequence (IS elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP and transcription factor IIB (TFB homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion Despite the diverse habitats of these species, all five halophiles share (1 high GC content and (2 low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggest that genome structure and dynamic genome reorganization might be similar to that previously observed in the

  5. Genome sequencing and comparative genomics reveal a repertoire of putative pathogenicity genes in chilli anthracnose fungus Colletotrichum truncatum.

    Science.gov (United States)

    Rao, Soumya; Nandineni, Madhusudan R

    2017-01-01

    Colletotrichum truncatum, a major fungal phytopathogen, causes the anthracnose disease on an economically important spice crop chilli (Capsicum annuum), resulting in huge economic losses in tropical and sub-tropical countries. It follows a subcuticular intramural infection strategy on chilli with a short, asymptomatic, endophytic phase, which contrasts with the intracellular hemibiotrophic lifestyle adopted by most of the Colletotrichum species. However, little is known about the molecular determinants and the mechanism of pathogenicity in this fungus. A high quality whole genome sequence and gene annotation based on transcriptome data of an Indian isolate of C. truncatum from chilli has been obtained. Analysis of the genome sequence revealed a rich repertoire of pathogenicity genes in C. truncatum encoding secreted proteins, effectors, plant cell wall degrading enzymes, secondary metabolism associated proteins, with potential roles in the host-specific infection strategy, placing it next only to the Fusarium species. The size of genome assembly, number of predicted genes and some of the functional categories were similar to other sequenced Colletotrichum species. The comparative genomic analyses with other species and related fungi identified some unique genes and certain highly expanded gene families of CAZymes, proteases and secondary metabolism associated genes in the genome of C. truncatum. The draft genome assembly and functional annotation of potential pathogenicity genes of C. truncatum provide an important genomic resource for understanding the biology and lifestyle of this important phytopathogen and will pave the way for designing efficient disease control regimens.

  6. On the Approximability of Comparing Genomes with Duplicates

    CERN Document Server

    Angibaud, Sébastien; Rusu, Irena; Thevenin, Annelyse; Vialette, Stéphane

    2008-01-01

    A central problem in comparative genomics consists in computing a (dis-)similarity measure between two genomes, e.g. in order to construct a phylogeny. All the existing measures are defined on genomes without duplicates. However, we know that genes can be duplicated within the same genome. One possible approach to overcome this difficulty is to establish a one-to-one correspondence (i.e. a matching) between genes of both genomes, where the correspondence is chosen in order to optimize the studied measure. In this paper, we are interested in three measures (number of breakpoints, number of common intervals and number of conserved intervals) and three models of matching (exemplar, intermediate and maximum matching models). We prove that, for each model and each measure M, computing a matching between two genomes that optimizes M is APX-hard. We also study the complexity of the following problem: is there an exemplarization (resp. an intermediate/maximum matching) that induces no breakpoint? We prove the problem...

  7. Evolution of cancer suppression as revealed by mammalian comparative genomics.

    Science.gov (United States)

    Tollis, Marc; Schiffman, Joshua D; Boddy, Amy M

    2017-02-02

    Cancer suppression is an important feature in the evolution of large and long-lived animals. While some tumor suppression pathways are conserved among all multicellular organisms, others mechanisms of cancer resistance are uniquely lineage specific. Comparative genomics has become a powerful tool to discover these unique and shared molecular adaptations in respect to cancer suppression. These findings may one day be translated to human patients through evolutionary medicine. Here, we will review theory and methods of comparative cancer genomics and highlight major findings of cancer suppression across mammals. Our current knowledge of cancer genomics suggests that more efficient DNA repair and higher sensitivity to DNA damage may be the key to tumor suppression in large or long-lived mammals.

  8. Complete genome sequence of the fire blight pathogen Erwinia pyrifoliae DSM 12163T and comparative genomic insights into plant pathogenicity

    Directory of Open Access Journals (Sweden)

    Frey Jürg E

    2010-01-01

    Full Text Available Abstract Background Erwinia pyrifoliae is a newly described necrotrophic pathogen, which causes fire blight on Asian (Nashi pear and is geographically restricted to Eastern Asia. Relatively little is known about its genetics compared to the closely related main fire blight pathogen E. amylovora. Results The genome of the type strain of E. pyrifoliae strain DSM 12163T, was sequenced using both 454 and Solexa pyrosequencing and annotated. The genome contains a circular chromosome of 4.026 Mb and four small plasmids. Based on their respective role in virulence in E. amylovora or related organisms, we identified several putative virulence factors, including type III and type VI secretion systems and their effectors, flagellar genes, sorbitol metabolism, iron uptake determinants, and quorum-sensing components. A deletion in the rpoS gene covering the most conserved region of the protein was identified which may contribute to the difference in virulence/host-range compared to E. amylovora. Comparative genomics with the pome fruit epiphyte Erwinia tasmaniensis Et1/99 showed that both species are overall highly similar, although specific differences were identified, for example the presence of some phage gene-containing regions and a high number of putative genomic islands containing transposases in the E. pyrifoliae DSM 12163T genome. Conclusions The E. pyrifoliae genome is an important addition to the published genome of E. tasmaniensis and the unfinished genome of E. amylovora providing a foundation for re-sequencing additional strains that may shed light on the evolution of the host-range and virulence/pathogenicity of this important group of plant-associated bacteria.

  9. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level

    Science.gov (United States)

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea’s genetic data sources. PMID:27446038

  10. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level.

    Science.gov (United States)

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea's genetic data sources.

  11. Genomic Alterations in Sporadic Synchronous Primary Breast Cancer Using Array and Metaphase Comparative Genomic Hybridization

    Directory of Open Access Journals (Sweden)

    Arezou A. Ghazani

    2007-06-01

    Full Text Available Synchronous primary breast cancer describes the occurrence of multiple tumors affecting one or both breasts at initial diagnosis. This provides a unique opportunity to identify tissue-specific genomic markers that characterize each tumor while controlling for the individual genetic background of a patient. The aim of this study was to examine the genomic alterations and degree of similarity between synchronous cancers. Using metaphase comparative genomic hybridization and array comparative genomic hybridization (aCGH, the genomic alterations of 23 synchronous breast cancers from 10 patients were examined at both chromosomal and gene levels. Synchronous breast cancers, when compared to their matched counterparts, were found to have a common core set of genetic alterations, with additional unique changes present in each. They also frequently exhibited features distinct from the more usual solitary primary breast cancers. The most frequent genomic alterations included chromosomal gains of 1q, 3p, 4q, and 8q, and losses of 11q, 12q, 16q, and 17p. aCGH identified copy number amplification in regions that are present in all 23 tumor samples, including 1p31.3–1p32.3 harboring JAK1. Our findings suggest that synchronous primary breast cancers represent a unique type of breast cancer and, at least in some instances, one tumor may give rise to the other.

  12. Comparative Genomics of Escherichia coli Strains Causing Urinary Tract Infections

    DEFF Research Database (Denmark)

    Vejborg, Rebecca Munk; Hancock, Viktoria; Schembri, Mark A.

    2011-01-01

    The virulence determinants of uropathogenic Escherichia coli have been studied extensively over the years, but relatively little is known about what differentiates isolates causing various types of urinary tract infections. In this study, we compared the genomic profiles of 45 strains from a range...

  13. Genomic species are ecological species as revealed by comparative genomics in Agrobacterium tumefaciens.

    Science.gov (United States)

    Lassalle, Florent; Campillo, Tony; Vial, Ludovic; Baude, Jessica; Costechareyre, Denis; Chapulliot, David; Shams, Malek; Abrouk, Danis; Lavire, Céline; Oger-Desfeux, Christine; Hommais, Florence; Guéguen, Laurent; Daubin, Vincent; Muller, Daniel; Nesme, Xavier

    2011-01-01

    The definition of bacterial species is based on genomic similarities, giving rise to the operational concept of genomic species, but the reasons of the occurrence of differentiated genomic species remain largely unknown. We used the Agrobacterium tumefaciens species complex and particularly the genomic species presently called genomovar G8, which includes the sequenced strain C58, to test the hypothesis of genomic species having specific ecological adaptations possibly involved in the speciation process. We analyzed the gene repertoire specific to G8 to identify potential adaptive genes. By hybridizing 25 strains of A. tumefaciens on DNA microarrays spanning the C58 genome, we highlighted the presence and absence of genes homologous to C58 in the taxon. We found 196 genes specific to genomovar G8 that were mostly clustered into seven genomic islands on the C58 genome-one on the circular chromosome and six on the linear chromosome-suggesting higher plasticity and a major adaptive role of the latter. Clusters encoded putative functional units, four of which had been verified experimentally. The combination of G8-specific functions defines a hypothetical species primary niche for G8 related to commensal interaction with a host plant. This supports that the G8 ancestor was able to exploit a new ecological niche, maybe initiating ecological isolation and thus speciation. Searching genomic data for synapomorphic traits is a powerful way to describe bacterial species. This procedure allowed us to find such phenotypic traits specific to genomovar G8 and thus propose a Latin binomial, Agrobacterium fabrum, for this bona fide genomic species.

  14. Whole genome sequencing and comparative genomics of closely related Fusarium Head Blight fungi: Fusarium graminearum, F. meridionale and F. asiaticum.

    Science.gov (United States)

    Walkowiak, Sean; Rowland, Owen; Rodrigue, Nicolas; Subramaniam, Rajagopal

    2016-12-09

    The Fusarium graminearum species complex is composed of many distinct fungal species that cause several diseases in economically important crops, including Fusarium Head Blight of wheat. Despite being closely related, these species and individuals within species have distinct phenotypic differences in toxin production and pathogenicity, with some isolates reported as non-pathogenic on certain hosts. In this report, we compare genomes and gene content of six new isolates from the species complex, including the first available genomes of F. asiaticum and F. meridionale, with four other genomes reported in previous studies. A comparison of genome structure and gene content revealed a 93-99% overlap across all ten genomes. We identified more than 700 k base pairs (kb) of single nucleotide polymorphisms (SNPs), insertions, and deletions (indels) within common regions of the genome, which validated the species and genetic populations reported within species. We constructed a non-redundant pan gene list containing 15,297 genes from the ten genomes and among them 1827 genes or 12% were absent in at least one genome. These genes were co-localized in telomeric regions and select regions within chromosomes with a corresponding increase in SNPs and indels. Many are also predicted to encode for proteins involved in secondary metabolism and other functions associated with disease. Genes that were common between isolates contained high levels of nucleotide variation and may be pseudogenes, allelic, or under diversifying selection. The genomic resources we have contributed will be useful for the identification of genes that contribute to the phenotypic variation and niche specialization that have been reported among members of the F. graminearum species complex.

  15. SynTView — an interactive multi-view genome browser for next-generation comparative microorganism genomics

    Science.gov (United States)

    2013-01-01

    Background Dynamic visualisation interfaces are required to explore the multiple microbial genome data now available, especially those obtained by high-throughput sequencing — a.k.a. “Next-Generation Sequencing” (NGS) — technologies; they would also be useful for “standard” annotated genomes whose chromosome organizations may be compared. Although various software systems are available, few offer an optimal combination of feature-rich capabilities, non-static user interfaces and multi-genome data handling. Results We developed SynTView, a comparative and interactive viewer for microbial genomes, designed to run as either a web-based tool (Flash technology) or a desktop application (AIR environment). The basis of the program is a generic genome browser with sub-maps holding information about genomic objects (annotations). The software is characterised by the presentation of syntenic organisations of microbial genomes and the visualisation of polymorphism data (typically Single Nucleotide Polymorphisms — SNPs) along these genomes; these features are accessible to the user in an integrated way. A variety of specialised views are available and are all dynamically inter-connected (including linear and circular multi-genome representations, dot plots, phylogenetic profiles, SNP density maps, and more). SynTView is not linked to any particular database, allowing the user to plug his own data into the system seamlessly, and use external web services for added functionalities. SynTView has now been used in several genome sequencing projects to help biologists make sense out of huge data sets. Conclusions The most important assets of SynTView are: (i) the interactivity due to the Flash technology; (ii) the capabilities for dynamic interaction between many specialised views; and (iii) the flexibility allowing various user data sets to be integrated. It can thus be used to investigate massive amounts of information efficiently at the chromosome level. This

  16. Restauro-G: A Rapid Genome Re-Annotation System for Comparative Genomics

    Institute of Scientific and Technical Information of China (English)

    Satoshi Tamaki; Kazuharu Arakawa; Nobuaki Kono; Masaru Tomita

    2007-01-01

    Annotations of complete genome sequences submitted directly from sequencing projects are diverse in terms of annotation strategies and update frequencies. These inconsistencies make comparative studies difficult. To allow rapid data preparation of a large number of complete genomes, automation and speed are important for genome re-annotation. Here we introduce an open-source rapid genome re-annotation software system, Restauro-G, specialized for bacterial genomes. Restauro-G re-annotates a genome by similarity searches utilizing the BLAST-Like Alignment Tool, referring to protein databases such as UniProt KB, NCBI nr, NCBI COGs, Pfam, and PSORTb. Re-annotation by Restauro-G achieved over 98% accuracy for most bacterial chromosomes in comparison with the original manually curated annotation of EMBL releases. Restauro-G was developed in the generic bioinformatics workbench G-language Genome Analysis Environment and is distributed at http://restauro-g.iab.keio.ac.jp/ under the GNU General Public License.

  17. CMG-biotools, a free workbench for basic comparative microbial genomics.

    Directory of Open Access Journals (Sweden)

    Tammi Vesth

    Full Text Available BACKGROUND: Today, there are more than a hundred times as many sequenced prokaryotic genomes than were present in the year 2000. The economical sequencing of genomic DNA has facilitated a whole new approach to microbial genomics. The real power of genomics is manifested through comparative genomics that can reveal strain specific characteristics, diversity within species and many other aspects. However, comparative genomics is a field not easily entered into by scientists with few computational skills. The CMG-biotools package is designed for microbiologists with limited knowledge of computational analysis and can be used to perform a number of analyses and comparisons of genomic data. RESULTS: The CMG-biotools system presents a stand-alone interface for comparative microbial genomics. The package is a customized operating system, based on Xubuntu 10.10, available through the open source Ubuntu project. The system can be installed on a virtual computer, allowing the user to run the system alongside any other operating system. Source codes for all programs are provided under GNU license, which makes it possible to transfer the programs to other systems if so desired. We here demonstrate the package by comparing and analyzing the diversity within the class Negativicutes, represented by 31 genomes including 10 genera. The analyses include 16S rRNA phylogeny, basic DNA and codon statistics, proteome comparisons using BLAST and graphical analyses of DNA structures. CONCLUSION: This paper shows the strength and diverse use of the CMG-biotools system. The system can be installed on a vide range of host operating systems and utilizes as much of the host computer as desired. It allows the user to compare multiple genomes, from various sources using standardized data formats and intuitive visualizations of results. The examples presented here clearly shows that users with limited computational experience can perform complicated analysis without much

  18. Comparative Genome Analyses of Vibrio anguillarum Strains Reveal a Link with Pathogenicity Traits

    Science.gov (United States)

    Castillo, Daniel; Alvise, Paul D.; Xu, Ruiqi; Zhang, Faxing; Middelboe, Mathias

    2017-01-01

    ABSTRACT Vibrio anguillarum is a marine bacterium that can cause vibriosis in many fish and shellfish species, leading to high mortalities and economic losses in aquaculture. Although putative virulence factors have been identified, the mechanism of pathogenesis of V. anguillarum is not fully understood. Here, we analyzed whole-genome sequences of a collection of V. anguillarum strains and compared them to virulence of the strains as determined in larval challenge assays. Previously identified virulence factors were globally distributed among the strains, with some genetic diversity. However, the pan-genome revealed that six out of nine high-virulence strains possessed a unique accessory genome that was attributed to pathogenic genomic islands, prophage-like elements, virulence factors, and a new set of gene clusters involved in biosynthesis, modification, and transport of polysaccharides. In contrast, V. anguillarum strains that were medium to nonvirulent had a high degree of genomic homogeneity. Finally, we found that a phylogeny based on the core genomes clustered the strains with moderate to no virulence, while six out of nine high-virulence strains represented phylogenetically separate clusters. Hence, we suggest a link between genotype and virulence characteristics of Vibrio anguillarum, which can be used to unravel the molecular evolution of V. anguillarum and can also be important from survey and diagnostic perspectives. IMPORTANCE Comparative genome analysis of strains of a pathogenic bacterial species can be a powerful tool to discover acquisition of mobile genetic elements related to virulence. Here, we compared 28 V. anguillarum strains that differed in virulence in fish larval models. By pan-genome analyses, we found that six of nine highly virulent strains had a unique core and accessory genome. In contrast, V. anguillarum strains that were medium to nonvirulent had low genomic diversity. Integration of genomic and phenotypic features provides

  19. Comparative Genome Analyses of Vibrio anguillarum Strains Reveal a Link with Pathogenicity Traits.

    Science.gov (United States)

    Castillo, Daniel; Alvise, Paul D; Xu, Ruiqi; Zhang, Faxing; Middelboe, Mathias; Gram, Lone

    2017-01-01

    Vibrio anguillarum is a marine bacterium that can cause vibriosis in many fish and shellfish species, leading to high mortalities and economic losses in aquaculture. Although putative virulence factors have been identified, the mechanism of pathogenesis of V. anguillarum is not fully understood. Here, we analyzed whole-genome sequences of a collection of V. anguillarum strains and compared them to virulence of the strains as determined in larval challenge assays. Previously identified virulence factors were globally distributed among the strains, with some genetic diversity. However, the pan-genome revealed that six out of nine high-virulence strains possessed a unique accessory genome that was attributed to pathogenic genomic islands, prophage-like elements, virulence factors, and a new set of gene clusters involved in biosynthesis, modification, and transport of polysaccharides. In contrast, V. anguillarum strains that were medium to nonvirulent had a high degree of genomic homogeneity. Finally, we found that a phylogeny based on the core genomes clustered the strains with moderate to no virulence, while six out of nine high-virulence strains represented phylogenetically separate clusters. Hence, we suggest a link between genotype and virulence characteristics of Vibrio anguillarum, which can be used to unravel the molecular evolution of V. anguillarum and can also be important from survey and diagnostic perspectives. IMPORTANCE Comparative genome analysis of strains of a pathogenic bacterial species can be a powerful tool to discover acquisition of mobile genetic elements related to virulence. Here, we compared 28 V. anguillarum strains that differed in virulence in fish larval models. By pan-genome analyses, we found that six of nine highly virulent strains had a unique core and accessory genome. In contrast, V. anguillarum strains that were medium to nonvirulent had low genomic diversity. Integration of genomic and phenotypic features provides insights

  20. The Perennial Ryegrass GenomeZipper – Targeted Use of Genome Resources for Comparative Grass Genomics

    DEFF Research Database (Denmark)

    Pfeiffer, Matthias; Martis, Mihaela; Asp, Torben;

    2013-01-01

    to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous...

  1. CyanoClust: comparative genome resources of cyanobacteria and plastids.

    Science.gov (United States)

    Sasaki, Naobumi V; Sato, Naoki

    2010-01-01

    Cyanobacteria, which perform oxygen-evolving photosynthesis as do chloroplasts of plants and algae, are one of the best-studied prokaryotic phyla and one from which many representative genomes have been sequenced. Lack of a suitable comparative genomic database has been a problem in cyanobacterial genomics because many proteins involved in physiological functions such as photosynthesis and nitrogen fixation are not catalogued in commonly used databases, such as Clusters of Orthologous Proteins (COG). CyanoClust is a database of homolog groups in cyanobacteria and plastids that are produced by the program Gclust. We have developed a web-server system for the protein homology database featuring cyanobacteria and plastids. Database URL: http://cyanoclust.c.u-tokyo.ac.jp/.

  2. Comparative genomics of Neisseria meningitidis: core genome, islands of horizontal transfer and pathogen-specific genes.

    Science.gov (United States)

    Dunning Hotopp, Julie C; Grifantini, Renata; Kumar, Nikhil; Tzeng, Yih Ling; Fouts, Derrick; Frigimelica, Elisabetta; Draghi, Monia; Giuliani, Marzia Monica; Rappuoli, Rino; Stephens, David S; Grandi, Guido; Tettelin, Hervé

    2006-12-01

    To better understand Neisseria meningitidis genomes and virulence, microarray comparative genome hybridization (mCGH) data were collected from one Neisseria cinerea, two Neisseria lactamica, two Neisseria gonorrhoeae and 48 Neisseria meningitidis isolates. For N. meningitidis, these isolates are from diverse clonal complexes, invasive and carriage strains, and all major serogroups. The microarray platform represented N. meningitidis strains MC58, Z2491 and FAM18, and N. gonorrhoeae FA1090. By comparing hybridization data to genome sequences, the core N. meningitidis genome and insertions/deletions (e.g. capsule locus, type I secretion system) related to pathogenicity were identified, including further characterization of the capsule locus, bioinformatics analysis of a type I secretion system, and identification of some metabolic pathways associated with intracellular survival in pathogens. Hybridization data clustered meningococcal isolates from similar clonal complexes that were distinguished by the differential presence of six distinct islands of horizontal transfer. Several of these islands contained prophage or other mobile elements, including a novel prophage and a transposon carrying portions of a type I secretion system. Acquisition of some genetic islands appears to have occurred in multiple lineages, including transfer between N. lactamica and N. meningitidis. However, island acquisition occurs infrequently, such that the genomic-level relationship is not obscured within clonal complexes. The N. meningitidis genome is characterized by the horizontal acquisition of multiple genetic islands; the study of these islands reveals important sets of genes varying between isolates and likely to be related to pathogenicity.

  3. The Chlamydia psittaci genome: a comparative analysis of intracellular pathogens.

    Directory of Open Access Journals (Sweden)

    Anja Voigt

    Full Text Available BACKGROUND: Chlamydiaceae are a family of obligate intracellular pathogens causing a wide range of diseases in animals and humans, and facing unique evolutionary constraints not encountered by free-living prokaryotes. To investigate genomic aspects of infection, virulence and host preference we have sequenced Chlamydia psittaci, the pathogenic agent of ornithosis. RESULTS: A comparison of the genome of the avian Chlamydia psittaci isolate 6BC with the genomes of other chlamydial species, C. trachomatis, C. muridarum, C. pneumoniae, C. abortus, C. felis and C. caviae, revealed a high level of sequence conservation and synteny across taxa, with the major exception of the human pathogen C. trachomatis. Important differences manifest in the polymorphic membrane protein family specific for the Chlamydiae and in the highly variable chlamydial plasticity zone. We identified a number of psittaci-specific polymorphic membrane proteins of the G family that may be related to differences in host-range and/or virulence as compared to closely related Chlamydiaceae. We calculated non-synonymous to synonymous substitution rate ratios for pairs of orthologous genes to identify putative targets of adaptive evolution and predicted type III secreted effector proteins. CONCLUSIONS: This study is the first detailed analysis of the Chlamydia psittaci genome sequence. It provides insights in the genome architecture of C. psittaci and proposes a number of novel candidate genes mostly of yet unknown function that may be important for pathogen-host interactions.

  4. Comparative analysis of methods for genome-wide nucleosome cartography.

    Science.gov (United States)

    Quintales, Luis; Vázquez, Enrique; Antequera, Francisco

    2015-07-01

    Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding the genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods from the initial biochemical steps through the alignment of the millions of short-sequence reads to a reference genome to the final computational analysis to generate genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  5. Sequencing and comparative genome analysis of two pathogenic Streptococcus gallolyticus subspecies: genome plasticity, adaptation and virulence.

    Directory of Open Access Journals (Sweden)

    I-Hsuan Lin

    Full Text Available Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations are different depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I and S. pasteurianus ATCC 43144 (biotype II.2. The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosome of ATCC 43143 and ATCC 43144 are 2,36 and 2,10 Mb in length and encode 2246 and 1869 CDS respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92% and 1607 (86% of the ATCC 43143 and ATCC 43144 CDS were conserved in UCN34 respectively. There are around 600 CDS conserved in all Streptococcus genomes, indicating the Streptococcus genus has a small core-genome (constitute around 30% of total CDS and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144 respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that the S. gallolyticus still possesses genes making it suitable in a rumen environment, whereas the ability for S. pasteurianus to live in rumen is reduced. The genome heterogeneity and genetic diversity among the two biotypes, especially membrane and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops.

  6. Cost-effective cloud computing: a case study using the comparative genomics tool, roundup.

    Science.gov (United States)

    Kudtarkar, Parul; Deluca, Todd F; Fusaro, Vincent A; Tonellato, Peter J; Wall, Dennis P

    2010-12-22

    Comparative genomics resources, such as ortholog detection tools and repositories are rapidly increasing in scale and complexity. Cloud computing is an emerging technological paradigm that enables researchers to dynamically build a dedicated virtual cluster and may represent a valuable alternative for large computational tools in bioinformatics. In the present manuscript, we optimize the computation of a large-scale comparative genomics resource-Roundup-using cloud computing, describe the proper operating principles required to achieve computational efficiency on the cloud, and detail important procedures for improving cost-effectiveness to ensure maximal computation at minimal costs. Utilizing the comparative genomics tool, Roundup, as a case study, we computed orthologs among 902 fully sequenced genomes on Amazon's Elastic Compute Cloud. For managing the ortholog processes, we designed a strategy to deploy the web service, Elastic MapReduce, and maximize the use of the cloud while simultaneously minimizing costs. Specifically, we created a model to estimate cloud runtime based on the size and complexity of the genomes being compared that determines in advance the optimal order of the jobs to be submitted. We computed orthologous relationships for 245,323 genome-to-genome comparisons on Amazon's computing cloud, a computation that required just over 200 hours and cost $8,000 USD, at least 40% less than expected under a strategy in which genome comparisons were submitted to the cloud randomly with respect to runtime. Our cost savings projections were based on a model that not only demonstrates the optimal strategy for deploying RSD to the cloud, but also finds the optimal cluster size to minimize waste and maximize usage. Our cost-reduction model is readily adaptable for other comparative genomics tools and potentially of significant benefit to labs seeking to take advantage of the cloud as an alternative to local computing infrastructure.

  7. Integrating cytogenetics and genomics in comparative evolutionary studies of cichlid fish

    Directory of Open Access Journals (Sweden)

    Mazzuchelli Juliana

    2012-09-01

    Full Text Available Abstract Background The availability of a large number of recently sequenced vertebrate genomes opens new avenues to integrate cytogenetics and genomics in comparative and evolutionary studies. Cytogenetic mapping can offer alternative means to identify conserved synteny shared by distinct genomes and also to define genome regions that are still not fine characterized even after wide-ranging nucleotide sequence efforts. An efficient way to perform comparative cytogenetic mapping is based on BAC clones mapping by fluorescence in situ hybridization. In this report, to address the knowledge gap on the genome evolution in cichlid fishes, BAC clones of an Oreochromis niloticus library covering the linkage groups (LG 1, 3, 5, and 7 were mapped onto the chromosomes of 9 African cichlid species. The cytogenetic mapping data were also integrated with BAC-end sequences information of O. niloticus and comparatively analyzed against the genome of other fish species and vertebrates. Results The location of BACs from LG1, 3, 5, and 7 revealed a strong chromosomal conservation among the analyzed cichlid species genomes, which evidenced a synteny of the markers of each LG. Comparative in silico analysis also identified large genomic blocks that were conserved in distantly related fish groups and also in other vertebrates. Conclusions Although it has been suggested that fishes contain plastic genomes with high rates of chromosomal rearrangements and probably low rates of synteny conservation, our results evidence that large syntenic chromosome segments have been maintained conserved during evolution, at least for the considered markers. Additionally, our current cytogenetic mapping efforts integrated with genomic approaches conduct to a new perspective to address important questions involving chromosome evolution in fishes.

  8. Comparative genomics of 12 strains of Erwinia amylovora identifies a pan-genome with a large conserved core.

    Directory of Open Access Journals (Sweden)

    Rachel A Mann

    Full Text Available The plant pathogen Erwinia amylovora can be divided into two host-specific groupings; strains infecting a broad range of hosts within the Rosaceae subfamily Spiraeoideae (e.g., Malus, Pyrus, Crataegus, Sorbus and strains infecting Rubus (raspberries and blackberries. Comparative genomic analysis of 12 strains representing distinct populations (e.g., geographic, temporal, host origin of E. amylovora was used to describe the pan-genome of this major pathogen. The pan-genome contains 5751 coding sequences and is highly conserved relative to other phytopathogenic bacteria comprising on average 89% conserved, core genes. The chromosomes of Spiraeoideae-infecting strains were highly homogeneous, while greater genetic diversity was observed between Spiraeoideae- and Rubus-infecting strains (and among individual Rubus-infecting strains, the majority of which was attributed to variable genomic islands. Based on genomic distance scores and phylogenetic analysis, the Rubus-infecting strain ATCC BAA-2158 was genetically more closely related to the Spiraeoideae-infecting strains of E. amylovora than it was to the other Rubus-infecting strains. Analysis of the accessory genomes of Spiraeoideae- and Rubus-infecting strains has identified putative host-specific determinants including variation in the effector protein HopX1(Ea and a putative secondary metabolite pathway only present in Rubus-infecting strains.

  9. Comparative genomics of 12 strains of Erwinia amylovora identifies a pan-genome with a large conserved core.

    Science.gov (United States)

    Mann, Rachel A; Smits, Theo H M; Bühlmann, Andreas; Blom, Jochen; Goesmann, Alexander; Frey, Jürg E; Plummer, Kim M; Beer, Steven V; Luck, Joanne; Duffy, Brion; Rodoni, Brendan

    2013-01-01

    The plant pathogen Erwinia amylovora can be divided into two host-specific groupings; strains infecting a broad range of hosts within the Rosaceae subfamily Spiraeoideae (e.g., Malus, Pyrus, Crataegus, Sorbus) and strains infecting Rubus (raspberries and blackberries). Comparative genomic analysis of 12 strains representing distinct populations (e.g., geographic, temporal, host origin) of E. amylovora was used to describe the pan-genome of this major pathogen. The pan-genome contains 5751 coding sequences and is highly conserved relative to other phytopathogenic bacteria comprising on average 89% conserved, core genes. The chromosomes of Spiraeoideae-infecting strains were highly homogeneous, while greater genetic diversity was observed between Spiraeoideae- and Rubus-infecting strains (and among individual Rubus-infecting strains), the majority of which was attributed to variable genomic islands. Based on genomic distance scores and phylogenetic analysis, the Rubus-infecting strain ATCC BAA-2158 was genetically more closely related to the Spiraeoideae-infecting strains of E. amylovora than it was to the other Rubus-infecting strains. Analysis of the accessory genomes of Spiraeoideae- and Rubus-infecting strains has identified putative host-specific determinants including variation in the effector protein HopX1(Ea) and a putative secondary metabolite pathway only present in Rubus-infecting strains.

  10. Comparative genomics of transcriptional regulation of methionine metabolism in Proteobacteria.

    Directory of Open Access Journals (Sweden)

    Semen A Leyn

    Full Text Available Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ∼ 200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific and genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria.

  11. Inference of self-regulated transcriptional networks by comparative genomics.

    Science.gov (United States)

    Cornish, Joseph P; Matthews, Fialelei; Thomas, Julien R; Erill, Ivan

    2012-01-01

    The assumption of basic properties, like self-regulation, in simple transcriptional regulatory networks can be exploited to infer regulatory motifs from the growing amounts of genomic and meta-genomic data. These motifs can in principle be used to elucidate the nature and scope of transcriptional networks through comparative genomics. Here we assess the feasibility of this approach using the SOS regulatory network of Gram-positive bacteria as a test case. Using experimentally validated data, we show that the known regulatory motif can be inferred through the assumption of self-regulation. Furthermore, the inferred motif provides a more robust search pattern for comparative genomics than the experimental motifs defined in reference organisms. We take advantage of this robustness to generate a functional map of the SOS response in Gram-positive bacteria. Our results reveal definite differences in the composition of the LexA regulon between Firmicutes and Actinobacteria, and confirm that regulation of cell-division inhibition is a widespread characteristic of this network among Gram-positive bacteria.

  12. Comparative mitochondrial genomics toward exploring molecular markers in the medicinal fungus Cordyceps militaris.

    Science.gov (United States)

    Zhang, Shu; Hao, Ai-Jing; Zhao, Yu-Xiang; Zhang, Xiao-Yu; Zhang, Yong-Jie

    2017-01-10

    Cordyceps militaris is a fungus used for developing health food, but knowledge about its intraspecific differentiation is limited due to lack of efficient markers. Herein, we assembled the mitochondrial genomes of eight C. militaris strains and performed a comparative mitochondrial genomic analysis together with three previously reported mitochondrial genomes of the fungus. Sizes of the 11 mitochondrial genomes varied from 26.5 to 33.9 kb mainly due to variable intron contents (from two to eight introns per strain). Nucleotide variability varied according to different regions with non-coding regions showing higher variation frequency than coding regions. Recombination events were identified between some locus pairs but seemed not to contribute greatly to genetic variations of the fungus. Based on nucleotide diversity fluctuations across the alignment of all mitochondrial genomes, molecular markers with the potential to be used for future typing studies were determined.

  13. Comparative genomics of Synechococcus and proposal of the new genus Parasynechococcus

    Directory of Open Access Journals (Sweden)

    Felipe Coutinho

    2016-01-01

    Full Text Available Synechococcus is among the most important contributors to global primary productivity. The genomes of several strains of this taxon have been previously sequenced in an effort to understand the physiology and ecology of these highly diverse microorganisms. Here we present a comparative study of Synechococcus genomes. For that end, we developed GenTaxo, a program written in Perl to perform genomic taxonomy based on average nucleotide identity, average amino acid identity and dinucleotide signatures, which revealed that the analyzed strains are drastically distinct regarding their genomic content. Phylogenomic reconstruction indicated a division of Synechococcus in two clades (i.e. Synechococcus and the new genus Parasynechococcus, corroborating evidences that this is in fact a polyphyletic group. By clustering protein encoding genes into homologue groups we were able to trace the Pangenome and core genome of both marine and freshwater Synechococcus and determine the genotypic traits that differentiate these lineages.

  14. Comparative genomics of Synechococcus and proposal of the new genus Parasynechococcus.

    Science.gov (United States)

    Coutinho, Felipe; Tschoeke, Diogo Antonio; Thompson, Fabiano; Thompson, Cristiane

    2016-01-01

    Synechococcus is among the most important contributors to global primary productivity. The genomes of several strains of this taxon have been previously sequenced in an effort to understand the physiology and ecology of these highly diverse microorganisms. Here we present a comparative study of Synechococcus genomes. For that end, we developed GenTaxo, a program written in Perl to perform genomic taxonomy based on average nucleotide identity, average amino acid identity and dinucleotide signatures, which revealed that the analyzed strains are drastically distinct regarding their genomic content. Phylogenomic reconstruction indicated a division of Synechococcus in two clades (i.e. Synechococcus and the new genus Parasynechococcus), corroborating evidences that this is in fact a polyphyletic group. By clustering protein encoding genes into homologue groups we were able to trace the Pangenome and core genome of both marine and freshwater Synechococcus and determine the genotypic traits that differentiate these lineages.

  15. Comparative genomics of Enterococcus faecalis from healthy Norwegian infants

    Directory of Open Access Journals (Sweden)

    Nes Ingolf F

    2009-04-01

    Full Text Available Abstract Background Enterococcus faecalis, traditionally considered a harmless commensal of the intestinal tract, is now ranked among the leading causes of nosocomial infections. In an attempt to gain insight into the genetic make-up of commensal E. faecalis, we have studied genomic variation in a collection of community-derived E. faecalis isolated from the feces of Norwegian infants. Results The E. faecalis isolates were first sequence typed by multilocus sequence typing (MLST and characterized with respect to antibiotic resistance and properties associated with virulence. A subset of the isolates was compared to the vancomycin resistant strain E. faecalis V583 (V583 by whole genome microarray comparison (comparative genomic hybridization (CGH. Several of the putative enterococcal virulence factors were found to be highly prevalent among the commensal baby isolates. The genomic variation as observed by CGH was less between isolates displaying the same MLST sequence type than between isolates belonging to different evolutionary lineages. Conclusion The variations in gene content observed among the investigated commensal E. faecalis is comparable to the genetic variation previously reported among strains of various origins thought to be representative of the major E. faecalis lineages. Previous MLST analysis of E. faecalis have identified so-called high-risk enterococcal clonal complexes (HiRECC, defined as genetically distinct subpopulations, epidemiologically associated with enterococcal infections. The observed correlation between CGH and MLST presented here, may offer a method for the identification of lineage-specific genes, and may therefore add clues on how to distinguish pathogenic from commensal E. faecalis. In this work, information on the core genome of E. faecalis is also substantially extended.

  16. Genomic characterization of some Iranian children with idiopathic mental retardation using array comparative genomic hybridization

    Directory of Open Access Journals (Sweden)

    Farkhondeh Behjati

    2013-01-01

    Full Text Available Background: Mental retardation (MR has a prevalence of 1-3% and genetic causes are present in more than 50% of patients. Chromosomal abnormalities are one of the most common genetic causes of MR and are responsible for 4-28% of mental retardation. However, the smallest loss or gain of material visible by standard cytogenetic is about 4 Mb and for smaller abnormalities, molecular cytogenetic techniques such as array comparative genomic hybridization (array CGH should be used. It has been shown that 15-25% of idiopathic MR (IMR has submicroscopic rearrangements detectable by array CGH. In this project, the genomic abnormalities were investigated in 32 MR patients using this technique. Materials and Methods: Patients with IMR with dysmorphism were investigated in this study. Karyotype analysis, fragile X and metabolic tests were first carried out on the patients. The copy number variation was then assessed in a total of 32 patients with normal results for the mentioned tests using whole genome oligo array CGH. Multiple ligation probe amplification was carried out as a confirmation test. Results: In total, 19% of the patients showed genomic abnormalities. This is reduced to 12.5% once the two patients with abnormal karyotypes (upon re-evaluation are removed. Conclusion: The array CGH technique increased the detection rate of genomic imbalances in our patients by 12.5%. It is an accurate and reliable method for the determination of genomic imbalances in patients with IMR and dysmorphism.

  17. A gene-based high-resolution comparative radiation hybrid map as a framework for genome sequence assembly of a bovine chromosome 6 region associated with QTL for growth, body composition, and milk performance traits

    Directory of Open Access Journals (Sweden)

    Laurent Pascal

    2006-03-01

    Full Text Available Abstract Background A number of different quantitative trait loci (QTL for various phenotypic traits, including milk production, functional, and conformation traits in dairy cattle as well as growth and body composition traits in meat cattle, have been mapped consistently in the middle region of bovine chromosome 6 (BTA6. Dense genetic and physical maps and, ultimately, a fully annotated genome sequence as well as their mutual connections are required to efficiently identify genes and gene variants responsible for genetic variation of phenotypic traits. A comprehensive high-resolution gene-rich map linking densely spaced bovine markers and genes to the annotated human genome sequence is required as a framework to facilitate this approach for the region on BTA6 carrying the QTL. Results Therefore, we constructed a high-resolution radiation hybrid (RH map for the QTL containing chromosomal region of BTA6. This new RH map with a total of 234 loci including 115 genes and ESTs displays a substantial increase in loci density compared to existing physical BTA6 maps. Screening the available bovine genome sequence resources, a total of 73 loci could be assigned to sequence contigs, which were already identified as specific for BTA6. For 43 loci, corresponding sequence contigs, which were not yet placed on the bovine genome assembly, were identified. In addition, the improved potential of this high-resolution RH map for BTA6 with respect to comparative mapping was demonstrated. Mapping a large number of genes on BTA6 and cross-referencing them with map locations in corresponding syntenic multi-species chromosome segments (human, mouse, rat, dog, chicken achieved a refined accurate alignment of conserved segments and evolutionary breakpoints across the species included. Conclusion The gene-anchored high-resolution RH map (1 locus/300 kb for the targeted region of BTA6 presented here will provide a valuable platform to guide high-quality assembling and

  18. Genomic comparisons of Brucella spp. and closely related bacteria using base compositional and proteome based methods

    DEFF Research Database (Denmark)

    Bohlin, Jon; Snipen, Lars; Cloeckaert, Axel

    2010-01-01

    , genomic codon and amino acid frequencies based comparisons) and proteomes (all-against-all BLAST protein comparisons and pan-genomic analyses). RESULTS: We found that the oligonucleotide based methods gave different results compared to that of the proteome based methods. Differences were also found...... than proteome comparisons between species in genus Brucella and genus Ochrobactrum. Pan-genomic analyses indicated that uptake of DNA from outside genus Brucella appears to be limited. CONCLUSIONS: While both the proteome based methods and the Markov chain based genomic signatures were able to reflect...

  19. Parallel WGA and WTA for Comparative Genome and Transcriptome NGS Analysis Using Tiny Cell Numbers.

    Science.gov (United States)

    Korfhage, Christian; Fricke, Evelyn; Meier, Andreas

    2015-07-01

    Genomic DNA determines how and when the transcriptome is changed by a trigger or environmental change and how cellular metabolism is influenced. Comparative genome and transcriptome analysis of the same cell sample links a defined genome with all changes in the bases, structure, or numbers of the transcriptome. However, comparative genome and transcriptome analysis using next-generation sequencing (NGS) or real-time PCR is often limited by the small amount of sample available. In mammals, the amount of DNA and RNA in a single cell is ∼10 picograms, but deep analysis of the genome and transcriptome currently requires several hundred nanograms of nucleic acids for library preparation for NGS sequencing. Consequently, accurate whole-genome amplification (WGA) and whole-transcriptome amplification (WTA) is required for such quantitative analysis. This unit describes how the genome and the transcriptome of a tiny number of cells can be amplified in a highly parallel and comparable process. Protocols for quality control of amplified DNA and application of amplified DNA for NGS are included.

  20. Genome analysis and comparative genomics of a Giardia intestinalis assemblage E isolate

    Directory of Open Access Journals (Sweden)

    Andersson Jan O

    2010-10-01

    Full Text Available Abstract Background Giardia intestinalis is a protozoan parasite that causes diarrhea in a wide range of mammalian species. To further understand the genetic diversity between the Giardia intestinalis species, we have performed genome sequencing and analysis of a wild-type Giardia intestinalis sample from the assemblage E group, isolated from a pig. Results We identified 5012 protein coding genes, the majority of which are conserved compared to the previously sequenced genomes of the WB and GS strains in terms of microsynteny and sequence identity. Despite this, there is an unexpectedly large number of chromosomal rearrangements and several smaller structural changes that are present in all chromosomes. Novel members of the VSP, NEK Kinase and HCMP gene families were identified, which may reveal possible mechanisms for host specificity and new avenues for antigenic variation. We used comparative genomics of the three diverse Giardia intestinalis isolates P15, GS and WB to define a core proteome for this species complex and to identify lineage-specific genes. Extensive analyses of polymorphisms in the core proteome of Giardia revealed differential rates of divergence among cellular processes. Conclusions Our results indicate that despite a well conserved core of genes there is significant genome variation between Giardia isolates, both in terms of gene content, gene polymorphisms, structural chromosomal variations and surface molecule repertoires. This study improves the annotation of the Giardia genomes and enables the identification of functionally important variation.

  1. Classical Oncogenes and Tumor Suppressor Genes: A Comparative Genomics Perspective

    Directory of Open Access Journals (Sweden)

    Oxana K. Pickeral

    2000-05-01

    Full Text Available We have curated a reference set of cancer-related genes and reanalyzed their sequences in the light of molecular information and resources that have become available since they were first cloned. Homology studies were carried out for human oncogenes and tumor suppressors, compared with the complete proteome of the nematode, Caenorhabditis elegans, and partial proteomes of mouse and rat and the fruit fly, Drosophila melanogaster. Our results demonstrate that simple, semi-automated bioinformatics approaches to identifying putative functionally equivalent gene products in different organisms may often be misleading. An electronic supplement to this article1 provides an integrated view of our comparative genomics analysis as well as mapping data, physical cDNA resources and links to published literature and reviews, thus creating a “window” into the genomes of humans and other organisms for cancer biology.

  2. Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes.

    Science.gov (United States)

    Sun, Yan-Bo; Xiong, Zi-Jun; Xiang, Xue-Yan; Liu, Shi-Ping; Zhou, Wei-Wei; Tu, Xiao-Long; Zhong, Li; Wang, Lu; Wu, Dong-Dong; Zhang, Bao-Lin; Zhu, Chun-Ling; Yang, Min-Min; Chen, Hong-Man; Li, Fang; Zhou, Long; Feng, Shao-Hong; Huang, Chao; Zhang, Guo-Jie; Irwin, David; Hillis, David M; Murphy, Robert W; Yang, Huan-Ming; Che, Jing; Wang, Jun; Zhang, Ya-Ping

    2015-03-17

    The development of efficient sequencing techniques has resulted in large numbers of genomes being available for evolutionary studies. However, only one genome is available for all amphibians, that of Xenopus tropicalis, which is distantly related from the majority of frogs. More than 96% of frogs belong to the Neobatrachia, and no genome exists for this group. This dearth of amphibian genomes greatly restricts genomic studies of amphibians and, more generally, our understanding of tetrapod genome evolution. To fill this gap, we provide the de novo genome of a Tibetan Plateau frog, Nanorana parkeri, and compare it to that of X. tropicalis and other vertebrates. This genome encodes more than 20,000 protein-coding genes, a number similar to that of Xenopus. Although the genome size of Nanorana is considerably larger than that of Xenopus (2.3 vs. 1.5 Gb), most of the difference is due to the respective number of transposable elements in the two genomes. The two frogs exhibit considerable conserved whole-genome synteny despite having diverged approximately 266 Ma, indicating a slow rate of DNA structural evolution in anurans. Multigenome synteny blocks further show that amphibians have fewer interchromosomal rearrangements than mammals but have a comparable rate of intrachromosomal rearrangements. Our analysis also identifies 11 Mb of anuran-specific highly conserved elements that will be useful for comparative genomic analyses of frogs. The Nanorana genome offers an improved understanding of evolution of tetrapod genomes and also provides a genomic reference for other evolutionary studies.

  3. Ecology of marine Bacteroidetes: a comparative genomics approach.

    Science.gov (United States)

    Fernández-Gómez, Beatriz; Richter, Michael; Schüler, Margarete; Pinhassi, Jarone; Acinas, Silvia G; González, José M; Pedrós-Alió, Carlos

    2013-05-01

    Bacteroidetes are commonly assumed to be specialized in degrading high molecular weight (HMW) compounds and to have a preference for growth attached to particles, surfaces or algal cells. The first sequenced genomes of marine Bacteroidetes seemed to confirm this assumption. Many more genomes have been sequenced recently. Here, a comparative analysis of marine Bacteroidetes genomes revealed a life strategy different from those of other important phyla of marine bacterioplankton such as Cyanobacteria and Proteobacteria. Bacteroidetes have many adaptations to grow attached to particles, have the capacity to degrade polymers, including a large number of peptidases, glycoside hydrolases (GHs), glycosyl transferases, adhesion proteins, as well as the genes for gliding motility. Several of the polymer degradation genes are located in close association with genes for TonB-dependent receptors and transducers, suggesting an integrated regulation of adhesion and degradation of polymers. This confirmed the role of this abundant group of marine bacteria as degraders of particulate matter. Marine Bacteroidetes had a significantly larger number of proteases than GHs, while non-marine Bacteroidetes had equal numbers of both. Proteorhodopsin containing Bacteroidetes shared two characteristics: small genome size and a higher number of genes involved in CO2 fixation per Mb. The latter may be important in order to survive when floating freely in the illuminated, but nutrient-poor, ocean surface.

  4. Bamboo Flowering from the Perspective of Comparative Genomics and Transcriptomics.

    Science.gov (United States)

    Biswas, Prasun; Chakraborty, Sukanya; Dutta, Smritikana; Pal, Amita; Das, Malay

    2016-01-01

    Bamboos are an important member of the subfamily Bambusoideae, family Poaceae. The plant group exhibits wide variation with respect to the timing (1-120 years) and nature (sporadic vs. gregarious) of flowering among species. Usually flowering in woody bamboos is synchronous across culms growing over a large area, known as gregarious flowering. In many monocarpic bamboos this is followed by mass death and seed setting. While in sporadic flowering an isolated wild clump may flower, set little or no seed and remain alive. Such wide variation in flowering time and extent means that the plant group serves as repositories for genes and expression patterns that are unique to bamboo. Due to the dearth of available genomic and transcriptomic resources, limited studies have been undertaken to identify the potential molecular players in bamboo flowering. The public release of the first bamboo genome sequence Phyllostachys heterocycla, availability of related genomes Brachypodium distachyon and Oryza sativa provide us the opportunity to study this long-standing biological problem in a comparative and functional genomics framework. We identified bamboo genes homologous to those of Oryza and Brachypodium that are involved in established pathways such as vernalization, photoperiod, autonomous, and hormonal regulation of flowering. Additionally, we investigated triggers like stress (drought), physiological maturity and micro RNAs that may play crucial roles in flowering. We also analyzed available transcriptome datasets of different bamboo species to identify genes and their involvement in bamboo flowering. Finally, we summarize potential research hurdles that need to be addressed in future research.

  5. In silico comparative genomic analysis of GABAA receptor transcriptional regulation

    Directory of Open Access Journals (Sweden)

    Joyce Christopher J

    2007-06-01

    Full Text Available Abstract Background Subtypes of the GABAA receptor subunit exhibit diverse temporal and spatial expression patterns. In silico comparative analysis was used to predict transcriptional regulatory features in individual mammalian GABAA receptor subunit genes, and to identify potential transcriptional regulatory components involved in the coordinate regulation of the GABAA receptor gene clusters. Results Previously unreported putative promoters were identified for the β2, γ1, γ3, ε, θ and π subunit genes. Putative core elements and proximal transcriptional factors were identified within these predicted promoters, and within the experimentally determined promoters of other subunit genes. Conserved intergenic regions of sequence in the mammalian GABAA receptor gene cluster comprising the α1, β2, γ2 and α6 subunits were identified as potential long range transcriptional regulatory components involved in the coordinate regulation of these genes. A region of predicted DNase I hypersensitive sites within the cluster may contain transcriptional regulatory features coordinating gene expression. A novel model is proposed for the coordinate control of the gene cluster and parallel expression of the α1 and β2 subunits, based upon the selective action of putative Scaffold/Matrix Attachment Regions (S/MARs. Conclusion The putative regulatory features identified by genomic analysis of GABAA receptor genes were substantiated by cross-species comparative analysis and now require experimental verification. The proposed model for the coordinate regulation of genes in the cluster accounts for the head-to-head orientation and parallel expression of the α1 and β2 subunit genes, and for the disruption of transcription caused by insertion of a neomycin gene in the close vicinity of the α6 gene, which is proximal to a putative critical S/MAR.

  6. Microarray comparative genomic hybridisation analysis incorporating genomic organisation, and application to enterobacterial plant pathogens.

    Directory of Open Access Journals (Sweden)

    Leighton Pritchard

    2009-08-01

    Full Text Available Microarray comparative genomic hybridisation (aCGH provides an estimate of the relative abundance of genomic DNA (gDNA taken from comparator and reference organisms by hybridisation to a microarray containing probes that represent sequences from the reference organism. The experimental method is used in a number of biological applications, including the detection of human chromosomal aberrations, and in comparative genomic analysis of bacterial strains, but optimisation of the analysis is desirable in each problem domain.We present a method for analysis of bacterial aCGH data that encodes spatial information from the reference genome in a hidden Markov model. This technique is the first such method to be validated in comparisons of sequenced bacteria that diverge at the strain and at the genus level: Pectobacterium atrosepticum SCRI1043 (Pba1043 and Dickeya dadantii 3937 (Dda3937; and Lactococcus lactis subsp. lactis IL1403 and L. lactis subsp. cremoris MG1363. In all cases our method is found to outperform common and widely used aCGH analysis methods that do not incorporate spatial information. This analysis is applied to comparisons between commercially important plant pathogenic soft-rotting enterobacteria (SRE Pba1043, P. atrosepticum SCRI1039, P. carotovorum 193, and Dda3937.Our analysis indicates that it should not be assumed that hybridisation strength is a reliable proxy for sequence identity in aCGH experiments, and robustly extends the applicability of aCGH to bacterial comparisons at the genus level. Our results in the SRE further provide evidence for a dynamic, plastic 'accessory' genome, revealing major genomic islands encoding gene products that provide insight into, and may play a direct role in determining, variation amongst the SRE in terms of their environmental survival, host range and aetiology, such as phytotoxin synthesis, multidrug resistance, and nitrogen fixation.

  7. Comparative genomic assessment of Multi-Locus Sequence Typing: rapid accumulation of genomic heterogeneity among clonal isolates of Campylobacter jejuni

    Directory of Open Access Journals (Sweden)

    Nash John HE

    2008-08-01

    Full Text Available Abstract Background Multi-Locus Sequence Typing (MLST has emerged as a leading molecular typing method owing to its high ability to discriminate among bacterial isolates, the relative ease with which data acquisition and analysis can be standardized, and the high portability of the resulting sequence data. While MLST has been successfully applied to the study of the population structure for a number of different bacterial species, it has also provided compelling evidence for high rates of recombination in some species. We have analyzed a set of Campylobacter jejuni strains using MLST and Comparative Genomic Hybridization (CGH on a full-genome microarray in order to determine whether recombination and high levels of genomic mosaicism adversely affect the inference of strain relationships based on the analysis of a restricted number of genetic loci. Results Our results indicate that, in general, there is significant concordance between strain relationships established by MLST and those based on shared gene content as established by CGH. While MLST has significant predictive power with respect to overall genome similarity of isolates, we also found evidence for significant differences in genomic content among strains that would otherwise appear to be highly related based on their MLST profiles. Conclusion The extensive genomic mosaicism between closely related strains has important implications in the context of establishing strain to strain relationships because it suggests that the exact gene content of strains, and by extension their phenotype, is less likely to be "predicted" based on a small number of typing loci. This in turn suggests that a greater emphasis should be placed on analyzing genes of clinical interest as we forge ahead with the next generation of molecular typing methods.

  8. Comparative Analysis on Genomes from Oryza alta and Oryza latifolia by C0t-1 DNA

    Institute of Scientific and Technical Information of China (English)

    WANG De-bin; WANG Yang; WU Qi; ZHAO Hou-ming; LI Gang; QIN Rui; WANG Chun-tai; LIU Hong

    2010-01-01

    In order to reveal the origin and evolutionary relationship between two CCDD genome species, Oryza alta and Oryza latifolia, fluorescence in situ hybridization (FISH) was adopted to analyze the genomes of the two species with C0t-1 DNA from O. alta as a probe. Karyotype was also comparatively analyzed between O. alta and O. latifolia based on their similar band patterns of the hybridization signals. There were a high homology and close relationship between O. alta and O. latifolia, however, the distinction between the hybridization signals was also clear. C0t-1 DNA was proved to be species- and genome type-specific. It is suggested that C0t-1 DNA-FISH could be more efficient to analyze the genomic relationship between different species. According to the comparative analysis of highly and moderately repetitive DNA sequences between the two allotetraploidy species, O. alta and O. latifolia, the possible origin and evolutionary mechanism of allotetraploidy of Oryza were discussed.

  9. Comparative Genomics of the Extreme Acidophile Acidithiobacillus thiooxidans Reveals Intraspecific Divergence and Niche Adaptation

    Directory of Open Access Journals (Sweden)

    Xian Zhang

    2016-08-01

    Full Text Available Acidithiobacillus thiooxidans known for its ubiquity in diverse acidic and sulfur-bearing environments worldwide was used as the research subject in this study. To explore the genomic fluidity and intraspecific diversity of Acidithiobacillus thiooxidans (A. thiooxidans species, comparative genomics based on nine draft genomes was performed. Phylogenomic scrutiny provided first insights into the multiple groupings of these strains, suggesting that genetic diversity might be potentially correlated with their geographic distribution as well as geochemical conditions. While these strains shared a large number of common genes, they displayed differences in gene content. Functional assignment indicated that the core genome was essential for microbial basic activities such as energy acquisition and uptake of nutrients, whereas the accessory genome was thought to be involved in niche adaptation. Comprehensive analysis of their predicted central metabolism revealed that few differences were observed among these strains. Further analyses showed evidences of relevance between environmental conditions and genomic diversification. Furthermore, a diverse pool of mobile genetic elements including insertion sequences and genomic islands in all A. thiooxidans strains probably demonstrated the frequent genetic flow (such as lateral gene transfer in the extremely acidic environments. From another perspective, these elements might endow A. thiooxidans species with capacities to withstand the chemical constraints of their natural habitats. Taken together, our findings bring some valuable data to better understand the genomic diversity and econiche adaptation within A. thiooxidans strains.

  10. Comparative Genomics of the Extreme Acidophile Acidithiobacillus thiooxidans Reveals Intraspecific Divergence and Niche Adaptation.

    Science.gov (United States)

    Zhang, Xian; Feng, Xue; Tao, Jiemeng; Ma, Liyuan; Xiao, Yunhua; Liang, Yili; Liu, Xueduan; Yin, Huaqun

    2016-08-19

    Acidithiobacillus thiooxidans known for its ubiquity in diverse acidic and sulfur-bearing environments worldwide was used as the research subject in this study. To explore the genomic fluidity and intraspecific diversity of Acidithiobacillus thiooxidans (A. thiooxidans) species, comparative genomics based on nine draft genomes was performed. Phylogenomic scrutiny provided first insights into the multiple groupings of these strains, suggesting that genetic diversity might be potentially correlated with their geographic distribution as well as geochemical conditions. While these strains shared a large number of common genes, they displayed differences in gene content. Functional assignment indicated that the core genome was essential for microbial basic activities such as energy acquisition and uptake of nutrients, whereas the accessory genome was thought to be involved in niche adaptation. Comprehensive analysis of their predicted central metabolism revealed that few differences were observed among these strains. Further analyses showed evidences of relevance between environmental conditions and genomic diversification. Furthermore, a diverse pool of mobile genetic elements including insertion sequences and genomic islands in all A. thiooxidans strains probably demonstrated the frequent genetic flow (such as lateral gene transfer) in the extremely acidic environments. From another perspective, these elements might endow A. thiooxidans species with capacities to withstand the chemical constraints of their natural habitats. Taken together, our findings bring some valuable data to better understand the genomic diversity and econiche adaptation within A. thiooxidans strains.

  11. Genomic Comparative Study of Bovine Mastitis Escherichia coli

    Science.gov (United States)

    Kempf, Florent; Slugocki, Cindy; Blum, Shlomo E.; Leitner, Gabriel; Germon, Pierre

    2016-01-01

    Escherichia coli, one of the main causative agents of bovine mastitis, is responsible for significant losses on dairy farms. In order to better understand the pathogenicity of E. coli mastitis, an accurate characterization of E. coli strains isolated from mastitis cases is required. By using phylogenetic analyses and whole genome comparison of 5 currently available mastitis E. coli genome sequences, we searched for genotypic traits specific for mastitis isolates. Our data confirm that there is a bias in the distribution of mastitis isolates in the different phylogenetic groups of the E. coli species, with the majority of strains belonging to phylogenetic groups A and B1. An interesting feature is that clustering of strains based on their accessory genome is very similar to that obtained using the core genome. This finding illustrates the fact that phenotypic properties of strains from different phylogroups are likely to be different. As a consequence, it is possible that different strategies could be used by mastitis isolates of different phylogroups to trigger mastitis. Our results indicate that mastitis E. coli isolates analyzed in this study carry very few of the virulence genes described in other pathogenic E. coli strains. A more detailed analysis of the presence/absence of genes involved in LPS synthesis, iron acquisition and type 6 secretion systems did not uncover specific properties of mastitis isolates. Altogether, these results indicate that mastitis E. coli isolates are rather characterized by a lack of bona fide currently described virulence genes. PMID:26809117

  12. Genomic Comparative Study of Bovine Mastitis Escherichia coli.

    Science.gov (United States)

    Kempf, Florent; Slugocki, Cindy; Blum, Shlomo E; Leitner, Gabriel; Germon, Pierre

    2016-01-01

    Escherichia coli, one of the main causative agents of bovine mastitis, is responsible for significant losses on dairy farms. In order to better understand the pathogenicity of E. coli mastitis, an accurate characterization of E. coli strains isolated from mastitis cases is required. By using phylogenetic analyses and whole genome comparison of 5 currently available mastitis E. coli genome sequences, we searched for genotypic traits specific for mastitis isolates. Our data confirm that there is a bias in the distribution of mastitis isolates in the different phylogenetic groups of the E. coli species, with the majority of strains belonging to phylogenetic groups A and B1. An interesting feature is that clustering of strains based on their accessory genome is very similar to that obtained using the core genome. This finding illustrates the fact that phenotypic properties of strains from different phylogroups are likely to be different. As a consequence, it is possible that different strategies could be used by mastitis isolates of different phylogroups to trigger mastitis. Our results indicate that mastitis E. coli isolates analyzed in this study carry very few of the virulence genes described in other pathogenic E. coli strains. A more detailed analysis of the presence/absence of genes involved in LPS synthesis, iron acquisition and type 6 secretion systems did not uncover specific properties of mastitis isolates. Altogether, these results indicate that mastitis E. coli isolates are rather characterized by a lack of bona fide currently described virulence genes.

  13. Comparative genomics of two ‘Candidatus Accumulibacter' clades performing biological phosphorus removal

    Science.gov (United States)

    Flowers, Jason J; He, Shaomei; Malfatti, Stephanie; del Rio, Tijana Glavina; Tringe, Susannah G; Hugenholtz, Philip; McMahon, Katherine D

    2013-01-01

    Members of the genus Candidatus Accumulibacter are important in many wastewater treatment systems performing enhanced biological phosphorus removal (EBPR). The Accumulibacter lineage can be subdivided phylogenetically into multiple clades, and previous work showed that these clades are ecologically distinct. The complete genome of Candidatus Accumulibacter phosphatis strain UW-1, a member of Clade IIA, was previously sequenced. Here, we report a draft genome sequence of Candidatus Accumulibacter spp. strain UW-2, a member of Clade IA, assembled following shotgun metagenomic sequencing of laboratory-scale bioreactor sludge. We estimate the genome to be 80–90% complete. Although the two clades share 16S rRNA sequence identity of >98.0%, we observed a remarkable lack of synteny between the two genomes. We identified 2317 genes shared between the two genomes, with an average nucleotide identity (ANI) of 78.3%, and accounting for 49% of genes in the UW-1 genome. Unlike UW-1, the UW-2 genome seemed to lack genes for nitrogen fixation and carbon fixation. Despite these differences, metabolic genes essential for denitrification and EBPR, including carbon storage polymer and polyphosphate metabolism, were conserved in both genomes. The ANI from genes associated with EBPR was statistically higher than that from genes not associated with EBPR, indicating a high selective pressure in EBPR systems. Further, we identified genomic islands of foreign origins including a near-complete lysogenic phage in the Clade IA genome. Interestingly, Clade IA appeared to be more phage susceptible based on it containing only a single Clustered Regularly Interspaced Short Palindromic Repeats locus as compared with the two found in Clade IIA. Overall, the comparative analysis provided a genetic basis to understand physiological differences and ecological niches of Accumulibacter populations, and highlights the importance of diversity in maintaining system functional resilience. PMID:23887171

  14. Comparative genomics of two 'Candidatus Accumulibacter' clades performing biological phosphorus removal.

    Science.gov (United States)

    Flowers, Jason J; He, Shaomei; Malfatti, Stephanie; del Rio, Tijana Glavina; Tringe, Susannah G; Hugenholtz, Philip; McMahon, Katherine D

    2013-12-01

    Members of the genus Candidatus Accumulibacter are important in many wastewater treatment systems performing enhanced biological phosphorus removal (EBPR). The Accumulibacter lineage can be subdivided phylogenetically into multiple clades, and previous work showed that these clades are ecologically distinct. The complete genome of Candidatus Accumulibacter phosphatis strain UW-1, a member of Clade IIA, was previously sequenced. Here, we report a draft genome sequence of Candidatus Accumulibacter spp. strain UW-2, a member of Clade IA, assembled following shotgun metagenomic sequencing of laboratory-scale bioreactor sludge. We estimate the genome to be 80-90% complete. Although the two clades share 16S rRNA sequence identity of >98.0%, we observed a remarkable lack of synteny between the two genomes. We identified 2317 genes shared between the two genomes, with an average nucleotide identity (ANI) of 78.3%, and accounting for 49% of genes in the UW-1 genome. Unlike UW-1, the UW-2 genome seemed to lack genes for nitrogen fixation and carbon fixation. Despite these differences, metabolic genes essential for denitrification and EBPR, including carbon storage polymer and polyphosphate metabolism, were conserved in both genomes. The ANI from genes associated with EBPR was statistically higher than that from genes not associated with EBPR, indicating a high selective pressure in EBPR systems. Further, we identified genomic islands of foreign origins including a near-complete lysogenic phage in the Clade IA genome. Interestingly, Clade IA appeared to be more phage susceptible based on it containing only a single Clustered Regularly Interspaced Short Palindromic Repeats locus as compared with the two found in Clade IIA. Overall, the comparative analysis provided a genetic basis to understand physiological differences and ecological niches of Accumulibacter populations, and highlights the importance of diversity in maintaining system functional resilience.

  15. Diversity of Pseudomonas Genomes, Including Populus-Associated Isolates, as Revealed by Comparative Genome Analysis.

    Science.gov (United States)

    Jun, Se-Ran; Wassenaar, Trudy M; Nookaew, Intawat; Hauser, Loren; Wanchai, Visanu; Land, Miriam; Timm, Collin M; Lu, Tse-Yuan S; Schadt, Christopher W; Doktycz, Mitchel J; Pelletier, Dale A; Ussery, David W

    2015-10-30

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches, including the rhizosphere and endosphere of many plants. Their diversity influences the phylogenetic diversity and heterogeneity of these communities. On the basis of average amino acid identity, comparative genome analysis of >1,000 Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides (eastern cottonwood) trees resulted in consistent and robust genomic clusters with phylogenetic homogeneity. All Pseudomonas aeruginosa genomes clustered together, and these were clearly distinct from other Pseudomonas species groups on the basis of pangenome and core genome analyses. In contrast, the genomes of Pseudomonas fluorescens were organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. Most of our 21 Populus-associated isolates formed three distinct subgroups within the major P. fluorescens group, supported by pathway profile analysis, while two isolates were more closely related to Pseudomonas chlororaphis and Pseudomonas putida. Genes specific to Populus-associated subgroups were identified. Genes specific to subgroup 1 include several sensory systems that act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor. Genes specific to subgroup 2 contain hypothetical genes, and genes specific to subgroup 3 were annotated with hydrolase activity. This study justifies the need to sequence multiple isolates, especially from P. fluorescens, which displays the most genetic variation, in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.

  16. The genome sequence of Blochmannia floridanus: Comparative analysis of reduced genomes

    Science.gov (United States)

    Gil, Rosario; Silva, Francisco J.; Zientz, Evelyn; Delmotte, François; González-Candelas, Fernando; Latorre, Amparo; Rausell, Carolina; Kamerbeek, Judith; Gadau, Jürgen; Hölldobler, Bert; van Ham, Roeland C. H. J.; Gross, Roy; Moya, Andrés

    2003-01-01

    Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely has a nutritional basis: Blochmannia is able to supply nitrogen and sulfur compounds to the host while it takes advantage of the host metabolic machinery. Remarkably, these bacteria lack all known genes involved in replication initiation (dnaA, priA, and recA). The phylogenetic analysis of a set of conserved protein-coding genes shows that Bl. floridanus is phylogenetically related to Buchnera aphidicola and Wigglesworthia glossinidia, the other endosymbiotic bacteria whose complete genomes have been sequenced so far. Comparative analysis of the five known genomes from insect endosymbiotic bacteria reveals they share only 313 genes, a number that may be close to the minimum gene set necessary to sustain endosymbiotic life. PMID:12886019

  17. Genomic analysis by oligonucleotide array Comparative Genomic Hybridization utilizing formalin-fixed, paraffin-embedded tissues.

    Science.gov (United States)

    Savage, Stephanie J; Hostetter, Galen

    2011-01-01

    Formalin fixation has been used to preserve tissues for more than a hundred years, and there are currently more than 300 million archival samples in the United States alone. The application of genomic protocols such as high-density oligonucleotide array Comparative Genomic Hybridization (aCGH) to formalin-fixed, paraffin-embedded (FFPE) tissues, therefore, opens an untapped resource of available tissues for research and facilitates utilization of existing clinical data in a research sample set. However, formalin fixation results in cross-linking of proteins and DNA, typically leading to such a significant degradation of DNA template that little is available for use in molecular applications. Here, we describe a protocol to circumvent formalin fixation artifact by utilizing enzymatic reactions to obtain quality DNA from a wide range of FFPE tissues for successful genome-wide discovery of gene dosage alterations in archival clinical samples.

  18. Comparative analysis of microsatellites in chloroplast genomes of lower and higher plants.

    Science.gov (United States)

    George, Biju; Bhatt, Bhavin S; Awasthi, Mayur; George, Binu; Singh, Achuit K

    2015-11-01

    Microsatellites, or simple sequence repeats (SSRs), contain repetitive DNA sequence where tandem repeats of one to six base pairs are present number of times. Chloroplast genome sequences have been  shown to possess extensive variations in the length, number and distribution of SSRs. However, a comparative analysis of chloroplast microsatellites is not available. Considering their potential importance in generating genomic diversity, we have systematically analysed the abundance and distribution of simple and compound microsatellites in 164 sequenced chloroplast genomes from wide range of plants. The key findings of these studies are (1) a large number of mononucleotide repeats as compared to SSR(2-6)(di-, tri-, tetra-, penta-, hexanucleotide repeats) are present in all chloroplast genomes investigated, (2) lower plants such as algae show wide variation in relative abundance, density and distribution of microsatellite repeats as compared to flowering plants, (3) longer SSRs are excluded from coding regions of most chloroplast genomes, (4) GC content has a weak influence on number, relative abundance and relative density of mononucleotide as well as SSR(2-6). However, GC content strongly showed negative correlation with relative density (R (2) = 0.5, P plants possesses relatively more genomic diversity compared to higher plants.

  19. Reduction and expansion in microsporidian genome evolution: new insights from comparative genomics.

    Science.gov (United States)

    Nakjang, Sirintra; Williams, Tom A; Heinz, Eva; Watson, Andrew K; Foster, Peter G; Sendra, Kacper M; Heaps, Sarah E; Hirt, Robert P; Martin Embley, T

    2013-01-01

    Microsporidia are an abundant group of obligate intracellular parasites of other eukaryotes, including immunocompromised humans, but the molecular basis of their intracellular lifestyle and pathobiology are poorly understood. New genomes from a taxonomically broad range of microsporidians, complemented by published expression data, provide an opportunity for comparative analyses to identify conserved and lineage-specific patterns of microsporidian genome evolution that have underpinned this success. In this study, we infer that a dramatic bottleneck in the last common microsporidian ancestor (LCMA) left a small conserved core of genes that was subsequently embellished by gene family expansion driven by gene acquisition in different lineages. Novel expressed protein families represent a substantial fraction of sequenced microsporidian genomes and are significantly enriched for signals consistent with secretion or membrane location. Further evidence of selection is inferred from the gain and reciprocal loss of functional domains between paralogous genes, for example, affecting transport proteins. Gene expansions among transporter families preferentially affect those that are located on the plasma membrane of model organisms, consistent with recruitment to plug conserved gaps in microsporidian biosynthesis and metabolism. Core microsporidian genes shared with other eukaryotes are enriched in orthologs that, in yeast, are highly expressed, highly connected, and often essential, consistent with strong negative selection against further reduction of the conserved gene set since the LCMA. Our study reveals that microsporidian genome evolution is a highly dynamic process that has balanced constraint, reductive evolution, and genome expansion during adaptation to an extraordinarily successful obligate intracellular lifestyle.

  20. Comparative genome analysis of Bacillus cereus group genomes withBacillus subtilis

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D' Souza, Mark; Larsen, Niels; Pusch,Gordon; Liolios, Konstantinos; Grechkin, Yuri; Lapidus, Alla; Goltsman,Eugene; Chu, Lien; Fonstein, Michael; Ehrlich, S. Dusko; Overbeek, Ross; Kyrpides, Nikos; Ivanova, Natalia

    2005-09-14

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis sub spp israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-layer proteins suggesting differences in their phenotype were identified. The B. cereus group has signal transduction systems including a tyrosine kinase related to two-component system histidine kinases from B. subtilis. A model for regulation of the stress responsive sigma factor sigmaB in the B. cereus group different from the well studied regulation in B. subtilis has been proposed. Despite a high degree of chromosomal synteny among these genomes, significant differences in cell wall and spore coat proteins that contribute to the survival and adaptation in specific hosts has been identified.

  1. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Directory of Open Access Journals (Sweden)

    Jianmin Fu

    Full Text Available Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  2. Beyond the thale: comparative genomics and genetics of Arabidopsis relatives.

    Science.gov (United States)

    Koenig, Daniel; Weigel, Detlef

    2015-05-01

    For decades a small number of model species have rightly occupied a privileged position in laboratory experiments, but it is becoming increasingly clear that our knowledge of biology is greatly improved when informed by a broader diversity of species and evolutionary context. Arabidopsis thaliana has been the primary model organism for plants, benefiting from a high-quality reference genome sequence and resources for reverse genetics. However, recent studies have made a group of species also in the Brassicaceae family and closely related to A. thaliana a focal point for comparative molecular, genomic, phenotypic and evolutionary studies. In this Review, we emphasize how such studies complement continued study of the model plant itself, provide an evolutionary perspective and summarize our current understanding of genetic and phenotypic diversity in plants.

  3. A comparative encyclopedia of DNA elements in the mouse genome.

    Science.gov (United States)

    Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D; Shen, Yin; Pervouchine, Dmitri D; Djebali, Sarah; Thurman, Robert E; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K; Williams, Brian A; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M A; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D; Bansal, Mukul S; Kellis, Manolis; Keller, Cheryl A; Morrissey, Christapher S; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S; Cayting, Philip; Kawli, Trupti; Boyle, Alan P; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S; Cline, Melissa S; Erickson, Drew T; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A; Rosenbloom, Kate R; Lacerda de Sousa, Beatriz; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W James; Ramalho Santos, Miguel; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J; Wilken, Matthew S; Reh, Thomas A; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P; Neph, Shane; Humbert, Richard; Hansen, R Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E; Orkin, Stuart H; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J; Blobel, Gerd A; Cao, Xiaoyi; Zhong, Sheng; Wang, Ting; Good, Peter J; Lowdon, Rebecca F; Adams, Leslie B; Zhou, Xiao-Qiao; Pazin, Michael J; Feingold, Elise A; Wold, Barbara; Taylor, James; Mortazavi, Ali; Weissman, Sherman M; Stamatoyannopoulos, John A; Snyder, Michael P; Guigo, Roderic; Gingeras, Thomas R; Gilbert, David M; Hardison, Ross C; Beer, Michael A; Ren, Bing

    2014-11-20

    The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.

  4. A Comparative Encyclopedia of DNA Elements in the Mouse Genome

    Science.gov (United States)

    Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D.; Shen, Yin; Pervouchine, Dmitri D.; Djebali, Sarah; Thurman, Bob; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K.; Williams, Brian A.; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M. A.; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T.; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D.; Bansal, Mukul S.; Keller, Cheryl A.; Morrissey, Christapher S.; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S.; Cayting, Philip; Kawli, Trupti; Boyle, Alan P.; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S.; Cline, Melissa S.; Erickson, Drew T.; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A.; Rosenbloom, Kate R.; de Sousa, Beatriz Lacerda; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W. James; Santos, Miguel Ramalho; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J.; Wilken, Matthew S.; Reh, Thomas A.; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P.; Neph, Shane; Humbert, Richard; Hansen, R. Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E.; Orkin, Stuart H.; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J.; Blobel, Gerd A.; Good, Peter J.; Lowdon, Rebecca F.; Adams, Leslie B.; Zhou, Xiao-Qiao; Pazin, Michael J.; Feingold, Elise A.; Wold, Barbara; Taylor, James; Kellis, Manolis; Mortazavi, Ali; Weissman, Sherman M.; Stamatoyannopoulos, John; Snyder, Michael P.; Guigo, Roderic; Gingeras, Thomas R.; Gilbert, David M.; Hardison, Ross C.; Beer, Michael A.; Ren, Bing

    2014-01-01

    Summary As the premier model organism in biomedical research, the laboratory mouse shares the majority of protein-coding genes with humans, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications, and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of other sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases. PMID:25409824

  5. CGHScan: finding variable regions using high-density microarray comparative genomic hybridization data

    Directory of Open Access Journals (Sweden)

    Rajashekara Gireesh

    2006-04-01

    Full Text Available Abstract Background Comparative genomic hybridization can rapidly identify chromosomal regions that vary between organisms and tissues. This technique has been applied to detecting differences between normal and cancerous tissues in eukaryotes as well as genomic variability in microbial strains and species. The density of oligonucleotide probes available on current microarray platforms is particularly well-suited for comparisons of organisms with smaller genomes like bacteria and yeast where an entire genome can be assayed on a single microarray with high resolution. Available methods for analyzing these experiments typically confine analyses to data from pre-defined annotated genome features, such as entire genes. Many of these methods are ill suited for datasets with the number of measurements typical of high-density microarrays. Results We present an algorithm for analyzing microarray hybridization data to aid identification of regions that vary between an unsequenced genome and a sequenced reference genome. The program, CGHScan, uses an iterative random walk approach integrating multi-layered significance testing to detect these regions from comparative genomic hybridization data. The algorithm tolerates a high level of noise in measurements of individual probe intensities and is relatively insensitive to the choice of method for normalizing probe intensity values and identifying probes that differ between samples. When applied to comparative genomic hybridization data from a published experiment, CGHScan identified eight of nine known deletions in a Brucella ovis strain as compared to Brucella melitensis. The same result was obtained using two different normalization methods and two different scores to classify data for individual probes as representing conserved or variable genomic regions. The undetected region is a small (58 base pair deletion that is below the resolution of CGHScan given the array design employed in the study

  6. Comparative Genomics of the Ubiquitous, Hydrocarbon-degrading Genus Marinobacter

    Science.gov (United States)

    Singer, E.; Webb, E.; Edwards, K. J.

    2012-12-01

    The genus Marinobacter is amongst the most ubiquitous in the global oceans and strains have been isolated from a wide variety of marine environments, including offshore oil-well heads, coastal thermal springs, Antarctic sea water, saline soils and associations with diatoms and dinoflagellates. Many strains have been recognized to be important hydrocarbon degraders in various marine habitats presenting sometimes extreme pH or salinity conditions. Analysis of the genome of M. aquaeolei revealed enormous adaptation versatility with an assortment of strategies for carbon and energy acquisition, sensation, and defense. In an effort to elucidate the ecological and biogeochemical significance of the Marinobacters, seven Marinobacter strains from diverse environments were included in a comparative genomics study. Genomes were screened for metabolic and adaptation potential to elucidate the strategies responsible for the omnipresence of the Marinobacter genus and their remedial action potential in hydrocarbon-polluted waters. The core genome predominantly encodes for key genes involved in hydrocarbon degradation, biofilm-relevant processes, including utilization of external DNA, halotolerance, as well as defense mechanisms against heavy metals, antibiotics, and toxins. All Marinobacter strains were observed to degrade a wide spectrum of hydrocarbon species, including aliphatic, polycyclic aromatic as well as acyclic isoprenoid compounds. Various genes predicted to facilitate hydrocarbon degradation, e.g. alkane 1-monooxygenase, appear to have originated from lateral gene transfer as they are located on gene clusters of 10-20% lower GC-content compared to genome averages and are flanked by transposases. Top ortholog hits are found in other hydrocarbon degrading organisms, e.g. Alcanivorax borkumensis. Strategies for hydrocarbon uptake encoded by various Marinobacter strains include cell surface hydrophobicity adaptation via capsular polysaccharide biosynthesis and attachment

  7. Array comparative genomic hybridization in retinoma and retinoblastoma tissues.

    Science.gov (United States)

    Sampieri, Katia; Amenduni, Mariangela; Papa, Filomena Tiziana; Katzaki, Eleni; Mencarelli, Maria Antonietta; Marozza, Annabella; Epistolato, Maria Carmela; Toti, Paolo; Lazzi, Stefano; Bruttini, Mirella; De Filippis, Roberta; De Francesco, Sonia; Longo, Ilaria; Meloni, Ilaria; Mari, Francesca; Acquaviva, Antonio; Hadjistilianou, Theodora; Renieri, Alessandra; Ariani, Francesca

    2009-03-01

    In retinoblastoma, two RB1 mutations are necessary for tumor development. Recurrent genomic rearrangements may represent subsequent events required for retinoblastoma progression. Array-comparative genomic hybridization was carried out in 18 eye samples, 10 from bilateral and eight from unilateral retinoblastoma patients. Two unilateral cases also showed areas of retinoma. The most frequent imbalance in retinoblastomas was 6p gain (40%), followed by gains at 1q12-q25.3, 2p24.3-p24.2, 9q22.2, and 9q33.1 and losses at 11q24.3, 13q13.2-q22.3, and 16q12.1-q21. Bilateral cases showed a lower number of imbalances than unilateral cases (P = 0.002). Unilateral cases were divided into low-level ( or = 7) chromosomal instability groups. The first group presented with younger age at diagnosis (mean 511 days) compared with the second group (mean 1606 days). In one retinoma case ophthalmoscopically diagnosed as a benign lesion no rearrangements were detected, whereas the adjacent retinoblastoma displayed seven aberrations. The other retinoma case identified by retrospective histopathological examination shared three rearrangements with the adjacent retinoblastoma. Two other gene-free rearrangements were retinoma specific. One rearrangement, dup5p, was retinoblastoma specific and included the SKP2 gene. Genomic profiling indicated that the first retinoma was a pretumoral lesion, whereas the other represents a subclone of cells bearing 'benign' rearrangements overwhelmed by another subclone presenting aberrations with higher 'oncogenic' potential. In summary, the present study shows that bilateral and unilateral retinoblastoma have different chromosomal instability that correlates with the age of tumor onset in unilateral cases. This is the first report of genomic profiling in retinoma tissue, shedding light on the different nature of lesions named 'retinoma'.

  8. Comparative genomic in situ hybridization analysis on the ...

    African Journals Online (AJOL)

    AJL

    2012-04-10

    Apr 10, 2012 ... different parents/ancestors/genomes in hybrid plants to be distinguished ... sequences in common between the two species. Therefore, cGISH ... genomic organization and genome evolution in plants. (Zoller et al., 2001).

  9. Comparative chloroplast genomics: analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus

    Directory of Open Access Journals (Sweden)

    Boore Jeffrey L

    2007-06-01

    Full Text Available Abstract Background The number of completely sequenced plastid genomes available is growing rapidly. This array of sequences presents new opportunities to perform comparative analyses. In comparative studies, it is often useful to compare across wide phylogenetic spans and, within angiosperms, to include representatives from basally diverging lineages such as the genomes reported here: Nuphar advena (from a basal-most lineage and Ranunculus macranthus (a basal eudicot. We report these two new plastid genome sequences and make comparisons (within angiosperms, seed plants, or all photosynthetic lineages to evaluate features such as the status of ycf15 and ycf68 as protein coding genes, the distribution of simple sequence repeats (SSRs and longer dispersed repeats (SDR, and patterns of nucleotide composition. Results The Nuphar [GenBank:NC_008788] and Ranunculus [GenBank:NC_008796] plastid genomes share characteristics of gene content and organization with many other chloroplast genomes. Like other plastid genomes, these genomes are A+T-rich, except for rRNA and tRNA genes. Detailed comparisons of Nuphar with Nymphaea, another Nymphaeaceae, show that more than two-thirds of these genomes exhibit at least 95% sequence identity and that most SSRs are shared. In broader comparisons, SSRs vary among genomes in terms of abundance and length and most contain repeat motifs based on A and T nucleotides. Conclusion SSR and SDR abundance varies by genome and, for SSRs, is proportional to genome size. Long SDRs are rare in the genomes assessed. SSRs occur less frequently than predicted and, although the majority of the repeat motifs do include A and T nucleotides, the A+T bias in SSRs is less than that predicted from the underlying genomic nucleotide composition. In codon usage third positions show an A+T bias, however variation in codon usage does not correlate with differences in A+T-richness. Thus, although plastome nucleotide composition shows "A

  10. Comparative genomics boosts target prediction for bacterial small RNAs.

    Science.gov (United States)

    Wright, Patrick R; Richter, Andreas S; Papenfort, Kai; Mann, Martin; Vogel, Jörg; Hess, Wolfgang R; Backofen, Rolf; Georg, Jens

    2013-09-10

    Small RNAs (sRNAs) constitute a large and heterogeneous class of bacterial gene expression regulators. Much like eukaryotic microRNAs, these sRNAs typically target multiple mRNAs through short seed pairing, thereby acting as global posttranscriptional regulators. In some bacteria, evidence for hundreds to possibly more than 1,000 different sRNAs has been obtained by transcriptome sequencing. However, the experimental identification of possible targets and, therefore, their confirmation as functional regulators of gene expression has remained laborious. Here, we present a strategy that integrates phylogenetic information to predict sRNA targets at the genomic scale and reconstructs regulatory networks upon functional enrichment and network analysis (CopraRNA, for Comparative Prediction Algorithm for sRNA Targets). Furthermore, CopraRNA precisely predicts the sRNA domains for target recognition and interaction. When applied to several model sRNAs, CopraRNA revealed additional targets and functions for the sRNAs CyaR, FnrS, RybB, RyhB, SgrS, and Spot42. Moreover, the mRNAs gdhA, lrp, marA, nagZ, ptsI, sdhA, and yobF-cspC were suggested as regulatory hubs targeted by up to seven different sRNAs. The verification of many previously undetected targets by CopraRNA, even for extensively investigated sRNAs, demonstrates its advantages and shows that CopraRNA-based analyses can compete with experimental target prediction approaches. A Web interface allows high-confidence target prediction and efficient classification of bacterial sRNAs.

  11. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalyncinus, from a Comparative Genomics Perspective.

    Directory of Open Access Journals (Sweden)

    Gurusamy Raman

    Full Text Available Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC region (82,805 bp, with some variations in the inverted repeat region A (IRA/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19 was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA and ribosomal protein subunit L23 (rpl23 genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus.

  12. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalyncinus, from a Comparative Genomics Perspective.

    Science.gov (United States)

    Raman, Gurusamy; Park, SeonJoo

    2015-01-01

    Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus.

  13. The multiple facets of homology and their use in comparative genomics to study the evolution of genes, genomes, and species.

    Science.gov (United States)

    Descorps-Declère, Stéphane; Lemoine, Frédéric; Sculo, Quentin; Lespinet, Olivier; Labedan, Bernard

    2008-04-01

    The incredible development of comparative genomics during the last decade has required a correct use of the concept of homology that was previously utilized only by evolutionary biologists. Unhappily, this concept has been often misunderstood and thus misused when exploited outside its evolutionary context. This review brings back to the correct definition of homology and explains how this definition has been progressively refined in order to adapt it to the various new kinds of analysis of gene properties and of their products that appear with the progress of comparative genomics. Then, we illustrate the power and the proficiency of such a concept when using the available genomics data in order to study the evolution of individual genes, of entire genomes and of species, respectively. After explaining how we detect homologues by an exhaustive comparison of a hundred of complete proteomes, we describe three main lines of research we have developed in the recent years. The first one exploits synteny and gene context data to better understand the mechanisms of genome evolution in prokaryotes. The second one is based on phylogenomics approaches to reconstruct the tree of life. The last one is devoted to reminding that protein homology is often limited to structural segments (SOH=segment of homology or module). Detecting and numbering modules allows tracing back protein history by identifying the events of gene duplication and gene fusion. We insist that one of the main present difficulties in such studies is a lack of a reliable method to identify genuine orthologues. Finally, we show how these homology studies are helpful to annotate genes and genomes and to study the complexity of the relationships between sequence and function of a gene.

  14. Complete genome sequence of Enterococcus faecium strain TX16 and comparative genomic analysis of Enterococcus faecium genomes

    Directory of Open Access Journals (Sweden)

    Qin Xiang

    2012-07-01

    Full Text Available Abstract Background Enterococci are among the leading causes of hospital-acquired infections in the United States and Europe, with Enterococcus faecalis and Enterococcus faecium being the two most common species isolated from enterococcal infections. In the last decade, the proportion of enterococcal infections caused by E. faecium has steadily increased compared to other Enterococcus species. Although the underlying mechanism for the gradual replacement of E. faecalis by E. faecium in the hospital environment is not yet understood, many studies using genotyping and phylogenetic analysis have shown the emergence of a globally dispersed polyclonal subcluster of E. faecium strains in clinical environments. Systematic study of the molecular epidemiology and pathogenesis of E. faecium has been hindered by the lack of closed, complete E. faecium genomes that can be used as references. Results In this study, we report the complete genome sequence of the E. faecium strain TX16, also known as DO, which belongs to multilocus sequence type (ST 18, and was the first E. faecium strain ever sequenced. Whole genome comparison of the TX16 genome with 21 E. faecium draft genomes confirmed that most clinical, outbreak, and hospital-associated (HA strains (including STs 16, 17, 18, and 78, in addition to strains of non-hospital origin, group in the same clade (referred to as the HA clade and are evolutionally considerably more closely related to each other by phylogenetic and gene content similarity analyses than to isolates in the community-associated (CA clade with approximately a 3–4% average nucleotide sequence difference between the two clades at the core genome level. Our study also revealed that many genomic loci in the TX16 genome are unique to the HA clade. 380 ORFs in TX16 are HA-clade specific and antibiotic resistance genes are enriched in HA-clade strains. Mobile elements such as IS16 and transposons were also found almost exclusively in HA strains

  15. Comparative genomics of flatworms (platyhelminthes) reveals shared genomic features of ecto- and endoparastic neodermata.

    Science.gov (United States)

    Hahn, Christoph; Fromm, Bastian; Bachmann, Lutz

    2014-05-01

    The ectoparasitic Monogenea comprise a major part of the obligate parasitic flatworm diversity. Although genomic adaptations to parasitism have been studied in the endoparasitic tapeworms (Cestoda) and flukes (Trematoda), no representative of the Monogenea has been investigated yet. We present the high-quality draft genome of Gyrodactylus salaris, an economically important monogenean ectoparasite of wild Atlantic salmon (Salmo salar). A total of 15,488 gene models were identified, of which 7,102 were functionally annotated. The controversial phylogenetic relationships within the obligate parasitic Neodermata were resolved in a phylogenomic analysis using 1,719 gene models (alignment length of >500,000 amino acids) for a set of 16 metazoan taxa. The Monogenea were found basal to the Cestoda and Trematoda, which implies ectoparasitism being plesiomorphic within the Neodermata and strongly supports a common origin of complex life cycles. Comparative analysis of seven parasitic flatworm genomes identified shared genomic features for the ecto- and endoparasitic lineages, such as a substantial reduction of the core bilaterian gene complement, including the homeodomain-containing genes, and a loss of the piwi and vasa genes, which are considered essential for animal development. Furthermore, the shared loss of functional fatty acid biosynthesis pathways and the absence of peroxisomes, the latter organelles presumed ubiquitous in eukaryotes except for parasitic protozoans, were inferred. The draft genome of G. salaris opens for future in-depth analyses of pathogenicity and host specificity of poorly characterized G. salaris strains, and will enhance studies addressing the genomics of host-parasite interactions and speciation in the highly diverse monogenean flatworms.

  16. Comparative Genomics of Flatworms (Platyhelminthes) Reveals Shared Genomic Features of Ecto- and Endoparastic Neodermata

    Science.gov (United States)

    Hahn, Christoph; Fromm, Bastian; Bachmann, Lutz

    2014-01-01

    The ectoparasitic Monogenea comprise a major part of the obligate parasitic flatworm diversity. Although genomic adaptations to parasitism have been studied in the endoparasitic tapeworms (Cestoda) and flukes (Trematoda), no representative of the Monogenea has been investigated yet. We present the high-quality draft genome of Gyrodactylus salaris, an economically important monogenean ectoparasite of wild Atlantic salmon (Salmo salar). A total of 15,488 gene models were identified, of which 7,102 were functionally annotated. The controversial phylogenetic relationships within the obligate parasitic Neodermata were resolved in a phylogenomic analysis using 1,719 gene models (alignment length of >500,000 amino acids) for a set of 16 metazoan taxa. The Monogenea were found basal to the Cestoda and Trematoda, which implies ectoparasitism being plesiomorphic within the Neodermata and strongly supports a common origin of complex life cycles. Comparative analysis of seven parasitic flatworm genomes identified shared genomic features for the ecto- and endoparasitic lineages, such as a substantial reduction of the core bilaterian gene complement, including the homeodomain-containing genes, and a loss of the piwi and vasa genes, which are considered essential for animal development. Furthermore, the shared loss of functional fatty acid biosynthesis pathways and the absence of peroxisomes, the latter organelles presumed ubiquitous in eukaryotes except for parasitic protozoans, were inferred. The draft genome of G. salaris opens for future in-depth analyses of pathogenicity and host specificity of poorly characterized G. salaris strains, and will enhance studies addressing the genomics of host–parasite interactions and speciation in the highly diverse monogenean flatworms. PMID:24732282

  17. The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081.

    Directory of Open Access Journals (Sweden)

    Nicholas R Thomson

    2006-12-01

    Full Text Available The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype 0:8; biotype 1B and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing, the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common

  18. Genome Sequence of Desulfurella amilsii Strain TR1 and Comparative Genomics of Desulfurellaceae Family.

    Science.gov (United States)

    Florentino, Anna P; Stams, Alfons J M; Sánchez-Andrea, Irene

    2017-01-01

    The acidotolerant sulfur reducer Desulfurella amilsii was isolated from sediments of Tinto River, an extremely acidic environment. Its ability to grow in a broad range of pH and to tolerate certain heavy metals offers potential for metal recovery processes. Here we report its high-quality draft genome sequence and compare it to the available genome sequences of other members of Desulfurellaceae family: D. acetivorans. D. multipotens, Hippea maritima. H. alviniae, H. medeae, and H. jasoniae. For most species, pairwise comparisons for average nucleotide identity (ANI) and in silico DNA-DNA hybridization (DDH) revealed ANI values from 67.5 to 80% and DDH values from 12.9 to 24.2%. D. acetivorans and D. multipotens, however, surpassed the estimated thresholds of species definition for both DDH (98.6%) and ANI (88.1%). Therefore, they should be merged to a single species. Comparative analysis of Desulfurellaceae genomes revealed different gene content for sulfur respiration between Desulfurella and Hippea species. Sulfur reductase is only encoded in D. amilsii, in which it is suggested to play a role in sulfur respiration, especially at low pH. Polysulfide reductase is only encoded in Hippea species; it is likely that this genus uses polysulfide as electron acceptor. Genes encoding thiosulfate reductase are present in all the genomes, but dissimilatory sulfite reductase is only present in Desulfurella species. Thus, thiosulfate respiration via sulfite is only likely in this genus. Although sulfur disproportionation occurs in Desulfurella species, the molecular mechanism behind this process is not yet understood, hampering a genome prediction. The metabolism of acetate in Desulfurella species can occur via the acetyl-CoA synthetase or via acetate kinase in combination with phosphate acetyltransferase, while in Hippea species, it might occur via the acetate kinase. Large differences in gene sets involved in resistance to acidic conditions were not detected among the

  19. Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium

    Energy Technology Data Exchange (ETDEWEB)

    Ma, Li Jun; van der Does, H. C.; Borkovich, Katherine A.; Coleman, Jeffrey J.; Daboussi, Marie-Jose; Di Pietro, Antonio; Dufresne, Marie; Freitag, Michael; Grabherr, Manfred; Henrissat, Bernard; Houterman, Petra M.; Kang, Seogchan; Shim, Won-Bo; Wolochuk, Charles; Xie, Xiaohui; Xu, Jin Rong; Antoniw, John; Baker, Scott E.; Bluhm, Burton H.; Breakspear, Andrew; Brown, Daren W.; Butchko, Robert A.; Chapman, Sinead; Coulson, Richard; Coutinho, Pedro M.; Danchin, Etienne G.; Diener, Andrew; Gale, Liane R.; Gardiner, Donald; Goff, Steven; Hammond-Kossack, Kim; Hilburn, Karen; Hua-Van, Aurelie; Jonkers, Wilfried; Kazan, Kemal; Kodira, Chinnappa D.; Koehrsen, Michael; Kumar, Lokesh; Lee, Yong Hwan; Li, Liande; Manners, John M.; Miranda-Saavedra, Diego; Mukherjee, Mala; Park, Gyungsoon; Park, Jongsun; Park, Sook Young; Proctor, Robert H.; Regev, Aviv; Ruiz-Roldan, M. C.; Sain, Divya; Sakthikumar, Sharadha; Sykes, Sean; Schwartz, David C.; Turgeon, Barbara G.; Wapinski, Ilan; Yoder, Olen; Young, Sarah; Zeng, Qiandong; Zhou, Shiguo; Galagan, James; Cuomo, Christina A.; Kistler, H. Corby; Rep, Martijn

    2010-03-18

    Fusarium species are among the most important phytopathogenic and toxigenic fungi, having significant impact on crop production and animal health. Distinctively, members of the F. oxysporum species complex exhibit wide host range but discontinuously distributed host specificity, reflecting remarkable genetic adaptability. To understand the molecular underpinnings of diverse phenotypic traits and their evolution in Fusarium, we compared the genomes of three economically important and phylogenetically related, yet phenotypically diverse plant-pathogenic species, F. graminearum, F. verticillioides and F. oxysporum f. sp. lycopersici. Our analysis revealed greatly expanded lineage-specific (LS) genomic regions in F. oxysporum that include four entire chromosomes, accounting for more than one-quarter of the genome. LS regions are rich in transposons and genes with distinct evolutionary profiles but related to pathogenicity. Experimentally, we demonstrate for the first time the transfer of two LS chromosomes between strains of F. oxysporum, resulting in the conversion of a non-pathogenic strain into a pathogen. Transfer of LS chromosomes between otherwise genetically isolated strains explains the polyphyletic origin of host specificity and the emergence of new pathogenic lineages in the F. oxysporum species complex, putting the evolution of fungal pathogenicity into a new perspective.

  20. Comparative Genomics of the Listeria monocytogenes ST204 Subgroup

    Science.gov (United States)

    Fox, Edward M.; Allnutt, Theodore; Bradbury, Mark I.; Fanning, Séamus; Chandry, P. Scott

    2016-01-01

    The ST204 subgroup of Listeria monocytogenes is among the most frequently isolated in Australia from a range of environmental niches. In this study we provide a comparative genomics analysis of food and food environment isolates from geographically diverse sources. Analysis of the ST204 genomes showed a highly conserved core genome with the majority of variation seen in mobile genetic elements such as plasmids, transposons and phage insertions. Most strains (13/15) harbored plasmids, which although varying in size contained highly conserved sequences. Interestingly 4 isolates contained a conserved plasmid of 91,396 bp. The strains examined were isolated over a period of 12 years and from different geographic locations suggesting plasmids are an important component of the genetic repertoire of this subgroup and may provide a range of stress tolerance mechanisms. In addition to this 4 phage insertion sites and 2 transposons were identified among isolates, including a novel transposon. These genetic elements were highly conserved across isolates that harbored them, and also contained a range of genetic markers linked to stress tolerance and virulence. The maintenance of conserved mobile genetic elements in the ST204 population suggests these elements may contribute to the diverse range of niches colonized by ST204 isolates. Environmental stress selection may contribute to maintaining these genetic features, which in turn may be co-selecting for virulence markers relevant to clinical infection with ST204 isolates. PMID:28066377

  1. Comparative Genomics of the Listeria monocytogenes ST204 Subgroup.

    Science.gov (United States)

    Fox, Edward M; Allnutt, Theodore; Bradbury, Mark I; Fanning, Séamus; Chandry, P Scott

    2016-01-01

    The ST204 subgroup of Listeria monocytogenes is among the most frequently isolated in Australia from a range of environmental niches. In this study we provide a comparative genomics analysis of food and food environment isolates from geographically diverse sources. Analysis of the ST204 genomes showed a highly conserved core genome with the majority of variation seen in mobile genetic elements such as plasmids, transposons and phage insertions. Most strains (13/15) harbored plasmids, which although varying in size contained highly conserved sequences. Interestingly 4 isolates contained a conserved plasmid of 91,396 bp. The strains examined were isolated over a period of 12 years and from different geographic locations suggesting plasmids are an important component of the genetic repertoire of this subgroup and may provide a range of stress tolerance mechanisms. In addition to this 4 phage insertion sites and 2 transposons were identified among isolates, including a novel transposon. These genetic elements were highly conserved across isolates that harbored them, and also contained a range of genetic markers linked to stress tolerance and virulence. The maintenance of conserved mobile genetic elements in the ST204 population suggests these elements may contribute to the diverse range of niches colonized by ST204 isolates. Environmental stress selection may contribute to maintaining these genetic features, which in turn may be co-selecting for virulence markers relevant to clinical infection with ST204 isolates.

  2. Comparative genomics of the tardigrades Hypsibius dujardini and Ramazzottius varieornatus.

    Science.gov (United States)

    Yoshida, Yuki; Koutsovoulos, Georgios; Laetsch, Dominik R; Stevens, Lewis; Kumar, Sujai; Horikawa, Daiki D; Ishino, Kyoko; Komine, Shiori; Kunieda, Takekazu; Tomita, Masaru; Blaxter, Mark; Arakawa, Kazuharu

    2017-07-01

    Tardigrada, a phylum of meiofaunal organisms, have been at the center of discussions of the evolution of Metazoa, the biology of survival in extreme environments, and the role of horizontal gene transfer in animal evolution. Tardigrada are placed as sisters to Arthropoda and Onychophora (velvet worms) in the superphylum Panarthropoda by morphological analyses, but many molecular phylogenies fail to recover this relationship. This tension between molecular and morphological understanding may be very revealing of the mode and patterns of evolution of major groups. Limnoterrestrial tardigrades display extreme cryptobiotic abilities, including anhydrobiosis and cryobiosis, as do bdelloid rotifers, nematodes, and other animals of the water film. These extremophile behaviors challenge understanding of normal, aqueous physiology: how does a multicellular organism avoid lethal cellular collapse in the absence of liquid water? Meiofaunal species have been reported to have elevated levels of horizontal gene transfer (HGT) events, but how important this is in evolution, and particularly in the evolution of extremophile physiology, is unclear. To address these questions, we resequenced and reassembled the genome of H. dujardini, a limnoterrestrial tardigrade that can undergo anhydrobiosis only after extensive pre-exposure to drying conditions, and compared it to the genome of R. varieornatus, a related species with tolerance to rapid desiccation. The 2 species had contrasting gene expression responses to anhydrobiosis, with major transcriptional change in H. dujardini but limited regulation in R. varieornatus. We identified few horizontally transferred genes, but some of these were shown to be involved in entry into anhydrobiosis. Whole-genome molecular phylogenies supported a Tardigrada+Nematoda relationship over Tardigrada+Arthropoda, but rare genomic changes tended to support Tardigrada+Arthropoda.

  3. Reconstructing the Evolution of Brachypodium Genomes Using Comparative Chromosome Painting.

    Directory of Open Access Journals (Sweden)

    Alexander Betekhtin

    Full Text Available Brachypodium distachyon is a model for the temperate cereals and grasses and has a biology, genomics infrastructure and cytogenetic platform fit for purpose. It is a member of a genus with fewer than 20 species, which have different genome sizes, basic chromosome numbers and ploidy levels. The phylogeny and interspecific relationships of this group have not to date been resolved by sequence comparisons and karyotypical studies. The aims of this study are not only to reconstruct the evolution of Brachypodium karyotypes to resolve the phylogeny, but also to highlight the mechanisms that shape the evolution of grass genomes. This was achieved through the use of comparative chromosome painting (CCP which hybridises fluorescent, chromosome-specific probes derived from B. distachyon to homoeologous meiotic chromosomes of its close relatives. The study included five diploids (B. distachyon 2n = 10, B. sylvaticum 2n = 18, B. pinnatum 2n = 16; 2n = 18, B. arbuscula 2n = 18 and B. stacei 2n = 20 three allotetraploids (B. pinnatum 2n = 28, B. phoenicoides 2n = 28 and B. hybridum 2n = 30, and two species of unknown ploidy (B. retusum 2n = 38 and B. mexicanum 2n = 40. On the basis of the patterns of hybridisation and incorporating published data, we propose two alternative, but similar, models of karyotype evolution in the genus Brachypodium. According to the first model, the extant genome of B. distachyon derives from B. mexicanum or B. stacei by several rounds of descending dysploidy, and the other diploids evolve from B. distachyon via ascending dysploidy. The allotetraploids arise by interspecific hybridisation and chromosome doubling between B. distachyon and other diploids. The second model differs from the first insofar as it incorporates an intermediate 2n = 18 species between the B. mexicanum or B. stacei progenitors and the dysploidic B. distachyon.

  4. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources.

    Directory of Open Access Journals (Sweden)

    Cassidy L Klima

    Full Text Available Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype with serotypes 1 (S1 and 6 (S6 isolated from pneumonic lesions and serotype 2 (S2 found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence associated genes were identified within 14 distinct prophage, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2-8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferases group 1 gene are present in S1 and S6 strains only indicating these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design

  5. Genome Sequencing and Comparative Genomics Analysis Revealed Pathogenic Potential in Penicillium capsulatum as a Novel Fungal Pathogen Belonging to Eurotiales

    Science.gov (United States)

    Yang, Ying; Chen, Min; Li, Zongwei; Al-Hatmi, Abdullah M. S.; de Hoog, Sybren; Pan, Weihua; Ye, Qiang; Bo, Xiaochen; Li, Zhen; Wang, Shengqi; Wang, Junzhi; Chen, Huipeng; Liao, Wanqing

    2016-01-01

    Penicillium capsulatum is a rare Penicillium species used in paper manufacturing, but recently it has been reported to cause invasive infection. To research the pathogenicity of the clinical Penicillium strain, we sequenced the genomes and transcriptomes of the clinical and environmental strains of P. capsulatum. Comparative analyses of these two P. capsulatum strains and close related strains belonging to Eurotiales were performed. The assembled genome sizes of P. capsulatum are approximately 34.4 Mbp in length and encode 11,080 predicted genes. The different isolates of P. capsulatum are highly similar, with the exception of several unique genes, INDELs or SNPs in the genes coding for glycosyl hydrolases, amino acid transporters and circumsporozoite protein. A phylogenomic analysis was performed based on the whole genome data of 38 strains belonging to Eurotiales. By comparing the whole genome sequences and the virulence-related genes from 20 important related species, including fungal pathogens and non-human pathogens belonging to Eurotiales, we found meaningful pathogenicity characteristics between P. capsulatum and its closely related species. Our research indicated that P. capsulatum may be a neglected opportunistic pathogen. This study is beneficial for mycologists, geneticists and epidemiologists to achieve a deeper understanding of the genetic basis of the role of P. capsulatum as a newly reported fungal pathogen. PMID:27761131

  6. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

    Energy Technology Data Exchange (ETDEWEB)

    Berka, Randy M.; Grigoriev, Igor V.; Otillar, Robert; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M.; Lombard, Vincent; Natvig, Donald O.; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P.; Allijn, Iris E.; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J.; Paulsen, Ian T.; Elbourne, Liam D. H.; Baker, Scott. E.; Magnuson, Jon; LaBoissiere, Sylvie; Clutterbuck, A. John; Martinez, Diego; Wogulis, Mark; Lopez de Leon, Alfredo; Rey, Michael W.; Tsang, Adrian

    2011-05-16

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  7. Comparative Genome Analyses of Vibrio anguillarum Strains Reveal a Link with Pathogenicity Traits

    DEFF Research Database (Denmark)

    Castillo, Daniel; D'Alvise, Paul; Xu, Ruiqi

    2017-01-01

    Vibrio anguillarum is a marine bacterium that can cause vibriosis in many fish and shellfish species, leading to high mortalities and economic losses in aquaculture. Although putative virulence factors have been identified, the mechanism of pathogenesis of V. anguillarum is not fully understood....... anguillarum strains that were medium to nonvirulent had a high degree of genomic homogeneity. Finally, we found that a phylogeny based on the core genomes clustered the strains with moderate to no virulence, while six out of nine high-virulence strains represented phylogenetically separate clusters. Hence, we suggest...... be a powerful tool to discover acquisition of mobile genetic elements related to virulence. Here, we compared 28 V. anguillarum strains that differed in virulence in fish larval models. By pan-genome analyses, we found that six of nine highly virulent strains had a unique core and accessory genome. In contrast...

  8. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

    Energy Technology Data Exchange (ETDEWEB)

    Berka, Randy M.; Grigoriev, Igor V.; Otillar, Robert; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M.; Lombard, Vincent; Natvig, Donald O.; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P.; Allijn, Iris E.; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J.; Paulsen, Ian T.; Elbourne, Liam D. H.; Baker, Scott E.; Magnuson, Jon; LaBoissiere, Sylvie; Clutterbuck, A. John; Martinez, Diego; Wogulis, Mark; de Leon, Alfredo Lopez; Rey, Michael W.; Tsang, Adrian

    2011-10-02

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  9. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris.

    Science.gov (United States)

    Berka, Randy M; Grigoriev, Igor V; Otillar, Robert; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M; Lombard, Vincent; Natvig, Donald O; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P; Allijn, Iris E; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J; Paulsen, Ian T; Elbourne, Liam D H; Baker, Scott E; Magnuson, Jon; Laboissiere, Sylvie; Clutterbuck, A John; Martinez, Diego; Wogulis, Mark; de Leon, Alfredo Lopez; Rey, Michael W; Tsang, Adrian

    2011-10-02

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  10. Genomic Sequencing of Orientia tsutsugamushi Strain Karp, an Assembly Comparable to the Genome Size of the Strain Ikeda.

    Science.gov (United States)

    Liao, Hsiao-Mei; Chao, Chien-Chung; Lei, Haiyan; Li, Bingjie; Tsai, Shien; Hung, Guo-Chiuan; Ching, Wei-Mei; Lo, Shyh-Ching

    2016-08-18

    Orientia tsutsugamushi, an intracellular bacterium, belongs to the family Rickettsiaceae This study presents the draft genome sequence of strain Karp, with 2.0 Mb as the size of the completed genome. This nearly finished draft genome sequence was annotated with the RAST server and the contents compared to those of the other strains.

  11. Genomic Sequencing of Orientia tsutsugamushi Strain Karp, an Assembly Comparable to the Genome Size of the Strain Ikeda

    Science.gov (United States)

    Liao, Hsiao-Mei; Chao, Chien-Chung; Lei, Haiyan; Li, Bingjie; Tsai, Shien; Hung, Guo-Chiuan

    2016-01-01

    Orientia tsutsugamushi, an intracellular bacterium, belongs to the family Rickettsiaceae. This study presents the draft genome sequence of strain Karp, with 2.0 Mb as the size of the completed genome. This nearly finished draft genome sequence was annotated with the RAST server and the contents compared to those of the other strains. PMID:27540052

  12. Characterization of copy number variation in genomic regions containing STR loci using array comparative genomic hybridization.

    Science.gov (United States)

    Repnikova, Elena A; Rosenfeld, Jill A; Bailes, Andrea; Weber, Cecilia; Erdman, Linda; McKinney, Aimee; Ramsey, Sarah; Hashimoto, Sayaka; Lamb Thrush, Devon; Astbury, Caroline; Reshmi, Shalini C; Shaffer, Lisa G; Gastier-Foster, Julie M; Pyatt, Robert E

    2013-09-01

    Short tandem repeat (STR) loci are commonly used in forensic casework, familial analysis for human identification, and for monitoring hematopoietic cell engraftment after bone marrow transplant. Unexpected genetic variation leading to sequence and length differences in STR loci can complicate STR typing, and presents challenges in casework interpretation. Copy number variation (CNV) is a relatively recently identified form of genetic variation consisting of genomic regions present at variable copy numbers within an individual compared to a reference genome. Large scale population studies have demonstrated that likely all individuals carry multiple regions with CNV of 1kb in size or greater in their genome. To date, no study correlating genomic regions containing STR loci with CNV has been conducted. In this study, we analyzed results from 32,850 samples sent for clinical array comparative genomic hybridization (CGH) analysis for the presence of CNV at regions containing the 13 CODIS (Combined DNA Index System) STR, and the Amelogenin X (AMELX) and Amelogenin Y (AMELY) loci. Thirty-two individuals with CNV involving STR loci on chromosomes 2, 4, 7, 11, 12, 13, 16, and 21, and twelve with CNV involving the AMELX/AMELY loci were identified. These results were correlated with data from publicly available databases housing information on CNV identified in normal populations and additional clinical cases. These collective results demonstrate the presence of CNV in regions containing 9 of the 13 CODIS STR and AMELX/Y loci. Further characterization of STR profiles within regions of CNV, additional cataloging of these variants in multiple populations, and contributing such examples to the public domain will provide valuable information for reliable use of these loci.

  13. Comparative analysis of whole genome structure of Streptococcus suis using whole genome PCR scanning

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    An outbreak associated with Streptococcus suis infection in humans emerged in Sichuan province, China in 2005. The outbreak is atypical for the apparent large number of human cases, high fatality rate and geographical spread. To determine whether the bacterium has changed, we compared both human and animal isolates from the Sichuan outbreak with those collected previously within China and in other countries using whole genome PCR scanning (WGPScaning) comparative sequencing of several known virulence factor genes and multilocus sequence typing (MLST) analysis. WGPScanning analysis showed that all primer pairs yielded PCR products of the expected sizes in all four strains tested. The nucleotide sequences of all the detected virulence factor genes are identical in the four strains and MLST results showed that the four isolates studied and reference strain all belonged to the ST1 complex. No new genetic changes were found in the genome structure of the isolates from this Sichuan outbreak.

  14. Comparative analysis of whole genome structure of Streptococcus suis using whole genome PCR scanning

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    An outbreak associated with Streptococcus suis infection in humans emerged in Sichuan province, China in 2005. The outbreak is atypical for the apparent large number of human cases, high fatality rate and geographical spread. To determine whether the bacterium has changed, we compared both human and animal isolates from the Sichuan outbreak with those collected previously within China and in other countries using whole genome PCR scanning (WGPScaning) comparative sequencing of several known virulence factor genes and multilocus sequence typing (MLST) analysis. WGPScanning analysis showed that all primer pairs yielded PCR products of the expected sizes in all four strains tested. The nucleotide sequences of all the detected virulence factor genes are identical in the four strains and MLST results showed that the four isolates studied and reference strain all belonged to the ST1 com-plex. No new genetic changes were found in the genome structure of the isolates from this Sichuan outbreak.

  15. Comparative analysis of genomic signal processing for microarray data clustering.

    Science.gov (United States)

    Istepanian, Robert S H; Sungoor, Ala; Nebel, Jean-Christophe

    2011-12-01

    Genomic signal processing is a new area of research that combines advanced digital signal processing methodologies for enhanced genetic data analysis. It has many promising applications in bioinformatics and next generation of healthcare systems, in particular, in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods: linear predictive coding, wavelet decomposition, and fractal dimension are studied to provide a comparative evaluation of the clustering performance of these methods on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared to other digital signal processing and well known statistical methods.

  16. Comparative genomics and transcriptomics of lineages I, II, and III strains of Listeria monocytogenes

    Science.gov (United States)

    2012-01-01

    Background Listeria monocytogenes is a food-borne pathogen that causes infections with a high-mortality rate and has served as an invaluable model for intracellular parasitism. Here, we report complete genome sequences for two L. monocytogenes strains belonging to serotype 4a (L99) and 4b (CLIP80459), and transcriptomes of representative strains from lineages I, II, and III, thereby permitting in-depth comparison of genome- and transcriptome -based data from three lineages of L. monocytogenes. Lineage III, represented by the 4a L99 genome is known to contain strains less virulent for humans. Results The genome analysis of the weakly pathogenic L99 serotype 4a provides extensive evidence of virulence gene decay, including loss of several important surface proteins. The 4b CLIP80459 genome, unlike the previously sequenced 4b F2365 genome harbours an intact inlB invasion gene. These lineage I strains are characterized by the lack of prophage genes, as they share only a single prophage locus with other L. monocytogenes genomes 1/2a EGD-e and 4a L99. Comparative transcriptome analysis during intracellular growth uncovered adaptive expression level differences in lineages I, II and III of Listeria, notable amongst which was a strong intracellular induction of flagellar genes in strain 4a L99 compared to the other lineages. Furthermore, extensive differences between strains are manifest at levels of metabolic flux control and phosphorylated sugar uptake. Intriguingly, prophage gene expression was found to be a hallmark of intracellular gene expression. Deletion mutants in the single shared prophage locus of lineage II strain EGD-e 1/2a, the lma operon, revealed severe attenuation of virulence in a murine infection model. Conclusion Comparative genomics and transcriptome analysis of L. monocytogenes strains from three lineages implicate prophage genes in intracellular adaptation and indicate that gene loss and decay may have led to the emergence of attenuated lineages

  17. Comparative genomics and transcriptomics of lineages I, II, and III strains of Listeria monocytogenes

    Directory of Open Access Journals (Sweden)

    Hain Torsten

    2012-04-01

    Full Text Available Abstract Background Listeria monocytogenes is a food-borne pathogen that causes infections with a high-mortality rate and has served as an invaluable model for intracellular parasitism. Here, we report complete genome sequences for two L. monocytogenes strains belonging to serotype 4a (L99 and 4b (CLIP80459, and transcriptomes of representative strains from lineages I, II, and III, thereby permitting in-depth comparison of genome- and transcriptome -based data from three lineages of L. monocytogenes. Lineage III, represented by the 4a L99 genome is known to contain strains less virulent for humans. Results The genome analysis of the weakly pathogenic L99 serotype 4a provides extensive evidence of virulence gene decay, including loss of several important surface proteins. The 4b CLIP80459 genome, unlike the previously sequenced 4b F2365 genome harbours an intact inlB invasion gene. These lineage I strains are characterized by the lack of prophage genes, as they share only a single prophage locus with other L. monocytogenes genomes 1/2a EGD-e and 4a L99. Comparative transcriptome analysis during intracellular growth uncovered adaptive expression level differences in lineages I, II and III of Listeria, notable amongst which was a strong intracellular induction of flagellar genes in strain 4a L99 compared to the other lineages. Furthermore, extensive differences between strains are manifest at levels of metabolic flux control and phosphorylated sugar uptake. Intriguingly, prophage gene expression was found to be a hallmark of intracellular gene expression. Deletion mutants in the single shared prophage locus of lineage II strain EGD-e 1/2a, the lma operon, revealed severe attenuation of virulence in a murine infection model. Conclusion Comparative genomics and transcriptome analysis of L. monocytogenes strains from three lineages implicate prophage genes in intracellular adaptation and indicate that gene loss and decay may have led to the emergence

  18. Comparative Genomics and Transcriptomic Analysis of Mycobacterium Kansasii

    KAUST Repository

    Alzahid, Yara

    2014-04-01

    The group of Mycobacteria is one of the most intensively studied bacterial taxa, as they cause the two historical and worldwide known diseases: leprosy and tuberculosis. Mycobacteria not identified as tuberculosis or leprosy complex, have been referred to by ‘environmental mycobacteria’ or ‘Nontuberculous mycobacteria (NTM). Mycobacterium kansasii (M. kansasii) is one of the most frequent NTM pathogens, as it causes pulmonary disease in immuno-competent patients and pulmonary, and disseminated disease in patients with various immuno-deficiencies. There have been five documented subtypes of this bacterium, by different molecular typing methods, showing that type I causes tuberculosis-like disease in healthy individuals, and type II in immune-compromised individuals. The remaining types are said to be environmental, thereby, not causing any diseases. The aim of this project was to conduct a comparative genomic study of M. kansasii types I-V and investigating the gene expression level of those types. From various comparative genomics analysis, provided genomics evidence on why M. kansasii type I is considered pathogenic, by focusing on three key elements that are involved in virulence of Mycobacteria: ESX secretion system, Phospholipase c (plcb) and Mammalian cell entry (Mce) operons. The results showed the lack of the espA operon in types II-V, which renders the ESX- 1 operon dysfunctional, as espA is one of the key factors that control this secretion system. However, gene expression analysis showed this operon to be deleted in types II, III and IV. Furthermore, plcB was found to be truncated in types III and IV. Analysis of Mce operons (1-4) show that mce-1 operon is duplicated, mce-2 is absent and mce-3 and mce-4 is present in one copy in M. kansasii types I-V. Gene expression profiles of type I-IV, showed that the secreted proteins of ESX-1 were slightly upregulated in types II-IV when compared to type I and the secreted forms of ESX-5 were highly down

  19. Comparative genomics of Clavibacter michiganensis subspecies, pathogens of important agricultural crops.

    Science.gov (United States)

    Tambong, James T

    2017-01-01

    Subspecies of Clavibacter michiganensis are important phytobacterial pathogens causing devastating diseases in several agricultural crops. The genome organizations of these pathogens are poorly understood. Here, the complete genomes of 5 subspecies (C. michiganensis subsp. michiganensis, Cmi; C. michiganensis subsp. sepedonicus, Cms; C. michiganensis subsp. nebraskensis, Cmn; C. michiganensis subsp. insidiosus, Cmi and C. michiganensis subsp. capsici, Cmc) were analyzed. This study assessed the taxonomic position of the subspecies based on 16S rRNA and genome-based DNA homology and concludes that there is ample evidence to elevate some of the subspecies to species-level. Comparative genomics analysis indicated distinct genomic features evident on the DNA structural atlases and annotation features. Based on orthologous gene analysis, about 2300 CDSs are shared across all the subspecies; and Cms showed the highest number of subspecies-specific CDS, most of which are mobile elements suggesting that Cms could be more prone to translocation of foreign genes. Cms and Cmi had the highest number of pseudogenes, an indication of potential degenerating genomes. The stress response factors that may be involved in cold/heat shock, detoxification, oxidative stress, osmoregulation, and carbon utilization are outlined. For example, the wco-cluster encoding for extracellular polysaccharide II is highly conserved while the sucrose-6-phosphate hydrolase that catalyzes the hydrolysis of sucrose-6-phosphate yielding glucose-6-phosphate and fructose is highly divergent. A unique second form of the enzyme is only present in Cmn NCPPB 2581. Also, twenty-eight plasmid-borne CDSs in the other subspecies were found to have homologues in the chromosomal genome of Cmn which is known not to carry plasmids. These CDSs include pathogenesis-related factors such as Endocellulases E1 and Beta-glucosidase. The results presented here provide an insight of the functional organization of the genomes of

  20. Array-based comparative genomic hybridization facilitates identification of breakpoints of a novel der(1)t(1;18)(p36.3;q23)dn in a child presenting with mental retardation.

    Science.gov (United States)

    Lennon, P A; Cooper, M L; Curtis, M A; Lim, C; Ou, Z; Patel, A; Cheung, S W; Bacino, C A

    2006-06-01

    Monosomy of distal 1p36 represents the most common terminal deletion in humans and results in one of the most frequently diagnosed mental retardation syndromes. This deletion is considered a contiguous gene deletion syndrome, and has been shown to vary in deletion sizes that contribute to the spectrum of phenotypic anomalies seen in patients with monosomy 1p36. We report on an 8-year-old female with characteristics of the monosomy 1p36 syndrome who demonstrated a novel der(1)t(1;18)(p36.3;q23). Initial G-banded karyotype analysis revealed a deleted chromosome 1, with a breakpoint within 1p36.3. Subsequent FISH and array-based comparative genomic hybridization not only confirmed and partially characterized the deletion of chromosome 1p36.3, but also uncovered distal trisomy for 18q23. In this patient, the duplicated 18q23 is translocated onto the deleted 1p36.3 region, suggesting telomere capture. Molecular characterization of this novel der(1)t(1;18)(p36.3;q23), guided by our clinical array-comparative genomic hybridization, demonstrated a 3.2 Mb terminal deletion of chromosome 1p36.3 and a 200 kb duplication of 18q23 onto the deleted 1p36.3, presumably stabilizing the deleted chromosome 1. DNA sequence analysis around the breakpoints demonstrated no homology, and therefore this telomere capture of distal 18q is apparently the result of a non-homologous recombination. Partial trisomy for 18q23 has not been previously reported. The importance of mapping the breakpoints of all balanced and unbalanced translocations found in the clinical laboratory, when phenotypic abnormalities are found, is discussed.

  1. AgBase: a functional genomics resource for agriculture

    Directory of Open Access Journals (Sweden)

    Hill David P

    2006-09-01

    Full Text Available Abstract Background Many agricultural species and their pathogens have sequenced genomes and more are in progress. Agricultural species provide food, fiber, xenotransplant tissues, biopharmaceuticals and biomedical models. Moreover, many agricultural microorganisms are human zoonoses. However, systems biology from functional genomics data is hindered in agricultural species because agricultural genome sequences have relatively poor structural and functional annotation and agricultural research communities are smaller with limited funding compared to many model organism communities. Description To facilitate systems biology in these traditionally agricultural species we have established "AgBase", a curated, web-accessible, public resource http://www.agbase.msstate.edu for structural and functional annotation of agricultural genomes. The AgBase database includes a suite of computational tools to use GO annotations. We use standardized nomenclature following the Human Genome Organization Gene Nomenclature guidelines and are currently functionally annotating chicken, cow and sheep gene products using the Gene Ontology (GO. The computational tools we have developed accept and batch process data derived from different public databases (with different accession codes, return all existing GO annotations, provide a list of products without GO annotation, identify potential orthologs, model functional genomics data using GO and assist proteomics analysis of ESTs and EST assemblies. Our journal database helps prevent redundant manual GO curation. We encourage and publicly acknowledge GO annotations from researchers and provide a service for researchers interested in GO and analysis of functional genomics data. Conclusion The AgBase database is the first database dedicated to functional genomics and systems biology analysis for agriculturally important species and their pathogens. We use experimental data to improve structural annotation of genomes and to

  2. The chickpea genomic web resource: visualization and analysis of the desi-type Cicer arietinum nuclear genome for comparative exploration of legumes.

    Science.gov (United States)

    Misra, Gopal; Priya, Piyush; Bandhiwal, Nitesh; Bareja, Neha; Jain, Mukesh; Bhatia, Sabhyata; Chattopadhyay, Debasis; Tyagi, Akhilesh K; Yadav, Gitanjali

    2014-12-18

    Availability of the draft nuclear genome sequences of small-seeded desi-type legume crop Cicer arietinum has provided an opportunity for investigating unique chickpea genomic features and evaluation of their biological significance. The increasing number of legume genome sequences also presents a challenge for developing reliable and information-driven bioinformatics applications suitable for comparative exploration of this important class of crop plants. The Chickpea Genomic Web Resource (CGWR) is an implementation of a suite of web-based applications dedicated to chickpea genome visualization and comparative analysis, based on next generation sequencing and assembly of Cicer arietinum desi-type genotype ICC4958. CGWR has been designed and configured for mapping, scanning and browsing the significant chickpea genomic features in view of the important existing and potential roles played by the various legume genome projects in mutant mapping and cloning. It also enables comparative informatics of ICC4958 DNA sequence analysis with other wild and cultivated genotypes of chickpea, various other leguminous species as well as several non-leguminous model plants, to enable investigations into evolutionary processes that shape legume genomes. CGWR is an online database offering a comprehensive visual and functional genomic analysis of the chickpea genome, along with customized maps and gene-clustering options. It is also the only plant based web resource supporting display and analysis of nucleosome positioning patterns in the genome. The usefulness of CGWR has been demonstrated with discoveries of biological significance made using this server. The CGWR is compatible with all available operating systems and browsers, and is available freely under the open source license at http://www.nipgr.res.in/CGWR/home.php.

  3. MicrobesOnline: an integrated portal for comparative and functional genomics

    Energy Technology Data Exchange (ETDEWEB)

    Dehal, Paramvir; Joachimiak, Marcin; Price, Morgan; Bates, John; Baumohl, Jason; Chivian, Dylan; Friedland, Greg; Huang, Kathleen; Keller, Keith; Novichkov, Pavel; Dubchak, Inna; Alm, Eric; Arkin, Adam

    2011-07-14

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  4. MicrobesOnline: an integrated portal for comparative and functional genomics

    Energy Technology Data Exchange (ETDEWEB)

    Dehal, Paramvir S.; Joachimiak, Marcin P.; Price, Morgan N.; Bates, John T.; Baumohl, Jason K.; Chivian, Dylan; Friedland, Greg D.; Huang, Katherine H.; Keller, Keith; Novichkov, Pavel S.; Dubchak, Inna L.; Alm, Eric J.; Arkin, Adam P.

    2009-09-17

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  5. Comparative genomic analysis of Vibrio parahaemolyticus: serotype conversion and virulence

    Directory of Open Access Journals (Sweden)

    Gil Ana I

    2011-06-01

    Full Text Available Abstract Background Vibrio parahaemolyticus is a common cause of foodborne disease. Beginning in 1996, a more virulent strain having serotype O3:K6 caused major outbreaks in India and other parts of the world, resulting in the emergence of a pandemic. Other serovariants of this strain emerged during its dissemination and together with the original O3:K6 were termed strains of the pandemic clone. Two genomes, one of this virulent strain and one pre-pandemic strain have been sequenced. We sequenced four additional genomes of V. parahaemolyticus in this study that were isolated from different geographical regions and time points. Comparative genomic analyses of six strains of V. parahaemolyticus isolated from Asia and Peru were performed in order to advance knowledge concerning the evolution of V. parahaemolyticus; specifically, the genetic changes contributing to serotype conversion and virulence. Two pre-pandemic strains and three pandemic strains, isolated from different geographical regions, were serotype O3:K6 and either toxin profiles (tdh+, trh- or (tdh-, trh+. The sixth pandemic strain sequenced in this study was serotype O4:K68. Results Genomic analyses revealed that the trh+ and tdh+ strains had different types of pathogenicity islands and mobile elements as well as major structural differences between the tdh pathogenicity islands of the pre-pandemic and pandemic strains. In addition, the results of single nucleotide polymorphism (SNP analysis showed that 94% of the SNPs between O3:K6 and O4:K68 pandemic isolates were within a 141 kb region surrounding the O- and K-antigen-encoding gene clusters. The "core" genes of V. parahaemolyticus were also compared to those of V. cholerae and V. vulnificus, in order to delineate differences between these three pathogenic species. Approximately one-half (49-59% of each species' core genes were conserved in all three species, and 14-24% of the core genes were species-specific and in different

  6. Comparative genomic reconstruction of transcriptional networks controlling central metabolism in the Shewanella genus

    Directory of Open Access Journals (Sweden)

    Kovaleva Galina

    2011-06-01

    Full Text Available Abstract Background Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in bacteria is one of the critical tasks of modern genomics. The Shewanella genus is comprised of metabolically versatile gamma-proteobacteria, whose lifestyles and natural environments are substantially different from Escherichia coli and other model bacterial species. The comparative genomics approaches and computational identification of regulatory sites are useful for the in silico reconstruction of transcriptional regulatory networks in bacteria. Results To explore conservation and variations in the Shewanella transcriptional networks we analyzed the repertoire of transcription factors and performed genomics-based reconstruction and comparative analysis of regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors and their DNA binding sites, 8 riboswitches and 6 translational attenuators. Forty five regulons were newly inferred from the genome context analysis, whereas others were propagated from previously characterized regulons in the Enterobacteria and Pseudomonas spp.. Multiple variations in regulatory strategies between the Shewanella spp. and E. coli include regulon contraction and expansion (as in the case of PdhR, HexR, FadR, numerous cases of recruiting non-orthologous regulators to control equivalent pathways (e.g. PsrA for fatty acid degradation and, conversely, orthologous regulators to control distinct pathways (e.g. TyrR, ArgR, Crp. Conclusions We tentatively defined the first reference collection of ~100 transcriptional regulons in 16 Shewanella genomes. The resulting regulatory network contains ~600 regulated genes per genome that are mostly involved in metabolism of carbohydrates, amino acids, fatty acids, vitamins, metals, and stress responses. Several reconstructed regulons including NagR for N-acetylglucosamine catabolism were experimentally validated in S

  7. Comparative chloroplast genomes of eleven Schima (Theaceae) species: Insights into DNA barcoding and phylogeny.

    Science.gov (United States)

    Yu, Xiang-Qin; Drew, Bryan T; Yang, Jun-Bo; Gao, Lian-Ming; Li, De-Zhu

    2017-01-01

    Schima is an ecologically and economically important woody genus in tea family (Theaceae). Unresolved species delimitations and phylogenetic relationships within Schima limit our understanding of the genus and hinder utilization of the genus for economic purposes. In the present study, we conducted comparative analysis among the complete chloroplast (cp) genomes of 11 Schima species. Our results indicate that Schima cp genomes possess a typical quadripartite structure, with conserved genomic structure and gene order. The size of the Schima cp genome is about 157 kilo base pairs (kb). They consistently encode 114 unique genes, including 80 protein-coding genes, 30 tRNAs, and 4 rRNAs, with 17 duplicated in the inverted repeat (IR). These cp genomes are highly conserved and do not show obvious expansion or contraction of the IR region. The percent variability of the 68 coding and 93 noncoding (>150 bp) fragments is consistently less than 3%. The seven most widely touted DNA barcode regions as well as one promising barcode candidate showed low sequence divergence. Eight mutational hotspots were identified from the 11 cp genomes. These hotspots may potentially be useful as specific DNA barcodes for species identification of Schima. The 58 cpSSR loci reported here are complementary to the microsatellite markers identified from the nuclear genome, and will be leveraged for further population-level studies. Phylogenetic relationships among the 11 Schima species were resolved with strong support based on the cp genome data set, which corresponds well with the species distribution pattern. The data presented here will serve as a foundation to facilitate species identification, DNA barcoding and phylogenetic reconstructions for future exploration of Schima.

  8. Chromosomal imbalances in nasopharyngeal carcinoma: a meta-analysis of comparative genomic hybridization results

    Directory of Open Access Journals (Sweden)

    Jin Ping

    2006-01-01

    Full Text Available Abstract Nasopharyngeal carcinoma (NPC is a highly prevalent disease in Southeast Asia and its prevalence is clearly affected by genetic background. Various theories have been suggested for its high incidence in this geographical region but to these days no conclusive explanation has been identified. Chromosomal imbalances identifiable through comparative genomic hybridization may shed some light on common genetic alterations that may be of relevance to the onset and progression of NPC. Review of the literature, however, reveals contradictory results among reported findings possibly related to factors associated with patient selection, stage of disease, differences in methodological details etc. To increase the power of the analysis and attempt to identify commonalities among the reported findings, we performed a meta-analysis of results described in NPC tissues based on chromosomal comparative genomic hybridization (CGH. This meta-analysis revealed consistent patters in chromosomal abnormalities that appeared to cluster in specific "hot spots" along the genome following a stage-dependent progression.

  9. Intraspecies comparative genomics of three strains of Orientia tsutsugamushi with different antibiotic sensitivity.

    Science.gov (United States)

    Liao, Hsiao-Mei; Chao, Chien-Chung; Lei, Haiyan; Li, Bingjie; Tsai, Shien; Hung, Guo-Chiuan; Ching, Wei-Mei; Lo, Shyh-Ching

    2017-06-01

    We recently reported the genome of Orientia tsutsugamushi (OT) strain Karp (GenBank Accession #: NZ_LYMA00000000.2, https://www.ncbi.nlm.nih.gov/nuccore/NZ_LYMA00000000.2) with > 2 Mb in size through clone-based sequencing and high throughput genomic shotgun sequencing (HTS). The genomes of OT strains AFSC4 and AFSC7 were similarly sequenced by HTS Since strains AFSC4 (GenBank Accession #: NZ_LYMT00000000.1, https://www.ncbi.nlm.nih.gov/nuccore/1035784408) and AFSC7 (GenBank Accession #: NZ_LYMB00000000.1, https://www.ncbi.nlm.nih.gov/nuccore/1035854767) were more resistant to antibiotics than strain Karp, we conducted comparative analysis of the three draft genomes annotated by RAST server aimed to identify possible genetic bases of difference in microbial antibiotic sensitivity. Intraspecies comparative genomics analysis of the three OT strains revealed that two ORFs encoding hypothetical proteins in both strains AFSC4 and AFSC7 are absent in strain Karp.

  10. Comparative genomics of oral isolates of Streptococcus mutans by in silico genome subtraction does not reveal accessory DNA associated with severe early childhood caries.

    Science.gov (United States)

    Argimón, Silvia; Konganti, Kranti; Chen, Hao; Alekseyenko, Alexander V; Brown, Stuart; Caufield, Page W

    2014-01-01

    Comparative genomics is a popular method for the identification of microbial virulence determinants, especially since the sequencing of a large number of whole bacterial genomes from pathogenic and non-pathogenic strains has become relatively inexpensive. The bioinformatics pipelines for comparative genomics usually include gene prediction and annotation and can require significant computer power. To circumvent this, we developed a rapid method for genome-scale in silico subtractive hybridization, based on blastn and independent of feature identification and annotation. Whole genome comparisons by in silico genome subtraction were performed to identify genetic loci specific to Streptococcus mutans strains associated with severe early childhood caries (S-ECC), compared to strains isolated from caries-free (CF) children. The genome similarity of the 20 S. mutans strains included in this study, calculated by Simrank k-mer sharing, ranged from 79.5% to 90.9%, confirming this is a genetically heterogeneous group of strains. We identified strain-specific genetic elements in 19 strains, with sizes ranging from 200 to 39 kb. These elements contained protein-coding regions with functions mostly associated with mobile DNA. We did not, however, identify any genetic loci consistently associated with dental caries, i.e., shared by all the S-ECC strains and absent in the CF strains. Conversely, we did not identify any genetic loci specific with the healthy group. Comparison of previously published genomes from pathogenic and carriage strains of Neisseria meningitidis with our in silico genome subtraction yielded the same set of genes specific to the pathogenic strains, thus validating our method. Our results suggest that S. mutans strains derived from caries active or caries free dentitions cannot be differentiated based on the presence or absence of specific genetic elements. Our in silico genome subtraction method is available as the Microbial Genome Comparison (MGC) tool

  11. RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

    Energy Technology Data Exchange (ETDEWEB)

    Novichkov, Pavel S.; Rodionov, Dmitry A.; Stavrovskaya, Elena D.; Novichkova, Elena S.; Kazakov, Alexey E.; Gelfand, Mikhail S.; Arkin, Adam P.; Mironov, Andrey A.; Dubchak, Inna

    2010-05-26

    RegPredict web server is designed to provide comparative genomics tools for reconstruction and analysis of microbial regulons using comparative genomics approach. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences, ortholog and operon predictions from the MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.

  12. Comparative genomic and phylogenomic analyses of the Bifidobacteriaceae family

    National Research Council Canada - National Science Library

    Gabriele Andrea Lugli; Christian Milani; Francesca Turroni; Sabrina Duranti; Leonardo Mancabelli; Marta Mangifesta; Chiara Ferrario; Monica Modesto; Paola Mattarelli; Killer Jiři; Douwe van Sinderen; Marco Ventura

    2017-01-01

    ... they also occur as pathogenic bacteria of the urogenital tract. The pan-genome of the genus Bifidobacterium has been explored in detail in recent years, though genomics of the Bifidobacteriaceae family has not yet received much attention...

  13. Chromosomal imbalances revealed in primary rhabdomyosarcomas by comparative genomic hybridization

    Institute of Scientific and Technical Information of China (English)

    LI Qiao-xin; LIU Chun-xia; CHUN Cai-pu; QI Yan; CHANG Bin; LI Xin-xia; CHEN Yun-zhao; NONG Wei-xia; LI Hong-an; LI Feng

    2009-01-01

    Background Previous cytogenetic studies revealed aberrations varied among the throe subtypes of rhabdomyosarcoma. We profiled chromosomal imbalances in the different subtypes and investigated the relationships between clinical parameters and genomic aberrations.Methods Comparative genomic hybridization was used to investigate genomic imbalances in 25 cases of primary rhabdomyosarcomas and two rhabdomyosarcoma cell lines. Specimens were reviewed to determine histological type, pathological grading and clinical staging.Results Changes involving one or more regions of the genome were seen in all rhabdomyosarcomal patients. For rhabdomyosarcoma, DNA sequence gains were most frequently (>30%) seen in chromosomes 2p, 12q, 6p, 9q, 10q, 1p,2q, 6q, 8q, 15q and 18q; losses from 3p, 11p and 6p. In aggressive alveolar rhabdomyosarcoma, frequent gains were seen on chromosomes 12q, 2p, 6p, 2q, 4q, 10q and 15q; losses from 3p, 6p, 1q and 5q. For embryonic rhabdomyosarcoma, frequent gains were on 7p, 9q, 2p, 18q, 1p and 8q; losses only from 11p. Frequently gained chromosome arms of translocation associated with rhabdomyosarcoma were 12q, 2, 6, 10q, 4q and 15q; losses from 3p,6p and 5q. The frequently gained chromosome arms of nontranslocation associated with rhabdomyosarcoma were 2p,9q and 18q, while 11p and 14q were the frequently lost chromosome arms. Gains on chromosome 12q were significantly correlated with translocation type. Gains on chromosome 9q were significantly correlated with clinical staging. Conclusions Gains on chromosomes 2p, 12q, 6p, 9q, 10q, 1p, 2q, 6q, 8q, 15q and 18q and losses on chromosomes 3p, 11p and 6p may be related to rhabdomyosarcomal carcinogenesis. Furthermore, gains on chromosome 12q may be correlated with translocation and gains on chromosome 9q with the early stages of rhabdomyosarcoma.

  14. Comparative genomics of the dormancy regulons in mycobacteria.

    Science.gov (United States)

    Gerasimova, Anna; Kazakov, Alexey E; Arkin, Adam P; Dubchak, Inna; Gelfand, Mikhail S

    2011-07-01

    In response to stresses, Mycobacterium cells become dormant. This process is regulated by the DosR transcription factor. In Mycobacterium tuberculosis, the dormancy regulon is well characterized and contains the dosR gene itself and dosS and dosT genes encoding DosR kinases, nitroreductases (acg; Rv3131), diacylglycerol acyltransferase (DGAT) (Rv3130c), and many universal stress proteins (USPs). In this study, we apply comparative genomic analysis to characterize the DosR regulons in nine Mycobacterium genomes, Rhodococcus sp. RHA1, Nocardia farcinica, and Saccharopolyspora erythraea. The regulons are highly labile, containing eight core gene groups (regulators, kinases, USPs, DGATs, nitroreductases, ferredoxins, heat shock proteins, and the orthologs of the predicted kinase [Rv2004c] from M. tuberculosis) and 10 additional genes with more restricted taxonomic distribution that are mostly involved in anaerobic respiration. The largest regulon is observed in M. marinum and the smallest in M. abscessus. Analysis of large gene families encoding USPs, nitroreductases, and DGATs demonstrates a mosaic distribution of regulated and nonregulated members, suggesting frequent acquisition and loss of DosR-binding sites.

  15. The aggregate site frequency spectrum for comparative population genomic inference.

    Science.gov (United States)

    Xue, Alexander T; Hickerson, Michael J

    2015-12-01

    Understanding how assemblages of species responded to past climate change is a central goal of comparative phylogeography and comparative population genomics, an endeavour that has increasing potential to integrate with community ecology. New sequencing technology now provides the potential to perform complex demographic inference at unprecedented resolution across assemblages of nonmodel species. To this end, we introduce the aggregate site frequency spectrum (aSFS), an expansion of the site frequency spectrum to use single nucleotide polymorphism (SNP) data sets collected from multiple, co-distributed species for assemblage-level demographic inference. We describe how the aSFS is constructed over an arbitrary number of independent population samples and then demonstrate how the aSFS can differentiate various multispecies demographic histories under a wide range of sampling configurations while allowing effective population sizes and expansion magnitudes to vary independently. We subsequently couple the aSFS with a hierarchical approximate Bayesian computation (hABC) framework to estimate degree of temporal synchronicity in expansion times across taxa, including an empirical demonstration with a data set consisting of five populations of the threespine stickleback (Gasterosteus aculeatus). Corroborating what is generally understood about the recent postglacial origins of these populations, the joint aSFS/hABC analysis strongly suggests that the stickleback data are most consistent with synchronous expansion after the Last Glacial Maximum (posterior probability = 0.99). The aSFS will have general application for multilevel statistical frameworks to test models involving assemblages and/or communities, and as large-scale SNP data from nonmodel species become routine, the aSFS expands the potential for powerful next-generation comparative population genomic inference.

  16. Characterization of hemizygous deletions in citrus using array-comparative genomic hybridization and microsynteny comparisons with the poplar genome.

    Science.gov (United States)

    Ríos, Gabino; Naranjo, Miguel A; Iglesias, Domingo J; Ruiz-Rivero, Omar; Geraud, Marion; Usach, Antonio; Talón, Manuel

    2008-08-09

    Many fruit-tree species, including relevant Citrus spp varieties exhibit a reproductive biology that impairs breeding and strongly constrains genetic improvements. In citrus, juvenility increases the generation time while sexual sterility, inbreeding depression and self-incompatibility prevent the production of homozygous cultivars. Genomic technology may provide citrus researchers with a new set of tools to address these various restrictions. In this work, we report a valuable genomics-based protocol for the structural analysis of deletion mutations on an heterozygous background. Two independent fast neutron mutants of self-incompatible clementine (Citrus clementina Hort. Ex Tan. cv. Clemenules) were the subject of the study. Both mutants, named 39B3 and 39E7, were expected to carry DNA deletions in hemizygous dosage. Array-based Comparative Genomic Hybridization (array-CGH) using a Citrus cDNA microarray allowed the identification of underrepresented genes in these two mutants. Subsequent comparison of citrus deleted genes with annotated plant genomes, especially poplar, made possible to predict the presence of a large deletion in 39B3 of about 700 kb and at least two deletions of approximately 100 and 500 kb in 39E7. The deletion in 39B3 was further characterized by PCR on available Citrus BACs, which helped us to build a partial physical map of the deletion. Among the deleted genes, ClpC-like gene coding for a putative subunit of a multifunctional chloroplastic protease involved in the regulation of chlorophyll b synthesis was directly related to the mutated phenotype since the mutant showed a reduced chlorophyll a/b ratio in green tissues. In this work, we report the use of array-CGH for the successful identification of genes included in a hemizygous deletion induced by fast neutron irradiation on Citrus clementina. The study of gene content and order into the 39B3 deletion also led to the unexpected conclusion that microsynteny and local gene colinearity in

  17. Characterization of hemizygous deletions in Citrus using array-Comparative Genomic Hybridization and microsynteny comparisons with the poplar genome

    Directory of Open Access Journals (Sweden)

    Usach Antonio

    2008-08-01

    Full Text Available Abstract Background Many fruit-tree species, including relevant Citrus spp varieties exhibit a reproductive biology that impairs breeding and strongly constrains genetic improvements. In citrus, juvenility increases the generation time while sexual sterility, inbreeding depression and self-incompatibility prevent the production of homozygous cultivars. Genomic technology may provide citrus researchers with a new set of tools to address these various restrictions. In this work, we report a valuable genomics-based protocol for the structural analysis of deletion mutations on an heterozygous background. Results Two independent fast neutron mutants of self-incompatible clementine (Citrus clementina Hort. Ex Tan. cv. Clemenules were the subject of the study. Both mutants, named 39B3 and 39E7, were expected to carry DNA deletions in hemizygous dosage. Array-based Comparative Genomic Hybridization (array-CGH using a Citrus cDNA microarray allowed the identification of underrepresented genes in these two mutants. Subsequent comparison of citrus deleted genes with annotated plant genomes, especially poplar, made possible to predict the presence of a large deletion in 39B3 of about 700 kb and at least two deletions of approximately 100 and 500 kb in 39E7. The deletion in 39B3 was further characterized by PCR on available Citrus BACs, which helped us to build a partial physical map of the deletion. Among the deleted genes, ClpC-like gene coding for a putative subunit of a multifunctional chloroplastic protease involved in the regulation of chlorophyll b synthesis was directly related to the mutated phenotype since the mutant showed a reduced chlorophyll a/b ratio in green tissues. Conclusion In this work, we report the use of array-CGH for the successful identification of genes included in a hemizygous deletion induced by fast neutron irradiation on Citrus clementina. The study of gene content and order into the 39B3 deletion also led to the unexpected

  18. IMG/M: integrated genome and metagenome comparative data analysis system.

    Science.gov (United States)

    Chen, I-Min A; Markowitz, Victor M; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan; Huntemann, Marcel; Varghese, Neha; Hadjithomas, Michalis; Tennessen, Kristin; Nielsen, Torben; Ivanova, Natalia N; Kyrpides, Nikos C

    2017-01-04

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.

  19. GI-SVM: A sensitive method for predicting genomic islands based on unannotated sequence of a single genome.

    Science.gov (United States)

    Lu, Bingxin; Leong, Hon Wai

    2016-02-01

    Genomic islands (GIs) are clusters of functionally related genes acquired by lateral genetic transfer (LGT), and they are present in many bacterial genomes. GIs are extremely important for bacterial research, because they not only promote genome evolution but also contain genes that enhance adaption and enable antibiotic resistance. Many methods have been proposed to predict GI. But most of them rely on either annotations or comparisons with other closely related genomes. Hence these methods cannot be easily applied to new genomes. As the number of newly sequenced bacterial genomes rapidly increases, there is a need for methods to detect GI based solely on sequences of a single genome. In this paper, we propose a novel method, GI-SVM, to predict GIs given only the unannotated genome sequence. GI-SVM is based on one-class support vector machine (SVM), utilizing composition bias in terms of k-mer content. From our evaluations on three real genomes, GI-SVM can achieve higher recall compared with current methods, without much loss of precision. Besides, GI-SVM allows flexible parameter tuning to get optimal results for each genome. In short, GI-SVM provides a more sensitive method for researchers interested in a first-pass detection of GI in newly sequenced genomes.

  20. Comparative genomic hybridization: technical development and cytogenetic aspects for routine use in clinical laboratories.

    Science.gov (United States)

    Lapierre, J M; Cacheux, V; Da Silva, F; Collot, N; Hervy, N; Wiss, J; Tachdjian, G

    1998-01-01

    Comparative genomic hybridization (CGH) offers a new global approach for detection of chromosomal material imbalances of the entire genome in a single experiment without cell culture. In this paper, we discuss the technical development and the cytogenetic aspects of CGH in a clinical laboratory. Based only on the visual inspection of CGH metaphase spreads, the correct identification of numerical and structural anomalies are reported. No commercial image analysis software was required in these experiments. We have demonstrated that this new technology can be set up easily for routine use in a clinical cytogenetics laboratory.

  1. Evolution of electron transfer out of the cell: comparative genomics of six Geobacter genomes

    Directory of Open Access Journals (Sweden)

    Young Nelson D

    2010-01-01

    Full Text Available Abstract Background Geobacter species grow by transferring electrons out of the cell - either to Fe(III-oxides or to man-made substances like energy-harvesting electrodes. Study of Geobacter sulfurreducens has shown that TCA cycle enzymes, inner-membrane respiratory enzymes, and periplasmic and outer-membrane cytochromes are required. Here we present comparative analysis of six Geobacter genomes, including species from the clade that predominates in the subsurface. Conservation of proteins across the genomes was determined to better understand the evolution of Geobacter species and to create a metabolic model applicable to subsurface environments. Results The results showed that enzymes for acetate transport and oxidation, and for proton transport across the inner membrane were well conserved. An NADH dehydrogenase, the ATP synthase, and several TCA cycle enzymes were among the best conserved in the genomes. However, most of the cytochromes required for Fe(III-reduction were not, including many of the outer-membrane cytochromes. While conservation of cytochromes was poor, an abundance and diversity of cytochromes were found in every genome, with duplications apparent in several species. Conclusions These results indicate there is a common pathway for acetate oxidation and energy generation across the family and in the last common ancestor. They also suggest that while cytochromes are important for extracellular electron transport, the path of electrons across the periplasm and outer membrane is variable. This combination of abundant cytochromes with weak sequence conservation suggests they may not be specific terminal reductases, but rather may be important in their heme-bearing capacity, as sinks for electrons between the inner-membrane electron transport chain and the extracellular acceptor.

  2. Genome sequence of Cronobacter sakazakii BAA-894 and comparative genomic hybridization analysis with other Cronobacter species.

    Directory of Open Access Journals (Sweden)

    Eva Kucerova

    Full Text Available BACKGROUND: The genus Cronobacter (formerly called Enterobacter sakazakii is composed of five species; C. sakazakii, C. malonaticus, C. turicensis, C. muytjensii, and C. dublinensis. The genus includes opportunistic human pathogens, and the first three species have been associated with neonatal infections. The most severe diseases are caused in neonates and include fatal necrotizing enterocolitis and meningitis. The genetic basis of the diversity within the genus is unknown, and few virulence traits have been identified. METHODOLOGY/PRINCIPAL FINDINGS: We report here the first sequence of a member of this genus, C. sakazakii strain BAA-894. The genome of Cronobacter sakazakii strain BAA-894 comprises a 4.4 Mb chromosome (57% GC content and two plasmids; 31 kb (51% GC and 131 kb (56% GC. The genome was used to construct a 387,000 probe oligonucleotide tiling DNA microarray covering the whole genome. Comparative genomic hybridization (CGH was undertaken on five other C. sakazakii strains, and representatives of the four other Cronobacter species. Among 4,382 annotated genes inspected in this study, about 55% of genes were common to all C. sakazakii strains and 43% were common to all Cronobacter strains, with 10-17% absence of genes. CONCLUSIONS/SIGNIFICANCE: CGH highlighted 15 clusters of genes in C. sakazakii BAA-894 that were divergent or absent in more than half of the tested strains; six of these are of probable prophage origin. Putative virulence factors were identified in these prophage and in other variable regions. A number of genes unique to Cronobacter species associated with neonatal infections (C. sakazakii, C. malonaticus and C. turicensis were identified. These included a copper and silver resistance system known to be linked to invasion of the blood-brain barrier by neonatal meningitic strains of Escherichia coli. In addition, genes encoding for multidrug efflux pumps and adhesins were identified that were unique to C. sakazakii

  3. Comparative genomic hybridizations reveal absence of large Streptomyces coelicolor genomic islands in Streptomyces lividans

    Directory of Open Access Journals (Sweden)

    Sherman David H

    2007-07-01

    Full Text Available Abstract Background The genomes of Streptomyces coelicolor and Streptomyces lividans bear a considerable degree of synteny. While S. coelicolor is the model streptomycete for studying antibiotic synthesis and differentiation, S. lividans is almost exclusively considered as the preferred host, among actinomycetes, for cloning and expression of exogenous DNA. We used whole genome microarrays as a comparative genomics tool for identifying the subtle differences between these two chromosomes. Results We identified five large S. coelicolor genomic islands (larger than 25 kb and 18 smaller islets absent in S. lividans chromosome. Many of these regions show anomalous GC bias and codon usage patterns. Six of them are in close vicinity of tRNA genes while nine are flanked with near perfect repeat sequences indicating that these are probable recent evolutionary acquisitions into S. coelicolor. Embedded within these segments are at least four DNA methylases and two probable methyl-sensing restriction endonucleases. Comparison with S. coelicolor transcriptome and proteome data revealed that some of the missing genes are active during the course of growth and differentiation in S. coelicolor. In particular, a pair of methylmalonyl CoA mutase (mcm genes involved in polyketide precursor biosynthesis, an acyl-CoA dehydrogenase implicated in timing of actinorhodin synthesis and bldB, a developmentally significant regulator whose mutation causes complete abrogation of antibiotic synthesis belong to this category. Conclusion Our findings provide tangible hints for elucidating the genetic basis of important phenotypic differences between these two streptomycetes. Importantly, absence of certain genes in S. lividans identified here could potentially explain the relative ease of DNA transformations and the conditional lack of actinorhodin synthesis in S. lividans.

  4. Evolutionary insights into scleractinian corals using comparative genomic hybridizations.

    KAUST Repository

    Aranda, Manuel

    2012-09-21

    Coral reefs belong to the most ecologically and economically important ecosystems on our planet. Yet, they are under steady decline worldwide due to rising sea surface temperatures, disease, and pollution. Understanding the molecular impact of these stressors on different coral species is imperative in order to predict how coral populations will respond to this continued disturbance. The use of molecular tools such as microarrays has provided deep insight into the molecular stress response of corals. Here, we have performed comparative genomic hybridizations (CGH) with different coral species to an Acropora palmata microarray platform containing 13,546 cDNA clones in order to identify potentially rapidly evolving genes and to determine the suitability of existing microarray platforms for use in gene expression studies (via heterologous hybridization).

  5. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii genome.

    Directory of Open Access Journals (Sweden)

    Byrappa Venkatesh

    2007-04-01

    Full Text Available Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4x coverage and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element-like and long interspersed element-like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes are higher than that between human and teleost fish genomes. Elephant shark contains putative four Hox clusters indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.

  6. Survey Sequencing and Comparative Analysis of the Elephant Shark (Callorhinchus milii) Genome

    Science.gov (United States)

    Venkatesh, Byrappa; Kirkness, Ewen F; Loh, Yong-Hwee; Halpern, Aaron L; Lee, Alison P; Johnson, Justin; Dandona, Nidhi; Viswanathan, Lakshmi D; Tay, Alice; Venter, J. Craig; Strausberg, Robert L; Brenner, Sydney

    2007-01-01

    Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element–like and long interspersed element–like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes are higher than that between human and teleost fish genomes. Elephant shark contains putative four Hox clusters indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes. PMID:17407382

  7. Comparative genomic analysis and phylogenetic position of Theileria equi

    Directory of Open Access Journals (Sweden)

    Kappmeyer Lowell S

    2012-11-01

    Full Text Available Abstract Background Transmission of arthropod-borne apicomplexan parasites that cause disease and result in death or persistent infection represents a major challenge to global human and animal health. First described in 1901 as Piroplasma equi, this re-emergent apicomplexan parasite was renamed Babesia equi and subsequently Theileria equi, reflecting an uncertain taxonomy. Understanding mechanisms by which apicomplexan parasites evade immune or chemotherapeutic elimination is required for development of effective vaccines or chemotherapeutics. The continued risk of transmission of T. equi from clinically silent, persistently infected equids impedes the goal of returning the U. S. to non-endemic status. Therefore comparative genomic analysis of T. equi was undertaken to: 1 identify genes contributing to immune evasion and persistence in equid hosts, 2 identify genes involved in PBMC infection biology and 3 define the phylogenetic position of T. equi relative to sequenced apicomplexan parasites. Results The known immunodominant proteins, EMA1, 2 and 3 were discovered to belong to a ten member gene family with a mean amino acid identity, in pairwise comparisons, of 39%. Importantly, the amino acid diversity of EMAs is distributed throughout the length of the proteins. Eight of the EMA genes were simultaneously transcribed. As the agents that cause bovine theileriosis infect and transform host cell PBMCs, we confirmed that T. equi infects equine PBMCs, however, there is no evidence of host cell transformation. Indeed, a number of genes identified as potential manipulators of the host cell phenotype are absent from the T. equi genome. Comparative genomic analysis of T. equi revealed the phylogenetic positioning relative to seven apicomplexan parasites using deduced amino acid sequences from 150 genes placed it as a sister taxon to Theileria spp. Conclusions The EMA family does not fit the paradigm for classical antigenic variation, and we propose a

  8. Evolutionary insights into scleractinian corals using comparative genomic hybridizations

    Directory of Open Access Journals (Sweden)

    Aranda Manuel

    2012-09-01

    Full Text Available Abstract Background Coral reefs belong to the most ecologically and economically important ecosystems on our planet. Yet, they are under steady decline worldwide due to rising sea surface temperatures, disease, and pollution. Understanding the molecular impact of these stressors on different coral species is imperative in order to predict how coral populations will respond to this continued disturbance. The use of molecular tools such as microarrays has provided deep insight into the molecular stress response of corals. Here, we have performed comparative genomic hybridizations (CGH with different coral species to an Acropora palmata microarray platform containing 13,546 cDNA clones in order to identify potentially rapidly evolving genes and to determine the suitability of existing microarray platforms for use in gene expression studies (via heterologous hybridization. Results Our results showed that the current microarray platform for A. palmata is able to provide biological relevant information for a wide variety of coral species covering both the complex clade as well the robust clade. Analysis of the fraction of highly diverged genes showed a significantly higher amount of genes without annotation corroborating previous findings that point towards a higher rate of divergence for taxonomically restricted genes. Among the genes with annotation, we found many mitochondrial genes to be highly diverged in M. faveolata when compared to A. palmata, while the majority of nuclear encoded genes maintained an average divergence rate. Conclusions The use of present microarray platforms for transcriptional analyses in different coral species will greatly enhance the understanding of the molecular basis of stress and health and highlight evolutionary differences between scleractinian coral species. On a genomic basis, we show that cDNA arrays can be used to identify patterns of divergence. Mitochondrion-encoded genes seem to have diverged faster than

  9. The Genome of Nosema sp. Isolate YNPr: A Comparative Analysis of Genome Evolution within the Nosema/Vairimorpha Clade

    Science.gov (United States)

    Ma, Zhenggang; Li, Tian; Zhang, Xiaoyan; Debrunner-Vossbrinck, Bettina A.; Zhou, Zeyang; Vossbrinck, Charles R.

    2016-01-01

    The microsporidian parasite designated here as Nosema sp. Isolate YNPr was isolated from the cabbage butterfly Pieris rapae collected in Honghe Prefecture, Yunnan Province, China. The genome was sequenced by Illumina sequencing and compared to those of two related members of the Nosema/Vairimorpha clade, Nosema ceranae and Nosema apis. Based upon assembly statistics, the Nosema sp. YNPr genome is 3.36 x 106bp with a G+C content of 23.18% and 2,075 protein coding sequences. An “ACCCTT” motif is present approximately 50-bp upstream of the start codon, as reported from other members of the clade and from Encephalitozoon cuniculi, a sister taxon. Comparative small subunit ribosomal DNA (SSU rDNA) analysis as well as genome-wide phylogenetic analysis confirms a closer relationship between N. ceranae and Nosema sp. YNPr than between the two honeybee parasites N. ceranae and N. apis. The more closely related N. ceranae and Nosema sp. YNPr show similarities in a number of structural characteristics such as gene synteny, gene length, gene number, transposon composition and gene reduction. Based on transposable element content of the assemblies, the transposon content of Nosema sp. YNPr is 4.8%, that of N. ceranae is 3.7%, and that of N. apis is 2.5%, with large differences in the types of transposons present among these 3 species. Gene function annotation indicates that the number of genes participating in most metabolic activities is similar in all three species. However, the number of genes in the transcription, general function, and cysteine protease categories is greater in N. apis than in the other two species. Our studies further characterize the evolution of the Nosema/Vairimorpha clade of microsporidia. These organisms maintain variable but very reduced genomes. We are interested in understanding the effects of genetic drift versus natural selection on genome size in the microsporidia and in developing a testable hypothesis for further studies on the genomic

  10. Using comparative genomic hybridization to survey genomic sequence divergence across species: a proof-of-concept from Drosophila

    Directory of Open Access Journals (Sweden)

    Kulathinal Rob J

    2010-04-01

    Full Text Available Abstract Background Genome-wide analysis of sequence divergence among species offers profound insights into the evolutionary processes that shape lineages. When full-genome sequencing is not feasible for a broad comparative study, we propose the use of array-based comparative genomic hybridization (aCGH in order to identify orthologous genes with high sequence divergence. Here we discuss experimental design, statistical power, success rate, sources of variation and potential confounding factors. We used a spotted PCR product microarray platform from Drosophila melanogaster to assess sequence divergence on a gene-by-gene basis in three fully sequenced heterologous species (D. sechellia, D. simulans, and D. yakuba. Because complete genome assemblies are available for these species this study presents a powerful test for the use of aCGH as a tool to measure sequence divergence. Results We found a consistent and linear relationship between hybridization ratio and sequence divergence of the sample to the platform species. At higher levels of sequence divergence (D. melanogaster ~84% of features had significantly less hybridization to the array in the heterologous species than the platform species, and thus could be identified as "diverged". At lower levels of divergence (≥ 97% identity, only 13% of genes were identified as diverged. While ~40% of the variation in hybridization ratio can be accounted for by variation in sequence identity of the heterologous sample relative to D. melanogaster, other individual characteristics of the DNA sequences, such as GC content, also contribute to variation in hybridization ratio, as does technical variation. Conclusions Here we demonstrate that aCGH can accurately be used as a proxy to estimate genome-wide divergence, thus providing an efficient way to evaluate how evolutionary processes and genomic architecture can shape species diversity in non-model systems. Given the increased number of species for which

  11. Comparative Genomics of Erwinia amylovora and Related Erwinia Species—What do We Learn?

    Directory of Open Access Journals (Sweden)

    Youfu Zhao

    2011-09-01

    Full Text Available Erwinia amylovora, the causal agent of fire blight disease of apples and pears, is one of the most important plant bacterial pathogens with worldwide economic significance. Recent reports on the complete or draft genome sequences of four species in the genus Erwinia, including E. amylovora, E. pyrifoliae, E. tasmaniensis, and E. billingiae, have provided us near complete genetic information about this pathogen and its closely-related species. This review describes in silico subtractive hybridization-based comparative genomic analyses of eight genomes currently available, and highlights what we have learned from these comparative analyses, as well as genetic and functional genomic studies. Sequence analyses reinforce the assumption that E. amylovora is a relatively homogeneous species and support the current classification scheme of E. amylovora and its related species. The potential evolutionary origin of these Erwinia species is also proposed. The current understanding of the pathogen, its virulence mechanism and host specificity from genome sequencing data is summarized. Future research directions are also suggested.

  12. arrayCGHbase: an analysis platform for comparative genomic hybridization microarrays

    Directory of Open Access Journals (Sweden)

    Moreau Yves

    2005-05-01

    Full Text Available Abstract Background The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH. One of the challenges inherent to this new technology is the management and analysis of large numbers of data points generated in each individual experiment. Results We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment supportive database using MySQL underlying a data mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Following its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser. Conclusion ArrayCGHbase is a web based and platform independent arrayCGH data analysis tool, that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at http://medgen.ugent.be/arrayCGHbase/.

  13. Comparative population genomics of maize domestication and improvement.

    Science.gov (United States)

    Hufford, Matthew B; Xu, Xun; van Heerwaarden, Joost; Pyhäjärvi, Tanja; Chia, Jer-Ming; Cartwright, Reed A; Elshire, Robert J; Glaubitz, Jeffrey C; Guill, Kate E; Kaeppler, Shawn M; Lai, Jinsheng; Morrell, Peter L; Shannon, Laura M; Song, Chi; Springer, Nathan M; Swanson-Wagner, Ruth A; Tiffin, Peter; Wang, Jun; Zhang, Gengyun; Doebley, John; McMullen, Michael D; Ware, Doreen; Buckler, Edward S; Yang, Shuang; Ross-Ibarra, Jeffrey

    2012-06-03

    Domestication and plant breeding are ongoing 10,000-year-old evolutionary experiments that have radically altered wild species to meet human needs. Maize has undergone a particularly striking transformation. Researchers have sought for decades to identify the genes underlying maize evolution, but these efforts have been limited in scope. Here, we report a comprehensive assessment of the evolution of modern maize based on the genome-wide resequencing of 75 wild, landrace and improved maize lines. We find evidence of recovery of diversity after domestication, likely introgression from wild relatives, and evidence for stronger selection during domestication than improvement. We identify a number of genes with stronger signals of selection than those previously shown to underlie major morphological changes. Finally, through transcriptome-wide analysis of gene expression, we find evidence both consistent with removal of cis-acting variation during maize domestication and improvement and suggestive of modern breeding having increased dominance in expression while targeting highly expressed genes.

  14. Family Competition Pheromone Genetic Algorithm for Comparative Genome Assembly

    Institute of Scientific and Technical Information of China (English)

    Chien-Hao Su; Chien-Shun Chiou; Jung-Che Kuo; Pei-Jen Wang; Cheng-Yan Kao; Hsueh-Ting Chu

    2014-01-01

    Genome assembly is a prerequisite step for analyzing next generation sequencing data and also far from being solved. Many assembly tools have been proposed and used extensively. Majority of them aim to assemble sequencing reads into contigs; however, we focus on the assembly of contigs into scaffolds in this paper. This is called scaffolding, which estimates the relative order of the contigs as well as the size of the gaps between these contigs. Pheromone trail-based genetic algorithm (PGA) was previously proposed and had decent performance according to their paper. From our previous study, we found that family competition mechanism in genetic algorithm is able to further improve the results. Therefore, we propose family competition pheromone genetic algorithm (FCPGA) and demonstrate the improvement over PGA.

  15. Comparative genomic reconstruction of transcriptional networks controlling central metabolism in the Shewanella genus

    Energy Technology Data Exchange (ETDEWEB)

    Rodionov, Dmitry A.; Novichkov, Pavel; Stavrovskaya, Elena D.; Rodionova, Irina A.; Li, Xiaoqing; Kazanov, Marat D.; Ravcheev, Dmitry A.; Gerasimova, Anna V.; Kazakov, Alexey E.; Kovaleva, Galina Y.; Permina, Elizabeth A.; Laikova, Olga N.; Overbeek, Ross; Romine, Margaret F.; Fredrickson, Jim K.; Arkin, Adam P.; Dubchak, Inna; Osterman, Andrei L.; Gelfand, Mikhail S.

    2011-06-15

    Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in bacteria is one of the critical tasks of modern genomics. Despite the growing number of genome-scale gene expression studies, our abilities to convert the results of these studies into accurate regulatory annotations and to project them from model to other organisms are extremely limited. The comparative genomics approaches and computational identification of regulatory sites are useful for the in silico reconstruction of transcriptional regulatory networks in bacteria. The Shewanella genus is comprised of metabolically versatile gamma-proteobacteria, whose lifestyles and natural environments are substantially different from Escherichia coli and other model bacterial species. To explore conservation and variations in the Shewanella transcriptional networks we analyzed the repertoire of transcription factors and performed genomics-based reconstruction and comparative analysis of regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors and their DNA binding sites, 8 riboswitches and 6 translational attenuators. Forty five regulons were newly inferred from the genome context analysis, whereas others were propagated from previously characterized regulons in the Enterobacteria and Pseudomonas spp.. However, even orthologous regulators with conserved DNA-binding motifs may control substantially different gene sets, revealing striking differences in regulatory strategies between the Shewanella spp. and E. coli. Multiple examples of regulatory network rewiring include regulon contraction and expansion (as in the case of PdhR, HexR, FadR), and numerous cases of recruiting non-orthologous regulators to control equivalent pathways (e.g. NagR for N-acetylglucosamine catabolism and PsrA for fatty acid degradation) and, conversely, orthologous regulators to control distinct pathways (e.g. TyrR, ArgR, Crp).

  16. Automated comparative auditing of NCIT genomic roles using NCBI.

    Science.gov (United States)

    Cohen, Barry; Oren, Marc; Min, Hua; Perl, Yehoshua; Halper, Michael

    2008-12-01

    Biomedical research has identified many human genes and various knowledge about them. The National Cancer Institute Thesaurus (NCIT) represents such knowledge as concepts and roles (relationships). Due to the rapid advances in this field, it is to be expected that the NCIT's Gene hierarchy will contain role errors. A comparative methodology to audit the Gene hierarchy with the use of the National Center for Biotechnology Information's (NCBI's) Entrez Gene database is presented. The two knowledge sources are accessed via a pair of Web crawlers to ensure up-to-date data. Our algorithms then compare the knowledge gathered from each, identify discrepancies that represent probable errors, and suggest corrective actions. The primary focus is on two kinds of gene-roles: (1) the chromosomal locations of genes, and (2) the biological processes in which genes play a role. Regarding chromosomal locations, the discrepancies revealed are striking and systematic, suggesting a structurally common origin. In regard to the biological processes, difficulties arise because genes frequently play roles in multiple processes, and processes may have many designations (such as synonymous terms). Our algorithms make use of the roles defined in the NCIT Biological Process hierarchy to uncover many probable gene-role errors in the NCIT. These results show that automated comparative auditing is a promising technique that can identify a large number of probable errors and corrections for them in a terminological genomic knowledge repository, thus facilitating its overall maintenance.

  17. Comparative Genomics of Interreplichore Translocations in Bacteria: A Measure of Chromosome Topology?

    Science.gov (United States)

    Khedkar, Supriya; Seshasayee, Aswin Sai Narain

    2016-06-01

    Genomes evolve not only in base sequence but also in terms of their architecture, defined by gene organization and chromosome topology. Whereas genome sequence data inform us about the changes in base sequences for a large variety of organisms, the study of chromosome topology is restricted to a few model organisms studied using microscopy and chromosome conformation capture techniques. Here, we exploit whole genome sequence data to study the link between gene organization and chromosome topology in bacteria. Using comparative genomics across ∼250 pairs of closely related bacteria we show that: (a) many organisms show a high degree of interreplichore translocations throughout the chromosome and not limited to the inversion-prone terminus (ter) or the origin of replication (oriC); (b) translocation maps may reflect chromosome topologies; and (c) symmetric interreplichore translocations do not disrupt the distance of a gene from oriC or affect gene expression states or strand biases in gene densities. In summary, we suggest that translocation maps might be a first line in defining a gross chromosome topology given a pair of closely related genome sequences.

  18. Comparative Genomic Analyses of the Human NPHP1 Locus Reveal Complex Genomic Architecture and Its Regional Evolution in Primates.

    Directory of Open Access Journals (Sweden)

    Bo Yuan

    2015-12-01

    Full Text Available Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100 is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases-about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual's susceptibility to acquiring disease-associated alleles.

  19. Comparative genomics reveals evidence of marine adaptation in Salinispora species

    Science.gov (United States)

    2012-01-01

    Background Actinobacteria represent a consistent component of most marine bacterial communities yet little is known about the mechanisms by which these Gram-positive bacteria adapt to life in the marine environment. Here we employed a phylogenomic approach to identify marine adaptation genes in marine Actinobacteria. The focus was on the obligate marine actinomycete genus Salinispora and the identification of marine adaptation genes that have been acquired from other marine bacteria. Results Functional annotation, comparative genomics, and evidence of a shared evolutionary history with bacteria from hyperosmotic environments were used to identify a pool of more than 50 marine adaptation genes. An Actinobacterial species tree was used to infer the likelihood of gene gain or loss in accounting for the distribution of each gene. Acquired marine adaptation genes were associated with electron transport, sodium and ABC transporters, and channels and pores. In addition, the loss of a mechanosensitive channel gene appears to have played a major role in the inability of Salinispora strains to grow following transfer to low osmotic strength media. Conclusions The marine Actinobacteria for which genome sequences are available are broadly distributed throughout the Actinobacterial phylogenetic tree and closely related to non-marine forms suggesting they have been independently introduced relatively recently into the marine environment. It appears that the acquisition of transporters in Salinispora spp. represents a major marine adaptation while gene loss is proposed to play a role in the inability of this genus to survive outside of the marine environment. This study reveals fundamental differences between marine adaptations in Gram-positive and Gram-negative bacteria and no common genetic basis for marine adaptation among the Actinobacteria analyzed. PMID:22401625

  20. Comparative genomics reveals evidence of marine adaptation in Salinispora species.

    Science.gov (United States)

    Penn, Kevin; Jensen, Paul R

    2012-03-08

    Actinobacteria represent a consistent component of most marine bacterial communities yet little is known about the mechanisms by which these Gram-positive bacteria adapt to life in the marine environment. Here we employed a phylogenomic approach to identify marine adaptation genes in marine Actinobacteria. The focus was on the obligate marine actinomycete genus Salinispora and the identification of marine adaptation genes that have been acquired from other marine bacteria. Functional annotation, comparative genomics, and evidence of a shared evolutionary history with bacteria from hyperosmotic environments were used to identify a pool of more than 50 marine adaptation genes. An Actinobacterial species tree was used to infer the likelihood of gene gain or loss in accounting for the distribution of each gene. Acquired marine adaptation genes were associated with electron transport, sodium and ABC transporters, and channels and pores. In addition, the loss of a mechanosensitive channel gene appears to have played a major role in the inability of Salinispora strains to grow following transfer to low osmotic strength media. The marine Actinobacteria for which genome sequences are available are broadly distributed throughout the Actinobacterial phylogenetic tree and closely related to non-marine forms suggesting they have been independently introduced relatively recently into the marine environment. It appears that the acquisition of transporters in Salinispora spp. represents a major marine adaptation while gene loss is proposed to play a role in the inability of this genus to survive outside of the marine environment. This study reveals fundamental differences between marine adaptations in Gram-positive and Gram-negative bacteria and no common genetic basis for marine adaptation among the Actinobacteria analyzed.

  1. Comparative genomics reveals evidence of marine adaptation in Salinispora species

    Directory of Open Access Journals (Sweden)

    Penn Kevin

    2012-03-01

    Full Text Available Abstract Background Actinobacteria represent a consistent component of most marine bacterial communities yet little is known about the mechanisms by which these Gram-positive bacteria adapt to life in the marine environment. Here we employed a phylogenomic approach to identify marine adaptation genes in marine Actinobacteria. The focus was on the obligate marine actinomycete genus Salinispora and the identification of marine adaptation genes that have been acquired from other marine bacteria. Results Functional annotation, comparative genomics, and evidence of a shared evolutionary history with bacteria from hyperosmotic environments were used to identify a pool of more than 50 marine adaptation genes. An Actinobacterial species tree was used to infer the likelihood of gene gain or loss in accounting for the distribution of each gene. Acquired marine adaptation genes were associated with electron transport, sodium and ABC transporters, and channels and pores. In addition, the loss of a mechanosensitive channel gene appears to have played a major role in the inability of Salinispora strains to grow following transfer to low osmotic strength media. Conclusions The marine Actinobacteria for which genome sequences are available are broadly distributed throughout the Actinobacterial phylogenetic tree and closely related to non-marine forms suggesting they have been independently introduced relatively recently into the marine environment. It appears that the acquisition of transporters in Salinispora spp. represents a major marine adaptation while gene loss is proposed to play a role in the inability of this genus to survive outside of the marine environment. This study reveals fundamental differences between marine adaptations in Gram-positive and Gram-negative bacteria and no common genetic basis for marine adaptation among the Actinobacteria analyzed.

  2. Comparative cytogenetic analysis of Avena macrostachya and diploid C-genome Avena species.

    Science.gov (United States)

    Badaeva, Ekaterina D; Shelukhina, Olga Yu; Diederichsen, Axel; Loskutov, Igor G; Pukhalskiy, Vitaly A

    2010-02-01

    The chromosome set of Avena macrostachya Balansa ex Coss. et Durieu was analyzed using C-banding and fluorescence in situ hybridization with 5S and 18S-5.8S-26S rRNA gene probes, and the results were compared with the C-genome diploid Avena L. species. The location of major nucleolar organizer regions and 5S rDNA sites on different chromosomes confirmed the affiliation of A. macrostachya with the C-genome group. However, the symmetric karyotype, the absence of "diffuse heterochromatin" and the location of large C-band complexes in proximal chromosome regions pointed to an isolated position of A. macrostachya from other Avena species. Based on the distribution of rDNA loci on the C-genome chromosomes of diploid and polyploid Avena species, we propose a model of the chromosome alterations that occurred during the evolution of oat species.

  3. The genome sequence of E. coli W (ATCC 9637: comparative genome analysis and an improved genome-scale reconstruction of E. coli

    Directory of Open Access Journals (Sweden)

    Lee Sang

    2011-01-01

    Full Text Available Abstract Background Escherichia coli is a model prokaryote, an important pathogen, and a key organism for industrial biotechnology. E. coli W (ATCC 9637, one of four strains designated as safe for laboratory purposes, has not been sequenced. E. coli W is a fast-growing strain and is the only safe strain that can utilize sucrose as a carbon source. Lifecycle analysis has demonstrated that sucrose from sugarcane is a preferred carbon source for industrial bioprocesses. Results We have sequenced and annotated the genome of E. coli W. The chromosome is 4,900,968 bp and encodes 4,764 ORFs. Two plasmids, pRK1 (102,536 bp and pRK2 (5,360 bp, are also present. W has unique features relative to other sequenced laboratory strains (K-12, B and Crooks: it has a larger genome and belongs to phylogroup B1 rather than A. W also grows on a much broader range of carbon sources than does K-12. A genome-scale reconstruction was developed and validated in order to interrogate metabolic properties. Conclusions The genome of W is more similar to commensal and pathogenic B1 strains than phylogroup A strains, and therefore has greater utility for comparative analyses with these strains. W should therefore be the strain of choice, or 'type strain' for group B1 comparative analyses. The genome annotation and tools created here are expected to allow further utilization and development of E. coli W as an industrial organism for sucrose-based bioprocesses. Refinements in our E. coli metabolic reconstruction allow it to more accurately define E. coli metabolism relative to previous models.

  4. Complete genome sequence and comparative genome analysis of a new special Yersinia enterocolitica.

    Science.gov (United States)

    Shi, Guoxiang; Su, Mingming; Liang, Junrong; Duan, Ran; Gu, Wenpeng; Xiao, Yuchun; Zhang, Zhewen; Qiu, Haiyan; Zhang, Zheng; Li, Yi; Zhang, Xiaohe; Ling, Yunchao; Song, Lai; Chen, Meili; Zhao, Yongbing; Wu, Jiayan; Jing, Huaiqi; Xiao, Jingfa; Wang, Xin

    2016-09-01

    Yersinia enterocolitica is the most diverse species among the Yersinia genera and shows more polymorphism, especially for the non-pathogenic strains. Individual non-pathogenic Y. enterocolitica strains are wrongly identified because of atypical phenotypes. In this study, we isolated an unusual Y. enterocolitica strain LC20 from Rattus norvegicus. The strain did not utilize urea and could not be classified as the biotype. API 20E identified Escherichia coli; however, it grew well at 25 °C, but E. coli grew well at 37 °C. We analyzed the genome of LC20 and found the whole chromosome of LC20 was collinear with Y. enterocolitica 8081, and the urease gene did not exist on the genome which is consistent with the result of API 20E. Also, the 16 S and 23 SrRNA gene of LC20 lay on a branch of Y. enterocolitica. Furthermore, the core-based and pan-based phylogenetic trees showed that LC20 was classified into the Y. enterocolitica cluster. Two plasmids (80 and 50 k) from LC20 shared low genetic homology with pYV from the Yersinia genus, one was an ancestral Yersinia plasmid and the other was novel encoding a number of transposases. Some pathogenic and non-pathogenic Y. enterocolitica-specific genes coexisted in LC20. Thus, although it could not be classified into any Y. enterocolitica biotype due to its special biochemical metabolism, we concluded the LC20 was a Y. enterocolitica strain because its genome was similar to other Y. enterocolitica and it might be a strain with many mutations and combinations emerging in the processes of its evolution.

  5. Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

    Directory of Open Access Journals (Sweden)

    Khan Shafiq A

    2003-06-01

    Full Text Available Abstract Background Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish genomes. Results 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells.

  6. Identification of conserved regulatory elements by comparative genome analysis

    Directory of Open Access Journals (Sweden)

    Jareborg Niclas

    2003-05-01

    Full Text Available Abstract Background For genes that have been successfully delineated within the human genome sequence, most regulatory sequences remain to be elucidated. The annotation and interpretation process requires additional data resources and significant improvements in computational methods for the detection of regulatory regions. One approach of growing popularity is based on the preferential conservation of functional sequences over the course of evolution by selective pressure, termed 'phylogenetic footprinting'. Mutations are more likely to be disruptive if they appear in functional sites, resulting in a measurable difference in evolution rates between functional and non-functional genomic segments. Results We have devised a flexible suite of methods for the identification and visualization of conserved transcription-factor-binding sites. The system reports those putative transcription-factor-binding sites that are both situated in conserved regions and located as pairs of sites in equivalent positions in alignments between two orthologous sequences. An underlying collection of metazoan transcription-factor-binding profiles was assembled to facilitate the study. This approach results in a significant improvement in the detection of transcription-factor-binding sites because of an increased signal-to-noise ratio, as demonstrated with two sets of promoter sequences. The method is implemented as a graphical web application, ConSite, which is at the disposal of the scientific community at http://www.phylofoot.org/. Conclusions Phylogenetic footprinting dramatically improves the predictive selectivity of bioinformatic approaches to the analysis of promoter sequences. ConSite delivers unparalleled performance using a novel database of high-quality binding models for metazoan transcription factors. With a dynamic interface, this bioinformatics tool provides broad access to promoter analysis with phylogenetic footprinting.

  7. Comparative genome research between maize and rice using genomic in situ hybridization

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Using the genomic DNAs of maize and rice as probes respectively,the homology of maize and rice genomes was assessed by genomic in situ hybridization. When rice genomic DNAs were hybridized to maize, all chromosomes displayed many multiple discrete regions, while each rice chromosome delineated a single consecutive chromosomal region after they were hybridized with maize genomic DNAs. The results indicate that the genomes of maize and rice share high homology, and confirm the proposal that maize and rice are diverged from a common ancestor.

  8. A general pipeline for the development of anchor markers for comparative genomics in plants

    Directory of Open Access Journals (Sweden)

    Stougaard Jens

    2006-08-01

    Full Text Available Abstract Background Complete or near-complete genomic sequence information is presently only available for a few plant species representing a large phylogenetic diversity among plants. In order to effectively transfer this information to species lacking sequence information, comparative genomic tools need to be developed. Molecular markers permitting cross-species mapping along co-linear genomic regions are central to comparative genomics. These "anchor" markers, defining unique loci in genetic linkage maps of multiple species, are gene-based and possess a number of features that make them relatively sparse. To identify potential anchor marker sequences more efficiently, we have established an automated bioinformatic pipeline that combines multi-species Expressed Sequence Tags (EST and genome sequence data. Results Taking advantage of sequence data from related species, the pipeline identifies evolutionarily conserved sequences that are likely to define unique orthologous loci in most species of the same phylogenetic clade. The key features are the identification of evolutionarily conserved sequences followed by automated design of intron-flanking Polymerase Chain Reaction (PCR primer pairs. Polymorphisms can subsequently be identified by size- or sequence variation of PCR products, amplified from mapping parents or populations. We illustrate our procedure in legumes and grasses and exemplify its application in legumes, where model plant studies and the genome- and EST-sequence data available have a potential impact on the breeding of crop species and on our understanding of the evolution of this large and diverse family. Conclusion We provide a database of 459 candidate anchor loci which have the potential to serve as map anchors in more than 18,000 legume species, a number of which are of agricultural importance. For grasses, the database contains 1335 candidate anchor loci. Based on this database, we have evaluated 76 candidate anchor loci

  9. Genomics and Comparative Genomic Analyses Provide Insight into the Taxonomy and Pathogenic Potential of Novel Emmonsia Pathogens

    Science.gov (United States)

    Yang, Ying; Ye, Qiang; Li, Kang; Li, Zongwei; Bo, Xiaochen; Li, Zhen; Xu, Yingchun; Wang, Shengqi; Wang, Peng; Chen, Huipeng; Wang, Junzhi

    2017-01-01

    Over the last 50 years, newly described species of Emmonsia-like fungi have been implicated globally as sources of systemic human mycosis (emmonsiosis). Their ability to convert into yeast-like cells capable of replication and extra-pulmonary dissemination during the course of infection differentiates them from classical Emmonsia species. Immunocompromised patients are at highest risk of emmonsiosis and exhibit high mortality rates. In order to investigate the molecular basis for pathogenicity of the newly described Emmonsia species, genomic sequencing and comparative genomic analyses of Emmonsia sp. 5z489, which was isolated from a non-deliberately immunosuppressed diabetic patient in China and represents a novel seventh isolate of Emmonsia-like fungi, was performed. The genome size of 5z489 was 35.5 Mbp in length, which is ~5 Mbp larger than other Emmonsia strains. Further, 9,188 protein genes were predicted in the 5z489 genome and 16% of the assembly was identified as repetitive elements, which is the largest abundance in Emmonsia species. Phylogenetic analyses based on whole genome data classified 5z489 and CAC-2015a, another novel isolate, as members of the genus Emmonsia. Our analyses showed that divergences among Emmonsia occurred much earlier than other genera within the family Ajellomycetaceae, suggesting relatively distant evolutionary relationships among the genus. Through comparisons of Emmonsia species, we discovered significant pathogenicity characteristics within the genus as well as putative virulence factors that may play a role in the infection and pathogenicity of the novel Emmonsia strains. Moreover, our analyses revealed a novel distribution mode of DNA methylation patterns across the genome of 5z489, with >50% of methylated bases located in intergenic regions. These methylation patterns differ considerably from other reported fungi, where most methylation occurs in repetitive loci. It is unclear if this difference is related to physiological

  10. Complete nucleotide sequence of the Cryptomeria japonica D. Don. chloroplast genome and comparative chloroplast genomics: diversified genomic structure of coniferous species

    Science.gov (United States)

    Hirao, Tomonori; Watanabe, Atsushi; Kurita, Manabu; Kondo, Teiji; Takata, Katsuhiko

    2008-01-01

    Background The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms. Results The C. japonica cp genome is 131,810 bp in length, with 112 single copy genes and two duplicated (trnI-CAU, trnQ-UUG) genes that give a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp has lost one of the relevant large inverted repeats (IRs) found in angiosperms, fern, liverwort, and gymnosperms, such as Cycas and Gingko, and additionally has completely lost its trnR-CCG, partially lost its trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. Conclusion The observed differences in genomic structure between C. japonica and other land plants, including

  11. Genome Stability of Lyme Disease Spirochetes: Comparative Genomics of Borrelia burgdorferi Plasmids

    Energy Technology Data Exchange (ETDEWEB)

    Casjens S. R.; Dunn J.; Mongodin, E. F.; Qiu, W.-G.; Luft, B. J.; Schutzer, S. E.; Gilcrease, E. B.; Huang, W. M.; Vujadinovic, M.; Aron, J. K.; Vargas, L. C.; Freeman, S.; Radune, D.; Weidman, J. F.; Dimitrov, G. I.; Khouri, H. M.; Sosa, J. E.; Halpin, R. A.; Fraser, C. M.

    2012-03-14

    Lyme disease is the most common tick-borne human illness in North America. In order to understand the molecular pathogenesis, natural diversity, population structure and epizootic spread of the North American Lyme agent, Borrelia burgdorferi sensu stricto, a much better understanding of the natural diversity of its genome will be required. Towards this end we present a comparative analysis of the nucleotide sequences of the numerous plasmids of B. burgdorferi isolates B31, N40, JD1 and 297. These strains were chosen because they include the three most commonly studied laboratory strains, and because they represent different major genetic lineages and so are informative regarding the genetic diversity and evolution of this organism. A unique feature of Borrelia genomes is that they carry a large number of linear and circular plasmids, and this work shows that strains N40, JD1, 297 and B31 carry related but non-identical sets of 16, 20, 19 and 21 plasmids, respectively, that comprise 33-40% of their genomes. We deduce that there are at least 28 plasmid compatibility types among the four strains. The B. burgdorferi {approx}900 Kbp linear chromosomes are evolutionarily exceptionally stable, except for a short {le}20 Kbp plasmid-like section at the right end. A few of the plasmids, including the linear lp54 and circular cp26, are also very stable. We show here that the other plasmids, especially the linear ones, are considerably more variable. Nearly all of the linear plasmids have undergone one or more substantial inter-plasmid rearrangements since their last common ancestor. In spite of these rearrangements and differences in plasmid contents, the overall gene complement of the different isolates has remained relatively constant.

  12. Genome stability of Lyme disease spirochetes: comparative genomics of Borrelia burgdorferi plasmids.

    Directory of Open Access Journals (Sweden)

    Sherwood R Casjens

    Full Text Available Lyme disease is the most common tick-borne human illness in North America. In order to understand the molecular pathogenesis, natural diversity, population structure and epizootic spread of the North American Lyme agent, Borrelia burgdorferi sensu stricto, a much better understanding of the natural diversity of its genome will be required. Towards this end we present a comparative analysis of the nucleotide sequences of the numerous plasmids of B. burgdorferi isolates B31, N40, JD1 and 297. These strains were chosen because they include the three most commonly studied laboratory strains, and because they represent different major genetic lineages and so are informative regarding the genetic diversity and evolution of this organism. A unique feature of Borrelia genomes is that they carry a large number of linear and circular plasmids, and this work shows that strains N40, JD1, 297 and B31 carry related but non-identical sets of 16, 20, 19 and 21 plasmids, respectively, that comprise 33-40% of their genomes. We deduce that there are at least 28 plasmid compatibility types among the four strains. The B. burgdorferi ∼900 Kbp linear chromosomes are evolutionarily exceptionally stable, except for a short ≤20 Kbp plasmid-like section at the right end. A few of the plasmids, including the linear lp54 and circular cp26, are also very stable. We show here that the other plasmids, especially the linear ones, are considerably more variable. Nearly all of the linear plasmids have undergone one or more substantial inter-plasmid rearrangements since their last common ancestor. In spite of these rearrangements and differences in plasmid contents, the overall gene complement of the different isolates has remained relatively constant.

  13. GO4genome: A Prokaryotic Phylogeny Based on Genome Organization

    OpenAIRE

    Merkl, Rainer; Wiezer, Arnim

    2009-01-01

    Determining the phylogeny of closely related prokaryotes may fail in an analysis of rRNA or a small set of sequences. Whole-genome phylogeny utilizes the maximally available sample space. For a precise determination of genome similarity, two aspects have to be considered when developing an algorithm of whole-genome phylogeny: (1) gene order conservation is a more precise signal than gene content; and (2) when using sequence similarity, failures in identifying orthologues or the in situ replac...

  14. Comparative genomics of the relationship between gene structure and expression

    NARCIS (Netherlands)

    Ren, X.

    2006-01-01

    The relationship between the structure of genes and their expression is a relatively new aspect of genome organization and regulation. With more genome sequences and expression data becoming available, bioinformatics approaches can help the further elucidation of the relationships between gene struc

  15. Comparative Genomic Analysis of Meningitis- and Bacteremia-Causing Pneumococci Identifies a Common Core Genome.

    Science.gov (United States)

    Kulohoma, Benard W; Cornick, Jennifer E; Chaguza, Chrispin; Yalcin, Feyruz; Harris, Simon R; Gray, Katherine J; Kiran, Anmol M; Molyneux, Elizabeth; French, Neil; Parkhill, Julian; Faragher, Brian E; Everett, Dean B; Bentley, Stephen D; Heyderman, Robert S

    2015-10-01

    Streptococcus pneumoniae is a nasopharyngeal commensal that occasionally invades normally sterile sites to cause bloodstream infection and meningitis. Although the pneumococcal population structure and evolutionary genetics are well defined, it is not clear whether pneumococci that cause meningitis are genetically distinct from those that do not. Here, we used whole-genome sequencing of 140 isolates of S. pneumoniae recovered from bloodstream infection (n = 70) and meningitis (n = 70) to compare their genetic contents. By fitting a double-exponential decaying-function model, we show that these isolates share a core of 1,427 genes (95% confidence interval [CI], 1,425 to 1,435 genes) and that there is no difference in the core genome or accessory gene content from these disease manifestations. Gene presence/absence alone therefore does not explain the virulence behavior of pneumococci that reach the meninges. Our analysis, however, supports the requirement of a range of previously described virulence factors and vaccine candidates for both meningitis- and bacteremia-causing pneumococci. This high-resolution view suggests that, despite considerable competency for genetic exchange, all pneumococci are under considerable pressure to retain key components advantageous for colonization and transmission and that these components are essential for access to and survival in sterile sites.

  16. Comparative genomics of Gossypium and Arabidopsis: Unraveling the consequences of both ancient and recent polyploidy

    Science.gov (United States)

    Rong, Junkang; Bowers, John E.; Schulze, Stefan R.; Waghmare, Vijay N.; Rogers, Carl J.; Pierce, Gary J.; Zhang, Hua; Estill, James C.; Paterson, Andrew H.

    2005-01-01

    Both ancient and recent polyploidy, together with post-polyploidization loss of many duplicated gene copies, complicates angiosperm comparative genomics. To explore an approach by which these challenges might be mitigated, genetic maps of extant diploid and tetraploid cottons (Gossypium spp.) were used to infer the approximate order of 3016 loci along the chromosomes of their hypothetical common ancestor. The inferred Gossypium gene order corresponded more closely than the original maps did to a similarly inferred ancestral gene order predating an independent paleopolyploidization (α) in Arabidopsis. At least 59% of the cotton map and 53% of the Arabidopsis transcriptome showed correspondence in multilocus gene arrangements based on one or both of two software packages (CrimeStatII, FISH). Genomic regions in which chromosome structural rearrangement has been rapid (obscuring gene order correspondence) have also been subject to greater divergence of individual gene sequences. About 26%-44% of corresponding regions involved multiple Arabidopsis or cotton chromosomes, in some cases consistent with known, more ancient, duplications. The genomic distributions of multiple-locus probes provided early insight into the consequences for chromosome structure of an ancient large-scale duplication in cotton. Inferences that mitigate the consequences of ancient duplications improve leveraging of genomic information for model organisms in the study of more complex genomes. PMID:16109973

  17. Synergistic use of plant-prokaryote comparative genomics for functional annotations

    Directory of Open Access Journals (Sweden)

    Waller Jeffrey C

    2011-06-01

    Full Text Available Abstract Background Identifying functions for all gene products in all sequenced organisms is a central challenge of the post-genomic era. However, at least 30-50% of the proteins encoded by any given genome are of unknown or vaguely known function, and a large number are wrongly annotated. Many of these ‘unknown’ proteins are common to prokaryotes and plants. We set out to predict and experimentally test the functions of such proteins. Our approach to functional prediction integrates comparative genomics based mainly on microbial genomes with functional genomic data from model microorganisms and post-genomic data from plants. This approach bridges the gap between automated homology-based annotations and the classical gene discovery efforts of experimentalists, and is more powerful than purely computational approaches to identifying gene-function associations. Results Among Arabidopsis genes, we focused on those (2,325 in total that (i are unique or belong to families with no more than three members, (ii occur in prokaryotes, and (iii have unknown or poorly known functions. Computer-assisted selection of promising targets for deeper analysis was based on homology-independent characteristics associated in the SEED database with the prokaryotic members of each family. In-depth comparative genomic analysis was performed for 360 top candidate families. From this pool, 78 families were connected to general areas of metabolism and, of these families, specific functional predictions were made for 41. Twenty-one predicted functions have been experimentally tested or are currently under investigation by our group in at least one prokaryotic organism (nine of them have been validated, four invalidated, and eight are in progress. Ten additional predictions have been independently validated by other groups. Discovering the function of very widespread but hitherto enigmatic proteins such as the YrdC or YgfZ families illustrates the power of our approach

  18. Bootstrap, Bayesian probability and maximum likelihood mapping: exploring new tools for comparative genome analyses

    Directory of Open Access Journals (Sweden)

    Gogarten J Peter

    2002-02-01

    Full Text Available Abstract Background Horizontal gene transfer (HGT played an important role in shaping microbial genomes. In addition to genes under sporadic selection, HGT also affects housekeeping genes and those involved in information processing, even ribosomal RNA encoding genes. Here we describe tools that provide an assessment and graphic illustration of the mosaic nature of microbial genomes. Results We adapted the Maximum Likelihood (ML mapping to the analyses of all detected quartets of orthologous genes found in four genomes. We have automated the assembly and analyses of these quartets of orthologs given the selection of four genomes. We compared the ML-mapping approach to more rigorous Bayesian probability and Bootstrap mapping techniques. The latter two approaches appear to be more conservative than the ML-mapping approach, but qualitatively all three approaches give equivalent results. All three tools were tested on mitochondrial genomes, which presumably were inherited as a single linkage group. Conclusions In some instances of interphylum relationships we find nearly equal numbers of quartets strongly supporting the three possible topologies. In contrast, our analyses of genome quartets containing the cyanobacterium Synechocystis sp. indicate that a large part of the cyanobacterial genome is related to that of low GC Gram positives. Other groups that had been suggested as sister groups to the cyanobacteria contain many fewer genes that group with the Synechocystis orthologs. Interdomain comparisons of genome quartets containing the archaeon Halobacterium sp. revealed that Halobacterium sp. shares more genes with Bacteria that live in the same environment than with Bacteria that are more closely related based on rRNA phylogeny . Many of these genes encode proteins involved in substrate transport and metabolism and in information storage and processing. The performed analyses demonstrate that relationships among prokaryotes cannot be accurately

  19. LegumeIP: an integrative database for comparative genomics and transcriptomics of model legumes.

    Science.gov (United States)

    Li, Jun; Dai, Xinbin; Liu, Tingsong; Zhao, Patrick Xuechun

    2012-01-01

    Legumes play a vital role in maintaining the nitrogen cycle of the biosphere. They conduct symbiotic nitrogen fixation through endosymbiotic relationships with bacteria in root nodules. However, this and other characteristics of legumes, including mycorrhization, compound leaf development and profuse secondary metabolism, are absent in the typical model plant Arabidopsis thaliana. We present LegumeIP (http://plantgrn.noble.org/LegumeIP/), an integrative database for comparative genomics and transcriptomics of model legumes, for studying gene function and genome evolution in legumes. LegumeIP compiles gene and gene family information, syntenic and phylogenetic context and tissue-specific transcriptomic profiles. The database holds the genomic sequences of three model legumes, Medicago truncatula, Glycine max and Lotus japonicus plus two reference plant species, A. thaliana and Populus trichocarpa, with annotations based on UniProt, InterProScan, Gene Ontology and the Kyoto Encyclopedia of Genes and Genomes databases. LegumeIP also contains large-scale microarray and RNA-Seq-based gene expression data. Our new database is capable of systematic synteny analysis across M. truncatula, G. max, L. japonicas and A. thaliana, as well as construction and phylogenetic analysis of gene families across the five hosted species. Finally, LegumeIP provides comprehensive search and visualization tools that enable flexible queries based on gene annotation, gene family, synteny and relative gene expression.

  20. Sub-megabase resolution tiling (SMRT array-based comparative genomic hybridization profiling reveals novel gains and losses of chromosomal regions in Hodgkin Lymphoma and Anaplastic Large Cell Lymphoma cell lines

    Directory of Open Access Journals (Sweden)

    Lam Wan L

    2008-01-01

    Full Text Available Abstract Background Hodgkin lymphoma (HL and Anaplastic Large Cell Lymphoma (ALCL, are forms of malignant lymphoma defined by unique morphologic, immunophenotypic, genotypic, and clinical characteristics, but both overexpress CD30. We used sub-megabase resolution tiling (SMRT array-based comparative genomic hybridization to screen HL-derived cell lines (KMH2 and L428 and ALCL cell lines (DEL and SR-786 in order to identify disease-associated gene copy number gains and losses. Results Significant copy number gains and losses were observed on several chromosomes in all four cell lines. Assessment of copy number alterations with 26,819 DNA segments identified an average of 20 genetic alterations. Of the recurrent minimally altered regions identified, 11 (55% were within previously published regions of chromosomal alterations in HL and ALCL cell lines while 9 (45% were novel alterations not previously reported. HL cell lines L428 and KMH2 shared gains in chromosome cytobands 2q23.1-q24.2, 7q32.2-q36.3, 9p21.3-p13.3, 12q13.13-q14.1, and losses in 13q12.13-q12.3, and 18q21.32-q23. ALCL cell lines SR-786 and DEL, showed gains in cytobands 5p15.32-p14.3, 20p12.3-q13.11, and 20q13.2-q13.32. Both pairs of HL and ALCL cell lines showed losses in 18q21.32-18q23. Conclusion This study is considered to be the first one describing HL and ALCL cell line genomes at sub-megabase resolution. This high-resolution analysis allowed us to propose novel candidate target genes that could potentially contribute to the pathogenesis of HL and ALCL. FISH was used to confirm the amplification of all three isoforms of the trypsin gene (PRSS1/PRSS2/PRSS3 in KMH2 and L428 (HL and DEL (ALCL cell lines. These are novel findings that have not been previously reported in the lymphoma literature, and opens up an entirely new area of research that has not been previously associated with lymphoma biology. The findings raise interesting possibilities about the role of signaling

  1. Genomic profiling of plasmablastic lymphoma using array comparative genomic hybridization (aCGH: revealing significant overlapping genomic lesions with diffuse large B-cell lymphoma

    Directory of Open Access Journals (Sweden)

    Lu Xin-Yan

    2009-11-01

    Full Text Available Abstract Background Plasmablastic lymphoma (PL is a subtype of diffuse large B-cell lymphoma (DLBCL. Studies have suggested that tumors with PL morphology represent a group of neoplasms with clinopathologic characteristics corresponding to different entities including extramedullary plasmablastic tumors associated with plasma cell myeloma (PCM. The goal of the current study was to evaluate the genetic similarities and differences among PL, DLBCL (AIDS-related and non AIDS-related and PCM using array-based comparative genomic hybridization. Results Examination of genomic data in PL revealed that the most frequent segmental gain (> 40% include: 1p36.11-1p36.33, 1p34.1-1p36.13, 1q21.1-1q23.1, 7q11.2-7q11.23, 11q12-11q13.2 and 22q12.2-22q13.3. This correlated with segmental gains occurring in high frequency in DLBCL (AIDS-related and non AIDS-related cases. There were some segmental gains and some segmental loss that occurred in PL but not in the other types of lymphoma suggesting that these foci may contain genes responsible for the differentiation of this lymphoma. Additionally, some segmental gains and some segmental loss occurred only in PL and AIDS associated DLBCL suggesting that these foci may be associated with HIV infection. Furthermore, some segmental gains and some segmental loss occurred only in PL and PCM suggesting that these lesions may be related to plasmacytic differentiation. Conclusion To the best of our knowledge, the current study represents the first genomic exploration of PL. The genomic aberration pattern of PL appears to be more similar to that of DLBCL (AIDS-related or non AIDS-related than to PCM. Our findings suggest that PL may remain best classified as a subtype of DLBCL at least at the genome level.

  2. A reference pan-genome approach to comparative bacterial genomics: identification of novel epidemiological markers in pathogenic Campylobacter.

    Directory of Open Access Journals (Sweden)

    Guillaume Méric

    Full Text Available The increasing availability of hundreds of whole bacterial genomes provides opportunities for enhanced understanding of the genes and alleles responsible for clinically important phenotypes and how they evolved. However, it is a significant challenge to develop easy-to-use and scalable methods for characterizing these large and complex data and relating it to disease epidemiology. Existing approaches typically focus on either homologous sequence variation in genes that are shared by all isolates, or non-homologous sequence variation--focusing on genes that are differentially present in the population. Here we present a comparative genomics approach that simultaneously approximates core and accessory genome variation in pathogen populations and apply it to pathogenic species in the genus Campylobacter. A total of 7 published Campylobacter jejuni and Campylobacter coli genomes were selected to represent diversity across these species, and a list of all loci that were present at least once was compiled. After filtering duplicates a 7-isolate reference pan-genome, of 3,933 loci, was defined. A core genome of 1,035 genes was ubiquitous in the sample accounting for 59% of the genes in each isolate (average genome size of 1.68 Mb. The accessory genome contained 2,792 genes. A Campylobacter population sample of 192 genomes was screened for the presence of reference pan-genome loci with gene presence defined as a BLAST match of ≥ 70% identity over ≥ 50% of the locus length--aligned using MUSCLE on a gene-by-gene basis. A total of 21 genes were present only in C. coli and 27 only in C. jejuni, providing information about functional differences associated with species and novel epidemiological markers for population genomic analyses. Homologs of these genes were found in several of the genomes used to define the pan-genome and, therefore, would not have been identified using a single reference strain approach.

  3. Serological evaluation of Mycobacterium ulcerans antigens identified by comparative genomics.

    Directory of Open Access Journals (Sweden)

    Sacha J Pidot

    Full Text Available A specific and sensitive serodiagnostic test for Mycobacterium ulcerans infection would greatly assist the diagnosis of Buruli ulcer and would also facilitate seroepidemiological surveys. By comparative genomics, we identified 45 potential M. ulcerans specific proteins, of which we were able to express and purify 33 in E. coli. Sera from 30 confirmed Buruli ulcer patients, 24 healthy controls from the same endemic region and 30 healthy controls from a non-endemic region in Benin were screened for antibody responses to these specific proteins by ELISA. Serum IgG responses of Buruli ulcer patients were highly variable, however, seven proteins (MUP045, MUP057, MUL_0513, Hsp65, and the polyketide synthase domains ER, AT propionate, and KR A showed a significant difference between patient and non-endemic control antibody responses. However, when sera from the healthy control subjects living in the same Buruli ulcer endemic area as the patients were examined, none of the proteins were able to discriminate between these two groups. Nevertheless, six of the seven proteins showed an ability to distinguish people living in an endemic area from those in a non-endemic area with an average sensitivity of 69% and specificity of 88%, suggesting exposure to M. ulcerans. Further validation of these six proteins is now underway to assess their suitability for use in Buruli ulcer seroepidemiological studies. Such studies are urgently needed to assist efforts to uncover environmental reservoirs and understand transmission pathways of the M. ulcerans.

  4. Evolution and comparative genomics of Campylobacter jejuni ST-677 clonal complex.

    Science.gov (United States)

    Kivistö, Rauni I; Kovanen, Sara; Skarp-de Haan, Astrid; Schott, Thomas; Rahkio, Marjatta; Rossi, Mirko; Hänninen, Marja-Liisa

    2014-09-04

    Campylobacter is the most common bacterial cause of gastroenteritis in the European Union with over 200,000 laboratory-confirmed cases reported annually. This is the first study to describe findings related to comparative genomics analyses of the sequence type (ST)-677 clonal complex (CC), a Campylobacter jejuni lineage associated with bacteremia cases in humans. We performed whole-genome sequencing, using Illumina HiSeq sequencing technology, on five related ST-677 CC isolates from two chicken farms to identify microevolution taking place at the farms. Our further aim was to identify novel putative virulence determinants from the ST-677 CC genomes. For this purpose, clinical isolates of the same CC were included in comparative genomic analyses against well-known reference strains of C. jejuni. Overall, the ST-677 CC was recognized as a highly clonal lineage with relatively small differences between the genomes. Among the farm isolates differences were identified mainly in the lengths of the homopolymeric tracts in genes related to the capsule, lipo-oligosaccharide, and flagella. We identified genomic features shared with C. jejuni subsp. doylei, which has also been shown to be associated with bacteremia in humans. These included the degradation of the cytolethal distending toxin operon and similarities between the capsular polysaccharide biosynthesis loci. The phase-variable GDP-mannose 4,6-dehydratase (EC 4.2.1.47) (wcbK, CAMP1649), associated with the capsular polysaccharide biosynthesis locus, may play a central role in ST-677 CC conferring acid and serum resistance during different stages of infection. Homology-based searches revealed several additional novel features and characteristics, including two putative type Vb secretion systems and a novel restriction modification/methyltransferase gene cluster, putatively associated with pathogenesis and niche adaptation.

  5. ReplicationDomain: a visualization tool and comparative database for genome-wide replication timing data

    Directory of Open Access Journals (Sweden)

    Yokochi Tomoki

    2008-12-01

    Full Text Available Abstract Background Eukaryotic DNA replication is regulated at the level of large chromosomal domains (0.5–5 megabases in mammals within which replicons are activated relatively synchronously. These domains replicate in a specific temporal order during S-phase and our genome-wide analyses of replication timing have demonstrated that this temporal order of domain replication is a stable property of specific cell types. Results We have developed ReplicationDomain http://www.replicationdomain.org as a web-based database for analysis of genome-wide replication timing maps (replication profiles from various cell lines and species. This database also provides comparative information of transcriptional expression and is configured to display any genome-wide property (for instance, ChIP-Chip or ChIP-Seq data via an interactive web interface. Our published microarray data sets are publicly available. Users may graphically display these data sets for a selected genomic region and download the data displayed as text files, or alternatively, download complete genome-wide data sets. Furthermore, we have implemented a user registration system that allows registered users to upload their own data sets. Upon uploading, registered users may choose to: (1 view their data sets privately without sharing; (2 share with other registered users; or (3 make their published or "in press" data sets publicly available, which can fulfill journal and funding agencies' requirements for data sharing. Conclusion ReplicationDomain is a novel and powerful tool to facilitate the comparative visualization of replication timing in various cell types as well as other genome-wide chromatin features and is considerably faster and more convenient than existing browsers when viewing multi-megabase segments of chromosomes. Furthermore, the data upload function with the option of private viewing or sharing of data sets between registered users should be a valuable resource for the

  6. Systematic discovery of regulatory motifs in Fusarium graminearum by comparing four Fusarium genomes

    Directory of Open Access Journals (Sweden)

    Kistler Corby

    2010-03-01

    Full Text Available Abstract Background Fusarium graminearum (Fg, a major fungal pathogen of cultivated cereals, is responsible for billions of dollars in agriculture losses. There is a growing interest in understanding the transcriptional regulation of this organism, especially the regulation of genes underlying its pathogenicity. The generation of whole genome sequence assemblies for Fg and three closely related Fusarium species provides a unique opportunity for such a study. Results Applying comparative genomics approaches, we developed a computational pipeline to systematically discover evolutionarily conserved regulatory motifs in the promoter, downstream and the intronic regions of Fg genes, based on the multiple alignments of sequenced Fusarium genomes. Using this method, we discovered 73 candidate regulatory motifs in the promoter regions. Nearly 30% of these motifs are highly enriched in promoter regions of Fg genes that are associated with a specific functional category. Through comparison to Saccharomyces cerevisiae (Sc and Schizosaccharomyces pombe (Sp, we observed conservation of transcription factors (TFs, their binding sites and the target genes regulated by these TFs related to pathways known to respond to stress conditions or phosphate metabolism. In addition, this study revealed 69 and 39 conserved motifs in the downstream regions and the intronic regions, respectively, of Fg genes. The top intronic motif is the splice donor site. For the downstream regions, we noticed an intriguing absence of the mammalian and Sc poly-adenylation signals among the list of conserved motifs. Conclusion This study provides the first comprehensive list of candidate regulatory motifs in Fg, and underscores the power of comparative genomics in revealing functional elements among related genomes. The conservation of regulatory pathways among the Fusarium genomes and the two yeast species reveals their functional significance, and provides new insights in their

  7. Comparative genomics of Geobacter chemotaxis genes reveals diverse signaling function

    Directory of Open Access Journals (Sweden)

    Antommattei Frances M

    2008-10-01

    Full Text Available Abstract Background Geobacter species are δ-Proteobacteria and are often the predominant species in a variety of sedimentary environments where Fe(III reduction is important. Their ability to remediate contaminated environments and produce electricity makes them attractive for further study. Cell motility, biofilm formation, and type IV pili all appear important for the growth of Geobacter in changing environments and for electricity production. Recent studies in other bacteria have demonstrated that signaling pathways homologous to the paradigm established for Escherichia coli chemotaxis can regulate type IV pili-dependent motility, the synthesis of flagella and type IV pili, the production of extracellular matrix material, and biofilm formation. The classification of these pathways by comparative genomics improves the ability to understand how Geobacter thrives in natural environments and better their use in microbial fuel cells. Results The genomes of G. sulfurreducens, G. metallireducens, and G. uraniireducens contain multiple (~70 homologs of chemotaxis genes arranged in several major clusters (six, seven, and seven, respectively. Unlike the single gene cluster of E. coli, the Geobacter clusters are not all located near the flagellar genes. The probable functions of some Geobacter clusters are assignable by homology to known pathways; others appear to be unique to the Geobacter sp. and contain genes of unknown function. We identified large numbers of methyl-accepting chemotaxis protein (MCP homologs that have diverse sensing domain architectures and generate a potential for sensing a great variety of environmental signals. We discuss mechanisms for class-specific segregation of the MCPs in the cell membrane, which serve to maintain pathway specificity and diminish crosstalk. Finally, the regulation of gene expression in Geobacter differs from E. coli. The sequences of predicted promoter elements suggest that the alternative sigma factors

  8. A genetic linkage map and comparative mapping of the prairie vole (Microtus ochrogaster genome

    Directory of Open Access Journals (Sweden)

    Young Larry J

    2011-07-01

    Full Text Available Abstract Background The prairie vole (Microtus ochrogaster is an emerging rodent model for investigating the genetics, evolution and molecular mechanisms of social behavior. Though a karyotype for the prairie vole has been reported and low-resolution comparative cytogenetic analyses have been done in this species, other basic genetic resources for this species, such as a genetic linkage map, are lacking. Results Here we report the construction of a genome-wide linkage map of the prairie vole. The linkage map consists of 406 markers that are spaced on average every 7 Mb and span an estimated ~90% of the genome. The sex average length of the linkage map is 1707 cM, which, like other Muroid rodent linkage maps, is on the lower end of the length distribution of linkage maps reported to date for placental mammals. Linkage groups were assigned to 19 out of the 26 prairie vole autosomes as well as the X chromosome. Comparative analyses of the prairie vole linkage map based on the location of 387 Type I markers identified 61 large blocks of synteny with the mouse genome. In addition, the results of the comparative analyses revealed a potential elevated rate of inversions in the prairie vole lineage compared to the laboratory mouse and rat. Conclusions A genetic linkage map of the prairie vole has been constructed and represents the fourth genome-wide high-resolution linkage map reported for Muroid rodents and the first for a member of the Arvicolinae sub-family. This resource will advance studies designed to dissect the genetic basis of a variety of social behaviors and other traits in the prairie vole as well as our understanding of genome evolution in the genus Microtus.

  9. Functional and Comparative Genomics of Lignocellulose Degradation by Schizophyllum commune

    Energy Technology Data Exchange (ETDEWEB)

    Ohm, Robin A.; Lee, Hanbyul; Park, Hongjae; Brewer, Heather M.; Carver, Akiko; Copeland, Alex; Grimwood, Jane; Lindquist, Erika; Lipzen, Anna; Martin, Joel; Purvine, Samuel O.; Schackwitz, Wendy; Tegelaar, Martin; Tritt, Andrew; Baker, Scott; Choi, In-Geol; Lugones, Luis G.; Wosten, Han A. B.; Grigoriev, Igor V.

    2014-03-14

    The Basidiomycete fungus Schizophyllum commune is a wood-decaying fungus and is used as a model system to study lignocellulose degradation. Version 3.0 of the genome assembly filled 269 of 316 sequence gaps and added 680 kb of sequence. This new assembly was reannotated using RNAseq transcriptomics data, and this resulted in 3110 (24percent) more genes. Two additional S. commune strains with different wood-decaying properties were sequenced, from Tattone (France) and Loenen (The Netherlands). Sequence comparison shows remarkably high sequence diversity between the strains. The overall SNP rate of > 100 SNPs/kb is among the highest rates of within-species polymorphisms in Basidiomycetes. Some well-described proteins like hydrophobins and transcription factors have less than 70percent sequence identity among the strains. Some chromosomes are better conserved than others and in some cases large parts of chromosomes are missing from one or more strains. Gene expression on glucose, cellulose and wood was analyzed in two S. commune strains. Overall, gene expression correlated between the two strains, but there were some notable exceptions. Of particular interest are CAZymes (carbohydrate-active enzymes) that are regulated in different ways in the different strains. In both strains the transcription factor Fsp1 was strongly up-regulated during growth on cellulose and wood, when compared to glucose. Over-expression of Fsp1 using a constitutive promoter resulted in higher cellulose and xylose-degrading enzyme activity, which suggests that Fsp1 is involved in regulating CAZyme gene expression. Two CAZyme genes (of family GH61 and GH11) were shown to be strongly up-regulated during growth on cellulose, compared to glucose. Proteomics on the secreted proteins in the growth medium confirmed this. A promoter analysis revealed the shortest active promoters for these two genes, as well as putative transcription factor binding sites.

  10. Comparative Analysis of Fatty Acid Desaturases in Cyanobacterial Genomes

    Directory of Open Access Journals (Sweden)

    Xiaoyuan Chi

    2008-01-01

    Full Text Available Fatty acid desaturases are enzymes that introduce double bonds into the hydrocarbon chains of fatty acids. The fatty acid desaturases from 37 cyanobacterial genomes were identified and classified based upon their conserved histidine-rich motifs and phylogenetic analysis, which help to determine the amounts and distributions of desaturases in cyanobacterial species. The filamentous or N2-fixing cyanobacteria usually possess more types of fatty acid desaturases than that of unicellular species. The pathway of acyl-lipid desaturation for unicellular marine cyanobacteria Synechococcus and Prochlorococcus differs from that of other cyanobacteria, indicating different phylogenetic histories of the two genera from other cyanobacteria isolated from freshwater, soil, or symbiont. Strain Gloeobacter violaceus PCC 7421 was isolated from calcareous rock and lacks thylakoid membranes. The types and amounts of desaturases of this strain are distinct to those of other cyanobacteria, reflecting the earliest divergence of it from the cyanobacterial line. Three thermophilic unicellular strains, Thermosynechococcus elongatus BP-1 and two Synechococcus Yellowstone species, lack highly unsaturated fatty acids in lipids and contain only one Δ9 desaturase in contrast with mesophilic strains, which is probably due to their thermic habitats. Thus, the amounts and types of fatty acid desaturases are various among different cyanobacterial species, which may result from the adaption to environments in evolution.

  11. Genome-wide mapping of copy number variation in humans: comparative analysis of high resolution array platforms.

    Directory of Open Access Journals (Sweden)

    Rajini R Haraksingh

    Full Text Available Accurate and efficient genome-wide detection of copy number variants (CNVs is essential for understanding human genomic variation, genome-wide CNV association type studies, cytogenetics research and diagnostics, and independent validation of CNVs identified from sequencing based technologies. Numerous, array-based platforms for CNV detection exist utilizing array Comparative Genome Hybridization (aCGH, Single Nucleotide Polymorphism (SNP genotyping or both. We have quantitatively assessed the abilities of twelve leading genome-wide CNV detection platforms to accurately detect Gold Standard sets of CNVs in the genome of HapMap CEU sample NA12878, and found significant differences in performance. The technologies analyzed were the NimbleGen 4.2 M, 2.1 M and 3×720 K Whole Genome and CNV focused arrays, the Agilent 1×1 M CGH and High Resolution and 2×400 K CNV and SNP+CGH arrays, the Illumina Human Omni1Quad array and the Affymetrix SNP 6.0 array. The Gold Standards used were a 1000 Genomes Project sequencing-based set of 3997 validated CNVs and an ultra high-resolution aCGH-based set of 756 validated CNVs. We found that sensitivity, total number, size range and breakpoint resolution of CNV calls were highest for CNV focused arrays. Our results are important for cost effective CNV detection and validation for both basic and clinical applications.

  12. StellaBase: The Nematostella vectensis Genomics Database

    OpenAIRE

    James C Sullivan; Ryan, Joseph F; Watson, James A.; Webb, Jeramy; Mullikin, James C; Rokhsar, Daniel; Finnerty, John R

    2005-01-01

    StellaBase, the Nematostella vectensis Genomics Database, is a web-based resource that will facilitate desktop and bench-top studies of the starlet sea anemone. Nematostella is an emerging model organism that has already proven useful for addressing fundamental questions in developmental evolution and evolutionary genomics. StellaBase allows users to query the assembled Nematostella genome, a confirmed gene library, and a predicted genome using both keyword and homology based search functions...

  13. DeltaProt: a software toolbox for comparative genomics

    Directory of Open Access Journals (Sweden)

    Willassen Nils P

    2010-11-01

    Full Text Available Abstract Background Statistical bioinformatics is the study of biological data sets obtained by new micro-technologies by means of proper statistical methods. For a better understanding of environmental adaptations of proteins, orthologous sequences from different habitats may be explored and compared. The main goal of the DeltaProt Toolbox is to provide users with important functionality that is needed for comparative screening and studies of extremophile proteins and protein classes. Visualization of the data sets is also the focus of this article, since visualizations can play a key role in making the various relationships transparent. This application paper is intended to inform the reader of the existence, functionality, and applicability of the toolbox. Results We present the DeltaProt Toolbox, a software toolbox that may be useful in importing, analyzing and visualizing data from multiple alignments of proteins. The toolbox has been written in MATLAB™ to provide an easy and user-friendly platform, including a graphical user interface, while ensuring good numerical performance. Problems in genome biology may be easily stated thanks to a compact input format. The toolbox also offers the possibility of utilizing structural information from the SABLE or other structure predictors. Different sequence plots can then be viewed and compared in order to find their similarities and differences. Detailed statistics are also calculated during the procedure. Conclusions The DeltaProt package is open source and freely available for academic, non-commercial use. The latest version of DeltaProt can be obtained from http://services.cbu.uib.no/software/deltaprot/. The website also contains documentation, and the toolbox comes with real data sets that are intended for training in applying the models to carry out bioinformatical and statistical analyses of protein sequences. Equipped with the new algorithms proposed here, DeltaProt serves as an auxiliary

  14. Comparative genomics of bacterial and plant folate synthesis and salvage: predictions and validations

    Directory of Open Access Journals (Sweden)

    Noiriel Alexandre

    2007-07-01

    Full Text Available Abstract Background Folate synthesis and salvage pathways are relatively well known from classical biochemistry and genetics but they have not been subjected to comparative genomic analysis. The availability of genome sequences from hundreds of diverse bacteria, and from Arabidopsis thaliana, enabled such an analysis using the SEED database and its tools. This study reports the results of the analysis and integrates them with new and existing experimental data. Results Based on sequence similarity and the clustering, fusion, and phylogenetic distribution of genes, several functional predictions emerged from this analysis. For bacteria, these included the existence of novel GTP cyclohydrolase I and folylpolyglutamate synthase gene families, and of a trifunctional p-aminobenzoate synthesis gene. For plants and bacteria, the predictions comprised the identities of a 'missing' folate synthesis gene (folQ and of a folate transporter, and the absence from plants of a folate salvage enzyme. Genetic and biochemical tests bore out these predictions. Conclusion For bacteria, these results demonstrate that much can be learnt from comparative genomics, even for well-explored primary metabolic pathways. For plants, the findings particularly illustrate the potential for rapid functional assignment of unknown genes that have prokaryotic homologs, by analyzing which genes are associated with the latter. More generally, our data indicate how combined genomic analysis of both plants and prokaryotes can be more powerful than isolated examination of either group alone.

  15. Microbial comparative pan-genomics using binomial mixture models

    DEFF Research Database (Denmark)

    Ussery, David; Snipen, L; Almøy, T

    2009-01-01

    The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter...... approach by using statistical ideas developed for capture-recapture problems in ecology and epidemiology. RESULTS: We estimate core- and pan-genome sizes for 16 different bacterial species. The results reveal a complex dependency structure for most species, manifested as heterogeneous detection...... probabilities. Estimated pan-genome sizes range from small (around 2600 gene families) in Buchnera aphidicola to large (around 43000 gene families) in Escherichia coli. Results for Echerichia coli show that as more data become available, a larger diversity is estimated, indicating an extensive pool of rarely...

  16. Comparative analysis of genome maintenance genes in naked mole rat, mouse, and human

    NARCIS (Netherlands)

    S.L. Macrae (Sheila L.); Q. Zhang (Quanwei); C. Lemetre (Christophe); I. Seim (Inge); R.B. Calder (Robert B.); J.H.J. Hoeijmakers (Jan); Y. Suh (Yousin); V.N. Gladyshev (Vadim N.); A. Seluanov (Andrei); V. Gorbunova (Vera); J. Vijg (Jan); Z.D. Zhang (Zhengdong D.)

    2015-01-01

    textabstractGenome maintenance (GM) is an essential defense system against aging and cancer, as both are characterized by increased genome instability. Here, we compared the copy number variation and mutation rate of 518 GM-associated genes in the naked mole rat (NMR), mouse, and human genomes. GM g

  17. Comparative genomic analysis of Lactobacillus rhamnosus GG reveals pili containing a human- mucus binding protein

    NARCIS (Netherlands)

    Kankainen, M.; Paulin, L.; Tynkkynen, S.; Ossowski, von I.; Reunanen, J.; Partanen, P.; Satokari, A.; Vesterlund, S.; Hendrickx, A.P.; Lebeer, S.; Keersmaecker, de S.C.; Vanderleyden, J.; Hämäläinen, T.; Laukkanen, S.; Salovuori, N.; Ritari, J.; Alatalo, E.; Korpela, R.; Mattila-Sandholm, T.; Lassig, A.; Hatakka, K.; Kinnunen, K.T.; Karjalainen, H.; Saxelin, M.; Laakso, K.; Surakka, A.; Palva, A.; Salusjärvi, T.; Auvinen, P.; Vos, de W.M.

    2009-01-01

    To unravel the biological function of the widely used probiotic bacterium Lactobacillus rhamnosus GG, we compared its 3.0-Mbp genome sequence with the similarly sized genome of L. rhamnosus LC705, an adjunct starter culture exhibiting reduced binding to mucus. Both genomes demonstrated high sequence

  18. De Novo Identification of Single Nucleotide Mutations in Caenorhabditis elegans Using Array Comparative Genomic Hybridization

    Science.gov (United States)

    Maydan, Jason S.; Okada, H. Mark; Flibotte, Stephane; Edgley, Mark L.; Moerman, Donald G.

    2009-01-01

    Array comparative genomic hybridization (aCGH) has been used primarily to detect copy-number variants between two genomes. Here we report using aCGH to detect single nucleotide mutations on oligonucleotide microarrays with overlapping 50-mer probes. This technique represents a powerful method for rapidly detecting novel homozygous single nucleotide mutations in any organism with a sequenced reference genome. PMID:19189945

  19. WormBase 2016: expanding to enable helminth genomic research

    Science.gov (United States)

    Howe, Kevin L.; Bolt, Bruce J.; Cain, Scott; Chan, Juancarlos; Chen, Wen J.; Davis, Paul; Done, James; Down, Thomas; Gao, Sibyl; Grove, Christian; Harris, Todd W.; Kishore, Ranjana; Lee, Raymond; Lomax, Jane; Li, Yuling; Muller, Hans-Michael; Nakamura, Cecilia; Nuin, Paulo; Paulini, Michael; Raciti, Daniela; Schindelman, Gary; Stanley, Eleanor; Tuli, Mary Ann; Van Auken, Kimberly; Wang, Daniel; Wang, Xiaodong; Williams, Gary; Wright, Adam; Yook, Karen; Berriman, Matthew; Kersey, Paul; Schedl, Tim; Stein, Lincoln; Sternberg, Paul W.

    2016-01-01

    WormBase (www.wormbase.org) is a central repository for research data on the biology, genetics and genomics of Caenorhabditis elegans and other nematodes. The project has evolved from its original remit to collect and integrate all data for a single species, and now extends to numerous nematodes, ranging from evolutionary comparators of C. elegans to parasitic species that threaten plant, animal and human health. Research activity using C. elegans as a model system is as vibrant as ever, and we have created new tools for community curation in response to the ever-increasing volume and complexity of data. To better allow users to navigate their way through these data, we have made a number of improvements to our main website, including new tools for browsing genomic features and ontology annotations. Finally, we have developed a new portal for parasitic worm genomes. WormBase ParaSite (parasite.wormbase.org) contains all publicly available nematode and platyhelminth annotated genome sequences, and is designed specifically to support helminth genomic research. PMID:26578572

  20. Ecology, Diversity and Comparative Genomics of Oceanic Cyanobacterial Viruses

    Science.gov (United States)

    2004-06-01

    genome contains a group of some genes with little homology to known bacterial proteins, but with homology to eukaryotic prion -like proteins (e-5), an...eukaryotic and prion -like genes have been made in the genomes of mycobacteriophages (Pedulla et al., 2003). The presence of these genes in a...and asceptically add filter sterilized nutrients to bring level of top agarose nutrients to desired media of choice d. Add appropriate volume of

  1. Metagenome Skimming of Insect Specimen Pools: Potential for Comparative Genomics.

    Science.gov (United States)

    Linard, Benjamin; Crampton-Platt, Alex; Gillett, Conrad P D T; Timmermans, Martijn J T N; Vogler, Alfried P

    2015-05-14

    Metagenomic analyses are challenging in metazoans, but high-copy number and repeat regions can be assembled from low-coverage sequencing by "genome skimming," which is applied here as a new way of characterizing metagenomes obtained in an ecological or taxonomic context. Illumina shotgun sequencing on two pools of Coleoptera (beetles) of approximately 200 species each were assembled into tens of thousands of scaffolds. Repeated low-coverage sequencing recovered similar scaffold sets consistently, although approximately 70% of scaffolds could not be identified against existing genome databases. Identifiable scaffolds included mitochondrial DNA, conserved sequences with hits to expressed sequence tag and protein databases, and known repeat elements of high and low complexity, including numerous copies of rRNA and histone genes. Assemblies of histones captured a diversity of gene order and primary sequence in Coleoptera. Scaffolds with similarity to multiple sites in available coleopteran genome sequences for Dendroctonus and Tribolium revealed high specificity of scaffolds to either of these genomes, in particular for high-copy number repeats. Numerous "clusters" of scaffolds mapped to the same genomic site revealed intra- and/or intergenomic variation within a metagenome pool. In addition to effect of taxonomic composition of the metagenomes, the number of mapped scaffolds also revealed structural differences between the two reference genomes, although the significance of this striking finding remains unclear. Finally, apparently exogenous sequences were recovered, including potential food plants, fungal pathogens, and bacterial symbionts. The "metagenome skimming" approach is useful for capturing the genomic diversity of poorly studied, species-rich lineages and opens new prospects in environmental genomics.

  2. Microbial comparative pan-genomics using binomial mixture models

    Directory of Open Access Journals (Sweden)

    Ussery David W

    2009-08-01

    Full Text Available Abstract Background The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter approach by using statistical ideas developed for capture-recapture problems in ecology and epidemiology. Results We estimate core- and pan-genome sizes for 16 different bacterial species. The results reveal a complex dependency structure for most species, manifested as heterogeneous detection probabilities. Estimated pan-genome sizes range from small (around 2600 gene families in Buchnera aphidicola to large (around 43000 gene families in Escherichia coli. Results for Echerichia coli show that as more data become available, a larger diversity is estimated, indicating an extensive pool of rarely occurring genes in the population. Conclusion Analyzing pan-genomics data with binomial mixture models is a way to handle dependencies between genomes, which we find is always present. A bottleneck in the estimation procedure is the annotation of rarely occurring genes.

  3. Comparative genome analysis of pathogenic and non-pathogenic Clavibacter strains reveals adaptations to their lifestyle.

    Science.gov (United States)

    Załuga, Joanna; Stragier, Pieter; Baeyen, Steve; Haegeman, Annelies; Van Vaerenbergh, Johan; Maes, Martine; De Vos, Paul

    2014-05-22

    The genus Clavibacter harbors economically important plant pathogens infecting agricultural crops such as potato and tomato. Although the vast majority of Clavibacter strains are pathogenic, there is an increasing number of non-pathogenic isolates reported. Non-pathogenic Clavibacter strains isolated from tomato seeds are particularly problematic because they affect the current detection and identification tests for Clavibacter michiganensis subsp. michiganensis (Cmm), which is regulated with a zero tolerance in tomato seed. Their misidentification as pathogenic Cmm hampers a clear judgment on the seed quality and health. To get more insight in the genetic features linked to the lifestyle of these bacteria, a whole-genome sequence of the tomato seed-borne non-pathogenic Clavibacter LMG 26808 was determined. To gain a better understanding of the molecular determinants of pathogenicity, the genome sequence of LMG 26808 was compared with that of the pathogenic Cmm strain (NCPPB 382). The comparative analysis revealed that LMG 26808 does not contain plasmids pCM1 and pCM2 and also lacks the majority of important virulence factors described so far for pathogenic Cmm. This explains its apparent non-pathogenic nature in tomato plants. Moreover, the genome analysis of LMG 26808 detected sequences from a plasmid originating from a member of Enterobacteriaceae/Klebsiella relative. Genes received that way and coding for antibiotic resistance may provide a competitive advantage for survival of LMG 26808 in its ecological niche. Genetically, LMG 26808 was the most similar to the pathogenic Cmm NCPPB 382 but contained more mobile genetic elements. The genome of this non-pathogenic Clavibacter strain contained also a high number of transporters and regulatory genes. The genome sequence of the non-pathogenic Clavibacter strain LMG 26808 and the comparative analyses with other pathogenic Clavibacter strains provided a better understanding of the genetic bases of virulence and

  4. Comparative Analysis of Genomics and Proteomics in Bacillus thuringiensis 4.0718

    Science.gov (United States)

    Rang, Jie; He, Hao; Wang, Ting; Ding, Xuezhi; Zuo, Mingxing; Quan, Meifang; Sun, Yunjun; Yu, Ziquan; Hu, Shengbiao; Xia, Liqiu

    2015-01-01

    Bacillus thuringiensis is a widely used biopesticide that produced various insecticidal active substances during its life cycle. Separation and purification of numerous insecticide active substances have been difficult because of the relatively short half-life of such substances. On the other hand, substances can be synthetized at different times during development, so samples at different stages have to be studied, further complicating the analysis. A dual genomic and proteomic approach would enhance our ability to identify such substances, and particularily using mass spectrometry-based proteomic methods. The comparative analysis for genomic and proteomic data have showed that not all of the products deduced from the annotated genome could be identified among the proteomic data. For instance, genome annotation results showed that 39 coding sequences in the whole genome were related to insect pathogenicity, including five cry genes. However, Cry2Ab, Cry1Ia, Cytotoxin K, Bacteriocin, Exoenzyme C3 and Alveolysin could not be detected in the proteomic data obtained. The sporulation-related proteins were also compared analysis, results showed that the great majority sporulation-related proteins can be detected by mass spectrometry. This analysis revealed Spo0A~P, SigF, SigE(+), SigK(+) and SigG(+), all known to play an important role in the process of spore formation regulatory network, also were displayed in the proteomic data. Through the comparison of the two data sets, it was possible to infer that some genes were silenced or were expressed at very low levels. For instance, found that cry2Ab seems to lack a functional promoter while cry1Ia may not be expressed due to the presence of transposons. With this comparative study a relatively complete database can be constructed and used to transform hereditary material, thereby prompting the high expression of toxic proteins. A theoretical basis is provided for constructing highly virulent engineered bacteria and for

  5. Comparative analysis of genomics and proteomics in Bacillus thuringiensis 4.0718.

    Directory of Open Access Journals (Sweden)

    Jie Rang

    Full Text Available Bacillus thuringiensis is a widely used biopesticide that produced various insecticidal active substances during its life cycle. Separation and purification of numerous insecticide active substances have been difficult because of the relatively short half-life of such substances. On the other hand, substances can be synthetized at different times during development, so samples at different stages have to be studied, further complicating the analysis. A dual genomic and proteomic approach would enhance our ability to identify such substances, and particularily using mass spectrometry-based proteomic methods. The comparative analysis for genomic and proteomic data have showed that not all of the products deduced from the annotated genome could be identified among the proteomic data. For instance, genome annotation results showed that 39 coding sequences in the whole genome were related to insect pathogenicity, including five cry genes. However, Cry2Ab, Cry1Ia, Cytotoxin K, Bacteriocin, Exoenzyme C3 and Alveolysin could not be detected in the proteomic data obtained. The sporulation-related proteins were also compared analysis, results showed that the great majority sporulation-related proteins can be detected by mass spectrometry. This analysis revealed Spo0A~P, SigF, SigE(+, SigK(+ and SigG(+, all known to play an important role in the process of spore formation regulatory network, also were displayed in the proteomic data. Through the comparison of the two data sets, it was possible to infer that some genes were silenced or were expressed at very low levels. For instance, found that cry2Ab seems to lack a functional promoter while cry1Ia may not be expressed due to the presence of transposons. With this comparative study a relatively complete database can be constructed and used to transform hereditary material, thereby prompting the high expression of toxic proteins. A theoretical basis is provided for constructing highly virulent engineered

  6. Supervised Lowess normalization of comparative genome hybridization data – application to lactococcal strain comparisons

    Directory of Open Access Journals (Sweden)

    Karsens Harma A

    2008-02-01

    Full Text Available Abstract Background Array-based comparative genome hybridization (aCGH is commonly used to determine the genomic content of bacterial strains. Since prokaryotes in general have less conserved genome sequences than eukaryotes, sequence divergences between the genes in the genomes used for an aCGH experiment obstruct determination of genome variations (e.g. deletions. Current normalization methods do not take into consideration sequence divergence between target and microarray features and therefore cannot distinguish a difference in signal due to systematic errors in the data or due to sequence divergence. Results We present supervised Lowess, or S-Lowess, an application of the subset Lowess normalization method. By using a predicted subset of array features with minimal sequence divergence between the analyzed strains for the normalization procedure we remove systematic errors from dual-dye aCGH data in two steps: (1 determination of a subset of conserved genes (i.e. likely conserved genes, LCG; and (2 using the LCG for subset Lowess normalization. Subset Lowess determines the correction factors for systematic errors in the subset of array features and normalizes all array features using these correction factors. The performance of S-Lowess was assessed on aCGH experiments in which differentially labeled genomic DNA fragments of Lactococcus lactis IL1403 and L. lactis MG1363 strains were hybridized to IL1403 DNA microarrays. Since both genomes are sequenced and gene deletions identified, the success rate of different aCGH normalization methods in detecting these deletions in the MG1363 genome were determined. S-Lowess detects 97% of the deletions, whereas other aCGH normalization methods detect up to only 60% of the deletions. Conclusion S-Lowess is implemented in a user-friendly web-tool accessible from http://bioinformatics.biol.rug.nl/websoftware/s-lowess. We demonstrate that it outperforms existing normalization methods and maximizes

  7. A Guide to the PLAZA 3.0 Plant Comparative Genomic Database.

    Science.gov (United States)

    Vandepoele, Klaas

    2017-01-01

    PLAZA 3.0 is an online resource for comparative genomics and offers a versatile platform to study gene functions and gene families or to analyze genome organization and evolution in the green plant lineage. Starting from genome sequence information for over 35 plant species, precomputed comparative genomic data sets cover homologous gene families, multiple sequence alignments, phylogenetic trees, and genomic colinearity information within and between species. Complementary functional data sets, a Workbench, and interactive visualization tools are available through a user-friendly web interface, making PLAZA an excellent starting point to translate sequence or omics data sets into biological knowledge. PLAZA is available at http://bioinformatics.psb.ugent.be/plaza/ .

  8. Intronic alternative splicing regulators identified by comparative genomics in nematodes.

    Directory of Open Access Journals (Sweden)

    Jennifer L Kabat

    2006-07-01

    Full Text Available Many alternative splicing events are regulated by pentameric and hexameric intronic sequences that serve as binding sites for splicing regulatory factors. We hypothesized that intronic elements that regulate alternative splicing are under selective pressure for evolutionary conservation. Using a Wobble Aware Bulk Aligner genomic alignment of Caenorhabditis elegans and Caenorhabditis briggsae, we identified 147 alternatively spliced cassette exons that exhibit short regions of high nucleotide conservation in the introns flanking the alternative exon. In vivo experiments on the alternatively spliced let-2 gene confirm that these conserved regions can be important for alternative splicing regulation. Conserved intronic element sequences were collected into a dataset and the occurrence of each pentamer and hexamer motif was counted. We compared the frequency of pentamers and hexamers in the conserved intronic elements to a dataset of all C. elegans intron sequences in order to identify short intronic motifs that are more likely to be associated with alternative splicing. High-scoring motifs were examined for upstream or downstream preferences in introns surrounding alternative exons. Many of the high-scoring nematode pentamer and hexamer motifs correspond to known mammalian splicing regulatory sequences, such as (TGCATG, indicating that the mechanism of alternative splicing regulation is well conserved in metazoans. A comparison of the analysis of the conserved intronic elements, and analysis of the entire introns flanking these same exons, reveals that focusing on intronic conservation can increase the sensitivity of detecting putative splicing regulatory motifs. This approach also identified novel sequences whose role in splicing is under investigation and has allowed us to take a step forward in defining a catalog of splicing regulatory elements for an organism. In vivo experiments confirm that one novel high-scoring sequence from our analysis

  9. A Comparative Analysis of the Lyve-SET Phylogenomics Pipeline for Genomic Epidemiology of Foodborne Pathogens

    Science.gov (United States)

    Katz, Lee S.; Griswold, Taylor; Williams-Newkirk, Amanda J.; Wagner, Darlene; Petkau, Aaron; Sieffert, Cameron; Van Domselaar, Gary; Deng, Xiangyu; Carleton, Heather A.

    2017-01-01

    Modern epidemiology of foodborne bacterial pathogens in industrialized countries relies increasingly on whole genome sequencing (WGS) techniques. As opposed to profiling techniques such as pulsed-field gel electrophoresis, WGS requires a variety of computational methods. Since 2013, United States agencies responsible for food safety including the CDC, FDA, and USDA, have been performing whole-genome sequencing (WGS) on all Listeria monocytogenes found in clinical, food, and environmental samples. Each year, more genomes of other foodborne pathogens such as Escherichia coli, Campylobacter jejuni, and Salmonella enterica are being sequenced. Comparing thousands of genomes across an entire species requires a fast method with coarse resolution; however, capturing the fine details of highly related isolates requires a computationally heavy and sophisticated algorithm. Most L. monocytogenes investigations employing WGS depend on being able to identify an outbreak clade whose inter-genomic distances are less than an empirically determined threshold. When the difference between a few single nucleotide polymorphisms (SNPs) can help distinguish between genomes that are likely outbreak-associated and those that are less likely to be associated, we require a fine-resolution method. To achieve this level of resolution, we have developed Lyve-SET, a high-quality SNP pipeline. We evaluated Lyve-SET by retrospectively investigating 12 outbreak data sets along with four other SNP pipelines that have been used in outbreak investigation or similar scenarios. To compare these pipelines, several distance and phylogeny-based comparison methods were applied, which collectively showed that multiple pipelines were able to identify most outbreak clusters and strains. Currently in the US PulseNet system, whole genome multi-locus sequence typing (wgMLST) is the preferred primary method for foodborne WGS cluster detection and outbreak investigation due to its ability to name standardized

  10. Comparative analysis of copy number detection by whole-genome BAC and oligonucleotide array CGH

    Directory of Open Access Journals (Sweden)

    Bejjani Bassem A

    2010-06-01

    Full Text Available Abstract Background Microarray-based comparative genomic hybridization (aCGH is a powerful diagnostic tool for the detection of DNA copy number gains and losses associated with chromosome abnormalities, many of which are below the resolution of conventional chromosome analysis. It has been presumed that whole-genome oligonucleotide (oligo arrays identify more clinically significant copy-number abnormalities than whole-genome bacterial artificial chromosome (BAC arrays, yet this has not been systematically studied in a clinical diagnostic setting. Results To determine the difference in detection rate between similarly designed BAC and oligo arrays, we developed whole-genome BAC and oligonucleotide microarrays and validated them in a side-by-side comparison of 466 consecutive clinical specimens submitted to our laboratory for aCGH. Of the 466 cases studied, 67 (14.3% had a copy-number imbalance of potential clinical significance detectable by the whole-genome BAC array, and 73 (15.6% had a copy-number imbalance of potential clinical significance detectable by the whole-genome oligo array. However, because both platforms identified copy number variants of unclear clinical significance, we designed a systematic method for the interpretation of copy number alterations and tested an additional 3,443 cases by BAC array and 3,096 cases by oligo array. Of those cases tested on the BAC array, 17.6% were found to have a copy-number abnormality of potential clinical significance, whereas the detection rate increased to 22.5% for the cases tested by oligo array. In addition, we validated the oligo array for detection of mosaicism and found that it could routinely detect mosaicism at levels of 30% and greater. Conclusions Although BAC arrays have faster turnaround times, the increased detection rate of oligo arrays makes them attractive for clinical cytogenetic testing.

  11. Genomics-based plant germplasm research (GPGR

    Directory of Open Access Journals (Sweden)

    Jizeng Jia

    2017-04-01

    Full Text Available Plant germplasm underpins much of crop genetic improvement. Millions of germplasm accessions have been collected and conserved ex situ and/or in situ, and the major challenge is now how to exploit and utilize this abundant resource. Genomics-based plant germplasm research (GPGR or “Genoplasmics” is a novel cross-disciplinary research field that seeks to apply the principles and techniques of genomics to germplasm research. We describe in this paper the concept, strategy, and approach behind GPGR, and summarize current progress in the areas of the definition and construction of core collections, enhancement of germplasm with core collections, and gene discovery from core collections. GPGR is opening a new era in germplasm research. The contribution, progress and achievements of GPGR in the future are predicted.

  12. Study on an inquiry-based teaching case in genomics curriculum:identifying virulence factors of Escherichia coli by using comparative genomics%探究式教学在基因组学课程中的实例研究:基于比较基因组学鉴定大肠杆菌致病因子

    Institute of Scientific and Technical Information of China (English)

    周纪东; 李余动

    2015-01-01

    基因组学是现代生命科学中众多“组学”研究方法的核心,越来越受到高校生物教学的重视,但目前关于其教学方法的研究报道较少。文章以大肠杆菌致病因子的比较基因组学教学作为一个专题教学实例,展示探究式教学法在基因组学课程中的应用。学生首先以Mauve软件比较不同大肠杆菌的基因组,分析基因的保守性,并以BLAST检索基因的功能注释信息,从而鉴定其是否为致病因子基因。在该专题教学中,整个教学过程通过情境、资源、任务、过程、评价5个模块来组织实施。教学效果表明,通过探究式教学的应用,学生在掌握基因组学课程知识与专业技能的基础上,提高了科研兴趣与自主学习能力。此外,该专题教学内容还可应用于其他相关课程如微生物学、生物信息学、分子生物学及食品安全检测技术等。%Genomics is the core subject of various “omics” and it also becomes a topic of increasing interest in undergraduate curricula of biological sciences. However, the study on teaching methodology of genomics courses was very limited so far. Here we report an application of inquiry-based teaching in genomics courses by using virulence factors of Escherichia coli as an example of comparative genomics study. Specially, students first built a mul-tiple-genome alignment of different E. coli strains to investigate the gene conservation using the Mauve tool; then putative virulence factor genes were identified by using BLAST tool to obtain gene annotations. The teaching process was divided into five modules: situation, resources, task, process and evaluation. Learning-assessment results re-vealed that students had acquired the knowledge and skills of genomics, and their learning interest and ability of self-study were also motivated. Moreover, the special teaching case can be applied to other related courses, such as microbiology

  13. Base-By-Base: Single nucleotide-level analysis of whole viral genome alignments

    Directory of Open Access Journals (Sweden)

    Tcherepanov Vasily

    2004-07-01

    Full Text Available Abstract Background With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes is not feasible without new bioinformatics tools. Results A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1 rapidly identify and correct alignment errors in large, multiple genome alignments; and 2 generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs to retrieve detailed annotation information about the aligned genomes or use information from text files. Conclusion Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.

  14. Complete genome sequences and comparative genome analysis of Lactobacillus plantarum strain 5-2 isolated from fermented soybean.

    Science.gov (United States)

    Liu, Chen-Jian; Wang, Rui; Gong, Fu-Ming; Liu, Xiao-Feng; Zheng, Hua-Jun; Luo, Yi-Yong; Li, Xiao-Ran

    2015-12-01

    Lactobacillus plantarum is an important probiotic and is mostly isolated from fermented foods. We sequenced the genome of L. plantarum strain 5-2, which was derived from fermented soybean isolated from Yunnan province, China. The strain was determined to contain 3114 genes. Fourteen complete insertion sequence (IS) elements were found in 5-2 chromosome. There were 24 DNA replication proteins and 76 DNA repair proteins in the 5-2 genome. Consistent with the classification of L. plantarum as a facultative heterofermentative lactobacillus, the 5-2 genome encodes key enzymes required for the EMP (Embden-Meyerhof-Parnas) and phosphoketolase (PK) pathways. Several components of the secretion machinery are found in the 5-2 genome, which was compared with L. plantarum ST-III, JDM1 and WCFS1. Most of the specific proteins in the four genomes appeared to be related to their prophage elements.

  15. Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models

    Directory of Open Access Journals (Sweden)

    Surovcik Katharina

    2006-03-01

    Full Text Available Abstract Background Horizontal gene transfer (HGT is considered a strong evolutionary force shaping the content of microbial genomes in a substantial manner. It is the difference in speed enabling the rapid adaptation to changing environmental demands that distinguishes HGT from gene genesis, duplications or mutations. For a precise characterization, algorithms are needed that identify transfer events with high reliability. Frequently, the transferred pieces of DNA have a considerable length, comprise several genes and are called genomic islands (GIs or more specifically pathogenicity or symbiotic islands. Results We have implemented the program SIGI-HMM that predicts GIs and the putative donor of each individual alien gene. It is based on the analysis of codon usage (CU of each individual gene of a genome under study. CU of each gene is compared against a carefully selected set of CU tables representing microbial donors or highly expressed genes. Multiple tests are used to identify putatively alien genes, to predict putative donors and to mask putatively highly expressed genes. Thus, we determine the states and emission probabilities of an inhomogeneous hidden Markov model working on gene level. For the transition probabilities, we draw upon classical test theory with the intention of integrating a sensitivity controller in a consistent manner. SIGI-HMM was written in JAVA and is publicly available. It accepts as input any file created according to the EMBL-format. It generates output in the common GFF format readable for genome browsers. Benchmark tests showed that the output of SIGI-HMM is in agreement with known findings. Its predictions were both consistent with annotated GIs and with predictions generated by different methods. Conclusion SIGI-HMM is a sensitive tool for the identification of GIs in microbial genomes. It allows to interactively analyze genomes in detail and to generate or to test hypotheses about the origin of acquired

  16. e-Fungi: a data resource for comparative analysis of fungal genomes

    Directory of Open Access Journals (Sweden)

    Hubbard Simon J

    2007-11-01

    Full Text Available Abstract Background The number of sequenced fungal genomes is ever increasing, with about 200 genomes already fully sequenced or in progress. Only a small percentage of those genomes have been comprehensively studied, for example using techniques from functional genomics. Comparative analysis has proven to be a useful strategy for enhancing our understanding of evolutionary biology and of the less well understood genomes. However, the data required for these analyses tends to be distributed in various heterogeneous data sources, making systematic comparative studies a cumbersome task. Furthermore, comparative analyses benefit from close integration of derived data sets that cluster genes or organisms in a way that eases the expression of requests that clarify points of similarity or difference between species. Description To support systematic comparative analyses of fungal genomes we have developed the e-Fungi database, which integrates a variety of data for more than 30 fungal genomes. Publicly available genome data, functional annotations, and pathway information has been integrated into a single data repository and complemented with results of comparative analyses, such as MCL and OrthoMCL cluster analysis, and predictions of signaling proteins and the sub-cellular localisation of proteins. To access the data, a library of analysis tasks is available through a web interface. The analysis tasks are motivated by recent comparative genomics studies, and aim to support the study of evolutionary biology as well as community efforts for improving the annotation of genomes. Web services for each query are also available, enabling the tasks to be incorporated into workflows. Conclusion The e-Fungi database provides fungal biologists with a resource for comparative studies of a large range of fungal genomes. Its analysis library supports the comparative study of genome data, functional annotation, and results of large scale analyses over all the

  17. Comparative Genomics of Vibrio cholerae from Haiti, Asia, and Africa

    Science.gov (United States)

    Reimer, Aleisha R.; Van Domselaar, Gary; Stroika, Steven; Walker, Matthew; Kent, Heather; Tarr, Cheryl; Talkington, Deborah; Rowe, Lori; Olsen-Rasmussen, Melissa; Frace, Michael; Sammons, Scott; Dahourou, Georges Anicet; Boncy, Jacques; Smith, Anthony M.; Mabon, Philip; Petkau, Aaron; Graham, Morag; Gilmour, Matthew W.

    2011-01-01

    Cholera was absent from the island of Hispaniola at least a century before an outbreak that began in Haiti in the fall of 2010. Pulsed-field gel electrophoresis (PFGE) analysis of clinical isolates from the Haiti outbreak and recent global travelers returning to the United States showed indistinguishable PFGE fingerprints. To better explore the genetic ancestry of the Haiti outbreak strain, we acquired 23 whole-genome Vibrio cholerae sequences: 9 isolates obtained in Haiti or the Dominican Republic, 12 PFGE pattern-matched isolates linked to Asia or Africa, and 2 nonmatched outliers from the Western Hemisphere. Phylogenies for whole-genome sequences and core genome single-nucleotide polymorphisms showed that the Haiti outbreak strain is genetically related to strains originating in India and Cameroon. However, because no identical genetic match was found among sequenced contemporary isolates, a definitive genetic origin for the outbreak in Haiti remains speculative. PMID:22099115

  18. Comparative genomics of Vibrio cholerae from Haiti, Asia, and Africa.

    Science.gov (United States)

    Reimer, Aleisha R; Van Domselaar, Gary; Stroika, Steven; Walker, Matthew; Kent, Heather; Tarr, Cheryl; Talkington, Deborah; Rowe, Lori; Olsen-Rasmussen, Melissa; Frace, Michael; Sammons, Scott; Dahourou, Georges Anicet; Boncy, Jacques; Smith, Anthony M; Mabon, Philip; Petkau, Aaron; Graham, Morag; Gilmour, Matthew W; Gerner-Smidt, Peter

    2011-11-01

    Cholera was absent from the island of Hispaniola at least a century before an outbreak that began in Haiti in the fall of 2010. Pulsed-field gel electrophoresis (PFGE) analysis of clinical isolates from the Haiti outbreak and recent global travelers returning to the United States showed indistinguishable PFGE fingerprints. To better explore the genetic ancestry of the Haiti outbreak strain, we acquired 23 whole-genome Vibrio cholerae sequences: 9 isolates obtained in Haiti or the Dominican Republic, 12 PFGE pattern-matched isolates linked to Asia or Africa, and 2 nonmatched outliers from the Western Hemisphere. Phylogenies for whole-genome sequences and core genome single-nucleotide polymorphisms showed that the Haiti outbreak strain is genetically related to strains originating in India and Cameroon. However, because no identical genetic match was found among sequenced contemporary isolates, a definitive genetic origin for the outbreak in Haiti remains speculative.

  19. Natural Product Biosynthetic Diversity and Comparative Genomics of the Cyanobacteria.

    Science.gov (United States)

    Dittmann, Elke; Gugger, Muriel; Sivonen, Kaarina; Fewer, David P

    2015-10-01

    Cyanobacteria are an ancient lineage of slow-growing photosynthetic bacteria and a prolific source of natural products with intricate chemical structures and potent biological activities. The bulk of these natural products are known from just a handful of genera. Recent efforts have elucidated the mechanisms underpinning the biosynthesis of a diverse array of natural products from cyanobacteria. Many of the biosynthetic mechanisms are unique to cyanobacteria or rarely described from other organisms. Advances in genome sequence technology have precipitated a deluge of genome sequences for cyanobacteria. This makes it possible to link known natural products to biosynthetic gene clusters but also accelerates the discovery of new natural products through genome mining. These studies demonstrate that cyanobacteria encode a huge variety of cryptic gene clusters for the production of natural products, and the known chemical diversity is likely to be just a fraction of the true biosynthetic capabilities of this fascinating and ancient group of organisms.

  20. Comparative Genome Analysis of Lolium-Festuca Complex Species

    DEFF Research Database (Denmark)

    Czaban, Adrian; Byrne, Stephen; Sharma, Sapna

    2015-01-01

    The Lolium-Festuca complex incorporates species from the Lolium genera and the broad leaf Fescues. Plants belonging to this complex exhibit significant phenotypic plasticity for agriculturally important traits, such as annuality/perenniality, establishment potential, growth speed, nutritional value......, winter hardiness, drought tolerance and resistance to grazing. In this study we have sequenced and assembled the low copy fraction of the genomes of Lolium westerwoldicum, Lolium multiflorum, Festuca pratensis and Lolium temulentum. We have also generated de-novo transcriptome assemblies for each species......, and these have aided in the annotation of the genomic sequence. Using this data we were able to generate annotated assemblies of the gene rich regions of the four species to complement the already sequenced Lolium perenne genome. Using these gene models we have identified orthologous genes between the species...

  1. IMG 4 version of the integrated microbial genomes comparative analysis system

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Chen, I-Min A. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Palaniappan, Krishna [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Chu, Ken [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Szeto, Ernest [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Pillay, Manoj [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Ratner, Anna [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Huang, Jinghua [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Woyke, Tanja [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Huntemann, Marcel [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Anderson, Iain [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Billis, Konstantinos [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Varghese, Neha [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Mavromatis, Konstantinos [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Pati, Amrita [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Ivanova, Natalia N. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Kyrpides, Nikos C. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  2. IMG 4 version of the integrated microbial genomes comparative analysis system.

    Science.gov (United States)

    Markowitz, Victor M; Chen, I-Min A; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  3. Comparative genomics in cyprinids: Common carp EST's help the annotation of the zebrafish genome

    NARCIS (Netherlands)

    Christoffels, A.; Bartfai, R.; Srinivasan, H.; Komen, J.

    2006-01-01

    Background - Automatic annotation of sequenced eukaryotic genomes integrates a combination of methodologies such as ab-initio methods and alignment of homologous genes and/or proteins. For example, annotation of the zebrafish genome within Ensembl relies heavily on available cDNA and protein sequenc

  4. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    NARCIS (Netherlands)

    Tettelin, H; Masignani, [No Value; Cieslewicz, MJ; Eisen, JA; Peterson, S; Paulsen, IT; Nelson, KE; Margarit, [No Value; Read, TD; Madoff, LC; Beanan, MJ; Brinkac, LM; Daugherty, SC; DeBoy, RT; Durkin, AS; Kolonay, JF; Madupu, R; Lewis, MR; Radune, D; Fedorova, NB; Scanlan, D; Khouri, H; Mulligan, S; Carty, HA; Cline, RT; Van Aken, SE; Gill, J; Scarselli, M; Mora, M; Iacobini, ET; Brettoni, C; Galli, G; Mariani, M; Vegni, F; Maione, D; Rinaudo, D; Rappuoli, R; Telford, JL; Kasper, DL; Grandi, G; Fraser, CM

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the oth

  5. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    NARCIS (Netherlands)

    Tettelin, H; Masignani, [No Value; Cieslewicz, MJ; Eisen, JA; Peterson, S; Paulsen, IT; Nelson, KE; Margarit, [No Value; Read, TD; Madoff, LC; Beanan, MJ; Brinkac, LM; Daugherty, SC; DeBoy, RT; Durkin, AS; Kolonay, JF; Madupu, R; Lewis, MR; Radune, D; Fedorova, NB; Scanlan, D; Khouri, H; Mulligan, S; Carty, HA; Cline, RT; Van Aken, SE; Gill, J; Scarselli, M; Mora, M; Iacobini, ET; Brettoni, C; Galli, G; Mariani, M; Vegni, F; Maione, D; Rinaudo, D; Rappuoli, R; Telford, JL; Kasper, DL; Grandi, G; Fraser, CM

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the

  6. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python

    Directory of Open Access Journals (Sweden)

    Kristopher J. L. Irizarry

    2016-01-01

    Full Text Available Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism’s genome (such as the mouse genome in order to make physiological inferences about the role of genes and proteins in a less characterized organism’s genome (such as the Burmese python. We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1 production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2 enhanced assisted reproduction technology for endangered and captive reptiles; and (3 novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value.

  7. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python).

    Science.gov (United States)

    Irizarry, Kristopher J L; Rutllant, Josep

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value.

  8. Genome-wide distribution comparative and composition analysis of the SSRs in Poaceae.

    Science.gov (United States)

    Wang, Yi; Yang, Chao; Jin, Qiaojun; Zhou, Dongjie; Wang, Shuangshuang; Yu, Yuanjie; Yang, Long

    2015-02-15

    The Poaceae family is of great importance to human beings since it comprises the cereal grasses which are the main sources for human food and animal feed. With the rapid growth of genomic data from Poaceae members, comparative genomics becomes a convinent method to study genetics of diffierent species. The SSRs (Simple Sequence Repeats) are widely used markers in the studies of Poaceae for their high abundance and stability. In this study, using the genomic sequences of 9 Poaceae species, we detected 11,993,943 SSR loci and developed 6,799,910 SSR primer pairs. The results show that SSRs are distributed on all the genomic elements in grass. Hexamer is the most frequent motif and AT/TA is the most frequent motif in dimer. The abundance of the SSRs has a positive linear relationship with the recombination rate. SSR sequences in the coding regions involve a higher GC content in the Poaceae than that in the other species. SSRs of 70-80 bp in length showed the highest AT/GC base ratio among all of these loci. The result shows the highest polymorphism rate belongs to the SSRs ranged from 30 bp to 40 bp. Using all the SSR primers of Japonica, nineteen universal primers were selected and located on the genome of the grass family. The information of SSR loci, the SSR primers and the tools of mining and analyzing SSR are provided in the PSSRD (Poaceae SSR Database, http://biodb.sdau.edu.cn/pssrd/). Our study and the PSSRD database provide a foundation for the comparative study in the Poaceae and it will accelerate the study on markers application, gene mapping and molecular breeding.

  9. An isothermal primer extension method for whole genome amplification of fresh and degraded DNA: applications in comparative genomic hybridization, genotyping and mutation screening.

    Science.gov (United States)

    Lee, Cheryl I P; Leong, Siew Hong; Png, Adrian E H; Choo, Keng Wah; Syn, Christopher; Lim, Dennis T H; Law, Hai Yang; Kon, Oi Lian

    2006-01-01

    We describe a protocol that uses a bioinformatically optimized primer in an isothermal whole genome amplification (WGA) reaction. Overnight incubation at 37 degrees C efficiently generates several hundred- to several thousand-fold increases in input DNA. The amplified product retains reasonably faithful quantitative representation of unamplified whole genomic DNA (gDNA). We provide protocols for applying this isothermal primer extension WGA protocol in three different techniques of genomic analysis: comparative genomic hybridization (CGH), genotyping at simple tandem repeat (STR) loci and screening for single base mutations in a common monogenic disorder, beta-thalassemia. gDNA extracted from formalin-fixed paraffin-embedded (FFPE) tissues can also be amplified with this protocol.

  10. Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication.

    Science.gov (United States)

    Montague, Michael J; Li, Gang; Gandolfi, Barbara; Khan, Razib; Aken, Bronwen L; Searle, Steven M J; Minx, Patrick; Hillier, LaDeana W; Koboldt, Daniel C; Davis, Brian W; Driscoll, Carlos A; Barr, Christina S; Blackistone, Kevin; Quilez, Javier; Lorente-Galdos, Belen; Marques-Bonet, Tomas; Alkan, Can; Thomas, Gregg W C; Hahn, Matthew W; Menotti-Raymond, Marilyn; O'Brien, Stephen J; Wilson, Richard K; Lyons, Leslie A; Murphy, William J; Warren, Wesley C

    2014-12-02

    Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae.

  11. Comparative analysis of mitochondrial genomes of five aphid species (Hemiptera: Aphididae and phylogenetic implications.

    Directory of Open Access Journals (Sweden)

    Yuan Wang

    Full Text Available Insect mitochondrial genomes (mitogenomes are of great interest in exploring molecular evolution, phylogenetics and population genetics. Only two mitogenomes have been previously released in the insect group Aphididae, which consists of about 5,000 known species including some agricultural, forestry and horticultural pests. Here we report the complete 16,317 bp mitogenome of Cavariella salicicola and two nearly complete mitogenomes of Aphis glycines and Pterocomma pilosum. We also present a first comparative analysis of mitochondrial genomes of aphids. Results showed that aphid mitogenomes share conserved genomic organization, nucleotide and amino acid composition, and codon usage features. All 37 genes usually present in animal mitogenomes were sequenced and annotated. The analysis of gene evolutionary rate revealed the lowest and highest rates for COI and ATP8, respectively. A unique repeat region exclusively in aphid mitogenomes, which included variable numbers of tandem repeats in a lineage-specific manner, was highlighted for the first time. This region may have a function as another origin of replication. Phylogenetic reconstructions based on protein-coding genes and the stem-loop structures of control regions confirmed a sister relationship between Cavariella and pterocommatines. Current evidence suggest that pterocommatines could be formally transferred into Macrosiphini. Our paper also offers methodological instructions for obtaining other Aphididae mitochondrial genomes.

  12. The chloroplast genome of the hexaploid Spartina maritima (Poaceae, Chloridoideae): Comparative analyses and molecular dating.

    Science.gov (United States)

    Rousseau-Gueutin, M; Bellot, S; Martin, G E; Boutte, J; Chelaifa, H; Lima, O; Michon-Coudouel, S; Naquin, D; Salmon, A; Ainouche, K; Ainouche, M

    2015-12-01

    The history of many plant lineages is complicated by reticulate evolution with cases of hybridization often followed by genome duplication (allopolyploidy). In such a context, the inference of phylogenetic relationships and biogeographic scenarios based on molecular data is easier using haploid markers like chloroplast genome sequences. Hybridization and polyploidization occurred recurrently in the genus Spartina (Poaceae, Chloridoideae), as illustrated by the recent formation of the invasive allododecaploid S. anglica during the 19th century in Europe. Until now, only a few plastid markers were available to explore the history of this genus and their low variability limited the resolution of species relationships. We sequenced the complete chloroplast genome (plastome) of S. maritima, the native European parent of S. anglica, and compared it to the plastomes of other Poaceae. Our analysis revealed the presence of fast-evolving regions of potential taxonomic, phylogeographic and phylogenetic utility at various levels within the Poaceae family. Using secondary calibrations, we show that the tetraploid and hexaploid lineages of Spartina diverged 6-10 my ago, and that the two parents of the invasive allopolyploid S. anglica separated 2-4 my ago via long distance dispersal of the ancestor of S. maritima over the Atlantic Ocean. Finally, we discuss the meaning of divergence times between chloroplast genomes in the context of reticulate evolution. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. Chromosomal Localization of DNA Amplifications in Neuroblastoma Tumors Using cDNA Microarray Comparative Genomic Hybridization

    Directory of Open Access Journals (Sweden)

    Ben Beheshti

    2003-01-01

    Full Text Available Conventional comparative genomic hybridization (CGH profiling of neuroblastomas has identified many genomic aberrations, although the limited resolution has precluded a precise localization of sequences of interest within amplicons. To map high copy number genomic gains in clinically matched stage IV neuroblastomas, CGH analysis using a 19,200-feature cDNA microarray was used. A dedicated (freely available algorithm was developed for rapid in silico determination of chromosomal localizations of microarray cDNA targets, and for generation of an ideogram-type profile of copy number changes. Using these methodologies, novel gene amplifications undetectable by chromosome CGH were identified, and larger MYCN amplicon sizes (in one tumor up to 6 Mb than those previously reported in neuroblastoma were identified. The genes HPCAL1, LPIN1/KIAA0188, NAG, and NSE1/LOC151354 were found to be coamplified with MYCN. To determine whether stage IV primary tumors could be further subclassified based on their genomic copy number profiles, hierarchical clustering was performed. Cluster analysis of microarray CGH data identified three groups: 1 no amplifications evident, 2 a small MYCN amplicon as the only detectable imbalance, and 3 a large MYCN amplicon with additional gene amplifications. Application of CGH to cDNA microarray targets will help to determine both the variation of amplicon size and help better define amplification-dependent and independent pathways of progression in neuroblastoma.

  14. Comparative genome sequencing of drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    Energy Technology Data Exchange (ETDEWEB)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  15. Comparative Analysis of 35 Basidiomycete Genomes Reveals Diversity and Uniqueness of the Phylum

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Otillar, Robert; Fagnan, Kirsten; Boussau, Bastien; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Held, Benjamin; Nagy, Laszlo; Floudas, Dimitris; Morin, Emmanuelle; Manning, Gerard; Baker, Scott; Martin, Francis; Blanchette, Robert; Hibbett, David; Grigoriev, Igor V.

    2013-03-11

    Fungi of the phylum Basidiomycota (basidiomycetes), make up some 37percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprobes including wood decaying fungi. To better understand the diversity of this phylum we compared the genomes of 35 basidiomycete fungi including 6 newly sequenced genomes. The genomes of basidiomycetes span extremes of genome size, gene number, and repeat content. A phylogenetic tree of Basidiomycota was generated using the Phyldog software, which uses all available protein sequence data to simultaneously infer gene and species trees. Analysis of core genes reveals that some 48percent of basidiomycete proteins are unique to the phylum with nearly half of those (22percent) comprising proteins found in only one organism. Phylogenetic patterns of plant biomass-degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay among the members of Agaricomycotina subphylum. There is a correlation of the profile of certain gene families to nutritional mode in Agaricomycotina. Based on phylogenetically-informed PCA analysis of such profiles, we predict that that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has liginolytic class II fungal peroxidases. Furthermore, we find that both fungi exhibit wood decay with white rot-like characteristics in growth assays. Analysis of the rate of discovery of proteins with no or few homologs suggests the high value of continued sequencing of basidiomycete fungi.

  16. Evaluation of Apis mellifera syriaca Levant region honeybee conservation using comparative genome hybridization.

    Science.gov (United States)

    Haddad, Nizar Jamal; Batainh, Ahmed; Saini, Deepti; Migdadi, Osama; Aiyaz, Mohamed; Manchiganti, Rushiraj; Krishnamurthy, Venkatesh; Al-Shagour, Banan; Brake, Mohammad; Bourgeois, Lelania; De Guzman, Lilia; Rinderer, Thomas; Hamouri, Zayed Mahoud

    2016-06-01

    Apis mellifera syriaca is the native honeybee subspecies of Jordan and much of the Levant region. It expresses behavioral adaptations to a regional climate with very high temperatures, nectar dearth in summer, attacks of the Oriental wasp and is resistant to Varroa mites. The A. m. syriaca control reference sample (CRS) in this study was originally collected and stored since 2001 from "Wadi Ben Hammad", a remote valley in the southern region of Jordan. Morphometric and mitochondrial DNA markers of these honeybees had shown highest similarity to reference A. m. syriaca samples collected in 1952 by Brother Adam of samples collected from the Middle East. Samples 1-5 were collected from the National Center for Agricultural Research and Extension breeding apiary which was established for the conservation of A. m. syriaca. Our objective was to determine the success of an A. m. syriaca honey bee conservation program using genomic information from an array-based comparative genomic hybridization platform to evaluate genetic similarities to a historic reference collection (CRS). Our results had shown insignificant genomic differences between the current population in the conservation program and the CRS indicated that program is successfully conserving A. m. syriaca. Functional genomic variations were identified which are useful for conservation monitoring and may be useful for breeding programs designed to improve locally adapted strains of A. m. syriaca.

  17. Genomic comparisons of Brucella spp. and closely related bacteria using base compositional and proteome based methods

    Science.gov (United States)

    2010-01-01

    Background Classification of bacteria within the genus Brucella has been difficult due in part to considerable genomic homogeneity between the different species and biovars, in spite of clear differences in phenotypes. Therefore, many different methods have been used to assess Brucella taxonomy. In the current work, we examine 32 sequenced genomes from genus Brucella representing the six classical species, as well as more recently described species, using bioinformatical methods. Comparisons were made at the level of genomic DNA using oligonucleotide based methods (Markov chain based genomic signatures, genomic codon and amino acid frequencies based comparisons) and proteomes (all-against-all BLAST protein comparisons and pan-genomic analyses). Results We found that the oligonucleotide based methods gave different results compared to that of the proteome based methods. Differences were also found between the oligonucleotide based methods used. Whilst the Markov chain based genomic signatures grouped the different species in genus Brucella according to host preference, the codon and amino acid frequencies based methods reflected small differences between the Brucella species. Only minor differences could be detected between all genera included in this study using the codon and amino acid frequencies based methods. Proteome comparisons were found to be in strong accordance with current Brucella taxonomy indicating a remarkable association between gene gain or loss on one hand and mutations in marker genes on the other. The proteome based methods found greater similarity between Brucella species and Ochrobactrum species than between species within genus Agrobacterium compared to each other. In other words, proteome comparisons of species within genus Agrobacterium were found to be more diverse than proteome comparisons between species in genus Brucella and genus Ochrobactrum. Pan-genomic analyses indicated that uptake of DNA from outside genus Brucella appears to be

  18. Comparative and functional genomics of lipases in holometabolous insects.

    Science.gov (United States)

    Horne, Irene; Haritos, Victoria S; Oakeshott, John G

    2009-08-01

    Lipases have key roles in insect lipid acquisition, storage and mobilisation and are also fundamental to many physiological processes underpinning insect reproduction, development, defence from pathogens and oxidative stress, and pheromone signalling. We have screened the recently sequenced genomes of five species from four orders of holometabolous insects, the dipterans Drosophila melanogaster and Anopheles gambiae, the hymenopteran Apis mellifera, the moth Bombyx mori and the beetle Tribolium castaneum, for the six major lipase families that are also found in other organisms. The two most numerous families in the insects, the neutral and acid lipases, are also the main families in mammals, albeit not in Caenorhabditis elegans, plants or microbes. Total numbers of the lipases vary two-fold across the five insect species, from numbers similar to those in mammals up to numbers comparable to those seen in C. elegans. Whilst there is a high degree of orthology with mammalian lipases in the other four families, the grea