WorldWideScience

Sample records for array-based comparative genomic

  1. Genomic profiling of oral squamous cell carcinoma by array-based comparative genomic hybridization.

    Directory of Open Access Journals (Sweden)

    Shunichi Yoshioka

    Full Text Available We designed a study to investigate genetic relationships between primary tumors of oral squamous cell carcinoma (OSCC and their lymph node metastases, and to identify genomic copy number aberrations (CNAs related to lymph node metastasis. For this purpose, we collected a total of 42 tumor samples from 25 patients and analyzed their genomic profiles by array-based comparative genomic hybridization. We then compared the genetic profiles of metastatic primary tumors (MPTs with their paired lymph node metastases (LNMs, and also those of LNMs with non-metastatic primary tumors (NMPTs. Firstly, we found that although there were some distinctive differences in the patterns of genomic profiles between MPTs and their paired LNMs, the paired samples shared similar genomic aberration patterns in each case. Unsupervised hierarchical clustering analysis grouped together 12 of the 15 MPT-LNM pairs. Furthermore, similarity scores between paired samples were significantly higher than those between non-paired samples. These results suggested that MPTs and their paired LNMs are composed predominantly of genetically clonal tumor cells, while minor populations with different CNAs may also exist in metastatic OSCCs. Secondly, to identify CNAs related to lymph node metastasis, we compared CNAs between grouped samples of MPTs and LNMs, but were unable to find any CNAs that were more common in LNMs. Finally, we hypothesized that subpopulations carrying metastasis-related CNAs might be present in both the MPT and LNM. Accordingly, we compared CNAs between NMPTs and LNMs, and found that gains of 7p, 8q and 17q were more common in the latter than in the former, suggesting that these CNAs may be involved in lymph node metastasis of OSCC. In conclusion, our data suggest that in OSCCs showing metastasis, the primary and metastatic tumors share similar genomic profiles, and that cells in the primary tumor may tend to metastasize after acquiring metastasis-associated CNAs.

  2. High-Throughput Analysis of Subtelomeric Chromosome Rearrangements by Use of Array-Based Comparative Genomic Hybridization

    OpenAIRE

    Veltman, Joris A; Schoenmakers, Eric F.P.M.; Eussen, Bert H; Janssen, Irene; Merkx, Gerard; van Cleef, Brigitte; van Ravenswaaij, Conny M.; Brunner, Han G.; Smeets, Dominique; van Kessel, Ad Geurts

    2002-01-01

    Telomeric chromosome rearrangements may cause mental retardation, congenital anomalies, and miscarriages. Automated detection of subtle deletions or duplications involving telomeres is essential for high-throughput diagnosis, but impossible when conventional cytogenetic methods are used. Array-based comparative genomic hybridization (CGH) allows high-resolution screening of copy number abnormalities by hybridizing differentially labeled test and reference genomes to arrays of robotically spot...

  3. Exome sequencing and array-based comparative genomic hybridisation analysis of preferential 6-methylmercaptopurine producers.

    Science.gov (United States)

    Chua, E W; Cree, S; Barclay, M L; Doudney, K; Lehnert, K; Aitchison, A; Kennedy, M A

    2015-10-01

    Preferential conversion of azathioprine or 6-mercaptopurine into methylated metabolites is a major cause of thiopurine resistance. To seek potentially Mendelian causes of thiopurine hypermethylation, we recruited 12 individuals who exhibited extreme therapeutic resistance while taking azathioprine or 6-mercaptopurine and performed whole-exome sequencing (WES) and copy-number variant analysis by array-based comparative genomic hybridisation (aCGH). Exome-wide variant filtering highlighted four genes potentially associated with thiopurine metabolism (ENOSF1 and NFS1), transport (SLC17A4) or therapeutic action (RCC2). However, variants of each gene were found only in two or three patients, and it is unclear whether these genes could influence thiopurine hypermethylation. Analysis by aCGH did not identify any unusual or pathogenic copy-number variants. This suggests that if causative mutations for the hypermethylation phenotype exist they may be heterogeneous, occurring in several different genes, or they may lie within regulatory regions not captured by WES. Alternatively, hypermethylation may arise from the involvement of multiple genes with small effects. To test this hypothesis would require recruitment of large patient samples and application of genome-wide association studies. PMID:25752523

  4. Identification of genomic alterations in pancreatic cancer using array-based comparative genomic hybridization.

    Directory of Open Access Journals (Sweden)

    Jian-Wei Liang

    Full Text Available BACKGROUND: Genomic aberration is a common feature of human cancers and also is one of the basic mechanisms that lead to overexpression of oncogenes and underexpression of tumor suppressor genes. Our study aims to identify frequent genomic changes in pancreatic cancer. MATERIALS AND METHODS: We used array comparative genomic hybridization (array CGH to identify recurrent genomic alterations and validated the protein expression of selected genes by immunohistochemistry. RESULTS: Sixteen gains and thirty-two losses occurred in more than 30% and 60% of the tumors, respectively. High-level amplifications at 7q21.3-q22.1 and 19q13.2 and homozygous deletions at 1p33-p32.3, 1p22.1, 1q22, 3q27.2, 6p22.3, 6p21.31, 12q13.2, 17p13.2, 17q21.31 and 22q13.1 were identified. Especially, amplification of AKT2 was detected in two carcinomas and homozygous deletion of CDKN2C in other two cases. In 15 independent validation samples, we found that AKT2 (19q13.2 and MCM7 (7q22.1 were amplified in 6 and 9 cases, and CAMTA2 (17p13.2 and PFN1 (17p13.2 were homozygously deleted in 3 and 1 cases. AKT2 and MCM7 were overexpressed, and CAMTA2 and PFN1 were underexpressed in pancreatic cancer tissues than in morphologically normal operative margin tissues. Both GISTIC and Genomic Workbench software identified 22q13.1 containing APOBEC3A and APOBEC3B as the only homozygous deletion region. And the expression levels of APOBEC3A and APOBEC3B were significantly lower in tumor tissues than in morphologically normal operative margin tissues. Further validation showed that overexpression of PSCA was significantly associated with lymph node metastasis, and overexpression of HMGA2 was significantly associated with invasive depth of pancreatic cancer. CONCLUSION: These recurrent genomic changes may be useful for revealing the mechanism of pancreatic carcinogenesis and providing candidate biomarkers.

  5. A genome-wide analysis of array-based comparative genomic hybridization (CGH) data to detect intra-species variations and evolutionary relationships.

    Science.gov (United States)

    Array-based comparative genomics hybridization (CGH) has gained prevalence as a technique of choice for the detection of structural variations in the genome. In this study, we propose a novel genome-wide method of classification using CGH data, in order to reveal putative phylogenetic relationships ...

  6. Genomic profiling of rectal adenoma and carcinoma by array-based comparative genomic hybridization

    Directory of Open Access Journals (Sweden)

    Shi Zhi-Zhou

    2012-11-01

    Full Text Available Abstract Background Rectal cancer is one of the most common cancers in the world. Early detection and early therapy are important for the control of death caused by rectal cancer. The present study aims to investigate the genomic alterations in rectal adenoma and carcinoma. Methods We detected the genomic changes of 8 rectal adenomas and 8 carcinomas using array CGH. Then 14 genes were selected for analyzing the expression between rectal tumor and paracancerous normal tissues as well as from adenoma to carcinoma by real-time PCR. The expression of GPNMB and DIS3 were further investigated in rectal adenoma and carcinoma tissues by immunohistochemistry. Results We indentified ten gains and 22 losses in rectal adenoma, and found 25 gains and 14 losses in carcinoma. Gains of 7p21.3-p15.3, 7q22.3-q32.1, 13q13.1-q14.11, 13q21.1-q32.1, 13q32.2-q34, 20p11.21 and 20q11.23-q12 and losses of 17p13.1-p11.2, 18p11.32-p11.21 and 18q11.1-q11.2 were shared by both rectal adenoma and carcinoma. Gains of 1q, 6p21.33-p21.31 and losses of 10p14-p11.21, 14q12-q21.1, 14q22.1-q24.3, 14q31.3-q32.1, 14q32.2-q32.32, 15q15.1-q21.1, 15q22.31 and 15q25.1-q25.2 were only detected in carcinoma but not in adenoma. Copy number and mRNA expression of EFNA1 increased from rectal adenoma to carcinoma. C13orf27 and PMEPA1 with increased copy number in both adenoma and carcinoma were over expressed in rectal cancer tissues. Protein and mRNA expression of GPNMB was significantly higher in cancer tissues than rectal adenoma tissues. Conclusion Our data may help to identify the driving genes involved in the adenoma-carcinoma progression.

  7. Array-based comparative genomic hybridization for genomic-wide screening of DNA copy number alterations in aggressive bone tumors

    Directory of Open Access Journals (Sweden)

    Kanamori Masahiko

    2012-11-01

    Full Text Available Abstract Background The genetic pathways of aggressive changes of bone tumors are still poorly understood. It is very important to analyze DNA copy number alterations (DCNAs, to identify the molecular events in the step of progression to the aggressive change of bone tissue. Methods Genome-wide array-based comparative genomic hybridization (array CGH was used to investigate DCNAs of 14 samples from 13 aggressive bone tumors, such as giant cell tumors (GCTs and osteosarcoma (OS, etc. Results Primary aggressive bone tumors had copy number gains of 17.8±12.7% in the genome, and losses of 17.3±11.4% in 287 target clones (threshold for each DCNA: ≦085, 1.15≦. Genetic unstable cases, which were defined by the total DCNAs aberration ≧30%, were identified in 9 of 13 patients (3 of 7 GCTs and all malignant tumors. High-level amplification of TGFβ2, CCND3, WI-6509, SHGC-5557, TCL1A, CREBBP, HIC1, THRA, AFM217YD10, LAMA3, RUNX1 and D22S543, were commonly observed in aggressive bone tumors. On the other hand, NRAS, D2S447, RAF1, ROBO1, MYB, MOS, FGFR2, HRAS, D13S319, D13S327, D18S552, YES1 and DCC, were commonly low. We compared genetic instability between a primary OS and its metastatic site in Case #13. Metastatic lesion showed increased 9 DCNAs of remarkable change (m/p ratio ≧1.3 folds, compared to a primary lesion. D1S214, D1S1635, EXT1, AFM137XA11, 8 M16/SP6, CCND2, IGH, 282 M15/SP6, HIC1 and LAMA3, were overexpressed. We gave attention to HIC1 (17p13.3, which was common high amplification in this series. Conclusion Our results may provide several entry points for the identification of candidate genes associated with aggressive change of bone tumors. Especially, the locus 17p11-13 including HIC1 close to p53 was common high amplification in this series and review of the literature.

  8. Prenatal diagnosis of chromosomal abnormalities using array-based comparative genomic hybridization

    Science.gov (United States)

    This study was designed to evaluate the feasibility of using a targeted array-CGH strategy for prenatal diagnosis of genomic imbalances in a clinical setting of current pregnancies. Women undergoing prenatal diagnosis were counseled and offered array-CGH (BCM V4.0) in addition to routine chromosome ...

  9. Array-based comparative genomic hybridization for the detection of DNA sequence copy number changes in Barrett's adenocarcinoma.

    Science.gov (United States)

    Albrecht, Bettina; Hausmann, Michael; Zitzelsberger, Horst; Stein, Hubert; Siewert, Jörg Rüdiger; Hopt, Ulrich; Langer, Rupert; Höfler, Heinz; Werner, Martin; Walch, Axel

    2004-07-01

    Array-based comparative genomic hybridization (aCGH) allows the identification of DNA sequence copy number changes at high resolution by co-hybridizing differentially labelled test and control DNAs to a micro-array of genomic clones. The present study has analysed a series of 23 formalin-fixed, paraffin wax-embedded tissue samples of Barrett's adenocarcinoma (BCA, n = 18) and non-neoplastic squamous oesophageal (n = 2) and gastric cardia mucosa (n = 3) by aCGH. The micro-arrays used contained 287 genomic targets covering oncogenes, tumour suppressor genes, and DNA sequences localized within chromosomal regions previously reported to be altered in BCA. DNA sequence copy number changes for a panel of approximately 50 genes were identified, most of which have not been previously described in BCA. DNA sequence copy number gains (mean 41 +/- 25/BCA) were more frequent than DNA sequence copy number losses (mean 20 +/- 15/BCA). The highest frequencies for DNA sequence copy number gains were detected for SNRPN (61%); GNLY (44%); NME1 (44%); DDX15, ABCB1 (MDR), ATM, LAMA3, MYBL2, ZNF217, and TNFRSF6B (39% each); and MSH2, TERC, SERPINE1, AFM137XA11, IGF1R, and PTPN1 (33% each). DNA sequence copy number losses were identified for PDGFB (44%); D17S125 (39%); AKT3 (28%); and RASSFI, FHIT, CDKN2A (p16), and SAS (CDK4) (28% each). In all non-neoplastic tissue samples of squamous oesophageal and gastric cardia mucosa, the measured mean ratios were 1.00 (squamous oesophageal mucosa) or 1.01 (gastric mucosa), indicating that no DNA sequence copy number chances were present. For validation, the DNA sequence copy number changes of selected clones (SNRPN, CMYC, HER2, ZNF217) detected by aCGH were confirmed by fluorescence in situ hybridization (FISH). These data show the sensitivity of aCGH for the identification of DNA sequence copy number changes at high resolution in BCA. The newly identified genes may include so far unknown biomarkers in BCA and are therefore a starting point for

  10. Microalterations of Inherently Unstable Genomic Regions in Rat Mammary Carcinomas as Revealed by Long Oligonucleotide Array-Based Comparative Genomic Hybridization

    NARCIS (Netherlands)

    Adamovic, Tatjana; McAllister, Donna; Guryev, Victor; Wang, Xujing; Andrae, Jaime Wendt; Cuppen, Edwin; Jacob, Howard J.; Sugg, Sonia L.

    2009-01-01

    The presence of copy number variants in normal genomes poses a challenge to identify small genuine somatic copy number changes in high-resolution cancer genome profiling studies due to the use of unpaired reference DNA. Another problem is the well-known rearrangements of immunoglobulin and T-cell re

  11. Array-based comparative genomic hybridization analysis reveals chromosomal copy number aberrations associated with clinical outcome in canine diffuse large B-cell lymphoma.

    Directory of Open Access Journals (Sweden)

    Arianna Aricò

    Full Text Available Canine Diffuse Large B-cell Lymphoma (cDLBCL is an aggressive cancer with variable clinical response. Despite recent attempts by gene expression profiling to identify the dog as a potential animal model for human DLBCL, this tumor remains biologically heterogeneous with no prognostic biomarkers to predict prognosis. The aim of this work was to identify copy number aberrations (CNAs by high-resolution array comparative genomic hybridization (aCGH in 12 dogs with newly diagnosed DLBCL. In a subset of these dogs, the genetic profiles at the end of therapy and at relapse were also assessed. In primary DLBCLs, 90 different genomic imbalances were counted, consisting of 46 gains and 44 losses. Two gains in chr13 were significantly correlated with clinical stage. In addition, specific regions of gains and losses were significantly associated to duration of remission. In primary DLBCLs, individual variability was found, however 14 recurrent CNAs (>30% were identified. Losses involving IGK, IGL and IGH were always found, and gains along the length of chr13 and chr31 were often observed (>41%. In these segments, MYC, LDHB, HSF1, KIT and PDGFRα are annotated. At the end of therapy, dogs in remission showed four new CNAs, whereas three new CNAs were observed in dogs at relapse compared with the previous profiles. One ex novo CNA, involving TCR, was present in dogs in remission after therapy, possibly induced by the autologous vaccine. Overall, aCGH identified small CNAs associated with outcome, which, along with future expression studies, may reveal target genes relevant to cDLBCL.

  12. Evolutionary insights from suffix array-based genome sequence analysis

    Indian Academy of Sciences (India)

    Anindya Poddar; Nagasuma Chandra; Madhavi Ganapathiraju; K Sekar; Judith Klein-Seetharaman; Raj Reddy; N Balakrishnan

    2007-08-01

    Gene and protein sequence analyses, central components of studies in modern biology are easily amenable to string matching and pattern recognition algorithms. The growing need of analysing whole genome sequences more efficiently and thoroughly, has led to the emergence of new computational methods. Suffix trees and suffix arrays are data structures, well known in many other areas and are highly suited for sequence analysis too. Here we report an improvement to the design of construction of suffix arrays. Enhancement in versatility and scalability, enabled by this approach, is demonstrated through the use of real-life examples. The scalability of the algorithm to whole genomes renders it suitable to address many biologically interesting problems. One example is the evolutionary insight gained by analysing unigrams, bi-grams and higher n-grams, indicating that the genetic code has a direct influence on the overall composition of the genome. Further, different proteomes have been analysed for the coverage of the possible peptide space, which indicate that as much as a quarter of the total space at the tetra-peptide level is left un-sampled in prokaryotic organisms, although almost all tri-peptides can be seen in one protein or another in a proteome. Besides, distinct patterns begin to emerge for the counts of particular tetra and higher peptides, indicative of a ‘meaning’ for tetra and higher n-grams. The toolkit has also been used to demonstrate the usefulness of identifying repeats in whole proteomes efficiently. As an example, 16 members of one COG, coded by the genome of Mycobacterium tuberculosis H37Rv have been found to contain a repeating sequence of 300 amino acids.

  13. Prognostic Impact of Array-based Genomic Profiles in Esophageal Squamous Cell Cancer

    International Nuclear Information System (INIS)

    Esophageal squamous cell carcinoma (ESCC) is a genetically complex tumor type and a major cause of cancer related mortality. Although distinct genetic alterations have been linked to ESCC development and prognosis, the genetic alterations have not gained clinical applicability. We applied array-based comparative genomic hybridization (aCGH) to obtain a whole genome copy number profile relevant for identifying deranged pathways and clinically applicable markers. A 32 k aCGH platform was used for high resolution mapping of copy number changes in 30 stage I-IV ESCC. Potential interdependent alterations and deranged pathways were identified and copy number changes were correlated to stage, differentiation and survival. Copy number alterations affected median 19% of the genome and included recurrent gains of chromosome regions 5p, 7p, 7q, 8q, 10q, 11q, 12p, 14q, 16p, 17p, 19p, 19q, and 20q and losses of 3p, 5q, 8p, 9p and 11q. High-level amplifications were observed in 30 regions and recurrently involved 7p11 (EGFR), 11q13 (MYEOV, CCND1, FGF4, FGF3, PPFIA, FAD, TMEM16A, CTTS and SHANK2) and 11q22 (PDFG). Gain of 7p22.3 predicted nodal metastases and gains of 1p36.32 and 19p13.3 independently predicted poor survival in multivariate analysis. aCGH profiling verified genetic complexity in ESCC and herein identified imbalances of multiple central tumorigenic pathways. Distinct gains correlate with clinicopathological variables and independently predict survival, suggesting clinical applicability of genomic profiling in ESCC

  14. High frequency of submicroscopic chromosomal imbalances in patients with syndromic craniosynostosis detected by a combined approach of microsatellite segregation analysis, multiplex ligation-dependent probe amplification and array-based comparative genome hybridisation.

    NARCIS (Netherlands)

    Jehee, F.S.; Krepischi-Santos, A.C.; Rocha, K.M.; Cavalcanti, D.P.; Kim, C.A.; Bertola, D.R.; Alonso, L.G.; D'Angelo, C.S.; Mazzeu, J.F.; Froyen, G.; Lugtenberg, D.; Vianna-Morgante, A.M.; Rosenberg, C.; Passos-Bueno, M.R.

    2008-01-01

    We present the first comprehensive study, to our knowledge, on genomic chromosomal analysis in syndromic craniosynostosis. In total, 45 patients with craniosynostotic disorders were screened with a variety of methods including conventional karyotype, microsatellite segregation analysis, subtelomeric

  15. Ebolavirus comparative genomics

    OpenAIRE

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S; Pedersen, Thomas Dybdal; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequen...

  16. Comparative genomic hybridization using oligonucleotide microarrays and total genomic DNA

    OpenAIRE

    Barrett, Michael T; Scheffer, Alicia; Ben-Dor, Amir; Sampas, Nick; Lipson, Doron; Kincaid, Robert; Tsang, Peter; Curry, Bo; Baird, Kristin; Meltzer, Paul S.; Yakhini, Zohar; Bruhn, Laurakay; Laderman, Stephen

    2004-01-01

    Array-based comparative genomic hybridization (CGH) measures copy-number variations at multiple loci simultaneously, providing an important tool for studying cancer and developmental disorders and for developing diagnostic and therapeutic targets. Arrays for CGH based on PCR products representing assemblies of BAC or cDNA clones typically require maintenance, propagation, replication, and verification of large clone sets. Furthermore, it is difficult to control the specificity of the hybridiz...

  17. From array-based hybridization of Helicobacter pylori isolates to the complete genome sequence of an isolate associated with MALT lymphoma

    Directory of Open Access Journals (Sweden)

    Mégraud Francis

    2010-06-01

    Full Text Available Abstract Background elicobacter pylori infection is associated with several gastro-duodenal inflammatory diseases of various levels of severity. To determine whether certain combinations of genetic markers can be used to predict the clinical source of the infection, we analyzed well documented and geographically homogenous clinical isolates using a comparative genomics approach. Results A set of 254 H. pylori genes was used to perform array-based comparative genomic hybridization among 120 French H. pylori strains associated with chronic gastritis (n = 33, duodenal ulcers (n = 27, intestinal metaplasia (n = 17 or gastric extra-nodal marginal zone B-cell MALT lymphoma (n = 43. Hierarchical cluster analyses of the DNA hybridization values allowed us to identify a homogeneous subpopulation of strains that clustered exclusively with cagPAI minus MALT lymphoma isolates. The genome sequence of B38, a representative of this MALT lymphoma strain-cluster, was completed, fully annotated, and compared with the six previously released H. pylori genomes (i.e. J99, 26695, HPAG1, P12, G27 and Shi470. B38 has the smallest H. pylori genome described thus far (1,576,758 base pairs containing 1,528 CDSs; it contains the vacAs2m2 allele and lacks the genes encoding the major virulence factors (absence of cagPAI, babB, babC, sabB, and homB. Comparative genomics led to the identification of very few sequences that are unique to the B38 strain (9 intact CDSs and 7 pseudogenes. Pair-wise genomic synteny comparisons between B38 and the 6 H. pylori sequenced genomes revealed an almost complete co-linearity, never seen before between the genomes of strain Shi470 (a Peruvian isolate and B38. Conclusion These isolates are deprived of the main H. pylori virulence factors characterized previously, but are nonetheless associated with gastric neoplasia.

  18. Comparative genomics of Bifidobacteria

    OpenAIRE

    Bottacini, Francesca

    2013-01-01

    Chapter 2 of this thesis describes the sequence analysis of 14 bifidobacterial genomes from various species of the genus Bifidobacterium, and the determination of their open pan-genome trend. This analysis first determined the total number of genes to be considered as the reservoir of functions available to representatives of this genus. Many identified genes are still uncharacterized, but may be involved in the adaptation to the gut environment. This comparative genomic analysis also determi...

  19. Ebolavirus comparative genomics

    DEFF Research Database (Denmark)

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat;

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a...... distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae....... Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could...

  20. Normalization and centering of array-based heterologous genome hybridization based on divergent control probes

    Directory of Open Access Journals (Sweden)

    Wheeler David

    2011-05-01

    Full Text Available Abstract Background Hybridization of heterologous (non-specific nucleic acids onto arrays designed for model-organisms has been proposed as a viable genomic resource for estimating sequence variation and gene expression in non-model organisms. However, conventional methods of normalization that assume equivalent distributions (such as quantile normalization are inappropriate when applied to non-specific (heterologous hybridization. We propose an algorithm for normalizing and centering intensity data from heterologous hybridization that makes no prior assumptions of distribution, reduces the false appearance of homology, and provides a way for researchers to confirm whether heterologous hybridization is suitable. Results Data are normalized by adjusting for Gibbs free energy binding, and centered by adjusting for the median of a common set of control probes assumed to be equivalently dissimilar for all species. This procedure was compared to existing approaches and found to be as successful as Loess normalization at detecting sequence variations (deletions and even more successful than quantile normalization at reducing the accumulation of false positive probe matches between two related nematode species, Caenorhabditis elegans and C. briggsae. Despite the improvements, we still found that probe fluorescence intensity was too poorly correlated with sequence similarity to result in reliable detection of matching probe sequence. Conclusions Cross-species hybridizations can be a way to adapt genome-enabled tools for closely related non-model organisms, but data must be appropriately normalized and centered in a way that accommodates hybridization of nucleic acids with diverged sequence. For short, 25-mer probes, hybridization intensity alone may be insufficiently correlated with sequence similarity to allow reliable inference of homology at the probe level.

  1. Array-based genomic screening at diagnosis and during follow-up in chronic lymphocytic leukemia

    DEFF Research Database (Denmark)

    Gunnarsson, Rebeqa; Mansouri, Larry; Isaksson, Anders;

    2011-01-01

    High-resolution genomic microarrays enable simultaneous detection of copy-number aberrations such as the known recurrent aberrations in chronic lymphocytic leukemia [del(11q), del(13q), del(17p) and trisomy 12], and copy-number neutral loss of heterozygosity. Moreover, comparison of genomic...

  2. Comparative Genome Viewer

    International Nuclear Information System (INIS)

    The amount of information about genomes, both in the form of complete sequences and annotations, has been exponentially increasing in the last few years. As a result there is the need for tools providing a graphical representation of such information that should be comprehensive and intuitive. Visual representation is especially important in the comparative genomics field since it should provide a combined view of data belonging to different genomes. We believe that existing tools are limited in this respect as they focus on a single genome at a time (conservation histograms) or compress alignment representation to a single dimension. We have therefore developed a web-based tool called Comparative Genome Viewer (Cgv): it integrates a bidimensional representation of alignments between two regions, both at small and big scales, with the richness of annotations present in other genome browsers. We give access to our system through a web-based interface that provides the user with an interactive representation that can be updated in real time using the mouse to move from region to region and to zoom in on interesting details.

  3. Ebolavirus comparative genomics.

    Science.gov (United States)

    Jun, Se-Ran; Leuze, Michael R; Nookaew, Intawat; Uberbacher, Edward C; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S; Pedersen, Thomas D; Wassenaar, Trudy M; Ussery, David W

    2015-09-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). PMID:26175035

  4. Phytozome Comparative Plant Genomics Portal

    Energy Technology Data Exchange (ETDEWEB)

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: 1. Understand and accelerate the improvement (domestication) of bioenergy crops 2. Characterize and moderate plant response to climate change 3. Use comparative genomics to identify constrained elements and infer gene function 4. Build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work 5. Expand functional genomic resources for Plant Flagship genomes

  5. Genome-wide array-based comparative genomic hybridization (array-CGH) analysis in Aicardi Syndrome

    Science.gov (United States)

    Aicardi syndrome is characterized by agenesis of the corpus callosum, chorioretinal lacunae, severe seizures (starting as infantile spasms), neuronal migration defects, mental retardation, costovertebral defects, and typical facial features. Because Aicardi syndrome is sporadic and affects only fem...

  6. 微阵列比较基因组杂交技术分析一例猫叫综合征患儿的基因组拷贝数变异%Analysis of copy number variations in an infant with Cri du Chat syndrome by array-based comparative genomic hybridization

    Institute of Scientific and Technical Information of China (English)

    罗福薇; 罗彩群; 谢建生; 耿茜; 刘红; 李芳; 陈武斌; 王丽

    2013-01-01

    Objective To analyze genomic copy number variations in an infant with Cri du Chat syndrome,and to explore the underlying genetic cause.Methods G-banding analysis was carried out on cultured peripheral blood sample from the patient.Copy number variation analysis was performed using microarray comparative genomic hybridization,and the result was verified with fluorescence in situ hybridization.Results The infant was found to have a 46,XY,der(5)(p?) karyotype.By microarray comparative genomic hybridization,a 23.263 Mb deletion was detected in 5p14.2-p15.3 region in addition to a 14.602 Mb duplication in 12p31 region.A derivative chromosome was formed by rejoining of 12p31 region with the 5p14.2 breakpoint.The patient therefore has a karyotype of arr cgh 5p15.3p14.2 (PLEKHG4B→CDH12) × 1 pat,12p13.33p13.1 (IQSEC3→GUC Y2C) × 3 pat.Loss of distal 5p and gain of distal 12p were verified with fluorescence in situ hybridization.Conclusion The Cri du Chat syndrome manifested by the patient was caused by deletion of distal 5p from an unbalanced translocation involving chromosome 5.Microarray comparative genomic hybridization is a powerful tool for revealing genomic copy number variations for its high-resolution,high-throughput and high-accuracy.%目的 对1例猫叫综合征患儿进行基因组拷贝数分析,寻找其致病原因.方法 对患儿外周血进行常规G显带分析,应用微阵列比较基因组杂交技术进行全基因组扫描,并应用荧光原位杂交技术对异常拷贝数区域进行验证.结果 患儿染色体核型为46,XY,der(5)(p?).微阵列比较基因组杂交显示其在5p14.2-p15.3处存在23.263Mb的片段缺失,12号染色体12p31区域存在14.602 Mb的片段重复.重复片段连接至5p14.2处,形成5号衍生染色体,即arr cgh 5p15.3p14.2(PLEKHG4B→CDH12)×1 pat,12p13.33p13.1(IQSEC3→GUC Y2C)× 3 pat.荧光原位杂交证实患儿存在5p末端缺失及12p末端重复.结论 5号染色体不平衡易位导致患儿5p末端

  7. Array-based approaches to bacterial transcriptome analysis

    OpenAIRE

    Mäder, Ulrike; Nicolas, Pierre

    2012-01-01

    Microarray technology has been extensively used to compare or quantify genome-wide mRNA levels, a key factor in the adaptive response of bacteria to the environment. Classical gene expression arrays based on an existing genome annotation with relatively few probes for each gene, are well suited to assess the expression levels of all annotated transcripts under many different conditions. Newer genomic tiling arrays that cover both strands of a genome by overlapping probes and, more recently, R...

  8. Comparative genomic hybridization: an overview.

    OpenAIRE

    Houldsworth, J; Chaganti, R S

    1994-01-01

    Comparative genomic hybridization (CGH) is a newly described molecular-cytogenetic assay that globally assays for chromosomal gains and losses in a genomic complement. In this assay, normal human metaphase chromosomes are competitively hybridized with two differentially labeled genomic DNAs (test and reference), which upon fluorescence microscopy, reveal the chromosomal locations of copy number changes in DNA sequences between the two complements. Application of CGH to DNAs extracted from fre...

  9. Cloud computing for comparative genomics

    Directory of Open Access Journals (Sweden)

    Pivovarov Rimma

    2010-05-01

    Full Text Available Abstract Background Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD, to run within Amazon's Elastic Computing Cloud (EC2. We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. Results We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. Conclusions The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems.

  10. Comparative genomics of Helicobacter pylori

    Institute of Scientific and Technical Information of China (English)

    Quan-Jiang Dong; Qing Wang; Ying-Nin Xin; Ni Li; Shi-Ying Xuan

    2009-01-01

    Genomic sequences have been determined for a number of strains of Helicobacter pylori (H pylori) and related bacteria.With the development of microarray analysis and the wide use of subtractive hybridization techniques,comparative studies have been carried out with respect to the interstrain differences between H pylori and inter-species differences in the genome of related bacteria.It was found that the core genome of H pylori constitutes 1111 genes that are determinants of the species properties.A great pool of auxillary genes are mainly from the categories of cag pathogenicity islands,outer membrane proteins,restriction-modification system and hypothetical proteins of unknown function.Persistence of H pylori in the human stomach leads to the diversification of the genome.Comparative genomics suggest that a host jump has occurs from humans to felines.Candidate genes specific for the development of the gastric diseases were identified.With the aid of proteomics,population genetics and other molecular methods,future comparative genomic studies would dramatically promote our understanding of the evolution,pathogenesis and microbiology of H pylori.

  11. Array-based techniques for fingerprinting medicinal herbs

    Directory of Open Access Journals (Sweden)

    Xue Charlie

    2011-05-01

    Full Text Available Abstract Poor quality control of medicinal herbs has led to instances of toxicity, poisoning and even deaths. The fundamental step in quality control of herbal medicine is accurate identification of herbs. Array-based techniques have recently been adapted to authenticate or identify herbal plants. This article reviews the current array-based techniques, eg oligonucleotides microarrays, gene-based probe microarrays, Suppression Subtractive Hybridization (SSH-based arrays, Diversity Array Technology (DArT and Subtracted Diversity Array (SDA. We further compare these techniques according to important parameters such as markers, polymorphism rates, restriction enzymes and sample type. The applicability of the array-based methods for fingerprinting depends on the availability of genomics and genetics of the species to be fingerprinted. For the species with few genome sequence information but high polymorphism rates, SDA techniques are particularly recommended because they require less labour and lower material cost.

  12. Comparative genomic analyses in Asparagus.

    Science.gov (United States)

    Kuhl, Joseph C; Havey, Michael J; Martin, William J; Cheung, Foo; Yuan, Qiaoping; Landherr, Lena; Hu, Yi; Leebens-Mack, James; Town, Christopher D; Sink, Kenneth C

    2005-12-01

    Garden asparagus (Asparagus officinalis L.) belongs to the monocot family Asparagaceae in the order Asparagales. Onion (Allium cepa L.) and Asparagus officinalis are 2 of the most economically important plants of the core Asparagales, a well supported monophyletic group within the Asparagales. Coding regions in onion have lower GC contents than the grasses. We compared the GC content of 3374 unique expressed sequence tags (ESTs) from A. officinalis with Lycoris longituba and onion (both members of the core Asparagales), Acorus americanus (sister to all other monocots), the grasses, and Arabidopsis. Although ESTs in A. officinalis and Acorus had a higher average GC content than Arabidopsis, Lycoris, and onion, all were clearly lower than the grasses. The Asparagaceae have the smallest nuclear genomes among all plants in the core Asparagales, which typically have huge genomes. Within the Asparagaceae, European Asparagus species have approximately twice the nuclear DNA of that of southern African Asparagus species. We cloned and sequenced 20 genomic amplicons from European A. officinalis and the southern African species Asparagus plumosus and observed no clear evidence for a recent genome doubling in A. officinalis relative to A. plumosus. These results indicate that members of the genus Asparagus with smaller genomes may be useful genomic models for plants in the core Asparagales. PMID:16391674

  13. Enhancer Identification through Comparative Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Visel, Axel; Bristow, James; Pennacchio, Len A.

    2006-10-01

    With the availability of genomic sequence from numerousvertebrates, a paradigm shift has occurred in the identification ofdistant-acting gene regulatory elements. In contrast to traditionalgene-centric studies in which investigators randomly scanned genomicfragments that flank genes of interest in functional assays, the modernapproach begins electronically with publicly available comparativesequence datasets that provide investigators with prioritized lists ofputative functional sequences based on their evolutionary conservation.However, although a large number of tools and resources are nowavailable, application of comparative genomic approaches remains far fromtrivial. In particular, it requires users to dynamically consider thespecies and methods for comparison depending on the specific biologicalquestion under investigation. While there is currently no single generalrule to this end, it is clear that when applied appropriately,comparative genomic approaches exponentially increase our power ingenerating biological hypotheses for subsequent experimentaltesting.

  14. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2003-01-01

    This thesis described a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene trasnfer, even to the extent that we can use the gene content to make genome phylogenies. Usi

  15. Comparative genomics of Lactobacillus and other LAB

    DEFF Research Database (Denmark)

    Wassenaar, Trudy M.; Lukjancenko, Oksana

    2014-01-01

    The genomes of 66 LABs, belonging to five different genera, were compared for genome size and gene content. The analyzed genomes included 37 Lactobacillus genomes of 17 species, six Lactococcus lactis genomes, four Leuconostoc genomes of three species, six Streptococcus genomes of two species...... that of the others, with the two Streptococcus species having the shortest genomes. The widest distribution in genome content was observed for Lactobacillus. The number of tRNA and rRNA gene copies varied considerably, with exceptional high numbers observed for Lb. delbrueckii, while these numbers were relatively...... high for Lb. sanfransiscensis and Lb. salivarius, with respect to their moderate gene size. The phylogenetic relationship of the 16S ribosomal RNA genes of these genomes was established and pan- and core genomes were defined for each genus. In addition, core genome analysis was performed on all food...

  16. The kangaroo genome: Leaps and bounds in comparative genomics

    OpenAIRE

    Wakefield, Matthew J.; Graves, Jennifer A. Marshall.

    2003-01-01

    The kangaroo genome is a rich and unique resource for comparative genomics. Marsupial genetics and cytology have made significant contributions to the understanding of gene function and evolution, and increasing the availability of kangaroo DNA sequence information would provide these benefits on a genomic scale. Here we summarize the contributions from cytogenetic and genetic studies of marsupials, describe the genomic resources currently available and those being developed, and explore the ...

  17. Cocoa/Cotton Comparative Genomics

    Science.gov (United States)

    With genome sequence from two members of the Malvaceae family recently made available, we are exploring syntenic relationships, gene content, and evolutionary trajectories between the cacao and cotton genomes. An assembly of cacao (Theobroma cacao) using Illumina and 454 sequence technology yielded ...

  18. A High-Resolution SNP Array-Based Linkage Map Anchors a New Domestic Cat Draft Genome Assembly and Provides Detailed Patterns of Recombination.

    Science.gov (United States)

    Li, Gang; Hillier, LaDeana W; Grahn, Robert A; Zimin, Aleksey V; David, Victor A; Menotti-Raymond, Marilyn; Middleton, Rondo; Hannah, Steven; Hendrickson, Sher; Makunin, Alex; O'Brien, Stephen J; Minx, Pat; Wilson, Richard K; Lyons, Leslie A; Warren, Wesley C; Murphy, William J

    2016-01-01

    High-resolution genetic and physical maps are invaluable tools for building accurate genome assemblies, and interpreting results of genome-wide association studies (GWAS). Previous genetic and physical maps anchored good quality draft assemblies of the domestic cat genome, enabling the discovery of numerous genes underlying hereditary disease and phenotypes of interest to the biomedical science and breeding communities. However, these maps lacked sufficient marker density to order thousands of shorter scaffolds in earlier assemblies, which instead relied heavily on comparative mapping with related species. A high-resolution map would aid in validating and ordering chromosome scaffolds from existing and new genome assemblies. Here, we describe a high-resolution genetic linkage map of the domestic cat genome based on genotyping 453 domestic cats from several multi-generational pedigrees on the Illumina 63K SNP array. The final maps include 58,055 SNP markers placed relative to 6637 markers with unique positions, distributed across all autosomes and the X chromosome. Our final sex-averaged maps span a total autosomal length of 4464 cM, the longest described linkage map for any mammal, confirming length estimates from a previous microsatellite-based map. The linkage map was used to order and orient the scaffolds from a substantially more contiguous domestic cat genome assembly (Felis catus v8.0), which incorporated ∼20 × coverage of Illumina fragment reads. The new genome assembly shows substantial improvements in contiguity, with a nearly fourfold increase in N50 scaffold size to 18 Mb. We use this map to report probable structural errors in previous maps and assemblies, and to describe features of the recombination landscape, including a massive (∼50 Mb) recombination desert (of virtually zero recombination) on the X chromosome that parallels a similar desert on the porcine X chromosome in both size and physical location. PMID:27172201

  19. [Research proceedings on primate comparative genomics].

    Science.gov (United States)

    Liao, Cheng-Hong; Su, Bing

    2012-02-01

    With the accomplishment of genome sequencing of human, chimpanzee and other primates, there has been a great amount of primate genome information accumulated. Primate comparative genomics has become a new research field at current genome era. In this article, we reviewed recent progress in phylogeny, genome structure and gene expression of human and nonhuman primates, and we elaborated the major biological differences among human, chimpanzee and other non-human primate species, which is informative in revealing the mechanism of human evolution. PMID:22345018

  20. Comparative Reannotation of 21 Aspergillus Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Salamov, Asaf; Riley, Robert; Kuo, Alan; Grigoriev, Igor

    2013-03-08

    We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain some errors of different nature, e.g. missing genes, incorrect exon-intron structures, 'chimeras', which fuse 2 or more real genes or alternatively splitting some real genes into 2 or more models. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models, the one which most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models ( ~;;2percent per genome), supported by comparative analysis, additionally correcting ~;;18percent of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~;;3percent increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.

  1. Comparative genomics of vertebrate Fox cluster loci

    Directory of Open Access Journals (Sweden)

    Shimeld Sebastian M

    2006-10-01

    Full Text Available Abstract Background Vertebrate genomes contain numerous duplicate genes, many of which are organised into paralagous regions indicating duplication of linked groups of genes. Comparison of genomic organisation in different lineages can often allow the evolutionary history of such regions to be traced. A classic example of this is the Hox genes, where the presence of a single continuous Hox cluster in amphioxus and four vertebrate clusters has allowed the genomic evolution of this region to be established. Fox transcription factors of the C, F, L1 and Q1 classes are also organised in clusters in both amphioxus and humans. However in contrast to the Hox genes, only two clusters of paralogous Fox genes have so far been identified in the Human genome and the organisation in other vertebrates is unknown. Results To uncover the evolutionary history of the Fox clusters, we report on the comparative genomics of these loci. We demonstrate two further paralogous regions in the Human genome, and identify orthologous regions in mammalian, chicken, frog and teleost genomes, timing the duplications to before the separation of the actinopterygian and sarcopterygian lineages. An additional Fox class, FoxS, was also found to reside in this duplicated genomic region. Conclusion Comparison of loci identifies the pattern of gene duplication, loss and cluster break up through multiple lineages, and suggests FoxS1 is a likely remnant of Fox cluster duplication.

  2. Concurrent array-based queue

    Energy Technology Data Exchange (ETDEWEB)

    Heidelberger, Philip; Steinmacher-Burow, Burkhard

    2015-01-06

    According to one embodiment, a method for implementing an array-based queue in memory of a memory system that includes a controller includes configuring, in the memory, metadata of the array-based queue. The configuring comprises defining, in metadata, an array start location in the memory for the array-based queue, defining, in the metadata, an array size for the array-based queue, defining, in the metadata, a queue top for the array-based queue and defining, in the metadata, a queue bottom for the array-based queue. The method also includes the controller serving a request for an operation on the queue, the request providing the location in the memory of the metadata of the queue.

  3. A High-Resolution SNP Array-Based Linkage Map Anchors a New Domestic Cat Draft Genome Assembly and Provides Detailed Patterns of Recombination

    OpenAIRE

    Gang Li; Hillier, LaDeana W; Grahn, Robert A.; Zimin, Aleksey V; David, Victor A.; Marilyn Menotti-Raymond; Rondo Middleton; Steven Hannah; Sher Hendrickson; Alex Makunin; O’Brien, Stephen J; Pat Minx; Wilson, Richard K.; Lyons, Leslie A.; Warren, Wesley C.

    2016-01-01

    High-resolution genetic and physical maps are invaluable tools for building accurate genome assemblies, and interpreting results of genome-wide association studies (GWAS). Previous genetic and physical maps anchored good quality draft assemblies of the domestic cat genome, enabling the discovery of numerous genes underlying hereditary disease and phenotypes of interest to the biomedical science and breeding communities. However, these maps lacked sufficient marker density to order thousands o...

  4. Sequencing and comparing whole mitochondrial genomes ofanimals

    Energy Technology Data Exchange (ETDEWEB)

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning,sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  5. Comparative genomics and transcriptomics of Propionibacterium acnes.

    Directory of Open Access Journals (Sweden)

    Elzbieta Brzuszkiewicz

    Full Text Available The anaerobic gram-positive bacterium Propionibacterium acnes is a human skin commensal that is occasionally associated with inflammatory diseases. Recent work has indicated that evolutionary distinct lineages of P. acnes play etiologic roles in disease while others are associated with maintenance of skin homeostasis. To shed light on the molecular basis for differential strain properties, we carried out genomic and transcriptomic analysis of distinct P. acnes strains. We sequenced the genome of the P. acnes strain 266, a type I-1a strain. Comparative genome analysis of strain 266 and four other P. acnes strains revealed that overall genome plasticity is relatively low; however, a number of island-like genomic regions, encoding a variety of putative virulence-associated and fitness traits differ between phylotypes, as judged from PCR analysis of a collection of P. acnes strains. Comparative transcriptome analysis of strains KPA171202 (type I-2 and 266 during exponential growth revealed inter-strain differences in gene expression of transport systems and metabolic pathways. In addition, transcript levels of genes encoding possible virulence factors such as dermatan-sulphate adhesin, polyunsaturated fatty acid isomerase, iron acquisition protein HtaA and lipase GehA were upregulated in strain 266. We investigated differential gene expression during exponential and stationary growth phases. Genes encoding components of the energy-conserving respiratory chain as well as secreted and virulence-associated factors were transcribed during the exponential phase, while the stationary growth phase was characterized by upregulation of genes involved in stress responses and amino acid metabolism. Our data highlight the genomic basis for strain diversity and identify, for the first time, the actively transcribed part of the genome, underlining the important role growth status plays in the inflammation-inducing activity of P. acnes. We argue that the disease

  6. VISTA - computational tools for comparative genomics

    Energy Technology Data Exchange (ETDEWEB)

    Frazer, Kelly A.; Pachter, Lior; Poliakov, Alexander; Rubin,Edward M.; Dubchak, Inna

    2004-01-01

    Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/VISTA/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, submit their own sequences of interest to several VISTA servers for various types of comparative analysis, and obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate capabilities of the VISTA site by the analysis of a 180 kilobase (kb) interval on human chromosome 5 that encodes for the kinesin family member3A (KIF3A) protein.

  7. Comparative genomics of Shiga toxin encoding bacteriophages

    Directory of Open Access Journals (Sweden)

    Smith Darren L

    2012-07-01

    Full Text Available Abstract Background Stx bacteriophages are responsible for driving the dissemination of Stx toxin genes (stx across their bacterial host range. Lysogens carrying Stx phages can cause severe, life-threatening disease and Stx toxin is an integral virulence factor. The Stx-bacteriophage vB_EcoP-24B, commonly referred to as Ф24B, is capable of multiply infecting a single bacterial host cell at a high frequency, with secondary infection increasing the rate at which subsequent bacteriophage infections can occur. This is biologically unusual, therefore determining the genomic content and context of Ф24B compared to other lambdoid Stx phages is important to understanding the factors controlling this phenomenon and determining whether they occur in other Stx phages. Results The genome of the Stx2 encoding phage, Ф24B was sequenced and annotated. The genomic organisation and general features are similar to other sequenced Stx bacteriophages induced from Enterohaemorrhagic Escherichia coli (EHEC, however Ф24B possesses significant regions of heterogeneity, with implications for phage biology and behaviour. The Ф24B genome was compared to other sequenced Stx phages and the archetypal lambdoid phage, lambda, using the Circos genome comparison tool and a PCR-based multi-loci comparison system. Conclusions The data support the hypothesis that Stx phages are mosaic, and recombination events between the host, phages and their remnants within the same infected bacterial cell will continue to drive the evolution of Stx phage variants and the subsequent dissemination of shigatoxigenic potential.

  8. Diagnosis and Prognostication of Ductal Adenocarcinomas of the Pancreas Based on Genome-Wide DNA Methylation Profiling by Bacterial Artificial Chromosome Array-Based Methylated CpG Island Amplification

    Directory of Open Access Journals (Sweden)

    Masahiro Gotoh

    2011-01-01

    Full Text Available To establish diagnostic criteria for ductal adenocarcinomas of the pancreas (PCs, bacterial artificial chromosome (BAC array-based methylated CpG island amplification was performed using 139 tissue samples. Twelve BAC clones, for which DNA methylation status was able to discriminate cancerous tissue (T from noncancerous pancreatic tissue in the learning cohort with a specificity of 100%, were identified. Using criteria that combined the 12 BAC clones, T-samples were diagnosed as cancers with 100% sensitivity and specificity in both the learning and validation cohorts. DNA methylation status on 11 of the BAC clones, which was able to discriminate patients showing early relapse from those with no relapse in the learning cohort with 100% specificity, was correlated with the recurrence-free and overall survival rates in the validation cohort and was an independent prognostic factor by multivariate analysis. Genome-wide DNA methylation profiling may provide optimal diagnostic markers and prognostic indicators for patients with PCs.

  9. Comparative genomics of chondrichthyan Hoxa clusters

    Directory of Open Access Journals (Sweden)

    Zhong Ying-Fu

    2009-09-01

    Full Text Available Abstract Background The chondrichthyan or cartilaginous fish (chimeras, sharks, skates and rays occupy an important phylogenetic position as the sister group to all other jawed vertebrates and as an early lineage to diverge from the vertebrate lineage following two whole genome duplication events in vertebrate evolution. There have been few comparative genomic analyses incorporating data from chondrichthyan fish and none comparing genomic information from within the group. We have sequenced the complete Hoxa cluster of the Little Skate (Leucoraja erinacea and compared to the published Hoxa cluster of the Horn Shark (Heterodontus francisci and to available data from the Elephant Shark (Callorhinchus milii genome project. Results A BAC clone containing the full Little Skate Hoxa cluster was fully sequenced and assembled. Analyses of coding sequences and conserved non-coding elements reveal a strikingly high level of conservation across the cartilaginous fish, with twenty ultraconserved elements (100%,100 bp found between Skate and Horn Shark, compared to three between human and marsupials. We have also identified novel potential non-coding RNAs in the Skate BAC clone, some of which are conserved to other species. Conclusion We find that the Little Skate Hoxa cluster is remarkably similar to the previously published Horn Shark Hoxa cluster with respect to sequence identity, gene size and intergenic distance despite over 180 million years of separation between the two lineages. We suggest that the genomes of cartilaginous fish are more highly conserved than those of tetrapods or teleost fish and so are more likely to have retained ancestral non-coding elements. While useful for isolating homologous DNA, this complicates bioinformatic approaches to identify chondrichthyan-specific non-coding DNA elements

  10. Genome-wide comparison of paired fresh frozen and formalin-fixed paraffin-embedded gliomas by custom BAC and oligonucleotide array comparative genomic hybridization: facilitating analysis of archival gliomas

    OpenAIRE

    Mohapatra, Gayatry; Engler, David A.; Starbuck, Kristen D.; Kim, James C.; Bernay, Derek C.; Scangas, George A.; Rousseau, Audrey; Batchelor, Tracy T.; Betensky, Rebecca A.; Louis, David N.

    2010-01-01

    Molecular genetic analysis of cancer is rapidly evolving as a result of improvement in genomic technologies and the growing applicability of such analyses to clinical oncology. Array based comparative genomic hybridization (aCGH) is a powerful tool for detecting DNA copy number alterations (CNA), particularly in solid tumors, and has been applied to the study of malignant gliomas. In the clinical setting, however, gliomas are often sampled by small biopsies and thus formalin-fixed paraffin-em...

  11. Comparative genomics of brain size evolution

    Directory of Open Access Journals (Sweden)

    Wolfgang Enard

    2014-05-01

    Full Text Available Which genetic changes took place during mammalian, primate and human evolution to build a larger brain? To answer this question, one has to correlate genetic changes with brain size changes across a phylogeny. Such a comparative genomics approach provides unique information to better understand brain evolution and brain development. However, its statistical power is limited for example due to the limited number of species, the presumably complex genetics of brain size evolution and the large search space of mammalian genomes. Hence, it is crucial to add functional information, for example by limiting the search space to genes and regulatory elements known to play a role in the relevant cell types during brain development. Similarly, it is crucial to experimentally follow up on hypotheses generated by such a comparative approach. Recent progress in understanding the molecular and cellular mechanisms of mammalian brain development, in genome sequencing and in genome editing, promises to make a close integration of evolutionary and experimental methods a fruitful approach to better understand the genetics of mammalian brain size evolution.

  12. Comparative genomics of brain size evolution

    OpenAIRE

    Enard, Wolfgang

    2014-01-01

    Which genetic changes took place during mammalian, primate and human evolution to build a larger brain? To answer this question, one has to correlate genetic changes with brain size changes across a phylogeny. Such a comparative genomics approach provides unique information to better understand brain evolution and brain development. However, its statistical power is limited for example due to the limited number of species, the presumably complex genetics of brain size evolution and the large ...

  13. Comparative genome analysis of Basidiomycete fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Henrissat, Bernard; Nagy, Laszlo; Brown, Daren; Held, Benjamin; Baker, Scott; Blanchette, Robert; Boussau, Bastien; Doty, Sharon L.; Fagnan, Kirsten; Floudas, Dimitris; Levasseur, Anthony; Manning, Gerard; Martin, Francis; Morin, Emmanuelle; Otillar, Robert; Pisabarro, Antonio; Walton, Jonathan; Wolfe, Ken; Hibbett, David; Grigoriev, Igor

    2013-08-07

    Fungi of the phylum Basidiomycota (basidiomycetes), make up some 37percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs including the majority of wood decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum we compared the genomes of 35 basidiomycetes including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48percent of basidiomycete proteins are unique to the phylum with nearly half of those (22percent) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically-informed PCA analysis of wood decay genes, we predict that that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.

  14. Comparative genomics of biotechnologically important yeasts.

    Science.gov (United States)

    Riley, Robert; Haridas, Sajeet; Wolfe, Kenneth H; Lopes, Mariana R; Hittinger, Chris Todd; Göker, Markus; Salamov, Asaf A; Wisecaver, Jennifer H; Long, Tanya M; Calvey, Christopher H; Aerts, Andrea L; Barry, Kerrie W; Choi, Cindy; Clum, Alicia; Coughlan, Aisling Y; Deshpande, Shweta; Douglass, Alexander P; Hanson, Sara J; Klenk, Hans-Peter; LaButti, Kurt M; Lapidus, Alla; Lindquist, Erika A; Lipzen, Anna M; Meier-Kolthoff, Jan P; Ohm, Robin A; Otillar, Robert P; Pangilinan, Jasmyn L; Peng, Yi; Rokas, Antonis; Rosa, Carlos A; Scheuner, Carmen; Sibirny, Andriy A; Slot, Jason C; Stielow, J Benjamin; Sun, Hui; Kurtzman, Cletus P; Blackwell, Meredith; Grigoriev, Igor V; Jeffries, Thomas W

    2016-08-30

    Ascomycete yeasts are metabolically diverse, with great potential for biotechnology. Here, we report the comparative genome analysis of 29 taxonomically and biotechnologically important yeasts, including 16 newly sequenced. We identify a genetic code change, CUG-Ala, in Pachysolen tannophilus in the clade sister to the known CUG-Ser clade. Our well-resolved yeast phylogeny shows that some traits, such as methylotrophy, are restricted to single clades, whereas others, such as l-rhamnose utilization, have patchy phylogenetic distributions. Gene clusters, with variable organization and distribution, encode many pathways of interest. Genomics can predict some biochemical traits precisely, but the genomic basis of others, such as xylose utilization, remains unresolved. Our data also provide insight into early evolution of ascomycetes. We document the loss of H3K9me2/3 heterochromatin, the origin of ascomycete mating-type switching, and panascomycete synteny at the MAT locus. These data and analyses will facilitate the engineering of efficient biosynthetic and degradative pathways and gateways for genomic manipulation. PMID:27535936

  15. Genomic alterations detected by comparative genomic hybridization in ovarian endometriomas

    Directory of Open Access Journals (Sweden)

    L.C. Veiga-Castelli

    2010-08-01

    Full Text Available Endometriosis is a complex and multifactorial disease. Chromosomal imbalance screening in endometriotic tissue can be used to detect hot-spot regions in the search for a possible genetic marker for endometriosis. The objective of the present study was to detect chromosomal imbalances by comparative genomic hybridization (CGH in ectopic tissue samples from ovarian endometriomas and eutopic tissue from the same patients. We evaluated 10 ovarian endometriotic tissues and 10 eutopic endometrial tissues by metaphase CGH. CGH was prepared with normal and test DNA enzymatically digested, ligated to adaptors and amplified by PCR. A second PCR was performed for DNA labeling. Equal amounts of both normal and test-labeled DNA were hybridized in human normal metaphases. The Isis FISH Imaging System V 5.0 software was used for chromosome analysis. In both eutopic and ectopic groups, 4/10 samples presented chromosomal alterations, mainly chromosomal gains. CGH identified 11q12.3-q13.1, 17p11.1-p12, 17q25.3-qter, and 19p as critical regions. Genomic imbalances in 11q, 17p, 17q, and 19p were detected in normal eutopic and/or ectopic endometrium from women with ovarian endometriosis. These regions contain genes such as POLR2G, MXRA7 and UBA52 involved in biological processes that may lead to the establishment and maintenance of endometriotic implants. This genomic imbalance may affect genes in which dysregulation impacts both eutopic and ectopic endometrium.

  16. Comparative genomics and evolution of eukaryotic phospholipidbiosynthesis

    Energy Technology Data Exchange (ETDEWEB)

    Lykidis, Athanasios

    2006-12-01

    Phospholipid biosynthetic enzymes produce diverse molecular structures and are often present in multiple forms encoded by different genes. This work utilizes comparative genomics and phylogenetics for exploring the distribution, structure and evolution of phospholipid biosynthetic genes and pathways in 26 eukaryotic genomes. Although the basic structure of the pathways was formed early in eukaryotic evolution, the emerging picture indicates that individual enzyme families followed unique evolutionary courses. For example, choline and ethanolamine kinases and cytidylyltransferases emerged in ancestral eukaryotes, whereas, multiple forms of the corresponding phosphatidyltransferases evolved mainly in a lineage specific manner. Furthermore, several unicellular eukaryotes maintain bacterial-type enzymes and reactions for the synthesis of phosphatidylglycerol and cardiolipin. Also, base-exchange phosphatidylserine synthases are widespread and ancestral enzymes. The multiplicity of phospholipid biosynthetic enzymes has been largely generated by gene expansion in a lineage specific manner. Thus, these observations suggest that phospholipid biosynthesis has been an actively evolving system. Finally, comparative genomic analysis indicates the existence of novel phosphatidyltransferases and provides a candidate for the uncharacterized eukaryotic phosphatidylglycerol phosphate phosphatase.

  17. Genome-wide array comparative genomic hybridization analysis reveals distinct amplifications in osteosarcoma

    International Nuclear Information System (INIS)

    Osteosarcoma is a highly malignant bone neoplasm of children and young adults. It is characterized by extremely complex karyotypes and high frequency of chromosomal amplifications. Currently, only the histological response (degree of necrosis) to therapy represent gold standard for predicting the outcome in a patient with non-metastatic osteosarcoma at the time of definitive surgery. Patients with lower degree of necrosis have a higher risk of relapse and poor outcome even after chemotherapy and complete resection of the primary tumor. Therefore, a better understanding of the underlying molecular genetic events leading to tumor initiation and progression could result in the identification of potential diagnostic and therapeutic targets. We used a genome-wide screening method – array based comparative genomic hybridization (array-CGH) to identify DNA copy number changes in 48 patients with osteosarcoma. We applied fluorescence in situ hybridization (FISH) to validate some of amplified clones in this study. Clones showing gains (79%) were more frequent than losses (66%). High-level amplifications and homozygous deletions constitute 28.6% and 3.8% of tumor genome respectively. High-level amplifications were present in 238 clones, of which about 37% of them showed recurrent amplification. Most frequently amplified clones were mapped to 1p36.32 (PRDM16), 6p21.1 (CDC5L, HSPCB, NFKBIE), 8q24, 12q14.3 (IFNG), 16p13 (MGRN1), and 17p11.2 (PMP22 MYCD, SOX1,ELAC27). We validated some of the amplified clones by FISH from 6p12-p21, 8q23-q24, and 17p11.2 amplicons. Homozygous deletions were noted for 32 clones and only 7 clones showed in more than one case. These 7 clones were mapped to 1q25.1 (4 cases), 3p14.1 (4 cases), 13q12.2 (2 cases), 4p15.1 (2 cases), 6q12 (2 cases), 6q12 (2 cases) and 6q16.3 (2 cases). This study clearly demonstrates the utility of array CGH in defining high-resolution DNA copy number changes and refining amplifications. The resolution of array CGH

  18. De novo likelihood-based measures for comparing genome assemblies

    OpenAIRE

    Ghodsi, Mohammadreza; Hill, Christopher M; Astrovskaya, Irina; Lin, Henry; Sommer, Dan D; Koren, Sergey; Pop, Mihai

    2013-01-01

    Background The current revolution in genomics has been made possible by software tools called genome assemblers, which stitch together DNA fragments “read” by sequencing machines into complete or nearly complete genome sequences. Despite decades of research in this field and the development of dozens of genome assemblers, assessing and comparing the quality of assembled genome sequences still relies on the availability of independently determined standards, such as manually curated genome seq...

  19. Comparative Genomics of Bifidobacterium, Lactobacillus and Related Probiotic Genera

    OpenAIRE

    Lukjancenko, Oksana; Ussery, David W.; Wassenaar, Trudy M

    2011-01-01

    Six bacterial genera containing species commonly used as probiotics for human consumption or starter cultures for food fermentation were compared and contrasted, based on publicly available complete genome sequences. The analysis included 19 Bifidobacterium genomes, 21 Lactobacillus genomes, 4 Lactococcus and 3 Leuconostoc genomes, as well as a selection of Enterococcus (11) and Streptococcus (23) genomes. The latter two genera included genomes from probiotic or commensal as well as pathogeni...

  20. Comparative Genomics of Ten Solanaceous Plastomes

    Directory of Open Access Journals (Sweden)

    Harpreet Kaur

    2014-01-01

    Full Text Available Availability of complete plastid genomes of ten solanaceous species, Atropa belladonna, Capsicum annuum, Datura stramonium, Nicotiana sylvestris, Nicotiana tabacum, Nicotiana tomentosiformis, Nicotiana undulata, Solanum bulbocastanum, Solanum lycopersicum, and Solanum tuberosum provided us with an opportunity to conduct their in silico comparative analysis in depth. The size of complete chloroplast genomes and LSC and SSC regions of three species of Solanum is comparatively smaller than that of any other species studied till date (exception: SSC region of A. belladonna. AT content of coding regions was found to be less than noncoding regions. A duplicate copy of trnH gene in C. annuum and two alternative tRNA genes for proline in D. stramonium were observed for the first time in this analysis. Further, homology search revealed the presence of rps19 pseudogene and infA genes in A. belladonna and D. stramonium, a region identical to rps19 pseudogene in C. annum and orthologues of sprA gene in another six species. Among the eighteen intron-containing genes, 3 genes have two introns and 15 genes have one intron. The longest insertion was found in accD gene in C. annuum. Phylogenetic analysis using concatenated protein coding sequences gave two clades, one for Nicotiana species and another for Solanum, Capsicum, Atropa, and Datura.

  1. Comparative genomics of bifidobacterium, lactobacillus and related probiotic genera

    DEFF Research Database (Denmark)

    Lukjancenko, Oksana; Ussery, David; Wassenaar, Trudy M.

    2012-01-01

    Six bacterial genera containing species commonly used as probiotics for human consumption or starter cultures for food fermentation were compared and contrasted, based on publicly available complete genome sequences. The analysis included 19 Bifidobacterium genomes, 21 Lactobacillus genomes, 4...... Lactococcus and 3 Leuconostoc genomes, as well as a selection of Enterococcus (11) and Streptococcus (23) genomes. The latter two genera included genomes from probiotic or commensal as well as pathogenic organisms to investigate if their non-pathogenic members shared more genes with the other probiotic...... core genome of each genus were compared. In addition, it was investigated whether pathogenic genomes contain different COG classes compared to the probiotic or fermentative organisms, again comparing their pan- and core genomes. The obtained results were compared with published data from the literature...

  2. Comparative genomic data of the Avian Phylogenomics Project

    DEFF Research Database (Denmark)

    Zhang, Guojie; Li, Bo; Li, Cai;

    2014-01-01

    in phylogenomics and comparative genomics. FINDINGS: The 38 bird genomes were sequenced using the Illumina HiSeq 2000 platform and assembled using a whole genome shotgun strategy. The 48 genomes were categorized into two groups according to the N50 scaffold size of the assemblies: a high depth group comprising 23...

  3. The Latest Buzz in Comparative Genomics

    OpenAIRE

    Kulathinal, Rob J.; Hartl, Daniel L.

    2005-01-01

    A second species of fruit fly has just been added to the growing list of organisms with complete and annotated genome sequences. The publication of the Drosophila pseudoobscura sequence provides a snapshot of how genomes have changed over tens of millions of years and sets the stage for the analysis of more fly genomes.

  4. Comparative genomics of emerging human ehrlichiosis agents.

    Directory of Open Access Journals (Sweden)

    Julie C Dunning Hotopp

    2006-02-01

    Full Text Available Anaplasma (formerly Ehrlichia phagocytophilum, Ehrlichia chaffeensis, and Neorickettsia (formerly Ehrlichia sennetsu are intracellular vector-borne pathogens that cause human ehrlichiosis, an emerging infectious disease. We present the complete genome sequences of these organisms along with comparisons to other organisms in the Rickettsiales order. Ehrlichia spp. and Anaplasma spp. display a unique large expansion of immunodominant outer membrane proteins facilitating antigenic variation. All Rickettsiales have a diminished ability to synthesize amino acids compared to their closest free-living relatives. Unlike members of the Rickettsiaceae family, these pathogenic Anaplasmataceae are capable of making all major vitamins, cofactors, and nucleotides, which could confer a beneficial role in the invertebrate vector or the vertebrate host. Further analysis identified proteins potentially involved in vacuole confinement of the Anaplasmataceae, a life cycle involving a hematophagous vector, vertebrate pathogenesis, human pathogenesis, and lack of transovarial transmission. These discoveries provide significant insights into the biology of these obligate intracellular pathogens.

  5. Comparative genomics reveals insights into avian genome evolution and adaptation

    DEFF Research Database (Denmark)

    Zhang, Guojie; Li, Cai; Li, Qiye;

    2014-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size......, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this...... pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits....

  6. Evaluation of Apis mellifera syriaca Levant region honeybee conservation using comparative genome hybridization.

    Science.gov (United States)

    Haddad, Nizar Jamal; Batainh, Ahmed; Saini, Deepti; Migdadi, Osama; Aiyaz, Mohamed; Manchiganti, Rushiraj; Krishnamurthy, Venkatesh; Al-Shagour, Banan; Brake, Mohammad; Bourgeois, Lelania; De Guzman, Lilia; Rinderer, Thomas; Hamouri, Zayed Mahoud

    2016-06-01

    Apis mellifera syriaca is the native honeybee subspecies of Jordan and much of the Levant region. It expresses behavioral adaptations to a regional climate with very high temperatures, nectar dearth in summer, attacks of the Oriental wasp and is resistant to Varroa mites. The A. m. syriaca control reference sample (CRS) in this study was originally collected and stored since 2001 from "Wadi Ben Hammad", a remote valley in the southern region of Jordan. Morphometric and mitochondrial DNA markers of these honeybees had shown highest similarity to reference A. m. syriaca samples collected in 1952 by Brother Adam of samples collected from the Middle East. Samples 1-5 were collected from the National Center for Agricultural Research and Extension breeding apiary which was established for the conservation of A. m. syriaca. Our objective was to determine the success of an A. m. syriaca honey bee conservation program using genomic information from an array-based comparative genomic hybridization platform to evaluate genetic similarities to a historic reference collection (CRS). Our results had shown insignificant genomic differences between the current population in the conservation program and the CRS indicated that program is successfully conserving A. m. syriaca. Functional genomic variations were identified which are useful for conservation monitoring and may be useful for breeding programs designed to improve locally adapted strains of A. m. syriaca. PMID:27010806

  7. Comparative genomics of the lactic acid bacteria

    Science.gov (United States)

    Lactic acid-producing bacteria are associated with various plant and animal niches and play a key role in the production of fermented foods and beverages. We report nine genome sequences representing the phylogenetic and functional diversity of these bacteria. The small genomes of lactic acid bacter...

  8. Comparative analysis of plant genome architecture

    International Nuclear Information System (INIS)

    It is clear that there are close, family wide similarities between different crop species in both the genes (often with only allelic differences) and the gene order along chromosomes. However, there are extensive differences in both the type and organization of repetitive DNA, even between related species, which may be of importance for genome changes and the exchange of genes in both long (evolutionary) and short (plant breeding) time-scales. There is additional non-genic information in a genome, related to the methylation and coiling of sequences, and to the three-dimensional organization of these sequences in the nucleus. Highly repetitive DNA makes up the majority of most plant genomes. Some sequences, such as microsatellites, are similar in every organism, while other repeat units are specific to one species or a small group of species. Different sequences have characteristic genomic distribution, and most can be identified by their chromosomal distribution. Knowledge of the genome architecture - the organization and the nature of repetitive sequences, and the three-dimensional organization in the interphase nucleus - is likely to be helpful for applied research and plant breeding. There is little knowledge of why repetitive sequences have particular characteristic. Is the three-dimensional architecture of the nucleus important for functions? Do repetitive sequences put coding or regulatory sequences in particular nuclear position? Why are different sequences located at particular sites in the genome? A comprehensive and quantitative model is being constructed of the variable and constant parts of the plant genome. Such integrated models of large scale genome organization may be useful in learning the function of different components of the genome, and in evolutionary studies. Since repetitive DNA changes are frequent, perhaps one can learn more about which manipulations are possible in plant genomes by examining the changes already made between related

  9. Mycobacterial species as case-study of comparative genome analysis

    DEFF Research Database (Denmark)

    Zakham, F.; Belayachi, L.; Ussery, David; Akrim, M.; Benjouad, A.; El Aouad, R.; Ennaji, M. M.

    2011-01-01

    evolutionary events of these species and improving drugs, vaccines, and diagnostics tools for controlling Mycobacterial diseases. In this present study we aim to outline a comparative genome analysis of fourteen Mycobacterial genomes: M. avium subsp. paratuberculosis K—10, M. bovis AF2122/97, M. bovis BCG str...... genomes, GC content, number of genes in different data bases (Genbank, Refseq, and Prodigal). The BLAST matrix of these genomes has been figured to give a lot of information about the similarity between species in a simple scheme. As a result of multiple genome analysis, the pan and core genome have been...

  10. Comparative genomic hybridization in clinical cytogenetics

    Energy Technology Data Exchange (ETDEWEB)

    Bryndorf, T.; Kirchhoff, M.; Rose, H. [and others

    1995-11-01

    We report the results of applying comparative genomic hybridization (CGH) in a cytogenetic service laboratory for (1) determination of the origin of extra and missing chromosomal material in intricate cases of unbalanced aberrations and (2) detection of common prenatal numerical chromosome aberrations. A total of 11 fetal samples were analyzed. Seven cases of complex unbalanced aberrations that could not be identified reliably by conventional cytogenetics were successfully resolved by CGH analysis. CGH results were validated by using FISH with chromosome-specific probes. Four cases representing common prenatal numerical aberrations (trisomy 21, 18, and 13 and monosomy X) were also successfully diagnosed by CGH. We conclude that CGH is a powerful adjunct to traditional cytogenetic techniques that makes it possible to solve clinical cases of intricate unbalanced aberrations in a single hybridization. CGH may also be a useful adjunct to screen for euchromatic involvement in marker chromosomes. Further technical development may render CGH applicable for routine aberration screening. 16 refs., 4 figs., 2 tabs.

  11. Comparative genomic data of the Avian Phylogenomics Project

    OpenAIRE

    Zhang, Guojie; Li, Bo; Li, Cai; Gilbert, M. Thomas P.; Jarvis, Erich D.; Wang, Jun; Avian Genome Consortium

    2014-01-01

    Background The evolutionary relationships of modern birds are among the most challenging to understand in systematic biology and have been debated for centuries. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders, and used the genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomics analyses (Jarvis et al. in press; Zhang et al....

  12. Genome-wide mapping of copy number variation in humans: comparative analysis of high resolution array platforms.

    Directory of Open Access Journals (Sweden)

    Rajini R Haraksingh

    Full Text Available Accurate and efficient genome-wide detection of copy number variants (CNVs is essential for understanding human genomic variation, genome-wide CNV association type studies, cytogenetics research and diagnostics, and independent validation of CNVs identified from sequencing based technologies. Numerous, array-based platforms for CNV detection exist utilizing array Comparative Genome Hybridization (aCGH, Single Nucleotide Polymorphism (SNP genotyping or both. We have quantitatively assessed the abilities of twelve leading genome-wide CNV detection platforms to accurately detect Gold Standard sets of CNVs in the genome of HapMap CEU sample NA12878, and found significant differences in performance. The technologies analyzed were the NimbleGen 4.2 M, 2.1 M and 3×720 K Whole Genome and CNV focused arrays, the Agilent 1×1 M CGH and High Resolution and 2×400 K CNV and SNP+CGH arrays, the Illumina Human Omni1Quad array and the Affymetrix SNP 6.0 array. The Gold Standards used were a 1000 Genomes Project sequencing-based set of 3997 validated CNVs and an ultra high-resolution aCGH-based set of 756 validated CNVs. We found that sensitivity, total number, size range and breakpoint resolution of CNV calls were highest for CNV focused arrays. Our results are important for cost effective CNV detection and validation for both basic and clinical applications.

  13. 3D Genome Tuner: Compare Multiple Circular Genomes in a 3D Context

    Institute of Scientific and Technical Information of China (English)

    Qi Wang; Qun Liang; Xiuqing Zhang

    2009-01-01

    Circular genomes, being the largest proportion of sequenced genomes, play an important role in genome analysis. However, traditional 2D circular map only provides an overview and annotations of genome but does not offer feature-based comparison. For remedying these shortcomings, we developed 3D Genome Tuner, a hybrid of circular map and comparative map tools. Its capability of viewing comparisons between multiple circular maps in a 3D space offers great benefits to the study of comparative genomics. The program is freely available(under an LGPL licence)at http://sourceforge.net/projects/dgenometuner.

  14. Using comparative genomic hybridization to survey genomic sequence divergence across species: a proof-of-concept from Drosophila

    Directory of Open Access Journals (Sweden)

    Kulathinal Rob J

    2010-04-01

    Full Text Available Abstract Background Genome-wide analysis of sequence divergence among species offers profound insights into the evolutionary processes that shape lineages. When full-genome sequencing is not feasible for a broad comparative study, we propose the use of array-based comparative genomic hybridization (aCGH in order to identify orthologous genes with high sequence divergence. Here we discuss experimental design, statistical power, success rate, sources of variation and potential confounding factors. We used a spotted PCR product microarray platform from Drosophila melanogaster to assess sequence divergence on a gene-by-gene basis in three fully sequenced heterologous species (D. sechellia, D. simulans, and D. yakuba. Because complete genome assemblies are available for these species this study presents a powerful test for the use of aCGH as a tool to measure sequence divergence. Results We found a consistent and linear relationship between hybridization ratio and sequence divergence of the sample to the platform species. At higher levels of sequence divergence (D. melanogaster ~84% of features had significantly less hybridization to the array in the heterologous species than the platform species, and thus could be identified as "diverged". At lower levels of divergence (≥ 97% identity, only 13% of genes were identified as diverged. While ~40% of the variation in hybridization ratio can be accounted for by variation in sequence identity of the heterologous sample relative to D. melanogaster, other individual characteristics of the DNA sequences, such as GC content, also contribute to variation in hybridization ratio, as does technical variation. Conclusions Here we demonstrate that aCGH can accurately be used as a proxy to estimate genome-wide divergence, thus providing an efficient way to evaluate how evolutionary processes and genomic architecture can shape species diversity in non-model systems. Given the increased number of species for which

  15. Comparative genomic hybridization: Detection of segmental aneusomies

    Energy Technology Data Exchange (ETDEWEB)

    Cronin, J.E.; Magrane, G.G.; Gray, J.W. [Univ. of California, San Francisco, CA (United States)] [and others

    1994-09-01

    Comparative genomic hybridization (CGH) has been used successfully to detect whole chromosome and segmental aneusomies. However, its sensitivity for detection of segmental aneusomies is still not well known. We present here an analysis of CGH sensitivity with emphasis on detection of abnormalities commonly found during pre-and neo-natal diagnosis. CGH is performed by hybridizing green and red fluorescing test and normal DNA samples, respectively, to normal metaphase spreads and measuring green:red fluorescence ratios along all chromosomes. The ratios are normalized such that 2 copies of a normal chromosome region in the test sample gives a ratio of 1.0. Alterations in test vs. control gene copy number range from 1.5 [trisomy] to 0.5 [monosomy]. Clinical samples analyzed included Wolf Hirschhorn (4p-), Cri du Chat (5p-) and DiGeorge (22q-). In addition, 7 cell lines with chromosome 21 segmental aneusomies were analyzed. These included 3 with terminal duplications, 1 with a terminal deletion, 1 with an interstitial deletion and 2 with interstitial amplifications. The DiGeorge deletion was the only deletion not deleted by CGH. This is not surprising as standard G banding does not routinely detect this 1-2 megabase deletion. The 4p- and 5p- monosomies were detected and breakpoints correctly assigned prospectively. Proximal alterations involving 21q22.11 are unambiguously defined. Specifically, two interstitial aneusomies involving this region are detected. Studies involving late prophase chromosome normal spreads gave identical breakpoints. Thus, analysis of extended chromosomes did not improve the sensitivity of the technique. Taken together, these data suggest that CGH can detect segmental aneusomies greater than 8 megabases in extent. Smaller aneusomies can, at times, be detected. Work is now underway to modify the analysis software to increase sensitivity and to decrease the amount of material needed for analysis.

  16. Initial sequencing and comparative analysis of the mouse genome

    Energy Technology Data Exchange (ETDEWEB)

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  17. Sockeye: A 3D Environment for Comparative Genomics

    OpenAIRE

    Montgomery, Stephen B.; Astakhova, Tamara; Bilenky, Mikhail; Birney, Ewan; Fu, Tony; Hassel, Maik; Melsopp, Craig; Rak, Marcin; Robertson, A. Gordon; Sleumer, Monica; Siddiqui, Asim S.; Jones, Steven J M

    2004-01-01

    Comparative genomics techniques are used in bioinformatics analyses to identify the structural and functional properties of DNA sequences. As the amount of available sequence data steadily increases, the ability to perform large-scale comparative analyses has become increasingly relevant. In addition, the growing complexity of genomic feature annotation means that new approaches to genomic visualization need to be explored. We have developed a Java-based application called Sockeye that uses t...

  18. Characterization of hemizygous deletions in Citrus using array-Comparative Genomic Hybridization and microsynteny comparisons with the poplar genome

    Directory of Open Access Journals (Sweden)

    Usach Antonio

    2008-08-01

    Full Text Available Abstract Background Many fruit-tree species, including relevant Citrus spp varieties exhibit a reproductive biology that impairs breeding and strongly constrains genetic improvements. In citrus, juvenility increases the generation time while sexual sterility, inbreeding depression and self-incompatibility prevent the production of homozygous cultivars. Genomic technology may provide citrus researchers with a new set of tools to address these various restrictions. In this work, we report a valuable genomics-based protocol for the structural analysis of deletion mutations on an heterozygous background. Results Two independent fast neutron mutants of self-incompatible clementine (Citrus clementina Hort. Ex Tan. cv. Clemenules were the subject of the study. Both mutants, named 39B3 and 39E7, were expected to carry DNA deletions in hemizygous dosage. Array-based Comparative Genomic Hybridization (array-CGH using a Citrus cDNA microarray allowed the identification of underrepresented genes in these two mutants. Subsequent comparison of citrus deleted genes with annotated plant genomes, especially poplar, made possible to predict the presence of a large deletion in 39B3 of about 700 kb and at least two deletions of approximately 100 and 500 kb in 39E7. The deletion in 39B3 was further characterized by PCR on available Citrus BACs, which helped us to build a partial physical map of the deletion. Among the deleted genes, ClpC-like gene coding for a putative subunit of a multifunctional chloroplastic protease involved in the regulation of chlorophyll b synthesis was directly related to the mutated phenotype since the mutant showed a reduced chlorophyll a/b ratio in green tissues. Conclusion In this work, we report the use of array-CGH for the successful identification of genes included in a hemizygous deletion induced by fast neutron irradiation on Citrus clementina. The study of gene content and order into the 39B3 deletion also led to the unexpected

  19. Comparative Genomics of Green Sulfur Bacteria

    DEFF Research Database (Denmark)

    Ussery, David; Davenport, C; Tümmler, B

    2010-01-01

    -genome gene family and single gene sequence comparisons yielded similar phylogenetic trees of the sequenced chromosomes indicating a concerted vertical evolution of large gene sets. Chromosomal synteny of genes is not preserved in the phylum Chlorobi. The accessory genome is characterized by anomalous...... oligonucleotide usage and endows the strains with individual features for transport, secretion, cell wall, extracellular constituents, and a few elements of the biosynthetic apparatus. Giant genes are a peculiar feature of the genera Chlorobium and Prosthecochloris. The predicted proteins have a huge molecular...

  20. Analysis of the allohexaploid bread wheat genome (Triticum aestivum) using comparative whole genome shotgun sequencing

    Science.gov (United States)

    The large 17 Gb allopolyploid genome of bread wheat is a major challenge for genome analysis because it is composed of three closely- related and independently maintained genomes, with genes dispersed as small “islands” separated by vast tracts of repetitive DNA. We used a novel comparative genomi...

  1. Whole genome comparative studies between chicken and turkey and their implications for avian genome evolution

    NARCIS (Netherlands)

    Griffin, D.K.; Robertson, L.B.; Tempest, H.G.; Vignal, A.; Fillon, V.; Crooijmans, R.P.M.A.; Groenen, M.A.M.; Deryusheva, S.; Gaginskaya, E.; Carre, W.; Waddington, D.; Talbot, R.; Völker, M.; Masabanda, J.S.; Burt, D.W.

    2008-01-01

    Background Comparative genomics is a powerful means of establishing inter-specific relationships between gene function/location and allows insight into genomic rearrangements, conservation and evolutionary phylogeny. The availability of the complete sequence of the chicken genome has initiated the d

  2. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    OpenAIRE

    Tettelin, Hervé; Masignani, Vega; Cieslewicz, Michael J.; Eisen, Jonathan A.; Peterson, Scott; Wessels, Michael R.; Paulsen, Ian T.; Nelson, Karen E.; Margarit, Immaculada; Read, Timothy D.; Madoff, Lawrence C.; Wolf, Alex M.; Beanan, Maureen J; Brinkac, Lauren M.; Sean C Daugherty

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the other completely sequenced genomes identified genes specific to the streptococci and to S. agalactiae. These in silico analyses, combined with comparative genome hybridization experiments between the ...

  3. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis

    OpenAIRE

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch, Gordon; Liolios, Konstantinos; Grechkin, Yuri

    2005-01-01

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis sub spp israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-...

  4. Comparative genomics of the lactic acid bacteria

    Energy Technology Data Exchange (ETDEWEB)

    Makarova, K.; Slesarev, A.; Wolf, Y.; Sorokin, A.; Mirkin, B.; Koonin, E.; Pavlov, A.; Pavlova, N.; Karamychev, V.; Polouchine, N.; Shakhova, V.; Grigoriev, I.; Lou, Y.; Rokhsar, D.; Lucas, S.; Huang, K.; Goodstein, D. M.; Hawkins, T.; Plengvidhya, V.; Welker, D.; Hughes, J.; Goh, Y.; Benson, A.; Baldwin, K.; Lee, J. -H.; Diaz-Muniz, I.; Dosti, B.; Smeianov, V; Wechter, W.; Barabote, R.; Lorca, G.; Altermann, E.; Barrangou, R.; Ganesan, B.; Xie, Y.; Rawsthorne, H.; Tamir, D.; Parker, C.; Breidt, F.; Broadbent, J.; Hutkins, R.; O' Sullivan, D.; Steele, J.; Unlu, G.; Saier, M.; Klaenhammer, T.; Richardson, P.; Kozyavkin, S.; Weimer, B.; Mills, D.

    2006-06-01

    Lactic acid-producing bacteria are associated with various plant and animal niches and play a key role in the production of fermented foods and beverages. We report nine genome sequences representing the phylogenetic and functional diversity of these bacteria. The small genomes of lactic acid bacteria encode a broad repertoire of transporters for efficient carbon and nitrogen acquisition from the nutritionally rich environments they inhabit and reflect a limited range of biosynthetic capabilities that indicate both prototrophic and auxotrophic strains. Phylogenetic analyses, comparison of gene content across the group, and reconstruction of ancestral gene sets indicate a combination of extensive gene loss and key gene acquisitions via horizontal gene transfer during the coevolution of lactic acid bacteria with their habitats.

  5. Comparative Genome Mapping of Sorghum and Maize

    OpenAIRE

    Whitkus, R; Doebley, J; Lee, M.

    1992-01-01

    Linkage relationships were determined among 85 maize low copy number nuclear DNA probes and seven isozyme loci in an F(2) population derived from a cross of Sorghum bicolor ssp. bicolor X S. bicolor ssp. arundinaceum. Thirteen linkage groups were defined, three more than the 10 chromosomes of sorghum. Use of maize DNA probes to produce the sorghum linkage map allowed us to make several inferences concerning processes involved in the evolutionary divergence of the maize and sorghum genomes. Th...

  6. Comparative genomics using data mining tools

    Indian Academy of Sciences (India)

    Tannistha Nandi; Chandrika B-Rao; Srinivasan Ramachandran

    2002-02-01

    We have analysed the genomes of representatives of three kingdoms of life, namely, archaea, eubacteria and eukaryota using data mining tools based on compositional analyses of the protein sequences. The representatives chosen in this analysis were Methanococcus jannaschii, Haemophilus influenzae and Saccharomyces cerevisiae. We have identified the common and different features between the three genomes in the protein evolution patterns. M. jannaschii has been seen to have a greater number of proteins with more charged amino acids whereas S. cerevisiae has been observed to have a greater number of hydrophilic proteins. Despite the differences in intrinsic compositional characteristics between the proteins from the different genomes we have also identified certain common characteristics. We have carried out exploratory Principal Component Analysis of the multivariate data on the proteins of each organism in an effort to classify the proteins into clusters. Interestingly, we found that most of the proteins in each organism cluster closely together, but there are a few ‘outliers’. We focus on the outliers for the functional investigations, which may aid in revealing any unique features of the biology of the respective organisms.

  7. Comparative genetics and genomics of nematodes: genome structure, development, and lifestyle.

    Science.gov (United States)

    Sommer, Ralf J; Streit, Adrian

    2011-01-01

    Nematodes are found in virtually all habitats on earth. Many of them are parasites of plants and animals, including humans. The free-living nematode, Caenorhabditis elegans, is one of the genetically best-studied model organisms and was the first metazoan whose genome was fully sequenced. In recent years, the draft genome sequences of another six nematodes representing four of the five major clades of nematodes were published. Compared to mammalian genomes, all these genomes are very small. Nevertheless, they contain almost the same number of genes as the human genome. Nematodes are therefore a very attractive system for comparative genetic and genomic studies, with C. elegans as an excellent baseline. Here, we review the efforts that were made to extend genetic analysis to nematodes other than C. elegans, and we compare the seven available nematode genomes. One of the most striking findings is the unexpectedly high incidence of gene acquisition through horizontal gene transfer (HGT). PMID:21721943

  8. The Burkholderia Genome Database: facilitating flexible queries and comparative analyses

    OpenAIRE

    Winsor, Geoffrey L.; Khaira, Bhavjinder; Van Rossum, Thea; Lo, Raymond; Whiteside, Matthew D.; Fiona S.L. Brinkman

    2008-01-01

    Summary: As the genome sequences of multiple strains of a given bacterial species are obtained, more generalized bacterial genome databases may be complemented by databases that are focused on providing more information geared for a distinct bacterial phylogenetic group and its associated research community. The Burkholderia Genome Database represents a model for such a database, providing a powerful, user-friendly search and comparative analysis interface that contains features not found in ...

  9. Comparative Analysis of Codon Usage Bias Patterns in Microsporidian Genomes

    OpenAIRE

    Xiang, Heng; Zhang, Ruizhi; Butler, Robert R.; Liu, Tie; Zhang, Li; Pombert, Jean-François; Zhou, Zeyang

    2015-01-01

    The sub-3 Mbp genomes from microsporidian species of the Encephalitozoon genus are the smallest known among eukaryotes and paragons of genomic reduction and compaction in parasites. However, their diminutive stature is not characteristic of all Microsporidia, whose genome sizes vary by an order of magnitude. This large variability suggests that different evolutionary forces are applied on the group as a whole. In this study, we have compared the codon usage bias (CUB) between eight taxonomica...

  10. Comparative Copy Number Variation From Whole Genome Sequencing

    OpenAIRE

    Janevski, A.; Varadan, V.; Kamalakaran, S.; Banerjee, N.; Dimitrova, D

    2011-01-01

    Whole genome sequencing enables a high resolution view of the humangenome and enables unique insights into copy number variations in anunprecedented scale. Numerous tools and studies have already been introduced that provide confirmatory and new genomic variability datain individuals and across populations. We investigate two such methods, CNV-seq and FREEC and compare their outputs when applied to five whole genome sequences representing four populations. We focus onthe ability of these tool...

  11. GenoSets: visual analytic methods for comparative genomics.

    Directory of Open Access Journals (Sweden)

    Aurora A Cain

    Full Text Available Many important questions in biology are, fundamentally, comparative, and this extends to our analysis of a growing number of sequenced genomes. Existing genomic analysis tools are often organized around literal views of genomes as linear strings. Even when information is highly condensed, these views grow cumbersome as larger numbers of genomes are added. Data aggregation and summarization methods from the field of visual analytics can provide abstracted comparative views, suitable for sifting large multi-genome datasets to identify critical similarities and differences. We introduce a software system for visual analysis of comparative genomics data. The system automates the process of data integration, and provides the analysis platform to identify and explore features of interest within these large datasets. GenoSets borrows techniques from business intelligence and visual analytics to provide a rich interface of interactive visualizations supported by a multi-dimensional data warehouse. In GenoSets, visual analytic approaches are used to enable querying based on orthology, functional assignment, and taxonomic or user-defined groupings of genomes. GenoSets links this information together with coordinated, interactive visualizations for both detailed and high-level categorical analysis of summarized data. GenoSets has been designed to simplify the exploration of multiple genome datasets and to facilitate reasoning about genomic comparisons. Case examples are included showing the use of this system in the analysis of 12 Brucella genomes. GenoSets software and the case study dataset are freely available at http://genosets.uncc.edu. We demonstrate that the integration of genomic data using a coordinated multiple view approach can simplify the exploration of large comparative genomic data sets, and facilitate reasoning about comparisons and features of interest.

  12. The MicrobesOnline Web site for comparative genomics

    OpenAIRE

    Alm, Eric J.; Huang, Katherine H.; Price, Morgan N; Koche, Richard P.; Keller, Keith; Dubchak, Inna L; Arkin, Adam P.

    2005-01-01

    At present, hundreds of microbial genomes have been sequenced, and hundreds more are currently in the pipeline. The Virtual Institute for Microbial Stress and Survival has developed a publicly available suite of Web-based comparative genomic tools (http://www.microbesonline.org) designed to facilitate multispecies comparison among prokaryotes. Highlights of the MicrobesOnline Web site include operon and regulon predictions, a multispecies genome browser, a multispecies Gene Ontology browser, ...

  13. Computational Methods for the Analysis of Array Comparative Genomic Hybridization

    Directory of Open Access Journals (Sweden)

    Raj Chari

    2006-01-01

    Full Text Available Array comparative genomic hybridization (array CGH is a technique for assaying the copy number status of cancer genomes. The widespread use of this technology has lead to a rapid accumulation of high throughput data, which in turn has prompted the development of computational strategies for the analysis of array CGH data. Here we explain the principles behind array image processing, data visualization and genomic profile analysis, review currently available software packages, and raise considerations for future software development.

  14. Massive comparative genomic analysis reveals convergent evolution of specialized bacteria

    OpenAIRE

    Raoult Didier; Pontarotti Pierre; Royer-Carenzi Manuela; Merhej Vicky

    2009-01-01

    Abstract Background Genome size and gene content in bacteria are associated with their lifestyles. Obligate intracellular bacteria (i.e., mutualists and parasites) have small genomes that derived from larger free-living bacterial ancestors; however, the different steps of bacterial specialization from free-living to intracellular lifestyle have not been studied comprehensively. The growing number of available sequenced genomes makes it possible to perform a statistical comparative analysis of...

  15. Determining and comparing protein function in Bacterial genome sequences

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla

    predictions were made in about 60% of the cases. This project has highlighted the difficulties and challenges in functional annotation and computational analysis of sequence data. It has provided possible solutions for creating reproducible pipelines for comparative genomics as well as constructed a number of......In November 2013, there was around 21.000 different prokaryotic genomes sequenced and publicly available, and the number is growing daily with another 20.000 or more genomes expected to be sequenced and deposited by the end of 2014. An important part of the analysis of this data is the functional...... known functions. This thesis describes the development of new tools for comparative functional annotation and a system for comparative genomics in general. As novel sequenced genomes are becoming more readily available, there is a need for standard analysis tools. The system CMG-biotools is presented...

  16. Comparative Genomics of an Emerging Amphibian Virus.

    Science.gov (United States)

    Epstein, Brendan; Storfer, Andrew

    2016-01-01

    Ranaviruses, a genus of the Iridoviridae, are large double-stranded DNA viruses that infect cold-blooded vertebrates worldwide. Ranaviruses have caused severe epizootics in commercial frog and fish populations, and are currently classified as notifiable pathogens in international trade. Previous work shows that a ranavirus that infects tiger salamanders throughout Western North America (Ambystoma tigrinum virus, or ATV) is in high prevalence among salamanders in the fishing bait trade. Bait ATV strains have elevated virulence and are transported long distances by humans, providing widespread opportunities for pathogen pollution. We sequenced the genomes of 15 strains of ATV collected from tiger salamanders across western North America and performed phylogenetic and population genomic analyses and tests for recombination. We find that ATV forms a monophyletic clade within the rest of the Ranaviruses and that it likely emerged within the last several thousand years, before human activities influenced its spread. We also identify several genes under strong positive selection, some of which appear to be involved in viral virulence and/or host immune evasion. In addition, we provide support for the pathogen pollution hypothesis with evidence of recombination among ATV strains, and potential bait-endemic strain recombination. PMID:26530419

  17. DNA Microarrays in Comparative Genomics and Transcriptomics

    DEFF Research Database (Denmark)

    Willenbrock, Hanni

    2007-01-01

    analysis, analysis of chromosomal aberrations and DNA sequence dependent gene expression. First, this thesis contains a description of how the gene expression profiles from children with acute lymphoblastic leukemia may be used to improve the diagnosis of these patients and potentially improve their...... experimental factor such as compound treatment may be obtained. The same characterization could otherwise be time consuming and require an extensive biological knowledge of the investigated biological system. Often, solid tumors are characterized by a multitude of chromosomal aberrations where parts of the...... verify predictions of highly expressed genes. Moreover, the codon bias of microbial genomes was found to constitute an environmental signature. For example, soil bacteria have very similar codon bias....

  18. Comparative Genomics and Extensive Recombinations in Phage Communities

    Science.gov (United States)

    Poisson, Guylaine; Belcaid, Mahdi; Bergeron, Anne

    Comparing the genomes of two closely related viruses often produces mosaics where nearly identical sequences alternate with sequences that are unique to each genome. When several closely related genomes are compared, the unique sequences are likely to be shared with third genomes, leading to virus mosaic communities. Here we present comparative analysis of sets of Staphylococcus aureus phages that share large identical sequences with up to three other genomes, and with different partners along their genomes. We introduce mosaic graphs to represent these complex recombination events, and use them to illustrate the breath and depth of sequence sharing: some genomes are almost completely made up of shared sequences, while genomes that share very large identical sequences can adopt alternate functional modules. Mosaic graphs also allow us to identify breakpoints that could eventually be used for the construction of recombination networks. These findings have several implications on phage metagenomics assembly, on the horizontal gene transfer paradigm, and more generally on the understanding of the composition and evolutionary dynamics of virus communities.

  19. Analysis of Chinese women with primary ovarian insufficiency by high resolution array-comparative genomic hybridization

    Institute of Scientific and Technical Information of China (English)

    LIAO Can; FU Fang; YANG Xin; SUN Yi-min; LI Dong-zhi

    2011-01-01

    Background Primary ovarian insufficiency (POI) is defined as a primary ovarian defect characterized by absent menarche (primary amenorrhea) or premature depletion of ovarian follicles before the age of 40 years. The etiology of primary ovarian insufficiency in human female patients is still unclear. The purpose of this study is to investigate the potential genetic causes in primary amenorrhea patients by high resolution array based comparative genomic hybridization (array-CGH) analysis.Methods Following the standard karyotyping analysis, genomic DNA from whole blood of 15 primary amenorrhea patients and 15 normal control women was hybridized with Affymetrix cytogenetic 2.7M arrays following the standard protocol. Copy number variations identified by array-CGH were confirmed by real time polymerase chain reaction.Results All the 30 samples were negative by conventional karyotyping analysis. Microdeletions on chromosome 17q21.31-q21.32 with approximately 1.3 Mb were identified in four patients by high resolution array-CGH analysis. This included the female reproductive secretory pathway related factor N-ethylmaleimide-sensitive factor (NSF) gene.Conclusions The results of the present study suggest that there may be critical regions regulating primary ovarian insufficiency in women with a 17q21.31-q21.32 microdeletion. This effect might be due to the loss of function of the NSF gene/genes within the deleted region or to effects on contiguous genes.

  20. Comparative analysis of the mitochondrial genomes in gastropods

    International Nuclear Information System (INIS)

    In this work we presented a comparative analysis of the mitochondrial genomes in gastropods. Nucleotide and amino acids composition was calculated and a comparative visual analysis of the start and termination codons was performed. The organization of the genome was compared calculating the number of intergenic sequences, the location of the genes and the number of reorganized genes (breakpoints) in comparison with the sequence that is presumed to be ancestral for the group. In order to calculate variations in the rates of molecular evolution within the group, the relative rate test was performed. In spite of the differences in the size of the genomes, the amino acids number is conserved. The nucleotide and amino acid composition is similar between Vetigastropoda, Ceanogastropoda and Neritimorpha in comparison to Heterobranchia and Patellogastropoda. The mitochondrial genomes of the group are very compact with few intergenic sequences, the only exception is the genome of Patellogastropoda with 26,828 bp. Start codons of the Heterobranchia and Patellogastropoda are very variable and there is also an increase in genome rearrangements for these two groups. Generally, the hypothesis of constant rates of molecular evolution between the groups is rejected, except when the genomes of Caenogastropoda and Vetigastropoda are compared.

  1. Mycobacterial species as case-study of comparative genome analysis.

    Science.gov (United States)

    Zakham, F; Belayachi, L; Ussery, D; Akrim, M; Benjouad, A; El Aouad, R; Ennaji, M M

    2011-01-01

    The genus Mycobacterium represents more than 120 species including important pathogens of human and cause major public health problems and illnesses. Further, with more than 100 genome sequences from this genus, comparative genome analysis can provide new insights for better understanding the evolutionary events of these species and improving drugs, vaccines, and diagnostics tools for controlling Mycobacterial diseases. In this present study we aim to outline a comparative genome analysis of fourteen Mycobacterial genomes: M. avium subsp. paratuberculosis K—10, M. bovis AF2122/97, M. bovis BCG str. Pasteur 1173P2, M. leprae Br4923, M. marinum M, M. sp. KMS, M. sp. MCS, M. tuberculosis CDC1551, M. tuberculosis F11, M. tuberculosis H37Ra, M. tuberculosis H37Rv, M. tuberculosis KZN 1435 , M. ulcerans Agy99,and M. vanbaalenii PYR—1, For this purpose a comparison has been done based on their length of genomes, GC content, number of genes in different data bases (Genbank, Refseq, and Prodigal). The BLAST matrix of these genomes has been figured to give a lot of information about the similarity between species in a simple scheme. As a result of multiple genome analysis, the pan and core genome have been defined for twelve Mycobacterial species. We have also introduced the genome atlas of the reference strain M. tuberculosis H37Rv which can give a good overview of this genome. And for examining the phylogenetic relationships among these bacteria, a phylogenic tree has been constructed from 16S rRNA gene for tuberculosis and non tuberculosis Mycobacteria to understand the evolutionary events of these species. PMID:21396338

  2. Comparative Genomics of a Parthenogenesis-Inducing Wolbachia Symbiont

    Directory of Open Access Journals (Sweden)

    Amelia R. I. Lindsey

    2016-07-01

    Full Text Available Wolbachia is an intracellular symbiont of invertebrates responsible for inducing a wide variety of phenotypes in its host. These host-Wolbachia relationships span the continuum from reproductive parasitism to obligate mutualism, and provide a unique system to study genomic changes associated with the evolution of symbiosis. We present the genome sequence from a parthenogenesis-inducing Wolbachia strain (wTpre infecting the minute parasitoid wasp Trichogramma pretiosum. The wTpre genome is the most complete parthenogenesis-inducing Wolbachia genome available to date. We used comparative genomics across 16 Wolbachia strains, representing five supergroups, to identify a core Wolbachia genome of 496 sets of orthologous genes. Only 14 of these sets are unique to Wolbachia when compared to other bacteria from the Rickettsiales. We show that the B supergroup of Wolbachia, of which wTpre is a member, contains a significantly higher number of ankyrin repeat-containing genes than other supergroups. In the wTpre genome, there is evidence for truncation of the protein coding sequences in 20% of ORFs, mostly as a result of frameshift mutations. The wTpre strain represents a conversion from cytoplasmic incompatibility to a parthenogenesis-inducing lifestyle, and is required for reproduction in the Trichogramma host it infects. We hypothesize that the large number of coding frame truncations has accompanied the change in reproductive mode of the wTpre strain.

  3. Comparative Genomics of a Parthenogenesis-Inducing Wolbachia Symbiont.

    Science.gov (United States)

    Lindsey, Amelia R I; Werren, John H; Richards, Stephen; Stouthamer, Richard

    2016-01-01

    Wolbachia is an intracellular symbiont of invertebrates responsible for inducing a wide variety of phenotypes in its host. These host-Wolbachia relationships span the continuum from reproductive parasitism to obligate mutualism, and provide a unique system to study genomic changes associated with the evolution of symbiosis. We present the genome sequence from a parthenogenesis-inducing Wolbachia strain (wTpre) infecting the minute parasitoid wasp Trichogramma pretiosum The wTpre genome is the most complete parthenogenesis-inducing Wolbachia genome available to date. We used comparative genomics across 16 Wolbachia strains, representing five supergroups, to identify a core Wolbachia genome of 496 sets of orthologous genes. Only 14 of these sets are unique to Wolbachia when compared to other bacteria from the Rickettsiales. We show that the B supergroup of Wolbachia, of which wTpre is a member, contains a significantly higher number of ankyrin repeat-containing genes than other supergroups. In the wTpre genome, there is evidence for truncation of the protein coding sequences in 20% of ORFs, mostly as a result of frameshift mutations. The wTpre strain represents a conversion from cytoplasmic incompatibility to a parthenogenesis-inducing lifestyle, and is required for reproduction in the Trichogramma host it infects. We hypothesize that the large number of coding frame truncations has accompanied the change in reproductive mode of the wTpre strain. PMID:27194801

  4. Comparative rates of evolution in endosymbiotic nuclear genomes

    Directory of Open Access Journals (Sweden)

    Keeling Patrick J

    2006-06-01

    Full Text Available Abstract Background The nucleomorphs associated with secondary plastids of cryptomonads and chlorarachniophytes are the sole examples of organelles with eukaryotic nuclear genomes. Although not as widespread as their prokaryotic equivalents in mitochondria and plastids, nucleomorph genomes share similarities in terms of reduction and compaction. They also differ in several aspects, not least in that they encode proteins that target to the plastid, and so function in a different compartment from that in which they are encoded. Results Here, we test whether the phylogenetically distinct nucleomorph genomes of the cryptomonad, Guillardia theta, and the chlorarachniophyte, Bigelowiella natans, have experienced similar evolutionary pressures during their transformation to reduced organelles. We compared the evolutionary rates of genes from nuclear, nucleomorph, and plastid genomes, all of which encode proteins that function in the same cellular compartment, the plastid, and are thus subject to similar selection pressures. Furthermore, we investigated the divergence of nucleomorphs within cryptomonads by comparing G. theta and Rhodomonas salina. Conclusion Chlorarachniophyte nucleomorph genes have accumulated errors at a faster rate than other genomes within the same cell, regardless of the compartment where the gene product functions. In contrast, most nucleomorph genes in cryptomonads have evolved faster than genes in other genomes on average, but genes for plastid-targeted proteins are not overly divergent, and it appears that cryptomonad nucleomorphs are not presently evolving rapidly and have therefore stabilized. Overall, these analyses suggest that the forces at work in the two lineages are different, despite the similarities between the structures of their genomes.

  5. Comparative genomics of vesicomyid clam (Bivalvia: Mollusca chemosynthetic symbionts

    Directory of Open Access Journals (Sweden)

    Girguis Peter R

    2008-12-01

    Full Text Available Abstract Background The Vesicomyidae (Bivalvia: Mollusca are a family of clams that form symbioses with chemosynthetic gamma-proteobacteria. They exist in environments such as hydrothermal vents and cold seeps and have a reduced gut and feeding groove, indicating a large dependence on their endosymbionts for nutrition. Recently, two vesicomyid symbiont genomes were sequenced, illuminating the possible nutritional contributions of the symbiont to the host and making genome-wide evolutionary analyses possible. Results To examine the genomic evolution of the vesicomyid symbionts, a comparative genomics framework, including the existing genomic data combined with heterologous microarray hybridization results, was used to analyze conserved gene content in four vesicomyid symbiont genomes. These four symbionts were chosen to include a broad phylogenetic sampling of the vesicomyid symbionts and represent distinct chemosynthetic environments: cold seeps and hydrothermal vents. Conclusion The results of this comparative genomics analysis emphasize the importance of the symbionts' chemoautotrophic metabolism within their hosts. The fact that these symbionts appear to be metabolically capable autotrophs underscores the extent to which the host depends on them for nutrition and reveals the key to invertebrate colonization of these challenging environments.

  6. SNUGB: a versatile genome browser supporting comparative and functional fungal genomics

    Directory of Open Access Journals (Sweden)

    Kim Seungill

    2008-12-01

    Full Text Available Abstract Background Since the full genome sequences of Saccharomyces cerevisiae were released in 1996, genome sequences of over 90 fungal species have become publicly available. The heterogeneous formats of genome sequences archived in different sequencing centers hampered the integration of the data for efficient and comprehensive comparative analyses. The Comparative Fungal Genomics Platform (CFGP was developed to archive these data via a single standardized format that can support multifaceted and integrated analyses of the data. To facilitate efficient data visualization and utilization within and across species based on the architecture of CFGP and associated databases, a new genome browser was needed. Results The Seoul National University Genome Browser (SNUGB integrates various types of genomic information derived from 98 fungal/oomycete (137 datasets and 34 plant and animal (38 datasets species, graphically presents germane features and properties of each genome, and supports comparison between genomes. The SNUGB provides three different forms of the data presentation interface, including diagram, table, and text, and six different display options to support visualization and utilization of the stored information. Information for individual species can be quickly accessed via a new tool named the taxonomy browser. In addition, SNUGB offers four useful data annotation/analysis functions, including 'BLAST annotation.' The modular design of SNUGB makes its adoption to support other comparative genomic platforms easy and facilitates continuous expansion. Conclusion The SNUGB serves as a powerful platform supporting comparative and functional genomics within the fungal kingdom and also across other kingdoms. All data and functions are available at the web site http://genomebrowser.snu.ac.kr/.

  7. PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Wasnick Michael

    2008-03-01

    Full Text Available Abstract Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any

  8. Comparative Genome Analysis of Basidiomycete Fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Morin, Emmanuelle; Nagy, Laszlo; Manning, Gerard; Baker, Scott; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Hibbett, David; Martin, Francis; Grigoriev, Igor

    2012-03-19

    Fungi of the phylum Basidiomycota (basidiomycetes), make up some 37percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, symbionts, and plant and animal pathogens. To better understand the diversity of phenotypes in basidiomycetes, we performed a comparative analysis of 35 basidiomycete fungi spanning the diversity of the phylum. Phylogenetic patterns of lignocellulose degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay. Patterns of secondary metabolic enzymes give additional insight into the broad array of phenotypes found in the basidiomycetes. We suggest that the profile of an organism in lignocellulose-targeting genes can be used to predict its nutritional mode, and predict Dacryopinax sp. as a brown rot; Botryobasidium botryosum and Jaapia argillacea as white rots.

  9. Gramene 2016: comparative plant genomics and pathway resources.

    Science.gov (United States)

    Tello-Ruiz, Marcela K; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A; Huerta, Laura; Keays, Maria; Tang, Y Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼ 200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803

  10. Complete genome sequencing and comparative genomic analysis of functionally diverse Lysinibacillus sphaericus III(3)7.

    Science.gov (United States)

    Rey, Andrés; Silva-Quintero, Laura; Dussán, Jenny

    2016-09-01

    Lysinibacillus sphaericus III(3)7 is a native Colombian strain, the first one isolated from soil samples. This strain has shown high levels of pathogenic activity against Culex quinquefaciatus larvae in laboratory assays compared to other members of the same species. Using Pacific Biosciences sequencing technology we sequenced, annotated (de novo) and described the genome of strain III(3)7, achieving a complete genome sequence status. We then performed a comparative analysis between the newly sequenced genome and the ones previously reported for Colombian isolates L. sphaericus OT4b.31, CBAM5 and OT4b.25, with the inclusion of L. sphaericus C3-41 that has been used as a reference genome for most of previous genome sequencing projects. We concluded that L. sphaericus III(3)7 is highly similar with strain OT4b.25 and shares high levels of synteny with isolates CBAM5 and C3-41. PMID:27419068

  11. The Perennial Ryegrass GenomeZipper – Targeted Use of Genome Resources for Comparative Grass Genomics

    DEFF Research Database (Denmark)

    Pfeiffer, Matthias; Martis, Mihaela; Asp, Torben;

    2013-01-01

    (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to...... assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The...

  12. SUPERFAMILY--sophisticated comparative genomics, data mining, visualization and phylogeny.

    Science.gov (United States)

    Wilson, Derek; Pethica, Ralph; Zhou, Yiduo; Talbot, Charles; Vogel, Christine; Madera, Martin; Chothia, Cyrus; Gough, Julian

    2009-01-01

    SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt. Protein domain assignments for over 900 genomes are included in the database, which can be accessed at http://supfam.org/. Hidden Markov models based on Structural Classification of Proteins (SCOP) domain definitions at the superfamily level are used to provide structural annotation. We recently produced a new model library based on SCOP 1.73. Family level assignments are also available. From the web site users can submit sequences for SCOP domain classification; search for keywords such as superfamilies, families, organism names, models and sequence identifiers; find over- and underrepresented families or superfamilies within a genome relative to other genomes or groups of genomes; compare domain architectures across selections of genomes and finally build multiple sequence alignments between Protein Data Bank (PDB), genomic and custom sequences. Recent extensions to the database include InterPro abstracts and Gene Ontology terms for superfamiles, taxonomic visualization of the distribution of families across the tree of life, searches for functionally similar domain architectures and phylogenetic trees. The database, models and associated scripts are available for download from the ftp site. PMID:19036790

  13. Leptospire Genomic Diversity Revealed by Microarray-Based Comparative Genomic Hybridization

    OpenAIRE

    Eribo, Broderick; Mingmongkolchai, Sirima; Yan, Tingfen; Dubbs, Padunsri; Nelson, Karen E

    2012-01-01

    Comparative genomic hybridization was used to compare genetic diversity of five strains of Leptospira (Leptospira interrogans serovars Bratislava, Canicola, and Hebdomadis and Leptospira kirschneri serovars Cynopteri and Grippotyphosa). The array was designed based on two available sequenced Leptospira reference genomes, those of L. interrogans serovar Copenhageni and L. interrogans serovar Lai. A comparison of genetic contents showed that L. interrogans serovar Bratislava was closest to the ...

  14. Comparative genomics of mitochondria in chlorarachniophyte algae: endosymbiotic gene transfer and organellar genome dynamics

    OpenAIRE

    Goro Tanifuji; Archibald, John M.; Tetsuo Hashimoto

    2016-01-01

    Chlorarachniophyte algae possess four DNA-containing compartments per cell, the nucleus, mitochondrion, plastid and nucleomorph, the latter being a relic nucleus derived from a secondary endosymbiont. While the evolutionary dynamics of plastid and nucleomorph genomes have been investigated, a comparative investigation of mitochondrial genomes (mtDNAs) has not been carried out. We have sequenced the complete mtDNA of Lotharella oceanica and compared it to that of another chlorarachniophyte, Bi...

  15. Genome evolution in the eremothecium clade of the Saccharomyces complex revealed by comparative genomics.

    Science.gov (United States)

    Wendland, Jürgen; Walther, Andrea

    2011-12-01

    We used comparative genomics to elucidate the genome evolution within the pre-whole-genome duplication genus Eremothecium. To this end, we sequenced and assembled the complete genome of Eremothecium cymbalariae, a filamentous ascomycete representing the Eremothecium type strain. Genome annotation indicated 4712 gene models and 143 tRNAs. We compared the E. cymbalariae genome with that of its relative, the riboflavin overproducer Ashbya (Eremothecium) gossypii, and the reconstructed yeast ancestor. Decisive changes in the Eremothecium lineage leading to the evolution of the A. gossypii genome include the reduction from eight to seven chromosomes, the downsizing of the genome by removal of 10% or 900 kb of DNA, mostly in intergenic regions, the loss of a TY3-Gypsy-type transposable element, the re-arrangement of mating-type loci, and a massive increase of its GC content. Key species-specific events are the loss of MNN1-family of mannosyltransferases required to add the terminal fourth and fifth α-1,3-linked mannose residue to O-linked glycans and genes of the Ehrlich pathway in E. cymbalariae and the loss of ZMM-family of meiosis-specific proteins and acquisition of riboflavin overproduction in A. gossypii. This reveals that within the Saccharomyces complex genome, evolution is not only based on genome duplication with subsequent gene deletions and chromosomal rearrangements but also on fungi associated with specific environments (e.g. involving fungal-insect interactions as in Eremothecium), which have encountered challenges that may be reflected both in genome streamlining and their biosynthetic potential. PMID:22384365

  16. DCODE.ORG Anthology of Comparative Genomic Tools

    Energy Technology Data Exchange (ETDEWEB)

    Loots, G G; Ovcharenko, I

    2005-01-11

    Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the noncoding encryption of gene regulation across genomes. To facilitate the use of comparative genomics to practical applications in genetics and genomics we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools: zPicture and Mulan; a phylogenetic shadowing tool: eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools: rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, CREME; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ web site.

  17. Low-pass sequencing for microbial comparative genomics

    Directory of Open Access Journals (Sweden)

    Kennedy Sean

    2004-01-01

    Full Text Available Abstract Background We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1 the metabolically versatile Haloarcula marismortui; (2 the non-pigmented Natrialba asiatica; (3 the psychrophile Halorubrum lacusprofundi and (4 the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI for their predicted proteins. Multiple insertion sequence (IS elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP and transcription factor IIB (TFB homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion Despite the diverse habitats of these species, all five halophiles share (1 high GC content and (2 low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggest that genome structure and dynamic genome reorganization might be similar to that previously observed in the

  18. On the Approximability of Comparing Genomes with Duplicates

    CERN Document Server

    Angibaud, Sébastien; Rusu, Irena; Thevenin, Annelyse; Vialette, Stéphane

    2008-01-01

    A central problem in comparative genomics consists in computing a (dis-)similarity measure between two genomes, e.g. in order to construct a phylogeny. All the existing measures are defined on genomes without duplicates. However, we know that genes can be duplicated within the same genome. One possible approach to overcome this difficulty is to establish a one-to-one correspondence (i.e. a matching) between genes of both genomes, where the correspondence is chosen in order to optimize the studied measure. In this paper, we are interested in three measures (number of breakpoints, number of common intervals and number of conserved intervals) and three models of matching (exemplar, intermediate and maximum matching models). We prove that, for each model and each measure M, computing a matching between two genomes that optimizes M is APX-hard. We also study the complexity of the following problem: is there an exemplarization (resp. an intermediate/maximum matching) that induces no breakpoint? We prove the problem...

  19. Sequencing and comparative analyses of the genomes of zoysiagrasses.

    Science.gov (United States)

    Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei

    2016-04-01

    Zoysiais a warm-season turfgrass, which comprises 11 allotetraploid species (2n= 4x= 40), each possessing different morphological and physiological traits. To characterize the genetic systems ofZoysiaplants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes ofZoysiaspecies using HiSeq and MiSeq platforms. As a reference sequence ofZoysiaspecies, we generated a high-quality draft sequence of the genome ofZ. japonicaaccession 'Nagirizaki' (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences ofZ. matrella'Wakaba' andZ. pacifica'Zanpa' were also generated for comparative analyses. To investigate the genetic diversity among theZoysiaspecies, genome sequence reads of three additional accessions,Z. japonica'Kyoto',Z. japonica'Miyagi' andZ. matrella'Chiba Fair Green', were accumulated, and aligned against the reference genome of 'Nagirizaki' along with those from 'Wakaba' and 'Zanpa'. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the 'Zoysia Genome Database' athttp://zoysia.kazusa.or.jp. PMID:26975196

  20. Comparative analysis of rosaceous genomes and the reconstruction of a putative ancestral genome for the family

    Directory of Open Access Journals (Sweden)

    Velasco Riccardo

    2011-01-01

    Full Text Available Abstract Background Comparative genome mapping studies in Rosaceae have been conducted until now by aligning genetic maps within the same genus, or closely related genera and using a limited number of common markers. The growing body of genomics resources and sequence data for both Prunus and Fragaria permits detailed comparisons between these genera and the recently released Malus × domestica genome sequence. Results We generated a comparative analysis using 806 molecular markers that are anchored genetically to the Prunus and/or Fragaria reference maps, and physically to the Malus genome sequence. Markers in common for Malus and Prunus, and Malus and Fragaria, respectively were 784 and 148. The correspondence between marker positions was high and conserved syntenic blocks were identified among the three genera in the Rosaceae. We reconstructed a proposed ancestral genome for the Rosaceae. Conclusions A genome containing nine chromosomes is the most likely candidate for the ancestral Rosaceae progenitor. The number of chromosomal translocations observed between the three genera investigated was low. However, the number of inversions identified among Malus and Prunus was much higher than any reported genome comparisons in plants, suggesting that small inversions have played an important role in the evolution of these two genera or of the Rosaceae.

  1. Identification of recurrent chromosomal aberrations in germ cell tumors of neonates and infants using genomewide array-based comparative genomic hybridization.

    NARCIS (Netherlands)

    Veltman, I.M.; Veltman, J.; Janssen, I.M.; Hulsbergen- van de Kaa, C.A.; Oosterhuis, W.; Schneider, D.; Stoop, H.; Gillis, A.J.M.; Zahn, S.; Looijenga, L.H.J.; Gobel, U.; Geurts van Kessel, A.H.M.

    2005-01-01

    Human germ cell tumors (GCTs) of neonates and infants comprise a heterogeneous group of neoplasms, including teratomas and yolk sac tumors with distinct clinical and epidemiologic features. As yet, little is known about the cytogenetic constitution of these tumors. We applied the recently developed

  2. Comparing thousands of circular genomes using the CGView Comparison Tool

    Directory of Open Access Journals (Sweden)

    Grant Jason R

    2012-05-01

    Full Text Available Abstract Background Continued sequencing efforts coupled with advances in sequencing technology will lead to the completion of a vast number of small genomes. Whole-genome comparisons represent an important part of the analysis of any new genome sequence, as they can provide a better understanding of the biology and evolution of the source organism. Visualization of the results is important, as it allows information from a variety of sources to be integrated and interpreted. However, existing graphical comparison tools lack features needed for efficiently comparing a new genome to hundreds or thousands of existing sequences. Moreover, existing tools are limited in terms of the types of comparisons that can be performed, the extent to which the output can be customized, and the ease with which the entire process can be automated. Results The CGView Comparison Tool (CCT is a package for visually comparing bacterial, plasmid, chloroplast, or mitochondrial sequences of interest to existing genomes or sequence collections. The comparisons are conducted using BLAST, and the BLAST results are presented in the form of graphical maps that can also show sequence features, gene and protein names, COG (Clusters of Orthologous Groups of proteins category assignments, and sequence composition characteristics. CCT can generate maps in a variety of sizes, including 400 Megapixel maps suitable for posters. Comparisons can be conducted within a particular species or genus, or all available genomes can be used. The entire map creation process, from downloading sequences to redrawing zoomed maps, can be completed easily using scripts included with the CCT. User-defined features or analysis results can be included on maps, and maps can be extensively customized. To simplify program setup, a CCT virtual machine that includes all dependencies preinstalled is available. Detailed tutorials illustrating the use of CCT are included with the CCT documentation. Conclusion

  3. Reduction and Expansion in Microsporidian Genome Evolution: New Insights from Comparative Genomics

    OpenAIRE

    Nakjang, S.; Williams, T.A.; Heinz, E; Watson, A. K.; Foster, P. G.; Sendra, K. M.; Heaps, S. E.; Hirt, R. P.; Martin Embley, T.

    2013-01-01

    Microsporidia are an abundant group of obligate intracellular parasites of other eukaryotes, including immunocompromised humans, but the molecular basis of their intracellular lifestyle and pathobiology are poorly understood. New genomes from a taxonomically broad range of microsporidians, complemented by published expression data, provide an opportunity for comparative analyses to identify conserved and lineage-specific patterns of microsporidian genome evolution that have underpinned this s...

  4. Online Genome Analysis Resources for Educators, a Comparative Review

    OpenAIRE

    Sarah Grace Prescott

    2012-01-01

    A comparative review of several companies that offer similar kits or services that allow students to isolate DNA (human and others), amplify it by PCR, and in some cases sequence the resulting sample.  The companies include:  Carolina® Biological Supply Company, Bio-Rad®, Edvotek® Inc., Hiram Genomics Store, and 23andMe.

  5. Chromosomal 16p microdeletion in Rubinstein-Taybi syndrome detected by oligonucleotide-based array comparative genomic hybridization: a case report

    Directory of Open Access Journals (Sweden)

    Mohd Fadley Md A

    2012-01-01

    Full Text Available Abstract Introduction Chromosomal aberrations of chromosome 16 are uncommon and submicroscopic deletions have rarely been reported. At present, a cytogenetic or molecular abnormality can only be detected in 55% of Rubinstein-Taybi syndrome patients, leaving the diagnosis in 45% of patients to rest on clinical features only. Interestingly, this microdeletion of 16 p13.3 was found in a young child with an unexplained syndromic condition due to an indistinct etiological diagnosis. To the best of our knowledge, no evidence of a microdeletion of 16 p13.3 with contiguous gene deletion, comprising cyclic adenosine monophosphate-response element-binding protein and tumor necrosis factor receptor-associated protein 1 genes, has been described in typical Rubinstein-Taybi syndrome. Case presentation We present the case of a three-year-old Malaysian Chinese girl with a de novo microdeletion on the short arm of chromosome 16, identified by oligonucleotide array-based comparative genomic hybridization. Our patient showed mild to moderate global developmental delay, facial dysmorphism, bilateral broad thumbs and great toes, a moderate size atrial septal defect, hypotonia and feeding difficulties. A routine chromosome analysis on 20 metaphase cells showed a normal 46, XX karyotype. Further investigation by high resolution array-based comparative genomic hybridization revealed a 120 kb microdeletion on chromosomal band 16 p13.3. Conclusion A mutation or abnormality in the cyclic adenosine monophosphate-response element-binding protein has previously been determined as a cause of Rubinstein-Taybi syndrome. However, microdeletion of 16 p13.3 comprising cyclic adenosine monophosphate-response element-binding protein and tumor necrosis factor receptor-associated protein 1 genes is a rare scenario in the pathogenesis of Rubinstein-Taybi syndrome. Additionally, due to insufficient coverage of the human genome by conventional techniques, clinically significant genomic

  6. Restauro-G: A Rapid Genome Re-Annotation System for Comparative Genomics

    Institute of Scientific and Technical Information of China (English)

    Satoshi Tamaki; Kazuharu Arakawa; Nobuaki Kono; Masaru Tomita

    2007-01-01

    Annotations of complete genome sequences submitted directly from sequencing projects are diverse in terms of annotation strategies and update frequencies. These inconsistencies make comparative studies difficult. To allow rapid data preparation of a large number of complete genomes, automation and speed are important for genome re-annotation. Here we introduce an open-source rapid genome re-annotation software system, Restauro-G, specialized for bacterial genomes. Restauro-G re-annotates a genome by similarity searches utilizing the BLAST-Like Alignment Tool, referring to protein databases such as UniProt KB, NCBI nr, NCBI COGs, Pfam, and PSORTb. Re-annotation by Restauro-G achieved over 98% accuracy for most bacterial chromosomes in comparison with the original manually curated annotation of EMBL releases. Restauro-G was developed in the generic bioinformatics workbench G-language Genome Analysis Environment and is distributed at http://restauro-g.iab.keio.ac.jp/ under the GNU General Public License.

  7. Phylogeny and comparative genome analysis of a Basidiomycete fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor; Hibbett, David

    2011-03-14

    Fungi of the phylum Basidiomycota, make up some 37percent of the described fungi, and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a `core proteome? based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein families that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.

  8. Comparative analysis of methods for genome-wide nucleosome cartography.

    Science.gov (United States)

    Quintales, Luis; Vázquez, Enrique; Antequera, Francisco

    2015-07-01

    Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding the genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods from the initial biochemical steps through the alignment of the millions of short-sequence reads to a reference genome to the final computational analysis to generate genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use. PMID:25296770

  9. Lactobacillus paracasei comparative genomics: towards species pan-genome definition and exploitation of diversity.

    Directory of Open Access Journals (Sweden)

    Tamara Smokvina

    Full Text Available Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its "pan-genome". We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800-3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25-53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison was used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis

  10. Lactobacillus paracasei comparative genomics: towards species pan-genome definition and exploitation of diversity.

    Science.gov (United States)

    Smokvina, Tamara; Wels, Michiel; Polka, Justyna; Chervaux, Christian; Brisse, Sylvain; Boekhorst, Jos; van Hylckama Vlieg, Johan E T; Siezen, Roland J

    2013-01-01

    Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its "pan-genome". We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800-3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon) are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25-53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison was used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis, in order to link

  11. Whole genome comparative studies between chicken and turkey and their implications for avian genome evolution

    Directory of Open Access Journals (Sweden)

    Carré Wilfrid

    2008-04-01

    Full Text Available Abstract Background Comparative genomics is a powerful means of establishing inter-specific relationships between gene function/location and allows insight into genomic rearrangements, conservation and evolutionary phylogeny. The availability of the complete sequence of the chicken genome has initiated the development of detailed genomic information in other birds including turkey, an agriculturally important species where mapping has hitherto focused on linkage with limited physical information. No molecular study has yet examined conservation of avian microchromosomes, nor differences in copy number variants (CNVs between birds. Results We present a detailed comparative cytogenetic map between chicken and turkey based on reciprocal chromosome painting and mapping of 338 chicken BACs to turkey metaphases. Two inter-chromosomal changes (both involving centromeres and three pericentric inversions have been identified between chicken and turkey; and array CGH identified 16 inter-specific CNVs. Conclusion This is the first study to combine the modalities of zoo-FISH and array CGH between different avian species. The first insight into the conservation of microchromosomes, the first comparative cytogenetic map of any bird and the first appraisal of CNVs between birds is provided. Results suggest that avian genomes have remained relatively stable during evolution compared to mammalian equivalents.

  12. Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper

    LENUS (Irish Health Repository)

    Potnis, Neha

    2011-03-11

    Abstract Background Bacterial spot of tomato and pepper is caused by four Xanthomonas species and is a major plant disease in warm humid climates. The four species are distinct from each other based on physiological and molecular characteristics. The genome sequence of strain 85-10, a member of one of the species, Xanthomonas euvesicatoria (Xcv) has been previously reported. To determine the relationship of the four species at the genome level and to investigate the molecular basis of their virulence and differing host ranges, draft genomic sequences of members of the other three species were determined and compared to strain 85-10. Results We sequenced the genomes of X. vesicatoria (Xv) strain 1111 (ATCC 35937), X. perforans (Xp) strain 91-118 and X. gardneri (Xg) strain 101 (ATCC 19865). The genomes were compared with each other and with the previously sequenced Xcv strain 85-10. In addition, the molecular features were predicted that may be required for pathogenicity including the type III secretion apparatus, type III effectors, other secretion systems, quorum sensing systems, adhesins, extracellular polysaccharide, and lipopolysaccharide determinants. Several novel type III effectors from Xg strain 101 and Xv strain 1111 genomes were computationally identified and their translocation was validated using a reporter gene assay. A homolog to Ax21, the elicitor of XA21-mediated resistance in rice, and a functional Ax21 sulfation system were identified in Xcv. Genes encoding proteins with functions mediated by type II and type IV secretion systems have also been compared, including enzymes involved in cell wall deconstruction, as contributors to pathogenicity. Conclusions Comparative genomic analyses revealed considerable diversity among bacterial spot pathogens, providing new insights into differences and similarities that may explain the diverse nature of these strains. Genes specific to pepper pathogens, such as the O-antigen of the lipopolysaccharide cluster

  13. Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper

    Directory of Open Access Journals (Sweden)

    Koebnik Ralf

    2011-03-01

    Full Text Available Abstract Background Bacterial spot of tomato and pepper is caused by four Xanthomonas species and is a major plant disease in warm humid climates. The four species are distinct from each other based on physiological and molecular characteristics. The genome sequence of strain 85-10, a member of one of the species, Xanthomonas euvesicatoria (Xcv has been previously reported. To determine the relationship of the four species at the genome level and to investigate the molecular basis of their virulence and differing host ranges, draft genomic sequences of members of the other three species were determined and compared to strain 85-10. Results We sequenced the genomes of X. vesicatoria (Xv strain 1111 (ATCC 35937, X. perforans (Xp strain 91-118 and X. gardneri (Xg strain 101 (ATCC 19865. The genomes were compared with each other and with the previously sequenced Xcv strain 85-10. In addition, the molecular features were predicted that may be required for pathogenicity including the type III secretion apparatus, type III effectors, other secretion systems, quorum sensing systems, adhesins, extracellular polysaccharide, and lipopolysaccharide determinants. Several novel type III effectors from Xg strain 101 and Xv strain 1111 genomes were computationally identified and their translocation was validated using a reporter gene assay. A homolog to Ax21, the elicitor of XA21-mediated resistance in rice, and a functional Ax21 sulfation system were identified in Xcv. Genes encoding proteins with functions mediated by type II and type IV secretion systems have also been compared, including enzymes involved in cell wall deconstruction, as contributors to pathogenicity. Conclusions Comparative genomic analyses revealed considerable diversity among bacterial spot pathogens, providing new insights into differences and similarities that may explain the diverse nature of these strains. Genes specific to pepper pathogens, such as the O-antigen of the

  14. Comparative genomics of transcriptional regulation of methionine metabolism in Proteobacteria.

    Directory of Open Access Journals (Sweden)

    Semen A Leyn

    Full Text Available Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ∼ 200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific and genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria.

  15. Comparative genomics of transcriptional regulation of methionine metabolism in Proteobacteria.

    Science.gov (United States)

    Leyn, Semen A; Suvorova, Inna A; Kholina, Tatiana D; Sherstneva, Sofia S; Novichkov, Pavel S; Gelfand, Mikhail S; Rodionov, Dmitry A

    2014-01-01

    Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ∼ 200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific and genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria. PMID:25411846

  16. Comparative genomics and transcriptomics of trait-gene association

    Directory of Open Access Journals (Sweden)

    Pierlé Sebastián

    2012-11-01

    Full Text Available Abstract Background The Order Rickettsiales includes important tick-borne pathogens, from Rickettsia rickettsii, which causes Rocky Mountain spotted fever, to Anaplasma marginale, the most prevalent vector-borne pathogen of cattle. Although most pathogens in this Order are transmitted by arthropod vectors, little is known about the microbial determinants of transmission. A. marginale provides unique tools for studying the determinants of transmission, with multiple strain sequences available that display distinct and reproducible transmission phenotypes. The closed core A. marginale genome suggests that any phenotypic differences are due to single nucleotide polymorphisms (SNPs. We combined DNA/RNA comparative genomic approaches using strains with different tick transmission phenotypes and identified genes that segregate with transmissibility. Results Comparison of seven strains with different transmission phenotypes generated a list of SNPs affecting 18 genes and nine promoters. Transcriptional analysis found two candidate genes downstream from promoter SNPs that were differentially transcribed. To corroborate the comparative genomics approach we used three RNA-seq platforms to analyze the transcriptomes from two A. marginale strains with different transmission phenotypes. RNA-seq analysis confirmed the comparative genomics data and found 10 additional genes whose transcription between strains with distinct transmission efficiencies was significantly different. Six regions of the genome that contained no annotation were found to be transcriptionally active, and two of these newly identified transcripts were differentially transcribed. Conclusions This approach identified 30 genes and two novel transcripts potentially involved in tick transmission. We describe the transcriptome of an obligate intracellular bacterium in depth, while employing massive parallel sequencing to dissect an important trait in bacterial pathogenesis.

  17. Annelids in evolutionary developmental biology and comparative genomics

    Directory of Open Access Journals (Sweden)

    Mcdougall C.

    2008-09-01

    Full Text Available Annelids have had a long history in comparative embryology and morphology, which has helped to establish them in zoology textbooks as an ideal system to understand the evolution of the typical triploblastic, coelomate, protostome condition. In recent years there has been a relative upsurge in embryological data, particularly with regard to the expression and function of developmental control genes. Polychaetes, as well as other annelids such as the parasitic leech, are now also entering the age of comparative genomics. All of this comparative data has had an important impact on our views of the ancestral conditions at various levels of the animal phylogeny, including the bilaterian ancestor and the nature of the annelid ancestor. Here we review some of the recent advances made in annelid comparative development and genomics, revealing a hitherto unsuspected level of complexity in these ancestors. It is also apparent that the transition to a parasitic lifestyle leads to, or requires, extensive modifications and derivations at both the genomic and embryological levels.

  18. A Web-Based Comparative Genomics Tutorial for Investigating Microbial Genomes

    Directory of Open Access Journals (Sweden)

    Michael Strong

    2009-12-01

    Full Text Available As the number of completely sequenced microbial genomes continues to rise at an impressive rate, it is important to prepare students with the skills necessary to investigate microorganisms at the genomic level. As a part of the core curriculum for first-year graduate students in the biological sciences, we have implemented a web-based tutorial to introduce students to the fields of comparative and functional genomics. The tutorial focuses on recent computational methods for identifying functionally linked genes and proteins on a genome-wide scale and was used to introduce students to the Rosetta Stone, Phylogenetic Profile, conserved Gene Neighbor, and Operon computational methods. Students learned to use a number of publicly available web servers and databases to identify functionally linked genes in the Escherichia coli genome, with emphasis on genome organization and operon structure. The overall effectiveness of the tutorial was assessed based on student evaluations and homework assignments. The tutorial is available to other educators at http://www.doe-mbi.ucla.edu/~strong/m253.php.

  19. Genomic characterization of some Iranian children with idiopathic mental retardation using array comparative genomic hybridization

    Directory of Open Access Journals (Sweden)

    Farkhondeh Behjati

    2013-01-01

    Full Text Available Background: Mental retardation (MR has a prevalence of 1-3% and genetic causes are present in more than 50% of patients. Chromosomal abnormalities are one of the most common genetic causes of MR and are responsible for 4-28% of mental retardation. However, the smallest loss or gain of material visible by standard cytogenetic is about 4 Mb and for smaller abnormalities, molecular cytogenetic techniques such as array comparative genomic hybridization (array CGH should be used. It has been shown that 15-25% of idiopathic MR (IMR has submicroscopic rearrangements detectable by array CGH. In this project, the genomic abnormalities were investigated in 32 MR patients using this technique. Materials and Methods: Patients with IMR with dysmorphism were investigated in this study. Karyotype analysis, fragile X and metabolic tests were first carried out on the patients. The copy number variation was then assessed in a total of 32 patients with normal results for the mentioned tests using whole genome oligo array CGH. Multiple ligation probe amplification was carried out as a confirmation test. Results: In total, 19% of the patients showed genomic abnormalities. This is reduced to 12.5% once the two patients with abnormal karyotypes (upon re-evaluation are removed. Conclusion: The array CGH technique increased the detection rate of genomic imbalances in our patients by 12.5%. It is an accurate and reliable method for the determination of genomic imbalances in patients with IMR and dysmorphism.

  20. Transcriptome-based exon capture enables highly cost-effective comparative genomic data collection at moderate evolutionary scales

    OpenAIRE

    Bi Ke; Vanderpool Dan; Singhal Sonal; Linderoth Tyler; Moritz Craig; Good Jeffrey M

    2012-01-01

    Abstract Background To date, exon capture has largely been restricted to species with fully sequenced genomes, which has precluded its application to lineages that lack high quality genomic resources. We developed a novel strategy for designing array-based exon capture in chipmunks (Tamias) based on de novo transcriptome assemblies. We evaluated the performance of our approach across specimens from four chipmunk species. Results We selectively targeted 11,975 exons (~4 Mb) on custom capture a...

  1. Comparative omics-driven genome annotation refinement: application across Yersiniae.

    Directory of Open Access Journals (Sweden)

    Alexandra C Schrimpe-Rutledge

    Full Text Available Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. The annotation process is now performed almost exclusively in an automated fashion to balance the large number of sequences generated. One possible way of reducing errors inherent to automated computational annotations is to apply data from omics measurements (i.e. transcriptional and proteomic to the un-annotated genome with a proteogenomic-based approach. Here, the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species. Transcriptomic and proteomic data derived from highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis Pestoides F, and Y. pseudotuberculosis PB1/+ was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 incorrect (i.e., observed frameshifts, extended start sites, and translated pseudogenes protein-coding sequences within the three current genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent pathogen, thus the discovery of many translated pseudogenes, including the insertion-ablated argD, underscores a need for functional analyses to investigate hypotheses related to divergence. Refinements included the discovery of a seemingly essential ribosomal protein, several virulence-associated factors, a transcriptional regulator, and many hypothetical proteins that were missed during annotation.

  2. Floral gene resources from basal angiosperms for comparative genomics research

    Directory of Open Access Journals (Sweden)

    Zhang Xiaohong

    2005-03-01

    Full Text Available Abstract Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04 generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii many known floral gene homologues have been captured, and (iii phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage

  3. The Korea Brassica Genome Project: a Glimpse of the Brassica Genome Based on Comparative Genome Analysis With Arabidopsis

    Directory of Open Access Journals (Sweden)

    Beom-Seok Park

    2006-04-01

    Full Text Available A complete genome sequence provides unlimited information in the sequenced organism as well as in related taxa. According to the guidance of the Multinational Brassica Genome Project (MBGP, the Korea Brassica Genome Project (KBGP is sequencing chromosome 1 (cytogenetically oriented chromosome #1 of Brassica rapa. We have selected 48 seed BACs on chromosome 1 using EST genetic markers and FISH analyses. Among them, 30 BAC clones have been sequenced and 18 are on the way. Comparative genome analyses of the EST sequences and sequenced BAC clones from Brassica chromosome 1 revealed their homeologous partner regions on the Arabidopsis genome and a syntenic comparative map between Brassica chromosome 1 and Arabidopsis chromosomes. In silico chromosome walking and clone validation have been successfully applied to extending sequence contigs based on the comparative map and BAC end sequences. In addition, we have defined the (pericentromeric heterochromatin blocks with centromeric tandem repeats, rDNA and centromeric retrotransposons. In-depth sequence analyses of five homeologous BAC clones and an Arabidopsis chromosomal region reveal overall co-linearity, with 82% sequence similarity. The data indicate that the Brassica genome has undergone triplication and subsequent gene losses after the divergence of Arabidopsis and Brassica. Based on in-depth comparative genome analyses, we propose a comparative genomics approach for conquering the Brassica genome. In 2005 we intend to construct an integrated physical map, including sequence information from 500 BAC clones and integration of fingerprinting data and end sequence data of more than 100 000 BAC clones. The sequences have been submitted to GenBank with accession numbers: 10 204 BAC ends of the KBrH library (CW978640–CW988843; KBrH138P04, AC155338; KBrH117N09, AC155337; KBrH097M21, AC155348; KBrH093K03, AC155347; KBrH081N08, AC155346; KBrH080L24, AC155345; KBrH077A05, AC155343; KBrH020D15

  4. Comparative genomics of defense systems in archaea and bacteria

    OpenAIRE

    Makarova, Kira S.; Wolf, Yuri I.; Koonin, Eugene V.

    2013-01-01

    Our knowledge of prokaryotic defense systems has vastly expanded as the result of comparative genomic analysis, followed by experimental validation. This expansion is both quantitative, including the discovery of diverse new examples of known types of defense systems, such as restriction-modification or toxin-antitoxin systems, and qualitative, including the discovery of fundamentally new defense mechanisms, such as the CRISPR-Cas immunity system. Large-scale statistical analysis reveals that...

  5. Online Genome Analysis Resources for Educators, a Comparative Review

    Directory of Open Access Journals (Sweden)

    Sarah Grace Prescott

    2012-08-01

    Full Text Available A comparative review of several companies that offer similar kits or services that allow students to isolate DNA (human and others, amplify it by PCR, and in some cases sequence the resulting sample.  The companies include:  Carolina® Biological Supply Company, Bio-Rad®, Edvotek® Inc., Hiram Genomics Store, and 23andMe.

  6. Comparative genome analysis and genome-guided physiological analysis of Roseobacter litoralis

    Directory of Open Access Journals (Sweden)

    Simon Meinhard

    2011-06-01

    Full Text Available Abstract Background Roseobacter litoralis OCh149, the type species of the genus, and Roseobacter denitrificans OCh114 were the first described organisms of the Roseobacter clade, an ecologically important group of marine bacteria. Both species were isolated from seaweed and are able to perform aerobic anoxygenic photosynthesis. Results The genome of R. litoralis OCh149 contains one circular chromosome of 4,505,211 bp and three plasmids of 93,578 bp (pRLO149_94, 83,129 bp (pRLO149_83 and 63,532 bp (pRLO149_63. Of the 4537 genes predicted for R. litoralis, 1122 (24.7% are not present in the genome of R. denitrificans. Many of the unique genes of R. litoralis are located in genomic islands and on plasmids. On pRLO149_83 several potential heavy metal resistance genes are encoded which are not present in the genome of R. denitrificans. The comparison of the heavy metal tolerance of the two organisms showed an increased zinc tolerance of R. litoralis. In contrast to R. denitrificans, the photosynthesis genes of R. litoralis are plasmid encoded. The activity of the photosynthetic apparatus was confirmed by respiration rate measurements, indicating a growth-phase dependent response to light. Comparative genomics with other members of the Roseobacter clade revealed several genomic regions that were only conserved in the two Roseobacter species. One of those regions encodes a variety of genes that might play a role in host association of the organisms. The catabolism of different carbon and nitrogen sources was predicted from the genome and combined with experimental data. In several cases, e.g. the degradation of some algal osmolytes and sugars, the genome-derived predictions of the metabolic pathways in R. litoralis differed from the phenotype. Conclusions The genomic differences between the two Roseobacter species are mainly due to lateral gene transfer and genomic rearrangements. Plasmid pRLO149_83 contains predominantly recently acquired genetic

  7. Comparative Omics-Driven Genome Annotation Refinement: Application across Yersiniae

    Energy Technology Data Exchange (ETDEWEB)

    Rutledge, Alexandra C.; Jones, Marcus B.; Chauhan, Sadhana; Purvine, Samuel O.; Sanford, James; Monroe, Matthew E.; Brewer, Heather M.; Payne, Samuel H.; Ansong, Charles; Frank, Bryan C.; Smith, Richard D.; Peterson, Scott; Motin, Vladimir L.; Adkins, Joshua N.

    2012-03-27

    Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. To date, the perceived value of manual curation for genome annotations is not offset by the real cost and time associated with the process. In order to balance the large number of sequences generated, the annotation process is now performed almost exclusively in an automated fashion for most genome sequencing projects. One possible way to reduce errors inherent to automated computational annotations is to apply data from 'omics' measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. This approach does require additional experimental and bioinformatics methods to include omics technologies; however, the approach is readily automatable and can benefit from rapid developments occurring in those research domains as well. The annotation process can be improved by experimental validation of transcription and translation and aid in the discovery of annotation errors. Here the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species, as is becoming common in sequencing efforts. Transcriptomic and proteomic data derived from three highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis pestoides F, and Y. pseudotuberculosis PB1/+) was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 previously incorrect protein-coding sequences (e.g., observed frameshifts, extended start sites, and translated pseudogenes) within the three current Yersinia genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent

  8. Genome analysis and comparative genomics of a Giardia intestinalis assemblage E isolate

    Directory of Open Access Journals (Sweden)

    Andersson Jan O

    2010-10-01

    Full Text Available Abstract Background Giardia intestinalis is a protozoan parasite that causes diarrhea in a wide range of mammalian species. To further understand the genetic diversity between the Giardia intestinalis species, we have performed genome sequencing and analysis of a wild-type Giardia intestinalis sample from the assemblage E group, isolated from a pig. Results We identified 5012 protein coding genes, the majority of which are conserved compared to the previously sequenced genomes of the WB and GS strains in terms of microsynteny and sequence identity. Despite this, there is an unexpectedly large number of chromosomal rearrangements and several smaller structural changes that are present in all chromosomes. Novel members of the VSP, NEK Kinase and HCMP gene families were identified, which may reveal possible mechanisms for host specificity and new avenues for antigenic variation. We used comparative genomics of the three diverse Giardia intestinalis isolates P15, GS and WB to define a core proteome for this species complex and to identify lineage-specific genes. Extensive analyses of polymorphisms in the core proteome of Giardia revealed differential rates of divergence among cellular processes. Conclusions Our results indicate that despite a well conserved core of genes there is significant genome variation between Giardia isolates, both in terms of gene content, gene polymorphisms, structural chromosomal variations and surface molecule repertoires. This study improves the annotation of the Giardia genomes and enables the identification of functionally important variation.

  9. Comparative Analysis of Codon Usage Bias Patterns in Microsporidian Genomes.

    Directory of Open Access Journals (Sweden)

    Heng Xiang

    Full Text Available The sub-3 Mbp genomes from microsporidian species of the Encephalitozoon genus are the smallest known among eukaryotes and paragons of genomic reduction and compaction in parasites. However, their diminutive stature is not characteristic of all Microsporidia, whose genome sizes vary by an order of magnitude. This large variability suggests that different evolutionary forces are applied on the group as a whole. In this study, we have compared the codon usage bias (CUB between eight taxonomically distinct microsporidian genomes: Encephalitozoon intestinalis, Encephalitozoon cuniculi, Spraguea lophii, Trachipleistophora hominis, Enterocytozoon bieneusi, Nematocida parisii, Nosema bombycis and Nosema ceranae. While the CUB was found to be weak in all eight Microsporidia, nearly all (98% of the optimal codons in S. lophii, T. hominis, E. bieneusi, N. parisii, N. bombycis and N. ceranae are fond of A/U in third position whereas most (64.6% optimal codons in the Encephalitozoon species E. intestinalis and E. cuniculi are biased towards G/C. Although nucleotide composition biases are likely the main factor driving the CUB in Microsporidia according to correlation analyses, directed mutational pressure also likely affects the CUB as suggested by ENc-plots, correspondence and neutrality analyses. Overall, the Encephalitozoon genomes were found to be markedly different from the other microsporidians and, despite being the first sequenced representatives of this lineage, are uncharacteristic of the group as a whole. The disparities observed cannot be attributed solely to differences in host specificity and we hypothesize that other forces are at play in the lineage leading to Encephalitozoon species.

  10. Comparative Analysis of Codon Usage Bias Patterns in Microsporidian Genomes.

    Science.gov (United States)

    Xiang, Heng; Zhang, Ruizhi; Butler, Robert R; Liu, Tie; Zhang, Li; Pombert, Jean-François; Zhou, Zeyang

    2015-01-01

    The sub-3 Mbp genomes from microsporidian species of the Encephalitozoon genus are the smallest known among eukaryotes and paragons of genomic reduction and compaction in parasites. However, their diminutive stature is not characteristic of all Microsporidia, whose genome sizes vary by an order of magnitude. This large variability suggests that different evolutionary forces are applied on the group as a whole. In this study, we have compared the codon usage bias (CUB) between eight taxonomically distinct microsporidian genomes: Encephalitozoon intestinalis, Encephalitozoon cuniculi, Spraguea lophii, Trachipleistophora hominis, Enterocytozoon bieneusi, Nematocida parisii, Nosema bombycis and Nosema ceranae. While the CUB was found to be weak in all eight Microsporidia, nearly all (98%) of the optimal codons in S. lophii, T. hominis, E. bieneusi, N. parisii, N. bombycis and N. ceranae are fond of A/U in third position whereas most (64.6%) optimal codons in the Encephalitozoon species E. intestinalis and E. cuniculi are biased towards G/C. Although nucleotide composition biases are likely the main factor driving the CUB in Microsporidia according to correlation analyses, directed mutational pressure also likely affects the CUB as suggested by ENc-plots, correspondence and neutrality analyses. Overall, the Encephalitozoon genomes were found to be markedly different from the other microsporidians and, despite being the first sequenced representatives of this lineage, are uncharacteristic of the group as a whole. The disparities observed cannot be attributed solely to differences in host specificity and we hypothesize that other forces are at play in the lineage leading to Encephalitozoon species. PMID:26057384

  11. The Genome Sequence of Caenorhabditis briggsae: A Platform for Comparative Genomics

    Directory of Open Access Journals (Sweden)

    Stein Lincoln D

    2003-01-01

    Full Text Available The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp and C. elegans (100.3 Mbp genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C

  12. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics.

    Directory of Open Access Journals (Sweden)

    Lincoln D Stein

    2003-11-01

    Full Text Available The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp and C. elegans (100.3 Mbp genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C

  13. WormBase: methods for data mining and comparative genomics.

    Science.gov (United States)

    Harris, Todd W; Stein, Lincoln D

    2006-01-01

    WormBase is a comprehensive repository for information on Caenorhabditis elegans and related nematodes. Although the primary web-based interface of WormBase (http:// www.wormbase.org/) is familiar to most C. elegans researchers, WormBase also offers powerful data-mining features for addressing questions of comparative genomics, genome structure, and evolution. In this chapter, we focus on data mining at WormBase through the use of flexible web interfaces, custom queries, and scripts. The intended audience includes users wishing to query the database beyond the confines of the web interface or fetch data en masse. No knowledge of programming is necessary or assumed, although users with intermediate skills in the Perl scripting language will be able to utilize additional data-mining approaches. PMID:16988424

  14. Comparative analysis of cytogenetic manifestations of human genome instability

    International Nuclear Information System (INIS)

    The comparative analysis of cytogenetic manifestations of human genome instability was carried out. The studied parameters are the micronuclei rate (MNR), the level of single and double chromosome fragment and the level of premature chromatid division (PCD). PCD and chromosome fragments were chosen as anomalies that possibly result in MN formation. We analysed the MNR in buccal epithelium (BE) and peripheral blood lymphocytes (PBL), the level of single and double chromosome fragment as well as level PCD - in PBL only. Average MNR in BE was higher than in PBL. The studied parameters are independent ones and have to be considered altogether for more comprehensive evaluation of the level and peculiarities of manifestation of human genome instability

  15. The Whole Genome Assembly and Comparative Genomic Research of Thellungiella parvula (Extremophile Crucifer Mitochondrion

    Directory of Open Access Journals (Sweden)

    Xuelin Wang

    2016-01-01

    Full Text Available The complete nucleotide sequences of the mitochondrial (mt genome of an extremophile species Thellungiella parvula (T. parvula have been determined with the lengths of 255,773 bp. T. parvula mt genome is a circular sequence and contains 32 protein-coding genes, 19 tRNA genes, and three ribosomal RNA genes with a 11.5% coding sequence. The base composition of 27.5% A, 27.5% T, 22.7% C, and 22.3% G in descending order shows a slight bias of 55% AT. Fifty-three repeats were identified in the mitochondrial genome of T. parvula, including 24 direct repeats, 28 tandem repeats (TRs, and one palindromic repeat. Furthermore, a total of 199 perfect microsatellites have been mined with a high A/T content (83.1% through simple sequence repeat (SSR analysis and they were distributed unevenly within this mitochondrial genome. We also analyzed other plant mitochondrial genomes’ evolution in general, providing clues for the understanding of the evolution of organelles genomes in plants. Comparing with other Brassicaceae species, T. parvula is related to Arabidopsis thaliana whose characters of low temperature resistance have been well documented. This study will provide important genetic tools for other Brassicaceae species research and improve yields of economically important plants.

  16. Comparative genome analysis of Bacillus cereus group genomes withBacillus subtilis

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D' Souza, Mark; Larsen, Niels; Pusch,Gordon; Liolios, Konstantinos; Grechkin, Yuri; Lapidus, Alla; Goltsman,Eugene; Chu, Lien; Fonstein, Michael; Ehrlich, S. Dusko; Overbeek, Ross; Kyrpides, Nikos; Ivanova, Natalia

    2005-09-14

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis sub spp israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-layer proteins suggesting differences in their phenotype were identified. The B. cereus group has signal transduction systems including a tyrosine kinase related to two-component system histidine kinases from B. subtilis. A model for regulation of the stress responsive sigma factor sigmaB in the B. cereus group different from the well studied regulation in B. subtilis has been proposed. Despite a high degree of chromosomal synteny among these genomes, significant differences in cell wall and spore coat proteins that contribute to the survival and adaptation in specific hosts has been identified.

  17. Comparative Genomics of the Ubiquitous, Hydrocarbon-degrading Genus Marinobacter

    Science.gov (United States)

    Singer, E.; Webb, E.; Edwards, K. J.

    2012-12-01

    The genus Marinobacter is amongst the most ubiquitous in the global oceans and strains have been isolated from a wide variety of marine environments, including offshore oil-well heads, coastal thermal springs, Antarctic sea water, saline soils and associations with diatoms and dinoflagellates. Many strains have been recognized to be important hydrocarbon degraders in various marine habitats presenting sometimes extreme pH or salinity conditions. Analysis of the genome of M. aquaeolei revealed enormous adaptation versatility with an assortment of strategies for carbon and energy acquisition, sensation, and defense. In an effort to elucidate the ecological and biogeochemical significance of the Marinobacters, seven Marinobacter strains from diverse environments were included in a comparative genomics study. Genomes were screened for metabolic and adaptation potential to elucidate the strategies responsible for the omnipresence of the Marinobacter genus and their remedial action potential in hydrocarbon-polluted waters. The core genome predominantly encodes for key genes involved in hydrocarbon degradation, biofilm-relevant processes, including utilization of external DNA, halotolerance, as well as defense mechanisms against heavy metals, antibiotics, and toxins. All Marinobacter strains were observed to degrade a wide spectrum of hydrocarbon species, including aliphatic, polycyclic aromatic as well as acyclic isoprenoid compounds. Various genes predicted to facilitate hydrocarbon degradation, e.g. alkane 1-monooxygenase, appear to have originated from lateral gene transfer as they are located on gene clusters of 10-20% lower GC-content compared to genome averages and are flanked by transposases. Top ortholog hits are found in other hydrocarbon degrading organisms, e.g. Alcanivorax borkumensis. Strategies for hydrocarbon uptake encoded by various Marinobacter strains include cell surface hydrophobicity adaptation via capsular polysaccharide biosynthesis and attachment

  18. Comparative genomics of Serratia spp.: two paths towards endosymbiotic life.

    Directory of Open Access Journals (Sweden)

    Alejandro Manzano-Marín

    Full Text Available Symbiosis is a widespread phenomenon in nature, in which insects show a great number of these associations. Buchnera aphidicola, the obligate endosymbiont of aphids, coexists in some species with another intracellular bacterium, Serratia symbiotica. Of particular interest is the case of the cedar aphid Cinara cedri, where B. aphidicola BCc and S. symbiotica SCc need each other to fulfil their symbiotic role with the insect. Moreover, various features seem to indicate that S. symbiotica SCc is closer to an obligate endosymbiont than to other facultative S. symbiotica, such as the one described for the aphid Acirthosyphon pisum (S. symbiotica SAp. This work is based on the comparative genomics of five strains of Serratia, three free-living and two endosymbiotic ones (one facultative and one obligate which should allow us to dissect the genome reduction taking place in the adaptive process to an intracellular life-style. Using a pan-genome approach, we have identified shared and strain-specific genes from both endosymbiotic strains and gained insight into the different genetic reduction both S. symbiotica have undergone. We have identified both retained and reduced functional categories in S. symbiotica compared to the Free-Living Serratia (FLS that seem to be related with its endosymbiotic role in their specific host-symbiont systems. By means of a phylogenomic reconstruction we have solved the position of both endosymbionts with confidence, established the probable insect-pathogen origin of the symbiotic clade as well as the high amino-acid substitution rate in S. symbiotica SCc. Finally, we were able to quantify the minimal number of rearrangements suffered in the endosymbiotic lineages and reconstruct a minimal rearrangement phylogeny. All these findings provide important evidence for the existence of at least two distinctive S. symbiotica lineages that are characterized by different rearrangements, gene content, genome size and branch lengths.

  19. Xylella fastidiosa comparative genomic database is an information resource to explore the annotation, genomic features, and biology of different strains

    Directory of Open Access Journals (Sweden)

    Alessandro M. Varani

    2012-01-01

    Full Text Available The Xylella fastidiosa comparative genomic database is a scientific resource with the aim to provide a user-friendly interface for accessing high-quality manually curated genomic annotation and comparative sequence analysis, as well as for identifying and mapping prophage-like elements, a marked feature of Xylella genomes. Here we describe a database and tools for exploring the biology of this important plant pathogen. The hallmarks of this database are the high quality genomic annotation, the functional and comparative genomic analysis and the identification and mapping of prophage-like elements. It is available from web site http://www.xylella.lncc.br.

  20. Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium

    Energy Technology Data Exchange (ETDEWEB)

    Ma, Li Jun; van der Does, H. C.; Borkovich, Katherine A.; Coleman, Jeffrey J.; Daboussi, Marie-Jose; Di Pietro, Antonio; Dufresne, Marie; Freitag, Michael; Grabherr, Manfred; Henrissat, Bernard; Houterman, Petra M.; Kang, Seogchan; Shim, Won-Bo; Wolochuk, Charles; Xie, Xiaohui; Xu, Jin Rong; Antoniw, John; Baker, Scott E.; Bluhm, Burton H.; Breakspear, Andrew; Brown, Daren W.; Butchko, Robert A.; Chapman, Sinead; Coulson, Richard; Coutinho, Pedro M.; Danchin, Etienne G.; Diener, Andrew; Gale, Liane R.; Gardiner, Donald; Goff, Steven; Hammond-Kossack, Kim; Hilburn, Karen; Hua-Van, Aurelie; Jonkers, Wilfried; Kazan, Kemal; Kodira, Chinnappa D.; Koehrsen, Michael; Kumar, Lokesh; Lee, Yong Hwan; Li, Liande; Manners, John M.; Miranda-Saavedra, Diego; Mukherjee, Mala; Park, Gyungsoon; Park, Jongsun; Park, Sook Young; Proctor, Robert H.; Regev, Aviv; Ruiz-Roldan, M. C.; Sain, Divya; Sakthikumar, Sharadha; Sykes, Sean; Schwartz, David C.; Turgeon, Barbara G.; Wapinski, Ilan; Yoder, Olen; Young, Sarah; Zeng, Qiandong; Zhou, Shiguo; Galagan, James; Cuomo, Christina A.; Kistler, H. Corby; Rep, Martijn

    2010-03-18

    Fusarium species are among the most important phytopathogenic and toxigenic fungi, having significant impact on crop production and animal health. Distinctively, members of the F. oxysporum species complex exhibit wide host range but discontinuously distributed host specificity, reflecting remarkable genetic adaptability. To understand the molecular underpinnings of diverse phenotypic traits and their evolution in Fusarium, we compared the genomes of three economically important and phylogenetically related, yet phenotypically diverse plant-pathogenic species, F. graminearum, F. verticillioides and F. oxysporum f. sp. lycopersici. Our analysis revealed greatly expanded lineage-specific (LS) genomic regions in F. oxysporum that include four entire chromosomes, accounting for more than one-quarter of the genome. LS regions are rich in transposons and genes with distinct evolutionary profiles but related to pathogenicity. Experimentally, we demonstrate for the first time the transfer of two LS chromosomes between strains of F. oxysporum, resulting in the conversion of a non-pathogenic strain into a pathogen. Transfer of LS chromosomes between otherwise genetically isolated strains explains the polyphyletic origin of host specificity and the emergence of new pathogenic lineages in the F. oxysporum species complex, putting the evolution of fungal pathogenicity into a new perspective.

  1. The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081.

    Directory of Open Access Journals (Sweden)

    Nicholas R Thomson

    2006-12-01

    Full Text Available The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype 0:8; biotype 1B and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing, the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common

  2. Reconstructing the Evolution of Brachypodium Genomes Using Comparative Chromosome Painting

    Science.gov (United States)

    Betekhtin, Alexander; Jenkins, Glyn; Hasterok, Robert

    2014-01-01

    Brachypodium distachyon is a model for the temperate cereals and grasses and has a biology, genomics infrastructure and cytogenetic platform fit for purpose. It is a member of a genus with fewer than 20 species, which have different genome sizes, basic chromosome numbers and ploidy levels. The phylogeny and interspecific relationships of this group have not to date been resolved by sequence comparisons and karyotypical studies. The aims of this study are not only to reconstruct the evolution of Brachypodium karyotypes to resolve the phylogeny, but also to highlight the mechanisms that shape the evolution of grass genomes. This was achieved through the use of comparative chromosome painting (CCP) which hybridises fluorescent, chromosome-specific probes derived from B. distachyon to homoeologous meiotic chromosomes of its close relatives. The study included five diploids (B. distachyon 2n = 10, B. sylvaticum 2n = 18, B. pinnatum 2n = 16; 2n = 18, B. arbuscula 2n = 18 and B. stacei 2n = 20) three allotetraploids (B. pinnatum 2n = 28, B. phoenicoides 2n = 28 and B. hybridum 2n = 30), and two species of unknown ploidy (B. retusum 2n = 38 and B. mexicanum 2n = 40). On the basis of the patterns of hybridisation and incorporating published data, we propose two alternative, but similar, models of karyotype evolution in the genus Brachypodium. According to the first model, the extant genome of B. distachyon derives from B. mexicanum or B. stacei by several rounds of descending dysploidy, and the other diploids evolve from B. distachyon via ascending dysploidy. The allotetraploids arise by interspecific hybridisation and chromosome doubling between B. distachyon and other diploids. The second model differs from the first insofar as it incorporates an intermediate 2n = 18 species between the B. mexicanum or B. stacei progenitors and the dysploidic B. distachyon. PMID:25493646

  3. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources.

    Directory of Open Access Journals (Sweden)

    Cassidy L Klima

    Full Text Available Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype with serotypes 1 (S1 and 6 (S6 isolated from pneumonic lesions and serotype 2 (S2 found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence associated genes were identified within 14 distinct prophage, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2-8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferases group 1 gene are present in S1 and S6 strains only indicating these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design

  4. Reconstructing the Evolution of Brachypodium Genomes Using Comparative Chromosome Painting.

    Science.gov (United States)

    Betekhtin, Alexander; Jenkins, Glyn; Hasterok, Robert

    2014-01-01

    Brachypodium distachyon is a model for the temperate cereals and grasses and has a biology, genomics infrastructure and cytogenetic platform fit for purpose. It is a member of a genus with fewer than 20 species, which have different genome sizes, basic chromosome numbers and ploidy levels. The phylogeny and interspecific relationships of this group have not to date been resolved by sequence comparisons and karyotypical studies. The aims of this study are not only to reconstruct the evolution of Brachypodium karyotypes to resolve the phylogeny, but also to highlight the mechanisms that shape the evolution of grass genomes. This was achieved through the use of comparative chromosome painting (CCP) which hybridises fluorescent, chromosome-specific probes derived from B. distachyon to homoeologous meiotic chromosomes of its close relatives. The study included five diploids (B. distachyon 2n = 10, B. sylvaticum 2n = 18, B. pinnatum 2n = 16; 2n = 18, B. arbuscula 2n = 18 and B. stacei 2n = 20) three allotetraploids (B. pinnatum 2n = 28, B. phoenicoides 2n = 28 and B. hybridum 2n = 30), and two species of unknown ploidy (B. retusum 2n = 38 and B. mexicanum 2n = 40). On the basis of the patterns of hybridisation and incorporating published data, we propose two alternative, but similar, models of karyotype evolution in the genus Brachypodium. According to the first model, the extant genome of B. distachyon derives from B. mexicanum or B. stacei by several rounds of descending dysploidy, and the other diploids evolve from B. distachyon via ascending dysploidy. The allotetraploids arise by interspecific hybridisation and chromosome doubling between B. distachyon and other diploids. The second model differs from the first insofar as it incorporates an intermediate 2n = 18 species between the B. mexicanum or B. stacei progenitors and the dysploidic B. distachyon. PMID:25493646

  5. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources.

    Science.gov (United States)

    Klima, Cassidy L; Cook, Shaun R; Zaheer, Rahat; Laing, Chad; Gannon, Vick P; Xu, Yong; Rasmussen, Jay; Potter, Andrew; Hendrick, Steve; Alexander, Trevor W; McAllister, Tim A

    2016-01-01

    Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype with serotypes 1 (S1) and 6 (S6) isolated from pneumonic lesions and serotype 2 (S2) found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence associated genes were identified within 14 distinct prophage, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2-8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferases group 1 gene are present in S1 and S6 strains only indicating these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design intended to reduce the

  6. Reconstructing the Evolution of Brachypodium Genomes Using Comparative Chromosome Painting.

    Directory of Open Access Journals (Sweden)

    Alexander Betekhtin

    Full Text Available Brachypodium distachyon is a model for the temperate cereals and grasses and has a biology, genomics infrastructure and cytogenetic platform fit for purpose. It is a member of a genus with fewer than 20 species, which have different genome sizes, basic chromosome numbers and ploidy levels. The phylogeny and interspecific relationships of this group have not to date been resolved by sequence comparisons and karyotypical studies. The aims of this study are not only to reconstruct the evolution of Brachypodium karyotypes to resolve the phylogeny, but also to highlight the mechanisms that shape the evolution of grass genomes. This was achieved through the use of comparative chromosome painting (CCP which hybridises fluorescent, chromosome-specific probes derived from B. distachyon to homoeologous meiotic chromosomes of its close relatives. The study included five diploids (B. distachyon 2n = 10, B. sylvaticum 2n = 18, B. pinnatum 2n = 16; 2n = 18, B. arbuscula 2n = 18 and B. stacei 2n = 20 three allotetraploids (B. pinnatum 2n = 28, B. phoenicoides 2n = 28 and B. hybridum 2n = 30, and two species of unknown ploidy (B. retusum 2n = 38 and B. mexicanum 2n = 40. On the basis of the patterns of hybridisation and incorporating published data, we propose two alternative, but similar, models of karyotype evolution in the genus Brachypodium. According to the first model, the extant genome of B. distachyon derives from B. mexicanum or B. stacei by several rounds of descending dysploidy, and the other diploids evolve from B. distachyon via ascending dysploidy. The allotetraploids arise by interspecific hybridisation and chromosome doubling between B. distachyon and other diploids. The second model differs from the first insofar as it incorporates an intermediate 2n = 18 species between the B. mexicanum or B. stacei progenitors and the dysploidic B. distachyon.

  7. Complete genome sequence of Enterococcus faecium strain TX16 and comparative genomic analysis of Enterococcus faecium genomes

    Directory of Open Access Journals (Sweden)

    Qin Xiang

    2012-07-01

    Full Text Available Abstract Background Enterococci are among the leading causes of hospital-acquired infections in the United States and Europe, with Enterococcus faecalis and Enterococcus faecium being the two most common species isolated from enterococcal infections. In the last decade, the proportion of enterococcal infections caused by E. faecium has steadily increased compared to other Enterococcus species. Although the underlying mechanism for the gradual replacement of E. faecalis by E. faecium in the hospital environment is not yet understood, many studies using genotyping and phylogenetic analysis have shown the emergence of a globally dispersed polyclonal subcluster of E. faecium strains in clinical environments. Systematic study of the molecular epidemiology and pathogenesis of E. faecium has been hindered by the lack of closed, complete E. faecium genomes that can be used as references. Results In this study, we report the complete genome sequence of the E. faecium strain TX16, also known as DO, which belongs to multilocus sequence type (ST 18, and was the first E. faecium strain ever sequenced. Whole genome comparison of the TX16 genome with 21 E. faecium draft genomes confirmed that most clinical, outbreak, and hospital-associated (HA strains (including STs 16, 17, 18, and 78, in addition to strains of non-hospital origin, group in the same clade (referred to as the HA clade and are evolutionally considerably more closely related to each other by phylogenetic and gene content similarity analyses than to isolates in the community-associated (CA clade with approximately a 3–4% average nucleotide sequence difference between the two clades at the core genome level. Our study also revealed that many genomic loci in the TX16 genome are unique to the HA clade. 380 ORFs in TX16 are HA-clade specific and antibiotic resistance genes are enriched in HA-clade strains. Mobile elements such as IS16 and transposons were also found almost exclusively in HA strains

  8. Comparative analysis of whole genome structure of Streptococcus suis using whole genome PCR scanning

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    An outbreak associated with Streptococcus suis infection in humans emerged in Sichuan province, China in 2005. The outbreak is atypical for the apparent large number of human cases, high fatality rate and geographical spread. To determine whether the bacterium has changed, we compared both human and animal isolates from the Sichuan outbreak with those collected previously within China and in other countries using whole genome PCR scanning (WGPScaning) comparative sequencing of several known virulence factor genes and multilocus sequence typing (MLST) analysis. WGPScanning analysis showed that all primer pairs yielded PCR products of the expected sizes in all four strains tested. The nucleotide sequences of all the detected virulence factor genes are identical in the four strains and MLST results showed that the four isolates studied and reference strain all belonged to the ST1 com-plex. No new genetic changes were found in the genome structure of the isolates from this Sichuan outbreak.

  9. Comparative analysis of whole genome structure of Streptococcus suis using whole genome PCR scanning

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    An outbreak associated with Streptococcus suis infection in humans emerged in Sichuan province, China in 2005. The outbreak is atypical for the apparent large number of human cases, high fatality rate and geographical spread. To determine whether the bacterium has changed, we compared both human and animal isolates from the Sichuan outbreak with those collected previously within China and in other countries using whole genome PCR scanning (WGPScaning) comparative sequencing of several known virulence factor genes and multilocus sequence typing (MLST) analysis. WGPScanning analysis showed that all primer pairs yielded PCR products of the expected sizes in all four strains tested. The nucleotide sequences of all the detected virulence factor genes are identical in the four strains and MLST results showed that the four isolates studied and reference strain all belonged to the ST1 complex. No new genetic changes were found in the genome structure of the isolates from this Sichuan outbreak.

  10. Genome Sequence Analyses of Pseudomonas savastanoi pv. glycinea and Subtractive Hybridization-Based Comparative Genomics with Nine Pseudomonads

    OpenAIRE

    Qi, Mingsheng; Wang, Dongping; Bradley, Carl A.; Zhao, Youfu

    2011-01-01

    Bacterial blight, caused by Pseudomonas savastanoi pv. glycinea (Psg), is a common disease of soybean. In an effort to compare a current field isolate with one isolated in the early 1960s, the genomes of two Psg strains, race 4 and B076, were sequenced using 454 pyrosequencing. The genomes of both Psg strains share more than 4,900 highly conserved genes, indicating very low genetic diversity between Psg genomes. Though conserved, genome rearrangements and recombination events occur commonly w...

  11. Comparative genomics of Blattabacterium cuenoti: the frozen legacy of an ancient endosymbiont genome.

    Science.gov (United States)

    Patiño-Navarrete, Rafael; Moya, Andrés; Latorre, Amparo; Peretó, Juli

    2013-01-01

    Many insect species have established long-term symbiotic relationships with intracellular bacteria. Symbiosis with bacteria has provided insects with novel ecological capabilities, which have allowed them colonize previously unexplored niches. Despite its importance to the understanding of the emergence of biological complexity, the evolution of symbiotic relationships remains hitherto a mystery in evolutionary biology. In this study, we contribute to the investigation of the evolutionary leaps enabled by mutualistic symbioses by sequencing the genome of Blattabacterium cuenoti, primary endosymbiont of the omnivorous cockroach Blatta orientalis, and one of the most ancient symbiotic associations. We perform comparative analyses between the Blattabacterium cuenoti genome and that of previously sequenced endosymbionts, namely those from the omnivorous hosts the Blattella germanica (Blattelidae) and Periplaneta americana (Blattidae), and the endosymbionts harbored by two wood-feeding hosts, the subsocial cockroach Cryptocercus punctulatus (Cryptocercidae) and the termite Mastotermes darwiniensis (Termitidae). Our study shows a remarkable evolutionary stasis of this symbiotic system throughout the evolutionary history of cockroaches and the deepest branching termite M. darwiniensis, in terms of not only chromosome architecture but also gene content, as revealed by the striking conservation of the Blattabacterium core genome. Importantly, the architecture of central metabolic network inferred from the endosymbiont genomes was established very early in Blattabacterium evolutionary history and could be an outcome of the essential role played by this endosymbiont in the host's nitrogen economy. PMID:23355305

  12. Comparative genomics of Mycoplasma: analysis of conserved essential genes and diversity of the pan-genome.

    Directory of Open Access Journals (Sweden)

    Wei Liu

    Full Text Available Mycoplasma, the smallest self-replicating organism with a minimal metabolism and little genomic redundancy, is expected to be a close approximation to the minimal set of genes needed to sustain bacterial life. This study employs comparative evolutionary analysis of twenty Mycoplasma genomes to gain an improved understanding of essential genes. By analyzing the core genome of mycoplasmas, we finally revealed the conserved essential genes set for mycoplasma survival. Further analysis showed that the core genome set has many characteristics in common with experimentally identified essential genes. Several key genes, which are related to DNA replication and repair and can be disrupted in transposon mutagenesis studies, may be critical for bacteria survival especially over long period natural selection. Phylogenomic reconstructions based on 3,355 homologous groups allowed robust estimation of phylogenetic relatedness among mycoplasma strains. To obtain deeper insight into the relative roles of molecular evolution in pathogen adaptation to their hosts, we also analyzed the positive selection pressures on particular sites and lineages. There appears to be an approximate correlation between the divergence of species and the level of positive selection detected in corresponding lineages.

  13. Comparative Genomics and Transcriptomic Analysis of Mycobacterium Kansasii

    KAUST Repository

    Alzahid, Yara

    2014-04-01

    The group of Mycobacteria is one of the most intensively studied bacterial taxa, as they cause the two historical and worldwide known diseases: leprosy and tuberculosis. Mycobacteria not identified as tuberculosis or leprosy complex, have been referred to by ‘environmental mycobacteria’ or ‘Nontuberculous mycobacteria (NTM). Mycobacterium kansasii (M. kansasii) is one of the most frequent NTM pathogens, as it causes pulmonary disease in immuno-competent patients and pulmonary, and disseminated disease in patients with various immuno-deficiencies. There have been five documented subtypes of this bacterium, by different molecular typing methods, showing that type I causes tuberculosis-like disease in healthy individuals, and type II in immune-compromised individuals. The remaining types are said to be environmental, thereby, not causing any diseases. The aim of this project was to conduct a comparative genomic study of M. kansasii types I-V and investigating the gene expression level of those types. From various comparative genomics analysis, provided genomics evidence on why M. kansasii type I is considered pathogenic, by focusing on three key elements that are involved in virulence of Mycobacteria: ESX secretion system, Phospholipase c (plcb) and Mammalian cell entry (Mce) operons. The results showed the lack of the espA operon in types II-V, which renders the ESX- 1 operon dysfunctional, as espA is one of the key factors that control this secretion system. However, gene expression analysis showed this operon to be deleted in types II, III and IV. Furthermore, plcB was found to be truncated in types III and IV. Analysis of Mce operons (1-4) show that mce-1 operon is duplicated, mce-2 is absent and mce-3 and mce-4 is present in one copy in M. kansasii types I-V. Gene expression profiles of type I-IV, showed that the secreted proteins of ESX-1 were slightly upregulated in types II-IV when compared to type I and the secreted forms of ESX-5 were highly down

  14. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome.

    OpenAIRE

    Byrappa Venkatesh; Kirkness, Ewen F.; Yong-Hwee Loh; Halpern, Aaron L; Lee, Alison P.; Justin Johnson; Nidhi Dandona; Viswanathan, Lakshmi D; Alice Tay; J Craig Venter; Strausberg, Robert L; Sydney Brenner

    2007-01-01

    Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of t...

  15. Comparative genomic characterization of citrus-associated Xylella fastidiosa strains

    Directory of Open Access Journals (Sweden)

    Nunes Luiz R

    2007-12-01

    Full Text Available Abstract Background The xylem-inhabiting bacterium Xylella fastidiosa (Xf is the causal agent of Pierce's disease (PD in vineyards and citrus variegated chlorosis (CVC in orange trees. Both of these economically-devastating diseases are caused by distinct strains of this complex group of microorganisms, which has motivated researchers to conduct extensive genomic sequencing projects with Xf strains. This sequence information, along with other molecular tools, have been used to estimate the evolutionary history of the group and provide clues to understand the capacity of Xf to infect different hosts, causing a variety of symptoms. Nonetheless, although significant amounts of information have been generated from Xf strains, a large proportion of these efforts has concentrated on the study of North American strains, limiting our understanding about the genomic composition of South American strains – which is particularly important for CVC-associated strains. Results This paper describes the first genome-wide comparison among South American Xf strains, involving 6 distinct citrus-associated bacteria. Comparative analyses performed through a microarray-based approach allowed identification and characterization of large mobile genetic elements that seem to be exclusive to South American strains. Moreover, a large-scale sequencing effort, based on Suppressive Subtraction Hybridization (SSH, identified 290 new ORFs, distributed in 135 Groups of Orthologous Elements, throughout the genomes of these bacteria. Conclusion Results from microarray-based comparisons provide further evidence concerning activity of horizontally transferred elements, reinforcing their importance as major mediators in the evolution of Xf. Moreover, the microarray-based genomic profiles showed similarity between Xf strains 9a5c and Fb7, which is unexpected, given the geographical and chronological differences associated with the isolation of these microorganisms. The newly

  16. Comparative genomic analysis of Vibrio parahaemolyticus: serotype conversion and virulence

    Directory of Open Access Journals (Sweden)

    Gil Ana I

    2011-06-01

    Full Text Available Abstract Background Vibrio parahaemolyticus is a common cause of foodborne disease. Beginning in 1996, a more virulent strain having serotype O3:K6 caused major outbreaks in India and other parts of the world, resulting in the emergence of a pandemic. Other serovariants of this strain emerged during its dissemination and together with the original O3:K6 were termed strains of the pandemic clone. Two genomes, one of this virulent strain and one pre-pandemic strain have been sequenced. We sequenced four additional genomes of V. parahaemolyticus in this study that were isolated from different geographical regions and time points. Comparative genomic analyses of six strains of V. parahaemolyticus isolated from Asia and Peru were performed in order to advance knowledge concerning the evolution of V. parahaemolyticus; specifically, the genetic changes contributing to serotype conversion and virulence. Two pre-pandemic strains and three pandemic strains, isolated from different geographical regions, were serotype O3:K6 and either toxin profiles (tdh+, trh- or (tdh-, trh+. The sixth pandemic strain sequenced in this study was serotype O4:K68. Results Genomic analyses revealed that the trh+ and tdh+ strains had different types of pathogenicity islands and mobile elements as well as major structural differences between the tdh pathogenicity islands of the pre-pandemic and pandemic strains. In addition, the results of single nucleotide polymorphism (SNP analysis showed that 94% of the SNPs between O3:K6 and O4:K68 pandemic isolates were within a 141 kb region surrounding the O- and K-antigen-encoding gene clusters. The "core" genes of V. parahaemolyticus were also compared to those of V. cholerae and V. vulnificus, in order to delineate differences between these three pathogenic species. Approximately one-half (49-59% of each species' core genes were conserved in all three species, and 14-24% of the core genes were species-specific and in different

  17. Comparative genomics in cyprinids: common carp ESTs help the annotation of the zebrafish genome

    Directory of Open Access Journals (Sweden)

    Srinivasan Hamsa

    2006-12-01

    Our data show that there is sufficient homology between the transcribed sequences of common carp and zebrafish to warrant an even deeper cyprinid transcriptome comparison. On the other hand, the comparative analysis illustrates the value in utilizing partially sequenced transcriptomes to understand gene structure in this diverse teleost group. We highlight the need for integrated resources to leverage the wealth of fragmented genomic data.

  18. Comparative genomics of mitochondria in chlorarachniophyte algae: endosymbiotic gene transfer and organellar genome dynamics

    Science.gov (United States)

    Tanifuji, Goro; Archibald, John M.; Hashimoto, Tetsuo

    2016-02-01

    Chlorarachniophyte algae possess four DNA-containing compartments per cell, the nucleus, mitochondrion, plastid and nucleomorph, the latter being a relic nucleus derived from a secondary endosymbiont. While the evolutionary dynamics of plastid and nucleomorph genomes have been investigated, a comparative investigation of mitochondrial genomes (mtDNAs) has not been carried out. We have sequenced the complete mtDNA of Lotharella oceanica and compared it to that of another chlorarachniophyte, Bigelowiella natans. The linear mtDNA of L. oceanica is 36.7 kbp in size and contains 35 protein genes, three rRNAs and 24 tRNAs. The codons GUG and UUG appear to be capable of acting as initiation codons in the chlorarachniophyte mtDNAs, in addition to AUG. Rpl16, rps4 and atp8 genes are missing in L.oceanica mtDNA, despite being present in B. natans mtDNA. We searched for, and found, mitochondrial rpl16 and rps4 genes with spliceosomal introns in the L. oceanica nuclear genome, indicating that mitochondrion-to-host-nucleus gene transfer occurred after the divergence of these two genera. Despite being of similar size and coding capacity, the level of synteny between L. oceanica and B. natans mtDNA is low, suggesting frequent rearrangements. Overall, our results suggest that chlorarachniophyte mtDNAs are more evolutionarily dynamic than their plastid counterparts.

  19. Chromosomal imbalances revealed in primary rhabdomyosarcomas by comparative genomic hybridization

    Institute of Scientific and Technical Information of China (English)

    LI Qiao-xin; LIU Chun-xia; CHUN Cai-pu; QI Yan; CHANG Bin; LI Xin-xia; CHEN Yun-zhao; NONG Wei-xia; LI Hong-an; LI Feng

    2009-01-01

    Background Previous cytogenetic studies revealed aberrations varied among the throe subtypes of rhabdomyosarcoma. We profiled chromosomal imbalances in the different subtypes and investigated the relationships between clinical parameters and genomic aberrations.Methods Comparative genomic hybridization was used to investigate genomic imbalances in 25 cases of primary rhabdomyosarcomas and two rhabdomyosarcoma cell lines. Specimens were reviewed to determine histological type, pathological grading and clinical staging.Results Changes involving one or more regions of the genome were seen in all rhabdomyosarcomal patients. For rhabdomyosarcoma, DNA sequence gains were most frequently (>30%) seen in chromosomes 2p, 12q, 6p, 9q, 10q, 1p,2q, 6q, 8q, 15q and 18q; losses from 3p, 11p and 6p. In aggressive alveolar rhabdomyosarcoma, frequent gains were seen on chromosomes 12q, 2p, 6p, 2q, 4q, 10q and 15q; losses from 3p, 6p, 1q and 5q. For embryonic rhabdomyosarcoma, frequent gains were on 7p, 9q, 2p, 18q, 1p and 8q; losses only from 11p. Frequently gained chromosome arms of translocation associated with rhabdomyosarcoma were 12q, 2, 6, 10q, 4q and 15q; losses from 3p,6p and 5q. The frequently gained chromosome arms of nontranslocation associated with rhabdomyosarcoma were 2p,9q and 18q, while 11p and 14q were the frequently lost chromosome arms. Gains on chromosome 12q were significantly correlated with translocation type. Gains on chromosome 9q were significantly correlated with clinical staging. Conclusions Gains on chromosomes 2p, 12q, 6p, 9q, 10q, 1p, 2q, 6q, 8q, 15q and 18q and losses on chromosomes 3p, 11p and 6p may be related to rhabdomyosarcomal carcinogenesis. Furthermore, gains on chromosome 12q may be correlated with translocation and gains on chromosome 9q with the early stages of rhabdomyosarcoma.

  20. Comparative genomics of pectinacetylesterases: Insight on function and biology.

    Science.gov (United States)

    de Souza, Amancio José; Pauly, Markus

    2015-01-01

    Pectin acetylation influences the gelling ability of this important plant polysaccharide for the food industry. Plant apoplastic pectinacetylesterases (PAEs) play a key role in regulating the degree of pectin acetylation and modifying their expression thus represents one way to engineer plant polysaccharides for food applications. Identifying the major active enzymes within the PAE gene family will aid in our understanding of this biological phenomena as well as provide the tools for direct trait manipulation. Using comparative genomics we propose that there is a minimal set of 4 distinct PAEs in plants. Possible functional diversification of the PAE family in the grasses is also explored with the identification of 3 groups of PAE genes specific to grasses. PMID:26237162

  1. Evolutionary insights into scleractinian corals using comparative genomic hybridizations.

    KAUST Repository

    Aranda, Manuel

    2012-09-21

    Coral reefs belong to the most ecologically and economically important ecosystems on our planet. Yet, they are under steady decline worldwide due to rising sea surface temperatures, disease, and pollution. Understanding the molecular impact of these stressors on different coral species is imperative in order to predict how coral populations will respond to this continued disturbance. The use of molecular tools such as microarrays has provided deep insight into the molecular stress response of corals. Here, we have performed comparative genomic hybridizations (CGH) with different coral species to an Acropora palmata microarray platform containing 13,546 cDNA clones in order to identify potentially rapidly evolving genes and to determine the suitability of existing microarray platforms for use in gene expression studies (via heterologous hybridization).

  2. Establishing a framework for comparative analysis of genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied on protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.

  3. Evolutionary insights into scleractinian corals using comparative genomic hybridizations

    Directory of Open Access Journals (Sweden)

    Aranda Manuel

    2012-09-01

    Full Text Available Abstract Background Coral reefs belong to the most ecologically and economically important ecosystems on our planet. Yet, they are under steady decline worldwide due to rising sea surface temperatures, disease, and pollution. Understanding the molecular impact of these stressors on different coral species is imperative in order to predict how coral populations will respond to this continued disturbance. The use of molecular tools such as microarrays has provided deep insight into the molecular stress response of corals. Here, we have performed comparative genomic hybridizations (CGH with different coral species to an Acropora palmata microarray platform containing 13,546 cDNA clones in order to identify potentially rapidly evolving genes and to determine the suitability of existing microarray platforms for use in gene expression studies (via heterologous hybridization. Results Our results showed that the current microarray platform for A. palmata is able to provide biological relevant information for a wide variety of coral species covering both the complex clade as well the robust clade. Analysis of the fraction of highly diverged genes showed a significantly higher amount of genes without annotation corroborating previous findings that point towards a higher rate of divergence for taxonomically restricted genes. Among the genes with annotation, we found many mitochondrial genes to be highly diverged in M. faveolata when compared to A. palmata, while the majority of nuclear encoded genes maintained an average divergence rate. Conclusions The use of present microarray platforms for transcriptional analyses in different coral species will greatly enhance the understanding of the molecular basis of stress and health and highlight evolutionary differences between scleractinian coral species. On a genomic basis, we show that cDNA arrays can be used to identify patterns of divergence. Mitochondrion-encoded genes seem to have diverged faster than

  4. Evolution of electron transfer out of the cell: comparative genomics of six Geobacter genomes

    Directory of Open Access Journals (Sweden)

    Young Nelson D

    2010-01-01

    Full Text Available Abstract Background Geobacter species grow by transferring electrons out of the cell - either to Fe(III-oxides or to man-made substances like energy-harvesting electrodes. Study of Geobacter sulfurreducens has shown that TCA cycle enzymes, inner-membrane respiratory enzymes, and periplasmic and outer-membrane cytochromes are required. Here we present comparative analysis of six Geobacter genomes, including species from the clade that predominates in the subsurface. Conservation of proteins across the genomes was determined to better understand the evolution of Geobacter species and to create a metabolic model applicable to subsurface environments. Results The results showed that enzymes for acetate transport and oxidation, and for proton transport across the inner membrane were well conserved. An NADH dehydrogenase, the ATP synthase, and several TCA cycle enzymes were among the best conserved in the genomes. However, most of the cytochromes required for Fe(III-reduction were not, including many of the outer-membrane cytochromes. While conservation of cytochromes was poor, an abundance and diversity of cytochromes were found in every genome, with duplications apparent in several species. Conclusions These results indicate there is a common pathway for acetate oxidation and energy generation across the family and in the last common ancestor. They also suggest that while cytochromes are important for extracellular electron transport, the path of electrons across the periplasm and outer membrane is variable. This combination of abundant cytochromes with weak sequence conservation suggests they may not be specific terminal reductases, but rather may be important in their heme-bearing capacity, as sinks for electrons between the inner-membrane electron transport chain and the extracellular acceptor.

  5. MethLAB: A graphical user interface package for the analysis of array-based DNA methylation data

    OpenAIRE

    Kilaru, Varun; Barfield, Richard T.; Schroeder, James W.; Smith, Alicia K.; Conneely, Karen N.

    2012-01-01

    Recent evidence suggests that DNA methylation changes may underlie numerous complex traits and diseases. The advent of commercial, array-based methods to interrogate DNA methylation has led to a profusion of epigenetic studies in the literature. Array-based methods, such as the popular Illumina GoldenGate and Infinium platforms, estimate the proportion of DNA methylated at single-base resolution for thousands of CpG sites across the genome. These arrays generate enormous amounts of data, but ...

  6. Gene discovery in trypanosoma vivax through GSS and comparative genomics

    International Nuclear Information System (INIS)

    Full text: Trypanosoma vivax is a hemoparasite affecting livestock industry in South America and Africa. According to Seidl et al more than 11 million cattle evaluated in more than 3 billion dollars are found in the Pantanal region of Brazil and other lowlands in Bolivia. According to the same authors, if the outbreak reported in Pocone-MT (Center-East of Brazil) had gone untreated, the estimated losses would have exceeded US$140,000 on the seven ranches, $200 million in the Pantanal and $700 million regionwide. Despite the high economic relevance of the disease caused by T. vivax, few researches on its molecular characterization has been made as compared with human trypanosomes as T. brucei spp and T. cruzi. The main reason is the difficulty to grow the parasite into laboratory rodents and 'in vitro'. Very few (West African) strains have been adapted to laboratory rodents. Furthermore, most field isolates cannot be characterized by tools as RAPD, since parasitemias are usually very low making difficult the separation of parasites from animal blood for posterior extraction of parasite DNA. These characteristics have limited the research on T. vivax during the last decades, consequently very few markers have been described for its molecular characterization. A search in Genbank showed that there are only 22 entries for T. vivax confronted with nearly 98289, 38577, 23507 available for T. brucei, T. cruzi and Leishmania, respectively. T. vivax (molecular) biology is also little understood, even considering major differences as mechanical transmission in South America and both cyclical and mechanical transmission in Africa. In a consultation with several experts on genomics, it was emphasized that T. vivax and T. congolense are underepresented species in the molecular parasitology and genomics age, then they should be considered to have their genome sequenced. In order to discovery new markers to be explored in the molecular characterization of T. vivax, we decided to

  7. Automated Comparative Auditing of NCIT Genomic Roles Using NCBI

    Science.gov (United States)

    Cohen, Barry; Oren, Marc; Min, Hua; Perl, Yehoshua; Halper, Michael

    2008-01-01

    Biomedical research has identified many human genes and various knowledge about them. The National Cancer Institute Thesaurus (NCIT) represents such knowledge as concepts and roles (relationships). Due to the rapid advances in this field, it is to be expected that the NCIT’s Gene hierarchy will contain role errors. A comparative methodology to audit the Gene hierarchy with the use of the National Center for Biotechnology Information’s (NCBI’s) Entrez Gene database is presented. The two knowledge sources are accessed via a pair of Web crawlers to ensure up-to-date data. Our algorithms then compare the knowledge gathered from each, identify discrepancies that represent probable errors, and suggest corrective actions. The primary focus is on two kinds of gene-roles: (1) the chromosomal locations of genes, and (2) the biological processes in which genes plays a role. Regarding chromosomal locations, the discrepancies revealed are striking and systematic, suggesting a structurally common origin. In regard to the biological processes, difficulties arise because genes frequently play roles in multiple processes, and processes may have many designations (such as synonymous terms). Our algorithms make use of the roles defined in the NCIT Biological Process hierarchy to uncover many probable gene-role errors in the NCIT. These results show that automated comparative auditing is a promising technique that can identify a large number of probable errors and corrections for them in a terminological genomic knowledge repository, thus facilitating its overall maintenance. PMID:18486558

  8. Comparative Genomic Analyses of the Human NPHP1 Locus Reveal Complex Genomic Architecture and Its Regional Evolution in Primates

    Science.gov (United States)

    Yuan, Bo; Liu, Pengfei; Gupta, Aditya; Beck, Christine R.; Tejomurtula, Anusha; Campbell, Ian M.; Gambin, Tomasz; Simmons, Alexandra D.; Withers, Marjorie A.; Harris, R. Alan; Rogers, Jeffrey; Schwartz, David C.; Lupski, James R.

    2015-01-01

    Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100) is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs) are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases—about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR) between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV) haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual’s susceptibility to acquiring disease-associated alleles. PMID:26641089

  9. Comparative Genomic Analyses of the Human NPHP1 Locus Reveal Complex Genomic Architecture and Its Regional Evolution in Primates.

    Directory of Open Access Journals (Sweden)

    Bo Yuan

    2015-12-01

    Full Text Available Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100 is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases-about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual's susceptibility to acquiring disease-associated alleles.

  10. Comparative genomics reveals evidence of marine adaptation in Salinispora species

    Directory of Open Access Journals (Sweden)

    Penn Kevin

    2012-03-01

    Full Text Available Abstract Background Actinobacteria represent a consistent component of most marine bacterial communities yet little is known about the mechanisms by which these Gram-positive bacteria adapt to life in the marine environment. Here we employed a phylogenomic approach to identify marine adaptation genes in marine Actinobacteria. The focus was on the obligate marine actinomycete genus Salinispora and the identification of marine adaptation genes that have been acquired from other marine bacteria. Results Functional annotation, comparative genomics, and evidence of a shared evolutionary history with bacteria from hyperosmotic environments were used to identify a pool of more than 50 marine adaptation genes. An Actinobacterial species tree was used to infer the likelihood of gene gain or loss in accounting for the distribution of each gene. Acquired marine adaptation genes were associated with electron transport, sodium and ABC transporters, and channels and pores. In addition, the loss of a mechanosensitive channel gene appears to have played a major role in the inability of Salinispora strains to grow following transfer to low osmotic strength media. Conclusions The marine Actinobacteria for which genome sequences are available are broadly distributed throughout the Actinobacterial phylogenetic tree and closely related to non-marine forms suggesting they have been independently introduced relatively recently into the marine environment. It appears that the acquisition of transporters in Salinispora spp. represents a major marine adaptation while gene loss is proposed to play a role in the inability of this genus to survive outside of the marine environment. This study reveals fundamental differences between marine adaptations in Gram-positive and Gram-negative bacteria and no common genetic basis for marine adaptation among the Actinobacteria analyzed.

  11. The Aspergillus Genome Database, a curated comparative genomics resource for gene, protein and sequence information for the Aspergillus research community

    OpenAIRE

    Arnaud, Martha B.; Chibucos, Marcus C; Costanzo, Maria C.; Crabtree, Jonathan; Inglis, Diane O.; Lotia, Adil; Orvis, Joshua; Shah, Prachi; Skrzypek, Marek S.; Binkley, Gail; Miyasato, Stuart R.; Wortman, Jennifer R.; Sherlock, Gavin

    2009-01-01

    The Aspergillus Genome Database (AspGD) is an online genomics resource for researchers studying the genetics and molecular biology of the Aspergilli. AspGD combines high-quality manual curation of the experimental scientific literature examining the genetics and molecular biology of Aspergilli, cutting-edge comparative genomics approaches to iteratively refine and improve structural gene annotations across multiple Aspergillus species, and web-based research tools for accessing and exploring ...

  12. Large-Scale Comparative Genomics Meta-Analysis of Campylobacter jejuni Isolates Reveals Low Level of Genome Plasticity

    OpenAIRE

    Taboada, Eduardo N.; Acedillo, Rey R; Carrillo, Catherine D.; Findlay, Wendy A.; Medeiros, Diane T.; Mykytczuk, Oksana L; Roberts, Michael J.; Valencia, C. Alexander; Farber, Jeffrey M.; Nash, John H E

    2004-01-01

    We have used comparative genomic hybridization (CGH) on a full-genome Campylobacter jejuni microarray to examine genome-wide gene conservation patterns among 51 strains isolated from food and clinical sources. These data have been integrated with data from three previous C. jejuni CGH studies to perform a meta-analysis that included 97 strains from the four separate data sets. Although many genes were found to be divergent across multiple strains (n = 350), many genes (n = 249) were uniquely ...

  13. Comparison of genomic abnormalities between BRCAX and sporadic breast cancers studied by comparative genomic hybridization.

    Science.gov (United States)

    Gronwald, Jacek; Jauch, Anna; Cybulski, Cezary; Schoell, Brigitte; Böhm-Steuer, Barbara; Lener, Marcin; Grabowska, Ewa; Górski, Bohdan; Jakubowska, Anna; Domagała, Wenancjusz; Chosia, Maria; Scott, Rodney J; Lubiński, Jan

    2005-03-20

    Very little is known about the chromosomal regions harbouring genes involved in initiation and progression of BRCAX-associated breast cancers. We applied comparative genomic hybridization (CGH) to identify the most frequent genomic imbalances in 18 BRCAX hereditary breast cancers and compared them to chromosomal aberrations detected in a group of 27 sporadic breast cancers. The aberrations observed most frequently in BRCAX tumours were gains of 8q (83%), 19q (67%), 19p (61%), 20q (61%), 1q (56%), 17q (56%) and losses of 8p (56%), 11q (44%) and 13q (33%). The sporadic cases most frequently showed gains of 1q (67%), 8q (48%), 17q (37%), 16p (33%), 19q (33%) and losses of 11q (26%), 8p (22%) and 16q (19%). Losses of 8p and gains 8q, 19 as well as gains of 20q (with respect to ductal tumours only) were detected significantly more often in BRCAX than in sporadic breast cancers. Analysis of 8p-losses and 8q-gains showed that these aberrations are early events in the tumorigenesis of BRCAX tumors. The findings of this report indicate similarities between BRCAX and BRCA2 tumours, possibly suggesting a common pathway of disease. These findings need confirmation by more extensive studies because only a limited number of cases were analysed and there are relatively few reports published. PMID:15540206

  14. The evolution of the ligand/receptor couple: a long road from comparative endocrinology to comparative genomics

    OpenAIRE

    Markov, Gabriel V.; Paris, Mathilde; Bertrand, Stephanie; Laudet, Vincent

    2008-01-01

    The evolution of the ligand/receptor couple: a long road from comparative endocrinology to comparative genomics FRANCE (Markov, Gabriel V.) FRANCE Received: 2008-02-11 Revised: 2008-05-14 Accepted: 2008-06-11

  15. Comparative genome research between maize and rice using genomic in situ hybridization

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Using the genomic DNAs of maize and rice as probes respectively,the homology of maize and rice genomes was assessed by genomic in situ hybridization. When rice genomic DNAs were hybridized to maize, all chromosomes displayed many multiple discrete regions, while each rice chromosome delineated a single consecutive chromosomal region after they were hybridized with maize genomic DNAs. The results indicate that the genomes of maize and rice share high homology, and confirm the proposal that maize and rice are diverged from a common ancestor.

  16. Genome stability of Lyme disease spirochetes: comparative genomics of Borrelia burgdorferi plasmids.

    Directory of Open Access Journals (Sweden)

    Sherwood R Casjens

    Full Text Available Lyme disease is the most common tick-borne human illness in North America. In order to understand the molecular pathogenesis, natural diversity, population structure and epizootic spread of the North American Lyme agent, Borrelia burgdorferi sensu stricto, a much better understanding of the natural diversity of its genome will be required. Towards this end we present a comparative analysis of the nucleotide sequences of the numerous plasmids of B. burgdorferi isolates B31, N40, JD1 and 297. These strains were chosen because they include the three most commonly studied laboratory strains, and because they represent different major genetic lineages and so are informative regarding the genetic diversity and evolution of this organism. A unique feature of Borrelia genomes is that they carry a large number of linear and circular plasmids, and this work shows that strains N40, JD1, 297 and B31 carry related but non-identical sets of 16, 20, 19 and 21 plasmids, respectively, that comprise 33-40% of their genomes. We deduce that there are at least 28 plasmid compatibility types among the four strains. The B. burgdorferi ∼900 Kbp linear chromosomes are evolutionarily exceptionally stable, except for a short ≤20 Kbp plasmid-like section at the right end. A few of the plasmids, including the linear lp54 and circular cp26, are also very stable. We show here that the other plasmids, especially the linear ones, are considerably more variable. Nearly all of the linear plasmids have undergone one or more substantial inter-plasmid rearrangements since their last common ancestor. In spite of these rearrangements and differences in plasmid contents, the overall gene complement of the different isolates has remained relatively constant.

  17. Genome Stability of Lyme Disease Spirochetes: Comparative Genomics of Borrelia burgdorferi Plasmids

    Energy Technology Data Exchange (ETDEWEB)

    Casjens S. R.; Dunn J.; Mongodin, E. F.; Qiu, W.-G.; Luft, B. J.; Schutzer, S. E.; Gilcrease, E. B.; Huang, W. M.; Vujadinovic, M.; Aron, J. K.; Vargas, L. C.; Freeman, S.; Radune, D.; Weidman, J. F.; Dimitrov, G. I.; Khouri, H. M.; Sosa, J. E.; Halpin, R. A.; Fraser, C. M.

    2012-03-14

    Lyme disease is the most common tick-borne human illness in North America. In order to understand the molecular pathogenesis, natural diversity, population structure and epizootic spread of the North American Lyme agent, Borrelia burgdorferi sensu stricto, a much better understanding of the natural diversity of its genome will be required. Towards this end we present a comparative analysis of the nucleotide sequences of the numerous plasmids of B. burgdorferi isolates B31, N40, JD1 and 297. These strains were chosen because they include the three most commonly studied laboratory strains, and because they represent different major genetic lineages and so are informative regarding the genetic diversity and evolution of this organism. A unique feature of Borrelia genomes is that they carry a large number of linear and circular plasmids, and this work shows that strains N40, JD1, 297 and B31 carry related but non-identical sets of 16, 20, 19 and 21 plasmids, respectively, that comprise 33-40% of their genomes. We deduce that there are at least 28 plasmid compatibility types among the four strains. The B. burgdorferi {approx}900 Kbp linear chromosomes are evolutionarily exceptionally stable, except for a short {le}20 Kbp plasmid-like section at the right end. A few of the plasmids, including the linear lp54 and circular cp26, are also very stable. We show here that the other plasmids, especially the linear ones, are considerably more variable. Nearly all of the linear plasmids have undergone one or more substantial inter-plasmid rearrangements since their last common ancestor. In spite of these rearrangements and differences in plasmid contents, the overall gene complement of the different isolates has remained relatively constant.

  18. Microbial comparative pan-genomics using binomial mixture models

    DEFF Research Database (Denmark)

    Ussery, David; Snipen, L; Almøy, T

    2009-01-01

    The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter...

  19. Comparative genomics of the relationship between gene structure and expression

    NARCIS (Netherlands)

    Ren, X.

    2006-01-01

    The relationship between the structure of genes and their expression is a relatively new aspect of genome organization and regulation. With more genome sequences and expression data becoming available, bioinformatics approaches can help the further elucidation of the relationships between gene struc

  20. Comparative Genomic Analysis of Meningitis- and Bacteremia-Causing Pneumococci Identifies a Common Core Genome.

    Science.gov (United States)

    Kulohoma, Benard W; Cornick, Jennifer E; Chaguza, Chrispin; Yalcin, Feyruz; Harris, Simon R; Gray, Katherine J; Kiran, Anmol M; Molyneux, Elizabeth; French, Neil; Parkhill, Julian; Faragher, Brian E; Everett, Dean B; Bentley, Stephen D; Heyderman, Robert S

    2015-10-01

    Streptococcus pneumoniae is a nasopharyngeal commensal that occasionally invades normally sterile sites to cause bloodstream infection and meningitis. Although the pneumococcal population structure and evolutionary genetics are well defined, it is not clear whether pneumococci that cause meningitis are genetically distinct from those that do not. Here, we used whole-genome sequencing of 140 isolates of S. pneumoniae recovered from bloodstream infection (n = 70) and meningitis (n = 70) to compare their genetic contents. By fitting a double-exponential decaying-function model, we show that these isolates share a core of 1,427 genes (95% confidence interval [CI], 1,425 to 1,435 genes) and that there is no difference in the core genome or accessory gene content from these disease manifestations. Gene presence/absence alone therefore does not explain the virulence behavior of pneumococci that reach the meninges. Our analysis, however, supports the requirement of a range of previously described virulence factors and vaccine candidates for both meningitis- and bacteremia-causing pneumococci. This high-resolution view suggests that, despite considerable competency for genetic exchange, all pneumococci are under considerable pressure to retain key components advantageous for colonization and transmission and that these components are essential for access to and survival in sterile sites. PMID:26259813

  1. Comparative genomics of drug resistance in Trypanosoma brucei rhodesiense.

    Science.gov (United States)

    Graf, Fabrice E; Ludin, Philipp; Arquint, Christian; Schmidt, Remo S; Schaub, Nadia; Kunz Renggli, Christina; Munday, Jane C; Krezdorn, Jessica; Baker, Nicola; Horn, David; Balmer, Oliver; Caccone, Adalgisa; de Koning, Harry P; Mäser, Pascal

    2016-09-01

    Trypanosoma brucei rhodesiense is one of the causative agents of human sleeping sickness, a fatal disease that is transmitted by tsetse flies and restricted to Sub-Saharan Africa. Here we investigate two independent lines of T. b. rhodesiense that have been selected with the drugs melarsoprol and pentamidine over the course of 2 years, until they exhibited stable cross-resistance to an unprecedented degree. We apply comparative genomics and transcriptomics to identify the underlying mutations. Only few mutations have become fixed during selection. Three genes were affected by mutations in both lines: the aminopurine transporter AT1, the aquaporin AQP2, and the RNA-binding protein UBP1. The melarsoprol-selected line carried a large deletion including the adenosine transporter gene AT1, whereas the pentamidine-selected line carried a heterozygous point mutation in AT1, G430R, which rendered the transporter non-functional. Both resistant lines had lost AQP2, and both lines carried the same point mutation, R131L, in the RNA-binding motif of UBP1. The finding that concomitant deletion of the known resistance genes AT1 and AQP2 in T. b. brucei failed to phenocopy the high levels of resistance of the T. b. rhodesiense mutants indicated a possible role of UBP1 in melarsoprol-pentamidine cross-resistance. However, homozygous in situ expression of UBP1-Leu(131) in T. b. brucei did not affect the sensitivity to melarsoprol or pentamidine. PMID:26973180

  2. Sequence and comparative genomic analysis of actin-related proteins.

    Science.gov (United States)

    Muller, Jean; Oma, Yukako; Vallar, Laurent; Friederich, Evelyne; Poch, Olivier; Winsor, Barbara

    2005-12-01

    Actin-related proteins (ARPs) are key players in cytoskeleton activities and nuclear functions. Two complexes, ARP2/3 and ARP1/11, also known as dynactin, are implicated in actin dynamics and in microtubule-based trafficking, respectively. ARP4 to ARP9 are components of many chromatin-modulating complexes. Conventional actins and ARPs codefine a large family of homologous proteins, the actin superfamily, with a tertiary structure known as the actin fold. Because ARPs and actin share high sequence conservation, clear family definition requires distinct features to easily and systematically identify each subfamily. In this study we performed an in depth sequence and comparative genomic analysis of ARP subfamilies. A high-quality multiple alignment of approximately 700 complete protein sequences homologous to actin, including 148 ARP sequences, allowed us to extend the ARP classification to new organisms. Sequence alignments revealed conserved residues, motifs, and inserted sequence signatures to define each ARP subfamily. These discriminative characteristics allowed us to develop ARPAnno (http://bips.u-strasbg.fr/ARPAnno), a new web server dedicated to the annotation of ARP sequences. Analyses of sequence conservation among actins and ARPs highlight part of the actin fold and suggest interactions between ARPs and actin-binding proteins. Finally, analysis of ARP distribution across eukaryotic phyla emphasizes the central importance of nuclear ARPs, particularly the multifunctional ARP4. PMID:16195354

  3. Comparative analysis of genome maintenance genes in naked mole rat, mouse, and human

    NARCIS (Netherlands)

    S.L. Macrae (Sheila L.); Q. Zhang (Quanwei); C. Lemetre (Christophe); I. Seim (Inge); R.B. Calder (Robert B.); J.H.J. Hoeijmakers (Jan); Y. Suh (Yousin); V.N. Gladyshev (Vadim N.); A. Seluanov (Andrei); V. Gorbunova (Vera); J. Vijg (Jan); Z.D. Zhang (Zhengdong D.)

    2015-01-01

    textabstractGenome maintenance (GM) is an essential defense system against aging and cancer, as both are characterized by increased genome instability. Here, we compared the copy number variation and mutation rate of 518 GM-associated genes in the naked mole rat (NMR), mouse, and human genomes. GM g

  4. Genomic-associated Markers and comparative Genome Maps of Xanthomonas oryzae pv. oryzae and X. oryzae pv. oryzicola.

    Science.gov (United States)

    Feng, Wenjie; Wang, Yi; Huang, Lisha; Feng, Chuanshun; Chu, Zhaohui; Ding, Xinhua; Yang, Long

    2015-09-01

    Xanthomonas oryzae pv. oryzae (Xoo) and X. oryzae pv. oryzicola (Xoc) cause two major seed quarantine diseases in rice, bacterial blight and bacterial leaf streak, respectively. Xoo and Xoc share high similarity in genomic sequence, which results in hard differentiation of the two pathogens. Genomic-associated Markers and comparative Genome Maps database (GMGM) is an integrated database providing comprehensive information including compared genome maps and full genomic-coverage molecular makers of Xoo and Xoc. This database was established based on bioinformatic analysis of complete sequenced genomes of several X. oryzae pathovars of which the similarity of the genomes was up to 91.39 %. The program was designed with a series of specific PCR primers, including 286 pairs of Xoo dominant markers, 288 pairs of Xoc dominant markers, and 288 pairs of Xoo and Xoc co-dominant markers, which were predicted to distinguish two pathovars. Test on a total of 40 donor pathogen strains using randomly selected 120 pairs of primers demonstrated that over 52.5 % of the primers were efficacious. The GMGM web portal ( http://biodb.sdau.edu.cn/gmgm/ ) will be a powerful tool that can present highly specific diagnostic markers, and it also provides information about comparative genome maps of the two pathogens for future evolution study. PMID:26093644

  5. In silico comparative genomic analysis of GABAA receptor transcriptional regulation

    Directory of Open Access Journals (Sweden)

    Joyce Christopher J

    2007-06-01

    Full Text Available Abstract Background Subtypes of the GABAA receptor subunit exhibit diverse temporal and spatial expression patterns. In silico comparative analysis was used to predict transcriptional regulatory features in individual mammalian GABAA receptor subunit genes, and to identify potential transcriptional regulatory components involved in the coordinate regulation of the GABAA receptor gene clusters. Results Previously unreported putative promoters were identified for the β2, γ1, γ3, ε, θ and π subunit genes. Putative core elements and proximal transcriptional factors were identified within these predicted promoters, and within the experimentally determined promoters of other subunit genes. Conserved intergenic regions of sequence in the mammalian GABAA receptor gene cluster comprising the α1, β2, γ2 and α6 subunits were identified as potential long range transcriptional regulatory components involved in the coordinate regulation of these genes. A region of predicted DNase I hypersensitive sites within the cluster may contain transcriptional regulatory features coordinating gene expression. A novel model is proposed for the coordinate control of the gene cluster and parallel expression of the α1 and β2 subunits, based upon the selective action of putative Scaffold/Matrix Attachment Regions (S/MARs. Conclusion The putative regulatory features identified by genomic analysis of GABAA receptor genes were substantiated by cross-species comparative analysis and now require experimental verification. The proposed model for the coordinate regulation of genes in the cluster accounts for the head-to-head orientation and parallel expression of the α1 and β2 subunit genes, and for the disruption of transcription caused by insertion of a neomycin gene in the close vicinity of the α6 gene, which is proximal to a putative critical S/MAR.

  6. Functional and Comparative Genomics of Lignocellulose Degradation by Schizophyllum commune

    Energy Technology Data Exchange (ETDEWEB)

    Ohm, Robin A.; Lee, Hanbyul; Park, Hongjae; Brewer, Heather M.; Carver, Akiko; Copeland, Alex; Grimwood, Jane; Lindquist, Erika; Lipzen, Anna; Martin, Joel; Purvine, Samuel O.; Schackwitz, Wendy; Tegelaar, Martin; Tritt, Andrew; Baker, Scott; Choi, In-Geol; Lugones, Luis G.; Wosten, Han A. B.; Grigoriev, Igor V.

    2014-03-14

    The Basidiomycete fungus Schizophyllum commune is a wood-decaying fungus and is used as a model system to study lignocellulose degradation. Version 3.0 of the genome assembly filled 269 of 316 sequence gaps and added 680 kb of sequence. This new assembly was reannotated using RNAseq transcriptomics data, and this resulted in 3110 (24percent) more genes. Two additional S. commune strains with different wood-decaying properties were sequenced, from Tattone (France) and Loenen (The Netherlands). Sequence comparison shows remarkably high sequence diversity between the strains. The overall SNP rate of > 100 SNPs/kb is among the highest rates of within-species polymorphisms in Basidiomycetes. Some well-described proteins like hydrophobins and transcription factors have less than 70percent sequence identity among the strains. Some chromosomes are better conserved than others and in some cases large parts of chromosomes are missing from one or more strains. Gene expression on glucose, cellulose and wood was analyzed in two S. commune strains. Overall, gene expression correlated between the two strains, but there were some notable exceptions. Of particular interest are CAZymes (carbohydrate-active enzymes) that are regulated in different ways in the different strains. In both strains the transcription factor Fsp1 was strongly up-regulated during growth on cellulose and wood, when compared to glucose. Over-expression of Fsp1 using a constitutive promoter resulted in higher cellulose and xylose-degrading enzyme activity, which suggests that Fsp1 is involved in regulating CAZyme gene expression. Two CAZyme genes (of family GH61 and GH11) were shown to be strongly up-regulated during growth on cellulose, compared to glucose. Proteomics on the secreted proteins in the growth medium confirmed this. A promoter analysis revealed the shortest active promoters for these two genes, as well as putative transcription factor binding sites.

  7. Comparative Genome Analysis of Lolium-Festuca Complex Species

    DEFF Research Database (Denmark)

    Czaban, Adrian; Byrne, Stephen; Sharma, Sapna;

    2015-01-01

    , winter hardiness, drought tolerance and resistance to grazing. In this study we have sequenced and assembled the low copy fraction of the genomes of Lolium westerwoldicum, Lolium multiflorum, Festuca pratensis and Lolium temulentum. We have also generated de-novo transcriptome assemblies for each species......, and these have aided in the annotation of the genomic sequence. Using this data we were able to generate annotated assemblies of the gene rich regions of the four species to complement the already sequenced Lolium perenne genome. Using these gene models we have identified orthologous genes between the species...

  8. DeltaProt: a software toolbox for comparative genomics

    Directory of Open Access Journals (Sweden)

    Willassen Nils P

    2010-11-01

    Full Text Available Abstract Background Statistical bioinformatics is the study of biological data sets obtained by new micro-technologies by means of proper statistical methods. For a better understanding of environmental adaptations of proteins, orthologous sequences from different habitats may be explored and compared. The main goal of the DeltaProt Toolbox is to provide users with important functionality that is needed for comparative screening and studies of extremophile proteins and protein classes. Visualization of the data sets is also the focus of this article, since visualizations can play a key role in making the various relationships transparent. This application paper is intended to inform the reader of the existence, functionality, and applicability of the toolbox. Results We present the DeltaProt Toolbox, a software toolbox that may be useful in importing, analyzing and visualizing data from multiple alignments of proteins. The toolbox has been written in MATLAB™ to provide an easy and user-friendly platform, including a graphical user interface, while ensuring good numerical performance. Problems in genome biology may be easily stated thanks to a compact input format. The toolbox also offers the possibility of utilizing structural information from the SABLE or other structure predictors. Different sequence plots can then be viewed and compared in order to find their similarities and differences. Detailed statistics are also calculated during the procedure. Conclusions The DeltaProt package is open source and freely available for academic, non-commercial use. The latest version of DeltaProt can be obtained from http://services.cbu.uib.no/software/deltaprot/. The website also contains documentation, and the toolbox comes with real data sets that are intended for training in applying the models to carry out bioinformatical and statistical analyses of protein sequences. Equipped with the new algorithms proposed here, DeltaProt serves as an auxiliary

  9. Comparative genomic analysis of Lactobacillus rhamnosus GG reveals pili containing a human- mucus binding protein

    OpenAIRE

    Kankainen, M; Paulin, L.; Tynkkynen, S.; Ossowski, von, I.; Reunanen, J.; Partanen, P.; Satokari, A.; Vesterlund, S.; Hendrickx, A.P.; Lebeer, S.; Keersmaecker, de, S.C.; Vanderleyden, J.; Hämäläinen, T. (Tiina); Laukkanen, S.; Salovuori, N.

    2009-01-01

    To unravel the biological function of the widely used probiotic bacterium Lactobacillus rhamnosus GG, we compared its 3.0-Mbp genome sequence with the similarly sized genome of L. rhamnosus LC705, an adjunct starter culture exhibiting reduced binding to mucus. Both genomes demonstrated high sequence identity and synteny. However, for both strains, genomic islands, 5 in GG and 4 in LC705, punctuated the colinearity. A significant number of strain-specific genes were predicted in these islands ...

  10. Genome sequence and comparative analysis of Avibacterium paragallinarum

    OpenAIRE

    Requena, David; Chumbe, Ana; Torres, Michael; Alzamora, Ofelia; Ramirez, Manuel; Valdivia-Olarte, Hugo; Gutierrez, Andres Hazaet; Izquierdo-Lara, Ray; Saravia, Luis Enrique; Zavaleta, Milagros; Tataje-Lavanda, Luis; Best, Ivan; Fernández-Sánchez, Manolo; Icochea, Eliana; Zimic, Mirko

    2013-01-01

    Background: Avibacterium paragallinarum, the causative agent of infectious coryza, is a highly contagious respiratory acute disease of poultry, which affects commercial chickens, laying hens and broilers worldwide. Methodology: In this study, we performed the whole genome sequencing, assembly and annotation of a Peruvian isolate of A. paragallinarum. Genome was sequenced in a 454 GS FLX Titanium system. De novo assembly was performed and annotation was completed with GS De Novo Assembler 2.6 ...

  11. Comparative Genomics of Symbiotic Bacteria in Earthworm Nephridia

    DEFF Research Database (Denmark)

    Kjeldsen, Kasper Urup; Pinel, Nicolas; Lund, Marie Braad;

    The excretory and osmoregulatory organs (nephridia) of lumbricid earthworms are densely colonized by extracellular bacterial symbionts belonging to the newly established betaproteobacterial genus Verminephrobacter. The nephridial symbiont of the earthworm Eisenia fetida was subjected to full genome...... sequencing along with two of its closest relatives; the plant pathogenic Acidovorax avena subsp. citrulli and the free-living Acidovorax sp. JS42. In addition, the genome of the nephridial symbiont of the earthworm Aporrectodea tuberculata was partially sequenced. In order to resolve the functional...

  12. Metagenome Skimming of Insect Specimen Pools: Potential for Comparative Genomics.

    Science.gov (United States)

    Linard, Benjamin; Crampton-Platt, Alex; Gillett, Conrad P D T; Timmermans, Martijn J T N; Vogler, Alfried P

    2015-06-01

    Metagenomic analyses are challenging in metazoans, but high-copy number and repeat regions can be assembled from low-coverage sequencing by "genome skimming," which is applied here as a new way of characterizing metagenomes obtained in an ecological or taxonomic context. Illumina shotgun sequencing on two pools of Coleoptera (beetles) of approximately 200 species each were assembled into tens of thousands of scaffolds. Repeated low-coverage sequencing recovered similar scaffold sets consistently, although approximately 70% of scaffolds could not be identified against existing genome databases. Identifiable scaffolds included mitochondrial DNA, conserved sequences with hits to expressed sequence tag and protein databases, and known repeat elements of high and low complexity, including numerous copies of rRNA and histone genes. Assemblies of histones captured a diversity of gene order and primary sequence in Coleoptera. Scaffolds with similarity to multiple sites in available coleopteran genome sequences for Dendroctonus and Tribolium revealed high specificity of scaffolds to either of these genomes, in particular for high-copy number repeats. Numerous "clusters" of scaffolds mapped to the same genomic site revealed intra- and/or intergenomic variation within a metagenome pool. In addition to effect of taxonomic composition of the metagenomes, the number of mapped scaffolds also revealed structural differences between the two reference genomes, although the significance of this striking finding remains unclear. Finally, apparently exogenous sequences were recovered, including potential food plants, fungal pathogens, and bacterial symbionts. The "metagenome skimming" approach is useful for capturing the genomic diversity of poorly studied, species-rich lineages and opens new prospects in environmental genomics. PMID:25979752

  13. Pan-vertebrate comparative genomics unmasks retrovirus macroevolution

    OpenAIRE

    Hayward, Alexander; Cornwallis, Charlie K.; Jern, Patric

    2014-01-01

    For millions of years retroviruses, such as HIV in humans, have attacked vertebrates. Occasionally retroviruses infiltrate germ cells, incorporate themselves into the host’s genome, and transmit vertically to the host’s offspring as endogenous retroviruses (ERVs). Consequently, ERVs make up large portions of vertebrate genomes and represent a record of past host–retrovirus interactions. We developed pan-vertebrate ERV analyses to provide an overview of host–retrovirus interactions, generating...

  14. Genomic Comparative Study of Bovine Mastitis Escherichia coli

    OpenAIRE

    Kempf, Florent; Slugocki, Cindy; Blum, Shlomo E.; Leitner, Gabriel; Germon, Pierre

    2016-01-01

    Escherichia coli, one of the main causative agents of bovine mastitis, is responsible for significant losses on dairy farms. In order to better understand the pathogenicity of E. coli mastitis, an accurate characterization of E. coli strains isolated from mastitis cases is required. By using phylogenetic analyses and whole genome comparison of 5 currently available mastitis E. coli genome sequences, we searched for genotypic traits specific for mastitis isolates. Our data confirm that there i...

  15. Kiwifruit Information Resource (KIR): a comparative platform for kiwifruit genomics.

    Science.gov (United States)

    Yue, Junyang; Liu, Jian; Ban, Rongjun; Tang, Wei; Deng, Lin; Fei, Zhangjun; Liu, Yongsheng

    2015-10-01

    The Kiwifruit Information Resource (KIR) is dedicated to maintain and integrate comprehensive datasets on genomics, functional genomics and transcriptomics of kiwifruit (Actinidiaceae). KIR serves as a central access point for existing/new genomic and genetic data. KIR also provides researchers with a variety of visualization and analysis tools. Current developments include the updated genome structure of Actinidia chinensis cv. Hongyang and its newest genome annotation, putative transcripts, gene expression, physical markers of genetic traits as well as relevant publications based on the latest genome assembly. Nine thousand five hundred and forty-seven new transcripts are detected and 21 132 old transcripts are changed. At the present release, the next-generation transcriptome sequencing data has been incorporated into gene models and splice variants. Protein-protein interactions are also identified based on experimentally determined orthologous interactions. Furthermore, the experimental results reported in peer-reviewed literature are manually extracted and integrated within a well-developed query page. In total, 122 identifications are currently associated, including commonly used gene names and symbols. All KIR datasets are helpful to facilitate a broad range of kiwifruit research topics and freely available to the research community. Database URL: http://bdg.hfut.edu.cn/kir/index.html. PMID:26656885

  16. Microbial comparative pan-genomics using binomial mixture models

    Directory of Open Access Journals (Sweden)

    Ussery David W

    2009-08-01

    Full Text Available Abstract Background The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter approach by using statistical ideas developed for capture-recapture problems in ecology and epidemiology. Results We estimate core- and pan-genome sizes for 16 different bacterial species. The results reveal a complex dependency structure for most species, manifested as heterogeneous detection probabilities. Estimated pan-genome sizes range from small (around 2600 gene families in Buchnera aphidicola to large (around 43000 gene families in Escherichia coli. Results for Echerichia coli show that as more data become available, a larger diversity is estimated, indicating an extensive pool of rarely occurring genes in the population. Conclusion Analyzing pan-genomics data with binomial mixture models is a way to handle dependencies between genomes, which we find is always present. A bottleneck in the estimation procedure is the annotation of rarely occurring genes.

  17. Refined annotation and assembly of the Tetrahymena thermophila genome sequence through EST analysis, comparative genomic hybridization, and targeted gap closure

    Directory of Open Access Journals (Sweden)

    Lee Suzanne R

    2008-11-01

    Full Text Available Abstract Background Tetrahymena thermophila, a widely studied model for cellular and molecular biology, is a binucleated single-celled organism with a germline micronucleus (MIC and somatic macronucleus (MAC. The recent draft MAC genome assembly revealed low sequence repetitiveness, a result of the epigenetic removal of invasive DNA elements found only in the MIC genome. Such low repetitiveness makes complete closure of the MAC genome a feasible goal, which to achieve would require standard closure methods as well as removal of minor MIC contamination of the MAC genome assembly. Highly accurate preliminary annotation of Tetrahymena's coding potential was hindered by the lack of both comparative genomic sequence information from close relatives and significant amounts of cDNA evidence, thus limiting the value of the genomic information and also leaving unanswered certain questions, such as the frequency of alternative splicing. Results We addressed the problem of MIC contamination using comparative genomic hybridization with purified MIC and MAC DNA probes against a whole genome oligonucleotide microarray, allowing the identification of 763 genome scaffolds likely to contain MIC-limited DNA sequences. We also employed standard genome closure methods to essentially finish over 60% of the MAC genome. For the improvement of annotation, we have sequenced and analyzed over 60,000 verified EST reads from a variety of cellular growth and development conditions. Using this EST evidence, a combination of automated and manual reannotation efforts led to updates that affect 16% of the current protein-coding gene models. By comparing EST abundance, many genes showing apparent differential expression between these conditions were identified. Rare instances of alternative splicing and uses of the non-standard amino acid selenocysteine were also identified. Conclusion We report here significant progress in genome closure and reannotation of Tetrahymena

  18. Genome-wide comparative analysis of the Brassica rapa gene space reveals genome shrinkage and differential loss of duplicated genes after whole genome triplication

    OpenAIRE

    Mun, Jeong-Hwan; Kwon, Soo-Jin; Yang, Tae-Jin; Seol, Young-Joo; Jin, Mina; Kim, Jin-A; Lim, Myung-Ho; Kim, Jung Sun; Baek, Seunghoon; Choi, Beom-Soon; Yu, Hee-Ju; Kim, Dae-Soo; Kim, Namshin; Lim, Ki-Byung; Lee, Soo-In

    2009-01-01

    Background Brassica rapa is one of the most economically important vegetable crops worldwide. Owing to its agronomic importance and phylogenetic position, B. rapa provides a crucial reference to understand polyploidy-related crop genome evolution. The high degree of sequence identity and remarkably conserved genome structure between Arabidopsis and Brassica genomes enables comparative tiling sequencing using Arabidopsis sequences as references to select the counterpart regions in B. rapa, whi...

  19. e-Fungi: a data resource for comparative analysis of fungal genomes

    Directory of Open Access Journals (Sweden)

    Hubbard Simon J

    2007-11-01

    Full Text Available Abstract Background The number of sequenced fungal genomes is ever increasing, with about 200 genomes already fully sequenced or in progress. Only a small percentage of those genomes have been comprehensively studied, for example using techniques from functional genomics. Comparative analysis has proven to be a useful strategy for enhancing our understanding of evolutionary biology and of the less well understood genomes. However, the data required for these analyses tends to be distributed in various heterogeneous data sources, making systematic comparative studies a cumbersome task. Furthermore, comparative analyses benefit from close integration of derived data sets that cluster genes or organisms in a way that eases the expression of requests that clarify points of similarity or difference between species. Description To support systematic comparative analyses of fungal genomes we have developed the e-Fungi database, which integrates a variety of data for more than 30 fungal genomes. Publicly available genome data, functional annotations, and pathway information has been integrated into a single data repository and complemented with results of comparative analyses, such as MCL and OrthoMCL cluster analysis, and predictions of signaling proteins and the sub-cellular localisation of proteins. To access the data, a library of analysis tasks is available through a web interface. The analysis tasks are motivated by recent comparative genomics studies, and aim to support the study of evolutionary biology as well as community efforts for improving the annotation of genomes. Web services for each query are also available, enabling the tasks to be incorporated into workflows. Conclusion The e-Fungi database provides fungal biologists with a resource for comparative studies of a large range of fungal genomes. Its analysis library supports the comparative study of genome data, functional annotation, and results of large scale analyses over all the

  20. IMG 4 version of the integrated microbial genomes comparative analysis system

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Chen, I-Min A. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Palaniappan, Krishna [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Chu, Ken [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Szeto, Ernest [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Pillay, Manoj [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Ratner, Anna [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Huang, Jinghua [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division; Woyke, Tanja [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Huntemann, Marcel [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Anderson, Iain [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Billis, Konstantinos [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Varghese, Neha [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Mavromatis, Konstantinos [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Pati, Amrita [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Ivanova, Natalia N. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program; Kyrpides, Nikos C. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  1. Roundup 2.0: enabling comparative genomics for over 1800 genomes

    OpenAIRE

    Cui, Jike; St. Gabriel, Kristian Che; Jung, Jae-Yoon; Wall, Dennis Paul Paul; DeLuca, Todd F

    2012-01-01

    Summary: Roundup is an online database of gene orthologs for over 1800 genomes, including 226 Eukaryota, 1447 Bacteria, 113 Archaea and 21 Viruses. Orthologs are inferred using the Reciprocal Smallest Distance algorithm. Users may query Roundup for single-linkage clusters of orthologous genes based on any group of genomes. Annotated query results may be viewed in a variety of ways including as clusters of orthologs and as phylogenetic profiles. Genomic results may be downloaded in formats sui...

  2. Transcriptome-based exon capture enables highly cost-effective comparative genomic data collection at moderate evolutionary scales

    Directory of Open Access Journals (Sweden)

    Bi Ke

    2012-08-01

    Full Text Available Abstract Background To date, exon capture has largely been restricted to species with fully sequenced genomes, which has precluded its application to lineages that lack high quality genomic resources. We developed a novel strategy for designing array-based exon capture in chipmunks (Tamias based on de novo transcriptome assemblies. We evaluated the performance of our approach across specimens from four chipmunk species. Results We selectively targeted 11,975 exons (~4 Mb on custom capture arrays, and enriched over 99% of the targets in all libraries. The percentage of aligned reads was highly consistent (24.4-29.1% across all specimens, including in multiplexing up to 20 barcoded individuals on a single array. Base coverage among specimens and within targets in each species library was uniform, and the performance of targets among independent exon captures was highly reproducible. There was no decrease in coverage among chipmunk species, which showed up to 1.5% sequence divergence in coding regions. We did observe a decline in capture performance of a subset of targets designed from a much more divergent ground squirrel genome (30 My, however, over 90% of the targets were also recovered. Final assemblies yielded over ten thousand orthologous loci (~3.6 Mb with thousands of fixed and polymorphic SNPs among species identified. Conclusions Our study demonstrates the potential of a transcriptome-enabled, multiplexed, exon capture method to create thousands of informative markers for population genomic and phylogenetic studies in non-model species across the tree of life.

  3. Complete genome sequences and comparative genome analysis of Lactobacillus plantarum strain 5-2 isolated from fermented soybean.

    Science.gov (United States)

    Liu, Chen-Jian; Wang, Rui; Gong, Fu-Ming; Liu, Xiao-Feng; Zheng, Hua-Jun; Luo, Yi-Yong; Li, Xiao-Ran

    2015-12-01

    Lactobacillus plantarum is an important probiotic and is mostly isolated from fermented foods. We sequenced the genome of L. plantarum strain 5-2, which was derived from fermented soybean isolated from Yunnan province, China. The strain was determined to contain 3114 genes. Fourteen complete insertion sequence (IS) elements were found in 5-2 chromosome. There were 24 DNA replication proteins and 76 DNA repair proteins in the 5-2 genome. Consistent with the classification of L. plantarum as a facultative heterofermentative lactobacillus, the 5-2 genome encodes key enzymes required for the EMP (Embden-Meyerhof-Parnas) and phosphoketolase (PK) pathways. Several components of the secretion machinery are found in the 5-2 genome, which was compared with L. plantarum ST-III, JDM1 and WCFS1. Most of the specific proteins in the four genomes appeared to be related to their prophage elements. PMID:26212213

  4. Comparative genomics and evolution of the tailed-bacteriophages.

    Science.gov (United States)

    Casjens, Sherwood R

    2005-08-01

    The number of completely sequenced tailed-bacteriophage genomes that have been published increased to more than 125 last year. The comparison of these genomes has brought their highly mosaic nature into much sharper focus. Furthermore, reports of the complete sequences of about 150 bacterial genomes have shown that the many prophage and parts thereof that reside in these bacterial genomes must comprise a significant fraction of Earth's phage gene pool. These phage and prophage genomes are fertile ground for attempts to deduce the nature of viral evolutionary processes, and such analyses have made it clear that these phage have enjoyed a significant level of horizontal exchange of genetic information throughout their long histories. The strength of these evolutionary deductions rests largely on the extensive knowledge that has accumulated during intensive study into the molecular nature of the life cycles of a few 'model system' phages over the past half century. Recent molecular studies of phages other than these model system phages have made it clear that much remains to be learnt about the variety of lifestyle strategies utilized by the tailed-phage. PMID:16019256

  5. Comparative analysis of catfish BAC end sequences with the zebrafish genome

    OpenAIRE

    Abernathy Jason; Xu Peng; Somridhivej Benjaporn; Ninwichian Parichart; Wang Shaolin; Jiang Yanliang; Liu Hong; Kucuktas Huseyin; Liu Zhanjiang

    2009-01-01

    Abstract Background Comparative mapping is a powerful tool to transfer genomic information from sequenced genomes to closely related species for which whole genome sequence data are not yet available. However, such an approach is still very limited in catfish, the most important aquaculture species in the United States. This project was initiated to generate additional BAC end sequences and demonstrate their applications in comparative mapping in catfish. Results We reported the generation of...

  6. Comparative analysis of catfish BAC end sequences with the zebrafish genome

    OpenAIRE

    Liu, Hong; Jiang, Yanliang; Wang, Shaolin; Ninwichian, Parichart; Somridhivej, Benjaporn; Xu, Peng(Academy of Mathematics and Systems Science, Chinese Academy of Sciences, 100190, Beijing, China); Abernathy, Jason; Kucuktas, Huseyin; Liu, Zhanjiang

    2009-01-01

    Background Comparative mapping is a powerful tool to transfer genomic information from sequenced genomes to closely related species for which whole genome sequence data are not yet available. However, such an approach is still very limited in catfish, the most important aquaculture species in the United States. This project was initiated to generate additional BAC end sequences and demonstrate their applications in comparative mapping in catfish. Results We reported the generation of 43,000 B...

  7. Comparative genome analysis across a kingdom of eukaryotic organisms: Specialization and diversification in the Fungi

    OpenAIRE

    Cornell, Michael J.; Alam, Intikhab; Soanes, Darren M.; Wong, Han Min; Hedeler, Cornelia; Paton, Norman W; Rattray, Magnus; Hubbard, Simon J; Talbot, Nicholas J.; Oliver, Stephen G

    2007-01-01

    The recent proliferation of genome sequencing in diverse fungal species has provided the first opportunity for comparative genome analysis across a eukaryotic kingdom. Here, we report a comparative study of 34 complete fungal genome sequences, representing a broad diversity of Ascomycete, Basidiomycete, and Zygomycete species. We have clustered all predicted protein-encoding gene sequences from these species to provide a means of investigating gene innovations, gene family expansions, protein...

  8. UniPrimer: A Web-Based Primer Design Tool for Comparative Analyses of Primate Genomes

    OpenAIRE

    Nomin Batnyam; Jimin Lee; Jungnam Lee; Seung Bok Hong; Sejong Oh; Kyudong Han

    2012-01-01

    Whole genome sequences of various primates have been released due to advanced DNA-sequencing technology. A combination of computational data mining and the polymerase chain reaction (PCR) assay to validate the data is an excellent method for conducting comparative genomics. Thus, designing primers for PCR is an essential procedure for a comparative analysis of primate genomes. Here, we developed and introduced UniPrimer for use in those studies. UniPrimer is a web-based tool that designs PCR-...

  9. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    NARCIS (Netherlands)

    Tettelin, H; Masignani, [No Value; Cieslewicz, MJ; Eisen, JA; Peterson, S; Paulsen, IT; Nelson, KE; Margarit, [No Value; Read, TD; Madoff, LC; Beanan, MJ; Brinkac, LM; Daugherty, SC; DeBoy, RT; Durkin, AS; Kolonay, JF; Madupu, R; Lewis, MR; Radune, D; Fedorova, NB; Scanlan, D; Khouri, H; Mulligan, S; Carty, HA; Cline, RT; Van Aken, SE; Gill, J; Scarselli, M; Mora, M; Iacobini, ET; Brettoni, C; Galli, G; Mariani, M; Vegni, F; Maione, D; Rinaudo, D; Rappuoli, R; Telford, JL; Kasper, DL; Grandi, G; Fraser, CM

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the oth

  10. Comparative genomics of toxigenic and non-toxigenic Staphylococcus hyicus

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Pamp, Sünje Johanna; Andresen, Lars Ole;

    2016-01-01

    The most common causative agent of exudative epidermitis (EE) in pigs is Staphylococcus hyicus. S. hyicus can be grouped into toxigenic and non-toxigenic strains based on their ability to cause EE in pigs and specific virulence genes have been identified. A genome wide comparison between non...

  11. Cloud Computing for Comparative Genomics with Windows Azure Platform

    OpenAIRE

    Insik Kim; Jae-Yoon Jung; DeLuca, Todd F.; Nelson, Tristan H; Wall, Dennis P

    2012-01-01

    Cloud computing services have emerged as a cost-effective alternative for cluster systems as the number of genomes and required computation power to analyze them increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services.

  12. Cloud computing for comparative genomics with windows azure platform.

    Science.gov (United States)

    Kim, Insik; Jung, Jae-Yoon; Deluca, Todd F; Nelson, Tristan H; Wall, Dennis P

    2012-01-01

    Cloud computing services have emerged as a cost-effective alternative for cluster systems as the number of genomes and required computation power to analyze them increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services. PMID:23032609

  13. Comparative Analysis of Genome Diversity in Bullmastiff Dogs.

    Science.gov (United States)

    Mortlock, Sally-Anne; Khatkar, Mehar S; Williamson, Peter

    2016-01-01

    Management and preservation of genomic diversity in dog breeds is a major objective for maintaining health. The present study was undertaken to characterise genomic diversity in Bullmastiff dogs using both genealogical and molecular analysis. Genealogical analysis of diversity was conducted using a database consisting of 16,378 Bullmastiff pedigrees from year 1980 to 2013. Additionally, a total of 188 Bullmastiff dogs were genotyped using the 170,000 SNP Illumina CanineHD Beadchip. Genealogical parameters revealed a mean inbreeding coefficient of 0.047; 142 total founders (f); an effective number of founders (fe) of 79; an effective number of ancestors (fa) of 62; and an effective population size of the reference population of 41. Genetic diversity and the degree of genome-wide homogeneity within the breed were also investigated using molecular data. Multiple-locus heterozygosity (MLH) was equal to 0.206; runs of homozygosity (ROH) as proportion of the genome, averaged 16.44%; effective population size was 29.1, with an average inbreeding coefficient of 0.035, all estimated using SNP Data. Fine-scale population structure was analysed using NETVIEW, a population analysis pipeline. Visualisation of the high definition network captured relationships among individuals within and between subpopulations. Effects of unequal founder use, and ancestral inbreeding and selection, were evident. While current levels of Bullmastiff heterozygosity, inbreeding and homozygosity are not unusual, a relatively small effective population size indicates that a breeding strategy to reduce the inbreeding rate may be beneficial. PMID:26824579

  14. Comparative genomics of mutualistic viruses of Glyptapanteles parasitic wasps

    Science.gov (United States)

    Polydnaviruses, a family of double-stranded DNA viruses with segmented genomes, have evolved as obligate endosymbionts of endoparasitoid wasps, and are some of the few viruses known to share mutualistic relationships with eukaryotic hosts. Virus particles are replication deficient and are produced o...

  15. Comparative population genomics of maize domestication and improvement

    Science.gov (United States)

    Domestication and modern breeding represent exemplary case studies of evolution in action. Maize is an outcrossing species with a complex genome, and an understanding of maize evolution is thus relevant for both plant and animal systems. This study is the largest plant resequencing effort to date, ...

  16. Comparative genomic analysis of novel Acinetobacter symbionts: A combined systems biology and genomics approach

    Science.gov (United States)

    Gupta, Vipin; Haider, Shazia; Sood, Utkarsh; Gilbert, Jack A.; Ramjee, Meenakshi; Forbes, Ken; Singh, Yogendra; Lopes, Bruno S.; Lal, Rup

    2016-01-01

    The increasing trend of antibiotic resistance in Acinetobacter drastically limits the range of therapeutic agents required to treat multidrug resistant (MDR) infections. This study focused on analysis of novel Acinetobacter strains using a genomics and systems biology approach. Here we used a network theory method for pathogenic and non-pathogenic Acinetobacter spp. to identify the key regulatory proteins (hubs) in each strain. We identified nine key regulatory proteins, guaA, guaB, rpsB, rpsI, rpsL, rpsE, rpsC, rplM and trmD, which have functional roles as hubs in a hierarchical scale-free fractal protein-protein interaction network. Two key hubs (guaA and guaB) were important for insect-associated strains, and comparative analysis identified guaA as more important than guaB due to its role in effective module regulation. rpsI played a significant role in all the novel strains, while rplM was unique to sheep-associated strains. rpsM, rpsB and rpsI were involved in the regulation of overall network topology across all Acinetobacter strains analyzed in this study. Future analysis will investigate whether these hubs are useful as drug targets for treating Acinetobacter infections. PMID:27378055

  17. Genome sequences and comparative genomics of two Lactobacillus ruminis strains from the bovine and human intestinal tracts

    LENUS (Irish Health Repository)

    2011-08-30

    Abstract Background The genus Lactobacillus is characterized by an extraordinary degree of phenotypic and genotypic diversity, which recent genomic analyses have further highlighted. However, the choice of species for sequencing has been non-random and unequal in distribution, with only a single representative genome from the L. salivarius clade available to date. Furthermore, there is no data to facilitate a functional genomic analysis of motility in the lactobacilli, a trait that is restricted to the L. salivarius clade. Results The 2.06 Mb genome of the bovine isolate Lactobacillus ruminis ATCC 27782 comprises a single circular chromosome, and has a G+C content of 44.4%. In silico analysis identified 1901 coding sequences, including genes for a pediocin-like bacteriocin, a single large exopolysaccharide-related cluster, two sortase enzymes, two CRISPR loci and numerous IS elements and pseudogenes. A cluster of genes related to a putative pilin was identified, and shown to be transcribed in vitro. A high quality draft assembly of the genome of a second L. ruminis strain, ATCC 25644 isolated from humans, suggested a slightly larger genome of 2.138 Mb, that exhibited a high degree of synteny with the ATCC 27782 genome. In contrast, comparative analysis of L. ruminis and L. salivarius identified a lack of long-range synteny between these closely related species. Comparison of the L. salivarius clade core proteins with those of nine other Lactobacillus species distributed across 4 major phylogenetic groups identified the set of shared proteins, and proteins unique to each group. Conclusions The genome of L. ruminis provides a comparative tool for directing functional analyses of other members of the L. salivarius clade, and it increases understanding of the divergence of this distinct Lactobacillus lineage from other commensal lactobacilli. The genome sequence provides a definitive resource to facilitate investigation of the genetics, biochemistry and host

  18. Comparative genomics in chicken and Pekin duck using FISH mapping and microarray analysis

    Directory of Open Access Journals (Sweden)

    Fowler Katie E

    2009-08-01

    Full Text Available Abstract Background The availability of the complete chicken (Gallus gallus genome sequence as well as a large number of chicken probes for fluorescent in-situ hybridization (FISH and microarray resources facilitate comparative genomic studies between chicken and other bird species. In a previous study, we provided a comprehensive cytogenetic map for the turkey (Meleagris gallopavo and the first analysis of copy number variants (CNVs in birds. Here, we extend this approach to the Pekin duck (Anas platyrhynchos, an obvious target for comparative genomic studies due to its agricultural importance and resistance to avian flu. Results We provide a detailed molecular cytogenetic map of the duck genome through FISH assignment of 155 chicken clones. We identified one inter- and six intrachromosomal rearrangements between chicken and duck macrochromosomes and demonstrated conserved synteny among all microchromosomes analysed. Array comparative genomic hybridisation revealed 32 CNVs, of which 5 overlap previously designated "hotspot" regions between chicken and turkey. Conclusion Our results suggest extensive conservation of avian genomes across 90 million years of evolution in both macro- and microchromosomes. The data on CNVs between chicken and duck extends previous analyses in chicken and turkey and supports the hypotheses that avian genomes contain fewer CNVs than mammalian genomes and that genomes of evolutionarily distant species share regions of copy number variation ("CNV hotspots". Our results will expedite duck genomics, assist marker development and highlight areas of interest for future evolutionary and functional studies.

  19. High-Resolution Comparative Genomic Hybridization of Inflammatory Breast Cancer and Identification of Candidate Genes

    OpenAIRE

    Bekhouche, Ismahane; Finetti, Pascal; Adelaïde, José; Ferrari, Anthony; Tarpin, Carole; Charafe-Jauffret, Emmanuelle; Charpin, Colette; Houvenaeghel, Gilles; Jacquemier, Jocelyne; Bidaut, Ghislain; Birnbaum, Daniel; Viens, Patrice; Chaffanet, Max; Bertucci, François

    2011-01-01

    Background Inflammatory breast cancer (IBC) is an aggressive form of BC poorly defined at the molecular level. We compared the molecular portraits of 63 IBC and 134 non-IBC (nIBC) clinical samples. Methodology/Findings Genomic imbalances of 49 IBCs and 124 nIBCs were determined using high-resolution array-comparative genomic hybridization, and mRNA expression profiles of 197 samples using whole-genome microarrays. Genomic profiles of IBCs were as heterogeneous as those of nIBCs, and globally ...

  20. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python)

    OpenAIRE

    Irizarry, Kristopher J. L.; Josep Rutllant

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 g...

  1. Complete genome sequence of the fire blight pathogen Erwinia pyrifoliae DSM 12163T and comparative genomic insights into plant pathogenicity

    Directory of Open Access Journals (Sweden)

    Frey Jürg E

    2010-01-01

    Full Text Available Abstract Background Erwinia pyrifoliae is a newly described necrotrophic pathogen, which causes fire blight on Asian (Nashi pear and is geographically restricted to Eastern Asia. Relatively little is known about its genetics compared to the closely related main fire blight pathogen E. amylovora. Results The genome of the type strain of E. pyrifoliae strain DSM 12163T, was sequenced using both 454 and Solexa pyrosequencing and annotated. The genome contains a circular chromosome of 4.026 Mb and four small plasmids. Based on their respective role in virulence in E. amylovora or related organisms, we identified several putative virulence factors, including type III and type VI secretion systems and their effectors, flagellar genes, sorbitol metabolism, iron uptake determinants, and quorum-sensing components. A deletion in the rpoS gene covering the most conserved region of the protein was identified which may contribute to the difference in virulence/host-range compared to E. amylovora. Comparative genomics with the pome fruit epiphyte Erwinia tasmaniensis Et1/99 showed that both species are overall highly similar, although specific differences were identified, for example the presence of some phage gene-containing regions and a high number of putative genomic islands containing transposases in the E. pyrifoliae DSM 12163T genome. Conclusions The E. pyrifoliae genome is an important addition to the published genome of E. tasmaniensis and the unfinished genome of E. amylovora providing a foundation for re-sequencing additional strains that may shed light on the evolution of the host-range and virulence/pathogenicity of this important group of plant-associated bacteria.

  2. Comparative genomics of the Staphylococcus intermedius group of animal pathogens

    Directory of Open Access Journals (Sweden)

    Nouri eBen Zakour

    2012-04-01

    Full Text Available The Staphylococcus intermedius group consists of 3 closely-related coagulase-positive bacterial species including S. intermedius, Staphylococus pseudintermedius, and Staphylococcus delphini. S. pseudintermedius is a major skin pathogen of dogs, which occasionally causes severe zoonotic infections of humans. S. delphini has been isolated from an array of different animals including horses, mink and pigeons, whereas S. intermedius has been isolated only from pigeons to date. Here we provide a detailed analysis of the S. pseudintermedius whole genome sequence in comparison to high quality draft S. intermedius and S. delphini genomes, and to other sequenced staphylococcal species. The core genome of the SIG was highly conserved with average nucleotide identity (ANI between the 3 species of 93.61%, which is very close to the threshold of species delineation (95% ANI, highlighting the close-relatedness of the SIG species. However, considerable variation was identified in the content of mobile genetic elements, cell wall-associated proteins, and iron and sugar transporters, reflecting the distinct ecological niches inhabited. Of note, S. pseudintermedius ED99 contained a Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR locus of the Nmeni subtype and S. intermedius contained both Nmeni and Mtube subtypes. In contrast to S. intermedius and S. delphini and most other staphylococci examined to date, S. pseudintermedius contained at least 9 predicted reverse transcriptase (RT Group II introns. Furthermore, S. pseudintermedius ED99 encoded several transposons which were largely responsible for its multi-resistant phenotype. Overall, the study highlights extensive differences in accessory genome content between closely-related staphylococcal species inhabiting distinct host niches, providing new avenues for research into pathogenesis and bacterial host-adaptation.

  3. Metagenome Skimming of Insect Specimen Pools: Potential for Comparative Genomics

    OpenAIRE

    Linard, Benjamin; Crampton-Platt, Alex; Gillett, Conrad P. D. T.; Timmermans, Martijn J. T. N.; Vogler, Alfried P.

    2015-01-01

    Metagenomic analyses are challenging in metazoans, but high-copy number and repeat regions can be assembled from low-coverage sequencing by “genome skimming,” which is applied here as a new way of characterizing metagenomes obtained in an ecological or taxonomic context. Illumina shotgun sequencing on two pools of Coleoptera (beetles) of approximately 200 species each were assembled into tens of thousands of scaffolds. Repeated low-coverage sequencing recovered similar scaffold sets consisten...

  4. Investigating hookworm genomes by comparative analysis of two Ancylostoma species

    OpenAIRE

    Kapulkin Wadim; Stajich Jason E; Xu Jian; Wylie Todd; Dante Mike; Martin John; Hawdon John; Arasu Prema; McCarter James P; Mitreva Makedonka; Clifton Sandra W; Waterston Robert H; Wilson Richard K

    2005-01-01

    Abstract Background Hookworms, infecting over one billion people, are the mostly closely related major human parasites to the model nematode Caenorhabditis elegans. Applying genomics techniques to these species, we analyzed 3,840 and 3,149 genes from Ancylostoma caninum and A. ceylanicum. Results Transcripts originated from libraries representing infective L3 larva, stimulated L3, arrested L3, and adults. Most genes are represented in single stages including abundant transcripts like hsp-20 i...

  5. Comparative analysis of whole-genome sequences of Streptococcus suis

    Institute of Scientific and Technical Information of China (English)

    LI Pengli; WEI Wu; LI Yixue; MA Yuanyuan; DING Guohui; LI Xiaoping; WANG Xiaojing; ZHANG Liwen; SUN Jingchun; WANG Yong; TU Kang; WANG Ningning; HAO Pei; WANG Chuan; CAO Zhiwei; SHI Tieliu

    2006-01-01

    The outbreak of Streptococcus suis recently in some districts of Sichuan Province in China has caused over 30 deaths and over 200 infections in human beings. In order to study the pathogenicity mechanism and to prevent the bacteria from spreading and infecting human beings and swine, we have annotated and analyzed the genomes of two strains, Streptococcus suis P1/7 and 89-1591 respectively. The whole length of P1/7 is 2.007 Mb,and has 1969 ORFs. In contrast, the partial genome sequence of 89-1591 is 1.98 Mb in length and exists in 177 contigs with 1918 ORFs. Analysis shows that the average lengths of CDSs in two genomes are very close, and the numbers of the homolog ORFs are 1306 between those two strains. Most of the toxicity factors of the two strains are homologeous, but there are still some significant differences between those two strains. For example, among the 11 genes (cps2A-cps2K) encoding for the capsules in P1/7, 4(cps2A, 2B, 2I, 2J) are not detected in strain 89-1591.At the same time, the genes encoding EF and Haemolysin in P1/7 are also not found in strain 89-1591. Besides, the genes related to DNA replication, repair and recombination differ from each other significantly and there also exist certain differences among the surface proteins. Those characteristics indicate that those two strains have evolved their own specific functions to adapt to the different environments and that the pathogenesis of the two strains is different. We have accumulated comprehensive genomics information for future systematic studies of S.sui. Our results are helpful for disease prevention,vaccine development, as well as drug design for S.suis.

  6. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  7. Mitochondrial genome sequences and comparative genomics ofPhytophthora ramorum and P. sojae

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Frank N.; Douda, Bensasson; Tyler, Brett M.; Boore,Jeffrey L.

    2007-01-01

    The complete sequences of the mitochondrial genomes of theoomycetes of Phytophthora ramorum and P. sojae were determined during thecourse of their complete nuclear genome sequencing (Tyler, et al. 2006).Both are circular, with sizes of 39,314 bp for P. ramorum and 42,975 bpfor P. sojae. Each contains a total of 37 identifiable protein-encodinggenes, 25 or 26 tRNAs (P. sojae and P. ramorum, respectively)specifying19 amino acids, and a variable number of ORFs (7 for P. ramorum and 12for P. sojae) which are potentially additional functional genes.Non-coding regions comprise approximately 11.5 percent and 18.4 percentof the genomes of P. ramorum and P. sojae, respectively. Relative to P.sojae, there is an inverted repeat of 1,150 bp in P. ramorum thatincludes an unassigned unique ORF, a tRNA gene, and adjacent non-codingsequences, but otherwise the gene order in both species is identical.Comparisons of these genomes with published sequences of the P. infestansmitochondrial genome reveals a number of similarities, but the gene orderin P. infestans differs in two adjacent locations due to inversions.Sequence alignments of the three genomes indicated sequence conservationranging from 75 to 85 percent and that specific regions were morevariable than others.

  8. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level

    Science.gov (United States)

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea’s genetic data sources. PMID:27446038

  9. Comparative Analysis of CpG Islands in Four Fish Genomes

    Directory of Open Access Journals (Sweden)

    Leng Han

    2008-01-01

    Full Text Available There has been much interest in CpG islands (CGIs, clusters of CpG dinucleotides in GC-rich regions, because they are considered gene markers and involved in gene regulation. To date, there has been no genome-wide analysis of CGIs in the fish genome. We first evaluated the performance of three popular CGI identification algorithms in four fish genomes (tetraodon, stickleback, medaka, and zebrafish. Our results suggest that Takai and Jones' (2002 algorithm is most suitable for comparative analysis of CGIs in the fish genome. Then, we performed a systematic analysis of CGIs in the four fish genomes using Takai and Jones' algorithm, compared to other vertebrate genomes. We found that both the number of CGIs and the CGI density vary greatly among these genomes. Remarkably, each fish genome presents a distinct distribution of CGI density with some genomic factors (e.g., chromosome size and chromosome GC content. These findings are helpful for understanding evolution of fish genomes and the features of fish CGIs.

  10. Comparative analysis of catfish BAC end sequences with the zebrafish genome

    Directory of Open Access Journals (Sweden)

    Abernathy Jason

    2009-12-01

    Full Text Available Abstract Background Comparative mapping is a powerful tool to transfer genomic information from sequenced genomes to closely related species for which whole genome sequence data are not yet available. However, such an approach is still very limited in catfish, the most important aquaculture species in the United States. This project was initiated to generate additional BAC end sequences and demonstrate their applications in comparative mapping in catfish. Results We reported the generation of 43,000 BAC end sequences and their applications for comparative genome analysis in catfish. Using these and the additional 20,000 existing BAC end sequences as a resource along with linkage mapping and existing physical map, conserved syntenic regions were identified between the catfish and zebrafish genomes. A total of 10,943 catfish BAC end sequences (17.3% had significant BLAST hits to the zebrafish genome (cutoff value ≤ e-5, of which 3,221 were unique gene hits, providing a platform for comparative mapping based on locations of these genes in catfish and zebrafish. Genetic linkage mapping of microsatellites associated with contigs allowed identification of large conserved genomic segments and construction of super scaffolds. Conclusion BAC end sequences and their associated polymorphic markers are great resources for comparative genome analysis in catfish. Highly conserved chromosomal regions were identified to exist between catfish and zebrafish. However, it appears that the level of conservation at local genomic regions are high while a high level of chromosomal shuffling and rearrangements exist between catfish and zebrafish genomes. Orthologous regions established through comparative analysis should facilitate both structural and functional genome analysis in catfish.

  11. Whole genome comparative analysis of channel catfish (Ictalurus punctatus) with four model fish species

    OpenAIRE

    Jiang, Yanliang; Gao, Xiaoyu; Liu, Shikai; Zhang, Yu; Liu, Hong; Sun, Fanyue; Bao, Lisui; Waldbieser, Geoff; Liu, Zhanjiang

    2013-01-01

    Background Comparative mapping is a powerful tool to study evolution of genomes. It allows transfer of genome information from the well-studied model species to non-model species. Catfish is an economically important aquaculture species in United States. A large amount of genome resources have been developed from catfish including genetic linkage maps, physical maps, BAC end sequences (BES), integrated linkage and physical maps using BES-derived markers, physical map contig-specific sequences...

  12. Detection of complete and partial chromosome gains and losses by comparative genomic in situ hybridization

    OpenAIRE

    Manoir, Stanislas du; Speicher, Michael R.; Joos, Stefan; Schröck, Evelin; Popp, Susanne, 1983-; Döhner, Hartmut; Kovacs, Gyula; Robert-Nicoud, Michel; Lichter, Peter; Cremer, Thomas

    1993-01-01

    Comparative genomic in situ hybridization (CGH) provides a new possibility for searching genomes for imbalanced genetic material. Labeled genomic test DNA, prepared from clinical or tumor specimens, is mixed with differently labeled control DNA prepared from cells with normal chromosome complements. The mixed probe is used for chromosomal in situ suppression (CISS) hybridization to normal metaphase spreads (CGH-metaphase spreads). Hybridized test and control DNA sequences are detected via dif...

  13. PGSB PlantsDB: updates to the database framework for comparative plant genome research

    OpenAIRE

    Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai C.; Martis, Mihaela-Maria; Seidel, Michael; Kugler, Karl G; Gundlach, Heidrun; Mayer, Klaus F. X.

    2016-01-01

    PGSB (Plant Genome and Systems Biology: formerly MIPS) PlantsDB (http://pgsb.helmholtz-muenchen.de/plant/index.jsp) is a database framework for the comparative analysis and visualization of plant genome data. The resource has been updated with new data sets and types as well as specialized tools and interfaces to address user demands for intuitive access to complex plant genome data. In its latest incarnation, we have re-worked both the layout and navigation structure and implemented new keyw...

  14. PGSB PlantsDB: updates to the database framework for comparative plant genome research

    OpenAIRE

    Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai C.; Martis, Mihaela M.; Seidel, Michael; Kugler, Karl G.; Gundlach, Heidrun; Mayer, Klaus F. X

    2015-01-01

    PGSB (Plant Genome and Systems Biology: formerly MIPS) PlantsDB (http://pgsb.helmholtz-muenchen.de/plant/index.jsp) is a database framework for the comparative analysis and visualization of plant genome data. The resource has been updated with new data sets and types as well as specialized tools and interfaces to address user demands for intuitive access to complex plant genome data. In its latest incarnation, we have re-worked both the layout and navigation structure and implemented new keyw...

  15. EDGAR: A software framework for the comparative analysis of prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Vorhölter Frank-Jörg

    2009-05-01

    Full Text Available Abstract Background The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results To support these studies EDGAR – "Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" – was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and moreover, it was possible to root each strain with unprecedented accuracy. Conclusion EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform independent user interface http://edgar.cebitec.uni-bielefeld.de, where the precomputed data sets can be browsed.

  16. Evolution of Prdm Genes in Animals: Insights from Comparative Genomics

    OpenAIRE

    Vervoort, Michel; Meulemeester, David; Béhague, Julien; Kerner, Pierre

    2015-01-01

    Prdm genes encode transcription factors with a subtype of SET domain known as the PRDF1-RIZ (PR) homology domain and a variable number of zinc finger motifs. These genes are involved in a wide variety of functions during animal development. As most Prdm genes have been studied in vertebrates, especially in mice, little is known about the evolution of this gene family. We searched for Prdm genes in the fully sequenced genomes of 93 different species representative of all the main metazoan line...

  17. Progress in the detection of human genome structural variations

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    The emerging of high-throughput and high-resolution genomic technologies led to the detection of submicroscopic variants ranging from 1 kb to 3 Mb in the human genome.These variants include copy number variations(CNVs),inversions,insertions,deletions and other complex rearrangements of DNA sequences.This paper briefly reviews the commonly used technologies to discover both genomic structural variants and their potential influences.Particularly,we highlight the array-based,PCR-based and sequencing-based assays,including array-based comparative genomic hybridization(aCGH),representational oligonucleotide microarray analysis(ROMA),multiplex amplifiable probe hybridization(MAPH),multiplex ligation-dependent probe amplification(MLPA),paired-end mapping(PEM),and next-generation DNA sequencing technologies.Furthermore,we discuss the limitations and challenges of current assays and give advices on how to make the database of genomic variations more reliable.

  18. Progress in the detection of human genome structural variations

    Institute of Scientific and Technical Information of China (English)

    WU XueMei; XIAO HuaSheng

    2009-01-01

    The emerging of high.throughput and high-resolution genomic technologies led to the detection of submicroscopic variants ranging from 1 kb to 3 Mb in the human genome. These variants include copy number variations (CNVs), inversions, insertions, deletions and other complex rearrangements of DNA sequences. This paper briefly reviews the commonly used technologies to discover both genomic structural variants and their potential influences. Particularly, we highlight the array-based, PCR-based and sequencing-based assays, including array-based comparative genomic hybridization (aCGH),representational oligonucleotide microarray analysis (ROMA), multiplex amplifiable probe hybridization (MAPH), multiplex ligation-dependent probe amplification (MLPA), paired-end mapping (PEM), and next-generation DNA sequencing technologies. Furthermore, we discuss the limitations and challenges of current assays and give advices on how to make the database of genomic variations more reliable.

  19. Evolution of Prdm Genes in Animals: Insights from Comparative Genomics.

    Science.gov (United States)

    Vervoort, Michel; Meulemeester, David; Béhague, Julien; Kerner, Pierre

    2016-03-01

    Prdm genes encode transcription factors with a subtype of SET domain known as the PRDF1-RIZ (PR) homology domain and a variable number of zinc finger motifs. These genes are involved in a wide variety of functions during animal development. As most Prdm genes have been studied in vertebrates, especially in mice, little is known about the evolution of this gene family. We searched for Prdm genes in the fully sequenced genomes of 93 different species representative of all the main metazoan lineages. A total of 976 Prdm genes were identified in these species. The number of Prdm genes per species ranges from 2 to 19. To better understand how the Prdm gene family has evolved in metazoans, we performed phylogenetic analyses using this large set of identified Prdm genes. These analyses allowed us to define 14 different subfamilies of Prdm genes and to establish, through ancestral state reconstruction, that 11 of them are ancestral to bilaterian animals. Three additional subfamilies were acquired during early vertebrate evolution (Prdm5, Prdm11, and Prdm17). Several gene duplication and gene loss events were identified and mapped onto the metazoan phylogenetic tree. By studying a large number of nonmetazoan genomes, we confirmed that Prdm genes likely constitute a metazoan-specific gene family. Our data also suggest that Prdm genes originated before the diversification of animals through the association of a single ancestral SET domain encoding gene with one or several zinc finger encoding genes. PMID:26560352

  20. BGI-RIS: an integrated information resource and comparative analysis workbench for rice genomics

    DEFF Research Database (Denmark)

    Zhao, Wenming; Wang, Jing; He, Ximiao;

    2004-01-01

    the application of the rice genomic information and to provide a foundation for functional and evolutionary studies of other important cereal crops, we implemented our Rice Information System (BGI-RIS), the most up-to-date integrated information resource as well as a workbench for comparative genomic...

  1. Comparative genomics in chicken and Pekin duck using FISH mapping and microarray analysis

    NARCIS (Netherlands)

    Skinner, M.; Robertson, L.B.; Tempest, H.G.; Langley, E.J.; Ioannou, D.; Fowler, K.E.; Crooijmans, R.P.M.A.

    2009-01-01

    Background: The availability of the complete chicken (Gallus gallus) genome sequence as well as a large number of chicken probes for fluorescent in-situ hybridization (FISH) and microarray resources facilitate comparative genomic studies between chicken and other bird species. In a previous study, w

  2. Reference set of regulons in Desulfovibrionales inferred by comparative genomics approach

    Energy Technology Data Exchange (ETDEWEB)

    Kazakov, A.E.; Rodionov, D.A.; Price, M.N.; Arkin, A.P.; Dubchak, I.; Novichkov, P.S.

    2010-11-15

    in this study, we carried out large-scale comparative genomics analysis of regulatory interactions in Desulfovibrio vulgaris and 12 related genomes from Desulfovibrionales order using our recently developed web server RegPredict (http://regpredict.lbl.gov). An overall reference collection of 26 Desulfovibrionales regulogs can be accessed through RegPrecise database (http://regpredict.lbl.gov).

  3. Comparative genomic characterization of three Streptococcus parauberis strains in fish pathogen, as assessed by wide-genome analyses.

    Directory of Open Access Journals (Sweden)

    Seong-Won Nho

    Full Text Available Streptococcus parauberis, which is the main causative agent of streptococcosis among olive flounder (Paralichthys olivaceus in northeast Asia, can be distinctly divided into two groups (type I and type II by an agglutination test. Here, the whole genome sequences of two Japanese strains (KRS-02083 and KRS-02109 were determined and compared with the previously determined genome of a Korean strain (KCTC 11537. The genomes of S. parauberis are intermediate in size and have lower GC contents than those of other streptococci. We annotated 2,236 and 2,048 genes in KRS-02083 and KRS-02109, respectively. Our results revealed that the three S. parauberis strains contain different genomic insertions and deletions. In particular, the genomes of Korean and Japanese strains encode different factors for sugar utilization; the former encodes the phosphotransferase system (PTS for sorbose, whereas the latter encodes proteins for lactose hydrolysis, respectively. And the KRS-02109 strain, specifically, was the type II strain found to be able to resist phage infection through the clustered regularly interspaced short palindromic repeats (CRISPR/Cas system and which might contribute valuably to serologically distribution. Thus, our genome-wide association study shows that polymorphisms can affect pathogen responses, providing insight into biological/biochemical pathways and phylogenetic diversity.

  4. Comparative genomics of 12 strains of Erwinia amylovora identifies a pan-genome with a large conserved core.

    Directory of Open Access Journals (Sweden)

    Rachel A Mann

    Full Text Available The plant pathogen Erwinia amylovora can be divided into two host-specific groupings; strains infecting a broad range of hosts within the Rosaceae subfamily Spiraeoideae (e.g., Malus, Pyrus, Crataegus, Sorbus and strains infecting Rubus (raspberries and blackberries. Comparative genomic analysis of 12 strains representing distinct populations (e.g., geographic, temporal, host origin of E. amylovora was used to describe the pan-genome of this major pathogen. The pan-genome contains 5751 coding sequences and is highly conserved relative to other phytopathogenic bacteria comprising on average 89% conserved, core genes. The chromosomes of Spiraeoideae-infecting strains were highly homogeneous, while greater genetic diversity was observed between Spiraeoideae- and Rubus-infecting strains (and among individual Rubus-infecting strains, the majority of which was attributed to variable genomic islands. Based on genomic distance scores and phylogenetic analysis, the Rubus-infecting strain ATCC BAA-2158 was genetically more closely related to the Spiraeoideae-infecting strains of E. amylovora than it was to the other Rubus-infecting strains. Analysis of the accessory genomes of Spiraeoideae- and Rubus-infecting strains has identified putative host-specific determinants including variation in the effector protein HopX1(Ea and a putative secondary metabolite pathway only present in Rubus-infecting strains.

  5. Family Competition Pheromone Genetic Algorithm for Comparative Genome Assembly

    Institute of Scientific and Technical Information of China (English)

    Chien-Hao Su; Chien-Shun Chiou; Jung-Che Kuo; Pei-Jen Wang; Cheng-Yan Kao; Hsueh-Ting Chu

    2014-01-01

    Genome assembly is a prerequisite step for analyzing next generation sequencing data and also far from being solved. Many assembly tools have been proposed and used extensively. Majority of them aim to assemble sequencing reads into contigs; however, we focus on the assembly of contigs into scaffolds in this paper. This is called scaffolding, which estimates the relative order of the contigs as well as the size of the gaps between these contigs. Pheromone trail-based genetic algorithm (PGA) was previously proposed and had decent performance according to their paper. From our previous study, we found that family competition mechanism in genetic algorithm is able to further improve the results. Therefore, we propose family competition pheromone genetic algorithm (FCPGA) and demonstrate the improvement over PGA.

  6. Comparative genomic analysis of human fungal pathogens causing paracoccidioidomycosis.

    Directory of Open Access Journals (Sweden)

    Christopher A Desjardins

    2011-10-01

    Full Text Available Paracoccidioides is a fungal pathogen and the cause of paracoccidioidomycosis, a health-threatening human systemic mycosis endemic to Latin America. Infection by Paracoccidioides, a dimorphic fungus in the order Onygenales, is coupled with a thermally regulated transition from a soil-dwelling filamentous form to a yeast-like pathogenic form. To better understand the genetic basis of growth and pathogenicity in Paracoccidioides, we sequenced the genomes of two strains of Paracoccidioides brasiliensis (Pb03 and Pb18 and one strain of Paracoccidioides lutzii (Pb01. These genomes range in size from 29.1 Mb to 32.9 Mb and encode 7,610 to 8,130 genes. To enable genetic studies, we mapped 94% of the P. brasiliensis Pb18 assembly onto five chromosomes. We characterized gene family content across Onygenales and related fungi, and within Paracoccidioides we found expansions of the fungal-specific kinase family FunK1. Additionally, the Onygenales have lost many genes involved in carbohydrate metabolism and fewer genes involved in protein metabolism, resulting in a higher ratio of proteases to carbohydrate active enzymes in the Onygenales than their relatives. To determine if gene content correlated with growth on different substrates, we screened the non-pathogenic onygenale Uncinocarpus reesii, which has orthologs for 91% of Paracoccidioides metabolic genes, for growth on 190 carbon sources. U. reesii showed growth on a limited range of carbohydrates, primarily basic plant sugars and cell wall components; this suggests that Onygenales, including dimorphic fungi, can degrade cellulosic plant material in the soil. In addition, U. reesii grew on gelatin and a wide range of dipeptides and amino acids, indicating a preference for proteinaceous growth substrates over carbohydrates, which may enable these fungi to also degrade animal biomass. These capabilities for degrading plant and animal substrates suggest a duality in lifestyle that could enable pathogenic

  7. Comparative Genomics Reveals Biomarkers to Identify Lactobacillus Species.

    Science.gov (United States)

    Koul, Shikha; Kalia, Vipin Chandra

    2016-09-01

    Bacteria possessing multiple copies of 16S rRNA (rrs) gene demonstrate high intragenomic heterogeneity. It hinders clear distinction at species level and even leads to overestimation of the bacterial diversity. Fifty completely sequenced genomes belonging to 19 species of Lactobacillus species were found to possess 4-9 copies of rrs each. Multiple sequence alignment of 268 rrs genes from all the 19 species could be classified into 20 groups. Lactobacillus sanfranciscensis TMW 1.1304 was the only species where all the 7 copies of rrs were exactly similar and thus formed a distinct group. In order to circumvent the problem of high heterogeneity arising due to multiple copies of rrs, 19 additional genes (732-3645 nucleotides in size) common to Lactobacillus genomes, were selected and digested with 10 Type II restriction endonucleases (RE), under in silico conditions. The following unique gene-RE combinations: recA (1098 nts)-HpyCH4 V, CviAII, BfuCI and RsaI were found to be useful in identifying 29 strains representing 17 species. Digestion patterns of genes-ruvB (1020 nts), dnaA (1368 nts), purA (1290 nts), dnaJ (1140 nts), and gyrB (1944 nts) in combination with REs-AluI, BfuCI, CviAI, Taq1, and Tru9I allowed clear identification of an additional 14 strains belonging to 8 species. Digestion pattern of genes recA, ruvB, dnaA, purA, dnaJ and gyrB can be used as biomarkers for identifying different species of Lactobacillus. PMID:27407290

  8. Identification of conserved regulatory elements by comparative genome analysis

    Directory of Open Access Journals (Sweden)

    Jareborg Niclas

    2003-05-01

    Full Text Available Abstract Background For genes that have been successfully delineated within the human genome sequence, most regulatory sequences remain to be elucidated. The annotation and interpretation process requires additional data resources and significant improvements in computational methods for the detection of regulatory regions. One approach of growing popularity is based on the preferential conservation of functional sequences over the course of evolution by selective pressure, termed 'phylogenetic footprinting'. Mutations are more likely to be disruptive if they appear in functional sites, resulting in a measurable difference in evolution rates between functional and non-functional genomic segments. Results We have devised a flexible suite of methods for the identification and visualization of conserved transcription-factor-binding sites. The system reports those putative transcription-factor-binding sites that are both situated in conserved regions and located as pairs of sites in equivalent positions in alignments between two orthologous sequences. An underlying collection of metazoan transcription-factor-binding profiles was assembled to facilitate the study. This approach results in a significant improvement in the detection of transcription-factor-binding sites because of an increased signal-to-noise ratio, as demonstrated with two sets of promoter sequences. The method is implemented as a graphical web application, ConSite, which is at the disposal of the scientific community at http://www.phylofoot.org/. Conclusions Phylogenetic footprinting dramatically improves the predictive selectivity of bioinformatic approaches to the analysis of promoter sequences. ConSite delivers unparalleled performance using a novel database of high-quality binding models for metazoan transcription factors. With a dynamic interface, this bioinformatics tool provides broad access to promoter analysis with phylogenetic footprinting.

  9. Genomic instability of micronucleated cells revealed by single-cell comparative genomic hybridization.

    NARCIS (Netherlands)

    Imle, A.; Polzer, B.; Alexander, S.; Klein, C.A.; Friedl, P.H.A.

    2009-01-01

    Nuclear variation in size and shape and genomic instability are hallmarks of dedifferentiated cancer cells. Although micronuclei are a typical long-term consequence of DNA damage, their contribution to chromosomal instability and clonal diversity in cancer disease is unclear. We isolated cancer cell

  10. Comparative genomics of 274 Vibrio cholerae genomes reveals mobile functions structuring three niche dimensions

    NARCIS (Netherlands)

    Dutilh, Bas E; Thompson, Cristiane C; Vicente, Ana C P; Marin, Michel A; Lee, Clarence; Silva, Genivaldo G Z; Schmieder, Robert; Andrade, Bruno G N; Chimetto, Luciane; Cuevas, Daniel; Garza, Daniel R; Okeke, Iruka N; Aboderin, Aaron Oladipo; Spangler, Jessica; Ross, Tristen; Dinsdale, Elizabeth A; Thompson, Fabiano L; Harkins, Timothy T; Edwards, Robert A

    2014-01-01

    BACKGROUND: Vibrio cholerae is a globally dispersed pathogen that has evolved with humans for centuries, but also includes non-pathogenic environmental strains. Here, we identify the genomic variability underlying this remarkable persistence across the three major niche dimensions space, time, and h

  11. Comparative Genome Analysis Reveals Divergent Genome Size Evolution in a Carnivorous Plant Genus

    Czech Academy of Sciences Publication Activity Database

    Vu, G.T.H.; Schmutzer, T.; Bull, F.; Cao, H.X.; Fuchs, J.; Tran, T.D.; Jovtchev, G.; Pistrick, K.; Stein, N.; Pečinka, A.; Neumann, Pavel; Novák, Petr; Macas, Jiří; Dear, P.H.; Blattner, F.R.; Scholz, U.; Schubert, I.

    2015-01-01

    Roč. 8, č. 3 (2015). ISSN 1940-3372 R&D Projects: GA ČR GBP501/12/G090 Institutional support: RVO:60077344 Keywords : Genlisea * genome * repetitive sequences Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 3.933, year: 2014

  12. Genomics meets induced mutations in citrus: identification of deleted genes through comparative genomic hybridization

    International Nuclear Information System (INIS)

    We report on the use of genomic approaches to identify pivotal genes in induced citrus mutants. Citrus is the most economically important fruit crop in the world while Spain is the first fresh citrus producer. The survival of the Citrus industry is critically dependent on genetically superior cultivars but improvements in fruit quality traits through traditional techniques are extremely difficult due to the unusual combination of biological characteristics of citrus. Genomic science, however, holds promise of improvements in breeding. In this work, we reported the successful identification of genes included in hemizygous deletions induced by fast neutron irradiation on Citrus clementina. Microarray-based CGH was used to identify underrepresented genes in a citrus mutant that shows color break delay. Subsequent confirmation of gene doses through quantitative PCR and comparison of best hits of putative deleted citrus genes against annotated genomes from other eudicots, specially poplar, enabled the prediction that these genes were clustered into a 700 kb fragment. The availability of Citrus BAC end sequences helped to draw a partial physical map of the deletion. Furthermore, gene content and order in the deleted segment was established by PCR location of gene hits on the physical map. Finally, a lower chlorophyll a/b ratio was found in green tissues from the mutant, an observation that can be related to the hemizygous deletion of a ClpC-like gene, coding a putative subunit of a multifunctional protease complex located into the chloroplast. Analysis of gene content and order inside this Citrus deletion led to the conclusion that microsynteny and local gene colinearity with Populus trichocarpa were higher than with the phylogenetically closer Arabidopsis thaliana genome. In conclusion, a combined strategy including genomics tools and induced citrus mutations has been proved to be a successful approach to identify genes with major roles in citrus fruit development

  13. Genomics Meets Induced Mutations in Citrus: Identification of Deleted Genes Through Comparative Genomic Hybridization

    International Nuclear Information System (INIS)

    We report on the use of genomic approaches to identify pivotal genes in induced citrus mutants. Citrus is the most economically important fruit crop in the world and Spain is the first fresh citrus producer. The survival of the citrus industry is critically dependent on genetically superior cultivars but improvements in fruit quality traits through traditional techniques are extremely difficult due to the unusual combination of biological characteristics of citrus. Genomic science, however, holds promise of improvements in breeding. In this work, we reported the successful identification of genes included in hemizygous deletions induced by fast neutron irradiation on Citrus clementina. Microarray-based CGH was used to identify underrepresented genes in a citrus mutant that shows color break delay. Subsequent confirmation of gene doses through quantitative PCR and comparison of best hits of putative deleted citrus genes against annotated genomes from other eudicots, specially poplar, enabled the prediction that these genes were clustered into a 700 kb fragment. The availability of Citrus BAC end sequences helped to draw a partial physical map of the deletion. Furthermore, gene content and order in the deleted segment was established by PCR location of gene hits on the physical map. Finally, a lower chlorophyll a/b ratio was found in green tissues from the mutant, an observation that can be related to the hemizygous deletion of a ClpC-like gene, coding a putative subunit of a multifunctional protease complex located into the chloroplast. Analysis of gene content and order inside this Citrus deletion led to the conclusion that microsynteny and local gene colinearity with Populus trichocarpa were higher than with the phylogenetically closer Arabidopsis thaliana genome. In conclusion, a combined strategy including genomics tools and induced citrus mutations has been proved to be a successful approach to identify genes with major roles in citrus fruit development

  14. Integrating cytogenetics and genomics in comparative evolutionary studies of cichlid fish

    Directory of Open Access Journals (Sweden)

    Mazzuchelli Juliana

    2012-09-01

    Full Text Available Abstract Background The availability of a large number of recently sequenced vertebrate genomes opens new avenues to integrate cytogenetics and genomics in comparative and evolutionary studies. Cytogenetic mapping can offer alternative means to identify conserved synteny shared by distinct genomes and also to define genome regions that are still not fine characterized even after wide-ranging nucleotide sequence efforts. An efficient way to perform comparative cytogenetic mapping is based on BAC clones mapping by fluorescence in situ hybridization. In this report, to address the knowledge gap on the genome evolution in cichlid fishes, BAC clones of an Oreochromis niloticus library covering the linkage groups (LG 1, 3, 5, and 7 were mapped onto the chromosomes of 9 African cichlid species. The cytogenetic mapping data were also integrated with BAC-end sequences information of O. niloticus and comparatively analyzed against the genome of other fish species and vertebrates. Results The location of BACs from LG1, 3, 5, and 7 revealed a strong chromosomal conservation among the analyzed cichlid species genomes, which evidenced a synteny of the markers of each LG. Comparative in silico analysis also identified large genomic blocks that were conserved in distantly related fish groups and also in other vertebrates. Conclusions Although it has been suggested that fishes contain plastic genomes with high rates of chromosomal rearrangements and probably low rates of synteny conservation, our results evidence that large syntenic chromosome segments have been maintained conserved during evolution, at least for the considered markers. Additionally, our current cytogenetic mapping efforts integrated with genomic approaches conduct to a new perspective to address important questions involving chromosome evolution in fishes.

  15. PGSB PlantsDB: updates to the database framework for comparative plant genome research.

    Science.gov (United States)

    Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai C; Martis, Mihaela M; Seidel, Michael; Kugler, Karl G; Gundlach, Heidrun; Mayer, Klaus F X

    2016-01-01

    PGSB (Plant Genome and Systems Biology: formerly MIPS) PlantsDB (http://pgsb.helmholtz-muenchen.de/plant/index.jsp) is a database framework for the comparative analysis and visualization of plant genome data. The resource has been updated with new data sets and types as well as specialized tools and interfaces to address user demands for intuitive access to complex plant genome data. In its latest incarnation, we have re-worked both the layout and navigation structure and implemented new keyword search options and a new BLAST sequence search functionality. Actively involved in corresponding sequencing consortia, PlantsDB has dedicated special efforts to the integration and visualization of complex triticeae genome data, especially for barley, wheat and rye. We enhanced CrowsNest, a tool to visualize syntenic relationships between genomes, with data from the wheat sub-genome progenitor Aegilops tauschii and added functionality to the PGSB RNASeqExpressionBrowser. GenomeZipper results were integrated for the genomes of barley, rye, wheat and perennial ryegrass and interactive access is granted through PlantsDB interfaces. Data exchange and cross-linking between PlantsDB and other plant genome databases is stimulated by the transPLANT project (http://transplantdb.eu/). PMID:26527721

  16. Genome-wide comparative analysis of NBS-encoding genes between Brassica species and Arabidopsis thaliana

    OpenAIRE

    Yu, Jingyin; Tehrim, Sadia; Zhang, Fengqi; Tong, Chaobo; Huang, Junyan; Cheng, Xiaohui; Dong, Caihua; Zhou, Yanqiu; Qin, Rui; Hua, Wei; Liu, Shengyi

    2014-01-01

    Background Plant disease resistance (R) genes with the nucleotide binding site (NBS) play an important role in offering resistance to pathogens. The availability of complete genome sequences of Brassica oleracea and Brassica rapa provides an important opportunity for researchers to identify and characterize NBS-encoding R genes in Brassica species and to compare with analogues in Arabidopsis thaliana based on a comparative genomics approach. However, little is known about the evolutionary fat...

  17. Cost-Effective Cloud Computing: A Case Study Using the Comparative Genomics Tool, Roundup

    OpenAIRE

    Parul Kudtarkar; DeLuca, Todd F.; Fusaro, Vincent A; Tonellato, Peter J.; Wall, Dennis P

    2010-01-01

    Background Comparative genomics resources, such as ortholog detection tools and repositories are rapidly increasing in scale and complexity. Cloud computing is an emerging technological paradigm that enables researchers to dynamically build a dedicated virtual cluster and may represent a valuable alternative for large computational tools in bioinformatics. In the present manuscript, we optimize the computation of a large-scale comparative genomics resource—Roundup—using cloud computing, descr...

  18. The Princeton Protein Orthology Database (P-POD): A Comparative Genomics Analysis Tool for Biologists

    OpenAIRE

    Sven Heinicke; Livstone, Michael S.; Charles Lu; Rose Oughtred; Fan Kang; Angiuoli, Samuel V; Owen White; David Botstein; Kara Dolinski

    2007-01-01

    Many biological databases that provide comparative genomics information and tools are now available on the internet. While certainly quite useful, to our knowledge none of the existing databases combine results from multiple comparative genomics methods with manually curated information from the literature. Here we describe the Princeton Protein Orthology Database (P-POD, http://ortholog.princeton.edu), a user-friendly database system that allows users to find and visualize the phylogenetic r...

  19. Comparative Analysis of Fatty Acid Desaturases in Cyanobacterial Genomes

    Directory of Open Access Journals (Sweden)

    Xiaoyuan Chi

    2008-01-01

    Full Text Available Fatty acid desaturases are enzymes that introduce double bonds into the hydrocarbon chains of fatty acids. The fatty acid desaturases from 37 cyanobacterial genomes were identified and classified based upon their conserved histidine-rich motifs and phylogenetic analysis, which help to determine the amounts and distributions of desaturases in cyanobacterial species. The filamentous or N2-fixing cyanobacteria usually possess more types of fatty acid desaturases than that of unicellular species. The pathway of acyl-lipid desaturation for unicellular marine cyanobacteria Synechococcus and Prochlorococcus differs from that of other cyanobacteria, indicating different phylogenetic histories of the two genera from other cyanobacteria isolated from freshwater, soil, or symbiont. Strain Gloeobacter violaceus PCC 7421 was isolated from calcareous rock and lacks thylakoid membranes. The types and amounts of desaturases of this strain are distinct to those of other cyanobacteria, reflecting the earliest divergence of it from the cyanobacterial line. Three thermophilic unicellular strains, Thermosynechococcus elongatus BP-1 and two Synechococcus Yellowstone species, lack highly unsaturated fatty acids in lipids and contain only one Δ9 desaturase in contrast with mesophilic strains, which is probably due to their thermic habitats. Thus, the amounts and types of fatty acid desaturases are various among different cyanobacterial species, which may result from the adaption to environments in evolution.

  20. Comparative genome analysis and resistance gene mapping in grain legumes

    International Nuclear Information System (INIS)

    Using, DNA markers and genome organization, several important disease resistance genes have been analyzed in mungbean (Vigna radiata), cowpea (Vigna unguiculata), common bean (Phaseolus vulgaris), and soybean (Glycine max). In the process, medium-density linkage maps consisting of restriction fragment length polymorphism (RFLP) markers were constructed for both mungbean and cowpea. Comparisons between these maps, as well as the maps of soybean and common bean, indicate that there is significant conservation of DNA marker order, though the conserved blocks in soybean are much shorter than in the others. DNA mapping results also indicate that a gene for seed weight may be conserved between mungbean and cowpea. Using the linkage maps, genes that control bruchid (genus Callosobruchus) and powdery mildew (Erysiphe polygoni) resistance in mungbean, aphid resistance in cowpea (Aphis craccivora), and cyst nematode (Heterodera glycines) resistance in soybean have all been mapped and characterized. For some of these traits resistance was found to be oligogenic and DNA mapping uncovered multiple genes involved in the phenotype. (author)

  1. Investigating hookworm genomes by comparative analysis of two Ancylostoma species

    Directory of Open Access Journals (Sweden)

    Kapulkin Wadim

    2005-04-01

    Full Text Available Abstract Background Hookworms, infecting over one billion people, are the mostly closely related major human parasites to the model nematode Caenorhabditis elegans. Applying genomics techniques to these species, we analyzed 3,840 and 3,149 genes from Ancylostoma caninum and A. ceylanicum. Results Transcripts originated from libraries representing infective L3 larva, stimulated L3, arrested L3, and adults. Most genes are represented in single stages including abundant transcripts like hsp-20 in infective L3 and vit-3 in adults. Over 80% of the genes have homologs in C. elegans, and nearly 30% of these were with observable RNA interference phenotypes. Homologies were identified to nematode-specific and clade V specific gene families. To study the evolution of hookworm genes, 574 A. caninum / A. ceylanicum orthologs were identified, all of which were found to be under purifying selection with distribution ratios of nonsynonymous to synonymous amino acid substitutions similar to that reported for C. elegans / C. briggsae orthologs. The phylogenetic distance between A. caninum and A. ceylanicum is almost identical to that for C. elegans / C. briggsae. Conclusion The genes discovered should substantially accelerate research toward better understanding of the parasites' basic biology as well as new therapies including vaccines and novel anthelmintics.

  2. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    Science.gov (United States)

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger; Madsen, Lone; Espejo, Romilio

    2016-01-01

    Flavobacterium psychrophilum is a fish pathogen in salmonid aquaculture worldwide that causes cold water disease (CWD) and rainbow trout fry syndrome (RTFS). Comparative genome analyses of 11 F. psychrophilum isolates representing temporally and geographically distant populations were used to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences were complemented with selected phenotypic characteristics of the strains. The pan genome analysis showed that F. psychrophilum could hold at least 3373 genes, while the core genome contained 1743 genes. On average, 67 new genes were detected for every new genome added to the analysis, indicating that F. psychrophilum possesses an open pan genome. The putative virulence factors were equally distributed among isolates, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found which corresponded to the previously described prophage 6H, and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which only matched one sequence in the database, the temperate bacteriophage 6H. Genomic Islands (GIs) were identified in F. psychrophilum isolates 950106-1/1 and CSF 259–93, associated with toxins and antibiotic resistance. Finally, phenotypic characterization revealed a high degree of similarity among the strains with respect to biofilm formation and secretion of extracellular enzymes. Global scale dispersion of virulence factors in the genomes and the abilities for biofilm formation, hemolytic activity and secretion of extracellular enzymes among the strains suggested that F. psychrophilum isolates have a similar mode of action on adhesion, colonization and destruction of fish tissues across large spatial and temporal scales of occurrence. Overall, the genomic characterization and

  3. An orphan gyrB in the Mycobacterium smegmatis genome uncovered by comparative genomics

    Indian Academy of Sciences (India)

    P. Jain; V. Nagaraja

    2002-11-01

    DNA gyrase is an essential topoisomerase found in all bacteria. It is encoded by gyrB and gyrA genes. These genes are organized differently in different bacteria. Direct comparison of Mycobacterium tuberculosis and Mycobacterium smegmatis genomes reveals presence of an additional gyrB in M. smegmatis flanked by novel genes. Analysis of the amino acid sequence of GyrB from different organisms suggests that the orphan GyrB in M. smegmatis may have an important cellular role.

  4. New Markov Model Approaches to Deciphering Microbial Genome Function and Evolution: Comparative Genomics of Laterally Transferred Genes

    Energy Technology Data Exchange (ETDEWEB)

    Borodovsky, M.

    2013-04-11

    Algorithmic methods for gene prediction have been developed and successfully applied to many different prokaryotic genome sequences. As the set of genes in a particular genome is not homogeneous with respect to DNA sequence composition features, the GeneMark.hmm program utilizes two Markov models representing distinct classes of protein coding genes denoted "typical" and "atypical". Atypical genes are those whose DNA features deviate significantly from those classified as typical and they represent approximately 10% of any given genome. In addition to the inherent interest of more accurately predicting genes, the atypical status of these genes may also reflect their separate evolutionary ancestry from other genes in that genome. We hypothesize that atypical genes are largely comprised of those genes that have been relatively recently acquired through lateral gene transfer (LGT). If so, what fraction of atypical genes are such bona fide LGTs? We have made atypical gene predictions for all fully completed prokaryotic genomes; we have been able to compare these results to other "surrogate" methods of LGT prediction.

  5. Comparative genomics reveals convergent rates of evolution in ant–plant mutualisms

    Science.gov (United States)

    Rubin, Benjamin E. R.; Moreau, Corrie S.

    2016-01-01

    Symbiosis—the close and often long-term interaction of species—is predicted to drive genome evolution in a variety of ways. For example, parasitic interactions have been shown to increase rates of molecular evolution, a trend generally attributed to the Red Queen Hypothesis. However, it is much less clear how mutualisms impact the genome, as both increased and reduced rates of change have been predicted. Here we sequence the genomes of seven species of ants, three that have convergently evolved obligate plant–ant mutualism and four closely related species of non-mutualists. Comparing these sequences, we investigate how genome evolution is shaped by mutualistic behaviour. We find that rates of molecular evolution are higher in the mutualists genome wide, a characteristic apparently not the result of demography. Our results suggest that the intimate relationships of obligate mutualists may lead to selective pressures similar to those seen in parasites, thereby increasing rates of evolution. PMID:27557866

  6. A comparative genome-wide study of ncRNAs in trypanosomatids

    Directory of Open Access Journals (Sweden)

    Wachtel Chaim

    2010-11-01

    Full Text Available Abstract Background Recent studies have provided extensive evidence for multitudes of non-coding RNA (ncRNA transcripts in a wide range of eukaryotic genomes. ncRNAs are emerging as key players in multiple layers of cellular regulation. With the availability of many whole genome sequences, comparative analysis has become a powerful tool to identify ncRNA molecules. In this study, we performed a systematic genome-wide in silico screen to search for novel small ncRNAs in the genome of Trypanosoma brucei using techniques of comparative genomics. Results In this study, we identified by comparative genomics, and validated by experimental analysis several novel ncRNAs that are conserved across multiple trypanosomatid genomes. When tested on known ncRNAs, our procedure was capable of finding almost half of the known repertoire through homology over six genomes, and about two-thirds of the known sequences were found in at least four genomes. After filtering, 72 conserved unannotated sequences in at least four genomes were found, 29 of which, ranging in size from 30 to 392 nts, were conserved in all six genomes. Fifty of the 72 candidates in the final set were chosen for experimental validation. Eighteen of the 50 (36% were shown to be expressed, and for 11 of them a distinct expression product was detected, suggesting that they are short ncRNAs. Using functional experimental assays, five of the candidates were shown to be novel H/ACA and C/D snoRNAs; these included three sequences that appear as singletons in the genome, unlike previously identified snoRNA molecules that are found in clusters. The other candidates appear to be novel ncRNA molecules, and their function is, as yet, unknown. Conclusions Using comparative genomic techniques, we predicted 72 sequences as ncRNA candidates in T. brucei. The expression of 50 candidates was tested in laboratory experiments. This resulted in the discovery of 11 novel short ncRNAs in procyclic stage T. brucei

  7. Comparative genomic analysis as a tool for biologicaldiscovery

    Energy Technology Data Exchange (ETDEWEB)

    Nobrega, Marcelo A.; Pennacchio, Len A.

    2003-03-30

    Biology is a discipline rooted in comparisons. Comparative physiology has assembled a detailed catalogue of the biological similarities and differences between species, revealing insights into how life has adapted to fill a wide-range of environmental niches. For example, the oxygen and carbon dioxide carrying capacity of vertebrate has evolved to provide strong advantages for species respiring at sea level, at high elevation or within water. Comparative- anatomy, -biochemistry, -pharmacology, -immunology and -cell biology have provided the fundamental paradigms from which each discipline has grown.

  8. Complete genome sequence and comparative genome analysis of a new special Yersinia enterocolitica.

    Science.gov (United States)

    Shi, Guoxiang; Su, Mingming; Liang, Junrong; Duan, Ran; Gu, Wenpeng; Xiao, Yuchun; Zhang, Zhewen; Qiu, Haiyan; Zhang, Zheng; Li, Yi; Zhang, Xiaohe; Ling, Yunchao; Song, Lai; Chen, Meili; Zhao, Yongbing; Wu, Jiayan; Jing, Huaiqi; Xiao, Jingfa; Wang, Xin

    2016-09-01

    Yersinia enterocolitica is the most diverse species among the Yersinia genera and shows more polymorphism, especially for the non-pathogenic strains. Individual non-pathogenic Y. enterocolitica strains are wrongly identified because of atypical phenotypes. In this study, we isolated an unusual Y. enterocolitica strain LC20 from Rattus norvegicus. The strain did not utilize urea and could not be classified as the biotype. API 20E identified Escherichia coli; however, it grew well at 25 °C, but E. coli grew well at 37 °C. We analyzed the genome of LC20 and found the whole chromosome of LC20 was collinear with Y. enterocolitica 8081, and the urease gene did not exist on the genome which is consistent with the result of API 20E. Also, the 16 S and 23 SrRNA gene of LC20 lay on a branch of Y. enterocolitica. Furthermore, the core-based and pan-based phylogenetic trees showed that LC20 was classified into the Y. enterocolitica cluster. Two plasmids (80 and 50 k) from LC20 shared low genetic homology with pYV from the Yersinia genus, one was an ancestral Yersinia plasmid and the other was novel encoding a number of transposases. Some pathogenic and non-pathogenic Y. enterocolitica-specific genes coexisted in LC20. Thus, although it could not be classified into any Y. enterocolitica biotype due to its special biochemical metabolism, we concluded the LC20 was a Y. enterocolitica strain because its genome was similar to other Y. enterocolitica and it might be a strain with many mutations and combinations emerging in the processes of its evolution. PMID:27129539

  9. Whole genome sequencing in drug discovery research: a one fits all solution?

    OpenAIRE

    Marc Sultan

    2015-01-01

    With the recent availability of Illumina's HiSeq X ten sequencing platform, the cost of whole genome sequencing (WGS) has dropped to nearly $1,000 per genome. The affordability of WGS has now the potential of replacing other genotyping platforms such as whole exome sequencing (WES) and array based genotyping for (smaller) clinical study cohorts. In a recent pilot study, we compared the performance and genotyping quality of the HiSeq X WGS approach against WES and array based genotyping with r...

  10. Comparative genomics of two 'Candidatus Accumulibacter' clades performing biological phosphorus removal.

    Science.gov (United States)

    Flowers, Jason J; He, Shaomei; Malfatti, Stephanie; del Rio, Tijana Glavina; Tringe, Susannah G; Hugenholtz, Philip; McMahon, Katherine D

    2013-12-01

    Members of the genus Candidatus Accumulibacter are important in many wastewater treatment systems performing enhanced biological phosphorus removal (EBPR). The Accumulibacter lineage can be subdivided phylogenetically into multiple clades, and previous work showed that these clades are ecologically distinct. The complete genome of Candidatus Accumulibacter phosphatis strain UW-1, a member of Clade IIA, was previously sequenced. Here, we report a draft genome sequence of Candidatus Accumulibacter spp. strain UW-2, a member of Clade IA, assembled following shotgun metagenomic sequencing of laboratory-scale bioreactor sludge. We estimate the genome to be 80-90% complete. Although the two clades share 16S rRNA sequence identity of >98.0%, we observed a remarkable lack of synteny between the two genomes. We identified 2317 genes shared between the two genomes, with an average nucleotide identity (ANI) of 78.3%, and accounting for 49% of genes in the UW-1 genome. Unlike UW-1, the UW-2 genome seemed to lack genes for nitrogen fixation and carbon fixation. Despite these differences, metabolic genes essential for denitrification and EBPR, including carbon storage polymer and polyphosphate metabolism, were conserved in both genomes. The ANI from genes associated with EBPR was statistically higher than that from genes not associated with EBPR, indicating a high selective pressure in EBPR systems. Further, we identified genomic islands of foreign origins including a near-complete lysogenic phage in the Clade IA genome. Interestingly, Clade IA appeared to be more phage susceptible based on it containing only a single Clustered Regularly Interspaced Short Palindromic Repeats locus as compared with the two found in Clade IIA. Overall, the comparative analysis provided a genetic basis to understand physiological differences and ecological niches of Accumulibacter populations, and highlights the importance of diversity in maintaining system functional resilience. PMID:23887171

  11. Genome-based comparative analyses of Antarctic and temperate species of Paenibacillus.

    Directory of Open Access Journals (Sweden)

    Melissa Dsouza

    Full Text Available Antarctic soils represent a unique environment characterised by extremes of temperature, salinity, elevated UV radiation, low nutrient and low water content. Despite the harshness of this environment, members of 15 bacterial phyla have been identified in soils of the Ross Sea Region (RSR. However, the survival mechanisms and ecological roles of these phyla are largely unknown. The aim of this study was to investigate whether strains of Paenibacillus darwinianus owe their resilience to substantial genomic changes. For this, genome-based comparative analyses were performed on three P. darwinianus strains, isolated from gamma-irradiated RSR soils, together with nine temperate, soil-dwelling Paenibacillus spp. The genome of each strain was sequenced to over 1,000-fold coverage, then assembled into contigs totalling approximately 3 Mbp per genome. Based on the occurrence of essential, single-copy genes, genome completeness was estimated at approximately 88%. Genome analysis revealed between 3,043-3,091 protein-coding sequences (CDSs, primarily associated with two-component systems, sigma factors, transporters, sporulation and genes induced by cold-shock, oxidative and osmotic stresses. These comparative analyses provide an insight into the metabolic potential of P. darwinianus, revealing potential adaptive mechanisms for survival in Antarctic soils. However, a large proportion of these mechanisms were also identified in temperate Paenibacillus spp., suggesting that these mechanisms are beneficial for growth and survival in a range of soil environments. These analyses have also revealed that the P. darwinianus genomes contain significantly fewer CDSs and have a lower paralogous content. Notwithstanding the incompleteness of the assemblies, the large differences in genome sizes, determined by the number of genes in paralogous clusters and the CDS content, are indicative of genome content scaling. Finally, these sequences are a resource for further

  12. Comparative genomic analysis of the Tribolium immune system

    Science.gov (United States)

    The red flour beetle Tribolium castaneum has contributed a wealth of knowledge on insect development but limited information about innate immunity. With its complete nucleotide sequence determined, we have taken the opportunity to annotate immunity-related genes and compare them with homologous mole...

  13. Comparative genomics of an endophytic Pseudomonas putida isolated from mango orchard

    Science.gov (United States)

    Asif, Huma; Studholme, David J.; Khan, Asifullah; Aurongzeb, M.; Khan, Ishtiaq A.; Azim, M. Kamran

    2016-01-01

    Abstract We analyzed the genome sequence of an endophytic bacterial strain Pseudomonas putida TJI51 isolated from mango bark tissues. Next generation DNA sequencing and short read de novo assembly generated the 5,805,096 bp draft genome of P. putida TJI51. Out of 6,036 protein coding genes in P. putida TJI51 sequences, 4,367 (72%) were annotated with functional specifications, while the remaining encoded hypothetical proteins. Comparative genome sequence analysis revealed that the P. putida TJI51genome contains several regions, not identified in so far sequenced P. putida genomes. Some of these regions were predicted to encode enzymes, including acetylornithine deacetylase, betaine aldehyde dehydrogenase, aldehyde dehydrogenase, benzoylformate decarboxylase, hydroxyacylglutathione hydrolase, and uroporphyrinogen decarboxylase. The genome of P. putida TJI51 contained three nonribosomal peptide synthetase gene clusters. Genome sequence analysis of P. putidaTJI51 identified this bacterium as an endophytic resident. The endophytic fitness might be linked with alginate, which facilitates bacterial colonization in plant tissues. Genome sequence analysis shed light on the presence of a diverse spectrum of metabolic activities and adaptation of this isolate to various niches. PMID:27560648

  14. Comparative genomic assessment of Multi-Locus Sequence Typing: rapid accumulation of genomic heterogeneity among clonal isolates of Campylobacter jejuni

    Directory of Open Access Journals (Sweden)

    Nash John HE

    2008-08-01

    Full Text Available Abstract Background Multi-Locus Sequence Typing (MLST has emerged as a leading molecular typing method owing to its high ability to discriminate among bacterial isolates, the relative ease with which data acquisition and analysis can be standardized, and the high portability of the resulting sequence data. While MLST has been successfully applied to the study of the population structure for a number of different bacterial species, it has also provided compelling evidence for high rates of recombination in some species. We have analyzed a set of Campylobacter jejuni strains using MLST and Comparative Genomic Hybridization (CGH on a full-genome microarray in order to determine whether recombination and high levels of genomic mosaicism adversely affect the inference of strain relationships based on the analysis of a restricted number of genetic loci. Results Our results indicate that, in general, there is significant concordance between strain relationships established by MLST and those based on shared gene content as established by CGH. While MLST has significant predictive power with respect to overall genome similarity of isolates, we also found evidence for significant differences in genomic content among strains that would otherwise appear to be highly related based on their MLST profiles. Conclusion The extensive genomic mosaicism between closely related strains has important implications in the context of establishing strain to strain relationships because it suggests that the exact gene content of strains, and by extension their phenotype, is less likely to be "predicted" based on a small number of typing loci. This in turn suggests that a greater emphasis should be placed on analyzing genes of clinical interest as we forge ahead with the next generation of molecular typing methods.

  15. Comparative genome analysis of Spiroplasma melliferum IPMB4A, a honeybee-associated bacterium

    Directory of Open Access Journals (Sweden)

    Lo Wen-Sui

    2013-01-01

    Full Text Available Abstract Background The genus Spiroplasma contains a group of helical, motile, and wall-less bacteria in the class Mollicutes. Similar to other members of this class, such as the animal-pathogenic Mycoplasma and the plant-pathogenic ‘Candidatus Phytoplasma’, all characterized Spiroplasma species were found to be associated with eukaryotic hosts. While most of the Spiroplasma species appeared to be harmless commensals of insects, a small number of species have evolved pathogenicity toward various arthropods and plants. In this study, we isolated a novel strain of honeybee-associated S. melliferum and investigated its genetic composition and evolutionary history by whole-genome shotgun sequencing and comparative analysis with other Mollicutes genomes. Results The whole-genome shotgun sequencing of S. melliferum IPMB4A produced a draft assembly that was ~1.1 Mb in size and covered ~80% of the chromosome. Similar to other Spiroplasma genomes that have been studied to date, we found that this genome contains abundant repetitive sequences that originated from plectrovirus insertions. These phage fragments represented a major obstacle in obtaining a complete genome sequence of Spiroplasma with the current sequencing technology. Comparative analysis of S. melliferum IPMB4A with other Spiroplasma genomes revealed that these phages may have facilitated extensive genome rearrangements in these bacteria and contributed to horizontal gene transfers that led to species-specific adaptation to different eukaryotic hosts. In addition, comparison of gene content with other Mollicutes suggested that the common ancestor of the SEM (Spiroplasma, Entomoplasma, and Mycoplasma clade may have had a relatively large genome and flexible metabolic capacity; the extremely reduced genomes of present day Mycoplasma and ‘Candidatus Phytoplasma’ species are likely to be the result of independent gene losses in these lineages. Conclusions The findings in this study

  16. Comparative assessment of genetic diversity in cytoplasmic and nuclear genome of upland cotton.

    Science.gov (United States)

    Egamberdiev, Sharof S; Saha, Sukumar; Salakhutdinov, Ilkhom; Jenkins, Johnie N; Deng, Dewayne; Y Abdurakhmonov, Ibrokhim

    2016-06-01

    The importance of the cytoplasmic genome for many economically important traits is well documented in several crop species, including cotton. There is no report on application of cotton chloroplast specific SSR markers as a diagnostic tool to study genetic diversity among improved Upland cotton lines. The complete plastome sequence information in GenBank provided us an opportunity to report on 17 chloroplast specific SSR markers using a cost-effective data mining strategy. Here we report the comparative analysis of genetic diversity among a set of 42 improved Upland cotton lines using SSR markers specific to chloroplast and nuclear genome, respectively. Our results revealed that low to moderate level of genetic diversity existed in both nuclear and cytoplasm genome among this set of cotton lines. However, the specific estimation suggested that genetic diversity is lower in cytoplasmic genome compared to the nuclear genome among this set of Upland cotton lines. In summary, this research is important from several perspectives. We detected a set of cytoplasm genome specific SSR primer pairs by using a cost-effective data mining strategy. We reported for the first time the genetic diversity in the cytoplasmic genome within a set of improved Upland cotton accessions. Results revealed that the genetic diversity in cytoplasmic genome is narrow, compared to the nuclear genome within this set of Upland cotton accessions. Our results suggested that most of these polymorphic chloroplast SSRs would be a valuable complementary tool in addition to the nuclear SSR in the study of evolution, gene flow and genetic diversity in Upland cotton. PMID:27155886

  17. Mitome: dynamic and interactive database for comparative mitochondrial genomics in metazoan animals.

    Science.gov (United States)

    Lee, Yong Seok; Oh, Jeongsu; Kim, Young Uk; Kim, Namchul; Yang, Sungjin; Hwang, Ui Wook

    2008-01-01

    Mitome is a specialized mitochondrial genome database designed for easy comparative analysis of various features of metazoan mitochondrial genomes such as base frequency, A+T skew, codon usage and gene arrangement pattern. A particular function of the database is the automatic reconstruction of phylogenetic relationships among metazoans selected by a user from a taxonomic tree menu based on nucleotide sequences, amino acid sequences or gene arrangement patterns. Mitome also enables us (i) to easily find the taxonomic positions of organisms of which complete mitochondrial genome sequences are publicly available; (ii) to acquire various metazoan mitochondrial genome characteristics through a graphical genome browser; (iii) to search for homology patterns in mitochondrial gene arrangements; (iv) to download nucleotide or amino acid sequences not only of an entire mitochondrial genome but also of each component; and (v) to find interesting references easily through links with PubMed. In order to provide users with a dynamic, responsive, interactive and faster web database, Mitome is constructed using two recently highlighted techniques, Ajax (Asynchronous JavaScript and XML) and Web Services. Mitome has the potential to become very useful in the fields of molecular phylogenetics and evolution and comparative organelle genomics. The database is available at: http://www.mitome.info. PMID:17940090

  18. Comparing De Novo Genome Assembly: The Long and Short of It

    OpenAIRE

    Narzisi, Giuseppe; Mishra, Bud

    2011-01-01

    Recent advances in DNA sequencing technology and their focal role in Genome Wide Association Studies (GWAS) have rekindled a growing interest in the whole-genome sequence assembly (WGSA) problem, thereby, inundating the field with a plethora of new formalizations, algorithms, heuristics and implementations. And yet, scant attention has been paid to comparative assessments of these assemblers' quality and accuracy. No commonly accepted and standardized method for comparison exists yet. Even wo...

  19. Comparative sequencing provides insights about the structure and conservation of marsupial and monotreme genomes

    OpenAIRE

    Margulies, Elliott H.; Maduro, Valerie V.B.; Thomas, Pamela J.; Tomkins, Jeffery P.; Amemiya, Chris T.; Luo, Meizhong; Green, Eric D

    2005-01-01

    Sequencing and comparative analyses of genomes from multiple vertebrates are providing insights about the genetic basis for biological diversity. To date, these efforts largely have focused on eutherian mammals, chicken, and fish. In this article, we describe the generation and study of genomic sequences from noneutherian mammals, a group of species occupying unusual phylogenetic positions. A large sequence data set (totaling >5 Mb) was generated for the same orthologous region in three marsu...

  20. Comparative genomics of the bacteria Dickeya solani and Pectobacterium wasabiae,emerging pathogens of Solanum tuberosum

    OpenAIRE

    Khayi, Slimane

    2015-01-01

    The pectolytic bacteria Pectobacterium and Dickeya species cause important diseases on Solanum tuberosum and other arable and horticultural crops. These bacteria are responsible for blackleg in the field and tuber soft rots in storage and in transit as well as in the field worldwide. The main objectives of this thesis are: 1) To study the diversity of a D. solani population using comparative genomics approaches in order to understand the genomic structure and evolution of this emerging specie...

  1. Comparative genomics of Pseudomonas fluorescens subclade III strains from human lungs

    OpenAIRE

    Brittan S Scales; Erb-Downward, John R.; Huffnagle, Ian M.; LiPuma, John J.; Huffnagle, Gary B.

    2015-01-01

    Background While the taxonomy and genomics of environmental strains from the P. fluorescens species-complex has been reported, little is known about P. fluorescens strains from clinical samples. In this report, we provide the first genomic analysis of P. fluorescens strains in which human vs. environmental isolates are compared. Results Seven P. fluorescens strains were isolated from respiratory samples from cystic fibrosis (CF) patients. The clinical strains could grow at a higher temperatur...

  2. Comparative genomic and transcriptional analyses of CRISPR systems across the genus Pyrobaculum

    OpenAIRE

    Bernick, David L.; Cox, Courtney L.; Dennis, Patrick P.; Lowe, Todd M.

    2012-01-01

    Within the domain Archaea, the CRISPR immune system appears to be nearly ubiquitous based on computational genome analyses. Initial studies in bacteria demonstrated that the CRISPR system targets invading plasmid and viral DNA. Recent experiments in the model archaeon Pyrococcus furiosus have uncovered a novel RNA-targeting variant of the CRISPR system. Because our understanding of CRISPR system evolution in other archaea is limited, we have taken a comparative genomic and transcriptomic view...

  3. Comparative Genomic and Transcriptional Analyses of CRISPR Systems Across the Genus Pyrobaculum

    OpenAIRE

    Bernick, David L.; Cox, Courtney L.; Dennis, Patrick P.; Lowe, Todd M.

    2012-01-01

    Within the domain Archaea, the CRISPR immune system appears to be nearly ubiquitous based on computational genome analyses. Initial studies in bacteria demonstrated that the CRISPR system targets invading plasmid and viral DNA. Recent experiments in the model archaeon Pyrococcus furiosus uncovered a novel RNA-targeting variant of the CRISPR system potentially unique to archaea. Because our understanding of CRISPR system evolution in other archaea is limited, we have taken a comparative genom...

  4. Comparative genomic analysis of four representative plant growth-promoting rhizobacteria in Pseudomonas

    OpenAIRE

    Shen, Xuemei; Hu, Hongbo; Peng, Huasong; Wang, Wei; Zhang, Xuehong

    2013-01-01

    Background Some Pseudomonas strains function as predominant plant growth-promoting rhizobacteria (PGPR). Within this group, Pseudomonas chlororaphis and Pseudomonas fluorescens are non-pathogenic biocontrol agents, and some Pseudomonas aeruginosa and Pseudomonas stutzeri strains are PGPR. P. chlororaphis GP72 is a plant growth-promoting rhizobacterium with a fully sequenced genome. We conducted a genomic analysis comparing GP72 with three other pseudomonad PGPR: P. fluorescens Pf-5, P. aerugi...

  5. Identification of Ciliary and Ciliopathy Genes in Caenorhabditis Elegans through Comparative Genomics

    OpenAIRE

    Chen, Nansheng; Mah, Allan; Oliver E Blacque; Chu, Jeffrey; Phgora, Kiran; Bakhoum, Mathieu W.; Newbury, C. Rebecca Hunt; Khattra, Jaswinder; Chan, Susanna; Efimenko, Evgheni; Johnsen, Robert; Phirke, Prasad; Swoboda, Peter; Marra, Marco; Moerman, Donald

    2006-01-01

    Background The recent availability of genome sequences of multiple related Caenorhabditis species has made it possible to identify, using comparative genomics, similarly transcribed genes in Caenorhabditis elegans and its sister species. Taking this approach, we have identified numerous novel ciliary genes in C. elegans, some of which may be orthologs of unidentified human ciliopathy genes. Results By screening for genes possessing canonical X-box sequences in promoters of three Caenorhabditi...

  6. SALAD database: a motif-based database of protein annotations for plant comparative genomics

    OpenAIRE

    Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi

    2009-01-01

    Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209 529 protein-sequence annotation groups selected by BLASTP from the proteome data sets o...

  7. Comparative Genomics of Gardnerella vaginalis Strains Reveals Substantial Differences in Metabolic and Virulence Potential

    OpenAIRE

    Carl J Yeoman; Yildirim, Suleyman; Thomas, Susan M; Durkin, A. Scott; Torralba, Manolito; Sutton, Granger; Buhay, Christian J.; Ding, Yan; Dugan-Rocha, Shannon P.; Muzny, Donna M.; Qin, Xiang; Gibbs, Richard A; Leigh, Steven R.; Stumpf, Rebecca; White, Bryan A.

    2010-01-01

    Background Gardnerella vaginalis is described as a common vaginal bacterial species whose presence correlates strongly with bacterial vaginosis (BV). Here we report the genome sequencing and comparative analyses of three strains of G. vaginalis. Strains 317 (ATCC 14019) and 594 (ATCC 14018) were isolated from the vaginal tracts of women with symptomatic BV, while Strain 409-05 was isolated from a healthy, asymptomatic individual with a Nugent score of 9. Principal Findings Substantial genomic...

  8. Complete Chloroplast Genome Sequence of Omani Lime (Citrus aurantiifolia) and Comparative Analysis within the Rosids

    OpenAIRE

    Huei-Jiun Su; Hogenhout, Saskia A.; Al-Sadi, Abdullah M.; Chih-Horng Kuo

    2014-01-01

    The genus Citrus contains many economically important fruits that are grown worldwide for their high nutritional and medicinal value. Due to frequent hybridizations among species and cultivars, the exact number of natural species and the taxonomic relationships within this genus are unclear. To compare the differences between the Citrus chloroplast genomes and to develop useful genetic markers, we used a reference-assisted approach to assemble the complete chloroplast genome of Omani lime (C....

  9. Comparative genomic de-convolution of the cotton genome revealed a decaploid ancestor and widespread chromosomal fractionation.

    Science.gov (United States)

    Wang, Xiyin; Guo, Hui; Wang, Jinpeng; Lei, Tianyu; Liu, Tao; Wang, Zhenyi; Li, Yuxian; Lee, Tae-Ho; Li, Jingping; Tang, Haibao; Jin, Dianchuan; Paterson, Andrew H

    2016-02-01

    The 'apparently' simple genomes of many angiosperms mask complex evolutionary histories. The reference genome sequence for cotton (Gossypium spp.) revealed a ploidy change of a complexity unprecedented to date, indeed that could not be distinguished as to its exact dosage. Herein, by developing several comparative, computational and statistical approaches, we revealed a 5× multiplication in the cotton lineage of an ancestral genome common to cotton and cacao, and proposed evolutionary models to show how such a decaploid ancestor formed. The c. 70% gene loss necessary to bring the ancestral decaploid to its current gene count appears to fit an approximate geometrical model; that is, although many genes may be lost by single-gene deletion events, some may be lost in groups of consecutive genes. Gene loss following cotton decaploidy has largely just reduced gene copy numbers of some homologous groups. We designed a novel approach to deconvolute layers of chromosome homology, providing definitive information on gene orthology and paralogy across broad evolutionary distances, both of fundamental value and serving as an important platform to support further studies in and beyond cotton and genomics communities. PMID:26756535

  10. Characterization of genomic alterations in radiation-associated breast cancer among childhood cancer survivors, using comparative genomic hybridization (CGH arrays.

    Directory of Open Access Journals (Sweden)

    Xiaohong R Yang

    Full Text Available Ionizing radiation is an established risk factor for breast cancer. Epidemiologic studies of radiation-exposed cohorts have been primarily descriptive; molecular events responsible for the development of radiation-associated breast cancer have not been elucidated. In this study, we used array comparative genomic hybridization (array-CGH to characterize genome-wide copy number changes in breast tumors collected in the Childhood Cancer Survivor Study (CCSS. Array-CGH data were obtained from 32 cases who developed a second primary breast cancer following chest irradiation at early ages for the treatment of their first cancers, mostly Hodgkin lymphoma. The majority of these cases developed breast cancer before age 45 (91%, n = 29, had invasive ductal tumors (81%, n = 26, estrogen receptor (ER-positive staining (68%, n = 19 out of 28, and high proliferation as indicated by high Ki-67 staining (77%, n = 17 out of 22. Genomic regions with low-copy number gains and losses and high-level amplifications were similar to what has been reported in sporadic breast tumors, however, the frequency of amplifications of the 17q12 region containing human epidermal growth factor receptor 2 (HER2 was much higher among CCSS cases (38%, n = 12. Our findings suggest that second primary breast cancers in CCSS were enriched for an "amplifier" genomic subgroup with highly proliferative breast tumors. Future investigation in a larger irradiated cohort will be needed to confirm our findings.

  11. Computational Tools for Brassica–Arabidopsis Comparative Genomics

    Directory of Open Access Journals (Sweden)

    Martin Trick

    2006-04-01

    Full Text Available Recent advances, such as the availability of extensive genome survey sequence (GSS data and draft physical maps, are radically transforming the means by which we can dissect Brassica genome structure and systematically relate it to the Arabidopsis model. Hitherto, our view of the co-linearities between these closely related genomes had been largely inferred from comparative RFLP data, necessitating substantial interpolation and expert interpretation. Sequencing of the Brassica rapa genome by the Multinational Brassica Genome Project will, however, enable an entirely computational approach to this problem. Meanwhile we have been developing databases and bioinformatics tools to support our work in Brassica comparative genomics, including a recently completed draft physical map of B. rapa integrated with anchor probes derived from the Arabidopsis genome sequence. We are also exploring new ways to display the emerging Brassica–Arabidopsis sequence homology data. We have mapped all publicly available Brassica sequences in silico to the Arabidopsis TIGR v5 genome sequence and published this in the ATIDB database that uses Generic Genome Browser (GBrowse. This in silico approach potentially identifies all paralogous sequences and so we colour-code the significance of the mappings and offer an integrated, real-time multiple alignment tool to partition them into paralogous groups. The MySQL database driving GBrowse can also be directly interrogated, using the powerful API offered by the Perl Bio∷DB∷GFF methods, facilitating a wide range of data-mining possibilities.

  12. MultiMetEval : Comparative and Multi-Objective Analysis of Genome-Scale Metabolic Models

    NARCIS (Netherlands)

    Zakrzewski, Piotr; Medema, Marnix H.; Gevorgyan, Albert; Kierzek, Andrzej M.; Breitling, Rainer; Takano, Eriko; Fong, Stephen S.

    2012-01-01

    Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the co

  13. Comparative genomic hybridization analysis of benign and invasive male breast neoplasms

    DEFF Research Database (Denmark)

    Ojopi, Elida Paula Benquique; Cavalli, Luciane Regina; Cavalieri, Luciane Mara Bogline;

    2002-01-01

    Comparative genomic hybridization (CGH) analysis was performed for the identification of chromosomal imbalances in two benign gynecomastias and one malignant breast carcinoma derived from patients with male breast disease and compared with cytogenetic analysis in two of the three cases. CGH analy...

  14. Phage morphology recapitulates phylogeny: the comparative genomics of a new group of myoviruses.

    Directory of Open Access Journals (Sweden)

    André M Comeau

    Full Text Available Among dsDNA tailed bacteriophages (Caudovirales, members of the Myoviridae family have the most sophisticated virion design that includes a complex contractile tail structure. The Myoviridae generally have larger genomes than the other phage families. Relatively few "dwarf" myoviruses, those with a genome size of less than 50 kb such as those of the Mu group, have been analyzed in extenso. Here we report on the genome sequencing and morphological characterization of a new group of such phages that infect a diverse range of Proteobacteria, namely Aeromonas salmonicida phage 56, Vibrio cholerae phages 138 and CP-T1, Bdellovibrio phage φ1422, and Pectobacterium carotovorum phage ZF40. This group of dwarf myoviruses shares an identical virion morphology, characterized by usually short contractile tails, and have genome sizes of approximately 45 kb. Although their genome sequences are variable in their lysogeny, replication, and host adaption modules, presumably reflecting differing lifestyles and hosts, their structural and morphogenesis modules have been evolutionarily constrained by their virion morphology. Comparative genomic analysis reveals that these phages, along with related prophage genomes, form a new coherent group within the Myoviridae. The results presented in this communication support the hypothesis that the diversity of phages may be more structured than generally believed and that the innumerable phages in the biosphere all belong to discrete lineages or families.

  15. The Methanosarcina barkeri genome: comparative analysis withMethanosarcina acetivorans and Methanosarcina mazei reveals extensiverearrangement within methanosarcinal genomes

    Energy Technology Data Exchange (ETDEWEB)

    Maeder, Dennis L.; Anderson, Iain; Brettin, Thomas S.; Bruce,David C.; Gilna, Paul; Han, Cliff S.; Lapidus, Alla; Metcalf, William W.; Saunders, Elizabeth; Tapia, Roxanne; Sowers, Kevin R.

    2006-05-19

    We report here a comparative analysis of the genome sequence of Methanosarcina barkeri with those of Methanosarcina acetivorans and Methanosarcina mazei. All three genomes share a conserved double origin of replication and many gene clusters. M. barkeri is distinguished by having an organization that is well conserved with respect to the other Methanosarcinae in the region proximal to the origin of replication with interspecies gene similarities as high as 95%. However it is disordered and marked by increased transposase frequency and decreased gene synteny and gene density in the proximal semi-genome. Of the 3680 open reading frames in M. barkeri, 678 had paralogs with better than 80% similarity to both M. acetivorans and M. mazei while 128 nonhypothetical orfs were unique (non-paralogous) amongst these species including a complete formate dehydrogenase operon, two genes required for N-acetylmuramic acid synthesis, a 14 gene gas vesicle cluster and a bacterial P450-specific ferredoxin reductase cluster not previously observed or characterized in this genus. A cryptic 36 kbp plasmid sequence was detected in M. barkeri that contains an orc1 gene flanked by a presumptive origin of replication consisting of 38 tandem repeats of a 143 nt motif. Three-way comparison of these genomes reveals differing mechanisms for the accrual of changes. Elongation of the large M. acetivorans is the result of multiple gene-scale insertions and duplications uniformly distributed in that genome, while M. barkeri is characterized by localized inversions associated with the loss of gene content. In contrast, the relatively short M. mazei most closely approximates the ancestral organizational state.

  16. CpGislandEVO: A Database and Genome Browser for Comparative Evolutionary Genomics of CpG Islands

    Directory of Open Access Journals (Sweden)

    Guillermo Barturen

    2013-01-01

    Full Text Available Hypomethylated, CpG-rich DNA segments (CpG islands, CGIs are epigenome markers involved in key biological processes. Aberrant methylation is implicated in the appearance of several disorders as cancer, immunodeficiency, or centromere instability. Furthermore, methylation differences at promoter regions between human and chimpanzee strongly associate with genes involved in neurological/psychological disorders and cancers. Therefore, the evolutionary comparative analyses of CGIs can provide insights on the functional role of these epigenome markers in both health and disease. Given the lack of specific tools, we developed CpGislandEVO. Briefly, we first compile a database of statistically significant CGIs for the best assembled mammalian genome sequences available to date. Second, by means of a coupled browser front-end, we focus on the CGIs overlapping orthologous genes extracted from OrthoDB, thus ensuring the comparison between CGIs located on truly homologous genome segments. This allows comparing the main compositional features between homologous CGIs. Finally, to facilitate nucleotide comparisons, we lifted genome coordinates between assemblies from different species, which enables the analysis of sequence divergence by direct count of nucleotide substitutions and indels occurring between homologous CGIs. The resulting CpGislandEVO database, linking together CGIs and single-cytosine DNA methylation data from several mammalian species, is freely available at our website.

  17. Comparative genomic analyses identify the Vibrio harveyi genome sequenced strains BAA-1116 and HY01 as Vibrio campbellii

    Science.gov (United States)

    Lin, Baochuan; Wang, Zheng; Malanoski, Anthony P; O'Grady, Elizabeth A; Wimpee, Charles F; Vuddhakul, Varaporn; Alves Jr, Nelson; Thompson, Fabiano L; Gomez-Gil, Bruno; Vora, Gary J

    2010-01-01

    Three notable members of the Harveyi clade, Vibrio harveyi, Vibrio alginolyticus and Vibrio parahaemolyticus, are best known as marine pathogens of commercial and medical import. In spite of this fact, the discrimination of Harveyi clade members remains difficult due to genetic and phenotypic similarities, and this has led to misidentifications and inaccurate estimations of a species' involvement in certain environments. To begin to understand the underlying genetics that complicate species level discrimination, we compared the genomes of Harveyi clade members isolated from different environments (seawater, shrimp, corals, oysters, finfish, humans) using microarray-based comparative genomic hybridization (CGH) and multilocus sequence analyses (MLSA). Surprisingly, we found that the only two V. harveyi strains that have had their genomes sequenced (strains BAA-1116 and HY01) have themselves been misidentified. Instead of belonging to the species harveyi, they are actually members of the species campbellii. In total, 28% of the strains tested were found to be misidentified and 42% of these appear to comprise a novel species. Taken together, our findings correct a number of species misidentifications while validating the ability of both CGH and MLSA to distinguish closely related members of the Harveyi clade. PMID:20686623

  18. Understanding the direction of evolution in Burkholderia glumae through comparative genomics.

    Science.gov (United States)

    Lee, Hyun-Hee; Park, Jungwook; Kim, Jinnyun; Park, Inmyoung; Seo, Young-Su

    2016-02-01

    Members of the genus Burkholderia occupy remarkably diverse niches, with genome sizes ranging from ~3.75 to 11.29 Mbp. The genome of Burkholderia glumae ranges in size from ~5.81 to 7.89 Mbp. Unlike other plant pathogenic bacteria, B. glumae can infect a wide range of monocot and dicot plants. Comparative genome analysis of B. glumae strains can provide insight into genome variation as well as differential features of whole metabolism or pathways between multiple strains of B. glumae infecting the same host. Comparative analysis of complete genomes among B. glumae BGR1, B. glumae LMG 2196, and B. glumae PG1 revealed the largest departmentalization of genes onto separate replicons in B. glumae BGR1 and considerable downsizing of the genome in B. glumae LMG 2196. In addition, the presence of large-scale evolutionary events such as rearrangement and inversion and the development of highly specialized systems were found to be related to virulence-associated features in the three B. glumae strains. This connection may explain why this bacterium broadens its host range and reinforces its interaction with hosts. PMID:26454852

  19. Coding exon-structure aware realigner (CESAR) utilizes genome alignments for accurate comparative gene annotation.

    Science.gov (United States)

    Sharma, Virag; Elghafari, Anas; Hiller, Michael

    2016-06-20

    Identifying coding genes is an essential step in genome annotation. Here, we utilize existing whole genome alignments to detect conserved coding exons and then map gene annotations from one genome to many aligned genomes. We show that genome alignments contain thousands of spurious frameshifts and splice site mutations in exons that are truly conserved. To overcome these limitations, we have developed CESAR (Coding Exon-Structure Aware Realigner) that realigns coding exons, while considering reading frame and splice sites of each exon. CESAR effectively avoids spurious frameshifts in conserved genes and detects 91% of shifted splice sites. This results in the identification of thousands of additional conserved exons and 99% of the exons that lack inactivating mutations match real exons. Finally, to demonstrate the potential of using CESAR for comparative gene annotation, we applied it to 188 788 exons of 19 865 human genes to annotate human genes in 99 other vertebrates. These comparative gene annotations are available as a resource (http://bds.mpi-cbg.de/hillerlab/CESAR/). CESAR (https://github.com/hillerlab/CESAR/) can readily be applied to other alignments to accurately annotate coding genes in many other vertebrate and invertebrate genomes. PMID:27016733

  20. Psittacid Herpesvirus 1 and Infectious Laryngotracheitis Virus: Comparative Genome Sequence Analysis of Two Avian Alphaherpesviruses

    Science.gov (United States)

    Thureen, Dean R.; Keeler, Calvin L.

    2006-01-01

    Psittacid herpesvirus 1 (PsHV-1) is the causative agent of Pacheco's disease, an acute, highly contagious, and potentially lethal respiratory herpesvirus infection in psittacine birds, while infectious laryngotracheitis virus (ILTV) is a highly contagious and economically significant avian herpesvirus which is responsible for an acute respiratory disease limited to galliform birds. The complete genome sequence of PsHV-1 has been determined and compared to the ILTV sequence, assembled from published data. The PsHV-1 and ILTV genomes exhibit similar structural characteristics and are 163,025 bp and 148,665 bp in length, respectively. The PsHV-1 genome contains 73 predicted open reading frames (ORFs), while the ILTV genome contains 77 predicted ORFs. Both genomes contain an inversion in the unique long region similar to that observed in pseudorabies virus. PsHV-1 is closely related to ILTV, and it is proposed that it be assigned to the Iltovirus genus. These two avian herpesviruses represent a phylogenetically unique clade of alphaherpesviruses that are distinct from the Marek's disease-like viruses (Mardivirus). The determination of the complete genomic nucleotide sequences of PsHV-1 and ILTV provides a tool for further comparative and functional analysis of this unique class of avian alphaherpesviruses. PMID:16873243

  1. Comparative Genomics of Aeschynomene Symbionts: Insights into the Ecological Lifestyle of Nod-Independent Photosynthetic Bradyrhizobia

    Science.gov (United States)

    Mornico, Damien; Miché, Lucie; Béna, Gilles; Nouwen, Nico; Verméglio, André; Vallenet, David; Smith, Alexander A.T.; Giraud, Eric; Médigue, Claudine; Moulin, Lionel

    2011-01-01

    Tropical aquatic species of the legume genus Aeschynomene are stem- and root-nodulated by bradyrhizobia strains that exhibit atypical features such as photosynthetic capacities or the use of a nod gene-dependent (ND) or a nod gene-independent (NI) pathway to enter into symbiosis with legumes. In this study we used a comparative genomics approach on nine Aeschynomene symbionts representative of their phylogenetic diversity. We produced draft genomes of bradyrhizobial strains representing different phenotypes: five NI photosynthetic strains (STM3809, ORS375, STM3847, STM4509 and STM4523) in addition to the previously sequenced ORS278 and BTAi1 genomes, one photosynthetic strain ORS285 hosting both ND and NI symbiotic systems, and one NI non-photosynthetic strain (STM3843). Comparative genomics allowed us to infer the core, pan and dispensable genomes of Aeschynomene bradyrhizobia, and to detect specific genes and their location in Genomic Islands (GI). Specific gene sets linked to photosynthetic and NI/ND abilities were identified, and are currently being studied in functional analyses. PMID:24704842

  2. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and thielavia terrestris

    Energy Technology Data Exchange (ETDEWEB)

    Berka, Randy; Grigoriev, Igor V.; Otillar, Robert P.; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; john, tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M.; Lombard, Vincent; Natvig, Donald O.; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P.; Allijn, Iris E.; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J.; Paulsen, Ian T.; Elbourne, Liam D. H.; Baker, Scott E.; Magnuson, Jon K.; LaBoissiere, Sylvie; Martinez, Diego; Wogulis, Mark; Lopez de Leon, Alfredo; Rey, Michael; Tsang, Adrian

    2011-10-02

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  3. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

    Energy Technology Data Exchange (ETDEWEB)

    Berka, Randy M.; Grigoriev, Igor V.; Otillar, Robert; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M.; Lombard, Vincent; Natvig, Donald O.; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P.; Allijn, Iris E.; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J.; Paulsen, Ian T.; Elbourne, Liam D. H.; Baker, Scott. E.; Magnuson, Jon; LaBoissiere, Sylvie; Clutterbuck, A. John; Martinez, Diego; Wogulis, Mark; Lopez de Leon, Alfredo; Rey, Michael W.; Tsang, Adrian

    2011-05-16

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  4. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India

    Directory of Open Access Journals (Sweden)

    Sarwar Azam

    2016-01-01

    Full Text Available Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis.

  5. Organization and comparative analysis of the mitochondrial genomes of bioluminescent Elateroidea (Coleoptera: Polyphaga).

    Science.gov (United States)

    Amaral, Danilo T; Mitani, Yasuo; Ohmiya, Yoshihiro; Viviani, Vadim R

    2016-07-25

    Mitochondrial genome organization in the Elateroidea superfamily (Coleoptera), which include the main families of bioluminescent beetles, has been poorly studied and lacking information about Phengodidae family. We sequenced the mitochondrial genomes of Neotropical Lampyridae (Bicellonycha lividipennis), Phengodidae (Brasilocerus sp.2 and Phrixothrix hirtus) and Elateridae (Pyrearinus termitilluminans, Hapsodrilus ignifer and Teslasena femoralis). All species had a typical insect mitochondrial genome except for the following: in the elaterid T. femoralis genome there is a non-coding region between NADH2 and tRNA-Trp; in the phengodids Brasilocerus sp.2 and P. hirtus genomes we did not find the tRNA-Ile and tRNA-Gln. The P. hirtus genome showed a ~1.6kb non-coding region, the rearrangement of tRNA-Tyr, a new tRNA-Leu copy, and several regions with higher AT contents. Phylogenetics analysis using Bayesian and ML models indicated that the Phengodidae+Rhagophthalmidae are closely related to Lampyridae family, and included Drilus flavescens (Drilidae) as an internal clade within Elateridae. This is the first report that compares the mitochondrial genomes organization of the three main families of bioluminescent Elateroidea, including the first Neotropical Lampyridae and Phengodidae. The losses of tRNAs, and translocation and duplication events found in Phengodidae mt genomes, mainly in P. hirtus, may indicate different evolutionary rates in these mitochondrial genomes. The mitophylogenomics analysis indicates the monophyly of the three bioluminescent families and a closer relationship between Lampyridae and Phengodidae/Rhagophthalmidae, in contrast with previous molecular analysis. PMID:27060405

  6. Exploring the zoonotic potential of Mycobacterium avium subspecies paratuberculosis through comparative genomics.

    Science.gov (United States)

    Wynne, James W; Bull, Tim J; Seemann, Torsten; Bulach, Dieter M; Wagner, Josef; Kirkwood, Carl D; Michalski, Wojtek P

    2011-01-01

    A comparative genomics approach was utilised to compare the genomes of Mycobacterium avium subspecies paratuberculosis (MAP) isolated from early onset paediatric Crohn's disease (CD) patients as well as Johne's diseased animals. Draft genome sequences were produced for MAP isolates derived from four CD patients, one ulcerative colitis (UC) patient, and two non-inflammatory bowel disease (IBD) control individuals using Illumina sequencing, complemented by comparative genome hybridisation (CGH). MAP isolates derived from two bovine and one ovine host were also subjected to whole genome sequencing and CGH. All seven human derived MAP isolates were highly genetically similar and clustered together with one bovine type isolate following phylogenetic analysis. Three other sequenced isolates (including the reference bovine derived isolate K10) were genetically distinct. The human isolates contained two large tandem duplications, the organisations of which were confirmed by PCR. Designated vGI-17 and vGI-18 these duplications spanned 63 and 109 open reading frames, respectively. PCR screening of over 30 additional MAP isolates (3 human derived, 27 animal derived and one environmental isolate) confirmed that vGI-17 and vGI-18 are common across many isolates. Quantitative real-time PCR of vGI-17 demonstrated that the proportion of cells containing the vGI-17 duplication varied between 0.01 to 15% amongst isolates with human isolates containing a higher proportion of vGI-17 compared to most animal isolates. These findings suggest these duplications are transient genomic rearrangements. We hypothesise that the over-representation of vGI-17 in human derived MAP strains may enhance their ability to infect or persist within a human host by increasing genome redundancy and conferring crude regulation of protein expression across biologically important regions. PMID:21799786

  7. Exploring the zoonotic potential of Mycobacterium avium subspecies paratuberculosis through comparative genomics.

    Directory of Open Access Journals (Sweden)

    James W Wynne

    Full Text Available A comparative genomics approach was utilised to compare the genomes of Mycobacterium avium subspecies paratuberculosis (MAP isolated from early onset paediatric Crohn's disease (CD patients as well as Johne's diseased animals. Draft genome sequences were produced for MAP isolates derived from four CD patients, one ulcerative colitis (UC patient, and two non-inflammatory bowel disease (IBD control individuals using Illumina sequencing, complemented by comparative genome hybridisation (CGH. MAP isolates derived from two bovine and one ovine host were also subjected to whole genome sequencing and CGH. All seven human derived MAP isolates were highly genetically similar and clustered together with one bovine type isolate following phylogenetic analysis. Three other sequenced isolates (including the reference bovine derived isolate K10 were genetically distinct. The human isolates contained two large tandem duplications, the organisations of which were confirmed by PCR. Designated vGI-17 and vGI-18 these duplications spanned 63 and 109 open reading frames, respectively. PCR screening of over 30 additional MAP isolates (3 human derived, 27 animal derived and one environmental isolate confirmed that vGI-17 and vGI-18 are common across many isolates. Quantitative real-time PCR of vGI-17 demonstrated that the proportion of cells containing the vGI-17 duplication varied between 0.01 to 15% amongst isolates with human isolates containing a higher proportion of vGI-17 compared to most animal isolates. These findings suggest these duplications are transient genomic rearrangements. We hypothesise that the over-representation of vGI-17 in human derived MAP strains may enhance their ability to infect or persist within a human host by increasing genome redundancy and conferring crude regulation of protein expression across biologically important regions.

  8. Comparative chloroplast genomics: Analyses including new sequencesfrom the angiosperms Nuphar advena and Ranunculus macranthus

    Energy Technology Data Exchange (ETDEWEB)

    Raubeso, Linda A.; Peery, Rhiannon; Chumley, Timothy W.; Dziubek,Chris; Fourcade, H. Matthew; Boore, Jeffrey L.; Jansen, Robert K.

    2007-03-01

    The number of completely sequenced plastid genomes available is growing rapidly. This new array of sequences presents new opportunities to perform comparative analyses. In comparative studies, it is most useful to compare across wide phylogenetic spans and, within angiosperms, to include representatives from basally diverging lineages such as the new genomes reported here: Nuphar advena (from a basal-most lineage) and Ranunculus macranthus (from the basal group of eudicots). We report these two new plastid genome sequences and make comparisons (within angiosperms, seed plants, or all photosynthetic lineages) to evaluate features such as the status of ycf15 and ycf68 as protein coding genes, the distribution of simple sequence repeats (SSRs) and longer dispersed repeats (SDR), and patterns of nucleotide composition.

  9. Comparative genomic reconstruction of transcriptional networks controlling central metabolism in the Shewanella genus

    Directory of Open Access Journals (Sweden)

    Kovaleva Galina

    2011-06-01

    Full Text Available Abstract Background Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in bacteria is one of the critical tasks of modern genomics. The Shewanella genus is comprised of metabolically versatile gamma-proteobacteria, whose lifestyles and natural environments are substantially different from Escherichia coli and other model bacterial species. The comparative genomics approaches and computational identification of regulatory sites are useful for the in silico reconstruction of transcriptional regulatory networks in bacteria. Results To explore conservation and variations in the Shewanella transcriptional networks we analyzed the repertoire of transcription factors and performed genomics-based reconstruction and comparative analysis of regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors and their DNA binding sites, 8 riboswitches and 6 translational attenuators. Forty five regulons were newly inferred from the genome context analysis, whereas others were propagated from previously characterized regulons in the Enterobacteria and Pseudomonas spp.. Multiple variations in regulatory strategies between the Shewanella spp. and E. coli include regulon contraction and expansion (as in the case of PdhR, HexR, FadR, numerous cases of recruiting non-orthologous regulators to control equivalent pathways (e.g. PsrA for fatty acid degradation and, conversely, orthologous regulators to control distinct pathways (e.g. TyrR, ArgR, Crp. Conclusions We tentatively defined the first reference collection of ~100 transcriptional regulons in 16 Shewanella genomes. The resulting regulatory network contains ~600 regulated genes per genome that are mostly involved in metabolism of carbohydrates, amino acids, fatty acids, vitamins, metals, and stress responses. Several reconstructed regulons including NagR for N-acetylglucosamine catabolism were experimentally validated in S

  10. Improving de novo sequence assembly using machine learning and comparative genomics for overlap correction

    Directory of Open Access Journals (Sweden)

    Bolanos Randall

    2010-01-01

    Full Text Available Abstract Background With the rapid expansion of DNA sequencing databases, it is now feasible to identify relevant information from prior sequencing projects and completed genomes and apply it to de novo sequencing of new organisms. As an example, this paper demonstrates how such extra information can be used to improve de novo assemblies by augmenting the overlapping step. Finding all pairs of overlapping reads is a key task in many genome assemblers, and to this end, highly efficient algorithms have been developed to find alignments in large collections of sequences. It is well known that due to repeated sequences, many aligned pairs of reads nevertheless do not overlap. But no overlapping algorithm to date takes a rigorous approach to separating aligned but non-overlapping read pairs from true overlaps. Results We present an approach that extends the Minimus assembler by a data driven step to classify overlaps as true or false prior to contig construction. We trained several different classification models within the Weka framework using various statistics derived from overlaps of reads available from prior sequencing projects. These statistics included percent mismatch and k-mer frequencies within the overlaps as well as a comparative genomics score derived from mapping reads to multiple reference genomes. We show that in real whole-genome sequencing data from the E. coli and S. aureus genomes, by providing a curated set of overlaps to the contigging phase of the assembler, we nearly doubled the median contig length (N50 without sacrificing coverage of the genome or increasing the number of mis-assemblies. Conclusions Machine learning methods that use comparative and non-comparative features to classify overlaps as true or false can be used to improve the quality of a sequence assembly.

  11. RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

    Energy Technology Data Exchange (ETDEWEB)

    Novichkov, Pavel S.; Rodionov, Dmitry A.; Stavrovskaya, Elena D.; Novichkova, Elena S.; Kazakov, Alexey E.; Gelfand, Mikhail S.; Arkin, Adam P.; Mironov, Andrey A.; Dubchak, Inna

    2010-05-26

    RegPredict web server is designed to provide comparative genomics tools for reconstruction and analysis of microbial regulons using comparative genomics approach. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences, ortholog and operon predictions from the MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.

  12. An initial comparative map of copy number variations in the goat (Capra hircus genome

    Directory of Open Access Journals (Sweden)

    Casadio Rita

    2010-11-01

    Full Text Available Abstract Background The goat (Capra hircus represents one of the most important farm animal species. It is reared in all continents with an estimated world population of about 800 million of animals. Despite its importance, studies on the goat genome are still in their infancy compared to those in other farm animal species. Comparative mapping between cattle and goat showed only a few rearrangements in agreement with the similarity of chromosome banding. We carried out a cross species cattle-goat array comparative genome hybridization (aCGH experiment in order to identify copy number variations (CNVs in the goat genome analysing animals of different breeds (Saanen, Camosciata delle Alpi, Girgentana, and Murciano-Granadina using a tiling oligonucleotide array with ~385,000 probes designed on the bovine genome. Results We identified a total of 161 CNVs (an average of 17.9 CNVs per goat, with the largest number in the Saanen breed and the lowest in the Camosciata delle Alpi goat. By aggregating overlapping CNVs identified in different animals we determined CNV regions (CNVRs: on the whole, we identified 127 CNVRs covering about 11.47 Mb of the virtual goat genome referred to the bovine genome (0.435% of the latter genome. These 127 CNVRs included 86 loss and 41 gain and ranged from about 24 kb to about 1.07 Mb with a mean and median equal to 90,292 bp and 49,530 bp, respectively. To evaluate whether the identified goat CNVRs overlap with those reported in the cattle genome, we compared our results with those obtained in four independent cattle experiments. Overlapping between goat and cattle CNVRs was highly significant (P Conclusions We describe a first map of goat CNVRs. This provides information on a comparative basis with the cattle genome by identifying putative recurrent interspecies CNVs between these two ruminant species. Several goat CNVs affect genes with important biological functions. Further studies are needed to evaluate the

  13. Microfluidic System for Solution Array Based Bioassays

    Energy Technology Data Exchange (ETDEWEB)

    Dougherty, G M; Tok, J B; Pannu, S S; Rose, K A

    2006-02-10

    The objective of this project is to demonstrate new enabling technology for multiplex biodetection systems that are flexible, miniaturizable, highly automated, low cost, and high performance. It builds on prior successes at LLNL with particle-based solution arrays, such as those used in the Autonomous Pathogen Detection System (APDS) successfully field deployed to multiple locations nationwide. We report the development of a multiplex solution array immunoassay based upon engineered metallic nanorod particles. Nanobarcodes{reg_sign} particles are fabricated by sequential electrodeposition of dissimilar metals within porous alumina templates, yielding optically encoded striping patterns that can be read using standard laboratory microscope optics and PC-based image processing software. The addition of self-assembled monolayer (SAM) coatings and target-specific antibodies allows each encoded class of nanorod particles to be directed against a different antigen target. A prototype assay panel directed against bacterial, viral, and soluble protein targets demonstrates simultaneous detection at sensitivities comparable to state of the art immunoassays, with minimal cross-reactivity. Studies have been performed to characterize the colloidal properties (zeta potential) of the suspended nanorod particles as a function of pH, the ionic strength of the suspending solution, and surface functionalization state. Additional studies have produced means for the non-contact manipulation of the particles, including the insertion of magnetic nickel stripes within the encoding pattern, and control via externally applied electromagnetic fields. Using the results of these studies, the novel Nanobarcodes{reg_sign} based assay was implemented in a prototype automated system with the sample processing functions and optical readout performed on a microfluidic card. The unique physical properties of the nanorod particles enable the development of integrated microfluidic systems for

  14. The genome sequence of E. coli W (ATCC 9637: comparative genome analysis and an improved genome-scale reconstruction of E. coli

    Directory of Open Access Journals (Sweden)

    Lee Sang

    2011-01-01

    Full Text Available Abstract Background Escherichia coli is a model prokaryote, an important pathogen, and a key organism for industrial biotechnology. E. coli W (ATCC 9637, one of four strains designated as safe for laboratory purposes, has not been sequenced. E. coli W is a fast-growing strain and is the only safe strain that can utilize sucrose as a carbon source. Lifecycle analysis has demonstrated that sucrose from sugarcane is a preferred carbon source for industrial bioprocesses. Results We have sequenced and annotated the genome of E. coli W. The chromosome is 4,900,968 bp and encodes 4,764 ORFs. Two plasmids, pRK1 (102,536 bp and pRK2 (5,360 bp, are also present. W has unique features relative to other sequenced laboratory strains (K-12, B and Crooks: it has a larger genome and belongs to phylogroup B1 rather than A. W also grows on a much broader range of carbon sources than does K-12. A genome-scale reconstruction was developed and validated in order to interrogate metabolic properties. Conclusions The genome of W is more similar to commensal and pathogenic B1 strains than phylogroup A strains, and therefore has greater utility for comparative analyses with these strains. W should therefore be the strain of choice, or 'type strain' for group B1 comparative analyses. The genome annotation and tools created here are expected to allow further utilization and development of E. coli W as an industrial organism for sucrose-based bioprocesses. Refinements in our E. coli metabolic reconstruction allow it to more accurately define E. coli metabolism relative to previous models.

  15. Comparative genomics of two independently enriched ‘Candidatus Kuenenia stuttgartiensis’ anammox bacteria

    Directory of Open Access Journals (Sweden)

    DaanRSpeth

    2012-08-01

    Here we present a comparative genomic analysis of two ‘Ca. K. stuttgartiensis’ anammox bacteria that were independently enriched, with the aim to understand more about the evolution, cell plan and metabolism of these important microbes and to further improve and complete the reference genome. The two anammox bacteria used are ‘Ca. K. stuttgartiensis’ RU1, which was originally sequenced for the reference genome in 2002, and for the present study resequenced after seven (2002-2009 years in continuous culture. Furthermore ‘Ca. K. stuttgartiensis’ CH1, enriched from a Chinese wastewater treatment plant was used as an independent source of genomic information. The two different ‘Ca. Kuenenia’ bacteria showed a very high sequence identity (> 99 % at nucleotide level over the entire genome, but 31 genomic regions (average size 11 kb were absent from strain CH1 and 220 kb of sequence was specifically found in the CH1 assembly. The high sequence homology between these two bacteria indicates that mobile genetic elements are the main source of variation between these geographically widely separated strains.

  16. Comparing the Dictyostelium and Entamoeba Genomes Reveals an Ancient Split in the Conosa Lineage.

    Directory of Open Access Journals (Sweden)

    2005-12-01

    Full Text Available The Amoebozoa are a sister clade to the fungi and the animals, but are poorly sampled for completely sequenced genomes. The social amoeba Dictyostelium discoideum and amitochondriate pathogen Entamoeba histolytica are the first Amoebozoa with genomes completely sequenced. Both organisms are classified under the Conosa subphylum. To identify Amoebozoa-specific genomic elements, we compared these two genomes to each other and to other eukaryotic genomes. An expanded phylogenetic tree built from the complete predicted proteomes of 23 eukaryotes places the two amoebae in the same lineage, although the divergence is estimated to be greater than that between animals and fungi, and probably happened shortly after the Amoebozoa split from the opisthokont lineage. Most of the 1,500 orthologous gene families shared between the two amoebae are also shared with plant, animal, and fungal genomes. We found that only 42 gene families are distinct to the amoeba lineage; among these are a large number of proteins that contain repeats of the FNIP domain, and a putative transcription factor essential for proper cell type differentiation in D. discoideum. These Amoebozoa-specific genes may be useful in the design of novel diagnostics and therapies for amoebal pathologies.

  17. Evolution of a microbial nitrilase gene family: a comparative and environmental genomics study

    Directory of Open Access Journals (Sweden)

    Eads Jonathan R

    2005-08-01

    Full Text Available Abstract Background Completed genomes and environmental genomic sequences are bringing a significant contribution to understanding the evolution of gene families, microbial metabolism and community eco-physiology. Here, we used comparative genomics and phylogenetic analyses in conjunction with enzymatic data to probe the evolution and functions of a microbial nitrilase gene family. Nitrilases are relatively rare in bacterial genomes, their biological function being unclear. Results We examined the genetic neighborhood of the different subfamily genes and discovered conserved gene clusters or operons associated with specific nitrilase clades. The inferred evolutionary transitions that separate nitrilases which belong to different gene clusters correlated with changes in their enzymatic properties. We present evidence that Darwinian adaptation acted during one of those transitions and identified sites in the enzyme that may have been under positive selection. Conclusion Changes in the observed biochemical properties of the nitrilases associated with the different gene clusters are consistent with a hypothesis that those enzymes have been recruited to a novel metabolic pathway following gene duplication and neofunctionalization. These results demonstrate the benefits of combining environmental genomic sampling and completed genomes data with evolutionary and biochemical analyses in the study of gene families. They also open new directions for studying the functions of nitrilases and the genes they are associated with.

  18. Comparative genomic analysis of multiple strains of two unusual plant pathogens: Pseudomonas corrugata and Pseudomonas mediterranea

    Directory of Open Access Journals (Sweden)

    Emmanouil A Trantas

    2015-08-01

    Full Text Available The non-fluorescent pseudomonads, Pseudomonas corrugata (Pcor and P. mediterranea (Pmed, are closely related species that cause pith necrosis, a disease of tomato that causes severe crop losses. However, they also show strong antagonistic effects against economically important pathogens, demonstrating their potential for utilization as biological control agents. In addition, their metabolic versatility makes them attractive for the production of commercial biomolecules and bioremediation. An extensive comparative genomics study is required to dissect the mechanisms that Pcor and Pmed employ to cause disease, prevent disease caused by other pathogens, and to mine their genomes for commercially significant chemical pathways. Here, we present the draft genomes of nine Pcor and Pmed strains from different geographical locations. This analysis covered significant genetic heterogeneity and allowed in-depth genomic comparison. All examined strains were able to trigger symptoms in tomato plants but not all induced a hypersensitive-like response in Nicotiana benthamiana. Genome-mining revealed the absence of a type III secretion system and of known type III effectors from all examined Pcor and Pmed strains. The lack of a type III secretion system appears to be unique among the plant pathogenic pseudomonads. Several gene clusters coding for type VI secretion system were detected in all genomes.

  19. Comparing the Dictyostelium and Entamoeba genomes reveals an ancient split in the Conosa lineage.

    Directory of Open Access Journals (Sweden)

    Jie Song

    2005-12-01

    Full Text Available The Amoebozoa are a sister clade to the fungi and the animals, but are poorly sampled for completely sequenced genomes. The social amoeba Dictyostelium discoideum and amitochondriate pathogen Entamoeba histolytica are the first Amoebozoa with genomes completely sequenced. Both organisms are classified under the Conosa subphylum. To identify Amoebozoa-specific genomic elements, we compared these two genomes to each other and to other eukaryotic genomes. An expanded phylogenetic tree built from the complete predicted proteomes of 23 eukaryotes places the two amoebae in the same lineage, although the divergence is estimated to be greater than that between animals and fungi, and probably happened shortly after the Amoebozoa split from the opisthokont lineage. Most of the 1,500 orthologous gene families shared between the two amoebae are also shared with plant, animal, and fungal genomes. We found that only 42 gene families are distinct to the amoeba lineage; among these are a large number of proteins that contain repeats of the FNIP domain, and a putative transcription factor essential for proper cell type differentiation in D. discoideum. These Amoebozoa-specific genes may be useful in the design of novel diagnostics and therapies for amoebal pathologies.

  20. Comparative Genetic Analyses of Human Rhinovirus C (HRV-C) Complete Genome from Malaysia.

    Science.gov (United States)

    Khaw, Yam Sim; Chan, Yoke Fun; Jafar, Faizatul Lela; Othman, Norlijah; Chee, Hui Yee

    2016-01-01

    Human rhinovirus-C (HRV-C) has been implicated in more severe illnesses than HRV-A and HRV-B, however, the limited number of HRV-C complete genomes (complete 5' and 3' non-coding region and open reading frame sequences) has hindered the in-depth genetic study of this virus. This study aimed to sequence seven complete HRV-C genomes from Malaysia and compare their genetic characteristics with the 18 published HRV-Cs. Seven Malaysian HRV-C complete genomes were obtained with newly redesigned primers. The seven genomes were classified as HRV-C6, C12, C22, C23, C26, C42, and pat16 based on the VP4/VP2 and VP1 pairwise distance threshold classification. Five of the seven Malaysian isolates, namely, 3430-MY-10/C22, 8713-MY-10/C23, 8097-MY-11/C26, 1570-MY-10/C42, and 7383-MY-10/pat16 are the first newly sequenced complete HRV-C genomes. All seven Malaysian isolates genomes displayed nucleotide similarity of 63-81% among themselves and 63-96% with other HRV-Cs. Malaysian HRV-Cs had similar putative immunogenic sites, putative receptor utilization and potential antiviral sites as other HRV-Cs. The genomic features of Malaysian isolates were similar to those of other HRV-Cs. Negative selections were frequently detected in HRV-Cs complete coding sequences indicating that these sequences were under functional constraint. The present study showed that HRV-Cs from Malaysia have diverse genetic sequences but share conserved genomic features with other HRV-Cs. This genetic information could provide further aid in the understanding of HRV-C infection. PMID:27199901

  1. Comparative genomic analysis of two-component regulatory proteins in Pseudomonas syringae

    Directory of Open Access Journals (Sweden)

    Ussery David W

    2007-10-01

    Full Text Available Abstract Background Pseudomonas syringae is a widespread bacterial plant pathogen, and strains of P. syringae may be assigned to different pathovars based on host specificity among different plant species. The genomes of P. syringae pv. syringae (Psy B728a, pv. tomato (Pto DC3000 and pv. phaseolicola (Pph 1448A have been recently sequenced providing a major resource for comparative genomic analysis. A mechanism commonly found in bacteria for signal transduction is the two-component system (TCS, which typically consists of a sensor histidine kinase (HK and a response regulator (RR. P. syringae requires a complex array of TCS proteins to cope with diverse plant hosts, host responses, and environmental conditions. Results Based on the genomic data, pattern searches with Hidden Markov Model (HMM profiles have been used to identify putative HKs and RRs. The genomes of Psy B728a, Pto DC3000 and Pph 1448A were found to contain a large number of genes encoding TCS proteins, and a core of complete TCS proteins were shared between these genomes: 30 putative TCS clusters, 11 orphan HKs, 33 orphan RRs, and 16 hybrid HKs. A close analysis of the distribution of genes encoding TCS proteins revealed important differences in TCS proteins among the three P. syringae pathovars. Conclusion In this article we present a thorough analysis of the identification and distribution of TCS proteins among the sequenced genomes of P. syringae. We have identified differences in TCS proteins among the three P. syringae pathovars that may contribute to their diverse host ranges and association with plant hosts. The identification and analysis of the repertoire of TCS proteins in the genomes of P. syringae pathovars constitute a basis for future functional genomic studies of the signal transduction pathways in this important bacterial phytopathogen.

  2. Comparative Genetic Analyses of Human Rhinovirus C (HRV-C) Complete Genome from Malaysia

    Science.gov (United States)

    Khaw, Yam Sim; Chan, Yoke Fun; Jafar, Faizatul Lela; Othman, Norlijah; Chee, Hui Yee

    2016-01-01

    Human rhinovirus-C (HRV-C) has been implicated in more severe illnesses than HRV-A and HRV-B, however, the limited number of HRV-C complete genomes (complete 5′ and 3′ non-coding region and open reading frame sequences) has hindered the in-depth genetic study of this virus. This study aimed to sequence seven complete HRV-C genomes from Malaysia and compare their genetic characteristics with the 18 published HRV-Cs. Seven Malaysian HRV-C complete genomes were obtained with newly redesigned primers. The seven genomes were classified as HRV-C6, C12, C22, C23, C26, C42, and pat16 based on the VP4/VP2 and VP1 pairwise distance threshold classification. Five of the seven Malaysian isolates, namely, 3430-MY-10/C22, 8713-MY-10/C23, 8097-MY-11/C26, 1570-MY-10/C42, and 7383-MY-10/pat16 are the first newly sequenced complete HRV-C genomes. All seven Malaysian isolates genomes displayed nucleotide similarity of 63–81% among themselves and 63–96% with other HRV-Cs. Malaysian HRV-Cs had similar putative immunogenic sites, putative receptor utilization and potential antiviral sites as other HRV-Cs. The genomic features of Malaysian isolates were similar to those of other HRV-Cs. Negative selections were frequently detected in HRV-Cs complete coding sequences indicating that these sequences were under functional constraint. The present study showed that HRV-Cs from Malaysia have diverse genetic sequences but share conserved genomic features with other HRV-Cs. This genetic information could provide further aid in the understanding of HRV-C infection. PMID:27199901

  3. Comparative genome analysis of the high pathogenicity Salmonella Typhimurium strain UK-1.

    Directory of Open Access Journals (Sweden)

    Yingqin Luo

    Full Text Available Salmonella enterica serovar Typhimurium, a gram-negative facultative rod-shaped bacterium causing salmonellosis and foodborne disease, is one of the most common isolated Salmonella serovars in both developed and developing nations. Several S. Typhimurium genomes have been completed and many more genome-sequencing projects are underway. Comparative genome analysis of the multiple strains leads to a better understanding of the evolution of S. Typhimurium and its pathogenesis. S. Typhimurium strain UK-1 (belongs to phage type 1 is highly virulent when orally administered to mice and chickens and efficiently colonizes lymphoid tissues of these species. These characteristics make this strain a good choice for use in vaccine development. In fact, UK-1 has been used as the parent strain for a number of nonrecombinant and recombinant vaccine strains, including several commercial vaccines for poultry. In this study, we conducted a thorough comparative genome analysis of the UK-1 strain with other S. Typhimurium strains and examined the phenotypic impact of several genomic differences. Whole genomic comparison highlights an extremely close relationship between the UK-1 strain and other S. Typhimurium strains; however, many interesting genetic and genomic variations specific to UK-1 were explored. In particular, the deletion of a UK-1-specific gene that is highly similar to the gene encoding the T3SS effector protein NleC exhibited a significant decrease in oral virulence in BALB/c mice. The complete genetic complements in UK-1, especially those elements that contribute to virulence or aid in determining the diversity within bacterial species, provide key information in evaluating the functional characterization of important genetic determinants and for development of vaccines.

  4. Identification of human-specific AluS elements through comparative genomics.

    Science.gov (United States)

    Lee, Jae; Kim, Yun-Ji; Mun, Seyoung; Kim, Heui-Soo; Han, Kyudong

    2015-01-25

    Mobile elements are responsible for ~45% of the human genome. Among them is the Alu element, accounting for 10% of the human genome (>1.1million copies). Several studies of Alu elements have reported that they are frequently involved in human genetic diseases and genomic rearrangements. In this study, we investigated the AluS subfamily, which is a relatively old Alu subfamily and has the highest copy number in primate genomes. Previously, a set of 263 human-specific AluS insertions was identified in the human genome. To validate these, we compared each of the human-specific AluS loci with its pre-insertion site in other primate genomes, including chimpanzee, gorilla, and orangutan. We obtained 24 putative human-specific AluS candidates via the in silico analysis and manual inspection, and then tried to verify them using PCR amplification and DNA sequencing. Through the PCR product sequencing, we were able to detect two instances of near-parallel Alu insertions in nearby sites that led to computational false negatives. Finally, we computationally and experimentally verified 23 human-specific AluS elements. We reported three alternative Alu insertion events, which are accompanied by filler DNA and/or Alu retrotransposition mediated-deletion. Bisulfite sequencing was carried out to examine DNA methylation levels of human-specific AluS elements. The results showed that fixed AluS elements are hypermethylated compared with polymorphic elements, indicating a possible relation between DNA methylation and Alu fixation in the human genome. PMID:25447892

  5. Comparative genomic sequence analysis of strawberry and other rosids reveals significant microsynteny

    Directory of Open Access Journals (Sweden)

    Abbott Albert

    2010-06-01

    Full Text Available Abstract Background Fragaria belongs to the Rosaceae, an economically important family that includes a number of important fruit producing genera such as Malus and Prunus. Using genomic sequences from 50 Fragaria fosmids, we have examined the microsynteny between Fragaria and other plant models. Results In more than half of the strawberry fosmids, we found syntenic regions that are conserved in Populus, Vitis, Medicago and/or Arabidopsis with Populus containing the greatest number of syntenic regions with Fragaria. The longest syntenic region was between LG VIII of the poplar genome and the strawberry fosmid 72E18, where seven out of twelve predicted genes were collinear. We also observed an unexpectedly high level of conserved synteny between Fragaria (rosid I and Vitis (basal rosid. One of the strawberry fosmids, 34E24, contained a cluster of R gene analogs (RGAs with NBS and LRR domains. We detected clusters of RGAs with high sequence similarity to those in 34E24 in all the genomes compared. In the phylogenetic tree we have generated, all the NBS-LRR genes grouped together with Arabidopsis CNL-A type NBS-LRR genes. The Fragaria RGA grouped together with those of Vitis and Populus in the phylogenetic tree. Conclusions Our analysis shows considerable microsynteny between Fragaria and other plant genomes such as Populus, Medicago, Vitis, and Arabidopsis to a lesser degree. We also detected a cluster of NBS-LRR type genes that are conserved in all the genomes compared.

  6. Comparative Genomics of Erwinia amylovora and Related Erwinia Species—What do We Learn?

    Directory of Open Access Journals (Sweden)

    Youfu Zhao

    2011-09-01

    Full Text Available Erwinia amylovora, the causal agent of fire blight disease of apples and pears, is one of the most important plant bacterial pathogens with worldwide economic significance. Recent reports on the complete or draft genome sequences of four species in the genus Erwinia, including E. amylovora, E. pyrifoliae, E. tasmaniensis, and E. billingiae, have provided us near complete genetic information about this pathogen and its closely-related species. This review describes in silico subtractive hybridization-based comparative genomic analyses of eight genomes currently available, and highlights what we have learned from these comparative analyses, as well as genetic and functional genomic studies. Sequence analyses reinforce the assumption that E. amylovora is a relatively homogeneous species and support the current classification scheme of E. amylovora and its related species. The potential evolutionary origin of these Erwinia species is also proposed. The current understanding of the pathogen, its virulence mechanism and host specificity from genome sequencing data is summarized. Future research directions are also suggested.

  7. arrayCGHbase: an analysis platform for comparative genomic hybridization microarrays

    Directory of Open Access Journals (Sweden)

    Moreau Yves

    2005-05-01

    Full Text Available Abstract Background The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH. One of the challenges inherent to this new technology is the management and analysis of large numbers of data points generated in each individual experiment. Results We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment supportive database using MySQL underlying a data mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Following its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser. Conclusion ArrayCGHbase is a web based and platform independent arrayCGH data analysis tool, that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at http://medgen.ugent.be/arrayCGHbase/.

  8. MicrobesOnline: an integrated portal for comparative and functional genomics

    Energy Technology Data Exchange (ETDEWEB)

    Dehal, Paramvir S.; Joachimiak, Marcin P.; Price, Morgan N.; Bates, John T.; Baumohl, Jason K.; Chivian, Dylan; Friedland, Greg D.; Huang, Katherine H.; Keller, Keith; Novichkov, Pavel S.; Dubchak, Inna L.; Alm, Eric J.; Arkin, Adam P.

    2009-09-17

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  9. MicrobesOnline: an integrated portal for comparative and functional genomics

    Energy Technology Data Exchange (ETDEWEB)

    Dehal, Paramvir; Joachimiak, Marcin; Price, Morgan; Bates, John; Baumohl, Jason; Chivian, Dylan; Friedland, Greg; Huang, Kathleen; Keller, Keith; Novichkov, Pavel; Dubchak, Inna; Alm, Eric; Arkin, Adam

    2011-07-14

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  10. Comparative anatomy of the petioles of different genomic Cydonia × Malus hybrids

    Directory of Open Access Journals (Sweden)

    Elisaveta Onica

    2013-04-01

    Full Text Available In the paper morphological and anatomical structure of the petioles of 15 different genomic hybrids between quince and apple are compared with other hybrids and the initial forms. Specific and common anatomic peculiarities of the petiole for the studied hybrids in comparison to other hybrids and parental forms are given.

  11. CMG-Biotools, a Free Workbench for Basic Comparative Microbial Genomics

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla; Lagesen, Karin; Acar, Öncel;

    2013-01-01

    This paper shows the strength and diverse use of the CMG-biotools system. The system can be installed on a vide range of host operating systems and utilizes as much of the host computer as desired. It allows the user to compare multiple genomes, from various sources using standardized data format...

  12. Features of 5'-splice-site efficiency derived from disease-causing mutations and comparative genomics

    DEFF Research Database (Denmark)

    Roca, Xavier; Olson, Andrew J; Rao, Atmakuri R;

    2007-01-01

    Many human diseases, including Fanconi anemia, hemophilia B, neurofibromatosis, and phenylketonuria, can be caused by 5'-splice-site (5'ss) mutations that are not predicted to disrupt splicing, according to position weight matrices. By using comparative genomics, we identify pairwise dependencies...

  13. Comparative genomics and repetitive sequence divergence in the species of diploid Nicotiana section Alatae

    Czech Academy of Sciences Publication Activity Database

    Lim, Y.K.; Kovařík, Aleš; Matyášek, Roman; Chase, M.W.; Knapp, S.; McCarthy, E.; Clarkson, J.; Leitch, A.R.

    2006-01-01

    Roč. 48, č. 6 (2006), s. 907-919. ISSN 0960-7412 R&D Projects: GA ČR(CZ) GA521/04/0775 Institutional research plan: CEZ:AV0Z50040507 Keywords : comparative genomics * DNA phylogenetics * tandem repeats Subject RIV: BO - Biophysics Impact factor: 6.565, year: 2006

  14. Comparative genomic and in situ hybridization of germ cell tumors of the infantile testis

    NARCIS (Netherlands)

    Mostert, M; Rosenberg, C; Stoop, H; Schuyer, M; Timmer, A; Oosterhuis, W; Looijenga, L

    2000-01-01

    Chromosomal information on germ cell tumors of the infantile testis, ie, teratomas and yolk sac tumors, is limited and controversial. We studied two teratomas and four yolk sac tumors using comparative genomic hybridization (CGH) and in situ hybridization. No chromosomal anomalies were found in the

  15. Comparative genomics of Escherichia coli isolated from patients with inflammatory bowel disease

    DEFF Research Database (Denmark)

    Vejborg, Rebecca Munk; Hancock, Viktoria; Petersen, Andreas M; Krogfelt, Karen; Klemm, Per

    2011-01-01

    Inflammatory bowel disease (IBD) is used to describe a state of idiopathic, chronic inflammation of the gastrointestinal tract. The two main phenotypes of IBD are Crohn's disease (CD) and ulcerative colitis (UC). The major cause of IBD-associated mortality is colorectal cancer. Although both host......-genetic and exogenous factors have been found to be involved, the aetiology of IBD is still not well understood. In this study we characterized thirteen Escherichia coli strains from patients with IBD by comparative genomic hybridization employing a microarray based on 31 sequenced E. coli genomes from a wide...

  16. Comparing Platforms for C. elegans Mutant Identification Using High-Throughput Whole-Genome Sequencing

    OpenAIRE

    Shen, Yufeng; Sarin, Sumeet; Liu, Ye; Hobert, Oliver; Pe'er, Itsik

    2008-01-01

    Background Whole-genome sequencing represents a promising approach to pinpoint chemically induced mutations in genetic model organisms, thereby short-cutting time-consuming genetic mapping efforts. Principal Findings We compare here the ability of two leading high-throughput platforms for paired-end deep sequencing, SOLiD (ABI) and Genome Analyzer (Illumina; “Solexa”), to achieve the goal of mutant detection. As a test case we used a mutant C. elegans strain that harbors a mutation in the lsy...

  17. Comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi

    OpenAIRE

    Zhao, Zhongtao; Liu, Huiquan; Wang, Chenfang; Xu, Jin-Rong

    2013-01-01

    Background Fungi produce a variety of carbohydrate activity enzymes (CAZymes) for the degradation of plant polysaccharide materials to facilitate infection and/or gain nutrition. Identifying and comparing CAZymes from fungi with different nutritional modes or infection mechanisms may provide information for better understanding of their life styles and infection models. To date, over hundreds of fungal genomes are publicly available. However, a systematic comparative analysis of fungal CAZyme...

  18. Comparative Genomic Hybridization Selection of Blastocysts for Repeated Implantation Failure Treatment: A Pilot Study

    OpenAIRE

    Ermanno Greco; Sara Bono; Alessandra Ruberti; Anna Maria Lobascio; Pierfrancesco Greco; Anil Biricik; Letizia Spizzichino; Alessia Greco; Jan Tesarik; Maria Giulia Minasi; Francesco Fiorentino

    2014-01-01

    The aim of this study is to determine if the use of preimplantation genetic screening (PGS) by array comparative genomic hybridization (array CGH) and transfer of a single euploid blastocyst in patients with repeated implantation failure (RIF) can improve clinical results. Three patient groups are compared: 43 couples with RIF for whom embryos were selected by array CGH (group RIF-PGS), 33 couples with the same history for whom array CGH was not performed (group RIF NO PGS), and 45 good progn...

  19. Genome Sequencing and Comparative Analysis of the Biocontrol Agent Trichoderma harzianum sensu stricto TR274

    Energy Technology Data Exchange (ETDEWEB)

    Steindorff, Andrei S.; Noronha, Elilane F.; Ulhoa, Cirano J.; Kuo, Alan; Salamov, Asaf A.; Haridas, Sajeet; Riley, Robert W.; Druzhinina, Irina S.; Kubicek, Christian P.; Grigoriev, Igor V.

    2015-03-17

    Biological control is a complex process which requires many mechanisms and a high diversity of biochemical pathways. The species of Trichoderma harzianum are well known for their biocontrol activity against many plant pathogens. To gain new insights into the biocontrol mechanism used by T. harzianum, we sequenced the isolate TR274 genome using Illumina. The assembly was performed using AllPaths-LG with a maximum coverage of 100x. The assembly resulted in 2282 contigs with a N50 of 37033bp. The genome size generated was 40.8 Mb and the GC content was 47.7%, similar to other Trichoderma genomes. Using the JGI Annotation Pipeline we predicted 13,932 genes with a high transcriptome support. CEGMA tests suggested 100% genome completeness and 97.9% of RNA-SEQ reads were mapped to the genome. The phylogenetic comparison using orthologous proteins with all Trichoderma genomes sequenced at JGI, corroborates the Trichoderma (T. asperellum and T. atroviride), Longibrachiatum (T. reesei and T. longibrachiatum) and Pachibasium (T. harzianum and T. virens) section division described previously. The comparison between two Trichoderma harzianum species suggests a high genome similarity but some strain-specific expansions. Analyses of the secondary metabolites, CAZymes, transporters, proteases, transcription factors were performed. The Pachybasium section expanded virtually all categories analyzed compared with the other sections, specially Longibrachiatum section, that shows a clear contraction. These results suggests that these proteins families have an important role in their respective phenotypes. Future analysis will improve the understanding of this complex genus and give some insights about its lifestyle and the interactions with the environment.

  20. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes

    DEFF Research Database (Denmark)

    Parker, Brian John; Moltke, Ida; Roth, Adam;

    2011-01-01

    comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein......-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also...... identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one...

  1. Comparative genomics of Escherichia coli isolated from patients with inflammatory bowel disease

    DEFF Research Database (Denmark)

    Vejborg, Rebecca Munk; Hancock, Viktoria; Petersen, Andreas M.;

    2011-01-01

    both host-genetic and exogenous factors have been found to be involved, the aetiology of IBD is still not well understood. In this study we characterized thirteen Escherichia coli strains from patients with IBD by comparative genomic hybridization employing a microarray based on 31 sequenced E. coli...... prototypic CD isolate, LF82, suggesting that the IBD-inducing effect of the strains is multifactorial. Several of the IBD isolates carried a number of extraintestinal pathogenic E. coli (ExPEC)-related virulence determinants such as the pap, sfa, cdt and hly genes. The isolates were also found to carry genes...... of ExPEC-associated genomic islands. Conclusions: Combined, these data suggest that E. coli isolates obtained from UC and CD patients represents a heterogeneous population of strains, with genomic profiles that are indistinguishable to those of ExPEC isolates. Our findings indicate that IBD...

  2. Comparative Genomics between Two Xenorhabdus bovienii Strains Highlights Differential Evolutionary Scenarios within an Entomopathogenic Bacterial Species.

    Science.gov (United States)

    Bisch, Gaëlle; Ogier, Jean-Claude; Médigue, Claudine; Rouy, Zoé; Vincent, Stéphanie; Tailliez, Patrick; Givaudan, Alain; Gaudriault, Sophie

    2016-01-01

    Bacteria of the genus Xenorhabdus are symbionts of soil entomopathogenic nematodes of the genus Steinernema. This symbiotic association constitutes an insecticidal complex active against a wide range of insect pests. Within Xenorhabdus bovienii species, the X. bovienii CS03 strain (Xb CS03) is nonvirulent when directly injected into lepidopteran insects, and displays a low virulence when associated with its Steinernema symbiont. The genome of Xb CS03 was sequenced and compared with the genome of a virulent strain, X. bovienii SS-2004 (Xb SS-2004). The genome size and content widely differed between the two strains. Indeed, Xb CS03 had a large genome containing several specific loci involved in the inhibition of competitors, including a few NRPS-PKS loci (nonribosomal peptide synthetases and polyketide synthases) producing antimicrobial molecules. Consistently, Xb CS03 had a greater antimicrobial activity than Xb SS-2004. The Xb CS03 strain contained more pseudogenes than Xb SS-2004. Decay of genes involved in the host invasion and exploitation (toxins, invasins, or extracellular enzymes) was particularly important in Xb CS03. This may provide an explanation for the nonvirulence of the strain when injected into an insect host. We suggest that Xb CS03 and Xb SS-2004 followed divergent evolutionary scenarios to cope with their peculiar life cycle. The fitness strategy of Xb CS03 would involve competitor inhibition, whereas Xb SS-2004 would quickly and efficiently kill the insect host. Hence, Xenorhabdus strains would have widely divergent host exploitation strategies, which impact their genome structure. PMID:26769959

  3. Comparative Genomics of Interreplichore Translocations in Bacteria: A Measure of Chromosome Topology?

    Directory of Open Access Journals (Sweden)

    Supriya Khedkar

    2016-06-01

    Full Text Available Genomes evolve not only in base sequence but also in terms of their architecture, defined by gene organization and chromosome topology. Whereas genome sequence data inform us about the changes in base sequences for a large variety of organisms, the study of chromosome topology is restricted to a few model organisms studied using microscopy and chromosome conformation capture techniques. Here, we exploit whole genome sequence data to study the link between gene organization and chromosome topology in bacteria. Using comparative genomics across ∼250 pairs of closely related bacteria we show that: (a many organisms show a high degree of interreplichore translocations throughout the chromosome and not limited to the inversion-prone terminus (ter or the origin of replication (oriC; (b translocation maps may reflect chromosome topologies; and (c symmetric interreplichore translocations do not disrupt the distance of a gene from oriC or affect gene expression states or strand biases in gene densities. In summary, we suggest that translocation maps might be a first line in defining a gross chromosome topology given a pair of closely related genome sequences.

  4. Comparing Memory-Efficient Genome Assemblers on Stand-Alone and Cloud Infrastructures

    KAUST Repository

    Kleftogiannis, Dimitrios

    2013-09-27

    A fundamental problem in bioinformatics is genome assembly. Next-generation sequencing (NGS) technologies produce large volumes of fragmented genome reads, which require large amounts of memory to assemble the complete genome efficiently. With recent improvements in DNA sequencing technologies, it is expected that the memory footprint required for the assembly process will increase dramatically and will emerge as a limiting factor in processing widely available NGS-generated reads. In this report, we compare current memory-efficient techniques for genome assembly with respect to quality, memory consumption and execution time. Our experiments prove that it is possible to generate draft assemblies of reasonable quality on conventional multi-purpose computers with very limited available memory by choosing suitable assembly methods. Our study reveals the minimum memory requirements for different assembly programs even when data volume exceeds memory capacity by orders of magnitude. By combining existing methodologies, we propose two general assembly strategies that can improve short-read assembly approaches and result in reduction of the memory footprint. Finally, we discuss the possibility of utilizing cloud infrastructures for genome assembly and we comment on some findings regarding suitable computational resources for assembly.

  5. The complete mitochondrial genome of Gastrothylax crumenifer (Gastrothylacidae, Trematoda) and comparative analyses with selected trematodes.

    Science.gov (United States)

    Yang, Xin; Wang, Lixia; Chen, Hongmei; Feng, Hanli; Shen, Bang; Hu, Min; Fang, Rui

    2016-06-01

    In the present study, we sequenced and analyzed the mitochondrial (mt) genome of Gastrothylax crumenifer and compared it with other selected trematodes. The full mt genome of G. crumenifer was amplified, sequenced, assembled, analyzed and then subjected to phylogenetic analysis. The complete mt genome of G. crumenifer is 14,801 bp in length and contains two rRNA genes, two non-coding regions (LNR and SNR), 12 protein-coding genes, and 22 transfer RNA genes. The gene organization of the G. crumenifer mt genome is the same as that of other trematodes, except for Schistosoma haematobium and Schistosoma spindale. All the genes are transcribed in the same direction and rich in "A + T", which is in accordance with other trematodes, such as Fasciola hepatica, Paramphistomum cervi, and Fischoederius elongatus. Phylogenetic analysis using concatenated amino acid sequences of the 12 protein-coding genes showed that G. crumenifer is closely related to F. elongatus. The availability of mt genome sequence of G. crumenifer can provide useful DNA markers for studying the molecular epidemiology and population genetics of this parasite and other paramphistomes. PMID:27021180

  6. Complete chloroplast genome sequence of Omani lime (Citrus aurantiifolia and comparative analysis within the rosids.

    Directory of Open Access Journals (Sweden)

    Huei-Jiun Su

    Full Text Available The genus Citrus contains many economically important fruits that are grown worldwide for their high nutritional and medicinal value. Due to frequent hybridizations among species and cultivars, the exact number of natural species and the taxonomic relationships within this genus are unclear. To compare the differences between the Citrus chloroplast genomes and to develop useful genetic markers, we used a reference-assisted approach to assemble the complete chloroplast genome of Omani lime (C. aurantiifolia. The complete C. aurantiifolia chloroplast genome is 159,893 bp in length; the organization and gene content are similar to most of the rosids lineages characterized to date. Through comparison with the sweet orange (C. sinensis chloroplast genome, we identified three intergenic regions and 94 simple sequence repeats (SSRs that are potentially informative markers with resolution for interspecific relationships. These markers can be utilized to better understand the origin of cultivated Citrus. A comparison among 72 species belonging to 10 families of representative rosids lineages also provides new insights into their chloroplast genome evolution.

  7. Comparative genomics of the fungal pathogens Candida dubliniensis and Candida albicans.

    LENUS (Irish Health Repository)

    Jackson, Andrew P

    2009-12-01

    Candida dubliniensis is the closest known relative of Candida albicans, the most pathogenic yeast species in humans. However, despite both species sharing many phenotypic characteristics, including the ability to form true hyphae, C. dubliniensis is a significantly less virulent and less versatile pathogen. Therefore, to identify C. albicans-specific genes that may be responsible for an increased capacity to cause disease, we have sequenced the C. dubliniensis genome and compared it with the known C. albicans genome sequence. Although the two genome sequences are highly similar and synteny is conserved throughout, 168 species-specific genes are identified, including some encoding known hyphal-specific virulence factors, such as the aspartyl proteinases Sap4 and Sap5 and the proposed invasin Als3. Among the 115 pseudogenes confirmed in C. dubliniensis are orthologs of several filamentous growth regulator (FGR) genes that also have suspected roles in pathogenesis. However, the principal differences in genomic repertoire concern expansion of the TLO gene family of putative transcription factors and the IFA family of putative transmembrane proteins in C. albicans, which represent novel candidate virulence-associated factors. The results suggest that the recent evolutionary histories of C. albicans and C. dubliniensis are quite different. While gene families instrumental in pathogenesis have been elaborated in C. albicans, C. dubliniensis has lost genomic capacity and key pathogenic functions. This could explain why C. albicans is a more potent pathogen in humans than C. dubliniensis.

  8. A general pipeline for the development of anchor markers for comparative genomics in plants

    Directory of Open Access Journals (Sweden)

    Stougaard Jens

    2006-08-01

    Full Text Available Abstract Background Complete or near-complete genomic sequence information is presently only available for a few plant species representing a large phylogenetic diversity among plants. In order to effectively transfer this information to species lacking sequence information, comparative genomic tools need to be developed. Molecular markers permitting cross-species mapping along co-linear genomic regions are central to comparative genomics. These "anchor" markers, defining unique loci in genetic linkage maps of multiple species, are gene-based and possess a number of features that make them relatively sparse. To identify potential anchor marker sequences more efficiently, we have established an automated bioinformatic pipeline that combines multi-species Expressed Sequence Tags (EST and genome sequence data. Results Taking advantage of sequence data from related species, the pipeline identifies evolutionarily conserved sequences that are likely to define unique orthologous loci in most species of the same phylogenetic clade. The key features are the identification of evolutionarily conserved sequences followed by automated design of intron-flanking Polymerase Chain Reaction (PCR primer pairs. Polymorphisms can subsequently be identified by size- or sequence variation of PCR products, amplified from mapping parents or populations. We illustrate our procedure in legumes and grasses and exemplify its application in legumes, where model plant studies and the genome- and EST-sequence data available have a potential impact on the breeding of crop species and on our understanding of the evolution of this large and diverse family. Conclusion We provide a database of 459 candidate anchor loci which have the potential to serve as map anchors in more than 18,000 legume species, a number of which are of agricultural importance. For grasses, the database contains 1335 candidate anchor loci. Based on this database, we have evaluated 76 candidate anchor loci

  9. Unraveling the message: insights into comparative genomics of the naked mole-rat.

    Science.gov (United States)

    Lewis, Kaitlyn N; Soifer, Ilya; Melamud, Eugene; Roy, Margaret; McIsaac, R Scott; Hibbs, Matthew; Buffenstein, Rochelle

    2016-08-01

    Animals have evolved to survive, and even thrive, in different environments. Genetic adaptations may have indirectly created phenotypes that also resulted in a longer lifespan. One example of this phenomenon is the preternaturally long-lived naked mole-rat. This strictly subterranean rodent tolerates hypoxia, hypercapnia, and soil-based toxins. Naked mole-rats also exhibit pronounced resistance to cancer and an attenuated decline of many physiological characteristics that often decline as mammals age. Elucidating mechanisms that give rise to their unique phenotypes will lead to better understanding of subterranean ecophysiology and biology of aging. Comparative genomics could be a useful tool in this regard. Since the publication of a naked mole-rat genome assembly in 2011, analyses of genomic and transcriptomic data have enabled a clearer understanding of mole-rat evolutionary history and suggested molecular pathways (e.g., NRF2-signaling activation and DNA damage repair mechanisms) that may explain the extraordinarily longevity and unique health traits of this species. However, careful scrutiny and re-analysis suggest that some identified features result from incorrect or imprecise annotation and assembly of the naked mole-rat genome: in addition, some of these conclusions (e.g., genes involved in cancer resistance and hairlessness) are rejected when the analysis includes additional, more closely related species. We describe how the combination of better study design, improved genomic sequencing techniques, and new bioinformatic and data analytical tools will improve comparative genomics and ultimately bridge the gap between traditional model and nonmodel organisms. PMID:27364349

  10. Systematic discovery of regulatory motifs in Fusarium graminearum by comparing four Fusarium genomes

    Directory of Open Access Journals (Sweden)

    Kistler Corby

    2010-03-01

    Full Text Available Abstract Background Fusarium graminearum (Fg, a major fungal pathogen of cultivated cereals, is responsible for billions of dollars in agriculture losses. There is a growing interest in understanding the transcriptional regulation of this organism, especially the regulation of genes underlying its pathogenicity. The generation of whole genome sequence assemblies for Fg and three closely related Fusarium species provides a unique opportunity for such a study. Results Applying comparative genomics approaches, we developed a computational pipeline to systematically discover evolutionarily conserved regulatory motifs in the promoter, downstream and the intronic regions of Fg genes, based on the multiple alignments of sequenced Fusarium genomes. Using this method, we discovered 73 candidate regulatory motifs in the promoter regions. Nearly 30% of these motifs are highly enriched in promoter regions of Fg genes that are associated with a specific functional category. Through comparison to Saccharomyces cerevisiae (Sc and Schizosaccharomyces pombe (Sp, we observed conservation of transcription factors (TFs, their binding sites and the target genes regulated by these TFs related to pathways known to respond to stress conditions or phosphate metabolism. In addition, this study revealed 69 and 39 conserved motifs in the downstream regions and the intronic regions, respectively, of Fg genes. The top intronic motif is the splice donor site. For the downstream regions, we noticed an intriguing absence of the mammalian and Sc poly-adenylation signals among the list of conserved motifs. Conclusion This study provides the first comprehensive list of candidate regulatory motifs in Fg, and underscores the power of comparative genomics in revealing functional elements among related genomes. The conservation of regulatory pathways among the Fusarium genomes and the two yeast species reveals their functional significance, and provides new insights in their

  11. A genetic linkage map and comparative mapping of the prairie vole (Microtus ochrogaster genome

    Directory of Open Access Journals (Sweden)

    Young Larry J

    2011-07-01

    Full Text Available Abstract Background The prairie vole (Microtus ochrogaster is an emerging rodent model for investigating the genetics, evolution and molecular mechanisms of social behavior. Though a karyotype for the prairie vole has been reported and low-resolution comparative cytogenetic analyses have been done in this species, other basic genetic resources for this species, such as a genetic linkage map, are lacking. Results Here we report the construction of a genome-wide linkage map of the prairie vole. The linkage map consists of 406 markers that are spaced on average every 7 Mb and span an estimated ~90% of the genome. The sex average length of the linkage map is 1707 cM, which, like other Muroid rodent linkage maps, is on the lower end of the length distribution of linkage maps reported to date for placental mammals. Linkage groups were assigned to 19 out of the 26 prairie vole autosomes as well as the X chromosome. Comparative analyses of the prairie vole linkage map based on the location of 387 Type I markers identified 61 large blocks of synteny with the mouse genome. In addition, the results of the comparative analyses revealed a potential elevated rate of inversions in the prairie vole lineage compared to the laboratory mouse and rat. Conclusions A genetic linkage map of the prairie vole has been constructed and represents the fourth genome-wide high-resolution linkage map reported for Muroid rodents and the first for a member of the Arvicolinae sub-family. This resource will advance studies designed to dissect the genetic basis of a variety of social behaviors and other traits in the prairie vole as well as our understanding of genome evolution in the genus Microtus.

  12. Insights into the Dekkera bruxellensis genomic landscape: comparative genomics reveals variations in ploidy and nutrient utilisation potential amongst wine isolates.

    Science.gov (United States)

    Borneman, Anthony R; Zeppel, Ryan; Chambers, Paul J; Curtin, Chris D

    2014-02-01

    The yeast Dekkera bruxellensis is a major contaminant of industrial fermentations, such as those used for the production of biofuel and wine, where it outlasts and, under some conditions, outcompetes the major industrial yeast Saccharomyces cerevisiae. In order to investigate the level of inter-strain variation that is present within this economically important species, the genomes of four diverse D. bruxellensis isolates were compared. While each of the four strains was shown to contain a core diploid genome, which is clearly sufficient for survival, two of the four isolates have a third haploid complement of chromosomes. The sequences of these additional haploid genomes were both highly divergent from those comprising the diploid core and divergent between the two triploid strains. Similar to examples in the Saccharomyces spp. clade, where some allotriploids have arisen on the basis of enhanced ability to survive a range of environmental conditions, it is likely these strains are products of two independent hybridisation events that may have involved multiple species or distinct sub-species of Dekkera. Interestingly these triploid strains represent the vast majority (92%) of isolates from across the Australian wine industry, suggesting that the additional set of chromosomes may confer a selective advantage in winery environments that has resulted in these hybrid strains all-but replacing their diploid counterparts in Australian winery settings. In addition to the apparent inter-specific hybridisation events, chromosomal aberrations such as strain-specific insertions and deletions and loss-of-heterozygosity by gene conversion were also commonplace. While these events are likely to have affected many phenotypes across these strains, we have been able to link a specific deletion to the inability to utilise nitrate by some strains of D. bruxellensis, a phenotype that may have direct impacts in the ability for these strains to compete with S. cerevisiae. PMID:24550744

  13. Insights into the Dekkera bruxellensis genomic landscape: comparative genomics reveals variations in ploidy and nutrient utilisation potential amongst wine isolates.

    Directory of Open Access Journals (Sweden)

    Anthony R Borneman

    2014-02-01

    Full Text Available The yeast Dekkera bruxellensis is a major contaminant of industrial fermentations, such as those used for the production of biofuel and wine, where it outlasts and, under some conditions, outcompetes the major industrial yeast Saccharomyces cerevisiae. In order to investigate the level of inter-strain variation that is present within this economically important species, the genomes of four diverse D. bruxellensis isolates were compared. While each of the four strains was shown to contain a core diploid genome, which is clearly sufficient for survival, two of the four isolates have a third haploid complement of chromosomes. The sequences of these additional haploid genomes were both highly divergent from those comprising the diploid core and divergent between the two triploid strains. Similar to examples in the Saccharomyces spp. clade, where some allotriploids have arisen on the basis of enhanced ability to survive a range of environmental conditions, it is likely these strains are products of two independent hybridisation events that may have involved multiple species or distinct sub-species of Dekkera. Interestingly these triploid strains represent the vast majority (92% of isolates from across the Australian wine industry, suggesting that the additional set of chromosomes may confer a selective advantage in winery environments that has resulted in these hybrid strains all-but replacing their diploid counterparts in Australian winery settings. In addition to the apparent inter-specific hybridisation events, chromosomal aberrations such as strain-specific insertions and deletions and loss-of-heterozygosity by gene conversion were also commonplace. While these events are likely to have affected many phenotypes across these strains, we have been able to link a specific deletion to the inability to utilise nitrate by some strains of D. bruxellensis, a phenotype that may have direct impacts in the ability for these strains to compete with S

  14. Comparative Genomics of Pathogens Causing Brown Spot Disease of Tobacco: Alternaria longipes and Alternaria alternata

    Science.gov (United States)

    Wan, Wenting; Long, Ni; Zhang, Jing; Tan, Yuntao; Duan, Shengchang; Zeng, Yan; Dong, Yang

    2016-01-01

    The genus Alternaria is a group of infectious/contagious pathogenic fungi that not only invade a wide range of crops but also induce severe allergic reactions in a part of the human population. In this study, two strains Alternaria longipes cx1 and Alternaria alternata cx2 were isolated from different brown spot lesions on infected tobacco leaves. Their complete genomes were sequenced, de novo assembled, and comparatively analyzed. Phylogenetic analysis revealed that A. longipes cx1 and A. alternata cx2 diverged 3.3 million years ago, indicating a recent event of speciation. Seventeen non-ribosomal peptide synthetase (NRPS) genes and 13 polyketide synthase (PKS) genes in A. longipes cx1 and 13 NRPS genes and 12 PKS genes in A. alternata cx2 were identified in these two strains. Some of these genes were predicted to participate in the synthesis of non-host specific toxins (non-HSTs), such as tenuazonic acid (TeA), alternariol (AOH) and alternariol monomethyl ether (AME). By comparative genome analysis, we uncovered that A. longipes cx1 had more genes putatively involved in pathogen-plant interaction, more carbohydrate-degrading enzymes and more secreted proteins than A. alternata cx2. In summary, our results demonstrate the genomic distinction between A. longipes cx1 and A. altenata cx2. They will not only improve the understanding of the phylogenetic relationship among genus Alternaria, but more importantly provide valuable genomic resources for the investigation of plant-pathogen interaction. PMID:27159564

  15. Comparative genomics of xylose-fermenting fungi for enhanced biofuel production

    Energy Technology Data Exchange (ETDEWEB)

    Wohlbach, Dana J.; Kuo, Alan; Sato, Trey K.; Potts, Katlyn M.; Salamov, Asaf A.; LaButti, Kurt M.; Sun, Hui; Clum, Alicia; Pangilinan, Jasmyn L.; Lindquist, Erika A.; Lucas, Susan; Lapidus, Alla; Jin, Mingjie; Gunawan, Christa; Balan, Venkatesh; Dale, Bruce E.; Jeffries, Thomas W.; Zinkel, Robert; Barry, Kerrie W.; Grigoriev, Igor V.; Gasch, Audrey P.

    2011-02-24

    Cellulosic biomass is an abundant and underused substrate for biofuel production. The inability of many microbes to metabolize the pentose sugars abundant within hemicellulose creates specific challenges for microbial biofuel production from cellulosic material. Although engineered strains of Saccharomyces cerevisiae can use the pentose xylose, the fermentative capacity pales in comparison with glucose, limiting the economic feasibility of industrial fermentations. To better understand xylose utilization for subsequent microbial engineering, we sequenced the genomes of two xylose-fermenting, beetle-associated fungi, Spathaspora passalidarum and Candida tenuis. To identify genes involved in xylose metabolism, we applied a comparative genomic approach across 14 Ascomycete genomes, mapping phenotypes and genotypes onto the fungal phylogeny, and measured genomic expression across five Hemiascomycete species with different xylose-consumption phenotypes. This approach implicated many genes and processes involved in xylose assimilation. Several of these genes significantly improved xylose utilization when engineered into S. cerevisiae, demonstrating the power of comparative methods in rapidly identifying genes for biomass conversion while reflecting on fungal ecology.

  16. Prediction of transcription regulatory sites in Archaea by a comparative genomic approach.

    Science.gov (United States)

    Gelfand, M S; Koonin, E V; Mironov, A A

    2000-02-01

    Intragenomic and intergenomic comparisons of upstream nucleotide sequences of archaeal genes were performed with the goal of predicting transcription regulatory sites (operators) and identifying likely regulons. Learning sets for the detection of regulatory sites were constructed using the available experimental data on archaeal transcription regulation or by analogy with known bacterial regulons, and further analysis was performed using iterative profile searches. The information content of the candidate signals detected by this method is insufficient for reliable predictions to be made. Therefore, this approach has to be complemented by examination of evolutionary conservation in different archaeal genomes. This combined strategy resulted in the prediction of a conserved heat shock regulon in all euryarchaea, a nitrogen fixation regulon in the methanogens Methanococcus jannaschii and Methanobacterium thermoautotrophicum and an aromatic amino acid regulon in M.thermoautotrophicum. Unexpectedly, the heat shock regulatory site was detected not only for genes that encode known chaperone proteins but also for archaeal histone genes. This suggests a possible function for archaeal histones in stress-related changes in DNA condensation. In addition, comparative analysis of the genomes of three Pyrococcus species resulted in the prediction of their purine metabolism and transport regulon. The results demonstrate the feasibility of prediction of at least some transcription regulatory sites by comparing poorly characterized prokaryotic genomes, particularly when several closely related genome sequences are available. PMID:10637320

  17. Comparative genomics Lactobacillus reuteri from sourdough reveals adaptation of an intestinal symbiont to food fermentations.

    Science.gov (United States)

    Zheng, Jinshui; Zhao, Xin; Lin, Xiaoxi B; Gänzle, Michael

    2015-01-01

    Lactobacillus reuteri is a dominant member of intestinal microbiota of vertebrates, and occurs in food fermentations. The stable presence of L. reuteri in sourdough provides the opportunity to study the adaptation of vertebrate symbionts to an extra-intestinal habitat. This study evaluated this adaptation by comparative genomics of 16 strains of L. reuteri. A core genome phylogenetic tree grouped L. reuteri into 5 clusters corresponding to the host-adapted lineages. The topology of a gene content tree, which includes accessory genes, differed from the core genome phylogenetic tree, suggesting that the differentiation of L. reuteri is shaped by gene loss or acquisition. About 10% of the core genome (124 core genes) were under positive selection. In lineage III sourdough isolates, 177 genes were under positive selection, mainly related to energy conversion and carbohydrate metabolism. The analysis of the competitiveness of L. reuteri in sourdough revealed that the competitivess of sourdough isolates was equal or higher when compared to rodent isolates. This study provides new insights into the adaptation of L. reuteri to food and intestinal habitats, suggesting that these two habitats exert different selective pressure related to growth rate and energy (carbohydrate) metabolism. PMID:26658825

  18. Comparative genomics of four closely related Clostridium perfringens bacteriophages reveals variable rates of evolution within a core genome

    Science.gov (United States)

    Background: Biotechnological uses of bacteriophage gene products as alternatives to conventional antibiotics will require a thorough understanding of their genomic context. We sequenced and analyzed the genomes of four closely related phages isolated from Clostridium perfringens, an important agricu...

  19. Synergistic use of plant-prokaryote comparative genomics for functional annotations

    Directory of Open Access Journals (Sweden)

    Waller Jeffrey C

    2011-06-01

    Full Text Available Abstract Background Identifying functions for all gene products in all sequenced organisms is a central challenge of the post-genomic era. However, at least 30-50% of the proteins encoded by any given genome are of unknown or vaguely known function, and a large number are wrongly annotated. Many of these ‘unknown’ proteins are common to prokaryotes and plants. We set out to predict and experimentally test the functions of such proteins. Our approach to functional prediction integrates comparative genomics based mainly on microbial genomes with functional genomic data from model microorganisms and post-genomic data from plants. This approach bridges the gap between automated homology-based annotations and the classical gene discovery efforts of experimentalists, and is more powerful than purely computational approaches to identifying gene-function associations. Results Among Arabidopsis genes, we focused on those (2,325 in total that (i are unique or belong to families with no more than three members, (ii occur in prokaryotes, and (iii have unknown or poorly known functions. Computer-assisted selection of promising targets for deeper analysis was based on homology-independent characteristics associated in the SEED database with the prokaryotic members of each family. In-depth comparative genomic analysis was performed for 360 top candidate families. From this pool, 78 families were connected to general areas of metabolism and, of these families, specific functional predictions were made for 41. Twenty-one predicted functions have been experimentally tested or are currently under investigation by our group in at least one prokaryotic organism (nine of them have been validated, four invalidated, and eight are in progress. Ten additional predictions have been independently validated by other groups. Discovering the function of very widespread but hitherto enigmatic proteins such as the YrdC or YgfZ families illustrates the power of our approach

  20. Multiwalled Carbon Nanotubes for Amperometric Array-Based Biosensors

    OpenAIRE

    Taurino, Irene; De Micheli, Giovanni; Carrara, Sandro

    2012-01-01

    For diagnostic and therapeutic purposes an accurate determination of multiple metabolites is often required. Amperometric devices are attractive tools to quantify biological compounds due to the direct conversion of a biochemical event to a current. This review addresses recent developments in the use of multiwalled carbon nanotubes to enhance detection ca- pability of amperometric array-based biosensors. More specifically, the principal techniques for multiwalled carbon nanotube incorporatio...

  1. Linear Microbolometric Array Based on VOx Thin Film

    Science.gov (United States)

    Chen, Xi-Qu

    2010-05-01

    In this paper, a linear microbolometric array based on VOx thin film is proposed. The linear microbolometric array is fabricated by using micromachining technology, and its thermo-sensitive VOx thin film has excellent infrared response spectrum and TCR characteristics. Integrated with CMOS circuit, an experimentally prototypical monolithic linear microbolometric array is designed and fabricated. The testing results of the experimental linear array show that the responsivity of linear array can approach 18KV/W and is potential for infrared image systems.

  2. A novel candidate vaccine for cytauxzoonosis inferred from comparative apicomplexan genomics.

    Directory of Open Access Journals (Sweden)

    Jaime L Tarigo

    Full Text Available Cytauxzoonosis is an emerging infectious disease of domestic cats (Felis catus caused by the apicomplexan protozoan parasite Cytauxzoon felis. The growing epidemic, with its high morbidity and mortality points to the need for a protective vaccine against cytauxzoonosis. Unfortunately, the causative agent has yet to be cultured continuously in vitro, rendering traditional vaccine development approaches beyond reach. Here we report the use of comparative genomics to computationally and experimentally interpret the C. felis genome to identify a novel candidate vaccine antigen for cytauxzoonosis. As a starting point we sequenced, assembled, and annotated the C. felis genome and the proteins it encodes. Whole genome alignment revealed considerable conserved synteny with other apicomplexans. In particular, alignments with the bovine parasite Theileria parva revealed that a C. felis gene, cf76, is syntenic to p67 (the leading vaccine candidate for bovine theileriosis, despite a lack of significant sequence similarity. Recombinant subdomains of cf76 were challenged with survivor-cat antiserum and found to be highly seroreactive. Comparison of eleven geographically diverse samples from the south-central and southeastern USA demonstrated 91-100% amino acid sequence identity across cf76, including a high level of conservation in an immunogenic 226 amino acid (24 kDa carboxyl terminal domain. Using in situ hybridization, transcription of cf76 was documented in the schizogenous stage of parasite replication, the life stage that is believed to be the most important for development of a protective immune response. Collectively, these data point to identification of the first potential vaccine candidate antigen for cytauxzoonosis. Further, our bioinformatic approach emphasizes the use of comparative genomics as an accelerated path to developing vaccines against experimentally intractable pathogens.

  3. A Model for Carbohydrate Metabolism in the Diatom Phaeodactylum tricornutum Deduced from Comparative Whole Genome Analysis

    OpenAIRE

    Kroth, Peter G.; Chiovitti, Anthony; Gruber, Ansgar; Martin-jezequel, Veronique; Mock, Thomas; Schnitzler Parker, Micaela; Michele S. Stanley; Kaplan, Aaron; Caron, Lise; Weber, Till; Maheswari, Uma; Armbrust, Elisabeth Virginia; Bowler, Chris

    2008-01-01

    Background:Diatoms are unicellular algae responsible for approximately 20% of global carbon fixation. Their evolution by secondary endocytobiosis resulted in a complex cellular structure and metabolism compared to algae with primary plastids.Methodology/Principal Findings:The whole genome sequence of the diatom Phaeodactylum tricornutum has recently been completed. We identified and annotated genes for enzymes involved in carbohydrate pathways based on extensive EST support and comparison to ...

  4. Comparative Analysis of Fungal Genomes Reveals Different Plant Cell Wall Degrading Capacity in Fungi

    OpenAIRE

    Zhao, Zhongtao; Liu, Huiquan; Wang, Chenfang; Xu, Jin-Rong

    2013-01-01

    EDITOR'S NOTE Readers are alerted that there is currently a discussion regarding the use of some of the unpublished genomic data presented in this manuscript. Appropriate editorial action will be taken once this matter is resolved. Background Fungi produce a variety of carbohydrate activity enzymes (CAZymes) for the degradation of plant polysaccharide materials to facilitate infection and/or gain nutrition. Identifying and comparing CAZymes from fungi with different nutritional modes o...

  5. Comparative genomics of Toll-like receptor signalling in five species

    Directory of Open Access Journals (Sweden)

    Wu Chunhua

    2009-05-01

    Full Text Available Abstract Background Over the last decade, several studies have identified quantitative trait loci (QTL affecting variation of immune related traits in mammals. Recent studies in humans and mice suggest that part of this variation may be caused by polymorphisms in genes involved in Toll-like receptor (TLR signalling. In this project, we used a comparative approach to investigate the importance of TLR-related genes in comparison with other immunologically relevant genes for resistance traits in five species by associating their genomic location with previously published immune-related QTL regions. Results We report the genomic localisation of TLR1-10 and ten associated signalling molecules in sheep and pig using in-silico and/or radiation hybrid (RH mapping techniques and compare their positions with their annotated homologues in the human, cattle and mouse whole genome sequences. We also report medium-density RH maps for porcine chromosomes 8 and 13. A comparative analysis of the positions of previously published relevant QTLs allowed the identification of homologous regions that are associated with similar health traits in several species and which contain TLR related and other immunologically relevant genes. Additional evidence was gathered by examining relevant gene expression and association studies. Conclusion This comparative genomic approach identified eight genes as potentially causative genes for variations of health related traits. These include susceptibility to clinical mastitis in dairy cattle, general disease resistance in sheep, cattle, humans and mice, and tolerance to protozoan infection in cattle and mice. Four TLR-related genes (TLR1, 6, MyD88, IRF3 appear to be the most likely candidate genes underlying QTL regions which control the resistance to the same or similar pathogens in several species. Further studies are required to investigate the potential role of polymorphisms within these genes.

  6. Study of Modern Human Evolution via Comparative Analysis with the Neanderthal Genome

    OpenAIRE

    Ahmed, Musaddeque; Liang, Ping

    2013-01-01

    Many other human species appeared in evolution in the last 6 million years that have not been able to survive to modern times and are broadly known as archaic humans, as opposed to the extant modern humans. It has always been considered fascinating to compare the modern human genome with that of archaic humans to identify modern human-specific sequence variants and figure out those that made modern humans different from their predecessors or cousin species. Neanderthals are the latest humans ...

  7. Comparative Genomic and Functional Analysis of Lactobacillus casei and Lactobacillus rhamnosus Strains Marketed as Probiotics

    OpenAIRE

    Douillard, François P.; Ribbera, Angela; Järvinen, Hanna M.; Kant, Ravi; Pietilä, Taija E.; Randazzo, Cinzia; Paulin, Lars; Laine, Pia K; Caggia, Cinzia; von Ossowski, Ingemar; Reunanen, Justus; Satokari, Reetta; Salminen, Seppo; Palva, Airi; de Vos, Willem M

    2013-01-01

    Four Lactobacillus strains were isolated from marketed probiotic products, including L. rhamnosus strains from Vifit (Friesland Campina) and Idoform (Ferrosan) and L. casei strains from Actimel (Danone) and Yakult (Yakult Honsa Co.). Their genomes and phenotypes were characterized and compared in detail with L. casei strain BL23 and L. rhamnosus strain GG. Phenotypic analysis of the new isolates indicated differences in carbohydrate utilization between L. casei and L. rhamnosus strains, which...

  8. Microarray-Based Comparative Genomic Hybridization in Neurofibromatoses and DiGeorge Syndrome

    OpenAIRE

    Mantripragada, Kiran K.

    2005-01-01

    Microarray-based comparative genomic hybridization (array-CGH) has emerged as a versatile platform with a wide range of applications in molecular genetics. This thesis focuses on the development of array-CGH with a specific aim to approach disease-related questions through improved strategies in array construction and enhanced resolution of analysis. In paper I, we applied an array covering 11 Mb of 22q, encompassing the NF2 locus, for deletion detection in sporadic schwannoma. Hemizygous del...

  9. Leveraging wall-sized high-resolution displays for comparative genomics analyses of copy number variation

    OpenAIRE

    Ruddle, RA; Fateen, W; Treanor, D; Quirke, P.; Sondergeld, P

    2013-01-01

    The scale of comparative genomics data frequently overwhelms current data visualization methods on conventional (desktop) displays. This paper describes two types of solution that take advantage of wall-sized high-resolution displays (WHirDs), which have orders of magnitude more display real estate (i.e., pixels) than desktop displays. The first allows users to view detailed graphics of copy number variation (CNV) that were output by existing software. A WHirD's resolution allowed a 10× incre...

  10. Psittacid Herpesvirus 1 and Infectious Laryngotracheitis Virus: Comparative Genome Sequence Analysis of Two Avian Alphaherpesviruses

    OpenAIRE

    Thureen, Dean R.; Keeler, Calvin L.

    2006-01-01

    Psittacid herpesvirus 1 (PsHV-1) is the causative agent of Pacheco's disease, an acute, highly contagious, and potentially lethal respiratory herpesvirus infection in psittacine birds, while infectious laryngotracheitis virus (ILTV) is a highly contagious and economically significant avian herpesvirus which is responsible for an acute respiratory disease limited to galliform birds. The complete genome sequence of PsHV-1 has been determined and compared to the ILTV sequence, assembled from pub...

  11. Comparative genomics of the fungal pathogens Candida dubliniensis and Candida albicans

    OpenAIRE

    Jackson, Andrew P; Gamble, John A.; Yeomans, Tim; Moran, Gary P.; Saunders, David; Harris, David; Aslett, Martin; Barrell, Jamie F.; Butler, Geraldine; Citiulo, Francesco; Coleman, David C.; de Groot, Piet W. J.; Goodwin, Tim J.; Quail, Michael A.; McQuillan, Jacqueline

    2009-01-01

    Candida dubliniensis is the closest known relative of Candida albicans, the most pathogenic yeast species in humans. However, despite both species sharing many phenotypic characteristics, including the ability to form true hyphae, C. dubliniensis is a significantly less virulent and less versatile pathogen. Therefore, to identify C. albicans-specific genes that may be responsible for an increased capacity to cause disease, we have sequenced the C. dubliniensis genome and compared it with the ...

  12. Comparative genomics reveals multiple pathways to mutualism for tick-borne pathogens

    OpenAIRE

    Lockwood, Svetlana; Brayton, Kelly A.; Broschat, Shira L.

    2016-01-01

    Background Multiple important human and livestock pathogens employ ticks as their primary host vectors. It is not currently known whether this means of infecting a host arose once or many times during evolution. Results In order to address this question, we conducted a comparative genomics analysis on a set of bacterial pathogens from seven genera – Borrelia, Rickettsia, Anaplasma, Ehrlichia, Francisella, Coxiella, and Bartonella, including species from three different host vectors – ticks, l...

  13. Comparative genomics reveals conservative evolution of the xylem transcriptome in vascular plants

    OpenAIRE

    Wu Harry X; Li Xinguo; Southerton Simon G

    2010-01-01

    Abstract Background Wood is a valuable natural resource and a major carbon sink. Wood formation is an important developmental process in vascular plants which played a crucial role in plant evolution. Although genes involved in xylem formation have been investigated, the molecular mechanisms of xylem evolution are not well understood. We use comparative genomics to examine evolution of the xylem transcriptome to gain insights into xylem evolution. Results The xylem transcriptome is highly con...

  14. Comparative genomics reveals conservative evolution of the xylem transcriptome in vascular plants

    OpenAIRE

    Li, Xinguo; Wu, Harry X.; Southerton, Simon G

    2010-01-01

    Background Wood is a valuable natural resource and a major carbon sink. Wood formation is an important developmental process in vascular plants which played a crucial role in plant evolution. Although genes involved in xylem formation have been investigated, the molecular mechanisms of xylem evolution are not well understood. We use comparative genomics to examine evolution of the xylem transcriptome to gain insights into xylem evolution. Results The xylem transcriptome is highly conserved in...

  15. Chromosomal Localization of DNA Amplifications in Neuroblastoma Tumors Using cDNA Microarray Comparative Genomic Hybridization

    Directory of Open Access Journals (Sweden)

    Ben Beheshti

    2003-01-01

    Full Text Available Conventional comparative genomic hybridization (CGH profiling of neuroblastomas has identified many genomic aberrations, although the limited resolution has precluded a precise localization of sequences of interest within amplicons. To map high copy number genomic gains in clinically matched stage IV neuroblastomas, CGH analysis using a 19,200-feature cDNA microarray was used. A dedicated (freely available algorithm was developed for rapid in silico determination of chromosomal localizations of microarray cDNA targets, and for generation of an ideogram-type profile of copy number changes. Using these methodologies, novel gene amplifications undetectable by chromosome CGH were identified, and larger MYCN amplicon sizes (in one tumor up to 6 Mb than those previously reported in neuroblastoma were identified. The genes HPCAL1, LPIN1/KIAA0188, NAG, and NSE1/LOC151354 were found to be coamplified with MYCN. To determine whether stage IV primary tumors could be further subclassified based on their genomic copy number profiles, hierarchical clustering was performed. Cluster analysis of microarray CGH data identified three groups: 1 no amplifications evident, 2 a small MYCN amplicon as the only detectable imbalance, and 3 a large MYCN amplicon with additional gene amplifications. Application of CGH to cDNA microarray targets will help to determine both the variation of amplicon size and help better define amplification-dependent and independent pathways of progression in neuroblastoma.

  16. Comparative analysis of mitochondrial genomes of five aphid species (Hemiptera: Aphididae and phylogenetic implications.

    Directory of Open Access Journals (Sweden)

    Yuan Wang

    Full Text Available Insect mitochondrial genomes (mitogenomes are of great interest in exploring molecular evolution, phylogenetics and population genetics. Only two mitogenomes have been previously released in the insect group Aphididae, which consists of about 5,000 known species including some agricultural, forestry and horticultural pests. Here we report the complete 16,317 bp mitogenome of Cavariella salicicola and two nearly complete mitogenomes of Aphis glycines and Pterocomma pilosum. We also present a first comparative analysis of mitochondrial genomes of aphids. Results showed that aphid mitogenomes share conserved genomic organization, nucleotide and amino acid composition, and codon usage features. All 37 genes usually present in animal mitogenomes were sequenced and annotated. The analysis of gene evolutionary rate revealed the lowest and highest rates for COI and ATP8, respectively. A unique repeat region exclusively in aphid mitogenomes, which included variable numbers of tandem repeats in a lineage-specific manner, was highlighted for the first time. This region may have a function as another origin of replication. Phylogenetic reconstructions based on protein-coding genes and the stem-loop structures of control regions confirmed a sister relationship between Cavariella and pterocommatines. Current evidence suggest that pterocommatines could be formally transferred into Macrosiphini. Our paper also offers methodological instructions for obtaining other Aphididae mitochondrial genomes.

  17. Comparative genome sequencing of drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    Energy Technology Data Exchange (ETDEWEB)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  18. The chloroplast genome of the hexaploid Spartina maritima (Poaceae, Chloridoideae): Comparative analyses and molecular dating.

    Science.gov (United States)

    Rousseau-Gueutin, M; Bellot, S; Martin, G E; Boutte, J; Chelaifa, H; Lima, O; Michon-Coudouel, S; Naquin, D; Salmon, A; Ainouche, K; Ainouche, M

    2015-12-01

    The history of many plant lineages is complicated by reticulate evolution with cases of hybridization often followed by genome duplication (allopolyploidy). In such a context, the inference of phylogenetic relationships and biogeographic scenarios based on molecular data is easier using haploid markers like chloroplast genome sequences. Hybridization and polyploidization occurred recurrently in the genus Spartina (Poaceae, Chloridoideae), as illustrated by the recent formation of the invasive allododecaploid S. anglica during the 19th century in Europe. Until now, only a few plastid markers were available to explore the history of this genus and their low variability limited the resolution of species relationships. We sequenced the complete chloroplast genome (plastome) of S. maritima, the native European parent of S. anglica, and compared it to the plastomes of other Poaceae. Our analysis revealed the presence of fast-evolving regions of potential taxonomic, phylogeographic and phylogenetic utility at various levels within the Poaceae family. Using secondary calibrations, we show that the tetraploid and hexaploid lineages of Spartina diverged 6-10 my ago, and that the two parents of the invasive allopolyploid S. anglica separated 2-4 my ago via long distance dispersal of the ancestor of S. maritima over the Atlantic Ocean. Finally, we discuss the meaning of divergence times between chloroplast genomes in the context of reticulate evolution. PMID:26182838

  19. Comparative Analysis of 35 Basidiomycete Genomes Reveals Diversity and Uniqueness of the Phylum

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Otillar, Robert; Fagnan, Kirsten; Boussau, Bastien; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Held, Benjamin; Nagy, Laszlo; Floudas, Dimitris; Morin, Emmanuelle; Manning, Gerard; Baker, Scott; Martin, Francis; Blanchette, Robert; Hibbett, David; Grigoriev, Igor V.

    2013-03-11

    Fungi of the phylum Basidiomycota (basidiomycetes), make up some 37percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprobes including wood decaying fungi. To better understand the diversity of this phylum we compared the genomes of 35 basidiomycete fungi including 6 newly sequenced genomes. The genomes of basidiomycetes span extremes of genome size, gene number, and repeat content. A phylogenetic tree of Basidiomycota was generated using the Phyldog software, which uses all available protein sequence data to simultaneously infer gene and species trees. Analysis of core genes reveals that some 48percent of basidiomycete proteins are unique to the phylum with nearly half of those (22percent) comprising proteins found in only one organism. Phylogenetic patterns of plant biomass-degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay among the members of Agaricomycotina subphylum. There is a correlation of the profile of certain gene families to nutritional mode in Agaricomycotina. Based on phylogenetically-informed PCA analysis of such profiles, we predict that that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has liginolytic class II fungal peroxidases. Furthermore, we find that both fungi exhibit wood decay with white rot-like characteristics in growth assays. Analysis of the rate of discovery of proteins with no or few homologs suggests the high value of continued sequencing of basidiomycete fungi.

  20. DNA Slippage Occurs at Microsatellite Loci without Minimal Threshold Length in Humans: A Comparative Genomic Approach

    Science.gov (United States)

    Leclercq, Sébastien; Rivals, Eric; Jarne, Philippe

    2010-01-01

    The dynamics of microsatellite, or short tandem repeats (STRs), is well documented for long, polymorphic loci, but much less is known for shorter ones. For example, the issue of a minimum threshold length for DNA slippage remains contentious. Model-fitting methods have generally concluded that slippage only occurs over a threshold length of about eight nucleotides, in contradiction with some direct observations of tandem duplications at shorter repeated sites. Using a comparative analysis of the human and chimpanzee genomes, we examined the mutation patterns at microsatellite loci with lengths as short as one period plus one nucleotide. We found that the rates of tandem insertions and deletions at microsatellite loci strongly deviated from background rates in other parts of the human genome and followed an exponential increase with STR size. More importantly, we detected no lower threshold length for slippage. The rate of tandem duplications at unrepeated sites was higher than expected from random insertions, providing evidence for genome-wide action of indel slippage (an alternative mechanism generating tandem repeats). The rate of point mutations adjacent to STRs did not differ from that estimated elsewhere in the genome, except around dinucleotide loci. Our results suggest that the emergence of STR depends on DNA slippage, indel slippage, and point mutations. We also found that the dynamics of tandem insertions and deletions differed in both rates and size at which these mutations take place. We discuss these results in both evolutionary and mechanistic terms. PMID:20624737

  1. Comparative genomic and proteomic analysis of high grade glioma primary cultures and matched tumor in situ.

    LENUS (Irish Health Repository)

    Howley, R

    2012-10-15

    Developing targeted therapies for high grade gliomas (HGG), the most common primary brain tumor in adults, relies largely on glioma cultures. However, it is unclear if HGG tumorigenic signaling pathways are retained under in-vitro conditions. Using array comparative genomic hybridization and immunohistochemical profiling, we contrasted the epidermal and platelet-derived growth factor receptor (EGFR\\/PDGFR) in-vitro pathway status of twenty-six primary HGG cultures with the pathway status of their original HGG biopsies. Genomic gains or amplifications were lost during culturing while genomic losses were more likely to be retained. Loss of EGFR amplification was further verified immunohistochemically when EGFR over expression was decreased in the majority of cultures. Conversely, PDGFRα and PDGFRβ were more abundantly expressed in primary cultures than in the original tumor (p<0.05). Despite these genomic and proteomic differences, primary HGG cultures retained key aspects of dysregulated tumorigenic signaling. Both in-vivo and in-vitro the presence of EGFR resulted in downstream activation of P70s6K while reduced downstream activation was associated with the presence of PDGFR and the tumor suppressor, PTEN. The preserved pathway dysregulation make this glioma model suitable for further studies of glioma tumorigenesis, however individual culture related differences must be taken into consideration when testing responsiveness to chemotherapeutic agents.

  2. What can comparative genomics tell us about species concepts in the genus Aspergillus?

    Energy Technology Data Exchange (ETDEWEB)

    Rokas, Antonis; payne, gary; Federova, Natalie D.; Baker, Scott E.; Machida, Masa; yu, Jiujiang; georgianna, D. R.; Dean, Ralph A.; Bhatnagar, Deepak; Cleveland, T. E.; Wortman, Jennifer R.; Maiti, R.; Joardar, V.; Amedeo, Paolo; Denning, David W.; Nierman, William C.

    2007-12-15

    Understanding the nature of species" boundaries is a fundamental question in evolutionary biology. The availability of genomes from several species of the genus Aspergillus allows us for the first time to examine the demarcation of fungal species at the whole-genome level. Here, we examine four case studies, two of which involve intraspecific comparisons, whereas the other two deal with interspecific genomic comparisons between closely related species. These four comparisons reveal significant variation in the nature of species boundaries across Aspergillus. For example, comparisons between A. fumigatus and Neosartorya fischeri (the teleomorph of A. fischerianus) and between A. oryzae and A. flavus suggest that measures of sequence similarity and species-specific genes are significantly higher for the A. fumigatus - N. fischeri pair. Importantly, the values obtained from the comparison between A. oryzae and A. flavus are remarkably similar to those obtained from an intra-specific comparison of A. fumigatus strains, giving support to the proposal that A. oryzae represents a distinct ecotype of A. flavus and not a distinct species. We argue that genomic data can aid Aspergillus taxonomy by serving as a source of novel and unprecedented amounts of comparative data, as a resource for the development of additional diagnostic tools, and finally as a knowledge database about the biological differences between strains and species.

  3. Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication.

    Science.gov (United States)

    Montague, Michael J; Li, Gang; Gandolfi, Barbara; Khan, Razib; Aken, Bronwen L; Searle, Steven M J; Minx, Patrick; Hillier, LaDeana W; Koboldt, Daniel C; Davis, Brian W; Driscoll, Carlos A; Barr, Christina S; Blackistone, Kevin; Quilez, Javier; Lorente-Galdos, Belen; Marques-Bonet, Tomas; Alkan, Can; Thomas, Gregg W C; Hahn, Matthew W; Menotti-Raymond, Marilyn; O'Brien, Stephen J; Wilson, Richard K; Lyons, Leslie A; Murphy, William J; Warren, Wesley C

    2014-12-01

    Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae. PMID:25385592

  4. UniPrimer: A Web-Based Primer Design Tool for Comparative Analyses of Primate Genomes

    Directory of Open Access Journals (Sweden)

    Nomin Batnyam

    2012-01-01

    Full Text Available Whole genome sequences of various primates have been released due to advanced DNA-sequencing technology. A combination of computational data mining and the polymerase chain reaction (PCR assay to validate the data is an excellent method for conducting comparative genomics. Thus, designing primers for PCR is an essential procedure for a comparative analysis of primate genomes. Here, we developed and introduced UniPrimer for use in those studies. UniPrimer is a web-based tool that designs PCR- and DNA-sequencing primers. It compares the sequences from six different primates (human, chimpanzee, gorilla, orangutan, gibbon, and rhesus macaque and designs primers on the conserved region across species. UniPrimer is linked to RepeatMasker, Primer3Plus, and OligoCalc softwares to produce primers with high accuracy and UCSC In-Silico PCR to confirm whether the designed primers work. To test the performance of UniPrimer, we designed primers on sample sequences using UniPrimer and manually designed primers for the same sequences. The comparison of the two processes showed that UniPrimer was more effective than manual work in terms of saving time and reducing errors.

  5. UniPrimer: A Web-Based Primer Design Tool for Comparative Analyses of Primate Genomes.

    Science.gov (United States)

    Batnyam, Nomin; Lee, Jimin; Lee, Jungnam; Hong, Seung Bok; Oh, Sejong; Han, Kyudong

    2012-01-01

    Whole genome sequences of various primates have been released due to advanced DNA-sequencing technology. A combination of computational data mining and the polymerase chain reaction (PCR) assay to validate the data is an excellent method for conducting comparative genomics. Thus, designing primers for PCR is an essential procedure for a comparative analysis of primate genomes. Here, we developed and introduced UniPrimer for use in those studies. UniPrimer is a web-based tool that designs PCR- and DNA-sequencing primers. It compares the sequences from six different primates (human, chimpanzee, gorilla, orangutan, gibbon, and rhesus macaque) and designs primers on the conserved region across species. UniPrimer is linked to RepeatMasker, Primer3Plus, and OligoCalc softwares to produce primers with high accuracy and UCSC In-Silico PCR to confirm whether the designed primers work. To test the performance of UniPrimer, we designed primers on sample sequences using UniPrimer and manually designed primers for the same sequences. The comparison of the two processes showed that UniPrimer was more effective than manual work in terms of saving time and reducing errors. PMID:22693428

  6. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available Few studies investigated the donkey (Equus asinus at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca. The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing and Ion Torrent (RRL runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  7. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    Science.gov (United States)

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450

  8. Shared Y chromosome repetitive DNA sequences in stallion and donkey as visualized using whole-genomic comparative hybridization

    Directory of Open Access Journals (Sweden)

    R. Mezzanotte

    2010-01-01

    Full Text Available The genome of stallion (Spanish breed and donkey (Spanish endemic Zamorano-Leonés were compared using whole comparative genomic in situ hybridization (W-CGH technique, with special reference to the variability observed in the Y chromosome. Results show that these diverging genomes still share some highly repetitive DNA families localized in pericentromeric regions and, in the particular case of the Y chromosome, a sub-family of highly repeated DNA sequences, greatly expanded in the donkey genome, accounts for a large part of the chromatin in the stallion Y chromosome.

  9. Significance of genomic instability in breast cancer in atomic bomb survivors: analysis of microarray-comparative genomic hybridization

    Directory of Open Access Journals (Sweden)

    Oikawa Masahiro

    2011-12-01

    Full Text Available Abstract Background It has been postulated that ionizing radiation induces breast cancers among atomic bomb (A-bomb survivors. We have reported a higher incidence of HER2 and C-MYC oncogene amplification in breast cancers from A-bomb survivors. The purpose of this study was to clarify the effect of A-bomb radiation exposure on genomic instability (GIN, which is an important hallmark of carcinogenesis, in archival formalin-fixed paraffin-embedded (FFPE tissues of breast cancer by using microarray-comparative genomic hybridization (aCGH. Methods Tumor DNA was extracted from FFPE tissues of invasive ductal cancers from 15 survivors who were exposed at 1.5 km or less from the hypocenter and 13 calendar year-matched non-exposed patients followed by aCGH analysis using a high-density oligonucleotide microarray. The total length of copy number aberrations (CNA was used as an indicator of GIN, and correlation with clinicopathological factors were statistically tested. Results The mean of the derivative log ratio spread (DLRSpread, which estimates the noise by calculating the spread of log ratio differences between consecutive probes for all chromosomes, was 0.54 (range, 0.26 to 1.05. The concordance of results between aCGH and fluorescence in situ hybridization (FISH for HER2 gene amplification was 88%. The incidence of HER2 amplification and histological grade was significantly higher in the A-bomb survivors than control group (P = 0.04, respectively. The total length of CNA tended to be larger in the A-bomb survivors (P = 0.15. Correlation analysis of CNA and clinicopathological factors revealed that DLRSpread was negatively correlated with that significantly (P = 0.034, r = -0.40. Multivariate analysis with covariance revealed that the exposure to A-bomb was a significant (P = 0.005 independent factor which was associated with larger total length of CNA of breast cancers. Conclusions Thus, archival FFPE tissues from A-bomb survivors are useful for

  10. Significance of genomic instability in breast cancer in atomic bomb survivors: analysis of microarray-comparative genomic hybridization

    International Nuclear Information System (INIS)

    It has been postulated that ionizing radiation induces breast cancers among atomic bomb (A-bomb) survivors. We have reported a higher incidence of HER2 and C-MYC oncogene amplification in breast cancers from A-bomb survivors. The purpose of this study was to clarify the effect of A-bomb radiation exposure on genomic instability (GIN), which is an important hallmark of carcinogenesis, in archival formalin-fixed paraffin-embedded (FFPE) tissues of breast cancer by using microarray-comparative genomic hybridization (aCGH). Tumor DNA was extracted from FFPE tissues of invasive ductal cancers from 15 survivors who were exposed at 1.5 km or less from the hypocenter and 13 calendar year-matched non-exposed patients followed by aCGH analysis using a high-density oligonucleotide microarray. The total length of copy number aberrations (CNA) was used as an indicator of GIN, and correlation with clinicopathological factors were statistically tested. The mean of the derivative log ratio spread (DLRSpread), which estimates the noise by calculating the spread of log ratio differences between consecutive probes for all chromosomes, was 0.54 (range, 0.26 to 1.05). The concordance of results between aCGH and fluorescence in situ hybridization (FISH) for HER2 gene amplification was 88%. The incidence of HER2 amplification and histological grade was significantly higher in the A-bomb survivors than control group (P = 0.04, respectively). The total length of CNA tended to be larger in the A-bomb survivors (P = 0.15). Correlation analysis of CNA and clinicopathological factors revealed that DLRSpread was negatively correlated with that significantly (P = 0.034, r = -0.40). Multivariate analysis with covariance revealed that the exposure to A-bomb was a significant (P = 0.005) independent factor which was associated with larger total length of CNA of breast cancers. Thus, archival FFPE tissues from A-bomb survivors are useful for genome-wide aCGH analysis. Our results suggested that A

  11. Mitochondrial comparative genomics and phylogenetic signal assessment of mtDNA among arbuscular mycorrhizal fungi.

    Science.gov (United States)

    Nadimi, Maryam; Daubois, Laurence; Hijri, Mohamed

    2016-05-01

    Mitochondrial (mt) genes, such as cytochrome C oxidase genes (cox), have been widely used for barcoding in many groups of organisms, although this approach has been less powerful in the fungal kingdom due to the rapid evolution of their mt genomes. The use of mt genes in phylogenetic studies of Dikarya has been met with success, while early diverging fungal lineages remain less studied, particularly the arbuscular mycorrhizal fungi (AMF). Advances in next-generation sequencing have substantially increased the number of publically available mtDNA sequences for the Glomeromycota. As a result, comparison of mtDNA across key AMF taxa can now be applied to assess the phylogenetic signal of individual mt coding genes, as well as concatenated subsets of coding genes. Here we show comparative analyses of publically available mt genomes of Glomeromycota, augmented with two mtDNA genomes that were newly sequenced for this study (Rhizophagus irregularis DAOM240159 and Glomus aggregatum DAOM240163), resulting in 16 complete mtDNA datasets. R. irregularis isolate DAOM240159 and G. aggregatum isolate DAOM240163 showed mt genomes measuring 72,293bp and 69,505bp with G+C contents of 37.1% and 37.3%, respectively. We assessed the phylogenies inferred from single mt genes and complete sets of coding genes, which are referred to as "supergenes" (16 concatenated coding genes), using Shimodaira-Hasegawa tests, in order to identify genes that best described AMF phylogeny. We found that rnl, nad5, cox1, and nad2 genes, as well as concatenated subset of these genes, provided phylogenies that were similar to the supergene set. This mitochondrial genomic analysis was also combined with principal coordinate and partitioning analyses, which helped to unravel certain evolutionary relationships in the Rhizophagus genus and for G. aggregatum within the Glomeromycota. We showed evidence to support the position of G. aggregatum within the R. irregularis 'species complex'. PMID:26868331

  12. Microarray-based comparative genomic hybridization analysis in neonates with congenital anomalies: detection of chromosomal imbalances

    Directory of Open Access Journals (Sweden)

    Luiza Emy Dorfman

    2015-02-01

    Full Text Available OBJECTIVE: To identify chromosomal imbalances by whole-genome microarray-based comparative genomic hybridization (array-CGH in DNA samples of neonates with congenital anomalies of unknown cause from a birth defects monitoring program at a public maternity hospital. METHODS: A blind genomic analysis was performed retrospectively in 35 stored DNA samples of neonates born between July of 2011 and December of 2012. All potential DNA copy number variations detected (CNVs were matched with those reported in public genomic databases, and their clinical significance was evaluated. RESULTS: Out of a total of 35 samples tested, 13 genomic imbalances were detected in 12/35 cases (34.3%. In 4/35 cases (11.4%, chromosomal imbalances could be defined as pathogenic; in 5/35 (14.3% cases, DNA CNVs of uncertain clinical significance were identified; and in 4/35 cases (11.4%, normal variants were detected. Among the four cases with results considered causally related to the clinical findings, two of the four (50% showed causative alterations already associated with well-defined microdeletion syndromes. In two of the four samples (50%, the chromosomal imbalances found, although predicted as pathogenic, had not been previously associated with recognized clinical entities. CONCLUSIONS: Array-CGH analysis allowed for a higher rate of detection of chromosomal anomalies, and this determination is especially valuable in neonates with congenital anomalies of unknown etiology, or in cases in which karyotype results cannot be obtained. Moreover, although the interpretation of the results must be refined, this method is a robust and precise tool that can be used in the first-line investigation of congenital anomalies, and should be considered for prospective/retrospective analyses of DNA samples by birth defect monitoring programs.

  13. Comparative genomics of Gardnerella vaginalis strains reveals substantial differences in metabolic and virulence potential.

    Directory of Open Access Journals (Sweden)

    Carl J Yeoman

    Full Text Available BACKGROUND: Gardnerella vaginalis is described as a common vaginal bacterial species whose presence correlates strongly with bacterial vaginosis (BV. Here we report the genome sequencing and comparative analyses of three strains of G. vaginalis. Strains 317 (ATCC 14019 and 594 (ATCC 14018 were isolated from the vaginal tracts of women with symptomatic BV, while Strain 409-05 was isolated from a healthy, asymptomatic individual with a Nugent score of 9. PRINCIPAL FINDINGS: Substantial genomic rearrangement and heterogeneity were observed that appeared to have resulted from both mobile elements and substantial lateral gene transfer. These genomic differences translated to differences in metabolic potential. All strains are equipped with significant virulence potential, including genes encoding the previously described vaginolysin, pili for cytoadhesion, EPS biosynthetic genes for biofilm formation, and antimicrobial resistance systems, We also observed systems promoting multi-drug and lantibiotic extrusion. All G. vaginalis strains possess a large number of genes that may enhance their ability to compete with and exclude other vaginal colonists. These include up to six toxin-antitoxin systems and up to nine additional antitoxins lacking cognate toxins, several of which are clustered within each genome. All strains encode bacteriocidal toxins, including two lysozyme-like toxins produced uniquely by strain 409-05. Interestingly, the BV isolates encode numerous proteins not found in strain 409-05 that likely increase their pathogenic potential. These include enzymes enabling mucin degradation, a trait previously described to strongly correlate with BV, although commonly attributed to non-G. vaginalis species. CONCLUSIONS: Collectively, our results indicate that all three strains are able to thrive in vaginal environments, and therein the BV isolates are capable of occupying a niche that is unique from 409-05. Each strain has significant virulence

  14. Comparative genomics and proteomics of Helicobacter mustelae, an ulcerogenic and carcinogenic gastric pathogen

    LENUS (Irish Health Repository)

    O'Toole, Paul W

    2010-03-10

    Abstract Background Helicobacter mustelae causes gastritis, ulcers and gastric cancer in ferrets and other mustelids. H. mustelae remains the only helicobacter other than H. pylori that causes gastric ulceration and cancer in its natural host. To improve understanding of H. mustelae pathogenesis, and the ulcerogenic and carcinogenic potential of helicobacters in general, we sequenced the H. mustelae genome, and identified 425 expressed proteins in the envelope and cytosolic proteome. Results The H. mustelae genome lacks orthologs of major H. pylori virulence factors including CagA, VacA, BabA, SabA and OipA. However, it encodes ten autotransporter surface proteins, seven of which were detected in the expressed proteome, and which, except for the Hsr protein, are of unknown function. There are 26 putative outer membrane proteins in H. mustelae, some of which are most similar to the Hof proteins of H. pylori. Although homologs of putative virulence determinants of H. pylori (NapA, plasminogen adhesin, collagenase) and Campylobacter jejuni (CiaB, Peb4a) are present in the H. mustelae genome, it also includes a distinct complement of virulence-related genes including a haemagglutinin\\/haemolysin protein, and a glycosyl transferase for producing blood group A\\/B on its lipopolysaccharide. The most highly expressed 264 proteins in the cytosolic proteome included many corresponding proteins from H. pylori, but the rank profile in H. mustelae was distinctive. Of 27 genes shown to be essential for H. pylori colonization of the gerbil, all but three had orthologs in H. mustelae, identifying a shared set of core proteins for gastric persistence. Conclusions The determination of the genome sequence and expressed proteome of the ulcerogenic species H mustelae provides a comparative model for H. pylori to investigate bacterial gastric carcinogenesis in mammals, and to suggest ways whereby cag minus H. pylori strains might cause ulceration and cancer. The genome sequence was

  15. Array comparative genomic hybridization analysis of Trichoderma reesei strains with enhanced cellulase production properties

    Directory of Open Access Journals (Sweden)

    Penttilä Merja

    2010-07-01

    Full Text Available Abstract Background Trichoderma reesei is the main industrial producer of cellulases and hemicellulases that are used to depolymerize biomass in a variety of biotechnical applications. Many of the production strains currently in use have been generated by classical mutagenesis. In this study we characterized genomic alterations in high-producing mutants of T. reesei by high-resolution array comparative genomic hybridization (aCGH. Our aim was to obtain genome-wide information which could be utilized for better understanding of the mechanisms underlying efficient cellulase production, and would enable targeted genetic engineering for improved production of proteins in general. Results We carried out an aCGH analysis of four high-producing strains (QM9123, QM9414, NG14 and Rut-C30 using the natural isolate QM6a as a reference. In QM9123 and QM9414 we detected a total of 44 previously undocumented mutation sites including deletions, chromosomal translocation breakpoints and single nucleotide mutations. In NG14 and Rut-C30 we detected 126 mutations of which 17 were new mutations not documented previously. Among these new mutations are the first chromosomal translocation breakpoints identified in NG14 and Rut-C30. We studied the effects of two deletions identified in Rut-C30 (a deletion of 85 kb in the scaffold 15 and a deletion in a gene encoding a transcription factor on cellulase production by constructing knock-out strains in the QM6a background. Neither the 85 kb deletion nor the deletion of the transcription factor affected cellulase production. Conclusions aCGH analysis identified dozens of mutations in each strain analyzed. The resolution was at the level of single nucleotide mutation. High-density aCGH is a powerful tool for genome-wide analysis of organisms with small genomes e.g. fungi, especially in studies where a large set of interesting strains is analyzed.

  16. Comparative genomic analysis reveals a distant liver enhancer upstream of the COUP-TFII gene

    Energy Technology Data Exchange (ETDEWEB)

    Baroukh, Nadine; Ahituv, Nadav; Chang, Jessie; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Pennacchio, Len A.

    2004-08-20

    COUP-TFII is a central nuclear hormone receptor that tightly regulates the expression of numerous target lipid metabolism genes in vertebrates. However, it remains unclear how COUP-TFII itself is transcriptionally controlled since studies with its promoter and upstream region fail to recapitulate the genes liver expression. In an attempt to identify liver enhancers in the vicinity of COUP-TFII, we employed a comparative genomic approach. Initial comparisons between humans and mice of the 3,470kb gene poor region surrounding COUP-TFII revealed 2,023 conserved non-coding elements. To prioritize a subset of these elements for functional studies, we performed further genomic comparisons with the orthologous pufferfish (Fugu rubripes) locus and uncovered two anciently conserved non-coding sequences (CNS) upstream of COUP-TFII (CNS-62kb and CNS-66kb). Testing these two elements using reporter constructs in liver (HepG2) cells revealed that CNS-66kb, but not CNS-62kb, yielded robust in vitro enhancer activity. In addition, an in vivo reporter assay using naked DNA transfer with CNS-66kb linked to luciferase displayed strong reproducible liver expression in adult mice, further supporting its role as a liver enhancer. Together, these studies further support the utility of comparative genomics to uncover gene regulatory sequences based on evolutionary conservation and provide the substrates to better understand the regulation and expression of COUP-TFII.

  17. Comparative genomics provide a rapid detection ofFusarium oxysporum f. sp.conglutinans

    Institute of Scientific and Technical Information of China (English)

    LING Jian; ZHANG Ji-xiang; ZENG Feng; CAO Yue-xia; XIE Bing-yan; YANG Yu-hong

    2016-01-01

    Fusarium oxysporumf. sp. conglutinans(Foc) is the causal agent ofFusarium wilt disease ofBrassica oleracea. A rapid, accurate, and reliable method to detect and identify plant pathogens is vitaly important to integrated disease management. In this study, using a comparative genome analysis amongFusarium oxysporum (Fo), we developed aFoc-speciifc primer set (Focs-1/Focs-2) and established a multiplex-PCR assay. In the assay, the Focs-1/Focs-2 and universal primers for Fusariumspecies(W106R/F106S) could be used to detectFoc isolates in a single PCR reaction. With the optimized PCR parameters, the multiplex-PCR assay showed a high speciifcity for detectingFoc and was very sensitive to detect as little as 100 pg of pureFoc genomic DNA or 1000 spores in 1 g of twice-autoclaved soil. We also demonstrated thatFoc isolates were easily detected from infected plant tissues, as wel as from natural ifeld soils, using the multiplex-PCR assay. To our knowledge, this is a ifrst report on detectionFo by comparative genomic method.

  18. Genomic comparative analysis and gene function prediction in infectious diseases: application to the investigation of a meningitis outbreak

    OpenAIRE

    Lavezzo, Enrico; Toppo, Stefano; Franchin, Elisa; Di Camillo, Barbara; Finotello, Francesca; Falda, Marco; Manganelli, Riccardo; Palù, Giorgio; Barzon, Luisa

    2013-01-01

    Background Next generation sequencing (NGS) is being increasingly used for the detection and characterization of pathogens during outbreaks. This technology allows rapid sequencing of pathogen full genomes, useful not only for accurate genotyping and molecular epidemiology, but also for identification of drug resistance and virulence traits. Methods In this study, an approach based on whole genome sequencing by NGS, comparative genomics, and gene function prediction was set up and retrospecti...

  19. Comparative genomic analysis reveals 2-oxoacid dehydrogenase complex lipoylation correlation with aerobiosis in archaea.

    Directory of Open Access Journals (Sweden)

    Kirill Borziak

    Full Text Available Metagenomic analyses have advanced our understanding of ecological microbial diversity, but to what extent can metagenomic data be used to predict the metabolic capacity of difficult-to-study organisms and their abiotic environmental interactions? We tackle this question, using a comparative genomic approach, by considering the molecular basis of aerobiosis within archaea. Lipoylation, the covalent attachment of lipoic acid to 2-oxoacid dehydrogenase multienzyme complexes (OADHCs, is essential for metabolism in aerobic bacteria and eukarya. Lipoylation is catalysed either by lipoate protein ligase (LplA, which in archaea is typically encoded by two genes (LplA-N and LplA-C, or by a lipoyl(octanoyl transferase (LipB or LipM plus a lipoic acid synthetase (LipA. Does the genomic presence of lipoylation and OADHC genes across archaea from diverse habitats correlate with aerobiosis? First, analyses of 11,826 biotin protein ligase (BPL-LplA-LipB transferase family members and 147 archaeal genomes identified 85 species with lipoylation capabilities and provided support for multiple ancestral acquisitions of lipoylation pathways during archaeal evolution. Second, with the exception of the Sulfolobales order, the majority of species possessing lipoylation systems exclusively retain LplA, or either LipB or LipM, consistent with archaeal genome streamlining. Third, obligate anaerobic archaea display widespread loss of lipoylation and OADHC genes. Conversely, a high level of correspondence is observed between aerobiosis and the presence of LplA/LipB/LipM, LipA and OADHC E2, consistent with the role of lipoylation in aerobic metabolism. This correspondence between OADHC lipoylation capacity and aerobiosis indicates that genomic pathway profiling in archaea is informative and that well characterized pathways may be predictive in relation to abiotic conditions in difficult-to-study extremophiles. Given the highly variable retention of gene repertoires across

  20. A comparative genomics approach to understanding transmissible cancer in Tasmanian devils.

    Science.gov (United States)

    Deakin, Janine E; Belov, Katherine

    2012-01-01

    A fatal contagious cancer is driving an entire species to extinction. Comparative genomics will unravel the origin and evolution of devil facial tumor disease (DFTD). The DFTD allograft arose from a Schwann cell in a female Tasmanian devil more than 15 years ago; since then, the tumor has passed through at least 100,000 hosts, evolving and mutating along the way. Tumor genome sequencing and molecular cytogenetic technologies now allow direct comparisons of candidate genes involved in tumorigenesis in human cancers. As a stable transmissible cancer, DFTD provides unique insights into cancer development, progression, and immune evasion and is likely to help increase our understanding of human cancer. In addition, these studies provide hope for discoveries of drug targets or vaccine candidates that will prevent the extinction of this iconic Australian marsupial. PMID:22657390

  1. Multiphyletic origins of methylotrophy in Alphaproteobacteria, exemplified by comparative genomics of Lake Washington isolates.

    Science.gov (United States)

    Beck, David A C; McTaggart, Tami L; Setboonsarng, Usanisa; Vorobev, Alexey; Goodwin, Lynne; Shapiro, Nicole; Woyke, Tanja; Kalyuzhnaya, Marina G; Lidstrom, Mary E; Chistoserdova, Ludmila

    2015-03-01

    We sequenced the genomes of 19 methylotrophic isolates from Lake Washington, which belong to nine genera within eight families of the Alphaproteobacteria, two of the families being the newly proposed families. Comparative genomic analysis with a focus on methylotrophy metabolism classifies these strains into heterotrophic and obligately or facultatively autotrophic methylotrophs. The most persistent metabolic modules enabling methylotrophy within this group are the N-methylglutamate pathway, the two types of methanol dehydrogenase (MxaFI and XoxF), the tetrahydromethanopterin pathway for formaldehyde oxidation, the serine cycle and the ethylmalonyl-CoA pathway. At the same time, a great potential for metabolic flexibility within this group is uncovered, with different combinations of these modules present. Phylogenetic analysis of key methylotrophy functions reveals that the serine cycle must have evolved independently in at least four lineages of Alphaproteobacteria and that all methylotrophy modules seem to be prone to lateral transfers as well as deletions. PMID:25683159

  2. Comparative Genomic Analysis Reveals a Possible Novel Non-Tuberculous Mycobacterium Species with High Pathogenic Potential

    Science.gov (United States)

    Choo, Siew Woh; Dutta, Avirup; Wong, Guat Jah; Wee, Wei Yee; Ang, Mia Yang; Siow, Cheuk Chuen

    2016-01-01

    Mycobacteria have been reported to cause a wide range of human diseases. We present the first whole-genome study of a Non-Tuberculous Mycobacterium, Mycobacterium sp. UM_CSW (referred to hereafter as UM_CSW), isolated from a patient diagnosed with bronchiectasis. Our data suggest that this clinical isolate is likely a novel mycobacterial species, supported by clear evidence from molecular phylogenetic, comparative genomic, ANI and AAI analyses. UM_CSW is closely related to the Mycobacterium avium complex. While it has characteristic features of an environmental bacterium, it also shows a high pathogenic potential with the presence of a wide variety of putative genes related to bacterial virulence and shares very similar pathogenomic profiles with the known pathogenic mycobacterial species. Thus, we conclude that this possible novel Mycobacterium species should be tightly monitored for its possible causative role in human infections. PMID:27035710

  3. Genome Sequence of Azospirillum brasilense CBG497 and Comparative Analyses of Azospirillum Core and Accessory Genomes provide Insight into Niche Adaptation

    Directory of Open Access Journals (Sweden)

    Victor González

    2012-09-01

    Full Text Available Bacteria of the genus Azospirillum colonize roots of important cereals and grasses, and promote plant growth by several mechanisms, notably phytohormone synthesis. The genomes of several Azospirillum strains belonging to different species, isolated from various host plants and locations, were recently sequenced and published. In this study, an additional genome of an A. brasilense strain, isolated from maize grown on an alkaline soil in the northeast of Mexico, strain CBG497, was obtained. Comparative genomic analyses were performed on this new genome and three other genomes (A. brasilense Sp245, A. lipoferum 4B and Azospirillum sp. B510. The Azospirillum core genome was established and consists of 2,328 proteins, representing between 30% to 38% of the total encoded proteins within a genome. It is mainly chromosomally-encoded and contains 74% of genes of ancestral origin shared with some aquatic relatives. The non-ancestral part of the core genome is enriched in genes involved in signal transduction, in transport and in metabolism of carbohydrates and amino-acids, and in surface properties features linked to adaptation in fluctuating environments, such as soil and rhizosphere. Many genes involved in colonization of plant roots, plant-growth promotion (such as those involved in phytohormone biosynthesis, and properties involved in rhizosphere adaptation (such as catabolism of phenolic compounds, uptake of iron are restricted to a particular strain and/or species, strongly suggesting niche-specific adaptation.

  4. Genome-wide comparative analysis reveals possible common ancestors of nucleotide-binding sites domain containing genes in hybrid Citrus sinensis genome and original Citrus clementina genome

    Science.gov (United States)

    We identified and re-annotated candidate disease resistance (R) genes with nucleotide-binding sites (NBS) domain from a Citrus clementina genome and two complete Citrus sinensis genome sequences (one from the USA and one from China). We found similar numbers of NBS genes from three citrus genomes, r...

  5. Genome Sequence and Comparative Genomics Analysis of a Vibrio cholerae O1 Strain Isolated from a Cholera Patient in Malaysia

    Science.gov (United States)

    Osama, Abdulrazak; Gan, Han Ming; Teh, Cindy Shuan Ju; Yap, Kien-Pong

    2012-01-01

    The genome sequence analysis of a clinical Vibrio cholerae VC35 strain from an outbreak case in Malaysia indicates multiple genes involved in host adaptation and a novel Na+-driven multidrug efflux pump-coding gene in the genome of Vibrio cholerae with the highest similarity to VMA_001754 of Vibrio mimicus VMA223. PMID:23209200

  6. Genome Sequence and Comparative Genomics Analysis of a Vibrio cholerae O1 Strain Isolated from a Cholera Patient in Malaysia

    OpenAIRE

    Osama, Abdulrazak; Gan, Han Ming; Teh, Cindy Shuan Ju; Yap, Kien-Pong; Thong, Kwai-Lin

    2012-01-01

    The genome sequence analysis of a clinical Vibrio cholerae VC35 strain from an outbreak case in Malaysia indicates multiple genes involved in host adaptation and a novel Na+-driven multidrug efflux pump-coding gene in the genome of Vibrio cholerae with the highest similarity to VMA_001754 of Vibrio mimicus VMA223.

  7. Whole-genome Sequencing of the Saprophyte L. biflexa and Comparative Genomics of Pathogoenic and Saprophytic Leptospira Species

    Science.gov (United States)

    We present the complete genome sequences of Leptospira biflexa, a saprophyte that is used as a model for gene functional analysis in Leptospira spp. The L. biflexa genome has 3,730 protein-coding genes distributed across three circular replicons: two of which are chromosomes (3,600- and 278-kb in si...

  8. MultiMetEval: comparative and multi-objective analysis of genome-scale metabolic models.

    Directory of Open Access Journals (Sweden)

    Piotr Zakrzewski

    Full Text Available Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the context of multiple cellular objectives. Here, we present the user-friendly software framework Multi-Metabolic Evaluator (MultiMetEval, built upon SurreyFBA, which allows the user to compose collections of metabolic models that together can be subjected to flux balance analysis. Additionally, MultiMetEval implements functionalities for multi-objective analysis by calculating the Pareto front between two cellular objectives. Using a previously generated dataset of 38 actinobacterial genome-scale metabolic models, we show how these approaches can lead to exciting novel insights. Firstly, after incorporating several pathways for the biosynthesis of natural products into each of these models, comparative flux balance analysis predicted that species like Streptomyces that harbour the highest diversity of secondary metabolite biosynthetic gene clusters in their genomes do not necessarily have the metabolic network topology most suitable for compound overproduction. Secondly, multi-objective analysis of biomass production and natural product biosynthesis in these actinobacteria shows that the well-studied occurrence of discrete metabolic switches during the change of cellular objectives is inherent to their metabolic network architecture. Comparative and multi-objective modelling can lead to insights that could not be obtained by normal flux balance analyses. MultiMetEval provides a powerful platform that makes these analyses straightforward for biologists. Sources and binaries of MultiMetEval are freely available from https://github.com/PiotrZakrzewski/MetEval/downloads.

  9. Whole Genome Sequencing Increases Molecular Diagnostic Yield Compared with Current Diagnostic Testing for Inherited Retinal Disease

    Science.gov (United States)

    Ellingford, Jamie M.; Barton, Stephanie; Bhaskar, Sanjeev; Williams, Simon G.; Sergouniotis, Panagiotis I.; O'Sullivan, James; Lamb, Janine A.; Perveen, Rahat; Hall, Georgina; Newman, William G.; Bishop, Paul N.; Roberts, Stephen A.; Leach, Rick; Tearle, Rick; Bayliss, Stuart; Ramsden, Simon C.; Nemeth, Andrea H.; Black, Graeme C.M.

    2016-01-01

    Purpose To compare the efficacy of whole genome sequencing (WGS) with targeted next-generation sequencing (NGS) in the diagnosis of inherited retinal disease (IRD). Design Case series. Participants A total of 562 patients diagnosed with IRD. Methods We performed a direct comparative analysis of current molecular diagnostics with WGS. We retrospectively reviewed the findings from a diagnostic NGS DNA test for 562 patients with IRD. A subset of 46 of 562 patients (encompassing potential clinical outcomes of diagnostic analysis) also underwent WGS, and we compared mutation detection rates and molecular diagnostic yields. In addition, we compared the sensitivity and specificity of the 2 techniques to identify known single nucleotide variants (SNVs) using 6 control samples with publically available genotype data. Main Outcome Measures Diagnostic yield of genomic testing. Results Across known disease-causing genes, targeted NGS and WGS achieved similar levels of sensitivity and specificity for SNV detection. However, WGS also identified 14 clinically relevant genetic variants through WGS that had not been identified by NGS diagnostic testing for the 46 individuals with IRD. These variants included large deletions and variants in noncoding regions of the genome. Identification of these variants confirmed a molecular diagnosis of IRD for 11 of the 33 individuals referred for WGS who had not obtained a molecular diagnosis through targeted NGS testing. Weighted estimates, accounting for population structure, suggest that WGS methods could result in an overall 29% (95% confidence interval, 15–45) uplift in diagnostic yield. Conclusions We show that WGS methods can detect disease-causing genetic variants missed by current NGS diagnostic methodologies for IRD and thereby demonstrate the clinical utility and additional value of WGS. PMID:26872967

  10. Comparing de novo genome assembly: the long and short of it.

    Directory of Open Access Journals (Sweden)

    Giuseppe Narzisi

    Full Text Available Recent advances in DNA sequencing technology and their focal role in Genome Wide Association Studies (GWAS have rekindled a growing interest in the whole-genome sequence assembly (WGSA problem, thereby, inundating the field with a plethora of new formalizations, algorithms, heuristics and implementations. And yet, scant attention has been paid to comparative assessments of these assemblers' quality and accuracy. No commonly accepted and standardized method for comparison exists yet. Even worse, widely used metrics to compare the assembled sequences emphasize only size, poorly capturing the contig quality and accuracy. This paper addresses these concerns: it highlights common anomalies in assembly accuracy through a rigorous study of several assemblers, compared under both standard metrics (N50, coverage, contig sizes, etc. as well as a more comprehensive metric (Feature-Response Curves, FRC that is introduced here; FRC transparently captures the trade-offs between contigs' quality against their sizes. For this purpose, most of the publicly available major sequence assemblers--both for low-coverage long (Sanger and high-coverage short (Illumina reads technologies--are compared. These assemblers are applied to microbial (Escherichia coli, Brucella, Wolbachia, Staphylococcus, Helicobacter and partial human genome sequences (Chr. Y, using sequence reads of various read-lengths, coverages, accuracies, and with and without mate-pairs. It is hoped that, based on these evaluations, computational biologists will identify innovative sequence assembly paradigms, bioinformaticists will determine promising approaches for developing "next-generation" assemblers, and biotechnologists will formulate more meaningful design desiderata for sequencing technology platforms. A new software tool for computing the FRC metric has been developed and is available through the AMOS open-source consortium.

  11. Comparative genetic mapping between clementine, pummelo and sweet orange and the interspecicic structure of the Clementine genome

    Science.gov (United States)

    Comparative genetic mapping between clementine, pummelo and sweet orange and the interspecicic structure of the Clementine genome The availability of a saturated genetic map of Clementine was identified by the International Citrus Genome Consortium as an essential prerequisite to assist the assembly...

  12. Revising a personal genome by comparing and combining data from two different sequencing platforms.

    Directory of Open Access Journals (Sweden)

    Deokhoon Kim

    Full Text Available For the robust practice of genomic medicine, sequencing results must be compatible, regardless of the sequencing technologies and algorithms used. Presently, genome sequencing is still an imprecise science and is complicated by differences in the chemistry, coverage, alignment, and variant-calling algorithms. We identified ~3.33 million single nucleotide variants (SNVs and ~3.62 million SNVs in the SJK genome using SOLiD and Illumina data, respectively. Approximately 3 million SNVs were concordant between the two platforms while 68,532 SNVs were discordant; 219,616 SNVs were SOLiD-specific and 516,080 SNVs were Illumina-specific (i.e., platform-specific. Concordant, discordant, and platform-specific SNVs were further analyzed and characterized. Overall, a large portion of heterozygous SNVs that were discordant with genotyping calls of single nucleotide polymorphism chips were highly confident. Approximately 70% of the platform-specific SNVs were located in regions containing repetitive sequences. Such platform-specificity may arise from differences between platforms, with regard to read length (36 bp and 72 bp vs. 50 bp, insert size (~100-300 bp vs. ~1-2 kb, sequencing chemistry (sequencing-by-synthesis using single nucleotides vs. ligation-based sequencing using oligomers, and sequencing quality. When data from the two platforms were merged for variant calling, the proportion of callable regions of the reference genome increased to 99.66%, which was 1.43% higher than the average callability of the two platforms, representing ~40 million bases. In this study, we compared the differences in sequencing results between two sequencing platforms. Approximately 90% of the SNVs were concordant between the two platforms, yet ~10% of the SNVs were either discordant or platform-specific, indicating that each platform had its own strengths and weaknesses. When data from the two platforms were merged, both the overall callability of the reference genome and

  13. Comparing the mitochondrial genomes of Wolbachia-dependent and independent filarial nematode species

    Directory of Open Access Journals (Sweden)

    McNulty Samantha N

    2012-04-01

    Full Text Available Abstract Background Many species of filarial nematodes depend on Wolbachia endobacteria to carry out their life cycle. Other species are naturally Wolbachia-free. The biological mechanisms underpinning Wolbachia-dependence and independence in filarial nematodes are not known. Previous studies have indicated that Wolbachia have an impact on mitochondrial gene expression, which may suggest a role in energy metabolism. If Wolbachia can supplement host energy metabolism, reduced mitochondrial function in infected filarial species may account for Wolbachia-dependence. Wolbachia also have a strong influence on mitochondrial evolution due to vertical co-transmission. This could drive alterations in mitochondrial genome sequence in infected species. Comparisons between the mitochondrial genome sequences of Wolbachia-dependent and independent filarial worms may reveal differences indicative of altered mitochondrial function. Results The mitochondrial genomes of 5 species of filarial nematodes, Acanthocheilonema viteae, Chandlerella quiscali, Loa loa, Onchocerca flexuosa, and Wuchereria bancrofti, were sequenced, annotated and compared with available mitochondrial genome sequences from Brugia malayi, Dirofilaria immitis, Onchocerca volvulus and Setaria digitata. B. malayi, D. immitis, O. volvulus and W. bancrofti are Wolbachia-dependent while A. viteae, C. quiscali, L. loa, O. flexuosa and S. digitata are Wolbachia-free. The 9 mitochondrial genomes were similar in size and AT content and encoded the same 12 protein-coding genes, 22 tRNAs and 2 rRNAs. Synteny was perfectly preserved in all species except C. quiscali, which had a different order for 5 tRNA genes. Protein-coding genes were expressed at the RNA level in all examined species. In phylogenetic trees based on mitochondrial protein-coding sequences, species did not cluster according to Wolbachia dependence. Conclusions Thus far, no discernable differences were detected between the mitochondrial

  14. Massive gene losses in Asian cultivated rice unveiled by comparative genome analysis

    Directory of Open Access Journals (Sweden)

    Itoh Takeshi

    2010-02-01

    Full Text Available Abstract Background Rice is one of the most important food crops in the world. With increasing world demand for food crops, there is an urgent need to develop new cultivars that have enhanced performance with regard to yield, disease resistance, and so on. Wild rice is expected to provide useful genetic resources that could improve the present cultivated species. However, the quantity and quality of these unexplored resources remain unclear. Recent accumulation of the genomic information of both cultivated and wild rice species allows for their comparison at the molecular level. Here, we compared the genome sequence of Oryza sativa ssp. japonica with sets of bacterial artificial chromosome end sequences (BESs from two wild rice species, O. rufipogon and O. nivara, and an African rice species, O. glaberrima. Results We found that about four to five percent of the BESs of the two wild rice species and about seven percent of the African rice could not be mapped to the japonica genome, suggesting that a substantial number of genes have been lost in the japonica rice lineage; however, their close relatives still possess their counterpart genes. We estimated that during evolution, O. sativa has lost at least one thousand genes that are still preserved in the genomes of the other species. In addition, our BLASTX searches against the non-redundant protein sequence database showed that disease resistance-related proteins were significantly overrepresented in the close relative-specific genomic portions. In total, 235 unmapped BESs of the three relatives matched 83 non-redundant proteins that contained a disease resistance protein domain, most of which corresponded to an NBS-LRR domain. Conclusion We found that the O. sativa lineage appears to have recently experienced massive gene losses following divergence from its wild ancestor. Our results imply that the domestication process accelerated large-scale genomic deletions in the lineage of Asian

  15. Evolution of tryptophan biosynthetic pathway in microbial genomes: a comparative genetic study.

    Science.gov (United States)

    Priya, V K; Sarkar, Susmita; Sinha, Somdatta

    2014-03-01

    Biosynthetic pathway evolution needs to consider the evolution of a group of genes that code for enzymes catalysing the multiple chemical reaction steps leading to the final end product. Tryptophan biosynthetic pathway has five chemical reaction steps that are highly conserved in diverse microbial genomes, though the genes of the pathway enzymes show considerable variations in arrangements, operon structure (gene fusion and splitting) and regulation. We use a combined bioinformatic and statistical analyses approach to address the question if the pathway genes from different microbial genomes, belonging to a wide range of groups, show similar evolutionary relationships within and between them. Our analyses involved detailed study of gene organization (fusion/splitting events), base composition, relative synonymous codon usage pattern of the genes, gene expressivity, amino acid usage, etc. to assess inter- and intra-genic variations, between and within the pathway genes, in diverse group of microorganisms. We describe these genetic and genomic variations in the tryptophan pathway genes in different microorganisms to show the similarities across organisms, and compare the same genes across different organisms to find the possible variability arising possibly due to horizontal gene transfers. Such studies form the basis for moving from single gene evolution to pathway evolutionary studies that are important steps towards understanding the systems biology of intracellular pathways. PMID:24592292

  16. Carotenoid Biosynthesis in Cyanobacteria: Structural and Evolutionary Scenarios Based on Comparative Genomics

    Directory of Open Access Journals (Sweden)

    Chengwei Liang , Fangqing Zhao , Wei Wei , Zhangxiao Wen , Song Qin

    2006-01-01

    Full Text Available Carotenoids are widely distributed pigments in nature and their biosynthetic pathway has been extensively studied in various organisms. The recent access to the overwhelming amount genomic data of cyanobacteria has given birth to a novel approach called comparative genomics. The putative enzymes involved in the carotenoid biosynthesis among the cyanobacteria were determined by similarity-based tools. The reconstruction of biosynthetic pathway was based on the related enzymes. It is interesting to find that nearly all the cyanobacteria share quite similar pathway to synthesize β-carotene except for Gloeobacter violaceus PCC 7421. The enzymes, crtE-B-P-Qb-L, involved in the upstream pathway are more conserved than the subsequent ones (crtW-R. In addition, many carotenoid synthesis enzymes exhibit diversity in structure and function. Such examples in the families of ζ –carotene desaturase, lycopene cylases and carotene ketolases were described in this article. When we mapped these crt genes to the cyanobacterial genomes, the crt genes showed great structural variation among species. All of them are dispersed on the whole chromosome in contrast to the linear adjacent distribution of the crt gene cluster in other eubacteria. Moreover, in unicellular cyanobacteria, each step of the carotenogenic pathway is usually catalyzed by one gene product, whereas multiple ketolase genes are found in filamentous cyanobacteria. Such increased numbers of crt genes and their correlation to the ecological adaptation were carefully discussed.

  17. Genome-wide comparative analysis of microRNAs in three non-human primates

    Directory of Open Access Journals (Sweden)

    Brameier Markus

    2010-03-01

    Full Text Available Abstract Background MicroRNAs (miRNAs are negative regulators of gene expression in multicellular eukaryotes. With the recently completed sequencing of three primate genomes, the study of miRNA evolution within the primate lineage has only begun and may be expected to provide the genetic and molecular explanations for many phenotypic differences between human and non-human primates. Findings We scanned all three genomes of non-human primates, including chimpanzee (Pan troglodytes, orangutan (Pongo pygmaeus, and rhesus monkey (Macaca mulatta, for homologs of human miRNA genes. Besides sequence homology analysis, our comparative method relies on various postprocessing filters to verify other features of miRNAs, including, in particular, their precursor structure or their occurrence (prediction in other primate genomes. Our study allows direct comparisons between the different species in terms of their miRNA repertoire, their evolutionary distance to human, the effects of filters, as well as the identification of common and species-specific miRNAs in the primate lineage. More than 500 novel putative miRNA genes have been discovered in orangutan that show at least 85 percent identity in precursor sequence. Only about 40 percent are found to be 100 percent identical with their human ortholog. Conclusion Homologs of human precursor miRNAs with perfect or near-perfect sequence identity may be considered to be likely functional in other primates. The computational identification of homologs with less similar sequence, instead, requires further evidence to be provided.

  18. Comparative Genomic Analyses of Streptococcus pseudopneumoniae Provide Insight into Virulence and Commensalism Dynamics.

    Directory of Open Access Journals (Sweden)

    Dea Shahinas

    Full Text Available Streptococcus pseudopneumoniae (SPPN is a recently described species of the viridans group streptococci (VGS. Although the pathogenic potential of S. pseudopneumoniae remains uncertain, it is most commonly isolated from patients with underlying medical conditions, such as chronic obstructive pulmonary disease. S. pseudopneumoniae can be distinguished from the closely related species, S. pneumoniae and S. mitis, by phenotypic characteristics, including optochin resistance in the presence of 5% CO2, bile insolubility, and the lack of the pneumococcal capsule. Previously, we reported the draft genome sequence of S. pseudopneumoniae IS7493, a clinical isolate obtained from an immunocompromised patient with documented pneumonia. Here, we use comparative genomics approaches to identify similarities and key differences between S. pseudopneumoniae IS7493, S. pneumoniae and S. mitis. The genome structure of S. pseudopneumoniae IS7493 is most closely related to that of S. pneumoniae R6, but several recombination events are evident. Analysis of gene content reveals numerous unique features that distinguish S. pseudopneumoniae from other streptococci. The presence of loci for competence, iron transport, pneumolysin production and antimicrobial resistance reinforce the phylogenetic position of S. pseudopneumoniae as an intermediate species between S. pneumoniae and S. mitis. Additionally, the presence of several virulence factors and antibiotic resistance mechanisms suggest the potential of this commensal species to become pathogenic or to contribute to increasing antibiotic resistance levels seen among the VGS.

  19. (Actino)Bacterial "intelligence": using comparative genomics to unravel the information processing capacities of microbes.

    Science.gov (United States)

    Pinto, Daniela; Mascher, Thorsten

    2016-08-01

    Bacterial genomes encode numerous and often sophisticated signaling devices to perceive changes in their environment and mount appropriate adaptive responses. With their help, microbes are able to orchestrate specific decision-making processes that alter the cellular behavior, but also integrate and communicate information. Moreover and beyond, some signal transducing systems also enable bacteria to remember and learn from previous stimuli to anticipate environmental changes. As recently suggested, all of these aspects indicate that bacteria do, in fact, exhibit cognition remarkably reminiscent of what we refer to as intelligent behavior, at least when referred to higher eukaryotes. In this essay, comprehensive data derived from comparative genomics analyses of microbial signal transduction systems are used to probe the concept of cognition in bacterial cells. Using a recent comprehensive analysis of over 100 actinobacterial genomes as a test case, we illustrate the different layers of the capacities of bacteria that result in cognitive and behavioral complexity as well as some form of 'bacterial intelligence'. We try to raise awareness to approach bacteria as cognitive organisms and believe that this view would enrich and open a new path in the experimental studies of bacterial signal transducing systems. PMID:26852121

  20. Mitochondrial genome of Turbinaria ornata (Sargassaceae, Phaeophyceae): comparative mitogenomics of brown algae.

    Science.gov (United States)

    Liu, Feng; Pang, Shaojun

    2015-11-01

    Turbinaria ornata (Turner) J. Agardh is a perennial brown alga native to coral reef ecosystems of tropical areas of the Pacific and Indian Ocean. Very little is known about its organellar genome structure. In the present work, the complete mitochondrial genome sequence of T. ornata was determined and compared with other reported brown algal mtDNAs. The circular mitogenome of 34,981 bp contains a basic set of 65 mitochondrial genes. The structure and organization of T. ornata mitogenome is very similar to Sargassum species. Turbinaria ornata genes overlap by a total of 164 bp in 12 different locations from 1 to 66 bp, and the non-coding sequences are 1872 bp, constituting approximate 5.35 % of the genome. The total spacer size has positive correlation with the brown algal mitogenome size with the correlation coefficient of 0.7972. Several regions displaying greater inconsistency (rnl-trnK spacer, cox2 gene, cox3-atp6 spacer, rps14-rns middle region and trnP-rnl spacer) have been identified in brown algal mtDNAs. The observed uncertainty regarding the position and support values of some branches might be closely associated with the heterogeneity of evolutionary rate. PMID:25893565

  1. Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

    Directory of Open Access Journals (Sweden)

    Khan Shafiq A

    2003-06-01

    Full Text Available Abstract Background Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish genomes. Results 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells.

  2. Comparative genomic analyses of the cyanobacterium, Lyngbya aestuarii BL J, a powerful hydrogen producer.

    Directory of Open Access Journals (Sweden)

    AnkitaKothari

    2013-12-01

    Full Text Available The filamentous, non-heterocystous cyanobacterium Lyngbya aestuarii is an important contributor to marine intertidal microbial mats system worldwide. The recent isolate L. aestuarii BL J, is an unusually powerful hydrogen producer. Here we report a morphological, ultrastructural and genomic characterization of this strain to set the basis for future systems studies and applications of this organism. The filaments contain circa 17 μm wide trichomes, composed of stacked disk-like short cells (2 μm long, encased in a prominent, laminated exopolysaccharide sheath. Cellular division occurs by transversal centripetal growth of cross-walls, where several rounds of division proceed simultaneously. Filament division occurs by cell self-immolation of one or groups of cells (necridial cells at the breakage point. Short, sheath-less, motile filaments (hormogonia are also formed. Morphologically and phylogenetically L. aestuarii belongs to a clade of important cyanobacteria that include members of the marine Trichodesmiun and Hydrocoleum genera, as well as terrestrial Microcoleus vaginatus strains, and alkalyphilic strains of Arthrospira. A draft genome of strain BL J was compared to those of other cyanobacteria in order to ascertain some of its ecological constraints and biotechnological potential. The genome had an average GC content of 41.1 %. Of the 6.87 Mb sequenced, 6.44 Mb was present as large contigs (>10,000 bp. It contained 6515 putative protein-encoding genes, of which, 43 % encode proteins of known functional role, 26 % corresponded to proteins with domain or family assignments, 19.6 % encode conserved hypothetical proteins, and 11.3 % encode apparently unique hypothetical proteins. The strain’s genome reveals its adaptations to a life of exposure to intense solar radiation and desiccation. It likely employs the storage compounds, glycogen and cyanophycin but no polyhydroxyalkanoates, and can produce the osmolytes, trehalose and glycine

  3. Sequencing and comparative analysis of the straw mushroom (Volvariella volvacea genome.

    Directory of Open Access Journals (Sweden)

    Dapeng Bao

    Full Text Available Volvariella volvacea, the edible straw mushroom, is a highly nutritious food source that is widely cultivated on a commercial scale in many parts of Asia using agricultural wastes (rice straw, cotton wastes as growth substrates. However, developments in V. volvacea cultivation have been limited due to a low biological efficiency (i.e. conversion of growth substrate to mushroom fruit bodies, sensitivity to low temperatures, and an unclear sexuality pattern that has restricted the breeding of improved strains. We have now sequenced the genome of V. volvacea and assembled it into 62 scaffolds with a total genome size of 35.7 megabases (Mb, containing 11,084 predicted gene models. Comparative analyses were performed with the model species in basidiomycete on mating type system, carbohydrate active enzymes, and fungal oxidative lignin enzymes. We also studied transcriptional regulation of the response to low temperature (4°C. We found that the genome of V. volvacea has many genes that code for enzymes, which are involved in the degradation of cellulose, hemicellulose, and pectin. The molecular genetics of the mating type system in V. volvacea was also found to be similar to the bipolar system in basidiomycetes, suggesting that it is secondary homothallism. Sensitivity to low temperatures could be due to the lack of the initiation of the biosynthesis of unsaturated fatty acids, trehalose and glycogen biosyntheses in this mushroom. Genome sequencing of V. volvacea has improved our understanding of the biological characteristics related to the degradation of the cultivating compost consisting of agricultural waste, the sexual reproduction mechanism, and the sensitivity to low temperatures at the molecular level which in turn will enable us to increase the industrial production of this mushroom.

  4. Genome Sequencing and Comparative Analysis of Saccharomyces cerevisiae Strains of the Peterhof Genetic Collection

    Science.gov (United States)

    Drozdova, Polina B.; Tarasov, Oleg V.; Matveenko, Andrew G.; Radchenko, Elina A.; Sopova, Julia V.; Polev, Dmitrii E.; Inge-Vechtomov, Sergey G.; Dobrynin, Pavel V.

    2016-01-01

    The Peterhof genetic collection of Saccharomyces cerevisiae strains (PGC) is a large laboratory stock that has accumulated several thousands of strains for over than half a century. It originated independently of other common laboratory stocks from a distillery lineage (race XII). Several PGC strains have been extensively used in certain fields of yeast research but their genomes have not been thoroughly explored yet. Here we employed whole genome sequencing to characterize five selected PGC strains including one of the closest to the progenitor, 15V-P4, and several strains that have been used to study translation termination and prions in yeast (25-25-2V-P3982, 1B-D1606, 74-D694, and 6P-33G-D373). The genetic distance between the PGC progenitor and S288C is comparable to that between two geographically isolated populations. The PGC seems to be closer to two bakery strains than to S288C-related laboratory stocks or European wine strains. In genomes of the PGC strains, we found several loci which are absent from the S288C genome; 15V-P4 harbors a rare combination of the gene cluster characteristic for wine strains and the RTM1 cluster. We closely examined known and previously uncharacterized gene variants of particular strains and were able to establish the molecular basis for known phenotypes including phenylalanine auxotrophy, clumping behavior and galactose utilization. Finally, we made sequencing data and results of the analysis available for the yeast community. Our data widen the knowledge about genetic variation between Saccharomyces cerevisiae strains and can form the basis for planning future work in PGC-related strains and with PGC-derived alleles. PMID:27152522

  5. Comparative genome analysis of 19 Ureaplasma urealyticum and Ureaplasma parvum strains

    Directory of Open Access Journals (Sweden)

    Paralanov Vanya

    2012-05-01

    Full Text Available Abstract Background Ureaplasma urealyticum (UUR and Ureaplasma parvum (UPA are sexually transmitted bacteria among humans implicated in a variety of disease states including but not limited to: nongonococcal urethritis, infertility, adverse pregnancy outcomes, chorioamnionitis, and bronchopulmonary dysplasia in neonates. There are 10 distinct serotypes of UUR and 4 of UPA. Efforts to determine whether difference in pathogenic potential exists at the ureaplasma serovar level have been hampered by limitations of antibody-based typing methods, multiple cross-reactions and poor discriminating capacity in clinical samples containing two or more serovars. Results We determined the genome sequences of the American Type Culture Collection (ATCC type strains of all UUR and UPA serovars as well as four clinical isolates of UUR for which we were not able to determine serovar designation. UPA serovars had 0.75−0.78 Mbp genomes and UUR serovars were 0.84−0.95 Mbp. The original classification of ureaplasma isolates into distinct serovars was largely based on differences in the major ureaplasma surface antigen called the multiple banded antigen (MBA and reactions of human and animal sera to the organisms. Whole genome analysis of the 14 serovars and the 4 clinical isolates showed the mba gene was part of a large superfamily, which is a phase variable gene system, and that some serovars have identical sets of mba genes. Most of the differences among serovars are hypothetical genes, and in general the two species and 14 serovars are extremely similar at the genome level. Conclusions Comparative genome analysis suggests UUR is more capable of acquiring genes horizontally, which may contribute to its greater virulence for some conditions. The overwhelming evidence of extensive horizontal gene transfer among these organisms from our previous studies combined with our comparative analysis indicates that ureaplasmas exist as quasi-species rather than as stable

  6. Comparative genome analyses reveal distinct structure in the saltwater crocodile MHC.

    Science.gov (United States)

    Jaratlerdsiri, Weerachai; Deakin, Janine; Godinez, Ricardo M; Shan, Xueyan; Peterson, Daniel G; Marthey, Sylvain; Lyons, Eric; McCarthy, Fiona M; Isberg, Sally R; Higgins, Damien P; Chong, Amanda Y; John, John St; Glenn, Travis C; Ray, David A; Gongora, Jaime

    2014-01-01

    The major histocompatibility complex (MHC) is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III) containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians) are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus) and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2-6 times longer) than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity) with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled MHC in eutherians compared, but absent in avian MHC, suggesting that the saltwater crocodile MHC appears to have gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs. PMID:25503521

  7. Comparative genome analyses reveal distinct structure in the saltwater crocodile MHC.

    Directory of Open Access Journals (Sweden)

    Weerachai Jaratlerdsiri

    Full Text Available The major histocompatibility complex (MHC is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2-6 times longer than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled MHC in eutherians compared, but absent in avian MHC, suggesting that the saltwater crocodile MHC appears to have gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs.

  8. De novo Transcriptome Assemblies of Rana (Lithobates catesbeiana and Xenopus laevis Tadpole Livers for Comparative Genomics without Reference Genomes.

    Directory of Open Access Journals (Sweden)

    Inanc Birol

    Full Text Available In this work we studied the liver transcriptomes of two frog species, the American bullfrog (Rana (Lithobates catesbeiana and the African clawed frog (Xenopus laevis. We used high throughput RNA sequencing (RNA-seq data to assemble and annotate these transcriptomes, and compared how their baseline expression profiles change when tadpoles of the two species are exposed to thyroid hormone. We generated more than 1.5 billion RNA-seq reads in total for the two species under two conditions as treatment/control pairs. We de novo assembled these reads using Trans-ABySS to reconstruct reference transcriptomes, obtaining over 350,000 and 130,000 putative transcripts for R. catesbeiana and X. laevis, respectively. Using available genomics resources for X. laevis, we annotated over 97% of our X. laevis transcriptome contigs, demonstrating the utility and efficacy of our methodology. Leveraging this validated analysis pipeline, we also annotated the assembled R. catesbeiana transcriptome. We used the expression profiles of the annotated genes of the two species to examine the similarities and differences between the tadpole liver transcriptomes. We also compared the gene ontology terms of expressed genes to measure how the animals react to a challenge by thyroid hormone. Our study reports three main conclusions. First, de novo assembly of RNA-seq data is a powerful method for annotating and establishing transcriptomes of non-model organisms. Second, the liver transcriptomes of the two frog species, R. catesbeiana and X. laevis, show many common features, and the distribution of their gene ontology profiles are statistically indistinguishable. Third, although they broadly respond the same way to the presence of thyroid hormone in their environment, their receptor/signal transduction pathways display marked differences.

  9. Comparative genomic hybridization detects novel amplifications in fibroadenomas of the breast

    DEFF Research Database (Denmark)

    Ojopi, E P; Rogatto, S R; Caldeira, J R;

    2001-01-01

    Comparative genomic hybridization analysis was performed for identification of chromosomal imbalances in 23 samples of fibroadenomas of the breast. Chromosomal gains rather than losses were a feature of these lesions. Only two cases with a familial and/or previous history of breast lesions had gain...... indicates that gain of these regions can also occur in benign breast lesions. Our findings may provide a basis for conducting further investigations to locate and identify genes associated with proliferation that may be involved in the early steps of tumorigenesis of the breast....

  10. Phylogenetic and evolutionary patterns in microbial carotenoid biosynthesis are revealed by comparative genomics.

    Directory of Open Access Journals (Sweden)

    Jonathan L Klassen

    Full Text Available BACKGROUND: Carotenoids are multifunctional, taxonomically widespread and biotechnologically important pigments. Their biosynthesis serves as a model system for understanding the evolution of secondary metabolism. Microbial carotenoid diversity and evolution has hitherto been analyzed primarily from structural and biosynthetic perspectives, with the few phylogenetic analyses of microbial carotenoid biosynthetic proteins using either used limited datasets or lacking methodological rigor. Given the recent accumulation of microbial genome sequences, a reappraisal of microbial carotenoid biosynthetic diversity and evolution from the perspective of comparative genomics is warranted to validate and complement models of microbial carotenoid diversity and evolution based upon structural and biosynthetic data. METHODOLOGY/PRINCIPAL FINDINGS: Comparative genomics were used to identify and analyze in silico microbial carotenoid biosynthetic pathways. Four major phylogenetic lineages of carotenoid biosynthesis are suggested composed of: (i Proteobacteria; (ii Firmicutes; (iii Chlorobi, Cyanobacteria and photosynthetic eukaryotes; and (iv Archaea, Bacteroidetes and two separate sub-lineages of Actinobacteria. Using this phylogenetic framework, specific evolutionary mechanisms are proposed for carotenoid desaturase CrtI-family enzymes and carotenoid cyclases. Several phylogenetic lineage-specific evolutionary mechanisms are also suggested, including: (i horizontal gene transfer; (ii gene acquisition followed by differential gene loss; (iii co-evolution with other biochemical structures such as proteorhodopsins; and (iv positive selection. CONCLUSIONS/SIGNIFICANCE: Comparative genomics analyses of microbial carotenoid biosynthetic proteins indicate a much greater taxonomic diversity then that identified based on structural and biosynthetic data, and divides microbial carotenoid biosynthesis into several, well-supported phylogenetic lineages not evident

  11. Comparative functional genomics of amino acid metabolism of lactic acid bacteria

    OpenAIRE

    Pastink, M.I.

    2009-01-01

    The amino acid metabolism of lactic acid bacteria used as starters in industrial fermentations has profound effects on the quality of the fermented foods. The work described in this PhD thesis was initiated to use genomics technologies and a comparative approach to link the gene content of some well-known lactic acid bacteria to flavor formation and to increase our general knowledge in the area of amino acid metabolism. The three well-known lactic acid bacteria that were used in these studies...

  12. Genomic organization, transcript variants and comparative analysis of the human nucleoporin 155 (NUP155) gene

    DEFF Research Database (Denmark)

    Zhang, Xiuqing; Yang, Huanming; Yu, Jun;

    2002-01-01

    Nucleoporin 155 (Nup155) is a major component of the nuclear pore complex (NPC) involved in cellular nucleo-cytoplasmic transport. We have acquired the complete sequence and interpreted the genomic organization of the Nup155 orthologos from human (Homo sapiens) and pufferfish (Fugu rubripes), which...... cloned DNA complementary to RNAs of the Nup155 orthologs from Fugu and mouse. Comparative analysis of the Nup155 orthologs in many species, including H. sapiens, Mus musculus, Rattus norvegicus, F. rubripes, Arabidopsis thaliana, Drosophila melanogaster, and Saccharomyces cerevisiae, has revealed two...

  13. Small Area Array-Based LED Luminaire Design

    Energy Technology Data Exchange (ETDEWEB)

    Thomas Yuan

    2008-01-09

    This report contains a summary of technical achievements during a three-year project to demonstrate high efficiency LED luminaire designs based on small area array-based gallium nitride diodes. Novel GaN-based LED array designs are described, specifically addressing the thermal, optical, electrical and mechanical requirements for the incorporation of such arrays into viable solid-state LED luminaires. This work resulted in the demonstration of an integrated luminaire prototype of 1000 lumens cool white light output with reflector shaped beams and efficacy of 89.4 lm/W at CCT of 6000oK and CRI of 73; and performance of 903 lumens warm white light output with reflector shaped beams and efficacy of 63.0 lm/W at CCT of 2800oK and CRI of 82. In addition, up to 1275 lumens cool white light output at 114.2 lm/W and 1156 lumens warm white light output at 76.5 lm/W were achieved if the reflector was not used. The success to integrate small area array-based LED designs and address thermal, optical, electrical and mechanical requirements was clearly achieved in these luminaire prototypes with outstanding performance and high efficiency.

  14. Theobroma cacao: A genetically integrated physical map and genome-scale comparative synteny analysis

    Science.gov (United States)

    A comprehensive integrated genomic framework is considered a centerpiece of genomic research. In collaboration with the USDA-ARS (SHRS) and Mars Inc., the Clemson University Genomics Institute (CUGI) has developed a genetically anchored physical map of the T. cacao genome. Three BAC libraries contai...

  15. Correction: Comparative Analysis of Fungal Genomes Reveals Different Plant Cell Wall Degrading Capacity in Fungi

    OpenAIRE

    Zhao, Zhongtao; Liu, Huiquan; Wang, Chenfang; Xu, Jin-Rong

    2014-01-01

    Abstract The version of this article published in BMC Genomics 2013, 14: 274, contains 9 unpublished genomes (Botryobasidium botryosum, Gymnopus luxurians, Hypholoma sublateritium, Jaapia argillacea, Hebeloma cylindrosporum, Conidiobolus coronatus, Laccaria amethystina, Paxillus involutus, and P. rubicundulus) downloaded from JGI website. In this correction, we removed these genomes after discussion with editors and data producers whom we should have contacted before downloading these genomes...

  16. Genome-Wide Comparative Analysis Reveals Similar Types of NBS Genes in Hybrid Citrus sinensis Genome and Original Citrus clementine Genome and Provides New Insights into Non-TIR NBS Genes

    Science.gov (United States)

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approxima...

  17. Oligoarray comparative genomic hybridization of renal cell tumors that developed in patients with acquired cystic renal disease.

    Science.gov (United States)

    Kuntz, Eva; Yusenko, Maria V; Nagy, Anetta; Kovacs, Gyula

    2010-09-01

    Renal cell carcinoma occurs at higher frequency in acquired cystic renal disease than in the general population. We have analyzed 4 tumors obtained from the kidneys of 2 patients with acquired cystic renal disease, including 2 conventional renal cell carcinomas and 2 acquired cystic renal disease-associated tumors, for genetic alterations. DNA changes were established by applying the 44K Agilent Oligonucleotide Array-Based CGH (Agilent Technologies, Waldbronn, Germany), and mutation of VHL gene was detected by direct sequencing of the tumor genome. DNA losses and mutation of the VHL gene, which are characteristic for conventional renal cell carcinomas, were seen in 2 of the tumors. The acquired cystic renal disease-associated eosinophilic-vacuolated cell tumor showed gain of chromosomes 3 and 16. No DNA alterations occurred in the papillary clear cell tumor. We suggest that not only the morphology but also the genetics of renal cell tumors associated with acquired cystic renal disease may differ from those occurring in the general population. PMID:20646738

  18. Comparative Genomics of Two Closely Related Wolbachia with Different Reproductive Effects on Hosts.

    Science.gov (United States)

    Newton, Irene L G; Clark, Michael E; Kent, Bethany N; Bordenstein, Seth R; Qu, Jiaxin; Richards, Stephen; Kelkar, Yogeshwar D; Werren, John H

    2016-01-01

    Wolbachia pipientis are obligate intracellular bacteria commonly found in many arthropods. They can induce various reproductive alterations in hosts, including cytoplasmic incompatibility, male-killing, feminization, and parthenogenetic development, and can provide host protection against some viruses and other pathogens. Wolbachia differ from many other primary endosymbionts in arthropods because they undergo frequent horizontal transmission between hosts and are well known for an abundance of mobile elements and relatively high recombination rates. Here, we compare the genomes of two closely related Wolbachia (with 0.57% genome-wide synonymous divergence) that differ in their reproductive effects on hosts. wVitA induces a sperm-egg incompatibility (also known as cytoplasmic incompatibility) in the parasitoid insect Nasonia vitripennis, whereas wUni causes parthenogenetic development in a different parasitoid, Muscidifurax uniraptor Although these bacteria are closely related, the genomic comparison reveals rampant rearrangements, protein truncations (particularly in proteins predicted to be secreted), and elevated substitution rates. These changes occur predominantly in the wUni lineage, and may be due in part to adaptations by wUni to a new host environment, or its phenotypic shift to parthenogenesis induction. However, we conclude that the approximately 8-fold elevated synonymous substitution rate in wUni is due to a either an elevated mutation rate or a greater number of generations per year in wUni, which occurs in semitropical host species. We identify a set of genes whose loss or pseudogenization in the wUni lineage implicates them in the phenotypic shift from cytoplasmic incompatibility to parthenogenesis induction. Finally, comparison of these closely related strains allows us to determine the fine-scale mutation patterns in Wolbachia Although Wolbachia are AT rich, mutation probabilities estimated from 4-fold degenerate sites are not AT biased, and

  19. Experimental analysis of oligonucleotide microarray design criteria to detect deletions by comparative genomic hybridization

    Directory of Open Access Journals (Sweden)

    Moerman Donald G

    2008-10-01

    Full Text Available Abstract Background Microarray comparative genomic hybridization (CGH is currently one of the most powerful techniques to measure DNA copy number in large genomes. In humans, microarray CGH is widely used to assess copy number variants in healthy individuals and copy number aberrations associated with various diseases, syndromes and disease susceptibility. In model organisms such as Caenorhabditis elegans (C. elegans the technique has been applied to detect mutations, primarily deletions, in strains of interest. Although various constraints on oligonucleotide properties have been suggested to minimize non-specific hybridization and improve the data quality, there have been few experimental validations for CGH experiments. For genomic regions where strict design filters would limit the coverage it would also be useful to quantify the expected loss in data quality associated with relaxed design criteria. Results We have quantified the effects of filtering various oligonucleotide properties by measuring the resolving power for detecting deletions in the human and C. elegans genomes using NimbleGen microarrays. Approximately twice as many oligonucleotides are typically required to be affected by a deletion in human DNA samples in order to achieve the same statistical confidence as one would observe for a deletion in C. elegans. Surprisingly, the ability to detect deletions strongly depends on the oligonucleotide 15-mer count, which is defined as the sum of the genomic frequency of all the constituent 15-mers within the oligonucleotide. A similarity level above 80% to non-target sequences over the length of the probe produces significant cross-hybridization. We recommend the use of a fairly large melting temperature window of up to 10°C, the elimination of repeat sequences, the elimination of homopolymers longer than 5 nucleotides, and a threshold of -1 kcal/mol on the oligonucleotide self-folding energy. We observed very little difference in data

  20. Comparative Genomics of Two Closely Related Wolbachia with Different Reproductive Effects on Hosts

    Science.gov (United States)

    Newton, Irene L.G.; Clark, Michael E.; Kent, Bethany N.; Bordenstein, Seth R.; Qu, Jiaxin; Richards, Stephen; Kelkar, Yogeshwar D.; Werren, John H.

    2016-01-01

    Wolbachia pipientis are obligate intracellular bacteria commonly found in many arthropods. They can induce various reproductive alterations in hosts, including cytoplasmic incompatibility, male-killing, feminization, and parthenogenetic development, and can provide host protection against some viruses and other pathogens. Wolbachia differ from many other primary endosymbionts in arthropods because they undergo frequent horizontal transmission between hosts and are well known for an abundance of mobile elements and relatively high recombination rates. Here, we compare the genomes of two closely related Wolbachia (with 0.57% genome-wide synonymous divergence) that differ in their reproductive effects on hosts. wVitA induces a sperm–egg incompatibility (also known as cytoplasmic incompatibility) in the parasitoid insect Nasonia vitripennis, whereas wUni causes parthenogenetic development in a different parasitoid, Muscidifurax uniraptor. Although these bacteria are closely related, the genomic comparison reveals rampant rearrangements, protein truncations (particularly in proteins predicted to be secreted), and elevated substitution rates. These changes occur predominantly in the wUni lineage, and may be due in part to adaptations by wUni to a new host environment, or its phenotypic shift to parthenogenesis induction. However, we conclude that the approximately 8-fold elevated synonymous substitution rate in wUni is due to a either an elevated mutation rate or a greater number of generations per year in wUni, which occurs in semitropical host species. We identify a set of genes whose loss or pseudogenization in the wUni lineage implicates them in the phenotypic shift from cytoplasmic incompatibility to parthenogenesis induction. Finally, comparison of these closely related strains allows us to determine the fine-scale mutation patterns in Wolbachia. Although Wolbachia are AT rich, mutation probabilities estimated from 4-fold degenerate sites are not AT biased, and

  1. Comparative genomics of the emerging human pathogen Photorhabdus asymbiotica with the insect pathogen Photorhabdus luminescens

    Directory of Open Access Journals (Sweden)

    Churcher Carol

    2009-07-01

    Full Text Available Abstract Background The Gram-negative bacterium Photorhabdus asymbiotica (Pa has been recovered from human infections in both North America and Australia. Recently, Pa has been shown to have a nematode vector that can also infect insects, like its sister species the insect pathogen P. luminescens (Pl. To understand the relationship between pathogenicity to insects and humans in Photorhabdus we have sequenced the complete genome of Pa strain ATCC43949 from North America. This strain (formerly referred to as Xenorhabdus luminescens strain 2 was isolated in 1977 from the blood of an 80 year old female patient with endocarditis, in Maryland, USA. Here we compare the complete genome of Pa ATCC43949 with that of the previously sequenced insect pathogen P. luminescens strain TT01 which was isolated from its entomopathogenic nematode vector collected from soil in Trinidad and Tobago. Results We found that the human pathogen Pa had a smaller genome (5,064,808 bp than that of the insect pathogen Pl (5,688,987 bp but that each pathogen carries approximately one megabase of DNA that is unique to each strain. The reduced size of the Pa genome is associated with a smaller diversity in insecticidal genes such as those encoding the Toxin complexes (Tc's, Makes caterpillars floppy (Mcf toxins and the Photorhabdus Virulence Cassettes (PVCs. The Pa genome, however, also shows the addition of a plasmid related to pMT1 from Yersinia pestis and several novel pathogenicity islands including a novel Type Three Secretion System (TTSS encoding island. Together these data suggest that Pa may show virulence against man via the acquisition of the pMT1-like plasmid and specific effectors, such as SopB, that promote its persistence inside human macrophages. Interestingly the loss of insecticidal genes in Pa is not reflected by a loss of pathogenicity towards insects. Conclusion Our results suggest that North American isolates of Pa have acquired virulence against man via the

  2. Comparative genomics allowed the identification of drug targets against human fungal pathogens

    Directory of Open Access Journals (Sweden)

    Martins Natalia F

    2011-01-01

    Full Text Available Abstract Background The prevalence of invasive fungal infections (IFIs has increased steadily worldwide in the last few decades. Particularly, there has been a global rise in the number of infections among immunosuppressed people. These patients present severe clinical forms of the infections, which are commonly fatal, and they are more susceptible to opportunistic fungal infections than non-immunocompromised people. IFIs have historically been associated with high morbidity and mortality, partly because of the limitations of available antifungal therapies, including side effects, toxicities, drug interactions and antifungal resistance. Thus, the search for alternative therapies and/or the development of more specific drugs is a challenge that needs to be met. Genomics has created new ways of examining genes, which open new strategies for drug development and control of human diseases. Results In silico analyses and manual mining selected initially 57 potential drug targets, based on 55 genes experimentally confirmed as essential for Candida albicans or Aspergillus fumigatus and other 2 genes (kre2 and erg6 relevant for fungal survival within the host. Orthologs for those 57 potential targets were also identified in eight human fungal pathogens (C. albicans, A. fumigatus, Blastomyces dermatitidis, Paracoccidioides brasiliensis, Paracoccidioides lutzii, Coccidioides immitis, Cryptococcus neoformans and Histoplasma capsulatum. Of those, 10 genes were present in all pathogenic fungi analyzed and absent in the human genome. We focused on four candidates: trr1 that encodes for thioredoxin reductase, rim8 that encodes for a protein involved in the proteolytic activation of a transcriptional factor in response to alkaline pH, kre2 that encodes for α-1,2-mannosyltransferase and erg6 that encodes for Δ(24-sterol C-methyltransferase. Conclusions Our data show that the comparative genomics analysis of eight fungal pathogens enabled the identification of

  3. Comparative genomics and association mapping approaches for blast resistant genes in finger millet using SSRs.

    Directory of Open Access Journals (Sweden)

    B Kalyana Babu

    Full Text Available The major limiting factor for production and productivity of finger millet crop is blast disease caused by Magnaporthe grisea. Since, the genome sequence information available in finger millet crop is scarce, comparative genomics plays a very important role in identification of genes/QTLs linked to the blast resistance genes using SSR markers. In the present study, a total of 58 genic SSRs were developed for use in genetic analysis of a global collection of 190 finger millet genotypes. The 58 SSRs yielded ninety five scorable alleles and the polymorphism information content varied from 0.186 to 0.677 at an average of 0.385. The gene diversity was in the range of 0.208 to 0.726 with an average of 0.487. Association mapping for blast resistance was done using 104 SSR markers which identified four QTLs for finger blast and one QTL for neck blast resistance. The genomic marker RM262 and genic marker FMBLEST32 were linked to finger blast disease at a P value of 0.007 and explained phenotypic variance (R² of 10% and 8% respectively. The genomic marker UGEP81 was associated to finger blast at a P value of 0.009 and explained 7.5% of R². The QTLs for neck blast was associated with the genomic SSR marker UGEP18 at a P value of 0.01, which explained 11% of R². Three QTLs for blast resistance were found common by using both GLM and MLM approaches. The resistant alleles were found to be present mostly in the exotic genotypes. Among the genotypes of NW Himalayan region of India, VHC3997, VHC3996 and VHC3930 were found highly resistant, which may be effectively used as parents for developing blast resistant cultivars in the NW Himalayan region of India. The markers linked to the QTLs for blast resistance in the present study can be further used for cloning of the full length gene, fine mapping and their further use in the marker assisted breeding programmes for introgression of blast resistant alleles into locally adapted cultivars.

  4. Comparative Complete Genome Sequence Analysis of the Amino Acid Replacements Responsible for the Thermostability of Corynebacterium efficiens

    OpenAIRE

    Nishio, Yousuke; Nakamura, Yoji; Kawarabayasi, Yutaka; Usuda, Yoshihiro; Kimura, Eiichiro; Sugimoto, Shinichi; Matsui, Kazuhiko; Yamagishi, Akihiko; Kikuchi, Hisashi; Ikeo, Kazuho; Gojobori, Takashi

    2003-01-01

    Corynebacterium efficiens is the closest relative of Corynebacterium glutamicum, a species widely used for the industrial production of amino acids. C. efficiens but not C. glutamicum can grow above 40°C. We sequenced the complete C. efficiens genome to investigate the basis of its thermostability by comparing its genome with that of C. glutamicum. The difference in GC content between the species was reflected in codon usage and nucleotide substitutions. Our compar...

  5. Comparative Genomic and Transcriptional Analyses of CRISPR Systems Across the Genus Pyrobaculum

    Directory of Open Access Journals (Sweden)

    David L Bernick

    2012-07-01

    Full Text Available Within the domain Archaea, the CRISPR immune system appears to be nearly ubiquitous based on computational genome analyses. Initial studies in bacteria demonstrated that the CRISPR system targets invading plasmid and viral DNA. Recent experiments in the model archaeon Pyrococcus furiosus uncovered a novel RNA-targeting variant of the CRISPR system potentially unique to archaea. Because our understanding of CRISPR system evolution in other archaea is limited, we have taken a comparative genomic and transcriptomic view of the CRISPR arrays across six diverse species within the crenarchaeal genus Pyrobaculum. We present transcriptional data from each of four species in the genus (P. aerophilum, P. islandicum, P. calidifontis, P. arsenaticum, analyzing mature CRISPR-associated small RNA abundance from over 20 arrays. Within the genus, there is remarkable conservation of CRISPR array structure, as well as unique features that are have not been studied in other archaeal systems. These unique features include: a nearly invariant CRISPR promoter, conservation of direct repeat families, the 5' polarity of CRISPR-associated small RNA abundance, and a novel CRISPR-specific association with homologues of nurA and herA. These analyses provide a genus-level evolutionary perspective on archaeal CRISPR systems, broadening our understanding beyond existing non-comparative model systems.

  6. Comparative population genomics of latitudinal variation in Drosophila simulans and Drosophila melanogaster.

    Science.gov (United States)

    Machado, Heather E; Bergland, Alan O; O'Brien, Katherine R; Behrman, Emily L; Schmidt, Paul S; Petrov, Dmitri A

    2016-02-01

    Examples of clinal variation in phenotypes and genotypes across latitudinal transects have served as important models for understanding how spatially varying selection and demographic forces shape variation within species. Here, we examine the selective and demographic contributions to latitudinal variation through the largest comparative genomic study to date of Drosophila simulans and Drosophila melanogaster, with genomic sequence data from 382 individual fruit flies, collected across a spatial transect of 19 degrees latitude and at multiple time points over 2 years. Consistent with phenotypic studies, we find less clinal variation in D. simulans than D. melanogaster, particularly for the autosomes. Moreover, we find that clinally varying loci in D. simulans are less stable over multiple years than comparable clines in D. melanogaster. D. simulans shows a significantly weaker pattern of isolation by distance than D. melanogaster and we find evidence for a stronger contribution of migration to D. simulans population genetic structure. While population bottlenecks and migration can plausibly explain the differences in stability of clinal variation between the two species, we also observe a significant enrichment of shared clinal genes, suggesting that the selective forces associated with climate are acting on the same genes and phenotypes in D. simulans and D. melanogaster. PMID:26523848

  7. Global MLST of Salmonella Typhi Revisited in Post-Genomic Era: Genetic conservation, Population Structure and Comparative genomics of rare sequence types

    Directory of Open Access Journals (Sweden)

    Kien-Pong eYap

    2016-03-01

    Full Text Available Typhoid fever, caused by Salmonella enterica serovar Typhi, remains an important public health burden in Southeast Asia and other endemic countries. Various genotyping methods have been applied to study the genetic variations of this human-restricted pathogen. Multilocus Sequence Typing (MLST is one of the widely accepted methods, and recently, there is a growing interest in the re-application of MLST in the post-genomic era. In this study, we provide the global MLST distribution of S. Typhi utilizing both publicly available 1,826 S. Typhi genome sequences in addition to performing conventional MLST on S. Typhi strains isolated from various endemic regions spanning over a century. Our global MLST analysis confirms the predominance of two sequence types (ST1 and ST2 co-existing in the endemic regions. Interestingly, S. Typhi strains with ST8 are currently confined within the African continent. Comparative genomic analyses of ST8 and other rare STs with genomes of ST1/ST2 revealed unique mutations in important virulence genes such as flhB, sipC and tviD that may explain the variations that differentiate between seemingly successful (widespread and unsuccessful (poor dissemination S. Typhi populations. Large scale whole-genome phylogeny demonstrated evidence of phylogeographical structuring and showed that ST8 may have diverged from the earlier ancestral population of ST1 and ST2, which later lost some of its fitness advantages, leading to poor worldwide dissemination. In response to the unprecedented increase in genomic data, this study demonstrates and highlights the utility of large-scale genome-based MLST as a quick and effective approach to narrow the scope of in-depth comparative genomic analysis and consequently provide new insights into the fine scale of pathogen evolution and population structure.

  8. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python)

    Science.gov (United States)

    Rutllant, Josep

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value. PMID:27200191

  9. Planar patterned stretchable electrode arrays based on flexible printed circuits

    International Nuclear Information System (INIS)

    For stretchable electronics to achieve broad industrial application, they must be reliable to manufacture and must perform robustly while undergoing large deformations. We present a new strategy for creating planar stretchable electronics and demonstrate one such device, a stretchable microelectrode array based on flex circuit technology. Stretchability is achieved through novel, rationally designed perforations that provide islands of low strain and continuous low-strain pathways for conductive traces. This approach enables the device to maintain constant electrical properties and planarity while undergoing applied strains up to 15%. Materials selection is not limited to polyimide composite devices and can potentially be implemented with either soft or hard substrates and can incorporate standard metals or new nano-engineered conductors. By using standard flex circuit technology, our planar microelectrode device achieved constant resistances for strains up to 20% with less than a 4% resistance offset over 120 000 cycles at 10% strain. (paper)

  10. Tin Oxide Nanorod Array-Based Electrochemical Hydrogen Peroxide Biosensor

    Directory of Open Access Journals (Sweden)

    Liu Jinping

    2010-01-01

    Full Text Available Abstract SnO2 nanorod array grown directly on alloy substrate has been employed as the working electrode of H2O2 biosensor. Single-crystalline SnO2 nanorods provide not only low isoelectric point and enough void spaces for facile horseradish peroxidase (HRP immobilization but also numerous conductive channels for electron transport to and from current collector; thus, leading to direct electrochemistry of HRP. The nanorod array-based biosensor demonstrates high H2O2 sensing performance in terms of excellent sensitivity (379 μA mM−1 cm−2, low detection limit (0.2 μM and high selectivity with the apparent Michaelis–Menten constant estimated to be as small as 33.9 μM. Our work further demonstrates the advantages of ordered array architecture in electrochemical device application and sheds light on the construction of other high-performance enzymatic biosensors.

  11. A model for carbohydrate metabolism in the diatom Phaeodactylum tricornutum deduced from comparative whole genome analysis.

    Directory of Open Access Journals (Sweden)

    Peter G Kroth

    Full Text Available BACKGROUND: Diatoms are unicellular algae responsible for approximately 20% of global carbon fixation. Their evolution by secondary endocytobiosis resulted in a complex cellular structure and metabolism compared to algae with primary plastids. METHODOLOGY/PRINCIPAL FINDINGS: The whole genome sequence of the diatom Phaeodactylum tricornutum has recently been completed. We identified and annotated genes for enzymes involved in carbohydrate pathways based on extensive EST support and comparison to the whole genome sequence of a second diatom, Thalassiosira pseudonana. Protein localization to mitochondria was predicted based on identified similarities to mitochondrial localization motifs in other eukaryotes, whereas protein localization to plastids was based on the presence of signal peptide motifs in combination with plastid localization motifs previously shown to be required in diatoms. We identified genes potentially involved in a C4-like photosynthesis in P. tricornutum and, on the basis of sequence-based putative localization of relevant proteins, discuss possible differences in carbon concentrating mechanisms and CO(2 fixation between the two diatoms. We also identified genes encoding enzymes involved in photorespiration with one interesting exception: glycerate kinase was not found in either P. tricornutum or T. pseudonana. Various Calvin cycle enzymes were found in up to five different isoforms, distributed between plastids, mitochondria and the cytosol. Diatoms store energy either as lipids or as chrysolaminaran (a beta-1,3-glucan outside of the plastids. We identified various beta-glucanases and large membrane-bound glucan synthases. Interestingly most of the glucanases appear to contain C-terminal anchor domains that may attach the enzymes to membranes. CONCLUSIONS/SIGNIFICANCE: Here we present a detailed synthesis of carbohydrate metabolism in diatoms based on the genome sequences of Thalassiosira pseudonana and Phaeodactylum tricornutum

  12. Comparative genomics of Ceriporiopsis subvermispora and Phanerochaete chrysosporium provide insight into selective ligninolysis

    Energy Technology Data Exchange (ETDEWEB)

    Fernandez-Fueyo, Elena; Ruiz-Duenas, Francisco J.; Ferreira, Patrica; Floudas, Dimitrios; HIbbett, David S.; Canessa, Paulo; Larrondo, Luis F.; James, Tim Y.; Seelenfreund, Daniela; Lobos, Sergio; Polanco, Ruben; Tello, Mario; Honda, Yoichi; Watanabe, Takahito; Watanabe, Takashi; Ryu, Jae San; Kubicek, Christian P.; Schmoll, Monika; Gaskell, Jill; Hammel, Kenneth E.; John, Franz J.; Vanden Wymelenberg, Amber; Sabat, Grzegorz; Splinter BonDurant, Sandra; Syed, Khajamohiddin; Yadav, Jagjit S.; Doddapaneni, Harshavardhan; Subramanian, Venkataramanan; Lavin, Jose L.; Oguiza, Jose A.; Perez, Gumer; Pisabarro, Antonio G.; Ramirez, Lucia; Santoyo, Francisco; Master, Emma; Coutinho, Pedro M.; Henrissat, Bernard; Lombard, Vincent; Magnuson, Jon Karl; Kues, Ursula; Hori, Chiaki; Igarashi, Kiyohiko; Samejima, Masahiro; Held, Benjamin W.; Barry, Kerrie W.; LaButti, Kurt M.; Lapidus, Alla; Lindquist, Erika A.; Lucas, Susan M.; Riley, Robert; Salamov, Asaf A.; Hoffmeister, Dirk; Schwenk, Daniel; Hadar, Yitzhak; Yarden, Oded; de Vries, Ronald P.; Wiebenga, Ad; Stenlid, Jan; Eastwood, Daniel; Grigoriev, Igor V.; Berka, Randy M.; Blanchette, Robert A.; Kersten, Phil; Martinez, Angel T.; Vicuna, Rafael; Cullen, Dan

    2011-12-06

    Efficient lignin depolymerization is unique to the wood decay basidiomycetes, collectively referred to as white rot fungi. Phanerochaete chrysosporium simultaneously degrades lignin and cellulose, whereas the closely related species, Ceriporiopsis subvermispora, also depolymerizes lignin but may do so with relatively little cellulose degradation. To investigate the basis for selective ligninolysis, we conducted comparative genome analysis of C. subvermispora and P. chrysosporium. Genes encoding manganese peroxidase numbered 13 and five in C. subvermispora and P. chrysosporium, respectively. In addition, the C. subvermispora genome contains at least seven genes predicted to encode laccases, whereas the P. chrysosporium genome contains none. We also observed expansion of the number of C. subvermispora desaturase-encoding genes putatively involved in lipid metabolism. Microarray-based transcriptome analysis showed substantial up-regulation of several desaturase and MnP genes in wood-containing medium. MS identified MnP proteins in C. subvermispora culture filtrates, but none in P. chrysosporium cultures. These results support the importance of MnP and a lignin degradation mechanism whereby cleavage of the dominant nonphenolic structures is mediated by lipid peroxidation products. Two C. subvermispora genes were predicted to encode peroxidases structurally similar to P. chrysosporium lignin peroxidase and, following heterologous expression in Escherichia coli, the enzymes were shown to oxidize high redox potential substrates, but not Mn2. Apart from oxidative lignin degradation, we also examined cellulolytic and hemicellulolytic systems in both fungi. In summary, the C. subvermispora genetic inventory and expression patterns exhibit increased oxidoreductase potential and diminished cellulolytic capability relative to P. chrysosporium.

  13. Evolutionary relationships of Fusobacterium nucleatum based on phylogenetic analysis and comparative genomics

    Directory of Open Access Journals (Sweden)

    Moreira David

    2004-11-01

    Full Text Available Abstract Background The phylogenetic position and evolutionary relationships of Fusobacteria remain uncertain. Especially intriguing is their relatedness to low G+C Gram positive bacteria (Firmicutes by ribosomal molecular phylogenies, but their possession of a typical gram negative outer membrane. Taking advantage of the recent completion of the Fusobacterium nucleatum genome sequence we have examined the evolutionary relationships of Fusobacterium genes by phylogenetic analysis and comparative genomics tools. Results The data indicate that Fusobacterium has a core genome of a very different nature to other bacterial lineages, and branches out at the base of Firmicutes. However, depending on the method used, 35–56% of Fusobacterium genes appear to have a xenologous origin from bacteroidetes, proteobacteria, spirochaetes and the Firmicutes themselves. A high number of hypothetical ORFs with unusual codon usage and short lengths were found and hypothesized to be remnants of transferred genes that were discarded. Some proteins and operons are also hypothesized to be of mixed ancestry. A large portion of the Gram-negative cell wall-related genes seems to have been transferred from proteobacteria. Conclusions Many instances of similarity to other inhabitants of the dental plaque that have been sequenced were found. This suggests that the close physical contact found in this environment might facilitate horizontal gene transfer, supporting the idea of niche-specific gene pools. We hypothesize that at a point in time, probably associated to the rise of mammals, a strong selective pressure might have existed for a cell with a Clostridia-like metabolic apparatus but with the adhesive and immune camouflage features of Proteobacteria.

  14. Characterization and comparative genomic analysis of bacteriophages infecting members of the Bacillus cereus group.

    Science.gov (United States)

    Lee, Ju-Hoon; Shin, Hakdong; Ryu, Sangryeol

    2014-05-01

    The Bacillus cereus group phages infecting B. cereus, B. anthracis, and B. thuringiensis (Bt) have been studied at the molecular level and, recently, at the genomic level to control the pathogens B. cereus and B. anthracis and to prevent phage contamination of the natural insect pesticide Bt. A comparative phylogenetic analysis has revealed three different major phage groups with different morphologies (Myoviridae for group I, Siphoviridae for group II, and Tectiviridae for group III), genome size (group I > group II > group III), and lifestyle (virulent for group I and temperate for group II and III). A subsequent phage genome comparison using a dot plot analysis showed that phages in each group are highly homologous, substantiating the grouping of B. cereus phages. Endolysin is a host lysis protein that contains two conserved domains: a cell-wall-binding domain (CBD) and an enzymatic activity domain (EAD). In B. cereus sensu lato phage group I, four different endolysin groups have been detected, according to combinations of two types of CBD and four types of EAD. Group I phages have two copies of tail lysins and one copy of endolysin, but the functions of the tail lysins are still unknown. In the B. cereus sensu lato phage group II, the B. anthracis phages have been studied and applied for typing and rapid detection of pathogenic host strains. In the B. cereus sensu lato phage group III, the B. thuringiensis phages Bam35 and GIL01 have been studied to understand phage entry and lytic switch regulation mechanisms. In this review, we suggest that further study of the B. cereus group phages would be useful for various phage applications, such as biocontrol, typing, and rapid detection of the pathogens B. cereus and B. anthracis and for the prevention of phage contamination of the natural insect pesticide Bt. PMID:24264384

  15. The complete mitochondrial genome of eastern lowland gorilla, Gorilla beringei graueri, and comparative mitochondrial genomics of Gorilla species.

    Science.gov (United States)

    Hu, Xiao-di; Gao, Li-zhi

    2016-01-01

    In this study, we determined the complete mitochondrial (mt) genome of eastern lowland gorilla, Gorilla beringei graueri for the first time. The total genome was 16,416 bp in length. It contained a total of 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes and 1 control region (D-loop region). The base composition was A (30.88%), G (13.10%), C (30.89%) and T (25.13%), indicating that the percentage of A+T (56.01%) was higher than G+C (43.99%). Comparisons with the other publicly available Gorilla mitogenome showed the conservation of gene order and base compositions but a bunch of nucleotide diversity. This complete mitochondrial genome sequence will provide valuable genetic information for further studies on conservation genetics of eastern lowland gorilla. PMID:25162588

  16. Integrative Genomics Viewer

    OpenAIRE

    James T Robinson; Thorvaldsdóttir, Helga; Winckler, Wendy; Guttman, Mitchell; Lander, Eric S; Getz, Gad; Mesirov, Jill P.

    2011-01-01

    To the Editor: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Exper...

  17. Comparative genomic analysis of single-molecule sequencing and hybrid approaches for finishing the Clostridium autoethanogenum JA1-1 strain DSM 10061 genome

    Energy Technology Data Exchange (ETDEWEB)

    Brown, Steven D [ORNL; Nagaraju, Shilpa [LanzaTech; Utturkar, Sagar M [ORNL; De Tissera, Sashini [LanzaTech; Segovia, Simón [LanzaTech; Mitchell, Wayne [LanzaTech; Land, Miriam L [ORNL; Dassanayake, Asela [LanzaTech; Köpke, Michael [LanzaTech

    2014-01-01

    Background Clostridium autoethanogenum strain JA1-1 (DSM 10061) is an acetogen capable of fermenting CO, CO2 and H2 (e.g. from syngas or waste gases) into biofuel ethanol and commodity chemicals such as 2,3-butanediol. A draft genome sequence consisting of 100 contigs has been published. Results A closed, high-quality genome sequence for C. autoethanogenum DSM10061 was generated using only the latest single-molecule DNA sequencing technology and without the need for manual finishing. It is assigned to the most complex genome classification based upon genome features such as repeats, prophage, nine copies of the rRNA gene operons. It has a low G + C content of 31.1%. Illumina, 454, Illumina/454 hybrid assemblies were generated and then compared to the draft and PacBio assemblies using summary statistics, CGAL, QUAST and REAPR bioinformatics tools and comparative genomic approaches. Assemblies based upon shorter read DNA technologies were confounded by the large number repeats and their size, which in the case of the rRNA gene operons were ~5 kb. CRISPR (Clustered Regularly Interspaced Short Paloindromic Repeats) systems among biotechnologically relevant Clostridia were classified and related to plasmid content and prophages. Potential associations between plasmid content and CRISPR systems may have implications for historical industrial scale Acetone-Butanol-Ethanol (ABE) fermentation failures and future large scale bacterial fermentations. While C. autoethanogenum contains an active CRISPR system, no such system is present in the closely related Clostridium ljungdahlii DSM 13528. A common prophage inserted into the Arg-tRNA shared between the strains suggests a common ancestor. However, C. ljungdahlii contains several additional putative prophages and it has more than double the amount of prophage DNA compared to C. autoethanogenum. Other differences include important metabolic genes for central metabolism (as an additional hydrogenase and the absence of a

  18. Functional analysis and comparative genomics of expressed sequence tags from the lycophyte Selaginella moellendorffii

    Directory of Open Access Journals (Sweden)

    Tanurdzic Milos

    2005-06-01

    Full Text Available Abstract Background The lycophyte Selaginella moellendorffii is a member of one of the oldest lineages of vascular plants on Earth. Fossil records show that the lycophyte clade arose 400 million years ago, 150–200 million years earlier than angiosperms, a group of plants that includes the well-studied flowering plant Arabidopsis thaliana. S. moellendorffii has a genome size of approximately 100 Mbp, as small or smaller than that of A. thaliana. S. moellendorffii has the potential to provide significant comparative information to better understand the evolution of vascular plants. Results We sequenced 2181 Expressed Sequence Tags (ESTs from a S. moellendorffii cDNA library. One thousand three hundred and one non-redundant sequences were assembled, containing 291 contigs and 1010 singletons. Approximately 75% of the ESTs matched proteins in the non-redundant protein database. Among 1301 clusters, 343 were categorized according to Gene Ontology (GO hierarchy and were compared to the GO mapping of A. thaliana tentative consensus sequences. We compared S. moellendorffii ESTs to the A. thaliana and Physcomitrella patens EST databases, using the tBLASTX algorithm. Approximately 60% of the ESTs exhibited similarity with both A. thaliana and P. patens ESTs; whereas, 13% and 1% of the ESTs had exclusive similarity with A. thaliana and P. patens ESTs, respectively. A substantial proportion of the ESTs (26% had no match with A. thaliana or P. patens ESTs. Conclusion We discovered 1301 putative unigenes in S. moellendorffii. These results give an initial insight into its transcriptome that will aid in the study of the S. moellendorffii genome in the near future.

  19. Metabolic characteristics of a glycogen-accumulating organism in Defluviicoccus cluster II revealed by comparative genomics.

    Science.gov (United States)

    Wang, Zhiping; Guo, Feng; Mao, Yanping; Xia, Yu; Zhang, Tong

    2014-11-01

    Glycogen-accumulating organisms (GAOs) may compete with phosphate-accumulating organisms (PAOs) for short-chain fatty acids (VFAs) in anaerobic polyhydroxyalkanoates (PHA) synthesis, but no consequently aerobic polyphosphate accumulation in enhanced biological phosphorus removal (EBPR) process, thus deteriorating the EBPR process. They are detected frequently in the deteriorated EBPR process, but their metabolisms are still far from our comprehensions for there is seldom pure culture. In this study, a nearly complete draft genome of a GAOs in Defluviicoccus cluster II, GAO-HK, is recruited from the metagenome of activated sludge in a full-scale industrial anoxic/aerobic wastewater plant. Comparative genomics reveal similar metabolisms of PHA and glycogen in GAOs of GAO-HK, Defluviicoccus tetraformis TFO71 (TFO71) and Competibacter phosphatis clade IIA (CPIIA), and PAOs of Accumulibacter clade IIA UW-1 (UW-1) and Tetrasphaera elongata Lp2 (Lp2). Although there are similar gene cassettes related with polyphosphate metabolism in these GAOs and PAOs, especially for Defluviicoccus-relative bacteria and UW-1, ppk1 in GAOs are diverse from those in the identified PAOs, implying the difference of polyphosphate metabolism in GAOs and PAOs. Additionally, genes related to the dissimilatory denitrification are absent in TFO71 and GAO-HK, implying that additional nitrate or nitrite may favor PAOs over Defluviicoccus-relative GAOs. Therefore, PAOs suffering from competition of Defluviicoccus-relative GAOs might be rescued with the additional nitrate/nitrite, which is important to improve the stability of EBPR processes. PMID:24889288

  20. Building the genomic nation: 'Homo Brasilis' and the 'Genoma Mexicano' in comparative cultural perspective.

    Science.gov (United States)

    Kent, Michael; García-Deister, Vivette; López-Beltrán, Carlos; Santos, Ricardo Ventura; Schwartz-Marín, Ernesto; Wade, Peter

    2015-12-01

    Abstract This article explores the relationship between genetic research, nationalism and the construction of collective social identities in Latin America. It makes a comparative analysis of two research projects--the 'Genoma Mexicano' and the 'Homo Brasilis'--both of which sought to establish national and genetic profiles. Both have reproduced and strengthened the idea of their respective nations of focus, incorporating biological elements into debates on social identities. Also, both have placed the unifying figure of the mestizo/mestiço at the heart of national identity constructions, and in so doing have displaced alternative identity categories, such as those based on race. However, having been developed in different national contexts, these projects have had distinct scientific and social trajectories: in Mexico, the genomic mestizo is mobilized mainly in relation to health, while in Brazil the key arena is that of race. We show the importance of the nation as a frame for mobilizing genetic data in public policy debates, and demonstrate how race comes In and out of focus in different Latin American national contexts of genomic research, while never completely disappearing. PMID:27479999

  1. Chromosomal imbalances in malignant peripheral nerve sheath tumor detected by metaphase and microarray comparative genomic hybridization.

    Science.gov (United States)

    Nakagawa, Yasuko; Yoshida, Aki; Numoto, Kunihiko; Kunisada, Toshiyuki; Wai, Daniel; Ohata, Norihide; Takeda, Ken; Kawai, Akira; Ozaki, Toshifumi

    2006-02-01

    Malignant peripheral nerve sheath tumors (MPNSTs) are highly malignant tumors affecting adolescents and adults. There have been a few reports on chromosomal aberrations of MPNSTs; however, the tumor-specific alteration remains unknown. We characterized the genomic alterations in 8 MPNSTs and 8 schwannomas by metaphase comparative genomic hybridization (CGH). In 5 of 8 MPNSTs, microarray CGH was added for more detailed analyses. Frequent gains were identified on 3q13-26, 5p13-14, and 12q11-23 and frequent losses were at 1p31, 10p, 11q24-qter, 16, and 17. Microarray CGH revealed frequent gains of EGFR, DAB2, MSH2, KCNK12, DDX15, CDK6, and LAMA3, and losses of CDH1, GLTSCR2, EGR1, CTSB, GATA3, and SULT2A1. These genes seem to be responsible for developing MPNSTs. The concordance rate between metaphase CGH and microarray CGH was 66%. Metaphase CGH was useful for identifying chromosomal alterations before applying microarray CGH. PMID:16391845

  2. Microdeletion and microduplication analysis of chinese conotruncal defects patients with targeted array comparative genomic hybridization.

    Directory of Open Access Journals (Sweden)

    Xiaohui Gong

    Full Text Available OBJECTIVE: The current study aimed to develop a reliable targeted array comparative genomic hybridization (aCGH to detect microdeletions and microduplications in congenital conotruncal defects (CTDs, especially on 22q11.2 region, and for some other chromosomal aberrations, such as 5p15-5p, 7q11.23 and 4p16.3. METHODS: Twenty-seven patients with CTDs, including 12 pulmonary atresia (PA, 10 double-outlet right ventricle (DORV, 3 transposition of great arteries (TGA, 1 tetralogy of Fallot (TOF and one ventricular septal defect (VSD, were enrolled in this study and screened for pathogenic copy number variations (CNVs, using Agilent 8 x 15K targeted aCGH. Real-time quantitative polymerase chain reaction (qPCR was performed to test the molecular results of targeted aCGH. RESULTS: Four of 27 patients (14.8% had 22q11.2 CNVs, 1 microdeletion and 3 microduplications. qPCR test confirmed the microdeletion and microduplication detected by the targeted aCGH. CONCLUSION: Chromosomal abnormalities were a well-known cause of multiple congenital anomalies (MCA. This aCGH using arrays with high-density coverage in the targeted regions can detect genomic imbalances including 22q11.2 and other 10 kinds CNVs effectively and quickly. This approach has the potential to be applied to detect aneuploidy and common microdeletion/microduplication syndromes on a single microarray.

  3. Comparative phylogenomics uncovers the impact of symbiotic associations on host genome evolution.

    Directory of Open Access Journals (Sweden)

    Pierre-Marc Delaux

    2014-07-01

    Full Text Available Mutualistic symbioses between eukaryotes and beneficial microorganisms of their microbiome play an essential role in nutrition, protection against disease, and development of the host. However, the impact of beneficial symbionts on the evolution of host genomes remains poorly characterized. Here we used the independent loss of the most widespread plant-microbe symbiosis, arbuscular mycorrhization (AM, as a model to address this question. Using a large phenotypic approach and phylogenetic analyses, we present evidence that loss of AM symbiosis correlates with the loss of many symbiotic genes in the Arabidopsis lineage (Brassicales. Then, by analyzing the genome and/or transcriptomes of nine other phylogenetically divergent non-host plants, we show that this correlation occurred in a convergent manner in four additional plant lineages, demonstrating the existence of an evolutionary pattern specific to symbiotic genes. Finally, we use a global comparative phylogenomic approach to track this evolutionary pattern among land plants. Based on this approach, we identify a set of 174 highly conserved genes and demonstrate enrichment in symbiosis-related genes. Our findings are consistent with the hypothesis that beneficial symbionts maintain purifying selection on host gene networks during the evolution of entire lineages.

  4. Genome-scale metabolic modeling of Mucor circinelloides and comparative analysis with other oleaginous species.

    Science.gov (United States)

    Vongsangnak, Wanwipa; Klanchui, Amornpan; Tawornsamretkit, Iyarest; Tatiyaborwornchai, Witthawin; Laoteng, Kobkul; Meechai, Asawin

    2016-06-01

    We present a novel genome-scale metabolic model iWV1213 of Mucor circinelloides, which is an oleaginous fungus for industrial applications. The model contains 1213 genes, 1413 metabolites and 1326 metabolic reactions across different compartments. We demonstrate that iWV1213 is able to accurately predict the growth rates of M. circinelloides on various nutrient sources and culture conditions using Flux Balance Analysis and Phenotypic Phase Plane analysis. Comparative analysis of three oleaginous genome-scale models, including M. circinelloides (iWV1213), Mortierella alpina (iCY1106) and Yarrowia lipolytica (iYL619_PCP) revealed that iWV1213 possesses a higher number of genes involved in carbohydrate, amino acid, and lipid metabolisms that might contribute to its versatility in nutrient utilization. Moreover, the identification of unique and common active reactions among the Zygomycetes oleaginous models using Flux Variability Analysis unveiled a set of gene/enzyme candidates as metabolic engineering targets for cellular improvement. Thus, iWV1213 offers a powerful metabolic engineering tool for multi-level omics analysis, enabling strain optimization as a cell factory platform of lipid-based production. PMID:26911256

  5. A preliminary survey of M. hyopneumoniae virulence factors based on comparative genomic analysis

    Directory of Open Access Journals (Sweden)

    Henrique Bunselmeyer Ferreira

    2007-01-01

    Full Text Available Mycoplasma hyopneumoniae is the etiological agent of porcine enzootic pneumonia (PEP, a major problem for the pig industry. The mechanisms of M. hyopneumoniae pathogenicity allow to predict the existence of several classes of virulence factors, whose study has been essentially restricted to the characterization of adhesion-related and major antigenic proteins. The now available complete sequences of the genomes of two pathogenic and one non-pathogenic strain of M. hyopneumoniae allowed to use a comparative genomics approach to putatively identify virulence genes. In this preliminary survey, we were able to identify 118 CDSs encoding putative virulence factors, based on specific criteria ranging from predicted cell surface location or variation between strains to previous functional studies showing antigenicity or involvement in host-pathogen interaction. This survey is expected to serve as a first step towards the functional characterization of new virulence genes/proteins that will be important not only for a better comprehension of M. hyopneumoniae biology, but also for the development of new and improved protocols for PEP vaccination, diagnosis and treatment.

  6. [Confirmation of a prenatal diagnosis of trisomy 13 with comparative genomic hybridization (CGH)].

    Science.gov (United States)

    Marton, T; Thein, A; Bán, Z; Soothill, P; Oroszné, N J; Papp, Z

    2001-05-13

    Trisomy 13 was diagnosed with genetic amniocentesis in a fetus of a 50 years old patient. Fetopathologic examination has shown cyclopy, proboscis and semilobar holoprosencephaly of the fetus, which is consistent with Patau syndrome. DNA was extracted from frozen liver tissue. Result of comparative genomic hybridization (CGH) was consistent with trisomy 13. They processed the DNA according Kallioniemi's method with modifications. CGH was developed for cancer genetics in mid 90s and now it is widely used in prenatal diagnosis too. CGH allows global analysis to detect unbalanced chromosome gains and losses in the whole genome in a single experiment without the need for cell culture. Significant results can be expected in those cases where conventional cytogenetics is not able to provide an answer either because postmortem tissue is not appropriate for cytogenetics or because the chromosomal change is sub-microscopical. CGH is a fluorescent in situ hybridization on a healthy target metaphase, with equal amount of competitive hybridization of green labelled digested test DNA and red labelled digested control DNA. Red to green ratio is assessed with the help of an image analyser. Green dominance represents chromosome gain, while red shift chromosome loss. In the paper they present the fetopathologic report of a trisomy 13 fetus and illustrate the method being the first Hungarian obstetric case diagnosed by CGH. PMID:11419300

  7. Comparative genomic analysis and phenazine production of Pseudomonas chlororaphis, a plant growth-promoting rhizobacterium

    Directory of Open Access Journals (Sweden)

    Yawen Chen

    2015-06-01

    Full Text Available Pseudomonas chlororaphis HT66, a plant growth-promoting rhizobacterium that produces phenazine-1-carboxamide with high yield, was compared with three genomic sequenced P. chlororaphis strains, GP72, 30–84 and O6. The genome sizes of four strains vary from 6.66 to 7.30 Mb. Comparisons of predicted coding sequences indicated 4833 conserved genes in 5869–6455 protein-encoding genes. Phylogenetic analysis showed that the four strains are closely related to each other. Its competitive colonization indicates that P. chlororaphis can adapt well to its environment. No virulence or virulence-related factor was found in P. chlororaphis. All of the four strains could synthesize antimicrobial metabolites including different phenazines and insecticidal protein FitD. Some genes related to the regulation of phenazine biosynthesis were detected among the four strains. It was shown that P. chlororaphis is a safe PGPR in agricultural application and could also be used to produce some phenazine antibiotics with high-yield.

  8. Comparative analyses of chloroplast genome data representing nine green algae in Sphaeropleales (Chlorophyceae, Chlorophyta)

    OpenAIRE

    Fučíková, Karolina; Lewis, Louise A.; Lewis, Paul O.

    2016-01-01

    The chloroplast genomes of green algae are highly variable in their architecture. In this article we summarize gene content across newly obtained and published chloroplast genomes in Chlorophyceae, including new data from nine of species in Sphaeropleales (Chlorophyceae, Chlorophyta). We present genome architecture information, including genome synteny analysis across two groups of species. Also, we provide a phylogenetic tree obtained from analysis of gene order data for species in Chlorophy...

  9. High frequency of submicroscopic genomic aberrations detected by tiling path array comparative genome hybridisation in patients with isolated congenital heart disease

    DEFF Research Database (Denmark)

    Erdogan, F; Larsen, Lars Allan; Zhang, L;

    2008-01-01

    . Chromosomal imbalances have been identified in many forms of syndromic CHD, but very little is known about the impact of DNA copy number changes in non-syndromic CHD. METHOD: A sub-megabase resolution array comparative genome hybridisation (CGH) screen was carried out on 105 patients with CHD as the sole...

  10. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    DEFF Research Database (Denmark)

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger;

    2016-01-01

    describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences were complemented with selected phenotypic characteristics of the strains. The pan genome analysis showed that F......, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found which corresponded to the previously described prophage 6H, and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which...

  11. Diversity and relationships of cocirculating modern human rotaviruses revealed using large-scale comparative genomics.

    Science.gov (United States)

    McDonald, Sarah M; McKell, Allison O; Rippinger, Christine M; McAllen, John K; Akopov, Asmik; Kirkness, Ewen F; Payne, Daniel C; Edwards, Kathryn M; Chappell, James D; Patton, John T

    2012-09-01

    Group A rotaviruses (RVs) are 11-segmented, double-stranded RNA viruses and are primary causes of gastroenteritis in young children. Despite their medical relevance, the genetic diversity of modern human RVs is poorly understood, and the impact of vaccine use on circulating strains remains unknown. In this study, we report the complete genome sequence analysis of 58 RVs isola