WorldWideScience

Sample records for complete sequences gene

  1. Third-Generation Sequencing and Analysis of Four Complete Pig Liver Esterase Gene Sequences in Clones Identified by Screening BAC Library.

    Science.gov (United States)

    Zhou, Qiongqiong; Sun, Wenjuan; Liu, Xiyan; Wang, Xiliang; Xiao, Yuncai; Bi, Dingren; Yin, Jingdong; Shi, Deshi

    2016-01-01

    Pig liver carboxylesterase (PLE) gene sequences in GenBank are incomplete, which has led to difficulties in studying the genetic structure and regulation mechanisms of gene expression of PLE family genes. The aim of this study was to obtain and analyse complete gene sequences of the PLE family by screening a Rongchang pig BAC library and applying third-generation PacBio sequencing. After a number of existing incomplete PLE isoform gene sequences were analysed, primers were designed based on conserved regions in PLE exons, and the whole pig genome was used as a template for polymerase chain reaction (PCR) amplification. Specific primers were then selected based on the PCR amplification results. A three-step PCR screening method was used to identify PLE-positive clones in the Rongchang pig BAC library, and PacBio third-generation sequencing was performed. BLAST comparisons and other bioinformatics methods were applied for sequence analysis. Five PLE-positive BAC clones, designated BAC-10, BAC-70, BAC-75, BAC-119 and BAC-206, were identified. Sequence analysis yielded the complete sequences of four PLE genes, PLE1, PLE-B9, PLE-C4, and PLE-G2. Complete PLE gene sequences were defined as those containing regulatory sequences, exons, and introns. Not only did the exon sequences of the four genes show a high degree of homology, but the intron sequences were also highly similar. Additionally, the regulatory region of the genes contained two 720 bp reverse complement sequences that may have an important function in the regulation of PLE gene expression. This is the first report to confirm the complete sequences of four PLE genes. In addition, the study demonstrates that each PLE isoform is encoded by a single gene and that the various genes exhibit a high degree of sequence homology, suggesting that the PLE family evolved from a single ancestral gene. Obtaining the complete sequences of these PLE genes provides the necessary foundation for

  2. Complete cDNA sequence and amino acid analysis of a bovine ribonuclease K6 gene.

    Science.gov (United States)

    Pietrowski, D; Förster, M

    2000-01-01

    The complete cDNA sequence of a ribonuclease k6 gene of Bos taurus has been determined. It codes for a protein with 154 amino acids and contains the invariant cysteine, histidine and lysine residues as well as the characteristic motifs specific to ribonuclease active sites. The deduced protein sequence is 27 residues longer than other known ribonucleases k6 and shows amino acid exchanges which could reflect a strain specificity or polymorphism within the bovine genome. Based on sequence similarity we have termed the identified gene bovine ribonuclease k6 b (brk6b).

  3. Complete genome sequence of Fer-de-Lance Virus reveals a novel gene in reptilian Paramyxoviruses

    Science.gov (United States)

    Kurath, G.; Batts, W.N.; Ahne, W.; Winton, J.R.

    2004-01-01

    The complete RNA genome sequence of the archetype reptilian paramyxovirus, Fer-de-Lance virus (FDLV), has been determined. The genome is 15,378 nucleotides in length and consists of seven nonoverlapping genes in the order 3′ N-U-P-M-F-HN-L 5′, coding for the nucleocapsid, unknown, phospho-, matrix, fusion, hemagglutinin-neuraminidase, and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and tri-nucleotide intergenic regions similar to those of other Paramyxoviridae. The FDLV P gene expression strategy is like that of rubulaviruses, which express the accessory V protein from the primary transcript and edit a portion of the mRNA to encode P and I proteins. There is also an overlapping open reading frame potentially encoding a small basic protein in the P gene. The gene designated U (unknown), encodes a deduced protein of 19.4 kDa that has no counterpart in other paramyxoviruses and has no similarity with sequences in the National Center for Biotechnology Information database. Active transcription of the U gene in infected cells was demonstrated by Northern blot analysis, and bicistronic N-U mRNA was also evident. The genomes of two other snake paramyxovirus genotypes were also found to have U genes, with 11 to 16% nucleotide divergence from the FDLV U gene. Pairwise comparisons of amino acid identities and phylogenetic analyses of all deduced FDLV protein sequences with homologous sequences from other Paramyxoviridae indicate that FDLV represents a new genus within the subfamily Paramyxovirinae. We suggest the name Ferlavirus for the new genus, with FDLV as the type species.

  4. Murine mammary tumor virus pol-related sequences in human DNA: characterization and sequence comparison with the complete murine mammary tumor virus pol gene

    International Nuclear Information System (INIS)

    Deen, K.C.; Sweet, R.W.

    1986-01-01

    Sequences in the human genome with homology to the murine mammary tumor virus (MMTV) pol gene were isolated from a human phage library. Ten clones with extensive pol homology were shown to define five separate loci. These loci share common sequences immediately adjacent to the pol-like segments and, in addition, contain a related repeat element which bounds this region. This organization is suggestive of a proviral structure. The authors estimate that the human genome contains 30 to 40 copies of these pol-related sequences. The pol region of one of the cloned segments (HM16) and the complete MMTV pol gene were sequenced and compared. The nucleotide homology between these pol sequences is 52% and is concentrated in the terminal regions. The MMTV pol gene contains a single long open reading frame encoding 899 amino acids and is demarcated from the partially overlapping putative gag gene by termination codons and a shift in translational reading frame. The pol sequence of HM16 is multiply terminated but does contain open reading frames which encode 370, 105, and 112 amino acid residues in separate reading frames. The authors deduced a composite pol protein sequence for HM16 by aligning it to the MMTV pol gene and then compared these sequences with other retroviral pol protein sequences. Conserved sequences occur in both the amino and carboxyl regions, which lie within the polymerase and endonuclease domains of pol, respectively.

  5. Complete exon sequencing of all known Usher syndrome genes greatly improves molecular diagnosis.

    Science.gov (United States)

    Bonnet, Crystel; Grati, M'hamed; Marlin, Sandrine; Levilliers, Jacqueline; Hardelin, Jean-Pierre; Parodi, Marine; Niasme-Grare, Magali; Zelenika, Diana; Délépine, Marc; Feldmann, Delphine; Jonard, Laurence; El-Amraoui, Aziz; Weil, Dominique; Delobel, Bruno; Vincent, Christophe; Dollfus, Hélène; Eliot, Marie-Madeleine; David, Albert; Calais, Catherine; Vigneron, Jacqueline; Montaut-Verient, Bettina; Bonneau, Dominique; Dubin, Jacques; Thauvin, Christel; Duvillard, Alain; Francannet, Christine; Mom, Thierry; Lacombe, Didier; Duriez, Françoise; Drouin-Garraud, Valérie; Thuillier-Obstoy, Marie-Françoise; Sigaudy, Sabine; Frances, Anne-Marie; Collignon, Patrick; Challe, Georges; Couderc, Rémy; Lathrop, Mark; Sahel, José-Alain; Weissenbach, Jean; Petit, Christine; Denoyelle, Françoise

    2011-05-11

    Usher syndrome (USH) combines sensorineural deafness with blindness. It is inherited in an autosomal recessive mode. Early diagnosis is critical for adapted educational and patient management choices, and for genetic counseling. To date, nine causative genes have been identified for the three clinical subtypes (USH1, USH2 and USH3). Current diagnostic strategies make use of a genotyping microarray that is based on the previously reported mutations. The purpose of this study was to design a more accurate molecular diagnosis tool. We sequenced the 366 coding exons and flanking regions of the nine known USH genes, in 54 USH patients (27 USH1, 21 USH2 and 6 USH3). Biallelic mutations were detected in 39 patients (72%) and monoallelic mutations in an additional 10 patients (18.5%). In addition to biallelic mutations in one of the USH genes, presumably pathogenic mutations in another USH gene were detected in seven patients (13%), and another patient carried monoallelic mutations in three different USH genes. Notably, none of the USH3 patients carried detectable mutations in the only known USH3 gene, whereas they all carried mutations in USH2 genes. Most importantly, the currently used microarray would have detected only 30 of the 81 different mutations that we found, of which 39 (48%) were novel. Based on these results, complete exon sequencing of the currently known USH genes stands as a definite improvement for molecular diagnosis of this disease, which is of utmost importance in the perspective of gene therapy.
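
    The percentages quoted in this abstract follow directly from the patient and mutation counts it reports. A minimal sketch that recomputes them is given here; the helper name `pct` is hypothetical and only the counts come from the abstract.

```python
def pct(part: int, total: int, digits: int = 1) -> float:
    """Return part/total as a percentage rounded to `digits` decimals."""
    return round(100 * part / total, digits)

patients = 54           # USH patients sequenced
mutations_found = 81    # different mutations reported

print("biallelic:", pct(39, patients))               # abstract quotes 72%
print("monoallelic:", pct(10, patients))             # abstract quotes 18.5%
print("second USH gene:", pct(7, patients))          # abstract quotes 13%
print("novel mutations:", pct(39, mutations_found))  # abstract quotes 48%
print("microarray-detectable:", pct(30, mutations_found))
```

    The last line is not a figure from the abstract; it simply expresses the 30 of 81 mutations the microarray would have detected as a percentage.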

  6. Complete exon sequencing of all known Usher syndrome genes greatly improves molecular diagnosis

    Directory of Open Access Journals (Sweden)

    Lacombe Didier

    2011-05-01

    Background: Usher syndrome (USH) combines sensorineural deafness with blindness. It is inherited in an autosomal recessive mode. Early diagnosis is critical for adapted educational and patient management choices, and for genetic counseling. To date, nine causative genes have been identified for the three clinical subtypes (USH1, USH2 and USH3). Current diagnostic strategies make use of a genotyping microarray that is based on the previously reported mutations. The purpose of this study was to design a more accurate molecular diagnosis tool. Methods: We sequenced the 366 coding exons and flanking regions of the nine known USH genes, in 54 USH patients (27 USH1, 21 USH2 and 6 USH3). Results: Biallelic mutations were detected in 39 patients (72%) and monoallelic mutations in an additional 10 patients (18.5%). In addition to biallelic mutations in one of the USH genes, presumably pathogenic mutations in another USH gene were detected in seven patients (13%), and another patient carried monoallelic mutations in three different USH genes. Notably, none of the USH3 patients carried detectable mutations in the only known USH3 gene, whereas they all carried mutations in USH2 genes. Most importantly, the currently used microarray would have detected only 30 of the 81 different mutations that we found, of which 39 (48%) were novel. Conclusions: Based on these results, complete exon sequencing of the currently known USH genes stands as a definite improvement for molecular diagnosis of this disease, which is of utmost importance in the perspective of gene therapy.

  7. Complete genome sequence analysis of the fish pathogen Flavobacterium columnare provides insights into antibiotic resistance and pathogenicity related genes.

    Science.gov (United States)

    Zhang, Yulei; Zhao, Lijuan; Chen, Wenjie; Huang, Yunmao; Yang, Ling; Sarathbabu, V; Wu, Zaohe; Li, Jun; Nie, Pin; Lin, Li

    2017-10-01

    Here we analyzed the complete genome sequence of Pf1, a highly virulent Flavobacterium columnare strain isolated in our laboratory. The complete genome consists of a 3,171,081 bp circular DNA with 2784 predicted protein-coding genes. Among these, 286 genes were predicted to be antibiotic resistance genes, including 32 RND-type efflux pump related genes associated with the export of aminoglycosides, indicating inducible aminoglycoside resistance in F. columnare. In addition, 328 genes were predicted to be pathogenicity related genes, which could be classified as virulence factors, gliding motility proteins, adhesins, and many putative secreted proteases. These genes are probably involved in the colonization, invasion and destruction of fish tissues during F. columnare infection. The complete genome sequence provides a basis for explaining the interactions between F. columnare and the infected fish. The predicted antibiotic resistance and pathogenicity related genes will shed new light on the development of more efficient preventive strategies against infection by F. columnare, which is a major worldwide fish pathogen. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. The complete chloroplast genome sequence of Dendrobium officinale.

    Science.gov (United States)

    Yang, Pei; Zhou, Hong; Qian, Jun; Xu, Haibin; Shao, Qingsong; Li, Yonghua; Yao, Hui

    2016-01-01

    The complete chloroplast sequence of Dendrobium officinale, an endangered and economically important traditional Chinese medicine, was reported and characterized. The genome size is 152,018 bp, with 37.5% GC content. A pair of inverted repeats (IRs) of 26,284 bp are separated by a large single-copy region (LSC, 84,944 bp) and a small single-copy region (SSC, 14,506 bp). The complete cp DNA contains 83 protein-coding genes, 39 tRNA genes and 8 rRNA genes. Fourteen genes contained one or two introns.
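
    The region sizes reported above can be cross-checked against the stated genome size, since a quadripartite plastome consists of one LSC, one SSC and two IR copies. A minimal sketch follows; the function name `quadripartite_length` is hypothetical, and only the numbers come from the abstract.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Length of a circular plastome made of one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir

reported_total = 152_018  # genome size stated in the abstract (bp)
computed = quadripartite_length(lsc=84_944, ssc=14_506, ir=26_284)

# The computed sum matches the reported size when the region sizes are consistent.
print(computed, computed == reported_total)
```

    The same check applies to any other record in this list that reports LSC, SSC and IR sizes alongside a total length.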

  9. The complete mitochondrial genome sequence of Oceanic whitetip shark, Carcharhinus longimanus (Carcharhiniformes: Carcharhinidae).

    Science.gov (United States)

    Li, Weiwen; Dai, Xiaojie; Xu, Qianghua; Wu, Feng; Gao, Chunxia; Zhang, Yanbo

    2016-05-01

    The complete mitochondrial DNA sequence of Carcharhinus longimanus was determined and analyzed. The complete mtDNA genome sequence of C. longimanus was 16,706 bp in length. It contained 22 tRNA genes, 2 rRNA genes, 13 protein-coding genes and 2 non-coding regions: the control region (D-loop) and the origin of light-strand replication (OL). The complete mitogenome sequence of C. longimanus can provide useful data for further studies on molecular systematics, stock evaluation, taxonomic status and conservation genetics.

  10. The complete chloroplast genome sequence of an endemic monotypic genus Hagenia (Rosaceae): structural comparative analysis, gene content and microsatellite detection

    Directory of Open Access Journals (Sweden)

    Andrew W. Gichira

    2017-01-01

    Hagenia is an endangered monotypic genus endemic to the tropical mountains of Africa. The only species, Hagenia abyssinica (Bruce) J.F. Gmel, is an important medicinal plant producing bioactive compounds that have been traditionally used by African communities as a remedy for gastrointestinal ailments in both humans and animals. Complete chloroplast genomes have been applied in resolving phylogenetic relationships within plant families. We employed high-throughput sequencing technologies to determine the complete chloroplast genome sequence of H. abyssinica. The genome is a circular molecule of 154,961 base pairs (bp), with a pair of Inverted Repeats (IR) 25,971 bp each, separated by two single copies; a large (LSC, 84,320 bp) and a small single copy (SSC, 18,696). H. abyssinica’s chloroplast genome has a 37.1% GC content and encodes 112 unique genes, 78 of which code for proteins, 30 are tRNA genes and four are rRNA genes. A comparative analysis with twenty other species, sequenced to date from the family Rosaceae, revealed similarities in structural organization, gene content and arrangement. The observed size differences are attributed to the contraction/expansion of the inverted repeats. The translational initiation factor gene (infA), which had been previously reported in other chloroplast genomes, was conspicuously missing in H. abyssinica. A total of 172 microsatellites and 49 large repeat sequences were detected in the chloroplast genome. A maximum likelihood analysis of 71 protein-coding genes placed Hagenia in Rosoideae. The availability of a complete chloroplast genome, the first in the Sanguisorbeae tribe, is beneficial for further molecular studies on taxonomic and phylogenomic resolution within the Rosaceae family.

  11. The complete chloroplast genome sequence of an endemic monotypic genus Hagenia (Rosaceae): structural comparative analysis, gene content and microsatellite detection.

    Science.gov (United States)

    Gichira, Andrew W; Li, Zhizhong; Saina, Josphat K; Long, Zhicheng; Hu, Guangwan; Gituru, Robert W; Wang, Qingfeng; Chen, Jinming

    2017-01-01

    Hagenia is an endangered monotypic genus endemic to the tropical mountains of Africa. The only species, Hagenia abyssinica (Bruce) J.F. Gmel, is an important medicinal plant producing bioactive compounds that have been traditionally used by African communities as a remedy for gastrointestinal ailments in both humans and animals. Complete chloroplast genomes have been applied in resolving phylogenetic relationships within plant families. We employed high-throughput sequencing technologies to determine the complete chloroplast genome sequence of H. abyssinica. The genome is a circular molecule of 154,961 base pairs (bp), with a pair of Inverted Repeats (IR) 25,971 bp each, separated by two single copies; a large (LSC, 84,320 bp) and a small single copy (SSC, 18,696). H. abyssinica's chloroplast genome has a 37.1% GC content and encodes 112 unique genes, 78 of which code for proteins, 30 are tRNA genes and four are rRNA genes. A comparative analysis with twenty other species, sequenced to date from the family Rosaceae, revealed similarities in structural organization, gene content and arrangement. The observed size differences are attributed to the contraction/expansion of the inverted repeats. The translational initiation factor gene (infA), which had been previously reported in other chloroplast genomes, was conspicuously missing in H. abyssinica. A total of 172 microsatellites and 49 large repeat sequences were detected in the chloroplast genome. A maximum likelihood analysis of 71 protein-coding genes placed Hagenia in Rosoideae. The availability of a complete chloroplast genome, the first in the Sanguisorbeae tribe, is beneficial for further molecular studies on taxonomic and phylogenomic resolution within the Rosaceae family.

  12. The complete chloroplast genome sequence of Podocarpus lambertii: genome structure, evolutionary aspects, gene content and SSR detection.

    Directory of Open Access Journals (Sweden)

    Leila do Nascimento Vieira

    BACKGROUND: Podocarpus lambertii (Podocarpaceae) is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp) genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. METHODOLOGY/PRINCIPAL FINDINGS: The P. lambertii cp genome is 133,734 bp in length, and similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR). It contains 118 unique genes and one duplicated tRNA (trnN-GUU), which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi) and Araucariaceae (Agathis dammara). Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. CONCLUSION: The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparent loss of the rps16 gene in the Podocarpaceae cp genome. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of

  13. Nearly Complete 28S rRNA Gene Sequences Confirm New Hypotheses of Sponge Evolution

    Science.gov (United States)

    Thacker, Robert W.; Hill, April L.; Hill, Malcolm S.; Redmond, Niamh E.; Collins, Allen G.; Morrow, Christine C.; Spicer, Lori; Carmack, Cheryl A.; Zappe, Megan E.; Pohlmann, Deborah; Hall, Chelsea; Diaz, Maria C.; Bangalore, Purushotham V.

    2013-01-01

    The highly collaborative research sponsored by the NSF-funded Assembling the Porifera Tree of Life (PorToL) project is providing insights into some of the most difficult questions in metazoan systematics. Our understanding of phylogenetic relationships within the phylum Porifera has changed considerably with increased taxon sampling and data from additional molecular markers. PorToL researchers have falsified earlier phylogenetic hypotheses, discovered novel phylogenetic alliances, found phylogenetic homes for enigmatic taxa, and provided a more precise understanding of the evolution of skeletal features, secondary metabolites, body organization, and symbioses. Some of these exciting new discoveries are shared in the papers that form this issue of Integrative and Comparative Biology. Our analyses of over 300 nearly complete 28S ribosomal subunit gene sequences provide specific case studies that illustrate how our dataset confirms new hypotheses of sponge evolution. We recovered monophyletic clades for all 4 classes of sponges, as well as the 4 major clades of Demospongiae (Keratosa, Myxospongiae, Haploscleromorpha, and Heteroscleromorpha), but our phylogeny differs in several aspects from traditional classifications. In most major clades of sponges, families within orders appear to be paraphyletic. Although additional sampling of genes and taxa is needed to establish whether this pattern results from a lack of phylogenetic resolution or from a paraphyletic classification system, many of our results are congruent with those obtained from 18S ribosomal subunit gene sequences and complete mitochondrial genomes. These data provide further support for a revision of the traditional classification of sponges. PMID:23748742

  14. Getting complete genomes from complex samples using nanopore sequencing

    DEFF Research Database (Denmark)

    Kirkegaard, Rasmus Hansen; Karst, Søren Michael; Albertsen, Mads

    Short read sequencing and metagenomic binning workflows have made it possible to extract bacterial genome bins from environmental microbial samples containing hundreds to thousands of different species. However, these genome bins often do not represent complete genomes, as they are mostly...... fragmented, incomplete and often contaminated with foreign DNA, with no robust strategies to validate their quality. These `draft genomes` have limited lasting value to the scientific community, as gene synteny is broken and it is uncertain what is missing. The genetic material most often...... missed is important multi-copy and/or conserved marker genes such as the 16S rRNA gene, as sequence micro-heterogeneity prevents assembly of these genes in the de novo assembly. We demonstrate that using nanopore long reads it is now possible to overcome these issues and make complete genomes from...

  15. The nucleotide sequences of two leghemoglobin genes from soybean

    DEFF Research Database (Denmark)

    Wiborg, O; Hyldig-Nielsen, J J; Jensen, E O

    1982-01-01

    We present the complete nucleotide sequences of two leghemoglobin genes isolated from soybean DNA. Both genes contain three intervening sequences in identical positions. Comparison of the coding sequences with known amino-acid sequences of soybean leghemoglobins suggests that the two genes...

  16. Polyadenylated Sequencing Primers Enable Complete Readability of PCR Amplicons Analyzed by Dideoxynucleotide Sequencing

    Directory of Open Access Journals (Sweden)

    Martin Beránek

    2012-01-01

    Dideoxynucleotide DNA sequencing is one of the principal procedures in molecular biology. Loss of an initial part of nucleotides behind the 3' end of the sequencing primer limits the readability of sequenced amplicons. We present a method which extends the readability by using sequencing primers modified by polyadenylated tails attached to their 5' ends. Performing a polymerase chain reaction, we amplified eight amplicons of six human genes (AMELX, APOE, HFE, MBL2, SERPINA1 and TGFB1) ranging from 106 bp to 680 bp. Polyadenylation of the sequencing primers minimized the loss of bases in all amplicons. Complete sequences of shorter products (AMELX 106 bp, SERPINA1 121 bp, HFE 208 bp, APOE 244 bp, MBL2 317 bp) were obtained. In addition, in the case of TGFB1 products (366 bp, 432 bp, and 680 bp, respectively), the lengths of sequencing readings were significantly longer if adenylated primers were used. Thus, single strand dideoxynucleotide sequencing with adenylated primers enables complete or near complete readability of short PCR amplicons.

  17. The complete mitochondrial genome sequence of Eimeria magna (Apicomplexa: Coccidia).

    Science.gov (United States)

    Tian, Si-Qin; Cui, Ping; Fang, Su-Fang; Liu, Guo-Hua; Wang, Chun-Ren; Zhu, Xing-Quan

    2015-01-01

    In the present study, we determined the complete mitochondrial DNA (mtDNA) sequence of Eimeria magna from rabbits for the first time, and compared its gene content and genome organization with those of seven Eimeria spp. from domestic chickens. The size of the complete mt genome sequence of E. magna is 6249 bp, which consists of 3 protein-coding genes (cytb, cox1 and cox3), 12 gene fragments for the large subunit (LSU) rRNA, and 7 gene fragments for the small subunit (SSU) rRNA, without transfer RNA genes, in accordance with that of Eimeria spp. from chickens. The putative direction of translation for the three genes (cytb, cox1 and cox3) was the same as in Eimeria species from domestic chickens. The A + T content of the E. magna mt genome is 65.16% (29.73% A, 35.43% T, 17.09% G and 17.75% C). The E. magna mt genome sequence provides novel mtDNA markers for studying the molecular epidemiology and population genetics of Eimeria spp. and has implications for the molecular diagnosis and control of rabbit coccidiosis.
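
    The base composition quoted above can be checked for internal consistency: A and T should add up to the stated A + T content, and the four bases should add up to 100%. The sketch below assumes the G value is a percentage like the others; the helper `at_content` is hypothetical and shows how the same figure would be derived from a raw sequence string.

```python
def at_content(seq: str) -> float:
    """Percentage of A and T bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    return 100 * (seq.count("A") + seq.count("T")) / len(seq)

# Percentages from the abstract; G is assumed to be a percentage as well.
composition = {"A": 29.73, "T": 35.43, "G": 17.09, "C": 17.75}

at_total = round(composition["A"] + composition["T"], 2)
all_bases = round(sum(composition.values()), 2)
print(at_total, all_bases)

# Toy sequence for illustration only, not E. magna data.
print(at_content("ATTAGC"))
```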

  18. The Complete Chloroplast Genome Sequences of Six Rehmannia Species

    Directory of Open Access Journals (Sweden)

    Shuyun Zeng

    2017-03-01

    Rehmannia is a non-parasitic genus in Orobanchaceae including six species mainly distributed in central and north China. Its phylogenetic position and infrageneric relationships remain uncertain due to potential hybridization and polyploidization. In this study, we sequenced and compared the complete chloroplast genomes of six Rehmannia species using Illumina sequencing technology to elucidate the interspecific variations. Rehmannia plastomes exhibited typical quadripartite and circular structures with good synteny of gene order. The complete genomes ranged from 153,622 bp to 154,055 bp in length, including 133 genes encoding 88 proteins, 37 tRNAs, and 8 rRNAs. Three genes (rpoA, rpoC2, accD) have potentially experienced positive selection. Plastome size variation in Rehmannia was mainly ascribed to the expansion and contraction of the border regions between the inverted repeat (IR) region and the single-copy (SC) regions. Despite the conserved structure of Rehmannia plastomes, sequence variations provide useful phylogenetic information. Phylogenetic trees of 23 Lamiales species reconstructed with the complete plastomes suggested that Rehmannia was monophyletic and sister to the clade of Lindenbergia and the parasitic taxa in Orobanchaceae. The interspecific relationships within Rehmannia were completely different from those found in previous studies. In future, population phylogenomic works based on plastomes are urgently needed to clarify the evolutionary history of Rehmannia.

  19. The complete mitochondrial genome sequence of the maned wolf (Chrysocyon brachyurus).

    Science.gov (United States)

    Zhao, Chao; Yang, Xiufeng; Zhang, Honghai; Zhang, Jin; Chen, Lei; Sha, Weilai; Liu, Guangshuai

    2016-01-01

    In this study, the complete mitochondrial genome of the maned wolf (Chrysocyon brachyurus), the only species in Chrysocyon, was sequenced and reported for the first time using blood samples obtained from a female individual in Shanghai Zoo, China. Sequence analysis showed that the genome structure was in accordance with other Canidae species and that it contained the 12S and 16S rRNA genes, 22 tRNA genes, 13 protein-coding genes and 1 control region.

  20. Getting complete genomes from complex samples using nanopore sequencing

    DEFF Research Database (Denmark)

    Kirkegaard, Rasmus Hansen; Karst, Søren Michael; Albertsen, Mads

    Background: Short read DNA sequencing and metagenomic binning workflows have made it possible to extract bacterial genome bins from environmental microbial samples containing hundreds to thousands of different species. However, these genome bins often do not represent complete genomes......, as they are mostly fragmented, incomplete and often contaminated with foreign DNA. These `draft genomes` have limited lasting value to the scientific community, as gene synteny is broken and there is some uncertainty about what is missing [1]. The genetic material most often missed is important multi......-copy and/or conserved marker genes such as the 16S rRNA gene, as sequence micro-heterogeneity prevents assembly of these genes in the de novo assembly. However, long read sequencing technologies are emerging, promising an end to fragmented genome assemblies [2]. Experimental design: We extracted DNA from a full...

  21. The complete chloroplast genome sequence of Euonymus japonicus (Celastraceae).

    Science.gov (United States)

    Choi, Kyoung Su; Park, SeonJoo

    2016-09-01

    The complete chloroplast (cp) genome sequence of Euonymus japonicus, the first sequenced in the genus Euonymus, is reported in this study. The total length was 157 637 bp, containing a pair of 26 678 bp inverted repeat (IR) regions, which were separated by a small single copy (SSC) region and a large single copy (LSC) region of 18 340 bp and 85 941 bp, respectively. This genome contains 107 unique genes, including 74 coding genes, four rRNA genes, and 29 tRNA genes. Seventeen E. japonicus genes contain introns, of which three (clpP, ycf3, and rps12) include two introns. The maximum likelihood (ML) phylogenetic analysis revealed that E. japonicus was closely related to Manihot and Populus.

  22. The complete sequence of human chromosome 5

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy; Martin, Joel; Terry, Astrid; Couronne, Olivier; Grimwood, Jane; Lowry, State; Gordon, Laurie A.; Scott, Duncan; Xie, Gary; Huang, Wayne; Hellsten, Uffe; Tran-Gyamfi, Mary; She, Xinwei; Prabhakar, Shyam; Aerts, Andrea; Altherr, Michael; Bajorek, Eva; Black, Stacey; Branscomb, Elbert; Caoile, Chenier; Challacombe, Jean F.; Chan, Yee Man; Denys, Mirian; Detter, Chris; Escobar, Julio; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstenin, David; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Israni, Sanjay; Jett, Jamie; Kadner, Kristen; Kimbal, Heather; Kobayashi, Arthur; Lopez, Frederick; Lou, Yunian; Martinez, Diego; Medina, Catherine; Morgan, Jenna; Nandkeshwar, Richard; Noonan, James P.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Priest, James; Ramirez, Lucia; Rash, Sam; Retterer, James; Rodriguez, Alex; Rogers, Stephanie; Salamov, Asaf; Salazar, Angelica; Thayer, Nina; Tice, Hope; Tsai, Ming; Ustaszewska, Anna; Vo, Nu; Wheeler, Jeremy; Wu, Kevin; Yang, Joan; Dickson, Mark; Cheng, Jan-Fang; Eichler, Evan E.; Olsen, Anne; Pennacchio, Len A.; Rokhsar, Daniel S.; Richardson, Paul; Lucas, Susan M.; Myers, Richard M.; Rubin, Edward M.

    2004-04-15

    Chromosome 5 is one of the largest human chromosomes yet has one of the lowest gene densities. This is partially explained by numerous gene-poor regions that display a remarkable degree of noncoding and syntenic conservation with non-mammalian vertebrates, suggesting they are functionally constrained. In total, we compiled 177.7 million base pairs of highly accurate finished sequence containing 923 manually curated protein-encoding genes including the protocadherin and interleukin gene families and the first complete versions of each of the large chromosome 5 specific internal duplications. These duplications are very recent evolutionary events and play a likely mechanistic role, since deletions of these regions are the cause of debilitating disorders including spinal muscular atrophy (SMA).

  3. The complete chloroplast genome sequence of Ampelopsis: gene organization, comparative analysis and phylogenetic relationships to other angiosperms

    Directory of Open Access Journals (Sweden)

    Gurusamy Raman

    2016-03-01

    Ampelopsis brevipedunculata is an economically important plant that belongs to the Vitaceae family of angiosperms. The phylogenetic placement of Vitaceae is still unresolved. Recent phylogenetic studies suggested that it should be placed in various alternative families including Caryophyllaceae, Asteraceae, Saxifragaceae, Dilleniaceae, or with the rest of the rosid families. However, these analyses provided weak supportive results because they were based on only one of several genes. Accordingly, complete chloroplast genome sequences are required to resolve the phylogenetic relationships among angiosperms. Recent phylogenetic analyses based on the complete chloroplast genome sequence suggested strong support for the position of Vitaceae as the earliest diverging lineage of rosids and placed it as a sister to the remaining rosids. These studies also revealed relationships among several major lineages of angiosperms; however, they highlighted the significance of taxon sampling for obtaining accurate phylogenies. In the present study, we sequenced the complete chloroplast genome of A. brevipedunculata and used these data to assess the relationships among 32 angiosperms, including 18 taxa of rosids. The Ampelopsis chloroplast genome is 161,090 bp in length, and includes a pair of inverted repeats of 26,394 bp that are separated by small and large single copy regions of 19,036 bp and 89,266 bp, respectively. The gene content and order of Ampelopsis is identical to many other unrearranged angiosperm chloroplast genomes, including Vitis and tobacco. A phylogenetic tree constructed based on 70 protein-coding genes of 33 angiosperms showed that both Saxifragales and Vitaceae diverged from the rosid clade and formed two clades with 100% bootstrap value. The position of the Vitaceae is sister to Saxifragales, and both are the basal and earliest diverging lineages. Moreover, Saxifragales forms a sister clade to Vitaceae of rosids. Overall, the results of

  4. The complete chloroplast genome sequence of Hibiscus syriacus.

    Science.gov (United States)

    Kwon, Hae-Yun; Kim, Joon-Hyeok; Kim, Sea-Hyun; Park, Ji-Min; Lee, Hyoshin

    2016-09-01

    The complete chloroplast genome sequence of Hibiscus syriacus L. is presented in this study. The genome is 161 019 bp in length, with a typical circular structure containing a pair of inverted repeats of 25 745 bp each, separated by a large single-copy region of 89 698 bp and a small single-copy region of 19 831 bp. The overall GC content is 36.8%. One hundred and fourteen genes were annotated, including 81 protein-coding genes, 4 ribosomal RNA genes and 29 transfer RNA genes.

  5. [Sequencing and analysis of the complete genome of a rabies virus isolate from Sika deer].

    Science.gov (United States)

    Zhao, Yun-Jiao; Guo, Li; Huang, Ying; Zhang, Li-Shi; Qian, Ai-Dong

    2008-05-01

    One DRV strain was isolated from Sika deer brain and sequenced. Nine overlapping gene fragments were amplified by RT-PCR together with 3'-RACE and 5'-RACE, and the complete DRV genome sequence was assembled. The complete genome is 11,863 bp in length. The DRV genome organization was similar to that of other rabies viruses, comprising five genes, and the initiation and termination sites were highly conserved. There were amino acid mutations in important antigenic sites of the nucleoprotein and glycoprotein. The nucleotide and amino acid homologies of the N, P, M, G, and L genes were compared among strains with completely sequenced genomes. A phylogenetic tree was constructed by comparison with the N gene sequences of other typical rabies viruses. These results indicated that DRV belonged to genotype 1. The highest homology, 94%, was with the Chinese vaccine strain 3aG, and the lowest, 71%, was with WCBV. These findings provide a theoretical reference for further research on rabies virus.

  6. The complete chloroplast genome sequence of Dendrobium nobile.

    Science.gov (United States)

    Yan, Wenjin; Niu, Zhitao; Zhu, Shuying; Ye, Meirong; Ding, Xiaoyu

    2016-11-01

    The complete chloroplast (cp) genome sequence of Dendrobium nobile, an endangered traditional Chinese medicinal plant with important economic value, is presented in this article. The total genome size is 150,793 bp, comprising a large single copy (LSC) region (84,939 bp) and a small single copy (SSC) region (13,310 bp) separated by two inverted repeat (IR) regions (26,272 bp each). The overall GC content of the plastid genome was 38.8%. In total, 130 unique genes were annotated, consisting of 76 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Fourteen genes contained one or two introns.
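
    The chloroplast abstracts above all describe the same quadripartite layout: one LSC region, one SSC region, and two copies of the IR. A quick consistency check is therefore that LSC + SSC + 2 x IR equals the reported total length. The sketch below applies that check to the figures quoted in four of the abstracts; the function and variable names are illustrative only and do not come from any of the cited papers.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular plastome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# (LSC, SSC, single IR copy, reported total), all in bp, as quoted in the abstracts.
REPORTED = {
    "Euonymus japonicus": (85_941, 18_340, 26_678, 157_637),
    "Ampelopsis brevipedunculata": (89_266, 19_036, 26_394, 161_090),
    "Hibiscus syriacus": (89_698, 19_831, 25_745, 161_019),
    "Dendrobium nobile": (84_939, 13_310, 26_272, 150_793),
}


def check_all(records: dict) -> dict:
    """Map each species to True when the region lengths sum to the reported total."""
    return {
        name: quadripartite_length(lsc, ssc, ir) == total
        for name, (lsc, ssc, ir, total) in records.items()
    }


if __name__ == "__main__":
    for name, ok in check_all(REPORTED).items():
        print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

    A check of this kind is also a simple way to tell whether a quoted IR length refers to one copy or to both copies combined.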

  7. The complete chloroplast genome sequence of Abies nephrolepis (Pinaceae: Abietoideae)

    Directory of Open Access Journals (Sweden)

    Dong-Keun Yi

    2016-06-01

    The plant chloroplast (cp) genome has maintained a relatively conserved structure and gene content throughout evolution. Cp genome sequences have been used widely for resolving evolutionary and phylogenetic issues at various taxonomic levels of plants. Here, we report the complete cp genome of Abies nephrolepis. The A. nephrolepis cp genome is 121,336 base pairs (bp) in length, including a pair of short inverted repeat regions (IRa and IRb) of 139 bp each separated by a small single copy (SSC) region of 54,323 bp and a large single copy (LSC) region of 66,735 bp. It contains 114 genes, 68 of which are protein coding genes, 35 tRNA and four rRNA genes, six open reading frames, and one pseudogene. Seventeen repeat units and 64 simple sequence repeats (SSR) have been detected in the A. nephrolepis cp genome. Large IR sequences locate in 42-kb inversion points (1186 bp). The A. nephrolepis cp genome is identical to Abies koreana’s which is closely related to taxa. Pairwise comparison between two cp genomes revealed 140 polymorphic sites in each. Complete cp genome sequence of A. nephrolepis has a significant potential to provide information on the evolutionary pattern of Abietoideae and valuable data for development of DNA markers for easy identification and classification.

  8. The complete mitochondrial genome sequence of Diaphorina citri (Hemiptera: Psyllidae)

    Science.gov (United States)

    The first complete mitochondrial genome (mitogenome) sequence of Asian citrus psyllid, Diaphorina citri (Hemiptera: Psyllidae), from Guangzhou, China is presented. The circular mitogenome is 14,996 bp in length with an A+T content of 74.5%, and contains 13 protein-coding genes (PCGs), 22 tRNA genes ...

  9. Using nanopore sequencing to get complete genomes from complex samples

    DEFF Research Database (Denmark)

    Kirkegaard, Rasmus Hansen; Karst, Søren Michael; Nielsen, Per Halkjær

    The advantages of “next generation sequencing” have come at the cost of genome finishing. The dominant sequencing technology provides short reads of 150-300 bp, which has made genome assembly very difficult as the reads do not span important repeat regions. Genomes have thus been added to the databases as fragmented assemblies and not as finished contigs that resemble the chromosomes in which the DNA is organised within the cells. This is especially troublesome for genomes derived from complex metagenome sequencing. Databases with incomplete genomes can lead to false conclusions about the absence of genes and functional predictions of the organisms. Furthermore, it is common that repetitive elements and marker genes such as the 16S rRNA gene are missing completely from these genome bins. Using nanopore long reads, we demonstrate that it is possible to span these regions and make complete...

  10. Complete genome sequence of Parvibaculum lavamentivorans type strain (DS-1(T)).

    Science.gov (United States)

    Schleheck, David; Weiss, Michael; Pitluck, Sam; Bruce, David; Land, Miriam L; Han, Shunsheng; Saunders, Elizabeth; Tapia, Roxanne; Detter, Chris; Brettin, Thomas; Han, James; Woyke, Tanja; Goodwin, Lynne; Pennacchio, Len; Nolan, Matt; Cook, Alasdair M; Kjelleberg, Staffan; Thomas, Torsten

    2011-12-31

    Parvibaculum lavamentivorans DS-1(T) is the type species of the novel genus Parvibaculum in the novel family Rhodobiaceae (formerly Phyllobacteriaceae) of the order Rhizobiales of Alphaproteobacteria. Strain DS-1(T) is a non-pigmented, aerobic, heterotrophic bacterium and represents the first tier member of environmentally important bacterial communities that catalyze the complete degradation of synthetic laundry surfactants. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 3,914,745 bp long genome with its predicted 3,654 protein coding genes is the first completed genome sequence of the genus Parvibaculum, and the first genome sequence of a representative of the family Rhodobiaceae.

  11. The complete chloroplast genome sequence of Curcuma flaviflora (Curcuma).

    Science.gov (United States)

    Zhang, Yan; Deng, Jiabin; Li, Yangyi; Gao, Gang; Ding, Chunbang; Zhang, Li; Zhou, Yonghong; Yang, Ruiwu

    2016-09-01

    The complete chloroplast (cp) genome of Curcuma flaviflora, a medicinal plant in Southeast Asia, was sequenced. The genome was 160 478 bp in length, with 36.3% GC content. A pair of inverted repeats (IRs) of 26 946 bp each were separated by a large single copy (LSC) region of 88 008 bp and a small single copy (SSC) region of 18 578 bp. The cp genome contained 132 annotated genes, including 79 protein coding genes, 30 tRNA genes, and four rRNA genes. Nineteen of these genes were duplicated in the inverted repeat regions.

  12. Complete genome sequence of Gordonia bronchialis type strain (3410T)

    Energy Technology Data Exchange (ETDEWEB)

    Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Jando, Marlen [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Copeland, A [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Chen, Feng [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL)]; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Detter, J C [U.S. Department of Energy, Joint Genome Institute]; Brettin, Thomas S [ORNL]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]

    2010-01-01

    Gordonia bronchialis Tsukamura 1971 is the type species of the genus. G. bronchialis is a human-pathogenic organism that has been isolated from a large variety of human tissues. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Gordoniaceae. The 5,290,012 bp long genome with its 4,944 protein-coding and 55 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  13. Complete mitochondrial DNA sequence of the Eastern keelback mullet Liza affinis.

    Science.gov (United States)

    Gong, Xiaoling; Zhu, Wenjia; Bao, Baolong

    2016-05-01

    Eastern keelback mullet (Liza affinis) inhabits inlet waters and estuaries of rivers. In this paper, we determined the complete mitochondrial genome of Liza affinis for the first time. The entire mtDNA sequence is 16,831 bp in length, including 2 rRNA genes, 22 tRNA genes, 13 protein-coding genes and 1 putative control region. Its gene order and gene numbers are similar to those of most bony fishes.

  14. [Complete genome sequencing of polymalic acid-producing strain Aureobasidium pullulans CCTCC M2012223].

    Science.gov (United States)

    Wang, Yongkang; Song, Xiaodan; Li, Xiaorong; Yang, Sang-tian; Zou, Xiang

    2017-01-04

    This study aimed to explore the genome sequence of Aureobasidium pullulans CCTCC M2012223, analyze the key genes related to the biosynthesis of important metabolites, and provide genetic background for metabolic engineering. The complete genome of A. pullulans CCTCC M2012223 was sequenced on the Illumina HiSeq high-throughput sequencing platform. Then, fragment assembly, gene prediction, functional annotation, and GO/COG clustering were performed and compared with those of five other A. pullulans varieties. The complete genome sequence of A. pullulans CCTCC M2012223 was 30,756,831 bp with an average GC content of 47.49%, and 9452 genes were successfully predicted. Genome-wide analysis showed that A. pullulans CCTCC M2012223 had the largest genome assembly size. Protein sequences involved in the pullulan and polymalic acid pathways were highly conserved in all six A. pullulans varieties. Although A. pullulans CCTCC M2012223 and A. pullulans var. melanogenum are closely related, some point mutations and insertions occurred in protein sequences involved in melanin biosynthesis. The genome of A. pullulans CCTCC M2012223 was annotated, and genes involved in the melanin, pullulan and polymalic acid pathways were compared, which provides a theoretical basis for genetic modification of metabolic pathways in A. pullulans.

  15. Complete genome sequence of Acidimicrobium ferrooxidans type strain (ICPT)

    Energy Technology Data Exchange (ETDEWEB)

    Clum, Alicia; Nolan, Matt; Lang, Elke; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Goker, Markus; Spring, Stefan; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chain, Patrick; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Lapidus, Alla

    2009-05-20

    Acidimicrobium ferrooxidans (Clark and Norris 1996) is the sole and type species of the genus, which until recently was the only genus within the actinobacterial family Acidimicrobiaceae and in the order Acidimicrobiales. Rapid oxidation of iron pyrite during autotrophic growth in the absence of an enhanced CO2 concentration is characteristic for A. ferrooxidans. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of the order Acidimicrobiales, and the 2,158,157 bp long single replicon genome with its 2038 protein coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  16. Complete genome sequence analysis of Nocardia brasiliensis HUJEG-1 reveals a saprobic lifestyle and the genes needed for human pathogenesis.

    Science.gov (United States)

    Vera-Cabrera, Lucio; Ortiz-Lopez, Rocio; Elizondo-Gonzalez, Ramiro; Ocampo-Candiani, Jorge

    2013-01-01

    Nocardia brasiliensis is an important etiologic agent of mycetoma. These bacteria live as a saprobe in soil or organic material and enter the tissue via minor trauma. Mycetoma is characterized by tumefaction and the production of fistula and abscesses, with no spontaneous cure. By using mass sequencing, we determined the complete genomic nucleotide sequence of the bacteria. According to our data, the genome is a circular chromosome 9,436,348-bp long with 68% G+C content that encodes 8,414 proteins. We observed orthologs for virulence factors, a higher number of genes involved in lipid biosynthesis and catabolism, and gene clusters for the synthesis of bioactive compounds, such as antibiotics, terpenes, and polyketides. An in silico analysis of the sequence supports the conclusion that the bacteria acquired diverse genes by horizontal transfer from other soil bacteria, even from eukaryotic organisms. The genome composition reflects the evolution of bacteria via the acquisition of a large amount of DNA, which allows it to survive in new ecological niches, including humans.

  17. Complete genome sequence analysis of Nocardia brasiliensis HUJEG-1 reveals a saprobic lifestyle and the genes needed for human pathogenesis.

    Directory of Open Access Journals (Sweden)

    Lucio Vera-Cabrera

    Nocardia brasiliensis is an important etiologic agent of mycetoma. These bacteria live as a saprobe in soil or organic material and enter the tissue via minor trauma. Mycetoma is characterized by tumefaction and the production of fistula and abscesses, with no spontaneous cure. By using mass sequencing, we determined the complete genomic nucleotide sequence of the bacteria. According to our data, the genome is a circular chromosome 9,436,348-bp long with 68% G+C content that encodes 8,414 proteins. We observed orthologs for virulence factors, a higher number of genes involved in lipid biosynthesis and catabolism, and gene clusters for the synthesis of bioactive compounds, such as antibiotics, terpenes, and polyketides. An in silico analysis of the sequence supports the conclusion that the bacteria acquired diverse genes by horizontal transfer from other soil bacteria, even from eukaryotic organisms. The genome composition reflects the evolution of bacteria via the acquisition of a large amount of DNA, which allows it to survive in new ecological niches, including humans.

  18. Synaptotagmin gene content of the sequenced genomes

    Directory of Open Access Journals (Sweden)

    Craxton Molly

    2004-07-01

    Background: Synaptotagmins exist as a large gene family in mammals. There is much interest in the function of certain family members which act crucially in the regulated synaptic vesicle exocytosis required for efficient neurotransmission. Knowledge of the functions of other family members is relatively poor and the presence of Synaptotagmin genes in plants indicates a role for the family as a whole which is wider than neurotransmission. Identification of the Synaptotagmin genes within completely sequenced genomes can provide the entire Synaptotagmin gene complement of each sequenced organism. Defining the detailed structures of all the Synaptotagmin genes and their encoded products can provide a useful resource for functional studies and a deeper understanding of the evolution of the gene family. The current rapid increase in the number of sequenced genomes from different branches of the tree of life, together with the public deposition of evolutionarily diverse transcript sequences make such studies worthwhile. Results: I have compiled a detailed list of the Synaptotagmin genes of Caenorhabditis, Anopheles, Drosophila, Ciona, Danio, Fugu, Mus, Homo, Arabidopsis and Oryza by examining genomic and transcript sequences from public sequence databases together with some transcript sequences obtained by cDNA library screening and RT-PCR. I have compared all of the genes and investigated the relationship between plant Synaptotagmins and their non-Synaptotagmin counterparts. Conclusions: I have identified and compared 98 Synaptotagmin genes from 10 sequenced genomes. Detailed comparison of transcript sequences reveals abundant and complex variation in Synaptotagmin gene expression and indicates the presence of Synaptotagmin genes in all animals and land plants. Amino acid sequence comparisons indicate patterns of conservation and diversity in function. Phylogenetic analysis shows the origin of Synaptotagmins in multicellular eukaryotes and their

  19. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences

    Science.gov (United States)

    2012-01-01

    Background: The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results: We have developed a web server, CpGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tools. The edited annotations can then be uploaded to CpGAVAS for update and re-analysis repeatedly. Using known chloroplast genome sequences as a test set, we show that CpGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions: CpGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas. PMID:23256920
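
    The abstract above mentions summary statistics computed over an annotation and results delivered in GFF3 format. As a rough illustration of what such a summary involves, the sketch below counts feature types in GFF3 text, relying only on the standard nine tab-separated columns of GFF3 (the third column is the feature type). It is not CpGAVAS code: the function name, the sequence ID cp1, the source label and all coordinates in the example records are invented for illustration.

```python
from collections import Counter


def count_feature_types(gff3_text: str) -> Counter:
    """Count GFF3 records per feature type (column 3), skipping comments and blanks."""
    counts = Counter()
    for line in gff3_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # directives such as ##gff-version and blank lines carry no feature
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # not a well-formed GFF3 feature line
        counts[columns[2]] += 1
    return counts


# Hypothetical records: IDs and coordinates are made up, not taken from any real genome.
EXAMPLE_GFF3 = "\n".join([
    "##gff-version 3",
    "cp1\texample\tgene\t100\t1161\t.\t-\t.\tID=gene1;Name=psbA",
    "cp1\texample\tCDS\t100\t1161\t.\t-\t0\tID=cds1;Parent=gene1",
    "cp1\texample\tgene\t2000\t2072\t.\t+\t.\tID=gene2;Name=trnH-GUG",
    "cp1\texample\ttRNA\t2000\t2072\t.\t+\t.\tID=trna1;Parent=gene2",
])

if __name__ == "__main__":
    for feature_type, n in sorted(count_feature_types(EXAMPLE_GFF3).items()):
        print(feature_type, n)
```

    Counting by the type column keeps the sketch independent of how any particular annotation tool names its attributes.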

  20. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences

    Directory of Open Access Journals (Sweden)

    Liu Chang

    2012-12-01

    Background: The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results: We have developed a web server, CpGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tools. The edited annotations can then be uploaded to CpGAVAS for update and re-analysis repeatedly. Using known chloroplast genome sequences as a test set, we show that CpGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions: CpGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas.

  1. Complete plastid genome sequence of Daucus carota: implications for biotechnology and phylogeny of angiosperms.

    Science.gov (United States)

    Ruhlman, Tracey; Lee, Seung-Bum; Jansen, Robert K; Hostetler, Jessica B; Tallon, Luke J; Town, Christopher D; Daniell, Henry

    2006-08-31

    Carrot (Daucus carota) is a major food crop in the US and worldwide. Its capacity for storage and its lifecycle as a biennial make it an attractive species for the introduction of foreign genes, especially for oral delivery of vaccines and other therapeutic proteins. Until recently, efforts to express recombinant proteins in carrot have had limited success in terms of protein accumulation in the edible tap roots. Plastid genetic engineering offers the potential to overcome this limitation, as demonstrated by the accumulation of BADH in chromoplasts of carrot taproots to confer exceedingly high levels of salt resistance. The complete plastid genome of carrot provides essential information required for genetic engineering. Additionally, the sequence data add to the rapidly growing database of plastid genomes for assessing phylogenetic relationships among angiosperms. The complete carrot plastid genome is 155,911 bp in length, with 115 unique genes and 21 duplicated genes within the IR. There are four ribosomal RNAs, 30 distinct tRNA genes and 18 intron-containing genes. Repeat analysis reveals 12 direct and 2 inverted repeats ≥ 30 bp with a sequence identity ≥ 90%. Phylogenetic analyses of nucleotide sequences for 61 protein-coding genes using both maximum parsimony (MP) and maximum likelihood (ML) were performed for 29 angiosperms. Phylogenies from both methods provide strong support for the monophyly of several major angiosperm clades, including monocots, eudicots, rosids, asterids, eurosids II, euasterids I, and euasterids II. The carrot plastid genome contains a number of dispersed direct and inverted repeats scattered throughout coding and non-coding regions. This is the first sequenced plastid genome of the family Apiaceae and only the second published genome sequence of the species-rich euasterid II clade. Both MP and ML trees provide very strong support (100% bootstrap) for the sister relationship of Daucus with Panax in the euasterid II clade. These

  2. Complete plastid genome sequence of Daucus carota: Implications for biotechnology and phylogeny of angiosperms

    Directory of Open Access Journals (Sweden)

    Ruhlman Tracey

    2006-08-01

    Background: Carrot (Daucus carota) is a major food crop in the US and worldwide. Its capacity for storage and its lifecycle as a biennial make it an attractive species for the introduction of foreign genes, especially for oral delivery of vaccines and other therapeutic proteins. Until recently, efforts to express recombinant proteins in carrot have had limited success in terms of protein accumulation in the edible tap roots. Plastid genetic engineering offers the potential to overcome this limitation, as demonstrated by the accumulation of BADH in chromoplasts of carrot taproots to confer exceedingly high levels of salt resistance. The complete plastid genome of carrot provides essential information required for genetic engineering. Additionally, the sequence data add to the rapidly growing database of plastid genomes for assessing phylogenetic relationships among angiosperms. Results: The complete carrot plastid genome is 155,911 bp in length, with 115 unique genes and 21 duplicated genes within the IR. There are four ribosomal RNAs, 30 distinct tRNA genes and 18 intron-containing genes. Repeat analysis reveals 12 direct and 2 inverted repeats ≥ 30 bp with a sequence identity ≥ 90%. Phylogenetic analyses of nucleotide sequences for 61 protein-coding genes using both maximum parsimony (MP) and maximum likelihood (ML) were performed for 29 angiosperms. Phylogenies from both methods provide strong support for the monophyly of several major angiosperm clades, including monocots, eudicots, rosids, asterids, eurosids II, euasterids I, and euasterids II. Conclusion: The carrot plastid genome contains a number of dispersed direct and inverted repeats scattered throughout coding and non-coding regions. This is the first sequenced plastid genome of the family Apiaceae and only the second published genome sequence of the species-rich euasterid II clade. Both MP and ML trees provide very strong support (100% bootstrap) for the sister relationship of

  3. Complete mitochondrial genome sequence of the hedgehog seahorse Hippocampus spinosissimus Weber, 1933 (Gasterosteiformes:Syngnathidae).

    Science.gov (United States)

    Liu, Shuaishuai; Zhang, Yanhong; Wang, Changming; Lin, Qiang

    2016-07-01

    The complete mitochondrial genome sequence of the hedgehog seahorse Hippocampus spinosissimus was first determined in this article. The total length of H. spinosissimus mitogenome is 16 527 bp and consists of 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and 1 control region. The gene order and composition of H. spinosissimus were similar to those of most other vertebrates. The overall base composition of H. spinosissimus is 32.1% A, 30.3% T, 14.9% G and 22.7% C, with a slight A + T-rich feature (62.4%). Phylogenetic analyses based on complete mitochondrial genome sequence showed that H. spinosissimus has a close genetic relationship to H. ingens and H. kuda.
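
    The base composition figures in the abstract above (32.1% A and 30.3% T, giving 62.4% A + T) are the kind of statistic that can be computed directly from a sequence. The following is a minimal sketch, assuming a plain DNA string as input; the function names and the toy sequence are illustrative and are not taken from the paper.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Percentage of A, C, G and T among the unambiguous bases of a DNA string."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(seq: str) -> float:
    """Combined A + T percentage, the figure usually quoted for mitogenomes."""
    composition = base_composition(seq)
    return composition["A"] + composition["T"]


if __name__ == "__main__":
    toy = "AATTGC"  # toy sequence, not real mitogenome data
    print(base_composition(toy))
    print(round(at_content(toy), 2))
```

    Ambiguity codes such as N are ignored here, which is one possible convention; counting them in the denominator would lower every percentage slightly.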

  4. Complete Genome Sequence of Sporisorium scitamineum and Biotrophic Interaction Transcriptome with Sugarcane.

    Directory of Open Access Journals (Sweden)

    Lucas M Taniguti

    Sporisorium scitamineum is a biotrophic fungus responsible for sugarcane smut, a disease that occurs worldwide. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere, achieved by a combination of PacBio long reads and Illumina short reads sequence data, as well as a draft sequence of a second fungal strain. Comparative analysis with previously available sequences of another strain detected few polymorphisms among the three genomes. The novel complete sequence described herein allowed us to identify and annotate extended subtelomeric regions, repetitive elements and the mitochondrial DNA sequence. The genome comprises 19,979,571 bases, 6,677 genes encoding proteins, 111 tRNAs and 3 assembled copies of rDNA, out of an estimated 130 copies. Chromosomal reorganizations were detected when comparing to sequences of S. reilianum, the closest smut relative, potentially influenced by repeats of transposable elements. Repetitive elements may have also directed the linkage of the two mating-type loci. The fungal transcriptome profiling from in vitro and from interaction with sugarcane at two time points (early infection and whip emergence) revealed that 13.5% of the genes were differentially expressed in planta and particular to each developmental stage. Among them are plant cell wall degrading enzymes, proteases, lipases, chitin modification and lignin degradation enzymes, sugar transporters and transcriptional factors. The fungus also modulates transcription of genes related to surviving against reactive oxygen species and other toxic metabolites produced by the plant. Previously described effectors in smut/plant interactions were detected but some new candidates are proposed. Ten genomic islands harboring some of the candidate genes unique to S. scitamineum were expressed only in planta. RNA-seq data were also used to confirm gene predictions.

  5. Complete Sequence of the mitochondrial genome of the tapeworm Hymenolepis diminuta: Gene arrangements indicate that platyhelminths are eutrochozoans

    Energy Technology Data Exchange (ETDEWEB)

    von Nickisch-Rosenegk, Markus; Brown, Wesley M.; Boore, Jeffrey L.

    2001-01-01

    Using "long-PCR" we have amplified in overlapping fragments the complete mitochondrial genome of the tapeworm Hymenolepis diminuta (Platyhelminthes: Cestoda) and determined its 13,900 nucleotide sequence. The gene content is the same as that typically found for animal mitochondrial DNA (mtDNA) except that atp8 appears to be lacking, a condition found previously for several other animals. Despite the small size of this mtDNA, there are two large non-coding regions, one of which contains 13 repeats of a 31 nucleotide sequence and a potential stem-loop structure of 25 base pairs with an 11-member loop. Large potential secondary structures are identified also for the non-coding regions of two other cestode mtDNAs. Comparison of the mitochondrial gene arrangement of H. diminuta with those previously published supports a phylogenetic position of flatworms as members of the Eutrochozoa, rather than being basal to either a clade of protostomes or a clade of coelomates.

  6. Complete chloroplast genome sequence of a major economic species, Ziziphus jujuba (Rhamnaceae).

    Science.gov (United States)

    Ma, Qiuyue; Li, Shuxian; Bi, Changwei; Hao, Zhaodong; Sun, Congrui; Ye, Ning

    2017-02-01

    Ziziphus jujuba is an important woody plant with high economic and medicinal value. Here, we analyzed and characterized the complete chloroplast (cp) genome of Z. jujuba, the first member of the Rhamnaceae family for which the chloroplast genome sequence has been reported. We also built a web browser for navigating the cp genome of Z. jujuba (http://bio.njfu.edu.cn/gb2/gbrowse/Ziziphus_jujuba_cp/). Sequence analysis showed that this cp genome is 161,466 bp long and has a typical quadripartite structure of large (LSC, 89,120 bp) and small (SSC, 19,348 bp) single-copy regions separated by a pair of inverted repeats (IRs, 26,499 bp). The sequence contained 112 unique genes, including 78 protein-coding genes, 30 transfer RNAs, and four ribosomal RNAs. The genome structure, gene order, GC content, and codon usage are similar to those of other typical angiosperm cp genomes. A total of 38 tandem repeats, two forward repeats, and three palindromic repeats were detected in the Z. jujuba cp genome. Simple sequence repeat (SSR) analysis revealed that most SSRs were AT-rich. The homopolymer regions in the cp genome of Z. jujuba were verified and manually corrected by Sanger sequencing. One-third of mononucleotide repeats were found to be erroneously sequenced by 454 pyrosequencing, which yielded sequences 1-4 bases shorter than those obtained by Sanger sequencing. Analysis of the cp genome of Z. jujuba revealed that IR contraction and expansion events resulted in ycf1 and rps19 pseudogenes. A phylogenetic analysis based on 64 protein-coding genes showed that Z. jujuba was closely related to members of the Elaeagnaceae family, which will be helpful for phylogenetic studies of other Rosales species. The complete cp genome sequence of Z. jujuba will facilitate population, phylogenetic, and cp genetic engineering studies of this economic plant.
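
    The quadripartite structure described in this record implies a simple length identity: one LSC, one SSC and two IR copies should add up to the full genome. The following is only an illustrative sketch of that check, using the sizes quoted in the abstract; the function name `quadripartite_length` is hypothetical and not taken from the paper.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length (bp) of a circular cp genome: one LSC, one SSC, two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as quoted in the Z. jujuba abstract.
reported_total = 161_466
computed_total = quadripartite_length(lsc=89_120, ssc=19_348, ir=26_499)

# Compare the sum of the regions with the reported genome size.
print(computed_total, computed_total == reported_total)  # 161466 True
```

    The same check can be applied to other chloroplast records in this list; small mismatches may simply reflect rounding or boundary conventions in the reported region sizes.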

  7. The complete chloroplast genome sequence of Dianthus superbus var. longicalycinus.

    Science.gov (United States)

    Gurusamy, Raman; Lee, Do-Hyung; Park, SeonJoo

    2016-05-01

    The complete chloroplast genome (cpDNA) sequence of Dianthus superbus var. longicalycinus, an economically important traditional Chinese medicine, was reported and characterized. The cpDNA of Dianthus superbus var. longicalycinus is 149,539 bp, with 36.3% GC content. A pair of inverted repeats (IRs) of 24,803 bp is separated by a large single-copy region (LSC, 82,805 bp) and a small single-copy region (SSC, 17,128 bp). It encodes 85 protein-coding genes, 36 tRNA genes and 8 rRNA genes. Of 129 individual genes, 13 genes contain one intron and three genes contain two introns.

  8. The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis.

    Science.gov (United States)

    Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

    2015-01-01

    Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var. deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5' portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16-taxon angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids.

  9. Sequencing genes in silico using single nucleotide polymorphisms

    Directory of Open Access Journals (Sweden)

    Zhang Xinyi

    2012-01-01

    Background: The advent of high throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs) discovered from genetic association studies. Currently, direct sequencing of candidate genes or regions on a large number of subjects remains both cost- and time-prohibitive. Results: To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS), which predicts phased sequences of intragenic regions using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles) at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (range 79%-100%). This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC) Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes. Conclusions: Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate

  10. Complete genome sequence of Nakamurella multipartita type strain (Y-104).

    Science.gov (United States)

    Tice, Hope; Mayilraj, Shanmugam; Sims, David; Lapidus, Alla; Nolan, Matt; Lucas, Susan; Glavina Del Rio, Tijana; Copeland, Alex; Cheng, Jan-Fang; Meincke, Linda; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Detter, John C; Brettin, Thomas; Rohde, Manfred; Göker, Markus; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Chen, Feng

    2010-03-30

    Nakamurella multipartita (Yoshimi et al. 1996) Tao et al. 2004 is the type species of the monospecific genus Nakamurella in the actinobacterial suborder Frankineae. The nonmotile, coccus-shaped strain was isolated from activated sludge acclimated with sugar-containing synthetic wastewater, and is capable of accumulating large amounts of polysaccharides in its cells. Here we describe the features of the organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of a member of the family Nakamurellaceae. The 6,060,298 bp long single replicon genome with its 5415 protein-coding and 56 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  11. The complete chloroplast genome sequence of Dodonaea viscosa: comparative and phylogenetic analyses.

    Science.gov (United States)

    Saina, Josphat K; Gichira, Andrew W; Li, Zhi-Zhong; Hu, Guang-Wan; Wang, Qing-Feng; Liao, Kuo

    2018-02-01

    The plant chloroplast (cp) genome is a highly conserved structure, which makes it useful for evolutionary and systematic research. Numerous complete cp genome sequences have now been reported thanks to high throughput sequencing technology. However, no complete chloroplast genome of the genus Dodonaea has been reported before. To better understand the molecular basis of the Dodonaea viscosa chloroplast, we used Illumina sequencing technology to sequence its complete genome. The whole length of the cp genome is 159,375 base pairs (bp), with a pair of inverted repeats (IRs) of 27,099 bp separated by a large single copy (LSC) region of 87,204 bp and a small single copy (SSC) region of 17,972 bp. The annotation analysis revealed a total of 115 unique genes, of which 81 were protein-coding, 30 tRNA, and four ribosomal RNA genes. Comparative genome analysis with other closely related Sapindaceae members showed conserved gene order in the inverted and single copy regions. Phylogenetic analysis clustered D. viscosa with other species of Sapindaceae with strong bootstrap support. Finally, a total of 249 SSRs were detected. Moreover, a comparison of the synonymous (Ks) and nonsynonymous (Ka) substitution rates in D. viscosa showed very low values. The cp genome reported here provides a valuable genetic resource for further comprehensive studies of genetic variation, taxonomy and phylogenetic evolution of the Sapindaceae family. In addition, the SSR markers detected will be used in further phylogeographic and population structure studies of the species in this genus.

  12. The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae)

    Energy Technology Data Exchange (ETDEWEB)

    Wolf, Paul G.; Karol, Kenneth G.; Mandoli, Dina F.; Kuehl, Jennifer V.; Arumuganathan, K.; Ellis, Mark W.; Mishler, Brent D.; Kelch, Dean G.; Olmstead, Richard G.; Boore, Jeffrey L.

    2005-02-01

    We used a unique combination of techniques to sequence the first complete chloroplast genome of a lycophyte, Huperzia lucidula. This plant belongs to a significant clade hypothesized to represent the sister group to all other vascular plants. We used fluorescence-activated cell sorting (FACS) to isolate the organelles, rolling circle amplification (RCA) to amplify the genome, and shotgun sequencing to 8x depth coverage to obtain the complete chloroplast genome sequence. The genome is 154,373 bp, containing inverted repeats of 15,314 bp each, a large single-copy region of 104,088 bp, and a small single-copy region of 19,671 bp. Gene order is more similar to those of mosses, liverworts, and hornworts than to gene order for other vascular plants. For example, the Huperzia chloroplast genome possesses the bryophyte gene order for a previously characterized 30 kb inversion, thus supporting the hypothesis that lycophytes are sister to all other extant vascular plants. The lycophyte chloroplast genome data also enable a better reconstruction of the basal tracheophyte genome, which is useful for inferring relationships among bryophyte lineages. Several unique characters are observed in Huperzia, such as movement of the gene ndhF from the small single copy region into the inverted repeat. We present several analyses of evolutionary relationships among land plants by using nucleotide data, amino acid sequences, and by comparing gene arrangements from chloroplast genomes. The results, while still tentative pending the large number of chloroplast genomes from other key lineages that are soon to be sequenced, are intriguing in themselves, and contribute to a growing comparative database of genomic and morphological data across the green plants.

  13. The complete mitochondrial genome sequence of the Tibetan red fox (Vulpes vulpes montana).

    Science.gov (United States)

    Zhang, Jin; Zhang, Honghai; Zhao, Chao; Chen, Lei; Sha, Weilai; Liu, Guangshuai

    2015-01-01

    In this study, the complete mitochondrial genome of the Tibetan red fox (Vulpes vulpes montana) was sequenced for the first time using blood samples obtained from a wild female red fox captured in Lhasa, Tibet, China. The Qinghai-Tibet Plateau is the highest plateau in the world, with an average elevation above 3500 m. Sequence analysis showed that the genome contains a 12S rRNA gene, a 16S rRNA gene, 22 tRNA genes, 13 protein-coding genes and 1 control region (CR). Variable tandem repeats in the CR are the main reason for the length variability of the mitochondrial genome among canids.

  14. Complete sequencing of five Araliaceae chloroplast genomes and the phylogenetic implications.

    Directory of Open Access Journals (Sweden)

    Rong Li

    BACKGROUND: The ginseng family (Araliaceae) includes a number of economically important plant species. Previous phylogenetic studies circumscribed three major clades within the core ginseng plant family, yet the internal relationships of each major group have been poorly resolved, perhaps due to rapid radiation of these lineages. Recent studies have shown that phylogenomics based on chloroplast genomes provides a viable way to resolve complex relationships. METHODOLOGY/PRINCIPAL FINDINGS: We report the complete nucleotide sequences of five Araliaceae chloroplast genomes using next-generation sequencing technology. The five chloroplast genomes are 156,333-156,459 bp in length, including a pair of inverted repeats (25,551-26,108 bp) separated by the large single-copy (86,028-86,566 bp) and small single-copy (18,021-19,117 bp) regions. Each chloroplast genome contains the same 114 unique genes consisting of 30 transfer RNA genes, four ribosomal RNA genes, and 80 protein coding genes. Gene size, content, and order, AT content, and IR/SC boundary structure are similar among all Araliaceae chloroplast genomes. A total of 140 repeats were identified in the five chloroplast genomes, with palindromic repeats as the most common type. Phylogenomic analyses using parsimony, likelihood, and Bayesian inference based on the complete chloroplast genomes strongly supported the monophyly of the Asian Palmate group and the Aralia-Panax group. Furthermore, the relationships among the sampled taxa within the Asian Palmate group were well resolved. Twenty-six DNA markers with a percentage of variable sites higher than 5% were identified, which may be useful for phylogenetic studies of Araliaceae. CONCLUSION: The chloroplast genomes of Araliaceae are highly conserved in all aspects of genome features. The large-scale phylogenomic data based on the complete chloroplast DNA sequences are shown to be effective for the phylogenetic reconstruction of Araliaceae.

  15. Complete mitogenomic sequence of the Critically Endangered Northern River Shark Glyphis garricki (Carcharhiniformes: Carcharhinidae).

    Science.gov (United States)

    Feutry, Pierre; Grewe, Peter M; Kyne, Peter M; Chen, Xiao

    2015-01-01

    In this study we describe the first complete mitochondrial sequence for the Critically Endangered Northern River shark Glyphis garricki. The complete mitochondrial sequence is 16,702 bp in length, contains 37 genes and one control region with the typical gene order and transcriptional direction of vertebrate mitogenomes. The overall base composition is 31.5% A, 26.3% C, 12.9% G and 29.3% T. The length of 22 tRNA genes ranged from 68 (tRNA-Ser2 and tRNA-Cys) to 75 (tRNA-Leu1) bp. The control region of G. garricki was 1067 bp in length with high A+T (67.9%) and poor G (12.6%) content. The mitogenomic characters (base composition, codon usage and gene length) of G. garricki were very similar to Glyphis glyphis.
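
    Base-composition figures such as those quoted in this record (percent A, C, G, T and the A+T content of the control region) are obtained by counting nucleotides in the assembled sequence. The snippet below is a minimal sketch of that calculation on a made-up toy sequence; the function names `base_composition` and `at_content` are hypothetical, and no real G. garricki sequence data are used.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Percentage of A, C, G and T in seq, ignoring any other symbols (e.g. N)."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: round(100 * counts[base] / total, 1) for base in "ACGT"}


def at_content(seq: str) -> float:
    """Combined A+T percentage of seq."""
    comp = base_composition(seq)
    return round(comp["A"] + comp["T"], 1)


# Toy sequence for illustration only (3 A, 2 C, 2 G, 3 T).
toy = "AATTCCGGAT"
print(base_composition(toy))  # {'A': 30.0, 'C': 20.0, 'G': 20.0, 'T': 30.0}
print(at_content(toy))        # 60.0
```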

  16. Complete nucleotide sequence of a novel Hibiscus-infecting Cilevirus from Florida and its relationship with closely associated Cileviruses

    Science.gov (United States)

    The complete nucleotide sequence of a recently discovered Florida (FL) isolate of Hibiscus infecting Cilevirus (HiCV) was determined by Sanger sequencing. The movement- and coat- protein gene sequences of the HiCV-FL isolate are more divergent than other genes of the previously sequenced HiCV-HA (Ha...

  17. The complete genome sequence of the Atlantic salmon paramyxovirus (ASPV)

    International Nuclear Information System (INIS)

    Nylund, Stian; Karlsen, Marius; Nylund, Are

    2008-01-01

    The complete RNA genome of the Atlantic salmon paramyxovirus (ASPV), isolated from Atlantic salmon suffering from proliferative gill inflammation (PGI), has been determined. The genome is 16,965 nucleotides in length and consists of six nonoverlapping genes in the order 3'- N - P/C/V - M - F - HN - L -5', coding for the nucleocapsid, phospho-, matrix, fusion, hemagglutinin-neuraminidase and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and trinucleotide intergenic regions similar to those of other Paramyxoviridae. The ASPV P-gene expression strategy is like that of the respiro- and morbilliviruses, which express the phosphoprotein from the primary transcript, and edit a portion of the mRNA to encode the accessory proteins V and W. It also encodes the C-protein by ribosomal choice of translation initiation. Pairwise comparisons of amino acid identities, and phylogenetic analysis of deduced ASPV protein sequences with homologous sequences from other Paramyxoviridae, show that ASPV has an affinity for the genus Respirovirus, but may represent a new genus within the subfamily Paramyxovirinae

  18. Complete genome sequence of Marivirga tractuosa type strain (H-43).

    Science.gov (United States)

    Pagani, Ioanna; Chertkov, Olga; Lapidus, Alla; Lucas, Susan; Del Rio, Tijana Glavina; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Nolan, Matt; Saunders, Elizabeth; Pitluck, Sam; Held, Brittany; Goodwin, Lynne; Liolios, Konstantinos; Ovchinikova, Galina; Ivanova, Natalia; Mavromatis, Konstantinos; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Jeffries, Cynthia D; Detter, John C; Han, Cliff; Tapia, Roxanne; Ngatchou-Djao, Olivier D; Rohde, Manfred; Göker, Markus; Spring, Stefan; Sikorski, Johannes; Woyke, Tanja; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C

    2011-04-29

    Marivirga tractuosa (Lewin 1969) Nedashkovskaya et al. 2010 is the type species of the genus Marivirga, which belongs to the family Flammeovirgaceae. Members of this genus are of interest because of their gliding motility. The species is of interest because representative strains show resistance to several antibiotics, including gentamicin, kanamycin, neomycin, polymyxin and streptomycin. This is the first complete genome sequence of a member of the family Flammeovirgaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 4,511,574 bp long chromosome and the 4,916 bp plasmid with their 3,808 protein-coding and 49 RNA genes are a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  19. Two complete chloroplast genome sequences of Cannabis sativa varieties.

    Science.gov (United States)

    Oh, Hyehyun; Seo, Boyoung; Lee, Seunghwan; Ahn, Dong-Ha; Jo, Euna; Park, Jin-Kyoung; Min, Gi-Sik

    2016-07-01

    In this study, we determined the complete chloroplast (cp) genomes from two varieties of Cannabis sativa. The genome sizes were 153,848 bp (the Korean non-drug variety, Cheungsam) and 153,854 bp (the African variety, Yoruba Nigeria). The genome structures were identical, with 131 individual genes [86 protein-coding genes (PCGs), eight rRNA, and 37 tRNA genes]. Further, except for the presence of an intron in the rps3 genes of the two C. sativa varieties, the cp genomes of C. sativa had conserved features similar to those of all known species in the order Rosales. To verify the position of C. sativa within the order Rosales, we conducted a phylogenetic analysis using concatenated sequences of all PCGs from 17 complete cp genomes. The resulting tree strongly supported monophyly of Rosales. Further, the family Cannabaceae, represented by C. sativa, showed a close relationship with the family Moraceae. The phylogenetic relationship outlined in our study is congruent with those previously shown for the order Rosales.

  20. Next generation sequencing yields the complete mitochondrial genome of the largescale mullet, Liza macrolepis (Teleostei: Mugilidae).

    Science.gov (United States)

    Shen, Kang-Ning; Tsai, Shiou-Yi; Chen, Ching-Hung; Hsiao, Chung-Der; Durand, Jean-Dominique

    2016-11-01

    In this study, the complete mitogenome of the largescale mullet (Teleostei: Mugilidae) has been sequenced by the next-generation sequencing method. The assembled mitogenome, consisting of 16,832 bp, had the typical vertebrate mitochondrial gene arrangement, including 13 protein-coding genes, 22 transfer RNAs, two ribosomal RNA genes, and a non-coding control region (D-loop). The D-loop, which is 1094 bp long, is located between tRNA-Pro and tRNA-Phe. The overall base composition of the largescale mullet is 27.8% A, 30.1% C, 16.2% G, and 25.9% T. The complete mitogenome may provide essential DNA molecular data for further phylogenetic and evolutionary analysis of Mugilidae.

  1. Next generation sequencing yields the complete mitochondrial genome of the Hornlip mullet Plicomugil labiosus (Teleostei: Mugilidae).

    Science.gov (United States)

    Shen, Kang-Ning; Chen, Ching-Hung; Hsiao, Chung-Der

    2016-05-01

    In this study, the complete mitogenome of the hornlip mullet Plicomugil labiosus (Teleostei: Mugilidae) has been sequenced by the next-generation sequencing method. The assembled mitogenome, consisting of 16,829 bp, had the typical vertebrate mitochondrial gene arrangement, including 13 protein-coding genes, 22 transfer RNAs, 2 ribosomal RNA genes and a non-coding control region (D-loop). The D-loop, which is 1057 bp long, is located between tRNA-Pro and tRNA-Phe. The overall base composition of P. labiosus is 28.0% A, 29.3% C, 15.5% G and 27.2% T. The complete mitogenome may provide essential DNA molecular data for further population, phylogenetic and evolutionary analysis of Mugilidae.

  2. Rickettsia asembonensis Characterization by Multilocus Sequence Typing of Complete Genes, Peru.

    Science.gov (United States)

    Loyola, Steev; Flores-Mendoza, Carmen; Torre, Armando; Kocher, Claudine; Melendrez, Melanie; Luce-Fedrow, Alison; Maina, Alice N; Richards, Allen L; Leguia, Mariana

    2018-05-01

    While studying rickettsial infections in Peru, we detected Rickettsia asembonensis in fleas from domestic animals. We characterized 5 complete genomic regions (17kDa, gltA, ompA, ompB, and sca4) and conducted multilocus sequence typing and phylogenetic analyses. The molecular isolate from Peru is distinct from the original R. asembonensis strain from Kenya.

  3. Complete nucleotide sequences of avian metapneumovirus subtype B genome.

    Science.gov (United States)

    Sugiyama, Miki; Ito, Hiroshi; Hata, Yusuke; Ono, Eriko; Ito, Toshihiro

    2010-12-01

    Complete nucleotide sequences were determined for subtype B avian metapneumovirus (aMPV), the attenuated vaccine strain VCO3/50 and its parental pathogenic strain VCO3/60616. The genomes of both strains comprised 13,508 nucleotides (nt), with a 42-nt leader at the 3'-end and a 46-nt trailer at the 5'-end. The genome contains eight genes in the order 3'-N-P-M-F-M2-SH-G-L-5', which is the same order shown in the other metapneumoviruses. The genes are flanked on either side by conserved transcriptional start and stop signals and have intergenic sequences varying in length from 1 to 88 nt. Comparison of nt and predicted amino acid (aa) sequences of VCO3/60616 with those of other metapneumoviruses revealed higher homology with aMPV subtype A virus than with other metapneumoviruses. A total of 18 nt and 10 deduced aa differences were seen between the strains, and one or a combination of several differences could be associated with attenuation of VCO3/50.

  4. Sequencing and analysis of the complete mitochondrial genome in Anopheles sinensis (Diptera: Culicidae).

    Science.gov (United States)

    Chen, Kai; Wang, Yan; Li, Xiang-Yu; Peng, Heng; Ma, Ya-Jun

    2017-10-02

    Anopheles sinensis (Diptera: Culicidae) is a primary vector of Plasmodium vivax and Brugia malayi in most regions of China. In addition, its phylogenetic relationship with the cryptic species of the Hyrcanus Group is complex and remains unresolved. Mitochondrial genome sequences are widely used as molecular markers for phylogenetic studies of mosquito species complexes, but mitochondrial genome data for An. sinensis are not available. An. sinensis samples were collected from Shandong, China, and identified by molecular marker. Genomic DNA was extracted, followed by Illumina sequencing. Two complete mitochondrial genomes were assembled and annotated using the mitochondrial genome of An. gambiae as reference. The mitochondrial genome sequences of the 28 known Anopheles species were aligned, and a phylogenetic tree was reconstructed by the Maximum Likelihood (ML) method. The lengths of the complete mitochondrial genomes of An. sinensis were 15,076 bp and 15,138 bp, each consisting of 13 protein-coding genes, 22 transfer RNA (tRNA) genes, 2 ribosomal RNA (rRNA) genes, and an AT-rich control region. As in other insects, most mitochondrial genes are encoded on the J strand, except for ND5, ND4, ND4L, ND1, two rRNA and eight tRNA genes, which are encoded on the N strand. The bootstrap value was set to 1000 in the ML analyses. The topologies recovered phylogenetic affinity within the subfamily Anophelinae. The ML tree showed four major clades, corresponding to the subgenera Cellia, Anopheles, Nyssorhynchus and Kerteszia of the genus Anopheles. The complete mitochondrial genomes of An. sinensis were obtained. The number, order and transcription direction of An. sinensis mitochondrial genes were the same as in other species of the family Culicidae.

  5. Complete genome sequence of Francisella tularensis subspecies holarctica FTNF002-00.

    Directory of Open Access Journals (Sweden)

    Ravi D Barabote

    Francisella tularensis subspecies holarctica FTNF002-00 strain was originally obtained from the first known clinical case of bacteremic F. tularensis pneumonia in Southern Europe isolated from an immunocompetent individual. The FTNF002-00 complete genome contains the RD(23) deletion and represents a type strain for a clonal population from the first epidemic tularemia outbreak in Spain between 1997-1998. Here, we present the complete sequence analysis of the FTNF002-00 genome. The complete genome sequence of FTNF002-00 revealed several large as well as small genomic differences with respect to two other published complete genome sequences of F. tularensis subsp. holarctica strains, LVS and OSU18. The FTNF002-00 genome shares >99.9% sequence similarity with LVS and OSU18, and is also approximately 5 MB smaller by comparison. The overall organization of the FTNF002-00 genome is remarkably identical to those of LVS and OSU18, except for a single 3.9 kb inversion in FTNF002-00. Twelve regions of difference ranging from 0.1-1.5 kb and forty-two small insertions and deletions were identified in a comparative analysis of FTNF002-00, LVS, and OSU18 genomes. Two small deletions appear to inactivate two genes in FTNF002-00 causing them to become pseudogenes; the intact genes encode a protein of unknown function and a drug:H(+) antiporter. In addition, we identified ninety-nine proteins in FTNF002-00 containing amino acid mutations compared to LVS and OSU18. Several non-conserved amino acid replacements were identified, one of which occurs in the virulence-associated intracellular growth locus subunit D protein. Many of these changes in FTNF002-00 are likely the consequence of direct selection that increases the fitness of this subsp. holarctica clone within its endemic population. Our complete genome sequence analyses lay the foundation for experimental testing of these possibilities.

  6. Complete DNA sequence of the mitochondrial genome of the treehopper Leptobelus gazella (Membracoidea: Hemiptera).

    Science.gov (United States)

    Zhao, Xing; Liang, Ai-Ping

    2016-09-01

    The first complete DNA sequence of the mitochondrial genome (mitogenome) of Leptobelus gazella (Membracoidea: Hemiptera) is determined in this study. The circular molecule is 16,007 bp in full length; it encodes a set of 37 genes, including 13 proteins, 2 ribosomal RNAs and 22 transfer RNAs, and contains an A + T-rich region (CR). The gene number, content, and organization of L. gazella are similar to those of other typical metazoan mitogenomes. Twelve of the 13 PCGs are initiated with ATR methionine or ATT isoleucine codons; the exception is the atp8 gene, which uses ATC isoleucine as its start signal. Ten of the 13 PCGs have complete termination codons, either TAA (nine genes) or TAG (cytb). The remaining 3 PCGs (cox1, cox2 and nad5) have incomplete termination codons T(AA). All of the 22 tRNAs can be folded into a typical clover-leaf structure. The complete mitogenome sequence data of L. gazella are useful for phylogenetic and biogeographic studies of the Membracoidea and Hemiptera.
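
    The start codon "ATR" in this record uses the IUPAC ambiguity code R (A or G), so it stands for ATA and ATG. As a small illustrative sketch (the helper name `expand_codon` and the partial IUPAC table are assumptions for this example, not part of the study), a degenerate codon can be expanded like this:

```python
from itertools import product

# Partial IUPAC nucleotide ambiguity table (only the codes needed here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",   # purine
    "Y": "CT",   # pyrimidine
    "N": "ACGT", # any base
}


def expand_codon(codon: str) -> list[str]:
    """Return every concrete codon matched by a (possibly degenerate) codon."""
    choices = [IUPAC[base] for base in codon.upper()]
    return ["".join(bases) for bases in product(*choices)]


print(expand_codon("ATR"))  # ['ATA', 'ATG']
```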

  7. Complete mitochondrial genome sequence of Indian medium carp, Labeo gonius (Hamilton, 1822) and its comparison with other related carp species.

    Science.gov (United States)

    Behera, Bijay Kumar; Kumari, Kavita; Baisvar, Vishwamitra Singh; Rout, Ajaya Kumar; Pakrashi, Sudip; Paria, Prasenjet; Jena, J K

    2017-01-01

    In the present study, the complete mitochondrial genome sequence of Labeo gonius is reported using a PGM sequencer (Ion Torrent). The complete mitogenome of L. gonius, obtained by de novo assembly of genomic reads using the Torrent Mapping Alignment Program (TMAP), is 16,614 bp in length. The mitogenome of L. gonius comprises 13 protein-coding genes, 22 tRNAs, 2 rRNA genes, and a D-loop as control region, with gene order and organization similar to most other fish mitogenomes in the NCBI databases. The mitogenome in the present study has 99% similarity to the complete mitogenome sequence of Labeo fimbriatus, as reported earlier. The phylogenetic analysis of Cypriniformes showed that their mitogenomes are closely related to each other. The complete mitogenome sequence of L. gonius will be helpful in understanding the population genetics, phylogenetics, and evolution of Indian carps.

  8. Complete genome sequence of Sanguibacter keddieii type strain (ST-74T)

    Energy Technology Data Exchange (ETDEWEB)

    Ivanova, Natalia; Sikorski, Johannes; Sims, David; Brettin, Thomas; Detter, John C.; Han, Cliff; Lapidus, Alla; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Chen, Feng; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Pati, Amrita; Mavromatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; D' haeseleer, Patrik; Chain, Patrick; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Goker, Markus; Pukall, Rudiger; Klenk, Hans-Peter; Kyrpides, Nikos

    2009-05-20

    Sanguibacter keddieii is the type species of the genus Sanguibacter, the only described genus within the family of Sanguibacteraceae. Phylogenetically, this family is located in the neighbourhood of the genus Oerskovia and the family Cellulomonadaceae within the actinobacterial suborder Micrococcineae. The strain described in this report was isolated from blood of apparently healthy cows. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of the family Sanguibacteraceae, and the 4,253,413 bp long single replicon genome with its 3735 protein-coding and 70 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  9. Complete genome sequence of the plant-associated Serratia plymuthica strain AS13

    Energy Technology Data Exchange (ETDEWEB)

    Neupane, Saraswoti [Uppsala University, Uppsala, Sweden]; Finlay, Roger D. [Uppsala University, Uppsala, Sweden]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Alstrom, Sadhna [Uppsala University, Uppsala, Sweden]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Han, James [U.S. Department of Energy, Joint Genome Institute]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Peters, Lin [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Held, Brittany [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Detter, J C [U.S. Department of Energy, Joint Genome Institute]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Hauser, Loren John [ORNL]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Hogberg, Nils [Uppsala University, Uppsala, Sweden]

    2012-01-01

    Serratia plymuthica AS13 is a plant-associated Gammaproteobacteria, isolated from rapeseed roots. It is of special interest because of its ability to inhibit fungal pathogens of rapeseed and to promote plant growth. The complete genome of S. plymuthica AS13 consists of a 5,442,549 bp circular chromosome. The chromosome contains 4,951 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced as part of the project entitled Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens within the 2010 DOE-JGI Community Sequencing Program (CSP2010).

  10. Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora

    Directory of Open Access Journals (Sweden)

    Maria Eguiluz

    2017-11-01

    Plinia trunciflora is a Brazilian native fruit tree from the Myrtaceae family, also known as jaboticaba. This species has great potential for fruit production. Due to the high content of essential oils in its leaves and of anthocyanins in its fruits, there is also increasing interest from the pharmaceutical industry. Nevertheless, there are few studies focusing on its molecular biology and genetic characterization. We herein report the complete chloroplast (cp) genome of P. trunciflora using high-throughput sequencing and compare it to other previously sequenced Myrtaceae genomes. The cp genome of P. trunciflora is 159,512 bp in size, comprising inverted repeats of 26,414 bp and single-copy regions of 88,097 bp (LSC) and 18,587 bp (SSC). The genome contains 111 single-copy genes (77 protein-coding, 30 tRNA and four rRNA genes). Phylogenetic analysis using 57 cp protein-coding genes demonstrated that P. trunciflora, Eugenia uniflora and Acca sellowiana form a cluster with a closer relationship to Syzygium cumini than to Eucalyptus. The complete cp sequence reported here can be used in evolutionary and population genetics studies, helping to resolve the complex taxonomy of this species and fill the gap in genetic characterization.

  11. Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora

    Science.gov (United States)

    Eguiluz, Maria; Yuyama, Priscila Mary; Guzman, Frank; Rodrigues, Nureyev Ferreira; Margis, Rogerio

    2017-01-01

    Plinia trunciflora is a Brazilian native fruit tree from the Myrtaceae family, also known as jaboticaba. This species has great potential for fruit production. Due to the high content of essential oils in its leaves and of anthocyanins in its fruits, there is also increasing interest from the pharmaceutical industry. Nevertheless, there are few studies focusing on its molecular biology and genetic characterization. We herein report the complete chloroplast (cp) genome of P. trunciflora using high-throughput sequencing and compare it to other previously sequenced Myrtaceae genomes. The cp genome of P. trunciflora is 159,512 bp in size, comprising inverted repeats of 26,414 bp and single-copy regions of 88,097 bp (LSC) and 18,587 bp (SSC). The genome contains 111 single-copy genes (77 protein-coding, 30 tRNA and four rRNA genes). Phylogenetic analysis using 57 cp protein-coding genes demonstrated that P. trunciflora, Eugenia uniflora and Acca sellowiana form a cluster with a closer relationship to Syzygium cumini than to Eucalyptus. The complete cp sequence reported here can be used in evolutionary and population genetics studies, helping to resolve the complex taxonomy of this species and fill the gap in genetic characterization. PMID:29111566
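
    The region sizes quoted in this abstract can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the usual quadripartite layout (one LSC, one SSC, and two IR copies) and assuming that the reported 26,414 bp is the length of each IR copy; the function name is illustrative and does not come from the paper.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length under the quadripartite model: LSC + SSC + 2 * IR."""
    return lsc + ssc + 2 * ir


# Values as reported in the P. trunciflora abstract (bp).
reported_total = 159_512
computed_total = quadripartite_length(lsc=88_097, ssc=18_587, ir=26_414)

# Gene categories as reported: protein-coding + tRNA + rRNA.
gene_total = 77 + 30 + 4

print(computed_total == reported_total)  # True: the region sizes add up to the reported length
print(gene_total)                        # 111, matching the reported single-copy gene count
```

    The same check can be applied to any record that reports all four region sizes.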

  13. Complete genome sequence of Actinosynnema mirum type strain (101T)

    Energy Technology Data Exchange (ETDEWEB)

    Land, Miriam; Lapidus, Alla; Mayilraj, Shanmugam; Chen, Feng; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Chertkov, Olga; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Rohde, Manfred; Goker, Markus; Pati, Amrita; Ivanova, Natalia; Mavrommatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia; Brettin, Thomas; Detter, John C.; Han, Cliff; Chain, Patrick; Tindall, Brian; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2009-05-20

    Actinosynnema mirum Hasegawa et al. 1978 is the type species of the genus, and is of phylogenetic interest because of its central phylogenetic location in the Actinosynnemataceae, a rapidly growing family within the actinobacterial suborder Pseudonocardineae. A. mirum is characterized by its motile spores borne on synnemata and as a producer of nocardicin antibiotics. It is capable of growing aerobically and under a moderate CO2 atmosphere. The strain is a Gram-positive, aerial and substrate mycelium producing bacterium, originally isolated from a grass blade collected from the Raritan River, New Jersey. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of a member of the family Actinosynnemataceae, and only the second sequence from the actinobacterial suborder Pseudonocardineae. The 8,248,144 bp long single replicon genome with its 7100 protein-coding and 77 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  14. Complete genome sequence of Calditerrivibrio nitroreducens type strain (Yu37-1T)

    Energy Technology Data Exchange (ETDEWEB)

    Pitluck, Sam [Joint Genome Institute, Walnut Creek, California]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Zeytun, Ahmet [Los Alamos National Laboratory (LANL)]; Lapidus, Alla L. [Joint Genome Institute, Walnut Creek, California]; Nolan, Matt [Joint Genome Institute, Walnut Creek, California]; Lucas, Susan [Joint Genome Institute, Walnut Creek, California]; Hammon, Nancy [Joint Genome Institute, Walnut Creek, California]; Deshpande, Shweta [Joint Genome Institute, Walnut Creek, California]; Cheng, Jan-Fang [Joint Genome Institute, Walnut Creek, California]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Liolios, Konstantinos [Joint Genome Institute, Walnut Creek, California]; Pagani, Ioanna [Joint Genome Institute, Walnut Creek, California]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [Joint Genome Institute, Walnut Creek, California]; Palaniappan, Krishna [Joint Genome Institute, Walnut Creek, California]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Detter, J. Chris [Joint Genome Institute, Walnut Creek, California]; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Ngatchou, Olivier Duplex [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Spring, Stefan [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Woyke, Tanja [Joint Genome Institute, Walnut Creek, California]; Bristow, James [Joint Genome Institute, Walnut Creek, California]; Eisen, Jonathan [Joint Genome Institute, Walnut Creek, California]; Markowitz, Victor [Joint Genome Institute, Walnut Creek, California]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [Joint Genome Institute, Walnut Creek, California]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Land, Miriam L [ORNL]

    2011-01-01

    Calditerrivibrio nitroreducens Iino et al. 2008 is the type species of the genus Calditerrivibrio. The species is of interest because of its important role in the nitrate cycle as nitrate reducer and for its isolated phylogenetic position in the Tree of Life. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the third complete genome sequence of a member of the family Deferribacteraceae. The 2,216,552 bp long genome with its 2,128 protein-coding and 50 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  15. The complete chloroplast genome of Capsicum annuum var. glabriusculum using Illumina sequencing.

    Science.gov (United States)

    Raveendar, Sebastin; Na, Young-Wang; Lee, Jung-Ro; Shim, Donghwan; Ma, Kyung-Ho; Lee, Sok-Young; Chung, Jong-Wook

    2015-07-20

    Chloroplast (cp) genome sequences provide a valuable source for DNA barcoding. Molecular phylogenetic studies have concentrated on DNA sequencing of conserved gene loci. However, this approach is time-consuming and more difficult to implement when gene organization differs among species. Here we report the complete re-sequencing of the cp genome of Capsicum pepper (Capsicum annuum var. glabriusculum) using the Illumina platform. The total length of the cp genome is 156,817 bp with a 37.7% overall GC content. A pair of inverted repeats (IRs) of 50,284 bp were separated by a small single copy (SSC; 18,948 bp) and a large single copy (LSC; 87,446 bp). The number of cp genes in C. annuum var. glabriusculum is the same as that in other Capsicum species. Variations in the lengths of the LSC, SSC and IR regions were the main contributors to the size variation in the cp genome of this species. A total of 125 simple sequence repeats (SSRs) and 48 insertion or deletion variants were found by sequence alignment of Capsicum cp genomes. These findings provide a foundation for further investigation of cp genome evolution in Capsicum and other higher plants.
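
    An overall GC content such as the 37.7% quoted above is the share of G and C among the called bases. The sketch below is an illustration only: the function name and the toy sequences are hypothetical, and ignoring ambiguous bases (such as N) in the denominator is an assumption, not necessarily the convention the authors used.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / called


# Toy sequences, not real chloroplast data.
print(gc_content("ATGCATTTAA"))  # 20.0 (2 of 10 bases are G or C)
print(gc_content("ATGCNN"))      # 50.0 (the two N positions are ignored)
```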

  16. Complete sequence and comparative analysis of the chloroplast genome of coconut palm (Cocos nucifera).

    Science.gov (United States)

    Huang, Ya-Yi; Matzke, Antonius J M; Matzke, Marjori

    2013-01-01

    Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available.

  17. Complete genome sequence of Cryptobacterium curtum type strain (12-3T)

    Energy Technology Data Exchange (ETDEWEB)

    Mavromatis, Konstantinos; Pukall, Rudiger; Rohde, Christine; Sims, David; Brettin, Thomas; Kuske, Cheryl; Detter, John C.; Han, Cliff; Lapidus, Alla; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ovchinnikova, Galina; Pati, Amrita; Ivanova, Natalia; Chen, Amy; Palaniappan, Krishna; Chain, Patrick; D' haeseleer, Patrik; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Rohde, Manfred; Klenk, Hans-Peter; Kyrpides, Nikos C.

    2009-05-20

    Cryptobacterium curtum Nakazawa et al. 1999 is the type species of the genus, and is of phylogenetic interest because of its very distant and isolated position within the family Coriobacteriaceae. C. curtum is an asaccharolytic, opportunistic pathogen with a typical occurrence in the oral cavity, involved in dental and oral infections like periodontitis, inflammations and abscesses. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of the actinobacterial family Coriobacteriaceae, and this 1,617,804 bp long single replicon genome with its 1364 protein-coding and 58 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  18. The complete genome sequence of human adenovirus 84, a highly recombinant new Human mastadenovirus D type with a unique fiber gene.

    Science.gov (United States)

    Kaján, Győző L; Kajon, Adriana E; Pinto, Alexis Castillo; Bartha, Dániel; Arnberg, Niklas

    2017-10-15

    A novel human adenovirus was isolated from a pediatric case of acute respiratory disease in Panama City, Panama in 2011. The clinical isolate was initially identified as an intertypic recombinant based on hexon and fiber gene sequencing. Based on the analysis of its complete genome sequence, the novel complex recombinant Human mastadenovirus D (HAdV-D) strain was classified into a new HAdV type: HAdV-84, and it was designated Adenovirus D human/PAN/P309886/2011/84[P43H17F84]. HAdV-D types usually possess an ocular or gastrointestinal tropism, and respiratory association is rarely reported. The virus has a novel fiber type, most closely related to, but still clearly distant from that of HAdV-36. The predicted fiber is hypothesised to bind sialic acid with lower affinity compared to HAdV-37. Bioinformatic analysis of the complete genomic sequence of HAdV-84 revealed multiple homologous recombination events and provided deeper insight into HAdV evolution.

  19. The complete mitochondrial genome sequence of Eimeria innocua (Eimeriidae, Coccidia, Apicomplexa).

    Science.gov (United States)

    Hafeez, Mian Abdul; Vrba, Vladimir; Barta, John Robert

    2016-07-01

    The complete mitochondrial genome of Eimeria innocua KR strain (Eimeriidae, Coccidia, Apicomplexa) was sequenced. This coccidium infects turkeys (Meleagris gallopavo), Bobwhite quails (Colinus virginianus), and Grey partridges (Perdix perdix). Genome organization and gene contents were comparable with other Eimeria spp. infecting galliform birds. The circular-mapping mt genome of E. innocua is 6247 bp in length with three protein-coding genes (cox1, cox3, and cytb), 19 gene fragments encoding large subunit (LSU) rRNA and 14 gene fragments encoding small subunit (SSU) rRNA. Like other Apicomplexa, no tRNA was encoded. The mitochondrial genome of E. innocua confirms its close phylogenetic affinities to Eimeria dispersa.

  20. Complete mitochondrial genome sequences from five Eimeria species (Apicomplexa; Coccidia; Eimeriidae) infecting domestic turkeys.

    Science.gov (United States)

    Ogedengbe, Mosun E; El-Sherry, Shiem; Whale, Julia; Barta, John R

    2014-07-17

    Clinical and subclinical coccidiosis is cosmopolitan and inflicts significant losses to the poultry industry globally. Seven named Eimeria species are responsible for coccidiosis in turkeys: Eimeria dispersa; Eimeria meleagrimitis; Eimeria gallopavonis; Eimeria meleagridis; Eimeria adenoeides; Eimeria innocua; and Eimeria subrotunda. Although attempts have been made to characterize these parasites molecularly at the nuclear 18S rDNA and ITS loci, the maternally-derived and mitotically replicating mitochondrial genome may be more suited for species level molecular work; however, only limited sequence data are available for Eimeria spp. infecting turkeys. The purpose of this study was to sequence and annotate the complete mitochondrial genomes from 5 Eimeria species that commonly infect the domestic turkey (Meleagris gallopavo). Six single-oocyst derived cultures of five Eimeria species infecting turkeys were PCR-amplified and sequenced completely prior to detailed annotation. Resulting sequences were aligned and used in phylogenetic analyses (BI, ML, and MP) that included complete mitochondrial genomes from 16 Eimeria species or concatenated CDS sequences from each genome. Complete mitochondrial genome sequences were obtained for Eimeria adenoeides Guelph, 6211 bp; Eimeria dispersa Briston, 6238 bp; Eimeria meleagridis USAR97-01, 6212 bp; Eimeria meleagrimitis USMN08-01, 6165 bp; Eimeria gallopavonis Weybridge, 6215 bp; and Eimeria gallopavonis USKS06-01, 6215 bp. The order, orientation and CDS lengths of the three protein coding genes (COI, COIII and CytB) as well as rDNA fragments encoding ribosomal large and small subunit rRNA were conserved among all sequences. Pairwise sequence identities between species ranged from 88.1% to 98.2%; sequence variability was concentrated within CDS or between rDNA fragments (where indels were common). No phylogenetic reconstruction supported monophyly of Eimeria species infecting turkeys; Eimeria dispersa may have arisen

  1. Complete Genome Sequence of the Anaerobic Halophilic Alkalithermophile Natranaerobius thermophilus JW/NM-WN-LFT

    Energy Technology Data Exchange (ETDEWEB)

    Mesbah, Noha [University of Georgia, Athens, GA]; Dalin, Eileen [U.S. Department of Energy, Joint Genome Institute]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Chertkov, Olga [Los Alamos National Laboratory (LANL)]; Han, James [U.S. Department of Energy, Joint Genome Institute]; Larimer, Frank W [ORNL]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Wiegel, Juergen [University of Georgia, Athens, GA]

    2011-01-01

    The genome of the anaerobic halophilic alkalithermophile Natranaerobius thermophilus consists of one chromosome and two plasmids. The present study is the first to report the completely sequenced genome of a polyextremophile, including genes associated with roles in the regulation of intracellular osmotic pressure, pH homeostasis, and thermophilic stability.

  2. Complete genome sequence of Desulfomicrobium baculatum type strain (XT)

    Energy Technology Data Exchange (ETDEWEB)

    Copeland, Alex; Spring, Stefan; Goker, Markus; Schneider, Susanne; Lapidus, Alla; Glavina Del Rio, Tijana; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C; Meincke, Linda; Sims, David; Brettin, Thomas; Detter, John C; Han, Cliff; Chain, Patrick; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C; Lucas, Susan

    2009-05-20

    Desulfomicrobium baculatum is the type species of the genus Desulfomicrobium, which is the type genus of the family Desulfomicrobiaceae. It is of phylogenetic interest because of the isolated location of the family Desulfomicrobiaceae within the order Desulfovibrionales. D. baculatum strain XT is a Gram-negative, motile, sulfate-reducing bacterium isolated from water-saturated manganese carbonate ore. It is strictly anaerobic and does not require NaCl for growth, although NaCl concentrations up to 6% (w/v) are tolerated. The metabolism is respiratory or fermentative. In the presence of sulfate, pyruvate and lactate are incompletely oxidized to acetate and CO2. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the deltaproteobacterial family Desulfomicrobiaceae, and this 3,942,657 bp long single replicon genome with its 3494 protein-coding and 72 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  3. A complete analysis of HA and NA genes of influenza A viruses.

    LENUS (Irish Health Repository)

    Shi, Weifeng

    2010-12-01

    More and more nucleotide sequences of type A influenza virus are available in public databases. Although these sequences have been the focus of many molecular epidemiological and phylogenetic analyses, most studies only deal with a few representative sequences. In this paper, we present a complete analysis of all Haemagglutinin (HA) and Neuraminidase (NA) gene sequences available to allow large scale analyses of the evolution and epidemiology of type A influenza.

  4. Complete Genome Sequences of Mycobacteriophages Clautastrophe, Kingsolomon, Krypton555, and Nicholas

    OpenAIRE

    Chung, Hui-Min; D’Elia, Tom; Ross, Joseph F.; Alvarado, Samuel M.; Brantley, Molly-Catherine; Bricker, Lydia P.; Butler, Courtney R.; Crist, Carson; Dane, Julia M.; Farran, Brett W.; Hobbs, Sierra; Lapak, Michelle; Lovell, Conner; Ludergnani, Nicholas; McMullen, Allison

    2017-01-01

    We report here the complete genome sequences of four subcluster L3 mycobacteriophages newly isolated from soil samples, using Mycobacterium smegmatis mc2155 as the host. Comparative genomic analyses with four previously described subcluster L3 phages reveal strong nucleotide similarity and gene conservation, with several large insertions/deletions near their right genome ends.

  5. Complete Chloroplast Genome Sequence of Coptis chinensis Franch. and Its Evolutionary History

    Science.gov (United States)

    He, Yang; Deng, Cao; Fan, Gang; Qin, Shishang

    2017-01-01

    Coptis chinensis Franch. is an important medicinal plant from the Ranunculales. We used next generation sequencing technology to determine the complete chloroplast genome of C. chinensis. This genome is 155,484 bp long with 38.17% GC content. Two 26,758 bp long inverted repeats separate the genome into a typical quadripartite structure. The C. chinensis chloroplast genome consists of 128 gene loci, including eight rRNA gene loci, 28 tRNA gene loci, and 92 protein-coding gene loci. Most of the SSRs in C. chinensis are poly-A/T. The numbers of mononucleotide SSRs in C. chinensis and other Ranunculaceae species are fewer than those in Berberidaceae species, while the number of dinucleotide SSRs is greater than that in the Berberidaceae. C. chinensis diverged from other Ranunculaceae species an estimated 81 million years ago (Mya). The divergence between Ranunculaceae and Berberidaceae was ~111 Mya, while the Ranunculales and Magnoliaceae shared a common ancestor during the Jurassic, ~153 Mya. Position 104 of the C. chinensis ndhG protein was identified as a positively selected site, indicating possible selection for the photosystem-chlororespiration system in C. chinensis. In summary, the complete sequencing and annotation of the C. chinensis chloroplast genome will facilitate future studies on this important medicinal species. PMID:28698879
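
    Mononucleotide and dinucleotide SSRs of the kind counted in this abstract are tandem repeats of a 1 bp or 2 bp unit. The following is a minimal sketch of a regex-based scan; the function name, the toy sequence, and the repeat thresholds (at least 10 units for mononucleotide and 5 for dinucleotide repeats) are assumptions for illustration, since the abstract does not state which tool or cutoffs were used.

```python
import re


def find_ssrs(seq: str, unit_len: int, min_repeats: int) -> list[tuple[int, str, int]]:
    """Return (start, unit, repeat_count) for tandem repeats of a unit_len-bp unit."""
    seq = seq.upper()
    # A captured unit followed by at least (min_repeats - 1) further copies of itself.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if unit_len > 1 and len(set(unit)) == 1:
            continue  # e.g. "AA" repeated is a mononucleotide run, not a dinucleotide SSR
        hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits


# Toy sequence: a poly-A run of 12 and an (AT)6 repeat; not real chloroplast data.
toy = "GG" + "A" * 12 + "CC" + "AT" * 6 + "GG"
print(find_ssrs(toy, unit_len=1, min_repeats=10))  # [(2, 'A', 12)]
print(find_ssrs(toy, unit_len=2, min_repeats=5))   # [(16, 'AT', 6)]
```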

  6. Complete Chloroplast Genome Sequence of Coptis chinensis Franch. and Its Evolutionary History

    Directory of Open Access Journals (Sweden)

    Yang He

    2017-01-01

    Coptis chinensis Franch. is an important medicinal plant from the Ranunculales. We used next generation sequencing technology to determine the complete chloroplast genome of C. chinensis. This genome is 155,484 bp long with 38.17% GC content. Two 26,758 bp long inverted repeats separate the genome into a typical quadripartite structure. The C. chinensis chloroplast genome consists of 128 gene loci, including eight rRNA gene loci, 28 tRNA gene loci, and 92 protein-coding gene loci. Most of the SSRs in C. chinensis are poly-A/T. The numbers of mononucleotide SSRs in C. chinensis and other Ranunculaceae species are fewer than those in Berberidaceae species, while the number of dinucleotide SSRs is greater than that in the Berberidaceae. C. chinensis diverged from other Ranunculaceae species an estimated 81 million years ago (Mya). The divergence between Ranunculaceae and Berberidaceae was ~111 Mya, while the Ranunculales and Magnoliaceae shared a common ancestor during the Jurassic, ~153 Mya. Position 104 of the C. chinensis ndhG protein was identified as a positively selected site, indicating possible selection for the photosystem-chlororespiration system in C. chinensis. In summary, the complete sequencing and annotation of the C. chinensis chloroplast genome will facilitate future studies on this important medicinal species.

  7. The complete chloroplast genome sequence of Helwingia himalaica (Helwingiaceae, Aquifoliales) and a chloroplast phylogenomic analysis of the Campanulidae.

    Science.gov (United States)

    Yao, Xin; Liu, Ying-Ying; Tan, Yun-Hong; Song, Yu; Corlett, Richard T

    2016-01-01

    Complete chloroplast genome sequences have been very useful for understanding phylogenetic relationships in angiosperms at the family level and above, but there are currently large gaps in coverage. We report the chloroplast genome for Helwingia himalaica, the first in the distinctive family Helwingiaceae and only the second genus to be sequenced in the order Aquifoliales. We then combine this with 36 published sequences in the large (c. 35,000 species) subclass Campanulidae in order to investigate relationships at the order and family levels. The Helwingia genome consists of 158,362 bp containing a pair of inverted repeat (IR) regions of 25,996 bp separated by a large single-copy (LSC) region and a small single-copy (SSC) region which are 87,810 and 18,560 bp, respectively. There are 142 known genes, including 94 protein-coding genes, eight ribosomal RNA genes, and 40 tRNA genes. The topology of the phylogenetic relationships between Apiales, Asterales, and Dipsacales differed between analyses based on complete genome sequences and on 36 shared protein-coding genes, showing that further studies of campanulid phylogeny are needed.

  8. Complete nucleotide sequence of Alfalfa mosaic virus isolated from alfalfa (Medicago sativa L.) in Argentina.

    Science.gov (United States)

    Trucco, Verónica; de Breuil, Soledad; Bejerman, Nicolás; Lenardon, Sergio; Giolitti, Fabián

    2014-06-01

    The complete nucleotide sequence of an Alfalfa mosaic virus (AMV) isolate infecting alfalfa (Medicago sativa L.) in Argentina, AMV-Arg, was determined. The virus genome has the typical organization described for AMV, and comprises 3,643, 2,593, and 2,038 nucleotides for RNA1, 2 and 3, respectively. The whole genome sequence and each encoding region were compared with those of four other isolates that have been completely sequenced, from China, Italy, Spain and the USA. The nucleotide identity percentages ranged from 95.9 to 99.1 % for the three RNAs and from 93.7 to 99 % for the protein 1 (P1), protein 2 (P2), movement protein and coat protein (CP) encoding regions, whereas the amino acid identity percentages of these proteins ranged from 93.4 to 99.5 %, the lowest value corresponding to P2. CP sequences of AMV-Arg were compared with those of 25 other available isolates, and a phylogenetic analysis based on the CP gene was carried out. The highest percentage of nucleotide sequence identity of the CP gene was 98.3 % with a Chinese isolate, and the highest at the amino acid level was 98.6 % with four isolates, two from Italy, one from Brazil and the remaining one from China. The phylogenetic analysis showed that AMV-Arg is closely related to subgroup I of AMV isolates. To our knowledge, this is the first report of a complete nucleotide sequence of AMV from South America and the first worldwide report of a complete nucleotide sequence of AMV isolated from alfalfa as natural host.
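
    Identity percentages such as those above are computed from pairwise alignments. The sketch below shows one common way to do it, assuming two already-aligned sequences of equal length with "-" as the gap character; it counts a gap opposite a residue as a mismatch and skips columns where both sequences have a gap. Tools differ on these conventions, so this is not necessarily how the values in the abstract were obtained, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns, ignoring columns that are gaps in both."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1  # a gap opposite a residue never matches
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy alignments, not real AMV data.
print(percent_identity("ATGCATGC", "ATGCATGA"))  # 87.5 (7 of 8 columns match)
print(percent_identity("ATG-C", "ATGAC"))        # 80.0 (the gap column counts as a mismatch)
```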

  9. Complete sequence analysis reveals two distinct poleroviruses infecting cucurbits in China.

    Science.gov (United States)

    Xiang, Hai-ying; Shang, Qiao-xia; Han, Cheng-gui; Li, Da-wei; Yu, Jia-lin

    2008-01-01

    The complete RNA genomes of a Chinese isolate of cucurbit aphid-borne yellows virus (CABYV-CHN) and a new polerovirus tentatively referred to as melon aphid-borne yellows virus (MABYV) were determined. The entire genome of CABYV-CHN shared 89.0% nucleotide sequence identity with the French CABYV isolate. In contrast, nucleotide sequence identities between MABYV and CABYV and other poleroviruses were in the range of 50.7-74.2%, with amino acid sequence identities ranging from 24.8 to 82.9% for individual gene products. We propose that CABYV-CHN is a strain of CABYV and that MABYV is a member of a tentative distinct species within the genus Polerovirus.

  10. Concurrent speciation in the eastern woodland salamanders (Genus Plethodon): DNA sequences of the complete albumin nuclear and partial mitochondrial 12s genes

    Science.gov (United States)

    Highton, Richard; Hastings, Amy Picard; Palmer, Catherine; Watts, Richard; Hass, Carla A.; Culver, Melanie; Arnold, Stevan

    2012-01-01

    Salamanders of the North American plethodontid genus Plethodon are important model organisms in a variety of studies that depend on a phylogenetic framework (e.g., chemical communication, ecological competition, life histories, hybridization, and speciation), and consequently their systematics has been intensively investigated over several decades. Nevertheless, we lack a synthesis of relationships among the species. In the analyses reported here we use new DNA sequence data from the complete nuclear albumin gene (1818 bp) and the 12s mitochondrial gene (355 bp), as well as published data for four other genes (Wiens et al., 2006), up to a total of 6989 bp, to infer relationships. We relate these results to past systematic work based on morphology, allozymes, and DNA sequences. Although basal relationships show a strong consensus across studies, many terminal relationships remain in flux despite substantial sequencing and other molecular and morphological studies. This systematic instability appears to be a consequence of contemporaneous bursts of speciation in the late Miocene and Pliocene, yielding many closely related extant species in each of the four eastern species groups. Therefore we conclude that many relationships are likely to remain poorly resolved in the face of additional sequencing efforts. On the other hand, the current classification of the 45 eastern species into four species groups is supported. The Plethodon cinereus group (10 species) is the sister group to the clade comprising the other three groups, but these latter groups (Plethodon glutinosus [28 species], Plethodon welleri [5 species], and Plethodon wehrlei [2 species]) probably diverged from each other at approximately the same time.

  11. Complete genome sequence of Isosphaera pallida type strain (IS1BT)

    Energy Technology Data Exchange (ETDEWEB)

    Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Cleland, David M [ORNL]; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL)]; Lapidus, Alla L. [Joint Genome Institute, Walnut Creek, California]; Nolan, Matt [Joint Genome Institute, Walnut Creek, California]; Lucas, Susan [Joint Genome Institute, Walnut Creek, California]; Hammon, Nancy [Joint Genome Institute, Walnut Creek, California]; Deshpande, Shweta [Joint Genome Institute, Walnut Creek, California]; Cheng, Jan-Fang [Joint Genome Institute, Walnut Creek, California]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [Joint Genome Institute, Walnut Creek, California]; Liolios, Konstantinos [Joint Genome Institute, Walnut Creek, California]; Pagani, Ioanna [Joint Genome Institute, Walnut Creek, California]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [Joint Genome Institute, Walnut Creek, California]; Palaniappan, Krishna [Joint Genome Institute, Walnut Creek, California]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Detter, J. Chris [Joint Genome Institute, Walnut Creek, California]; Beck, Brian [ATCC - American Type Culture Collection]; Woyke, Tanja [Joint Genome Institute, Walnut Creek, California]; Bristow, James [Joint Genome Institute, Walnut Creek, California]; Eisen, Jonathan [Joint Genome Institute, Walnut Creek, California]; Markowitz, Victor [Joint Genome Institute, Walnut Creek, California]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [Joint Genome Institute, Walnut Creek, California]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]

    2011-01-01

    Isosphaera pallida (ex Woronichin 1927) Giovannoni et al. 1995 is the type species of the genus Isosphaera. The species is of interest because it was the first heterotrophic bacterium known to be phototactic, and it occupies an isolated phylogenetic position within the Planctomycetaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of a member of the genus Isosphaera and the third of a member of the family Planctomycetaceae. The 5,472,964 bp long chromosome and the 56,340 bp long plasmid with a total of 3,763 protein-coding and 60 RNA genes are part of the Genomic Encyclopedia of Bacteria and Archaea project.

  12. Complete genome sequence of Marivirga tractuosa type strain (H-43T)

    Science.gov (United States)

    Pagani, Ioanna; Chertkov, Olga; Lapidus, Alla; Lucas, Susan; Del Rio, Tijana Glavina; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Nolan, Matt; Saunders, Elizabeth; Pitluck, Sam; Held, Brittany; Goodwin, Lynne; Liolios, Konstantinos; Ovchinikova, Galina; Ivanova, Natalia; Mavromatis, Konstantinos; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Jeffries, Cynthia D.; Detter, John C.; Han, Cliff; Tapia, Roxanne; Ngatchou-Djao, Olivier D.; Rohde, Manfred; Göker, Markus; Spring, Stefan; Sikorski, Johannes; Woyke, Tanja; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C.

    2011-01-01

    Marivirga tractuosa (Lewin 1969) Nedashkovskaya et al. 2010 is the type species of the genus Marivirga, which belongs to the family Flammeovirgaceae. Members of this genus are of interest because of their gliding motility. The species is of interest because representative strains show resistance to several antibiotics, including gentamicin, kanamycin, neomycin, polymyxin and streptomycin. This is the first complete genome sequence of a member of the family Flammeovirgaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 4,511,574 bp long chromosome and the 4,916 bp plasmid with their 3,808 protein-coding and 49 RNA genes are a part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21677852

  13. cis sequence effects on gene expression

    Directory of Open Access Journals (Sweden)

    Jacobs Kevin

    2007-08-01

    Background: Sequence and transcriptional variability within and between individuals are typically studied independently. The joint analysis of sequence and gene expression variation (genetical genomics) provides insight into the role of linked sequence variation in the regulation of gene expression. We investigated the role of sequence variation in cis on gene expression (cis sequence effects) in a group of genes commonly studied in cancer research in lymphoblastoid cell lines. We estimated the proportion of genes exhibiting cis sequence effects and the proportion of gene expression variation explained by cis sequence effects using three different analytical approaches, and compared our results to the literature. Results: We generated gene expression profiling data at N = 697 candidate genes from N = 30 lymphoblastoid cell lines for this study and used available candidate gene resequencing data at N = 552 candidate genes to identify N = 30 candidate genes with sufficient variance in both datasets for the investigation of cis sequence effects. We used two additive models and the haplotype phylogeny scanning approach of Templeton (Tree Scanning) to evaluate association between individual SNPs, all SNPs at a gene, and diplotypes, with log-transformed gene expression. SNPs and diplotypes at eight candidate genes exhibited statistically significant (p cis sequence effects in our study, respectively. Conclusion: Based on analysis of our results and the extant literature, one in four genes exhibits significant cis sequence effects, and for these genes, about 30% of gene expression variation is accounted for by cis sequence variation. Despite diverse experimental approaches, the presence or absence of significant cis sequence effects is largely supported by previously published studies.

  14. Genetic characterization of complete open reading frame of glycoprotein C gene of bovine herpesvirus 1

    Directory of Open Access Journals (Sweden)

    Saurabh Majumder

    2013-10-01

    Aim: To characterize one of the major glycoprotein genes, viz. glycoprotein C (gC; UL44, unique long region 44) of bovine herpesvirus 1 (BoHV1) of Indian origin at the genetic and phylogenetic level. Materials and Methods: A bovine herpesvirus 1 isolate, viz. (BoHV1/IBR 216 II/ 1976/ India), maintained at the Division of Virology, IVRI, Mukteswar, was used for the current study. The DNA was extracted using a commercial kit and the complete ORF of the gC gene was amplified, cloned, and sequenced by the conventional Sanger sequencing method. The sequence was genetically and phylogenetically analysed using various bioinformatic tools. The sequence was submitted to GenBank with accession number Kc756965. Results: The complete ORF of the gC gene was amplified and sequenced. It showed 100% sequence homology with the reference Cooper strain of BoHV1, and divergence varied from 0% to 2.7% with other isolates of BoHV1. The isolate under study had divergence of 9.2%, 13%, 26.6%, and 9.2% with BoHV5 (Bovine herpesvirus 5), CvHV1 (Cervid herpesvirus 1), CpHV1 (Caprine herpesvirus 1), and BuHV1 (Bubaline herpesvirus 1), respectively. Conclusion: This is the first genetic characterization of the complete open reading frame (ORF) of the glycoprotein C gene (UL44) of an Indian isolate of BoHV1. The gC gene of BoHV1 is highly conserved among all BoHV1 isolates and it can be used as a target for designing diagnostic primers for the specific detection of BoHV1.

  15. Genomic sequences of murine gamma B- and gamma C-crystallin-encoding genes: promoter analysis and complete evolutionary pattern of mouse, rat and human gamma-crystallins.

    Science.gov (United States)

    Graw, J; Liebstein, A; Pietrowski, D; Schmitt-John, T; Werner, T

    1993-12-22

    The murine genes, gamma B-cry and gamma C-cry, encoding the gamma B- and gamma C-crystallins, were isolated from a genomic DNA library. The complete nucleotide (nt) sequences of both genes were determined from 661 and 711 bp, respectively, upstream from the first exon to the corresponding polyadenylation sites, comprising more than 2650 and 2890 bp, respectively. The new sequences were compared to the partial cDNA sequences available for the murine gamma B-cry and gamma C-cry, as well as to the corresponding genomic sequences from rat and man, at both the nt and predicted amino acid (aa) sequence levels. In the gamma B-cry promoter region, a canonical CCAAT-box, a TATA-box, putative NF-I and C/EBP sites were detected. An R-repeat is inserted 366 bp upstream from the transcription start point. In contrast, the gamma C-cry promoter does not contain a CCAAT-box, but some other putative binding sites for transcription factors (AP-2, UBP-1, LBP-1) were located by computer analysis. The promoter regions of all six gamma-cry from mouse, rat and human, except human psi gamma F-cry, were analyzed for common sequence elements. A complex sequence element of about 70-80 bp was found in the proximal promoter, which contains a gamma-cry-specific and almost invariant sequence (crygpel) of 14 nt, and ends with the also invariant TATA-box. Within the complex sequence element, a minimum of three further features specific for the gamma A-, gamma B- and gamma D/E/F-cry genes can be defined, at least two of which were recently shown to be functional. In addition to these four sequence elements, a subtype-specific structure of inverted repeats with different-sized spacers can be deduced from the multiple sequence alignment. A phylogenetic analysis based on the promoter region, as well as the complete exon 3 of all gamma-cry from mouse, rat and man, suggests separation of only five gamma-cry subtypes (gamma A-, gamma B-, gamma C-, gamma D- and gamma E/F-cry) prior to species separation.

  16. Complete Genome Sequences of Mycobacteriophages Clautastrophe, Kingsolomon, Krypton555, and Nicholas

    Science.gov (United States)

    Chung, Hui-Min; D’Elia, Tom; Ross, Joseph F.; Alvarado, Samuel M.; Brantley, Molly-Catherine; Bricker, Lydia P.; Butler, Courtney R.; Crist, Carson; Dane, Julia M.; Farran, Brett W.; Hobbs, Sierra; Lapak, Michelle; Lovell, Conner; McMullen, Allison; Mirza, Sohail A.; Thrift, Noah; Vaughan, Donald P.; Worley, Grace; Ejikemeuwa, Amara; Zaw, May; Albritton, Claude F.; Bertrand, Sarah C.; Chaudhry, Shanzay S.; Cheema, Vzair A.; Do, Camilla; Do, Michael L.; Duong, Huyen M.; El-Desoky, Dalia H.; Green, Kelsey M.; Lee, Rhea N.; Thornton, Lauren A.; Vu, James M.; Zahra, Mah Noor; Stoner, Ty H.; Garlena, Rebecca A.; Jacobs-Sera, Deborah; Russell, Daniel A.

    2017-01-01

    ABSTRACT We report here the complete genome sequences of four subcluster L3 mycobacteriophages newly isolated from soil samples, using Mycobacterium smegmatis mc2155 as the host. Comparative genomic analyses with four previously described subcluster L3 phages reveal strong nucleotide similarity and gene conservation, with several large insertions/deletions near their right genome ends. PMID:29122864

  17. The complete genome sequence of the plant growth-promoting bacterium Pseudomonas sp. UW4.

    Directory of Open Access Journals (Sweden)

    Jin Duan

    The plant growth-promoting bacterium (PGPB) Pseudomonas sp. UW4, previously isolated from the rhizosphere of common reeds growing on the campus of the University of Waterloo, promotes plant growth in the presence of different environmental stresses, such as flooding, high concentrations of salt, cold, heavy metals, drought and phytopathogens. In this work, the genome sequence of UW4 was obtained by pyrosequencing and the gaps between the contigs were closed by directed PCR. The P. sp. UW4 genome contains a single circular chromosome that is 6,183,388 bp with a 60.05% G+C content. The bacterial genome contains 5,423 predicted protein-coding sequences that occupy 87.2% of the genome. Nineteen genomic islands (GIs) were predicted and thirty one complete putative insertion sequences were identified. Genes potentially involved in plant growth promotion such as indole-3-acetic acid (IAA) biosynthesis, trehalose production, siderophore production, acetoin synthesis, and phosphate solubilization were determined. Moreover, genes that contribute to the environmental fitness of UW4 were also observed including genes responsible for heavy metal resistance such as nickel, copper, cadmium, zinc, molybdate, cobalt, arsenate, and chromate. Whole-genome comparison with other completely sequenced Pseudomonas strains and phylogeny of four concatenated "housekeeping" genes (16S rRNA, gyrB, rpoB and rpoD) of 128 Pseudomonas strains revealed that UW4 belongs to the fluorescens group, jessenii subgroup.

  18. The Complete Genome Sequence of the Plant Growth-Promoting Bacterium Pseudomonas sp. UW4

    Science.gov (United States)

    Duan, Jin; Jiang, Wei; Cheng, Zhenyu; Heikkila, John J.; Glick, Bernard R.

    2013-01-01

    The plant growth-promoting bacterium (PGPB) Pseudomonas sp. UW4, previously isolated from the rhizosphere of common reeds growing on the campus of the University of Waterloo, promotes plant growth in the presence of different environmental stresses, such as flooding, high concentrations of salt, cold, heavy metals, drought and phytopathogens. In this work, the genome sequence of UW4 was obtained by pyrosequencing and the gaps between the contigs were closed by directed PCR. The P. sp. UW4 genome contains a single circular chromosome that is 6,183,388 bp with a 60.05% G+C content. The bacterial genome contains 5,423 predicted protein-coding sequences that occupy 87.2% of the genome. Nineteen genomic islands (GIs) were predicted and thirty one complete putative insertion sequences were identified. Genes potentially involved in plant growth promotion such as indole-3-acetic acid (IAA) biosynthesis, trehalose production, siderophore production, acetoin synthesis, and phosphate solubilization were determined. Moreover, genes that contribute to the environmental fitness of UW4 were also observed including genes responsible for heavy metal resistance such as nickel, copper, cadmium, zinc, molybdate, cobalt, arsenate, and chromate. Whole-genome comparison with other completely sequenced Pseudomonas strains and phylogeny of four concatenated “housekeeping” genes (16S rRNA, gyrB, rpoB and rpoD) of 128 Pseudomonas strains revealed that UW4 belongs to the fluorescens group, jessenii subgroup. PMID:23516524

  19. Cloning and characterization of the major histone H2A genes completes the cloning and sequencing of known histone genes of Tetrahymena thermophila.

    Science.gov (United States)

    Liu, X; Gorovsky, M A

    1996-01-01

    A truncated cDNA clone encoding Tetrahymena thermophila histone H2A2 was isolated using synthetic degenerate oligonucleotide probes derived from H2A protein sequences of Tetrahymena pyriformis. The cDNA clone was used as a homologous probe to isolate a truncated genomic clone encoding H2A1. The remaining regions of the genes for H2A1 (HTA1) and H2A2 (HTA2) were then isolated using inverse PCR on circularized genomic DNA fragments. These partial clones were assembled into intact HTA1 and HTA2 clones. Nucleotide sequences of the two genes were highly homologous within the coding region but not in the noncoding regions. Comparison of the deduced amino acid sequences with protein sequences of T. pyriformis H2As showed only two and three differences, respectively, in a total of 137 amino acids for H2A1 and 132 amino acids for H2A2, indicating the two genes arose before the divergence of these two species. The HTA2 gene contains a TAA triplet within the coding region, encoding a glutamine residue. In contrast with the T. thermophila HHO and HTA3 genes, no introns were identified within the two genes. The 5'- and 3'-ends of the histone H2A mRNAs were determined by RNase protection and by PCR mapping using RACE and RLM-RACE methods. Both genes encode polyadenylated mRNAs and are highly expressed in vegetatively growing cells but only weakly expressed in starved cultures. With the inclusion of these two genes, T. thermophila is the first organism whose entire complement of known core and linker histones, including replication-dependent and basal variants, has been cloned and sequenced. PMID:8760889

  20. Complete sequences of the mitochondrial DNA of the wild Gracilariopsis lemaneiformis and two mutagenic cultivated breeds (Gracilariaceae, Rhodophyta).

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    The complete mitochondrial DNA (mtDNA) of Gracilariopsis lemaneiformis was sequenced (25883 bp) and mapped to a circular model. The A+T composition was 72.5%. Forty six genes and two potentially functional open reading frames were identified. They include 24 protein-coding genes, 2 rRNA genes, 20 tRNA genes and 2 ORFs (orf60, orf142). There is considerable sequence synteny across the five red algal mtDNAs falling into Florideophyceae including Gr. lemaneiformis in this study and previously sequenced species. A long stem-loop and a hairpin structure were identified in intergenic regions of mt genome of Gr. lemaneiformis, which are believed to be involved with transcription and replication. In addition, the mtDNAs of two mutagenic cultivated breeds ("981" and "07-2") were also sequenced. Compared with the mtDNA of wild Gr. lemaneiformis, the genome size and gene length and order of three strains were completely identical except nine base mutations including eight in the protein-coding genes and one in the tRNA gene. None of the base mutations caused frameshift or a premature stop codon in the mtDNA genes. Phylogenetic analyses based on mitochondrial protein-coding genes and rRNA genes demonstrated Gracilariopsis andersonii had closer phylogenetic relationship with its parasite Gracilariophila oryzoides than Gracilariopsis lemaneiformis which was from the same genus of Gracilariopsis.

  1. Complete sequences of the mitochondrial DNA of the wild Gracilariopsis lemaneiformis and two mutagenic cultivated breeds (Gracilariaceae, Rhodophyta).

    Science.gov (United States)

    Zhang, Lei; Wang, Xumin; Qian, Hao; Chi, Shan; Liu, Cui; Liu, Tao

    2012-01-01

    The complete mitochondrial DNA (mtDNA) of Gracilariopsis lemaneiformis was sequenced (25883 bp) and mapped to a circular model. The A+T composition was 72.5%. Forty six genes and two potentially functional open reading frames were identified. They include 24 protein-coding genes, 2 rRNA genes, 20 tRNA genes and 2 ORFs (orf60, orf142). There is considerable sequence synteny across the five red algal mtDNAs falling into Florideophyceae including Gr. lemaneiformis in this study and previously sequenced species. A long stem-loop and a hairpin structure were identified in intergenic regions of mt genome of Gr. lemaneiformis, which are believed to be involved with transcription and replication. In addition, the mtDNAs of two mutagenic cultivated breeds ("981" and "07-2") were also sequenced. Compared with the mtDNA of wild Gr. lemaneiformis, the genome size and gene length and order of three strains were completely identical except nine base mutations including eight in the protein-coding genes and one in the tRNA gene. None of the base mutations caused frameshift or a premature stop codon in the mtDNA genes. Phylogenetic analyses based on mitochondrial protein-coding genes and rRNA genes demonstrated Gracilariopsis andersonii had closer phylogenetic relationship with its parasite Gracilariophila oryzoides than Gracilariopsis lemaneiformis which was from the same genus of Gracilariopsis.

  2. Complete genome sequence of Hydrogenobacter thermophilus type strain (TK-6T)

    Energy Technology Data Exchange (ETDEWEB)

    Zeytun, Ahmet [Los Alamos National Laboratory (LANL)]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Nolan, Matt [Joint Genome Institute, Walnut Creek, California]; Lapidus, Alla L. [Joint Genome Institute, Walnut Creek, California]; Lucas, Susan [Joint Genome Institute, Walnut Creek, California]; Han, James [Joint Genome Institute]; Tice, Hope [Joint Genome Institute, Walnut Creek, California]; Cheng, Jan-Fang [Joint Genome Institute, Walnut Creek, California]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [Joint Genome Institute, Walnut Creek, California]; Liolios, Konstantinos [Joint Genome Institute, Walnut Creek, California]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [Joint Genome Institute, Walnut Creek, California]; Palaniappan, Krishna [Joint Genome Institute, Walnut Creek, California]; Ngatchou, Olivier Duplex [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Detter, J. Chris [Joint Genome Institute, Walnut Creek, California]; Ubler, Susanne [Universitat Regensburg, Regensburg, Germany]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Tindall, Brian [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Wirth, Reinhard [Universitat Regensburg, Regensburg, Germany]; Woyke, Tanja [Joint Genome Institute, Walnut Creek, California]; Bristow, James [Joint Genome Institute, Walnut Creek, California]; Eisen, Jonathan [Joint Genome Institute, Walnut Creek, California]; Markowitz, Victor [Joint Genome Institute, Walnut Creek, California]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Kyrpides, Nikos C [Joint Genome Institute, Walnut Creek, California]

    2011-01-01

    Hydrogenobacter thermophilus Kawasumi et al. 1984 is the type species of the genus Hydrogenobacter. H. thermophilus was the first obligate autotrophic organism reported among aerobic hydrogen-oxidizing bacteria. Strain TK-6T is of interest because of the unusually efficient hydrogen-oxidizing ability of this strain, which results in a faster generation time compared to other autotrophs. It is also able to grow anaerobically using nitrate as an electron acceptor when molecular hydrogen is used as the energy source, and able to aerobically fix CO2 via the reductive tricarboxylic acid cycle. This is the fifth completed genome sequence in the family Aquificaceae, and the second genome sequence determined from a strain derived from the original isolate. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 1,742,932 bp long genome with its 1,899 protein-coding and 49 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  3. Complete genome sequence of Leptotrichia buccalis type strain (C-1013-bT)

    Energy Technology Data Exchange (ETDEWEB)

    Ivanova, Natalia; Gronow, Sabine; Lapidus, Alla; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Lucas, Susan; Chen, Feng; Tice, Hope; Cheng, Jan-Fang; Saunders, Liz; Bruce, David; Goodwin, Lynne; Brettin, Thomas; Detter, John C.; Han, Cliff; Pitluck, Sam; Mikhailova, Natalia; Pati, Amrita; Mavromatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chain, Patrick; Rohde, Christine; Goker, Markus; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2009-05-20

    Leptotrichia buccalis (Robin 1853) Trevisan 1879 is the type species of the genus, and is of phylogenetic interest because of its isolated location in the sparsely populated and neither taxonomically nor genomically adequately accessed family 'Leptotrichiaceae' within the phylum 'Fusobacteria'. Species of Leptotrichia are large fusiform non-motile, non-sporulating rods, which often populate the human oral flora. L. buccalis is anaerobic to aerotolerant, and saccharolytic. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of the order 'Fusobacteriales' and no more than the second sequence from the phylum 'Fusobacteria'. The 2,465,610 bp long single replicon genome with its 2306 protein-coding and 61 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  4. Completed sequence and corrected annotation of the genome of maize Iranian mosaic virus.

    Science.gov (United States)

    Ghorbani, Abozar; Izadpanah, Keramatollah; Dietzgen, Ralf G

    2018-03-01

    Maize Iranian mosaic virus (MIMV) is a negative-sense single-stranded RNA virus that is classified in the genus Nucleorhabdovirus, family Rhabdoviridae. The MIMV genome contains six open reading frames (ORFs) that encode in 3' to 5' order the nucleocapsid protein (N), phosphoprotein (P), putative movement protein (P3), matrix protein (M), glycoprotein (G) and RNA-dependent RNA polymerase (L). In this study, we determined the first complete genome sequence of MIMV using Illumina RNA-Seq and 3'/5' RACE. MIMV genome ('Fars' isolate) is 12,426 nucleotides in length. Unexpectedly, the predicted N gene ORF of this isolate and of four other Iranian isolates is 143 nucleotides shorter than that of the MIMV coding-complete reference isolate 'Shiraz 1' (GenBank NC_011542), possibly due to a minor error in the previous sequence. Genetic variability among the N, P, P3 and G ORFs of Iranian MIMV isolates was limited, but highest in the G gene ORF. Phylogenetic analysis of complete nucleorhabdovirus genomes demonstrated a close evolutionary relationship between MIMV, maize mosaic virus and taro vein chlorosis virus.

  5. Complete plastid genome sequence of Primula sinensis (Primulaceae): structure comparison, sequence variation and evidence for accD transfer to nucleus

    Directory of Open Access Journals (Sweden)

    Tong-Jian Liu

    2016-06-01

    The species-rich genus Primula L. is a typical plant group with which to understand genetic variance between species at different levels of relationship. Chloroplast genome sequences are used as an information resource for quantifying this difference and reconstructing evolutionary history. In this study, we reported the complete chloroplast genome sequence of Primula sinensis and compared it with other related species. The chloroplast genome showed a typical circular quadripartite structure, 150,859 bp in length with a GC content of 37.2%. Two inverted repeat regions (25,535 bp) were separated by a large single-copy region (82,064 bp) and a small single-copy region (17,725 bp). The genome consists of 112 genes, including 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Among them, seven coding genes, seven tRNA genes and four rRNA genes have two copies due to their locations in the IR regions. The accD and infA genes, lacking intact open reading frames (ORF), were identified as pseudogenes. SSR and sequence variation analyses were also performed on the plastome of Primula sinensis, comparing it with another available plastome, that of P. poissonii. The four most variable regions, rpl36–rps8, rps16–trnQ, trnH–psbA and ndhC–trnV, were identified. Phylogenetic relationship estimates using three sub-datasets extracted from a matrix of 57 protein-coding gene sequences showed the identical result that was consistent with previous studies. A transcript found from P. sinensis transcriptome showed a high similarity to plastid accD functional region and was identified as a putative plastid transit peptide at the N-terminal region. The result strongly suggested that plastid accD has been functionally transferred to the nucleus in P. sinensis.

  6. Complete genome sequence of Mahella australiensis type strain (50-1 BONT)

    Energy Technology Data Exchange (ETDEWEB)

    Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Teshima, Hazuki [Los Alamos National Laboratory (LANL)]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Hammon, Nancy [U.S. Department of Energy, Joint Genome Institute]; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute]; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Huntemann, Marcel [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Ngatchou, Olivier Duplex [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Spring, Stefan [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Abt, Birte [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]

    2011-01-01

    Mahella australiensis Bonilla Salinas et al. 2004 is the type species of the genus Mahella, which belongs to the family Thermoanaerobacteraceae. The species is of interest because it differs from other known anaerobic spore-forming bacteria in its G+C content, and in certain phenotypic traits, such as carbon source utilization and relationship to temperature. Moreover, it has been discussed that this species might be an indigenous member of petroleum and oil reservoirs. This is the first completed genome sequence of a member of the genus Mahella and the ninth completed type strain genome sequence from the family Thermoanaerobacteraceae. The 3,135,972 bp long genome with its 2,974 protein-coding and 59 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  7. Complete genome sequence of Bifidobacterium breve CECT 7263, a strain isolated from human milk.

    Science.gov (United States)

    Jiménez, Esther; Villar-Tajadura, M Antonia; Marín, María; Fontecha, Javier; Requena, Teresa; Arroyo, Rebeca; Fernández, Leónides; Rodríguez, Juan M

    2012-07-01

    Bifidobacterium breve is an actinobacterium frequently isolated from colonic microbiota of breastfeeding babies. Here, we report the complete and annotated genome sequence of a B. breve strain isolated from human milk, B. breve CECT 7263. The genome sequence will provide new insights into the biology of this potential probiotic organism and will allow the characterization of genes related to beneficial properties.

  8. [Complete genome sequencing and sequence analysis of BCG Tice].

    Science.gov (United States)

    Wang, Zhiming; Pan, Yuanlong; Wu, Jun; Zhu, Baoli

    2012-10-04

    The objective of this study was to obtain the complete genome sequence of Bacillus Calmette-Guerin Tice (BCG Tice), in order to provide more information about the molecular biology of BCG Tice and to design more rational vaccines to prevent tuberculosis. We assembled the high-throughput sequencing data with SOAPdenovo software, obtaining many contigs and scaffolds. Many sequence gaps and physical gaps remained as a result of regional low coverage and low quality. We designed primers at the ends of contigs and performed PCR amplification to link these contigs and scaffolds. By using various enzymes for PCR amplification, adjusting the PCR reaction conditions, and constructing clones for sequencing, all the gaps were closed. We obtained the complete genome sequence of BCG Tice and submitted it to GenBank at the National Center for Biotechnology Information (NCBI). The genome of BCG Tice is 4334064 base pairs in length, with a GC content of 65.65%. The problems encountered and strategies used during the finishing step of BCG Tice sequencing are described here, in the hope of offering some experience to those involved in the finishing step of genome sequencing. The microarray data were verified by our results.

  9. [Sequencing and analysis of the complete mitochondrial genome of the King Cobra, Ophiophagus hannah (Serpents: Elapidae)].

    Science.gov (United States)

    Chen, Nian; Lai, Xiao-Ping

    2010-07-01

    We obtained the complete mitochondrial genome of the King Cobra (GenBank accession number: EU_921899) by Ex Taq-PCR, TA-cloning and primer-walking methods. The genome is very similar to those of other vertebrates: it is 17 267 bp in length and encodes 38 genes (13 protein-coding, 2 ribosomal RNA and 23 transfer RNA genes) along with two long non-coding regions. The duplication of the tRNA-Ile gene forms a new mitochondrial gene rearrangement model. Eight tRNA genes and one protein-coding gene are transcribed from the L strand, and the other genes are transcribed from the H strand. Genes on the H strand show fairly similar adenine and thymine content, whereas those on the L strand have a higher proportion of A than T. Combined rDNA sequence data (12S+16S rRNA) were used to reconstruct the phylogeny of 21 snake species for which complete mitochondrial genome sequences were available in public databases. This large data set and an appropriate range of outgroup taxa demonstrated that Elapidae is more closely related to Colubridae than to Viperidae, which supports the traditional view.
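
    The strand comparison in this abstract (similar A and T content on the H strand, more A than T on the L strand) is commonly summarized as AT skew, (A − T)/(A + T). The following is a minimal sketch assuming that standard definition; the abstract does not say which measure the authors used, and the input strings are made-up toy sequences, not King Cobra data:

```python
def at_skew(seq: str) -> float:
    """AT skew = (A - T) / (A + T); a positive value means more A than T."""
    seq = seq.upper()
    a, t = seq.count("A"), seq.count("T")
    if a + t == 0:
        raise ValueError("sequence contains no A or T")
    return (a - t) / (a + t)


balanced = at_skew("ATGCATGC")  # equal A and T counts, so the skew is zero
a_rich = at_skew("AAATGCAA")    # five A and one T, so the skew is positive
print(balanced, a_rich)
```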

  10. The complete chloroplast genome sequence of Helwingia himalaica (Helwingiaceae, Aquifoliales) and a chloroplast phylogenomic analysis of the Campanulidae

    Directory of Open Access Journals (Sweden)

    Xin Yao

    2016-11-01

    Complete chloroplast genome sequences have been very useful for understanding phylogenetic relationships in angiosperms at the family level and above, but there are currently large gaps in coverage. We report the chloroplast genome for Helwingia himalaica, the first in the distinctive family Helwingiaceae and only the second genus to be sequenced in the order Aquifoliales. We then combine this with 36 published sequences in the large (c. 35,000 species) subclass Campanulidae in order to investigate relationships at the order and family levels. The Helwingia genome consists of 158,362 bp containing a pair of inverted repeat (IR) regions of 25,996 bp separated by a large single-copy (LSC) region and a small single-copy (SSC) region which are 87,810 and 18,560 bp, respectively. There are 142 known genes, including 94 protein-coding genes, eight ribosomal RNA genes, and 40 tRNA genes. The topology of the phylogenetic relationships between Apiales, Asterales, and Dipsacales differed between analyses based on complete genome sequences and on 36 shared protein-coding genes, showing that further studies of campanulid phylogeny are needed.

  11. Complete genome sequence of thermophilic Bacillus smithii type strain DSM 4216T

    DEFF Research Database (Denmark)

    Bosma, Elleke Fenna; Koehorst, Jasper J.; van Hijum, Sacha A. F. T.

    2016-01-01

    determined the complete genomic sequence of the B. smithii type strain DSM 4216T, which consists of a 3,368,778 bp chromosome (GenBank accession number CP012024.1) and a 12,514 bp plasmid (GenBank accession number CP012025.1), together encoding 3880 genes. Genome annotation via RAST was complemented...

  12. The complete chloroplast genome sequence of the chlorophycean green alga Scenedesmus obliquus reveals a compact gene organization and a biased distribution of genes on the two DNA strands

    Science.gov (United States)

    de Cambiaire, Jean-Charles; Otis, Christian; Lemieux, Claude; Turmel, Monique

    2006-01-01

    Background: The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. While the basal position of the Prasinophyceae is well established, the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae (UTC) remains uncertain. The five complete chloroplast DNA (cpDNA) sequences currently available for representatives of these classes display considerable variability in overall structure, gene content, gene density, intron content and gene order. Among these genomes, that of the chlorophycean green alga Chlamydomonas reinhardtii has retained the least ancestral features. The two single-copy regions, which are separated from one another by the large inverted repeat (IR), have similar sizes, rather than unequal sizes, and differ radically in both gene contents and gene organizations relative to the single-copy regions of prasinophyte and ulvophyte cpDNAs. To gain insights into the various changes that the chloroplast genome underwent during the evolution of chlorophycean green algae, we have sequenced the cpDNA of Scenedesmus obliquus, a member of a distinct chlorophycean lineage. Results: The 161,452 bp IR-containing genome of Scenedesmus features single-copy regions of similar sizes, encodes 96 genes, i.e. only two additional genes (infA and rpl12) relative to its Chlamydomonas homologue and contains seven group I and two group II introns. It is clearly more compact than the four UTC algal cpDNAs that have been examined so far, displays the lowest proportion of short repeats among these algae and shows a stronger bias in clustering of genes on the same DNA strand compared to Chlamydomonas cpDNA. Like the latter genome, Scenedesmus cpDNA displays only a few ancestral gene clusters. The two chlorophycean genomes share 11 gene clusters that are not found in previously sequenced trebouxiophyte and ulvophyte cpDNAs as well as a few genes that have an unusual structure; however, their single-copy regions differ

  13. Analysis and comparison of fragrant gene sequence in some rice cultivars

    Directory of Open Access Journals (Sweden)

    Karami Noushafarin

    2016-01-01

    It is known that the fragrant trait in rice (Oryza sativa L.) is largely controlled by the fgr gene on chromosome 8, and that an 8 bp deletion and three single nucleotide polymorphisms (SNPs) in exon 7 affect this trait. In this study, sequence alignment analysis of fgr exon 7 for 11 fragrant and non-fragrant cultivars revealed that 5 aromatic rice cultivars carried the 3 SNPs and the 8 bp deletion in exon 7, which terminates prematurely at a TAA stop codon. In contrast, 5 of the non-aromatic cultivars showed a sequence identical to the published sequence of Nipponbare, a non-fragrant Japonica variety. An exception was Bejar, which had the 8 bp deletion and 3 SNPs but was non-aromatic. Sequencing can determine the nucleotide sequence of a gene and give useful information about gene function. In silico prediction showed that the aligned protein sequences of the fgr gene differed between the Khazar and Domsiah genotypes. The non-fragrant genotype Khazar has the complete betaine aldehyde dehydrogenase enzyme, with its full length of 503 amino acids, while the fragrant genotype Domsiah has a non-functional BADH2 enzyme of 251 amino acids, which results in accumulation of 2-acetyl-1-pyrroline (2AP) and produces aroma in fragrant genotypes.
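
    The reasoning behind the premature stop can be illustrated in code: a deletion whose length is not a multiple of three shifts the reading frame, so the downstream bases are read as different codons and a stop codon can appear early. The sketch below uses a made-up 27-nt toy sequence, not the real fgr/BADH2 exon 7, and the helper name `first_stop_codon` is hypothetical:

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(seq: str):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Toy open reading frame (hypothetical): 8 sense codons followed by TGA.
reference = "ATGGCACCTTATAAGGTCCAGGAATGA"
# Delete 8 bp (0-based positions 3..10), which shifts the reading frame.
mutant = reference[:3] + reference[11:]

print(first_stop_codon(reference))  # 8
print(first_stop_codon(mutant))     # 1, because the frameshift exposes an early TAA
```

    In the toy mutant the protein would be cut from 8 residues to 1; the abstract reports the analogous effect at full scale, 503 versus 251 amino acids.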

  14. Complete plastid genome sequences suggest strong selection for retention of photosynthetic genes in the parasitic plant genus Cuscuta.

    Science.gov (United States)

    McNeal, Joel R; Kuehl, Jennifer V; Boore, Jeffrey L; de Pamphilis, Claude W

    2007-10-24

    Plastid genome content and protein sequence are highly conserved across land plants and their closest algal relatives. Parasitic plants, which obtain some or all of their nutrition through an attachment to a host plant, are often a striking exception. Heterotrophy can lead to relaxed constraint on some plastid genes or even total gene loss. We sequenced plastid genomes of two species in the parasitic genus Cuscuta along with a non-parasitic relative, Ipomoea purpurea, to investigate changes in the plastid genome that may result from transition to the parasitic lifestyle. Aside from loss of all ndh genes, Cuscuta exaltata retains photosynthetic and photorespiratory genes that evolve under strong selective constraint. Cuscuta obtusiflora has incurred substantially more change to its plastid genome, including loss of all genes for the plastid-encoded RNA polymerase. Despite extensive change in gene content and greatly increased rate of overall nucleotide substitution, C. obtusiflora also retains all photosynthetic and photorespiratory genes with only one minor exception. Although Epifagus virginiana, the only other parasitic plant with its plastid genome sequenced to date, has lost a largely overlapping set of transfer-RNA and ribosomal genes as Cuscuta, it has lost all genes related to photosynthesis and maintains a set of genes which are among the most divergent in Cuscuta. Analyses demonstrate photosynthetic genes are under the highest constraint of any genes within the plastid genomes of Cuscuta, indicating a function involving RuBisCo and electron transport through photosystems is still the primary reason for retention of the plastid genome in these species.

  15. Complete plastid genome sequences suggest strong selection for retention of photosynthetic genes in the parasitic plant genus Cuscuta

    Directory of Open Access Journals (Sweden)

    Kuehl Jennifer V

    2007-10-01

    Background: Plastid genome content and protein sequence are highly conserved across land plants and their closest algal relatives. Parasitic plants, which obtain some or all of their nutrition through an attachment to a host plant, are often a striking exception. Heterotrophy can lead to relaxed constraint on some plastid genes or even total gene loss. We sequenced plastid genomes of two species in the parasitic genus Cuscuta along with a non-parasitic relative, Ipomoea purpurea, to investigate changes in the plastid genome that may result from transition to the parasitic lifestyle. Results: Aside from loss of all ndh genes, Cuscuta exaltata retains photosynthetic and photorespiratory genes that evolve under strong selective constraint. Cuscuta obtusiflora has incurred substantially more change to its plastid genome, including loss of all genes for the plastid-encoded RNA polymerase. Despite extensive change in gene content and greatly increased rate of overall nucleotide substitution, C. obtusiflora also retains all photosynthetic and photorespiratory genes with only one minor exception. Conclusion: Although Epifagus virginiana, the only other parasitic plant with its plastid genome sequenced to date, has lost a largely overlapping set of transfer-RNA and ribosomal genes as Cuscuta, it has lost all genes related to photosynthesis and maintains a set of genes which are among the most divergent in Cuscuta. Analyses demonstrate photosynthetic genes are under the highest constraint of any genes within the plastid genomes of Cuscuta, indicating a function involving RuBisCo and electron transport through photosystems is still the primary reason for retention of the plastid genome in these species.

  16. Sequencing and characterization of the complete mitochondrial genome of Japanese Swellshark (Cephalloscyllium umbratile)

    OpenAIRE

    Zhu, Ke-Cheng; Liang, Yin-Yin; Wu, Na; Guo, Hua-Yang; Zhang, Nan; Jiang, Shi-Gui; Zhang, Dian-Chang

    2017-01-01

    To further understand the genome features of Cephalloscyllium umbratile (Carcharhiniformes), an endangered species, its complete mitochondrial DNA (mtDNA) was sequenced and annotated for the first time. The full-length mtDNA of C. umbratile was 16,697 bp and contained ribosomal RNA (rRNA) genes, 13 protein-coding genes (PCGs), 23 transfer RNA (tRNA) genes, and a major non-coding control region. Each PCG was initiated by a typical ATN codon, except for COX1, which was initiated by a GTG codon. Seven of 13 PC...

  17. Intermittency as a universal characteristic of the complete chromosome DNA sequences of eukaryotes: From protozoa to human genomes

    Science.gov (United States)

    Rybalko, S.; Larionov, S.; Poptsova, M.; Loskutov, A.

    2011-10-01

    Large-scale dynamical properties of complete chromosome DNA sequences of eukaryotes are considered. Using the proposed deterministic models with intermittency and symbolic dynamics we describe a wide spectrum of large-scale patterns inherent in these sequences, such as segmental duplications, tandem repeats, and other complex sequence structures. It is shown that the recently discovered gene number balance on the strands is not of a random nature, and certain subsystems of a complete chromosome DNA sequence exhibit the properties of deterministic chaos.

  18. Molecular cloning and complete nucleotide sequence of a human ventricular myosin light chain 1

    Energy Technology Data Exchange (ETDEWEB)

    Hoffmann, E; Shi, Q W; Floroff, M; Mickle, D A.G.; Wu, T W; Olley, P M; Jackowski, G

    1988-03-25

    A human ventricular plasmid library was constructed. The library was screened with an oligonucleotide probe (17-mer) corresponding to a conserved region of myosin light chain 1 near the carboxy terminal. A full-length cDNA recombinant plasmid containing an 1100 bp insert was isolated. RNA blot hybridization with this insert detected a message of approximately 1500 bp, corresponding to the size of VLC1 mRNA. The complete nucleotide sequence of the coding region was determined in M13 subclones using the dideoxy chain termination method. The isolation of this clone (pCD HLVC1), the publication of the complete nucleotide sequence of HVLC1 and the predicted secondary structure of this protein will aid in understanding the biochemistry of myosin and its function in contraction, the evolution of myosin light chain genes and the genetic, developmental and physiological regulation of myosin genes.

  19. Complete genome sequence of Bifidobacterium breve CECT 7263, a strain isolated from human milk

    OpenAIRE

    Jiménez, Esther; Villar-Tajadura, M. Antonia; Marín, María; Fontecha, F. Javier; Requena, Teresa; Arroyo, Rebeca; Fernández, Leónides; Rodríguez, Juan M.

    2012-01-01

    Bifidobacterium breve is an actinobacterium frequently isolated from colonic microbiota of breastfeeding babies. Here, we report the complete and annotated genome sequence of a B. breve strain isolated from human milk, B. breve CECT 7263. The genome sequence will provide new insights into the biology of this potential probiotic organism and will allow the characterization of genes related to beneficial properties. © 2012, American Society for Microbiology.

  20. Complete Chloroplast Genome Sequence of Aquilaria sinensis (Lour.) Gilg and Evolution Analysis within the Malvales Order.

    Science.gov (United States)

    Wang, Ying; Zhan, Di-Feng; Jia, Xian; Mei, Wen-Li; Dai, Hao-Fu; Chen, Xiong-Ting; Peng, Shi-Qing

    2016-01-01

    Aquilaria sinensis (Lour.) Gilg is an important medicinal woody plant producing agarwood, which is widely used in traditional Chinese medicine. High-throughput sequencing of chloroplast (cp) genomes has enhanced the understanding of evolutionary relationships within plant families. In this study, we determined the complete cp genome sequence of A. sinensis. The size of the A. sinensis cp genome was 159,565 bp. This genome included a large single-copy region of 87,482 bp, a small single-copy region of 19,857 bp, and a pair of inverted repeats (IRa and IRb) of 26,113 bp each. The GC content of the genome was 37.11%. The A. sinensis cp genome encoded 113 functional genes, including 82 protein-coding genes, 27 tRNA genes, and 4 rRNA genes. Seven of the protein-coding genes were duplicated, as were 11 of the RNA genes. A total of 45 polymorphic simple-sequence repeat loci and 60 pairs of large repeats were identified. Most simple-sequence repeats were located in the noncoding sections of the large single-copy/small single-copy region and exhibited high A/T content. Moreover, 33 pairs of large repeat sequences were located in the protein-coding genes, whereas 27 pairs were located in the intergenic regions. Codon usage in the A. sinensis cp genome was biased toward codons ending in A/T. The distribution of codon usage in the A. sinensis cp genome was most similar to that in the Gonystylus bancanus cp genome. Comparative results of 82 protein-coding genes from 29 species of cp genomes demonstrated that A. sinensis was a sister species to G. bancanus within the Malvales order. The A. sinensis cp genome showed its highest sequence similarity, >90%, with the G. bancanus cp genome when compared using the CGView Comparison Tool. This finding strongly supports the placement of A. sinensis as a sister to G. bancanus within the Malvales order. The complete A. sinensis cp genome information will be highly beneficial for further studies on this traditional medicinal

  1. Complete Genome Sequence of Genotype VI Newcastle Disease Viruses Isolated from Pigeons in Pakistan

    OpenAIRE

    Wajid, Abdul; Rehmani, Shafqat Fatima; Sharma, Poonam; Goraichuk, Iryna V.; Dimitrov, Kiril M.; Afonso, Claudio L.

    2016-01-01

    Two complete genome sequences of Newcastle disease virus (NDV) are described here. Virulent isolates pigeon/Pakistan/Lahore/21A/2015 and pigeon/Pakistan/Lahore/25A/2015 were obtained from racing pigeons sampled in the Pakistani province of Punjab during 2015. Phylogenetic analysis of the fusion protein genes and complete genomes classified the isolates as members of NDV class II, genotype VI.

  2. Complete genome sequence of Beutenbergia cavernae type strain (HKI 0122T)

    Energy Technology Data Exchange (ETDEWEB)

    Land, Miriam; Pukall, Rudiger; Abt, Birte; Goker, Markus; Rohde, Manfred; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Saunders, Elizabeth; Brettin, Thomas; Detter, John C.; Han, Cliff; Chain, Patrick; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Lapidus, Alla

    2009-05-20

    Beutenbergia cavernae (Groth et al. 1999) is the type species of the genus and is of phylogenetic interest because of its isolated location in the actinobacterial suborder Micrococcineae. B. cavernae HKI 0122T is a Gram-positive, non-motile, non-spore-forming bacterium isolated from a cave in Guangxi (China). B. cavernae grows best under aerobic conditions and shows a rod-coccus growth cycle. Its cell wall peptidoglycan contains the diagnostic L-lysine - L-glutamate interpeptide bridge. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first completed genome sequence from the poorly populated micrococcineal family Beutenbergiaceae, and this 4,669,183 bp long single replicon genome with its 4225 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  3. Complete genome sequence of Haliangium ochraceum type strain (SMP-2T)

    Energy Technology Data Exchange (ETDEWEB)

    Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Daum, Chris [U.S. Department of Energy, Joint Genome Institute]; Lang, Elke [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Abt, Birte [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Kopitz, Marcus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL)]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Copeland, A [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Chen, Feng [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Brettin, Thomas S [ORNL]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]

    2010-01-01

    Haliangium ochraceum Fudou et al. 2002 is the type species of the genus Haliangium in the myxococcal family Haliangiaceae. Members of the genus Haliangium are the first halophilic myxobacterial taxa described. The cells of the species follow a multicellular lifestyle in highly organized biofilms, called swarms, and decompose bacterial and yeast cells as most myxobacteria do. The fruiting bodies contain particularly small coccoid myxospores. H. ochraceum encodes the first actin homologue identified in a bacterial genome. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the myxococcal suborder Nannocystineae, and the 9,446,314 bp long single replicon genome with its 6,898 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  4. The Complete Mitochondrial Genome Sequence of Bactericera cockerelli and Comparison with Three Other Psylloidea Species.

    Directory of Open Access Journals (Sweden)

    Fengnian Wu

    Potato psyllid (Bactericera cockerelli) is an important pest of potato, tomato and pepper. Not only can a toxin secreted by nymphs result in serious phytotoxemia in some host plants, but over the past few years B. cockerelli has also been shown to transmit "Candidatus Liberibacter solanacearum", the putative bacterial pathogen of potato zebra chip (ZC) disease, to potato and tomato. ZC has caused devastating losses to potato production in the western U.S., Mexico, and elsewhere. New knowledge of the genetic diversity of B. cockerelli is needed to develop improved strategies to manage pest populations. Mitochondrial genome (mitogenome) sequencing provides important knowledge about insect evolution and diversity in and among populations. This report provides the first complete B. cockerelli mitogenome sequence, as determined by next-generation sequencing technology (Illumina MiSeq). The circular B. cockerelli mitogenome had a size of 15,220 bp with 13 protein-coding genes (PCGs), 2 ribosomal RNA genes (rRNAs), 22 transfer RNA genes (tRNAs), and a non-coding region of 975 bp. The overall gene order of the B. cockerelli mitogenome is identical to three other published Psylloidea mitogenomes: one species from the Triozidae, Paratrioza sinica; and two species from the Psyllidae, Cacopsylla coccinea and Pachypsylla venusta. This suggests all of these species share a common ancestral mitogenome. However, sequence analyses revealed differences between and among the insect families, in particular a unique region that can be folded into three stem-loop secondary structures present only within the B. cockerelli mitogenome. A phylogenetic tree based on the 13 PCGs matched an existing taxonomy scheme that was based on morphological characteristics. The available complete mitogenome sequence makes all genes accessible for future population diversity evaluation of B. cockerelli.

  5. The complete chloroplast genome sequence of Maddenia hypoleuca Koehne (Prunoideae, Rosaceae).

    Science.gov (United States)

    Chen, Tao; Zhang, Jing; Liu, Yin; Wang, Hao; Wang, Juan; Chen, Qing; Tang, Hao-Ru; Wang, Xiao-Rong

    2016-11-01

    Maddenia hypoleuca Koehne belonging to family Rosaceae is a native species in China. The complete chloroplast (cp) genome was generated by de novo assembly using low coverage whole genome sequencing data and manual correction. The cp genome was 158 084 bp in length, with GC content of 36.63%. It exhibited a typical quadripartite structure: a pair of large inverted repeat regions (IRs, 26 246 bp each), a large single-copy region (LSC, 86 713 bp), and a small single-copy region (SSC, 18 879 bp). A total of 114 genes were predicted, which included 80 protein-coding genes, 30 tRNA genes, and four rRNA genes. Phylogenetic analysis indicated that M. hypoleuca is most closely related to Prunus padus within the Prunoideae subfamily, which conforms to the traditional classification.

  6. The complete chloroplast genome sequence of Gentiana lawrencei var. farreri (Gentianaceae) and comparative analysis with its congeneric species.

    Science.gov (United States)

    Fu, Peng-Cheng; Zhang, Yan-Zhao; Geng, Hui-Min; Chen, Shi-Long

    2016-01-01

    The chloroplast (cp) genome is useful in plant systematics, genetic diversity analysis, molecular identification and divergence dating. The genus Gentiana contains 362 species, but there are only two valuable complete cp genomes. The purpose of this study is to report the characterization of complete cp genome of G. lawrencei var. farreri, which is endemic to the Qinghai-Tibetan Plateau (QTP). Using high throughput sequencing technology, we got the complete nucleotide sequence of the G. lawrencei var. farreri cp genome. The comparison analysis including genome difference and gene divergence was performed with its congeneric species G. straminea. The simple sequence repeats (SSRs) and phylogenetics were studied as well. The cp genome of G. lawrencei var. farreri is a circular molecule of 138,750 bp, containing a pair of 24,653 bp inverted repeats which are separated by small and large single-copy regions of 11,365 and 78,082 bp, respectively. The cp genome contains 130 known genes, including 85 protein coding genes (PCGs), eight ribosomal RNA genes and 37 tRNA genes. Comparative analyses indicated that G. lawrencei var. farreri is 10,241 bp shorter than its congeneric species G. straminea. Four large gaps were detected that are responsible for 85% of the total sequence loss. Further detailed analyses revealed that 10 PCGs were included in the four gaps that encode nine NADH dehydrogenase subunits. The cp gene content, order and orientation are similar to those of its congeneric species, but with some variation among the PCGs. Three genes, ndhB, ndhF and clpP, have high nonsynonymous to synonymous values. There are 34 SSRs in the G. lawrencei var. farreri cp genome, of which 25 are mononucleotide repeats: no dinucleotide repeats were detected. Comparison with the G. straminea cp genome indicated that five SSRs have length polymorphisms and 23 SSRs are species-specific. The phylogenetic analysis of 48 PCGs from 12 Gentianales taxa cp genomes clearly identified

  7. The complete genome sequence of Plodia interpunctella granulovirus: Discovery of an unusual inhibitor-of-apoptosis gene

    Science.gov (United States)

    The Indianmeal moth, Plodia interpunctella (Lepidoptera: Pyralidae), is a common pest of stored goods with a worldwide distribution. The complete genome sequence for a larval pathogen of this moth, the baculovirus Plodia interpunctella granulovirus (PiGV), was determined by next-generation sequenci...

  8. Complete chloroplast genome and 45S nrDNA sequences of the medicinal plant species Glycyrrhiza glabra and Glycyrrhiza uralensis.

    Science.gov (United States)

    Kang, Sang-Ho; Lee, Jeong-Hoon; Lee, Hyun Oh; Ahn, Byoung Ohg; Won, So Youn; Sohn, Seong-Han; Kim, Jung Sun

    2017-10-06

    Glycyrrhiza uralensis and G. glabra, members of the Fabaceae, are medicinally important species that are native to Asia and Europe. Extracts from these plants are widely used as natural sweeteners because of their much greater sweetness than sucrose. In this study, the three complete chloroplast genomes and five 45S nuclear ribosomal (nr)DNA sequences of these two licorice species and an interspecific hybrid are presented. The chloroplast genomes of G. glabra, G. uralensis and G. glabra × G. uralensis were 127,895 bp, 127,716 bp and 127,939 bp, respectively. The three chloroplast genomes harbored 110 annotated genes, including 76 protein-coding genes, 30 tRNA genes and 4 rRNA genes. The 45S nrDNA sequences were either 5,947 or 5,948 bp in length. Glycyrrhiza glabra and G. glabra × G. uralensis showed two types of nrDNA, while G. uralensis contained a single type. The complete 45S nrDNA sequence unit contains 18S rRNA, ITS1, 5.8S rRNA, ITS2 and 26S rRNA. We identified simple sequence repeat and tandem repeat sequences. We also developed four reliable markers for analysis of Glycyrrhiza diversity authentication.

  9. The complete sequence of the mitochondrial genome of the African Penguin (Spheniscus demersus).

    Science.gov (United States)

    Labuschagne, Christiaan; Kotzé, Antoinette; Grobler, J Paul; Dalton, Desiré L

    2014-01-15

    The complete mitochondrial genome of the African Penguin (Spheniscus demersus) was sequenced using next-generation sequencing and primer walking. The genome is 17,346 bp in length. It was compared with the mitochondrial DNA of the two other penguin species reported so far, namely the Little blue penguin (Eudyptula minor) and the Rockhopper penguin (Eudyptes chrysocome). This analysis made it possible to identify common penguin mitochondrial DNA characteristics. The S. demersus mtDNA genome is very similar in both composition and length to the E. chrysocome and E. minor genomes. The gene content of the African penguin mitochondrial genome is typical of vertebrates, and all three penguin species have the standard gene order originally identified in the chicken. The control region for S. demersus is located between tRNA-Glu and tRNA-Phe, and all three species of penguins contain two sets of similar repeats with varying copy numbers towards the 3' end of the control region, accounting for the size variance. This is the first report of the complete nucleotide sequence for the mitochondrial genome of the African penguin, S. demersus. These results can subsequently be used to provide information for penguin phylogenetic studies and insights into the evolution of genomes.

  10. Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.

    Science.gov (United States)

    Hiscock, D; Upton, C

    2000-05-01

    The Viral Genome DataBase (VGDB) contains detailed information on the genes and predicted protein sequences from 15 completely sequenced genomes of large (>100 kb) viruses (2847 genes). The stored data include DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a MySQL database with a user-friendly Java GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html .
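    As a rough illustration of the per-gene statistics this record lists (A + T%, nucleotide frequency, dinucleotide frequency and codon use), the sketch below computes them for a short DNA string. It is not VGDB code: the function name `sequence_stats` and the toy sequence are assumptions made for this example, and codons are counted in reading frame 0 only.

```python
from collections import Counter


def sequence_stats(dna: str) -> dict:
    """Compute simple composition statistics for a DNA string.

    Hypothetical helper for illustration; not part of VGDB.
    """
    seq = dna.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    # Single-nucleotide counts.
    nucleotides = Counter(seq)

    # Overlapping dinucleotides: positions 0-1, 1-2, 2-3, ...
    dinucleotides = Counter(seq[i:i + 2] for i in range(len(seq) - 1))

    # Non-overlapping codons in frame 0; a trailing partial codon is ignored.
    usable = len(seq) - len(seq) % 3
    codons = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    at_percent = 100.0 * (nucleotides["A"] + nucleotides["T"]) / len(seq)

    return {
        "nucleotides": nucleotides,
        "dinucleotides": dinucleotides,
        "codons": codons,
        "at_percent": at_percent,
    }


# Toy sequence (an assumption for the example, not real viral data).
stats = sequence_stats("ATGGCATTATAA")
print(stats["at_percent"])
print(stats["codons"].most_common())
```

    Counting dinucleotides with overlap and codons without overlap reflects the usual meaning of each statistic; a real tool would also need to handle ambiguous bases and the annotated reading frame of each gene.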

  11. Complete Genome Sequence of Genotype VI Newcastle Disease Viruses Isolated from Pigeons in Pakistan

    Science.gov (United States)

    Wajid, Abdul; Rehmani, Shafqat Fatima; Sharma, Poonam; Goraichuk, Iryna V.; Dimitrov, Kiril M.

    2016-01-01

    Two complete genome sequences of Newcastle disease virus (NDV) are described here. Virulent isolates pigeon/Pakistan/Lahore/21A/2015 and pigeon/Pakistan/Lahore/25A/2015 were obtained from racing pigeons sampled in the Pakistani province of Punjab during 2015. Phylogenetic analysis of the fusion protein genes and complete genomes classified the isolates as members of NDV class II, genotype VI. PMID:27540069

  12. Variations in CCL3L gene cluster sequence and non-specific gene copy numbers

    Directory of Open Access Journals (Sweden)

    Edberg Jeffrey C

    2010-03-01

    Background: Copy number variations (CNVs) of the gene CC chemokine ligand 3-like1 (CCL3L1) have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings: Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70% to 100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. Using MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR based CNV assays are not specific for only CCL3L1. The four available assays measured median copies of 2 and 3-4 in European and African American, respectively. The concordance between the assays ranged from 0.44 to 0.83, suggesting individual discordant calls and inconsistencies with the assays from the expected gene coverage from the known sequence. Conclusions: This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogeneous results. Sequence information to determine the CNV of the three genes separately would make it possible to test whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.
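    The specificity question raised in this record (does a primer match only one gene of a homologous cluster?) can be sketched with a simple exact-match check. This is a simplification: the study used alignments in MEGA 4 and BioEdit, whereas the code below only tests for exact occurrence on either strand. The gene names and sequences are toy placeholders invented for the example, not CCL3L sequences.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.translate(complement)[::-1]


def genes_matched(primer: str, genes: dict) -> list:
    """Names of genes containing the primer exactly, on either strand."""
    forward = primer.upper()
    reverse = reverse_complement(forward)
    return [
        name
        for name, seq in genes.items()
        if forward in seq.upper() or reverse in seq.upper()
    ]


# Hypothetical toy sequences: geneA and geneB differ only at the last base,
# mimicking two near-identical paralogues; geneC is unrelated.
toy_genes = {
    "geneA": "ATGCAGGTCTCCACTGCTGCC",
    "geneB": "ATGCAGGTCTCCACTGCTGCT",
    "geneC": "ATGAAGCTCTGCGTGACTGTC",
}

hits = genes_matched("GGTCTCCACTG", toy_genes)
is_specific = len(hits) == 1
print(hits, is_specific)
```

    A primer that lands in a region shared by paralogues matches more than one gene, so an assay built on it cannot attribute the measured copy number to a single gene.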

  13. Complete genome sequence of Desulfohalobium retbaense type strain (HR100T)

    Energy Technology Data Exchange (ETDEWEB)

    Spring, Stefan [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Copeland, A [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Chen, Feng [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Munk, Christine [U.S. Department of Energy, Joint Genome Institute]; Kiss, Hajnalka [Los Alamos National Laboratory (LANL)]; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Brettin, Thomas S [ORNL]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Schuler, Esther [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]

    2010-01-01

    Desulfohalobium retbaense (Ollivier et al. 1991) is the type species of the polyphyletic genus Desulfohalobium, which comprises, at the time of writing, two species and represents the family Desulfohalobiaceae within the Deltaproteobacteria. D. retbaense is a moderately halophilic sulfate-reducing bacterium, which can utilize H2 and a limited range of organic substrates, which are incompletely oxidized to acetate and CO2, for growth. The type strain HR100T was isolated from sediments of the hypersaline Retba Lake in Senegal. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the family Desulfohalobiaceae. The 2,909,567 bp genome (one chromosome and a 45,263 bp plasmid) with its 2,552 protein-coding and 57 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  14. The complete chloroplast genome sequence of strawberry (Fragaria  × ananassa Duch.) and comparison with related species of Rosaceae.

    Science.gov (United States)

    Cheng, Hui; Li, Jinfeng; Zhang, Hong; Cai, Binhua; Gao, Zhihong; Qiao, Yushan; Mi, Lin

    2017-01-01

    Compared with other members of the family Rosaceae, the chloroplast genomes of Fragaria species exhibit low variation, and this situation has limited phylogenetic analyses; thus, complete chloroplast genome sequencing of Fragaria species is needed. In this study, we sequenced the complete chloroplast genome of F. × ananassa 'Benihoppe' using the Illumina HiSeq 2500-PE150 platform and then performed a combination of de novo assembly and reference-guided mapping of contigs to generate complete chloroplast genome sequences. The chloroplast genome exhibits a typical quadripartite structure with a pair of inverted repeats (IRs, 25,936 bp) separated by large (LSC, 85,531 bp) and small (SSC, 18,146 bp) single-copy (SC) regions. The length of the F. × ananassa 'Benihoppe' chloroplast genome is 155,549 bp, representing the smallest Fragaria chloroplast genome observed to date. The genome encodes 112 unique genes, comprising 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Comparative analysis of the overall nucleotide sequence identity among ten complete chloroplast genomes confirmed that for both coding and non-coding regions in Rosaceae, SC regions exhibit higher sequence variation than IRs. The Ka/Ks ratio of most genes was less than 1, suggesting that most genes are under purifying selection. Moreover, the mVISTA results also showed a high degree of conservation in genome structure, gene order and gene content in Fragaria, particularly among three octoploid strawberries which were F. × ananassa 'Benihoppe', F. chiloensis (GP33) and F. virginiana (O477). However, when the sequences of the coding and non-coding regions of F. × ananassa 'Benihoppe' were compared in detail with those of F. chiloensis (GP33) and F. virginiana (O477), a number of SNPs and InDels were revealed by MEGA 7. Six non-coding regions (trnK-matK, trnS-trnG, atpF-atpH, trnC-petN, trnT-psbD and trnP-psaJ) with a percentage of variable sites greater than

  15. The complete chloroplast genome sequence of strawberry (Fragaria × ananassa Duch.) and comparison with related species of Rosaceae

    Directory of Open Access Journals (Sweden)

    Hui Cheng

    2017-10-01

    Compared with other members of the family Rosaceae, the chloroplast genomes of Fragaria species exhibit low variation, and this situation has limited phylogenetic analyses; thus, complete chloroplast genome sequencing of Fragaria species is needed. In this study, we sequenced the complete chloroplast genome of F. × ananassa ‘Benihoppe’ using the Illumina HiSeq 2500-PE150 platform and then performed a combination of de novo assembly and reference-guided mapping of contigs to generate complete chloroplast genome sequences. The chloroplast genome exhibits a typical quadripartite structure with a pair of inverted repeats (IRs, 25,936 bp) separated by large (LSC, 85,531 bp) and small (SSC, 18,146 bp) single-copy (SC) regions. The length of the F. × ananassa ‘Benihoppe’ chloroplast genome is 155,549 bp, representing the smallest Fragaria chloroplast genome observed to date. The genome encodes 112 unique genes, comprising 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Comparative analysis of the overall nucleotide sequence identity among ten complete chloroplast genomes confirmed that for both coding and non-coding regions in Rosaceae, SC regions exhibit higher sequence variation than IRs. The Ka/Ks ratio of most genes was less than 1, suggesting that most genes are under purifying selection. Moreover, the mVISTA results also showed a high degree of conservation in genome structure, gene order and gene content in Fragaria, particularly among three octoploid strawberries which were F. × ananassa ‘Benihoppe’, F. chiloensis (GP33) and F. virginiana (O477). However, when the sequences of the coding and non-coding regions of F. × ananassa ‘Benihoppe’ were compared in detail with those of F. chiloensis (GP33) and F. virginiana (O477), a number of SNPs and InDels were revealed by MEGA 7. Six non-coding regions (trnK-matK, trnS-trnG, atpF-atpH, trnC-petN, trnT-psbD and trnP-psaJ) with a percentage of variable sites greater than 1

  16. COMPLETE NUCLEOTIDE SEQUENCE OF SPHEROIDIN GENES OF CALLIPTAMUS ITALICUS ENTOMOPOXVIRUS(CIEPV) AND GOMPHOCERUS SIBIRICUS ENTOMOPOXVIRUS(GSEPV)

    Institute of Scientific and Technical Information of China (English)

    Yong-dan Li; Li-ying Wang; Xi-wu Gao; Chao-yang Zhao; Zhao-feng Tian

    2004-01-01

    The spheroidin genes of Calliptamus italicus entomopoxvirus (CiEPV) and Gomphocerus sibiricus entomopoxvirus (GsEPV) were obtained by PCR, and the fragments were cloned, sequenced and analyzed. The CiEPV and GsEPV spheroidin genes harbored ORFs of 2,922 bp and 2,967 bp, respectively, capable of encoding polypeptides of 109.2 and 111.1 kDa. Computer analysis indicated that CiEPV and GsEPV spheroidins shared less than 20% amino acid identity with lepidopteran AmEPV and coleopteran AcEPV spheroidins, but more than 80% amino acid identity with orthopteran OaEPV, MsEPV and AaEPV spheroidins. The CiEPV and GsEPV spheroidins contained 19 and 21 cysteine residues, respectively, which were particularly abundant at the C-termini, as is the case with those of the other orthopteran EPV spheroidins. The numbers and locations of the cysteine residues of the spheroidins were most similar to those of the spheroidins of EPVs that are virulent on the same insect orders. The promoter regions of the two spheroidin genes were highly conserved (99%) among the orthopteran EPVs and also contained the typical, very A+T-rich sequence and the TAAATG signal that mediates transcription of poxvirus late genes. We also sequenced an incomplete ORF downstream of the spheroidin gene of CiEPV and GsEPV. The ORF was in the opposite direction to the spheroidin gene and was homologous to the MSV072 putative protein of MsEPV.
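    The relation between the reported ORF lengths and polypeptide masses can be checked with a back-of-the-envelope estimate. The sketch below assumes that the reported ORF length includes the stop codon and uses the common rule of thumb of about 110 Da per amino acid residue; both are assumptions of this example, not statements from the record, so the result is only expected to land near the reported 109.2 and 111.1 kDa.

```python
# Rule-of-thumb average residue mass; an assumption, not a value from the record.
AVERAGE_RESIDUE_MASS_DA = 110.0


def orf_to_protein(orf_length_bp: int, includes_stop: bool = True):
    """Estimate residue count and approximate mass (kDa) from an ORF length."""
    if orf_length_bp <= 0 or orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_length_bp // 3
    # The stop codon, if counted in the ORF length, encodes no residue.
    residues = codons - 1 if includes_stop else codons
    mass_kda = residues * AVERAGE_RESIDUE_MASS_DA / 1000.0
    return residues, mass_kda


for name, length_bp, reported_kda in [
    ("CiEPV", 2922, 109.2),
    ("GsEPV", 2967, 111.1),
]:
    residues, estimate = orf_to_protein(length_bp)
    print(f"{name}: {residues} residues, ~{estimate:.1f} kDa "
          f"(reported {reported_kda} kDa)")
```

    The estimates fall a few percent below the reported masses, which is within what a fixed average residue mass can be expected to deliver; an exact value requires the actual amino acid sequence.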

  17. Equid herpesvirus 8: Complete genome sequence and association with abortion in mares

    Science.gov (United States)

    Garvey, Marie; Suárez, Nicolás M.; Kerr, Karen; Hector, Ralph; Moloney-Quinn, Laura; Arkins, Sean; Davison, Andrew J.

    2018-01-01

    Equid herpesvirus 8 (EHV-8), formerly known as asinine herpesvirus 3, is an alphaherpesvirus that is closely related to equid herpesviruses 1 and 9 (EHV-1 and EHV-9). The pathogenesis of EHV-8 is relatively little studied and to date has only been associated with respiratory disease in donkeys in Australia and horses in China. A single EHV-8 genome sequence has been generated for strain Wh in China, but is apparently incomplete and contains frameshifts in two genes. In this study, the complete genome sequences of four EHV-8 strains isolated in Ireland between 2003 and 2015 were determined by Illumina sequencing. Two of these strains were isolated from cases of abortion in horses, and were misdiagnosed initially as EHV-1, and two were isolated from donkeys, one with neurological disease. The four genome sequences are very similar to each other, exhibiting greater than 98.4% nucleotide identity, and their phylogenetic clustering together demonstrated that genomic diversity is not dependent on the host. Comparative genomic analysis revealed 24 of the 76 predicted protein sequences are completely conserved among the Irish EHV-8 strains. Evolutionary comparisons indicate that EHV-8 is phylogenetically closer to EHV-9 than it is to EHV-1. In summary, the first complete genome sequences of EHV-8 isolates from two host species over a twelve year period are reported. The current study suggests that EHV-8 can cause abortion in horses. The potential threat of EHV-8 to the horse industry and the possibility that donkeys may act as reservoirs of infection warrant further investigation. PMID:29414990

  18. Complete mitochondrial genome sequence of the longsnout seahorse Hippocampus reidi (Ginsburg, 1933; Gasterosteiformes: Syngnathidae).

    Science.gov (United States)

    Wang, Xin; Zhang, Yanhong; Zhang, Huixian; Meng, Tan; Lin, Qiang

    2016-01-01

    The complete mitochondrial genome sequence of the longsnout seahorse Hippocampus reidi is determined for the first time in this article. The H. reidi mitogenome is 16,529 bp in length and consists of 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and 1 control region. The gene order and composition of H. reidi are similar to those of most other vertebrates. The overall base composition of H. reidi is 32.47% A, 29.41% T, 14.75% G and 23.37% C, with a slight A + T rich feature (61.88%).
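    The base-composition figures in this record are internally consistent, which a short check makes explicit. The percentages are taken from the abstract; the variable names are assumptions of this example.

```python
# Base composition of the H. reidi mitogenome as reported in the abstract (%).
composition = {"A": 32.47, "T": 29.41, "G": 14.75, "C": 23.37}

# The four percentages should account for the whole genome.
total = round(sum(composition.values()), 2)

# A + T content, rounded to avoid floating-point noise in the comparison.
at_content = round(composition["A"] + composition["T"], 2)
gc_content = round(composition["G"] + composition["C"], 2)

print(f"total = {total}%, A+T = {at_content}%, G+C = {gc_content}%")
```

    The sum of A and T reproduces the 61.88% stated in the abstract, and the four bases add up to 100%.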

  19. Complete Plastid Genome Sequencing of Four Tilia Species (Malvaceae): A Comparative Analysis and Phylogenetic Implications.

    Directory of Open Access Journals (Sweden)

    Jie Cai

    Tilia is an ecologically and economically important genus in the family Malvaceae. However, no complete plastid genome of Tilia has been sequenced to date, and the taxonomy of Tilia is difficult owing to frequent hybridization and polyploidization. Well-supported interspecific relationships within this genus are not available because the commonly used molecular markers provide few informative sites. We report here the complete plastid genome sequences of four Tilia species determined by Illumina technology. The Tilia plastid genome is 162,653 bp to 162,796 bp in length, encoding 113 unique genes and a total of 130 genes. The gene order and organization of the Tilia plastid genome exhibit the general structure of angiosperms and are very similar to other published plastid genomes of Malvaceae. As in other long-lived tree genera, the sequence divergence among the four Tilia plastid genomes is very low. We also analyzed the nucleotide substitution patterns and the evolution of insertions and deletions in the Tilia plastid genomes. Finally, we built a phylogeny of the four sampled Tilia species with high support using plastid phylogenomics, suggesting that it is an efficient way to resolve the phylogenetic relationships of this genus.

  20. Complete nucleotide sequence and gene rearrangement of the ...

    Indian Academy of Sciences (India)

    3Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, People's Republic of China ... of these rearrangements involve tRNA genes, ND5 gene and ... ncbi.nlm.nih.gov/projects/Sequin/download/seq_win_download.

  1. The Complete Chloroplast Genome Sequences of the Medicinal Plant Forsythia suspensa (Oleaceae)

    Directory of Open Access Journals (Sweden)

    Wenbin Wang

    2017-10-01

    Forsythia suspensa is an important medicinal plant traditionally applied for the treatment of inflammation, pyrexia, gonorrhea, diabetes, and so on. However, there is limited sequence and genomic information available for F. suspensa. Here, we produced the complete chloroplast genome of F. suspensa using Illumina sequencing technology. F. suspensa is the first sequenced member within the genus Forsythia (Oleaceae). The gene order and organization of the chloroplast genome of F. suspensa are similar to other Oleaceae chloroplast genomes. The F. suspensa chloroplast genome is 156,404 bp in length and exhibits a conserved quadripartite structure with a large single-copy (LSC; 87,159 bp) region and a small single-copy (SSC; 17,811 bp) region interspersed between inverted repeat (IRa/b; 25,717 bp) regions. A total of 114 unique genes were annotated, including 80 protein-coding genes, 30 tRNA, and four rRNA. The low GC content (37.8%) and codon usage bias for A- or T-ending codons may largely affect gene codon usage. Sequence analysis identified a total of 26 forward repeats, 23 palindrome repeats with lengths >30 bp (identity > 90%), and 54 simple sequence repeats (SSRs) with an average rate of 0.35 SSRs/kb. We predicted 52 RNA editing sites in the chloroplast of F. suspensa, all for C-to-U transitions. IR expansion or contraction and the divergent regions were analyzed among several species, including F. suspensa as reported in this study. Phylogenetic analysis based on the whole plastome revealed that F. suspensa, as a member of the Oleaceae family, diverged relatively early from Lamiales. This study will contribute to strengthening medicinal resource conservation, molecular phylogenetic, and genetic engineering research investigations of this species.
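    The quadripartite structure described in this record implies a simple length identity: the genome length equals LSC + SSC + 2 × IR. The sketch below applies it to the F. suspensa figures from the abstract and also recomputes the SSR density; the function name is an assumption of this example.

```python
def quadripartite_length(lsc_bp: int, ssc_bp: int, ir_bp: int) -> int:
    """Total plastome length: one LSC, one SSC and two inverted-repeat copies."""
    return lsc_bp + ssc_bp + 2 * ir_bp


# Region lengths (bp) as reported for F. suspensa in the abstract.
total_bp = quadripartite_length(lsc_bp=87_159, ssc_bp=17_811, ir_bp=25_717)

# 54 SSRs were reported; density is expressed per kilobase of genome.
ssr_per_kb = 54 / (total_bp / 1000)

print(total_bp)
print(round(ssr_per_kb, 2))
```

    The region lengths add up to the reported 156,404 bp, and the SSR density rounds to the reported 0.35 SSRs/kb.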

  2. Complete Genome Sequence of the Endophytic Biocontrol Strain Bacillus velezensis CC09

    OpenAIRE

    Cai, Xunchao; Kang, Xingxing; Xi, Huan; Liu, Changhong; Xue, Yarong

    2016-01-01

    Bacillus velezensis is a heterotypic synonym of B. methylotrophicus, B. amyloliquefaciens subsp. plantarum, and Bacillus oryzicola, and has been used to control plant fungal diseases. In order to fully understand the genetic basis of its antimicrobial capacities, we sequenced the complete genome of the endophytic B. velezensis strain CC09. Genes tightly associated with biocontrol ability, including nonribosomal peptide synthetases, polyketide synthetases, iron acquisition, colonization, and vo...

  3. Complete genome sequence of Coraliomargarita akajimensis type strain (04OKA010-24T)

    Energy Technology Data Exchange (ETDEWEB)

    Mavromatis, Konstantinos; Abt, Birte; Brambilla, Evelyne; Lapidus, Alla; Copeland, Alex; Desphande, Shweta; Nolan, Matt; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Han, Cliff; Detter, John C.; Woyke, Tanja; Goodwin, Lynne; Pitluck, Sam; Held, Brittany; Brettin, Thomas; Tapia, Roxanne; Ivanova, Natalia; Mikhailova, Natalia; Pati, Amrita; Liolios, Konstantinos; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Rohde, Manfred; Göker, Markus; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C.

    2010-06-25

    Coraliomargarita akajimensis Yoon et al. 2007 is the type species of the genus Coraliomargarita. C. akajimensis is an obligately aerobic, Gram-negative, non-spore-forming, non-motile, spherical bacterium which was isolated from seawater surrounding the hard coral Galaxea fascicularis. C. akajimensis is of special interest because of its phylogenetic position in a genomically poorly studied area of the bacterial diversity. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of a member of the family Puniceicoccaceae. The 3,750,771 bp long genome with its 3,137 protein-coding and 55 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  4. Complete Chloroplast Genome of Pinus massoniana (Pinaceae): Gene Rearrangements, Loss of ndh Genes, and Short Inverted Repeats Contraction, Expansion.

    Science.gov (United States)

    Ni, ZhouXian; Ye, YouJu; Bai, Tiandao; Xu, Meng; Xu, Li-An

    2017-09-11

    The chloroplast genome (CPG) of Pinus massoniana, a member of the genus Pinus (Pinaceae) and a primary source of turpentine, was sequenced and analyzed in terms of gene rearrangements, ndh gene loss, and the contraction and expansion of short inverted repeats (IRs). The P. massoniana CPG has a typical quadripartite structure that includes a large single copy (LSC) region (65,563 bp), a small single copy (SSC) region (53,230 bp) and two IRs (IRa and IRb, 485 bp). A total of 108 unique genes were identified, including 73 protein-coding genes, 31 tRNAs, and 4 rRNAs. Most of the 81 simple sequence repeats (SSRs) identified in the CPG were mononucleotide motifs of A/T types located in non-coding regions. Comparisons with related species revealed an inversion (21,556 bp) in the LSC region; the P. massoniana CPG lacks all 11 intact ndh genes (four ndh genes were lost completely; five remain truncated as pseudogenes; and the other two remain as pseudogenes because of short insertions or deletions). A pair of short IRs was found instead of large IRs, and size variations among pine species were observed, which resulted from short insertions or deletions and non-synchronized variations between "IRa" and "IRb". The results of phylogenetic analyses based on whole CPG sequences of 16 conifers indicated that whole CPG sequences could be used as a powerful tool in phylogenetic analyses.

  5. Complete genome sequence of Denitrovibrio acetiphilus type strain (N2460T)

    Energy Technology Data Exchange (ETDEWEB)

    Kiss, Hajnalka; Lang, Elke; Lapidus, Alla; Copeland, Alex; Nolan, Matt; Glavina Del Rio, Tijana; Chen, Feng; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Han, Cliff; Goodwin, Lynne; Pitluck, Sam; Liolios, Konstantinos; Pati, Amrita; Ivanova, Natalia; Mavromatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Detter, John C.; Brettin, Thomas; Spring, Stefan; Rohde, Manfred; Goker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2010-06-25

    Denitrovibrio acetiphilus Myhr and Torsvik 2000 is the type species of the genus Denitrovibrio in the bacterial family Deferribacteraceae. It is of phylogenetic interest because there are only six genera described in the family Deferribacteraceae. D. acetiphilus was isolated as a representative of a population reducing nitrate to ammonia in a laboratory column simulating the conditions in off-shore oil recovery fields. When nitrate was added to this column undesirable hydrogen sulfide production was stopped because the sulfate reducing populations were superseded by these nitrate reducing bacteria. Here we describe the features of this marine, mesophilic, obligately anaerobic organism respiring by nitrate reduction, together with the complete genome sequence, and annotation. This is the second complete genome sequence of the order Deferribacterales and the class Deferribacteres, which is the sole class in the phylum Deferribacteres. The 3,222,077 bp genome with its 3,034 protein-coding and 51 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  6. Complete genome sequence of Saccharomonospora viridis type strain (P101T)

    Energy Technology Data Exchange (ETDEWEB)

    Pati, Amrita; Sikorski, Johannes; Nolan, Matt; Lapidus, Alla; Copeland, Alex; Glavina Del Rio, Tijana; Lucas, Susan; Chen, Feng; Tice, Hope; Pitluck, Sam; Cheng, Jan-Fang; Chertkov, Olga; Brettin, Thomas; Han, Cliff; Detter, John C.; Kuske, Cheryl; Bruce, David; Goodwin, Lynne; Chain, Patrick; D'haeseleer, Patrik; Chen, Amy; Palaniappan, Krishna; Ivanova, Natalia; Mavromatis, Konstantinos; Mikhailova, Natalia; Rohde, Manfred; Tindall, Brian J.; Goker, Markus; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2009-05-20

    Saccharomonospora viridis (Schuurmans et al. 1956) Nonomura and Ohara 1971 is the type species of the genus Saccharomonospora, which belongs to the family Pseudonocardiaceae. S. viridis is of interest because it is a Gram-negative organism classified amongst the usually Gram-positive actinomycetes. Members of the species are frequently found in hot compost and hay, and its spores can cause farmer's lung disease, bagassosis, and humidifier fever. Strains of the species S. viridis have been found to metabolize the xenobiotic pentachlorophenol (PCP). The strain described in this study was isolated from a peat bog in Ireland. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of the family Pseudonocardiaceae, and the 4,308,349 bp long single replicon genome with its 3906 protein-coding and 64 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  7. Complete genome sequence of Dyadobacter fermentans type strain (NS114T)

    Energy Technology Data Exchange (ETDEWEB)

    Lang, Elke; Lapidus, Alla; Chertkov, Olga; Brettin, Thomas; Detter, John C.; Han, Cliff; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Chen, Feng; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ovchinnikova, Galina; Pati, Amrita; Ivanova, Natalia; Mavromatis, Konstantinos; Chen, Amy; Chain, Patrick; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Goker, Markus; Rohde, Manfred; Kyrpides, Nikos C; Klenk, Hans-Peter

    2009-05-20

    Dyadobacter fermentans (Chelius MK and Triplett EW, 2000) is the type species of the genus Dyadobacter. It is of phylogenetic interest because of its location in the Cytophagaceae, a very diverse family within the order 'Sphingobacteriales'. D. fermentans has a mainly respiratory metabolism, stains Gram-negative, is non-motile and oxidase and catalase positive. It is characterized by the production of cell filaments in ageing cultures, a flexirubin-like pigment and its ability to ferment glucose, which is almost unique in the aerobically living members of this taxonomically difficult family. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of the 'sphingobacterial' genus Dyadobacter, and this 6,967,790 bp long single replicon genome with its 5804 protein-coding and 50 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  8. Complete genome sequence of Catenulispora acidiphila type strain (ID 139908T)

    Energy Technology Data Exchange (ETDEWEB)

    Copeland, Alex; Lapidus, Alla; Glavina Del Rio, Tijana; Nolan, Matt; Lucas, Susan; Chen, Feng; Tice, Hope; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Mikhailova, Natalia; Pati, Amrita; Ivanova, Natalia; Mavromatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; Chain, Patrick; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chertkov, Olga; Brettin, Thomas; Detter, John C.; Han, Cliff; Ali, Zahid; Tindall, Brian J.; Goker, Markus; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2009-05-20

    Catenulispora acidiphila Busti et al. 2006 is the type species of the genus Catenulispora, and is of interest because of the rather isolated phylogenetic location of the genomically little studied suborder Catenulisporineae within the order Actinomycetales. C. acidiphila is known for its acidophilic, aerobic lifestyle, but can also grow scantly under anaerobic conditions. Under regular conditions C. acidiphila grows in long filaments of relatively short aerial hyphae with marked septation. It is a free-living, non-motile, Gram-positive bacterium isolated from a forest soil sample taken from a wooded area in Gerenzano, Italy. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of the actinobacterial family Catenulisporaceae, and the 10,467,782 bp long single replicon genome with its 9056 protein-coding and 69 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  9. Complete genome sequence of Kytococcus sedentarius type strain (strain 541T)

    Energy Technology Data Exchange (ETDEWEB)

    Sims, David; Brettin, Thomas; Detter, John C.; Han, Cliff; Lapidus, Alla; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Chen, Feng; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ovchinnikova, Galina; Pati, Amrita; Ivanova, Natalia; Mavrommatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; D'haeseleer, Patrick; Chain, Patrick; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Schneider, Susanne; Goker, Markus; Pukall, Rudiger; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2009-05-20

    Kytococcus sedentarius (ZoBell and Upham 1944) Stackebrandt et al. 1995 is the type strain of the species, and is of phylogenetic interest because of its location in the Dermacoccaceae, a poorly studied family within the actinobacterial suborder Micrococcineae. K. sedentarius is known for the production of oligoketide antibiotics as well as for its role as an opportunistic pathogen causing valve endocarditis, hemorrhagic pneumonia, and pitted keratolysis. It is strictly aerobic and can only grow when several amino acids are provided in the medium. The strain described in this report is a free-living, nonmotile, Gram-positive bacterium, originally isolated from a marine environment. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the family Dermacoccaceae and the 2,785,024 bp long single replicon genome with its 2639 protein-coding and 64 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  10. Complete genome sequence of Brachybacterium faecium type strain (Schefferle 6-10T)

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla; Pukall, Rudiger; LaButti, Kurt; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Chen, Feng; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Rohde, Manfred; Goker, Markus; Pati, Amrita; Ivanova, Natalia; Mavrommatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; D'haeseleer, Patrik; Chain, Patrick; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2009-05-20

    Brachybacterium faecium Collins et al. 1988 is the type species of the genus, and is of phylogenetic interest because of its location in the Dermabacteraceae, a rather isolated family within the actinobacterial suborder Micrococcineae. B. faecium is known for its rod-coccus growth cycle and the ability to degrade uric acid. It grows aerobically or weakly anaerobically. The strain described in this report is a free-living, nonmotile, Gram-positive bacterium, originally isolated from poultry deep litter. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the actinobacterial family Dermabacteraceae, and the 3,614,992 bp long single replicon genome with its 3129 protein-coding and 69 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  11. Complete genome sequencing of the luminescent bacterium, Vibrio qinghaiensis sp. Q67 using PacBio technology

    Science.gov (United States)

    Gong, Liang; Wu, Yu; Jian, Qijie; Yin, Chunxiao; Li, Taotao; Gupta, Vijai Kumar; Duan, Xuewu; Jiang, Yueming

    2018-01-01

    Vibrio qinghaiensis sp.-Q67 (Vqin-Q67) is a freshwater luminescent bacterium that continuously emits blue-green light (485 nm). The bacterium has been widely used for detecting toxic contaminants. Here, we report the complete genome sequence of Vqin-Q67, obtained using third-generation PacBio sequencing technology. Continuous long reads were obtained from three PacBio sequencing runs, and reads >500 bp with a quality value of >0.75 were merged into a single dataset. The resulting highly contiguous de novo assembly has no genome gaps and comprises two chromosomes with substantial genetic information, including protein-coding genes, non-coding RNAs, transposons and gene islands. Our dataset can be useful as a comparative genome for evolution and speciation studies, as well as for the analysis of protein-coding gene families, the pathogenicity of different Vibrio species in fish, the evolution of non-coding RNAs and transposons, and the regulation of gene expression in relation to the bioluminescence of Vqin-Q67.
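    The read-selection step in this abstract (keep reads longer than 500 bp with a quality value above 0.75, then pool the three runs) can be illustrated with a minimal Python sketch. The `Read` record, the function names, and the assumption that each read carries a single quality value on a 0 to 1 scale are hypothetical and introduced only for illustration; the abstract does not say which software the authors used for this step.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Read:
    """Hypothetical read record; not the format of any real PacBio tool."""
    name: str
    sequence: str
    quality: float  # assumed: one per-read quality value on a 0-1 scale


def filter_reads(reads: Iterable[Read],
                 min_length: int = 500,
                 min_quality: float = 0.75) -> List[Read]:
    """Keep reads strictly longer than min_length with quality strictly above min_quality."""
    return [r for r in reads
            if len(r.sequence) > min_length and r.quality > min_quality]


def merge_runs(runs: Iterable[Iterable[Read]]) -> List[Read]:
    """Filter each sequencing run and pool the surviving reads into one dataset."""
    merged: List[Read] = []
    for run in runs:
        merged.extend(filter_reads(run))
    return merged
```

    Both thresholds are applied as strict inequalities, matching the ">" signs in the abstract, so a read of exactly 500 bp is discarded.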

  12. Complete genome sequence of Oceanithermus profundus type strain (506T)

    Energy Technology Data Exchange (ETDEWEB)

    Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Zhang, Xiaojing [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Hauser, Loren John [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Ruhl, Alina [U.S. Department of Energy, Joint Genome Institute; Mwirichia, Romano [University of Munster, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Tindall, Brian [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Wirth, Reinhard [Universitat Regensburg, Regensburg, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. 
Department of Energy, Joint Genome Institute; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Land, Miriam L [ORNL]

    2011-01-01

    Oceanithermus profundus Miroshnichenko et al. 2003 is the type species of the genus Oceanithermus, which belongs to the family Thermaceae. The genus currently comprises two species whose members are thermophilic and are able to reduce sulfur compounds and nitrite. The organism is adapted to the salinity of sea water, is able to utilize a broad range of carbohydrates, some proteinaceous substrates, organic acids and alcohols. This is the first completed genome sequence of a member of the genus Oceanithermus and the fourth sequence from the family Thermaceae. The 2,439,291 bp long genome with its 2,391 protein-coding and 54 RNA genes consists of one chromosome and a 135,351 bp long plasmid, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  13. Next generation sequencing yields the complete mitochondrial genome of the flathead mullet, Mugil cephalus cryptic species NWP2 (Teleostei: Mugilidae).

    Science.gov (United States)

    Shen, Kang-Ning; Yen, Ta-Chi; Chen, Ching-Hung; Li, Huei-Ying; Chen, Pei-Lung; Hsiao, Chung-Der

    2016-05-01

    In this study, the complete mitogenome sequence of the Northwestern Pacific 2 (NWP2) cryptic species of flathead mullet, Mugil cephalus (Teleostei: Mugilidae), was amplified by long-range PCR and sequenced by next-generation sequencing. The assembled mitogenome, consisting of 16,686 bp, had the typical vertebrate mitochondrial gene arrangement, including 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes and a non-coding control region (D-loop). The D-loop was 909 bp in length and was located between tRNA-Pro and tRNA-Phe. The overall base composition of NWP2 M. cephalus was 28.4% A, 29.8% C, 26.5% T and 15.3% G. The complete mitogenome may provide essential DNA molecular data for further phylogenetic and evolutionary analysis of the flathead mullet species complex.
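    Base-composition figures of the kind reported in this abstract are simple per-base percentages. The following Python sketch shows one way to compute them; the function name `base_composition` and the decision to ignore ambiguous bases such as N are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from typing import Dict


def base_composition(sequence: str) -> Dict[str, float]:
    """Return the percentage of A, C, G and T, rounded to one decimal place.

    Characters other than A, C, G, T (for example N) are ignored, which is
    an assumption of this sketch rather than a documented convention.
    """
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: round(100.0 * counts[base] / total, 1) for base in "ACGT"}
```

    As a consistency check on the reported values, 28.4 + 29.8 + 26.5 + 15.3 = 100.0, so the four percentages in the abstract account for the whole sequence.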

  14. From Sequence to Morphology - Long-Range Correlations in Complete Sequenced Genomes

    NARCIS (Netherlands)

    T.A. Knoch (Tobias)

    2004-01-01

    The largely unresolved sequential organization, i.e. the relations within DNA sequences, and its connection to the three-dimensional organization of genomes was investigated by correlation analyses of completely sequenced chromosomes from Viroids, Archaea, Bacteria, Arabidopsis

  15. The completion of the Mammalian Gene Collection (MGC)

    Science.gov (United States)

    Temple, Gary; Gerhard, Daniela S.; Rasooly, Rebekah; Feingold, Elise A.; Good, Peter J.; Robinson, Cristen; Mandich, Allison; Derge, Jeffrey G.; Lewis, Jeanne; Shoaf, Debonny; Collins, Francis S.; Jang, Wonhee; Wagner, Lukas; Shenmen, Carolyn M.; Misquitta, Leonie; Schaefer, Carl F.; Buetow, Kenneth H.; Bonner, Tom I.; Yankie, Linda; Ward, Ming; Phan, Lon; Astashyn, Alex; Brown, Garth; Farrell, Catherine; Hart, Jennifer; Landrum, Melissa; Maidak, Bonnie L.; Murphy, Michael; Murphy, Terence; Rajput, Bhanu; Riddick, Lillian; Webb, David; Weber, Janet; Wu, Wendy; Pruitt, Kim D.; Maglott, Donna; Siepel, Adam; Brejova, Brona; Diekhans, Mark; Harte, Rachel; Baertsch, Robert; Kent, Jim; Haussler, David; Brent, Michael; Langton, Laura; Comstock, Charles L.G.; Stevens, Michael; Wei, Chaochun; van Baren, Marijke J.; Salehi-Ashtiani, Kourosh; Murray, Ryan R.; Ghamsari, Lila; Mello, Elizabeth; Lin, Chenwei; Pennacchio, Christa; Schreiber, Kirsten; Shapiro, Nicole; Marsh, Amber; Pardes, Elizabeth; Moore, Troy; Lebeau, Anita; Muratet, Mike; Simmons, Blake; Kloske, David; Sieja, Stephanie; Hudson, James; Sethupathy, Praveen; Brownstein, Michael; Bhat, Narayan; Lazar, Joseph; Jacob, Howard; Gruber, Chris E.; Smith, Mark R.; McPherson, John; Garcia, Angela M.; Gunaratne, Preethi H.; Wu, Jiaqian; Muzny, Donna; Gibbs, Richard A.; Young, Alice C.; Bouffard, Gerard G.; Blakesley, Robert W.; Mullikin, Jim; Green, Eric D.; Dickson, Mark C.; Rodriguez, Alex C.; Grimwood, Jane; Schmutz, Jeremy; Myers, Richard M.; Hirst, Martin; Zeng, Thomas; Tse, Kane; Moksa, Michelle; Deng, Merinda; Ma, Kevin; Mah, Diana; Pang, Johnson; Taylor, Greg; Chuah, Eric; Deng, Athena; Fichter, Keith; Go, Anne; Lee, Stephanie; Wang, Jing; Griffith, Malachi; Morin, Ryan; Moore, Richard A.; Mayo, Michael; Munro, Sarah; Wagner, Susan; Jones, Steven J.M.; Holt, Robert A.; Marra, Marco A.; Lu, Sun; Yang, Shuwei; Hartigan, James; Graf, Marcus; Wagner, Ralf; Letovksy, Stanley; Pulido, Jacqueline C.; Robison, Keith; 
Esposito, Dominic; Hartley, James; Wall, Vanessa E.; Hopkins, Ralph F.; Ohara, Osamu; Wiemann, Stefan

    2009-01-01

    Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more PubMed publications, in addition to clones for more than 6300 rat genes. These high-quality MGC clones and their sequences are accessible without restriction to researchers worldwide. PMID:19767417

  16. Detection and sequence analysis of accessory gene regulator genes of Staphylococcus pseudintermedius isolates

    Directory of Open Access Journals (Sweden)

    M. Ananda Chitra

    2015-07-01

    Background: Staphylococcus pseudintermedius (SP) is the major pathogenic species of dogs involved in a wide variety of skin and soft tissue infections. The accessory gene regulator (agr) locus of Staphylococcus aureus has been extensively studied, and it influences the expression of many virulence genes. It encodes a two-component signal transduction system that leads to down-regulation of surface proteins and up-regulation of secreted proteins during in vitro growth of S. aureus. The objective of this study was to detect and sequence-analyze the agrA, B, and D genes of SP isolated from canine skin infections. Materials and Methods: In this study, we isolated SP from canine pyoderma and otitis cases, identified it by polymerase chain reaction (PCR), and confirmed it by PCR-restriction fragment length polymorphism. Primers for SP agrA and agrBD genes were designed using online primer designing software and BLAST searched for specificity. Amplification of the agr genes was carried out for 53 isolates of SP by PCR, and sequencing of agrA, B, and D was carried out for five isolates and analyzed using DNAstar and Mega5.2 software. Results: A total of 53 (59%) SP isolates were obtained from 90 samples. Fifteen isolates (28%) were confirmed to be methicillin-resistant SP (MRSP) with the detection of the mecA gene. Accessory gene regulator A, B, and D genes were detected in all the SP isolates. Complete nucleotide sequences of the above three genes for five isolates were submitted to GenBank, and their accession numbers are from KJ133557 to KJ133571. AgrA amino acid sequence analysis showed that it is mainly made of alpha-helices and is hydrophilic in nature. AgrB is a transmembrane protein, and AgrD encodes the precursor of the autoinducing peptide (AIP). Sequencing of the agrD gene revealed that the 5 canine SP strains tested could be divided into three Agr specificity groups (RIPTSTGFF, KIPTSTGFF, and RIPISTGFF) based on the putative AIP produced by each strain

  17. Complete genome sequence of Pedobacter heparinus type strain (HIM 762-3T)

    Energy Technology Data Exchange (ETDEWEB)

    Han, Cliff; Spring, Stefan; Lapidus, Alla; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Saunders, Elizabeth; Chertkov, Olga; Brettin, Thomas; Goker, Markus; Rohde, Manfred; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Detter, John C.

    2009-05-20

    Pedobacter heparinus (Payza and Korn 1956) Steyn et al. 1998 comb. nov. is the type species of the rapidly growing genus Pedobacter within the family Sphingobacteriaceae of the phylum 'Bacteroidetes'. P. heparinus is of interest, because it was the first isolated strain shown to grow with heparin as sole carbon and nitrogen source and because it produces several enzymes involved in the degradation of mucopolysaccharides. All available data about this species are based on a sole strain that was isolated from dry soil. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first report on a complete genome sequence of a member of the genus Pedobacter, and the 5,167,383 bp long single replicon genome with its 4287 protein-coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  18. The complete chloroplast genome sequence of the relict woody plant Metasequoia glyptostroboides Hu et Cheng

    Directory of Open Access Journals (Sweden)

    Jinhui Chen

    2015-06-01

    Metasequoia glyptostroboides Hu et Cheng is the only species in the genus Metasequoia Miki ex Hu et Cheng, which belongs to the Cupressaceae family. There were around ten species in the Metasequoia genus, which were widely spread across the Northern Hemisphere during the Cretaceous of the Mesozoic and in the Cenozoic. M. glyptostroboides is the only remaining representative of this genus. Here, we report the complete chloroplast (cp) genome sequence and the cp genomic features of M. glyptostroboides. The M. glyptostroboides cp genome is 131,887 bp in length, with a total of 117 genes comprising 82 protein-coding genes, 31 tRNA genes and four rRNA genes. In this genome, 11 forward repeats, nine palindromic repeats and 15 tandem repeats were detected. A total of 188 perfect microsatellites were detected through simple sequence repeat (SSR) analysis and these were distributed unevenly within the cp genome. Comparison of the cp genome structure and gene order to those of several other land plants indicated that a copy of the inverted repeat (IR) region, which was found to be IR region A (IRA), was lost in the M. glyptostroboides cp genome. The five most divergent and five most conserved genes were determined and further phylogenetic analysis was performed among plant species, especially for related species in conifers. Finally, phylogenetic analysis demonstrated that M. glyptostroboides is a sister species to Cryptomeria japonica (L. F.) D. Don and to Taiwania cryptomerioides Hayata. The complete cp genome sequence information of M. glyptostroboides will be of great help for further investigations of this endemic relict woody plant and for in-depth understanding of the evolutionary history of the coniferous cp genomes, especially for the position of M. glyptostroboides in plant systematics and evolution.

  19. The complete chloroplast genome sequence of the relict woody plant Metasequoia glyptostroboides Hu et Cheng.

    Science.gov (United States)

    Chen, Jinhui; Hao, Zhaodong; Xu, Haibin; Yang, Liming; Liu, Guangxin; Sheng, Yu; Zheng, Chen; Zheng, Weiwei; Cheng, Tielong; Shi, Jisen

    2015-01-01

    Metasequoia glyptostroboides Hu et Cheng is the only species in the genus Metasequoia Miki ex Hu et Cheng, which belongs to the Cupressaceae family. There were around 10 species in the Metasequoia genus, which were widely spread across the Northern Hemisphere during the Cretaceous of the Mesozoic and in the Cenozoic. M. glyptostroboides is the only remaining representative of this genus. Here, we report the complete chloroplast (cp) genome sequence and the cp genomic features of M. glyptostroboides. The M. glyptostroboides cp genome is 131,887 bp in length, with a total of 117 genes comprising 82 protein-coding genes, 31 tRNA genes and four rRNA genes. In this genome, 11 forward repeats, nine palindromic repeats, and 15 tandem repeats were detected. A total of 188 perfect microsatellites were detected through simple sequence repeat (SSR) analysis and these were distributed unevenly within the cp genome. Comparison of the cp genome structure and gene order to those of several other land plants indicated that a copy of the inverted repeat (IR) region, which was found to be IR region A (IRA), was lost in the M. glyptostroboides cp genome. The five most divergent and five most conserved genes were determined and further phylogenetic analysis was performed among plant species, especially for related species in conifers. Finally, phylogenetic analysis demonstrated that M. glyptostroboides is a sister species to Cryptomeria japonica (L. F.) D. Don and to Taiwania cryptomerioides Hayata. The complete cp genome sequence information of M. glyptostroboides will be of great help for further investigations of this endemic relict woody plant and for in-depth understanding of the evolutionary history of the coniferous cp genomes, especially for the position of M. glyptostroboides in plant systematics and evolution.
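    The detection of perfect microsatellites mentioned in this abstract can be sketched with a regular-expression scan. The abstract does not state which tool or thresholds the authors used, so the minimum repeat counts in `MIN_REPEATS` (10 for mononucleotides, 5 for dinucleotides, 4 for trinucleotides, 3 for longer motifs) and the function name `find_perfect_ssrs` are assumptions chosen only to make the example concrete.

```python
import re
from typing import Dict, List, Tuple

# Assumed thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS: Dict[int, int] = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_perfect_ssrs(sequence: str,
                      min_repeats: Dict[int, int] = MIN_REPEATS
                      ) -> List[Tuple[int, str, int]]:
    """Return (0-based start, motif, copy number) for each perfect SSR found."""
    seq = sequence.upper()
    hits: List[Tuple[int, str, int]] = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif followed by at least (min_rep - 1) exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "ATAT"),
            # since the shorter motif length reports that stretch.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)
```

    For the toy input `"GGC" + "A" * 12 + "CGC" + "AT" * 6 + "GCC"`, the scan reports an (A)12 repeat at position 3 and an (AT)6 repeat at position 18. Counts obtained this way depend entirely on the chosen thresholds, so they are not expected to reproduce the 188 microsatellites reported in the paper.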

  20. Complete genome sequence of the probiotic bacterium Bifidobacterium breve KCTC 12201BP isolated from a healthy infant.

    Science.gov (United States)

    Kwak, Min-Jung; Yoon, Jae-Kyung; Kwon, Soon-Kyeong; Chung, Myung-Jun; Seo, Jae-Gu; Kim, Jihyun F

    2015-11-20

    We present the completely sequenced genome of Bifidobacterium breve CBT BR3, which was isolated from the feces of a healthy infant. The 2.43-Mb genome contains several kinds of genetic factors associated with health promotion of the human host such as oligosaccharide-degrading genes and vitamin-biosynthetic genes.

  1. The Complete Sequence of a Human Parainfluenzavirus 4 Genome

    Science.gov (United States)

    Yea, Carmen; Cheung, Rose; Collins, Carol; Adachi, Dena; Nishikawa, John; Tellier, Raymond

    2009-01-01

    Although the human parainfluenza virus 4 (HPIV4) has been known for a long time, its genome, alone among the human paramyxoviruses, has not been completely sequenced to date. In this study we obtained the first complete genomic sequence of HPIV4 from a clinical isolate named SKPIV4 obtained at the Hospital for Sick Children in Toronto (Ontario, Canada). The coding regions for the N, P/V, M, F and HN proteins show very high identities (95% to 97%) with previously available partial sequences for HPIV4B. The sequence for the L protein and the non-coding regions represent new information. A surprising feature of the genome is its length, more than 17 kb, making it the longest genome within the genus Rubulavirus, although the length is well within the known range of 15 kb to 19 kb for the subfamily Paramyxovirinae. The availability of a complete genomic sequence will facilitate investigations on a respiratory virus that is still not completely characterized. PMID:21994536

  2. The Complete Sequence of a Human Parainfluenzavirus 4 Genome

    Directory of Open Access Journals (Sweden)

    Carmen Yea

    2009-06-01

    Although the human parainfluenza virus 4 (HPIV4) has been known for a long time, its genome, alone among the human paramyxoviruses, has not been completely sequenced to date. In this study we obtained the first complete genomic sequence of HPIV4 from a clinical isolate named SKPIV4 obtained at the Hospital for Sick Children in Toronto (Ontario, Canada). The coding regions for the N, P/V, M, F and HN proteins show very high identities (95% to 97%) with previously available partial sequences for HPIV4B. The sequence for the L protein and the non-coding regions represent new information. A surprising feature of the genome is its length, more than 17 kb, making it the longest genome within the genus Rubulavirus, although the length is well within the known range of 15 kb to 19 kb for the subfamily Paramyxovirinae. The availability of a complete genomic sequence will facilitate investigations on a respiratory virus that is still not completely characterized.

  3. Next-generation sequencing yields the complete mitochondrial genome of the flathead mullet, Mugil cephalus cryptic species in East Australia (Teleostei: Mugilidae).

    Science.gov (United States)

    Shen, Kang-Ning; Chen, Ching-Hung; Hsiao, Chung-Der; Durand, Jean-Dominique

    2016-09-01

    In this study, the complete mitogenome sequence of a cryptic species from East Australia (Mugil sp. H) belonging to the worldwide Mugil cephalus species complex (Teleostei: Mugilidae) was determined by next-generation sequencing. The assembled mitogenome, consisting of 16,845 bp, had the typical vertebrate mitochondrial gene arrangement, including 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes and a non-coding control region (D-loop). The D-loop is 1067 bp long and is located between tRNA-Pro and tRNA-Phe. The overall base composition of East Australia M. cephalus is 28.4% A, 29.3% C, 15.4% G and 26.9% T. The complete mitogenome may provide essential DNA molecular data for further phylogenetic and evolutionary analysis of the flathead mullet species complex.

  4. Complete genome sequence of a commensal bacterium, Hafnia alvei CBA7124, isolated from human feces.

    Science.gov (United States)

    Song, Hye Seon; Kim, Joon Yong; Kim, Yeon Bee; Jeong, Myeong Seon; Kang, Jisu; Rhee, Jin-Kyu; Kwon, Joseph; Kim, Ju Suk; Choi, Jong-Soon; Choi, Hak-Jong; Nam, Young-Do; Roh, Seong Woon

    2017-01-01

    Members of the genus Hafnia have been isolated from the feces of mammals, birds, reptiles, and fish, as well as from soil, water, sewage, and foods. Hafnia alvei is an opportunistic pathogen that has been implicated in intestinal and extraintestinal infections in humans. However, its pathogenicity is still unclear. In this study, we isolated H. alvei from human feces and performed sequencing as well as comparative genomic analysis to better understand its pathogenicity. The genome of H. alvei CBA7124 comprised a single circular chromosome with 4,585,298 bp and a GC content of 48.8%. The genome contained 25 rRNA genes (9 5S rRNA genes, 8 16S rRNA genes, and 8 23S rRNA genes), 88 tRNA genes, and 4043 protein-coding genes. Using comparative genomic analysis, the genome of this strain was found to have 72 strain-specific singletons. The genome also contained genes for antibiotic and antimicrobial resistance, as well as toxin-antitoxin systems. We revealed the complete genome sequence of the opportunistic gut pathogen, H. alvei CBA7124. We also performed comparative genomic analysis of the sequences in the genome of H. alvei CBA7124, and found that it contained strain-specific singletons, antibiotic resistance genes, and toxin-antitoxin systems. These results could improve our understanding of the pathogenicity and the mechanism behind the antibiotic resistance of H. alvei strains.

  5. Complete genome sequence of Capnocytophaga ochracea type strain (VPI 2845T)

    Energy Technology Data Exchange (ETDEWEB)

    Mavromatis, Konstantinos; Gronow, Sabine; Saunders, Elizabeth; Land, Miriam; Lapidus, Alla; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Lucas, Susan; Chen, Feng; Tice, Hope; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Pati, Amrita; Ivanova, Natalia; Chen, Amy; Palaniappan, Krishna; Chain, Patrick; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Brettin, Thomas; Detter, John C.; Han, Cliff; Bristow, James; Goker, Markus; Rohde, Manfred; Eisen, Jonathan A.; Markowitz, Victor; Kyrpides, Nikos C.; Klenk, Hans-Peter; Hugenholtz, Philip

    2009-05-20

    Capnocytophaga ochracea (Prevot et al. 1956) Leadbetter et al. 1982 is the type species of the genus Capnocytophaga. It is of interest because of its location in the Flavobacteriaceae, a genomically yet uncharted family within the order Flavobacteriales. The species grows as fusiform to rod shaped cells which tend to form clumps and are able to move by gliding. C. ochracea is known as a capnophilic organism with the ability to grow under anaerobic as well as under aerobic conditions (oxygen concentration larger than 15%), here only in the presence of 5% CO2. Strain VPI 2845T, the type strain of the species, is portrayed in this report as a gliding, Gram-negative bacterium, originally isolated from a human oral cavity. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first completed genome sequence from the flavobacterial genus Capnocytophaga, and the 2,612,925 bp long single replicon genome with its 2193 protein-coding and 59 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  6. The complete mitochondrial genome of Porites harrisoni (Cnidaria: Scleractinia) obtained using next-generation sequencing

    KAUST Repository

    Terraneo, Tullia Isotta; Arrigoni, Roberto; Benzoni, Francesca; Forsman, Zac H.; Berumen, Michael L.

    2018-01-01

    In this study, we sequenced the complete mitochondrial genome of Porites harrisoni using ezRAD and Illumina technology. The genome was 18,630 bp long, with a base composition of 25.92% A, 13.28% T, 23.06% G, and 37.73% C. Consistent with other hard corals, the P. harrisoni mitogenome comprised 13 protein-coding genes, 2 rRNA genes, and 2 tRNA genes. nad5 and cox1 contained embedded Group I introns of 11,133 bp and 965 bp, respectively.

  7. The complete mitochondrial genome of Porites harrisoni (Cnidaria: Scleractinia) obtained using next-generation sequencing

    KAUST Repository

    Terraneo, Tullia Isotta

    2018-02-24

    In this study, we sequenced the complete mitochondrial genome of Porites harrisoni using ezRAD and Illumina technology. The genome was 18,630 bp long, with a base composition of 25.92% A, 13.28% T, 23.06% G, and 37.73% C. Consistent with other hard corals, the P. harrisoni mitogenome comprised 13 protein-coding genes, 2 rRNA genes, and 2 tRNA genes. nad5 and cox1 contained embedded Group I introns of 11,133 bp and 965 bp, respectively.

  8. [Sequence analysis of LEAFY homologous gene from Dendrobium moniliforme and application for identification of medicinal Dendrobium].

    Science.gov (United States)

    Xing, Wen-Rui; Hou, Bei-Wei; Guan, Jing-Jiao; Luo, Jing; Ding, Xiao-Yu

    2013-04-01

    The LEAFY (LFY) homologous gene of Dendrobium moniliforme (L.) Sw. was cloned using new primers designed from the conserved region of known orchid LEAFY gene sequences. A partial LFY homologous gene was cloned by conventional PCR, and the complete LFY homologous gene, DenLFY, was then obtained by Tail-PCR. The complete sequence of the DenLFY gene was 3,575 bp and contained three exons and two introns. BLAST comparison of the exons of LFY homologous genes indicated that the DenLFY gene had high identity with orchid LFY homologs, including the related fragment of PhalLFY (84%) in a Phalaenopsis hybrid cultivar, the LFY homologous gene in Oncidium (90%) and those in other orchids (over 80%). Using MP analysis, Dendrobium was found to be sister to Oncidium and Phalaenopsis. Homology analysis demonstrated that the C-terminal amino acids were highly conserved. When the exons and introns were considered separately, the exons and the amino acid sequence were good markers for functional research on the DenLFY gene. The second intron can be used in authentication research of Dendrobium based on the length polymorphism between Dendrobium moniliforme and Dendrobium officinale.

  9. A complete mitochondrial genome sequence of Asian black bear Sichuan subspecies (Ursus thibetanus mupinensis)

    Science.gov (United States)

    Hou, Wan-ru; Chen, Yu; Wu, Xia; Hu, Jin-chu; Peng, Zheng-song; Yang, Jung; Tang, Zong-xiang; Zhou, Cai-Quan; Li, Yu-ming; Yang, Shi-kui; Du, Yu-jie; Kong, Ling-lu; Ren, Zheng-long; Zhang, Huai-yu; Shuai, Su-rong

    2007-01-01

    We obtained the complete mitochondrial genome of U. thibetanus mupinensis by DNA sequencing of PCR fragments amplified with 18 primers we designed. The results indicate that the mtDNA is 16,868 bp in size and encodes 13 protein genes, 22 tRNA genes, and 2 rRNA genes, with an overall H-strand base composition of 31.2% A, 25.4% C, 15.5% G and 27.9% T. The control region (CR), located between tRNA-Pro and tRNA-Phe, is 1,422 bp in size, accounts for 8.43% of the whole genome, has a GC content of 51.9%, and contains one 6-bp tandem repeat and two 10-bp tandem repeats identified using Tandem Repeats Finder. The U. thibetanus mupinensis mitochondrial genome shares high similarity with those of three other Ursidae: U. americanus (91.46%), U. arctos (89.25%) and U. maritimus (87.66%). PMID:17205108
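
The control-region share quoted in this abstract can be checked with simple arithmetic. The following is a minimal sketch in Python; the function name `percent_of_genome` is a hypothetical helper introduced only for illustration, and the two lengths are the figures stated in the abstract.

```python
# Lengths quoted in the abstract (base pairs).
GENOME_BP = 16_868
CONTROL_REGION_BP = 1_422


def percent_of_genome(region_bp: int, genome_bp: int) -> float:
    """Return the share of the genome occupied by a region, in percent."""
    if genome_bp <= 0:
        raise ValueError("genome length must be positive")
    return 100.0 * region_bp / genome_bp


cr_share = percent_of_genome(CONTROL_REGION_BP, GENOME_BP)
print(f"control region share: {cr_share:.2f}%")
```

Rounded to two decimals, the result agrees with the 8.43% reported for the control region.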

  10. Profiling dehydrin gene sequence and physiological parameters in drought tolerant and susceptible spring wheat cultivars

    International Nuclear Information System (INIS)

    Baloch, M.J.; Jatoi, W.A.

    2012-01-01

    Physiological and yield traits such as stomatal conductance (mmol m⁻² s⁻¹), leaf relative water content (RWC %) and grain yield per plant were studied in a separate experiment. Results revealed that five out of sixteen cultivars, viz. Anmol, Moomal, Sarsabz, Bhitai and Pavan, appeared to be relatively more drought tolerant. Based on the morphophysiological results, studies were continued to examine these cultivars for drought tolerance at the molecular level. Initially, four well-recognized primers for dehydrin genes (DHNs) responsible for drought induction in T. durum L., T. aestivum L. and O. sativa L. were used for profiling the gene sequence of sixteen wheat cultivars. The primers amplified the DHN genes variably: primer WDHN13 (T. aestivum L.) amplified the DHN gene in only seven cultivars, whereas primer TdDHN15 (T. durum L.) amplified it in all sixteen cultivars, with differing DNA banding patterns, some showing a second, weaker DNA band. The third primer, TdDHN16 (T. durum L.), showed an entirely different PCR amplification pattern, notably two strong DNA bands, while the fourth primer, RAB16C (O. sativa L.), failed to amplify the DHN gene in any of the cultivars. Examination of the DNA sequences revealed several interesting features. First, it identified the two exon/one intron structure of this gene (complete sequences were not shown), a feature not previously described in the two database cDNA sequences available from T. aestivum L. (gi|21850). Secondly, the analysis identified several single nucleotide polymorphism (SNP) positions in the gene sequence. Although the complete gene sequence was not obtained for all the cultivars, there were a total of 38 variable positions in the exonic (coding region) sequence, from a total gene length of 453 nucleotides. The SNP matrix shows these 37 positions with the individual sequence at each position given for each of the 14 cultivars (the sequence of two cultivars was not obtained) included in this analysis. It demonstrated a considerable

  11. Planarian homeobox genes: cloning, sequence analysis, and expression.

    Science.gov (United States)

    Garcia-Fernàndez, J; Baguñà, J; Saló, E

    1991-01-01

    Freshwater planarians (Platyhelminthes, Turbellaria, and Tricladida) are acoelomate, triploblastic, unsegmented, and bilaterally symmetrical organisms that are mainly known for their ample power to regenerate a complete organism from a small piece of their body. To identify potential pattern-control genes in planarian regeneration, we have isolated two homeobox-containing genes, Dth-1 and Dth-2 [Dugesia (Girardia) tigrina homeobox], by using degenerate oligonucleotides corresponding to the most conserved amino acid sequence from helix-3 of the homeodomain. Dth-1 and Dth-2 homeodomains are closely related (68% at the nucleotide level and 78% at the protein level) and show the conserved residues characteristic of the homeodomains identified to date. Similarity with most homeobox sequences is low (30-50%), except with Drosophila NK homeodomains (80-82% with NK-2) and the rodent TTF-1 homeodomain (77-87%). Some unusual amino acid residues specific to NK-2, TTF-1, Dth-1, and Dth-2 can be observed in the recognition helix (helix-3) and may define a family of homeodomains. The deduced amino acid sequences from the cDNAs contain, in addition to the homeodomain, other domains also present in various homeobox-containing genes. The expression of both genes, detected by Northern blot analysis, appears slightly higher in cephalic regions than in the rest of the intact organism, while a slight increase is detected in the central period (5 days) of regeneration. PMID:1714599

  12. Interspecific Comparison and annotation of two complete mitochondrial genome sequences from the plant pathogenic fungus Mycosphaerella graminicola

    Energy Technology Data Exchange (ETDEWEB)

    Millenbaugh, Bonnie A; Pangilinan, Jasmyn L.; Torriani, Stefano F.F.; Goodwin, Stephen B.; Kema, Gert H.J.; McDonald, Bruce A.

    2007-12-07

    The mitochondrial genomes of two isolates of the wheat pathogen Mycosphaerella graminicola were sequenced completely and compared to identify polymorphic regions. This organism is of interest because it is phylogenetically distant from other fungi with sequenced mitochondrial genomes and it has shown discordant patterns of nuclear and mitochondrial diversity. The mitochondrial genome of M. graminicola is a circular molecule of approximately 43,960 bp containing the typical genes coding for 14 proteins related to oxidative phosphorylation, one RNA polymerase, two rRNA genes and a set of 27 tRNAs. The mitochondrial DNA of M. graminicola lacks the gene encoding the putative ribosomal protein (rps5-like), commonly found in fungal mitochondrial genomes. Most of the tRNA genes were clustered with a gene order conserved with many other ascomycetes. A sample of thirty-five additional strains representing the known global mt diversity was partially sequenced to measure overall mitochondrial variability within the species. Little variation was found, confirming previous RFLP-based findings of low mitochondrial diversity. The mitochondrial sequence of M. graminicola is the first reported from the family Mycosphaerellaceae or the order Capnodiales. The sequence also provides a tool to better understand the development of fungicide resistance and the conflicting pattern of high nuclear and low mitochondrial diversity in global populations of this fungus.

  13. Complete genome sequence of Truepera radiovictrix type strain (RQ-24).

    Science.gov (United States)

    Ivanova, Natalia; Rohde, Christine; Munk, Christine; Nolan, Matt; Lucas, Susan; Del Rio, Tijana Glavina; Tice, Hope; Deshpande, Shweta; Cheng, Jan-Fang; Tapia, Roxane; Han, Cliff; Goodwin, Lynne; Pitluck, Sam; Liolios, Konstantinos; Mavromatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Brambilla, Evelyne; Rohde, Manfred; Göker, Markus; Tindall, Brian J; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Lapidus, Alla

    2011-02-22

    Truepera radiovictrix Albuquerque et al. 2005 is the type species of the genus Truepera within the phylum "Deinococcus/Thermus". T. radiovictrix is of special interest not only because of its isolated phylogenetic location in the order Deinococcales, but also because of its ability to grow under multiple extreme conditions in alkaline, moderately saline, and high temperature habitats. Of particular interest is the fact that T. radiovictrix is also remarkably resistant to ionizing radiation, a feature it shares with members of the genus Deinococcus. This is the first completed genome sequence of a member of the family Trueperaceae and the fourth type strain genome sequence from a member of the order Deinococcales. The 3,260,398 bp long genome with its 2,994 protein-coding and 52 RNA genes consists of one circular chromosome and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  14. Complete genome sequence of Tolumonas auensis type strain (TA 4T)

    Energy Technology Data Exchange (ETDEWEB)

    Chertkov, Olga; Copeland, Alex; Lucas, Susan; Lapidus, Alla; Berry, Kerrie W.; Detter, John C.; Glavina Del Rio, Tijana; Hammon, Nancy; Dalin, Eileen; Tice, Hope; Pitluck, Sam; Richardson, Paul; Bruce, David; Goodwin, Lynne; Han, Cliff; Tapia, Roxanne; Saunders, Elizabeth; Schmutz, Jeremy; Brettin, Thomas; Larimer, Frank; Land, Miriam; Hauser, Loren; Spring, Stefan; Rohde, Manfred; Kyrpides, Nikos C.; Ivanova, Natalia; Göker, Markus; Beller, Harry R.; Klenk, Hans-Peter; Woyke, Tanja

    2011-10-04

    Tolumonas auensis (Fischer-Romero et al. 1996) is currently the only validly named species of the genus Tolumonas in the family Aeromonadaceae. The strain is of interest because of its ability to produce toluene from phenylalanine and other phenyl precursors, as well as phenol from tyrosine. This is of interest because toluene is normally considered to be a tracer of anthropogenic pollution in lakes, but T. auensis represents a biogenic source of toluene. Other than Aeromonas hydrophila subsp. hydrophila, T. auensis strain TA 4T is the only other member in the family Aeromonadaceae with a completely sequenced type-strain genome. The 3,471,292-bp chromosome with a total of 3,288 protein-coding and 116 RNA genes was sequenced as part of the DOE Joint Genome Institute Program JBEI 2008.

  15. Complete genome sequence of Tsukamurella paurometabola type strain (no. 33T)

    Energy Technology Data Exchange (ETDEWEB)

    Munk, Christine [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Huntemann, Marcel [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Brettin, Thomas S [ORNL; Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. 
Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany

    2011-01-01

    Tsukamurella paurometabola corrig. (Steinhaus 1941) Collins et al. 1988 is the type species of the genus Tsukamurella, which is the type genus to the family Tsukamurellaceae. The species is not only of interest because of its isolated phylogenetic location, but also because it is a human opportunistic pathogen with some strains of the species reported to cause lung infection, lethal meningitis, and necrotizing tenosynovitis. This is the first completed genome sequence of a member of the genus Tsukamurella and the first genome sequence of a member of the family Tsukamurellaceae. The 4,479,724 bp long genome contains a 99,806 bp long plasmid and a total of 4,335 protein-coding and 56 RNA genes, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  16. Complete genome sequence of Tolumonas auensis type strain (TA 4T)

    Energy Technology Data Exchange (ETDEWEB)

    Chertkov, Olga [Los Alamos National Laboratory (LANL); Copeland, A [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Berry, Alison M [California Institute of Technology, University of California, Davis; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Hammon, Nancy [U.S. Department of Energy, Joint Genome Institute; Dalin, Eileen [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Richardson, P M [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Schmutz, Jeremy [Stanford University; Brettin, Thomas S [ORNL; Larimer, Frank W [ORNL; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Spring, Stefan [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Beller, Harry R. [Lawrence Berkeley National Laboratory (LBNL); Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute

    2011-01-01

    Tolumonas auensis Fischer-Romero et al. 1996 is currently the only validly named species of the genus Tolumonas in the family Aeromonadaceae. The strain is of interest because of its ability to produce toluene from phenylalanine and other phenyl precursors, as well as phenol from tyrosine. This is of interest because toluene is normally considered to be a tracer of anthropogenic pollution in lakes, but T. auensis represents a biogenic source of toluene. Other than Aeromonas hydrophila subsp. hydrophila, T. auensis strain TA 4T is the only other member in the family Aeromonadaceae with a completely sequenced type-strain genome. The 3,471,292 bp chromosome with a total of 3,288 protein-coding and 116 RNA genes was sequenced as part of the DOE Joint Genome Institute Program JBEI 2008.

  17. Population genetic implications from sequence variation in four Y chromosome genes.

    Science.gov (United States)

    Shen, P; Wang, F; Underhill, P A; Franco, C; Yang, W H; Roxas, A; Sung, R; Lin, A A; Hyman, R W; Vollrath, D; Davis, R W; Cavalli-Sforza, L L; Oefner, P J

    2000-06-20

    Some insight into human evolution has been gained from the sequencing of four Y chromosome genes. Primary genomic sequencing determined gene SMCY to be composed of 27 exons that comprise 4,620 bp of coding sequence. The unfinished sequencing of the 5' portion of gene UTY1 was completed by primer walking, and a total of 20 exons were found. By using denaturing HPLC, these two genes, as well as DBY and DFFRY, were screened for polymorphic sites in 53-72 representatives of the five continents. A total of 98 variants were found, yielding nucleotide diversity estimates of 2.45 × 10⁻⁵, 5.07 × 10⁻⁵, and 8.54 × 10⁻⁵ for the coding regions of SMCY, DFFRY, and UTY1, respectively, with no variant having been observed in DBY. In agreement with most autosomal genes, diversity estimates for the noncoding regions were about 2- to 3-fold higher and ranged from 9.16 × 10⁻⁵ to 14.2 × 10⁻⁵ for the four genes. Analysis of the frequencies of derived alleles for all four genes showed that they more closely fit the expectation of a Luria-Delbrück distribution than a distribution expected under a constant population size model, providing evidence for exponential population growth. Pairwise nucleotide mismatch distributions date the occurrence of population expansion to approximately 28,000 years ago. This estimate is in accord with the spread of Aurignacian technology and the disappearance of the Neanderthals.
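
For readers unfamiliar with nucleotide diversity, the quantity can be illustrated as the mean number of pairwise differences per site among aligned sequences. The sketch below is in Python and assumes that standard definition; the abstract does not state which estimator the authors used, and the function name and toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Mean pairwise differences per site for equal-length aligned sequences."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(aligned, 2))
    # Count mismatching sites over every unordered pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical toy alignment: three sequences, four sites, one variant site.
toy_alignment = ["AAAA", "AAAT", "AAAA"]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.4f}")
```

Values on the order of 10⁻⁵, as reported in the abstract, correspond to roughly one difference per several tens of thousands of sites between a random pair of chromosomes.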

  18. Complete genome sequence of Spirochaeta smaragdinae type strain (SEBR 4228T)

    Energy Technology Data Exchange (ETDEWEB)

    Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Chertkov, Olga [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Bruce, David [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Spring, Stefan [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. 
Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute

    2010-01-01

    Spirochaeta smaragdinae Magot et al. 1998 belongs to the family Spirochaetaceae. The species is a Gram-negative, motile, obligately halophilic and strictly anaerobic bacterium, which is of interest because it is able to ferment numerous polysaccharides. S. smaragdinae is the only species of the family Spirochaetaceae known to reduce thiosulfate or elemental sulfur to sulfide. This is the first complete genome sequence in the family Spirochaetaceae. The 4,653,970 bp long genome with its 4,363 protein-coding and 57 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  19. The Complete Chloroplast and Mitochondrial Genome Sequences of Boea hygrometrica: Insights into the Evolution of Plant Organellar Genomes

    Science.gov (United States)

    Wang, Xumin; Deng, Xin; Zhang, Xiaowei; Hu, Songnian; Yu, Jun

    2012-01-01

    The complete nucleotide sequences of the chloroplast (cp) and mitochondrial (mt) genomes of the resurrection plant Boea hygrometrica (Bh, Gesneriaceae) have been determined, with lengths of 153,493 bp and 510,519 bp, respectively. The smaller chloroplast genome contains more genes (147) with a 72% coding sequence, and the larger mitochondrial genome has fewer genes (65) with a coding fraction of 12%. Similar to other seed plants, the Bh cp genome has a typical quadripartite organization with a conserved gene in each region. The Bh mt genome has three recombinant sequence repeats of 222 bp, 843 bp, and 1474 bp in length, which divide the genome into a single master circle (MC) and four isomeric molecules. Compared to other angiosperms, one remarkable feature of the Bh mt genome is the frequent transfer of genetic material from the cp genome during recent Bh evolution. We also analyzed organellar genome evolution in general regarding genome features as well as compositional dynamics of sequence and gene structure/organization, providing clues for the understanding of the evolution of organellar genomes in plants. The cp-derived sequences including tRNAs found in angiosperm mt genomes support the conclusion that frequent gene transfer events may have begun early in the land plant lineage. PMID:22291979

  20. Complete mitochondrial genome sequence of the lined seahorse Hippocampus erectus Perry, 1810 (Gasterosteiformes: Syngnathidae).

    Science.gov (United States)

    Zhang, Yanhong; Zhang, Huixian; Lin, Qiang; Huang, Liangmin

    2015-01-01

    The complete mitochondrial genome sequence of the lined seahorse Hippocampus erectus is reported here for the first time. The total length of the H. erectus mitogenome is 16,529 bp, and it consists of 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and 1 control region. The features of the H. erectus mitochondrial genome are similar to those of typical vertebrates. The overall base composition of the H. erectus mitogenome is 31.8% A, 28.6% T, 24.3% C and 15.3% G, with a slight A + T bias (60.4%).
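
Base-composition figures such as those above are obtained by counting each nucleotide and dividing by the sequence length. The following is a minimal sketch in Python; the helper names and the ten-base toy sequence are hypothetical, and only the four reported percentages come from the abstract.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Percentage of A, C, G and T among the unambiguous bases of seq."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def at_content(composition: dict[str, float]) -> float:
    """A + T share in percent, given a composition in percent."""
    return composition["A"] + composition["T"]


# Hypothetical toy sequence, not taken from the mitogenome.
toy_comp = base_composition("ATGCATTAAC")

# Percentages reported in the abstract for H. erectus.
reported = {"A": 31.8, "T": 28.6, "C": 24.3, "G": 15.3}
print(f"reported A+T content: {at_content(reported):.1f}%")
```

Summing the reported A and T percentages reproduces the 60.4% A + T figure given in the abstract.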

  1. Streptococcus iniae SF1: complete genome sequence, proteomic profile, and immunoprotective antigens.

    Directory of Open Access Journals (Sweden)

    Bao-cun Zhang

    Streptococcus iniae is a Gram-positive bacterium that is reckoned one of the most severe aquaculture pathogens. It has a broad host range among farmed marine and freshwater fish and can also cause zoonotic infection in humans. Here we report for the first time the complete genome sequence as well as the host factor-induced proteomic profile of a pathogenic S. iniae strain, SF1, a serotype I isolate from diseased fish. SF1 possesses a single chromosome of 2,149,844 base pairs, which contains 2,125 predicted protein coding sequences (CDS), 12 rRNA genes, and 45 tRNA genes. Among the protein-encoding CDS are genes involved in resource acquisition and utilization, signal sensing and transduction, carbohydrate metabolism, and defense against host immune response. Potential virulence genes include those encoding adhesins, autolysins, toxins, exoenzymes, and proteases. In addition, two putative prophages and a CRISPR-Cas system were found in the genome, the latter containing a CRISPR locus and four cas genes. Proteomic analysis detected 21 secreted proteins whose expressions were induced by host serum. Five of the serum-responsive proteins were subjected to immunoprotective analysis, which revealed that two of the proteins were highly protective against lethal S. iniae challenge when used as purified recombinant subunit vaccines. Taken together, these results provide an important molecular basis for future study of S. iniae in various aspects, in particular those related to pathogenesis and disease control.

  2. Streptococcus iniae SF1: Complete Genome Sequence, Proteomic Profile, and Immunoprotective Antigens

    Science.gov (United States)

    Zhang, Bao-cun; Zhang, Jian; Sun, Li

    2014-01-01

    Streptococcus iniae is a Gram-positive bacterium that is reckoned one of the most severe aquaculture pathogens. It has a broad host range among farmed marine and freshwater fish and can also cause zoonotic infection in humans. Here we report for the first time the complete genome sequence as well as the host factor-induced proteomic profile of a pathogenic S. iniae strain, SF1, a serotype I isolate from diseased fish. SF1 possesses a single chromosome of 2,149,844 base pairs, which contains 2,125 predicted protein coding sequences (CDS), 12 rRNA genes, and 45 tRNA genes. Among the protein-encoding CDS are genes involved in resource acquisition and utilization, signal sensing and transduction, carbohydrate metabolism, and defense against host immune response. Potential virulence genes include those encoding adhesins, autolysins, toxins, exoenzymes, and proteases. In addition, two putative prophages and a CRISPR-Cas system were found in the genome, the latter containing a CRISPR locus and four cas genes. Proteomic analysis detected 21 secreted proteins whose expressions were induced by host serum. Five of the serum-responsive proteins were subjected to immunoprotective analysis, which revealed that two of the proteins were highly protective against lethal S. iniae challenge when used as purified recombinant subunit vaccines. Taken together, these results provide an important molecular basis for future study of S. iniae in various aspects, in particular those related to pathogenesis and disease control. PMID:24621602

  3. Assembly and comparative analysis of complete mitochondrial genome sequence of an economic plant Salix suchowensis

    Directory of Open Access Journals (Sweden)

    Ning Ye

    2017-03-01

    Willow is a widely used dioecious woody plant of the Salicaceae family in China. Due to their high biomass yields, willows are promising sources for bioenergy crops. In this study, we assembled the complete mitochondrial (mt) genome sequence of S. suchowensis, with a length of 644,437 bp, using Roche-454 GS FLX Titanium sequencing technology. The base composition of the S. suchowensis mt genome is A (27.43%), T (27.59%), C (22.34%), and G (22.64%), giving a GC content similar to that of other angiosperms. This long circular mt genome encodes 58 unique genes (32 protein-coding genes, 23 tRNA genes and 3 rRNA genes), and 9 of the 32 protein-coding genes contain 17 introns. Phylogenetic analysis of 35 species based on 23 protein-coding genes supports Salix as sister to Populus. With the detailed phylogenetic information and the identified phylogenetic position, some ribosomal protein genes and succinate dehydrogenase genes are found to have been frequently lost during evolution. As S. suchowensis is a native shrub willow species, this study of its mt genome will provide useful information for better understanding genomic breeding and the missing pieces of sex determination evolution in the future.

  4. Complete genome sequence of Halanaerobium praevalens type strain (GSLT)

    Energy Technology Data Exchange (ETDEWEB)

    Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Chertkov, Olga [Los Alamos National Laboratory (LANL); Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Hammon, Nancy [U.S. Department of Energy, Joint Genome Institute; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Huntemann, Marcel [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kannan, K. Palani [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Tindall, Brian [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Detter, J. Chris [U.S. 
Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute

    2011-01-01

    Halanaerobium praevalens Zeikus et al. 1984 is the type species of the genus Halanaerobium, which in turn is the type genus of the family Halanaerobiaceae. The species is of interest because it is able to reduce a variety of nitro-substituted aromatic compounds at a high rate, and because of its ability to degrade organic pollutants. The strain is also of interest because it functions as a hydrolytic bacterium, fermenting complex organic matter and producing intermediary metabolites for other trophic groups such as sulfate-reducing and methanogenic bacteria. It is further reported as being involved in carbon removal in the Great Salt Lake, its source of isolation. This is the first completed genome sequence of a representative of the genus Halanaerobium and the second genome sequence from a type strain of the family Halanaerobiaceae. The 2,309,262 bp long genome with its 2,110 protein-coding and 70 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  5. Complete Genome Sequence of the Endophytic Biocontrol Strain Bacillus velezensis CC09.

    Science.gov (United States)

    Cai, Xunchao; Kang, Xingxing; Xi, Huan; Liu, Changhong; Xue, Yarong

    2016-09-29

    Bacillus velezensis is a heterotypic synonym of B. methylotrophicus, B. amyloliquefaciens subsp. plantarum, and Bacillus oryzicola, and has been used to control plant fungal diseases. To fully understand the genetic basis of its antimicrobial capacities, we sequenced the complete genome of the endophytic B. velezensis strain CC09. Genes tightly associated with biocontrol ability, including those for nonribosomal peptide synthetases, polyketide synthetases, iron acquisition, colonization, and volatile organic compound synthesis, were identified in the genome. Copyright © 2016 Cai et al.

  6. Complete genome sequence of hypervirulent and outbreak-associated Acinetobacter baumannii strain LAC-4: epidemiology, resistance genetic determinants and potential virulence factors

    Science.gov (United States)

    Ou, Hong-Yu; Kuang, Shan N.; He, Xinyi; Molgora, Brenda M.; Ewing, Peter J.; Deng, Zixin; Osby, Melanie; Chen, Wangxue; Xu, H. Howard

    2015-01-01

    Acinetobacter baumannii is an important human pathogen due to its multi-drug resistance. In this study, the genome of an ST10 outbreak A. baumannii isolate, LAC-4, was completely sequenced to better understand its epidemiology, antibiotic resistance genetic determinants and potential virulence factors. Compared with 20 other complete genomes of A. baumannii, the LAC-4 genome harbors at least 12 copies of five distinct insertion sequences. It contains 12 and 14 copies of two novel IS elements, ISAba25 and ISAba26, respectively. Additionally, three novel composite transposons were identified: Tn6250, Tn6251 and Tn6252, two of which contain resistance genes. The antibiotic resistance genetic determinants on the LAC-4 genome correlate well with observed antimicrobial susceptibility patterns. Moreover, twelve genomic islands (GIs) were identified in the LAC-4 genome. Among them, the 33.4-kb GI12 contains a large number of genes which constitute the K (capsule) locus. LAC-4 harbors several unique putative virulence factor loci. Furthermore, LAC-4 and all 19 other outbreak isolates were found to harbor a heme oxygenase gene (hemO)-containing gene cluster. The sequencing of the first complete genome of an ST10 A. baumannii clinical strain should accelerate our understanding of the epidemiology, mechanisms of resistance and virulence of A. baumannii. PMID:25728466

  7. Molecular phylogeny of Japanese Rhinolophidae based on variations in the complete sequence of the mitochondrial cytochrome b gene.

    Science.gov (United States)

    Sakai, Takahiro; Kikkawa, Yoshiaki; Tsuchiya, Kimiyuki; Harada, Masashi; Kanoe, Masamitsu; Yoshiyuki, Mizuko; Yonekawa, Hiromichi

    2003-04-01

    Microchiroptera have diversified into many species whose size and the shapes of the complicated ear and nose have been adapted to their echolocation abilities. Their speciation processes, and intra- and interspecies relationships are still under discussion. Here we report on the geographical variation of Japanese Rhinolophus ferrumequinum and R. cornutus using the complete sequence of the mitochondrial cytochrome b gene to clarify the phylogenetic positions of the 2 species as well as that of Rhinolophidae within the Microchiroptera. We have found that sequence divergence values within each of the 2 species are unexpectedly low (0.07%-0.94%). We have also found that there is no local specificity of their mtCytb alleles. On the other hand, the divergence values for Japanese Microchiroptera (12.7%-16.6%) are much higher than those for other mammalian genera. Similarly, the values among five genera of Vespertilionidae were 20.5%-27.3%. Phylogenetic analysis shows that the 2 species of family Rhinolophidae in the suborder Microchiroptera belong to the Megachiroptera cluster in the constructed maximum parsimony tree. These results suggest that the speciation of Rhinolophidae involved its divergence as an independent lineage from other Microchiroptera, and other microbats might be paraphyletic. In addition, the tree also shows that the order Chiroptera is monophyletic, and the closest group to Chiroptera is the ungulates.
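
    The abstract above reports percent sequence divergence but does not say which distance measure was used. As a rough illustration only, the sketch below computes the simplest such measure, the uncorrected pairwise distance (p-distance), between two pre-aligned sequences. The function name `p_distance` and the toy sequences are hypothetical and are not taken from the study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected percent divergence between two aligned DNA sequences.

    Assumption: the inputs are already aligned and of equal length.
    Columns where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differing / compared


# Toy example (made-up sequences): one mismatch in ten compared sites.
print(p_distance("ACGTACGTAC", "ACGTACGTAT"))
```

    Published divergence values are often corrected for multiple substitutions (for example with a Kimura two-parameter model), so figures from a sketch like this may differ slightly from reported ones.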

  8. Complete Genome Sequence of the Gamma-Aminobutyric Acid-Producing Strain Streptococcus thermophilus APC151.

    Science.gov (United States)

    Linares, Daniel M; Arboleya, Silvia; Ross, R Paul; Stanton, Catherine

    2017-04-27

    Here is presented the whole-genome sequence of Streptococcus thermophilus APC151, isolated from a marine fish. This bacterium produces gamma-aminobutyric acid (GABA) in high yields and is biotechnologically suitable to produce naturally GABA-enriched biofunctional yogurt. Its complete genome comprises 2,097 genes and 1,839,134 nucleotides, with an average G+C content of 39.1%. Copyright © 2017 Linares et al.
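
    The G+C content quoted above is the percentage of guanine and cytosine among the bases of the genome. A minimal sketch of that calculation follows; the function name `gc_percent` and the toy sequence are hypothetical, and skipping ambiguous bases such as N is an assumption rather than something the record states.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases of a DNA sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Toy example (made-up sequence): 4 of 10 bases are G or C.
print(gc_percent("GGCCAATTAT"))
```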

  9. Phylogenetic relationships among amphisbaenian reptiles based on complete mitochondrial genomic sequences.

    Science.gov (United States)

    Macey, J Robert; Papenfuss, Theodore J; Kuehl, Jennifer V; Fourcade, H Mathew; Boore, Jeffrey L

    2004-10-01

    Complete mitochondrial genomic sequences are reported from 12 members in the four families of the reptile group Amphisbaenia. Analysis of 11,946 aligned nucleotide positions (5797 informative) produces a robust phylogenetic hypothesis. The family Rhineuridae is basal and Bipedidae is the sister taxon to the Amphisbaenidae plus Trogonophidae. Amphisbaenian reptiles are surprisingly old, predating the breakup of Pangaea 200 million years before present, because successive basal taxa (Rhineuridae and Bipedidae) are situated in tectonic regions of Laurasia and nested taxa (Amphisbaenidae and Trogonophidae) are found in Gondwanan regions. Thorough sampling within the Bipedidae shows that it is not tectonic movement of Baja California away from the Mexican mainland that is primary in isolating Bipes species, but rather that primary vicariance occurred between northern and southern groups. Amphisbaenian families show parallel reduction in number of limbs and Bipes species exhibit parallel reduction in number of digits. A measure is developed for comparing the phylogenetic information content of various genes. A synapomorphic trait defining the Bipedidae is a shift from the typical vertebrate mitochondrial gene arrangement to the derived state of trnE and nad6. In addition, a tandem duplication of trnT and trnP is observed in Bipes biporus with a pattern of pseudogene formation that varies among populations. The first case of convergent rearrangement of the mitochondrial genome among animals demonstrated by complete genomic sequences is reported. Relative to most vertebrates, the Rhineuridae has the block nad6, trnE switched in order with the block cob, trnT, trnP, as they are in birds.

  10. Phylogenetic relationships among amphisbaenian reptiles based on complete mitochondrial genomic sequences

    Energy Technology Data Exchange (ETDEWEB)

    Macey, J. Robert; Papenfuss, Theodore J.; Kuehl, Jennifer V.; Fourcade, H. Matthew; Boore, Jeffrey L.

    2004-05-19

    Complete mitochondrial genomic sequences are reported from 12 members in the four families of the reptile group Amphisbaenia. Analysis of 11,946 aligned nucleotide positions (5,797 informative) produces a robust phylogenetic hypothesis. The family Rhineuridae is basal and Bipedidae is the sister taxon to the Amphisbaenidae plus Trogonophidae. Amphisbaenian reptiles are surprisingly old, predating the breakup of Pangaea 200 million years before present, because successive basal taxa (Rhineuridae and Bipedidae) are situated in tectonic regions of Laurasia and nested taxa (Amphisbaenidae and Trogonophidae) are found in Gondwanan regions. Thorough sampling within the Bipedidae shows that it is not tectonic movement of Baja California away from the Mexican mainland that is primary in isolating Bipes species, but rather that primary vicariance occurred between northern and southern groups. Amphisbaenian families show parallel reduction in number of limbs and Bipes species exhibit parallel reduction in number of digits. A measure is developed for comparing the phylogenetic information content of various genes. A synapomorphic trait defining the Bipedidae is a shift from the typical vertebrate mitochondrial gene arrangement to the derived state of trnE and nad6. In addition, a tandem duplication of trnT and trnP is observed in B. biporus with a pattern of pseudogene formation that varies among populations. The first case of convergent rearrangement of the mitochondrial genome among animals demonstrated by complete genomic sequences is reported. Relative to most vertebrates, the Rhineuridae has the block nad6, trnE switched in order with cob, trnT, trnP, as they are in birds.

  11. Recovery and evolutionary analysis of complete integron gene cassette arrays from Vibrio

    Directory of Open Access Journals (Sweden)

    Gillings Michael R

    2006-01-01

    Background: Integrons are genetic elements capable of the acquisition, rearrangement and expression of genes contained in gene cassettes. Gene cassettes generally consist of a promoterless gene associated with a recombination site known as a 59-base element (59-be). Multiple insertion events can lead to the assembly of large integron-associated cassette arrays. The most striking examples are found in Vibrio, where such cassette arrays are widespread and can range from 30 kb to 150 kb. Besides those found in completely sequenced genomes, no such array has yet been recovered in its entirety. We describe an approach to systematically isolate, sequence and annotate large integron gene cassette arrays from bacterial strains. Results: The complete Vibrio sp. DAT722 integron cassette array was determined through the streamlined approach described here. To place it in an evolutionary context, we compared the DAT722 array to known vibrio arrays and performed phylogenetic analyses for all of its components (integrase, 59-be sites, gene cassette encoded genes). It differs extensively in terms of genomic context as well as gene cassette content and organization. The phylogenetic tree of the 59-be sites collectively found in the Vibrio gene cassette pool suggests frequent transfer of cassettes within and between Vibrio species, with slower transfer rates between more phylogenetically distant relatives. We also identify multiple cases where non-integron chromosomal genes seem to have been assembled into gene cassettes and others where cassettes have been inserted into chromosomal locations outside integrons. Conclusion: Our systematic approach greatly facilitates the isolation and annotation of large integron gene cassette arrays. Comparative analysis of the Vibrio sp. DAT722 integron obtained through this approach to those found in other vibrios confirms the role of this genetic element in promoting lateral gene transfer and suggests a high rate of gene

  12. Complete Sequence of p07-406, a 24,179-base-pair plasmid harboring the blaVIM-7 metallo-beta-lactamase gene in a Pseudomonas aeruginosa isolate from the United States.

    Science.gov (United States)

    Li, Hongyang; Toleman, Mark A; Bennett, Peter M; Jones, Ronald N; Walsh, Timothy R

    2008-09-01

    An outbreak involving a Pseudomonas aeruginosa strain that was resistant to all tested antimicrobials except polymyxin B occurred in a hospital in Houston, TX. Previous studies on this strain showed that it possesses a novel mobile metallo-beta-lactamase (MBL) gene, designated bla(VIM-7), located on a plasmid (p07-406). Here, we report the complete sequence, annotation, and functional characterization of this plasmid. p07-406 is 24,179 bp in length, and 29 open reading frames were identified related to known or putatively recognized proteins. Analysis of this plasmid showed it to be comprised of four distinct regions: (i) a region of 5,200 bp having a Tn501-like mercuric resistance (mer) transposon upstream of the replication region; (ii) a Tn3-like transposon carrying a truncated integron with a bla(VIM-7) gene and an insertion sequence inserted at the other end of this transposon; (iii) a region of four genes, upstream of the Tn3-like transposon, possessing very high similarity to plasmid pXcB from Xanthomonas campestris pv. citri, commonly associated with plants; (iv) a backbone sequence similar to the backbone structure of the IncP group plasmids Rms149, pB10, and R751. This is the first plasmid to be sequenced carrying an MBL gene and highlights the amelioration of DNA segments from disparate origins, most noticeably from plant pathogens.

  13. Chromosomal location and nucleotide sequence of the Escherichia coli dapA gene.

    OpenAIRE

    Richaud, F; Richaud, C; Ratet, P; Patte, J C

    1986-01-01

    In Escherichia coli, the first enzyme of the diaminopimelate and lysine pathway is dihydrodipicolinate synthetase, which is feedback-inhibited by lysine and encoded by the dapA gene. The location of the dapA gene on the bacterial chromosome has been determined accurately with respect to the neighboring purC and dapE genes. The complete nucleotide sequence and the transcriptional start of the dapA gene were determined. The results show that dapA consists of a single cistron encoding a 292-amin...

  14. Complete genome sequence of Capnocytophaga ochracea type strain (VPI 2845T)

    Energy Technology Data Exchange (ETDEWEB)

    Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Gronow, Sabine [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL)]; Land, Miriam L [ORNL]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Copeland, A [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Chen, Feng [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL)]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Brettin, Thomas S [ORNL]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]

    2009-01-01

    Capnocytophaga ochracea (Prévot et al. 1956) Leadbetter et al. 1982 is the type species of the genus Capnocytophaga. It is of interest because of its location in the Flavobacteriaceae, a genomically not yet charted family within the order Flavobacteriales. The species grows as fusiform to rod shaped cells which tend to form clumps and are able to move by gliding. C. ochracea is known as a capnophilic (CO2-requiring) organism with the ability to grow under anaerobic as well as aerobic conditions (oxygen concentration larger than 15%), here only in the presence of 5% CO2. Strain VPI 2845T, the type strain of the species, is portrayed in this report as a gliding, Gram-negative bacterium, originally isolated from a human oral cavity. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first completed genome sequence from the flavobacterial genus Capnocytophaga, and the 2,612,925 bp long single replicon genome with its 2193 protein-coding and 59 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  15. Combinatorial Pooling Enables Selective Sequencing of the Barley Gene Space

    Science.gov (United States)

    Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R.; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J.

    2013-01-01

    For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered due to the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding. PMID:23592960
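
    The general idea of pooling and deconvolution described above can be sketched in a few lines. This is a toy illustration only and not the authors' pooling design or software: each clone is placed in a distinct combination of pools (its signature), and a read is assigned to every clone whose signature is contained in the set of pools where the read was observed. All names (`assign_signatures`, `deconvolve`, the clone labels) and the pool sizes are hypothetical.

```python
from itertools import combinations
from math import comb


def assign_signatures(clones, n_pools, pools_per_clone):
    """Give each clone a distinct set of pools (its signature)."""
    if comb(n_pools, pools_per_clone) < len(clones):
        raise ValueError("not enough distinct pool combinations for all clones")
    pool_sets = combinations(range(n_pools), pools_per_clone)
    return {clone: frozenset(pools) for clone, pools in zip(clones, pool_sets)}


def deconvolve(observed_pools, signatures):
    """Return the clones whose whole signature lies within the observed pools.

    More than one clone can be returned, e.g. when overlapping clones
    share the sequence the read came from.
    """
    observed = frozenset(observed_pools)
    return [clone for clone, sig in signatures.items() if sig <= observed]


# Three hypothetical clones spread over 4 pools, 2 pools per clone.
signatures = assign_signatures(["BAC_A", "BAC_B", "BAC_C"], n_pools=4, pools_per_clone=2)
print(signatures)
print(deconvolve([0, 2], signatures))      # read seen in pools 0 and 2
print(deconvolve([0, 1, 2], signatures))   # read seen in three pools
```

    A real design would also need to tolerate sequencing errors and reads missing from some pools, which this sketch ignores.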

  16. Combinatorial pooling enables selective sequencing of the barley gene space.

    Directory of Open Access Journals (Sweden)

    Stefano Lonardi

    2013-04-01

    For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered due to the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.

  17. Combinatorial pooling enables selective sequencing of the barley gene space.

    Science.gov (United States)

    Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J

    2013-04-01

    For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered due to the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.

  18. Complete Chloroplast Genome Sequences and Comparative Analysis of Chenopodium quinoa and C. album.

    Science.gov (United States)

    Hong, Su-Young; Cheon, Kyeong-Sik; Yoo, Ki-Oug; Lee, Hyun-Oh; Cho, Kwang-Soo; Suh, Jong-Taek; Kim, Su-Jeong; Nam, Jeong-Hwan; Sohn, Hwang-Bae; Kim, Yul-Ho

    2017-01-01

    The Chenopodium genus comprises ~150 species, including Chenopodium quinoa and Chenopodium album, two important crops with high nutritional value. To elucidate the phylogenetic relationship between the two species, the complete chloroplast (cp) genomes of these species were obtained by next generation sequencing. We performed comparative analysis of the sequences and, using InDel markers, inferred phylogeny and genetic diversity of the Chenopodium genus. The cp genome is 152,099 bp (C. quinoa) and 152,167 bp (C. album) long. In total, 119 genes (78 protein-coding, 37 tRNA, and 4 rRNA) were identified. We found 14 (C. quinoa) and 15 (C. album) tandem repeats (TRs); 14 TRs were present in both species and C. album and C. quinoa each had one species-specific TR. The trnI-GAU intron sequences contained one (C. quinoa) or two (C. album) copies of TRs (66 bp); the InDel marker was designed based on the copy number variation in TRs. Using the InDel markers, we detected this variation in the TR copy number in four species, Chenopodium hybridum, Chenopodium pumilio, Chenopodium ficifolium, and Chenopodium koraiense, but not in Chenopodium glaucum. A comparison of coding and non-coding regions between C. quinoa and C. album revealed divergent sites. Nucleotide diversity >0.025 was found in 17 regions: 14 were located in the large single copy region (LSC), one in the inverted repeats, and two in the small single copy region (SSC). A phylogenetic analysis based on 59 protein-coding genes from 25 taxa resolved Chenopodioideae as monophyletic and sister to Betoideae. The complete plastid genome sequences and molecular markers based on divergence hotspot regions in the two Chenopodium taxa will help to resolve the phylogenetic relationships of Chenopodium.

  19. Sequence of the intron/exon junctions of the coding region of the human androgen receptor gene and identification of a point mutation in a family with complete androgen insensitivity

    International Nuclear Information System (INIS)

    Lubahn, D.B.; Simental, J.A.; Higgs, H.N.; Wilson, E.M.; French, F.S.; Brown, T.R.; Migeon, C.J.

    1989-01-01

    Androgens act through a receptor protein (AR) to mediate sex differentiation and development of the male phenotype. The authors have isolated the eight exons in the amino acid coding region of the AR gene from a human X chromosome library. Nucleotide sequences of the AR gene intron/exon boundaries were determined for use in designing synthetic oligonucleotide primers to bracket coding exons for amplification by the polymerase chain reaction. Genomic DNA was amplified from 46,XY phenotypic female siblings with complete androgen insensitivity syndrome. AR binding affinity for dihydrotestosterone in the affected siblings was lower than in normal males, but the binding capacity was normal. Sequence analysis of amplified exons demonstrated within the AR steroid-binding domain (exon G) a single guanine to adenine mutation, resulting in replacement of valine with methionine at amino acid residue 866. As expected, the carrier mother had both normal and mutant AR genes. Thus, a single point mutation in the steroid-binding domain of the AR gene correlated with the expression of an AR protein ineffective in stimulating male sexual development.

  20. Complete mitochondrial genome sequence of the polychaete annelid Platynereis dumerilii

    Energy Technology Data Exchange (ETDEWEB)

    Boore, Jeffrey L.

    2004-08-15

    Complete mitochondrial genome sequences are now available for 126 metazoans (see Boore 1999; Mitochondrial Genomics link at http://www.jgi.doe.gov), but the taxonomic representation is highly biased. For example, 80 are from a single phylum, Chordata, and show little variation for many molecular features. Arthropoda is represented by 16 taxa, Mollusca by eight, and Echinodermata by five, with only 17 others from the remaining ~30 metazoan phyla. With few exceptions (see Wolstenholme 1992 and Boore 1999) these are circular DNA molecules, about 16 kb in size, and encode the same set of 37 genes. A variety of non-standard names are sometimes used for animal mitochondrial genes; see Boore (1999) for gene nomenclature and a table of synonyms. Mitochondrial genome comparisons serve as a model of genome evolution. In this system, much smaller and simpler than that of the nucleus, are all of the same factors of genome evolution, where one may find tractable the changes in tRNA structure, base composition, genetic code, gene arrangement, etc. Further, patterns of mitochondrial gene rearrangements are an exceptionally reliable indicator of phylogenetic relationships (Smith et al. 1993; Boore et al. 1995; Boore, Lavrov, and Brown 1998; Boore and Brown 1998, 2000; Dowton 1999; Stechmann and Schlegel 1999; Kurabayashi and Ueshima 2000). To these ends, we are sampling further the variation among major animal groups in features of their mitochondrial genomes.

  1. The complete genome sequence and analysis of the human pathogen Campylobacter lari

    DEFF Research Database (Denmark)

    Miller, WG; Wang, G; Binnewies, Tim Terence

    2008-01-01

    Campylobacter lari is a member of the epsilon subdivision of the Proteobacteria and is part of the thermotolerant Campylobacter group, a clade that includes the human pathogen C. jejuni. Here we present the complete genome sequence of the human clinical isolate, C. lari RM2100. The genome of strain RM2100 is approximately 1.53 Mb and includes the 46 kb megaplasmid pCL2100. Also present within the strain RM2100 genome is a 36 kb putative prophage, termed CLIE1, which is similar to CJIE4, a putative prophage present within the C. jejuni RM1221 genome. Nearly all (90%) of the gene content in strain RM2100 is similar to genes present in the genomes of other characterized thermotolerant campylobacters. However, several genes involved in amino acid biosynthesis and energy metabolism, identified previously in other Campylobacter genomes, are absent from the C. lari RM2100 genome. Therefore, C...

  2. The complete chloroplast genome sequences of Lychnis wilfordii and Silene capitata and comparative analyses with other Caryophyllaceae genomes.

    Science.gov (United States)

    Kang, Jong-Soo; Lee, Byoung Yoon; Kwak, Myounghai

    2017-01-01

    The complete chloroplast genomes of Lychnis wilfordii and Silene capitata were determined and compared with ten previously reported Caryophyllaceae chloroplast genomes. The chloroplast genome sequences of L. wilfordii and S. capitata contain 152,320 bp and 150,224 bp, respectively. The gene contents and orders among 12 Caryophyllaceae species are consistent, but several microstructural changes have occurred. Expansion of the inverted repeat (IR) regions at the large single copy (LSC)/IRb and small single copy (SSC)/IR boundaries led to partial or entire gene duplications. Additionally, rearrangements of the LSC region were caused by gene inversions and/or transpositions. The 18 kb inversions, which occurred three times in different lineages of tribe Sileneae, were thought to be facilitated by the intermolecular duplicated sequences. Sequence analyses of the L. wilfordii and S. capitata genomes revealed 39 and 43 repeats, respectively, including forward, palindromic, and reverse repeats. In addition, a total of 67 and 56 simple sequence repeats were discovered in the L. wilfordii and S. capitata chloroplast genomes, respectively. Finally, we constructed phylogenetic trees of the 12 Caryophyllaceae species and two Amaranthaceae species based on 73 protein-coding genes using both maximum parsimony and likelihood methods.

  3. Complete genome sequence analysis of novel human bocavirus reveals genetic recombination between human bocavirus 2 and human bocavirus 4.

    Science.gov (United States)

    Khamrin, Pattara; Okitsu, Shoko; Ushijima, Hiroshi; Maneekarn, Niwat

    2013-07-01

    Epidemiological surveillance of human bocavirus (HBoV) was conducted on fecal specimens collected from hospitalized children with diarrhea in Chiang Mai, Thailand in 2011. By partial sequence analysis of the VP1 gene, an unusual strain of HBoV (CMH-S011-11) was initially identified as HBoV4. The complete genome of CMH-S011-11 was sequenced and analyzed further to clarify whether it was a recombinant strain or a new HBoV variant. Analysis of the complete genome sequence revealed that the coding sequence running from NS1 through NP1 to VP1/VP2 was 4795 nucleotides long. Interestingly, the nucleotide sequence of the NS1 gene of CMH-S011-11 was most closely related to the HBoV2 reference strains detected in Pakistan, which contradicted the initial genotyping result of the partial VP1 region in the previous study. In addition, comparison of the NP1 nucleotide sequence of CMH-S011-11 with those of other HBoV1-4 reference strains also revealed a high level of sequence identity with HBoV2. On the other hand, the nucleotide sequence of the VP1/VP2 gene of CMH-S011-11 was most closely related to those of HBoV4 reference strains detected in Nigeria. The overall full-length sequence analysis revealed that CMH-S011-11 was grouped within the HBoV4 species, but located in a separate branch from other HBoV4 prototype strains. Recombination analysis revealed that CMH-S011-11 was the result of recombination between HBoV2 and HBoV4 strains, with the break point located near the start codon of VP2. Copyright © 2013 Elsevier B.V. All rights reserved.

  4. Complete genome sequences of six measles virus strains

    NARCIS (Netherlands)

    Phan, M.V.T. (My V.T.); C.M.E. Schapendonk (Claudia); B.B. Oude Munnink (Bas B.); M.P.G. Koopmans D.V.M. (Marion); R.L. de Swart (Rik); Cotten, M. (Matthew)

    2018-01-01

    Genetic characterization of wild-type measles virus (MV) strains is a critical component of measles surveillance and molecular epidemiology. We have obtained complete genome sequences of six MV strains belonging to different genotypes, using random-primed next generation sequencing.

  5. Chromosomal location and nucleotide sequence of the Escherichia coli dapA gene.

    Science.gov (United States)

    Richaud, F; Richaud, C; Ratet, P; Patte, J C

    1986-04-01

    In Escherichia coli, the first enzyme of the diaminopimelate and lysine pathway is dihydrodipicolinate synthetase, which is feedback-inhibited by lysine and encoded by the dapA gene. The location of the dapA gene on the bacterial chromosome has been determined accurately with respect to the neighboring purC and dapE genes. The complete nucleotide sequence and the transcriptional start of the dapA gene were determined. The results show that dapA consists of a single cistron encoding a 292-amino acid polypeptide of 31,372 daltons.

  6. Chromosomal location and nucleotide sequence of the Escherichia coli dapA gene.

    Science.gov (United States)

    Richaud, F; Richaud, C; Ratet, P; Patte, J C

    1986-01-01

    In Escherichia coli, the first enzyme of the diaminopimelate and lysine pathway is dihydrodipicolinate synthetase, which is feedback-inhibited by lysine and encoded by the dapA gene. The location of the dapA gene on the bacterial chromosome has been determined accurately with respect to the neighboring purC and dapE genes. The complete nucleotide sequence and the transcriptional start of the dapA gene were determined. The results show that dapA consists of a single cistron encoding a 292-amino acid polypeptide of 31,372 daltons. PMID:3514578

  7. Complete coding sequence of the human raf oncogene and the corresponding structure of the c-raf-1 gene

    Energy Technology Data Exchange (ETDEWEB)

    Bonner, T I; Oppermann, H; Seeburg, P; Kerby, S B; Gunnell, M A; Young, A C; Rapp, U R

    1986-01-24

    The complete 648 amino acid sequence of the human raf oncogene was deduced from the 2977 nucleotide sequence of a fetal liver cDNA. The cDNA has been used to obtain clones which extend the human c-raf-1 locus by an additional 18.9 kb at the 5' end and contain all the remaining coding exons.

  8. Complete mitochondrial genome sequence of the Barbour's seahorse Hippocampus barbouri Jordan & Richardson, 1908 (Gasterosteiformes: Syngnathidae).

    Science.gov (United States)

    Wang, Bo; Zhang, Yanhong; Zhang, Huixian; Lin, Qiang

    2015-01-01

    The complete mitochondrial genome sequence of the Barbour's seahorse Hippocampus barbouri was determined for the first time in this paper. The total length of the H. barbouri mitogenome is 16,526 bp, and it consists of 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and 1 control region. The features of the H. barbouri mitochondrial genome were similar to those of typical vertebrates. The overall base composition of H. barbouri is 32.68% A, 29.75% T, 22.91% C and 14.66% G, with an AT content of 62.43%.
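    As a quick consistency check on the figures reported above, the AT content is the sum of the A and T percentages, and the four base percentages should total 100%. The sketch below is illustrative only; the function name `at_content` and the dictionary layout are assumptions made for this example and do not come from the paper.

```python
# Illustrative check of the base composition reported in the abstract.
# The helper name and data layout are assumptions for this sketch.

def at_content(composition):
    """Return the AT content (percent) from a dict of base percentages."""
    return round(composition["A"] + composition["T"], 2)

# Percentages as stated in the abstract for H. barbouri.
reported = {"A": 32.68, "T": 29.75, "C": 22.91, "G": 14.66}

at_percent = at_content(reported)              # A + T
total_percent = round(sum(reported.values()), 2)  # all four bases

print(f"AT content: {at_percent}%")
print(f"Sum of all bases: {total_percent}%")
```

    Under these inputs the computed AT content agrees with the 62.43% stated in the abstract.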

  9. Genetic Diversity of Toxoplasma gondii Strains from Different Hosts and Geographical Regions by Sequence Analysis of GRA20 Gene.

    Science.gov (United States)

    Ning, Hong-Rui; Huang, Si-Yang; Wang, Jin-Lei; Xu, Qian-Ming; Zhu, Xing-Quan

    2015-06-01

    Toxoplasma gondii is a eukaryotic parasite of the phylum Apicomplexa, which infects all warm-blooded animals, including humans. In the present study, we examined sequence variation in dense granule 20 (GRA20) genes among T. gondii isolates collected from different hosts and geographical regions worldwide. The complete GRA20 genes were amplified from 16 T. gondii isolates using PCR, the sequences were analyzed, and phylogenetic reconstruction was performed using maximum parsimony (MP) and maximum likelihood (ML) methods. The results showed that the complete GRA20 gene sequence was 1,586 bp in length among all the isolates used in this study, and the sequence variations in nucleotides were 0-7.9% among all strains. However, when the type III strains (CTG, VEG) were removed, the sequence variations became very low, only 0-0.7%. These results indicated that the GRA20 sequence in type III was more divergent. Phylogenetic analysis of GRA20 sequences using MP and ML methods can differentiate 2 major clonal lineage types (type I and type III) into their respective clusters, indicating the GRA20 gene may represent a novel genetic marker for intraspecific phylogenetic analyses of T. gondii.

  10. The complete mitochondrial genome of the sea spider Achelia bituberculata (Pycnogonida, Ammotheidae): arthropod ground pattern of gene arrangement

    Directory of Open Access Journals (Sweden)

    Lee Yong-Seok

    2007-10-01

    Background: The phylogenetic position of pycnogonids is a long-standing and controversial issue in arthropod phylogeny. This controversy has recently been rekindled by differences in the conclusions based on neuroanatomical data concerning the chelifore and the patterns of Hox expression. The mitochondrial genome of a sea spider, Nymphon gracile (Pycnogonida, Nymphonidae), was recently reported in an attempt to address this issue. However, N. gracile appears to be a long-branch taxon on the phylogenetic tree and exhibits a number of peculiar features, such as 10 tRNA translocations and even an inversion of several protein-coding genes. Sequences of other pycnogonid mitochondrial genomes are needed if the position of pycnogonids is to be elucidated on this basis. Results: The complete mitochondrial genome (15,474 bp) of a sea spider (Achelia bituberculata) belonging to the family Ammotheidae, which combines a number of anatomical features considered plesiomorphic with respect to other pycnogonids, was sequenced and characterized. The genome organization shows the features typical of most metazoan animal genomes (37 tightly-packed genes). The overall gene arrangement is completely identical to the arthropod ground pattern, with one exception: the position of the trnQ gene between the rrnS gene and the control region. Maximum likelihood and Bayesian inference trees inferred from the amino acid sequences of mitochondrial protein-coding genes consistently indicate that the pycnogonids (A. bituberculata and N. gracile) may be closely related to the clade of Acari and Araneae. Conclusion: The complete mitochondrial genome sequence of A. bituberculata (Family Ammotheidae) and the previously-reported partial sequence of Endeis spinosa show the gene arrangement patterns typical of arthropods (Limulus-like), but they differ markedly from that of N. gracile. Phylogenetic analyses based on mitochondrial protein-coding genes showed that Pycnogonida may be

  11. Complete chloroplast genome sequence of Elodea canadensis and comparative analyses with other monocot plastid genomes.

    Science.gov (United States)

    Huotari, Tea; Korpelainen, Helena

    2012-10-15

    Elodea canadensis is an aquatic angiosperm native to North America. It has attracted great attention due to its invasive nature when transported to new areas in its non-native range. We have determined the complete nucleotide sequence of the chloroplast (cp) genome of Elodea. Taxonomically Elodea is a basal monocot, and only a few monocot cp genomes representing early lineages of monocots have been sequenced so far. The genome is a circular double-stranded DNA molecule 156,700 bp in length, and has a typical structure with large (LSC 86,194 bp) and small (SSC 17,810 bp) single-copy regions separated by a pair of inverted repeats (IRs 26,348 bp each). The Elodea cp genome contains 113 unique genes and 16 duplicated genes in the IR regions. A comparative analysis showed that the gene order and organization of the Elodea cp genome is almost identical to that of Amborella trichopoda, a basal angiosperm. The structure of the IRs in Elodea is unique among monocot species with the whole cp genome sequenced. In Elodea and another monocot, Lemna minor, the borders between the IRs and LSC are located upstream of the rps19 gene and downstream of the trnH-GUG gene, while in most monocots the IR has extended to include both the trnH and rps19 genes. A phylogenetic analysis conducted using the Bayesian method, based on the DNA sequences of 81 chloroplast genes from 17 monocot taxa, provided support for the placement of Elodea together with Lemna as a basal monocot and the next diverging lineage of monocots after Acorales. In comparison with other monocots, the Elodea cp genome has gone through only a few rearrangements or gene losses. The IR of Elodea has a unique structure among the monocot species studied so far, as its structure is similar to that of the basal angiosperm Amborella. This result, together with phylogenetic analyses, supports the placement of Elodea as a basal monocot to the next diverging lineage of monocots after Acorales. So far, only few cp genomes representing early lineages of monocots have been
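    The region lengths quoted in this record can be cross-checked against the stated genome size: in a quadripartite chloroplast genome the total is the LSC plus the SSC plus two copies of the IR. The following is a minimal sketch of that arithmetic; the function name `quadripartite_length` is an assumption made for illustration and is not taken from the paper.

```python
# Illustrative arithmetic for a quadripartite chloroplast genome:
# total length = LSC + SSC + 2 * IR (the two inverted-repeat copies).
# The function name is an assumption for this sketch.

def quadripartite_length(lsc, ssc, ir):
    """Return the total genome length in bp from the region lengths."""
    return lsc + ssc + 2 * ir

# Region lengths (bp) as stated in the Elodea canadensis abstract.
elodea_total = quadripartite_length(lsc=86_194, ssc=17_810, ir=26_348)

print(f"Elodea cp genome length: {elodea_total:,} bp")
```

    With the values given in the abstract, the regions add up to the reported 156,700 bp.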

  12. Complete genome sequence of Truepera radiovictrix type strain (RQ-24T)

    Energy Technology Data Exchange (ETDEWEB)

    Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Rohde, Christine [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Munk, Christine [Joint Genome Institute, Walnut Creek, California; Nolan, Matt [Joint Genome Institute, Walnut Creek, California; Lucas, Susan [Joint Genome Institute, Walnut Creek, California; Glavina Del Rio, Tijana [Joint Genome Institute, Walnut Creek, California; Tice, Hope [Joint Genome Institute, Walnut Creek, California; Deshpande, Shweta [Joint Genome Institute, Walnut Creek, California; Cheng, Jan-Fang [Joint Genome Institute, Walnut Creek, California; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [Joint Genome Institute, Walnut Creek, California; Liolios, Konstantinos [Joint Genome Institute, Walnut Creek, California; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [Joint Genome Institute, Walnut Creek, California; Palaniappan, Krishna [Joint Genome Institute, Walnut Creek, California; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Tindall, Brian [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [Joint Genome Institute, Walnut Creek, California; Bristow, James [Joint Genome Institute, Walnut Creek, California; Eisen, Jonathan [Joint Genome Institute, Walnut Creek, California; Markowitz, Victor [Joint Genome Institute, Walnut Creek, California; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [Joint Genome Institute, Walnut Creek, California; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [Joint Genome Institute, Walnut Creek, California

    2011-01-01

    Truepera radiovictrix Albuquerque et al. 2005 is the type species of the genus Truepera within the phylum Deinococcus/Thermus. T. radiovictrix is of special interest not only because of its isolated phylogenetic location in the order Deinococcales, but also because of its ability to grow under multiple extreme conditions in alkaline, moderately saline, and high temperature habitats. Of particular interest is the fact that T. radiovictrix is also remarkably resistant to ionizing radiation, a feature it shares with members of the genus Deinococcus. This is the first completed genome sequence of a member of the family Trueperaceae and the fourth type strain genome sequence from a member of the order Deinococcales. The 3,260,398 bp long genome with its 2,994 protein-coding and 52 RNA genes consists of one circular chromosome and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  13. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Science.gov (United States)

    Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  14. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Directory of Open Access Journals (Sweden)

    Jianmin Fu

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  15. Complete Genome Sequence of Bacillus velezensis CN026 Exhibiting Antagonistic Activity against Gram-Negative Foodborne Pathogens

    OpenAIRE

    Nannan, Catherine; Gillis, Annika; Caulier, Simon; Mahillon, Jacques

    2018-01-01

    We report here the complete genome sequence of Bacillus velezensis strain CN026, a member of the B. subtilis group, which is known for its many industrial applications. The genome contains 3,995,812 bp and displays six gene clusters potentially involved in strain CN026’s activity against Gram-negative foodborne pathogens.

  16. Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing.

    Science.gov (United States)

    Seoane-Zonjic, Pedro; Cañas, Rafael A; Bautista, Rocío; Gómez-Maldonado, Josefa; Arrillaga, Isabel; Fernández-Pozo, Noé; Claros, M Gonzalo; Cánovas, Francisco M; Ávila, Concepción

    2016-02-27

    In the era of DNA throughput sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequences were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members. The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies. A bioinformatics algorithm (GeneAssembler) is also provided as a

  17. Complete chloroplast DNA sequence from a Korean endemic genus, Megaleranthis saniculifolia, and its evolutionary implications.

    Science.gov (United States)

    Kim, Young-Kyu; Park, Chong-wook; Kim, Ki-Joong

    2009-03-31

    The chloroplast DNA sequences of Megaleranthis saniculifolia, an endemic and monotypic endangered plant species, were completed in this study (GenBank FJ597983). The genome is 159,924 bp in length. It harbors a pair of IR regions consisting of 26,608 bp each. The lengths of the LSC and SSC regions are 88,326 bp and 18,382 bp, respectively. The structural organizations, gene and intron contents, gene orders, AT contents, codon usages, and transcription units of the Megaleranthis chloroplast genome are similar to those of typical land plant cp DNAs. However, the detailed features of the Megaleranthis chloroplast genome are substantially different from those of Ranunculus, which belongs to the same family, the Ranunculaceae. First, the Megaleranthis cp DNA was 4,797 bp longer than that of Ranunculus due to expansion of the IR region into the SSC region and duplicated sequence elements in several spacer regions of the Megaleranthis cp genome. Second, the chloroplast genomes of Megaleranthis and Ranunculus show 5.6% sequence divergence in the coding regions, 8.9% sequence divergence in the intron regions, and 18.7% sequence divergence in the intergenic spacer regions. In both the coding and noncoding regions, average nucleotide substitution rates differed markedly, depending on the genome position. Our data strongly implicate the positional effects of the evolutionary modes of chloroplast genes. The genes showing higher levels of base substitutions also have higher incidences of indel mutations and low Ka/Ks ratios. A total of 54 simple sequence repeat loci were identified from the Megaleranthis cp genome. The existence of rich cp SSR loci in the Megaleranthis cp genome provides a rare opportunity to study the population genetic structures of this endangered species. Our phylogenetic trees based on two independent markers, the nuclear ITS and chloroplast matK sequences, strongly support the inclusion of Megaleranthis in Trollius. Therefore, our

  18. Complete genome sequence of Bacillus velezensis S3-1, a potential biological pesticide with plant pathogen inhibiting and plant promoting capabilities.

    Science.gov (United States)

    Jin, Qing; Jiang, Qiuyue; Zhao, Lei; Su, Cuizhu; Li, Songshuo; Si, Fangyi; Li, Shanshan; Zhou, Chenhao; Mu, Yonglin; Xiao, Ming

    2017-10-10

    Antagonistic soil microorganisms, which are non-toxic, harmless non-pollutants, can effectively reduce the density of pathogenic species in several ways. Bacillus velezensis strain S3-1 was isolated from the rhizosphere soil of cucumber, and was shown to inhibit plant pathogens, promote plant growth and efficiently colonize rhizosphere soils. The strain produced 13 kinds of lipopeptide antibiotics, belonging to the surfactin, iturin and fengycin families. Here, we present the complete genome sequence of S3-1. The genome consists of one chromosome without plasmids and also contains the biosynthetic gene clusters that encode difficidin, macrolactin, surfactin and fengycin. The genome contains 86 tRNA genes, 27 rRNA genes and 57 antibiotic-related genes. The complete genome sequence of B. velezensis S3-1 provides useful information for further investigation of the molecular mechanisms behind its antifungal actions, and will facilitate its potential use as a biological pesticide in the agricultural industry. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Complete Genome Sequence of Biocontroller Bacillus velezensis Strain JTYP2, Isolated from Leaves of Echeveria laui.

    Science.gov (United States)

    Wang, Beibei; Liu, Hu; Ma, Hailin; Wang, Chengqiang; Liu, Kai; Li, Yuhuan; Hou, Qihui; Ge, Ruofei; Zhang, Tongrui; Liu, Fangchun; Ma, Jinjin; Wang, Yun; Wang, Haide; Xu, Baochao; Yao, Gan; Xu, Wenfeng; Fan, Lingchao; Ding, Yanqin; Du, Binghai

    2017-06-15

    Bacillus velezensis JTYP2 was isolated from the leaves of Echeveria laui in Qingzhou, China, and may control some of the fungal pathogens of the plant. Here, we present the complete genome sequence of B. velezensis JTYP2. Several gene clusters related to its biosynthesis of antimicrobial compounds were predicted. Copyright © 2017 Wang et al.

  20. Complete Genome Sequence of Bacillus velezensis CN026 Exhibiting Antagonistic Activity against Gram-Negative Foodborne Pathogens.

    Science.gov (United States)

    Nannan, Catherine; Gillis, Annika; Caulier, Simon; Mahillon, Jacques

    2018-01-25

    We report here the complete genome sequence of Bacillus velezensis strain CN026, a member of the B. subtilis group, which is known for its many industrial applications. The genome contains 3,995,812 bp and displays six gene clusters potentially involved in strain CN026's activity against Gram-negative foodborne pathogens. Copyright © 2018 Nannan et al.

  1. Complete Genome Sequence of Bacillus velezensis GQJK49, a Plant Growth-Promoting Rhizobacterium with Antifungal Activity.

    Science.gov (United States)

    Ma, Jinjin; Liu, Hu; Liu, Kai; Wang, Chengqiang; Li, Yuhuan; Hou, Qihui; Yao, Liangtong; Cui, Yanru; Zhang, Tongrui; Wang, Haide; Wang, Beibei; Wang, Yun; Ge, Ruofei; Xu, Baochao; Yao, Gan; Xu, Wenfeng; Fan, Lingchao; Ding, Yanqin; Du, Binghai

    2017-08-31

    Bacillus velezensis GQJK49 is a plant growth-promoting rhizobacterium with antifungal activity, which was isolated from Lycium barbarum L. rhizosphere. Here, we report the complete genome sequence of B. velezensis GQJK49. Twelve gene clusters related to its biosynthesis of secondary metabolites, including antifungal and antibacterial antibiotics, were predicted. Copyright © 2017 Ma et al.

  2. Complete mitochondrial genome sequence of Urechis caupo, a representative of the phylum Echiura.

    Science.gov (United States)

    Boore, Jeffrey L

    2004-09-15

    Mitochondria contain small genomes that are physically separate from those of nuclei. Their comparison serves as a model system for understanding the processes of genome evolution. Although hundreds of these genome sequences have been reported, the taxonomic sampling is highly biased toward vertebrates and arthropods, with many whole phyla remaining unstudied. This is the first description of a complete mitochondrial genome sequence of a representative of the phylum Echiura, that of the fat innkeeper worm, Urechis caupo. This mtDNA is 15,113 nts in length and 62% A+T. It contains the 37 genes that are typical for animal mtDNAs in an arrangement somewhat similar to that of annelid worms. All genes are encoded by the same DNA strand which is rich in A and C relative to the opposite strand. Codons ending with the dinucleotide GG are more frequent than would be expected from apparent mutational biases. The largest non-coding region is only 282 nts long, is 71% A+T, and has potential for secondary structures. Urechis caupo mtDNA shares many features with those of the few studied annelids, including the common usage of ATG start codons, unusual among animal mtDNAs, as well as gene arrangements, tRNA structures, and codon usage biases.

  3. Complete genome sequence of the industrial bacterium Bacillus licheniformis and comparisons with closely related Bacillus species

    Science.gov (United States)

    Rey, Michael W; Ramaiya, Preethi; Nelson, Beth A; Brody-Karpin, Shari D; Zaretsky, Elizabeth J; Tang, Maria; de Leon, Alfredo Lopez; Xiang, Henry; Gusti, Veronica; Clausen, Ib Groth; Olsen, Peter B; Rasmussen, Michael D; Andersen, Jens T; Jørgensen, Per L; Larsen, Thomas S; Sorokin, Alexei; Bolotin, Alexander; Lapidus, Alla; Galleron, Nathalie; Ehrlich, S Dusko; Berka, Randy M

    2004-01-01

    Background Bacillus licheniformis is a Gram-positive, spore-forming soil bacterium that is used in the biotechnology industry to manufacture enzymes, antibiotics, biochemicals and consumer products. This species is closely related to the well studied model organism Bacillus subtilis, and produces an assortment of extracellular enzymes that may contribute to nutrient cycling in nature. Results We determined the complete nucleotide sequence of the B. licheniformis ATCC 14580 genome which comprises a circular chromosome of 4,222,336 base-pairs (bp) containing 4,208 predicted protein-coding genes with an average size of 873 bp, seven rRNA operons, and 72 tRNA genes. The B. licheniformis chromosome contains large regions that are colinear with the genomes of B. subtilis and Bacillus halodurans, and approximately 80% of the predicted B. licheniformis coding sequences have B. subtilis orthologs. Conclusions Despite the unmistakable organizational similarities between the B. licheniformis and B. subtilis genomes, there are notable differences in the numbers and locations of prophages, transposable elements and a number of extracellular enzymes and secondary metabolic pathway operons that distinguish these species. Differences include a region of more than 80 kilobases (kb) that comprises a cluster of polyketide synthase genes and a second operon of 38 kb encoding plipastatin synthase enzymes that are absent in the B. licheniformis genome. The availability of a completed genome sequence for B. licheniformis should facilitate the design and construction of improved industrial strains and allow for comparative genomics and evolutionary studies within this group of Bacillaceae. PMID:15461803

  4. Roles of genes and Alu repeats in nonlinear correlations of HUMHBB DNA sequence

    International Nuclear Information System (INIS)

    Xiao Yi; Huang Yanzhao

    2004-01-01

    DNA sequences of different species and different portions of the DNA of the same species may have completely different correlation properties, but the origin of these correlations is still not very clear and is currently being investigated, especially in particular cases. We report here a study of the DNA sequence of the human beta globin region (HUMHBB), which has strong linear and nonlinear correlations. We studied the roles of two of the typical elements of DNA sequence, genes and Alu repeats, in the nonlinear correlations of HUMHBB. We find that there exist strong nonlinear correlations between the exons or introns in different genes and between the Alu repeats. They may be one of the major sources of the nonlinear correlations in HUMHBB.

  5. Complete mitochondrial genome of Zeugodacus tau (Insecta: Tephritidae) and differentiation of Z. tau species complex by mitochondrial cytochrome c oxidase subunit I gene.

    Science.gov (United States)

    Yong, Hoi-Sen; Song, Sze-Looi; Lim, Phaik-Eem; Eamsobhana, Praphathip

    2017-01-01

    The tephritid fruit fly Zeugodacus tau (Walker) is a polyphagous fruit pest of economic importance in Asia. Studies based on genetic markers indicate that it forms a species complex. We report here (1) the complete mitogenome of Z. tau from Malaysia and comparison with that of China as well as the mitogenome of other congeners, and (2) the relationship of Z. tau taxa from different geographical regions based on sequences of cytochrome c oxidase subunit I gene. The complete mitogenome of Z. tau had a total length of 15631 bp for the Malaysian specimen (ZT3) and 15835 bp for the China specimen (ZT1), with similar gene order comprising 37 genes (13 protein-coding genes-PCGs, 2 rRNA genes, and 22 tRNA genes) and a non-coding A + T-rich control region (D-loop). Based on 13 PCGs and 15 mt-genes, Z. tau NC_027290 (China) and Z. tau ZT1 (China) formed a sister group in the lineage containing also Z. tau ZT3 (Malaysia). Phylogenetic analysis based on partial sequences of cox1 gene indicates that the taxa from China, Japan, Laos, Malaysia, Bangladesh, India, Sri Lanka, and Z. tau sp. A from Thailand belong to Z. tau sensu stricto. A complete cox1 gene (or 13 PCGs or 15 mt-genes) instead of partial sequence is more appropriate for determining phylogenetic relationship.

  6. Sub-grouping of Plasmodium falciparum 3D7 var genes based on sequence analysis of coding and non-coding regions

    DEFF Research Database (Denmark)

    Lavstsen, Thomas; Salanti, Ali; Jensen, Anja T R

    2003-01-01

    and organization of the 3D7 PfEMP1 repertoire was investigated on the basis of the complete genome sequence. METHODS: Using two tree-building methods we analysed the coding and non-coding sequences of 3D7 var and rif genes as well as var genes of other parasite strains. RESULTS: var genes can be sub...

  7. Complete genome sequence of the aerobically denitrifying thermophilic bacterium Chelatococcus daeguensis TAD1

    Directory of Open Access Journals (Sweden)

    Yunlong Yang

    Chelatococcus daeguensis TAD1 is a thermophilic bacterium isolated from a biotrickling filter used to treat NOx in Ruiming Power Plant, located in Guangzhou, China, which shows excellent aerobic denitrification activity at high temperature. The complete genome sequence of this strain is reported in the present study. Genes related to aerobic denitrification were identified through whole-genome analysis. This work will facilitate study of the mechanism of aerobic denitrification and provide evidence for its potential application in nitrogen removal.

  8. Complete genome sequence of Serratia plymuthica strain AS12

    Energy Technology Data Exchange (ETDEWEB)

    Neupane, Saraswoti [Uppsala University, Uppsala, Sweden; Finlay, Roger D. [Uppsala University, Uppsala, Sweden; Alstrom, Sadhna [Uppsala University, Uppsala, Sweden; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Peters, Lin [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Chertkov, Olga [Los Alamos National Laboratory (LANL); Han, James [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Hogberg, Nils [Uppsala University, Uppsala, Sweden

    2012-01-01

    A plant associated member of the family Enterobacteriaceae, Serratia plymuthica strain AS12 was isolated from rapeseed roots. It is of scientific interest due to its plant growth promoting and plant pathogen inhibiting ability. The genome of S. plymuthica AS12 comprises a 5,443,009 bp long circular chromosome, which consists of 4,952 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced within the 2010 DOE-JGI Community Sequencing Program (CSP2010) as part of the project entitled 'Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens'.

  9. Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

    Energy Technology Data Exchange (ETDEWEB)

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.
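The abstract above does not say how the minimum tiling path was chosen. One common way to pick the fewest overlapping clones that span a region is a greedy interval cover, sketched below. The clone names and coordinates are hypothetical, and the sketch treats the genome as a linear interval even though plastid genomes are circular.

```python
from typing import List, Tuple

# A clone is (name, start, end) on a linearised genome, half-open [start, end).
Clone = Tuple[str, int, int]


def minimal_tiling_path(clones: List[Clone], genome_length: int) -> List[Clone]:
    """Pick the fewest clones covering [0, genome_length) with a greedy sweep."""
    ordered = sorted(clones, key=lambda clone: clone[1])
    path: List[Clone] = []
    covered = 0  # everything left of this position is already covered
    i = 0
    while covered < genome_length:
        best = None
        # Among clones that start inside the covered region, keep the one
        # that reaches furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap in clone coverage at position {covered}")
        path.append(best)
        covered = best[2]
    return path


# Hypothetical clone coordinates, for illustration only.
toy_clones = [
    ("c1", 0, 40),
    ("c2", 10, 35),
    ("c3", 30, 90),
    ("c4", 38, 70),
    ("c5", 85, 120),
]
toy_path = minimal_tiling_path(toy_clones, 120)
print([name for name, _, _ in toy_path])
```

A gap in coverage raises an error rather than returning a partial path, which mirrors the abstract's point that enough clones must be screened to ensure full plastid genome coverage.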

  10. Complete genome sequence of Paenibacillus riograndensis SBR5(T), a Gram-positive diazotrophic rhizobacterium.

    Science.gov (United States)

    Brito, Luciana Fernandes; Bach, Evelise; Kalinowski, Jörn; Rückert, Christian; Wibberg, Daniel; Passaglia, Luciane M; Wendisch, Volker F

    2015-08-10

    Paenibacillus riograndensis is a Gram-positive rhizobacterium which exhibits plant growth promoting activities. It was isolated from the rhizosphere of wheat grown in the state of Rio Grande do Sul, Brazil. Here we announce the complete genome sequence of P. riograndensis strain SBR5(T). The genome of P. riograndensis SBR5(T) consists of a circular chromosome of 7,893,056 bp. The genome was finished and fully annotated, containing 6705 protein coding genes, 87 tRNAs and 27 rRNAs. The knowledge of the complete genome helped to explain why P. riograndensis SBR5(T) can grow with the carbon sources arabinose and mannitol, but not myo-inositol, and to explain physiological features such as biotin auxotrophy and antibiotic resistances. The genome sequence will be valuable for functional genomics and ecological studies as well as for application of P. riograndensis SBR5(T) as plant growth-promoting rhizobacterium. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. Complete Genome Sequence of the Soybean Symbiont Bradyrhizobium japonicum Strain USDA6T

    Directory of Open Access Journals (Sweden)

    Nobukazu Uchiike

    2011-10-01

    The complete nucleotide sequence of the genome of the soybean symbiont Bradyrhizobium japonicum strain USDA6T was determined. The genome of USDA6T is a single circular chromosome of 9,207,384 bp. The genome size is similar to that of the genome of another soybean symbiont, B. japonicum USDA110 (9,105,828 bp). Comparison of the whole-genome sequences of USDA6T and USDA110 showed colinearity of major regions in the two genomes, although a large inversion exists between them. A significantly high level of sequence conservation was detected in three regions on each genome. The gene constitution and nucleotide sequence features in these three regions indicate that they may have been derived from a symbiosis island. An ancestral, large symbiosis island, approximately 860 kb in total size, appears to have been split into these three regions by unknown large-scale genome rearrangements. The two integration events responsible for this appear to have taken place independently, but through comparable mechanisms, in both genomes.

  12. Complete chloroplast genome sequence of MD-2 pineapple and its comparative analysis among nine other plants from the subclass Commelinidae.

    Science.gov (United States)

    Redwan, R M; Saidin, A; Kumar, S V

    2015-08-12

    Pineapple (Ananas comosus var. comosus) is known as the king of fruits for its crown and is the third most important tropical fruit after banana and citrus. The plant, which is indigenous to South America, is the most important species in the Bromeliaceae family and is largely traded for fresh fruit consumption. Here, we report the complete chloroplast sequence of the MD-2 pineapple, which was sequenced using PacBio sequencing technology. In this study, the high error rate of the PacBio long reads of A. comosus total genomic DNA was reduced by using highly accurate but short Illumina reads for error correction via the latest error correction module from Novocraft. The error-corrected long PacBio reads were assembled using a single tool to produce a contig representing the pineapple chloroplast genome. The genome, 159,636 bp in length, has the conserved quadripartite chloroplast structure, containing a large single copy region (LSC) of 87,482 bp, a small single copy region (SSC) of 18,622 bp and two inverted repeat regions (IRA and IRB), each of 26,766 bp. Overall, the genome contained 117 unique coding regions and 30 were repeated in the IR region, with gene content, structure and arrangement similar to those of its sister taxon, Typha latifolia. A total of 35 repeat structures were detected in both the coding and non-coding regions, the majority being tandem repeats. In addition, 205 SSRs were detected in the genome, with six protein-coding genes containing more than two SSRs. Comparison of chloroplast genomes from the subclass Commelinidae revealed a conserved protein-coding gene, albeit located in a highly divergent region. Analysis of selection pressure on protein-coding genes using the Ka/Ks ratio showed significant positive selection exerted on the rps7 gene of the pineapple chloroplast, with P less than 0.05. Phylogenetic analysis confirmed the recent taxonomical relation among the member of
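The region sizes quoted in this abstract can be checked against the reported total, since a quadripartite chloroplast genome is one LSC, one SSC and two IR copies. The short sketch below does that arithmetic; the function name is illustrative and not taken from the paper.

```python
def quadripartite_length(lsc: int, ssc: int, inverted_repeat: int) -> int:
    """Total length of a genome made of one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * inverted_repeat


# Region sizes in bp, as reported in the abstract.
LSC_BP = 87_482
SSC_BP = 18_622
IR_BP = 26_766

total_bp = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
print(total_bp)  # 159636, matching the reported genome length

# Share of the genome taken up by the two inverted repeats.
ir_fraction = 2 * IR_BP / total_bp
print(f"{ir_fraction:.1%}")
```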

  13. Complete sequence and analysis of plastid genomes of two economically important red algae: Pyropia haitanensis and Pyropia yezoensis.

    Directory of Open Access Journals (Sweden)

    Li Wang

    Pyropia haitanensis and P. yezoensis are two economically important marine crops that are also considered to be research models to study the physiological ecology of intertidal seaweed communities, evolutionary biology of plastids, and the origins of sexual reproduction. This plastid genome information will facilitate study of breeding, population genetics and phylogenetics. We have fully sequenced, using next-generation sequencing, the circular plastid genomes of P. haitanensis (195,597 bp) and P. yezoensis (191,975 bp), the largest of all the plastid genomes of the red lineage sequenced to date. Organization and gene contents of the two plastids were similar, with 211-213 protein-coding genes (including 29-31 unknown-function ORFs), 37 tRNA genes, and 6 ribosomal RNA genes, suggesting the largest coding capacity in the red lineage. In each genome, 14 protein genes overlapped and no interrupted genes were found, indicating a high degree of genomic condensation. Pyropia maintain an ancient gene content and conserved gene clusters in their plastid genomes, containing nearly complete repertoires of the plastid genes known in photosynthetic eukaryotes. Similarity analysis based on the whole plastid genome sequences showed the distance between P. haitanensis and P. yezoensis (0.146) was much smaller than that between Porphyra purpurea and P. haitanensis (0.250) or P. yezoensis (0.251); this supports re-grouping the two species in a resurrected genus Pyropia while maintaining P. purpurea in genus Porphyra. Phylogenetic analysis supports a sister relationship between Bangiophyceae and Florideophyceae, though precise phylogenetic relationships between multicellular red algae and chromists were not fully resolved. These results indicate that Pyropia have compact plastid genomes. Large coding capacity and long intergenic regions contribute to the size of the largest plastid genomes reported for the red lineage. Possessing the largest coding capacity and ancient gene

  14. De novo transcriptome sequencing of axolotl blastema for identification of differentially expressed genes during limb regeneration

    Science.gov (United States)

    2013-01-01

    Background: Salamanders are unique among vertebrates in their ability to completely regenerate amputated limbs through the mediation of blastema cells located at the stump ends. This regeneration is nerve-dependent because blastema formation and regeneration does not occur after limb denervation. To obtain the genomic information of blastema tissues, de novo transcriptomes from both blastema tissues and denervated stump ends of Ambystoma mexicanum (axolotls) 14 days post-amputation were sequenced and compared using Solexa DNA sequencing. Results: The sequencing done for this study produced 40,688,892 reads that were assembled into 307,345 transcribed sequences. The N50 of transcribed sequence length was 562 bases. A similarity search with known proteins identified 39,200 different genes to be expressed during limb regeneration with a cut-off E-value exceeding 10^-5. We annotated assembled sequences by using gene descriptions, gene ontology, and clusters of orthologous group terms. Targeted searches using these annotations showed that the majority of the genes were in the categories of essential metabolic pathways, transcription factors and conserved signaling pathways, and novel candidate genes for regenerative processes. We discovered and confirmed numerous sequences of the candidate genes by using quantitative polymerase chain reaction and in situ hybridization. Conclusion: The results of this study demonstrate that de novo transcriptome sequencing allows gene expression analysis in a species lacking genome information and provides the most comprehensive mRNA sequence resources for axolotls. The characterization of the axolotl transcriptome can help elucidate the molecular mechanisms underlying blastema formation during limb regeneration. PMID: 23815514
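The N50 statistic quoted in this abstract is the length L such that sequences of length L or longer together hold at least half of the total assembled bases. A minimal sketch of that standard definition follows; the toy lengths are invented for illustration and are not data from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one sequence length")
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: running total always reaches half")


# Toy transcript lengths (hypothetical).
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)
print(toy_n50)  # 400: 500 + 400 = 900 covers at least half of 1500
```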

  15. Sequence Variation in Rhoptry Neck Protein 10 Gene among Toxoplasma gondii Isolates from Different Hosts and Geographical Locations

    Directory of Open Access Journals (Sweden)

    Yu ZHAO

    2017-09-01

    Background: Toxoplasma gondii, as a eukaryotic parasite of the phylum Apicomplexa, can infect almost all the warm-blooded animals and humans, causing toxoplasmosis. Rhoptry neck proteins (RONs) play a key role in the invasion process of T. gondii and are potential vaccine candidate molecules against toxoplasmosis. Methods: The present study examined sequence variation in the rhoptry neck protein 10 (TgRON10) gene among 10 T. gondii isolates from different hosts and geographical locations from Lanzhou province during 2014, and compared with the corresponding sequences of strains ME49 and VEG obtained from the ToxoDB database, using polymerase chain reaction (PCR) amplification, sequence analysis, and phylogenetic reconstruction by Bayesian inference (BI) and maximum parsimony (MP). Results: Analysis of all the 12 TgRON10 genomic and cDNA sequences revealed 7 exons and 6 introns in the TgRON10 gDNA. The complete genomic sequence of the TgRON10 gene ranged from 4759 bp to 4763 bp, and sequence variation was 0-0.6% among the 12 T. gondii isolates, indicating a low sequence variation in TgRON10 gene. Phylogenetic analysis of TgRON10 sequences showed that the cluster of the 12 T. gondii isolates was not completely consistent with their respective genotypes. Conclusion: TgRON10 gene is not a suitable genetic marker for the differentiation of T. gondii isolates from different hosts and geographical locations, but may represent a potential vaccine candidate against toxoplasmosis, worth further studies.

  16. Sequence Variation in Rhoptry Neck Protein 10 Gene among Toxoplasma gondii Isolates from Different Hosts and Geographical Locations.

    Science.gov (United States)

    Zhao, Yu; Zhou, Donghui; Chen, Jia; Sun, Xiaolin

    2017-01-01

    Toxoplasma gondii, as a eukaryotic parasite of the phylum Apicomplexa, can infect almost all the warm-blooded animals and humans, causing toxoplasmosis. Rhoptry neck proteins (RONs) play a key role in the invasion process of T. gondii and are potential vaccine candidate molecules against toxoplasmosis. The present study examined sequence variation in the rhoptry neck protein 10 (TgRON10) gene among 10 T. gondii isolates from different hosts and geographical locations from Lanzhou province during 2014, and compared with the corresponding sequences of strains ME49 and VEG obtained from the ToxoDB database, using polymerase chain reaction (PCR) amplification, sequence analysis, and phylogenetic reconstruction by Bayesian inference (BI) and maximum parsimony (MP). Analysis of all the 12 TgRON10 genomic and cDNA sequences revealed 7 exons and 6 introns in the TgRON10 gDNA. The complete genomic sequence of the TgRON10 gene ranged from 4759 bp to 4763 bp, and sequence variation was 0-0.6% among the 12 T. gondii isolates, indicating a low sequence variation in TgRON10 gene. Phylogenetic analysis of TgRON10 sequences showed that the cluster of the 12 T. gondii isolates was not completely consistent with their respective genotypes. TgRON10 gene is not a suitable genetic marker for the differentiation of T. gondii isolates from different hosts and geographical locations, but may represent a potential vaccine candidate against toxoplasmosis, worth further studies.
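The abstract reports a sequence variation range (0-0.6%) without stating how it was computed. One plausible reading is the range of pairwise percent differences across aligned sequences, sketched below. The toy sequences are invented, and skipping gap columns is an assumption rather than the authors' stated procedure.

```python
from itertools import combinations
from typing import Dict, Tuple


def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percent of compared alignment columns at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # assumption: columns with a gap are not counted
        compared += 1
        if base_a != base_b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * differences / compared


def variation_range(aligned: Dict[str, str]) -> Tuple[float, float]:
    """Smallest and largest pairwise percent difference in an alignment."""
    values = [
        percent_difference(a, b) for a, b in combinations(aligned.values(), 2)
    ]
    return min(values), max(values)


# Toy alignment with hypothetical isolate names.
toy_alignment = {
    "iso1": "ATGCATGCAT",
    "iso2": "ATGCATGCAT",
    "iso3": "ATGCTTGCAT",
}
low, high = variation_range(toy_alignment)
print(f"{low:.1f}-{high:.1f}%")
```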

  17. The complete mitogenome sequence of the Japanese oak silkmoth, Antheraea yamamai (Lepidoptera: Saturniidae).

    Science.gov (United States)

    Kim, Seong Ryeol; Kim, Man Il; Hong, Mee Yeon; Kim, Kee Young; Kang, Pil Don; Hwang, Jae Sam; Han, Yeon Soo; Jin, Byung Rae; Kim, Iksoo

    2009-09-01

    The 15,338-bp long complete mitochondrial genome (mitogenome) of the Japanese oak silkmoth, Antheraea yamamai (Lepidoptera: Saturniidae) was determined. This genome has a gene arrangement identical to those of all other sequenced lepidopteran insects, but differs from the most common type, as the result of the movement of tRNA(Met) to a position 5'-upstream of tRNA(Ile). No typical start codon of the A. yamamai COI gene is available. Instead, a tetranucleotide, TTAG, which is found at the beginning context of all sequenced lepidopteran insects was tentatively designated as the start codon for A. yamamai COI gene. Three of the 13 protein-coding genes (PCGs) harbor the incomplete termination codon, T or TA. All tRNAs formed stable stem-and-loop structures, with the exception of tRNA(Ser)(AGN), the DHU arm of which formed a simple loop as has been observed in many other metazoan mt tRNA(Ser)(AGN). The 334-bp long A + T-rich region is noteworthy in that it harbors tRNA-like structures, as has also been seen in the A + T-rich regions of other insect mitogenomes. Phylogenetic analyses of the available species of Bombycoidea, Pyraloidea, and Tortricidea bolstered the current morphology-based hypothesis that Bombycoidea and Pyraloidea are monophyletic (Obtectomera). As has been previously suggested, Bombycidae (Bombyx mori and B. mandarina) and Saturniidae (A. yamamai and Caligula boisduvalii) formed a reciprocal monophyletic group.

  18. Complete mitochondrial genome sequence from an endangered Indian snake, Python molurus molurus (Serpentes, Pythonidae).

    Science.gov (United States)

    Dubey, Bhawna; Meganathan, P R; Haque, Ikramul

    2012-07-01

    This paper reports the complete mitochondrial genome sequence of an endangered Indian snake, Python molurus molurus (Indian Rock Python). A typical snake mitochondrial (mt) genome of 17258 bp length comprising 37 genes including the 13 protein coding genes, 22 tRNA genes, and 2 ribosomal RNA genes along with duplicate control regions is described herein. The P. molurus molurus mt. genome is relatively similar to other snake mt. genomes with respect to gene arrangement, composition, tRNA structures and skews of AT/GC bases. The nucleotide composition of the genome shows that there are more A-C % than T-G% on the positive strand as revealed by positive AT and CG skews. Comparison of individual protein coding genes with other snake genomes suggests that ATP8 and NADH3 genes have high divergence rates. Codon usage analysis reveals a preference of NNC codons over NNG codons in the mt. genome of P. molurus. Also, the synonymous and non-synonymous substitution rates (ka/ks) suggest that most of the protein coding genes are under purifying selection pressure. The phylogenetic analyses involving the concatenated 13 protein coding genes of P. molurus molurus conformed to the previously established snake phylogeny.
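The AT and CG skews mentioned in this abstract are conventionally defined as (A − T)/(A + T) and (C − G)/(C + G) on a single strand, so positive values mean more A than T and more C than G. The sketch below applies that conventional definition to an invented toy sequence; the abstract itself does not spell out the formula.

```python
from collections import Counter


def skew(sequence: str, first: str, second: str) -> float:
    """Compositional skew (first - second) / (first + second) on one strand."""
    counts = Counter(sequence.upper())
    total = counts[first] + counts[second]
    if total == 0:
        raise ValueError(f"sequence contains neither {first} nor {second}")
    return (counts[first] - counts[second]) / total


# Toy strand (hypothetical): 3 A, 3 C, 1 G, 1 T.
toy_strand = "AAACCCGT"
at_skew = skew(toy_strand, "A", "T")
cg_skew = skew(toy_strand, "C", "G")
print(at_skew, cg_skew)  # both positive: more A than T, more C than G
```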

  19. Analysis of the Complete Mitochondrial Genome Sequence of the Diploid Cotton Gossypium raimondii by Comparative Genomics Approaches

    Directory of Open Access Journals (Sweden)

    Changwei Bi

    2016-01-01

    Cotton is one of the most important economic crops and the primary source of natural fiber and is an important protein source for animal feed. The complete nuclear and chloroplast (cp) genome sequences of G. raimondii are already available but not mitochondria. Here, we assembled the complete mitochondrial (mt) DNA sequence of G. raimondii into a circular genome of length of 676,078 bp and performed comparative analyses with other higher plants. The genome contains 39 protein-coding genes, 6 rRNA genes, and 25 tRNA genes. We also identified four larger repeats (63.9 kb, 10.6 kb, 9.1 kb, and 2.5 kb) in this mt genome, which may be active in intramolecular recombination in the evolution of cotton. Strikingly, nearly all of the G. raimondii mt genome has been transferred to nucleus on Chr1, and the transfer event must be very recent. Phylogenetic analysis reveals that G. raimondii, as a member of Malvaceae, is much closer to another cotton (G. barbadense) than other rosids, and the clade formed by two Gossypium species is sister to Brassicales. The G. raimondii mt genome may provide a crucial foundation for evolutionary analysis, molecular biology, and cytoplasmic male sterility in cotton and other higher plants.

  20. Complete genome sequence of Defluviimonas alba cai42T, a microbial exopolysaccharides producer.

    Science.gov (United States)

    Zhao, Jie-Yu; Geng, Shuang; Xu, Lian; Hu, Bing; Sun, Ji-Quan; Nie, Yong; Tang, Yue-Qin; Wu, Xiao-Lei

    2016-12-10

    Defluviimonas alba cai42T, isolated from the oil-production water in Xinjiang Oilfield in China, has a strong ability to produce exopolysaccharides (EPS). We hereby present its complete genome sequence information which consists of a circular chromosome and three plasmids. The strain characteristically contains various genes encoding for enzymes involved in EPS biosynthesis, modification, and export. According to the genomic and physiochemical data, it is predicted that the strain has the potential to be utilized in industrial production of microbial EPS. Copyright © 2016 Elsevier B.V. All rights reserved.

  1. The complete mitochondrial genome of Pseudocellus pearsei (Chelicerata: Ricinulei) and a comparison of mitochondrial gene rearrangements in Arachnida

    Directory of Open Access Journals (Sweden)

    Braband Anke

    2007-10-01

    Background: Mitochondrial genomes are widely utilized for phylogenetic and population genetic analyses among animals. In addition to sequence data the mitochondrial gene order and RNA secondary structure data are used in phylogenetic analyses. Arachnid phylogeny is still highly debated and there is a lack of sufficient sequence data for many taxa. Ricinulei (hooded tickspiders) are a morphologically distinct clade of arachnids with uncertain phylogenetic affinities. Results: The first complete mitochondrial DNA genome of a member of the Ricinulei, Pseudocellus pearsei (Arachnida: Ricinulei), was sequenced using a PCR-based approach. The mitochondrial genome is a typical circular duplex DNA molecule with a size of 15,099 bp, showing the complete set of genes usually present in bilaterian mitochondrial genomes. Five tRNA genes (trnW, trnY, trnN, trnL(CUN), trnV) show different relative positions compared to other Chelicerata (e.g. Limulus polyphemus, Ixodes spp.). We propose that two events led to this derived gene order: (1) a tandem duplication followed by random deletion and (2) an independent translocation of trnN. Most of the inferred tRNA secondary structures show the common cloverleaf pattern except tRNA-Glu where the TψC-arm is missing. In phylogenetic analyses (maximum likelihood, maximum parsimony, Bayesian inference) using concatenated amino acid and nucleotide sequences of protein-coding genes the basal relationships of arachnid orders remain unresolved. Conclusion: Phylogenetic analyses (ML, MP, BI) of arachnid mitochondrial genomes fail to resolve interordinal relationships of Arachnida and remain in a preliminary stage because there is still a lack of mitogenomic data from important taxa such as Opiliones and Pseudoscorpiones. Gene order varies considerably within Arachnida – only eight out of 23 species have retained the putative arthropod ground pattern. Some gene order changes are valuable characters in phylogenetic analysis of

  2. Complete Genome Sequence of a Genotype XVII Newcastle Disease Virus, Isolated from an Apparently Healthy Domestic Duck in Nigeria

    OpenAIRE

    Shittu, Ismaila; Sharma, Poonam; Joannis, Tony M.; Volkening, Jeremy D.; Odaibo, Georgina N.; Olaleye, David O.; Williams-Coplin, Dawn; Solomon, Ponman; Abolnik, Celia; Miller, Patti J.; Dimitrov, Kiril M.; Afonso, Claudio L.

    2016-01-01

    The first complete genome sequence of a strain of Newcastle disease virus (NDV) of genotype XVII is described here. A velogenic strain (duck/Nigeria/903/KUDU-113/1992) was isolated from an apparently healthy free-roaming domestic duck sampled in Kuru, Nigeria, in 1992. Phylogenetic analysis of the fusion protein gene and complete genome classified the isolate as a member of NDV class II, genotype XVII.

  3. Complete sequencing of the bla(NDM-1)-positive IncA/C plasmid from Escherichia coli ST38 isolate suggests a possible origin from plant pathogens.

    Science.gov (United States)

    Sekizuka, Tsuyoshi; Matsui, Mari; Yamane, Kunikazu; Takeuchi, Fumihiko; Ohnishi, Makoto; Hishinuma, Akira; Arakawa, Yoshichika; Kuroda, Makoto

    2011-01-01

    The complete sequence of the plasmid pNDM-1_Dok01 carrying New Delhi metallo-β-lactamase (NDM-1) was determined by whole genome shotgun sequencing using Escherichia coli strain NDM-1_Dok01 (multilocus sequence typing type: ST38) and the transconjugant E. coli DH10B. The plasmid is an IncA/C incompatibility type composed of 225 predicted coding sequences in 195.5 kb and partially shares a sequence with bla(CMY-2)-positive IncA/C plasmids such as E. coli AR060302 pAR060302 (166.5 kb) and Salmonella enterica serovar Newport pSN254 (176.4 kb). The bla(NDM-1) gene in pNDM-1_Dok01 is terminally flanked by two IS903 elements that are distinct from those of the other characterized NDM-1 plasmids, suggesting that the bla(NDM-1) gene has been broadly transposed, together with various mobile elements, as a cassette gene. The chaperonin groES and groEL genes were identified in the bla(NDM-1)-related composite transposon, and phylogenetic analysis and guanine-cytosine content (GC) percentage showed similarities to the homologs of plant pathogens such as Pseudoxanthomonas and Xanthomonas spp., implying that plant pathogens are the potential source of the bla(NDM-1) gene. The complete sequence of pNDM-1_Dok01 suggests that the bla(NDM-1) gene was acquired by a novel composite transposon on an extensively disseminated IncA/C plasmid and transferred to the E. coli ST38 isolate.

  4. Complete mitochondrial genome of Zeugodacus tau (Insecta: Tephritidae) and differentiation of Z. tau species complex by mitochondrial cytochrome c oxidase subunit I gene.

    Directory of Open Access Journals (Sweden)

    Hoi-Sen Yong

    The tephritid fruit fly Zeugodacus tau (Walker) is a polyphagous fruit pest of economic importance in Asia. Studies based on genetic markers indicate that it forms a species complex. We report here (1) the complete mitogenome of Z. tau from Malaysia and comparison with that of China as well as the mitogenome of other congeners, and (2) the relationship of Z. tau taxa from different geographical regions based on sequences of cytochrome c oxidase subunit I gene. The complete mitogenome of Z. tau had a total length of 15631 bp for the Malaysian specimen (ZT3) and 15835 bp for the China specimen (ZT1), with similar gene order comprising 37 genes (13 protein-coding genes-PCGs, 2 rRNA genes, and 22 tRNA genes) and a non-coding A + T-rich control region (D-loop). Based on 13 PCGs and 15 mt-genes, Z. tau NC_027290 (China) and Z. tau ZT1 (China) formed a sister group in the lineage containing also Z. tau ZT3 (Malaysia). Phylogenetic analysis based on partial sequences of cox1 gene indicates that the taxa from China, Japan, Laos, Malaysia, Bangladesh, India, Sri Lanka, and Z. tau sp. A from Thailand belong to Z. tau sensu stricto. A complete cox1 gene (or 13 PCGs or 15 mt-genes) instead of partial sequence is more appropriate for determining phylogenetic relationship.

  5. Complete Genome Sequence of Bacillus vallismortis NBIF-001, a Novel Strain from Shangri-La, China, That Has High Activity against Fusarium oxysporum.

    Science.gov (United States)

    Liu, Xiaoyan; Min, Yong; Huang, Daye; Zhou, Ronghua; Fang, Wei; Liu, Cuijun; Rao, Ben; Zhang, Guangyang; Wang, Kaimei; Yang, Ziwen

    2017-11-30

    Bacillus vallismortis NBIF-001, a Gram-positive bacterium, was isolated from soil in Shangri-La, China. Here, we provide the complete genome sequence of this bacterium, which has a 3,929,787-bp-long genome, including 4,030 protein-coding genes and 195 RNA genes. This strain possesses a number of genes encoding virulence factors of pathogens. Copyright © 2017 Liu et al.

  6. Complete Genome Sequence of Escherichia coli Strain WG5

    DEFF Research Database (Denmark)

    Imamovic, Lejla; Misiakou, Maria-Anna; van der Helm, Eric

    2018-01-01

    Escherichia coli strain WG5 is a widely used host for phage detection, including somatic coliphages employed as standard ISO method 10705-1 (2000). Here, we present the complete genome sequence of a commercial E. coli WG5 strain.

  7. Complete Genome Sequence of Staphylococcus succinus 14BME20 Isolated from a Traditional Korean Fermented Soybean Food

    OpenAIRE

    Jeong, Do-Won; Lee, Jong-Hoon

    2017-01-01

    The complete genome sequence of Staphylococcus succinus 14BME20, isolated from a Korean fermented soybean food and selected as a possible starter culture candidate, was determined. Comparative genome analysis with S. succinus CSM-77 from a Triassic salt mine revealed the presence of strain-specific genes for lipid degradation in strain 14BME20.

  8. The complete nucleotide sequence of RNA 3 of a peach isolate of Prunus necrotic ringspot virus.

    Science.gov (United States)

    Hammond, R W; Crosslin, J M

    1995-04-01

    The complete nucleotide sequence of RNA 3 of the PE-5 peach isolate of Prunus necrotic ringspot ilarvirus (PNRSV) was obtained from cloned cDNA. The RNA sequence is 1941 nucleotides and contains two open reading frames (ORFs). ORF 1 consisted of 284 amino acids with a calculated molecular weight of 31,729 Da and ORF 2 contained 224 amino acids with a calculated molecular weight of 25,018 Da. ORF 2 corresponds to the coat protein gene. Expression of ORF 2 engineered into a pTrcHis vector in Escherichia coli results in a fusion polypeptide of approximately 28 kDa which cross-reacts with PNRSV polyclonal antiserum. Analysis of the coat protein amino acid sequence reveals a putative "zinc-finger" domain at the amino-terminal portion of the protein. Two tetranucleotide AUGC motifs occur in the 3'-UTR of the RNA and may function in coat protein binding and genome activation. ORF 1 homologies to other ilarviruses and alfalfa mosaic virus are confined to limited regions of conserved amino acids. The translated amino acid sequence of the coat protein gene shows 92% similarity to one isolate of apple mosaic virus, a closely related member of the ilarvirus group of plant viruses, but only 66% similarity to the amino acid sequence of the coat protein gene of a second isolate. These relationships are also reflected at the nucleotide sequence level. These results in one instance confirm the close similarities observed at the biophysical and serological levels between these two viruses, but on the other hand call into question the nomenclature used to describe these viruses.

  9. Complete mitochondrial genome sequence of Urechis caupo, a representative of the phylum Echiura

    Directory of Open Access Journals (Sweden)

    Boore Jeffrey L

    2004-09-01

    Background: Mitochondria contain small genomes that are physically separate from those of nuclei. Their comparison serves as a model system for understanding the processes of genome evolution. Although hundreds of these genome sequences have been reported, the taxonomic sampling is highly biased toward vertebrates and arthropods, with many whole phyla remaining unstudied. This is the first description of a complete mitochondrial genome sequence of a representative of the phylum Echiura, that of the fat innkeeper worm, Urechis caupo. Results: This mtDNA is 15,113 nts in length and 62% A+T. It contains the 37 genes that are typical for animal mtDNAs in an arrangement somewhat similar to that of annelid worms. All genes are encoded by the same DNA strand, which is rich in A and C relative to the opposite strand. Codons ending with the dinucleotide GG are more frequent than would be expected from apparent mutational biases. The largest non-coding region is only 282 nts long, is 71% A+T, and has potential for secondary structures. Conclusions: Urechis caupo mtDNA shares many features with those of the few studied annelids, including the common usage of ATG start codons, unusual among animal mtDNAs, as well as gene arrangements, tRNA structures, and codon usage biases.
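The A+T percentages quoted in this record are simple base-composition ratios. A minimal sketch of how such a figure can be computed is shown below; the function name `at_fraction` and the toy sequence are illustrative assumptions, not material from the record.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A and T among the unambiguous bases of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # ignore N and other symbols
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in bases) / len(bases)


# Hypothetical toy sequence, used only to exercise the function.
toy = "ATTAGCAT"
print(at_fraction(toy))  # 6 of the 8 bases are A or T

# Rough count of A+T positions implied by the reported figures
# (15,113 nt at 62% A+T); approximate, because 62% is itself rounded.
print(round(0.62 * 15_113))
```

Applied to the full mtDNA sequence, the same ratio would be expected to come out near 0.62, subject to how ambiguous bases are handled.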

  10. Complete Genome of Stachybotrys chartarum strain 51-11

    Data.gov (United States)

    U.S. Environmental Protection Agency — Complete genome sequence of the fungus Stachybotrys chartarum. Sequences can be used to identify genes, genetic pathways, gene clusters, genetic organization, etc....

  11. Molecular cloning, sequence characterization and expression pattern of Rab18 gene from watermelon (Citrullus lanatus).

    Science.gov (United States)

    Xinli, Xiao; Lei, Peng

    2015-03-04

    The complete mRNA sequence of watermelon Rab18 gene was amplified through the rapid amplification of cDNA ends (RACE) method. The full-length mRNA was 1010 bp containing a 645 bp open reading frame, which encodes a protein of 214 amino acids. Sequence analysis revealed that watermelon Rab18 protein shares high homology with the Rab18 of cucumber (99%), muskmelon (98%), Morus notabilis (90%), tomato (89%), wine grape (89%) and potato (88%). Phylogenetic analysis revealed that watermelon Rab18 gene has a closer genetic relationship with Rab18 gene of cucumber and muskmelon. Tissue expression profile analysis indicated that watermelon Rab18 gene was highly expressed in root, stem and leaf, moderately expressed in flower and weakly expressed in fruit.
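The relationship between the reported ORF length and protein length can be checked with a short calculation. The sketch below assumes that the quoted ORF length includes the stop codon, which is the usual convention but is not stated in the record; the function name is illustrative.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of orf_bp base pairs."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no amino acid.
    return codons - 1 if includes_stop else codons


# Watermelon Rab18: 645 bp ORF, 214 amino acids reported.
print(protein_length_from_orf(645))  # 214
```

Under that assumption, 645 bp corresponds to 215 codons, one of which is the stop codon, matching the 214 residues reported.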

  12. Complete genome sequence of Olsenella uli type strain (VPI D76D-27CT)

    Energy Technology Data Exchange (ETDEWEB)

    Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Held, Brittany [Los Alamos National Laboratory (LANL)]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]

    2010-01-01

    Olsenella uli (Olsen et al. 1991) Dewhirst et al. 2001 is the type species of the genus Olsenella, which belongs to the actinobacterial family Coriobacteriaceae. The species is of interest because it is frequently isolated from dental plaque in periodontitis patients and can cause primary endodontic infection. The species is a Gram-positive, non-motile and non-sporulating bacterium. The strain described in this study was isolated from human gingival crevices in 1982. This is the first completed sequence of the genus Olsenella and the fifth sequence from the family Coriobacteriaceae. The 2,051,896 bp long genome with its 1,795 protein-coding and 55 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  13. Complete Genome Sequence of Staphylococcus epidermidis 1457.

    Science.gov (United States)

    Galac, Madeline R; Stam, Jason; Maybank, Rosslyn; Hinkle, Mary; Mack, Dietrich; Rohde, Holger; Roth, Amanda L; Fey, Paul D

    2017-06-01

    Staphylococcus epidermidis 1457 is a frequently utilized strain that is amenable to genetic manipulation and has been widely used for biofilm-related research. We report here the whole-genome sequence of this strain, which encodes 2,277 protein-coding genes and 81 RNAs within its 2.4-Mb genome and plasmid. Copyright © 2017 Galac et al.

  14. Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

    Energy Technology Data Exchange (ETDEWEB)

    Fields, C.A.

    1996-06-01

    The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler), has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0, has now been completed and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project, including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.

  15. Complete genome sequencing and phylogenetic analysis of dengue type 1 virus isolated from Jeddah, Saudi Arabia.

    Science.gov (United States)

    Azhar, Esam I; Hashem, Anwar M; El-Kafrawy, Sherif A; Abol-Ela, Said; Abd-Alla, Adly M M; Sohrab, Sayed Sartaj; Farraj, Suha A; Othman, Norah A; Ben-Helaby, Huda G; Ashshi, Ahmed; Madani, Tariq A; Jamjoom, Ghazi

    2015-01-16

    Dengue viruses (DENVs) are mosquito-borne viruses which can cause disease ranging from mild fever to severe dengue infection. These viruses are endemic in several tropical and subtropical regions. Multiple outbreaks of DENV serotypes 1, 2 and 3 (DENV-1, DENV-2 and DENV-3) have been reported from the western region of Saudi Arabia since 1994. Strains from at least two genotypes of DENV-1 (Asia and America/Africa genotypes) circulated in western Saudi Arabia until 2006. However, all previous studies reported from Saudi Arabia were based on partial sequencing data of the envelope (E) gene, without any reports of full genome sequences for any DENV serotypes circulating in Saudi Arabia. Here, we report the isolation and the first complete genome sequence of a DENV-1 strain (DENV-1-Jeddah-1-2011) isolated from a patient from Jeddah, Saudi Arabia in 2011. Whole genome sequence alignment and phylogenetic analysis showed high similarity between the DENV-1-Jeddah-1-2011 strain and the D1/H/IMTSSA/98/606 isolate (Asian genotype) reported from Djibouti in 1998. Further analysis of the full envelope gene revealed a close relationship between the DENV-1-Jeddah-1-2011 strain and isolates reported between 2004 and 2006 from Jeddah as well as recent isolates from Somalia, suggesting the wide spread of the Asian genotype in this region. These data suggest that strains belonging to the Asian genotype might have been introduced into Saudi Arabia long before 2004, most probably by African pilgrims, and continued to circulate in western Saudi Arabia at least until 2011. Most importantly, these results indicate that pilgrims from dengue endemic regions can play an important role in the spread of new DENVs in Saudi Arabia and the rest of the world. Therefore, availability of complete genome sequences would serve as a reference for future epidemiological studies of DENV-1 viruses.

  16. Complete Genome Sequence of a Genotype XVII Newcastle Disease Virus, Isolated from an Apparently Healthy Domestic Duck in Nigeria

    Science.gov (United States)

    Shittu, Ismaila; Sharma, Poonam; Joannis, Tony M.; Volkening, Jeremy D.; Odaibo, Georgina N.; Olaleye, David O.; Williams-Coplin, Dawn; Solomon, Ponman; Abolnik, Celia; Miller, Patti J.; Dimitrov, Kiril M.

    2016-01-01

    The first complete genome sequence of a strain of Newcastle disease virus (NDV) of genotype XVII is described here. A velogenic strain (duck/Nigeria/903/KUDU-113/1992) was isolated from an apparently healthy free-roaming domestic duck sampled in Kuru, Nigeria, in 1992. Phylogenetic analysis of the fusion protein gene and complete genome classified the isolate as a member of NDV class II, genotype XVII. PMID:26847901

  17. Sequence of a complete chicken BG haplotype shows dynamic expansion and contraction of two gene lineages with particular expression patterns

    DEFF Research Database (Denmark)

    Salomonsen, Jan; Chattaway, John A.; Chan, Andrew C. Y.

    2014-01-01

    complex (MHC), and show striking association with particular autoimmune diseases. In chickens, BG genes encode homologues with somewhat different domain organisation. Only a few BG genes have been characterised, one involved in actin-myosin interaction in the intestinal brush border, and another...... implicated in resistance to viral diseases. We characterise all BG genes in B12 chickens, finding a multigene family organised as tandem repeats in the BG region outside the MHC, a single gene in the MHC (the BF-BL region), and another single gene on a different chromosome. There is a precise cell and tissue...... many hybrid genes, suggesting recombination and/or deletion as major evolutionary forces. We identify BG genes in the chicken whole genome shotgun sequence, as well as by comparison to other haplotypes by fibre fluorescence in situ hybridisation, confirming dynamic expansion and contraction within...

  18. Metazoan Remaining Genes for Essential Amino Acid Biosynthesis: Sequence Conservation and Evolutionary Analyses

    Directory of Open Access Journals (Sweden)

    Igor R. Costa

    2014-12-01

    Essential amino acids (EAA) consist of a group of nine amino acids that animals are unable to synthesize via de novo pathways. Recently, it has been found that most metazoans lack the same set of enzymes responsible for the de novo EAA biosynthesis. Here we investigate the sequence conservation and evolution of all the metazoan remaining genes for EAA pathways. Initially, the set of all 49 enzymes responsible for the EAA de novo biosynthesis in yeast was retrieved. These enzymes were used as BLAST queries to search for similar sequences in a database containing 10 complete metazoan genomes. Eight enzymes typically attributed to EAA pathways were found to be ubiquitous in metazoan genomes, suggesting a conserved functional role. In this study, we address the question of how these genes evolved after losing their pathway partners. To do this, we compared metazoan genes with their fungal and plant orthologs. Using phylogenetic analysis with maximum likelihood, we found that acetolactate synthase (ALS) and betaine-homocysteine S-methyltransferase (BHMT) diverged from the expected Tree of Life (ToL) relationships. High sequence conservation in the paraphyletic group Plant-Fungi was identified for these two genes using a newly developed Python algorithm. Selective pressure analysis of ALS and BHMT protein sequences showed higher non-synonymous mutation ratios in comparisons between metazoans/fungi and metazoans/plants, supporting the hypothesis that these two genes have undergone non-ToL evolution in animals.

  19. Analysis of the complete genome sequence of Nocardia seriolae UTF1, the causative agent of fish nocardiosis: The first reference genome sequence of the fish pathogenic Nocardia species.

    Science.gov (United States)

    Yasuike, Motoshige; Nishiki, Issei; Iwasaki, Yuki; Nakamura, Yoji; Fujiwara, Atushi; Shimahara, Yoshiko; Kamaishi, Takashi; Yoshida, Terutoyo; Nagai, Satoshi; Kobayashi, Takanori; Katoh, Masaya

    2017-01-01

    Nocardiosis caused by Nocardia seriolae is one of the major threats in the aquaculture of Seriola species (yellowtail; S. quinqueradiata, amberjack; S. dumerili and kingfish; S. lalandi) in Japan. Here, we report the complete nucleotide genome sequence of N. seriolae UTF1, isolated from a cultured yellowtail. The genome is a circular chromosome of 8,121,733 bp with a G+C content of 68.1% that encodes 7,697 predicted proteins. In the N. seriolae UTF1 predicted genes, we found orthologs of virulence factors of pathogenic mycobacteria and human clinical Nocardia isolates involved in host cell invasion, modulation of phagocyte function and survival inside the macrophages. The virulence factor candidates provide an essential basis for understanding their pathogenic mechanisms at the molecular level by the fish nocardiosis research community in future studies. We also found many potential antibiotic resistance genes on the N. seriolae UTF1 chromosome. Comparative analysis with the four existing complete genomes, N. farcinica IFM 10152, N. brasiliensis HUJEG-1 and N. cyriacigeorgica GUH-2 and N. nova SH22a, revealed that 2,745 orthologous genes were present in all five Nocardia genomes (core genes) and 1,982 genes were unique to N. seriolae UTF1. In particular, the N. seriolae UTF1 genome contains a greater number of mobile elements and genes of unknown function that comprise the differences in structure and gene content from the other Nocardia genomes. In addition, a lot of the N. seriolae UTF1-specific genes were assigned to the ABC transport system. Because of limited resources in ocean environments, these N. seriolae UTF1 specific ABC transporters might facilitate adaptation strategies essential for marine environment survival. Thus, the availability of the complete N. seriolae UTF1 genome sequence will provide a valuable resource for comparative genomic studies of N. seriolae isolates, as well as provide new insights into the ecological and functional diversity of

  20. Molecular Cloning and Sequence Analysis of a Phenylalanine Ammonia-Lyase Gene from Dendrobium

    Science.gov (United States)

    Cai, Yongping; Lin, Yi

    2013-01-01

    In this study, a phenylalanine ammonia-lyase (PAL) gene was cloned from Dendrobium candidum using homology cloning and RACE. The full-length PAL cDNA of D. candidum (designated Dc-PAL1, GenBank No. JQ765748) has 2,458 bp and contains a complete open reading frame (ORF) of 2,142 bp, which encodes 713 amino acid residues. The amino acid sequence of DcPAL1 has more than 80% sequence identity with the PAL genes of other plants, as indicated by multiple alignments. The dominant sites and catalytic active sites, which are similar to those in PAL proteins of Arabidopsis thaliana and Nicotiana tabacum, are also found in DcPAL1. Phylogenetic tree analysis revealed that DcPAL is more closely related to PALs from Orchidaceae plants than to those of other plants. The differential expression patterns of PAL in protocorm-like body, leaf, stem, and root suggest that the PAL gene performs multiple physiological functions in Dendrobium candidum. PMID:23638048

  1. Complete plastid genome sequence of goosegrass (Eleusine indica) and comparison with other Poaceae.

    Science.gov (United States)

    Zhang, Hui; Hall, Nathan; McElroy, J Scott; Lowe, Elijah K; Goertzen, Leslie R

    2017-02-05

    Eleusine indica, also known as goosegrass, is a serious weed in at least 42 countries. In this paper we report the complete plastid genome sequence of goosegrass obtained by de novo assembly of paired-end and mate-paired reads generated by Illumina sequencing of total genomic DNA. The goosegrass plastome is a circular molecule of 135,151 bp in length, consisting of two single-copy regions separated by a pair of inverted repeats (IRs) of 20,919 bases. The large (LSC) and the small (SSC) single-copy regions span 80,667 bases and 12,646 bases, respectively. The plastome of goosegrass has 38.19% GC content and includes 108 unique genes, of which 76 are protein-coding, 28 are transfer RNA, and 4 are ribosomal RNA. The goosegrass plastome sequence was compared to eight other species of Poaceae. Although generally conserved with respect to Poaceae, this genomic resource will be useful for evolutionary studies within this weed species and the genus Eleusine. Copyright © 2016. Published by Elsevier B.V.
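The region sizes reported for the goosegrass plastome can be cross-checked against the total length, since a quadripartite plastome consists of the LSC, the SSC, and two copies of the IR. The sketch below uses only the figures quoted in the abstract; the function name is illustrative.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two inverted-repeat copies."""
    return lsc + ssc + 2 * ir


# Figures quoted in the abstract (in bases).
total = quadripartite_length(lsc=80_667, ssc=12_646, ir=20_919)
print(total)             # 135151
print(total == 135_151)  # True: consistent with the reported genome size
```

The same check is a quick sanity test for any plastome annotation in which all four regions are reported.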

  2. Sequence analysis of cereal sucrose synthase genes and isolation ...

    African Journals Online (AJOL)

    SERVER

    2007-10-18

    Oct 18, 2007 ... sequencing of sucrose synthase gene fragment from sorghum using primers designed at their conserved exons. MATERIALS AND METHODS. Multiple sequence alignment. Sucrose synthase gene sequences of various cereals like rice, maize, and barley were accessed from NCBI Genbank database.

  3. Complete genome sequence of a novel pestivirus from sheep.

    Science.gov (United States)

    Becher, Paul; Schmeiser, Stefanie; Oguzoglu, Tuba Cigdem; Postel, Alexander

    2012-10-01

    We report here the complete genome sequence of pestivirus strain Aydin/04-TR, which is the prototype of a group of similar viruses currently present in sheep and goats in Turkey. Sequence data from this virus showed that it clusters separately from the established and previously proposed tentative pestivirus species.

  4. Complete Genome Sequence of a Novel Pestivirus from Sheep

    OpenAIRE

    Becher, Paul; Schmeiser, Stefanie; Oguzoglu, Tuba Cigdem; Postel, Alexander

    2012-01-01

    We report here the complete genome sequence of pestivirus strain Aydin/04-TR, which is the prototype of a group of similar viruses currently present in sheep and goats in Turkey. Sequence data from this virus showed that it clusters separately from the established and previously proposed tentative pestivirus species.

  5. Complete genome sequence of Enterobacter sp. IIT-BT 08: A potential microbial strain for high rate hydrogen production.

    Science.gov (United States)

    Khanna, Namita; Ghosh, Ananta Kumar; Huntemann, Marcel; Deshpande, Shweta; Han, James; Chen, Amy; Kyrpides, Nikos; Mavrommatis, Kostas; Szeto, Ernest; Markowitz, Victor; Ivanova, Natalia; Pagani, Ioanna; Pati, Amrita; Pitluck, Sam; Nolan, Matt; Woyke, Tanja; Teshima, Hazuki; Chertkov, Olga; Daligault, Hajnalka; Davenport, Karen; Gu, Wei; Munk, Christine; Zhang, Xiaojing; Bruce, David; Detter, Chris; Xu, Yan; Quintana, Beverly; Reitenga, Krista; Kunde, Yulia; Green, Lance; Erkkila, Tracy; Han, Cliff; Brambilla, Evelyne-Marie; Lang, Elke; Klenk, Hans-Peter; Goodwin, Lynne; Chain, Patrick; Das, Debabrata

    2013-12-20

    Enterobacter sp. IIT-BT 08 belongs to Phylum: Proteobacteria, Class: Gammaproteobacteria, Order: Enterobacteriales, Family: Enterobacteriaceae. The organism was isolated from the leaves of a local plant near the Kharagpur railway station, Kharagpur, West Bengal, India. It has been extensively studied for fermentative hydrogen production because of its high hydrogen yield. For further enhancement of hydrogen production by strain development, complete genome sequence analysis was carried out. Sequence analysis revealed that the genome was linear, 4.67 Mbp long and had a GC content of 56.01%. The genome properties encode 4,393 protein-coding and 179 RNA genes. Additionally, a putative pathway of hydrogen production was suggested based on the presence of formate hydrogen lyase complex and other related genes identified in the genome. Thus, in the present study we describe the specific properties of the organism and the generation, annotation and analysis of its genome sequence as well as discuss the putative pathway of hydrogen production by this organism.

  6. Nucleotide sequences of immunoglobulin eta genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution

    Energy Technology Data Exchange (ETDEWEB)

    Sakoyama, Y.; Hong, K.J.; Byun, S.M.; Hisajima, H.; Ueda, S.; Yaoita, Y.; Hayashida, H.; Miyata, T.; Honjo, T.

    1987-02-01

    To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (C eta1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of α1-antitrypsin and β- and delta-globulin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 x 10^-9 substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 +/- 2.6 million years and 17.3 +/- 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
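The arithmetic behind a molecular-clock date can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it assumes the common relation T = K / (2r), where K is the silent-site divergence between two species and r is the per-lineage rate, and it assumes the rate quoted in the abstract is a per-lineage rate. The function names are hypothetical.

```python
RATE = 1.56e-9  # silent substitutions per site per year, as quoted in the abstract


def divergence_time_years(k_silent: float, rate: float = RATE) -> float:
    """Divergence time under T = K / (2r); assumes both lineages evolve at rate r."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return k_silent / (2.0 * rate)


def implied_silent_divergence(t_years: float, rate: float = RATE) -> float:
    """Inverse relation, K = 2rT: the divergence implied by a given date."""
    return 2.0 * rate * t_years


# Silent divergence that the reported dates would imply under these assumptions.
for label, t in [("human-chimpanzee", 6.4e6), ("human-orangutan", 17.3e6)]:
    k = implied_silent_divergence(t)
    print(f"{label}: K = {k:.4f} substitutions per silent site")
```

If the quoted rate were instead a pairwise rate, the factor of 2 would drop out and the implied divergences would halve, so the convention matters when reusing the figure.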

  7. Gene mining a marama bean expressed sequence tags (ESTs ...

    African Journals Online (AJOL)

    The authors reported the identification of genes associated with embryonic development and microsatellite sequences. The future direction will entail characterization of these genes using gene over-expression and mutant assays. Key words: Namibia, simple sequence repeats (SSR), data mining, homology searches, ...

  8. Presence and Expression of Microbial Genes Regulating Soil Nitrogen Dynamics Along the Tanana River Successional Sequence

    Science.gov (United States)

    Boone, R. D.; Rogers, S. L.

    2004-12-01

    We report on work to assess the functional gene sequences for soil microbiota that control nitrogen cycle pathways along the successional sequence (willow, alder, poplar, white spruce, black spruce) on the Tanana River floodplain, Interior Alaska. Microbial DNA and mRNA were extracted from soils (0-10 cm depth) for amoA (ammonium monooxygenase), nifH (nitrogenase reductase), napA (nitrate reductase), and nirS and nirK (nitrite reductase) genes. Gene presence was determined by amplification of a conserved sequence of each gene employing sequence specific oligonucleotide primers and Polymerase Chain Reaction (PCR). Expression of the genes was measured via nested reverse transcriptase PCR amplification of the extracted mRNA. Amplified PCR products were visualized on agarose electrophoresis gels. All five successional stages show evidence for the presence and expression of microbial genes that regulate N fixation (free-living), nitrification, and nitrate reduction. We detected (1) nifH, napA, and nirK presence and amoA expression (mRNA production) for all five successional stages and (2) nirS and amoA presence and nifH, nirK, and napA expression for early successional stages (willow, alder, poplar). The results highlight that the existing body of previous process-level work has not sufficiently considered the microbial potential for a nitrate economy and free-living N fixation along the complete floodplain successional sequence.

  9. Complete genome sequence of the aerobic CO-oxidizing thermophile Thermomicrobium roseum.

    Directory of Open Access Journals (Sweden)

    Dongying Wu

    In order to enrich the phylogenetic diversity represented in the available sequenced bacterial genomes and as part of an "Assembling the Tree of Life" project, we determined the genome sequence of Thermomicrobium roseum DSM 5159. T. roseum DSM 5159 is a red-pigmented, rod-shaped, Gram-negative extreme thermophile isolated from a hot spring that possesses both an atypical cell wall composition and an unusual cell membrane that is composed entirely of long-chain 1,2-diols. Its genome is composed of two circular DNA elements, one of 2,006,217 bp (referred to as the chromosome) and one of 919,596 bp (referred to as the megaplasmid). Strikingly, though few standard housekeeping genes are found on the megaplasmid, it does encode a complete system for chemotaxis including both chemosensory components and an entire flagellar apparatus. This is the first known example of a complete flagellar system being encoded on a plasmid and suggests a straightforward means for lateral transfer of flagellum-based motility. Phylogenomic analyses support the recent rRNA-based analyses that led to T. roseum being removed from the phylum Thermomicrobia and assigned to the phylum Chloroflexi. Because T. roseum is a deep-branching member of this phylum, analysis of its genome provides insights into the evolution of the Chloroflexi. In addition, even though this species is not photosynthetic, analysis of the genome provides some insight into the origins of photosynthesis in the Chloroflexi. Metabolic pathway reconstructions and experimental studies revealed new aspects of the biology of this species. For example, we present evidence that T. roseum oxidizes CO aerobically, making it the first thermophile known to do so. In addition, we propose that glycosylation of its carotenoids plays a crucial role in the adaptation of the cell membrane to this bacterium's thermophilic lifestyle.
Analyses of published metagenomic sequences from two hot springs similar to the one from which

  10. Complete genome sequence of Clostridium estertheticum DSM 8809, a microbe identified in spoiled vacuum packed beef

    Directory of Open Access Journals (Sweden)

    Zhongyi Yu

    2016-11-01

    Blown pack spoilage (BPS) is a major issue for the beef industry. Aetiological agents of BPS involve members of a group of Clostridium species, including Clostridium estertheticum, which has the ability to produce gas, mostly carbon dioxide, under anaerobic psychrotrophic growth conditions. This spore-forming bacterium grows slowly under laboratory conditions, and it can take up to 3 months to produce a workable culture. These characteristics have limited the study of this commercially challenging bacterium. Consequently, information on this bacterium is limited and no effective controls are currently available to confidently detect and manage this production risk. In this study the complete genome of Clostridium estertheticum DSM 8809 was determined by SMRT® sequencing. The genome consists of a circular chromosome of 4.7 Mbp along with a single plasmid carrying a potential tellurite resistance gene tehB and a Tn3-like resolvase-encoding gene tnpR. The genome sequence was searched for central metabolic pathways that would support its biochemical profile, and several enzymes contributing to this phenotype were identified. Several putative antibiotic/biocide/metal resistance-encoding genes and virulence factors were also identified in the genome, a feature that requires further research. The availability of the genome sequence will provide a basic blueprint from which to develop valuable biomarkers that could support and improve the detection and control of this bacterium along the beef production chain.

  11. Identification of rat genes by TWINSCAN gene prediction, RT-PCR, and direct sequencing

    DEFF Research Database (Denmark)

    Wu, Jia Qian; Shteynberg, David; Arumugam, Manimozhiyan

    2004-01-01

    an alternative approach: reverse transcription-polymerase chain reaction (RT-PCR) and direct sequencing based on dual-genome de novo predictions from TWINSCAN. We tested 444 TWINSCAN-predicted rat genes that showed significant homology to known human genes implicated in disease but that were partially...... in the single-intron experiment. Spliced sequences were amplified in 46 cases (34%). We conclude that this procedure for elucidating gene structures with native cDNA sequences is cost-effective and will become even more so as it is further optimized.......The publication of a draft sequence of a third mammalian genome--that of the rat--suggests a need to rethink genome annotation. New mammalian sequences will not receive the kind of labor-intensive annotation efforts that are currently being devoted to human. In this paper, we demonstrate...

  12. Complete genome sequence of producer of the glycopeptide antibiotic Aculeximycin Kutzneria albida DSM 43870T, a representative of minor genus of Pseudonocardiaceae.

    Science.gov (United States)

    Rebets, Yuriy; Tokovenko, Bogdan; Lushchyk, Igor; Rückert, Christian; Zaburannyi, Nestor; Bechthold, Andreas; Kalinowski, Jörn; Luzhetskyy, Andriy

    2014-10-10

    Kutzneria is a representative of a rarely observed genus of the family Pseudonocardiaceae. Kutzneria species were initially placed in the family Streptosporangiaceae and later reconsidered to be an independent genus of the Pseudonocardiaceae. Kutzneria albida is one of the eight known members of the genus. This strain is a unique producer of the glycosylated polyol macrolide aculeximycin, which is active against both bacteria and fungi. Kutzneria albida genome sequencing and analysis allow a deeper understanding of the evolution of this genus of Pseudonocardiaceae, provide new insight into the phylogeny of the genus, and help decipher the hidden secondary metabolic potential of these rare actinobacteria. To explore the biosynthetic potential of Kutzneria albida to its full extent, the complete genome was sequenced. With a size of 9,874,926 bp, coding for 8,822 genes, it stands alongside other Pseudonocardiaceae with large circular genomes. Genome analysis revealed 46 gene clusters potentially encoding secondary metabolite biosynthesis pathways. Two large genomic islands were identified, containing regions most enriched with secondary metabolism gene clusters. Large parts of this secondary metabolism "clustome" are dedicated to siderophore production. Kutzneria albida is the first species of the genus Kutzneria with a completely sequenced genome. Genome sequencing allowed identification of the gene cluster responsible for the biosynthesis of aculeximycin, one of the largest known oligosaccharide-macrolide antibiotics. Moreover, the genome revealed 45 additional putative secondary metabolite gene clusters, suggesting a huge biosynthetic potential, which makes Kutzneria albida a very rich source of natural products. Comparison of the Kutzneria albida genome to genomes of other actinobacteria clearly shows its close relationship with Pseudonocardiaceae, in line with the taxonomic position of the genus.

  13. Sequencing and characterization of the complete mitochondrial genome of Japanese Swellshark (Cephalloscyllium umbratile).

    Science.gov (United States)

    Zhu, Ke-Cheng; Liang, Yin-Yin; Wu, Na; Guo, Hua-Yang; Zhang, Nan; Jiang, Shi-Gui; Zhang, Dian-Chang

    2017-11-10

    To better understand the genome features of Cephalloscyllium umbratile (Carcharhiniformes), an endangered species, the complete mitochondrial DNA (mtDNA) was sequenced and annotated for the first time. The full-length mtDNA of C. umbratile was 16,697 bp and contained ribosomal RNA (rRNA) genes, 13 protein-coding genes (PCGs), 23 transfer RNA (tRNA) genes, and a major non-coding control region. Each PCG was initiated by a standard ATN codon, except for COX1, which was initiated by a GTG codon. Seven of the 13 PCGs had a typical TAA termination codon, while the others terminated with a single T or TA. Moreover, the relative synonymous codon usage of the 13 PCGs was consistent with that of other published Carcharhiniformes. All tRNA genes had typical clover-leaf secondary structures, except for tRNA-Ser (GCT), which lacked the dihydrouridine 'DHU' arm. Furthermore, the analysis of the average Ka/Ks in the 13 PCGs of three Carcharhiniformes species indicated strong purifying selection within this group. In addition, phylogenetic analysis revealed that C. umbratile was closely related to Glyphis glyphis and Glyphis garricki. Our data supply a useful resource for further studies on the genetic diversity and population structure of C. umbratile.

  14. De novo 454 sequencing of barcoded BAC pools for comprehensive gene survey and genome analysis in the complex genome of barley

    Directory of Open Access Journals (Sweden)

    Scholz Uwe

    2009-11-01

    Background: De novo sequencing of a large, complex plant genome such as that of barley (Hordeum vulgare L.) is a major challenge both in terms of experimental feasibility and costs. The emergence and breathtaking progress of next generation sequencing technologies has put this goal into focus, and a clone based strategy combined with the 454/Roche technology is conceivable. Results: To test the feasibility, we sequenced 91 barcoded, pooled, gene containing barley BACs using the GS FLX platform and assembled the sequences under iterative change of parameters. The BAC assemblies were characterized by an N50 of ~50 kb (N80 ~31 kb, N90 ~21 kb) and a Q40 of 94%. For ~80% of the clones, the best assemblies consisted of fewer than 10 contigs at 24-fold mean sequence coverage. Moreover, we show that gene containing regions seem to assemble completely and uninterrupted, thus making the approach suitable for detecting complete and positionally anchored genes. By comparing the assemblies of four clones to their complete reference sequences generated by the Sanger method, we evaluated the distribution, quality and representativeness of the 454 sequences as well as the consistency and reliability of the assemblies. Conclusion: The described multiplex 454 sequencing of barcoded BACs leads to consensus sequences that are highly representative of the clones. Assemblies are correct for the majority of contigs. Though the resolution of complex repetitive structures requires additional experimental efforts, our approach paves the way for a clone based strategy of sequencing the barley genome.
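
    The N50, N80 and N90 values quoted in this abstract are standard assembly contiguity statistics. As a minimal sketch (not the authors' pipeline), the Python function below computes an Nxx value from a list of contig lengths. The function name `nxx` and the example contig lengths are hypothetical and serve only as an illustration.

```python
def nxx(contig_lengths, percent=50):
    """Return the Nxx length: the length L of the contig at which the
    cumulative sum of contigs, taken from longest to shortest, first
    reaches `percent` percent of the total assembled bases."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Integer arithmetic avoids floating-point rounding at the threshold.
        if running * 100 >= total * percent:
            return length
    return lengths[-1]


# Hypothetical contig lengths in kb for one BAC assembly (illustration only).
example_contigs_kb = [80, 70, 50, 40, 30, 20, 10]
n50 = nxx(example_contigs_kb, 50)
n80 = nxx(example_contigs_kb, 80)
n90 = nxx(example_contigs_kb, 90)
```

    Because N80 and N90 require more of the assembly to be covered, they can never exceed N50, which matches the ordering of the values reported above.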

  15. Complete nuclear ribosomal DNA sequence amplification and molecular analyses of Bangia (Bangiales, Rhodophyta) from China

    Science.gov (United States)

    Xu, Jiajie; Jiang, Bo; Chai, Sanming; He, Yuan; Zhu, Jianyi; Shen, Zonggen; Shen, Songdong

    2016-09-01

    Filamentous Bangia, which are distributed extensively throughout the world, have simple and similar morphological characteristics. Scientists can classify these organisms using molecular markers in combination with morphology. We successfully sequenced the complete nuclear ribosomal DNA, approximately 13 kb in length, from a marine Bangia population. We further analyzed the small subunit ribosomal DNA gene (nrSSU) and the internal transcribed spacer (ITS) sequence regions along with nine other marine and two freshwater Bangia samples from China. Pairwise distances of the nrSSU and 5.8S ribosomal DNA gene sequences show the marine samples grouping together with low divergences (0-0.003; 0-0.006, respectively) from each other, but high divergences (0.123-0.126; 0.198, respectively) from freshwater samples. An exception is the marine sample collected from Weihai, which shows high divergence from both other marine samples (0.063-0.065; 0.129, respectively) and the freshwater samples (0.097; 0.120, respectively). A maximum likelihood phylogenetic tree based on a combined SSU-ITS dataset shows the samples divided into three clades, with the two marine sample clades containing Bangia spp. from North America, Europe, Asia, and Australia; and one freshwater clade, containing Bangia atropurpurea from North America and China.
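
    The abstract does not state which distance model produced the pairwise divergences. As an assumption for illustration only, the sketch below uses the simplest option, the uncorrected p-distance (the proportion of differing sites between two aligned sequences, skipping gapped columns). The function name and the toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected p-distance between two aligned sequences.

    Columns where either sequence has a gap are skipped, so the result is
    the number of mismatches divided by the number of compared sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned fragments (not real Bangia data): one mismatch in eight sites.
toy_distance = p_distance("ACGTACGT", "ACGTACGA")
```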

  16. Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

    Science.gov (United States)

    Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

    2014-07-04

    Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between the mitochondrial genomes of pepper and tobacco, both members of the Solanaceae, revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In a comparison between pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, a large number of small sequence segments that were absent from, or found at different locations in, the Jeju mitochondrial genome were combined together. The incorporation of repeats and the overlapping of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in the evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes. Although large portion of sequence context was

  17. A Probabilistic Genome-Wide Gene Reading Frame Sequence Model

    DEFF Research Database (Denmark)

    Have, Christian Theil; Mørk, Søren

    We introduce a new type of probabilistic sequence model that models the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome level, effectively producing a sequential genome annotation...... as output. The model can be used to obtain the most probable genome annotation based on a combination of i: a gene finder score of each gene candidate and ii: the sequence of the reading frames of gene candidates through a genome. The model, as well as a higher order variant, is developed and tested...... and are evaluated by the effect on prediction performance. Since bacterial gene finding is to a large extent a solved problem, it forms an ideal proving ground for evaluating the explicit modeling of larger scale gene sequence composition of genomes. We conclude that the sequential composition of gene reading frames...
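
    The idea of combining a per-candidate gene finder score with the sequence of reading frames can be illustrated with a Viterbi-style dynamic program. The sketch below is not the authors' model: it assumes a first-order chain over frames, one chosen candidate per genomic slot, and made-up probabilities. All names (`viterbi_frames`, `slots`, `log_trans`) are hypothetical.

```python
import math


def viterbi_frames(slots, log_trans):
    """Pick one candidate per slot so that the sum of log gene-finder scores
    and log frame-to-frame transition probabilities is maximal.

    slots     -- list of slots; each slot is a list of (frame, log_score)
    log_trans -- dict mapping (previous_frame, frame) to a log probability
    Returns the most probable list of frames, one per slot."""
    # best[frame] = (total log score, frame path) for paths ending in frame
    best = {frame: (score, [frame]) for frame, score in slots[0]}
    for slot in slots[1:]:
        nxt = {}
        for frame, score in slot:
            total, path = max(
                ((prev_total + log_trans[(prev, frame)] + score, prev_path)
                 for prev, (prev_total, prev_path) in best.items()),
                key=lambda item: item[0],
            )
            nxt[frame] = (total, path + [frame])
        best = nxt
    return max(best.values(), key=lambda item: item[0])[1]


# Assumed toy numbers: frames +1 and -1; neighbouring genes tend to share a frame.
log_trans = {
    (1, 1): math.log(0.7), (1, -1): math.log(0.3),
    (-1, -1): math.log(0.7), (-1, 1): math.log(0.3),
}
slots = [
    [(1, math.log(0.60)), (-1, math.log(0.40))],
    [(1, math.log(0.45)), (-1, math.log(0.55))],  # gene finder alone prefers -1
    [(1, math.log(0.60)), (-1, math.log(0.40))],
]
best_path = viterbi_frames(slots, log_trans)
```

    In this toy input the gene finder score alone would pick frame -1 for the middle slot, but the frame context outweighs that small difference, so the best path keeps frame +1 throughout.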

  18. Avian endogenous provirus (ev-3) env gene sequencing: implication for pathogenic retrovirus origination.

    Science.gov (United States)

    Tikhonenko, A T; Lomovskaya, O L

    1990-02-01

    The avian endogenous env gene product blocks the surface receptor and, as a result, cells become immune to related exogenous retroviruses. On the other hand, the same sequence can be included in the pathogenic retrovirus genome, as shown by oligonucleotide mapping. However, since the complete env gene sequence was not known, the comparison of genomic nucleotide sequences was not possible. Therefore an avian endogenous provirus with an intact env gene was cloned from a chicken gene bank and the regions coding for the C terminus of the gp85 and gp37 proteins were sequenced. Comparison of this sequence with those of other retroviruses proved that one of the pathogenic viruses associated with osteopetrosis is a cross between avian endogenous virus and Rous sarcoma virus. Retroviruses and, especially, endogenous retroviruses are traditionally among the most developed models of viral carcinogenesis. Many endogenous retroviruses are implicated in neoplastic transformation of the cell. For instance, endogenous mouse mammary tumor virus of some inbred lines appears to be the only causative agent in these mammary cancers. Other, even nonpathogenic, murine endogenous retroviruses are involved in the origination of MCF-type recombinant acute leukosis viruses. Some endogenous retroviruses are implicated in the transduction or activation of cellular protooncogenes. Our interest in endogenous viruses is based on their ability to make cells resistant to exogenous retroviruses. Expression of their major envelope glycoprotein leads to cellular surface receptor blockage and imparts immunity to infection by the related leukemia retroviruses. This problem has been studied in detail for chicken endogenous virus RAV-O (7-9). (ABSTRACT TRUNCATED AT 250 WORDS)

  19. Complete nucleotide sequence and organization of the mitogenome of the silk moth Caligula boisduvalii (Lepidoptera: Saturniidae) and comparison with other lepidopteran insects.

    Science.gov (United States)

    Hong, Mee Yeon; Lee, Eun Mee; Jo, Yong Hun; Park, Hae Chul; Kim, Seong Ryul; Hwang, Jae Sam; Jin, Byung Rae; Kang, Pil Don; Kim, Ki-Gyoung; Han, Yeon Soo; Kim, Iksoo

    2008-04-30

    The 15,360-bp long complete mitogenome of Caligula boisduvalii possesses a gene arrangement and content identical to other completely sequenced lepidopteran mitogenomes, but different from the common arrangement found in most insect orders, as the result of the movement of tRNA(Met) to a position 5'-upstream of tRNA(Ile). The 330-bp A+T-rich region is apparently capable of forming a stem-and-loop structure, which harbors the conserved flanking sequences at both ends. Unlike what has been seen in other sequenced lepidopteran insects, the initiation codon for C. boisduvalii COI appears to be TTG, which is a rare, but apparently possible, initiation codon. The ATP8, ATP6, ND4L, and ND6 genes, which neighbor another PCG at their 3' end, all harbored potential sequences for the formation of a hairpin structure. This is suggestive of the importance of such structures for the precise cleavage of the mRNA of mature PCGs. Phylogenetic analyses of available sequenced species of Bombycoidea, Pyraloidea, and Tortricoidea supported the morphology-based current hypothesis that Bombycoidea and Pyraloidea are monophyletic (Obtectomera). As previously suggested, Bombycidae (Bombyx mori and B. mandarina) and Saturniidae (Antheraea pernyi and C. boisduvalii) formed a reciprocal monophyletic group.

  20. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Science.gov (United States)

    Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

    2015-01-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
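
    The abstract mentions maximum likelihood estimation of variant copy ratios but gives no formulas. The sketch below is therefore an assumption rather than the study's method: it treats the variant read count at one position as binomial, restricts the true ratio to j/n for j variant copies out of n gene copies, and allows a small symmetric sequencing error rate. The function name, the error rate, and the read counts are hypothetical.

```python
import math


def ml_variant_copies(variant_reads, total_reads, n_copies, error_rate=0.01):
    """Return the number of gene copies j (0..n_copies) carrying the variant
    that maximizes a binomial likelihood of the observed read counts.

    The expected variant read fraction for j copies is j/n_copies, blurred
    by a symmetric per-base error rate so that it is never exactly 0 or 1."""
    best_j = 0
    best_ll = -math.inf
    for j in range(n_copies + 1):
        p = j / n_copies
        p_obs = p * (1 - error_rate) + (1 - p) * error_rate
        ll = (variant_reads * math.log(p_obs)
              + (total_reads - variant_reads) * math.log(1 - p_obs))
        if ll > best_ll:
            best_j, best_ll = j, ll
    return best_j


# Hypothetical counts: 290 variant reads out of 1000 at a position in a
# genome assumed to carry seven 16S rRNA gene copies.
copies_with_variant = ml_variant_copies(290, 1000, n_copies=7)
```

    Restricting the estimate to multiples of 1/n reflects the fact that a variant must sit in a whole number of gene copies; low read depth makes neighbouring values of j hard to tell apart, which is consistent with the abstract's remark about sequencing depth.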

  1. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Directory of Open Access Journals (Sweden)

    Nathan D. Olson

    2015-03-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.

  2. Complete genome sequence of 'Thermobaculum terrenum' type strain (YNP1).

    Science.gov (United States)

    Kiss, Hajnalka; Cleland, David; Lapidus, Alla; Lucas, Susan; Del Rio, Tijana Glavina; Nolan, Matt; Tice, Hope; Han, Cliff; Goodwin, Lynne; Pitluck, Sam; Liolios, Konstantinos; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Lu, Megan; Brettin, Thomas; Detter, John C; Göker, Markus; Tindall, Brian J; Beck, Brian; McDermott, Timothy R; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Cheng, Jan-Fang

    2010-10-27

    'Thermobaculum terrenum' Botero et al. 2004 is the sole species within the proposed genus 'Thermobaculum'. Strain YNP1(T) is the only cultivated member of an acid tolerant, extremely thermophilic species belonging to a phylogenetically isolated environmental clone group within the phylum Chloroflexi. At present, the name 'Thermobaculum terrenum' is not yet validly published as it contravenes Rule 30 (3a) of the Bacteriological Code. The bacterium was isolated from a slightly acidic extreme thermal soil in Yellowstone National Park, Wyoming (USA). Depending on its final taxonomic allocation, this is likely to be the third completed genome sequence of a member of the class Thermomicrobia and the seventh type strain genome from the phylum Chloroflexi. The 3,101,581 bp long genome with its 2,872 protein-coding and 58 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  3. Targeted Gene Sequencing and Whole-Exome Sequencing in Autopsied Fetuses with Prenatally Diagnosed Kidney Anomalies

    DEFF Research Database (Denmark)

    Rasmussen, M; Sunde, L; Nielsen, M L

    2018-01-01

    Identification of fetal kidney anomalies invites questions about underlying causes and recurrence risk in future pregnancies. We therefore investigated the diagnostic yield of next-generation sequencing in fetuses with bilateral kidney anomalies and the correlation between disrupted genes and fetal...... phenotypes. Fetuses with bilateral kidney anomalies were screened using an in-house-designed kidney-gene panel. In families where candidate variants were not identified, whole-exome sequencing was performed. Genes uncovered by this analysis were added to our kidney-panel. We identified likely deleterious...... of nephronophthisis. Exome sequencing identified ROBO1 variants in one family and a GREB1L variant in another family. GREB1L and ROBO1 were added to our kidney-gene panel and additional variants were identified. Next-generation sequencing substantially contributes to identifying causes of fetal kidney anomalies...

  4. Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes

    Science.gov (United States)

    Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu

    2009-01-01

    Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae. PMID:17534593

  5. Isolation, identification, and complete genome sequence of a bovine adenovirus type 3 from cattle in China

    Directory of Open Access Journals (Sweden)

    Zhu Yuan-Mao

    2011-12-01

    Background: Bovine adenovirus type 3 (BAV-3) belongs to the Mastadenovirus genus of the family Adenoviridae and is involved in respiratory and enteric infections of calves. The isolation of BAV-3 had not been reported in China prior to this study. In 2009, many cases in cattle showed clinical signs similar to BAV-3 infection, and a virus strain showing cytopathic effect in Madin-Darby bovine kidney cells was isolated from a bovine nasal swab collected from feedlot cattle in Heilongjiang Province, China. The isolate was confirmed as a bovine adenovirus type 3 by PCR and immunofluorescence assay, and named HLJ0955. So far only the complete genome sequence of the BAV-3 prototype strain WBR-1 has been reported. In order to further characterize the Chinese isolate HLJ0955, the complete genome sequence of HLJ0955 was determined. Results: The genome of the Chinese isolate HLJ0955 is 34,132 nucleotides in length with a G+C content of 53.6%. The coding sequences for gene regions of the HLJ0955 isolate were similar to those of the BAV-3 prototype strain WBR-1, with 80.0-98.6% nucleotide and 87.5-98.8% amino acid identities. The genome of HLJ0955 strain contains 16 regions and four deletions in inverted terminal repeats, E1B region and E4 region, respectively. Phylogenetic analyses with other adenoviruses based on the complete genome and on the DNA binding protein gene were performed, and the results showed that the HLJ0955 isolate belonged to BAV-3 and clustered within the Mastadenovirus genus of the family Adenoviridae. Conclusions: This is the first study to report the isolation and molecular characterization of BAV-3 from cattle in China. The phylogenetic analysis performed in this study supported the use of the DNA binding protein gene of adenovirus as an appropriate subgenomic target for the classification of different genera of the family Adenoviridae at the molecular level. 
Meanwhile, a large-scale pathogen and serological epidemiological
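
    The G+C content reported for HLJ0955 (53.6%) is a simple base-composition statistic. As a minimal sketch, the function below computes it for any DNA string, counting only unambiguous bases. The function name and the toy sequence are hypothetical, and no real BAV-3 sequence is used here.

```python
def gc_percent(seq):
    """Return the G+C content of a DNA sequence as a percentage.

    Only A, C, G and T are counted; ambiguous symbols such as N are
    ignored in both the numerator and the denominator."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Toy sequence for illustration: four G/C bases out of eight.
toy_gc = gc_percent("GGCCAATT")
```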

  6. Molecular cloning and sequence analysis of a phenylalanine ammonia-lyase gene from dendrobium.

    Directory of Open Access Journals (Sweden)

    Qing Jin

    In this study, a phenylalanine ammonia-lyase (PAL) gene was cloned from Dendrobium candidum using homology cloning and RACE. The full-length PAL cDNA of D. candidum (designated Dc-PAL1, GenBank No. JQ765748) has 2,458 bp and contains a complete open reading frame (ORF) of 2,142 bp, which encodes 713 amino acid residues. The amino acid sequence of DcPAL1 has more than 80% sequence identity with the PAL genes of other plants, as indicated by multiple alignments. The dominant sites and catalytic active sites, which are similar to those in the PAL proteins of Arabidopsis thaliana and Nicotiana tabacum, are also found in DcPAL1. Phylogenetic tree analysis revealed that DcPAL is more closely related to PALs from Orchidaceae plants than to those of other plants. The differential expression patterns of PAL in protocorm-like body, leaf, stem, and root suggest that the PAL gene performs multiple physiological functions in Dendrobium candidum.
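
    The relation between the reported ORF length and protein length can be checked with simple arithmetic: 2,142 bp is 714 codons, and if the last codon is a stop codon the product has 713 residues. The short sketch below encodes that check; the function name is hypothetical, and the assumption that the ORF length includes the stop codon is ours, inferred from the numbers.

```python
def orf_to_residues(orf_bp, includes_stop=True):
    """Convert an ORF length in base pairs to a number of amino acid residues.

    An ORF must be a whole number of codons; if the quoted length includes
    the stop codon, that codon contributes no residue."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# Values from the abstract: a 2,142 bp ORF encoding 713 residues.
dcpal1_residues = orf_to_residues(2142)
```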

  7. Complete genome sequence of the halophilic and highly halotolerant Chromohalobacter salexigens type strain (1H11T)

    Energy Technology Data Exchange (ETDEWEB)

    Copeland, A [U.S. Department of Energy, Joint Genome Institute; O' Connor, Kathleen [Purdue University; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Berry, Kerrie W. [United States Department of Energy Joint Genome Institute; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Hammon, Nancy [U.S. Department of Energy, Joint Genome Institute; Dalin, Eileen [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Schmutz, Jeremy [Stanford University; Brettin, Thomas S [ORNL; Larimer, Frank W [ORNL; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Vargas, Carmen [University of Seville; Nieto, Joaquin J. [University of Seville; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Csonka, Laszlo N. [Purdue University; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute

    2011-01-01

    Chromohalobacter salexigens is one of nine currently known species of the genus Chromohalobacter in the family Halomonadaceae. It is the most halotolerant of the so-called moderately halophilic bacteria currently known and, due to its strong euryhaline phenotype, it is an established model organism for prokaryotic osmoadaptation. C. salexigens strain 1H11T and Halomonas elongata are the first and the second members of the family Halomonadaceae with a completely sequenced genome. The 3,696,649 bp long chromosome with a total of 3,319 protein-coding and 93 RNA genes was sequenced as part of the DOE Joint Genome Institute Program DOEM 2004.

  8. Sequencing and comparing whole mitochondrial genomes of animals

    Energy Technology Data Exchange (ETDEWEB)

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  9. The Complete Sequence of the Mitochondrial Genome of the Chambered Nautilus (Mollusca: Cephalopoda)

    Energy Technology Data Exchange (ETDEWEB)

    Boore, Jeffrey L.

    2005-12-01

    Background: Mitochondria contain small genomes that are physically separate from those of nuclei. Their comparison serves as a model system for understanding the processes of genome evolution. Although complete mitochondrial genome sequences have been reported for more than 600 animals, the taxonomic sampling is highly biased toward vertebrates and arthropods, leaving much of the diversity yet uncharacterized. Results: The mitochondrial genome of a cephalopod mollusk, the Chambered Nautilus, is 16,258 nts in length and 59.5 percent A+T, both values that are typical of animal mitochondrial genomes. It contains the 37 genes that are typical for animal mtDNAs, with 15 on one DNA strand and 22 on the other. The arrangement of these genes can be derived from that of the distantly related Katharina tunicata (Mollusca: Polyplacophora) by a switch in position of two large blocks of genes and transpositions of four tRNA genes. There is strong skew in the distribution of nucleotides between the two strands. There are an unusual number of non-coding regions and their function, if any, is not known; however, several of these demark abrupt shifts in nucleotide skew, suggesting that they may play roles in transcription and/or replication. One of the non-coding regions contains multiple repeats of a tRNA-like sequence. Some of the tRNA genes appear to overlap on the same strand, but this could be resolved if the polycistron were cleaved at the beginning of the downstream gene, followed by polyadenylation of the product of the upstream gene to form a fully paired structure. Conclusions: Nautilus sp. mtDNA contains an expected gene content that has experienced few rearrangements since the evolutionary split between cephalopods and polyplacophorans. It contains an unusual number of non-coding regions, especially considering that these otherwise often are generated by the same processes that produce gene rearrangements. This appears to be yet another case where polyadenylation of mitochondrial tRNAs restores
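
    Nucleotide skew, mentioned in this abstract, is commonly summarized as (X - Y) / (X + Y) for a pair of bases on one strand, for example GC skew = (G - C) / (G + C). The sketch below computes such a skew for a whole sequence and in sliding windows, which is one way abrupt shifts could be located. The window size, function names, and toy sequence are assumptions for illustration, not details taken from the paper.

```python
def skew(seq, x, y):
    """Return (count(x) - count(y)) / (count(x) + count(y)) for one strand.

    Returns 0.0 when neither base occurs, so empty windows do not fail."""
    seq = seq.upper()
    nx = seq.count(x)
    ny = seq.count(y)
    if nx + ny == 0:
        return 0.0
    return (nx - ny) / (nx + ny)


def windowed_skew(seq, x, y, window, step):
    """Skew in consecutive windows; a sign change between neighbouring
    windows marks a candidate shift point."""
    return [skew(seq[i:i + window], x, y)
            for i in range(0, len(seq) - window + 1, step)]


# Toy sequence: G-rich first half, C-rich second half.
toy_profile = windowed_skew("GGGGCCCC", "G", "C", window=4, step=4)
```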

  10. Comparison of methods for genomic localization of gene trap sequences

    Directory of Open Access Journals (Sweden)

    Ferrin Thomas E

    2006-09-01

    Background: Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. Results: In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion: The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.

  11. The complete chloroplast genome sequence of Taxus chinensis var. mairei (Taxaceae): loss of an inverted repeat region and comparative analysis with related species.

    Science.gov (United States)

    Zhang, Yanzhen; Ma, Ji; Yang, Bingxian; Li, Ruyi; Zhu, Wei; Sun, Lianli; Tian, Jingkui; Zhang, Lin

    2014-05-01

    Taxus chinensis var. mairei (Taxaceae) is a yew variety native to China. This plant is one of the sources of paclitaxel, which has been a promising antineoplastic chemotherapy drug during the last decade. We have sequenced the complete nucleotide sequence of the chloroplast (cp) genome of T. chinensis var. mairei. The T. chinensis var. mairei cp genome is 129,513 bp in length, with 113 single copy genes and two duplicated genes (trnI-CAU, trnQ-UUG). Among the 113 single copy genes, 9 are intron-containing. Compared to other land plant cp genomes, the T. chinensis var. mairei cp genome has lost one of the large inverted repeats (IRs) found in angiosperms, ferns, liverworts, and gymnosperms such as Cycas revoluta and Ginkgo biloba L. Compared to related species, the gene order of T. chinensis var. mairei has a large inversion of ~110 kb including 91 genes (from rps18 to accD) with gene contents unarranged. Repeat analysis identified 48 direct and 2 inverted repeats 30 bp long or longer with a sequence identity greater than 90%. Repeated short segments were found in genes rps18, rps19 and clpP. Analysis also revealed 22 simple sequence repeat (SSR) loci, almost all of which are composed of A or T. Copyright © 2014 Elsevier B.V. All rights reserved.

  12. DNA sequence responsible for the amplification of adjacent genes.

    Science.gov (United States)

    Pasion, S G; Hartigan, J A; Kumar, V; Biswas, D K

    1987-10-01

    A 10.3-kb DNA fragment in the 5'-flanking region of the rat prolactin (rPRL) gene was isolated from F1BGH(1)2C1, a strain of rat pituitary tumor cells (GH cells) that produces prolactin in response to 5-bromodeoxyuridine (BrdU). Following transfection and integration into genomic DNA of recipient mouse L cells, this DNA induced amplification of the adjacent thymidine kinase gene from Herpes simplex virus type 1 (HSV1TK). We confirmed the ability of this "Amplicon" sequence to induce amplification of other linked or unlinked genes in DNA-mediated gene transfer studies. When transferred into the mouse L cells with the 10.3-5'rPRL gene sequence of BrdU-responsive cells, both the human growth hormone and the HSV1TK genes are amplified in response to 5-bromodeoxyuridine. This observation is substantiated by BrdU-induced amplification of the cotransferred bacterial Neo gene. Cotransfection studies reveal that the BrdU-induced amplification capability is associated with a 4-kb DNA sequence in the 5'-flanking region of the rPRL gene of BrdU-responsive cells. These results demonstrate that genes of heterologous origin, linked or unlinked, and selected or unselected, can be coamplified when located within the amplification boundary of the Amplicon sequence.

  13. Complete sequences of the highly rearranged molluscan mitochondrial genomes of the scaphopod Graptacme eborea and the bivalve Mytilus edulis

    Energy Technology Data Exchange (ETDEWEB)

    Boore, Jeffrey L.; Medina, Monica; Rosenberg, Lewis A.

    2004-01-31

    We have determined the complete sequence of the mitochondrial genome of the scaphopod mollusk Graptacme eborea (Conrad, 1846) (14,492 nts) and completed the sequence of the mitochondrial genome of the bivalve mollusk Mytilus edulis Linnaeus, 1758 (16,740 nts). (The name Graptacme eborea is a revision of the species formerly known as Dentalium eboreum.) G. eborea mtDNA contains the 37 genes that are typically found and has the genes divided about evenly between the two strands, but M. edulis contains an extra trnM and is missing atp8, and has all genes on the same strand. Each has a highly rearranged gene order relative to each other and to all other studied mtDNAs. G. eborea mtDNA has almost no strand skew, but the coding strand of M. edulis mtDNA is very rich in G and T. This is reflected in differential codon usage patterns and even in amino acid compositions. G. eborea mtDNA has fewer non-coding nucleotides than any other mtDNA studied to date, with the largest non-coding region being only 24 nt long. Phylogenetic analysis using 2,420 aligned amino acid positions of concatenated proteins weakly supports an association of the scaphopod with gastropods to the exclusion of Bivalvia, Cephalopoda, and Polyplacophora, but is generally unable to convincingly resolve the relationships among major groups of the Lophotrochozoa, in contrast to the good resolution seen for several other major metazoan groups.
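
    The strand skew and base-composition figures quoted in abstracts like the one above are simple ratios of base counts. The following is a minimal Python sketch, assuming the common definitions GC skew = (G - C)/(G + C) and AT skew = (A - T)/(A + T) computed on one strand; the function name and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the A+T fraction, GC skew and AT skew for one DNA strand."""
    counts = Counter(seq.upper())
    a, c, g, t = (counts[b] for b in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {
        "at_fraction": (a + t) / total,
        # Positive GC skew: more G than C on this strand.
        "gc_skew": (g - c) / (g + c) if g + c else 0.0,
        # Negative AT skew: more T than A on this strand.
        "at_skew": (a - t) / (a + t) if a + t else 0.0,
    }


# Toy strand rich in G and T (hypothetical sequence, not real mtDNA).
toy = "GTTGGTTTGTAGTTGCGTTT"
stats = base_composition(toy)
print(stats)
```

    A strand that is rich in G and T, as described for the M. edulis coding strand, would show a positive GC skew and a negative AT skew under these definitions.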

  14. Complete genome sequence of cyanobacterium Nostoc sp. NIES-3756, a potentially useful strain for phytochrome-based bioengineering.

    Science.gov (United States)

    Hirose, Yuu; Fujisawa, Takatomo; Ohtsubo, Yoshiyuki; Katayama, Mitsunori; Misawa, Naomi; Wakazuki, Sachiko; Shimura, Yohei; Nakamura, Yasukazu; Kawachi, Masanobu; Yoshikawa, Hirofumi; Eki, Toshihiko; Kanesaki, Yu

    2016-01-20

    To explore the diverse photoreceptors of cyanobacteria, we isolated Nostoc sp. strain NIES-3756 from soil at Mimomi-Park, Chiba, Japan, and determined its complete genome sequence. The genome consists of one chromosome and two plasmids (total 6,987,571 bp containing no gaps). The NIES-3756 strain carries 7 phytochrome and 12 cyanobacteriochrome genes, which will facilitate studies of phytochrome-based bioengineering. Copyright © 2015. Published by Elsevier B.V.

  15. [Sequencing technology in gene diagnosis and its application].

    Science.gov (United States)

    Yibin, Guo

    2014-11-01

    The study of gene mutation is currently one of the hot topics in the life sciences, and the related detection methods and diagnostic technologies have developed rapidly. Sequencing technology plays an indispensable role in the definite diagnosis and classification of genetic diseases. In this review, we summarize the research progress in sequencing technology, evaluate the advantages and disadvantages of first- to third-generation sequencing technologies, and describe their application in gene diagnosis. We also offer forecasts and prospects for future development trends.

  16. The complete chloroplast genome sequence of Aster spathulifolius (Asteraceae); genomic features and relationship with Asteraceae.

    Science.gov (United States)

    Choi, Kyoung Su; Park, SeonJoo

    2015-11-10

    Aster spathulifolius, a member of the Asteraceae family, is distributed along the coast of Japan and Korea. This plant is used for medicinal and ornamental purposes. The complete chloroplast (cp) genome of A. spathulifolius consists of 149,473 bp that include a pair of inverted repeats of 24,751 bp separated by a large single copy region of 81,998 bp and a small single copy region of 17,973 bp. The chloroplast genome contains 78 coding genes, four rRNA genes and 29 tRNA genes. When compared to other cpDNA sequences of Asteraceae, A. spathulifolius showed the closest relationship with Jacobaea vulgaris, and its atpB gene was found to be a pseudogene, unlike J. vulgaris. Furthermore, evaluation of the gene compositions of J. vulgaris, Helianthus annuus, Guizotia abyssinica and A. spathulifolius revealed that a 13.6-kb region from ndhF to rps15 is inverted, unlike in Lactuca of Asteraceae. Comparison of the synonymous (Ks) and nonsynonymous (Ka) substitution rates with J. vulgaris revealed that synonymous rates of genes related to a small subunit of the ribosome showed the highest value (0.1558), while nonsynonymous rates of genes related to ATP synthase were highest (0.0118). These findings revealed that substitution has occurred at similar rates in most genes, and the substitution rates suggested that most genes are under purifying selection. Copyright © 2015 Elsevier B.V. All rights reserved.
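
    The region sizes reported above can be cross-checked with simple arithmetic, because a quadripartite chloroplast genome is one large single copy region, one small single copy region, and two copies of the inverted repeat. The short Python sketch below uses only the figures from the abstract; the function name is hypothetical.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total genome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as stated in the A. spathulifolius abstract.
aster_total = quadripartite_length(lsc=81_998, ssc=17_973, ir=24_751)
print(aster_total)  # 149473, matching the reported genome size
```

    The same check applies to any abstract in this list that reports LSC, SSC and IR sizes alongside a total length.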

  17. Complete mitochondrial genome sequence of Melipona scutellaris, a Brazilian stingless bee.

    Science.gov (United States)

    Pereira, Ulisses de Padua; Bonetti, Ana Maria; Goulart, Luiz Ricardo; Santos, Anderson Rodrigues Dos; Oliveira, Guilherme Correa de; Cuadros-Orellana, Sara; Ueira-Vieira, Carlos

    2016-09-01

    Melipona scutellaris is a Brazilian stingless bee species and a highly important native pollinator besides its use in rational rearing for honey production. In this study, we present the whole mitochondrial DNA sequence of M. scutellaris from a haploid male. The mitogenome has a size of 14,862 bp and harbors 13 protein-coding genes (PCGs), 2 rRNA genes and 21 tRNA genes.

  18. Full-Length Sequence of Mouse Acupuncture-Induced 1-L (Aig1l) Gene Including Its Transcriptional Start Site

    Directory of Open Access Journals (Sweden)

    Mika Ohta

    2011-01-01

    Full Text Available We have been investigating the molecular efficacy of electroacupuncture (EA), which is one type of acupuncture therapy. In our previous molecular biological study of acupuncture, we found an EA-induced gene, named acupuncture-induced 1-L (Aig1l), in mouse skeletal muscle. The aims of this study consisted of identification of the full-length cDNA sequence of Aig1l including the transcriptional start site, determination of the tissue distribution of Aig1l and analysis of the effect of EA on Aig1l gene expression. We determined the complete cDNA sequence including the transcriptional start site via cDNA cloning with the cap site hunting method. We then analyzed the tissue distribution of Aig1l by means of northern blot analysis and real-time quantitative polymerase chain reaction. We used the semiquantitative reverse transcriptase-polymerase chain reaction to examine the effect of EA on Aig1l gene expression. Our results showed that the complete cDNA sequence of Aig1l was 6073 bp long, and the putative protein consisted of 962 amino acids. All seven tissues that we analyzed expressed the Aig1l gene. In skeletal muscle, EA induced expression of the Aig1l gene, with high expression observed after 3 hours of EA. Our findings thus suggest that the Aig1l gene may play a key role in the molecular mechanisms of EA efficacy.

  19. Complete Genome Sequences of emm111 Type Streptococcus pyogenes Strain GUR, with Antitumor Activity, and Its Derivative Strain GURSA1 with an Inactivated emm Gene

    DEFF Research Database (Denmark)

    Suvorova, Maria A; Tsapieva, Anna N; Bak, Emilie Glad

    2017-01-01

    We present here the complete genome sequence of Streptococcus pyogenes type emm111 strain GUR, a throat isolate from a scarlet fever patient, which has been used to treat cancer patients in the former Soviet Union. We also present the complete genome sequence of its derivative strain GURSA1...

  20. The Complete Genome Sequence of the Fish Pathogen Tenacibaculum maritimum Provides Insights into Virulence Mechanisms

    Directory of Open Access Journals (Sweden)

    David Pérez-Pascual

    2017-08-01

    Full Text Available Tenacibaculum maritimum is a devastating bacterial pathogen of wild and farmed marine fish with a broad host range and a worldwide distribution. We report here the complete genome sequence of the T. maritimum type strain NCIMB 2154T. The genome consists of a 3,435,971-base pair circular chromosome with 2,866 predicted protein-coding genes. Genes encoding the biosynthesis of exopolysaccharides, the type IX secretion system, iron uptake systems, adhesins, hemolysins, proteases, and glycoside hydrolases were identified. They are likely involved in the virulence process including immune escape, invasion, colonization, destruction of host tissues, and nutrient scavenging. Among the predicted virulence factors, type IX secretion-mediated and cell-surface exposed proteins were identified, including an atypical sialidase, a sphingomyelinase and a chondroitin AC lyase whose activities were demonstrated in vitro.

  1. The complete amino acid sequence of human erythrocyte diphosphoglycerate mutase.

    OpenAIRE

    Haggarty, N W; Dunbar, B; Fothergill, L A

    1983-01-01

    The complete amino acid sequence of human erythrocyte diphosphoglycerate mutase, comprising 239 residues, was determined. The sequence was deduced from the four cyanogen bromide fragments, and from the peptides derived from these fragments after digestion with a number of proteolytic enzymes. Comparison of this sequence with that of the yeast glycolytic enzyme, phosphoglycerate mutase, shows that these enzymes are 47% identical. Most, but not all, of the residues implicated as being important...

  2. Nucleotide sequence of a human tRNA gene heterocluster

    International Nuclear Information System (INIS)

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-01-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both [3'-32P]-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these λ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  3. Complete Genome Sequences of 44 Arthrobacter Phages.

    Science.gov (United States)

    Klyczek, Karen K; Jacobs-Sera, Deborah; Adair, Tamarah L; Adams, Sandra D; Ball, Sarah L; Benjamin, Robert C; Bonilla, J Alfred; Breitenberger, Caroline A; Daniels, Charles J; Gaffney, Bobby L; Harrison, Melinda; Hughes, Lee E; King, Rodney A; Krukonis, Gregory P; Lopez, A Javier; Monsen-Collar, Kirsten; Pizzorno, Marie C; Rinehart, Claire A; Staples, Amanda K; Stowe, Emily L; Garlena, Rebecca A; Russell, Daniel A; Cresawn, Steven G; Pope, Welkin H; Hatfull, Graham F

    2018-02-01

    We report here the complete genome sequences of 44 phages infecting Arthrobacter sp. strain ATCC 21022. These phages have double-stranded DNA genomes with sizes ranging from 15,680 to 70,707 bp and G+C contents from 45.1% to 68.5%. All three tail types (belonging to the families Siphoviridae, Myoviridae, and Podoviridae) are represented. Copyright © 2018 Klyczek et al.

  4. Complete genome sequence of pronghorn virus, a pestivirus

    Science.gov (United States)

    The complete genome sequence of Pronghorn virus, a member of the Pestivirus genus of the Flaviviridae, was determined. The virus, originally isolated from a pronghorn antelope, had a genome of 12,287 nucleotides with a single open reading frame of 11,694 bases encoding 3898 amino acids....

  5. Gene Discovery through Genomic Sequencing of Brucella abortus

    Science.gov (United States)

    Sánchez, Daniel O.; Zandomeni, Ruben O.; Cravero, Silvio; Verdún, Ramiro E.; Pierrou, Ester; Faccio, Paula; Diaz, Gabriela; Lanzavecchia, Silvia; Agüero, Fernán; Frasch, Alberto C. C.; Andersson, Siv G. E.; Rossetti, Osvaldo L.; Grau, Oscar; Ugalde, Rodolfo A.

    2001-01-01

    Brucella abortus is the etiological agent of brucellosis, a disease that affects bovines and humans. We generated random DNA sequences from the genome of B. abortus strain 2308 in order to characterize molecular targets that might be useful for developing immunological or chemotherapeutic strategies against this pathogen. The partial sequencing of 1,899 clones allowed the identification of 1,199 genomic sequence surveys (GSSs) with high homology (BLAST expect value < 10^-5) to sequences deposited in the GenBank databases. Among them, 925 represent putative novel genes for the Brucella genus. Out of 925 nonredundant GSSs, 470 were classified in 15 categories based on cellular function. Seven hundred GSSs showed no significant database matches and remain available for further studies in order to identify their function. A high number of GSSs with homology to Agrobacterium tumefaciens and Rhizobium meliloti proteins were observed, thus confirming their close phylogenetic relationship. Among them, several GSSs showed high similarity with genes related to nodule nitrogen fixation, synthesis of nod factors, nodulation protein symbiotic plasmid, and nodule bacteroid differentiation. We have also identified several B. abortus homologs of virulence and pathogenesis genes from other pathogens, including a homolog to both the Shda gene from Salmonella enterica serovar Typhimurium and the AidA-1 gene from Escherichia coli. Other GSSs displayed significant homologies to genes encoding components of the type III and type IV secretion machineries, suggesting that Brucella might also have an active type III secretion machinery. PMID:11159979

  6. Inductive matrix completion for predicting gene-disease associations.

    Science.gov (United States)

    Natarajan, Nagarajan; Dhillon, Inderjit S

    2014-06-15

    Most existing methods for predicting causal disease genes rely on specific types of evidence, and are therefore limited in terms of applicability. More often than not, the type of evidence available for diseases varies: for example, we may know linked genes, keywords associated with the disease obtained by mining text, or co-occurrence of disease symptoms in patients. Similarly, the type of evidence available for genes varies: for example, specific microarray probes convey information only for certain sets of genes. In this article, we apply a novel matrix-completion method called Inductive Matrix Completion to the problem of predicting gene-disease associations; it combines multiple types of evidence (features) for diseases and genes to learn latent factors that explain the observed gene-disease associations. We construct features from different biological sources such as microarray expression data and disease-related textual data. A crucial advantage of the method is that it is inductive; it can be applied to diseases not seen at training time, unlike traditional matrix-completion approaches and network-based inference methods that are transductive. Comparison with state-of-the-art methods on diseases from the Online Mendelian Inheritance in Man (OMIM) database shows that the proposed approach is substantially better: it has close to a one-in-four chance of recovering a true association in the top 100 predictions, compared to the recently proposed Catapult method (second best) that has bigdata.ices.utexas.edu/project/gene-disease. © The Author 2014. Published by Oxford University Press.
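
    The inductive idea described above can be illustrated with a small sketch in which a gene with feature vector x and a disease with feature vector y are scored by the bilinear form x^T W H^T y, where W and H are low-rank factors. The Python code below is a simplified illustration, not the authors' implementation: it assumes a plain squared loss over all entries with L2 regularization and ordinary gradient descent, and all names, sizes and data are hypothetical toy values.

```python
import numpy as np


def imc_loss(X, Y, P, W, H):
    """Mean squared error between predicted scores X W H^T Y^T and P."""
    return float(np.mean((X @ W @ H.T @ Y.T - P) ** 2))


def train_imc(X, Y, P, rank=2, lr=0.02, reg=1e-3, epochs=3000, seed=0):
    """Fit low-rank factors W, H by gradient descent (simplified sketch)."""
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((X.shape[1], rank))
    H = 0.1 * rng.standard_normal((Y.shape[1], rank))
    for _ in range(epochs):
        A, B = X @ W, Y @ H            # latent gene and disease factors
        R = (A @ B.T - P) / P.size     # scaled residual
        grad_W = X.T @ (R @ B) + reg * W
        grad_H = Y.T @ (R.T @ A) + reg * H
        W, H = W - lr * grad_W, H - lr * grad_H
    return W, H


# Toy data: 30 genes with 4 features, 20 diseases with 3 features.
rng = np.random.default_rng(1)
X = rng.standard_normal((30, 4))
Y = rng.standard_normal((20, 3))
Z = rng.standard_normal((4, 3))
P = (X @ Z @ Y.T > 1.0).astype(float)   # synthetic binary associations

W, H = train_imc(X, Y, P)
baseline_loss = float(np.mean(P ** 2))  # loss of predicting all zeros
final_loss = imc_loss(X, Y, P, W, H)

# Inductive step: score every gene for a disease unseen during training,
# using only that disease's feature vector.
y_new = rng.standard_normal(3)
scores = X @ W @ H.T @ y_new
top_genes = np.argsort(scores)[::-1][:5]
print(final_loss, baseline_loss, top_genes)
```

    Because predictions depend on features rather than on row or column identities, a new disease needs only a feature vector to be ranked against all genes, which is the property the abstract contrasts with transductive methods.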

  7. Complete Genome Sequence of the Human Gut Symbiont Roseburia hominis

    DEFF Research Database (Denmark)

    Travis, Anthony J.; Kelly, Denise; Flint, Harry J

    2015-01-01

    We report here the complete genome sequence of the human gut symbiont Roseburia hominis A2-183(T) (= DSM 16839(T) = NCIMB 14029(T)), isolated from human feces. The genome is represented by a 3,592,125-bp chromosome with 3,405 coding sequences. A number of potential functions contributing to host...

  8. Automated cleaning and pre-processing of immunoglobulin gene sequences from high-throughput sequencing

    Directory of Open Access Journals (Sweden)

    Miri eMichaeli

    2012-12-01

    Full Text Available High throughput sequencing (HTS) yields tens of thousands to millions of sequences that require a large amount of pre-processing work to clean various artifacts. Such cleaning cannot be performed manually. Existing programs are not suitable for immunoglobulin (Ig) genes, which are variable and often highly mutated. This paper describes Ig-HTS-Cleaner (Ig High Throughput Sequencing Cleaner), a program containing a simple cleaning procedure that successfully deals with pre-processing of Ig sequences derived from HTS, and Ig-Indel-Identifier (Ig Insertion-Deletion Identifier), a program for identifying legitimate and artifact insertions and/or deletions (indels). Our programs were designed for analyzing Ig gene sequences obtained by 454 sequencing, but they are applicable to all types of sequences and sequencing platforms. Ig-HTS-Cleaner and Ig-Indel-Identifier have been implemented in Java and saved as executable JAR files, supported on Linux and MS Windows. No special requirements are needed in order to run the programs, except for correctly constructing the input files as explained in the text. The programs' performance has been tested and validated on real and simulated data sets.

  9. Prediction of transcriptional regulatory sites in the complete genome sequence of Escherichia coli K-12.

    Science.gov (United States)

    Thieffry, D; Salgado, H; Huerta, A M; Collado-Vides, J

    1998-06-01

    As one of the best-characterized free-living organisms, Escherichia coli and its recently completed genomic sequence offer a special opportunity to exploit systematically the variety of regulatory data available in the literature in order to make a comprehensive set of regulatory predictions in the whole genome. The complete genome sequence of E. coli was analyzed for the binding of transcriptional regulators upstream of coding sequences. The biological information contained in RegulonDB (Huerta, A.M. et al., Nucleic Acids Res., 26, 55-60, 1998) for 56 different transcriptional proteins was the support to implement a stringent strategy combining string search and weight matrices. We estimate that our search included representatives of 15-25% of the total number of regulatory binding proteins in E. coli. This search was performed on the set of 4288 putative regulatory regions, each 450 bp long. Within the regions with predicted sites, 89% are regulated by one protein and 81% involve only one site. These numbers are reasonably consistent with the distribution of experimental regulatory sites. Regulatory sites are found in 603 regions corresponding to 16% of operon regions and 10% of intra-operonic regions. Additional evidence gives stronger support to some of these predictions, including the position of the site, biological consistency with the function of the downstream gene, as well as genetic evidence for the regulatory interaction. The predictions described here were incorporated into the map presented in the paper describing the complete E. coli genome (Blattner, F.R. et al., Science, 277, 1453-1461, 1997). The complete set of predictions in GenBank format is available at the url: http://www.cifn.unam.mx/Computational_Biology/E.coli-predictions ecoli-reg@cifn.unam.mx, collado@cifn.unam.mx
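
    The weight-matrix part of such a search can be sketched in a few lines. The Python example below is only an illustration of the general technique, not the authors' procedure: it assumes a log-odds matrix built from aligned sites with a pseudocount and a uniform background, and the sites, the scanned sequence and the threshold are hypothetical toy values.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds position weight matrix from equal-length sites."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(seq, pwm, threshold):
    """Return (start, window, score) for every window scoring >= threshold.

    Assumes seq contains only the letters A, C, G and T.
    """
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Hypothetical aligned binding sites and a toy upstream region.
toy_sites = ["TTGACA", "TTGACT", "TTGATA", "TTTACA"]
toy_region = "GGCATTGACAGGCTA"
hits = scan(toy_region, build_pwm(toy_sites), threshold=5.0)
print(hits)
```

    In practice the threshold controls the stringency of the search; a higher value reports fewer candidate sites at the cost of missing weaker ones.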

  10. Complete chloroplast genome sequence of green foxtail (Setaria viridis), a promising model system for C4 photosynthesis.

    Science.gov (United States)

    Wang, Shuo; Gao, Li-Zhi

    2016-09-01

    The complete chloroplast genome of green foxtail (Setaria viridis), a promising model system for C4 photosynthesis, is first reported in this study. The genome harbors a large single copy (LSC) region of 81 016 bp and a small single copy (SSC) region of 12 456 bp separated by a pair of inverted repeat (IRa and IRb) regions of 22 315 bp. GC content is 38.92%. The proportion of coding sequence is 57.97%, comprising 111 (19 duplicated in IR regions) unique genes, 71 of which are protein-coding genes, four are rRNA genes, and 36 are tRNA genes. Phylogenetic analysis indicated that S. viridis was clustered with its cultivated species S. italica in the tribe Paniceae of the family Poaceae. This newly determined chloroplast genome will provide valuable genetic resources to assist future studies on C4 photosynthesis in grasses.

  11. Complete chloroplast genome sequence of a tree fern Alsophila spinulosa: insights into evolutionary changes in fern chloroplast genomes.

    Science.gov (United States)

    Gao, Lei; Yi, Xuan; Yang, Yong-Xia; Su, Ying-Juan; Wang, Ting

    2009-06-11

    Ferns have generally been neglected in studies of chloroplast genomics. Before this study, only one polypod and two basal ferns had their complete chloroplast (cp) genome reported. Tree ferns represent an ancient fern lineage that first occurred in the Late Triassic. In recent phylogenetic analyses, tree ferns were shown to be the sister group of polypods, the most diverse group of living ferns. Availability of cp genome sequence from a tree fern will facilitate interpretation of the evolutionary changes of fern cp genomes. Here we have sequenced the complete cp genome of a scaly tree fern Alsophila spinulosa (Cyatheaceae). The Alsophila cp genome is 156,661 base pairs (bp) in size, and has a typical quadripartite structure with the large (LSC, 86,308 bp) and small single copy (SSC, 21,623 bp) regions separated by two copies of an inverted repeat (IRs, 24,365 bp each). This genome contains 117 different genes encoding 85 proteins, 4 rRNAs and 28 tRNAs. Pseudogenes of ycf66 and trnT-UGU are also detected in this genome. A unique trnR-UCG gene (derived from trnR-CCG) is found between rbcL and accD. The Alsophila cp genome shares some unusual characteristics with the previously sequenced cp genome of the polypod fern Adiantum capillus-veneris, including the absence of 5 tRNA genes that exist in most other cp genomes. The genome shows a high degree of synteny with that of Adiantum, but differs considerably from two basal ferns (Angiopteris evecta and Psilotum nudum). At one endpoint of an ancient inversion we detected a highly repeated 565-bp-region that is absent from the Adiantum cp genome. An additional minor inversion of the trnD-GUC, which is possibly shared by all ferns, was identified by comparison between the fern and other land plant cp genomes. By comparing four fern cp genome sequences it was confirmed that two major rearrangements distinguish higher leptosporangiate ferns from basal fern lineages. The Alsophila cp genome is very similar to that of the

  12. Complete chloroplast genome sequence of a tree fern Alsophila spinulosa: insights into evolutionary changes in fern chloroplast genomes

    Directory of Open Access Journals (Sweden)

    Yang Yong-Xia

    2009-06-01

    Full Text Available Abstract Background Ferns have generally been neglected in studies of chloroplast genomics. Before this study, only one polypod and two basal ferns had their complete chloroplast (cp) genome reported. Tree ferns represent an ancient fern lineage that first occurred in the Late Triassic. In recent phylogenetic analyses, tree ferns were shown to be the sister group of polypods, the most diverse group of living ferns. Availability of cp genome sequence from a tree fern will facilitate interpretation of the evolutionary changes of fern cp genomes. Here we have sequenced the complete cp genome of a scaly tree fern Alsophila spinulosa (Cyatheaceae). Results The Alsophila cp genome is 156,661 base pairs (bp) in size, and has a typical quadripartite structure with the large (LSC, 86,308 bp) and small single copy (SSC, 21,623 bp) regions separated by two copies of an inverted repeat (IRs, 24,365 bp each). This genome contains 117 different genes encoding 85 proteins, 4 rRNAs and 28 tRNAs. Pseudogenes of ycf66 and trnT-UGU are also detected in this genome. A unique trnR-UCG gene (derived from trnR-CCG) is found between rbcL and accD. The Alsophila cp genome shares some unusual characteristics with the previously sequenced cp genome of the polypod fern Adiantum capillus-veneris, including the absence of 5 tRNA genes that exist in most other cp genomes. The genome shows a high degree of synteny with that of Adiantum, but differs considerably from two basal ferns (Angiopteris evecta and Psilotum nudum). At one endpoint of an ancient inversion we detected a highly repeated 565-bp-region that is absent from the Adiantum cp genome. An additional minor inversion of the trnD-GUC, which is possibly shared by all ferns, was identified by comparison between the fern and other land plant cp genomes. Conclusion By comparing four fern cp genome sequences it was confirmed that two major rearrangements distinguish higher leptosporangiate ferns from basal fern lineages. The

  13. Complete genome sequence of the gliding, heparinolytic Pedobacter saltans type strain (113).

    Science.gov (United States)

    Liolios, Konstantinos; Sikorski, Johannes; Lu, Meagan; Nolan, Matt; Lapidus, Alla; Lucas, Susan; Hammon, Nancy; Deshpande, Shweta; Cheng, Jan-Fang; Tapia, Roxanne; Han, Cliff; Goodwin, Lynne; Pitluck, Sam; Huntemann, Marcel; Ivanova, Natalia; Pagani, Ioanna; Mavromatis, Konstantinos; Ovchinikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Brambilla, Evelyne-Marie; Kotsyurbenko, Oleg; Rohde, Manfred; Tindall, Brian J; Abt, Birte; Göker, Markus; Detter, John C; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C

    2011-10-15

    Pedobacter saltans Steyn et al. 1998 is one of currently 32 species in the genus Pedobacter within the family Sphingobacteriaceae. The species is of interest for its isolated location in the tree of life. Like other members of the genus, P. saltans is heparinolytic. Cells of P. saltans show a peculiar gliding, dancing motility and can be distinguished from other Pedobacter strains by their ability to utilize glycerol and their inability to assimilate D-cellobiose. The genome presented here is only the second completed genome sequence of a type strain from a member of the family Sphingobacteriaceae to be published. The 4,635,236 bp long genome with its 3,854 protein-coding and 67 RNA genes consists of one chromosome, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  14. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    Phylogenetic analysis suggests that our sequences are clustered with sequences reported from Japan. This is the first phylogenetic analysis of HCV core gene from Pakistani population. Our sequences and sequences from Japan are grouped into same cluster in the phylogenetic tree. Sequence comparison and ...

  15. A complete mitochondrial genome sequence of the wild two-humped camel (Camelus bactrianus ferus): an evolutionary history of Camelidae

    Directory of Open Access Journals (Sweden)

    Meng He

    2007-07-01

    Full Text Available Abstract Background The family Camelidae that evolved in North America during the Eocene survived with two distinct tribes, Camelini and Lamini. To investigate the evolutionary relationship between them and to further understand the evolutionary history of this family, we determined the complete mitochondrial genome sequence of the wild two-humped camel (Camelus bactrianus ferus), the only wild survivor of the Old World camel. Results The mitochondrial genome sequence (16,680 bp) from C. bactrianus ferus contains 13 protein-coding, two rRNA, and 22 tRNA genes as well as a typical control region; this basic structure is shared by all metazoan mitochondrial genomes. Its protein-coding region exhibits codon usage common to all mammals and possesses the three cryptic stop codons shared by all vertebrates. C. bactrianus ferus, together with the rest of the mammalian species, does not share a triplet nucleotide insertion (GCC) that encodes a proline residue found only in the nd1 gene of the New World camelid Lama pacos. This lineage-specific insertion in the L. pacos mtDNA, which occurred after the split between the Old and New World camelids, may have functional implications, since a proline insertion in a protein backbone usually alters protein conformation significantly, and the nd1 gene has not been seen to be as polymorphic as the rest of the ND family genes among camelids. Our phylogenetic study based on complete mitochondrial genomes excluding the control region suggested that the divergence of the two tribes may have occurred in the early Miocene; this is much earlier than what was deduced from the fossil record (11 million years). An evolutionary history reconstructed for the family Camelidae based on cytb sequences suggested that the split of bactrian camel and dromedary may have occurred in North America before the tribe Camelini migrated from North America to Asia. Conclusion Molecular clock analysis of complete mitochondrial genomes from C. bactrianus ferus and L

  16. The complete mitochondrial genomes of three parasitic nematodes of birds: a unique gene order and insights into nematode phylogeny

    Science.gov (United States)

    2013-01-01

    Background Analyses of mitochondrial (mt) genome sequences in recent years challenge the current working hypothesis of Nematoda phylogeny proposed from morphology, ecology and nuclear small subunit rRNA gene sequences, and raise the need to sequence additional mt genomes for a broad range of nematode lineages. Results We sequenced the complete mt genomes of three Ascaridia species (family Ascaridiidae) that infest chickens, pigeons and parrots, respectively. These three Ascaridia species have an identical arrangement of mt genes to each other but differ substantially from other nematodes. Phylogenetic analyses of the mt genome sequences of the Ascaridia species, together with 62 other nematode species, support the monophylies of seven high-level taxa of the phylum Nematoda: 1) the subclass Dorylaimia; 2) the orders Rhabditida, Trichinellida and Mermithida; 3) the suborder Rhabditina; and 4) the infraorders Spiruromorpha and Oxyuridomorpha. Analyses of mt genome sequences, however, reject the monophylies of the suborders Spirurina and Tylenchina, and the infraorders Rhabditomorpha, Panagrolaimomorpha and Tylenchomorpha. Monophyly of the infraorder Ascaridomorpha varies depending on the methods of phylogenetic analysis. The Ascaridomorpha was more closely related to the infraorders Rhabditomorpha and Diplogasteromorpha (suborder Rhabditina) than it was to the other two infraorders of the Spirurina: Oxyuridomorpha and Spiruromorpha. The closer relationship among Ascaridomorpha, Rhabditomorpha and Diplogasteromorpha was also supported by a shared common pattern of mitochondrial gene arrangement. Conclusions Analyses of mitochondrial genome sequences and gene arrangement have provided novel insights into the phylogenetic relationships among several major lineages of nematodes. Many lineages of nematodes, however, are underrepresented or not represented in these analyses. Expanding taxon sampling is necessary for future phylogenetic studies of nematodes with mt genome

  17. Complete Genome Sequence of Bifidobacterium bifidum S17

    Science.gov (United States)

    Zhurina, Daria; Zomer, Aldert; Gleinser, Marita; Brancaccio, Vincenco Francesco; Auchter, Marc; Waidmann, Mark S.; Westermann, Christina; van Sinderen, Douwe; Riedel, Christian U.

    2011-01-01

    Here, we report on the first completely annotated genome sequence of a Bifidobacterium bifidum strain. B. bifidum S17, isolated from feces of a breast-fed infant, was shown to strongly adhere to intestinal epithelial cells and has potent anti-inflammatory activity in vitro and in vivo. The genome sequence will provide new insights into the biology of this potential probiotic organism and allow for the characterization of the molecular mechanisms underlying its beneficial properties. PMID:21037011

  18. Complete genome sequence of Vibrio anguillarum strain NB10, a virulent isolate from the Gulf of Bothnia.

    Science.gov (United States)

    Holm, Kåre Olav; Nilsson, Kristina; Hjerde, Erik; Willassen, Nils-Peder; Milton, Debra L

    2015-01-01

    Vibrio anguillarum causes a fatal hemorrhagic septicemia in marine fish that leads to great economical losses in aquaculture world-wide. Vibrio anguillarum strain NB10 serotype O1 is a Gram-negative, motile, curved rod-shaped bacterium, isolated from a diseased fish on the Swedish coast of the Gulf of Bothnia, and is slightly halophilic. Strain NB10 is a virulent isolate that readily colonizes fish skin and intestinal tissues. Here, the features of this bacterium are described and the annotation and analysis of its complete genome sequence is presented. The genome is 4,373,835 bp in size, consists of two circular chromosomes and one plasmid, and contains 3,783 protein-coding genes and 129 RNA genes.

  19. Complete nucleotide sequence of pGA45, a 140,698-bp incFIIY plasmid encoding blaIMI-3-mediated carbapenem resistance, from river sediment

    Directory of Open Access Journals (Sweden)

    Bingjun Dang

    2016-02-01

    Full Text Available Plasmid pGA45 was isolated from the sediment of the Haihe River using E. coli CV601 (gfp-tagged) as recipients and indigenous bacteria from sediment as donors. This plasmid confers reduced susceptibility to imipenem, which belongs to the carbapenem group. Plasmid pGA45 was fully sequenced on an Illumina HiSeq 2000 sequencing system. The complete sequence of plasmid pGA45 was 140,698 bp in length with an average G+C content of 52.03%. Sequence analysis shows that pGA45 belongs to the incFIIY group and harbors a backbone region that shares high homology and gene synteny with several other incF plasmids, including pNDM1_EC14653, pYDC644, pNDM-Ec1GN574, pRJF866, pKOX_NDM1 and pP10164-NDM. In addition to the backbone region, plasmid pGA45 harbors two notable features: one blaIMI-3-containing region and one type VI secretion system region. The blaIMI-3-containing region is responsible for bacterial carbapenem resistance, and the type VI secretion system region is probably involved in bacterial virulence. Plasmid pGA45 represents the first complete nucleotide sequence of a blaIMI-harboring plasmid from an environmental sample, and the sequencing of this plasmid provided insight into the architecture used for the dissemination of blaIMI carbapenemase genes.

  20. Complete genome sequence of Arcanobacterium haemolyticum type strain (11018T)

    Energy Technology Data Exchange (ETDEWEB)

    Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Teshima, Hazuki [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany

    2010-01-01

    Vulcanisaeta distributa Itoh et al. 2002 belongs to the family Thermoproteaceae in the phylum Crenarchaeota. The genus Vulcanisaeta is characterized by a global distribution in hot and acidic springs. This is the first genome sequence from a member of the genus Vulcanisaeta and seventh genome sequence in the family Thermoproteaceae. The 2,374,137 bp long genome with its 2,544 protein-coding and 49 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  1. Mitochondrial genome of the Komodo dragon: efficient sequencing method with reptile-oriented primers and novel gene rearrangements.

    Science.gov (United States)

    Kumazawa, Yoshinori; Endo, Hideki

    2004-04-30

    The mitochondrial genome of the Komodo dragon (Varanus komodoensis) was nearly completely sequenced, except for two highly repetitive noncoding regions. An efficient sequencing method for squamate mitochondrial genomes was established by combining the long polymerase chain reaction (PCR) technology and a set of reptile-oriented primers designed for nested PCR amplifications. It was found that the mitochondrial genome had novel gene arrangements in which genes from NADH dehydrogenase subunit 6 to proline tRNA were extensively shuffled with duplicate control regions. These control regions had 99% sequence similarity over 700 bp. Although snake mitochondrial genomes are also known to possess duplicate control regions with nearly identical sequences, the location of the second control region suggested independent occurrence of the duplication on lineages leading to snakes and the Komodo dragon. Another feature of the mitochondrial genome of the Komodo dragon was the considerable number of tandem repeats, including sequences with a strong secondary structure, as a possible site for the slipped-strand mispairing in replication. These observations are consistent with hypotheses that tandem duplications via the slipped-strand mispairing may induce mitochondrial gene rearrangements and may serve to maintain similar copies of the control region.

  2. Comparative analysis of complete chloroplast genome sequence and inversion variation in Lasthenia burkei (Madieae, Asteraceae).

    Science.gov (United States)

    Walker, Joseph F; Zanis, Michael J; Emery, Nancy C

    2014-04-01

    Complete chloroplast genome studies can help resolve relationships among large, complex plant lineages such as Asteraceae. We present the first whole plastome from the Madieae tribe and compare its sequence variation to other chloroplast genomes in Asteraceae. We used high throughput sequencing to obtain the Lasthenia burkei chloroplast genome. We compared sequence structure and rates of molecular evolution in the small single copy (SSC), large single copy (LSC), and inverted repeat (IR) regions to those for eight Asteraceae accessions and one Solanaceae accession. The chloroplast sequence of L. burkei is 150 746 bp and contains 81 unique protein coding genes and 4 coding ribosomal RNA sequences. We identified three major inversions in the L. burkei chloroplast, all of which have been found in other Asteraceae lineages, and a previously unreported inversion in Lactuca sativa. Regions flanking inversions contained tRNA sequences, but did not have particularly high G + C content. Substitution rates varied among the SSC, LSC, and IR regions, and rates of evolution within each region varied among species. Some observed differences in rates of molecular evolution may be explained by the relative proportion of coding to noncoding sequence within regions. Rates of molecular evolution vary substantially within and among chloroplast genomes, and major inversion events may be promoted by the presence of tRNAs. Collectively, these results provide insight into different mechanisms that may promote intramolecular recombination and the inversion of large genomic regions in the plastome.

  3. The Complete Plastid Genome Sequence of Madagascar Periwinkle Catharanthus roseus (L.) G. Don: Plastid Genome Evolution, Molecular Marker Identification, and Phylogenetic Implications in Asterids

    Science.gov (United States)

    Ku, Chuan; Chung, Wan-Chia; Chen, Ling-Ling; Kuo, Chih-Horng

    2013-01-01

    The Madagascar periwinkle (Catharanthus roseus in the family Apocynaceae) is an important medicinal plant and is the source of several widely marketed chemotherapeutic drugs. It is also commonly grown for its ornamental values and, due to ease of infection and distinctiveness of symptoms, is often used as the host for studies on phytoplasmas, an important group of uncultivated plant pathogens. To gain insights into the characteristics of apocynaceous plastid genomes (plastomes), we used a reference-assisted approach to assemble the complete plastome of C. roseus, which could be applied to other C. roseus-related studies. The C. roseus plastome is the second completely sequenced plastome in the asterid order Gentianales. We performed comparative analyses with two other representative sequences in the same order, including the complete plastome of Coffea arabica (from the basal Gentianales family Rubiaceae) and the nearly complete plastome of Asclepias syriaca (Apocynaceae). The results demonstrated considerable variations in gene content and plastome organization within Apocynaceae, including the presence/absence of three essential genes (i.e., accD, clpP, and ycf1) and large size changes in non-coding regions (e.g., rps2-rpoC2 and IRb-ndhF). To find plastome markers of potential utility for Catharanthus breeding and phylogenetic analyses, we identified 41 C. roseus-specific simple sequence repeats. Furthermore, five intergenic regions with high divergence between C. roseus and three other euasterids I taxa were identified as candidate markers. To resolve the euasterids I interordinal relationships, 82 plastome genes were used for phylogenetic inference. With the addition of representatives from Apocynaceae and sampling of most other asterid orders, a sister relationship between Gentianales and Solanales is supported. PMID:23825699

  4. Partial Sequencing of 16S rRNA Gene of Selected Staphylococcus aureus Isolates and its Antibiotic Resistance

    Directory of Open Access Journals (Sweden)

    Harsi Dewantari Kusumaningrum

    2016-08-01

    Full Text Available The choice of primer used in 16S rRNA sequencing for identification of Staphylococcus species found in food is important. This study aimed to characterize Staphylococcus aureus isolates by partial sequencing of the 16S rRNA gene employing primers 16sF, 63F or 1387R. The isolates were obtained from milk, egg dishes and chicken dishes and selected based on the presence of the sea gene, which is responsible for formation of enterotoxin A. Antibiotic susceptibility of the isolates towards six antibiotics was also tested. The use of 16sF generally resulted in higher identity percentage and query coverage than sequencing with 63F or 1387R. BLAST results for all isolates sequenced with 16sF showed 99% homology to the complete genomes of four S. aureus strains with different characteristics in enterotoxin production and antibiotic resistance. Considering that all isolates carried the sea gene, as indicated by the occurrence of a 120 bp amplicon after PCR amplification using primers SEA1/SEA2, the isolates agreed most closely with S. aureus subsp. aureus ST288. This study indicated that 4 out of 8 selected isolates were resistant to streptomycin. 16S rRNA gene sequencing using 16sF is useful for identification of S. aureus. However, additional analysis, such as PCR targeting a specific gene, should provide valuable supplementary information when a specific characteristic is expected.

  5. Nucleotide sequence of the triosephosphate isomerase gene from Macaca mulatta

    Energy Technology Data Exchange (ETDEWEB)

    Old, S.E.; Mohrenweiser, H.W. (Univ. of Michigan, Ann Arbor (USA))

    1988-09-26

    The triosephosphate isomerase gene from a rhesus monkey, Macaca mulatta, charon 34 library was sequenced. The human and chimpanzee enzymes differ from the rhesus enzyme at ASN 20 and GLU 198. The nucleotide sequence identity between rhesus and human is 97% in the coding region and >94% in the flanking regions. Comparison of the rhesus and chimp genes, including the intron and flanking sequences, does not suggest a mechanism for generating the two TPI peptides of proliferating cells from hominoids and a single peptide from the rhesus gene.

  6. Complete Genome Sequence of Yersinia pestis Strains Antiqua and Nepal516: Evidence of Gene Reduction in an Emerging Pathogen

    Energy Technology Data Exchange (ETDEWEB)

    Chain, Patrick S [ORNL; Hu, Ping [Lawrence Berkeley National Laboratory (LBNL); Malfatti, Stephanie [Lawrence Livermore National Laboratory (LLNL); Radnedge, Lyndsay [Lawrence Livermore National Laboratory (LLNL); Larimer, Frank W [ORNL; Vergez, Lisa [Lawrence Livermore National Laboratory (LLNL); Worsham, Patricia [U.S. Army Medical Research Institute of Infectious Diseases; Chu, May C [Centers for Disease Control and Prevention; Anderson, Gary L [Lawrence Berkeley National Laboratory (LBNL)

    2006-01-01

    Yersinia pestis, the causative agent of bubonic and pneumonic plagues, has undergone detailed study at the molecular level. To further investigate the genomic diversity among this group and to help characterize lineages of the plague organism that have no sequenced members, we present here the genomes of two isolates of the "classical" antiqua biovar, strains Antiqua and Nepal516. The genomes of Antiqua and Nepal516 are 4.7 Mb and 4.5 Mb and encode 4,138 and 3,956 open reading frames, respectively. Though both strains belong to one of the three classical biovars, they represent separate lineages defined by recent phylogenetic studies. We compare all five currently sequenced Y. pestis genomes and the corresponding features in Yersinia pseudotuberculosis. There are strain-specific rearrangements, insertions, deletions, single nucleotide polymorphisms, and a unique distribution of insertion sequences. We found 453 single nucleotide polymorphisms in protein-coding regions, which were used to assess the evolutionary relationships of these Y. pestis strains. Gene reduction analysis revealed that the gene deletion processes are under selective pressure, and many of the inactivations are probably related to the organism's interaction with its host environment. The results presented here clearly demonstrate the differences between the two biovar antiqua lineages and support the notion that grouping Y. pestis strains based strictly on the classical definition of biovars (predicated upon two biochemical assays) does not accurately reflect the phylogenetic relationships within this species. A comparison of four virulent Y. pestis strains with the human-avirulent strain 91001 provides further insight into the genetic basis of virulence to humans.

  7. Confirmation and Sequence analysis of N gene of PPRV in South Xinjiang, China

    Directory of Open Access Journals (Sweden)

    YongHong Liu

    Full Text Available ABSTRACT In China, peste des petits ruminants (PPR) was first officially reported in 2007. From 2010 until the outbreak of 2013, PPRV infection was not reported. In November 2013, PPRV re-emerged in Xinjiang and rapidly spread to 22 P/A/M (provinces, autonomous regions and municipalities) of China. In this study, suspected PPRV-infected sheep on a breeding farm in South Xinjiang in 2014 were diagnosed, and the characteristics of the complete sequence of the PPRV N protein gene were analyzed. The sheep showed signs of PPRV infection, such as fever, increased oronasal secretions, dyspnea and diarrhea, with 60% morbidity and a 21.1% fatality rate. Macroscopic lesions at autopsy and histopathological changes observed under the light microscope included stomatitis, broncho-interstitial pneumonia, catarrhal hemorrhagic enteritis and intracytoplasmic eosinophilic inclusions in multinucleated giant cells in the lung. The formalin-fixed mixed tissue samples were positive by nucleic acid extraction and RT-PCR detection. The N protein gene nucleotide sequence of the China/XJNJ/2014 strain showed extremely high homology with the China/XJYL/2013 strain and, among exotic strains, the highest homology with the PRADESH_95 strain from India. Phylogenetic analysis based on the complete sequence of the N protein gene of PPRV showed that the China/XJNJ/2014 strain, the other 2013-2014 strains in this study and the Tibetan strains all belonged to lineage Ⅳ, but the 2013-2014 strains in this study and the Tibetan strains were in different sub-branches.

  8. Cloning and sequencing of the gene for human β-casein

    International Nuclear Information System (INIS)

    Loennerdal, B.; Bergstroem, S.; Andersson, Y.; Hialmarsson, K.; Sundgyist, A.; Hernell, O.

    1990-01-01

    Human β-casein is a major protein in human milk. This protein is part of the casein micelle and has been suggested to have several physiological functions in the newborn. Since there is limited information on β-casein and the factors that affect its concentration in human milk, the authors have isolated and sequenced the gene for this protein. A human mammary gland cDNA library (Clontech) in λgt11 was screened by plaque hybridization using a 42-mer synthetic 32P-labelled oligonucleotide. Positive clones were identified and isolated, DNA was prepared and the gene isolated by cleavage with EcoRI. Following subcloning (pUC18), restriction mapping and Southern blotting, DNA for sequencing was prepared. The gene was sequenced by the dideoxy method. Human β-casein has 212 amino acids, and the amino acid sequence deduced from the nucleotide sequence is 91% identical to the published sequence for human β-casein, showing a high degree of conservation at the leader peptide and the highly phosphorylated sequences, but also deletions and divergence at several positions. These results provide insight into the structure of the human β-casein gene and will facilitate studies on factors affecting its expression.

  9. The complete amino acid sequence of human erythrocyte diphosphoglycerate mutase.

    Science.gov (United States)

    Haggarty, N W; Dunbar, B; Fothergill, L A

    1983-01-01

    The complete amino acid sequence of human erythrocyte diphosphoglycerate mutase, comprising 239 residues, was determined. The sequence was deduced from the four cyanogen bromide fragments, and from the peptides derived from these fragments after digestion with a number of proteolytic enzymes. Comparison of this sequence with that of the yeast glycolytic enzyme, phosphoglycerate mutase, shows that these enzymes are 47% identical. Most, but not all, of the residues implicated as being important for the activity of the glycolytic mutase are conserved in the erythrocyte diphosphoglycerate mutase. PMID:6313356

  10. Complete Genome Sequence of a Novel Reassortant Avian Influenza H1N2 Virus Isolated from a Domestic Sparrow in 2012

    OpenAIRE

    Xie, Zhixun; Guo, Jie; Xie, Liji; Liu, Jiabo; Pang, Yaoshan; Deng, Xianwen; Xie, Zhiqin; Fan, Qing; Luo, Sisi

    2013-01-01

    We report here the complete genome sequence of a novel H1N2 avian influenza virus strain, A/Sparrow/Guangxi/GXs-1/2012 (H1N2), isolated from a sparrow in the Guangxi Province of southern China in 2012. All of the 8 gene segments (hemagglutinin [HA], nucleoprotein [NP], matrix [M], polymerase basic 2 [PB2], neuraminidase [NA], polymerase acidic [PA], polymerase basic 1 [PB1], and nonstructural [NS] genes) of this natural recombinant virus are attributed to the Eurasian lineage, and phylogenet...

  11. Nucleotide sequence of the human N-myc gene

    International Nuclear Information System (INIS)

    Stanton, L.W.; Schwab, M.; Bishop, J.M.

    1986-01-01

    Human neuroblastomas frequently display amplification and augmented expression of a gene known as N-myc because of its similarity to the protooncogene c-myc. It has therefore been proposed that N-myc is itself a protooncogene, and subsequent tests have shown that N-myc and c-myc have similar biological activities in cell culture. The authors have now detailed the kinship between N-myc and c-myc by determining the nucleotide sequence of human N-myc and deducing the amino acid sequence of the protein encoded by the gene. The topography of N-myc is strikingly similar to that of c-myc: both genes contain three exons of similar lengths; the coding elements of both genes are located in the second and third exons; and both genes have unusually long 5' untranslated regions in their mRNAs, with features that raise the possibility that expression of the genes may be subject to similar controls of translation. The resemblance between the proteins encoded by N-myc and c-myc sustains previous suspicions that the genes encode related functions

  12. Definition of the complete Schistosoma mansoni hemoglobinase mRNA sequence and gene expression in developing parasites.

    Science.gov (United States)

    el Meanawy, M A; Aji, T; Phillips, N F; Davis, R E; Salata, R A; Malhotra, I; McClain, D; Aikawa, M; Davis, A H

    1990-07-01

    Schistosoma mansoni uses a variety of proteases termed hemoglobinases to obtain nutrition from host globin. Previous reports have characterized cDNAs encoding 1 of these enzymes. However, these sequences did not define the primary structures of the mRNA and protein. The complete sequence of the 1390 base mRNA has now been determined. It encodes a 50 kDa primary translation product. In vitro translations coupled with immunoprecipitations and Western blots of parasite lysates allowed visualization of the 50 kDa form. Production of the 31 kDa mature hemoglobinase from the 50 kDa species involves removal of both NH2 and COOH terminal residues from the primary translation product. Expression of hemoglobinase mRNA and protein was examined during larval parasite development. Low levels were observed in young schistosomula. After 6-9 days in culture, high hemoglobinase levels were seen which correlated with the onset of red blood cell feeding. Immunoelectron microscopy was employed to examine hemoglobinase location and function. In adult worms the enzyme was associated with the gut lumen and gut epithelium. In cercariae, the protease was observed in the head gland, suggesting new roles for the protease.

  13. Complete genome sequence of Bacillus methylotrophicus JJ-D34 isolated from doenjang, a Korean traditional fermented soybean paste.

    Science.gov (United States)

    Jung, Ji Young; Chun, Byung Hee; Moon, Ji Young; Yeo, Soo-Hwan; Jeon, Che Ok

    2016-02-10

    Bacillus methylotrophicus JJ-D34, which shows good proteolytic and antipathogenic activities, was isolated from doenjang, a Korean traditional fermented soybean paste. Here, we report the complete genome sequence of strain JJ-D34, which harbors a 4,105,955 bp circular chromosome encoding 4044 genes with a 46.24% G+C content. The sequence will provide insights into the genomic basis of its effects and facilitate its application to doenjang fermentation. Copyright © 2015 Elsevier B.V. All rights reserved.
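
    The G+C content quoted in records such as the one above is the percentage of bases in a sequence that are G or C. A minimal Python sketch of that calculation follows; the function name `gc_content` and the toy sequence are illustrative assumptions and are not taken from any of the listed studies.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    # Ignore ambiguity codes such as N so they do not dilute the percentage.
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


toy_sequence = "ATGCGCATTA"  # hypothetical 10-base example, 4 of which are G or C
print(f"{gc_content(toy_sequence):.2f}% G+C")
```

    The A+T content reported for some mitochondrial genomes is simply 100 minus this value when only unambiguous bases are counted.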

  14. On the total number of genes and their length distribution in complete microbial genomes

    DEFF Research Database (Denmark)

    Skovgaard, M; Jensen, L J; Brunak, S

    2001-01-01

    In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300 genes, we show that it probably has only approximately 3800 genes, and that a similar discrepancy exists for almost all published genomes.
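
    The idea of comparing the two length distributions can be illustrated with a deliberately simplified two-bin sketch in Python. This is not the authors' procedure: the function name `estimate_true_gene_count`, the 300-codon cutoff, and the assumption that every long annotated gene is real are all hypothetical choices made only for illustration.

```python
def estimate_true_gene_count(annotated, matched, long_cutoff=300):
    """Estimate the number of real genes from two lists of gene lengths.

    annotated: lengths of all annotated genes.
    matched:   lengths of the annotated genes that match a known protein.
    Assumption (hypothetical): all genes at or above long_cutoff are real, so
    the share of long genes with a match estimates how often a real gene of
    any length has a match.
    """
    long_annotated = sum(1 for n in annotated if n >= long_cutoff)
    long_matched = sum(1 for n in matched if n >= long_cutoff)
    if long_annotated == 0 or long_matched == 0:
        raise ValueError("need long genes with matches to calibrate the match rate")
    match_rate = long_matched / long_annotated
    short_annotated = len(annotated) - long_annotated
    short_matched = len(matched) - long_matched
    # Scale up the matched short genes, but never beyond what was annotated.
    estimated_short = min(short_annotated, short_matched / match_rate)
    return long_annotated + estimated_short


# Toy data: 100 long and 100 short annotated genes; 80 long and 40 short have matches.
annotated = [400] * 100 + [100] * 100
matched = [400] * 80 + [100] * 40
print(estimate_true_gene_count(annotated, matched))
```

    In this toy case the short genes are matched half as often as the long ones, so roughly half of the short annotations are treated as chance open reading frames.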

  15. The complete mitogenome of the whale shark parasitic copepod Pandarus rhincodonicus Norman, Newbound & Knott (Crustacea; Siphonostomatoida; Pandaridae)--a new gene order for the Copepoda.

    Science.gov (United States)

    Austin, Christopher M; Tan, Mun Hua; Lee, Yin Peng; Croft, Laurence J; Meekan, Mark G; Pierce, Simon J; Gan, Han Ming

    2016-01-01

    The complete mitochondrial genome of the parasitic copepod Pandarus rhincodonicus was obtained from a partial genome scan using the HiSeq sequencing system. The Pandarus rhincodonicus mitogenome has 14,480 base pairs (62% A+T content) made up of 12 protein-coding genes, 2 ribosomal subunit genes, 22 transfer RNAs, and a putative 384 bp non-coding AT-rich region. This Pandarus mitogenome sequence is the first for the family Pandaridae, the second for the order Siphonostomatoida and the sixth for the Copepoda.

  16. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    Directory of Open Access Journals (Sweden)

    Ceiridwen J Edwards

    Full Text Available BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically-verified and exceptionally-well preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738+/-68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously

  17. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    LENUS (Irish Health Repository)

    Edwards, Ceiridwen J

    2010-01-01

    BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically-verified and exceptionally-well preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738+/-68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously identified

  18. The complete chloroplast genome sequence of Mahonia bealei (Berberidaceae) reveals a significant expansion of the inverted repeat and phylogenetic relationship with other angiosperms.

    Science.gov (United States)

    Ma, Ji; Yang, Bingxian; Zhu, Wei; Sun, Lianli; Tian, Jingkui; Wang, Xumin

    2013-10-10

    Mahonia bealei (Berberidaceae) is a frequently-used traditional Chinese medicinal plant with efficient anti-inflammatory ability. This plant is one of the sources of berberine, a new cholesterol-lowering drug with anti-diabetic activity. We have sequenced the complete nucleotide sequence of the chloroplast (cp) genome of M. bealei. The complete cp genome of M. bealei is 164,792 bp in length, and has a typical structure with large (LSC 73,052 bp) and small (SSC 18,591 bp) single-copy regions separated by a pair of inverted repeats (IRs 36,501 bp) of large size. The Mahonia cp genome contains 111 unique genes and 39 genes are duplicated in the IR regions. The gene order and content of M. bealei are almost unrearranged, which is consistent with the hypothesis that large IRs stabilize the cp genome and reduce gene loss-and-gain probabilities during the evolutionary process. A large IR expansion of over 12 kb has occurred in M. bealei; 15 genes (rps19, rpl22, rps3, rpl16, rpl14, rps8, infA, rpl36, rps11, petD, petB, psbH, psbN, psbT and psbB) have expanded to have an additional copy in the IRs. The IR expansion rearrangement occurred via a double-strand DNA break and subsequent repair, which is different from the ordinary gene conversion mechanism. Repeat analysis identified 39 direct/inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Analysis also revealed 75 simple sequence repeat (SSR) loci and almost all are composed of A or T, contributing to a distinct bias in base composition. Comparison of protein-coding sequences with ESTs reveals 9 putative RNA edits and 5 of them resulted in non-synonymous modifications in rpoC1, rps2, rps19 and ycf1. Phylogenetic analysis using maximum parsimony (MP) and maximum likelihood (ML) was performed on a dataset composed of 65 protein-coding genes from 25 taxa, which yielded a tree topology identical to that of previous plastid-based trees and provided strong support for the sister relationship between Ranunculaceae and Berberidaceae.
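
    The SSR observation above (loci composed almost entirely of A or T) can be illustrated with a minimal Python sketch. It scans a sequence for mononucleotide runs only; the function name, the toy sequence and the 10-repeat threshold are assumptions made for illustration, since the abstract does not state which tool or thresholds the authors used.

```python
import re

def find_mononucleotide_ssrs(seq, min_repeats=10):
    """Return (start, base, length) for every run of a single base
    repeated at least min_repeats times (0-based start positions)."""
    seq = seq.upper()
    # One base followed by at least (min_repeats - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [(m.start(), m.group(1), len(m.group(0))) for m in pattern.finditer(seq)]

# Toy sequence (hypothetical, not from the M. bealei genome).
toy = "GGC" + "A" * 12 + "CGC" + "T" * 10 + "GCGC" + "G" * 5
hits = find_mononucleotide_ssrs(toy)

# Share of detected runs that consist of A or T.
at_fraction = sum(1 for _, base, _ in hits if base in "AT") / len(hits) if hits else 0.0
print(hits, at_fraction)
```

    Di- and trinucleotide repeats would need a broader pattern; the sketch is limited to single-base runs to keep the A/T bias easy to see.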

  19. Complete genome sequences of six strains of the genus Methylobacterium

    Energy Technology Data Exchange (ETDEWEB)

    Marx, Christopher J [Harvard University; Bringel, Francoise O. [University of Strasbourg; Christoserdova, Ludmila [University of Washington, Seattle; Moulin, Lionel [UMR, France; Farhan Ul Haque, Muhammad [CNRS, Strasbourg, France; Fleischman, Darrell E. [Wright State University, Dayton, OH; Gruffaz, Christelle [CNRS, Strasbourg, France; Jourand, Philippe [UMR, France; Knief, Claudia [ETH Zurich, Switzerland; Lee, Ming-Chun [Harvard University; Muller, Emilie E. L. [CNRS, Strasbourg, France; Nadalig, Thierry [CNRS, Strasbourg, France; Peyraud, Remi [ETH Zurich, Switzerland; Roselli, Sandro [CNRS, Strasbourg, France; Russ, Lina [ETH Zurich, Switzerland; Aguero, Fernan [Universidad Nacional de General San Martin; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lajus, Aurelie [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Land, Miriam L [ORNL; Medigue, Claudine [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Stolyar, Sergey [University of Washington; Vorholt, Julia A. [ETH Zurich, Switzerland; Vuilleumier, Stephane [University of Strasbourg

    2012-01-01

    The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

  20. Complete Genome Sequences of Six Strains of the Genus Methylobacterium

    Energy Technology Data Exchange (ETDEWEB)

    Marx, Christopher J [Harvard University; Bringel, Francoise O. [University of Strasbourg; Christoserdova, Ludmila [University of Washington, Seattle; Moulin, Lionel [UMR, France; UI Hague, Muhammad Farhan [University of Strasbourg; Fleischman, Darrell E. [Wright State University, Dayton, OH; Gruffaz, Christelle [CNRS, Strasbourg, France; Jourand, Philippe [UMR, France; Knief, Claudia [ETH Zurich, Switzerland; Lee, Ming-Chun [Harvard University; Muller, Emilie E. L. [CNRS, Strasbourg, France; Nadalig, Thierry [CNRS, Strasbourg, France; Peyraud, Remi [ETH Zurich, Switzerland; Roselli, Sandro [CNRS, Strasbourg, France; Russ, Lina [ETH Zurich, Switzerland; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Ivanov, Pavel S. [University of Wyoming, Laramie; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lajus, Aurelie [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Land, Miriam L [ORNL; Medigue, Claudine [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Stolyar, Sergey [University of Washington; Vorholt, Julia A. [ETH Zurich, Switzerland; Vuilleumier, Stephane [University of Strasbourg

    2012-01-01

    The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

  1. Complete genome sequence of the European sheatfish virus.

    Science.gov (United States)

    Mavian, Carla; López-Bueno, Alberto; Fernández Somalo, María Pilar; Alcamí, Antonio; Alejo, Alí

    2012-06-01

    Viral diseases are an increasing threat to the thriving aquaculture industry worldwide. An emerging group of fish pathogens is formed by several ranaviruses, which have been isolated at different locations from freshwater and seawater fish species since 1985. We report the complete genome sequence of European sheatfish ranavirus (ESV), the first ranavirus isolated in Europe, which causes high mortality rates in infected sheatfish (Silurus glanis) and in other species. Analysis of the genome sequence shows that ESV belongs to the amphibian-like ranaviruses and is closely related to the epizootic hematopoietic necrosis virus (EHNV), a disease agent geographically confined to the Australian continent and notifiable to the World Organization for Animal Health.

  2. The complete mitochondrial genome of Sesarmops sinensis reveals gene rearrangements and phylogenetic relationships in Brachyura.

    Science.gov (United States)

    Tang, Bo-Ping; Xin, Zhao-Zhe; Liu, Yu; Zhang, Dai-Zhen; Wang, Zheng-Fei; Zhang, Hua-Bin; Chai, Xin-Yue; Zhou, Chun-Lin; Liu, Qiu-Ning

    2017-01-01

    The mitochondrial genome (mitogenome) is important for understanding molecular evolution and phylogenetics. In this study, the complete mitogenome of Sesarmops sinensis is reported. The mitogenome was 15,905 bp in size and contained 13 protein-coding genes (PCGs), two ribosomal RNA (rRNA) genes, 22 transfer RNA (tRNA) genes, and a control region (CR). The AT skew and the GC skew were both negative in the mitogenome of S. sinensis. The nucleotide composition of the S. sinensis mitogenome was also biased toward A + T nucleotides (75.7%). All tRNA genes displayed a typical mitochondrial tRNA cloverleaf structure, except for the trnS1 gene, which lacked a dihydrouridine arm. S. sinensis exhibits a novel rearrangement compared with the Pancrustacean ground pattern and other Brachyura species. Based on the 13 PCGs, the phylogenetic analysis showed that S. sinensis and Sesarma neglectum were clustered on one branch with high nodal support values, indicating that S. sinensis and S. neglectum have a sister group relationship. The group (S. sinensis + S. neglectum) was sister to (Parasesarmops tripectinis + Metopaulias depressus), suggesting that S. sinensis belongs to Grapsoidea, Sesarmidae. Phylogenetic trees based on the amino acid and nucleotide sequences of the 13 mitochondrial PCGs, built using BI and ML respectively, clearly indicate that section Eubrachyura consists of four groups. The resulting phylogeny supports the establishment of a separate subsection Potamoida. These four groups correspond to the four subsections Raninoida, Heterotremata, Potamoida, and Thoracotremata.

  3. Complete genome sequence of Brachyspira intermedia reveals unique genomic features in Brachyspira species and phage-mediated horizontal gene transfer

    Science.gov (United States)

    2011-01-01

    Background: Brachyspira spp. colonize the intestines of some mammalian and avian species and show different degrees of enteropathogenicity. Brachyspira intermedia can cause production losses in chickens and strain PWS/AT now becomes the fourth genome to be completed in the genus Brachyspira. Results: 15 classes of unique and shared genes were analyzed in B. intermedia, B. murdochii, B. hyodysenteriae and B. pilosicoli. The largest number of unique genes was found in B. intermedia and B. murdochii. This indicates the presence of larger pan-genomes. In general, hypothetical protein annotations are overrepresented among the unique genes. A 3.2 kb plasmid was found in B. intermedia strain PWS/AT. The plasmid was also present in the B. murdochii strain but not in nine other Brachyspira isolates. Within the Brachyspira genomes, genes had been translocated and also frequently switched between leading and lagging strands, a process that can be followed by different AT-skews in the third positions of synonymous codons. We also found evidence that bacteriophages were being remodeled and genes incorporated into them. Conclusions: The accessory gene pool shapes species-specific traits. It is also influenced by reductive genome evolution and horizontal gene transfer. Gene-transfer events can cross both species and genus boundaries and bacteriophages appear to play an important role in this process. A mechanism for horizontal gene transfer appears to be gene translocations leading to remodeling of bacteriophages in combination with broad tropism. PMID:21816042

  4. Structural organization of glycophorin A and B genes: Glycophorin B gene evolved by homologous recombination at Alu repeat sequences

    International Nuclear Information System (INIS)

    Kudo, Shinichi; Fukuda, Minoru

    1989-01-01

    Glycophorins A (GPA) and B (GPB) are two major sialoglycoproteins of the human erythrocyte membrane. Here the authors present a comparison of the genomic structures of GPA and GPB developed by analyzing DNA clones isolated from a K562 genomic library. Nucleotide sequences of exon-intron junctions and 5' and 3' flanking sequences revealed that the GPA and GPB genes consist of 7 and 5 exons, respectively, and both genes have >95% identical sequence from the 5' flanking region to the region ∼ 1 kilobase downstream from the exon encoding the transmembrane regions. In this homologous part of the genes, GPB lacks one exon due to a point mutation at the 5' splicing site of the third intron, which inactivates the 5' cleavage event of splicing and leads to ligation of the second to the fourth exon. Following these very homologous sequences, the genomic sequences for GPA and GPB diverge significantly and no homology can be detected in their 3' end sequences. The analysis of the Alu sequences and their flanking direct repeat sequences suggest that an ancestral genomic structure has been maintained in the GPA gene, whereas the GPB gene has arisen from the acquisition of 3' sequences different from those of the GPA gene by homologous recombination at the Alu repeats during or after gene duplication

  5. Complete Sequence and Analysis of the Mitochondrial Genome of Hemiselmis andersenii CCMP644 (Cryptophyceae)

    Directory of Open Access Journals (Sweden)

    Bowman Sharen

    2008-05-01

    Background: Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes: a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented. Results: The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of a ~20 Kbp long intergenic region comprised of numerous tandem and dispersed repeat units of between 22–336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features in common with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher

  6. Complete genome sequence of the aerobic, heterotroph Marinithermus hydrothermalis type strain (T1T) from a deep-sea hydrothermal vent chimney

    Energy Technology Data Exchange (ETDEWEB)

    Copeland, A [U.S. Department of Energy, Joint Genome Institute; Gu, Wei [U.S. Department of Energy, Joint Genome Institute; Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Pan, Chongle [ORNL; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Tindall, Brian [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute

    2012-01-01

    Marinithermus hydrothermalis Sako et al. 2003 is the type species of the monotypic genus Marinithermus. M. hydrothermalis T1(T) was the first isolate within the phylum "Thermus-Deinococcus" to exhibit optimal growth under a salinity equivalent to that of sea water and to have an absolute requirement for NaCl for growth. M. hydrothermalis T1(T) is of interest because it may provide a new insight into the ecological significance of the aerobic, thermophilic decomposers in the circulation of organic compounds in deep-sea hydrothermal vent ecosystems. This is the first completed genome sequence of a member of the genus Marinithermus and the seventh sequence from the family Thermaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,269,167 bp long genome with its 2,251 protein-coding and 59 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  7. Complete genome sequence of the aerobic, heterotroph Marinithermus hydrothermalis type strain (T1(T)) from a deep-sea hydrothermal vent chimney.

    Science.gov (United States)

    Copeland, Alex; Gu, Wei; Yasawong, Montri; Lapidus, Alla; Lucas, Susan; Deshpande, Shweta; Pagani, Ioanna; Tapia, Roxanne; Cheng, Jan-Fang; Goodwin, Lynne A; Pitluck, Sam; Liolios, Konstantinos; Ivanova, Natalia; Mavromatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Pan, Chongle; Brambilla, Evelyne-Marie; Rohde, Manfred; Tindall, Brian J; Sikorski, Johannes; Göker, Markus; Detter, John C; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Woyke, Tanja

    2012-03-19

    Marinithermus hydrothermalis Sako et al. 2003 is the type species of the monotypic genus Marinithermus. M. hydrothermalis T1(T) was the first isolate within the phylum "Thermus-Deinococcus" to exhibit optimal growth under a salinity equivalent to that of sea water and to have an absolute requirement for NaCl for growth. M. hydrothermalis T1(T) is of interest because it may provide a new insight into the ecological significance of the aerobic, thermophilic decomposers in the circulation of organic compounds in deep-sea hydrothermal vent ecosystems. This is the first completed genome sequence of a member of the genus Marinithermus and the seventh sequence from the family Thermaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,269,167 bp long genome with its 2,251 protein-coding and 59 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  8. The complete mitochondrial genome sequence of the world's largest fish, the whale shark (Rhincodon typus), and its comparison with those of related shark species.

    Science.gov (United States)

    Alam, Md Tauqeer; Petit, Robert A; Read, Timothy D; Dove, Alistair D M

    2014-04-10

    The whale shark (Rhincodon typus) is the largest extant species of fish, belonging to the order Orectolobiformes. It is listed as a "vulnerable" species on the International Union for Conservation of Nature (IUCN)'s Red List of Threatened Species, which makes it an important species for conservation efforts. We report here the first complete sequence of the mitochondrial genome (mitogenome) of the whale shark obtained by next-generation sequencing methods. The assembled mitogenome is a 16,875 bp circle, comprising 13 protein-coding genes, two rRNA genes, 22 tRNA genes and a control region. We also performed a comparative analysis of the whale shark mitogenome against the available mitogenome sequences of 17 other shark species, four from the order Orectolobiformes, five from Lamniformes and eight from Carcharhiniformes. The nucleotide composition, number and arrangement of the genes in the whale shark mitogenome are the same as found in the mitogenomes of the other members of the order Orectolobiformes and its closest orders Lamniformes and Carcharhiniformes, although the whale shark mitogenome had a slightly longer control region. The availability of the mitogenome sequence of the whale shark will aid studies of molecular systematics, biogeography, genetic differentiation, and conservation genetics in this species. Copyright © 2014 Elsevier B.V. All rights reserved.

  9. Complete Genome Sequence of Zucchini Yellow Mosaic Virus Strain Kurdistan, Iran.

    Science.gov (United States)

    Maghamnia, Hamid Reza; Hajizadeh, Mohammad; Azizi, Abdolbaset

    2018-03-01

    The complete genome sequence of Zucchini yellow mosaic virus strain Kurdistan (ZYMV-Kurdistan) infecting squash from Iran was determined from 13 overlapping fragments. Excluding the poly(A) tail, the ZYMV-Kurdistan genome consisted of 9593 nucleotides (nt), with 138 and 211 nt at the 5' and 3' non-translated regions, respectively. It contained two open-reading frames (ORFs), the large ORF encoding a polyprotein of 3080 amino acids (aa) and the small overlapping ORF encoding a P3N-PIPO protein of 74 aa. This isolate had six unique aa differences compared to other ZYMV isolates and shared 79.6-98.8% identities with other ZYMV genome sequences at the nt level and 90.1-99% identities at the aa level. A phylogenetic tree of ZYMV complete genomic sequences showed that Iranian and Central European isolates are closely related and form a phylogenetically homogenous group. All values of the ratio of substitution rates at non-synonymous and synonymous sites (dN/dS) were below 1, suggestive of strong negative selection forces during ZYMV protein history. This is the first report of complete genome sequence information of the most prevalent virus in the west of Iran. This study helps our understanding of the genetic diversity of ZYMV isolates infecting cucurbit plants in Iran, virus evolution and epidemiology and can assist in designing better diagnostic tools.
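
    The percent-identity figures quoted above are the kind of value obtained by comparing two aligned sequences column by column. The Python sketch below shows one common convention (matches divided by the number of columns where neither sequence has a gap); the abstract does not say which software or convention the authors used, so the function name, the gap handling and the toy sequences are assumptions for illustration only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns under this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Toy alignment (hypothetical): 9 ungapped columns, 8 of them identical.
identity = percent_identity("ATGGCC-TTA", "ATGACCGTTA")
print(round(identity, 1))
```

    Other conventions count gapped columns as mismatches, which lowers the value, so identities from different tools are only comparable when the convention is the same.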

  10. Cloning and sequence analysis of hyaluronoglucosaminidase (nagH) gene of Clostridium chauvoei

    Directory of Open Access Journals (Sweden)

    Saroj K. Dangi

    2017-09-01

    Aim: Blackleg disease is caused by Clostridium chauvoei in ruminants. Although virulence factors such as C. chauvoei toxin A, sialidase, and flagellin are well characterized, hyaluronidases of C. chauvoei are not characterized. The present study was aimed at cloning and sequence analysis of the hyaluronoglucosaminidase (nagH) gene of C. chauvoei. Materials and Methods: C. chauvoei strain ATCC 10092 was grown in ATCC 2107 media and confirmed by polymerase chain reaction (PCR) using primers specific for the 16-23S rDNA spacer region. The nagH gene of C. chauvoei was amplified and cloned into pRham-SUMO vector and transformed into Escherichia cloni 10G cells. The construct was then transformed into E. cloni cells. Colony PCR was carried out to screen the colonies, followed by sequencing of the nagH gene in the construct. Results: PCR amplification yielded a 1143 bp nagH gene product, which was cloned in a prokaryotic expression system. Colony PCR, as well as sequencing of the nagH gene, confirmed the presence of the insert. The sequence was then subjected to NCBI BLAST analysis, which confirmed that the sequence was indeed that of the nagH gene of C. chauvoei. Phylogenetic analysis of the sequence showed that it is closely related to Clostridium perfringens and Clostridium paraputrificum. Conclusion: The gene for virulence factor nagH was cloned into a prokaryotic expression vector and confirmed by sequencing.

  11. Complete Genome Sequence of the Probiotic Strain Lactobacillus salivarius LPM01.

    Science.gov (United States)

    Chenoll, Empar; Codoñer, Francisco M; Martinez-Blanch, Juan F; Acevedo-Piérart, Marcelo; Ormeño, M Loreto; Ramón, Daniel; Genovés, Salvador

    2016-11-23

    Lactobacillus salivarius LPM01 (DSM 22150) is a probiotic strain able to improve health status in immunocompromised people. Here, we report its complete genome sequence deciphered by PacBio single-molecule real-time (SMRT) technology. Analysis of the sequence may provide insights into its functional activity and safety assessment. Copyright © 2016 Chenoll et al.

  12. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    Energy Technology Data Exchange (ETDEWEB)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  13. Expression map of a complete set of gustatory receptor genes in chemosensory organs of Bombyx mori.

    Science.gov (United States)

    Guo, Huizhen; Cheng, Tingcai; Chen, Zhiwei; Jiang, Liang; Guo, Youbing; Liu, Jianqiu; Li, Shenglong; Taniai, Kiyoko; Asaoka, Kiyoshi; Kadono-Okuda, Keiko; Arunkumar, Kallare P; Wu, Jiaqi; Kishino, Hirohisa; Zhang, Huijie; Seth, Rakesh K; Gopinathan, Karumathil P; Montagné, Nicolas; Jacquin-Joly, Emmanuelle; Goldsmith, Marian R; Xia, Qingyou; Mita, Kazuei

    2017-03-01

    Most lepidopteran species are herbivores, and interaction with host plants affects their gene expression and behavior as well as their genome evolution. Gustatory receptors (Grs) are expected to mediate host plant selection, feeding, oviposition and courtship behavior. However, due to their high diversity, sequence divergence and extremely low level of expression it has been difficult to identify precisely a complete set of Grs in Lepidoptera. By manual annotation and BAC sequencing, we improved annotation of 43 gene sequences compared with previously reported Grs in the most studied lepidopteran model, the silkworm, Bombyx mori, and identified 7 new tandem copies of BmGr30 on chromosome 7, bringing the total number of BmGrs to 76. Among these, we mapped 68 genes to chromosomes in a newly constructed chromosome distribution map and 8 genes to scaffolds; we also found new evidence for large clusters of BmGrs, especially from the bitter receptor family. RNA-seq analysis of diverse BmGr expression patterns in chemosensory organs of larvae and adults enabled us to draw a precise organ specific map of BmGr expression. Interestingly, most of the clustered genes were expressed in the same tissues and more than half of the genes were expressed in larval maxillae, larval thoracic legs and adult legs. For example, BmGr63 showed high expression levels in all organs in both larval and adult stages. By contrast, some genes showed expression limited to specific developmental stages or organs and tissues. BmGr19 was highly expressed in larval chemosensory organs (especially antennae and thoracic legs), the single exon genes BmGr53 and BmGr67 were expressed exclusively in larval tissues, the BmGr27-BmGr31 gene cluster on chr7 displayed a high expression level limited to adult legs and the candidate CO2 receptor BmGr2 was highly expressed in adult antennae, where few other Grs were expressed. Transcriptional analysis of the Grs in B. mori provides a valuable new reference for

  14. Complete nucleotide sequence of the multidrug resistance IncA/C plasmid pR55 from Klebsiella pneumoniae isolated in 1969.

    Science.gov (United States)

    Doublet, Benoît; Boyd, David; Douard, Gregory; Praud, Karine; Cloeckaert, Axel; Mulvey, Michael R

    2012-10-01

    To determine the complete nucleotide sequence of the multidrug resistance IncA/C plasmid pR55 from a clinical Klebsiella pneumoniae strain that was isolated from a urinary tract infection in 1969 in a French hospital and compare it with those of contemporary emerging IncA/C plasmids. The plasmid was purified and sequenced using a 454 sequencing approach. After draft assembly, additional PCRs and walking reads were performed for gap closure. Sequence comparisons and multiple alignments with other IncA/C plasmids were done using the BLAST algorithm and CLUSTAL W, respectively. Plasmid pR55 (170 810 bp) revealed a shared plasmid backbone (>99% nucleotide identity) with current members of the IncA/C(2) multidrug resistance plasmid family that are widely disseminating antibiotic resistance genes. Nevertheless, two specific multidrug resistance gene arrays probably acquired from other genetic elements were identified inserted at conserved hotspot insertion sites in the IncA/C backbone. A novel transposon named Tn6187 showed an atypical mixed transposon configuration composed of two mercury resistance operons and two transposition modules that are related to Tn21 and Tn1696, respectively, and an In0-type integron. IncA/C(2) multidrug resistance plasmids have a broad host range and have been implicated in the dissemination of antibiotic resistance among Enterobacteriaceae from humans and animals. This typical IncA/C(2) genetic scaffold appears to carry various multidrug resistance gene arrays and is now also a successful vehicle for spreading AmpC-like cephalosporinase and metallo-β-lactamase genes, such as bla(CMY) and bla(NDM), respectively.

  15. PCR-Internal Transcribed Spacer (ITS) genes sequencing and ...

    African Journals Online (AJOL)

    Methods: DNA extraction, purification, amplification and sequencing of Internal Transcribed Spacer (ITS) genes were performed using ... Keywords: Internal transcribed spacer genes, phylogenetic, genetic relationship, clinical and environmental fungi, HIV-TB. ... Nigeria. An Ethical clearance was obtained from the Eth-.

  16. Complete mitochondrial genome of the Loligo opalescence.

    Science.gov (United States)

    Jiang, Lihua; Liu, Wei; Zhu, Aiyi; Zhang, Jianshe; Wu, Changwen

    2016-09-01

    In this study, we determined the complete mitochondrial genome of the Loligo opalescence. The genome was 17,370 bp in length and contained 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes and 3 main non-coding regions. The composition and order of genes were similar to those of most other invertebrates. The overall base composition of L. opalescence is A 38.62%, C 19.40%, T 32.37% and G 9.61%, with a high A + T bias of 70.99%. All three control regions (CR) contain termination-associated sequences and conserved sequence blocks. This mitogenome sequence data would play an important role in the investigation of phylogenetic relationship, taxonomic resolution and phylogeography of the Loliginidae.
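
    The base composition and A + T bias quoted above are simple per-base percentages. The following is a minimal sketch of how such figures could be computed from a sequence string; the function names are hypothetical and not taken from the study, and ambiguous bases are simply ignored as an assumption.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T among the unambiguous bases."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(seq):
    """Return the combined A + T percentage of a sequence."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]


# Consistency check on the percentages reported in the abstract.
reported = {"A": 38.62, "C": 19.40, "T": 32.37, "G": 9.61}
reported_at = round(reported["A"] + reported["T"], 2)
reported_total = round(sum(reported.values()), 2)
```

    With the reported values, A + T adds up to the stated 70.99% and the four percentages sum to 100%.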

  17. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

    Directory of Open Access Journals (Sweden)

    Manal Helal

    Full Text Available BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra
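
    Two ideas in this abstract can be illustrated on a toy distance matrix: picking the cluster member whose summed distance to the other members is smallest (the 'centroid' described above, often called a medoid), and labelling a query by a majority vote of its k nearest references. This is a generic sketch, not the authors' LM algorithm; the function names, the toy distances and the species labels are all invented for illustration.

```python
from collections import Counter


def medoid(indices, dist):
    """Return the member of `indices` with the smallest summed distance to the rest."""
    return min(indices, key=lambda i: sum(dist[i][j] for j in indices))


def knn_label(query_dists, labels, k=3):
    """Return the majority label among the k references nearest to a query."""
    nearest = sorted(range(len(labels)), key=lambda i: query_dists[i])[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy symmetric distance matrix for four sequences (values are made up).
dist = [
    [0.0, 0.1, 0.2, 0.9],
    [0.1, 0.0, 0.1, 0.8],
    [0.2, 0.1, 0.0, 0.9],
    [0.9, 0.8, 0.9, 0.0],
]
representative = medoid([0, 1, 2], dist)

# Distances from an unlabelled query to the four references.
predicted = knn_label([0.05, 0.1, 0.2, 0.9], ["sp_A", "sp_A", "sp_B", "sp_B"], k=3)
```

    Here sequence 1 is the representative of the cluster {0, 1, 2}, and the query is assigned to the hypothetical species "sp_A".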

  18. Complete genome sequence of the myxobacterium Sorangium cellulosum

    DEFF Research Database (Denmark)

    Schneiker, S; Perlova, O; Kaiser, O

    2007-01-01

    The genus Sorangium synthesizes approximately half of the secondary metabolites isolated from myxobacteria, including the anti-cancer metabolite epothilone. We report the complete genome sequence of the model Sorangium strain S. cellulosum Soce56, which produces several natural products and has...... morphological and physiological properties typical of the genus. The circular genome, comprising 13,033,779 base pairs, is the largest bacterial genome sequenced to date. No global synteny with the genome of Myxococcus xanthus is apparent, revealing an unanticipated level of divergence between...... these myxobacteria. A large percentage of the genome is devoted to regulation, particularly post-translational phosphorylation, which probably supports the strain's complex, social lifestyle. This regulatory network includes the highest number of eukaryotic protein kinase-like kinases discovered in any organism...

  19. Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids.

    Science.gov (United States)

    Jansen, Robert K; Kaittanis, Charalambos; Saski, Christopher; Lee, Seung-Bum; Tomkins, Jeffrey; Alverson, Andrew J; Daniell, Henry

    2006-04-09

    The Vitaceae (grape) is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differs between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade. However, maximum likelihood analyses place
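
    The region sizes quoted above can be cross-checked, because a quadripartite chloroplast genome consists of one large single copy region, one small single copy region and two copies of the inverted repeat. A minimal sketch of that arithmetic follows; the function name is hypothetical.

```python
def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) reported for the Vitis vinifera chloroplast genome.
total = plastome_length(lsc=89_147, ssc=19_065, ir=26_358)
```

    The sum reproduces the reported genome length of 160,928 bp.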

  20. Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids

    Directory of Open Access Journals (Sweden)

    Alverson Andrew J

    2006-04-01

    Full Text Available Abstract Background The Vitaceae (grape) is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. Results The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differs between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade.

  1. Citrus plastid-related gene profiling based on expressed sequence tag analyses

    Directory of Open Access Journals (Sweden)

    Tercilio Calsa Jr.

    2007-01-01

    Full Text Available Plastid-related sequences, derived from putative nuclear or plastome genes, were searched in a large collection of expressed sequence tags (ESTs) and genomic sequences from the Citrus Biotechnology initiative in Brazil. The identified putative Citrus chloroplast gene sequences were compared to those from Arabidopsis, Eucalyptus and Pinus. Differential expression profiling for plastid-directed nuclear-encoded proteins and photosynthesis-related gene expression variation between Citrus sinensis and Citrus reticulata, when inoculated or not with Xylella fastidiosa, were also analyzed. Presumed Citrus plastome regions were more similar to Eucalyptus. Some putative genes appeared to be preferentially expressed in vegetative tissues (leaves and bark) or in reproductive organs (flowers and fruits). Genes preferentially expressed in fruit and flower may be associated with hypothetical physiological functions. Expression pattern clustering analysis suggested that photosynthesis- and carbon fixation-related genes appeared to be up- or down-regulated in a resistant or susceptible Citrus species after Xylella inoculation in comparison to non-infected controls, generating novel information which may be helpful to develop novel genetic manipulation strategies to control Citrus variegated chlorosis (CVC).

  2. Complete gene expression profiling of Saccharopolyspora erythraea using GeneChip DNA microarrays

    Directory of Open Access Journals (Sweden)

    Bordoni Roberta

    2007-11-01

    Full Text Available Abstract Background The Saccharopolyspora erythraea genome sequence, recently published, presents considerable divergence from those of streptomycetes in gene organization and function, confirming the remarkable potential of S. erythraea for producing many other secondary metabolites in addition to erythromycin. In order to investigate, at whole transcriptome level, how S. erythraea genes are modulated, a DNA microarray was specifically designed and constructed on the S. erythraea strain NRRL 2338 genome sequence, and the expression profiles of 6494 ORFs were monitored during growth in complex liquid medium. Results The transcriptional analysis identified a set of 404 genes, whose transcriptional signals vary during growth and characterize three distinct phases: a rapid growth until 32 h (Phase A); a growth slowdown until 52 h (Phase B); and another rapid growth phase from 56 h to 72 h (Phase C) before the cells enter the stationary phase. A non-parametric statistical method, that identifies chromosomal regions with transcriptional imbalances, determined regional organization of transcription along the chromosome, highlighting differences between core and non-core regions, and strand specific patterns of expression. Microarray data were used to characterize the temporal behaviour of major functional classes and of all the gene clusters for secondary metabolism. The results confirmed that the ery cluster is up-regulated during Phase A and identified six additional clusters (for terpenes and non-ribosomal peptides) that are clearly regulated in later phases. Conclusion The use of a S. erythraea DNA microarray improved specificity and sensitivity of gene expression analysis, allowing a global and at the same time detailed picture of how S. erythraea genes are modulated. This work underlines the importance of using DNA microarrays, coupled with an exhaustive statistical and bioinformatic analysis of the results, to understand the transcriptional

  3. Speeding disease gene discovery by sequence based candidate prioritization

    Directory of Open Access Journals (Sweden)

    Porteous David J

    2005-03-01

    Full Text Available Abstract Background Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing positional candidate genes for mutation detection and case-control association studies.
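
    The abstract reports performance as fold enrichment of disease genes in prioritized lists. One common way to express this, assumed here because the abstract does not define it, is the fraction of known disease genes in the top of the ranked list divided by their fraction in the whole list. The function name and the gene identifiers below are hypothetical, and this is not PROSPECTR code.

```python
def fold_enrichment(ranked_genes, disease_genes, top_n):
    """Fold enrichment of known disease genes in the top_n ranked candidates."""
    if not 0 < top_n <= len(ranked_genes):
        raise ValueError("top_n must be between 1 and the list length")
    disease = set(disease_genes)
    background = sum(g in disease for g in ranked_genes) / len(ranked_genes)
    if background == 0:
        raise ValueError("no known disease genes in the candidate list")
    top_rate = sum(g in disease for g in ranked_genes[:top_n]) / top_n
    return top_rate / background


# Toy ranking of ten positional candidates, two of which are known disease genes.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]
enrichment = fold_enrichment(ranked, {"g1", "g7"}, top_n=2)
```

    In this toy case one of the two top-ranked genes is a disease gene (50%) against a background of 20%, giving a 2.5-fold enrichment.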

  4. Isolation of Hox cluster genes from insects reveals an accelerated sequence evolution rate.

    Directory of Open Access Journals (Sweden)

    Heike Hadrys

    Full Text Available Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda) that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution.

  5. Human thyroid peroxidase: complete cDNA and protein sequence, chromosome mapping, and identification of two alternately spliced mRNAs

    International Nuclear Information System (INIS)

    Kimura, S.; Kotani, T.; McBride, O.W.; Umeki, K.; Hirai, K.; Nakayama, T.; Ohtaki, S.

    1987-01-01

    Two forms of human thyroid peroxidase cDNAs were isolated from a λgt11 cDNA library, prepared from Graves disease thyroid tissue mRNA, by use of oligonucleotides. The longest complete cDNA, designated phTPO-1, has 3048 nucleotides and an open reading frame consisting of 933 amino acids, which would encode a protein with a molecular weight of 103,026. Five potential asparagine-linked glycosylation sites are found in the deduced amino acid sequence. The second peroxidase cDNA, designated phTPO-2, is almost identical to phTPO-1 beginning 605 base pairs downstream except that it contains a 1-base-pair difference and lacks 171 base pairs in the middle of the sequence. This results in a loss of 57 amino acids corresponding to a molecular weight of 6282. Interestingly, this 171-nucleotide sequence has GT and AG at its 5' and 3' boundaries, respectively, that are in good agreement with donor and acceptor splice site consensus sequences. Using specific oligonucleotide probes for the mRNAs derived from the cDNA sequences hTPO-1 and hTPO-2, the authors show that both are expressed in all thyroid tissues examined and the relative level of two mRNAs is different in each sample. The results suggest that two thyroid peroxidase proteins might be generated through alternate splicing of the same gene. By using somatic cell hybrid lines, the thyroid peroxidase gene was mapped to the short arm of human chromosome 2.
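
    The numbers in this abstract fit together arithmetically: a 171-nucleotide segment is a whole number of codons, so its removal deletes 57 residues without shifting the reading frame, and a segment beginning with GT and ending with AG matches the minimal donor and acceptor dinucleotides. The sketch below checks both points; the helper names and the example segment are hypothetical and are not the real TPO sequence.

```python
def is_in_frame(deleted_nt):
    """True if removing this many nucleotides keeps the reading frame."""
    return deleted_nt % 3 == 0


def has_gt_ag_boundaries(segment):
    """True if a segment starts with GT and ends with AG (RNA input accepted)."""
    s = segment.upper().replace("U", "T")
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


# Figures taken from the abstract.
lost_residues = 171 // 3            # amino acids removed by the 171-bp deletion
mean_residue_mass = 6282 / 57       # average mass per lost residue, in Da

# Invented example segment, used only to exercise the boundary check.
example_ok = has_gt_ag_boundaries("GTAAGTCCAG")
```

    The deletion is in frame, removes 57 residues, and the implied average of about 110 Da per residue is in line with the typical mean residue mass of a protein.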

  6. Discovery of candidate disease genes in ENU-induced mouse mutants by large-scale sequencing, including a splice-site mutation in nucleoredoxin.

    Directory of Open Access Journals (Sweden)

    Melissa K Boles

    2009-12-01

    Full Text Available An accurate and precisely annotated genome assembly is a fundamental requirement for functional genomic analysis. Here, the complete DNA sequence and gene annotation of mouse Chromosome 11 was used to test the efficacy of large-scale sequencing for mutation identification. We re-sequenced the 14,000 annotated exons and boundaries from over 900 genes in 41 recessive mutant mouse lines that were isolated in an N-ethyl-N-nitrosourea (ENU) mutation screen targeted to mouse Chromosome 11. Fifty-nine sequence variants were identified in 55 genes from 31 mutant lines. 39% of the lesions lie in coding sequences and create primarily missense mutations. The other 61% lie in noncoding regions, many of them in highly conserved sequences. A lesion in the perinatal lethal line l11Jus13 alters a consensus splice site of nucleoredoxin (Nxn), inserting 10 amino acids into the resulting protein. We conclude that point mutations can be accurately and sensitively recovered by large-scale sequencing, and that conserved noncoding regions should be included for disease mutation identification. Only seven of the candidate genes we report have been previously targeted by mutation in mice or rats, showing that despite ongoing efforts to functionally annotate genes in the mammalian genome, an enormous gap remains between phenotype and function. Our data show that the classical positional mapping approach of disease mutation identification can be extended to large target regions using high-throughput sequencing.

  7. Complete Genome Sequence and Immunoproteomic Analyses of the Bacterial Fish Pathogen Streptococcus parauberis

    Science.gov (United States)

    Nho, Seong Won; Hikima, Jun-ichi; Cha, In Seok; Park, Seong Bin; Jang, Ho Bin; del Castillo, Carmelo S.; Kondo, Hidehiro; Hirono, Ikuo; Aoki, Takashi; Jung, Tae Sung

    2011-01-01

    Although Streptococcus parauberis is known as a bacterial pathogen associated with bovine udder mastitis, it has recently become one of the major causative agents of olive flounder (Paralichthys olivaceus) streptococcosis in northeast Asia, causing massive mortality resulting in severe economic losses. S. parauberis contains two serotypes, and it is likely that capsular polysaccharide antigens serve to differentiate the serotypes. In the present study, the complete genome sequence of S. parauberis (serotype I) was determined using the GS-FLX system to investigate its phylogeny, virulence factors, and antigenic proteins. S. parauberis possesses a single chromosome of 2,143,887 bp containing 1,868 predicted coding sequences (CDSs), with an average GC content of 35.6%. Whole-genome dot plot analysis and phylogenetic analysis of a 60-kDa chaperonin-encoding gene and the glyceraldehyde-3-phosphate dehydrogenase (GAPDH)-encoding gene showed that the strain was evolutionarily closely related to Streptococcus uberis. S. parauberis antigenic proteins were analyzed using an immunoproteomic technique. Twenty-one antigenic protein spots were identified in S. parauberis, by reaction with an antiserum obtained from S. parauberis-challenged olive flounder. This work provides the foundation needed to understand more clearly the relationship between pathogen and host and develops new approaches toward prophylactic and therapeutic strategies to deal with streptococcosis in fish. The work also provides a better understanding of the physiology and evolution of a significant representative of the Streptococcaceae. PMID:21531805

  8. Computational sequence analysis of predicted long dsRNA transcriptomes of major crops reveals sequence complementarity with human genes.

    Science.gov (United States)

    Jensen, Peter D; Zhang, Yuanji; Wiggins, B Elizabeth; Petrick, Jay S; Zhu, Jin; Kerstetter, Randall A; Heck, Gregory R; Ivashuta, Sergey I

    2013-01-01

    Long double-stranded RNAs (long dsRNAs) are precursors for the effector molecules of sequence-specific RNA-based gene silencing in eukaryotes. Plant cells can contain numerous endogenous long dsRNAs. This study demonstrates that such endogenous long dsRNAs in plants have sequence complementarity to human genes. Many of these complementary long dsRNAs have perfect sequence complementarity of at least 21 nucleotides to human genes, enough complementarity to potentially trigger gene silencing in targeted human cells if delivered in functional form. However, the number and diversity of long dsRNA molecules with complementarity to human genes in plant tissue from crops that have a long history of safe consumption, such as lettuce, tomato, corn, soy and rice, support a conclusion that long dsRNAs do not present a significant dietary risk.
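
    Perfect complementarity over at least 21 nucleotides can be searched for by reverse-complementing one strand and looking for shared 21-mers with the target. The following is a minimal sketch of that idea using a k-mer index; it is not the pipeline used in the study, and the function names and test sequences are invented for illustration.

```python
# Complement table; U is treated as T so RNA and DNA inputs both work.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq):
    """Return the reverse complement of a sequence in the DNA alphabet."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def complementary_sites(strand, target, k=21):
    """List (i, j) pairs where k bases of `strand` pair perfectly with `target`.

    i is the offset in the reverse-complemented strand, j the offset in target.
    """
    target = target.upper().replace("U", "T")
    probe = reverse_complement(strand)  # the sequence the target must contain
    index = {}
    for j in range(len(target) - k + 1):
        index.setdefault(target[j:j + k], []).append(j)
    hits = []
    for i in range(len(probe) - k + 1):
        for j in index.get(probe[i:i + k], []):
            hits.append((i, j))
    return hits


# Invented 21-nt site embedded in a short target.
site = "ATGGCTAGCTTAGGCATCGAT"
target = "CC" + site + "GG"
hits = complementary_sites(reverse_complement(site), target)
```

    The index makes the search linear in the combined sequence length, which matters when whole transcriptomes are compared; a single mismatch within the 21-nt window yields no hit.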

  9. The Complete Chloroplast Genome Sequences of Five Epimedium Species: Lights into Phylogenetic and Taxonomic Analyses

    Science.gov (United States)

    Zhang, Yanjun; Du, Liuwen; Liu, Ao; Chen, Jianjun; Wu, Li; Hu, Weiming; Zhang, Wei; Kim, Kyunghee; Lee, Sang-Choon; Yang, Tae-Jin; Wang, Ying

    2016-01-01

    Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. Here we sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology and a combination of de novo and reference-guided assembly. Combined with the previously reported cp genome sequence of E. koreanum, this is the first comprehensive cp genome analysis of Epimedium. The five Epimedium cp genomes exhibited a typical quadripartite, circular structure that was rather conserved in genomic structure and gene order. However, these cp genomes presented obvious variations at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR) region and the single-copy (SC) boundary regions. A trnQ-UUG duplication occurred in all five Epimedium cp genomes but was not found in the other basal eudicotyledons. Rapidly evolving regions were detected among the five cp genomes, and differences in simple sequence repeats (SSRs) and repeat sequences were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes were on the whole in accordance with the updated system of the genus, but indicated that the evolutionary relationships and the divisions of the genus need further investigation with more evidence. The availability of these cp genomes provides valuable genetic information for accurate species identification, taxonomy, and phylogenetic resolution and evolution of Epimedium, and will assist in the exploration and utilization of Epimedium plants. PMID:27014326

  10. The complete chloroplast genome sequences of five Epimedium species: lights into phylogenetic and taxonomic analyses

    Directory of Open Access Journals (Sweden)

    Yanjun Zhang

    2016-03-01

    Full Text Available Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. Here we sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology and a combination of de novo and reference-guided assembly. Combined with the previously reported cp genome sequence of E. koreanum, this is the first comprehensive cp genome analysis of Epimedium. The five Epimedium cp genomes exhibited a typical quadripartite, circular structure that was rather conserved in genomic structure and gene order. However, these cp genomes presented obvious variations at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR) region and the single-copy (SC) boundary regions. A trnQ-UUG duplication occurred in all five Epimedium cp genomes but was not found in the other basal eudicotyledons. Rapidly evolving regions were detected among the five cp genomes, and differences in simple sequence repeats (SSRs) and repeat sequences were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes were on the whole in accordance with the updated system of the genus, but indicated that the evolutionary relationships and the divisions of the genus need further investigation with more evidence. The availability of these cp genomes provides valuable genetic information for accurate species identification, taxonomy, and phylogenetic resolution and evolution of Epimedium, and will assist in the exploration and utilization of Epimedium plants.

  11. Biological characterization and complete nucleotide sequence of a Tunisian isolate of Moroccan watermelon mosaic virus.

    Science.gov (United States)

    Yakoubi, S; Desbiez, C; Fakhfakh, H; Wipf-Scheibel, C; Marrakchi, M; Lecoq, H

    2008-01-01

    During a survey conducted in October 2005, cucurbit leaf samples showing virus-like symptoms were collected from the major cucurbit-growing areas in Tunisia. DAS-ELISA showed the presence of Moroccan watermelon mosaic virus (MWMV, Potyvirus), detected for the first time in Tunisia, in samples from the region of Cap Bon (Northern Tunisia). MWMV isolate TN05-76 (MWMV-Tn) was characterized biologically and its full-length genome sequence was established. MWMV-Tn was found to have biological properties similar to those reported for the MWMV type strain from Morocco. Phylogenetic analysis including the comparison of complete amino-acid sequences of 42 potyviruses confirmed that MWMV-Tn is related (65% amino-acid sequence identity) to Papaya ringspot virus (PRSV) isolates but is a member of a distinct virus species. Sequence analysis on parts of the CP gene of MWMV isolates from different geographical origins revealed some geographic structure of MWMV variability, with three different clusters: one cluster including isolates from the Mediterranean region, a second including isolates from western and central Africa, and a third one including isolates from the southern part of Africa. A significant correlation was observed between geographic and genetic distances between isolates. Isolates from countries in the Mediterranean region where MWMV has recently emerged (France, Spain, Portugal) have highly conserved sequences, suggesting that they may have a common and recent origin. MWMV from Sudan, a highly divergent variant, may be considered an evolutionary intermediate between MWMV and PRSV.

  12. First Complete Genome Sequence of Pepper vein yellows virus from Australia

    Science.gov (United States)

    Maina, Solomon; Edwards, Owain R.

    2016-01-01

    We present here the first complete genomic RNA sequence of the polerovirus Pepper vein yellows virus (PeVYV) obtained from a pepper plant in Australia. We compare it with complete PeVYV genomes from Japan and China. The Australian genome was more closely related to the Japanese than the Chinese genome. PMID:27231375

  13. Comparative analysis of the complete genome sequence of the California MSW strain of myxoma virus reveals potential host adaptations.

    Science.gov (United States)

    Kerr, Peter J; Rogers, Matthew B; Fitch, Adam; Depasse, Jay V; Cattadori, Isabella M; Hudson, Peter J; Tscharke, David C; Holmes, Edward C; Ghedin, Elodie

    2013-11-01

    Myxomatosis is a rapidly lethal disease of European rabbits that is caused by myxoma virus (MYXV). The introduction of a South American strain of MYXV into the European rabbit population of Australia is the classic case of host-pathogen coevolution following cross-species transmission. The most virulent strains of MYXV for European rabbits are the Californian viruses, found in the Pacific states of the United States and the Baja Peninsula, Mexico. The natural host of Californian MYXV is the brush rabbit, Sylvilagus bachmani. We determined the complete sequence of the MSW strain of Californian MYXV and performed a comparative analysis with other MYXV genomes. The MSW genome is larger than that of the South American Lausanne (type) strain of MYXV due to an expansion of the terminal inverted repeats (TIRs) of the genome, with duplication of the M156R, M154L, M153R, M152R, and M151R genes and part of the M150R gene from the right-hand (RH) end of the genome at the left-hand (LH) TIR. Despite the extreme virulence of MSW, no novel genes were identified; five genes were disrupted by multiple indels or mutations to the ATG start codon, including two genes, M008.1L/R and M152R, with major virulence functions in European rabbits, and a sixth gene, M000.5L/R, was absent. The loss of these gene functions suggests that S. bachmani is a relatively recent host for MYXV and that duplication of virulence genes in the TIRs, gene loss, or sequence variation in other genes can compensate for the loss of M008.1L/R and M152R in infections of European rabbits.

  14. A sweetpotato gene index established by de novo assembly of pyrosequencing and Sanger sequences and mining for gene-based microsatellite markers

    Directory of Open Access Journals (Sweden)

    Solis Julio

    2010-10-01

    Full Text Available Abstract Background Sweetpotato (Ipomoea batatas (L.) Lam.), a hexaploid outcrossing crop, is an important staple and food security crop in developing countries in Africa and Asia. The availability of genomic resources for sweetpotato is in striking contrast to its importance for human nutrition. Previously existing sequence data were restricted to around 22,000 expressed sequence tag (EST) sequences and ~1,500 GenBank sequences. We have used 454 pyrosequencing to augment the available gene sequence information to enhance functional genomics and marker design for this plant species. Results Two quarter 454 pyrosequencing runs used two normalized cDNA collections from stems and leaves from drought-stressed sweetpotato clone Tanzania and yielded 524,209 reads, which were assembled together with 22,094 publicly available expressed sequence tags into 31,685 sets of overlapping DNA segments and 34,733 unassembled sequences. Blastx comparisons with the UniRef100 database allowed annotation of 23,957 contigs and 15,342 singletons resulting in 24,657 putatively unique genes. Further, 27,119 sequences had no match to protein sequences of the UniRef100 database. On the basis of this gene index, we have identified 1,661 gene-based microsatellite sequences, of which 223 were selected for testing and 195 were successfully amplified in a test panel of 6 hexaploid (I. batatas) and 2 diploid (I. trifida) accessions. Conclusions The sweetpotato gene index is a useful source for functionally annotated sweetpotato gene sequences that contains three times more gene sequence information for sweetpotato than previous EST assemblies. A searchable version of the gene index, including a blastn function, is available at http://www.cipotato.org/sweetpotato_gene_index.
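
    Mining a gene index for microsatellites amounts to scanning each sequence for short motifs repeated in tandem. The sketch below does this with a regular expression; the thresholds (motifs of 2 to 6 nucleotides, at least 5 tandem copies) and the function name are assumptions made for illustration, since the abstract does not state the criteria or the tool that was used.

```python
import re

# A motif of 2-6 bases followed by at least four more copies of itself.
# The lazy quantifier makes the shortest repeating unit win (AT, not ATAT).
_SSR = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_ssrs(seq):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    hits = []
    for match in _SSR.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Invented test sequences with a dinucleotide and a trinucleotide repeat.
di_hits = find_ssrs("GGC" + "AT" * 6 + "CCG")
tri_hits = find_ssrs("TT" + "ACG" * 5 + "TT")
```

    Each hit gives the 0-based start position, the repeat unit and the number of copies, which is the information typically needed to design flanking primers.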

  15. Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications

    DEFF Research Database (Denmark)

    Yilmaz, Pelin; Kottmann, Renzo; Field, Dawn

    2011-01-01

    Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences--the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The 'environment...

  16. Complete plastome sequences of Equisetum arvense and Isoetes flaccida: implications for phylogeny and plastid genome evolution of early land plant lineages

    Directory of Open Access Journals (Sweden)

    Mandoli Dina F

    2010-10-01

    Full Text Available Abstract Background Despite considerable progress in our understanding of land plant phylogeny, several nodes in the green tree of life remain poorly resolved. Furthermore, the bulk of currently available data come from only a subset of major land plant clades. Here we examine early land plant evolution using complete plastome sequences including two previously unexamined and phylogenetically critical lineages. To better understand the evolution of land plants and their plastomes, we examined aligned nucleotide sequences, indels, gene and nucleotide composition, inversions, and gene order at the boundaries of the inverted repeats. Results We present the plastome sequences of Equisetum arvense, a horsetail, and of Isoetes flaccida, a heterosporous lycophyte. Phylogenetic analysis of aligned nucleotides from 49 plastome genes from 43 taxa supported monophyly for the following clades: embryophytes (land plants), lycophytes, monilophytes (leptosporangiate ferns + Angiopteris evecta + Psilotum nudum + Equisetum arvense), and seed plants. Resolution among the four monilophyte lineages remained moderate, although nucleotide analyses suggested that P. nudum and E. arvense form a clade sister to A. evecta + leptosporangiate ferns. Results from phylogenetic analyses of nucleotides were consistent with the distribution of plastome gene rearrangements and with analysis of sequence gaps resulting from insertions and deletions (indels). We found one new indel and an inversion of a block of genes that unites the monilophytes. Conclusions Monophyly of monilophytes has been disputed on the basis of morphological and fossil evidence. In the context of a broad sampling of land plant data we find several new pieces of evidence for monilophyte monophyly. Results from this study demonstrate resolution among the four monilophyte lineages, albeit with moderate support; we posit a clade consisting of Equisetaceae and Psilotaceae that is sister to the "true ferns

  17. Complete genome sequence of the gliding, heparinolytic Pedobacter saltans type strain (113T)

    Science.gov (United States)

    Liolios, Konstantinos; Sikorski, Johannes; Lu, Meagan; Nolan, Matt; Lapidus, Alla; Lucas, Susan; Hammon, Nancy; Deshpande, Shweta; Cheng, Jan-Fang; Tapia, Roxanne; Han, Cliff; Goodwin, Lynne; Pitluck, Sam; Huntemann, Marcel; Ivanova, Natalia; Pagani, Ioanna; Mavromatis, Konstantinos; Ovchinikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Brambilla, Evelyne-Marie; Kotsyurbenko, Oleg; Rohde, Manfred; Tindall, Brian J.; Abt, Birte; Göker, Markus; Detter, John C.; Woyke, Tanja; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C.

    2011-01-01

    Pedobacter saltans Steyn et al. 1998 is one of currently 32 species in the genus Pedobacter within the family Sphingobacteriaceae. The species is of interest for its isolated location in the tree of life. Like other members of the genus P. saltans is heparinolytic. Cells of P. saltans show a peculiar gliding, dancing motility and can be distinguished from other Pedobacter strains by their ability to utilize glycerol and the inability to assimilate D-cellobiose. The genome presented here is only the second completed genome sequence of a type strain from a member of the family Sphingobacteriaceae to be published. The 4,635,236 bp long genome with its 3,854 protein-coding and 67 RNA genes consists of one chromosome, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:22180808

  18. Complete Genome Sequence of a Putative Densovirus of the Asian Citrus Psyllid, Diaphorina citri

    OpenAIRE

    Nigg, Jared C.; Nouri, Shahideh; Falk, Bryce W.

    2016-01-01

    Here, we report the complete genome sequence of a putative densovirus of the Asian citrus psyllid, Diaphorina citri. Diaphorina citri densovirus (DcDNV) was originally identified through metagenomics, and here, we obtained the complete nucleotide sequence using PCR-based approaches. Phylogenetic analysis places DcDNV between viruses of the Ambidensovirus and Iteradensovirus genera.

  19. Gene number determination and genetic polymorphism of the gamma delta T cell co-receptor WC1 genes

    Directory of Open Access Journals (Sweden)

    Chen Chuang

    2012-10-01

    Full Text Available Abstract Background WC1 co-receptors belong to the scavenger receptor cysteine-rich (SRCR) superfamily and are encoded by a multi-gene family. Expression of particular WC1 genes defines functional subpopulations of WC1+ γδ T cells. We have previously identified partial or complete genomic sequences for thirteen different WC1 genes through annotation of the bovine genome Btau_3.1 build. We also identified two WC1 cDNA sequences from other cattle that did not correspond to sequences in the Btau_3.1 build. Their absence in the Btau_3.1 build may have reflected gaps in the genome assembly or polymorphisms among animals. Since the response of γδ T cells to bacterial challenge is determined by WC1 gene expression, it was critical to understand whether individual cattle or breeds differ in the number of WC1 genes or display polymorphisms. Results Real-time quantitative PCR using DNA from the animal whose genome was sequenced (“Dominette”) and sixteen other animals representing ten breeds of cattle showed that the number of genes coding for WC1 co-receptors is thirteen. The complete coding sequences of those thirteen WC1 genes are presented, including the correction of an error in the WC1-2 gene due to mis-assembly in the Btau_3.1 build. All other cDNA sequences were found to agree with the previous annotation of complete or partial WC1 genes. PCR amplification and sequencing of the most variable N-terminal SRCR domain (domain 1, which has the SRCR “a” pattern) of each of the thirteen WC1 genes showed that the sequences are highly conserved among individuals and breeds. Of 160 sequences of domain 1 from three breeds of cattle, no additional sequences beyond the thirteen described WC1 genes were found. Analysis of the complete WC1 cDNA sequences indicated that the thirteen WC1 genes code for three distinct WC1 molecular forms. Conclusion The bovine WC1 multi-gene family is composed of thirteen genes coding for three structural forms whose

  20. Complete genome sequencing and evolutionary analysis of Indian isolates of Dengue virus type 2

    Energy Technology Data Exchange (ETDEWEB)

    Dash, Paban Kumar; Sharma, Shashi; Soni, Manisha; Agarwal, Ankita; Parida, Manmohan; Rao, P.V.Lakshmana

    2013-07-05

    Highlights: •Complete genome of Indian DENV-2 was deciphered for the first time in this study. •The recent Indian DENV-2 revealed presence of many unique amino acid residues. •Genotype shift (American to Cosmopolitan) characterizes evolution of DENV-2 in India. •Circulation of a unique clade of DENV-2 in South Asia was identified. -- Abstract: Dengue is the most important arboviral infection of global public health significance. It is now endemic in most parts of the South East Asia including India. Though Dengue virus type 2 (DENV-2) is predominantly associated with major outbreaks in India, complete genome information of Indian DENV-2 is not available. In this study, the full-length genome of five DENV-2 isolates (four from 2001 to 2011 and one from 1960), from different parts of India was determined. The complete genome of the Indian DENV-2 was found to be 10,670 bases long with an open reading frame coding for 3391 amino acids. The recent Indian DENV-2 (2001–2011) revealed a nucleotide sequence identity of around 90% and 97% with an older Indian DENV-2 (1960) and closely related Sri Lankan and Chinese DENV-2 respectively. Presence of unique amino acid residues and non-conservative substitutions in critical amino acid residues of major structural and non-structural proteins was observed in recent Indian DENV-2. Selection pressure analysis revealed positive selection in few amino acid sites of the genes encoding for structural and non-structural proteins. The molecular phylogenetic analysis based on comparison of both complete coding region and envelope protein gene with globally diverse DENV-2 viruses classified the recent Indian isolates into a unique South Asian clade within Cosmopolitan genotype. A shift of genotype from American to Cosmopolitan in 1970s characterized the evolution of DENV-2 in India. Present study is the first report on complete genome characterization of emerging DENV-2 isolates from India and highlights the circulation of a

  1. Complete genome sequencing and evolutionary analysis of Indian isolates of Dengue virus type 2

    International Nuclear Information System (INIS)

    Dash, Paban Kumar; Sharma, Shashi; Soni, Manisha; Agarwal, Ankita; Parida, Manmohan; Rao, P.V.Lakshmana

    2013-01-01

    Highlights: •Complete genome of Indian DENV-2 was deciphered for the first time in this study. •The recent Indian DENV-2 revealed presence of many unique amino acid residues. •Genotype shift (American to Cosmopolitan) characterizes evolution of DENV-2 in India. •Circulation of a unique clade of DENV-2 in South Asia was identified. -- Abstract: Dengue is the most important arboviral infection of global public health significance. It is now endemic in most parts of the South East Asia including India. Though Dengue virus type 2 (DENV-2) is predominantly associated with major outbreaks in India, complete genome information of Indian DENV-2 is not available. In this study, the full-length genome of five DENV-2 isolates (four from 2001 to 2011 and one from 1960), from different parts of India was determined. The complete genome of the Indian DENV-2 was found to be 10,670 bases long with an open reading frame coding for 3391 amino acids. The recent Indian DENV-2 (2001–2011) revealed a nucleotide sequence identity of around 90% and 97% with an older Indian DENV-2 (1960) and closely related Sri Lankan and Chinese DENV-2 respectively. Presence of unique amino acid residues and non-conservative substitutions in critical amino acid residues of major structural and non-structural proteins was observed in recent Indian DENV-2. Selection pressure analysis revealed positive selection in few amino acid sites of the genes encoding for structural and non-structural proteins. The molecular phylogenetic analysis based on comparison of both complete coding region and envelope protein gene with globally diverse DENV-2 viruses classified the recent Indian isolates into a unique South Asian clade within Cosmopolitan genotype. A shift of genotype from American to Cosmopolitan in 1970s characterized the evolution of DENV-2 in India. Present study is the first report on complete genome characterization of emerging DENV-2 isolates from India and highlights the circulation of a

  2. High throughput 16S rRNA gene amplicon sequencing

    DEFF Research Database (Denmark)

    Nierychlo, Marta; Larsen, Poul; Jørgensen, Mads Koustrup

    S rRNA gene amplicon sequencing has been developed over the past few years and is now ready to use for more comprehensive studies related to plant operation and optimization thanks to short analysis time, low cost, high throughput, and high taxonomic resolution. In this study we show how 16S r......RNA gene amplicon sequencing can be used to reveal factors of importance for the operation of full-scale nutrient removal plants related to settling problems and floc properties. Using optimized DNA extraction protocols, indexed primers and our in-house Illumina platform, we prepared multiple samples...... be correlated to the presence of the species that are regarded as “strong” and “weak” floc formers. In conclusion, 16S rRNA gene amplicon sequencing provides a high throughput approach for a rapid and cheap community profiling of activated sludge that in combination with multivariate statistics can be used...

  3. The complete chloroplast genome sequence of the medicinal plant Andrographis paniculata.

    Science.gov (United States)

    Ding, Ping; Shao, Yanhua; Li, Qian; Gao, Junli; Zhang, Runjing; Lai, Xiaoping; Wang, Deqin; Zhang, Huiye

    2016-07-01

    The complete chloroplast genome of Andrographis paniculata, an important medicinal plant with great economic value, has been studied in this article. The genome size is 150,249 bp in length, with 38.3% GC content. A pair of inverted repeats (IRs, 25,300 bp) are separated by a large single copy region (LSC, 82,459 bp) and a small single-copy region (SSC, 17,190 bp). The chloroplast genome contains 114 unique genes, 80 protein-coding genes, 30 tRNA genes and 4 rRNA genes. In these genes, 15 genes contained 1 intron and 3 genes comprised of 2 introns.
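
The region lengths and gene counts in this abstract can be checked against the stated totals. The sketch below uses only the quoted figures and assumes the usual quadripartite plastome layout (one LSC, one SSC, and two copies of the IR); the variable names are illustrative.

```python
# Region lengths in bp, as quoted in the abstract.
lsc = 82_459   # large single-copy region
ssc = 17_190   # small single-copy region
ir = 25_300    # one inverted repeat; the genome carries a pair

# Quadripartite structure: LSC + SSC + two IR copies.
genome_size = lsc + ssc + 2 * ir

# Gene categories quoted in the abstract.
protein_coding, trna, rrna = 80, 30, 4
unique_genes = protein_coding + trna + rrna

print(genome_size, unique_genes)
```

Both sums reproduce the reported totals of 150,249 bp and 114 unique genes.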

  4. Complete genome sequence of the moderately thermophilic mineral-sulfide-oxidizing firmicute Sulfobacillus acidophilus type strain (NALT)

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain [U.S. Department of Energy, Joint Genome Institute; Chertkov, Olga [Los Alamos National Laboratory (LANL); Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Hammon, Nancy [U.S. Department of Energy, Joint Genome Institute; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Pan, Chongle [ORNL; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute

    2012-01-01

    Sulfobacillus acidophilus Norris et al. 1996 is a member of the genus Sulfobacillus which comprises five species of the order Clostridiales. Sulfobacillus species are of interest for comparison to other sulfur and iron oxidizers and also have biomining applications. This is the first completed genome sequence of a type strain of the genus Sulfobacillus, and the second published genome of a member of the species S. acidophilus. The genome, which consists of one chromosome and one plasmid with a total size of 3,557,831 bp, harbors 3,626 protein-coding and 69 RNA genes, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  5. Construction of an American mink Bacterial Artificial Chromosome (BAC) library and sequencing candidate genes important for the fur industry

    Directory of Open Access Journals (Sweden)

    Christensen Knud

    2011-07-01

    Full Text Available Abstract Background Bacterial artificial chromosome (BAC) libraries continue to be invaluable tools for the genomic analysis of complex organisms. Complemented by the new and fast-growing deep sequencing technologies, they provide an excellent source of information in genomics projects. Results Here, we report the construction and characterization of the CHORI-231 BAC library constructed from a Danish-farmed, male American mink (Neovison vison). The library contains approximately 165,888 clones with an average insert size of 170 kb, representing approximately 10-fold coverage. High-density filters, each consisting of 18,432 clones spotted in duplicate, have been produced for hybridization screening and are publicly available. Overgo probes derived from expressed sequence tags (ESTs), representing 21 candidate genes for traits important for the mink industry, were used to screen the BAC library. These included candidate genes for coat coloring, hair growth and length, coarseness, and some receptors potentially involved in viral diseases in mink. The extensive screening yielded positive results for 19 of these genes. Thirty-five clones corresponding to 19 genes were sequenced using 454 Roche, and large contigs (184 kb on average) were assembled. Knowing the complete sequences of these candidate genes will enable confirmation of the association with a phenotype and the finding of causative mutations for the targeted phenotypes. Additionally, 1577 BAC clones were end sequenced; 2505 BAC end sequences (80% of BACs) were obtained. In excess of 2 Mb has been analyzed, thus giving a snapshot of the mink genome. Conclusions The availability of the CHORI-231 American mink BAC library will aid in identification of genes and genomic regions of interest. We have demonstrated how the library can be used to identify specific genes of interest, develop genetic markers, and for BAC end sequencing and deep sequencing of selected clones. To our knowledge, this is the
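
The library statistics in this abstract are linked by the usual coverage relation, coverage = (number of clones × average insert size) / genome size. The sketch below rearranges it to show what genome size the quoted figures imply. That genome size is a derived estimate, not a value stated in the abstract. The filter count and end-sequencing rate rest on the labeled assumptions that each clone sits on exactly one filter and that each clone can yield two end reads.

```python
# Figures quoted in the abstract; variable names are illustrative.
clones = 165_888
avg_insert_kb = 170
coverage = 10                 # "approximately 10-fold"
clones_per_filter = 18_432

# Total cloned DNA, converted from kb to Gb (1 Gb = 1e6 kb).
total_insert_gb = clones * avg_insert_kb / 1e6

# Rearranged coverage relation: genome size = total insert DNA / coverage.
implied_genome_gb = total_insert_gb / coverage

# Assumption: each clone is spotted on exactly one filter.
filters = clones // clones_per_filter

# Assumption: two end reads are attempted per end-sequenced clone.
end_sequenced_clones = 1_577
end_sequences = 2_505
end_success = end_sequences / (2 * end_sequenced_clones)

print(round(implied_genome_gb, 2), filters, round(100 * end_success, 1))
```

Under these assumptions the figures imply a genome of roughly 2.8 Gb and nine filters. The end-read success rate comes out near 79%, in line with the approximately 80% quoted.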

  6. The complete mitochondrial genome sequence of the spider Habronattus oregonensis reveals rearranged and extremely truncated tRNAs

    International Nuclear Information System (INIS)

    Masta, Susan E.; Boore, Jeffrey L.

    2004-01-01

    We sequenced the entire mitochondrial genome of the jumping spider Habronattus oregonensis of the arachnid order Araneae (Arthropoda: Chelicerata). A number of unusual features distinguish this genome from other chelicerate and arthropod mitochondrial genomes. Most of the transfer RNA gene sequences are greatly reduced in size and cannot be folded into typical cloverleaf-shaped secondary structures. At least nine of the tRNA sequences lack the potential to form TΨC arm stem pairings, and instead are inferred to have TV-replacement loops. Furthermore, sequences that could encode the 3' aminoacyl acceptor stems in at least 10 tRNAs appear to be lacking, because fully paired acceptor stems are not possible and because the downstream sequences instead encode adjacent genes. Hence, these appear to be among the smallest known tRNA genes. We postulate that an RNA editing mechanism must exist to restore the 3' aminoacyl acceptor stems in order to allow the tRNAs to function. At least seven tRNAs are rearranged with respect to the chelicerate Limulus polyphemus, although the arrangement of the protein-coding genes is identical. Most mitochondrial protein-coding genes of H. oregonensis have ATN as initiation codons, as commonly found in arthropod mtDNAs, but cytochrome oxidase subunit 2 and 3 genes apparently use UUG as an initiation codon. Finally, many of the gene sequences overlap one another and are truncated. This 14,381 bp genome, the first mitochondrial genome of a spider yet sequenced, is one of the smallest arthropod mitochondrial genomes known. We suggest that post-transcriptional RNA editing can likely maintain function of the tRNAs while permitting the accumulation of mutations that would otherwise be deleterious. Such mechanisms may have allowed for the minimization of the spider mitochondrial genome.

  7. Complete genome sequence and description of Salinispira pacifica gen. nov., sp. nov., a novel spirochaete isolated from a hypersaline microbial mat.

    Science.gov (United States)

    Ben Hania, Wajdi; Joseph, Manon; Schumann, Peter; Bunk, Boyke; Fiebig, Anne; Spröer, Cathrin; Klenk, Hans-Peter; Fardeau, Marie-Laure; Spring, Stefan

    2015-01-01

    During a study of the anaerobic microbial community of a lithifying hypersaline microbial mat of Lake 21 on the Kiritimati atoll (Kiribati Republic, Central Pacific) strain L21-RPul-D2(T) was isolated. The closest phylogenetic neighbor was Spirochaeta africana Z-7692(T) that shared a 16S rRNA gene sequence identity value of 90% with the novel strain and thus was only distantly related. A comprehensive polyphasic study including determination of the complete genome sequence was initiated to characterize the novel isolate. Cells of strain L21-RPul-D2(T) had a size of 0.2 - 0.25 × 8-9 μm, were helical, motile, stained Gram-negative and produced an orange carotenoid-like pigment. Optimal conditions for growth were 35°C, a salinity of 50 g/l NaCl and a pH around 7.0. Preferred substrates for growth were carbohydrates and a few carboxylic acids. The novel strain had an obligate fermentative metabolism and produced ethanol, acetate, lactate, hydrogen and carbon dioxide during growth on glucose. Strain L21-RPul-D2(T) was aerotolerant, but oxygen did not stimulate growth. Major cellular fatty acids were C14:0, iso-C15:0, C16:0 and C18:0. The major polar lipids were an unidentified aminolipid, phosphatidylglycerol, an unidentified phospholipid and two unidentified glycolipids. Whole-cell hydrolysates contained L-ornithine as diagnostic diamino acid of the cell wall peptidoglycan. The complete genome sequence was determined and annotated. The genome comprised one circular chromosome with a size of 3.78 Mbp that contained 3450 protein-coding genes and 50 RNA genes, including 2 operons of ribosomal RNA genes. The DNA G + C content was determined from the genome sequence as 51.9 mol%. There were no predicted genes encoding cytochromes or enzymes responsible for the biosynthesis of respiratory lipoquinones. Based on significant differences to the uncultured type species of the genus Spirochaeta, S. plicatilis, as well as to any other phylogenetically related

  8. Revised Mimivirus major capsid protein sequence reveals intron-containing gene structure and extra domain

    Directory of Open Access Journals (Sweden)

    Suzan-Monti Marie

    2009-05-01

    Full Text Available Abstract Background Acanthamoebae polyphaga Mimivirus (APM) is the largest known dsDNA virus. The viral particle has a nearly icosahedral structure with an internal capsid shell surrounded with a dense layer of fibrils. A Capsid protein sequence, D13L, was deduced from the APM L425 coding gene and was shown to be the most abundant protein found within the viral particle. However, this protein remained poorly characterised until now. A revised protein sequence deposited in a database suggested an additional N-terminal stretch of 142 amino acids missing from the original deduced sequence. This result led us to investigate the L425 gene structure and the biochemical properties of the complete APM major Capsid protein. Results This study describes the full length 3430 bp Capsid coding gene and characterises the corresponding 593-amino-acid Capsid protein 1. The recombinant full length protein allowed the production of a specific monoclonal antibody able to detect the Capsid protein 1 within the viral particle. This protein appeared to be post-translationally modified by glycosylation and phosphorylation. We proposed a secondary structure prediction of APM Capsid protein 1 compared to the Capsid protein structure of Paramecium Bursaria Chlorella Virus 1, another member of the Nucleo-Cytoplasmic Large DNA virus family. Conclusion The characterisation of the full length L425 Capsid coding gene of Acanthamoebae polyphaga Mimivirus provides new insights into the structure of the main Capsid protein. The production of a full length recombinant protein will be useful for further structural studies.

  9. Complete Genome Sequence of a Putative Densovirus of the Asian Citrus Psyllid, Diaphorina citri.

    Science.gov (United States)

    Nigg, Jared C; Nouri, Shahideh; Falk, Bryce W

    2016-07-28

    Here, we report the complete genome sequence of a putative densovirus of the Asian citrus psyllid, Diaphorina citri. Diaphorina citri densovirus (DcDNV) was originally identified through metagenomics, and here, we obtained the complete nucleotide sequence using PCR-based approaches. Phylogenetic analysis places DcDNV between viruses of the Ambidensovirus and Iteradensovirus genera.

  10. Genome sequencing of the lizard parasite Leishmania tarentolae reveals loss of genes associated to the intracellular stage of human pathogenic species

    Science.gov (United States)

    Raymond, Frédéric; Boisvert, Sébastien; Roy, Gaétan; Ritt, Jean-François; Légaré, Danielle; Isnard, Amandine; Stanke, Mario; Olivier, Martin; Tremblay, Michel J.; Papadopoulou, Barbara; Ouellette, Marc; Corbeil, Jacques

    2012-01-01

    The Leishmania tarentolae Parrot-TarII strain genome was sequenced to an average 16-fold coverage by next-generation DNA sequencing technologies. This is the first genome of a kinetoplastid protozoan non-pathogenic to humans to be described, thus providing an opportunity for comparison with the completed genomes of pathogenic Leishmania species. A high synteny was observed between all sequenced Leishmania species. A limited number of chromosomal regions diverged between L. tarentolae and L. infantum, while remaining syntenic to L. major. Globally, >90% of the L. tarentolae gene content was shared with the other Leishmania species. We identified 95 predicted coding sequences unique to L. tarentolae and 250 genes that were absent from L. tarentolae. Interestingly, many of the latter genes were expressed in the intracellular amastigote stage of pathogenic species. In addition, genes coding for products involved in antioxidant defence or participating in vesicular-mediated protein transport were underrepresented in L. tarentolae. In contrast to other Leishmania genomes, two gene families were expanded in L. tarentolae, namely the zinc metallo-peptidase surface glycoprotein GP63 and the promastigote surface antigen PSA31C. Overall, L. tarentolae's gene content appears better adapted to the promastigote insect stage than to the amastigote mammalian stage. PMID:21998295

  11. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes.

    Science.gov (United States)

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-02-01

    The study of mutants to elucidate gene functions has a long and successful history; however, to discover causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild type-copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information.

  12. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms.

    Science.gov (United States)

    Kangaspeska, Sara; Hultsch, Susanne; Edgren, Henrik; Nicorici, Daniel; Murumägi, Astrid; Kallioniemi, Olli

    2012-01-01

    RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.

  13. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms.

    Directory of Open Access Journals (Sweden)

    Sara Kangaspeska

    Full Text Available RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.

  14. Neural Inductive Matrix Completion for Predicting Disease-Gene Associations

    KAUST Repository

    Hou, Siqing

    2018-05-21

    In silico prioritization of undiscovered associations can help find causal genes of newly discovered diseases. Some existing methods are based on known associations and on side information about diseases and genes. We explore the possibility of using a neural network model, neural inductive matrix completion (NIMC), in disease-gene prediction. Compared to the state-of-the-art inductive matrix completion method, using neural networks allows us to learn latent features from non-linear functions of input features. Previous methods use disease features obtained only from text mining. Compared to text mining, disease ontology is a more informative way of discovering correlations between diseases, from which we can calculate the similarities between diseases and help increase the performance of predicting disease-gene associations. We compare the proposed method with other state-of-the-art methods for predicting associated genes for diseases from the Online Mendelian Inheritance in Man (OMIM) database. Results show that both the new features and the proposed NIMC model can improve the chance of recovering an unknown associated gene in the top 100 predicted genes. The best results are obtained by using both the new features and the new model. Results also show that the proposed method does better in predicting associated genes for newly discovered diseases.

  15. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

    DEFF Research Database (Denmark)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.

    2005-01-01

    years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences......We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each...... between the species-but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence...

  16. Transcriptome Profiling Using Single-Molecule Direct RNA Sequencing Approach for In-depth Understanding of Genes in Secondary Metabolism Pathways of Camellia sinensis

    Directory of Open Access Journals (Sweden)

    Qingshan Xu

    2017-07-01

    Characteristic secondary metabolites, including flavonoids, theanine and caffeine, are important components of Camellia sinensis, and their biosynthesis has attracted widespread interest. Previous studies on the biosynthesis of these major secondary metabolites using next-generation sequencing technologies were limited in the accurate prediction of full-length (FL) splice isoforms. Herein, we applied single-molecule sequencing to pooled tea plant tissues to provide a more complete transcriptome of C. sinensis. Moreover, we identified 94 FL transcripts and four alternative splicing events for enzyme-coding genes involved in the biosynthesis of flavonoids, theanine and caffeine. According to the comparison between long-read isoforms and assembled transcripts, we improved the quality and accuracy of genes sequenced by short-read next-generation sequencing technology. The resulting FL transcripts, together with the improved assembled transcripts and identified alternative splicing events, enhance our understanding of genes involved in the biosynthesis of characteristic secondary metabolites in C. sinensis.

  17. The complete mitochondrial genomes of two rice planthoppers, Nilaparvata lugens and Laodelphax striatellus: conserved genome rearrangement in Delphacidae and discovery of new characteristics of atp8 and tRNA genes.

    Science.gov (United States)

    Zhang, Kai-Jun; Zhu, Wen-Chao; Rong, Xia; Zhang, Yan-Kai; Ding, Xiu-Lei; Liu, Jing; Chen, Da-Song; Du, Yu; Hong, Xiao-Yue

    2013-06-22

    Nilaparvata lugens (the brown planthopper, BPH) and Laodelphax striatellus (the small brown planthopper, SBPH) are two of the most important pests of rice. Until now, only one rice planthopper mitochondrial genome had been sequenced, and very little dependable mitochondrial information was available for research on the population genetics, phylogeography and phylogenetic evolution of these pests. To obtain more valuable information from the mitochondria, we sequenced the complete mitochondrial genomes of BPH and SBPH. These two planthoppers were infected with two different functional Wolbachia (intracellular endosymbiont) strains (wLug and wStri). Because both mitochondria and Wolbachia are transmitted by cytoplasmic inheritance, and it was difficult to separate them when purifying the Wolbachia particles, we also obtained nearly complete mitochondrial genome sequences of these two rice planthoppers while sequencing the Wolbachia genomes using a next-generation sequencing method. After gap closing, we present high-quality and reliable complete mitochondrial genomes of these two planthoppers. The mitogenomes of N. lugens (BPH) and L. striatellus (SBPH) are 17,619 bp and 16,431 bp long, with A + T contents of 76.95% and 77.17%, respectively. Both species have typical circular mitochondrial genomes that encode the complete set of 37 genes usually found in metazoans. However, the BPH mitogenome also possesses two additional copies of the trnC gene. In both mitochondrial genomes, the atp8 gene was conspicuously shorter than in all other known insect mitochondrial genomes (99 bp for BPH, 102 bp for SBPH). The finding that two rearrangement regions (trnC-trnW and nad6-trnP-trnT), which differ from those of other known insects, occur in these two distantly related planthoppers indicates that the mitochondrial gene order might be conserved in Delphacidae. The large non-coding fragment (the A+T-rich region) putatively

  18. Complete Genome Sequence of Methanohalophilus halophilus DSM 3094 T , Isolated from a Cyanobacterial Mat and Bottom Deposits at Hamelin Pool, Shark Bay, Northwestern Australia

    KAUST Repository

    L'Haridon, Stéphane

    2017-02-17

    The complete genome sequence of Methanohalophilus halophilus DSM 3094, a member of the Methanosarcinaceae family and the Methanosarcinales order, consists of 2,022,959 bp in one contig and contains 2,137 predicted genes. The genome is consistent with a halophilic methylotrophic anaerobic lifestyle, including the methylotrophic and CO2-H2 methanogenesis pathways.

  19. Complete Genome Sequence of Methanohalophilus halophilus DSM 3094 T , Isolated from a Cyanobacterial Mat and Bottom Deposits at Hamelin Pool, Shark Bay, Northwestern Australia

    KAUST Repository

    L'Haridon, Stéphane; Corre, Erwan; Guan, Yue; Vinu, Manikandan; La Cono, Violetta; Yakimov, Mickail; Stingl, Ulrich; Toffin, Laurent; Jebbar, Mohamed

    2017-01-01

    The complete genome sequence of Methanohalophilus halophilus DSM 3094, a member of the Methanosarcinaceae family and the Methanosarcinales order, consists of 2,022,959 bp in one contig and contains 2,137 predicted genes. The genome is consistent with a halophilic methylotrophic anaerobic lifestyle, including the methylotrophic and CO2-H2 methanogenesis pathways.

  20. Cloning and sequencing of a cellobiohydrolase gene from Trichoderma harzianum FP108

    Science.gov (United States)

    Patrick Guilfoile; Ron Burns; Zu-Yi Gu; Matt Amundson; Fu-Hsian Chang

    1999-01-01

    A cbh1 cellobiohydrolase gene was cloned and sequenced from the fungus Trichoderma harzianum FP108. The cloning was performed by PCR amplification of T. harzianum genomic DNA, using PCR primers whose sequence was based on the cbh1 gene from Trichoderma reesei. The 3' end of the gene was isolated by inverse...

  1. Nearly complete mitogenome of hairy sawfly, Corynis lateralis (Brullé, 1832) (Hymenoptera: Cimbicidae): rearrangements in the IQM and ARNS1EF gene clusters.

    Science.gov (United States)

    Doğan, Özgül; Korkmaz, E Mahir

    2017-10-01

    The Cimbicidae is a small family of the primitive and relatively less diverse suborder Symphyta (Hymenoptera). Here, the nearly complete mitochondrial genome (mitogenome) of the hairy sawfly, Corynis lateralis (Hymenoptera: Cimbicidae), was sequenced using next-generation sequencing and comparatively analysed with the mitogenome of Trichiosoma anthracinum. The sequenced length of the C. lateralis mitogenome was 14,899 bp, with an A+T content of 80.60%. All protein-coding genes (PCGs) are initiated by ATN codons and terminated with a TAR or T- stop codon. All tRNA genes use the usual anticodons. Compared with the inferred insect ancestral mitogenome, two tRNA rearrangements were observed in the IQM and ARNS1EF gene clusters, representing a new event not previously reported in Symphyta. Illicit priming of replication and/or intra/inter-mitochondrial recombination and TDRL seem to be the mechanisms responsible for the rearrangement events in these gene clusters. Phylogenetic analyses confirmed the position of Corynis within Cimbicidae and recovered a relationship of Tenthredinoidea + (Cephoidea + Orussoidea) in Symphyta.

  2. Nucleotide sequence of the gene coding for human factor VII, a vitamin K-dependent protein participating in blood coagulation

    International Nuclear Information System (INIS)

    O'Hara, P.J.; Grant, F.J.; Haldeman, B.A.; Gray, C.L.; Insley, M.Y.; Hagen, F.S.; Murray, M.J.

    1987-01-01

    Activated factor VII (factor VIIa) is a vitamin K-dependent plasma serine protease that participates in a cascade of reactions leading to the coagulation of blood. Two overlapping genomic clones containing sequences encoding human factor VII were isolated and characterized. The complete sequence of the gene was determined and found to span about 12.8 kilobases. The mRNA for factor VII as demonstrated by cDNA cloning is polyadenylylated at multiple sites but contains only one AAUAAA poly(A) signal sequence. The mRNA can undergo alternative splicing, forming one transcript containing eight segments as exons and another with an additional exon that encodes a larger prepro leader sequence. The latter transcript has no known counterpart in the other vitamin K-dependent proteins. The positions of the introns with respect to the amino acid sequence encoded by the eight essential exons of factor VII are the same as those present in factor IX, factor X, protein C, and the first three exons of prothrombin. These exons code for domains generally conserved among members of this gene family. The comparable introns in these genes, however, are dissimilar with respect to size and sequence, with the exception of intron C in factor VII and protein C. The gene for factor VII also contains five regions made up of tandem repeats of oligonucleotide monomer elements. More than a quarter of the intron sequences and more than a third of the 3' untranslated portion of the mRNA transcript consist of these minisatellite tandem repeats

  3. Mouse mammary tumor virus-like gene sequences are present in lung patient specimens

    Directory of Open Access Journals (Sweden)

    Rodríguez-Padilla Cristina

    2011-09-01

    Background: Previous studies have reported on the presence of Murine Mammary Tumor Virus (MMTV)-like gene sequences in human cancer tissue specimens. Here, we search for MMTV-like gene sequences in lung disease specimens, including carcinomas, from a Mexican population. This study was based on our previous study reporting that the INER51 lung cancer cell line, from a pleural effusion of a Mexican patient, contains MMTV-like env gene sequences. Results: The MMTV-like env gene sequences were detected in three out of 18 specimens studied, by PCR using a specific set of MMTV-like primers. The three identified MMTV-like gene sequences, which were designated INER6, HZ101, and HZ14, were 99%, 98%, and 97% homologous, respectively, to GenBank sequence accession number AY161347. The INER6 and HZ-101 samples were isolated from lung cancer specimens, and the HZ-14 sample was isolated from an acute inflammatory lung infiltrate sample. Two of the env sequences exhibited disruption of the reading frame due to mutations. Conclusion: In summary, we identified the presence of MMTV-like gene sequences in 2 out of 11 (18%) of the lung carcinomas and 1 out of 7 (14%) of the acute inflammatory lung infiltrate specimens studied from a Mexican population.

  4. On the total number of genes and their length distribution in complete microbial genomes

    DEFF Research Database (Denmark)

    Skovgaard, Marie; Jensen, L.J.; Brunak, Søren

    2001-01-01

    In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length...... distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300...... genes, we show that it probably has only ~3800 genes, and that a similar discrepancy exists for almost all published genomes....

  5. Complete genome sequence of Acinetobacter baumannii XH386 (ST208), a multi-drug resistant bacteria isolated from pediatric hospital in China

    Directory of Open Access Journals (Sweden)

    Youhong Fang

    2016-03-01

    Acinetobacter baumannii is an important bacterium that has emerged as a significant nosocomial pathogen worldwide. The rise of A. baumannii is due to its multi-drug resistance (MDR); multi-drug resistant A. baumannii is difficult to treat with antibiotics, especially in pediatric patients, for whom the therapeutic options are quite limited. A. baumannii ST208 was identified as a predominant sequence type of carbapenem-resistant A. baumannii in the United States and China. To our knowledge, no complete genome sequence has been reported for A. baumannii ST208, although several whole-genome shotgun sequences have been reported. Here, we sequenced the 4087-kilobase (kb) chromosome and 112-kb plasmid of A. baumannii XH386 (ST208), which was isolated from a pediatric hospital in China. The genome of A. baumannii XH386 contained 3968 protein-coding genes and 94 RNA-only encoding genes. Genomic analysis and a minimum inhibitory concentration assay showed that A. baumannii XH386 is a multi-drug resistant strain, resistant to most antibiotics except tigecycline. The data may be accessed via GenBank accession numbers CP010779 and CP010780. Keywords: Acinetobacter baumannii, Multi-drug resistance, Paediatric

  6. Genomic sequence around butterfly wing development genes: annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Inês C Conceição

    BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high

  7. Microsatellite Instability Use in Mismatch Repair Gene Sequence Variant Classification

    Directory of Open Access Journals (Sweden)

    Bryony A. Thompson

    2015-03-01

    Inherited mutations in the DNA mismatch repair (MMR) genes can cause MMR deficiency and increased susceptibility to colorectal and endometrial cancer. Microsatellite instability (MSI) is the defining molecular signature of MMR deficiency. The clinical classification of identified MMR gene sequence variants has a direct impact on the management of patients and their families. For a significant proportion of cases, sequence variants of uncertain clinical significance (also known as unclassified variants) are identified, constituting a challenge for genetic counselling and clinical management of families. The effect of these variants on protein function is difficult to interpret. The presence or absence of MSI in tumours can aid in determining the pathogenicity of associated unclassified MMR gene variants. However, there are some considerations that need to be taken into account when using MSI for variant interpretation. The use of MSI and other tumour characteristics in MMR gene sequence variant classification will be explored in this review.

  8. Cloning and sequence analysis of chitin synthase gene fragments of Demodex mites.

    Science.gov (United States)

    Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying

    2012-10-01

    To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi'an, China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was successful only in three D. canis isolates and one D. brevis isolate out of the nine Demodex isolates. The obtained fragments were 339 bp long for D. canis and 338 bp long for D. brevis. The CHS gene sequence similarities between the three Xi'an D. canis isolates and one Japanese D. canis isolate ranged from 99.7% to 100.0%, and those between four D. canis isolates and one D. brevis isolate were 99.1%-99.4%. Phylogenetic trees based on maximum parsimony (MP) and maximum likelihood (ML) methods shared the same clusters, in accordance with the traditional classification. Two open reading frames (ORFs) were identified in each CHS gene sequenced, and their corresponding amino acid sequences were located at the catalytic domain. The relatively conserved sequences could be deduced to be a CHS class A gene, which is associated with chitin synthesis in the integument of Demodex mites.

  9. Complete nucleotide sequence and analysis of two conjugative broad host range plasmids from a marine microbial biofilm.

    Directory of Open Access Journals (Sweden)

    Peter Norberg

    The complete nucleotide sequence of plasmids pMCBF1 and pMCBF6 was determined and analyzed. pMCBF1 and pMCBF6 form a novel clade within the IncP-1 plasmid family designated IncP-1 ς. The plasmids were exogenously isolated earlier from a marine biofilm. pMCBF1 (62 689 base pairs; bp) and pMCBF6 (66 729 bp) have identical backbones, but differ in their mercury resistance transposons. pMCBF1 carries Tn5053 and pMCBF6 carries Tn5058. Both are flanked by 5 bp direct repeats, typical of replicative transposition. Both insertions are in the vicinity of a resolvase gene in the backbone, supporting the idea that both transposons are "res-site hunters" that preferably insert close to and use external resolvase functions. The similarity of the backbones indicates recent insertion of the two transposons and the ongoing dynamics of plasmid evolution in marine biofilms. Both plasmids also carry the insertion sequence ISPst1, albeit without flanking repeats. ISPst1 is located in an unusual site within the control region of the plasmid. In contrast to most known IncP-1 plasmids, the pMCBF1/pMCBF6 backbone has no insert between the replication initiation gene (trfA) and the vegetative replication origin (oriV). One pMCBF1/pMCBF6 block of about 2.5 kilobases (kb) has no similarity with known sequences in the databases. Furthermore, insertion of three genes with similarity to the multidrug efflux pump operon mexEF and a gene from the NodT family of the tripartite multi-drug resistance-nodulation-division (RND) system in Pseudomonas aeruginosa was found. They do not seem to confer antibiotic resistance to the hosts of pMCBF1/pMCBF6, but the presence of RND on promiscuous plasmids may have serious implications for the spread of antibiotic multi-resistance.

  10. The complete genome sequences of poxviruses isolated from a penguin and a pigeon in South Africa and comparison to other sequenced avipoxviruses.

    Science.gov (United States)

    Offerman, Kristy; Carulei, Olivia; van der Walt, Anelda Philine; Douglass, Nicola; Williamson, Anna-Lise

    2014-06-12

    Two novel avipoxviruses from South Africa have been sequenced, one from a Feral Pigeon (Columba livia) (FeP2) and the other from an African penguin (Spheniscus demersus) (PEPV). We present a purpose-designed bioinformatics pipeline for analysis of next generation sequence data of avian poxviruses and compare the different avipoxviruses sequenced to date with specific emphasis on their evolution and gene content. The FeP2 (282 kbp) and PEPV (306 kbp) genomes encode 271 and 284 open reading frames respectively and are more closely related to one another (94.4%) than to either fowlpox virus (FWPV) (85.3% and 84.0% respectively) or Canarypox virus (CNPV) (62.0% and 63.4% respectively). Overall, FeP2, PEPV and FWPV have syntenic gene arrangements; however, major differences exist throughout their genomes. The most striking difference between FeP2 and the FWPV-like avipoxviruses is a large deletion of ~16 kbp from the central region of the genome of FeP2 deleting a cc-chemokine-like gene, two Variola virus B22R orthologues, an N1R/p28-like gene and a V-type Ig domain family gene. FeP2 and PEPV both encode orthologues of vaccinia virus C7L and Interleukin 10. PEPV contains a 77 amino acid long orthologue of Ubiquitin sharing 97% amino acid identity to human ubiquitin. The genome sequences of FeP2 and PEPV have greatly added to the limited repository of genomic information available for the Avipoxvirus genus. In the comparison of FeP2 and PEPV to existing sequences, FWPV and CNPV, we have established insights into African avipoxvirus evolution. Our data supports the independent evolution of these South African avipoxviruses from a common ancestral virus to FWPV and CNPV.

  11. Sequencing and analysis of the gene-rich space of cowpea

    Directory of Open Access Journals (Sweden)

    Cheung Foo

    2008-02-01

    Background: Cowpea, Vigna unguiculata (L.) Walp., is one of the most important food and forage legumes in the semi-arid tropics because of its drought tolerance and ability to grow on poor quality soils. Approximately 80% of cowpea production takes place in the dry savannahs of tropical West and Central Africa, mostly by poor subsistence farmers. Despite its economic and social importance in the developing world, cowpea remains to a large extent an underexploited crop. Among the major goals of cowpea breeding and improvement programs is the stacking of desirable agronomic traits, such as disease and pest resistance and response to abiotic stresses. Implementation of marker-assisted selection and breeding programs is severely limited by a paucity of trait-linked markers and a general lack of information on gene structure and organization. With a nuclear genome size estimated at ~620 Mb, the cowpea genome is an ideal target for reduced representation sequencing. Results: We report here the sequencing and analysis of the gene-rich, hypomethylated portion of the cowpea genome selectively cloned by methylation filtration (MF) technology. Over 250,000 gene-space sequence reads (GSRs) with an average length of 610 bp were generated, yielding ~160 Mb of sequence information. The GSRs were assembled, annotated by BLAST homology searches of four public protein annotation databases and four plant proteomes (A. thaliana, M. truncatula, O. sativa, and P. trichocarpa), and analyzed using various domain and gene modeling tools. A total of 41,260 GSR assemblies and singletons were annotated, of which 19,786 have unique GenBank accession numbers. Within the GSR dataset, 29% of the sequences were annotated using the Arabidopsis Gene Ontology (GO), with the largest categories of assigned function being catalytic activity and metabolic processes, groups that include the majority of cellular enzymes and components of amino acid, carbohydrate and lipid metabolism. A

  12. Topology of genes and nontranscribed sequences in human interphase nuclei

    International Nuclear Information System (INIS)

    Scheuermann, Markus O.; Tajbakhsh, Jian; Kurz, Anette; Saracoglu, Kaan; Eils, Roland; Lichter, Peter

    2004-01-01

    Knowledge about the functional impact of the topological organization of DNA sequences within interphase chromosome territories is still sparse. Of the few analyzed single copy genomic DNA sequences, the majority had been found to localize preferentially at the chromosome periphery or to loop out from chromosome territories. By means of dual-color fluorescence in situ hybridization (FISH), immunolabeling, confocal microscopy, and three-dimensional (3D) image analysis, we analyzed the intraterritorial and nuclear localization of 10 genomic fragments of different sequence classes in four different human cell types. The localization of three muscle-specific genes FLNA, NEB, and TTN, the oncogene BCL2, the tumor suppressor gene MADH4, and five putatively nontranscribed genomic sequences was predominantly in the periphery of the respective chromosome territories, independent from transcriptional status and from GC content. In interphase nuclei, the noncoding sequences were only rarely found associated with heterochromatic sites marked by the satellite III DNA D1Z1 or clusters of mammalian heterochromatin proteins (HP1α, HP1β, HP1γ). However, the nontranscribed sequences were found predominantly at the nuclear periphery or at the nucleoli, whereas genes tended to localize on chromosome surfaces exposed to the nuclear interior

  13. Complete genome sequence of Lactobacillus paracasei CAUH35, a new strain isolated from traditional fermented dairy product koumiss in China.

    Science.gov (United States)

    Wang, Guohong; Xiong, Yao; Xu, Qi; Yin, Jia; Hao, Yanling

    2015-11-20

    Lactobacillus paracasei CAUH35 was isolated from homemade koumiss, a traditional fermented dairy product with beneficial effects on human health. The genome consists of a circular 2,770,411 bp chromosome and four plasmids. Genome analysis revealed the presence of gene clusters involved in the production of exopolysaccharides and bacteriocin. The complete genome sequence of L. paracasei CAUH35 will provide a genetic basis for further comparative and functional genomic analyses.

  14. Complete Genome Sequence of Bradyrhizobium sp. S23321: Insights into Symbiosis Evolution in Soil Oligotrophs

    Science.gov (United States)

    Okubo, Takashi; Tsukui, Takahiro; Maita, Hiroko; Okamoto, Shinobu; Oshima, Kenshiro; Fujisawa, Takatomo; Saito, Akihiro; Futamata, Hiroyuki; Hattori, Reiko; Shimomura, Yumi; Haruta, Shin; Morimoto, Sho; Wang, Yong; Sakai, Yoriko; Hattori, Masahira; Aizawa, Shin-ichi; Nagashima, Kenji V. P.; Masuda, Sachiko; Hattori, Tsutomu; Yamashita, Akifumi; Bao, Zhihua; Hayatsu, Masahito; Kajiya-Kanegae, Hiromi; Yoshinaga, Ikuo; Sakamoto, Kazunori; Toyota, Koki; Nakao, Mitsuteru; Kohara, Mitsuyo; Anda, Mizue; Niwa, Rieko; Jung-Hwan, Park; Sameshima-Saito, Reiko; Tokuda, Shin-ichi; Yamamoto, Sumiko; Yamamoto, Syuji; Yokoyama, Tadashi; Akutsu, Tomoko; Nakamura, Yasukazu; Nakahira-Yanaka, Yuka; Hoshino, Yuko Takada; Hirakawa, Hideki; Mitsui, Hisayuki; Terasawa, Kimihiro; Itakura, Manabu; Sato, Shusei; Ikeda-Ohtsubo, Wakako; Sakakura, Natsuko; Kaminuma, Eli; Minamisawa, Kiwamu

    2012-01-01

    Bradyrhizobium sp. S23321 is an oligotrophic bacterium isolated from paddy field soil. Although S23321 is phylogenetically close to Bradyrhizobium japonicum USDA110, a legume symbiont, it is unable to induce root nodules in siratro, a legume often used for testing Nod factor-dependent nodulation. The genome of S23321 is a single circular chromosome, 7,231,841 bp in length, with an average GC content of 64.3%. The genome contains 6,898 potential protein-encoding genes, one set of rRNA genes, and 45 tRNA genes. Comparison of the genome structure between S23321 and USDA110 showed strong colinearity; however, the symbiosis islands present in USDA110 were absent in S23321, whose genome lacked a chaperonin gene cluster (groELS3) for symbiosis regulation found in USDA110. A comparison of sequences around the tRNA-Val gene strongly suggested that S23321 contains an ancestral-type genome that precedes the acquisition of a symbiosis island by horizontal gene transfer. Although S23321 contains a nif (nitrogen fixation) gene cluster, the organization, homology, and phylogeny of the genes in this cluster were more similar to those of photosynthetic bradyrhizobia ORS278 and BTAi1 than to those on the symbiosis island of USDA110. In addition, we found genes encoding a complete photosynthetic system, many ABC transporters for amino acids and oligopeptides, two types (polar and lateral) of flagella, multiple respiratory chains, and a system for lignin monomer catabolism in the S23321 genome. These features suggest that S23321 is able to adapt to a wide range of environments, probably including low-nutrient conditions, with multiple survival strategies in soil and rhizosphere. PMID:22452844

  15. Molecular characterization, sequence analysis and tissue expression of a porcine gene – MOSPD2

    Directory of Open Access Journals (Sweden)

    Yang Jie

    2017-01-01

    The full-length cDNA sequence of a porcine gene, MOSPD2, was amplified using the rapid amplification of cDNA ends method based on a pig expressed sequence tag sequence which was highly homologous to the coding sequence of the human MOSPD2 gene. Sequence prediction analysis revealed that the open reading frame of this gene encodes a protein of 491 amino acids that has high homology with the motile sperm domain-containing protein 2 (MOSPD2) of five species: horse (89%), human (90%), chimpanzee (89%), rhesus monkey (89%) and mouse (85%); thus, it could be defined as a porcine MOSPD2 gene. This novel porcine gene was assigned GeneID: 100153601. This gene is structured in 15 exons and 14 introns as revealed by computer-assisted analysis. The phylogenetic analysis revealed that the porcine MOSPD2 gene has a closer genetic relationship with the MOSPD2 gene of horse. Tissue expression analysis indicated that the porcine MOSPD2 gene is generally and differentially expressed in the spleen, muscle, skin, kidney, lung, liver, fat and heart. Our experiment is the first to establish the primary foundation for further research on the porcine MOSPD2 gene.

  16. Characterization of the complete mitochondrial genome of Marshallagia marshalli and phylogenetic implications for the superfamily Trichostrongyloidea.

    Science.gov (United States)

    Sun, Miao-Miao; Han, Liang; Zhang, Fu-Kai; Zhou, Dong-Hui; Wang, Shu-Qing; Ma, Jun; Zhu, Xing-Quan; Liu, Guo-Hua

    2018-01-01

    Marshallagia marshalli (Nematoda: Trichostrongylidae) infection can lead to serious parasitic gastroenteritis in sheep, goats, and wild ruminants, causing significant socioeconomic losses worldwide. To date, studies of the molecular biology of M. marshalli have been limited. Herein, we sequenced the complete mitochondrial (mt) genome of M. marshalli and examined its phylogenetic relationship with selected members of the superfamily Trichostrongyloidea using Bayesian inference (BI) based on concatenated mt amino acid sequence datasets. The complete mt genome sequence of M. marshalli is 13,891 bp, including 12 protein-coding genes, 22 transfer RNA genes, and 2 ribosomal RNA genes. All protein-coding genes are transcribed in the same direction. Phylogenetic analyses based on concatenated amino acid sequences of the 12 protein-coding genes supported the monophyly of each of the families Haemonchidae, Molineidae, and Dictyocaulidae with strong statistical support, but rejected the monophyly of the family Trichostrongylidae. The determination of the complete mt genome sequence of M. marshalli provides novel genetic markers for studying the systematics, population genetics, and molecular epidemiology of M. marshalli and its congeners.

  17. Complete Genome Sequence of an Avian Metapneumovirus Subtype A Strain Isolated from Chicken (Gallus gallus) in Brazil

    OpenAIRE

    Rizotto, Laís S.; Scagion, Guilherme P.; Cardoso, Tereza C.; Simão, Raphael M.; Caserta, Leonardo C.; Benassi, Julia C.; Keid, Lara B.; Oliveira, Trícia M. F. de S.; Soares, Rodrigo M.; Arns, Clarice W.; Van Borm, Steven; Ferreira, Helena L.

    2017-01-01

    We report here the complete genome sequence of an avian metapneumovirus (aMPV) isolated from a tracheal tissue sample of a commercial layer flock. The complete genome sequence of aMPV-A/chicken/Brazil-SP/669/2003 was obtained using MiSeq (Illumina, Inc.) sequencing. Phylogenetic analysis of the complete genome classified the isolate as avian metapneumovirus subtype A.

  18. The primary structures of two leghemoglobin genes from soybean

    DEFF Research Database (Denmark)

    Hyldig-Nielsen, J J; Jensen, E O; Paludan, K

    1982-01-01

    We present the complete nucleotide sequences of two leghemoglobin genes isolated from soybean DNA. Both genes contain three intervening sequences which interrupt the two coding sequences in identical positions. The 5' and 3' flanking sequences in both genes contain conserved sequences similar...

  19. Complete genome sequences and comparative genome analysis of Lactobacillus plantarum strain 5-2 isolated from fermented soybean.

    Science.gov (United States)

    Liu, Chen-Jian; Wang, Rui; Gong, Fu-Ming; Liu, Xiao-Feng; Zheng, Hua-Jun; Luo, Yi-Yong; Li, Xiao-Ran

    2015-12-01

    Lactobacillus plantarum is an important probiotic and is mostly isolated from fermented foods. We sequenced the genome of L. plantarum strain 5-2, which was isolated from fermented soybean from Yunnan province, China. The strain was determined to contain 3114 genes. Fourteen complete insertion sequence (IS) elements were found in the 5-2 chromosome. There were 24 DNA replication proteins and 76 DNA repair proteins in the 5-2 genome. Consistent with the classification of L. plantarum as a facultative heterofermentative lactobacillus, the 5-2 genome encodes key enzymes required for the EMP (Embden-Meyerhof-Parnas) and phosphoketolase (PK) pathways. Several components of the secretion machinery are found in the 5-2 genome, which was compared with L. plantarum ST-III, JDM1 and WCFS1. Most of the specific proteins in the four genomes appeared to be related to their prophage elements. Copyright © 2015 Elsevier Inc. All rights reserved.

  20. Sequence composition and gene content of the short arm of rye (Secale cereale) chromosome 1.

    Directory of Open Access Journals (Sweden)

    Silvia Fluch

    BACKGROUND: The purpose of the study is to elucidate the sequence composition of the short arm of rye chromosome 1 (Secale cereale) with special focus on its gene content, because this portion of the rye genome is an integrated part of several hundred bread wheat varieties worldwide. METHODOLOGY/PRINCIPAL FINDINGS: Multiple Displacement Amplification of 1RS DNA, obtained from flow-sorted 1RS chromosomes using a 1RS ditelosomic wheat-rye addition line, and subsequent Roche 454FLX sequencing of this DNA yielded 195,313,589 bp of sequence information. This quantity of sequence information resulted in 0.43× sequence coverage of the 1RS chromosome arm, permitting the identification of genes with an estimated probability of 95%. A detailed analysis revealed that more than 5% of the 1RS sequence consisted of gene space, identifying at least 3,121 gene loci representing 1,882 different gene functions. Repetitive elements comprised about 72% of the 1RS sequence, Gypsy/Sabrina (13.3%) being the most abundant. More than four thousand simple sequence repeat (SSR) sites, mostly located in gene-related sequence reads, were identified for possible marker development. The existence of chloroplast insertions in 1RS has been verified by identifying chimeric chloroplast-genomic sequence reads. Synteny analysis of 1RS to the full genomes of Oryza sativa and Brachypodium distachyon revealed that about half of the genes of 1RS correspond to the distal end of the short arm of rice chromosome 5 and the proximal region of the long arm of Brachypodium distachyon chromosome 2. Comparison of the gene content of 1RS to the 1HS barley chromosome arm revealed high conservation of genes related to chromosome 5 of rice. CONCLUSIONS: The present study revealed the gene content and potential gene functions on this chromosome arm and demonstrated numerous sequence elements like SSRs and gene-related sequences, which can be utilised for future research as well as in breeding of wheat and rye.
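
    The coverage figure in this abstract can be checked with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the covered-fraction estimate assumes the idealized Lander-Waterman model of uniformly random reads, which amplified, flow-sorted DNA may not satisfy. It does not reproduce the authors' 95% gene-identification estimate, whose method is not given in the abstract.

```python
import math


def implied_target_size(total_bases: int, coverage: float) -> float:
    """Target length (bp) implied by total sequenced bases and fold coverage."""
    return total_bases / coverage


def expected_fraction_covered(coverage: float) -> float:
    """Idealized (Lander-Waterman) fraction of bases hit by at least one read."""
    return 1.0 - math.exp(-coverage)


# Figures quoted in the abstract: 195,313,589 bp of reads at 0.43x coverage.
arm_bp = implied_target_size(195_313_589, 0.43)
print(f"implied 1RS arm size: {arm_bp / 1e6:.0f} Mbp")
print(f"expected fraction covered at 0.43x: {expected_fraction_covered(0.43):.2f}")
```

    Under these assumptions the quoted numbers imply an arm of roughly 454 Mbp, of which about a third would be touched by at least one read.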

  1. Cloning and sequencing of phenol oxidase 1 (pox1) gene from ...

    African Journals Online (AJOL)

    The gene (pox1) encoding a phenol oxidase 1 from Pleurotus ostreatus was sequenced and the corresponding pox1-cDNA was also synthesized, cloned and sequenced. The isolated gene is flanked by an upstream promoter region (399 bp) prior to the start codon (ATG). The putative metal-responsive elements ...

  2. Complete genome sequence of multidrug-resistant Staphylococcus cohnii ssp. urealyticus strain SNUDS-2 isolated from farmed duck, Republic of Korea.

    Science.gov (United States)

    Han, Jee Eun; Lee, Seungki; Jeong, Dae Gwin; Yoon, Sun-Woo; Kim, Doo-Jin; Lee, Moo-Seung; Kim, Hye Kwon; Park, Sung-Kyun; Kim, Ji Hyung; Park, Se Chang

    2017-09-01

    Staphylococcus cohnii has become increasingly recognized as a potential pathogen of clinically significant nosocomial and farm animal infections. This study was designed to determine the genome of a multidrug-resistant S. cohnii subsp. urealyticus strain SNUDS-2 isolated from a farmed duck in Korea. Genomic DNA was sequenced using the PacBio RS II system. The complete genome was annotated, and antimicrobial resistance and virulence genes were identified. The annotated 2,625,703 bp genome contained various antimicrobial resistance genes conferring resistance to β-lactams, aminoglycosides, fluoroquinolones, phenicols and trimethoprim. Three virulence-associated synergistic hemolysins were identified in the strain. To the best of our knowledge, this is the first complete genome of S. cohnii, and it will provide important insights into the biodiversity of CoNS and valuable information for the control of this emerging pathogen. Copyright © 2017 International Society for Chemotherapy of Infection and Cancer. Published by Elsevier Ltd. All rights reserved.

  3. Toward Understanding Phage:Host Interactions in the Rumen; Complete Genome Sequences of Lytic Phages Infecting Rumen Bacteria

    Directory of Open Access Journals (Sweden)

    Rosalind A. Gilbert

    2017-12-01

    The rumen is known to harbor dense populations of bacteriophages (phages) predicted to be capable of infecting a diverse range of rumen bacteria. While bacterial genome sequencing projects are revealing the presence of phages which can integrate their DNA into the genome of their host to form stable, lysogenic associations, little is known of the genetics of phages which utilize lytic replication. These phages infect and replicate within the host, culminating in host lysis and the release of progeny phage particles. While lytic phages for rumen bacteria have been previously isolated, their genomes have remained largely uncharacterized. Here we report the first complete genome sequences of lytic phage isolates specifically infecting three genera of rumen bacteria: Bacteroides, Ruminococcus, and Streptococcus. All phages were classified within the viral order Caudovirales and include two phage morphotypes, representative of the Siphoviridae and Podoviridae families. The phage genomes displayed modular organization, and conserved viral genes were identified which enabled further classification and determination of closest phage relatives. Co-examination of bacterial host genomes led to the identification of several genes responsible for modulating phage:host interactions, including CRISPR/Cas elements and restriction-modification phage defense systems. These findings provide new genetic information and insights into how lytic phages may interact with bacteria of the rumen microbiome.

  4. Complete mitochondrial genome of Cynopterus sphinx (Pteropodidae: Cynopterus).

    Science.gov (United States)

    Li, Linmiao; Li, Min; Wu, Zhengjun; Chen, Jinping

    2015-01-01

    We have characterized the complete mitochondrial genome of Cynopterus sphinx (Pteropodidae: Cynopterus) and described its organization in this study. The total length of the C. sphinx complete mitochondrial genome was 16,895 bp, with a base composition of 32.54% A, 14.05% G, 25.82% T and 27.59% C. The complete mitochondrial genome included 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes (12S rRNA and 16S rRNA) and 1 control region (D-loop). The control region was 1435 bp long and contained the sequence CATACG repeated 64 times. Three protein-coding genes (ND1, COI and ND4) ended with an incomplete stop codon (TA or T).
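
    As a quick consistency check on the figures above, the reported base composition can be summed and the GC content derived from it. This is a minimal sketch: the names are hypothetical, and the repeat-share estimate assumes the 64 CATACG copies are perfect and non-overlapping, which the abstract does not state.

```python
# Percent base composition as quoted in the abstract.
composition = {"A": 32.54, "G": 14.05, "T": 25.82, "C": 27.59}


def gc_percent(comp: dict) -> float:
    """GC content (%) from a percent base-composition table."""
    return comp["G"] + comp["C"]


total = sum(composition.values())
print(f"sum of base percentages: {total:.2f}")
print(f"GC content: {gc_percent(composition):.2f}%")

# Share of the 1435 bp control region taken up by 64 copies of CATACG
# (assumes perfect, non-overlapping copies).
repeat_bp = len("CATACG") * 64
print(f"repeat span: {repeat_bp} bp, {100 * repeat_bp / 1435:.1f}% of the control region")
```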

  5. Analysis of complete mitochondrial genome sequences increases phylogenetic resolution of bears (Ursidae), a mammalian family that experienced rapid speciation

    Directory of Open Access Journals (Sweden)

    Ryder Oliver A

    2007-10-01

    Background: Despite the small number of ursid species, bear phylogeny has long been a focus of study due to their conservation value, as all bear genera have been classified as endangered at either the species or subspecies level. The Ursidae family represents a typical example of rapid evolutionary radiation. Previous analyses with a single mitochondrial (mt) gene or a small number of mt genes either provide weak support or a large unresolved polytomy for ursids. We revisit the contentious relationships within Ursidae by analyzing complete mt genome sequences and evaluating the performance of both entire mt genomes and constituent mtDNA genes in recovering a phylogeny of extremely recent speciation events. Results: This mitochondrial genome-based phylogeny provides strong evidence that the spectacled bear diverged first, while within the genus Ursus, the sloth bear is the sister taxon of all the other five ursines. The latter group is divided into the brown bear/polar bear and the two black bears/sun bear assemblages. These findings resolve the previous conflicts between trees using partial mt genes. The ability of different categories of mt protein coding genes to recover the correct phylogeny is concordant with previous analyses for taxa with deep divergence times. This study provides a robust Ursidae phylogenetic framework for future validation by additional independent evidence, and also has significant implications for assisting in the resolution of other similarly difficult phylogenetic investigations. Conclusion: Identification of base composition bias and utilization of the combined data of whole mitochondrial genome sequences has allowed recovery of a strongly supported phylogeny that is upheld when using multiple alternative outgroups for the Ursidae, a mammalian family that underwent a rapid radiation since the mid- to late Pliocene. 
It remains to be seen if the reliability of mt genome analysis will hold up in studies of other

  6. Complete Genome Sequence of an Avian Metapneumovirus Subtype A Strain Isolated from Chicken (Gallus gallus) in Brazil.

    Science.gov (United States)

    Rizotto, Laís S; Scagion, Guilherme P; Cardoso, Tereza C; Simão, Raphael M; Caserta, Leonardo C; Benassi, Julia C; Keid, Lara B; Oliveira, Trícia M F de S; Soares, Rodrigo M; Arns, Clarice W; Van Borm, Steven; Ferreira, Helena L

    2017-07-20

    We report here the complete genome sequence of an avian metapneumovirus (aMPV) isolated from a tracheal tissue sample of a commercial layer flock. The complete genome sequence of aMPV-A/chicken/Brazil-SP/669/2003 was obtained using MiSeq (Illumina, Inc.) sequencing. Phylogenetic analysis of the complete genome classified the isolate as avian metapneumovirus subtype A. Copyright © 2017 Rizotto et al.

  7. Facilitating genome navigation : survey sequencing and dense radiation-hybrid gene mapping

    NARCIS (Netherlands)

    Hitte, C; Madeoy, J; Kirkness, EF; Priat, C; Lorentzen, TD; Senger, F; Thomas, D; Derrien, T; Ramirez, C; Scott, C; Evanno, G; Pullar, B; Cadieu, E; Oza; Lourgant, K; Jaffe, DB; Tacher, S; Dreano, S; Berkova, N; Andre, C; Deloukas, P; Fraser, C; Lindblad-Toh, K; Ostrander, EA; Galibert, F

    Accurate and comprehensive sequence coverage for large genomes has been restricted to only a few species of specific interest. Lower sequence coverage (survey sequencing) of related species can yield a wealth of information about gene content and putative regulatory elements. But survey sequences

  8. Advancing Eucalyptus genomics: identification and sequencing of lignin biosynthesis genes from deep-coverage BAC libraries

    Directory of Open Access Journals (Sweden)

    Kudrna David

    2011-03-01

    Background: Eucalyptus species are among the most planted hardwoods in the world because of their rapid growth, adaptability and valuable wood properties. The development and integration of genomic resources into breeding practice will be increasingly important in the decades to come. Bacterial artificial chromosome (BAC) libraries are key genomic tools that enable positional cloning of important traits, synteny evaluation, and the development of genome framework physical maps for genetic linkage and genome sequencing. Results: We describe the construction and characterization of two deep-coverage BAC libraries, EG_Ba and EG_Bb, obtained from nuclear DNA fragments of E. grandis (clone BRASUZ1) digested with HindIII and BstYI, respectively. Genome coverages of 17 and 15 haploid genome equivalents were estimated for EG_Ba and EG_Bb, respectively. Both libraries contained large inserts, with average sizes ranging from 135 kb (EG_Bb) to 157 kb (EG_Ba), and very low extra-nuclear genome contamination, providing a probability of finding a single copy gene ≥ 99.99%. Libraries were screened for the presence of several genes of interest via hybridizations to high-density BAC filters followed by PCR validation. Five selected BAC clones were sequenced and assembled using the Roche GS FLX technology, providing the whole sequence of the E. grandis chloroplast genome and complete genomic sequences of important lignin biosynthesis genes. Conclusions: The two E. grandis BAC libraries described in this study represent an important milestone for the advancement of Eucalyptus genomics and forest tree research. These BAC resources have a highly redundant genome coverage (> 15×), contain large average inserts and have a very low percentage of clones with organellar DNA or empty vectors. These publicly available BAC libraries are thus suitable for a broad range of applications in genetic and genomic research in Eucalyptus and possibly in related species of Myrtaceae
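
    The link between library coverage and the chance of recovering a single-copy gene can be illustrated with the standard Clarke-Carbon style approximation, P ≈ 1 − e^(−c) for c haploid genome equivalents. The abstract does not say how the authors computed their ≥ 99.99% figure, so this is a sketch under the assumption of randomly distributed clones, with hypothetical function names.

```python
import math


def prob_gene_in_library(coverage: float) -> float:
    """Approximate probability that a single-copy locus is in at least one clone.

    Assumes clones are sampled uniformly at random, so P = 1 - exp(-coverage).
    """
    return 1.0 - math.exp(-coverage)


def coverage_needed(probability: float) -> float:
    """Genome equivalents required to reach a target probability under the same model."""
    return -math.log(1.0 - probability)


for name, cov in (("EG_Ba", 17), ("EG_Bb", 15)):
    print(f"{name}: {cov}x coverage, P = {prob_gene_in_library(cov):.7f}")

print(f"coverage needed for P = 99.99%: {coverage_needed(0.9999):.1f}x")
```

    Under this model both libraries exceed the 99.99% threshold comfortably, since roughly 9.2 genome equivalents would already suffice.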

  9. Prediction of novel archaeal enzymes from sequence-derived features

    DEFF Research Database (Denmark)

    Jensen, Lars Juhl; Skovgaard, Marie; Brunak, Søren

    2002-01-01

    The completely sequenced archaeal genomes potentially encode, among their many functionally uncharacterized genes, novel enzymes of biotechnological interest. We have developed a prediction method for detection and classification of enzymes from sequence alone (available at http://www.cbs.dtu.dk/services/ArchaeaFun/). The method does not make use of sequence similarity; rather, it relies on predicted protein features like cotranslational and posttranslational modifications, secondary structure, and simple physical/chemical properties.

  10. The Bryopsis hypnoides plastid genome: multimeric forms and complete nucleotide sequence.

    Directory of Open Access Journals (Sweden)

    Fang Lü

    BACKGROUND: Bryopsis hypnoides Lamouroux is a siphonous green alga, and its extruded protoplasm can aggregate spontaneously in seawater and develop into mature individuals. The chloroplast of B. hypnoides is the biggest organelle in the cell and shows strong autonomy. To better understand this organelle, we sequenced and analyzed the chloroplast genome of this green alga. PRINCIPAL FINDINGS: A total of 111 functional genes, including 69 potential protein-coding genes, 5 ribosomal RNA genes, and 37 tRNA genes, were identified. The genome size (153,429 bp), arrangement, and inverted-repeat (IR)-lacking structure of the B. hypnoides chloroplast DNA (cpDNA) closely resemble those of Chlorella vulgaris. Furthermore, our cytogenomic investigations using pulsed-field gel electrophoresis (PFGE) and Southern blotting methods showed that the B. hypnoides cpDNA had multimeric forms, including monomer, dimer, trimer, tetramer, and even higher multimers, which is similar to the higher order organization observed previously for higher plant cpDNA. The relative amounts of the four multimeric cpDNA forms were estimated to be about 1, 1/2, 1/4, and 1/8 based on molecular hybridization analysis. Phylogenetic analyses based on a concatenated alignment of chloroplast protein sequences suggested that B. hypnoides is sister to all Chlorophyceae, and this placement received moderate support. CONCLUSION: All of the results suggest that the autonomy of the chloroplasts of B. hypnoides has little to do with the size and gene content of the cpDNA, and the IR-lacking structure of the chloroplasts indirectly demonstrated that the multimeric molecules might result from the random cleavage and fusion of replication intermediates instead of recombinational events.

  11. The complete chloroplast genome sequence of the CAM epiphyte Spanish moss (Tillandsia usneoides, Bromeliaceae) and its comparative analysis.

    Science.gov (United States)

    Poczai, Péter; Hyvönen, Jaakko

    2017-01-01

    Spanish moss (Tillandsia usneoides) is an epiphytic bromeliad widely distributed throughout tropical and warm temperate America. This plant is highly adapted to extreme environmental conditions. Striking features of this species include specialized trichomes (scales) covering the surface of its shoots, which aid the absorption of water and nutrients directly from the atmosphere, and a specific form of photosynthesis using crassulacean acid metabolism (CAM). Here we report the plastid genome of Spanish moss and present a comparison of genome organization and sequence evolution within Poales. The plastome of Spanish moss has a quadripartite structure consisting of a large single copy (LSC, 87,439 bp) region, two inverted repeat regions (IRa and IRb, 26,803 bp each) and a short single copy (SSC, 18,612 bp) region. The plastid genome had 37.2% GC content and 134 genes, of which 88 were unique protein-coding genes and 20 were duplicated in the IR, similar to other reported bromeliads. Our study shows that early diverging lineages of Poales do not have high substitution rates as compared to grasses, and plastid genomes of bromeliads show structural features considered to be ancestral in graminids. These include the loss of the introns in the clpP and rpoC1 genes and the complete loss or partial degradation of accD and ycf genes in the Graminid clade. Further structural rearrangements that are lacking in Spanish moss appeared in the graminids; these include a 28-kb inversion in the trnG-UCC-rps14 region and a 6-kb inversion in the trnG-UCC-psbD region, followed by a third <1-kb inversion in the trnT sequence.
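
    The quadripartite layout described above implies a total plastome length, which the abstract itself does not state. The following sketch simply adds up the reported regions, assuming that the quoted 26,803 bp applies to each of the two inverted repeat copies; the variable names are illustrative.

```python
# Region lengths (bp) as quoted in the abstract.
regions = {"LSC": 87_439, "IRa": 26_803, "IRb": 26_803, "SSC": 18_612}

# Total plastome length implied by the four regions.
total_bp = sum(regions.values())

# Share of the genome occupied by the two inverted repeat copies.
ir_share = 100 * (regions["IRa"] + regions["IRb"]) / total_bp

print(f"implied plastome length: {total_bp:,} bp")
print(f"inverted repeats: {ir_share:.1f}% of the plastome")
```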

  12. The complete nucleotide sequence, genome organization, and origin of human adenovirus type 11

    International Nuclear Information System (INIS)

    Stone, Daniel; Furthmann, Anne; Sandig, Volker; Lieber, Andre

    2003-01-01

    The complete DNA sequence and transcription map of human adenovirus type 11 are reported here. This is the first published sequence for a subgenus B human adenovirus and demonstrates a genome organization highly similar to those of other human adenoviruses. All of the genes from the early, intermediate, and late regions are present in the expected locations of the genome for a human adenovirus. The genome is 34,794 bp in length and has a GC content of 48.9%. Sequence alignment with genomes of groups A (Ad12), C (Ad5), D (Ad17), E (Simian adenovirus 25), and F (Ad40) revealed homologies of 64, 54, 68, 75, and 52%, respectively. Detailed genomic analysis demonstrated that Ads 11 and 35 are highly conserved in all areas except the hexon hypervariable regions and fiber. Similarly, comparison of Ad11 with subgroup E SAV25 revealed poor homology between fibers but high homology in proteins encoded by all other areas of the genome. We propose an evolutionary model in which functional viruses can be reconstituted following fiber substitution from one serotype to another. According to this model either the Ad11 genome is a derivative of Ad35, from which the fiber was substituted with Ad7, or the Ad35 genome is the product of a fiber substitution from Ad21 into the Ad11 genome. This model also provides a possible explanation for the origin of group E Ads, which are evolutionarily derived from a group C fiber substitution into a group B genome.

  13. Cloning and sequence analysis of chitin synthase gene fragments of Demodex mites

    Science.gov (United States)

    Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying

    2012-01-01

    To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi’an, China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was successful only in three D. canis isolates and one D. brevis isolate out of the nine Demodex isolates. The obtained fragments were sequenced and found to be 339 bp for D. canis and 338 bp for D. brevis. The CHS gene sequence similarities between the three Xi’an D. canis isolates and one Japanese D. canis isolate ranged from 99.7% to 100.0%, and those between four D. canis isolates and one D. brevis isolate were 99.1%–99.4%. Phylogenetic trees based on maximum parsimony (MP) and maximum likelihood (ML) methods shared the same clusters, in accordance with the traditional classification. Two open reading frames (ORFs) were identified in each CHS gene sequenced, and their corresponding amino acid sequences were located at the catalytic domain. The relatively conserved sequences could be deduced to be a CHS class A gene, which is associated with chitin synthesis in the integument of Demodex mites. PMID:23024043
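
    To give a sense of scale for the similarity values quoted above, the sketch below computes percent identity between two aligned sequences of equal length. It is an illustration only: the function name and the toy sequences are hypothetical, gaps are not handled, and the abstract does not say which similarity measure the authors used.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two equal-length, pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Toy example (made-up sequences): one mismatch in eight positions.
print(percent_identity("ACGTACGT", "ACGTACGA"))

# On a 339 bp fragment, a single mismatch already corresponds to about 99.7% identity.
print(round(100.0 * 338 / 339, 1))
```

    Read this way, the 99.7% to 100.0% range among the D. canis isolates corresponds to differences of roughly zero or one position on a fragment of this length.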

  14. The complete genome sequence of Bacillus velezensis strain GH1-13 reveals agriculturally beneficial properties and a unique plasmid.

    Science.gov (United States)

    Kim, Sang Yoon; Song, Hajin; Sang, Mee Kyung; Weon, Hang-Yeon; Song, Jaekyeong

    2017-10-10

    The bacterial strain Bacillus velezensis GH1-13, isolated from rice paddy soil in Korea, has been shown to promote plant growth and have strong antagonistic activities against pathogens. Here, we report the complete genome sequence of GH1-13, revealing that it possesses a single 4,071,980-bp circular chromosome with 46.2% GC content. The chromosome encodes 3,930 genes, and we have also identified a unique plasmid in the strain that encodes a further 104 genes (71,628 bp and 31.7% GC content). The genome was found to contain various enzyme-encoding operons, including those for indole-3-acetic acid (IAA) biosynthesis proteins, 2,3-butanediol dehydrogenase, various non-ribosomal peptide synthetases, and several polyketide synthases. These properties are responsible for the promotion of plant growth and the biosynthesis of secondary metabolites. They therefore have multiple beneficial effects that could be applied to agriculture. Through curing, we found that the unique plasmid of GH1-13 has important roles in the production of phytohormones, such as IAA, and in shaping phenotypic and physiological characteristics. The plasmid therefore likely influences the biological activities of GH1-13. The complete genome sequence of B. velezensis GH1-13 contributes to our understanding of this beneficial strain and will encourage research into its development for agricultural or biotechnological applications, enhancing productivity and crop quality. Copyright © 2017 Elsevier B.V. All rights reserved.
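
    The chromosome and plasmid figures above can be combined into whole-genome totals. The sketch below computes a length-weighted GC content and a total gene count; it assumes one copy each of the chromosome and the plasmid (plasmid copy number is not given in the abstract), and the data structure is purely illustrative.

```python
# (length in bp, GC fraction, gene count) as quoted in the abstract.
replicons = {
    "chromosome": (4_071_980, 0.462, 3_930),
    "plasmid": (71_628, 0.317, 104),
}

# Totals across both replicons, weighting GC content by replicon length.
total_bp = sum(length for length, _, _ in replicons.values())
gc_bases = sum(length * gc for length, gc, _ in replicons.values())
total_genes = sum(genes for _, _, genes in replicons.values())
overall_gc = 100 * gc_bases / total_bp

print(f"total length: {total_bp:,} bp")
print(f"length-weighted GC content: {overall_gc:.2f}%")
print(f"total genes: {total_genes:,}")
```

    Because the plasmid is under 2% of the total length, its low GC content shifts the overall value only slightly below the chromosomal 46.2%.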

  15. The complete mitochondrial genome of the deep-sea sponge Poecillastra laminaris (Astrophorida, Vulcanellidae).

    Science.gov (United States)

    Zeng, Cong; Thomas, Leighton J; Kelly, Michelle; Gardner, Jonathan P A

    2016-05-01

    The complete mitochondrial genome of a specimen of the deep-sea sponge Poecillastra laminaris (Sollas, 1886) (Astrophorida, Vulcanellidae), from the Colville Ridge, New Zealand, was sequenced using the 454 Life Science pyrosequencing system. To identify homologous mitochondrial sequences, the 454 reads were mapped to the complete mitochondrial genome sequence of Geodia neptuni (GenBank No. NC_006990). The P. laminaris genome is 18,413 bp in length and includes 14 protein-coding genes, 24 transfer RNA genes and 2 ribosomal RNA genes. Gene order resembled that of other demosponges. The base composition of the genome is A (29.1%), T (35.2%), C (14.0%) and G (21.7%). This is the second published mitogenome for a sponge of the order Astrophorida and will be useful in future phylogenetic analysis of deep-sea sponges.

  16. Complete genome sequence of a tomato infecting tomato mottle mosaic virus in New York

    Science.gov (United States)

    The complete genome sequence of an emerging isolate of tomato mottle mosaic virus (ToMMV) infecting experimental Nicotiana benthamiana plants in upstate New York was obtained using small RNA deep sequencing. ToMMV_NY-13 shared 99% sequence identity with ToMMV isolates from Mexico and Florida. Broader d...

  17. Complete Genome Sequence of a Double-Stranded RNA Virus from Avocado

    OpenAIRE

    Villanueva, Francisco; Sabanadzovic, Sead; Valverde, Rodrigo A.; Navas-Castillo, Jesús

    2012-01-01

    A number of avocado (Persea americana) cultivars are known to contain high-molecular-weight double-stranded RNA (dsRNA) molecules for which a viral nature has been suggested, although sequence data are not available. Here we report the cloning and complete sequencing of a 13.5-kbp dsRNA virus isolated from avocado and show that it corresponds to the genome of a new species of the genus Endornavirus (family Endornaviridae), tentatively named Persea americana endornavirus (PaEV).

  18. The complete mitochondrial genome of the common sea slater, Ligia oceanica (Crustacea, Isopoda) bears a novel gene order and unusual control region features

    Directory of Open Access Journals (Sweden)

    Podsiadlowski Lars

    2006-09-01

    Background: Sequence data and other characters from mitochondrial genomes (gene translocations, secondary structure of RNA molecules) are useful in phylogenetic studies among metazoan animals from population to phylum level. Moreover, the comparison of complete mitochondrial sequences gives valuable information about the evolution of small genomes, e.g. about different mechanisms of gene translocation, gene duplication and gene loss, or concerning nucleotide frequency biases. The Peracarida (gammarids, isopods, etc.) comprise about 21,000 species of crustaceans, living in many environments from deep sea floor to arid terrestrial habitats. Ligia oceanica is a terrestrial isopod living at rocky seashores of the European North Sea and Atlantic coastlines. Results: The study reveals the first complete mitochondrial DNA sequence from a peracarid crustacean. The mitochondrial genome of Ligia oceanica is a circular double-stranded DNA molecule, with a size of 15,289 bp. It shows several changes in mitochondrial gene order compared to other crustacean species. An overview of the mitochondrial gene order of all crustacean taxa yet sequenced is also presented. The largest non-coding part (the putative mitochondrial control region) of the mitochondrial genome of Ligia oceanica is unexpectedly not AT-rich compared to the remainder of the genome. It bears two repeat regions (4× 10 bp and 3× 64 bp), and a GC-rich hairpin-like secondary structure. Some of the transfer RNAs show secondary structures which derive from the usual cloverleaf pattern. While some tRNA genes are putative targets for RNA editing, trnR could not be localized at all. Conclusion: Gene order is not conserved among Peracarida, not even among isopods. The two isopod species Ligia oceanica and Idotea baltica show a similarly derived gene order, compared to the arthropod ground pattern and to the amphipod Parhyale hawaiiensis, suggesting that most of the translocation events were already

  19. Complete genome sequence of the thermophilic sulfate-reducing ocean bacterium Thermodesulfatator indicus type strain (CIR29812(T)).

    Science.gov (United States)

    Anderson, Iain; Saunders, Elizabeth; Lapidus, Alla; Nolan, Matt; Lucas, Susan; Tice, Hope; Del Rio, Tijana Glavina; Cheng, Jan-Fang; Han, Cliff; Tapia, Roxanne; Goodwin, Lynne A; Pitluck, Sam; Liolios, Konstantinos; Mavromatis, Konstantinos; Pagani, Ioanna; Ivanova, Natalia; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Jeffries, Cynthia D; Chang, Yun-Juan; Brambilla, Evelyne-Marie; Rohde, Manfred; Spring, Stefan; Göker, Markus; Detter, John C; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

    2012-05-25

    Thermodesulfatator indicus Moussard et al. 2004 is a member of the Thermodesulfobacteriaceae, a family in the phylum Thermodesulfobacteria that is currently poorly characterized at the genome level. Members of this phylum are of interest because they represent a distinct, deep-branching, Gram-negative lineage. T. indicus is an anaerobic, thermophilic, chemolithoautotrophic sulfate reducer isolated from a deep-sea hydrothermal vent. Here we describe the features of this organism, together with the complete genome sequence, and annotation. The 2,322,224 bp long chromosome with its 2,233 protein-coding and 58 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  20. Cloning, sequencing and variability analysis of the gap gene from Mycoplasma hominis

    DEFF Research Database (Denmark)

    Mygind, Tina; Jacobsen, Iben Søgaard; Melkova, Renata

    2000-01-01

    The gap gene encodes the glycolytic enzyme glyceraldehyde 3-phosphate dehydrogenase (GAPDH). The gene was cloned and sequenced from the Mycoplasma hominis type strain PG21(T). The intraspecies variability was investigated by inspection of restriction fragment length polymorphism (RFLP) patterns after polymerase chain reaction (PCR) amplification of the gap gene from 15 strains and furthermore by sequencing of part of the gene in eight strains. The M. hominis gap gene was found to vary more than the Escherichia coli counterpart, but the variation at nucleotide level gave rise to only a few...