WorldWideScience

Sample records for 28s gene sequences

  1. Fungal community structure in disease suppressive soils assessed by 28S LSU gene sequencing.

    Penton, C Ryan; Gupta, V V S R; Tiedje, James M; Neate, Stephen M; Ophel-Keller, Kathy; Gillings, Michael; Harvey, Paul; Pham, Amanda; Roget, David K

    2014-01-01

    Natural biological suppression of soil-borne diseases is a function of the activity and composition of soil microbial communities. Soil microbe and phytopathogen interactions can occur prior to crop sowing and/or in the rhizosphere, subsequently influencing both plant growth and productivity. Research on suppressive microbial communities has concentrated on bacteria although fungi can also influence soil-borne disease. Fungi were analyzed in co-located soils 'suppressive' or 'non-suppressive' for disease caused by Rhizoctonia solani AG 8 at two sites in South Australia using 454 pyrosequencing targeting the fungal 28S LSU rRNA gene. DNA was extracted from a minimum of 125 g of soil per replicate to reduce the micro-scale community variability, and from soil samples taken at sowing and from the rhizosphere at 7 weeks to cover the peak Rhizoctonia infection period. A total of ~994,000 reads were classified into 917 genera covering 54% of the RDP Fungal Classifier database, a high diversity for an alkaline, low organic matter soil. Statistical analyses and community ordinations revealed significant differences in fungal community composition between suppressive and non-suppressive soil and between soil type/location. The majority of differences associated with suppressive soils were attributed to fewer than 40 genera including a number of endophytic species with plant pathogen suppression potentials and mycoparasites such as Xylaria spp. Non-suppressive soils were dominated by Alternaria, Gibberella and Penicillium. Pyrosequencing generated a detailed description of fungal community structure and identified candidate taxa that may influence pathogen-plant interactions in stable disease suppression. PMID:24699870

  4. A combination of morphology and 28S rRNA gene sequences provide grouping and ranking criteria to merge eight into three Ambispora species (Ambisporaceae, Glomeromycota).

    Bills, Robert J; Morton, Joseph B

    2015-08-01

    Ambispora, the only genus in Ambisporaceae and one of three deeply rooted families in Archaeosporales, Glomeromycetes, is amended. Analysis of the morphology of specimens from types and living cultures and 28S ribosomal DNA (rDNA; LSU) sequences resulted in two major changes that redefined Ambispora to include only species with the potential for spore dimorphism (acaulosporoid and glomoid). First, species described as producing only glomoid spores (Ambispora leptoticha, Ambispora fecundispora, and Ambispora callosa), only acaulosporoid spores (Ambispora jimgerdemannii), or both spore morphotypes (Ambispora appendicula) were synonymized with a redefined dimorphic species, A. leptoticha. LSU sequences and more conserved SSU gene data indicated little divergence between genotypes formerly classified as separate species. Second, Ambispora fennica was synonymized with Ambispora gerdemannii based on morphological and LSU sequence variation equivalent to that measured in the sister clade A. leptoticha. With this analysis, Ambispora was reduced to three species: A. leptoticha, A. gerdemannii, and Ambispora granatensis. Morphological and molecular characters were given equal treatment in this study, as each data set informed and clarified grouping and ranking decisions. The two inner layers of the acaulosporoid spore wall were the only structural characters uniquely defining each of these three species; all other characters were shared. Phenotypes of glomoid spores were indistinguishable between species, and thus were informative only at the genus level. Distinct subclade structure of the LSU gene tree suggests fixation of discrete variants typical of clonal reproduction and possible retention of polymorphisms in rDNA repeats, so that not all discrete genetic variants are indicative of speciation. PMID:25638691

  5. Phylogenetic relationships of the marine Haplosclerida (Phylum Porifera) employing ribosomal (28S rRNA) and mitochondrial (cox1, nad1) gene sequence data.

    Redmond, Niamh E; Raleigh, Jean; van Soest, Rob W M; Kelly, Michelle; Travers, Simon A A; Bradshaw, Brian; Vartia, Salla; Stephens, Kelly M; McCormack, Grace P

    2011-01-01

    The systematics of the poriferan Order Haplosclerida (Class Demospongiae) has been under scrutiny for a number of years without resolution. Molecular data suggest that the order needs revision at all taxonomic levels. Here, we provide a comprehensive view of the phylogenetic relationships of the marine Haplosclerida using many species from across the order, and three gene regions. Gene trees generated using 28S rRNA, nad1 and cox1 gene data, under maximum likelihood and Bayesian approaches, are highly congruent and suggest the presence of four clades. Clade A is comprised primarily of species of Haliclona and Callyspongia; clade B is comprised of H. simulans and H. vansoesti (Family Chalinidae), Amphimedon queenslandica (Family Niphatidae) and Tabulocalyx (Family Phloeodictyidae); clade C is comprised primarily of members of the Families Petrosiidae and Niphatidae; and clade D is comprised of Aka species. The polyphyletic nature of the suborders, families and genera described in other studies is also found here. PMID:21931685

  6. Higher-level phylogeny of the Therevidae (Diptera: Insecta) based on 28S ribosomal and elongation factor-1 alpha gene sequences.

    Yang, L; Wiegmann, B M; Yeates, D K; Irwin, M E

    2000-06-01

    Therevidae (stiletto flies) are a little-known family of asiloid brachyceran Diptera (Insecta). Separate and combined phylogenetic analyses of 1200 bases of the 28S ribosomal DNA and 1100 bases of elongation factor-1alpha were used to infer phylogenetic relationships within the family. The position of the enigmatic taxon Apsilocephala Kröber is evaluated in light of the molecular evidence. In all analyses, molecular data strongly support the monophyly of Therevidae, excluding Apsilocephala, and the division of Therevidae into two main clades corresponding to a previous classification of the family into the subfamilies Phycinae and Therevinae. Despite strong support for some relationships within these groups, relationships at the base of the two main clades are weakly supported. Short branch lengths for Australasian clades at the base of the Therevinae may represent a rapid radiation of therevids in Australia. PMID:10860652

  7. Intragenomic sequence variation at the ITS1 - ITS2 region and at the 18S and 28S nuclear ribosomal DNA genes of the New Zealand mud snail, Potamopyrgus antipodarum (Hydrobiidae: Mollusca)

    Hoy, Marshal S.; Rodriguez, Rusty J.

    2013-01-01

    Molecular genetic analysis was conducted on two populations of the invasive non-native New Zealand mud snail (Potamopyrgus antipodarum), one from a freshwater ecosystem in Devil's Lake (Oregon, USA) and the other from an ecosystem of higher salinity in the Columbia River estuary (Hammond Harbor, Oregon, USA). To elucidate potential genetic differences between the two populations, three segments of nuclear ribosomal DNA (rDNA), the ITS1-ITS2 regions and the 18S and 28S rDNA genes were cloned and sequenced. Variant sequences within each individual were found in all three rDNA segments. Folding models were utilized for secondary structure analysis and results indicated that there were many sequences which contained structure-altering polymorphisms, which suggests they could be nonfunctional pseudogenes. In addition, analysis of molecular variance (AMOVA) was used for hierarchical analysis of genetic variance to estimate variation within and among populations and within individuals. AMOVA revealed significant variation in the ITS region between the populations and among clones within individuals, while in the 5.8S rDNA significant variation was revealed among individuals within the two populations. High levels of intragenomic variation were found in the ITS regions, which are known to be highly variable in many organisms. More interestingly, intragenomic variation was also found in the 18S and 28S rDNA, which has rarely been observed in animals and is so far unreported in Mollusca. We postulate that in these P. antipodarum populations the effects of concerted evolution are diminished due to the fact that not all of the rDNA genes in their polyploid genome should be essential for sustaining cellular function. This could lead to a lessening of selection pressures, allowing mutations to accumulate in some copies, changing them into variant sequences.                   

  8. Identification of Dermatophyte Species by 28S Ribosomal DNA Sequencing with a Commercial Kit

    Ninet, Béatrice; Jan, Isabelle; Bontems, Olympia; Léchenne, Barbara; Jousson, Olivier; Panizzon, Renato; Lew, Daniel; Monod, Michel

    2003-01-01

    We have shown that dermatophyte species can be easily identified on the basis of a DNA sequence encoding a part of the large-subunit (LSU) rRNA (28S rRNA) by using the MicroSeq D2 LSU rRNA Fungal Sequencing Kit. Two taxa causing distinct dermatophytoses were clearly distinguished among isolates of the Trichophyton mentagrophytes species complex. PMID:12574293

  10. IDENTIFICATION OF THREE FRUIT-ROT FUNGI OF BANANA BY 28S RIBOSOMAL DNA SEQUENCING

    Supriya Sarkar, S Girisham and SM Reddy

    2013-01-01

    The aim of the present investigation was to identify three fruit-rot fungi, Macrophomina phaseolina (Tassi) Goid, Fusarium oxysporum (Schlechtend) and Nigrospora oryzae (Berk and Br.) Petch, isolated from banana fruits [Rasthali (Silk AAB) and Cavendish (AAA) varieties]. Of the different fungal genera isolated, these fungi were responsible for the greatest loss of banana fruits, as they spread rapidly into the fruit pulp and deteriorated the fruits. A fragment of the D2 region of the LSU (large subunit) 28S rDNA gene of the three fungi under study was amplified by PCR. Based on nucleotide homology and phylogenetic analysis, the fungus M. phaseolina was identified as M. phaseolina strain R-4242 sp. (GenBank accession number: FJ415068.1), F. oxysporum as Fusarium sp. QJC-1403 5.8S ribosomal RNA gene sp. (GenBank accession number: EU193176.1) and N. oryzae as N. oryzae NRRL: 54030 sp. (GenBank accession number: GQ328855.1). Nucleic acid sequencing provides more objective separation of genera and species than conventional techniques. It is best used for identifying organisms that cannot be identified satisfactorily by their microscopic morphological features. Genetic characterization of plant pathogens prevalent in an area is necessary for efficient management and increased crop productivity. The data presented may help researchers to understand host-pathogen interactions in banana in detail and to design effective strategies for deploying resistance genes in banana (Musa paradisiaca L.) growing regions in the country and worldwide.

  11. Genetic relationship between Neobenedenia girellae and N. melleni inferred from 28S rRNA sequences

    WANG Jun; ZHANG Wen; SU Yongquan; DING Shaoxiong

    2004-01-01

    Fragments of about 350 bp of the 28S rRNA gene from two closely related monogenean trematodes, Neobenedenia girellae and N. melleni, were amplified by polymerase chain reaction (PCR) with a pair of specific primers and then sequenced. N. girellae and N. melleni, parasitizing different fishes on different farms, differed at only one base in 337 bp, a genetic difference of 0.3% (a similarity of 99.7%). This similarity is higher than the 99.41% found within the single species N. melleni sampled in different areas, for which the intraspecific divergence is 0.59%. The interspecific differences between the two Neobenedenia and three Benedenia species (B. lutjani, B. rohdei and B. seriolae) range from 2.08% to 11.73%. In addition, UPGMA and MP molecular phylogenetic trees were constructed and proved to be consistent with each other. Although the two Neobenedenia are highly similar in both morphology and 28S rRNA sequence, whether they belong to a single species remains unresolved; more genes should be investigated, in combination with systematic and detailed morphological study.
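    The percentages in this abstract follow from simple pairwise arithmetic. The sketch below is illustrative only: the function name `p_distance` and the toy sequences are assumptions, and the only figures taken from the record are the single differing base in 337 compared positions.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise distance: differing sites / compared sites.

    Assumes the two sequences are already aligned (equal length);
    positions where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differences / compared


# Hypothetical toy alignment, not data from the study.
toy = p_distance("ACGT", "ACGA")  # 1 difference in 4 sites

# Figures reported in the abstract: 1 differing base in 337 bp.
divergence = 1 / 337
print(f"divergence: {100 * divergence:.1f}%")        # 0.3%
print(f"similarity: {100 * (1 - divergence):.1f}%")  # 99.7%
```

    The same calculation applied within N. melleni would give the 0.59% intraspecific divergence only if the underlying counts were known; the abstract reports the percentage but not the counts.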

  12. Reconstruction of phylogenetic relationships in dermatomycete genus Trichophyton Malmsten 1848 based on ribosomal internal transcribed spacer region, partial 28S rRNA and beta-tubulin genes sequences.

    Pchelin, Ivan M; Zlatogursky, Vasily V; Rudneva, Mariya V; Chilina, Galina A; Rezaei-Matehkolaei, Ali; Lavnikevich, Dmitry M; Vasilyeva, Natalya V; Taraskina, Anastasia E

    2016-09-01

    Trichophyton spp. are important causative agents of superficial mycoses. The phylogeny of the genus and accurate strain identification, based on the ribosomal ITS region sequencing, are still under development. The present work is aimed at (i) inferring the genus phylogeny from partial ITS, LSU and BT2 sequences (ii) description of ribosomal ITS region polymorphism in 15 strains of Trichophyton interdigitale. We performed DNA sequence-based species identification and phylogenetic analysis on 48 strains belonging to the genus Trichophyton. Phylogenetic relationships were inferred by maximum likelihood and Bayesian methods on concatenated ITS, LSU and BT2 sequences. Ribosomal ITS region polymorphisms were assessed directly on the alignment. By phylogenetic reconstruction, we reveal major anthropophilic and zoophilic species clusters in the genus Trichophyton. We describe several sequences of the ITS region of T. interdigitale, which do not fit in the traditional polymorphism scheme and propose emendations in this scheme for discrimination between ITS sequence types in T. interdigitale. The new polymorphism scheme will allow inclusion of a wider spectrum of isolates while retaining its explanatory power. This scheme was also found to be partially congruent with NTS typing technique. PMID:27071492

  13. Inferring a classification of the Adenophorea (Nematoda) from nucleotide sequences of the D3 expansion segment (26/28s rDNA)

    Litvaitis, M.K.; Bates, J.W.; Hope, W. D.; Moens, T.

    2000-01-01

    Nucleotide sequences of the D3 expansion segment of the 28S rDNA gene were used to reconstruct evolutionary relationships within the Adenophorea. Neighbor-joining and parsimony analyses of representatives of most major taxa revealed a paraphyletic Adenophorea (p = 0.0005). Within Adenophorea, the Enoplia, Enoplida, and Enoplina were paraphyletic (p = 0.0024, 0.0014, and 0.0120, respectively). A major division was evident within the Enoplida, with one lineage consisting of a basal Thoracostomo...

  14. Evolutionary History of the Chaetognaths Inferred from Actin and 18S-28S rRNA Paralogous Genes

    J.P. Casanova

    2006-01-01

    The chaetognaths constitute a small and enigmatic phylum of marine invertebrates whose phylogenetic affinities remain uncertain. Our phylogenetic investigations, inferred from partial paralogous 18S-28S rRNA genes, suggest that the event resulting in the presence of two classes of rRNA genes occurred approximately 300-400 million years ago, prior to the radiation of extant chaetognaths, whereas the taxon, according to both molecular and paleontological data, dates from at least the Early Cambrian. These divergent rRNA genes could be the result of a whole ribosomal cluster duplication or of an allopolyploid event during a crisis period, since fossils are lacking from the post-Carboniferous period (ca. 300 million years). In addition, the actin phylogeny showed that the cytoplasmic chaetognath actin clustered with the cytoplasmic insect actins, while the muscular chaetognath actins are placed basal to all muscular vertebrate actins. The present study suggests that gene conversion mechanisms could be inefficient in this taxon; this could explain the conservation of extremely divergent paralogous sequences in the chaetognath genomes, which could be correlated with the difficulty of identifying a sister group for chaetognaths among metazoans.

  15. DISCRIMINATION 28S RIBOSOMAL GENE OF TREMATODE CERCARIAE IN SNAILS FROM CHIANG MAI PROVINCE, THAILAND.

    Wongsawad, Chalobol; Wongsawad, Pheravut; Sukontason, Kom; Phalee, Anawat; Noikong-Phalee, Waraporn; Chai, Jong Yil

    2016-03-01

    Trematode cercariae are commonly found in many freshwater gastropods. These cercariae can serve to identify the occurrence of such trematodes as Centrocestus formosanus, Haplorchis taichui, Haplorchoides sp, and Stellantchasmus falcatus, which are important parasites in Chiang Mai Province, Thailand. As the species of these cercariae cannot be identified accurately based on morphology, this study employed sequencing of a fragment of 28S ribosomal DNA and phylogenetic analysis to identify the trematode cercariae found in freshwater gastropods in Chiang Mai Province. Eight types of trematode cercariae were identified, namely, distome cercaria (grouped with Philophthalmus spp clade), echinostome cercaria (grouped with Echinostoma spp clade), furcocercous cercaria (grouped with Posthodiplostomum sp/Alaria taxideae/Hysteromorpha triloba clade), monostome cercaria (grouped with Catatropis indicus clade), parapleurolophocercous cercaria (grouped with Haplorchoides sp clade), pleurolophocercous cercaria (grouped with Centrocestus formosanus clade), transversotrema cercaria (grouped with Transversotrema spp clade), and xiphidiocercaria (grouped with Prosthodendrium spp clade). These results provide important information that can be used for identifying these parasites in epidemiological surveys. PMID:27244956

  16. Phylogenetic Relationships of Tribes Within Harpalinae (Coleoptera: Carabidae) as Inferred from 28S Ribosomal DNA and the Wingless Gene

    Ober, Karen A; Maddison, David R.

    2008-01-01

    Harpalinae is a large, monophyletic subfamily of carabid ground beetles containing more than 19,000 species in approximately 40 tribes. The higher level phylogenetic relationships within harpalines were investigated based on nucleotide data from two nuclear genes, wingless and 28S rDNA. Phylogenetic analyses of combined data indicate that many harpaline tribes are monophyletic, however the reconstructed trees showed little support for deeper nodes. In addition, our results suggest that the Le...

  17. D2 Region of the 28S RNA Gene: A Too-Conserved Fragment for Inferences on Phylogeny of South American Triatomines.

    Guerra, Ana Letícia; Alevi, Kaio Cesar Chaboli; Banho, Cecília Artico; de Oliveira, Jader; da Rosa, João Aristeu; Vilela de Azeredo-Oliveira, Maria Tercília

    2016-09-01

    The brasiliensis complex is composed of five triatomine species, and different approaches suggest that Triatoma lenti and Triatoma petrochiae may be new members. Therefore, this study sought to analyze the phylogenetic relationships within this complex by means of the D2 region of the 28S RNA gene, and to analyze the degree of polymorphism and phylogenetic significance of this gene for South American triatomines. Phylogenetic analysis using sequence fragments of the D2 domain did not permit phylogenetic inferences on species within the brasiliensis complex, because the gene alignment, a matrix of 37 specimens, exhibited only two variable sites along the 567 base pairs used. Furthermore, when all South American species were included, only four variable sites were detected, reflecting the high degree of gene conservation. Therefore, we do not recommend the use of this gene for phylogenetic reconstruction for this group of Chagas disease vectors. PMID:27382073
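    Counting variable sites, the statistic on which this conclusion rests, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `count_variable_sites` and the toy alignment are hypothetical, and treating gaps and ambiguity codes as missing data is an assumption. Only the figure of 2 variable sites in 567 bp comes from the record.

```python
def count_variable_sites(alignment: list[str]) -> int:
    """Count alignment columns that contain more than one distinct base.

    Gaps ('-'), 'N' and '?' are treated as missing data and ignored
    (an assumption; other conventions exist).
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    variable = 0
    for column in zip(*alignment):
        bases = {base.upper() for base in column if base not in "-Nn?"}
        if len(bases) > 1:
            variable += 1
    return variable


# Hypothetical toy alignment, not data from the study.
toy_alignment = ["ACGTAC", "ACGTAC", "ACTTAC", "ACGTAA"]
toy_variable = count_variable_sites(toy_alignment)

# Figures reported in the abstract: 2 variable sites along 567 bp.
reported_fraction = 2 / 567
print(f"variable sites: {100 * reported_fraction:.2f}% of the alignment")
```

    With so small a fraction of variable columns, almost no sites can separate the species, which is consistent with the authors' recommendation against using this fragment for the group.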

  18. Phylogenetic analysis of ten species of five genera of Buccinidae from the Chinese coast based on 28S rRNA gene

    DONG Chang-Yong; Hou, Lin; Sui, Na; Zhang, Yun; WANG Ming-Chang; Li, Yan

    2008-01-01

    It has been reported that there are 31 species in 13 genera of the family Buccinidae, distributed along the Chinese coast, but their taxonomic status is still controversial. In the present paper, phylogenetic relationships among ten species in five genera of Buccinidae from the Liaoning, Shandong and Fujian coast and ten species in five genera from the Chinese coast were examined using partial large ribosome subunit 28S rRNA sequences. An approximately 1400 bp fragment of the 28S rRNA gene was o...

  19. Comparative Analysis of 18S and 28S rDNA Sequences of Schistosoma japonicum from Mainland China, the Philippines and Japan

    G.H. Zhao

    2011-01-01

    In the present study, a portion of the 18S and 28S ribosomal DNA (rDNA) sequences of 35 Schistosoma japonicum isolates representing three geographical strains from mainland China, the Philippines and Japan were amplified and compared, and phylogenetic relationships were reconstructed by the Unweighted Pair-Group Method with Arithmetic averages (UPGMA) using combined 18S and 28S rDNA sequences as well as the corresponding sequences of other species belonging to the Schistosoma genus available in the public database. The results indicated that the partial 18S and 28S rDNA sequences of all S. japonicum isolates were 745 and 618 bp, respectively, and displayed low genetic variation among S. japonicum strains and isolates. Phylogenetic analysis revealed that the combined 18S and 28S rDNA sequences were not able to distinguish S. japonicum isolates from three geographical origins but provided an effective molecular marker for the inter-species phylogenetic analysis and differential identification of different Schistosoma species.

  20. Phylogenetic Relationships of Two Earth Tiger Tarantulas, Haplopelma lividum and H. longipes (Araneae, Theraphosidae), within the Infraorder Mygalomorph Using 28S Ribosomal DNA Sequences

    Arin Ngamniyom

    2014-01-01

    Haplopelma lividum and H. longipes (Araneae: Mygalomorphae: Theraphosidae) are tarantulas that are distributed throughout Southeast Asia and are important carnivorous predators in ecological systems. The present study aimed to examine the phylogenetic relationships between Mygalomorph spiders using 28S ribosomal DNA sequences. The molecular results supported the placement of both species within a common theraphosid taxon. However, when considering relationships between Haplopelma spp. and related genera, H. schmidti, H. lividum and H. longipes were not monophyletic, suggesting that molecular data are incongruent with phylogenies based on morphological characteristics. These results provide molecular data to help elucidate the phylogenetic relationships between theraphosid tarantulas.

  1. Cloning of a 28S rRNA gene fragment of Trichinella spiralis and its application in taxonomy

    李成; 魏颖; 袁金钱; 宋铭忻

    2011-01-01

    To investigate the classification of a Trichinella swine isolate collected in Heilongjiang Province, a gene fragment of the ribosomal 28S rRNA sequence was cloned by PCR and sequenced. Sequence analysis showed that the Heilongjiang swine isolate is closely related to Trichinella spiralis (T1) and it was identified as Trichinella spiralis. The result is largely consistent with the traditional classification and provides a new theoretical basis for traditional taxonomic methods.

  3. Phylogenetic analysis of the spider mite sub-family Tetranychinae (Acari: Tetranychidae) based on the mitochondrial COI gene and the 18S and the 5' end of the 28S rRNA genes indicates that several genera are polyphyletic.

    Tomoko Matsuda

    The spider mite sub-family Tetranychinae includes many agricultural pests. The internal transcribed spacer (ITS) region of nuclear ribosomal RNA genes and the cytochrome c oxidase subunit I (COI) gene of mitochondrial DNA have been used for species identification and phylogenetic reconstruction within the sub-family Tetranychinae, although they have not always been successful. The 18S and 28S rRNA genes should be more suitable for resolving higher levels of phylogeny, such as tribes or genera of Tetranychinae, because these genes evolve more slowly and are made up of conserved regions and divergent domains. Therefore, we used both the 18S (1,825-1,901 bp) and 28S (the 5' end of 646-743 bp) rRNA genes to infer phylogenetic relationships within the sub-family Tetranychinae with a focus on the tribe Tetranychini. Then, we compared the phylogenetic tree of the 18S and 28S genes with that of the mitochondrial COI gene (618 bp). As observed in previous studies, our phylogeny based on the COI gene was not resolved because of the low bootstrap values for most nodes of the tree. On the other hand, our phylogenetic tree of the 18S and 28S genes revealed several well-supported clades within the sub-family Tetranychinae. The 18S and 28S phylogenetic trees suggest that the tribes Bryobiini, Petrobiini and Eurytetranychini are monophyletic and that the tribe Tetranychini is polyphyletic. At the genus level, six genera for which more than two species were sampled appear to be monophyletic, while four genera (Oligonychus, Tetranychus, Schizotetranychus and Eotetranychus) appear to be polyphyletic. The topology presented here does not fully agree with the current morphology-based taxonomy, so the diagnostic morphological characters of Tetranychinae need to be reconsidered.

  4. Inhibition of deoxyribonucleic acid transcription by ultraviolet irradiation in mammalian cells: determination of the transcriptional linkage of the 18S and 28S ribosomal ribonucleic acid genes

    The inhibition of deoxyribonucleic acid (DNA) transcription in mammalian cells by ultraviolet irradiation has been studied. The reduction in the rates and the amounts of total ribonucleic acid (RNA) synthesis and of 18S, 28S, and 45S ribosomal RNA (rRNA) synthesis, in tissue cultured mouse L cells, was examined as a function of ultraviolet dose and time after ultraviolet irradiation. Total RNA synthesis in the ultraviolet irradiated L cell was found to decrease as a function of ultraviolet dose. The rates of synthesis for the 18S and 28S rRNAs and the 45S precursor RNA decreased exponentially with ultraviolet dose; the respective D37 values were 310 erg/mm², 130 erg/mm², and 90 erg/mm². Ultraviolet inactivation kinetics of rRNA synthesis in HeLa cells indicated that, as in L cells, each 45S rRNA transcriptional unit has its own promoter, and that the 18S rRNA cistron is promoter proximal and the 28S rRNA cistron is promoter distal. All of the above findings support the hypothesis that irradiation of mammalian cells with ultraviolet light causes the formation of lesions on the DNA templates which result in premature termination of transcription. (U.S.)
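    The D37 values quoted above can be read through a simple exponential inactivation model. The sketch below is an assumption-laden illustration, not the study's analysis: it assumes single-hit kinetics, in which the surviving rate of synthesis is exp(-dose / D37) and the apparent UV target size is inversely proportional to D37. The function names are hypothetical; only the three D37 values come from the record.

```python
import math

# D37 values reported in the abstract (erg/mm^2).
D37 = {"18S rRNA": 310.0, "28S rRNA": 130.0, "45S precursor": 90.0}


def surviving_fraction(dose: float, d37: float) -> float:
    """Fraction of synthesis remaining after a UV dose, assuming
    single-hit exponential inactivation (an assumption of this sketch)."""
    return math.exp(-dose / d37)


def relative_target_size(d37: float, reference_d37: float) -> float:
    """Apparent target size relative to a reference species.

    Under the single-hit assumption, target size scales as 1 / D37.
    """
    return reference_d37 / d37


# By definition, a dose equal to D37 leaves a fraction 1/e (about 37%).
at_d37 = surviving_fraction(130.0, D37["28S rRNA"])

# Apparent target sizes relative to 18S rRNA.
for name, d37 in D37.items():
    size = relative_target_size(d37, D37["18S rRNA"])
    print(f"{name}: relative target size {size:.2f}")
```

    Under these assumptions the 28S rRNA behaves as a target roughly 2.4 times larger than the 18S rRNA, in line with the abstract's conclusion that the 18S cistron is promoter proximal and the 28S cistron promoter distal.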

  5. Basal divergence of Eriophyoidea (Acariformes, Eupodina) inferred from combined partial COI and 28S gene sequences and CLSM genital anatomy.

    Chetverikov, P E; Cvrković, T; Makunin, A; Sukhareva, S; Vidović, B; Petanović, R

    2015-10-01

    Eriophyoids are an ancient group of highly miniaturized, morphologically simplified and diverse phytoparasitic mites. Their possible numerous host-switch events have been accompanied by considerable homoplastic evolution. Although several morphological cladistic and molecular phylogenetic studies attempted to reconstruct phylogeny of Eriophyoidea, the major lineages of eriophyoids, as well as the evolutionary relationships between them, are still poorly understood. New phylogenetically informative data have been provided by the recent discovery of the early derivative pentasetacine genus Loboquintus, and observations on the eriophyoid reproductive anatomy. Herein, we use COI and D1-2 rRNA data of 73 eriophyoid species (including early derivative pentasetacines) from Europe, the Americas and South Africa to reconstruct part of the phylogeny of the superfamily, and infer on the basal divergence of eriophyoid taxa. In addition, a comparative CLSM study of the female internal genitalia was undertaken in order to find putative apomorphies, which can be used to improve the taxonomy of Eriophyoidea. The following molecular clades, marked by differences in genital anatomy and prodorsal shield setation, were found in our analyses: Loboquintus(Pentasetacus((Eriophyidae + Diptilomiopidae)(Phytoptidae-1, Phytoptidae-2))). The results of this study suggest that the superfamily Eriophyoidea comprises basal paraphyletic pentasetacines (Loboquintus and Pentasetacus), and two large monophyletic groups: Eriophyidae s.l. [containing paraphyletic Eriophyidae sensu Amrine et al. 2003 (=Eriophyidae s.str.) and Diptilomiopidae sensu Amrine et al. 2003] and Phytoptidae s.l. [containing monophyletic Phytoptidae sensu Boczek et al. 1989 (=Phytoptidae s.str.) and Nalepellidae sensu Boczek et al. 1989]. Putative morphological apomorphies (including genital and gnathosomal characters) supporting the clades revealed in molecular analyses are briefly discussed. PMID:26126634

  6. Repetitive sequence environment distinguishes housekeeping genes

    Eller, C. Daniel; Regelson, Moira; Merriman, Barry; Nelson, Stan; Horvath, Steve; Marahrens, York

    2006-01-01

    Housekeeping genes are expressed across a wide variety of tissues. Since repetitive sequences have been reported to influence the expression of individual genes, we employed a novel approach to determine whether housekeeping genes can be distinguished from tissue-specific genes by their repetitive sequence context. We show that Alu elements are more highly concentrated around housekeeping genes while various longer (>400-bp) repetitive sequences ("repeats"), including Long Interspersed Nuclear E...

  7. cis sequence effects on gene expression

    Jacobs Kevin

    2007-08-01

    Background: Sequence and transcriptional variability within and between individuals are typically studied independently. The joint analysis of sequence and gene expression variation (genetical genomics) provides insight into the role of linked sequence variation in the regulation of gene expression. We investigated the role of sequence variation in cis on gene expression (cis sequence effects) in a group of genes commonly studied in cancer research in lymphoblastoid cell lines. We estimated the proportion of genes exhibiting cis sequence effects and the proportion of gene expression variation explained by cis sequence effects using three different analytical approaches, and compared our results to the literature. Results: We generated gene expression profiling data at N = 697 candidate genes from N = 30 lymphoblastoid cell lines for this study and used available candidate gene resequencing data at N = 552 candidate genes to identify N = 30 candidate genes with sufficient variance in both datasets for the investigation of cis sequence effects. We used two additive models and the haplotype phylogeny scanning approach of Templeton (Tree Scanning) to evaluate association between individual SNPs, all SNPs at a gene, and diplotypes, with log-transformed gene expression. SNPs and diplotypes at eight candidate genes exhibited statistically significant (p cis sequence effects in our study, respectively. Conclusion: Based on analysis of our results and the extant literature, one in four genes exhibits significant cis sequence effects, and for these genes, about 30% of gene expression variation is accounted for by cis sequence variation. Despite diverse experimental approaches, the presence or absence of significant cis sequence effects is largely supported by previously published studies.

  8. Phylogenetic analysis of three species of Encarsia (Hymenoptera: Aphelinidae) parasitizing Bemisia tabaci (Hemiptera: Aleyrodidae) in China based on their 28S rRNA gene

    薛夏; 彭伟录; Muhammad Z. AHMED; Nasser S. MANDOUR; 任顺祥; Andrew G. S. CUTHBERTSON; 邱宝利

    2012-01-01

    Encarsia Förster includes important parasitoids of the whitefly pest Bemisia tabaci, among them E. bimaculata, E. formosa and E. sophia, the three most important aphelinid parasitoids in China. Eight populations of Encarsia from the South, Southeast, North and Southwest of China, as well as two populations from Malaysia and Egypt, respectively, were collected in the present study, and their interspecies phylogenetic relationships were analyzed based on the 28S rRNA D2 and D3 expansion regions. The D2 and D3 regions were consistent with each other and confirmed a closer genetic relationship between E. sophia and E. bimaculata, which both belong to the Encarsia strenua species group, than between either of these two species and E. formosa. Genetic distance analysis using 28S rRNA D2 sequences revealed some genetic divergence within single Encarsia species. The Guangzhou population of E. sophia is closer to populations from Australia, Spain, Egypt and Ethiopia than to the population from Thailand. E. bimaculata populations from Sudan, Egypt and Guatemala, as well as one population from Australia, cluster together, while the E. formosa Hengshui and Kunming populations cluster with those from the USA, UK and Greece but are further from the Egypt population. The reasons for the inconsistency between the genetic and geographical distances of the Encarsia species are discussed.

  9. Nucleotide sequence of Klebsiella pneumoniae lac genes.

    Buvinger, W E; Riley, M

    1985-01-01

    The nucleotide sequences of the Klebsiella pneumoniae lacI and lacZ genes and part of the lacY gene were determined, and these genes were located and oriented relative to one another. The K. pneumoniae lac operon is divergent in that the lacI and lacZ genes are oriented head to head, and complementary strands are transcribed. Besides base substitutions, the lacZ genes of K. pneumoniae and Escherichia coli have suffered short distance shifts of reading frame caused by additions or deletions or...

  10. Network of tRNA Gene Sequences

    WEI Fang-ping; LI Sheng; MA Hong-ru

    2008-01-01

    A network of 3719 tRNA gene sequences was constructed using simplest alignment. Its topology, degree distribution and clustering coefficient were studied. The behaviors of the network shift from fluctuated distribution to scale-free distribution when the similarity degree of the tRNA gene sequences increases. The tRNA gene sequences with the same anticodon identity are more self-organized than those with different anticodon identities and form local clusters in the network. Some vertices of the local cluster have a high connection with other local clusters, and the probable reason was given. Moreover, a network constructed by the same number of random tRNA sequences was used to make comparisons. The relationships between the properties of the tRNA similarity network and the characters of tRNA evolutionary history were discussed.
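
    The construction described above can be illustrated with a minimal Python sketch. It assumes a gap-free, position-by-position identity score as a stand-in for the "simplest alignment" (the record does not specify the scoring), and the function names, toy sequences and threshold are hypothetical.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of matching positions, compared without gaps.

    Assumption: this stands in for the record's "simplest alignment".
    """
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def build_network(seqs: dict, threshold: float) -> dict:
    """Link two sequences when their identity is at least `threshold`."""
    graph = {name: set() for name in seqs}
    for u, v in combinations(seqs, 2):
        if identity(seqs[u], seqs[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def degree_distribution(graph: dict) -> dict:
    """Map each degree to the number of vertices having that degree."""
    counts = {}
    for neighbours in graph.values():
        counts[len(neighbours)] = counts.get(len(neighbours), 0) + 1
    return counts


def clustering_coefficient(graph: dict, node: str) -> float:
    """Fraction of a vertex's neighbour pairs that are themselves linked."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))


# Toy sequences (hypothetical, not real tRNA genes).
toy = {
    "a": "GCGGATTTAG",
    "b": "GCGGATTTAC",
    "c": "GCGGATTTAA",
    "d": "TTTTCCCCGG",
}
network = build_network(toy, threshold=0.9)
```

    Raising or lowering `threshold` changes which edges survive, which is the knob the abstract refers to as the similarity degree.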

  11. Ab initio gene identification in metagenomic sequences.

    Zhu, Wenhan; Lomsadze, Alexandre; Borodovsky, Mark

    2010-07-01

    We describe an algorithm for gene identification in DNA sequences derived from shotgun sequencing of microbial communities. Accurate ab initio gene prediction in a short nucleotide sequence of anonymous origin is hampered by uncertainty in model parameters. While several machine learning approaches could be proposed to bypass this difficulty, one effective method is to estimate parameters from dependencies, formed in evolution, between frequencies of oligonucleotides in protein-coding regions and genome nucleotide composition. The original version of the method was proposed in 1999 and has been used since for (i) reconstructing the codon frequency vector needed for gene finding in viral genomes and (ii) initializing parameters of self-training gene finding algorithms. With the advent of new prokaryotic genomes en masse it became possible to enhance the original approach by using direct polynomial and logistic approximations of oligonucleotide frequencies, as well as by separating models for bacteria and archaea. These advances have increased the accuracy of model reconstruction and, subsequently, gene prediction. We describe the refined method and assess its accuracy on known prokaryotic genomes split into short sequences. Also, we show that as a result of application of the new method, several thousands of new genes could be added to existing annotations of several human and mouse gut metagenomes. PMID:20403810

  12. DNA sequence of the yeast transketolase gene.

    Fletcher, T S; Kwee, I L; Nakada, T; Largman, C; Martin, B M

    1992-02-18

    Transketolase (EC 2.2.1.1) is the enzyme that, together with aldolase, forms a reversible link between the glycolytic and pentose phosphate pathways. We have cloned and sequenced the transketolase gene from yeast (Saccharomyces cerevisiae). This is the first transketolase gene of the pentose phosphate shunt to be sequenced from any source. The molecular mass of the proposed translated protein is 73,976 daltons, in good agreement with the observed molecular mass of about 75,000 daltons. The 5'-nontranslated region of the gene is similar to other yeast genes. There is no evidence of 5'-splice junctions or branch points in the sequence. The 3'-nontranslated region contains the polyadenylation signal (AATAAA), 80 base pairs downstream from the termination codon. A high degree of homology is found between yeast transketolase and dihydroxyacetone synthase (formaldehyde transketolase) from the yeast Hansenula polymorpha. The overall sequence identity between these two proteins is 37%, with four regions of much greater similarity. The regions from amino acid residues 98-131, 157-182, 410-433, and 474-489 have sequence identities of 74%, 66%, 83%, and 82%, respectively. One of these regions (157-182) includes a possible thiamin pyrophosphate (TPP) binding domain, and another (410-433) may contain the catalytic domain. PMID:1737042
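
    The overall and per-region identities quoted above are the kind of figure that can be computed from a pairwise alignment. The sketch below is one common way to do it, assuming two already-aligned strings with "-" for gaps and 1-based, inclusive alignment columns; the function name and toy strings are hypothetical, and the record does not state how the authors computed their values.

```python
def percent_identity(aln_a: str, aln_b: str, start=None, end=None) -> float:
    """Percent identity over aligned columns, skipping columns with a gap.

    `start` and `end` are 1-based, inclusive alignment columns; omit both
    to score the whole alignment.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    lo = (start or 1) - 1
    columns = list(zip(aln_a, aln_b))[lo:end]
    compared = [(x, y) for x, y in columns if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


# Toy alignment (hypothetical residues, not the transketolase sequences).
seq_a = "MKTAYIAKQR"
seq_b = "MKTAYLAK-R"
overall = percent_identity(seq_a, seq_b)
first_half = percent_identity(seq_a, seq_b, 1, 5)
second_half = percent_identity(seq_a, seq_b, 6, 10)
```

    Whether gapped columns count in the denominator is a convention; this sketch excludes them, so values can differ from tools that include them.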

  13. Sequencing and Gene Expression Analysis of Leishmania tropica LACK Gene.

    Nour Hammoudeh

    2014-12-01

    Leishmania Homologue of receptors for Activated C Kinase (LACK) antigen is a 36-kDa protein, which provokes a very early immune response against Leishmania infection. There are several reports on the expression of LACK through different life-cycle stages of genus Leishmania, but only a few of them have focused on L. tropica. The present study provides details of the cloning, DNA sequencing and gene expression of LACK in this parasite species. First, several local isolates of Leishmania parasites were typed in our laboratory using PCR to verify the Leishmania species. After that, the LACK gene was amplified and cloned into a vector for sequencing. Finally, the expression of this molecule in logarithmic and stationary growth phase promastigotes, as well as in amastigotes, was evaluated by Reverse Transcription-PCR (RT-PCR). The typing result confirmed that all our local isolates belong to L. tropica. The LACK gene sequence was determined and high similarity was observed with the sequences of other Leishmania species. Furthermore, the expression of the LACK gene in both promastigote and amastigote forms was confirmed. Overall, the data set the stage for future studies of the properties and immune role of LACK gene products.

  14. The nucleotide sequences of two leghemoglobin genes from soybean

    Wiborg, O; Hyldig-Nielsen, J J; Jensen, E O; Paludan, K; Marcker, K A

    1982-01-01

    We present the complete nucleotide sequences of two leghemoglobin genes isolated from soybean DNA. Both genes contain three intervening sequences in identical positions. Comparison of the coding sequences with known amino-acid sequences of soybean leghemoglobins suggest that the two genes...

  15. Cloning and sequencing genes related to preeclampsia

    SHI Juan-zi; LIU Yan-fang; YAO Yuan-qing; YAN Wei; ZHU Feng; ZHAO Zhong-liang

    2001-01-01

    To clone genes specifically expressed in the placenta of patients with preeclampsia, and to explain the mechanism in the etiopathology of preeclampsia. Methods: The placentae of preeclamptic and normotensive pregnant subjects were used as models, a cDNA library was constructed, and 20 differentially expressed fragments were cloned after a new version of PCR-based subtractive hybridization. The false positive clones were identified by reverse dot blot analysis. With one of the obtained genes taken as the probe, the placentas of 10 normal pregnant women and 10 preeclamptic patients were studied using dot hybridization. Results: Six false positive clones were identified by reverse dot blot, and the remaining 14 clones were identified as preeclampsia-related genes. These clones were sequenced and analyzed with the BLAST analysis system. Eleven of the 14 clones were genes already known, among which one belongs to the necdin family; the remaining 3 were identified as novel genes. These 3 genes were acknowledged by GenBank, with the accession numbers AF232216, AF232217, AF233648. The results of dot hybridization using the necdin gene as probe were as follows: (1) This mRNA was present in the placental tissues of normal pregnancy as well as in those of preeclampsia. (2) The intensity of transcription of this mRNA in the placental tissues of preeclampsia increased significantly compared with that of normal pregnancy (P<0.05). Conclusions: This study reports for the first time this group of genes, especially the necdin-expressing gene, which are related to the etiopathology of preeclampsia. In addition, overtranscription of the necdin gene has been found in preeclampsia. This is helpful for further studies of the etiology of preeclampsia.

  16. Preliminary phylogeny of the thrips parasitoids of Turkey based on some morphological scales and 28S D2 rDNA, with description of a new species

    DOĞANLAR, Oğuzhan; Doğanlar, Mikdat; Frary, Anne

    2010-01-01

    Species of the thrips-attacking genus Ceranisus are difficult to distinguish morphologically. The phylogenetic relationships within the Ceranisus species were explored using nucleotide sequences of the 28S D2 expansion region of the rDNA gene. Bayesian, maximum likelihood, and parsimony inference methods were employed to construct the phylogenetic relationships. Principal component analysis on the Turkish species of Ceranisus, namely antalyacus, menes, bozovaensis, hirsutus, planitianus (a ne...

  17. Isolation and nucleotide sequence of the gene encoding human rhodopsin.

    Nathans, J; Hogness, D S

    1984-01-01

    We have isolated and completely sequenced the gene encoding human rhodopsin. The coding region of the human rhodopsin gene is interrupted by four introns, which are located at positions analogous to those found in the previously characterized bovine rhodopsin gene. The amino acid sequence of human rhodopsin, deduced from the nucleotide sequence of its gene, is 348 residues long and is 93.4% homologous to that of bovine rhodopsin. Interestingly, those portions of the polypeptide chain predicte...

  18. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes because more and more gene expression and genomic sequence data become available. Knowledge on the specific cis-sequences, their enrichment and arrangement within promoters, facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter illustrates an example for the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied for the identification of cis-sequences in any sets of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and how to install and run a software package to identify overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771
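
    As a rough illustration of what "identifying overrepresented motifs" can mean, the sketch below counts how many promoters in a coregulated set contain each k-mer and compares that with a background set. It is a simplified stand-in, not the software package the chapter describes; the function names, pseudocount, ratio cutoff and toy sequences are all hypothetical.

```python
from collections import Counter


def kmers_present(seq: str, k: int) -> set:
    """Distinct k-mers occurring at least once in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def promoter_counts(promoters: list, k: int) -> Counter:
    """Number of promoters in which each k-mer occurs."""
    counts = Counter()
    for seq in promoters:
        counts.update(kmers_present(seq, k))
    return counts


def enriched_kmers(foreground, background, k, min_ratio=2.0, pseudocount=1.0):
    """k-mers whose promoter frequency in the foreground exceeds the
    background frequency by at least `min_ratio`, best first."""
    fg = promoter_counts(foreground, k)
    bg = promoter_counts(background, k)
    results = []
    for kmer, n_fg in fg.items():
        fg_frac = (n_fg + pseudocount) / (len(foreground) + pseudocount)
        bg_frac = (bg.get(kmer, 0) + pseudocount) / (len(background) + pseudocount)
        ratio = fg_frac / bg_frac
        if ratio >= min_ratio:
            results.append((kmer, ratio))
    return sorted(results, key=lambda item: (-item[1], item[0]))


# Toy promoter sets (hypothetical sequences).
coregulated = ["TTACGTGTCA", "GGACGTGTAA", "CCACGTGTGG"]
background = ["TTTTTTTTTT", "GGGGGGGGGG", "CCCCCCCCCC", "AAAAAAAAAA"]
hits = enriched_kmers(coregulated, background, k=6)
```

    A real analysis would add a statistical test and multiple-testing correction; the plain ratio here only ranks candidates.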

  19. Sequencing genes in silico using single nucleotide polymorphisms

    Zhang Xinyi

    2012-01-01

    Background: The advent of high throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs) discovered from genetic association studies. Currently, direct sequencing of candidate genes or regions on a large number of subjects remains both cost- and time-prohibitive. Results: To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS), which predicts phased sequences of intragenic regions, using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles) at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (range 79%-100%). This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC) Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes. Conclusions: Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate

  20. Fungal community analysis in the deep-sea sediments of the Pacific Ocean assessed by comparison of ITS, 18S and 28S ribosomal DNA regions

    Xu, Wei; Luo, Zhu-Hua; Guo, Shuangshuang; Pang, Ka-Lai

    2016-03-01

    We investigated the diversity of fungal communities in 6 different deep-sea sediment samples of the Pacific Ocean based on three different types of clone libraries, including internal transcribed spacer (ITS), 18S rDNA, and 28S rDNA regions. A total of 1978 clones were generated from 18 environmental clone libraries, resulting in 140 fungal operational taxonomic units (OTUs), including 18 OTUs from ITS, 44 OTUs from 18S rDNA, and 78 OTUs from 28S rDNA gene primer sets. The majority of the recovered sequences belonged to diverse phylotypes of the Ascomycota and Basidiomycota. Additionally, our study revealed a total of 46 novel fungal phylotypes, which showed low similarities (<97%) with available fungal sequences in the GenBank, including a novel Zygomycete lineage, suggesting possible new fungal taxa occurring in the deep-sea sediments. The results suggested that 28S rDNA is an efficient target gene to describe fungal community in deep-sea environment.

  1. A Probabilistic Genome-Wide Gene Reading Frame Sequence Model

    Have, Christian Theil; Mørk, Søren

    We introduce a new type of probabilistic sequence model that models the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome-level -- effectively producing a sequential genome annotation as...... output. The model can be used to obtain the most probable genome annotation based on a combination of i: a gene finder score of each gene candidate and ii: the sequence of the reading frames of gene candidates through a genome. The model --- as well as a higher order variant --- is developed and tested...... using the probabilistic logic programming language and machine learning system PRISM - a fast and efficient model prototyping environment, using bacterial gene finding performance as a benchmark of signal strength. The model is used to prune a set of gene predictions from an underlying gene finder and...

  2. Nucleotide sequence of the triosephosphate isomerase gene from Macaca mulatta

    Old, S.E.; Mohrenweiser, H.W. (Univ. of Michigan, Ann Arbor (USA))

    1988-09-26

    The triosephosphate isomerase gene from a rhesus monkey, Macaca mulatta, charon 34 library was sequenced. The human and chimpanzee enzymes differ from the rhesus enzyme at ASN 20 and GLU 198. The nucleotide sequence identity between rhesus and human is 97% in the coding region and >94% in the flanking regions. Comparison of the rhesus and chimp genes, including the intron and flanking sequences, does not suggest a mechanism for generating the two TPI peptides of proliferating cells from hominoids and a single peptide from the rhesus gene.

  3. Identification of sequence variants in genetic disease-causing genes using targeted next-generation sequencing.

    Xiaoming Wei

    BACKGROUND: Identification of gene variants plays an important role in research on and diagnosis of genetic diseases. A combination of enrichment of targeted genes and next-generation sequencing (targeted DNA-HiSeq) results in both high efficiency and low cost for targeted sequencing of genes of interest. METHODOLOGY/PRINCIPAL FINDINGS: To identify mutations associated with genetic diseases, we designed an array-based gene chip to capture all of the exons of 193 genes involved in 103 genetic diseases. To evaluate this technology, we selected 7 samples from seven patients with six different genetic diseases resulting from six disease-causing genes and 100 samples from normal human adults as controls. The data obtained showed that on average, 99.14% of 3,382 exons with more than 30-fold coverage were successfully detected using Targeted DNA-HiSeq technology, and we found six known variants in four disease-causing genes and two novel mutations in two other disease-causing genes (the STS gene for XLI and the FBN1 gene for MFS) as well as one exon deletion mutation in the DMD gene. These results were confirmed in their entirety using either the Sanger sequencing method or real-time PCR. CONCLUSIONS/SIGNIFICANCE: Targeted DNA-HiSeq combines next-generation sequencing with the capture of sequences from a relevant subset of high-interest genes. This method was tested by capturing sequences from a DNA library through hybridization to oligonucleotide probes specific for genetic disorder-related genes and was found to show high selectivity, improve the detection of mutations, enable the discovery of novel variants, and provide additional indel data. Thus, targeted DNA-HiSeq can be used to analyze the gene variant profiles of monogenic diseases with high sensitivity, fidelity, throughput and speed.

  4. Comparison of methods for genomic localization of gene trap sequences

    Ferrin Thomas E

    2006-09-01

    Background: Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. Results: In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion: The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.
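
    The two evaluation criteria mentioned above, localization to the right region and recovery of exon boundaries, can be sketched as simple interval checks against known coordinates. The sketch below is an assumption about how such a comparison might be scored, not the authors' procedure; the `Interval` type, the `tolerance` parameter and the toy coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def count_correct_boundaries(predicted, known, tolerance=0):
    """Count known exon edges matched by any predicted block edge.

    Returns (matched_edges, total_known_edges). An edge matches when a
    predicted start or end lies within `tolerance` bases of it.
    """
    known_edges = [(exon.chrom, pos) for exon in known for pos in (exon.start, exon.end)]
    matched = 0
    for chrom, pos in known_edges:
        if any(
            block.chrom == chrom
            and min(abs(block.start - pos), abs(block.end - pos)) <= tolerance
            for block in predicted
        ):
            matched += 1
    return matched, len(known_edges)


# Toy coordinates (hypothetical).
known_exons = [Interval("chr1", 100, 200), Interval("chr1", 300, 400)]
aligned_blocks = [Interval("chr1", 100, 198), Interval("chr1", 300, 400)]
exact = count_correct_boundaries(aligned_blocks, known_exons)
relaxed = count_correct_boundaries(aligned_blocks, known_exons, tolerance=2)
```

    Here the alignment lands in the right region but misses one exon end by two bases, the kind of subtle error the abstract distinguishes from gross mislocalization.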

  5. Degenerative primer design and gene sequencing validation for select turkey genes.

    Hutsko, Stephanie L; Lilburn, Michael S; Wick, Macdonald

    2016-06-01

    We successfully designed and validated degenerative primers for turkey genes MUC2, RPS13, TBP and TFF2 based on chicken sequences in order to use gene transcription analysis to evaluate (quantify) the mucin transcription response to probiotic supplementation in turkeys. Primers were designed for the genes MUC2, TFF2, RPS13 and TBP using a degenerative primer design method based on the available Gallus gallus sequences. All primer sets that produced a single PCR amplicon of the expected size were cloned into the TOPO® vector and then transformed into TOP10® competent cells. Plasmid DNA was isolated from the TOP10® cell culture and sent for sequencing. Sequences were analyzed using NCBI BLAST. All genes sequenced had over 90% homology with both the chicken and predicted turkey sequences. The sequences were used to design new 100% homologous primer sets for the genes of interest. PMID:27053625
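
    The record does not describe its primer design method in detail. One generic way to derive a degenerate primer is to collapse each column of an aligned primer site into the matching IUPAC nucleotide code, as sketched below; the function names and toy sequences are hypothetical, and real designs would also weigh melting temperature and 3'-end stability.

```python
# IUPAC nucleotide codes keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}
EXPAND = {code: bases for bases, code in IUPAC.items()}


def degenerate_consensus(aligned: list) -> str:
    """Collapse gap-free, equal-length aligned sites into one IUPAC string."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    consensus = []
    for column in zip(*aligned):
        bases = frozenset(base.upper() for base in column)
        if not bases <= frozenset("ACGT"):
            raise ValueError("only unambiguous A/C/G/T input is supported")
        consensus.append(IUPAC[bases])
    return "".join(consensus)


def degeneracy(primer: str) -> int:
    """Number of distinct sequences a degenerate primer represents."""
    total = 1
    for code in primer.upper():
        total *= len(EXPAND[code])
    return total


# Toy aligned primer sites (hypothetical, not the MUC2/TFF2/RPS13/TBP sites).
sites = ["ATGCA", "ATGTA", "ACGCA"]
primer = degenerate_consensus(sites)
```

    Lower degeneracy generally means a more specific primer pool, so `degeneracy` gives a quick check on a candidate site.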

  6. A silent composite hemoglobinopathy characterized by gene sequencing.

    Zorai, A; Moumni, I; Benmansour, I; Chaouachi, D; Ghanem, A; Abbes, S

    2011-01-01

    We report the case of a 35-year-old Tunisian woman with a chronic anemia that had gone uninvestigated for a long time. Laboratory analysis using DNA sequencing revealed a compound heterozygote for Hb O Arab and codon 39 β°-thalassemia. This is the first time that such a genotype has been characterized by gene sequencing. PMID:23461145

  7. Nucleotide sequence of a human tRNA gene heterocluster

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both [3'-32P]-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these λ clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  8. Mechanism of Gene Amplification via Yeast Autonomously Replicating Sequences

    Shelly Sehgal

    2015-01-01

    The present investigation was aimed at understanding the molecular mechanism of gene amplification. Interplay of fragile sites in promoting gene amplification was also elucidated. The amplification promoting sequences (APS) were chosen from the Saccharomyces cerevisiae ARS, 5S rRNA regions of Plantago ovata and P. lagopus, proposed sites of replication pausing at the Ste20 gene locus of S. cerevisiae, and the bent DNA sequences within fragile site FRA11A in humans. The gene amplification assays showed that plasmids bearing APS from yeast and humans led to enhanced protein concentration as compared to the wild type. Both the in silico and in vitro analyses pointed to the strong bending potential of these APS. In addition, high mitotic stability and the presence of TTTT repeats and SAR among these sequences encourage gene amplification. Phylogenetic analysis of S. cerevisiae ARS was also conducted. The combinatorial power of the different aspects of APS analyzed in the present investigation was harnessed to reach a consensus about the factors which stimulate gene expression in the presence of these sequences. It was concluded that the mechanism of gene amplification is that AT-rich tracts present in fragile sites of yeast serve as binding sites for MAR/SAR and DNA unwinding elements. The DNA-protein interactions necessary for ORC activation are facilitated by DNA bending. These specific bindings at ORC promote repeated rounds of DNA replication leading to gene amplification.

  9. Biased distribution of DNA uptake sequences towards genome maintenance genes

    Davidsen, T.; Rodland, E.A.; Lagesen, K.; Seeberg, E.; Rognes, Torbjørn; Tonjum, T.

    2004-01-01

    coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group in these...

  10. Mechanism of gene amplification via yeast autonomously replicating sequences.

    Sehgal, Shelly; Kaul, Sanjana; Dhar, M K

    2015-01-01

    The present investigation was aimed at understanding the molecular mechanism of gene amplification. Interplay of fragile sites in promoting gene amplification was also elucidated. The amplification promoting sequences (APS) were chosen from the Saccharomyces cerevisiae ARS, 5S rRNA regions of Plantago ovata and P. lagopus, proposed sites of replication pausing at the Ste20 gene locus of S. cerevisiae, and the bent DNA sequences within fragile site FRA11A in humans. The gene amplification assays showed that plasmids bearing APS from yeast and humans led to enhanced protein concentration as compared to the wild type. Both the in silico and in vitro analyses pointed to the strong bending potential of these APS. In addition, high mitotic stability and the presence of TTTT repeats and SAR among these sequences encourage gene amplification. Phylogenetic analysis of S. cerevisiae ARS was also conducted. The combinatorial power of the different aspects of APS analyzed in the present investigation was harnessed to reach a consensus about the factors which stimulate gene expression in the presence of these sequences. It was concluded that the mechanism of gene amplification is that AT-rich tracts present in fragile sites of yeast serve as binding sites for MAR/SAR and DNA unwinding elements. The DNA-protein interactions necessary for ORC activation are facilitated by DNA bending. These specific bindings at ORC promote repeated rounds of DNA replication leading to gene amplification. PMID:25685838

  11. Cloning and sequencing of the gene for human β-casein

    Loennerdal, B.; Bergstroem, S.; Andersson, Y.; Hialmarsson, K.; Sundgyist, A.; Hernell, O. (Univ. of California, Davis (United States))

    1990-02-26

    Human β-casein is a major protein in human milk. This protein is part of the casein micelle and has been suggested to have several physiological functions in the newborn. Since there is limited information on β-casein and the factors that affect its concentration in human milk, the authors have isolated and sequenced the gene for this protein. A human mammary gland cDNA library (Clontech) in λgt11 was screened by plaque hybridization using a 42-mer synthetic ³²P-labelled oligonucleotide. Positive clones were identified and isolated, DNA was prepared and the gene isolated by cleavage with EcoRI. Following subcloning (pUC18), restriction mapping and Southern blotting, DNA for sequencing was prepared. The gene was sequenced by the dideoxy method. Human β-casein has 212 amino acids, and the amino acid sequence deduced from the nucleotide sequence is 91% identical to the published sequence for human β-casein. The sequences show a high degree of conservation at the leader peptide and the highly phosphorylated sequences, but also deletions and divergence at several positions. These results provide insight into the structure of the human β-casein gene and will facilitate studies on factors affecting its expression.

  12. Microsatellite Instability Use in Mismatch Repair Gene Sequence Variant Classification

    Bryony A. Thompson

    2015-03-01

    Inherited mutations in the DNA mismatch repair (MMR) genes can cause MMR deficiency and increased susceptibility to colorectal and endometrial cancer. Microsatellite instability (MSI) is the defining molecular signature of MMR deficiency. The clinical classification of identified MMR gene sequence variants has a direct impact on the management of patients and their families. For a significant proportion of cases, sequence variants of uncertain clinical significance (also known as unclassified variants) are identified, constituting a challenge for genetic counselling and clinical management of families. The effect of these variants on protein function is difficult to interpret. The presence or absence of MSI in tumours can aid in determining the pathogenicity of associated unclassified MMR gene variants. However, there are some considerations that need to be taken into account when using MSI for variant interpretation. The use of MSI and other tumour characteristics in MMR gene sequence variant classification will be explored in this review.

  13. SxtA gene sequence analysis of dinoflagellate Alexandrium minutum

    Norshaha, Safida Anira; Latib, Norhidayu Abdul; Usup, Gires; Yusof, Nurul Yuziana Mohd

    2015-09-01

    The dinoflagellate Alexandrium minutum is known for producing potent neurotoxins such as saxitoxin, which affect the health of human seafood consumers via paralytic shellfish poisoning (PSP). This phenomenon is related to harmful algal blooms (HABs), which are believed to be influenced by environmental and nutritional factors. A previous study revealed that sxtA is the starting gene of the saxitoxin production pathway. The aim of this study was to analyse the sequence of the sxtA gene in A. minutum. The dinoflagellate culture was grown at 26°C with a 16:8-hour light:dark photocycle. After the samples were harvested, RNA was extracted, and complementary DNA (cDNA) was synthesised and amplified by polymerase chain reaction (PCR). The PCR products were then purified and cloned before being sequenced. The sxtA sequence obtained was then analyzed in order to identify the presence of the sxtA gene in Alexandrium minutum.

  14. Sequence Variability in Staphylococcal Enterotoxin Genes seb, sec, and sed

    Sophia Johler

    2016-06-01

    Ingestion of staphylococcal enterotoxins preformed by Staphylococcus aureus in food leads to staphylococcal food poisoning, the most prevalent foodborne intoxication worldwide. There are five major staphylococcal enterotoxins: SEA, SEB, SEC, SED, and SEE. While variants of these toxins have been described and were linked to specific hosts or levels of enterotoxin production, data on sequence variation are still limited. In this study, we aim to extend the knowledge on promoter and gene variants of the major enterotoxins SEB, SEC, and SED. To this end, we determined seb, sec, and sed promoter and gene sequences of a well-characterized set of enterotoxigenic Staphylococcus aureus strains originating from foodborne outbreaks, human infections, human nasal colonization, rabbits, and cattle. New nucleotide sequence variants were detected for all three enterotoxins, and a novel amino acid sequence variant of SED was detected in a strain associated with human nasal colonization. While the seb promoter and gene sequences exhibited a high degree of variability, the sec and sed promoters and genes were more conserved. Interestingly, a truncated variant of sed was detected in all tested sed-harboring rabbit strains. The generated data represent a further step towards improved understanding of strain-specific differences in enterotoxin expression and host-specific variation in enterotoxin sequences.

  15. Multiple gene sequence analysis using genes of the bacterial DNA repair pathway

    Miguel Rotelok Neto

    2015-06-01

    The ability to recognize and repair abnormal DNA structures is common to all forms of life. Physiological studies and genomic sequencing of a variety of bacterial species have identified an incredible diversity of DNA repair pathways. Despite the number of genes available in public databases, the usual method to place genomes in a taxonomic context is based mainly on the 16S rRNA or housekeeping genes. Thus, the relationships among genomes remain poorly understood. In this work, an approach of multiple gene sequence analysis based on genes of the DNA repair pathway was used to compare bacterial genomes. Housekeeping and DNA repair genes were searched in 872 completely sequenced bacterial genomes. Seven DNA repair and housekeeping genes from distinct metabolic pathways were selected, aligned, edited and concatenated head-to-tail to form a super-gene. Results showed that the multiple gene sequence analysis using DNA repair genes had better resolution at class level than the housekeeping genes. Like housekeeping genes, the DNA repair genes were useful for separating bacterial groups at low taxonomic levels and were also sensitive to genes derived from horizontal transfer.
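
    The head-to-tail concatenation step described in this abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' pipeline: the function name `concatenate_alignments`, the nested-dictionary input layout, and the toy gene names and sequences are all hypothetical, and the sketch assumes each gene has already been aligned and is present in every genome.

```python
def concatenate_alignments(gene_alignments, gene_order):
    """Join per-gene alignments head-to-tail into one super-gene per genome.

    gene_alignments: {gene_name: {genome_id: aligned_sequence}}
                     (hypothetical layout chosen for this sketch)
    gene_order:      list of gene names giving the concatenation order
    """
    genomes = set(gene_alignments[gene_order[0]])
    for gene in gene_order:
        aln = gene_alignments[gene]
        # Every gene must be present in exactly the same set of genomes.
        if set(aln) != genomes:
            raise ValueError(f"{gene}: genome set differs from {gene_order[0]}")
        # Within one gene, all aligned sequences must have the same length.
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError(f"{gene}: aligned sequences differ in length")
    return {
        genome: "".join(gene_alignments[gene][genome] for gene in gene_order)
        for genome in sorted(genomes)
    }


# Toy data: gene names and sequences are placeholders, not from the study.
toy = {
    "recA": {"genomeA": "ATG-CA", "genomeB": "ATGGCA"},
    "uvrA": {"genomeA": "TTGA", "genomeB": "TT-A"},
}
supergene = concatenate_alignments(toy, ["recA", "uvrA"])
```

    Keeping the gene order explicit makes the concatenated columns comparable across genomes, which is what a downstream tree-building step would rely on.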

  16. The nucleotide sequence of the bacteriophage T5 ltf gene.

    Kaliman, A V; Kulshin, V E; Shlyapnikov, M G; Ksenzenko, V N; Kryukov, V M

    1995-06-01

    The nucleotide sequence of the bacteriophage T5 BglII-BamHI fragment (4,835 bp in length), known to carry a gene encoding the LTF protein that forms the phage L-shaped tail fibers, was determined. It was shown to contain an open reading frame for 1,396 amino acid residues that corresponds to a protein of 147.8 kDa. The coding region of the ltf gene is preceded by a typical Shine-Dalgarno sequence. Downstream from the ltf gene there is a strong transcription terminator. Data bank analysis of the LTF protein sequence reveals 55.1% identity to the hypothetical protein ORF 401 of bacteriophage lambda over an overlap of 118 amino acids. PMID:7789514

  17. PHYLOGENETIC ANALYSIS OF THE SUBCLASS PTERIOMORPHIA (BIVALVIA) BASED ON PARTIAL 28S rRNA SEQUENCE

    薛东秀; 王海艳; 张涛; 张素萍; 徐凤山

    2012-01-01

    The phylogenetic relationships among 11 superfamilies of the subclass Pteriomorphia (Bivalvia) were reconstructed based on partial sequences of the nuclear 28S ribosomal DNA retrieved from GenBank. Unambiguously aligned sequences (1252 bp) of 80 species were subjected to partitioned maximum likelihood and Bayesian analyses. Sequence analysis showed that there were 359 variable sites, occupying 28.67% of all sites, and 300 parsimony-informative sites, occupying 23.96% of all sites. The average content of A+T was 41.6%, clearly lower than that of G+C, showing that the base composition was biased in favor of G+C. The genetic distances among species within superfamilies ranged from 0.01 to 0.14, clearly smaller than those among superfamilies. The resultant molecular phylogeny was compared with previously published phylogenetic hypotheses inferred from morphological characteristics and other molecular analyses. The molecular phylogenetic analyses strongly supported the monophyly of Pteriomorphia, which was congruent with previous results based on morphological characters. The resulting trees clearly indicated that the 11 superfamilies were divided into three clades: clade I included Pterioidea, Ostreoidea, and Pinnoidea; clade II included Arcoidea, Limopsoidea, and Mytiloidea; and clade III included Pectinoidea, Anomioidea, Dimyoidea, Plicatuloidea, and Limoidea. Based on the results of the present study and information compiled from other classification systems, a revised classification of the extant superfamilies of Pteriomorphia is presented.
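
    The site counts reported in this abstract can be illustrated with a short sketch. It is not the authors' software: the function name `site_statistics`, the rule of skipping columns that contain a gap or N, and the four toy sequences are assumptions made for illustration. A column is counted as variable if it shows more than one base, and as parsimony-informative if at least two different bases each occur in at least two sequences. The last two lines simply recompute the reported percentages from the reported counts.

```python
from collections import Counter


def site_statistics(alignment):
    """Count variable and parsimony-informative columns in an alignment.

    alignment: list of equal-length aligned sequences.
    Columns containing a gap ('-') or 'N' are skipped; this convention is
    an assumption of the sketch, not something stated in the abstract.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must have equal length")
    variable = informative = 0
    for column in zip(*alignment):
        bases = [b.upper() for b in column]
        if any(b in "-N" for b in bases):
            continue
        counts = Counter(bases)
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states, each seen in >= 2 sequences.
            if sum(1 for c in counts.values() if c >= 2) >= 2:
                informative += 1
    return variable, informative


# Toy alignment (made-up sequences).
toy = ["ACGTAC", "ACGTTC", "ATGTTC", "ATGAAC"]
variable, informative = site_statistics(toy)

# Percentages recomputed from the counts given in the abstract.
pct_variable = round(100 * 359 / 1252, 2)
pct_informative = round(100 * 300 / 1252, 2)
```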

  18. Diverse nucleotide compositions and sequence fluctuation in Rubisco protein genes

    Holden, Todd; Dehipawala, S.; Cheung, E.; Bienaime, R.; Ye, J.; Tremberger, G., Jr.; Schneider, P.; Lieberman, D.; Cheung, T.

    2011-10-01

    The Rubisco protein-enzyme is arguably the most abundant protein on Earth. The biology dogma of transcription and translation necessitates the study of the Rubisco genes and Rubisco-like genes in various species. A stronger correlation of the fractal dimension of the atomic number fluctuation along a DNA sequence with Shannon entropy has been observed in the studied Rubisco-like gene sequences, suggesting more diverse evolutionary pressure and constraints in the Rubisco sequences. The strategy of using metal for structural stabilization appears to be an ancient mechanism, with data from the porphobilinogen deaminase gene in Capsaspora owczarzaki and Monosiga brevicollis. Using the chi-square distance probability, our analysis supports the conjecture that the more ancient Rubisco-like sequence in Microcystis aeruginosa would have experienced very different evolutionary pressure and biochemical constraint as compared to Bordetella bronchiseptica, the two microbes occupying either end of the correlation graph. Our exploratory study would indicate that a high fractal dimension Rubisco sequence would support a high carbon dioxide rate via the Michaelis-Menten coefficient, with implications for the control of the whooping cough pathogen Bordetella bronchiseptica, a microbe containing a high fractal dimension Rubisco-like sequence (2.07). Using the internal comparison of chi-square distance probability for 16S rRNA (~ E-22) versus the radiation repair Rec-A gene (~ E-05) in the high GC content Deinococcus radiodurans, our analysis supports the conjecture that high GC content microbes containing a Rubisco-like sequence are likely to include an extra-terrestrial origin, relative to Deinococcus radiodurans. A similar photosynthesis process that could utilize host star radiation would not compete with the radiation-resistant process from the biology dogma perspective in environments such as Mars and exoplanets.

  19. A human gut microbial gene catalogue established by metagenomic sequencing

    dos Santos, Marcelo Bertalan Quintanilha; Sicheritz-Pontén, Thomas; Nielsen, Henrik Bjørn;

    2010-01-01

    To understand the impact of gut microbes on human health and well-being it is crucial to assess their genetic potential. Here we describe the Illumina-based metagenomic sequencing, assembly and characterization of 3.3 million non-redundant microbial genes, derived from 576.7 gigabases of sequence...... minimal gut metagenome and the minimal gut bacterial genome in terms of functions present in all individuals and most bacteria, respectively....

  1. Topology of genes and nontranscribed sequences in human interphase nuclei

    Knowledge about the functional impact of the topological organization of DNA sequences within interphase chromosome territories is still sparse. Of the few analyzed single copy genomic DNA sequences, the majority had been found to localize preferentially at the chromosome periphery or to loop out from chromosome territories. By means of dual-color fluorescence in situ hybridization (FISH), immunolabeling, confocal microscopy, and three-dimensional (3D) image analysis, we analyzed the intraterritorial and nuclear localization of 10 genomic fragments of different sequence classes in four different human cell types. The localization of three muscle-specific genes FLNA, NEB, and TTN, the oncogene BCL2, the tumor suppressor gene MADH4, and five putatively nontranscribed genomic sequences was predominantly in the periphery of the respective chromosome territories, independent of transcriptional status and of GC content. In interphase nuclei, the noncoding sequences were only rarely found associated with heterochromatic sites marked by the satellite III DNA D1Z1 or clusters of mammalian heterochromatin proteins (HP1α, HP1β, HP1γ). However, the nontranscribed sequences were found predominantly at the nuclear periphery or at the nucleoli, whereas genes tended to localize on chromosome surfaces exposed to the nuclear interior.

  2. Illumina MiSeq sequencing disfavours a sequence motif in the GFP reporter gene.

    Van den Hoecke, Silvie; Verhelst, Judith; Saelens, Xavier

    2016-01-01

    Green fluorescent protein (GFP) is one of the most used reporter genes. We have used next-generation sequencing (NGS) to analyse the genetic diversity of a recombinant influenza A virus that expresses GFP and found a remarkable coverage dip in the GFP coding sequence. This coverage dip was present when virus-derived RT-PCR product or the parental plasmid DNA was used as starting material for NGS and regardless of whether Nextera XT transposase or Covaris shearing was used for DNA fragmentation. Therefore, the sequence coverage dip in the GFP coding sequence was not the result of emerging GFP mutant viruses or a bias introduced by Nextera XT fragmentation. Instead, we found that the Illumina MiSeq sequencing method disfavours the 'CCCGCC' motif in the GFP coding sequence. PMID:27193250
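
    Locating a short motif such as 'CCCGCC' in a coding sequence can be sketched as below. This is a minimal illustration rather than the authors' analysis: the function names, the choice to also search the reverse-complement strand, and the toy sequence (which is not the GFP sequence) are assumptions.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(seq, motif="CCCGCC"):
    """Return sorted (start, strand) pairs for every motif occurrence.

    Positions are 0-based on the given sequence. '-' hits are places where
    the reverse complement of the motif occurs; searching both strands is
    an assumption of this sketch. Overlapping hits are reported.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Toy sequence, made up for illustration.
toy = "ATGCCCGCCAAGGCGGGTT"
hits = find_motif(toy)
```

    Positions found this way could then be compared with a per-base coverage track to see whether coverage dips coincide with the motif.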

  4. A rapid method for sequencing of rRNA gene(s) amplified by polymerase chain reaction using an automated DNA sequencer

    Dwivedi, P.P.; Patel, B.K.C.; Rees, G.N.; Ollivier, Bernard

    1996-01-01

    A method is described for DNA sequencing of ribosomal RNA (rRNA) genes amplified by polymerase chain reaction (PCR). It uses internal primers, designed on the basis of conserved regions of rRNA genes, to determine a near complete sequence (99%) of the gene with an automated DNA sequencer (Applied Biosystem Incorporation, USA). The procedure is extremely rapid, as cloning of the gene is not required for sequence determination. In addition time consuming steps such as ethanol precipitation and...

  5. Identification of rat genes by TWINSCAN gene prediction, RT-PCR, and direct sequencing

    Wu, Jia Qian; Shteynberg, David; Arumugam, Manimozhiyan;

    2004-01-01

    alternative approach: reverse transcription-polymerase chain reaction (RT-PCR) and direct sequencing based on dual-genome de novo predictions from TWINSCAN. We tested 444 TWINSCAN-predicted rat genes that showed significant homology to known human genes implicated in disease but that were partially or...... single-intron experiment. Spliced sequences were amplified in 46 cases (34%). We conclude that this procedure for elucidating gene structures with native cDNA sequences is cost-effective and will become even more so as it is further optimized....

  6. Automated cleaning and pre-processing of immunoglobulin gene sequences from high-throughput sequencing

    Miri Michaeli

    2012-12-01

    High throughput sequencing (HTS) yields tens of thousands to millions of sequences that require a large amount of pre-processing work to clean various artifacts. Such cleaning cannot be performed manually. Existing programs are not suitable for immunoglobulin (Ig) genes, which are variable and often highly mutated. This paper describes Ig-HTS-Cleaner (Ig High Throughput Sequencing Cleaner), a program containing a simple cleaning procedure that successfully deals with pre-processing of Ig sequences derived from HTS, and Ig-Indel-Identifier (Ig Insertion-Deletion Identifier), a program for identifying legitimate and artifact insertions and/or deletions (indels). Our programs were designed for analyzing Ig gene sequences obtained by 454 sequencing, but they are applicable to all types of sequences and sequencing platforms. Ig-HTS-Cleaner and Ig-Indel-Identifier have been implemented in Java and saved as executable JAR files, supported on Linux and MS Windows. No special requirements are needed in order to run the programs, except for correctly constructing the input files as explained in the text. The programs' performance has been tested and validated on real and simulated data sets.

  7. Gene identification in prokaryotic genomes, phages, metagenomes, and EST sequences with GeneMarkS suite.

    Borodovsky, Mark; Lomsadze, Alex

    2014-01-01

    This unit describes how to use several gene-finding programs from the GeneMark line developed for finding protein-coding ORFs in genomic DNA of prokaryotic species, in genomic DNA of eukaryotic species with intronless genes, in genomes of viruses and phages, and in prokaryotic metagenomic sequences, as well as in EST sequences with spliced-out introns. These bioinformatics tools were demonstrated to have state-of-the-art accuracy, and have been frequently used for gene annotation in novel nucleotide sequences. An additional advantage of these sequence-analysis tools is that the problem of algorithm parameterization is solved automatically, with parameters estimated by iterative self-training (unsupervised training). PMID:24510847

  8. Sequence of the human iduronate 2-sulfatase (IDS) gene

    Wilson, P.J.; Meaney, C.A.; Hopwood, J.J.; Morris, C.P. (Adelaide Children's Hospital, North Adelaide (Australia))

    1993-09-01

    Deficiency of the lysosomal enzyme iduronate-2-sulfatase (IDS; EC 3.1.6.13) results in the storage of the glycosaminoglycans heparan sulfate and dermatan sulfate, which leads to the lysosomal storage disorder mucopolysaccharidosis type II. Three overlapping genomic clones derived from an X-chromosome-specific library containing the entire IDS gene were isolated and the sequences of the intron boundaries and the 5′ promoter region were determined. The IDS gene is split into nine exons spanning approximately 24 kb. The potential promoter for IDS lacks a TATA box but contains GC box consensus sequences, consistent with its role as a housekeeping gene. A polypyrimidine-like repeat is found in intron 1. 9 refs., 1 fig., 1 tab.

  9. Variations in CCL3L gene cluster sequence and non-specific gene copy numbers

    Edberg Jeffrey C

    2010-03-01

    Background: Copy number variations (CNVs) of the gene CC chemokine ligand 3-like 1 (CCL3L1) have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings: Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70% to 100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. By use of MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR based CNV assays are not specific for only CCL3L1. The four available assays measured median copies of 2 and 3-4 in European Americans and African Americans, respectively. The concordance between the assays ranged from 0.44 to 0.83, suggesting individual discordant calls and inconsistencies between the assays and the gene coverage expected from the known sequence. Conclusions: This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogeneous results. Sequence information to determine CNV of the three genes separately would make it possible to test whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.

  10. Cloning and sequence of the human adrenodoxin reductase gene

    Adrenodoxin reductase is a flavoprotein mediating electron transport to all mitochondrial forms of cytochrome P450. The authors cloned the human adrenodoxin reductase gene and characterized it by restriction endonuclease mapping and DNA sequencing. The entire gene is approximately 12 kilobases long and consists of 12 exons. The first exon encodes the first 26 of the 32 amino acids of the signal peptide, and the second exon encodes the remainder of the signal peptide and the apparent FAD binding site. The remaining 10 exons are clustered in a region of only 4.3 kilobases, separated from the first two exons by a large intron of about 5.6 kilobases. Two forms of human adrenodoxin reductase mRNA, differing by the presence or absence of 18 bases in the middle of the sequence, arise from alternate splicing at the 5' end of exon 7. This alternately spliced region is directly adjacent to the NADPH binding site, which is entirely contained in exon 6. The immediate 5' flanking region lacks TATA and CAAT boxes; however, this region is rich in G+C and contains six copies of the sequence GGGCGGG, resembling promoter sequences of housekeeping genes. RNase protection experiments show that transcription is initiated from multiple sites in the 5' flanking region, located about 21-91 base pairs upstream from the AUG translational initiation codon.

  11. Sequence variations in the FAD2 gene in seeded pumpkins.

    Ge, Y; Chang, Y; Xu, W L; Cui, C S; Qu, S P

    2015-01-01

    Seeded pumpkins are important economic crops; the seeds contain various unsaturated fatty acids, such as oleic acid and linoleic acid, which are crucial for human and animal nutrition. The fatty acid desaturase-2 (FAD2) gene encodes delta-12 desaturase, which converts oleic acid to linoleic acid. However, little is known about sequence variations in FAD2 in seeded pumpkins. Twenty-seven FAD2 clones from 27 accessions of Cucurbita moschata, Cucurbita maxima, Cucurbita pepo, and Cucurbita ficifolia were obtained (1152 bp in total; a single gene without introns). More than 90% nucleotide identity was detected among the 27 FAD2 clones. Nucleotide substitution, rather than nucleotide insertion and deletion, led to sequence polymorphism in the 27 FAD2 clones. Furthermore, the 27 selected FAD2 clones all encoded the FAD2 enzyme (delta-12 desaturase), with amino acid sequence identities from 91.7 to 100% for 384 amino acids. The same main-function domain, between amino acids 47 and 329, was identified. The four species clustered separately based on differences in the sequences that were identified using the unweighted pair group method with arithmetic mean. Geographic origin and species were found to be closely related to sequence variation in FAD2. PMID:26782391
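
    The identity figures in this abstract are pairwise comparisons of aligned sequences. A minimal sketch of such a calculation follows; the function name `percent_identity`, the choice to exclude gapped columns from the denominator, and the toy sequences are assumptions, since the abstract does not state how identity was computed. The last line only checks that 1152 bp corresponds to 384 codons, matching the 384 amino acids reported.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity between two aligned sequences of equal length.

    Columns in which either sequence has a gap ('-') are excluded from the
    comparison; this is one common convention and an assumption here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (made up): one mismatch in ten columns.
identity = percent_identity("ATGGCTAAGG", "ATGGCAAAGG")

# 1152 bp read in triplets gives the number of codons.
codons = 1152 // 3
```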

  12. Informational structure of genetic sequences and nature of gene splicing

    Trifonov, E. N.

    1991-10-01

    Only about 1/20 of the DNA of higher organisms codes for proteins, by means of the classical triplet code. The rest of the DNA sequences are largely silent, with unclear functions, if any. The triplet code is not the only code (message) carried by the sequences. There are three levels of molecular communication, where the same sequence "talks" to various biomolecules, while having, respectively, three different appearances: DNA, RNA and protein. Since the molecular structures and, hence, sequence-specific preferences of these are substantially different, the original DNA sequence has to carry simultaneously three types of sequence patterns (codes, messages), thus being a composite structure in which one and the same letter (nucleotide) is frequently involved in several overlapping codes of different nature. This multiplicity and overlapping of the codes is a unique feature of Gnomic, the language of genetic sequences. The coexisting codes have to be degenerate to various degrees to allow an optimal and concerted performance of all the encoded functions. There is an obvious conflict between the best possible performance of a given function and the necessity to compromise the quality of a given sequence pattern in favor of other patterns. It appears that the major role of various changes in the sequences on their "ontogenetic" way from DNA to RNA to protein, like RNA editing and splicing, or protein post-translational modifications, is to resolve such conflicts. New data are presented strongly indicating that gene splicing is such a device to resolve the conflict between the code of DNA folding in chromatin and the triplet code for protein synthesis.

  13. Cloning, sequencing and phylogenic analysis of duck prion gene

    WANG Qigui; ZHANG Lei; HU Xiaoxiang; FAN Baoliang; LI Ning; LI Hui; WU Changxin

    2004-01-01

    The duck prion gene was cloned and sequenced. Similar to mammalian prion protein (PrP), duck prion is encoded by a single exon of a single-copy gene in the genome, which was confirmed by Southern blot analysis. All of the structural features of mammalian PrP were also identified in the duck PrP. Compared with mammalian PrP, it exhibited 30% overall similarity. When compared with chicken PrP, it showed a higher homology of 97%. A phylogenetic tree was constructed to trace the evolution of the prion gene in animals.

  14. Identification of Driver Genes in Hepatocellular Carcinoma by Exome Sequencing

    Sean P Cleary; Jeck, William R.; Zhao, Xiaobei; Chen, Kui; Selitsky, Sara R.; Savich, Gleb L.; Tan, Ting-Xu; Wu, Michael C.; Getz, Gad; Lawrence, Michael S.; Joel S Parker; Li, Jinyu; Powers, Scott; Kim, Hyeja; Fischer, Sandra

    2013-01-01

    Genetic alterations in specific driver genes lead to disruption of cellular pathways and are critical events in the instigation and progression of hepatocellular carcinoma. As a prerequisite for individualized cancer treatment, we sought to characterize the landscape of recurrent somatic mutations in hepatocellular carcinoma. We performed whole exome sequencing on 87 hepatocellular carcinomas and matched normal adjacent tissues to an average coverage of 59x. The overall mutation rate was rough...

  15. Chloroplast gene sequences and the study of plant evolution.

    Clegg, M T

    1993-01-01

    A large body of sequence data has accumulated for the chloroplast-encoded gene ribulose-1,5-biphosphate carboxylase/oxygenase (rbcL) as the result of a cooperative effort involving many laboratories. The data span all seed plants, including most major lineages from the angiosperms, and as such they provide an unprecedented opportunity to study plant evolutionary history. The full analysis of this large data set poses many problems and opportunities for plant evolutionary biologists and for bi...

  16. Detection and sequence analysis of accessory gene regulator genes of Staphylococcus pseudintermedius isolates

    M. Ananda Chitra

    2015-07-01

    Background: Staphylococcus pseudintermedius (SP) is the major pathogenic species of dogs, involved in a wide variety of skin and soft tissue infections. The accessory gene regulator (agr) locus of Staphylococcus aureus has been extensively studied, and it influences the expression of many virulence genes. It encodes a two-component signal transduction system that leads to down-regulation of surface proteins and up-regulation of secreted proteins during in vitro growth of S. aureus. The objective of this study was to detect and sequence-analyze the agrA, agrB, and agrD genes of SP isolated from canine skin infections. Materials and Methods: In this study, we isolated SP from canine pyoderma and otitis cases, identified it by polymerase chain reaction (PCR), and confirmed the identification by PCR-restriction fragment length polymorphism. Primers for the SP agrA and agrBD genes were designed using online primer designing software and BLAST searched for specificity. Amplification of the agr genes was carried out for 53 isolates of SP by PCR, and sequencing of agrA, agrB, and agrD was carried out for five isolates and analyzed using DNAstar and Mega5.2 software. Results: A total of 53 (59%) SP isolates were obtained from 90 samples. Fifteen isolates (28%) were confirmed to be methicillin-resistant SP (MRSP) by detection of the mecA gene. The agrA, agrB, and agrD genes were detected in all the SP isolates. Complete nucleotide sequences of these three genes for five isolates were submitted to GenBank under accession numbers KJ133557 to KJ133571. AgrA amino acid sequence analysis showed that it is mainly made of alpha-helices and is hydrophilic in nature. AgrB is a transmembrane protein, and agrD encodes the precursor of the autoinducing peptide (AIP). Sequencing of the agrD gene revealed that the 5 canine SP strains tested could be divided into three Agr specificity groups (RIPTSTGFF, KIPTSTGFF, and RIPISTGFF) based on the putative AIP produced by each strain.

  17. Complete MHC haplotype sequencing for common disease gene mapping.

    Stewart, C Andrew; Horton, Roger; Allcock, Richard J N; Ashurst, Jennifer L; Atrazhev, Alexey M; Coggill, Penny; Dunham, Ian; Forbes, Simon; Halls, Karen; Howson, Joanna M M; Humphray, Sean J; Hunt, Sarah; Mungall, Andrew J; Osoegawa, Kazutoyo; Palmer, Sophie; Roberts, Anne N; Rogers, Jane; Sims, Sarah; Wang, Yu; Wilming, Laurens G; Elliott, John F; de Jong, Pieter J; Sawcer, Stephen; Todd, John A; Trowsdale, John; Beck, Stephan

    2004-06-01

    The future systematic mapping of variants that confer susceptibility to common diseases requires the construction of a fully informative polymorphism map. Ideally, every base pair of the genome would be sequenced in many individuals. Here, we report 4.75 Mb of contiguous sequence for each of two common haplotypes of the major histocompatibility complex (MHC), to which susceptibility to >100 diseases has been mapped. The autoimmune disease-associated haplotypes HLA-A3-B7-Cw7-DR15 and HLA-A1-B8-Cw7-DR3 were sequenced in their entirety through a bacterial artificial chromosome (BAC) cloning strategy using the consanguineous cell lines PGF and COX, respectively. The two sequences were annotated to encompass all described splice variants of expressed genes. We defined the complete variation content of the two haplotypes, revealing >18,000 variations between them. Average SNP densities ranged from less than one SNP per kilobase to >60. Acquisition of complete and accurate sequence data over polymorphic regions such as the MHC from large-insert cloned DNA provides a definitive resource for the construction of informative genetic maps, and avoids the limitation of chromosome regions that are refractory to PCR amplification. PMID:15140828
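
    The SNP-density figures in this abstract are counts per kilobase over stretches of sequence. The sketch below shows one way such a windowed density could be computed; the function name `snp_density_per_kb`, the fixed 1 kb window, and the toy positions are assumptions and do not reproduce the authors' method. The final line gives a rough overall lower bound from the reported totals (more than 18,000 variations over 4.75 Mb), treating every variation as if it were a single site.

```python
def snp_density_per_kb(positions, region_length, window=1000):
    """Return SNP density (SNPs per kb) for consecutive windows of a region.

    positions:     0-based SNP coordinates within the region
    region_length: length of the region in bp
    window:        window size in bp (1 kb by default, an arbitrary choice)
    """
    n_windows = -(-region_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        if not 0 <= pos < region_length:
            raise ValueError(f"position {pos} lies outside the region")
        counts[pos // window] += 1
    densities = []
    for i, count in enumerate(counts):
        # The last window may be shorter than the others.
        width = min(window, region_length - i * window)
        densities.append(count * 1000 / width)
    return densities


# Toy positions (made up) in a 3 kb region.
densities = snp_density_per_kb([10, 500, 999, 1500, 2999], 3000)

# Overall lower bound from the abstract: >18,000 variations over 4.75 Mb.
overall_per_kb = round(18000 / 4750, 2)
```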

  18. Cloning, nucleotide sequence, and expression of the Rhodobacter sphaeroides Y thioredoxin gene.

    Pille, S.; Chuat, J C; Breton, A M; Clément-Métral, J D; Galibert, F

    1990-01-01

    Synthetic oligodeoxynucleotide probes based on the known amino acid sequence of Rhodobacter sphaeroides Y thioredoxin were used to identify, clone, and sequence the structural gene. The amino acid sequence derived from the DNA sequence of the R. sphaeroides gene was identical to the known amino acid sequence of R. sphaeroides thioredoxin. An NcoI site was created by directed mutagenesis at the beginning of the thioredoxin gene, inducing in the encoded protein the replacement of serine in posi...

  19. Angiosperm phylogeny inferred from sequences of four mitochondrial genes

    Yin-Long QIU; Zhi-Duan CHEN; Libo LI; Bin WANG; Jia-Yu XUE; Tory A. HENDRY; Rui-Qi LI; Joseph W. BROWN; Yang LIU; Geordan T. HUDSON

    2010-01-01

    An angiosperm phylogeny was reconstructed in a maximum likelihood analysis of sequences of four mitochondrial genes, atp1, matR, nad5, and rps3, from 380 species that represent 376 genera and 296 families of seed plants. It is largely congruent with the phylogeny of angiosperms reconstructed from chloroplast genes atpB, matK, and rbcL, and nuclear 18S rDNA. The basalmost lineage consists of Amborella and Nymphaeales (including Hydatellaceae). Austrobaileyales follow this clade and are sister to the mesangiosperms, which include Chloranthaceae, Ceratophyllum, magnoliids, monocots, and eudicots. With the exception of Chloranthaceae being sister to Ceratophyllum, relationships among these five lineages are not well supported. In eudicots, Ranunculales, Sabiales, Proteales, Trochodendrales, Buxales, Gunnerales, Saxifragales, Vitales, Berberidopsidales, and Dilleniales form a basal grade of lines that diverged before the diversification of rosids and asterids. Within rosids, the COM (Celastrales-Oxalidales-Malpighiales) clade is sister to malvids (or rosid Ⅱ), instead of to the nitrogen-fixing clade as found in all previous large-scale molecular analyses of angiosperms. Santalales and Caryophyllales are members of an expanded asterid clade. This study shows that the mitochondrial genes are informative markers for resolving relationships among genera, families, or higher rank taxa across angiosperms. The low substitution rates and low homoplasy levels of the mitochondrial genes relative to the chloroplast genes, as found in this study, make them particularly useful for reconstructing ancient phylogenetic relationships. A mitochondrial gene-based angiosperm phylogeny provides an independent and essential reference for comparison with hypotheses of angiosperm phylogeny based on chloroplast genes, nuclear genes, and non-molecular data to reconstruct the underlying organismal phylogeny.

  20. Structure and sequence variation of mink interleukin-6 gene

    Aleutian disease (AD) is the number one disease threat to the survival and future of the mink industry in Nova Scotia and the world. Several ranchers have gone out of business in recent years in Nova Scotia as a direct result of AD. Currently, the control measure for AD consists of testing and slaughtering infected mink. This practice has not been effective in controlling the disease. Finding a means of controlling AD is the number one priority for the mink industry in Nova Scotia. An effective control measure will have a long-term positive effect on the rural economy by improving the production potential of mink and reducing production cost. It has been shown that antiviral antibodies produced by activated immune system cells sometimes combine with interleukin-6 (IL-6) to form immune complexes that cause AD in mink. There is evidence of a significant relationship between nucleotide variations in the IL-6 gene and the onset of certain diseases in humans, which bear symptoms similar to those of AD. Furthermore, pathological symptoms of AD resemble those of other conditions, such as systemic lupus erythematosus (SLE) and Castleman disease in humans, where overproduction of IL-6 coincides with the severity of the disease. These findings suggest that IL-6 could be a candidate gene and warrant investigation vis-a-vis differences among mink genotypes in resistance or tolerance to ADV infection. The IL-6 gene in mink was sequenced, and the polymorphisms identified were used to evaluate the potential role of this gene in the immune system response to infections. The 4678 bp promoter region, five exons and four introns of the interleukin-6 (IL-6) gene were bi-directionally sequenced in four unrelated mink from each of the wild, black, brown, pastel and sapphire mink (GenBank accession number EF620932). The 344 bp promoter region of the gene contained several transcription binding sites. One exonic and seven intronic single nucleotide polymorphisms (SNP) were detected by

  1. Technology development for gene discovery and full-length sequencing

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  2. Nucleotide sequence and expression analysis of the Acetobacter xylinum uridine diphosphoglucose pyrophosphorylase gene.

    Brede, G; Fjaervik, E; Valla, S

    1991-01-01

    The nucleotide sequence of the Acetobacter xylinum uridine diphosphoglucose pyrophosphorylase gene was determined; this is the first procaryotic uridine diphosphoglucose pyrophosphorylase gene sequence reported. The sequence data indicated that the gene product consists of 284 amino acids. This finding was consistent with the results obtained by expression analysis in vivo and in vitro in Escherichia coli.

  3. Cloning and sequence analysis of US1 gene in duck enteritis virus

    ZHAO Yan; WANG Jun-wei; MA Bo; ZHAO Xiao-yan

    2011-01-01

    In this paper, a 1,860 bp sequence in the IRs region of duck enteritis virus (DEV) was amplified by single oligonucleotide nested PCR with a single primer designed according to a partial sequence of US1, and then a pair of primers designed according to the 3' UTR of the US8 gene and the 5' end of the newly obtained sequence was used to amplify a 2,426 bp sequence toward the TRs region. Sequence analysis revealed that both sequences contained an identical 990 bp open reading frame of the DEV US1 gene. The two ORFs were in opposite transcription orientation. Comparison of the nucleotide sequence and the deduced amino acid sequence of the US1 gene showed relatively high identity to Mardivirus. Phylogenetic tree analysis showed that the eleven herpesviruses were classified into three groups, and that duck enteritis virus was most closely related to Mardivirus.

  4. dcp gene of Escherichia coli: cloning, sequencing, transcript mapping, and characterization of the gene product.

    Henrich, B; S. Becker; Schroeder, U; Plapp, R.

    1993-01-01

    Dipeptidyl carboxypeptidase is a C-terminal exopeptidase of Escherichia coli. We have isolated the respective gene, dcp, from a low-copy-number plasmid library by its ability to complement a dcp mutation preventing the utilization of the unique substrate N-benzoyl-L-glycyl-L-histidyl-L-leucine. Sequence analysis of a 2.9-kb DNA fragment revealed an open reading frame of 2,043 nucleotides which was assigned to the dcp gene by N-terminal amino acid sequencing and electrophoretic molecular mass ...

  5. Deep sequencing reveals 50 novel genes for recessive cognitive disorders.

    Najmabadi, Hossein; Hu, Hao; Garshasbi, Masoud; Zemojtel, Tomasz; Abedini, Seyedeh Sedigheh; Chen, Wei; Hosseini, Masoumeh; Behjati, Farkhondeh; Haas, Stefan; Jamali, Payman; Zecha, Agnes; Mohseni, Marzieh; Püttmann, Lucia; Vahid, Leyla Nouri; Jensen, Corinna; Moheb, Lia Abbasi; Bienek, Melanie; Larti, Farzaneh; Mueller, Ines; Weissmann, Robert; Darvish, Hossein; Wrogemann, Klaus; Hadavi, Valeh; Lipkowitz, Bettina; Esmaeeli-Nieh, Sahar; Wieczorek, Dagmar; Kariminejad, Roxana; Firouzabadi, Saghar Ghasemi; Cohen, Monika; Fattahi, Zohreh; Rost, Imma; Mojahedi, Faezeh; Hertzberg, Christoph; Dehghan, Atefeh; Rajab, Anna; Banavandi, Mohammad Javad Soltani; Hoffer, Julia; Falah, Masoumeh; Musante, Luciana; Kalscheuer, Vera; Ullmann, Reinhard; Kuss, Andreas Walter; Tzschach, Andreas; Kahrizi, Kimia; Ropers, H Hilger

    2011-10-01

    Common diseases are often complex because they are genetically heterogeneous, with many different genetic defects giving rise to clinically indistinguishable phenotypes. This has been amply documented for early-onset cognitive impairment, or intellectual disability, one of the most complex disorders known and a very important health care problem worldwide. More than 90 different gene defects have been identified for X-chromosome-linked intellectual disability alone, but research into the more frequent autosomal forms of intellectual disability is still in its infancy. To expedite the molecular elucidation of autosomal-recessive intellectual disability, we have now performed homozygosity mapping, exon enrichment and next-generation sequencing in 136 consanguineous families with autosomal-recessive intellectual disability from Iran and elsewhere. This study, the largest published so far, has revealed additional mutations in 23 genes previously implicated in intellectual disability or related neurological disorders, as well as single, probably disease-causing variants in 50 novel candidate genes. Proteins encoded by several of these genes interact directly with products of known intellectual disability genes, and many are involved in fundamental cellular processes such as transcription and translation, cell-cycle control, energy metabolism and fatty-acid synthesis, which seem to be pivotal for normal brain development and function. PMID:21937992

  6. Multiple gene sequence analysis using genes of the bacterial DNA repair pathway

    Miguel Rotelok Neto; Carolina Weigert Galvão; Leonardo Magalhães Cruz; Dieval Guizelini; Leilane Caline Silva; Jarem Raul Garcia; Rafael Mazer Etto

    2015-01-01

    The ability to recognize and repair abnormal DNA structures is common to all forms of life. Physiological studies and genomic sequencing of a variety of bacterial species have identified an incredible diversity of DNA repair pathways. Despite the amount of available genes in public database, the usual method to place genomes in a taxonomic context is based mainly on the 16S rRNA or housekeeping genes. Thus, the relationships among genomes remain poorly understood. In this work, an approach of...

  7. Efficient expression of the Saccharomyces cerevisiae PGK gene depends on an upstream activation sequence but does not require TATA sequences.

    Ogden, J E; Stanway, C; Kim, S.; Mellor, J; Kingsman, A J; Kingsman, S M

    1986-01-01

    The Saccharomyces cerevisiae PGK (phosphoglycerate kinase) gene encodes one of the most abundant mRNA and protein species in the cell. To identify the promoter sequences required for the efficient expression of PGK, we undertook a detailed internal deletion analysis of the 5' noncoding region of the gene. Our analysis revealed that PGK has an upstream activation sequence (UASPGK) located between 402 and 479 nucleotides upstream from the initiating ATG sequence which is required for full trans...

  8. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Nathan D. Olson

    2015-03-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.

  9. SEQUENCING AND SEQUENCE ANALYSIS OF MYOSTATIN GENE IN THE EXON 1 OF THE CAMEL (CAMELUS DROMEDARIUS)

    M. G. SHAH, A. S. QURESHI, M. REISSMANN AND H. J. SCHWARTZ

    2006-10-01

    Myostatin, also called growth differentiation factor-8 (GDF-8), is a member of the mammalian transforming growth factor-beta (TGF-beta) superfamily, which is expressed specifically in developing and adult skeletal muscle. The muscular hypertrophy (mh) allele in the double-muscled breeds involves a mutation within the myostatin gene. Genomic DNA was isolated from camel hair using the NucleoSpin Tissue kit. Two animals of each of the six breeds, namely Marecha, Dhatti, Larri, Kohi, Sakrai and Cambelpuri, were used for sequencing. For PCR amplification of the gene, a primer pair was designed from homologous regions of already published sequences of farm animals from GenBank. Results showed that camel myostatin possessed more than 90% homology with that of cattle, sheep and pig. Camel formed a separate cluster from the pig in spite of having high homology (98%) and showed 94% homology with cattle and sheep as reported in literature. Sequence analysis showed that the PCR-amplified part of exon 1 (256 bp) of camel myostatin was identical among the six camel breeds.

  10. A homeodomain protein binds to γ-globin gene regulatory sequences

    Lavelle, D.; Ducksworth, J.; Eves, E.; Gomes, G.; Keller, M.; Heller, P.; DeSimone, J. (Univ. of Illinois, Chicago (United States) Veterans Administration Westside Medical Center, Chicago, IL (United States))

    1991-08-15

    Developmental regulation of γ-globin gene expression probably occurs through developmental-stage-specific trans-acting factors able to promote the interaction of enhancer elements located in the far upstream locus control region with regulatory elements in the γ gene promoters and 3′ Aγ enhancer located in close proximity to the genes. The authors have detected a nuclear protein in K562 and baboon fetal bone marrow nuclear extracts capable of binding to A+T-rich sequences in the locus control region, γ gene promoter, and 3′ Aγ enhancer. SDS/polyacrylamide gel analysis of the purified K562 binding activity revealed a single protein of 87 kDa. A K562 cDNA clone was isolated encoding a β-galactosidase fusion protein with a DNA binding specificity identical to that of the K562/fetal bone marrow nuclear protein. The cDNA clone encodes a homeodomain homologous to the Drosophila antennapedia protein.

  11. dcp gene of Escherichia coli: cloning, sequencing, transcript mapping, and characterization of the gene product.

    Henrich, B; Becker, S; Schroeder, U; Plapp, R

    1993-01-01

    Dipeptidyl carboxypeptidase is a C-terminal exopeptidase of Escherichia coli. We have isolated the respective gene, dcp, from a low-copy-number plasmid library by its ability to complement a dcp mutation preventing the utilization of the unique substrate N-benzoyl-L-glycyl-L-histidyl-L-leucine. Sequence analysis of a 2.9-kb DNA fragment revealed an open reading frame of 2,043 nucleotides which was assigned to the dcp gene by N-terminal amino acid sequencing and electrophoretic molecular mass determination of the purified dcp product. Transcript mapping by primer extension and S1 protection experiments verified the physiological significance of potential initiation and termination signals for dcp transcription and allowed the identification of a single species of monocistronic dcp mRNA. The codon usage pattern and the effects of elevated gene copy number indicated a relatively low level of dcp expression. The predicted amino acid sequence of dipeptidyl carboxypeptidase, containing a potential zinc-binding site, is highly homologous (78.8%) to the corresponding enzyme from Salmonella typhimurium. It also displays significant homology to the products of the S. typhimurium opdA and the E. coli prlC genes and to some metalloproteases from rats and Saccharomyces cerevisiae. No potential export signals could be inferred from the amino acid sequence. Dipeptidyl carboxypeptidase was enriched 80-fold from crude extracts of E. coli and used to investigate some of its biochemical and biophysical properties. Images PMID:8226676

  12. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-04-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  13. Molecular cloning, sequence identification, and gene expression analysis of bovine ADCY2 gene.

    Li, Y X; Jin, H G; Yan, C G; Ren, C Y; Jiang, C J; Jin, C D; Seo, K S; Jin, X

    2014-06-01

    Adenylyl cyclase 2 (ADCY2), a class B member of adenylyl cyclases, is important in accelerating phosphor-acidification as well as glycogen synthesis and breakdown. Given its distinct role in flesh tenderization after butchering, we cloned and sequenced the ADCY2 gene from Yanbian cattle and assessed its expression in bovine tissues. A 2947 bp nucleotide sequence representing the full-length cDNA of bovine ADCY2 gene was obtained by 5' and 3' remote analysis computations for gene expression. Analyses of the putative protein sequence showed that ADCY2 had high homology among species, except with the non-mammal Oreochromis niloticus. Gene structural domain analyses in humans and rats indicated that the ADCY2 protein had no flaw; only the transmembrane domain was reduced and the CYCc structure domain was shortened. Assessment of ADCY2 expression in bovine tissues by real-time PCR showed that the highest expression was in the testes, followed by the longissimus dorsi, tensor fasciae latae, and latissimus dorsi. These data will serve as a foundation for further insight into the cattle ADCY2 gene. PMID:24797538

  14. EvoTol: a protein-sequence based evolutionary intolerance framework for disease-gene prioritization

    Rackham, Owen J. L.; Shihab, Hashem A; Johnson, Michael R.; Petretto, Enrico

    2014-01-01

    Methods to interpret personal genome sequences are increasingly required. Here, we report a novel framework (EvoTol) to identify disease-causing genes using patient sequence data from within protein coding-regions. EvoTol quantifies a gene's intolerance to mutation using evolutionary conservation of protein sequences and can incorporate tissue-specific gene expression data. We apply this framework to the analysis of whole-exome sequence data in epilepsy and congenital heart disease, and demon...

  15. Nucleotide sequence and corresponding amino acid sequence of the gene for the major antigen of foot and mouth disease virus.

    Kurz, C; Forss, S; Küpper, H; K Strohmaier; Schaller, H

    1981-01-01

    A segment of 1160 nucleotides of the FMDV genome has been sequenced using three overlapping fragments of cloned cDNA from FMDV strain O1K. This sequence contains the coding sequence for the viral capsid protein VP1 as shown by its homology to known and newly determined amino acid sequences from this main antigenic polypeptide of the FMDV virion. The structural gene for VP1 comprises 639 nucleotides which specify a sequence of 213 amino acids for the VP1 protein. The coding sequence is not flan...
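
    The two figures in this abstract can be checked against each other: at three nucleotides per codon, 639 nucleotides correspond to exactly 213 codons. The short sketch below makes that arithmetic explicit; the function name `codon_count` is an illustrative assumption, not something taken from the paper.

```python
def codon_count(nucleotides):
    """Number of complete codons in a coding segment of the given length."""
    if nucleotides % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return nucleotides // 3


if __name__ == "__main__":
    # 639 nucleotides reported for the VP1 structural gene.
    print(codon_count(639))  # 213
```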

  16. Identification of novel hereditary cancer genes by whole exome sequencing.

    Sokolenko, Anna P; Suspitsin, Evgeny N; Kuligina, Ekatherina Sh; Bizin, Ilya V; Frishman, Dmitrij; Imyanitov, Evgeny N

    2015-12-28

    Whole exome sequencing (WES) provides a powerful tool for medical genetic research. Several dozen WES studies involving patients with hereditary cancer syndromes have already been reported. WES led to a breakthrough in understanding the genetic basis of some exceptionally rare syndromes; for example, identification of germ-line SMARCA4 mutations in patients with ovarian hypercalcemic small cell carcinomas explains a noticeable share of familial aggregation of this disease. However, studies on common cancer types turned out to be more difficult. In particular, almost a dozen reports describe WES analysis of breast cancer patients, but none of them has yet revealed a gene responsible for a significant share of the missing heritability. Virtually all components of WES studies require substantial improvement, e.g. technical performance of WES, interpretation of WES results, mode of patient selection, etc. Most contemporary investigations focus on genes with an autosomal dominant mechanism of inheritance; however, recessive and oligogenic models of transmission of cancer susceptibility also need to be considered. It is expected that the list of medically relevant tumor-predisposing genes will expand rapidly in the next few years. PMID:26427841

  17. Transcriptome sequencing and positive selected genes analysis of Bombyx mandarina.

    Cheng, Tingcai; Fu, Bohua; Wu, Yuqian; Long, Renwen; Liu, Chun; Xia, Qingyou

    2015-01-01

    The wild silkworm Bombyx mandarina is widely believed to be an ancestor of the domesticated silkworm, Bombyx mori. Silkworms are often used as a model for studying the mechanism of species domestication. Here, we performed transcriptome sequencing of the wild silkworm using an Illumina HiSeq2000 platform. We produced 100,004,078 high-quality reads and assembled them into 50,773 contigs with an N50 length of 1764 bp and a mean length of 941.62 bp. A total of 33,759 unigenes were identified, with 12,805 annotated in the Nr database, 8273 in the Pfam database, and 9093 in the Swiss-Prot database. Expression profile analysis found significant differential expression of 1308 unigenes between the middle silk gland (MSG) and posterior silk gland (PSG). Three sericin genes (sericin 1, sericin 2, and sericin 3) were expressed specifically in the MSG and three fibroin genes (fibroin-H, fibroin-L, and fibroin/P25) were expressed specifically in the PSG. In addition, 32,297 single-nucleotide polymorphisms (SNPs) and 361 insertion-deletions (INDELs) were detected. Comparison with the domesticated silkworm p50/Dazao identified 5,295 orthologous genes, among which 400 might have experienced or be experiencing positive selection according to Ka/Ks analysis. The data and analyses presented here provide insights into silkworm domestication and an invaluable resource for wild silkworm genomics research. PMID:25806526
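
    The abstract reports both an N50 (1764 bp) and a mean contig length (941.62 bp). As a minimal sketch of how the two statistics differ, the function below computes N50 from a list of contig lengths using the common definition: the length of the contig at which the running total, counted from the longest contig down, first reaches half of the assembly. The toy lengths are invented for illustration and are not data from the study.

```python
def n50(lengths):
    """Contig length at which the cumulative sum (longest first) reaches half the total."""
    if not lengths:
        raise ValueError("no contigs given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    toy = [100, 200, 300, 400, 500]  # hypothetical contig lengths in bp
    print("N50:", n50(toy))
    print("mean:", sum(toy) / len(toy))
```

    Because N50 weights contigs by their length, it typically sits above the mean when an assembly contains many short contigs, which is consistent with the gap between the two values quoted above.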

  18. Transcriptome sequencing and positive selected genes analysis of Bombyx mandarina.

    Tingcai Cheng

    The wild silkworm Bombyx mandarina is widely believed to be an ancestor of the domesticated silkworm, Bombyx mori. Silkworms are often used as a model for studying the mechanism of species domestication. Here, we performed transcriptome sequencing of the wild silkworm using an Illumina HiSeq2000 platform. We produced 100,004,078 high-quality reads and assembled them into 50,773 contigs with an N50 length of 1764 bp and a mean length of 941.62 bp. A total of 33,759 unigenes were identified, with 12,805 annotated in the Nr database, 8273 in the Pfam database, and 9093 in the Swiss-Prot database. Expression profile analysis found significant differential expression of 1308 unigenes between the middle silk gland (MSG) and posterior silk gland (PSG). Three sericin genes (sericin 1, sericin 2, and sericin 3) were expressed specifically in the MSG and three fibroin genes (fibroin-H, fibroin-L, and fibroin/P25) were expressed specifically in the PSG. In addition, 32,297 single-nucleotide polymorphisms (SNPs) and 361 insertion-deletions (INDELs) were detected. Comparison with the domesticated silkworm p50/Dazao identified 5,295 orthologous genes, among which 400 might have experienced or be experiencing positive selection according to Ka/Ks analysis. The data and analyses presented here provide insights into silkworm domestication and an invaluable resource for wild silkworm genomics research.

  19. Estimating the extent of horizontal gene transfer in metagenomic sequences

    Moya Andrés

    2008-03-01

    Background: Although the extent of horizontal gene transfer (HGT) in complete genomes has been widely studied, its influence in the evolution of natural communities of prokaryotes remains unknown. The availability of metagenomic sequences allows us to address the study of global patterns of prokaryotic evolution in samples from natural communities. However, the methods that have been commonly used for the study of HGT are not suitable for metagenomic samples. Therefore it is important to develop new methods or to adapt existing ones to be used with metagenomic sequences. Results: We have created two different methods that are suitable for the study of HGT in metagenomic samples. The methods are based on phylogenetic and DNA compositional approaches, and have allowed us to assess the extent of possible HGT events in metagenomes for the first time. The methods are shown to be compatible and quite precise, although they probably underestimate the number of possible events. Our results show that the phylogenetic method detects HGT in between 0.8% and 1.5% of the sequences, while DNA compositional methods identify putative HGT in between 2% and 8% of the sequences. These ranges are very similar to those found in complete genomes by related approaches. The two methods differ in sensitivity, since they probably target HGT events of different ages: the compositional method mostly identifies recent transfers, while the phylogenetic method is more suitable for the detection of older events. Nevertheless, the study of the number of HGT events in metagenomic sequences from different communities shows a consistent trend for both methods: the lowest amount is found for the sequences of the Sargasso Sea metagenome, while the highest is found in the whale fall metagenome from the bottom of the ocean. The significance of these observations is discussed. Conclusion: The computational approaches that are used to find possible HGT events in complete

  20. Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications

    Yilmaz, Pelin; Kottmann, Renzo; Field, Dawn;

    2011-01-01

    Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences--the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The 'environment...... present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere....

  1. Molecular Cloning and Sequencing of Hemoglobin-Beta Gene of Channel Catfish, Ictalurus Punctatus Rafinesque

    Hemoglobin-β gene of channel catfish, Ictalurus punctatus, was cloned and sequenced. Total RNA from head kidneys was isolated, reverse transcribed and amplified. The sequence of the channel catfish hemoglobin-β gene consists of 600 nucleotides. Analysis of the nucleotide sequence reveals one o...

  2. Detection bias in microarray and sequencing transcriptomic analysis identified by housekeeping genes

    Yijuan Zhang; Oluwafemi S. Akintola; Liu, Ken J.A.; Bingyun Sun

    2015-01-01

    This work includes the original data used to discover the gene ontology bias in transcriptomic analysis conducted by microarray and high throughput sequencing (Zhang et al., 2015) [1]. In the analysis, housekeeping genes were used to examine the differential detection ability of microarray and sequencing because these genes are probably the most reliably detected. The genes included here were compiled from 15 human housekeeping gene studies. The tables provided here comprise detailed chrom...

  3. Poly purine.pyrimidine sequences upstream of the beta-galactosidase gene affect gene expression in Saccharomyces cerevisiae

    Brahmachari Samir K

    2001-10-01

    Background: Poly purine.pyrimidine sequences have the potential to adopt intramolecular triplex structures and are overrepresented upstream of genes in eukaryotes. These sequences may regulate gene expression by modulating the interaction of transcription factors with DNA sequences upstream of genes. Results: A poly purine.pyrimidine sequence with the potential to adopt an intramolecular triplex DNA structure was designed. The sequence was inserted within a nucleosome positioned upstream of the β-galactosidase gene in yeast, Saccharomyces cerevisiae, between the cyc1 promoter and gal10 Upstream Activating Sequences (UASg). Upon derepression with galactose, β-galactosidase gene expression is reduced 12-fold in cells carrying single copy poly purine.pyrimidine sequences. This reduction in expression is correlated with reduced transcription. Furthermore, we show that plasmids carrying a poly purine.pyrimidine sequence are not specifically lost from yeast cells. Conclusion: We propose that a poly purine.pyrimidine sequence upstream of a gene affects transcription. Plasmids carrying this sequence are not specifically lost from cells and thus no additional effort is needed for the replication of these sequences in eukaryotic cells.

  4. Identification and analysis of gene families from the duplicated genome of soybean using EST sequences

    Shoemaker Randy

    2006-08-01

    Background: Large scale gene analysis of most organisms is hampered by incomplete genomic sequences. In many organisms, such as soybean, the best source of sequence information is the existence of expressed sequence tag (EST) libraries. Soybean has a large (1115 Mbp) genome that has yet to be fully sequenced. However, it does have the 6th largest EST collection, comprising ESTs from a variety of soybean genotypes. Many EST libraries were constructed from RNA extracted from various genetic backgrounds, thus gene identification from these sources is complicated by the existence of both gene and allele sequence differences. We used the ESTminer suite of programs to identify potential soybean gene transcripts from a single genetic background, allowing us to observe functional classifications between gene families as well as structural differences between genes and gene paralogs within families. The identification of potential gene sequences (pHaps) from soybean allows us to begin to get a picture of the genomic history of the organism as well as begin to observe the evolutionary fates of gene copies in this highly duplicated genome. Results: We identified approximately 45,000 potential gene sequences (pHaps) from EST sequences of Williams/Williams82, an inbred genotype of soybean (Glycine max L. Merr.), using a redundancy criterion to identify reproducible sequence differences between related genes within gene families. Analysis of these sequences revealed that single base substitutions and single base indels are the most frequently observed form of sequence variation between genes within families in the dataset. Genomic sequencing of selected loci indicates that intron-like intervening sequences are numerous and are approximately 220 bp in length. Functional annotation of gene sequences indicates that functional classifications are not randomly distributed among gene families containing few or many genes. Conclusion: The predominance of single nucleotide

  5. Cloning, sequencing and expression of a xylanase gene from the maize pathogen Helminthosporium turcicum

    Degefu, Y.; Paulin, L.; Lübeck, Peter Stephensen

    2001-01-01

    A gene encoding an endoxylanase from the phytopathogenic fungus Helminthosporium turcicum Pass. was cloned and sequenced. The entire nucleotide sequence of a 1991 bp genomic fragment containing an endoxylanase gene was determined. The xylanase gene of 795 bp, interrupted by two introns of 52 and ...

  6. High throughput 16S rRNA gene amplicon sequencing

    Nierychlo, Marta; Larsen, Poul; Jørgensen, Mads Koustrup;

    16S rRNA gene amplicon sequencing has been developed over the past few years and is now ready to use for more comprehensive studies related to plant operation and optimization thanks to short analysis time, low cost, high throughput, and high taxonomic resolution. In this study we show how 16S r...... belonging to the phylum Chloroflexi. Based on knowledge about their ecophysiology, other control measures were introduced and the bulking problem was reduced after 2 months. Besides changes in the filament abundance and composition also other changes in the microbial community were observed that likely...... correlated with the bacterial species composition in 25 Danish full-scale WWTPs with nutrient removal. Examples of properties were SVI, filament index, floc size, floc strength, content of cations and amount of extracellular polymeric substances. Multivariate statistics provided several important insights...

  7. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses.

    Stelzer, Gil; Rosen, Naomi; Plaschkes, Inbar; Zimmerman, Shahar; Twik, Michal; Fishilevich, Simon; Stein, Tsippi Iny; Nudel, Ron; Lieder, Iris; Mazor, Yaron; Kaplan, Sergey; Dahary, Dvir; Warshawsky, David; Guan-Golan, Yaron; Kohn, Asher; Rappaport, Noa; Safran, Marilyn; Lancet, Doron

    2016-01-01

    GeneCards, the human gene compendium, enables researchers to effectively navigate and inter-relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure facilitating faster data updates, better-targeted data queries, and friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene-disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression. VarElect, another suite member, is a phenotype prioritizer for next-generation sequencing, leveraging the GeneCards and MalaCards knowledgebase. It automatically infers direct and indirect scored associations between hundreds or even thousands of variant-containing genes and disease phenotype terms. VarElect's capabilities, either independently or within TGex, our comprehensive variant analysis pipeline, help prepare for the challenge of clinical projects that involve thousands of exome/genome NGS analyses. © 2016 by John Wiley & Sons, Inc. PMID:27322403

  8. Sequence-specific interactions of nuclear factors with conserved sequences of human class II major histocompatibility complex genes

    All class II major histocompatibility complex genes contain two highly conserved sequences, termed X and Y, within the promoter region(s), which may have a role in regulation of expression. To study trans-acting factors that interact with these sequences, sequence-specific DNA binding activity has been examined by the gel electrophoresis retardation assay using the HLA-DQ2β gene 5' flanking DNA and nuclear extracts derived from various cell types. Several specific protein-binding activities were found using a 45-base-pair (bp) HinfI/Sau96I (-142 to -98 bp) and a 38-bp Sau96I/Sau96I (-97 to -60 bp) fragment, which include conserved sequence X (-113 to -100 bp) and conserved sequence Y (-80 to -71 bp), respectively. Competition experiments, methylation interference analysis, and DNase I footprinting demonstrated that distinct proteins in a nuclear extract of Raji cells (a human B lymphoma line) bind to sequence X, to sequence Y, and to DNA 5' of the X sequence (termed sequence W). The factor binding site in the W sequence is also found to be conserved among β-chain genes and is suggested to be a γ-interferon control region.

  9. Next generation sequencing in synovial sarcoma reveals novel gene mutations.

    Vlenterie, Myrella; Hillebrandt-Roeffen, Melissa H S; Flucke, Uta E; Groenen, Patricia J T A; Tops, Bastiaan B J; Kamping, Eveline J; Pfundt, Rolph; de Bruijn, Diederik R H; Geurts van Kessel, Ad H M; van Krieken, Han J H J M; van der Graaf, Winette T A; Versleijen-Jonkers, Yvonne M H

    2015-10-27

    Over 95% of all synovial sarcomas (SS) share a unique translocation, t(X;18); however, they show heterogeneous clinical behavior. We analyzed multiple SS to reveal additional genetic alterations besides the translocation. Twenty-six SS from 22 patients were sequenced for 409 cancer-related genes using the Comprehensive Cancer Panel (Life Technologies, USA) on an Ion Torrent platform. The detected variants were verified by Sanger sequencing and compared to matched normal DNAs. Copy number variation was assessed in six tumors using the Oncoscan array (Affymetrix, USA). In total, eight somatic mutations were detected in eight samples. These mutations have not been reported previously in SS. Two of these, in KRAS and CCND1, represent known oncogenic mutations in other malignancies. Additional mutations were detected in RNF213, SEPT9, KDR, CSMD3, MLH1 and ERBB4. DNA alterations occurred more often in adult tumors. A distinctive loss of 6q was found in a metastatic lesion progressing under pazopanib, but not in the responding lesion. Our results emphasize t(X;18) as a single initiating event in SS and as the main oncogenic driver. Our results also show the occurrence of additional genetic events, mutations or chromosomal aberrations, occurring more frequently in SS with an onset in adults. PMID:26415226

  10. Hidden Markov Models for Gene Sequence Classification: Classifying the VSG genes in the Trypanosoma brucei Genome

    Mesa, Andrea; Basterrech, Sebastián; Guerberoff, Gustavo; Alvarez-Valin, Fernando

    2015-01-01

    The article presents an application of Hidden Markov Models (HMMs) for pattern recognition on genome sequences. We apply HMM for identifying genes encoding the Variant Surface Glycoprotein (VSG) in the genomes of Trypanosoma brucei (T. brucei) and other African trypanosomes. These are parasitic protozoa causative agents of sleeping sickness and several diseases in domestic and wild animals. These parasites have a peculiar strategy to evade the host's immune system that consists in periodicall...

  11. Genome-wide gene-gene interaction analysis for next-generation sequencing.

    Zhao, Jinying; Zhu, Yun; Xiong, Momiao

    2016-03-01

    The critical barrier in interaction analysis for next-generation sequencing (NGS) data is that the traditional pairwise interaction analysis that is suitable for common variants is difficult to apply to rare variants because of their prohibitive computational time, large number of tests and low power. The great challenges for successful detection of interactions with NGS data are (1) the demands in the paradigm of changes in interaction analysis; (2) severe multiple testing; and (3) heavy computations. To meet these challenges, we shift the paradigm of interaction analysis between two SNPs to interaction analysis between two genomic regions. In other words, we take a gene as a unit of analysis and use functional data analysis techniques as dimensional reduction tools to develop a novel statistic to collectively test interaction between all possible pairs of SNPs within two genome regions. By intensive simulations, we demonstrate that the functional logistic regression for interaction analysis has the correct type 1 error rates and higher power to detect interaction than the currently used methods. The proposed method was applied to a coronary artery disease dataset from the Wellcome Trust Case Control Consortium (WTCCC) study and the Framingham Heart Study (FHS) dataset, and the early-onset myocardial infarction (EOMI) exome sequence datasets with European origin from the NHLBI's Exome Sequencing Project. We discovered that 6 of 27 pairs of significantly interacted genes in the FHS were replicated in the independent WTCCC study and 24 pairs of significantly interacted genes after applying Bonferroni correction in the EOMI study. PMID:26173972

  12. Structural organization of glycophorin A and B genes: Glycophorin B gene evolved by homologous recombination at Alu repeat sequences

    Glycophorins A (GPA) and B (GPB) are two major sialoglycoproteins of the human erythrocyte membrane. Here the authors present a comparison of the genomic structures of GPA and GPB developed by analyzing DNA clones isolated from a K562 genomic library. Nucleotide sequences of exon-intron junctions and 5' and 3' flanking sequences revealed that the GPA and GPB genes consist of 7 and 5 exons, respectively, and both genes have >95% identical sequence from the 5' flanking region to the region ∼ 1 kilobase downstream from the exon encoding the transmembrane regions. In this homologous part of the genes, GPB lacks one exon due to a point mutation at the 5' splicing site of the third intron, which inactivates the 5' cleavage event of splicing and leads to ligation of the second to the fourth exon. Following these very homologous sequences, the genomic sequences for GPA and GPB diverge significantly and no homology can be detected in their 3' end sequences. The analysis of the Alu sequences and their flanking direct repeat sequences suggest that an ancestral genomic structure has been maintained in the GPA gene, whereas the GPB gene has arisen from the acquisition of 3' sequences different from those of the GPA gene by homologous recombination at the Alu repeats during or after gene duplication

  13. The Clinical Significance of Unknown Sequence Variants in BRCA Genes

    Calò, Valentina; Bruno, Loredana; Paglia, Laura La; Perez, Marco; Margarese, Naomi [Department of Surgery and Oncology, Regional Reference Center for the Biomolecular Characterization and Genetic Screening of Hereditary Tumors, University of Palermo, Via del Vespro 127, 90127 Palermo (Italy); Gaudio, Francesca Di [Department of Medical Biotechnologies and Legal Medicine, University of Palermo, Palermo (Italy); Russo, Antonio, E-mail: lab-oncobiologia@usa.net [Department of Surgery and Oncology, Regional Reference Center for the Biomolecular Characterization and Genetic Screening of Hereditary Tumors, University of Palermo, Via del Vespro 127, 90127 Palermo (Italy)

    2010-09-10

    Germline mutations in BRCA1/2 genes are responsible for a large proportion of hereditary breast and/or ovarian cancers. Many highly penetrant predisposition alleles have been identified and include frameshift or nonsense mutations that lead to the translation of a truncated protein. Other alleles contain missense mutations, which result in amino acid substitution and intronic variants with splicing effect. The discovery of variants of uncertain/unclassified significance (VUS) is a result that can complicate rather than improve the risk assessment process. VUSs are mainly missense mutations, but also include a number of intronic variants and in-frame deletions and insertions. Over 2,000 unique BRCA1 and BRCA2 missense variants have been identified, located throughout the whole gene (Breast Cancer Information Core Database (BIC database)). Up to 10–20% of the BRCA tests report the identification of a variant of uncertain significance. There are many methods to discriminate deleterious/high-risk from neutral/low-risk unclassified variants (i.e., analysis of the cosegregation in families of the VUS, measure of the influence of the VUSs on the wild-type protein activity, comparison of sequence conservation across multiple species), but only an integrated analysis of these methods can contribute to a real interpretation of the functional and clinical role of the discussed variants. The aim of our manuscript is to review the studies on BRCA VUS in order to clarify their clinical relevance.

  14. The Clinical Significance of Unknown Sequence Variants in BRCA Genes

    Germline mutations in BRCA1/2 genes are responsible for a large proportion of hereditary breast and/or ovarian cancers. Many highly penetrant predisposition alleles have been identified and include frameshift or nonsense mutations that lead to the translation of a truncated protein. Other alleles contain missense mutations, which result in amino acid substitution and intronic variants with splicing effect. The discovery of variants of uncertain/unclassified significance (VUS) is a result that can complicate rather than improve the risk assessment process. VUSs are mainly missense mutations, but also include a number of intronic variants and in-frame deletions and insertions. Over 2,000 unique BRCA1 and BRCA2 missense variants have been identified, located throughout the whole gene (Breast Cancer Information Core Database (BIC database)). Up to 10–20% of the BRCA tests report the identification of a variant of uncertain significance. There are many methods to discriminate deleterious/high-risk from neutral/low-risk unclassified variants (i.e., analysis of the cosegregation in families of the VUS, measure of the influence of the VUSs on the wild-type protein activity, comparison of sequence conservation across multiple species), but only an integrated analysis of these methods can contribute to a real interpretation of the functional and clinical role of the discussed variants. The aim of our manuscript is to review the studies on BRCA VUS in order to clarify their clinical relevance

  15. Targeted Sequencing Reveals Large-Scale Sequence Polymorphism in Maize Candidate Genes for Biomass Production and Composition.

    Moses M Muraya

    A major goal of maize genomic research is to identify sequence polymorphisms responsible for phenotypic variation in traits of economic importance. Large-scale detection of sequence variation is critical for linking genes, or genomic regions, to phenotypes. However, due to its size and complexity, it remains expensive to generate whole genome sequences of sufficient coverage for divergent maize lines, even with access to next generation sequencing (NGS) technology. Because methods involving reduction of genome complexity, such as genotyping-by-sequencing (GBS), assess only a limited fraction of sequence variation, targeted sequencing of selected genomic loci offers an attractive alternative. We therefore designed a sequence capture assay to target 29 Mb genomic regions and surveyed a total of 4,648 genes possibly affecting biomass production in 21 diverse inbred maize lines (7 flints, 14 dents). Captured and enriched genomic DNA was sequenced using the 454 NGS platform to 19.6-fold average depth coverage, and a broad evaluation of read alignment and variant calling methods was performed to select optimal procedures for variant discovery. Sequence alignment with the B73 reference and de novo assembly identified 383,145 putative single nucleotide polymorphisms (SNPs), of which 42,685 were non-synonymous alterations and 7,139 caused frameshifts. Presence/absence variation (PAV) of genes was also detected. We found that substantial sequence variation exists among genomic regions targeted in this study, which was particularly evident within coding regions. This diversification has the potential to broaden functional diversity and generate phenotypic variation that may lead to new adaptations and the modification of important agronomic traits. Further, annotated SNPs identified here will serve as useful genetic tools and as candidates in searches for phenotype-altering DNA variation. In summary, we demonstrated that sequencing of captured DNA is a powerful

  16. Isolation of Hox cluster genes from insects reveals an accelerated sequence evolution rate.

    Heike Hadrys

    Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda) that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution.

  17. Sense-antisense gene pairs: sequence, transcription, and structure are not conserved between human and mouse

    Wood, Emily J.; Chin-Inmanu, Kwanrutai; Jia, Hui; Lipovich, Leonard

    2013-01-01

    Previous efforts to characterize conservation between the human and mouse genomes focused largely on sequence comparisons. These studies are inherently limited because they don't account for gene structure differences, which may exist despite genomic sequence conservation. Recent high-throughput transcriptome studies have revealed widespread and extensive overlaps between genes, and transcripts, encoded on both strands of the genomic sequence. This overlapping gene organization, which produce...

  18. Nucleotide Sequence of the Chromosomal ampC Gene of Enterobacter aerogenes

    Preston, Karen E.; Radomski, Christopher C. A.; Venezia, Richard A.

    2000-01-01

    The AmpC β-lactamase gene and a small portion of the regulatory ampR sequence of Enterobacter aerogenes 97B were cloned and sequenced. The β-lactamase had an isoelectric point of 8 and conferred cephalosporin and cephamycin resistance on the host. The sequence of the cloned gene is most closely related to those of the ampC genes of E. cloacae and C. freundii.

  19. Characterization of fusion genes and the significantly expressed fusion isoforms in breast cancer by hybrid sequencing

    Weirather, Jason L.; Afshar, Pegah Tootoonchi; Clark, Tyson A.; Tseng, Elizabeth; Powers, Linda S.; Underwood, Jason G; Zabner, Joseph; Korlach, Jonas; Wong, Wing Hung; Au, Kin Fai

    2015-01-01

    We developed an innovative hybrid sequencing approach, IDP-fusion, to detect fusion genes, determine fusion sites and identify and quantify fusion isoforms. IDP-fusion is the first method to study gene fusion events by integrating Third Generation Sequencing long reads and Second Generation Sequencing short reads. We applied IDP-fusion to PacBio data and Illumina data from the MCF-7 breast cancer cells. Compared with the existing tools, IDP-fusion detects fusion genes at higher precision and ...

  20. Secondary structure and phylogenetic utility of the ribosomal large subunit (28S) in monogeneans of the genus Thaparocleidus and Bifurcohaptor (Monogenea: Dactylogyridae).

    Chaudhary, Anshu; Singh, Hridaya Shanker

    2013-04-01

    The present communication deals with the secondary structure of 28S rDNA of two already known species of monogeneans, Bifurcohaptor indicus and Thaparocleidus parvulus, parasitizing gill filaments of a freshwater fish, Mystus vittatus, for phylogenetic inference. Secondary structure data are best used as accessory taxonomic characters as their phylogenetic resolving power and confidence in validity. Secondary structure of the 28S rDNA transcript could provide information for identifying homologous nucleotide characters, useful for cladistic inference of relationships. Such structure data could be used as a taxonomic character. The study supports that species-level sequence variability renders the 28S sequence a unique window for examining the behavior of fast evolving, non-coding DNA sequences. Apart from this, it also confirms that molecular similarity present in various species could be host-induced. PMID:24431545

  1. Nucleotide Sequence of a Chicken Vitellogenin Gene and Derived Amino Acid Sequence of the Encoded Yolk Precursor Protein

    Schip, Fred D. van het; Samallo, John; Broos, Jaap; Ophuis, Jan; Mojet, Mart; Gruber, Max; AB, Geert

    1987-01-01

    The gene encoding the major vitellogenin from chicken has been completely sequenced and its exon-intron organization has been established. The gene is 20,342 base-pairs long and contains 35 exons with a combined length of 5787 base-pairs. They encode the 1850-amino acid pre-peptide of vitellogenin,

  2. CLONING AND SEQUENCING OF THE GENE FOR A LACTOCOCCAL ENDOPEPTIDASE, AN ENZYME WITH SEQUENCE SIMILARITY TO MAMMALIAN ENKEPHALINASE

    Mierau, Igor; Tan, Paris S.T.; Haandrikman, Alfred J.; Kok, Jan; Leenhouts, Kees J.; Konings, Wil N.; Venema, Gerard

    1993-01-01

    The gene specifying an endopeptidase of Lactococcus lactis, named pepO, was cloned from a genomic library of L. lactis subsp. cremoris P8-247 in lambdaEMBL3 and was subsequently sequenced. pepO is probably the last gene of an operon encoding the binding-protein-dependent oligopeptide transport syste

  3. Characterizations of Chinese isolates of Coxiella burnetii in the com1 gene sequence

    YU Quan; ZHANG Guo-quan; FUKUSHI Hideto; YAMAGUCHI Tsuyoshi; HIRAI Katsuya

    2002-01-01

    Objective: To determine some genetic characteristics of Chinese isolates of Coxiella burnetii by comparing the com1 gene sequence. Methods: com1 gene sequences of Chinese isolates were amplified, sequenced, and analyzed by comparing our results with previously published data. Results: Three different com1 sequences were identified in 7 Chinese isolates. Sequence comparison indicated that the isolates harboring the QpRS plasmid could be defined as a new group and, in addition, that isolates carrying the same plasmid type showed similar com1 gene sequences. Conclusion: The study suggests that the classification of groups based on the com1 gene sequence is highly associated with the plasmid type of the isolates but little related to disease forms and geographical origins of the isolates.

  4. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

    Manal Helal

    BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra
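
    The 'centroid' idea in the abstract above (the cluster member from which distances to all other members are minimized) can be illustrated with a minimal sketch. This is not the authors' LM algorithm; it assumes a precomputed pairwise distance matrix and existing cluster labels, and the function name, toy distances, and labels are all hypothetical.

```python
from typing import Dict, List, Sequence


def cluster_medoids(dist: Sequence[Sequence[float]],
                    labels: Sequence[int]) -> Dict[int, int]:
    """For each cluster label, return the index of the member whose summed
    distance to the members of the same cluster is smallest."""
    # Group sequence indices by their (assumed, precomputed) cluster label.
    members: Dict[int, List[int]] = {}
    for idx, lab in enumerate(labels):
        members.setdefault(lab, []).append(idx)

    medoids: Dict[int, int] = {}
    for lab, idxs in members.items():
        # The representative minimizes the within-cluster distance sum.
        medoids[lab] = min(idxs, key=lambda i: sum(dist[i][j] for j in idxs))
    return medoids


# Hypothetical toy data: five sequences, two clusters (not from the study).
toy_dist = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.1, 0.8, 0.9],
    [0.2, 0.1, 0.0, 0.9, 0.9],
    [0.9, 0.8, 0.9, 0.0, 0.3],
    [0.8, 0.9, 0.9, 0.3, 0.0],
]
toy_labels = [0, 0, 0, 1, 1]
representatives = cluster_medoids(toy_dist, toy_labels)
```

    In this toy example sequence 1 represents cluster 0, because it lies closest to the other two members; in a two-member cluster either member qualifies equally.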

  5. Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

    Khan Shafiq A

    2003-06-01

    Background: Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results: 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion: The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells.

  6. Colorimetric biosensing of targeted gene sequence using dual nanoparticle platforms

    Thavanathan J

    2015-04-01

    Jeevan Thavanathan (1), Nay Ming Huang (1), Kwai Lin Thong (2); (1) Low Dimension Material Research Center, Department of Physics, (2) Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia. Abstract: We have developed a colorimetric biosensor using a dual platform of gold nanoparticles and graphene oxide sheets for the detection of Salmonella enterica. The presence of the invA gene in S. enterica causes a change in color of the biosensor from its original pinkish-red to a light purplish solution. This occurs through the aggregation of the primary gold nanoparticles–conjugated DNA probe onto the surface of the secondary graphene oxide–conjugated DNA probe through DNA hybridization with the targeted DNA sequence. Spectrophotometry analysis showed a shift in wavelength from 525 nm to 600 nm with 1 µM of DNA target. Specificity testing revealed that the biosensor was able to detect various serovars of S. enterica while no color change was observed with the other bacterial species. Sensitivity testing revealed the limit of detection was at 1 nM of DNA target. This proves the effectiveness of the biosensor in the detection of S. enterica through DNA hybridization. Keywords: biosensor, DNA hybridization, DNA probe, gold nanoparticles, graphene oxide, Salmonella enterica

  7. Isolation and characterization of gene sequences expressed in cotton fiber

    Taciana de Carvalho Coutinho

    2016-06-01

    Cotton fibers are tubular cells which develop from the differentiation of ovule epidermis. In addition to being one of the most important natural fibers of the textile group, cotton fibers afford an excellent experimental system for studying the cell wall. The aim of this work was to isolate and characterise the genes expressed in cotton fiber (Gossypium hirsutum L.) to be used in future work in cotton breeding. Fibers of the cotton cultivar CNPA ITA 90 II were used to extract RNA for the subsequent generation of a cDNA library. Seventeen sequences were obtained, of which 14 were already described in the NCBI database (National Centre for Biotechnology Information), such as those encoding the lipid transfer proteins (LTPs) and arabinogalactans (AGP). However, other cDNAs such as the B05 clone, which displays homology with the glycosyltransferases, have still not been described for this crop. Nevertheless, results showed that several clones obtained in this study are associated with cell wall proteins, wall-modifying enzymes and lipid transfer proteins directly involved in fiber development.

  8. Use of gene sequence analyses and genome comparisons for yeast systematics

    Detection, identification, and classification of yeasts have undergone a major transformation in the past decade and a half following application of gene sequence analyses and genome comparisons. Development of a database (barcode) of easily determined gene sequences from domains 1 and 2 of large sub...

  9. Human cysteine-proteinase inhibitors: nucleotide sequence analysis of three members of the cystatin gene family.

    Saitoh, E; Kim, H S; Smithies, O; Maeda, N

    1987-01-01

    Three genes from the human cystatin gene family of cysteine-proteinase inhibitors have been isolated from a bacteriophage lambda library containing HindIII digests of human genomic DNA. Two of the genes code for salivary cystatin SN and SA, the third is a pseudogene. The cloned genes were identified with a probe made from a salivary cystatin cDNA. The complete nucleotide sequence of the gene that codes for the precursor form of the neutral salivary protein, cystatin SN, was determined. The gene, which we name CST1, contains three exons and two intervening sequences. The expected CAT and ATA boxes are present in the 5'-flanking region of the gene. Partial nucleotide sequence determination of a second gene revealed that it codes for the precursor form of the acidic salivary protein, cystatin SA. This gene, which we name CST2, has the same gene organization as CST1. The complete nucleotide sequence of a third gene was determined. It does not contain a typical ATA box, and in addition, a premature stop codon and a frameshift deletion mutation occur within the gene. These inactivation mutations show that this gene, which we name CSTP1, is a cystatin pseudogene. These data combined with our genomic Southern-blot analyses show that the cystatin genes form a multigene family with at least seven members. PMID:3446578

  10. CLONING AND SEQUENCING OF MATURED FRAGMENT OF HUMAN NERVE GROWTH FACTOR GENE

    马巍; 吴玲; 王德利; 刘淼; 任惠民; 杨广笑; 王全颖

    2003-01-01

    Objective: To clone and sequence the mature fragment of the human nerve growth factor (NGF) gene. Methods: Human genomic DNA extracted from white blood cells was used as the template, and the NGF gene was cloned by PCR and the T-vector cloning method. Positive clones were screened and identified by restriction enzyme digestion, and the cloned amplified fragment was then sequenced and analyzed. Results: Comparison of the cloned NGF gene with the GenBank sequence (V01511) demonstrated that the two sequences were identical, each 354 bp in length. Conclusion: Cloning the NGF gene from human genomic DNA has paved the way for further study of gene therapy for nervous system injury.

  11. PCR amplification and sequence analysis of the rat Sox3 gene

    Krstić A.

    2008-01-01

    Full Text Available The Sox3 gene is considered to be one of the earliest neural markers in vertebrates, playing a role in specifying neuronal fate. Despite the completion of a rat genome sequencing project, only a partial sequence of the rat Sox3 gene has been available in the public database. Using PCR, sequencing, and bioinformatics tools, in this study we have determined the complete coding sequence of the rat Sox3 gene encoding 449 amino acids. Comparative analysis of rat and human SOX3 proteins revealed a high degree of conservation. Identification of the rat Sox3 gene sequence would help in understanding the biological roles of this gene and provide insight into evolutionary relationships with vertebrate orthologs.

  12. Sequence homologies in the 5' regions of four Drosophila heat-shock genes.

    Holmgren, R; Corces, V; Morimoto, R; Blackman, R; Meselson, M

    1981-01-01

    We report nucleotide sequences of the regions surrounding the 5' ends of the genes for Drosophila melanogaster heat-shock proteins hsp83, hsp68, and hsp26, located at chromosome positions 63BC, 95D, and 67B, respectively. As in other eukaryotic genes, the sequence T-A-T-A-A-A-A-T occurs about 30 nucleotides upstream from the sites of mRNA initiation. Three additional sequence homologies and a dyad symmetry were noted at approximately corresponding locations in the three genes and in the gene ...

  13. DNA sequence of the lactose operon: the lacA gene and the transcriptional termination region.

    Hediger, M A; Johnson, D F; Nierlich, D P; Zabin, I

    1985-01-01

    The lac operon of Escherichia coli spans approximately 5300 base pairs and includes the lacZ, lacY, and lacA genes in addition to the operator, promoter, and transcription termination regions. We report here the sequence of the lacA gene and the region distal to it, confirming the sequence of thiogalactoside transacetylase and completing the sequence of the lac operon. The lacA gene is characterized by use of rare codons, suggesting an origin from a plasmid, transposon, or virus gene. UUG is ...

  14. Complete mitochondrial genome DNA sequence for two ophiuroids and a holothuroid: the utility of protein gene sequence and gene maps in the analyses of deep deuterostome phylogeny.

    Scouras, Andrea; Beckenbach, Karen; Arndt, Allan; Smith, Michael J

    2004-04-01

    The complete mitochondrial genome sequences have been determined for the holothuroid Cucumaria miniata and two ophiuroid species Ophiopholis aculeata and Ophiura lütkeni. In addition, the nucleotide sequence of the mitochondrial protein-coding genes for the asteroid Pisaster ochraceus has been completed. Maximum-likelihood and LogDet distance analyses of concatenated protein-coding sequences produced a series of trees that did not conclusively support generally accepted models of echinoderm phylogeny. The ophiuroid data consistently demonstrated accelerated nucleotide divergence rates and lack of stationarity. This confounds the phylogenetic analyses. Molecular investigations using individual protein-coding gene alignments demonstrated that the cytochrome b gene exhibits the least deviation in rate and stationarity and generated some trees consistent with proposed echinoderm phylogenies. Phylogenies based on echinoderm mitochondrial gene rearrangements also proved problematic because of extensive variation in gene order between and within classes. A comparison of the two distinctive ophiuroid mitochondrial gene orders supports the hypothesis that O. lütkeni has a more derived mitochondrial gene order versus O. aculeata. The variation in the echinoderm mitochondrial gene maps reinforces the limitations of the application of mitochondrial gene rearrangements as a global phylogenetic tool. PMID:15019608

  15. Structural analysis of DNA sequence: evidence for lateral gene transfer in Thermotoga maritima

    Worning, Peder; Jensen, Lars Juhl; Nelson, K. E.;

    2000-01-01

    The recently published complete DNA sequence of the bacterium Thermotoga maritima provides evidence, based on protein sequence conservation, for lateral gene transfer between Archaea and Bacteria. We introduce a new method of periodicity analysis of DNA sequences, based on structural parameters, ...

  16. [Analysis of full-length gene sequence of rabies vaccine virus aG strain].

    Li, Jia; Cao, Shou-Chun; Shi, Lei-Tai; Wu, Xiao-Hong; Liu, Jing-Hua; Wang, Yun-Peng; Tang, Jian-Rong; Yu, Yong-Xin; Dong, Guan-Mu

    2013-06-01

    The aim was to sequence and analyze the full-length gene sequence of the rabies vaccine virus aG strain. The full-length gene sequence of the aG strain was amplified by RT-PCR in 8 fragments; each PCR product was cloned into the pGEM-T vector, sequenced, and assembled. The 5' leader sequence was determined by 5' RACE. The homology between aG and other rabies vaccine viruses was analyzed using DNAstar and Mega 4.0 software. The aG strain was 11,925 nt in length (GenBank accession number: JN234411) and belonged to genotype I. Bioinformatic analysis revealed that the degree of homology differed among the rabies vaccine viruses. The full-length gene sequence of the rabies vaccine virus aG strain provides support for improving the quality-control standard for virus strains used in the production of rabies vaccine for human use in China. PMID:23895005

  17. Analysis of Mixed Sequencing Chromatograms and Its Application in Direct 16S rRNA Gene Sequencing of Polymicrobial Samples

    Kommedal, Øyvind; Karlsen, Bjarte; Sæbø, Øystein

    2008-01-01

    Investigation of clinical samples by direct 16S rRNA gene sequencing provides the possibility to detect nonviable bacteria and bacteria with special growth requirements. This approach has been particularly valuable for the diagnosis of patients who have received antibiotics prior to sample collection. In specimens containing more than one bacterium, direct sequencing gives mixed chromatograms that complicate further interpretation. We designed an algorithm able to analyze these ambiguous chro...

  18. Contribution of the Caspase Gene Sequence Diversification to the Specifically Antiviral Defense in Invertebrate

    Bin Zhi; Lei Wang; Guangyi Wang; Xiaobo Zhang

    2011-01-01

    Vertebrates achieve adaptive immunity of all sorts against pathogens through the diversification of antibodies. However the mechanism of invertebrates' innate immune defense against various pathogens remains largely unknown. Our study used shrimp and white spot syndrome virus (WSSV) to show that PjCaspase, a caspase gene of shrimp that is crucial in apoptosis, possessed gene sequence diversity. At present, the role of gene sequence diversity in immunity has not been characterized. To address ...

  19. Characterization and phylogenetic analysis of α-gliadin gene sequences reveals significant genomic divergence in Triticeae species

    Guang-Rong Li; Tao Lang; En-Nian Yang; Cheng Liu; Zu-Jun Yang

    2014-12-01

    Although the unique properties of the wheat α-gliadin gene family are well characterized, little is known about the evolution and genomic divergence of the α-gliadin gene family within the Triticeae. We isolated a total of 203 α-gliadin gene sequences from 11 representative diploid and polyploid Triticeae species, and found 108 sequences to be putatively functional. Our results indicate that α-gliadin genes may have originated from wild Secale species, where the sequences contain the shortest repetitive domains and display minimum variation. A miniature inverted-repeat transposable element insertion is reported for the first time in an α-gliadin gene sequence of Thinopyrum intermedium in this study, indicating that the transposable element might have contributed to the diversification of the α-gliadin gene family among Triticeae genomes. The phylogenetic analyses revealed that the α-gliadin gene sequences of Dasypyrum, Australopyrum, Lophopyrum, Eremopyrum and Pseudoroegneria species have amplified several times. A search for four typical toxic epitopes for celiac disease within the Triticeae α-gliadin gene sequences showed that the α-gliadins of wild Secale, Australopyrum and Agropyron genomes lack all four epitopes, while other Triticeae species have accumulated these epitopes, suggesting that the evolution of these toxic epitope sequences occurred during the course of speciation, domestication or polyploidization of Triticeae.

  20. Complete Sequence Construction of the Highly Repetitive Ribosomal RNA Gene Repeats in Eukaryotes Using Whole Genome Sequence Data.

    Agrawal, Saumya; Ganley, Austen R D

    2016-01-01

    The ribosomal RNA genes (rDNA) encode the major rRNA species of the ribosome, and thus are essential across life. These genes are highly repetitive in most eukaryotes, forming blocks of tandem repeats that form the core of nucleoli. The primary role of the rDNA in encoding rRNA has been long understood, but more recently the rDNA has been implicated in a number of other important biological phenomena, including genome stability, cell cycle, and epigenetic silencing. Noncoding elements, primarily located in the intergenic spacer region, appear to mediate many of these phenomena. Although sequence information is available for the genomes of many organisms, in almost all cases rDNA repeat sequences are lacking, primarily due to problems in assembling these intriguing regions during whole genome assemblies. Here, we present a method to obtain complete rDNA repeat unit sequences from whole genome assemblies. Limitations of next generation sequencing (NGS) data make them unsuitable for assembling complete rDNA unit sequences; therefore, the method we present relies on the use of Sanger whole genome sequence data. Our method makes use of the Arachne assembler, which can assemble highly repetitive regions such as the rDNA in a memory-efficient way. We provide a detailed step-by-step protocol for generating rDNA sequences from whole genome Sanger sequence data using Arachne, for refining complete rDNA unit sequences, and for validating the sequences obtained. In principle, our method will work for any species where the rDNA is organized into tandem repeats. This will help researchers working on species without a complete rDNA sequence, those working on evolutionary aspects of the rDNA, and those interested in conducting phylogenetic footprinting studies with the rDNA. PMID:27576718

  1. Disparate sequence characteristics of the Erysiphe graminis f.sp. hordei glyceraldehyde-3-phosphate dehydrogenase gene

    Christiansen, S.K.; Justesen, A.F.; Giese, H.

    1997-01-01

    , Egh falls into the group of Ascomycetes located at a basal position. The regulatory region of the Egh gpd gene has no homology to corresponding sequences in other filamentous Ascomycetes. Codon usage was determined for the four characterized Egh genes (tub2, Egh7, Egh16 and gpd) and found to be...... and plant genes in sequence mixtures. The Egh gpd promoter appears to be superior to that of the Egh beta-tubulin gene (tub2) for driving the E. coli beta-glucuronidase (GUS) gene in transformation experiments....

  2. Sequencing analysis reveals a unique gene organization in the gyrB region of Mycoplasma hominis

    Ladefoged, Søren; Christiansen, Gunna

    The homolog of the gyrB gene, which has been reported to be present in the vicinity of the initiation site of replication in bacteria, was mapped on the Mycoplasma hominis genome, and the region was subsequently sequenced. Five open reading frames were identified flanking the gyrB gene, one of... which showed similarity to that which encodes the LicA protein of Haemophilus influenzae. The organization of the genes in the region showed no resemblance to that in the corresponding regions of other bacteria sequenced so far. The gyrA gene was mapped 35 kb downstream from the gyrB gene.

  3. Cloning, sequence analysis, and hyperexpression of the genes encoding phosphotransacetylase and acetate kinase from Methanosarcina thermophila.

    Latimer, M T; Ferry, J G

    1993-01-01

    The genes for the acetate-activating enzymes, acetate kinase and phosphotransacetylase (ack and pta), from Methanosarcina thermophila TM-1 were cloned and sequenced. Both genes are present in only one copy per genome, with the pta gene adjacent to and upstream of the ack gene. Consensus archaeal promoter sequences are found upstream of the pta coding region. The pta and ack genes encode predicted polypeptides with molecular masses of 35,198 and 44,482 Da, respectively. A hydropathy plot of th...

  4. Strong association between pseudogenization mechanisms and gene sequence length

    Harrison Paul M; Khachane Amit N

    2009-01-01

    Abstract Pseudogenes arise from the decay of gene copies following either RNA-mediated duplication (processed pseudogenes) or DNA-mediated duplication (nonprocessed pseudogenes). Here, we show that long protein-coding genes tend to produce more nonprocessed pseudogenes than short genes, whereas the opposite is true for processed pseudogenes. Protein-coding genes longer than 3000 bp are 6 times more likely to produce nonprocessed pseudogenes than processed ones. Reviewers This article was revi...

  5. Flagellar apparatus gene sequences of Aeromonas hydrophila AL09-73 isolate

    Flagellar apparatus genes of the recent outbreak isolate Aeromonas hydrophila AL09-73 were sequenced and characterized. A total of 28 flagellar genes were identified. The sizes of the genes range from 318 to 2001 nucleotides, and they potentially encode different complex flagellar proteins. At nucleotide and...

  6. Cloning, sequence analysis, and characterization of the genes involved in isoprimeverose metabolism in Lactobacillus pentosus

    Chaillou, S.; Lokman, B.C.; Leer, R.J.; Posthuma, C.; Postma, P.W.; Pouwels, P.H.

    1998-01-01

    Two genes, xylP and xylQ, from the xylose regulon of Lactobacillus pentosus were cloned and sequenced. Together with the repressor gene of the regulon, xylR, the xylPQ genes form an operon which is inducible by xylose and which is transcribed from a promoter located 145 bp upstream of xylP. A putati

  7. Nucleotide sequence of a cyanobacterial nifH gene coding for nitrogenase reductase

    Mevarech, Moshe; Rice, Douglas; Haselkorn, Robert

    1980-01-01

    The nucleotide sequence of nifH, the structural gene for nitrogenase reductase (component II or Fe protein of nitrogenase) from the cyanobacterium Anabaena 7120 has been determined. Also reported are 194 bases of the 5′-flanking sequence and 170 bases of the 3′-flanking sequence. The predicted amino acid sequence was compared with that determined for the complete nitrogenase reductase of Clostridium pasteurianum and the cysteine-containing peptides of the protein from Azotobacter vinelandii. ...

  8. Nucleotide sequences of immunoglobulin epsilon genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution.

    Sakoyama, Y; Hong, K J; Byun, S. M.; Hisajima, H; Ueda, S; Yaoita, Y; Hayashida, H; Miyata, T.; Honjo, T

    1987-01-01

    To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin epsilon-chain (C epsilon 1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human epsilon-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regio...

  9. Cloning and sequencing of the gene encoding thermophilic beta-amylase of Clostridium thermosulfurogenes.

    Kitamoto, N; Yamagata, H; Kato, T; Tsukagoshi, N; Udaka, S

    1988-01-01

    A gene coding for thermophilic beta-amylase of Clostridium thermosulfurogenes was cloned into Bacillus subtilis, and its nucleotide sequence was determined. The nucleotide sequence suggested that the thermophilic beta-amylase is translated from monocistronic mRNA as a secretory precursor with a signal peptide of 32 amino acid residues. The deduced amino acid sequence of the mature beta-amylase contained 519 residues with a molecular weight of 57,167. The amino acid sequence of the C. thermosu...

  10. Cloning and sequencing of the bovine gastrin gene

    Lund, T; Rehfeld, J F; Olsen, Jørgen

    1989-01-01

    In order to deduce the primary structure of bovine preprogastrin we therefore sequenced a gastrin DNA clone isolated from a bovine liver cosmid library. Bovine preprogastrin comprises 104 amino acids and consists of a signal peptide, a 37 amino acid spacer-sequence, the gastrin-34 sequence follow...... by an amidation-site (Gly-Arg-Arg), and a C-terminal nonapeptide. Comparison with human, porcine, and rat cDNA sequences revealed extensive homology in the coding region as well as in short noncoding structures....

  11. Intergenic DNA sequences flanking the pseudo alpha globin genes of human and chimpanzee.

    Sawada, I; Beal, M P; Shen, C K; Chapman, B.; Wilson, A C; Schmid, C.

    1983-01-01

    We have determined the sequence of 2400 base pairs upstream from the human pseudo alpha globin (psi alpha) gene, and for comparison, 1100 base pairs of DNA within and upstream from the chimpanzee psi alpha gene. The region upstream from the promoter of the psi alpha gene shows no significant homology to the intergenic regions of the adult alpha 2 and alpha 1 globin genes. The chimpanzee gene has a coding defect in common with the human psi alpha gene, showing that the product of this gene, if...

  12. Mouse mammary tumor virus-like gene sequences are present in lung patient specimens

    Rodríguez-Padilla Cristina

    2011-09-01

    Full Text Available Background: Previous studies have reported on the presence of Mouse Mammary Tumor Virus (MMTV)-like gene sequences in human cancer tissue specimens. Here, we search for MMTV-like gene sequences in lung disease specimens, including carcinomas, from a Mexican population. This study was based on our previous study reporting that the INER51 lung cancer cell line, from a pleural effusion of a Mexican patient, contains MMTV-like env gene sequences. Results: MMTV-like env gene sequences were detected in three out of 18 specimens studied, by PCR using a specific set of MMTV-like primers. The three identified MMTV-like gene sequences, designated INER6, HZ101, and HZ14, were 99%, 98%, and 97% homologous, respectively, to GenBank sequence accession number AY161347. The INER6 and HZ101 samples were isolated from lung cancer specimens, and HZ14 was isolated from an acute inflammatory lung infiltrate sample. Two of the env sequences exhibited disruption of the reading frame due to mutations. Conclusion: In summary, we identified MMTV-like gene sequences in 2 out of 11 (18%) of the lung carcinomas and 1 out of 7 (14%) of the acute inflammatory lung infiltrate specimens studied from a Mexican population.

  13. Characterization of promoter sequence of toll-like receptor genes in Vechur cattle

    R. Lakshmi

    2016-06-01

    Full Text Available Aim: To analyze the promoter sequence of toll-like receptor (TLR) genes in Vechur cattle, an indigenous breed of Kerala, against the sequence of Bos taurus and assess the differences that could be attributed to innate immune responses against bovine mastitis. Materials and Methods: Blood samples were collected from the jugular vein of Vechur cattle maintained at the Vechur cattle conservation center of Kerala Veterinary and Animal Sciences University, using an acid-citrate-dextrose anticoagulant. Genomic DNA was extracted, and polymerase chain reaction was carried out to amplify the promoter regions of the TLRs. The amplified products of the TLR2, TLR4, and TLR9 promoter regions were sequenced by the Sanger enzymatic DNA sequencing technique. Results: The promoter region of TLR2 of Vechur cattle showed 98% similarity with the B. taurus sequence in GenBank and revealed variants in four sequence motifs. The promoter region of TLR4 showed 99% similarity with the B. taurus sequence but revealed no significant variants in motif regions; however, two heterozygous loci were observed in the chromatogram. The promoter sequence of the TLR9 gene also showed 99% similarity to the B. taurus sequence and revealed variants in four sequence motifs. Conclusion: The results of this study indicate significant variation in the promoters of the TLR2 and TLR9 genes in the Vechur cattle breed, which may potentially influence the innate immune response against mastitis.

  14. The Arabidopsis Root Transcriptome by Serial Analysis of Gene Expression. Gene Identification Using the Genome Sequence

    Fizames, Cécile; Muños, Stéphane; Cazettes, Céline; Nacry, Philippe; Boucherez, Jossia; Gaymard, Frédéric; Piquemal, David; Delorme, Valérie; Commes, Thérèse; Doumas, Patrick; Cooke, Richard; Marti, Jacques; Sentenac, Hervé; Gojon, Alain

    2004-01-01

    Large-scale identification of genes expressed in roots of the model plant Arabidopsis was performed by serial analysis of gene expression (SAGE), on a total of 144,083 sequenced tags, representing at least 15,964 different mRNAs. For tag to gene assignment, we developed a computational approach based on 26,620 genes annotated from the complete sequence of the genome. The procedure selected warrants the identification of the genes corresponding to the majority of the tags found experimentally, with a high level of reliability, and provides a reference database for SAGE studies in Arabidopsis. This new resource allowed us to characterize the expression of more than 3,000 genes, for which there is no expressed sequence tag (EST) or cDNA in the databases. Moreover, 85% of the tags were specific for one gene. To illustrate this advantage of SAGE for functional genomics, we show that our data allow an unambiguous analysis of most of the individual genes belonging to 12 different ion transporter multigene families. These results indicate that, compared with EST-based tag to gene assignment, the use of the annotated genome sequence greatly improves gene identification in SAGE studies. However, more than 6,000 different tags remained with no gene match, suggesting that a significant proportion of transcripts present in the roots originate from yet unknown or wrongly annotated genes. The root transcriptome characterized in this study markedly differs from those obtained in other organs, and provides a unique resource for investigating the functional specificities of the root system. As an example of the use of SAGE for transcript profiling in Arabidopsis, we report here the identification of 270 genes differentially expressed between roots of plants grown either with NO3- or NH4NO3 as N source. PMID:14730065
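
    The tag-to-gene assignment described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each tag is the 10 bases immediately 3' of the last CATG site in a transcript (a common NlaIII-anchored SAGE layout); the abstract does not state the anchoring enzyme or the tag length, and the function names and toy sequences below are invented for illustration rather than taken from the authors' procedure.

```python
from collections import defaultdict

ANCHOR = "CATG"  # assumed anchoring-enzyme site; not stated in the abstract
TAG_LEN = 10     # assumed tag length; not stated in the abstract


def virtual_tag(transcript):
    """Return the TAG_LEN bases after the 3'-most anchor site, or None."""
    seq = transcript.upper()
    pos = seq.rfind(ANCHOR)
    if pos == -1:
        return None
    start = pos + len(ANCHOR)
    tag = seq[start:start + TAG_LEN]
    return tag if len(tag) == TAG_LEN else None


def build_tag_index(transcripts):
    """Map each virtual tag to the set of annotated genes that produce it."""
    index = defaultdict(set)
    for gene, seq in transcripts.items():
        tag = virtual_tag(seq)
        if tag is not None:
            index[tag].add(gene)
    return dict(index)


def assign_tags(observed, index):
    """Split observed tags into gene-specific, ambiguous, and unmatched."""
    unique, ambiguous, unmatched = {}, {}, []
    for tag in observed:
        genes = index.get(tag)
        if not genes:
            unmatched.append(tag)
        elif len(genes) == 1:
            unique[tag] = next(iter(genes))
        else:
            ambiguous[tag] = sorted(genes)
    return unique, ambiguous, unmatched


# Toy annotated transcripts (hypothetical sequences).
transcripts = {
    "geneA": "AAACATGTTTTTTTTTTGG",
    "geneB": "GGCATGACGTACGTACCC",
    "geneC": "TTCATGACGTACGTACAA",
    "geneD": "AAAAAAAA",  # no anchor site, so no virtual tag
}
index = build_tag_index(transcripts)
unique, ambiguous, unmatched = assign_tags(
    ["TTTTTTTTTT", "ACGTACGTAC", "GGGGGGGGGG"], index
)
```

    In this toy setup one tag is specific to a single gene, one is shared by two genes, and one has no gene match, mirroring the three outcomes the abstract reports for experimental tags.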

  15. Identification of true EST alignments and exon regions of gene sequences

    ZHOU Yanhong; JING Hui; LI Yanen; LIU Huailan

    2004-01-01

    Expressed sequence tags (ESTs), which have accumulated considerably so far, provide a valuable resource for finding new genes and disease-relevant genes, and for recognizing alternative splicing variants, SNP sites, etc. The prerequisite for carrying out this research is to correctly ascertain the ESTs related to a gene sequence. Based on analysis of the alignment results between some known gene sequences and ESTs in public databases, several measures including Identity Check, Gap Check, Inclusion Check and Length Check have been introduced to judge whether an EST alignment is related to a gene sequence or not. A computational program, EDSAc1.0, has been developed to identify true EST alignments and exon regions of query gene sequences. When tested with human gene sequences in the standard dataset HMR195 and evaluated with the standard measures of gene prediction performance, EDSAc1.0 can identify protein-coding regions with a specificity of 0.997 and a sensitivity of 0.88 at the nucleotide level, outperforming its counterpart TAP. A web server of EDSAc1.0 is available at http://infosci.hust.edu.cn.
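
    The nucleotide-level sensitivity and specificity quoted above can be sketched as follows. This assumes the convention common in gene-prediction benchmarks, where sensitivity is TP/(TP+FN) and specificity is TP/(TP+FP); the abstract does not spell out its formulas, and the exon coordinates and function names here are hypothetical.

```python
def positions(exons):
    """Expand inclusive (start, end) exon intervals into a set of positions."""
    return {p for start, end in exons for p in range(start, end + 1)}


def nucleotide_accuracy(true_coding, predicted_coding):
    """Return (sensitivity, specificity) over coding nucleotide positions.

    Assumed convention: Sn = TP / (TP + FN), Sp = TP / (TP + FP).
    """
    tp = len(true_coding & predicted_coding)   # coding and predicted coding
    fn = len(true_coding - predicted_coding)   # coding but missed
    fp = len(predicted_coding - true_coding)   # predicted but not coding
    sn = tp / (tp + fn) if tp + fn else 0.0
    sp = tp / (tp + fp) if tp + fp else 0.0
    return sn, sp


# Toy example: 30 annotated coding bases, 26 predicted, 24 shared.
annotated = positions([(10, 19), (40, 59)])
predicted = positions([(12, 19), (40, 55), (80, 81)])
sn, sp = nucleotide_accuracy(annotated, predicted)
```

    With these toy intervals the sensitivity is 24/30 and the specificity is 24/26; real evaluations would sum the counts over every sequence in the test set.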

  16. Biased distribution of DNA uptake sequences towards genome maintenance genes

    Davidsen, T.; Rodland, E.A.; Lagesen, K.;

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...

  17. Escherichia coli rep gene: sequence of the gene, the encoded helicase, and its homology with uvrD.

    Gilchrist, C A; Denhardt, D T

    1987-01-01

    The sequence of a 2.67-kilobase section of the Escherichia coli chromosome that contains the rep gene has been determined. This gene codes for a protein of predicted Mr 72,800, a DNA helicase, which is also a single-stranded DNA-dependent ATPase. The sequenced region contains an open reading frame of the correct length and orientation to encode the Rep protein. A secondary structure for the protein can be formulated from the amino acid sequence. We have compared both the primary and the secon...

  18. Cloning and Sequence Analysis of Y-box Binding Protein Gene in Min Pig

    Zhang Dong-jie; Liu Di; Wang Liang; He Xin-miao; Wang Wen-tao

    2014-01-01

    In order to study the sequence of the Min pig Y-box binding protein (YB-1) gene, the complete coding sequence of Min pig YB-1 was cloned by RT-PCR, and its sequence features were analyzed with software and online tools. The results showed that the complete CDS of Min pig YB-1 was 975 bp long, encoding 324 amino acids. The protein contained a conserved cold shock domain and several phosphorylation sites but no transmembrane domains, consistent with a protein found in the cytoplasm. The Min pig YB-1 nucleotide sequence shared high similarity (61.37%-97.66%) with those of other mammals.
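
    The relationship between the reported CDS length and protein length can be checked with a short calculation. The sketch assumes that the 975 bp figure includes the stop codon, which the abstract does not state explicitly; the function name is hypothetical.

```python
def protein_length_from_cds(cds_bp, includes_stop=True):
    """Return the number of amino acids encoded by a CDS of cds_bp bases."""
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = cds_bp // 3
    # The stop codon occupies one codon but adds no amino acid.
    return codons - 1 if includes_stop else codons


yb1_residues = protein_length_from_cds(975)  # 325 codons, one of them a stop
```

    Under that assumption, 975 bp corresponds to 325 codons and therefore 324 amino acids, matching the figures in the abstract.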

  19. Molecular cloning and sequencing of the gene encoding the fimbrial subunit protein of Bacteroides gingivalis.

    Dickinson, D P; Kubiniec, M A; Yoshimura, F; Genco, R J

    1988-01-01

    The gene encoding the fimbrial subunit protein of Bacteroides gingivalis 381, fimbrilin, has been cloned and sequenced. The gene was present as a single copy on the bacterial chromosome, and the codon usage in the gene conformed closely to that expected for an abundant protein. The predicted size of the mature protein was 35,924 daltons, and the secretory form may have had a 10-amino-acid, hydrophilic leader sequence similar to the leader sequences of the MePhe fimbriae family. The protein se...

  20. Targeting of AID-mediated sequence diversification to immunoglobulin genes

    Kothapalli, Naga Rama; Fugmann, Sebastian D.

    2011-01-01

    Activation-induced cytidine deaminase (AID) is a key enzyme for antibody-mediated immune responses. Antibodies are encoded by the immunoglobulin genes and AID acts as a transcription-dependent DNA mutator on these genes to improve antibody affinity and effector functions. An emerging theme in the field is that many transcribed genes are potential targets of AID, presenting an obvious danger to genomic integrity. Thus there are mechanisms in place to ensure that mutagenic outcomes of AID activity ...

  1. Cloning, sequencing and identification of single nucleotide polymorphisms of partial sequence on the porcine CACNA1S gene

    FANG XiaoMin; XU NingYing; REN ShouWen

    2008-01-01

    The CACNA1S gene encodes the α1 subunit of the calcium channel. Mutations of the CACNA1S gene can cause hypokalemic periodic paralysis (HypoKPP) and malignant hyperthermia syndrome (MHS) in humans. Research on CACNA1S to date has focused mainly on humans and model animals, and rarely on livestock and poultry. In this study, Yorkshire pigs (23), Pietrain pigs (30), Jinhua pigs (115) and second-generation crossbreds of Jinhua and Pietrain (126) were used. Primers were designed according to the sequence of the human CACNA1S gene, and PCR was carried out using pig genomic DNA. PCR products were sequenced and compared with the human sequence, single nucleotide polymorphisms (SNPs) were then investigated by PCR-SSCP, and PCR-RFLP tests were performed to validate the mutations. The results indicated that: (1) 5211 bp of porcine CACNA1S gene sequence was acquired (GenBank accession number: DQ767693), and the identity of the exon region between human and pig was 82.6%; (2) fifty-seven mutations were found within the cloned sequences, of which 24 were in exon regions; (3) the results of PCR-RFLP were in accordance with those of PCR-SSCP. According to the ESTs of the porcine CACNA1S gene published in GenBank (Bx914582, Bx666997), 8 of the 11 SNPs identified in the present study were consistent with the base differences between the two EST fragments.
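
    A sequence identity figure such as the one reported for the exon region can be illustrated with a minimal sketch. It assumes the two sequences are already aligned to equal length with "-" for gaps, and that identity is the share of alignment columns with matching bases (gap columns count as mismatches); the abstract does not say how its identity value was computed, and the sequences below are invented.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignments (hypothetical sequences, not the CACNA1S data).
ungapped = percent_identity("ATGGCCAAGT", "ATGGCTAAGA")  # 8 of 10 columns match
gapped = percent_identity("ATG-CC", "ATGACC")            # 5 of 6 columns match
```

    Other definitions divide by the shorter sequence length or ignore gap columns, so identity values are only comparable when the same denominator is used.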

  2. Genome-wide analysis reveals diverged patterns of codon bias, gene expression, and rates of sequence evolution in Picea gene families

    De La Torre, Amanda R; Lin, Yao-Cheng; van de Peer, Yves; Ingvarsson, Pär K

    2015-01-01

    The recent sequencing of several gymnosperm genomes has greatly facilitated studying the evolution of their genes and gene families. In this study, we examine the evidence for expression-mediated selection in the first two fully sequenced representatives of the gymnosperm plant clade (Picea abies and Picea glauca). We use genome-wide estimates of gene expression (> 50,000 expressed genes) to study the relationship between gene expression, codon bias, rates of sequence divergence, protein l...

  3. Molecular systematics of the genus Sigmodon: results from mitochondrial and nuclear gene sequences

    Henson, Dallas D.; Bradley, Robert D.

    2009-01-01

    Phylogenetic relationships within the genus Sigmodon Say and Ord, 1825 were examined using sequence data from multiple gene regions, including exon 1 of the nuclear-encoded interphotoreceptor retinoid binding protein, intron 7 of the nuclear beta-fibrinogen gene, and the mitochondrial cytochrome b gene from 27 individuals representing 11 species of Sigmodon. Nuclear genes were analyzed independently, combined with each other, and combined with the mitochondrial data. Topologies were construct...

  4. Prokaryotic genes in eukaryotic genome sequences: when to infer horizontal gene transfer and when to suspect an actual microbe.

    Artamonova, Irena I; Lappi, Tanya; Zudina, Liudmila; Mushegian, Arcady R

    2015-07-01

    Assessment of phylogenetic positions of predicted gene and protein sequences is a routine step in any genome project, useful for validating the species' taxonomic position and for evaluating hypotheses about genome evolution and function. Several recent eukaryotic genome projects have reported multiple gene sequences that were much more similar to homologues in bacteria than to any eukaryotic sequence. In the spirit of the times, horizontal gene transfer from bacteria to eukaryotes has been invoked in some of these cases. Here, we show, using comparative sequence analysis, that some of those bacteria-like genes indeed appear likely to have been horizontally transferred from bacteria to eukaryotes. In other cases, however, the evidence strongly indicates that the eukaryotic DNA sequenced in the genome project contains a sample of non-integrated DNA from the actual bacteria, possibly providing a window into the host microbiome. Recent literature suggests also that common reagents, kits and laboratory equipment may be systematically contaminated with bacterial DNA, which appears to be sampled by metagenome projects non-specifically. We review several bioinformatic criteria that help to distinguish putative horizontal gene transfers from the admixture of genes from autonomously replicating bacteria in their hosts' genome databases or from the reagent contamination. PMID:25919787

  5. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  6. Chromosomal localization and sequence variation of 5S rRNA gene in five Capsicum species.

    Park, Y K; Park, K C; Park, C H; Kim, N S

    2000-02-29

    Chromosomal localization and sequence analysis of the 5S rRNA gene were carried out in five Capsicum species. Fluorescence in situ hybridization revealed that the chromosomal location of the 5S rRNA gene was conserved at a single locus on a chromosome that was assigned to chromosome 1 by its synteny relationship with tomato. In sequence analysis, the repeating units of the 5S rRNA genes in the Capsicum species varied in size from 278 bp to 300 bp. Comparing our results with those published by others for other Solanaceae plants, the coding region was highly conserved, but the spacer regions varied in size and sequence. T stretch regions, just after the end of the coding sequences, were more prominent in the Capsicum species than in two other plants. High G·C-rich regions, which might have functions similar to those of the GC islands in genes transcribed by RNA Pol II, were observed after the T stretch region. Although we could not observe TATA-like sequences, an AT-rich segment at -27 to -18 was detected in the 5S rRNA genes of the Capsicum species. Species relationships among the Capsicum species were also studied by sequence comparison of the 5S rRNA genes. While C. chinense, C. frutescens, and C. annuum formed one lineage, C. baccatum was revealed to be an intermediate species between the former three species and C. pubescens. PMID:10774742

  7. Brucella abortus S19 genome sequenced, points toward virulence genes

    Whyte, Barry James

    2008-01-01

    Researchers at the Virginia Bioinformatics Institute at Virginia Tech; the National Animal Disease Center in Ames, Iowa; and collaborators at 454 Life Sciences, Branford, Conn., have sequenced the genome of Brucella abortus strain S19.

  8. Targeting DNA with triplex-forming oligonucleotides to modify gene sequence.

    Simon, Philippe; Cannata, Fabio; Concordet, Jean-Paul; Giovannangeli, Carine

    2008-08-01

    Molecules that interact with DNA in a sequence-specific manner are attractive tools for manipulating gene sequence and expression. For example, triplex-forming oligonucleotides (TFOs), which bind to oligopyrimidine.oligopurine sequences via Hoogsteen hydrogen bonds, have been used to inhibit gene expression at the DNA level as well as to induce targeted mutagenesis in model systems. Recent advances in using oligonucleotides and analogs to target DNA in a sequence-specific manner will be discussed. In particular, chemical modification of TFOs has been used to improve binding to chromosomal target sequences in living cells. Various oligonucleotide analogs have also been found to expand the range of sequences amenable to manipulation, including so-called "Zorro" locked nucleic acids (LNAs) and pseudo-complementary peptide nucleic acids (pcPNAs). Finally, we will examine the potential of TFOs for directing targeted gene sequence modification and propose that synthetic nucleases, based on conjugation of sequence-specific DNA ligands to DNA damaging molecules, are a promising alternative to protein-based endonucleases for targeted gene sequence modification. PMID:18460344

  9. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing

    Naveed, Muhammad; Mubeen, Samavia; Khan, Samiullah; Ahmed, Iftikhar; Khalid, Nauman; Suleria, Hafiz Ansar Rasul; Bano, Asghari; Mumtaz, Abdul Samad

    2014-01-01

    In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Bacterial isolates were identified by 16S rRNA gene sequence analysis, with taxonomic confirmation on the EzTaxon Server. The identified bacterial strains belonged to 5 genera, i.e. Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relat...

  10. Trichinella pseudospiralis vs. T. spiralis thymidylate synthase gene structure and T. pseudospiralis thymidylate synthase retrogene sequence

    Jagielska, Elżbieta; Płucienniczak, Andrzej; Dąbrowska, Magdalena; Dowierciał, Anna; Rode, Wojciech

    2014-01-01

    Background Thymidylate synthase is a housekeeping gene, designated ancient due to its role in DNA synthesis and ubiquitous phyletic distribution. Genomic sequences coding for thymidylate synthase were characterized in two species of the genus Trichinella, the encapsulating T. spiralis and the non-encapsulating T. pseudospiralis. Methods Based on the sequence of parasitic nematode Trichinella spiralis thymidylate synthase cDNA, PCR techniques were employed. Results Each of the respective gene...

  11. Targeting of AID-mediated sequence diversification to immunoglobulin genes.

    Kothapalli, Naga Rama; Fugmann, Sebastian D

    2011-04-01

    Activation-induced cytidine deaminase (AID) is a key enzyme for antibody-mediated immune responses. Antibodies are encoded by the immunoglobulin genes and AID acts as a transcription-dependent DNA mutator on these genes to improve antibody affinity and effector functions. An emerging theme in the field is that many transcribed genes are potential targets of AID, presenting an obvious danger to genomic integrity. Thus there are mechanisms in place to ensure that mutagenic outcomes of AID activity are specifically restricted to the immunoglobulin loci. Cis-regulatory targeting elements mediate this effect and their mode of action is probably a combination of immunoglobulin gene specific activation of AID and a perversion of faithful DNA repair towards error-prone outcomes. PMID:21295456

  12. Sequencing of the β-tubulin genes in the ascarid nematodes Parascaris equorum and Ascaridia galli.

    Tydén, E; Engström, A; Morrison, D A; Höglund, J

    2013-07-01

    Benzimidazoles (BZ) are used to control infections of the equine roundworm Parascaris equorum and the poultry roundworm Ascaridia galli. There are still no reports of anthelmintic resistance (AR) to BZ in these two nematodes, although AR to BZ is widespread in several other veterinary parasites. Several single nucleotide polymorphisms (SNP) in the β-tubulin genes have been associated with BZ-resistance. In the present study we have sequenced β-tubulin genes: isotype 1 and isotype 2 of P. equorum and isotype 1 of A. galli. Phylogenetic analysis of all currently known isotypes showed that the Nematoda has more diversity among the β-tubulin genes than the Vertebrata. In addition, this diversity is arranged in a more complex pattern of isotypes. Phylogenetically, the A. galli sequence and one of the P. equorum sequences clustered with the known Ascaridoidea isotype 1 sequences, while the other P. equorum sequence did not cluster with any other β-tubulin sequences. We therefore conclude that this is a previously unreported isotype 2. The β-tubulin gene sequences were used to develop a PCR for genotyping SNP in codons 167, 198 and 200. No SNP was observed despite sequencing 95 and 100 individual adult worms of P. equorum and A. galli, respectively. Given the diversity of isotype patterns among nematodes, it is likely that associations of genetic data with BZ-resistance cannot be generalised from one taxonomic group to another. PMID:23685342
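
    The abstract above looks for SNPs at codons 167, 198 and 200 of the β-tubulin genes. As a minimal sketch of the sequence-side bookkeeping only (not the PCR assay the authors developed), the Python snippet below pulls out those codons from two coding sequences and reports where they differ. The sequences are toy data invented for illustration, and the helper names `codon_at`, `compare_codons` and `toy_cds` are hypothetical; it assumes both inputs are in-frame, intron-free and already aligned codon for codon.

```python
CODONS_OF_INTEREST = (167, 198, 200)  # codon positions named in the abstract


def codon_at(cds, position):
    """Return the codon at a 1-based codon position of an in-frame coding sequence."""
    start = 3 * (position - 1)
    codon = cds[start:start + 3]
    if len(codon) != 3:
        raise ValueError(f"sequence too short for codon {position}")
    return codon


def compare_codons(reference, sample, positions=CODONS_OF_INTEREST):
    """Map each position whose codon differs to a (reference, sample) pair."""
    differences = {}
    for position in positions:
        ref_codon = codon_at(reference, position)
        sample_codon = codon_at(sample, position)
        if ref_codon != sample_codon:
            differences[position] = (ref_codon, sample_codon)
    return differences


def toy_cds(overrides, length_in_codons=210):
    """Build an artificial coding sequence (hypothetical data, not a real gene)."""
    codons = ["GCT"] * length_in_codons
    for position, codon in overrides.items():
        codons[position - 1] = codon
    return "".join(codons)


# Two made-up sequences that differ only at codon 200.
reference = toy_cds({167: "TTC", 198: "GAA", 200: "TTC"})
sample = toy_cds({167: "TTC", 198: "GAA", 200: "TAC"})

differences = compare_codons(reference, sample)
print(differences)
```

    With real data, the coding sequences would first have to be extracted from the genomic sequence and aligned; the sketch deliberately leaves that step out.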

  13. IDENTIFICATION OF THREE FRUIT-ROT FUNGI OF BANANA BY 28S RIBOSOMAL DNA SEQUENCING

    Supriya Sarkar, S Girisham and SM Reddy

    2013-01-01

    The aim of present investigation was to identify three fruit-rot fungi-Macrophomina phaseolina (Tassi) Goid, Fusarium oxysporum (Schlechtend) and Nigrospora oryzae (Berk and Br.) Petch isolated from banana fruits [Rasthali (Silk AAB) and Cavendish (AAA) varieties]. Out of different fungal genera isolated, the above fungi were responsible for maximum loss of banana fruits as they spread rapidly into the fruit pulp and deteriorated the fruits. The amplification studies of fragment of D2 region ...

  14. Cloning and Sequence Analysis of Light Variable Region Gene of Anti-human Retinoblastoma Monoclonal Antibody

    Xiufeng Zhong; Yongping Li; Shuqi Huang; Bo Ning; Chunyan Zhang; Jianliang Zheng; Guanguang Feng

    2002-01-01

    Purpose: To clone the light-chain variable region gene of a monoclonal antibody against human retinoblastoma and to characterize its nucleotide and amino acid sequences. Methods: Total RNA was extracted from 3C6 hybridoma cells secreting a specific monoclonal antibody (McAb) against human retinoblastoma (RB), then reverse transcribed into cDNA with oligo-dT primers. The variable region of the light chain (VL) gene fragment was amplified using polymerase chain reaction (PCR) and cloned into the pGEM®-T Easy vector. The 3C6 VL cDNA was then sequenced by Sanger's method. Homology analysis was done with NCBI BLAST. Results: The complete nucleotide sequence of 3C6 VL cDNA consisted of 321 bp encoding 107 amino acid residues, containing four framework regions (FRs) and three complementarity-determining regions (CDRs) as well as the typical structure of two Cys residues. The sequence is most homologous to a member of the Vk9 gene family, and its chain utilizes the Jk1 gene segment. Conclusion: The light chain variable region gene of the McAb against human RB was amplified successfully; it belongs to the Vk9 gene family and utilizes a Vk-Jk1 gene rearrangement. This study lays a good basis for constructing a recombinant antibody and for developing new targeted therapeutic agents against retinoblastoma.

  15. Identification of a New Variable Sequence in the P1 Cytadhesin Gene of Mycoplasma pneumoniae: Evidence for the Generation of Antigenic Variation by DNA Recombination between Repetitive Sequences

    Kenri, Tsuyoshi; Taniguchi, Rie; Sasaki, Yuko; Okazaki, Norio; Narita, Mitsuo; Izumikawa, Kinichi; Umetsu, Masao; Sasaki, Tsuguo

    1999-01-01

    A Mycoplasma pneumoniae cytadhesin P1 gene with novel nucleotide sequence variation has been identified. Four clinical strains of M. pneumoniae were found to carry this type of P1 gene. This new P1 gene is similar to the known group II P1 genes but possesses novel sequence variation of approximately 300 bp in the RepMP2/3 region. The position of the new variable region is distant from the previously reported variable regions known to differ between group I and II P1 genes. Two sequences close...

  16. Experimental Conditions: SE28_S03_M04_D01 [Metabolonote[Archive

    SE28_S03_M04_D01 SE28 Comparison of leaf metabolites among developmental stages of ...Hevea brasiliensis SE28_S03 Hevea brasiliensis leaf SE28_S03_M04 6.7 mg [MassBase ID] MDLC1_21613 SE28_MS2 LC-FT-ICR-MS ESI posit...ive method 2 SE28_DS1 PowerGet analysis for detection of all peaks (B2) 6|ITMS 2 SE28_AM1 PowerGet annotation A1 ...

  17. Experimental Conditions: SE28_S01_M02_D01 [Metabolonote[Archive

    SE28_S01_M02_D01 SE28 Comparison of leaf metabolites among developmental stages of ...Hevea brasiliensis SE28_S01 Hevea brasiliensis leaf SE28_S01_M02 6.7 mg [MassBase ID] MDLC1_20370 SE28_MS1 LC-FT-ICR-MS ESI posit...ive method 1 SE28_DS1 PowerGet analysis for detection of all peaks (B2) 6|ITMS 2 SE28_AM1 PowerGet annotation A1 ...

  18. Experimental Conditions: SE28_S03_M06_D01 [Metabolonote[Archive

    SE28_S03_M06_D01 SE28 Comparison of leaf metabolites among developmental stages of ...Hevea brasiliensis SE28_S03 Hevea brasiliensis leaf SE28_S03_M06 6.7 mg [MassBase ID] MDLC1_21615 SE28_MS2 LC-FT-ICR-MS ESI posit...ive method 2 SE28_DS1 PowerGet analysis for detection of all peaks (B2) 6|ITMS 2 SE28_AM1 PowerGet annotation A1 ...

  19. Experimental Conditions: SE28_S01_M03_D01 [Metabolonote[Archive

    SE28_S01_M03_D01 SE28 Comparison of leaf metabolites among developmental stages of ...Hevea brasiliensis SE28_S01 Hevea brasiliensis leaf SE28_S01_M03 6.7 mg [MassBase ID] MDLC1_20371 SE28_MS1 LC-FT-ICR-MS ESI posit...ive method 1 SE28_DS1 PowerGet analysis for detection of all peaks (B2) 6|ITMS 2 SE28_AM1 PowerGet annotation A1 ...

  20. Experimental Conditions: SE28_S04_M01_D01 [Metabolonote[Archive

    SE28_S04_M01_D01 SE28 Comparison of leaf metabolites among developmental stages of ...Hevea brasiliensis SE28_S04 Hevea brasiliensis leaf SE28_S04_M01 6.7 mg [MassBase ID] MDLC1_20378 SE28_MS1 LC-FT-ICR-MS ESI posit...ive method 1 SE28_DS1 PowerGet analysis for detection of all peaks (B2) 6|ITMS 2 SE28_AM1 PowerGet annotation A1 ...

  1. Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

    Fields, C.A.

    1996-06-01

    The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler), has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0, has now been completed and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project, including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.

  2. Sequencing and mapping hemoglobin gene clusters in the Australian model dasyurid marsupial Sminthopsis macroura

    De Leo, A.A.; Wheeler, D.; Lefevre, C.; Cheng, Jan-Fang; Hope, R.; Kuliwaba, J.; Nicholas, K.R.; Westermanc, M.; Graves, J.A.M.

    2004-07-26

    Comparing globin genes and their flanking sequences across many species has allowed globin gene evolution to be reconstructed in great detail. Marsupial globin sequences have proved to be of exceptional significance. A previous finding of a beta-like omega gene in the alpha cluster in the tammar wallaby suggested that the alpha and beta cluster evolved via genome duplication and loss rather than tandem duplication. To confirm and extend this important finding we isolated and sequenced BACs containing the alpha and beta loci from the distantly related Australian marsupial Sminthopsis macroura. We report that the alpha gene lies in the same BAC as the beta-like omega gene, implying that the alpha-omega juxtaposition is likely to be conserved in all marsupials. The LUC7L gene was found 3' of the S. macroura alpha locus, a gene order shared with humans but not mouse, chicken or fugu. Sequencing a BAC contig that contained the S. macroura beta globin and epsilon globin loci showed that the globin cluster is flanked by olfactory genes, demonstrating a gene arrangement conserved for over 180 MY. Analysis of the region 5' to the S. macroura epsilon globin gene revealed a region similar to the eutherian LCR, containing sequences and potential transcription factor binding sites with homology to eutherian hypersensitive sites 1 to 5. FISH mapping of BACs containing S. macroura alpha and beta globin genes located the beta globin cluster on chromosome 3q and the alpha locus close to the centromere on 1q, resolving contradictory map locations obtained by previous radioactive in situ hybridization.

  3. How the sequence of a gene can tune its translation

    Fredrick, Kurt; Ibba, Michael

    2010-01-01

    Sixty-one codons specify 20 amino acids, offering cells many options for encoding a polypeptide sequence. Two new studies (Cannarozzi et al., 2010; Tuller et al., 2010) now foster the idea that patterns of codon usage can control ribosome speed, fine-tuning translation to increase the efficiency...
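
    The opening claim of this abstract (61 sense codons for 20 amino acids) can be checked directly. The sketch below rebuilds the standard genetic code from its usual compact string form and counts the synonymous codons per amino acid; it assumes the standard code (NCBI translation table 1) and does not apply to organisms or organelles that use variant codes. The variable names are illustrative only.

```python
# Standard genetic code (NCBI translation table 1) in its usual compact form:
# codons are enumerated with the bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codon_table = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Group sense codons by the amino acid they encode ("*" marks a stop codon).
synonymous = {}
for codon, amino_acid in codon_table.items():
    if amino_acid != "*":
        synonymous.setdefault(amino_acid, []).append(codon)

sense_codon_count = sum(len(codons) for codons in synonymous.values())
print(sense_codon_count, "sense codons for", len(synonymous), "amino acids")
print("Leucine codons:", sorted(synonymous["L"]))
```

    The redundancy visible here (for example, six codons for leucine) is what gives a cell the choice among synonymous codons that the abstract refers to.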

  4. Neural network predicts sequence of TP53 gene based on DNA chip

    Spicker, J.S.; Wikman, F.; Lu, M.L.

    2002-01-01

    We have trained an artificial neural network to predict the sequence of the human TP53 tumor suppressor gene based on a p53 GeneChip. The trained neural network uses as input the fluorescence intensities of DNA hybridized to oligonucleotides on the surface of the chip and makes between zero and...

  5. GIPS: A Software Guide to Sequencing-Based Direct Gene Cloning in Forward Genetics Studies.

    Hu, Han; Wang, Weitao; Zhu, Zhongxu; Zhu, Jianhua; Tan, Deyong; Zhou, Zhipeng; Mao, Chuanzao; Chen, Xin

    2016-04-01

    The Gene Identification via Phenotype Sequencing (GIPS) software considers a range of experimental and analysis choices in sequencing-based forward genetics studies within an integrated probabilistic framework, which enables direct gene cloning from the sequencing of several unrelated mutants of the same phenotype without the need to create segregation populations. GIPS estimates four measurements to help optimize an analysis procedure as follows: (1) the chance of reporting the true phenotype-associated gene; (2) the expected number of random genes that may be reported; (3) the significance of each candidate gene's association with the phenotype; and (4) the significance of violating the Mendelian assumption if no gene is reported or if all candidate genes have failed validation. The usage of GIPS is illustrated with the identification of a rice (Oryza sativa) gene that epistatically suppresses the phenotype of the phosphate2 mutant from sequencing three unrelated ethyl methanesulfonate mutants. GIPS is available at https://github.com/synergy-zju/gips/wiki with the user manual and an analysis example. PMID:26842621

  6. Sequence of the Proteus mirabilis urease accessory gene ureG.

    Sriwanthana, B; Island, M D; Mobley, H L

    1993-07-15

    We report the sequence of ureG, an accessory gene that is a part of the ure gene cluster of uropathogenic Proteus mirabilis and required for full enzymatic activity of urease. The 615-bp open reading frame predicts a M(r) 22,374 polypeptide, which contains a consensus amino acid (aa) sequence for ATP-binding. The polypeptide shares sequence homology with UreG of Escherichia coli (93% of identical aa), Klebsiella aerogenes (59%) and Helicobacter pylori (59%). PMID:8335248
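
    As a rough plausibility check on the figures above, the snippet below converts the 615-bp open reading frame into a residue count and an approximate molecular mass. Two assumptions are not in the abstract: that the 615 bp include the stop codon, and that an average residue weighs about 110 Da (a common rule of thumb). The result is therefore only an estimate to set beside the reported M(r) of 22,374.

```python
ORF_LENGTH_BP = 615        # from the abstract
REPORTED_MR = 22_374       # from the abstract
AVG_RESIDUE_MASS_DA = 110  # assumption: rule-of-thumb average residue mass

# An open reading frame must be a whole number of codons.
assert ORF_LENGTH_BP % 3 == 0
codons = ORF_LENGTH_BP // 3

# Assumption: the 615 bp include the stop codon, which encodes no residue.
residues = codons - 1

estimated_mr = residues * AVG_RESIDUE_MASS_DA
relative_difference = abs(estimated_mr - REPORTED_MR) / REPORTED_MR

print(f"{codons} codons, {residues} residues")
print(f"estimated M(r) ~ {estimated_mr}, reported {REPORTED_MR}")
print(f"relative difference: {relative_difference:.2%}")
```

    Under these assumptions the estimate lands within about 1% of the reported value, which is as close as a rule-of-thumb calculation can be expected to get.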

  7. Isolation and nucleotide sequence of a mouse histidine tRNA gene.

    Han, J. H.; Harding, J D

    1982-01-01

    We have sequenced a 1307 base pair mouse genomic DNA fragment which contains a histidine tRNA gene. The sequence of the putative mouse histidine tRNA differs from the published sequence of sheep liver histidine tRNA by a single base change in the D-loop. It does not contain an unpaired 5' terminal G residue, as reported for Drosophila and sheep histidine tRNAs. The gene does not contain introns. The 3' flanking region contains a typical RNA polymerase III termination site of 6 consecutive T r...

  8. Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence

    Ortega, Maya

    2010-01-01

    My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and whatever black is, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacterium. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene can be used for bacterial identification of different species based on the sequence of their 16S rRNA gene. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.

  9. Cloning and sequencing of the virulent gene LipL32 of Leptospira interrogans serovar Autumnalis

    Sriram Vamshi Krishna; Siju Joseph; R Ambily; M. Mini; Liya Anto; Sheethal G Mohan

    2013-01-01

    Aim: To clone the virulent gene LipL32 of Leptospira interrogans serovar Autumnalis and to compare its sequence with the LipL32 gene of other pathogenic serovars of Leptospira. Materials and Methods: Leptospira interrogans serovar Autumnalis procured from Leptospira research laboratory, Chennai was used in the study. Polymerase chain reaction (PCR) was carried out for amplifying the LipL32 gene using the reported primers of Leptospira kirschneri. The PCR product was cloned into TA cloning vector and...

  10. Possible origin of sequence divergence in the P1 cytadhesin gene of Mycoplasma pneumoniae.

    Su, C J; Dallo, S F; Chavoya, A; Baseman, J B

    1993-01-01

    Specific regions of the P1 adhesin structural gene of Mycoplasma pneumoniae hybridize to various parts of the mycoplasma genome, indicating their multiple-copy nature. In addition, restriction fragment length polymorphisms and sequence divergence have been observed in the P1 gene, permitting the classification of clinical isolates of M. pneumoniae into two groups, I and II. These data suggest that the observed P1 gene diversity may be explained by homologous recombination between similar but ...

  11. A novel method to discover fluoroquinolone antibiotic resistance (qnr) genes in fragmented nucleotide sequences

    Boulund Fredrik

    2012-12-01

    Background: Broad-spectrum fluoroquinolone antibiotics are central in modern health care and are used to treat and prevent a wide range of bacterial infections. The recently discovered qnr genes provide a mechanism of resistance with the potential to rapidly spread between bacteria using horizontal gene transfer. As for many antibiotic resistance genes present in pathogens today, qnr genes are hypothesized to originate from environmental bacteria. The vast amount of data generated by shotgun metagenomics can therefore be used to explore the diversity of qnr genes in more detail. Results: In this paper we describe a new method to identify qnr genes in nucleotide sequence data. We show, using cross-validation, that the method has high statistical power to correctly classify sequences from novel classes of qnr genes, even for fragments as short as 100 nucleotides. Based on sequences from public repositories, the method was able to identify all previously reported plasmid-mediated qnr genes. In addition, several fragments from novel putative qnr genes were identified in metagenomes. The method was also able to annotate 39 chromosomal variants, of which 11 have not previously been reported in the literature. Conclusions: The method described in this paper significantly improves the sensitivity and specificity of identification and annotation of qnr genes in nucleotide sequence data. The predicted novel putative qnr genes in the metagenomic data support the hypothesis of a large and uncharacterized diversity within this family of resistance genes in environmental bacterial communities. An implementation of the method is freely available at http://bioinformatics.math.chalmers.se/qnr/.

  12. Multi-species microarrays reveal the effect of sequence divergence on gene expression profiles

    Gilad, Yoav; Rifkin, Scott A.; Bertone, Paul; Gerstein, Mark; White, Kevin P

    2005-01-01

    Interspecies comparisons of gene expression levels will increase our understanding of the evolution of transcriptional mechanisms and help to identify targets of natural selection. This approach holds particular promise for apes, as many human-specific adaptations are thought to result from differences in gene expression rather than in coding sequence. To date, however, all studies directly comparing interspecies gene expression have been performed on single-species arrays, so that it has bee...

  13. The vlhA gene sequencing of Iranian Mycoplasma synoviae isolates

    Pourbakhsh, S.A.

    2013-12-01

    The variable lipoprotein haemagglutinin (VlhA) expressed by Mycoplasma synoviae is believed to play a major role in pathogenesis of the disease by mediating adherence and immune evasion. The aim of this study was to sequence Iranian M. synoviae isolates to detect nucleotide variation in the M. synoviae vlhA gene. Using oligonucleotide primers complementary to the single-copy conserved 5′ end of the vlhA gene, amplicons of ~400 bp were generated from 10 M. synoviae isolates from commercial broiler chicken farms in Iran; the conserved domain of the vlhA gene was then sequenced and analyzed. The results showed complete concordance among the nucleotide sequences of all Iranian isolates (nt 1-386). Compared with the vaccine MS-H strain sequence, the entire vlhA sequence downstream of nucleotide 386 differed in all Iranian isolates. Point mutations and a frame-shift mutation were also observed in all Iranian M. synoviae isolates. This study demonstrated a difference between the Iranian isolates and the live commercial vaccine MS-H strain. Furthermore, these data indicate that changes in the vlhA gene sequence could alter the amino acid codons of the expressed vlhA gene and affect the rate of pathogenesis in flocks.

  14. Molecular cloning and analysis of the partial sequence of Rhinopithecus roxellanae growth hormone gene

    徐来祥; 孔繁华; 华育平

    2000-01-01

    The growth hormone (GH) gene of Rhinopithecus roxellanae was amplified for the first time by PCR, based on the sequences of reported mammalian growth hormone genes. The amplified fragment was about 1.8 kb. It was cloned and its upstream region was sequenced. The sequenced region consists of the 5′ flanking regulatory region, exon I, intron I and part of exon II of the growth hormone gene. Comparing the corresponding growth hormone gene sequences of Rhinopithecus roxellanae and the pig, we concluded that the homology reached 81% in the region, and that the 5′ flanking sequence was highly conserved. About 90% of the amino acids encoded by exon I and exon II were the same as those in pig. Many mutations occurred at degenerate sites of the triplet code. The nucleotides of intron I showed only 72% homology with those in pig. This suggests that introns and the 3′ flanking sequence may play an important part in growth hormone gene regulation in different animals.

  15. The Cloning and Sequencing of Read-through Protein Gene from BYDV-GAV Virus

    CHANG Sheng-jun; WANG Xi-feng; LI Li; MA Zhan-hong; ZHOU Guang-he

    2001-01-01

    The cDNA of the BYDV-GAV read-through protein (RTP) gene was amplified from the extracted RNA of BYDV-GAV by using the polymerase chain reaction (PCR), and cloned into pGEM-7Zf(+). Its complete nucleotide sequence was determined by the dideoxynucleotide chain-termination method. The BYDV-GAV RTP gene consists of 1377 nt. Its sequence was most similar to that of the RTP gene of BYDV-MAV, with identities of 87.4% and 87.1% at the nucleotide and amino acid levels, respectively.

  16. Cloning and sequencing of the gene for human β-casein

    Human β-casein is a major protein in human milk. This protein is part of the casein micelle and has been suggested to have several physiological functions in the newborn. Since there is limited information on β-casein and the factors that affect its concentration in human milk, the authors have isolated and sequenced the gene for this protein. A human mammary gland cDNA library (Clontech) in λgt11 was screened by plaque hybridization using a 42-mer synthetic 32P-labelled oligonucleotide. Positive clones were identified and isolated, DNA was prepared and the gene isolated by cleavage with EcoRI. Following subcloning (pUC18), restriction mapping and Southern blotting, DNA for sequencing was prepared. The gene was sequenced by the dideoxy method. Human β-casein has 212 amino acids, and the amino acid sequence deduced from the nucleotide sequence is 91% identical to the published sequence for human β-casein. Sequence comparisons show a high degree of conservation at the leader peptide and the highly phosphorylated sequences, but also deletions and divergence at several positions. These results provide insight into the structure of the human β-casein gene and will facilitate studies on factors affecting its expression.

  17. Combined sequence and sequence-structure based methods for analyzing FGF23, CYP24A1 and VDR genes.

    Nagamani, Selvaraman; Singh, Kh Dhanachandra; Muthusamy, Karthikeyan

    2016-09-01

    FGF23, CYP24A1 and VDR together play a significant role in genetic susceptibility to chronic kidney disease (CKD). Identification of possible causative mutations may provide therapeutic targets and diagnostic markers for CKD. We therefore adopted both sequence-based and sequence-structure-based SNP analysis algorithms in order to overcome the limitations of each method. We explored their functional significance for the prediction of risky SNPs associated with CKD. We assessed the performance of four widely used pathogenicity prediction methods. We compared the programs using the Matthews correlation coefficient (MCC); performance ranged from poor (MCC = 0.39) to reasonably good (MCC = 0.42). However, we obtained the best results with the combined sequence and structure based analysis method (MCC = 0.45). Four SNPs from the FGF23 gene, 8 SNPs from the VDR gene and 13 SNPs from the CYP24A1 gene were predicted to be causative agents for human diseases. This study will be helpful in selecting potential SNPs for experimental study from the SNP pool and will also reduce the cost of identifying potential SNPs as genetic markers. PMID:27114920
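
    The Matthews correlation coefficient used in the abstract above to compare predictors can be sketched as follows. This is the standard confusion-matrix formula; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    Returns 0.0 when the denominator is zero (a common convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts for one pathogenicity predictor:
    # 40 true positives, 30 true negatives, 20 false positives, 10 false negatives.
    print(round(matthews_corrcoef(tp=40, tn=30, fp=20, fn=10), 3))
```

    MCC ranges from -1 to +1, with 0 corresponding to chance-level prediction, which is why values around 0.4 are described above as only poor to reasonably good.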

  18. Bidirectional gene sequences with similar homology to functional proteins of alkane degrading bacterium Pseudomonas frederiksbergensis DNA

    The potential for two overlapping fragments of DNA from a clone of the newly isolated alkane-degrading bacterium Pseudomonas frederiksbergensis to encode sequences with similar homology to two parts of functional proteins is described. One strand contains a sequence with high homology to alkane monooxygenase (alkB), a member of the alkane hydroxylase family, and the other strand contains a sequence with some homology to the alcohol dehydrogenase gene (alkJ). Overlapping of genes on opposite strands has been reported in eukaryotic species, and is now reported in a bacterial species. The sequence comparisons and ORF results revealed that the regulation and organization of the genes involved in alkane oxidation, as represented in Pseudomonas frederiksbergensis, vary among the different known alkane-degrading bacteria. In the alk gene cluster, the homologues of the known alkane monooxygenase (alkB) and rubredoxin (alkG) are oriented in the same direction, whereas alcohol dehydrogenase (alkJ) is oriented in the opposite direction. Such genomes encode messages on both strands of the DNA, or in overlapping but different reading frames of the same strand of DNA. The creation of novel genes from pre-existing sequences, known as overprinting, is a widespread phenomenon in small viruses. Here, the origin and evolution of the gene overlap to bacteriophages belonging to the family Microviridae have been investigated. Such a phenomenon is most widely described in extremely small genomes such as those of viruses or small plasmids, yet here it appears as a unique phenomenon. (author)

  19. Complexity of rice Hsp100 gene family: lessons from rice genome sequence data

    Gaurav Batra; Vineeta Singh Chauhan; Amanjot Singh; Neelam K Sarkar; Anil Grover

    2007-04-01

    Elucidation of the genome sequence provides an excellent platform for understanding the detailed complexity of the various gene families. Hsp100 is an important family of chaperones in diverse living systems. There are eight putative gene loci encoding Hsp100 proteins in the Arabidopsis genome. In rice, two full-length Hsp100 cDNAs have been isolated and sequenced so far. In silico analysis of the rice genomic sequence showed that the two isolated rice Hsp100 cDNAs correspond to the Os05g44340 and Os02g32520 genes in the rice genome database. There appear to be three additional proteins (encoded by the Os03g31300, Os04g32560 and Os04g33210 gene loci) that are variably homologous to Os05g44340 and Os02g32520 throughout the entire amino acid sequence. These five rice Hsp100 genes show significant similarities in the signature sequences known to be conserved among Hsp100 proteins. While Os05g44340 encodes a cytoplasmic Hsp100 protein, those encoded by the other four genes are predicted to have chloroplast transit peptides.

  20. Presence and Expression of Microbial Genes Regulating Soil Nitrogen Dynamics Along the Tanana River Successional Sequence

    Boone, R. D.; Rogers, S. L.

    2004-12-01

    We report on work to assess the functional gene sequences for soil microbiota that control nitrogen cycle pathways along the successional sequence (willow, alder, poplar, white spruce, black spruce) on the Tanana River floodplain, Interior Alaska. Microbial DNA and mRNA were extracted from soils (0-10 cm depth) for amoA (ammonia monooxygenase), nifH (nitrogenase reductase), napA (nitrate reductase), and nirS and nirK (nitrite reductase) genes. Gene presence was determined by amplification of a conserved sequence of each gene employing sequence-specific oligonucleotide primers and the polymerase chain reaction (PCR). Expression of the genes was measured via nested reverse transcriptase PCR amplification of the extracted mRNA. Amplified PCR products were visualized on agarose electrophoresis gels. All five successional stages show evidence for the presence and expression of microbial genes that regulate N fixation (free-living), nitrification, and nitrate reduction. We detected (1) nifH, napA, and nirK presence and amoA expression (mRNA production) for all five successional stages and (2) nirS and amoA presence and nifH, nirK, and napA expression for early successional stages (willow, alder, poplar). The results highlight that the existing body of process-level work has not sufficiently considered the microbial potential for a nitrate economy and free-living N fixation along the complete floodplain successional sequence.

  1. Cloning, sequencing and identification of single nucleotide polymorphisms of partial sequence on the porcine CACNA1S gene

    2008-01-01

    The CACNA1S gene encodes the α1 subunit of the calcium channel. Mutation of the CACNA1S gene can cause hypokalemic periodic paralysis (HypoKPP) and malignant hyperthermia syndrome (MHS) in human beings. Current research on CACNA1S has been mainly in humans and model animals, and rarely in livestock and poultry. In this study, Yorkshire pigs (23), Pietrain pigs (30), Jinhua pigs (115) and second-generation crossbreds of Jinhua and Pietrain (126) were used. Primers were designed according to the sequence of the human CACNA1S gene and PCR was carried out using pig genomic DNA. PCR products were sequenced and compared with the human sequence, and then single nucleotide polymorphisms (SNPs) were investigated by PCR-SSCP, while PCR-RFLP tests were performed to validate the mutations. Results indicated: (1) 5211 bp of DNA fragments of the porcine CACNA1S gene were acquired (GenBank accession number: DQ767693) and the identity of the exon region was 82.6% between human and pig; (2) fifty-seven mutations were found within the cloned sequences, among which 24 were in the exon region; (3) the results of PCR-RFLP were in accordance with those of PCR-SSCP. According to the ESTs of the porcine CACNA1S gene published in GenBank (Bx914582, Bx666997), 8 of the 11 SNPs identified in the present study were consistent with the base differences between the two EST fragments.

  2. Sequence diversities of serine-aspartate repeat genes among Staphylococcus aureus isolates from different hosts presumably by horizontal gene transfer.

    Huping Xue

    BACKGROUND: Horizontal gene transfer (HGT) is recognized as one of the major forces for bacterial genome evolution. Many clinically important bacteria may acquire virulence factors and antibiotic resistance through HGT. Comparative genomic analysis has become an important tool for identifying HGT in emerging pathogens. In this study, the Serine-Aspartate Repeat (Sdr) family has been compared among different sources of Staphylococcus aureus (S. aureus) to discover sequence diversities within their genomes. METHODOLOGY/PRINCIPAL FINDINGS: Four sdr genes were analyzed for 21 different S. aureus strains and 218 mastitis-associated S. aureus isolates from Canada. Comparative genomic analyses revealed that S. aureus strains from bovine mastitis (RF122 and mastitis isolates in this study), ovine mastitis (ED133), pig (ST398), chicken (ED98), and human methicillin-resistant S. aureus (MRSA) (TCH130, MRSA252, Mu3, Mu50, N315, 04-02981, JH1 and JH9) were highly associated with one another, presumably due to HGT. In addition, several types of insertion and deletion were found in the sdr genes of many isolates. A new insertion sequence was found in mastitis isolates, which was presumably responsible for the HGT of the sdrC gene among different strains. Moreover, the sdr genes could be used to type S. aureus. Regional differences in sdr gene distribution were also indicated among the tested S. aureus isolates. Finally, certain associations were found between sdr genes and subclinical or clinical mastitis isolates. CONCLUSIONS: Certain sdr gene sequences were shared in S. aureus strains and isolates from different species, presumably due to HGT. Our results also suggest that distributional assays of virulence factors should detect the full sequences or full functional regions of these factors. The traditional assay using short conserved regions may not be accurate or credible. These findings have important implications with regard to animal husbandry practices that may ...

  3. Genomic organization and sequence analysis of the vomeronasal receptor V2R genes in mouse genome

    YANG Hui; Zhang YaPing

    2007-01-01

    Two multigene superfamilies, named V1R and V2R, encoding seven-transmembrane-domain G-protein coupled receptors (GPCRs) have been identified as pheromone receptors in mammals. Three V2R gene families have been described in mouse and rat. Here we screened the updated mouse genome sequence database and retrieved 63 putative functional V2R genes, including three newly identified genes which formed an additional family. We described the genomic organization of these genes and also characterized the conservation of mouse V2R protein sequences. The genomic and sequence information described here is useful as part of the evidence for inferring the functional domains of V2Rs and should aid future functional studies.

  4. Evolution at Two Levels in Fire Ants: The Relationship between Patterns of Gene Expression and Protein Sequence Evolution

    Hunt, B. G.; Ometto, L.; Keller, L.; Goodisman, M. A. D.

    2013-01-01

    Variation in protein sequence and gene expression each contribute to phenotypic diversity, and may be subject to similar selective pressures. Eusocial insects are particularly useful for investigating the evolutionary link between protein sequence and condition-dependent patterns of gene expression because gene expression plays a central role in determining differences between eusocial insect sexes and castes. We investigated the relationship between protein coding sequence evolution and gene...

  5. Sequence analysis of the ERCC2 gene regions in human, mouse, and hamster reveals three linked genes

    Lamerdin, J.E.; Stilwagen, S.A.; Ramirez, M.H. [Lawrence Livermore National Lab., CA (United States)] [and others]

    1996-06-15

    The ERCC2 (excision repair cross-complementing rodent repair group 2) gene product is involved in transcription-coupled repair as an integral member of the basal transcription factor BTF2/TFIIH complex. Defects in this gene can result in three distinct human disorders, namely the cancer-prone syndrome xeroderma pigmentosum complementation group D, trichothiodystrophy, and Cockayne syndrome. We report the comparative analysis of 91.6 kb of new sequence including 54.3 kb encompassing the human ERCC2 locus, the syntenic region in the mouse (32.6 kb), and a further 4.7 kb of sequence 3′ of the previously reported ERCC2 region in the hamster. In addition to ERCC2, our analysis revealed the presence of two previously undescribed genes in all three species. The first is centromeric (in the human) to ERCC2 and is most similar to the kinesin light chain gene in sea urchin. The second gene is telomeric (in the human) to ERCC2 and contains a motif found in ankyrins, some cell cycle proteins, and transcription factors. Multiple EST matches to this putative new gene indicate that it is expressed in several human tissues, including breast. The identification and description of two new genes provides potential candidate genes for disorders mapping to this region of 19q13.2. 42 refs., 6 figs., 3 tabs.

  6. Sequence analysis of the ERCC2 gene regions in human, mouse, and hamster reveals three linked genes.

    Lamerdin, J E; Stilwagen, S A; Ramirez, M H; Stubbs, L; Carrano, A V

    1996-06-15

    The ERCC2 (excision repair cross-complementing rodent repair group 2) gene product is involved in transcription-coupled repair as an integral member of the basal transcription factor BTF2/TFIIH complex. Defects in this gene can result in three distinct human disorders, namely the cancer-prone syndrome xeroderma pigmentosum complementation group D, trichothiodystrophy, and Cockayne syndrome. We report the comparative analysis of 91.6 kb of new sequence including 54.3 kb encompassing the human ERCC2 locus, the syntenic region in the mouse (32.6 kb), and a further 4.7 kb of sequence 3' of the previously reported ERCC2 region in the hamster. In addition to ERCC2, our analysis revealed the presence of two previously undescribed genes in all three species. The first is centromeric (in the human) to ERCC2 and is most similar to the kinesin light chain gene in sea urchin. The second gene is telomeric (in the human) to ERCC2 and contains a motif found in ankyrins, some cell cycle proteins, and transcription factors. Multiple EST matches to this putative new gene indicate that it is expressed in several human tissues, including breast. The identification and description of two new genes provides potential candidate genes for disorders mapping to this region of 19q13.2. PMID:8786141

  7. Research Techniques Made Simple: Bacterial 16S Ribosomal RNA Gene Sequencing in Cutaneous Research.

    Jo, Jay-Hyun; Kennedy, Elizabeth A; Kong, Heidi H

    2016-03-01

    Skin serves as a protective barrier and also harbors numerous microorganisms collectively comprising the skin microbiome. As a result of recent advances in sequencing (next-generation sequencing), our understanding of microbial communities on skin has advanced substantially. In particular, the 16S ribosomal RNA gene sequencing technique has played an important role in efforts to identify the global communities of bacteria in healthy individuals and patients with various disorders in multiple topographical regions over the skin surface. Here, we describe basic principles, study design, and a workflow of 16S ribosomal RNA gene sequencing methodology, primarily for investigators who are not familiar with this approach. This article will also discuss some applications and challenges of 16S ribosomal RNA sequencing as well as directions for future development. PMID:26902128

  8. Exome sequencing of 18 Chinese families with congenital cataracts: a new sight of the NHS gene.

    Wenmin Sun

    PURPOSE: The aim of this study was to investigate the mutation spectrum and frequency of 34 known genes in 18 Chinese families with congenital cataracts. METHODS: Genomic DNA and clinical data were collected from 18 families with congenital cataracts. Variations in 34 cataract-associated genes were screened by whole exome sequencing and then validated by Sanger sequencing. RESULTS: Eleven candidate variants in seven of the 34 genes were detected by exome sequencing and then confirmed by Sanger sequencing; two variants were predicted to be benign and the others were pathogenic mutations. The nine mutations were present in 9 of the 18 (50%) families with congenital cataracts. In the four families with mutations in the X-linked NHS gene, no abnormalities other than cataract were recorded, and a pseudo-dominant form of inheritance was suggested, as female carriers also had different forms of cataracts. CONCLUSION: This study expands the mutation spectrum and frequency of genes responsible for congenital cataract. Mutation in NHS is a common cause of nonsyndromic congenital cataract with pseudo-autosomal dominant inheritance. Combined with our previous studies, a genetic basis could be identified in 67.6% of families with congenital cataracts in our case series, in which mutations in genes encoding crystallins, genes encoding connexins, and NHS are responsible for 29.4%, 14.7%, and 11.8% of families, respectively. Our results suggest that mutations in NHS are a common cause of congenital cataract, both syndromic and nonsyndromic.

  9. The nucleotide sequence of the uvrD gene of E. coli.

    Finch, P W; Emmerson, P T

    1984-01-01

    The nucleotide sequence of a cloned section of the E. coli chromosome containing the uvrD gene has been determined. The coding region for the UvrD protein consists of 2,160 nucleotides which would direct the synthesis of a polypeptide 720 amino acids long with a calculated molecular weight of 82 kd. The predicted amino acid sequence of the UvrD protein has been compared with the amino acid sequences of other known adenine nucleotide binding proteins and a common sequence has been identified, ...
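
    The relationship between the coding length and the polypeptide length reported above is simple codon arithmetic, sketched below. The mass estimate uses an average residue mass of about 110 Da, which is a common rule of thumb and an assumption here, not the method the authors used for their calculated 82 kd; the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption


def residues_from_coding_length(coding_nt: int) -> int:
    """Number of amino acids encoded by a coding region of `coding_nt`
    nucleotides, assuming the count excludes the stop codon."""
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("coding length must be a positive multiple of 3")
    return coding_nt // 3


def rough_mass_kda(n_residues: int) -> float:
    """Very rough protein mass estimate in kilodaltons."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    n = residues_from_coding_length(2160)  # coding length from the abstract
    print(n, rough_mass_kda(n))
```

    A mass calculated from the actual amino acid composition, as in the abstract, will differ somewhat from this rough estimate.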

  10. A pilot study of gene testing of genetic bone dysplasia using targeted next-generation sequencing.

    Zhang, Huiwen; Yang, Rui; Wang, Yu; Ye, Jun; Han, Lianshu; Qiu, Wenjuan; Gu, Xuefan

    2015-12-01

    Molecular diagnosis of genetic bone dysplasia is challenging for non-experts. A targeted next-generation sequencing technology was applied to identify the underlying molecular mechanism of bone dysplasia and evaluate the contribution of these genes to patients with bone dysplasia encountered in pediatric endocrinology. A group of unrelated patients (n=82), characterized by short stature, dysmorphology and X-ray abnormalities, in whom mucopolysaccharidoses, GM1 gangliosidosis, mucolipidosis type II/III and achondroplasia owing to the FGFR3 G380R mutation had been excluded, were recruited in this study. Probes were designed to 61 genes selected according to the nosology and classification of genetic skeletal disorders of 2010 by Illumina's online DesignStudio software. DNA was hybridized with probes and then a library was established following the standard Illumina protocols. The amplicon library was sequenced on a MiSeq sequencing system and the data were analyzed by MiSeq Reporter. Mutations of 13 different genes were found in 44 of the 82 patients (54%). Mutations of the COL2A1 gene and the PHEX gene were found in nine patients each (9/44=20%), followed by the COMP gene in 8 (18%), TRPV4 gene in 4 (9%), FBN1 gene in 4 (9%), COL1A1 gene in 3 (6%) and COL11A1, TRAPPC2, MATN3, ARSE, TRPS1, SMARCAL1, ENPP1 gene mutations in one patient each (2% each). In conclusion, mutations of the COL2A1, PHEX and COMP genes are common causes of short stature due to bone dysplasia in outpatient clinics in pediatric endocrinology. Targeted next-generation sequencing is an efficient way to identify the underlying molecular mechanism of genetic bone dysplasia. PMID:26377240
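
    The per-gene counts reported above can be tallied to check that they are consistent with the stated totals. This is only an illustrative sketch: the counts are copied from the abstract, while the variable and function names are hypothetical, and it assumes each mutation-positive patient is counted under exactly one gene.

```python
# Patients carrying a mutation in each gene, as listed in the abstract.
patients_per_gene = {
    "COL2A1": 9, "PHEX": 9, "COMP": 8, "TRPV4": 4, "FBN1": 4, "COL1A1": 3,
    "COL11A1": 1, "TRAPPC2": 1, "MATN3": 1, "ARSE": 1,
    "TRPS1": 1, "SMARCAL1": 1, "ENPP1": 1,
}
COHORT_SIZE = 82  # unrelated patients recruited


def percentage(count: int, total: int) -> float:
    """Share of `count` in `total`, in percent."""
    return 100.0 * count / total


mutation_positive = sum(patients_per_gene.values())

if __name__ == "__main__":
    print(len(patients_per_gene), "genes,", mutation_positive, "patients")
    print("diagnostic yield: %.1f%%" % percentage(mutation_positive, COHORT_SIZE))
    for gene, count in patients_per_gene.items():
        print("%-9s %d (%.1f%% of positives)"
              % (gene, count, percentage(count, mutation_positive)))
```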

  11. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    Kaas Rolf S

    2012-10-01

    BACKGROUND: Escherichia coli exists in commensal and pathogenic forms. By measuring the variation of individual genes across more than a hundred sequenced genomes, gene variation can be studied in detail, including the number of mutations found for any given gene. This knowledge will be useful for creating better phylogenies, for determination of molecular clocks and for improved typing techniques. RESULTS: We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment, and a pan-genome tree, based on gene presence/absence, map the relatedness of the 186 sequenced E. coli genomes. The core-gene tree displays high confidence and divides the E. coli strains into the observed MLST type clades and also separates defined phylotypes. CONCLUSION: The results of comparing a large and diverse E. coli dataset support the theory that reliable and good resolution phylogenies can be inferred from the core-genome. The results further suggest that the resolution at the isolate level may subsequently be improved by targeting more variable genes. The use of whole genome sequencing will make it possible to eliminate, or at least reduce, the need for several typing steps used in traditional epidemiology.
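
    The distinction drawn above between the strict core (clusters in 100% of genomes), the 'soft core' (clusters in at least 95% of genomes) and the pan-genome (all clusters) can be sketched from a gene presence/absence table. The data structure, function name and toy data below are hypothetical; only the 95% and 100% thresholds come from the abstract.

```python
def classify_clusters(presence, n_genomes, soft_threshold=0.95):
    """Classify gene clusters by the fraction of genomes that carry them.

    presence: dict mapping cluster name -> set of genome identifiers.
    Returns the pan-genome size, the soft-core clusters (fraction at or
    above soft_threshold) and the strict-core clusters (all genomes).
    """
    soft_core = []
    strict_core = []
    for cluster, genomes in presence.items():
        fraction = len(genomes) / n_genomes
        if fraction >= soft_threshold:
            soft_core.append(cluster)
        if len(genomes) == n_genomes:
            strict_core.append(cluster)
    return {
        "pan_genome_size": len(presence),
        "soft_core": sorted(soft_core),
        "strict_core": sorted(strict_core),
    }


if __name__ == "__main__":
    genomes = ["g%d" % i for i in range(20)]  # 20 toy genomes
    toy_presence = {
        "clusterA": set(genomes),       # present in 20/20 genomes
        "clusterB": set(genomes[:19]),  # present in 19/20 = 95%
        "clusterC": set(genomes[:10]),  # present in 10/20 = 50%
    }
    print(classify_clusters(toy_presence, n_genomes=len(genomes)))
```

    With draft-quality assemblies a gene can be missing from a genome simply because of an assembly gap, which is one reason a soft-core threshold can be more informative than requiring presence in every genome.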

  12. Sequence and organization of coelacanth neurohypophysial hormone genes: Evolutionary history of the vertebrate neurohypophysial hormone gene locus

    Brenner Sydney

    2008-03-01

    BACKGROUND: The mammalian neurohypophysial hormones, vasopressin and oxytocin, are involved in osmoregulation and uterine smooth muscle contraction, respectively. All jawed vertebrates contain at least one homolog each of vasopressin and oxytocin, whereas jawless vertebrates contain a single neurohypophysial hormone called vasotocin. The vasopressin homolog in non-mammalian vertebrates is vasotocin; and the oxytocin homolog is mesotocin in non-eutherian tetrapods, mesotocin and [Phe2]mesotocin in lungfishes, and isotocin in ray-finned fishes. The genes encoding vasopressin and oxytocin are closely linked in the human and rodent genomes in a tail-to-tail orientation. In contrast, their pufferfish homologs (vasotocin and isotocin) are located on the same strand of DNA, with the isotocin gene located upstream of the vasotocin gene and separated from it by five genes, suggesting that this locus has experienced rearrangements in either the mammalian or the ray-finned fish lineage, or in both lineages. The coelacanths occupy a unique phylogenetic position close to the divergence of the mammalian and ray-finned fish lineages. RESULTS: We have sequenced a coelacanth (Latimeria menadoensis) BAC clone encompassing the neurohypophysial hormone genes and investigated the evolutionary history of the vertebrate neurohypophysial hormone gene locus within a comparative genomics framework. The coelacanth contains vasotocin and mesotocin genes like non-mammalian tetrapods. The coelacanth genes are present on the same strand of DNA with no intervening genes, with the vasotocin gene located upstream of the mesotocin gene. Nucleotide sequences of the second exons of the two genes are under purifying selection, implying a regulatory function. We have also analyzed the neurohypophysial hormone gene locus in the genomes of opossum, chicken and Xenopus tropicalis. The opossum contains two tandem copies of vasopressin and mesotocin genes. The vasotocin and mesotocin genes in chicken and ...

  13. Gene discovery and transcript analyses in the corn smut pathogen Ustilago maydis: expressed sequence tag and genome sequence comparison

    Saville Barry J

    2007-09-01

    BACKGROUND: Ustilago maydis is the basidiomycete fungus responsible for common smut of corn and is a model organism for the study of fungal phytopathogenesis. To aid in the annotation of the genome sequence of this organism, several expressed sequence tag (EST) libraries were generated from a variety of U. maydis cell types. In addition to utility in the context of gene identification and structure annotation, the ESTs were analyzed to identify differentially abundant transcripts and to detect evidence of alternative splicing and anti-sense transcription. RESULTS: Four cDNA libraries were constructed using RNA isolated from U. maydis diploid teliospores (U. maydis strains 518 × 521) and haploid cells of strain 521 grown under nutrient rich, carbon starved, and nitrogen starved conditions. Using the genome sequence as a scaffold, the 15,901 ESTs were assembled into 6,101 contiguous expressed sequences (contigs); among these, 5,482 corresponded to predicted genes in the MUMDB (MIPS Ustilago maydis database), while 619 aligned to regions of the genome not yet designated as genes in MUMDB. A comparison of EST abundance identified numerous genes that may be regulated in a cell type or starvation-specific manner. The transcriptional response to nitrogen starvation was assessed using RT-qPCR. The results of this suggest that there may be cross-talk between the nitrogen and carbon signalling pathways in U. maydis. Bioinformatic analysis identified numerous examples of alternative splicing and anti-sense transcription. While intron retention was the predominant form of alternative splicing in U. maydis, other varieties were also evident (e.g. exon skipping). Selected instances of both alternative splicing and anti-sense transcription were independently confirmed using RT-PCR. CONCLUSION: Through this work: 1) substantial sequence information has been provided for U. maydis genome annotation; 2) new genes were identified through the discovery of 619 ...

  14. Microdiversity of extracellular enzyme genes among sequenced prokaryotic genomes

    Zimmerman, Amy E; Martiny, Adam C.; Allison, Steven D.

    2013-01-01

    Understanding the relationship between prokaryotic traits and phylogeny is important for predicting and modeling ecological processes. Microbial extracellular enzymes have a pivotal role in nutrient cycling and the decomposition of organic matter, yet little is known about the phylogenetic distribution of genes encoding these enzymes. In this study, we analyzed 3058 annotated prokaryotic genomes to determine which taxa have the genetic potential to produce alkaline phosphatase, chitinase and ...

  15. nef gene sequence variation among HIV-1-infected African children

    Chakraborty, R.; Reiniš, Milan; Rostron, T.; Philpott, S.; Dong, T.; D'Agostino, A.; Musoke, R.; de Silva, E.; Stumpf, M.; Weiser, B.; Burger, H.; Rowland-Jones, S.L.

    2006-01-01

    Vol. 7, No. 2 (2006), pp. 75-84. ISSN 1464-2662. Other grants: Fogarty International Center, NIH(US) 3D43TW00915; NIH(US) RO1 AI 42555. Institutional research plan: CEZ:AV0Z50520514. Keywords: HIV-1 nef gene * non-clade B * Kenya. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 2.674, year: 2006

  16. Haplotypes and Sequence Variation in the Ovine Adiponectin Gene (ADIPOQ)

    Qing-Ming An

    2015-11-01

    The adiponectin gene (ADIPOQ) plays an important role in energy homeostasis. In this study five separate regions (regions 1 to 5) of ovine ADIPOQ were analysed using PCR-SSCP. Four different PCR-SSCP patterns (A1-D1, A2-D2) were detected in region-1 and region-2, respectively, with seven and six SNPs being revealed. In region-3, three different patterns (A3-C3) and three SNPs were observed. Two patterns (A4-B4, A5-B5) and two and one SNPs were observed in region-4 and region-5, respectively. In total, nineteen SNPs were detected, with five of them in the coding region and two (c.46T/C and c.515G/A) putatively resulting in amino acid changes (p.Tyr16His and p.Lys172Arg). In region-1, -2 and -3 of 316 sheep from eight New Zealand breeds, variants A1, A2 and A3 were the most common, although variant frequencies differed in the eight breeds. Across region-1 and region-3, nine haplotypes were identified and haplotypes A1-A3, A1-C3, B1-A3 and B1-C3 were most common. These results indicate that the ADIPOQ gene is polymorphic and suggest that further analysis is required to see if the variation in the gene is associated with animal production traits.

  17. Automated DNA mutation detection using universal conditions direct sequencing: application to ten muscular dystrophy genes

    Wu Bai-Lin

    2009-10-01

    BACKGROUND: One of the most common and efficient methods for detecting mutations in genes is PCR amplification followed by direct sequencing. Until recently, the process of designing PCR assays has been to focus on individual assay parameters rather than concentrating on matching conditions for a set of assays. Primers for each individual assay were selected based on location and sequence concerns. The two primer sequences were then iteratively adjusted to make the individual assays work properly. This generally resulted in groups of assays with different annealing temperatures that required the use of multiple thermal cyclers or multiple passes in a single thermal cycler, making diagnostic testing time-consuming, laborious and expensive. These factors have severely hampered diagnostic testing services, leaving many families without an answer for the exact cause of a familial genetic disease. A search of GeneTests for sequencing analysis of the entire coding sequence for genes that are known to cause muscular dystrophies returns only a small list of laboratories that perform comprehensive gene panels. The hypothesis for the study was that a complete set of universal assays can be designed to amplify and sequence any gene or family of genes using computer aided design tools. If true, this would allow automation and optimization of the mutation detection process, resulting in reduced cost and increased throughput. RESULTS: An automated process has been developed for the detection of deletions, duplications/insertions and point mutations in any gene or family of genes and has been applied to ten genes known to bear mutations that cause muscular dystrophy: DMD; CAV3; CAPN3; FKRP; TRIM32; LMNA; SGCA; SGCB; SGCG; SGCD. Using this process, mutations have been found in five DMD patients and four LGMD patients (one in the FKRP gene, one in the CAV3 gene, and two likely causative heterozygous pairs of variations in the CAPN3 gene of two other ...

  18. Transcriptome Sequencing and Positive Selected Genes Analysis of Bombyx mandarina

    Tingcai Cheng; Bohua Fu; Yuqian Wu; Renwen Long; Chun Liu; Qingyou Xia

    2015-01-01

    The wild silkworm Bombyx mandarina is widely believed to be an ancestor of the domesticated silkworm, Bombyx mori. Silkworms are often used as a model for studying the mechanism of species domestication. Here, we performed transcriptome sequencing of the wild silkworm using an Illumina HiSeq2000 platform. We produced 100,004,078 high-quality reads and assembled them into 50,773 contigs with an N50 length of 1764 bp and a mean length of 941.62 bp. A total of 33,759 unigenes were identified, wi...
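
    The N50 length quoted above is a standard assembly summary statistic: the length L such that contigs of length L or longer contain at least half of all assembled bases. A minimal sketch follows; the function name and the toy contig lengths are hypothetical and unrelated to the silkworm assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of
    the contig at which the running total first reaches half of the
    total assembled length.
    """
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


if __name__ == "__main__":
    toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]  # total length 32
    print(n50(toy_contigs))
```

    Unlike the mean contig length, N50 is weighted towards long contigs, which is why an assembly can have an N50 well above its mean length, as in the abstract.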

  19. Application of gene sequencing directly to identify the pathogens in specimens

    LU Xin-xin; YUAN Liang; WAN Xiao-hua; GENG Jia-jing

    2010-01-01

    BACKGROUND: Accurate identification of bacterial isolates is an essential task in clinical microbiology. This study compared culturing to analyzing 16S rRNA gene sequences as methods to identify bacteria in clinical samples. We developed a key technique to directly identify bacteria in clinical samples via nucleic acid sequences, thus improving the ability to confirm pathogens. METHODS: We obtained 225 samples from Beijing Tongran Hospital and examined them by conventional culture and 16S rDNA sequencing to identify pathogens. This study made use of a modified sample pre-treatment technique developed in our laboratory to extract DNA. 16S rDNA was amplified by PCR. The amplified product was sequenced on a CEQ8000 capillary sequencer. Sequences were uploaded to the GenBank BLAST database for comparison. RESULTS: Among the positively cultivated bacterial strains, seven strains were identified differently by Vitek32 and by 16S rDNA sequencing. Twelve samples that were negative by standard culturing were determined to have pathogens by sequence analysis. CONCLUSION: The use of 16S rRNA gene sequencing can improve clinical microbiology by providing better identification of unidentified bacteria or providing reference identification of unusual strains.

  20. IS21-558 insertion sequences are involved in the mobility of the multiresistance gene cfr

    Kehrenberg, Corinna; Aarestrup, Frank Møller; Schwarz, Stefan

    2007-01-01

    During a study of florfenicol-resistant porcine staphylococci from Denmark, the genes cfr and fexA were detected in the chromosomal DNA or on plasmids of Staphylococcus hyicus, Staphylococcus warneri, and Staphylococcus simulans. A novel variant of the phenicol resistance transposon Tn558...... was detected on the ca. 43-kb plasmid pSCFS6 in S. warneri and S. simulans isolates. Sequence analysis of a 22,010-bp segment revealed that the new Tn558 variant harbored an additional resistance gene region integrated into the tnpC reading frame. This resistance gene region consisted of the clindamycin...... exporter gene lsa(B) and the gene cfr for combined resistance to phenicols, lincosamides, oxazolidinones, pleuromutilins, and streptogramin A antibiotics bracketed by IS21-558 insertion sequences orientated in the same direction. A 6-bp target site duplication was detected at the integration site within...

  1. Sequence and expression analyses of the UL37 and UL38 genes of Aujeszky's disease virus.

    Braun, A; Kaliman, A; Boldogköi, Z; Aszódi, A; Fodor, I

    2000-01-01

    Previously, we sequenced the HSV-1 UL39-UL40 homologue genes of Aujeszky's disease virus (ADV), also designated as pseudorabies virus (Kaliman et al., 1994a, b). Now we report the nucleotide sequence of the adjacent DNA that encodes UL38, the 5'-region (750 bp) of UL37, and the promoter regions between these two divergently arranged genes. The ADV UL38 gene encodes a protein of 368 amino acids. Amino acid sequence comparison of ADV UL38 with that of other herpesviruses revealed significant structural homology. In a transcription study using RNase protection assay and Northern blot hybridization, we found that the UL38 gene had one initiation site, but the UL37 gene was initiated at two transcription sites with two potential initiator AUGs, one of which was dominant. Comparison of ADV UL37, UL38 and ribonucleotide reductase gene expression showed that these genes belong to the same temporal class with early kinetics. Data from the structural and transcriptional studies suggest that regulation of the expression of these two ADV genes could differ from that of the HSV-1 virus. PMID:11402671

  2. Cloning and sequence analysis of a gene encoding polygalacturonase-inhibiting protein from cotton

    2003-01-01

    Polygalacturonase-inhibiting proteins (PGIP) play important roles in plant defense against pathogens, especially fungi. A pair of degenerate primers was designed based on the conserved sequence of 20 other known pgip genes and used to amplify a Gossypium barbadense cultivar 7124 cDNA library by touch-down PCR. A 561 bp internal fragment of the pgip gene was obtained and used to design the primers for rapid amplification of cDNA ends. A composite pgip gene sequence was constructed from the products of 5′ and 3′ RACE, which are 666 bp and 906 bp respectively. Analysis of the nucleic acid sequence shows 69.2% and 68.7% similarity to Citrus and Poncirus pgip genes, respectively. The open reading frame of the gene encodes a polypeptide of 330 amino acids, in which 10 leucine-rich repeats are arranged tandemly. A new set of primers was designed to the 5′ and 3′ ends of the gene, which allows amplification of the full-length gene from the cotton cDNA library. Genomic DNA analysis reveals that this gene has no intron.

  3. Sequence Analysis of the Protein Structure Homology Modeling of Growth Hormone Gene from Salmo trutta caspius

    Abolhasan Rezaei

    2012-03-01

    The growth hormone gene and protein of Salmo trutta caspius were investigated and characterized. The growth hormone gene in Salmo trutta caspius has six exons over its full length, with a reported molecular weight (kDa) of 64.98 for ssDNA and 129.6 for dsDNA. There are also 210 amino acid residues. The assembled full-length DNA contains the open reading frame of the growth hormone gene, which contains 15 sequences in the full length. The average GC content is 47% and the AT content is 53%. Multiple protein alignment showed that this peptide is 100% identical to the corresponding homologous growth hormone proteins of Salmo salar (accession number AAA49558.1) and rainbow trout (Salmo trutta; accession number AAA49555.1). The protein sequence was deposited in GenBank under accession number AEK70940. We also analyzed the secondary and tertiary structures of the sequences reported in GenBank. The results show homology in secondary structure among the three sequences: Salmo trutta caspius, Salmo salar and rainbow trout. Regarding tertiary structure, Salmo trutta caspius and Salmo salar are of the same type, whereas rainbow trout differs from both. However, three parallel helices were observed in the sequences, and the secondary structures had almost the same percentage of β sheet.

  4. The complete nucleotide sequence and structure of the gene encoding bovine phenylethanolamine N-methyltransferase.

    Batter, D K; D'Mello, S R; Turzai, L M; Hughes, H B; Gioio, A E; Kaplan, B B

    1988-03-01

    A cDNA clone for bovine adrenal phenylethanolamine N-methyltransferase (PNMT) was used to screen a Charon 28 genomic library. One phage was identified, designated lambda P1, which included the entire PNMT gene. Construction of a restriction map, with subsequent Southern blot analysis, allowed the identification of exon-containing fragments. Dideoxy sequence analysis of these fragments, and several more further upstream, indicates that the bovine PNMT gene is 1,594 base pairs in length, consisting of three exons and two introns. The transcription initiation site was identified by two independent methods and is located approximately 12 base pairs upstream from the ATG translation start site. The 3' untranslated region is 88 base pairs in length and contains the expected polyadenylation signal (AATAAA). A putative promoter sequence (TATA box) is located about 25 base pairs upstream from the transcription initiation site. Computer comparison of the nucleotide sequence data with the consensus sequences of known regulatory elements revealed potential binding sites for glucocorticoid receptors and the Sp1 regulatory protein in the 5' flanking region of the gene. Additionally, comparison of the sequence of the exons of the PNMT gene with cDNA sequences for other enzymes involved in biogenic amine synthesis revealed no significant homology, indicating that PNMT is not a member of a multigene family of catecholamine biosynthetic enzymes. PMID:3379652

  5. GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data.

    Ben-Ari Fuchs, Shani; Lieder, Iris; Stelzer, Gil; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit

    2016-03-01

    Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from "data-to-knowledge-to-innovation," a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ ( geneanalytics.genecards.org ), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including the GeneCards®--the human gene database; the MalaCards-the human diseases database; and the PathCards--the biological pathways database. Expression-based analysis in GeneAnalytics relies on the LifeMap Discovery®--the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell "cards" in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics, pharmacogenomics, vaccinomics

  6. Characterization of the Helicoverpa assulta nucleopolyhedrovirus genome and sequence analysis of the polyhedrin gene region

    Soo-Dong Woo; Jae Young Choi; Yeon Ho Je; Byung Rae Jin

    2006-09-01

    A local strain of Helicoverpa assulta nucleopolyhedrovirus (HasNPV) was isolated from infected H. assulta larvae in Korea. Restriction endonuclease fragment analysis, using 4 restriction enzymes, estimated that the total genome size of HasNPV is about 138 kb. A degenerate polymerase chain reaction (PCR) primer set for the polyhedrin gene successfully amplified the partial polyhedrin gene of HasNPV. The sequencing results showed that the PCR product of approximately 430 bp was a fragment of the corresponding polyhedrin gene. Using the partial predicted HasNPV polyhedrin fragment to probe Southern blots, we identified the location of the polyhedrin gene within the 6 kb EcoRI, 15 kb NcoI, 20 kb XhoI, 17 kb BglII and 3 kb ClaI fragments, respectively. The 3 kb ClaI fragment was cloned and the nucleotide sequences of the polyhedrin coding region and its flanking regions were determined. Nucleotide sequence analysis indicated the presence of an open reading frame of 735 nucleotides which could encode 245 amino acids with a predicted molecular mass of 29 kDa. The nucleotide sequences within the coding region of HasNPV polyhedrin shared 73.7% identity with the polyhedrin gene from Autographa californica NPV but were most closely related to Helicoverpa and Heliothis species NPVs with over 99% sequence identity.
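    The relationship between the 735-nucleotide open reading frame and the 245 encoded amino acids is simple codon arithmetic. The sketch below illustrates it, assuming (as the abstract's figures imply) that all 735 nucleotides are counted as coding triplets. The 110 Da average residue mass is a common rule of thumb rather than a value from the paper, so the mass it yields is only a rough estimate. The reported 29 kDa would depend on the actual residue composition.

```python
def codons_in_orf(orf_length_nt):
    """Number of complete codons in an ORF of the given length (nucleotides)."""
    if orf_length_nt <= 0 or orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_length_nt // 3


# Rule-of-thumb average mass per amino acid residue (assumption, not from the paper).
AVG_RESIDUE_MASS_DA = 110

codons = codons_in_orf(735)
print(codons)  # 245

# Rough mass estimate in kDa; composition-based tools give more accurate values.
approx_mass_kda = codons * AVG_RESIDUE_MASS_DA / 1000
print(approx_mass_kda)
```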

  7. Nucleotide sequence of the beta-cyclodextrin glucanotransferase gene of alkalophilic Bacillus sp. strain 1011 and similarity of its amino acid sequence to those of alpha-amylases.

    Kimura, K.; Kataoka, S; Ishii, Y; Takano, T.; Yamane, K

    1987-01-01

    The nucleotide sequence of the gene for cyclodextrin glucanotransferase of alkalophilic Bacillus sp. strain 1011 was determined. The deduced amino acid sequence at the NH2-terminal side of the enzyme showed a high homology with the sequences of alpha-amylases in the three regions which constitute the active centers of alpha-amylases.

  8. Versatile Cosmid Vectors for the Isolation, Expression, and Rescue of Gene Sequences: Studies with the Human α -globin Gene Cluster

    Lau, Yun-Fai; Kan, Yuet Wai

    1983-09-01

    We have developed a series of cosmids that can be used as vectors for genomic recombinant DNA library preparations, as expression vectors in mammalian cells for both transient and stable transformations, and as shuttle vectors between bacteria and mammalian cells. These cosmids were constructed by inserting one of the SV2-derived selectable gene markers (SV2-gpt, SV2-DHFR, and SV2-neo) in cosmid pJB8. High efficiency of genomic cloning was obtained with these cosmids and the size of the inserts was 30-42 kilobases. We isolated recombinant cosmids containing the human α-globin gene cluster from these genomic libraries. The simian virus 40 DNA in these selectable gene markers provides the origin of replication and enhancer sequences necessary for replication in permissive cells such as COS 7 cells and thereby allows transient expression of α-globin genes in these cells. These cosmids and their recombinants could also be stably transformed into mammalian cells by using the respective selection systems. Both of the adult α-globin genes were more actively expressed than the embryonic zeta-globin genes in these transformed cell lines. Because of the presence of the cohesive ends of the Charon 4A phage in the cosmids, the transforming DNA sequences could readily be rescued from these stably transformed cells into bacteria by in vitro packaging of total cellular DNA. Thus, these cosmid vectors are potentially useful for direct isolation of structural genes.

  9. Isolation and characterization of gene sequences expressed in cotton fiber

    Taciana de Carvalho Coutinho; Marcelo de Almeida Guimarães; Marcia Soares Vidal

    2016-01-01

    Cotton fibers are tubular cells which develop from the differentiation of ovule epidermis. In addition to being one of the most important natural fibers of the textile group, cotton fibers afford an excellent experimental system for studying the cell wall. The aim of this work was to isolate and characterise the genes expressed in cotton fiber (Gossypium hirsutum L.) to be used in future work in cotton breeding. Fibers of the cotton cultivar CNPA ITA 90 II were used to extract RNA for th...

  10. The complete mitochondrial genome sequence and gene organization of Tridentiger trigonocephalus (Gobiidae: Gobionellinae) with phylogenetic consideration.

    Wei, Hongqing; Ma, Hongyu; Ma, Chunyan; Zhang, Fengying; Wang, Wei; Chen, Wei; Ma, Lingbo

    2016-09-01

    The complete mitochondrial genome plays an important role in studies of genome-level characteristics and phylogenetic relationships. Here we determined the complete mitogenome sequence of Tridentiger trigonocephalus (Perciformes, Gobiidae) and inferred its phylogenetic relationships. This circular genome was 16 662 bp in length and consisted of 37 typical genes, including 13 protein-coding genes, 22 tRNA genes, and two rRNA genes. The gene order of the T. trigonocephalus mitochondrial genome was identical to that observed in most other vertebrates. Of the 37 genes, 28 were encoded by the heavy strand, while the others were encoded by the light strand. The phylogenetic tree constructed from 13 concatenated protein-coding genes showed that T. trigonocephalus was closest to T. bifasciatus, and then to T. barbatus, among the 20 species within suborder Gobioidei. This work should facilitate studies on population genetic diversity and molecular evolution in Gobioidei fishes. PMID:26370266

  11. Citrus plastid-related gene profiling based on expressed sequence tag analyses

    Tercilio Calsa Jr.

    2007-01-01

    Plastid-related sequences, derived from putative nuclear or plastome genes, were searched in a large collection of expressed sequence tags (ESTs) and genomic sequences from the Citrus Biotechnology initiative in Brazil. The identified putative Citrus chloroplast gene sequences were compared to those from Arabidopsis, Eucalyptus and Pinus. Differential expression profiling for plastid-directed nuclear-encoded proteins and photosynthesis-related gene expression variation between Citrus sinensis and Citrus reticulata, when inoculated or not with Xylella fastidiosa, were also analyzed. Presumed Citrus plastome regions were more similar to Eucalyptus. Some putative genes appeared to be preferentially expressed in vegetative tissues (leaves and bark) or in reproductive organs (flowers and fruits). Genes preferentially expressed in fruit and flower may be associated with hypothetical physiological functions. Expression pattern clustering analysis suggested that photosynthesis- and carbon fixation-related genes appeared to be up- or down-regulated in a resistant or susceptible Citrus species after Xylella inoculation in comparison to non-infected controls, generating novel information which may be helpful to develop novel genetic manipulation strategies to control Citrus variegated chlorosis (CVC).

  12. Candida famata (Debaryomyces hansenii) DNA sequences containing genes involved in riboflavin synthesis.

    Voronovsky, Andriy Y; Abbas, Charles A; Dmytruk, Kostyantyn V; Ishchuk, Olena P; Kshanovska, Barbara V; Sybirna, Kateryna A; Gaillardin, Claude; Sibirny, Andriy A

    2004-11-01

    Previously cloned Candida famata (Debaryomyces hansenii) strain VKM Y-9 genomic DNA fragments containing genes RIB1 (codes for GTP cyclohydrolase II), RIB2 (encodes specific reductase), RIB5 (codes for dimethylribityllumazine synthase), RIB6 (encodes dihydroxybutanone phosphate synthase) and RIB7 (codes for riboflavin synthase) were sequenced. The derived amino acid sequences of C. famata RIB genes showed extensive homology to the corresponding sequences of riboflavin synthesis enzymes of other yeast species. The highest identity was observed to homologues of D. hansenii CBS767, as C. famata is the anamorph of this hemiascomycetous yeast. The D. hansenii CBS767 RIB3 gene encoding specific deaminase was cloned. This gene successfully complemented riboflavin auxotrophy of the rib3 mutant of flavinogenic yeast, Pichia guilliermondii. Putative iron-responsive elements (potential sites for binding of the transcription factors Fep1p or Aft1p and Aft2p) were found in the upstream regions of some C. famata and D. hansenii RIB genes. The sequences of C. famata RIB genes have been submitted to the EMBL data library under Accession Nos AJ810169-AJ810173. PMID:15543522

  13. A flexible and economical barcoding approach for highly multiplexed amplicon sequencing of diverse target genes

    Craig W. Herbold

    2015-07-01

    High throughput sequencing of phylogenetic and functional gene amplicons provides tremendous insight into the structure and functional potential of complex microbial communities. Here, we introduce a highly adaptable and economical PCR approach to barcoding and pooling libraries of numerous target genes. In this approach, we replace gene- and sequencing platform-specific fusion primers with general, interchangeable barcoding primers, enabling nearly limitless customized barcode-primer combinations. Compared to barcoding with long fusion primers, our multiple-target gene approach is more economical because it requires fewer primers overall and is based on short primers with generally lower synthesis and purification costs. To highlight our approach, we pooled over 900 different small-subunit rRNA and functional gene amplicon libraries obtained from various environmental or host-associated microbial community samples into a single, paired-end Illumina MiSeq run. Although the amplicon regions ranged in size from approximately 290 to 720 bp, we found no significant systematic sequencing bias related to amplicon length or gene target. Our results indicate that this flexible multiplexing approach produces large, diverse and high quality sets of amplicon sequence data for modern studies in microbial ecology.

  14. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    Kaas, Rolf Sommer; Rundsten, Carsten Friis; Ussery, David;

    2012-01-01

    creating better phylogenies, for determination of molecular clocks and for improved typing techniques. Results We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps...... 186 sequenced E. coli genomes. The core-gene tree displays high confidence and divides the E. coli strains into the observed MLST type clades and also separates defined phylotypes. Conclusion The results of comparing a large and diverse E. coli dataset support the theory that reliable and good...
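    The distinction drawn above between gene clusters present in at least 95% of genomes (the 'soft core') and those present in 100% of genomes can be expressed as a simple threshold on a presence/absence table. The following is a minimal sketch with hypothetical cluster and genome identifiers. It is not the pipeline used in the study, and the function name `core_clusters` is our own.

```python
def core_clusters(presence, n_genomes, min_percent):
    """Return the clusters found in at least min_percent of genomes.

    presence maps a cluster name to the set of genome ids that carry it.
    Integer arithmetic avoids floating-point edge cases at the threshold.
    """
    return {
        cluster
        for cluster, genomes in presence.items()
        if len(genomes) * 100 >= min_percent * n_genomes
    }


# Hypothetical toy data: 20 genomes, three gene clusters.
genomes = [f"g{i}" for i in range(20)]
presence = {
    "clusterA": set(genomes),        # present in 20/20 genomes
    "clusterB": set(genomes[:19]),   # present in 19/20 genomes (95%)
    "clusterC": set(genomes[:10]),   # present in 10/20 genomes (50%)
}

soft_core = core_clusters(presence, len(genomes), 95)
strict_core = core_clusters(presence, len(genomes), 100)
print(sorted(soft_core))    # ['clusterA', 'clusterB']
print(sorted(strict_core))  # ['clusterA']
```

    A soft-core threshold tolerates a cluster being missed in a few genomes, for example through assembly gaps or annotation errors, which is one common motivation for reporting both figures.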

  15. Analysis of breast cancer metastasis candidate genes from next generation-sequencing via systematic functional genomics

    Blomstrøm, Monica Marie

    2016-01-01

    ) and non-CSCs. The main goal of this project was to functionally characterize a set of candidate genes recovered from next-generation sequencing analysis for their role in breast cancer metastasis formation. The starting gene set comprised 104 gene variants; i.e. 57 wildtype and 47 mutated variants....... During the project, the aim was to generate a panel of genetically identical (“isogenic”) MCF7 breast cancer cell lines with inducible overexpression of the gene variants, and to analyze these for effects on breast cancer growth and invasion in vitro under standardized conditions. Moreover, it was aimed...

  16. Analysis of human growth hormone gene 5' sequences in isolated growth hormone deficiency patients.

    Wang, Y.; Yu, L L; Sheng, Q.; Meng, C; Sun, J.; S.S. Chen

    1994-01-01

    Human growth hormone (hGH) gene deletion (6.7 to 7.6 kb) is one of the causes of isolated growth hormone deficiency (IGHD), named IGHD IA. IGHD IA, however, only accounts for about 10% of the total IGHD patients. Most IGHD is caused by unknown mechanisms. Here, hGH gene 5' sequences in three IGHD patients without hGH gene deletion were analysed to see if there was any mutation hindering the expression of the hGH gene.

  17. Cloning, nucleotide sequence, and expression of the Bacillus subtilis lon gene.

    Riethdorf, S.; Völker, U; Gerth, U.; Winkler, A; Engelmann, S; Hecker, M.

    1994-01-01

    The lon gene of Escherichia coli encodes the ATP-dependent serine protease La and belongs to the family of sigma 32-dependent heat shock genes. In this paper, we report the cloning and characterization of the lon gene from the gram-positive bacterium Bacillus subtilis. The nucleotide sequence of the lon locus, which is localized upstream of the hemAXCDBL operon, was determined. The lon gene codes for an 87-kDa protein consisting of 774 amino acid residues. A comparison of the deduced amino ac...

  18. Cloning and Sequence Analysis of Envelope Glycoprotein E1 Gene of Rubella Virus, JR23 Strain

    王志玉; 薛永磊; 王小凡; 宋艳艳; 温红玲

    2003-01-01

    To construct an expression vector containing the E1 glycoprotein gene of rubella virus for the study of the effect of mutation of the E1 glycoprotein gene and the analysis of phylogenetic differences of sequences, the gene encoding the E1 envelope glycoprotein was amplified from rubella virus, Jinan strain JR23, by RT-PCR and ligated into PMD-18T vector. The clones that carried the E1 gene were identified after ampr selection and analysis of restriction enzyme digestion. After sequencing, this gene was analyzed by Danstar and Winstar programs, and a phylogenetic tree was drawn. The clone of E1 glycoprotein was thus constructed. The sequence differences between JR23 strain and the TCRB strain from Japan and between JR23 strain and the Thomas strain of England were rather small, with difference values of 0.9% and 1.2% respectively. Those between JR23 strain and BRD2 strain from Beijing and between JR23 strain and XG379 strain from Hong Kong were comparatively larger, with difference values of 7.6% and 7.3% respectively. The sequence difference between JR23 strain and other strains was less than 3%, except for the NC strain (3.7%). We conclude that the construction of the E1 glycoprotein gene offers an approach to study the relationship between structures and functions of the E1 gene and its gene products. The phylogenetic tree shows that there are significant differences in the sequences of rubella virus isolated in China, and this might be helpful to develop an effective subunit vaccine.
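    Difference values such as the 0.9% to 7.6% figures above are typically derived from pairwise comparison of aligned sequences. The abstract does not say how its programs computed them, so the following is only a generic sketch of an uncorrected percent difference between two pre-aligned sequences. The sequences are hypothetical and the function name `percent_difference` is our own.

```python
def percent_difference(seq_a, seq_b):
    """Uncorrected percent difference between two aligned sequences.

    Alignment columns containing a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * mismatches / compared


# Hypothetical aligned fragments differing at one of ten positions.
print(percent_difference("ACGTACGTAC", "ACGTACGTTC"))  # 10.0
```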

  19. Rapid evolution of the sequences and gene repertoires of secreted proteins in bacteria.

    Teresa Nogueira

    Proteins secreted to the extracellular environment or to the periphery of the cell envelope, the secretome, play essential roles in foraging, antagonistic and mutualistic interactions. We hypothesize that arms races, genetic conflicts and varying selective pressures should lead to the rapid change of sequences and gene repertoires of the secretome. The analysis of 42 bacterial pan-genomes shows that secreted, and especially extracellular proteins, are predominantly encoded in the accessory genome, i.e. among genes not ubiquitous within the clade. Genes encoding outer membrane proteins might engage more frequently in intra-chromosomal gene conversion because they are more often in multi-genic families. The gene sequences encoding the secretome evolve faster than the rest of the genome and in particular at non-synonymous positions. Cell wall proteins in Firmicutes evolve particularly fast when compared with outer membrane proteins of Proteobacteria. Virulence factors are over-represented in the secretome, notably in outer membrane proteins, but cell localization explains more of the variance in substitution rates and gene repertoires than sequence homology to known virulence factors. Accordingly, the repertoires and sequences of the genes encoding the secretome change fast in the clades of obligatory and facultative pathogens and also in the clades of mutualists and free-living bacteria. Our study shows that cell localization shapes genome evolution. In agreement with our hypothesis, the repertoires and the sequences of genes encoding secreted proteins evolve fast. The particularly rapid change of extracellular proteins suggests that these public goods are key players in bacterial adaptation.

  20. Molecular cloning, sequence characterization, and gene expression profiling of a novel water buffalo (Bubalus bubalis) gene, AGPAT6.

    Song, S; Huo, J L; Li, D L; Yuan, Y Y; Yuan, F; Miao, Y W

    2013-01-01

    Several 1-acylglycerol-3-phosphate-O-acyltransferases (AGPATs) can acylate lysophosphatidic acid to produce phosphatidic acid. Of the eight AGPAT isoforms, AGPAT6 is a crucial enzyme for glycerolipids and triacylglycerol biosynthesis in some mammalian tissues. We amplified and identified the complete coding sequence (CDS) of the water buffalo AGPAT6 gene by using the reverse transcription-polymerase chain reaction, based on the conserved sequence information of the cattle or expressed sequence tags of other Bovidae species. This novel gene was deposited in the NCBI database (accession No. JX518941). Sequence analysis revealed that the CDS of this AGPAT6 encodes a 456-amino acid enzyme (molecular mass = 52 kDa; pI = 9.34). Water buffalo AGPAT6 contains three hydrophobic transmembrane regions and a 37-amino acid signal peptide, localized in the cytoplasm. The deduced amino acid sequences share 99, 98, 98, 97, 98, 98, 97 and 95% identity with their homologous sequences from cattle, horse, human, mouse, orangutan, pig, rat, and chicken, respectively. The phylogenetic tree analysis based on the AGPAT6 CDS showed that water buffalo has a closer genetic relationship with cattle than with other species. Tissue expression profile analysis shows that this gene is highly expressed in the mammary gland; moderately expressed in the heart, muscle, liver, and brain; weakly expressed in the pituitary gland, spleen, and lung; and almost silently expressed in the small intestine, skin, kidney, and adipose tissues. Four predicted microRNA target sites are found in the water buffalo AGPAT6 CDS. These results will establish a foundation for further insights into this novel water buffalo gene. PMID:24114207

  1. GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data

    Ben-Ari Fuchs, Shani; Lieder, Iris; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit

    2016-01-01

    Abstract Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from “data-to-knowledge-to-innovation,” a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ (geneanalytics.genecards.org), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including the GeneCards®—the human gene database; the MalaCards—the human diseases database; and the PathCards—the biological pathways database. Expression-based analysis in GeneAnalytics relies on the LifeMap Discovery®—the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling advanced matching algorithm for gene–tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell “cards” in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics

  2. Next-generation sequencing approach for connecting secondary metabolites to biosynthetic gene clusters in fungi

    Ralph A Cacho

    2015-01-01

    Genomics has revolutionized the research on fungal secondary metabolite biosynthesis. To elucidate the molecular and enzymatic mechanisms underlying the biosynthesis of a specific secondary metabolite compound, the important first step is often to find the genes responsible for its synthesis. The accessibility of fungal genome sequences allows the bypass of the cumbersome traditional library construction and screening approach. Advances in next-generation sequencing (NGS) technologies have further improved the speed and reduced the cost of microbial genome sequencing in the past few years, which has accelerated the research in this field. Here, we will present an example workflow for identifying the gene cluster encoding the biosynthesis of secondary metabolites of interest using an NGS approach. We will also review the different strategies that can be employed to pinpoint the targeted gene clusters rapidly by giving several examples stemming from our work.

  3. Gene organization and complete sequence of the mitochondrial genome of Linwu mallard.

    Tian, Ke-Xiong; Liu, Li-Li; Yu, Qi-Fang; He, Shao-Ping; He, Jian-Hua

    2016-01-01

    Linwu mallard is an excellent native breed from Hunan province in China. This is the first study to determine the complete mitochondrial genome sequence of L. mallard using PCR-based amplification and Sanger sequencing. The characteristics of the entire mitochondrial genome were analyzed in detail, with a base composition of 29.19% A, 22.19% T, 32.83% C, 15.79% G in the L. mallard (16,605 bp in length). It contained 2 ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and a major non-coding control region (D-loop region). The complete mitochondrial genome sequence of L. mallard will be useful for the phylogenetics of poultry, and be available as basic data for genetics and breeding. PMID:24938102
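    Base composition figures like those above are obtained by counting each nucleotide and dividing by the sequence length. A minimal sketch follows. The toy sequence is hypothetical, the function name `base_composition` is our own, and the last lines only check that the percentages reported in the abstract sum to 100% and derive the implied G+C content.

```python
from collections import Counter


def base_composition(sequence):
    """Percentage of A, C, G and T among the unambiguous bases of a sequence."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100 * counts[base] / total for base in "ACGT"}


# Hypothetical 10-base toy sequence, for illustration only.
print(base_composition("AACCCGTTAC"))
# {'A': 30.0, 'C': 40.0, 'G': 10.0, 'T': 20.0}

# Percentages as reported in the abstract.
reported = {"A": 29.19, "T": 22.19, "C": 32.83, "G": 15.79}
print(round(sum(reported.values()), 2))           # 100.0
print(round(reported["G"] + reported["C"], 2))    # G+C content: 48.62
```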

  4. Isolation of laccase gene-specific sequences from white rot and brown rot fungi by PCR

    D'Souza, T.M.; Boominathan, K.; Reddy, C.A. [Michigan State Univ., East Lansing, MI (United States)]

    1996-10-01

    Degenerate primers corresponding to the consensus sequences of the copper-binding regions in the N-terminal domains of known basidiomycete laccases were used to isolate laccase gene-specific sequences from strains representing nine genera of wood rot fungi. All except three gave the expected PCR product of about 200 bp. Computer searches of the databases identified the sequence of each of the PCR products analyzed as a laccase gene sequence, suggesting the specificity of the primers. PCR products of the white rot fungi Ganoderma lucidum, Phlebia brevispora, and Trametes versicolor showed 65 to 74% nucleotide sequence similarity to each other; the similarity in deduced amino acid sequences was 83 to 91%. The PCR products of Lentinula edodes and Lentinus tigrinus, on the other hand, showed relatively low nucleotide and amino acid similarities (58 to 64% and 62 to 81%, respectively); however, these similarities were still much higher than those obtained in comparisons with the corresponding regions in the laccases of the ascomycete fungi Aspergillus nidulans and Neurospora crassa. A few of the white rot fungi, as well as Gloeophyllum trabeum, a brown rot fungus, gave a 144-bp PCR fragment which had a nucleotide sequence similarity of 60 to 71%. Demonstration of laccase activity in G. trabeum and several other brown rot fungi was of particular interest because these organisms were not previously shown to produce laccases. 36 refs., 6 figs., 2 tabs.
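
    A degenerate primer stands for a pool of concrete oligonucleotides, one for each combination of bases allowed at its ambiguous positions. The sketch below expands a primer written with standard IUPAC nucleotide codes. The primer strings are hypothetical examples, not the primers used in this study, and the helper names `expand_degenerate` and `degeneracy` are introduced only for illustration.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """List every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def degeneracy(primer: str) -> int:
    """Count the concrete sequences without enumerating them."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


# Hypothetical primer with one ambiguous position (Y = C or T).
print(expand_degenerate("CAYTGG"))
# Hypothetical primer with two Y positions: 2 * 2 = 4 variants.
print(degeneracy("CAYTGGCAYGG"))
```

    Counting with `degeneracy` first is a reasonable precaution, since the number of variants grows multiplicatively with each ambiguous position.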

  5. Automated conserved noncoding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

    Gina Turco

    2013-07-01

    Conserved noncoding sequences (CNSs) are islands of noncoding sequence that, like protein-coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNSs in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of noncoding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of Arabidopsis, Japonica rice, foxtail millet, sorghum, Brachypodium and maize.

  6. Identification, sequencing and structural analysis of a nifA-like gene of Acetobacter diazotrophicus.

    Teixeira, K R; Morgan, T; Meletzus, D; Galler, R; Baldani, J I; Kennedy, C

    1999-01-01

    A recombinant plasmid, pAD101, containing a DNA fragment of Acetobacter diazotrophicus strain PAL5 was isolated by its ability to restore the Nif+ phenotype to a nifA- ntrC- double mutant of Azotobacter vinelandii. Hybridization with the nifA genes of Azospirillum brasilense located the nifA gene more precisely to specific fragments of pAD101. DNA sequencing of appropriate subclones of pAD101 revealed that the nifA gene was adjacent to the nifB gene in A. diazotrophicus, and the 5' end of the nifB gene was located downstream of the nitrogenase MoFe subunit gene, nifK. The deduced amino acid sequences of the A. diazotrophicus nifA and nifB genes were most similar to the NifA and NifB proteins of Azorhizobium caulinodans and Rhodobacter capsulatus, respectively. In addition, nucleotide sequences upstream of the A. diazotrophicus nifA-encoding region show features similar to those in the A. caulinodans nifA promoter region involved in O2 and fixed N regulation of nifA expression. PMID:10530336

  7. Molecular Identification and Sequencing of Mannose Binding Protein (MBP) Gene of Acanthamoeba palestinensis

    M Rezaeian

    2010-02-01

    Background: Acanthamoeba keratitis is caused by pathogenic Acanthamoeba such as A. palestinensis. Indeed, this species is one of the known causative agents of amoebic keratitis in Iran. Mannose Binding Protein (MBP) is the main pathogenicity factor in the development of this sight-threatening disease. We aimed to characterize the MBP gene in pathogenic Acanthamoeba isolates such as A. palestinensis. Methods: This experimental research was performed in the School of Public Health, Tehran University of Medical Sciences, Tehran, Iran during 2007-2008. A. palestinensis was grown on 2% non-nutrient agar overlaid with Escherichia coli. DNA extraction was performed using the phenol-chloroform method. PCR amplification was done using specific primer pairs for MBP. The amplified fragment was purified and sequenced. Finally, the obtained fragment was deposited in the gene data bank. Results: A 900 bp PCR product was recovered after the PCR reaction. Sequence analysis of the purified PCR product revealed a gene with 943 nucleotides. Homology analysis of the obtained sequence showed 81% similarity with the available MBP gene in the gene data bank. The fragment was deposited in the gene data bank under accession number EU678895. Conclusion: MBP is known as the most important factor in the Acanthamoeba pathogenesis cascade. Therefore, characterization of this gene can aid in developing better therapeutic agents and even immunization of high-risk people.
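
    The record does not say how the 81% similarity figure was calculated. One common convention, shown here only as an assumed illustration, is percent identity over the ungapped columns of a pairwise alignment. The function name `percent_identity` and the toy sequences are hypothetical; alignment tools differ in how they treat gaps, so values from this sketch need not match any particular program.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns containing a gap ('-') in either sequence are
    excluded from both the match count and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment for illustration only: 8 of 10 columns match.
print(percent_identity("ACGTACGTAC", "ACGTACGTTT"))
```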

  8. p21WAF1/CIP1 gene DNA sequencing and its expression in human osteosarcoma

    廖威明; 张春林; 李佛保; 曾炳芳; 曾益新

    2004-01-01

    Background: Mutation and changes in expression of p21WAF1/CIP1 may play a role in the growth of osteosarcoma. This study investigated the expression of the p21WAF1/CIP1 gene in human osteosarcoma, changes in the p21WAF1/CIP1 gene DNA sequence, and their relationships with phenotype and clinical prognosis. Methods: The p21WAF1/CIP1 gene in 10 normal people and in the tumours of 45 osteosarcoma patients was examined using polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) with silver staining. PCR products with an abnormal strand were sequenced directly. p21WAF1/CIP1 mRNA and P21 protein in the 45 cases of osteosarcoma were investigated using in situ hybridization and immunohistochemistry, respectively. Results: P21 protein was detected in 17.78% (8/45) of osteosarcomas, and p21WAF1/CIP1 mRNA expression in 42.22% (19/45). DNA sequencing of the amplified products showed that, in exon 3 of the p21WAF1/CIP1 gene, 17 of 36 osteosarcoma cases (47.22%) carried a C→T change at position 609; sequence analysis of DNA from 10 normal blood samples found a C→T change at the same position in 8 cases (80.00%). Conclusions: As malignancy increases, the expression of p21WAF1/CIP1 mRNA and P21 protein in osteosarcoma tends to decrease. Mutation of the p21WAF1/CIP1 gene is uncommon in human osteosarcoma. As a result, the possible existence of tumour subtypes of p21WAF1/CIP1 gene mutation should be investigated. Our research locates a p21WAF1/CIP1 gene polymorphism in Chinese osteosarcoma patients, which can provide a basis for further research.

  9. Complete nucleotide sequence and gene rearrangement of the mitochondrial genome of Occidozyga martensii

    En Li; Xiaoqiang Li; Xiaobing Wu; Ge Feng; Man Zhang; Haitao Shi; Lijun Wang; Jianping Jiang

    2014-12-01

    In this study, the complete nucleotide sequence (18,321 bp) of the mitochondrial (mt) genome of the round-tongued floating frog, Occidozyga martensii, was determined. Although the base composition and codon usage of O. martensii conformed to the typical vertebrate patterns, this mt genome contained 23 tRNAs (owing to a tandem duplication of the tRNA-Met gene). The LTPF tRNA-gene cluster, and the derived position of the ND5 gene downstream of the control region, were present in this mitogenome. Moreover, we found that in the WANCY tRNA-gene cluster, the tRNA-Asn gene was located between the tRNA-Tyr and COI genes instead of between the tRNA-Ala and tRNA-Cys genes, which is a novel mtDNA gene rearrangement in vertebrates. Based on the concatenated nucleotide sequences of the 13 protein-coding genes, phylogenetic analysis (BI, ML, MP) was performed to further clarify the phylogenetic relations of this species within anurans.

  10. Re-annotation of genome microbial CoDing-Sequences: finding new genes and inaccurately annotated genes

    Danchin Antoine

    2002-02-01

    Background: Analysis of any newly sequenced bacterial genome starts with the identification of protein-coding genes. Despite the accumulation of multiple complete genome sequences, which provide useful comparisons with close relatives among other organisms during the annotation process, accurate gene prediction remains quite difficult. A major reason for this situation is that genes are tightly packed in prokaryotes, resulting in frequent overlap. Thus, detection of translation initiation sites and/or selection of the correct coding regions remain difficult unless appropriate biological knowledge (about the structure of a gene) is embedded in the approach. Results: We have developed a new program that automatically identifies biologically significant candidate genes in a bacterial genome. Twenty-six complete prokaryotic genomes were analyzed using this tool, and the accuracy of gene finding was assessed by comparison with existing annotations. This analysis revealed that, despite the enormous effort of genome program annotators, a small but not negligible number of genes annotated within the framework of sequencing projects are likely to be partially inaccurate or plainly wrong. Moreover, the analysis of several putative new genes shows that, as expected, many short genes have escaped annotation. In most cases, these new genes revealed frameshifts that could be either artifacts or genuine frameshifts. Some entirely unexpected new genes have also been identified. This allowed us to get a more complete picture of prokaryotic genomes. The results of this procedure are progressively integrated into the SWISS-PROT reference databank. Conclusions: The results described in the present study show that our procedure is very satisfactory in terms of gene finding accuracy. Except in a few cases, discrepancies between our results and annotations provided by individual authors can be accounted for by the nature of each annotation process or by specific