WorldWideScience

Sample records for 28s gene sequences

  1. Fungal community structure in disease suppressive soils assessed by 28S LSU gene sequencing.

    Penton, C Ryan; Gupta, V V S R; Tiedje, James M; Neate, Stephen M; Ophel-Keller, Kathy; Gillings, Michael; Harvey, Paul; Pham, Amanda; Roget, David K

    2014-01-01

    Natural biological suppression of soil-borne diseases is a function of the activity and composition of soil microbial communities. Soil microbe and phytopathogen interactions can occur prior to crop sowing and/or in the rhizosphere, subsequently influencing both plant growth and productivity. Research on suppressive microbial communities has concentrated on bacteria although fungi can also influence soil-borne disease. Fungi were analyzed in co-located soils 'suppressive' or 'non-suppressive' for disease caused by Rhizoctonia solani AG 8 at two sites in South Australia using 454 pyrosequencing targeting the fungal 28S LSU rRNA gene. DNA was extracted from a minimum of 125 g of soil per replicate to reduce the micro-scale community variability, and from soil samples taken at sowing and from the rhizosphere at 7 weeks to cover the peak Rhizoctonia infection period. A total of ~994,000 reads were classified into 917 genera covering 54% of the RDP Fungal Classifier database, a high diversity for an alkaline, low organic matter soil. Statistical analyses and community ordinations revealed significant differences in fungal community composition between suppressive and non-suppressive soil and between soil type/location. The majority of differences associated with suppressive soils were attributed to fewer than 40 genera including a number of endophytic species with plant pathogen suppression potential and mycoparasites such as Xylaria spp. Non-suppressive soils were dominated by Alternaria, Gibberella and Penicillium. Pyrosequencing generated a detailed description of fungal community structure and identified candidate taxa that may influence pathogen-plant interactions in stable disease suppression. PMID:24699870

  2. Fungal community structure in disease suppressive soils assessed by 28S LSU gene sequencing.

    C Ryan Penton

    Natural biological suppression of soil-borne diseases is a function of the activity and composition of soil microbial communities. Soil microbe and phytopathogen interactions can occur prior to crop sowing and/or in the rhizosphere, subsequently influencing both plant growth and productivity. Research on suppressive microbial communities has concentrated on bacteria although fungi can also influence soil-borne disease. Fungi were analyzed in co-located soils 'suppressive' or 'non-suppressive' for disease caused by Rhizoctonia solani AG 8 at two sites in South Australia using 454 pyrosequencing targeting the fungal 28S LSU rRNA gene. DNA was extracted from a minimum of 125 g of soil per replicate to reduce the micro-scale community variability, and from soil samples taken at sowing and from the rhizosphere at 7 weeks to cover the peak Rhizoctonia infection period. A total of ~994,000 reads were classified into 917 genera covering 54% of the RDP Fungal Classifier database, a high diversity for an alkaline, low organic matter soil. Statistical analyses and community ordinations revealed significant differences in fungal community composition between suppressive and non-suppressive soil and between soil type/location. The majority of differences associated with suppressive soils were attributed to fewer than 40 genera including a number of endophytic species with plant pathogen suppression potential and mycoparasites such as Xylaria spp. Non-suppressive soils were dominated by Alternaria, Gibberella and Penicillium. Pyrosequencing generated a detailed description of fungal community structure and identified candidate taxa that may influence pathogen-plant interactions in stable disease suppression.

  3. Phylogenetic Relationships of the Marine Haplosclerida (Phylum Porifera) Employing Ribosomal (28S rRNA) and Mitochondrial (cox1, nad1) Gene Sequence Data

    Redmond, Niamh E.; Jean Raleigh; Van Soest, Rob W.M.; Michelle Kelly; Travers, Simon A A; Brian Bradshaw; Salla Vartia; Kelly M Stephens; McCormack, Grace P.

    2011-01-01

    The systematics of the poriferan Order Haplosclerida (Class Demospongiae) has been under scrutiny for a number of years without resolution. Molecular data suggests that the order needs revision at all taxonomic levels. Here, we provide a comprehensive view of the phylogenetic relationships of the marine Haplosclerida using many species from across the order, and three gene regions. Gene trees generated using 28S rRNA, nad1 and cox1 gene data, under maximum likelihood and Bayesian approaches, ...

  4. A combination of morphology and 28S rRNA gene sequences provide grouping and ranking criteria to merge eight into three Ambispora species (Ambisporaceae, Glomeromycota).

    Bills, Robert J; Morton, Joseph B

    2015-08-01

    Ambispora, the only genus in Ambisporaceae and one of three deeply rooted families in Archaeosporales, Glomeromycetes, is amended. Analysis of the morphology of specimens from types and living cultures and 28S ribosomal DNA (rDNA; LSU) sequences resulted in two major changes that redefined Ambispora to include only species with the potential for spore dimorphism (acaulosporoid and glomoid). First, species described as producing only glomoid spores (Ambispora leptoticha, Ambispora fecundispora, and Ambispora callosa), only acaulosporoid spores (Ambispora jimgerdemannii), or both spore morphotypes (Ambispora appendicula) were synonymized with a redefined dimorphic species, A. leptoticha. LSU sequences and more conserved SSU gene data indicated little divergence between genotypes formerly classified as separate species. Second, Ambispora fennica was synonymized with Ambispora gerdemannii based on morphological and LSU sequence variation equivalent to that measured in the sister clade A. leptoticha. With this analysis, Ambispora was reduced to three species: A. leptoticha, A. gerdemannii, and Ambispora granatensis. Morphological and molecular characters were given equal treatment in this study, as each data set informed and clarified grouping and ranking decisions. The two inner layers of the acaulosporoid spore wall were the only structural characters uniquely defining each of these three species; all other characters were shared. Phenotypes of glomoid spores were indistinguishable between species, and thus were informative only at the genus level. Distinct subclade structure of the LSU gene tree suggests fixation of discrete variants typical of clonal reproduction and possible retention of polymorphisms in rDNA repeats, so that not all discrete genetic variants are indicative of speciation. PMID:25638691

  5. Phylogenetic relationships of the marine Haplosclerida (Phylum Porifera) employing ribosomal (28S rRNA) and mitochondrial (cox1, nad1) gene sequence data.

    Redmond, Niamh E; Raleigh, Jean; van Soest, Rob W M; Kelly, Michelle; Travers, Simon A A; Bradshaw, Brian; Vartia, Salla; Stephens, Kelly M; McCormack, Grace P

    2011-01-01

    The systematics of the poriferan Order Haplosclerida (Class Demospongiae) has been under scrutiny for a number of years without resolution. Molecular data suggest that the order needs revision at all taxonomic levels. Here, we provide a comprehensive view of the phylogenetic relationships of the marine Haplosclerida using many species from across the order, and three gene regions. Gene trees generated using 28S rRNA, nad1 and cox1 gene data, under maximum likelihood and Bayesian approaches, are highly congruent and suggest the presence of four clades. Clade A is comprised primarily of species of Haliclona and Callyspongia; Clade B is comprised of H. simulans and H. vansoesti (Family Chalinidae), Amphimedon queenslandica (Family Niphatidae) and Tabulocalyx (Family Phloeodictyidae); Clade C is comprised primarily of members of the Families Petrosiidae and Niphatidae; and Clade D is comprised of Aka species. The polyphyletic nature of the suborders, families and genera described in other studies is also found here. PMID:21931685

  6. Higher-level phylogeny of the Therevidae (Diptera: Insecta) based on 28S ribosomal and elongation factor-1 alpha gene sequences.

    Yang, L; Wiegmann, B M; Yeates, D K; Irwin, M E

    2000-06-01

    Therevidae (stiletto flies) are a little-known family of asiloid brachyceran Diptera (Insecta). Separate and combined phylogenetic analyses of 1200 bases of the 28S ribosomal DNA and 1100 bases of elongation factor-1alpha were used to infer phylogenetic relationships within the family. The position of the enigmatic taxon Apsilocephala Kröber is evaluated in light of the molecular evidence. In all analyses, molecular data strongly support the monophyly of Therevidae, excluding Apsilocephala, and the division of Therevidae into two main clades corresponding to a previous classification of the family into the subfamilies Phycinae and Therevinae. Despite strong support for some relationships within these groups, relationships at the base of the two main clades are weakly supported. Short branch lengths for Australasian clades at the base of the Therevinae may represent a rapid radiation of therevids in Australia. PMID:10860652

  7. Intragenomic sequence variation at the ITS1 - ITS2 region and at the 18S and 28S nuclear ribosomal DNA genes of the New Zealand mud snail, Potamopyrgus antipodarum (Hydrobiidae: Mollusca)

    Hoy, Marshal S.; Rodriguez, Rusty J.

    2013-01-01

    Molecular genetic analysis was conducted on two populations of the invasive non-native New Zealand mud snail (Potamopyrgus antipodarum), one from a freshwater ecosystem in Devil's Lake (Oregon, USA) and the other from an ecosystem of higher salinity in the Columbia River estuary (Hammond Harbor, Oregon, USA). To elucidate potential genetic differences between the two populations, three segments of nuclear ribosomal DNA (rDNA), the ITS1-ITS2 regions and the 18S and 28S rDNA genes, were cloned and sequenced. Variant sequences within each individual were found in all three rDNA segments. Folding models were used for secondary structure analysis, and the results indicated that many sequences contained structure-altering polymorphisms, which suggests they could be nonfunctional pseudogenes. In addition, analysis of molecular variance (AMOVA) was used for hierarchical analysis of genetic variance to estimate variation within and among populations and within individuals. AMOVA revealed significant variation in the ITS region between the populations and among clones within individuals, while in the 5.8S rDNA significant variation was revealed among individuals within the two populations. High levels of intragenomic variation were found in the ITS regions, which are known to be highly variable in many organisms. More interestingly, intragenomic variation was also found in the 18S and 28S rDNA, which has rarely been observed in animals and is so far unreported in Mollusca. We postulate that in these P. antipodarum populations the effects of concerted evolution are diminished because not all of the rDNA genes in their polyploid genome need be essential for sustaining cellular function. This could lead to a lessening of selection pressures, allowing mutations to accumulate in some copies, changing them into variant sequences.

  8. Identification of Dermatophyte Species by 28S Ribosomal DNA Sequencing with a Commercial Kit

    Ninet, Béatrice; Jan, Isabelle; Bontems, Olympia; Léchenne, Barbara; Jousson, Olivier; Panizzon, Renato; Lew, Daniel; Monod, Michel

    2003-01-01

    We have shown that dermatophyte species can be easily identified on the basis of a DNA sequence encoding a part of the large-subunit (LSU) rRNA (28S rRNA) by using the MicroSeq D2 LSU rRNA Fungal Sequencing Kit. Two taxa causing distinct dermatophytoses were clearly distinguished among isolates of the Trichophyton mentagrophytes species complex. PMID:12574293

  10. IDENTIFICATION OF THREE FRUIT-ROT FUNGI OF BANANA BY 28S RIBOSOMAL DNA SEQUENCING

    Supriya Sarkar, S Girisham and SM Reddy

    2013-01-01

    The aim of the present investigation was to identify three fruit-rot fungi, Macrophomina phaseolina (Tassi) Goid., Fusarium oxysporum Schlechtend. and Nigrospora oryzae (Berk. and Br.) Petch, isolated from banana fruits [Rasthali (Silk AAB) and Cavendish (AAA) varieties]. Of the different fungal genera isolated, these fungi were responsible for the greatest loss of banana fruits, as they spread rapidly into the fruit pulp and deteriorated the fruits. A fragment of the D2 region of the LSU (large subunit) 28S rDNA gene of the three fungi under study was amplified by PCR. Based on nucleotide homology and phylogenetic analysis, the fungus M. phaseolina was identified as M. phaseolina strain R-4242 sp. (GenBank accession number: FJ415068.1), F. oxysporum as Fusarium sp. QJC-1403 5.8S ribosomal RNA gene sp. (GenBank accession number: EU193176.1) and N. oryzae as N. oryzae NRRL: 54030 sp. (GenBank accession number: GQ328855.1). Nucleic acid sequencing provides a more objective separation of genera and species than conventional techniques. The technique is best used to identify organisms that cannot be identified satisfactorily by their microscopic morphological features. Genetic characterization of the plant pathogens prevalent in an area is necessary for efficient management and increased crop productivity. The data presented may help researchers to understand host-pathogen interactions in banana in detail and to design effective strategies for deploying resistance genes in banana (Musa paradisiaca L.) growing regions in the country and worldwide.

  11. Genetic relationship between Neobenedenia girellae and N. melleni inferred from 28S rRNA sequences

    WANG Jun; ZHANG Wen; SU Yongquan; DING Shaoxiong

    2004-01-01

    Fragments of about 350 bp of the 28S rRNA gene from the closely related monogenean trematodes Neobenedenia girellae and N. melleni were amplified by polymerase chain reaction (PCR) with a pair of specific primers and then sequenced. Comparison of the 28S rRNA sequences of N. girellae and N. melleni, collected from different fish hosts on different farms, showed a single base difference in 337 bp (0.3% genetic difference), that is, a genetic similarity of 99.7%. This is higher than the 99.41% similarity found within the single species N. melleni sampled in different areas (intraspecific divergence of 0.59%). Meanwhile, the interspecific differences between the two Neobenedenia species and three Benedenia species (B. lutjani, B. rohdei and B. seriolae) range from 2.08% to 11.73%. In addition, UPGMA and MP molecular phylogenetic trees were constructed and proved to be consistent with each other. Although the two Neobenedenia species are highly similar in both morphology and genetic diversity, whether they belong to a single species remains unresolved; more of their genes should be investigated, in combination with a systematic and detailed morphological study.
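
    The percentages quoted in this abstract follow from a simple proportion of differing sites (the uncorrected p-distance). The sketch below reproduces that arithmetic. The function name is hypothetical, and the figure of two differing sites for the within-species comparison is an assumption back-calculated from the reported 0.59%; the abstract does not state the count.

```python
def p_distance(differing_sites: int, aligned_sites: int) -> float:
    """Uncorrected p-distance: fraction of aligned sites that differ."""
    if aligned_sites <= 0:
        raise ValueError("aligned_sites must be positive")
    if not 0 <= differing_sites <= aligned_sites:
        raise ValueError("differing_sites must lie between 0 and aligned_sites")
    return differing_sites / aligned_sites


# One differing base in 337 bp, as reported for N. girellae vs N. melleni.
between_species = p_distance(1, 337)

# Assumption (not stated in the abstract): two differing sites in 337 bp,
# which is the count that matches the reported 0.59% within N. melleni.
within_species = p_distance(2, 337)

for label, d in (("between species", between_species),
                 ("within N. melleni", within_species)):
    print(f"{label}: difference {d:.2%}, similarity {1 - d:.2%}")
```

    With only one or two differing sites over such a short fragment, a single substitution shifts the percentage by about 0.3 points, which is one reason the abstract calls for more genes before deciding species status.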

  12. Reconstruction of phylogenetic relationships in dermatomycete genus Trichophyton Malmsten 1848 based on ribosomal internal transcribed spacer region, partial 28S rRNA and beta-tubulin genes sequences.

    Pchelin, Ivan M; Zlatogursky, Vasily V; Rudneva, Mariya V; Chilina, Galina A; Rezaei-Matehkolaei, Ali; Lavnikevich, Dmitry M; Vasilyeva, Natalya V; Taraskina, Anastasia E

    2016-09-01

    Trichophyton spp. are important causative agents of superficial mycoses. The phylogeny of the genus and accurate strain identification, based on the ribosomal ITS region sequencing, are still under development. The present work is aimed at (i) inferring the genus phylogeny from partial ITS, LSU and BT2 sequences (ii) description of ribosomal ITS region polymorphism in 15 strains of Trichophyton interdigitale. We performed DNA sequence-based species identification and phylogenetic analysis on 48 strains belonging to the genus Trichophyton. Phylogenetic relationships were inferred by maximum likelihood and Bayesian methods on concatenated ITS, LSU and BT2 sequences. Ribosomal ITS region polymorphisms were assessed directly on the alignment. By phylogenetic reconstruction, we reveal major anthropophilic and zoophilic species clusters in the genus Trichophyton. We describe several sequences of the ITS region of T. interdigitale, which do not fit in the traditional polymorphism scheme and propose emendations in this scheme for discrimination between ITS sequence types in T. interdigitale. The new polymorphism scheme will allow inclusion of a wider spectrum of isolates while retaining its explanatory power. This scheme was also found to be partially congruent with NTS typing technique. PMID:27071492

  13. Inferring a classification of the Adenophorea (Nematoda) from nucleotide sequences of the D3 expansion segment (26/28S rDNA)

    Litvaitis, M.K.; Bates, J.W.; Hope, W. D.; Moens, T.

    2000-01-01

    Nucleotide sequences of the D3 expansion segment of the 28S rDNA gene were used to reconstruct evolutionary relationships within the Adenophorea. Neighbor-joining and parsimony analyses of representatives of most major taxa revealed a paraphyletic Adenophorea (p = 0.0005). Within Adenophorea, the Enoplia, Enoplida, and Enoplina were paraphyletic (p = 0.0024, 0.0014, and 0.0120, respectively). A major division was evident within the Enoplida, with one lineage consisting of a basal Thoracostomo...

  14. Evolutionary History of the Chaetognaths Inferred from Actin and 18S-28S rRNA Paralogous Genes

    J.P. Casanova

    2006-01-01

    The chaetognaths constitute a small and enigmatic phylum of marine invertebrates whose phylogenetic affinities remain uncertain. Our phylogenetic analyses of partial paralogous 18S-28S rRNA genes suggest that the event that produced the two classes of rRNA genes occurred approximately 300-400 million years ago, prior to the radiation of extant chaetognaths, whereas the taxon itself, according to both molecular and paleontological data, dates from at least the Early Cambrian. These divergent rRNA genes could be the result of a whole ribosomal cluster duplication or of an allopolyploid event during a crisis period, since fossils are lacking after the Carboniferous period (ca. 300 million years ago). In addition, the actin phylogeny showed that the cytoplasmic chaetognath actin clustered with the cytoplasmic insect actins, while the muscular chaetognath actins are placed basal to all muscular vertebrate actins. The present study suggests that gene conversion mechanisms could be inefficient in this taxon; this could explain the conservation of extremely divergent paralogous sequences in chaetognath genomes, which could in turn be correlated with the difficulty of identifying a sister group for chaetognaths among metazoans.

  15. DISCRIMINATION 28S RIBOSOMAL GENE OF TREMATODE CERCARIAE IN SNAILS FROM CHIANG MAI PROVINCE, THAILAND.

    Wongsawad, Chalobol; Wongsawad, Pheravut; Sukontason, Kom; Phalee, Anawat; Noikong-Phalee, Waraporn; Chai, Jong Yil

    2016-03-01

    Trematode cercariae are commonly found in many freshwater gastropods. These cercariae can serve to identify the occurrence of such trematodes as Centrocestus formosanus, Haplorchis taichui, Haplorchoides sp, and Stellantchasmus falcatus, which are important parasites in Chiang Mai Province, Thailand. As the species of these cercariae cannot be identified accurately based on morphology, this study employed sequencing of a fragment of 28S ribosomal DNA and phylogenetic analysis to identify the trematode cercariae found in freshwater gastropods in Chiang Mai Province. Eight types of trematode cercariae were identified, namely, distome cercaria (grouped with Philophthalmus spp clade), echinostome cercaria (grouped with Echinostoma spp clade), furcocercous cercaria (grouped with Posthodiplostomum sp/Alaria taxideae/Hysteromorpha triloba clade), monostome cercaria (grouped with Catatropis indicus clade), parapleurolophocercous cercaria (grouped with Haplorchoides sp clade), pleurolophocercous cercaria (grouped with Centrocestus formosanus clade), transversotrema cercaria (grouped with Transversotrema spp clade), and xiphidiocercaria (grouped with Prosthodendrium spp clade). These results provide important information that can be used for identifying these parasites in epidemiological surveys. PMID:27244956

  16. Phylogenetic Relationships of Tribes Within Harpalinae (Coleoptera: Carabidae) as Inferred from 28S Ribosomal DNA and the Wingless Gene

    Ober, Karen A; Maddison, David R.

    2008-01-01

    Harpalinae is a large, monophyletic subfamily of carabid ground beetles containing more than 19,000 species in approximately 40 tribes. The higher level phylogenetic relationships within harpalines were investigated based on nucleotide data from two nuclear genes, wingless and 28S rDNA. Phylogenetic analyses of combined data indicate that many harpaline tribes are monophyletic; however, the reconstructed trees showed little support for deeper nodes. In addition, our results suggest that the Le...

  17. D2 Region of the 28S RNA Gene: A Too-Conserved Fragment for Inferences on Phylogeny of South American Triatomines.

    Guerra, Ana Letícia; Alevi, Kaio Cesar Chaboli; Banho, Cecília Artico; de Oliveira, Jader; da Rosa, João Aristeu; Vilela de Azeredo-Oliveira, Maria Tercília

    2016-09-01

    The brasiliensis complex is composed of five triatomine species, and different approaches suggest that Triatoma lenti and Triatoma petrochiae may be new members. Therefore, this study sought to analyze the phylogenetic relationships within this complex by means of the D2 region of the 28S RNA gene, and to analyze the degree of polymorphism and phylogenetic significance of this gene for South American triatomines. Phylogenetic analysis using sequence fragments of the D2 domain did not allow phylogenetic inferences on species within the brasiliensis complex, because the gene alignment, a matrix of 37 specimens, exhibited only two variable sites along the 567 base pairs used. Furthermore, when all South American species were included, only four variable sites were detected, reflecting the high degree of conservation of this gene. Therefore, we do not recommend the use of this gene for phylogenetic reconstruction for this group of Chagas disease vectors. PMID:27382073

  18. Phylogenetic analysis of ten species of five genera of Buccinidae from the Chinese coast based on 28S rRNA gene

    DONG Chang-Yong; Hou, Lin; Sui, Na; Zhang, Yun; WANG Ming-Chang; Li, Yan

    2008-01-01

    It has been reported that there are 31 species in 13 genera of the family Buccinidae distributed along the Chinese coast, but their taxonomic status is still controversial. In the present paper, phylogenetic relationships among ten species in five genera of Buccinidae from the Liaoning, Shandong and Fujian coast and ten species in five genera from the Chinese coast were examined using partial large ribosomal subunit 28S rRNA sequences. An approximately 1400 bp fragment of the 28S rRNA gene was o...

  19. Comparative Analysis of 18S and 28S rDNA Sequences of Schistosoma japonicum from Mainland China, the Philippines and Japan

    G.H. Zhao

    2011-01-01

    In the present study, portions of the 18S and 28S ribosomal DNA (rDNA) sequences of 35 Schistosoma japonicum isolates representing three geographical strains from mainland China, the Philippines and Japan were amplified and compared. Phylogenetic relationships were reconstructed by the Unweighted Pair-Group Method with Arithmetic averages (UPGMA) using the combined 18S and 28S rDNA sequences together with the corresponding sequences of other Schistosoma species available in public databases. The partial 18S and 28S rDNA sequences of all S. japonicum isolates were 745 and 618 bp long, respectively, and displayed low genetic variation among S. japonicum strains and isolates. Phylogenetic analysis revealed that the combined 18S and 28S rDNA sequences could not distinguish S. japonicum isolates from the three geographical origins but provided an effective molecular marker for inter-species phylogenetic analysis and differential identification of Schistosoma species.
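
    UPGMA, the tree-building method named in this abstract, repeatedly merges the two closest clusters and averages their distances to every other cluster, weighted by cluster size. The following is a minimal sketch of that procedure, not the authors' pipeline. The function name, the taxon labels A to D and the distance values are all hypothetical toy inputs and are not taken from the study.

```python
from itertools import combinations


def upgma(names, pair_distances):
    """Build a UPGMA tree.

    names: list of leaf labels.
    pair_distances: dict mapping a (label, label) tuple to a distance.
    Returns (tree, merge_heights), where tree is a nested tuple of labels
    and merge_heights lists the node height of each merge in order.
    """
    dist = {frozenset(pair): d for pair, d in pair_distances.items()}
    sizes = {name: 1 for name in names}  # cluster -> number of leaves
    merge_heights = []
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: dist[frozenset(p)])
        merge_heights.append(dist[frozenset((a, b))] / 2)
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        merged = (a, b)
        # Size-weighted average distance from the new cluster to the rest.
        for other in sizes:
            dist[frozenset((merged, other))] = (
                size_a * dist[frozenset((a, other))]
                + size_b * dist[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    (tree,) = sizes
    return tree, merge_heights


# Hypothetical toy distances (illustration only, not data from the abstract).
toy = {("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
       ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10}
tree, heights = upgma(["A", "B", "C", "D"], toy)
print(tree, heights)
```

    UPGMA assumes a roughly constant rate of change across lineages, so it suits closely related sequences with low variation better than deeply divergent ones.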

  20. Phylogenetic Relationships of Two Earth Tiger Tarantulas, Haplopelma lividum and H. longipes (Araneae, Theraphosidae), within the Infraorder Mygalomorph Using 28S Ribosomal DNA Sequences

    Arin Ngamniyom

    2014-01-01

    Haplopelma lividum and H. longipes (Araneae: Mygalomorphae: Theraphosidae) are tarantulas that are distributed throughout Southeast Asia and are important carnivorous predators in ecological systems. The present study aimed to examine the phylogenetic relationships between Mygalomorph spiders using 28S ribosomal DNA sequences. The molecular results supported the placement of both species within a common theraphosid taxon. However, when considering relationships between Haplopelma spp. and related genera, H. schmidti, H. lividum and H. longipes were not monophyletic, suggesting that molecular data are incongruent with phylogenies based on morphological characteristics. These results provide molecular data to help elucidate the phylogenetic relationships between theraphosid tarantulas.

  1. Cloning and application of 28S rRNA gene fragment of Trichinella spiralis in taxonomy

    李成; 魏颖; 袁金钱; 宋铭忻

    2011-01-01

    To investigate the classification of a Trichinella swine isolate from Heilongjiang Province, a gene fragment of the ribosomal 28S rRNA was cloned by PCR and sequenced. Sequence analysis showed that the Heilongjiang swine isolate was closely related to, and belonged to, Trichinella spiralis (T1). The result was largely consistent with the traditional classification and provides a new basis for traditional taxonomy.

  2. Fungal Community Structure in Disease Suppressive Soils Assessed by 28S LSU Gene Sequencing

    Penton, C. Ryan; Gupta, V.V.S.R.; Tiedje, James M.; Neate, Stephen M.; Ophel-Keller, Kathy; Gillings, Michael; Harvey, Paul; Pham, Amanda; Roget, David K.

    2014-01-01

    Natural biological suppression of soil-borne diseases is a function of the activity and composition of soil microbial communities. Soil microbe and phytopathogen interactions can occur prior to crop sowing and/or in the rhizosphere, subsequently influencing both plant growth and productivity. Research on suppressive microbial communities has concentrated on bacteria although fungi can also influence soil-borne disease. Fungi were analyzed in co-located soils ‘suppressive’ or ‘non-suppressive’...

  3. Phylogenetic analysis of the spider mite sub-family Tetranychinae (Acari: Tetranychidae) based on the mitochondrial COI gene and the 18S and the 5' end of the 28S rRNA genes indicates that several genera are polyphyletic.

    Tomoko Matsuda

    The spider mite sub-family Tetranychinae includes many agricultural pests. The internal transcribed spacer (ITS) region of nuclear ribosomal RNA genes and the cytochrome c oxidase subunit I (COI) gene of mitochondrial DNA have been used for species identification and phylogenetic reconstruction within the sub-family Tetranychinae, although they have not always been successful. The 18S and 28S rRNA genes should be more suitable for resolving higher levels of phylogeny, such as tribes or genera of Tetranychinae, because these genes evolve more slowly and are made up of conserved regions and divergent domains. Therefore, we used both the 18S (1,825-1,901 bp) and 28S (the 5' end, 646-743 bp) rRNA genes to infer phylogenetic relationships within the sub-family Tetranychinae with a focus on the tribe Tetranychini. Then, we compared the phylogenetic tree of the 18S and 28S genes with that of the mitochondrial COI gene (618 bp). As observed in previous studies, our phylogeny based on the COI gene was not resolved because of the low bootstrap values for most nodes of the tree. On the other hand, our phylogenetic tree of the 18S and 28S genes revealed several well-supported clades within the sub-family Tetranychinae. The 18S and 28S phylogenetic trees suggest that the tribes Bryobiini, Petrobiini and Eurytetranychini are monophyletic and that the tribe Tetranychini is polyphyletic. At the genus level, six genera for which more than two species were sampled appear to be monophyletic, while four genera (Oligonychus, Tetranychus, Schizotetranychus and Eotetranychus) appear to be polyphyletic. The topology presented here does not fully agree with the current morphology-based taxonomy, so the diagnostic morphological characters of Tetranychinae need to be reconsidered.

  4. Inhibition of deoxyribonucleic acid transcription by ultraviolet irradiation in mammalian cells: determination of the transcriptional linkage of the 18S and 28S ribosomal ribonucleic acid genes

    The inhibition of deoxyribonucleic acid (DNA) transcription in mammalian cells by ultraviolet irradiation has been studied. The reductions in the rates and amounts of total ribonucleic acid (RNA) synthesis and of 18S, 28S, and 45S ribosomal RNA (rRNA) synthesis in tissue-cultured mouse L cells were examined as functions of ultraviolet dose and time after ultraviolet irradiation. Total RNA synthesis in the ultraviolet-irradiated L cell was found to decrease as a function of ultraviolet dose. The rates of synthesis for the 18S and 28S rRNAs and the 45S precursor RNA decreased exponentially with ultraviolet dose; the respective D37 values were 310 erg/mm², 130 erg/mm², and 90 erg/mm². Ultraviolet inactivation kinetics of rRNA synthesis in HeLa cells indicated that, as in L cells, each 45S rRNA transcriptional unit has its own promoter, and that the 18S rRNA cistron is promoter proximal and the 28S rRNA cistron is promoter distal. All of the above findings support the hypothesis that irradiation of mammalian cells with ultraviolet light causes the formation of lesions on the DNA templates which result in premature termination of transcription. (U.S.)
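
    The D37 values in this abstract can be read through a simple exponential model, in which the synthesis rate relative to unirradiated cells falls to 1/e (about 37%) at a dose equal to D37. The sketch below assumes that single-exponential form, rate(D) = exp(-D / D37), and uses hypothetical function names; the abstract reports only that the decrease was exponential, not the fitted model itself.

```python
import math

# D37 values from the abstract, in erg/mm^2.
D37_ERG_PER_MM2 = {"18S rRNA": 310.0, "28S rRNA": 130.0, "45S precursor": 90.0}


def relative_synthesis_rate(dose: float, d37: float) -> float:
    """Synthesis rate relative to unirradiated cells, assuming exp(-dose/D37)."""
    return math.exp(-dose / d37)


def relative_sensitivity(d37: float, reference_d37: float) -> float:
    """How many times more UV-sensitive a species is than the reference."""
    return reference_d37 / d37


dose = 130.0  # example dose in erg/mm^2, chosen only for illustration
for species, d37 in D37_ERG_PER_MM2.items():
    rate = relative_synthesis_rate(dose, d37)
    sens = relative_sensitivity(d37, D37_ERG_PER_MM2["18S rRNA"])
    print(f"{species}: rate at {dose:g} erg/mm^2 = {rate:.2f}, "
          f"sensitivity vs 18S = {sens:.2f}x")
```

    Under this reading, 28S rRNA synthesis is roughly 2.4 times more UV-sensitive than 18S rRNA synthesis, which fits the abstract's conclusion that the 28S cistron lies farther from the promoter, so that more template must be traversed without a terminating lesion.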

  5. Basal divergence of Eriophyoidea (Acariformes, Eupodina) inferred from combined partial COI and 28S gene sequences and CLSM genital anatomy.

    Chetverikov, P E; Cvrković, T; Makunin, A; Sukhareva, S; Vidović, B; Petanović, R

    2015-10-01

    Eriophyoids are an ancient group of highly miniaturized, morphologically simplified and diverse phytoparasitic mites. Their possible numerous host-switch events have been accompanied by considerable homoplastic evolution. Although several morphological cladistic and molecular phylogenetic studies attempted to reconstruct phylogeny of Eriophyoidea, the major lineages of eriophyoids, as well as the evolutionary relationships between them, are still poorly understood. New phylogenetically informative data have been provided by the recent discovery of the early derivative pentasetacine genus Loboquintus, and observations on the eriophyoid reproductive anatomy. Herein, we use COI and D1-2 rRNA data of 73 eriophyoid species (including early derivative pentasetacines) from Europe, the Americas and South Africa to reconstruct part of the phylogeny of the superfamily, and infer on the basal divergence of eriophyoid taxa. In addition, a comparative CLSM study of the female internal genitalia was undertaken in order to find putative apomorphies, which can be used to improve the taxonomy of Eriophyoidea. The following molecular clades, marked by differences in genital anatomy and prodorsal shield setation, were found in our analyses: Loboquintus(Pentasetacus((Eriophyidae + Diptilomiopidae)(Phytoptidae-1, Phytoptidae-2))). The results of this study suggest that the superfamily Eriophyoidea comprises basal paraphyletic pentasetacines (Loboquintus and Pentasetacus), and two large monophyletic groups: Eriophyidae s.l. [containing paraphyletic Eriophyidae sensu Amrine et al. 2003 (=Eriophyidae s.str.) and Diptilomiopidae sensu Amrine et al. 2003] and Phytoptidae s.l. [containing monophyletic Phytoptidae sensu Boczek et al. 1989 (=Phytoptidae s.str.) and Nalepellidae sensu Boczek et al. 1989]. Putative morphological apomorphies (including genital and gnathosomal characters) supporting the clades revealed in molecular analyses are briefly discussed. PMID:26126634

  6. Repetitive sequence environment distinguishes housekeeping genes

    Eller, C. Daniel; Regelson, Moira; Merriman, Barry; Nelson, Stan; Horvath, Steve; Marahrens, York

    2006-01-01

    Housekeeping genes are expressed across a wide variety of tissues. Since repetitive sequences have been reported to influence the expression of individual genes, we employed a novel approach to determine whether housekeeping genes can be distinguished from tissue-specific genes by their repetitive sequence context. We show that Alu elements are more highly concentrated around housekeeping genes while various longer (>400-bp) repetitive sequences ("repeats"), including Long Interspersed Nuclear E...

  7. Phylogenetic analysis of three species of Encarsia (Hymenoptera: Aphelinidae) parasitizing Bemisia tabaci (Hemiptera: Aleyrodidae) in China based on their 28S rRNA gene

    薛夏; 彭伟录; Muhammad Z. AHMED; Nasser S. MANDOUR; 任顺祥; Andrew G. S. CUTHBERTSON; 邱宝利

    2012-01-01

    Encarsia Förster consists of important parasitoids of whitefly (Bemisia tabaci) pests, including E. bimaculata, E. formosa and E. sophia, the three most important aphelinid parasitoids in China. Eight populations of Encarsia from the South, Southeast, North and Southwest of China, as well as two populations from Malaysia and Egypt, respectively, were collected in the present study, and their interspecies phylogenetic relationships were analyzed based on the 28S rRNA D2 and D3 expansion regions. The D2 and D3 regions were consistent with each other and confirmed a closer genetic relationship between E. sophia and E. bimaculata, since they both belong to the Encarsia strenua species group, than between either of these two species and E. formosa. Results of the genetic distance analysis using 28S rRNA D2 sequences revealed that there are certain genetic divergences within single species of the Encarsia parasitoids. The Guangzhou population of Encarsia sophia is closer to populations from Australia, Spain, Egypt and Ethiopia, but further from the population from Thailand. E. bimaculata populations from Sudan, Egypt and Guatemala as well as one population from Australia cluster together, while the E. formosa Hengshui and Kunming populations cluster together with those from the USA, UK and Greece, but are further from the Egypt population. The reasons for the inconsistency between the genetic and geographical distances of the Encarsia species are discussed.

  8. cis sequence effects on gene expression

    Jacobs Kevin

    2007-08-01

    Background: Sequence and transcriptional variability within and between individuals are typically studied independently. The joint analysis of sequence and gene expression variation (genetical genomics) provides insight into the role of linked sequence variation in the regulation of gene expression. We investigated the role of sequence variation in cis on gene expression (cis sequence effects) in a group of genes commonly studied in cancer research in lymphoblastoid cell lines. We estimated the proportion of genes exhibiting cis sequence effects and the proportion of gene expression variation explained by cis sequence effects using three different analytical approaches, and compared our results to the literature. Results: We generated gene expression profiling data at N = 697 candidate genes from N = 30 lymphoblastoid cell lines for this study and used available candidate gene resequencing data at N = 552 candidate genes to identify N = 30 candidate genes with sufficient variance in both datasets for the investigation of cis sequence effects. We used two additive models and the haplotype phylogeny scanning approach of Templeton (Tree Scanning) to evaluate association between individual SNPs, all SNPs at a gene, and diplotypes, with log-transformed gene expression. SNPs and diplotypes at eight candidate genes exhibited statistically significant (p cis sequence effects in our study, respectively. Conclusion: Based on analysis of our results and the extant literature, one in four genes exhibits significant cis sequence effects, and for these genes, about 30% of gene expression variation is accounted for by cis sequence variation. Despite diverse experimental approaches, the presence or absence of significant cis sequence effects is largely supported by previously published studies.

  9. Nucleotide sequence of Klebsiella pneumoniae lac genes.

    Buvinger, W E; Riley, M

    1985-01-01

    The nucleotide sequences of the Klebsiella pneumoniae lacI and lacZ genes and part of the lacY gene were determined, and these genes were located and oriented relative to one another. The K. pneumoniae lac operon is divergent in that the lacI and lacZ genes are oriented head to head, and complementary strands are transcribed. Besides base substitutions, the lacZ genes of K. pneumoniae and Escherichia coli have suffered short distance shifts of reading frame caused by additions or deletions or...

  10. Network of tRNA Gene Sequences

    WEI Fang-ping; LI Sheng; MA Hong-ru

    2008-01-01

    A network of 3719 tRNA gene sequences was constructed using simplest alignment. Its topology, degree distribution and clustering coefficient were studied. The behaviors of the network shift from fluctuated distribution to scale-free distribution when the similarity degree of the tRNA gene sequences increases. The tRNA gene sequences with the same anticodon identity are more self-organized than those with different anticodon identities and form local clusters in the network. Some vertices of the local cluster have a high connection with other local clusters, and the probable reason was given. Moreover, a network constructed by the same number of random tRNA sequences was used to make comparisons. The relationships between the properties of the tRNA similarity network and the characters of tRNA evolutionary history were discussed.
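
    The network construction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the position-by-position `similarity` function stands in for their "simplest alignment", and the function names, the 0.8 threshold and the toy sequences are all assumptions made for the example.

```python
from itertools import combinations


def similarity(a, b):
    """Fraction of matching positions, compared position by position (no gaps)."""
    length = max(len(a), len(b))
    matches = sum(x == y for x, y in zip(a, b))
    return matches / length if length else 0.0


def build_network(seqs, threshold):
    """Link two sequences when their similarity reaches the threshold."""
    graph = {name: set() for name in seqs}
    for u, v in combinations(seqs, 2):
        if similarity(seqs[u], seqs[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def clustering_coefficient(graph, node):
    """Share of a node's neighbour pairs that are themselves linked."""
    nbrs = graph[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))


# Toy sequences invented for illustration; they are not real tRNA genes.
toy = {
    "a": "GCATTGGTAG",
    "b": "GCATTGGTAC",
    "c": "GCATTGGAAG",
    "d": "TTTTCCCCAA",
}
net = build_network(toy, threshold=0.8)
degrees = {name: len(nbrs) for name, nbrs in net.items()}
```

    Raising `threshold` removes weaker links, which is the knob the abstract varies when it reports the shift in degree distribution.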

  11. Ab initio gene identification in metagenomic sequences.

    Zhu, Wenhan; Lomsadze, Alexandre; Borodovsky, Mark

    2010-07-01

    We describe an algorithm for gene identification in DNA sequences derived from shotgun sequencing of microbial communities. Accurate ab initio gene prediction in a short nucleotide sequence of anonymous origin is hampered by uncertainty in model parameters. While several machine learning approaches could be proposed to bypass this difficulty, one effective method is to estimate parameters from dependencies, formed in evolution, between frequencies of oligonucleotides in protein-coding regions and genome nucleotide composition. The original version of the method was proposed in 1999 and has been used since for (i) reconstructing the codon frequency vector needed for gene finding in viral genomes and (ii) initializing parameters of self-training gene finding algorithms. With the advent of new prokaryotic genomes en masse it became possible to enhance the original approach by using direct polynomial and logistic approximations of oligonucleotide frequencies, as well as by separating models for bacteria and archaea. These advances have increased the accuracy of model reconstruction and, subsequently, gene prediction. We describe the refined method and assess its accuracy on known prokaryotic genomes split into short sequences. Also, we show that as a result of application of the new method, several thousands of new genes could be added to existing annotations of several human and mouse gut metagenomes. PMID:20403810
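
    The idea of predicting an oligonucleotide frequency from nucleotide composition can be illustrated with a straight-line least-squares fit, the simplest polynomial case. This sketch is not the published algorithm: the training points are synthetic, and `gc_content`, `fit_line` and the example fragment are hypothetical names and values chosen for the illustration.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic training points (assumed): genome GC content versus the
# frequency of one codon in that genome's protein-coding regions.
train_gc = [0.30, 0.40, 0.50, 0.60, 0.70]
train_freq = [0.010, 0.015, 0.020, 0.025, 0.030]
slope, intercept = fit_line(train_gc, train_freq)

# Estimate the codon frequency for an anonymous short fragment from its GC content.
fragment = "ATGGCGCCGATTACGGCCGC"
predicted = intercept + slope * gc_content(fragment)
```

    Repeating the fit for every codon would give a full frequency vector for a fragment of unknown origin; higher-order polynomial or logistic fits would replace `fit_line`.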

  12. DNA sequence of the yeast transketolase gene.

    Fletcher, T S; Kwee, I L; Nakada, T; Largman, C; Martin, B M

    1992-02-18

    Transketolase (EC 2.2.1.1) is the enzyme that, together with aldolase, forms a reversible link between the glycolytic and pentose phosphate pathways. We have cloned and sequenced the transketolase gene from yeast (Saccharomyces cerevisiae). This is the first transketolase gene of the pentose phosphate shunt to be sequenced from any source. The molecular mass of the proposed translated protein is 73,976 daltons, in good agreement with the observed molecular mass of about 75,000 daltons. The 5'-nontranslated region of the gene is similar to other yeast genes. There is no evidence of 5'-splice junctions or branch points in the sequence. The 3'-nontranslated region contains the polyadenylation signal (AATAAA), 80 base pairs downstream from the termination codon. A high degree of homology is found between yeast transketolase and dihydroxyacetone synthase (formaldehyde transketolase) from the yeast Hansenula polymorpha. The overall sequence identity between these two proteins is 37%, with four regions of much greater similarity. The regions from amino acid residues 98-131, 157-182, 410-433, and 474-489 have sequence identities of 74%, 66%, 83%, and 82%, respectively. One of these regions (157-182) includes a possible thiamin pyrophosphate (TPP) binding domain, and another (410-433) may contain the catalytic domain. PMID:1737042
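
    Locating a polyadenylation signal relative to the termination codon, as reported above, amounts to a substring search. The sketch below assumes the input starts at the first base after the stop codon; the function name and the synthetic sequence are illustrative and are not taken from the yeast gene.

```python
def polya_signal_offsets(three_prime_region, signal="AATAAA"):
    """Return 0-based offsets of every occurrence of the signal.

    The input is assumed to start at the first base after the stop codon,
    so an offset equals the number of bases between the stop codon and
    the start of the signal.
    """
    region = three_prime_region.upper()
    offsets = []
    start = region.find(signal)
    while start != -1:
        offsets.append(start)
        start = region.find(signal, start + 1)
    return offsets


# Synthetic 3' region: 80 filler bases, then the signal, then a short tail.
toy_region = "GC" * 40 + "AATAAA" + "GCGC"
hits = polya_signal_offsets(toy_region)
```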

  13. Sequencing and Gene Expression Analysis of Leishmania tropica LACK Gene.

    Nour Hammoudeh

    2014-12-01

    Leishmania Homologue of receptors for Activated C Kinase (LACK) antigen is a 36-kDa protein that provokes a very early immune response against Leishmania infection. There are several reports on the expression of LACK through different life-cycle stages of genus Leishmania, but only a few of them have focused on L. tropica. The present study provides details of the cloning, DNA sequencing and gene expression of LACK in this parasite species. First, several local isolates of Leishmania parasites were typed in our laboratory using PCR to verify the parasite species. After that, the LACK gene was amplified and cloned into a vector for sequencing. Finally, the expression of this molecule in logarithmic and stationary growth phase promastigotes, as well as in amastigotes, was evaluated by Reverse Transcription-PCR (RT-PCR). The typing result confirmed that all our local isolates belong to L. tropica. The LACK gene sequence was determined and high similarity was observed with the sequences of other Leishmania species. Furthermore, the expression of the LACK gene in both promastigote and amastigote forms was confirmed. Overall, the data set the stage for future studies of the properties and immune role of LACK gene products.

  14. The nucleotide sequences of two leghemoglobin genes from soybean

    Wiborg, O; Hyldig-Nielsen, J J; Jensen, E O; Paludan, K; Marcker, K A

    1982-01-01

    We present the complete nucleotide sequences of two leghemoglobin genes isolated from soybean DNA. Both genes contain three intervening sequences in identical positions. Comparison of the coding sequences with known amino-acid sequences of soybean leghemoglobins suggest that the two genes...

  15. Cloning and sequencing genes related to preeclampsia

    SHI Juan-zi; LIU Yan-fang; YAO Yuan-qing; YAN Wei; ZHU Feng; ZHAO Zhong-liang

    2001-01-01

    To clone genes specifically expressed in the placenta of patients with preeclampsia, and to explain the mechanism in the etiopathology of preeclampsia. Methods: The placentae of preeclamptic and normotensive pregnant subjects were used as models, a cDNA library was constructed, and 20 differentially expressed fragments were cloned after a new version of PCR-based subtractive hybridization. The false positive clones were identified by reverse dot blot analysis. With one of the obtained genes taken as the probe, the placentas of 10 normal pregnant women and 10 preeclamptic patients were studied by dot hybridization. Results: Six false positive clones were identified by reverse dot blot, and the remaining 14 clones were identified as preeclampsia-related genes. These clones were sequenced and analyzed with the BLAST analysis system. Eleven of the 14 clones were genes already known, among which one belongs to the necdin family; the remaining 3 were identified as novel genes. These 3 genes were acknowledged by GenBank, with the accession numbers AF232216, AF232217, AF233648. The results of dot hybridization using the necdin gene as probe were as follows: (1) This mRNA was present in the placental tissues of normal pregnancy as well as in those of preeclampsia. (2) The intensity of transcription of this mRNA in the placental tissues of preeclampsia increased significantly compared with that of normal pregnancy (P<0.05). Conclusions: This study reports for the first time this group of genes, especially the necdin-expressing gene, which are related to the etiopathology of preeclampsia. In addition, overtranscription of the necdin gene has been found in preeclampsia. This is helpful for further studies of the etiology of preeclampsia.

  16. Preliminary phylogeny of the thrips parasitoids of Turkey based on some morphological scales and 28S D2 rDNA, with description of a new species

    DOĞANLAR, Oğuzhan; Doğanlar, Mikdat; Frary, Anne

    2010-01-01

    Species of the Ceranisus thrips-attacking genus are difficult to distinguish morphologically. The phylogenetic relationships within the Ceranisus species were explored using nucleotide sequences of the 28S D2 expansion region of the rDNA gene. Bayesian, maximum likelihood, and parsimony inference methods were employed to construct the phylogenetic relationships. Principal component analysis on the Turkish species of Ceranisus, namely antalyacus, menes, bozovaensis, hirsutus, planitianus (a ne...

  17. Isolation and nucleotide sequence of the gene encoding human rhodopsin.

    Nathans, J; Hogness, D S

    1984-01-01

    We have isolated and completely sequenced the gene encoding human rhodopsin. The coding region of the human rhodopsin gene is interrupted by four introns, which are located at positions analogous to those found in the previously characterized bovine rhodopsin gene. The amino acid sequence of human rhodopsin, deduced from the nucleotide sequence of its gene, is 348 residues long and is 93.4% homologous to that of bovine rhodopsin. Interestingly, those portions of the polypeptide chain predicte...

  18. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes because more and more gene expression and genomic sequence data become available. Knowledge on the specific cis-sequences, their enrichment and arrangement within promoters, facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter illustrates an example for the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied for the identification of cis-sequences in any sets of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and how to install and run a software package to identify overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771

  19. Fungal community analysis in the deep-sea sediments of the Pacific Ocean assessed by comparison of ITS, 18S and 28S ribosomal DNA regions

    Xu, Wei; Luo, Zhu-Hua; Guo, Shuangshuang; Pang, Ka-Lai

    2016-03-01

    We investigated the diversity of fungal communities in 6 different deep-sea sediment samples of the Pacific Ocean based on three different types of clone libraries, including internal transcribed spacer (ITS), 18S rDNA, and 28S rDNA regions. A total of 1978 clones were generated from 18 environmental clone libraries, resulting in 140 fungal operational taxonomic units (OTUs), including 18 OTUs from ITS, 44 OTUs from 18S rDNA, and 78 OTUs from 28S rDNA gene primer sets. The majority of the recovered sequences belonged to diverse phylotypes of the Ascomycota and Basidiomycota. Additionally, our study revealed a total of 46 novel fungal phylotypes, which showed low similarities (<97%) with available fungal sequences in the GenBank, including a novel Zygomycete lineage, suggesting possible new fungal taxa occurring in the deep-sea sediments. The results suggested that 28S rDNA is an efficient target gene to describe fungal community in deep-sea environment.
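
    The novelty criterion mentioned above (similarity below 97% to known sequences) can be sketched as a best-hit comparison. This is a simplified illustration that assumes pre-aligned, equal-length sequences; a real workflow would rely on database searches against GenBank. The function names and the synthetic references are assumptions.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_hit(query, references):
    """Return (name, identity) for the reference closest to the query."""
    hits = ((name, identity(query, ref)) for name, ref in references.items())
    return max(hits, key=lambda hit: hit[1])


def is_novel(query, references, cutoff=0.97):
    """Flag a sequence whose best match falls below the similarity cutoff."""
    return best_hit(query, references)[1] < cutoff


# Synthetic 100-base sequences (invented; not data from the study).
seed = "ACGT" * 25
references = {"ref1": seed, "ref2": "TGCA" * 25}
one_mismatch = "T" + seed[1:]      # 99% identical to ref1
divergent = "TTTTT" + seed[5:]     # 96% identical to ref1
```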

  20. Sequencing genes in silico using single nucleotide polymorphisms

    Zhang Xinyi

    2012-01-01

    Background: The advent of high throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs) discovered from genetic association studies. Currently, direct sequencing of candidate genes or regions on a large number of subjects remains both cost- and time-prohibitive. Results: To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS), which predicts phased sequences of intragenic regions, using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles) at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (range 79%-100%). This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC) Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes. Conclusions: Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate

  1. A Probabilistic Genome-Wide Gene Reading Frame Sequence Model

    Have, Christian Theil; Mørk, Søren

    We introduce a new type of probabilistic sequence model that models the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome-level -- effectively producing a sequential genome annotation as...... output. The model can be used to obtain the most probable genome annotation based on a combination of i: a gene finder score of each gene candidate and ii: the sequence of the reading frames of gene candidates through a genome. The model --- as well as a higher order variant --- is developed and tested...... using the probabilistic logic programming language and machine learning system PRISM - a fast and efficient model prototyping environment, using bacterial gene finding performance as a benchmark of signal strength. The model is used to prune a set of gene predictions from an underlying gene finder and...

  2. Nucleotide sequence of the triosephosphate isomerase gene from Macaca mulatta

    Old, S.E.; Mohrenweiser, H.W. (Univ. of Michigan, Ann Arbor (USA))

    1988-09-26

    The triosephosphate isomerase gene from a rhesus monkey, Macaca mulatta, charon 34 library was sequenced. The human and chimpanzee enzymes differ from the rhesus enzyme at ASN 20 and GLU 198. The nucleotide sequence identity between rhesus and human is 97% in the coding region and >94% in the flanking regions. Comparison of the rhesus and chimp genes, including the intron and flanking sequences, does not suggest a mechanism for generating the two TPI peptides of proliferating cells from hominoids and a single peptide from the rhesus gene.

  3. Identification of sequence variants in genetic disease-causing genes using targeted next-generation sequencing.

    Xiaoming Wei

    BACKGROUND: Identification of gene variants plays an important role in research on and diagnosis of genetic diseases. A combination of enrichment of targeted genes and next-generation sequencing (targeted DNA-HiSeq) results in both high efficiency and low cost for targeted sequencing of genes of interest. METHODOLOGY/PRINCIPAL FINDINGS: To identify mutations associated with genetic diseases, we designed an array-based gene chip to capture all of the exons of 193 genes involved in 103 genetic diseases. To evaluate this technology, we selected 7 samples from seven patients with six different genetic diseases resulting from six disease-causing genes and 100 samples from normal human adults as controls. The data obtained showed that on average, 99.14% of 3,382 exons with more than 30-fold coverage were successfully detected using Targeted DNA-HiSeq technology, and we found six known variants in four disease-causing genes and two novel mutations in two other disease-causing genes (the STS gene for XLI and the FBN1 gene for MFS), as well as one exon deletion mutation in the DMD gene. These results were confirmed in their entirety using either the Sanger sequencing method or real-time PCR. CONCLUSIONS/SIGNIFICANCE: Targeted DNA-HiSeq combines next-generation sequencing with the capture of sequences from a relevant subset of high-interest genes. This method was tested by capturing sequences from a DNA library through hybridization to oligonucleotide probes specific for genetic disorder-related genes and was found to show high selectivity, improve the detection of mutations, enable the discovery of novel variants, and provide additional indel data. Thus, targeted DNA-HiSeq can be used to analyze the gene variant profiles of monogenic diseases with high sensitivity, fidelity, throughput and speed.

  4. Comparison of methods for genomic localization of gene trap sequences

    Ferrin Thomas E

    2006-09-01

    Background: Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. Results: In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion: The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.

  5. Degenerative primer design and gene sequencing validation for select turkey genes.

    Hutsko, Stephanie L; Lilburn, Michael S; Wick, Macdonald

    2016-06-01

    We successfully designed and validated degenerative primers for turkey genes MUC2, RPS13, TBP and TFF2 based on chicken sequences in order to use gene transcription analysis to evaluate (quantify) mucin transcription in response to probiotic supplementation in turkeys. Primers were designed for the genes MUC2, TFF2, RPS13 and TBP using a degenerative primer design method based on the available Gallus gallus sequences. All primer sets, which produced a single PCR amplicon of the expected sizes, were cloned into the TOPO® vector and then transformed into TOP10® competent cells. Plasmid DNA isolation was performed on the TOP10® cell culture and sent for sequencing. Sequences were analyzed using NCBI BLAST. All genes sequenced had over 90% homology with both the chicken and predicted turkey sequences. The sequences were used to design new 100% homologous primer sets for the genes of interest. PMID:27053625
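
    The abstract does not spell out its primer design method. One common way to express a primer that tolerates sequence variation is to collapse aligned sequences into IUPAC ambiguity codes, sketched below under that assumption. The function names and the three toy sequences are hypothetical.

```python
# IUPAC nucleotide ambiguity codes, keyed by the set of bases each code stands for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

# Number of bases behind each code, used to compute primer degeneracy.
CODE_SIZE = {code: len(bases) for bases, code in IUPAC.items()}


def degenerate_consensus(aligned):
    """Collapse equal-length aligned sequences into one IUPAC-coded primer."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("need at least one sequence, all of the same length")
    return "".join(IUPAC[frozenset(column)] for column in zip(*aligned))


def degeneracy(primer):
    """Count how many distinct unambiguous sequences the primer represents."""
    total = 1
    for code in primer:
        total *= CODE_SIZE[code]
    return total


# Toy aligned primer-site sequences (invented for illustration).
aligned_sites = ["ATGCCA", "ATGCTA", "ACGCCA"]
primer = degenerate_consensus(aligned_sites)
```

    Keeping `degeneracy(primer)` low is a usual design goal, since each ambiguous position dilutes the fraction of primer molecules that match a given template exactly.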

  6. A silent composite hemoglobinopathy characterized by gene sequencing.

    Zorai, A; Moumni, I; Benmansour, I; Chaouachi, D; Ghanem, A; Abbes, S

    2011-01-01

    We report the case of a 35-year-old Tunisian woman with a chronic anemia that had gone uninvestigated for a long time. Laboratory analysis using DNA sequencing revealed a compound heterozygote for Hb O Arab and codon 39 β0-thalassemia. It is the first time that such a genotype has been characterized by gene sequencing. PMID:23461145

  7. Mechanism of Gene Amplification via Yeast Autonomously Replicating Sequences

    Shelly Sehgal

    2015-01-01

    The present investigation was aimed at understanding the molecular mechanism of gene amplification. Interplay of fragile sites in promoting gene amplification was also elucidated. The amplification promoting sequences (APS) were chosen from the Saccharomyces cerevisiae ARS, 5S rRNA regions of Plantago ovata and P. lagopus, proposed sites of replication pausing at the Ste20 gene locus of S. cerevisiae, and the bent DNA sequences within fragile site FRA11A in humans. The gene amplification assays showed that plasmids bearing APS from yeast and humans led to enhanced protein concentration as compared to the wild type. Both the in silico and in vitro analyses pointed to the strong bending potential of these APS. In addition, high mitotic stability and the presence of TTTT repeats and SAR amongst these sequences encourage gene amplification. Phylogenetic analysis of S. cerevisiae ARS was also conducted. The combinatorial power of different aspects of APS analyzed in the present investigation was harnessed to reach a consensus about the factors which stimulate gene expression in the presence of these sequences. It was concluded that the mechanism of gene amplification was that AT-rich tracts present in fragile sites of yeast serve as binding sites for MAR/SAR and DNA unwinding elements. The DNA-protein interactions necessary for ORC activation are facilitated by DNA bending. These specific bindings at ORC promote repeated rounds of DNA replication leading to gene amplification.

  8. Nucleotide sequence of a human tRNA gene heterocluster

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both [3'-32P]-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these λ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.
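
    The termination-site criterion quoted above (at least four consecutive T residues) maps directly onto a regular-expression search. The sketch below is illustrative only; the function name and the flanking sequence are invented and are not taken from the sequenced clone.

```python
import re


def t_runs(seq, min_len=4):
    """Return (start, length) for every run of at least min_len consecutive T residues."""
    pattern = re.compile("T{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]


# Invented 3'-flanking sequence containing runs of five, three and four T residues.
flank = "GACCTTTTTGCATTTAGTTTTCC"
candidates = t_runs(flank)
```

    Only the runs of five and four T residues are reported; the run of three falls below the threshold.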

  9. Mechanism of gene amplification via yeast autonomously replicating sequences.

    Sehgal, Shelly; Kaul, Sanjana; Dhar, M K

    2015-01-01

    The present investigation was aimed at understanding the molecular mechanism of gene amplification. Interplay of fragile sites in promoting gene amplification was also elucidated. The amplification promoting sequences (APS) were chosen from the Saccharomyces cerevisiae ARS, 5S rRNA regions of Plantago ovata and P. lagopus, proposed sites of replication pausing at the Ste20 gene locus of S. cerevisiae, and the bent DNA sequences within fragile site FRA11A in humans. The gene amplification assays showed that plasmids bearing APS from yeast and humans led to enhanced protein concentration as compared to the wild type. Both the in silico and in vitro analyses pointed to the strong bending potential of these APS. In addition, high mitotic stability and the presence of TTTT repeats and SAR amongst these sequences encourage gene amplification. Phylogenetic analysis of S. cerevisiae ARS was also conducted. The combinatorial power of different aspects of APS analyzed in the present investigation was harnessed to reach a consensus about the factors which stimulate gene expression in the presence of these sequences. It was concluded that the mechanism of gene amplification was that AT-rich tracts present in fragile sites of yeast serve as binding sites for MAR/SAR and DNA unwinding elements. The DNA-protein interactions necessary for ORC activation are facilitated by DNA bending. These specific bindings at ORC promote repeated rounds of DNA replication leading to gene amplification. PMID:25685838

  10. Biased distribution of DNA uptake sequences towards genome maintenance genes

    Davidsen, T.; Rodland, E.A.; Lagesen, K.; Seeberg, E.; Rognes, Torbjørn; Tonjum, T.

    2004-01-01

    coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group in these...

  11. Cloning and sequencing of the gene for human β-casein

    Loennerdal, B.; Bergstroem, S.; Andersson, Y.; Hialmarsson, K.; Sundgyist, A.; Hernell, O. (Univ. of California, Davis (United States))

    1990-02-26

    Human β-casein is a major protein in human milk. This protein is part of the casein micelle and has been suggested to have several physiological functions in the newborn. Since there is limited information on β-casein and the factors that affect its concentration in human milk, the authors have isolated and sequenced the gene for this protein. A human mammary gland cDNA library (Clontech) in λgt11 was screened by plaque hybridization using a 42-mer synthetic 32P-labelled oligonucleotide. Positive clones were identified and isolated, DNA was prepared and the gene isolated by cleavage with EcoRI. Following subcloning (pUC18), restriction mapping and Southern blotting, DNA for sequencing was prepared. The gene was sequenced by the dideoxy method. Human β-casein has 212 amino acids and the amino acid sequence deduced from the nucleotide sequence is 91% identical to the published sequence for human β-casein show a high degree of conservation at the leader peptide and the highly phosphorylated sequences, but also deletions and divergence at several positions. These results provide insight into the structure of the human β-casein gene and will facilitate studies on factors affecting its expression.

  12. Microsatellite Instability Use in Mismatch Repair Gene Sequence Variant Classification

    Bryony A. Thompson

    2015-03-01

    Inherited mutations in the DNA mismatch repair (MMR) genes can cause MMR deficiency and increased susceptibility to colorectal and endometrial cancer. Microsatellite instability (MSI) is the defining molecular signature of MMR deficiency. The clinical classification of identified MMR gene sequence variants has a direct impact on the management of patients and their families. For a significant proportion of cases, sequence variants of uncertain clinical significance (also known as unclassified variants) are identified, constituting a challenge for genetic counselling and clinical management of families. The effect of these variants on protein function is difficult to interpret. The presence or absence of MSI in tumours can aid in determining the pathogenicity of associated unclassified MMR gene variants. However, there are some considerations that need to be taken into account when using MSI for variant interpretation. The use of MSI and other tumour characteristics in MMR gene sequence variant classification will be explored in this review.

  13. SxtA gene sequence analysis of dinoflagellate Alexandrium minutum

    Norshaha, Safida Anira; Latib, Norhidayu Abdul; Usup, Gires; Yusof, Nurul Yuziana Mohd

    2015-09-01

    The dinoflagellate Alexandrium minutum is known for producing potent neurotoxins such as saxitoxin, which affect the health of human seafood consumers via paralytic shellfish poisoning (PSP). This phenomenon is related to harmful algal blooms (HABs), which are believed to be influenced by environmental and nutritional factors. A previous study revealed that the SxtA gene is the starting gene of the saxitoxin production pathway. The aim of this study was to analyse the sequence of the sxtA gene in A. minutum. The dinoflagellate culture was grown at 26°C with a 16:8-hour light:dark photocycle. After the samples were harvested, RNA was extracted, and complementary DNA (cDNA) was synthesised and amplified by polymerase chain reaction (PCR). The PCR products were then purified and cloned before being sequenced. The SxtA sequence obtained was then analyzed in order to identify the presence of the SxtA gene in Alexandrium minutum.

  14. Sequence Variability in Staphylococcal Enterotoxin Genes seb, sec, and sed

    Sophia Johler

    2016-06-01

    Ingestion of staphylococcal enterotoxins preformed by Staphylococcus aureus in food leads to staphylococcal food poisoning, the most prevalent foodborne intoxication worldwide. There are five major staphylococcal enterotoxins: SEA, SEB, SEC, SED, and SEE. While variants of these toxins have been described and were linked to specific hosts or levels of enterotoxin production, data on sequence variation are still limited. In this study, we aim to extend the knowledge on promoter and gene variants of the major enterotoxins SEB, SEC, and SED. To this end, we determined seb, sec, and sed promoter and gene sequences of a well-characterized set of enterotoxigenic Staphylococcus aureus strains originating from foodborne outbreaks, human infections, human nasal colonization, rabbits, and cattle. New nucleotide sequence variants were detected for all three enterotoxins, and a novel amino acid sequence variant of SED was detected in a strain associated with human nasal colonization. While the seb promoter and gene sequences exhibited a high degree of variability, the sec and sed promoter and gene sequences were more conserved. Interestingly, a truncated variant of sed was detected in all tested sed-harboring rabbit strains. The generated data represent a further step towards improved understanding of strain-specific differences in enterotoxin expression and host-specific variation in enterotoxin sequences.

  15. Multiple gene sequence analysis using genes of the bacterial DNA repair pathway

    Miguel Rotelok Neto

    2015-06-01

    The ability to recognize and repair abnormal DNA structures is common to all forms of life. Physiological studies and genomic sequencing of a variety of bacterial species have identified an incredible diversity of DNA repair pathways. Despite the number of genes available in public databases, the usual method of placing genomes in a taxonomic context is based mainly on the 16S rRNA or housekeeping genes. Thus, the relationships among genomes remain poorly understood. In this work, an approach of multiple gene sequence analysis based on genes of the DNA repair pathway was used to compare bacterial genomes. Housekeeping and DNA repair genes were searched in 872 completely sequenced bacterial genomes. Seven DNA repair and housekeeping genes from distinct metabolic pathways were selected, aligned, edited and concatenated head-to-tail to form a super-gene. Results showed that the multiple gene sequence analysis using DNA repair genes had better resolution at class level than the housekeeping genes. Like the housekeeping genes, the DNA repair genes were advantageous for separating bacterial groups at low taxonomic levels and were also sensitive to genes derived from horizontal transfer.
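
    The head-to-tail concatenation step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the genome labels, and the toy sequences are all hypothetical, and the choice to keep only genomes present in every gene alignment is an assumption.

```python
from typing import Dict, List


def concatenate_alignments(gene_alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-gene alignments head-to-tail into one 'super-gene' per genome.

    Each element maps a genome label to its aligned sequence for one gene.
    Assumption: only genomes present in every gene alignment are kept.
    """
    for alignment in gene_alignments:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) > 1:
            raise ValueError("sequences within one gene alignment must have equal length")

    shared = set(gene_alignments[0])
    for alignment in gene_alignments[1:]:
        shared &= set(alignment)

    return {
        genome: "".join(alignment[genome] for alignment in gene_alignments)
        for genome in sorted(shared)
    }


# Hypothetical toy alignments for two genes (not data from the study).
gene_one = {"genome_A": "ATG-CA", "genome_B": "ATGGCA", "genome_C": "ATGACA"}
gene_two = {"genome_A": "TTGA", "genome_B": "TTCA"}

supergene = concatenate_alignments([gene_one, gene_two])
print(supergene)
```

    In this toy case genome_C is dropped because it lacks the second gene; a real analysis might instead pad missing genes with gap characters.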

  16. PHYLOGENETIC ANALYSIS OF THE SUBCLASS PTERIOMORPHIA (BIVALVIA) BASED ON PARTIAL 28S rRNA SEQUENCE

    薛东秀; 王海艳; 张涛; 张素萍; 徐凤山

    2012-01-01

    The phylogenetic relationships among 11 superfamilies of the subclass Pteriomorphia (Bivalvia) were reconstructed based on partial sequences of the nuclear 28S ribosomal DNA retrieved from GenBank. Unambiguously aligned sequences (1252 bp) of 80 species were subjected to partitioned maximum likelihood and Bayesian analyses. Sequence analysis showed that there were 359 variable sites, accounting for 28.67% of all sites, and 300 parsimony-informative sites, accounting for 23.96% of all sites. The average A+T content was 41.6%, clearly lower than G+C, showing that the base composition was biased in favor of G+C. The genetic distances among species within superfamilies ranged from 0.01 to 0.14, which were clearly smaller than those among superfamilies. The resultant molecular phylogeny was compared with previously published phylogenetic hypotheses inferred from morphological characteristics and other molecular analyses. The molecular phylogenetic analyses strongly supported the monophyly of Pteriomorphia, which is congruent with previous results based on morphological characters. The resulting trees clearly indicated that the 11 superfamilies were divided into three clades: clade I included Pterioidea, Ostreoidea, and Pinnoidea; clade II included Arcoidea, Limopsoidea, and Mytiloidea; and clade III included Pectinoidea, Anomioidea, Dimyoidea, Plicatuloidea, and Limoidea. Based on the results of the present study and information compiled from other classification systems, a revised classification of the extant superfamilies of Pteriomorphia is presented.
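
    The site counts quoted above follow standard definitions: a variable site has at least two different bases in its column, and a parsimony-informative site has at least two bases that each occur in at least two sequences. The sketch below counts both on a hypothetical toy alignment and re-checks the reported percentages; ignoring gaps and ambiguous characters is an assumption of this sketch, not a statement about the authors' software.

```python
from collections import Counter
from typing import List, Tuple


def site_statistics(alignment: List[str]) -> Tuple[int, int]:
    """Return (variable sites, parsimony-informative sites) for an alignment.

    Assumption: only A, C, G and T are counted; gaps and ambiguity codes are ignored.
    """
    variable = 0
    informative = 0
    for column in zip(*alignment):
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) >= 2:
            variable += 1
            # Informative: at least two bases, each seen in at least two sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


# Hypothetical toy alignment (four sequences, seven columns).
toy_alignment = [
    "ACGTACA",
    "ACGTATA",
    "ATGTCTA",
    "ATGTCCG",
]
variable_sites, informative_sites = site_statistics(toy_alignment)
print(variable_sites, informative_sites)

# Re-deriving the percentages reported in the abstract from its own counts.
percent_variable = round(100 * 359 / 1252, 2)
percent_informative = round(100 * 300 / 1252, 2)
print(percent_variable, percent_informative)
```

    The last column of the toy alignment is variable but not informative, because the differing base occurs in only one sequence.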

  17. The nucleotide sequence of the bacteriophage T5 ltf gene.

    Kaliman, A V; Kulshin, V E; Shlyapnikov, M G; Ksenzenko, V N; Kryukov, V M

    1995-06-01

    The nucleotide sequence of the bacteriophage T5 BglII-BamHI fragment (4,835 bp in length), known to carry a gene encoding the LTF protein that forms the phage L-shaped tail fibers, was determined. It was shown to contain an open reading frame for 1,396 amino acid residues that corresponds to a protein of 147.8 kDa. The coding region of the ltf gene is preceded by a typical Shine-Dalgarno sequence. Downstream from the ltf gene there is a strong transcription terminator. Data bank analysis of the LTF protein sequence reveals 55.1% identity to the hypothetical protein ORF 401 of bacteriophage lambda over an overlap of 118 amino acids. PMID:7789514

  18. Diverse nucleotide compositions and sequence fluctuation in Rubisco protein genes

    Holden, Todd; Dehipawala, S.; Cheung, E.; Bienaime, R.; Ye, J.; Tremberger, G., Jr.; Schneider, P.; Lieberman, D.; Cheung, T.

    2011-10-01

    The Rubisco protein-enzyme is arguably the most abundant protein on Earth. The biology dogma of transcription and translation necessitates the study of the Rubisco genes and Rubisco-like genes in various species. Stronger correlation of fractal dimension of the atomic number fluctuation along a DNA sequence with Shannon entropy has been observed in the studied Rubisco-like gene sequences, suggesting a more diverse evolutionary pressure and constraints in the Rubisco sequences. The strategy of using metal for structural stabilization appears to be an ancient mechanism, with data from the porphobilinogen deaminase gene in Capsaspora owczarzaki and Monosiga brevicollis. Using the chi-square distance probability, our analysis supports the conjecture that the more ancient Rubisco-like sequence in Microcystis aeruginosa would have experienced very different evolutionary pressure and biochemical constraint as compared to Bordetella bronchiseptica, the two microbes occupying either end of the correlation graph. Our exploratory study would indicate that a high fractal dimension Rubisco sequence would support a high carbon dioxide rate via the Michaelis-Menten coefficient, with implications for the control of the whooping cough pathogen Bordetella bronchiseptica, a microbe containing a high fractal dimension Rubisco-like sequence (2.07). Using the internal comparison of chi-square distance probability for 16S rRNA (~ E-22) versus the radiation repair Rec-A gene (~ E-05) in the high GC content Deinococcus radiodurans, our analysis supports the conjecture that high GC content microbes containing Rubisco-like sequences are likely to include an extra-terrestrial origin, relative to Deinococcus radiodurans. A similar photosynthesis process that could utilize host star radiation would not compete with the radiation-resistant process from the biology dogma perspective in environments such as Mars and exoplanets.

  19. A human gut microbial gene catalogue established by metagenomic sequencing

    dos Santos, Marcelo Bertalan Quintanilha; Sicheritz-Pontén, Thomas; Nielsen, Henrik Bjørn;

    2010-01-01

    To understand the impact of gut microbes on human health and well-being it is crucial to assess their genetic potential. Here we describe the Illumina-based metagenomic sequencing, assembly and characterization of 3.3 million non-redundant microbial genes, derived from 576.7 gigabases of sequence...... minimal gut metagenome and the minimal gut bacterial genome in terms of functions present in all individuals and most bacteria, respectively....

  1. Topology of genes and nontranscribed sequences in human interphase nuclei

    Knowledge about the functional impact of the topological organization of DNA sequences within interphase chromosome territories is still sparse. Of the few analyzed single copy genomic DNA sequences, the majority had been found to localize preferentially at the chromosome periphery or to loop out from chromosome territories. By means of dual-color fluorescence in situ hybridization (FISH), immunolabeling, confocal microscopy, and three-dimensional (3D) image analysis, we analyzed the intraterritorial and nuclear localization of 10 genomic fragments of different sequence classes in four different human cell types. The localization of three muscle-specific genes FLNA, NEB, and TTN, the oncogene BCL2, the tumor suppressor gene MADH4, and five putatively nontranscribed genomic sequences was predominantly in the periphery of the respective chromosome territories, independent from transcriptional status and from GC content. In interphase nuclei, the noncoding sequences were only rarely found associated with heterochromatic sites marked by the satellite III DNA D1Z1 or clusters of mammalian heterochromatin proteins (HP1α, HP1β, HP1γ). However, the nontranscribed sequences were found predominantly at the nuclear periphery or at the nucleoli, whereas genes tended to localize on chromosome surfaces exposed to the nuclear interior

  2. Illumina MiSeq sequencing disfavours a sequence motif in the GFP reporter gene.

    Van den Hoecke, Silvie; Verhelst, Judith; Saelens, Xavier

    2016-01-01

    Green fluorescent protein (GFP) is one of the most used reporter genes. We have used next-generation sequencing (NGS) to analyse the genetic diversity of a recombinant influenza A virus that expresses GFP and found a remarkable coverage dip in the GFP coding sequence. This coverage dip was present when virus-derived RT-PCR product or the parental plasmid DNA was used as starting material for NGS and regardless of whether Nextera XT transposase or Covaris shearing was used for DNA fragmentation. Therefore, the sequence coverage dip in the GFP coding sequence was not the result of emerging GFP mutant viruses or a bias introduced by Nextera XT fragmentation. Instead, we found that the Illumina MiSeq sequencing method disfavours the 'CCCGCC' motif in the GFP coding sequence. PMID:27193250
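
    To locate a motif such as 'CCCGCC' in a coding sequence before interpreting a coverage dip, a simple scan can be used. The sketch below is illustrative only: the helper names and the toy sequence are hypothetical, and scanning the reverse strand as well is an assumption of this sketch, since the abstract does not say whether the effect is strand-specific.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(sequence: str, motif: str = "CCCGCC") -> List[Tuple[int, str]]:
    """Return sorted (0-based start, strand) pairs for every motif occurrence.

    '+' means the motif itself was found; '-' means its reverse complement
    was found at that position on the given strand.
    """
    sequence = sequence.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = sequence.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = sequence.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Hypothetical toy sequence, not the GFP coding sequence.
toy_sequence = "ATGCCCGCCAAGGCGGGTT"
motif_hits = find_motif(toy_sequence)
print(motif_hits)
```

    The reported positions could then be compared against per-base coverage to see whether dips coincide with the motif.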

  4. A rapid method for sequencing of rRNA gene(s) amplified by polymerase chain reaction using an automated DNA sequencer

    Dwivedi, P.P.; Patel, B.K.C.; Rees, G.N.; Ollivier, Bernard

    1996-01-01

    A method for DNA sequencing of ribosomal RNA (rRNA) genes, amplified by polymerase chain reaction (PCR), using internal primers, designed on the basis of conserved regions of rRNA genes for determining a near complete sequence (99%) of the gene using an automated DNA sequencer (Applied Biosystem Incorporation, USA) is described. The procedure is extremely rapid as cloning of the gene is not required for sequence determination. In addition time consuming steps such as ethanol precipitation and...

  5. Identification of rat genes by TWINSCAN gene prediction, RT-PCR, and direct sequencing

    Wu, Jia Qian; Shteynberg, David; Arumugam, Manimozhiyan;

    2004-01-01

    alternative approach: reverse transcription-polymerase chain reaction (RT-PCR) and direct sequencing based on dual-genome de novo predictions from TWINSCAN. We tested 444 TWINSCAN-predicted rat genes that showed significant homology to known human genes implicated in disease but that were partially or...... single-intron experiment. Spliced sequences were amplified in 46 cases (34%). We conclude that this procedure for elucidating gene structures with native cDNA sequences is cost-effective and will become even more so as it is further optimized....

  6. Automated cleaning and pre-processing of immunoglobulin gene sequences from high-throughput sequencing

    Miri Michaeli

    2012-12-01

    High throughput sequencing (HTS) yields tens of thousands to millions of sequences that require a large amount of pre-processing work to clean various artifacts. Such cleaning cannot be performed manually. Existing programs are not suitable for immunoglobulin (Ig) genes, which are variable and often highly mutated. This paper describes Ig-HTS-Cleaner (Ig High Throughput Sequencing Cleaner), a program containing a simple cleaning procedure that successfully deals with pre-processing of Ig sequences derived from HTS, and Ig-Indel-Identifier (Ig Insertion-Deletion Identifier), a program for identifying legitimate and artifact insertions and/or deletions (indels). Our programs were designed for analyzing Ig gene sequences obtained by 454 sequencing, but they are applicable to all types of sequences and sequencing platforms. Ig-HTS-Cleaner and Ig-Indel-Identifier have been implemented in Java and saved as executable JAR files, supported on Linux and MS Windows. No special requirements are needed in order to run the programs, except for correctly constructing the input files as explained in the text. The programs' performance has been tested and validated on real and simulated data sets.

  7. Gene identification in prokaryotic genomes, phages, metagenomes, and EST sequences with GeneMarkS suite.

    Borodovsky, Mark; Lomsadze, Alex

    2014-01-01

    This unit describes how to use several gene-finding programs from the GeneMark line developed for finding protein-coding ORFs in genomic DNA of prokaryotic species, in genomic DNA of eukaryotic species with intronless genes, in genomes of viruses and phages, and in prokaryotic metagenomic sequences, as well as in EST sequences with spliced-out introns. These bioinformatics tools were demonstrated to have state-of-the-art accuracy, and have been frequently used for gene annotation in novel nucleotide sequences. An additional advantage of these sequence-analysis tools is that the problem of algorithm parameterization is solved automatically, with parameters estimated by iterative self-training (unsupervised training). PMID:24510847

  8. Sequence of the human iduronate 2-sulfatase (IDS) gene

    Wilson, P.J.; Meaney, C.A.; Hopwood, J.J.; Morris, C.P. (Adelaide Children's Hospital, North Adelaide (Australia))

    1993-09-01

    Deficiency of the lysosomal enzyme iduronate-2-sulfatase (IDS; EC 3.1.6.13) results in the storage of the glycosaminoglycans heparan sulfate and dermatan sulfate, which leads to the lysosomal storage disorder mucopolysaccharidosis type II. Three overlapping genomic clones derived from an X-chromosome-specific library containing the entire IDS gene were isolated and the sequences of the intron boundaries and the 5′ promoter region were determined. The IDS gene is split into nine exons spanning approximately 24 kb. The potential promoter for IDS lacks a TATA box but contains GC box consensus sequences, consistent with its role as a housekeeping gene. A polypyrimidine-like repeat is found in intron 1. 9 refs., 1 fig., 1 tab.

  9. Variations in CCL3L gene cluster sequence and non-specific gene copy numbers

    Edberg Jeffrey C

    2010-03-01

    Background: Copy number variations (CNVs) of the gene CC chemokine ligand 3-like1 (CCL3L1) have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings: Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70% to 100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. By use of MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR based CNV assays are not specific for CCL3L1 alone. The four available assays measured median copies of 2 and 3-4 in European and African American individuals, respectively. The concordance between the assays ranged from 0.44 to 0.83, suggesting individual discordant calls and inconsistencies between the assays and the gene coverage expected from the known sequence. Conclusions: This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogeneous results. Sequence information to determine the CNV of the three genes separately would make it possible to test whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.

  10. Cloning and sequence of the human adrenodoxin reductase gene

    Adrenodoxin reductase is a flavoprotein mediating electron transport to all mitochondrial forms of cytochrome P450. The authors cloned the human adrenodoxin reductase gene and characterized it by restriction endonuclease mapping and DNA sequencing. The entire gene is approximately 12 kilobases long and consists of 12 exons. The first exon encodes the first 26 of the 32 amino acids of the signal peptide, and the second exon encodes the remainder of signal peptide and the apparent FAD binding site. The remaining 10 exons are clustered in a region of only 4.3 kilobases, separated from the first two exons by a large intron of about 5.6 kilobases. Two forms of human adrenodoxin reductase mRNA, differing by the presence or absence of 18 bases in the middle of the sequence, arise from alternate splicing at the 5' end of exon 7. This alternately spliced region is directly adjacent to the NADPH binding site, which is entirely contained in exon 6. The immediate 5' flanking region lacks TATA and CAAT boxes; however, this region is rich in G+C and contains six copies of the sequence GGGCGGG, resembling promoter sequences of housekeeping genes. RNase protection experiments show that transcription is initiated from multiple sites in the 5' flanking region, located about 21-91 base pairs upstream from the AUG translational initiation codon

  11. Sequence variations in the FAD2 gene in seeded pumpkins.

    Ge, Y; Chang, Y; Xu, W L; Cui, C S; Qu, S P

    2015-01-01

    Seeded pumpkins are important economic crops; the seeds contain various unsaturated fatty acids, such as oleic acid and linoleic acid, which are crucial for human and animal nutrition. The fatty acid desaturase-2 (FAD2) gene encodes delta-12 desaturase, which converts oleic acid to linoleic acid. However, little is known about sequence variations in FAD2 in seeded pumpkins. Twenty-seven FAD2 clones from 27 accessions of Cucurbita moschata, Cucurbita maxima, Cucurbita pepo, and Cucurbita ficifolia were obtained (1152 bp in total; a single gene without introns). Nucleotide identities of more than 90% were detected among the 27 FAD2 clones. Nucleotide substitution, rather than nucleotide insertion and deletion, led to sequence polymorphism in the 27 FAD2 clones. Furthermore, the 27 selected FAD2 clones all encoded the FAD2 enzyme (delta-12 desaturase), with amino acid sequence identities from 91.7 to 100% for 384 amino acids. The same main functional domain, between amino acids 47 and 329, was identified. The four species clustered separately based on differences in the sequences, as identified using the unweighted pair group method with arithmetic mean. Geographic origin and species were found to be closely related to sequence variation in FAD2. PMID:26782391
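
    Identity figures like those quoted above are typically computed per pair of aligned sequences. The following sketch shows one common definition; the function name and toy sequences are hypothetical, and skipping columns that contain a gap is an assumption of this sketch (other tools count gaps as mismatches, which gives lower values).

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 10-base fragments differing at one position (not FAD2 data).
identity = percent_identity("ATGGCTAAGG", "ATGGCAAAGG")
print(identity)
```

    A matrix of such pairwise values, converted to distances, is the usual input to a clustering method such as UPGMA.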

  12. Informational structure of genetic sequences and nature of gene splicing

    Trifonov, E. N.

    1991-10-01

    Only about 1/20 of the DNA of higher organisms codes for proteins by means of the classical triplet code. The rest of the DNA sequences are largely silent, with unclear functions, if any. The triplet code is not the only code (message) carried by the sequences. There are three levels of molecular communication, where the same sequence "talks" to various biomolecules while having, respectively, three different appearances: DNA, RNA and protein. Since the molecular structures and, hence, the sequence-specific preferences of these are substantially different, the original DNA sequence has to carry simultaneously three types of sequence patterns (codes, messages), thus being a composite structure in which one and the same letter (nucleotide) is frequently involved in several overlapping codes of different nature. This multiplicity and overlapping of the codes is a unique feature of Gnomic, the language of genetic sequences. The coexisting codes have to be degenerate in various degrees to allow an optimal and concerted performance of all the encoded functions. There is an obvious conflict between the best possible performance of a given function and the necessity to compromise the quality of a given sequence pattern in favor of other patterns. It appears that the major role of various changes in the sequences on their "ontogenetic" way from DNA to RNA to protein, like RNA editing and splicing, or protein post-translational modifications, is to resolve such conflicts. New data are presented strongly indicating that gene splicing is such a device to resolve the conflict between the code of DNA folding in chromatin and the triplet code for protein synthesis.

  13. Cloning, sequencing and phylogenic analysis of duck prion gene

    WANG Qigui; ZHANG Lei; HU Xiaoxiang; FAN Baoliang; LI Ning; LI Hui; WU Changxin

    2004-01-01

    The duck prion gene was cloned and sequenced. Similar to mammalian prion protein (PrP), duck prion is encoded by a single exon present as a single copy in the genome, which was confirmed by Southern blot analysis. All of the structural features of mammalian PrP were also identified in the duck PrP. Compared with mammalian PrP, it exhibited an overall similarity of 30%. When compared with chicken PrP, it showed a higher homology of 97%. A phylogenetic tree was constructed to trace the evolution of the prion gene in animals.

  14. Chloroplast gene sequences and the study of plant evolution.

    Clegg, M T

    1993-01-01

    A large body of sequence data has accumulated for the chloroplast-encoded gene ribulose-1,5-biphosphate carboxylase/oxygenase (rbcL) as the result of a cooperative effort involving many laboratories. The data span all seed plants, including most major lineages from the angiosperms, and as such they provide an unprecedented opportunity to study plant evolutionary history. The full analysis of this large data set poses many problems and opportunities for plant evolutionary biologists and for bi...

  15. Identification of Driver Genes in Hepatocellular Carcinoma by Exome Sequencing

    Sean P Cleary; Jeck, William R.; Zhao, Xiaobei; Chen, Kui; Selitsky, Sara R.; Savich, Gleb L.; Tan, Ting-Xu; Wu, Michael C.; Getz, Gad; Lawrence, Michael S.; Joel S Parker; Li, Jinyu; Powers, Scott; Kim, Hyeja; Fischer, Sandra

    2013-01-01

    Genetic alterations in specific driver genes lead to disruption of cellular pathways and are critical events in the instigation and progression of hepatocellular carcinoma. As a prerequisite for individualized cancer treatment, we sought to characterize the landscape of recurrent somatic mutations in hepatocellular carcinoma. We performed whole exome sequencing on 87 hepatocellular carcinomas and matched normal adjacent tissues to an average coverage of 59x. The overall mutation rate was rough...

  16. Detection and sequence analysis of accessory gene regulator genes of Staphylococcus pseudintermedius isolates

    M. Ananda Chitra

    2015-07-01

    Background: Staphylococcus pseudintermedius (SP) is the major pathogenic species of dogs involved in a wide variety of skin and soft tissue infections. The accessory gene regulator (agr) locus of Staphylococcus aureus has been extensively studied, and it influences the expression of many virulence genes. It encodes a two-component signal transduction system that leads to down-regulation of surface proteins and up-regulation of secreted proteins during in vitro growth of S. aureus. The objective of this study was to detect and sequence-analyze the AgrA, B, and D of SP isolated from canine skin infections. Materials and Methods: In this study, we isolated SP from canine pyoderma and otitis cases, identified it by polymerase chain reaction (PCR), and confirmed it by PCR-restriction fragment length polymorphism. Primers for SP agrA and agrBD genes were designed using online primer designing software and BLAST searched for their specificity. Amplification of the agr genes was carried out for 53 isolates of SP by PCR, and sequencing of agrA, B, and D was carried out for five isolates and analyzed using DNAstar and Mega5.2 software. Results: A total of 53 (59%) SP isolates were obtained from 90 samples. 15 isolates (28%) were confirmed to be methicillin-resistant SP (MRSP) with the detection of the mecA gene. Accessory gene regulator A, B, and D genes were detected in all the SP isolates. Complete nucleotide sequences of the above three genes for five isolates were submitted to GenBank, and their accession numbers are from KJ133557 to KJ133571. AgrA amino acid sequence analysis showed that it is mainly made of alpha-helices and is hydrophilic in nature. AgrB is a transmembrane protein, and AgrD encodes the precursor of the autoinducing peptide (AIP). Sequencing of the agrD gene revealed that the 5 canine SP strains tested could be divided into three Agr specificity groups (RIPTSTGFF, KIPTSTGFF, and RIPISTGFF) based on the putative AIP produced by each strain.

  17. Complete MHC haplotype sequencing for common disease gene mapping.

    Stewart, C Andrew; Horton, Roger; Allcock, Richard J N; Ashurst, Jennifer L; Atrazhev, Alexey M; Coggill, Penny; Dunham, Ian; Forbes, Simon; Halls, Karen; Howson, Joanna M M; Humphray, Sean J; Hunt, Sarah; Mungall, Andrew J; Osoegawa, Kazutoyo; Palmer, Sophie; Roberts, Anne N; Rogers, Jane; Sims, Sarah; Wang, Yu; Wilming, Laurens G; Elliott, John F; de Jong, Pieter J; Sawcer, Stephen; Todd, John A; Trowsdale, John; Beck, Stephan

    2004-06-01

    The future systematic mapping of variants that confer susceptibility to common diseases requires the construction of a fully informative polymorphism map. Ideally, every base pair of the genome would be sequenced in many individuals. Here, we report 4.75 Mb of contiguous sequence for each of two common haplotypes of the major histocompatibility complex (MHC), to which susceptibility to >100 diseases has been mapped. The autoimmune disease-associated-haplotypes HLA-A3-B7-Cw7-DR15 and HLA-A1-B8-Cw7-DR3 were sequenced in their entirety through a bacterial artificial chromosome (BAC) cloning strategy using the consanguineous cell lines PGF and COX, respectively. The two sequences were annotated to encompass all described splice variants of expressed genes. We defined the complete variation content of the two haplotypes, revealing >18,000 variations between them. Average SNP densities ranged from less than one SNP per kilobase to >60. Acquisition of complete and accurate sequence data over polymorphic regions such as the MHC from large-insert cloned DNA provides a definitive resource for the construction of informative genetic maps, and avoids the limitation of chromosome regions that are refractory to PCR amplification. PMID:15140828

  18. Angiosperm phylogeny inferred from sequences of four mitochondrial genes

    Yin-Long QIU; Zhi-Duan CHEN; Libo LI; Bin WANG; Jia-Yu XUE; Tory A. HENDRY; Rui-Qi LI; Joseph W. BROWN; Yang LIU; Geordan T. HUDSON

    2010-01-01

    An angiosperm phylogeny was reconstructed in a maximum likelihood analysis of sequences of four mitochondrial genes, atp1, matR, nad5, and rps3, from 380 species that represent 376 genera and 296 families of seed plants. It is largely congruent with the phylogeny of angiosperms reconstructed from chloroplast genes atpB, matK, and rbcL, and nuclear 18S rDNA. The basalmost lineage consists of Amborella and Nymphaeales (including Hydatellaceae). Austrobaileyales follow this clade and are sister to the mesangiosperms, which include Chloranthaceae, Ceratophyllum, magnoliids, monocots, and eudicots. With the exception of Chloranthaceae being sister to Ceratophyllum, relationships among these five lineages are not well supported. In eudicots, Ranunculales, Sabiales, Proteales, Trochodendrales, Buxales, Gunnerales, Saxifragales, Vitales, Berberidopsidales, and Dilleniales form a basal grade of lines that diverged before the diversification of rosids and asterids. Within rosids, the COM (Celastrales-Oxalidales-Malpighiales) clade is sister to malvids (or rosid Ⅱ), instead of to the nitrogen-fixing clade as found in all previous large-scale molecular analyses of angiosperms. Santalales and Caryophyllales are members of an expanded asterid clade. This study shows that the mitochondrial genes are informative markers for resolving relationships among genera, families, or higher rank taxa across angiosperms. The low substitution rates and low homoplasy levels of the mitochondrial genes relative to the chloroplast genes, as found in this study, make them particularly useful for reconstructing ancient phylogenetic relationships. A mitochondrial gene-based angiosperm phylogeny provides an independent and essential reference for comparison with hypotheses of angiosperm phylogeny based on chloroplast genes, nuclear genes, and non-molecular data to reconstruct the underlying organismal phylogeny.

  19. Cloning, nucleotide sequence, and expression of the Rhodobacter sphaeroides Y thioredoxin gene.

    Pille, S.; Chuat, J C; Breton, A M; Clément-Métral, J D; Galibert, F

    1990-01-01

    Synthetic oligodeoxynucleotide probes based on the known amino acid sequence of Rhodobacter sphaeroides Y thioredoxin were used to identify, clone, and sequence the structural gene. The amino acid sequence derived from the DNA sequence of the R. sphaeroides gene was identical to the known amino acid sequence of R. sphaeroides thioredoxin. An NcoI site was created by directed mutagenesis at the beginning of the thioredoxin gene, inducing in the encoded protein the replacement of serine in posi...

  20. Structure and sequence variation of mink interleukin-6 gene

    Aleutian disease (AD) is the number one disease threat to the survival and future of the mink industry in Nova Scotia and the world. Several ranchers have gone out of business in recent years in Nova Scotia as a direct result of AD. Currently, the control measure for AD consists of testing and slaughtering infected mink. This practice has not been effective in controlling the disease. Finding a means of controlling AD is the number one priority for the mink industry in Nova Scotia. An effective control measure will have a long-term positive effect on the rural economy by improving the production potential of mink and reducing production cost. It has been shown that antiviral antibodies produced by activated immune system cells sometimes combine with interleukin-6 (IL-6) to form immune complexes that cause AD in mink. There is evidence of a significant relationship between nucleotide variations in the IL-6 gene and the onset of certain diseases in humans whose symptoms are similar to those of AD. Furthermore, pathological symptoms of AD resemble those of other conditions, such as systemic lupus erythematosus (SLE) and Castleman disease in humans, where overproduction of IL-6 coincides with the severity of the disease. These findings suggest that IL-6 could be a candidate gene and warrant investigation of differences among mink genotypes in resistance or tolerance to ADV infection. The IL-6 gene in mink was sequenced, and the polymorphisms identified were used to evaluate the potential role of this gene in the immune system response to infections. The 4678 bp promoter region, five exons and four introns of the IL-6 gene were bi-directionally sequenced in four unrelated mink from each of the wild, black, brown, pastel and sapphire mink (GenBank accession number EF620932). The 344 bp promoter region of the gene contained several transcription binding sites. One exonic and seven intronic single nucleotide polymorphisms (SNP) were detected by

  1. Technology development for gene discovery and full-length sequencing

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  2. Nucleotide sequence and expression analysis of the Acetobacter xylinum uridine diphosphoglucose pyrophosphorylase gene.

    Brede, G; Fjaervik, E; Valla, S

    1991-01-01

    The nucleotide sequence of the Acetobacter xylinum uridine diphosphoglucose pyrophosphorylase gene was determined; this is the first procaryotic uridine diphosphoglucose pyrophosphorylase gene sequence reported. The sequence data indicated that the gene product consists of 284 amino acids. This finding was consistent with the results obtained by expression analysis in vivo and in vitro in Escherichia coli.

  3. Cloning and sequence analysis of US1 gene in duck enteritis virus

    ZHAO Yan; WANG Jun-wei; MA Bo; ZHAO Xiao-yan

    2011-01-01

    In this paper, a 1,860 bp sequence in the IRs region of duck enteritis virus (DEV) was amplified by single oligonucleotide nested PCR with a single primer designed according to a partial sequence of US1. A pair of primers designed according to the 3' UTR of the US8 gene and the 5' end of the newly obtained sequence was then used to amplify a 2,426 bp sequence toward the TRs region. Sequence analysis revealed that both sequences contained an identical 990 bp open reading frame of the DEV US1 gene. The two ORFs were in opposite transcription orientation. Comparison of the nucleotide sequence and the deduced amino acid sequence of the US1 gene showed relatively high identity to Mardivirus. Phylogenetic tree analysis showed that the eleven herpesviruses were classified into three groups, and that duck enteritis virus was most closely related to Mardivirus.
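
    The statement that the two US1 ORFs lie in opposite transcription orientation can be illustrated with a reverse-complement check. The following is a minimal sketch under stated assumptions: the function names and the toy sequence are hypothetical and are not DEV data or the authors' analysis.

```python
# Illustrative sketch only: an ORF copy located in an inverted repeat reads as
# the reverse complement of the other copy. The toy sequence is hypothetical.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def same_orf_opposite_strand(copy_a: str, copy_b: str) -> bool:
    """True if copy_b is exactly the reverse complement of copy_a."""
    return reverse_complement(copy_a).upper() == copy_b.upper()


toy_orf = "ATGAAATAA"  # hypothetical 9-nt ORF, not a real US1 sequence
inverted_copy = reverse_complement(toy_orf)
print(inverted_copy, same_orf_opposite_strand(toy_orf, inverted_copy))
```

    In practice the comparison would be run on the two 990 bp ORFs after extracting them from the IRs and TRs amplicons.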

  4. dcp gene of Escherichia coli: cloning, sequencing, transcript mapping, and characterization of the gene product.

    Henrich, B; S. Becker; Schroeder, U; Plapp, R.

    1993-01-01

    Dipeptidyl carboxypeptidase is a C-terminal exopeptidase of Escherichia coli. We have isolated the respective gene, dcp, from a low-copy-number plasmid library by its ability to complement a dcp mutation preventing the utilization of the unique substrate N-benzoyl-L-glycyl-L-histidyl-L-leucine. Sequence analysis of a 2.9-kb DNA fragment revealed an open reading frame of 2,043 nucleotides which was assigned to the dcp gene by N-terminal amino acid sequencing and electrophoretic molecular mass ...

  5. Deep sequencing reveals 50 novel genes for recessive cognitive disorders.

    Najmabadi, Hossein; Hu, Hao; Garshasbi, Masoud; Zemojtel, Tomasz; Abedini, Seyedeh Sedigheh; Chen, Wei; Hosseini, Masoumeh; Behjati, Farkhondeh; Haas, Stefan; Jamali, Payman; Zecha, Agnes; Mohseni, Marzieh; Püttmann, Lucia; Vahid, Leyla Nouri; Jensen, Corinna; Moheb, Lia Abbasi; Bienek, Melanie; Larti, Farzaneh; Mueller, Ines; Weissmann, Robert; Darvish, Hossein; Wrogemann, Klaus; Hadavi, Valeh; Lipkowitz, Bettina; Esmaeeli-Nieh, Sahar; Wieczorek, Dagmar; Kariminejad, Roxana; Firouzabadi, Saghar Ghasemi; Cohen, Monika; Fattahi, Zohreh; Rost, Imma; Mojahedi, Faezeh; Hertzberg, Christoph; Dehghan, Atefeh; Rajab, Anna; Banavandi, Mohammad Javad Soltani; Hoffer, Julia; Falah, Masoumeh; Musante, Luciana; Kalscheuer, Vera; Ullmann, Reinhard; Kuss, Andreas Walter; Tzschach, Andreas; Kahrizi, Kimia; Ropers, H Hilger

    2011-10-01

    Common diseases are often complex because they are genetically heterogeneous, with many different genetic defects giving rise to clinically indistinguishable phenotypes. This has been amply documented for early-onset cognitive impairment, or intellectual disability, one of the most complex disorders known and a very important health care problem worldwide. More than 90 different gene defects have been identified for X-chromosome-linked intellectual disability alone, but research into the more frequent autosomal forms of intellectual disability is still in its infancy. To expedite the molecular elucidation of autosomal-recessive intellectual disability, we have now performed homozygosity mapping, exon enrichment and next-generation sequencing in 136 consanguineous families with autosomal-recessive intellectual disability from Iran and elsewhere. This study, the largest published so far, has revealed additional mutations in 23 genes previously implicated in intellectual disability or related neurological disorders, as well as single, probably disease-causing variants in 50 novel candidate genes. Proteins encoded by several of these genes interact directly with products of known intellectual disability genes, and many are involved in fundamental cellular processes such as transcription and translation, cell-cycle control, energy metabolism and fatty-acid synthesis, which seem to be pivotal for normal brain development and function. PMID:21937992

  6. Multiple gene sequence analysis using genes of the bacterial DNA repair pathway

    Miguel Rotelok Neto; Carolina Weigert Galvão; Leonardo Magalhães Cruz; Dieval Guizelini; Leilane Caline Silva; Jarem Raul Garcia; Rafael Mazer Etto

    2015-01-01

    The ability to recognize and repair abnormal DNA structures is common to all forms of life. Physiological studies and genomic sequencing of a variety of bacterial species have identified an incredible diversity of DNA repair pathways. Despite the amount of available genes in public database, the usual method to place genomes in a taxonomic context is based mainly on the 16S rRNA or housekeeping genes. Thus, the relationships among genomes remain poorly understood. In this work, an approach of...

  7. Efficient expression of the Saccharomyces cerevisiae PGK gene depends on an upstream activation sequence but does not require TATA sequences.

    Ogden, J E; Stanway, C; Kim, S.; Mellor, J; Kingsman, A J; Kingsman, S M

    1986-01-01

    The Saccharomyces cerevisiae PGK (phosphoglycerate kinase) gene encodes one of the most abundant mRNA and protein species in the cell. To identify the promoter sequences required for the efficient expression of PGK, we undertook a detailed internal deletion analysis of the 5' noncoding region of the gene. Our analysis revealed that PGK has an upstream activation sequence (UASPGK) located between 402 and 479 nucleotides upstream from the initiating ATG sequence which is required for full trans...

  8. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Nathan D. Olson

    2015-03-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.
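
    The idea of a variant copy ratio can be made concrete with a deliberately simplified maximum likelihood sketch. This is not the study's method: it assumes a plain binomial read model, a fixed per-base error rate, and a known number of gene copies (the default of 7 is an assumption for illustration), and the function name is hypothetical.

```python
import math


def likeliest_variant_copies(variant_reads: int, total_reads: int,
                             n_copies: int = 7, error_rate: float = 0.01) -> int:
    """Return the number of gene copies (0..n_copies) carrying the variant
    that maximizes a simple binomial likelihood of the observed read counts.

    Assumptions (not from the source): reads sample the copies uniformly and
    each base is misread with probability error_rate.
    """
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("variant_reads must lie between 0 and total_reads")
    if not 0 < error_rate < 0.5:
        raise ValueError("error_rate must be in (0, 0.5)")
    if n_copies < 1:
        raise ValueError("n_copies must be at least 1")

    best_k, best_loglik = 0, -math.inf
    for k in range(n_copies + 1):
        frac = k / n_copies
        # Expected fraction of reads showing the variant, allowing for errors.
        p = frac * (1 - error_rate) + (1 - frac) * error_rate
        # The binomial coefficient is constant across k, so it is omitted.
        loglik = (variant_reads * math.log(p)
                  + (total_reads - variant_reads) * math.log(1 - p))
        if loglik > best_loglik:
            best_k, best_loglik = k, loglik
    return best_k


print(likeliest_variant_copies(30, 210))  # 30/210 = 1/7 of reads carry the variant
```

    Because the estimate depends on read counts, low depth makes neighbouring copy numbers hard to distinguish, which is consistent with the abstract's remark that reproducibility required high throughput data.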

  9. SEQUENCING AND SEQUENCE ANALYSIS OF MYOSTATIN GENE IN THE EXON 1 OF THE CAMEL (CAMELUS DROMEDARIUS)

    M. G. SHAH, A. S. QURESHI, M. REISSMANN AND H. J. SCHWARTZ

    2006-10-01

    Myostatin, also called growth differentiation factor-8 (GDF-8), is a member of the mammalian transforming growth factor-beta (TGF-beta) superfamily and is expressed specifically in developing and adult skeletal muscle. The muscular hypertrophy (mh) allele in double-muscled breeds involves a mutation within the myostatin gene. Genomic DNA was isolated from camel hair using the NucleoSpin Tissue kit. Two animals from each of six breeds, namely Marecha, Dhatti, Larri, Kohi, Sakrai and Cambelpuri, were used for sequencing. For PCR amplification of the gene, a primer pair was designed from homologous regions of previously published sequences of farm animals from GenBank. Results showed that camel myostatin possessed more than 90% homology with that of cattle, sheep and pig. Camel formed a separate cluster from the pig in spite of having high homology (98%) and showed 94% homology with cattle and sheep, as reported in the literature. The sequence of the PCR-amplified part of exon 1 (256 bp) of camel myostatin was identical among the six camel breeds.

  10. A homeodomain protein binds to γ-globin gene regulatory sequences

    Lavelle, D.; Ducksworth, J.; Eves, E.; Gomes, G.; Keller, M.; Heller, P.; DeSimone, J. (Univ. of Illinois, Chicago (United States) Veterans Administration Westside Medical Center, Chicago, IL (United States))

    1991-08-15

    Developmental regulation of γ-globin gene expression probably occurs through developmental-stage-specific trans-acting factors able to promote the interaction of enhancer elements located in the far upstream locus control region with regulatory elements in the γ gene promoters and 3′ Aγ enhancer located in close proximity to the genes. The authors have detected a nuclear protein in K562 and baboon fetal bone marrow nuclear extracts capable of binding to A+T-rich sequences in the locus control region, γ gene promoter, and 3′ Aγ enhancer. SDS/polyacrylamide gel analysis of the purified K562 binding activity revealed a single protein of 87 kDa. A K562 cDNA clone was isolated encoding a β-galactosidase fusion protein with a DNA binding specificity identical to that of the K562/fetal bone marrow nuclear protein. The cDNA clone encodes a homeodomain homologous to the Drosophila antennapedia protein.

  11. dcp gene of Escherichia coli: cloning, sequencing, transcript mapping, and characterization of the gene product.

    Henrich, B; Becker, S; Schroeder, U; Plapp, R

    1993-01-01

    Dipeptidyl carboxypeptidase is a C-terminal exopeptidase of Escherichia coli. We have isolated the respective gene, dcp, from a low-copy-number plasmid library by its ability to complement a dcp mutation preventing the utilization of the unique substrate N-benzoyl-L-glycyl-L-histidyl-L-leucine. Sequence analysis of a 2.9-kb DNA fragment revealed an open reading frame of 2,043 nucleotides which was assigned to the dcp gene by N-terminal amino acid sequencing and electrophoretic molecular mass determination of the purified dcp product. Transcript mapping by primer extension and S1 protection experiments verified the physiological significance of potential initiation and termination signals for dcp transcription and allowed the identification of a single species of monocistronic dcp mRNA. The codon usage pattern and the effects of elevated gene copy number indicated a relatively low level of dcp expression. The predicted amino acid sequence of dipeptidyl carboxypeptidase, containing a potential zinc-binding site, is highly homologous (78.8%) to the corresponding enzyme from Salmonella typhimurium. It also displays significant homology to the products of the S. typhimurium opdA and the E. coli prlC genes and to some metalloproteases from rats and Saccharomyces cerevisiae. No potential export signals could be inferred from the amino acid sequence. Dipeptidyl carboxypeptidase was enriched 80-fold from crude extracts of E. coli and used to investigate some of its biochemical and biophysical properties. Images PMID:8226676

  12. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-04-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  13. Molecular cloning, sequence identification, and gene expression analysis of bovine ADCY2 gene.

    Li, Y X; Jin, H G; Yan, C G; Ren, C Y; Jiang, C J; Jin, C D; Seo, K S; Jin, X

    2014-06-01

    Adenylyl cyclase 2 (ADCY2), a class B member of adenylyl cyclases, is important in accelerating phosphor-acidification as well as glycogen synthesis and breakdown. Given its distinct role in flesh tenderization after butchering, we cloned and sequenced the ADCY2 gene from Yanbian cattle and assessed its expression in bovine tissues. A 2947 bp nucleotide sequence representing the full-length cDNA of bovine ADCY2 gene was obtained by 5' and 3' remote analysis computations for gene expression. Analyses of the putative protein sequence showed that ADCY2 had high homology among species, except with the non-mammal Oreochromis niloticus. Gene structural domain analyses in humans and rats indicated that the ADCY2 protein had no flaw; only the transmembrane domain was reduced and the CYCc structure domain was shortened. Assessment of ADCY2 expression in bovine tissues by real-time PCR showed that the highest expression was in the testes, followed by the longissimus dorsi, tensor fasciae latae, and latissimus dorsi. These data will serve as a foundation for further insight into the cattle ADCY2 gene. PMID:24797538

  14. EvoTol: a protein-sequence based evolutionary intolerance framework for disease-gene prioritization

    Rackham, Owen J. L.; Shihab, Hashem A; Johnson, Michael R.; Petretto, Enrico

    2014-01-01

    Methods to interpret personal genome sequences are increasingly required. Here, we report a novel framework (EvoTol) to identify disease-causing genes using patient sequence data from within protein coding-regions. EvoTol quantifies a gene's intolerance to mutation using evolutionary conservation of protein sequences and can incorporate tissue-specific gene expression data. We apply this framework to the analysis of whole-exome sequence data in epilepsy and congenital heart disease, and demon...

  15. Nucleotide sequence and corresponding amino acid sequence of the gene for the major antigen of foot and mouth disease virus.

    Kurz, C; Forss, S; Küpper, H; K Strohmaier; Schaller, H

    1981-01-01

    A segment of 1160 nucleotides of the FMDV genome has been sequenced using three overlapping fragments of cloned cDNA from FMDV strain O1K. This sequence contains the coding sequence for the viral capsid protein VP1 as shown by its homology to known and newly determined amino acid sequences from this main antigenic polypeptide of the FMDV virion. The structural gene for VP1 comprises 639 nucleotides which specify a sequence of 213 amino acids for the VP1 protein. The coding sequence is not flan...
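
    The relationship between the 639 nucleotides and the 213 amino acids reported for VP1 is the usual three-nucleotides-per-codon arithmetic. A minimal sketch follows; the function name and the `includes_stop_codon` flag are hypothetical conveniences, not part of the source.

```python
def residues_encoded(coding_nucleotides: int, includes_stop_codon: bool = False) -> int:
    """Number of amino acids specified by a coding sequence of the given length.

    Each codon is three nucleotides. If the stated length includes a stop
    codon, one codon is subtracted because it encodes no residue.
    """
    if coding_nucleotides <= 0 or coding_nucleotides % 3 != 0:
        raise ValueError("a coding sequence must be a positive multiple of 3 nt")
    codons = coding_nucleotides // 3
    return codons - 1 if includes_stop_codon else codons


print(residues_encoded(639))  # VP1 figures from the abstract: 639 nt -> 213 aa
```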

  16. Transcriptome sequencing and positive selected genes analysis of Bombyx mandarina.

    Cheng, Tingcai; Fu, Bohua; Wu, Yuqian; Long, Renwen; Liu, Chun; Xia, Qingyou

    2015-01-01

    The wild silkworm Bombyx mandarina is widely believed to be an ancestor of the domesticated silkworm, Bombyx mori. Silkworms are often used as a model for studying the mechanism of species domestication. Here, we performed transcriptome sequencing of the wild silkworm using an Illumina HiSeq2000 platform. We produced 100,004,078 high-quality reads and assembled them into 50,773 contigs with an N50 length of 1764 bp and a mean length of 941.62 bp. A total of 33,759 unigenes were identified, with 12,805 annotated in the Nr database, 8273 in the Pfam database, and 9093 in the Swiss-Prot database. Expression profile analysis found significant differential expression of 1308 unigenes between the middle silk gland (MSG) and posterior silk gland (PSG). Three sericin genes (sericin 1, sericin 2, and sericin 3) were expressed specifically in the MSG and three fibroin genes (fibroin-H, fibroin-L, and fibroin/P25) were expressed specifically in the PSG. In addition, 32,297 single-nucleotide polymorphisms (SNPs) and 361 insertion-deletions (INDELs) were detected. Comparison with the domesticated silkworm p50/Dazao identified 5,295 orthologous genes, among which 400 might have experienced or be experiencing positive selection according to Ka/Ks analysis. The data and analyses presented here provide insights into silkworm domestication and an invaluable resource for wild silkworm genomics research. PMID:25806526
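
    The N50 and mean contig length quoted above are standard assembly summary statistics. The sketch below shows how they are conventionally computed from a list of contig lengths; the function names and the toy lengths are illustrative assumptions, and the assembler used in the study may report N50 with slightly different tie-breaking.

```python
def n50(lengths: list[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # reached half of the assembled bases
            return length
    raise AssertionError("unreachable for non-empty input")


def mean_length(lengths: list[int]) -> float:
    """Arithmetic mean of the contig lengths."""
    return sum(lengths) / len(lengths)


toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]  # hypothetical lengths, not study data
print(n50(toy_contigs), mean_length(toy_contigs))
```

    An N50 well above the mean, as in the abstract (1764 bp against 941.62 bp), indicates that many short contigs sit alongside a smaller number of long ones.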

  17. Transcriptome sequencing and positive selected genes analysis of Bombyx mandarina.

    Tingcai Cheng

    The wild silkworm Bombyx mandarina is widely believed to be an ancestor of the domesticated silkworm, Bombyx mori. Silkworms are often used as a model for studying the mechanism of species domestication. Here, we performed transcriptome sequencing of the wild silkworm using an Illumina HiSeq2000 platform. We produced 100,004,078 high-quality reads and assembled them into 50,773 contigs with an N50 length of 1764 bp and a mean length of 941.62 bp. A total of 33,759 unigenes were identified, with 12,805 annotated in the Nr database, 8273 in the Pfam database, and 9093 in the Swiss-Prot database. Expression profile analysis found significant differential expression of 1308 unigenes between the middle silk gland (MSG) and posterior silk gland (PSG). Three sericin genes (sericin 1, sericin 2, and sericin 3) were expressed specifically in the MSG and three fibroin genes (fibroin-H, fibroin-L, and fibroin/P25) were expressed specifically in the PSG. In addition, 32,297 single-nucleotide polymorphisms (SNPs) and 361 insertion-deletions (INDELs) were detected. Comparison with the domesticated silkworm p50/Dazao identified 5,295 orthologous genes, among which 400 might have experienced or be experiencing positive selection according to Ka/Ks analysis. The data and analyses presented here provide insights into silkworm domestication and an invaluable resource for wild silkworm genomics research.

  18. Identification of novel hereditary cancer genes by whole exome sequencing.

    Sokolenko, Anna P; Suspitsin, Evgeny N; Kuligina, Ekatherina Sh; Bizin, Ilya V; Frishman, Dmitrij; Imyanitov, Evgeny N

    2015-12-28

    Whole exome sequencing (WES) provides a powerful tool for medical genetic research. Several dozen WES studies involving patients with hereditary cancer syndromes have already been reported. WES has led to breakthroughs in understanding the genetic basis of some exceptionally rare syndromes; for example, identification of germ-line SMARCA4 mutations in patients with ovarian hypercalcemic small cell carcinomas explains a noticeable share of familial aggregation of this disease. However, studies on common cancer types have turned out to be more difficult. In particular, almost a dozen reports describe WES analysis of breast cancer patients, but none has yet revealed a gene responsible for a significant share of the missing heritability. Virtually all components of WES studies require substantial improvement, e.g. technical performance of WES, interpretation of WES results, and mode of patient selection. Most contemporary investigations focus on genes with an autosomal dominant mechanism of inheritance; however, recessive and oligogenic models of transmission of cancer susceptibility also need to be considered. It is expected that the list of medically relevant tumor-predisposing genes will expand rapidly in the next few years. PMID:26427841

  19. Estimating the extent of horizontal gene transfer in metagenomic sequences

    Moya Andrés

    2008-03-01

    Background: Although the extent of horizontal gene transfer (HGT) in complete genomes has been widely studied, its influence on the evolution of natural communities of prokaryotes remains unknown. The availability of metagenomic sequences allows us to address the study of global patterns of prokaryotic evolution in samples from natural communities. However, the methods that have been commonly used for the study of HGT are not suitable for metagenomic samples. Therefore it is important to develop new methods or to adapt existing ones to be used with metagenomic sequences. Results: We have created two different methods that are suitable for the study of HGT in metagenomic samples. The methods are based on phylogenetic and DNA compositional approaches, and have allowed us to assess the extent of possible HGT events in metagenomes for the first time. The methods are shown to be compatible and quite precise, although they probably underestimate the number of possible events. Our results show that the phylogenetic method detects HGT in between 0.8% and 1.5% of the sequences, while DNA compositional methods identify putative HGT in between 2% and 8% of the sequences. These ranges are very similar to those found in complete genomes by related approaches. The two methods differ in sensitivity, probably because they target HGT events of different ages: the compositional method mostly identifies recent transfers, while the phylogenetic method is more suitable for the detection of older events. Nevertheless, the study of the number of HGT events in metagenomic sequences from different communities shows a consistent trend for both methods: the lowest amount is found for the sequences of the Sargasso Sea metagenome, while the highest is found in the whale fall metagenome from the bottom of the ocean. The significance of these observations is discussed. Conclusion: The computational approaches that are used to find possible HGT events in complete
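
    To give a feel for what a DNA compositional screen does, the sketch below flags sequences whose GC content deviates strongly from the rest of a set. This is a simplified stand-in chosen for illustration, not the compositional method used in the study; the function names, the z-score threshold and the toy sequences are all assumptions.

```python
import statistics


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    acgt = sum(1 for base in seq if base in "ACGT")
    gc = sum(1 for base in seq if base in "GC")
    return gc / acgt if acgt else 0.0


def flag_atypical(seqs: dict[str, str], threshold: float = 2.0) -> list[str]:
    """Names of sequences whose GC content lies more than `threshold`
    population standard deviations from the mean of the set."""
    values = {name: gc_content(seq) for name, seq in seqs.items()}
    mean = statistics.mean(values.values())
    sd = statistics.pstdev(values.values())
    if sd == 0:
        return []  # all sequences share the same composition
    return [name for name, gc in values.items() if abs(gc - mean) / sd > threshold]


# Hypothetical toy data: nine balanced sequences and one GC-rich outlier.
toy = {f"s{i}": "ATGC" for i in range(9)}
toy["x"] = "GGCC"
print(flag_atypical(toy))
```

    A screen of this kind responds to composition that has not yet drifted toward the host genome, which matches the abstract's point that compositional methods mostly pick up recent transfers.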

  20. Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications

    Yilmaz, Pelin; Kottmann, Renzo; Field, Dawn;

    2011-01-01

    Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences--the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The 'environment...... present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere....

  1. Molecular Cloning and Sequencing of Hemoglobin-Beta Gene of Channel Catfish, Ictalurus Punctatus Rafinesque

    Hemoglobin-β gene of channel catfish, Ictalurus punctatus, was cloned and sequenced. Total RNA from head kidneys was isolated, reverse transcribed and amplified. The sequence of the channel catfish hemoglobin-β gene consists of 600 nucleotides. Analysis of the nucleotide sequence reveals one o...

  2. Detection bias in microarray and sequencing transcriptomic analysis identified by housekeeping genes

    Yijuan Zhang; Oluwafemi S. Akintola; Liu, Ken J.A.; Bingyun Sun

    2015-01-01

    This work includes the original data used to discover the gene ontology bias in transcriptomic analysis conducted by microarray and high throughput sequencing (Zhang et al., 2015) [1]. In the analysis, housekeeping genes were used to examine the differential detection ability of microarray and sequencing because these genes are probably the most reliably detected. The genes included here were compiled from 15 human housekeeping gene studies. The tables provided here comprise detailed chrom...

  3. Poly purine.pyrimidine sequences upstream of the beta-galactosidase gene affect gene expression in Saccharomyces cerevisiae

    Brahmachari Samir K

    2001-10-01

    Background: Poly purine.pyrimidine sequences have the potential to adopt intramolecular triplex structures and are overrepresented upstream of genes in eukaryotes. These sequences may regulate gene expression by modulating the interaction of transcription factors with DNA sequences upstream of genes. Results: A poly purine.pyrimidine sequence with the potential to adopt an intramolecular triplex DNA structure was designed. The sequence was inserted within a nucleosome positioned upstream of the β-galactosidase gene in yeast, Saccharomyces cerevisiae, between the cyc1 promoter and the gal10 Upstream Activating Sequences (UASg). Upon derepression with galactose, β-galactosidase gene expression is reduced 12-fold in cells carrying single copy poly purine.pyrimidine sequences. This reduction in expression is correlated with reduced transcription. Furthermore, we show that plasmids carrying a poly purine.pyrimidine sequence are not specifically lost from yeast cells. Conclusion: We propose that a poly purine.pyrimidine sequence upstream of a gene affects transcription. Plasmids carrying this sequence are not specifically lost from cells and thus no additional effort is needed for the replication of these sequences in eukaryotic cells.

  4. Identification and analysis of gene families from the duplicated genome of soybean using EST sequences

    Shoemaker Randy

    2006-08-01

    Background: Large scale gene analysis of most organisms is hampered by incomplete genomic sequences. In many organisms, such as soybean, the best source of sequence information is the existence of expressed sequence tag (EST) libraries. Soybean has a large (1115 Mbp) genome that has yet to be fully sequenced. However, it does have the 6th largest EST collection, comprised of ESTs from a variety of soybean genotypes. Many EST libraries were constructed from RNA extracted from various genetic backgrounds, thus gene identification from these sources is complicated by the existence of both gene and allele sequence differences. We used the ESTminer suite of programs to identify potential soybean gene transcripts from a single genetic background allowing us to observe functional classifications between gene families as well as structural differences between genes and gene paralogs within families. The identification of potential gene sequences (pHaps) from soybean allows us to begin to get a picture of the genomic history of the organism as well as begin to observe the evolutionary fates of gene copies in this highly duplicated genome. Results: We identified approximately 45,000 potential gene sequences (pHaps) from EST sequences of Williams/Williams82, an inbred genotype of soybean (Glycine max L. Merr.), using a redundancy criterion to identify reproducible sequence differences between related genes within gene families. Analysis of these sequences revealed that single base substitutions and single base indels are the most frequently observed form of sequence variation between genes within families in the dataset. Genomic sequencing of selected loci indicates that intron-like intervening sequences are numerous and are approximately 220 bp in length. Functional annotation of gene sequences indicates that functional classifications are not randomly distributed among gene families containing few or many genes. Conclusion: The predominance of single nucleotide

  5. Cloning, sequencing and expression of a xylanase gene from the maize pathogen Helminthosporium turcicum

    Degefu, Y.; Paulin, L.; Lübeck, Peter Stephensen

    2001-01-01

    A gene encoding an endoxylanase from the phytopathogenic fungus Helminthosporium turcicum Pass. was cloned and sequenced. The entire nucleotide sequence of a 1991 bp genomic fragment containing an endoxylanase gene was determined. The xylanase gene of 795 bp, interrupted by two introns of 52 and ...

  6. High throughput 16S rRNA gene amplicon sequencing

    Nierychlo, Marta; Larsen, Poul; Jørgensen, Mads Koustrup;

    16S rRNA gene amplicon sequencing has been developed over the past few years and is now ready to use for more comprehensive studies related to plant operation and optimization thanks to short analysis time, low cost, high throughput, and high taxonomic resolution. In this study we show how 16S r...... belonging to the phylum Chloroflexi. Based on knowledge about their ecophysiology, other control measures were introduced and the bulking problem was reduced after 2 months. Besides changes in the filament abundance and composition also other changes in the microbial community were observed that likely...... correlated with the bacterial species composition in 25 Danish full-scale WWTPs with nutrient removal. Examples of properties were SVI, filament index, floc size, floc strength, content of cations and amount of extracellular polymeric substances. Multivariate statistics provided several important insights...

  7. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses.

    Stelzer, Gil; Rosen, Naomi; Plaschkes, Inbar; Zimmerman, Shahar; Twik, Michal; Fishilevich, Simon; Stein, Tsippi Iny; Nudel, Ron; Lieder, Iris; Mazor, Yaron; Kaplan, Sergey; Dahary, Dvir; Warshawsky, David; Guan-Golan, Yaron; Kohn, Asher; Rappaport, Noa; Safran, Marilyn; Lancet, Doron

    2016-01-01

    GeneCards, the human gene compendium, enables researchers to effectively navigate and inter-relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure facilitating faster data updates, better-targeted data queries, and friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene-disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression. VarElect, another suite member, is a phenotype prioritizer for next-generation sequencing, leveraging the GeneCards and MalaCards knowledgebase. It automatically infers direct and indirect scored associations between hundreds or even thousands of variant-containing genes and disease phenotype terms. VarElect's capabilities, either independently or within TGex, our comprehensive variant analysis pipeline, help prepare for the challenge of clinical projects that involve thousands of exome/genome NGS analyses. © 2016 by John Wiley & Sons, Inc. PMID:27322403

  8. Sequence-specific interactions of nuclear factors with conserved sequences of human class II major histocompatibility complex genes

    All class II major histocompatibility complex genes contain two highly conserved sequences, termed X and Y, within the promoter region(s), which may have a role in regulation of expression. To study trans-acting factors that interact with these sequences, sequence-specific DNA binding activity has been examined by the gel electrophoresis retardation assay using the HLA-DQ2β gene 5' flanking DNA and nuclear extracts derived from various cell types. Several specific protein-binding activities were found using a 45-base-pair (bp) HinfI/Sau96I (-142 to -98 bp) and a 38-bp Sau96I/Sau96I (-97 to -60 bp) fragment, which include conserved sequence X (-113 to -100 bp) and conserved sequence Y (-80 to -71 bp), respectively. Competition experiments, methylation interference analysis, and DNase I footprinting demonstrated that distinct proteins in a nuclear extract of Raji cells (a human B lymphoma line) bind to sequence X, to sequence Y, and to DNA 5' of the X sequence (termed sequence W). The factor binding site in the W sequence is also found to be conserved among β-chain genes and is suggested to be a γ-interferon control region.

  9. Next generation sequencing in synovial sarcoma reveals novel gene mutations.

    Vlenterie, Myrella; Hillebrandt-Roeffen, Melissa H S; Flucke, Uta E; Groenen, Patricia J T A; Tops, Bastiaan B J; Kamping, Eveline J; Pfundt, Rolph; de Bruijn, Diederik R H; Geurts van Kessel, Ad H M; van Krieken, Han J H J M; van der Graaf, Winette T A; Versleijen-Jonkers, Yvonne M H

    2015-10-27

    Over 95% of all synovial sarcomas (SS) share a unique translocation, t(X;18), however, they show heterogeneous clinical behavior. We analyzed multiple SS to reveal additional genetic alterations besides the translocation. Twenty-six SS from 22 patients were sequenced for 409 cancer-related genes using the Comprehensive Cancer Panel (Life Technologies, USA) on an Ion Torrent platform. The detected variants were verified by Sanger sequencing and compared to matched normal DNAs. Copy number variation was assessed in six tumors using the Oncoscan array (Affymetrix, USA). In total, eight somatic mutations were detected in eight samples. These mutations have not been reported previously in SS. Two of these, in KRAS and CCND1, represent known oncogenic mutations in other malignancies. Additional mutations were detected in RNF213, SEPT9, KDR, CSMD3, MLH1 and ERBB4. DNA alterations occurred more often in adult tumors. A distinctive loss of 6q was found in a metastatic lesion progressing under pazopanib, but not in the responding lesion. Our results emphasize t(X;18) as a single initiating event in SS and as the main oncogenic driver. Our results also show the occurrence of additional genetic events, mutations or chromosomal aberrations, occurring more frequently in SS with an onset in adults. PMID:26415226

  10. Hidden Markov Models for Gene Sequence Classification: Classifying the VSG genes in the Trypanosoma brucei Genome

    Mesa, Andrea; Basterrech, Sebastián; Guerberoff, Gustavo; Alvarez-Valin, Fernando

    2015-01-01

    The article presents an application of Hidden Markov Models (HMMs) for pattern recognition on genome sequences. We apply HMM for identifying genes encoding the Variant Surface Glycoprotein (VSG) in the genomes of Trypanosoma brucei (T. brucei) and other African trypanosomes. These are parasitic protozoa causative agents of sleeping sickness and several diseases in domestic and wild animals. These parasites have a peculiar strategy to evade the host's immune system that consists in periodicall...
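
    As a rough illustration of likelihood-based HMM classification of sequences, the sketch below scores a DNA string under two toy HMMs with the forward algorithm and picks the better-scoring model. The abstract is truncated and does not describe the authors' model topology or training, so the two-state models, their probabilities, and the names `forward_loglik`, `classify`, `GC_MODEL`, and `AT_MODEL` are all hypothetical and are not the published VSG models.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of finite log values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_loglik(seq, start, trans, emit):
    """Log-likelihood of seq under an HMM, computed with the forward algorithm.

    Assumes every start, transition, and emission probability is strictly
    positive, so math.log never receives zero.
    """
    states = list(start)
    alpha = {s: math.log(start[s]) + math.log(emit[s][seq[0]]) for s in states}
    for sym in seq[1:]:
        alpha = {
            t: log_sum_exp([alpha[s] + math.log(trans[s][t]) for s in states])
            + math.log(emit[t][sym])
            for t in states
        }
    return log_sum_exp(list(alpha.values()))

def classify(seq, models):
    """Return the name of the model that assigns seq the highest log-likelihood."""
    return max(models, key=lambda name: forward_loglik(seq, *models[name]))

# Hypothetical toy models: state "h" is compositionally biased, state "l" is uniform.
UNIFORM = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
START = {"h": 0.5, "l": 0.5}
TRANS = {"h": {"h": 0.9, "l": 0.1}, "l": {"h": 0.1, "l": 0.9}}
GC_MODEL = (START, TRANS, {"h": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}, "l": UNIFORM})
AT_MODEL = (START, TRANS, {"h": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}, "l": UNIFORM})

if __name__ == "__main__":
    models = {"gc_rich": GC_MODEL, "at_rich": AT_MODEL}
    print(classify("GCGCGGCC", models))
```

    In practice the model parameters would be estimated from known family members, and a score threshold against a background model would probably be needed; neither step is shown here.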

  11. Genome-wide gene-gene interaction analysis for next-generation sequencing.

    Zhao, Jinying; Zhu, Yun; Xiong, Momiao

    2016-03-01

    The critical barrier in interaction analysis for next-generation sequencing (NGS) data is that the traditional pairwise interaction analysis that is suitable for common variants is difficult to apply to rare variants because of their prohibitive computational time, large number of tests and low power. The great challenges for successful detection of interactions with NGS data are (1) the demands in the paradigm of changes in interaction analysis; (2) severe multiple testing; and (3) heavy computations. To meet these challenges, we shift the paradigm of interaction analysis between two SNPs to interaction analysis between two genomic regions. In other words, we take a gene as a unit of analysis and use functional data analysis techniques as dimensional reduction tools to develop a novel statistic to collectively test interaction between all possible pairs of SNPs within two genome regions. By intensive simulations, we demonstrate that the functional logistic regression for interaction analysis has the correct type 1 error rates and higher power to detect interaction than the currently used methods. The proposed method was applied to a coronary artery disease dataset from the Wellcome Trust Case Control Consortium (WTCCC) study and the Framingham Heart Study (FHS) dataset, and the early-onset myocardial infarction (EOMI) exome sequence datasets with European origin from the NHLBI's Exome Sequencing Project. We discovered that 6 of 27 pairs of significantly interacted genes in the FHS were replicated in the independent WTCCC study and 24 pairs of significantly interacted genes after applying Bonferroni correction in the EOMI study. PMID:26173972
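
    The multiple-testing burden mentioned above can be made concrete with a small sketch. It counts the pairwise tests for a given number of analysis units and applies a Bonferroni cutoff to a set of p-values. The gene count of 20,000, the gene names, and the p-values are illustrative assumptions only; the functional logistic regression statistic proposed in the paper is not reproduced here.

```python
def n_pairs(n_units):
    """Number of unordered pairs among n_units analysis units (SNPs or genes)."""
    return n_units * (n_units - 1) // 2

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def significant_pairs(p_values, alpha=0.05):
    """Return the pairs whose p-value falls below the Bonferroni cutoff.

    p_values maps a (unit_a, unit_b) tuple to the p-value of its interaction test.
    """
    cutoff = bonferroni_threshold(alpha, len(p_values))
    return {pair for pair, p in p_values.items() if p < cutoff}

if __name__ == "__main__":
    # Assumed round number of genes, for scale only.
    print(n_pairs(20000), bonferroni_threshold(0.05, n_pairs(20000)))

    # Hypothetical gene pairs and p-values.
    toy = {("G1", "G2"): 0.001, ("G1", "G3"): 0.02, ("G2", "G3"): 0.2}
    print(significant_pairs(toy))
```

    Testing gene pairs instead of SNP pairs shrinks the number of tests, and therefore relaxes the per-test cutoff, because there are far fewer genes than variants.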

  12. The Clinical Significance of Unknown Sequence Variants in BRCA Genes

    Germline mutations in BRCA1/2 genes are responsible for a large proportion of hereditary breast and/or ovarian cancers. Many highly penetrant predisposition alleles have been identified and include frameshift or nonsense mutations that lead to the translation of a truncated protein. Other alleles contain missense mutations, which result in amino acid substitution and intronic variants with splicing effect. The discovery of variants of uncertain/unclassified significance (VUS) is a result that can complicate rather than improve the risk assessment process. VUSs are mainly missense mutations, but also include a number of intronic variants and in-frame deletions and insertions. Over 2,000 unique BRCA1 and BRCA2 missense variants have been identified, located throughout the whole gene (Breast Cancer Information Core Database (BIC database)). Up to 10–20% of the BRCA tests report the identification of a variant of uncertain significance. There are many methods to discriminate deleterious/high-risk from neutral/low-risk unclassified variants (i.e., analysis of the cosegregation in families of the VUS, measure of the influence of the VUSs on the wild-type protein activity, comparison of sequence conservation across multiple species), but only an integrated analysis of these methods can contribute to a real interpretation of the functional and clinical role of the discussed variants. The aim of our manuscript is to review the studies on BRCA VUS in order to clarify their clinical relevance

  13. The Clinical Significance of Unknown Sequence Variants in BRCA Genes

    Calò, Valentina; Bruno, Loredana; Paglia, Laura La; Perez, Marco; Margarese, Naomi [Department of Surgery and Oncology, Regional Reference Center for the Biomolecular Characterization and Genetic Screening of Hereditary Tumors, University of Palermo, Via del Vespro 127, 90127 Palermo (Italy); Gaudio, Francesca Di [Department of Medical Biotechnologies and Legal Medicine, University of Palermo, Palermo (Italy); Russo, Antonio, E-mail: lab-oncobiologia@usa.net [Department of Surgery and Oncology, Regional Reference Center for the Biomolecular Characterization and Genetic Screening of Hereditary Tumors, University of Palermo, Via del Vespro 127, 90127 Palermo (Italy)

    2010-09-10

    Germline mutations in BRCA1/2 genes are responsible for a large proportion of hereditary breast and/or ovarian cancers. Many highly penetrant predisposition alleles have been identified and include frameshift or nonsense mutations that lead to the translation of a truncated protein. Other alleles contain missense mutations, which result in amino acid substitution and intronic variants with splicing effect. The discovery of variants of uncertain/unclassified significance (VUS) is a result that can complicate rather than improve the risk assessment process. VUSs are mainly missense mutations, but also include a number of intronic variants and in-frame deletions and insertions. Over 2,000 unique BRCA1 and BRCA2 missense variants have been identified, located throughout the whole gene (Breast Cancer Information Core Database (BIC database)). Up to 10–20% of the BRCA tests report the identification of a variant of uncertain significance. There are many methods to discriminate deleterious/high-risk from neutral/low-risk unclassified variants (i.e., analysis of the cosegregation in families of the VUS, measure of the influence of the VUSs on the wild-type protein activity, comparison of sequence conservation across multiple species), but only an integrated analysis of these methods can contribute to a real interpretation of the functional and clinical role of the discussed variants. The aim of our manuscript is to review the studies on BRCA VUS in order to clarify their clinical relevance.

  14. Structural organization of glycophorin A and B genes: Glycophorin B gene evolved by homologous recombination at Alu repeat sequences

    Glycophorins A (GPA) and B (GPB) are two major sialoglycoproteins of the human erythrocyte membrane. Here the authors present a comparison of the genomic structures of GPA and GPB developed by analyzing DNA clones isolated from a K562 genomic library. Nucleotide sequences of exon-intron junctions and 5' and 3' flanking sequences revealed that the GPA and GPB genes consist of 7 and 5 exons, respectively, and both genes have >95% identical sequence from the 5' flanking region to the region ∼ 1 kilobase downstream from the exon encoding the transmembrane regions. In this homologous part of the genes, GPB lacks one exon due to a point mutation at the 5' splicing site of the third intron, which inactivates the 5' cleavage event of splicing and leads to ligation of the second to the fourth exon. Following these very homologous sequences, the genomic sequences for GPA and GPB diverge significantly and no homology can be detected in their 3' end sequences. The analysis of the Alu sequences and their flanking direct repeat sequences suggest that an ancestral genomic structure has been maintained in the GPA gene, whereas the GPB gene has arisen from the acquisition of 3' sequences different from those of the GPA gene by homologous recombination at the Alu repeats during or after gene duplication

  15. Secondary structure and phylogenetic utility of the ribosomal large subunit (28S) in monogeneans of the genus Thaparocleidus and Bifurcohaptor (Monogenea: Dactylogyridae).

    Chaudhary, Anshu; Singh, Hridaya Shanker

    2013-04-01

    The present communication deals with the secondary structure of 28S rDNA of two known monogenean species, Bifurcohaptor indicus and Thaparocleidus parvulus, which parasitize the gill filaments of a freshwater fish, Mystus vittatus, for the purpose of phylogenetic inference. Secondary structure data are best used as accessory taxonomic characters as their phylogenetic resolving power and confidence in validity. The secondary structure of the 28S rDNA transcript could provide information for identifying homologous nucleotide characters, useful for cladistic inference of relationships. Such structure data could be used as a taxonomic character. The study supports the view that species-level sequence variability renders the 28S sequence a unique window for examining the behavior of fast evolving, non-coding DNA sequences. It also confirms that molecular similarity present in various species could be host-induced. PMID:24431545

  16. Targeted Sequencing Reveals Large-Scale Sequence Polymorphism in Maize Candidate Genes for Biomass Production and Composition.

    Moses M Muraya

    A major goal of maize genomic research is to identify sequence polymorphisms responsible for phenotypic variation in traits of economic importance. Large-scale detection of sequence variation is critical for linking genes, or genomic regions, to phenotypes. However, due to its size and complexity, it remains expensive to generate whole genome sequences of sufficient coverage for divergent maize lines, even with access to next generation sequencing (NGS) technology. Because methods involving reduction of genome complexity, such as genotyping-by-sequencing (GBS), assess only a limited fraction of sequence variation, targeted sequencing of selected genomic loci offers an attractive alternative. We therefore designed a sequence capture assay to target 29 Mb genomic regions and surveyed a total of 4,648 genes possibly affecting biomass production in 21 diverse inbred maize lines (7 flints, 14 dents). Captured and enriched genomic DNA was sequenced using the 454 NGS platform to 19.6-fold average depth coverage, and a broad evaluation of read alignment and variant calling methods was performed to select optimal procedures for variant discovery. Sequence alignment with the B73 reference and de novo assembly identified 383,145 putative single nucleotide polymorphisms (SNPs), of which 42,685 were non-synonymous alterations and 7,139 caused frameshifts. Presence/absence variation (PAV) of genes was also detected. We found that substantial sequence variation exists among genomic regions targeted in this study, which was particularly evident within coding regions. This diversification has the potential to broaden functional diversity and generate phenotypic variation that may lead to new adaptations and the modification of important agronomic traits. Further, annotated SNPs identified here will serve as useful genetic tools and as candidates in searches for phenotype-altering DNA variation. In summary, we demonstrated that sequencing of captured DNA is a powerful

  17. Isolation of Hox cluster genes from insects reveals an accelerated sequence evolution rate.

    Heike Hadrys

    Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda) that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution.

  18. Nucleotide Sequence of the Chromosomal ampC Gene of Enterobacter aerogenes

    Preston, Karen E.; Radomski, Christopher C. A.; Venezia, Richard A.

    2000-01-01

    The AmpC β-lactamase gene and a small portion of the regulatory ampR sequence of Enterobacter aerogenes 97B were cloned and sequenced. The β-lactamase had an isoelectric point of 8 and conferred cephalosporin and cephamycin resistance on the host. The sequence of the cloned gene is most closely related to those of the ampC genes of E. cloacae and C. freundii.

  19. Sense-antisense gene pairs: sequence, transcription, and structure are not conserved between human and mouse

    Wood, Emily J.; Chin-Inmanu, Kwanrutai; Jia, Hui; Lipovich, Leonard

    2013-01-01

    Previous efforts to characterize conservation between the human and mouse genomes focused largely on sequence comparisons. These studies are inherently limited because they don't account for gene structure differences, which may exist despite genomic sequence conservation. Recent high-throughput transcriptome studies have revealed widespread and extensive overlaps between genes, and transcripts, encoded on both strands of the genomic sequence. This overlapping gene organization, which produce...

  20. Characterization of fusion genes and the significantly expressed fusion isoforms in breast cancer by hybrid sequencing

    Weirather, Jason L.; Afshar, Pegah Tootoonchi; Clark, Tyson A.; Tseng, Elizabeth; Powers, Linda S.; Underwood, Jason G; Zabner, Joseph; Korlach, Jonas; Wong, Wing Hung; Au, Kin Fai

    2015-01-01

    We developed an innovative hybrid sequencing approach, IDP-fusion, to detect fusion genes, determine fusion sites and identify and quantify fusion isoforms. IDP-fusion is the first method to study gene fusion events by integrating Third Generation Sequencing long reads and Second Generation Sequencing short reads. We applied IDP-fusion to PacBio data and Illumina data from the MCF-7 breast cancer cells. Compared with the existing tools, IDP-fusion detects fusion genes at higher precision and ...

  1. Nucleotide Sequence of a Chicken Vitellogenin Gene and Derived Amino Acid Sequence of the Encoded Yolk Precursor Protein

    Schip, Fred D. van het; Samallo, John; Broos, Jaap; Ophuis, Jan; Mojet, Mart; Gruber, Max; AB, Geert

    1987-01-01

    The gene encoding the major vitellogenin from chicken has been completely sequenced and its exon-intron organization has been established. The gene is 20,342 base-pairs long and contains 35 exons with a combined length of 5787 base-pairs. They encode the 1850-amino acid pre-peptide of vitellogenin,

  2. CLONING AND SEQUENCING OF THE GENE FOR A LACTOCOCCAL ENDOPEPTIDASE, AN ENZYME WITH SEQUENCE SIMILARITY TO MAMMALIAN ENKEPHALINASE

    Mierau, Igor; Tan, Paris S.T.; Haandrikman, Alfred J.; Kok, Jan; Leenhouts, Kees J.; Konings, Wil N.; Venema, Gerard

    1993-01-01

    The gene specifying an endopeptidase of Lactococcus lactis, named pepO, was cloned from a genomic library of L. lactis subsp. cremoris P8-247 in lambdaEMBL3 and was subsequently sequenced. pepO is probably the last gene of an operon encoding the binding-protein-dependent oligopeptide transport syste

  3. Characterizations of Chinese isolates of Coxiella burnetii in the com1 gene sequence

    YU Quan; ZHANG Guo-quan; FUKUSHI Hideto; YAMAGUCHI Tsuyoshi; HIRAI Katsuya

    2002-01-01

    Objective: To characterize Chinese isolates of Coxiella burnetii genetically by comparing their com1 gene sequences. Methods: com1 gene sequences of Chinese isolates were amplified, sequenced, and compared with previously published data. Results: Three different com1 sequences were identified in 7 Chinese isolates. Sequence comparison indicated that the isolates harboring the QpRS plasmid could be defined as a new group and that isolates carrying the same plasmid type showed similar com1 gene sequences. Conclusion: The study suggests that grouping based on the com1 gene sequence is highly associated with the plasmid type of the isolates but little related to the disease forms and geographical origins of the isolates.

  4. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

    Manal Helal

    BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra
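
    The two ideas named in this abstract, a cluster 'centroid' that minimizes the distance to all other members and kNN assignment of unlabeled sequences, can be sketched as follows. This is a minimal illustration and not the authors' LM algorithm: the mismatch-fraction distance, the function names, and the toy aligned sequences are assumptions introduced here.

```python
from collections import defaultdict

def mismatch_fraction(a, b):
    """Fraction of differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def cluster_centroids(seqs, labels, dist=mismatch_fraction):
    """Map each cluster label to the index of its most representative member.

    The representative is the member whose summed distance to every member
    of the same cluster is smallest (a medoid).
    """
    members = defaultdict(list)
    for idx, label in enumerate(labels):
        members[label].append(idx)
    return {
        label: min(idxs, key=lambda i: sum(dist(seqs[i], seqs[j]) for j in idxs))
        for label, idxs in members.items()
    }

def knn_predict(query, seqs, labels, k=3, dist=mismatch_fraction):
    """Assign query the majority label among its k nearest labeled sequences."""
    nearest = sorted(range(len(seqs)), key=lambda i: dist(query, seqs[i]))[:k]
    votes = defaultdict(int)
    for i in nearest:
        votes[labels[i]] += 1
    return max(votes, key=votes.get)

if __name__ == "__main__":
    # Hypothetical aligned fragments standing in for 16S rRNA gene sequences.
    seqs = ["ACGTACGT", "ACGTACGA", "ACGTACCA", "TTGTAAGT", "TTGTAAGA", "TTGTCAGA"]
    labels = ["sp_A", "sp_A", "sp_A", "sp_B", "sp_B", "sp_B"]
    print(cluster_centroids(seqs, labels))
    print(knn_predict("ACGTACGG", seqs, labels, k=3))
```

    Real 16S rRNA comparisons would start from a multiple sequence alignment and probably use an evolutionary distance rather than a raw mismatch fraction.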

  5. Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

    Khan Shafiq A

    2003-06-01

    Background: Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results: 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion: The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells.

  6. Colorimetric biosensing of targeted gene sequence using dual nanoparticle platforms

    Thavanathan J

    2015-04-01

    Jeevan Thavanathan,1 Nay Ming Huang,1 Kwai Lin Thong2 1Low Dimension Material Research Center, Department of Physics, 2Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia Abstract: We have developed a colorimetric biosensor using a dual platform of gold nanoparticles and graphene oxide sheets for the detection of Salmonella enterica. The presence of the invA gene in S. enterica causes a change in color of the biosensor from its original pinkish-red to a light purplish solution. This occurs through the aggregation of the primary gold nanoparticles–conjugated DNA probe onto the surface of the secondary graphene oxide–conjugated DNA probe through DNA hybridization with the targeted DNA sequence. Spectrophotometry analysis showed a shift in wavelength from 525 nm to 600 nm with 1 µM of DNA target. Specificity testing revealed that the biosensor was able to detect various serovars of the S. enterica while no color change was observed with the other bacterial species. Sensitivity testing revealed the limit of detection was at 1 nM of DNA target. This proves the effectiveness of the biosensor in the detection of S. enterica through DNA hybridization. Keywords: biosensor, DNA hybridization, DNA probe, gold nanoparticles, graphene oxide, Salmonella enterica

  7. Isolation and characterization of gene sequences expressed in cotton fiber

    Taciana de Carvalho Coutinho

    2016-06-01

    ABSTRACT: Cotton fibers are tubular cells which develop from the differentiation of the ovule epidermis. In addition to being one of the most important natural fibers of the textile group, cotton fibers afford an excellent experimental system for studying the cell wall. The aim of this work was to isolate and characterise the genes expressed in cotton fiber (Gossypium hirsutum L.) to be used in future work in cotton breeding. Fibers of the cotton cultivar CNPA ITA 90 II were used to extract RNA for the subsequent generation of a cDNA library. Seventeen sequences were obtained, of which 14 were already described in the NCBI database (National Centre for Biotechnology Information), such as those encoding the lipid transfer proteins (LTPs) and arabinogalactans (AGP). However, other cDNAs such as the B05 clone, which displays homology with the glycosyltransferases, have still not been described for this crop. Nevertheless, results showed that several clones obtained in this study are associated with cell wall proteins, wall-modifying enzymes and lipid transfer proteins directly involved in fiber development.

  8. Use of gene sequence analyses and genome comparisons for yeast systematics

    Detection, identification, and classification of yeasts has undergone a major transformation in the past decade and a half following application of gene sequence analyses and genome comparisons. Development of a database (barcode) of easily determined gene sequences from domains 1 and 2 of large sub...

  9. Human cysteine-proteinase inhibitors: nucleotide sequence analysis of three members of the cystatin gene family.

    Saitoh, E; Kim, H S; Smithies, O; Maeda, N

    1987-01-01

    Three genes from the human cystatin gene family of cysteine-proteinase inhibitors have been isolated from a bacteriophage lambda library containing HindIII digests of human genomic DNA. Two of the genes code for salivary cystatin SN and SA, the third is a pseudogene. The cloned genes were identified with a probe made from a salivary cystatin cDNA. The complete nucleotide sequence of the gene that codes for the precursor form of the neutral salivary protein, cystatin SN, was determined. The gene, which we name CST1, contains three exons and two intervening sequences. The expected CAT and ATA boxes are present in the 5'-flanking region of the gene. Partial nucleotide sequence determination of a second gene revealed that it codes for the precursor form of the acidic salivary protein, cystatin SA. This gene, which we name CST2, has the same gene organization as CST1. The complete nucleotide sequence of a third gene was determined. It does not contain a typical ATA box, and in addition, a premature stop codon and a frameshift deletion mutation occur within the gene. These inactivation mutations show that this gene, which we name CSTP1, is a cystatin pseudogene. These data combined with our genomic Southern-blot analyses show that the cystatin genes form a multigene family with at least seven members. PMID:3446578

  10. PCR amplification and sequence analysis of the rat Sox3 gene

    Krstić A.

    2008-01-01

    Full Text Available The Sox3 gene is considered to be one of the earliest neural markers in vertebrates, playing a role in specifying neuronal fate. Despite the completion of a rat genome sequencing project, only a partial sequence of the rat Sox3 gene has been available in the public database. Using PCR, sequencing, and bioinformatics tools, in this study we have determined the complete coding sequence of the rat Sox3 gene encoding 449 amino acids. Comparative analysis of rat and human SOX3 proteins revealed a high degree of conservation. Identification of the rat Sox3 gene sequence would help in understanding the biological roles of this gene and provide insight into evolutionary relationships with vertebrate orthologs.

  11. CLONING AND SEQUENCING OF MATURED FRAGMENT OF HUMAN NERVE GROWTH FACTOR GENE

    马巍; 吴玲; 王德利; 刘淼; 任惠民; 杨广笑; 王全颖

    2003-01-01

    Objective: To clone and sequence the mature fragment of the human nerve growth factor (NGF) gene. Methods: Human genomic DNA extracted from white blood cells was used as the template, and the NGF gene was cloned by PCR and T-vector cloning. Positive clones were screened and identified by restriction enzyme digestion, and the cloned amplified fragment was then sequenced and analyzed. Results: Comparison of the cloned NGF gene with the GenBank sequence (V01511) demonstrated that the two sequences were identical, each 354 bp in length. Conclusion: Cloning the NGF gene from human genomic DNA paves the way for further study of gene therapy for nervous system injury.

  12. DNA sequence of the lactose operon: the lacA gene and the transcriptional termination region.

    Hediger, M A; Johnson, D F; Nierlich, D P; Zabin, I

    1985-01-01

    The lac operon of Escherichia coli spans approximately 5300 base pairs and includes the lacZ, lacY, and lacA genes in addition to the operator, promoter, and transcription termination regions. We report here the sequence of the lacA gene and the region distal to it, confirming the sequence of thiogalactoside transacetylase and completing the sequence of the lac operon. The lacA gene is characterized by use of rare codons, suggesting an origin from a plasmid, transposon, or virus gene. UUG is ...

  13. Sequence homologies in the 5' regions of four Drosophila heat-shock genes.

    Holmgren, R; Corces, V; Morimoto, R; Blackman, R; Meselson, M

    1981-01-01

    We report nucleotide sequences of the regions surrounding the 5' ends of the genes for Drosophila melanogaster heat-shock proteins hsp83, hsp68, and hsp26, located at chromosome positions 63BC, 95D, and 67B, respectively. As in other eukaryotic genes, the sequence T-A-T-A-A-A-A-T occurs about 30 nucleotides upstream from the sites of mRNA initiation. Three additional sequence homologies and a dyad symmetry were noted at approximately corresponding locations in the three genes and in the gene ...

  14. Complete mitochondrial genome DNA sequence for two ophiuroids and a holothuroid: the utility of protein gene sequence and gene maps in the analyses of deep deuterostome phylogeny.

    Scouras, Andrea; Beckenbach, Karen; Arndt, Allan; Smith, Michael J

    2004-04-01

    The complete mitochondrial genome sequences have been determined for the holothuroid Cucumaria miniata and two ophiuroid species Ophiopholis aculeata and Ophiura lütkeni. In addition, the nucleotide sequence of the mitochondrial protein-coding genes for the asteroid Pisaster ochraceus has been completed. Maximum-likelihood and LogDet distance analyses of concatenated protein-coding sequences produced a series of trees that did not conclusively support generally accepted models of echinoderm phylogeny. The ophiuroid data consistently demonstrated accelerated nucleotide divergence rates and lack of stationarity. This confounds the phylogenetic analyses. Molecular investigations using individual protein-coding gene alignments demonstrated that the cytochrome b gene exhibits the least deviation in rate and stationarity and generated some trees consistent with proposed echinoderm phylogenies. Phylogenies based on echinoderm mitochondrial gene rearrangements also proved problematic because of extensive variation in gene order between and within classes. A comparison of the two distinctive ophiuroid mitochondrial gene orders supports the hypothesis that O. lütkeni has a more derived mitochondrial gene order versus O. aculeata. The variation in the echinoderm mitochondrial gene maps reinforces the limitations of the application of mitochondrial gene rearrangements as a global phylogenetic tool. PMID:15019608

  15. Structural analysis of DNA sequence: evidence for lateral gene transfer in Thermotoga maritima

    Worning, Peder; Jensen, Lars Juhl; Nelson, K. E.;

    2000-01-01

    The recently published complete DNA sequence of the bacterium Thermotoga maritima provides evidence, based on protein sequence conservation, for lateral gene transfer between Archaea and Bacteria. We introduce a new method of periodicity analysis of DNA sequences, based on structural parameters, ...

  16. [Analysis of full-length gene sequence of rabies vaccine virus aG strain].

    Li, Jia; Cao, Shou-Chun; Shi, Lei-Tai; Wu, Xiao-Hong; Liu, Jing-Hua; Wang, Yun-Peng; Tang, Jian-Rong; Yu, Yong-Xin; Dong, Guan-Mu

    2013-06-01

    The aim was to sequence and analyze the full-length gene sequence of rabies vaccine virus aG strain. The full-length gene sequence of the aG strain was amplified by RT-PCR in 8 fragments; each PCR product was cloned into the pGEM-T vector, sequenced and assembled. The 5' leader sequence was determined by 5' RACE. The homology between aG and other rabies vaccine viruses was analyzed using DNAstar and Mega 4.0 software. The aG strain was 11,925 nt in length (GenBank accession number: JN234411) and belonged to genotype I. Bioinformatic analysis revealed that the degree of homology differed among the rabies vaccine viruses compared. The full-length gene sequence of rabies vaccine virus aG strain provides support for improving the quality control standard for virus strains used in the production of rabies vaccine for human use in China. PMID:23895005

  17. Analysis of Mixed Sequencing Chromatograms and Its Application in Direct 16S rRNA Gene Sequencing of Polymicrobial Samples

    Kommedal, Øyvind; Karlsen, Bjarte; Sæbø, Øystein

    2008-01-01

    Investigation of clinical samples by direct 16S rRNA gene sequencing provides the possibility to detect nonviable bacteria and bacteria with special growth requirements. This approach has been particularly valuable for the diagnosis of patients who have received antibiotics prior to sample collection. In specimens containing more than one bacterium, direct sequencing gives mixed chromatograms that complicate further interpretation. We designed an algorithm able to analyze these ambiguous chro...

  18. Contribution of the Caspase Gene Sequence Diversification to the Specifically Antiviral Defense in Invertebrate

    Bin Zhi; Lei Wang; Guangyi Wang; Xiaobo Zhang

    2011-01-01

    Vertebrates achieve adaptive immunity of all sorts against pathogens through the diversification of antibodies. However the mechanism of invertebrates' innate immune defense against various pathogens remains largely unknown. Our study used shrimp and white spot syndrome virus (WSSV) to show that PjCaspase, a caspase gene of shrimp that is crucial in apoptosis, possessed gene sequence diversity. At present, the role of gene sequence diversity in immunity has not been characterized. To address ...

  19. Characterization and phylogenetic analysis of α-gliadin gene sequences reveals significant genomic divergence in Triticeae species

    Guang-Rong Li; Tao Lang; En-Nian Yang; Cheng Liu; Zu-Jun Yang

    2014-12-01

    Although the unique properties of the wheat α-gliadin gene family are well characterized, little is known about the evolution and genomic divergence of the α-gliadin gene family within the Triticeae. We isolated a total of 203 α-gliadin gene sequences from 11 representative diploid and polyploid Triticeae species, and found 108 of the sequences to be putatively functional. Our results indicate that α-gliadin genes may have originated from wild Secale species, where the sequences contain the shortest repetitive domains and display minimum variation. A miniature inverted-repeat transposable element insertion is reported for the first time in an α-gliadin gene sequence of Thinopyrum intermedium in this study, indicating that the transposable element might have contributed to the diversification of the α-gliadin gene family among Triticeae genomes. The phylogenetic analyses revealed that the α-gliadin gene sequences of Dasypyrum, Australopyrum, Lophopyrum, Eremopyrum and Pseudoroegneria species have amplified several times. A search for four typical toxic epitopes for celiac disease within the Triticeae α-gliadin gene sequences showed that the α-gliadins of wild Secale, Australopyrum and Agropyron genomes lack all four epitopes, while other Triticeae species have accumulated these epitopes, suggesting that the evolution of these toxic epitope sequences occurred during the course of speciation, domestication or polyploidization of Triticeae.

  20. Complete Sequence Construction of the Highly Repetitive Ribosomal RNA Gene Repeats in Eukaryotes Using Whole Genome Sequence Data.

    Agrawal, Saumya; Ganley, Austen R D

    2016-01-01

    The ribosomal RNA genes (rDNA) encode the major rRNA species of the ribosome, and thus are essential across life. These genes are highly repetitive in most eukaryotes, forming blocks of tandem repeats that form the core of nucleoli. The primary role of the rDNA in encoding rRNA has been long understood, but more recently the rDNA has been implicated in a number of other important biological phenomena, including genome stability, cell cycle, and epigenetic silencing. Noncoding elements, primarily located in the intergenic spacer region, appear to mediate many of these phenomena. Although sequence information is available for the genomes of many organisms, in almost all cases rDNA repeat sequences are lacking, primarily due to problems in assembling these intriguing regions during whole genome assemblies. Here, we present a method to obtain complete rDNA repeat unit sequences from whole genome assemblies. Limitations of next generation sequencing (NGS) data make them unsuitable for assembling complete rDNA unit sequences; therefore, the method we present relies on the use of Sanger whole genome sequence data. Our method makes use of the Arachne assembler, which can assemble highly repetitive regions such as the rDNA in a memory-efficient way. We provide a detailed step-by-step protocol for generating rDNA sequences from whole genome Sanger sequence data using Arachne, for refining complete rDNA unit sequences, and for validating the sequences obtained. In principle, our method will work for any species where the rDNA is organized into tandem repeats. This will help researchers working on species without a complete rDNA sequence, those working on evolutionary aspects of the rDNA, and those interested in conducting phylogenetic footprinting studies with the rDNA. PMID:27576718

  1. Sequencing analysis reveals a unique gene organization in the gyrB region of Mycoplasma hominis

    Ladefoged, Søren; Christiansen, Gunna

    The homolog of the gyrB gene, which has been reported to be present in the vicinity of the initiation site of replication in bacteria, was mapped on the Mycoplasma hominis genome, and the region was subsequently sequenced. Five open reading frames were identified flanking the gyrB gene, one of... which showed similarity to that which encodes the LicA protein of Haemophilus influenzae. The organization of the genes in the region showed no resemblance to that in the corresponding regions of other bacteria sequenced so far. The gyrA gene was mapped 35 kb downstream from the gyrB gene...

  2. Disparate sequence characteristics of the Erysiphe graminis f.sp. hordei glyceraldehyde-3-phosphate dehydrogenase gene

    Christiansen, S.K.; Justesen, A.F.; Giese, H.

    1997-01-01

    ..., Egh falls into the group of Ascomycetes located at a basal position. The regulatory region of the Egh gpd gene has no homology to corresponding sequences in other filamentous Ascomycetes. Codon usage was determined for the four characterized Egh genes (tub2, Egh7, Egh16 and gpd) and found to be... and plant genes in sequence mixtures. The Egh gpd promoter appears to be superior to that of the Egh beta-tubulin gene (tub2) for driving the E. coli beta-glucuronidase (GUS) gene in transformation experiments...

  3. Cloning, sequence analysis, and hyperexpression of the genes encoding phosphotransacetylase and acetate kinase from Methanosarcina thermophila.

    Latimer, M T; Ferry, J G

    1993-01-01

    The genes for the acetate-activating enzymes, acetate kinase and phosphotransacetylase (ack and pta), from Methanosarcina thermophila TM-1 were cloned and sequenced. Both genes are present in only one copy per genome, with the pta gene adjacent to and upstream of the ack gene. Consensus archaeal promoter sequences are found upstream of the pta coding region. The pta and ack genes encode predicted polypeptides with molecular masses of 35,198 and 44,482 Da, respectively. A hydropathy plot of th...

  4. Strong association between pseudogenization mechanisms and gene sequence length

    Harrison Paul M; Khachane Amit N

    2009-01-01

    Abstract Pseudogenes arise from the decay of gene copies following either RNA-mediated duplication (processed pseudogenes) or DNA-mediated duplication (nonprocessed pseudogenes). Here, we show that long protein-coding genes tend to produce more nonprocessed pseudogenes than short genes, whereas the opposite is true for processed pseudogenes. Protein-coding genes longer than 3000 bp are 6 times more likely to produce nonprocessed pseudogenes than processed ones. Reviewers This article was revi...

  5. Flagellar apparatus gene sequences of Aeromonas hydrophila AL09-73 isolate

    Flagellar apparatus genes of the recent outbreak isolate Aeromonas hydrophila AL09-73 were sequenced and characterized. A total of 28 flagellar genes were identified. The genes range in size from 318 to 2001 nucleotides and potentially encode different complex flagellar proteins. At nucleotide and...

  6. Cloning, sequence analysis, and characterization of the genes involved in isoprimeverose metabolism in Lactobacillus pentosus

    Chaillou, S.; Lokman, B.C.; Leer, R.J.; Posthuma, C.; Postma, P.W.; Pouwels, P.H.

    1998-01-01

    Two genes, xylP and xylQ, from the xylose regulon of Lactobacillus pentosus were cloned and sequenced. Together with the repressor gene of the regulon, xylR, the xylPQ genes form an operon which is inducible by xylose and which is transcribed from a promoter located 145 bp upstream of xylP. A putati

  7. Cloning and sequencing of the gene encoding thermophilic beta-amylase of Clostridium thermosulfurogenes.

    Kitamoto, N; Yamagata, H; Kato, T; Tsukagoshi, N; Udaka, S

    1988-01-01

    A gene coding for thermophilic beta-amylase of Clostridium thermosulfurogenes was cloned into Bacillus subtilis, and its nucleotide sequence was determined. The nucleotide sequence suggested that the thermophilic beta-amylase is translated from monocistronic mRNA as a secretory precursor with a signal peptide of 32 amino acid residues. The deduced amino acid sequence of the mature beta-amylase contained 519 residues with a molecular weight of 57,167. The amino acid sequence of the C. thermosu...

  8. Nucleotide sequence of a cyanobacterial nifH gene coding for nitrogenase reductase

    Mevarech, Moshe; Rice, Douglas; Haselkorn, Robert

    1980-01-01

    The nucleotide sequence of nifH, the structural gene for nitrogenase reductase (component II or Fe protein of nitrogenase) from the cyanobacterium Anabaena 7120 has been determined. Also reported are 194 bases of the 5′-flanking sequence and 170 bases of the 3′-flanking sequence. The predicted amino acid sequence was compared with that determined for the complete nitrogenase reductase of Clostridium pasteurianum and the cysteine-containing peptides of the protein from Azotobacter vinelandii. ...

  9. Nucleotide sequences of immunoglobulin epsilon genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution.

    Sakoyama, Y; Hong, K J; Byun, S. M.; Hisajima, H; Ueda, S; Yaoita, Y; Hayashida, H; Miyata, T.; Honjo, T

    1987-01-01

    To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin epsilon-chain (C epsilon 1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human epsilon-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regio...

  10. Cloning and sequencing of the bovine gastrin gene

    Lund, T; Rehfeld, J F; Olsen, Jørgen

    1989-01-01

    In order to deduce the primary structure of bovine preprogastrin we therefore sequenced a gastrin DNA clone isolated from a bovine liver cosmid library. Bovine preprogastrin comprises 104 amino acids and consists of a signal peptide, a 37 amino acid spacer-sequence, the gastrin-34 sequence follow...... by an amidation-site (Gly-Arg-Arg), and a C-terminal nonapeptide. Comparison with human, porcine, and rat cDNA sequences revealed extensive homology in the coding region as well as in short noncoding structures....

  11. Intergenic DNA sequences flanking the pseudo alpha globin genes of human and chimpanzee.

    Sawada, I; Beal, M P; Shen, C K; Chapman, B.; Wilson, A C; Schmid, C.

    1983-01-01

    We have determined the sequence of 2400 base pairs upstream from the human pseudo alpha globin (psi alpha) gene, and for comparison, 1100 base pairs of DNA within and upstream from the chimpanzee psi alpha gene. The region upstream from the promoter of the psi alpha gene shows no significant homology to the intergenic regions of the adult alpha 2 and alpha 1 globin genes. The chimpanzee gene has a coding defect in common with the human psi alpha gene, showing that the product of this gene, if...

  12. Mouse mammary tumor virus-like gene sequences are present in lung patient specimens

    Rodríguez-Padilla Cristina

    2011-09-01

    Full Text Available Background: Previous studies have reported the presence of Murine Mammary Tumor Virus (MMTV)-like gene sequences in human cancer tissue specimens. Here, we search for MMTV-like gene sequences in lung disease specimens, including carcinomas, from a Mexican population. This study builds on our previous report that the INER51 lung cancer cell line, derived from a pleural effusion of a Mexican patient, contains MMTV-like env gene sequences. Results: MMTV-like env gene sequences were detected in three of the 18 specimens studied by PCR using a specific set of MMTV-like primers. The three identified MMTV-like gene sequences, designated INER6, HZ101, and HZ14, were 99%, 98%, and 97% homologous, respectively, to GenBank sequence accession number AY161347. The INER6 and HZ-101 samples were isolated from lung cancer specimens, and HZ-14 was isolated from an acute inflammatory lung infiltrate sample. Two of the env sequences exhibited disruption of the reading frame due to mutations. Conclusion: In summary, we identified MMTV-like gene sequences in 2 of 11 (18%) lung carcinomas and 1 of 7 (14%) acute inflammatory lung infiltrate specimens from a Mexican population.

  13. Characterization of promoter sequence of toll-like receptor genes in Vechur cattle

    R. Lakshmi

    2016-06-01

    Full Text Available Aim: To compare the promoter sequences of toll-like receptor (TLR) genes in Vechur cattle, an indigenous breed of Kerala, with those of Bos taurus and to assess differences that could be attributed to innate immune responses against bovine mastitis. Materials and Methods: Blood samples were collected from the jugular vein of Vechur cattle maintained at the Vechur cattle conservation center of Kerala Veterinary and Animal Sciences University, using an acid-citrate-dextrose anticoagulant. Genomic DNA was extracted, and polymerase chain reaction was carried out to amplify the promoter regions of the TLRs. The amplified products of the TLR2, 4, and 9 promoter regions were sequenced by the Sanger enzymatic DNA sequencing technique. Results: The promoter region of TLR2 of Vechur cattle showed 98% similarity with the B. taurus sequence in GenBank and revealed variants in four sequence motifs. The promoter region of TLR4 of Vechur cattle showed 99% similarity with the B. taurus sequence but revealed no significant variants in motif regions; however, two heterozygous loci were observed in the chromatogram. The promoter sequence of the TLR9 gene also showed 99% similarity to the B. taurus sequence and revealed variants in four sequence motifs. Conclusion: The results of this study indicate significant variation in the promoters of the TLR2 and TLR9 genes in the Vechur cattle breed, which may potentially influence the innate immune response against mastitis.

  14. The Arabidopsis Root Transcriptome by Serial Analysis of Gene Expression. Gene Identification Using the Genome Sequence

    Fizames, Cécile; Muños, Stéphane; Cazettes, Céline; Nacry, Philippe; Boucherez, Jossia; Gaymard, Frédéric; Piquemal, David; Delorme, Valérie; Commes, Thérèse; Doumas, Patrick; Cooke, Richard; Marti, Jacques; Sentenac, Hervé; Gojon, Alain

    2004-01-01

    Large-scale identification of genes expressed in roots of the model plant Arabidopsis was performed by serial analysis of gene expression (SAGE), on a total of 144,083 sequenced tags, representing at least 15,964 different mRNAs. For tag to gene assignment, we developed a computational approach based on 26,620 genes annotated from the complete sequence of the genome. The procedure selected warrants the identification of the genes corresponding to the majority of the tags found experimentally, with a high level of reliability, and provides a reference database for SAGE studies in Arabidopsis. This new resource allowed us to characterize the expression of more than 3,000 genes, for which there is no expressed sequence tag (EST) or cDNA in the databases. Moreover, 85% of the tags were specific for one gene. To illustrate this advantage of SAGE for functional genomics, we show that our data allow an unambiguous analysis of most of the individual genes belonging to 12 different ion transporter multigene families. These results indicate that, compared with EST-based tag to gene assignment, the use of the annotated genome sequence greatly improves gene identification in SAGE studies. However, more than 6,000 different tags remained with no gene match, suggesting that a significant proportion of transcripts present in the roots originate from yet unknown or wrongly annotated genes. The root transcriptome characterized in this study markedly differs from those obtained in other organs, and provides a unique resource for investigating the functional specificities of the root system. As an example of the use of SAGE for transcript profiling in Arabidopsis, we report here the identification of 270 genes differentially expressed between roots of plants grown either with NO3- or NH4NO3 as N source. PMID:14730065
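    The tag-to-gene assignment described in this abstract can be illustrated with a rough sketch. This is not the authors' procedure: it assumes the common SAGE convention of a 10-base tag taken immediately downstream of the 3'-most CATG (NlaIII) site of each transcript, and the names `virtual_tag`, `build_tag_index` and `classify_tag` are hypothetical.

```python
from collections import defaultdict

ANCHOR = "CATG"     # assumed anchoring-enzyme site (NlaIII); not stated in the abstract
TAG_LENGTH = 10     # assumed tag length; not stated in the abstract


def virtual_tag(transcript):
    """Return the tag following the 3'-most anchor site, or None if unavailable."""
    seq = transcript.upper()
    pos = seq.rfind(ANCHOR)
    if pos == -1:
        return None
    start = pos + len(ANCHOR)
    tag = seq[start:start + TAG_LENGTH]
    # Discard transcripts with too few bases after the last anchor site.
    return tag if len(tag) == TAG_LENGTH else None


def build_tag_index(transcripts):
    """Map each virtual tag to the set of gene names that produce it."""
    index = defaultdict(set)
    for gene, seq in transcripts.items():
        tag = virtual_tag(seq)
        if tag is not None:
            index[tag].add(gene)
    return index


def classify_tag(observed_tag, index):
    """Label an experimental tag as 'unique', 'ambiguous' or 'no_match'."""
    genes = index.get(observed_tag, set())
    if not genes:
        return "no_match", genes
    return ("unique" if len(genes) == 1 else "ambiguous"), genes


# Toy transcripts (invented for illustration only).
toy = {
    "geneA": "AAACATGTTTTTCATGACGTACGTACGG",
    "geneB": "GGGCATGACGTACGTACTT",
    "geneC": "TTCATGGGGGGGGGGGAA",
}
tag_index = build_tag_index(toy)
```

    In this toy data, geneA and geneB yield the same tag, which mirrors the abstract's distinction between tags specific to one gene and tags that cannot be assigned unambiguously; a tag absent from the index corresponds to the "no gene match" case.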

  15. Identification of true EST alignments and exon regions of gene sequences

    ZHOU Yanhong; JING Hui; LI Yanen; LIU Huailan

    2004-01-01

    Expressed sequence tags (ESTs), which have accumulated in considerable numbers, provide a valuable resource for finding new genes and disease-relevant genes, and for recognizing alternative splicing variants, SNP sites, etc. The prerequisite for carrying out such research is to correctly ascertain which ESTs are related to a given gene sequence. Based on analysis of the alignment results between some known gene sequences and ESTs in public databases, several measures including Identity Check, Gap Check, Inclusion Check and Length Check have been introduced to judge whether an EST alignment is related to a gene sequence or not. A computational program, EDSAc1.0, has been developed to identify true EST alignments and exon regions of query gene sequences. When tested with human gene sequences in the standard dataset HMR195 and evaluated with the standard measures of gene prediction performance, EDSAc1.0 can identify protein-coding regions with a specificity of 0.997 and a sensitivity of 0.88 at the nucleotide level, outperforming its counterpart TAP. A web server of EDSAc1.0 is available at http://infosci.hust.edu.cn.
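    The nucleotide-level figures quoted in this abstract can be illustrated with a small sketch. It assumes the definitions conventionally used in gene-prediction benchmarks, sensitivity = TP / (TP + FN) and specificity = TP / (TP + FP), counted per nucleotide; the function name and the half-open interval representation are hypothetical, and this is not code from EDSAc1.0.

```python
def nucleotide_accuracy(true_exons, predicted_exons):
    """Return (sensitivity, specificity) at the nucleotide level.

    Exons are given as half-open (start, end) intervals. Specificity here is
    the gene-prediction convention TP / (TP + FP), an assumption of this sketch.
    """
    true_positions = set()
    for start, end in true_exons:
        true_positions.update(range(start, end))

    predicted_positions = set()
    for start, end in predicted_exons:
        predicted_positions.update(range(start, end))

    # Coding nucleotides that were also predicted as coding.
    tp = len(true_positions & predicted_positions)
    sensitivity = tp / len(true_positions) if true_positions else 0.0
    specificity = tp / len(predicted_positions) if predicted_positions else 0.0
    return sensitivity, specificity


# Toy example: one 100-nt annotated exon; the prediction misses its first
# 20 nt and adds a spurious 20-nt exon elsewhere.
sn, sp = nucleotide_accuracy([(0, 100)], [(20, 100), (200, 220)])
```

    In the toy example, 80 of the 100 true coding nucleotides are recovered and 80 of the 100 predicted nucleotides are correct, so both measures equal 0.8.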

  16. Biased distribution of DNA uptake sequences towards genome maintenance genes

    Davidsen, T.; Rodland, E.A.; Lagesen, K.;

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...

  17. Escherichia coli rep gene: sequence of the gene, the encoded helicase, and its homology with uvrD.

    Gilchrist, C A; Denhardt, D T

    1987-01-01

    The sequence of a 2.67-kilobase section of the Escherichia coli chromosome that contains the rep gene has been determined. This gene codes for a protein of predicted Mr 72,800, a DNA helicase, which is also a single-stranded DNA-dependent ATPase. The sequenced region contains an open reading frame of the correct length and orientation to encode the Rep protein. A secondary structure for the protein can be formulated from the amino acid sequence. We have compared both the primary and the secon...

  18. Cloning and Sequence Analysis of Y-box Binding Protein Gene in Min Pig

    Zhang Dong-jie; Liu Di; Wang Liang; He Xin-miao; Wang Wen-tao

    2014-01-01

    To study the Min pig Y-box binding protein (YB-1) gene, its complete coding sequence was cloned by RT-PCR, and the sequence features were analyzed with software tools and online resources. The complete CDS of Min pig YB-1 was found to be 975 bp long, encoding 324 amino acids. The protein contained a conserved cold shock domain and several phosphorylation sites but no transmembrane domains, consistent with a protein found in the cytoplasm. The Min pig YB-1 nucleotide sequence shared high similarity (61.37%-97.66%) with those of other mammals.
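    The relationship between the reported CDS length and protein length can be checked with a short calculation. The sketch below assumes that the 975 bp figure includes the stop codon, which the abstract does not state explicitly; the function name is hypothetical.

```python
def protein_length_from_cds(cds_length_bp, includes_stop=True):
    """Number of amino acids encoded by a CDS of the given length in bp."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    codons = cds_length_bp // 3
    # The stop codon occupies one codon but encodes no amino acid.
    return codons - 1 if includes_stop else codons


yb1_residues = protein_length_from_cds(975)
```

    Under that assumption, 975 bp corresponds to 325 codons, one of which is the stop codon, giving the 324 amino acids reported.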

  19. Molecular cloning and sequencing of the gene encoding the fimbrial subunit protein of Bacteroides gingivalis.

    Dickinson, D P; Kubiniec, M A; Yoshimura, F; Genco, R J

    1988-01-01

    The gene encoding the fimbrial subunit protein of Bacteroides gingivalis 381, fimbrilin, has been cloned and sequenced. The gene was present as a single copy on the bacterial chromosome, and the codon usage in the gene conformed closely to that expected for an abundant protein. The predicted size of the mature protein was 35,924 daltons, and the secretory form may have had a 10-amino-acid, hydrophilic leader sequence similar to the leader sequences of the MePhe fimbriae family. The protein se...

  20. Targeting of AID-mediated sequence diversification to immunoglobulin genes

    Kothapalli, Naga Rama; Fugmann, Sebastian D.

    2011-01-01

    Activation-induced cytidine deaminase (AID) is a key enzyme for antibody-mediated immune responses. Antibodies are encoded by the immunoglobulin genes, and AID acts as a transcription-dependent DNA mutator on these genes to improve antibody affinity and effector functions. An emerging theme in the field is that many transcribed genes are potential targets of AID, presenting an obvious danger to genomic integrity. Thus there are mechanisms in place to ensure that mutagenic outcomes of AID activity ...

  1. Cloning, sequencing and identification of single nucleotide polymorphisms of partial sequence on the porcine CACNA1S gene

    FANG XiaoMin; XU NingYing; REN ShouWen

    2008-01-01

    The CACNA1S gene encodes the α1 subunit of the calcium channel. Mutations in the CACNA1S gene can cause hypokalemic periodic paralysis (HypoKPP) and malignant hyperthermia syndrome (MHS) in humans. Research on CACNA1S to date has been carried out mainly in humans and model animals, and rarely in livestock and poultry. In this study, Yorkshire pigs (23), Pietrain pigs (30), Jinhua pigs (115) and second-generation crossbreds of Jinhua and Pietrain (126) were used. Primers were designed according to the sequence of the human CACNA1S gene, and PCR was carried out using pig genomic DNA. PCR products were sequenced and compared with the human sequence, single nucleotide polymorphisms (SNPs) were then investigated by PCR-SSCP, and PCR-RFLP tests were performed to validate the mutations. The results indicated that: (1) 5211 bp of DNA fragments of the porcine CACNA1S gene were acquired (GenBank accession number: DQ767693), and the identity of the exon region between human and pig was 82.6%; (2) fifty-seven mutations were found within the cloned sequences, of which 24 were in exon regions; (3) the results of PCR-RFLP were in accordance with those of PCR-SSCP. According to the ESTs of the porcine CACNA1S gene published in GenBank (Bx914582, Bx666997), 8 of the 11 SNPs identified in the present study were consistent with the base differences between the two EST fragments.

  2. Genome-wide analysis reveals diverged patterns of codon bias, gene expression, and rates of sequence evolution in Picea gene families

    De La Torre, Amanda R; Lin, Yao-Cheng; van de Peer, Yves; Pär K Ingvarsson

    2015-01-01

    The recent sequencing of several gymnosperm genomes has greatly facilitated studying the evolution of their genes and gene families. In this study, we examine the evidence for expression-mediated selection in the first two fully sequenced representatives of the gymnosperm plant clade (Picea abies and Picea glauca). We use genome-wide estimates of gene expression (> 50,000 expressed genes) to study the relationship between gene expression, codon bias, rates of sequence divergence, protein l...

  3. Molecular systematics of the genus Sigmodon: results from mitochondrial and nuclear gene sequences

    Henson, Dallas D.; Bradley, Robert D.

    2009-01-01

    Phylogenetic relationships within the genus Sigmodon Say and Ord, 1825 were examined using sequence data from multiple gene regions, including exon 1 of the nuclear-encoded interphotoreceptor retinoid binding protein, intron 7 of the nuclear beta-fibrinogen gene, and the mitochondrial cytochrome b gene from 27 individuals representing 11 species of Sigmodon. Nuclear genes were analyzed independently, combined with each other, and combined with the mitochondrial data. Topologies were construct...

  4. Prokaryotic genes in eukaryotic genome sequences: when to infer horizontal gene transfer and when to suspect an actual microbe.

    Artamonova, Irena I; Lappi, Tanya; Zudina, Liudmila; Mushegian, Arcady R

    2015-07-01

    Assessment of phylogenetic positions of predicted gene and protein sequences is a routine step in any genome project, useful for validating the species' taxonomic position and for evaluating hypotheses about genome evolution and function. Several recent eukaryotic genome projects have reported multiple gene sequences that were much more similar to homologues in bacteria than to any eukaryotic sequence. In the spirit of the times, horizontal gene transfer from bacteria to eukaryotes has been invoked in some of these cases. Here, we show, using comparative sequence analysis, that some of those bacteria-like genes indeed appear likely to have been horizontally transferred from bacteria to eukaryotes. In other cases, however, the evidence strongly indicates that the eukaryotic DNA sequenced in the genome project contains a sample of non-integrated DNA from the actual bacteria, possibly providing a window into the host microbiome. Recent literature suggests also that common reagents, kits and laboratory equipment may be systematically contaminated with bacterial DNA, which appears to be sampled by metagenome projects non-specifically. We review several bioinformatic criteria that help to distinguish putative horizontal gene transfers from the admixture of genes from autonomously replicating bacteria in their hosts' genome databases or from the reagent contamination. PMID:25919787

  5. Chromosomal localization and sequence variation of 5S rRNA gene in five Capsicum species.

    Park, Y K; Park, K C; Park, C H; Kim, N S

    2000-02-29

    Chromosomal localization and sequence analysis of the 5S rRNA gene were carried out in five Capsicum species. Fluorescence in situ hybridization revealed that the chromosomal location of the 5S rRNA gene was conserved at a single locus on a chromosome that was assigned to chromosome 1 by its synteny relationship with tomato. In sequence analysis, the repeating units of the 5S rRNA genes in the Capsicum species varied in size from 278 bp to 300 bp. In a comparison of our results with published results for other Solanaceae plants, the coding region was highly conserved, but the spacer regions varied in size and sequence. T stretch regions, just after the end of the coding sequences, were more prominent in the Capsicum species than in two other plants. High G x C rich regions, which might function similarly to the GC islands in genes transcribed by RNA Pol II, were observed after the T stretch region. Although we could not observe TATA-like sequences, an AT-rich segment at -27 to -18 was detected in the 5S rRNA genes of the Capsicum species. Species relationships among the Capsicum species were also studied by sequence comparison of the 5S rRNA genes. While C. chinense, C. frutescens, and C. annuum formed one lineage, C. baccatum was revealed to be an intermediate species between the former three species and C. pubescens. PMID:10774742

  6. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  7. Brucella abortus S19 genome sequenced, points toward virulence genes

    Whyte, Barry James

    2008-01-01

    Researchers at the Virginia Bioinformatics Institute at Virginia Tech; the National Animal Disease Center in Ames, Iowa; and collaborators at 454 Life Sciences, Branford, Conn., have sequenced the genome of Brucella abortus strain S19.

  8. Targeting DNA with triplex-forming oligonucleotides to modify gene sequence.

    Simon, Philippe; Cannata, Fabio; Concordet, Jean-Paul; Giovannangeli, Carine

    2008-08-01

    Molecules that interact with DNA in a sequence-specific manner are attractive tools for manipulating gene sequence and expression. For example, triplex-forming oligonucleotides (TFOs), which bind to oligopyrimidine.oligopurine sequences via Hoogsteen hydrogen bonds, have been used to inhibit gene expression at the DNA level as well as to induce targeted mutagenesis in model systems. Recent advances in using oligonucleotides and analogs to target DNA in a sequence-specific manner will be discussed. In particular, chemical modification of TFOs has been used to improve binding to chromosomal target sequences in living cells. Various oligonucleotide analogs have also been found to expand the range of sequences amenable to manipulation, including so-called "Zorro" locked nucleic acids (LNAs) and pseudo-complementary peptide nucleic acids (pcPNAs). Finally, we will examine the potential of TFOs for directing targeted gene sequence modification and propose that synthetic nucleases, based on conjugation of sequence-specific DNA ligands to DNA damaging molecules, are a promising alternative to protein-based endonucleases for targeted gene sequence modification. PMID:18460344

  9. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing

    Naveed, Muhammad; Mubeen, Samavia; Khan, Samiullah; Ahmed, Iftikhar; Khalid, Nauman; Suleria, Hafiz Ansar Rasul; Bano, Asghari; Mumtaz, Abdul Samad

    2014-01-01

    In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Bacterial isolates were identified by 16S rRNA gene sequence analysis, with taxonomic confirmation on the EzTaxon Server. The identified bacterial strains belonged to 5 genera, i.e. Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relat...

  10. Trichinella pseudospiralis vs. T. spiralis thymidylate synthase gene structure and T. pseudospiralis thymidylate synthase retrogene sequence

    Jagielska, Elżbieta; Płucienniczak, Andrzej; Dąbrowska, Magdalena; Dowierciał, Anna; Rode, Wojciech

    2014-01-01

    Background Thymidylate synthase is a housekeeping gene, designated ancient due to its role in DNA synthesis and ubiquitous phyletic distribution. The genomic sequences were characterized coding for thymidylate synthase in two species of the genus Trichinella, an encapsulating T. spiralis and a non-encapsulating T. pseudospiralis. Methods Based on the sequence of parasitic nematode Trichinella spiralis thymidylate synthase cDNA, PCR techniques were employed. Results Each of the respective gene...

  11. IDENTIFICATION OF THREE FRUIT-ROT FUNGI OF BANANA BY 28S RIBOSOMAL DNA SEQUENCING

    Supriya Sarkar*, S Girisham and SM Reddy

    2013-01-01

    The aim of present investigation was to identify three fruit-rot fungi-Macrophomina phaseolina (Tassi) Goid, Fusarium oxysporum (Schlechtend) and Nigrospora oryzae (Berk and Br.) Petch isolated from banana fruits [Rasthali (Silk AAB) and Cavendish (AAA) varieties]. Out of different fungal genera isolated, the above fungi were responsible for maximum loss of banana fruits as they spread rapidly into the fruit pulp and deteriorated the fruits. The amplification studies of fragment of D2 region ...

  12. Targeting of AID-mediated sequence diversification to immunoglobulin genes.

    Kothapalli, Naga Rama; Fugmann, Sebastian D

    2011-04-01

    Activation-induced cytidine deaminase (AID) is a key enzyme for antibody-mediated immune responses. Antibodies are encoded by the immunoglobulin genes and AID acts as a transcription-dependent DNA mutator on these genes to improve antibody affinity and effector functions. An emerging theme in the field is that many transcribed genes are potential targets of AID, presenting an obvious danger to genomic integrity. Thus there are mechanisms in place to ensure that mutagenic outcomes of AID activity are specifically restricted to the immunoglobulin loci. Cis-regulatory targeting elements mediate this effect and their mode of action is probably a combination of immunoglobulin gene specific activation of AID and a perversion of faithful DNA repair towards error-prone outcomes. PMID:21295456

  13. Experimental Conditions: SE28_S03_M04_D01 [Metabolonote[Archive

    Full Text Available SE28_S03_M04_D01 SE28 Comparison of leaf metabolites among developmental stages of ...Hevea brasiliensis SE28_S03 Hevea brasiliensis leaf SE28_S03_M04 6.7 mg [MassBase ID] MDLC1_21613 SE28_MS2 LC-FT-ICR-MS ESI posit...ive method 2 SE28_DS1 PowerGet analysis for detection of all peaks (B2) 6|ITMS 2 SE28_AM1 PowerGet annotation A1 ...

  14. Experimental Conditions: SE28_S01_M02_D01 [Metabolonote[Archive

    Full Text Available SE28_S01_M02_D01 SE28 Comparison of leaf metabolites among developmental stages of ...Hevea brasiliensis SE28_S01 Hevea brasiliensis leaf SE28_S01_M02 6.7 mg [MassBase ID] MDLC1_20370 SE28_MS1 LC-FT-ICR-MS ESI posit...ive method 1 SE28_DS1 PowerGet analysis for detection of all peaks (B2) 6|ITMS 2 SE28_AM1 PowerGet annotation A1 ...

  15. Experimental Conditions: SE28_S03_M06_D01 [Metabolonote[Archive

    Full Text Available SE28_S03_M06_D01 SE28 Comparison of leaf metabolites among developmental stages of ...Hevea brasiliensis SE28_S03 Hevea brasiliensis leaf SE28_S03_M06 6.7 mg [MassBase ID] MDLC1_21615 SE28_MS2 LC-FT-ICR-MS ESI posit...ive method 2 SE28_DS1 PowerGet analysis for detection of all peaks (B2) 6|ITMS 2 SE28_AM1 PowerGet annotation A1 ...

  16. Experimental Conditions: SE28_S01_M03_D01 [Metabolonote[Archive

    Full Text Available SE28_S01_M03_D01 SE28 Comparison of leaf metabolites among developmental stages of ...Hevea brasiliensis SE28_S01 Hevea brasiliensis leaf SE28_S01_M03 6.7 mg [MassBase ID] MDLC1_20371 SE28_MS1 LC-FT-ICR-MS ESI posit...ive method 1 SE28_DS1 PowerGet analysis for detection of all peaks (B2) 6|ITMS 2 SE28_AM1 PowerGet annotation A1 ...

  17. Experimental Conditions: SE28_S04_M01_D01 [Metabolonote[Archive

    Full Text Available SE28_S04_M01_D01 SE28 Comparison of leaf metabolites among developmental stages of ...Hevea brasiliensis SE28_S04 Hevea brasiliensis leaf SE28_S04_M01 6.7 mg [MassBase ID] MDLC1_20378 SE28_MS1 LC-FT-ICR-MS ESI posit...ive method 1 SE28_DS1 PowerGet analysis for detection of all peaks (B2) 6|ITMS 2 SE28_AM1 PowerGet annotation A1 ...

  18. Sequencing of the β-tubulin genes in the ascarid nematodes Parascaris equorum and Ascaridia galli.

    Tydén, E; Engström, A; Morrison, D A; Höglund, J

    2013-07-01

    Benzimidazoles (BZ) are used to control infections of the equine roundworm Parascaris equorum and the poultry roundworm Ascaridia galli. There are still no reports of anthelmintic resistance (AR) to BZ in these two nematodes, although AR to BZ is widespread in several other veterinary parasites. Several single nucleotide polymorphisms (SNP) in the β-tubulin genes have been associated with BZ-resistance. In the present study we have sequenced β-tubulin genes: isotype 1 and isotype 2 of P. equorum and isotype 1 of A. galli. Phylogenetic analysis of all currently known isotypes showed that the Nematoda has more diversity among the β-tubulin genes than the Vertebrata. In addition, this diversity is arranged in a more complex pattern of isotypes. Phylogenetically, the A. galli sequence and one of the P. equorum sequences clustered with the known Ascaridoidea isotype 1 sequences, while the other P. equorum sequence did not cluster with any other β-tubulin sequences. We therefore conclude that this is a previously unreported isotype 2. The β-tubulin gene sequences were used to develop a PCR for genotyping SNP in codons 167, 198 and 200. No SNP was observed despite sequencing 95 and 100 individual adult worms of P. equorum and A. galli, respectively. Given the diversity of isotype patterns among nematodes, it is likely that associations of genetic data with BZ-resistance cannot be generalised from one taxonomic group to another. PMID:23685342

  19. Cloning and Sequence Analysis of Light Variable Region Gene of Anti-human Retinoblastoma Monoclonal Antibody

    Xiufeng Zhong; Yongping Li; Shuqi Huang; Bo Ning; Chunyan Zhang; Jianliang Zheng; Guanguang Feng

    2002-01-01

    Purpose: To clone the variable region gene of the light chain of a monoclonal antibody against human retinoblastoma and to characterize its nucleotide and amino acid sequences. Methods: Total RNA was extracted from 3C6 hybridoma cells secreting a specific monoclonal antibody (McAb) against human retinoblastoma (RB), then reverse-transcribed into cDNA with oligo-dT primers. The variable region of the light chain (VL) gene fragments was amplified using polymerase chain reaction (PCR) and further cloned into the pGEM(R)-T Easy vector. Then, 3C6 VL cDNA was sequenced by Sanger's method. Homology analysis was done with NCBI BLAST. Results: The complete nucleotide sequence of 3C6 VL cDNA consisted of 321 bp encoding 107 amino acid residues, containing four framework regions (FRs) and three complementarity-determining regions (CDRs) as well as the typical structure of two Cys residues. The sequence is most homologous to a member of the Vk9 gene family, and its chain utilizes the Jk1 gene segment. Conclusion: The light chain variable region gene of the McAb against human RB was amplified successfully; it belongs to the Vk9 gene family and utilizes the Vk-Jk1 gene rearrangement. This study lays a good basis for constructing a recombinant antibody and for making new targeted therapeutic agents against retinoblastoma.

  20. Identification of a New Variable Sequence in the P1 Cytadhesin Gene of Mycoplasma pneumoniae: Evidence for the Generation of Antigenic Variation by DNA Recombination between Repetitive Sequences

    Kenri, Tsuyoshi; Taniguchi, Rie; Sasaki, Yuko; Okazaki, Norio; Narita, Mitsuo; Izumikawa, Kinichi; Umetsu, Masao; Sasaki, Tsuguo

    1999-01-01

    A Mycoplasma pneumoniae cytadhesin P1 gene with novel nucleotide sequence variation has been identified. Four clinical strains of M. pneumoniae were found to carry this type of P1 gene. This new P1 gene is similar to the known group II P1 genes but possesses novel sequence variation of approximately 300 bp in the RepMP2/3 region. The position of the new variable region is distant from the previously reported variable regions known to differ between group I and II P1 genes. Two sequences close...

  1. Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

    Fields, C.A.

    1996-06-01

    The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler), has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0, has now been completed, and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project, including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.

  2. How the sequence of a gene can tune its translation

    Fredrick, Kurt; Ibba, Michael

    2010-01-01

    Sixty-one codons specify 20 amino acids, offering cells many options for encoding a polypeptide sequence. Two new studies (Cannarozzi et al., 2010; Tuller et al., 2010) now foster the idea that patterns of codon usage can control ribosome speed, fine-tuning translation to increase the efficiency...
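
    The codon arithmetic in the first sentence of this abstract can be checked with a short Python sketch. It assumes the standard genetic code (NCBI translation table 1); the names `CODON_TABLE`, `SENSE_CODONS` and `synonymous_codons` are illustrative and do not come from the cited studies.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), listed in TCAG order:
# the first base varies slowest and the third base varies fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every codon to its one-letter amino acid ('*' marks a stop codon).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Sense codons are the codons that specify an amino acid rather than a stop.
SENSE_CODONS = {codon: aa for codon, aa in CODON_TABLE.items() if aa != "*"}

# Number of synonymous codons available for each amino acid.
DEGENERACY = Counter(SENSE_CODONS.values())


def synonymous_codons(amino_acid):
    """Return the sorted list of codons that encode one amino acid."""
    return sorted(c for c, aa in SENSE_CODONS.items() if aa == amino_acid)


if __name__ == "__main__":
    print(len(SENSE_CODONS), "sense codons for", len(DEGENERACY), "amino acids")
    print("Leucine codons:", synonymous_codons("L"))
```

    The choice among synonymous codons (six for leucine, one for methionine) is the freedom that codon-usage patterns exploit.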

  3. Sequencing and mapping hemoglobin gene clusters in the Australian model dasyurid marsupial Sminthopsis macroura

    De Leo, A.A.; Wheeler, D.; Lefevre, C.; Cheng, Jan-Fang; Hope, R.; Kuliwaba, J.; Nicholas, K.R.; Westermanc, M.; Graves, J.A.M.

    2004-07-26

    Comparing globin genes and their flanking sequences across many species has allowed globin gene evolution to be reconstructed in great detail. Marsupial globin sequences have proved to be of exceptional significance. A previous finding of a beta-like omega gene in the alpha cluster in the tammar wallaby suggested that the alpha and beta cluster evolved via genome duplication and loss rather than tandem duplication. To confirm and extend this important finding we isolated and sequenced BACs containing the alpha and beta loci from the distantly related Australian marsupial Sminthopsis macroura. We report that the alpha gene lies in the same BAC as the beta-like omega gene, implying that the alpha-omega juxtaposition is likely to be conserved in all marsupials. The LUC7L gene was found 3' of the S. macroura alpha locus, a gene order shared with humans but not mouse, chicken or fugu. Sequencing a BAC contig that contained the S. macroura beta globin and epsilon globin loci showed that the globin cluster is flanked by olfactory genes, demonstrating a gene arrangement conserved for over 180 MY. Analysis of the region 5' to the S. macroura epsilon globin gene revealed a region similar to the eutherian LCR, containing sequences and potential transcription factor binding sites with homology to eutherian hypersensitive sites 1 to 5. FISH mapping of BACs containing S. macroura alpha and beta globin genes located the beta globin cluster on chromosome 3q and the alpha locus close to the centromere on 1q, resolving contradictory map locations obtained by previous radioactive in situ hybridization.

  4. Neural network predicts sequence of TP53 gene based on DNA chip

    Spicker, J.S.; Wikman, F.; Lu, M.L.;

    2002-01-01

    We have trained an artificial neural network to predict the sequence of the human TP53 tumor suppressor gene based on a p53 GeneChip. The trained neural network uses as input the fluorescence intensities of DNA hybridized to oligonucleotides on the surface of the chip and makes between zero and...

  5. GIPS: A Software Guide to Sequencing-Based Direct Gene Cloning in Forward Genetics Studies.

    Hu, Han; Wang, Weitao; Zhu, Zhongxu; Zhu, Jianhua; Tan, Deyong; Zhou, Zhipeng; Mao, Chuanzao; Chen, Xin

    2016-04-01

    The Gene Identification via Phenotype Sequencing (GIPS) software considers a range of experimental and analysis choices in sequencing-based forward genetics studies within an integrated probabilistic framework, which enables direct gene cloning from the sequencing of several unrelated mutants of the same phenotype without the need to create segregation populations. GIPS estimates four measurements to help optimize an analysis procedure as follows: (1) the chance of reporting the true phenotype-associated gene; (2) the expected number of random genes that may be reported; (3) the significance of each candidate gene's association with the phenotype; and (4) the significance of violating the Mendelian assumption if no gene is reported or if all candidate genes have failed validation. The usage of GIPS is illustrated with the identification of a rice (Oryza sativa) gene that epistatically suppresses the phenotype of the phosphate2 mutant from sequencing three unrelated ethyl methanesulfonate mutants. GIPS is available at https://github.com/synergy-zju/gips/wiki with the user manual and an analysis example. PMID:26842621

  6. Sequence of the Proteus mirabilis urease accessory gene ureG.

    Sriwanthana, B; Island, M D; Mobley, H L

    1993-07-15

    We report the sequence of ureG, an accessory gene that is a part of the ure gene cluster of uropathogenic Proteus mirabilis and required for full enzymatic activity of urease. The 615-bp open reading frame predicts a M(r) 22,374 polypeptide, which contains a consensus amino acid (aa) sequence for ATP-binding. The polypeptide shares sequence homology with UreG of Escherichia coli (93% of identical aa), Klebsiella aerogenes (59%) and Helicobacter pylori (59%). PMID:8335248

  7. Isolation and nucleotide sequence of a mouse histidine tRNA gene.

    Han, J. H.; Harding, J D

    1982-01-01

    We have sequenced a 1307 base pair mouse genomic DNA fragment which contains a histidine tRNA gene. The sequence of the putative mouse histidine tRNA differs from the published sequence of sheep liver histidine tRNA by a single base change in the D-loop. It does not contain an unpaired 5' terminal G residue, as reported for Drosophila and sheep histidine tRNAs. The gene does not contain introns. The 3' flanking region contains a typical RNA polymerase III termination site of 6 consecutive T r...

  8. Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence

    Ortega, Maya

    2010-01-01

    My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and whatever black is, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene from each cyanobacterium. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene sequence can be used to identify different bacterial species. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference and for classification of other Dichothrix sp.

  9. Cloning and sequencing of the virulent gene LipL32 of Leptospira interrogans serovar Autumnalis

    Sriram Vamshi Krishna; Siju Joseph; R Ambily; M. Mini; Liya Anto; Sheethal G Mohan

    2013-01-01

    Aim: To clone the virulent gene LipL32 of Leptospira interrogans serovar Autumnalis and to compare its sequence with the LipL32 gene of other pathogenic serovars of Leptospira. Materials and Methods: Leptospira interrogans serovar Autumnalis procured from the Leptospira research laboratory, Chennai, was used in the study. Polymerase chain reaction (PCR) was carried out to amplify the LipL32 gene using the reported primers of Leptospira kirschneri. The PCR product was cloned into TA cloning vector and...

  10. Possible origin of sequence divergence in the P1 cytadhesin gene of Mycoplasma pneumoniae.

    Su, C J; Dallo, S F; Chavoya, A; Baseman, J B

    1993-01-01

    Specific regions of the P1 adhesin structural gene of Mycoplasma pneumoniae hybridize to various parts of the mycoplasma genome, indicating their multiple-copy nature. In addition, restriction fragment length polymorphisms and sequence divergence have been observed in the P1 gene, permitting the classification of clinical isolates of M. pneumoniae into two groups, I and II. These data suggest that the observed P1 gene diversity may be explained by homologous recombination between similar but ...

  11. A novel method to discover fluoroquinolone antibiotic resistance (qnr) genes in fragmented nucleotide sequences

    Boulund Fredrik

    2012-12-01

    Full Text Available. Background: Broad-spectrum fluoroquinolone antibiotics are central in modern health care and are used to treat and prevent a wide range of bacterial infections. The recently discovered qnr genes provide a mechanism of resistance with the potential to rapidly spread between bacteria using horizontal gene transfer. As for many antibiotic resistance genes present in pathogens today, qnr genes are hypothesized to originate from environmental bacteria. The vast amount of data generated by shotgun metagenomics can therefore be used to explore the diversity of qnr genes in more detail. Results: In this paper we describe a new method to identify qnr genes in nucleotide sequence data. We show, using cross-validation, that the method has a high statistical power of correctly classifying sequences from novel classes of qnr genes, even for fragments as short as 100 nucleotides. Based on sequences from public repositories, the method was able to identify all previously reported plasmid-mediated qnr genes. In addition, several fragments from novel putative qnr genes were identified in metagenomes. The method was also able to annotate 39 chromosomal variants of which 11 have previously not been reported in literature. Conclusions: The method described in this paper significantly improves the sensitivity and specificity of identification and annotation of qnr genes in nucleotide sequence data. The predicted novel putative qnr genes in the metagenomic data support the hypothesis of a large and uncharacterized diversity within this family of resistance genes in environmental bacterial communities. An implementation of the method is freely available at http://bioinformatics.math.chalmers.se/qnr/.

  12. Multi-species microarrays reveal the effect of sequence divergence on gene expression profiles

    Gilad, Yoav; Rifkin, Scott A.; Bertone, Paul; Gerstein, Mark; White, Kevin P

    2005-01-01

    Interspecies comparisons of gene expression levels will increase our understanding of the evolution of transcriptional mechanisms and help to identify targets of natural selection. This approach holds particular promise for apes, as many human-specific adaptations are thought to result from differences in gene expression rather than in coding sequence. To date, however, all studies directly comparing interspecies gene expression have been performed on single-species arrays, so that it has bee...

  13. The vlhA gene sequencing of Iranian Mycoplasma synoviae isolates

    Pourbakhsh, S.A.

    2013-12-01

    Full Text Available. The variable lipoprotein haemagglutinin (VlhA) expressed by Mycoplasma synoviae is believed to play a major role in pathogenesis of the disease by mediating adherence and immune evasion. The aim of this study was to sequence Iranian M. synoviae isolates to detect nucleotide variation in the M. synoviae vlhA gene. Using oligonucleotide primers complementary to the single-copy conserved 5′ end of the vlhA gene, amplicons of ~400 bp were generated from 10 M. synoviae isolates from commercial broiler chicken farms in Iran; the conserved domain of the vlhA gene was then sequenced and analyzed. The results showed complete concordance among all Iranian isolates in nucleotide sequence (1-386 nt). In comparison with the vaccine MS-H strain sequence, the entire vlhA sequence downstream of nucleotide 386 was different in all Iranian isolates. Point mutations and a frame-shift mutation were also observed in all Iranian M. synoviae isolates. This study demonstrated a difference between Iranian isolates and the live commercial vaccine MS-H strain. Furthermore, these data indicated that changes in the vlhA gene sequence could alter amino acid codons in the expressed vlhA gene and affect the pathogenesis rate in flocks.

  14. Molecular cloning and analysis of the partial sequence of Rhinopithecus roxellanae growth hormone gene

    徐来祥; 孔繁华; 华育平

    2000-01-01

    The growth hormone gene (GH) of Rhinopithecus roxellanae was amplified by PCR for the first time, based on the sequences of reported mammalian growth hormone genes. The amplified fragment was about 1.8 kb. It was cloned and its upstream region was sequenced. The sequenced region consists of a 5′ flanking regulatory region, exon I, part of exon II, and intron I of the growth hormone gene. Comparing the corresponding growth hormone gene sequences of Rhinopithecus roxellanae and pig, we concluded that the homology reached 81% in the region, and that the 5′ flanking sequence was highly conserved. About 90% of the amino acids encoded by exon I and exon II were the same as those in pig. Many mutations occurred at degenerate sites of the triplet code. The nucleotides of intron I showed only 72% homology with those in pig. This suggests that introns and the 3′ flanking sequence may play an important part in growth hormone gene regulation in different animals.

  15. The Cloning and Sequencing of Read-through Protein Gene from BYDV-GAV Virus

    CHANG Sheng-jun; WANG Xi-feng; LI Li; MA Zhan-hong; ZHOU Guang-he

    2001-01-01

    The cDNA of the BYDV-GAV read-through protein (RTP) gene was amplified from the extracted RNA of BYDV-GAV by using the polymerase chain reaction (PCR), and cloned into pGEM-7zf(+). Its complete nucleotide sequence was determined by the dideoxynucleotide chain-termination method. The BYDV-GAV RTP gene consists of 1377 nt. Its sequence was most similar to that of the RTP gene of BYDV-MAV, with identities of 87.4% and 87.1% at the nucleotide and amino acid levels, respectively.

  16. Cloning and sequencing of the gene for human β-casein

    Human β-casein is a major protein in human milk. This protein is part of the casein micelle and has been suggested to have several physiological functions in the newborn. Since there is limited information on β-casein and the factors that affect its concentration in human milk, the authors have isolated and sequenced the gene for this protein. A human mammary gland cDNA library (Clontech) in gt 11 was screened by plaque hybridization using a 42-mer synthetic 32P-labelled oligonucleotide. Positive clones were identified and isolated, DNA was prepared and the gene isolated by cleavage with EcoRI. Following subcloning (pUC18), restriction mapping and Southern blotting, DNA for sequencing was prepared. The gene was sequenced by the dideoxy method. Human β-casein has 212 amino acids, and the amino acid sequence deduced from the nucleotide sequence is 91% identical to the published sequence for human β-casein and shows a high degree of conservation at the leader peptide and the highly phosphorylated sequences, but also deletions and divergence at several positions. These results provide insight into the structure of the human β-casein gene and will facilitate studies on factors affecting its expression.

  17. Combined sequence and sequence-structure based methods for analyzing FGF23, CYP24A1 and VDR genes.

    Nagamani, Selvaraman; Singh, Kh Dhanachandra; Muthusamy, Karthikeyan

    2016-09-01

    FGF23, CYP24A1 and VDR together play a significant role in genetic susceptibility to chronic kidney disease (CKD). Identifying possible causative mutations may provide therapeutic targets and diagnostic markers for CKD. We therefore adopted both sequence-based and sequence-structure-based SNP analysis algorithms in order to overcome the limitations of each method, and explored their functional significance for predicting risky SNPs associated with CKD. We assessed the performance of four widely used pathogenicity prediction methods using the Matthews correlation coefficient (MCC), which ranged from poor (MCC = 0.39) to reasonably good (MCC = 0.42). However, we obtained the best results with the combined sequence and structure based analysis method (MCC = 0.45). Four SNPs from the FGF23 gene, 8 SNPs from the VDR gene and 13 SNPs from the CYP24A1 gene were predicted to be causative agents of human disease. This study will help in selecting potential SNPs from the SNP pool for experimental study and will reduce the cost of identifying potential SNPs as genetic markers. PMID:27114920
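
    For readers unfamiliar with the metric, the Matthews correlation coefficient can be computed from the four cells of a binary confusion matrix. The following is a minimal sketch; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a 2x2 confusion matrix.

    tp/tn/fp/fn are counts of true positives, true negatives,
    false positives and false negatives.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical predictor: 30 pathogenic SNPs called correctly, 10 missed,
# 25 neutral SNPs called correctly, 15 wrongly flagged.
print(round(matthews_corrcoef(tp=30, tn=25, fp=15, fn=10), 3))
```

    MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which is why values around 0.4 are described above as only poor to reasonably good.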

  18. Bidirectional gene sequences with similar homology to functional proteins of alkane degrading bacterium Pseudomonas frederiksbergensis DNA

    The potential for two overlapping fragments of DNA from a clone of the newly isolated alkane-degrading bacterium Pseudomonas frederiksbergensis to encode sequences with similar homology to two parts of functional proteins is described. One strand contains a sequence with high homology to alkane monooxygenase (alkB), a member of the alkane hydroxylase family, and the other strand contains a sequence with some homology to the alcohol dehydrogenase gene (alkJ). Overlapping of genes on opposite strands has been reported in eukaryotic species, and is now reported in a bacterial species. The sequence comparisons and ORF results revealed that the regulation and organization of the genes involved in alkane oxidation in Pseudomonas frederiksbergensis differ from those in other known alkane-degrading bacteria. In the alk gene cluster, the homologues of the known alkane monooxygenase (alkB) and rubredoxin (alkG) are oriented in the same direction, whereas alcohol dehydrogenase (alkJ) is oriented in the opposite direction. Such genomes encode messages on both strands of the DNA, or in overlapping but different reading frames of the same strand. The creation of novel genes from pre-existing sequences, known as overprinting, is a widespread phenomenon in small viruses. Here, the origin and evolution of the gene overlap have been investigated in relation to bacteriophages belonging to the family Microviridae. Such a phenomenon is most widely described in extremely small genomes such as those of viruses or small plasmids, which makes its occurrence here unusual. (author)

  19. Complexity of rice Hsp100 gene family: lessons from rice genome sequence data

    Gaurav Batra; Vineeta Singh Chauhan; Amanjot Singh; Neelam K Sarkar; Anil Grover

    2007-04-01

    Elucidation of a genome sequence provides an excellent platform for understanding the detailed complexity of gene families. Hsp100 is an important family of chaperones in diverse living systems. There are eight putative gene loci encoding Hsp100 proteins in the Arabidopsis genome. In rice, two full-length Hsp100 cDNAs have been isolated and sequenced so far. In silico analysis of the rice genomic sequence showed that the two isolated rice Hsp100 cDNAs correspond to the Os05g44340 and Os02g32520 genes in the rice genome database. There appear to be three additional proteins (encoded by the Os03g31300, Os04g32560 and Os04g33210 gene loci) that are variably homologous to Os05g44340 and Os02g32520 throughout the entire amino acid sequence. These five rice Hsp100 genes show significant similarities in the signature sequences known to be conserved among Hsp100 proteins. While Os05g44340 encodes a cytoplasmic Hsp100 protein, those encoded by the other four genes are predicted to have chloroplast transit peptides.

  20. Presence and Expression of Microbial Genes Regulating Soil Nitrogen Dynamics Along the Tanana River Successional Sequence

    Boone, R. D.; Rogers, S. L.

    2004-12-01

    We report on work to assess the functional gene sequences for soil microbiota that control nitrogen cycle pathways along the successional sequence (willow, alder, poplar, white spruce, black spruce) on the Tanana River floodplain, Interior Alaska. Microbial DNA and mRNA were extracted from soils (0-10 cm depth) for amoA (ammonia monooxygenase), nifH (nitrogenase reductase), napA (nitrate reductase), and nirS and nirK (nitrite reductase) genes. Gene presence was determined by amplifying a conserved sequence of each gene with sequence-specific oligonucleotide primers and polymerase chain reaction (PCR). Expression of the genes was measured via nested reverse transcriptase PCR amplification of the extracted mRNA. Amplified PCR products were visualized on agarose electrophoresis gels. All five successional stages show evidence for the presence and expression of microbial genes that regulate N fixation (free-living), nitrification, and nitrate reduction. We detected (1) nifH, napA, and nirK presence and amoA expression (mRNA production) for all five successional stages and (2) nirS and amoA presence and nifH, nirK, and napA expression for early successional stages (willow, alder, poplar). The results highlight that the existing body of process-level work has not sufficiently considered the microbial potential for a nitrate economy and free-living N fixation along the complete floodplain successional sequence.

  1. Cloning, sequencing and identification of single nucleotide polymorphisms of partial sequence on the porcine CACNA1S gene

    2008-01-01

    The CACNA1S gene encodes the α1 subunit of the calcium channel. Mutation of the CACNA1S gene can cause hypokalemic periodic paralysis (HypoKPP) and malignant hyperthermia syndrome (MHS) in human beings. Research on CACNA1S to date has focused mainly on humans and model animals, and rarely on livestock and poultry. In this study, Yorkshire pigs (23), Pietrain pigs (30), Jinhua pigs (115) and second-generation Jinhua × Pietrain crossbreds (126) were used. Primers were designed according to the sequence of the human CACNA1S gene and PCR was carried out using pig genomic DNA. PCR products were sequenced and compared with the human sequence, single nucleotide polymorphisms (SNPs) were then investigated by PCR-SSCP, and PCR-RFLP tests were performed to validate the mutations. Results indicated: (1) 5211 bp of DNA fragments of the porcine CACNA1S gene were acquired (GenBank accession number: DQ767693) and the identity of the exon region was 82.6% between human and pig; (2) fifty-seven mutations were found within the cloned sequences, among which 24 were in the exon region; (3) the results of PCR-RFLP were in accordance with those of PCR-SSCP. According to the ESTs of the porcine CACNA1S gene published in GenBank (Bx914582, Bx666997), 8 of the 11 SNPs identified in the present study were consistent with the base differences between the two EST fragments.

  2. Sequence diversities of serine-aspartate repeat genes among Staphylococcus aureus isolates from different hosts presumably by horizontal gene transfer.

    Huping Xue

    BACKGROUND: Horizontal gene transfer (HGT) is recognized as one of the major forces for bacterial genome evolution. Many clinically important bacteria may acquire virulence factors and antibiotic resistance through HGT. Comparative genomic analysis has become an important tool for identifying HGT in emerging pathogens. In this study, the Serine-Aspartate Repeat (Sdr) family has been compared among different sources of Staphylococcus aureus (S. aureus) to discover sequence diversities within their genomes. METHODOLOGY/PRINCIPAL FINDINGS: Four sdr genes were analyzed for 21 different S. aureus strains and 218 mastitis-associated S. aureus isolates from Canada. Comparative genomic analyses revealed that S. aureus strains from bovine mastitis (RF122 and mastitis isolates in this study), ovine mastitis (ED133), pig (ST398), chicken (ED98), and human methicillin-resistant S. aureus (MRSA) (TCH130, MRSA252, Mu3, Mu50, N315, 04-02981, JH1 and JH9) were highly associated with one another, presumably due to HGT. In addition, several types of insertion and deletion were found in the sdr genes of many isolates. A new insertion sequence was found in mastitis isolates, which was presumably responsible for the HGT of the sdrC gene among different strains. Moreover, the sdr genes could be used to type S. aureus. Regional differences in the distribution of sdr genes were also indicated among the tested S. aureus isolates. Finally, certain associations were found between sdr genes and subclinical or clinical mastitis isolates. CONCLUSIONS: Certain sdr gene sequences were shared among S. aureus strains and isolates from different species, presumably due to HGT. Our results also suggest that distributional assays of virulence factors should detect the full sequences or full functional regions of these factors. The traditional assay using short conserved regions may not be accurate or credible. These findings have important implications with regard to animal husbandry practices that may ...

  3. Genomic organization and sequence analysis of the vomeronasal receptor V2R genes in mouse genome

    YANG Hui; Zhang YaPing

    2007-01-01

    Two multigene superfamilies, named V1R and V2R, encoding seven-transmembrane-domain G-protein coupled receptors (GPCRs) have been identified as pheromone receptors in mammals. Three V2R gene families have been described in mouse and rat. Here we screened the updated mouse genome sequence database and retrieved 63 putative functional V2R genes, including three newly identified genes that formed an additional family. We described the genomic organization of these genes and also characterized the conservation of mouse V2R protein sequences. This genomic and sequence information provides evidence for inferring the functional domains of V2Rs and should aid future functional studies.

  4. Evolution at Two Levels in Fire Ants: The Relationship between Patterns of Gene Expression and Protein Sequence Evolution

    Hunt, B. G.; Ometto, L.; Keller, L.; Goodisman, M. A. D.

    2013-01-01

    Variation in protein sequence and gene expression each contribute to phenotypic diversity, and may be subject to similar selective pressures. Eusocial insects are particularly useful for investigating the evolutionary link between protein sequence and condition-dependent patterns of gene expression because gene expression plays a central role in determining differences between eusocial insect sexes and castes. We investigated the relationship between protein coding sequence evolution and gene...

  5. Sequence analysis of the ERCC2 gene regions in human, mouse, and hamster reveals three linked genes

    Lamerdin, J.E.; Stilwagen, S.A.; Ramirez, M.H. [Lawrence Livermore National Lab., CA (United States)] [and others]

    1996-06-15

    The ERCC2 (excision repair cross-complementing rodent repair group 2) gene product is involved in transcription-coupled repair as an integral member of the basal transcription factor BTF2/TFIIH complex. Defects in this gene can result in three distinct human disorders, namely the cancer-prone syndrome xeroderma pigmentosum complementation group D, trichothiodystrophy, and Cockayne syndrome. We report the comparative analysis of 91.6 kb of new sequence including 54.3 kb encompassing the human ERCC2 locus, the syntenic region in the mouse (32.6 kb), and a further 4.7 kb of sequence 3' of the previously reported ERCC2 region in the hamster. In addition to ERCC2, our analysis revealed the presence of two previously undescribed genes in all three species. The first is centromeric (in the human) to ERCC2 and is most similar to the kinesin light chain gene in sea urchin. The second gene is telomeric (in the human) to ERCC2 and contains a motif found in ankyrins, some cell cycle proteins, and transcription factors. Multiple EST matches to this putative new gene indicate that it is expressed in several human tissues, including breast. The identification and description of two new genes provides potential candidate genes for disorders mapping to this region of 19q13.2. 42 refs., 6 figs., 3 tabs.

  6. Sequence analysis of the ERCC2 gene regions in human, mouse, and hamster reveals three linked genes.

    Lamerdin, J E; Stilwagen, S A; Ramirez, M H; Stubbs, L; Carrano, A V

    1996-06-15

    The ERCC2 (excision repair cross-complementing rodent repair group 2) gene product is involved in transcription-coupled repair as an integral member of the basal transcription factor BTF2/TFIIH complex. Defects in this gene can result in three distinct human disorders, namely the cancer-prone syndrome xeroderma pigmentosum complementation group D, trichothiodystrophy, and Cockayne syndrome. We report the comparative analysis of 91.6 kb of new sequence including 54.3 kb encompassing the human ERCC2 locus, the syntenic region in the mouse (32.6 kb), and a further 4.7 kb of sequence 3' of the previously reported ERCC2 region in the hamster. In addition to ERCC2, our analysis revealed the presence of two previously undescribed genes in all three species. The first is centromeric (in the human) to ERCC2 and is most similar to the kinesin light chain gene in sea urchin. The second gene is telomeric (in the human) to ERCC2 and contains a motif found in ankyrins, some cell cycle proteins, and transcription factors. Multiple EST matches to this putative new gene indicate that it is expressed in several human tissues, including breast. The identification and description of two new genes provides potential candidate genes for disorders mapping to this region of 19q13.2. PMID:8786141

  7. Research Techniques Made Simple: Bacterial 16S Ribosomal RNA Gene Sequencing in Cutaneous Research.

    Jo, Jay-Hyun; Kennedy, Elizabeth A; Kong, Heidi H

    2016-03-01

    Skin serves as a protective barrier and also harbors numerous microorganisms collectively comprising the skin microbiome. As a result of recent advances in sequencing (next-generation sequencing), our understanding of microbial communities on skin has advanced substantially. In particular, the 16S ribosomal RNA gene sequencing technique has played an important role in efforts to identify the global communities of bacteria in healthy individuals and patients with various disorders in multiple topographical regions over the skin surface. Here, we describe basic principles, study design, and a workflow of 16S ribosomal RNA gene sequencing methodology, primarily for investigators who are not familiar with this approach. This article will also discuss some applications and challenges of 16S ribosomal RNA sequencing as well as directions for future development. PMID:26902128

  8. Exome sequencing of 18 Chinese families with congenital cataracts: a new sight of the NHS gene.

    Wenmin Sun

    PURPOSE: The aim of this study was to investigate the mutation spectrum and frequency of 34 known genes in 18 Chinese families with congenital cataracts. METHODS: Genomic DNA and clinical data were collected from 18 families with congenital cataracts. Variations in 34 cataract-associated genes were screened by whole exome sequencing and then validated by Sanger sequencing. RESULTS: Eleven candidate variants in seven of the 34 genes were detected by exome sequencing and then confirmed by Sanger sequencing, including two variants predicted to be benign and nine pathogenic mutations. The nine mutations were present in 9 of the 18 (50%) families with congenital cataracts. In the four families with mutations in the X-linked NHS gene, no abnormalities other than cataract were recorded, and a pseudo-dominant inheritance form was suggested, as female carriers also had different forms of cataracts. CONCLUSION: This study expands the mutation spectrum and frequency of genes responsible for congenital cataract. Mutation in NHS is a common cause of nonsyndromic congenital cataract with pseudo-autosomal dominant inheritance. Combined with our previous studies, a genetic basis could be identified in 67.6% of families with congenital cataracts in our case series, in which mutations in genes encoding crystallins, genes encoding connexins, and NHS are responsible for 29.4%, 14.7%, and 11.8% of families, respectively. Our results suggest that mutations in NHS are a common cause of congenital cataract, both syndromic and nonsyndromic.

  9. The nucleotide sequence of the uvrD gene of E. coli.

    Finch, P W; Emmerson, P T

    1984-01-01

    The nucleotide sequence of a cloned section of the E. coli chromosome containing the uvrD gene has been determined. The coding region for the UvrD protein consists of 2,160 nucleotides which would direct the synthesis of a polypeptide 720 amino acids long with a calculated molecular weight of 82 kd. The predicted amino acid sequence of the UvrD protein has been compared with the amino acid sequences of other known adenine nucleotide binding proteins and a common sequence has been identified, ...
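
    The relation between the 2,160-nucleotide coding region and the 720-residue polypeptide is simple codon arithmetic, sketched below. The average residue mass of 110 Da is a textbook rule of thumb introduced here as an assumption; the abstract's 82 kd figure was calculated from the actual amino acid composition, so the rough estimate is expected to differ.

```python
AVERAGE_RESIDUE_MASS_DA = 110  # rule-of-thumb assumption, not from the abstract


def codons_in(coding_length_nt: int) -> int:
    """Number of codons in a coding region whose length is a multiple of 3."""
    if coding_length_nt % 3 != 0:
        raise ValueError("coding length is not a whole number of codons")
    return coding_length_nt // 3


length_aa = codons_in(2160)  # coding length quoted in the abstract
approx_mass_kda = length_aa * AVERAGE_RESIDUE_MASS_DA / 1000
print(length_aa, "residues, roughly", approx_mass_kda, "kDa")
```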

  10. A pilot study of gene testing of genetic bone dysplasia using targeted next-generation sequencing.

    Zhang, Huiwen; Yang, Rui; Wang, Yu; Ye, Jun; Han, Lianshu; Qiu, Wenjuan; Gu, Xuefan

    2015-12-01

    Molecular diagnosis of genetic bone dysplasia is challenging for non-experts. A targeted next-generation sequencing technology was applied to identify the underlying molecular mechanism of bone dysplasia and evaluate the contribution of these genes to patients with bone dysplasia encountered in pediatric endocrinology. A group of unrelated patients (n=82) characterized by short stature, dysmorphology and X-ray abnormalities, in whom mucopolysaccharidoses, GM1 gangliosidosis, mucolipidosis type II/III and achondroplasia owing to the FGFR3 G380R mutation had been excluded, were recruited in this study. Probes were designed with Illumina's online DesignStudio software for 61 genes selected according to the 2010 nosology and classification of genetic skeletal disorders. DNA was hybridized with the probes and a library was then established following the standard Illumina protocols. The amplicon library was sequenced on a MiSeq sequencing system and the data were analyzed by MiSeq Reporter. Mutations of 13 different genes were found in 44 of the 82 patients (54%). Mutations of the COL2A1 gene and the PHEX gene were found in nine patients each (9/44=20%), followed by the COMP gene in 8 (18%), TRPV4 gene in 4 (9%), FBN1 gene in 4 (9%), COL1A1 gene in 3 (6%) and COL11A1, TRAPPC2, MATN3, ARSE, TRPS1, SMARCAL1 and ENPP1 gene mutations in one patient each (2% each). In conclusion, mutations of the COL2A1, PHEX and COMP genes are common causes of short stature due to bone dysplasia in pediatric endocrinology outpatient clinics. Targeted next-generation sequencing is an efficient way to identify the underlying molecular mechanism of genetic bone dysplasia. PMID:26377240
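
    The per-gene proportions quoted above can be recomputed from the patient counts given in the abstract. This is an illustrative sketch only; the variable and function names are assumptions, and values recomputed this way may differ by a percentage point from the printed figures depending on how the authors rounded.

```python
# Patient counts per gene as listed in the abstract.
patients_with_mutation = {
    "COL2A1": 9, "PHEX": 9, "COMP": 8, "TRPV4": 4, "FBN1": 4, "COL1A1": 3,
    "COL11A1": 1, "TRAPPC2": 1, "MATN3": 1, "ARSE": 1,
    "TRPS1": 1, "SMARCAL1": 1, "ENPP1": 1,
}
COHORT_SIZE = 82  # unrelated patients recruited


def percentage(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


total_positive = sum(patients_with_mutation.values())
print(f"diagnostic yield: {total_positive}/{COHORT_SIZE} "
      f"= {percentage(total_positive, COHORT_SIZE):.1f}%")
for gene, n in sorted(patients_with_mutation.items(), key=lambda kv: -kv[1]):
    print(f"{gene}: {n}/{total_positive} = {percentage(n, total_positive):.1f}%")
```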

  11. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    Kaas Rolf S

    2012-10-01

    Background: Escherichia coli exists in commensal and pathogenic forms. By measuring the variation of individual genes across more than a hundred sequenced genomes, gene variation can be studied in detail, including the number of mutations found for any given gene. This knowledge will be useful for creating better phylogenies, for determination of molecular clocks and for improved typing techniques. Results: We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment, and a pan-genome tree, based on gene presence/absence, map the relatedness of the 186 sequenced E. coli genomes. The core-gene tree displays high confidence and divides the E. coli strains into the observed MLST type clades and also separates defined phylotypes. Conclusion: The results of comparing a large and diverse E. coli dataset support the theory that reliable and good resolution phylogenies can be inferred from the core-genome. The results further suggest that the resolution at the isolate level may subsequently be improved by targeting more variable genes. The use of whole genome sequencing will make it possible to eliminate, or at least reduce, the need for several typing steps used in traditional epidemiology.
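
    The distinction drawn above between the pan-genome, the 'soft core' (clusters in at least 95% of genomes) and the strict core (clusters in 100% of genomes) can be sketched from a presence/absence table. The function name, input layout and toy data below are hypothetical; the study's own clustering pipeline is not described in this record.

```python
from collections import Counter


def classify_clusters(genomes, soft_core_percent=95):
    """Split gene clusters into pan-genome, soft core and strict core.

    genomes: dict mapping genome name -> set of gene-cluster ids present.
    A cluster is 'soft core' if present in >= soft_core_percent % of
    genomes, and 'strict core' if present in every genome.
    """
    n_genomes = len(genomes)
    counts = Counter(c for clusters in genomes.values() for c in set(clusters))
    pan = set(counts)
    # Integer arithmetic avoids floating-point edge cases at the threshold.
    soft_core = {c for c, k in counts.items()
                 if k * 100 >= soft_core_percent * n_genomes}
    strict_core = {c for c, k in counts.items() if k == n_genomes}
    return pan, soft_core, strict_core


# Toy data: 20 genomes; 'geneA' everywhere, 'geneB' missing from one
# (as might happen in a draft assembly), 'geneC' in a single genome.
toy = {f"g{i}": {"geneA", "geneB"} for i in range(20)}
toy["g0"] = {"geneA", "geneC"}
pan, soft, strict = classify_clusters(toy)
print(len(pan), sorted(soft), sorted(strict))
```

    The toy case mirrors the abstract's argument: a gene missing from a single draft-quality genome drops out of the strict core but stays in the soft core.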

  12. Sequence and organization of coelacanth neurohypophysial hormone genes: Evolutionary history of the vertebrate neurohypophysial hormone gene locus

    Brenner Sydney

    2008-03-01

    Background: The mammalian neurohypophysial hormones, vasopressin and oxytocin, are involved in osmoregulation and uterine smooth muscle contraction, respectively. All jawed vertebrates contain at least one homolog each of vasopressin and oxytocin, whereas jawless vertebrates contain a single neurohypophysial hormone called vasotocin. The vasopressin homolog in non-mammalian vertebrates is vasotocin; and the oxytocin homolog is mesotocin in non-eutherian tetrapods, mesotocin and [Phe2]mesotocin in lungfishes, and isotocin in ray-finned fishes. The vasopressin and oxytocin genes are closely linked in the human and rodent genomes in a tail-to-tail orientation. In contrast, their pufferfish homologs (vasotocin and isotocin) are located on the same strand of DNA, with the isotocin gene located upstream of the vasotocin gene and separated from it by five genes, suggesting that this locus has experienced rearrangements in either the mammalian or the ray-finned fish lineage, or in both lineages. The coelacanths occupy a unique phylogenetic position close to the divergence of the mammalian and ray-finned fish lineages. Results: We have sequenced a coelacanth (Latimeria menadoensis) BAC clone encompassing the neurohypophysial hormone genes and investigated the evolutionary history of the vertebrate neurohypophysial hormone gene locus within a comparative genomics framework. The coelacanth contains vasotocin and mesotocin genes like non-mammalian tetrapods. The coelacanth genes are present on the same strand of DNA with no intervening genes, with the vasotocin gene located upstream of the mesotocin gene. Nucleotide sequences of the second exons of the two genes are under purifying selection, implying a regulatory function. We have also analyzed the neurohypophysial hormone gene locus in the genomes of opossum, chicken and Xenopus tropicalis. The opossum contains two tandem copies of vasopressin and mesotocin genes. The vasotocin and mesotocin genes in chicken and ...

  13. Gene discovery and transcript analyses in the corn smut pathogen Ustilago maydis: expressed sequence tag and genome sequence comparison

    Saville Barry J

    2007-09-01

    Background: Ustilago maydis is the basidiomycete fungus responsible for common smut of corn and is a model organism for the study of fungal phytopathogenesis. To aid in the annotation of the genome sequence of this organism, several expressed sequence tag (EST) libraries were generated from a variety of U. maydis cell types. In addition to their utility in gene identification and structure annotation, the ESTs were analyzed to identify differentially abundant transcripts and to detect evidence of alternative splicing and anti-sense transcription. Results: Four cDNA libraries were constructed using RNA isolated from U. maydis diploid teliospores (U. maydis strains 518 × 521) and haploid cells of strain 521 grown under nutrient rich, carbon starved, and nitrogen starved conditions. Using the genome sequence as a scaffold, the 15,901 ESTs were assembled into 6,101 contiguous expressed sequences (contigs); among these, 5,482 corresponded to predicted genes in the MUMDB (MIPS Ustilago maydis database), while 619 aligned to regions of the genome not yet designated as genes in MUMDB. A comparison of EST abundance identified numerous genes that may be regulated in a cell type or starvation-specific manner. The transcriptional response to nitrogen starvation was assessed using RT-qPCR. The results suggest that there may be cross-talk between the nitrogen and carbon signalling pathways in U. maydis. Bioinformatic analysis identified numerous examples of alternative splicing and anti-sense transcription. While intron retention was the predominant form of alternative splicing in U. maydis, other varieties were also evident (e.g. exon skipping). Selected instances of both alternative splicing and anti-sense transcription were independently confirmed using RT-PCR. Conclusion: Through this work: 1) substantial sequence information has been provided for U. maydis genome annotation; 2) new genes were identified through the discovery of 619 ...

  14. nef gene sequence variation among HIV-1-infected African children

    Chakraborty, R.; Reiniš, Milan; Rostron, T.; Philpott, S.; Dong, T.; D'Agostino, A.; Musoke, R.; de Silva, E.; Stumpf, M.; Weiser, B.; Burger, H.; Rowland-Jones, S.L.

    2006-01-01

    Vol. 7, No. 2 (2006), pp. 75-84. ISSN 1464-2662. Other grants: Fogarty International Center, NIH(US) 3D43TW00915; NIH(US) RO1 AI 42555. Institutional research plan: CEZ:AV0Z50520514. Keywords: HIV-1 nef gene * non-clade B * Kenya. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 2.674, year: 2006

  15. Microdiversity of extracellular enzyme genes among sequenced prokaryotic genomes

    Zimmerman, Amy E; Martiny, Adam C.; Allison, Steven D.

    2013-01-01

    Understanding the relationship between prokaryotic traits and phylogeny is important for predicting and modeling ecological processes. Microbial extracellular enzymes have a pivotal role in nutrient cycling and the decomposition of organic matter, yet little is known about the phylogenetic distribution of genes encoding these enzymes. In this study, we analyzed 3058 annotated prokaryotic genomes to determine which taxa have the genetic potential to produce alkaline phosphatase, chitinase and ...

  16. Haplotypes and Sequence Variation in the Ovine Adiponectin Gene (ADIPOQ)

    Qing-Ming An

    2015-11-01

    The adiponectin gene (ADIPOQ) plays an important role in energy homeostasis. In this study five separate regions (regions 1 to 5) of ovine ADIPOQ were analysed using PCR-SSCP. Four different PCR-SSCP patterns (A1-D1, A2-D2) were detected in each of region-1 and region-2, with seven and six SNPs being revealed, respectively. In region-3, three different patterns (A3-C3) and three SNPs were observed. Two patterns each (A4-B4, A5-B5) were observed in region-4 and region-5, with two SNPs and one SNP, respectively. In total, nineteen SNPs were detected, with five of them in the coding region and two (c.46T/C and c.515G/A) putatively resulting in amino acid changes (p.Tyr16His and p.Lys172Arg). In region-1, -2 and -3 of 316 sheep from eight New Zealand breeds, variants A1, A2 and A3 were the most common, although variant frequencies differed among the eight breeds. Across region-1 and region-3, nine haplotypes were identified, and haplotypes A1-A3, A1-C3, B1-A3 and B1-C3 were the most common. These results indicate that the ADIPOQ gene is polymorphic and suggest that further analysis is required to see if the variation in the gene is associated with animal production traits.
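
    The link between the coding-sequence positions (c.46, c.515) and the affected residues (16, 172) follows from codon numbering. The helper below is a small sketch, assuming 1-based positions counted from the A of the start codon, as is usual for "c." notation; the function name is hypothetical.

```python
from typing import Tuple


def codon_of(cds_position: int) -> Tuple[int, int]:
    """Map a 1-based coding-sequence position to (codon number, position in codon).

    Both returned values are 1-based; position in codon is 1, 2 or 3.
    """
    if cds_position < 1:
        raise ValueError("coding positions are 1-based")
    index = cds_position - 1
    return index // 3 + 1, index % 3 + 1


for pos in (46, 515):
    codon, offset = codon_of(pos)
    print(f"c.{pos} lies in codon {codon}, position {offset}")
```

    Position 46 is the first base of codon 16 and position 515 is the second base of codon 172, which is consistent with the reported Tyr/His change at residue 16 and Lys/Arg change at residue 172 under the standard genetic code.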

  17. Automated DNA mutation detection using universal conditions direct sequencing: application to ten muscular dystrophy genes

    Wu Bai-Lin

    2009-10-01

    Background: One of the most common and efficient methods for detecting mutations in genes is PCR amplification followed by direct sequencing. Until recently, the process of designing PCR assays has been to focus on individual assay parameters rather than concentrating on matching conditions for a set of assays. Primers for each individual assay were selected based on location and sequence concerns. The two primer sequences were then iteratively adjusted to make the individual assays work properly. This generally resulted in groups of assays with different annealing temperatures that required the use of multiple thermal cyclers or multiple passes in a single thermal cycler, making diagnostic testing time-consuming, laborious and expensive. These factors have severely hampered diagnostic testing services, leaving many families without an answer for the exact cause of a familial genetic disease. A search of GeneTests for sequencing analysis of the entire coding sequence for genes that are known to cause muscular dystrophies returns only a small list of laboratories that perform comprehensive gene panels. The hypothesis for the study was that a complete set of universal assays can be designed to amplify and sequence any gene or family of genes using computer aided design tools. If true, this would allow automation and optimization of the mutation detection process, resulting in reduced cost and increased throughput. Results: An automated process has been developed for the detection of deletions, duplications/insertions and point mutations in any gene or family of genes and has been applied to ten genes known to bear mutations that cause muscular dystrophy: DMD; CAV3; CAPN3; FKRP; TRIM32; LMNA; SGCA; SGCB; SGCG; SGCD. Using this process, mutations have been found in five DMD patients and four LGMD patients (one in the FKRP gene, one in the CAV3 gene, and two likely causative heterozygous pairs of variations in the CAPN3 gene of two other ...

  18. Transcriptome Sequencing and Positive Selected Genes Analysis of Bombyx mandarina

    Tingcai Cheng; Bohua Fu; Yuqian Wu; Renwen Long; Chun Liu; Qingyou Xia

    2015-01-01

    The wild silkworm Bombyx mandarina is widely believed to be an ancestor of the domesticated silkworm, Bombyx mori. Silkworms are often used as a model for studying the mechanism of species domestication. Here, we performed transcriptome sequencing of the wild silkworm using an Illumina HiSeq2000 platform. We produced 100,004,078 high-quality reads and assembled them into 50,773 contigs with an N50 length of 1764 bp and a mean length of 941.62 bp. A total of 33,759 unigenes were identified, wi...
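
    The N50 statistic quoted above (1764 bp) summarizes assembly contiguity: it is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled bases. The sketch below illustrates that definition; the function name and toy lengths are assumptions, not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are taken from longest to shortest until their cumulative
    length reaches at least half of the total; the length of the last
    contig taken is the N50.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:  # integer form of running >= total / 2
            return length


# Toy assembly for illustration only.
print(n50([2, 3, 4, 5, 6]))
```

    Because a few long contigs can dominate the total, N50 is normally well above the mean contig length, as in the record above (1764 bp versus 941.62 bp).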

  19. Application of gene sequencing directly to identify the pathogens in specimens

    LU Xin-xin; YUAN Liang; WAN Xiao-hua; GENG Jia-jing

    2010-01-01

    Background: Accurate identification of bacterial isolates is an essential task in clinical microbiology. This study compared culturing to analyzing 16S rRNA gene sequences as methods to identify bacteria in clinical samples. We developed a key technique to directly identify bacteria in clinical samples via nucleic acid sequences, thus improving the ability to confirm pathogens. Methods: We obtained 225 samples from Beijing Tongran Hospital and examined them by conventional culture and 16S rDNA sequencing to identify pathogens. This study made use of a modified sample pre-treatment technique developed in our laboratory to extract DNA. 16S rDNA was amplified by PCR. The amplified product was sequenced on a CEQ8000 capillary sequencer. Sequences were uploaded to the GenBank BLAST database for comparison. Results: Among the positively cultivated bacterial strains, seven strains were identified differently by Vitek32 and by 16S rDNA sequencing. Twelve samples that were negative by standard culturing were determined to have pathogens by sequence analysis. Conclusion: The use of 16S rRNA gene sequencing can improve clinical microbiology by providing better identification of unidentified bacteria or providing reference identification of unusual strains.

  20. IS21-558 insertion sequences are involved in the mobility of the multiresistance gene cfr

    Kehrenberg, Corinna; Aarestrup, Frank Møller; Schwarz, Stefan

    2007-01-01

    During a study of florfenicol-resistant porcine staphylococci from Denmark, the genes cfr and fexA were detected in the chromosomal DNA or on plasmids of Staphylococcus hyicus, Staphylococcus warneri, and Staphylococcus simulans. A novel variant of the phenicol resistance transposon Tn558...... was detected on the ca. 43-kb plasmid pSCFS6 in S. warneri and S. simulans isolates. Sequence analysis of a 22,010-bp segment revealed that the new Tn558 variant harbored an additional resistance gene region integrated into the tnpC reading frame. This resistance gene region consisted of the clindamycin...... exporter gene lsa(B) and the gene cfr for combined resistance to phenicols, lincosamides, oxazolidinones, pleuromutilins, and streptogramin A antibiotics bracketed by IS21-558 insertion sequences orientated in the same direction. A 6-bp target site duplication was detected at the integration site within...

  1. Sequence and expression analyses of the UL37 and UL38 genes of Aujeszky's disease virus.

    Braun, A; Kaliman, A; Boldogköi, Z; Aszódi, A; Fodor, I

    2000-01-01

    Previously, we sequenced the HSV-1 UL39-UL40 homologue genes of Aujeszky's disease virus (ADV), also designated as pseudorabies virus (Kaliman et al., 1994a, b). Now we report the nucleotide sequence of the adjacent DNA that encodes UL38, the 5'-region (750 bp) of UL37, and the promoter regions between these two divergently arranged genes. The ADV UL38 gene encodes a protein of 368 amino acids. Amino acid sequence comparison of ADV UL38 with that of other herpesviruses revealed significant structural homology. In a transcription study using RNase protection assay and Northern blot hybridization, we found that the UL38 gene had one initiation site, but the UL37 gene was initiated at two transcription sites with two potential initiator AUGs, one of which was dominant. Comparison of ADV UL37, UL38 and ribonucleotide reductase gene expression showed that these genes belong to the same temporal class with early kinetics. Data from structural and transcriptional studies suggest that regulation of the expression of these two ADV genes could differ from that of the HSV-1 virus. PMID:11402671

  2. Cloning and sequence analysis of a gene encoding polygalacturonase-inhibiting protein from cotton

    2003-01-01

    Polygalacturonase-inhibiting proteins (PGIP) play important roles in plant defense against pathogens, especially fungi. A pair of degenerate primers was designed based on the conserved sequence of 20 other known pgip genes and used to amplify a Gossypium barbadense cultivar 7124 cDNA library by touch-down PCR. A 561 bp internal fragment of the pgip gene was obtained and used to design the primers for rapid amplification of cDNA ends. A composite pgip gene sequence was constructed from the products of 5′ and 3′ RACE, which are 666 bp and 906 bp respectively. Analysis of the nucleic acid sequence shows 69.2% and 68.7% similarity to Citrus and Poncirus pgip genes, respectively. The open reading frame of the gene encodes a polypeptide of 330 amino acids, in which 10 leucine-rich repeats are arranged in tandem. A new set of primers was designed to the 5′ and 3′ ends of the gene, which allows amplification of the full-length gene from the cotton cDNA library. Genomic DNA analysis reveals that this gene has no intron.

  3. Sequence Analysis of the Protein Structure Homology Modeling of Growth Hormone Gene from Salmo trutta caspius

    Abolhasan Rezaei

    2012-03-01

    The growth hormone protein of Salmo trutta caspius was investigated and characterized. The growth hormone gene of Salmo trutta caspius has six exons over its full length, with reported molecular weights (kDa) of 64.98 for ssDNA and 129.6 for dsDNA. The protein has 210 amino acid residues. The assembled full-length DNA contains the open reading frame of the growth hormone gene, which contains 15 sequences in the full length. The average GC content is 47% and the AT content is 53%. Multiple protein alignment showed that this peptide is 100% identical to the corresponding homologous growth hormone proteins of Salmo salar (accession number AAA49558.1) and rainbow trout (Salmo trutta; accession number AAA49555.1). The protein sequence was deposited in GenBank under accession number AEK70940. We also analyzed the secondary and tertiary structures of the sequences reported in GenBank. The results show homology in secondary structure among the three sequences: Salmo trutta caspius, Salmo salar and rainbow trout. Regarding tertiary structure, Salmo trutta caspius and Salmo salar are of the same type, whereas rainbow trout differs from both. However, three parallel α helices were observed in the sequences, and the secondary structures had almost the same percentage of β sheet.

  4. The complete nucleotide sequence and structure of the gene encoding bovine phenylethanolamine N-methyltransferase.

    Batter, D K; D'Mello, S R; Turzai, L M; Hughes, H B; Gioio, A E; Kaplan, B B

    1988-03-01

    A cDNA clone for bovine adrenal phenylethanolamine N-methyltransferase (PNMT) was used to screen a Charon 28 genomic library. One phage was identified, designated lambda P1, which included the entire PNMT gene. Construction of a restriction map, with subsequent Southern blot analysis, allowed the identification of exon-containing fragments. Dideoxy sequence analysis of these fragments, and several more further upstream, indicates that the bovine PNMT gene is 1,594 base pairs in length, consisting of three exons and two introns. The transcription initiation site was identified by two independent methods and is located approximately 12 base pairs upstream from the ATG translation start site. The 3' untranslated region is 88 base pairs in length and contains the expected polyadenylation signal (AATAAA). A putative promoter sequence (TATA box) is located about 25 base pairs upstream from the transcription initiation site. Computer comparison of the nucleotide sequence data with the consensus sequences of known regulatory elements revealed potential binding sites for glucocorticoid receptors and the Sp1 regulatory protein in the 5' flanking region of the gene. Additionally, comparison of the sequence of the exons of the PNMT gene with cDNA sequences for other enzymes involved in biogenic amine synthesis revealed no significant homology, indicating that PNMT is not a member of a multigene family of catecholamine biosynthetic enzymes. PMID:3379652

  5. Characterization of the Helicoverpa assulta nucleopolyhedrovirus genome and sequence analysis of the polyhedrin gene region

    Soo-Dong Woo; Jae Young Choi; Yeon Ho Je; Byung Rae Jin

    2006-09-01

    A local strain of Helicoverpa assulta nucleopolyhedrovirus (HasNPV) was isolated from infected H. assulta larvae in Korea. Restriction endonuclease fragment analysis, using 4 restriction enzymes, estimated that the total genome size of HasNPV is about 138 kb. A degenerate polymerase chain reaction (PCR) primer set for the polyhedrin gene successfully amplified the partial polyhedrin gene of HasNPV. The sequencing results showed that the PCR product of about 430 bp was a fragment of the corresponding polyhedrin gene. Using the HasNPV partial predicted polyhedrin to probe the Southern blots, we identified the location of the polyhedrin gene within the 6 kb EcoRI, 15 kb NcoI, 20 kb XhoI, 17 kb BglII and 3 kb ClaI fragments, respectively. The 3 kb ClaI fragment was cloned and the nucleotide sequences of the polyhedrin coding region and its flanking regions were determined. Nucleotide sequence analysis indicated the presence of an open reading frame of 735 nucleotides which could encode 245 amino acids with a predicted molecular mass of 29 kDa. The nucleotide sequences within the coding region of HasNPV polyhedrin shared 73.7% identity with the polyhedrin gene from Autographa californica NPV but were most closely related to Helicoverpa and Heliothis species NPVs with over 99% sequence identity.

  6. GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data.

    Ben-Ari Fuchs, Shani; Lieder, Iris; Stelzer, Gil; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit

    2016-03-01

    Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from "data-to-knowledge-to-innovation," a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ (geneanalytics.genecards.org), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including the GeneCards®--the human gene database; the MalaCards--the human diseases database; and the PathCards--the biological pathways database. Expression-based analysis in GeneAnalytics relies on the LifeMap Discovery®--the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell "cards" in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics, pharmacogenomics, vaccinomics

  7. Nucleotide sequence of the beta-cyclodextrin glucanotransferase gene of alkalophilic Bacillus sp. strain 1011 and similarity of its amino acid sequence to those of alpha-amylases.

    Kimura, K.; Kataoka, S; Ishii, Y; Takano, T.; Yamane, K

    1987-01-01

    The nucleotide sequence of the gene for cyclodextrin glucanotransferase of alkalophilic Bacillus sp. strain 1011 was determined. The deduced amino acid sequence at the NH2-terminal side of the enzyme showed high homology with the sequences of alpha-amylases in the three regions which constitute the active centers of alpha-amylases.

  8. Versatile Cosmid Vectors for the Isolation, Expression, and Rescue of Gene Sequences: Studies with the Human α-globin Gene Cluster

    Lau, Yun-Fai; Kan, Yuet Wai

    1983-09-01

    We have developed a series of cosmids that can be used as vectors for genomic recombinant DNA library preparations, as expression vectors in mammalian cells for both transient and stable transformations, and as shuttle vectors between bacteria and mammalian cells. These cosmids were constructed by inserting one of the SV2-derived selectable gene markers (SV2-gpt, SV2-DHFR, and SV2-neo) in cosmid pJB8. High efficiency of genomic cloning was obtained with these cosmids and the size of the inserts was 30-42 kilobases. We isolated recombinant cosmids containing the human α-globin gene cluster from these genomic libraries. The simian virus 40 DNA in these selectable gene markers provides the origin of replication and enhancer sequences necessary for replication in permissive cells such as COS 7 cells and thereby allows transient expression of α-globin genes in these cells. These cosmids and their recombinants could also be stably transformed into mammalian cells by using the respective selection systems. Both of the adult α-globin genes were more actively expressed than the embryonic zeta-globin genes in these transformed cell lines. Because of the presence of the cohesive ends of the Charon 4A phage in the cosmids, the transforming DNA sequences could readily be rescued from these stably transformed cells into bacteria by in vitro packaging of total cellular DNA. Thus, these cosmid vectors are potentially useful for direct isolation of structural genes.

  9. Isolation and characterization of gene sequences expressed in cotton fiber

    Taciana de Carvalho Coutinho; Marcelo de Almeida Guimarães; Marcia Soares Vidal

    2016-01-01

    Cotton fibers are tubular cells which develop from the differentiation of ovule epidermis. In addition to being one of the most important natural fibers of the textile group, cotton fibers afford an excellent experimental system for studying the cell wall. The aim of this work was to isolate and characterise the genes expressed in cotton fiber (Gossypium hirsutum L.) to be used in future work in cotton breeding. Fibers of the cotton cultivar CNPA ITA 90 II were used to extract RNA for th...

  10. The complete mitochondrial genome sequence and gene organization of Tridentiger trigonocephalus (Gobiidae: Gobionellinae) with phylogenetic consideration.

    Wei, Hongqing; Ma, Hongyu; Ma, Chunyan; Zhang, Fengying; Wang, Wei; Chen, Wei; Ma, Lingbo

    2016-09-01

    The complete mitochondrial genome plays an important role in studies of genome-level characteristics and phylogenetic relationships. Here we determined the complete mitogenome sequence of Tridentiger trigonocephalus (Perciformes, Gobiidae) and inferred its phylogenetic relationships. This circular genome was 16 662 bp in length and consisted of 37 typical genes, including 13 protein-coding genes, 22 tRNA genes, and two rRNA genes. The gene order of the T. trigonocephalus mitochondrial genome was identical to that observed in most other vertebrates. Of the 37 genes, 28 were encoded by the heavy strand, while the others were encoded by the light strand. The phylogenetic tree constructed from 13 concatenated protein-coding genes showed that T. trigonocephalus was closest to T. bifasciatus, and then to T. barbatus, among the 20 species within suborder Gobioidei. This work should facilitate studies on population genetic diversity and molecular evolution in Gobioidei fishes. PMID:26370266

  11. Citrus plastid-related gene profiling based on expressed sequence tag analyses

    Tercilio Calsa Jr.

    2007-01-01

    Plastid-related sequences, derived from putative nuclear or plastome genes, were searched in a large collection of expressed sequence tags (ESTs) and genomic sequences from the Citrus Biotechnology initiative in Brazil. The identified putative Citrus chloroplast gene sequences were compared to those from Arabidopsis, Eucalyptus and Pinus. Differential expression profiling for plastid-directed nuclear-encoded proteins and photosynthesis-related gene expression variation between Citrus sinensis and Citrus reticulata, when inoculated or not with Xylella fastidiosa, were also analyzed. Presumed Citrus plastome regions were more similar to Eucalyptus. Some putative genes appeared to be preferentially expressed in vegetative tissues (leaves and bark) or in reproductive organs (flowers and fruits). Genes preferentially expressed in fruit and flower may be associated with hypothetical physiological functions. Expression pattern clustering analysis suggested that photosynthesis- and carbon fixation-related genes appeared to be up- or down-regulated in a resistant or susceptible Citrus species after Xylella inoculation in comparison to non-infected controls, generating novel information which may be helpful to develop novel genetic manipulation strategies to control Citrus variegated chlorosis (CVC).

  12. A flexible and economical barcoding approach for highly multiplexed amplicon sequencing of diverse target genes

    Craig W. Herbold

    2015-07-01

    High throughput sequencing of phylogenetic and functional gene amplicons provides tremendous insight into the structure and functional potential of complex microbial communities. Here, we introduce a highly adaptable and economical PCR approach to barcoding and pooling libraries of numerous target genes. In this approach, we replace gene- and sequencing platform-specific fusion primers with general, interchangeable barcoding primers, enabling nearly limitless customized barcode-primer combinations. Compared to barcoding with long fusion primers, our multiple-target gene approach is more economical because it requires fewer primers overall and is based on short primers with generally lower synthesis and purification costs. To highlight our approach, we pooled over 900 different small-subunit rRNA and functional gene amplicon libraries obtained from various environmental or host-associated microbial community samples into a single, paired-end Illumina MiSeq run. Although the amplicon regions ranged in size from approximately 290 to 720 bp, we found no significant systematic sequencing bias related to amplicon length or gene target. Our results indicate that this flexible multiplexing approach produces large, diverse and high quality sets of amplicon sequence data for modern studies in microbial ecology.
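
    Pooling many barcoded libraries into one run implies that reads must later be sorted back to their library of origin. The abstract does not describe the authors' pipeline, so the following is only a minimal sketch of barcode-based demultiplexing under simple assumptions: each read starts with an exact, error-free barcode, no barcode is a prefix of another, and the barcode sequences and library names are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by the barcode found at their 5' end.

    reads    -- iterable of read sequences (strings)
    barcodes -- dict mapping barcode sequence -> library name (hypothetical)
    Returns a dict mapping library name -> list of reads with the barcode
    trimmed off; reads with no matching barcode go to "unassigned".
    """
    by_library = defaultdict(list)
    for read in reads:
        for barcode, library in barcodes.items():
            if read.startswith(barcode):
                by_library[library].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            by_library["unassigned"].append(read)
    return dict(by_library)


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGT": "soil_SSU", "TTAG": "gut_functional"}
reads = ["ACGTGGGCCC", "TTAGAAAT", "GGGGAAAA"]
print(demultiplex(reads, barcodes))
```

    Real demultiplexers typically also tolerate sequencing errors in the barcode and handle paired-end reads; this sketch omits both.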

  13. Candida famata (Debaryomyces hansenii) DNA sequences containing genes involved in riboflavin synthesis.

    Voronovsky, Andriy Y; Abbas, Charles A; Dmytruk, Kostyantyn V; Ishchuk, Olena P; Kshanovska, Barbara V; Sybirna, Kateryna A; Gaillardin, Claude; Sibirny, Andriy A

    2004-11-01

    Previously cloned Candida famata (Debaryomyces hansenii) strain VKM Y-9 genomic DNA fragments containing genes RIB1 (codes for GTP cyclohydrolase II), RIB2 (encodes specific reductase), RIB5 (codes for dimethylribityllumazine synthase), RIB6 (encodes dihydroxybutanone phosphate synthase) and RIB7 (codes for riboflavin synthase) were sequenced. The derived amino acid sequences of C. famata RIB genes showed extensive homology to the corresponding sequences of riboflavin synthesis enzymes of other yeast species. The highest identity was observed to homologues of D. hansenii CBS767, as C. famata is the anamorph of this hemiascomycetous yeast. The D. hansenii CBS767 RIB3 gene encoding specific deaminase was cloned. This gene successfully complemented riboflavin auxotrophy of the rib3 mutant of flavinogenic yeast, Pichia guilliermondii. Putative iron-responsive elements (potential sites for binding of the transcription factors Fep1p or Aft1p and Aft2p) were found in the upstream regions of some C. famata and D. hansenii RIB genes. The sequences of C. famata RIB genes have been submitted to the EMBL data library under Accession Nos AJ810169-AJ810173. PMID:15543522

  14. Analysis of breast cancer metastasis candidate genes from next generation-sequencing via systematic functional genomics

    Blomstrøm, Monica Marie

    2016-01-01

    ) and non-CSCs. The main goal of this project was to functionally characterize a set of candidate genes recovered from next-generation sequencing analysis for their role in breast cancer metastasis formation. The starting gene set comprised 104 gene variants; i.e. 57 wildtype and 47 mutated variants....... During the project, the aim was to generate a panel of genetically identical (“isogenic”) MCF7 breast cancer cell lines with inducible overexpression of the gene variants, and to analyze these for effects on breast cancer growth and invasion in vitro under standardized conditions. Moreover, it was aimed...

  15. Analysis of human growth hormone gene 5' sequences in isolated growth hormone deficiency patients.

    Wang, Y.; Yu, L L; Sheng, Q.; Meng, C; Sun, J.; S.S. Chen

    1994-01-01

    Human growth hormone (hGH) gene deletion (6.7 to 7.6 kb) is one of the causes of isolated growth hormone deficiency (IGHD), named IGHD IA. IGHD IA, however, only accounts for about 10% of the total IGHD patients. Most IGHD is caused by unknown mechanisms. Here, hGH gene 5' sequences in three IGHD patients without hGH gene deletion were analysed to see if there was any mutation hindering the expression of the hGH gene.

  16. Cloning, nucleotide sequence, and expression of the Bacillus subtilis lon gene.

    Riethdorf, S.; Völker, U; Gerth, U.; Winkler, A; Engelmann, S; Hecker, M.

    1994-01-01

    The lon gene of Escherichia coli encodes the ATP-dependent serine protease La and belongs to the family of sigma 32-dependent heat shock genes. In this paper, we report the cloning and characterization of the lon gene from the gram-positive bacterium Bacillus subtilis. The nucleotide sequence of the lon locus, which is localized upstream of the hemAXCDBL operon, was determined. The lon gene codes for an 87-kDa protein consisting of 774 amino acid residues. A comparison of the deduced amino ac...

  17. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    Kaas, Rolf Sommer; Rundsten, Carsten Friis; Ussery, David;

    2012-01-01

    creating better phylogenies, for determination of molecular clocks and for improved typing techniques. Results We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps...... 186 sequenced E. coli genomes. The core-gene tree displays high confidence and divides the E. coli strains into the observed MLST type clades and also separates defined phylotypes. Conclusion The results of comparing a large and diverse E. coli dataset support the theory that reliable and good...
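
    The distinction drawn above between gene clusters present in at least 95% of genomes (the 'soft core') and those present in 100% can be sketched as a simple presence count. The data structure, function name, and toy clusters below are illustrative assumptions and do not reproduce the study's clustering method.

```python
def classify_core(presence, n_genomes, soft_percent=95):
    """Split gene clusters into strict core and soft core.

    presence     -- dict mapping cluster name -> set of genome ids carrying it
    n_genomes    -- total number of genomes compared
    soft_percent -- minimum percentage of genomes for the soft core
    Returns (strict_core, soft_core) as sorted lists of cluster names.
    The strict core is, by definition, a subset of the soft core.
    """
    strict_core, soft_core = [], []
    for cluster, genomes in presence.items():
        count = len(genomes)
        if count == n_genomes:
            strict_core.append(cluster)
        # Integer arithmetic avoids floating-point edge cases at the threshold.
        if count * 100 >= soft_percent * n_genomes:
            soft_core.append(cluster)
    return sorted(strict_core), sorted(soft_core)


# Toy example with 20 hypothetical genomes labelled 0..19.
presence = {
    "clusterA": set(range(20)),  # in every genome
    "clusterB": set(range(19)),  # in 19 of 20 genomes (95%)
    "clusterC": set(range(10)),  # in half of the genomes
}
print(classify_core(presence, n_genomes=20))
```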

  18. Cloning and Sequence Analysis of Envelope Glycoprotein E1 Gene of Rubella Virus, JR23 Strain

    王志玉; 薛永磊; 王小凡; 宋艳艳; 温红玲

    2003-01-01

    To construct an expression vector containing the E1 glycoprotein gene of rubella virus for studying the effect of mutations in the E1 glycoprotein gene and for analyzing phylogenetic differences among sequences, the gene encoding the E1 envelope glycoprotein was amplified from rubella virus, Jinan strain JR23, by RT-PCR and ligated into the PMD-18T vector. Clones carrying the E1 gene were identified after ampr selection and restriction enzyme digestion analysis. After sequencing, the gene was analyzed with the Danstar and Winstar programs, and a phylogenetic tree was drawn. A clone of the E1 glycoprotein gene was thus constructed. The sequence differences between the JR23 strain and the TCRB strain from Japan, and between the JR23 strain and the Thomas strain from England, were rather small, with difference values of 0.9% and 1.2% respectively. Those between the JR23 strain and the BRD2 strain from Beijing, and between the JR23 strain and the XG379 strain from Hong Kong, were comparatively larger, with difference values of 7.6% and 7.3% respectively. The sequence difference between the JR23 strain and the other strains was less than 3%, except for the NC strain (3.7%). We conclude that the construction of the E1 glycoprotein gene clone offers an approach to studying the relationship between the structure and function of the E1 gene and its gene products. The phylogenetic tree shows significant differences among the sequences of rubella viruses isolated in China, which might be helpful in developing an effective subunit vaccine.
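
    The abstract reports pairwise difference values between strains but does not say how they were calculated. One common and simple measure is the uncorrected percentage of mismatched positions between two aligned sequences; the sketch below assumes that measure, ignores gapped columns, and uses made-up sequences rather than rubella virus data.

```python
def percent_difference(seq_a, seq_b):
    """Uncorrected percent difference between two aligned sequences.

    Both sequences must have the same aligned length. Columns where
    either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared


# Made-up aligned fragments: one mismatch in ten compared positions.
print(percent_difference("ACGTACGTAC", "ACGTACGTAA"))
```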

  19. Rapid evolution of the sequences and gene repertoires of secreted proteins in bacteria.

    Teresa Nogueira

    Proteins secreted to the extracellular environment or to the periphery of the cell envelope, the secretome, play essential roles in foraging, antagonistic and mutualistic interactions. We hypothesize that arms races, genetic conflicts and varying selective pressures should lead to the rapid change of sequences and gene repertoires of the secretome. The analysis of 42 bacterial pan-genomes shows that secreted, and especially extracellular proteins, are predominantly encoded in the accessory genome, i.e. among genes not ubiquitous within the clade. Genes encoding outer membrane proteins might engage more frequently in intra-chromosomal gene conversion because they are more often in multi-genic families. The gene sequences encoding the secretome evolve faster than the rest of the genome and in particular at non-synonymous positions. Cell wall proteins in Firmicutes evolve particularly fast when compared with outer membrane proteins of Proteobacteria. Virulence factors are over-represented in the secretome, notably in outer membrane proteins, but cell localization explains more of the variance in substitution rates and gene repertoires than sequence homology to known virulence factors. Accordingly, the repertoires and sequences of the genes encoding the secretome change fast in the clades of obligatory and facultative pathogens and also in the clades of mutualists and free-living bacteria. Our study shows that cell localization shapes genome evolution. In agreement with our hypothesis, the repertoires and the sequences of genes encoding secreted proteins evolve fast. The particularly rapid change of extracellular proteins suggests that these public goods are key players in bacterial adaptation.

  20. Molecular cloning, sequence characterization, and gene expression profiling of a novel water buffalo (Bubalus bubalis) gene, AGPAT6.

    Song, S; Huo, J L; Li, D L; Yuan, Y Y; Yuan, F; Miao, Y W

    2013-01-01

    Several 1-acylglycerol-3-phosphate-O-acyltransferases (AGPATs) can acylate lysophosphatidic acid to produce phosphatidic acid. Of the eight AGPAT isoforms, AGPAT6 is a crucial enzyme for glycerolipid and triacylglycerol biosynthesis in some mammalian tissues. We amplified and identified the complete coding sequence (CDS) of the water buffalo AGPAT6 gene by using the reverse transcription-polymerase chain reaction, based on the conserved sequence information of cattle or expressed sequence tags of other Bovidae species. This novel gene was deposited in the NCBI database (accession No. JX518941). Sequence analysis revealed that the CDS of this AGPAT6 encodes a 456-amino acid enzyme (molecular mass = 52 kDa; pI = 9.34). Water buffalo AGPAT6 contains three hydrophobic transmembrane regions and a 37-amino acid signal peptide, localized in the cytoplasm. The deduced amino acid sequences share 99, 98, 98, 97, 98, 98, 97 and 95% identity with their homologous sequences from cattle, horse, human, mouse, orangutan, pig, rat, and chicken, respectively. The phylogenetic tree analysis based on the AGPAT6 CDS showed that water buffalo has a closer genetic relationship with cattle than with other species. Tissue expression profile analysis shows that this gene is highly expressed in the mammary gland; moderately expressed in the heart, muscle, liver, and brain; weakly expressed in the pituitary gland, spleen, and lung; and almost silent in the small intestine, skin, kidney, and adipose tissues. Four predicted microRNA target sites are found in the water buffalo AGPAT6 CDS. These results will establish a foundation for further insights into this novel water buffalo gene. PMID:24114207

  1. GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data

    Ben-Ari Fuchs, Shani; Lieder, Iris; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit

    2016-01-01

    Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from “data-to-knowledge-to-innovation,” a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ (geneanalytics.genecards.org), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including the GeneCards®--the human gene database; the MalaCards--the human diseases database; and the PathCards--the biological pathways database. Expression-based analysis in GeneAnalytics relies on the LifeMap Discovery®--the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell “cards” in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics

  2. Next-generation sequencing approach for connecting secondary metabolites to biosynthetic gene clusters in fungi

    Ralph A Cacho

    2015-01-01

    Genomics has revolutionized the research on fungal secondary metabolite biosynthesis. To elucidate the molecular and enzymatic mechanisms underlying the biosynthesis of a specific secondary metabolite compound, the important first step is often to find the genes that are responsible for its synthesis. The accessibility of fungal genome sequences allows the bypass of the cumbersome traditional library construction and screening approach. Advances in next-generation sequencing (NGS) technologies have further improved the speed and reduced the cost of microbial genome sequencing in the past few years, which has accelerated the research in this field. Here, we will present an example workflow for identifying the gene cluster encoding the biosynthesis of secondary metabolites of interest using an NGS approach. We will also review the different strategies that can be employed to pinpoint the targeted gene clusters rapidly by giving several examples stemming from our work.

  3. Gene organization and complete sequence of the mitochondrial genome of Linwu mallard.

    Tian, Ke-Xiong; Liu, Li-Li; Yu, Qi-Fang; He, Shao-Ping; He, Jian-Hua

    2016-01-01

    Linwu mallard is an excellent native breed from Hunan province in China. This is the first study to determine the complete mitochondrial genome sequence of L. mallard using PCR-based amplification and Sanger sequencing. The characteristics of the entire mitochondrial genome were analyzed in detail, with a base composition of 29.19% A, 22.19% T, 32.83% C, 15.79% G in the L. mallard (16,605 bp in length). It contained 2 ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and a major non-coding control region (D-loop region). The complete mitochondrial genome sequence of L. mallard will be useful for the phylogenetics of poultry and will be available as basic data for genetics and breeding. PMID:24938102
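
    Base composition figures such as those above are obtained by counting each nucleotide and dividing by the sequence length. A minimal sketch follows; the function name and the short toy sequence are illustrative assumptions, and only unambiguous A, C, G and T are counted.

```python
from collections import Counter


def base_composition(sequence):
    """Return the percentage of A, C, G and T in a nucleotide sequence.

    Ambiguous symbols (for example N) are excluded from the total.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


# Toy sequence, not the mitochondrial genome itself.
print(base_composition("AACCCGTT"))

# The four percentages reported in the abstract account for the whole genome.
reported = {"A": 29.19, "T": 22.19, "C": 32.83, "G": 15.79}
print(round(sum(reported.values()), 2))
```

    Run on the full 16,605 bp sequence, a function of this kind would be expected to reproduce the reported percentages to within rounding.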

  4. Isolation of laccase gene-specific sequences from white rot and brown rot fungi by PCR

    D'Souza, T.M.; Boominathan, K.; Reddy, C.A. [Michigan State Univ., East Lansing, MI (United States)]

    1996-10-01

    Degenerate primers corresponding to the consensus sequences of the copper-binding regions in the N-terminal domains of known basidiomycete laccases were used to isolate laccase gene-specific sequences from strains representing nine genera of wood rot fungi. All except three gave the expected PCR product of about 200 bp. Computer searches of the databases identified the sequence of each of the PCR products analyzed as a laccase gene sequence, suggesting the specificity of the primers. PCR products of the white rot fungi Ganoderma lucidum, Phlebia brevispora, and Trametes versicolor showed 65 to 74% nucleotide sequence similarity to each other; the similarity in deduced amino acid sequences was 83 to 91%. The PCR products of Lentinula edodes and Lentinus tigrinus, on the other hand, showed relatively low nucleotide and amino acid similarities (58 to 64 and 62 to 81%, respectively); however, these similarities were still much higher than those with the corresponding regions in the laccases of the ascomycete fungi Aspergillus nidulans and Neurospora crassa. A few of the white rot fungi, as well as Gloeophyllum trabeum, a brown rot fungus, gave a 144-bp PCR fragment which had a nucleotide sequence similarity of 60 to 71%. Demonstration of laccase activity in G. trabeum and several other brown rot fungi was of particular interest because these organisms were not previously shown to produce laccases. 36 refs., 6 figs., 2 tabs.
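
    Pairwise similarity percentages such as those quoted above are typically computed from aligned sequences. The sketch below shows one common convention, assumed here for illustration: columns where both sequences have a gap are skipped, and a gap against a base counts as a mismatch. The abstract does not say which convention was used, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 8 matching columns out of 10 compared.
identity = percent_identity("ATGCTAGCTA", "ATGCTTGC-A")
print(identity)
```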

  5. Automated conserved noncoding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

    Gina Turco

    2013-07-01

    Conserved noncoding sequences (CNSs) are islands of noncoding sequence that, like protein-coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNSs in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of noncoding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of Arabidopsis, Japonica rice, foxtail millet, sorghum, Brachypodium and maize.

  6. Molecular Identification and Sequencing of Mannose Binding Protein (MBP) Gene of Acanthamoeba palestinensis

    M Rezaeian

    2010-02-01

    Background: Acanthamoeba keratitis is caused by pathogenic Acanthamoeba such as A. palestinensis. This species is one of the known causative agents of amoebic keratitis in Iran. Mannose Binding Protein (MBP) is the main pathogenicity factor in the development of this sight-threatening disease. We aimed to characterize the MBP gene in pathogenic Acanthamoeba isolates such as A. palestinensis. Methods: This experimental research was performed in the School of Public Health, Tehran University of Medical Sciences, Tehran, Iran during 2007-2008. A. palestinensis was grown on 2% non-nutrient agar overlaid with Escherichia coli. DNA was extracted using the phenol-chloroform method. PCR amplification was done using specific primer pairs for MBP. The amplified fragment was purified and sequenced. Finally, the obtained fragment was deposited in the gene data bank. Results: A 900 bp PCR product was recovered after the PCR reaction. Sequence analysis of the purified PCR product revealed a gene with 943 nucleotides. Homology analysis of the obtained sequence showed 81% similarity with the available MBP gene in the gene data bank. The fragment was deposited in the gene data bank under accession number EU678895. Conclusion: MBP is known as the most important factor in the Acanthamoeba pathogenesis cascade. Therefore, characterization of this gene can aid in developing better therapeutic agents and even immunization of high-risk people.

  7. Identification, sequencing and structural analysis of a nifA-like gene of Acetobacter diazotrophicus.

    Teixeira, K R; Morgan, T; Meletzus, D; Galler, R; Baldani, J I; Kennedy, C

    1999-01-01

    A recombinant plasmid, pAD101, containing a DNA fragment of Acetobacter diazotrophicus strain PAL5 was isolated by its ability to restore the Nif+ phenotype to a nifA- ntrC- double mutant of Azotobacter vinelandii. Hybridization with the nifA gene of Azospirillum brasilense located the nifA gene more precisely to specific fragments of pAD101. DNA sequencing of appropriate subclones of pAD101 revealed that the nifA gene was adjacent to the nifB gene in A. diazotrophicus, and the 5' end of the nifB gene was located downstream of the nitrogenase MoFe subunit gene, nifK. The deduced amino acid sequences of the A. diazotrophicus nifA and nifB genes were most similar to the NifA and NifB proteins of Azorhizobium caulinodans and Rhodobacter capsulatus, respectively. In addition, nucleotide sequences upstream of the A. diazotrophicus nifA-encoding region indicate features similar to those in the A. caulinodans nifA promoter region involved in O2 and fixed N regulation of nifA expression. PMID:10530336

  8. p21WAF1/CIP1 gene DNA sequencing and its expression in human osteosarcoma

    廖威明; 张春林; 李佛保; 曾炳芳; 曾益新

    2004-01-01

    Background: Mutation and altered expression of p21WAF1/CIP1 may play a role in the growth of osteosarcoma. This study investigated the expression of the p21WAF1/CIP1 gene in human osteosarcoma, changes in the p21WAF1/CIP1 gene DNA sequence, and their relationships with phenotype and clinical prognosis. Methods: The p21WAF1/CIP1 gene in 10 normal people and in the tumours of 45 osteosarcoma patients was examined using polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) with silver staining. PCR products with an abnormal strand were sequenced directly. The p21WAF1/CIP1 mRNA and P21 protein of 45 cases of osteosarcoma were investigated using in situ hybridization and immunohistochemistry, respectively. Results: P21 protein was detected in 17.78% (8/45) of osteosarcomas, and p21WAF1/CIP1 mRNA expression in 42.22% (19/45). DNA sequencing of the amplified products showed that, in exon 3 of the p21WAF1/CIP1 gene, 17 of 36 cases of human osteosarcoma (47.22%) had C→T at position 609; DNA sequence analysis of 10 normal blood samples yielded 8 cases (80.00%) with C→T at the same position. Conclusions: As malignancy increases, the expression of p21WAF1/CIP1 mRNA and P21 protein in osteosarcoma tends to decrease. Mutation of the p21WAF1/CIP1 gene is uncommon in human osteosarcoma. As a result, the possible existence of tumour subtypes with p21WAF1/CIP1 gene mutation should be investigated. Our research locates a p21WAF1/CIP1 gene polymorphism in Chinese osteosarcoma patients, which can provide a basis for further research.

  9. Complete nucleotide sequence and gene rearrangement of the mitochondrial genome of Occidozyga martensii

    En Li; Xiaoqiang Li; Xiaobing Wu; Ge Feng; Man Zhang; Haitao Shi; Lijun Wang; Jianping Jiang

    2014-12-01

    In this study, the complete nucleotide sequence (18,321 bp) of the mitochondrial (mt) genome of the round-tongued floating frog, Occidozyga martensii, was determined. Although the base composition and codon usage of O. martensii conformed to the typical vertebrate patterns, this mt genome contained 23 tRNAs (owing to a tandem duplication of the tRNA-Met gene). The LTPF tRNA-gene cluster, and the derived position of the ND5 gene downstream of the control region, were present in this mitogenome. Moreover, we found that in the WANCY tRNA-gene cluster, the tRNA-Asn gene was located between the tRNA-Tyr and COI genes instead of between the tRNA-Ala and tRNA-Cys genes, which is a novel mtDNA gene rearrangement in vertebrates. Based on the concatenated nucleotide sequences of the 13 protein-coding genes, phylogenetic analysis (BI, ML, MP) was performed to further clarify the phylogenetic relations of this species within anurans.

  10. Re-annotation of genome microbial CoDing-Sequences: finding new genes and inaccurately annotated genes

    Danchin Antoine

    2002-02-01

    Background: Analysis of any newly sequenced bacterial genome starts with the identification of protein-coding genes. Despite the accumulation of multiple complete genome sequences, which provide useful comparisons with close relatives among other organisms during the annotation process, accurate gene prediction remains quite difficult. A major reason for this situation is that genes are tightly packed in prokaryotes, resulting in frequent overlap. Thus, detection of translation initiation sites and/or selection of the correct coding regions remain difficult unless appropriate biological knowledge (about the structure of a gene) is embedded in the approach. Results: We have developed a new program that automatically identifies biologically significant candidate genes in a bacterial genome. Twenty-six complete prokaryotic genomes were analyzed using this tool, and the accuracy of gene finding was assessed by comparison with existing annotations. This analysis revealed that, despite the enormous effort of genome program annotators, a small but not negligible number of genes annotated within the framework of sequencing projects are likely to be partially inaccurate or plainly wrong. Moreover, the analysis of several putative new genes shows that, as expected, many short genes have escaped annotation. In most cases, these new genes revealed frameshifts that could be either artifacts or genuine frameshifts. Some entirely unexpected new genes have also been identified. This allowed us to get a more complete picture of prokaryotic genomes. The results of this procedure are progressively integrated into the SWISS-PROT reference databank. Conclusions: The results described in the present study show that our procedure is very satisfactory in terms of gene finding accuracy. Except in a few cases, discrepancies between our results and annotations provided by individual authors can be accounted for by the nature of each annotation process or by specific

  11. Hunting down frame shifts: Ecological analysis of diverse functional gene sequences

    Michal Strejcek

    2015-11-01

    Functional gene ecological analyses using amplicon sequencing can be challenging as translated sequences are often burdened with shifted reading frames. The aim of this work was to evaluate several bioinformatics tools designed to correct errors which arise during sequencing in an effort to reduce the number of frame-shifts (FS). Genes encoding the alpha subunits of biphenyl (bphA) and benzoate (benA) dioxygenases were used as model sequences. FrameBot, a FS correction tool, was able to reduce the number of detected FS to zero. However, up to 43.1% of sequences were discarded by FrameBot as non-specific targets. Therefore, we proposed a de novo mode of FrameBot for FS correction, which works on a similar basis as common chimera-identifying platforms and is not dependent on reference sequences. By the nature of the FrameBot de novo design, it is crucial to provide it with data that are as error-free as possible. We tested the ability of several publicly available correction tools to decrease the number of errors in the data sets. The combination of Maximum Expected Error (MEE) filtering and single linkage pre-clustering (SLP) proved the most efficient read processing. Applying FrameBot de novo to the processed data enabled analysis of BphA sequences with minimal losses of potentially functional sequences not homologous to those previously known. This experiment also demonstrated the extensive diversity of dioxygenases in soil. A script which performs FrameBot de novo is presented in the supplementary material to the study, and the tool was implemented in the FunGene Pipeline available at http://fungene.cme.msu.edu/FunGenePipeline/ and https://github.com/rdpstaff/Framebot.
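
    The idea behind Maximum Expected Error (MEE) filtering can be sketched as follows. The expected number of errors in a read is the sum of its per-base error probabilities, p = 10^(-Q/10), where Q is the Phred quality score. This is a minimal illustration rather than the study's pipeline; the threshold of 1.0, the function names and the toy reads are assumptions.

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities, p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def mee_filter(reads, max_ee=1.0):
    """Keep (sequence, qualities) pairs whose expected errors do not exceed max_ee.

    max_ee=1.0 is an illustrative threshold, not a value taken from the study.
    """
    return [(seq, quals) for seq, quals in reads if expected_errors(quals) <= max_ee]


# Toy reads (hypothetical): uniform quality scores make the arithmetic easy to follow.
reads = [
    ("ACGTACGT", [30] * 8),           # 8 bases at p = 0.001, about 0.008 expected errors
    ("ACGTACGT", [10] * 8),           # 8 bases at p = 0.1, about 0.8 expected errors
    ("ACGTACGTACGTACGT", [10] * 16),  # 16 bases at p = 0.1, about 1.6 expected errors
]
kept = mee_filter(reads, max_ee=1.0)
print(len(kept))
```

    Because the criterion sums over the whole read, a long read of mediocre quality can be rejected even though no single base is especially poor.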

  12. Next-generation sequencing approach for connecting secondary metabolites to biosynthetic gene clusters in fungi

    Cacho, Ralph A.; Yi Tang; Yit-Heng Chooi

    2015-01-01

    Genomics has revolutionized the research on fungal secondary metabolite biosynthesis. To elucidate the molecular and enzymatic mechanisms underlying the biosynthesis of a specific secondary metabolite compound, the important first step is often to find the genes that are responsible for its synthesis. The accessibility of fungal genome sequences allows the bypass of the cumbersome traditional library construction and screening approach. The advance in next-generation sequencing (NGS) technologies...

  13. Next-generation sequencing approach for connecting secondary metabolites to biosynthetic gene clusters in fungi

    Cacho, Ralph A.; Tang, Yi; Chooi, Yit-Heng

    2015-01-01

    Genomics has revolutionized the research on fungal secondary metabolite (SM) biosynthesis. To elucidate the molecular and enzymatic mechanisms underlying the biosynthesis of a specific SM compound, the important first step is often to find the genes that are responsible for its synthesis. The accessibility of fungal genome sequences allows the bypass of the cumbersome traditional library construction and screening approach. Advances in next-generation sequencing (NGS) technologies have further...

  14. Cyanobacterial community structure as seen from RNA polymerase gene sequence analysis.

    Palenik, B

    1994-01-01

    PCR was used to amplify DNA-dependent RNA polymerase gene sequences specifically from the cyanobacterial population in a seawater sample from the Sargasso Sea. Sequencing and analysis of the cloned fragments suggest that the population in the sample consisted of two distinct clusters of Prochlorococcus-like cyanobacteria and four clusters of Synechococcus-like cyanobacteria. The diversity within these clusters was significantly different, however. Clones within each Synechococcus-like cluster...

  15. Sequence Diversity and Genomic Organization of Vomeronasal Receptor Genes in the Mouse

    Del Punta, Karina; Rothman, Andrea; Rodriguez, Ivan; Mombaerts, Peter

    2000-01-01

    The vomeronasal system of mice is thought to be specialized in the detection of pheromones. Two multigene families have been identified that encode proteins with seven putative transmembrane domains and that are expressed selectively in subsets of neurons of the vomeronasal organ. The products of these vomeronasal receptor (Vr) genes are regarded as candidate pheromone receptors. Little is known about their genomic organization and sequence diversity, and only five sequences of mouse V1r codi...

  16. In-depth cDNA Library Sequencing Provides Quantitative Gene Expression Profiling in Cancer Biomarker Discovery

    Wanling Yang; Dingge Ying; Yu-Lung Lau

    2009-01-01

    procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new-generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting the relative expression ratio of other genes, as transcripts from as few as 300 of the most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also offers the unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.

  17. Yersinia spp. Identification Using Copy Diversity in the Chromosomal 16S rRNA Gene Sequence.

    Hao, Huijing; Liang, Junrong; Duan, Ran; Chen, Yuhuang; Liu, Chang; Xiao, Yuchun; Li, Xu; Su, Mingming; Jing, Huaiqi; Wang, Xin

    2016-01-01

    The API 20E strip test, the standard for Enterobacteriaceae identification, is not sufficient to discriminate some Yersinia species because of some unstable biochemical reactions and the identical biochemical profiles presented by some species, e.g. Yersinia frederiksenii and Yersinia intermedia, which need a variety of molecular biology methods as auxiliaries for identification. The 16S rRNA gene is considered a valuable tool for assigning bacterial strains to species. However, the resolution of the 16S rRNA gene may be insufficient for discrimination because of the high similarity of sequences between some species and heterogeneity within copies at the intra-genomic level. In this study, we randomly selected five 16S rRNA gene clones from each of 768 Yersinia strains, and collected 3,840 sequences of the 16S rRNA gene from 10 species, which were divided into 439 patterns. The similarity among the five clones of the 16S rRNA gene is over 99% for most strains. Identical sequences were found in strains of different species. A phylogenetic tree was constructed using the five 16S rRNA gene sequences for each strain, in which the phylogenetic classifications are consistent with biochemical tests, and species that are difficult to identify by biochemical phenotype can be differentiated. Most Yersinia strains form distinct groups within each species. However, Yersinia kristensenii, a heterogeneous species, clusters with some Yersinia enterocolitica and Yersinia frederiksenii/intermedia strains, although this does not affect the overall efficiency of the species classification. In conclusion, by integrating information from multiple 16S rRNA gene sequences, our method improves the discrimination of Yersinia species. PMID:26808495

  18. Nucleotide sequence of the gene for the b subunit of human factor XIII

    Bottenus, R.E.; Ichinose, A.; Davie, E.W. (Univ. of Washington, Seattle (USA))

    1990-12-01

    Factor XIII (M_r 320 000) is a blood coagulation factor that stabilizes and strengthens the fibrin clot. It circulates in blood as a tetramer composed of two a subunits (M_r 75 000 each) and two b subunits (M_r 80 000 each). The b subunit consists of 641 amino acids and includes 10 tandem repeats of 60 amino acids known as GP-I structures, short consensus repeats (SCR), or sushi domains. In the present study, the human gene for the b subunit has been isolated from three different genomic libraries prepared in λ phage. Fifteen independent phage with inserts coding for the entire gene were isolated and characterized by restriction mapping, Southern blotting, and DNA sequencing. The gene was found to be 28 kilobases in length and consisted of 12 exons (I-XII) separated by 11 intervening sequences. The leader sequence was encoded by exon I, while the carboxyl-terminal region of the protein was encoded by exon XII. Exons II-XI each coded for a single sushi domain, suggesting that the gene evolved through exon shuffling and duplication. The 12 exons in the gene ranged in size from 64 to 222 base pairs, while the introns ranged in size from 87 to 9970 nucleotides and made up 92% of the gene. One nucleotide change was found in the coding region of the gene when its sequence was compared to that of the cDNA. This difference, however, did not result in a change in the amino acid sequence of the protein.

  19. Soluble normal and mutated DNA sequences from single-copy genes in human blood.

    Sorenson, G D; Pribish, D M; Valone, F H; Memoli, V A; Bzik, D J; Yao, S L

    1994-01-01

    Healthy individuals have soluble (extracellular) DNA in their blood, and increased amounts are present in cancer patients. Here we report the detection of specific sequences of the cystic fibrosis and K-ras genes in plasma DNA from normal donors by amplification with the polymerase chain reaction. In addition, mutated K-ras sequences are identified by polymerase chain reaction utilizing allele-specific primers in the plasma or serum from three patients with pancreatic carcinoma that contain mutated K-ras genes. The mutations are confirmed by direct sequencing. These results indicate that sequences of single-copy genes can be identified in normal plasma and that the sequences of mutated oncogenes can be detected and identified with allele-specific amplification by polymerase chain reaction in plasma or serum from patients with malignant tumors containing identical mutated genes. Mutated oncogenes in plasma and serum may represent tumor markers that could be useful for diagnosis, determining response to treatment, and predicting prognosis. PMID:8118388

  20. Nucleotide sequences of immunoglobulin eta genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution

    To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (C_eta1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of α1-antitrypsin and β- and δ-globin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 × 10^-9 substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 +/- 2.6 million years and 17.3 +/- 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
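
    The arithmetic behind a molecular-clock estimate can be sketched with the standard relation T = K / (2r), where K is the divergence per site between two lineages and r is the substitution rate per site per year; the factor 2 reflects that both lineages accumulate changes after the split. The abstract does not state the formula it used, so this relation is an assumption, and the function names are hypothetical. Only the rate and the two divergence dates come from the abstract.

```python
RATE = 1.56e-9  # silent substitutions per site per year (value quoted in the abstract)


def divergence_time_years(k_silent, rate=RATE):
    """T = K / (2 * r), assuming a constant rate along both lineages."""
    return k_silent / (2.0 * rate)


def implied_divergence(t_years, rate=RATE):
    """Inverse relation: silent divergence K implied by a split t_years ago."""
    return 2.0 * rate * t_years


# Silent-site divergences implied by the dates in the abstract (derived here, not reported).
k_chimp = implied_divergence(6.4e6)    # about 0.020 substitutions per silent site
k_orang = implied_divergence(17.3e6)   # about 0.054 substitutions per silent site
print(round(k_chimp, 4), round(k_orang, 4))
print(divergence_time_years(k_chimp) / 1e6)
```

    Under this assumption, the 6.4 million year human-chimpanzee estimate corresponds to roughly 2% divergence at silent sites.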

  1. Isolation, sequencing and overexpression of the gene encoding the theta subunit of DNA polymerase III holoenzyme.

    J.R. Carter; Franden, M A; Aebersold, R.; Kim, D.R.; McHenry, C S

    1993-01-01

    The gene encoding the theta subunit of DNA polymerase III holoenzyme, designated holE, was isolated using a strategy in which peptide sequence was used to derive a DNA hybridization probe. Sequencing of the gene, which maps to 41.43 centisomes of the chromosome, revealed a 76-codon open reading frame predicted to produce a protein of 8,846 Da. When placed in a tac promoter expression vector, the open reading frame directed expression of a protein that comigrated with authentic theta subunit ...

  2. Sequencing of 16S rRNA Gene: A Rapid Tool for Identification of Bacillus anthracis

    Sacchi, Claudio T.; Whitney, Anne M.; Mayer, Leonard W.; Morey, Roger; Steigerwalt, Arnold; Boras, Ariana; Weyant, Robin S.; Popovic, Tanja

    2002-01-01

    In a bioterrorism event, a tool is needed to rapidly differentiate Bacillus anthracis from other closely related spore-forming Bacillus species. During the recent outbreak of bioterrorism-associated anthrax, we sequenced the 16S rRNA gene from these species to evaluate the potential of 16S rRNA gene sequencing as a diagnostic tool. We found eight distinct 16S types among all 107 16S rRNA gene sequences that differed from each other at 1 to 8 positions (0.06% to 0.5%). All 86 B. anthracis had...

  3. Identification of cancer-driver genes in focal genomic alterations from whole genome sequencing data.

    Jang, Ho; Hur, Youngmi; Lee, Hyunju

    2016-01-01

    DNA copy number alterations (CNAs) are the main genomic events that occur during the initiation and development of cancer. Distinguishing driver aberrant regions from passenger regions, which might contain candidate target genes for cancer therapies, is an important issue. Several methods for identifying cancer-driver genes from multiple cancer patients have been developed for single nucleotide polymorphism (SNP) arrays. However, for NGS data, methods for the SNP array cannot be directly applied because of different characteristics of NGS such as higher resolutions of data without predefined probes and incorrectly mapped reads to reference genomes. In this study, we developed a wavelet-based method for identification of focal genomic alterations for sequencing data (WIFA-Seq). We applied WIFA-Seq to whole genome sequencing data from glioblastoma multiforme, ovarian serous cystadenocarcinoma and lung adenocarcinoma, and identified focal genomic alterations, which contain candidate cancer-related genes as well as previously known cancer-driver genes. PMID:27156852

  4. The Sequence Variations of Intron-3 of the α-Amylase Gene in Adzuki Bean

    JIN Wen-lin; Yamaguchi Hirofumi; Isigami Matiko; Yasuda Kentaro

    2003-01-01

    This study describes variation in intron-3 of the α-amylase gene from 156 breeds of adzuki beans using SSCP (single-strand conformation polymorphism) analysis. Based on the α-amylase gene structure and sequence, a pair of PCR primers, F (CCTACATTCTAACACACCCT) and R (GCATATTGTGCCAGTACAAT), was designed to amplify intron-3 fragments of the α-amylase gene. Fourteen variant types were detected, including 13, 9, 10 and 4 variant types in the wild, weed, locally cultivated and modern bred adzuki beans, respectively; 9, 8 and 7 variant types in the wild adzuki beans from Japan, China and Korea, respectively; and some other variant types in the local adzuki beans from China and Bhutan. In the experiment, 60% of the cultivated races were found to be of the EE type. In addition, sequence analysis of intron-3 of the α-amylase gene from 8 variant types reveals the evolutionary process of the various variant types in adzuki beans.

  6. Delineation of the species Haemophilus influenzae by phenotype, multilocus sequence phylogeny, and detection of marker genes

    Nørskov-Lauritsen, Niels; Overballe, MD; Kilian, Mogens

    2009-01-01

    To obtain more information on the much-debated definition of prokaryotic species, we investigated the borders of Haemophilus influenzae by comparative analysis of H. influenzae reference strains with closely related bacteria including strains assigned to Haemophilus haemolyticus, cryptic genospecies biotype IV, and the never formally validated species "Haemophilus intermedius". Multilocus sequence phylogeny based on six housekeeping genes separated a cluster encompassing the type and the reference strains of H. influenzae from 31 more distantly related strains. Comparison of 16S rRNA gene... encoded hemin biosynthesis genes were identified, and sequence analysis showed these genes to represent an ancestral genotype rather than recent transfers from, e.g., Haemophilus parainfluenzae. Strains previously assigned to H. haemolyticus formed several separate lineages within a distinct but deeply...

  7. Development of a Comprehensive Sequencing Assay for Inherited Cardiac Condition Genes.

    Pua, Chee Jian; Bhalshankar, Jaydutt; Miao, Kui; Walsh, Roddy; John, Shibu; Lim, Shi Qi; Chow, Kingsley; Buchan, Rachel; Soh, Bee Yong; Lio, Pei Min; Lim, Jaclyn; Schafer, Sebastian; Lim, Jing Quan; Tan, Patrick; Whiffin, Nicola; Barton, Paul J; Ware, James S; Cook, Stuart A

    2016-02-01

    Inherited cardiac conditions (ICCs) are characterised by marked genetic and allelic heterogeneity and require extensive sequencing for genetic characterisation. We iteratively optimised a targeted gene capture panel for ICCs that includes disease-causing, putatively pathogenic, research and phenocopy genes (n = 174 genes). We achieved high coverage of the target region on both MiSeq (>99.8 % at ≥20× read depth, n = 12) and NextSeq (>99.9 % at ≥20×, n = 48) platforms with 100 % sensitivity and precision for single nucleotide variants and indels across the protein-coding target on the MiSeq. In the final assay, 40 out of 43 established ICC genes informative in clinical practice achieved complete coverage (100 % at ≥20×). By comparison, whole exome sequencing (WES; ∼80×), deep WES (∼500×) and whole genome sequencing (WGS; ∼70×) had poorer performance (88.1, 99.2 and 99.3 % respectively at ≥20×) across the ICC target. The assay described here delivers highly accurate and affordable sequencing of ICC genes, complemented by accessible cloud-based computation and informatics. See Editorial in this issue (DOI: 10.1007/s12265-015-9667-8 ). PMID:26888179
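
    Coverage figures such as "100 % at ≥20×" express the share of target positions whose read depth reaches a threshold. The following is a minimal sketch of that calculation, assuming a simple list of per-position depths; the function name and the toy depth values are hypothetical and are not data from the study.

```python
def breadth_at_depth(depths, min_depth=20):
    """Percentage of target positions covered by at least min_depth reads."""
    if not depths:
        raise ValueError("no target positions supplied")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


# Toy per-position depths for a 10-base target (hypothetical values).
depths = [35, 22, 20, 19, 48, 0, 27, 31, 20, 64]
pct = breadth_at_depth(depths)
print(pct)
```

    In practice the per-position depths would come from a depth report generated from the aligned reads and restricted to the capture target.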

  8. Metazoan Remaining Genes for Essential Amino Acid Biosynthesis: Sequence Conservation and Evolutionary Analyses

    Igor R. Costa

    2014-12-01

    Essential amino acids (EAAs) consist of a group of nine amino acids that animals are unable to synthesize via de novo pathways. Recently, it has been found that most metazoans lack the same set of enzymes responsible for de novo EAA biosynthesis. Here we investigate the sequence conservation and evolution of all the remaining metazoan genes for EAA pathways. Initially, the set of all 49 enzymes responsible for de novo EAA biosynthesis in yeast was retrieved. These enzymes were used as BLAST queries to search for similar sequences in a database containing 10 complete metazoan genomes. Eight enzymes typically attributed to EAA pathways were found to be ubiquitous in metazoan genomes, suggesting a conserved functional role. In this study, we address the question of how these genes evolved after losing their pathway partners. To do this, we compared metazoan genes with their fungal and plant orthologs. Using phylogenetic analysis with maximum likelihood, we found that acetolactate synthase (ALS) and betaine-homocysteine S-methyltransferase (BHMT) diverged from the expected Tree of Life (ToL) relationships. High sequence conservation in the paraphyletic group Plant-Fungi was identified for these two genes using a newly developed Python algorithm. Selective pressure analysis of ALS and BHMT protein sequences showed higher non-synonymous mutation ratios in comparisons between metazoans/fungi and metazoans/plants, supporting the hypothesis that these two genes have undergone non-ToL evolution in animals.

  9. POLYMORPHISM IN THE CODING REGION SEQUENCE OF GDF8 GENE IN INDIAN SHEEP.

    Pothuraju, M; Mishra, S K; Kumar, S N; Mohamed, N F; Kataria, R S; Yadav, D K; Arora, R

    2015-11-01

    The present study was undertaken to identify polymorphism in the coding sequence of the GDF8 gene across indigenous meat-type sheep breeds. A 1647 bp sequence was generated, encompassing 208 bp of the 5'UTR, 1128 bp of coding region (exons 1, 2 and 3) as well as 311 bp of the 3'UTR. The sheep and goat GDF8 gene sequences were observed to be highly conserved as compared to cattle, buffalo, horse and pig. Several nucleotide variations were observed across the coding sequence of the GDF8 gene in Indian sheep. Three polymorphic sites were identified in the 5'UTR, one in exon 1 and one in exon 2. Both SNPs in the exonic region were found to be non-synonymous. The mutations c.539T > G and c.821T > A discovered in this study in exon 1 and exon 2, respectively, have not been previously reported. The information generated provides a preliminary indication of the functional diversity present in Indian sheep at the coding region of the GDF8 gene. The novel as well as the previously reported SNPs discovered in Indian sheep warrant further analysis to see whether they affect the phenotype. Future studies will need to establish the effect of the reported SNPs on the expression of the GDF8 gene in the Indian sheep population. PMID:26845859

  10. Cloning and sequencing of cagA gene fragment of Helicobacter pylori with coccoid form

    Ke-Xia Wang; Xue-Feng Wang

    2004-01-01

    AIM: To clone and sequence the cagA gene fragment of Helicobacter pylori (H pylori) with coccoid form. METHODS: H pylori strain NCTC11637 was transformed to coccoid form by exposure to antibiotics in subinhibitory concentrations. The coccoid H pylori was collected. The cagA gene of the coccoid H pylori strain was amplified by PCR. After purification, the target fragment was cloned into plasmid pMD-18T. The recombinant plasmid pMD-18T-cagA was transformed into E. coli JM109. Positive clones were screened and identified by PCR and digestion with restriction endonucleases. The sequence of the inserted fragment was then analysed. RESULTS: A cagA gene of 3 444 bp was obtained from the coccoid H pylori genomic DNA. The recombinant plasmid pMD-18T-cagA was constructed, then digested with BamHI + SacI, and the digestion product was identical to the predicted one. Sequence analysis showed that the homology between the coccoid H pylori sequence and the reported original sequence was 99.7%. CONCLUSION: The recombinant plasmid containing the cagA gene from coccoid H pylori has been constructed successfully. Coccoid H pylori contains a complete cagA gene, which may be related to its pathogenicity.

  11. cDNA sequence of human transforming gene hst and identification of the coding sequence required for transforming activity

    The hst gene was originally identified as a transforming gene in DNAs from human stomach cancers and from a noncancerous portion of stomach mucosa by DNA-mediated transfection assay using NIH3T3 cells. cDNA clones of hst were isolated from the cDNA library constructed from poly(A)+ RNA of a secondary transformant induced by the DNA from a stomach cancer. The sequence analysis of the hst cDNA revealed the presence of two open reading frames. When this cDNA was inserted into an expression vector containing the simian virus 40 promoter, it efficiently induced the transformation of NIH3T3 cells upon transfection. It was found that one of the reading frames, which coded for 206 amino acids, was responsible for the transforming activity

  12. Biologic: Gene circuits and feedback in an introductory physics sequence for biology and premedical students

    Cahn, S B

    2013-01-01

    Two synthetic gene circuits -- the genetic toggle switch and the repressilator -- are analyzed quantitatively and discussed in the context of an educational module on gene circuits and feedback that constitutes the final topic of a year-long introductory physics sequence, aimed at biology and premedical undergraduate students. The genetic toggle switch consists of two genes, each of whose protein product represses the other's expression, while the repressilator consists of three genes, each of whose protein product represses the next gene's expression. Analytic, numerical, and electronic treatments of the genetic toggle switch show that this gene circuit realizes bistability. A simplified treatment of the repressilator reveals that this circuit can realize sustained oscillations. In both cases, a "phase diagram" is obtained that specifies the region of parameter space in which bistability or oscillatory behavior, respectively, occurs.
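
    The bistability described above can be illustrated numerically. The abstract does not give the model equations, so the sketch below assumes a commonly used dimensionless form of the toggle switch, du/dt = alpha/(1 + v^n) - u and dv/dt = alpha/(1 + u^n) - v, integrated with a simple forward Euler step. The function name `toggle_switch` and the parameter values (alpha = 10, n = 2) are illustrative assumptions, not values taken from the paper.

```python
def toggle_switch(u0, v0, alpha=10.0, n=2.0, dt=0.01, steps=5000):
    """Integrate an assumed two-gene mutual-repression model with forward Euler.

    u and v are the (dimensionless) concentrations of the two repressors.
    Each protein is produced at a rate that falls as the other rises
    (Hill-type repression) and decays at unit rate.
    """
    u, v = float(u0), float(v0)
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u
        dv = alpha / (1.0 + u ** n) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


# Two different starting points settle into two different stable states,
# which is the signature of bistability.
print(toggle_switch(3.0, 1.0))  # ends with u high, v low
print(toggle_switch(1.0, 3.0))  # ends with u low, v high
```

    Scanning alpha and n with the same function and recording whether the two runs end in different states would give a rough version of the "phase diagram" mentioned in the abstract.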

  13. Complete sequence and gene organization of the mitochondrial genome of Asio flammeus (Strigiformes, strigidae).

    Zhang, Yanan; Song, Tao; Pan, Tao; Sun, Xiaonan; Sun, Zhonglou; Qian, Lifu; Zhang, Baowei

    2016-07-01

    The complete sequence of the mitochondrial genome was determined for Asio flammeus, a species with a wide geographic distribution. The complete mitochondrial genome was 18,966 bp in length and contained 2 rRNA genes, 22 tRNA genes, 13 protein-coding genes (PCGs), and 1 non-coding region (D-loop). All the genes were located on the H-strand, except for the ND6 subunit gene and eight tRNA genes, which were encoded on the L-strand. The D-loop of A. flammeus contained many tandem repeats of varying lengths and repeat numbers. The molecular phylogeny placed A. flammeus as the sister group to A. capensis and supported Asio as a monophyletic group. PMID:25980662

  14. Hindered proton collectivity in 28S: Possible magic number at Z=16

    Togano, Y; Iwasa, N; Yamada, K; Motobayashi, T; Aoi, N; Baba, H; Bishop, S; Cai, X; Doornenbal, P; Fang, D; Furukawa, T; Ieki, K; Kawabata, T; Kanno, S; Kobayashi, N; Kondo, Y; Kuboki, T; Kume, N; Kurita, K; Kurokawa, M; Ma, Y G; Matsuo, Y; Murakami, H; Matsushita, M; Nakamura, T; Okada, K; Ota, S; Satou, Y; Shimoura, S; Shioda, R; Tanaka, K N; Takeuchi, S; Tian, W; Wang, H; Wang, J; Yoneda, K

    2012-01-01

    The reduced transition probability B(E2; 0+ -> 2+) for 28S was obtained experimentally using Coulomb excitation at 53 MeV/nucleon. The resultant B(E2) value of 181(31) e^2 fm^4 is smaller than the expectation based on empirical B(E2) systematics. The double ratio |M_n/M_p|/(N/Z) of the 0+ -> 2+ transition in 28S was determined to be 1.9(2) by evaluating the M_n value from the known B(E2) value of the mirror nucleus 28Mg, showing the hindrance of proton collectivity relative to that of neutrons. These results indicate the emergence of the magic number Z=16 in the |T_z|=2 nucleus 28S.
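
    The arithmetic behind the double ratio can be sketched as follows. This assumes the usual convention B(E2; 0+ -> 2+) = e^2 |M_p|^2 for the proton matrix element, which the abstract does not state explicitly. Only the quoted central values are used and uncertainties are ignored. The 28Mg B(E2) value is not given in the abstract, so the sketch works backwards from the quoted double ratio to the M_n value it implies.

```python
import math

# Quoted central values from the abstract (uncertainties ignored here).
b_e2_28s = 181.0        # B(E2; 0+ -> 2+) of 28S in e^2 fm^4
double_ratio = 1.9      # |M_n/M_p| / (N/Z)

# 28S has Z = 16 protons and N = 28 - 16 = 12 neutrons.
z, n = 16, 12

# Assumed convention: B(E2; 0+ -> 2+) = e^2 * M_p^2, so M_p = sqrt(B(E2)) in fm^2.
m_p = math.sqrt(b_e2_28s)

# Undo the N/Z normalisation to get the bare ratio |M_n/M_p|.
mn_over_mp = double_ratio * n / z

# Neutron matrix element implied by the quoted double ratio.
m_n = mn_over_mp * m_p

print(f"M_p = {m_p:.2f} fm^2, |M_n/M_p| = {mn_over_mp:.3f}, implied M_n = {m_n:.2f} fm^2")
```

    A double ratio of 1 would mean protons and neutrons contribute in proportion to their numbers. A value near 1.9 is what the abstract interprets as hindered proton collectivity.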

  15. Cloning and Sequence Analysis on 3' Coding Region of Wild Boar and Cross Bred Pig Myostatin Gene

    LIU Di; YANG Xiu-qin; YANG Jia-fang

    2004-01-01

    Myostatin, whose gene is highly conserved among breeds, is a negative regulator of muscle growth. The 3' coding regions of wild boar and crossbred pig myostatin were cloned by RT-PCR and sequenced. The nucleotide sequences of wild boar and crossbred pig were 100% identical, and there was no difference in this region compared with the pig myostatin gene in GenBank. This indicates that the gene sequence in this region did not change during evolution.

  16. Molecular cloning and long terminal repeat sequences of human endogenous retrovirus genes related to types A and B retrovirus genes

    By using a DNA fragment primarily encoding the reverse transcriptase (pol) region of the Syrian hamster intracisternal A particle (IAP; type A retrovirus) gene as a probe, human endogenous retrovirus genes, tentatively termed HERV-K genes, were cloned from a fetal human liver gene library. Typical HERV-K genes were 9.1 or 9.4 kilobases in length, having long terminal repeats (LTRs) of ca. 970 base pairs. Many structural features commonly observed on the retrovirus LTRs, such as the TATAA box, polyadenylation signal, and terminal inverted repeats, were present on each LTR, and a lysine (K) tRNA having a CUU anticodon was identified as a presumed primer tRNA. The HERV-K LTR, however, had little sequence homology to either the IAP LTR or other typical oncovirus LTRs. By filter hybridization, the number of HERV-K genes was estimated to be ca. 50 copies per haploid human genome. The cloned mouse mammary tumor virus (type B) gene was found to hybridize with both the HERV-K and IAP genes to essentially the same extent

  17. Local synteny and codon usage contribute to asymmetric sequence divergence of Saccharomyces cerevisiae gene duplicates

    Bergthorsson Ulfar

    2011-09-01

    Background: Duplicated genes frequently experience asymmetric rates of sequence evolution. Relaxed selective constraints and positive selection have both been invoked to explain the observation that one paralog within a gene-duplicate pair exhibits an accelerated rate of sequence evolution. In the majority of studies where asymmetric divergence has been established, there is no indication as to which gene copy, ancestral or derived, is evolving more rapidly. In this study we investigated the effect of local synteny (gene-neighborhood conservation) and codon usage on the sequence evolution of gene duplicates in the S. cerevisiae genome. We further distinguish the gene duplicates into those that originated from a whole-genome duplication (WGD) event (ohnologs) versus small-scale duplications (SSD) to determine if there exist any differences in their patterns of sequence evolution. Results: For SSD pairs, the derived copy evolves faster than the ancestral copy. However, there is no relationship between rate asymmetry and synteny conservation (ancestral-like versus derived-like) in ohnologs. mRNA abundance and optimal codon usage as measured by the CAI are lower in the derived SSD copies relative to ancestral paralogs. Moreover, in the case of ohnologs, the faster-evolving copy has lower CAI and lowered expression. Conclusions: Together, these results suggest that relaxation of selection for codon usage and gene expression contributes to rate asymmetry in the evolution of duplicated genes and that in SSD pairs, the relaxation of selection stems from the loss of ancestral regulatory information in the derived copy.
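
    For readers unfamiliar with the CAI (codon adaptation index) used above, it is conventionally computed as the geometric mean of per-codon relative adaptiveness weights, where each weight is the codon's frequency in a reference set of highly expressed genes divided by that of the most frequent synonymous codon. The sketch below is a minimal illustration: the function name and the tiny weight table are hypothetical, and the weights are not the S. cerevisiae values used in the study.

```python
import math


def codon_adaptation_index(cds, weights):
    """Geometric mean of relative adaptiveness weights over the codons of cds.

    Codons missing from `weights` (for example ATG or TGG, which have no
    synonymous alternatives, or stop codons) are skipped.
    """
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Hypothetical weights for two amino acids (Ala: GCT/GCC, Lys: AAG/AAA).
toy_weights = {"GCT": 1.0, "GCC": 0.5, "AAG": 1.0, "AAA": 0.25}

print(codon_adaptation_index("GCTAAG", toy_weights))  # only preferred codons
print(codon_adaptation_index("GCCAAA", toy_weights))  # only non-preferred codons
```

    A gene copy that drifts toward non-preferred codons, as reported for the derived SSD copies, would score lower under this measure.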

  18. Sequences of the coat protein gene from Brazilian isolates of Papaya ringspot virus

    LIMA ROBERTO C. A.; SOUZA JR. MANOEL T.; PIO-RIBEIRO GILVAN; LIMA J. ALBERSIO A.

    2002-01-01

    Papaya ringspot virus (PRSV) is the causal agent of the main papaya (Carica papaya) disease in the world. Brazil is currently the world's main papaya grower, responsible for about 40% of the worldwide production. Resistance to PRSV on transgenic plants expressing the PRSV coat protein (cp) gene was shown to be dependent on the sequence homology between the cp transgene expressed in the plant genome and the cp gene from the incoming virus, in an isolate-specific fashion. Therefore, knowledge o...

  19. SeqGene: a comprehensive software solution for mining exome- and transcriptome- sequencing data

    Deng Xutao

    2011-01-01

    Background: The popularity of massively parallel exome and transcriptome sequencing projects demands new data mining tools with a comprehensive set of features to support a wide range of analysis tasks. Results: SeqGene, a new data mining tool, supports mutation detection and annotation, dbSNP and 1000 Genome data integration, RNA-Seq expression quantification, mutation and coverage visualization, allele specific expression (ASE), differentially expressed genes (DEGs) identification, c...

  20. Control of gene conversion and somatic hypermutation by immunoglobulin promoter and enhancer sequences

    Yang, Shu Yuan; Fugmann, Sebastian D.; Schatz, David G.

    2006-01-01

    It is thought that gene conversion (GCV) and somatic hypermutation (SHM) of immunoglobulin (Ig) genes occur in two steps: the generation of uracils in DNA by activation-induced cytidine deaminase, followed by their subsequent repair by various DNA repair pathways to generate sequence-diversified products. It is not known how either of the two steps is targeted specifically to Ig loci. Because of the tight link between transcription and SHM, we have investigated the role of endogenous Ig light...

  1. Defining the minimal length of sequence homology required for selective gene isolation by TAR cloning

    Noskov, V. N.; Koriabine, M.; Solomon, G.; Randolph, M; Barrett, J C; Leem, S.-H.; Stubbs, L; Kouprina, N; Larionov, V.

    2001-01-01

    The transformation-associated recombination (TAR) cloning technique allows selective and accurate isolation of chromosomal regions and genes from complex genomes. The technique is based on in vivo recombination between genomic DNA and a linearized vector containing homologous sequences, or hooks, to the gene of interest. The recombination occurs during transformation of yeast spheroplasts that results in the generation of a yeast artificial chromosome (YAC) contain...

  2. Cloning, Sequencing, and Disruption of the Bacillus subtilis psd Gene Coding for Phosphatidylserine Decarboxylase

    Matsumoto, Kouji; Okada, Masahiro; Horikoshi, Yuko; Matsuzaki, Hiroshi; Kishi, Tsutomu; Itaya, Mitsuhiro; Shibuya, Isao

    1998-01-01

    The psd gene of Bacillus subtilis Marburg, encoding phosphatidylserine decarboxylase, has been cloned and sequenced. It encodes a polypeptide of 263 amino acid residues (deduced molecular weight of 29,689) and is located just downstream of pss, the structural gene for phosphatidylserine synthase that catalyzes the preceding reaction in phosphatidylethanolamine synthesis (M. Okada, H. Matsuzaki, I. Shibuya, and K. Matsumoto, J. Bacteriol. 176:7456–7461, 1994). Introduction of a plasmid contain...

  3. The Genome Sequence of Leishmania (Leishmania) amazonensis: Functional Annotation and Extended Analysis of Gene Models

    Real, Fernando; Vidal, Ramon Oliveira; Carazzolle, Marcelo Falsarella; Mondego, Jorge Maurício Costa; Costa, Gustavo Gilson Lacerda; Herai, Roberto Hirochi; Würtele, Martin; de Carvalho, Lucas Miguel; e Ferreira, Renata Carmona; Mortara, Renato Arruda; Barbiéri, Clara Lucia; Mieczkowski, Piotr; da Silveira, José Franco; Briones, Marcelo Ribeiro da Silva; Pereira, Gonçalo Amarante Guimarães

    2013-01-01

    We present the sequencing and annotation of the Leishmania (Leishmania) amazonensis genome, an etiological agent of human cutaneous leishmaniasis in the Amazon region of Brazil. L. (L.) amazonensis shares features with Leishmania (L.) mexicana but also exhibits unique characteristics regarding geographical distribution and clinical manifestations of cutaneous lesions (e.g. borderline disseminated cutaneous leishmaniasis). Predicted genes were scored for orthologous gene families and conserved...

  4. Cloning, sequencing, and transcriptional analysis of the gene coding for the vegetative sigma factor of Agrobacterium tumefaciens.

    Segal, G.; Ron, E. Z.

    1993-01-01

    The sigA gene of Agrobacterium tumefaciens was cloned and sequenced. Comparison with previously analyzed sigA genes revealed a high degree of similarity in nucleotide and amino acid sequences of regions two, three, and four of vegetative sigma factors. However, the upstream regulatory region shows no sequence homology with the Escherichia coli heat shock (sigma 32) promoters. It also does not contain the hairpin-loop structure (inverted repeat sequence) that was found in the upstream region o...

  5. Cloning, sequence analysis, and expression of the genes encoding lytic functions of Bacteriophage φg1e

    OKI, Masaya; Kakikawa, Makiko; Yamada, Kazuyo; Taketo, Akira; KODAIRA, Ken-Ichi

    1996-01-01

    The lysis genes of the Lactobacillus phage φg1e were cloned, sequenced, and expressed in Escherichia coli. Nucleotide sequencing of a 3813-bp φg1e DNA fragment revealed five successive open reading frames (ORFs), Rorf50, Rorf118, hol, lys, and Rorf175, on the same DNA strand. By comparative analysis of the DNA sequence, the putative hol product (holin) has an estimated molecular weight of 14.2 kDa and contains two potential transmembrane helices and highly charged N- and C-termini, resembling predict...

  6. Nucleotide sequence and taxonomical distribution of the bacteriocin gene lin cloned from Brevibacterium linens M18.

    Valdes-Stauber, N; Scherer, S

    1996-01-01

    Linocin M18 is an antilisterial bacteriocin produced by the red smear cheese bacterium Brevibacterium linens M18. Oligonucleotide probes based on the N-terminal amino acid sequence were used to locate its single copy gene, lin, on the chromosomal DNA. The amino acid composition, N-terminal sequence, and molecular mass derived from the nucleotide sequence of an open reading frame of 798 nucleotides coding for 266 amino acids found on a 3-kb BamHI restriction fragment correspond closely to thos...

  7. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization.

    Anahtar, Melis N; Bowman, Brittany A; Kwon, Douglas S

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost-effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple sample types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  8. Nucleotide sequence of the gene encoding the nitrogenase iron protein of Thiobacillus ferrooxidans

    The DNA sequence was determined for the cloned Thiobacillus ferrooxidans nifH and part of the nifD genes. The DNA chains were radiolabeled with [α-32P]dCTP (3000 Ci/mmol) or [α-35S]dCTP (400 Ci/mmol). A putative T. ferrooxidans nifH promoter was identified whose sequences showed perfect consensus with those of the Klebsiella pneumoniae nif promoter. Two putative consensus upstream activator sequences were also identified. The amino acid sequence was deduced from the DNA sequence. In a comparison of nifH DNA sequences from T. ferrooxidans and eight other nitrogen-fixing microbes, a Rhizobium sp. isolated from Parasponia andersonii showed the greatest homology (74%) and Clostridium pasteurianum (nifH1) showed the least homology (54%). In the comparison of the amino acid sequences of the Fe proteins, the Rhizobium sp. and Rhizobium japonicum showed the greatest homology (both 86%) and C. pasteurianum (nifH1 gene product) demonstrated the least homology (56%) to the T. ferrooxidans Fe protein

  9. Isolation and characterisation of the Xenopus laevis albumin genes: loss of 74K albumin gene sequences by library amplification.

    May, F E; Weber, R.; Westley, B. R.

    1982-01-01

    The blood of the frog X. laevis contains 2 albumins of 68,000 and 74,000 daltons which are encoded in the liver by two related mRNAs. When an amplified X. laevis DNA library was screened with cloned albumin cDNA only 68,000 dalton albumin gene sequences were isolated. Hybridisation of the albumin cDNA to Southern blots of EcoRI-digested X. laevis DNA showed that the sequences present in the recombinants did not account for all the fragments which hybridised on the Southern blots. This indicated...

  10. GeneLook: a novel ab initio gene identification system suitable for automated annotation of prokaryotic sequences.

    Nishi, Tatsunari; Ikemura, Toshimichi; Kanaya, Shigehiko

    2005-02-14

    With the rapid increases in the amounts of sequence data for prokaryotic genomes, it has become important to develop systems for automated and accurate genome annotation. We present herein a novel ab initio gene identification system, GeneLook, that predicts protein-coding open reading frames (ORFs) with high sensitivity and specificity with no prior knowledge of the sequence composition. The system predicts protein-coding ORFs in two stages, seed ORF selection and main prediction. In the selection of reliable seed ORFs containing at least 200 codons, GeneLook predicts translation start sites and operon structures through searches for ribosome-binding sites and a novel operon prediction algorithm. The codon and nucleotide frequencies of seed ORFs are then used to determine values for two new coding-potential parameters for identification of protein-coding ORFs of at least 34 codons and for another parameter that improves the prediction accuracy for GC-rich genomes. In the main prediction, GeneLook uses these parameters to identify the most likely genes of a given minimal length. We assessed the performance of GeneLook with two indices, sensitivity and specificity, defined as true positives (TP)/(TP+false negatives) and TP/(TP+false positives), respectively. This system predicted protein-coding ORFs for Escherichia coli and Bacillus subtilis with sensitivities of 96.5% and 96.2%, respectively, and specificities of 96.9% and 96.1%, respectively. The system also identified 94.1% of annotated genes of the Pseudomonas aeruginosa genome, which is GC-rich, with high specificity (97.2%). Furthermore, GeneLook identified protein-coding ORFs with high accuracy from a wide variety of prokaryotic genomes. PMID:15716020
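
    The two performance indices defined in the abstract translate directly into code. The sketch below uses the abstract's own definitions. Note that the quantity called "specificity" here, TP/(TP+FP), is what is more commonly called precision or positive predictive value. The function names and the toy counts are illustrative and are not taken from the paper.

```python
def sensitivity(tp, fn):
    """TP / (TP + FN): share of annotated genes that were predicted."""
    return tp / (tp + fn)


def specificity_as_defined(tp, fp):
    """TP / (TP + FP): share of predictions that match an annotated gene.

    This follows the definition in the abstract; elsewhere this ratio is
    usually called precision.
    """
    return tp / (tp + fp)


# Toy counts (illustrative only): 90 genes found, 10 missed, 5 false calls.
tp, fn, fp = 90, 10, 5
print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"specificity = {specificity_as_defined(tp, fp):.3f}")
```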

  11. Discovery of clubroot-resistant genes in Brassica napus by transcriptome sequencing.

    Chen, S W; Liu, T; Gao, Y; Zhang, C; Peng, S D; Bai, M B; Li, S J; Xu, L; Zhou, X Y; Lin, L B

    2016-01-01

    Clubroot significantly affects plants of the Brassicaceae family and is one of the main diseases causing serious losses in B. napus yield. Few studies have investigated the clubroot-resistance mechanism in B. napus. Identification of clubroot-resistant genes may be used in clubroot-resistant breeding, as well as to elucidate the molecular mechanism behind B. napus clubroot-resistance. We used three B. napus transcriptome samples to construct a transcriptome sequencing library by using Illumina HiSeq™ 2000 sequencing and bioinformatic analysis. In total, 171 million high-quality reads were obtained, containing 96,149 unigenes of N50-value. We aligned the obtained unigenes with the Nr, Swiss-Prot, clusters of orthologous groups, and gene ontology databases and annotated their functions. In the Kyoto encyclopedia of genes and genomes database, 25,033 unigenes (26.04%) were assigned to 124 pathways. Many genes, including broad-spectrum disease-resistance genes, specific clubroot-resistant genes, and genes related to indole-3-acetic acid (IAA) signal transduction, cytokinin synthesis, and myrosinase synthesis in the Huashuang 3 variety of B. napus were found to be related to clubroot-resistance. The effective clubroot-resistance observed in this variety may be due to the induced increase in expression of these disease-resistant genes and strong inhibition of IAA signal transduction, cytokinin synthesis, and myrosinase synthesis. Unigenes 0048482 and 0061770 shared 94% nucleotide similarity with the Crr1 gene. Furthermore, unigene 0061770 could have originated from an inversion of the Crr1 5'-end sequence. PMID:27525940

  12. Discovering the Secrets of the Candida albicans Agglutinin-Like Sequence (ALS) Gene Family—a Sticky Pursuit

    HOYER, LOIS L.; GREEN, CLAYTON B.; Oh, Soon-Hwan; Zhao, Xiaomin

    2008-01-01

    The Agglutinin-Like Sequence (ALS) family of Candida albicans includes eight genes that encode large cell-surface glycoproteins. The high degree of sequence relatedness between the ALS genes and the tremendous allelic variability often present in the same C. albicans strain complicated definition and characterization of the gene family. The main hypothesis driving ALS family research is that the genes encode adhesins, primarily involved in host-pathogen interactions. Although adhesive functio...

  13. De Novo Transcriptome Sequencing of Oryza officinalis Wall ex Watt to Identify Disease-Resistance Genes

    Bin He

    2015-12-01

    Oryza officinalis Wall ex Watt is one of the most important wild relatives of cultivated rice and exhibits high resistance to many diseases. It has been used as a source of genes for introgression into cultivated rice. However, there are limited genomic resources and little genetic information publicly reported for this species. To better understand the pathways and factors involved in disease resistance and to accelerate the process of rice breeding, we carried out de novo transcriptome sequencing of O. officinalis. In this research, 137,229 contigs ranging from 200 to 19,214 bp, with an N50 of 2331 bp, were obtained through de novo assembly of leaves, stems and roots of O. officinalis using an Illumina HiSeq 2000 platform. Based on sequence similarity searches against a non-redundant protein database, a total of 88,249 contigs were annotated with gene descriptions and 75,589 transcripts were further assigned to GO terms. Candidate genes for plant–pathogen interaction and plant hormone regulation pathways involved in disease resistance were identified. Further analyses of gene expression profiles showed that the majority of genes related to disease resistance were expressed in all three tissues. In addition, there are two kinds of rice bacterial blight-resistance genes in O. officinalis, including two Xa1 genes and three Xa26 genes. Both Xa1 genes showed the highest expression level in stem, whereas one of the Xa26 genes was expressed dominantly in leaf and the other two Xa26 genes displayed low expression levels in all three tissues. This transcriptomic database provides an opportunity for identifying the genes involved in disease resistance and will provide a basis for studying functional genomics of O. officinalis and genetic improvement of cultivated rice in the future.
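
    The N50 statistic quoted for the assembly is the contig length at which, going from longest to shortest, the running total first reaches half of the total assembled bases. A minimal sketch of that calculation follows; the function name `n50` and the toy contig lengths are illustrative and unrelated to the O. officinalis data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the cumulative sum first reaches at least half of
    the total assembly size.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly: total 290 bp, half is 145 bp; 80 + 70 = 150 reaches it.
print(n50([80, 70, 50, 40, 30, 20]))
```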

  14. Molecular cloning of gyrA and gyrB genes of Mycobacterium tuberculosis: analysis of nucleotide sequence

    Madhusudan, K.; Ramesh, V.; Nagaraja, V

    1994-01-01

    We have recently reported the cloning of gyrA and gyrB genes from Mycobacterium tuberculosis H37Ra [Curr. Science (1994) 66, 664-667]. Here, we present the complete nucleotide sequence of the gyrB gene from M. tuberculosis H37Ra along with the flanking regions. The gyrA gene has been located 34 nucleotides downstream of gyrB and has been partially sequenced; both genes seem to be transcribed from the promoter elements located upstream of the gyrB coding sequence. The gyrB gene encodes a polypepti...

  15. Dose-sensitivity, conserved noncoding sequences and duplicate gene retention through multiple tetraploidies in the grasses.

    James C Schnable

    2011-03-01

    Whole genome duplications, or tetraploidies, are an important source of increased gene content. Following whole genome duplication, duplicate copies of many genes are lost from the genome. This loss of genes is biased both in the classes of genes deleted and the subgenome from which they are lost. Many or all classes of genes preferentially retained as duplicate copies are engaged in dose-sensitive protein-protein interactions, such that deletion of any one duplicate upsets the status quo of subunit concentrations, and presumably lowers fitness as a result. Transcription factors are also preferentially retained following every whole genome duplication studied. This has been explained as a consequence of protein-protein interactions, just as for other highly retained classes of genes. We show that the quantity of conserved non-coding sequences (CNSs) associated with genes predicts the likelihood of their retention as duplicate pairs following whole genome duplication. As many CNSs likely represent binding sites for transcriptional regulators, we propose that the likelihood of gene retention following tetraploidy may also be influenced by dose-sensitive protein-DNA interactions between the regulatory regions of CNS-rich genes (nicknamed "bigfoot genes") and the proteins that bind to them. Using grass genomes, we show that differential loss of CNSs from one member of a pair following the pre-grass tetraploidy reduces its chance of retention in the subsequent maize-lineage tetraploidy.

  16. Gene duplication and paleopolyploidy in soybean and the implications for whole genome sequencing

    Nelson Rex T

    2007-09-01

    Background: Soybean, Glycine max (L.) Merr., is a well documented paleopolyploid. What remains relatively under-characterized is the level of sequence identity in retained homeologous regions of the genome. Recently, the Department of Energy Joint Genome Institute and United States Department of Agriculture jointly announced the sequencing of the soybean genome. One of the initial concerns is what effect sequence identity in homeologous regions would have on whole genome shotgun sequence assembly. Results: Seventeen BACs representing ~2.03 Mb were sequenced as representative potential homeologous regions from the soybean genome. Genetic mapping of each BAC shows that 11 of the 20 chromosomes are represented. Sequence comparisons between homeologous BACs show that the soybean genome is a mosaic of retained paleopolyploid regions. Some regions appear to be highly conserved while other regions have diverged significantly. Large-scale "batch" reassembly of all 17 BACs combined showed that even the most homeologous BACs with upwards of 95% sequence identity resolve into their respective homeologous sequences. Potential assembly errors were generated by tandemly duplicated pentatricopeptide repeat containing genes and long simple sequence repeats. Analysis of a whole-genome shotgun assembly of 80,000 randomly chosen JGI-DOE sequence traces reveals some new soybean-specific repeat sequences. Conclusion: This analysis investigated both the structure of the paleopolyploid soybean genome and the potential effects retained homeology will have on assembling the whole genome shotgun sequence. Based upon these results, homeologous regions similar to those characterized here will not cause major assembly issues.

  17. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or...... established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases....

  18. Nucleotide sequence and characterization of a carbenicillin-hydrolyzing penicillinase gene from Proteus mirabilis.

    Sakurai, Y.; Tsukamoto, K; Sawai, T

    1991-01-01

    The structural gene of a carbenicillinase was cloned from the chromosomal DNA of Proteus mirabilis GN79. This gene codes for a protein of 270 amino acids. Alignment of the amino acid sequence with those of known beta-lactamases revealed that the enzyme is a novel class A beta-lactamase with a unique conserved triad, RTG. By using a DNA fragment of the structural gene, a lack of cross hybridization was confirmed between the DNA probe and total DNAs from natural isolates of P. mirabilis, sugges...

  19. Origin of a novel protein-coding gene family with similar signal sequence in Schistosoma japonicum

    Mbanefo Evaristus

    2012-06-01

    Full Text Available Abstract Background Evolution of novel protein-coding genes is the bedrock of adaptive evolution. Recently, we identified six protein-coding genes with similar signal sequence from Schistosoma japonicum egg stage mRNA using signal sequence trap (SST). To find the mechanism underlying the origination of these genes with similar core promoter regions and signal sequence, we adopted an integrated approach utilizing whole genome, transcriptome and proteome database BLAST queries, other bioinformatics tools, and molecular analyses. Results Our data, in combination with database analyses, showed evidence of expression of these genes both at the mRNA and protein levels exclusively in all developmental stages of S. japonicum. The signal sequence motif was identified in 27 distinct S. japonicum UniGene entries with multiple mRNA transcripts, and in 34 genome contigs distributed within 18 scaffolds with evidence of genome-wide dispersion. No homolog of these genes or similar domain was found in deposited data from any other organism. We observed a preponderance of flanking repetitive elements (REs), albeit partial copies, especially of the RTE-like and Perere class at either side of the duplication source locus. The role of REs as major mediators of DNA-level recombination leading to dispersive duplication is discussed with evidence from our analyses. We also identified a stepwise pathway towards functional selection in evolving genes by alternative splicing. Equally, the possible transcription models of some protein-coding representatives of the duplicons are presented with evidence of expression in vitro. Conclusion Our findings contribute to the accumulating evidence of the role of REs in the generation of evolutionary novelties in organisms’ genomes.

  20. Sub-genomic level sequence analysis of the aquaporin multi-gene family in cotton

    Aquaporins function mainly as water transport channel proteins that facilitate water movement across intracellular and intercellular membranes in most living organisms. Plant aquaporins belong to a multi-gene family and are commonly categorized into 5 subfamilies according to sequence similarity. Re...

  1. Isolation and Analysis of α-Gliadin Gene Coding Sequences from Triticum durum

    WANG Han-yan; WEI Yu-ming; ZE Hong-yan; ZHENG You-liang

    2007-01-01

    Three coding sequences of gliadin genes, designated Gli2_Du1, Gli2_Du2 and Gli2_Du3, were isolated from the genomic DNA of Triticum durum accession CItr5083. Gli2_Du1 and Gli2_Du2 contain 945 and 864 bp, encoding mature proteins with 314 and 287 amino acid residues, respectively. Gli2_Du3 is recognized as a pseudogene due to a stop codon occurring in the coding region. The pseudogenes, which commonly occur in the gliadin family, are attributed to the single base change C → T. The amino acid sequences deduced from these gene sequences showed the typical structure of α-gliadin proteins, including the toxic sequences (PSQQQP). The peptide fraction PF(Y)PP(Q) is thought to be an extra unit of the repetitive domain, slightly diverging from the previous report. Six cysteine residues were observed within two unique domains. Phylogenetic analysis showed that Gli2_Du2 and Gli2_Du3 were closely related to the genes on chromosome 6A, whereas Gli2_Du1 seems to be more homologous with the genes on chromosome 6B.
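
    The relationship between the coding-sequence lengths and residue counts quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, and it assumes that the quoted lengths (945 and 864 bp) include the terminal stop codon, which the abstract does not state explicitly.

```python
def residues_from_cds(cds_length_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acid residues encoded by a coding sequence.

    Assumption (not stated in the abstract): the quoted length
    includes the stop codon, which encodes no residue.
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    codons = cds_length_bp // 3
    return codons - 1 if includes_stop else codons


# Lengths taken from the abstract above.
for name, length_bp in (("Gli2_Du1", 945), ("Gli2_Du2", 864)):
    print(name, residues_from_cds(length_bp))
```

    Under that assumption the two lengths give 314 and 287 residues, matching the figures in the abstract.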

  2. Cloning, nucleotide sequence and transcriptional analysis of the uvrA gene from Neisseria gonorrhoeae

    A recombinant plasmid capable of restoring UV resistance to an Escherichia coli uvrA mutant was isolated from a genomic library of Neisseria gonorrhoeae. Sequence analysis revealed an open reading frame whose deduced amino acid sequence displayed significant similarity to those of the UvrA proteins of other bacterial species. A second open reading frame (ORF259) was identified upstream from, and in the opposite orientation to the gonococcal uvrA gene. Transcriptional fusions between portions of the gonococcal uvrA upstream region and a reporter gene were used to localise promoter activity in both E. coli and N. gonorrhoeae. The transcriptional starting points of uvrA and ORF259 were mapped in E. coli by primer extension analysis, and corresponding σ70 promoters were identified. The arrangement of the uvrA-ORF259 intergenic region is similar to that of the gonococcal recA-aroD intergenic region. Both contain inverted copies of the 10 bp neisserial DNA uptake sequence situated between divergently transcribed genes. However, there is no evidence that either the uptake sequence or the proximity of the promoters influences expression of these genes. (author)

  3. Next-generation sequencing of 28 ALS-related genes in a Japanese ALS cohort.

    Nakamura, Ryoichi; Sone, Jun; Atsuta, Naoki; Tohnai, Genki; Watanabe, Hazuki; Yokoi, Daichi; Nakatochi, Masahiro; Watanabe, Hirohisa; Ito, Mizuki; Senda, Jo; Katsuno, Masahisa; Tanaka, Fumiaki; Li, Yuanzhe; Izumi, Yuishin; Morita, Mitsuya; Taniguchi, Akira; Kano, Osamu; Oda, Masaya; Kuwabara, Satoshi; Abe, Koji; Aiba, Ikuko; Okamoto, Koichi; Mizoguchi, Kouichi; Hasegawa, Kazuko; Aoki, Masashi; Hattori, Nobutaka; Tsuji, Shoji; Nakashima, Kenji; Kaji, Ryuji; Sobue, Gen

    2016-03-01

    We investigated the frequency and contribution of variants of the 28 known amyotrophic lateral sclerosis (ALS)-related genes in Japanese ALS patients. We designed a multiplex, polymerase chain reaction-based primer panel to amplify the coding regions of the 28 ALS-related genes and sequenced DNA samples from 257 Japanese ALS patients using an Ion Torrent PGM sequencer. We also performed exome sequencing and identified variants of the 28 genes in an additional 251 ALS patients using an Illumina HiSeq 2000 platform. We identified the known ALS pathogenic variants and predicted the functional properties of novel nonsynonymous variants in silico. These variants were confirmed by Sanger sequencing. Known pathogenic variants were identified in 19 (48.7%) of the 39 familial ALS patients and 14 (3.0%) of the 469 sporadic ALS patients. Thirty-two sporadic ALS patients (6.8%) harbored 1 or 2 novel nonsynonymous variants of ALS-related genes that might be deleterious. This study reports the first extensive genetic screening of Japanese ALS patients. These findings are useful for developing genetic screening and counseling strategies for such patients. PMID:26742954

  4. Molecular cloning, sequence characteristics, and tissue expression analysis of ECE1 gene in Tibetan pig.

    Wang, Yan-Dong; Zhang, Jian; Li, Chuan-Hao; Xu, Hai-Peng; Chen, Wei; Zeng, Yong-Qing; Wang, Hui

    2015-10-25

    Low air pressure and low oxygen partial pressure at high altitude seriously affect the survival and development of human beings and animals. ECE1 is a recently discovered gene that is involved in anti-hypoxia, but the full-length cDNA sequence has not been obtained. For a better understanding of the structure and function of the ECE1 gene and to study its effect in Tibetan pig, the cDNA of the ECE1 gene from the muscle of Tibetan pig was cloned, sequenced and characterized. The ECE1 full-length cDNA sequence consists of a 2262 bp coding sequence (CDS) that encodes 753 amino acids with a molecular mass of 85.449 kDa, a 2 bp 5'UTR and a 1507 bp 3'UTR. In addition, phylogenetic tree analysis revealed that the Tibetan pig ECE1 is more closely related, in terms of evolutionary distance, to the ECE1 of land mammals. Furthermore, analysis by qPCR showed that the ECE1 transcript is constitutively expressed in the 10 tissues tested: the liver, subcutaneous fat, kidney, muscle, stomach, heart, brain, spleen, pancreas, and lung. These results serve as a foundation for further insight into the Tibetan pig ECE1 gene. PMID:26115769
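
    As a rough plausibility check on the molecular mass reported above, a protein's mass can be approximated from its residue count. The sketch below uses the common rule of thumb of about 110 Da per residue; that average and the function name are assumptions made for illustration and are not taken from the abstract, which reports a sequence-derived value of about 85.4 kDa (85,449 Da).

```python
# Rule-of-thumb average mass of one amino acid residue (assumption).
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int, avg_residue_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * avg_residue_da / 1000.0


# 753 residues, as reported for Tibetan pig ECE1.
print(round(approx_mass_kda(753), 2))
```

    The estimate lands in the low-to-mid 80s of kDa, the same order of magnitude as the reported mass; an exact value depends on the actual amino acid composition.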

  5. MATRIX PROTEIN GENE SEQUENCE ANALYSIS OF AVIAN PARAMYXOVIRUS 1 ISOLATES OBTAINED FROM PIGEONS

    The matrix protein gene was cloned and sequenced for several recent isolates of avian paramyxovirus 1 (APMV1). Specifically, isolates from pigeons and doves, members of the Columbidae family were examined. APMV1 is the causative agent of Newcastle disease and the virus is associated with disease amo...

  6. Distribution of Genes and Repetitive Elements in the Diabrotica virgifera virgifera Genome Estimated Using BAC Sequencing

    Brad S. Coates

    2012-01-01

    Full Text Available Feeding damage caused by the western corn rootworm, Diabrotica virgifera virgifera, is destructive to corn plants in North America and Europe where control remains challenging due to evolution of resistance to chemical and transgenic toxins. A BAC library, DvvBAC1, containing 109,486 clones with 104±34.5 kb inserts was created, which has ~4.56X genome coverage based upon a 2.58 Gb (2.80 pg) flow cytometry-estimated haploid genome size. Paired end sequencing of 1037 BAC inserts produced 1.17 Mb of data (~0.05% genome coverage) and indicated that ~9.4 and 16.0% of reads encode, respectively, endogenous genes and transposable elements (TEs). Sequencing genes within BAC full inserts demonstrated that TE densities are high within intergenic and intron regions and contribute to the increased gene size. Comparison of homologous genome regions cloned within different BAC clones indicated that TE movement may cause haplotype variation within the inbred strain. The data presented here indicate that the D. virgifera virgifera genome is large in size and contains a high proportion of repetitive sequence. These BAC sequencing methods, which are applicable to the characterization of genomes prior to sequencing, are likely to be valuable resources for genome annotation as well as scaffolding.
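
    The coverage figures in this abstract follow from simple ratios, sketched below in Python. The function names are hypothetical and the calculation is a naive one (clones times mean insert size, divided by genome size). With the quoted numbers it gives roughly 4.4X library coverage, in the same range as the reported ~4.56X; the abstract does not say how its own figure was derived, so the small difference is left unexplained here.

```python
def library_coverage(n_clones: int, mean_insert_kb: float, genome_size_gb: float) -> float:
    """Naive fold-coverage of a clone library: total cloned DNA / genome size."""
    total_kb = n_clones * mean_insert_kb
    genome_kb = genome_size_gb * 1_000_000  # 1 Gb = 1,000,000 kb
    return total_kb / genome_kb


def sampled_percent(sequenced_mb: float, genome_size_gb: float) -> float:
    """Percentage of the genome represented by a given amount of sequence."""
    return sequenced_mb / (genome_size_gb * 1000) * 100


# Figures quoted in the abstract above.
print(round(library_coverage(109_486, 104, 2.58), 2))  # library fold-coverage
print(round(sampled_percent(1.17, 2.58), 3))           # percent of genome in BAC-end reads
```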

  7. Sequence signatures involved in targeting the male-specific lethal complex to X-chromosomal genes in Drosophila melanogaster

    Philip Philge

    2012-03-01

    Full Text Available Abstract Background In Drosophila melanogaster, the dosage-compensation system that equalizes X-linked gene expression between males and females, thereby assuring that an appropriate balance is maintained between the expression of genes on the X chromosome(s) and the autosomes, is at least partially mediated by the Male-Specific Lethal (MSL) complex. This complex binds to genes with a preference for exons on the male X chromosome with a 3' bias, and it targets most expressed genes on the X chromosome. However, a number of genes are expressed but not targeted by the complex. High affinity sites seem to be responsible for initial recruitment of the complex to the X chromosome, but the targeting to and within individual genes is poorly understood. Results We have extensively examined X chromosome sequence variation within five types of gene features (promoters, 5' UTRs, coding sequences, introns, 3' UTRs) and intergenic sequences, and assessed its potential involvement in dosage compensation. Presented results show that: the X chromosome has a distinct sequence composition within its gene features; some of the detected variation correlates with genes targeted by the MSL-complex; the insulator protein BEAF-32 preferentially binds upstream of MSL-bound genes; BEAF-32 and MOF co-localize in promoters; and that bound genes have a distinct sequence composition that shows a 3' bias within coding sequence. Conclusions Although many strongly bound genes are close to a high affinity site (HAS), neither our promoter motif nor our coding sequence signatures show any correlation to HAS. Based on the results presented here, we believe that there are sequences in the promoters and coding sequences of targeted genes that have the potential to direct the secondary spreading of the MSL-complex to nearby genes.

  8. Evaluation and update of cutoff values for methanotrophic pmoA gene sequences.

    Wen, Xi; Yang, Sizhong; Liebner, Susanne

    2016-09-01

    The functional pmoA gene is frequently used to probe the diversity and phylogeny of methane-oxidizing bacteria (MOB) in various environments. Here, we compared the similarities between the pmoA gene and the corresponding 16S rRNA gene sequences of 77 described species covering gamma- and alphaproteobacterial methanotrophs (type I and type II MOB, respectively) as well as methanotrophs from the phylum Verrucomicrobia. We updated and established the weighted mean pmoA gene cutoff values on the nucleotide level at 86, 82, and 71 % corresponding to the 97, 95, and 90 % similarity of the 16S rRNA gene. Based on these cutoffs, the functional gene fragments can be entirely processed at the nucleotide level throughout software platforms such as Mothur or QIIME which provide a user-friendly and command-based alternative to amino acid-based pipelines. Type II methanotrophs are less divergent than type I both with regard to ribosomal and functional gene sequence similarity and GC content. We suggest that this agrees with the theory of different life strategies proposed for type I and type II MOB. PMID:27098810
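
    The cutoff correspondence reported above can be expressed as a small lookup. The following Python sketch is illustrative only: the table simply restates the three value pairs from the abstract, while the function name and the "highest level met" logic are assumptions about how one might apply them, not the authors' pipeline.

```python
# (pmoA nucleotide identity cutoff %, corresponding 16S rRNA similarity %)
# Values restated from the abstract; ordered from most to least stringent.
PMOA_TO_16S = [(86.0, 97.0), (82.0, 95.0), (71.0, 90.0)]


def equivalent_16s_level(pmoa_identity_pct: float):
    """Return the highest 16S similarity level whose pmoA cutoff is met.

    Returns None when the identity falls below every cutoff.
    """
    for pmoa_cutoff, rrna_level in PMOA_TO_16S:
        if pmoa_identity_pct >= pmoa_cutoff:
            return rrna_level
    return None


for identity in (90.0, 84.0, 75.0, 60.0):
    print(identity, equivalent_16s_level(identity))
```

    For example, two pmoA fragments sharing 84 % nucleotide identity meet the 82 % cutoff but not the 86 % one, so they would be grouped at the level corresponding to 95 % 16S rRNA similarity.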

  9. Sequencing and complementation analysis of the nifUSV genes from Azospirillum brasilense.

    Frazzon, J; Schrank, I S

    1998-02-15

    The functionality of nitrogenase in diazotrophic bacteria is dependent upon nif genes other than the structural nifH, D, and K genes which encode the enzyme subunit proteins. Such genes are involved in the activation of nif gene expression, maturation of subunit proteins, cofactor biosynthesis, and electron transport. In this work, approximately 5500 base pairs located within the major nif gene cluster of Azospirillum brasilense Sp7 have been sequenced. The deduced open reading frames were compared to the nif gene products of Azotobacter vinelandii and other diazotrophs. This analysis indicates the presence of five ORFs encoding ORF2, nifU, nifS, nifV, and ORF4 in the same sequential organization as found in other organisms. Consensus sigma 54 and NifA binding sites are present in the putative promoter region upstream of ORF2 in the A. brasilense sequence. The nifV gene of A. brasilense, but not nifU or nifS, complemented the corresponding mutant strains of A. vinelandii. PMID:9503607

  10. Resolution of the African hominoid trichotomy by use of a mitochondrial gene sequence

    Mitochondrial DNA sequences encoding the cytochrome oxidase subunit II gene have been determined for five primate species, siamang (Hylobates syndactylus), lowland gorilla (Gorilla gorilla), pygmy chimpanzee (Pan paniscus), crab-eating macaque (Macaca fascicularis), and green monkey (Cercopithecus aethiops), and compared with published sequences of other primate and nonprimate species. Comparisons of cytochrome oxidase subunit II gene sequences provide clear-cut evidence from the mitochondrial genome for the separation of the African ape trichotomy into two evolutionary lineages, one leading to gorillas and the other to humans and chimpanzees. Several different tree-building methods support this same phylogenetic tree topology. The comparisons also yield trees in which a substantial length separates the divergence point of gorillas from that of humans and chimpanzees, suggesting that the lineage most immediately ancestral to humans and chimpanzees may have been in existence for a relatively long time

  11. Resolution of the African hominoid trichotomy by use of a mitochondrial gene sequence

    Ruvolo, M.; Disotell, T.R.; Allard, M.W. (Harvard Univ., Cambridge, MA (United States)); Brown, W.M. (Univ. of Michigan, Ann Arbor (United States)); Honeycutt, R.L. (Texas A and M Univ., College Station (United States))

    1991-02-15

    Mitochondrial DNA sequences encoding the cytochrome oxidase subunit II gene have been determined for five primate species, siamang (Hylobates syndactylus), lowland gorilla (Gorilla gorilla), pygmy chimpanzee (Pan paniscus), crab-eating macaque (Macaca fascicularis), and green monkey (Cercopithecus aethiops), and compared with published sequences of other primate and nonprimate species. Comparisons of cytochrome oxidase subunit II gene sequences provide clear-cut evidence from the mitochondrial genome for the separation of the African ape trichotomy into two evolutionary lineages, one leading to gorillas and the other to humans and chimpanzees. Several different tree-building methods support this same phylogenetic tree topology. The comparisons also yield trees in which a substantial length separates the divergence point of gorillas from that of humans and chimpanzees, suggesting that the lineage most immediately ancestral to humans and chimpanzees may have been in existence for a relatively long time.

  12. Taxonomically Clustering Organisms Based on the Profiles of Gene Sequences Using PCA

    E. Ramaraj

    2006-01-01

    Full Text Available The biological implications of bioinformatics can already be seen in various implementations. Biological taxonomy may seem like a simple science in which biologists merely observe similarities among organisms and construct classifications according to those similarities [1], but it is not so simple. By applying data mining techniques to a gene sequence database, we can cluster the data to find interesting similarities in the gene expression data. One application of this kind of clustering is taxonomically clustering organisms based on their gene sequence expression. In this study we outline a method for taxonomic clustering of species based on their genetic profile using Principal Component Analysis and Self Organizing Neural Networks. We implemented the idea in Matlab and clustered gene sequences taken from the PAUP version of the ML5/ML6 database. The taxa used were some of the basidiomycetous fungi from the database. To study scalability issues, another large gene sequence database was used. The proposed method clustered the species correctly in almost all cases, and the results obtained were significant and promising.
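
    To make the idea of PCA-based grouping of sequence profiles concrete, here is a minimal Python sketch. It is not the authors' Matlab implementation and omits the self-organizing network step. The choice of dinucleotide frequencies as the "profile", the power-iteration PCA, the function names, and the toy sequences are all assumptions made purely for illustration.

```python
from itertools import product


def kmer_profile(seq: str, k: int = 2) -> list:
    """Relative frequency of every DNA k-mer in seq, in a fixed ACGT order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing ambiguous bases
            counts[kmer] += 1
    total = sum(counts.values()) or 1
    return [counts[m] / total for m in kmers]


def first_pc_scores(rows: list, iterations: int = 200) -> list:
    """Project each row onto the first principal component.

    The leading eigenvector of the covariance matrix is found by
    power iteration; its sign is arbitrary.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    vec = [1.0 / (j + 1) for j in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in nxt) ** 0.5
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    return [sum(c[j] * vec[j] for j in range(d)) for c in centered]


# Toy sequences (hypothetical): two AT-rich and two GC-rich.
toy = {
    "at_rich_1": "ATATATTAATATTTAA" * 4,
    "at_rich_2": "TTAATATAATTATATA" * 4,
    "gc_rich_1": "GCGCGGCCGCGCCGGC" * 4,
    "gc_rich_2": "CCGGCGCGGCCGCGCG" * 4,
}
scores = first_pc_scores([kmer_profile(s) for s in toy.values()])
for name, score in zip(toy, scores):
    print(name, round(score, 3))
```

    In this toy case the two AT-rich sequences fall on one side of the first principal component and the two GC-rich sequences on the other, which is the kind of separation a downstream clustering step would then pick up.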

  13. SeqGene: a comprehensive software solution for mining exome- and transcriptome- sequencing data

    Deng Xutao

    2011-06-01

    Full Text Available Abstract Background The popularity of massively parallel exome and transcriptome sequencing projects demands new data mining tools with a comprehensive set of features to support a wide range of analysis tasks. Results SeqGene, a new data mining tool, supports mutation detection and annotation, dbSNP and 1000 Genome data integration, RNA-Seq expression quantification, mutation and coverage visualization, allele specific expression (ASE), differentially expressed genes (DEGs) identification, copy number variation (CNV) analysis, and gene expression quantitative trait loci (eQTLs) detection. We also developed novel methods for testing the association between SNP and expression and identifying genotype-controlled DEGs. We showed that the results generated from SeqGene compare favourably to other existing methods in our case studies. Conclusion SeqGene is designed as a general-purpose software package. It supports both paired-end reads and single reads generated on most sequencing platforms; it runs on all major types of computers; it supports arbitrary genome assemblies for arbitrary organisms; and it scales well to support both large and small scale sequencing projects. The software homepage is http://seqgene.sourceforge.net.

  14. Two lamprey Hedgehog genes share non-coding regulatory sequences and expression patterns with gnathostome Hedgehogs.

    Shungo Kano

    Full Text Available Hedgehog (Hh) genes play major roles in animal development and studies of their evolution, expression and function point to major differences among chordates. Here we focused on Hh genes in lampreys in order to characterize the evolution of Hh signalling at the emergence of vertebrates. Screening of a cosmid library of the river lamprey Lampetra fluviatilis and searching the preliminary genome assembly of the sea lamprey Petromyzon marinus indicate that lampreys have two Hh genes, named Hha and Hhb. Phylogenetic analyses suggest that Hha and Hhb are lamprey-specific paralogs closely related to Sonic/Indian Hh genes. Expression analysis indicates that Hha and Hhb are expressed in a Sonic Hh-like pattern. The two transcripts are expressed in largely overlapping but not identical domains in the lamprey embryonic brain, including a newly-described expression domain in the nasohypophyseal placode. Global alignments of genomic sequences and local alignment with known gnathostome regulatory motifs show that lamprey Hhs share conserved non-coding elements (CNE) with gnathostome Hhs, albeit with sequences that have significantly diverged and dispersed. Functional assays using zebrafish embryos demonstrate gnathostome-like midline enhancer activity for CNEs contained in intron2. We conclude that lamprey Hh genes are gnathostome Shh-like in terms of expression and regulation. In addition, they show some lamprey-specific features, including duplication and structural (but not functional) changes in the intronic/regulatory sequences.

  15. SeqGene: a comprehensive software solution for mining exome- and transcriptome- sequencing data

    2011-01-01

    Background The popularity of massively parallel exome and transcriptome sequencing projects demands new data mining tools with a comprehensive set of features to support a wide range of analysis tasks. Results SeqGene, a new data mining tool, supports mutation detection and annotation, dbSNP and 1000 Genome data integration, RNA-Seq expression quantification, mutation and coverage visualization, allele specific expression (ASE), differentially expressed genes (DEGs) identification, copy number variation (CNV) analysis, and gene expression quantitative trait loci (eQTLs) detection. We also developed novel methods for testing the association between SNP and expression and identifying genotype-controlled DEGs. We showed that the results generated from SeqGene compare favourably to other existing methods in our case studies. Conclusion SeqGene is designed as a general-purpose software package. It supports both paired-end reads and single reads generated on most sequencing platforms; it runs on all major types of computers; it supports arbitrary genome assemblies for arbitrary organisms; and it scales well to support both large and small scale sequencing projects. The software homepage is http://seqgene.sourceforge.net. PMID:21714929

  16. Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes

    Devier Benjamin

    2007-08-01

    Full Text Available Abstract Background The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the Caryophyllaceae family and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production. Results A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% a weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE. Conclusion This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics.

  17. Sequencing, physical organization and kinetic expression of the patulin biosynthetic gene cluster from Penicillium expansum

    Patulin is a polyketide-derived mycotoxin produced by numerous filamentous fungi. Among them, Penicillium expansum is by far the most problematic species. This fungus is a destructive phytopathogen capable of growing on fruit, provoking the blue mold decay of apples and producing significant amounts of patulin. The biosynthetic pathway of this mycotoxin is chemically well-characterized, but its genetic bases remain largely unknown, with only a few characterized genes in less economically relevant species. The present study consisted of the identification and positional organization of the patulin gene cluster in P. expansum strain NRRL 35695. Several amplification reactions were performed with degenerate primers that were designed based on sequences from the orthologous genes available in other species. An improved genome walking approach was used in order to sequence the remaining adjacent genes of the cluster. RACE-PCR was also carried out from mRNAs to determine the start and stop codons of the coding sequences. The patulin gene cluster in P. expansum consists of 15 genes in the following order: patH, patG, patF, patE, patD, patC, patB, patA, patM, patN, patO, patL, patI, patJ, and patK. These genes share 60–70% identity with orthologous genes grouped differently, within a putative patulin cluster described in a non-producing strain of Aspergillus clavatus. The kinetics of patulin cluster gene expression were studied under patulin-permissive conditions (natural apple-based medium) and patulin-restrictive conditions (Eagle's minimal essential medium), and demonstrated a significant association between gene expression and patulin production. In conclusion, the sequence of the patulin cluster in P. expansum constitutes a key step for a better understanding of the mechanisms leading to patulin production in this fungus. It will allow the role of each gene to be elucidated, and help to define strategies to reduce patulin production in apple-based products.

  18. Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes

    Ramy Karam Aziz

    2015-05-01

    Full Text Available Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage ecology has benefited tremendously from the emergence of metagenomic sequencing, a systematic survey of phage genes and genomes in various ecosystems is still lacking, and fundamental questions about phage biology, lifestyle, and ecology remain unanswered. To address these questions and improve comparative analysis of phages in different metagenomes, we screened a core set of publicly available metagenomic samples for sequences related to completely sequenced phages using the web tool, Phage Eco-Locator. We then adopted and deployed an array of mathematical and statistical metrics for a multidimensional estimation of the abundance and distribution of phage genes and genomes in various ecosystems. Experiments using those metrics individually showed their usefulness in emphasizing the pervasive, yet uneven, distribution of known phage sequences in environmental metagenomes. Using these metrics in combination allowed us to resolve phage genomes into clusters that correlated with their genotypes and taxonomic classes as well as their ecological properties. We propose adding this set of metrics to current metaviromic analysis pipelines, where they can provide insight regarding phage mosaicism, habitat specificity, and evolution.

  19. [Cloning, sequence analysis and expression of N-acetylglutamate kinase gene in Corynebacterium crenatum].

    Hao, Ning; Zhao, Zhi; Wang, Yu; Zhang, Ying-zi; Ding, Jiu-yuan

    2006-02-01

    N-Acetylglutamate kinase (EC 2.7.2.8; NAGK) genes from wild-type Corynebacterium crenatum AS 1.542 and an L-arginine-producing mutant, C. crenatum 971.1, were cloned and sequenced. Analysis of the argB sequences revealed a single ORF, which used ATG as the initiation codon and coded for a peptide of 317 amino acids with a calculated molecular weight of 33.6 kDa. Comparison of the gene sequences of the wild-type C. crenatum AS 1.542 and the mutant 971.1 found only one nucleotide difference in the structural gene, and this difference did not cause an amino acid change. The ORF sequence of argB from C. crenatum AS 1.542 showed homologies of 99.89%, 76.62% and 37.94% to those from Corynebacterium glutamicum ATCC 13032, Corynebacterium efficiens YS-314 and Escherichia coli K-12, respectively. The amino acid sequence deduced from the ORF displayed homologies of 100%, 78.55% and 25.25% to those from the microorganisms above, respectively. An internal promoter was found upstream of the argB gene from C. crenatum. The argB gene from C. crenatum AS 1.542 was expressed both in C. crenatum AS 1.542 and 971.1. The NAGK activity of transformed C. crenatum AS 1.542 was greatly increased by induction with IPTG. The NAGK activity of transformed C. crenatum 971.1 was almost twice that of C. crenatum 971.1 under the same induction. The amplification of the NAGK activity yielded a 25% increase in L-arginine production in C. crenatum 971.1. PMID:16579472

  20. Diversity through duplication: whole-genome sequencing reveals novel gene retrocopies in the human population.

    Richardson, Sandra R; Salvador-Palomeque, Carmen; Faulkner, Geoffrey J

    2014-05-01

    Gene retrocopies are generated by reverse transcription and genomic integration of mRNA. As such, retrocopies present an important exception to the central dogma of molecular biology, and have substantially impacted the functional landscape of the metazoan genome. While an estimated 8,000-17,000 retrocopies exist in the human genome reference sequence, the extent of variation between individuals in terms of retrocopy content has remained largely unexplored. Three recent studies by Abyzov et al., Ewing et al. and Schrider et al. have exploited 1,000 Genomes Project Consortium data, as well as other sources of whole-genome sequencing data, to uncover novel gene retrocopies. Here, we compare the methods and results of these three studies, highlight the impact of retrocopies in human diversity and genome evolution, and speculate on the potential for somatic gene retrocopies to impact cancer etiology and genetic diversity among individual neurons in the mammalian brain. PMID:24615986

  1. Sequencing and analysis of the gene-rich space of cowpea

    Cheung Foo

    2008-02-01

    Full Text Available Abstract Background Cowpea, Vigna unguiculata (L.) Walp., is one of the most important food and forage legumes in the semi-arid tropics because of its drought tolerance and ability to grow on poor quality soils. Approximately 80% of cowpea production takes place in the dry savannahs of tropical West and Central Africa, mostly by poor subsistence farmers. Despite its economic and social importance in the developing world, cowpea remains to a large extent an underexploited crop. Among the major goals of cowpea breeding and improvement programs is the stacking of desirable agronomic traits, such as disease and pest resistance and response to abiotic stresses. Implementation of marker-assisted selection and breeding programs is severely limited by a paucity of trait-linked markers and a general lack of information on gene structure and organization. With a nuclear genome size estimated at ~620 Mb, the cowpea genome is an ideal target for reduced representation sequencing. Results We report here the sequencing and analysis of the gene-rich, hypomethylated portion of the cowpea genome selectively cloned by methylation filtration (MF) technology. Over 250,000 gene-space sequence reads (GSRs) with an average length of 610 bp were generated, yielding ~160 Mb of sequence information. The GSRs were assembled, annotated by BLAST homology searches of four public protein annotation databases and four plant proteomes (A. thaliana, M. truncatula, O. sativa, and P. trichocarpa), and analyzed using various domain and gene modeling tools. A total of 41,260 GSR assemblies and singletons were annotated, of which 19,786 have unique GenBank accession numbers. Within the GSR dataset, 29% of the sequences were annotated using the Arabidopsis Gene Ontology (GO), with the largest categories of assigned function being catalytic activity and metabolic processes, groups that include the majority of cellular enzymes and components of amino acid, carbohydrate and lipid metabolism. A

  2. Characterization of partial Hox gene sequences in annual fish of the subfamily Cynolebiatinae (Cyprinodontiformes, Rivulidae)

    Verónica Gutiérrez

    2007-03-01

    Full Text Available Hox genes encode a family of transcription factors implicated in conferring regional identity along the anteroposterior axis in developing animal embryos. These genes are organized in genomic clusters, expressed collinearly and highly conserved in vertebrates. Among teleosts, South American annual killifishes of the Cynolebiatinae subfamily represent an excellent model in developmental studies because their embryos are capable of undergoing reversible developmental arrest (diapause) at three well-defined morphological stages. They are also an excellent model for evolutionary studies due to the high rates of mutation of their mitochondrial genome, their karyotypic divergence and their morphological variability. In this study, three partial homeobox sequences were isolated from different species of the Cynolebiatinae subfamily. Phylogenetic analyses and sequence comparisons revealed that they belong to the anterior Hox complex group, specifically to paralogue groups 1 and 3. This is the first time that partial Hox genes have been described in species of the Cynolebiatinae subfamily.

  3. Stable intronic sequence RNAs (sisRNAs): a new layer of gene regulation.

    Osman, Ismail; Tay, Mandy Li-Ian; Pek, Jun Wei

    2016-09-01

    Upon splicing, introns are rapidly degraded. Hence, RNAs derived from introns are commonly deemed as junk sequences. However, the discoveries of intronic-derived small nucleolar RNAs (snoRNAs), small Cajal body associated RNAs (scaRNAs) and microRNAs (miRNAs) suggested otherwise. These non-coding RNAs are shown to play various roles in gene regulation. In this review, we highlight another class of intron-derived RNAs known as stable intronic sequence RNAs (sisRNAs). sisRNAs have been observed since the 1980s; however, we are only beginning to understand their biological significance. Recent studies have shown or suggested that sisRNAs regulate their own host's gene expression, function as molecular sinks or sponges, and regulate protein translation. We propose that sisRNAs function as an additional layer of gene regulation in the cells. PMID:27147469

  4. Chicken TAP genes differ from their human orthologues in locus organisation, size, sequence features and polymorphism.

    Walker, Brian A; van Hateren, Andrew; Milne, Sarah; Beck, Stephan; Kaufman, Jim

    2005-05-01

    We have previously shown that in the chicken major histocompatibility complex, the two transporters associated with antigen processing genes (TAP1 and TAP2) are located head to head between two classical class I genes. Here we show that the region between these two TAP genes has transcription factor-binding sites in common with class I gene promoters. The TAP genes are also up-regulated by interferon-gamma in a similar way to mammalian TAP genes and in a way that suggests they are both transcribed from a bi-directional promoter. The gene structures of TAP1 and TAP2 differ from that of human TAPs in that TAP1 has a truncated exon 1 and TAP2 has fused exons, resulting in a much smaller gene size. The truncation of TAP1 results in the loss of approximately 150 amino acids, which are thought to be involved in endoplasmic reticulum retention, heterodimer formation and tapasin binding, compared to human TAP1. Most of the protein sequence features involved in binding ATP are conserved, with two exceptions: chicken TAP1 has a glycine in the switch region where other TAPs have glutamine or histidine, and both chicken TAP genes have serines in the C motif where mammalian TAP2 has an alanine. Lastly, the chicken TAP genes are highly polymorphic, with at least as many TAP alleles as there are class I alleles, as seen by investigating nine inbred lines of chicken. The close proximity of the TAP genes to the class I genes and the high level of polymorphism may allow co-evolution of the genes, allowing TAP molecules to transport peptides specifically for the class I molecules of that haplotype. PMID:15900495

  5. Nucleotide sequence of the Pseudomonas fluorescens signal peptidase II gene (lsp) and flanking genes.

    Isaki, L; Beers, R; Wu, H.C.

    1990-01-01

    The lsp gene encoding prolipoprotein signal peptidase (signal peptidase II) is organized into an operon consisting of ileS and three open reading frames, designated genes x, orf149, and orf316 in both Escherichia coli and Enterobacter aerogenes. A plasmid, pBROC128, containing a 5.8-kb fragment of Pseudomonas fluorescens DNA was found to confer pseudomonic acid resistance on E. coli host cells and to contain the structural gene of ileS from P. fluorescens. In addition, E. coli strains carryin...

  6. Driver Gene Mutations in Stools of Colorectal Carcinoma Patients Detected by Targeted Next-Generation Sequencing.

    Armengol, Gemma; Sarhadi, Virinder K; Ghanbari, Reza; Doghaei-Moghaddam, Masoud; Ansari, Reza; Sotoudeh, Masoud; Puolakkainen, Pauli; Kokkola, Arto; Malekzadeh, Reza; Knuutila, Sakari

    2016-07-01

    Detection of driver gene mutations in stool DNA represents a promising noninvasive approach for screening colorectal cancer (CRC). Amplicon-based next-generation sequencing (NGS) is a good option to study mutations in many cancer genes simultaneously and from a low amount of DNA. Our aim was to assess the feasibility of identifying mutations in 22 cancer driver genes with Ion Torrent technology in stool DNA from a series of 65 CRC patients. The assay was successful in 80% of stool DNA samples. NGS results showed 83 mutations in cancer driver genes, 29 hotspot and 54 novel mutations. One to five genes were mutated in 75% of cases. TP53, KRAS, FBXW7, and SMAD4 were the top mutated genes, consistent with previous studies. Of samples with mutations, 54% presented concomitant mutations in different genes. Phosphatidylinositol 3-kinase/mitogen-activated protein kinase pathway genes were mutated in 70% of samples, with 58% having alterations in KRAS, NRAS, or BRAF. Because mutations in these genes can compromise the efficacy of epidermal growth factor receptor blockade in CRC patients, identifying mutations that confer resistance to some targeted treatments may be useful to guide therapeutic decisions. In conclusion, the data presented herein show that NGS procedures on stool DNA represent a promising tool to detect genetic mutations that could be used in the future for diagnosis, monitoring, or treating CRC. PMID:27155048

  7. Identification of miRNAs and their target genes in developing soybean seeds by deep sequencing

    Chen Shou-Yi

    2011-01-01

    Full Text Available Abstract Background MicroRNAs (miRNAs) regulate gene expression by mediating gene silencing at transcriptional and post-transcriptional levels in higher plants. miRNAs and related target genes have been widely studied in model plants such as Arabidopsis and rice; however, the number of identified miRNAs in soybean (Glycine max) is limited, and global identification of the related miRNA targets has not been reported in previous research. Results In our study, a small RNA library and a degradome library were constructed from developing soybean seeds for deep sequencing. We identified 26 new miRNAs in soybean by bioinformatic analysis and further confirmed their expression by stem-loop RT-PCR. The miRNA star sequences of 38 known miRNAs and 8 new miRNAs were also discovered, providing additional evidence for the existence of miRNAs. Through degradome sequencing, 145 and 25 genes were identified as targets of annotated miRNAs and new miRNAs, respectively. GO analysis indicated that many of the identified miRNA targets may function in soybean seed development. Additionally, a soybean homolog of Arabidopsis SUPPRESSOR OF GENE SILENCING 3 (AtSGS3) was detected as a target of the newly identified miRNA Soy_25, suggesting the presence of feedback control of miRNA biogenesis. Conclusions We have identified large numbers of miRNAs and their related target genes through deep sequencing of a small RNA library and a degradome library. Our study provides more information about the regulatory network of miRNAs in soybean and advances our understanding of miRNA functions during seed development.

  8. Species identification using genetic tools: the value of nuclear and mitochondrial gene sequences in whale conservation.

    Palumbi, S R; Cipriano, F

    1998-01-01

    DNA sequence analysis is a powerful tool for identifying the source of samples thought to be derived from threatened or endangered species. Analysis of mitochondrial DNA (mtDNA) from retail whale meat markets has shown consistently that the expected baleen whale in these markets, the minke whale, makes up only about half the products analyzed. The other products are either unregulated small toothed whales like dolphins or are protected baleen whales such as humpback, Bryde's, fin, or blue whales. Independent verification of such mtDNA identifications requires analysis of nuclear genetic loci, but this is technically more difficult than standard mtDNA sequencing. In addition, evolution of species-specific sequences (i.e., fixation of sequence differences to produce reciprocally monophyletic gene trees) is slower in nuclear than in mitochondrial genes primarily because genetic drift is slower at nuclear loci. When will use of nuclear sequences allow forensic DNA identification? Comparison of neutral theories of coalescence of mitochondrial and nuclear loci suggests a simple rule of thumb. The "three-times rule" suggests that phylogenetic sorting at nuclear loci is likely to produce species-specific sequences when mitochondrial alleles are reciprocally monophyletic and the branches leading to the mtDNA sequences of a species are three times longer than the average difference observed within species. A preliminary test of the three-times rule, which depends on many assumptions about the species and genes involved, suggests that blue and fin whales should have species-specific sequences at most neutral nuclear loci, whereas humpback and fin whales should show species-specific sequences at fewer nuclear loci. Partial sequences of actin introns from these species confirm the predictions of the three-times rule and show that blue and fin whales are reciprocally monophyletic at this locus. These intron sequences are thus good tools for the identification of these species
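
The "three-times rule" described in this record can be written as a simple check. The sketch below is only an illustration of the rule of thumb as worded in the abstract; the function name, parameter names and numeric values are hypothetical, and "three times longer" is read here as "at least three times", which is an assumption.

```python
def three_times_rule(reciprocally_monophyletic: bool,
                     branch_length_to_species: float,
                     mean_within_species_difference: float) -> bool:
    """Rule-of-thumb check for whether nuclear loci are likely to be species-specific.

    Inputs are mtDNA quantities in the same units (e.g. substitutions per site):
    - reciprocally_monophyletic: whether the mtDNA alleles of the two species
      are reciprocally monophyletic
    - branch_length_to_species: length of the branch leading to the species'
      mtDNA sequences
    - mean_within_species_difference: average difference observed within species
    """
    if mean_within_species_difference <= 0:
        raise ValueError("within-species difference must be positive")
    ratio = branch_length_to_species / mean_within_species_difference
    # Both conditions from the abstract must hold.
    return reciprocally_monophyletic and ratio >= 3.0


# Hypothetical values, not data from the study.
print(three_times_rule(True, 0.06, 0.01))   # ratio 6: rule satisfied
print(three_times_rule(True, 0.02, 0.01))   # ratio 2: rule not satisfied
```

As the abstract notes, the rule depends on many assumptions about the species and genes involved, so a result of `True` is an expectation about most neutral nuclear loci, not a guarantee for any single locus.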

  9. alpha-Amylase gene of Streptomyces limosus: nucleotide sequence, expression motifs, and amino acid sequence homology to mammalian and invertebrate alpha-amylases.

    Long, C M; Virolle, M J; Chang, S Y; Chang, S.; Bibb, M.J.

    1987-01-01

    The nucleotide sequence of the coding and regulatory regions of the alpha-amylase gene (aml) of Streptomyces limosus was determined. High-resolution S1 mapping was used to locate the 5' end of the transcript and demonstrated that the gene is transcribed from a unique promoter. The predicted amino acid sequence has considerable identity to mammalian and invertebrate alpha-amylases, but not to those of plant, fungal, or eubacterial origin. Consistent with this is the susceptibility of the enzym...

  10. Development and analytical validation of a 25-gene next generation sequencing panel that includes the BRCA1 and BRCA2 genes to assess hereditary cancer risk

    Judkins, Thaddeus; Leclair, Benoît; Bowles, Karla; Gutin, Natalia; Trost, Jeff; McCulloch, James; Bhatnagar, Satish; Murray, Adam; Craft, Jonathan; Wardell, Bryan; Bastian, Mark; Mitchell, Jeffrey; Jian CHEN; Tran, Thanh; Williams, Deborah

    2015-01-01

    Background Germline DNA mutations that increase the susceptibility of a patient to certain cancers have been identified in various genes, and patients can be screened for mutations in these genes to assess their level of risk for developing cancer. Traditional methods using Sanger sequencing focus on small groups of genes and therefore are unable to screen for numerous genes from several patients simultaneously. The goal of the present study was to validate a 25-gene panel to assess genetic r...

  11. Comparative organization of nitrogen fixation-specific genes from Azotobacter vinelandii and Klebsiella pneumoniae: DNA sequence of the nifUSV genes.

    Beynon, J; Ally, A; Cannon, M; Cannon, F.; Jacobson, M.; Cash, V; Dean, D.

    1987-01-01

    In the facultative anaerobe Klebsiella pneumoniae 17 nitrogen fixation-specific genes (nif genes) have been identified. Homologs to 12 of these genes have now been isolated from the aerobic diazotroph Azotobacter vinelandii. Comparative studies have indicated that these diverse microorganisms share striking similarities in the genetic organization of their nif genes and in the primary structure of their individual nif gene products. In this study the complete nucleotide sequence of the nifUSV...

  12. Nucleotide sequence specifying the glycoprotein gene, gB, of herpes simplex virus type 1.

    Bzik, D J; Fox, B A; DeLuca, N A; Person, S

    1984-03-01

    The nucleotide sequence thought to specify the glycoprotein gene, gB, of the KOS strain of herpes simplex virus type 1 (HSV-1) has been determined. A 3.1-kilobase (kb), viral-specified RNA was mapped to the left half of the BamHI-G fragment (0.345 to 0.399 map units). TATA, CAT-box, and possible mRNA start sequences characteristic of HSV-1 genes are found near 0.368 map units. The first available ATG codon is at 0.366 and the first in-phase chain terminator at 0.348 map units. A polyA-addition signal (AATAAA) occurs 17 nucleotides past the chain terminator. Translation of these sequences would yield a 100.3-kilodalton (kDa) polypeptide characterized by a 5' signal sequence, nine N-linked saccharide addition sites, a strongly hydrophobic membrane-spanning sequence, and a highly charged 3' cytoplasmic anchor sequence. Two mutants of KOS, tsJ12 and tsJ20, that are temperature-sensitive for viral growth and for the production of gB, have been physically mapped to 0.357 to 0.360 and 0.360 to 0.364 map units, respectively (DeLuca et al., in preparation). The nucleotide sequence of the mutants was determined in these regions. In both cases a single amino acid replacement within the 100.3-kDa polypeptide is predicted from the sequence analysis. PMID:6324454

  13. Targeted enrichment of the black cottonwood (Populus trichocarpa) gene space using sequence capture

    Zhou Lecong

    2012-12-01

    Full Text Available Abstract Background High-throughput re-sequencing is rapidly becoming the method of choice for studies of neutral and adaptive processes in natural populations across taxa. As re-sequencing the genome of large numbers of samples is still cost-prohibitive in many cases, methods for genome complexity reduction have been developed in attempts to capture most ecologically-relevant genetic variation. One of these approaches is sequence capture, in which oligonucleotide baits specific to genomic regions of interest are synthesized and used to retrieve and sequence those regions. Results We used sequence capture to re-sequence most predicted exons, their upstream regulatory regions, as well as numerous random genomic intervals in a panel of 48 genotypes of the angiosperm tree Populus trichocarpa (black cottonwood, or ‘poplar’). A total of 20.76 Mb (5% of the poplar genome) was targeted, corresponding to 173,040 baits. With 12 indexed samples run in each of four lanes on an Illumina HiSeq instrument (2x100 paired-end), 86.8% of the bait regions were on average sequenced at a depth ≥10X. Few off-target regions (>250 bp away from any bait) were present in the data, but on average ~80 bp on either side of the baits were captured and sequenced to an acceptable depth (≥10X) to call heterozygous SNPs. Nucleotide diversity estimates within and adjacent to protein-coding genes were similar to those previously reported in Populus spp., while intergenic regions had higher values consistent with a relaxation of selection. Conclusions Our results illustrate the efficiency and utility of sequence capture for re-sequencing highly heterozygous tree genomes, and suggest design considerations to optimize the use of baits in future studies.
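
The capture statistics in this record (share of bait regions sequenced at ≥10X, target size versus bait count) are simple ratios. The sketch below shows how such a depth summary could be computed; the function name and the example depth values are hypothetical, and only the 20.76 Mb and 173,040 figures come from the abstract.

```python
def fraction_at_depth(depths, min_depth=10):
    """Fraction of regions whose sequencing depth is at least `min_depth`."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


# Hypothetical per-region mean depths for four bait regions.
example_depths = [12, 9, 10, 30]
print(fraction_at_depth(example_depths))  # 3 of 4 regions reach 10X

# Arithmetic on the figures reported in the abstract:
# 20.76 Mb targeted across 173,040 baits.
mean_bp_per_bait = 20.76e6 / 173_040
print(round(mean_bp_per_bait))  # about 120 bp of target per bait
```

The per-bait figure is an average derived from the two reported totals; the abstract does not state the actual bait length or tiling design.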

  14. Analysis of mutations in the entire coding sequence of the factor VIII gene

    Bidichadani, S.I.; Lanyon, W.G.; Connor, J.M. [Glasgow Univ. (United Kingdom)] [and others]

    1994-09-01

    Hemophilia A is a common X-linked recessive disorder of bleeding caused by deleterious mutations in the gene for clotting factor VIII. The large size of the factor VIII gene, the high frequency of de novo mutations and its tissue-specific expression complicate the detection of mutations. We have used a combination of RT-PCR of ectopic factor VIII transcripts and genomic DNA-PCRs to amplify the entire essential sequence of the factor VIII gene. This is followed by chemical mismatch cleavage analysis and direct sequencing in order to facilitate a comprehensive search for mutations. We describe the characterization of nine potentially pathogenic mutations, six of which are novel. In each case, a correlation of the genotype with the observed phenotype is presented. In order to evaluate the pathogenicity of the five missense mutations detected, we have analyzed them for evolutionary sequence conservation and for their involvement of sequence motifs catalogued in the PROSITE database of protein sites and patterns.

  15. SERPINA1 Full-Gene Sequencing Identifies Rare Mutations Not Detected in Targeted Mutation Analysis.

    Graham, Rondell P; Dina, Michelle A; Howe, Sarah C; Butz, Malinda L; Willkomm, Kurt S; Murray, David L; Snyder, Melissa R; Rumilla, Kandelaria M; Halling, Kevin C; Highsmith, W Edward

    2015-11-01

    Genetic α-1 antitrypsin (AAT) deficiency is characterized by low serum AAT levels and the identification of causal mutations or an abnormal protein. It needs to be distinguished from deficiency because of nongenetic causes, and diagnostic delay may contribute to worse patient outcome. Current routine clinical testing assesses for only the most common mutations. We wanted to determine the proportion of unexplained cases of AAT deficiency that harbor causal mutations not identified through current standard allele-specific genotyping and isoelectric focusing (IEF). All prospective cases from December 1, 2013, to October 1, 2014, with a low serum AAT level not explained by allele-specific genotyping and IEF were assessed through full-gene sequencing with a direct sequencing method for pathogenic mutations. We reviewed the results using American Council of Medical Genetics criteria. Of 3523 cases, 42 (1.2%) met study inclusion criteria. Pathogenic or likely pathogenic mutations not identified through clinical testing were detected through full-gene sequencing in 16 (38%) of the 42 cases. Rare mutations not detected with current allele-specific testing and IEF underlie a substantial proportion of genetic AAT deficiency. Full-gene sequencing, therefore, has the ability to improve accuracy in the diagnosis of AAT deficiency. PMID:26321041

  16. Comparative sequence analyses of the neurotoxin complex genes in Clostridium botulinum serotypes A, B, E, and F

    Ajay K. Singh

    2012-09-01

    Full Text Available Neurotoxin complex (NTC) genes are arranged in two known hemagglutinin (HA) and open reading frame X (ORFX) clusters. NTC genes have been analyzed in four serotypes A, B, E and F of Clostridium botulinum causing human botulism. Analysis of amino acid sequences of NT genes demonstrated significant differences among subtypes and four serotypes. Phylogram tree of NT genes reveals that serotypes A1 and B1 are much closer compared to serotype E1 and F1. However, non-toxic non-hemagglutinin (NTNH) gene is highly conserved among four serotypes. Analysis of phylogram tree of NTNH gene reveals that serotypes A and F are more closely related compared to serotype B and E. Additionally, sequences of HAs and ORFX genes are very divergent but these genes are specific in subtypes and serotypes of Clostridium botulinum. Information derived from sequence analyses of NTC has direct implication in development of detection tools and therapeutic countermeasures for botulism.

  17. Candidate gene analysis and exome sequencing confirm LBX1 as a susceptibility gene for idiopathic scoliosis

    Grauers, Anna; Wang, Jingwen; Einarsdottir, Elisabet;

    2015-01-01

    ,739 patients with idiopathic scoliosis and 1,812 controls were included. OUTCOME MEASURE: The outcome measure was idiopathic scoliosis. METHODS: The variants rs10510181, rs11190870, rs12946942, and rs6570507 were genotyped in 1,739 patients with idiopathic scoliosis and 1,812 controls. Exome sequencing was...

  18. Sequence, Expression and Phylogenetic Analysis of Immune Response Genes Related to Mastitis in Buffaloes

    Priyanka Banerjee

    2013-08-01

    Full Text Available Increased expression of several acute phase cytokines, such as IL1, IL8 and TNF-α, has been positively correlated with the most severe clinical symptoms often associated with coliform or endotoxin-induced mastitis. Very little information is available on the buffalo transcriptome, and no information is available on mastitis-related genes in buffaloes. Thus, the main aims of the present study were to analyze the buffalo transcriptome to extract the sequences of mastitis-related genes (IL-1B, IL6, IL8 and IL-12B), to assess the expression of these genes across various tissues using RNA-Seq, to analyse the functional pathways and to infer the phylogenetic relationship of these interleukin genes with other species. IL1B showed high expression in lungs, IL8 in mammary gland and IL12B in kidney. The phylogenetic analysis revealed a clear homology between buffalo and cattle ILs. The results were confirmed by constructing gene and species trees. In the species tree, Bos taurus was nearest to Bubalus bubalis, followed by Ovis aries, thereby grouping the whole Bovidae family together. The gene tree constructed with maximum likelihood methods clearly clustered IL1B and IL8 in one clade. This may be attributed to the structural and functional relationship of IL8, which is induced after exposure to a variety of inflammatory stimuli including bacteria, oxidative stress, LPS, TNF and IL1B. Phylogenetic analysis of vertebrate IL genes provided insights into their patterns and process of gene evolution.

  19. Identification of Genetic Causes of Inherited Peripheral Neuropathies by Targeted Gene Panel Sequencing.

    Nam, Soo Hyun; Hong, Young Bin; Hyun, Young Se; Nam, Da Eun; Kwak, Geon; Hwang, Sun Hee; Choi, Byung-Ok; Chung, Ki Wha

    2016-05-31

    Inherited peripheral neuropathies (IPN), which are a group of clinically and genetically heterogeneous peripheral nerve disorders including Charcot-Marie-Tooth disease (CMT), exhibit progressive degeneration of muscles in the extremities and loss of sensory function. Over 70 genes have been reported as genetic causatives and the number is still growing. We prepared a targeted gene panel for IPN diagnosis based on next generation sequencing (NGS). The gene panel was designed to detect mutations in 73 genes reported to be genetic causes of IPN or related peripheral neuropathies, and to detect duplication of the chromosome 17p12 region, the major genetic cause of CMT1A. We applied the gene panel to 115 samples from 63 non-CMT1A families, and isolated 15 pathogenic or likely-pathogenic mutations in eight genes from 25 patients (17 families). Of them, eight mutations were unreported variants. Of particular interest, this study revealed several very rare mutations in the SPTLC2, DCTN1, and MARS genes. In addition, the effectiveness of the detection of CMT1A was confirmed by comparing five 17p12-nonduplicated controls and 15 CMT1A cases. In conclusion, we developed a gene panel for one step genetic diagnosis of IPN. It seems that its time- and cost-effectiveness are superior to previous tiered-genetic diagnosis algorithms, and it could be applied as a genetic diagnostic system for inherited peripheral neuropathies. PMID:27025386

  20. Identification of Genetic Causes of Inherited Peripheral Neuropathies by Targeted Gene Panel Sequencing

    Nam, Soo Hyun; Hong, Young Bin; Hyun, Young Se; Nam, Da Eun; Kwak, Geon; Hwang, Sun Hee; Choi, Byung-Ok; Chung, Ki Wha

    2016-01-01

    Inherited peripheral neuropathies (IPN), which are a group of clinically and genetically heterogeneous peripheral nerve disorders including Charcot-Marie-Tooth disease (CMT), exhibit progressive degeneration of muscles in the extremities and loss of sensory function. Over 70 genes have been reported as genetic causatives and the number is still growing. We prepared a targeted gene panel for IPN diagnosis based on next generation sequencing (NGS). The gene panel was designed to detect mutations in 73 genes reported to be genetic causes of IPN or related peripheral neuropathies, and to detect duplication of the chromosome 17p12 region, the major genetic cause of CMT1A. We applied the gene panel to 115 samples from 63 non-CMT1A families, and isolated 15 pathogenic or likely-pathogenic mutations in eight genes from 25 patients (17 families). Of them, eight mutations were unreported variants. Of particular interest, this study revealed several very rare mutations in the SPTLC2, DCTN1, and MARS genes. In addition, the effectiveness of the detection of CMT1A was confirmed by comparing five 17p12-nonduplicated controls and 15 CMT1A cases. In conclusion, we developed a gene panel for one step genetic diagnosis of IPN. It seems that its time- and cost-effectiveness are superior to previous tiered-genetic diagnosis algorithms, and it could be applied as a genetic diagnostic system for inherited peripheral neuropathies. PMID:27025386

  1. Rapid cloning of disease-resistance genes in plants using mutagenesis and sequence capture.

    Steuernagel, Burkhard; Periyannan, Sambasivam K; Hernández-Pinzón, Inmaculada; Witek, Kamil; Rouse, Matthew N; Yu, Guotai; Hatta, Asyraf; Ayliffe, Mick; Bariana, Harbans; Jones, Jonathan D G; Lagudah, Evans S; Wulff, Brande B H

    2016-06-01

    Wild relatives of domesticated crop species harbor multiple, diverse, disease resistance (R) genes that could be used to engineer sustainable disease control. However, breeding R genes into crop lines often requires long breeding timelines of 5-15 years to break linkage between R genes and deleterious alleles (linkage drag). Further, when R genes are bred one at a time into crop lines, the protection that they confer is often overcome within a few seasons by pathogen evolution. If several cloned R genes were available, it would be possible to pyramid R genes in a crop, which might provide more durable resistance. We describe a three-step method (MutRenSeq) that combines chemical mutagenesis with exome capture and sequencing for rapid R gene cloning. We applied MutRenSeq to clone stem rust resistance genes Sr22 and Sr45 from hexaploid bread wheat. MutRenSeq can be applied to other commercially relevant crops and their relatives, including, for example, pea, bean, barley, oat, rye, rice and maize. PMID:27111722

  2. Patterns of homoeologous gene expression shown by RNA sequencing in hexaploid bread wheat.

    Leach, Lindsey J

    2014-04-11

    BACKGROUND: Bread wheat (Triticum aestivum) has a large, complex and hexaploid genome consisting of A, B and D homoeologous chromosome sets. Therefore each wheat gene potentially exists as a trio of A, B and D homoeoloci, each of which may contribute differentially to wheat phenotypes. We describe a novel approach combining wheat cytogenetic resources (chromosome substitution 'nullisomic-tetrasomic' lines) with next generation deep sequencing of gene transcripts (RNA-Seq), to directly and accurately identify homoeologue-specific single nucleotide variants and quantify the relative contribution of individual homoeoloci to gene expression. RESULTS: We discover, based on a sample comprising ~5-10% of the total wheat gene content, that at least 45% of wheat genes are expressed from all three distinct homoeoloci. Most of these genes show strikingly biased expression patterns in which expression is dominated by a single homoeolocus. The remaining ~55% of wheat genes are expressed from either one or two homoeoloci only, through a combination of extensive transcriptional silencing and homoeolocus loss. CONCLUSIONS: We conclude that wheat is tending towards functional diploidy, through a variety of mechanisms causing single homoeoloci to become the predominant source of gene transcripts. This discovery has profound consequences for wheat breeding and our understanding of wheat evolution.

  3. Gene Profiling of Bone around Orthodontic Mini-Implants by RNA-Sequencing Analysis

    Kyung-Yen Nahm

    2015-01-01

    Full Text Available This study aimed to evaluate the genes that were expressed in the healing bones around SLA-treated titanium orthodontic mini-implants in a beagle at early (1-week) and late (4-week) stages with RNA-sequencing (RNA-Seq). Samples from sites of surgical defects were used as controls. Total RNA was extracted from the tissue around the implants, and an RNA-Seq analysis was performed with Illumina TruSeq. In the 1-week group, genes in the gene ontology (GO) categories of cell growth and the extracellular matrix (ECM) were upregulated, while genes in the categories of the oxidation-reduction process, intermediate filaments, and structural molecule activity were downregulated. In the 4-week group, the genes upregulated included ECM binding, stem cell fate specification, and intramembranous ossification, while genes in the oxidation-reduction process category were downregulated. GO analysis revealed an upregulation of genes that were related to significant mechanisms, including those with roles in cell proliferation, the ECM, growth factors, and osteogenic-related pathways, which are associated with bone formation. From these results, implant-induced bone formation progressed considerably during the times examined in this study. The upregulation or downregulation of selected genes was confirmed with real-time reverse transcription polymerase chain reaction. The RNA-Seq strategy was useful for defining the biological responses to orthodontic mini-implants and identifying the specific genetic networks for targeted evaluations of successful peri-implant bone remodeling.

  4. FrameD: a flexible program for quality check and gene prediction in prokaryotic genomes and noisy matured eukaryotic sequences

    Schiex, Thomas; Gouzy, Jérôme; Moisan, Annick; de Oliveira, Yannick

    2003-01-01

    We describe FrameD, a program that predicts coding regions in prokaryotic and matured eukaryotic sequences. Initially targeted at gene prediction in GC-rich bacterial genomes, the gene model used in FrameD also allows genes to be predicted in the presence of frameshifts and partially undetermined sequences, which makes it well suited to gene prediction and frameshift correction in unfinished sequences such as EST and EST cluster sequences. Like recent eukaryotic gene prediction programs, FrameD can also take protein similarity information into account both in its prediction and in its graphical output. Its performance is evaluated on different bacterial genomes. The web site (http://genopole.toulouse.inra.fr/bioinfo/FrameD/FD) allows direct prediction, sequence correction and translation, and the ability to learn new models for new organisms. PMID:12824407

  5. Identification of drought-inducible genes and differentially expressed sequence tags in barley.

    Diab, Ayman A; Teulat-Merah, Béatrice; This, Dominique; Ozturk, Neslihan Z; Benscher, David; Sorrells, Mark E

    2004-11-01

    Drought limits cereal yields in several regions of the world and plant water status plays an important role in tolerance to drought. To investigate and understand the genetic and physiological basis of drought tolerance in barley, differentially expressed sequence tags (dESTs) and candidate genes for the drought response were mapped in a population of 167 F8 recombinant inbred lines derived from a cross between "Tadmor" (drought tolerant) and "Er/Apm" (adapted only to specific dry environments). One hundred sequenced probes from two cDNA libraries previously constructed from drought-stressed barley (Hordeum vulgare L., var. Tokak) plants and 12 candidate genes were surveyed for polymorphism, and 33 loci were added to a previously published map. Composite interval mapping was used to identify quantitative trait loci (QTL) associated with drought tolerance including leaf relative water content, leaf osmotic potential, osmotic potential at full turgor, water-soluble carbohydrate concentration, osmotic adjustment, and carbon isotope discrimination. A total of 68 QTLs with a limit of detection score ≥2.5 were detected for the traits evaluated under two water treatments and the two traits calculated from both treatments. The number of QTLs identified for each trait varied from one to 12, indicating that the genome contains multiple genes affecting different traits. Two candidate genes and ten differentially expressed sequences were associated with QTLs for drought tolerance traits. PMID:15517148

  6. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing.

    Naveed, Muhammad; Mubeen, Samavia; Khan, SamiUllah; Ahmed, Iftikhar; Khalid, Nauman; Suleria, Hafiz Ansar Rasul; Bano, Asghari; Mumtaz, Abdul Samad

    2014-01-01

    In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Bacterial isolates were identified by 16S rRNA gene sequence analysis, with taxonomic confirmation on the EzTaxon Server. The identified bacterial strains belonged to 5 genera, i.e. Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relationship of the bacterial strains with their respective genera. Based on phylogenetic analysis, some candidate novel species were also identified. The bacterial strains were also characterized by morphological, physiological and biochemical tests and for the glucose dehydrogenase (gdh) gene, which is involved in phosphate solubilization using the cofactor pyrroloquinoline quinone (PQQ). Seven rhizospheric and 3 root-nodulating strains were positive for the gdh gene. Furthermore, this study confirms a novel association between microbes and their hosts, such as field-grown crops and leguminous and non-leguminous plants. It was concluded that a diverse bacterial population exists in the rhizosphere and root nodules that might be useful in evaluating the mechanisms behind plant-microbe interactions, and that strains QAU-63 and QAU-68, with sequence similarities of 97 and 95%, might be declared novel after further taxonomic characterization. PMID:25477935

  7. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing

    Muhammad Naveed

    2014-09-01

    Full Text Available. In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Bacterial isolates were identified by 16S rRNA gene sequence analysis, with taxonomic confirmation on the EzTaxon Server. The identified bacterial strains belonged to 5 genera, i.e. Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relationship of the bacterial strains with their respective genera. Based on phylogenetic analysis, some candidate novel species were also identified. The bacterial strains were also characterized by morphological, physiological and biochemical tests and for the glucose dehydrogenase (gdh) gene, which is involved in phosphate solubilization using the cofactor pyrroloquinoline quinone (PQQ). Seven rhizospheric and 3 root-nodulating strains were positive for the gdh gene. Furthermore, this study confirms a novel association between microbes and their hosts, such as field-grown crops and leguminous and non-leguminous plants. It was concluded that a diverse bacterial population exists in the rhizosphere and root nodules that might be useful in evaluating the mechanisms behind plant-microbe interactions, and that strains QAU-63 and QAU-68, with sequence similarities of 97 and 95%, might be declared novel after further taxonomic characterization.

  8. Breaking the 1000-gene barrier for Mimivirus using ultra-deep genome and transcriptome sequencing

    Claverie Jean-Michel

    2011-03-01

    Full Text Available. Background: Mimivirus, a giant dsDNA virus infecting Acanthamoeba, is the prototype of the Mimiviridae family, the latest addition to the family of the nucleocytoplasmic large DNA viruses (NCLDVs). Its 1.2 Mb-genome was initially predicted to encode 917 genes. A subsequent RNA-Seq analysis precisely mapped many transcript boundaries and identified 75 new genes. Findings: We now report a much deeper analysis using the SOLiD™ technology combining RNA-Seq of the Mimivirus transcriptome during the infectious cycle (202.4 million reads), and a complete genome re-sequencing (45.3 million reads). This study corrected the genome sequence and identified several single nucleotide polymorphisms. Our results also provided clear evidence of previously overlooked transcription units, including an important RNA polymerase subunit distantly related to Euryarchaea homologues. The total Mimivirus gene count is now 1018, 11% greater than the original annotation. Conclusions: This study highlights the huge progress brought about by ultra-deep sequencing for the comprehensive annotation of virus genomes, opening the door to a complete one-nucleotide resolution level description of their transcriptional activity, and to the realistic modeling of the viral genome expression at the ultimate molecular level. This work also illustrates the need to go beyond bioinformatics-only approaches for the annotation of short protein and non-coding genes in viral genomes.

  9. Molecular cloning, nucleotide sequence, and expression of the gene encoding human eosinophil differentiation factor (interleukin 5)

    The human eosinophil differentiation factor (EDF) gene was cloned from a genomic library in λ phage EMBL3A by using a murine EDF cDNA clone as a probe. The DNA sequence of a 3.2-kilobase BamHI fragment spanning the gene was determined. The gene contains three introns. The predicted amino acid sequence of 134 amino acids is identical with that recently reported for human interleukin 5 but shows no significant homology with other known hemopoietic growth regulators. The amino acid sequence shows strong homology (∼ 70% identity) with that of murine EDF. Recombinant human EDF, expressed from the human EDF gene after transfection into monkey COS cells, stimulated the production of eosinophils and eosinophil colonies from normal human bone marrow but had no effect on the production of neutrophils or mononuclear cells (monocytes and lymphoid cells). The apparent specificity of human EDF for the eosinophil lineage in myeloid hemopoiesis contrasts with the properties of human interleukin 3 and granulocyte/macrophage and granulocyte colony-stimulating factors but is directly analogous to the biological properties of murine EDF. Human EDF therefore represents a distinct hemopoietic growth factor that could play a central role in the regulation of eosinophilia

  10. Zooplankton diversity analysis through single-gene sequencing of a community sample

    Nishida Mutsumi

    2009-09-01

    Full Text Available. Background: Oceans cover more than 70% of the earth's surface and are critical for the homeostasis of the environment. Among the components of the ocean ecosystem, zooplankton play vital roles in energy and matter transfer through the system. Despite their importance, understanding of zooplankton biodiversity is limited because of their fragile nature, small body size, and the large number of species from various taxonomic phyla. Here we present the results of single-gene zooplankton community analysis using a method that determines a large number of mitochondrial COI gene sequences from a bulk zooplankton sample. This approach will enable us to estimate the species richness of almost the entire zooplankton community. Results: A sample was collected from a depth of 721 m to the surface in the western equatorial Pacific off Pohnpei Island, Micronesia, with a plankton net equipped with a 2-m² mouth opening. A total of 1,336 mitochondrial COI gene sequences were determined from the cDNA library made from the sample. From the determined sequences, the occurrence of 189 species of zooplankton was estimated. BLASTN search results showed high degrees of similarity (>98%) between the query and database for 10 species, including holozooplankton and merozooplankton. Conclusion: In conjunction with the Census of Marine Zooplankton and Barcode of Life projects, single-gene zooplankton community analysis will be a powerful tool for estimating the species richness of zooplankton communities.

  11. Mycoplasma pneumoniae P1 Type 1- and Type 2-Specific Sequences within the P1 Cytadhesin Gene of Individual Strains

    Dorigo-Zetsma, J. Wendelien; Wilbrink, Berry; Dankert, Jacob; Zaat, Sebastian A.J.

    2001-01-01

    Mycoplasma pneumoniae strains traditionally are divided into two types, based on sequence variation in the P1 gene. Recently, however, we have identified 8 P1 subtypes by restriction fragment length polymorphism analysis. In the present study the P1 gene sequences of three P1 type 1 and two P1 type 2 M. pneumoniae strains were analyzed. A new P1 gene sequence in a type 1 strain with partial similarity to a recently reported variable region in the P1 gene of an M. pneumoniae type 2 strain (T. ...

  12. Cloning and sequence analysis of β-actin gene from Aedes albopictus (Diptera: Culicidae)

    Weijie Wang; Xiaobang Hu; Donghui Zhang; Jianhua Jiao; Yan Sun; Lei Ma; Changliang Zhu

    2007-01-01

    Objective: To obtain the complete β-actin gene from Aedes albopictus. Methods: Total RNA was extracted from C6/36 cells. Degenerate primers were designed based on the β-actin sequences of An. gambiae, Ae. aegypti, Cx. pipiens pallens and D. melanogaster. The RT-PCR product was amplified, purified, cloned into the pGT vector and sequenced. The β-actin sequence was aligned and phylogenetically analyzed with the BLAST and CLUSTAL W programs. Results: A sequence of 1132 bp including an open reading frame of 1131 bp was obtained (GenBank DQ657949). The deduced protein had 376 amino acids. Aligned against SWISS-PROT, it exhibited a high level of identity with β-actins from Anopheles, Drosophila and Culex at the amino acid sequence level. Phylogenetic analysis indicated that Ae. albopictus β-actin was much more homologous with invertebrate β-actin than with vertebrate β-actin. Conclusion: The gene may be used as the internal control in experiments on Ae. albopictus.
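
    The reported figures (an open reading frame of 1131 bp and a deduced protein of 376 amino acids) are consistent if the ORF length is assumed to include the stop codon. The abstract does not state this, so the short sketch below treats it as an assumption; the function name is hypothetical and the check is illustrative only.

```python
def protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp nucleotides.

    Assumption (not stated in the abstract): the reported ORF length
    counts the terminal stop codon, which encodes no amino acid.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 1131 bp -> 377 codons -> 376 amino acids once the stop codon is excluded
print(protein_length(1131))
```

    Under the opposite convention (`includes_stop=False`) the same ORF would give 377 residues, so the convention matters when comparing lengths across records.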

  13. CLONING AND SEQUENCING OF MATURE FRAGMENT OF HUMAN BMP4 GENE

    2000-01-01

    Objective: To clone and sequence the mature fragment of the human bone morphogenetic protein-4 (BMP4) gene. Methods: The template was obtained from the human osteosarcoma cell line U2OS. Using RT-PCR, the cDNA coding for the mature fragment of BMP-4 was amplified, cloned into the vector pUC19, and sequenced by the Sanger dideoxy-mediated chain termination method. Results: The mature fragment of BMP4 cDNA was obtained by RT-PCR and confirmed by sequencing. A computer search of GenBank showed that the homology of nucleotides and amino acids between the cDNA of the rhBMP4 mature fragment of this study and the published sequence was 99%. Sequence analysis showed two differences. One was at base 1154 (201), G→C, which did not change the corresponding amino acid (Val). The other was at base 1222 (269), C→T, which changed Ala to Val. Conclusion: The mature fragment of the BMP4 gene has been cloned. The results will be of great significance in the treatment of skeletal injuries and diseases.

  14. Sequence analysis of mitochondrial 16S ribosomal RNA gene fragment from seven mosquito species

    Yogesh S Shouche; Milind S Patole

    2000-12-01

    Mosquitoes are vectors for the transmission of many human pathogens that include viruses, nematodes and protozoa. For the understanding of their vectorial capacity, identification of disease carrying and refractory strains is essential. Recently, molecular taxonomic techniques have been utilized for this purpose. Sequence analysis of the mitochondrial 16S rRNA gene has been used for molecular taxonomy in many insects. In this paper, we have analysed a 450 bp hypervariable region of the mitochondrial 16S rRNA gene in three major genera of mosquitoes, Aedes, Anopheles and Culex. The sequence was found to be unusually A + T rich and in substitutions the rate of transversions was higher than the transition rate. A phylogenetic tree was constructed with these sequences. An interesting feature of the sequences was a stretch of Ts that distinguished between Aedes and Culex on the one hand, and Anopheles on the other. This is the first report of mitochondrial rRNA sequences from these medically important genera of mosquitoes.
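
    The two summary statistics mentioned in this abstract, A + T richness and the balance of transitions versus transversions, can be illustrated with a minimal sketch. The sequences below are invented toy data, not the mosquito 16S rRNA sequences from the study, and the function names are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def at_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are A or T."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in bases) / len(bases)


def count_ts_tv(a: str, b: str) -> tuple[int, int]:
    """Count transitions and transversions between two aligned sequences.

    Transitions are purine<->purine or pyrimidine<->pyrimidine changes;
    transversions are purine<->pyrimidine changes. Gaps and ambiguous
    symbols are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    return ts, tv


# Toy aligned pair (hypothetical data)
seq_a = "ATTTAATGCA"
seq_b = "ATTAAACGTA"
print(at_content(seq_a))          # 0.8
print(count_ts_tv(seq_a, seq_b))  # (2, 1)
```

    Raw counts such as these do not correct for multiple substitutions at the same site, which published analyses typically handle with a substitution model.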

  15. Genomic localization, sequence analysis, and transcription of the putative human cytomegalovirus DNA polymerase gene.

    Heilbronn, R; Jahn, G; Bürkle, A; Freese, U K; Fleckenstein, B; zur Hausen, H

    1987-01-01

    The human cytomegalovirus (HCMV)-induced DNA polymerase has been well characterized biochemically and functionally, but its genomic location has not yet been assigned. To identify the coding sequence, cross-hybridization with the herpes simplex virus type 1 (HSV-1) polymerase gene was used, as suggested by the close similarity of the herpes group virus-induced DNA polymerases to the HCMV DNA polymerase. A cosmid and plasmid library of the entire HCMV genome was screened with the BamHI Q fragment of HSV-1 at different stringency conditions. One PstI-HincII restriction fragment of 850 base pairs mapping within the EcoRI M fragment of HCMV cross-hybridized at Tm - 25 degrees C. Sequence analysis revealed one open reading frame spanning the entire sequence. The amino acid sequence showed a highly conserved domain of 133 amino acids shared with the HSV and putative Epstein-Barr virus polymerase sequences. This domain maps within the C-terminal part of the HSV polymerase gene, which has been suggested to contain part of the catalytic center of the enzyme. Transcription analysis revealed one 5.4-kilobase early transcript in the sense orientation with respect to the open reading frame identified. This transcript appears to code for the 140-kilodalton HCMV polymerase protein. PMID:3023689

  16. Genomic localization, sequence analysis, and transcription of the putative human cytomegalovirus DNA polymerase gene

    Heilbronn, T.; Jahn, G.; Buerkle, A.; Freese, U.K.; Fleckenstein, B.; Zur Hausen, H.

    1987-01-01

    The human cytomegalovirus (HCMV)-induced DNA polymerase has been well characterized biochemically and functionally, but its genomic location has not yet been assigned. To identify the coding sequence, cross-hybridization with the herpes simplex virus type 1 (HSV-1) polymerase gene was used, as suggested by the close similarity of the herpes group virus-induced DNA polymerases to the HCMV DNA polymerase. A cosmid and plasmid library of the entire HCMV genome was screened with the BamHI Q fragment of HSV-1 at different stringency conditions. One PstI-HincII restriction fragment of 850 base pairs mapping within the EcoRI M fragment of HCMV cross-hybridized at Tm - 25 degrees C. Sequence analysis revealed one open reading frame spanning the entire sequence. The amino acid sequence showed a highly conserved domain of 133 amino acids shared with the HSV and putative Epstein-Barr virus polymerase sequences. This domain maps within the C-terminal part of the HSV polymerase gene, which has been suggested to contain part of the catalytic center of the enzyme. Transcription analysis revealed one 5.4-kilobase early transcript in the sense orientation with respect to the open reading frame identified. This transcript appears to code for the 140-kilodalton HCMV polymerase protein.

  17. Genomic localization, sequence analysis, and transcription of the putative human cytomegalovirus DNA polymerase gene

    The human cytomegalovirus (HCMV)-induced DNA polymerase has been well characterized biochemically and functionally, but its genomic location has not yet been assigned. To identify the coding sequence, cross-hybridization with the herpes simplex virus type 1 (HSV-1) polymerase gene was used, as suggested by the close similarity of the herpes group virus-induced DNA polymerases to the HCMV DNA polymerase. A cosmid and plasmid library of the entire HCMV genome was screened with the BamHI Q fragment of HSV-1 at different stringency conditions. One PstI-HincII restriction fragment of 850 base pairs mapping within the EcoRI M fragment of HCMV cross-hybridized at Tm - 25 degrees C. Sequence analysis revealed one open reading frame spanning the entire sequence. The amino acid sequence showed a highly conserved domain of 133 amino acids shared with the HSV and putative Epstein-Barr virus polymerase sequences. This domain maps within the C-terminal part of the HSV polymerase gene, which has been suggested to contain part of the catalytic center of the enzyme. Transcription analysis revealed one 5.4-kilobase early transcript in the sense orientation with respect to the open reading frame identified. This transcript appears to code for the 140-kilodalton HCMV polymerase protein.

  18. Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering

    Li Weizhong

    2008-04-01

    Full Text Available. Background: The identification and study of proteins from metagenomic datasets can shed light on the roles and interactions of the source organisms in their communities. However, metagenomic datasets are characterized by the presence of organisms with varying GC composition, codon usage biases etc., and consequently gene identification is challenging. The vast amount of sequence data also requires faster protein family classification tools. Results: We present a computational improvement to a sequence clustering approach that we developed previously to identify and classify protein coding genes in large microbial metagenomic datasets. The clustering approach can be used to identify protein coding genes in prokaryotes, viruses, and intron-less eukaryotes. The computational improvement is based on an incremental clustering method that does not require the expensive all-against-all computation that was required by the original approach, while still preserving the remote homology detection capabilities. We present evaluations of the clustering approach in protein-coding gene identification and classification, and also present the results of updating the protein clusters from our previous work with recent genomic and metagenomic sequences. The clustering results are available via CAMERA (http://camera.calit2.net). Conclusion: The clustering paradigm is shown to be a very useful tool in the analysis of microbial metagenomic data. The incremental clustering method is shown to be much faster than the original approach in identifying genes, grouping sequences into existing protein families, and also identifying novel families that have multiple members in a metagenomic dataset. These clusters provide a basis for further studies of protein families.
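
    The general idea of incremental clustering, in which new sequences are compared only against existing cluster representatives rather than all-against-all, can be sketched as follows. This is a generic greedy illustration and not the authors' algorithm: the k-mer Jaccard similarity, the 0.5 threshold, the function names and the toy peptide strings are all assumptions made for the example.

```python
def kmers(seq: str, k: int = 3) -> set[str]:
    """Set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of k-mer sets (a crude stand-in for alignment)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def add_sequences(clusters: list[list[str]], new_seqs: list[str],
                  threshold: float = 0.5) -> list[list[str]]:
    """Assign each new sequence to the first cluster whose representative
    (its first member) is similar enough; otherwise start a new cluster.

    Existing members are never re-compared with each other, so adding a
    batch costs roughly (new sequences) x (clusters) comparisons.
    """
    for seq in new_seqs:
        for members in clusters:
            if similarity(seq, members[0]) >= threshold:
                members.append(seq)
                break
        else:  # no representative matched
            clusters.append([seq])
    return clusters


# Initial batch, then an incremental update (toy, hypothetical peptides)
clusters = add_sequences([], ["MKTAYIAKQR", "MKTAYIAKQL", "GGSWPLLHHV"])
clusters = add_sequences(clusters, ["MKTAYIAKQK", "GGSWPLLHHA"])
for members in clusters:
    print(members[0], len(members))
```

    Greedy assignment of this kind depends on input order and on the choice of representative; it is shown only to make the cost argument in the abstract concrete.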

  19. Hypoxia-induced protein binding to O2-responsive sequences on the tyrosine hydroxylase gene.

    Norris, M L; Millhorn, D E

    1995-10-01

    We reported recently that the gene that encodes tyrosine hydroxylase (TH), the rate-limiting enzyme in the biosynthesis of catecholamines, is regulated by hypoxia in the dopaminergic cells of the mammalian carotid body (Czyzyk-Krzeska, M. F., Bayliss, D. A., Lawson, E. E. & Millhorn, D. E. (1992) J. Neurochem. 58, 1538-1546) and in pheochromocytoma (PC12) cells (Czyzyk-Krzeska, M. F., Furnari, B. A., Lawson, E. E. & Millhorn, D. E. (1994) J. Biol. Chem. 269, 760-764). Regulation of this gene during low O2 conditions occurs at both the level of transcription and RNA stability. Increased transcription during hypoxia is regulated by a region of the proximal promoter that extends from -284 to + 27 bases, relative to transcription start site. The present study was undertaken to further characterize the sequences that confer O2 responsiveness of the TH gene and to identify hypoxia-induced protein interactions with these sequences. Results from chloramphenicol acetyltransferase assays identified a region between bases -284 and -150 that contains the essential sequences for O2 regulation. This region contains a number of regulatory elements including AP1, AP2, and HIF-1. Gel shift assays revealed enhanced protein interactions at the AP1 and HIF-1 elements of the native gene. Further investigations using supershift and shift-Western analysis showed that c-Fos and JunB bind to the AP1 element during hypoxia and that these protein levels are stimulated by hypoxia. Mutation of the AP1 sequence prevented stimulation of transcription of the TH-chloramphenicol acetyltransferase reporter gene by hypoxia. PMID:7559551

  20. Gene discovery in the threatened elkhorn coral: 454 sequencing of the Acropora palmata transcriptome.

    Nicholas R Polato

    Full Text Available. BACKGROUND: Cnidarians, including corals and anemones, offer unique insights into metazoan evolution because they harbor genetic similarities with vertebrates beyond that found in model invertebrates and retain genes known only from non-metazoans. Cataloging genes expressed in Acropora palmata, a foundation species of reefs in the Caribbean and western Atlantic, will advance our understanding of the genetic basis of ecologically important traits in corals and comes at a time when sequencing efforts in other cnidarians allow for multi-species comparisons. RESULTS: A cDNA library from a sample enriched for symbiont-free larval tissue was sequenced on the 454 GS-FLX platform. Over 960,000 reads were obtained and assembled into 42,630 contigs. Annotation data was acquired for 57% of the assembled sequences. Analysis of the assembled sequences indicated that 83-100% of all A. palmata transcripts were tagged, and provided a rough estimate of the total number of genes expressed in our samples (~18,000-20,000). The coral annotation data contained many of the same molecular components as in the Bilateria, particularly in pathways associated with oxidative stress and DNA damage repair, and provided evidence that homologs of p53, a key player in DNA repair pathways, have experienced selection along the branch separating Cnidaria and Bilateria. Transcriptome-wide screens of paralog groups and transition/transversion ratios highlighted genes including green fluorescent proteins, carbonic anhydrase, and oxidative stress proteins, and functional groups involved in protein and nucleic acid metabolism and the formation of structural molecules. These results provide a starting point for study of adaptive evolution in corals. CONCLUSIONS: Currently available transcriptome data now make comparative studies of the mechanisms underlying coral's evolutionary success possible. Here we identified candidate genes that enable corals to maintain genomic integrity despite

  1. Detection of DNA sequence polymorphisms in carcinogen metabolism genes by polymerase chain reaction

    Bell, D.A. (National Inst. of Environmental Health Sciences, Research Triangle Park, NC (United States))

    1991-01-01

    The glutathione transferase mu gene (GST1) and the debrisoquine hydroxylase gene (CYP2D6) are known to be polymorphic in the human population and have been associated with increased susceptibility to cancer. Smokers with low lymphocyte GST mu activity are at higher risk for lung cancer, while low debrisoquine hydroxylase activity has been correlated with lower risk for lung and bladder cancer. Phenotypic characterization of these polymorphisms by lymphocyte enzyme activity (GST) and urine metabolite ratios (debrisoquine) is cumbersome for population studies. Recent cloning and sequencing of the mutant alleles of these genes has allowed genotyping via the polymerase chain reaction (PCR). Advantages of PCR approaches are speed, technical simplicity, and minimal sample requirements. This article reviews the PCR-based methods for detection of genetic polymorphisms in human cancer susceptibility genes.

  2. Detection of DNA sequence polymorphisms in carcinogen metabolism genes by polymerase chain reaction.

    Bell, D A

    1991-01-01

    The glutathione transferase mu gene (GST1) and the debrisoquine hydroxylase gene (CYP2D6) are known to be polymorphic in the human population and have been associated with increased susceptibility to cancer. Smokers with low lymphocyte GST mu activity are at higher risk for lung cancer, while low debrisoquine hydroxylase activity has been correlated with lower risk for lung and bladder cancer. Phenotypic characterization of these polymorphisms by lymphocyte enzyme activity (GST) and urine metabolite ratios (debrisoquine) is cumbersome for population studies. Recent cloning and sequencing of the mutant alleles of these genes has allowed genotyping via the polymerase chain reaction (PCR). Advantages of PCR approaches are speed, technical simplicity, and minimal sample requirements. This article reviews the PCR-based methods for detection of genetic polymorphisms in human cancer susceptibility genes. PMID:1684153

  3. Identification of antimicrobial resistance genes in multidrug-resistant clinical Bacteroides fragilis isolates by whole genome shotgun sequencing

    Sydenham, Thomas Vognbjerg; Sóki, József; Hasman, Henrik;

    2015-01-01

    Bacteroides fragilis constitutes the most frequent anaerobic bacterium causing bacteremia in humans. The genetic background for antimicrobial resistance in B. fragilis is diverse with some genes requiring insertion sequence (IS) elements inserted upstream for increased expression. To evaluate whole...... genome shotgun sequencing as a method for predicting antimicrobial resistance properties, one meropenem resistant and five multidrug-resistant blood culture isolates were sequenced and antimicrobial resistance genes and IS elements identified using ResFinder 2.1 (http...

  4. Cloning, nucleotide sequence, and regulatory analysis of the Lactococcus lactis dnaJ gene.

    van Asseldonk, M; Simons, A.; Visser, H.; DE VOS W.M.; Simons, G

    1993-01-01

    The dnaJ gene of Lactococcus lactis was isolated from a genomic library of L. lactis NIZO R5 and cloned into pUC19. Nucleotide sequencing revealed an open reading frame of 1,137 bp in length, encoding a protein of 379 amino acids. The deduced amino acid sequence showed homology to the DnaJ proteins of Escherichia coli, Mycobacterium tuberculosis, Bacillus subtilis, and Clostridium acetobutylicum. The level of the dnaJ monocistronic mRNA increased approximately threefold after heat shock. The ...

  5. Large-scale Gene Ontology analysis of plant transcriptome-derived sequences retrieved by AFLP technology

    Ramina Angelo

    2008-07-01

    Full Text Available. Background: After 10 years of use of AFLP (Amplified Fragment Length Polymorphism) technology for DNA fingerprinting and mRNA profiling, large repertories of genome- and transcriptome-derived sequences are available in public databases for model, crop and tree species. AFLP marker systems have been and are being extensively exploited for genome scanning and gene mapping, as well as cDNA-AFLP for transcriptome profiling and differentially expressed gene cloning. The evaluation, annotation and classification of genomic markers and expressed transcripts would be of great utility for both functional genomics and systems biology research in plants. This may be achieved by means of the Gene Ontology (GO), consisting of three structured vocabularies (i.e. ontologies) describing genes, transcripts and proteins of any organism in terms of their associated cellular component, biological process and molecular function in a species-independent manner. In this paper, the functional annotation of about 8,000 AFLP-derived ESTs retrieved from the NCBI databases was carried out using GO terminology. Results: Descriptive statistics on the type, size and nature of gene sequences obtained by means of AFLP technology were calculated. The gene products associated with mRNA transcripts were then classified according to the three main GO vocabularies. A comparison of the functional content of cDNA-AFLP records was also performed by splitting the sequence dataset into monocots and dicots and by comparing them to all annotated ESTs of Arabidopsis and rice, respectively. On the whole, the statistical parameters adopted for the in silico AFLP-derived transcriptome-anchored sequence analysis proved to be critical for obtaining reliable GO results. Such an exhaustive annotation may offer a suitable platform for functional genomics, particularly useful in non-model species. Conclusion: Reliable GO annotations of AFLP-derived sequences can be gathered through the optimization

  6. Gene Identification and Expression Analysis of 86,136 Expressed Sequence Tags (EST) from the Rice Genome

    Yan Zhou; Lin Ye; Li Lin; Jun Li; Xuegang Wang; Hao Xu; Yibin Pan; Wei Lin; Wei Tian; Jing Liu; Liping Wei; Jiabin Tang; Siqi Liu; Huanming Yang; Jun Yu; Jian Wang; Michael G. Walker; Xiuqing Zhang; Jun Wang; Songnian Hu; Huayong Xu; Yajun Deng; Jianhai Dong

    2003-01-01

    Expressed Sequence Tag (EST) analysis has pioneered genome-wide gene discovery and expression profiling. In order to establish a gene expression index in the rice cultivar indica, we sequenced and analyzed 86,136 ESTs from nine rice cDNA libraries from the super hybrid cultivar LYP9 and its parental cultivars. We assembled these ESTs into 13,232 contigs, leaving 8,976 singletons. Overall, 7,497 sequences were found to be similar to existing sequences in GenBank and 14,711 are novel. These sequences are classified by molecular function, biological process and pathways according to the Gene Ontology. We compared our sequenced ESTs with the publicly available 95,000 ESTs from japonica, and found little sequence variation, despite the large difference between genome sequences. We then assembled the combined 173,000 rice ESTs for further analysis. Using the pooled ESTs, we compared gene expression in metabolic pathways between rice and Arabidopsis according to KEGG. We further profiled gene expression patterns in different tissues, developmental stages, and in a conditional sterile mutant, after checking by sequence coverage that the libraries were comparable. We also identified some possible library-specific genes and a number of enzymes and transcription factors that contribute to rice development.

  7. Deep RNA sequencing analysis of readthrough gene fusions in human prostate adenocarcinoma and reference samples

    Nacu Serban

    2011-01-01

    Full Text Available Abstract Background Readthrough fusions across adjacent genes in the genome, or transcription-induced chimeras (TICs), have been estimated using expressed sequence tag (EST) libraries to involve 4-6% of all genes. Deep transcriptional sequencing (RNA-Seq) now makes it possible to study the occurrence and expression levels of TICs in individual samples across the genome. Methods We performed single-end RNA-Seq on three human prostate adenocarcinoma samples and their corresponding normal tissues, as well as brain and universal reference samples. We developed two bioinformatics methods to specifically identify TIC events: a targeted alignment method using artificial exon-exon junctions within 200,000 bp from adjacent genes, and genomic alignment allowing splicing within individual reads. We performed further experimental verification and characterization of selected TIC and fusion events using quantitative RT-PCR and comparative genomic hybridization microarrays. Results Targeted alignment against artificial exon-exon junctions yielded 339 distinct TIC events, including 32 gene pairs with multiple isoforms. The false discovery rate was estimated to be 1.5%. Spliced alignment to the genome was less sensitive, finding only 18% of those found by targeted alignment in 33-nt reads and 59% of those in 50-nt reads. However, spliced alignment revealed 30 cases of TICs with intervening exons, in addition to distant inversions, scrambled genes, and translocations. Our findings increase the catalog of observed TIC gene pairs by 66%. We verified 6 of 6 predicted TICs in all prostate samples, and 2 of 5 predicted novel distant gene fusions, both private events among 54 prostate tumor samples tested. Expression of TICs correlates with that of the upstream gene, which can explain the prostate-specific pattern of some TIC events and the restriction of the SLC45A3-ELK4 e4-e2 TIC to ERG-negative prostate samples, as confirmed in 20 matched prostate tumor and normal

  8. Captured metagenomics: large-scale targeting of genes based on ‘sequence capture’ reveals functional diversity in soils

    Manoharan, Lokeshwaran; Kushwaha, Sandeep K; Hedlund, Katarina; Ahrén, Dag

    2015-01-01

    Microbial enzyme diversity is key to understanding many ecosystem processes. Whole metagenome sequencing (WMG) obtains information on functional genes, but it is costly and inefficient due to the large amount of sequencing that is required. In this study, we have applied a captured metagenomics technique for functional genes in soil microorganisms, as an alternative to WMG. Large-scale targeting of functional genes, coding for enzymes related to organic matter degradation, was applied to two agric...

  9. Sequencing of rhesus macaque Y chromosome clarifies origins and evolution of the DAZ (Deleted in AZoospermia) genes

    Hughes, Jennifer F.; Skaletsky, Helen; Page, David C.

    2012-01-01

    Studies of Y chromosome evolution often emphasize gene loss, but this loss has been counterbalanced by addition of new genes. The DAZ genes, which are critical to human spermatogenesis, were acquired by the Y chromosome in the ancestor of Old World monkeys and apes. We and our colleagues recently sequenced the rhesus macaque Y chromosome, and comparison of this sequence to human and chimpanzee enables us to reconstruct much of the evolutionary history of DAZ. We report that DAZ arrived on the...

  10. Nucleotide sequence analysis of the Legionella micdadei mip gene, encoding a 30-kilodalton analog of the Legionella pneumophila Mip protein

    Bangsborg, Jette Marie; Cianciotto, N P; Hindersson, P

    1991-01-01

    After the demonstration of analogs of the Legionella pneumophila macrophage infectivity potentiator (Mip) protein in other Legionella species, the Legionella micdadei mip gene was cloned and expressed in Escherichia coli. DNA sequence analysis of the L. micdadei mip gene contained in the plasmid p...... homology with the mip-like genes of several Legionella species. Furthermore, amino acid sequence comparisons revealed significant homology to two eukaryotic proteins with isomerase activity (FK506-binding proteins)....

  11. Influences on gene expression in vivo by a Shine-Dalgarno sequence

    Jin, Haining; Zhao, Qing; Gonzalez de Valdivia, Ernesto I;

    2006-01-01

    start sites compete for ribosomes that bind to an SD+ located between them. A minor positive contribution to upstream initiation resulting from 3' to 5' ribosomal diffusion along the mRNA is suggested. Analysis of the E. coli K12 genome suggests that the SD+ or SD-like sequences are systematically...... positive effect of an upstream SD+ is confirmed. A downstream SD+ gives decreased gene expression. This effect is also valid for appropriately modified natural Escherichia coli genes. If an SD+ is placed between two potential initiation codons, initiation takes place predominantly at the second start site...

  12. Nucleotide sequence and characterization of the transcript of a Dictyostelium ribosomal protein gene.

    Steel, L F; Smyth, A; A. Jacobson

    1987-01-01

    Dictyostelium ribosomal protein mRNAs are subject to developmental regulation of both their translation and their stability. In order to consider whether such post-transcriptional regulation can be attributed to structural features of the mRNAs, we have cloned and sequenced a 1.9 kb EcoRI genomic DNA fragment which contains the gene for the Dictyostelium ribosomal protein 1024 (rp1024). The rp1024 gene contains a single intron of 350 bp which begins just after the fourth codon of protein codi...

  13. Gene Expression Analysis in the Age of Mass Sequencing: An Introduction.

    Pilarsky, Christian; Nanduri, Lahiri Kanth; Roy, Janine

    2016-01-01

    In recent years the technology used for gene expression analysis has changed dramatically. The old mainstay, the DNA microarray, has run its course and will soon be replaced by next-generation sequencing (NGS), the Swiss army knife of modern high-throughput nucleic acid-based analysis. Therefore preparation technologies have to adapt to suit the emerging NGS technology platform. Moreover, interpretation of the results is still time consuming and requires high-end computers usually not found in molecular biology laboratories. Alternatively, cloud computing might solve this problem. Nevertheless, these new challenges have to be embraced for gene expression analysis in general. PMID:26667455

  14. Ribosomal RNA gene sequences confirm that protistan endoparasite of larval cod Gadus morhua is Ichthyodinium sp

    Skovgaard, Alf; Meyer, Stefan; Overton, Julia Lynne; Støttrup, Josianne; Buchmann, Kurt

    2010-01-01

    An enigmatic protistan endoparasite found in eggs and larvae of cod Gadus morhua and turbot Psetta maxima was isolated from Baltic cod larvae, and DNA was extracted for sequencing of the parasite's small subunit ribosomal RNA (SSU rRNA) gene. The endoparasite has previously been suggested to be...... related to Ichthyodinium chabelardi, a dinoflagellate-like protist that parasitizes yolk sacs of embryos and larvae of a variety of fish species. Comparison of a 1535 bp long fragment of the SSU rRNA gene of the cod endoparasite showed absolute identity with I. chabelardi, demonstrating that the 2...

  15. Immunoscintigraphy with anti-225.28S for ocular melanoma - a comparison with histology and immunohistochemistry

    Aim: The purpose of this prospective study was to evaluate the value of immunoscintigraphy (ISG) with anti-225.28S in clinically suspected ocular melanoma. Methods: For this purpose standardized ISG was performed in 36 patients using both planar acquisition and emission computed tomography (ECT). Ocular melanoma was present in 31 patients. In 21 patients therapy was enucleation of the eye. These specimens were evaluated by histology and immunohistochemistry in 11 of 21 patients. Results: Regarding the clinical diagnosis, ISG was positive only in 15 of 31 patients with ocular melanoma, regarding histology in 11 of 21 and regarding immunohistochemistry in 5 of 6 patients with a positive immunoreaction. 5 patients showed no immunoreactivity, their ISG was negative. Conclusion: Thus a good correlation between ISG and immunohistochemistry was observed. However ISG using the cutaneous melanoma antibody 225.28S cannot be recommended for the diagnostic work-up of an ocular melanoma considering the poor immunoreactivity. (orig.)

  16. Comparison of inherently essential genes of Porphyromonas gingivalis identified in two transposon-sequencing libraries.

    Hutcherson, J A; Gogeneni, H; Yoder-Himes, D; Hendrickson, E L; Hackett, M; Whiteley, M; Lamont, R J; Scott, D A

    2016-08-01

    Porphyromonas gingivalis is a Gram-negative anaerobe and keystone periodontal pathogen. A mariner transposon insertion mutant library has recently been used to define 463 genes as putatively essential for the in vitro growth of P. gingivalis ATCC 33277 in planktonic culture (Library 1). We have independently generated a transposon insertion mutant library (Library 2) for the same P. gingivalis strain and herein compare genes that are putatively essential for in vitro growth in complex media, as defined by both libraries. In all, 281 genes (61%) identified by Library 1 were common to Library 2. Many of these common genes are involved in fundamentally important metabolic pathways, notably pyrimidine cycling as well as lipopolysaccharide, peptidoglycan, pantothenate and coenzyme A biosynthesis, and nicotinate and nicotinamide metabolism. Also in common are genes encoding heat-shock protein homologues, sigma factors, enzymes with proteolytic activity, and the majority of sec-related protein export genes. In addition to facilitating a better understanding of critical physiological processes, transposon-sequencing technology has the potential to identify novel strategies for the control of P. gingivalis infections. Those genes defined as essential by two independently generated TnSeq mutant libraries are likely to represent particularly attractive therapeutic targets. PMID:26358096
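
    The comparison described above amounts to intersecting two sets of putatively essential genes. A minimal sketch of that bookkeeping follows; the function name and the toy gene identifiers are hypothetical, and only the counts 281 and 463 come from the abstract.

```python
def shared_essential(library_1, library_2):
    """Return genes called essential in both libraries, plus the share of
    library_1 genes that are confirmed by library_2."""
    set_1, set_2 = set(library_1), set(library_2)
    common = set_1 & set_2
    fraction = len(common) / len(set_1) if set_1 else 0.0
    return common, fraction


# Toy identifiers for illustration only (not real P. gingivalis loci).
toy_library_1 = ["geneA", "geneB", "geneC", "geneD"]
toy_library_2 = ["geneB", "geneC", "geneE"]
common, fraction = shared_essential(toy_library_1, toy_library_2)
print(sorted(common), f"{fraction:.0%}")

# The percentage reported in the abstract: 281 of 463 Library 1 genes.
reported_percent = round(281 / 463 * 100)
print(reported_percent)
```

    With the reported counts, 281 of 463 rounds to the 61% quoted in the abstract.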

  17. Genome sequence surveys of Brachiola algerae and Edhazardia aedis reveal microsporidia with low gene densities

    Fast Naomi M

    2008-04-01

    Full Text Available Abstract Background Microsporidia are well known models of extreme nuclear genome reduction and compaction. The smallest microsporidian genomes have received the most attention, but genomes of different species range in size from 2.3 Mb to 19.5 Mb and the nature of the larger genomes remains unknown. Results Here we have undertaken genome sequence surveys of two diverse microsporidia, Brachiola algerae and Edhazardia aedis. In both species we find very large intergenic regions, many transposable elements, and a low gene-density, all in contrast to the small, model microsporidian genomes. We also find no recognizable genes that are not also found in other surveyed or sequenced microsporidian genomes. Conclusion Our results demonstrate that microsporidian genome architecture varies greatly between microsporidia. Much of the genome size difference could be accounted for by non-coding material, such as intergenic spaces and retrotransposons, and this suggests that the forces dictating genome size may vary across the phylum.

  18. Rapid high resolution genotyping of Francisella tularensis by whole genome sequence comparison of annotated genes ("MLST+").

    Markus H Antwerpen

    Full Text Available The zoonotic disease tularemia is caused by the bacterium Francisella tularensis. This pathogen is considered as a category A select agent with potential to be misused in bioterrorism. Molecular typing based on DNA-sequence like canSNP-typing or MLVA has become the accepted standard for this organism. Due to the organism's highly clonal nature, the current typing methods have reached their limit of discrimination for classifying closely related subpopulations within the subspecies F. tularensis ssp. holarctica. We introduce a new gene-by-gene approach, MLST+, based on whole genome data of 15 sequenced F. tularensis ssp. holarctica strains and apply this approach to investigate an epidemic of lethal tularemia among non-human primates in two animal facilities in Germany. Due to the high resolution of MLST+ we are able to demonstrate that three independent clones of this highly infectious pathogen were responsible for these spatially and temporally restricted outbreaks.

  19. Coptotermes gestroi (Isoptera: Rhinotermitidae) in Brazil: possible origins inferred by mitochondrial cytochrome oxidase II gene sequences.

    Martins, C; Fontes, L R; Bueno, O C; Martins, V G

    2010-09-01

    The Asian subterranean termite, Coptotermes gestroi, originally from northeast India through Burma, Thailand, Malaysia, and the Indonesian archipelago, is a major termite pest introduced in several countries around the world, including Brazil. We sequenced the mitochondrial COII gene from individuals representing 23 populations. Phylogenetic analysis of COII gene sequences from this and other studies resulted in two main groups: (1) populations of Cleveland (USA) and four populations of Malaysia and (2) populations of Brazil, four populations of Malaysia, and one population from each of Thailand, Puerto Rico, and Key West (USA). Three new localities are reported here, considerably enlarging the distribution of C. gestroi in Brazil: Campo Grande (state of Mato Grosso do Sul), Itajaí (state of Santa Catarina), and Porto Alegre (state of Rio Grande do Sul). PMID:20924414

  20. An abundance of ubiquitously expressed genes revealed by tissue transcriptome sequence data.

    Daniel Ramsköld

    2009-12-01

    Full Text Available The parts of the genome transcribed by a cell or tissue reflect the biological processes and functions it carries out. We characterized the features of mammalian tissue transcriptomes at the gene level through analysis of RNA deep sequencing (RNA-Seq) data across human and mouse tissues and cell lines. We observed that roughly 8,000 protein-coding genes were ubiquitously expressed, contributing to around 75% of all mRNAs by message copy number in most tissues. These mRNAs encoded proteins that were often intracellular, and tended to be involved in metabolism, transcription, RNA processing or translation. In contrast, genes for secreted or plasma membrane proteins were generally expressed in only a subset of tissues. The distribution of expression levels was broad but fairly continuous: no support was found for the concept of distinct expression classes of genes. Expression estimates that included reads mapping to coding exons only correlated better with qRT-PCR data than estimates which also included 3' untranslated regions (UTRs). Muscle and liver had the least complex transcriptomes, in that they expressed predominantly ubiquitous genes and a large fraction of the transcripts came from a few highly expressed genes, whereas brain, kidney and testis expressed more complex transcriptomes with the vast majority of genes expressed and relatively small contributions from the most expressed genes. mRNAs expressed in brain had unusually long 3'UTRs, and mean 3'UTR length was higher for genes involved in development, morphogenesis and signal transduction, suggesting added complexity of UTR-based regulation for these genes. Our results support a model in which variable exterior components feed into a large, densely connected core composed of ubiquitously expressed intracellular proteins.

  1. Advancing Eucalyptus genomics: identification and sequencing of lignin biosynthesis genes from deep-coverage BAC libraries

    Kudrna David

    2011-03-01

    Full Text Available Abstract Background Eucalyptus species are among the most planted hardwoods in the world because of their rapid growth, adaptability and valuable wood properties. The development and integration of genomic resources into breeding practice will be increasingly important in the decades to come. Bacterial artificial chromosome (BAC) libraries are key genomic tools that enable positional cloning of important traits, synteny evaluation, and the development of genome framework physical maps for genetic linkage and genome sequencing. Results We describe the construction and characterization of two deep-coverage BAC libraries EG_Ba and EG_Bb obtained from nuclear DNA fragments of E. grandis (clone BRASUZ1) digested with HindIII and BstYI, respectively. Genome coverages of 17 and 15 haploid genome equivalents were estimated for EG_Ba and EG_Bb, respectively. Both libraries contained large inserts, with average sizes ranging from 135 Kb (Eg_Bb) to 157 Kb (Eg_Ba), and very low extra-nuclear genome contamination, providing a probability of finding a single-copy gene ≥ 99.99%. Libraries were screened for the presence of several genes of interest via hybridizations to high-density BAC filters followed by PCR validation. Five selected BAC clones were sequenced and assembled using the Roche GS FLX technology, providing the whole sequence of the E. grandis chloroplast genome, and complete genomic sequences of important lignin biosynthesis genes. Conclusions The two E. grandis BAC libraries described in this study represent an important milestone for the advancement of Eucalyptus genomics and forest tree research. These BAC resources have a highly redundant genome coverage (> 15×), contain large average inserts and have a very low percentage of clones with organellar DNA or empty vectors. These publicly available BAC libraries are thus suitable for a broad range of applications in genetic and genomic research in Eucalyptus and possibly in related species of Myrtaceae
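
    The link between genome coverage and the quoted probability of recovering a single-copy gene can be illustrated with the standard Clarke-Carbon style approximation P ≈ 1 - exp(-c), where c is the coverage in haploid genome equivalents. The abstract does not say which formula the authors used, so the sketch below is an assumption, and the function name is illustrative.

```python
import math


def prob_locus_in_library(coverage):
    """Approximate probability that a given single-copy locus is present in
    at least one clone, for a library with the given genome coverage.
    Assumes random, unbiased cloning (Poisson approximation)."""
    return 1.0 - math.exp(-coverage)


# Coverages reported in the abstract for the two libraries.
for name, coverage in (("EG_Ba", 17), ("EG_Bb", 15)):
    p = prob_locus_in_library(coverage)
    print(f"{name}: coverage {coverage}x, P(locus present) = {p:.7f}")
```

    Under this approximation both libraries comfortably exceed the ≥ 99.99% figure; real libraries deviate from it wherever cloning is biased against particular regions.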

  2. Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes

    Butler Margaret I

    2006-10-01

    Full Text Available Abstract Background Inteins are self-splicing protein elements. They are translated as inserts within host proteins that excise themselves and ligate the flanking portions of the host protein (exteins) with a peptide bond. They are encoded as in-frame insertions within the genes for the host proteins. Inteins are found in all three domains of life and in viruses, but have a very sporadic distribution. Only a small number of intein coding sequences have been identified in eukaryotic nuclear genes, and all of these are from ascomycete or basidiomycete fungi. Results We identified seven intein coding sequences within nuclear genes coding for the second largest subunits of RNA polymerase. These sequences were found in diverse eukaryotes: one is in the second largest subunit of RNA polymerase I (RPA2) from the ascomycete fungus Phaeosphaeria nodorum, one is in the RNA polymerase III (RPC2) of the slime mould Dictyostelium discoideum and four intein coding sequences are in RNA polymerase II genes (RPB2), one each from the green alga Chlamydomonas reinhardtii, the zygomycete fungus Spiromyces aspiralis and the chytrid fungi Batrachochytrium dendrobatidis and Coelomomyces stegomyiae. The remaining intein coding sequence is in a viral relic embedded within the genome of the oomycete Phytophthora ramorum. The Chlamydomonas and Dictyostelium inteins are the first nuclear-encoded inteins found outside of the fungi. These new inteins represent a unique dataset: they are found in homologous proteins that form a paralogous group. Although these paralogues diverged early in eukaryotic evolution, their sequences can be aligned over most of their length. The inteins are inserted at multiple distinct sites, each of which corresponds to a highly conserved region of RNA polymerase. This dataset supports earlier work suggesting that inteins preferentially occur in highly conserved regions of their host proteins. Conclusion The identification of these new inteins

  3. Genome Sequencing Highlights Genes Under Selection and the Dynamic Early History of Dogs

    Freedman AH; Gronau I; Schweizer RM; Ortega-Del Vecchyo D; Han E; Silva PM; Galaverni M; Fan Z; Marx P; Lorente-Galdos B; Beale H; Ramirez O; Hormozdiari F; Alkan C; Vilà C

    2013-01-01

    To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we analyzed novel high-quality genome sequences of three gray wolves, one from each of three putative centers of dog domestication, two ancient dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. We find dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-divergence gene flow, which confounds previous inferences o...

  4. Exome sequencing identifies MPL as a causative gene in familial aplastic anemia

    Walne, Amanda J.; Dokal, Arran; Plagnol, Vincent; Beswick, Richard; Kirwan, Michael; de la Fuente, Josu; Vulliamy, Tom; Dokal, Inderjeet

    2012-01-01

    The primary cause of aplastic anemia remains unknown in many patients. The aim of this study was to clarify the genetic cause of familial aplastic anemia. Genomic DNA of an affected individual from a multiplex consanguineous family was hybridized to a Nimblegen exome library before being sequenced on a GAIIx genome analyzer. Once the disease causing homozygous mutation had been confirmed in the consanguineous family, this gene was then analyzed for mutation in 33 uncharacterized index cases o...

  5. Phylogeny of the malarial genus Plasmodium, derived from rRNA gene sequences.

    Escalante, A A; Ayala, F. J.

    1994-01-01

    Malaria is among mankind's worst scourges, affecting many millions of people, particularly in the tropics. Human malaria is caused by several species of Plasmodium, a parasitic protozoan. We analyze the small subunit rRNA gene sequences of 11 Plasmodium species, including three parasitic to humans, to infer their evolutionary relationships. Plasmodium falciparum, the most virulent of the human species, is closely related to Plasmodium reichenowi, which is parasitic to chimpanzee. The estimate...

  6. A Hybrid Distance Measure for Clustering Expressed Sequence Tags Originating from the Same Gene Family

    Ng, Keng-Hoong; Ho, Chin-Kuan; Phon-Amnuaisuk, Somnuk

    2012-01-01

    Background Clustering is a key step in the processing of Expressed Sequence Tags (ESTs). The primary goal of clustering is to put ESTs from the same transcript of a single gene into a unique cluster. Recent EST clustering algorithms mostly adopt the alignment-free distance measures, where they tend to yield acceptable clustering accuracies with reasonable computational time. Despite the fact that these clustering methods work satisfactorily on a majority of the EST datasets, they have a commo...
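
    To illustrate what an alignment-free distance between two ESTs can look like, the sketch below compares the sets of k-mers (short subsequences of length k) that two sequences contain. This is a generic Jaccard-style measure chosen for illustration, not the hybrid measure proposed in the paper, whose details are not given in the truncated abstract; the function names and the default k are assumptions.

```python
def kmer_set(sequence, k=4):
    """Return the set of all length-k substrings of a sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def kmer_jaccard_distance(seq_a, seq_b, k=4):
    """Alignment-free distance in [0, 1]: 0 when the two sequences share
    all their k-mers, 1 when they share none."""
    kmers_a, kmers_b = kmer_set(seq_a, k), kmer_set(seq_b, k)
    union = kmers_a | kmers_b
    if not union:
        # Both sequences are shorter than k; treat them as indistinguishable.
        return 0.0
    return 1.0 - len(kmers_a & kmers_b) / len(union)


# Toy sequences for illustration only.
est_1 = "ATGGCGTACGTTAGC"
est_2 = "ATGGCGTACGTTAGG"   # differs from est_1 in the last base
est_3 = "CCCCCCCCCCCCCCC"
print(kmer_jaccard_distance(est_1, est_2))
print(kmer_jaccard_distance(est_1, est_3))
```

    A clustering step would then group ESTs whose pairwise distance falls below a chosen threshold; no alignment is computed, which is what keeps such measures fast.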

  7. Nucleotide sequence variation of the envelope protein gene identifies two distinct genotypes of yellow fever virus.

    Chang, G J; Cropp, B. C.; Kinney, R M; Trent, D W; Gubler, D. J.

    1995-01-01

    The evolution of yellow fever virus over 67 years was investigated by comparing the nucleotide sequences of the envelope (E) protein genes of 20 viruses isolated in Africa, the Caribbean, and South America. Uniformly weighted parsimony algorithm analysis defined two major evolutionary yellow fever virus lineages designated E genotypes I and II. E genotype I contained viruses isolated from East and Central Africa. E genotype II viruses were divided into two sublineages: IIA viruses from West A...

  8. Sequence variation in the androgen receptor gene is not a common determinant of male sexual orientation.

    Macke, J. P.; Hu, N; S. Hu; Bailey, M.; King, V L; Brown, T.; Hamer, D; Nathans, J

    1993-01-01

    To test the hypothesis that DNA sequence variation in the androgen receptor gene plays a causal role in the development of male sexual orientation, we have (1) measured the degree of concordance of androgen receptor alleles in 36 pairs of homosexual brothers, (2) compared the lengths of polyglutamine and polyglycine tracts in the amino-terminal domain of the androgen receptor in a sample of 197 homosexual males and 213 unselected subjects, and (3) screened the entire androgen receptor cod...

  9. Molecular analysis of the bovine coronavirus S1 gene by direct sequencing of diarrheic fecal specimens

    E. Takiuchi

    2008-04-01

    Full Text Available Bovine coronavirus (BCoV) causes severe diarrhea in newborn calves, is associated with winter dysentery in adult cattle and respiratory infections in calves and feedlot cattle. The BCoV S protein plays a fundamental role in viral attachment and entry into the host cell, and is cleaved into two subunits termed S1 (amino terminal) and S2 (carboxy terminal). The present study describes a strategy for the sequencing of the BCoV S1 gene directly from fecal diarrheic specimens that were previously identified as BCoV positive by RT-PCR assay for N gene detection. A consensus sequence of 2681 nucleotides was obtained through direct sequencing of seven overlapping PCR fragments of the S gene. The samples did not undergo cell culture passage prior to PCR amplification and sequencing. The structural analysis was based on the genomic differences between Brazilian strains and other known BCoV from different geographical regions. The phylogenetic analysis of the entire S1 gene showed that the BCoV Brazilian strains were more distant from the Mebus strain (97.8% identity for nucleotides and 96.8% identity for amino acids) and more similar to the BCoV-ENT strain (98.7% for nucleotides and 98.7% for amino acids). Based on the phylogenetic analysis of the hypervariable region of the S1 subunit, these strains clustered with the American (BCoV-ENT, 182NS) and Canadian (BCQ20, BCQ2070, BCQ9, BCQ571, BCQ1523) calf diarrhea and the Canadian winter dysentery (BCQ7373, BCQ2590) strains, but clustered on a separate branch from the Korean and respiratory BCoV strains. The BCoV strains of the present study were not clustered in the same branch as previously published Brazilian strains (AY606193, AY606194). These data agree with the genealogical construction and suggest that at least two different BCoV strains are circulating in Brazil.
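
    Identity percentages such as those quoted above are typically computed column by column over a pairwise alignment. The following sketch shows one common convention (columns containing a gap in either sequence are skipped); the abstract does not state which convention or software was used, so the function and the toy sequences are illustrative assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns with identical residues.
    Both inputs must already be aligned and therefore of equal length;
    columns with a gap in either sequence are not counted."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a.upper(), aligned_b.upper()):
        if residue_a == gap or residue_b == gap:
            continue
        compared += 1
        if residue_a == residue_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned fragments for illustration only (not BCoV sequence data).
print(percent_identity("ACGTACGT", "ACGTACGA"))   # 7 of 8 columns match
print(percent_identity("ACG-ACGT", "ACGTAC-T"))   # gap columns are skipped
```

    Other conventions divide by the full alignment length or by the shorter sequence, which gives slightly different percentages for the same alignment.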

  10. Sequence Analysis of Bitter Taste Receptor Gene Repertoires in Different Ruminant Species

    Monteiro Ferreira, Ana; Tomás Marques, Andreia; Bhide, Mangesh; Cubric-Curik, Vlatka; Hollung, Kristin; Knight, Christopher Harold; Raundrup, Katrine; Lippolis, John; Palmer, Mitchell; Sales-Baptista, Elvira; Araújo, Susana de Sousa; Almeida, André Martinho

    2015-01-01

    Bitter taste has been extensively studied in mammalian species and is associated with sensitivity to toxins and with food choices that avoid dangerous substances in the diet. At the molecular level, bitter compounds are sensed by bitter taste receptor proteins (T2R) present at the surface of taste receptor cells in the gustatory papillae. Our work aims at exploring the phylogenetic relationships of T2R gene sequences within different ruminant species. To accomplish this goal, we gathered a co...

  11. How the Sequence of a Gene Specifies Structural Symmetry in Proteins.

    Xiaojuan Shen

    Full Text Available Internal symmetry is commonly observed in the majority of fundamental protein folds. Meanwhile, sufficient evidence suggests that nascent polypeptide chains of proteins have the potential to start the co-translational folding process and this process allows mRNA to contain additional information on protein structure. In this paper, we study the relationship between gene sequences and protein structures from the viewpoint of symmetry to explore how gene sequences code for structural symmetry in proteins. We found that, for a set of two-fold symmetric proteins from left-handed beta-helix fold, intragenic symmetry always exists in their corresponding gene sequences. Meanwhile, codon usage bias and local mRNA structure might be involved in modulating translation speed for the formation of structural symmetry: a major decrease of local codon usage bias in the middle of the codon sequence can be identified as a common feature; and major or consecutive decreases in local mRNA folding energy near the boundaries of the symmetric substructures can also be observed. The results suggest that gene duplication and fusion may be an evolutionarily conserved process for this protein fold. In addition, the usage of rare codons and the formation of higher order of secondary structure near the boundaries of symmetric substructures might have coevolved as conserved mechanisms to slow down translation elongation and to facilitate effective folding of symmetric substructures. These findings provide valuable insights into our understanding of the mechanisms of translation and its evolution, as well as the design of proteins via symmetric modules.

  12. Low-pass shotgun sequencing of the barley genome facilitates rapid identification of genes, conserved non-coding sequences and novel repeats

    Graner Andreas

    2008-10-01

    Full Text Available Abstract Background Barley has one of the largest and most complex genomes of all economically important food crops. The rise of new short read sequencing technologies such as Illumina/Solexa permits such large genomes to be effectively sampled at relatively low cost. Based on the corresponding sequence reads a Mathematically Defined Repeat (MDR) index can be generated to map repetitive regions in genomic sequences. Results We have generated 574 Mbp of Illumina/Solexa sequences from barley total genomic DNA, representing about 10% of a genome equivalent. From these sequences we generated an MDR index which was then used to identify and mark repetitive regions in the barley genome. Comparison of the MDR plots with expert repeat annotation drawing on the information already available for known repetitive elements revealed a significant correspondence between the two methods. MDR-based annotation allowed for the identification of dozens of novel repeat sequences, though, which were not recognised by hand-annotation. The MDR data was also used to identify gene-containing regions by masking of repetitive sequences in eight de-novo sequenced bacterial artificial chromosome (BAC) clones. For half of the identified candidate gene islands indeed gene sequences could be identified. MDR data were only of limited use, when mapped on genomic sequences from the closely related species Triticum monococcum as only a fraction of the repetitive sequences was recognised. Conclusion An MDR index for barley, which was obtained by whole-genome Illumina/Solexa sequencing, proved as efficient in repeat identification as manual expert annotation. Circumventing the labour-intensive step of producing a specific repeat library for expert annotation, an MDR index provides an elegant and efficient resource for the identification of repetitive and low-copy (i.e. potentially gene-containing sequences) regions in uncharacterised genomic sequences. The restriction that a particular
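
    The idea of marking repeats from read data alone can be sketched as follows, assuming the index is essentially a table of how often each k-mer occurs in the sampled reads (the abstract does not spell out the construction). Positions of a query sequence covered by frequently occurring k-mers are masked, leaving low-copy stretches as candidate gene-containing regions. The function names, k and the threshold are illustrative, not values from the study.

```python
from collections import Counter


def build_kmer_index(reads, k):
    """Count how often each k-mer occurs across all sampled reads."""
    index = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            index[read[i:i + k]] += 1
    return index


def kmer_frequency_profile(sequence, index, k):
    """Read-derived count of the k-mer starting at each sequence position."""
    sequence = sequence.upper()
    return [index.get(sequence[i:i + k], 0)
            for i in range(len(sequence) - k + 1)]


def mask_repeats(sequence, index, k, threshold):
    """Replace with 'N' every base covered by a k-mer seen at least
    `threshold` times in the reads."""
    masked = list(sequence.upper())
    for start, count in enumerate(kmer_frequency_profile(sequence, index, k)):
        if count >= threshold:
            masked[start:start + k] = "N" * k
    return "".join(masked)


# Toy data for illustration only: one motif sampled repeatedly, one rare read.
toy_reads = ["ACGTACGT"] * 3 + ["TTGCA"]
toy_index = build_kmer_index(toy_reads, k=4)
print(kmer_frequency_profile("TTGCACGT", toy_index, k=4))
print(mask_repeats("TTGCACGT", toy_index, k=4, threshold=5))
```

    In practice a dedicated k-mer counter would be used for hundreds of megabases of reads; the in-memory Counter here only conveys the principle.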

  13. Whole-exome sequencing for the identification of susceptibility genes of Kashin-Beck disease.

    Zhenxing Yang

    Full Text Available OBJECTIVE: To identify and investigate the susceptibility genes of Kashin-Beck disease (KBD) in the Chinese population. METHODS: Whole-exome capturing and sequencing technology was used for the detection of genetic variations in 19 individuals from six families with high incidence of KBD. A total of 44 polymorphisms from 41 genes were genotyped from a total of 144 cases and 144 controls by using MassARRAY under the standard protocol from Sequenom. Association was applied on the data by using PLINK1.07. RESULTS: In the sequencing stage, each sample showed approximately 70-fold coverage, thus covering more than 99% of the target regions. Among the single nucleotide polymorphisms (SNPs) used in the transmission disequilibrium test, 108 had a p-value of <0.01, whereas 1056 had a p-value of <0.05. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis indicates that these SNPs focus on three major pathways: regulation of actin cytoskeleton, focal adhesion, and metabolic pathways. In the validation stage, single locus effects revealed that two of these polymorphisms (rs7745040 and rs9275295) in the human leukocyte antigen (HLA)-DRB1 gene and one polymorphism (rs9473132) in the CD2-associated protein (CD2AP) gene have a significant statistical association with KBD. CONCLUSIONS: The HLA-DRB1 and CD2AP genes were identified to be among the susceptibility genes of KBD, thus supporting the role of the autoimmune response in KBD and the possibility of shared etiology between osteoarthritis, rheumatoid arthritis, and KBD.

  14. Sequence and organization of 5S ribosomal RNA-encoding genes of Arabidopsis thaliana.

    Campell, B R; Song, Y; Posch, T E; Cullis, C A; Town, C D

    1992-03-15

    We have isolated a genomic clone containing Arabidopsis thaliana 5S ribosomal RNA (rRNA)-encoding genes (rDNA) by screening an A. thaliana library with a 5S rDNA probe from flax. The clone isolated contains seven repeat units of 497 bp, plus 11 kb of flanking genomic sequence at one border. Sequencing of individual subcloned repeat units shows that the sequence of the 5S rRNA coding region is very similar to that reported for other flowering plants. Four A. thaliana ecotypes were found to contain approx. 1000 copies of 5S rDNA per haploid genome. Southern-blot analysis of genomic DNA indicates that 5S rDNA occurs in long tandem arrays, and shows the presence of numerous restriction-site polymorphisms among the six ecotypes studied. PMID:1348233

  15. Sequence analysis of the equine ACTN3 gene in Australian horse breeds.

    Thomas, K C; Hamilton, N A; North, K N; Houweling, P J

    2014-03-15

    The sarcomeric α-actinins, encoded by the genes ACTN2 and ACTN3, are major structural components of the Z-line and have high sequence similarity. α-Actinin-2 is present in all skeletal muscle fibres, while α-actinin-3 has developed specialized expression in only type 2 (fast, glycolytic) fibres. A common single nucleotide polymorphism (SNP) in the human ACTN3 gene (R577X) has been found to influence muscle performance in elite athletes and the normal population. For this reason, equine ACTN3 (eACTN3) is considered to be a possible candidate that may influence horse performance. In this study, the intron/exon boundaries and entire coding region of eACTN3 have been sequenced in five Australian horse breeds (Thoroughbred, Arabian, Standardbred, Clydesdale and Shire) and compared to the eACTN3 GenBank sequence. A total of 34 SNPs were identified, of which 26 were intronic and eight exonic. All exonic SNPs were synonymous; however, five intronic SNPs showed significant differences between breeds. A total of 72 horses were genotyped for a SNP located in the promoter region of the eACTN3 gene (g. 1104 G>A) which differed significantly between breed groups. We hypothesize that this polymorphism influences eACTN3 expression and with further studies may provide a novel marker of horse performance in the future. PMID:24440781

  16. Sequence Analysis of Bitter Taste Receptor Gene Repertoires in Different Ruminant Species.

    Ana Monteiro Ferreira

    Full Text Available Bitter taste has been extensively studied in mammalian species and is associated with sensitivity to toxins and with food choices that avoid dangerous substances in the diet. At the molecular level, bitter compounds are sensed by bitter taste receptor proteins (T2R) present at the surface of taste receptor cells in the gustatory papillae. Our work aims at exploring the phylogenetic relationships of T2R gene sequences within different ruminant species. To accomplish this goal, we gathered a collection of ruminant species with different feeding behaviors and for which no genome data are available: American bison, chamois, elk, European bison, fallow deer, goat, moose, mouflon, muskox, red deer, reindeer and white tailed deer. The herbivores chosen for this study belong to different taxonomic families and habitats, and hence, exhibit distinct foraging behaviors and diet preferences. We describe the first partial repertoires of T2R gene sequences for these species obtained by direct sequencing. We then consider the homology and evolutionary history of these receptors within this ruminant group, and whether it relates to feeding type classification, using MEGA software. Our results suggest that phylogenetic proximity of T2R genes corresponds more to the traditional taxonomic groups of the species rather than reflecting a categorization by feeding strategy.

  17. Cloning and sequence analysis of the Antheraea pernyi nucleopolyhedrovirus gp64 gene

    Wenbing Wang; Shanying Zhu; Liqun Wang; Feng Yu; Weide Shen

    2005-12-01

    Frequent outbreaks of the purulence disease of Chinese oak silkworm are reported in Middle and Northeast China. The disease is produced by the pathogen Antheraea pernyi nucleopolyhedrovirus (AnpeNPV). To obtain molecular information of the virus, the polyhedra of AnpeNPV were purified and characterized. The genomic DNA of AnpeNPV was extracted and digested with HindIII. The genome size of AnpeNPV is estimated at 128 kb. Based on the analysis of DNA fragments digested with HindIII, 23 fragments were bigger than 564 bp. A genomic library was generated using HindIII and the positive clones were sequenced and analysed. The gp64 gene, encoding the baculovirus envelope protein GP64, was found in an insert. The nucleotide sequence analysis indicated that the AnpeNPV gp64 gene consists of a 1530 nucleotide open reading frame (ORF), encoding a protein of 509 amino acids. Of the eight gp64 homologues, the AnpeNPV gp64 ORF shared the most sequence similarity with the gp64 gene of Anticarsia gemmatalis NPV, but not Bombyx mori NPV. The upstream region of the AnpeNPV gp64 ORF encoded the conserved transcriptional elements for early and late stage of the viral infection cycle. These results indicated that AnpeNPV belongs to group I NPV and was far removed in molecular phylogeny from the BmNPV.

  18. Sequences of cytochrome b gene for primitive cyprinid fishes in East Asia and their phylogenetic concerning

    2001-01-01

    1140 bp of the cytochrome b gene were amplified and sequenced from 14 species of primitive cyprinid fishes in East Asia. By aligning these with ten other cytochrome b gene sequences of cyprinid fishes from Europe and North America retrieved from GenBank, we obtained a matrix of 24 DNA sequences. A cladogram was generated by the maximum likelihood method for the primitive cyprinid fishes. The result indicated that the subfamilies Leuciscinae and Danioninae are not monophyletic groups. In the subfamily Danioninae, Opsariichthys bidens and Zacco platypus are very primitive, form a natural group, and are located at the root. The other genera in the subfamily Danioninae, however, fall into different groups and have no direct relationship. Among them, Aphyocypris chinensis and Yaoshanicus arcus form a monophyletic group. Tanichthys albonubes and Gobiocypris rarus are closely related to Gobioninae. The genus Danio is far from other genera in Danioninae. In our cladogram, the genera in Leuciscinae were divided into two groups that have no direct relationship. The genera in Leuciscinae distributed in Europe, Siberia and North America, including Leuciscus, Rutilus, Phoxinus, N. crysole, Opsopoeodus emilae, form a monophyletic group. The Leuciscinae in southern China, including Ctenopharyngodon idellus, Mylopharyngodon piceus, Squaliobarbus and Ochetobius elongatus, have a common origin.

  19. Molecular cloning and sequence analysis of a phenylalanine ammonia-lyase gene from dendrobium.

    Qing Jin

    Full Text Available In this study, a phenylalanine ammonia-lyase (PAL) gene was cloned from Dendrobium candidum using homology cloning and RACE. The full-length PAL cDNA of D. candidum (designated Dc-PAL1, GenBank No. JQ765748) has 2,458 bp and contains a complete open reading frame (ORF) of 2,142 bp, which encodes 713 amino acid residues. The amino acid sequence of DcPAL1 has more than 80% sequence identity with the PALs of other plants, as indicated by multiple alignments. The dominant sites and catalytic active sites, which are similar to those in the PAL proteins of Arabidopsis thaliana and Nicotiana tabacum, are also found in DcPAL1. Phylogenetic tree analysis revealed that DcPAL is more closely related to PALs from Orchidaceae plants than to those of other plants. The differential expression patterns of PAL in protocorm-like body, leaf, stem, and root suggest that the PAL gene performs multiple physiological functions in Dendrobium candidum.

  20. A reassessment of the evolutionary timescale of bat rabies viruses based upon glycoprotein gene sequences.

    Kuzmina, Natalia A; Kuzmin, Ivan V; Ellison, James A; Taylor, Steven T; Bergman, David L; Dew, Beverly; Rupprecht, Charles E

    2013-10-01

    Rabies, an acute progressive encephalomyelitis caused by viruses in the genus Lyssavirus, is one of the oldest known infectious diseases. Although dogs and other carnivores represent the greatest threat to public health as rabies reservoirs, it is commonly accepted that bats are the primary evolutionary hosts of lyssaviruses. Despite early historical documentation of rabies, molecular clock analyses indicate a quite young age of lyssaviruses, which is confusing. For example, the results obtained for partial and complete nucleoprotein gene sequences of rabies viruses (RABV), or for a limited number of glycoprotein gene sequences, indicated that the time of the most recent common ancestor (TMRCA) for current bat RABV diversity in the Americas lies in the seventeenth to eighteenth centuries and might be directly or indirectly associated with the European colonization. Conversely, several other reports demonstrated high genetic similarity between lyssavirus isolates, including RABV, obtained within a time interval of 25-50 years. In the present study, we attempted to re-estimate the age of several North American bat RABV lineages based on the largest set of complete and partial glycoprotein gene sequences compiled to date (n = 201) employing a codon substitution model. Although our results overlap with previous estimates in marginal areas of the 95 % high probability density (HPD), they suggest a longer evolutionary history of American bat RABV lineages (TMRCA at least 732 years, with a 95 % HPD 436-1107 years). PMID:23839669

  1. Operator Sequence Alters Gene Expression Independently of Transcription Factor Occupancy in Bacteria

    Hernan G. Garcia

    2012-07-01

    Full Text Available A canonical quantitative view of transcriptional regulation holds that the only role of operator sequence is to set the probability of transcription factor binding, with operator occupancy determining the level of gene expression. In this work, we test this idea by characterizing repression in vivo and the binding of RNA polymerase in vitro in experiments where operators of various sequences were placed either upstream or downstream from the promoter in Escherichia coli. Surprisingly, we find that operators with a weaker binding affinity can yield higher repression levels than stronger operators. Repressor bound to upstream operators modulates promoter escape, and the magnitude of this modulation is not correlated with the repressor-operator binding affinity. This suggests that operator sequences may modulate transcription by altering the nature of the interaction of the bound transcription factor with the transcriptional machinery, implying a new layer of sequence dependence that must be confronted in the quantitative understanding of gene expression.

  2. Cloning and Characterization of a Human Genomic Sequence that Alleviates Repeat-Induced Gene Silencing

    Miura, Osamu; Ohyama, Takashi; Shimizu, Noriaki

    2016-01-01

    Plasmids bearing a mammalian replication initiation region (IR) and a nuclear matrix attachment region (MAR) are spontaneously amplified in transfected mammalian cells, and such amplification generates chromosomal homogeneously staining regions (HSRs) or extrachromosomal double minutes (DMs). This method provides a novel, efficient, and rapid way to establish cells that stably produce high levels of recombinant proteins. However, because IR/MAR plasmids are amplified as repeats, they are frequently targeted by repeat-induced gene silencing (RIGS), which silences a variety of repeated sequences in transgenes and the genome. To address this problem, we developed a novel screening system using the IR/MAR plasmid to isolate human genome sequences that alleviate RIGS. The screen identified a 3,271 bp sequence (B-3-31) that elevated transgene expression without affecting the amplification process. Neither non-B structure (i.e., the inverted repeats or bending) nor known epigenetic modifier elements such as MARs, insulators, UCOEs, or STARs could explain the anti-silencing activity of B-3-31. Instead, the activity was distributed throughout the entire B-3-31 sequence, which was extremely A/T-rich and CpG-poor. Because B-3-31 effectively and reproducibly alleviated RIGS of repeated genes, it could be used to increase recombinant protein production. PMID:27078685

  3. Cloning, Sequencing and Phylogenetic Study of rbcL Gene from Cyanobacteria Arthrospira and Spirulina

    Liu Jinjie(刘金姐); Zhang Xuecheng; Sui Zhenghong; Mao Yunxiang; Sun Xue

    2004-01-01

    The large subunit gene of rubisco (rbcL) of the cyanobacteria Arthrospira platensis FACHB341, A. platensis FACHB439, A. maxima OUQDSM and Spirulina sp. FACHB440 is cloned, sequenced and characterized. Results show that the GC content of the gene in strain Spirulina sp. FACHB440 is higher than that in the others. The alignments based on deduced amino acid sequences indicate that Spirulina sp. FACHB440 differs from the three Arthrospira strains, though they have the same conserved functional sites (95, 98, 121, 124, 221, 257). The nucleotide sequence similarity among the three strains of the genus Arthrospira (96.5-99.6%) is higher than that between Arthrospira and Spirulina (78.1-78.5%). By comparison with the corresponding sequences of other cyanobacteria, a phylogenetic tree with two clusters is constructed. A. platensis FACHB341, A. maxima OUQDSM and A. platensis FACHB439 form a monophyletic lineage, which is fully supported by bootstrap values (1000), while Spirulina sp. FACHB440 and Anabaena sp. PCC7120 cluster in another lineage with a bootstrap value of 909.

  4. Sequence variation in the androgen receptor gene is not a common determinant of male sexual orientation

    Macke, J.P.; Nathans, J.; King, V.L. (Johns Hopkins Univ., Baltimore, MD (United States)); Hu, N.; Hu, S.; Hamer, D.; Bailey, M. (Northwestern Univ., Evanston, IL (United States)); Brown, T. (Johns Hopkins Univ. School of Hygiene and Public Health, Baltimore, MD (United States))

    1993-10-01

    To test the hypothesis that DNA sequence variation in the androgen receptor gene plays a causal role in the development of male sexual orientation, the authors have (1) measured the degree of concordance of androgen receptor alleles in 36 pairs of homosexual brothers, (2) compared the lengths of polyglutamine and polyglycine tracts in the amino-terminal domain of the androgen receptor in a sample of 197 homosexual males and 213 unselected subjects, and (3) screened the entire androgen receptor coding region for sequence variation by PCR and denaturing gradient-gel electrophoresis (DGGE) and/or single-strand conformation polymorphism analysis in 20 homosexual males with homosexual or bisexual brothers and one homosexual male with no homosexual brothers, and screened the amino-terminal domain of the receptor for sequence variation in an additional 44 homosexual males, 37 of whom had one or more first- or second-degree male relatives who were either homosexual or bisexual. These analyses show that (1) homosexual brothers are as likely to be discordant as concordant for androgen receptor alleles; (2) there are no large-scale differences between the distributions of polyglycine or polyglutamine tract lengths in the homosexual and control groups; and (3) coding region sequence variation is not commonly found within the androgen receptor gene of homosexual men. The DGGE screen identified two rare amino acid substitutions, Ser205-to-Arg and Glu793-to-Asp, the biological significance of which is unknown. 32 refs., 2 figs., 2 tabs.

  5. c-myc gene sequences and the phylogeny of bats and other eutherian mammals.

    Miyamoto, M M; Porter, C A; Goodman, M

    2000-09-01

    The complete protein-coding sequences of the c-myc proto-oncogene were determined for five species of four new orders of eutherian (placental) mammals. These newly obtained sequences were aligned to each other and to other available orthologs for the phylogenetic estimation of eutherian interordinal relationships. Several measures of sequence difference and base composition were first calculated to assess the major evolutionary properties of the three codon positions and two protein-coding exons of the gene. On the basis of these calculations, different parsimony, distance, and maximum likelihood approaches were adopted, with the most sophisticated involving the separate, then combined, likelihood analyses of the third codon positions of exon 2 versus all other sites. These phylogenetic approaches provided clear support for the grouping of Chiroptera (bats) with Artiodactyla (ruminants, camels, and pigs) and Carnivora (cats, dogs, and their allies), an interordinal arrangement that receives strong corroboration from other lines of evidence including complete mitochondrial DNA sequences. In contrast, these analyses failed to provide strong to reasonable support for any other interordinal group. This study concludes with specific recommendations about sampling and other strategies for maximizing the phylogenetic contributions of the c-myc gene to the continued resolution of the eutherian ordinal tree. PMID:12116424

  6. Medical Sequencing of Candidate Genes for Nonsyndromic Cleft Lip and Palate.

    2005-12-01

    Full Text Available Nonsyndromic or isolated cleft lip with or without cleft palate (CL/P) occurs across a wide geographic distribution with an average birth prevalence of 1/700. We used direct sequencing as an approach to study candidate genes for CL/P. We report here the results of sequencing on 20 candidate genes for clefts in 184 cases with CL/P selected with an emphasis on severity and positive family history. Genes were selected based on expression patterns, animal models, and/or role in known human clefting syndromes. For seven genes with identified coding mutations that are potentially etiologic, we also performed linkage disequilibrium studies in 501 family triads (affected child/mother/father). The recently reported MSX1 P147Q mutation was also studied in an additional 1,098 cleft cases. Selected missense mutations were screened in 1,064 controls from unrelated individuals on the Centre d'Etude du Polymorphisme Humain (CEPH) diversity cell line panel. Our aggregate data suggest that point mutations in these candidate genes are likely to contribute to 6% of isolated clefts, particularly those with more severe phenotypes (bilateral cleft of the lip with cleft palate). Additional cases, possibly due to microdeletions or isodisomy, were also detected and may contribute to clefts as well. Sequence analysis alone suggests that point mutations in FOXE1, GLI2, JAG2, LHX8, MSX1, MSX2, SATB2, SKI, SPRY2, and TBX10 may be rare causes of isolated cleft lip with or without cleft palate, and the linkage disequilibrium data support a larger, as yet unspecified, role for variants in or near MSX2, JAG2, and SKI. This study also illustrates the need to test large numbers of controls to distinguish rare polymorphic variants and prioritize functional studies for rare point mutations.

  7. Quantitative sequence-function relationships in proteins based on gene ontology

    Lesk Arthur M

    2007-08-01

    Full Text Available Background: The relationship between divergence of amino-acid sequence and divergence of function among homologous proteins is complex. The assumption that homologs share function – the basis of transfer of annotations in databases – must therefore be regarded with caution. Here, we present a quantitative study of sequence and function divergence, based on the Gene Ontology classification of function. We determined the relationship between sequence divergence and function divergence in 6828 protein families from the PFAM database. Within families there is a broad range of sequence similarity from very closely related proteins – for instance, orthologs in different mammals – to very distantly-related proteins at the limit of reliable recognition of homology. Results: We correlated the divergence in sequences determined from pairwise alignments, and the divergence in function determined by path lengths in the Gene Ontology graph, taking into account the fact that many proteins have multiple functions. Our results show that, among homologous proteins, the proportion of divergent functions decreases dramatically above a threshold of sequence similarity at about 50% residue identity. For proteins with more than 50% residue identity, transfer of annotation between homologs will lead to an erroneous attribution with a totally dissimilar function in fewer than 6% of cases. This means that for very similar proteins (about 50% identical residues) the chance of completely incorrect annotation is low; however, because of the phenomenon of recruitment, it is still non-zero. Conclusion: Our results describe general features of the evolution of protein function, and serve as a guide to the reliability of annotation transfer, based on the closeness of the relationship between a new protein and its nearest annotated relative.
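    To make the 50% residue-identity threshold concrete, here is a minimal sketch of how percent identity might be computed from an already aligned pair of sequences. The abstract does not specify which identity definition was used; counting only columns where neither sequence has a gap is an assumption, as are the function names and the toy alignment.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over alignment columns with no gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def above_transfer_threshold(aligned_a, aligned_b, threshold=50.0):
    """True when the pair lies above the identity threshold discussed in the abstract."""
    return percent_identity(aligned_a, aligned_b) > threshold


if __name__ == "__main__":
    # Toy alignment, not taken from the study.
    print(percent_identity("MKT-LV", "MKSALV"))
    print(above_transfer_threshold("MKT-LV", "MKSALV"))
```

    Even above the threshold, the abstract's caveat still applies: the risk of a wrong annotation is low but not zero.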

  8. tRNADB-CE: tRNA gene database well-timed in the era of big sequence data

    Takashi Abe

    2014-05-01

    Full Text Available The tRNA Gene Data Base Curated by Experts, tRNADB-CE (http://trna.ie.niigata-u.ac.jp), was constructed by analyzing 1,966 complete and 5,272 draft genomes of prokaryotes, the genomes of 171 viruses, 121 chloroplasts, and 12 eukaryotes, plus fragment sequences obtained by metagenome studies of environmental samples. In total, 595,115 tRNA genes, about twice the number compiled previously, have been registered; for each, the sequence, clover-leaf structure, and results of sequence-similarity and oligonucleotide-pattern searches can be browsed. To gather collective knowledge with help from experts in tRNA research, we added a column for registering comments on each tRNA. By grouping bacterial tRNAs with an identical sequence, we have found high phylogenetic preservation of tRNA sequences, especially at the phylum level. Since many species-unknown tRNAs from metagenomic sequences have sequences identical to those found in species-known prokaryotes, the identical sequence group can provide phylogenetic markers to investigate the microbial community in an environmental ecosystem. This strategy can be applied to the huge amount of short sequences obtained from next-generation sequencers, showing that tRNADB-CE is a well-timed database in the era of big sequence data. We also discuss how BLSOM with oligonucleotide composition is useful for efficient knowledge discovery from big sequence data.

  9. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.;

    2005-01-01

    We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each... years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences... between the species, but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence...

  10. EXONSAMPLER: a computer program for genome-wide and candidate gene exon sampling for targeted next-generation sequencing.

    Cosart, Ted; Beja-Pereira, Albano; Luikart, Gordon

    2014-11-01

    The computer program EXONSAMPLER automates the sampling of thousands of exon sequences from publicly available reference genome sequences and gene annotation databases. It was designed to provide exon sequences for the efficient, next-generation gene sequencing method called exon capture. The exon sequences can be sampled by a list of gene name abbreviations (e.g. IFNG, TLR1), or by sampling exons from genes spaced evenly across chromosomes. It provides a list of genomic coordinates (a bed file), as well as a set of sequences in fasta format. User-adjustable parameters for collecting exon sequences include a minimum and maximum acceptable exon length, maximum number of exonic base pairs (bp) to sample per gene, and maximum total bp for the entire collection. It allows for partial sampling of very large exons. It can preferentially sample upstream (5 prime) exons, downstream (3 prime) exons, both external exons, or all internal exons. It is written in the Python programming language using its free libraries. We describe the use of EXONSAMPLER to collect exon sequences from the domestic cow (Bos taurus) genome for the design of an exon-capture microarray to sequence exons from related species, including the zebu cow and wild bison. We collected ~10% of the exome (~3 million bp), including 155 candidate genes, and ~16,000 exons evenly spaced genomewide. We prioritized the collection of 5 prime exons to facilitate discovery and genotyping of SNPs near upstream gene regulatory DNA sequences, which control gene expression and are often under natural selection. PMID:24751285
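    The user-adjustable parameters listed above can be pictured with a small sketch. This is not EXONSAMPLER's code or interface: the `Exon` record, the function names, and the greedy 5'-first selection are assumptions made for illustration, and strand orientation is ignored (truncation of a very large exon simply keeps its first `max_len` bases).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    gene: str
    chrom: str
    start: int  # 0-based start, BED style
    end: int    # exclusive end, BED style
    rank: int   # 1 = most upstream (5 prime) exon of the gene


def sample_exons(exons, min_len, max_len, max_bp_per_gene, max_total_bp):
    """Greedily pick exons 5'-first while honouring length and base-pair budgets."""
    picked = []
    bp_per_gene = {}
    total_bp = 0
    for exon in sorted(exons, key=lambda e: (e.gene, e.rank)):
        length = exon.end - exon.start
        if length < min_len:
            continue  # too short to be useful
        length = min(length, max_len)  # partial sampling of very large exons
        used = bp_per_gene.get(exon.gene, 0)
        if used + length > max_bp_per_gene or total_bp + length > max_total_bp:
            continue  # would exceed the per-gene or overall budget
        picked.append(Exon(exon.gene, exon.chrom, exon.start, exon.start + length, exon.rank))
        bp_per_gene[exon.gene] = used + length
        total_bp += length
    return picked


def to_bed(exons):
    """Render the selection as tab-separated BED lines."""
    return "\n".join(f"{e.chrom}\t{e.start}\t{e.end}\t{e.gene}_exon{e.rank}" for e in exons)


if __name__ == "__main__":
    # Toy coordinates, not real annotations.
    toy = [
        Exon("IFNG", "chr5", 100, 250, 1),
        Exon("IFNG", "chr5", 400, 1400, 2),
        Exon("TLR1", "chr6", 10, 40, 1),
        Exon("TLR1", "chr6", 500, 700, 2),
    ]
    print(to_bed(sample_exons(toy, min_len=50, max_len=300,
                              max_bp_per_gene=500, max_total_bp=10_000)))
```

    Sorting by rank is what gives upstream exons priority here; a different sort key would express the other preferences the abstract mentions (3 prime, external, or internal exons).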

  11. Respiratory Syncytial Virus whole-genome sequencing identifies convergent evolution of sequence duplication in the C-terminus of the G gene

    Schobel, Seth A.; Stucker, Karla M.; Moore, Martin L.; Anderson, Larry J.; Larkin, Emma K.; Shankar, Jyoti; Bera, Jayati; Puri, Vinita; Shilts, Meghan H.; Rosas-Salazar, Christian; Halpin, Rebecca A.; Fedorova, Nadia; Shrivastava, Susmita; Stockwell, Timothy B.; Peebles, R. Stokes; Hartert, Tina V.; Das, Suman R.

    2016-01-01

    Respiratory Syncytial Virus (RSV) is responsible for considerable morbidity and mortality worldwide and is the most important respiratory viral pathogen in infants. Extensive sequence variability within and between RSV group A and B viruses and the ability of multiple clades and sub-clades of RSV to co-circulate are likely mechanisms contributing to the evasion of herd immunity. Surveillance and large-scale whole-genome sequencing of RSV is currently limited but would help identify its evolutionary dynamics and sites of selective immune evasion. In this study, we performed complete-genome next-generation sequencing of 92 RSV isolates from infants in central Tennessee during the 2012–2014 RSV seasons. We identified multiple co-circulating clades of RSV from both the A and B groups. Each clade is defined by signature N- and O-linked glycosylation patterns. Analyses of specific RSV genes revealed high rates of positive selection in the attachment (G) gene. We identified RSV-A viruses in circulation with and without a recently reported 72-nucleotide G gene sequence duplication. Furthermore, we show evidence of convergent evolution of G gene sequence duplication and fixation over time, which suggests a potential fitness advantage of RSV with the G sequence duplication. PMID:27212633

  12. Identification of functional SNPs in the 5-prime flanking sequences of human genes

    Lenhard Boris

    2005-02-01

    Full Text Available Background: Over 4 million single nucleotide polymorphisms (SNPs) are currently reported to exist within the human genome. Only a small fraction of these SNPs alter gene function or expression, and therefore might be associated with a cell phenotype. These functional SNPs are consequently important in understanding human health. Information related to functional SNPs in candidate disease genes is critical for cost effective genetic association studies, which attempt to understand the genetics of complex diseases like diabetes, Alzheimer's, etc. Robust methods for the identification of functional SNPs are therefore crucial. We report one such experimental approach. Results: Sequences conserved between mouse and human genomes, within 5 kilobases of the 5-prime end of 176 GPCR genes, were screened for SNPs. Sequences flanking these SNPs were scored for transcription factor binding sites. Allelic pairs resulting in a significant score difference were predicted to influence the binding of transcription factors (TFs). Ten such SNPs were selected for mobility shift assays (EMSA), and 7 of them exhibited a reproducible shift. The full-length promoter regions with 4 of the 7 SNPs were cloned in a luciferase-based plasmid reporter system. Two out of the 4 SNPs exhibited differential promoter activity in several human cell lines. Conclusions: We propose a method for effective selection of functional, regulatory SNPs that are located in evolutionarily conserved 5-prime flanking regions (5'-FR) of human genes and influence the activity of the transcriptional regulatory region. Some SNPs behave differently in different cell types.
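    The abstract does not name the scoring scheme used for transcription factor binding sites. One common way to compare two alleles is to scan each version of the flanking sequence with a position weight matrix and take the difference of the best scores; the sketch below assumes that approach. The matrix, pseudocount, uniform background, and all function names are hypothetical, and only one strand is scanned.

```python
import math


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def best_site_score(sequence, matrix):
    """Highest matrix score over all windows of the sequence (forward strand only)."""
    width = len(matrix)
    return max(
        sum(matrix[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    )


def allele_score_difference(flank5, ref, alt, flank3, matrix):
    """Best-site score with the reference allele minus that with the alternate allele."""
    ref_score = best_site_score(flank5 + ref + flank3, matrix)
    alt_score = best_site_score(flank5 + alt + flank3, matrix)
    return ref_score - alt_score


if __name__ == "__main__":
    # Toy TATA-like motif counts, not a curated binding profile.
    toy_counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
    pwm = log_odds_matrix(toy_counts)
    print(allele_score_difference("GC", "T", "G", "ATAGC", pwm))
```

    A large positive or negative difference would flag the SNP as a candidate for follow-up assays such as the EMSA and reporter experiments described in the abstract; what counts as "significant" is left open here.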

  13. A note on gene pleiotropy estimation from phylogenetic analysis of protein sequences

    Wen-Hai CHEN; Zhi-Xi SU; Xun GU

    2013-01-01

    Recently, several statistical methods have been independently proposed for estimating the degree (n) of gene pleiotropy (i.e. the capacity of a gene to affect many phenotypes) without knowing measurable phenotypic traits. However, the theoretical limitation of these approaches has not been well demonstrated. In this short note, we show that our previous method based on the phylogeny of protein sequences is, in fact, an effective estimate of a parameter that can be written symbolically as K = min(n, r), where r is the rank of mutations at an amino acid site. Hence, understanding of r is crucial for appropriate interpretation of the estimated K, denoted by Ke (the effective gene pleiotropy). Indeed, when protein sequence alignment is used to estimate effective gene pleiotropy (Ke) by this method, Ke can be interpreted as an effective estimate of n when n ≤ 20, as long as the phylogeny is sufficiently large. If n > 20, Ke → 20, although the true n could be much higher.

  14. Genome sequence of Rickettsia bellii illuminates the role of amoebae in gene exchanges between intracellular pathogens.

    2006-05-01

    Full Text Available The recently sequenced Rickettsia felis genome revealed an unexpected plasmid carrying several genes usually associated with DNA transfer, suggesting that ancestral rickettsiae might have been endowed with a conjugation apparatus. Here we present the genome sequence of Rickettsia bellii, the earliest diverging species of known rickettsiae. The 1,552,076 base pair-long chromosome does not exhibit the colinearity observed between other rickettsia genomes, and encodes a complete set of putative conjugal DNA transfer genes most similar to homologues found in Protochlamydia amoebophila UWE25, an obligate symbiont of amoebae. The genome exhibits many other genes highly similar to homologues in intracellular bacteria of amoebae. We sought and observed sex pili-like cell surface appendages for R. bellii. We also found that R. bellii very efficiently multiplies in the nucleus of eukaryotic cells and survives in the phagocytic amoeba, Acanthamoeba polyphaga. These results suggest that amoeba-like ancestral protozoa could have served as a genetic "melting pot" where the ancestors of rickettsiae and other bacteria promiscuously exchanged genes, eventually leading to their adaptation to the intracellular lifestyle within eukaryotic cells.

  15. In silico phylogenetic and virulence gene profile analyses of avian pathogenic Escherichia coli genome sequences

    Thaís C.G. Rojas

    2014-02-01

    Full Text Available Avian pathogenic Escherichia coli (APEC) infections are responsible for significant losses in the poultry industry worldwide. A zoonotic risk has been attributed to APEC strains because they present similarities to extraintestinal pathogenic E. coli (ExPEC) associated with illness in humans, mainly urinary tract infections and neonatal meningitis. Here, we present in silico analyses with pathogenic E. coli genome sequences, including recently available APEC genomes. The phylogenetic tree, based on multi-locus sequence typing (MLST) of seven housekeeping genes, revealed high diversity in the allelic composition. Nevertheless, despite this diversity, the phylogenetic tree was able to cluster the different pathotypes together. An in silico virulence gene profile was also determined for each of these strains, through the presence or absence of 83 well-known virulence genes/traits described in pathogenic E. coli strains. The MLST phylogeny and the virulence gene profiles demonstrated a certain genetic similarity between Brazilian APEC strains, APEC isolated in the United States, UPEC (uropathogenic E. coli) and diarrheagenic strains isolated from humans. This correlation corroborates and reinforces the zoonotic potential hypothesis proposed for APEC.

  16. Exon-intron organization and sequence comparison of human and murine T11 (CD2) genes

    Genomic DNA clones containing the human and murine genes coding for the 50-kDa T11 (CD2) T-cell surface glycoprotein were characterized. The human T11 gene is ≅ 12 kilobases long and comprises five exons. A leader exon (L) contains the 5'-untranslated region and most of the nucleotides defining the signal peptide [amino acids (aa) -24 to -5]. Two exons encode the extracellular segment; exon Ex1 is 321 base pairs (bp) long and codes for four residues of the leader peptide and aa 1-103 of the mature protein, and exon Ex2 is 231 bp long and encodes aa 104-180. Exon TM is 123 bp long and codes for the single transmembrane region of the molecule (aa 181-221). Exon C is a large 765-bp exon encoding virtually the entire cytoplasmic domain (aa 222-327) and the 3'-untranslated region. The murine T11 gene has a similar organization with exon-intron boundaries essentially identical to the human gene. Substantial conservation of nucleotide sequences between species in both 5'- and 3'-gene flanking regions equivalent to that among homologous exons suggests that murine and human genes may be regulated in a similar fashion. The probable relationship of the individual T11 exons to functional and structural protein domains is discussed.
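
    The exon lengths and residue ranges quoted in this abstract can be cross-checked with simple codon arithmetic. The sketch below is only an illustration: the variable and function names are hypothetical, and it assumes three nucleotides per residue while ignoring codons that may be split across exon boundaries.

```python
CODON_LENGTH = 3  # nucleotides per amino acid residue

# (exon length in bp, number of residues stated in the abstract)
EXONS = {
    "Ex1": (321, 4 + 103),        # 4 leader residues plus aa 1-103
    "Ex2": (231, 180 - 104 + 1),  # aa 104-180
    "TM": (123, 221 - 181 + 1),   # aa 181-221
}


def residues_encoded(length_bp: int) -> int:
    """Number of complete codons in a stretch of coding sequence."""
    return length_bp // CODON_LENGTH


for name, (length_bp, stated_residues) in EXONS.items():
    print(name, length_bp, residues_encoded(length_bp), stated_residues)

# Exon C (765 bp) also carries the 3'-untranslated region, so only part
# of it is coding: aa 222-327 of the cytoplasmic domain.
cytoplasmic_residues = 327 - 222 + 1
coding_bp_in_exon_c = cytoplasmic_residues * CODON_LENGTH
# Under this simplification the remainder is stop codon plus 3'-UTR.
remaining_bp_in_exon_c = 765 - coding_bp_in_exon_c
print("C", coding_bp_in_exon_c, remaining_bp_in_exon_c)
```

    For the three fully coding exons, the length in base pairs divided by three matches the residue count given in the text.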

  17. Cloning and Sequencing of the Pokeweed Antiviral Protein Gene and Its Expression in E. coli

    CHEN Ding-hu; WANG Xi-feng; LI Li; ZHOU Guang-he

    2002-01-01

    Total RNA was isolated from pokeweed (Phytolacca americana) leaves using the guanidine isothiocyanate method and used as a template to amplify the deleted mutant pokeweed antiviral protein (PAP) gene by RT-PCR; the gene was then cloned into the pGEM-T vector. The sequencing results showed that the PAP gene consisted of 711 nt and was 99.6% identical to the PAP gene reported by Lin et al. (1991). An IPTG-inducible expression vector containing the PAP gene was constructed and transferred into the E. coli strain BL21(DE3)pLysS. A specific protein with a molecular weight of 26 kDa was produced after induction with 0.4 mmol/L IPTG. The results of double diffusion on agar plates and of western blotting showed that the protein produced in E. coli was highly identical with the PAP extracted by a Frenchman from French pokeweed leaves. These results indicate that the PAP gene was obtained and correctly expressed in E. coli.

  18. Analysis of intron sequence features associated with transcriptional regulation in human genes.

    Huimin Li

    Full Text Available Although some preliminary work has revealed the potential transcriptional regulatory function of introns in eukaryotes, additional evidence is needed to support this conjecture. In this study, we perform systematic analyses of the sequence characteristics of human introns. The results show that the first introns are generally longer and that C, G and their dinucleotide compositions are over-represented relative to other introns, which is consistent with previous findings. In addition, some new phenomena concerned with transcriptional regulation are found: (i) the first introns are enriched in CpG islands; and (ii) the percentages of first introns containing TATA, CAAT and GC boxes are higher than those of introns at other positions. Similar features of introns are observed in tissue-specific genes. The results further support that the first introns of human genes are likely to be involved in transcriptional regulation, and give an insight into the transcriptional regulatory regions of genes.

  19. [Sequence variation of mitochondrial cytochrome b gene and phylogenetic relationships among twelve species of Charadriiformes].

    Chen, Xiao-Fang; Wang, Xiang; Yuan, Xiao-Dong; Tang, Min-Qian; Li, Yu-Xiang; Guo, Yu-Mei; Li, Qing-Wei

    2003-05-01

    Studies of the phylogenetic relationships of the Charadriiformes have been largely based on conservative morphological characters. During the past 10 years, many studies on the evolutionary biology of birds have adopted phylogenetic information obtained from mitochondrial DNA, but little work on the Charadriiformes has been reported to date. Therefore, the phylogenetic relationships and classification of the Charadriiformes remain controversial. In this study, we try to shed light on these relationships via DNA sequence analysis of the mitochondrial Cyt b gene in 12 species of Charadriiformes. It is a preliminary study of the origin and evolution of these species using nucleotide sequence data. Using PCR, the complete mitochondrial Cyt b gene sequences were amplified and sequenced from Charadrius mongolus, Charadrius alexandrinus, Numenius madagascariensis, Numenius arquata, Numenius phaeopus, Tringa totanus, Tringa glareola, Xenus cinereus, Arenaria interpres, Calidris tenuirostris, Recurvirostra avosetta and Haematopus ostralegus. The 1143 bp long DNA sequences of the gene from these species were obtained, in which 381 variable sites were identified without insertions or deletions. The nucleic acid sequence variation of the mitochondrial Cyt b gene was 5.16%-16.01% among these species. Phylogenetic trees constructed using the NJ method, MP method and ML method with Ciconia ciconia as the outgroup indicate that the 12 species of Charadriiformes examined in this study are clustered in two major clades. The first clade includes T. totanus, T. glareola, A. interpres, C. tenuirostris, X. cinereus, N. madagascariensis, N. arquata and N. phaeopus. The second one includes C. mongolus, C. alexandrinus, R. avosetta and H. ostralegus. Our molecular data show that the phylogenetic relationships among species of Scolopacidae are consistent with the classification based on morphological studies; R. avosetta and H. ostralegus are relatively closer
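
    The percentage figures in the abstract above are the kind of quantity obtained by comparing aligned sequences site by site. The sketch below shows an uncorrected pairwise distance (the proportion of differing sites) and the share of variable sites implied by the reported counts. The two short sequences are invented toy data, not Cyt b sequences, and the abstract does not say which distance measure the authors used, so the p-distance here is an assumption.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


# Toy aligned fragments (hypothetical data for illustration only).
toy_a = "ATGACCAACA"
toy_b = "ATGACTAACG"
print(f"p-distance: {p_distance(toy_a, toy_b):.2f}")

# Share of variable sites from the counts reported in the abstract.
variable_sites = 381
alignment_length = 1143
variable_percent = round(100 * variable_sites / alignment_length, 2)
print(f"variable sites: {variable_percent}%")
```

    With no insertions or deletions, as reported, the sequences align position by position, so a plain site-wise comparison of this kind is sufficient.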

  20. Complete plastid genome sequence of Vaccinium macrocarpon: structure, gene content and rearrangements revealed by next generation sequencing

    The complete plastid genome sequence of the American cranberry was reconstructed using next-generation sequencing data by in silico procedures. We used Roche 454 shotgun sequence data to isolate cranberry plastid-specific sequences of the cultivar ‘HyRed’ via homology comparisons with complete seque...

  1. Gene Expression Profiling of Development and Anthocyanin Accumulation in Kiwifruit (Actinidia chinensis) Based on Transcriptome Sequencing.

    Wenbin Li

    Full Text Available Red-fleshed kiwifruit (Actinidia chinensis Planch. 'Hongyang') is a promising commercial cultivar due to its nutritious value and unique flesh color, derived from vitamin C and anthocyanins. In this study, we obtained transcriptome data of 'Hongyang' from seven developmental stages using Illumina sequencing. We mapped 39-54 million reads to the recently sequenced kiwifruit genome and other databases to define gene structure, to analyze alternative splicing, and to quantify gene transcript abundance at different developmental stages. The transcript profiles throughout red kiwifruit development were constructed and analyzed, with a focus on the biosynthesis and metabolism of compounds such as phytohormones, sugars, starch and L-ascorbic acid, which are indispensable for the development and formation of quality fruit. Candidate genes for these pathways were identified through MapMan and phylogenetic analysis. The transcript levels of genes involved in sucrose and starch metabolism were consistent with the change in soluble sugar and starch content throughout kiwifruit development. The metabolism of L-ascorbic acid was very active, primarily through the L-galactose pathway. The genes responsible for the accumulation of anthocyanin in red kiwifruit were identified, and their expression levels were investigated during kiwifruit development. This survey of gene expression during kiwifruit development paves the way for further investigation of the development of this uniquely colored and nutritious fruit and reveals which factors are needed for high quality fruit formation. These transcriptome data and their analysis will be useful for improving kiwifruit genome annotation, for basic fruit molecular biology research, and for kiwifruit breeding and improvement.

  2. Primary sequence of the 5' flanking regions of the Drosophila heat shock genes in chromosome subdivision 67B.

    Ingolia, T D; Craig, E A

    1981-01-01

    The 5' flanking regions of the four small heat shock genes of Drosophila melanogaster from cytological locus 67B have been characterized. Approximately 500 bp of the primary sequence upstream from the proposed site of initiation of translation has been determined and the 5' end of the messenger RNAs have been localized for each gene. Each of the four genes contains an A-T rich sequence, either TATAAATA or TATAAAAG, which is flanked by a G-C rich region. This A-T rich sequence, which ends abou...

  3. Deep sequencing of New World screw-worm transcripts to discover genes involved in insecticide resistance

    Azeredo-Espin Ana Maria L

    2010-12-01

    Full Text Available Background: The New World screw-worm (NWS), Cochliomyia hominivorax, is one of the most important myiasis-causing flies, causing severe losses to the livestock industry. In its current geographical distribution, this species has been controlled by the application of insecticides, mainly organophosphate (OP) compounds, but a number of lineages have been identified that are resistant to such chemicals. Despite its economic importance, only limited genetic information is available for the NWS. Here, as a part of an effort to characterize the C. hominivorax genome and identify putative genes involved in insecticide resistance, we sampled its transcriptome by deep sequencing of polyadenylated transcripts using the 454 sequencing technology. Results: Deep sequencing on the 454 platform of three normalized libraries (larval, adult male and adult female) generated a total of 548,940 reads. Eighteen candidate genes coding for three metabolic detoxification enzyme families, cytochrome P450 monooxygenases, glutathione S-transferases and carboxyl/cholinesterases, were selected and gene expression levels were measured using quantitative real-time polymerase chain reaction (qRT-PCR). Of the investigated candidates, only one gene was expressed differently between control and resistant larvae, with at least a 10-fold down-regulation in the resistant larvae. The presence of mutations in the acetylcholinesterase (target site) and carboxylesterase E3 genes was investigated and all of the resistant flies presented E3 mutations previously associated with insecticide resistance. Conclusions: Here, we provided the largest database of NWS expressed sequence tags, which is an important resource not only for further studies on the molecular basis of OP resistance in the NWS fly, but also for functional and comparative studies among Calliphoridae flies. Among our candidates, only one gene was found differentially expressed in resistant individuals, and its role on

  4. Gene ontology-based protein function prediction by using sequence composition information.

    Dong, Qiwen; Zhou, Shuigeng; Deng, Lei; Guan, Jihong

    2010-06-01

    The prediction of protein function is a difficult and important problem in computational biology. In this study, an efficient method is presented to predict protein function from sequence composition information. Four kinds of basic building blocks of protein sequences are investigated: N-grams, binary profiles, PFAM domains and InterPro domains. The protein sequences are mapped into high-dimensional vectors by using the occurrence frequencies of each kind of building block. The resulting vectors are then taken as input to a support vector machine to predict their function based on gene ontology. Experiments are conducted over a subset of the GOA database. The experimental results show that protein function can be predicted from primary sequence information. The method based on InterPro domains outperforms the other building blocks, achieving an overall accuracy of 0.87 and an ROC score of 0.93. We also demonstrate that the use of feature extraction algorithms such as latent semantic analysis and nonnegative matrix factorization can efficiently remove noise and improve the prediction efficiency without significantly degrading the performance. The results obtained here are helpful for the prediction of protein function using only sequence information. PMID:19995340
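
    The mapping of a protein sequence to a vector of building-block occurrence frequencies can be sketched for the simplest building block, the N-gram. This is an illustrative sketch only: the function names, the 20-letter alphabet, the choice of n = 2 and the toy sequence are assumptions, not details taken from the study.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def ngram_vocabulary(n: int) -> list[str]:
    """All possible N-grams over the standard amino acid alphabet."""
    return ["".join(letters) for letters in product(AMINO_ACIDS, repeat=n)]


def ngram_frequencies(sequence: str, n: int = 2) -> list[float]:
    """Map a protein sequence to a fixed-length N-gram frequency vector."""
    vocabulary = ngram_vocabulary(n)
    counts = Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))
    # N-grams containing non-standard letters are ignored in the total.
    total = sum(counts[gram] for gram in vocabulary)
    return [counts[gram] / total if total else 0.0 for gram in vocabulary]


toy_sequence = "MKKLL"  # hypothetical peptide, for illustration only
vector = ngram_frequencies(toy_sequence, n=2)
print(len(vector))                              # dimensionality: 20 ** 2
print(vector[ngram_vocabulary(2).index("KK")])  # frequency of bigram "KK"
```

    Vectors built this way have the same length for every sequence, so they could be passed to any standard classifier; the abstract names a support vector machine, and which implementation was used is not stated.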

  5. cDNA cloning, sequence analysis, and chromosomal localization of the gene for human carnitine palmitoyltransferase

    Finocchiaro, G.; Taroni, F.; Martin, A.L.; Colombo, I.; Tarelli, G.T.; DiDonato, S. (Istituto Nazionale Neurologico C. Besta, Milan (Italy)); Rocchi, M. (Istituto G. Gaslini, Genoa (Italy))

    1991-01-15

    The authors have cloned and sequenced a cDNA encoding human liver carnitine palmitoyltransferase (CPTase), an inner mitochondrial membrane enzyme that plays a major role in the fatty acid oxidation pathway. Mixed oligonucleotide primers whose sequences were deduced from one tryptic peptide obtained from purified CPTase were used in a polymerase chain reaction, allowing the amplification of a 0.12-kilobase fragment of human genomic DNA encoding such a peptide. A 60-base-pair (bp) oligonucleotide synthesized on the basis of the sequence from this fragment was used for the screening of a cDNA library from human liver and hybridized to a cDNA insert of 2255 bp. This cDNA contains an open reading frame of 1974 bp that encodes a protein of 658 amino acid residues including 25 residues of an NH2-terminal leader peptide. The assignment of this open reading frame to human liver CPTase is confirmed by matches to seven different amino acid sequences of tryptic peptides derived from pure human CPTase and by the 82.2% homology with the amino acid sequence of rat CPTase. The NH2-terminal region of CPTase contains a leucine-proline motif that is shared by carnitine acetyl- and octanoyltransferases and by choline acetyltransferase. The gene encoding CPTase was assigned to human chromosome 1, region 1q12-1pter, by hybridization of CPTase cDNA with a DNA panel of 19 human-hamster somatic cell hybrids.

  6. cDNA cloning, sequence analysis, and chromosomal localization of the gene for human carnitine palmitoyltransferase

    The authors have cloned and sequenced a cDNA encoding human liver carnitine palmitoyltransferase (CPTase), an inner mitochondrial membrane enzyme that plays a major role in the fatty acid oxidation pathway. Mixed oligonucleotide primers whose sequences were deduced from one tryptic peptide obtained from purified CPTase were used in a polymerase chain reaction, allowing the amplification of a 0.12-kilobase fragment of human genomic DNA encoding such a peptide. A 60-base-pair (bp) oligonucleotide synthesized on the basis of the sequence from this fragment was used for the screening of a cDNA library from human liver and hybridized to a cDNA insert of 2255 bp. This cDNA contains an open reading frame of 1974 bp that encodes a protein of 658 amino acid residues including 25 residues of an NH2-terminal leader peptide. The assignment of this open reading frame to human liver CPTase is confirmed by matches to seven different amino acid sequences of tryptic peptides derived from pure human CPTase and by the 82.2% homology with the amino acid sequence of rat CPTase. The NH2-terminal region of CPTase contains a leucine-proline motif that is shared by carnitine acetyl- and octanoyltransferases and by choline acetyltransferase. The gene encoding CPTase was assigned to human chromosome 1, region 1q12-1pter, by hybridization of CPTase cDNA with a DNA panel of 19 human-hamster somatic cell hybrids.

  7. Sequence analysis of the gene for the glucan-binding protein of Streptococcus mutans Ingbritt.

    Banas, J A; Russell, R R; Ferretti, J J

    1990-01-01

    The nucleotide sequence of the gbp gene, which encodes the glucan-binding protein (GBP) of Streptococcus mutans, was determined. The reading frame for gbp was 1,689 bases. A ribosome-binding site and putative promoter preceded the start codon, and potential stem-loop structures were identified downstream from the termination codon. The deduced amino acid sequence of the GBP revealed the presence of a signal peptide of 35 amino acids. The molecular weight of the processed protein was calculated to be 59,039. Two series of repeats spanned three-quarters of the carboxy-terminal end of the protein. The repeats were 32 to 34 and 17 to 20 amino acids in length and shared partial identity within each series. The repeats were found to be homologous to sequences hypothesized to be involved in glucan binding in the GTF-I of S. downei and to sequences within the protein products encoded by gtfB and gtfC of S. mutans. The repeated sequences may represent peptide segments that are important to glucan binding and may be distributed among GBPs from other bacterial inhabitants of plaque or the oral cavity. PMID:2307516

  8. Extensive sequence variation in rice blast resistance gene Pi54 makes it broad spectrum in nature

    Shallu Thakur

    2015-05-01

    Full Text Available The rice blast resistance gene Pi54, cloned from the rice line Tetep, is effective against diverse isolates of Magnaporthe oryzae. In this study, we prospected the allelic variants of this dominant blast resistance gene from a set of 92 rice lines to determine the nucleotide diversity, pattern of its molecular evolution, phylogenetic relationships and evolutionary dynamics, and to develop allele specific markers. High quality sequences were generated for homologs of the Pi54 gene. Using comparative sequence analysis, InDels of variable sizes were observed in all the alleles. Profiling of the selected sites of SNP (single nucleotide polymorphism) and amino acids (N sites ≥ 10) exhibited constant frequency distribution of mutational and substitutional sites between the resistant and susceptible rice lines, respectively. A total of 50 new haplotypes based on the nucleotide polymorphism were also identified. A unique haplotype (H_3) was found to be linked to all the resistant alleles isolated from indica rice lines. Unique leucine zipper and tyrosine sulfation sites were identified in the predicted Pi54 proteins. Selection signals were observed in the entire coding sequence of resistance alleles, as compared to LRR domains for susceptible alleles. This is the first report of extensive variability of Pi54 alleles in different landraces and cultivated varieties, possibly accounting for broad-spectrum resistance to Magnaporthe oryzae. The sequence variation in two consensus regions (163 bp and 144 bp) was used for the development of allele specific DNA markers. Validated markers can be used for the selection and identification of better allele(s) and their introgression in commercial rice cultivars employing marker assisted selection.

  9. Molecular cloning and sequence analysis of prion protein gene in Xiji donkey in China.

    Zhang, Zhuming; Wang, Renli; Xu, Lihua; Yuan, Fangzhong; Zhou, Xiangmei; Yang, Lifeng; Yin, Xiaomin; Xu, Binrui; Zhao, Deming

    2013-10-25

    Prion diseases are a group of human and animal neurodegenerative disorders caused by the deposition of an abnormal isoform prion protein (PrP(Sc)) encoded by a single copy prion protein gene (PRNP). Prion disease has been reported in many herbivores but not in Equus, and the species barrier might be playing a role in the resistance of these species to the disease. Therefore, analysis of the genotype of prion protein (PrP) in these species may help understand the transmission of the disease. Xiji donkey is a rare species of Equus not widely reared in Ningxia, China, for service, food and medicine, but its PRNP has not been studied. Based on the reported PrP sequence in GenBank, we designed primers and amplified, cloned and sequenced the PRNP of Xiji donkey. The sequence analysis showed that the Xiji donkey PRNP consisted of an open reading frame of 768 nucleotides encoding 256 amino acids. Amino acid residues unique to donkey as compared with some Equus animals, mink, cow, sheep, human, dog, sika deer, rabbit and hamster were identified. The results showed that the amino acid sequence of Xiji donkey PrP starts with the consensus sequence MVKSH, with an amino acid sequence almost identical to the PrP of the other Equus species in this study. Amino acid sequence analysis showed high identity within species and close relation to the PRNP of sika deer, sheep, dog, camel, cow, mink, rabbit and hamster, with 83.1-99.7% identity. The results provide PRNP data for an additional Equus species, which should be useful to the study of prion disease pathogenesis, resistance and cross species transmission. PMID:23954254

  10. Revised Mimivirus major capsid protein sequence reveals intron-containing gene structure and extra domain

    Suzan-Monti Marie

    2009-05-01

    Full Text Available Background: Acanthamoebae polyphaga Mimivirus (APM) is the largest known dsDNA virus. The viral particle has a nearly icosahedral structure with an internal capsid shell surrounded by a dense layer of fibrils. A Capsid protein sequence, D13L, was deduced from the APM L425 coding gene and was shown to be the most abundant protein found within the viral particle. However, this protein remained poorly characterised until now. A revised protein sequence deposited in a database suggested an additional N-terminal stretch of 142 amino acids missing from the original deduced sequence. This result led us to investigate the L425 gene structure and the biochemical properties of the complete APM major Capsid protein. Results: This study describes the full length 3430 bp Capsid coding gene and characterises the 593 amino acids long corresponding Capsid protein 1. The recombinant full length protein allowed the production of a specific monoclonal antibody able to detect the Capsid protein 1 within the viral particle. This protein appeared to be post-translationally modified by glycosylation and phosphorylation. We proposed a secondary structure prediction of APM Capsid protein 1 compared to the Capsid protein structure of Paramecium Bursaria Chlorella Virus 1, another member of the Nucleo-Cytoplasmic Large DNA virus family. Conclusion: The characterisation of the full length L425 Capsid coding gene of Acanthamoebae polyphaga Mimivirus provides new insights into the structure of the main Capsid protein. The production of a full length recombinant protein will be useful for further structural studies.

  11. Gene identification and analysis of transcripts differentially regulated in fracture healing by EST sequencing in the domestic sheep

    Hecht Jochen

    2006-07-01

    Full Text Available Background: The sheep is an important model animal for testing novel fracture treatments and other medical applications. Despite these medical uses and the well known economic and cultural importance of the sheep, relatively little research has been performed into sheep genetics, and DNA sequences are available for only a small number of sheep genes. Results: In this work we have sequenced over 47 thousand expressed sequence tags (ESTs) from libraries developed from healing bone in a sheep model of fracture healing. These ESTs were clustered with the previously available 10 thousand sheep ESTs to a total of 19087 contigs with an average length of 603 nucleotides. We used the newly identified sequences to develop RT-PCR assays for 78 sheep genes and measured differential expression during the course of fracture healing between days 7 and 42 postfracture. All genes showed significant shifts at one or more time points. 23 of the genes were differentially expressed between postfracture days 7 and 10, which could reflect an important role for these genes in the initiation of osteogenesis. Conclusion: The sequences we have identified in this work are a valuable resource for future studies on musculoskeletal healing and regeneration using sheep and represent an important head-start for genomic sequencing projects for Ovis aries, with partial or complete sequences being made available for over 5,800 previously unsequenced sheep genes.

  12. Nucleotide sequence of glycoprotein genes B, C, D, G, H and I, the thymidine kinase and protein kinase genes and gene homologue UL24 of an Australian isolate of canine herpesvirus.

    Reubel, Gerhard Herbert; Pekin, Jenny; Webb-Wagg, Kyleen; Hardy, Christopher Miles

    2002-10-01

    We report the complete nucleotide (nt) sequence of nine genes of an Australian isolate of canine herpesvirus (CHV). Four of them are located in the unique short (US) region: glycoprotein (g) genes gG, gD and gI, and the protein kinase gene. Five are in the unique long (UL) region: the thymidine kinase gene, gB, gC, gH, and gene homologue UL24. Partial sequence was determined for four genes, two in the UL region (UL21 and virion protein) and two in the US region (US2 and gE). A repeat sequence of 382 nt with unknown function was identified in the 615 nt intergenic region between gH and UL21. A total of 16.93 kb was sequenced and compared with sequences from CHV isolates from the USA, France, Japan and Australia. Only minor nt and/or amino acid (aa) differences were observed. PMID:12416682

  13. Multiple substitutions in the von Willebrand factor gene that mimic the pseudogene sequence

    Eikenboom, J.C.; Brieet, E.; Reitsma, P.H.; Vink, T.; Sixma, J.J. [Univ. Hospital, Utrecht (Netherlands)]

    1994-03-15

    The authors have analyzed a type IIB and a type I von Willebrand disease family for the presence of mutations in the region coding for the glycoprotein Ib binding domain of the von Willebrand factor. Since this sequence is also present in the highly homologous von Willebrand factor pseudogene, the authors have studied genomic DNA as well as cDNA, which was produced from RNA isolated from endothelial cells or platelets. In both families, they have detected multiple consecutive nucleotide substitutions in the 5' end of exon 28 that result in a sequence identical to the von Willebrand factor pseudogene. These substitutions were also found in cDNA, which proves that they are present in the active gene. The occurrence of multiple adjacent substitutions that exactly reflect a part of the sequence of the von Willebrand factor pseudogene is difficult to reconcile with sequential single mutational events. They therefore hypothesize that each of these multiple substitutions arose from one recombinational event between gene and pseudogene. 34 refs., 4 figs., 2 tabs.

  14. Bioinformatic identification of microRNAs and their target genes from Solanum tuberosum expressed sequence tags

    2007-01-01

    MicroRNAs (miRNAs) are a class of non-coding RNAs that regulate gene expression post-transcriptionally in plants and animals. The low levels of some miRNAs and their time- and tissue-specific expression patterns make experimental identification of miRNAs difficult. Here we present a bioinformatic approach that uses expressed sequence tags (ESTs) to predict novel miRNAs as well as their targets in Solanum tuberosum. We searched the S. tuberosum EST databases with BLAST for potential miRNAs, using previously known miRNA sequences from Arabidopsis, rice and other plant species. By analyzing parameters of plant precursors, including secondary structure, stem length and conservation of miRNAs, and following a variety of filtering criteria, a total of 22 potential miRNAs were detected. Using the newly identified miRNA sequences, we further searched the S. tuberosum mRNA database with BLAST and detected 75 potential targets of miRNAs in S. tuberosum. According to the mRNA annotations provided by the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/), most of the miRNA target genes were predicted to encode transcription factors that regulate cell growth and development, signaling, and metabolism.

  15. Computational prediction of miRNA genes from small RNA sequencing data

    Wenjing Kang

    2015-01-01

    Full Text Available Next-generation sequencing now for the first time allows researchers to gauge the depth and variation of entire transcriptomes. However, now that rare transcripts present in cells at single copies can be detected, more advanced computational tools are needed to accurately annotate and profile them. miRNAs are 22-nucleotide small RNAs (sRNAs) that post-transcriptionally reduce the output of protein coding genes. They have established roles in numerous biological processes, including cancers and other diseases. During miRNA biogenesis, the sRNAs are sequentially cleaved from precursor molecules that have a characteristic hairpin RNA structure. The vast majority of new miRNA genes that are discovered are mined from small RNA sequencing (sRNA-seq), which can detect more than a billion RNAs in a single run. However, given that many of the detected RNAs are degradation products from all types of transcripts, the accurate identification of miRNAs remains a non-trivial computational problem. Here we review the tools available to predict animal miRNAs from sRNA sequencing data. We present tools for generalist and specialist use cases, including prediction from massively pooled data or in species without a reference genome. We also present wet-lab methods used to validate predicted miRNAs, and approaches to computationally benchmark prediction accuracy. For each tool, we reference validation experiments and benchmarking efforts. Last, we discuss the future of the field.

  16. Novel and functional DNA sequence variants within the GATA5 gene promoter in ventricular septal defects

    Ji-Ping Shan; Xiao-Li Wang; Yuan-Gang Qiao; Hong-Xin Wan Yan; Wen-Hui Huang; Shu-Chao Pang; Bo Yan

    2014-01-01

    Background: Congenital heart disease (CHD) is the most common human birth defect. Genetic causes for CHD remain largely unknown. GATA transcription factor 5 (GATA5) is an essential regulator of heart development. Mutations in the GATA5 gene have been reported in patients with a variety of CHD. Since misregulation of gene expression has been associated with human diseases, we speculated that changed levels of the cardiac transcription factor GATA5 may mediate the development of CHD. Methods: In this study, the GATA5 gene promoter was genetically and functionally analyzed in large cohorts of patients with ventricular septal defect (VSD) (n=343) and ethnic-matched healthy controls (n=348). Results: Two novel and heterozygous DNA sequence variants (DSVs), g.61051165A>G and g.61051463delC, were identified in three VSD patients, but not in the controls. In cultured cardiomyocytes, GATA5 gene promoter activities were significantly decreased by DSV g.61051165A>G and increased by DSV g.61051463delC. Moreover, fathers of the VSD patients carrying the same DSVs had reduced diastolic function of the left ventricle. Three SNPs, g.61051279C>T (rs77067995), g.61051327A>C (rs145936691) and g.61051373G>A (rs80197101), and one novel heterozygous DSV, g.61051227C>T, were found in both VSD patients and controls with similar frequencies. Conclusion: Our data suggest that the DSVs in the GATA5 gene promoter may increase the susceptibility to the development of VSD as a risk factor.

  17. Sequencing, Expression and Diagnostic Application of the Nucleoprotein Gene of Xinjiang Hemorrhagic Fever Virus

    马本江; 杭长寿; 解燕乡; 王世文

    2004-01-01

    In order to analyze the nucleoprotein (NP) gene of Crimean-Congo hemorrhagic fever virus (CCHFV), viral RNA was amplified by RT-PCR using a proof-reading DNA polymerase to produce the complete NP gene. The PCR product was sequenced, analyzed phylogenetically, and cloned into the expression vector pET32a, and the recombinant plasmid was expressed in E. coli BL-21 with high yield. The preliminarily purified fusion protein was used to coat ELISA plates to detect antibodies. The NP gene of BA88166 was found to be highly similar to those of other XHFVs at both the nucleotide and amino acid levels, and it encoded a nucleoprotein of 482 amino acids with a deduced molecular weight (MW) of 54 kDa. Western blot assay showed that the fusion protein expressed in bacteria possessed good antigenicity. ELISA results for human and animal sera collected in endemic areas were in good accordance with the clinical diagnosis. It was concluded that the NP genes of XHFV BA88166 and other XHFVs are evolutionarily close. The methodologies established in this study were accurate, specific, rapid and reproducible for clinical examinations and epidemiological surveys.

  18. Targeted next-generation sequencing reveals multiple deleterious variants in OPLL-associated genes.

    Chen, Xin; Guo, Jun; Cai, Tao; Zhang, Fengshan; Pan, Shengfa; Zhang, Li; Wang, Shaobo; Zhou, Feifei; Diao, Yinze; Zhao, Yanbin; Chen, Zhen; Liu, Xiaoguang; Chen, Zhongqiang; Liu, Zhongjun; Sun, Yu; Du, Jie

    2016-01-01

    Ossification of the posterior longitudinal ligament of the spine (OPLL), which is characterized by ectopic bone formation in the spinal ligaments, can cause spinal-cord compression. To date, at least 11 susceptibility genes have been genetically linked to OPLL. In order to identify potential deleterious alleles in these OPLL-associated genes, we designed a capture array encompassing all coding regions of the target genes for next-generation sequencing (NGS) in a cohort of 55 unrelated patients with OPLL. By bioinformatics analyses, we successfully identified three novel and five extremely rare variants (MAF < 0.005). These variants, which result in missense mutations in four OPLL-associated genes (i.e., COL6A1, COL11A2, FGFR1, and BMP2), were predicted to be deleterious by various commonly used algorithms. Furthermore, a potential effect of the BMP2 p.Q89E variant was confirmed by a markedly increased BMP2 level in peripheral blood samples from the patient carrying it. Notably, seven of the variants were found to be associated with patients showing continuous subtype changes on cervical spinal radiological analyses. Taken together, our findings revealed for the first time that deleterious coding variants of the four OPLL-associated genes are potentially pathogenic in patients with OPLL. PMID:27246988
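
The rarity filter described above (novel variants plus those with MAF < 0.005) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Variant` record, the `classify_variant` helper, the treatment of a missing MAF as "novel", and the example entries are all hypothetical. Only the 0.005 cutoff comes from the abstract.

```python
from dataclasses import dataclass
from typing import Optional

MAF_CUTOFF = 0.005  # cutoff quoted in the abstract


@dataclass
class Variant:
    gene: str
    change: str
    # Assumption: None means the variant is absent from the reference database.
    maf: Optional[float]


def classify_variant(variant: Variant, cutoff: float = MAF_CUTOFF) -> str:
    """Label a variant as novel, extremely rare, or common by its MAF."""
    if variant.maf is None:
        return "novel"
    if variant.maf < cutoff:
        return "extremely rare"
    return "common"


# Hypothetical records for illustration only; not data from the study.
examples = [
    Variant("GENE_A", "p.X1Y", None),
    Variant("GENE_B", "p.X2Y", 0.001),
    Variant("GENE_C", "p.X3Y", 0.12),
]
labels = [classify_variant(v) for v in examples]
print(labels)
```

The comparison is strict, so a variant sitting exactly at the cutoff is treated as common, matching the "MAF < 0.005" wording.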

  20. Understanding gene sequence variation in the context of transcription regulation in yeast.

    Irit Gat-Viks

    2010-01-01

    DNA sequence polymorphism in a regulatory protein can have a widespread transcriptional effect. Here we present a computational approach for analyzing modules of genes with a common regulation that are affected by specific DNA polymorphisms. We identify such regulatory-linkage modules by integrating genotypic and expression data for individuals in a segregating population with complementary expression data of strains mutated in a variety of regulatory proteins. Our procedure searches simultaneously for groups of co-expressed genes, for their common underlying linkage interval, and for their shared regulatory proteins. We applied the method to a cross between laboratory and wild strains of S. cerevisiae, demonstrating its ability to correctly suggest modules and to outperform extant approaches. Our results suggest that middle sporulation genes are under the control of polymorphism in the sporulation-specific tertiary complex Sum1p/Rfm1p/Hst1p. In another example, our analysis reveals novel inter-relations between Swi3 and two mitochondrial inner membrane proteins underlying variation in a module of aerobic cellular respiration genes. Overall, our findings demonstrate that this approach provides a useful framework for the systematic mapping of quantitative trait loci and their role in gene expression variation.

  1. An Updated Collection of Sequence Barcoded Temperature-Sensitive Alleles of Yeast Essential Genes.

    Kofoed, Megan; Milbury, Karissa L; Chiang, Jennifer H; Sinha, Sunita; Ben-Aroya, Shay; Giaever, Guri; Nislow, Corey; Hieter, Philip; Stirling, Peter C

    2015-09-01

    Systematic analyses of essential gene function using mutant collections in Saccharomyces cerevisiae have been conducted using collections of heterozygous diploids, promoter shut-off alleles, and alleles with destabilized mRNA, destabilized protein, or mutations that lead to a temperature-sensitive (ts) phenotype. We previously described a method for construction of barcoded ts alleles in a systematic fashion. Here we report the completion of this collection of alleles covering 600 essential yeast genes. This resource covers a larger gene repertoire than previous collections and provides a complementary set of strains suitable for single gene and genomic analyses. We use deep sequencing to characterize the amino acid changes leading to the ts phenotype in half of the alleles. We also use high-throughput approaches to describe the relative ts behavior of the alleles. Finally, we demonstrate the experimental usefulness of the collection in a high-content, functional genomic screen for ts alleles that increase spontaneous P-body formation. By increasing the number of alleles and improving the annotation, this ts collection will serve as a community resource for probing new aspects of biology for essential yeast genes. PMID:26175450

  2. Abundance and genetic diversity of nifH gene sequences in anthropogenically affected Brazilian mangrove sediments.

    Dias, Armando Cavalcante Franco; Pereira e Silva, Michele de Cassia; Cotta, Simone Raposo; Dini-Andreote, Francisco; Soares, Fábio Lino; Salles, Joana Falcão; Azevedo, João Lúcio; van Elsas, Jan Dirk; Andreote, Fernando Dini

    2012-11-01

    Although mangroves represent ecosystems of global importance, the genetic diversity and abundance of functional genes that are key to their functioning scarcely have been explored. Here, we present a survey based on the nifH gene across transects of sediments of two mangrove systems located along the coast line of São Paulo state (Brazil) which differed by degree of disturbance, i.e., an oil-spill-affected and an unaffected mangrove. The diazotrophic communities were assessed by denaturing gradient gel electrophoresis (DGGE), quantitative PCR (qPCR), and clone libraries. The nifH gene abundance was similar across the two mangrove sediment systems, as evidenced by qPCR. However, the nifH-based PCR-DGGE profiles revealed clear differences between the mangroves. Moreover, shifts in the nifH gene diversities were noted along the land-sea transect within the previously oiled mangrove. The nifH gene diversity depicted the presence of nitrogen-fixing bacteria affiliated with a wide range of taxa, encompassing members of the Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Firmicutes, and also a group of anaerobic sulfate-reducing bacteria. We also detected a unique mangrove-specific cluster of sequences denoted Mgv-nifH. Our results indicate that nitrogen-fixing bacterial guilds can be partially endemic to mangroves, and these communities are modulated by oil contamination, which has important implications for conservation strategies. PMID:22941088

  3. Next-generation sequencing identifies transportin 3 as the causative gene for LGMD1F.

    Annalaura Torella

    Limb-girdle muscular dystrophies (LGMD) are genetically and clinically heterogeneous conditions. We investigated a large family with an autosomal dominant transmission pattern, previously classified as LGMD1F and mapped to chromosome 7q32. Affected members are characterized by muscle weakness affecting the pelvic girdle and the iliopsoas muscles earlier. We sequenced the whole exome of four family members and identified a shared heterozygous frame-shift variant in the Transportin 3 (TNPO3) gene, encoding a member of the importin-β super-family. The TNPO3 gene is mapped within the LGMD1F critical interval and its 923-amino acid human gene product is also expressed in skeletal muscle. In addition, we identified an isolated case of LGMD with a new missense mutation in the same gene. We localized the mutant TNPO3 around the nucleus, but not inside. The involvement of a gene related to nuclear transport suggests a novel disease mechanism leading to muscular dystrophy.

  4. Novel and Functional DNA Sequence Variants within the GATA6 Gene Promoter in Ventricular Septal Defects

    Chunyu Li

    2014-07-01

    Congenital heart disease (CHD) is the most common birth defect in humans. Genetic causes and underlying molecular mechanisms for isolated CHD remain largely unknown. Studies have demonstrated that GATA transcription factor 6 (GATA6) plays an essential role in heart development. Mutations in the GATA6 gene have been associated with diverse types of CHD. As GATA6 functions in a dosage-dependent manner, we speculated that changed GATA6 levels, resulting from DNA sequence variants (DSVs) within the gene regulatory regions, may mediate CHD development. In the present study, the GATA6 gene promoter was genetically and functionally analyzed in large groups of patients with ventricular septal defect (VSD) (n = 359) and ethnic-matched healthy controls (n = 365). In total, 11 DSVs, including four SNPs, were identified in VSD patients and controls. Two novel and heterozygous DSVs, g.22169190A>T and g.22169311C>G, were identified in two VSD patients, but in none of the controls. In cultured cardiomyocytes, the activities of the GATA6 gene promoter were significantly reduced by the DSVs g.22169190A>T and g.22169311C>G. Therefore, our findings suggested that the DSVs within the GATA6 gene promoter identified in VSD patients may change GATA6 levels, contributing to VSD development as a risk factor.

  5. Sequence Variation and Expression of the Gimap Gene Family in the BB Rat

    Elizabeth A. Rutledge

    2009-01-01

    Positional cloning of lymphopenia (lyp) in the BB rat revealed a frameshift mutation in Gimap5, a member of at least seven related GTPase Immune Associated Protein genes located on rat chromosome 4q24. Our aim was to clone and sequence the cDNA of the BB diabetes prone (DP) and diabetes resistant (DR) alleles of all seven Gimap genes in the congenic DR.lyp rat line with 2 Mb of BB DP DNA introgressed onto the DR genetic background. All (100%) DR.lyp/lyp rats are lymphopenic and develop type 1 diabetes (T1D) by 84 days of age while DR.+/+ rats remain T1D and lyp resistant. Among the seven Gimap genes, the Gimap5 frameshift mutation, a mutant allele that produces no protein, had the greatest impact on lymphopenia in the DR.lyp/lyp rat. Gimap4 and Gimap1 each had one amino acid substitution of unlikely significance for lymphopenia. Quantitative RT-PCR analysis showed a reduction in expression of all seven Gimap genes in DR.lyp/lyp spleen and mesenteric lymph nodes when compared to DR.+/+. Only four, Gimap1, Gimap4, Gimap5, and Gimap9, were reduced in thymus. Our data substantiate the Gimap5 frameshift mutation as the primary defect with only limited contributions to lymphopenia from the remaining Gimap genes.

  6. Sequence and molecular analysis of the nifL gene of Azotobacter vinelandii.

    Blanco, G; Drummond, M; Woodley, P; Kennedy, C

    1993-08-01

    In both Klebsiella pneumoniae and Azotobacter vinelandii the nifL gene, which encodes a negative regulator of nitrogen fixation, lies immediately upstream of nifA. We have sequenced the A. vinelandii nifL gene and found that it is more homologous in its C-terminal domain to the histidine protein kinases (HPKs) than is K. pneumoniae NifL. In particular A. vinelandii NifL contains a conserved histidine at a position shown to be phosphorylated in other systems. Both NifL proteins are homologous in their N-termini to a part of the Halobacterium halobium bat gene product; Bat is involved in regulation of bacterio-opsin, the expression of which is oxygen sensitive. The same region showed homology to the haem-binding N-terminal domain of the Rhizobium meliloti fixL gene product, an oxygen-sensing protein. Like K. pneumoniae NifL, A. vinelandii NifL is shown here to prevent expression of nif genes in the presence of NH4+ or oxygen. The sequences found homologous in the C-terminal regions of NifL, FixL and Bat might therefore be involved in oxygen binding or sensing. An in-frame deletion mutation in the nifL coding region resulted in loss of repression by NH4+ and the mutant excreted high amounts of ammonia during nitrogen fixation, thus confirming a phenotype reported earlier for an insertion mutation. In addition, nifLA are cotranscribed in A. vinelandii as in K. pneumoniae, but expression from the A. vinelandii promoter requires neither RpoN nor NtrC. PMID:8231815

  7. De novo transcriptome sequencing and discovery of genes related to copper tolerance in Paeonia ostii.

    Wang, Yanjie; Dong, Chunlan; Xue, Zeyun; Jin, Qijiang; Xu, Yingchun

    2016-01-15

    Paeonia ostii, an important ornamental and medicinal plant, grows normally on copper (Cu) mines with widespread Cu contamination of soils, and it has the ability to lower Cu contents in the Cu-contaminated soils. However, very little molecular information concerned with Cu resistance of P. ostii is available. In this study, high-throughput de novo transcriptome sequencing was carried out for P. ostii with and without Cu treatment using Illumina HiSeq 2000 platform. A total of 77,704 All-unigenes were obtained with a mean length of 710 bp. Of these unigenes, 47,461 were annotated with public databases based on sequence similarities. Comparative transcript profiling allowed the discovery of 4324 differentially expressed genes (DEGs), with 2207 up-regulated and 2117 down-regulated unigenes in Cu-treated library as compared to the control counterpart. Based on these DEGs, Gene Ontology (GO) enrichment analysis indicated Cu stress-relevant terms, such as 'membrane' and 'antioxidant activity'. Meanwhile, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis uncovered some important pathways, including 'biosynthesis of secondary metabolites' and 'metabolic pathways'. In addition, expression patterns of 12 selected DEGs derived from quantitative real-time polymerase chain reaction (qRT-PCR) were consistent with their transcript abundance changes obtained by transcriptomic analyses, suggesting that all the 12 genes were authentically involved in Cu tolerance in P. ostii. This is the first report to identify genes related to Cu stress responses in P. ostii, which could offer valuable information on the molecular mechanisms of Cu resistance, and provide a basis for further genomics research on this and related ornamental species for phytoremediation. PMID:26435192
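
The counts reported above can be cross-checked with a few lines of arithmetic. This is only a consistency sketch using the figures quoted in the abstract; the variable names are illustrative and nothing here reproduces the authors' analysis.

```python
# Counts as reported in the abstract.
total_unigenes = 77_704
annotated_unigenes = 47_461
up_regulated = 2_207
down_regulated = 2_117
reported_degs = 4_324

# The up- and down-regulated sets should add up to the reported DEG total.
degs_consistent = (up_regulated + down_regulated) == reported_degs

# Derived proportions (not stated in the abstract, computed here).
annotation_rate = annotated_unigenes / total_unigenes
deg_fraction = reported_degs / total_unigenes

print(f"DEG counts consistent: {degs_consistent}")
print(f"Annotated unigenes: {annotation_rate:.1%}")
print(f"Unigenes called as DEGs: {deg_fraction:.1%}")
```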

  8. Detection of Tropical Fungi in Formalin-Fixed, Paraffin-Embedded Tissue: Still an Indication for Microscopy in Times of Sequence-Based Diagnosis?

    Hagen Frickmann; Ulrike Loderstaedt; Paul Racz; Klara Tenner-Racz; Petra Eggert; Alexandra Haeupler; Ralf Bialek; Ralf Matthias Hagen

    2015-01-01

    Introduction. The aim of the study was the evaluation of panfungal PCR protocols with subsequent sequence analysis for the diagnostic identification of invasive mycoses in formalin-fixed, paraffin-embedded tissue samples with rare tropical mycoses. Materials and Methods. Five different previously described panfungal PCR/sequencing protocols targeting 18S and 28S ribosomal RNA gene fragments as well as internal transcribed spacer 1 and 2 fragments were evaluated with a collection of 17 formali...

  9. Clinical Next-Generation Sequencing Pipeline Outperforms a Combined Approach Using Sanger Sequencing and Multiplex Ligation-Dependent Probe Amplification in Targeted Gene Panel Analysis.

    Schenkel, Laila C; Kerkhof, Jennifer; Stuart, Alan; Reilly, Jack; Eng, Barry; Woodside, Crystal; Levstik, Alexander; Howlett, Christopher J; Rupar, Anthony C; Knoll, Joan H M; Ainsworth, Peter; Waye, John S; Sadikovic, Bekim

    2016-09-01

    Advances in next-generation sequencing (NGS) have facilitated parallel analysis of multiple genes enabling the implementation of cost-effective, rapid, and high-throughput methods for the molecular diagnosis of multiple genetic conditions, including the identification of BRCA1 and BRCA2 mutations in high-risk patients for hereditary breast and ovarian cancer. We clinically validated a NGS pipeline designed to replace Sanger sequencing and multiplex ligation-dependent probe amplification analysis and to facilitate detection of sequence and copy number alterations in a single test focusing on a BRCA1/BRCA2 gene analysis panel. Our custom capture library covers 46 exons, including BRCA1 exons 2, 3, and 5 to 24 and BRCA2 exons 2 to 27, with 20 nucleotides of intronic regions both 5' and 3' of each exon. We analyzed 402 retrospective patients, with previous Sanger sequencing and multiplex ligation-dependent probe amplification results, and 240 clinical prospective patients. One-hundred eighty-three unique variants, including sequence and copy number variants, were detected in the retrospective (n = 95) and prospective (n = 88) cohorts. This standardized NGS pipeline demonstrated 100% sensitivity and 100% specificity, uniformity, and high-depth nucleotide coverage per sample (approximately 7000 reads per nucleotide). Subsequently, the NGS pipeline was applied to the analysis of larger gene panels, which have shown similar uniformity, sample-to-sample reproducibility in coverage distribution, and sensitivity and specificity for detection of sequence and copy number variants. PMID:27376475
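
Sensitivity and specificity, as quoted above, are ratios over a comparison against a reference method. The sketch below shows the standard definitions; the function names and the example counts are hypothetical and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of reference-positive calls that the pipeline also detects."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no reference-positive calls to evaluate")
    return true_pos / total


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of reference-negative calls that the pipeline also reports as negative."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no reference-negative calls to evaluate")
    return true_neg / total


# Hypothetical counts for illustration only: a pipeline that agrees
# with the reference method on every call scores 1.0 on both measures.
sens = sensitivity(true_pos=50, false_neg=0)
spec = specificity(true_neg=200, false_pos=0)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```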

  10. Cloning and sequencing of the trpE gene from Arthrobacter globiformis ATCC 8010 and several related subsurface Arthrobacter isolates

    Chernova, T.; Viswanathan, V.K.; Austria, N.; Nichols, B.P.

    1998-09-01

    Tryptophan-dependent mutants of Arthrobacter globiformis ATCC 8010 were isolated and trp genes were cloned by complementation and marker rescue of the auxotrophic strains. Rescue studies and preliminary sequence analysis reveal that at least the genes trpE, trpC, and trpB are clustered together in this organism. In addition, sequence analysis of the entire trpE gene, which encodes component I of anthranilate synthase, is described. Segments of the trpE gene from 17 subsurface isolates of Arthrobacter sp. were amplified by PCR and sequenced. The partial trpE sequences from the various strains were aligned and subjected to phylogenetic analysis. The data suggest that in addition to single base changes, recombination and genetic exchange play a major role in the evolution of the Arthrobacter genome.

  11. Targeted next generation sequencing reveals a novel intragenic deletion of the TPO gene in a family with intellectual disability

    Iqbal, Z.; Neveling, K.; Razzaq, A.; Shahzad, M.; Zahoor, M.Y.; Qasim, M.; Gilissen, C.; Wieskamp, N.; Kwint, M.P.; Gijsen, S.; Brouwer, A.P. de; Veltman, J.A.; Riazuddin, S.; Bokhoven, J.H.L.M. van

    2012-01-01

    BACKGROUNDS AND AIMS: Next generation sequencing (NGS) approaches have revolutionized the identification of mutations underlying genetic disorders. This technology is particularly useful for the identification of mutations in known and new genes for conditions with extensive genetic heterogeneity. I

  12. Agouti signalling protein (ASIP) gene: molecular cloning, sequence characterisation and tissue distribution in domestic goose.

    Zhang, J; Wang, C; Liu, Y; Liu, J; Wang, H Y; Liu, A F; He, D Q

    2016-06-01

    Agouti signalling protein (ASIP) is an endogenous antagonist of melanocortin-1 receptor (MC1R) and is involved in the regulation of pigmentation in mammals. The objective of this study was to identify and characterise the ASIP gene in domestic goose. The goose ASIP cDNA consisted of a 44-nucleotide 5'-terminal untranslated region (UTR), a 390-nucleotide open-reading frame (ORF) and a 45-nucleotide 3'-UTR. The length of goose ASIP genomic DNA was 6176 bp, including three coding exons and two introns. Bioinformatic analysis indicated that the ORF encodes a protein of 130 amino-acid residues with a molecular weight of 14.88 kDa and an isoelectric point of 9.73. Multiple sequence alignments and phylogenetic analysis showed that the amino-acid sequence of ASIP was conserved in vertebrates, especially in the avian species. RT-qPCR showed that the goose ASIP mRNA was differentially expressed in the pigment deposition tissues, including eye, foot, feather follicle, skin of the back, as well as in skin of the abdomen. The expression level of the ASIP gene in skin of the abdomen was higher than that in skin of the back. These findings will contribute to further understanding of the functions of the ASIP gene in goose plumage colouring. PMID:26750999

  13. Sequence analysis and prokaryotic expression of Giardia lamblia α-18 giardin gene.

    Wu, Sheng; Yu, Xingang; Abdullahi, Auwalu Yusuf; Hu, Wei; Pan, Weida; Shi, Xianli; Tan, Liping; Song, Meiran; Li, Guoqing

    2016-03-01

    To study the genetic variation and prokaryotic expression of the α18 giardin gene of Giardia lamblia zoonotic assemblage A and host-specific assemblage F, the α18 genes were amplified from G. lamblia assemblages A and F by PCR and sequenced. The PCR product was cloned into the prokaryotic expression vector pET-28a(+) and the positive recombinant plasmid was transformed into Escherichia coli Rosetta (DE3) strain for expression. The expressed α18 giardin fusion protein was validated by SDS-PAGE and Western blot analysis, and purified by Ni-Agarose resin. The putative amino acid sequence of α18 giardin was analyzed by bioinformatics software. Results showed that the α18 giardin gene was 861 bp in length, encoding 286 amino acids; it was 100% homologous between human-derived and dog-derived G. lamblia assemblage A, but it was 86.8% homologous with G. lamblia assemblage F (cat-derived). Giardin α18 was about 36 kDa in molecular weight, with good reactivity. In silico analyses predicted that it is hydrophobic, lacks a signal peptide and transmembrane domain, and contains 11 alpha regions, 13 beta sheets, 1 beta turn and 7 random coils in its secondary structure. The above information would lay the foundation for research on the subcellular localization and biological function of α18 giardin in G. lamblia. PMID:26656833
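
The relation between the reported gene length (861 bp) and protein length (286 amino acids) follows from codon arithmetic, sketched below. The `protein_length` helper is hypothetical, and the assumption that the quoted 861 bp includes the stop codon is ours, not stated in the abstract.

```python
def protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length in bp."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # The stop codon terminates translation and encodes no residue.
    return codons - 1 if includes_stop else codons


giardin_residues = protein_length(861)
print(giardin_residues)  # → 286
```

Under that assumption, 861 bp gives 287 codons, of which 286 encode residues, in agreement with the figure in the abstract.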

  14. SAS1 and SAS2, GTP-binding protein genes in Dictyostelium discoideum with sequence similarities to essential genes in Saccharomyces cerevisiae.

    Saxe, S A; Kimmel, A R

    1990-01-01

    We have identified two novel, very closely related genes, SAS1 and SAS2, from Dictyostelium discoideum. These encode small, approximately 20-kilodalton proteins with amino acid sequences thought to be involved in interaction with guanine nucleotides. The protein sizes, spacings of GTP-binding domains, and carboxyl-terminal sequences suggest their relationship to the ubiquitous ras-type proteins. Their sequences, however, are sufficiently different to indicate that they are not true ras protein...

  15. Barcode Sequencing Screen Identifies SUB1 as a Regulator of Yeast Pheromone Inducible Genes.

    Sliva, Anna; Kuang, Zheng; Meluh, Pamela B; Boeke, Jef D

    2016-01-01

    The yeast pheromone response pathway serves as a valuable model of eukaryotic mitogen-activated protein kinase (MAPK) pathways, and transcription of their downstream targets. Here, we describe application of a screening method combining two technologies: fluorescence-activated cell sorting (FACS), and barcode analysis by sequencing (Bar-Seq). Using this screening method, and pFUS1-GFP as a reporter for MAPK pathway activation, we readily identified mutants in known mating pathway components. In this study, we also include a comprehensive analysis of the FUS1 induction properties of known mating pathway mutants by flow cytometry, featuring single cell analysis of each mutant population. We also characterized a new source of false positives resulting from the design of this screen. Additionally, we identified a deletion mutant, sub1Δ, with increased basal expression of pFUS1-GFP. Here, in the first ChIP-Seq of Sub1, our data show that Sub1 binds to the promoters of about half the genes in the genome (tripling the 991 loci previously reported), including the promoters of several pheromone-inducible genes, some of which show an increase upon pheromone induction. Here, we also present the first RNA-Seq of a sub1Δ mutant; the majority of genes have no change in RNA, but, of the small subset that do, most show decreased expression, consistent with biochemical studies implicating Sub1 as a positive transcriptional regulator. The RNA-Seq data also show that certain pheromone-inducible genes are induced less in the sub1Δ mutant relative to the wild type, supporting a role for Sub1 in regulation of mating pathway genes. The sub1Δ mutant has increased basal levels of a small subset of other genes besides FUS1, including IMD2 and FIG1, a gene encoding an integral membrane protein necessary for efficient mating. PMID:26837954
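
The general idea behind a FACS plus Bar-Seq screen, counting strain barcodes in a sorted pool and comparing them with the input pool, can be sketched as follows. This is an illustrative outline under our own assumptions, not the authors' method: the barcode sequences, strain names, and helper functions are hypothetical, and real pipelines also handle sequencing errors and low counts.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def count_barcodes(reads: Iterable[str], barcode_to_strain: Dict[str, str]) -> Counter:
    """Count reads per strain, ignoring barcodes that are not in the lookup table."""
    counts: Counter = Counter()
    for read in reads:
        strain = barcode_to_strain.get(read)
        if strain is not None:
            counts[strain] += 1
    return counts


def log2_enrichment(sorted_counts: Counter, input_counts: Counter) -> Dict[str, float]:
    """log2 ratio of each strain's frequency in the sorted pool versus the input pool.

    Strains with zero reads in either pool are skipped in this simple sketch.
    """
    sorted_total = sum(sorted_counts.values())
    input_total = sum(input_counts.values())
    result = {}
    for strain, n_input in input_counts.items():
        n_sorted = sorted_counts.get(strain, 0)
        if n_sorted == 0 or n_input == 0:
            continue
        result[strain] = math.log2((n_sorted / sorted_total) / (n_input / input_total))
    return result


# Hypothetical barcodes and reads for illustration only.
lookup = {"AAAA": "strainA", "CCCC": "strainB"}
input_pool = count_barcodes(["AAAA", "AAAA", "CCCC", "CCCC", "GGGG"], lookup)
sorted_pool = count_barcodes(["AAAA", "AAAA", "AAAA", "CCCC"], lookup)
enrichment = log2_enrichment(sorted_pool, input_pool)
print(enrichment)
```

A positive value marks a strain over-represented in the sorted (for example, high-GFP) fraction, which is the kind of signal such a screen looks for.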

  16. Implications of using whole genome sequencing to test unselected populations for high risk breast cancer genes: a modelling study

    Warren-Gash, Charlotte; Kroese, Mark; Burton, Hilary; Pharoah, Paul

    2016-01-01

    Background The decision to test for high risk breast cancer gene mutations is traditionally based on risk scores derived from age, family and personal cancer history. Next generation sequencing technologies such as whole genome sequencing (WGS) make wider population testing more feasible. In the UK’s 100,000 Genomes Project, mutations in 16 genes including BRCA1 and BRCA2 are to be actively sought regardless of clinical presentation. The implications of deploying this approach at scale for pa...

  17. Genetic Diversity of Toxoplasma gondii Strains from Different Hosts and Geographical Regions by Sequence Analysis of GRA20 Gene

    Ning, Hong-Rui; Huang, Si-Yang; Wang, Jin-Lei; Xu, Qian-Ming; Zhu, Xing-Quan

    2015-01-01

    Toxoplasma gondii is a eukaryotic parasite of the phylum Apicomplexa, which infects all warm-blooded animals, including humans. In the present study, we examined sequence variation in dense granule 20 (GRA20) genes among T. gondii isolates collected from different hosts and geographical regions worldwide. The complete GRA20 genes were amplified from 16 T. gondii isolates using PCR, sequences were analyzed, and phylogenetic reconstruction was analyzed by maximum parsimony (MP) and maximum likelih...

  18. Cloning, sequencing, and expression of the gene encoding amylopullulanase from Pyrococcus furiosus and biochemical characterization of the recombinant enzyme.

    Dong, G.; Vieille, C; Zeikus, J G

    1997-01-01

    The gene encoding the Pyrococcus furiosus hyperthermophilic amylopullulanase (APU) was cloned, sequenced, and expressed in Escherichia coli. The gene encoded a single 827-residue polypeptide with a 26-residue signal peptide. The protein sequence had very low homology (17 to 21% identity) with other APUs and enzymes of the alpha-amylase family. In particular, none of the consensus regions present in the alpha-amylase family could be identified. P. furiosus APU showed similarity to three protei...

  19. Nucleotide sequence homology between the heat-labile enterotoxin gene of Escherichia coli and Vibrio cholerae deoxyribonucleic acid.

    Moseley, S L; Falkow, S

    1980-01-01

    Isolated deoxyribonucleic acid fragments encoding the heat-labile enterotoxin of Escherichia coli were used to probe for homologous sequences in restricted whole-cell deoxyribonucleic acid from Vibrio cholerae. Significant sequence homology between the heat-labile enterotoxin gene and V. cholerae deoxyribonucleic acid was demonstrated, and apparent differences were observed in the organization of the cholera toxin gene among different strains of V. cholerae.

  20. A Plasmid Bearing the bla(CTX-M-15) Gene and Phage P1-Like Sequences from a Sequence Type 11 Klebsiella pneumoniae Isolate.

    Shin, Juyoun; Ko, Kwan Soo

    2015-10-01

    Plasmid pKP12226 was extracted and analyzed from a CTX-M-15-producing Klebsiella pneumoniae sequence type 11 (ST11) isolate collected in South Korea. The plasmid represents chimeric characteristics consisting of a pIP1206-like backbone and lysogenized phage P1-like sequences. It bears a resistance region that includes resistance genes to several antibiotics and is different from previously characterized plasmids from South Korea bearing blaCTX-M-15. It may have resulted from recombination between an Escherichia coli plasmid backbone, a blaCTX-M-15-bearing resistance region, and lysogenized phage P1-like sequences. PMID:26195513

  1. Sequence analysis of the msp4 gene of Anaplasma ovis strains

    de la Fuente, J.; Atkinson, M.W.; Naranjo, V.; Fernandez de Mera, I. G.; Mangold, A.J.; Keating, K.A.; Kocan, K.M.

    2007-01-01

    Anaplasma ovis (Rickettsiales: Anaplasmataceae) is a tick-borne pathogen of sheep, goats and wild ruminants. The genetic diversity of A. ovis strains has not been well characterized due to the lack of sequence information. In this study, we evaluated bighorn sheep (Ovis canadensis) and mule deer (Odocoileus hemionus) from Montana for infection with A. ovis by serology and sequence analysis of the msp4 gene. Antibodies to Anaplasma spp. were detected in 37% and 39% of bighorn sheep and mule deer analyzed, respectively. Four new msp4 genotypes were identified. The A. ovis msp4 sequences identified herein were analyzed together with sequences reported previously for the characterization of the genetic diversity of A. ovis strains in comparison with other Anaplasma spp. The results of these studies demonstrated that although A. ovis msp4 genotypes may vary among geographic regions and between sheep and deer hosts, the variation observed was less than the variation observed between A. marginale and A. phagocytophilum strains. The results reported herein further confirm that A. ovis infection occurs in natural wild ruminant populations in the western United States and that bighorn sheep and mule deer may serve as wildlife reservoirs of A. ovis. © 2006.

  2. Detection and Quantification of Mosaic Mutations in Disease Genes by Next-Generation Sequencing.

    Qin, Lan; Wang, Jing; Tian, Xia; Yu, Hui; Truong, Cavatina; Mitchell, John J; Wierenga, Klaas J; Craigen, William J; Zhang, Victor Wei; Wong, Lee-Jun C

    2016-05-01

    The identification of mosaicism is important in establishing a disease diagnosis, assessing recurrence risk, and genetic counseling. Next-generation sequencing (NGS) with deep sequence coverage enhances sensitivity and allows for accurate quantification of the level of mosaicism. NGS identifies low-level mosaicism that would be undetectable by conventional Sanger sequencing. A customized DNA probe library was used for capturing targeted genes, followed by deep NGS analysis. The mean coverage depth per base was approximately 800×. The NGS sequence data were analyzed for single-nucleotide variants and copy number variations. Mosaic mutations in 10 cases/families were detected and confirmed by NGS analysis. Mosaicism was identified for autosomal dominant (JAG1, COL3A1), autosomal recessive (PYGM), and X-linked (PHKA2, PDHA1, OTC, and SLC6A8) disorders. The mosaicism was identified either in one or more tissues from the probands or in a parent of an affected child. When analyzing data from patients with unusual testing results or inheritance patterns, it is important to further evaluate the possibility of mosaicism. Deep NGS analysis not only provides insights into the spectrum of mosaic mutations but also underlines the importance of the detection of mosaicism as an integral part of clinical molecular diagnosis and genetic counseling. PMID:26944031
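
    The quantification step described above can be illustrated with a minimal sketch. It assumes an autosomal variant, for which a constitutional heterozygote is expected to show about half of the reads carrying the variant; the read counts and function names below are hypothetical and are not taken from the study.

```python
from math import comb


def variant_allele_fraction(alt_reads: int, depth: int) -> float:
    """Fraction of reads at a position that carry the variant allele."""
    return alt_reads / depth


def lower_tail_probability(alt_reads: int, depth: int) -> float:
    """Exact P(X <= alt_reads) for X ~ Binomial(depth, 0.5).

    A very small value means the observed variant read count is much lower
    than expected for a constitutional heterozygote, which is compatible
    with (but does not prove) mosaicism.
    """
    favourable = sum(comb(depth, k) for k in range(alt_reads + 1))
    return favourable / 2 ** depth


# Hypothetical position covered at 800x, the mean depth quoted in the abstract.
depth = 800
mosaic_like = variant_allele_fraction(80, depth)    # 80 variant reads
germline_like = variant_allele_fraction(400, depth)  # 400 variant reads
p_mosaic = lower_tail_probability(80, depth)
p_germline = lower_tail_probability(400, depth)
```

    At this depth a 10% variant fraction is far below the heterozygous expectation, which is the kind of low-level signal the abstract says Sanger sequencing would miss. X-linked variants in males and copy number changes would need a different expectation than 0.5.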

  3. Detection of sequences homologous to human retroviral DNA in multiple sclerosis by gene amplification

    Twenty-one patients with multiple sclerosis, chronic progressive type, were examined for DNA sequences homologous to a human retrovirus. Genomic DNA from peripheral blood mononuclear cells was analyzed for the presence of sequences homologous to the human T-cell leukemia/lymphoma virus type I (HTLV-I) long terminal repeat, 3' gag, pol, and env domains by the enzymatic in vitro gene amplification technique, polymerase chain reaction. Positive identification of homologous pol sequences was made in the amplified DNA from six of these patients (29%). Three of these six patients (14%) also tested positive for the env region, but not for the other regions tested. In contrast, none of the samples from 35 normal individuals studied was positive when amplified and tested with the same primers and probes. Comparison of patterns obtained from controls and from patients with adult T-cell leukemia or tropical spastic paraparesis suggests that the DNA sequences identified are exogenous to the human genome and may correspond to a human retroviral species. The data support the detection of a human retroviral agent in some patients with multiple sclerosis.

  4. Phylogeny of the cuttlefishes (Mollusca:Cephalopoda) based on mitochondrial COI and 16S rRNA gene sequence data

    LIN Xiangzhi; ZHENG Xiaodong; XIAO Shu; WANG Rucai

    2004-01-01

    To clarify cuttlefish phylogeny, the mitochondrial cytochrome c oxidase subunit I (COI) gene and partial 16S rRNA gene were sequenced for 13 cephalopod species. Phylogenetic trees were constructed with the neighbor-joining method. Coleoids are divided into two main lineages, Decabrachia and Octobrachia. The monophyly of the order Sepioidea, which includes the families Sepiidae, Sepiolidae and Idiosepiidae, is not supported. Of the two families of Sepioidea examined, the Sepiolidae are polyphyletic and are excluded from the order. On the basis of 16S rRNA and COI amino acid sequence data, the two genera (Sepiella and Sepia) from the Sepiidae can be distinguished, but they show no clear boundary when COI gene sequences are used. The reason is explained. This suggests that the 16S rDNA of cephalopods is a valuable tool for analyzing taxonomic relationships at the genus level, and that the COI gene is better suited to a higher taxonomic level (i.e., family).

  5. Analysis of unstable DNA sequence in FMR1 gene in Polish families with fragile X syndrome

    The unstable DNA sequence in the FMR1 gene was analyzed in 85 individuals from Polish families with fragile X syndrome in order to characterize mutations responsible for the disease in Poland. In all affected individuals classified on the basis of clinical features and expression of the fragile site at X(q27.3), a large expansion of the unstable sequence (full mutation) was detected. About 5% (2 of 43) of individuals with full mutation did not express the fragile site. Among normal alleles, ranging in size from 20 to 41 CGG repeats, the allele with 29 repeats was the most frequent (37%). Transmission of premutated and fully mutated alleles to the offspring was always associated with size increase. No change in repeat number was found when normal alleles were transmitted. (author). 19 refs., 4 figs, 1 tab

  6. Cis-acting sequences from a human surfactant protein gene confer pulmonary-specific gene expression in transgenic mice

    Korfhagen, T.R.; Glasser, S.W.; Wert, S.E.; Bruno, M.D.; Daugherty, C.C.; McNeish, J.D.; Stock, J.L.; Potter, S.S.; Whitsett, J.A. (Cincinnati College of Medicine, OH (USA))

    1990-08-01

    Pulmonary surfactant is produced in late gestation by developing type II epithelial cells lining the alveolar epithelium of the lung. Lack of surfactant at birth is associated with respiratory distress syndrome in premature infants. Surfactant protein C (SP-C) is a highly hydrophobic peptide isolated from pulmonary tissue that enhances the biophysical activity of surfactant phospholipids. Like surfactant phospholipid, SP-C is produced by epithelial cells in the distal respiratory epithelium, and its expression increases during the latter part of gestation. A chimeric gene containing 3.6 kilobases of the promoter and 5′-flanking sequences of the human SP-C gene was used to express diphtheria toxin A. The SP-C-diphtheria toxin A fusion gene was injected into fertilized mouse eggs to produce transgenic mice. Affected mice developed respiratory failure in the immediate postnatal period. Morphologic analysis of lungs from affected pups showed variable but severe cellular injury confined to pulmonary tissues. Ultrastructural changes consistent with cell death and injury were prominent in the distal respiratory epithelium. Proximal components of the tracheobronchial tree were not severely affected. Transgenic animals were of normal size at birth, and structural abnormalities were not detected in nonpulmonary tissues. Lung-specific diphtheria toxin A expression controlled by the human SP-C gene injured type II epithelial cells and caused extensive necrosis of the distal respiratory epithelium. The absence of type I epithelial cells in the most severely affected transgenic animals supports the concept that developing type II cells serve as precursors to type I epithelial cells.

  7. Characterization of the bovine pregnancy-associated glycoprotein gene family – analysis of gene sequences, regulatory regions within the promoter and expression of selected genes

    Walker Angela M

    2009-04-01

    Background: The pregnancy-associated glycoproteins (PAGs) belong to a large family of aspartic peptidases expressed exclusively in the placenta of species in the Artiodactyla order. In cattle, the PAG gene family comprises at least 22 transcribed genes, as well as some variants. Phylogenetic analyses have shown that the PAG family segregates into 'ancient' and 'modern' groupings. Along with sequence differences between family members, there are clear distinctions in their spatio-temporal distribution and in their relative level of expression. In this report, (1) we performed an in silico analysis of the bovine genome to further characterize the PAG gene family, (2) we scrutinized proximal promoter sequences of the PAG genes to evaluate the evolutionary pressures operating on them and to identify putative regulatory regions, (3) we determined relative transcript abundance of selected PAGs during pregnancy and (4) we performed preliminary characterization of the putative regulatory elements for one of the candidate PAGs, bovine (bo) PAG-2. Results: From our analysis of the bovine genome, we identified 18 distinct PAG genes and 14 pseudogenes. We observed that the first 500 base pairs upstream of the translational start site contained multiple regions that are conserved among all boPAGs. However, a preponderance of conserved regions that harbor recognition sites for putative transcriptional factors (TFs) were found to be unique to the modern boPAG grouping, but not the ancient boPAGs. We gathered evidence by means of Q-PCR and screening of EST databases to show that boPAG-2 is the most abundant of all boPAG transcripts. Finally, we provided preliminary evidence for the role of ETS- and DDVL-related TFs in the regulation of the boPAG-2 gene. Conclusion: PAGs represent a relatively large gene family in the bovine genome. The proximal promoter regions of these genes display differences in putative TF binding sites, likely contributing to observed

  8. Partial Sequence Analysis of Merozoite Surface Protein-3α Gene in Plasmodium vivax Isolates from Malarious Areas of Iran

    H Mirhendi

    2008-12-01

    Background: Approximately 85-90% of malaria infections in Iran are attributed to Plasmodium vivax, while little is known about the genetics of the parasite and its strain types in this region. This study was designed to describe the genetic characteristics of the Plasmodium vivax population of Iran based on the merozoite surface protein-3α gene sequence. Methods: In a descriptive study, we analyzed partial P. vivax merozoite surface protein-3α gene sequences from 17 clinical P. vivax isolates collected from malarious areas of Iran. Genomic DNA was extracted with the QIAamp® DNA blood mini kit and amplified through nested PCR for a partial nucleotide sequence of the PvMSP-3 gene in P. vivax. PCR-amplified products were sequenced with an ABI Prism Perkin-Elmer 310 sequencer and the data were analyzed with Clustal W software. Results: Analysis of PvMSP-3 gene sequences demonstrated extensive polymorphisms, but the sequence identity between isolates of the same type was relatively high. We identified specific insertions and deletions for the type A, B and C variants of P. vivax in our isolates. In a phylogenetic comparison of geographically separated isolates, there was no significant geographical branching of the parasite populations. Conclusion: The highly polymorphic nature of the isolates suggests that more investigations of the PvMSP-3 gene are needed to explore its vaccine potential.

  9. New splicing mutation in the choline kinase beta (CHKB) gene causing a muscular dystrophy detected by whole-exome sequencing.

    Oliveira, Jorge; Negrão, Luís; Fineza, Isabel; Taipa, Ricardo; Melo-Pires, Manuel; Fortuna, Ana Maria; Gonçalves, Ana Rita; Froufe, Hugo; Egas, Conceição; Santos, Rosário; Sousa, Mário

    2015-06-01

    Muscular dystrophies (MDs) are a group of hereditary muscle disorders that include two particularly heterogeneous subgroups: limb-girdle MD and congenital MD, linked to 52 different genes (seven common to both subgroups). Massive parallel sequencing technology may avoid the usual stepwise gene-by-gene analysis. We report the whole-exome sequencing (WES) analysis of a patient with childhood-onset progressive MD, also presenting mental retardation and dilated cardiomyopathy. Conventional sequencing had excluded eight candidate genes. WES of the trio (patient and parents) was performed using the Ion Proton sequencing system. Data analysis, which relied on filtering steps using the GEMINI software, revealed a novel silent variant in the choline kinase beta (CHKB) gene. Inspection of sequence alignments ultimately identified the causal variant (CHKB:c.1031+3G>C). This splice site mutation was confirmed using Sanger sequencing and its effect was further evaluated with gene expression analysis. On reassessment of the muscle biopsy, typical abnormal mitochondrial oxidative changes were observed. Mutations in CHKB have been shown to cause phosphatidylcholine deficiency in myofibers, causing a rare form of CMD (only 21 patients reported). Notwithstanding interpretative difficulties that need to be overcome before the integration of WES in the diagnostic workflow, this work corroborates its utility in solving cases from highly heterogeneous groups of diseases, in which conventional diagnostic approaches fail to provide a definitive diagnosis. PMID:25740612

  10. The complete nucleotide sequence of the rat 18S ribosomal RNA gene and comparison with the respective yeast and frog genes.

    Torczynski, R; Bollon, A P; Fuke, M

    1983-01-01

    The complete nucleotide sequence of the rat 18S ribosomal RNA gene has been determined. A comparison of the rat 18S ribosomal RNA gene sequence with the known sequences of yeast and frog revealed three conserved (stable) regions, two unstable regions, and three large inserts. (A,T) → (G,C) changes were more frequent than (G,C) → (A,T) changes for three comparisons (yeast → frog, frog → rat, and yeast → rat). GC pairs were inserted preferentially over AT pair...

  11. Nucleotide sequence of the plasminogen activator gene of Yersinia pestis: relationship to ompT of Escherichia coli and gene E of Salmonella typhimurium.

    Sodeinde, O A; Goguen, J.D.

    1989-01-01

    We have determined the nucleotide sequence of the 1.4-kilobase DNA fragment containing the plasminogen activator gene (pla) of Yersinia pestis, which determines both plasminogen activator and coagulase activities of the species. The sequence revealed the presence of a 936-base-pair open reading frame that constitutes the pla gene. This reading frame encodes a 312-amino-acid protein of 34.6 kilodaltons and containing a putative 20-amino-acid signal sequence. The presence of a single large open...

  12. 16S rRNA gene sequencing in routine identification of anaerobic bacteria isolated from blood cultures

    Justesen, Ulrik Stenz; Skov, Marianne Nielsine; Knudsen, Elisa;

    2010-01-01

    A comparison between conventional identification and 16S rRNA gene sequencing of anaerobic bacteria isolated from blood cultures in a routine setting was performed (n = 127). With sequencing, 89% were identified to the species level, versus 52% with conventional identification. The times for identification were 1.5 days and 2.8 days, respectively.

  13. Sequence and structural requirements for high-affinity DNA binding by the WT1 gene product.

    Nakagama, H; Heinrich, G.; Pelletier, J; Housman, D E

    1995-01-01

    The Wilms' tumor suppressor gene, WT1, encodes a zinc finger polypeptide which plays a key role in regulating cell growth and differentiation in the urogenital system. Using the whole-genome PCR approach, we searched murine genomic DNA for high-affinity WT1 binding sites and identified a 10-bp motif, 5'GCGTGGGAGT3' (which we term WTE). The WTE motif is similar to the consensus binding sequence 5'GCG(G/T)GGGCG3' recognized by EGR-1 and is also suggested to function as a binding site for WT1, settin...

  14. GeneWiz browser: An Interactive Tool for Visualizing Sequenced Chromosomes

    Hallin, Peter Fischer; Stærfeldt, Hans Henrik; Rotenberg, Eva;

    2009-01-01

    readability and increased functionality compared to other browsers. The tool allows the user to select the display of various genomic features, color setting and data ranges. Custom numerical data can be added to the plot, allowing for example visualization of gene expression and regulation data. Further......, standard atlases are pre-generated for all prokaryotic genomes available in GenBank, providing a fast overview of all available genomes, including recently deposited genome sequences. The tool is available online from http://www.cbs.dtu.dk/services/gwBrowser. [Supplemental material including interactive...

  15. Cloning and nucleotide sequence of the Salmonella typhimurium dcp gene encoding dipeptidyl carboxypeptidase.

    Hamilton, S.; Miller, C G

    1992-01-01

    Plasmids carrying the Salmonella typhimurium dcp gene were isolated from a pBR328 library of Salmonella chromosomal DNA by screening for complementation of a peptide utilization defect conferred by a dcp mutation. Strains carrying these plasmids overproduced dipeptidyl carboxypeptidase approximately 50-fold. The nucleotide sequence of a 2.8-kb region of one of these plasmids contained an open reading frame coding for a protein of 77,269 Da, in agreement with the 80-kDa size for dipeptidyl car...

  16. Methylation-sensitive linking libraries enhance gene-enriched sequencing of complex genomes and map DNA methylation domains

    Bharti Arvind K

    2008-12-01

    Background: Many plant genomes are resistant to whole-genome assembly due to an abundance of repetitive sequence, leading to the development of gene-rich sequencing techniques. Two such techniques are hypomethylated partial restriction (HMPR) and methylation spanning linker libraries (MSLL). These libraries differ from other gene-rich datasets in having larger insert sizes, and the MSLL clones are designed to provide reads localized to "epigenetic boundaries" where methylation begins or ends. Results: A large-scale study in maize generated 40,299 HMPR sequences and 80,723 MSLL sequences, including MSLL clones exceeding 100 kb. The paired end reads of MSLL and HMPR clones were shown to be effective in linking existing gene-rich sequences into scaffolds. In addition, it was shown that the MSLL clones can be used for anchoring these scaffolds to a BAC-based physical map. The MSLL end reads effectively identified epigenetic boundaries, as indicated by their preferential alignment to regions upstream and downstream from annotated genes. The ability to precisely map long stretches of fully methylated DNA sequence is a unique outcome of MSLL analysis, and was also shown to provide evidence for errors in gene identification. MSLL clones were observed to be significantly more repeat-rich in their interiors than in their end reads, confirming the correlation between methylation and retroelement content. Both MSLL and HMPR reads were found to be substantially gene-enriched, with the SalI MSLL libraries being the most highly enriched (31% align to an EST contig), while the HMPR clones exhibited exceptional depletion of repetitive DNA (to ~11%). These two techniques were compared with other gene-enrichment methods, and shown to be complementary. Conclusion: MSLL technology provides an unparalleled approach for mapping the epigenetic status of repetitive blocks and for identifying sequences mis-identified as genes. Although the types and natures of

  17. Maximal sequence length of exact match between members from a gene family during early evolution

    WEN Xiao; GUO Xing-yi; FAN Long-jiang

    2005-01-01

    Mutation (substitution, deletion, insertion, etc.) in nucleic acid sequences causes the maximal sequence length of exact match (MALE) between paralogous members from a duplication event to become shorter during evolution. In this work, MALE changes between members of 26 gene families from four representative species (Arabidopsis thaliana, Oryza sativa, Mus musculus and Homo sapiens) were investigated. A comparative study of paralogous MALE and amino acid substitution rate (dA<0.5) indicated that a close relationship existed between them. The results suggested that MALE could be a sound evolutionary scale for the divergence time of paralogous genes during their early evolution. A reference table between MALE and divergence time for the four species was set up, which would be widely useful for large-scale genome alignment and comparison. As an example, detection of large-scale duplication events in the rice genome based on the table was illustrated.
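
    The quantity described above, the longest stretch over which two paralogous sequences match exactly, can be sketched as a longest-common-substring computation. This is an illustrative reading of the definition, not the authors' implementation; the function name and the toy sequences are hypothetical.

```python
def maximal_exact_match(seq_a: str, seq_b: str) -> int:
    """Return the length of the longest substring shared by both sequences."""
    best = 0
    # prev[j] is the length of the exact match ending at the previous base
    # of seq_a and at seq_b[j - 1].
    prev = [0] * (len(seq_b) + 1)
    for base_a in seq_a:
        curr = [0] * (len(seq_b) + 1)
        for j, base_b in enumerate(seq_b, start=1):
            if base_a == base_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


# Toy duplicate pair: the second copy carries a single substitution (G -> C).
ancestral = "ATGGCGTACGTTAGC"
diverged = "ATGGCGTACCTTAGC"

male_identical = maximal_exact_match(ancestral, ancestral)
male_diverged = maximal_exact_match(ancestral, diverged)
```

    A single substitution already splits the full-length match into two shorter runs, which is why the measure shrinks as paralogs accumulate mutations. The dynamic-programming form is quadratic in sequence length; genome-scale comparisons would more plausibly use suffix-based indexes.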

  18. Sequence characterization of heat shock protein gene of Cyclospora cayetanensis isolates from Nepal, Mexico, and Peru.

    Sulaiman, Irshad M; Torres, Patricia; Simpson, Steven; Kerdahi, Khalil; Ortega, Ynes

    2013-04-01

    We have described the development of a 2-step nested PCR protocol based on the characterization of the 70-kDa heat shock protein (HSP70) gene for rapid detection of the human-pathogenic Cyclospora cayetanensis parasite. We tested and validated these newly designed primer sets by PCR amplification followed by nucleotide sequencing of PCR-amplified HSP70 fragments belonging to 16 human C. cayetanensis isolates from 3 different endemic regions that include Nepal, Mexico, and Peru. No genetic polymorphism was observed among the isolates at the characterized regions of the HSP70 locus. This newly developed HSP70 gene-based nested PCR protocol provides another useful genetic marker for the rapid detection of C. cayetanensis in the future. PMID:22924935

  19. Steroid induction of a peptide hormone gene leads to orchestration of a defined behavioral sequence.

    Zitnan, D; Ross, L S; Zitnanova, I; Hermesman, J L; Gill, S S; Adams, M E

    1999-07-01

    At the end of each molt, insects shed the old cuticle by performing preecdysis and ecdysis behaviors. Regulation of these centrally patterned movements involves peptide signaling between endocrine Inka cells and the CNS. In Inka cells, we have identified the cDNA and gene encoding preecdysis-triggering hormone (PETH) and ecdysis-triggering hormone (ETH), which activate these behaviors. Prior to behavioral onset, rising ecdysteroid levels induce expression of the ecdysone receptor (EcR) and ETH gene in Inka cells and evoke CNS sensitivity to PETH and ETH. Subsequent ecdysteroid decline is required for peptide release, which initiates three motor patterns in specific order: PETH triggers preecdysis I, while ETH activates preecdysis II and ecdysis. The Inka cell provides a model for linking steroid regulation of peptide hormone expression and release with activation of a defined behavioral sequence. PMID:10433264

  20. Enhancer sequences of a retroviral vector determine expression of a gene in multipotent hematopoietic progenitors and committed erythroid cells.

    Holland, C A; Anklesaria, P; Sakakeeny, M A; Greenberger, J.S.

    1987-01-01

    To analyze the transcriptional activity of retroviral enhancer sequences in hematopoietic lineages, we determined the effect of enhancer sequences on the expression of the neomycin resistance gene transferred by two retroviral vectors to primary hematopoietic lineages. We constructed the vector pFr-SV(X). The Moloney murine leukemia virus enhancer region of a vector, pZIP-SV(X), was replaced by a 380-nucleotide-long fragment containing the enhancer sequences of the Friend murine leukemia viru...

  1. Stem loop sequences specific to transposable element IS605 are found linked to lipoprotein genes in Borrelia plasmids.

    Nicholas Delihas

    BACKGROUND: Plasmids of Borrelia species are dynamic structures that contain a large number of repetitive genes, gene fragments, and gene fusions. In addition, the transposable element IS605/200 family, as well as degenerate forms of this IS element, are prevalent. In Helicobacter pylori, flanking regions of the IS605 transposase gene contain sequences that fold into identical small stem loops. These function in transposition at the single-stranded DNA level. METHODOLOGY/PRINCIPAL FINDINGS: In work reported here, bioinformatics techniques were used to scan Borrelia plasmid genomes for IS605 transposable element specific stem loop sequences. Two variant stem loop motifs are found in the left and right flanking regions of the transposase gene. Both motifs appear to have dispersed in plasmid genomes and are found "free-standing" and phylogenetically conserved without the associated IS605 transposase gene or the adjacent flanking sequence. Importantly, IS605 specific stem loop sequences are also found at the 3' ends of lipoprotein genes (PFam12 and PFam60); however, the left and right sequences appear to develop their own evolutionary patterns. The lipoprotein gene-linked left stem loop sequences maintain the IS605 stem loop motif in orthologs but only at the RNA level. These show mutations whereby variants fold into phylogenetically conserved RNA-type stem loops that contain the wobble non-Watson-Crick G-U base-pairing. The right flanking sequence is associated with the family lipoprotein-1 genes. A comparison of homologs shows that the IS605 stem loop motif rapidly dissipates, but a more elaborate secondary structure appears to develop in its place. CONCLUSIONS/SIGNIFICANCE: Stem loop sequences specific to the transposable element IS605 are present in plasmid regions devoid of a transposase gene and, significantly, are found linked to lipoprotein genes in Borrelia plasmids. These sequences are evolutionarily conserved and/or structurally developed in

  2. Shotgun Metagenomic Sequencing Reveals Functional Genes and Microbiome Associated with Bovine Digital Dermatitis.

    Martin Zinicola

    Metagenomic methods amplifying 16S ribosomal RNA genes have been used to describe the microbial diversity of healthy skin and lesion stages of bovine digital dermatitis (DD) and to detect critical pathogens involved in disease pathogenesis. In this study, we characterized the microbiome and, for the first time, the composition of functional genes of healthy skin (HS) and of active (ADD) and inactive (IDD) lesion stages using a whole-genome shotgun approach. Metagenomic sequences were annotated using the MG-RAST pipeline. Six phyla were identified as the most abundant. Firmicutes and Actinobacteria were the predominant bacterial phyla in the microbiome of HS, while Spirochetes, Bacteroidetes and Proteobacteria were highly abundant in ADD and IDD. T. denticola-like, T. vincentii-like and T. phagedenis-like organisms constituted the most abundant species in ADD and IDD. Recruitment plots comparing sequences from HS, ADD and IDD samples to the genomes of specific Treponema spp. supported the presence of T. denticola and T. vincentii in ADD and IDD. Comparison of the functional composition of HS to ADD and IDD identified a significant difference in genes associated with motility/chemotaxis and iron acquisition/metabolism. We also provide evidence that the microbiome of ADD and IDD, compared to that of HS, had significantly higher abundance of genes associated with resistance to copper and zinc, which are commonly used in footbaths to prevent and control DD. In conclusion, the results from this study provide new insights into the HS, ADD and IDD microbiomes, improve our understanding of the disease pathogenesis and generate unprecedented knowledge regarding the functional genetic composition of the digital dermatitis microbiome.

  3. The ovine respiratory syncytial virus F gene sequence and its diagnostic application.

    Eleraky, N Z; Kania, S A; Potgieter, L N

    2001-11-01

    Ruminant respiratory syncytial viruses (RSVs) are classified into 2 subgroups, ovine RSV and bovine RSV. Although ovine RSV infects cattle, its contribution to bovine respiratory tract disease has not been established, which is an important issue for vaccine development in cattle. Diagnosis by virus isolation or serology has low or variable sensitivity and/or specificity and polymerase chain reaction (PCR) has been recommended as a rapid and sensitive technique for RSV detection. A simple procedure has been developed to detect and identify bovine and ovine RSVs. First, the nucleotide sequence of the ovine RSV fusion (F) gene was determined and compared with representative strains of bovine RSV and human RSV subgroups A and B. The ovine RSV F gene has 85 and 72-73% nucleotide identity with those of bovine RSV and human RSV, respectively. The predicted amino acid sequence of the ovine RSV F gene has 94 and 83-84% amino acid identity with those of bovine RSV and human RSV, respectively. Then PCR primers targeting a specific F gene fragment of bovine and ovine RSV were designed. The primers represented bases 85-103 and the complementary sequence to bases 510-493 of the ovine RSV F gene. A similar PCR product (426 bp) was obtained on agarose gel electrophoresis from bovine RSV and from ovine RSV. The products, however, were unique to the parent virus and could be distinguished by EcoRI or MspI restriction endonuclease cleavage. EcoRI cleaved the ovine product into 2 bands (285 and 141 bp) but failed to affect the bovine RSV PCR product. However, MspI cleaved the bovine product into 2 bands (229 and 197 bp) but had no effect on the ovine product. Also, this assay did not amplify any PCR product with human RSV. The reverse transcription-polymerase chain reaction (RT-PCR) followed by restriction enzyme digestion is a useful and practical approach for detection and differentiation of ruminant respiratory syncytial viruses. PMID:11724134
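
    The fragment sizes quoted above can be checked with a short worked example. The sketch assumes the primer coordinates are 1-based and inclusive, which is the reading that reproduces the reported 426-bp product; the function names and the decision rule are illustrative and not part of the published assay.

```python
def amplicon_length(forward_start: int, reverse_start: int) -> int:
    """Length of a PCR product given 1-based, inclusive outer primer positions."""
    return reverse_start - forward_start + 1


# Primers span bases 85-103 (forward) and 510-493 (reverse complement).
PRODUCT_BP = amplicon_length(85, 510)

# Digestion patterns reported in the abstract; an uncut product stays whole.
FRAGMENTS_BP = {
    "ovine RSV": {"EcoRI": (285, 141), "MspI": (PRODUCT_BP,)},
    "bovine RSV": {"EcoRI": (PRODUCT_BP,), "MspI": (229, 197)},
}


def classify(ecori_cuts: bool, mspi_cuts: bool) -> str:
    """Assign a virus from which enzyme cleaves the amplified product."""
    if ecori_cuts and not mspi_cuts:
        return "ovine RSV"
    if mspi_cuts and not ecori_cuts:
        return "bovine RSV"
    return "undetermined"


# Every digestion pattern must add back up to the undigested product.
fragments_consistent = all(
    sum(sizes) == PRODUCT_BP
    for patterns in FRAGMENTS_BP.values()
    for sizes in patterns.values()
)
```

    Both pairs of fragments sum to the full product, so the reported band sizes are internally consistent. A sample cut by both enzymes or by neither is left as "undetermined" here, since the abstract does not describe that case.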

  4. Identification of novel functional sequence variants in the gene for peptidase inhibitor 3

    Edwin Samuel

    2006-05-01

    Background: Peptidase inhibitor 3 (PI3) inhibits neutrophil elastase and proteinase-3, and has a potential role in skin and lung diseases as well as in cancer. Genome-wide expression profiling of chorioamniotic membranes revealed decreased expression of PI3 in women with preterm premature rupture of membranes. To elucidate the molecular mechanisms contributing to the decreased expression in amniotic membranes, the PI3 gene was searched for sequence variations and the functional significance of the identified promoter variants was studied. Methods: Single nucleotide polymorphisms (SNPs) were identified by direct sequencing of PCR products spanning a region from 1,173 bp upstream to 1,266 bp downstream of the translation start site. Fourteen SNPs were genotyped from 112 and nine SNPs from 24 unrelated individuals. Putative transcription factor binding sites detected by in silico search were verified by electrophoretic mobility shift assay (EMSA) using nuclear extracts from HeLa and amnion cells. Deviation from Hardy-Weinberg equilibrium (HWE) was tested by χ² goodness-of-fit test. Haplotypes were estimated using the expectation maximization (EM) algorithm. Results: Twenty-three sequence variations were identified by direct sequencing of polymerase chain reaction (PCR) products covering 2,439 nt of the PI3 gene (-1,173 nt of promoter sequences and all three exons). Analysis of 112 unrelated individuals showed that 20 variants had minor allele frequencies (MAF) ranging from 0.02 to 0.46, representing "true polymorphisms", while three had MAF ≤ 0.01. Eleven variants were in the promoter region; several putative transcription factor binding sites were found at these sites by database searches. Differential binding of transcription factors was demonstrated at two polymorphic sites by electrophoretic mobility shift assays, both in amniotic and HeLa cell nuclear extracts. Differential binding of the transcription factor GATA1 at -689C>G site
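
    The Hardy-Weinberg check mentioned in the Methods can be sketched for a single biallelic SNP as follows. This is a minimal illustration of a χ² goodness-of-fit test with one degree of freedom; the genotype counts are hypothetical (chosen only to total 112 individuals) and the function name is not taken from the study.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns the statistic and its p-value for one degree of freedom
    (three genotype classes, minus one, minus one estimated allele frequency).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts for 112 individuals.
chi2_balanced, p_balanced = hwe_chi_square(28, 56, 28)  # matches expectation
chi2_skewed, p_skewed = hwe_chi_square(56, 0, 56)       # no heterozygotes
```

    With rare alleles the expected count of minor-allele homozygotes becomes very small and the χ² approximation is unreliable; an exact test is commonly preferred in that situation.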

  5. Identification of FVIII gene mutations in patients with hemophilia A using new combinatorial sequencing by hybridization

    Chetta M

    2008-01-01

    Background: Standard methods of mutation detection are time consuming in Hemophilia A (HA), rendering their application unavailable in some analyses such as prenatal diagnosis. Objectives: To evaluate the feasibility of combinatorial sequencing-by-hybridization (cSBH) as an alternative and reliable tool for mutation detection in the FVIII gene. Patients/Methods: We have applied a new method of cSBH that uses two different colors for detection of multiple point mutations in the FVIII gene. The 26 exons encompassing the HA gene were analyzed in 7 newly diagnosed Italian patients and in 19 previously characterized individuals with FVIII deficiency. Results: Data show that, when solution-phase TAMRA and QUASAR labeled 5-mer oligonucleotide sets mixed with unlabeled target PCR templates are co-hybridized in the presence of DNA ligase to universal 6-mer oligonucleotide probe-based arrays, a number of mutations can be successfully detected. The technique was also reliable in identifying a mutant FVIII allele in an obligate heterozygote. A novel missense mutation (Leu1843Thr) in exon 16 and three novel neutral polymorphisms are presented with an updated protocol for 2-color cSBH. Conclusions: cSBH is a reliable tool for mutation detection in the FVIII gene and may represent a complementary method for the genetic screening of HA patients.

  6. Cloning and sequence analysis of putative type II fatty acid synthase genes from Arachis hypogaea L.

    Meng-Jun Li; Ai-Qin Li; Han Xia; Chuan-Zhi Zhao; Chang-Sheng Li; Shu-Bo Wan; Yu-Ping Bi; Xing-Jun Wang

    2009-06-01

    The cultivated peanut is a valuable source of dietary oil and ranks fifth among the world oil crops. Plant fatty acid biosynthesis is catalysed by type II fatty acid synthase (FAS) in plastids and mitochondria. By constructing a full-length cDNA library derived from immature peanut seeds and homology-based cloning, candidate genes of acyl carrier protein (ACP), malonyl-CoA:ACP transacylase, β-ketoacyl-ACP synthase (I, II, III), β-ketoacyl-ACP reductase, β-hydroxyacyl-ACP dehydrase and enoyl-ACP reductase were isolated. Sequence alignments revealed that primary structures of type II FAS enzymes were highly conserved in higher plants and the catalytic residues were strictly conserved in Escherichia coli and higher plants. Homologue numbers of each type II FAS gene expressed in developing peanut seeds varied from 1 in KASII, KASIII and HD to 5 in ENR. The number of single-nucleotide polymorphisms (SNPs) was quite different in each gene. Peanut type II FAS genes were predicted to target plastids except ACP2 and ACP3. The results suggested that peanut may contain two type II FAS systems in plastids and mitochondria. The type II FAS enzymes in higher plants may have similar functions as those in E. coli.

  7. Sequencing and functional annotation of the Bacillus subtilis genes in the 200 kb rrnB-dnaB region.

    Lapidus, A; Galleron, N; Sorokin, A; Ehrlich, S D

    1997-11-01

    The 200 kb region of the Bacillus subtilis chromosome spanning from 255 to 275 degrees on the genetic map was sequenced. The strategy applied, based on use of yeast artificial chromosomes and multiplex Long Accurate PCR, proved to be very efficient for sequencing a large bacterial chromosome area. A total of 193 genes of this part of the chromosome was classified by level of knowledge and biological category of their functions. Five levels of gene function understanding are defined. These are: (i) experimental evidence is available of gene product or biological function; (ii) strong homology exists for the putative gene product with proteins from other organisms; (iii) some indication of the function can be derived from homologies with known proteins; (iv) the gene product can be clustered with hypothetical proteins; (v) no indication on the gene function exists. The percentage of detected genes in each category was: 20, 28, 20, 15 and 17, respectively. In the sequenced region, a high percentage of genes are implicated in transport and metabolic linking of glycolysis and the citric acid cycle. A functional connection of several genes from this region and the genes close to 140 degrees in the chromosome was also observed. PMID:9387221

  8. Gene

    U.S. Department of Health & Human Services — Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes,...

  9. Sequence analysis and expression of the M1 and M2 matrix protein genes of hirame rhabdovirus (HIRRV)

    Nishizawa, T.; Kurath, G.; Winton, J.R.

    1997-01-01

    We have cloned and sequenced a 2318 nucleotide region of the genomic RNA of hirame rhabdovirus (HIRRV), an important viral pathogen of Japanese flounder Paralichthys olivaceus. This region comprises approximately two-thirds of the 3' end of the nucleocapsid protein (N) gene and the complete matrix protein (M1 and M2) genes with the associated intergenic regions. The partial N gene sequence was 812 nucleotides in length with an open reading frame (ORF) that encoded the carboxyl-terminal 250 amino acids of the N protein. The M1 and M2 genes were 771 and 700 nucleotides in length, respectively, with ORFs encoding proteins of 227 and 193 amino acids. The M1 gene sequence contained an additional small ORF that could encode a highly basic, arginine-rich protein of 25 amino acids. Comparisons of the N, M1, and M2 gene sequences of HIRRV with the corresponding sequences of the fish rhabdoviruses, infectious hematopoietic necrosis virus (IHNV) or viral hemorrhagic septicemia virus (VHSV) indicated that HIRRV was more closely related to IHNV than to VHSV, but was clearly distinct from either. The putative consensus gene termination sequence for IHNV and VHSV, AGAYAG(A)(7), was present in the N-M1, M1-M2, and M2-G intergenic regions of HIRRV as were the putative transcription initiation sequences YGGCAC and AACA. An Escherichia coli expression system was used to produce recombinant proteins from the M1 and M2 genes of HIRRV. These were the same size as the authentic M1 and M2 proteins and reacted with anti-HIRRV rabbit serum in western blots. These reagents can be used for further study of the fish immune response and to test novel control methods.

  10. Genome sequencing of a virulent avian Pasteurella multocida strain GX-Pm reveals the candidate genes involved in the pathogenesis.

    Yu, Chengjie; Sizhu, Suolang; Luo, Qingping; Xu, Xuewen; Fu, Lei; Zhang, Anding

    2016-04-01

    Pasteurella multocida (P. multocida) was first shown to be the causative agent of fowl cholera by Louis Pasteur in 1881. The first genomic study was performed on the avirulent avian strain Pm70, and by 2013 the genomes of two virulent avian strains, X73 and P1059, had been sequenced. Comparative genome studies supplied important information for further study of the pathogenesis of fowl cholera. In a previous study, a capsular serotype A strain, GX-Pm, was isolated from the liver of a chicken that died during an outbreak of fowl cholera in 2011. The strain showed multiple drug resistance and was highly virulent to chickens. Therefore, the present study performed genome sequencing and a comparative genomic analysis to reveal candidate genes involved in the virulence of P. multocida. The draft genome sequence of GX-Pm was 2,292,886 bp and contained 2941 protein-coding genes, 5 genomic islands, 4 IS elements and 2 prophage regions. Notably, all the predicted drug-resistance genes were located in predicted genomic islands. A comparative genome study of the virulent avian strains P1059, X73 and GX-Pm with the avirulent avian strain Pm70 indicated that 475 genes were found in at least one of the virulent strains but were absent from the avirulent strain. Among these genes, 20 were present in the genomes of all three virulent strains, including a few putative virulence genes. Further characterization of the pathogenic functions of these genes would benefit the understanding of the pathogenesis of fowl cholera. PMID:27033902
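
The presence/absence comparison described above (genes found in any virulent strain but not in the avirulent strain, and the subset shared by all virulent strains) reduces to set operations once genes have been grouped into comparable families. The following is a minimal sketch under that assumption; the function name and the gene identifiers are hypothetical, and the abstract does not state how orthologous genes were matched.

```python
def virulence_candidates(virulent, avirulent):
    """Compare gene-family sets between virulent strains and an avirulent strain.

    virulent:  dict mapping strain name -> set of gene-family IDs
    avirulent: set of gene-family IDs from the avirulent reference strain
    Returns (in_any, in_all): families absent from the avirulent strain that
    occur in at least one virulent strain, and in every virulent strain.
    """
    gene_sets = list(virulent.values())
    in_any = set().union(*gene_sets) - avirulent
    in_all = set.intersection(*gene_sets) - avirulent
    return in_any, in_all

# Toy data only; the identifiers are placeholders, not real P. multocida genes.
toy_virulent = {
    "P1059": {"fam1", "fam2", "fam3"},
    "X73": {"fam2", "fam3", "fam4"},
    "GX-Pm": {"fam2", "fam3", "fam5"},
}
toy_avirulent = {"fam1", "fam3"}
in_any, in_all = virulence_candidates(toy_virulent, toy_avirulent)
```

In the study's terms, `in_any` corresponds to the 475 genes and `in_all` to the 20 genes shared by all three virulent strains.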

  11. Gene Expression Versus Sequence for Predicting Function: Glia Maturation Factor Gamma Is Not A Glia Maturation Factor

    Michael G. Walker

    2003-01-01

    It is standard practice, whenever a researcher finds a new gene, to search databases for genes that have a similar sequence. It is not standard practice, whenever a researcher finds a new gene, to search for genes that have similar expression (coexpression). Failure to perform co-expression searches has led to incorrect conclusions about the likely function of new genes, and has led to wasted laboratory attempts to confirm functions incorrectly predicted. We present here the example of Glia Maturation Factor gamma (GMF-gamma). Despite its name, it has not been shown to participate in glia maturation. It is a gene of unknown function that is similar in sequence to GMF-beta. The sequence homology and chromosomal location led to an unsuccessful search for GMF-gamma mutations in glioma. We examined GMF-gamma expression in 1432 human cDNA libraries. Highest expression occurs in phagocytic, antigen-presenting and other hematopoietic cells. We found GMF-gamma mRNA in almost every tissue examined, with expression in nervous tissue no higher than in any other tissue. Our evidence indicates that GMF-gamma participates in phagocytosis in antigen-presenting cells. Searches for genes with similar sequences should be supplemented with searches for genes with similar expression to avoid incorrect predictions.

  12. Gene Expression Versus Sequence for Predicting Function: Glia Maturation Factor Gamma Is Not A Glia Maturation Factor

    Michael G. Walker

    2003-01-01

    It is standard practice, whenever a researcher finds a new gene, to search databases for genes that have a similar sequence. It is not standard practice, whenever a researcher finds a new gene, to search for genes that have similar expression (coexpression). Failure to perform co-expression searches has led to incorrect conclusions about the likely function of new genes, and has led to wasted laboratory attempts to confirm functions incorrectly predicted. We present here the example of Glia Maturation Factor gamma (GMF-gamma). Despite its name, it has not been shown to participate in glia maturation. It is a gene of unknown function that is similar in sequence to GMF-beta. The sequence homology and chromosomal location led to an unsuccessful search for GMF-gamma mutations in glioma. We examined GMF-gamma expression in 1432 human cDNA libraries. Highest expression occurs in phagocytic, antigen-presenting and other hematopoietic cells. We found GMF-gamma mRNA in almost every tissue examined, with expression in nervous tissue no higher than in any other tissue. Our evidence indicates that GMF-gamma participates in phagocytosis in antigen-presenting cells. Searches for genes with similar sequences should be supplemented with searches for genes with similar expression to avoid incorrect predictions.
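
A co-expression search of the kind advocated above can be sketched as ranking genes by how closely their expression profiles across libraries track that of a query gene. The abstract does not say which similarity measure was used; Pearson correlation is an assumption here, and the function names and toy profiles are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return sxy / math.sqrt(sxx * syy)

def coexpressed(query, profiles, top=5):
    """Return up to `top` (gene, r) pairs most correlated with `query`.

    profiles: dict mapping gene name -> list of expression values,
              one value per cDNA library, in the same library order.
    """
    q = profiles[query]
    scored = [(pearson(q, values), gene)
              for gene, values in profiles.items() if gene != query]
    scored.sort(reverse=True)
    return [(gene, r) for r, gene in scored[:top]]

# Toy profiles over four libraries (placeholder values, not real data).
toy_profiles = {
    "query_gene": [0, 5, 9, 1],
    "gene_a": [0, 10, 18, 2],
    "gene_b": [9, 1, 0, 5],
}
ranking = coexpressed("query_gene", toy_profiles, top=2)
```

Genes that rank highly and have known functions then suggest hypotheses about the query gene, which is the complement to sequence-similarity search that the abstract argues for.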

  13. cDNA cloning, sequence analysis, and chromosomal localization of the gene for human carnitine palmitoyltransferase.

    Finocchiaro, G; Taroni, F; Rocchi, M; Martin, A L; Colombo, I; Tarelli, G T; DiDonato, S

    1991-01-01

    We have cloned and sequenced a cDNA encoding human liver carnitine palmitoyltransferase (CPTase; palmitoyl-CoA:L-carnitine O-palmitoyltransferase, EC 2.3.1.21), an inner mitochondrial membrane enzyme that plays a major role in the fatty acid oxidation pathway. Mixed oligonucleotide primers whose sequences were deduced from one tryptic peptide obtained from purified CPTase were used in a polymerase chain reaction, allowing the amplification of a 0.12-kilobase fragment of human genomic DNA encoding such a peptide. A 60-base-pair (bp) oligonucleotide synthesized on the basis of the sequence from this fragment was used for the screening of a cDNA library from human liver and hybridized to a cDNA insert of 2255 bp. This cDNA contains an open reading frame of 1974 bp that encodes a protein of 658 amino acid residues including 25 residues of an NH2-terminal leader peptide. The assignment of this open reading frame to human liver CPTase is confirmed by matches to seven different amino acid sequences of tryptic peptides derived from pure human CPTase and by the 82.2% homology with the amino acid sequence of rat CPTase. The NH2-terminal region of CPTase contains a leucine-proline motif that is shared by carnitine acetyl- and octanoyltransferases and by choline acetyltransferase. The gene encoding CPTase was assigned to human chromosome 1, region 1q12-1pter, by hybridization of CPTase cDNA with a DNA panel of 19 human-hamster somatic cell hybrids. PMID:1988962
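
The relationship between the reported open reading frame length and protein length can be checked with simple arithmetic: each codon is three nucleotides. A minimal sketch follows; the function name is hypothetical, and it assumes the quoted 1974 bp counts coding codons only (no stop codon), which is what the reported 658 residues imply.

```python
def residues_from_orf(orf_bp):
    """Number of encoded residues for an ORF length given in base pairs.

    Assumes the length covers coding codons only (stop codon excluded).
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_bp // 3

precursor_residues = residues_from_orf(1974)   # 658, as reported
# If the 25-residue NH2-terminal leader peptide is cleaved (an assumption),
# the remaining chain would have 658 - 25 residues.
mature_residues = precursor_residues - 25      # 633
```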

  14. Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.

    Benslimane, A A; Dron, M; Hartmann, C; Rode, A.

    1986-01-01

    Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology between one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest t...

  15. L-asparaginase II of Escherichia coli K-12: cloning, mapping and sequencing of the ansB gene.

    Bonthron, D T

    1990-07-01

    The Escherichia coli gene ansB, encoding the chemotherapeutic enzyme L-asparaginase II, has been cloned, using a strategy based on the polymerase chain reaction, and sequenced. The amino acid (aa) sequence differs in eleven positions from the data previously derived by direct aa sequencing. A cleavable secretory signal peptide precedes the N terminus of the mature protein. The ansB gene maps to position 3114 kb on the physical map of E. coli [Kohara et al., Cell 50 (1987) 495-508], corresponding to approx. 63.8 min on the genetic map. PMID:2144836

  16. Molecular cloning, sequencing, and overexpression of the structural gene encoding the delta subunit of Escherichia coli DNA polymerase III holoenzyme.

    J.R. Carter; Franden, M A; Aebersold, R.; McHenry, C S

    1992-01-01

    Using an oligonucleotide hybridization probe, we have mapped the structural gene for the delta subunit of Escherichia coli DNA polymerase III holoenzyme to 14.6 centisomes of the chromosome. This gene, designated holA, was cloned and sequenced. The sequence of holA matches precisely four amino acid sequences obtained for the amino terminus of delta and three internal tryptic peptides. A holA-overproducing plasmid that directs the expression of delta up to 4% of the soluble protein was constru...

  17. Effects of using coding potential, sequence conservation and mRNA structure conservation for predicting pyrrolysine-containing genes

    Have, Christian Theil; Zambach, Sine; Christiansen, Henning

    2013-01-01

    suggested, but the structure does not seem to be present in all pyrrolysine incorporating genes. Results We propose a strategy to predict pyrrolysine encoding genes in genomes of archaea and bacteria. We cluster open reading frames interrupted by the amber codon based on sequence similarity. We rank these...... prediction of pyrrolysine incorporating genes in genomes of bacteria and archaea leading to insights about the factors driving pyrrolysine translation and identification of new gene candidates. The method predicts known conserved genes with high recall and predicts several other promising candidates for...... experimental verification. The method is implemented as a computational pipeline which is available on request....

  18. Transcriptome sequencing of Eucalyptus camaldulensis seedlings subjected to water stress reveals functional single nucleotide polymorphisms and genes under selection

    Thumma Bala R; Sharma Navin; Southerton Simon G

    2012-01-01

    Abstract. Background: Water stress limits plant survival and production in many parts of the world. Identification of genes and alleles responding to water stress conditions is important in breeding plants better adapted to drought. Currently there are no studies examining the transcriptome-wide gene and allelic expression patterns under water stress conditions. We used RNA sequencing (RNA-seq) to identify the candidate genes and alleles and to explore the evolutionary signatures of selection. ...

  19. Intronic and flanking sequences are required to silence enhancement of an embryonic beta-type globin gene.

    Wandersee, N J; Ferris, R C; Ginder, G D

    1996-01-01

    In the course of studying regulatory elements that affect avian embryonic rho-globin gene expression, the multipotential hematopoietic cell line K562 was transiently transfected with various rho-globin gene constructs containing or lacking an avian erythroid enhancer element. Enhanced levels of rho gene expression were seen from those constructs containing an enhancer element and minimal 5' or 3' flanking rho sequences but were not seen from enhancer-containing constructs that included extens...

  20. Comprehensive evaluation of AmpliSeq transcriptome, a novel targeted whole transcriptome RNA sequencing methodology for global gene expression analysis

    Li, Wenli; Turner, Amy; Aggarwal, Praful; Matter, Andrea; Storvick, Erin; Donna K Arnett; Broeckel, Ulrich

    2015-01-01

    Background: Whole transcriptome sequencing (RNA-seq) represents a powerful approach for whole transcriptome gene expression analysis. However, RNA-seq carries a few limitations, e.g., the requirement of a significant amount of input RNA and complications caused by non-specific mapping of short reads. The Ion AmpliSeq™ Transcriptome Human Gene Expression Kit (AmpliSeq) was recently introduced by Life Technologies as a whole-transcriptome, targeted gene quantification kit to overcome these limitati...