Sample records for 28s gene sequences

  1. Chromosomal localization and partial sequencing of the 18S and 28S ribosomal genes from Bradysia hygida (Diptera: Sciaridae).

    Gaspar, V P; Shimauti, E L T; Fernandez, M A


    In insects, ribosomal genes are usually detected in sex chromosomes, but have also or only been detected in autosomal chromosomes in some cases. Previous results from our research group indicated that in Bradysia hygida, nucleolus organizer regions were associated with heterochromatic regions of the autosomal C chromosome, using the silver impregnation technique. The present study confirmed this location of the ribosomal genes using fluorescence in situ hybridization analysis. This analysis also revealed the partial sequences of the 18S and 28S genes for this sciarid. The sequence alignment showed that the 18S gene has 98% identity to Corydalus armatus and 91% identity to Drosophila persimilis and Drosophila melanogaster. The partial sequence analysis of the 28S gene showed 95% identity with Bradysia amoena and 93% identity with Schwenckfeldina sp. These results confirmed the location of ribosomal genes of B. hygida in an autosomal chromosome, and the partial sequence analysis of the 18S and 28S genes demonstrated a high percentage of identity among several insect ribosomal genes.
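    The identity percentages quoted above come from pairwise sequence alignment. As a minimal sketch, assuming two already-aligned sequences and counting identity only over gap-free columns (alignment tools may use a different denominator), the calculation looks like this. The function name `percent_identity` and the toy sequences are hypothetical and are not data from the study.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, made up for illustration (not real 18S/28S data).
seq_a = "ACGTACGTAC-GT"
seq_b = "ACGTTCGTACAGT"
identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f}% identity")
```

    Here 11 of the 12 gap-free columns match, so the sketch reports roughly 91.7% identity.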

  2. Fungal community structure in disease suppressive soils assessed by 28S LSU gene sequencing.

    C Ryan Penton

    Natural biological suppression of soil-borne diseases is a function of the activity and composition of soil microbial communities. Soil microbe and phytopathogen interactions can occur prior to crop sowing and/or in the rhizosphere, subsequently influencing both plant growth and productivity. Research on suppressive microbial communities has concentrated on bacteria although fungi can also influence soil-borne disease. Fungi were analyzed in co-located soils 'suppressive' or 'non-suppressive' for disease caused by Rhizoctonia solani AG 8 at two sites in South Australia using 454 pyrosequencing targeting the fungal 28S LSU rRNA gene. DNA was extracted from a minimum of 125 g of soil per replicate to reduce the micro-scale community variability, and from soil samples taken at sowing and from the rhizosphere at 7 weeks to cover the peak Rhizoctonia infection period. A total of ~994,000 reads were classified into 917 genera covering 54% of the RDP Fungal Classifier database, a high diversity for an alkaline, low organic matter soil. Statistical analyses and community ordinations revealed significant differences in fungal community composition between suppressive and non-suppressive soil and between soil type/location. The majority of differences associated with suppressive soils were attributed to fewer than 40 genera including a number of endophytic species with plant pathogen suppression potentials and mycoparasites such as Xylaria spp. Non-suppressive soils were dominated by Alternaria, Gibberella and Penicillium. Pyrosequencing generated a detailed description of fungal community structure and identified candidate taxa that may influence pathogen-plant interactions in stable disease suppression.

  3. Chromosomal localization of the 18S-28S and 5S rRNA genes and (TTAGGG)n sequences of butterfly lizards (Leiolepis belliana belliana and Leiolepis boehmei, Agamidae, Squamata)

    Kornsorn Srikulnath


    Chromosomal mapping of the butterfly lizards Leiolepis belliana belliana and L. boehmei was done using the 18S-28S and 5S rRNA genes and telomeric (TTAGGG)n sequences. The karyotype of L. b. belliana was 2n = 36, whereas that of L. boehmei was 2n = 34. The 18S-28S rRNA genes were located at the secondary constriction of the long arm of chromosome 1, while the 5S rRNA genes were found in the pericentromeric region of chromosome 6 in both species. Hybridization signals for the (TTAGGG)n sequence were observed at the telomeric ends of all chromosomes, as well as interstitially at the same position as the 18S-28S rRNA genes in L. boehmei. This finding suggests that in L. boehmei telomere-to-telomere fusion probably occurred between chromosome 1 and a microchromosome where the 18S-28S rRNA genes were located or, alternatively, at the secondary constriction of chromosome 1. The absence of telomeric sequence signals in chromosome 1 of L. b. belliana suggested that its chromosomes may have only a few copies of the (TTAGGG)n sequence or that there may have been a gradual loss of the repeat sequences during chromosomal evolution.

  4. Phylogenetic analysis of ruminant Theileria spp. from China based on 28S ribosomal RNA gene.

    Gou, Huitian; Guan, Guiquan; Ma, Miling; Liu, Aihong; Liu, Zhijie; Xu, Zongke; Ren, Qiaoyun; Li, Youquan; Yang, Jifei; Chen, Ze; Yin, Hong; Luo, Jianxun


    Species identification using DNA sequences is the basis for DNA taxonomy. In this study, we sequenced the ribosomal large-subunit RNA genes (3,037-3,061 bp in length) of 13 Chinese Theileria stocks that were infective to cattle and sheep. The complete 28S rRNA gene is relatively difficult to amplify and its conserved region is not important for phylogenetic study. Therefore, we selected the D2-D3 region from the complete 28S rRNA sequences for phylogenetic analysis. Our analyses of 28S rRNA gene sequences showed that the 28S rRNA was useful as a phylogenetic marker for analyzing the relationships among Theileria spp. in ruminants. In addition, the D2-D3 region was a short segment that could be used instead of the whole 28S rRNA sequence during the phylogenetic analysis of Theileria, and it may be an ideal DNA barcode.

  5. Phylogenetic relationships of the marine Haplosclerida (Phylum Porifera) employing ribosomal (28S rRNA) and mitochondrial (cox1, nad1) gene sequence data.

    Niamh E Redmond

    The systematics of the poriferan Order Haplosclerida (Class Demospongiae) has been under scrutiny for a number of years without resolution. Molecular data suggest that the order needs revision at all taxonomic levels. Here, we provide a comprehensive view of the phylogenetic relationships of the marine Haplosclerida using many species from across the order, and three gene regions. Gene trees generated using 28S rRNA, nad1 and cox1 gene data, under maximum likelihood and Bayesian approaches, are highly congruent and suggest the presence of four clades. Clade A is comprised primarily of species of Haliclona and Callyspongia, and clade B is comprised of H. simulans and H. vansoesti (Family Chalinidae), Amphimedon queenslandica (Family Niphatidae) and Tabulocalyx (Family Phloeodictyidae). Clade C is comprised primarily of members of the Families Petrosiidae and Niphatidae, while Clade D is comprised of Aka species. The polyphyletic nature of the suborders, families and genera described in other studies is also found here.

  6. Phylogenetic relationships of the marine Haplosclerida (Phylum Porifera) employing ribosomal (28S rRNA) and mitochondrial (cox1, nad1) gene sequence data.

    Redmond, Niamh E; Raleigh, Jean; van Soest, Rob W M; Kelly, Michelle; Travers, Simon A A; Bradshaw, Brian; Vartia, Salla; Stephens, Kelly M; McCormack, Grace P


    The systematics of the poriferan Order Haplosclerida (Class Demospongiae) has been under scrutiny for a number of years without resolution. Molecular data suggest that the order needs revision at all taxonomic levels. Here, we provide a comprehensive view of the phylogenetic relationships of the marine Haplosclerida using many species from across the order, and three gene regions. Gene trees generated using 28S rRNA, nad1 and cox1 gene data, under maximum likelihood and Bayesian approaches, are highly congruent and suggest the presence of four clades. Clade A is comprised primarily of species of Haliclona and Callyspongia, and clade B is comprised of H. simulans and H. vansoesti (Family Chalinidae), Amphimedon queenslandica (Family Niphatidae) and Tabulocalyx (Family Phloeodictyidae). Clade C is comprised primarily of members of the Families Petrosiidae and Niphatidae, while Clade D is comprised of Aka species. The polyphyletic nature of the suborders, families and genera described in other studies is also found here.

  7. Molecular identification of sibling species of Sclerodermus (Hymenoptera: Bethylidae) that parasitize buprestid and cerambycid beetles by using partial sequences of mitochondrial DNA cytochrome oxidase subunit 1 and 28S ribosomal RNA gene.

    Yuan Jiang

    The species belonging to Sclerodermus (Hymenoptera: Bethylidae) are currently the most important insect natural enemies of wood borer pests, mainly buprestid and cerambycid beetles, in China. However, some sibling species of this genus are very difficult to distinguish because of their similar morphological features. To address this issue, we conducted phylogenetic and genetic analyses of cytochrome oxidase subunit I (COI) and 28S rRNA gene sequences from eight species of Sclerodermus reared from different wood borer pests. The eight sibling species were as follows: S. guani Xiao et Wu, S. sichuanensis Xiao, S. pupariae Yang et Yao, and Sclerodermus spp. (Nos. 1-5). A 594-bp fragment of COI and a 750-bp fragment of 28S were subsequently sequenced. For COI, the G-C content was found to be low in all the species, averaging about 30.0%. Sequence divergences (Kimura-2-parameter distances) between congeneric species averaged 4.5%, and intraspecific divergences averaged about 0.09%. Further, the maximum sequence divergences between congeneric species and Sclerodermus sp. (No. 5) averaged about 16.5%. All 136 samples analyzed were included in six reciprocally monophyletic clades in the COI neighbor-joining (NJ) tree. The NJ tree inferred from the 28S rRNA sequence yielded almost identical results, but the samples from S. guani, S. sichuanensis, S. pupariae, and Sclerodermus spp. (Nos. 1-4) clustered together and only Sclerodermus sp. (No. 5) clustered separately. Our findings indicate that the standard barcode region of COI can be efficiently used to distinguish morphologically similar Sclerodermus species. Further, we speculate that Sclerodermus sp. (No. 5) might be a new species of Sclerodermus.
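    The abstract above reports G-C content and Kimura 2-parameter (K2P) distances. The sketch below is not the authors' pipeline. It is a minimal illustration of the standard K2P formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions among compared sites. The function names and toy sequences are hypothetical, and columns containing gaps or ambiguity codes are simply skipped, which is an assumption of this sketch.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases."""
    bases = [c for c in seq.upper() if c in "ACGT"]
    if not bases:
        raise ValueError("no unambiguous bases")
    return 100.0 * sum(c in "GC" for c in bases) / len(bases)


def k2p_distance(a: str, b: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    n = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # assumption: skip gaps and ambiguity codes
        n += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if n == 0:
        raise ValueError("no comparable sites")
    p = transitions / n
    q = transversions / n
    term = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if term <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term)


# Toy sequences, made up for illustration: one transition in ten sites.
toy_a = "AAAAAAAAAA"
toy_b = "GAAAAAAAAA"
print(f"K2P distance: {k2p_distance(toy_a, toy_b):.4f}")
print(f"GC content:   {gc_content('AATTGC'):.1f}%")
```

    For small divergences the K2P distance is close to the raw proportion of differing sites. The correction matters more as substitutions accumulate.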

  8. Intragenomic sequence variation at the ITS1-ITS2 region and at the 18S and 28S nuclear ribosomal DNA genes of the New Zealand mud snail, Potamopyrgus antipodarum (Hydrobiidae: Mollusca)

    Hoy, Marshal S.; Rodriguez, Rusty J.


    Molecular genetic analysis was conducted on two populations of the invasive non-native New Zealand mud snail (Potamopyrgus antipodarum), one from a freshwater ecosystem in Devil's Lake (Oregon, USA) and the other from an ecosystem of higher salinity in the Columbia River estuary (Hammond Harbor, Oregon, USA). To elucidate potential genetic differences between the two populations, three segments of nuclear ribosomal DNA (rDNA), the ITS1-ITS2 regions and the 18S and 28S rDNA genes were cloned and sequenced. Variant sequences within each individual were found in all three rDNA segments. Folding models were utilized for secondary structure analysis and results indicated that there were many sequences which contained structure-altering polymorphisms, which suggests they could be nonfunctional pseudogenes. In addition, analysis of molecular variance (AMOVA) was used for hierarchical analysis of genetic variance to estimate variation within and among populations and within individuals. AMOVA revealed significant variation in the ITS region between the populations and among clones within individuals, while in the 5.8S rDNA significant variation was revealed among individuals within the two populations. High levels of intragenomic variation were found in the ITS regions, which are known to be highly variable in many organisms. More interestingly, intragenomic variation was also found in the 18S and 28S rDNA, which has rarely been observed in animals and is so far unreported in Mollusca. We postulate that in these P. antipodarum populations the effects of concerted evolution are diminished due to the fact that not all of the rDNA genes in their polyploid genome should be essential for sustaining cellular function. This could lead to a lessening of selection pressures, allowing mutations to accumulate in some copies, changing them into variant sequences.                   

  9. Phylogenetic reconstruction of the wolf spiders (Araneae: Lycosidae) using sequences from the 12S rRNA, 28S rRNA, and NADH1 genes: implications for classification, biogeography, and the evolution of web building behavior.

    Murphy, Nicholas P; Framenau, Volker W; Donnellan, Stephen C; Harvey, Mark S; Park, Yung-Chul; Austin, Andrew D


    Current knowledge of the evolutionary relationships amongst the wolf spiders (Araneae: Lycosidae) is based on assessment of morphological similarity or phylogenetic analysis of a small number of taxa. In order to enhance the current understanding of lycosid relationships, phylogenies of 70 lycosid species were reconstructed by parsimony and Bayesian methods using three molecular markers; the mitochondrial genes 12S rRNA, NADH1, and the nuclear gene 28S rRNA. The resultant trees from the mitochondrial markers were used to assess the current taxonomic status of the Lycosidae and to assess the evolutionary history of sheet-web construction in the group. The results suggest that a number of genera are not monophyletic, including Lycosa, Arctosa, Alopecosa, and Artoria. At the subfamilial level, the status of Pardosinae needs to be re-assessed, and the position of a number of genera within their respective subfamilies is in doubt (e.g., Hippasa and Arctosa in Lycosinae and Xerolycosa, Aulonia and Hygrolycosa in Venoniinae). In addition, a major clade of strictly Australasian taxa may require the creation of a new subfamily. The analysis of sheet-web building in Lycosidae revealed that the interpretation of this trait as an ancestral state relies on two factors: (1) an asymmetrical model favoring the loss of sheet-webs and (2) that the suspended silken tube of Pirata is directly descended from sheet-web building. Paralogous copies of the nuclear 28S rRNA gene were sequenced, confounding the interpretation of the phylogenetic analysis and suggesting that a cautionary approach should be taken to the further use of this gene for lycosid phylogenetic analysis.

  10. PCR primers for metazoan nuclear 18S and 28S ribosomal DNA sequences.

    Ryuji J Machida

    BACKGROUND: Metagenetic analyses, which amplify and sequence target marker DNA regions from environmental samples, are increasingly employed to assess the biodiversity of communities of small organisms. Using this approach, our understanding of microbial diversity has expanded greatly. In contrast, only a few studies using this approach to characterize metazoan diversity have been reported, despite the fact that many metazoan species are small and difficult to identify or are undescribed. One of the reasons for this discrepancy is the availability of universal primers for the target taxa. In microbial studies, analysis of the 16S ribosomal DNA is standard. In contrast, the best gene for metazoan metagenetics is less clear. In the present study, we have designed primers that amplify the nuclear 18S and 28S ribosomal DNA sequences of most metazoan species with the goal of providing effective approaches for metagenetic analyses of metazoan diversity in environmental samples, with a particular emphasis on marine biodiversity. METHODOLOGY/PRINCIPAL FINDINGS: Conserved regions suitable for designing PCR primers were identified using 14,503 and 1,072 metazoan sequences of the nuclear 18S and 28S rDNA regions, respectively. The sequence similarity of both these newly designed and the previously reported primers to the target regions of these primers were compared for each phylum to determine the expected amplification efficacy. The nucleotide diversity of the flanking regions of the primers was also estimated for genera or higher taxonomic groups of 11 phyla to determine the variable regions within the genes. CONCLUSIONS/SIGNIFICANCE: The identified nuclear ribosomal DNA primers (five primer pairs for 18S and eleven for 28S) and the results of the nucleotide diversity analyses provide options for primer combinations for metazoan metagenetic analyses. Additionally, advantages and disadvantages of not only the 18S and 28S ribosomal DNA, but also other

  11. Molecular Phylogeny of Cypridoid Freshwater Ostracods (Crustacea: Ostracoda), Inferred from 18S and 28S rDNA Sequences.

    Hiruta, Shimpei F; Kobayashi, Norio; Katoh, Toru; Kajihara, Hiroshi


    With the aim of exploring phylogenetic relationships within Cypridoidea, the most species-rich superfamily among the podocopidan ostracods, we sequenced nearly the entire 18S rRNA gene (18S) and part of the 28S rRNA gene (28S) for 22 species in the order Podocopida, with representatives from all the major cypridoid families. We conducted phylogenetic analyses using the methods of maximum likelihood, minimum evolution, and Bayesian analysis. Our analyses showed monophyly for Cyprididae, one of the four families currently recognized in Cypridoidea. Candonidae turned out to be paraphyletic, and included three clades corresponding to the subfamilies Candoninae, Paracypridinae, and Cyclocypridinae. We propose restricting the name Candonidae s. str. to comprise what is now Candoninae, and raising Paracypridinae and Cyclocypridinae to family rank within the superfamily Cypridoidea.

  12. Genetic relationship between Neobenedenia girellae and N. melleni inferred from 28S rRNA sequences

    WANG Jun; ZHANG Wen; SU Yongquan; DING Shaoxiong


    Fragments of about 350 bp of the 28S rRNA gene from two closely related monogenean trematodes, Neobenedenia girellae and N. melleni, were amplified by polymerase chain reaction (PCR) with a pair of specific primers and then sequenced. Comparison of the 28S rRNA sequences of N. girellae and N. melleni, which parasitize different fishes on different farms, revealed only one variable base in 337 bp, i.e. a genetic difference of 0.3% and a similarity of 99.7%. This is higher than the 99.41% similarity found within the single species N. melleni sampled in different areas (intraspecific divergence 0.59%). Meanwhile, the interspecific differences between the two Neobenedenia species and three Benedenia species (B. lutjani, B. rohdei and B. seriolae) range from 2.08% to 11.73%. In addition, UPGMA and MP molecular phylogenetic trees were constructed and proved to be consistent with each other. Although the two Neobenedenia species are highly similar in both morphology and 28S rRNA sequence, whether they belong to a single species remains unresolved; more genes should be investigated, in combination with systematic and detailed morphological study.
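    The 0.3% and 99.7% figures in the abstract above follow directly from one differing base in 337 compared positions. A short arithmetic check, with a hypothetical helper name, is sketched here. It assumes the uncorrected proportion of differing sites, which the abstract does not state explicitly.

```python
def divergence_percent(differences: int, aligned_length: int) -> float:
    """Uncorrected percentage of differing sites (p-distance x 100)."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * differences / aligned_length


# One variable base in 337 bp, as reported in the abstract.
d = divergence_percent(1, 337)
print(f"divergence: {d:.1f}%  similarity: {100 - d:.1f}%")
# prints "divergence: 0.3%  similarity: 99.7%"
```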

  13. Evolutionary relationships of the coelacanth, lungfishes, and tetrapods based on the 28S ribosomal RNA gene.

    Zardoya, R; Meyer, A


    The origin of land vertebrates was one of the major transitions in the history of vertebrates. Yet, despite many studies that are based on either morphology or molecules, the phylogenetic relationships among tetrapods and the other two living groups of lobe-finned fishes, the coelacanth and the lungfishes, are still unresolved and debated. Knowledge of the relationships among these lineages, which originated back in the Devonian, has profound implications for the reconstruction of the evolutionary scenario of the conquest of land. We collected the largest molecular data set on this issue so far, about 3,500 base pairs from seven species of the large 28S nuclear ribosomal gene. All phylogenetic analyses (maximum parsimony, neighbor-joining, and maximum likelihood) point toward the hypothesis that lungfishes and coelacanths form a monophyletic group and are equally closely related to land vertebrates. This evolutionary hypothesis complicates the identification of morphological or physiological preadaptations that might have permitted the common ancestor of tetrapods to colonize land. This is because the reconstruction of its ancestral conditions would be hindered by the difficulty to separate uniquely derived characters from shared derived characters in the coelacanth/lungfish and tetrapod lineages. This molecular phylogeny aids in the reconstruction of morphological evolutionary steps by providing a framework; however, only paleontological evidence can determine the sequence of morphological acquisitions that allowed lobe-finned fishes to colonize land.

  14. Sequencing for complete rDNA sequences (18S, ITS1, 5.8S, ITS2, and 28S rDNA) of Demodex and phylogenetic analysis of Acari based on 18S and 28S rDNA.

    Zhao, Ya-E; Wu, Li-Ping; Hu, Li; Xu, Yang; Wang, Zheng-Hang; Liu, Wen-Yan


    Due to the difficulty of DNA extraction for Demodex, few studies dealt with the identification and the phyletic evolution of Demodex at molecular level. In this study, we amplified, sequenced, and analyzed a complete (Demodex folliculorum) and an almost complete (D12 missing) (Demodex brevis) ribosomal DNA (rDNA) sequence and also analyzed the primary sequences of divergent domains in small-subunit ribosomal RNA (rRNA) of 51 species and in large-subunit rRNA of 43 species from four superfamilies in Acari (Cheyletoidea, Tetranychoidea, Analgoidea, and Ixodoidea). The results revealed that 18S rDNA sequence was relatively conserved in rDNA-coding regions and was not evolving as rapidly as 28S rDNA sequence. The evolutionary rates of transcribed spacer regions were much higher than those of the coding regions. The maximum parsimony trees of 18S and 28S rDNA appeared to be almost identical, consistent with their morphological classification. Based on the fact that the resolution capability of sequence length and the divergence of the 13 segments (D1-D6, D7a, D7b, and D8-D12) of 28S rDNA were stronger than that of the nine variable regions (V1-V9) of 18S rDNA, we were able to identify Demodex (Cheyletoidea) by the indels occurring in D2, D6, and D8.

  15. Phylogeny of the major lineages of Membracoidea (Insecta: Hemiptera: Cicadomorpha) based on 28S rDNA sequences.

    Dietrich, C H; Rakitov, R A; Holmes, J L; Black, W C


    Analysis of sequences from a 3.5-kb region of the nuclear ribosomal 28S DNA gene spanning divergent domains D2-D10 supports the hypothesis, based on fossil, biogeographic, and behavioral evidence, that treehoppers (Aetalionidae and Membracidae) are derived from leafhoppers (Cicadellidae). Maximum-parsimony analysis indicated that treehoppers are the sister group of a lineage comprising the currently recognized cicadellid subfamilies Agalliinae, Megophthalminae, Adelungiinae, and Ulopinae. Based on this phylogenetic estimate, the derivation of treehoppers approximately coincided with shifts in physiology and behavior, including loss of brochosome production and a reversal from active, jumping nymphs to sessile, nonjumping nymphs. Myerslopiidae, traditionally placed as a tribe of the cicadellid subfamily Ulopinae, represented a basal lineage distinct from other extant membracoids. The analysis recovered a large leafhopper lineage comprising a polyphyletic Deltocephalinae (sensu stricto) and its apparent derivatives Koebeliinae, Eupelicinae (polyphyletic), Selenocephalinae, and Penthimiinae. Clades comprising Macropsinae, Neocoelidiinae, Scarinae, Iassinae, Coelidiinae, Eurymelinae + Idiocerinae, Evacanthini + Pagaroniini, Aphrodinae + Ledrinae (in part), Stenocotini + Tartessinae, and Cicadellini + Proconiini were also recovered with moderate to high branch support. Cicadellinae (sensu lato), Ledrinae, Typhlocybinae, and Xestocephalinae were consistently polyphyletic on the most-parsimonious topologies, but constraining these groups to be monophyletic did not significantly increase the length of the cladograms. Relationships among the major lineages received low branch support, suggesting that more data are needed to provide a robust phylogenetic estimate.

  16. Genetic diversity based on 28S rDNA sequences among populations of Culex quinquefasciatus collected at different locations in Tamil Nadu, India.

    Sakthivelkumar, S; Ramaraj, P; Veeramani, V; Janarthanan, S


    The aim of the present study was to determine whether genetic variability exists among populations of Culex quinquefasciatus, which would be valuable in the management of mosquito control programmes. Populations of Cx. quinquefasciatus collected at different locations in Tamil Nadu were analyzed for their genetic variation based on 28S rDNA D2 region nucleotide sequences. A high degree of genetic polymorphism was detected in the predicted secondary structures of the D2 region of 28S rDNA in spite of high nucleotide sequence similarity. The findings based on secondary structure using rDNA sequences suggested the existence of a complex genotypic diversity of Cx. quinquefasciatus populations collected at different locations of Tamil Nadu, India. This complexity in genetic diversity within a single mosquito species collected at different locations is considered an important issue with regard to the nature of the vector potential of these mosquitoes.

  17. Contrasting evolutionary patterns of 28S and ITS rRNA genes reveal high intragenomic variation in Cephalenchus (Nematoda): Implications for species delimitation.

    Pereira, Tiago José; Baldwin, James Gordon


    Concerted evolution is often assumed to be the evolutionary force driving multi-family genes, including those from ribosomal DNA (rDNA) repeat, to complete homogenization within a species, although cases of non-concerted evolution have been also documented. In this study, sequence variation of 28S and ITS ribosomal RNA (rRNA) genes in the genus Cephalenchus is assessed at three different levels, intragenomic, intraspecific, and interspecific. The findings suggest that not all Cephalenchus species undergo concerted evolution. High levels of intraspecific polymorphism, mostly due to intragenomic variation, are found in Cephalenchus sp1 (BRA-01). Secondary structure analyses of both rRNA genes and across different species show a similar substitution pattern, including mostly compensatory (CBC) and semi-compensatory (SBC) base changes, thus suggesting the functionality of these rRNA copies despite the variation found in some species. This view is also supported by low sequence variation in the 5.8S gene in relation to the flanking ITS-1 and ITS-2 as well as by the existence of conserved motifs in the former gene. It is suggested that potential cross-fertilization in some Cephalenchus species, based on inspection of female reproductive system, might contribute to both intragenomic and intraspecific polymorphism of their rRNA genes. These results reinforce the potential implications of intragenomic and intraspecific genetic diversity on species delimitation, especially in biodiversity studies based solely on metagenetic approaches. Knowledge of sequence variation will be crucial for accurate species diversity estimation using molecular methods.

  18. Phylogenetic relationships among the microgastroid wasps (Hymenoptera: Braconidae): combined analysis of 16S and 28S rDNA genes and morphological data.

    Dowton, M; Austin, A D


    Relationships among the microgastroid complex of braconid wasps were investigated using sequence data from the 16S mitochondrial rDNA and 28S (D2 expansion region) nuclear rDNA genes, as well as morphological data. Parsimony analysis of these gene fragments, both separately and combined, indicated that Neoneurus (Neoneurinae) and Ichneutes (Ichneutinae) were no more closely related to the microgastroids than were a range of helconoid taxa. Combined parsimony analysis of the microgastroids indicated the relationships ((Cardiochilinae + Microgastrinae) + Miracinae) + Cheloninae, with Adeliinae falling inside the Cheloninae. Bootstrap proportions for each of these nodes were greater than 70%. Character reweighting (sensu Farris), using the rescaled consistency index, also recovered these relationships. Mapping of lifestyle traits onto this relatively well supported phylogeny indicated that solitary endoparasitism is ancestral for the microgastroids, with a single origin for egg-larval endoparasitism in the Cheloninae + Adeliinae. Mapping of the radiation of the microgastroids into lepidopteran hosts was less clear, due to the specialized biology of the most basal microgastroid clade, the Cheloninae + Adeliinae. Our data are consistent with attack of concealed lepidopteran hosts as the plesiomorphic lifestyle, at least for the Miracinae + Cardiochilinae + Microgastrinae, with radiation into more exposed hosts in the Cardiochilinae + Microgastrinae.

  19. Cloning of a 28S rRNA gene fragment of Trichinella spiralis and its application in taxonomy

    李成; 魏颖; 袁金钱; 宋铭忻


    To investigate the classification of a Trichinella swine isolate from Heilongjiang Province, a gene fragment of the ribosomal 28S rRNA was cloned by PCR and sequenced. Sequence analysis showed that the isolate is closely related to Trichinella spiralis (T1) and it was identified as Trichinella spiralis. The result is largely consistent with the traditional classification and provides a new theoretical basis for traditional taxonomic methods.


    李雪玲; 姚一建


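    The 28S rDNA sequences of 18 taxa of the genus Pleurotus were analyzed and a relatively complete phylogenetic tree of the genus was constructed. The molecular systematic data show that section Coremiopleurotus and the monomitic and dimitic hyphal systems within Pleurotus are each of polyphyletic origin. The monomitic species of section Pleurotus, P. tuber-regium (placed in section Tuberregium) and P. levis of section Lentodiellum can each be distinguished from the other members of the genus. P. djamor, P. calyptratus and P. opuntiae are closely related, whereas P. citrinopileatus should be treated as an infraspecific taxon of P. cornucopiae.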

  1. Dracula ant phylogeny as inferred by nuclear 28S rDNA sequences and implications for ant systematics (Hymenoptera: Formicidae: Amblyoponinae).

    Saux, Corrie; Fisher, Brian L; Spicer, Greg S


    Ants are one of the most ecologically and numerically dominant families of organisms in almost every terrestrial habitat throughout the world, though they include only about 1% of all described insect species. The development of eusociality is thought to have been a driving force in the striking diversification and dominance of this group, yet we know little about the evolution of the major lineages of ants and have been unable to clearly determine their primitive characteristics. Ants within the subfamily Amblyoponinae are specialized arthropod predators, possess many anatomically and behaviorally primitive characters and have been proposed as a possible basal lineage within the ants. We investigate the phylogenetic relationships among the members of the subfamily, using nuclear 28S rDNA sequence data. Outgroups for the analysis include members of the poneromorph and leptanillomorph (Apomyrma, Leptanilla) ant subfamilies, as well as three wasp families. Parsimony, maximum likelihood, and Bayesian analyses provide strong support for the monophyly of a clade containing the two genera Apomyrma+Mystrium (100% bpp; 97% ML bs; and 97% MP bs), and moderate support for the monophyly of the Amblyoponinae as long as Apomyrma (Apomyrminae) is included (87% bpp; 57% ML bs; and 76% MP bs). Analyses did not recover evidence of monophyly of the Amblyopone genus, while the monophyly of the other genera in the subfamily is supported. Based on these results we provide a morphological diagnosis of the Amblyoponinae that includes Apomyrma. Among the outgroup taxa, Typhlomyrmex grouped consistently with Ectatomma, supporting the recent placement of Typhlomyrmex in the Ectatomminae. The results of this present study place the included ant subfamilies into roughly two clades with the basal placement of Leptanilla unclear. One clade contains all the Amblyoponinae (including Apomyrma), Ponerinae, and Proceratiinae (Poneroid clade). The other clade contains members from subfamilies

  2. Phylogenetic analysis of the spider mite sub-family Tetranychinae (Acari: Tetranychidae) based on the mitochondrial COI gene and the 18S and the 5' end of the 28S rRNA genes indicates that several genera are polyphyletic.

    Matsuda, Tomoko; Morishita, Maiko; Hinomoto, Norihide; Gotoh, Tetsuo


    The spider mite sub-family Tetranychinae includes many agricultural pests. The internal transcribed spacer (ITS) region of nuclear ribosomal RNA genes and the cytochrome c oxidase subunit I (COI) gene of mitochondrial DNA have been used for species identification and phylogenetic reconstruction within the sub-family Tetranychinae, although they have not always been successful. The 18S and 28S rRNA genes should be more suitable for resolving higher levels of phylogeny, such as tribes or genera of Tetranychinae because these genes evolve more slowly and are made up of conserved regions and divergent domains. Therefore, we used both the 18S (1,825-1,901 bp) and 28S (the 5' end of 646-743 bp) rRNA genes to infer phylogenetic relationships within the sub-family Tetranychinae with a focus on the tribe Tetranychini. Then, we compared the phylogenetic tree of the 18S and 28S genes with that of the mitochondrial COI gene (618 bp). As observed in previous studies, our phylogeny based on the COI gene was not resolved because of the low bootstrap values for most nodes of the tree. On the other hand, our phylogenetic tree of the 18S and 28S genes revealed several well-supported clades within the sub-family Tetranychinae. The 18S and 28S phylogenetic trees suggest that the tribes Bryobiini, Petrobiini and Eurytetranychini are monophyletic and that the tribe Tetranychini is polyphyletic. At the genus level, six genera for which more than two species were sampled appear to be monophyletic, while four genera (Oligonychus, Tetranychus, Schizotetranychus and Eotetranychus) appear to be polyphyletic. The topology presented here does not fully agree with the current morphology-based taxonomy, so that the diagnostic morphological characters of Tetranychinae need to be reconsidered.

  3. Phylogenetic analysis of the spider mite sub-family Tetranychinae (Acari: Tetranychidae) based on the mitochondrial COI gene and the 18S and the 5' end of the 28S rRNA genes indicates that several genera are polyphyletic.

    Tomoko Matsuda

    The spider mite sub-family Tetranychinae includes many agricultural pests. The internal transcribed spacer (ITS) region of nuclear ribosomal RNA genes and the cytochrome c oxidase subunit I (COI) gene of mitochondrial DNA have been used for species identification and phylogenetic reconstruction within the sub-family Tetranychinae, although they have not always been successful. The 18S and 28S rRNA genes should be more suitable for resolving higher levels of phylogeny, such as tribes or genera of Tetranychinae because these genes evolve more slowly and are made up of conserved regions and divergent domains. Therefore, we used both the 18S (1,825-1,901 bp) and 28S (the 5' end of 646-743 bp) rRNA genes to infer phylogenetic relationships within the sub-family Tetranychinae with a focus on the tribe Tetranychini. Then, we compared the phylogenetic tree of the 18S and 28S genes with that of the mitochondrial COI gene (618 bp). As observed in previous studies, our phylogeny based on the COI gene was not resolved because of the low bootstrap values for most nodes of the tree. On the other hand, our phylogenetic tree of the 18S and 28S genes revealed several well-supported clades within the sub-family Tetranychinae. The 18S and 28S phylogenetic trees suggest that the tribes Bryobiini, Petrobiini and Eurytetranychini are monophyletic and that the tribe Tetranychini is polyphyletic. At the genus level, six genera for which more than two species were sampled appear to be monophyletic, while four genera (Oligonychus, Tetranychus, Schizotetranychus and Eotetranychus) appear to be polyphyletic. The topology presented here does not fully agree with the current morphology-based taxonomy, so that the diagnostic morphological characters of Tetranychinae need to be reconsidered.

  4. Analysis of the genetic polymorphism of Paracoccidioides brasiliensis and Paracoccidioides cerebriformis "Moore" by random amplified polymorphic DNA (RAPD) and 28S ribosomal DNA sequencing: Paracoccidioides cerebriformis revisited

    Sarah Desirée Barbosa Cavalcanti


    Our purpose was to compare the genetic polymorphism of six samples of P. brasiliensis (113, 339, BAT, T1F1, T3B6, T5LN1) with four samples of P. cerebriformis (735, 741, 750, 361) from the Mycological Laboratory of the Instituto de Medicina Tropical de São Paulo, using Random Amplified Polymorphic DNA (RAPD) analysis. RAPD profiles clearly segregated P. brasiliensis and P. cerebriformis isolates. However, the variation in band patterns among P. cerebriformis isolates was high. Sequencing of the 28S rDNA gene showed nucleotide conservation among P. cerebriformis isolates, providing a basis for taxonomic grouping, and revealed high divergence from P. brasiliensis, supporting that they are in fact two distinct species. Moreover, the DNA sequence suggests that P. cerebriformis in fact belongs to the genus Aspergillus.

  5. Secondary structure prediction for complete rDNA sequences (18S, 5.8S, and 28S rDNA) of Demodex folliculorum, and comparison of divergent domains structures across Acari.

    Zhao, Ya-E; Wang, Zheng-Hang; Xu, Yang; Wu, Li-Ping; Hu, Li


    According to base pairing, the rRNA folds into corresponding secondary structures, which contain additional phylogenetic information. On the basis of sequencing for complete rDNA sequences (18S, ITS1, 5.8S, ITS2 and 28S rDNA) of Demodex, we predicted the secondary structure of the complete rDNA sequence (18S, 5.8S, and 28S rDNA) of Demodex folliculorum, which was in concordance with that of the main arthropod lineages in past studies. And together with the sequence data from GenBank, we also predicted the secondary structures of divergent domains in SSU rRNA of 51 species and in LSU rRNA of 43 species from four superfamilies in Acari (Cheyletoidea, Tetranychoidea, Analgoidea and Ixodoidea). The multiple alignment among the four superfamilies in Acari showed that, insertions from Tetranychoidea SSU rRNA formed two newly proposed helixes, and helix c3-2b of LSU rRNA was absent in Demodex (Cheyletoidea) taxa. Generally speaking, LSU rRNA presented more remarkable differences than SSU rRNA did, mainly in D2, D3, D5, D7a, D7b, D8 and D10.

  6. Phylogenetic position of the enigmatic clawless eutardigrade genus Apodibius Dastych, 1983 (Tardigrada), based on 18S and 28S rRNA sequence data from its type species A. confusus.

    Dabert, Miroslawa; Dastych, Hieronymus; Hohberg, Karin; Dabert, Jacek


    The systematics of Eutardigrada, the largest lineage among the three classes of the phylum Tardigrada, is based mainly on the morphology of the leg claws and of the buccal apparatus. However, three members of the rarely recorded and poorly known limno-terrestrial eutardigrade genus Apodibius have no claws on their strongly reduced legs, a unique character among all tardigrades. This absence of all claws makes the systematic position of Apodibius one of the most enigmatic among the whole class. Until now all known associates of the genus Apodibius have been located in the incertae sedis species group or, quite recently, included into the Necopinatidae family. In the present study, phylogenetic analyses of 18S and 28S rRNA sequence data from 31 tardigrade species representing four parachelan superfamilies (Isohypsibioidea, Hypsibioidea, Macrobiotoidea, Eohypsibioidea), the apochelan Milnesium tardigradum, and the type species of the genus Apodibius, A. confusus, indicated close relationship of the Apodibius with tardigrade species recently included in the superfamily Isohypsibioidea. This result was well-supported and consistent across all markers (separate 18S rRNA, 28S rRNA, and combined 18S rRNA+28S rRNA datasets) and methods (MP, ML) applied.

  7. Phylogeny of Deltocephalinae (Hemiptera: Cicadellidae) from China based on partial 16S rDNA and 28S rDNA D2 sequences combined with morphological characters

    戴仁怀; 陈学新; 李子忠


    The phylogeny of 19 genera of Deltocephalinae leafhoppers was analyzed based on 50 adult morphological characters combined with nucleotide sequences of the mitochondrial 16S rDNA and nuclear 28S D2 rDNA genes. One species of Typhlocybinae was included as outgroup. Parsimony, distance and Bayesian methods were used to estimate the phylogenetic relationships. The topologies of the phylogenetic trees generated with the different methods were quite similar. We partially resolved the morphologically defined tribes and the relationships among the 19 genera of Deltocephalinae. The genus Macrosteles was well supported as occupying a basal position in the study, so the most primitive tribe in Deltocephalinae might be Macrostelini. The phylogenetic trees placed all genera of Deltocephalini except Nakaharanus on a single lineage. The position of the genus Balclutha, corresponding to the tribe Balcluthini, remains unresolved in our analyses. The Euscelini might be a polyphyletic group in the analysis. The analyses recovered Athysanini and Paralimnini as monophyletic clades. The clade of Phlogotettix and Scaphoideus-Nakaharanus was consistently resolved by the different methods. We suggest that Scaphoideus, Nakaharanus and Phlogotettix should be included in Scaphoideini. However, the results resolved the taxonomic status of Xestocephalini poorly overall. This is the first study in China to use the 28S rDNA D2 region and 16S rDNA gene sequences, combined with 50 morphological characters, for a phylogenetic analysis of 19 genera of Deltocephalinae (Hemiptera: Cicadellidae). Genomic DNA was extracted from specimens preserved in absolute ethanol. The 28S rDNA D2 fragment was amplified and sequenced for the 19 ingroup species and one outgroup species of Typhlocybinae (Hemiptera: Cicadellidae); 16S rDNA fragments were also amplified, 11 of which were sequenced, and the homologous 16S rDNA sequence of one species was taken from GenBank. Using two analysis programs, PAUP* 4.0 and MrBayes 3.0, and three tree-building methods, the homologous 28S D2 rDNA and 16S rDNA sequences were combined with morphological characters for phylogenetic

  8. Molecular phylogeny of the butterfly tribe Satyrini (Nymphalidae: Satyrinae) with emphasis on the utility of ribosomal mitochondrial genes 16s rDNA and nuclear 28s rDNA.

    Yang, Mingsheng; Zhang, Yalin


    The tribe Satyrini is one of the most diverse groups of butterflies, but no robust phylogenetic hypothesis for this group has been achieved. Two rarely used 16s and 28s ribosomal and another seven protein-coding genes were used to reconstruct the phylogeny of the Satyrini, with further aim to evaluate the informativeness of the ribosomal genes. Our maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI) analyses consistently recovered three well-supported clades for the eleven sampled subtribes of Satyrini: clade I includes Eritina and Coenonymphina, being sister to the clade II + clade III; clade II contains Parargina, Mycalesina and Lethina, and the other six subtribes constitute clade III. The placements of the taxonomically unstable Davidina Oberthür and geographically restricted Paroeneis Moore in Satyrina are confirmed for the first time based on molecular evidence. The close relationships of Callerebia Butler, Loxerebia Watkins and Argestina Riley are well-supported. We suggest that Rhaphicera Butler belongs to Lethina. The partitioned Bremer support (PBS) values of MP analysis show that the 16s rDNA contributes well to the nodes representing all the taxa from subtribe to species levels, and the 28s rDNA is informative at the subtribe level. Furthermore, our ML analyses show that the ribosomal genes 16s rDNA and 28s rDNA are informative, because most node support values are lower in the ML tree after the removal of them than that in ML tree constructed based on the full nine-gene dataset. This indicates that some other ribosomal genes should be tentatively used through combining with traditionally used protein-coding genes in further analysis on phylogeny of Satyrini, providing that proper representatives are sampled.

  9. Phylogenetic analysis of three species of Encarsia (Hymenoptera: Aphelinidae) parasitizing Bemisia tabaci (Hemiptera: Aleyrodidae) in China based on their 28S rRNA gene

    薛夏; 彭伟录; Muhammad Z. AHMED; Nasser S. MANDOUR; 任顺祥; Andrew G. S. CUTHBERTSON; 邱宝利


    Encarsia Förster includes important parasitoids of the whitefly pest Bemisia tabaci, among them E. bimaculata, E. formosa and E. sophia, the three most important aphelinid parasitoids in China. Eight populations of Encarsia from the South, Southeast, North and Southwest of China, as well as two populations from Malaysia and Egypt, respectively, were collected in the present study, and their interspecies phylogenetic relationships were analyzed based on the 28S rRNA D2 and D3 expansion regions. The D2 and D3 regions were consistent with each other and confirmed a closer genetic relationship between E. sophia and E. bimaculata, which both belong to the Encarsia strenua species group, than between either of these two species and E. formosa. Genetic distance analysis using 28S rRNA D2 sequences revealed some genetic divergence within single species of Encarsia. The Guangzhou population of E. sophia is closer to populations from Australia, Spain, Egypt and Ethiopia, but more distant from the population from Thailand. E. bimaculata populations from Sudan, Egypt and Guatemala, as well as one population from Australia, cluster together, while the Hengshui and Kunming populations of E. formosa cluster with those from the USA, UK and Greece, but are more distant from the Egypt population. The reasons for the inconsistency between the genetic and geographical distances of the Encarsia species are discussed.

  10. cis sequence effects on gene expression

    Jacobs Kevin


    Background: Sequence and transcriptional variability within and between individuals are typically studied independently. The joint analysis of sequence and gene expression variation (genetical genomics) provides insight into the role of linked sequence variation in the regulation of gene expression. We investigated the role of sequence variation in cis on gene expression (cis sequence effects) in a group of genes commonly studied in cancer research in lymphoblastoid cell lines. We estimated the proportion of genes exhibiting cis sequence effects and the proportion of gene expression variation explained by cis sequence effects using three different analytical approaches, and compared our results to the literature. Results: We generated gene expression profiling data at N = 697 candidate genes from N = 30 lymphoblastoid cell lines for this study and used available candidate gene resequencing data at N = 552 candidate genes to identify N = 30 candidate genes with sufficient variance in both datasets for the investigation of cis sequence effects. We used two additive models and the haplotype phylogeny scanning approach of Templeton (Tree Scanning) to evaluate association between individual SNPs, all SNPs at a gene, and diplotypes, with log-transformed gene expression. SNPs and diplotypes at eight candidate genes exhibited statistically significant (p cis sequence effects in our study, respectively. Conclusion: Based on analysis of our results and the extant literature, one in four genes exhibits significant cis sequence effects, and for these genes, about 30% of gene expression variation is accounted for by cis sequence variation. Despite diverse experimental approaches, the presence or absence of significant cis sequence effects is largely supported by previously published studies.

  11. Molecular identification of Tribolium castaneum and T. confusum based on PCR-RFLP analyses of 28S rRNA gene

    张汉松; 冯照军; 程超


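    This study used polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) analysis for the molecular identification of the red flour beetle Tribolium castaneum (Herbst) and the confused flour beetle Tribolium confusum (Jacquelin du Val), with the aim of providing technical support for stored-product pest management and port quarantine. The 28S rRNA gene of both species was PCR-amplified with universal primers, sequenced and analyzed. The amplified fragment was about 1070 bp long. The sequence had no variable sites within either species and 76 variable sites between species; that is, no nucleotide substitutions occurred within species and 76 occurred between species, comprising 56 transitions and 20 transversions, a transition/transversion ratio of 2.80. The 28S rRNA amplification products of both species were digested with the restriction endonuclease PvuI. Electrophoresis showed clearly different PvuI digestion patterns for T. castaneum and T. confusum (2 and 3 bands, respectively), so the 28S rRNA PCR-RFLP method established in this study can be used for the molecular identification of T. castaneum and T. confusum.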
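    The two calculations behind this record can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names and toy sequences are hypothetical, the input sequences are assumed to be already aligned, and the digest assumes the standard PvuI recognition site CGATCG, cut after CGAT, on a linear PCR product.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # skip identical sites, gaps and ambiguity codes
        # A<->G and C<->T stay within one base class: transitions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


def digest_lengths(seq, site="CGATCG", cut_offset=4):
    """Fragment lengths after cutting a linear sequence at every site.

    Defaults model PvuI (CGAT^CG); cut_offset is the cut position
    counted from the first base of the recognition site.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


if __name__ == "__main__":
    # Toy aligned pair (hypothetical): 2 transitions, 1 transversion.
    print(count_ts_tv("ACGTACGT", "GCGTATGA"))   # (2, 1)
    # Ratio from the counts reported in the abstract: 56 / 20.
    print(round(56 / 20, 2))                     # 2.8
    # Toy amplicon with one PvuI site gives two fragments.
    print(digest_lengths("AAAACGATCGTTTT"))      # [8, 6]
```

    A product with one PvuI site yields two bands and a product with two sites yields three, which is the kind of difference in band pattern the abstract uses to tell the species apart.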

  12. Genetic differentiation and phylogenesis of Tribolium castaneum and T. confusum based on 28S rRNA and COI genes

    明庆磊; 王阿旻; 程超


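    Tribolium castaneum and T. confusum are morphologically similar, and reproductive isolation between them is incomplete. To clarify the genetic differentiation and phylogenetic relationship between these two closely related species, one nuclear gene, 28S ribosomal RNA (28S rRNA), and one mitochondrial gene, cytochrome oxidase subunit I (COI), were PCR-amplified, sequenced and analyzed in 30 individuals of the two species. The two genes had 2 and 3 haplotypes, respectively, and no haplotype was shared between the species. In the 28S rRNA region, neither species showed any intraspecific nucleotide variation; in the COI region, there were no more than 2 intraspecific variable sites, and none of them changed the encoded amino acid. Between the species, however, there were 76 and 144 variable sites in the 28S rRNA and COI regions, respectively, and the variable sites in COI changed 25 encoded amino acids. Phylogenetic analysis indicated that T. castaneum is more closely related to T. freemani and T. madens than to T. confusum, which is not fully consistent with the phylogeny inferred from morphology. Thus, although T. castaneum and T. confusum are similar in shape and size, their molecular genetic differentiation is marked, and the 28S rRNA and COI genes are very useful for evaluating their genetic variation and phylogenetic relationships.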
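    The haplotype bookkeeping described in this record (how many distinct sequences occur, and whether any is shared between species) might be sketched as follows. The helper names and the toy records are hypothetical; the sketch assumes sequences of one gene that have been trimmed to the same region, so that identical strings mean identical haplotypes.

```python
from collections import defaultdict


def haplotype_carriers(records):
    """Map each distinct sequence (haplotype) to the species carrying it.

    records: iterable of (species_name, sequence) pairs for one gene.
    """
    carriers = defaultdict(set)
    for species, seq in records:
        carriers[seq.upper()].add(species)
    return dict(carriers)


def shared_haplotypes(carriers):
    """Return the haplotypes found in more than one species."""
    return [hap for hap, species in carriers.items() if len(species) > 1]


if __name__ == "__main__":
    # Toy data (hypothetical): each species is fixed for its own haplotype.
    toy = [
        ("T. castaneum", "ACGTAC"),
        ("T. castaneum", "ACGTAC"),
        ("T. confusum", "ACGTTC"),
        ("T. confusum", "acgttc"),
    ]
    carriers = haplotype_carriers(toy)
    print(len(carriers))                # 2 haplotypes
    print(shared_haplotypes(carriers))  # [] -> none shared between species
```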

  13. Cloning and phylogenetic analysis of 18S rRNA and 28S rRNA genes of Pomacea canaliculata

    潘颖瑛; 董胜张; 俞晓平


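    To clarify at the molecular level the taxonomic status of the golden apple snail (Pomacea canaliculata) that has invaded China, fragments of the 18S rRNA and 28S rRNA genes from different geographic populations, from the Philippines and from Guangdong, Guangxi and Zhejiang in China, were amplified, cloned and sequenced, using molecular cloning and sequence alignment, and a phylogenetic analysis was carried out together with related species of Ampullariidae, Viviparidae and Cyclophoridae. The 18S rRNA and 28S rRNA gene fragments obtained were 602 bp and 325 bp long, respectively, and their sequences did not differ among geographic populations. The phylogenetic trees built by the neighbor-joining (NJ) and maximum parsimony (MP) methods were largely congruent, confirming that P. canaliculata belongs to Ampullariidae, is closely related to species of Viviparidae, and is more distantly related to Cyclophoridae.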

  14. Synaptotagmin gene content of the sequenced genomes

    Craxton Molly


    Background: Synaptotagmins exist as a large gene family in mammals. There is much interest in the function of certain family members which act crucially in the regulated synaptic vesicle exocytosis required for efficient neurotransmission. Knowledge of the functions of other family members is relatively poor and the presence of Synaptotagmin genes in plants indicates a role for the family as a whole which is wider than neurotransmission. Identification of the Synaptotagmin genes within completely sequenced genomes can provide the entire Synaptotagmin gene complement of each sequenced organism. Defining the detailed structures of all the Synaptotagmin genes and their encoded products can provide a useful resource for functional studies and a deeper understanding of the evolution of the gene family. The current rapid increase in the number of sequenced genomes from different branches of the tree of life, together with the public deposition of evolutionarily diverse transcript sequences make such studies worthwhile. Results: I have compiled a detailed list of the Synaptotagmin genes of Caenorhabditis, Anopheles, Drosophila, Ciona, Danio, Fugu, Mus, Homo, Arabidopsis and Oryza by examining genomic and transcript sequences from public sequence databases together with some transcript sequences obtained by cDNA library screening and RT-PCR. I have compared all of the genes and investigated the relationship between plant Synaptotagmins and their non-Synaptotagmin counterparts. Conclusions: I have identified and compared 98 Synaptotagmin genes from 10 sequenced genomes. Detailed comparison of transcript sequences reveals abundant and complex variation in Synaptotagmin gene expression and indicates the presence of Synaptotagmin genes in all animals and land plants. Amino acid sequence comparisons indicate patterns of conservation and diversity in function. Phylogenetic analysis shows the origin of Synaptotagmins in multicellular eukaryotes and their

  15. Network of tRNA Gene Sequences

    WEI Fang-ping; LI Sheng; MA Hong-ru


    A network of 3719 tRNA gene sequences was constructed using simplest alignment. Its topology, degree distribution and clustering coefficient were studied. The behaviors of the network shift from fluctuated distribution to scale-free distribution when the similarity degree of the tRNA gene sequences increases. The tRNA gene sequences with the same anticodon identity are more self-organized than those with different anticodon identities and form local clusters in the network. Some vertices of the local cluster have a high connection with other local clusters, and the probable reason was given. Moreover, a network constructed by the same number of random tRNA sequences was used to make comparisons. The relationships between the properties of the tRNA similarity network and the characters of tRNA evolutionary history were discussed.
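    As a rough illustration of the kind of similarity network this record describes, the sketch below links two sequences when their identity reaches a threshold and then reports each node's degree and local clustering coefficient. It is a sketch under stated assumptions only: the abstract does not define its "simplest alignment", so an ungapped position-by-position comparison is assumed here, and all function names, the threshold and the toy sequences are hypothetical.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of matching positions in an ungapped, left-aligned comparison."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / longest


def build_network(seqs, threshold):
    """Adjacency sets: connect two sequences if identity >= threshold."""
    adj = {name: set() for name in seqs}
    for u, v in combinations(seqs, 2):
        if identity(seqs[u], seqs[v]) >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def clustering_coefficient(adj, node):
    """Share of a node's neighbour pairs that are themselves connected."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


if __name__ == "__main__":
    toy = {  # hypothetical 10-nt fragments, not real tRNA genes
        "t1": "GGGGAATTAG",
        "t2": "GGGGAATTAC",
        "t3": "GGGGAATTCC",
        "t4": "ACACACACAC",
    }
    net = build_network(toy, threshold=0.75)
    for name in toy:
        print(name, len(net[name]), clustering_coefficient(net, name))
```

    Raising the threshold removes edges, which is how a study of this kind can follow the degree distribution as the required similarity increases.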

  16. Sequencing and Gene Expression Analysis of Leishmania tropica LACK Gene.

    Nour Hammoudeh


    Leishmania Homologue of receptors for Activated C Kinase (LACK) antigen is a 36-kDa protein, which provokes a very early immune response against Leishmania infection. There are several reports on the expression of LACK through different life-cycle stages of genus Leishmania, but only a few of them have focused on L. tropica. The present study provides details of the cloning, DNA sequencing and gene expression of LACK in this parasite species. First, several local isolates of Leishmania parasites were typed in our laboratory using PCR to verify the Leishmania species. After that, the LACK gene was amplified and cloned into a vector for sequencing. Finally, the expression of this molecule in logarithmic and stationary growth phase promastigotes, as well as in amastigotes, was evaluated by Reverse Transcription-PCR (RT-PCR). The typing result confirmed that all our local isolates belong to L. tropica. The LACK gene sequence was determined and high similarity was observed with the sequences of other Leishmania species. Furthermore, the expression of the LACK gene in both promastigote and amastigote forms was confirmed. Overall, the data set the stage for future studies of the properties and immune role of LACK gene products.

  17. The nucleotide sequences of two leghemoglobin genes from soybean

    Wiborg, O; Hyldig-Nielsen, J J; Jensen, E O


    We present the complete nucleotide sequences of two leghemoglobin genes isolated from soybean DNA. Both genes contain three intervening sequences in identical positions. Comparison of the coding sequences with known amino-acid sequences of soybean leghemoglobins suggest that the two genes...

  18. Cloning and sequencing genes related to preeclampsia

    SHI Juan-zi; LIU Yan-fang; YAO Yuan-qing; YAN Wei; ZHU Feng; ZHAO Zhong-liang


    Objective: To clone genes specifically expressed in the placenta of patients with preeclampsia, and to explain the mechanism in the etiopathology of preeclampsia. Methods: The placentae of preeclamptic and normotensive pregnant subjects were used as models; a cDNA library was constructed and 20 differentially expressed fragments were cloned after a new version of PCR-based subtractive hybridization. False positive clones were identified by reverse dot blot analysis. With one of the obtained genes taken as the probe, the placentas of 10 normal pregnant women and 10 preeclamptic patients were studied using dot hybridization. Results: Six false positive clones were identified by reverse dot blot, and the remaining 14 clones were identified as preeclampsia-related genes. These clones were sequenced and analyzed with BLAST. Eleven of the 14 clones were already known genes, one of which belongs to the necdin family; the remaining 3 were identified as novel genes. These 3 genes were accepted by GenBank, with the accession numbers AF232216, AF232217, AF233648. The results of dot hybridization using the necdin gene as probe were as follows: (1) This mRNA was present in the placental tissues of normal pregnancy as well as in those of preeclampsia. (2) The intensity of transcription of this mRNA in the placental tissues of preeclampsia increased significantly compared with that of normal pregnancy (P<0.05). Conclusions: This study is the first to report this group of genes, especially the necdin-expressing gene, which are related to the etiopathology of preeclampsia. In addition, overtranscription of the necdin gene has been found in preeclampsia. This is helpful for further studies of the etiology of preeclampsia.

  19. Metagenomic data of fungal internal transcribed Spacer and 18S rRNA gene sequences from Lonar lake sediment, India.

    Dudhagara, Pravin; Ghelani, Anjana; Bhavsar, Sunil; Bhatt, Shreyas


    The data in this article contain the sequences of the fungal Internal Transcribed Spacer (ITS) and 18S rRNA gene from a metagenome of Lonar soda lake, India. Sequences were amplified using fungal-specific primers, which amplified the amplicon lying between the 18S and 28S rRNA genes. Data were obtained using the fungal tag-encoded FLX amplicon pyrosequencing (fTEFAP) technique and used to analyze the fungal profile by a culture-independent method. Primary analysis using the PlutoF 454 pipeline suggests that the Lonar lake mycobiome contained 29 different fungal species. The raw sequencing data used to perform this analysis, along with the FASTQ file, are located in the NCBI Sequence Read Archive (SRA) under accession No. SRX889598 (

  20. The first determination of DNA sequence of a specific gene.

    Inouye, Masayori


    How and when was the first DNA sequence of a gene determined? In 1977, F. Sanger came up with an innovative technology to sequence DNA by using chain terminators, and determined the entire DNA sequence of the 5375-base genome of bacteriophage φX 174 (Sanger et al., 1977). While Sanger's achievement has been recognized as the first DNA sequencing of genes, we had determined the DNA sequence of a gene, albeit a partial sequence, 11 years before Sanger's DNA sequence (Okada et al., 1966).

  1. Fungal community analysis in the deep-sea sediments of the Pacific Ocean assessed by comparison of ITS, 18S and 28S ribosomal DNA regions

    Xu, Wei; Luo, Zhu-Hua; Guo, Shuangshuang; Pang, Ka-Lai


    We investigated the diversity of fungal communities in 6 different deep-sea sediment samples of the Pacific Ocean based on three different types of clone libraries, including internal transcribed spacer (ITS), 18S rDNA, and 28S rDNA regions. A total of 1978 clones were generated from 18 environmental clone libraries, resulting in 140 fungal operational taxonomic units (OTUs), including 18 OTUs from ITS, 44 OTUs from 18S rDNA, and 78 OTUs from 28S rDNA gene primer sets. The majority of the recovered sequences belonged to diverse phylotypes of the Ascomycota and Basidiomycota. Additionally, our study revealed a total of 46 novel fungal phylotypes, which showed low similarities (<97%) with available fungal sequences in the GenBank, including a novel Zygomycete lineage, suggesting possible new fungal taxa occurring in the deep-sea sediments. The results suggested that 28S rDNA is an efficient target gene to describe fungal community in deep-sea environment.

  2. Gene and translation initiation site prediction in metagenomic sequences

    Hyatt, Philip Douglas [ORNL]; LoCascio, Philip F [ORNL]; Hauser, Loren John [ORNL]; Uberbacher, Edward C [ORNL]


    Gene prediction in metagenomic sequences remains a difficult problem. Current sequencing technologies do not achieve sufficient coverage to assemble the individual genomes in a typical sample; consequently, sequencing runs produce a large number of short sequences whose exact origin is unknown. Since these sequences are usually smaller than the average length of a gene, algorithms must make predictions based on very little data. We present MetaProdigal, a metagenomic version of the gene prediction program Prodigal, that can identify genes in short, anonymous coding sequences with a high degree of accuracy. The novel value of the method consists of enhanced translation initiation site identification, ability to identify sequences that use alternate genetic codes and confidence values for each gene call. We compare the results of MetaProdigal with other methods and conclude with a discussion of future improvements.

  3. A Probabilistic Genome-Wide Gene Reading Frame Sequence Model

    Have, Christian Theil; Mørk, Søren

    We introduce a new type of probabilistic sequence model that models the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome level, effectively producing a sequential genome annotation as output. The model can be used to obtain the most probable genome annotation based on a combination of (i) a gene finder score of each gene candidate and (ii) the sequence of the reading frames of gene candidates through a genome. The model, as well as a higher order variant, is developed and tested... and are evaluated by the effect on prediction performance. Since bacterial gene finding to a large extent is a solved problem, it forms an ideal proving ground for evaluating the explicit modeling of larger scale gene sequence composition of genomes. We conclude that the sequential composition of gene reading frames...

  4. Sequencing genes in silico using single nucleotide polymorphisms

    Zhang Xinyi


    Background: The advent of high throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs) discovered from genetic association studies. Currently, direct sequencing of candidate genes or regions on a large number of subjects remains both cost- and time-prohibitive. Results: To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS), which predicts phased sequences of intragenic regions, using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles) at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (range 79%-100%). This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC) Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes. Conclusions: Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate

  5. A human gut microbial gene catalogue established by metagenomic sequencing

    dos Santos, Marcelo Bertalan Quintanilha; Sicheritz-Pontén, Thomas; Nielsen, Henrik Bjørn;


    To understand the impact of gut microbes on human health and well-being it is crucial to assess their genetic potential. Here we describe the Illumina-based metagenomic sequencing, assembly and characterization of 3.3 million non-redundant microbial genes, derived from 576.7 gigabases of sequence......, from faecal samples of 124 European individuals. The gene set, ~150 times larger than the human gene complement, contains an overwhelming majority of the prevalent (more frequent) microbial genes of the cohort and probably includes a large proportion of the prevalent human intestinal microbial genes...


    Anthonius Y.P.B.C. Widyatmoko


    Sequence polymorphisms among and within four Acacia species, A. aulacocarpa, A. auriculiformis, A. crassicarpa, and A. mangium, were investigated using four chloroplast DNA genes (atpA, petA, rbcL, and rpoA). The phylogenetic relationship among these species is discussed in light of the sequence information. No intraspecific sequence variation was found in the four genes of the four species, and a conservative rate of mutation of the chloroplast DNA genes was also confirmed in the Acacia species. In two of the four genes, atpA and petA, all four species possessed identical sequences, so no sequence variation was found among the four Acacia species. In the rbcL and rpoA genes, however, sequence polymorphisms were revealed among these species. Acacia aulacocarpa and A. crassicarpa shared an identical sequence, and A. auriculiformis and A. mangium also showed no sequence variation. The fact that A. mangium and A. auriculiformis shared identical sequences, as did A. aulacocarpa and A. crassicarpa, indicated that the species within each pair were extremely closely related. Although a putative natural hybrid of A. aulacocarpa and A. auriculiformis has been reported, our results suggested that natural hybridization should be further verified using molecular markers.

  7. Identification and sequence analysis of Tapasin gene in guinea fowl

    Varuna P. Panicker


    Aim: An attempt has been made to identify and study the nucleotide sequence variability in the exon 5 - exon 6 region of the guinea fowl Tapasin gene. Materials and Methods: Blood samples were collected from randomly selected birds (12 guinea fowl) and the Tapasin gene was amplified using chicken-specific primers designed from sequences submitted to GenBank. Polymerase chain reaction conditions were standardized so as to get only single amplicons. The products obtained were then cloned and sequenced, and the sequences were analyzed using suitable software. Results: The amplicon size of the Tapasin gene in guinea fowl was the same as that reported in chicken, with areas of transitions and transversions. The sequence variations reported in these coding sequences might influence the protein structure, which may be correlated with the increased immune status of the bird compared with chicken breeds. Conclusion: Tapasin is an immunologically important gene that plays an important role in the immune status of the bird, so sequence variations in the gene can be correlated with the altered immune status of the bird.

  8. Comparison of methods for genomic localization of gene trap sequences

    Ferrin Thomas E


    Background: Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. Results: In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion: The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.

  9. GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences.

    Antonov, Ivan; Baranov, Pavel; Borodovsky, Mark


    Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively low attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) could be caused by sequencing errors or indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events (that evolved to control expression of some genes). Earlier, we have developed an algorithm and software program GeneTack for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206 991 fs-genes from 1106 complete prokaryotic genomes and 45 295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position and frameshift direction (-1, +1). The fs-genes can be retrieved by similarity search to a given query sequence via a web interface, by fs-gene cluster browsing, etc. Clusters of fs-genes are characterized with respect to their likely origin, such as pseudogenization, phase variation, etc. The largest clusters contain fs-genes with programed frameshifts (related to recoding events).
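
    To make the notion of a frame transition concrete, the following minimal Python sketch shows how deleting a single base inside a coding region changes every downstream codon. It is only an illustration of what a frameshift is, not GeneTack's detection algorithm; the toy sequence and the helper name `translate` are assumptions introduced here.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 0, ignoring trailing partial codons."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical toy gene (not taken from any record above).
original = "ATGGCCAAGTTTGGCTAA"
# Simulate an indel: delete the base at index 5.
shifted = original[:5] + original[6:]

print(translate(original))  # protein from the intact reading frame
print(translate(shifted))   # every codon after the deletion is read differently
```

    In this toy case the stop codon is also lost from the reading frame, which is the kind of disruption of a protein-coding gene that the abstract describes.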

  10. Biased distribution of DNA uptake sequences towards genome maintenance genes

    Davidsen, T.; Rodland, E.A.; Lagesen, K.


    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within... coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group...

  11. Targeted sequencing of cancer-related genes in colorectal cancer using next-generation sequencing.

    Sae-Won Han

    Recent advances in sequencing technology have enabled comprehensive profiling of genetic alterations in cancer. We have established a targeted sequencing platform using next-generation sequencing (NGS) technology for clinical use, which can provide mutation and copy number variation data. NGS was performed with a paired-end library enriched with exons of 183 cancer-related genes. Normal and tumor tissue pairs of 60 colorectal adenocarcinomas were used to test feasibility. Somatic mutation and copy number alteration were analyzed. A total of 526 somatic non-synonymous sequence variations were found in 113 genes. Among these, 278 single nucleotide variations were 232 different somatic point mutations. 216 SNV were 79 known single nucleotide polymorphisms in the dbSNP. 32 indels were 28 different indel mutations. Median number of mutated genes per tumor was 4 (range 0-23). Copy number gain (>X2 fold) was found in 65 genes in 40 patients, whereas copy number loss (genes in 39 patients. The most frequently altered genes (mutation and/or copy number alteration) were APC in 35 patients (58%), TP53 in 34 (57%), and KRAS in 24 (40%). Altered gene list revealed ErbB signaling pathway as the most commonly involved pathway (25 patients, 42%). Targeted sequencing platform using NGS technology is feasible for clinical use and provides comprehensive genetic alteration data.

  12. Coelacanth genome sequence reveals the evolutionary history of vertebrate genes.

    Noonan, James P; Grimwood, Jane; Danke, Joshua; Schmutz, Jeremy; Dickson, Mark; Amemiya, Chris T; Myers, Richard M


    The coelacanth is one of the nearest living relatives of tetrapods. However, a teleost species such as zebrafish or Fugu is typically used as the outgroup in current tetrapod comparative sequence analyses. Such studies are complicated by the fact that teleost genomes have undergone a whole-genome duplication event, as well as individual gene-duplication events. Here, we demonstrate the value of coelacanth genome sequence by complete sequencing and analysis of the protocadherin gene cluster of the Indonesian coelacanth, Latimeria menadoensis. We found that coelacanth has 49 protocadherin cluster genes organized in the same three ordered subclusters, alpha, beta, and gamma, as the 54 protocadherin cluster genes in human. In contrast, whole-genome and tandem duplications have generated two zebrafish protocadherin clusters comprised of at least 97 genes. Additionally, zebrafish protocadherins are far more prone to homogenizing gene conversion events than coelacanth protocadherins, suggesting that recombination- and duplication-driven plasticity may be a feature of teleost genomes. Our results indicate that coelacanth provides the ideal outgroup sequence against which tetrapod genomes can be measured. We therefore present L. menadoensis as a candidate for whole-genome sequencing.

  13. Sequence Variability in Staphylococcal Enterotoxin Genes seb, sec, and sed

    Sophia Johler


    Ingestion of staphylococcal enterotoxins preformed by Staphylococcus aureus in food leads to staphylococcal food poisoning, the most prevalent foodborne intoxication worldwide. There are five major staphylococcal enterotoxins: SEA, SEB, SEC, SED, and SEE. While variants of these toxins have been described and were linked to specific hosts or levels of enterotoxin production, data on sequence variation are still limited. In this study, we aim to extend the knowledge on promoter and gene variants of the major enterotoxins SEB, SEC, and SED. To this end, we determined seb, sec, and sed promoter and gene sequences of a well-characterized set of enterotoxigenic Staphylococcus aureus strains originating from foodborne outbreaks, human infections, human nasal colonization, rabbits, and cattle. New nucleotide sequence variants were detected for all three enterotoxins and a novel amino acid sequence variant of SED was detected in a strain associated with human nasal colonization. While the seb promoter and gene sequences exhibited a high degree of variability, the sec and sed promoter and gene were more conserved. Interestingly, a truncated variant of sed was detected in all tested sed harboring rabbit strains. The generated data represent a further step towards improved understanding of strain-specific differences in enterotoxin expression and host-specific variation in enterotoxin sequences.

  14. Combinatorial pooling enables selective sequencing of the barley gene space.

    Stefano Lonardi


    For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
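
    The idea behind pooling and deconvolution can be sketched in a few lines of Python. The sketch assumes a deliberately simple design in which each clone is placed in its own unique combination of pools, so that the set of pools in which a read is observed identifies the clone. This is not the pooling design or the deconvolution algorithm of the paper; the function names `make_design` and `deconvolve` and the clone labels are hypothetical.

```python
from itertools import combinations
from math import comb


def make_design(clones, n_pools, pools_per_clone):
    """Assign every clone a unique set of pools (its signature)."""
    if len(clones) > comb(n_pools, pools_per_clone):
        raise ValueError("not enough distinct pool combinations for all clones")
    signatures = combinations(range(n_pools), pools_per_clone)
    return {clone: frozenset(sig) for clone, sig in zip(clones, signatures)}


def deconvolve(observed_pools, design):
    """Return the clones whose signature equals the pools a read was seen in."""
    observed = frozenset(observed_pools)
    return [clone for clone, sig in design.items() if sig == observed]


# Four hypothetical BAC clones spread over four pools, two pools per clone.
design = make_design(["BAC_A", "BAC_B", "BAC_C", "BAC_D"], n_pools=4, pools_per_clone=2)

print(deconvolve({0, 2}, design))     # read seen in pools 0 and 2
print(deconvolve({0, 1, 2}, design))  # pattern matching no single clone
```

    A read that appears in a pool pattern matching no single signature (for example, sequence shared by overlapping clones) is left unassigned in this sketch; a real design has to tolerate such cases and sequencing errors.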

  15. Proteolipid protein 1 gene sequencing of hereditary spastic paraplegia

    Yu Gao; Lumei Chi; Yinshi Jin; Guangxian Nan


    PCR amplification and sequencing of whole blood DNA from an individual with hereditary spastic paraplegia, as well as family members, revealed a fragment of proteolipid protein 1 (PLP1) gene exon 1, which excluded the possibility of isomer 1 expression for this family. The fragment sequence of exon 3 and exon 5 was consistent with the proteolipid protein 1 sequence at NCBI. In the proband samples, a PLP1 point mutation in exon 4 was detected at the basic group of position 844, T→C, phenylalanine→leucine. In proband samples from a male cousin, the basic group at position 844 was C, but gene sequencing signals revealed mixed signals of T and C, indicating possible mutation at this locus. Results demonstrated that changes in PLP1 exon 4 amino acids were associated with onset of hereditary spastic paraplegia.

  16. Speeding disease gene discovery by sequence based candidate prioritization

    Porteous David J


    Background: Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results: We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion: PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing positional candidate genes for mutation detection and case-control association studies.
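
    The abstract does not define how fold enrichment is computed. One common reading, assumed here, is the proportion of known disease genes in the top of a ranked list divided by their proportion in the whole list. The Python sketch below works through that arithmetic; the function name `fold_enrichment` and the gene labels are hypothetical and are not taken from PROSPECTR.

```python
def fold_enrichment(ranked_genes, disease_genes, top_n):
    """Ratio of the disease-gene proportion in the top_n genes to that in the full list."""
    if not 0 < top_n <= len(ranked_genes):
        raise ValueError("top_n must be between 1 and the length of the list")
    hits_top = sum(gene in disease_genes for gene in ranked_genes[:top_n])
    hits_all = sum(gene in disease_genes for gene in ranked_genes)
    if hits_all == 0:
        raise ValueError("the list contains no known disease genes")
    return (hits_top / top_n) / (hits_all / len(ranked_genes))


# Ten hypothetical candidate genes, ranked by a classifier; two are disease genes.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]
disease = {"g1", "g3"}

# Both disease genes fall in the top half: (2/5) / (2/10), a two-fold enrichment.
print(fold_enrichment(ranked, disease, top_n=5))
```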

  17. A human gut microbial gene catalogue established by metagenomic sequencing

    dos Santos, Marcelo Bertalan Quintanilha; Sicheritz-Pontén, Thomas; Nielsen, Henrik Bjørn;


    To understand the impact of gut microbes on human health and well-being it is crucial to assess their genetic potential. Here we describe the Illumina-based metagenomic sequencing, assembly and characterization of 3.3 million non-redundant microbial genes, derived from 576.7 gigabases of sequence...... gut metagenome and the minimal gut bacterial genome in terms of functions present in all individuals and most bacteria, respectively....

  18. Defects in 18 S or 28 S rRNA processing activate the p53 pathway.

    Hölzel, Michael; Orban, Mathias; Hochstatter, Julia; Rohrmoser, Michaela; Harasim, Thomas; Malamoussi, Anastassia; Kremmer, Elisabeth; Längst, Gernot; Eick, Dirk


    The p53 tumor suppressor pathway is activated by defective ribosome synthesis. Ribosomal proteins are released from the nucleolus and block human double minute-2 (Hdm2) that targets p53 for degradation. However, it remained elusive how abrogation of individual rRNA processing pathways contributes to p53 stabilization. Here, we show that selective inhibition of 18 S rRNA processing provokes accumulation of p53 as efficiently as abrogated 28 S rRNA maturation. We describe hUTP18 as a novel mammalian rRNA processing factor that is specifically involved in 18 S rRNA production. hUTP18 was essential for the cleavage of the 5'-external transcribed spacer leader sequence from the primary polymerase I transcript, but was dispensable for rRNA transcription. Because maturation of the 28 S rRNA was unaffected in hUTP18-depleted cells, our results suggest that the integrity of both the 18 S and 28 S rRNA synthesis pathways can be monitored independently by the p53 pathway. Interestingly, accumulation of p53 after hUTP18 knock down required the ribosomal protein L11. Therefore, cells survey the maturation of the small and large ribosomal subunits by separate molecular routes, which may merge in an L11-dependent signaling pathway for p53 stabilization.

  19. Automated cleaning and pre-processing of immunoglobulin gene sequences from high-throughput sequencing

    Miri Michaeli


    High throughput sequencing (HTS) yields tens of thousands to millions of sequences that require a large amount of pre-processing work to clean various artifacts. Such cleaning cannot be performed manually. Existing programs are not suitable for immunoglobulin (Ig) genes, which are variable and often highly mutated. This paper describes Ig-HTS-Cleaner (Ig High Throughput Sequencing Cleaner), a program containing a simple cleaning procedure that successfully deals with pre-processing of Ig sequences derived from HTS, and Ig-Indel-Identifier (Ig Insertion – Deletion Identifier), a program for identifying legitimate and artifact insertions and/or deletions (indels). Our programs were designed for analyzing Ig gene sequences obtained by 454 sequencing, but they are applicable to all types of sequences and sequencing platforms. Ig-HTS-Cleaner and Ig-Indel-Identifier have been implemented in Java and saved as executable JAR files, supported on Linux and MS Windows. No special requirements are needed in order to run the programs, except for correctly constructing the input files as explained in the text. The programs' performance has been tested and validated on real and simulated data sets.

  20. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications

    Herzog, Michel; Maroteaux, Luc


    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage. PMID:16578795

  1. A unique box in 28S rRNA is shared by the enigmatic insect order Zoraptera and Dictyoptera.

    Yanhui Wang

    The position of the Zoraptera remains one of the most challenging and uncertain concerns in ordinal-level phylogenies of the insects. Zoraptera have been viewed as having a close relationship with five different groups of Polyneoptera, or as being allied to the Paraneoptera or even Holometabola. Although rDNAs have been widely used in phylogenetic studies of insects, the application of the complete 28S rDNA is still scattered in only a few orders. In this study, a secondary structure model of the complete 28S rRNAs of insects was reconstructed based on all orders of Insecta. It was found that one length-variable region, D3-4, is particularly distinctive. The length and/or sequence of D3-4 is conservative within each order of Polyneoptera, but it can be divided into two types between the different orders of the supercohort, of which the enigmatic order Zoraptera and Dictyoptera share one type, while the remaining orders of Polyneoptera share the other. Additionally, independent evidence from phylogenetic results supports the clade (Zoraptera+Dictyoptera) as well. Thus, the similarity of D3-4 between Zoraptera and Dictyoptera can serve as a potentially valuable autapomorphy or synapomorphy in phylogeny reconstruction. The clades (Plecoptera+Dermaptera) and ((Grylloblattodea+Mantophasmatodea)+(Embiodea+Phasmatodea)) were also recovered in the phylogenetic study. In addition, considering the other studies based on rDNAs, this study reached the highest congruence with previous phylogenetic studies of Holometabola based on nuclear protein coding genes or morphology characters. Future comparative studies of secondary structures across deep divergences and additional taxa are likely to reveal conserved patterns, structures and motifs that can provide support for major phylogenetic lineages.

  2. Sequence Analysis of the ank Gene of Granulocytic Ehrlichiae


    The ank gene of the agent of human granulocytic ehrlichiosis (HGE) codes for a protein with a predicted molecular size of 131.2 kDa that is recognized by serum from both dogs and humans infected with granulocytic ehrlichiae. As part of an effort to assess the phylogenetic relatedness of granulocytic ehrlichiae from different geographic regions and in different host species, the ank gene was PCR amplified and sequenced from a variety of sources. These included 10 blood specimens from patients ...

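  3. Phylogenetic Analysis of Radopholus similis from D2 and D3 Fragments of the 28S rRNA Gene Sequences

    Li Jia; Peng Deliang; Huang Wenkun


    The D2/D3 region of the 28S large-subunit ribosomal DNA from nine populations of the burrowing nematode Radopholus similis was amplified with the universal nematode primers D2A and D3B, yielding fragments of about 780 bp. After cloning and sequencing, cluster analysis was performed and a phylogenetic tree was constructed using the UPGMA method. The nucleotide sequence similarity of the D2/D3 region among the nine populations was 99.13%, and the sequences of eight of the populations showed 97.46% similarity to the D2/D3 nucleotide sequence (DQ328712) of a closely related species reported from Vietnam (Radopholus sp. 7B VietNam), indicating that the D2/D3 region of R. similis is highly conserved. Cluster analysis showed that the eight populations RSHN12r, RSSH, RSHL, RSHK, RSLZ, RSSZ, RSHN13, and RSHN12p grouped together and were closely related, whereas the RSHN3 population was more distantly related to them. This suggests that these eight R. similis populations may have originated from the same geographic population, while the RSHN3 population may have originated from a different one.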

  4. Sequence and gene expression evolution of paralogous genes in willows.

    Harikrishnan, Srilakshmy L; Pucholt, Pascal; Berlin, Sofia


    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties. However, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive and hence functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows.

  5. Nucleotide Sequence of the Protective Antigen Gene of Bacillus Anthracis


    Montie, S. Kadis, and S. I. Ajl (ed.), Microbial toxins, vol. 3. Academic Press, Inc., New York. 23. Little, S. F., and G. B. Knudson. 1986...Takkinen, and L. Kaariainen. 1981. Nucleotide sequence of the promoter and NH2-terminal signal peptide region of the α-amylase gene from Bacillus

  6. Quantitative modeling of a gene's expression from its intergenic sequence.

    Md Abul Hassan Samee


    Modeling a gene's expression from its intergenic locus and trans-regulatory context is a fundamental goal in computational biology. Owing to the distributed nature of cis-regulatory information and the poorly understood mechanisms that integrate such information, gene locus modeling is a more challenging task than modeling individual enhancers. Here we report the first quantitative model of a gene's expression pattern as a function of its locus. We model the expression readout of a locus in two tiers: (1) combinatorial regulation by transcription factors bound to each enhancer is predicted by a thermodynamics-based model and (2) independent contributions from multiple enhancers are linearly combined to fit the gene expression pattern. The model does not require any prior knowledge about enhancers contributing toward a gene's expression. We demonstrate that the model captures the complex multi-domain expression patterns of anterior-posterior patterning genes in the early Drosophila embryo. Altogether, we model the expression patterns of 27 genes; these include several gap genes, pair-rule genes, and anterior, posterior, trunk, and terminal genes. We find that the model-selected enhancers for each gene overlap strongly with its experimentally characterized enhancers. Our findings also suggest the presence of sequence-segments in the locus that would contribute ectopic expression patterns and hence were "shut down" by the model. We applied our model to identify the transcription factors responsible for forming the stripe boundaries of the studied genes. The resulting network of regulatory interactions exhibits a high level of agreement with known regulatory influences on the target genes. Finally, we analyzed whether and why our assumption of enhancer independence was necessary for the genes we studied. We found a deterioration of expression when binding sites in one enhancer were allowed to influence the readout of another enhancer. Thus, interference
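
    The two-tier structure described in this abstract can be illustrated with a toy Python sketch. In tier 1, each enhancer's output is computed from transcription factor occupancy using a single-site statistical weight (affinity times concentration); in tier 2, the enhancer outputs are combined linearly. Everything specific here is an assumption made for illustration: one activator and one repressor site per enhancer, the parameter names, and the numbers. It is not the authors' model.

```python
def occupancy(concentration, affinity):
    """Fractional occupancy of one binding site: w / (1 + w), with w = affinity * concentration."""
    weight = affinity * concentration
    return weight / (1.0 + weight)


def enhancer_output(tf_conc, enhancer):
    """Tier 1: activator occupancy, damped by repressor occupancy (toy assumption)."""
    act = occupancy(tf_conc[enhancer["activator"]], enhancer["k_act"])
    rep = occupancy(tf_conc[enhancer["repressor"]], enhancer["k_rep"])
    return act * (1.0 - rep)


def locus_expression(tf_conc, enhancers, weights):
    """Tier 2: independent enhancer outputs are combined linearly."""
    return sum(w * enhancer_output(tf_conc, e) for e, w in zip(enhancers, weights))


# Hypothetical locus with two enhancers sharing the same activator and repressor.
enhancers = [
    {"activator": "A", "repressor": "R", "k_act": 1.0, "k_rep": 1.0},
    {"activator": "A", "repressor": "R", "k_act": 3.0, "k_rep": 0.0},
]
tf_conc = {"A": 1.0, "R": 0.0}

print(locus_expression(tf_conc, enhancers, weights=[2.0, 4.0]))
```

    Because tier 2 is a plain weighted sum, no binding site in one enhancer can affect the output of another, which corresponds to the enhancer-independence assumption the abstract examines.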

  7. Thermodynamics-based models of transcriptional regulation with gene sequence.

    Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing


    Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous time, differential equation description of transcriptional dynamics. The sequence features of the promoter are exploited to derive the binding affinity, which is obtained from statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model yields results that are more biologically meaningful.

  8. Cloning and sequencing of a Moraxella bovis pilin gene.

    Marrs, C F; Schoolnik, G; Koomey, J M; Hardy, J; Rothbard, J; Falkow, S


    Moraxella bovis pili have been shown to play a major role in both infectivity and protective immunity of bovine infectious keratoconjunctivitis. Sonicated M. bovis DNA from the piliated strain EPP63 was inserted into the vector lambda gt11 with EcoRI linkers. Recombinant phage were screened with an oligonucleotide probe based on the amino-terminal portion of the DNA sequence of a Neisseria gonorrhoeae pilin gene. Two candidate phages produced a protein that comigrated with EPP63 beta pilin in sodium dodecyl sulfate-polyacrylamide gels and bound anti-pilus antisera. The 1.9-kilobase insert from one of these, lambda gt11M182, was subcloned in both orientations into pBR322, forming the plasmids pMxB7 and pMxB9, both of which produced beta pilin, as did pMxB12, a HindIII deletion derivative of pMxB7. In HB101(pMxB12), the M. bovis pilin protein was shown to be primarily localized in the inner membrane. The entire 939-base-pair insert of pMxB12 was sequenced, revealing a ribosome binding site just upstream of the coding region and an AT-rich region further upstream containing some potential RNA polymerase recognition sites. The translation of the sequence predicts a six-amino-acid leader sequence preceding the phenylalanine that begins the mature protein. Codon usage analysis of the M. bovis beta pilin gene revealed greater use of the CUA codon for leucine than usual for a well-expressed Escherichia coli gene. Comparisons of the M. bovis EPP63 beta pilin protein sequence with other pilin gene sequences are presented.

  9. Sequence variations in the FAD2 gene in seeded pumpkins.

    Ge, Y; Chang, Y; Xu, W L; Cui, C S; Qu, S P


    Seeded pumpkins are important economic crops; the seeds contain various unsaturated fatty acids, such as oleic acid and linoleic acid, which are crucial for human and animal nutrition. The fatty acid desaturase-2 (FAD2) gene encodes delta-12 desaturase, which converts oleic acid to linoleic acid. However, little is known about sequence variations in FAD2 in seeded pumpkins. Twenty-seven FAD2 clones from 27 accessions of Cucurbita moschata, Cucurbita maxima, Cucurbita pepo, and Cucurbita ficifolia were obtained (1152 bp in total; a single gene without introns). More than 90% nucleotide identity was detected among the 27 FAD2 clones. Nucleotide substitution, rather than nucleotide insertion and deletion, led to sequence polymorphism in the 27 FAD2 clones. Furthermore, the 27 selected FAD2 clones all encoded the FAD2 enzyme (delta-12 desaturase) with amino acid sequence identities from 91.7 to 100% for 384 amino acids. The same main-function domain between amino acids 47 and 329 was identified. The four species clustered separately based on differences in the sequences, as identified using the unweighted pair group method with arithmetic mean. Geographic origin and species were found to be closely related to sequence variation in FAD2.
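
    The unweighted pair group method with arithmetic mean (UPGMA) mentioned above repeatedly merges the two closest clusters and averages their distances to every other cluster, weighted by cluster size. A compact Python sketch follows; the three labels and the distance values are hypothetical and are not taken from the FAD2 data.

```python
from itertools import combinations


def upgma(names, dist):
    """Return the list of (merged_cluster_label, node_height) in merge order.

    dist maps a pair of labels, e.g. ("A", "B"), to their distance.
    """
    sizes = {name: 1 for name in names}
    d = {frozenset(pair): value for pair, value in dist.items()}
    merges = []
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sorted(sizes), 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2.0
        new = f"({a},{b})"
        # Distance from the new cluster to each other cluster: size-weighted mean.
        for c in sizes:
            if c not in (a, b):
                d[frozenset((new, c))] = (
                    sizes[a] * d[frozenset((a, c))] + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[new] = sizes.pop(a) + sizes.pop(b)
        merges.append((new, height))
    return merges


# Hypothetical pairwise distances between three sequences.
distances = {("A", "B"): 2.0, ("A", "C"): 4.0, ("B", "C"): 6.0}
print(upgma(["A", "B", "C"], distances))
```

    Here A and B are joined first because they are closest; their averaged distance to C, (4 + 6) / 2 = 5, then sets the height of the root at 2.5.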

  10. Informational structure of genetic sequences and nature of gene splicing

    Trifonov, E. N.


    Only about 1/20 of the DNA of higher organisms codes for proteins, by means of the classical triplet code. The rest of the DNA sequence is largely silent, with unclear functions, if any. The triplet code is not the only code (message) carried by the sequences. There are three levels of molecular communication, where the same sequence "talks" to various biomolecules, while having, respectively, three different appearances: DNA, RNA and protein. Since the molecular structures and, hence, sequence-specific preferences of these are substantially different, the original DNA sequence has to carry simultaneously three types of sequence patterns (codes, messages), thus being a composite structure in which one and the same letter (nucleotide) is frequently involved in several overlapping codes of different nature. This multiplicity and overlapping of the codes is a unique feature of Gnomic, the language of genetic sequences. The coexisting codes have to be degenerate in various degrees to allow an optimal and concerted performance of all the encoded functions. There is an obvious conflict between the best possible performance of a given function and the necessity to compromise the quality of a given sequence pattern in favor of other patterns. It appears that the major role of various changes in the sequences on their "ontogenetic" way from DNA to RNA to protein, like RNA editing and splicing, or protein post-translational modifications, is to resolve such conflicts. New data are presented strongly indicating that gene splicing is such a device to resolve the conflict between the code of DNA folding in chromatin and the triplet code for protein synthesis.

  11. Variations in CCL3L gene cluster sequence and non-specific gene copy numbers

    Edberg Jeffrey C


    Full Text Available Abstract Background Copy number variations (CNVs) of the gene CC chemokine ligand 3-like1 (CCL3L1) have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70% to 100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. By use of MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR based CNV assays are not specific for only CCL3L1. The four available assays measured median copies of 2 and 3-4 in European and African American, respectively. The concordance between the assays ranged from 0.44 to 0.83, suggesting individual discordant calls and inconsistencies with the assays from the expected gene coverage from the known sequence. Conclusions This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogeneous results. Sequence information to determine CNV of the three genes separately would make it possible to test whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.

  12. Cloning, sequencing and phylogenic analysis of duck prion gene

    WANG Qigui; ZHANG Lei; HU Xiaoxiang; FAN Baoliang; LI Ning; LI Hui; WU Changxin


    The duck prion gene was cloned and sequenced. Similar to mammalian prion protein (PrP), duck prion is encoded by a single exon of a single-copy gene in the genome, which was confirmed by Southern blot analysis. All of the structural features of mammalian PrP were also identified in the duck PrP. Compared with mammalian PrP, it exhibited 30% general similarity. When compared with chicken PrP, it showed a higher homology of 97%. A phylogenetic tree was constructed to trace the evolution of the prion gene in animals.

  13. Cloning and sequencing of a Moraxella bovis pilin gene.


    Moraxella bovis pili have been shown to play a major role in both infectivity and protective immunity of bovine infectious keratoconjunctivitis. Sonicated M. bovis DNA from the piliated strain EPP63 was inserted into the vector lambda gt11 with EcoRI linkers. Recombinant phage were screened with an oligonucleotide probe based on the amino-terminal portion of the DNA sequence of a Neisseria gonorrhoeae pilin gene. Two candidate phages produced a protein that comigrated with EPP63 beta pilin in...

  14. Full-length minor ampullate spidroin gene sequence.

    Gefei Chen

    Full Text Available Spider silk includes seven protein-based fibers and glue-like substances produced by glands in the spider's abdomen. Minor ampullate silk is used to make the auxiliary spiral of the orb-web and also for wrapping prey, has a high tensile strength and does not supercontract in water. So far, only partial cDNA sequences have been obtained for minor ampullate spidroins (MiSps). Here we describe the first MiSp full-length gene sequence from the spider species Araneus ventricosus, using a multidimensional PCR approach. Comparative analysis of the sequence reveals regulatory elements, as well as unique spidroin gene and protein architecture including the presence of an unusually large intron. The spliced full-length transcript of the MiSp gene is 5440 bp in size and encodes 1766 amino acid residues organized into conserved nonrepetitive N- and C-terminal domains and a central predominantly repetitive region composed of four units that are iterated in a non-regular manner. The repeats are more conserved within A. ventricosus MiSp than compared to repeats from homologous proteins, and are interrupted by two nonrepetitive spacer regions, which have 100% identity even at the nucleotide level.

  15. Byssochlamys nivea with patulin-producing capability has an isoepoxydon dehydrogenase gene (idh) with sequence homology to Penicillium expansum and P. griseofulvum.

    Dombrink-Kurtzman, Mary Ann; Engberg, Amy E


    Nucleotide sequences of the isoepoxydon dehydrogenase gene (idh) for eight strains of Byssochlamys nivea were determined by constructing GenomeWalker libraries. A striking finding was that all eight strains of B. nivea examined had identical nucleotide sequences, including those of the two introns present. The length of intron 2 was nearly three times the size of introns in strains of Penicillium expansum and P. griseofulvum, but intron 1 was comparable in size to the number of nucleotides present in introns 1 and 2 of P. expansum and P. griseofulvum. A high degree of amino acid homology (88%) existed for the idh genes of the strains of B. nivea when compared with sequences of P. expansum and P. griseofulvum. There were many nucleotide differences present, but they did not affect the amino acid sequence because they were present in the third position. The identity of the B. nivea isolates was confirmed by sequencing the ITS/partial LSU (28 S) rDNA genes. Four B. nivea strains were analysed for production of patulin, a mycotoxin found primarily in apple juice and other fruit products. The B. nivea strains produced patulin in amounts comparable to P. expansum strains. Interest in the genus Byssochlamys is related to the ability of its ascospores to survive pasteurization and cause spoilage of heat-processed fruit products worldwide.
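
    The observation that third-position nucleotide differences left the amino acid sequence unchanged reflects the degeneracy of the genetic code. The sketch below illustrates the idea with the standard genetic code and two short toy sequences (not real idh data). It assumes both sequences are in frame, ungapped, of equal length and a multiple of three; `translate` and `classify_differences` are hypothetical helper names.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate an in-frame DNA sequence; trailing partial codons are dropped."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def classify_differences(a: str, b: str):
    """List (position, codon_position, synonymous) for each mismatch.

    Positions are 1-based; codon_position is 1, 2 or 3.
    """
    a, b = a.upper(), b.upper()
    result = []
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            start = i - i % 3  # first base of the codon containing position i
            synonymous = CODON_TABLE[a[start:start + 3]] == CODON_TABLE[b[start:start + 3]]
            result.append((i + 1, i % 3 + 1, synonymous))
    return result


if __name__ == "__main__":
    seq_a = "CTTGCAGGA"  # toy sequence
    seq_b = "CTCGCTGGG"  # differs only at third codon positions
    print(translate(seq_a), translate(seq_b))
    print(classify_differences(seq_a, seq_b))
```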

  16. Technology development for gene discovery and full-length sequencing

    Marcelo Bento Soares


    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  17. Angiosperm phylogeny inferred from sequences of four mitochondrial genes

    Yin-Long QIU; Zhi-Duan CHEN; Libo LI; Bin WANG; Jia-Yu XUE; Tory A. HENDRY; Rui-Qi LI; Joseph W. BROWN; Yang LIU; Geordan T. HUDSON


    An angiosperm phylogeny was reconstructed in a maximum likelihood analysis of sequences of four mitochondrial genes, atp1, matR, nad5, and rps3, from 380 species that represent 376 genera and 296 families of seed plants. It is largely congruent with the phylogeny of angiosperms reconstructed from chloroplast genes atpB, matK, and rbcL, and nuclear 18S rDNA. The basalmost lineage consists of Amborella and Nymphaeales (including Hydatellaceae). Austrobaileyales follow this clade and are sister to the mesangiosperms, which include Chloranthaceae, Ceratophyllum, magnoliids, monocots, and eudicots. With the exception of Chloranthaceae being sister to Ceratophyllum, relationships among these five lineages are not well supported. In eudicots, Ranunculales, Sabiales, Proteales, Trochodendrales, Buxales, Gunnerales, Saxifragales, Vitales, Berberidopsidales, and Dilleniales form a basal grade of lines that diverged before the diversification of rosids and asterids. Within rosids, the COM (Celastrales-Oxalidales-Malpighiales) clade is sister to malvids (or rosid II), instead of to the nitrogen-fixing clade as found in all previous large-scale molecular analyses of angiosperms. Santalales and Caryophyllales are members of an expanded asterid clade. This study shows that the mitochondrial genes are informative markers for resolving relationships among genera, families, or higher rank taxa across angiosperms. The low substitution rates and low homoplasy levels of the mitochondrial genes relative to the chloroplast genes, as found in this study, make them particularly useful for reconstructing ancient phylogenetic relationships. A mitochondrial gene-based angiosperm phylogeny provides an independent and essential reference for comparison with hypotheses of angiosperm phylogeny based on chloroplast genes, nuclear genes, and non-molecular data to reconstruct the underlying organismal phylogeny.

  18. Detection and sequence analysis of accessory gene regulator genes of Staphylococcus pseudintermedius isolates

    M. Ananda Chitra


    Full Text Available Background: Staphylococcus pseudintermedius (SP) is the major pathogenic species of dogs involved in a wide variety of skin and soft tissue infections. The accessory gene regulator (agr) locus of Staphylococcus aureus has been extensively studied, and it influences the expression of many virulence genes. It encodes a two-component signal transduction system that leads to down-regulation of surface proteins and up-regulation of secreted proteins during in vitro growth of S. aureus. The objective of this study was to detect and sequence-analyze the AgrA, B, and D of SP isolated from canine skin infections. Materials and Methods: In this study, we isolated SP from canine pyoderma and otitis cases, identified it by polymerase chain reaction (PCR), and confirmed it by PCR-restriction fragment length polymorphism. Primers for SP agrA and agrBD genes were designed using online primer designing software and BLAST searched for specificity. Amplification of the agr genes was carried out for 53 isolates of SP by PCR, and sequencing of agrA, B, and D was carried out for five isolates and analyzed using DNAstar and Mega5.2 software. Results: A total of 53 (59%) SP isolates were obtained from 90 samples. Fifteen isolates (28%) were confirmed to be methicillin-resistant SP (MRSP) with the detection of the mecA gene. Accessory gene regulator A, B, and D genes were detected in all the SP isolates. Complete nucleotide sequences of the above three genes for five isolates were submitted to GenBank, and their accession numbers are from KJ133557 to KJ133571. AgrA amino acid sequence analysis showed that it is mainly made of alpha-helices and is hydrophilic in nature. AgrB is a transmembrane protein, and AgrD encodes the precursor of the autoinducing peptide (AIP). Sequencing of the agrD gene revealed that the 5 canine SP strains tested could be divided into three Agr specificity groups (RIPTSTGFF, KIPTSTGFF, and RIPISTGFF) based on the putative AIP produced by each strain.

  19. Nuclear gene sequences from a late pleistocene sloth coprolite.

    Poinar, Hendrik; Kuch, Melanie; McDonald, Gregory; Martin, Paul; Pääbo, Svante


    The determination of nuclear DNA sequences from ancient remains would open many novel opportunities such as the resolution of phylogenies, the sexing of hominid and animal remains, and the characterization of genes involved in phenotypic traits. However, to date, single-copy nuclear DNA sequences from fossils have been determined only from bones and teeth of woolly mammoths preserved in the permafrost. Since the best preserved ancient nucleic acids tend to stem from cold environments, this has led to the assumption that nuclear DNA would be retrievable only from frozen remains. We have previously shown that Pleistocene coprolites stemming from the extinct Shasta sloth (Nothrotheriops shastensis, Megatheriidae) contain mitochondrial (mt) DNA from the animal that produced them as well as chloroplast (cp) DNA from the ingested plants. Recent attempts to resolve the phylogeny of two families of extinct sloths by using strictly mitochondrial DNA have been inconclusive. We have prepared DNA extracts from a ground sloth coprolite from Gypsum Cave, Nevada, and quantitated the number of mtDNA copies for three different fragment lengths by using real-time PCR. We amplified one multicopy and three single-copy nuclear gene fragments and used the concatenated sequence to resolve the phylogeny. These results show that ancient single-copy nuclear DNA can be recovered from warm, arid climates. Thus, nuclear DNA preservation is not restricted to cold climates.

  20. Cloning and sequence analysis of US1 gene in duck enteritis virus

    ZHAO Yan; WANG Jun-wei; MA Bo; ZHAO Xiao-yan


    In this paper, a 1,860 bp sequence in the IRs region of duck enteritis virus (DEV) was amplified by single oligonucleotide nested PCR with a single primer designed according to a partial sequence of US1, and then a pair of primers designed according to the 3' UTR of the US8 gene and the 5' end of the newly obtained sequence was used to amplify a 2,426 bp sequence toward the TRs region. Sequence analysis revealed that both sequences contained an identical 990 bp open reading frame of the DEV US1 gene. The two ORFs were in opposite transcription orientations. Comparison of the nucleotide sequence and the deduced amino acid sequence of the US1 gene showed relatively high identity to Mardivirus. Phylogenetic tree analysis showed that the eleven herpesviruses were classified into three groups, and that duck enteritis virus was most closely related to Mardivirus.
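
    Finding the same open reading frame in opposite orientations amounts to scanning both the given strand and its reverse complement. The following is a minimal sketch of that idea, assuming a simple ATG-to-first-in-frame-stop definition of an ORF and an unambiguous ACGT alphabet. It is not the analysis software used in the study, and `longest_orf` is a hypothetical name.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# ATG, then as few whole codons as possible, then the first in-frame stop codon.
ORF_PATTERN = re.compile(r"ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_orf(seq: str):
    """Return (strand, orf) for the longest ORF on either strand.

    strand is '+' for the given sequence and '-' for its reverse complement;
    orf is an empty string if no ORF is found.
    """
    best = ("+", "")
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for start in range(len(s)):
            m = ORF_PATTERN.match(s, start)
            if m and len(m.group()) > len(best[1]):
                best = (strand, m.group())
    return best


if __name__ == "__main__":
    forward = "CCATGAAACCCTAGGG"            # toy sequence with one short ORF
    print(longest_orf(forward))
    print(longest_orf(reverse_complement(forward)))
```

    In the toy example the same ORF is reported on the "+" strand of one sequence and the "-" strand of the other, mirroring the opposite transcription orientations described above.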

  1. Sequence polymorphism and evolution of three cetacean MHC genes.

    Xu, Shi Xia; Ren, Wen Hua; Li, Shu Zhen; Wei, Fu Wen; Zhou, Kai Ya; Yang, Guang


    Sequence variability at three major histocompatibility complex (MHC) genes (DQB, DRA, and MHC-I) of cetaceans was investigated in order to get an overall understanding of cetacean MHC evolution. Little sequence variation was detected at the DRA locus, while extensive variability was found at the MHC-I and DQB loci. Phylogenetic reconstruction and sequence comparison revealed extensive sharing of identical MHC alleles among different species at the three MHC loci examined. Comparisons of phylogenetic trees for these MHC loci with the trees reconstructed based only on non-PBR sites revealed that allelic similarity/identity possibly reflected common ancestry and was not due to adaptive convergence. At the same time, trans-species evolution was also evident: according to the relaxed molecular clock, the allelic diversity of the three MHC loci clearly pre-dated species divergence events. Balancing selection may be acting to maintain the high sequence variability and identical alleles in a trans-specific manner at the MHC-I and DQB loci.

  2. Mining Association Rules in Dengue Gene Sequence with Latent Periodicity

    Marimuthu Thangam


    Full Text Available The mining of periodic patterns in a dengue database is an interesting research problem that can be used for predicting the future evolution of dengue viruses. In this paper, we propose an algorithm called Recurrence Finder (RECFIN) that uses a suffix tree for detecting the periodic patterns of dengue gene sequences. RECFIN also finds the presence of palindromes, which indicate the possible formation of proteins. Further, this paper computes the periodicity of nucleic acid and amino acid sequences of any length. The periodicity-based association rules are used to diagnose the type of dengue. The time complexity of the proposed algorithm is O(n^2). We demonstrate the effectiveness of the proposed approach by comparing the experimental results obtained on a dengue virus serotype dataset with the NCBI-BLAST algorithm.
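
    To make the notion of sequence periodicity concrete, the sketch below scores each candidate period p by the fraction of positions i at which the symbol at i equals the symbol at i + p. This is a naive quadratic-time scan written for illustration; it is not RECFIN and does not use a suffix tree. The function name `best_period` and the scoring rule are assumptions.

```python
def best_period(seq: str, max_period=None):
    """Return (period, score) for the best-scoring period.

    score is the fraction of positions i with seq[i] == seq[i + period].
    Ties are resolved in favour of the smallest period. For sequences
    shorter than two symbols the result is (0, -1.0).
    """
    seq = seq.upper()
    n = len(seq)
    if max_period is None:
        max_period = n // 2
    best_p, best_score = 0, -1.0
    for p in range(1, max_period + 1):
        pairs = n - p  # number of (i, i + p) pairs to compare
        matches = sum(seq[i] == seq[i + p] for i in range(pairs))
        score = matches / pairs
        if score > best_score:
            best_p, best_score = p, score
    return best_p, best_score


if __name__ == "__main__":
    # Toy sequence with a perfect period of 3, not real dengue data.
    print(best_period("ACGACGACGACG"))
```

    Each period costs O(n) comparisons and up to n/2 periods are tried, so the scan as a whole is O(n^2) in the sequence length.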

  3. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons.

    Olson, Nathan D; Lund, Steven P; Zook, Justin M; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B


    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.

  4. Detecting gene mutations in Japanese Alzheimer's patients by semiconductor sequencing.

    Yagi, Ryoichi; Miyamoto, Ryosuke; Morino, Hiroyuki; Izumi, Yuishin; Kuramochi, Masahito; Kurashige, Takashi; Maruyama, Hirofumi; Mizuno, Noriyoshi; Kurihara, Hidemi; Kawakami, Hideshi


    Alzheimer's disease (AD) is the most common form of dementia. To date, several genes have been identified as the cause of AD, including PSEN1, PSEN2, and APP. The association between APOE and late-onset AD has also been reported. We here used a bench top next-generation sequencer, which uses an integrated semiconductor device, detects hydrogen ions, and operates at a high-speed using nonoptical technology. We examined 45 Japanese AD patients with positive family histories, and 29 sporadic patients with early onset (useful for detecting genetic variations in familial AD.

  5. A sequence-based approach to identify reference genes for gene expression analysis

    Chari Raj


    Full Text Available Abstract Background An important consideration when analyzing both microarray and quantitative PCR expression data is the selection of appropriate genes as endogenous controls or reference genes. This step is especially critical when identifying genes differentially expressed between datasets. Moreover, reference genes suitable in one context (e.g. lung cancer) may not be suitable in another (e.g. breast cancer). Currently, the main approach to identify reference genes involves the mining of expression microarray data for highly expressed and relatively constant transcripts across a sample set. A caveat here is the requirement for transcript normalization prior to analysis, and measurements obtained are relative, not absolute. Alternatively, as sequencing-based technologies provide digital quantitative output, absolute quantification ensues, and reference gene identification becomes more accurate. Methods Serial analysis of gene expression (SAGE) profiles of non-malignant and malignant lung samples were compared using a permutation test to identify the most stably expressed genes across all samples. Subsequently, the specificity of the reference genes was evaluated across multiple tissue types, their constancy of expression was assessed using quantitative RT-PCR (qPCR), and their impact on differential expression analysis of microarray data was evaluated. Results We show that (i) conventional reference genes such as ACTB and GAPDH are highly variable between cancerous and non-cancerous samples, (ii) reference genes identified for lung cancer do not perform well for other cancer types (breast and brain), (iii) reference genes identified through SAGE show low variability using qPCR in a different cohort of samples, and (iv) normalization of a lung cancer gene expression microarray dataset with or without our reference genes yields different results for differential gene expression and subsequent analyses. Specifically, key established pathways in lung

  6. Nuclear 28S rDNA phylogeny supports the basal placement of Noctiluca scintillans (Dinophyceae; Noctilucales) in dinoflagellates.

    Ki, Jang-Seu


    Noctiluca scintillans (Macartney) Kofoid et Swezy, 1921 is an unarmoured heterotrophic dinoflagellate with a global distribution, and has been considered one of the ancestral taxa among dinoflagellates. Recently, 18S rDNA, actin, alpha-, beta-tubulin, and Hsp90-based phylogenies have shown the basal position of the noctilucids. However, the relationships of dinoflagellates in the basal lineages are still controversial. Although the nuclear rDNA (e.g. 18S, ITS-5.8S, and 28S) contains much genetic information, DNA sequences of N. scintillans rDNA molecules have not yet been sufficiently characterized. Here the author sequenced a long-range nuclear rDNA, spanning from the 18S to the D5 region of the 28S rDNA, of N. scintillans. The present N. scintillans had a nearly identical genotype (>99.0% similarity) compared to other Noctiluca sequences from different geographic origins. Nucleotide divergence in the partial 28S rDNA was significantly high (pdinoflagellates, two perkinsids, and two apicomplexans as outgroups showed that N. scintillans and Oxyrrhis marina formed a clade that diverged separately from core dinoflagellates.

  7. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya


    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  8. Solving the molecular diagnostic testing conundrum for Mendelian disorders in the era of next-generation sequencing: single-gene, gene panel, or exome/genome sequencing.

    Xue, Yuan; Ankala, Arunkanth; Wilcox, William R; Hegde, Madhuri R


    Next-generation sequencing is changing the paradigm of clinical genetic testing. Today there are numerous molecular tests available, including single-gene tests, gene panels, and exome sequencing or genome sequencing. As a result, ordering physicians face the conundrum of selecting the best diagnostic tool for their patients with genetic conditions. Single-gene testing is often most appropriate for conditions with distinctive clinical features and minimal locus heterogeneity. Next-generation sequencing-based gene panel testing, which can be complemented with array comparative genomic hybridization and other ancillary methods, provides a comprehensive and feasible approach for heterogeneous disorders. Exome sequencing and genome sequencing have the advantage of being unbiased regarding what set of genes is analyzed, enabling parallel interrogation of most of the genes in the human genome. However, current limitations of next-generation sequencing technology and our variant interpretation capabilities caution us against offering exome sequencing or genome sequencing as either stand-alone or first-choice diagnostic approaches. A growing interest in personalized medicine calls for the application of genome sequencing in clinical diagnostics, but major challenges must be addressed before its full potential can be realized. Here, we propose a testing algorithm to help clinicians opt for the most appropriate molecular diagnostic tool for each scenario.

  9. Estimating the extent of horizontal gene transfer in metagenomic sequences

    Moya Andrés


    Full Text Available Abstract Background Although the extent of horizontal gene transfer (HGT) in complete genomes has been widely studied, its influence in the evolution of natural communities of prokaryotes remains unknown. The availability of metagenomic sequences allows us to address the study of global patterns of prokaryotic evolution in samples from natural communities. However, the methods that have been commonly used for the study of HGT are not suitable for metagenomic samples. Therefore it is important to develop new methods or to adapt existing ones to be used with metagenomic sequences. Results We have created two different methods that are suitable for the study of HGT in metagenomic samples. The methods are based on phylogenetic and DNA compositional approaches, and have allowed us to assess the extent of possible HGT events in metagenomes for the first time. The methods are shown to be compatible and quite precise, although they probably underestimate the number of possible events. Our results show that the phylogenetic method detects HGT in between 0.8% and 1.5% of the sequences, while DNA compositional methods identify putative HGT in between 2% and 8% of the sequences. These ranges are very similar to those found in complete genomes by related approaches. The two methods differ in sensitivity since they probably target HGT events of different ages: the compositional method mostly identifies recent transfers, while the phylogenetic method is more suitable for the detection of older events. Nevertheless, the study of the number of HGT events in metagenomic sequences from different communities shows a consistent trend for both methods: the lowest amount is found for the sequences of the Sargasso Sea metagenome, while the highest quantity is found in the whale fall metagenome from the bottom of the ocean. The significance of these observations is discussed. Conclusion The computational approaches that are used to find possible HGT events in complete
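
    DNA compositional approaches look for sequences whose base composition departs from that of their genomic background. As a deliberately simple illustration, the sketch below flags sequences whose GC content lies more than a chosen number of standard deviations from the mean of the set. The abstract does not describe the compositional statistic actually used, so the GC-content criterion, the cutoff of 2.0, and the name `flag_atypical` are all assumptions.

```python
from statistics import mean, pstdev


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def flag_atypical(seqs: dict, z_cutoff: float = 2.0) -> list:
    """Return sorted names of sequences with atypical GC content.

    A sequence is flagged when its GC content is more than z_cutoff
    population standard deviations from the mean of all sequences.
    """
    gcs = {name: gc_content(s) for name, s in seqs.items()}
    mu = mean(gcs.values())
    sd = pstdev(gcs.values())
    if sd == 0:
        return []  # all sequences have identical composition
    return sorted(name for name, gc in gcs.items() if abs(gc - mu) / sd > z_cutoff)


if __name__ == "__main__":
    # Toy data: five sequences at 50% GC and one at 100% GC.
    toy = {f"gene{i}": "ATGC" * 5 for i in range(5)}
    toy["candidate"] = "GCGC" * 5
    print(flag_atypical(toy))
```

    A composition-based signal of this kind fades as transferred DNA gradually takes on the composition of its new host, which fits the abstract's remark that the compositional method mostly identifies recent transfers.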

  10. Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications

    Yilmaz, Pelin; Kottmann, Renzo; Field, Dawn


    Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences--the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The 'environmental...... present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere....

  11. Sequencing, characterization, and gene expression analysis of the histidine decarboxylase gene cluster of Morganella morganii.

    Ferrario, Chiara; Borgo, Francesca; de Las Rivas, Blanca; Muñoz, Rosario; Ricci, Giovanni; Fortina, Maria Grazia


    The histidine decarboxylase gene cluster of Morganella morganii DSM30146(T) was sequenced, and four open reading frames, named hdcT1, hdc, hdcT2, and hisRS, were identified. Two putative histidine/histamine antiporters (hdcT1 and hdcT2) were located upstream and downstream of the hdc gene, which encodes a pyridoxal-P dependent histidine decarboxylase, and were followed by the hisRS gene encoding a histidyl-tRNA synthetase. This organization was comparable with the gene cluster of other known Gram-negative bacteria, particularly with that of Klebsiella oxytoca. Recombinant Escherichia coli strains harboring plasmids carrying the M. morganii hdc gene were shown to overproduce histidine decarboxylase after IPTG induction at 37 °C for 4 h. Quantitative RT-PCR experiments revealed that the hdc and hisRS genes were highly induced under acidic and histidine-rich conditions. This work represents the first description and identification of the hdc-related genes in M. morganii. Results support the hypothesis that the histidine decarboxylation reaction in this prolific histamine-producing species may play a role in acid survival. The knowledge of the role and the regulation of genes involved in histidine decarboxylation should improve the design of rational strategies to avoid toxic histamine production in foods.

  12. Variation in the nucleotide sequence of a prolamin gene family in wild rice.

    Barbier, P; Ishihama, A


    Variation in the DNA sequence of the 10 kDa prolamin gene family within the wild rice species Oryza rufipogon was probed using the direct sequencing of PCR-amplified genes. A comparison of the nucleotide and deduced amino-acid sequences of eight Asian strains of O. rufipogon and one strain of the related African species O. longistaminata is presented.

  13. Molecular Cloning and Sequencing of Hemoglobin-Beta Gene of Channel Catfish, Ictalurus Punctatus Rafinesque

    Hemoglobin-β gene of channel catfish, Ictalurus punctatus, was cloned and sequenced. Total RNA from head kidneys was isolated, reverse transcribed and amplified. The sequence of the channel catfish hemoglobin-β gene consists of 600 nucleotides. Analysis of the nucleotide sequence reveals one o...

  14. EcoGene: a genome sequence database for Escherichia coli K-12.

    Rudd, K E


    The EcoGene database provides a set of gene and protein sequences derived from the genome sequence of Escherichia coli K-12. EcoGene is a source of re-annotated sequences for the SWISS-PROT and Colibri databases. EcoGene is used for genetic and physical map compilations in collaboration with the Coli Genetic Stock Center. The EcoGene12 release includes 4293 genes. EcoGene12 differs from the GenBank annotation of the complete genome sequence in several ways, including (i) the revision of 706 predicted or confirmed gene start sites, (ii) the correction or hypothetical reconstruction of 61 frame-shifts caused by either sequence error or mutation, (iii) the reconstruction of 14 protein sequences interrupted by the insertion of IS elements, and (iv) predictions that 92 genes are partially deleted gene fragments. A literature survey identified 717 proteins whose N-terminal amino acids have been verified by sequencing. 12 446 cross-references to 6835 literature citations and s are provided. EcoGene is accessible at a new website. Users can search and retrieve individual EcoGene GenePages or they can download large datasets for incorporation into database management systems, facilitating various genome-scale computational and functional analyses.

  15. Mutational analysis of DBD*--a unique antileukemic gene sequence.

    Ji, Yan-shan; Johnson, Betty H; Webb, M Scott; Thompson, E Brad


    DBD* is a novel gene encoding an 89 amino acid peptide that is constitutively lethal to leukemic cells. DBD* was derived from the DNA binding domain of the human glucocorticoid receptor by a frameshift that replaces the final 21 C-terminal amino acids of the domain. Previous studies suggested that DBD* no longer acted as the natural DNA binding domain. To confirm and extend these results, we mutated DBD* in 29 single amino acid positions, critical for the function in the native domain or of possible functional significance in the novel 21 amino acid C-terminal sequence. Steroid-resistant leukemic ICR-27-4 cells were transiently transfected by electroporation with each of the 29 mutants. Cell kill was evaluated by trypan blue dye exclusion, a WST-1 tetrazolium-based assay for cell respiration, propidium iodide exclusion, and Hoechst 33258 staining of chromatin. Eleven of the 29 point mutants increased, whereas four decreased antileukemic activity. The remainder had no effect on activity. The nonconcordances between these effects and native DNA binding domain function strongly suggest that the lethality of DBD* is distinct from that of the glucocorticoid receptor. Transfections of fragments of DBD* showed that optimal activity localized to the sequence for its C-terminal 32 amino acids.

  16. Cloning, sequencing and expression of a xylanase gene from the maize pathogen Helminthosporium turcicum

    Degefu, Y.; Paulin, L.; Lübeck, Peter Stephensen


    A gene encoding an endoxylanase from the phytopathogenic fungus Helminthosporium turcicum Pass. was cloned and sequenced. The entire nucleotide sequence of a 1991 bp genomic fragment containing an endoxylanase gene was determined. The xylanase gene of 795 bp, interrupted by two introns of 52 and ...

  17. Poly purine.pyrimidine sequences upstream of the beta-galactosidase gene affect gene expression in Saccharomyces cerevisiae

    Brahmachari Samir K


    Background: Poly purine.pyrimidine sequences have the potential to adopt intramolecular triplex structures and are overrepresented upstream of genes in eukaryotes. These sequences may regulate gene expression by modulating the interaction of transcription factors with DNA sequences upstream of genes. Results: A poly purine.pyrimidine sequence with the potential to adopt an intramolecular triplex DNA structure was designed. The sequence was inserted within a nucleosome positioned upstream of the β-galactosidase gene in yeast, Saccharomyces cerevisiae, between the cyc1 promoter and the gal10 Upstream Activating Sequences (UASg). Upon derepression with galactose, β-galactosidase gene expression is reduced 12-fold in cells carrying single copy poly purine.pyrimidine sequences. This reduction in expression is correlated with reduced transcription. Furthermore, we show that plasmids carrying a poly purine.pyrimidine sequence are not specifically lost from yeast cells. Conclusion: We propose that a poly purine.pyrimidine sequence upstream of a gene affects transcription. Plasmids carrying this sequence are not specifically lost from cells and thus no additional effort is needed for the replication of these sequences in eukaryotic cells.

  18. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system.

    Schloss, Patrick D; Jenior, Matthew L; Koumpouras, Charles C; Westcott, Sarah L; Highlander, Sarah K


    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.

  19. The Clinical Significance of Unknown Sequence Variants in BRCA Genes

    Calò, Valentina; Bruno, Loredana; Paglia, Laura La; Perez, Marco; Margarese, Naomi [Department of Surgery and Oncology, Regional Reference Center for the Biomolecular Characterization and Genetic Screening of Hereditary Tumors, University of Palermo, Via del Vespro 127, 90127 Palermo (Italy)]; Gaudio, Francesca Di [Department of Medical Biotechnologies and Legal Medicine, University of Palermo, Palermo (Italy)]; Russo, Antonio [Department of Surgery and Oncology, Regional Reference Center for the Biomolecular Characterization and Genetic Screening of Hereditary Tumors, University of Palermo, Via del Vespro 127, 90127 Palermo (Italy)]


    Germline mutations in BRCA1/2 genes are responsible for a large proportion of hereditary breast and/or ovarian cancers. Many highly penetrant predisposition alleles have been identified and include frameshift or nonsense mutations that lead to the translation of a truncated protein. Other alleles contain missense mutations, which result in amino acid substitution and intronic variants with splicing effect. The discovery of variants of uncertain/unclassified significance (VUS) is a result that can complicate rather than improve the risk assessment process. VUSs are mainly missense mutations, but also include a number of intronic variants and in-frame deletions and insertions. Over 2,000 unique BRCA1 and BRCA2 missense variants have been identified, located throughout the whole gene (Breast Cancer Information Core Database (BIC database)). Up to 10–20% of the BRCA tests report the identification of a variant of uncertain significance. There are many methods to discriminate deleterious/high-risk from neutral/low-risk unclassified variants (i.e., analysis of the cosegregation in families of the VUS, measure of the influence of the VUSs on the wild-type protein activity, comparison of sequence conservation across multiple species), but only an integrated analysis of these methods can contribute to a real interpretation of the functional and clinical role of the discussed variants. The aim of our manuscript is to review the studies on BRCA VUS in order to clarify their clinical relevance.

  20. Targeted Sequencing Reveals Large-Scale Sequence Polymorphism in Maize Candidate Genes for Biomass Production and Composition.

    Moses M Muraya

    A major goal of maize genomic research is to identify sequence polymorphisms responsible for phenotypic variation in traits of economic importance. Large-scale detection of sequence variation is critical for linking genes, or genomic regions, to phenotypes. However, due to the size and complexity of the maize genome, it remains expensive to generate whole genome sequences of sufficient coverage for divergent maize lines, even with access to next generation sequencing (NGS) technology. Because methods involving reduction of genome complexity, such as genotyping-by-sequencing (GBS), assess only a limited fraction of sequence variation, targeted sequencing of selected genomic loci offers an attractive alternative. We therefore designed a sequence capture assay to target 29 Mb genomic regions and surveyed a total of 4,648 genes possibly affecting biomass production in 21 diverse inbred maize lines (7 flints, 14 dents). Captured and enriched genomic DNA was sequenced using the 454 NGS platform to 19.6-fold average depth coverage, and a broad evaluation of read alignment and variant calling methods was performed to select optimal procedures for variant discovery. Sequence alignment with the B73 reference and de novo assembly identified 383,145 putative single nucleotide polymorphisms (SNPs), of which 42,685 were non-synonymous alterations and 7,139 caused frameshifts. Presence/absence variation (PAV) of genes was also detected. We found that substantial sequence variation exists among genomic regions targeted in this study, which was particularly evident within coding regions. This diversification has the potential to broaden functional diversity and generate phenotypic variation that may lead to new adaptations and the modification of important agronomic traits. Further, annotated SNPs identified here will serve as useful genetic tools and as candidates in searches for phenotype-altering DNA variation. In summary, we demonstrated that sequencing of captured DNA is a powerful

  1. Facilitating genome navigation : survey sequencing and dense radiation-hybrid gene mapping

    Hitte, C; Madeoy, J; Kirkness, EF; Priat, C; Lorentzen, TD; Senger, F; Thomas, D; Derrien, T; Ramirez, C; Scott, C; Evanno, G; Pullar, B; Cadieu, E; Oza; Lourgant, K; Jaffe, DB; Tacher, S; Dreano, S; Berkova, N; Andre, C; Deloukas, P; Fraser, C; Lindblad-Toh, K; Ostrander, EA; Galibert, F


    Accurate and comprehensive sequence coverage for large genomes has been restricted to only a few species of specific interest. Lower sequence coverage (survey sequencing) of related species can yield a wealth of information about gene content and putative regulatory elements. But survey sequences la

  2. Plasmodium falciparum antigenic variation. Mapping mosaic var gene sequences onto a network of shared, highly polymorphic sequence blocks.

    Bull, Peter C; Buckee, Caroline O; Kyes, Sue; Kortok, Moses M; Thathy, Vandana; Guyah, Bernard; Stoute, José A; Newbold, Chris I; Marsh, Kevin


    Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1) is a potentially important family of immune targets, encoded by an extremely diverse gene family called var. Understanding of the genetic organization of var genes is hampered by sequence mosaicism that results from a long history of non-homologous recombination. Here we have used software designed to analyse social networks to visualize the relationships between large collections of short var sequence tags sampled from clinical parasite isolates. In this approach, two sequences are connected if they share one or more highly polymorphic sequence blocks. The results show that the majority of analysed sequences, including several var-like sequences from the chimpanzee parasite Plasmodium reichenowi, can be either directly or indirectly linked together in a single unbroken network. However, the network is highly structured and contains putative subgroups of recombining sequences. The major subgroup contains the previously described group A var genes, previously proposed to be genetically distinct. Another subgroup contains sequences found to be associated with rosetting, a parasite virulence phenotype. The mosaic structure of the sequences and their division into subgroups may reflect the conflicting problems of maximizing antigenic diversity and minimizing epitope sharing between variants while maintaining their host cell binding functions.
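
    The linking rule in this abstract (two sequences are connected if they share at least one sequence block) can be illustrated with a small graph computation. The sketch below is not the software the authors used; the tag names and block labels are invented for illustration, and connected groups are found with a plain union-find.

```python
from collections import defaultdict


def connected_components(tag_blocks):
    """Group sequence tags linked, directly or indirectly, by shared blocks.

    tag_blocks: dict mapping a tag name to the set of block labels it contains
    (hypothetical input format). Returns a list of sets of tag names.
    """
    parent = {tag: tag for tag in tag_blocks}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # The first tag seen with a given block becomes the anchor for that block;
    # every later tag carrying the same block is merged with it.
    first_seen = {}
    for tag, blocks in tag_blocks.items():
        for block in blocks:
            if block in first_seen:
                union(first_seen[block], tag)
            else:
                first_seen[block] = tag

    groups = defaultdict(set)
    for tag in tag_blocks:
        groups[find(tag)].add(tag)
    return list(groups.values())


# Invented example: tagA-tagB share b2, tagB-tagC share b3, tagD is isolated.
example = {
    "tagA": {"b1", "b2"},
    "tagB": {"b2", "b3"},
    "tagC": {"b3"},
    "tagD": {"b9"},
}
components = connected_components(example)
```

    In this toy input, tagA and tagC share no block directly but end up in the same component through tagB, which mirrors the "directly or indirectly linked" wording of the abstract.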

  3. Genome-wide gene-gene interaction analysis for next-generation sequencing.

    Zhao, Jinying; Zhu, Yun; Xiong, Momiao


    The critical barrier in interaction analysis for next-generation sequencing (NGS) data is that the traditional pairwise interaction analysis that is suitable for common variants is difficult to apply to rare variants because of their prohibitive computational time, large number of tests and low power. The great challenges for successful detection of interactions with NGS data are (1) the demands in the paradigm of changes in interaction analysis; (2) severe multiple testing; and (3) heavy computations. To meet these challenges, we shift the paradigm of interaction analysis between two SNPs to interaction analysis between two genomic regions. In other words, we take a gene as a unit of analysis and use functional data analysis techniques as dimensional reduction tools to develop a novel statistic to collectively test interaction between all possible pairs of SNPs within two genome regions. By intensive simulations, we demonstrate that the functional logistic regression for interaction analysis has the correct type 1 error rates and higher power to detect interaction than the currently used methods. The proposed method was applied to a coronary artery disease dataset from the Wellcome Trust Case Control Consortium (WTCCC) study and the Framingham Heart Study (FHS) dataset, and the early-onset myocardial infarction (EOMI) exome sequence datasets with European origin from the NHLBI's Exome Sequencing Project. We discovered that 6 of 27 pairs of significantly interacted genes in the FHS were replicated in the independent WTCCC study and 24 pairs of significantly interacted genes after applying Bonferroni correction in the EOMI study.

  4. Isolation of Hox cluster genes from insects reveals an accelerated sequence evolution rate.

    Heike Hadrys

    Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda) that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution.

  5. Nucleotide sequence of the structural gene for tryptophanase of Escherichia coli K-12.

    Deeley, M C; Yanofsky, C


    The tryptophanase structural gene, tnaA, of Escherichia coli K-12 was cloned and sequenced. The size, amino acid composition, and sequence of the protein predicted from the nucleotide sequence agree with protein structure data previously acquired by others for the tryptophanase of E. coli B. Physiological data indicated that the region controlling expression of tnaA was present in the cloned segment. Sequence data suggested that a second structural gene of unknown function was located distal ...

  6. Phylogenetic analysis of vibrios and related species by means of atpA gene sequences.

    Thompson, Cristiane C; Thompson, Fabiano L; Vicente, Ana Carolina P; Swings, Jean


    We investigated the use of atpA gene sequences as alternative phylogenetic and identification markers for vibrios. A fragment of 1322 bp (corresponding to approximately 88% of the coding region) was analysed in 151 strains of vibrios. The relationships observed were in agreement with the phylogeny inferred from 16S rRNA gene sequence analysis. For instance, the Vibrio cholerae, Vibrio halioticoli, Vibrio harveyi and Vibrio splendidus species groups appeared in the atpA gene phylogenetic analyses, suggesting that these groups may be considered as separate genera within the current Vibrio genus. Overall, atpA gene sequences appeared to be more discriminatory for species differentiation than 16S rRNA gene sequences. 16S rRNA gene sequence similarities above 97% corresponded to atpA gene sequence similarities above 80%. The intraspecies variation in the atpA gene sequence was about 99% sequence similarity. The results showed clearly that atpA gene sequences are a suitable alternative for the identification and phylogenetic study of vibrios.
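
    The similarity thresholds quoted in this abstract are pairwise percentages. As a minimal sketch of how such a figure can be computed, assuming two sequences that are already aligned to equal length with "-" marking gaps (the abstract does not state how gaps were treated, so skipping columns that are gaps in both sequences is an assumption made here):

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of compared alignment columns where both sequences agree.

    Both inputs must be pre-aligned strings of equal length. Columns where
    both sequences have a gap ("-") are ignored; a gap against a base counts
    as a mismatch. These conventions are assumptions for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Invented toy alignment: 3 of 4 columns agree.
similarity = percent_identity("ACGT", "ACGA")
```

    Other conventions exist (for example, excluding every gapped column), and they can shift the reported percentage, so thresholds such as 97% or 80% are only comparable when the same convention is used throughout.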

  7. Characterization of sulphonamide-resistant Escherichia coli using comparison of sul2 gene sequences and multilocus sequence typing

    Trobos, Margarita; Christensen, Henrik; Sunde, Marianne


    The sul2 gene encodes sulphonamide resistance (Sul(R)) and is commonly found in Escherichia coli from different hosts. We typed E. coli isolates by multilocus sequence typing (MLST) and compared the results to sequence variation of sul2, in order to investigate the relation to host origin of pathogenic and commensal E. coli strains and to investigate whether transfer of sul2 into different genomic lineages has happened multiple times. Sixty-eight E. coli isolated in Denmark and Norway from different hosts and years were MLST typed and sul2 PCR products were sequenced and compared. PFGE...

  8. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

    Manal Helal

    BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra
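
    The abstract describes a 'centroid' as the sequence in a cluster from which the distances to all other sequences are minimized. One plausible reading of that definition is the cluster medoid, sketched below under that assumption; the labels and distance values are invented, and the study's LM clustering itself is not reproduced here.

```python
def most_representative(labels, distances):
    """Return the label whose summed distance to all cluster members is smallest.

    labels: list of sequence identifiers in one cluster (hypothetical names).
    distances: square, symmetric matrix (list of lists) in the same order.
    Ties are resolved in favour of the earliest label.
    """
    if len(labels) != len(distances):
        raise ValueError("labels and distance matrix sizes differ")
    totals = [sum(row) for row in distances]
    best_index = min(range(len(labels)), key=totals.__getitem__)
    return labels[best_index]


# Invented pairwise distances for three sequences in one cluster.
cluster_labels = ["seq1", "seq2", "seq3"]
cluster_distances = [
    [0.00, 0.02, 0.05],
    [0.02, 0.00, 0.03],
    [0.05, 0.03, 0.00],
]
representative = most_representative(cluster_labels, cluster_distances)
```

    Here seq2 has the smallest summed distance to the other two sequences, so it would serve as the representative of this toy cluster.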

  9. Molecular characterization of gap region in 28S rRNA molecules in brine shrimp Artemia parthenogenetica and planarian Dugesia japonica.

    Sun, Shuhong; Xie, Hui; Sun, Yan; Song, Jing; Li, Zhi


    In most insects and some other protostomes, a small stretch of nucleotides can be removed from mature 28S rRNA molecules, which could create two 28S rRNA subunits (28Sα and 28Sβ). Thus, during electrophoresis, the rRNA profiles of these organisms may differ significantly from the standard benchmark since the two subunits co-migrate with the 18S rRNA. To understand the structure and mechanism of the atypical 28S rRNA molecule, partial fragments of 28Sα and 28Sβ in brine shrimp Artemia parthenogenetica and planarian Dugesia japonica were cloned using a modified technology based on terminal transferase. Alignment with the corresponding sequences of 28S rDNAs indicates that there are 41 nucleotides in A. parthenogenetica and 42 nucleotides in D. japonica absent from the mature rRNAs. The AU content of the gap sequences of D. japonica and A. parthenogenetica is high. Both the gaps may form stem-loop structure. In D. japonica a UAAU cleavage signal is identified in the loop, but it is absent in A. parthenogenetica. Thus, it is proposed that the gap processing of 28S rRNA was a late enzyme-dependent cleavage event in the rRNA maturational process based on the AU rich gap sequence and the formation of the stem-loop structure to expose the processing segment, while the deletion of the gap region would not affect the structure and function of the 28S rRNA molecule.

  10. Characterizations of Chinese isolates of Coxiella burnetii in the com1 gene sequence

    YU Quan; ZHANG Guo-quan; FUKUSHI Hideto; YAMAGUCHI Tsuyoshi; HIRAI Katsuya


    Objective: To characterize Chinese isolates of Coxiella burnetii genetically by comparing their com1 gene sequences. Methods: com1 gene sequences of Chinese isolates were amplified, sequenced, and compared with previously published data. Results: Three different com1 sequences were identified in 7 Chinese isolates. Sequence comparison indicated that the isolates harboring the QpRS plasmid could be defined as a new group and, in addition, the isolates carrying the same plasmid type showed similar com1 gene sequences. Conclusion: The study suggests that the grouping based on the com1 gene sequence is highly associated with the plasmid type of the isolates but little related to disease forms and geographical origins of the isolates.

  11. Isolation and characterization of gene sequences expressed in cotton fiber

    Taciana de Carvalho Coutinho


    Cotton fibers are tubular cells which develop from the differentiation of ovule epidermis. In addition to being one of the most important natural fibers of the textile group, cotton fibers afford an excellent experimental system for studying the cell wall. The aim of this work was to isolate and characterise the genes expressed in cotton fiber (Gossypium hirsutum L.) to be used in future work in cotton breeding. Fibers of the cotton cultivar CNPA ITA 90 II were used to extract RNA for the subsequent generation of a cDNA library. Seventeen sequences were obtained, of which 14 were already described in the NCBI database (National Centre for Biotechnology Information), such as those encoding the lipid transfer proteins (LTPs) and arabinogalactans (AGP). However, other cDNAs such as the B05 clone, which displays homology with the glycosyltransferases, have still not been described for this crop. Nevertheless, results showed that several clones obtained in this study are associated with cell wall proteins, wall-modifying enzymes and lipid transfer proteins directly involved in fiber development.

  12. Colorimetric biosensing of targeted gene sequence using dual nanoparticle platforms

    Thavanathan J


    Jeevan Thavanathan (1), Nay Ming Huang (1), Kwai Lin Thong (2). (1) Low Dimension Material Research Center, Department of Physics; (2) Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia. Abstract: We have developed a colorimetric biosensor using a dual platform of gold nanoparticles and graphene oxide sheets for the detection of Salmonella enterica. The presence of the invA gene in S. enterica causes a change in color of the biosensor from its original pinkish-red to a light purplish solution. This occurs through the aggregation of the primary gold nanoparticle-conjugated DNA probe onto the surface of the secondary graphene oxide-conjugated DNA probe through DNA hybridization with the targeted DNA sequence. Spectrophotometry analysis showed a shift in wavelength from 525 nm to 600 nm with 1 µM of DNA target. Specificity testing revealed that the biosensor was able to detect various serovars of S. enterica while no color change was observed with the other bacterial species. Sensitivity testing revealed the limit of detection was at 1 nM of DNA target. This proves the effectiveness of the biosensor in the detection of S. enterica through DNA hybridization. Keywords: biosensor, DNA hybridization, DNA probe, gold nanoparticles, graphene oxide, Salmonella enterica

  13. Unresolved orthology and peculiar coding sequence properties of lamprey genes: the KCNA gene family as test case

    Kuraku Shigehiro


    Background: In understanding the evolutionary process of vertebrates, cyclostomes (hagfishes and lampreys) occupy crucial positions. Resolving molecular phylogenetic relationships of cyclostome genes with gnathostome (jawed vertebrate) genes is indispensable in deciphering both the species tree and gene trees. However, molecular phylogenetic analyses, especially those including lamprey genes, have produced highly discordant results between gene families. To efficiently scrutinize this problem using partial genome assemblies of early vertebrates, we focused on the potassium voltage-gated channel, shaker-related (KCNA) family, whose members are mostly single-exon. Results: Seven sea lamprey KCNA genes as well as six elephant shark genes were identified, and their orthologies to bony vertebrate subgroups were assessed. In contrast to robustly supported orthology of the elephant shark genes to gnathostome subgroups, clear orthology of any sea lamprey gene could not be established. Notably, sea lamprey KCNA sequences displayed unique codon usage pattern and amino acid composition, probably associated with exceptionally high GC-content in their coding regions. This lamprey-specific property of coding sequences was also observed generally for genes outside this gene family. Conclusions: Our results suggest that secondary modifications of sequence properties unique to the lamprey lineage may be one of the factors preventing robust orthology assessments of lamprey genes, which deserves further genome-wide validation. The lamprey lineage-specific alteration of protein-coding sequence properties needs to be taken into consideration in tackling the key questions about early vertebrate evolution.
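
    The coding-sequence properties named in this abstract (GC-content and codon usage) are simple to tabulate. The sketch below shows one way to do so for a single in-frame coding sequence; it is not the authors' pipeline, the toy sequence is invented, and GC3 (GC fraction at third codon positions) is added here as a commonly used companion measure rather than something the abstract reports.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_counts(cds):
    """Count codons in an in-frame coding sequence, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def gc3(cds):
    """Fraction of codons whose third position is G or C (0.0 if no codons)."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    gc_third = sum(n for codon, n in counts.items() if codon[2] in "GC")
    return gc_third / total


# Invented toy coding sequence: ATG GCC GCG TAA
toy_cds = "ATGGCCGCGTAA"
toy_gc = gc_content(toy_cds)
toy_codons = codon_counts(toy_cds)
toy_gc3 = gc3(toy_cds)
```

    Comparing such per-gene values between lamprey and other vertebrate sequences is the kind of contrast the abstract refers to; which statistics and thresholds the study actually used is not stated in the record.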

  14. Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

    Khan Shafiq A


    Background: Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results: 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion: The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells.

  15. Cloning and sequencing of the bovine gastrin gene

    Lund, T; Rehfeld, J F; Olsen, Jørgen


    In order to deduce the primary structure of bovine preprogastrin we therefore sequenced a gastrin DNA clone isolated from a bovine liver cosmid library. Bovine preprogastrin comprises 104 amino acids and consists of a signal peptide, a 37 amino acid spacer-sequence, the gastrin-34 sequence followed...

  16. Candidate gene analysis and exome sequencing confirm LBX1 as a susceptibility gene for idiopathic scoliosis

    Grauers, Anna; Wang, Jingwen; Einarsdottir, Elisabet;


    BACKGROUND CONTEXT: Idiopathic scoliosis is a spinal deformity affecting approximately 3% of otherwise healthy children or adolescents. The etiology is still largely unknown but has an important genetic component. Genome-wide association studies have identified a number of common genetic variants that are significantly associated with idiopathic scoliosis in Asian and Caucasian populations, rs11190870 close to the LBX1 gene being the most replicated finding. PURPOSE: The aim of the present study was to investigate the genetics of idiopathic scoliosis in a Scandinavian cohort by performing a candidate gene study of four variants previously shown to be associated with idiopathic scoliosis and exome sequencing of idiopathic scoliosis patients with a severe phenotype to identify possible novel scoliosis risk variants. STUDY DESIGN: This was a case control study. PATIENT SAMPLE: A total of 1,739 patients...


    马巍; 吴玲; 王德利; 刘淼; 任惠民; 杨广笑; 王全颖


    Objective: To clone and sequence the mature fragment of the human nerve growth factor (NGF) gene. Methods: Human genomic DNA extracted from white blood cells was used as the template, and the NGF gene was cloned by PCR and T-vector cloning. Positive clones were screened and identified by restriction enzyme digestion, and the cloned amplified fragment was then sequenced and analyzed. Results: Comparison of the cloned NGF gene with the GenBank sequence (V01511) showed that the two sequences were identical, each 354 bp long. Conclusion: Cloning the NGF gene from human genomic DNA paves the way for further study of gene therapy for nervous system injury.

  18. Probing proton halo of the exotic nucleus 28S by elastic electron scattering

    WANG; Zaijun; REN; Zhongzhou


    Elastic electron scattering on the exotic light nucleus 28S is investigated in the plane wave Born approximation. The variation of the squared form factors of 28S with momentum transfer is compared with that of 32S. It is found that the behavior of the form factors near the second minimum (with a moderate momentum transfer) is sensitive to the alteration of the charge density distribution of halo protons in 28S. This indicates that elastic electron scattering can be a good probe of the structure of proton-halo nuclei.

  19. Sequence characterization, polymorphism and chromosomal localizations of the porcine PSME1 and PSME2 genes

    Wang, Y.F.; Yu, M.; Pas, te M.F.W.; Yerle, M.; Liu, B.; Fan, B.; Xiong, T.; Li, K.


    The full-length cDNA of porcine genes (PSME1 and PSME2) encoding proteasome activators PA28 α- and β-subunits were obtained by the rapid amplification of cDNA ends (RACE). The nucleotide sequences and the predicted protein sequences share high sequence identity with their mammalian counterparts. The

  20. Structural analysis of DNA sequence: evidence for lateral gene transfer in Thermotoga maritima

    Worning, Peder; Jensen, Lars Juhl; Nelson, K. E.;


    The recently published complete DNA sequence of the bacterium Thermotoga maritima provides evidence, based on protein sequence conservation, for lateral gene transfer between Archaea and Bacteria. We introduce a new method of periodicity analysis of DNA sequences, based on structural parameters......, which brings independent evidence for the lateral gene transfer in the genome of T. maritima. The structural analysis relates the Archaea-like DNA sequences to the genome of Pyrococcus horikoshii. Analysis of 24 complete genomic DNA sequences shows different periodicity patterns for organisms...

  1. Sequencing and bacterial expression of a novel murine alpha interferon gene

    王焱; 王征宇; 周鸣南; 蔡菊娥; 孙兰英; 刘新垣; B.L.Daugherty; S.Pestka


    A new murine alpha interferon gene (mIFN-αB) was found by a primer-based sequencing method in a murine genomic DNA library. The gene was cloned and its sequence was determined. It was expressed in Escherichia coli under the control of the PL promoter, yielding antiviral activity on mouse L-cells. The sequence of mIFN-αB has been accepted by GenBank.

  2. Neural network predicts sequence of TP53 gene based on DNA chip

    Spicker, J.S.; Wikman, F.; Lu, M.L.;


    We have trained an artificial neural network to predict the sequence of the human TP53 tumor suppressor gene based on a p53 GeneChip. The trained neural network uses as input the fluorescence intensities of DNA hybridized to oligonucleotides on the surface of the chip and makes between zero...... and four errors in the predicted 1300 bp sequence when tested on wild-type TP53 sequence....

  3. Molecular characterization, sequence analysis and tissue expression of a porcine gene – MOSPD2

    Yang Jie


    The full-length cDNA sequence of a porcine gene, MOSPD2, was amplified using the rapid amplification of cDNA ends method based on a pig expressed sequence tag sequence which was highly homologous to the coding sequence of the human MOSPD2 gene. Sequence prediction analysis revealed that the open reading frame of this gene encodes a protein of 491 amino acids that has high homology with the motile sperm domain-containing protein 2 (MOSPD2) of five species: horse (89%), human (90%), chimpanzee (89%), rhesus monkey (89%) and mouse (85%); thus, it could be defined as a porcine MOSPD2 gene. This novel porcine gene was assigned GeneID: 100153601. This gene is structured in 15 exons and 14 introns as revealed by computer-assisted analysis. The phylogenetic analysis revealed that the porcine MOSPD2 gene has a closer genetic relationship with the MOSPD2 gene of horse. Tissue expression analysis indicated that the porcine MOSPD2 gene is generally and differentially expressed in the spleen, muscle, skin, kidney, lung, liver, fat and heart. Our experiment is the first to establish the primary foundation for further research on the porcine MOSPD2 gene.

  4. Characterization and phylogenetic analysis of α-gliadin gene sequences reveals significant genomic divergence in Triticeae species

    Guang-Rong Li; Tao Lang; En-Nian Yang; Cheng Liu; Zu-Jun Yang


    Although the unique properties of the wheat α-gliadin gene family are well characterized, little is known about the evolution and genomic divergence of the α-gliadin gene family within the Triticeae. We isolated a total of 203 α-gliadin gene sequences from 11 representative diploid and polyploid Triticeae species, and found 108 sequences putatively functional. Our results indicate that α-gliadin genes may have originated from wild Secale species, where the sequences contain the shortest repetitive domains and display minimum variation. A miniature inverted-repeat transposable element insertion is reported for the first time in an α-gliadin gene sequence of Thinopyrum intermedium in this study, indicating that the transposable element might have contributed to the diversification of the α-gliadin gene family among Triticeae genomes. The phylogenetic analyses revealed that the α-gliadin gene sequences of Dasypyrum, Australopyrum, Lophopyrum, Eremopyrum and Pseudoroegneria species have amplified several times. A search for four typical toxic epitopes for celiac disease within the Triticeae α-gliadin gene sequences showed that the α-gliadins of wild Secale, Australopyrum and Agropyron genomes lack all four epitopes, while other Triticeae species have accumulated these epitopes, suggesting that the evolution of these toxic epitope sequences occurred during the course of speciation, domestication or polyploidization of Triticeae.

  5. Sequencing of 15 622 gene-bearing BACs clarifies the gene-dense regions of the barley genome.

    Muñoz-Amatriaín, María; Lonardi, Stefano; Luo, MingCheng; Madishetty, Kavitha; Svensson, Jan T; Moscou, Matthew J; Wanamaker, Steve; Jiang, Tao; Kleinhofs, Andris; Muehlbauer, Gary J; Wise, Roger P; Stein, Nils; Ma, Yaqin; Rodriguez, Edmundo; Kudrna, Dave; Bhat, Prasanna R; Chao, Shiaoman; Condamine, Pascal; Heinen, Shane; Resnik, Josh; Wing, Rod; Witt, Heather N; Alpert, Matthew; Beccuti, Marco; Bozdag, Serdar; Cordero, Francesca; Mirebrahim, Hamid; Ounit, Rachid; Wu, Yonghui; You, Frank; Zheng, Jie; Simková, Hana; Dolezel, Jaroslav; Grimwood, Jane; Schmutz, Jeremy; Duma, Denisa; Altschmied, Lothar; Blake, Tom; Bregitzer, Phil; Cooper, Laurel; Dilbirligi, Muharrem; Falk, Anders; Feiz, Leila; Graner, Andreas; Gustafson, Perry; Hayes, Patrick M; Lemaux, Peggy; Mammadov, Jafar; Close, Timothy J


    Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework. However, because only 6278 bacterial artificial chromosomes (BACs) in the physical map were sequenced, fine structure was limited. To gain access to the gene-containing portion of the barley genome at high resolution, we identified and sequenced 15 622 BACs representing the minimal tiling path of 72 052 physical-mapped gene-bearing BACs. This generated ~1.7 Gb of genomic sequence containing an estimated 2/3 of all Morex barley genes. Exploration of these sequenced BACs revealed that although distal ends of chromosomes contain most of the gene-enriched BACs and are characterized by high recombination rates, there are also gene-dense regions with suppressed recombination. We made use of published map-anchored sequence data from Aegilops tauschii to develop a synteny viewer between barley and the ancestor of the wheat D-genome. Except for some notable inversions, there is a high level of collinearity between the two species. The software HarvEST:Barley provides facile access to BAC sequences and their annotations, along with the barley-Ae. tauschii synteny viewer. These BAC sequences constitute a resource to improve the efficiency of marker development, map-based cloning, and comparative genomics in barley and related crops. Additional knowledge about regions of the barley genome that are gene-dense but low in recombination is particularly relevant.

  6. Sequence Analysis of Toxin Gene-Bearing Corynebacterium diphtheriae Strains, Australia.

    Doyle, Christine J; Mazins, Adam; Graham, Rikki M A; Fang, Ning-Xia; Smith, Helen V; Jennison, Amy V


    By conducting a molecular characterization of Corynebacterium diphtheriae strains in Australia, we identified novel sequences, nonfunctional toxin genes, and 5 recent cases of toxigenic cutaneous diphtheria. These findings highlight the importance of extrapharyngeal infections for toxin gene-bearing (functional or not) and non-toxin gene-bearing C. diphtheriae strains. Continued surveillance is recommended.

  7. Sequencing analysis reveals a unique gene organization in the gyrB region of Mycoplasma hominis

    Ladefoged, Søren; Christiansen, Gunna


    of which showed similarity to that which encodes the LicA protein of Haemophilus influenzae. The organization of the genes in the region showed no resemblance to that in the corresponding regions of other bacteria sequenced so far. The gyrA gene was mapped 35 kb downstream from the gyrB gene....

  8. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    Kaas, Rolf Sommer; Rundsten, Carsten Friis; Ussery, David


    more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment and a pan-genome tree based on gene presence/absence, maps the relatedness...

  9. Update of the Gene Discovery Program in Schistosoma mansoni with the Expressed Sequence Tag Approach

    Élida ML Rabelo


    Continuing the Schistosoma mansoni Genome Project, 363 new templates were sequenced, generating 205 more ESTs corresponding to 91 genes. Seventy-four of these genes (81%) had not previously been described in S. mansoni. Among the newly discovered genes there are several of significant biological interest such as synaptophysin, NIFs-like and rho-GDP dissociation inhibitor

  10. Cloning, sequence analysis, and characterization of the genes involved in isoprimeverose metabolism in Lactobacillus pentosus

    Chaillou, S.; Lokman, B.C.; Leer, R.J.; Posthuma, C.; Postma, P.W.; Pouwels, P.H.


    Two genes, xylP and xylQ, from the xylose regulon of Lactobacillus pentosus were cloned and sequenced. Together with the repressor gene of the regulon, xylR, the xylPQ genes form an operon which is inducible by xylose and which is transcribed from a promoter located 145 bp upstream of xylP. A putati

  11. Characterization of promoter sequence of toll-like receptor genes in Vechur cattle

    R. Lakshmi


    Aim: To analyze the promoter sequence of toll-like receptor (TLR) genes in Vechur cattle, an indigenous breed of Kerala, against the sequence of Bos taurus and assess the differences that could be attributed to innate immune responses against bovine mastitis. Materials and Methods: Blood samples were collected from the jugular vein of Vechur cattle, maintained at the Vechur cattle conservation center of Kerala Veterinary and Animal Sciences University, using an acid-citrate-dextrose anticoagulant. The genomic DNA was extracted, and polymerase chain reaction was carried out to amplify the promoter region of TLRs. The amplified products of the TLR2, 4, and 9 promoter regions were sequenced by the Sanger enzymatic DNA sequencing technique. Results: The promoter region of TLR2 of Vechur cattle showed 98% similarity to the B. taurus sequence present in GenBank and revealed variants for four sequence motifs. The promoter region of TLR4 of Vechur cattle showed 99% similarity to the B. taurus sequence but did not reveal significant variants in motif regions. However, two heterozygous loci were observed from the chromatogram. The promoter sequence of the TLR9 gene also showed 99% similarity to the B. taurus sequence and revealed variants for four sequence motifs. Conclusion: The results of this study indicate significant variation in the promoters of the TLR2 and 9 genes in the Vechur cattle breed, which may potentially influence the innate immune response against mastitis.

  12. Identification of new genes in Sinorhizobium meliloti using the Genome Sequencer FLX system

    Jensen Roderick V


    Background: Sinorhizobium meliloti is an agriculturally important model symbiont. There is an ongoing need to update and improve its genome annotation. In this study, we used a high-throughput pyrosequencing approach to sequence the transcriptome of S. meliloti, and search for new bacterial genes missed in the previous genome annotation. This is the first report of sequencing a bacterial transcriptome using the pyrosequencing technology. Results: Our pilot sequencing run generated 19,005 reads with an average length of 136 nucleotides per read. From these data, we identified 20 new genes. These new gene transcripts were confirmed by RT-PCR and their possible functions were analyzed. Conclusion: Our results indicate that high-throughput sequence analysis of bacterial transcriptomes is feasible and next-generation sequencing technologies will greatly facilitate the discovery of new genes and improve genome annotation.

  13. Mouse mammary tumor virus-like gene sequences are present in lung patient specimens

    Rodríguez-Padilla Cristina


    Background: Previous studies have reported on the presence of Murine Mammary Tumor Virus (MMTV)-like gene sequences in human cancer tissue specimens. Here, we search for MMTV-like gene sequences in lung disease specimens, including carcinomas, from a Mexican population. This study was based on our previous study reporting that the INER51 lung cancer cell line, from a pleural effusion of a Mexican patient, contains MMTV-like env gene sequences. Results: The MMTV-like env gene sequences were detected in three out of 18 specimens studied, by PCR using a specific set of MMTV-like primers. The three identified MMTV-like gene sequences, which were assigned as INER6, HZ101, and HZ14, were 99%, 98%, and 97% homologous, respectively, as compared to GenBank sequence accession number AY161347. The INER6 and HZ-101 samples were isolated from lung cancer specimens, and the HZ-14 was isolated from an acute inflammatory lung infiltrate sample. Two of the env sequences exhibited disruption of the reading frame due to mutations. Conclusion: In summary, we identified the presence of MMTV-like gene sequences in 2 out of 11 (18%) of the lung carcinomas and 1 out of 7 (14%) of the acute inflammatory lung infiltrate specimens studied from a Mexican population.

  14. Identification of true EST alignments and exon regions of gene sequences

    ZHOU Yanhong; JING Hui; LI Yanen; LIU Huailan


    Expressed sequence tags (ESTs), which have accumulated considerably, provide a valuable resource for finding new genes and disease-relevant genes, and for recognizing alternative splicing variants, SNP sites, etc. The prerequisite for this research is to correctly identify the ESTs related to a gene sequence. Based on analysis of alignment results between known gene sequences and ESTs in public databases, several measures, including Identity Check, Gap Check, Inclusion Check and Length Check, have been introduced to judge whether an EST alignment is related to a gene sequence. A computational program, EDSAc1.0, has been developed to identify true EST alignments and exon regions of query gene sequences. When tested with human gene sequences in the standard dataset HMR195 and evaluated with the standard measures of gene prediction performance, EDSAc1.0 can identify protein-coding regions with a specificity of 0.997 and a sensitivity of 0.88 at the nucleotide level, outperforming its counterpart TAP. A web server of EDSAc1.0 is available at
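
    The abstract names four checks but does not define them, so the following Python sketch is only one plausible reading, not the EDSAc1.0 implementation. The `ESTAlignment` fields, the function name and every threshold are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ESTAlignment:
    """Summary of one EST-to-gene alignment (all fields are illustrative)."""
    aligned_length: int  # number of aligned columns
    matches: int         # columns where EST and gene bases agree
    gap_columns: int     # columns that contain a gap
    est_length: int      # full length of the EST
    gene_start: int      # alignment start on the gene (0-based)
    gene_end: int        # alignment end on the gene (exclusive)
    gene_length: int     # full length of the gene sequence


def is_true_alignment(a: ESTAlignment,
                      min_identity: float = 0.95,      # assumed threshold
                      max_gap_fraction: float = 0.02,  # assumed threshold
                      min_length: int = 50,            # assumed threshold
                      min_est_coverage: float = 0.90   # assumed threshold
                      ) -> bool:
    """Apply four filters loosely modelled on the checks named in the abstract."""
    if a.aligned_length <= 0 or a.est_length <= 0:
        return False
    # Identity check: fraction of matching columns.
    identity_ok = a.matches / a.aligned_length >= min_identity
    # Gap check: fraction of gapped columns.
    gap_ok = a.gap_columns / a.aligned_length <= max_gap_fraction
    # Inclusion check (assumed meaning): aligned region lies inside the gene.
    inclusion_ok = 0 <= a.gene_start < a.gene_end <= a.gene_length
    # Length check: alignment is long enough and covers most of the EST.
    length_ok = (a.aligned_length >= min_length
                 and a.aligned_length / a.est_length >= min_est_coverage)
    return identity_ok and gap_ok and inclusion_ok and length_ok


good = ESTAlignment(400, 396, 2, 410, 100, 500, 2000)
poor = ESTAlignment(400, 300, 2, 410, 100, 500, 2000)
print(is_true_alignment(good), is_true_alignment(poor))
```

    Under these assumed thresholds the first alignment (99% identity) passes and the second (75% identity) is rejected; real cut-offs would have to come from the published method.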

  15. Cloning, sequencing and identification of single nucleotide polymorphisms of partial sequence on the porcine CACNA1S gene

    FANG XiaoMin; XU NingYing; REN ShouWen


    The CACNA1S gene encodes the α1 subunit of the calcium channel. Mutation of the CACNA1S gene can cause hypokalemic periodic paralysis (HypoKPP) and malignant hyperthermia syndrome (MHS) in human beings. Research on CACNA1S to date has mainly been in humans and model animals, and rarely in livestock and poultry. In this study, Yorkshire pigs (23), Pietrain pigs (30), Jinhua pigs (115) and the second generation (126) of a Jinhua × Pietrain cross were used. Primers were designed according to the sequence of the human CACNA1S gene and PCR was carried out using pig genomic DNA. PCR products were sequenced and compared with the human sequence, and then single nucleotide polymorphisms (SNPs) were investigated by PCR-SSCP, while PCR-RFLP tests were performed to validate the mutations. Results indicated: (1) 5211 bp of DNA fragments of the porcine CACNA1S gene were acquired (GenBank accession number: DQ767693) and the identity of the exon region was 82.6% between human and pig; (2) fifty-seven mutations were found within the cloned sequences, among which 24 were in exon regions; (3) the results of PCR-RFLP were in accordance with those of PCR-SSCP. According to the ESTs of the porcine CACNA1S gene published in GenBank (Bx914582, Bx666997), 8 of the 11 SNPs identified in the present study were consistent with the base differences between the two EST fragments.

  16. Strong association between pseudogenization mechanisms and gene sequence length

    Harrison Paul M


    Pseudogenes arise from the decay of gene copies following either RNA-mediated duplication (processed pseudogenes) or DNA-mediated duplication (nonprocessed pseudogenes). Here, we show that long protein-coding genes tend to produce more nonprocessed pseudogenes than short genes, whereas the opposite is true for processed pseudogenes. Protein-coding genes longer than 3000 bp are 6 times more likely to produce nonprocessed pseudogenes than processed ones. Reviewers: This article was reviewed by Dr. Dan Graur and Dr. Craig Nelson (nominated by Dr. J Peter Gogarten).

  17. Cloning and Sequence Analysis of Y-box Binding Protein Gene in Min Pig

    Zhang Dong-jie; Liu Di; Wang Liang; He Xin-miao; Wang Wen-tao


    To study the Min pig Y-box binding protein (YB-1) gene, its complete coding sequence was cloned by RT-PCR, and the sequence features were analyzed with software and online tools. The complete CDS of Min pig YB-1 was 975 bp long, encoding 324 amino acids. It contained a conserved cold shock domain and several phosphorylation sites but no transmembrane domains, consistent with a protein found in the cytoplasm. The Min pig YB-1 nucleotide sequence shared high similarity (61.37%-97.66%) with those of other mammals.
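
    As a quick arithmetic check of the reported figures, the sketch below converts a CDS length into a residue count. It assumes the 975 bp figure includes the stop codon, which the abstract does not state; the function name is illustrative.

```python
def residues_from_cds(cds_length_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by a CDS of the given length."""
    if cds_length_bp % 3 != 0:
        raise ValueError("a complete CDS should be a multiple of 3 bp")
    codons = cds_length_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


print(residues_from_cds(975))  # 325 codons, one of them a stop
```

    With that assumption, 975 bp gives 325 codons and 324 residues, which agrees with the abstract.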

  18. Cloning and sequencing of human lambda immunoglobulin genes by the polymerase chain reaction.

    Songsivilai, S; Bye, J M; Marks, J D; Hughes-Jones, N C


    Universal oligonucleotide primers, designed for amplifying and sequencing genes encoding the rearranged human lambda immunoglobulin variable region, were validated by amplification of the lambda light chain genes from four human heterohybridoma cell lines and in the generation of a cDNA library of human V lambda sequences from Epstein-Barr virus-transformed human peripheral blood lymphocytes. This technique allows rapid cloning and sequencing of human immunoglobulin genes, and has potential applications in the rescue of unstable human antibody-producing cell lines and in the production of human monoclonal antibodies.

  19. Cloning and Sequence Analysis of Glycoprotein D Gene of Bovine Herpesvirus-1 Strain Luojing

    LI Ji-chang; TONG Guang-zhi; QIU Hua-ji; ZHOU Yan-jun; XUE Qiang


    By means of PCR, the gene encoding gD of bovine herpesvirus-1 (BHV-1) strain Luojing was amplified, cloned and sequenced. The nucleotide sequence of this gD gene was 1 251 bp, encoding 417 amino acids. Compared with the published P8-2 strain, the homology of the nucleotide sequence is 99.92%, and that of the deduced amino acid sequence is 100%. The results indicated that gD of BHV-1 was highly conserved.
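
    The reported percentages can be related to a count of differing bases with a short calculation. This is a sketch under the assumption that identity was computed over the full 1 251 bp without gaps; the abstract does not say how it was computed, and the function name is illustrative.

```python
def percent_identity(length_bp: int, mismatches: int) -> float:
    """Percent identity of two equal-length, ungapped sequences."""
    if length_bp <= 0 or not 0 <= mismatches <= length_bp:
        raise ValueError("invalid length or mismatch count")
    return 100.0 * (length_bp - mismatches) / length_bp


# One differing base out of 1 251 rounds to the reported figure.
print(round(percent_identity(1251, 1), 2))
```

    Under that assumption, 99.92% nucleotide identity is consistent with a single-nucleotide difference, and the 100% amino acid identity would then imply that the difference is synonymous.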

  20. [CHL15--a new gene controlling the replication of chromosomes in saccharomycetes yeast: cloning, physical mapping, sequencing, and sequence analysis].

    Kuprina, N Iu; Krol', E S; Koriabin, M Iu; Shestopalov, B V; Bliskovskiĭ, V V; Bannikov, V M; Gizatullin, R Z; Kirillov, A V; Kravtsov, V Iu; Zakhar'ev, V M


    We have analyzed the CHL15 gene, earlier identified in a screen for yeast mutants with increased loss of chromosome III and artificial circular and linear chromosomes in mitosis. Mutations in the CHL15 gene lead to a 100-fold increase in the rate of chromosome III loss per cell division and a 200-fold increase in the rate of marker homozygosis on this chromosome by mitotic recombination. Analysis of segregation of artificial circular minichromosome and artificially generated nonessential marker chromosome fragment indicated that sister chromatid loss (1:0 segregation) is a main reason of chromosome destabilization in the chl15-1 mutant. A genomic clone of CHL15 was isolated and used to map its physical position on chromosome XVI. Nucleotide sequence analysis of CHL15 revealed a 2.8-kb open reading frame with a 105-kD predicted protein sequence. At the N-terminal region of the protein sequences potentially able to form DNA-binding domains defined as zinc-fingers were found. The C-terminal region of the predicted protein displayed a similarity to sequence of regulatory proteins known as the helix-loop-helix (HLH) proteins. Data on partial deletion analysis suggest that the HLH domain is essential for the function of the CHL15 gene product. Analysis of the upstream untranslated region of CHL15 revealed the presence of the hexamer element, ACGCGT (an MluI restriction site) controlling both the periodic expression and coordinate regulation of the DNA synthesis genes in budding yeast. Deletion in the RAD52 gene, the product of which is involved in double-strand break/recombination repair and replication, leads to a considerable decrease in the growth rate of the chl15 mutant. We suggest that CHL15 is a new DNA synthesis gene in the yeast Saccharomyces cerevisiae.

  1. Description and interpretation of various SNPs identified by BRCA2 gene sequencing

    Anca Negura


    Molecular diagnosis for hereditary breast and ovarian cancer (HBOC) involves systematic DNA sequencing of predisposition genes like BRCA1 or BRCA2. Deleterious mutations within such genes are responsible for developing the disease, but other sequence variants can also be identified. Common Single Nucleotide Polymorphisms (SNPs) are usually present in the human genome, defining alleles whose frequencies vary widely in different populations. Either intragenic or intronic, silent or generating amino acid substitutions, SNPs cannot in themselves be assigned a predisposition status. However, prevalent SNPs can be used to define gene haplotypes, which also occur at various frequencies. Since some mutations can easily be assigned to haplotypes (as is the case for the BRCA1 gene), SNPs can therefore provide useful information in interpreting the effects of gene mutations on hereditary predisposition to cancer. Here we describe 10 BRCA2 SNPs identified by complete gene sequencing

  2. Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes

    J. Molenaar (Jan); J. Koster (Jan); D. Zwijnenburg (Danny); P. van Sluis (Peter); L.J. Valentijn (Linda); I. van der Ploeg (Ida); M. Hamdi (Mohamed); J. van Nes (Johan); B.A. Westerman (Bart); J. van Arkel (Jennemiek); M.E. Ebus; F. Haneveld (Franciska); A. Lakeman (Arjan); L. Schild (Linda); P. Molenaar (Piet); P. Stroeken (Peter); M.M. van Noesel (Max); I. Øra (Ingrid); J.P. di Santo (James); H.N. Caron (Huib); E.M. Westerhout (Ellen); R. Versteeg (Rogier)


    Neuroblastoma is a childhood tumour of the peripheral sympathetic nervous system. The pathogenesis has for a long time been quite enigmatic, as only very few gene defects were identified in this often lethal tumour. Frequently detected gene alterations are limited to MYCN amplification (

  3. Genome-wide analysis reveals diverged patterns of codon bias, gene expression, and rates of sequence evolution in Picea gene families.

    De La Torre, Amanda R; Lin, Yao-Cheng; Van de Peer, Yves; Ingvarsson, Pär K


    The recent sequencing of several gymnosperm genomes has greatly facilitated studying the evolution of their genes and gene families. In this study, we examine the evidence for expression-mediated selection in the first two fully sequenced representatives of the gymnosperm plant clade (Picea abies and Picea glauca). We use genome-wide estimates of gene expression (>50,000 expressed genes) to study the relationship between gene expression, codon bias, rates of sequence divergence, protein length, and gene duplication. We found that gene expression is correlated with rates of sequence divergence and codon bias, suggesting that natural selection is acting on Picea protein-coding genes for translational efficiency. Gene expression, rates of sequence divergence, and codon bias are correlated with the size of gene families, with large multicopy gene families having, on average, a lower expression level and breadth, lower codon bias, and higher rates of sequence divergence than single-copy gene families. Tissue-specific patterns of gene expression were more common in large gene families with large gene expression divergence than in single-copy families. Recent family expansions combined with large gene expression variation in paralogs and increased rates of sequence evolution suggest that some Picea gene families are rapidly evolving to cope with biotic and abiotic stress. Our study highlights the importance of gene expression and natural selection in shaping the evolution of protein-coding genes in Picea species, and sets the ground for further studies investigating the evolution of individual gene families in gymnosperms.

  4. Whole Blood Transcriptome Sequencing Reveals Gene Expression Differences between Dapulian and Landrace Piglets

    Hu, Jiaqing; Yang, Dandan; Chen, Wei; Li, Chuanhao; Wang, Yandong; Zeng, Yongqing; Wang, Hui


    There is little genomic information regarding gene expression differences at the whole blood transcriptome level of different pig breeds at the neonatal stage. To solve this, we characterized differentially expressed genes (DEGs) in the whole blood of Dapulian (DPL) and Landrace piglets using RNA-seq (RNA-sequencing) technology. In this study, 83 DEGs were identified between the two breeds. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses identified immun...

  5. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.


    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  6. Sequencing and phylogenetic analysis of partial CXCR2 gene of Murrah buffalo

    S. A. Wani


    Aim: The present study was carried out to sequence the CXCR2 gene of Murrah buffalo and analyze its phylogeny. Materials and Methods: From a group of forty-eight Murrah buffaloes (Bubalus bubalis), blood samples were collected randomly from eight animals, of which four were healthy and four were mastitic. Results: The Interleukin-8B (IL-8B) receptor gene target sequence was amplified using the primer pair in an optimized polymerase chain reaction. Partial sequencing of the IL-8B receptor gene of Bubalus bubalis (Murrah) was completed successfully. The sequences of the IL-8B receptor gene showed 99% homology to that of Bos indicus × Bos taurus, 98% to that of Bos taurus, 97% to that of Ovis aries, 93% to that of Sus scrofa, 92% to that of Equus caballus and 90% to that of Felis catus. Conclusion: From the present study it can be concluded that the PCR amplification procedure for the target region of the IL-8B receptor gene, yielding a 459 bp product, has been standardized and gives consistent and specific amplification. Amplification of the partial IL-8B receptor gene (exon 2, 459 bp) using self-designed primers specific for the cattle ortholog sequence signifies that the locus is conserved in cattle and buffaloes. In the phylogenetic tree, the target sequence of the IL-8B receptor gene of Bubalus bubalis was found to be more closely related to Bos indicus × Bos taurus and Bos taurus than to Ovis aries and Sus scrofa.

  7. Evolution of the RH gene family in vertebrates revealed by brown hagfish (Eptatretus atami) genome sequences.

    Suzuki, Akinori; Komata, Hidero; Iwashita, Shogo; Seto, Shotaro; Ikeya, Hironobu; Tabata, Mitsutoshi; Kitano, Takashi


    In vertebrates, there are four major genes in the RH (Rhesus) gene family: RH, RHAG, RHBG, and RHCG. These genes are thought to have been formed by the two rounds of whole-genome duplication (2R-WGD) in the common ancestor of all vertebrates. In our previous work, where we analyzed details of the gene duplication process of this gene family, three nucleotide sequences belonging to this family were identified in Far Eastern brook lamprey (Lethenteron reissneri), and the phylogenetic positions of the genes were determined. Lampreys, along with hagfishes, are cyclostomata (jawless fishes), which is a sister group of gnathostomata (jawed vertebrates). Although those results suggested that one gene was orthologous to the gnathostome RHCG genes, we did not identify clear orthologues for other genes. In this study, therefore, we identified three novel cDNA sequences that belong to the RH gene family using de novo transcriptome analysis of another cyclostome: the brown hagfish (Eptatretus atami). We also determined the nucleotide sequences for the RHBG and RHCG genes in a red stingray (Dasyatis akajei), which belongs to the cartilaginous fishes. The phylogenetic tree showed that two brown hagfish genes, which were probably duplicated in the cyclostome lineage, formed a cluster with the gnathostome RHAG genes, whereas another brown hagfish gene formed a cluster with the gnathostome RHCG genes. We estimated that the RH genes had a higher evolutionary rate than the RHAG, RHBG, and RHCG genes. Interestingly, in the RHBG genes, only the bird lineage showed a higher rate of nonsynonymous substitutions. It is likely that this higher rate was caused by a state of relaxed functional constraints rather than by positive selection or pseudogenization.

  8. Cloning and nucleotide sequence of the Enterobacter aerogenes signal peptidase II (lsp) gene.

    Isaki, L; Kawakami, M; Beers, R; Hom, R; Wu, H.C.


    In Escherichia coli, prolipoprotein signal peptidase is encoded by the lsp gene, which is organized into an operon consisting of ileS, lsp, and three open reading frames, designated genes x, orf-149, and orf-316. The Enterobacter aerogenes lsp gene was cloned and expressed in E. coli. The nucleotide sequence of the Enterobacter aerogenes lsp gene and a part of its flanking sequences were determined. A high degree of homology was found between the E. coli ileS-lsp operon and the corresponding ...

  9. Application of next generation sequencing to human gene fusion detection: computational tools, features and perspectives.

    Wang, Qingguo; Xia, Junfeng; Jia, Peilin; Pao, William; Zhao, Zhongming


    Gene fusions are important genomic events in human cancer because their fusion gene products can drive the development of cancer and thus are potential prognostic tools or therapeutic targets in anti-cancer treatment. Major advancements have been made in computational approaches for fusion gene discovery over the past 3 years due to improvements and widespread applications of high-throughput next generation sequencing (NGS) technologies. To identify fusions from NGS data, existing methods typically leverage the strengths of both sequencing technologies and computational strategies. In this article, we review the NGS and computational features of existing methods for fusion gene detection and suggest directions for future development.

  10. Gene conversion-like events in the diversification of human rearranged IGHV3-23*01 gene sequences

    Bhargavi Duvvuri


    Full Text Available Gene conversion (GCV) as a mechanism of immunoglobulin diversification is well established in a few species. However, definitive evidence of GCV-like events in human immunoglobulin genes is scarce. GCV is mediated by activation-induced cytidine deaminase (AID). The lack of evidence of GCV in human rearranged immunoglobulin gene sequences is puzzling given the presence of highly similar germline donors and all the enzymatic machinery required for GCV. In this study, we undertook a computational analysis of rearranged IGHV3-23*01 gene sequences from common variable immunodeficiency (CVID) patients and healthy individuals to survey 'GCV-like' activities. Our search identified strong evidence of GCV-like patterns. Germline VH sequences were identified as potential donors for clustered mutations in rearranged IGHV3-23*01 gene sequences. We identified minimum and maximum sequence identities between donor and recipient sequences that can serve as targets for GCV, and our findings are consistent with those reported in the literature. We observed that GCV-like tracts are flanked by activation-induced cytidine deaminase (AID) hotspot motifs. Structural modeling of the IGHV3-23*01 gene sequence revealed that hypermutable bases flanking GCV-like tracts are in the single-stranded DNA (ssDNA) of stable stem-loop structures (SLSs). ssDNA is inherently fragile and also an optimal target for AID. We speculate that GCV could have been initiated by the targeting of hypermutable bases in the ssDNA state in stable SLSs, plausibly by AID. We have observed that the frequency of GCV-like events is significantly higher in rearranged IGHV3-23*01 sequences from healthy individuals compared to that of CVID patients. GCV, unlike SHM, can result in multiple base substitutions that can alter many amino acids. The extensive changes in antibody affinity by GCV-like events, as identified in this study, would be instrumental in protecting humans against pathogens that diversify their genome by

  11. Molecular evidence of lateral gene transfer in rpoB gene of Mycobacterium yongonense strains via multilocus sequence analysis.

    Byoung-Jun Kim

    Full Text Available Recently, a novel species, Mycobacterium yongonense (DSM 45126(T)), was introduced. While it is phylogenetically related to Mycobacterium intracellulare, it has a distinct RNA polymerase β-subunit gene (rpoB) sequence that is identical to that of Mycobacterium parascrofulaceum, a distantly related scotochromogen, which suggests the acquisition of the rpoB gene via a potential lateral gene transfer (LGT) event. The aim of this study is to prove the presence of the LGT event in the rpoB gene of the M. yongonense strains via multilocus sequence analysis (MLSA). In order to determine the potential of an LGT event in the rpoB gene of M. yongonense, MLSA based on full rpoB sequences (3447 or 3450 bp) and on partial sequences of five other targets [16S rRNA (1383 or 1395 bp), hsp65 (603 bp), dnaJ (192 bp), recA (1053 bp), and sodA (501 bp)] was conducted. Incongruences between the phylogenetic analysis of the full rpoB and the five other genes in a total of three M. yongonense strains [two clinical strains (MOTT-12 and MOTT-27) and one type strain (DSM 45126(T))] were observed, suggesting that the rpoB gene of the three M. yongonense strains may have been acquired very recently via an LGT event from M. parascrofulaceum.

  12. Targeting of AID-mediated sequence diversification to immunoglobulin genes.

    Kothapalli, Naga Rama; Fugmann, Sebastian D


    Activation-induced cytidine deaminase (AID) is a key enzyme for antibody-mediated immune responses. Antibodies are encoded by the immunoglobulin genes, and AID acts as a transcription-dependent DNA mutator on these genes to improve antibody affinity and effector functions. An emerging theme in the field is that many transcribed genes are potential targets of AID, presenting an obvious danger to genomic integrity. Thus there are mechanisms in place to ensure that mutagenic outcomes of AID activity are specifically restricted to the immunoglobulin loci. Cis-regulatory targeting elements mediate this effect, and their mode of action is probably a combination of immunoglobulin gene-specific activation of AID and a perversion of faithful DNA repair towards error-prone outcomes.

  13. Successful Recovery of Nuclear Protein-Coding Genes from Small Insects in Museums Using Illumina Sequencing.

    Kanda, Kojun; Pflug, James M; Sproul, John S; Dasenko, Mark A; Maddison, David R


    In this paper we explore high-throughput Illumina sequencing of nuclear protein-coding, ribosomal, and mitochondrial genes in small, dried insects stored in natural history collections. We sequenced one tenebrionid beetle and 12 carabid beetles ranging in size from 3.7 to 9.7 mm in length that have been stored in various museums for 4 to 84 years. Although we chose a number of old, small specimens for which we expected low sequence recovery, we successfully recovered at least some low-copy nuclear protein-coding genes from all specimens. For example, in one 56-year-old beetle, 4.4 mm in length, our de novo assembly recovered about 63% of approximately 41,900 nucleotides in a target suite of 67 nuclear protein-coding gene fragments, and 70% using a reference-based assembly. Even in the least successfully sequenced carabid specimen, reference-based assembly yielded fragments that were at least 50% of the target length for 34 of 67 nuclear protein-coding gene fragments. Exploration of alternative references for reference-based assembly revealed few signs of bias created by the reference. For all specimens we recovered almost complete copies of ribosomal and mitochondrial genes. We verified the general accuracy of the sequences through comparisons with sequences obtained from PCR and Sanger sequencing, including of conspecific, fresh specimens, and through phylogenetic analysis that tested the placement of sequences in predicted regions. A few possible inaccuracies in the sequences were detected, but these rarely affected the phylogenetic placement of the samples. Although our sample sizes are low, an exploratory regression study suggests that the dominant factor in predicting success at recovering nuclear protein-coding genes is a high number of Illumina reads, with success at PCR of COI and killing by immersion in ethanol being secondary factors; in analyses of only high-read samples, the primary significant explanatory variable was body length, with small beetles

  14. Phylogeny and identification of Enterococci by atpA gene sequence analysis.

    Naser, S; Thompson, F L; Hoste, B; Gevers, D; Vandemeulebroecke, K; Cleenwerck, I; Thompson, C C; Vancanneyt, M; Swings, J


    The relatedness among 91 Enterococcus strains representing all validly described species was investigated by comparing a 1,102-bp fragment of atpA, the gene encoding the alpha subunit of ATP synthase. The relationships observed were in agreement with the phylogeny inferred from 16S rRNA gene sequence analysis. However, atpA gene sequences were much more discriminatory than 16S rRNA for species differentiation. All species were differentiated on the basis of atpA sequences with, at a maximum, 92% similarity. Six members of the Enterococcus faecium species group (E. faecium, E. hirae, E. durans, E. villorum, E. mundtii, and E. ratti) showed > 99% 16S rRNA gene sequence similarity, but the highest value of atpA gene sequence similarity was only 89.9%. The intraspecies atpA sequence similarities for all species except E. faecium strains varied from 98.6 to 100%; the E. faecium strains had a lower atpA sequence similarity of 96.3%. Our data clearly show that atpA provides an alternative tool for the phylogenetic study and identification of enterococci.
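    The similarity percentages quoted in this record compare pairs of aligned gene fragments. The abstract does not say how similarity was calculated, so the following is only a minimal sketch of one common definition (matching columns divided by gap-free aligned columns). The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the study.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of gap-free aligned columns in which the two sequences match.

    Assumes `a` and `b` are already aligned (equal length, '-' for gaps).
    This is an illustrative definition, not the method used in the record above.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Keep only columns where neither sequence has a gap.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Toy example: three of four columns match.
    print(percent_identity("ACGT", "ACGA"))
```

    Other conventions (for example, counting gap columns in the denominator) give different values for the same alignment, so reported percentages are only comparable when the definition is the same.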

  15. Cloning and sequence analysis of chitin synthase gene fragments of Demodex mites.

    Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying


    To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi'an, China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was successful only in three D. canis isolates and one D. brevis isolate out of the nine Demodex isolates. The obtained fragments were 339 bp long for D. canis and 338 bp for D. brevis. The CHS gene sequence similarities between the three Xi'an D. canis isolates and one Japanese D. canis isolate ranged from 99.7% to 100.0%, and those between four D. canis isolates and one D. brevis isolate were 99.1%-99.4%. Phylogenetic trees based on maximum parsimony (MP) and maximum likelihood (ML) methods shared the same clusters, in accordance with the traditional classification. Two open reading frames (ORFs) were identified in each CHS gene sequenced, and their corresponding amino acid sequences were located at the catalytic domain. The relatively conserved sequences could be deduced to be a CHS class A gene, which is associated with chitin synthesis in the integument of Demodex mites.

  16. Therapeutic modulation of endogenous gene function by agents with designed DNA-sequence specificities

    Uil, T.G.; Haisma, H.J.; Rots, Marianne


    Designer molecules that can specifically target pre-determined DNA sequences provide a means to modulate endogenous gene function. Different classes of sequence-specific DNA-binding agents have been developed, including triplex-forming molecules, synthetic polyamides and designer zinc finger protein

  17. [cDNA cloning and sequence analysis of pluripotency genes in tree shrews (Tupaia belangeri)].

    Wang, Cai-Yun; Ma, Yun-Han; He, Da-Jian; Yang, Shi-Hua


    In this paper, partial sequences of the tree shrew (Tupaia belangeri) Klf4, Sox2, and c-Myc genes were cloned and sequenced; they were 382, 612, and 485 bp in length and encoded 127, 204, and 161 amino acids, respectively. Their cDNA sequence identities with the human sequences were 89%, 98%, and 89%, respectively. Their phylogenetic trees showed different topologies, suggesting individual evolutionary pathways. These results can facilitate further functional studies.
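    The fragment lengths and amino acid counts in this record are related by the three-nucleotide codon: a partial sequence of n bp contains at most n // 3 complete codons. The short sketch below checks that arithmetic for the three lengths quoted above. The helper name `complete_codons` is hypothetical, and the calculation assumes the reading frame starts at the first base of each fragment, which the abstract does not state.

```python
def complete_codons(length_bp: int) -> tuple:
    """Return (complete codons, leftover bases) for a fragment of length_bp.

    Assumes the reading frame begins at the first base (an assumption for
    illustration; the record above does not give the frame).
    """
    if length_bp < 0:
        raise ValueError("length must be non-negative")
    return divmod(length_bp, 3)


if __name__ == "__main__":
    # Lengths quoted in the record for Klf4, Sox2, and c-Myc.
    for gene, length in [("Klf4", 382), ("Sox2", 612), ("c-Myc", 485)]:
        codons, leftover = complete_codons(length)
        print(gene, length, "bp:", codons, "codons,", leftover, "leftover bases")
```

    Under that assumption the codon counts are 127, 204, and 161, matching the amino acid numbers reported in the abstract.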

  18. Gene tree discordance of wild and cultivated Asian rice deciphered by genome-wide sequence comparison.

    Yang, Ching-chia; Sakai, Hiroaki; Numa, Hisataka; Itoh, Takeshi


    Although a large number of genes are expected to correctly solve a phylogenetic relationship, inconsistent gene tree topologies have been observed. This conflicting evidence in gene tree topologies, known as gene tree discordance, becomes increasingly important as advanced sequencing technologies produce an enormous amount of sequence information for phylogenomic studies among closely related species. Here, we aim to characterize the gene tree discordance of the Asian cultivated rice Oryza sativa and its progenitor, O. rufipogon, which will be an ideal case study of gene tree discordance. Using genome and cDNA sequences of O. sativa and O. rufipogon, we have conducted the first in-depth analyses of gene tree discordance in Asian rice. Our comparison of full-length cDNA sequences of O. rufipogon with the genome sequences of the japonica and indica cultivars of O. sativa revealed that 60% of the gene trees showed a topology consistent with the expected one, whereas the remaining genes supported significantly different topologies. Moreover, the proportions of the topologies deviated significantly from expectation, suggesting at least one hybridization event between the two subgroups of O. sativa, japonica and indica. In fact, a genome-wide alignment between japonica and indica indicated that significant portions of the indica genome are derived from japonica. In addition, literature concerning the pedigree of the indica cultivar strongly supported the hybridization hypothesis. Our molecular evolutionary analyses deciphered complicated evolutionary processes in closely related species. They also demonstrated the importance of gene tree discordance in the era of high-speed DNA sequencing.

  19. Hindered proton collectivity in 28S: Possible magic number at Z=16

    Togano, Y; Iwasa, N; Yamada, K; Motobayashi, T; Aoi, N; Baba, H; Bishop, S; Cai, X; Doornenbal, P; Fang, D; Furukawa, T; Ieki, K; Kawabata, T; Kanno, S; Kobayashi, N; Kondo, Y; Kuboki, T; Kume, N; Kurita, K; Kurokawa, M; Ma, Y G; Matsuo, Y; Murakami, H; Matsushita, M; Nakamura, T; Okada, K; Ota, S; Satou, Y; Shimoura, S; Shioda, R; Tanaka, K N; Takeuchi, S; Tian, W; Wang, H; Wang, J; Yoneda, K


    The reduced transition probability B(E2; 0+ -> 2+) for 28S was obtained experimentally using Coulomb excitation at 53 MeV/nucleon. The resultant B(E2) value, 181(31) e^2 fm^4, is smaller than the expectation based on empirical B(E2) systematics. The double ratio |M_n/M_p|/(N/Z) of the 0+ -> 2+ transition in 28S was determined to be 1.9(2) by evaluating the M_n value from the known B(E2) value of the mirror nucleus 28Mg, showing the hindrance of proton collectivity relative to that of neutrons. These results indicate the emergence of the magic number Z=16 in the |T_z|=2 nucleus 28S.

  20. Cloning and Sequence Analysis of Light Variable Region Gene of Anti-human Retinoblastoma Monoclonal Antibody

    Xiufeng Zhong; Yongping Li; Shuqi Huang; Bo Ning; Chunyan Zhang; Jianliang Zheng; Guanguang Feng


    Purpose: To clone the variable region gene of the light chain of a monoclonal antibody against human retinoblastoma and to characterize its nucleotide and amino acid sequences. Methods: Total RNA was extracted from 3C6 hybridoma cells secreting a specific monoclonal antibody (McAb) against human retinoblastoma (RB), then reverse-transcribed into cDNA with oligo-dT primers. The variable region of the light chain (VL) gene fragments was amplified using polymerase chain reaction (PCR) and further cloned into the pGEM(R)-T Easy vector. Then, 3C6 VL cDNA was sequenced by Sanger's method. Homology analysis was done by NCBI BLAST. Results: The complete nucleotide sequence of 3C6 VL cDNA consisted of 321 bp encoding 107 amino acid residues, containing four framework regions (FRs) and three complementarity-determining regions (CDRs) as well as the typical structure of two Cys residues. The sequence is most homologous to a member of the Vk9 gene family, and its chain utilizes the Jk1 gene segment. Conclusion: The light chain variable region gene of the McAb against human RB was amplified successfully; it belongs to the Vk9 gene family and utilizes a Vk-Jk1 gene rearrangement. This study lays a good basis for constructing a recombinant antibody and for making new targeted therapeutic agents against retinoblastoma.

  1. Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

    Fields, C.A.


    The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler), has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0, has now been completed and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project, including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.

  2. [Recent progress in gene mapping through high-throughput sequencing technology and forward genetic approaches].

    Lu, Cairui; Zou, Changsong; Song, Guoli


    Traditional gene mapping using forward genetic approaches is conducted primarily through construction of a genetic linkage map, a process that is tedious and time-consuming and often results in low mapping accuracy and large mapping intervals. With the rapid development of high-throughput sequencing technology and the decreasing cost of sequencing, a variety of simple and quick methods of gene mapping through sequencing have been developed, including direct sequencing of the mutant genome, sequencing of selectively pooled mutant DNA, genetic map construction through sequencing of individuals in a population, and sequencing of the transcriptome and partial genome. These methods can be used to identify mutations at the nucleotide level and have been applied in complex genetic backgrounds. Recent reports have shown that mapping by sequencing can even be done without a reference genome sequence, hybridization, or genetic linkage information, which makes it possible to perform forward genetic studies in many non-model species. In this review, we summarize these new technologies and their application in gene mapping.

  3. Cloning, sequencing and identification of single nucleotide polymorphisms of partial sequence on the porcine CACNA1S gene


    The CACNA1S gene encodes the α1 subunit of the calcium channel. Mutations of the CACNA1S gene can cause hypokalemic periodic paralysis (HypoKPP) and malignant hyperthermia syndrome (MHS) in human beings. Current research on CACNA1S has been mainly in humans and model animals, and rarely in livestock and poultry. In this study, Yorkshire pigs (23), Pietrain pigs (30), Jinhua pigs (115) and second-generation crossbreds (126) of Jinhua and Pietrain were used. Primers were designed according to the sequence of the human CACNA1S gene, and PCR was carried out using pig genomic DNA. PCR products were sequenced and compared with the human sequence, and then single nucleotide polymorphisms (SNPs) were investigated by PCR-SSCP, while PCR-RFLP tests were performed to validate the mutations. Results indicated: (1) the 5211 bp DNA fragments of the porcine CACNA1S gene were acquired (GenBank accession number: DQ767693) and the identity of the exon region was 82.6% between human and pig; (2) fifty-seven mutations were found within the cloned sequences, among which 24 were in the exon region; (3) the results of PCR-RFLP were in accordance with those of PCR-SSCP. According to the ESTs of the porcine CACNA1S gene published in GenBank (Bx914582, Bx666997), 8 of the 11 SNPs identified in the present study were consistent with the base differences between the two EST fragments.

  4. Codon usage in mammalian genes is biased by sequence slippage mechanisms.

    Bains, W


    The codons for some conserved amino acids are found to be the same between homologous genes from different species when the statistics of codon usage would suggest that they should be different. I examine whether this 'coincidence' of codon usage could be due to genetic mechanisms homogenising the DNA around specific sites. This paper describes the further analysis of the coincident codons in 19 genes (a total of 96 homologues) for slippage. Coincident codons arise in contexts of increased sequence simplicity, and have a high chance of occurring within sequences similar to the recombination-prone minisatellite 'core' sequence. This suggests a role of genetic homogenisation in their generation.

  5. [Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence]

    Ortega, Maya


    My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and whatever black is, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacteria. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene can be used for bacterial identification of different species based on the sequence of their 16S rRNA gene. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.

  6. Detecting Sequence Homology at the Gene Cluster Level with MultiGeneBlast

    Medema, Marnix H.; Takano, Eriko; Breitling, Rainer; Nowick, Katja


    The genes encoding many biomolecular systems and pathways are genomically organized in operons or gene clusters. With MultiGeneBlast, we provide a user-friendly and effective tool to perform homology searches with operons or gene clusters as basic units, instead of single genes. The contextualizatio

  7. Disparate sequence characteristics of the Erysiphe graminis f.sp. hordei glyceraldehyde-3-phosphate dehydrogenase gene

    Christiansen, S.K.; Justesen, A.F.; Giese, H.


    to be similar for all four genes. The results of the codon-usage analysis suggest that Egh is more flexible than other fungi in the choice of nucleotides at the wobble position. Codon-usage preferences in Egh and barley genes indicate a level of difference which may be exploited to discriminate between fungal...... and plant genes in sequence mixtures. The Egh gpd promoter appears to be superior to that of the Egh beta-tubulin gene (tub2) for driving the E. coli beta-glucuronidase (GUS) gene in transformation experiments....

  8. Molecular cloning and analysis of the partial sequence of Rhinopithecus roxellanae growth hormone gene

    徐来祥; 孔繁华; 华育平


    The growth hormone gene (GH) of Rhinopithecus roxellanae was amplified by PCR for the first time, based on the sequences of reported mammalian growth hormone genes. The amplified fragment was about 1.8 kb. It was cloned and its upstream region was sequenced. The sequenced region consists of a 5' flanking regulatory region, exon I, intron I, and part of exon II of the growth hormone gene. Comparing the corresponding growth hormone gene sequences of Rhinopithecus roxellanae and pig, we concluded that the homology reached 81% in the region, and that the 5' flanking sequence was highly conserved. About 90% of the amino acids encoded by exon I and exon II were the same as those in pig. Many mutations occurred at the degenerate sites of the triplet code. The nucleotides of intron I showed only 72% homology with those in pig. This suggests that introns and the 3' flanking sequence may play an important part in growth hormone gene regulation in different animals.

  9. Chitin synthase 1 (Chs1) gene sequences of Microsporum equinum and Trichophyton equinum.

    Kano, R; Aihara, S; Nakamura, Y; Watanabe, S; Hasegawa, A


    Chitin synthase 1 (Chs1) genes from Microsporum equinum and Trichophyton equinum were compared with those of the other dermatophytes. The Chs1 nucleotide sequences of these dermatophytes from horses showed more than 80% similarity to those of Arthroderma benhamiae, A. fulvum, A. grubyi, A. gypseum, A. incurvatum, A. otae, A. simii, A. vanbreuseghemii, Epidermophyton floccosum, T. mentagrophytes var. interdigitale (T. interdigitale), T. rubrum and T. violaceum. An especially high degree of nucleotide sequence similarity, more than 99%, was noted between the Chs1 gene fragments of M. equinum and A. otae, and among those of T. equinum, T. interdigitale and A. vanbreuseghemii. The phylogenetic analysis of their sequences revealed that M. equinum was genetically very close to A. otae, and T. equinum to A. vanbreuseghemii. A molecular analysis of Chs1 genes will provide useful information on the genetic relatedness of M. equinum and T. equinum and confirm the value of DNA sequencing in the identification of these two dermatophytes.

  10. A novel method to discover fluoroquinolone antibiotic resistance (qnr) genes in fragmented nucleotide sequences

    Boulund Fredrik


    Full Text Available Abstract Background Broad-spectrum fluoroquinolone antibiotics are central in modern health care and are used to treat and prevent a wide range of bacterial infections. The recently discovered qnr genes provide a mechanism of resistance with the potential to rapidly spread between bacteria using horizontal gene transfer. As for many antibiotic resistance genes present in pathogens today, qnr genes are hypothesized to originate from environmental bacteria. The vast amount of data generated by shotgun metagenomics can therefore be used to explore the diversity of qnr genes in more detail. Results In this paper we describe a new method to identify qnr genes in nucleotide sequence data. We show, using cross-validation, that the method has a high statistical power of correctly classifying sequences from novel classes of qnr genes, even for fragments as short as 100 nucleotides. Based on sequences from public repositories, the method was able to identify all previously reported plasmid-mediated qnr genes. In addition, several fragments from novel putative qnr genes were identified in metagenomes. The method was also able to annotate 39 chromosomal variants of which 11 have previously not been reported in literature. Conclusions The method described in this paper significantly improves the sensitivity and specificity of identification and annotation of qnr genes in nucleotide sequence data. The predicted novel putative qnr genes in the metagenomic data support the hypothesis of a large and uncharacterized diversity within this family of resistance genes in environmental bacterial communities. An implementation of the method is freely available at

  11. Cloning Sequencing and Structural Manipulation of the Enterotoxin D and E Genes from Staphylococcus aureus


    time. Further characterization of the plasmid was carried out by restriction mapping of pIB485 was performed. pIB485 DNA was digested with EcoRI...the interruption of the gene by insertion of the phage DNA. To characterize this unique regulation of gene expression, we sequenced the lipase a solution of 0.1% carboxymethylcellulose (added to stabilize the emulsion) by sonication for 7 - 10 minutes at 50w. This suspension was used to

  12. Influences on gene expression in vivo by a Shine-Dalgarno sequence

    Jin, Haining; Zhao, Qing; Gonzalez de Valdivia, Ernesto I;


    The Shine-Dalgarno (SD+: 5'-AAGGAGG-3') sequence anchors the mRNA by base pairing to the 16S rRNA in the small ribosomal subunit during translation initiation. We have here compared how an SD+ sequence influences gene expression, if located upstream or downstream of an initiation codon....... The positive effect of an upstream SD+ is confirmed. A downstream SD+ gives decreased gene expression. This effect is also valid for appropriately modified natural Escherichia coli genes. If an SD+ is placed between two potential initiation codons, initiation takes place predominantly at the second start site...
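    The record above distinguishes an SD+ motif located upstream of an initiation codon from one located downstream. As a rough illustration of that distinction only, the sketch below scans a sequence for exact copies of the motif quoted in the abstract (5'-AAGGAGG-3') and sorts them by position relative to a start codon. The function name `sd_positions`, the `start_index` parameter, and the toy sequence are hypothetical; the study's own constructs and spacing rules are not described in the abstract and are not modelled here.

```python
SD_MOTIF = "AAGGAGG"  # SD+ motif as quoted in the abstract


def sd_positions(mrna: str, start_index: int) -> dict:
    """Classify exact SD+ motif hits as upstream or downstream of a start codon.

    `start_index` is the 0-based position of the first base of the start codon.
    Hits overlapping the start codon itself are ignored. Illustrative only.
    """
    seq = mrna.upper().replace("U", "T")  # accept RNA or DNA alphabets
    hits = {"upstream": [], "downstream": []}
    pos = seq.find(SD_MOTIF)
    while pos != -1:
        if pos + len(SD_MOTIF) <= start_index:
            hits["upstream"].append(pos)
        elif pos >= start_index + 3:
            hits["downstream"].append(pos)
        pos = seq.find(SD_MOTIF, pos + 1)
    return hits


if __name__ == "__main__":
    # Toy sequence: one motif before the ATG and one after it.
    toy = "GG" + SD_MOTIF + "TTTTTT" + "ATG" + "AAA" + SD_MOTIF + "CC"
    print(sd_positions(toy, toy.index("ATG")))
```

    Exact string matching is the simplest possible choice; real ribosome binding sites vary in sequence and in spacing from the start codon, so a position-weight or free-energy model would be needed for anything beyond a toy check.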

  13. Gene discovery and transcript analyses in the corn smut pathogen Ustilago maydis: expressed sequence tag and genome sequence comparison

    Saville Barry J


    Full Text Available Abstract Background Ustilago maydis is the basidiomycete fungus responsible for common smut of corn and is a model organism for the study of fungal phytopathogenesis. To aid in the annotation of the genome sequence of this organism, several expressed sequence tag (EST) libraries were generated from a variety of U. maydis cell types. In addition to utility in the context of gene identification and structure annotation, the ESTs were analyzed to identify differentially abundant transcripts and to detect evidence of alternative splicing and anti-sense transcription. Results Four cDNA libraries were constructed using RNA isolated from U. maydis diploid teliospores (U. maydis strains 518 × 521) and haploid cells of strain 521 grown under nutrient rich, carbon starved, and nitrogen starved conditions. Using the genome sequence as a scaffold, the 15,901 ESTs were assembled into 6,101 contiguous expressed sequences (contigs); among these, 5,482 corresponded to predicted genes in the MUMDB (MIPS Ustilago maydis database), while 619 aligned to regions of the genome not yet designated as genes in MUMDB. A comparison of EST abundance identified numerous genes that may be regulated in a cell type or starvation-specific manner. The transcriptional response to nitrogen starvation was assessed using RT-qPCR. The results of this suggest that there may be cross-talk between the nitrogen and carbon signalling pathways in U. maydis. Bioinformatic analysis identified numerous examples of alternative splicing and anti-sense transcription. While intron retention was the predominant form of alternative splicing in U. maydis, other varieties were also evident (e.g. exon skipping). Selected instances of both alternative splicing and anti-sense transcription were independently confirmed using RT-PCR. Conclusion Through this work: 1) substantial sequence information has been provided for U. maydis genome annotation; 2) new genes were identified through the discovery of 619

  14. RNA-Seq analysis and gene discovery of Andrias davidianus using Illumina short read sequencing.

    Fenggang Li

    Full Text Available The Chinese giant salamander, Andrias davidianus, is an important species in the course of evolution; however, there is insufficient genomic data in public databases for understanding its immunologic mechanisms. High-throughput transcriptome sequencing is necessary to generate an enormous number of transcript sequences from A. davidianus for gene discovery. In this study, we generated more than 40 million reads from samples of spleen and skin tissue using the Illumina paired-end sequencing technology. De novo assembly yielded 87,297 transcripts with a mean length of 734 base pairs (bp). Based on sequence similarity searches against known proteins, 38,916 genes were identified. Gene enrichment analysis determined that 981 transcripts were assigned to the immune system. Tissue-specific expression analysis indicated that 443 transcripts were specifically expressed in the spleen and skin. Among these transcripts, 147 transcripts were found to be involved in immune responses and inflammatory reactions, such as fucolectin, β-defensins and lymphotoxin beta. Eight tissue-specific genes were selected for validation using real time reverse transcription quantitative PCR (qRT-PCR). The results showed that these genes were significantly more expressed in spleen and skin than in other tissues, suggesting that these genes have vital roles in the immune response. This work provides a comprehensive genomic sequence resource for A. davidianus and lays the foundation for future research on the immunologic and disease resistance mechanisms of A. davidianus and other amphibians.

  15. Complexity of rice Hsp100 gene family: lessons from rice genome sequence data

    Gaurav Batra; Vineeta Singh Chauhan; Amanjot Singh; Neelam K Sarkar; Anil Grover


    Elucidation of genome sequence provides an excellent platform to understand the detailed complexity of the various gene families. Hsp100 is an important family of chaperones in diverse living systems. There are eight putative gene loci encoding Hsp100 proteins in the Arabidopsis genome. In rice, two full-length Hsp100 cDNAs have been isolated and sequenced so far. Analysis of rice genomic sequence by an in silico approach showed that the two isolated rice Hsp100 cDNAs correspond to the Os05g44340 and Os02g32520 genes in the rice genome database. There appear to be three additional proteins (encoded by the Os03g31300, Os04g32560 and Os04g33210 gene loci) that are variably homologous to Os05g44340 and Os02g32520 throughout the entire amino acid sequence. The above five rice Hsp100 genes show significant similarities in the signature sequences known to be conserved among Hsp100 proteins. While Os05g44340 encodes a cytoplasmic Hsp100 protein, those encoded by the other four genes are predicted to have chloroplast transit peptides.

  16. cDNA cloning and sequence analysis of NIb gene of soybean mosaic virus

    刘俊君; 彭学贤; 莽克强


    cDNA of soybean mosaic virus (Beijing isolate, SMV-BJ) has been synthesized, using viral genomic RNA as template and random hexanucleotides as primers. Based on the sequences of SMV-BJ coat protein (CP) gene as well as SMV- and WMV-II-related regions, oligonucleotides were made as primers for polymerase chain reaction (PCR). NIb gene of SMV-BJ was amplified by PCR, and cloned into pBluescript SK. The complete sequence was determined. The comparison of NIb genes between SMV-BJ and WMV-II (USA) shows that the similarity reaches 80.3% for the nucleotide sequence and 91.3% for the deduced amino acid sequence. Considering the high identities in the CP gene and the 3'-non-coding region between them, WMV-II might be considered as a watermelon strain of SMV. Besides, some unexpected sequences were found in the 3'-region of 2 NIb gene clones. Following modification and splicing, a binary vector of NIb gene has been constructed for its expression in higher plant for the purpose of studying the possible repl

  17. Sequence and secondary structure of the mitochondrial 16S ribosomal RNA gene of Ixodes scapularis.

    Krakowetz, Chantel N; Chilton, Neil B


    The complete DNA sequences and secondary structure of the mitochondrial (mt) 16S ribosomal (r) RNA gene were determined for six Ixodes scapularis adults. There were 44 variable nucleotide positions in the 1252 bp sequence alignment. Most (95%) nucleotide alterations did not affect the integrity of the secondary structure of the gene because they either occurred at unpaired positions or represented compensatory changes that maintained the base pairing in helices. A large proportion (75%) of the intraspecific variation in DNA sequence occurred within Domains I, II and VI of the 16S gene. Therefore, several regions within this gene may be highly informative for studies of the population genetics and phylogeography of I. scapularis, a major vector of pathogens of humans and domestic animals in North America.
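    The count of variable nucleotide positions reported in this abstract is the kind of figure that can be obtained by scanning the columns of a multiple sequence alignment. The sketch below shows one way to do that; the toy alignment and the choice to ignore gap characters are assumptions for illustration, not details taken from the study.

```python
def variable_positions(aligned: list[str]) -> list[int]:
    """Return 0-based column indices where aligned sequences differ.

    Gap characters ('-') are ignored when comparing bases; that is a
    simplifying assumption of this sketch.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    variable = []
    for index, column in enumerate(zip(*aligned)):
        bases = set(column) - {"-"}
        if len(bases) > 1:
            variable.append(index)
    return variable


# Toy alignment (made up): three sequences, six columns.
toy_alignment = ["ACGTAC", "ACGTTC", "ACATAC"]
sites = variable_positions(toy_alignment)
print(len(sites), "variable positions out of", len(toy_alignment[0]))
```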

  18. Phylogenetic analysis of freshwater mussel Corbicula regularis by 18S rRNA gene sequencing

    Magare V N


    Full Text Available Corbicula regularis is a freshwater mussel found in the Indian sub-continent. In the present study, phylogenetic characterization of this important bivalve was attempted using 18S ribosomal RNA gene markers. Genomic DNA was extracted and the 18S rRNA gene was amplified with universal primers. The amplification product was sequenced and compared with the nucleotide databases available online to evaluate the phylogenetic relationships of the animal under study. Results indicated that the 18S rRNA gene sequences of C. regularis showed a high degree of similarity to those of another freshwater mussel, C. fluminea. This work constitutes the first ever sequence deposition for C. regularis in the nucleotide databases, highlighting the usefulness of 18S ribosomal gene markers for phylogenetic analysis.

  19. Genomic organization and sequence analysis of the vomeronasal receptor V2R genes in mouse genome

    YANG Hui; Zhang YaPing


    Two multigene superfamilies, named V1R and V2R, encoding seven-transmembrane-domain G-protein coupled receptors (GPCRs) have been identified as pheromone receptors in mammals. Three V2R gene families have been described in mouse and rat. Here we screened the updated mouse genome sequence database and retrieved 63 putative functional V2R genes, including three newly identified genes which formed an additional family. We described the genomic organization of these genes and also characterized the conservation of mouse V2R protein sequences. This genomic and sequence information is useful as part of the evidence for speculating on the functional domains of V2Rs and should aid functional studies in the future.

  20. Annotation and Re-Sequencing of Genes from De Novo Transcriptome Assembly of Abies alba (Pinaceae)

    Anna M. Roschanski


    Full Text Available Premise of the study: We present a protocol for the annotation of transcriptome sequence data and the identification of candidate genes therein using the example of the nonmodel conifer Abies alba. Methods and Results: A normalized cDNA library was built from an A. alba seedling. The sequencing on a 454 platform yielded more than 1.5 million reads that were de novo assembled into 25 149 contigs. Two complementary approaches were applied to annotate gene fragments that code for (1) well-known proteins and (2) proteins that are potentially adaptively relevant. Primer development and testing yielded 88 amplicons that could successfully be resequenced from genomic DNA. Conclusions: The annotation workflow offers an efficient way to identify potential adaptively relevant genes from the large quantity of transcriptome sequence data. The primer set presented should be prioritized for single-nucleotide polymorphism detection in adaptively relevant genes in A. alba.

  1. Haplotypes and Sequence Variation in the Ovine Adiponectin Gene (ADIPOQ)

    Qing-Ming An


    Full Text Available The adiponectin gene (ADIPOQ) plays an important role in energy homeostasis. In this study five separate regions (regions 1 to 5) of ovine ADIPOQ were analysed using PCR-SSCP. Four different PCR-SSCP patterns (A1-D1, A2-D2) were detected in region-1 and region-2, respectively, with seven and six SNPs being revealed. In region-3, three different patterns (A3-C3) and three SNPs were observed. Two patterns (A4-B4, A5-B5) and two and one SNPs were observed in region-4 and region-5, respectively. In total, nineteen SNPs were detected, with five of them in the coding region and two (c.46T/C and c.515G/A) putatively resulting in amino acid changes (p.Tyr16His and p.Lys172Arg). In region-1, -2 and -3 of 316 sheep from eight New Zealand breeds, variants A1, A2 and A3 were the most common, although variant frequencies differed in the eight breeds. Across region-1 and region-3, nine haplotypes were identified and haplotypes A1-A3, A1-C3, B1-A3 and B1-C3 were most common. These results indicate that the ADIPOQ gene is polymorphic and suggest that further analysis is required to see if the variation in the gene is associated with animal production traits.

  2. Analysis of hepatitis B virus genotyping and drug resistance gene mutations based on massively parallel sequencing.

    Han, Yingxin; Zhang, Yinxin; Mei, Yanhua; Wang, Yuqi; Liu, Tao; Guan, Yanfang; Tan, Deming; Liang, Yu; Yang, Ling; Yi, Xin


    Drug resistance to nucleoside analogs is a serious problem worldwide. Both drug resistance gene mutation detection and HBV genotyping are helpful for guiding clinical treatment. Total HBV DNA from 395 patients who were treated with single or multiple drugs including Lamivudine, Adefovir, Entecavir, Telbivudine, Tenofovir and Emtricitabine were sequenced using the HiSeq 2000 sequencing system and validated using the 3730 sequencing system. In addition, a mixed sample of HBV plasmid DNA was used to determine the cutoff value for HiSeq-sequencing, and 52 of the 395 samples were sequenced three times to evaluate the repeatability and stability of this technology. Of the 395 samples sequenced using both HiSeq and 3730 sequencing, the results from 346 were consistent, and the results from 49 were inconsistent. Among the 49 inconsistent results, 13 samples were detected as drug-resistance-positive using HiSeq but negative using 3730, and the other 36 samples showed a higher number of drug-resistance-positive gene mutations using HiSeq 2000 than using 3730. Gene mutations had an apparent frequency of 1% as assessed by the plasmid testing. Therefore, a 1% cutoff value was adopted. Furthermore, the experiment was repeated three times, and the same results were obtained in 49/52 samples using the HiSeq sequencing system. HiSeq sequencing can be used to analyze HBV gene mutations with high sensitivity, high fidelity, high throughput and automation and is a potential method for hepatitis B virus gene mutation detection and genotyping.

  3. Expression of the human glucokinase gene: important roles of the 5' flanking and intron 1 sequences.

    Yi Wang

    Full Text Available BACKGROUND: Glucokinase plays important tissue-specific roles in human physiology, where it acts as a sensor of blood glucose levels in the pancreas, and a few other cells of the gut and brain, and as the rate-limiting step in glucose metabolism in the liver. Liver-specific expression is driven by one of the two tissue-specific promoters, and has an absolute requirement for insulin. The sequences that mediate regulation by insulin are incompletely understood. METHODOLOGY/PRINCIPAL FINDINGS: To better understand the liver-specific expression of the human glucokinase gene we compared the structures of this gene from diverse mammals. Much of the sequence located between the 5' pancreatic beta-cell-specific and downstream liver-specific promoters of the glucokinase genes is composed of repetitive DNA elements that were inserted in parallel on different mammalian lineages. The transcriptional activity of the liver-specific promoter 5' flanking sequences was tested with and without downstream intronic sequences in two human liver cell lines, HepG2 and L-02. While glucokinase liver-specific 5' flanking sequences support expression in liver cell lines, a sequence located about 2000 bases 3' to the liver-specific mRNA start site represses gene expression. Enhanced reporter gene expression was observed in both cell lines when cells were treated with fetal calf serum, but only in the L-02 cells was expression enhanced by insulin. CONCLUSIONS/SIGNIFICANCE: Our results suggest that the normal liver L-02 cell line may be a better model to understand the regulation of the liver-specific expression of the human glucokinase gene. Our results also suggest that sequences downstream of the liver-specific mRNA start site have important roles in the regulation of liver-specific glucokinase gene expression.

  4. The impact of gene duplication, insertion, deletion, lateral gene transfer and sequencing error on orthology inference: a simulation study.

    Dalquen, Daniel A; Altenhoff, Adrian M; Gonnet, Gaston H; Dessimoz, Christophe


    The identification of orthologous genes, a prerequisite for numerous analyses in comparative and functional genomics, is commonly performed computationally from protein sequences. Several previous studies have compared the accuracy of orthology inference methods, but simulated data has not typically been considered in cross-method assessment studies. Yet, while dependent on model assumptions, simulation-based benchmarking offers unique advantages: contrary to empirical data, all aspects of simulated data are known with certainty. Furthermore, the flexibility of simulation makes it possible to investigate performance factors in isolation of one another. Here, we use simulated data to dissect the performance of six methods for orthology inference available as standalone software packages (Inparanoid, OMA, OrthoInspector, OrthoMCL, QuartetS, SPIMAP) as well as two generic approaches (bidirectional best hit and reciprocal smallest distance). We investigate the impact of various evolutionary forces (gene duplication, insertion, deletion, and lateral gene transfer) and technological artefacts (ambiguous sequences) on orthology inference. We show that while gene duplication/loss and insertion/deletion are well handled by most methods (albeit for different trade-offs of precision and recall), lateral gene transfer disrupts all methods. As for ambiguous sequences, which might result from poor sequencing, assembly, or genome annotation, we show that they affect alignment score-based orthology methods more strongly than their distance-based counterparts.
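    Of the generic approaches named in this abstract, the bidirectional best hit is simple enough to sketch in a few lines. The example below assumes that pairwise alignment scores between the genes of two genomes have already been computed; the score table, gene names and function name are hypothetical, and tie handling is left naive.

```python
def bidirectional_best_hits(scores: dict[tuple[str, str], float]) -> list[tuple[str, str]]:
    """Return gene pairs (a, b) where b is a's best hit and a is b's best hit.

    scores maps (gene_in_genome_A, gene_in_genome_B) to an alignment score,
    higher meaning more similar. On ties the first pair seen wins, which is
    an arbitrary choice made for this sketch.
    """
    best_for_a: dict[str, tuple[str, float]] = {}
    best_for_b: dict[str, tuple[str, float]] = {}
    for (a, b), score in scores.items():
        if a not in best_for_a or score > best_for_a[a][1]:
            best_for_a[a] = (b, score)
        if b not in best_for_b or score > best_for_b[b][1]:
            best_for_b[b] = (a, score)
    return sorted(
        (a, b) for a, (b, _) in best_for_a.items() if best_for_b[b][0] == a
    )


# Toy score table (made up). a3's best hit is b2, but b2 prefers a2,
# so a3 ends up without a predicted ortholog.
toy_scores = {
    ("a1", "b1"): 200.0,
    ("a1", "b2"): 150.0,
    ("a2", "b1"): 90.0,
    ("a2", "b2"): 180.0,
    ("a3", "b2"): 170.0,
}
print(bidirectional_best_hits(toy_scores))
```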

  5. The impact of gene duplication, insertion, deletion, lateral gene transfer and sequencing error on orthology inference: a simulation study.

    Daniel A Dalquen

    Full Text Available The identification of orthologous genes, a prerequisite for numerous analyses in comparative and functional genomics, is commonly performed computationally from protein sequences. Several previous studies have compared the accuracy of orthology inference methods, but simulated data has not typically been considered in cross-method assessment studies. Yet, while dependent on model assumptions, simulation-based benchmarking offers unique advantages: contrary to empirical data, all aspects of simulated data are known with certainty. Furthermore, the flexibility of simulation makes it possible to investigate performance factors in isolation of one another. Here, we use simulated data to dissect the performance of six methods for orthology inference available as standalone software packages (Inparanoid, OMA, OrthoInspector, OrthoMCL, QuartetS, SPIMAP) as well as two generic approaches (bidirectional best hit and reciprocal smallest distance). We investigate the impact of various evolutionary forces (gene duplication, insertion, deletion, and lateral gene transfer) and technological artefacts (ambiguous sequences) on orthology inference. We show that while gene duplication/loss and insertion/deletion are well handled by most methods (albeit for different trade-offs of precision and recall), lateral gene transfer disrupts all methods. As for ambiguous sequences, which might result from poor sequencing, assembly, or genome annotation, we show that they affect alignment score-based orthology methods more strongly than their distance-based counterparts.

  6. Strategy for microbiome analysis using 16S rRNA gene sequence analysis on the Illumina sequencing platform.

    Ram, Jeffrey L; Karim, Aos S; Sendler, Edward D; Kato, Ikuko


    Understanding the identity and changes of organisms in the urogenital and other microbiomes of the human body may be key to discovering causes and new treatments of many ailments, such as vaginosis. High-throughput sequencing technologies have recently enabled discovery of the great diversity of the human microbiome. The cost per base of many of these sequencing platforms remains high (thousands of dollars per sample); however, the Illumina Genome Analyzer (IGA) is estimated to have a cost per base less than one-fifth of its nearest competitor. The main disadvantage of the IGA for sequencing PCR-amplified 16S rRNA genes is that the maximum read-length of the IGA is only 100 bases, whereas at least 300 bases are needed to obtain phylogenetically informative data down to the genus and species level. In this paper we describe and conduct a pilot test of a multiplex sequencing strategy suitable for achieving total reads of > 300 bases per extracted DNA molecule on the IGA. Results show that all proposed primers produce products of the expected size and that correct sequences can be obtained with all proposed forward primers. Various bioinformatic optimizations of the Illumina Bustard analysis pipeline proved necessary to extract the correct sequence from IGA image data, and these modifications of the data files indicate that further optimization of the analysis pipeline may improve the quality rankings of the data and enable more sequence to be correctly analyzed. The successful application of this method could result in an unprecedentedly deep description (800,000 taxonomic identifications per sample) of the urogenital and other microbiomes in a large number of samples at a reasonable cost per sample.

  7. Myelin protein zero gene sequencing diagnoses Charcot-Marie-Tooth Type 1B disease

    Su, Y.; Zhang, H.; Madrid, R. [Univ. of California, San Francisco, CA (United States)] [and others]


    Charcot-Marie-Tooth disease (CMT), the most common genetic neuropathy, affects about 1 in 2600 people in Norway and is found worldwide. CMT Type 1 (CMT1) has slow nerve conduction with demyelinated Schwann cells. Autosomal dominant CMT Type 1B (CMT1B) results from mutations in the myelin protein zero gene which directs the synthesis of more than half of all Schwann cell protein. This gene was mapped to the chromosome 1q22-1q23.1 borderline by fluorescence in situ hybridization. The first 7 of 7 reported CMT1B mutations are unique. Thus the most effective means to identify CMT1B mutations in at-risk family members and fetuses is to sequence the entire coding sequence in dominant or sporadic CMT patients without the CMT1A duplication. Of the 19 primers used in 16 pairs to uniquely amplify the entire MPZ coding sequence, 6 primer pairs were used to amplify and sequence the 6 exons. The DyeDeoxy Terminator cycle sequencing method used with four different color fluorescent labels was superior to manual sequencing because it sequences more bases unambiguously from extracted genomic DNA samples within 24 hours. This protocol was used to test 28 CMT and Dejerine-Sottas patients without CMT1A gene duplication. Sequencing MPZ gene-specific amplified fragments identified 9 polymorphic sites within the 6 exons that encode the 248 amino acid MPZ protein. The large number of major CMT1B mutations identified by single strand sequencing are being verified by reverse strand sequencing and, when possible, by restriction enzyme analysis. This protocol can be used to distinguish CMT1B patients from other CMT phenotypes and to determine the CMT1B status of relatives both presymptomatically and prenatally.

  8. Identification of vernalization responsive genes in the winter wheat cultivar Jing841 by transcriptome sequencing



    This study aimed to identify vernalization responsive genes in the winter wheat cultivar Jing841 by comparing the transcriptome data with that of a spring wheat cultivar Liaochun10. For each cultivar, seedlings before and after the vernalization treatment were sequenced by Solexa/Illumina sequencing. Genes differentially expressed after and before vernalization were identified as differentially expressed genes (DEGs) using false discovery rate (FDR) ≤ 0.001 and |log2 (fold change)| > 1 as cutoffs. The Jing841-specific DEGs were screened and subjected to functional annotation using the gene ontology (GO) database. Vernalization responsive genes among the specific genes were selected for validation by quantitative reverse transcription polymerase chain reaction (qRT-PCR), and the expression change over time was investigated for the top 11 genes with the most significant expression differences. A total of 138,062 unigenes were obtained. Overall, 636 DEGs were identified as vernalization responsive genes including some known genes such as VRN-1 and COR14a, and some unknown contigs. The qRT-PCR validated changes in the expression of 18 DEGs that were detected by RNA-seq. Among them, 11 genes displayed four different types of expression patterns over time during the 30-day-long vernalization treatment. Genes or contigs such as VRN-A1, COR14a, IRIP, unigene1806 and Cl18953.Contig2 probably have critical roles in vernalization.
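    The two cutoffs quoted in this abstract (FDR ≤ 0.001 and |log2 fold change| > 1) can be applied as a simple filter. The sketch below is only an illustration: the gene names, counts and FDR values are invented, and the pseudocount added before taking the ratio is an assumption of this example rather than something the abstract states.

```python
import math

FDR_CUTOFF = 0.001     # from the abstract: FDR <= 0.001
LOG2FC_CUTOFF = 1.0    # from the abstract: |log2(fold change)| > 1


def is_deg(count_before: float, count_after: float, fdr: float,
           pseudocount: float = 1.0) -> bool:
    """Apply both cutoffs to one gene.

    The pseudocount avoids division by zero for unexpressed genes; it is an
    assumption of this sketch, not a detail reported in the study.
    """
    log2_fold_change = math.log2(
        (count_after + pseudocount) / (count_before + pseudocount)
    )
    return fdr <= FDR_CUTOFF and abs(log2_fold_change) > LOG2FC_CUTOFF


# Invented records: (gene, expression before, expression after, FDR).
toy_records = [
    ("geneA", 10, 100, 0.0001),   # strong increase, significant
    ("geneB", 50, 60, 0.0001),    # significant but small change
    ("geneC", 100, 10, 0.01),     # large change but FDR too high
    ("geneD", 200, 40, 0.0005),   # strong decrease, significant
]
degs = [gene for gene, before, after, fdr in toy_records if is_deg(before, after, fdr)]
print(degs)
```

    Taking the absolute value of the log2 ratio means that both up- and down-regulated genes pass the filter, matching the |log2 (fold change)| notation in the abstract.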

  9. Sequence diversities of serine-aspartate repeat genes among Staphylococcus aureus isolates from different hosts presumably by horizontal gene transfer.

    Huping Xue

    Full Text Available BACKGROUND: Horizontal gene transfer (HGT) is recognized as one of the major forces for bacterial genome evolution. Many clinically important bacteria may acquire virulence factors and antibiotic resistance through HGT. The comparative genomic analysis has become an important tool for identifying HGT in emerging pathogens. In this study, the Serine-Aspartate Repeat (Sdr) family has been compared among different sources of Staphylococcus aureus (S. aureus) to discover sequence diversities within their genomes. METHODOLOGY/PRINCIPAL FINDINGS: Four sdr genes were analyzed for 21 different S. aureus strains and 218 mastitis-associated S. aureus isolates from Canada. Comparative genomic analyses revealed that S. aureus strains from bovine mastitis (RF122 and mastitis isolates in this study), ovine mastitis (ED133), pig (ST398), chicken (ED98), and human methicillin-resistant S. aureus (MRSA) (TCH130, MRSA252, Mu3, Mu50, N315, 04-02981, JH1 and JH9) were highly associated with one another, presumably due to HGT. In addition, several types of insertion and deletion were found in sdr genes of many isolates. A new insertion sequence was found in mastitis isolates, which was presumably responsible for the HGT of sdrC gene among different strains. Moreover, the sdr genes could be used to type S. aureus. Regional difference of sdr genes distribution was also indicated among the tested S. aureus isolates. Finally, certain associations were found between sdr genes and subclinical or clinical mastitis isolates. CONCLUSIONS: Certain sdr gene sequences were shared in S. aureus strains and isolates from different species presumably due to HGT. Our results also suggest that the distributional assay of virulence factors should detect the full sequences or full functional regions of these factors. The traditional assay using short conserved regions may not be accurate or credible. These findings have important implications with regard to animal husbandry practices that may

  10. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    Kaas Rolf S


    Full Text Available Abstract Background Escherichia coli exists in commensal and pathogenic forms. By measuring the variation of individual genes across more than a hundred sequenced genomes, gene variation can be studied in detail, including the number of mutations found for any given gene. This knowledge will be useful for creating better phylogenies, for determination of molecular clocks and for improved typing techniques. Results We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment, and a pan-genome tree, based on gene presence/absence, map the relatedness of the 186 sequenced E. coli genomes. The core-gene tree displays high confidence and divides the E. coli strains into the observed MLST type clades and also separates defined phylotypes. Conclusion The results of comparing a large and diverse E. coli dataset support the theory that reliable and good resolution phylogenies can be inferred from the core-genome. The results further suggest that the resolution at the isolate level may subsequently be improved by targeting more variable genes. The use of whole genome sequencing will make it possible to eliminate, or at least reduce, the need for several typing steps used in traditional epidemiology.
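    The distinction this abstract draws between a strict core (clusters present in 100% of genomes) and a 'soft core' (present in at least 95%) can be expressed as a threshold on a presence/absence table. The following is a minimal sketch with invented cluster and genome names; it says nothing about how the study itself clustered genes.

```python
def clusters_at_threshold(presence: dict[str, set[str]], n_genomes: int,
                          min_percent: int) -> list[str]:
    """Return clusters found in at least min_percent of the genomes.

    presence maps a cluster name to the set of genomes that carry it.
    Integer arithmetic is used so the 95% boundary is compared exactly.
    """
    return sorted(
        cluster
        for cluster, genomes in presence.items()
        if len(genomes) * 100 >= min_percent * n_genomes
    )


# Toy data (made up): 20 genomes, three gene clusters.
genomes = [f"genome{i}" for i in range(20)]
toy_presence = {
    "clusterA": set(genomes),        # present in 20/20 genomes
    "clusterB": set(genomes[:19]),   # present in 19/20 genomes (95%)
    "clusterC": set(genomes[:10]),   # present in 10/20 genomes
}
strict_core = clusters_at_threshold(toy_presence, len(genomes), 100)
soft_core = clusters_at_threshold(toy_presence, len(genomes), 95)
pan_genome_size = len(toy_presence)
print(strict_core, soft_core, pan_genome_size)
```

    With draft-quality assemblies a gene can be missing from a genome simply because of a gap, which is the reason the abstract gives for regarding the soft core as the more biologically relevant set.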

  11. Automated DNA mutation detection using universal conditions direct sequencing: application to ten muscular dystrophy genes

    Wu Bai-Lin


    Full Text Available Abstract Background One of the most common and efficient methods for detecting mutations in genes is PCR amplification followed by direct sequencing. Until recently, the process of designing PCR assays has been to focus on individual assay parameters rather than concentrating on matching conditions for a set of assays. Primers for each individual assay were selected based on location and sequence concerns. The two primer sequences were then iteratively adjusted to make the individual assays work properly. This generally resulted in groups of assays with different annealing temperatures that required the use of multiple thermal cyclers or multiple passes in a single thermal cycler making diagnostic testing time-consuming, laborious and expensive. These factors have severely hampered diagnostic testing services, leaving many families without an answer for the exact cause of a familial genetic disease. A search of GeneTests for sequencing analysis of the entire coding sequence for genes that are known to cause muscular dystrophies returns only a small list of laboratories that perform comprehensive gene panels. The hypothesis for the study was that a complete set of universal assays can be designed to amplify and sequence any gene or family of genes using computer aided design tools. If true, this would allow automation and optimization of the mutation detection process resulting in reduced cost and increased throughput. Results An automated process has been developed for the detection of deletions, duplications/insertions and point mutations in any gene or family of genes and has been applied to ten genes known to bear mutations that cause muscular dystrophy: DMD; CAV3; CAPN3; FKRP; TRIM32; LMNA; SGCA; SGCB; SGCG; SGCD. Using this process, mutations have been found in five DMD patients and four LGMD patients (one in the FKRP gene, one in the CAV3 gene, and two likely causative heterozygous pairs of variations in the CAPN3 gene of two other

  12. Multiple Cis-Acting Sequences Contribute to Evolved Regulatory Variation for Drosophila Adh Genes

    Fang, X. M.; Brennan, M. D.


    Drosophila affinidisjuncta and Drosophila hawaiiensis are closely related species that display distinct tissue-specific expression patterns for their homologous alcohol dehydrogenase genes (Adh genes). In Drosophila melanogaster transformants, both genes are expressed at high levels in the larval and adult fat bodies, but the D. affinidisjuncta gene is expressed 10-50-fold more strongly in the larval and adult midguts and Malpighian tubules. The present study reports the mapping of cis-acting sequences contributing to the regulatory differences between these two genes in transformants. Chimeric genes were constructed and introduced into the germ line of D. melanogaster. Stage- and tissue-specific expression patterns were determined by measuring steady-state RNA levels in larvae and adults. Three portions of the promoter region make distinct contributions to the tissue-specific regulatory differences between the native genes. Sequences immediately upstream of the distal promoter have a strong effect in the adult Malpighian tubules, while sequences between the two promoters are relatively important in the larval Malpighian tubules. A third gene segment, immediately upstream of the proximal promoter, influences levels of the proximal Adh transcript in all tissues and developmental stages examined, and largely accounts for the regulatory difference in the larval and adult midguts. However, these as well as other sequences make smaller contributions to various aspects of the tissue-specific regulatory differences. In addition, some chimeric genes display aberrant RNA levels for the whole organism, suggesting close physical association between sequences involved in tissue-specific regulatory differences and those important for Adh expression in the larval and adult fat bodies. PMID:1644276

  13. Application of gene sequencing directly to identify the pathogens in specimens

    LU Xin-xin; YUAN Liang; WAN Xiao-hua; GENG Jia-jing


    Background Accurate identification of bacterial isolates is an essential task in clinical microbiology. This study compared culturing to analyzing 16S rRNA gene sequences as methods to identify bacteria in clinical samples. We developed a key technique to directly identify bacteria in clinical samples via nucleic acid sequences, thus improving the ability to confirm pathogens. Methods We obtained 225 samples from Beijing Tongran Hospital and examined them by conventional culture and 16S rDNA sequencing to identify pathogens. This study made use of a modified sample pre-treatment technique which came from our laboratory to extract DNA. 16S rDNA was amplified by PCR. The amplified product was sequenced on a CEQ8000 capillary sequencer. Sequences were uploaded to the GenBank BLAST database for comparison. Results Among the positively cultivated bacterial strains, seven strains were identified differently by Vitek32 and by 16S rDNA sequencing. Twelve samples that were negative by standard culturing were determined to have pathogens by sequence analysis. Conclusion The use of 16S rRNA gene sequencing can improve clinical microbiology by providing better identification of unidentified bacteria or providing reference identification of unusual strains.

  14. Sequence and chromosomal localization of the mouse brevican gene

    Rauch, U; Meyer, H; Brakebusch, C


    Brevican is a brain-specific proteoglycan belonging to the aggrecan family. Phage clones containing the complete mouse brevican open reading frame of 2649 bp and the complete 3'-untranslated region of 341 bp were isolated from a mouse brain cDNA library, and cosmid clones containing the mouse bre...... to an alternative brevican cDNA, coding for a GPI-linked isoform. Single strand conformation polymorphism analysis mapped the brevican gene (Bcan) to chromosome 3 between the microsatellite markers D3Mit22 and D3Mit11....

  15. Candidate genes revealed by a genome scan for mosquito resistance to a bacterial insecticide: sequence and gene expression variations

    David Jean-Philippe


    Full Text Available Abstract Background Genome scans are becoming an increasingly popular approach to study the genetic basis of adaptation and speciation, but on their own, they are often helpless at identifying the specific gene(s) or mutation(s) targeted by selection. This shortcoming is hopefully bound to disappear in the near future, thanks to the wealth of new genomic resources that are currently being developed for many species. In this article, we provide a foretaste of this exciting new era by conducting a genome scan in the mosquito Aedes aegypti with the aim to look for candidate genes involved in resistance to Bacillus thuringiensis subsp. israelensis (Bti) insecticidal toxins. Results The genomes of a Bti-resistant and a Bti-susceptible strain were surveyed using about 500 MITE-based molecular markers, and the loci showing the highest inter-strain genetic differentiation were sequenced and mapped on the Aedes aegypti genome sequence. Several good candidate genes for Bti-resistance were identified in the vicinity of these highly differentiated markers. Two of them, coding for a cadherin and a leucine aminopeptidase, were further examined at the sequence and gene expression levels. In the resistant strain, the cadherin gene displayed patterns of nucleotide polymorphisms consistent with the action of positive selection (e.g. an excess of high compared to intermediate frequency mutations), as well as a significant under-expression compared to the susceptible strain. Conclusion Both sequence and gene expression analyses agree to suggest a role for positive selection in the evolution of this cadherin gene in the resistant strain. However, it is unlikely that resistance to Bti is conferred by this gene alone, and further investigation will be needed to characterize other genes significantly associated with Bti resistance in Ae. aegypti. Beyond these results, this article illustrates how genome scans can build on the body of new genomic information (here, full

  16. Sequence and organization of coelacanth neurohypophysial hormone genes: Evolutionary history of the vertebrate neurohypophysial hormone gene locus

    Brenner Sydney


    Full Text Available Abstract Background The mammalian neurohypophysial hormones, vasopressin and oxytocin, are involved in osmoregulation and uterine smooth muscle contraction respectively. All jawed vertebrates contain at least one homolog each of vasopressin and oxytocin whereas jawless vertebrates contain a single neurohypophysial hormone called vasotocin. The vasopressin homolog in non-mammalian vertebrates is vasotocin; and the oxytocin homolog is mesotocin in non-eutherian tetrapods, mesotocin and [Phe2]mesotocin in lungfishes, and isotocin in ray-finned fishes. The genes encoding vasopressin and oxytocin are closely linked in the human and rodent genomes in a tail-to-tail orientation. In contrast, their pufferfish homologs (vasotocin and isotocin) are located on the same strand of DNA with the isotocin gene located upstream of the vasotocin gene, separated by five genes, suggesting that this locus has experienced rearrangements in either the mammalian or the ray-finned fish lineage, or in both lineages. The coelacanths occupy a unique phylogenetic position close to the divergence of the mammalian and ray-finned fish lineages. Results We have sequenced a coelacanth (Latimeria menadoensis) BAC clone encompassing the neurohypophysial hormone genes and investigated the evolutionary history of the vertebrate neurohypophysial hormone gene locus within a comparative genomics framework. The coelacanth contains vasotocin and mesotocin genes like non-mammalian tetrapods. The coelacanth genes are present on the same strand of DNA with no intervening genes, with the vasotocin gene located upstream of the mesotocin gene. Nucleotide sequences of the second exons of the two genes are under purifying selection implying a regulatory function. We have also analyzed the neurohypophysial hormone gene locus in the genomes of opossum, chicken and Xenopus tropicalis. The opossum contains two tandem copies of vasopressin and mesotocin genes. The vasotocin and mesotocin genes in chicken and

  17. Isolation, sequence identification and tissue expression profiles of 3 novel porcine genes: ASPA, NAGA, and HEXA.

    Shu, Xianghua; Liu, Yonggang; Yang, Liangyu; Song, Chunlian; Hou, Jiafa


    The complete coding sequences of 3 porcine genes - ASPA, NAGA, and HEXA - were amplified by the reverse transcriptase polymerase chain reaction (RT-PCR) based on the conserved sequence information of the mouse or other mammals and referenced pig ESTs. These 3 novel porcine genes were then deposited in the NCBI database and assigned GeneIDs: 100142661, 100142664 and 100142667. The phylogenetic tree analysis revealed that the porcine ASPA, NAGA, and HEXA all have closer genetic relationships with the ASPA, NAGA, and HEXA of cattle. Tissue expression profile analysis was also carried out and results revealed that swine ASPA, NAGA, and HEXA genes were differentially expressed in various organs, including skeletal muscle, the heart, liver, fat, kidney, lung, and small and large intestines. Our experiment is the first one to establish the foundation for further research on these 3 swine genes.

  18. Effect of 5'-flanking sequence deletions on expression of the human insulin gene in transgenic mice

    Fromont-Racine, M; Bucchini, D; Madsen, O;


    Expression of the human insulin gene was examined in transgenic mouse lines carrying the gene with various lengths of DNA sequences 5' to the transcription start site (+1). Expression of the transgene was demonstrated by 1) the presence of human C-peptide in urine, 2) the presence of specific tra......, and -168 allowed correct initiation of the transcripts and cell specificity of expression, while quantitative expression gradually decreased. Deletion to -58 completely abolished the expression of the gene. The amount of human product that in mice harboring the longest fragment contributes up to 50...... of the transgene was observed in cell types other than beta-islet cells.......

  19. [Cloning and sequencing the isopenicillin N synthetase(IPNS) gene from Streptomyces cattleya].

    Wang, Y; Li, R


    Great homology exists between IPNS genes from producers of sulfur-containing beta-lactam antibiotics, including procaryotes and eucaryotes. A homologous DNA band was confirmed in S. cattleya by Southern blot analysis using the IPNS gene from S. lipmanii as a probe. A recombinant plasmid containing the cyclase gene involved in thienamycin biosynthesis and the IPNS gene was obtained by complementary cloning with a mutant from S. cattleya. DNA sequencing revealed that the IPNS gene of S. cattleya consists of 963 bp encoding a protein of 321 amino acids, with ATG as the start codon and TGA as the stop codon. Pairwise comparison of the predicted amino acid sequences showed 56% and 64% similarity with the IPNSs of S. clavuligerus and S. lipmanii, respectively.

  20. Sequence Analysis of the Protein Structure Homology Modeling of Growth Hormone Gene from Salmo trutta caspius

    Abolhasan Rezaei


    Full Text Available The growth hormone protein of Salmo trutta caspius was investigated and characterized. The growth hormone gene of Salmo trutta caspius has six exons in its full length, with a reported molecular weight (kDa) of 64.98 for ssDNA and 129.6 for dsDNA. The encoded protein has 210 amino acid residues. The assembled full-length DNA contains the open reading frame of the growth hormone gene, which contains 15 sequences in the full length. The average GC content is 47% and the AT content is 53%. Multiple protein alignment showed that this peptide is 100% identical to the corresponding homologous growth hormone protein sequences of Salmo salar (accession number AAA49558.1) and rainbow trout (Salmo trutta) (accession number AAA49555.1). The protein sequence was deposited in GenBank under accession number AEK70940. We also analyzed the secondary and tertiary structures of the sequences reported in GenBank. The results show homology in secondary structure among the three sequences: Salmo trutta caspius, Salmo salar and rainbow trout. Regarding tertiary structure, Salmo trutta caspius and Salmo salar are of the same type, but rainbow trout differs from both. However, three parallel α helices were observed in the sequences, and the secondary structures contained almost the same percentage of β sheet.
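
    The GC and AT percentages quoted in this record are simple base-composition statistics. As a minimal sketch of how such a figure is typically computed (the function name and the toy sequence are hypothetical and are not taken from the study, which does not describe its tooling):

```python
def gc_content(seq: str) -> float:
    """Return the GC fraction (0..1) of a DNA sequence.

    Only A, C, G and T are counted; ambiguity codes such as N are ignored,
    which is an assumption of this sketch rather than a stated method.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return gc / total


if __name__ == "__main__":
    toy = "ATGGCCATTGTAATGGGCCGC"  # hypothetical sequence, for illustration only
    print(f"GC = {gc_content(toy):.1%}, AT = {1 - gc_content(toy):.1%}")
```

    With this definition the GC and AT fractions always sum to 100%, which matches the 47% / 53% split reported in the abstract.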

  1. Phylogenetic analysis of Mexican Babesia bovis isolates using msa and ssrRNA gene sequences.

    Genis, Alma D; Mosqueda, Juan J; Borgonio, Verónica M; Falcón, Alfonso; Alvarez, Antonio; Camacho, Minerva; de Lourdes Muñoz, Maria; Figueroa, Julio V


    Variable merozoite surface antigens of Babesia bovis are exposed glycoproteins having a role in erythrocyte invasion. Members of this gene family include msa-1 and msa-2 (msa-2c, msa-2a(1), msa-2a(2), and msa-2b). Small subunit ribosomal (ssr)RNA gene is subject to evolutive pressure and has been used in phylogenetic studies. To determine the phylogenetic relationship among B. bovis Mexican isolates using different genetic markers, PCR amplicons, corresponding to msa-1, msa-2c, msa-2b, and ssrRNA genes, were cloned and plasmids carrying the corresponding inserts were sequenced. Comparative analysis of nucleotide and deduced amino acid sequences revealed distinct degrees of variability and identity among the coding gene sequences obtained from 12 geographically different B. bovis isolates and a reference strain. Overall sequence identities of 47.7%, 72.3%, 87.7%, and 94% were determined for msa-1, msa-2b, msa-2c, and ssrRNA, respectively. A robust phylogenetic tree was obtained with msa-2b sequences. The phylogenetic analysis suggests that Mexican B. bovis isolates group in clades not concordant with the Mexican geography. However, the Mexican isolates group together in an American clade separated from the Australian clade. Sequence heterogeneity in msa-1, msa-2b, and msa-2c coding regions of Mexican B. bovis isolates present in different geographical regions can be a result of either differential evolutive pressure or cattle movement from commercial trade.
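
    The overall sequence identities reported in this record are percentages computed over aligned sequences. The abstract does not say how gaps were treated, so the following is only a sketch under one common convention (alignment columns containing a gap are skipped); the function name and the example strings are hypothetical.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap are excluded from the comparison.
    Other tools divide by the full alignment length instead, so values can differ.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    print(percent_identity("ACGTAC-T", "ACGAACGT"))
```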

  2. Cloning and sequence analysis of a gene encoding polygalacturonase-inhibiting protein from cotton


    Polygalacturonase-inhibiting proteins (PGIP) play important roles in plant defense against pathogens, especially fungi. A pair of degenerate primers was designed based on the conserved sequence of 20 other known pgip genes and used to amplify a Gossypium barbadense cultivar 7124 cDNA library by touch-down PCR. A 561 bp internal fragment of the pgip gene was obtained and used to design the primers for rapid amplification of cDNA ends. A composite pgip gene sequence was constructed from the products of 5′ and 3′ RACE, which are 666 bp and 906 bp respectively. Analysis of the nucleic acid sequence shows 69.2% and 68.7% similarity to Citrus and Poncirus pgip genes, respectively. The open reading frame of the gene encodes a polypeptide of 330 amino acids, in which 10 leucine-rich repeats are arranged in tandem. A new set of primers was designed to the 5′ and 3′ ends of the gene, which allows amplification of the full-length gene from the cotton cDNA library. Genomic DNA analysis reveals that this gene has no intron.

  3. Nucleotide sequence and transcription of a trypomastigote surface antigen gene of Trypanosoma cruzi.

    Fouts, D L; Ruef, B J; Ridley, P T; Wrightsman, R A; Peterson, D S; Manning, J E


    In previous studies we identified a 500-bp segment of the gene, TSA-1, which encodes an 85-kDa trypomastigote-specific surface antigen of the Peru strain of Trypanosoma cruzi. TSA-1 was shown to be located at a telomeric site and to contain a 27-bp tandem repeat unit within the coding region. This repeat unit defines a discrete subset of a multigene family and places the TSA-1 gene within this subset. In this study, we present the complete nucleotide sequence of the TSA-1 gene from the Peru strain. By homology matrix analysis, fragments of two other trypomastigote specific surface antigen genes, pTt34 and SA85-1.1, are shown to have extensive sequence homology with TSA-1 indicating that these genes are members of the same gene family as TSA-1. The TSA-1 subfamily was also found to be active in two other strains of T. cruzi, one of which contains multiple telomeric members and one of which contains a single non-telomeric member, suggesting that transcription is not necessarily dependent on the gene being located at a telomeric site. Also, while some of the sequences found in this gene family are present in 2 size classes of poly(A)+ RNA, others appear to be restricted to only 1 of the 2 RNA classes.

  4. Sequence analysis of the phage 21 genes for prohead assembly and head completion.

    Smith, M P; Feiss, M


    Phage 21 is a temperate lambdoid coliphage, and its head-encoding genes, as well as those of phage lambda, are descended from a common ancestral phage. The head protein-encoding genes of phage 21 have been sequenced, confirming earlier genetic studies indicating that the head-encoding genes of 21 and lambda are analogous in location, size, and function. The phage 21 head-encoding genes identified (and their lambda analogues) include: 3(W), 4(B), 5(C), 6(Nu3), shp (D), 7(E), and 8(FII), respectively. An open reading frame, orf1, is analogous in position and shares some sequence identity with FI, a phage lambda gene involved in DNA packaging. The phage 21 major head protein, gp7, is predicted to have strong sequence identity (65%) with the lambda major capsid protein, gpE, including amino acids known to be important for capsid form determination. The nested genes 5/6 of phage 21 and C/Nu3 of lambda differ by several rearrangements including deletions and a triplication. The possibility that lambda genes C/Nu3 evolved from ancestral nested genes containing a triplication is discussed.

  5. Characterization of the Helicoverpa assulta nucleopolyhedrovirus genome and sequence analysis of the polyhedrin gene region

    Soo-Dong Woo; Jae Young Choi; Yeon Ho Je; Byung Rae Jin


    A local strain of Helicoverpa assulta nucleopolyhedrovirus (HasNPV) was isolated from infected H. assulta larvae in Korea. Restriction endonuclease fragment analysis, using 4 restriction enzymes, estimated that the total genome size of HasNPV is about 138 kb. A degenerate polymerase chain reaction (PCR) primer set for the polyhedrin gene successfully amplified the partial polyhedrin gene of HasNPV. The sequencing results showed that the PCR product of about 430 bp was a fragment of the corresponding polyhedrin gene. Using the partial predicted HasNPV polyhedrin sequence to probe the Southern blots, we identified the location of the polyhedrin gene within the 6 kb EcoRI, 15 kb NcoI, 20 kb XhoI, 17 kb BglII and 3 kb ClaI fragments, respectively. The 3 kb ClaI fragment was cloned and the nucleotide sequences of the polyhedrin coding region and its flanking regions were determined. Nucleotide sequence analysis indicated the presence of an open reading frame of 735 nucleotides which could encode 245 amino acids with a predicted molecular mass of 29 kDa. The nucleotide sequences within the coding region of HasNPV polyhedrin shared 73.7% identity with the polyhedrin gene from Autographa californica NPV but were most closely related to Helicoverpa and Heliothis species NPVs with over 99% sequence identity.

  6. Sequencing and identification of expressed Schistosoma mansoni genes by random selection of cDNA clones from a directional library

    Glória R. Franco


    Full Text Available We have initiated a gene discovery program in Schistosoma mansoni based on the technique of Expressed Sequence Tags (ESTs), i.e. partial sequences of cDNAs obtained from single passes in automatic DNA sequencers. ESTs can be used to identify genes on the basis of their homology with sequences from other species deposited in DNA or protein databases. Transcripts with sequences without matches in the databases may represent novel parasite-specific genes. This approach has been shown to be very efficient, and in less than two years a broad range of novel genes has already been ascertained, more than doubling the number of known S. mansoni genes.

  7. Distant horizontal gene transfer is rare for multiple families of prokaryotic insertion sequences.

    Wagner, Andreas; de la Chaux, Nicole


    Horizontal gene transfer in prokaryotes is rampant on short and intermediate evolutionary time scales. It poses a fundamental problem to our ability to reconstruct the evolutionary tree of life. Is it also frequent over long evolutionary distances? To address this question, we analyzed the evolution of 2,091 insertion sequences from all 20 major families in 438 completely sequenced prokaryotic genomes. Specifically, we mapped insertion sequence occurrence on a 16S rDNA tree of the genomes we analyzed, and we also constructed phylogenetic trees of the insertion sequence transposase coding sequences. We found only 30 cases of likely horizontal transfer among distantly related prokaryotic clades. Most of these horizontal transfer events are ancient. Only seven events are recent. Almost all of these transfer events occur between pairs of human pathogens or commensals. If true also for other, non-mobile DNA, the rarity of distant horizontal transfer increases the odds of reliable phylogenetic inference from sequence data.

  8. TGM6 identified as a novel causative gene of spinocerebellar ataxias using exome sequencing.

    Wang, Jun Ling; Yang, Xu; Xia, Kun; Hu, Zheng Mao; Weng, Ling; Jin, Xin; Jiang, Hong; Zhang, Peng; Shen, Lu; Guo, Ji Feng; Li, Nan; Li, Ying Rui; Lei, Li Fang; Zhou, Jie; Du, Juan; Zhou, Ya Fang; Pan, Qian; Wang, Jian; Wang, Jun; Li, Rui Qiang; Tang, Bei Sha


    Autosomal-dominant spinocerebellar ataxias constitute a large, heterogeneous group of progressive neurodegenerative diseases with multiple types. To date, classical genetic studies have revealed 31 distinct genetic forms of spinocerebellar ataxias and identified 19 causative genes. Traditional positional cloning strategies, however, have limitations for finding causative genes of rare Mendelian disorders. Here, we used a combined strategy of exome sequencing and linkage analysis to identify a novel spinocerebellar ataxia causative gene, TGM6. We sequenced the whole exome of four patients in a Chinese four-generation spinocerebellar ataxia family and identified a missense mutation, c.1550T-G transition (L517W), in exon 10 of TGM6. This change is at a highly conserved position, is predicted to have a functional impact, and completely cosegregated with the phenotype. The exome results were validated using linkage analysis. The mutation we identified using exome sequencing was located in the same region (20p13-12.2) as that identified by linkage analysis, which cross-validated TGM6 as the causative spinocerebellar ataxia gene in this family. We also showed that the causative gene could be mapped by a combined method of linkage analysis and sequencing of one sample from the family. We further confirmed our finding by identifying another missense mutation c.980A-G transition (D327G) in exon seven of TGM6 in an additional spinocerebellar ataxia family, which also cosegregated with the phenotype. Both mutations were absent in 500 normal unaffected individuals of matched geographical ancestry. The finding of TGM6 as a novel causative gene of spinocerebellar ataxia illustrates whole-exome sequencing of affected individuals from one family as an effective and cost efficient method for mapping genes of rare Mendelian disorders and the use of linkage analysis and exome sequencing for further improving efficiency.
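
    The pairing of each coding-sequence position with its amino acid number in this record (c.1550 with residue 517, c.980 with residue 327) follows from simple codon arithmetic. A minimal sketch, assuming the usual convention that coding positions are numbered from 1 at the A of the ATG start codon; the helper name is hypothetical.

```python
from typing import Tuple


def codon_of(cds_pos: int) -> Tuple[int, int]:
    """Map a 1-based coding-sequence position to (codon number, position in codon).

    Both returned values are 1-based; position in codon is 1, 2 or 3.
    """
    if cds_pos < 1:
        raise ValueError("coding positions are numbered from 1")
    codon_number = (cds_pos - 1) // 3 + 1
    position_in_codon = (cds_pos - 1) % 3 + 1
    return codon_number, position_in_codon


if __name__ == "__main__":
    for pos in (1550, 980):
        codon, offset = codon_of(pos)
        print(f"c.{pos} -> codon {codon}, base {offset} of the codon")
```

    Under this numbering, both reported substitutions fall on the second base of their codons, consistent with the residue numbers given in the abstract.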

  9. BAC-based sequencing of behaviorally-relevant genes in the prairie vole.

    Lisa A McGraw

    Full Text Available The prairie vole (Microtus ochrogaster) is an important model organism for the study of social behavior, yet our ability to correlate genes and behavior in this species has been limited due to a lack of genetic and genomic resources. Here we report the BAC-based targeted sequencing of behaviorally-relevant genes and flanking regions in the prairie vole. A total of 6.4 Mb of non-redundant or haplotype-specific sequence assemblies were generated that span the partial or complete sequence of 21 behaviorally-relevant genes as well as an additional 55 flanking genes. Estimates of nucleotide diversity from 13 loci based on alignments of 1.7 Mb of haplotype-specific assemblies revealed an average pair-wise heterozygosity (8.4×10^-3). Comparative analyses of the prairie vole proteins encoded by the behaviorally-relevant genes identified >100 substitutions specific to the prairie vole lineage. Finally, our sequencing data indicate that a duplication of the prairie vole AVPR1A locus likely originated from a recent segmental duplication spanning a minimum of 105 kb. In summary, the results of our study provide the genomic resources necessary for the molecular and genetic characterization of a high-priority set of candidate genes for regulating social behavior in the prairie vole.

  10. Candida famata (Debaryomyces hansenii) DNA sequences containing genes involved in riboflavin synthesis.

    Voronovsky, Andriy Y; Abbas, Charles A; Dmytruk, Kostyantyn V; Ishchuk, Olena P; Kshanovska, Barbara V; Sybirna, Kateryna A; Gaillardin, Claude; Sibirny, Andriy A


    Previously cloned Candida famata (Debaryomyces hansenii) strain VKM Y-9 genomic DNA fragments containing genes RIB1 (codes for GTP cyclohydrolase II), RIB2 (encodes specific reductase), RIB5 (codes for dimethylribityllumazine synthase), RIB6 (encodes dihydroxybutanone phosphate synthase) and RIB7 (codes for riboflavin synthase) were sequenced. The derived amino acid sequences of C. famata RIB genes showed extensive homology to the corresponding sequences of riboflavin synthesis enzymes of other yeast species. The highest identity was observed to homologues of D. hansenii CBS767, as C. famata is the anamorph of this hemiascomycetous yeast. The D. hansenii CBS767 RIB3 gene encoding specific deaminase was cloned. This gene successfully complemented riboflavin auxotrophy of the rib3 mutant of flavinogenic yeast, Pichia guilliermondii. Putative iron-responsive elements (potential sites for binding of the transcription factors Fep1p or Aft1p and Aft2p) were found in the upstream regions of some C. famata and D. hansenii RIB genes. The sequences of C. famata RIB genes have been submitted to the EMBL data library under Accession Nos AJ810169-AJ810173.

  11. Citrus plastid-related gene profiling based on expressed sequence tag analyses

    Tercilio Calsa Jr.


    Full Text Available Plastid-related sequences, derived from putative nuclear or plastome genes, were searched in a large collection of expressed sequence tags (ESTs) and genomic sequences from the Citrus Biotechnology initiative in Brazil. The identified putative Citrus chloroplast gene sequences were compared to those from Arabidopsis, Eucalyptus and Pinus. Differential expression profiling for plastid-directed nuclear-encoded proteins and photosynthesis-related gene expression variation between Citrus sinensis and Citrus reticulata, when inoculated or not with Xylella fastidiosa, were also analyzed. Presumed Citrus plastome regions were more similar to Eucalyptus. Some putative genes appeared to be preferentially expressed in vegetative tissues (leaves and bark) or in reproductive organs (flowers and fruits). Genes preferentially expressed in fruit and flower may be associated with hypothetical physiological functions. Expression pattern clustering analysis suggested that photosynthesis- and carbon fixation-related genes appeared to be up- or down-regulated in a resistant or susceptible Citrus species after Xylella inoculation in comparison to non-infected controls, generating novel information which may be helpful to develop novel genetic manipulation strategies to control Citrus variegated chlorosis (CVC).

  12. The complete mitochondrial genome sequence and gene organization of Tridentiger trigonocephalus (Gobiidae: Gobionellinae) with phylogenetic consideration.

    Wei, Hongqing; Ma, Hongyu; Ma, Chunyan; Zhang, Fengying; Wang, Wei; Chen, Wei; Ma, Lingbo


    The complete mitochondrial genome plays an important role in studies of genome-level characteristics and phylogenetic relationships. Here we determined the complete mitogenome sequence of Tridentiger trigonocephalus (Perciformes, Gobiidae), and discovered its phylogenetic relationship. This circular genome was 16 662 bp in length, and consisted of 37 typical genes, including 13 protein-coding genes, 22 tRNA genes, and two rRNA genes. The gene order of T. trigonocephalus mitochondrial genome was identical to those observed in most other vertebrates. Of 37 genes, 28 were encoded by heavy strand, while the others were encoded by light strand. The phylogenetic tree constructed by 13 concatenated protein-coding genes showed that T. trigonocephalus was closest to T. bifasciatus, and then to T. barbatus among the 20 species within suborder Gobioidei. This work should facilitate the studies on population genetic diversity, and molecular evolution in Gobioidei fishes.

  13. Rapid evolution of the sequences and gene repertoires of secreted proteins in bacteria.

    Teresa Nogueira

    Full Text Available Proteins secreted to the extracellular environment or to the periphery of the cell envelope, the secretome, play essential roles in foraging, antagonistic and mutualistic interactions. We hypothesize that arms races, genetic conflicts and varying selective pressures should lead to the rapid change of sequences and gene repertoires of the secretome. The analysis of 42 bacterial pan-genomes shows that secreted, and especially extracellular proteins, are predominantly encoded in the accessory genome, i.e. among genes not ubiquitous within the clade. Genes encoding outer membrane proteins might engage more frequently in intra-chromosomal gene conversion because they are more often in multi-genic families. The gene sequences encoding the secretome evolve faster than the rest of the genome and in particular at non-synonymous positions. Cell wall proteins in Firmicutes evolve particularly fast when compared with outer membrane proteins of Proteobacteria. Virulence factors are over-represented in the secretome, notably in outer membrane proteins, but cell localization explains more of the variance in substitution rates and gene repertoires than sequence homology to known virulence factors. Accordingly, the repertoires and sequences of the genes encoding the secretome change fast in the clades of obligatory and facultative pathogens and also in the clades of mutualists and free-living bacteria. Our study shows that cell localization shapes genome evolution. In agreement with our hypothesis, the repertoires and the sequences of genes encoding secreted proteins evolve fast. The particularly rapid change of extracellular proteins suggests that these public goods are key players in bacterial adaptation.

  14. Versatile Cosmid Vectors for the Isolation, Expression, and Rescue of Gene Sequences: Studies with the Human α-globin Gene Cluster

    Lau, Yun-Fai; Kan, Yuet Wai


    We have developed a series of cosmids that can be used as vectors for genomic recombinant DNA library preparations, as expression vectors in mammalian cells for both transient and stable transformations, and as shuttle vectors between bacteria and mammalian cells. These cosmids were constructed by inserting one of the SV2-derived selectable gene markers (SV2-gpt, SV2-DHFR, and SV2-neo) in cosmid pJB8. High efficiency of genomic cloning was obtained with these cosmids and the size of the inserts was 30-42 kilobases. We isolated recombinant cosmids containing the human α-globin gene cluster from these genomic libraries. The simian virus 40 DNA in these selectable gene markers provides the origin of replication and enhancer sequences necessary for replication in permissive cells such as COS 7 cells and thereby allows transient expression of α-globin genes in these cells. These cosmids and their recombinants could also be stably transformed into mammalian cells by using the respective selection systems. Both of the adult α-globin genes were more actively expressed than the embryonic zeta-globin genes in these transformed cell lines. Because of the presence of the cohesive ends of the Charon 4A phage in the cosmids, the transforming DNA sequences could readily be rescued from these stably transformed cells into bacteria by in vitro packaging of total cellular DNA. Thus, these cosmid vectors are potentially useful for direct isolation of structural genes.

  15. Complete nucleotide sequences of two adjacent early vaccinia virus genes located within the inverted terminal repetition.

    Venkatesan, S; Gershowitz, A; Moss, B


    The proximal part of the 10,000-base pair (bp) inverted terminal repetition of vaccinia virus DNA encodes at least three early mRNAs. A 2,236-bp segment of the repetition was sequenced to characterize two of the genes. This task was facilitated by constructing a series of recombinants containing overlapping deletions; oligonucleotide linkers with synthetic restriction sites provided points for radioactive labeling before sequencing by the chemical degradation method of Maxam and Gilbert (Methods Enzymol. 65:499-560, 1980). The ends of the transcripts were mapped by hybridizing labeled DNA fragments to early viral RNA and resolving nuclease S1-protected fragments in sequencing gels, by sequencing cDNA clones, and from the lengths of the RNAs. The nucleotide sequences for at least 60 bp upstream of both transcriptional initiation sites are more than 80% adenine·thymine rich and contain long runs of adenines and thymines with some homology to procaryotic and eucaryotic consensus sequences. The gene transcribed in the rightward direction encodes an RNA of approximately 530 nucleotides with a single open reading frame of 420 nucleotides. Preceding the first AUG, there is a heptanucleotide that can hybridize to the 3' end of 18S rRNA with only one mismatch. The derived amino acid sequence of the protein indicated a molecular weight of 15,500. The gene transcribed in the leftward direction encodes an RNA 1,000 to 1,100 nucleotides long with an open reading frame of 996 nucleotides and a leader sequence of only 5 to 6 nucleotides. The derived amino acid sequence of this protein indicated a molecular weight of 38,500. The 3' ends of the two transcripts were located within 100 bp of each other. Although there are adenine·thymine-rich clusters near the putative transcriptional termination sites, specific AATAAA polyadenylic acid signal sequences are absent.

  16. Isolation of laccase gene-specific sequences from white rot and brown rot fungi by PCR

    D'Souza, T.M.; Boominathan, K.; Reddy, C.A. [Michigan State Univ., East Lansing, MI (United States)]


    Degenerate primers corresponding to the consensus sequences of the copper-binding regions in the N-terminal domains of known basidiomycete laccases were used to isolate laccase gene-specific sequences from strains representing nine genera of wood rot fungi. All except three gave the expected PCR product of about 200 bp. Computer searches of the databases identified the sequence of each of the PCR products analyzed as a laccase gene sequence, suggesting the specificity of the primers. PCR products of the white rot fungi Ganoderma lucidum, Phlebia brevispora, and Trametes versicolor showed 65 to 74% nucleotide sequence similarity to each other; the similarity in deduced amino acid sequences was 83 to 91%. The PCR products of Lentinula edodes and Lentinus tigrinus, on the other hand, showed relatively low nucleotide and amino acid similarities (58 to 64 and 62 to 81%, respectively); however, these similarities were still much higher than when compared with the corresponding regions in the laccases of the ascomycete fungi Aspergillus nidulans and Neurospora crassa. A few of the white rot fungi, as well as Gloeophyllum trabeum, a brown rot fungus, gave a 144-bp PCR fragment which had a nucleotide sequence similarity of 60 to 71%. Demonstration of laccase activity in G. trabeum and several other brown rot fungi was of particular interest because these organisms were not previously shown to produce laccases. 36 refs., 6 figs., 2 tabs.

  17. Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis.

    Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi


    The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, and the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled.
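
    The N50 statistic mentioned in this abstract can be illustrated with a short calculation. The sketch below uses one common definition (the contig length at which the running total of contigs, sorted from longest to shortest, first reaches half of the assembly length); the contig lengths are made-up toy values, not data from the marmoset assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; the N50 is the length of the
    contig at which the cumulative length first reaches half of the total.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


if __name__ == "__main__":
    toy_contigs = [80, 70, 50, 40, 30, 20, 10]  # hypothetical lengths, total 300
    print(n50(toy_contigs))  # 80 + 70 = 150 reaches half of 300, so this prints 70
```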

  18. Cloning and sequencing of the beta-glucosidase gene from Acetobacter xylinum ATCC 23769.

    Tajima, K; Nakajima, K; Yamashita, H; Shiba, T; Munekata, M; Takai, M


    The beta-glucosidase gene (bglxA) was cloned from the genomic DNA of Acetobacter xylinum ATCC 23769 and its nucleotide sequence (2200 bp) was determined. This bglxA gene was present downstream of the cellulose synthase operon and coded for a polypeptide of molecular mass 79 kDa. The overexpression of the beta-glucosidase in A. xylinum caused a tenfold increase in activity compared to the wild-type strain. In addition, the action pattern of the enzyme was identified as G3ase activity. The deduced amino acid sequence of the bglxA gene showed 72.3%, 49.6%, and 45.1% identity with the beta-glucosidases from A. xylinum subsp. sucrofermentans, Cellvibrio gilvus, and Mycobacterium tuberculosis, respectively. Based on amino acid sequence similarities, the beta-glucosidase (BglxA) was assigned to family 3 of the glycosyl hydrolases.

  19. Preliminary study on mitochondrial 16S rRNA gene sequences and phylogeny of flatfishes (Pleuronectiformes)


    A 605 bp section of the mitochondrial 16S rRNA gene from Paralichthys olivaceus, Pseudorhombus cinnamomeus, Psetta maxima and Kareius bicoloratus, which represent 3 families of the order Pleuronectiformes, was amplified by PCR and sequenced to study the molecular systematics of Pleuronectiformes, and was compared with related gene sequences of 6 other flatfish downloaded from GenBank. Phylogenetic analysis based on genetic distances among the related gene sequences of the 10 flatfish showed that this method was well suited to exploring the relationships between species, genera and families. Phylogenetic trees constructed with the neighbor-joining, maximum parsimony and maximum likelihood methods agreed with the general pattern of Pleuronectiformes evolution, but they also produced some conflicts. In contrast to data from morphological characters, P. olivaceus clustered with K. bicoloratus, whereas P. cinnamomeus did not cluster with P. olivaceus, which merits further study.

  20. Putative and unique gene sequence utilization for the design of species specific probes as modeled by Lactobacillus plantarum

    The concept of utilizing putative and unique gene sequences for the design of species specific probes was tested. The abundance profile of assigned functions within the Lactobacillus plantarum genome was used for the identification of the putative and unique gene sequence, csh. The targeted gene (cs...

  1. Natural variation in CBF gene sequence, gene expression and freezing tolerance in the Versailles core collection of Arabidopsis thaliana

    Brunel Dominique


    Full Text Available Abstract Background Plants from temperate regions are able to withstand freezing temperatures due to a process known as cold acclimation, which is a prior exposure to low, but non-freezing temperatures. During acclimation, a large number of genes are induced, bringing about biochemical changes in the plant, thought to be responsible for the subsequent increase in freezing tolerance. Key regulatory proteins in this process are the CBF1, 2 and 3 transcription factors which control the expression of a set of target genes referred to as the "CBF regulon". Results To assess the role of the CBF genes in cold acclimation and freezing tolerance of Arabidopsis thaliana, the CBF genes and their promoters were sequenced in the Versailles core collection, a set of 48 accessions that maximizes the naturally-occurring genetic diversity, as well as in the commonly used accessions Col-0 and WS. Extensive polymorphism was found in all three genes. Freezing tolerance was measured in all accessions to assess the variability in acclimated freezing tolerance. The effect of sequence polymorphism was investigated by evaluating the kinetics of CBF gene expression, as well as that of a subset of the target COR genes, in a set of eight accessions with contrasting freezing tolerance. Our data indicate that CBF genes as well as the selected COR genes are cold induced in all accessions, irrespective of their freezing tolerance. Although we observed different levels of expression in different accessions, CBF or COR gene expression was not closely correlated with freezing tolerance. Conclusion Our results indicate that the Versailles core collection contains significant natural variation with respect to freezing tolerance, polymorphism in the CBF genes and CBF and COR gene expression. 
    Although there tends to be more CBF and COR gene expression in tolerant accessions, there are exceptions, reinforcing the idea that a complex network of genes is involved in freezing tolerance.

  2. Sequencing and comparative analysis of fugu protocadherin clusters reveal diversity of protocadherin genes among teleosts

    Rajasegaran Vikneswari


    Full Text Available Abstract Background The synaptic cell adhesion molecules, protocadherins, are a vertebrate innovation that accompanied the emergence of the neural tube and the elaborate central nervous system. In mammals, the protocadherins are encoded by three closely-linked clusters (α, β and γ of tandem genes and are hypothesized to provide a molecular code for specifying the remarkably-diverse neural connections in the central nervous system. Like mammals, the coelacanth, a lobe-finned fish, contains a single protocadherin locus, also arranged into α, β and γ clusters. Zebrafish, however, possesses two protocadherin loci that contain more than twice the number of genes as the coelacanth, but arranged only into α and γ clusters. To gain further insight into the evolutionary history of protocadherin clusters, we have sequenced and analyzed protocadherin clusters from the compact genome of the pufferfish, Fugu rubripes. Results Fugu contains two unlinked protocadherin loci, Pcdh1 and Pcdh2, that collectively consist of at least 77 genes. The fugu Pcdh1 locus has been subject to extensive degeneration, resulting in the complete loss of Pcdh1γ cluster. The fugu Pcdh genes have undergone lineage-specific regional gene conversion processes that have resulted in a remarkable regional sequence homogenization among paralogs in the same subcluster. Phylogenetic analyses show that most protocadherin genes are orthologous between fugu and zebrafish either individually or as paralog groups. Based on the inferred phylogenetic relationships of fugu and zebrafish genes, we have reconstructed the evolutionary history of protocadherin clusters in the teleost fish lineage. Conclusion Our results demonstrate the exceptional evolutionary dynamism of protocadherin genes in vertebrates in general, and in teleost fishes in particular. Besides the 'fish-specific' whole genome duplication, the evolution of protocadherin genes in teleost fishes is influenced by lineage

  3. Comparison of the aflR gene sequences of strains in Aspergillus section Flavi.

    Lee, Chao-Zong; Liou, Guey-Yuh; Yuan, Gwo-Fang


    Aflatoxins are polyketide-derived secondary metabolites produced by Aspergillus parasiticus, Aspergillus flavus, Aspergillus nomius and a few other species. The toxic effects of aflatoxins have adverse consequences for human health and agricultural economics. The aflR gene, a regulatory gene for aflatoxin biosynthesis, encodes a protein containing a zinc-finger DNA-binding motif. Although Aspergillus oryzae and Aspergillus sojae, which are used in fermented foods and in ingredient manufacture, have no record of producing aflatoxin, they have been shown to possess an aflR gene. This study examined 34 strains of Aspergillus section Flavi. The aflR gene of 23 of these strains was successfully amplified and sequenced. No aflR PCR products were found in five A. sojae strains or six strains of A. oryzae. These PCR results suggested that the aflR gene is absent or significantly different in some A. sojae and A. oryzae strains. The sequenced aflR genes from the 23 positive strains had greater than 96.6 % similarity, which was particularly conserved in the zinc-finger DNA-binding domain. The aflR gene of A. sojae has two obvious characteristics: an extra CTCATG sequence fragment and a C to T transition that causes premature termination of AFLR protein synthesis. Differences between A. parasiticus/A. sojae and A. flavus/A. oryzae aflR genes were also identified. Some strains of A. flavus as well as A. flavus var. viridis, A. oryzae var. viridis and A. oryzae var. effuses have an A. oryzae-type aflR gene. For all strains with the A. oryzae-type aflR gene, there was no evidence of aflatoxin production. It is suggested that for safety reasons, the aflR gene could be examined to assess possible aflatoxin production by Aspergillus section Flavi strains.

  4. p21WAF1/CIP1 gene DNA sequencing and its expression in human osteosarcoma

    廖威明; 张春林; 李佛保; 曾炳芳; 曾益新


    Background Mutation and expression change of p21WAF1/CIP1 may play a role in the growth of osteosarcoma. This study was to investigate the expression of the p21WAF1/CIP1 gene in human osteosarcoma, p21WAF1/CIP1 gene DNA sequence change and their relationships with the phenotype and clinical prognosis. Methods p21WAF1/CIP1 gene in 10 normal people and the tumours of 45 osteosarcoma patients were examined using polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) with silver staining. The PCR product with an abnormal strand was sequenced directly. The p21WAF1/CIP1 gene mRNA and P21 protein of 45 cases of osteosarcoma were investigated by using in situ hybridization and immunohistochemistry, respectively. Results The occurrence of P21 protein in osteosarcoma was 17.78% (8/45), and p21WAF1/CIP1 mRNA expression in osteosarcoma was 42.22% (19/45). The p21WAF1/CIP1 gene DNA sequencing of amplified products showed that in p21WAF1/CIP1 gene exon 3 of 36 cases of human osteosarcoma, there were 17 cases (47.22%) with C→T at position 609; 10 normal blood samples' DNA sequence analysis yielded 8 cases (80.00%) with C→T at the same position. Conclusions Along with the increase of malignancy, the expression of p21WAF1/CIP1 mRNA and P21 protein in osteosarcoma tends to decrease. It is uncommon for the p21WAF1/CIP1 gene mutation to occur in human osteosarcoma. As a result, the possible existence of tumour subtypes of p21WAF1/CIP1 gene mutation should be investigated. Our research leads to the location of p21WAF1/CIP1 gene polymorphism of Chinese osteosarcoma patients, which can provide a basis for further research.

  5. Molecular Identification and Sequencing of Mannose Binding Protein (MBP) Gene of Acanthamoeba palestinensis

    M Rezaeian


    Full Text Available Background: Acanthamoeba keratitis is caused by pathogenic Acanthamoeba such as A. palestinensis. Indeed, this species is one of the known causative agents of amoebic keratitis in Iran. Mannose Binding Protein (MBP) is the main pathogenicity factor for developing this sight-threatening disease. We aimed to characterize the MBP gene in pathogenic Acanthamoeba isolates such as A. palestinensis. Methods: This experimental research was performed in the School of Public Health, Tehran University of Medical Sciences, Tehran, Iran during 2007-2008. A. palestinensis was grown on 2% non-nutrient agar overlaid with Escherichia coli. DNA extraction was performed using the phenol-chloroform method. PCR reaction and amplification were done using specific primer pairs of MBP. The amplified fragment was purified and sequenced. Finally, the obtained fragment was deposited in the gene data bank. Results: A 900 bp PCR product was recovered after the PCR reaction. Sequence analysis of the purified PCR product revealed a gene with 943 nucleotides. Homology analysis of the obtained sequence showed 81% similarity with the available MBP gene in the gene data bank. The fragment was deposited in the gene data bank under accession number EU678895. Conclusion: MBP is known as the most important factor in the Acanthamoeba pathogenesis cascade. Therefore, characterization of this gene can aid in developing better therapeutic agents and even immunization of high-risk people.

  6. Cloning and DNA sequence analysis of a Lactococcus bacteriophage lysin gene.

    Shearman, C; Underwood, H; Jury, K; Gasson, M


    A gene for the lysin of Lactococcus lactis bacteriophage phi vML3 was cloned using an Escherichia coli/bacteriophage lambda host-vector system. The gene was detected by its expression of antimicrobial activity against L. lactis cells in a bioassay. The cloned fragment was analysed by sub-cloning on to E. coli plasmid vectors and by restriction endonuclease and deletion mapping. Its entire DNA sequence was determined and an open reading frame for the lysin structural gene was identified. The sequenced lysin gene would express a protein of 187 amino acids with a molecular weight of 21,090, which is in good agreement with that of a protein detected after in vitro transcription and translation of DNA encoding the gene. Expression of the lysin gene in E. coli and B. subtilis from an adjacent bacteriophage promoter was readily detected but in L. lactis expression of lysin was found to be lethal. The bacteriophage phi vML3 lysin had sequence homology with protein 15 of B. subtilis bacteriophage PZA. This protein is involved in DNA packaging during bacteriophage maturation rather than in host cell lysis. The cloning and analysis of the phi vML3 lysin gene is of importance in further understanding lactic streptococcal bacteriophages, for the development of positive selection vectors and for biotechnological applications of relevance to the dairy industry.

  7. Transcriptome sequencing uncovers the Avr5 avirulence gene of the tomato leaf mold pathogen Cladosporium fulvum.

    Mesarich, Carl H; Griffiths, Scott A; van der Burgt, Ate; Okmen, Bilal; Beenen, Henriek G; Etalo, Desalegn W; Joosten, Matthieu H A J; de Wit, Pierre J G M


    The Cf-5 gene of tomato confers resistance to strains of the fungal pathogen Cladosporium fulvum carrying the avirulence gene Avr5. Although Cf-5 has been cloned, Avr5 has remained elusive. We report the cloning of Avr5 using a combined bioinformatic and transcriptome sequencing approach. RNA-Seq was performed on the sequenced race 0 strain (0WU; carrying Avr5), as well as a race 5 strain (IPO 1979; lacking a functional Avr5 gene) during infection of susceptible tomato. Forty-four in planta-induced C. fulvum candidate effector (CfCE) genes of 0WU were identified that putatively encode a secreted, small cysteine-rich protein. An expressed transcript sequence comparison between strains revealed two polymorphic CfCE genes in IPO 1979. One of these conferred avirulence to IPO 1979 on Cf-5 tomato following complementation with the corresponding 0WU allele, confirming identification of Avr5. Complementation also led to increased fungal biomass during infection of susceptible tomato, signifying a role for Avr5 in virulence. Seven of eight race 5 strains investigated escape Cf-5-mediated resistance through deletion of the Avr5 gene. Avr5 is heavily flanked by repetitive elements, suggesting that repeat instability, in combination with Cf-5-mediated selection pressure, has led to the emergence of race 5 strains deleted for the Avr5 gene.

  8. Discrimination of germline V genes at different sequencing lengths and mutational burdens: A new tool for identifying and evaluating the reliability of V gene assignment.

    Zhang, Bochao; Meng, Wenzhao; Prak, Eline T Luning; Hershberg, Uri


    Immune repertoires are collections of lymphocytes that express diverse antigen receptor gene rearrangements consisting of Variable (V), (Diversity (D) in the case of heavy chains) and Joining (J) gene segments. Clonally related cells typically share the same germline gene segments and have highly similar junctional sequences within their third complementarity determining regions. Identifying clonal relatedness of sequences is a key step in the analysis of immune repertoires. The V gene is the most important for clone identification because it has the longest sequence and the greatest number of sequence variants. However, accurate identification of a clone's germline V gene source is challenging because there is a high degree of similarity between different germline V genes. This difficulty is compounded in antibodies, which can undergo somatic hypermutation. Furthermore, high-throughput sequencing experiments often generate partial sequences and have significant error rates. To address these issues, we describe a novel method to estimate which germline V genes (or alleles) cannot be discriminated under different conditions (read lengths, sequencing errors or somatic hypermutation frequencies). Starting with any set of germline V genes, this method measures their similarity using different sequencing lengths and calculates their likelihood of unambiguous assignment under different levels of mutation. Hence, one can identify, under different experimental and biological conditions, the germline V genes (or alleles) that cannot be uniquely identified and bundle them together into groups of specific V genes with highly similar sequences.
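
    The idea of bundling germline V genes that cannot be told apart at a given read length can be sketched in a few lines. This is a simplified illustration only, not the authors' method: it ignores sequencing error and somatic hypermutation, assumes reads cover the last `read_length` nucleotides of each gene, and uses invented toy sequences and a hypothetical function name.

```python
from collections import defaultdict


def indistinguishable_groups(germlines, read_length):
    """Group germline genes whose final `read_length` nucleotides are identical.

    germlines maps gene name -> nucleotide sequence. Genes in the same group
    cannot be discriminated by an error-free read covering only that window.
    """
    groups = defaultdict(list)
    for name, seq in germlines.items():
        groups[seq[-read_length:]].append(name)
    return sorted(sorted(names) for names in groups.values())


if __name__ == "__main__":
    toy_germlines = {  # invented sequences, for illustration only
        "V1": "AAACCCGGGTTT",
        "V2": "AATCCCGGGTTT",
        "V3": "AAACCCGGGTTA",
    }
    for length in (6, 12):
        print(length, indistinguishable_groups(toy_germlines, length))
```

    With the toy data, a 6-nucleotide window merges V1 and V2 into one group, while the full 12 nucleotides separate all three genes, which mirrors the abstract's point that shorter reads reduce the number of uniquely assignable V genes.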

  9. Cloning and sequence analysis of chitin synthase gene fragments of Demodex mites

    Ya-e ZHAO; Zheng-hang WANG; Yang XU; Ji-ru XU; Wen-yan LIU; Meng WEI; Chu-ying WANG


    To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi'an, China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was successful only in three D. canis isolates and one D. brevis isolate out of the nine Demodex isolates. The obtained fragments were sequenced to be 339 bp for D. canis and 338 bp for D. brevis. The CHS gene sequence similarities between the three Xi'an D. canis isolates and one Japanese D. canis isolate ranged from 99.7% to 100.0%, and those between four D. canis isolates and one D. brevis isolate were 99.1%-99.4%. Phylogenetic trees based on maximum parsimony (MP) and maximum likelihood (ML) methods shared the same clusters, according with the traditional classification. Two open reading frames (ORFs) were identified in each CHS gene sequenced, and their corresponding amino acid sequences were located at the catalytic domain. The relatively conserved sequences could be deduced to be a CHS class A gene, which is associated with chitin synthesis in the integument of Demodex mites.

  10. Sperm competition shapes gene expression and sequence evolution in the ocellated wrasse.

    Dean, Rebecca; Wright, Alison E; Marsh-Rollo, Susan E; Nugent, Bridget M; Alonzo, Suzanne H; Mank, Judith E


    Gene expression differences between males and females often underlie sexually dimorphic phenotypes, and the expression levels of genes that are differentially expressed between the sexes are thought to respond to sexual selection. Most studies on the transcriptomic response to sexual selection treat sexual selection as a single force, but postmating sexual selection in particular is expected to specifically target gonadal tissue. The three male morphs of the ocellated wrasse (Symphodus ocellatus) make it possible to test the role of postmating sexual selection in shaping the gonadal transcriptome. Nesting males hold territories and have the highest reproductive success, yet we detected feminization of their gonadal gene expression compared to satellite males. Satellite males are less brightly coloured and experience more intense sperm competition than nesting males. In line with postmating sexual selection affecting gonadal gene expression, we detected a more masculinized expression profile in satellites. Sneakers are the lowest quality males and showed both de-masculinization and de-feminization of gene expression. We also detected higher rates of gene sequence evolution of male-biased genes compared to unbiased genes, which could at least in part be explained by positive selection. Together, these results reveal the potential for postmating sexual selection to drive higher rates of gene sequence evolution and shape the gonadal transcriptome profile.

  11. Complete nucleotide sequence and gene rearrangement of the mitochondrial genome of Occidozyga martensii

    En Li; Xiaoqiang Li; Xiaobing Wu; Ge Feng; Man Zhang; Haitao Shi; Lijun Wang; Jianping Jiang


    In this study, the complete nucleotide sequence (18,321 bp) of the mitochondrial (mt) genome of the round-tongued floating frog, Occidozyga martensii was determined. Although, the base composition and codon usage of O. martensii conformed to the typical vertebrate patterns, this mt genome contained 23 tRNAs (a tandem duplication of tRNA-Met gene). The LTPF tRNA-gene cluster, and the derived position of the ND5 gene downstream of the control region, were present in this mitogenome. Moreover, we found that in the WANCY tRNA-gene cluster, the tRNA-Asn gene was located between the tRNA-Tyr and COI genes instead of between the tRNA-Ala and tRNA-Cys genes, which is a novel mtDNA gene rearrangement in vertebrates. Based on the concatenated nucleotide sequences of the 13 protein-coding genes, phylogenetic analysis (BI, ML, MP) was performed to further clarify the phylogenetic relations of this species within anurans.

  12. Characterization of microsatellites and gene contents from genome shotgun sequences of mungbean (Vigna radiata (L.) Wilczek)

    Sommanas Warunee


    Full Text Available Abstract Background Mungbean is an economically important crop in Asia. However, genomic research has lagged behind other crop species due to the lack of polymorphic DNA markers found in this crop. The objective of this work is to develop and characterize microsatellite or simple sequence repeat (SSR) markers from genome shotgun sequencing of mungbean. Result We have generated and characterized a total of 470,024 genome shotgun sequences covering 100.5 Mb of the mungbean (Vigna radiata (L.) Wilczek) genome using 454 sequencing technology. We identified 1,493 SSR motifs that could be used as potential molecular markers. Among 192 tested primer pairs in 17 mungbean accessions, 60 loci revealed polymorphism with polymorphic information content (PIC) values ranging from 0.0555 to 0.6907 with an average of 0.2594. The majority of microsatellite markers were transferable in Vigna species, whereas transferability rates were only 22.90% and 24.43% in Phaseolus vulgaris and Glycine max, respectively. We also used 16 SSR loci to evaluate the phylogenetic relationship of 35 genotypes of the Asian Vigna group. The genome survey sequences were further analyzed to search for gene content. The evidence suggested that 1,542 gene fragments have been sequence tagged, that fell within intersected existing gene models and shared sequence homology with other proteins in the database. Furthermore, potential microRNAs that could regulate developmental stages and environmental responses were discovered from this dataset. Conclusion In this report, we provided evidence of generating remarkable levels of diverse microsatellite markers and gene content from high throughput genome shotgun sequencing of the mungbean genomic DNA. The markers could be used in germplasm analysis, accessing genetic diversity and linkage mapping of mungbean.
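
    The polymorphic information content (PIC) values quoted in this abstract are computed per locus from allele frequencies. The abstract does not state which formula was used; the sketch below assumes the widely used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, and the allele frequencies are invented for illustration.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphic information content of one locus from its allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term over all unordered pairs of distinct alleles.
    pair_term = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - pair_term


if __name__ == "__main__":
    print(pic([0.5, 0.5]))       # two equally frequent alleles
    print(pic([0.9, 0.1]))       # one dominant allele gives a low PIC
    print(pic([1.0]))            # a monomorphic locus carries no information
```

    A locus with two equally frequent alleles gives 0.375 under this formula, so values such as the reported maximum of 0.6907 imply loci with more than two alleles.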

  13. Hunting down frame shifts: Ecological analysis of diverse functional gene sequences

    Michal Strejcek


    Full Text Available Functional gene ecological analyses using amplicon sequencing can be challenging as translated sequences are often burdened with shifted reading frames. The aim of this work was to evaluate several bioinformatics tools designed to correct errors which arise during sequencing in an effort to reduce the number of frame-shifts (FS). Genes encoding alpha subunits of biphenyl (bphA) and benzoate (benA) dioxygenases were used as model sequences. FrameBot, a FS correction tool, was able to reduce the number of detected FS to zero. However, up to 43.1% of sequences were discarded by FrameBot as non-specific targets. Therefore, we proposed a de novo mode of FrameBot for FS correction, which works on a similar basis as common chimera identifying platforms and is not dependent on reference sequences. By nature of the FrameBot de novo design, it is crucial to provide it with data as error free as possible. We tested the ability of several publicly available correction tools to decrease the number of errors in the data sets. The combination of Maximum Expected Error (MEE) filtering and single linkage pre-clustering (SLP) proved the most efficient read processing. Applying FrameBot de novo on the processed data enabled analysis of BphA sequences with minimal losses of potentially functional sequences not homologous to those previously known. This experiment also demonstrated the extensive diversity of dioxygenases in soil. A script which performs FrameBot de novo is presented in the supplementary material to the study and the tool was implemented into FunGene Pipeline available at and
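
    Maximum Expected Error filtering, named in this abstract, discards reads whose quality scores predict too many errors. The sketch below shows the general idea under stated assumptions: Phred scores Q are converted to error probabilities 10^(-Q/10) and summed per read, and a read is kept when the sum does not exceed a threshold. The threshold of 1.0, the function names and the toy reads are illustrative choices, not values from the study.

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities implied by Phred quality scores."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def mee_filter(reads, max_ee=1.0):
    """Keep (sequence, qualities) pairs whose expected error count is <= max_ee."""
    return [(seq, quals) for seq, quals in reads if expected_errors(quals) <= max_ee]


if __name__ == "__main__":
    toy_reads = [  # invented reads and quality scores
        ("ACGT", [30, 30, 30, 30]),  # expected errors = 0.004, kept
        ("ACGT", [10, 10, 3, 3]),    # expected errors is about 1.2, discarded
    ]
    for seq, quals in mee_filter(toy_reads):
        print(seq, round(expected_errors(quals), 4))
```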

  14. GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data

    Ben-Ari Fuchs, Shani; Lieder, Iris; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit


    Abstract Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from “data-to-knowledge-to-innovation,” a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ (, a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including the GeneCards®—the human gene database; the MalaCards—the human diseases database; and the PathCards—the biological pathways database. Expression-based analysis in GeneAnalytics relies on the LifeMap Discovery®—the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling advanced matching algorithm for gene–tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell “cards” in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics

  15. Analysis and comparison of fragrant gene sequence in some rice cultivars

    Karami Noushafarin


    Full Text Available It is known that the fragrant trait in rice (Oryza sativa L.) is largely controlled by the fgr gene on chromosome 8, and an 8 bp deletion and three single nucleotide polymorphisms (SNPs) in exon 7 have been shown to affect this trait. In this study, sequence alignment analysis of fgr exon 7 on chromosome 8 for 11 fragrant and non-fragrant cultivars revealed that 5 aromatic rice cultivars carried the 3 SNPs and the 8 bp deletion in exon 7, which causes premature termination at a TAA stop codon. However, 5 of the non-aromatic cultivars showed a sequence identical to the published sequence of Nipponbare, a non-fragrant Japonica variety. An exception among them was Bejar, which had the 8 bp deletion and 3 SNPs but was non-aromatic. Sequencing can determine the nucleotide sequence of a gene and give useful information about gene function. In silico prediction showed that the protein sequences encoded by the fgr gene differed between the Khazar and Domsiah genotypes. The non-fragrant Khazar genotype has the complete betaine aldehyde dehydrogenase enzyme, with a full length of 503 amino acids, whereas the non-functional BADH2 enzyme of the fragrant Domsiah genotype has 251 amino acids, which results in accumulation of 2-acetyl-1-pyrroline (2AP) and produces aroma in fragrant genotypes.

  16. Cloning and sequence analysis of Alcaligenes faecalis nifHDK gene cluster

    张海予; 林敏; 萧凤回; 朱新生; 方宣钧; 尤崇杓; 朱玉贤


    Total DNA of Alcaligenes faecalis was probed with both the nifH and nifHD sequences from K. pneumoniae. One positive band of about 4.6 kb was discovered. This nifH homologous fragment was cloned into the vector pBluescript SK to construct the recombinant plasmid pBZl. The inserted fragment in pBZl was analyzed by physical mapping and was further subcloned for sequencing. It was found that this A. faecalis nifHDK homologue possessed a typical σ54-dependent promoter region with an upstream activator sequence (UAS) and an A-T rich region. The nifH and nifD ORFs were 888 and 1,476 bp long respectively. The GC contents of these two genes were about 61.6% and 60.0%. The intergenic regions of nifH-nifD and nifD-nifK were 101 and 105 bp respectively. There were separate SD sequences upstream of all three genes. The deduced amino acid sequences of the nifH gene product (the Fe-protein) and the nifD gene product (the Mo-Fe-protein) were also highly homologous to those of other nitrogen-fixing bacteria, especially in th

  17. Nucleotide sequences of immunoglobulin eta genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution

    Sakoyama, Y.; Hong, K.J.; Byun, S.M.; Hisajima, H.; Ueda, S.; Yaoita, Y.; Hayashida, H.; Miyata, T.; Honjo, T.


    To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (Cη1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of α1-antitrypsin and β- and δ-globulin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 × 10⁻⁹ substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 ± 2.6 million years and 17.3 ± 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.

  18. Yersinia spp. Identification Using Copy Diversity in the Chromosomal 16S rRNA Gene Sequence.

    Hao, Huijing; Liang, Junrong; Duan, Ran; Chen, Yuhuang; Liu, Chang; Xiao, Yuchun; Li, Xu; Su, Mingming; Jing, Huaiqi; Wang, Xin


    The API 20E strip test, the standard for Enterobacteriaceae identification, is not sufficient to discriminate some Yersinia species, because some biochemical reactions are unstable and some species share the same biochemical profile, e.g. Yersinia frederiksenii and Yersinia intermedia; these require a variety of molecular biology methods as auxiliaries for identification. The 16S rRNA gene is considered a valuable tool for assigning bacterial strains to species. However, the resolution of the 16S rRNA gene may be insufficient for discrimination because of the high similarity of sequences between some species and heterogeneity among copies at the intra-genomic level. In this study, we randomly selected five 16S rRNA gene clones from each of 768 Yersinia strains, and collected 3,840 sequences of the 16S rRNA gene from 10 species, which were divided into 439 patterns. The similarity among the five clones of the 16S rRNA gene is over 99% for most strains. Identical sequences were found in strains of different species. A phylogenetic tree was constructed using the five 16S rRNA gene sequences for each strain; the resulting phylogenetic classifications are consistent with biochemical tests, and species that are difficult to identify by biochemical phenotype can be differentiated. Most Yersinia strains form distinct groups within each species. However, Yersinia kristensenii, a heterogeneous species, clusters with some Yersinia enterocolitica and Yersinia frederiksenii/intermedia strains, although this does not affect the overall efficiency of the species classification. In conclusion, by integrating information from multiple 16S rRNA gene sequences, our method improves the discrimination of Yersinia species.

  19. Molecular genotyping of human Ureaplasma species based on multiple-banded antigen (MBA) gene sequences.

    Kong, F; Ma, Z; James, G; Gordon, S; Gilbert, G L


    Ureaplasma urealyticum has been divided into 14 serovars. Recently, subdivision of U. urealyticum into two species has been proposed: U. parvum (previously U. urealyticum parvo biovar), comprising four serovars (1, 3, 6, 14) and U. urealyticum (previously U. urealyticum T-960 biovar), 10 serovars (2, 4, 5, 7-13). The multiple-banded antigen (MBA) genes of these species contain both species and serovar/subtype specific sequences. Based on whole sequences of the 5'-ends of MBA genes of U. parvum serovars and partial sequences of the 5'-ends of MBA genes of U. urealyticum serovars, we previously divided each of these species into three MBA genotypes. To further elucidate the relationships between serovars, we sequenced the whole 5'-ends of MBA genes of all 10 U. urealyticum serovars and partial repetitive regions of these genes from all serovars of U. parvum and U. urealyticum. For the first time, all four serovars of U. parvum were clearly differentiated from each other. In addition, the 10 serovars of U. urealyticum were divided into five MBA genotypes, as follows: MBA genotype A comprises serovars 2, 5, 8; MBA genotype B, serovar 10 only; MBA genotype C, serovars 4, 12, 13; MBA genotype D, serovar 9 only; and MBA genotype E comprises serovars 7 and 11. There were no sequence differences between members within each MBA genotype. Further work is required to identify other genes or other regions of the MBA genes that may be used to differentiate U. urealyticum serovars within MBA genotypes A, C and E. A better understanding of the molecular basis of serotype differentiation will help to improve subtyping methods for use in studies of the pathogenesis and epidemiology of these organisms.

  20. Metazoan Remaining Genes for Essential Amino Acid Biosynthesis: Sequence Conservation and Evolutionary Analyses

    Igor R. Costa


    Full Text Available Essential amino acids (EAA) consist of a group of nine amino acids that animals are unable to synthesize via de novo pathways. Recently, it has been found that most metazoans lack the same set of enzymes responsible for de novo EAA biosynthesis. Here we investigate the sequence conservation and evolution of all the remaining metazoan genes for EAA pathways. Initially, the set of all 49 enzymes responsible for de novo EAA biosynthesis in yeast was retrieved. These enzymes were used as BLAST queries to search for similar sequences in a database containing 10 complete metazoan genomes. Eight enzymes typically attributed to EAA pathways were found to be ubiquitous in metazoan genomes, suggesting a conserved functional role. In this study, we address the question of how these genes evolved after losing their pathway partners. To do this, we compared metazoan genes with their fungal and plant orthologs. Using phylogenetic analysis with maximum likelihood, we found that acetolactate synthase (ALS) and betaine-homocysteine S-methyltransferase (BHMT) diverged from the expected Tree of Life (ToL) relationships. High sequence conservation in the paraphyletic group Plant-Fungi was identified for these two genes using a newly developed Python algorithm. Selective pressure analysis of ALS and BHMT protein sequences showed higher non-synonymous mutation ratios in comparisons between metazoans/fungi and metazoans/plants, supporting the hypothesis that these two genes have undergone non-ToL evolution in animals.

  1. Development of a Comprehensive Sequencing Assay for Inherited Cardiac Condition Genes.

    Pua, Chee Jian; Bhalshankar, Jaydutt; Miao, Kui; Walsh, Roddy; John, Shibu; Lim, Shi Qi; Chow, Kingsley; Buchan, Rachel; Soh, Bee Yong; Lio, Pei Min; Lim, Jaclyn; Schafer, Sebastian; Lim, Jing Quan; Tan, Patrick; Whiffin, Nicola; Barton, Paul J; Ware, James S; Cook, Stuart A


    Inherited cardiac conditions (ICCs) are characterised by marked genetic and allelic heterogeneity and require extensive sequencing for genetic characterisation. We iteratively optimised a targeted gene capture panel for ICCs that includes disease-causing, putatively pathogenic, research and phenocopy genes (n = 174 genes). We achieved high coverage of the target region on both MiSeq (>99.8% at ≥ 20× read depth, n = 12) and NextSeq (>99.9% at ≥ 20×, n = 48) platforms with 100% sensitivity and precision for single nucleotide variants and indels across the protein-coding target on the MiSeq. In the final assay, 40 out of 43 established ICC genes informative in clinical practice achieved complete coverage (100 % at ≥ 20×). By comparison, whole exome sequencing (WES; ∼ 80×), deep WES (∼ 500×) and whole genome sequencing (WGS; ∼ 70×) had poorer performance (88.1, 99.2 and 99.3% respectively at ≥ 20×) across the ICC target. The assay described here delivers highly accurate and affordable sequencing of ICC genes, complemented by accessible cloud-based computation and informatics. See Editorial in this issue (DOI: 10.1007/s12265-015-9667-8 ).

  2. Cloning and sequencing of cagA gene fragment of Helicobacter pylori with coccoid form

    Ke-Xia Wang; Xue-Feng Wang


    AIM: To clone and sequence the cagA gene fragment of Helicobacter pylori (H. pylori) in coccoid form. METHODS: H. pylori strain NCTC11637 was transformed to the coccoid form by exposure to antibiotics at subinhibitory concentrations. The coccoid H. pylori was collected. The cagA gene of the coccoid H. pylori strain was amplified by PCR. After purification, the target fragment was cloned into plasmid pMD-18T. The recombinant plasmid pMD-18T-cagA was transformed into E. coli JM109. Positive clones were screened and identified by PCR and digestion with restriction endonucleases. The sequence of the inserted fragment was then analysed. RESULTS: A cagA gene of 3,444 bp was obtained from the genomic DNA of coccoid H. pylori. The recombinant plasmid pMD-18T-cagA was constructed and digested with BamH I + Sac I, and the digestion product was identical to the predicted one. Sequence analysis showed 99.7% homology between the coccoid H. pylori sequence and the previously reported original sequence. CONCLUSION: The recombinant plasmid containing the cagA gene from coccoid H. pylori has been constructed successfully. Coccoid H. pylori contains the complete cagA gene, which may be related to its pathogenicity.

  3. OrthoSelect: a web server for selecting orthologous gene alignments from EST sequences.

    Schreiber, Fabian; Wörheide, Gert; Morgenstern, Burkhard


    In the absence of whole genome sequences for many organisms, the use of expressed sequence tags (EST) offers an affordable approach for researchers conducting phylogenetic analyses to gain insight about the evolutionary history of organisms. Reliable alignments for phylogenomic analyses are based on orthologous gene sequences from different taxa. So far, researchers have not sufficiently tackled the problem of the completely automated construction of such datasets. Existing software tools are either semi-automated, covering only part of the necessary data processing, or implemented as a pipeline, requiring the installation and configuration of a cascade of external tools, which may be time-consuming and hard to manage. To simplify data set construction for phylogenomic studies, we set up a web server that uses our recently developed OrthoSelect approach. To the best of our knowledge, our web server is the first web-based EST analysis pipeline that allows the detection of orthologous gene sequences in EST libraries and outputs orthologous gene alignments. Additionally, OrthoSelect provides the user with an extensive results section that lists and visualizes all important results, such as annotations, data matrices for each gene/taxon and orthologous gene alignments. The web server is available at

  4. Nucleotide sequence analysis of hypervariable junctions of Haemophilus influenzae pilus gene clusters.

    Read, T D; Satola, S W; Farley, M M


    Haemophilus influenzae pili are surface structures that promote attachment to human epithelial cells. The five genes that encode pili, hifABCDE, are found inserted in genomes either between pmbA and hpt (hif-1) or between purE and pepN (hif-2). We determined the sequence between the ends of the pilus clusters and bordering genes in a number of H. influenzae strains. The junctions of the hif-1 cluster (limited to biogroup aegyptius isolates) are structurally simple. In contrast, hif-2 junctions are highly diverse, complex assemblies of conserved intergenic sequences (including genes hicA and hicB) with evidence of frequent recombination. Variation at hif-2 junctions seems to be tied to multiple copies of a 23-bp Haemophilus intergenic dyad sequence. The hif-1 cluster appears to have originated in biogroup aegyptius strains from invasion of the hpt-pmbA region by a DNA template containing the hif-2 genes with termini in the hairpin loop of flanking intergenic dyad sequences. The pilus gene clusters are an interesting model of a mobile "pathogenicity island" not associated with a phage, transposon, or insertion element.

  5. Different organisms associated with heartwater as shown by analysis of 16S ribosomal RNA gene sequences.

    Allsopp, M; Visser, E S; du Plessis, J L; Vogel, S W; Allsopp, B A


    Cowdria ruminantium is a rickettsial parasite which causes heartwater, an economically important disease of domestic and wild ruminants in tropical and subtropical Africa and parts of the Caribbean. Because existing diagnostic methods are unreliable, we investigated the small-subunit ribosomal RNA (srRNA) gene from heartwater-infected material to characterise the organisms present and to develop specific oligonucleotide probes for polymerase chain reaction (PCR) based diagnosis. DNA was obtained from ticks and ruminants from heartwater-free and heartwater-endemic areas and from Cowdria in tissue culture. PCR was carried out using primers designed to amplify only rickettsial srRNA genes, the target region being the highly variable V1 loop. Amplicons were cloned and sequenced; 51% were C. ruminantium sequences corresponding to four genotypes, two of which were identical to previously reported C. ruminantium sequences while the other two were new. The four different Cowdria genotypes can be correlated with different phenotypes. Tissue-culture samples yielded only Cowdria genotype sequences, but an extraordinary heterogeneity of 16S sequences was obtained from field samples. In addition to Cowdria genotypes we found sequences from previously unknown Ehrlichia spp., sequences showing homology to other Rickettsiales and a variety of Pseudomonadaceae. One Ehrlichia sequence was phylogenetically closely related to Ehrlichia platys (Group II Ehrlichia) and one to Ehrlichia canis (Group III Ehrlichia). This latter sequence was from an isolate (Germishuys) made from a naturally infected sheep which, from brain smear examination and pathology, appeared to be suffering from heartwater; nevertheless no Cowdria genotype sequences were found in this isolate. In addition no Cowdria sequences were obtained from uninfected ticks. Complete 16S rRNA gene sequences were determined for two C. ruminantium genotypes and for two previously uncharacterised heartwater-associated Ehrlichia spp

  6. The Sequence Variations of Intron-3 of the α-Amylase Gene in Adzuki Bean

    JIN Wen-lin; Yamaguchi Hirofumi; Isigami Matiko; Yasuda Kentaro


    This study describes variation in intron-3 of the α-amylase gene from 156 breeds of adzuki bean using SSCP (single-strand conformation polymorphism) analysis. Based on the α-amylase gene structure and sequence, a pair of PCR primers, F (CCTACATTCTAACACACCCT) and R (GCATATTGTGCCAGTACAAT), was designed to amplify intron-3 fragments of the α-amylase gene. 14 variant types were detected, including 13, 9, 10 and 4 variant types in the wild, weed, locally cultivated and modern bred adzuki beans respectively; 9, 8 and 7 variant types in the wild adzuki beans from Japan, China and Korea respectively; and some other variant types in the local adzuki beans from China and Bhutan. 60% of the cultivated races were found to be EE type in the experiment. In addition, sequence analysis of intron-3 of the α-amylase gene from 8 variant types reveals the evolutionary process of the various variant types in adzuki beans.

  7. Molecular cloning, sequence characterization and expression pattern of Rab18 gene from watermelon (Citrullus lanatus).

    Xinli, Xiao; Lei, Peng


    The complete mRNA sequence of watermelon Rab18 gene was amplified through the rapid amplification of cDNA ends (RACE) method. The full-length mRNA was 1010 bp containing a 645 bp open reading frame, which encodes a protein of 214 amino acids. Sequence analysis revealed that watermelon Rab18 protein shares high homology with the Rab18 of cucumber (99%), muskmelon (98%), Morus notabilis (90%), tomato (89%), wine grape (89%) and potato (88%). Phylogenetic analysis revealed that watermelon Rab18 gene has a closer genetic relationship with Rab18 gene of cucumber and muskmelon. Tissue expression profile analysis indicated that watermelon Rab18 gene was highly expressed in root, stem and leaf, moderately expressed in flower and weakly expressed in fruit.
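    The relation between the reported ORF length and protein length can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes the 645 bp open reading frame is counted with its stop codon included, which the abstract does not state explicitly.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumption (not stated in the abstract): the ORF length includes the
    terminal stop codon, which does not code for an amino acid.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 645 bp = 215 codons; dropping the stop codon leaves 214 residues,
# which agrees with the protein length reported for watermelon Rab18.
rab18_residues = protein_length_from_orf(645)
```

    Under that assumption, the 645 bp frame and the 214-residue protein are mutually consistent.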

  8. Sequence analysis of the Toll-like receptor 2 gene of old world camels

    Shyam S. Dahiya


    Full Text Available The Toll-like receptor 2 (TLR2) gene of old world camels (Camelus dromedarius and Camelus bactrianus) was cloned and sequenced. The TLR2 gene of the dromedary camel had the highest nucleotide and amino acid identity with pig, i.e., 66.8% and 59.6%, respectively. Similarly, the TLR2 gene of the Bactrian camel also had the highest nucleotide and amino acid identity with pig, i.e., 85.7% and 81.4%, respectively. Dromedary and Bactrian camels shared 77.9% nucleotide and 73.6% amino acid identity with each other. Interestingly, the amidation motif is present in camel (Dromedary and Bactrian) TLR2 only, and the TIR domain is absent in Dromedary camel TLR2. This is the first report of the TLR2 gene sequence of Dromedary and Bactrian camels.

  9. Sequence Comparison of Partial Cytochrome b Genes of Two Coilia species

    LIU Jinxian; GAO Tianxiang; WANG Yujiang; ZHANG Yaping


    Sequence variation of partial cytochrome b genes between two Coilia species, C. ectenes and C. mystus, was investigated. Of the 402 nucleotides, twenty-seven (6.72%) are polymorphic and all are synonymous substitutions. At the third codon positions of the cytochrome b gene, the two species show an extreme anti-G bias (<4%) and a pronounced bias towards A and C (>68%). There is no amino acid sequence divergence between the partial cytochrome b genes of the two species, indicating a close genetic relationship between them. The Kimura 2-parameter (K2P) genetic distance of the partial cytochrome b segment between the two species is 0.072, suggesting that the species separated 3.6 Ma ago, in the middle Pliocene. Our result reveals that the cytochrome b gene is an appropriate marker for studies of population genetic structures and phylogeographic patterns of the two species.
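    The distance and dating step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the distance uses the standard Kimura two-parameter formula, and the pairwise divergence rate of 2% per million years is an assumption (the abstract does not state the rate; this conventional value happens to reproduce the reported 3.6 Ma).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of transitions and transversions. Sites with gaps or
    ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def divergence_time_myr(distance: float, pairwise_rate_per_myr: float = 0.02) -> float:
    """Divergence time in million years, assuming a constant pairwise rate.

    The default of 0.02 (2% sequence divergence per million years between
    two lineages) is an assumed calibration, not a value from the abstract.
    """
    return distance / pairwise_rate_per_myr


# Reported K2P distance of 0.072 under the assumed 2%/Myr clock.
estimated_ma = divergence_time_myr(0.072)
```

    A different calibration rate would shift the estimated date proportionally, so the 3.6 Ma figure should be read as conditional on the clock used.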

  10. Sequence and evolution of the blue cone pigment gene in old and new world primates

    Hunt, D.M.; Cowing, J.A.; Patel, R. [Univ. of London (United Kingdom)] [and others]


    The sequences of the blue cone photopigments in the talapoin monkey (Miopithecus talapoin), an Old World primate, and in the marmoset (Callithrix jacchus), a New World monkey, are presented. Both genes are composed of 5 exons separated by 4 introns. In this respect, they are identical to the human blue gene, and intron sizes are also similar. Based on the level of amino acid identity, both monkey pigments are members of the S branch of pigments. Alignment of these sequences with the human gene requires the insertion/deletion of two separate codons in exon 1. The silent site divergence between these primate blue genes indicates a separation of the Old and New World primate lineages around 43 million years ago. 41 refs., 1 fig., 3 tabs.

  11. In-depth cDNA Library Sequencing Provides Quantitative Gene Expression Profiling in Cancer Biomarker Discovery

    Wanling Yang; Dingge Ying; Yu-Lung Lau


    procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting the relative expression ratio of other genes, as transcripts from as few as 300 of the most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also offers the unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.

  12. Cloning and Sequence Analysis of Disease Resistance Gene Analogues from Three Wild Rice Species in Yunnan

    LIU Ji-mei; CHENG Zai-quan; YANG Ming-zhi; WU Cheng-jun; WANG Ling-xian; SUN Yi-ding; HUANG Xing-qi


    Two sets of degenerate oligonucleotide primers were designed according to conserved amino acid regions of reported plant disease resistance genes that encode proteins containing a nucleotide-binding site and leucine-rich repeats (NBS-LRR), and of plant disease resistance genes that encode serine/threonine protein kinase (STK). By polymerase chain reaction (PCR), disease resistance gene analogues were amplified from three wild rice species in Yunnan Province, China. The amplified DNA fragments were each cloned into the pGEM-T vector. Sequencing of the DNA fragments indicated that 7 classes, 2 classes and 6 classes of NBS-LRR disease resistance gene analogues were obtained from Oryza rufipogon Griff., Oryza officinalis Wall. and Oryza meyeriana Baill. respectively. The two representative fragments, TO12 from Oryza officinalis Wall. and TR19 from Oryza rufipogon Griff., belong to the same class and their sequences are 100% homologous. The result shows that the sequences of disease resistance gene analogues of the same class do not differ among different species of wild rice. 5 classes of STK disease resistance gene analogues were also obtained, of which 4 classes came from Oryza rufipogon Griff. and 1 class from Oryza officinalis Wall. By comparative analysis of amino acid sequences, we found that the obtained disease resistance gene analogues have very low identity (as low as 25%) with the reported disease resistance genes L6, N, Bs2, Prf, Pto, Lr10 and Xa21, etc. The finding suggests that the obtained disease resistance gene analogues are analogues of putative disease resistance genes that have not been isolated so far.

  13. Cloning and sequencing of the ferredoxin gene of blue-green alga Anabaena siamensis

    Li, Shou-Dong; Song, Li-Rong; Liu, Yong-Ding; Zhao, Jin-Dong


    The structural gene for ferredoxin, petFI, from Anabaena siamensis has been amplified by polymerase chain reaction (PCR) and cloned into the cloning vector pGEM-3zf(+). The nucleotide sequence of petFI has been determined with the silver staining sequencing method. There is 96.8% homology between the coding region of petFI from A. siamensis and that of petFI from A. sp. 7120. Amino acid sequences of seven strains of blue-green algae are compared.

  14. Discovery of sequence motifs related to coexpression of genes using evolutionary computation

    Fogel, Gary B.; Weekes, Dana G.; Varga, Gabor; Dow, Ernst R.; Harlow, Harry B.; Onyia, Jude E.; Su, Chen


    Transcription factors are key regulatory elements that control gene expression. Recognition of transcription factor binding site (TFBS) motifs in the upstream region of coexpressed genes is therefore critical towards a true understanding of the regulations of gene expression. The task of discovering eukaryotic TFBSs remains a challenging problem. Here, we demonstrate that evolutionary computation can be used to search for TFBSs in upstream regions of genes known to be coexpressed. Evolutionary computation was used to search for TFBSs of genes regulated by octamer-binding factor and nuclear factor kappa B. The discovered binding sites included experimentally determined known binding motifs as well as lists of putative, previously unknown TFBSs. We believe that this method to search nucleotide sequence information efficiently for similar motifs will be useful for discovering TFBSs that affect gene regulation. PMID:15266008

  15. Biologic: Gene circuits and feedback in an introductory physics sequence for biology and premedical students

    Cahn, S B


    Two synthetic gene circuits -- the genetic toggle switch and the repressilator -- are analyzed quantitatively and discussed in the context of an educational module on gene circuits and feedback that constitutes the final topic of a year-long introductory physics sequence, aimed at biology and premedical undergraduate students. The genetic toggle switch consists of two genes, each of whose protein product represses the other's expression, while the repressilator consists of three genes, each of whose protein product represses the next gene's expression. Analytic, numerical, and electronic treatments of the genetic toggle switch show that this gene circuit realizes bistability. A simplified treatment of the repressilator reveals that this circuit can realize sustained oscillations. In both cases, a "phase diagram" is obtained that specifies the region of parameter space in which bistability or oscillatory behavior, respectively, occurs.
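    A numerical treatment of the toggle switch can be sketched in a few lines. The abstract does not give the equations, so this example assumes the commonly used dimensionless mutual-repression model du/dt = alpha / (1 + v^n) - u and dv/dt = alpha / (1 + u^n) - v; the function name, the parameter values (alpha = 10, n = 2) and the forward-Euler integrator are illustrative choices, not taken from the module itself.

```python
def simulate_toggle(u0, v0, alpha=10.0, n=2.0, dt=0.01, steps=5000):
    """Integrate a symmetric two-repressor toggle switch with forward Euler.

    Assumed model (not given in the abstract):
        du/dt = alpha / (1 + v**n) - u
        dv/dt = alpha / (1 + u**n) - v
    u and v are the dimensionless concentrations of the two repressors.
    Returns the concentrations after steps * dt time units.
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u
        dv = alpha / (1.0 + u ** n) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


# Two initial conditions on opposite sides of the symmetric state u = v
# relax to two different steady states, which is the signature of bistability.
u_high = simulate_toggle(3.0, 1.0)   # repressor u ends up dominant
v_high = simulate_toggle(1.0, 3.0)   # repressor v ends up dominant
```

    Scanning alpha and n with the same function and recording where the two runs end in different states would trace out a rough version of the bistable region of parameter space.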

  16. Exome Sequencing Reveals Cubilin Mutation as a Single-Gene Cause of Proteinuria

    Ovunc, Bugsu; Otto, Edgar A.; Vega-Warner, Virginia; Saisawat, Pawaree; Ashraf, Shazia; Ramaswami, Gokul; Fathy, Hanan M.; Schoeb, Dominik; Chernin, Gil; Lyons, Robert H.; Engin YILMAZ; Hildebrandt, Friedhelm


    In two siblings of consanguineous parents with intermittent nephrotic-range proteinuria, we identified a homozygous deleterious frameshift mutation in the gene CUBN, which encodes cubilin, using exome capture and massively parallel re-sequencing. The mutation segregated with affected members of this family and was absent from 92 healthy individuals, thereby identifying a recessive mutation in CUBN as the single-gene cause of proteinuria in this sibship. Cubilin mutations cause a hereditary fo...

  17. Local synteny and codon usage contribute to asymmetric sequence divergence of Saccharomyces cerevisiae gene duplicates

    Bergthorsson Ulfar


    Full Text Available Abstract Background: Duplicated genes frequently experience asymmetric rates of sequence evolution. Relaxed selective constraints and positive selection have both been invoked to explain the observation that one paralog within a gene-duplicate pair exhibits an accelerated rate of sequence evolution. In the majority of studies where asymmetric divergence has been established, there is no indication as to which gene copy, ancestral or derived, is evolving more rapidly. In this study we investigated the effect of local synteny (gene-neighborhood conservation) and codon usage on the sequence evolution of gene duplicates in the S. cerevisiae genome. We further divided the gene duplicates into those that originated from a whole-genome duplication (WGD) event (ohnologs) and those from small-scale duplications (SSD) to determine whether their patterns of sequence evolution differ. Results: For SSD pairs, the derived copy evolves faster than the ancestral copy. However, there is no relationship between rate asymmetry and synteny conservation (ancestral-like versus derived-like) in ohnologs. mRNA abundance and optimal codon usage, as measured by the CAI, are lower in the derived SSD copies relative to ancestral paralogs. Moreover, in the case of ohnologs, the faster-evolving copy has lower CAI and lowered expression. Conclusions: Together, these results suggest that relaxation of selection for codon usage and gene expression contributes to rate asymmetry in the evolution of duplicated genes and that in SSD pairs, the relaxation of selection stems from the loss of ancestral regulatory information in the derived copy.

  18. Genetic analysis of the PKHD1 gene with long-range PCR sequencing.

    Tong, Yong-Qing; Liu, Bei; Fu, Chao-Hong; Zheng, Hong-Yun; Gu, Jian; Liu, Hang; Luo, Hong-Bo; Li, Yan


    PKHD1 gene mutations are responsible for autosomal recessive polycystic kidney disease (ARPKD). However, it is inconvenient to detect the mutations by conventional polymerase chain reaction (PCR) because the open reading frame of PKHD1 is very long. Recently, long-range (LR) PCR followed by direct sequencing has been shown to be a more sensitive mutation screening method for PKHD1. In this study, the entire PKHD1 coding region was amplified by LR PCR in 29 reactions, generating products of 1 to 7 kb and avoiding separate PCR amplification of individual exons. This method was compared with standard direct sequencing of each individual exon of the gene, performed by a reference laboratory, in 15 patients with ARPKD. A total of 37 genetic changes were detected with LR PCR sequencing, including the 33 variations identified by the reference laboratory with standard direct sequencing. LR PCR sequencing had 100% sensitivity, 96% specificity, and 97.0% accuracy, which were higher than those of the standard direct sequencing method. In conclusion, LR PCR sequencing is a reliable method with high sensitivity, specificity and accuracy for detecting genetic variations. It also provides more intronic coverage at lower cost, and is an applicable clinical method for complex genetic analyses.

  19. How are exons encoding transmembrane sequences distributed in the exon-intron structure of genes?

    Sawada, Ryusuke; Mitaku, Shigeki


    The exon-intron structure of eukaryotic genes raises a question about the distribution of transmembrane regions in membrane proteins. Were exons that encode transmembrane regions formed simply by inserting introns into preexisting genes or by some kind of exon shuffling? To answer this question, the exon-per-gene distribution was analyzed for all genes in 40 eukaryotic genomes with a particular focus on exons encoding transmembrane segments. In 21 higher multicellular eukaryotes, the percentage of multi-exon genes (those containing at least one intron) within all genes in a genome was high (>70%) and with a mean of 87%. When genes were grouped by the number of exons per gene in higher eukaryotes, good exponential distributions were obtained not only for all genes but also for the exons encoding transmembrane segments, leading to a constant ratio of membrane proteins independent of the exon-per-gene number. The positional distribution of transmembrane regions in single-pass membrane proteins showed that they are generally located in the amino or carboxyl terminal regions. This nonrandom distribution of transmembrane regions explains the constant ratio of membrane proteins to the exon-per-gene numbers because there are always two terminal (i.e., the amino and carboxyl) regions - independent of the length of sequences.

  20. Cloning and Sequence Analysis on 3' Coding Region of Wild Boar and Cross Bred Pig Myostatin Gene

    LIU Di; YANG Xiu-qin; YANG Jia-fang


    Myostatin, whose gene is highly conserved among breeds, is a negative regulator of muscle growth. The 3' coding regions of wild boar and crossbred pig myostatin were cloned by RT-PCR and sequenced. The nucleotide sequences of wild boar and crossbred pig were 100% identical, and there was no difference in this region compared with the pig myostatin gene in GenBank. This indicated that the gene sequence in this region did not change during evolution.

  1. Gene structure of the human DDX3 and chromosome mapping of its related sequences.

    Kim, Y S; Lee, S G; Park, S H; Song, K


    The human DDX3 gene (GenBank accession No. U50553) is the human homologue of the mouse Ddx3 gene and is a member of the gene family that contains DEAD motifs. Previously, we mapped the gene to the Xp11.3-11.23. In this report, we describe the structural organization of the human DDX3 gene. It consisted of 17 exons that span approximately 16 kb. An Alu element was present in the intron 13. Its organization was the same as that of the human DBY gene, a closely related sequence present on the Y chromosome. We also identified two processed pseudogenes (DDX3) with a sequence that is highly homologous to those of DDX3 cDNAs, but contain a translation termination codon within its open-reading frame. Pseudogenes are mapped on human chromosomes 4 and X, respectively. In this paper, we discuss the relationships between DDX3 and its related sequences that have been isolated.

  2. Sequence evolution and expression regulation of stress-responsive genes in natural populations of wild tomato.

    Fischer, Iris; Steige, Kim A; Stephan, Wolfgang; Mboup, Mamadou


    The wild tomato species Solanum chilense and S. peruvianum are a valuable non-model system for studying plant adaptation since they grow in diverse environments facing many abiotic constraints. Here we investigate the sequence evolution of regulatory regions of drought and cold responsive genes and their expression regulation. The coding regions of these genes were previously shown to exhibit signatures of positive selection. Expression profiles and sequence evolution of regulatory regions of members of the Asr (ABA/water stress/ripening induced) gene family and the dehydrin gene pLC30-15 were analyzed in wild tomato populations from contrasting environments. For S. chilense, we found that Asr4 and pLC30-15 appear to respond much faster to drought conditions in accessions from very dry environments than accessions from more mesic locations. Sequence analysis suggests that the promoter of Asr2 and the downstream region of pLC30-15 are under positive selection in some local populations of S. chilense. By investigating gene expression differences at the population level we provide further support of our previous conclusions that Asr2, Asr4, and pLC30-15 are promising candidates for functional studies of adaptation. Our analysis also demonstrates the power of the candidate gene approach in evolutionary biology research and highlights the importance of wild Solanum species as a genetic resource for their cultivated relatives.

  3. WebScipio: An online tool for the determination of gene structures using protein sequences

    Waack Stephan


    Full Text Available Abstract Background: Obtaining the gene structure for a given protein-encoding gene is an important step in many analyses. Software suited to this task should be readily accessible, accurate and easy to handle, and should provide the user with a coherent representation of the most probable gene structure. It should be rigorous enough to optimise features on the level of single bases and at the same time flexible enough to allow for cross-species searches. Results: WebScipio, a web interface to the Scipio software, allows a user to obtain the coding sequence structure corresponding to a given query protein sequence that belongs to an already assembled eukaryotic genome. The resulting gene structure is presented in various human-readable formats, such as a schematic representation and a detailed alignment of the query and the target sequence highlighting any discrepancies. WebScipio can also be used to identify and characterise the gene structures of homologs in related organisms. In addition, it offers a web service for integration with other programs. Conclusion: WebScipio is a tool that allows users to get a high-quality gene structure prediction from a protein query. It offers more than 250 eukaryotic genomes that can be searched and produces predictions that are close to what can be achieved by manual annotation, for in-species and cross-species searches alike. WebScipio is freely accessible at

  4. Sequence evolution and expression regulation of stress-responsive genes in natural populations of wild tomato.

    Iris Fischer

    Full Text Available The wild tomato species Solanum chilense and S. peruvianum are a valuable non-model system for studying plant adaptation since they grow in diverse environments facing many abiotic constraints. Here we investigate the sequence evolution of regulatory regions of drought and cold responsive genes and their expression regulation. The coding regions of these genes were previously shown to exhibit signatures of positive selection. Expression profiles and sequence evolution of regulatory regions of members of the Asr (ABA/water stress/ripening induced) gene family and the dehydrin gene pLC30-15 were analyzed in wild tomato populations from contrasting environments. For S. chilense, we found that Asr4 and pLC30-15 appear to respond much faster to drought conditions in accessions from very dry environments than accessions from more mesic locations. Sequence analysis suggests that the promoter of Asr2 and the downstream region of pLC30-15 are under positive selection in some local populations of S. chilense. By investigating gene expression differences at the population level we provide further support of our previous conclusions that Asr2, Asr4, and pLC30-15 are promising candidates for functional studies of adaptation. Our analysis also demonstrates the power of the candidate gene approach in evolutionary biology research and highlights the importance of wild Solanum species as a genetic resource for their cultivated relatives.

  5. Analysis of the interaction between bovine mitochondrial 28 S ribosomal subunits and mRNA.

    Farwell, M A; Schirawski, J; Hager, P W; Spremulli, L L


    The small subunit of the bovine mitochondrial ribosome forms a tight complex with mRNAs. This [28 S:mRNA] complex forms as readily on circular mRNAs as on linear mRNAs indicating that a free 5' end on the mRNA is not required for the interaction observed. The effects of monovalent cations on the equilibrium association constant and on the forward and reverse rate constants governing this interaction have been determined. Monovalent cations have a strong effect on the forward rate constant. Increasing the KCl concentration from 1 mM to 100 mM reduces kon by nearly 100-fold. Monovalent cations have only a small effect on the reverse rate constant, koff'. Analysis of these data indicates that the rate laws governing the formation and dissociation of the [28 S:mRNA] complex cannot be deduced from the chemical equation. This observation suggests that there are "hidden intermediates' in the formation and dissociation of this complex. The implications of these observations are discussed in terms of a model for the interaction between the mitochondrial 28 S subunit and mRNAs.

  6. Transcriptome sequencing and expression analysis of terpenoid biosynthesis genes in Litsea cubeba.

    Xiao-Jiao Han

    Full Text Available BACKGROUND: Aromatic essential oils extracted from fresh fruits of Litsea cubeba (Lour.) Pers. have diverse medical and economic values. The dominant components in these essential oils are monoterpenes and sesquiterpenes. Understanding the molecular mechanisms of terpenoid biosynthesis is essential for improving the yield and quality of terpenes. However, the 40 available L. cubeba nucleotide sequences in the public databases are insufficient for studying the molecular mechanisms. Thus, high-throughput transcriptome sequencing of L. cubeba is necessary to generate large quantities of transcript sequences for the purpose of gene discovery, especially terpenoid biosynthesis related genes. RESULTS: Using Illumina paired-end sequencing, approximately 23.5 million high-quality reads were generated. De novo assembly yielded 68,648 unigenes with an average length of 834 bp. A total of 38,439 (56%) unigenes were annotated for their functions, and 35,732 and 25,806 unigenes could be aligned to the GO and COG database, respectively. By searching against the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG), 16,130 unigenes were assigned to 297 KEGG pathways, and 61 unigenes, which contained the mevalonate and 2-C-methyl-D-erythritol 4-phosphate pathways, could be related to terpenoid backbone biosynthesis. Of the 12,963 unigenes, 285 were annotated to the terpenoid pathways using the PlantCyc database. Additionally, 14 terpene synthase genes were identified from the transcriptome. The expression patterns of the 16 genes related to terpenoid biosynthesis were analyzed by RT-qPCR to explore their putative functions. CONCLUSION: RNA sequencing was effective in identifying a large quantity of sequence information. To our knowledge, this study is the first exploration of the L. cubeba transcriptome, and the substantial amount of transcripts obtained will accelerate the understanding of the molecular mechanisms of essential oils biosynthesis. The

  7. Molecular cloning and long terminal repeat sequences of human endogenous retrovirus genes related to types A and B retrovirus genes

    Ono, M.


    By using a DNA fragment primarily encoding the reverse transcriptase (pol) region of the Syrian hamster intracisternal A particle (IAP; type A retrovirus) gene as a probe, human endogenous retrovirus genes, tentatively termed HERV-K genes, were cloned from a fetal human liver gene library. Typical HERV-K genes were 9.1 or 9.4 kilobases in length, having long terminal repeats (LTRs) of ca. 970 base pairs. Many structural features commonly observed on the retrovirus LTRs, such as the TATAA box, polyadenylation signal, and terminal inverted repeats, were present on each LTR, and a lysine (K) tRNA having a CUU anticodon was identified as a presumed primer tRNA. The HERV-K LTR, however, had little sequence homology to either the IAP LTR or other typical oncovirus LTRs. By filter hybridization, the number of HERV-K genes was estimated to be ca. 50 copies per haploid human genome. The cloned mouse mammary tumor virus (type B) gene was found to hybridize with both the HERV-K and IAP genes to essentially the same extent.

  8. Discovery of clubroot-resistant genes in Brassica napus by transcriptome sequencing.

    Chen, S W; Liu, T; Gao, Y; Zhang, C; Peng, S D; Bai, M B; Li, S J; Xu, L; Zhou, X Y; Lin, L B


    Clubroot significantly affects plants of the Brassicaceae family and is one of the main diseases causing serious losses in B. napus yield. Few studies have investigated the clubroot-resistance mechanism in B. napus. Identification of clubroot-resistant genes may be used in clubroot-resistant breeding, as well as to elucidate the molecular mechanism behind B. napus clubroot-resistance. We used three B. napus transcriptome samples to construct a transcriptome sequencing library by using Illumina HiSeq™ 2000 sequencing and bioinformatic analysis. In total, 171 million high-quality reads were obtained, containing 96,149 unigenes of N50-value. We aligned the obtained unigenes with the Nr, Swiss-Prot, clusters of orthologous groups, and gene ontology databases and annotated their functions. In the Kyoto encyclopedia of genes and genomes database, 25,033 unigenes (26.04%) were assigned to 124 pathways. Many genes, including broad-spectrum disease-resistance genes, specific clubroot-resistant genes, and genes related to indole-3-acetic acid (IAA) signal transduction, cytokinin synthesis, and myrosinase synthesis in the Huashuang 3 variety of B. napus were found to be related to clubroot-resistance. The effective clubroot-resistance observed in this variety may be due to the induced increased expression of these disease-resistant genes and strong inhibition of the IAA signal transduction, cytokinin synthesis, and myrosinase synthesis. Unigenes 0048482 and 0061770 shared 94% nucleotide similarity with the Crr1 gene. Furthermore, unigene 0061770 could have originated from an inversion of the Crr1 5'-end sequence.


    A. V. Vinogradov


    Full Text Available Aim: to estimate the frequency of DNMT3A gene exons 18–26 point mutations in acute myeloid leukemia (AML) patients (pts) using a targeted automated sequencing technique. Material and Methods. Bone marrow and peripheral blood samples were obtained from 34 AML pts aged 21 to 64, who were treated in Sverdlovsk Regional Hematological Centre (Ekaterinburg) during the period 2012–2014. Distribution of the pts according to FAB-classification was as follows: AML M0 – 3, M1 – 1, M2 – 12, M3 – 3, M4 – 10, M5 – 2, M6 – 1, M7 – 1, blastic plasmacytoid dendritic cell neoplasm – 1. Total RNA was extracted from leukemic cells and subjected to reverse transcription. DNMT3A gene exons 18–26 were amplified by PCR. Detection of mutations in DNMT3A gene was performed by direct sequencing. Sequencing was performed on an ABI Prism 310 automatic genetic analyzer. Results. The average frequency of functionally significant point mutations in DNMT3A gene exons 18–26 among the treated AML pts was 5.9%. They were detected in morphological subgroups M2 and M4 (according to WHO classification). The average frequency of DNMT3A gene exons 18–26 point mutations among the AML M2 and M4 pts without chromosomal aberrations and TP53 gene point mutations was 14.3%. In both cases there were samples in which DNMT3A gene mutations were accompanied by molecular lesions of NPM1, KRAS and WT1 genes. AML pts with DNMT3A gene exons 18–26 point mutations were characterized by poor response to standard chemotherapeutic regimens and unfavorable prognosis.

  10. The nucleotide sequence of the dnaA gene and the first part of the dnaN gene of Escherichia coli K-12.

    Hansen, E B; Hansen, F G; von Meyenburg, K


    The nucleotide sequence of the dnaA gene and the first 10% of the dnaN gene was determined. From the nucleotide sequence the amino acid sequence of the dnaA gene product was derived. It is a basic protein of 467 amino acid residues with a molecular weight of 52.5 kD. The dnaA gene is expressed in the counterclockwise direction, like the dnaN gene, for which potential start sites were found.

  11. An Efficient Method for Identifying Gene Fusions by Targeted RNA Sequencing from Fresh Frozen and FFPE Samples.

    Jonathan A Scolnick

    Full Text Available Fusion genes are known to be key drivers of tumor growth in several types of cancer. Traditionally, detecting fusion genes has been a difficult task based on fluorescent in situ hybridization to detect chromosomal abnormalities. More recently, RNA sequencing has enabled an increased pace of fusion gene identification. However, RNA-Seq is inefficient for the identification of fusion genes due to the high number of sequencing reads needed to detect the small number of fusion transcripts present in cells of interest. Here we describe a method, Single Primer Enrichment Technology (SPET), for targeted RNA sequencing that is customizable to any target genes, is simple to use, and efficiently detects gene fusions. Using SPET to target 5701 exons of 401 known cancer fusion genes for sequencing, we were able to identify known and previously unreported gene fusions from both fresh-frozen and formalin-fixed paraffin-embedded (FFPE) tissue RNA in both normal tissue and cancer cells.

  12. An Efficient Method for Identifying Gene Fusions by Targeted RNA Sequencing from Fresh Frozen and FFPE Samples.

    Scolnick, Jonathan A; Dimon, Michelle; Wang, I-Ching; Huelga, Stephanie C; Amorese, Douglas A


    Fusion genes are known to be key drivers of tumor growth in several types of cancer. Traditionally, detecting fusion genes has been a difficult task based on fluorescent in situ hybridization to detect chromosomal abnormalities. More recently, RNA sequencing has enabled an increased pace of fusion gene identification. However, RNA-Seq is inefficient for the identification of fusion genes due to the high number of sequencing reads needed to detect the small number of fusion transcripts present in cells of interest. Here we describe a method, Single Primer Enrichment Technology (SPET), for targeted RNA sequencing that is customizable to any target genes, is simple to use, and efficiently detects gene fusions. Using SPET to target 5701 exons of 401 known cancer fusion genes for sequencing, we were able to identify known and previously unreported gene fusions from both fresh-frozen and formalin-fixed paraffin-embedded (FFPE) tissue RNA in both normal tissue and cancer cells.

  13. GeneWiz browser: An Interactive Tool for Visualizing Sequenced Chromosomes

    Hallin, Peter Fischer; Stærfeldt, Hans Henrik; Rotenberg, Eva;


    We present an interactive web application for visualizing genomic data of prokaryotic chromosomes. The tool (GeneWiz browser) allows users to carry out various analyses such as mapping alignments of homologous genes to other genomes, mapping of short sequencing reads to a reference chromosome......, and calculating DNA properties such as curvature or stacking energy along the chromosome. The GeneWiz browser produces an interactive graphic that enables zooming from a global scale down to single nucleotides, without changing the size of the plot. Its ability to disproportionally zoom provides optimal...

  14. Sequence divergence in two tandemly located pilin genes of Eikenella corrodens.


    Eikenella corrodens normally inhabits the human respiratory and gastrointestinal tracts but is frequently the cause of abscesses at various sites. Using the N-terminal portion of the Moraxella nonliquefaciens pilin gene as a hybridization probe, we cloned two tandemly located pilin genes of E. corrodens 31745, ecpC and ecpD, and expressed the two pilin genes separately in Escherichia coli. A comparison of the predicted amino acid sequences of E. corrodens 31745 EcpC and EcpD revealed consider...

  15. De Novo Transcriptome Sequencing of Oryza officinalis Wall ex Watt to Identify Disease-Resistance Genes

    Bin He


    Full Text Available Oryza officinalis Wall ex Watt is one of the most important wild relatives of cultivated rice and exhibits high resistance to many diseases. It has been used as a source of genes for introgression into cultivated rice. However, there are limited genomic resources and little genetic information publicly reported for this species. To better understand the pathways and factors involved in disease resistance and to accelerate the process of rice breeding, we carried out de novo transcriptome sequencing of O. officinalis. In this research, 137,229 contigs ranging from 200 to 19,214 bp, with an N50 of 2331 bp, were obtained through de novo assembly of leaves, stems and roots of O. officinalis using an Illumina HiSeq 2000 platform. Based on sequence similarity searches against a non-redundant protein database, a total of 88,249 contigs were annotated with gene descriptions and 75,589 transcripts were further assigned to GO terms. Candidate genes for plant–pathogen interaction and plant hormone regulation pathways involved in disease resistance were identified. Further analyses of gene expression profiles showed that the majority of genes related to disease resistance were expressed in all three tissues. In addition, there are two kinds of rice bacterial blight-resistance genes in O. officinalis, including two Xa1 genes and three Xa26 genes. Both Xa1 genes showed the highest expression level in stem, whereas one of the Xa26 genes was expressed dominantly in leaf and the other two Xa26 genes displayed low expression levels in all three tissues. This transcriptomic database provides an opportunity for identifying the genes involved in disease resistance and will provide a basis for studying functional genomics of O. officinalis and genetic improvement of cultivated rice in the future.
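
    The N50 statistic quoted in the abstract above can be illustrated with a minimal Python sketch. It assumes the common definition (the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length). The function name and the toy lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the assembly.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (bp), for illustration only.
toy_contigs = [200, 400, 800, 1600, 3000, 2000]
print(n50(toy_contigs))
```

    Because N50 is weighted by length, a few long contigs can raise it considerably even when most contigs are short, which is why it is usually reported alongside the contig count and length range.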

  16. Structural features of conopeptide genes inferred from partial sequences of the Conus tribblei genome.

    Barghi, Neda; Concepcion, Gisela P; Olivera, Baldomero M; Lluisma, Arturo O


    The evolvability of venom components (in particular, the gene-encoded peptide toxins) in venomous species serves as an adaptive strategy allowing them to target new prey types or respond to changes in the prey field. The structure, organization, and expression of the venom peptide genes may provide insights into the molecular mechanisms that drive the evolution of such genes. Conus is a particularly interesting group given the high chemical diversity of their venom peptides, and the rapid evolution of the conopeptide-encoding genes. Conus genomes, however, are large and characterized by a high proportion of repetitive sequences. As a result, the structure and organization of conopeptide genes have remained poorly known. In this study, a survey of the genome of Conus tribblei was undertaken to address this gap. A partial assembly of C. tribblei genome was generated; the assembly, though consisting of a large number of fragments, accounted for 2160.5 Mb of sequence. A large number of repetitive genomic elements consisting of 642.6 Mb of retrotransposable elements, simple repeats, and novel interspersed repeats were observed. We characterized the structural organization and distribution of conotoxin genes in the genome. A significant number of conopeptide genes (estimated to be between 148 and 193) belonging to different superfamilies with complete or nearly complete exon regions were observed, ~60 % of which were expressed. The unexpressed conopeptide genes represent hidden but significant conotoxin diversity. The conotoxin genes also differed in the frequency and length of the introns. The interruption of exons by long introns in the conopeptide genes and the presence of repeats in the introns may indicate the importance of introns in facilitating recombination, evolution and diversification of conotoxins. These findings advance our understanding of the structural framework that promotes the gene-level molecular evolution of venom peptides.

  17. Cloning, sequencing and expression of the Schwanniomyces occidentalis NADP-dependent glutamate dehydrogenase gene.

    De Zoysa, P A; Connerton, I F; Watson, D C; Johnston, J R


    The cloned NADP-specific glutamate dehydrogenase (GDH) genes of Aspergillus nidulans (gdhA) and Neurospora crassa (am) have been shown to hybridize under reduced stringency conditions to genomic sequences of the yeast Schwanniomyces occidentalis. Using 5' and 3' gene-specific probes, a unique 5.1 kb BclI restriction fragment that encompasses the entire Schwanniomyces sequence has been identified. A recombinant clone bearing the unique BclI fragment has been isolated from a pool of enriched clones in the yeast/E. coli shuttle vector pWH5 by colony hybridization. The identity of the plasmid clone was confirmed by functional complementation of the Saccharomyces cerevisiae gdh-1 mutation. The nucleotide sequence of the Schw. occidentalis GDH gene, which consists of 1380 nucleotides in a continuous reading frame of 459 amino acids, has been determined. The predicted amino acid sequence shows considerable homology with GDH proteins from other fungi and significant homology with all other available GDH sequences.
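
    The relationship between the reading-frame length and the protein length reported above can be checked with a short Python sketch. It assumes that the 1380 nucleotides include the terminating stop codon, which the abstract does not state explicitly; the function name is hypothetical.

```python
def protein_length_from_orf(orf_nucleotides, includes_stop=True):
    """Number of amino acids encoded by an open reading frame.

    orf_nucleotides must be a multiple of 3. If the stop codon is counted
    in the length (an assumption here), it is subtracted because it does
    not encode an amino acid.
    """
    if orf_nucleotides <= 0 or orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nucleotides // 3
    return codons - 1 if includes_stop else codons


# 1380 nt -> 460 codons -> 459 amino acids plus one stop codon
print(protein_length_from_orf(1380))
```

    Under that assumption the figures are consistent: 1380 / 3 = 460 codons, of which 459 encode amino acids.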

  18. Cloning and sequence analysis of gene encoding plasma aquaporin of Tamarix albiflonum

    DONG Yuzhi; YANG Chuanping; ZHANG Daoyuan; WANG Yucheng


    Plant aquaporins are water-selective channels in plants and are involved in seed germination, cell elongation, stomatal movement, fertilization and so on. Some plant aquaporins also play an important role in drought stress response. In this paper, the gene encoding the Tamarix albiflonum aquaporin (AQP) was amplified by 5' rapid amplification of cDNA ends (RACE) on the basis of the sequence information obtained from the expressed sequence tag of the subtractive hybridization library constructed under PEG6000 stress. The cDNA of the T. albiflonum AQP gene is 1,043 bp long and encodes a protein of 287 amino acids with a predicted molecular mass of 30.9 kDa. The protein has 6 transmembrane regions and possesses the major intrinsic protein (MIP) family signal consensus sequence SGXHXNPAVT and the higher plant plasma membrane intrinsic protein (PIP) highly conserved sequences GGGANXXXXGY and TGI/TNPARSL /FGAA I/VI/VF/YN. A comparative molecular analysis of the nucleotide sequence in National Center for Biotechnology Information (NCBI) databases showed that it shared 95% homology with the gene of Arabidopsis thaliana (MIP-C); the theoretical isoelectric point is 8.84.

  19. Distribution of Genes and Repetitive Elements in the Diabrotica virgifera virgifera Genome Estimated Using BAC Sequencing

    Brad S. Coates


    Full Text Available Feeding damage caused by the western corn rootworm, Diabrotica virgifera virgifera, is destructive to corn plants in North America and Europe where control remains challenging due to evolution of resistance to chemical and transgenic toxins. A BAC library, DvvBAC1, containing 109,486 clones with 104±34.5 kb inserts was created, which has an ~4.56X genome coverage based upon a 2.58 Gb (2.80 pg) flow cytometry-estimated haploid genome size. Paired end sequencing of 1037 BAC inserts produced 1.17 Mb of data (~0.05% genome coverage) and indicated ~9.4 and 16.0% of reads encode, respectively, endogenous genes and transposable elements (TEs). Sequencing genes within BAC full inserts demonstrated that TE densities are high within intergenic and intron regions and contribute to the increased gene size. Comparison of homologous genome regions cloned within different BAC clones indicated that TE movement may cause haplotype variation within the inbred strain. The data presented here indicate that the D. virgifera virgifera genome is large in size and contains a high proportion of repetitive sequence. These BAC sequencing methods, which are applicable for characterization of genomes prior to sequencing, may be valuable resources for genome annotation as well as scaffolding.

  20. Prosthetic joint infection due to Lysobacter thermophilus diagnosed by 16S rRNA gene sequencing.

    Dhawan, B; Sebastian, S; Malhotra, R; Kapil, A; Gautam, D


    We report the first case of prosthetic joint infection caused by Lysobacter thermophilus which was identified by 16S rRNA gene sequencing. Removal of prosthesis followed by antibiotic treatment resulted in good clinical outcome. This case illustrates the use of molecular diagnostics to detect uncommon organisms in suspected prosthetic infections.

  1. Identification of Legionella pneumophila serogroups and other Legionella species by mip gene sequencing.

    Haroon, Attiya; Koide, Michio; Higa, Futoshi; Tateyama, Masao; Fujita, Jiro


    The virulence factor known as the macrophage infectivity potentiator (mip) is responsible for the intracellular survival of Legionella species. In this study, we investigated the potential of the mip gene sequence to differentiate isolates of different species of Legionella and different serogroups of Legionella pneumophila. We used 35 clinical L. pneumophila isolates and one clinical isolate each of Legionella micdadei, Legionella longbeachae, and Legionella dumoffii (collected from hospitals all over Japan between 1980 and 2007). We used 19 environmental Legionella anisa isolates (collected in the Okinawa, Nara, Osaka, and Hyogo prefectures between 1987 and 2007) and two Legionella type strains. We extracted bacterial genomic DNA and amplified out the mip gene by PCR. PCR products were purified by agarose gel electrophoresis and the mip gene was then sequenced. The L. pneumophila isolates could be divided into two groups: one group was very similar to the type strain and was composed of serogroup (SG) 1 isolates only; the second group had more sequence variations and was composed of SG1 isolates as well as SG2, SG3, SG5, and SG10 isolates. Phylogenetic analysis displayed one cluster for L. anisa isolates, while other Legionella species were present at discrete levels. Our findings show that mip gene sequencing is an effective technique for differentiating L. pneumophila strains from other Legionella species.

  2. Phylogeny and identification of Pantoea species and typing of Pantoea agglomerans strains by multilocus gene sequencing.

    Delétoile, Alexis; Decré, Dominique; Courant, Stéphanie; Passet, Virginie; Audo, Jennifer; Grimont, Patrick; Arlet, Guillaume; Brisse, Sylvain


    Pantoea agglomerans and other Pantoea species cause infections in humans and are also pathogenic to plants, but the diversity of Pantoea strains and their possible association with hosts and disease remain poorly known, and identification of Pantoea species is difficult. We characterized 36 Pantoea strains, including 28 strains of diverse origins initially identified as P. agglomerans, by multilocus gene sequencing based on six protein-coding genes, by biochemical tests, and by antimicrobial susceptibility testing. Phylogenetic analysis and comparison with other species of Enterobacteriaceae revealed that the genus Pantoea is highly diverse. Most strains initially identified as P. agglomerans by use of API 20E strips belonged to a compact sequence cluster together with the type strain, but other strains belonged to diverse phylogenetic branches corresponding to other species of Pantoea or Enterobacteriaceae and to probable novel species. Biochemical characteristics such as fosfomycin resistance and utilization of d-tartrate could differentiate P. agglomerans from other Pantoea species. All 20 strains of P. agglomerans could be distinguished by multilocus sequence typing, revealing the very high discrimination power of this method for strain typing and population structure in this species, which is subdivided into two phylogenetic groups. PCR detection of the repA gene, associated with pathogenicity in plants, was positive in all clinical strains of P. agglomerans, suggesting that clinical and plant-associated strains do not form distinct populations. We provide a multilocus gene sequencing method that is a powerful tool for Pantoea species delineation and identification and for strain tracking.

  3. Phylogenetic analysis of Rutaceous plants based on single nucleotide polymorphism in chloroplast and nuclear gene sequences

    The family Rutaceae encompasses several genera including the economically important genus Citrus. In this study, we selected 22 citrus relatives belonging to the various sub groups of Rutaceae and compared the sequences of three gene fragments. The accessions selected belong to the subfamily Rutoide...

  4. The S-layer gene of Lactobacillus helveticus CNRZ 892 : cloning, sequence and heterologous expression

    Callegari, M.L.; Riboli, B.; Sanders, J.W; Cocconcelli, P.S.; Kok, J.; Venema, G; Morelli, L.


    Lactobacillus helveticus CNRZ 892 contains a surface layer (S-layer) composed of protein monomers of 43 kDa organized in regular arrays. The gene encoding this protein (slpH) has been cloned in Escherichia coli and sequenced. slpH consists of 440 codons and is preceded by a ribosome-binding site (RB

  5. Nearly identical bacteriophage structural gene sequences are widely distributed in both marine and freshwater environments.

    Short, Cindy M; Suttle, Curtis A


    Primers were designed to amplify a 592-bp region within a conserved structural gene (g20) found in some cyanophages. The goal was to use this gene as a proxy to infer genetic richness in natural cyanophage communities and to determine if sequences were more similar in similar environments. Gene products were amplified from samples from the Gulf of Mexico, the Arctic, Southern, and Northeast and Southeast Pacific Oceans, an Arctic cyanobacterial mat, a catfish production pond, lakes in Canada and Germany, and a depth of ca. 3,246 m in the Chukchi Sea. Amplicons were separated by denaturing gradient gel electrophoresis, and selected bands were sequenced. Phylogenetic analysis revealed four previously unknown groups of g20 clusters, two of which were entirely found in freshwater. Also, sequences with >99% identities were recovered from environments that differed greatly in temperature and salinity. For example, nearly identical sequences were recovered from the Gulf of Mexico, the Southern Pacific Ocean, an Arctic freshwater cyanobacterial mat, and Lake Constance, Germany. These results imply that closely related hosts and the viruses infecting them are distributed widely across environments or that horizontal gene exchange occurs among phage communities from very different environments. Moreover, the amplification of g20 products from deep in the cyanobacterium-sparse Chukchi Sea suggests that this primer set targets bacteriophages other than those infecting cyanobacteria.

  6. Isolation and Analysis of α-Gliadin Gene Coding Sequences from Triticum durum

    WANG Han-yan; WEI Yu-ming; ZE Hong-yan; ZHENG You-liang


    Three coding sequences of gliadin genes, designated Gli2_Du1, Gli2_Du2 and Gli2_Du3, were isolated from the genomic DNA of Triticum durum accession CItr5083. Gli2_Du1 and Gli2_Du2 contain 945 and 864 bp, encoding mature proteins of 314 and 287 amino acid residues, respectively. Gli2_Du3 is recognized as a pseudogene because a stop codon occurs in the coding region. Such pseudogenes, which occur commonly in the gliadin family, are attributed to a single base change, C → T. The amino acid sequences deduced from these gene sequences have the typical structure of α-gliadin proteins, including the toxic sequences (PSQQQP). The peptide fraction PF(Y)PP(Q) is thought to be an extra unit of the repetitive domain, diverging slightly from the previous report. Six cysteine residues were observed within two unique domains. Phylogenetic analysis showed that Gli2_Du2 and Gli2_Du3 are closely related to the genes on chromosome 6A, whereas Gli2_Du1 appears to be more homologous to the genes on chromosome 6B.
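
    The relationship between the reported coding-sequence lengths and residue counts can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the reported lengths (945 and 864 bp) include one terminal stop codon, which the abstract does not state explicitly, and the function name is hypothetical.

```python
def encoded_residues(cds_bp: int, includes_stop: bool = True) -> int:
    """Residues encoded by a coding sequence of cds_bp base pairs.

    Assumption (not stated in the abstract): the reported length
    includes one terminal stop codon.
    """
    if cds_bp % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


for name, bp in [("Gli2_Du1", 945), ("Gli2_Du2", 864)]:
    print(name, encoded_residues(bp))
```

    Under that assumption the two lengths give 314 and 287 residues, in line with the figures quoted in the abstract.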

  7. Transcriptomic sequencing reveals a set of unique genes activated by butyrate-induced histone modification

    Butyrate is a nutritional element with strong epigenetic regulatory activity as an inhibitor of histone deacetylases (HDACs). Based on the analysis of differentially expressed genes induced by butyrate in the bovine epithelial cell using deep RNA-sequencing technology (RNA-seq), a set of unique gen...




    The nucleotide sequences of two genes involved in sodium dodecyl sulfate (SDS) degradation, by Pseudomonas, have been determined. One of these, sdsA, codes for an alkyl sulfatase (58 957 Da) and has similarity (31.8% identity over a 201-amino acid stretch) to the N terminus of a predicted protein of

  9. Discovery and functional prioritization of Parkinson's disease candidate genes from large-scale whole exome sequencing

    I. Jansen (Iris); Ye, H. (Hui); Heetveld, S. (Sasja); Lechler, M.C. (Marie C.); Michels, H. (Helen); Seinstra, R.I. (Renée I.); Lubbe, S.J. (Steven J.); Drouet, V. (Valérie); S. Lesage (Suzanne); E. Majounie (Elisa); Gibbs, J.R. (J.Raphael); M.A. Nalls (Michael); M. Ryten (Mina); Botia, J.A. (Juan A.); J. Vandrovcova (Jana); J. Simón-Sánchez (Javier); Castillo-Lizardo, M. (Melissa); P. Rizzu (Patrizia); Blauwendraat, C. (Cornelis); Chouhan, A.K. (Amit K.); Li, Y. (Yarong); Yogi, P. (Puja); N. Amin (Najaf); C.M. van Duijn (Cock); Morris, H.R. (Huw R.); Brice, A. (Alexis); A. Singleton (Andrew); David, D.C. (Della C.); Nollen, E.A. (Ellen A.); A. Jain (Ashok); J.M. Shulman; P. Heutink (Peter); D.G. Hernandez (Dena); S. Arepalli (Sampath); J. Brooks (Janet); Price, R. (Ryan); Nicolas, A. (Aude); S. Chong (Sean); M.R. Cookson (Mark); A. Dillman (Allissa); M. Moore (Matt); B.J. Traynor (Bryan); A. Singleton (Andrew); V. Plagnol (Vincent); Nicholas W Wood,; U.-M. Sheerin (Una-Marie); Jose M Bras,; K. Charlesworth (Kate); M. Gardner (Mac); R. Guerreiro (Rita); D. Trabzuni (Danyah); Hardy, J. (John); M. Sharma; M. Saad (Mohamad); Javier Simón-Sánchez,; C. Schulte (Claudia); J.C. Corvol (Jean-Christophe); Dürr, A. (Alexandra); M. Vidailhet (M.); S. Sveinbjörnsdóttir (Sigurlaug); R.A. Barker (Roger); Caroline H Williams-Gray,; Y. Ben-Shlomo; H.W. Berendse (Henk W.); K.D. van Dijk (Karin); D. Berg (Daniela); K. Brockmann; K.D. Wurster (Kathrin); Mätzler, W. (Walter); Gasser, T. (Thomas); M. Martinez (Maria); R.M.A. de Bie (Rob); A. Biffi (Alessandro); D. Velseboer (Daan); B.R. Bloem (Bastiaan); B. Post (Bart); M. Wickremaratchi (Mirdhu); B. van de Warrenburg (Bart); Z. Bochdanovits (Zoltan); M. von Bonin (Malte); H. Pétursson (Hjörvar); O. Riess (Olaf); D.J. Burn (David); Lubbe, S. (Steven); Cooper, J.M. (J Mark); N.H. McNeill (Nathan); Schapira, A. (Anthony); Lungu, C. (Codrin); Chen, H. (Honglei); Dong, J. (Jing); Chinnery, P.F. (Patrick F.); G. 
Hudson (Gavin); Clarke, C.E. (Carl E.); C. Moorby (Catriona); C. Counsell (Carl); P. Damier (Philippe); J.-F. Dartigues; P. Deloukas (Panagiotis); E. Gray (Emma); T. Edkins (Ted); Hunt, S.E. (Sarah E.); S.C. Potter (Simon); A. Tashakkori-Ghanbaria (Avazeh); G. Deuschl (Günther); D. Lorenz (Delia); D.T. Dexter (David); F. Durif (Frank); J. Evans (Jonathan Mark); Langford, C. (Cordelia); T. Foltynie (Thomas); A.M. Goate (Alison); C. Harris (Clare); J.J. van Hilten (Jacobus); A. Hofman (Albert); J.R. Hollenbeck (John R.); J.L. Holton (Janice); Hu, M. (Michele); X. Huang (Xiaohong); Illig, T. (Thomas); P.V. Jónsson (Pálmi); J.-C. Lambert; S.S. O'Sullivan (Sean); T. Revesz (Tamas); K. Shaw (Karen); A.J. Lees (Andrew); P. Lichtner (Peter); P. Limousin (Patricia); G. Lopez; Escott-Price, V. (Valentina); J. Pearson (Justin); N. Williams (Nigel); E. Mudanohwo (Ese); J.S. Perlmutter (Joel); Pollak, P. (Pierre); F. Rivadeneira Ramirez (Fernando); A.G. Uitterlinden (André); S.J. Sawcer (Stephen); H. Scheffer (Hans); I. Shoulson (Ira); L. Shulman (Lee); Smith, C. (Colin); R. Walker (Robert); C.C.A. Spencer (Chris C.); A. Strange (Amy); H. Stefansson (Hreinn); F. Bettella (Francesco); J-A. Zwart (John-Anker); Stockton, J.D. (Joanna D.); D. Talbot; C.M. Tanner (Carlie); F. Tison (François); S. Winder-Rhodes (Sophie); K.P. Bhatia (Kailash)


    Background: Whole-exome sequencing (WES) has been successful in identifying genes that cause familial Parkinson's disease (PD). However, until now this approach has not been deployed to study large cohorts of unrelated participants. To discover rare PD susceptibility variants, we perform

  10. Nucleotide sequence of the Syrian hamster intracisternal A-particle gene: close evolutionary relationship of type A particle gene to types B and D oncovirus genes.

    Ono, M; Toh, H; Miyata, T; Awaya, T


    We determined the complete nucleotide sequence of the intracisternal A-particle gene, IAP-H18, cloned from the normal Syrian hamster liver DNA. IAP-H18 was 7,951 base pairs in length with two identical long terminal repeats of 376 base pairs at both ends. On the coding strand, imperfect open reading frames corresponding to gag and pol of the retrovirus genome were observed, whereas many stop codons were present in the region corresponding to env. The putative H18 gag gene (809 amino acids) had a sequence homologous to the N-terminal half of the mouse mammary tumor virus gag gene and locally to the Rous sarcoma virus gag gene. The putative H18 pol gene (900 residues) was homologous to the Rous sarcoma virus pol gene almost throughout the entire region. Two conserved regions among the retrovirus pol genes have been reported. One presumably corresponds to the DNA polymerase and the RNase H domain, and the other corresponds to the DNA endonuclease domain of the multifunctional protein pol. By the comparison of the deduced amino acid sequences of the putative endonuclease domain of six representative oncovirus genomes, a phylogenetic tree of the oncovirus genomes was constructed, and the intracisternal A-particle (type A) genome was found to be more closely related to the mouse mammary tumor virus (type B) and squirrel monkey retrovirus (type D) genomes.

  11. Cloning and sequence analysis of Sox genes in a tetraploid cyprinid fish, Tor douronensis

    GUO BaoCheng; LI JunBing; TONG ChaoBo; HE ShunPing


    A PCR survey for Sox genes in a young tetraploid fish, Tor douronensis (Teleostei: Cyprinidae), was performed to assess the evolutionary fates of important functional genes after genome duplication caused by a polyploidization event. A total of 13 Sox genes were obtained in Tor douronensis, representing the SoxB, SoxC and SoxE groups. Phylogenetic analysis of Sox genes in Tor douronensis provided evidence for fish-specific genome duplication, and suggested that Sox19 might be a teleost-specific Sox gene member. Sequence analysis revealed that most of the nucleotide substitutions between duplicated copies of Sox genes caused by the tetraploidization event, or between these copies and their orthologues in other species, are silent substitutions. It would appear that the sequences are under purifying selective pressure, strongly suggesting that they represent functional genes and supporting selection against null alleles at either of the two duplicated loci of Sox4a, Sox9a and Sox9b. Surprising variations in intron length and similarities between the two duplicated copies of Sox9a and Sox9b suggest that Tor douronensis might be an allotetraploid.

  12. [Characterization of 5S rRNA gene sequence and secondary structure in gymnosperms].

    Liu, Zhan-Lin; Zhang, Da-Ming; Wang, Xiao-Ru


    In higher plants the primary and the secondary structures of the 5S ribosomal RNA gene are considered highly conserved. Little is known about the 5S rRNA gene structure, organization and variation in gymnosperms. In this study we analyzed sequence and structure variation of the 5S rRNA gene in Pinus through cloning and sequencing multiple copies of 5S rDNA repeats from individual trees of five pines, P. bungeana, P. tabulaeformis, P. yunnanensis, P. massoniana and P. densata. Pinus bungeana is from the subgenus Strobus while the other four are from the subgenus Pinus (diploxylon pines). Our results revealed variations in both primary and secondary structure among copies of 5S rDNA within individual genomes and between species. The 5S rRNA gene in Pinus is 120 bp long in most of the 122 clones we sequenced, except for one or two deletions in three clones. Among these clones 50 unique sequences were identified, and they were shared by different pine species. Our sequences were compared to 13 sequences each representing a different gymnosperm species, and to six sequences representing both angiosperm monocots and dicots. Average sequence similarity was 97.1% among Pinus species and 94.3% between Pinus and other gymnosperms. Between gymnosperms and angiosperms the sequence similarity decreased to 88.1%. Similar to other molecular data, significant sequence divergence was found between the two Pinus subgenera. The 5S gene tree (neighbor-joining tree) grouped the four diploxylon pines together and separated them distinctly from P. bungeana. Comparison of sequence divergence within individuals and between species suggested that concerted evolution has been very weak, especially after the divergence of the four diploxylon pines. The phylogenetic information contained in the 5S rRNA gene is limited owing to its short length, and the difficulties in identifying orthologous and paralogous copies of the rDNA multigene family further complicate its phylogenetic application. Pinus densata is a

  13. Variation in the sequence and modification state of the human insulin gene flanking regions.

    Ullrich, A; Dull, T J; Gray, A; Philips, J A; Peter, S


    The nucleotide sequence of a highly repetitive sequence region upstream from the human insulin gene is reported. The length of this region varies between alleles in the population, and appears to be stably transmitted to the next generation in a Mendelian fashion. There is no significant correlation between the length of this sequence and two types of diabetes mellitus. We observe variation in the cleavability of a BglI recognition site downstream from the human insulin gene, which is probably due to variable nucleotide modification. This presumed modification state appears not to be inherited, and varies between tissues within an individual and between individuals for a given tissue. Both alleles in a given tissue DNA sample are modified to the same extent.

  14. Nucleotide sequence of maize dwarf mosaic virus capsid protein gene and its expression in Escherichia coli

    赛吉庆; 康良仪; 黄忠; 史春霖; 田波; 谢友菊


    The 3'-terminal 1,279-nucleotide sequence of the maize dwarf mosaic virus (MDMV) genome has been determined. This sequence contains an open reading frame of 1,023 nucleotides and a 3'-non-coding region of 256 nucleotides. The open reading frame includes the entire coding region for the viral capsid protein (CP) and part of that for the viral nuclear inclusion protein (NIb). The predicted viral CP consists of 313 amino acid residues with a calculated molecular weight of 35,400. The amino acid sequence of the viral CP derived from MDMV cDNA shows about 47%-54% homology to those of 4 other potyviruses. The viral CP gene was fused in frame with the lacZ gene in the pUC19 plasmid and expressed in E. coli cells. The fusion polypeptide reacted positively in Western blot with an antiserum prepared against the native viral CP.

  15. Phylogenetic diversity of sequences of cyanophage photosynthetic gene psbA in marine and freshwaters.

    Chénard, C; Suttle, C A


    Many cyanophage isolates which infect the marine cyanobacteria Synechococcus spp. and Prochlorococcus spp. contain a gene homologous to psbA, which codes for the D1 protein involved in photosynthesis. In the present study, cyanophage psbA gene fragments were readily amplified from freshwater and marine samples, confirming their widespread occurrence in aquatic communities. Phylogenetic analyses demonstrated that sequences from freshwaters have an evolutionary history that is distinct from that of their marine counterparts. Similarly, sequences from cyanophages infecting Prochlorococcus and Synechococcus spp. were readily discriminated, as were sequences from podoviruses and myoviruses. Viral psbA sequences from the same geographic origins clustered within different clades. For example, cyanophage psbA sequences from the Arctic Ocean fell within the Synechococcus as well as Prochlorococcus phage groups. Moreover, as psbA sequences are not confined to a single family of phages, they provide an additional genetic marker that can be used to explore the diversity and evolutionary history of cyanophages in aquatic environments.

  16. Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes

    Ramy Karam Aziz


    Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage ecology has benefited tremendously from the emergence of metagenomic sequencing, a systematic survey of phage genes and genomes in various ecosystems is still lacking, and fundamental questions about phage biology, lifestyle, and ecology remain unanswered. To address these questions and improve comparative analysis of phages in different metagenomes, we screened a core set of publicly available metagenomic samples for sequences related to completely sequenced phages using the web tool, Phage Eco-Locator. We then adopted and deployed an array of mathematical and statistical metrics for a multidimensional estimation of the abundance and distribution of phage genes and genomes in various ecosystems. Experiments using those metrics individually showed their usefulness in emphasizing the pervasive, yet uneven, distribution of known phage sequences in environmental metagenomes. Using these metrics in combination allowed us to resolve phage genomes into clusters that correlated with their genotypes and taxonomic classes as well as their ecological properties. We propose adding this set of metrics to current metaviromic analysis pipelines, where they can provide insight regarding phage mosaicism, habitat specificity, and evolution.

  17. Gene discovery using mutagen-induced polymorphisms and deep sequencing: application to plant disease resistance.

    Zhu, Ying; Mang, Hyung-gon; Sun, Qi; Qian, Jun; Hipps, Ashley; Hua, Jian


    Next-generation sequencing technologies are accelerating gene discovery by combining multiple steps of mapping and cloning used in the traditional map-based approach into one step using DNA sequence polymorphisms existing between two different accessions/strains/backgrounds of the same species. The existing next-generation sequencing method, like the traditional one, requires the use of a segregating population from a cross of a mutant organism in one accession with a wild-type (WT) organism in a different accession. It therefore could potentially be limited by modification of mutant phenotypes in different accessions and/or by the lengthy process required to construct a particular mapping parent in a second accession. Here we present mapping and cloning of an enhancer mutation with next-generation sequencing on bulked segregants in the same accession using sequence polymorphisms induced by a chemical mutagen. This method complements the conventional cloning approach and makes forward genetics more feasible and powerful in molecularly dissecting biological processes in any organisms. The pipeline developed in this study can be used to clone causal genes in background of single mutants or higher order of mutants and in species with or without sequence information on multiple accessions.

  18. Sequence signatures involved in targeting the male-specific lethal complex to X-chromosomal genes in Drosophila melanogaster

    Philip Philge


    Background: In Drosophila melanogaster, the dosage-compensation system that equalizes X-linked gene expression between males and females, thereby assuring that an appropriate balance is maintained between the expression of genes on the X chromosome(s) and the autosomes, is at least partially mediated by the Male-Specific Lethal (MSL) complex. This complex binds to genes with a preference for exons on the male X chromosome with a 3' bias, and it targets most expressed genes on the X chromosome. However, a number of genes are expressed but not targeted by the complex. High affinity sites (HAS) seem to be responsible for initial recruitment of the complex to the X chromosome, but the targeting to and within individual genes is poorly understood. Results: We have extensively examined X chromosome sequence variation within five types of gene features (promoters, 5' UTRs, coding sequences, introns, 3' UTRs) and intergenic sequences, and assessed its potential involvement in dosage compensation. The results presented show that: the X chromosome has a distinct sequence composition within its gene features; some of the detected variation correlates with genes targeted by the MSL complex; the insulator protein BEAF-32 preferentially binds upstream of MSL-bound genes; BEAF-32 and MOF co-localize in promoters; and bound genes have a distinct sequence composition that shows a 3' bias within coding sequence. Conclusions: Although many strongly bound genes are close to a high affinity site, neither our promoter motif nor our coding sequence signatures show any correlation to HAS. Based on the results presented here, we believe that there are sequences in the promoters and coding sequences of targeted genes that have the potential to direct the secondary spreading of the MSL complex to nearby genes.




    The objective of this research was to identify the diversity of the exon 5 UTMP gene fragment in Bali cattle using direct sequencing. A total of 60 blood samples of Bali cattle, derived from BPTU Bali on Bali island (20 head), BPTU Serading on Sumbawa island (20 head) and the Village Breeding Center (VBC) in Barru District, South Sulawesi (20 head), were used to evaluate genetic diversity at exon 5 of the UTMP gene. The forward and reverse sequence data were analyzed using the BioEdit program, and alignment analysis was carried out using the MEGA5 program. Haplotype analysis was performed with the DnaSP v5 program. The results showed that the partial sequences of exon 5 of the UTMP gene comprised 16 haplotypes, with the highest number of haplotypes found in VBC Barru District, South Sulawesi (8 haplotypes). Moreover, the highest average haplotype (h) and nucleotide (π) diversity, 0.7949 and 0.0016, respectively, were found in VBC Barru District, South Sulawesi. In addition, a minisatellite insertion consisting of 5'-CCA GTC ATG AAG AAG GCA GAG GTC GTC GTG CCG GCG AAA-3' was found in the exon 5 UTMP gene fragment in Bali cattle. According to our results, haplotype and minisatellite variation in the exon 5 UTMP gene fragment can be used as a candidate genetic marker specific for reproductive traits in Bali cattle and for future breeding program strategies.
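
    For readers unfamiliar with the haplotype diversity statistic (h) quoted above, the following is a minimal sketch of the standard Nei estimator, h = n/(n-1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i in a sample of n sequences. The abstract does not say which estimator or settings were used in DnaSP, so treat this as an illustration of the general technique; the sample labels are made up.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies in the sample.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical sample: four sequences carrying three distinct haplotypes.
sample = ["H1", "H1", "H2", "H3"]
print(round(haplotype_diversity(sample), 4))
```

    The estimator is 0 when every sequence carries the same haplotype and approaches 1 when every sequence is distinct.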

  20. Two lamprey Hedgehog genes share non-coding regulatory sequences and expression patterns with gnathostome Hedgehogs.

    Shungo Kano

    Hedgehog (Hh) genes play major roles in animal development and studies of their evolution, expression and function point to major differences among chordates. Here we focused on Hh genes in lampreys in order to characterize the evolution of Hh signalling at the emergence of vertebrates. Screening of a cosmid library of the river lamprey Lampetra fluviatilis and searching the preliminary genome assembly of the sea lamprey Petromyzon marinus indicate that lampreys have two Hh genes, named Hha and Hhb. Phylogenetic analyses suggest that Hha and Hhb are lamprey-specific paralogs closely related to Sonic/Indian Hh genes. Expression analysis indicates that Hha and Hhb are expressed in a Sonic Hh-like pattern. The two transcripts are expressed in largely overlapping but not identical domains in the lamprey embryonic brain, including a newly-described expression domain in the nasohypophyseal placode. Global alignments of genomic sequences and local alignment with known gnathostome regulatory motifs show that lamprey Hhs share conserved non-coding elements (CNEs) with gnathostome Hhs, albeit with sequences that have significantly diverged and dispersed. Functional assays using zebrafish embryos demonstrate gnathostome-like midline enhancer activity for CNEs contained in intron 2. We conclude that lamprey Hh genes are gnathostome Shh-like in terms of expression and regulation. In addition, they show some lamprey-specific features, including duplication and structural (but not functional) changes in the intronic/regulatory sequences.

  1. Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes

    Devier Benjamin


    Background: The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the Caryophyllaceae family and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production. Results: A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% a weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE. Conclusion: This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics.

  2. Sequencing the mouse Y chromosome reveals convergent gene acquisition and amplification on both sex chromosomes.

    Soh, Y Q Shirleen; Alföldi, Jessica; Pyntikova, Tatyana; Brown, Laura G; Graves, Tina; Minx, Patrick J; Fulton, Robert S; Kremitzki, Colin; Koutseva, Natalia; Mueller, Jacob L; Rozen, Steve; Hughes, Jennifer F; Owens, Elaine; Womack, James E; Murphy, William J; Cao, Qing; de Jong, Pieter; Warren, Wesley C; Wilson, Richard K; Skaletsky, Helen; Page, David C


    We sequenced the MSY (male-specific region of the Y chromosome) of the C57BL/6J strain of the laboratory mouse Mus musculus. In contrast to theories that Y chromosomes are heterochromatic and gene poor, the mouse MSY is 99.9% euchromatic and contains about 700 protein-coding genes. Only 2% of the MSY derives from the ancestral autosomes that gave rise to the mammalian sex chromosomes. Instead, all but 45 of the MSY's genes belong to three acquired, massively amplified gene families that have no homologs on primate MSYs but do have acquired, amplified homologs on the mouse X chromosome. The complete mouse MSY sequence brings to light dramatic forces in sex chromosome evolution: lineage-specific convergent acquisition and amplification of X-Y gene families, possibly fueled by antagonism between acquired X-Y homologs. The mouse MSY sequence presents opportunities for experimental studies of a sex-specific chromosome in its entirety, in a genetically tractable model organism.

  3. Sequence analysis of 21 genes located in the Kartagener syndrome linkage region on chromosome 15q.

    Geremek, Maciej; Schoenmaker, Frederieke; Zietkiewicz, Ewa; Pogorzelski, Andrzej; Diehl, Scott; Wijmenga, Cisca; Witt, Michal


    Primary ciliary dyskinesia (PCD) is a rare genetic disorder, which shows extensive genetic heterogeneity and is mostly inherited in an autosomal recessive fashion. There are four genes with a proven pathogenetic role in PCD. DNAH5 and DNAI1 are involved in 28 and 10% of PCD cases, respectively, while two other genes, DNAH11 and TXNDC3, have been identified as causal in one PCD family each. We have previously identified a 3.5 cM (2.82 Mb) region on chromosome 15q linked to Kartagener syndrome (KS), a subtype of PCD characterized by the randomization of body organ positioning. We have now refined the KS candidate region to a 1.8 Mb segment containing 18 known genes. The coding regions of these genes and three neighboring genes were subjected to sequence analysis in seven KS probands, and we were able to identify 60 single nucleotide sequence variants, 35 of which resided in mRNA coding sequences. However, none of the variations alone could explain the occurrence of the disease in these patients.

  4. Cloning and Sequence Analysis of Capsid Protein Gene of Iridovirus Indonesian Isolates

    Murwantoko


    Iridoviruses are known as agents that cause serious systemic disease in freshwater and marine fishes. Mortality of up to 100% in orange-spotted grouper (Epinephelus coioides) due to iridovirus infection has been reported in Indonesia. The gene encoding the capsid protein of iridovirus is thought to be conserved and has potential for the development of control methods. The objectives of this study were to clone the gene encoding the iridovirus capsid protein and to analyze its sequence. Spleen tissues of orange-spotted grouper were collected and their DNA was extracted. DNA fragments of the iridovirus capsid protein gene were amplified by PCR using designed primers, with the extracted DNA as template. The amplified DNA fragments were cloned in pBSKSII and sequenced. The genes encoding the capsid protein of iridovirus from Jepara and Bali were successfully amplified and cloned. The Jepara clone (IJP03) contained the complete open reading frame (ORF) of the gene, composed of 1362 bp, which encodes 453 amino acids. The Jepara and Bali (IGD01) clones shared 99.8% similarity at the nucleotide level and 99.4% at the amino acid level. Based on these sequences, the Indonesian iridovirus belongs to the genus Megalocytivirus and shares 99.6-99.9% similarity at the nucleotide level with DGIV, ISKNV, MCIV, and ALIV.
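
    To give a feel for what a similarity figure such as 99.8% means over a 1362-bp ORF, here is a minimal sketch that computes percent identity between two aligned, equal-length sequences. It assumes an ungapped full-length comparison, which the abstract does not specify, and the sequences are synthetic placeholders rather than the real capsid gene.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Synthetic stand-in for a 1362-bp ORF (not the real capsid gene sequence).
reference = "ACG" * 454
variant = list(reference)
for pos in (10, 500, 1200):  # three arbitrary substitution sites
    variant[pos] = "T"
variant = "".join(variant)

print(round(percent_identity(reference, variant), 1))  # 99.8
```

    In this toy case three substitutions in 1362 positions round to 99.8%, so a figure of that size corresponds to only a handful of differing bases.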

  5. Cloning, sequencing, and characterization of the Azospirillum brasilense fhuE gene.

    Cui, Yanhua; Tu, Ran; Guan, Yue; Ma, Luyan; Chen, Sanfeng


    The fhuE gene of Escherichia coli encodes the FhuE protein, which is a receptor protein in the coprogen-mediated siderophore iron-transport system. A fhuE gene homologue from Azospirillum brasilense, a nitrogen-fixing soil bacterium that lives in association with the roots of cereal grasses, was cloned, sequenced, and characterized. The A. brasilense fhuE encodes a protein of 802 amino acids with a predicted molecular weight of approximately 87 kDa. The deduced amino-acid sequence showed a high level of homology to the sequences of all the known fhuE gene products. The fhuE mutant was sensitive to iron starvation and defective in coprogen-mediated iron uptake. The mutant failed to express one membrane protein of approximately 78 kDa that was induced by iron starvation in the wild type. Complementation studies showed that the A. brasilense fhuE gene, when present on a low-copy number plasmid, could restore the functions of the mutant. Mutation in fhuE gene did not affect nitrogen fixation.

  6. An ancient repeat sequence in the ATP synthase beta-subunit gene of forcipulate sea stars.

    Foltz, David W


    A novel repeat sequence with a conserved secondary structure is described from two nonadjacent introns of the ATP synthase beta-subunit gene in sea stars of the order Forcipulatida (Echinodermata: Asteroidea). The repeat is present in both introns of all forcipulate sea stars examined, which suggests that it is an ancient feature of this gene (with an approximate age of 200 Mya). Both stem and loop regions show high levels of sequence constraint when compared to flanking nonrepetitive intronic regions. The repeat was also detected in (1) the family Pterasteridae, order Velatida and (2) the family Korethrasteridae, order Velatida. The repeat was not detected in (1) the family Echinasteridae, order Spinulosida, (2) the family Astropectinidae, order Paxillosida, (3) the family Solasteridae, order Velatida, or (4) the family Goniasteridae, order Valvatida. The repeat lacks similarity to published sequences in unrestricted GenBank searches, and there are no significant open reading frames in the repeat or in the flanking intron sequences. Comparison via parametric bootstrapping to a published phylogeny based on 4.2 kb of nuclear and mitochondrial sequence for a subset of these species allowed the null hypothesis of a congruent phylogeny to be rejected for each repeat, when compared separately to the published phylogeny. In contrast, the flanking nonrepetitive sequences in each intron yielded separate phylogenies that were each congruent with the published phylogeny. In four species, the repeat in one or both introns has apparently experienced gene conversion. The two introns also show a correlated pattern of nucleotide substitutions, even after excluding the putative cases of gene conversion.

  7. A score system for quality evaluation of RNA sequence tags: an improvement for gene expression profiling

    Pinheiro Daniel G


    Background: High-throughput molecular approaches for gene expression profiling, such as Serial Analysis of Gene Expression (SAGE), Massively Parallel Signature Sequencing (MPSS), or Sequencing-by-Synthesis (SBS), represent powerful techniques that provide global transcription profiles of different cell types through sequencing of short fragments of transcripts, denominated sequence tags. These techniques have improved our understanding about the relationships between these expression profiles and cellular phenotypes. Despite this, more reliable datasets are still necessary. In this work, we present a web-based tool named S3T: Score System for Sequence Tags, to index sequenced tags in accordance with their reliability. This is done through a series of evaluations based on a defined rule set. S3T allows the identification/selection of tags considered more reliable for further gene expression analysis. Results: This methodology was applied to a public SAGE dataset. In order to compare data before and after filtering, a hierarchical clustering analysis was performed in samples from the same type of tissue, in distinct biological conditions, using these two datasets. Our results provide evidence suggesting that it is possible to find more congruous clusters after using the S3T scoring system. Conclusion: These results substantiate the proposed application to generate more reliable data. This is a significant contribution for determination of global gene expression profiles. The library analysis with S3T is freely available at S3T source code and datasets can also be downloaded from the aforementioned website.

  8. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes.

    Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M


    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1). We show that the CLCN4 and CNKSR2 variants impair protein functions as indicated by electrophysiological studies and altered differentiation of cultured primary neurons from Clcn4(-/-) mice or after mRNA knock-down. The newly identified and candidate XLID proteins belong to pathways and networks with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases.

  9. Sequencing, physical organization and kinetic expression of the patulin biosynthetic gene cluster from Penicillium expansum.

    Tannous, Joanna; El Khoury, Rhoda; Snini, Selma P; Lippi, Yannick; El Khoury, André; Atoui, Ali; Lteif, Roger; Oswald, Isabelle P; Puel, Olivier


    Patulin is a polyketide-derived mycotoxin produced by numerous filamentous fungi. Among them, Penicillium expansum is by far the most problematic species. This fungus is a destructive phytopathogen capable of growing on fruit, provoking the blue mold decay of apples and producing significant amounts of patulin. The biosynthetic pathway of this mycotoxin is chemically well-characterized, but its genetic bases remain largely unknown, with only a few characterized genes in less economically relevant species. The present study consisted of the identification and positional organization of the patulin gene cluster in P. expansum strain NRRL 35695. Several amplification reactions were performed with degenerate primers that were designed based on sequences from the orthologous genes available in other species. An improved genome walking approach was used in order to sequence the remaining adjacent genes of the cluster. RACE-PCR was also carried out from mRNAs to determine the start and stop codons of the coding sequences. The patulin gene cluster in P. expansum consists of 15 genes in the following order: patH, patG, patF, patE, patD, patC, patB, patA, patM, patN, patO, patL, patI, patJ, and patK. These genes share 60-70% identity with orthologous genes grouped differently within a putative patulin cluster described in a non-producing strain of Aspergillus clavatus. The kinetics of patulin cluster gene expression was studied under patulin-permissive conditions (natural apple-based medium) and patulin-restrictive conditions (Eagle's minimal essential medium), and demonstrated a significant association between gene expression and patulin production. In conclusion, the sequence of the patulin cluster in P. expansum constitutes a key step for a better understanding of the mechanisms leading to patulin production in this fungus. It will allow the role of each gene to be elucidated, and help to define strategies to reduce patulin production in apple-based products.

  10. A tool kit for quantifying eukaryotic rRNA gene sequences from human microbiome samples.

    Dollive, Serena; Peterfreund, Gregory L; Sherrill-Mix, Scott; Bittinger, Kyle; Sinha, Rohini; Hoffmann, Christian; Nabel, Christopher S; Hill, David A; Artis, David; Bachman, Michael A; Custers-Allen, Rebecca; Grunberg, Stephanie; Wu, Gary D; Lewis, James D; Bushman, Frederic D


    Eukaryotic microorganisms are important but understudied components of the human microbiome. Here we present a pipeline for analysis of deep sequencing data on single cell eukaryotes. We designed a new 18S rRNA gene-specific PCR primer set and compared a published rRNA gene internal transcribed spacer (ITS) gene primer set. Amplicons were tested against 24 specimens from defined eukaryotes and eight well-characterized human stool samples. A software pipeline was developed for taxonomic attribution, validated against simulated data, and tested on pyrosequence data. This study provides a well-characterized tool kit for sequence-based enumeration of eukaryotic organisms in human microbiome samples.

  11. Exome Sequencing Reveals Cubilin Mutation as a Single-Gene Cause of Proteinuria

    Ovunc, Bugsu; Otto, Edgar A.; Vega-Warner, Virginia; Saisawat, Pawaree; Ashraf, Shazia; Ramaswami, Gokul; Fathy, Hanan M.; Schoeb, Dominik; Chernin, Gil; Lyons, Robert H.; Yilmaz, Engin


    In two siblings of consanguineous parents with intermittent nephrotic-range proteinuria, we identified a homozygous deleterious frameshift mutation in the gene CUBN, which encodes cubilin, using exome capture and massively parallel re-sequencing. The mutation segregated with affected members of this family and was absent from 92 healthy individuals, thereby identifying a recessive mutation in CUBN as the single-gene cause of proteinuria in this sibship. Cubilin mutations cause a hereditary form of megaloblastic anemia secondary to vitamin B12 deficiency, and proteinuria occurs in 50% of cases since cubilin is a coreceptor for both the intestinal vitamin B12-intrinsic factor complex and the tubular reabsorption of protein in the proximal tubule. In summary, we report successful use of exome capture and massively parallel re-sequencing to identify a rare, single-gene cause of nephropathy. PMID:21903995

  12. Exome sequencing reveals cubilin mutation as a single-gene cause of proteinuria.

    Ovunc, Bugsu; Otto, Edgar A; Vega-Warner, Virginia; Saisawat, Pawaree; Ashraf, Shazia; Ramaswami, Gokul; Fathy, Hanan M; Schoeb, Dominik; Chernin, Gil; Lyons, Robert H; Yilmaz, Engin; Hildebrandt, Friedhelm


    In two siblings of consanguineous parents with intermittent nephrotic-range proteinuria, we identified a homozygous deleterious frameshift mutation in the gene CUBN, which encodes cubilin, using exome capture and massively parallel re-sequencing. The mutation segregated with affected members of this family and was absent from 92 healthy individuals, thereby identifying a recessive mutation in CUBN as the single-gene cause of proteinuria in this sibship. Cubilin mutations cause a hereditary form of megaloblastic anemia secondary to vitamin B(12) deficiency, and proteinuria occurs in 50% of cases since cubilin is a coreceptor for both the intestinal vitamin B(12)-intrinsic factor complex and the tubular reabsorption of protein in the proximal tubule. In summary, we report successful use of exome capture and massively parallel re-sequencing to identify a rare, single-gene cause of nephropathy.

  13. Whole-exome sequencing and homozygosity analysis implicate depolarization-regulated neuronal genes in autism.

    Maria H Chahrour

    Although autism has a clear genetic component, the high genetic heterogeneity of the disorder has been a challenge for the identification of causative genes. We used homozygosity analysis to identify probands from nonconsanguineous families that showed evidence of distant shared ancestry, suggesting potentially recessive mutations. Whole-exome sequencing of 16 probands revealed validated homozygous, potentially pathogenic recessive mutations that segregated perfectly with disease in 4/16 families. The candidate genes (UBE3B, CLTCL1, NCKAP5L, ZNF18) encode proteins involved in proteolysis, GTPase-mediated signaling, cytoskeletal organization, and other pathways. Furthermore, neuronal depolarization regulated the transcription of these genes, suggesting potential activity-dependent roles in neurons. We present a multidimensional strategy for filtering whole-exome sequence data to find candidate recessive mutations in autism, which may have broader applicability to other complex, heterogeneous disorders.

  14. Sequencing and analysis of the gene-rich space of cowpea

    Cheung Foo


    Background: Cowpea, Vigna unguiculata (L.) Walp., is one of the most important food and forage legumes in the semi-arid tropics because of its drought tolerance and ability to grow on poor quality soils. Approximately 80% of cowpea production takes place in the dry savannahs of tropical West and Central Africa, mostly by poor subsistence farmers. Despite its economic and social importance in the developing world, cowpea remains to a large extent an underexploited crop. Among the major goals of cowpea breeding and improvement programs is the stacking of desirable agronomic traits, such as disease and pest resistance and response to abiotic stresses. Implementation of marker-assisted selection and breeding programs is severely limited by a paucity of trait-linked markers and a general lack of information on gene structure and organization. With a nuclear genome size estimated at ~620 Mb, the cowpea genome is an ideal target for reduced representation sequencing. Results: We report here the sequencing and analysis of the gene-rich, hypomethylated portion of the cowpea genome selectively cloned by methylation filtration (MF) technology. Over 250,000 gene-space sequence reads (GSRs) with an average length of 610 bp were generated, yielding ~160 Mb of sequence information. The GSRs were assembled, annotated by BLAST homology searches of four public protein annotation databases and four plant proteomes (A. thaliana, M. truncatula, O. sativa, and P. trichocarpa), and analyzed using various domain and gene modeling tools. A total of 41,260 GSR assemblies and singletons were annotated, of which 19,786 have unique GenBank accession numbers. Within the GSR dataset, 29% of the sequences were annotated using the Arabidopsis Gene Ontology (GO), with the largest categories of assigned function being catalytic activity and metabolic processes, groups that include the majority of cellular enzymes and components of amino acid, carbohydrate and lipid metabolism. A

  15. alpha-Amylase gene of Streptomyces limosus: nucleotide sequence, expression motifs, and amino acid sequence homology to mammalian and invertebrate alpha-amylases.


    The nucleotide sequence of the coding and regulatory regions of the alpha-amylase gene (aml) of Streptomyces limosus was determined. High-resolution S1 mapping was used to locate the 5' end of the transcript and demonstrated that the gene is transcribed from a unique promoter. The predicted amino acid sequence has considerable identity to mammalian and invertebrate alpha-amylases, but not to those of plant, fungal, or eubacterial origin. Consistent with this is the susceptibility of the enzym...

  16. Core gene set as the basis of multilocus sequence analysis of the subclass Actinobacteridae.

    Toïdi Adékambi

    Comparative genomic sequencing is shedding new light on bacterial identification, taxonomy and phylogeny. An in silico assessment of a core gene set necessary for cellular functioning was made to determine a consensus set of genes that would be useful for the identification, taxonomy and phylogeny of the species belonging to the subclass Actinobacteridae, which contained two orders, Actinomycetales and Bifidobacteriales. The subclass Actinobacteridae comprised about 85% of the actinobacteria families. The following recommended criteria were used to establish a comprehensive gene set: the gene should (i) be long enough to contain phylogenetically useful information, (ii) not be subject to horizontal gene transfer, (iii) be a single copy, (iv) have at least two regions sufficiently conserved to allow the design of amplification and sequencing primers, and (v) predict whole-genome relationships. We applied these constraints to 50 different Actinobacteridae genomes and made 1,224 pairwise comparisons of the genome conserved regions and gene fragments obtained by using the Sequence VARiability Analysis Program (SVARAP), which allows primer design. Following a comparative statistical modeling phase, 3 gene fragments were selected, ychF, rpoB, and secY, with R² > 0.85. Selected sets of broad range primers were tested from the 3 gene fragments and were demonstrated to be useful for amplification and sequencing of 25 species belonging to 9 genera of Actinobacteridae. The intraspecies similarities were 96.3-100% for ychF, 97.8-100% for rpoB and 96.9-100% for secY among 73 strains belonging to 15 species of the subclass Actinobacteridae, compared to 99.4-100% for 16S rRNA. The phylogenetic topology obtained from the combined datasets ychF+rpoB+secY was globally similar to that inferred from the 16S rRNA but with higher confidence. It was concluded that multi-locus sequence analysis using core gene set might represent the first consensus and valid approach for

  17. Identification of Expressed Resistance Gene Analogs from Peanut (Arachis hypogaea L.) Expressed Sequence Tags

    Zhanji Liu; Suping Feng; Manish K. Pandey; Xiaoping Chen; Albert K. Culbreath; Rajeev K. Varshney; Baozhu Guo


    Low genetic diversity makes peanut (Arachis hypogaea L.) very vulnerable to plant pathogens, causing severe yield loss and reduced seed quality. Several hundred partial genomic DNA sequences as nucleotide-binding-site leucine-rich repeat (NBS-LRR) resistance genes (R) have been identified, but a small portion with expressed transcripts has been found. We aimed to identify resistance gene analogs (RGAs) from peanut expressed sequence tags (ESTs) and to develop polymorphic markers. The protein sequences of 54 known R genes were used to identify homologs from peanut ESTs from public databases. A total of 1,053 ESTs corresponding to six different classes of known R genes were recovered, and assembled into 156 contigs and 229 singletons as peanut-expressed RGAs. There were 69 that encoded NBS-LRR proteins, 191 that encoded protein kinases, 82 that encoded LRR-PK/transmembrane proteins, 28 that encoded Toxin reductases, 11 that encoded LRR-domain containing proteins and four that encoded TM-domain containing proteins. Twenty-eight simple sequence repeats (SSRs) were identified from 25 peanut expressed RGAs. One SSR polymorphic marker (RGA121) was identified. Two polymerase chain reaction-based markers (Ahsw-1 and Ahsw-2) developed from RGA013 were homologous to the Tomato Spotted Wilt Virus (TSWV) resistance gene. All three markers were mapped on the same linkage group AhlV. These expressed RGAs are the source for RGA-tagged marker development and identification of peanut resistance genes.

  18. Nucleotide sequence analysis of a candidate gene for ataxia-telangiectasia group D (ATDC)

    Leonhardt, E.A.; Kapp, L.N.; Young, B.R.; Murnane, J.P. (Univ. of California, San Francisco, CA (United States))


    A radioresistant cell clone (1B3) was previously isolated after transfection of an ataxia-telangiectasia (AT) group D cell line with a human cosmid library. A cosmid rescued from the integration site in 1B3 contained human DNA from chromosome position 11q23, the same region shown by both genetic linkage and chromosome transfer to contain the genes for AT complementation groups A/B, C, and D. A gene within the cosmid (ATDC) was found to produce mRNAs of different sizes. A cDNA for one of the most abundant mRNAs (3.0 kb) was isolated from a HeLa cell library. In the present study, the authors sequenced the 3.0-kb cDNA and the surrounding intron DNA in the cosmids. They used polymerase chain reaction, with primers in the introns, to confirm the number of exons and to analyze DNA from AT group D cells for mutations within this gene. Although no mutations were found, they do not rule out the possibility that mutations may be present within the regulatory sequences or coding sequences found in other mRNAs specific for this gene. From the sequence analysis, they found that the ATDC gene product is one of a group of proteins that share multiple zinc finger motifs and an adjacent leucine zipper motif. These proteins have been proposed to form homo- or hetero-dimers involved in nucleic acid binding, consistent with the fact that many of these proteins appear to be transcriptional regulatory factors involved in carcinogenesis and/or differentiation. The likelihood that the ATDC gene product is involved in transcriptional regulation could explain the pleiomorphic characteristics of AT, including abnormal cell cycle regulation. 36 refs., 5 figs., 2 tabs.

  19. Cloning, sequencing and application of the LEU2 gene from the sour dough yeast Candida milleri.

    Turakainen, Hilkka; Korhola, Matti


    We have cloned by complementation in Saccharomyces cerevisiae and sequenced a LEU2 gene from the sour dough yeast Candida milleri CBS 8195 and studied its chromosomal location. The LEU2 coding sequence was 1092 nt long encoding a putative beta-isopropylmalate dehydrogenase protein of 363 amino acids. The nucleotide sequence in the coding region had 71.6% identity to S. cerevisiae LEU2 sequence. On the protein level, the identity of C. milleri Leu2p to S. cerevisiae Leu2p was 84.1%. The CmLEU2 DNA probe hybridized to one to three chromosomal bands and two or three BamHI restriction fragments in C. milleri but did not give any signal to chromosomes or restriction fragments of C. albicans, S. cerevisiae, S. exiguus or Torulaspora delbrueckii. Using CmLEU2 probe for DNA hybridization makes it easy to quickly identify C. milleri among other sour dough yeasts.
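
    The two lengths quoted in this abstract (a 1092-nt coding sequence and a 363-amino-acid protein) can be cross-checked with a short calculation. The sketch below is illustrative only: it assumes the reported 1092 nt includes the terminal stop codon, which the abstract does not state, and the function name `protein_length` is hypothetical.

    ```python
    # Consistency check for the lengths quoted in the abstract above.
    # Assumption (not stated in the abstract): the 1092-nt coding
    # sequence includes the stop codon.
    CODING_LENGTH_NT = 1092   # from the abstract
    PROTEIN_LENGTH_AA = 363   # from the abstract


    def protein_length(coding_nt: int, includes_stop: bool = True) -> int:
        """Number of amino acids encoded by a coding sequence of coding_nt nucleotides."""
        if coding_nt % 3 != 0:
            raise ValueError("coding length is not a multiple of 3")
        codons = coding_nt // 3
        # The stop codon is counted in the nucleotide length but encodes no residue.
        return codons - 1 if includes_stop else codons


    computed = protein_length(CODING_LENGTH_NT)
    print(computed, computed == PROTEIN_LENGTH_AA)
    ```

    Under that assumption, 1092 nt corresponds to 364 codons, one of which is the stop codon, which agrees with the 363 residues reported.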

  20. Nucleotide sequence of an immediate-early frog virus 3 gene.

    Willis, D; Foglesong, D; Granoff, A


    We have used "gene walking" with synthetic oligonucleotides and M13 dideoxynucleotide sequencing techniques to obtain the complete coding and flanking sequences of the gene encoding a major immediate-early RNA (molecular weight, 169,000) of frog virus 3. R-loop mapping of the cloned XbaI K fragment of frog virus 3 DNA with immediate-early RNA from infected cells showed that an RNA of approximately 500 to 600 nucleotides (the right size to code for the immediate-early viral 18-kilodalton protein of unknown function) hybridized to a region within 100 base pairs of one end of the XbaI K fragment; no evidence for splicing was observed in the electron microscope or by single-strand nuclease analysis. Further restriction mapping narrowed the location of the gene to the XbaI end of a 2-kilobase-pair XbaI-BglII fragment, which was bidirectionally subcloned into the bacteriophage pair mp10 and mp11 for sequencing. Mung bean nuclease mapping was used to identify both the 5' and the 3' ends of the mRNA. The 5' end mapped within an AT-rich region 19 base pairs upstream from two in-phase AUG start codons that were immediately followed by an open reading frame of 157 amino acids. Another AT-rich sequence was found at -29 base pairs from the 5' end of the mRNA start site; this sequence may function as a TATA box. The 3' end of the message displayed considerable microheterogeneity, but clearly terminated within a third AT-rich region 50 to 60 base pairs from the translation stop codon. The eucaryotic polyadenylic acid addition signal (AATAAA) was not present, a finding to be expected since frog virus 3 mRNA is not polyadenylated. Both the single-stranded mp10 clone of the XbaI-BglII fragment and a 15-base oligonucleotide complementary to the region flanking the two AUG translation start codons inhibited translation of the immediate-early 18-kilodalton protein in vitro, confirming the identity of the sequenced gene.
As the regulatory sequences of this gene did not resemble those of

  1. Evidence for the adaptive significance of an LTR retrotransposon sequence in a Drosophila heterochromatic gene

    Rodriguez Jose M


    Background: The potential adaptive significance of transposable elements (TEs) to the host genomes in which they reside is a topic that has been hotly debated by molecular evolutionists for more than two decades. Recent genomic analyses have demonstrated that TE fragments are associated with functional genes in plants and animals. These findings suggest that TEs may contribute significantly to gene evolution. Results: We have analyzed two transposable elements associated with genes in the sequenced Drosophila melanogaster y; cn bw sp strain. A fragment of the Antonia long terminal repeat (LTR) retrotransposon is present in the intron of Chitinase 3 (Cht3), a gene located within the constitutive heterochromatin of chromosome 2L. Within the euchromatin of chromosome 2R a full-length Burdock LTR retrotransposon is located immediately 3' to cathD, a gene encoding cathepsin D. We tested for the presence of these two TE/gene associations in strains representing 12 geographically diverse populations of D. melanogaster. While the cathD insertion variant was detected only in the sequenced y; cn bw sp strain, the insertion variant present in the heterochromatic Cht3 gene was found to be fixed throughout twelve D. melanogaster populations and in a D. mauritiana strain, suggesting that it may be of adaptive significance. To further test this hypothesis, we sequenced a 685 bp region spanning the LTR fragment in the intron of Cht3 in strains representative of the two sibling species D. melanogaster and D. mauritiana (~2.7 million years divergent). The level of sequence divergence between the two species within this region was significantly lower than expected from the neutral substitution rate and lower than the divergence observed between a randomly selected intron of the Drosophila Alcohol dehydrogenase gene (Adh). Conclusions: Our results suggest that a 359 bp fragment of an Antonia retrotransposon (complete LTR is 659 bp) located within the intron of the

  2. High occurrence of functional new chimeric genes in survey of rice chromosome 3 short arm genome sequences.

    Zhang, Chengjun; Wang, Jun; Marowsky, Nicholas C; Long, Manyuan; Wing, Rod A; Fan, Chuanzhu


    In an effort to identify newly evolved genes in rice, we searched the genomes of Asian-cultivated rice Oryza sativa ssp. japonica and its wild progenitors, looking for lineage-specific genes. Using genome pairwise comparison of approximately 20-Mb DNA sequences from the chromosome 3 short arm (Chr3s) in six rice species, O. sativa, O. nivara, O. rufipogon, O. glaberrima, O. barthii, and O. punctata, combined with synonymous substitution rate tests and other evidence, we were able to identify potential recently duplicated genes, which evolved within the last 1 Myr. We identified 28 functional O. sativa genes, which likely originated after O. sativa diverged from O. glaberrima. These genes account for around 1% (28/3,176) of all annotated genes on O. sativa's Chr3s. Among the 28 new genes, two recently duplicated segments contained eight genes. Fourteen of the 28 new genes have a chimeric gene structure derived from one or multiple parental genes and flanking targeting sequences. Although the majority of these 28 new genes were formed by single or segmental DNA-based gene duplication and recombination, we found two genes that likely originated partially through exon shuffling. Sequence divergence tests between new genes and their putative progenitors indicated that new genes were most likely evolving under natural selection. We showed that all 28 new genes appeared to be functional, as suggested by Ka/Ks analysis and the presence of RNA-seq, cDNA, expressed sequence tag, massively parallel signature sequencing, and/or small RNA data. The high rate of new gene origination and of chimeric gene formation in rice may demonstrate rice's broad diversification, domestication, its environmental adaptation, and the role of new genes in rice speciation.

  3. Detection and sequence analysis of accessory gene regulator genes of Staphylococcus pseudintermedius isolates

    M. Ananda Chitra; Jayanthy, C.; Nagarajan, B.


    Background: Staphylococcus pseudintermedius (SP) is the major pathogenic species of dogs involved in a wide variety of skin and soft tissue infections. The accessory gene regulator (agr) locus of Staphylococcus aureus has been extensively studied, and it influences the expression of many virulence genes. It encodes a two-component signal transduction system that leads to down-regulation of surface proteins and up-regulation of secreted proteins during in vitro growth of S. aureus. The objecti...

  4. Sequence length polymorphisms within primate amelogenin and amelogenin-like genes: usefulness in sex determination.

    Morrill, Benson H; Rickords, Lee F; Schafstall, Heather J


    Sequence length polymorphisms between the amelogenin (AMELX) and the amelogenin-like (AMELY) genes both within and between several mammalian species have been identified and utilized for sex determination, species identification, and to elucidate evolutionary relationships. Sex determination via polymerase chain reaction (PCR) assays of the AMELX and AMELY genes has been successful in greater apes, prosimians, and two species of old world monkeys. To date, no sex determination PCR assay using AMELX and AMELY has been developed for new world monkeys. In this study, we present partial AMELX and AMELY sequences for five old world monkey species (Mandrillus sphinx, Macaca nemestrina, Macaca fuscata, Macaca mulatta, and Macaca fascicularis) along with primer sets that can be used for sex determination of these five species. In addition, we compare the sequences we generated with other primate AMELX and AMELY sequences available on GenBank and discuss sequence length polymorphisms and their usefulness in sex determination within primates. The mandrill and four species of macaque all share two similar deletion regions with each other, the human, and the chimpanzee in the region sequenced. These two deletion regions are 176-181 and 8 nucleotides in length. In analyzing existing primate sequences on GenBank, we also discovered that a separate six-nucleotide polymorphism located approximately 300 nucleotides upstream of the 177 nucleotide polymorphism in sequences of humans and chimps was also present in two species of new world monkeys (Saimiri boliviensis and Saimiri sciureus). We designed primers that incorporate this polymorphism, creating the first AMELX and AMELY PCR primer set that has been used successfully to generate two bands in a new world monkey species.

  5. Sequence breakpoints in the aflatoxin biosynthesis gene cluster and flanking regions in nonaflatoxigenic Aspergillus flavus isolates.

    Chang, Perng-Kuang; Horn, Bruce W; Dorner, Joe W


    Aspergillus flavus populations are genetically diverse. Isolates that produce either, neither, or both aflatoxins and cyclopiazonic acid (CPA) are present in the field. We investigated defects in the aflatoxin gene cluster in 38 nonaflatoxigenic A. flavus isolates collected from southern United States. PCR assays using aflatoxin-gene-specific primers grouped these isolates into eight (A-H) deletion patterns. Patterns C, E, G, and H, which contain 40 kb deletions, were examined for their sequence breakpoints. Pattern C has one breakpoint in the cypA 3' untranslated region (UTR) and another in the verA coding region. Pattern E has a breakpoint in the amdA coding region and another in the ver1 5'UTR. Pattern G contains a deletion identical to the one found in pattern C and has another deletion that extends from the cypA coding region to one end of the chromosome as suggested by the presence of telomeric sequence repeats, CCCTAATGTTGA. Pattern H has a deletion of the entire aflatoxin gene cluster from the hexA coding region in the sugar utilization gene cluster to the telomeric region. Thus, deletions in the aflatoxin gene cluster among A. flavus isolates are not rare, and the patterns appear to be diverse. Genetic drift may be a driving force that is responsible for the loss of the entire aflatoxin gene cluster in nonaflatoxigenic A. flavus isolates when aflatoxins have lost their adaptive value in nature.

  6. PPARG: Gene Expression Regulation and Next-Generation Sequencing for Unsolved Issues

    Valerio Costa


    Peroxisome proliferator-activated receptor gamma (PPARγ) is one of the most extensively studied ligand-inducible transcription factors (TFs), able to modulate its transcriptional activity through conformational changes. It is of particular interest because of its pleiotropic functions: it plays a crucial role in the expression of key genes involved in adipogenesis, lipid and glucid metabolism, atherosclerosis, inflammation, and cancer. Its protein isoforms, the wide number of PPARγ target genes, ligands, and coregulators contribute to determine the complexity of its function. In addition, the presence of genetic variants is likely to affect expression levels of target genes although the impact of PPARG gene variations on the expression of target genes is not fully understood. The introduction of massively parallel sequencing platforms, in the Next Generation Sequencing (NGS) era, has revolutionized the way of investigating the genetic causes of inherited diseases. In this context, DNA-Seq for identifying novel nucleotide variations and haplotypes associated with human diseases within both coding and regulatory regions of the PPARG gene, ChIP-Seq for defining a PPARγ binding map, and RNA-Seq for unraveling the wide and intricate gene pathways regulated by PPARG, represent incredible steps toward the understanding of PPARγ in health and disease.

  7. Targeted enrichment of the black cottonwood (Populus trichocarpa) gene space using sequence capture

    Zhou Lecong


    Full Text Available Abstract Background High-throughput re-sequencing is rapidly becoming the method of choice for studies of neutral and adaptive processes in natural populations across taxa. As re-sequencing the genome of large numbers of samples is still cost-prohibitive in many cases, methods for genome complexity reduction have been developed in attempts to capture most ecologically-relevant genetic variation. One of these approaches is sequence capture, in which oligonucleotide baits specific to genomic regions of interest are synthesized and used to retrieve and sequence those regions. Results We used sequence capture to re-sequence most predicted exons, their upstream regulatory regions, as well as numerous random genomic intervals in a panel of 48 genotypes of the angiosperm tree Populus trichocarpa (black cottonwood, or ‘poplar’). A total of 20.76 Mb (5% of the poplar genome) was targeted, corresponding to 173,040 baits. With 12 indexed samples run in each of four lanes on an Illumina HiSeq instrument (2x100 paired-end), 86.8% of the bait regions were on average sequenced at a depth ≥10X. Few off-target regions (>250 bp away from any bait) were present in the data, but on average ~80 bp on either side of the baits were captured and sequenced to an acceptable depth (≥10X) to call heterozygous SNPs. Nucleotide diversity estimates within and adjacent to protein-coding genes were similar to those previously reported in Populus spp., while intergenic regions had higher values consistent with a relaxation of selection. Conclusions Our results illustrate the efficiency and utility of sequence capture for re-sequencing highly heterozygous tree genomes, and suggest design considerations to optimize the use of baits in future studies.

  8. Molecular cloning and primary sequence analysis of a gene encoding a putative chitinase gene in Brassica oleracea var. capitata



    Chitinase, which catalyzes the hydrolysis of the β-1,4-acetyl-D-glucosamine linkages of the fungal cell wall polymer chitin, is involved in the inducible plant defense system. By constructing a cabbage (Brassica oleracea var. capitata) genomic library and screening it with pRCH8, a probe derived from a rice chitinase gene fragment, a chitinase genomic sequence was isolated. The complete nucleotide sequence of the putative cabbage chitinase gene (cabch29) was determined, with its longest open reading frame (ORF) encoding a polypeptide of 413 aa. This polypeptide consists of a 21 aa N-terminal signal peptide, two chitin-binding domains different from those of other classes of plant chitinases, and a catalytic domain. Homology analysis showed that the cabch29 gene has 58.8% identity at the nucleotide level with the pRCH8 ORF probe and 50% identity at the amino acid level with the catalytic domains of chitinases from bean, maize and sugar beet. In addition, several kinds of cis-elements, such as a TATA box, CAAT box, GATA motif, ASF-1 binding site, wound-response elements and AATAAA, were found in the flanking region of the cabch29 gene.

  9. A De Novo Whole GCK Gene Deletion Not Detected by Gene Sequencing, in a Boy with Phenotypic GCK Insufficiency

    N. H. Birkebæk


    Full Text Available We report on a boy with diabetes mellitus and a phenotype indicating glucokinase (GCK) insufficiency, but a normal GCK gene examination by direct gene sequencing. The boy was referred for diabetes mellitus at 7.5 years of age. His father, grandfather and great-grandfather had type 2 DM. Several blood glucose (BG) profiles showed BG of 6.5–10 mmol/L. After three years on neutral insulin Hagedorn (NPH) at a dose of 0.3 IU/kg/day, haemoglobin A1c (HbA1c) was 6.8%. Treatment was changed to sulphonylurea 750 mg a day, and after 4 years HbA1c was 7%. At that time a multiplex ligation-dependent amplification gene dosage assay (MLPA) was done, revealing a whole GCK gene deletion. Medical treatment was ceased, and after one year HbA1c was 6.8%. This case underscores the importance of an MLPA examination if the phenotype of a patient is strongly indicative of GCK insufficiency and no mutation is identified using direct sequencing.

  10. Molecular Diagnostics of Gliomas Using Next Generation Sequencing of a Glioma-Tailored Gene Panel.

    Zacher, Angela; Kaulich, Kerstin; Stepanow, Stefanie; Wolter, Marietta; Köhrer, Karl; Felsberg, Jörg; Malzkorn, Bastian; Reifenberger, Guido


    Current classification of gliomas is based on histological criteria according to the World Health Organization (WHO) classification of tumors of the central nervous system. Over the past years, characteristic genetic profiles have been identified in various glioma types. These can refine tumor diagnostics and provide important prognostic and predictive information. We report on the establishment and validation of gene panel next generation sequencing (NGS) for the molecular diagnostics of gliomas. We designed a glioma-tailored gene panel covering 660 amplicons derived from 20 genes frequently aberrant in different glioma types. Sensitivity and specificity of glioma gene panel NGS for detection of DNA sequence variants and copy number changes were validated by single gene analyses. NGS-based mutation detection was optimized for application on formalin-fixed paraffin-embedded tissue specimens including small stereotactic biopsy samples. NGS data obtained in a retrospective analysis of 121 gliomas allowed for their molecular classification into distinct biological groups, including (i) isocitrate dehydrogenase gene (IDH) 1 or 2 mutant astrocytic gliomas with frequent α-thalassemia/mental retardation syndrome X-linked (ATRX) and tumor protein p53 (TP53) gene mutations, (ii) IDH mutant oligodendroglial tumors with 1p/19q codeletion, telomerase reverse transcriptase (TERT) promoter mutation and frequent Drosophila homolog of capicua (CIC) gene mutation, as well as (iii) IDH wildtype glioblastomas with frequent TERT promoter mutation, phosphatase and tensin homolog (PTEN) mutation and/or epidermal growth factor receptor (EGFR) amplification. Oligoastrocytic gliomas were genetically assigned to either of these groups. Our findings implicate gene panel NGS as a promising diagnostic technique that may facilitate integrated histological and molecular glioma classification.

  11. Complete exon sequencing of all known Usher syndrome genes greatly improves molecular diagnosis

    Lacombe Didier


    Full Text Available Abstract Background Usher syndrome (USH) combines sensorineural deafness with blindness. It is inherited in an autosomal recessive mode. Early diagnosis is critical for adapted educational and patient management choices, and for genetic counseling. To date, nine causative genes have been identified for the three clinical subtypes (USH1, USH2 and USH3). Current diagnostic strategies make use of a genotyping microarray that is based on the previously reported mutations. The purpose of this study was to design a more accurate molecular diagnosis tool. Methods We sequenced the 366 coding exons and flanking regions of the nine known USH genes in 54 USH patients (27 USH1, 21 USH2 and 6 USH3). Results Biallelic mutations were detected in 39 patients (72%) and monoallelic mutations in an additional 10 patients (18.5%). In addition to biallelic mutations in one of the USH genes, presumably pathogenic mutations in another USH gene were detected in seven patients (13%), and another patient carried monoallelic mutations in three different USH genes. Notably, none of the USH3 patients carried detectable mutations in the only known USH3 gene, whereas they all carried mutations in USH2 genes. Most importantly, the currently used microarray would have detected only 30 of the 81 different mutations that we found, of which 39 (48%) were novel. Conclusions Based on these results, complete exon sequencing of the currently known USH genes stands as a definite improvement for molecular diagnosis of this disease, which is of utmost importance in the perspective of gene therapy.

  12. Sequence Diversity of VP4 and VP7 Genes of Human Rotavirus Strains in Saudi Arabia.

    Abdel-Moneim, Ahmed S; Al-Malky, Mater I R; Alsulaimani, Adnan A A; Abuelsaad, Abdelaziz S A; Mohamed, Imad; Ismail, Ayman K


    Group A rotavirus is responsible for inducing severe diarrhea in young children worldwide. Rotavirus vaccines are used to control the disease in many countries. In the current study, the sequences of human rotavirus G and P types in Saudi Arabia are reported and compared to different relevant published sequences. In addition, the VP4 and VP7 genes of the G1P[8] strains are compared to different antigenic epitopes of the rotavirus vaccines. Stool samples were collected from children under 2 years suffering from severe diarrhea. Screening of the rotavirus-positive samples was performed with a rapid antigen detection kit. RNA was amplified from rotavirus-positive samples by reverse transcriptase polymerase chain reaction assay for both VP4 and VP7 genes. Direct sequencing of the VP4 and VP7 genes was conducted and the obtained sequences were compared to each other and to the rotavirus vaccines. Both G1P[8] and G1P[4] genotypes were detected. Phylogenetic analysis revealed that the detected strains belong to G1 lineages 1 and 2, P[8] lineage 3, and P[4] lineage 5. Multiple amino acid substitutions were detected between the Saudi RVA strains and the commonly used vaccines. The current findings emphasize the importance of the continuous surveillance of the circulating rotavirus strains, which is crucial for monitoring virus evolution and helping in predicting the protection level afforded by rotavirus vaccines.

  13. Sequence and analysis of the gene for bacteriophage T3 RNA polymerase.

    McGraw, N J; Bailey, J N; Cleaves, G R; Dembinski, D R; Gocke, C R; Joliffe, L K; MacWright, R S; McAllister, W T


    The RNA polymerases encoded by bacteriophages T3 and T7 have similar structures, but exhibit nearly exclusive template specificities. We have determined the nucleotide sequence of the region of T3 DNA that encodes the T3 RNA polymerase (the gene 1.0 region), and have compared this sequence with the corresponding region of T7 DNA. The predicted amino acid sequence of the T3 RNA polymerase exhibits very few changes when compared to the T7 enzyme (82% of the residues are identical). Significant differences appear to cluster in three distinct regions in the amino-terminal half of the protein. Analysis of the data from both enzymes suggests features that may be important for polymerase function. In particular, a region that differs between the T3 and T7 enzymes exhibits significant homology to the bi-helical domain that is common to many sequence-specific DNA binding proteins. The region that flanks the structural gene contains a number of regulatory elements including: a promoter for the E. coli RNA polymerase, a potential processing site for RNase III and a promoter for the T3 polymerase. The promoter for the T3 RNA polymerase is located only 12 base pairs distal to the stop codon for the structural gene. PMID:3903658

  14. Analysis of mutations in the entire coding sequence of the factor VIII gene

    Bidichadani, S.I.; Lanyon, W.G.; Connor, J.M. [Glasgow Univ. (United Kingdom)] [and others]


    Hemophilia A is a common X-linked recessive disorder of bleeding caused by deleterious mutations in the gene for clotting factor VIII. The large size of the factor VIII gene, the high frequency of de novo mutations and its tissue-specific expression complicate the detection of mutations. We have used a combination of RT-PCR of ectopic factor VIII transcripts and genomic DNA-PCRs to amplify the entire essential sequence of the factor VIII gene. This is followed by chemical mismatch cleavage analysis and direct sequencing in order to facilitate a comprehensive search for mutations. We describe the characterization of nine potentially pathogenic mutations, six of which are novel. In each case, a correlation of the genotype with the observed phenotype is presented. In order to evaluate the pathogenicity of the five missense mutations detected, we have analyzed them for evolutionary sequence conservation and for their involvement in sequence motifs catalogued in the PROSITE database of protein sites and patterns.

  15. New Hosts of Simplicimonas similis and Trichomitus batrachorum Identified by 18S Ribosomal RNA Gene Sequences

    Kris Genelyn B. Dimasuay


    Full Text Available Trichomonads are obligate anaerobes generally found in the digestive and genitourinary tracts of domestic animals. In this study, four trichomonad isolates were obtained from carabao, dog, and pig hosts using rectal swabs. Genomic DNA was extracted using the Chelex method, and the 18S rRNA gene was successfully amplified with novel sets of primers and subjected to DNA sequencing. Aligned isolate sequences together with retrieved 18S rRNA gene sequences of known trichomonads were utilized to generate phylogenetic trees using maximum likelihood and neighbor-joining analyses. Two isolates from carabao were identified as Simplicimonas similis, while the isolates from dog and pig were identified as Pentatrichomonas hominis and Trichomitus batrachorum, respectively. This is the first report of S. similis in carabao and of the identification of T. batrachorum in pig using 18S rRNA gene sequence analysis. The generated phylogenetic tree yielded three distinct groups, mostly with moderate to high bootstrap support and in agreement with the most recent classification. The pathogenic potential of the trichomonads in these hosts still needs further investigation.

  16. Massive parallel IGHV gene sequencing reveals a germinal center pathway in origins of human multiple myeloma.

    Cowan, Graeme; Weston-Bell, Nicola J; Bryant, Dean; Seckinger, Anja; Hose, Dirk; Zojer, Niklas; Sahota, Surinder S


    Human multiple myeloma (MM) is characterized by accumulation of malignant terminally differentiated plasma cells (PCs) in the bone marrow (BM), raising the question of when during maturation neoplastic transformation begins. Immunoglobulin IGHV genes carry imprints of clonal tumor history, delineating somatic hypermutation (SHM) events that generally occur in the germinal center (GC). Here, we examine MM-derived IGHV genes using massive parallel deep sequencing, comparing them with profiles in normal BM PCs. In 4/4 presentation IgG MM, monoclonal tumor-derived IGHV sequences revealed significant evidence for intraclonal variation (ICV) in mutation patterns. IGHV sequences of 2/2 normal PC IgG populations revealed dominant oligoclonal expansions, each expansion also displaying mutational ICV. Clonal expansions in MM and in normal BM PCs reveal common IGHV features. In such MM, the data fit a model of tumor origins in which neoplastic transformation is initiated in a GC B-cell committed to terminal differentiation but still targeted by on-going SHM. Strikingly, the data parallel IGHV clonal sequences in some monoclonal gammopathy of undetermined significance (MGUS) known to display on-going SHM imprints. Since MGUS generally precedes MM, these data suggest origins of MGUS and MM with IGHV gene mutational ICV from the same GC B-cell, arising via a distinctive pathway.

  17. Mouse Cmu heavy chain immunoglobulin gene segment contains three intervening sequences separating domains.

    Calame, K; Rogers, J; Early, P; Davis, M; Livant, D; Wall, R; Hood, L


    The IgM molecule is composed of subunits made up of two light chain and two heavy chain (mu) polypeptides. The mu chain is encoded by several gene segments: variable (V), joining (J) and constant (Cmu). The Cmu gene segment is of particular interest for several reasons. First, the mu chain must exist in two very different environments: as an integral membrane protein in receptor IgM molecules (mum) and as soluble serum protein in IgM molecules secreted into the blood (mus). Second, the Cmu region in mus is composed of four homology units or domains (Cmu1, Cmu2, Cmu3 and Cmu4) of approximately 110 amino acid residues plus a C-terminal tail of 19 residues. We asked two questions concerning the organisation of the Cmu gene segment. (1) Are the homology units separated by intervening DNA sequences as has been reported for alpha (ref. 5), gamma 1 (ref. 6) and gamma 2b (ref. 7) heavy chain genes? (2) Is the C-terminal tail separated from the Cmu4 domain by an intervening DNA sequence? If so, DNA rearrangements or RNA splicing could generate hydrophilic and hydrophobic C-terminal tails for the mus and mum polypeptides, respectively. We demonstrate here that intervening DNA sequences separate each of the four coding regions for Cmu domains, and that the coding regions for the Cmu4 domain and the C-terminal tail are directly contiguous.

  18. Breaking the 1000-gene barrier for Mimivirus using ultra-deep genome and transcriptome sequencing

    Claverie Jean-Michel


    Full Text Available Abstract Background Mimivirus, a giant dsDNA virus infecting Acanthamoeba, is the prototype of the Mimiviridae family, the latest addition to the family of the nucleocytoplasmic large DNA viruses (NCLDVs). Its 1.2 Mb genome was initially predicted to encode 917 genes. A subsequent RNA-Seq analysis precisely mapped many transcript boundaries and identified 75 new genes. Findings We now report a much deeper analysis using the SOLiD™ technology combining RNA-Seq of the Mimivirus transcriptome during the infectious cycle (202.4 million reads), and a complete genome re-sequencing (45.3 million reads). This study corrected the genome sequence and identified several single nucleotide polymorphisms. Our results also provided clear evidence of previously overlooked transcription units, including an important RNA polymerase subunit distantly related to Euryarchaea homologues. The total Mimivirus gene count is now 1018, 11% greater than the original annotation. Conclusions This study highlights the huge progress brought about by ultra-deep sequencing for the comprehensive annotation of virus genomes, opening the door to a complete one-nucleotide resolution level description of their transcriptional activity, and to the realistic modeling of the viral genome expression at the ultimate molecular level. This work also illustrates the need to go beyond bioinformatics-only approaches for the annotation of short protein and non-coding genes in viral genomes.

  19. Zooplankton diversity analysis through single-gene sequencing of a community sample

    Nishida Mutsumi


    Full Text Available Abstract Background Oceans cover more than 70% of the earth's surface and are critical for the homeostasis of the environment. Among the components of the ocean ecosystem, zooplankton play vital roles in energy and matter transfer through the system. Despite their importance, understanding of zooplankton biodiversity is limited because of their fragile nature, small body size, and the large number of species from various taxonomic phyla. Here we present the results of single-gene zooplankton community analysis using a method that determines a large number of mitochondrial COI gene sequences from a bulk zooplankton sample. This approach will enable us to estimate the species richness of almost the entire zooplankton community. Results A sample was collected from a depth of 721 m to the surface in the western equatorial Pacific off Pohnpei Island, Micronesia, with a plankton net equipped with a 2-m² mouth opening. A total of 1,336 mitochondrial COI gene sequences were determined from the cDNA library made from the sample. From the determined sequences, the occurrence of 189 species of zooplankton was estimated. BLASTN search results showed high degrees of similarity (>98%) between the query and database for 10 species, including holozooplankton and merozooplankton. Conclusion In conjunction with the Census of Marine Zooplankton and Barcode of Life projects, single-gene zooplankton community analysis will be a powerful tool for estimating the species richness of zooplankton communities.

  20. Large scale in silico identification of MYB family genes from wheat expressed sequence tags.

    Cai, Hongsheng; Tian, Shan; Dong, Hansong


    The MYB proteins constitute one of the largest transcription factor families in plants. Much research has been performed to determine their structures, functions, and evolution, especially in the model plants Arabidopsis and rice. However, this transcription factor family has been much less studied in wheat (Triticum aestivum), for which no genome sequence is yet available. Despite this, expressed sequence tags are an important resource that permits large scale gene identification. In this study, a total of 218 sequences from wheat were identified and confirmed to be putative MYB proteins, including 1RMYB, R2R3-type MYB, 3RMYB, and 4RMYB types. A total of 36 R2R3-type MYB genes with complete open reading frames were obtained. The putative orthologs were assigned in rice and Arabidopsis based on the phylogenetic tree. Tissue-specific expression pattern analyses confirmed the predicted orthologs, which meant that gene information could be inferred from the Arabidopsis genes. Moreover, the motifs flanking the MYB domain were analyzed using the MEME web server. The distribution of motifs among wheat MYB proteins was investigated, and this facilitated subfamily classification.

  1. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing

    Muhammad Naveed


    Full Text Available In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Bacterial isolates were identified by 16S rRNA gene sequence analysis, with taxonomic confirmation on the EzTaxon Server. The identified bacterial strains belonged to 5 genera, i.e., Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relationship of the bacterial strains with the respective genera. Based on phylogenetic analysis, some candidate novel species were also identified. The bacterial strains were also characterized by morphological, physiological and biochemical tests and for the glucose dehydrogenase (gdh) gene, which is involved in phosphate solubilization using the cofactor pyrroloquinoline quinone (PQQ). Seven rhizospheric and 3 root-nodulating strains were positive for the gdh gene. Furthermore, this study confirms a novel association between microbes and their hosts, such as field-grown crops and leguminous and non-leguminous plants. It was concluded that a diverse bacterial population exists in the rhizosphere and root nodules that might be useful in evaluating the mechanisms behind plant-microbe interactions, and that strains QAU-63 and QAU-68, with sequence similarities of 97% and 95%, respectively, might be declared novel after further taxonomic characterization.

  2. Amplification of complete gag gene sequences from geographically distinct equine infectious anemia virus isolates.

    Boldbaatar, Bazartseren; Bazartseren, Tsevel; Koba, Ryota; Murakami, Hironobu; Oguma, Keisuke; Murakami, Kenji; Sentsui, Hiroshi


    In the current study, primers described previously and modified versions of these primers were evaluated for amplification of full-length gag genes from different equine infectious anemia virus (EIAV) strains from several countries, including the USA, Germany and Japan. Each strain was inoculated into a primary horse leukocyte culture, and the full-length gag gene was amplified by reverse transcription polymerase chain reaction. Each amplified gag gene was cloned into a plasmid vector for sequencing, and the detectable copy numbers of target DNA were determined. Use of a mixture of two forward primers and one reverse primer in the polymerase chain reaction enabled the amplification of all EIAV strains used in this study. However, further study is required to confirm these primers as universal for all EIAV strains. The nucleotide sequence of gag is considered highly conserved, as evidenced by the use of gag-encoded capsid proteins as a common antigen for the detection of EIAV in serological tests. However, significant sequence variation in the gag genes of different EIAV strains was found in the current study.

  3. Cloning and sequence analysis of β-actin gene from Aedes albopictus (Diptera: Culicidae)

    Weijie Wang; Xiaobang Hu; Donghui Zhang; Jianhua Jiao; Yan Sun; Lei Ma; Changliang Zhu


    Objective: To obtain the complete β-actin gene from Aedes albopictus. Methods: Total RNA was extracted from C6/36 cells. Degenerate primers were designed based on the β-actin sequences of An. gambiae, Ae. aegypti, Cx. pipiens pallens and D. melanogaster. By RT-PCR, the product was amplified, purified, cloned into the pGT vector and sequenced. The β-actin sequence was aligned and phylogenetically analyzed by the BLAST program and the CLUSTAL W program. Results: A sequence of 1132 bp including an open reading frame of 1131 bp was obtained (GenBank DQ657949). The deduced protein had 376 amino acids. Aligned to SWISS-PROT, it exhibited a high level of identity with β-actins from Anopheles, Drosophila and Culex at the amino acid sequence level. Phylogenetic analysis indicated that Ae. albopictus β-actin was much more homologous with invertebrate β-actin than with vertebrate β-actin. Conclusion: The gene may be used as the internal control in experiments on Ae. albopictus.

  4. Cloning, sequencing and expression of the gene encoding the extracellular metalloprotease of Aeromonas caviae.

    Kawakami, K; Toma, C; Honma, Y


    A gene (apk) encoding the extracellular protease of Aeromonas caviae Ae6 has been cloned and sequenced. For cloning the gene, the genomic DNA library was screened using skim milk LB agar. One clone harboring plasmid pKK3 was selected for sequencing. Nucleotide sequencing of the 3.5 kb region of pKK3 revealed a single open reading frame (ORF) of 1,785 bp encoding 595 amino acids. The deduced polypeptide contained a putative 16-amino acid signal peptide followed by a large propeptide. The N-terminal amino acid sequence of purified recombinant protein (APK) was consistent with the DNA sequence. This result suggested a mature protein of 412 amino acids with a molecular mass of 44 kDa. However, the molecular mass of purified recombinant APK was 34 kDa by SDS-PAGE, suggesting that further processing at the C-terminal region took place. The 2 zinc-binding motifs deduced are highly conserved in APK as well as in other zinc metalloproteases, including Vibrio proteolyticus neutral protease, Emp V from Vibrio vulnificus, HA/P from Vibrio cholerae, and Pseudomonas aeruginosa elastase. Proteolytic activity was inhibited by EDTA, Zincov, 1,10-phenanthroline and tetraethylenepentamine, while it was unaffected by the other inhibitors tested. The protease showed maximum activity at pH 7.0 and was inactivated by heating at 80 °C for 15 min. These results together suggest that APK belongs to the thermolysin family of metalloendopeptidases.
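
    The length figures in this abstract can be checked with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the figure of about 110 Da per residue is a common rule of thumb assumed here, not a value taken from the record. The abstract does not say whether a stop codon is included in the 1,785 bp, so the code only counts codons.

```python
# Rule-of-thumb average mass of one amino acid residue (an assumption,
# not a value reported in the abstract).
AVG_RESIDUE_MASS_DA = 110.0


def codons_in_orf(orf_bp):
    """Return the number of complete codons in an ORF of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3


def approx_mass_kda(n_residues):
    """Rough protein mass in kDa estimated from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    print(codons_in_orf(1785))     # codons in the reported ORF
    print(approx_mass_kda(412))    # rough mass of the 412-residue mature form
```

    Under this assumption the 412-residue mature protein comes out near 45 kDa, in line with the 44 kDa stated, while the 34 kDa band on SDS-PAGE corresponds to roughly 310 residues, which fits the authors' suggestion of further C-terminal processing.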

  5. Sequence analysis of mitochondrial 16S ribosomal RNA gene fragment from seven mosquito species

    Yogesh S Shouche; Milind S Patole


    Mosquitoes are vectors for the transmission of many human pathogens that include viruses, nematodes and protozoa. For the understanding of their vectorial capacity, identification of disease carrying and refractory strains is essential. Recently, molecular taxonomic techniques have been utilized for this purpose. Sequence analysis of the mitochondrial 16S rRNA gene has been used for molecular taxonomy in many insects. In this paper, we have analysed a 450 bp hypervariable region of the mitochondrial 16S rRNA gene in three major genera of mosquitoes, Aedes, Anopheles and Culex. The sequence was found to be unusually A + T rich and in substitutions the rate of transversions was higher than the transition rate. A phylogenetic tree was constructed with these sequences. An interesting feature of the sequences was a stretch of Ts that distinguished between Aedes and Culex on the one hand, and Anopheles on the other. This is the first report of mitochondrial rRNA sequences from these medically important genera of mosquitoes.
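
    The two sequence statistics mentioned in this abstract, A + T richness and the balance of transitions versus transversions, can be computed from a pairwise alignment as in the minimal sketch below. The function names and the short example sequences are hypothetical and are not data from the study; gaps and ambiguous bases are simply skipped.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def at_content(seq):
    """Fraction of unambiguous bases in seq that are A or T."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "AT" for b in bases) / len(bases)


def count_substitutions(seq_a, seq_b):
    """Count (transitions, transversions) between two aligned sequences.

    A transition swaps purine for purine or pyrimidine for pyrimidine;
    a transversion swaps a purine for a pyrimidine or vice versa.
    Columns containing gaps or ambiguous bases are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


if __name__ == "__main__":
    a = "ATTTAAGCTA"  # made-up example, not a real mosquito sequence
    b = "ATTAAAGTTT"
    print(at_content(a))
    print(count_substitutions(a, b))
```

    A transversion count higher than the transition count, as reported here, is unusual because transitions normally dominate among closely related sequences; in a strongly A + T rich region, A-to-T changes are one plausible contributor.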

  6. Patterns of homoeologous gene expression shown by RNA sequencing in hexaploid bread wheat.

    Leach, Lindsey J


    BACKGROUND: Bread wheat (Triticum aestivum) has a large, complex and hexaploid genome consisting of A, B and D homoeologous chromosome sets. Therefore each wheat gene potentially exists as a trio of A, B and D homoeoloci, each of which may contribute differentially to wheat phenotypes. We describe a novel approach combining wheat cytogenetic resources (chromosome substitution 'nullisomic-tetrasomic' lines) with next generation deep sequencing of gene transcripts (RNA-Seq), to directly and accurately identify homoeologue-specific single nucleotide variants and quantify the relative contribution of individual homoeoloci to gene expression. RESULTS: We discover, based on a sample comprising ~5-10% of the total wheat gene content, that at least 45% of wheat genes are expressed from all three distinct homoeoloci. Most of these genes show strikingly biased expression patterns in which expression is dominated by a single homoeolocus. The remaining ~55% of wheat genes are expressed from either one or two homoeoloci only, through a combination of extensive transcriptional silencing and homoeolocus loss. CONCLUSIONS: We conclude that wheat is tending towards functional diploidy, through a variety of mechanisms causing single homoeoloci to become the predominant source of gene transcripts. This discovery has profound consequences for wheat breeding and our understanding of wheat evolution.

  7. Discovery of differentially expressed genes in cashmere goat (Capra hircus) hair follicles by RNA sequencing.

    Qiao, X; Wu, J H; Wu, R B; Su, R; Li, C; Zhang, Y J; Wang, R J; Zhao, Y H; Fan, Y X; Zhang, W G; Li, J Q


    The mammalian hair follicle (HF) is a unique, highly regenerative organ with a distinct developmental cycle. Cashmere goat (Capra hircus) HFs can be divided into two categories based on structure and development time: primary and secondary follicles. To identify differentially expressed genes (DEGs) in the primary and secondary HFs of cashmere goats, the RNA sequencing of six individuals from Arbas, Inner Mongolia, was performed. A total of 617 DEGs were identified; 297 were upregulated while 320 were downregulated. Gene ontology analysis revealed that the main functions of the upregulated genes were electron transport, respiratory electron transport, mitochondrial electron transport, and gene expression. The downregulated genes were mainly involved in cell autophagy, protein complexes, neutrophil aggregation, and bacterial fungal defense reactions. According to the Kyoto Encyclopedia of Genes and Genomes database, these genes are mainly involved in the metabolism of cysteine and methionine, RNA polymerization, and the MAPK signaling pathway, and were enriched in primary follicles. A microRNA-target network revealed that secondary follicles are involved in several important biological processes, such as the synthesis of keratin-associated proteins and enzymes involved in amino acid biosynthesis. In summary, these findings will increase our understanding of the complex molecular mechanisms of HF development and cycling, and provide a basis for the further study of the genes and functions of HF development.

  8. Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering

    Li Weizhong


    Full Text Available Abstract Background The identification and study of proteins from metagenomic datasets can shed light on the roles and interactions of the source organisms in their communities. However, metagenomic datasets are characterized by the presence of organisms with varying GC composition, codon usage biases etc., and consequently gene identification is challenging. The vast amount of sequence data also requires faster protein family classification tools. Results We present a computational improvement to a sequence clustering approach that we developed previously to identify and classify protein coding genes in large microbial metagenomic datasets. The clustering approach can be used to identify protein coding genes in prokaryotes, viruses, and intron-less eukaryotes. The computational improvement is based on an incremental clustering method that does not require the expensive all-against-all computation that was required by the original approach, while still preserving the remote homology detection capabilities. We present evaluations of the clustering approach in protein-coding gene identification and classification, and also present the results of updating the protein clusters from our previous work with recent genomic and metagenomic sequences. The clustering results are available via CAMERA. Conclusion The clustering paradigm is shown to be a very useful tool in the analysis of microbial metagenomic data. The incremental clustering method is shown to be much faster than the original approach in identifying genes, grouping sequences into existing protein families, and also identifying novel families that have multiple members in a metagenomic dataset. These clusters provide a basis for further studies of protein families.
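
    The general idea of incremental clustering, comparing each new sequence only against existing cluster representatives instead of running an all-against-all comparison, can be illustrated with the toy sketch below. It is not the authors' method: the k-mer Jaccard similarity, the 0.5 threshold, the function names and the example peptides are all assumptions chosen for brevity, standing in for a real alignment-based comparison.

```python
def kmers(seq, k=3):
    """Set of overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of k-mer sets (a crude stand-in for alignment identity)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def add_sequences(clusters, sequences, threshold=0.5, k=3):
    """Greedily add sequences to clusters, modifying and returning clusters.

    Each cluster is a list whose first element is its representative.
    A new sequence joins the first cluster whose representative it
    resembles; otherwise it founds a new cluster. Existing members are
    never re-compared, which is what makes the update incremental.
    """
    for seq in sequences:
        for cluster in clusters:
            if similarity(seq, cluster[0], k) >= threshold:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters


if __name__ == "__main__":
    # Made-up peptides used only to exercise the code.
    clusters = add_sequences([], ["MKTAYIAKQR", "MKTAYIAKQK", "GGSGGSGGSG"])
    # A later batch is compared only with the two representatives.
    clusters = add_sequences(clusters, ["GGSGGSGGSA"])
    for c in clusters:
        print(c)
```

    With n sequences and c clusters, this costs on the order of n × c comparisons rather than n², which is the kind of saving the abstract attributes to avoiding the all-against-all step.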

  9. Gene discovery in the threatened elkhorn coral: 454 sequencing of the Acropora palmata transcriptome.

    Nicholas R Polato

    BACKGROUND: Cnidarians, including corals and anemones, offer unique insights into metazoan evolution because they harbor genetic similarities with vertebrates beyond that found in model invertebrates and retain genes known only from non-metazoans. Cataloging genes expressed in Acropora palmata, a foundation species of reefs in the Caribbean and western Atlantic, will advance our understanding of the genetic basis of ecologically important traits in corals and comes at a time when sequencing efforts in other cnidarians allow for multi-species comparisons. RESULTS: A cDNA library from a sample enriched for symbiont-free larval tissue was sequenced on the 454 GS-FLX platform. Over 960,000 reads were obtained and assembled into 42,630 contigs. Annotation data were acquired for 57% of the assembled sequences. Analysis of the assembled sequences indicated that 83-100% of all A. palmata transcripts were tagged, and provided a rough estimate of the total number of genes expressed in our samples (~18,000-20,000). The coral annotation data contained many of the same molecular components as in the Bilateria, particularly in pathways associated with oxidative stress and DNA damage repair, and provided evidence that homologs of p53, a key player in DNA repair pathways, have experienced selection along the branch separating Cnidaria and Bilateria. Transcriptome-wide screens of paralog groups and transition/transversion ratios highlighted genes including green fluorescent proteins, carbonic anhydrase, and oxidative stress proteins, and functional groups involved in protein and nucleic acid metabolism and the formation of structural molecules. These results provide a starting point for study of adaptive evolution in corals. CONCLUSIONS: Currently available transcriptome data now make comparative studies of the mechanisms underlying coral's evolutionary success possible. Here we identified candidate genes that enable corals to maintain genomic integrity despite

  10. Molecular characterization, tissue expression and sequence variability of the barramundi (Lates calcarifer) myostatin gene

    Smith-Keune Carolyn


    Background: Myostatin (MSTN) is a member of the transforming growth factor-β superfamily that negatively regulates growth of skeletal muscle tissue. The gene encoding the MSTN peptide is a consolidated candidate for the enhancement of productivity in terrestrial livestock. This gene potentially represents an important target for growth improvement of cultured finfish. Results: Here we report molecular characterization, tissue expression and sequence variability of the barramundi (Lates calcarifer) MSTN-1 gene. The barramundi MSTN-1 was encoded by three exons 379, 371 and 381 bp in length and translated into a 376-amino acid peptide. Introns 1 and 2 were 412 and 819 bp in length and presented typical GT...AG splicing sites. The upstream region contained cis-regulatory elements such as a TATA-box and E-boxes. A first assessment of sequence variability suggested that higher mutation rates are found in the 5' flanking region, with several SNPs present in this species. A putative microRNA target site has also been observed in the 3'UTR (untranslated region) and is highly conserved across teleost fish. The deduced amino acid sequence was conserved across vertebrates and exhibited characteristic conserved putative functional residues including a proteolytic cleavage motif (RXXR), nine cysteines and two glycosylation sites. A qualitative analysis of the barramundi MSTN-1 expression pattern revealed that, in adult fish, transcripts are differentially expressed in various tissues other than skeletal muscles including gill, heart, kidney, intestine, liver, spleen, eye, gonad and brain. Conclusion: Our findings provide valuable insights such as sequence variation and genomic information which will aid the further investigation of the barramundi MSTN-1 gene in association with growth. The finding for the first time in finfish MSTN of a miRNA target site in the 3'UTR provides an opportunity for the identification of regulatory mutations on the

  11. Comparative sequence analyses of the neurotoxin complex genes in Clostridium botulinum serotypes A, B, E, and F

    Ajay K. Singh


    Neurotoxin complex (NTC) genes are arranged in two known clusters, the hemagglutinin (HA) and open reading frame X (ORFX) clusters. NTC genes have been analyzed in four serotypes (A, B, E and F) of Clostridium botulinum that cause human botulism. Analysis of the amino acid sequences of the NT genes demonstrated significant differences among subtypes and among the four serotypes. The phylogram of the NT genes reveals that serotypes A1 and B1 are much closer to each other than to serotypes E1 and F1. However, the non-toxic non-hemagglutinin (NTNH) gene is highly conserved among the four serotypes. The phylogram of the NTNH gene reveals that serotypes A and F are more closely related to each other than to serotypes B and E. Additionally, the sequences of the HA and ORFX genes are very divergent, but these genes are specific to subtypes and serotypes of Clostridium botulinum. Information derived from sequence analyses of the NTC has direct implications for the development of detection tools and therapeutic countermeasures for botulism.

  12. Hematological- and Neurological-Expressed Sequence 1 Gene Products in Progenitor Cells during Newt Retinal Development

    Tatsushi Goto


    Urodele amphibians such as Japanese common newts have a remarkable ability to regenerate their injured neural retina, even as adults. We found that the hematological- and neurological-expressed sequence 1 (Hn1) gene was induced in depigmented retinal pigment epithelial (RPE) cells, and its expression was maintained at later stages of newt retinal regeneration. In this study, we investigated the distribution of the HN1 protein, the product of the Hn1 gene, in the developing retinas. Our immunohistochemical analyses suggested that the HN1 protein was highly expressed in the immature retina, and that its subcellular localization changed during retinogenesis, as observed in newt retinal regeneration. We also found that the expression of the Hn1 gene was not induced in mouse after retinal removal. Our results showed that the Hn1 gene can be useful for detection of undifferentiated and dedifferentiated cells during both newt retinal development and regeneration.

  13. Identification of a DNA binding protein that recognizes the nonamer recombinational signal sequence of immunoglobulin genes.

    Halligan, B D; Desiderio, S V


    Extracts of nuclei from B- and T-lymphoid cells contain a protein that binds specifically to the conserved nonamer DNA sequence within the recombinational signals of immunoglobulin genes. Complexes with DNA fragments from four kappa light-chain joining (J) segments have the same electrophoretic mobility. Nonamer-containing DNA fragments from heavy-chain and light-chain genes compete for binding. Within the 5'-flanking DNA of the J kappa 4 gene segment, the binding site has been localized to a 27-base-pair interval spanning the nonamer region. The binding activity is recovered as a single peak after ion-exchange chromatography. The site of binding of the protein and its presence in nuclei of lymphoid cells suggest that it may function in the assembly of immunoglobulin genes.

  14. Genepleio Software for Effective Estimation of Gene Pleiotropy from Protein Sequences

    Wenhai Chen


    Though pleiotropy, which refers to the phenomenon of a gene affecting multiple traits, has long played a central role in genetics, development, and evolution, estimating the number of pleiotropy components remains difficult. In this paper, we report a newly developed software package, Genepleio, to estimate the effective gene pleiotropy from phylogenetic analysis of protein sequences. Since this estimate can be interpreted as the minimum pleiotropy of a gene, it can serve as a reference for many empirical pleiotropy measures. This work should facilitate our understanding of how gene pleiotropy affects the pattern of the genotype-phenotype map and the consequences for organismal evolution.

  15. Large-scale Gene Ontology analysis of plant transcriptome-derived sequences retrieved by AFLP technology

    Ramina Angelo


    Background: After 10 years of use of AFLP (Amplified Fragment Length Polymorphism) technology for DNA fingerprinting and mRNA profiling, large repertories of genome- and transcriptome-derived sequences are available in public databases for model, crop and tree species. AFLP marker systems have been and are being extensively exploited for genome scanning and gene mapping, as well as cDNA-AFLP for transcriptome profiling and differentially expressed gene cloning. The evaluation, annotation and classification of genomic markers and expressed transcripts would be of great utility for both functional genomics and systems biology research in plants. This may be achieved by means of the Gene Ontology (GO), consisting of three structured vocabularies (i.e. ontologies) describing genes, transcripts and proteins of any organism in terms of their associated cellular component, biological process and molecular function in a species-independent manner. In this paper, the functional annotation of about 8,000 AFLP-derived ESTs retrieved from the NCBI databases was carried out using GO terminology. Results: Descriptive statistics on the type, size and nature of gene sequences obtained by means of AFLP technology were calculated. The gene products associated with mRNA transcripts were then classified according to the three main GO vocabularies. A comparison of the functional content of cDNA-AFLP records was also performed by splitting the sequence dataset into monocots and dicots and by comparing them to all annotated ESTs of Arabidopsis and rice, respectively. On the whole, the statistical parameters adopted for the in silico AFLP-derived transcriptome-anchored sequence analysis proved to be critical for obtaining reliable GO results. Such an exhaustive annotation may offer a suitable platform for functional genomics, particularly useful in non-model species. Conclusion: Reliable GO annotations of AFLP-derived sequences can be gathered through the optimization

  16. Nematode Diversity of Qingdao Coast Inferred from the 18S Ribosomal RNA Gene Sequence Analysis

    SHEN Xiquan; YANG Guanpin; LIU Yongjian


    The 18S ribosomal DNA gene (18S rDNA) sequences (approximately 1300 bp in length) were amplified with nematode-specific primers from DNA extracted in bulk from the free-living marine nematodes collected from the inter-tidal sediment of the Qingdao coast. The PCR products were cloned, re-amplified, digested with Rsa I and Hin6I restriction endonucleases and separated in agarose gel. Among 17 restriction fragment length types, types 1, 2 and 6 covered 61.2%, 14.4% and 9.3% of the clones analyzed, respectively, while the remaining 14 only covered 21 clones, which accounted for 15.1% of the total. Twenty-four representative clones were sequenced and phylogenetically analyzed by referring to those currently available in the RDP and GenBank databases. Although it was hard to assign these sequences to known species or genera due to the lack of 18S rDNA sequence data for known marine free-living nematodes, the obtained sequences were assigned to the nematodes of Adenophorea. Among them, twelve sequences were close to Pontonema vulgare and Adoncholaimus sp., four to Daptonema procerus and two (identical) to Enoplus brevis. Our results showed that free-living marine nematode diversity could be determined by PCR retrieval and analysis of the 18S rDNA sequences, and that an 18S rDNA sequence could be assigned to a species or a genus only once the 18S rDNA sequences of the free-living marine nematodes had been accumulated to some extent.

  17. Extensive 16S rRNA gene sequence diversity in Campylobacter hyointestinalis strains: taxonomic and applied implications

    Harrington, C.S.; On, Stephen L.W.


    Phylogenetic relationships of Campylobacter hyointestinalis subspecies were examined by means of 16S rRNA gene sequencing. Sequence similarities among C. hyointestinalis subsp. lawsonii strains exceeded 99.0 %, but values among C. hyointestinalis subsp. hyointestinalis strains ranged from 96...... of the genus Campylobacter, emphasizing the need for multiple strain analysis when using 16S rRNA gene sequence comparisons for taxonomic investigations....

  18. Transcriptome profiling of bovine milk oligosaccharide metabolism genes using RNA-sequencing.

    Saumya Wickramasinghe

    This study examines the genes coding for enzymes involved in bovine milk oligosaccharide metabolism by comparing the oligosaccharide profiles with the expression of glycosylation-related genes. Fresh milk samples (n = 32) were collected from four Holstein and Jersey cows at days 1, 15, 90 and 250 of lactation and free milk oligosaccharide profiles were analyzed. RNA was extracted from milk somatic cells at days 15 and 250 of lactation (n = 12) and gene expression analysis was conducted by RNA-Sequencing. A list was created of 121 glycosylation-related genes involved in oligosaccharide metabolism pathways in bovine by analyzing the oligosaccharide profiles and performing an extensive literature search. No significant differences were observed in either oligosaccharide profiles or expression of glycosylation-related genes between Holstein and Jersey cows. The highest concentrations of free oligosaccharides were observed in the colostrum samples, and a sharp decrease was observed in the concentration of free oligosaccharides on day 15, followed by a progressive decrease on days 90 and 250. Ninety-two glycosylation-related genes were expressed in milk somatic cells. Most of these genes exhibited higher expression in day 250 samples, indicating increases in net glycosylation-related metabolism in spite of decreases in free milk oligosaccharides in late lactation milk. Even though fucosylated free oligosaccharides were not identified, gene expression indicated the likely presence of fucosylated oligosaccharides in bovine milk. Fucosidase genes were expressed in milk, and a possible explanation for not detecting fucosylated free oligosaccharides is the degradation of large fucosylated free oligosaccharides by the fucosidases. Detailed characterization of enzymes encoded by the 92 glycosylation-related genes identified in this study will provide the basic knowledge for metabolic network analysis of oligosaccharides in mammalian milk. These candidate

  19. Cloning and sequencing of the gene encoding LipL21 in the vaccinal leptospira serovars

    Rasoul Hoseinpur


    Background: Leptospirosis is a zoonotic disease in humans and animals, caused by the bacterium Leptospira interrogans. The gene encoding LipL21 is one of the genes identified in the bacterium, and it exists only in the pathogenic strains. The aim of this study was to clone and analyze the sequence of the gene encoding the surface lipoprotein LipL21 in five vaccinal leptospira serovars in Iran. Material and Methods: Pathogenic Leptospira interrogans serovars were cultured in EMJH medium with 10% rabbit serum. After genomic DNA extraction, PCR with specific primers was employed, and the resulting product was inserted into a vector and then transferred into E. coli DH5α. The recombinant plasmids were finally sent for sequencing. Results: The analysis of the lipL21 gene in domestic vaccinal serovars and comparison with other serovars in the GenBank database revealed that three vaccinal serovars, serjo hardjo, canicola and pomona, had 100% similarity with each other, and the grippotyphosa serovar showed the greatest difference from the other vaccinal serovars. In general, the results showed that this gene is highly conserved in the domestic vaccinal serovars and the serovars in the GenBank database, with more than 95.7 percent similarity. Conclusion: These results showed that the lipL21 gene is highly conserved in the vaccinal serovars (similarities > 96.4 %). Therefore, the gene encoding the surface protein LipL21 can serve as the basis of a useful serologic test with high specificity and sensitivity for diagnosis of leptospirosis in clinical samples and, in future, as an effective subunit vaccine candidate.

  20. Target genes of microsatellite sequences in head and neck squamous cell carcinoma: mononucleotide repeats are not detected.

    Wang, Yimin; Liu, Xuejuan; Li, Yulin


    Microsatellite instability (MSI) is detected in a wide variety of tumors. It is thought that mismatch repair gene mutation or inactivation is the major cause of MSI. Microsatellite sequences are predominantly distributed in intergenic or intronic DNA. However, MSI is found in the exonic sequences of some genes, causing their inactivation. In this report, we searched GenBank for candidate genes containing potential MSI sequences in exonic regions. Twenty-seven target genes were selected for MSI analysis. Instability was found in 70% of these genes (14/20) with head and neck squamous cell carcinoma (HNSCC). Interestingly, no instability was detected in mononucleotide repeats in genes or in intergenic sequences. We conclude that instability of mononucleotide repeats is a rare event in HNSCC. The high-MSI phenotype in young HNSCC patients is limited to noncoding regions only. The MSI percentage in HNSCC tumors is closely related to the repeat type, repeat location and patient's age.

  1. Deep RNA sequencing analysis of readthrough gene fusions in human prostate adenocarcinoma and reference samples

    Nacu Serban


    Background: Readthrough fusions across adjacent genes in the genome, or transcription-induced chimeras (TICs), have been estimated using expressed sequence tag (EST) libraries to involve 4-6% of all genes. Deep transcriptional sequencing (RNA-Seq) now makes it possible to study the occurrence and expression levels of TICs in individual samples across the genome. Methods: We performed single-end RNA-Seq on three human prostate adenocarcinoma samples and their corresponding normal tissues, as well as brain and universal reference samples. We developed two bioinformatics methods to specifically identify TIC events: a targeted alignment method using artificial exon-exon junctions within 200,000 bp from adjacent genes, and genomic alignment allowing splicing within individual reads. We performed further experimental verification and characterization of selected TIC and fusion events using quantitative RT-PCR and comparative genomic hybridization microarrays. Results: Targeted alignment against artificial exon-exon junctions yielded 339 distinct TIC events, including 32 gene pairs with multiple isoforms. The false discovery rate was estimated to be 1.5%. Spliced alignment to the genome was less sensitive, finding only 18% of those found by targeted alignment in 33-nt reads and 59% of those in 50-nt reads. However, spliced alignment revealed 30 cases of TICs with intervening exons, in addition to distant inversions, scrambled genes, and translocations. Our findings increase the catalog of observed TIC gene pairs by 66%. We verified 6 of 6 predicted TICs in all prostate samples, and 2 of 5 predicted novel distant gene fusions, both private events among 54 prostate tumor samples tested. Expression of TICs correlates with that of the upstream gene, which can explain the prostate-specific pattern of some TIC events and the restriction of the SLC45A3-ELK4 e4-e2 TIC to ERG-negative prostate samples, as confirmed in 20 matched prostate tumor and normal

  2. Differential Diagnosis of Two Chinese Families with Dyschromatoses by Targeted Gene Sequencing

    Jia-Wei Liu; Asan; Jun Sun; Sergio Vano-Galvan; Feng-Xia Liu; Xiu-Xiu Wei; Dong-Lai Ma


    Background: The dyschromatoses are a group of disorders characterized by simultaneous hyperpigmented macules together with hypopigmented macules. Dyschromatosis universalis hereditaria (DUH) and dyschromatosis symmetrica hereditaria are two major types. While clinical and histological presentations are similar in these two diseases, genetic diagnosis is critical in the differential diagnosis of these entities. Methods: Three patients initially diagnosed with DUH were included. The gene test was carried out by targeted gene sequencing. All mutations detected on ADAR1 and ABCB6 genes were analyzed according to the frequency in control database, the mutation types, and the published evidence to determine the pathogenicity. Results: Family pedigree and clinical presentations were reported in 3 patients from two Chinese families. All patients have prominent cutaneous dyschromatoses involving the whole body without systemic complications. Different pathogenic genes in these patients with similar phenotype were identified: One novel mutation on ADAR1 (c.1325C>G) and one recurrent mutation in ABCB6 (c.1270T>C), which successfully distinguished two diseases with the similar phenotype. Conclusion: Targeted gene sequencing is an effective tool for genetic diagnosis in pigmentary skin diseases.

  3. Chromosomal Organization and Sequence Diversity of Genes Encoding Lachrymatory Factor Synthase in Allium cepa L.

    Masamura, Noriya; McCallum, John; Khrustaleva, Ludmila; Kenel, Fernand; Pither-Joyce, Meegham; Shono, Jinji; Suzuki, Go; Mukai, Yasuhiko; Yamauchi, Naoki; Shigyo, Masayoshi


    Lachrymatory factor synthase (LFS) catalyzes the formation of lachrymatory factor, one of the most distinctive traits of bulb onion (Allium cepa L.). Therefore, we used LFS as a model for a functional gene in a huge genome, and we examined the chromosomal organization of LFS in A. cepa by multiple approaches. The first-level analysis completed the chromosomal assignment of LFS gene to chromosome 5 of A. cepa via the use of a complete set of A. fistulosum-shallot (A. cepa L. Aggregatum group) monosomic addition lines. Subsequent use of an F(2) mapping population from the interspecific cross A. cepa × A. roylei confirmed the assignment of an LFS locus to this chromosome. Sequence comparison of two BAC clones bearing LFS genes, LFS amplicons from diverse germplasm, and expressed sequences from a doubled haploid line revealed variation consistent with duplicated LFS genes. Furthermore, the BAC-FISH study using the two BAC clones as a probe showed that LFS genes are localized in the proximal region of the long arm of the chromosome. These results suggested that LFS in A. cepa is transcribed from at least two loci and that they are localized on chromosome 5.

  4. Complete genome sequence of Bacillus oceanisediminis 2691, a reservoir of heavy-metal resistance genes.

    Jung, Jaejoon; Jeong, Haeyoung; Kim, Hyun Ju; Lee, Dong-Woo; Lee, Sang Jun


    Ocean sediments are commonly subject to pollution by various heavy metals. Intracellular heavy metal concentrations in marine microorganisms should be kept within allowable concentrations. Here, we report redundant heavy metal resistance-related genes encoding heavy metal-sensing transcriptional regulators (i.e. cadC), heavy metal efflux pumps, and detoxifying enzymes in the complete genome sequence of Bacillus oceanisediminis 2691. By comparing CadC sequences of strain 2691 with those from other bacterial genomes, we demonstrated that each cadC gene located in the chromosome or plasmid of 2691 cells is similar to those of various near or distant microbes, which might shed light on evolutionary trajectories of redundant heavy metal resistance genes. In terms of applications, these diverse heavy metal-sensing genes can be harnessed as synthetic biological parts, modules, and devices for the development of heavy metal-specific biosensors. Heavy metal bioremediation technologies or platform cells can also be developed based on the marine genomic information of heavy metal resistance and/or detoxification genes in a bacterial isolate from ocean sediments.

  5. Human case of bacteremia caused by Streptococcus canis sequence type 9 harboring the scm gene.

    Taniyama, Daisuke; Abe, Yoshihiko; Sakai, Tetsuya; Kikuchi, Takahide; Takahashi, Takashi


    Streptococcus canis (Sc) is a zoonotic pathogen that is transferred mainly from companion animals to humans. One of the major virulence factors in Sc is the M-like protein encoded by the scm gene, which is involved in anti-phagocytic activities, as well as the recruitment of plasminogen to the bacterial surface in cooperation with enolase, and the consequent enhancement of bacterial transmigration and survival. This is the first reported human case of uncomplicated bacteremia following a dog bite, caused by Streptococcus canis harboring the scm gene. The similarity of the 16S rRNA from the infecting species to that of the Sc type strain, as well as the amplification of the species-specific cfg gene, encoding a co-hemolysin, was used to confirm the species identity. Furthermore, the isolate was confirmed as sequence type 9. The partial scm gene sequence harbored by the isolate was closely related to those of two other Sc strains. While this isolate did not possess the erm(A), erm(B), or mef(A) macrolide/lincosamide resistance genes, it was not susceptible to azithromycin: its susceptibility was intermediate. Even though human Sc bacteremia is rare, clinicians should be aware of this microorganism, as well as Pasteurella sp., Prevotella sp., and Capnocytophaga sp., when examining and treating patients with fever who maintain close contact with companion animals.

  6. Identification of regulatory sequences in the gene for 5-aminolevulinate synthase from rat.

    Braidotti, G; Borthwick, I A; May, B K


    The housekeeping enzyme 5-aminolevulinate synthase (ALAS) regulates the supply of heme for respiratory cytochromes. Here we report on the isolation of a genomic clone for the rat ALAS gene. The 5'-flanking region was fused to the chloramphenicol acetyltransferase gene and transient expression analysis revealed the presence of both positive and negative cis-acting sequences. Expression was substantially increased by the inclusion of the first intron located in the 5'-untranslated region. Sequence analysis of the promoter identified two elements at positions -59 and -88 bp with strong similarity to the binding site for nuclear respiratory factor 1 (NRF-1). Gel shift analysis revealed that both NRF-1 elements formed nucleoprotein complexes which could be abolished by an authentic NRF-1 oligomer. Mutagenesis of each NRF-1 motif in the ALAS promoter gave substantially lowered levels of chloramphenicol acetyltransferase expression, whereas mutagenesis of both NRF-1 motifs resulted in the almost complete loss of expression. These results establish that the NRF-1 motifs in the ALAS promoter are critical for promoter activity. NRF-1 binding sites have been identified in the promoters of several nuclear genes encoding mitochondrial proteins concerned with oxidative phosphorylation. The present studies suggest that NRF-1 may co-ordinate the supply of mitochondrial heme with the synthesis of respiratory cytochromes by regulating expression of ALAS. In erythroid cells, NRF-1 may be less important for controlling heme levels since an erythroid ALAS gene is strongly expressed and the promoter for this gene apparently lacks NRF-1 binding sites.

  7. Phylogenetic Relationships of Pseudorasbora, Pseudopungtungia, and Pungtungia (Teleostei; Cypriniformes; Gobioninae) Inferred from Multiple Nuclear Gene Sequences

    Keun-Yong Kim


    Gobionine species belonging to the genera Pseudorasbora, Pseudopungtungia, and Pungtungia (Teleostei; Cypriniformes; Cyprinidae) have been heavily studied because of problems of taxonomy, threats of extinction, invasion, and human health. Nucleotide sequences of three nuclear genes, that is, recombination activating protein gene 1 (rag1), recombination activating gene 2 (rag2), and early growth response 1 gene (egr1), from Pseudorasbora, Pseudopungtungia, and Pungtungia species residing in China, Japan, and Korea, were analyzed to elucidate their intergeneric and interspecific phylogenetic relationships. In the phylogenetic tree inferred from their multiple gene sequences, Pseudorasbora, Pseudopungtungia and Pungtungia species ramified into three phylogenetically distinct clades: the “tenuicorpa” clade composed of Pseudopungtungia tenuicorpa, the “parva” clade composed of all Pseudorasbora species/subspecies, and the “herzi” clade composed of Pseudopungtungia nigra and Pungtungia herzi. The genus Pseudorasbora was recovered as monophyletic, while the genus Pseudopungtungia was recovered as polyphyletic. Our phylogenetic result implies the unstable taxonomic status of the genus Pseudopungtungia.

  8. Cloning and Sequence of Glycoprotein H Gene of Duck Plague Virus

    HAN Xian-jie; WANG Jun-wei; MA Bo


    The glycoprotein H (gH) gene homologue of duck plague virus (DPV) was cloned by degenerate polymerase chain reaction (PCR) and sequenced. It was located immediately downstream from the thymidine kinase gene (TK). In addition, the 3'-end of the gene homologue to herpesvirus UL21 was located downstream from the gH gene. The DPV gH gene open reading frame (ORF) was 2 505 bp in length and its primary translation product was a polypeptide 834 amino acids long. It possessed several characteristics of membrane glycoproteins, including an N-terminal hydrophobic signal sequence, an external domain containing eight putative N-linked glycosylation sites, a C-terminal transmembrane domain, and a charged cytoplasmic tail. Comparison with other herpesviruses revealed identities of 20.2, 25.1, 23.0, 23.0, 26.5 and 26.0% with the gH counterparts of human herpesvirus 1 (HSV1), equine herpesvirus 4 (EHV4), bovine herpesvirus 1 (BHV1), pseudorabies virus (PRV), gallid herpesvirus 2 (GHV2) and gallid herpesvirus 3 (GHV3), respectively.

  9. Boronated monoclonal antibody 225.28S for potential use in neutron capture therapy of malignant melanoma

    Tamat, S.R.; Moore, D.E.; Patwardhan, A.; Hersey, P. (Univ. of Sydney (Australia))


    The concept of conjugating boron cluster compounds to monoclonal antibodies has been examined by several groups of research workers in boron neutron capture therapy (BNCT). The procedures reported to date for boronation of monoclonal antibodies resulted in either an inadequate level of boron incorporation, the precipitation of the conjugates, or a loss of immunological activity. The present report describes the conjugation of dicesium-mercapto-undecahydrododecaborate (Cs2B12H11SH) to the 225.28S monoclonal antibody directed against high molecular weight melanoma-associated antigens (HMW-MAA), using poly-L-ornithine as a bridge to increase the carrying capacity of the antibody and to minimize change in the conformational structure of the antibody. The method produces a boron content of 1,300 to 1,700 B atoms per molecule of 225.28S while retaining the immunoreactivity. The homogeneity of the boron-monoclonal antibody conjugates has been characterized by gel electrophoresis and ion-exchange HPLC.

  10. Gene Identification and Expression Analysis of 86,136 Expressed Sequence Tags (EST) from the Rice Genome

    Yan Zhou; Lin Ye; Li Lin; Jun Li; Xuegang Wang; Hao Xu; Yibin Pan; Wei Lin; Wei Tian; Jing Liu; Liping Wei; Jiabin Tang; Siqi Liu; Huanming Yang; Jun Yu; Jian Wang; Michael G. Walker; Xiuqing Zhang; Jun Wang; Songnian Hu; Huayong Xu; Yajun Deng; Jianhai Dong


    Expressed Sequence Tag (EST) analysis has pioneered genome-wide gene discovery and expression profiling. In order to establish a gene expression index in the rice cultivar indica, we sequenced and analyzed 86,136 ESTs from nine rice cDNA libraries from the super hybrid cultivar LYP9 and its parental cultivars. We assembled these ESTs into 13,232 contigs, leaving 8,976 singletons. Overall, 7,497 sequences were found to be similar to existing sequences in GenBank and 14,711 are novel. These sequences are classified by molecular function, biological process and pathways according to the Gene Ontology. We compared our sequenced ESTs with the publicly available 95,000 ESTs from japonica, and found little sequence variation, despite the large difference between genome sequences. We then assembled the combined 173,000 rice ESTs for further analysis. Using the pooled ESTs, we compared gene expression in metabolic pathways between rice and Arabidopsis according to KEGG. We further profiled gene expression patterns in different tissues, developmental stages, and in a conditional sterile mutant, after checking that the libraries are comparable by means of sequence coverage. We also identified some possible library-specific genes and a number of enzymes and transcription factors that contribute to rice development.

  11. Transcriptome Sequencing Identified Genes and Gene Ontologies Associated with Early Freezing Tolerance in Maize

    Li, Zhao; Hu, Guanghui; Liu, Xiangfeng; Zhou, Yao; Li, Yu; Zhang, Xu; Yuan, Xiaohui; Zhang, Qian; Yang, Deguang; Wang, Tianyu; Zhang, Zhiwu


    Originating in a tropical climate, maize has faced great challenges as cultivation has expanded to the majority of the world's temperate zones. In these zones, frost and cold temperatures are major factors that prevent maize from reaching its full yield potential. Among 30 elite maize inbred lines adapted to northern China, we identified two lines of extreme, but opposite, freezing tolerance levels: highly tolerant and highly sensitive. During the seedling stage of these two lines, we used RNA-seq to measure changes in the maize whole-genome transcriptome before and after freezing treatment. In total, 19,794 genes were expressed, of which 4550 exhibited differential expression due to either treatment (before or after freezing) or line type (tolerant or sensitive). Of the 4550 differentially expressed genes, 948 exhibited differential expression due to treatment within line or lines under freezing condition. Analysis of gene ontology found that these 948 genes were significantly enriched for binding functions (DNA binding, ATP binding, and metal ion binding), protein kinase activity, and peptidase activity. Based on their enrichment, literature support, and significant levels of differential expression, 30 of these 948 genes were selected for quantitative real-time PCR (qRT-PCR) validation. The validation confirmed our RNA-seq-based findings, with squared correlation coefficients of 80% and 50% in the tolerant and sensitive lines, respectively. This study provided valuable resources for further studies to enhance understanding of the molecular mechanisms underlying the maize early freezing response and to enable targeted breeding strategies for developing varieties with superior frost resistance to achieve yield potential. PMID:27774095

  12. Sequence analysis of the inversion region containing the pilin genes of Moraxella bovis.

    Fulks, K A; Marrs, C F; Stevens, S P; Green, M R


    Moraxella bovis EPP63 is able to produce two antigenically distinct pili called Q and I pili (previously called beta and alpha pili). Hybridization studies have shown that the transition between the types is due to inversion of a 2.1-kilobase segment of chromosomal DNA. We present the sequence of a 4.1-kilobase region of cloned DNA spanning the entire inversion region in orientation 1 (Q pilin expressed). Comparison of this sequence with the sequence of the polymerase chain reaction-amplified genomic DNA from orientation 2 (I pilin expressed) allows the site-specific region of recombination to be localized to a 26-base-pair region in which sequence similarity to the left inverted repeat of the Salmonella typhimurium hin system was previously noted. In addition, 50% sequence similarity was seen in a 60-base-pair segment of our sequence to the recombinational enhancer of bacteriophage P1, an inversion system related to the hin system of S. typhimurium. Finally, two open reading frames representing potential genes were identified.

  13. The genome sequence of black cottonwood (Populus trichocarpa) reveals 18 conserved cellulose synthase (CesA) genes.

    Djerbi, Soraya; Lindskog, Mats; Arvestad, Lars; Sterky, Fredrik; Teeri, Tuula T


    The genome sequence of Populus trichocarpa was screened for genes encoding cellulose synthases by using full-length cDNA sequences and ESTs previously identified in the tissue specific cDNA libraries of other poplars. The data obtained revealed 18 distinct CesA gene sequences in P. trichocarpa. The identified genes were grouped in seven gene pairs, one group of three sequences and one single gene. Evidence from gene expression studies of hybrid aspen suggests that both copies of at least one pair, CesA3-1 and CesA3-2, are actively transcribed. No sequences corresponding to the gene pair, CesA6-1 and CesA6-2, were found in Arabidopsis or hybrid aspen, while one homologous gene has been identified in the rice genome and an active transcript in Populus tremuloides. A phylogenetic analysis suggests that the CesA genes previously associated with secondary cell wall synthesis originate from a single ancestor gene and group in three distinct subgroups. The newly identified copies of CesA genes in P. trichocarpa give rise to a number of new questions concerning the mechanism of cellulose synthesis in trees.

  14. Molecular cloning and nucleotide sequence of a transforming gene detected by transfection of chicken B-cell lymphoma DNA

    Goubin, Gerard; Goldman, Debra S.; Luce, Judith; Neiman, Paul E.; Cooper, Geoffrey M.


    A transforming gene detected by transfection of chicken B-cell lymphoma DNA has been isolated by molecular cloning. It is homologous to a conserved family of sequences present in normal chicken and human DNAs but is not related to transforming genes of acutely transforming retroviruses. The nucleotide sequence of the cloned transforming gene suggests that it encodes a protein that is partially homologous to the amino terminus of transferrin and related proteins although only about one tenth the size of transferrin.

  15. Nucleotide sequence analysis of the Legionella micdadei mip gene, encoding a 30-kilodalton analog of the Legionella pneumophila Mip protein

    Bangsborg, Jette Marie; Cianciotto, N P; Hindersson, P


    After the demonstration of analogs of the Legionella pneumophila macrophage infectivity potentiator (Mip) protein in other Legionella species, the Legionella micdadei mip gene was cloned and expressed in Escherichia coli. DNA sequence analysis of the L. micdadei mip gene contained in the plasmid p...... homology with the mip-like genes of several Legionella species. Furthermore, amino acid sequence comparisons revealed significant homology to two eukaryotic proteins with isomerase activity (FK506-binding proteins)....

  16. Rapid high resolution genotyping of Francisella tularensis by whole genome sequence comparison of annotated genes ("MLST+").

    Markus H Antwerpen

    Full Text Available The zoonotic disease tularemia is caused by the bacterium Francisella tularensis. This pathogen is considered as a category A select agent with potential to be misused in bioterrorism. Molecular typing based on DNA-sequence like canSNP-typing or MLVA has become the accepted standard for this organism. Due to the organism's highly clonal nature, the current typing methods have reached their limit of discrimination for classifying closely related subpopulations within the subspecies F. tularensis ssp. holarctica. We introduce a new gene-by-gene approach, MLST+, based on whole genome data of 15 sequenced F. tularensis ssp. holarctica strains and apply this approach to investigate an epidemic of lethal tularemia among non-human primates in two animal facilities in Germany. Due to the high resolution of MLST+ we are able to demonstrate that three independent clones of this highly infectious pathogen were responsible for these spatially and temporally restricted outbreaks.

  17. Cloning and Sequence Analysis of the gp41 Gene of Clanis bilineata Nuclear Polyhedrosis Virus

    ZHU Shan-ying; WANG Wen-bing; ZHU Jiang


    Clanis bilineata nuclear polyhedrosis virus (CbNPV) was purified from Clanis bilineata larvae. To obtain molecular information on the virus, the genomic DNA of CbNPV was extracted, and a DNA fragment library of the virus was constructed using a shotgun approach. The positive clones were then sequenced and analyzed. An open reading frame (ORF) that has high identity with the gp41 gene of most NPVs was found in the library. The gp41 gene of CbNPV is 933 base pairs long and encodes a protein of 310 amino acids. Amino acid sequence analysis showed that CbNPV gp41 has 53-61% and 56-73% identity with the gp41 proteins of Group Ⅰ and Group Ⅱ NPVs, respectively. The result indicates that the isolated CbNPV is a novel baculovirus and that CbNPV shares a much closer relationship with Group Ⅱ NPVs.
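
    The relationship between the reported gene length and protein length can be checked with a short calculation. The sketch below is illustrative only and is not taken from the study: it assumes the reported 933 base pairs include the stop codon, and the function name `protein_length_from_orf` is hypothetical.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumption (not stated in the abstract): the ORF length counts the
    stop codon, which does not code for an amino acid.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3  # three nucleotides per codon
    return codons - 1 if includes_stop else codons


# 933 bp -> 311 codons -> 310 amino acids once the stop codon is excluded
print(protein_length_from_orf(933))
```

    Under that assumption the result agrees with the 310 amino acids stated in the abstract.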

  18. Coptotermes gestroi (Isoptera: Rhinotermitidae) in Brazil: possible origins inferred by mitochondrial cytochrome oxidase II gene sequences.

    Martins, C; Fontes, L R; Bueno, O C; Martins, V G


    The Asian subterranean termite, Coptotermes gestroi, originally from northeast India through Burma, Thailand, Malaysia, and the Indonesian archipelago, is a major termite pest introduced in several countries around the world, including Brazil. We sequenced the mitochondrial COII gene from individuals representing 23 populations. Phylogenetic analysis of COII gene sequences from this and other studies resulted in two main groups: (1) populations of Cleveland (USA) and four populations of Malaysia and (2) populations of Brazil, four populations of Malaysia, and one population from each of Thailand, Puerto Rico, and Key West (USA). Three new localities are reported here, considerably enlarging the distribution of C. gestroi in Brazil: Campo Grande (state of Mato Grosso do Sul), Itajaí (state of Santa Catarina), and Porto Alegre (state of Rio Grande do Sul).

  19. Whole Blood Transcriptome Sequencing Reveals Gene Expression Differences between Dapulian and Landrace Piglets.

    Hu, Jiaqing; Yang, Dandan; Chen, Wei; Li, Chuanhao; Wang, Yandong; Zeng, Yongqing; Wang, Hui


    There is little genomic information regarding gene expression differences at the whole blood transcriptome level of different pig breeds at the neonatal stage. To solve this, we characterized differentially expressed genes (DEGs) in the whole blood of Dapulian (DPL) and Landrace piglets using RNA-seq (RNA-sequencing) technology. In this study, 83 DEGs were identified between the two breeds. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses identified immune response and metabolism as the most commonly enriched terms and pathways in the DEGs. Genes related to immunity and lipid metabolism were more highly expressed in the DPL piglets, while genes related to body growth were more highly expressed in the Landrace piglets. Additionally, the DPL piglets had twofold more single nucleotide polymorphisms (SNPs) and alternative splicing (AS) than the Landrace piglets. These results expand our knowledge of the genes transcribed in the piglet whole blood of two breeds and provide a basis for future research of the molecular mechanisms underlying the piglet differences.

  20. Comparison of inherently essential genes of Porphyromonas gingivalis identified in two transposon-sequencing libraries.

    Hutcherson, J A; Gogeneni, H; Yoder-Himes, D; Hendrickson, E L; Hackett, M; Whiteley, M; Lamont, R J; Scott, D A


    Porphyromonas gingivalis is a Gram-negative anaerobe and keystone periodontal pathogen. A mariner transposon insertion mutant library has recently been used to define 463 genes as putatively essential for the in vitro growth of P. gingivalis ATCC 33277 in planktonic culture (Library 1). We have independently generated a transposon insertion mutant library (Library 2) for the same P. gingivalis strain and herein compare genes that are putatively essential for in vitro growth in complex media, as defined by both libraries. In all, 281 genes (61%) identified by Library 1 were common to Library 2. Many of these common genes are involved in fundamentally important metabolic pathways, notably pyrimidine cycling as well as lipopolysaccharide, peptidoglycan, pantothenate and coenzyme A biosynthesis, and nicotinate and nicotinamide metabolism. Also in common are genes encoding heat-shock protein homologues, sigma factors, enzymes with proteolytic activity, and the majority of sec-related protein export genes. In addition to facilitating a better understanding of critical physiological processes, transposon-sequencing technology has the potential to identify novel strategies for the control of P. gingivalis infections. Those genes defined as essential by two independently generated TnSeq mutant libraries are likely to represent particularly attractive therapeutic targets.
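
    The reported overlap (281 of the 463 Library 1 genes, 61%) is a set intersection expressed as a fraction of Library 1. The sketch below only illustrates that arithmetic: the gene identifiers are made-up placeholders, the size chosen for Library 2 is arbitrary (the abstract does not give it), and `shared_fraction` is a hypothetical helper, not the authors' pipeline.

```python
def shared_fraction(reference, other):
    """Return (number shared, fraction of `reference` also present in `other`)."""
    reference = set(reference)
    if not reference:
        raise ValueError("reference set must not be empty")
    shared = reference & set(other)
    return len(shared), len(shared) / len(reference)


# Placeholder identifiers: 463 genes in Library 1, of which 281 also occur
# in an arbitrarily sized Library 2 (its true size is not in the abstract).
library1 = {f"gene{i:04d}" for i in range(463)}
library2 = {f"gene{i:04d}" for i in range(182, 700)}

n_shared, fraction = shared_fraction(library1, library2)
print(n_shared, f"{fraction:.0%}")
```

    With these placeholder sets the shared count is 281 and the fraction rounds to 61%, matching the figures in the abstract.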

  1. Whole Blood Transcriptome Sequencing Reveals Gene Expression Differences between Dapulian and Landrace Piglets

    Jiaqing Hu


    Full Text Available There is little genomic information regarding gene expression differences at the whole blood transcriptome level of different pig breeds at the neonatal stage. To solve this, we characterized differentially expressed genes (DEGs) in the whole blood of Dapulian (DPL) and Landrace piglets using RNA-seq (RNA-sequencing) technology. In this study, 83 DEGs were identified between the two breeds. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses identified immune response and metabolism as the most commonly enriched terms and pathways in the DEGs. Genes related to immunity and lipid metabolism were more highly expressed in the DPL piglets, while genes related to body growth were more highly expressed in the Landrace piglets. Additionally, the DPL piglets had twofold more single nucleotide polymorphisms (SNPs) and alternative splicing (AS) than the Landrace piglets. These results expand our knowledge of the genes transcribed in the piglet whole blood of two breeds and provide a basis for future research of the molecular mechanisms underlying the piglet differences.

  2. Advancing Eucalyptus genomics: identification and sequencing of lignin biosynthesis genes from deep-coverage BAC libraries

    Kudrna David


    Full Text Available Abstract Background: Eucalyptus species are among the most planted hardwoods in the world because of their rapid growth, adaptability and valuable wood properties. The development and integration of genomic resources into breeding practice will be increasingly important in the decades to come. Bacterial artificial chromosome (BAC) libraries are key genomic tools that enable positional cloning of important traits, synteny evaluation, and the development of genome framework physical maps for genetic linkage and genome sequencing. Results: We describe the construction and characterization of two deep-coverage BAC libraries, EG_Ba and EG_Bb, obtained from nuclear DNA fragments of E. grandis (clone BRASUZ1) digested with HindIII and BstYI, respectively. Genome coverages of 17 and 15 haploid genome equivalents were estimated for EG_Ba and EG_Bb, respectively. Both libraries contained large inserts, with average sizes ranging from 135 Kb (EG_Bb) to 157 Kb (EG_Ba), and very low extra-nuclear genome contamination, providing a probability of finding a single-copy gene ≥ 99.99%. Libraries were screened for the presence of several genes of interest via hybridizations to high-density BAC filters followed by PCR validation. Five selected BAC clones were sequenced and assembled using the Roche GS FLX technology, providing the whole sequence of the E. grandis chloroplast genome and complete genomic sequences of important lignin biosynthesis genes. Conclusions: The two E. grandis BAC libraries described in this study represent an important milestone for the advancement of Eucalyptus genomics and forest tree research. These BAC resources have a highly redundant genome coverage (> 15×), contain large average inserts and have a very low percentage of clones with organellar DNA or empty vectors. These publicly available BAC libraries are thus suitable for a broad range of applications in genetic and genomic research in Eucalyptus and possibly in related species of Myrtaceae.

  3. Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes

    Butler Margaret I


    Full Text Available Abstract Background: Inteins are self-splicing protein elements. They are translated as inserts within host proteins that excise themselves and ligate the flanking portions of the host protein (exteins) with a peptide bond. They are encoded as in-frame insertions within the genes for the host proteins. Inteins are found in all three domains of life and in viruses, but have a very sporadic distribution. Only a small number of intein coding sequences have been identified in eukaryotic nuclear genes, and all of these are from ascomycete or basidiomycete fungi. Results: We identified seven intein coding sequences within nuclear genes coding for the second largest subunits of RNA polymerase. These sequences were found in diverse eukaryotes: one is in the second largest subunit of RNA polymerase I (RPA2) from the ascomycete fungus Phaeosphaeria nodorum, one is in the RNA polymerase III (RPC2) of the slime mould Dictyostelium discoideum and four intein coding sequences are in RNA polymerase II genes (RPB2), one each from the green alga Chlamydomonas reinhardtii, the zygomycete fungus Spiromyces aspiralis and the chytrid fungi Batrachochytrium dendrobatidis and Coelomomyces stegomyiae. The remaining intein coding sequence is in a viral relic embedded within the genome of the oomycete Phytophthora ramorum. The Chlamydomonas and Dictyostelium inteins are the first nuclear-encoded inteins found outside of the fungi. These new inteins represent a unique dataset: they are found in homologous proteins that form a paralogous group. Although these paralogues diverged early in eukaryotic evolution, their sequences can be aligned over most of their length. The inteins are inserted at multiple distinct sites, each of which corresponds to a highly conserved region of RNA polymerase. This dataset supports earlier work suggesting that inteins preferentially occur in highly conserved regions of their host proteins. Conclusion: The identification of these new inteins

  4. Characterisation of a DNA sequence element that directs Dictyostelium stalk cell-specific gene expression.

    Ceccarelli, A; Zhukovskaya, N; Kawata, T; Bozzaro, S; Williams, J


    The ecmB gene of Dictyostelium is expressed at culmination both in the prestalk cells that enter the stalk tube and in ancillary stalk cell structures such as the basal disc. Stalk tube-specific expression is regulated by sequence elements within the cap-site proximal part of the promoter, the stalk tube (ST) promoter region. Dd-STATa, a member of the STAT transcription factor family, binds to elements present in the ST promoter-region and represses transcription prior to entry into the stalk tube. We have characterised an activatory DNA sequence element, that lies distal to the repressor elements and that is both necessary and sufficient for expression within the stalk tube. We have mapped this activator to a 28 nucleotide region (the 28-mer) within which we have identified a GA-containing sequence element that is required for efficient gene transcription. The Dd-STATa protein binds to the 28-mer in an in vitro binding assay, and binding is dependent upon the GA-containing sequence. However, the ecmB gene is expressed in a Dd-STATa null mutant, therefore Dd-STATa cannot be responsible for activating the 28-mer in vivo. Instead, we identified a distinct 28-mer binding activity in nuclear extracts from the Dd-STATa null mutant, the activity of this GA binding activity being largely masked in wild type extracts by the high affinity binding of the Dd-STATa protein. We suggest, that in addition to the long range repression exerted by binding to the two known repressor sites, Dd-STATa inhibits transcription by direct competition with this putative activator for binding to the GA sequence.

  5. Serpins in rice: protein sequence analysis, phylogeny and gene expression during development

    Francis Sheila E


    Full Text Available Abstract Background: Most members of the serpin family of proteins are potent, irreversible inhibitors of specific serine or cysteine proteinases. Inhibitory serpins are distinguished from members of other families of proteinase inhibitors by their metastable structure and unique suicide-substrate mechanism. Animal serpins exert control over a remarkable diversity of physiological processes including blood coagulation, fibrinolysis, innate immunity and aspects of development. Relatively little is known about the complement of serpin genes in plant genomes and the biological functions of plant serpins. Results: A structurally refined amino-acid sequence alignment of the 14 full-length serpins encoded in the genome of the japonica rice Oryza sativa cv. Nipponbare (a monocot) showed a diversity of reactive-centre sequences (which largely determine inhibitory specificity) and a low degree of identity with those of serpins in Arabidopsis (a eudicot). A new convenient and functionally informative nomenclature for plant serpins, in which the reactive-centre sequence is incorporated into the serpin name, was developed and applied to the rice serpins. A phylogenetic analysis of the rice serpins provided evidence for two main clades and a number of relatively recent gene duplications. Transcriptional analysis showed vastly different levels of basal expression among eight selected rice serpin genes in callus tissue, during seedling development, among vegetative tissues of mature plants and throughout seed development. The gene OsSRP-LRS (Os03g41419), encoding a putative orthologue of Arabidopsis AtSerpin1 (At1g47710), was expressed ubiquitously and at high levels. The second most highly expressed serpin gene was OsSRP-PLP (Os11g11500), encoding a non-inhibitory serpin with a surprisingly well-conserved reactive-centre loop (RCL) sequence among putative orthologues in other grass species. Conclusions: The diversity of reactive-centre sequences among the putatively

  6. An abundance of ubiquitously expressed genes revealed by tissue transcriptome sequence data.

    Daniel Ramsköld


    Full Text Available The parts of the genome transcribed by a cell or tissue reflect the biological processes and functions it carries out. We characterized the features of mammalian tissue transcriptomes at the gene level through analysis of RNA deep sequencing (RNA-Seq) data across human and mouse tissues and cell lines. We observed that roughly 8,000 protein-coding genes were ubiquitously expressed, contributing to around 75% of all mRNAs by message copy number in most tissues. These mRNAs encoded proteins that were often intracellular, and tended to be involved in metabolism, transcription, RNA processing or translation. In contrast, genes for secreted or plasma membrane proteins were generally expressed in only a subset of tissues. The distribution of expression levels was broad but fairly continuous: no support was found for the concept of distinct expression classes of genes. Expression estimates that included reads mapping to coding exons only correlated better with qRT-PCR data than estimates which also included 3' untranslated regions (UTRs). Muscle and liver had the least complex transcriptomes, in that they expressed predominantly ubiquitous genes and a large fraction of the transcripts came from a few highly expressed genes, whereas brain, kidney and testis expressed more complex transcriptomes with the vast majority of genes expressed and relatively small contributions from the most expressed genes. mRNAs expressed in brain had unusually long 3'UTRs, and mean 3'UTR length was higher for genes involved in development, morphogenesis and signal transduction, suggesting added complexity of UTR-based regulation for these genes. Our results support a model in which variable exterior components feed into a large, densely connected core composed of ubiquitously expressed intracellular proteins.

  7. Molecular analysis of the bovine coronavirus S1 gene by direct sequencing of diarrheic fecal specimens

    E. Takiuchi


    Full Text Available Bovine coronavirus (BCoV) causes severe diarrhea in newborn calves and is associated with winter dysentery in adult cattle and with respiratory infections in calves and feedlot cattle. The BCoV S protein plays a fundamental role in viral attachment and entry into the host cell, and is cleaved into two subunits termed S1 (amino terminal) and S2 (carboxy terminal). The present study describes a strategy for the sequencing of the BCoV S1 gene directly from fecal diarrheic specimens that were previously identified as BCoV positive by RT-PCR assay for N gene detection. A consensus sequence of 2681 nucleotides was obtained through direct sequencing of seven overlapping PCR fragments of the S gene. The samples did not undergo cell culture passage prior to PCR amplification and sequencing. The structural analysis was based on the genomic differences between Brazilian strains and other known BCoV from different geographical regions. The phylogenetic analysis of the entire S1 gene showed that the BCoV Brazilian strains were more distant from the Mebus strain (97.8% identity for nucleotides and 96.8% identity for amino acids) and more similar to the BCoV-ENT strain (98.7% for nucleotides and 98.7% for amino acids). Based on the phylogenetic analysis of the hypervariable region of the S1 subunit, these strains clustered with the American (BCoV-ENT, 182NS) and Canadian (BCQ20, BCQ2070, BCQ9, BCQ571, BCQ1523) calf diarrhea and the Canadian winter dysentery (BCQ7373, BCQ2590) strains, but clustered on a separate branch of the Korean and respiratory BCoV strains. The BCoV strains of the present study were not clustered in the same branch of previously published Brazilian strains (AY606193, AY606194). These data agree with the genealogical construction and suggest that at least two different BCoV strains are circulating in Brazil.

  8. How the Sequence of a Gene Specifies Structural Symmetry in Proteins.

    Xiaojuan Shen

    Full Text Available Internal symmetry is commonly observed in the majority of fundamental protein folds. Meanwhile, sufficient evidence suggests that nascent polypeptide chains of proteins have the potential to start the co-translational folding process and this process allows mRNA to contain additional information on protein structure. In this paper, we study the relationship between gene sequences and protein structures from the viewpoint of symmetry to explore how gene sequences code for structural symmetry in proteins. We found that, for a set of two-fold symmetric proteins from left-handed beta-helix fold, intragenic symmetry always exists in their corresponding gene sequences. Meanwhile, codon usage bias and local mRNA structure might be involved in modulating translation speed for the formation of structural symmetry: a major decrease of local codon usage bias in the middle of the codon sequence can be identified as a common feature; and major or consecutive decreases in local mRNA folding energy near the boundaries of the symmetric substructures can also be observed. The results suggest that gene duplication and fusion may be an evolutionarily conserved process for this protein fold. In addition, the usage of rare codons and the formation of higher order of secondary structure near the boundaries of symmetric substructures might have coevolved as conserved mechanisms to slow down translation elongation and to facilitate effective folding of symmetric substructures. These findings provide valuable insights into our understanding of the mechanisms of translation and its evolution, as well as the design of proteins via symmetric modules.

  9. A rhodopsin-like protein in Cyanophora paradoxa: gene sequence and protein immunolocalization.

    Frassanito, Anna Maria; Barsanti, Laura; Passarelli, Vincenzo; Evangelista, Valtere; Gualtieri, Paolo


    Here, we report the DNA sequence of the rhodopsin gene in the alga Cyanophora paradoxa (Glaucophyta). The primers were designed according to the conserved regions of prokaryotic and eukaryotic rhodopsin-like proteins deposited in GenBank. The sequence consists of 1,272 bp comprised of 5 introns. The corresponding protein, named Cyanophopsin, showed high identity to rhodopsin-like proteins of Archaea, Bacteria, Fungi, and Algae. At the N-terminus, the protein is characterized by a region with no transmembrane alpha-helices (80 aa), followed by a region with 7 alpha-helices (219 aa) and a shorter 35-aa C-terminal region. The DNA sequence of the N-terminal region was expressed in E. coli and the recombinant purified peptide was used as antigen in hens to obtain polyclonal antibodies. Indirect immunofluorescence in C. paradoxa cells showed a marked labeling of the muroplast (aka cyanelle) membrane.

  10. Whole-exome sequencing for the identification of susceptibility genes of Kashin-Beck disease.

    Zhenxing Yang

    Full Text Available OBJECTIVE: To identify and investigate the susceptibility genes of Kashin-Beck disease (KBD) in the Chinese population. METHODS: Whole-exome capturing and sequencing technology was used for the detection of genetic variations in 19 individuals from six families with high incidence of KBD. A total of 44 polymorphisms from 41 genes were genotyped in a total of 144 cases and 144 controls by using MassARRAY under the standard protocol from Sequenom. Association analysis was applied to the data by using PLINK1.07. RESULTS: In the sequencing stage, each sample showed approximately 70-fold coverage, thus covering more than 99% of the target regions. Among the single nucleotide polymorphisms (SNPs) used in the transmission disequilibrium test, 108 had a p-value of <0.01, whereas 1056 had a p-value of <0.05. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis indicates that these SNPs focus on three major pathways: regulation of actin cytoskeleton, focal adhesion, and metabolic pathways. In the validation stage, single locus effects revealed that two of these polymorphisms (rs7745040 and rs9275295) in the human leukocyte antigen (HLA)-DRB1 gene and one polymorphism (rs9473132) in the CD2-associated protein (CD2AP) gene have a significant statistical association with KBD. CONCLUSIONS: The HLA-DRB1 and CD2AP genes were identified to be among the susceptibility genes of KBD, thus supporting the role of the autoimmune response in KBD and the possibility of shared etiology between osteoarthritis, rheumatoid arthritis, and KBD.

  11. Cloning and sequence analysis of the Antheraea pernyi nucleopolyhedrovirus gp64 gene

    Wenbing Wang; Shanying Zhu; Liqun Wang; Feng Yu; Weide Shen


    Frequent outbreaks of the purulence disease of Chinese oak silkworm are reported in Middle and Northeast China. The disease is produced by the pathogen Antheraea pernyi nucleopolyhedrovirus (AnpeNPV). To obtain molecular information of the virus, the polyhedra of AnpeNPV were purified and characterized. The genomic DNA of AnpeNPV was extracted and digested with HindIII. The genome size of AnpeNPV is estimated at 128 kb. Based on the analysis of DNA fragments digested with HindIII, 23 fragments were bigger than 564 bp. A genomic library was generated using HindIII and the positive clones were sequenced and analysed. The gp64 gene, encoding the baculovirus envelope protein GP64, was found in an insert. The nucleotide sequence analysis indicated that the AnpeNPV gp64 gene consists of a 1530 nucleotide open reading frame (ORF), encoding a protein of 509 amino acids. Of the eight gp64 homologues, the AnpeNPV gp64 ORF shared the most sequence similarity with the gp64 gene of Anticarsia gemmatalis NPV, but not Bombyx mori NPV. The upstream region of the AnpeNPV gp64 ORF encoded the conserved transcriptional elements for early and late stage of the viral infection cycle. These results indicated that AnpeNPV belongs to group I NPV and was far removed in molecular phylogeny from the BmNPV.

  12. Molecular cloning and sequence analysis of a phenylalanine ammonia-lyase gene from dendrobium.

    Qing Jin

    Full Text Available In this study, a phenylalanine ammonia-lyase (PAL) gene was cloned from Dendrobium candidum using homology cloning and RACE. The full-length sequence and catalytic active sites that appear in PAL proteins of Arabidopsis thaliana and Nicotiana tabacum are also found: PAL cDNA of D. candidum (designated Dc-PAL1, GenBank No. JQ765748) has 2,458 bps and contains a complete open reading frame (ORF) of 2,142 bps, which encodes 713 amino acid residues. The amino acid sequence of DcPAL1 has more than 80% sequence identity with the PAL genes of other plants, as indicated by multiple alignments. The dominant sites and catalytic active sites, which are similar to those in the PAL proteins of Arabidopsis thaliana and Nicotiana tabacum, are also found in DcPAL1. Phylogenetic tree analysis revealed that DcPAL is more closely related to PALs from Orchidaceae plants than to those of other plants. The differential expression patterns of PAL in protocorm-like body, leaf, stem, and root suggest that the PAL gene performs multiple physiological functions in Dendrobium candidum.

  13. Sequence Analysis of Bitter Taste Receptor Gene Repertoires in Different Ruminant Species.

    Ana Monteiro Ferreira

    Full Text Available Bitter taste has been extensively studied in mammalian species and is associated with sensitivity to toxins and with food choices that avoid dangerous substances in the diet. At the molecular level, bitter compounds are sensed by bitter taste receptor proteins (T2R) present at the surface of taste receptor cells in the gustatory papillae. Our work aims at exploring the phylogenetic relationships of T2R gene sequences within different ruminant species. To accomplish this goal, we gathered a collection of ruminant species with different feeding behaviors and for which no genome data is available: American bison, chamois, elk, European bison, fallow deer, goat, moose, mouflon, muskox, red deer, reindeer and white-tailed deer. The herbivores chosen for this study belong to different taxonomic families and habitats, and hence, exhibit distinct foraging behaviors and diet preferences. We describe the first partial repertoires of T2R gene sequences for these species obtained by direct sequencing. We then consider the homology and evolutionary history of these receptors within this ruminant group, and whether it relates to feeding type classification, using MEGA software. Our results suggest that phylogenetic proximity of T2R genes corresponds more to the traditional taxonomic groups of the species rather than reflecting a categorization by feeding strategy.

  14. A Review of Whole-Exome Sequencing Efforts Toward Hereditary Breast Cancer Susceptibility Gene Discovery.

    Chandler, Madison R; Bilgili, Erin P; Merner, Nancy D


    Inherited genetic risk factors contribute toward breast cancer (BC) onset. BC risk variants can be divided into three categories of penetrance (high, moderate, and low) that reflect the probability of developing the disease. Traditional BC susceptibility gene discovery approaches that searched for high- and moderate-risk variants in familial BC cases have had limited success; to date, these risk variants explain only ∼30% of familial BC cases. Next-generation sequencing technologies can be used to search for novel high and moderate BC risk variants, and this manuscript reviews 12 familial BC whole-exome sequencing efforts. Study design, filtering strategies, and segregation and validation analyses are discussed. Overall, only a modest number of novel BC risk genes were identified, and 90% and 97% of the exome-sequenced families and cases, respectively, had no BC risk variants reported. It is important to learn from these studies and consider alternate strategies in order to make further advances. The discovery of new BC susceptibility genes is critical for improved risk assessment and to provide insight toward disease mechanisms for the development of more effective therapies.

  15. [Sequence analysis of the phosphoprotein gene of peste des petits ruminants virus of Chinese origin].

    Bao, Jing-yue; Zhao, Wen-ji; Li, Lin; Wang, Zhi-liang; Wu, Guo-zhen; Wu, Xiao-dong; Liu, Chun-ju; Wang, Qing-hua; Wang, Jun-wei; Liu, Yu-tian; Li, Jin-ming; Wang, Ying-li


    The nucleotide sequence of the P gene from a field strain of peste des petits ruminants virus (PPRV) ("China/Tib/Gej/07-30") was determined for the first time. The P gene is 1,655 nucleotides long with two overlapping open reading frames (ORFs). The first ORF is 1,530 nucleotides long and would produce a P protein of 509 amino acid residues. The second ORF is 534 nucleotides long and would produce a C protein of 177 amino acid residues. The first ORF also produces a second mRNA transcript, 897 nucleotides long, with an extra G nucleotide at position 751. Translation from this mRNA would produce a V protein of 298 amino acid residues. The nucleotide and deduced amino acid sequences were compared with the homologous regions of other PPRV isolates. At the amino acid level, "China/Tib/Gej/07-30" shares homology of 86.10%-97.3%, 84.3%-94.9%, and 82.9%-96.3% for the P, C, and V proteins, respectively. Several sequence motifs in the P genes were identified on the basis of conservation in the PPRVs and the morbilliviruses.
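    The ORF and protein lengths quoted above can be cross-checked with simple codon arithmetic. The sketch below is illustrative only: the function name `protein_length` is hypothetical, and it assumes each quoted length counts the terminal stop codon, which the abstract does not state explicitly.

```python
def protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of `orf_nt` nucleotides.

    Assumption: when `includes_stop` is True, the last codon is a stop
    codon and therefore contributes no residue.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# Lengths (in nucleotides) as quoted in the abstract for P, C and V.
for name, nt in [("P", 1530), ("C", 534), ("V", 897)]:
    print(name, nt, protein_length(nt))
```

    Under that assumption the three quoted nucleotide lengths give 509, 177 and 298 residues, matching the protein sizes in the abstract.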

  16. When is it MODY? Challenges in the Interpretation of Sequence Variants in MODY Genes.

    Althari, Sara; Gloyn, Anna L


    The genomics revolution has raised more questions than it has provided answers. Big data from large population-scale resequencing studies are increasingly deconstructing classic notions of Mendelian disease genetics, which support a simplistic correlation between mutational severity and phenotypic outcome. The boundaries are being blurred as the body of evidence showing monogenic disease-causing alleles in healthy genomes, and in the genomes of individuals with increased common complex disease risk, continues to grow. In this review, we focus on the newly emerging challenges which pertain to the interpretation of sequence variants in genes implicated in the pathogenesis of maturity-onset diabetes of the young (MODY), a presumed monogenic form of diabetes characterized by Mendelian inheritance. These challenges highlight the complexities surrounding the assignments of pathogenicity, in particular to rare protein-altering variants, and bring to the forefront some profound clinical diagnostic implications. As MODY is both genetically and clinically heterogeneous, an accurate molecular diagnosis and cautious extrapolation of sequence data are critical to effective disease management and treatment. The biological and translational value of sequence information can only be attained by adopting a multitude of confirmatory analyses, which interrogate variant implication in disease from every possible angle. Indeed, studies which have effectively detected rare damaging variants in known MODY genes in normoglycemic individuals question the existence of a single gene mutation scenario: does monogenic diabetes exist when the genetic culprits of MODY have been systematically identified in individuals without MODY?

  17. Human identification from forensic materials by amplification of a human-specific sequence in the myoglobin gene.

    Ono T; Miyaishi S; Yamamoto Y; Yoshitome K; Ishikawa T; Ishizu H


    We developed a method for human identification of forensic biological materials by PCR-based detection of a human-specific sequence in exon 3 of the myoglobin gene. This human-specific DNA sequence was deduced from differences in the amino acid sequences of myoglobins between humans and other animal species. The new method enabled amplification of the target DNA fragment from 30 samples of human DNA, and the amplified sequences were identical with that already reported. Using this method, we ...

  18. Exome sequencing identifies a novel gene, WNK1, for susceptibility to pelvic organ prolapse (POP).

    Rao, Shuquan; Lang, Jinghe; Zhu, Lan; Chen, Juan


    Pelvic organ prolapse (POP) is a common gynecological disorder; however, the genetic components remain largely unidentified. Exome sequencing has been widely used to identify pathogenic gene mutations of several diseases because of its high chromosomal coverage and accuracy. In this study, we performed whole exome sequencing (WES), for the first time, on 8 peripheral blood DNA samples from representative POP cases. After filtering the sequencing data against the dbSNP database (build 138) and the 1000 Genomes Project, 2 missense variants in WNK1, c.2668G > A (p.G890R) and c.6761C > T (p.P2254L), were identified and further validated via Sanger sequencing. In the validation stage, the c.2668G > A (p.G890R) variant and 8 additional variants were detected in 11 out of 161 POP patients. All these variants were absent in 231 healthy controls. Functional experiments showed that fibroblasts from the utero-sacral ligaments of POP patients with WNK1 mutations exhibited loose and irregular alignment compared with fibroblasts from healthy controls. In sum, our study identified a novel gene, WNK1, for POP susceptibility, expanded the causal mutation spectrum of POP, and provided evidence for the genetic diagnosis and medical management of POP in the future.

  19. Comparison of two approaches for the classification of 16S rRNA gene sequences.

    Chatellier, Sonia; Mugnier, Nathalie; Allard, Françoise; Bonnaud, Bertrand; Collin, Valérie; van Belkum, Alex; Veyrieras, Jean-Baptiste; Emler, Stefan


    The use of 16S rRNA gene sequences for microbial identification in clinical microbiology is accepted widely, and requires databases and algorithms. We compared a new research database containing curated 16S rRNA gene sequences in combination with the lca (lowest common ancestor) algorithm (RDB-LCA) to a commercially available 16S rDNA Centroid approach. We used 1025 bacterial isolates characterized by biochemistry, matrix-assisted laser desorption/ionization time-of-flight MS and 16S rDNA sequencing. Nearly 80 % of isolates were identified unambiguously at the species level by both classification platforms used. The remaining isolates were mostly identified correctly at the genus level due to the limited resolution of 16S rDNA sequencing. Discrepancies between both 16S rDNA platforms were due to differences in database content and the algorithm used, and could amount to up to 10.5 %. Up to 1.4 % of the analyses were found to be inconclusive. It is important to realize that despite the overall good performance of the pipelines for analysis, some inconclusive results remain that require additional in-depth analysis performed using supplementary methods.
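    The lowest common ancestor idea mentioned above can be sketched as taking the deepest rank shared by all best-matching reference lineages. This is a minimal illustration, not the RDB-LCA implementation; the function name and the two example lineages are assumptions chosen for demonstration.

```python
from typing import List, Sequence


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest shared prefix of root-to-leaf taxonomic lineages."""
    if not lineages:
        return []
    shared: List[str] = []
    # zip(*lineages) walks the ranks in parallel and stops at the shortest lineage.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


# Hypothetical best hits for one isolate (illustrative lineages only).
hits = [
    ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
     "Streptococcaceae", "Streptococcus", "Streptococcus mitis"],
    ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
     "Streptococcaceae", "Streptococcus", "Streptococcus oralis"],
]
print(lowest_common_ancestor(hits)[-1])
```

    When the best hits disagree at species level, as in this toy input, the assignment falls back to the genus, which mirrors the genus-level identifications described in the abstract.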

  20. Gene detection, virus isolation, and sequence analysis of avian leukosis viruses in Taiwan country chickens.

    Chang, Shu-Wei; Hsu, Meng-Fang; Wang, Ching-Ho


    Avian leukosis virus (ALV) infection in Taiwan Country chickens (TCCs) was investigated by using gene detection, virus isolation, and sequence analysis. The blood samples of 61 TCC flocks at market ages from a slaughter house were screened for exogenous ALVs using polymerase chain reaction to investigate the ALV infection status. The buffy coats from three breeder and four commercial chicken flocks were cocultured with DF-1 cells to isolate the virus. The full proviral DNA genomes of two ALV isolates were sequenced, analyzed, and compared with reference ALV strains. The gene detection results showed that 60 and 43 of the 61 flocks were infected with subgroup A of ALV (ALV-A) and subgroup J of ALV (ALV-J), respectively. Virus isolation results showed that five ALV-As and two ALV-Js were isolated from those seven TCC flocks. The full sequences of the isolates showed that isolate TW-3577 possessed a myeloblastosis-associated virus 1 gp85 coding region and an ALV-J 3'-untranslated region (3'UTR) and was similar to ordinary ALV-A. However, TW-3593 was unique. The 3'UTR of this isolate displayed high identity to endogenous counterpart sequence and its gp85 was different from all subgroups. This unique ALV is common in Taiwan.

  1. Sequence variation in the androgen receptor gene is not a common determinant of male sexual orientation

    Macke, J.P.; Nathans, J.; King, V.L. (Johns Hopkins Univ., Baltimore, MD (United States)); Hu, N.; Hu, S.; Hamer, D.; Bailey, M. (Northwestern Univ., Evanston, IL (United States)); Brown, T. (Johns Hopkins Univ. School of Hygiene and Public Health, Baltimore, MD (United States))


    To test the hypothesis that DNA sequence variation in the androgen receptor gene plays a causal role in the development of male sexual orientation, the authors have (1) measured the degree of concordance of androgen receptor alleles in 36 pairs of homosexual brothers, (2) compared the lengths of polyglutamine and polyglycine tracts in the amino-terminal domain of the androgen receptor in a sample of 197 homosexual males and 213 unselected subjects, and (3) screened the entire androgen receptor coding region for sequence variation by PCR and denaturing gradient-gel electrophoresis (DGGE) and/or single-strand conformation polymorphism analysis in 20 homosexual males with homosexual or bisexual brothers and one homosexual male with no homosexual brothers, and screened the amino-terminal domain of the receptor for sequence variation in an additional 44 homosexual males, 37 of whom had one or more first- or second-degree male relatives who were either homosexual or bisexual. These analyses show that (1) homosexual brothers are as likely to be discordant as concordant for androgen receptor alleles; (2) there are no large-scale differences between the distributions of polyglycine or polyglutamine tract lengths in the homosexual and control groups; and (3) coding region sequence variation is not commonly found within the androgen receptor gene of homosexual men. The DGGE screen identified two rare amino acid substitutions, ser205-to-arg and glu793-to-asp, the biological significance of which is unknown. 32 refs., 2 figs., 2 tabs.

  2. Cloning, Sequencing and Phylogenetic Study of rbcL Gene from Cyanobacteria Arthrospira and Spirulina

    Liu Jinjie(刘金姐); Zhang Xuecheng; Sui Zhenghong; Mao Yunxiang; Sun Xue


    The large subunit gene of rubisco (rbcL) of the cyanobacteria Arthrospira platensis FACHB341, A. platensis FACHB439, A. maxima OUQDSM and Spirulina sp. FACHB440 was cloned, sequenced and characterized. Results show that the GC content of the gene in Spirulina sp. FACHB440 is higher than that in the others. The alignments based on deduced amino acid sequences indicate that the sequence of Spirulina sp. FACHB440 differs from those of the three Arthrospira samples, though they have the same conserved functional sites (95, 98, 121, 124, 221, 257). The nucleotide sequence similarity among the three strains of the genus Arthrospira (96.5-99.6%) is higher than that between Arthrospira and Spirulina (78.1-78.5%). By comparison with the corresponding sequences of other cyanobacteria, a phylogenetic tree with two clusters was constructed. A. platensis FACHB341, A. maxima OUQDSM and A. platensis FACHB439 form a monophyletic lineage, which is fully supported by bootstrap values (1000), while Spirulina sp. FACHB440 and Anabaena sp. PCC7120 cluster in another lineage with a bootstrap value of 909.

  3. [Sequence characterization of the 5'-Flanking region of the GHR gene in Tibetan sheep].

    Ma, Zhi-Jie; Wei, Ya-Ping; Zhong, Jin-Cheng; Chen, Zhi-Hua; Lu, Hong; Tong, Zi-Bao


    The 5'-flanking sequence (including the P1 promoter and exon 1A) of the GHR gene in Oura-type Tibetan sheep (O. aries) was cloned by the T-A method and sequenced (GenBank accession No. EF116490). Characterization and comparison of this sequence with mouflon (O. musimon), goat (C. hircus), cattle (B. taurus) and European bison (B. bonasus) orthologues were also conducted. Results showed that: 1) The 5'-flanking region contained many potential transcription factor binding sites such as those for C/EBPb, C/EBP, SP1, Cap, USF, HFH-2, HNF-3b, and Oct-1, which might have an important effect on transcription activation and regulation as well as tissue-specific expression. The rate of repetitive sequences was 2.55% and no SINEs, LINEs, LTR anti-transcription elements or DNA transposon elements were found, although one (TG)11 microsatellite was found. 2) In the P1 promoter region, sequence homology between the Tibetan sheep and mouflon, goat, cattle and European bison was 99.7%, 94.2%, 85.9% and 86.5%, respectively, while that for exon 1A was 99.0%, 97.0%, 92.7% and 94.6%, respectively. 3) The molecular phylogenetic tree among these species, constructed by the neighbor-joining method based on the sequences of the non-coding region of the GHR genes, placed the two Bovinae species on one branch and the three Caprinae species on the other. Tibetan sheep and mouflon were joined first, followed by the goat, and then the Bovinae species, including the cattle and European bison. This phylogenetic clustering was consistent not only with the taxonomy but also with the phylogenetic clustering based on the mitochondrial DNA of these species.

  4. Medical sequencing of candidate genes for nonsyndromic cleft lip and palate.

    Alexandre R Vieira


    Full Text Available Nonsyndromic or isolated cleft lip with or without cleft palate (CL/P) occurs in wide geographic distribution with an average birth prevalence of 1/700. We used direct sequencing as an approach to study candidate genes for CL/P. We report here the results of sequencing on 20 candidate genes for clefts in 184 cases with CL/P selected with an emphasis on severity and positive family history. Genes were selected based on expression patterns, animal models, and/or role in known human clefting syndromes. For seven genes with identified coding mutations that are potentially etiologic, we performed linkage disequilibrium studies as well in 501 family triads (affected child/mother/father). The recently reported MSX1 P147Q mutation was also studied in an additional 1,098 cleft cases. Selected missense mutations were screened in 1,064 controls from unrelated individuals on the Centre d'Etude du Polymorphisme Humain (CEPH) diversity cell line panel. Our aggregate data suggest that point mutations in these candidate genes are likely to contribute to 6% of isolated clefts, particularly those with more severe phenotypes (bilateral cleft of the lip with cleft palate). Additional cases, possibly due to microdeletions or isodisomy, were also detected and may contribute to clefts as well. Sequence analysis alone suggests that point mutations in FOXE1, GLI2, JAG2, LHX8, MSX1, MSX2, SATB2, SKI, SPRY2, and TBX10 may be rare causes of isolated cleft lip with or without cleft palate, and the linkage disequilibrium data support a larger, as yet unspecified, role for variants in or near MSX2, JAG2, and SKI. This study also illustrates the need to test large numbers of controls to distinguish rare polymorphic variants and prioritize functional studies for rare point mutations.

  5. Phylogenetic analysis of dermatophyte species using DNA sequence polymorphism in calmodulin gene.

    Ahmadi, Bahram; Mirhendi, Hossein; Makimura, Koichi; de Hoog, G Sybren; Shidfar, Mohammad Reza; Nouripour-Sisakht, Sadegh; Jalalizand, Niloofar


    Use of phylogenetic species concepts based on rDNA internal transcribed spacer (ITS) regions has improved the taxonomy of dermatophyte species; however, confirmation and refinement using other genes are needed. Since the calmodulin gene has not been systematically used in dermatophyte taxonomy, we evaluated its intra- and interspecies sequence variation as well as its application in identification, phylogenetic analysis, and taxonomy of 202 strains of 29 dermatophyte species. A set of primers was designed and optimized to amplify the target, followed by bilateral sequencing. Using pairwise nucleotide comparisons, a mean similarity of 81% was observed among 29 dermatophyte species, with inter-species diversity ranging from 0 to 200 nucleotides (nt). Intraspecies nt differences were found within strains of Trichophyton interdigitale, Arthroderma simii, T. rubrum and A. vanbreuseghemii, while T. tonsurans, T. violaceum, Epidermophyton floccosum, Microsporum canis, M. audouinii, M. cookei, M. racemosum, M. gypseum, T. mentagrophytes, T. schoenleinii, and A. benhamiae were conserved. Strains of E. floccosum/M. racemosum/M. cookei, A. obtusum/A. gertleri, T. tonsurans/T. equinum and a genotype of T. interdigitale had identical calmodulin sequences. For the majority of the species, the tree topology obtained for the calmodulin gene was congruent with that of coding and non-coding regions including ITS, BT2, and Tef-1α. Compared with the phylogenetic tree derived from ITS, BT2, and Tef-1α genes, some species such as E. floccosum and A. gertleri took relatively remote positions. Here, the characterization and dendrogram of the calmodulin gene across a broad range of dermatophyte species provide a basis for further discovery of relationships between species. Studies of other loci are necessary to confirm the results.

  6. Sequence analysis of candidate genes in two Roma families with severe tooth agenesis

    Gabriková Dana


    Full Text Available Selective tooth agenesis is the most common congenital disorder affecting the formation of dentition in humans. Both its forms (hypodontia and the more severe oligodontia) can occur either in isolated form or in association with a systemic condition (syndromic tooth agenesis). In addition to previously known genes (PAX9, MSX1 and AXIN2), mutations in the EDA, EDARADD and WNT10A genes were recently found to be involved in isolated forms of tooth agenesis. The objective of this study was to characterize the phenotype of affected members in two large families of Roma origin segregating severe isolated tooth agenesis with a highly variable phenotype, and to perform mutation analysis of seven genes with the aim of finding a causal mutation. Twenty-six family members were clinically examined and the coding regions of seven genes (MSX1, PAX9, AXIN2, EDA, EDAR, EDARADD and WNT10A) were sequenced. Excluding third molars, the average number of missing teeth was 8.2 ± 4.9 in family 1 and 7.1 ± 2.3 in family 2. The most frequently missing teeth were maxillary lateral incisors and first premolars and mandibular central incisors. Sequencing revealed four potentially damaging variants (g.Ala40Gly in MSX1, g.Ala240Pro in PAX9, g.Pro50Ser in AXIN2 and g.Met9Ile in EDARADD); however, none of them was present in all affected family members. The variable phenotype in both families favours a heterogeneous genetic cause of tooth agenesis in these families: a possible interaction of several defective genes, sequence variants in regulatory regions and additional environmental factors is assumed.

  7. Medical Sequencing of Candidate Genes for Nonsyndromic Cleft Lip and Palate.


    Full Text Available Nonsyndromic or isolated cleft lip with or without cleft palate (CL/P) occurs in wide geographic distribution with an average birth prevalence of 1/700. We used direct sequencing as an approach to study candidate genes for CL/P. We report here the results of sequencing on 20 candidate genes for clefts in 184 cases with CL/P selected with an emphasis on severity and positive family history. Genes were selected based on expression patterns, animal models, and/or role in known human clefting syndromes. For seven genes with identified coding mutations that are potentially etiologic, we performed linkage disequilibrium studies as well in 501 family triads (affected child/mother/father). The recently reported MSX1 P147Q mutation was also studied in an additional 1,098 cleft cases. Selected missense mutations were screened in 1,064 controls from unrelated individuals on the Centre d'Etude du Polymorphisme Humain (CEPH) diversity cell line panel. Our aggregate data suggest that point mutations in these candidate genes are likely to contribute to 6% of isolated clefts, particularly those with more severe phenotypes (bilateral cleft of the lip with cleft palate). Additional cases, possibly due to microdeletions or isodisomy, were also detected and may contribute to clefts as well. Sequence analysis alone suggests that point mutations in FOXE1, GLI2, JAG2, LHX8, MSX1, MSX2, SATB2, SKI, SPRY2, and TBX10 may be rare causes of isolated cleft lip with or without cleft palate, and the linkage disequilibrium data support a larger, as yet unspecified, role for variants in or near MSX2, JAG2, and SKI. This study also illustrates the need to test large numbers of controls to distinguish rare polymorphic variants and prioritize functional studies for rare point mutations.

  8. tRNADB-CE: tRNA gene database well-timed in the era of big sequence data.

    Abe, Takashi; Inokuchi, Hachiro; Yamada, Yuko; Muto, Akira; Iwasaki, Yuki; Ikemura, Toshimichi


    The tRNA gene database curated by experts, "tRNADB-CE", was constructed by analyzing 1,966 complete and 5,272 draft genomes of prokaryotes, 171 viruses', 121 chloroplasts', and 12 eukaryotes' genomes plus fragment sequences obtained by metagenome studies of environmental samples. In total, 595,115 tRNA genes, twice the number compiled previously, have been registered, for which sequence, clover-leaf structure, and results of sequence-similarity and oligonucleotide-pattern searches can be browsed. To gather collective knowledge with help from experts in tRNA research, we added a column for registering comments on each tRNA. By grouping bacterial tRNAs with an identical sequence, we have found high phylogenetic preservation of tRNA sequences, especially at the phylum level. Since many species-unknown tRNAs from metagenomic sequences have sequences identical to those found in species-known prokaryotes, the identical sequence group (ISG) can provide phylogenetic markers to investigate the microbial community in an environmental ecosystem. This strategy can be applied to the huge amount of short sequences obtained from next-generation sequencers, showing that tRNADB-CE is a well-timed database in the era of big sequence data. The usefulness of the batch-learning self-organizing map with oligonucleotide composition for efficient knowledge discovery from big sequence data is also discussed.
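    Grouping tRNAs by identical sequence, as described above, amounts to bucketing records under a normalized sequence key. The sketch below is a hedged illustration, not the tRNADB-CE pipeline: the function name, the normalization (upper-casing, U to T) and the toy records are all assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def identical_sequence_groups(
    records: Iterable[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Group (source, sequence) records that share exactly the same sequence.

    Only sequences seen in more than one record are returned.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for source, seq in records:
        key = seq.upper().replace("U", "T")  # treat RNA and DNA spellings alike
        groups[key].append(source)
    return {seq: srcs for seq, srcs in groups.items() if len(srcs) > 1}


# Toy records (made-up sequences, for illustration only).
records = [
    ("known_species_A", "GGGCUAUAGC"),
    ("metagenome_read_1", "gggctatagc"),
    ("known_species_B", "GCCGAAAUAG"),
]
print(identical_sequence_groups(records))
```

    In this toy input the metagenomic read lands in the same group as a species-known record, which is the situation the abstract proposes to exploit as a phylogenetic marker.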

  9. tRNADB-CE: tRNA gene database well-timed in the era of big sequence data

    Takashi Abe


    Full Text Available The tRNA Gene Data Base Curated by Experts, tRNADB-CE, was constructed by analyzing 1,966 complete and 5,272 draft genomes of prokaryotes, 171 viruses’, 121 chloroplasts’, and 12 eukaryotes’ genomes plus fragment sequences obtained by metagenome studies of environmental samples. In total, 595,115 tRNA genes, twice the number compiled previously, have been registered, for which sequence, clover-leaf structure, and results of sequence-similarity and oligonucleotide-pattern searches can be browsed. To gather collective knowledge with help from experts in tRNA research, we added a column for registering comments on each tRNA. By grouping bacterial tRNAs with an identical sequence, we have found high phylogenetic preservation of tRNA sequences, especially at the phylum level. Since many species-unknown tRNAs from metagenomic sequences have sequences identical to those found in species-known prokaryotes, the identical sequence group can provide phylogenetic markers to investigate the microbial community in an environmental ecosystem. This strategy can be applied to the huge amount of short sequences obtained from next-generation sequencers, showing that tRNADB-CE is a well-timed database in the era of big sequence data. The usefulness of BLSOM with oligonucleotide composition for efficient knowledge discovery from big sequence data is also discussed.

  10. Resolving arthropod phylogeny: exploring phylogenetic signal within 41 kb of protein-coding nuclear gene sequence.

    Regier, Jerome C; Shultz, Jeffrey W; Ganley, Austen R D; Hussey, April; Shi, Diane; Ball, Bernard; Zwick, Andreas; Stajich, Jason E; Cummings, Michael P; Martin, Joel W; Cunningham, Clifford W


    This study attempts to resolve relationships among and within the four basal arthropod lineages (Pancrustacea, Myriapoda, Euchelicerata, Pycnogonida) and to assess the widespread expectation that remaining phylogenetic problems will yield to increasing amounts of sequence data. Sixty-eight regions of 62 protein-coding nuclear genes (approximately 41 kilobases (kb)/taxon) were sequenced for 12 taxonomically diverse arthropod taxa and a tardigrade outgroup. Parsimony, likelihood, and Bayesian analyses of total nucleotide data generally strongly supported the monophyly of each of the basal lineages represented by more than one species. Other relationships within the Arthropoda were also supported, with support levels depending on method of analysis and inclusion/exclusion of synonymous changes. Removing third codon positions, where the assumption of base compositional homogeneity was rejected, altered the results. Removing the final class of synonymous mutations--first codon positions encoding leucine and arginine, which were also compositionally heterogeneous--yielded a data set that was consistent with a hypothesis of base compositional homogeneity. Furthermore, under such a data-exclusion regime, all 68 gene regions individually were consistent with base compositional homogeneity. Restricting likelihood analyses to nonsynonymous change recovered trees with strong support for the basal lineages but not for other groups that were variably supported with more inclusive data sets. In a further effort to increase phylogenetic signal, three types of data exploration were undertaken. (1) Individual genes were ranked by their average rate of nonsynonymous change, and three rate categories were assigned--fast, intermediate, and slow. Then, bootstrap analysis of each gene was performed separately to see which taxonomic groups received strong support. 
Five taxonomic groups were strongly supported independently by two or more genes, and these genes mostly belonged to the slow

  11. Replication error deficient and proficient colorectal cancer gene expression differences caused by 3'UTR polyT sequence deletions

    Wilding, Jennifer L; McGowan, Simon; Liu, Ying


    , and have distinct pathologies. Regulatory sequences controlling all aspects of mRNA processing, especially including message stability, are found in the 3'UTR sequence of most genes. The relevant sequences are typically A/U-rich elements or U repeats. Microarray analysis of 14 RER+ (deficient) and 16 RER- (proficient) colorectal cancer cell lines confirms a striking difference in expression profiles. Analysis of the incidence of mononucleotide repeat sequences in the 3'UTRs, 5'UTRs, and coding sequences of those genes most differentially expressed in RER+ versus RER- cell lines has shown that much of this differential expression can be explained by the occurrence of a massive enrichment of genes with 3'UTR T repeats longer than 11 base pairs in the most differentially expressed genes. This enrichment was confirmed by analysis of two published consensus sets of RER differentially expressed probesets for a large...
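    Screening a 3'UTR for T repeats longer than 11 bases can be sketched with a regular expression. This is only an illustration of the criterion quoted above, not the authors' analysis; the function name, the U-to-T normalization and the example sequence are assumptions.

```python
import re
from typing import List, Tuple


def find_t_repeats(utr: str, min_len: int = 12) -> List[Tuple[int, int]]:
    """Return (start, length) for every run of T of at least `min_len` bases.

    min_len=12 corresponds to "longer than 11 base pairs".
    """
    seq = utr.upper().replace("U", "T")  # accept RNA or DNA spelling
    pattern = re.compile(r"T{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(seq)]


# Made-up 3'UTR fragment: one 13-base run (reported) and one 11-base run (ignored).
example_utr = "ACG" + "T" * 13 + "GCA" + "T" * 11 + "A"
print(find_t_repeats(example_utr))
```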

  12. In silico phylogenetic and virulence gene profile analyses of avian pathogenic Escherichia coli genome sequences

    Thaís C.G. Rojas


    Full Text Available Avian pathogenic Escherichia coli (APEC) infections are responsible for significant losses in the poultry industry worldwide. A zoonotic risk has been attributed to APEC strains because they present similarities to extraintestinal pathogenic E. coli (ExPEC) associated with illness in humans, mainly urinary tract infections and neonatal meningitis. Here, we present in silico analyses with pathogenic E. coli genome sequences, including recently available APEC genomes. The phylogenetic tree, based on multi-locus sequence typing (MLST) of seven housekeeping genes, revealed high diversity in the allelic composition. Nevertheless, despite this diversity, the phylogenetic tree was able to cluster the different pathotypes together. An in silico virulence gene profile was also determined for each of these strains, through the presence or absence of 83 well-known virulence genes/traits described in pathogenic E. coli strains. The MLST phylogeny and the virulence gene profiles demonstrated a certain genetic similarity between Brazilian APEC strains, APEC isolated in the United States, UPEC (uropathogenic E. coli) and diarrheagenic strains isolated from humans. This correlation corroborates and reinforces the zoonotic potential hypothesis proposed for APEC.

  13. Genome sequence of Rickettsia bellii illuminates the role of amoebae in gene exchanges between intracellular pathogens.


    Full Text Available The recently sequenced Rickettsia felis genome revealed an unexpected plasmid carrying several genes usually associated with DNA transfer, suggesting that ancestral rickettsiae might have been endowed with a conjugation apparatus. Here we present the genome sequence of Rickettsia bellii, the earliest diverging species of known rickettsiae. The 1,552,076 base pair-long chromosome does not exhibit the colinearity observed between other rickettsia genomes, and encodes a complete set of putative conjugal DNA transfer genes most similar to homologues found in Protochlamydia amoebophila UWE25, an obligate symbiont of amoebae. The genome exhibits many other genes highly similar to homologues in intracellular bacteria of amoebae. We sought and observed sex pili-like cell surface appendages for R. bellii. We also found that R. bellii very efficiently multiplies in the nucleus of eukaryotic cells and survives in the phagocytic amoeba, Acanthamoeba polyphaga. These results suggest that amoeba-like ancestral protozoa could have served as a genetic "melting pot" where the ancestors of rickettsiae and other bacteria promiscuously exchanged genes, eventually leading to their adaptation to the intracellular lifestyle within eukaryotic cells.

  14. Comparative sequence analysis of double stranded RNA binding protein encoding gene of parapoxviruses from Indian camels

    G. Nagarajan


    Full Text Available The dsRNA binding protein (RBP) encoding gene of parapoxviruses (PPVs) from Dromedary camels inhabiting different geographical regions of Rajasthan, India, was amplified by polymerase chain reaction using the primers of pseudocowpoxvirus (PCPV) from Finnish reindeer and cloned into pGEM-T for sequence analysis. Analysis of the RBP encoding gene revealed that PPV DNA from Bikaner shared 98.3% and 76.6% sequence identity at the amino acid level with Pali and Udaipur PPV DNA, respectively. Reference strains of Bovine papular stomatitis virus (BPSV) and PCPV (reindeer PCPV and human PCPV) shared 52.8% and 86.9% amino acid identity with the RBP gene of camel PPVs from Bikaner, respectively. In contrast, different strains of orf virus (ORFV) from different geographical areas of the world shared 69.5–71.7% amino acid identity with the RBP gene of camel PPVs from Bikaner. These findings indicate that the camel PPVs described are more closely related to bovine PPV (PCPV) than to caprine and ovine PPV (ORFV).

  15. Expressed sequence tag analysis of functional genes associated with adventitious rooting in Liriodendron hybrids.

    Zhong, Y D; Sun, X Y; Liu, E Y; Li, Y Q; Gao, Z; Yu, F X


    Liriodendron hybrids (Liriodendron chinense x L. tulipifera) are important landscaping and afforestation hardwood trees. To date, little genomic research on adventitious rooting has been reported in these hybrids, as well as in the genus Liriodendron. In the present study, we used adventitious roots to construct the first cDNA library for Liriodendron hybrids. A total of 5176 expressed sequence tags (ESTs) were generated and clustered into 2921 unigenes. Among these unigenes, 2547 had significant homology to the non-redundant protein database representing a wide variety of putative functions. Homologs of these genes regulated many aspects of adventitious rooting, including those for auxin signal transduction and root hair development. Results of quantitative real-time polymerase chain reaction showed that AUX1, IRE, and FB1 were highly expressed in adventitious roots and the expression of AUX1, ARF1, NAC1, RHD1, and IRE increased during the development of adventitious roots. Additionally, 181 simple sequence repeats were identified from 166 ESTs and more than 91.16% of these were dinucleotide and trinucleotide repeats. To the best of our knowledge, the present study reports the identification of the genes associated with adventitious rooting in the genus Liriodendron for the first time and provides a valuable resource for future genomic studies. Expression analysis of selected genes could allow us to identify regulatory genes that may be essential for adventitious rooting.

  16. A genetic similarity algorithm for searching the Gene Ontology terms and annotating anonymous protein sequences.

    Othman, Razib M; Deris, Safaai; Illias, Rosli M


    A genetic similarity algorithm is introduced in this study to find a group of semantically similar Gene Ontology terms. The genetic similarity algorithm combines a semantic similarity measure algorithm with a parallel genetic algorithm. The semantic similarity measure algorithm is used to compute the strength of similarity between Gene Ontology terms. The parallel genetic algorithm is then employed to perform batch retrieval and to accelerate the search in the large search space of the Gene Ontology graph. The genetic similarity algorithm is implemented in a Gene Ontology browser named basic UTMGO to overcome the weaknesses of existing Gene Ontology browsers, which use a conventional approach based on keyword matching. To show the applicability of basic UTMGO, we extend its structure to develop a Gene Ontology-based protein sequence annotation tool named extended UTMGO. The objective of developing extended UTMGO is to provide a simple and practical tool that produces better results and requires a reasonable amount of running time with low computing cost, specifically for offline usage. The computational results and a comparison with other related tools are presented to show the effectiveness of the proposed algorithm and tools.

  17. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium

    Lynch Michael


    Full Text Available Abstract Background: In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties in generating reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. Results: By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions: Our observations (1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; (2) are expected to facilitate the understanding of the modulation of ribosomal gene expression in Paramecium; and (3) reveal a largely unexplored (and presumably not restricted to Paramecium) association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.
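
    The 8-mer screen described in this record can be illustrated with a minimal sketch. It assumes the 3' UTRs are available as plain DNA strings and that the candidate 8-mers are already known; the function names (`kmers`, `screen_utrs`) and the toy sequences are hypothetical and are not taken from the study, whose actual pipeline is not described in the abstract.

```python
from collections import Counter


def kmers(seq, k=8):
    """Yield every overlapping k-mer in seq that consists only of A, C, G, T."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set("ACGT"):
            yield word


def screen_utrs(utrs, candidate_motifs, k=8):
    """Count, for each candidate motif, how many UTRs contain it at least once."""
    candidates = {m.upper() for m in candidate_motifs}
    hits = Counter()
    for seq in utrs.values():
        # A motif is counted once per UTR, however often it occurs there.
        hits.update(set(kmers(seq, k)) & candidates)
    return hits


# Hypothetical toy data, for illustration only.
toy_utrs = {
    "geneA": "TTTGTAAATAGCC",
    "geneB": "CCGTGTAAATAGA",
    "geneC": "ACGTACGTACGTA",
}
toy_motifs = ["TGTAAATA", "ACGTACGT"]
counts = screen_utrs(toy_utrs, toy_motifs)
```

    Counting each motif once per UTR, rather than once per occurrence, keeps long or repetitive UTRs from dominating the tally; whether the authors made the same choice is not stated.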

  18. Cloning and Sequencing of the Pokeweed Antiviral Protein Gene and Its Expression in E. coli

    CHEN Ding-hu; WANG Xi-feng; LI Li; ZHOU Guang-he


    Total RNA was isolated from pokeweed (Phytolacca americana) leaves using the guanidine isothiocyanate method and used as a template to amplify the deleted mutant pokeweed antiviral protein (PAP) gene by RT-PCR; the gene was then cloned into the pGEM-T vector. Sequencing showed that the PAP gene consisted of 711 nt and was 99.6% identical to the PAP gene reported by Lin et al. (1991). An IPTG-inducible expression vector containing the PAP gene was constructed and transferred into the E. coli strain BL21(DE3)pLysS. A specific protein with a molecular weight of 26 kDa was produced after induction with 0.4 mmol/L IPTG. Double diffusion on agar plates and western blotting showed that the protein produced in E. coli was highly identical with the PAP extracted by a Frenchman from French pokeweed leaves. These results indicate that the PAP gene was successfully obtained and correctly expressed in E. coli.

  19. cDNA cloning, sequence analysis, and chromosomal localization of the gene for human carnitine palmitoyltransferase

    Finocchiaro, G.; Taroni, F.; Martin, A.L.; Colombo, I.; Tarelli, G.T.; DiDonato, S. (Istituto Nazionale Neurologico C. Besta, Milan (Italy)); Rocchi, M. (Istituto G. Gaslini, Genoa (Italy))


    The authors have cloned and sequenced a cDNA encoding human liver carnitine palmitoyltransferase (CPTase), an inner mitochondrial membrane enzyme that plays a major role in the fatty acid oxidation pathway. Mixed oligonucleotide primers whose sequences were deduced from one tryptic peptide obtained from purified CPTase were used in a polymerase chain reaction, allowing the amplification of a 0.12-kilobase fragment of human genomic DNA encoding such a peptide. A 60-base-pair (bp) oligonucleotide synthesized on the basis of the sequence from this fragment was used for the screening of a cDNA library from human liver and hybridized to a cDNA insert of 2255 bp. This cDNA contains an open reading frame of 1974 bp that encodes a protein of 658 amino acid residues including 25 residues of an NH2-terminal leader peptide. The assignment of this open reading frame to human liver CPTase is confirmed by matches to seven different amino acid sequences of tryptic peptides derived from pure human CPTase and by the 82.2% homology with the amino acid sequence of rat CPTase. The NH2-terminal region of CPTase contains a leucine-proline motif that is shared by carnitine acetyl- and octanoyltransferases and by choline acetyltransferase. The gene encoding CPTase was assigned to human chromosome 1, region 1q12-1pter, by hybridization of CPTase cDNA with a DNA panel of 19 human-hamster somatic cell hybrids.

  20. Analyses of the Genus Aschersonia and Related Genera Using 28S rDNA RFLP Technique

    邱君志; 陈莹; 马慧斐


    Eighteen isolates of entomopathogenic fungi, including Aschersonia, Lecanicillium and Beauveria, were analyzed by restriction fragment length polymorphism (RFLP) analysis of a 28S rRNA gene fragment. The data obtained with four different endonucleases were used to construct phenograms based on the UPGMA algorithm. Polymorphisms were found between genera and species, with fewer among isolates within species. PCR-RFLP within the 28S rRNA gene was sufficient to distinguish between Aschersonia species. RFLP data showed genetic diversity in Lecanicillium isolates at an infraspecific level. The UPGMA analysis indicated a closer taxonomic affinity of Lecanicillium to Beauveria than to Aschersonia.

  1. Detecting functional divergence after gene duplication through evolutionary changes in posttranslational regulatory sequences.

    Nguyen Ba, Alex N; Strome, Bob; Hua, Jun Jie; Desmond, Jonathan; Gagnon-Arsenault, Isabelle; Weiss, Eric L; Landry, Christian R; Moses, Alan M


    Gene duplication is an important evolutionary mechanism that can result in functional divergence in paralogs due to neo-functionalization or sub-functionalization. Consistent with functional divergence after gene duplication, recent studies have shown accelerated evolution in retained paralogs. However, little is known in general about the impact of this accelerated evolution on the molecular functions of retained paralogs. For example, do new functions typically involve changes in enzymatic activities, or changes in protein regulation? Here we study the evolution of posttranslational regulation by examining the evolution of important regulatory sequences (short linear motifs) in retained duplicates created by the whole-genome duplication in budding yeast. To do so, we identified short linear motifs whose evolutionary constraint has relaxed after gene duplication with a likelihood-ratio test that can account for heterogeneity in the evolutionary process by using a non-central chi-squared null distribution. We find that short linear motifs are more likely to show changes in evolutionary constraints in retained duplicates compared to single-copy genes. We examine changes in constraints on known regulatory sequences and show that for the Rck1/Rck2, Fkh1/Fkh2, Ace2/Swi5 paralogs, they are associated with previously characterized differences in posttranslational regulation. Finally, we experimentally confirm our prediction that for the Ace2/Swi5 paralogs, Cbk1 regulated localization was lost along the lineage leading to SWI5 after gene duplication. Our analysis suggests that changes in posttranslational regulation mediated by short regulatory motifs systematically contribute to functional divergence after gene duplication.

  2. Gene Expression Profiling of Development and Anthocyanin Accumulation in Kiwifruit (Actinidia chinensis) Based on Transcriptome Sequencing.

    Wenbin Li

    Full Text Available Red-fleshed kiwifruit (Actinidia chinensis Planch. 'Hongyang') is a promising commercial cultivar due to its nutritious value and unique flesh color, derived from vitamin C and anthocyanins. In this study, we obtained transcriptome data of 'Hongyang' from seven developmental stages using Illumina sequencing. We mapped 39-54 million reads to the recently sequenced kiwifruit genome and other databases to define gene structure, to analyze alternative splicing, and to quantify gene transcript abundance at different developmental stages. The transcript profiles throughout red kiwifruit development were constructed and analyzed, with a focus on the biosynthesis and metabolism of compounds such as phytohormones, sugars, starch and L-ascorbic acid, which are indispensable for the development and formation of quality fruit. Candidate genes for these pathways were identified through MapMan and phylogenetic analysis. The transcript levels of genes involved in sucrose and starch metabolism were consistent with the change in soluble sugar and starch content throughout kiwifruit development. The metabolism of L-ascorbic acid was very active, primarily through the L-galactose pathway. The genes responsible for the accumulation of anthocyanin in red kiwifruit were identified, and their expression levels were investigated during kiwifruit development. This survey of gene expression during kiwifruit development paves the way for further investigation of the development of this uniquely colored and nutritious fruit and reveals which factors are needed for high quality fruit formation. These transcriptome data and their analysis will be useful for improving kiwifruit genome annotation, for basic fruit molecular biology research, and for kiwifruit breeding and improvement.

  3. [Phylogenetic and Bioinformatics Analysis of Replicase Gene Sequence of Cucumber Green Mottle Mosaic Virus].

    Liang, Chaoqiong; Meng, Yan; Luo, Laixin; Liu, Pengfei; Li, Jianqiang


    The replicase genes of five isolates of Cucumber green mottle mosaic virus (CGMMV) from Jiangsu, Zhejiang, Hunan and Beijing were amplified, sequenced and analyzed. The nucleotide sequence similarities of the 129 kD and 57 kD replicase genes among CGMMV-No. 1, CGMMV-No. 2, CGMMV-No. 3, CGMMV-No. 4 and CGMMV-No. 5 were 99.64% and 99.74%, respectively. The similarities of the 129 kD and 57 kD replicase genes among CGMMV-No. 1, CGMMV-No. 3 and CGMMV-No. 4 were 99.95% and 99.94%, while they were lower between CGMMV-No. 2 and the other four sequences, ranging from 99.16% to 99.27% and from 99.04% to 99.18%. All reference sequences could be divided into six groups in neighbor-joining (NJ) phylogenetic trees based on the 129 kD and 57 kD replicase gene sequences, respectively. CGMMV-No. 1, CGMMV-No. 3 and CGMMV-No. 4 clustered with the Shandong isolate (Accession No. KJ754195) in both NJ trees; CGMMV-No. 5 clustered with the Liaoning isolate (Accession No. EF611826) in both NJ trees; CGMMV-No. 2 clustered with the Korea watermelon isolate (Accession No. AF417242) in the phylogenetic tree of the 129 kD replicase gene; interestingly, CGMMV-No. 2 was classified as an independent group in the phylogenetic tree of the 57 kD replicase gene. There were no significant hydrophobic or highly coiled-coil regions in the 129 kD and 57 kD proteins of the tested CGMMV isolates. Except for the 129 kD protein of CGMMV-No. 4, the rest were unstable proteins. The numbers of transmembrane helical segments (TMHs) in the 129 kD protein of CGMMV-No. 1, CGMMV-No. 2, CGMMV-No. 3 and CGMMV-No. 5 were 6, 6, 2 and 4, respectively, and 13, 13 and 5 in the 57 kD protein of CGMMV-No. 2, CGMMV-No. 4 and CGMMV-No. 5. The numbers of glycosylation sites in the 129 kD protein of the tested CGMMV isolates were 2, 4, 4, 4 and 4, and those in the 57 kD protein were 2, 5, 2, 5 and 2. There were differences between the disorders, globulins, phosphorylation sites and B cell antigen epitopes of 129 kD and 57

  4. Molecular cloning and sequence analysis of prion protein gene in Xiji donkey in China.

    Zhang, Zhuming; Wang, Renli; Xu, Lihua; Yuan, Fangzhong; Zhou, Xiangmei; Yang, Lifeng; Yin, Xiaomin; Xu, Binrui; Zhao, Deming


    Prion diseases are a group of human and animal neurodegenerative disorders caused by the deposition of an abnormal isoform of the prion protein (PrP(Sc)) encoded by a single-copy prion protein gene (PRNP). Prion disease has been reported in many herbivores but not in Equus, and the species barrier might play a role in the resistance of these species to the disease. Therefore, analysis of the prion protein (PrP) genotype in these species may help in understanding the transmission of the disease. Xiji donkey is a rare species of Equus not widely reared in Ningxia, China, for service, food and medicine, but its PRNP has not been studied. Based on the reported PrP sequence in GenBank, we designed primers and amplified, cloned and sequenced the PRNP of Xiji donkey. The sequence analysis showed that the Xiji donkey PRNP consisted of an open reading frame of 768 nucleotides encoding 256 amino acids. Amino acid residues unique to donkey as compared with some Equus animals, mink, cow, sheep, human, dog, sika deer, rabbit and hamster were identified. The results showed that the amino acid sequence of Xiji donkey PrP starts with the consensus sequence MVKSH and is almost identical to the PrP of the other Equus species in this study. Amino acid sequence analysis showed high identity within species and a close relation to the PRNP of sika deer, sheep, dog, camel, cow, mink, rabbit and hamster, with 83.1-99.7% identity. The results provide PRNP data for an additional Equus species, which should be useful for the study of prion disease pathogenesis, resistance and cross-species transmission.

  5. Allelic Diversity and Population Structure in Oenococcus oeni as Determined from Sequence Analysis of Housekeeping Genes

    de las Rivas, Blanca; Marcobal, Ángela; Muñoz, Rosario


    Oenococcus oeni is the organism of choice for promoting malolactic fermentation in wine. The population biology of O. oeni is poorly understood. For a better understanding of the mode of genetic variation within this species, we used multilocus sequence typing (MLST) with the gyrB, pgm, ddl, recP, and mleA genes to investigate the genetic diversity and genetic relationships among 18 O. oeni strains isolated in various years from wines of the United States, France, Germany, Spain, and Italy. These strains have also been characterized by ribotyping and restriction fragment length polymorphism (RFLP) analysis of the PCR-amplified 16S-23S rRNA gene intergenic spacer region (ISR). Ribotyping grouped the strains into two groups; however, the RFLP analysis of the ISRs showed no differences in the strains analyzed. In contrast, MLST in oenococci had a good discriminatory ability, and we found a higher genetic diversity than indicated by ribotyping analysis. All sequence types were represented by a single strain, and all the strains could be distinguished from each other because they had unique combinations of alleles. Strains assumed to be identical showed the same sequence type. Phylogenetic analyses indicated a panmictic population structure in O. oeni. Sequences were analyzed for evidence of recombination by split decomposition analysis and analysis of clustered polymorphisms. All results indicated that recombination plays a major role in creating the genetic heterogeneity of O. oeni. A low standardized index of association value indicated that the O. oeni genes analyzed are close to linkage equilibrium. This study constitutes the first step in the development of an MLST method for O. oeni and the first example of the application of MLST to a nonpathogenic food production bacterium. PMID:15574919
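
    The idea that a sequence type is defined by a unique combination of alleles can be shown with a small sketch. It assumes each strain has already been reduced to one allele number per locus; the function name `assign_sequence_types`, the strain labels, and the allele numbers are hypothetical and do not come from the study.

```python
def assign_sequence_types(profiles):
    """Map each strain to a sequence type (ST) defined by its allele combination.

    profiles: dict of strain name -> sequence of allele numbers, one per locus.
    Strains with an identical allele combination receive the same ST number;
    ST numbers are handed out in order of first appearance.
    """
    st_by_profile = {}
    st_by_strain = {}
    for strain, alleles in profiles.items():
        key = tuple(alleles)
        if key not in st_by_profile:
            st_by_profile[key] = len(st_by_profile) + 1
        st_by_strain[strain] = st_by_profile[key]
    return st_by_strain


# The five loci named in the abstract; the allele numbers below are invented.
LOCI = ("gyrB", "pgm", "ddl", "recP", "mleA")
toy_profiles = {
    "strain1": (1, 1, 2, 1, 3),
    "strain2": (1, 2, 2, 1, 3),  # differs from strain1 at pgm only
    "strain3": (1, 1, 2, 1, 3),  # same profile as strain1
}
sts = assign_sequence_types(toy_profiles)
```

    In this toy input, strain1 and strain3 share a profile and therefore an ST, while a single differing allele is enough to give strain2 its own ST.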

  6. Probing the effect of promoters on noise in gene expression using thousands of designed sequences.

    Sharon, Eilon; van Dijk, David; Kalma, Yael; Keren, Leeat; Manor, Ohad; Yakhini, Zohar; Segal, Eran


    Genetically identical cells exhibit large variability (noise) in gene expression, with important consequences for cellular function. Although the amount of noise decreases with and is thus partly determined by the mean expression level, the extent to which different promoter sequences can deviate away from this trend is not fully known. Here, we present a high-throughput method for measuring promoter-driven noise for thousands of designed synthetic promoters in parallel. We use it to investigate how promoters encode different noise levels and find that the noise levels of promoters with similar mean expression levels can vary more than one order of magnitude, with nucleosome-disfavoring sequences resulting in lower noise and more transcription factor binding sites resulting in higher noise. We propose a kinetic model of gene expression that takes into account the nonspecific DNA binding and one-dimensional sliding along the DNA, which occurs when transcription factors search for their target sites. We show that this assumption can improve the prediction of the mean-independent component of expression noise for our designed promoter sequences, suggesting that a transcription factor target search may affect gene expression noise. Consistent with our findings in designed promoters, we find that binding-site multiplicity in native promoters is associated with higher expression noise. Overall, our results demonstrate that small changes in promoter DNA sequence can tune noise levels in a manner that is predictable and partly decoupled from effects on the mean expression levels. These insights may assist in designing promoters with desired noise levels.
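
    The abstract does not say how noise is quantified. A commonly used measure is the squared coefficient of variation (variance divided by the squared mean), and the sketch below assumes that definition purely for illustration; the function name `expression_noise` and the expression values are hypothetical.

```python
from statistics import fmean, pvariance


def expression_noise(levels):
    """Return the squared coefficient of variation of per-cell expression levels.

    Assumption: noise = variance / mean**2, which may differ from the
    measure used in the study.
    """
    mean = fmean(levels)
    if mean == 0:
        raise ValueError("mean expression is zero; noise is undefined")
    return pvariance(levels, mu=mean) / mean ** 2


# Two invented promoters with the same mean expression but different spread.
tight_promoter = [9.0, 10.0, 11.0]
loose_promoter = [5.0, 10.0, 15.0]
tight_noise = expression_noise(tight_promoter)
loose_noise = expression_noise(loose_promoter)
```

    Both toy promoters have a mean of 10, so any difference in the computed value reflects spread alone, which mirrors the abstract's point that promoters with similar mean expression can differ in noise.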

  7. Revised Mimivirus major capsid protein sequence reveals intron-containing gene structure and extra domain

    Suzan-Monti Marie


    Full Text Available Abstract Background: Acanthamoebae polyphaga Mimivirus (APM) is the largest known dsDNA virus. The viral particle has a nearly icosahedral structure with an internal capsid shell surrounded by a dense layer of fibrils. A Capsid protein sequence, D13L, was deduced from the APM L425 coding gene and was shown to be the most abundant protein found within the viral particle. However, this protein remained poorly characterised until now. A revised protein sequence deposited in a database suggested an additional N-terminal stretch of 142 amino acids missing from the original deduced sequence. This result led us to investigate the L425 gene structure and the biochemical properties of the complete APM major Capsid protein. Results: This study describes the full-length 3430 bp Capsid coding gene and characterises the corresponding 593-amino-acid Capsid protein 1. The recombinant full-length protein allowed the production of a specific monoclonal antibody able to detect the Capsid protein 1 within the viral particle. This protein appeared to be post-translationally modified by glycosylation and phosphorylation. We propose a secondary structure prediction of APM Capsid protein 1 compared to the Capsid protein structure of Paramecium Bursaria Chlorella Virus 1, another member of the Nucleo-Cytoplasmic Large DNA virus family. Conclusion: The characterisation of the full-length L425 Capsid coding gene of Acanthamoebae polyphaga Mimivirus provides new insights into the structure of the main Capsid protein. The production of a full-length recombinant protein will be useful for further structural studies.

  8. Analysis of the sequence variations in the Mhc DRB1-like gene of the endangered Humboldt penguin (Spheniscus humboldti).

    Kikkawa, Eri F; Tsuda, Tomi T; Naruse, Taeko K; Sumiyama, Daisuke; Fukuda, Michio; Kurita, Masanori; Murata, Koichi; Wilson, Rory P; LeMaho, Yvon; Tsuda, Michio; Kulski, Jerzy K; Inoko, Hidetoshi


    The Major Histocompatibility Complex (Mhc) genomic region of many vertebrates is known to contain at least one highly polymorphic class II gene that is homologous in sequence to one or other of the human Mhc DRB1 class II genes. The diversity of the avian Mhc class II gene sequences has been extensively studied in chickens, quails, and some songbirds, but has been largely ignored in the oceanic birds, including the flightless penguins. We have previously reported that several penguin species have a high degree of polymorphism in exon 2 of the Mhc class II DRB1-like gene. In this study, we present for the first time the complete nucleotide sequences of exon 2, intron 2, and exon 3 of the DRB1-like gene of 20 Humboldt penguins, a species that is presently vulnerable to extinction. The Humboldt DRB1-like nucleotide and amino acid sequences reveal at least eight unique alleles. Phylogenetic analysis of all the available avian DRB-like sequences showed that, of five penguin species and nine other bird species, the sequences of the Humboldt penguins grouped most closely to the Little penguin and the mallard, respectively. The present analysis confirms that the sequence variations of the Mhc class II gene, DRB1, are useful for discriminating among individuals within the same penguin population as well as those within different penguin population groups and species.

  9. Divergence in cis-regulatory sequences surrounding the opsin gene arrays of African cichlid fishes

    Streelman J Todd


    Full Text Available Abstract Background: Divergence within cis-regulatory sequences may contribute to the adaptive evolution of gene expression, but functional alleles in these regions are difficult to identify without abundant genomic resources. Among African cichlid fishes, the differential expression of seven opsin genes has produced adaptive differences in visual sensitivity. Quantitative genetic analysis suggests that cis-regulatory alleles near the SWS2-LWS opsins may contribute to this variation. Here, we sequence BACs containing the opsin genes of two cichlids, Oreochromis niloticus and Metriaclima zebra. We use phylogenetic footprinting and shadowing to examine divergence in conserved non-coding elements, promoter sequences, and 3'-UTRs surrounding each opsin in search of candidate cis-regulatory sequences that influence cichlid opsin expression. Results: We identified 20 conserved non-coding elements surrounding the opsins of cichlids and other teleosts, including one known enhancer and a retinal microRNA. Most conserved elements contained computationally-predicted binding sites that correspond to transcription factors that function in vertebrate opsin expression; O. niloticus and M. zebra were significantly divergent in two of these. Similarly, we found a large number of relevant transcription factor binding sites within each opsin's proximal promoter, and identified five opsins that were considerably divergent in both expression and the number of transcription factor binding sites shared between O. niloticus and M. zebra. We also found several microRNA target sites within the 3'-UTR of each opsin, including two 3'-UTRs that differ significantly between O. niloticus and M. zebra. Finally, we examined interspecific divergence among 18 phenotypically diverse cichlids from Lake Malawi for one conserved non-coding element, two 3'-UTRs, and five opsin proximal promoters. We found that all regions were highly conserved with some evidence of CRX transcription

  10. Clinical Next-Generation Sequencing Pipeline Outperforms a Combined Approach Using Sanger Sequencing and Multiplex Ligation-Dependent Probe Amplification in Targeted Gene Panel Analysis.

    Schenkel, Laila C; Kerkhof, Jennifer; Stuart, Alan; Reilly, Jack; Eng, Barry; Woodside, Crystal; Levstik, Alexander; Howlett, Christopher J; Rupar, Anthony C; Knoll, Joan H M; Ainsworth, Peter; Waye, John S; Sadikovic, Bekim


    Advances in next-generation sequencing (NGS) have facilitated parallel analysis of multiple genes enabling the implementation of cost-effective, rapid, and high-throughput methods for the molecular diagnosis of multiple genetic conditions, including the identification of BRCA1 and BRCA2 mutations in high-risk patients for hereditary breast and ovarian cancer. We clinically validated a NGS pipeline designed to replace Sanger sequencing and multiplex ligation-dependent probe amplification analysis and to facilitate detection of sequence and copy number alterations in a single test focusing on a BRCA1/BRCA2 gene analysis panel. Our custom capture library covers 46 exons, including BRCA1 exons 2, 3, and 5 to 24 and BRCA2 exons 2 to 27, with 20 nucleotides of intronic regions both 5' and 3' of each exon. We analyzed 402 retrospective patients, with previous Sanger sequencing and multiplex ligation-dependent probe amplification results, and 240 clinical prospective patients. One-hundred eighty-three unique variants, including sequence and copy number variants, were detected in the retrospective (n = 95) and prospective (n = 88) cohorts. This standardized NGS pipeline demonstrated 100% sensitivity and 100% specificity, uniformity, and high-depth nucleotide coverage per sample (approximately 7000 reads per nucleotide). Subsequently, the NGS pipeline was applied to the analysis of larger gene panels, which have shown similar uniformity, sample-to-sample reproducibility in coverage distribution, and sensitivity and specificity for detection of sequence and copy number variants.
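
    The sensitivity and specificity figures reported in this record follow from standard confusion-matrix definitions, sketched below. The function name `sensitivity_specificity` and the example counts are hypothetical; the abstract does not give the underlying true/false positive and negative counts, so none are reproduced here.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of real variants that were detected.
    specificity = TN / (TN + FP): fraction of variant-free sites called negative.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference call")
    return tp / (tp + fn), tn / (tn + fp)


# Invented counts, chosen only to exercise the formulas.
sens, spec = sensitivity_specificity(tp=48, fn=2, tn=90, fp=10)
perfect = sensitivity_specificity(tp=50, fn=0, tn=100, fp=0)
```

    A result of 100% for both measures, as reported in the abstract, corresponds to zero false negatives and zero false positives against the reference method, as in the `perfect` case above.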

  11. Rarity of DNA sequence alterations in the promoter region of the human androgen receptor gene

    D.F. Cabral


    Full Text Available The human androgen receptor (AR) gene promoter lies in a GC-rich region containing two principal sites of transcription initiation and a putative Sp1 protein-binding site, without typical "TATA" and "CAAT" boxes. It has been suggested that mutations within the 5' untranslated region (5'UTR) may contribute to the development of prostate cancer by changing the rates of gene transcription and/or translation. To investigate this question, the present study searched for mutations or polymorphisms at the AR-5'UTR in 92 prostate cancer patients in whom a histological diagnosis of adenocarcinoma was established in specimens obtained from transurethral resection or after prostatectomy. The AR-5'UTR was amplified by PCR from genomic DNA samples of the patients and of 100 healthy male blood donors, included as controls. Conformation-sensitive gel electrophoresis was used for DNA sequence alteration screening. Only one band shift was detected, in one individual from the blood donor group. Sequencing revealed a new single nucleotide deletion (T) in the most conserved portion of the promoter region at position +36 downstream from the transcription initiation site I. Although the effect of this specific mutation remains unknown, its rarity reveals the high degree of sequence conservation of the human androgen receptor promoter region. Moreover, the absence of detectable variation within the critical 5'UTR in prostate cancer patients indicates a low probability of its involvement in prostate cancer etiology.

  12. Mutation analysis of the coding sequence of the MECP2 gene in infantile autism.

    Beyer, Kim S; Blasi, Francesca; Bacchelli, Elena; Klauck, Sabine M; Maestrini, Elena; Poustka, Annemarie


    Mutations in the coding region of the methyl-CpG-binding protein 2 (MECP2) gene cause Rett syndrome and have also been reported in a number of X-linked mental retardation syndromes. Furthermore, such mutations have recently been described in a few autistic patients. In this study, a large sample of individuals with autism was screened in order to elucidate systematically whether specific mutations in MECP2 play a role in autism. The mutation analysis of the coding sequence of the gene was performed by denaturing high-pressure liquid chromatography and direct sequencing. In total, 14 sequence variants were identified in 152 autistic patients from 134 German families and 50 unrelated patients from the International Molecular Genetic Study of Autism Consortium affected relative-pair sample. Eleven of these variants were excluded as having an aetiological role because they were silent mutations, did not cosegregate with autism in the pedigrees of the patients, or represented known polymorphisms. The relevance of the three remaining mutations to the aetiology of autism could not be ruled out, although they were not localised within functional domains of MeCP2 and may be rare polymorphisms. Taking into account the large size of our sample, we conclude that mutations in the coding region of MECP2 do not play a major role in autism susceptibility. Therefore, infantile autism and Rett syndrome probably represent two distinct entities at the molecular genetic level.

  13. Cloning and sequence characteristics of the genomic gene of a rice metallothionein


    Northern blot analysis showed that a metallothionein gene, ricMT, is expressed strongly in the stem of rice with an expression level that could be more than 100-fold stronger than in leaf blades. The results suggest that the 5'upstream region flanking the coding sequence of the ricMT may contain a fairly strong promoter. To elucidate its regulation and promoter structure, the genomic clones of ricMT were screened out from a rice genomic library and a fragment of about 4 084 bp was sequenced. The fragment included a 5'upstream region of ca. 2 970 bp, a transcription region of ca. 690 bp and a 3'downstream region of ca. 420 bp. Computer analysis of the sequence homology showed that the 5'upstream region included a putative TATA box, a putative CAAT box, and a typical metal-responsive element TGCGCGCG. The results will promote further understanding of the mechanisms of gene regulation and metal response of plant metallothionein proteins.
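    The motif search mentioned in this abstract can be illustrated with a minimal Python sketch that locates the metal-responsive element TGCGCGCG on both strands of a sequence. The toy sequence and the helper names (reverse_complement, find_motif) are hypothetical; the abstract gives neither the ricMT sequence nor the software the authors used.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_motif(seq, motif):
    """Return sorted (0-based start, strand) hits for motif on both strands.

    A '-' hit is a position where the reverse complement of the motif
    occurs in the forward-strand sequence.
    """
    seq = seq.upper()
    motif = motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


MRE = "TGCGCGCG"  # metal-responsive element named in the abstract
# Hypothetical toy sequence, NOT the real ricMT upstream region.
toy_upstream = "AATATAAAGGTGCGCGCGTTCCAATCGCGCGCAAT"

for position, strand in find_motif(toy_upstream, MRE):
    print(f"MRE at position {position} on strand {strand}")
```

    On a real upstream region the same scan would be run on the sequenced fragment; a palindromic motif would be reported once per strand at the same position.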

  14. Bioinformatic identification of microRNAs and their target genes from Solanum tuberosum expressed sequence tags


    MicroRNAs (miRNAs) are a class of non-coding RNAs that regulate gene expression post-transcriptionally in plants and animals. The low abundance of some miRNAs and their time- and tissue-specific expression patterns make experimental identification of miRNAs difficult. Here we present a bioinformatic approach for predicting novel miRNAs and their targets from expressed sequence tags (ESTs) in Solanum tuberosum. We searched the S. tuberosum EST databases with BLAST for potential miRNAs, using previously known miRNA sequences from Arabidopsis, rice and other plant species. By analyzing parameters of plant precursors, including secondary structure, stem length and conservation of miRNAs, and following a variety of filtering criteria, a total of 22 potential miRNAs were detected. Using the newly identified miRNA sequences, we then searched the S. tuberosum mRNA database with BLAST and detected 75 potential miRNA targets. According to the mRNA annotations provided by the National Center for Biotechnology Information (NCBI), most of the miRNA target genes were predicted to encode transcription factors that regulate cell growth and development, signaling, and metabolism.

  15. Sequence Alterations of I(Ks) Potassium Channel Genes in Kazakhstani Patients with Atrial Fibrillation

    Ainur Akilzhanova


    Introduction. Atrial fibrillation (AF) is the most common sustained arrhythmia, and it results in significant morbidity and mortality. However, the pathogenesis of AF remains unclear to date. Recent evidence indicates that AF is a multifactorial disease resulting from the interaction between environmental factors and genetics. Recent studies suggest that genetic mutation of the slow delayed rectifier potassium channel (I(Ks)) may underlie AF. Objective. To investigate sequence alterations of I(Ks) potassium channel genes KCNQ1, KCNE1 and KCNE2 in Kazakhstani patients with atrial fibrillation. Methods. Genomic DNA of 69 cases with atrial fibrillation and 27 relatives was analyzed for mutations in all protein-coding exons and their flanking splice site regions of the genes KCNQ1 (NM_000218.2 and NM_181798.1), KCNE1 (NM_000219.2), and KCNE2 (NM_172201.1) using bidirectional sequencing on the ABI 3730xL DNA Analyzer (Applied Biosystems, Foster City, CA, USA). Results. In total, a disease-causing mutation was identified in 39 of the 69 (56.5%) index cases. Of these, altered sequence variants in the KCNQ1 gene accounted for 14.5% of the mutations, whereas a KCNE1 mutation accounted for 43.5% of the mutations and a KCNE2 mutation accounted for 1.4% of the mutations. The majority of the distinct mutations were found in a single case (80%), whereas 20% of the mutations were observed more than once. We found two sequence variants in KCNQ1, in exon 13 (S546S, G1638A) and exon 16 (Y662Y, C1986T), in ten patients (14.5%). In the KCNE1 gene, the exon 3 mutation S59G (A280G) was observed in 30 of 69 patients (43.5%), and the KCNE2 exon 2 mutation T10K (C29A) in 1 patient (1.4%). Genetic cascade screening of 27 relatives to the 69 index cases with an identified mutation revealed 26.9% mutation carriers who were at risk of cardiac events such as syncope or sudden unexpected death. Conclusion. In this cohort of Kazakhstani index cases with AF, a disease-causing mutation was identified in
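    The percentages quoted in this abstract can be checked against the stated counts with a short Python sketch. The counts are taken from the abstract; the dictionary labels and the helper name percent are illustrative assumptions. Every percentage is computed over the 69 index cases, which matches the reported figures, although the abstract describes some of them as shares "of the mutations".

```python
def percent(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


INDEX_CASES = 69

# Counts as stated in the abstract; the labels are illustrative only.
carriers = {
    "any disease-causing mutation": 39,
    "KCNQ1 exon 13/16 variants": 10,
    "KCNE1 S59G": 30,
    "KCNE2 T10K": 1,
}

for label, count in carriers.items():
    print(f"{label}: {count}/{INDEX_CASES} = {percent(count, INDEX_CASES)}%")

per_gene_total = sum(n for label, n in carriers.items()
                     if label != "any disease-causing mutation")
print("sum of per-gene carrier counts:", per_gene_total)
```

    The per-gene counts sum to 41 rather than 39. That would be consistent with some patients carrying variants in more than one gene, but the abstract does not say so.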

  16. Opossum carboxylesterases: sequences, phylogeny and evidence for CES gene duplication events predating the marsupial-eutherian common ancestor

    Chan Jeannie


    Background: Carboxylesterases (CES) perform diverse metabolic roles in mammalian organisms in the detoxification of a broad range of drugs and xenobiotics and may also serve in specific roles in lipid, cholesterol, pheromone and lung surfactant metabolism. Five CES families have been reported in mammals with human CES1 and CES2 the most extensively studied. Here we describe the genetics, expression and phylogeny of CES isozymes in the opossum and report on the sequences and locations of CES1, CES2 and CES6 'like' genes within two gene clusters on chromosome one. We also discuss the likely sequence of gene duplication events generating multiple CES genes during vertebrate evolution. Results: We report a cDNA sequence for an opossum CES and present evidence for CES1 and CES2 like genes expressed in opossum liver and intestine and for distinct gene locations of five opossum CES genes, CES1, CES2.1, CES2.2, CES2.3 and CES6, on chromosome 1. Phylogenetic and sequence alignment studies compared the predicted amino acid sequences for opossum CES with those for human, mouse, chicken, frog, salmon and Drosophila CES gene products. Phylogenetic analyses produced congruent phylogenetic trees depicting a rapid early diversification into at least five distinct CES gene family clusters: CES2, CES1, CES7, CES3, and CES6. Molecular divergence estimates based on a Bayesian relaxed clock approach revealed an origin for the five mammalian CES gene families between 328–378 MYA. Conclusion: The deduced amino acid sequence for an opossum cDNA was consistent with its identity as a mammalian CES2 gene product (designated CES2.1). Distinct gene locations for opossum CES1 (1: 446,222,550–446,274,850), three CES2 genes (1: 677,773,395–677,927,030) and a CES6 gene (1: 677,585,520–677,730,419) were observed on chromosome 1. Opossum CES1 and multiple CES2 genes were expressed in liver and intestine. Amino acid sequences for opossum CES1 and three CES2 gene products

  17. Molecular characterisation of lineage IV peste des petits ruminants virus using multi gene sequence data.

    Kumar, K Senthil; Babu, Aravindh; Sundarapandian, G; Roy, Parimal; Thangavelu, A; Kumar, K Siva; Arumugam, R; Chandran, N D J; Muniraju, Murali; Mahapatra, Mana; Banyard, Ashley C; Manohar, B Murali; Parida, Satya


    Peste des petits ruminants is responsible for an economically important plague of small ruminants that is endemic across much of the developing world. Here we describe the detection and characterisation of a PPR virus from a recent outbreak in Tamil Nadu, India. We demonstrate the isolation of PPR virus from a rectal swab and highlight the potential spread of disease to in-contact animals through faecal materials, as well as the use of faecal material as a non-invasive method of sampling for susceptible wild ruminants. Finally, we have performed a comprehensive 'multi-gene' assessment of lineage IV isolates of PPRV utilising sequence data from our study and publicly available partial N, partial F and partial H gene data. We describe the effects of grouping PPRV isolates utilising different gene loci and conclude that the variable part of the N gene at the C terminus gives the best phylogenetic assessment of PPRV isolates, with isolates generally clustering according to geographical isolation. This assessment highlights the importance of careful gene targeting with RT-PCR to enable thorough phylogenetic analysis.

  18. Cloning, sequencing and analyzing of the heavy chain V region genes of human polyreactive antibodies



    The heavy chain variable region genes of 5 human polyreactive mAbs generated in our laboratory have been cloned and sequenced using the polymerase chain reaction (PCR) technique. We found that 2 and 3 mAbs utilized genes of the VHIV and VHⅢ families, respectively. The former 2 VH segments were in germline configuration. A common VH segment, with the best similarity of 90.1% to the published VHⅢ germline genes, was utilized by 2 different rearranged genes encoding the V regions of the other 3 mAbs. This strongly suggests that the common VH segment is an unmutated copy of an unidentified germline VHⅢ gene. All these polyreactive mAbs displayed a large NDN region (VH-D-JH junction). The entire H chain V regions of these polyreactive mAbs are unusually basic. The analysis of the charge properties of these mAbs, as well as those of other poly- and mono-reactive mAbs from the literature, prompts us to propose that charged amino acids with a particular distribution along the H chain V region, especially the binding sites (CDRs), may be an important structural feature involved in antibody polyreactivity.

  19. Novel and functional DNA sequence variants within the GATA5 gene promoter in ventricular septal defects

    Ji-Ping Shan; Xiao-Li Wang; Yuan-Gang Qiao; Hong-Xin Wan Yan; Wen-Hui Huang; Shu-Chao Pang; Bo Yan


    Background: Congenital heart disease (CHD) is the most common human birth defect. Genetic causes for CHD remain largely unknown. GATA transcription factor 5 (GATA5) is an essential regulator of heart development. Mutations in the GATA5 gene have been reported in patients with a variety of CHD. Since misregulation of gene expression has been associated with human diseases, we speculated that changed levels of the cardiac transcription factor GATA5 may mediate the development of CHD. Methods: In this study, the GATA5 gene promoter was genetically and functionally analyzed in large cohorts of patients with ventricular septal defect (VSD) (n=343) and ethnic-matched healthy controls (n=348). Results: Two novel and heterozygous DNA sequence variants (DSVs), g.61051165A>G and g.61051463delC, were identified in three VSD patients, but not in the controls. In cultured cardiomyocytes, GATA5 gene promoter activities were significantly decreased by DSV g.61051165A>G and increased by DSV g.61051463delC. Moreover, fathers of the VSD patients carrying the same DSVs had reduced diastolic function of left ventricles. Three SNPs, g.61051279C>T (rs77067995), g.61051327A>C (rs145936691) and g.61051373G>A (rs80197101), and one novel heterozygous DSV, g.61051227C>T, were found in both VSD patients and controls with similar frequencies. Conclusion: Our data suggested that the DSVs in the GATA5 gene promoter may increase the susceptibility to the development of VSD as a risk factor.

  20. Isolation and sequencing of the HMG domain of ten Sox genes from Odorrana schmackeri (Amphibia: Anura)

    Ning Wang


    Sox (SRY-related HMG-box) genes encode a family of transcriptional regulators, which are characterized by a conserved 79-amino acid domain known as the HMG-box. They play essential roles in a diverse range of processes including sex determination and the development of the central nervous system (CNS), neural crest and endoderm. In this paper, the HMG domains of ten distinct Sox gene family members (os-Sox2, os-Sox3a, os-Sox3b, os-Sox4, os-Sox11a, os-Sox11b, os-Sox14a, os-Sox14b, os-Sox21a, os-Sox21b) were isolated from both male and female Odorrana schmackeri (Boettger, 1892) using PCR, and no sexual differences were found. Molecular phylogenetic analysis of the HMG domain suggested that these ten Sox genes are members of the SoxB and SoxC groups. In addition, sequence analysis suggested that four Sox genes (os-Sox3, os-Sox11, os-Sox14, os-Sox21) were duplicated. The duplication-degeneration-complementation model may be invoked to explain the evolution and diversity of the Sox gene family in O. schmackeri.

  1. Next-generation sequencing identifies transportin 3 as the causative gene for LGMD1F.

    Annalaura Torella

    Limb-girdle muscular dystrophies (LGMD) are genetically and clinically heterogeneous conditions. We investigated a large family with an autosomal dominant transmission pattern, previously classified as LGMD1F and mapped to chromosome 7q32. Affected members show muscle weakness that first affects the pelvic girdle and the iliopsoas muscles. We sequenced the whole exome of four family members and identified a shared heterozygous frame-shift variant in the Transportin 3 (TNPO3) gene, encoding a member of the importin-β super-family. The TNPO3 gene is mapped within the LGMD1F critical interval and its 923-amino acid human gene product is also expressed in skeletal muscle. In addition, we identified an isolated case of LGMD with a new missense mutation in the same gene. We localized the mutant TNPO3 around the nucleus, but not inside. The involvement of a gene related to nuclear transport suggests a novel disease mechanism leading to muscular dystrophy.

  2. RNA sequencing analysis of human podocytes reveals glucocorticoid regulated gene networks targeting non-immune pathways

    Jiang, Lulu; Hindmarch, Charles C. T.; Rogers, Mark; Campbell, Colin; Waterfall, Christy; Coghill, Jane; Mathieson, Peter W.; Welsh, Gavin I.


    Glucocorticoids are steroids that reduce inflammation and are used as immunosuppressive drugs for many diseases. They are also the mainstay for the treatment of minimal change nephropathy (MCN), which is characterised by an absence of inflammation. Their mechanisms of action remain elusive. Evidence suggests that immunomodulatory drugs can directly act on glomerular epithelial cells or ‘podocytes’, the cell type which is the main target of injury in MCN. To understand the nature of glucocorticoid effects on non-immune cell functions, we generated RNA sequencing data from human podocyte cell lines and identified the genes that are significantly regulated in dexamethasone-treated podocytes compared to vehicle-treated cells. The upregulated genes are of functional relevance to cytoskeleton-related processes, whereas the downregulated genes mostly encode pro-inflammatory cytokines and growth factors. We observed a tendency for dexamethasone-upregulated genes to be downregulated in MCN patients. Integrative analysis revealed gene networks composed of critical signaling pathways that are likely targeted by dexamethasone in podocytes. PMID:27774996

  3. Sequencing, Expression and Diagnostic Application of the Nucleoprotein Gene of Xinjiang Hemorrhagic Fever Virus

    马本江; 杭长寿; 解燕乡; 王世文


    In order to analyze the nucleoprotein (NP) gene of Crimean-Congo hemorrhagic fever virus (CCHFV), viral RNA was amplified by RT-PCR using a proof-reading DNA polymerase to produce the complete NP gene. The PCR product was sequenced, analyzed phylogenetically and cloned into the expression vector pET32a, and the recombinant plasmid was expressed in E. coli BL-21 with high yield. The partially purified fusion protein was used to coat ELISA plates to detect antibodies. The NP gene of BA88166 was found to be highly similar to those of other XHFVs at both the nucleotide and amino acid levels, and the NP gene of BA88166 encoded a nucleoprotein of 482 amino acids with a deduced molecular weight (MW) of 54 kDa. Western blot assay showed that the fusion protein expressed in bacteria possessed good antigenicity. The ELISA results for human and animal sera collected in endemic areas were in good accordance with the clinical diagnosis. It was concluded that the NP genes of XHFV BA88166 and other XHFVs appear to be evolutionarily close. The methodologies established in this study were accurate, specific, rapid and reproducible for clinical examinations and epidemiological surveys.

  4. Understanding gene sequence variation in the context of transcription regulation in yeast.

    Irit Gat-Viks


    DNA sequence polymorphism in a regulatory protein can have a widespread transcriptional effect. Here we present a computational approach for analyzing modules of genes with a common regulation that are affected by specific DNA polymorphisms. We identify such regulatory-linkage modules by integrating genotypic and expression data for individuals in a segregating population with complementary expression data of strains mutated in a variety of regulatory proteins. Our procedure searches simultaneously for groups of co-expressed genes, for their common underlying linkage interval, and for their shared regulatory proteins. We applied the method to a cross between laboratory and wild strains of S. cerevisiae, demonstrating its ability to correctly suggest modules and to outperform extant approaches. Our results suggest that middle sporulation genes are under the control of polymorphism in the sporulation-specific tertiary complex Sum1p/Rfm1p/Hst1p. In another example, our analysis reveals novel inter-relations between Swi3 and two mitochondrial inner membrane proteins underlying variation in a module of aerobic cellular respiration genes. Overall, our findings demonstrate that this approach provides a useful framework for the systematic mapping of quantitative trait loci and their role in gene expression variation.

  5. Genes contributing to pain sensitivity in the normal population: an exome sequencing study.

    Williams, Frances M K; Scollen, Serena; Cao, Dandan; Memari, Yasin; Hyde, Craig L; Zhang, Baohong; Sidders, Benjamin; Ziemek, Daniel; Shi, Yujian; Harris, Juliette; Harrow, Ian; Dougherty, Brian; Malarstig, Anders; McEwen, Robert; Stephens, Joel C; Patel, Ketan; Menni, Cristina; Shin, So-Youn; Hodgkiss, Dylan; Surdulescu, Gabriela; He, Wen; Jin, Xin; McMahon, Stephen B; Soranzo, Nicole; John, Sally; Wang, Jun; Spector, Tim D


    Sensitivity to pain varies considerably between individuals and is known to be heritable. Increased sensitivity to experimental pain is a risk factor for developing chronic pain, a common and debilitating but poorly understood symptom. To understand mechanisms underlying pain sensitivity and to search for rare gene variants (MAF<5%) influencing pain sensitivity, we explored the genetic variation in individuals' responses to experimental pain. Quantitative sensory testing to heat pain was performed in 2,500 volunteers from TwinsUK (TUK): exome sequencing to a depth of 70× was carried out on DNA from singletons at the high and low ends of the heat pain sensitivity distribution in two separate subsamples. Thus in TUK1, 101 pain-sensitive and 102 pain-insensitive individuals were examined, while in TUK2 there were 114 and 96 individuals respectively. A combination of methods was used to test the association between rare variants and pain sensitivity, and the function of the genes identified was explored using network analysis. Using causal reasoning analysis on the genes with different patterns of SNVs by pain sensitivity status, we observed a significant enrichment of variants in genes of the angiotensin pathway (Bonferroni corrected p = 3.8×10^-4). This pathway is already implicated in animal models and human studies of pain, supporting the notion that it may provide fruitful new targets in pain management. The approach of sequencing extreme exome variation in normal individuals has provided important insights into gene networks mediating pain sensitivity in humans and will be applicable to other common complex traits.

  6. Exploration of the Brn4-regulated genes enhancing adult hippocampal neurogenesis by RNA sequencing.

    Guo, Jingjing; Cheng, Xiang; Zhang, Lei; Wang, Linmei; Mao, Yongxin; Tian, Guixiang; Xu, Wenhao; Wu, Yuhao; Ma, Zhi; Qin, Jianbing; Tian, Meiling; Jin, Guohua; Shi, Wei; Zhang, Xinhua


    Adult hippocampal neurogenesis is essential for learning and memory, and its dysfunction is involved in neurodegenerative diseases. However, the molecular mechanisms underlying adult hippocampal neurogenesis are still largely unknown. Our previous studies indicated that the transcription factor Brn4 was upregulated and promoted neuronal differentiation of neural stem cells (NSCs) in the surgically denervated hippocampus in rats. In this study, we use high-throughput RNA sequencing to explore the molecular mechanisms underlying the enhancement of adult hippocampal neurogenesis induced by lentivirus-mediated Brn4 overexpression in vivo. After 10 days of the lentivirus injection, we found that the expression levels of genes related to neuronal development and maturation were significantly increased and the expression levels of genes related to NSC maintenance were significantly decreased, indicating enhanced neurogenesis in the hippocampus after Brn4 overexpression. Through RNA sequencing, we found that 658 genes were differentially expressed in the Brn4-overexpressed hippocampi compared with GFP-overexpressed controls. Many of these differentially expressed genes are involved in NSC division and differentiation. By using quantitative real-time PCR, we validated the expression changes of three genes, including Ctbp2, Notch2, and Gli1, all of which are reported to play key roles in neuronal differentiation of NSCs. Importantly, the expression levels of Ctbp2 and Notch2 were also significantly changed in the hippocampus of Brn4 KO mice, which indicates that the expression levels of Ctbp2 and Notch2 may be directly regulated by Brn4. Our current study provides a solid foundation for further investigation and identifies Ctbp2 and Notch2 as possible downstream targets of Brn4. © 2017 Wiley Periodicals, Inc.

  7. Normalization of transposon-mutant library sequencing datasets to improve identification of conditionally essential genes.

    DeJesus, Michael A; Ioerger, Thomas R


    Sequencing of transposon-mutant libraries using next-generation sequencing (TnSeq) has become a popular method for determining which genes and non-coding regions are essential for growth under various conditions in bacteria. For methods that rely on quantitative comparison of counts of reads at transposon insertion sites, proper normalization of TnSeq datasets is vitally important. Real TnSeq datasets are often noisy and exhibit a significant skew that can be dominated by high counts at a small number of sites (often for non-biological reasons). If two datasets that are not appropriately normalized are compared, it might cause the artifactual appearance of Differentially Essential (DE) genes in a statistical test, constituting type I errors (false positives). In this paper, we propose a novel method for normalization of TnSeq datasets that corrects for the skew of read-count distributions by fitting them to a Beta-Geometric distribution. We show that this read-count correction procedure reduces the number of false positives when comparing replicate datasets grown under the same conditions (for which no genuine differences in essentiality are expected). We compare these results to results obtained with other normalization procedures, and show that it results in greater reduction in the number of false positives. In addition we investigate the effects of normalization on the detection of DE genes.
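    To illustrate why skewed read counts distort naive normalization, the following Python sketch compares two scaling factors on toy data. It does not implement the Beta-Geometric correction proposed in the paper, whose details are not given in the abstract; instead it uses a simpler, swapped-in technique (scaling by a trimmed mean of non-zero counts). The replicate data, function names and the 10% trim fraction are all hypothetical.

```python
def mean_nonzero(counts):
    """Mean read count over sites with at least one read."""
    nonzero = [c for c in counts if c > 0]
    return sum(nonzero) / len(nonzero)


def trimmed_mean_nonzero(counts, trim_fraction=0.1):
    """Mean of non-zero counts after dropping the lowest and highest
    trim_fraction of sites, which limits the influence of outlier sites."""
    nonzero = sorted(c for c in counts if c > 0)
    k = int(len(nonzero) * trim_fraction)
    kept = nonzero[k:len(nonzero) - k] if k > 0 else nonzero
    return sum(kept) / len(kept)


def scale_factor(counts, reference, estimator):
    """Factor that rescales counts so its estimator matches the reference."""
    return estimator(reference) / estimator(counts)


# Hypothetical replicates: identical except for one site with a huge count.
replicate_a = [0, 10, 12, 9, 11, 0, 10, 8, 12, 10] * 2
replicate_b = list(replicate_a)
replicate_b[1] = 5000  # single spurious high-count insertion site

naive = scale_factor(replicate_b, replicate_a, mean_nonzero)
robust = scale_factor(replicate_b, replicate_a, trimmed_mean_nonzero)
print(f"naive scale factor:  {naive:.3f}")
print(f"robust scale factor: {robust:.3f}")
```

    In this toy case the naive factor shrinks every site in the second replicate by roughly 30-fold because of a single outlier, which is the kind of artifact that could produce false differences between replicates, whereas the trimmed estimate stays close to 1.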

  8. Agouti signalling protein (ASIP) gene: molecular cloning, sequence characterisation and tissue distribution in domestic goose.

    Zhang, J; Wang, C; Liu, Y; Liu, J; Wang, H Y; Liu, A F; He, D Q


    Agouti signalling protein (ASIP) is an endogenous antagonist of melanocortin-1 receptor (MC1R) and is involved in the regulation of pigmentation in mammals. The objective of this study was to identify and characterise the ASIP gene in domestic goose. The goose ASIP cDNA consisted of a 44-nucleotide 5'-terminal untranslated region (UTR), a 390-nucleotide open-reading frame (ORF) and a 45-nucleotide 3'-UTR. The length of goose ASIP genomic DNA was 6176 bp, including three coding exons and two introns. Bioinformatic analysis indicated that the ORF encodes a protein of 130 amino-acid residues with a molecular weight of 14.88 kDa and an isoelectric point of 9.73. Multiple sequence alignments and phylogenetic analysis showed that the amino-acid sequence of ASIP was conserved in vertebrates, especially in the avian species. RT-qPCR showed that the goose ASIP mRNA was differentially expressed in the pigment deposition tissues, including eye, foot, feather follicle, skin of the back, as well as in skin of the abdomen. The expression level of the ASIP gene in skin of the abdomen was higher than that in skin of the back. Those findings will contribute to further understanding the functions of the ASIP gene in geese plumage colouring.

  9. Sequence divergence in two tandemly located pilin genes of Eikenella corrodens.

    Tønjum, T; Weir, S; Bøvre, K; Progulske-Fox, A; Marrs, C F


    Eikenella corrodens normally inhabits the human respiratory and gastrointestinal tracts but is frequently the cause of abscesses at various sites. Using the N-terminal portion of the Moraxella nonliquefaciens pilin gene as a hybridization probe, we cloned two tandemly located pilin genes of E. corrodens 31745, ecpC and ecpD, and expressed the two pilin genes separately in Escherichia coli. A comparison of the predicted amino acid sequences of E. corrodens 31745 EcpC and EcpD revealed considerable divergence between the sequences of these two pilins and even less similarity to EcpA and EcpB of E. corrodens type strain ATCC 23834. EcpC from E. corrodens 31745 displayed high degrees of homology to the pilins of Neisseria gonorrhoeae and Pseudomonas aeruginosa. EcpD from E. corrodens 31745 showed the highest homologies with the pilin of one of the three P. aeruginosa classes, whereas EcpA and EcpB of strain ATCC 23834 most closely resemble Moraxella bovis pilins. These findings raise interesting questions about potential genetic transfer between different bacterial species, as opposed to convergent evolution.

  10. Sequence analysis and prokaryotic expression of Giardia lamblia α-18 giardin gene.

    Wu, Sheng; Yu, Xingang; Abdullahi, Auwalu Yusuf; Hu, Wei; Pan, Weida; Shi, Xianli; Tan, Liping; Song, Meiran; Li, Guoqing


    To study the genetic variation and prokaryotic expression of the α18 giardin gene of Giardia lamblia zoonotic assemblage A and host-specific assemblage F, the α18 genes were amplified from G. lamblia assemblages A and F by PCR and sequenced. The PCR product was cloned into the prokaryotic expression vector pET-28a(+) and the positive recombinant plasmid was transformed into the Escherichia coli Rosetta (DE3) strain for expression. The expressed α18 giardin fusion protein was validated by SDS-PAGE and Western blot analysis, and purified with Ni-Agarose resin. The putative amino acid sequence of α18 giardin was analyzed with bioinformatics software. Results showed that the α18 giardin gene was 861 bp in length, encoding 286 amino acids; it was 100% homologous between human-derived and dog-derived G. lamblia assemblage A, but 86.8% homologous with G. lamblia assemblage F (cat-derived). Giardin α18 was about 36 kDa in molecular weight, with good reactivity. In silico analyses predicted that it is hydrophobic, lacks a signal peptide and transmembrane domain, and contains 11 alpha regions, 13 beta sheets, 1 beta turn and 7 random coils in its secondary structure. This information lays the foundation for research on the subcellular localization and biological function of α18 giardin in G. lamblia.
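    The relation between the reported gene length and protein length can be verified with a small Python sketch. It assumes that the 861 bp figure includes the stop codon, which the abstract does not state explicitly; the function name residues_from_orf is illustrative.

```python
def residues_from_orf(orf_length_nt, includes_stop_codon=True):
    """Number of amino acid residues encoded by an ORF of the given length.

    Assumes a complete, in-frame ORF; the stop codon, if counted in the
    length, does not encode a residue.
    """
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_length_nt // 3
    return codons - 1 if includes_stop_codon else codons


# 861 nt = 287 codons, one of which is assumed to be the stop codon.
print(residues_from_orf(861))
```

    Under that assumption, 861 nt gives 287 codons and therefore 286 residues, in agreement with the abstract.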

  11. Comparison of rpoB gene sequencing, 16S rRNA gene sequencing, gyrB multiplex PCR, and the VITEK2 system for identification of Acinetobacter clinical isolates.

    Lee, Min Jung; Jang, Sook Jin; Li, Xue Min; Park, Geon; Kook, Joong-Ki; Kim, Min Jung; Chang, Young-Hyo; Shin, Jong Hee; Kim, Soo Hyun; Kim, Dong-Min; Kang, Seong-Ho; Moon, Dae-Soo


    Since accurate identification of species is necessary for proper treatment of Acinetobacter infections, we compared the performances of 4 bacterial identification methods using 167 Acinetobacter clinical isolates to determine which method performs best. To secure more non-baumannii Acinetobacter (NBA) strains as target strains, we first identified Acinetobacter baumannii in a total of 495 Acinetobacter clinical isolates identified using the VITEK 2 system. Because 371 of 495 strains were identified as A. baumannii using gyrB multiplex 1 PCR and blaOXA51-like PCR, we performed rpoB gene sequencing and 16S rRNA gene sequencing on the remaining 124 strains belonging to NBA and 52 strains of A. baumannii. For identification of Acinetobacter at the species level, the accuracy rates of rpoB gene sequencing, 16S rRNA gene sequencing, gyrB multiplex PCR, and the VITEK 2 were 98.2%, 93.4%, 77.2%, and 35.9%, respectively. The gyrB multiplex PCR seems to be very useful for the detection of the ACB complex because its concordance rate with the final identification of strains of the ACB complex was 100%. Both rpoB gene sequencing and 16S rRNA gene sequencing may be useful in identifying Acinetobacter.

  12. Tetrachloroethene Dehalogenase from Dehalospirillum multivorans: Cloning, Sequencing of the Encoding Genes, and Expression of the pceA Gene in Escherichia coli

    Neumann, Anke; Wohlfarth, Gert; Diekert, Gabriele


    The genes encoding tetrachloroethene reductive dehalogenase, a corrinoid-Fe/S protein, of Dehalospirillum multivorans were cloned and sequenced. The pceA gene is upstream of pceB and overlaps it by 4 bp. The presence of a σ70-like promoter sequence upstream of pceA and of a ρ-independent terminator downstream of pceB indicated that both genes are cotranscribed. This assumption is supported by reverse transcriptase PCR data. The pceA and pceB genes encode putative 501- and 74-amino-acid proteins, respectively, with calculated molecular masses of 55,887 and 8,354 Da, respectively. Four peptides obtained after trypsin treatment of tetrachloroethene (PCE) dehalogenase were found in the deduced amino acid sequence of pceA. The N-terminal amino acid sequence of the PCE dehalogenase isolated from D. multivorans was found 30 amino acids downstream of the N terminus of the deduced pceA product. The pceA gene contained a nucleotide stretch highly similar to binding motifs for two Fe4S4 clusters or for one Fe4S4 cluster and one Fe3S4 cluster. A consensus sequence for the binding of a corrinoid was not found in pceA. No significant similarities to genes in the databases were detected in sequence comparisons. The pceB gene contained two membrane-spanning helices as indicated by two hydrophobic stretches in the hydropathic plot. Sequence comparisons of pceB revealed no sequence similarities to genes present in the databases. Only in the presence of pUBS 520 supplying the recombinant bacteria with high levels of the rare Escherichia coli tRNA4Arg was pceA expressed, albeit nonfunctionally, in recombinant E. coli BL21 (DE3). PMID:9696761

  13. Barcode Sequencing Screen Identifies SUB1 as a Regulator of Yeast Pheromone Inducible Genes

    Anna Sliva


    The yeast pheromone response pathway serves as a valuable model of eukaryotic mitogen-activated protein kinase (MAPK) pathways, and transcription of their downstream targets. Here, we describe application of a screening method combining two technologies: fluorescence-activated cell sorting (FACS), and barcode analysis by sequencing (Bar-Seq). Using this screening method, and pFUS1-GFP as a reporter for MAPK pathway activation, we readily identified mutants in known mating pathway components. In this study, we also include a comprehensive analysis of the FUS1 induction properties of known mating pathway mutants by flow cytometry, featuring single cell analysis of each mutant population. We also characterized a new source of false positives resulting from the design of this screen. Additionally, we identified a deletion mutant, sub1Δ, with increased basal expression of pFUS1-GFP. Here, in the first ChIP-Seq of Sub1, our data show that Sub1 binds to the promoters of about half the genes in the genome (tripling the 991 loci previously reported), including the promoters of several pheromone-inducible genes, some of which show an increase upon pheromone induction. Here, we also present the first RNA-Seq of a sub1Δ mutant; the majority of genes have no change in RNA, but, of the small subset that do, most show decreased expression, consistent with biochemical studies implicating Sub1 as a positive transcriptional regulator. The RNA-Seq data also show that certain pheromone-inducible genes are induced less in the sub1Δ mutant relative to the wild type, supporting a role for Sub1 in regulation of mating pathway genes. The sub1Δ mutant has increased basal levels of a small subset of other genes besides FUS1, including IMD2 and FIG1, a gene encoding an integral membrane protein necessary for efficient mating.

  14. Targeted next generation sequencing reveals a novel intragenic deletion of the TPO gene in a family with intellectual disability

    Iqbal, Z.; Neveling, K.; Razzaq, A.; Shahzad, M.; Zahoor, M.Y.; Qasim, M.; Gilissen, C.; Wieskamp, N.; Kwint, M.P.; Gijsen, S.; Brouwer, A.P. de; Veltman, J.A.; Riazuddin, S.; Bokhoven, J.H.L.M. van


    BACKGROUNDS AND AIMS: Next generation sequencing (NGS) approaches have revolutionized the identification of mutations underlying genetic disorders. This technology is particularly useful for the identification of mutations in known and new genes for conditions with extensive genetic heterogeneity. I

  15. Cloning and sequencing of the trpE gene from Arthrobacter globiformis ATCC 8010 and several related subsurface Arthrobacter isolates

    Chernova, T.; Viswanathan, V.K.; Austria, N.; Nichols, B.P.


    Tryptophan-dependent mutants of Arthrobacter globiformis ATCC 8010 were isolated and trp genes were cloned by complementation and marker rescue of the auxotrophic strains. Rescue studies and preliminary sequence analysis reveal that at least the genes trpE, trpC, and trpB are clustered together in this organism. In addition, sequence analysis of the entire trpE gene, which encodes component I of anthranilate synthase, is described. Segments of the trpE gene from 17 subsurface isolates of Arthrobacter sp. were amplified by PCR and sequenced. The partial trpE sequences from the various strains were aligned and subjected to phylogenetic analysis. The data suggest that in addition to single base changes, recombination and genetic exchange play a major role in the evolution of the Arthrobacter genome.

  16. DNA sequence templates adjacent nucleosome and ORC sites at gene amplification origins in Drosophila.

    Liu, Jun; Zimmer, Kurt; Rusch, Douglas B; Paranjape, Neha; Podicheti, Ram; Tang, Haixu; Calvi, Brian R


    Eukaryotic origins of DNA replication are bound by the origin recognition complex (ORC), which scaffolds assembly of a pre-replicative complex (pre-RC) that is then activated to initiate replication. Both pre-RC assembly and activation are strongly influenced by developmental changes to the epigenome, but molecular mechanisms remain incompletely defined. We have been examining the activation of origins responsible for developmental gene amplification in Drosophila. At a specific time in oogenesis, somatic follicle cells transition from genomic replication to a locus-specific replication from six amplicon origins. Previous evidence indicated that these amplicon origins are activated by nucleosome acetylation, but how this affects origin chromatin is unknown. Here, we examine nucleosome position in follicle cells using micrococcal nuclease digestion with Illumina sequencing. The results indicate that ORC binding sites and other essential origin sequences are nucleosome-depleted regions (NDRs). Nucleosome position at the amplicons was highly similar among developmental stages during which ORC is or is not bound, indicating that being an NDR is not sufficient to specify ORC binding. Importantly, the data suggest that nucleosomes and ORC have opposite preferences for DNA sequence and structure. We propose that nucleosome hyperacetylation promotes pre-RC assembly onto adjacent DNA sequences that are disfavored by nucleosomes but favored by ORC.

  17. Genetic Diversity in Populations of Sepiella maindroni Using 16S rRNA Gene Sequence Analysis


    Part of the 16S rRNA gene was amplified by PCR and sequenced for 5 populations of the common Chinese cuttlefish Sepiella maindroni: three from the South China Sea, one from the East China Sea and one from Japan. The results show that a total of 5 nucleotide positions have gaps or insertions among these individuals, and 13 positions are variable across all the sequences, which range from 494 to 509 base pairs. All of the individuals are grouped into 7 haplotypes (h1-h7). No marked genetic difference is observed among those populations. All of the individuals from Nagasaki belong to h1, and the h3 haplotype is found only in the coastal waters of China. An A-G transition at nucleotide 255 is suggested as a genetic marker to distinguish the populations distributed in the East and South China Seas from those in the Nagasaki waters of Japan.

  18. Discussion on validity of Rana maoershanensis based on partial sequence of 16S rRNA gene


    Rana maoershanensis, found on Mt. Maoershan in Guangxi, China, was reported as a new species in 2007, but there was no molecular data for this frog. The partial sequences (543 bp) of the 16S rRNA gene from 12 specimens of 3 brown frog species (Rana hanluica, R. maoershanensis and R. chensinensis) were analyzed with 17 specimens of 9 species from GenBank. The nucleotide sequence divergence between R. maoershanensis and the other brown frog species was 4.5%-6.5%, with 22-30 nucleotide substitutions at this locus. The phylogenetic relationships based on MP, ML, and Bayesian inference indicate that the brown frogs from southern China diverged into three groups (clades A, B and C). R. maoershanensis clustered together in a well-supported subclade (B-l). It is suggested that R. maoershanensis is a valid species.
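
    Divergence figures of this kind come from pairwise comparison of aligned sequences. The following is a minimal sketch, assuming an uncorrected p-distance (the abstract does not state which distance measure was used) and hypothetical toy sequences rather than the study's data. The function name `p_distance` is illustrative only.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped, so the
    result is the fraction of compared sites that differ.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


# Hypothetical 10-column alignment with two substitutions.
toy_a = "ACGTACGTAC"
toy_b = "ACGTTCGTAA"
print(f"{100 * p_distance(toy_a, toy_b):.1f}% divergence")
```

    Model-corrected distances (for example Kimura two-parameter) would give slightly larger values than this raw proportion for the same alignment.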

  19. [Cloning and sequencing of partial gene of hepatitis G virus (HGV) from Nanjing of China].

    Xu, J; Zhou, Y; Xu, J; Ju, J; Wang, X; Chen, W; Wang, H


    A partial gene of hepatitis G virus (HGV) was cloned and sequenced from the serum of a patient with post-transfusion hepatitis C in Nanjing, China, by reverse transcription-polymerase chain reaction (RT-PCR). The sequence showed 89.09%, 92.12%, 87.27%, and 93.94% nucleotide identity over the corresponding region of HGU44402, HGU45966, and HGU36380 from America and an HGV isolate from Hebei Province, China, respectively. Forty patients with post-transfusion hepatitis C and thirty patients with non-A-E hepatitis were tested for HGV RNA by RT-PCR; the HGV RNA positive rates were 10.00% and 6.67%, respectively.

  20. Analysis of unstable DNA sequence in FMR1 gene in Polish families with fragile X syndrome

    Milewski, Michal; Bal, Jerzy; Obersztyn, Ewa; Bocian, Ewa; Mazurczak, Tadeusz [Instytut Matki i Dziecka, Warsaw (Poland)]; Zygulska, Marta; Horst, Juergen [Institute of Human Genetics, Muenster (Germany)]; Deelen, Wout H.; Halley, Dicky J.J. [Erasmus Univ., Rotterdam (Netherlands)]


    The unstable DNA sequence in the FMR1 gene was analyzed in 85 individuals from Polish families with fragile X syndrome in order to characterize mutations responsible for the disease in Poland. In all affected individuals, classified on the basis of clinical features and expression of the fragile site at X(q27.3), a large expansion of the unstable sequence (full mutation) was detected. About 5% (2 of 43) of individuals with full mutation did not express the fragile site. Among normal alleles, ranging in size from 20 to 41 CGG repeats, the allele with 29 repeats was the most frequent (37%). Transmission of premutated and fully mutated alleles to the offspring was always associated with size increase. No change in repeat number was found when normal alleles were transmitted. (author). 19 refs., 4 figs, 1 tab.
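
    Allele sizes in this record are expressed as numbers of trinucleotide repeat units; the FMR1 repeat is conventionally written CGG. As a minimal sketch of how a repeat count could be read from a sequenced allele, assuming an uninterrupted repeat tract and a hypothetical toy sequence (the flanking bases and the function name `longest_cgg_run` are illustrative, not taken from the study):

```python
import re


def longest_cgg_run(sequence: str) -> int:
    """Return the number of units in the longest uninterrupted CGG run."""
    runs = re.findall(r"(?:CGG)+", sequence.upper())
    # Each matched run is a whole number of 3-base units.
    return max((len(run) // 3 for run in runs), default=0)


# Hypothetical allele: arbitrary flanking bases around 29 CGG units,
# matching the most frequent normal allele size quoted in the abstract.
toy_allele = "GCGTGA" + "CGG" * 29 + "CTGGGC"
print(longest_cgg_run(toy_allele))
```

    Real alleles can contain interrupting triplets within the repeat tract, in which case this function reports only the longest pure run rather than the total tract length.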

  1. Draft Genome Sequence of a Colistin-Resistant Klebsiella pneumoniae Clinical Strain Carrying the blaNDM-1 Carbapenemase Gene

    Yao, Zhihong; Feng, Yu; Lin, Ji


    Klebsiella pneumoniae strain WCHKP1845, recovered from the sputum of a patient with pneumonia, was resistant to colistin and carried the carbapenemase gene blaNDM-1. Here, we report its 5.4-Mb draft genome sequence, comprising 140 contigs with an average 57.33% G+C content. The genome contains 5,118 coding sequences and 88 tRNA genes. PMID:28209835
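
    The G+C content quoted for a draft assembly is the share of G and C bases across all contigs. A minimal sketch of that calculation follows, assuming the contigs are already loaded as plain strings; the toy contigs and the function name `gc_content` are hypothetical and unrelated to the WCHKP1845 assembly.

```python
def gc_content(contigs: list[str]) -> float:
    """Percentage of G and C bases across all contigs combined."""
    total_bases = sum(len(contig) for contig in contigs)
    if total_bases == 0:
        return 0.0
    gc_bases = sum(
        contig.upper().count("G") + contig.upper().count("C")
        for contig in contigs
    )
    return 100.0 * gc_bases / total_bases


# Three hypothetical contigs: 8 of the 16 bases are G or C.
toy_contigs = ["ATGCGC", "GGCCAT", "ATAT"]
print(f"{gc_content(toy_contigs):.2f}% G+C")
```

    Pooling bases across contigs weights each contig by its length; averaging per-contig percentages instead would give a different figure when contig sizes vary.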

  2. The Structure and Sequence Analysis of TLR4 Gene in Cattle

    WANG Xing-ping; LUO RENG Zhuo-ma; XU Shang-zhong; GAO Xue; LI Jun-ya; REN Hong-yan; CHEN Jin-bao


    Toll-like receptor 4 (TLR4) is essential for initiating the innate response to lipopolysaccharide (LPS) from Gram-negative bacteria by acting as a signal-transducing receptor. To help investigate TLR4 as a candidate disease-resistance gene in cows, we isolated the cDNA (GenBank accession no. DQ839566) by RT-PCR and rapid amplification of cDNA ends (RACE) experiments and analyzed the sequence characteristics by bioinformatics. The results showed that the cattle TLR4 gene, about 3,739 bp, contains an open reading frame of 2,526 bp encoding 841 amino acids (aa), a 470 bp 5' untranslated region (UTR), and a 743 bp 3' UTR. Tissue expression profiling by RT-PCR indicated that the TLR4 gene is expressed in mammary glands, liver, muscle, duodenum, fats, uterus, kidneys, hearts, lungs, pancreas, and ovary. The TLR4 protein domains predicted by bioinformatics consist of a signal peptide, a transmembrane helix domain, 3 sorts of leucine-rich repeat domains (LRR, LRR-TYP, and LRRCT), and a Toll/interleukin-1 receptor domain (TIR). The leucine-rich repeat domains are involved in recognizing a broad range of pathogen-associated molecular patterns (PAMPs) from pathogens, and the TIR domain, which mediates downstream signal transduction, was more conserved (98% identity) than the other domains after alignment of the proteins from ovine, porcine, human, and mouse. In addition, a 470 bp 5'-flanking region sequence was amplified by PCR, and 15 putative DNA binding sites were predicted, but this sequence lacks a TATA box, CCAAT box, and GC-rich regions.
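
    The lengths reported in this record can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted in the abstract; treating the open reading frame length as including the stop codon is an assumption, not something the abstract states.

```python
# Lengths in base pairs, as quoted in the abstract.
utr5_bp = 470
orf_bp = 2526
utr3_bp = 743

# 5' UTR + ORF + 3' UTR should give the full cDNA length.
cdna_bp = utr5_bp + orf_bp + utr3_bp

# Assumption: the ORF length includes the terminal stop codon,
# so the protein has one residue fewer than the codon count.
codons = orf_bp // 3
amino_acids = codons - 1

print(cdna_bp, codons, amino_acids)
```

    Under that assumption the three segments sum to the reported 3,739 bp and the open reading frame corresponds to the reported 841 amino acids.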

  3. Gene Sequence Analyses of the Healthy Oral Microbiome in Humans and Companion Animals.

    Davis, Eric M


    It has long been accepted that certain oral bacterial species are responsible for the development of periodontal disease. However, the focus of microbial and immunological research is shifting from studying the organisms associated with disease to examining the indigenous microbial inhabitants that are present in health. Microbiome refers to the aggregate genetic material of all microorganisms living in, or on, a defined habitat. Recent developments in gene sequence analysis have enabled detection and identification of bacteria from polymicrobial samples, including subgingival plaque. Diversity surveys utilizing this technology have demonstrated that bacterial culture techniques have vastly underestimated the richness and diversity of microorganisms in vivo, since only certain bacteria grow in vitro. Surveys using gene sequence analysis have demonstrated that the healthy oral microbiome is composed of an unexpectedly high number of diverse species, including putative pathogens. These findings support the view that coevolution of microorganisms and macroscopic hosts has occurred in which certain microorganisms have adapted to survive in the oral cavity and host immune tolerance has allowed the establishment of a symbiotic relationship in which both parties receive benefits (mutualism). This review describes gene sequence analysis as an increasingly common, culture-independent tool for detecting bacteria in vivo and describes the results of recent oral microbiome diversity surveys of clinically healthy humans, dogs, and cats. Six bacterial phyla consistently dominated the healthy oral microbiome of all 3 host species. Previous hypotheses on etiology of periodontitis are reviewed in light of new scientific findings. Finally, the consideration that clinically relevant periodontal disease occurs when immune tolerance of the symbiotic oral microbiome is altered to a proinflammatory response will be discussed.

  4. Mycobacterial tlyA gene product is localized to cell-wall without signal sequence.

    Santosh Kumar


    The mycobacterial tlyA gene product, Rv1694 (MtbTlyA), has been annotated as 'hemolysin' and was later re-annotated as a 2'-O rRNA methyltransferase. In order to function as a hemolysin, it must reach the extracellular milieu with the help of signal sequence(s) and/or transmembrane segment(s). However, MtbTlyA has neither classical signal sequences that signify general/Sec/Tat pathways nor transmembrane segments. Interestingly, the tlyA gene appears to be restricted to pathogenic strains such as H37Rv, M. marinum, and M. leprae, rather than M. smegmatis, M. vaccae, M. kansasii, etc., which highlights the need for a detailed investigation to understand its functions. In this study, we provide several lines of evidence for the presence of TlyA on the surface of M. marinum (native host) and upon expression in M. smegmatis (surrogate host) and E. coli (heterologous host). TlyA was visualized at the bacterial surface by confocal microscopy and was accessible to Proteinase K. In addition, sub-cellular fractionation revealed the presence of TlyA in the membrane fractions, and this sequestration is not dependent on the TatA, TatC or SecA2 pathways. As a consequence of expression, the recombinant bacteria exhibit distinct hemolysis. Interestingly, MtbTlyA was also detected in both membrane vesicles secreted by M. smegmatis and outer membrane vesicles secreted by E. coli. Our experimental evidence unambiguously confirms that the mycobacterial TlyA can reach the extracellular milieu without any signal sequence. Hence, the localization of the TlyA class of proteins at the bacterial surface may highlight the existence of non-classical bacterial secretion mechanisms.

  5. Sequencing and phylogenetic analysis of neurotoxin gene from an environmental isolate of Clostridium sp.: comparison with other clostridial neurotoxins.

    Dixit, Aparna; Alam, Syed Imteyaz; Singh, Lokendra


    A Clostridium sp. isolated from the intestine of decaying fish exhibited 99% sequence identity with C. tetani at the 16S rRNA level. It produced a neurotoxin that was neutralized by botulinum antitoxin (A+B+E) as well as tetanus antitoxin. The gene fragments for the light chain and the C-terminal and N-terminal regions of the heavy chain of the toxin were amplified using three reported primer sets for tetanus neurotoxin (TeNT). The neurotoxin gene fragments were cloned in Escherichia coli and sequenced. The sequences obtained exhibited approximately 98, 99 and 98% sequence identity with reported gene sequences of TeNT/LC, TeNT/HC and TeNT/HN, respectively. The phylogenetic interrelationship between the neurotoxin gene of Clostridium sp. and previously reported gene sequences of Clostridium botulinum A to G and C. tetani was examined by analysis of differences in the nucleotide sequences. Six amino acids were substituted at four different positions in the light chain of the neurotoxin from the isolate when compared with the closest reported sequence of TeNT. Of these, four were located in the beta15 motif in a solvent-inaccessible, buried region of the protein molecule. One of these substitutions was at a solvent-accessible surface residue of the alpha1 motif, previously shown to have strong sequence conservation. The two amino acid substitutions observed in the N-terminal region of the heavy chain were at buried residues located in the beta21 and beta37 motifs, which show variability in other related sequences. The C-terminal region responsible for binding to the receptor was conserved, showing no changes in the amino acid sequence.

  6. The complete sequence and gene organization of the mitochondrial genome of the gadilid scaphopod Siphonodentalium lobatum (Mollusca).

    Dreyer, Hermann; Steiner, Gerhard


    Comparisons of mitochondrial gene sequences and gene arrangements can be informative for reconstructing high-level phylogenetic relationships. We determined the complete sequence of the mitochondrial genome of Siphonodentalium lobatum (Mollusca, Scaphopoda). With only 13,932 bases, it is the shortest molluscan mitochondrial genome reported so far. The genome contains the usual 13 protein-coding genes, two rRNA and 22 tRNA genes. The ATPase subunit 8 gene is exceptionally short. Several transfer RNAs show truncated TpsiC arms or DHU arms. The gene arrangement of S. lobatum is markedly different from all other known molluscan mitochondrial genomes and shows low similarity even to an unpublished gene order of a dentaliid scaphopod. Phylogenetic analyses of all available complete molluscan mitochondrial genomes based on amino acid sequences of 11 protein-coding genes yield trees with low support for the basal branches. None of the traditionally accepted molluscan taxa and phylogenies are recovered in all analyses, except for the euthyneuran Gastropoda. S. lobatum appears as the sister taxon to two of the three bivalve species. We conclude that the deep molluscan phylogeny is probably beyond the resolution of mitochondrial protein sequences. Moreover, assessing the phylogenetic signal in