WorldWideScience

Sample records for high sequence homology

  1. The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing.

    Directory of Open Access Journals (Sweden)

    Jonas Binladen

    2007-02-01

    The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5' tag analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (misassignment rate <0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in a single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regard to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5' primer tagging is a useful method by which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products.
The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial
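
    The 5' tag analysis described in this record can be illustrated with a short Python sketch that sorts reads into per-specimen bins by a leading dinucleotide tag. This is only a sketch under stated assumptions: the tag table, the primer sequence, and the function name are hypothetical, since the abstract says that dinucleotide tags were used but not which ones.

```python
from collections import defaultdict

# Hypothetical values for illustration only; none of these come from the paper.
TAG_TO_SPECIMEN = {"AC": "specimen_1", "CA": "specimen_2", "GT": "specimen_3"}
PRIMER = "GGTCAACAAATCATAAAGATATTGG"
TAG_LENGTH = 2


def demultiplex(reads):
    """Group reads by their 5' tag.

    A read is assigned only if it starts with a known tag that is
    immediately followed by the primer; the tag and primer are trimmed
    from assigned reads. Everything else is kept under 'unassigned'.
    """
    bins = defaultdict(list)
    for read in reads:
        tag = read[:TAG_LENGTH]
        rest = read[TAG_LENGTH:]
        specimen = TAG_TO_SPECIMEN.get(tag)
        if specimen is not None and rest.startswith(PRIMER):
            bins[specimen].append(rest[len(PRIMER):])
        else:
            bins["unassigned"].append(read)
    return dict(bins)
```

    Requiring the primer directly after the tag is one simple way to avoid assigning reads whose first two bases match a tag by chance; a real pipeline would also need to tolerate sequencing errors in the tag and primer.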

  2. Detecting false positive sequence homology: a machine learning approach.

    Science.gov (United States)

    Fujimoto, M Stanley; Suvorov, Anton; Jensen, Nicholas O; Clement, Mark J; Bybee, Seth M

    2016-02-24

    Accurate detection of homologous relationships of biological sequences (DNA or amino acid) amongst organisms is an important and often difficult task that is essential to various evolutionary studies, ranging from building phylogenies to predicting functional gene annotations. There are many existing heuristic tools, most commonly based on bidirectional BLAST searches, that are used to identify homologous genes and combine them into two fundamentally distinct classes: orthologs and paralogs. Because they use only heuristic filtering based on significance score cutoffs and have no cluster post-processing tools available, these methods can often produce multiple clusters constituting unrelated (non-homologous) sequences. Therefore, sequencing data extracted from incomplete genome/transcriptome assemblies originating from low-coverage sequencing or produced by de novo processes without a reference genome are susceptible to high false positive rates of homology detection. In this paper we develop biologically informative features that can be extracted from multiple sequence alignments of putative homologous genes (orthologs and paralogs) and further utilized in the context of guided experimentation to verify false positive outcomes. We demonstrate that our machine learning method, trained on both known homology clusters obtained from OrthoDB and randomly generated sequence alignments (non-homologs), successfully determines apparent false positives inferred by heuristic algorithms, especially among proteomes recovered from low-coverage RNA-seq data. Approximately 42% and 25% of the putative homologies predicted by InParanoid and HaMStR, respectively, were classified as false positives on an experimental data set. Our process increases the quality of output from other clustering algorithms by providing a novel post-processing method that is both fast and efficient at removing low-quality clusters of putative homologous genes recovered by heuristic-based approaches.
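
    The bidirectional BLAST heuristic mentioned in this record can be sketched as a reciprocal-best-hit filter. The Python below is a minimal illustration, not the method of InParanoid, HaMStR, or the authors: it assumes hit scores have already been parsed into dictionaries keyed by (query, subject), and both function names are hypothetical.

```python
def best_hits(scores):
    """Map each query to its top-scoring subject.

    `scores` is a dict of {(query, subject): score}, e.g. bit scores
    parsed from a search tool's tabular output (an assumption here).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)
```

    A pair survives only if the best hit holds in both directions; as the abstract notes, score-cutoff heuristics of this kind can still admit non-homologous sequences, which is the gap the post-processing classifier is meant to address.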

  3. Using structure to explore the sequence alignment space of remote homologs.

    Science.gov (United States)

    Kuziemko, Andrew; Honig, Barry; Petrey, Donald

    2011-10-01

    Protein structure modeling by homology requires an accurate sequence alignment between the query protein and its structural template. However, sequence alignment methods based on dynamic programming (DP) are typically unable to generate accurate alignments for remote sequence homologs, thus limiting the applicability of modeling methods. A central problem is that the alignment that is "optimal" in terms of the DP score does not necessarily correspond to the alignment that produces the most accurate structural model. That is, the correct alignment based on structural superposition will generally have a lower score than the optimal alignment obtained from sequence. Variations of the DP algorithm have been developed that generate alternative alignments that are "suboptimal" in terms of the DP score, but these still encounter difficulties in detecting the correct structural alignment. We present here a new alternative sequence alignment method that relies heavily on the structure of the template. By initially aligning the query sequence to individual fragments in secondary structure elements and combining high-scoring fragments that pass basic tests for "modelability", we can generate accurate alignments within a small ensemble. Our results suggest that the set of sequences that can currently be modeled by homology can be greatly extended.
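
    For readers unfamiliar with the DP score discussed in this record, the following Python sketch computes the optimal global alignment score with a standard Needleman-Wunsch recurrence. It illustrates only the conventional sequence-based DP that the abstract contrasts against, not the authors' structure-based method; the scoring values (match +1, mismatch -1, linear gap -2) are arbitrary assumptions.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of strings a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and row: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Best of: align the two residues, gap in b, gap in a.
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]
```

    The point made in the abstract is that the alignment maximizing this kind of score need not be the one that yields the best structural model, which is why alternative alignments are worth exploring.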

  4. Using structure to explore the sequence alignment space of remote homologs.

    Directory of Open Access Journals (Sweden)

    Andrew Kuziemko

    2011-10-01

    Protein structure modeling by homology requires an accurate sequence alignment between the query protein and its structural template. However, sequence alignment methods based on dynamic programming (DP) are typically unable to generate accurate alignments for remote sequence homologs, thus limiting the applicability of modeling methods. A central problem is that the alignment that is "optimal" in terms of the DP score does not necessarily correspond to the alignment that produces the most accurate structural model. That is, the correct alignment based on structural superposition will generally have a lower score than the optimal alignment obtained from sequence. Variations of the DP algorithm have been developed that generate alternative alignments that are "suboptimal" in terms of the DP score, but these still encounter difficulties in detecting the correct structural alignment. We present here a new alternative sequence alignment method that relies heavily on the structure of the template. By initially aligning the query sequence to individual fragments in secondary structure elements and combining high-scoring fragments that pass basic tests for "modelability", we can generate accurate alignments within a small ensemble. Our results suggest that the set of sequences that can currently be modeled by homology can be greatly extended.

  5. Partial amino acid sequence of apolipoprotein(a) shows that it is homologous to plasminogen

    International Nuclear Information System (INIS)

    Eaton, D.L.; Fless, G.M.; Kohr, W.J.; McLean, J.W.; Xu, Q.T.; Miller, C.G.; Lawn, R.M.; Scanu, A.M.

    1987-01-01

    Apolipoprotein(a) [apo(a)] is a glycoprotein with M_r ∼ 280,000 that is disulfide linked to apolipoprotein B in lipoprotein(a) particles. Elevated plasma levels of lipoprotein(a) are correlated with atherosclerosis. Partial amino acid sequence of apo(a) shows that it has striking homology to plasminogen. Plasminogen is a plasma serine protease zymogen that consists of five homologous and tandemly repeated domains called kringles and a trypsin-like protease domain. The amino-terminal sequence obtained for apo(a) is homologous to the beginning of kringle 4 but not the amino terminus of plasminogen. Apo(a) was subjected to limited proteolysis by trypsin or V8 protease, and the fragments generated were isolated and sequenced. Sequences obtained from several of these fragments are highly (77-100%) homologous to plasminogen residues 391-421, which reside within kringle 4. Analysis of these internal apo(a) sequences revealed that apo(a) may contain at least two kringle 4-like domains. A sequence obtained from another tryptic fragment also shows homology to the end of kringle 4 and the beginning of kringle 5. Sequence data obtained from two tryptic fragments show homology with the protease domain of plasminogen. One of these sequences is homologous to the sequences surrounding the activation site of plasminogen. Plasminogen is activated by the cleavage of a specific arginine residue by urokinase and tissue plasminogen activator; however, the corresponding site in apo(a) is a serine that would not be cleaved by tissue plasminogen activator or urokinase. Using a plasmin-specific assay, no proteolytic activity could be demonstrated for lipoprotein(a) particles. These results suggest that apo(a) contains kringle-like domains and an inactive protease domain.

  6. [Sequence analysis of LEAFY homologous gene from Dendrobium moniliforme and application for identification of medicinal Dendrobium].

    Science.gov (United States)

    Xing, Wen-Rui; Hou, Bei-Wei; Guan, Jing-Jiao; Luo, Jing; Ding, Xiao-Yu

    2013-04-01

    The LEAFY (LFY) homologous gene of Dendrobium moniliforme (L.) Sw. was cloned using new primers designed from the conserved region of known orchid LEAFY gene sequences. A partial LFY homologous gene was cloned by conventional PCR, and the complete LFY homologous gene, DenLFY, was then obtained by TAIL-PCR. The complete sequence of the DenLFY gene was 3,575 bp and contained three exons and two introns. Using BLAST, comparative analysis of the exons of LFY homologous genes indicated that the DenLFY gene has high identity with orchid LFY homologs, including the related fragment of PhalLFY (84%) in a Phalaenopsis hybrid cultivar, the LFY homologous gene in Oncidium (90%), and those in other orchids (over 80%). Using MP analysis, Dendrobium was found to be sister to Oncidium and Phalaenopsis. Homology analysis demonstrated that the C-terminal amino acids were highly conserved. When the exons and introns were considered separately, the exons and the amino acid sequence were good markers for functional research on the DenLFY gene. The second intron can be used in authentication research on Dendrobium, based on the length polymorphism between Dendrobium moniliforme and Dendrobium officinale.

  7. CPHmodels-3.0--remote homology modeling using structure-guided sequence profiles

    DEFF Research Database (Denmark)

    Nielsen, Morten; Lundegaard, Claus; Lund, Ole

    2010-01-01

    CPHmodels-3.0 is a web server predicting protein 3D structure by use of single template homology modeling. The server employs a hybrid of the scoring functions of CPHmodels-2.0 and a novel remote homology-modeling algorithm. An attempt is first made to model a query sequence using the fast CPHmodels-2.0 profile-profile scoring function suitable for close homology modeling. The new computationally costly remote homology-modeling algorithm is only engaged provided that no suitable PDB template is identified in the initial search. CPHmodels-3.0 was benchmarked in the CASP8 competition and produced models ... .3 Å. These performance values place the CPHmodels-3.0 method in the group of high performing 3D prediction tools. Besides its accuracy, one of the important features of the method is its speed. For most queries, the response time of the server is...

  8. Functional region prediction with a set of appropriate homologous sequences-an index for sequence selection by integrating structure and sequence information with spatial statistics

    Science.gov (United States)

    2012-01-01

    Background: The detection of conserved residue clusters on a protein structure is one of the effective strategies for the prediction of functional protein regions. Various methods, such as Evolutionary Trace, have been developed based on this strategy. In such approaches, the conserved residues are identified through comparisons of homologous amino acid sequences. Therefore, the selection of homologous sequences is a critical step. It is empirically known that a certain degree of sequence divergence in the set of homologous sequences is required for the identification of conserved residues. However, the development of a method to select homologous sequences appropriate for the identification of conserved residues has not been sufficiently addressed. An objective and general method to select appropriate homologous sequences is desired for the efficient prediction of functional regions. Results: We have developed a novel index to select the sequences appropriate for the identification of conserved residues, and implemented the index within our method to predict the functional regions of a protein. The implementation of the index improved the performance of the functional region prediction. The index represents the degree of conserved residue clustering on the tertiary structure of the protein. For this purpose, the structure and sequence information were integrated within the index by the application of spatial statistics. Spatial statistics is a field of statistics in which not only the attributes but also the geometrical coordinates of the data are considered simultaneously. Higher degrees of clustering generate larger index scores. We adopted the set of homologous sequences with the highest index score, under the assumption that the best prediction accuracy is obtained when the degree of clustering is the maximum. The set of sequences selected by the index led to higher functional region prediction performance than the sets of sequences selected by other sequence

  9. Genetic selection and DNA sequences of 4.5S RNA homologs

    DEFF Research Database (Denmark)

    Brown, S; Thon, G; Tolentino, E

    1989-01-01

    A general strategy for cloning the functional homologs of an Escherichia coli gene was used to clone homologs of 4.5S RNA from other bacteria. The genes encoding these homologs were selected by their ability to complement a deletion of the gene for 4.5S RNA. DNA sequences of the regions encoding...

  10. Interference of Homologous Sequences on the SNP Study of CYP2A13 Gene

    Directory of Open Access Journals (Sweden)

    Qinghua ZHOU

    2010-02-01

    Background and objective: It has been proven that cytochrome P450 enzyme 2A13 (CYP2A13) plays an important role in the association between single nucleotide polymorphisms (SNPs) and human diseases. Cytochrome P450 enzymes are a group of isoenzymes whose sequence homology may interfere with SNP studies. The aim of this study is to explore the interference with the SNP study of CYP2A13 caused by homologous sequences. Methods: A TaqMan probe was applied to detect the distribution of the rs8192789 site in 573 subjects, and BLAST was used to analyze the amplified sequences. Partial sequences of CYP2A13 were amplified by PCR from 60 cases. The amplified sequences were TA cloned and sequenced. Results: For the rs8192789 locus in 573 cases, only 3 cases were TT, while the rest were CT heterozygotes, which was caused by homologous sequences. There were a large number of overlapping peaks in the identical sequences of the 60 cases, and the SNP at amino acid site 101 reported in the SNP database was not found. The cloned sequences were 247 bp and 235 bp fragments. Conclusion: The homologous sequences may interfere with the SNP study of CYP2A13, and some SNPs may not exist.

  11. CPHmodels-3.0--remote homology modeling using structure-guided sequence profiles.

    Science.gov (United States)

    Nielsen, Morten; Lundegaard, Claus; Lund, Ole; Petersen, Thomas Nordahl

    2010-07-01

    CPHmodels-3.0 is a web server predicting protein 3D structure by use of single template homology modeling. The server employs a hybrid of the scoring functions of CPHmodels-2.0 and a novel remote homology-modeling algorithm. An attempt is first made to model a query sequence using the fast CPHmodels-2.0 profile-profile scoring function suitable for close homology modeling. The new computationally costly remote homology-modeling algorithm is only engaged provided that no suitable PDB template is identified in the initial search. CPHmodels-3.0 was benchmarked in the CASP8 competition and produced models for 94% of the targets (117 out of 128); 74% were predicted as high-reliability models (87 out of 117). These achieved an average RMSD of 4.6 Å when superimposed onto the 3D structure. The remaining 26% low-reliability models (30 out of 117) could be superimposed onto the true 3D structure with an average RMSD of 9.3 Å. These performance values place the CPHmodels-3.0 method in the group of high performing 3D prediction tools. Besides its accuracy, one of the important features of the method is its speed. For most queries, the response time of the server is [...]. The web server is available at http://www.cbs.dtu.dk/services/CPHmodels/.
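
    The RMSD figures quoted in this record measure the average distance between corresponding atoms of a model and the experimental structure. As a minimal sketch, the Python below computes RMSD for two coordinate lists that are assumed to be already superimposed and listed in matching atom order; it does not perform the superposition itself, and the function name is hypothetical rather than part of CPHmodels.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes the two structures are already superimposed and that the
    i-th atom in coords_a corresponds to the i-th atom in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

    Coordinates given in ångströms yield an RMSD in ångströms, the unit used in the abstract.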

  12. Sequence-structure relationships in RNA loops: establishing the basis for loop homology modeling.

    Science.gov (United States)

    Schudoma, Christian; May, Patrick; Nikiforova, Viktoria; Walther, Dirk

    2010-01-01

    The specific function of RNA molecules frequently resides in their seemingly unstructured loop regions. We performed a systematic analysis of RNA loops extracted from experimentally determined three-dimensional structures of RNA molecules. A comprehensive loop-structure data set was created and organized into distinct clusters based on structural and sequence similarity. We detected clear evidence of the hallmark of homology present in the sequence-structure relationships in loops. Loops differing by [...] structures. Thus, our results support the application of homology modeling for RNA loop model building. We established a threshold that may guide the sequence divergence-based selection of template structures for RNA loop homology modeling. Of all possible sequences that are, under the assumption of isosteric relationships, theoretically compatible with actual sequences observed in RNA structures, only a small fraction is contained in the Rfam database of RNA sequences and classes, implying that the actual RNA loop space may consist of a limited number of unique loop structures and conserved sequences. The loop-structure data sets are made available via an online database, RLooM. RLooM also offers functionalities for the modeling of RNA loop structures in support of RNA engineering and design efforts.

  13. Sequence homology at the breakpoint and clinical phenotype of mitochondrial DNA deletion syndromes.

    Science.gov (United States)

    Sadikovic, Bekim; Wang, Jing; El-Hattab, Ayman W; Landsverk, Megan; Douglas, Ganka; Brundage, Ellen K; Craigen, William J; Schmitt, Eric S; Wong, Lee-Jun C

    2010-12-20

    Mitochondrial DNA (mtDNA) deletions are a common cause of mitochondrial disorders. Large mtDNA deletions can lead to a broad spectrum of clinical features with different ages of onset, ranging from mild mitochondrial myopathies (MM), progressive external ophthalmoplegia (PEO), and Kearns-Sayre syndrome (KSS), to severe Pearson syndrome. The aim of this study is to investigate the molecular signatures surrounding the deletion breakpoints and their association with the clinical phenotype and age at onset. MtDNA deletions in 67 patients were characterized using array comparative genomic hybridization (aCGH) followed by PCR-sequencing of the deletion junctions. Sequence homology, including both perfect and imperfect short repeats flanking the deletion regions, was analyzed and correlated with clinical features and patients' age group. In all age groups, there was a significant increase in sequence homology flanking the deletion compared to mtDNA background. The youngest patient group [...] deletion distribution in size and locations, with a significantly lower sequence homology flanking the deletion, and the highest percentage of deletion mutant heteroplasmy. The older age groups showed a rather discrete pattern of deletions, with 44% of all patients over 6 years old carrying the most common 5 kb mtDNA deletion, which was found mostly in muscle specimens (22/41). Only 15% (3/20) of the young patients [...] deletion, which is usually present in blood rather than muscle. This group of patients predominantly (16 out of 17) exhibit multisystem disorder and/or Pearson syndrome, while older patients had predominantly neuromuscular manifestations including KSS, PEO, and MM. In conclusion, sequence homology at the deletion flanking regions is a consistent feature of mtDNA deletions. Decreased levels of sequence homology and increased levels of deletion mutant heteroplasmy appear to correlate with earlier onset and more severe disease with multisystem involvement.

  14. Homology analyses of the protein sequences of fatty acid synthases from chicken liver, rat mammary gland, and yeast

    International Nuclear Information System (INIS)

    Chang, Soo-Ik; Hammes, G.G.

    1989-01-01

    Homology analyses of the protein sequences of chicken liver and rat mammary gland fatty acid synthases were carried out. The amino acid sequences of the chicken and rat enzymes are 67% identical. If conservative substitutions are allowed, 78% of the amino acids are matched. A region of low homology exists between the functional domains, in particular around amino acid residues 1059-1264 of the chicken enzyme. Homologies between the active sites of chicken and rat and of chicken and yeast enzymes have been analyzed by an alignment method. A high degree of homology exists between the active sites of the chicken and rat enzymes. However, the chicken and yeast enzymes show a lower degree of homology. The NADPH-binding dinucleotide folds of the β-ketoacyl reductase and the enoyl reductase sites were identified by comparison with a known consensus sequence for the NADP- and FAD-binding dinucleotide folds. The active sites of all of the enzymes are primarily in hydrophobic regions of the protein. This study suggests that the genes for the functional domains of fatty acid synthase were originally separated, and these genes were connected to each other by using different connecting nucleotide sequences in different species. An alternative explanation for the differences in rat and chicken is a common ancestry and mutations in the joining regions during evolution.
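
    The two percentages in this record (identical residues, and matches when conservative substitutions are allowed) can be computed from a pairwise alignment along the following lines. This Python sketch is an illustration only: the grouping of conservative substitutions is a hypothetical choice, and counting only ungapped columns in the denominator is an assumption, since the abstract states neither.

```python
# Hypothetical groups of residues treated as conservative substitutions.
CONSERVATIVE_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both strings must have equal length and use '-' for gaps. Columns
    containing a gap are skipped (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
            similar += 1
        elif any(x in group and y in group for group in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared
```

    For example, aligning "MKLV-D" against "MRLIAE" gives five ungapped columns, of which two are identical and the other three fall within the hypothetical groups above.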

  15. HomPPI: a class of sequence homology based protein-protein interface prediction methods

    Directory of Open Access Journals (Sweden)

    Dobbs Drena

    2011-06-01

    Background: Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results: We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (Non partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (Partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers, including those that exploit the structure of the query proteins. The partner-specific classifier, PS-HomPPI, can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of

  16. An HMM posterior decoder for sequence feature prediction that includes homology information

    DEFF Research Database (Denmark)

    Käll, Lukas; Krogh, Anders Stærmose; Sonnhammer, Erik L. L.

    2005-01-01

    Motivation: When predicting sequence features like transmembrane topology, signal peptides, coil-coil structures, protein secondary structure or genes, extra support can be gained from homologs. Results: We present here a general hidden Markov model (HMM) decoding algorithm that combines probabil... ...://phobius.cgb.ki.se/poly.html. An implementation of the algorithm is available on request from the authors.

  17. Porcine MYF6 gene: sequence, homology analysis, and variation in the promoter region.

    Science.gov (United States)

    Wyszyńska-Koko, J; Kurył, J

    2004-01-01

    The MYF6 gene codes for a bHLH transcription factor belonging to the MyoD family. Its expression accompanies the processes of differentiation and maturation of myotubes during embryogenesis and continues at a relatively high level after birth, affecting the muscle phenotype. The porcine MYF6 gene was amplified, sequenced, and compared with MYF6 gene sequences of other species. The amino acid sequence was deduced and an interspecies homology analysis was performed. The Myf-6 protein shows high conservation among species, with 99% and 97% identity when comparing pig with cow and human, respectively, and 93% when comparing pig with mouse and rat. A single nucleotide polymorphism (SNP) was revealed within the promoter region, which appeared to be a T → C transition recognized by the MspI restriction enzyme.

  18. A sensitive short read homology search tool for paired-end read sequencing data.

    Science.gov (United States)

    Techa-Angkoon, Prapaporn; Sun, Yanni; Lei, Jikai

    2017-10-16

    Homology search is still a significant step in functional analysis for genomic data. Profile Hidden Markov Model-based homology search has been widely used in protein domain analysis in many different species. In particular, with the fast accumulation of transcriptomic data of non-model species and metagenomic data, profile homology search is widely adopted in integrated pipelines for functional analysis. While the state-of-the-art tool HMMER has achieved high sensitivity and accuracy in domain annotation, the sensitivity of HMMER on short reads declines rapidly. The low sensitivity on short read homology search can lead to inaccurate domain composition and abundance computation. Our experimental results showed that half of the reads were missed by HMMER for an RNA-Seq dataset. Thus, there is a need for better methods to improve the homology search performance for short reads. We introduce a profile homology search tool named Short-Pair that is designed for short paired-end reads. By using an approximate Bayesian approach employing the distribution of fragment lengths and alignment scores, Short-Pair can retrieve the missing end and determine true domains. In particular, Short-Pair increases the accuracy in aligning short reads that are part of remote homologs. We applied Short-Pair to an RNA-Seq dataset and a metagenomic dataset and quantified its sensitivity and accuracy on homology search. The experimental results show that Short-Pair can achieve better overall performance than the state-of-the-art methodology of profile homology search. Short-Pair is best used for next-generation sequencing (NGS) data that lack reference genomes. It provides a complementary paired-end read homology search tool to HMMER. The source code is freely available at https://sourceforge.net/projects/short-pair/.

  19. Bidirectional gene sequences with similar homology to functional proteins of alkane degrading bacterium Pseudomonas frederiksbergensis DNA

    International Nuclear Information System (INIS)

    Megeed, A.A.

    2011-01-01

    The potential for two overlapping fragments of DNA from a clone of a newly isolated alkane-degrading bacterium, Pseudomonas frederiksbergensis, to encode sequences with similar homology to two parts of functional proteins is described. One strand contains a sequence with high homology to alkane monooxygenase (alkB), a member of the alkane hydroxylase family, and the other strand contains a sequence with some homology to the alcohol dehydrogenase gene (alkJ). Overlapping of genes on opposite strands has been reported in eukaryotic species, and is now reported in a bacterial species. The sequence comparisons and ORF results revealed that the regulation and gene organization involved in alkane oxidation in Pseudomonas frederiksbergensis vary among the different known alkane-degrading bacteria. In the alk gene cluster, the homologues of the known alkane monooxygenase (alkB) and rubredoxin (alkG) are oriented in the same direction, whereas alcohol dehydrogenase (alkJ) is oriented in the opposite direction. Such genomes encode messages on both strands of the DNA, or in overlapping but different reading frames of the same strand of DNA. The creation of novel genes from pre-existing sequences, known as overprinting, is a widespread phenomenon in small viruses. Here, the origin and evolution of the gene overlap in relation to bacteriophages belonging to the family Microviridae have been investigated. Such a phenomenon is most widely described in extremely small genomes such as those of viruses or small plasmids, yet here is a unique phenomenon. (author)

  20. A Comprehensive Strategy for Accurate Mutation Detection of the Highly Homologous PMS2.

    Science.gov (United States)

    Li, Jianli; Dai, Hongzheng; Feng, Yanming; Tang, Jia; Chen, Stella; Tian, Xia; Gorman, Elizabeth; Schmitt, Eric S; Hansen, Terah A A; Wang, Jing; Plon, Sharon E; Zhang, Victor Wei; Wong, Lee-Jun C

    2015-09-01

    Germline mutations in the DNA mismatch repair gene PMS2 underlie the cancer susceptibility syndrome, Lynch syndrome. However, accurate molecular testing of PMS2 is complicated by a large number of highly homologous sequences. To establish a comprehensive approach for mutation detection of PMS2, we have designed a strategy combining targeted capture next-generation sequencing (NGS), multiplex ligation-dependent probe amplification, and long-range PCR followed by NGS to simultaneously detect point mutations and copy number changes of PMS2. Exonic deletions (E2 to E9, E5 to E9, E8, E10, E14, and E1 to E15), duplications (E11 to E12), and a nonsense mutation, p.S22*, were identified. Traditional multiplex ligation-dependent probe amplification and Sanger sequencing approaches cannot differentiate the origin of the exonic deletions in the 3' region when PMS2 and PMS2CL share identical sequences as a result of gene conversion. Our approach allows unambiguous identification of mutations in the active gene with a straightforward long-range-PCR/NGS method. Breakpoint analysis of multiple samples revealed that recurrent exon 14 deletions are mediated by homologous Alu sequences. Our comprehensive approach provides a reliable tool for accurate molecular analysis of genes containing multiple copies of highly homologous sequences and should improve PMS2 molecular analysis for patients with Lynch syndrome. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  1. Protein backbone angle restraints from searching a database for chemical shift and sequence homology

    Energy Technology Data Exchange (ETDEWEB)

    Cornilescu, Gabriel; Delaglio, Frank; Bax, Ad [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States)]

    1999-03-15

    Chemical shifts of backbone atoms in proteins are exquisitely sensitive to local conformation, and homologous proteins show quite similar patterns of secondary chemical shifts. The inverse of this relation is used to search a database for triplets of adjacent residues with secondary chemical shifts and sequence similarity which provide the best match to the query triplet of interest. The database contains 13Cα, 13Cβ, 13C', 1Hα and 15N chemical shifts for 20 proteins for which a high resolution X-ray structure is available. The computer program TALOS was developed to search this database for strings of residues with chemical shift and residue type homology. The relative importance of the weighting factors attached to the secondary chemical shifts of the five types of resonances relative to that of sequence similarity was optimized empirically. TALOS yields the 10 triplets which have the closest similarity in secondary chemical shift and amino acid sequence to those of the query sequence. If the central residues in these 10 triplets exhibit similar φ and ψ backbone angles, their averages can reliably be used as angular restraints for the protein whose structure is being studied. Tests carried out for proteins of known structure indicate that the root-mean-square difference (rmsd) between the output of TALOS and the X-ray derived backbone angles is about 15 deg. Approximately 3% of the predictions made by TALOS are found to be in error.
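
    The triplet search described in this abstract can be illustrated with a minimal sketch. This is not the TALOS implementation: the scoring function, the weights, the dictionary layout of the database and the 30-degree consistency cutoff are all assumptions chosen for illustration. The sketch ranks database triplets by a weighted sum of secondary-shift differences and residue mismatches, keeps the 10 best, and returns averaged φ/ψ angles only when those matches agree with each other.

```python
import math

def triplet_score(query, candidate, shift_weight=1.0, seq_weight=1.0):
    """Dissimilarity of two residue triplets (lower is more similar).

    The functional form and weights are illustrative assumptions.
    """
    score = 0.0
    for q, c in zip(query, candidate):
        # Squared differences of secondary chemical shifts, nucleus by nucleus.
        score += shift_weight * sum(
            (q["shifts"][n] - c["shifts"][n]) ** 2 for n in q["shifts"]
        )
        # Crude residue-type term: penalise non-identical residues.
        score += seq_weight * (q["res"] != c["res"])
    return score

def circular_mean(angles):
    """Mean of angles in degrees that respects the 360-degree wrap-around."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c))

def angular_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def predict_angles(query, database, n_best=10, max_spread=30.0):
    """Return (phi, psi) averaged over the best matches, or None if they disagree.

    Each database entry is a dict with keys "residues" (three residue dicts),
    "phi" and "psi" (angles of the central residue); this layout is hypothetical.
    """
    best = sorted(database, key=lambda t: triplet_score(query, t["residues"]))[:n_best]
    result = []
    for key in ("phi", "psi"):
        values = [t[key] for t in best]
        centre = circular_mean(values)
        # Only use the average when every match lies close to it.
        if any(angular_diff(v, centre) > max_spread for v in values):
            return None
        result.append(centre)
    return tuple(result)
```

    Returning None for inconsistent matches mirrors the condition in the abstract that the averages are used as restraints only when the central residues of the 10 triplets exhibit similar angles.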

  2. Structural organization of glycophorin A and B genes: Glycophorin B gene evolved by homologous recombination at Alu repeat sequences

    International Nuclear Information System (INIS)

    Kudo, Shinichi; Fukuda, Minoru

    1989-01-01

    Glycophorins A (GPA) and B (GPB) are two major sialoglycoproteins of the human erythrocyte membrane. Here the authors present a comparison of the genomic structures of GPA and GPB developed by analyzing DNA clones isolated from a K562 genomic library. Nucleotide sequences of exon-intron junctions and 5' and 3' flanking sequences revealed that the GPA and GPB genes consist of 7 and 5 exons, respectively, and both genes have >95% identical sequence from the 5' flanking region to the region ∼1 kilobase downstream from the exon encoding the transmembrane regions. In this homologous part of the genes, GPB lacks one exon due to a point mutation at the 5' splicing site of the third intron, which inactivates the 5' cleavage event of splicing and leads to ligation of the second to the fourth exon. Following these very homologous sequences, the genomic sequences for GPA and GPB diverge significantly and no homology can be detected in their 3' end sequences. The analysis of the Alu sequences and their flanking direct repeat sequences suggests that an ancestral genomic structure has been maintained in the GPA gene, whereas the GPB gene has arisen from the acquisition of 3' sequences different from those of the GPA gene by homologous recombination at the Alu repeats during or after gene duplication.

  3. GLASSgo – Automated and Reliable Detection of sRNA Homologs From a Single Input Sequence

    Directory of Open Access Journals (Sweden)

    Steffen C. Lott

    2018-04-01

    Bacterial small RNAs (sRNAs) are important post-transcriptional regulators of gene expression. The functional and evolutionary characterization of sRNAs requires the identification of homologs, which is frequently challenging due to their heterogeneity, short length and, in part, little sequence conservation. We developed the GLobal Automatic Small RNA Search go (GLASSgo) algorithm to identify sRNA homologs in complex genomic databases starting from a single sequence. GLASSgo combines an iterative BLAST strategy with pairwise identity filtering and a graph-based clustering method that utilizes RNA secondary structure information. We tested the specificity, sensitivity and runtime of GLASSgo, BLAST and the combination RNAlien/cmsearch in a typical use case scenario on 40 bacterial sRNA families. The sensitivity of the tested methods was similar, while the specificity of GLASSgo and RNAlien/cmsearch was significantly higher than that of BLAST. GLASSgo was on average ∼87 times faster than RNAlien/cmsearch, and only ∼7.5 times slower than BLAST, which shows that GLASSgo optimizes the trade-off between speed and accuracy in the task of finding sRNA homologs. GLASSgo is fully automated, whereas BLAST often recovers only parts of homologs and RNAlien/cmsearch requires extensive additional bioinformatic work to get a comprehensive set of homologs. GLASSgo is available as an easy-to-use web server to find homologous sRNAs in large databases.

  4. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

    Science.gov (United States)

    Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren

    2016-11-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. Current methods are highly dependent on evolutionary sequence conservation, which is

  5. Chemical shift homology in proteins

    International Nuclear Information System (INIS)

    Potts, Barbara C.M.; Chazin, Walter J.

    1998-01-01

    The degree of chemical shift similarity for homologous proteins has been determined from a chemical shift database of over 50 proteins representing a variety of families and folds, and spanning a wide range of sequence homologies. After sequence alignment, the similarity of the secondary chemical shifts of Cα protons was examined as a function of amino acid sequence identity for 37 pairs of structurally homologous proteins. A correlation between sequence identity and secondary chemical shift rmsd was observed. Important insights are provided by examining the sequence identity of homologous proteins versus percentage of secondary chemical shifts that fall within 0.1 and 0.3 ppm thresholds. These results begin to establish practical guidelines for the extent of chemical shift similarity to expect among structurally homologous proteins.
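
    The two similarity measures named in this abstract, the rmsd of secondary chemical shifts and the fraction of shifts falling within 0.1 and 0.3 ppm, can be computed as in the sketch below. It assumes the two proteins are already aligned, so that position i in one list corresponds to position i in the other, and that unassigned positions are marked with None; the function name and input layout are illustrative, not taken from the paper.

```python
import math

def shift_similarity(shifts_a, shifts_b, thresholds=(0.1, 0.3)):
    """Compare aligned secondary chemical shifts (ppm) of two homologous proteins.

    Returns the rmsd and, for each threshold, the fraction of aligned
    positions whose absolute shift difference is within that threshold.
    """
    # Skip positions where either protein has no assigned shift.
    diffs = [
        abs(a - b)
        for a, b in zip(shifts_a, shifts_b)
        if a is not None and b is not None
    ]
    if not diffs:
        raise ValueError("no aligned positions with shifts in both proteins")
    rmsd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    fractions = {t: sum(d <= t for d in diffs) / len(diffs) for t in thresholds}
    return rmsd, fractions
```

    Calling this for many protein pairs and plotting the results against percent sequence identity would give the kind of correlation the abstract describes.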

  6. WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences

    Directory of Open Access Journals (Sweden)

    Pesole Graziano

    2007-02-01

    Background: This work addresses the problem of detecting conserved transcription factor binding sites and in general regulatory regions through the analysis of sequences from homologous genes, an approach that is becoming more and more widely used given the ever increasing amount of genomic data available. Results: We present an algorithm that identifies conserved transcription factor binding sites in a given sequence by comparing it to one or more homologs, adapting a framework we previously introduced for the discovery of sites in sequences from co-regulated genes. Unlike the most commonly used methods, the approach we present does not need or compute an alignment of the sequences investigated, nor does it resort to descriptors of the binding specificity of known transcription factors. The main novel idea we introduce is a relative measure of conservation, assuming that true functional elements should present a higher level of conservation with respect to the rest of the sequence surrounding them. We present tests where we applied the algorithm to the identification of conserved annotated sites in homologous promoters, as well as in distal regions like enhancers. Conclusion: Results of the tests show how the algorithm can provide fast and reliable predictions of conserved transcription factor binding sites regulating the transcription of a gene, with better performance than other available methods for the same task. We also show examples of how the algorithm can be successfully employed when promoter annotations of the genes investigated are missing, or when regulatory sites and regions are located far away from the genes.

  7. Cluster based on sequence comparison of homologous proteins of 95 organism species - Gclust Server | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)


  8. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng

    2016-06-15

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.

  9. Amino acid sequences mediating vascular cell adhesion molecule 1 binding to integrin alpha 4: homologous DSP sequence found for JC polyoma VP1 coat protein

    Directory of Open Access Journals (Sweden)

    Michael Andrew Meyer

    2013-07-01

    The JC polyoma viral coat protein VP1 was analyzed for amino acid sequence homologies to the IDSP sequence which mediates binding of VLA-4 (integrin alpha 4) to vascular cell adhesion molecule 1. Although the full sequence was not found, a DSP sequence was located near the critical arginine residue linked to infectivity of the virus and binding to sialic acid containing molecules such as integrins (3). For the JC polyoma virus, a DSP sequence was found at residues 70, 71 and 72, with homology also noted for the mouse polyoma virus and SV40 virus. Three dimensional modeling of the VP1 molecule suggests that the DSP loop has an accessible site for interaction from the external side of the assembled viral capsid pentamer.
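
    The kind of short-motif scan described in this abstract can be sketched as a plain substring search that reports 1-based residue positions. The helper name is hypothetical, and the sequence in the usage note is a made-up toy string, not the VP1 sequence.

```python
def find_motif(sequence, motif):
    """Return the 1-based start positions of every occurrence of motif.

    Overlapping occurrences are reported as well.
    """
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to residue number
        start = sequence.find(motif, start + 1)
    return positions
```

    On the toy sequence "MKRIDSPQLDSPA", searching for "IDSP" gives [4] and searching for "DSP" gives [5, 10], which shows how a full motif can be absent at a site where its shorter core is still present.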

  10. The OGCleaner: filtering false-positive homology clusters.

    Science.gov (United States)

    Fujimoto, M Stanley; Suvorov, Anton; Jensen, Nicholas O; Clement, Mark J; Snell, Quinn; Bybee, Seth M

    2017-01-01

    Detecting homologous sequences in organisms is an essential step in protein structure and function prediction, gene annotation and phylogenetic tree construction. Heuristic methods are often employed for quality control of putative homology clusters. These heuristics, however, usually only apply to pairwise sequence comparison and do not examine clusters as a whole. We present the Orthology Group Cleaner (the OGCleaner), a tool designed for filtering putative orthology groups as homology or non-homology clusters by considering all sequences in a cluster. The OGCleaner relies on high-quality orthologous groups identified in OrthoDB to train machine learning algorithms that are able to distinguish between true-positive and false-positive homology groups. This package aims to improve the quality of phylogenetic tree construction especially in instances of lower-quality transcriptome assemblies. https://github.com/byucsl/ogcleaner. Contact: sfujimoto@gmail.com. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  11. Single-cell template strand sequencing by Strand-seq enables the characterization of individual homologs

    NARCIS (Netherlands)

    Sanders, Ashley D; Falconer, Ester; Hills, Mark; Spierings, Diana C J; Lansdorp, Peter M.

    The ability to distinguish between genome sequences of homologous chromosomes in single cells is important for studies of copy-neutral genomic rearrangements (such as inversions and translocations), building chromosome-length haplotypes, refining genome assemblies, mapping sister chromatid exchange

  12. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis

    Directory of Open Access Journals (Sweden)

    Yushen Du

    2016-11-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available.

  13. The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing

    DEFF Research Database (Denmark)

    Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P

    2007-01-01

    BACKGROUND: The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine...... primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regards to the distribution...

  14. Detecting remote sequence homology in disordered proteins: discovery of conserved motifs in the N-termini of Mononegavirales phosphoproteins.

    Directory of Open Access Journals (Sweden)

    David Karlin

    Paramyxovirinae are a large group of viruses that includes measles virus and parainfluenza viruses. The viral Phosphoprotein (P) plays a central role in viral replication. It is composed of a highly variable, disordered N-terminus and a conserved C-terminus. A second viral protein alternatively expressed, the V protein, also contains the N-terminus of P, fused to a zinc finger. We suspected that, despite their high variability, the N-termini of P/V might all be homologous; however, using standard approaches, we could previously identify sequence conservation only in some Paramyxovirinae. We now compared the N-termini using sensitive sequence similarity search programs, able to detect residual similarities unnoticeable by conventional approaches. We discovered that all Paramyxovirinae share a short sequence motif in their first 40 amino acids, which we called soyuz1. Despite its short length (11-16 aa), several arguments allow us to conclude that soyuz1 probably evolved by homologous descent, unlike linear motifs. Conservation across such evolutionary distances suggests that soyuz1 plays a crucial role and experimental data suggest that it binds the viral nucleoprotein to prevent its illegitimate self-assembly. In some Paramyxovirinae, the N-terminus of P/V contains a second motif, soyuz2, which might play a role in blocking interferon signaling. Finally, we discovered that the P of related Mononegavirales contain similarly overlooked motifs in their N-termini, and that their C-termini share a previously unnoticed structural similarity suggesting a common origin. Our results suggest several testable hypotheses regarding the replication of Mononegavirales and suggest that disordered regions with little overall sequence similarity, common in viral and eukaryotic proteins, might contain currently overlooked motifs (intermediate in length between linear motifs and disordered domains) that could be detected simply by comparing orthologous proteins.

  15. MIPS: a database for protein sequences, homology data and yeast genome information.

    Science.gov (United States)

    Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F

    1997-01-01

    The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498

  16. High frequency of phylogenetically diverse reductive dehalogenase-homologous genes in deep subseafloor sedimentary metagenomes

    Directory of Open Access Journals (Sweden)

    Mikihiko Kawai

    2014-03-01

    Marine subsurface sediments on the Pacific margin harbor diverse microbial communities even at depths of several hundred meters below the seafloor (mbsf) or more. Previous PCR-based molecular analysis showed the presence of diverse reductive dehalogenase gene (rdhA) homologs in marine subsurface sediment, suggesting that anaerobic respiration of organohalides is one of the possible energy-yielding pathways in the organic-rich sedimentary habitat. However, primer-independent molecular characterization of rdhA has remained to be demonstrated. Here, we studied the diversity and frequency of rdhA homologs by metagenomic analysis of five different depth horizons (0.8, 5.1, 18.6, 48.5 and 107.0 mbsf) at Site C9001 off the Shimokita Peninsula of Japan. From all metagenomic pools, remarkably diverse rdhA-homologous sequences, some of which are affiliated with novel clusters, were observed with high frequency. As a comparison, we also examined the frequency of dissimilatory sulfite reductase genes (dsrAB), key functional genes for microbial sulfate reduction. The dsrAB genes were also widely observed in the metagenomic pools, although their frequency was generally lower than that of rdhA-homologous genes. The phylogenetic composition of rdhA-homologous genes was similar among the five depth horizons. Our metagenomic data revealed that subseafloor rdhA homologs are more diverse than previously identified from PCR-based molecular studies. Spatial distribution of similar rdhA homologs across wide depositional ages indicates that the heterotrophic metabolic processes mediated by the genes can be ecologically important, functioning in the organic-rich subseafloor sedimentary biosphere.

  17. Relative K-homology and normal operators

    DEFF Research Database (Denmark)

    Manuilov, Vladimir; Thomsen, Klaus

    2009-01-01

    -term exact sequence which generalizes the excision six-term exact sequence in the first variable of KK-theory. Subsequently we investigate the relative K-homology which arises from the group of relative extensions by specializing to abelian $C^*$-algebras. It turns out that this relative K-homology carries...

  18. No allelic variation in genes with high gliadin homology in patients with celiac disease and type 1 diabetes

    DEFF Research Database (Denmark)

    Nielsen, Christian; Hansen, Dorte; Husby, Steffen

    2004-01-01

    recognize gluten-derived peptides in which specific glutamine residues are deamidated to glutamic acid by tissue transglutaminase. Recently, intestinally expressed human genes with high homology to DQ2-gliadin celiac T-cell epitopes have been identified. Single or double point mutations which would increase...... the celiac T-cell epitope homology, and mutation in these genes, leading to the expression of glutamic acid at particular positions, could hypothetically be involved in the initiation of CD in HLA-DQ2-positive children. Six gene regions with high celiac T-cell epitope homology were investigated for single......-nucleotide polymorphisms using direct sequencing of DNA from 20 CD patients, 27 type 1 diabetes mellitus (T1DM) patients with associated CD, 24 patients with T1DM without CD and 110 healthy controls, all of Caucasian origin. No variants in any of these genes in any of the investigated groups were found. We conclude...

  19. Comparative genomic survey, exon-intron annotation and phylogenetic analysis of NAT-homologous sequences in archaea, protists, fungi, viruses, and invertebrates

    Science.gov (United States)

    We have previously published extensive genomic surveys [1-3], reporting NAT-homologous sequences in hundreds of sequenced bacterial, fungal and vertebrate genomes. We present here the results of our latest search of 2445 genomes, representing 1532 (70 archaeal, 1210 bacterial, 43 protist, 97 fungal,...

  20. Top-Down-Assisted Bottom-Up Method for Homologous Protein Sequencing: Hemoglobin from 33 Bird Species

    Science.gov (United States)

    Song, Yang; Laskay, Ünige A.; Vilcins, Inger-Marie E.; Barbour, Alan G.; Wysocki, Vicki H.

    2015-11-01

    Ticks are vectors for disease transmission because they are indiscriminate in their feeding on multiple vertebrate hosts, transmitting pathogens between their hosts. Identifying the hosts on which ticks have fed is important for disease prevention and intervention. We have previously shown that hemoglobin (Hb) remnants from a host on which a tick fed can be used to reveal the host's identity. For the present research, blood was collected from 33 bird species that are common in the U.S. as hosts for ticks but that have unknown Hb sequences. A top-down-assisted bottom-up mass spectrometry approach with a customized searching database, based on variability in known bird hemoglobin sequences, has been devised to facilitate fast and complete sequencing of hemoglobin from birds with unknown sequences. These hemoglobin sequences will be added to a hemoglobin database and used for tick host identification. The general approach has the potential to sequence any set of homologous proteins completely in a rapid manner.

  1. Exploring sequence characteristics related to high-level production of secreted proteins in Aspergillus niger.

    Directory of Open Access Journals (Sweden)

    Bastiaan A van den Berg

    Protein sequence features are explored in relation to the production of over-expressed extracellular proteins by fungi. Knowledge on features influencing protein production and secretion could be employed to improve enzyme production levels in industrial bioprocesses via protein engineering. A large set, over 600 homologous and nearly 2,000 heterologous fungal genes, were overexpressed in Aspergillus niger using a standardized expression cassette and scored for high versus no production. Subsequently, sequence-based machine learning techniques were applied for identifying relevant DNA and protein sequence features. The amino-acid composition of the protein sequence was found to be most predictive and interpretation revealed that, for both homologous and heterologous gene expression, the same features are important: tyrosine and asparagine composition was found to have a positive correlation with high-level production, whereas for unsuccessful production, contributions were found for methionine and lysine composition. The predictor is available online at http://bioinformatics.tudelft.nl/hipsec. Subsequent work aims at validating these findings by protein engineering as a method for increasing expression levels per gene copy.
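
    The amino-acid composition feature that this abstract reports as most predictive can be computed as in the sketch below. It only builds the 20-element feature vector; the classifier trained on it is not reproduced here, and the function name and the decision to ignore non-standard residue codes are assumptions.

```python
from collections import Counter

# The 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aa_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence.

    Characters outside the 20 standard codes (for example X or *) are ignored.
    The result is a list ordered like AMINO_ACIDS and sums to 1.
    """
    counts = Counter(c for c in sequence.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]
```

    Under the findings reported above, the entries for Y and N would be expected to correlate positively with high-level production and those for M and K with unsuccessful production.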

  2. OCCURRENCE OF SMALL HOMOLOGOUS AND COMPLEMENTARY FRAGMENTS IN HUMAN VIRUS GENOMES AND THEIR POSSIBLE ROLE

    Directory of Open Access Journals (Sweden)

    E. P. Kharchenko

    2017-01-01

    The occurrence of small homologous and complementary fragments (21 nucleotides in length) has been studied by computer analysis in the genomes of 14 human viruses that cause the most dangerous infections. The sample includes viruses with (+) and (–) single-stranded RNA and the DNA-containing hepatitis B virus. Analysis of the occurrence of homologous sequences has shown the existence of two extreme situations. On the one hand, a single virus may contain sequences homologous to almost all other viruses (for example, Ebola virus, severe acute respiratory syndrome-related coronavirus, and mumps virus), as well as numerous sequences homologous to one particular other virus (especially severe acute respiratory syndrome-related coronavirus to Dengue virus and Ebola virus to poliovirus). On the other hand, in the genomes of other viruses (rubella virus, hepatitis A virus, and hepatitis B virus) homologous sequences are rare and few in number. A similar situation exists for the occurrence of complementary sequences. Rubella virus, whose genome has a high content of guanine and cytosine, has no sequences complementary to almost all other viruses. Most viruses show a moderate level of occurrence of homologous and complementary sequences. Autocomplementary sequences are numerous in most viruses, which suggests that the genomes of single-stranded RNA viruses have a branched secondary structure. In addition to a possible role in recombination among strains, autocomplementary sequences could regulate the translation rate of virus proteins and determine their optimal proportion in virion assembly through genome and mRNA folding. The occurrence of small homologous and complementary sequences in RNA- and DNA-containing viruses may be the result of multiple recombinations in the past and the present, and may determine their adaptation and variability. Recombination may take place during coinfection of human and/or common hosts. Inclusion of homologous and complementary sequences into genome could not
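
    A fragment comparison of the kind described in this abstract can be sketched with sets of fixed-length substrings. The sketch makes several assumptions that the abstract does not state: genomes are given as DNA-alphabet strings (ACGT), "homologous" is taken to mean an exact shared 21-mer, and "complementary" is taken to mean that the reverse complement of a fragment occurs in the other genome.

```python
def kmers(sequence, k=21):
    """Return the set of all length-k substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

# Translation table for DNA complement; RNA genomes are assumed to be
# written in the DNA alphabet, as is common in sequence databases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(sequence):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return sequence.translate(COMPLEMENT)[::-1]

def shared_fragments(genome_a, genome_b, k=21):
    """Find fragments of genome_a that recur in genome_b.

    Returns two sets of k-mers taken from genome_a: those present
    verbatim in genome_b, and those whose reverse complement is
    present in genome_b.
    """
    kmers_a = kmers(genome_a, k)
    kmers_b = kmers(genome_b, k)
    homologous = kmers_a & kmers_b
    complementary = {f for f in kmers_a if reverse_complement(f) in kmers_b}
    return homologous, complementary
```

    Passing the same genome as both arguments gives a rough count of autocomplementary fragments, although palindromic k-mers then match themselves and would need separate handling.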

  3. Sequence homology and expression profile of genes associated with DNA repair pathways in Mycobacterium leprae.

    Science.gov (United States)

    Sharma, Mukul; Vedithi, Sundeep Chaitanya; Das, Madhusmita; Roy, Anindya; Ebenezer, Mannam

    2017-01-01

    Survival of Mycobacterium leprae, the causative bacterium of leprosy, in the human host is dependent to an extent on the ways in which its genome integrity is retained. DNA repair mechanisms protect bacterial DNA from damage induced by various stress factors. The current study is aimed at understanding the sequence and functional annotation of DNA repair genes in M. leprae. The genome of M. leprae was annotated using sequence alignment tools to identify DNA repair genes that have homologs in Mycobacterium tuberculosis and Escherichia coli. A set of 96 genes known to be involved in DNA repair mechanisms in E. coli and Mycobacteriaceae were chosen as a reference. Among these, 61 were identified in M. leprae based on sequence similarity and domain architecture. The 61 were classified into 36 characterized gene products (59%), 11 hypothetical proteins (18%), and 14 pseudogenes (23%). All these genes have homologs in M. tuberculosis and 49 (80.32%) in E. coli. A set of 12 genes which are absent in E. coli were present in M. leprae and in Mycobacteriaceae. These 61 genes were further investigated for their expression profiles in the whole transcriptome microarray data of M. leprae, which was obtained from the signal intensities of 60 bp probes tiling the entire genome with 10 bp overlaps. It was noted that transcripts corresponding to all the 61 genes were identified in the transcriptome data with varying expression levels ranging from 0.18 to 2.47 fold (normalized with 16S rRNA). The mRNA expression levels of a representative set of seven genes (four annotated and three hypothetical protein coding genes) were analyzed using quantitative Polymerase Chain Reaction (qPCR) assays with RNA extracted from skin biopsies of 10 newly diagnosed, untreated leprosy cases. It was noted that RNA expression levels were higher for genes involved in homologous recombination whereas the genes with a low level of expression are involved in the direct repair pathway. This study provided

  4. Structural and Sequence Similarities of Hydra Xeroderma Pigmentosum A Protein to Human Homolog Suggest Early Evolution and Conservation

    Directory of Open Access Journals (Sweden)

    Apurva Barve

    2013-01-01

    Full Text Available Xeroderma pigmentosum group A (XPA) is a protein that binds to damaged DNA, verifies presence of a lesion, and recruits other proteins of the nucleotide excision repair (NER) pathway to the site. Though its homologs from yeast, Drosophila, humans, and so forth are well studied, XPA has not so far been reported from protozoa and lower animal phyla. Hydra is a fresh-water cnidarian with a remarkable capacity for regeneration and apparent lack of organismal ageing. Cnidarians are among the first metazoa with a defined body axis, tissue grade organisation, and nervous system. We report here for the first time presence of XPA gene in hydra. Putative protein sequence of hydra XPA contains nuclear localization signal and bears the zinc-finger motif. It contains two conserved Pfam domains and various characterized features of XPA proteins like regions for binding to excision repair cross-complementing protein-1 (ERCC1) and replication protein A 70 kDa subunit (RPA70) proteins. Hydra XPA shows a high degree of similarity with vertebrate homologs and clusters with deuterostomes in phylogenetic analysis. Homology modelling corroborates the very close similarity between hydra and human XPA. The protein thus most likely functions in hydra in the same manner as in other animals, indicating that it arose early in evolution and has been conserved across animal phyla.

  5. Sequence homolog-based molecular engineering for shifting the enzymatic pH optimum

    Directory of Open Access Journals (Sweden)

    Fuqiang Ma

    2016-09-01

    Full Text Available Cell-free synthetic biology systems organize multiple enzymes (parts) from different sources to implement unnatural catalytic functions. A high degree of adaptation between the catalytic parts is crucial for building up efficient artificial biosynthetic systems. Protein engineering is a powerful technology to tailor various enzymatic properties including catalytic efficiency, substrate specificity, and temperature adaptation, and even to achieve new catalytic functions. However, altering enzymatic pH optimum still remains a challenging task. In this study, we proposed a novel sequence homolog-based protein engineering strategy for shifting the enzymatic pH optimum based on statistical analyses of sequence-function relationship data of an enzyme family. By two statistical procedures, artificial neural networks (ANNs) and least absolute shrinkage and selection operator (Lasso), five amino acids in the GH11 xylanase family were identified to be related to the evolution of enzymatic pH optimum. Site-directed mutagenesis of a thermophilic xylanase from Caldicellulosiruptor bescii revealed that four out of five mutations could alter the enzymatic pH optima toward acidic conditions without compromising the catalytic activity and thermostability. Combination of the positive mutants resulted in the best mutant M31, which decreased its pH optimum by 1.5 units and showed increased catalytic activity at pH < 5.0 compared to the wild-type enzyme. Structure analysis revealed that all the mutations are distant from the active center and may therefore be difficult to identify by a conventional rational design strategy. Interestingly, the four mutation sites are clustered at a certain region of the enzyme, suggesting a potential “hot zone” for regulating the pH optima of xylanases. This study provides an efficient method of modulating enzymatic pH optima based on statistical sequence analyses, which can facilitate the design and optimization of suitable catalytic parts for the construction

  6. Sequence homology and expression profile of genes associated with DNA repair pathways in Mycobacterium leprae

    Directory of Open Access Journals (Sweden)

    Mukul Sharma

    2017-01-01

    Full Text Available Background: Survival of Mycobacterium leprae, the causative bacterium of leprosy, in the human host is dependent to an extent on the ways in which its genome integrity is retained. DNA repair mechanisms protect bacterial DNA from damage induced by various stress factors. The current study is aimed at understanding the sequence and functional annotation of DNA repair genes in M. leprae. Methods: The genome of M. leprae was annotated using sequence alignment tools to identify DNA repair genes that have homologs in Mycobacterium tuberculosis and Escherichia coli. A set of 96 genes known to be involved in DNA repair mechanisms in E. coli and Mycobacteriaceae were chosen as a reference. Among these, 61 were identified in M. leprae based on sequence similarity and domain architecture. The 61 were classified into 36 characterized gene products (59%), 11 hypothetical proteins (18%), and 14 pseudogenes (23%). All these genes have homologs in M. tuberculosis and 49 (80.32%) in E. coli. A set of 12 genes which are absent in E. coli were present in M. leprae and in Mycobacteriaceae. These 61 genes were further investigated for their expression profiles in the whole transcriptome microarray data of M. leprae, which was obtained from the signal intensities of 60 bp probes tiling the entire genome with 10 bp overlaps. Results: It was noted that transcripts corresponding to all the 61 genes were identified in the transcriptome data with varying expression levels ranging from 0.18 to 2.47 fold (normalized with 16S rRNA). The mRNA expression levels of a representative set of seven genes (four annotated and three hypothetical protein coding genes) were analyzed using quantitative Polymerase Chain Reaction (qPCR) assays with RNA extracted from skin biopsies of 10 newly diagnosed, untreated leprosy cases. It was noted that RNA expression levels were higher for genes involved in homologous recombination whereas the genes with a low level of expression are involved in the
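
    The gene-count breakdown reported in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the variable and function names are illustrative and not taken from the study. Note that 49/61 rounds to 80.33%, so the reported 80.32% appears to be a truncated rather than rounded value.

```python
# Counts quoted in the abstract; names are illustrative only.
total_in_m_leprae = 61
categories = {"characterized": 36, "hypothetical": 11, "pseudogene": 14}
with_ecoli_homolog = 49


def percent(part, whole, digits=0):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


# Share of each annotation category among the 61 identified genes.
shares = {name: percent(n, total_in_m_leprae) for name, n in categories.items()}

# Share of the 61 genes that also have an E. coli homolog.
ecoli_share = percent(with_ecoli_homolog, total_in_m_leprae, 2)

# Genes present in M. leprae but without an E. coli homolog.
absent_in_ecoli = total_in_m_leprae - with_ecoli_homolog

print(shares, ecoli_share, absent_in_ecoli)
```

    The three categories sum to 61, and the difference 61 - 49 reproduces the 12 genes described as absent in E. coli.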

  7. Molecular cloning, sequence analysis and homology modeling of the first caudata amphibian antifreeze-like protein in axolotl (Ambystoma mexicanum).

    Science.gov (United States)

    Zhang, Songyan; Gao, Jiuxiang; Lu, Yiling; Cai, Shasha; Qiao, Xue; Wang, Yipeng; Yu, Haining

    2013-08-01

    Antifreeze proteins (AFPs) refer to a class of polypeptides that are produced by certain vertebrates, plants, fungi, and bacteria and which permit their survival in subzero environments. In this study, we report the molecular cloning, sequence analysis and three-dimensional structure of the axolotl antifreeze-like protein (AFLP) by homology modeling of the first caudate amphibian AFLP. We constructed a full-length spleen cDNA library of axolotl (Ambystoma mexicanum). An EST having highest similarity (∼42%) with freeze-responsive liver protein Li16 from Rana sylvatica was identified, and the full-length cDNA was subsequently obtained by RACE-PCR. The axolotl antifreeze-like protein sequence represents an open reading frame for a putative signal peptide and the mature protein composed of 93 amino acids. The calculated molecular mass and the theoretical isoelectric point (pI) of this mature protein were 10128.6 Da and 8.97, respectively. The molecular characterization of this gene and its deduced protein were further performed by detailed bioinformatics analysis. The three-dimensional structure of the current AFLP was predicted by homology modeling, and the conserved residues required for functionality were identified. The homology model constructed could be of use for effective drug design. This is the first report of an antifreeze-like protein identified from a caudate amphibian.

  8. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    KAUST Repository

    Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin

    2016-01-01

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment

  9. SANSparallel: interactive homology search against Uniprot.

    Science.gov (United States)

    Somervuo, Panu; Holm, Liisa

    2015-07-01

    Proteins evolve by mutations and natural selection. The network of sequence similarities is a rich source for mining homologous relationships that inform on protein structure and function. There are many servers available to browse the network of homology relationships but one has to wait up to a minute for results. The SANSparallel webserver provides protein sequence database searches with immediate response and professional alignment visualization by third-party software. The output is a list, pairwise alignment or stacked alignment of sequence-similar proteins from Uniprot, UniRef90/50, Swissprot or Protein Data Bank. The stacked alignments are viewed in Jalview or as sequence logos. The database search uses the suffix array neighborhood search (SANS) method, which has been re-implemented as a client-server, improved and parallelized. The method is extremely fast and as sensitive as BLAST above 50% sequence identity. Benchmarks show that the method is highly competitive compared to previously published fast database search programs: UBLAST, DIAMOND, LAST, LAMBDA, RAPSEARCH2 and BLAT. The web server can be accessed interactively or programmatically at http://ekhidna2.biocenter.helsinki.fi/cgi-bin/sans/sans.cgi. It can be used to make protein functional annotation pipelines more efficient, and it is useful in interactive exploration of the detailed evidence supporting the annotation of particular proteins of interest. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Whole genome analysis of CRISPR Cas9 sgRNA off-target homologies via an efficient computational algorithm.

    Science.gov (United States)

    Zhou, Hong; Zhou, Michael; Li, Daisy; Manthey, Joseph; Lioutikova, Ekaterina; Wang, Hong; Zeng, Xiao

    2017-11-17

    The beauty and power of the genome editing mechanism, CRISPR Cas9 endonuclease system, lies in the fact that it is RNA-programmable such that Cas9 can be guided to any genomic loci complementary to a 20-nt RNA, single guide RNA (sgRNA), to cleave double stranded DNA, allowing the introduction of wanted mutations. Unfortunately, it has been reported repeatedly that the sgRNA can also guide Cas9 to off-target sites where the DNA sequence is homologous to sgRNA. Using the human genome and Streptococcus pyogenes Cas9 (SpCas9) as an example, this article mathematically analyzed the probabilities of off-target homologies of sgRNAs and discovered that for a large genome such as the human genome, potential off-target homologies are inevitable for sgRNA selection. A highly efficient computational algorithm was developed for whole genome sgRNA design and off-target homology searches. By means of a dynamically constructed sequence-indexed database and a simplified sequence alignment method, this algorithm achieves very high efficiency while guaranteeing the identification of all existing potential off-target homologies. Via this algorithm, 1,876,775 sgRNAs were designed for the 19,153 human mRNA genes and only two sgRNAs were found to be free of off-target homology. By means of the novel and efficient sgRNA homology search algorithm introduced in this article, genome wide sgRNA design and off-target analysis were conducted and the results confirmed the mathematical analysis that for a sgRNA sequence, it is almost impossible to escape potential off-target homologies. Future innovations on the CRISPR Cas9 gene editing technology need to focus on how to eliminate the Cas9 off-target activity.
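
    The abstract does not give the mathematical analysis itself. As a rough illustration of why off-target homologies become hard to avoid in a large genome, the following sketch estimates the expected number of sites within a given number of mismatches of a 20-nt guide. It is not the authors' method: it assumes a random genome with uniform base frequencies, an approximate genome size of 3.1e9 bp, both strands searched, and an NGG PAM occurring with probability 1/16. All names and constants are illustrative assumptions.

```python
from math import comb

GUIDE_LENGTH = 20         # nt, as stated in the abstract
HUMAN_GENOME_BP = 3.1e9   # assumed approximate haploid genome size
PAM_PROBABILITY = 1 / 16  # assumed: NGG under uniform base frequencies


def match_probability(length, max_mismatches):
    """Probability that a random site differs from the guide at no more
    than `max_mismatches` positions (binomial tail, uniform bases)."""
    return sum(
        comb(length, i) * 0.75**i * 0.25 ** (length - i)
        for i in range(max_mismatches + 1)
    )


def expected_off_targets(genome_bp, max_mismatches):
    """Expected number of PAM-adjacent near-matches on both strands."""
    candidate_sites = 2 * genome_bp
    return (
        candidate_sites
        * PAM_PROBABILITY
        * match_probability(GUIDE_LENGTH, max_mismatches)
    )


for k in range(5):
    print(k, expected_off_targets(HUMAN_GENOME_BP, k))
```

    Under these simplifying assumptions the expected count is far below one for exact matches but rises above one once a few mismatches are tolerated, which is consistent in spirit with the conclusion quoted above. Real genomes are not random, so the numbers are only indicative.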

  11. Adhesive proteins of stalked and acorn barnacles display homology with low sequence similarities.

    Directory of Open Access Journals (Sweden)

    Jaimie-Leigh Jonker

    Full Text Available Barnacle adhesion underwater is an important phenomenon to understand for the prevention of biofouling and potential biotechnological innovations, yet so far, identifying what makes barnacle glue proteins 'sticky' has proved elusive. Examination of a broad range of species within the barnacles may be instructive to identify conserved adhesive domains. We add to extensive information from the acorn barnacles (order Sessilia) by providing the first protein analysis of a stalked barnacle adhesive, Lepas anatifera (order Lepadiformes). It was possible to separate the L. anatifera adhesive into at least 10 protein bands using SDS-PAGE. Intense bands were present at approximately 30, 70, 90 and 110 kilodaltons (kDa). Mass spectrometry for protein identification was followed by de novo sequencing which detected 52 peptides of 7-16 amino acids in length. None of the peptides matched published or unpublished transcriptome sequences, but some amino acid sequence similarity was apparent between L. anatifera and closely-related Dosima fascicularis. Antibodies against two acorn barnacle proteins (ab-cp-52k and ab-cp-68k) showed cross-reactivity in the adhesive glands of L. anatifera. We also analysed the similarity of adhesive proteins across several barnacle taxa, including Pollicipes pollicipes (a stalked barnacle in the order Scalpelliformes). Sequence alignment of published expressed sequence tags clearly indicated that P. pollicipes possesses homologues for the 19 kDa and 100 kDa proteins in acorn barnacles. Homology aside, sequence similarity in amino acid and gene sequences tended to decline as taxonomic distance increased, with minimum similarities of 18-26%, depending on the gene. The results indicate that some adhesive proteins (e.g. 100 kDa) are more conserved within barnacles than others (20 kDa).

  12. Complete Unique Genome Sequence, Expression Profile, and Salivary Gland Tissue Tropism of the Herpesvirus 7 Homolog in Pigtailed Macaques.

    Science.gov (United States)

    Staheli, Jeannette P; Dyen, Michael R; Deutsch, Gail H; Basom, Ryan S; Fitzgibbon, Matthew P; Lewis, Patrick; Barcy, Serge

    2016-08-01

    Human herpesvirus 6A (HHV-6A), HHV-6B, and HHV-7 are classified as roseoloviruses and are highly prevalent in the human population. Roseolovirus reactivation in an immunocompromised host can cause severe pathologies. While the pathogenic potential of HHV-7 is unclear, it can reactivate HHV-6 from latency and thus contributes to severe pathological conditions associated with HHV-6. Because of the ubiquitous nature of roseoloviruses, their roles in such interactions and the resulting pathological consequences have been difficult to study. Furthermore, the lack of a relevant animal model for HHV-7 infection has hindered a better understanding of its contribution to roseolovirus-associated diseases. Using next-generation sequencing analysis, we characterized the unique genome of an uncultured novel pigtailed macaque roseolovirus. Detailed genomic analysis revealed the presence of gene homologs to all 84 known HHV-7 open reading frames. Phylogenetic analysis confirmed that the virus is a macaque homolog of HHV-7, which we have provisionally named Macaca nemestrina herpesvirus 7 (MneHV7). Using high-throughput RNA sequencing, we observed that the salivary gland tissue samples from nine different macaques had distinct MneHV7 gene expression patterns and that the overall number of viral transcripts correlated with viral loads in parotid gland tissue and saliva. Immunohistochemistry staining confirmed that, like HHV-7, MneHV7 exhibits a natural tropism for salivary gland ductal cells. We also observed staining for MneHV7 in peripheral nerve ganglia present in salivary gland tissues, suggesting that HHV-7 may also have a tropism for the peripheral nervous system. Our data demonstrate that MneHV7-infected macaques represent a relevant animal model that may help clarify the causality between roseolovirus reactivation and diseases. Human herpesvirus 6A (HHV-6A), HHV-6B, and HHV-7 are classified as roseoloviruses. 
We have recently discovered that pigtailed macaques are naturally

  13. Identification of the porcine homologous of human disease causing trinucleotide repeat sequences

    DEFF Research Database (Denmark)

    Madsen, Lone Bruhn; Thomsen, Bo; Sølvsten, Christina Ane Elisabeth

    2007-01-01

    in this paper the identification of porcine noncoding and polyglutamine-encoding TNR regions and the comparison to the homologous TNRs from human, chimpanzee, dog, opossum, rat, and mouse. Several of the porcine TNR regions are highly polymorphic both within and between different breeds. The TNR regions...

  14. Evolutionary distance from human homologs reflects allergenicity of animal food proteins.

    Science.gov (United States)

    Jenkins, John A; Breiteneder, Heimo; Mills, E N Clare

    2007-12-01

    In silico analysis of allergens can identify putative relationships among protein sequence, structure, and allergenic properties. Such systematic analysis reveals that most plant food allergens belong to a restricted number of protein superfamilies, with pollen allergens behaving similarly. We have investigated the structural relationships of animal food allergens and their evolutionary relatedness to human homologs to define how closely a protein must resemble a human counterpart to lose its allergenic potential. Profile-based sequence homology methods were used to classify animal food allergens into Pfam families, and in silico analyses of their evolutionary and structural relationships were performed. Animal food allergens could be classified into 3 main families--tropomyosins, EF-hand proteins, and caseins--along with 14 minor families each composed of 1 to 3 allergens. The evolutionary relationships of each of these allergen superfamilies showed that in general, proteins with a sequence identity to a human homolog above approximately 62% were rarely allergenic. Single substitutions in otherwise highly conserved regions containing IgE epitopes in EF-hand parvalbumins may modulate allergenicity. These data support the premise that certain protein structures are more allergenic than others. Contrasting with plant food allergens, animal allergens, such as the highly conserved tropomyosins, challenge the capability of the human immune system to discriminate between foreign and self-proteins. Such immune responses run close to becoming autoimmune responses. Exploiting the closeness between animal allergens and their human homologs in the development of recombinant allergens for immunotherapy will need to consider the potential for developing unanticipated autoimmune responses.

  15. Nucleotide and amino acid sequences of a coat protein of an Ukrainian isolate of Potato virus Y: comparison with homologous sequences of other isolates and phylogenetic analysis

    Directory of Open Access Journals (Sweden)

    Budzanivska I. G.

    2014-03-01

    Full Text Available Aim. Identification of the widespread Ukrainian isolate(s) of PVY (Potato virus Y) in different potato cultivars and subsequent phylogenetic analysis of detected PVY isolates based on NA and AA sequences of coat protein. Methods. ELISA, RT-PCR, DNA sequencing and phylogenetic analysis. Results. PVY has been identified serologically in potato cultivars of Ukrainian selection. In this work we have optimized a method for total RNA extraction from potato samples and offered a sensitive and specific PCR-based test system of own design for diagnostics of the Ukrainian PVY isolates. Part of the CP gene of the Ukrainian PVY isolate has been sequenced and analyzed phylogenetically. It is demonstrated that the Ukrainian isolate of Potato virus Y (CP gene) has a higher percentage of homology with the recombinant isolates (strains) of this pathogen (approx. 98.8–99.8% homology for both nucleotide and translated amino acid sequences of the CP gene). The Ukrainian isolate of PVY is positioned in the separate cluster together with the isolates found in Syria, Japan and Iran; these isolates possibly have common origin. The Ukrainian PVY isolate is confirmed to be recombinant. Conclusions. This work underlines the need and provides the means for accurate monitoring of Potato virus Y in the agroecosystems of Ukraine. Most importantly, the phylogenetic analysis demonstrated the recombinant nature of this PVY isolate which has been attributed to the strain group O, subclade N:O.

  16. Dualities in persistent (co)homology

    International Nuclear Information System (INIS)

    De Silva, Vin; Morozov, Dmitriy; Vejdemo-Johansson, Mikael

    2011-01-01

    We consider sequences of absolute and relative homology and cohomology groups that arise naturally for a filtered cell complex. We establish algebraic relationships between their persistence modules, and show that they contain equivalent information. We explain how one can use the existing algorithm for persistent homology to process any of the four modules, and relate it to a recently introduced persistent cohomology algorithm. We present experimental evidence for the practical efficiency of the latter algorithm
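
    For readers unfamiliar with persistence, the simplest case (dimension 0, connected components) can be sketched with a union-find structure. This is a generic textbook illustration, not the algorithm or the dualities described in the paper. It assumes every vertex enters the filtration at value 0 and each edge enters at its weight; the function name and input layout are hypothetical.

```python
def h0_persistence(n_vertices, weighted_edges):
    """Return (birth, death) pairs for connected components.

    weighted_edges is an iterable of (weight, u, v) tuples. All vertices
    are assumed to be born at filtration value 0.0.
    """
    parent = list(range(n_vertices))

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The edge merges two components, so one of them dies here.
            parent[root_u] = root_v
            pairs.append((0.0, weight))
        # Otherwise the edge closes a cycle and does not affect H0.

    # Components that never merge persist forever.
    survivors = n_vertices - len(pairs)
    pairs.extend((0.0, float("inf")) for _ in range(survivors))
    return pairs


edges = [(1.0, 0, 1), (2.0, 1, 2), (3.0, 0, 2), (5.0, 2, 3)]
print(h0_persistence(4, edges))
```

    In the example, the edge of weight 3.0 joins two vertices that are already connected, so it produces no pair in dimension 0.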

  17. Neural network and SVM classifiers accurately predict lipid binding proteins, irrespective of sequence homology.

    Science.gov (United States)

    Bakhtiarizadeh, Mohammad Reza; Moradi-Shahrbabak, Mohammad; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2014-09-07

    Due to the central roles of lipid binding proteins (LBPs) in many biological processes, sequence based identification of LBPs is of great interest. The major challenge is that LBPs are diverse in sequence, structure, and function which results in low accuracy of sequence homology based methods. Therefore, there is a need for developing alternative functional prediction methods irrespective of sequence similarity. To identify LBPs from non-LBPs, the performances of support vector machine (SVM) and neural network were compared in this study. Comprehensive protein features and various techniques were employed to create datasets. Five-fold cross-validation (CV) and independent evaluation (IE) tests were used to assess the validity of the two methods. The results indicated that SVM outperforms neural network. SVM achieved 89.28% (CV) and 89.55% (IE) overall accuracy in identification of LBPs from non-LBPs and 92.06% (CV) and 92.90% (IE) (in average) for classification of different LBPs classes. Increasing the number and the range of extracted protein features as well as optimization of the SVM parameters significantly increased the efficiency of LBPs class prediction in comparison to the only previous report in this field. Altogether, the results showed that the SVM algorithm can be run on broad, computationally calculated protein features and offers a promising tool in detection of LBPs classes. The proposed approach has the potential to integrate and improve the common sequence alignment based methods. Copyright © 2014 Elsevier Ltd. All rights reserved.
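
    The five-fold cross-validation mentioned above partitions the dataset into five disjoint test folds and trains on the remaining four each time. A minimal sketch of the index bookkeeping follows, using only the standard library. The function names, the fixed seed, and the sample count in the usage lines are illustrative assumptions, and the classifier itself is omitted.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices round-robin into k nearly equal folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


splits = list(k_fold_indices(23, k=5))
for train, test in splits:
    print(len(train), len(test))
```

    Averaging the per-fold accuracy over the five splits gives a cross-validated estimate; an independent evaluation set, as used in the study, would be held out before any of this splitting.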

  18. Clustering evolving proteins into homologous families.

    Science.gov (United States)

    Chan, Cheong Xin; Mahbob, Maisarah; Ragan, Mark A

    2013-04-08

    Clustering sequences into groups of putative homologs (families) is a critical first step in many areas of comparative biology and bioinformatics. The performance of clustering approaches in delineating biologically meaningful families depends strongly on characteristics of the data, including content bias and degree of divergence. New, highly scalable methods have recently been introduced to cluster the very large datasets being generated by next-generation sequencing technologies. However, there has been little systematic investigation of how characteristics of the data impact the performance of these approaches. Using clusters from a manually curated dataset as reference, we examined the performance of a widely used graph-based Markov clustering algorithm (MCL) and a greedy heuristic approach (UCLUST) in delineating protein families coded by three sets of bacterial genomes of different G+C content. Both MCL and UCLUST generated clusters that are comparable to the reference sets at specific parameter settings, although UCLUST tends to under-cluster compositionally biased sequences (G+C content 33% and 66%). Using simulated data, we sought to assess the individual effects of sequence divergence, rate heterogeneity, and underlying G+C content. Performance decreased with increasing sequence divergence, decreasing among-site rate variation, and increasing G+C bias. Two MCL-based methods recovered the simulated families more accurately than did UCLUST. MCL using local alignment distances is more robust across the investigated range of sequence features than are greedy heuristics using distances based on global alignment. Our results demonstrate that sequence divergence, rate heterogeneity and content bias can individually and in combination affect the accuracy with which MCL and UCLUST can recover homologous protein families. 
For application to data that are more divergent, and exhibit higher among-site rate variation and/or content bias, MCL may often be the better
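
    The Markov clustering (MCL) procedure referred to above alternates an expansion step (squaring a column-stochastic matrix) with an inflation step (element-wise powering followed by renormalization). The following pure-Python sketch shows that loop on a small similarity graph. It is a simplified illustration, not the MCL software used in the study: the inflation value, iteration count, threshold, and cluster read-out are assumptions.

```python
def normalize_columns(matrix):
    """Scale each column of a square matrix so that it sums to 1."""
    n = len(matrix)
    for j in range(n):
        total = sum(matrix[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                matrix[i][j] /= total
    return matrix


def mcl(adjacency, inflation=2.0, iterations=50, threshold=1e-6):
    """Cluster a graph given as a square similarity matrix."""
    n = len(adjacency)
    # Add self-loops, then make the matrix column-stochastic.
    m = [
        [float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    m = normalize_columns(m)
    for _ in range(iterations):
        # Expansion: matrix product m * m (random walks of length two).
        m = [
            [sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        # Inflation: strengthen strong flows, weaken weak ones.
        m = normalize_columns([[x**inflation for x in row] for row in m])
    # Each row with remaining mass lists the members of one cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > threshold)
        if members:
            clusters.add(members)
    return clusters


# Two disconnected triangles: vertices 0-2 and 3-5.
triangles = [[0] * 6 for _ in range(6)]
for a, b in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    triangles[a][b] = triangles[b][a] = 1
print(mcl(triangles))
```

    Higher inflation values generally yield finer clusters; the read-out used here can return overlapping sets on some inputs, which full implementations handle explicitly.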

  19. Gene Discovery through Genomic Sequencing of Brucella abortus

    Science.gov (United States)

    Sánchez, Daniel O.; Zandomeni, Ruben O.; Cravero, Silvio; Verdún, Ramiro E.; Pierrou, Ester; Faccio, Paula; Diaz, Gabriela; Lanzavecchia, Silvia; Agüero, Fernán; Frasch, Alberto C. C.; Andersson, Siv G. E.; Rossetti, Osvaldo L.; Grau, Oscar; Ugalde, Rodolfo A.

    2001-01-01

    Brucella abortus is the etiological agent of brucellosis, a disease that affects bovines and humans. We generated DNA random sequences from the genome of B. abortus strain 2308 in order to characterize molecular targets that might be useful for developing immunological or chemotherapeutic strategies against this pathogen. The partial sequencing of 1,899 clones allowed the identification of 1,199 genomic sequence surveys (GSSs) with high homology (BLAST expect value < 10^-5) to sequences deposited in the GenBank databases. Among them, 925 represent putative novel genes for the Brucella genus. Out of 925 nonredundant GSSs, 470 were classified in 15 categories based on cellular function. Seven hundred GSSs showed no significant database matches and remain available for further studies in order to identify their function. A high number of GSSs with homology to Agrobacterium tumefaciens and Rhizobium meliloti proteins were observed, thus confirming their close phylogenetic relationship. Among them, several GSSs showed high similarity with genes related to nodule nitrogen fixation, synthesis of nod factors, nodulation protein symbiotic plasmid, and nodule bacteroid differentiation. We have also identified several B. abortus homologs of virulence and pathogenesis genes from other pathogens, including a homolog to both the Shda gene from Salmonella enterica serovar Typhimurium and the AidA-1 gene from Escherichia coli. Other GSSs displayed significant homologies to genes encoding components of the type III and type IV secretion machineries, suggesting that Brucella might also have an active type III secretion machinery. PMID:11159979

  20. FastBLAST: homology relationships for millions of proteins.

    Directory of Open Access Journals (Sweden)

    Morgan N Price

    Full Text Available BACKGROUND: All-versus-all BLAST, which searches for homologous pairs of sequences in a database of proteins, is used to identify potential orthologs, to find new protein families, and to provide rapid access to these homology relationships. As DNA sequencing accelerates and data sets grow, all-versus-all BLAST has become computationally demanding. METHODOLOGY/PRINCIPAL FINDINGS: We present FastBLAST, a heuristic replacement for all-versus-all BLAST that relies on alignments of proteins to known families, obtained from tools such as PSI-BLAST and HMMer. FastBLAST avoids most of the work of all-versus-all BLAST by taking advantage of these alignments and by clustering similar sequences. FastBLAST runs in two stages: the first stage identifies additional families and aligns them, and the second stage quickly identifies the homologs of a query sequence, based on the alignments of the families, before generating pairwise alignments. On 6.53 million proteins from the non-redundant Genbank database ("NR"), FastBLAST identifies new families 25 times faster than all-versus-all BLAST. Once the first stage is completed, FastBLAST identifies homologs for the average query in less than 5 seconds (8.6 times faster than BLAST) and gives nearly identical results. For hits above 70 bits, FastBLAST identifies 98% of the top 3,250 hits per query. CONCLUSIONS/SIGNIFICANCE: FastBLAST enables research groups that do not have supercomputers to analyze large protein sequence data sets. FastBLAST is open source software and is available at http://microbesonline.org/fastblast.

  1. SEQUENCING AND SEQUENCE ANALYSIS OF MYOSTATIN GENE IN THE EXON 1 OF THE CAMEL (CAMELUS DROMEDARIUS)

    Directory of Open Access Journals (Sweden)

    M. G. SHAH, A. S. QURESHI, M. REISSMANN AND H. J. SCHWARTZ

    2006-10-01

    Full Text Available Myostatin, also called growth differentiation factor-8 (GDF-8), is a member of the mammalian growth transforming family (TGF-beta superfamily), which is expressed specifically in developing and adult skeletal muscle. Muscular hypertrophy allele (mh allele) in the double muscle breeds involved mutation within the myostatin gene. Genomic DNA was isolated from the camel hair using NucleoSpin Tissue kit. Two animals of each of the six breeds namely, Marecha, Dhatti, Larri, Kohi, Sakrai and Cambelpuri were used for sequencing. For PCR amplification of the gene, a primer pair was designed from homolog regions of already published sequences of farm animals from GenBank. Results showed that camel myostatin possessed more than 90% homology with that of cattle, sheep and pig. Camel formed separate cluster from the pig in spite of having high homology (98%) and showed 94% homology with cattle and sheep as reported in literature. Sequence analysis of the PCR amplified part of exon 1 (256 bp) of the camel myostatin was identical among six camel breeds.

  2. De novo sequencing of two novel peptides homologous to calcitonin-like peptides, from skin secretion of the Chinese Frog, Odorrana schmackeri

    Directory of Open Access Journals (Sweden)

    Geisa P.C. Evaristo

    2015-09-01

    Full Text Available An MS/MS based analytical strategy was followed to solve the complete sequence of two new peptides from frog (Odorrana schmackeri) skin secretion. This involved reduction and alkylation with two different alkylating agents followed by high resolution tandem mass spectrometry. De novo sequencing was achieved by complementary CID and ETD fragmentations of full-length peptides and of selected tryptic fragments. Heavy and light isotope dimethyl labeling assisted with annotation of sequence ion series. The identified primary structures are GCD[I/L]STCATHN[I/L]VNE[I/L]NKFDKSKPSSGGVGPESP-NH2 and SCNLSTCATHNLVNELNKFDKSKPSSGGVGPESF-NH2, i.e. two carboxyamidated 34 residue peptides with an aminoterminal intramolecular ring structure formed by a disulfide bridge between Cys2 and Cys7. Edman degradation analysis of the second peptide positively confirmed the exact sequence, resolving I/L discriminations. Both peptide sequences are novel and share homology with calcitonin, calcitonin gene related peptide (CGRP) and adrenomedullin from other vertebrates. Detailed sequence analysis as well as the 34 residue length of both O. schmackeri peptides suggest they do not fully qualify as either calcitonins (32 residues) or CGRPs (37 amino acids) and may justify their classification in a novel peptide family within the calcitonin gene related peptide superfamily. Smooth muscle contractility assays with synthetic replicas of the S–S linked peptides on rat tail artery, uterus, bladder and ileum did not reveal myotropic activity.

  3. Productive Homologous and Non-homologous Recombination of Hepatitis C Virus in Cell Culture

    Science.gov (United States)

    Li, Yi-Ping; Mikkelsen, Lotte S.; Gottwein, Judith M.; Bukh, Jens

    2013-01-01

    Genetic recombination is an important mechanism for increasing diversity of RNA viruses, and constitutes a viral escape mechanism to host immune responses and to treatment with antiviral compounds. Although rare, epidemiologically important hepatitis C virus (HCV) recombinants have been reported. In addition, recombination is an important regulatory mechanism of cytopathogenicity for the related pestiviruses. Here we describe recombination of HCV RNA in cell culture leading to production of infectious virus. Initially, hepatoma cells were co-transfected with a replicating JFH1ΔE1E2 genome (genotype 2a) lacking functional envelope genes and strain J6 (2a), which has functional envelope genes but does not replicate in culture. After an initial decrease in the number of HCV positive cells, infection spread after 13–36 days. Sequencing of recovered viruses revealed non-homologous recombinants with J6 sequence from the 5′ end to the NS2–NS3 region followed by JFH1 sequence from Core to the 3′ end. These recombinants carried duplicated sequence of up to 2400 nucleotides. HCV replication was not required for recombination, as recombinants were observed in most experiments even when two replication incompetent genomes were co-transfected. Reverse genetic studies verified the viability of representative recombinants. After serial passage, subsequent recombination events reducing or eliminating the duplicated region were observed for some but not all recombinants. Furthermore, we found that inter-genotypic recombination could occur, but at a lower frequency than intra-genotypic recombination. Productive recombination of attenuated HCV genomes depended on expression of all HCV proteins and tolerated duplicated sequence. In general, no strong site specificity was observed. Non-homologous recombination was observed in most cases, while few homologous events were identified. A better understanding of HCV recombination could help identification of natural recombinants

  4. Prefiltering Model for Homology Detection Algorithms on GPU.

    Science.gov (United States)

    Retamosa, Germán; de Pedro, Luis; González, Ivan; Tamames, Javier

    2016-01-01

    Homology detection has evolved over time from heavy algorithms based on dynamic programming approaches to lightweight alternatives based on different heuristic models. However, the main problem with these algorithms is that they use complex statistical models, which makes it difficult to achieve a relevant speedup and find exact matches with the original results. Thus, their acceleration is essential. The aim of this article was to prefilter a sequence database. To make this work, we have implemented a groundbreaking heuristic model based on NVIDIA's graphics processing units (GPUs) and multicore processors. Depending on the sensitivity settings, this makes it possible to quickly reduce the sequence database by between 50% and 95%, while rejecting no significant sequences. Furthermore, this prefiltering application can be used together with multiple homology detection algorithms as a part of a next-generation sequencing system. Extensive performance and accuracy tests have been carried out in the Spanish National Centre for Biotechnology (NCB). The results show that GPU hardware can accelerate the execution times of existing homology detection applications, such as the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool for Proteins (BLASTP), by up to a factor of 4.
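
    To illustrate the general idea of prefiltering a database before running a more expensive homology search, here is a minimal CPU-only sketch based on shared k-mers. It is not the authors' GPU heuristic, whose details are not given in the abstract; the function names and the parameters `k` and `min_shared` are assumptions chosen for the example.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def prefilter(query, database, k=3, min_shared=2):
    """Keep the names of database sequences sharing >= min_shared k-mers with query.

    database maps sequence names to sequences; survivors would then be passed
    to a full homology search such as BLASTP.
    """
    query_kmers = kmers(query, k)
    return [
        name
        for name, seq in database.items()
        if len(query_kmers & kmers(seq, k)) >= min_shared
    ]


if __name__ == "__main__":
    # Hypothetical toy database, for illustration only.
    db = {"a": "MKTAYIAKQQ", "b": "GGGGGGGG", "c": "TAYIAK"}
    print(prefilter("MKTAYIAKQR", db))
```

    Raising `min_shared` shrinks the database more aggressively at the risk of discarding true homologs, which mirrors the sensitivity trade-off described in the abstract.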

  5. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

    Science.gov (United States)

    Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

    2012-07-15

    In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1% accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.

  6. Double Strand Break Repair, one mechanism can hide another: Alternative non-homologous end joining

    International Nuclear Information System (INIS)

    Rass, E.; Grabarz, A.; Bertrand, P.; Lopez, B.S.

    2012-01-01

    DNA double strand breaks are major cytotoxic lesions encountered by cells. They can be induced by ionizing radiation or endogenous stress and can lead to genetic instability. Two mechanisms compete for the repair of DNA double strand breaks: homologous recombination and non-homologous end joining (NHEJ). Homologous recombination requires DNA sequence homology and is initiated by single strand resection. Recently, advances have been made concerning the major steps and proteins involved in resection. NHEJ, in contrast, does not require sequence homology. The existence of a DNA double strand break repair mechanism independent of KU and ligase IV, the key proteins of the canonical non-homologous end joining pathway, has been revealed lately and named alternative non-homologous end joining. The hallmarks of this highly mutagenic pathway are deletions at repair junctions and frequent use of distal micro-homologies. This mechanism is also initiated by single strand resection of the break. The aim of this review is firstly to present recent data on single strand resection, and secondly to present the alternative NHEJ pathway, including a discussion of the fidelity of NHEJ. Based on current knowledge, canonical NHEJ does not appear to be an intrinsically mutagenic mechanism but, in contrast, a conservative one. The structure of broken DNA ends actually dictates the quality of repair by alternative NHEJ, which seems to be actually responsible for the mutagenesis previously attributed to canonical NHEJ. The existence of this novel DNA double strand break repair mechanism needs to be taken into account in the development of radiosensitizing strategies in order to optimise the efficiency of radiotherapy. (authors)

  7. Sequence of a cDNA encoding turtle high mobility group 1 protein.

    Science.gov (United States)

    Zheng, Jifang; Hu, Bi; Wu, Duansheng

    2005-07-01

    In order to obtain sequence information about the turtle HMG1 gene, a cDNA encoding the HMG1 protein of the Chinese soft-shell turtle (Pelodiscus sinensis) was amplified by RT-PCR from kidney total RNA, and was cloned, sequenced and analyzed. The results revealed that the open reading frame (ORF) of turtle HMG1 cDNA is 606 bp long. The ORF encodes 202 amino acid residues, from which two DNA-binding domains and one polyacidic region are derived. The DNA-binding domains share higher amino acid identity with the homologous sequences of chicken (96.5%) and mammals (74%) than with the homologous sequence of rainbow trout (67%). The polyacidic region shows 84.6% amino acid homology with the equivalent region of chicken HMG1 cDNA. Turtle HMG1 protein contains 3 Cys residues located at completely conserved positions. Conservation in sequence and structure suggests that the functions of turtle HMG1 cDNA may be highly conserved during evolution. To our knowledge, this is the first report of an HMG1 cDNA sequence in any reptile.

  8. PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases.

    Science.gov (United States)

    Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming

    2016-07-08

    The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also gives the possibility to use transmembrane proteins (TMPs) reference databases to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Assembly and dynamics of the bacteriophage T4 homologous recombination machinery

    Directory of Open Access Journals (Sweden)

    Morrical Scott W

    2010-12-01

    Homologous recombination (HR), a process involving the physical exchange of strands between homologous or nearly homologous DNA molecules, is critical for maintaining the genetic diversity and genome stability of species. Bacteriophage T4 is one of the classic systems for studies of homologous recombination. T4 uses HR for high-frequency genetic exchanges, for homology-directed DNA repair (HDR) processes including DNA double-strand break repair, and for the initiation of DNA replication (RDR). T4 recombination proteins are expressed at high levels during T4 infection in E. coli, and share strong sequence, structural, and/or functional conservation with their counterparts in cellular organisms. Biochemical studies of T4 recombination have provided key insights on DNA strand exchange mechanisms, on the structure and function of recombination proteins, and on the coordination of recombination and DNA synthesis activities during RDR and HDR. Recent years have seen the development of detailed biochemical models for the assembly and dynamics of presynaptic filaments in the T4 recombination system, for the atomic structure of T4 UvsX recombinase, and for the roles of DNA helicases in T4 recombination. The goal of this chapter is to review these recent advances and their implications for HR and HDR mechanisms in all organisms.

  10. An artificial functional family filter in homolog searching in next-generation sequencing metagenomics.

    Directory of Open Access Journals (Sweden)

    Ruofei Du

    In functional metagenomics, BLAST homology search is a common method to classify metagenomic reads into protein/domain sequence families such as Clusters of Orthologous Groups of proteins (COGs) in order to quantify the abundance of each COG in the community. The resulting functional profile of the community is then used in downstream analysis to correlate the change in abundance to environmental perturbation, clinical variation, and so on. However, the short read length coupled with next-generation sequencing technologies poses a barrier in this approach, essentially because similarity significance cannot be discerned by searching with short reads. Consequently, artificial functional families are produced, and those with a large number of assigned reads dramatically decrease the accuracy of the functional profile. There is no method available to address this problem. We intended to fill this gap in this paper. We revealed that BLAST similarity scores of homologues for short reads from the coding sequences of COG protein members are distributed differently from the scores of those derived elsewhere. We showed that, by choosing an appropriate score cut-off, we are able to filter out most artificial families and simultaneously preserve sufficient information to build the functional profile. We also showed that, by combined application of BLAST and RPS-BLAST, some artificial families with large read counts can be further identified after the score cut-off filtration. Evaluated on three experimental metagenomic datasets with different coverages, the proposed method was found to be robust against read coverage and to consistently outperform the E-value cut-off methods currently used in the literature.
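
    The score cut-off step described above can be sketched as a filter over tabular BLAST output. The sketch assumes the default 12-column tab-separated format (bit score in the last column) and assumes, for simplicity, that the subject identifier is already a family label; the cut-off value of 40.0, the function names and the example lines are hypothetical and are not taken from the paper.

```python
from collections import Counter


def filter_hits(lines, min_bitscore):
    """Keep (query, subject, bitscore) for hits whose bit score meets the cut-off.

    Each line is expected in 12-column tabular BLAST format:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        bitscore = float(fields[11])
        if bitscore >= min_bitscore:
            kept.append((fields[0], fields[1], bitscore))
    return kept


def family_counts(hits):
    """Count retained reads per subject family to form a simple functional profile."""
    return Counter(subject for _, subject, _ in hits)


if __name__ == "__main__":
    example = [
        "read1\tCOG0001\t95.0\t30\t1\t0\t1\t30\t5\t34\t1e-10\t60.5",
        "read2\tCOG0002\t80.0\t25\t5\t0\t1\t25\t7\t31\t0.5\t22.1",
    ]
    print(family_counts(filter_hits(example, min_bitscore=40.0)))
```

    In practice the cut-off would be chosen from the observed score distributions, as the abstract describes, rather than fixed in advance.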

  11. CBH1 homologs and variant CBH1 cellulase

    Energy Technology Data Exchange (ETDEWEB)

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Neefe, Paulien

    2014-07-01

    Disclosed are a number of homologs and variants of Hypocrea jecorina Cel7A (formerly Trichoderma reesei cellobiohydrolase I or CBH1), nucleic acids encoding the same and methods for producing the same. The homologs and variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted and/or deleted.

  12. External and semi-internal controls for PCR amplification of homologous sequences in mixed templates

    DEFF Research Database (Denmark)

    Kalle, Elena; Gulevich, Alexander; Rensing, Christopher Günther T

    2013-01-01

    In a mixed template, the presence of homologous target DNA sequences creates environments that almost inevitably give rise to artifacts and biases during PCR. Heteroduplexes, chimeras, and skewed template-to-product ratios are the exclusive attributes of mixed template PCR and never occur... This study demonstrated the efficiency of a model mixed template as an adequate external amplification control for a particular PCR application. The conditions of multi-template PCR do not allow implementation of a classic internal control; therefore we developed a convenient semi-internal control... as an acceptable alternative. In order to evaluate the effects of inhibitors, a model multi-template mix was amplified in a mixture with DNAse-treated sample. Semi-internal control allowed establishment of intervals for robust PCR performance for different samples, thus enabling correct comparison of the samples...

  13. Sequence and transcription analysis of the human cytomegalovirus DNA polymerase gene

    International Nuclear Information System (INIS)

    Kouzarides, T.; Bankier, A.T.; Satchwell, S.C.; Weston, K.; Tomlinson, P.; Barrell, B.G.

    1987-01-01

    DNA sequence analysis has revealed that the gene coding for the human cytomegalovirus (HCMV) DNA polymerase is present within the long unique region of the virus genome. Identification is based on extensive amino acid homology between the predicted HCMV open reading frame HFLF2 and the DNA polymerase of herpes simplex virus type 1. The authors present here a 5280 base-pair DNA sequence containing the HCMV pol gene, along with the analysis of transcripts encoded within this region. Since HCMV pol also shows homology to the predicted Epstein-Barr virus pol, they were able to analyze the extent of homology between the DNA polymerases of three distantly related herpes viruses, HCMV, Epstein-Barr virus, and herpes simplex virus. The comparison shows that these DNA polymerases exhibit considerable amino acid homology and highlights a number of highly conserved regions; two such regions show homology to sequences within the adenovirus type 2 DNA polymerase. The HCMV pol gene is flanked by open reading frames with homology to those of other herpes viruses; upstream, there is a reading frame homologous to the glycoprotein B gene of herpes simplex virus type I and Epstein-Barr virus, and downstream there is a reading frame homologous to BFLF2 of Epstein-Barr virus

  14. Two human cDNA molecules coding for the Duchenne muscular dystrophy (DMD) locus are highly homologous

    Energy Technology Data Exchange (ETDEWEB)

    Rosenthal, A.; Speer, A.; Billwitz, H. (Zentralinstitut fuer Molekularbiologie, Berlin-Buch (Germany Democratic Republic)); Cross, G.S.; Forrest, S.M.; Davies, K.E. (Univ. of Oxford (England))

    1989-07-11

    Recently the complete sequence of the human fetal cDNA coding for the Duchenne muscular dystrophy (DMD) locus was reported and a 3,685 amino acid long, rod-shaped cytoskeletal protein (dystrophin) was predicted as the protein product. Independently, the authors have isolated and sequenced different DMD cDNA molecules from human adult and fetal muscle. The complete 12.5 kb long sequence of all their cDNA clones has now been determined and they report here the nucleotide (nt) and amino acid (aa) differences between the sequences of both groups. The cDNA sequence comprises the whole coding region but lacks the first 110 nt from the 5′-untranslated region and the last 1,417 nt of the 3′-untranslated region. They have found 11 nt differences (approximately 99.9% homology) from which 7 occurred at the aa level.
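
    The quoted figure of approximately 99.9% can be checked with simple arithmetic. The sketch below assumes the compared length is about 12,500 nt (the "12.5 kb" stated above); the exact aligned length is not given in the abstract, and the function name is hypothetical.

```python
def identity_from_differences(n_differences, compared_length):
    """Percent identity implied by a number of differences over a compared length."""
    return 100.0 * (1.0 - n_differences / compared_length)


if __name__ == "__main__":
    # 11 nucleotide differences over an assumed length of roughly 12,500 nt.
    print(round(identity_from_differences(11, 12_500), 1))
```

    With these inputs the result is 99.912%, which rounds to the 99.9% reported.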

  15. Isolation, sequence identification and tissue expression profile of a ...

    African Journals Online (AJOL)

    The complete coding sequence (CDS) of the Banna mini-pig inbred line (BMI) ribokinase gene (RBKS) was amplified using reverse transcription-polymerase chain reaction (RT-PCR) based on the conserved sequence information of cattle or other mammals and known highly homologous swine ESTs.

  16. Primary structure and functional characterization of a Drosophila dopamine receptor with high homology to human D1/5 receptors.

    Science.gov (United States)

    Gotzes, F; Balfanz, S; Baumann, A

    1994-01-01

    Members of the superfamily of G-protein coupled receptors share significant similarities in sequence and transmembrane architecture. We have isolated a Drosophila homologue of the mammalian dopamine receptor family using a low stringency hybridization approach. The deduced amino acid sequence is approximately 70% homologous to the human D1/D5 receptors. When expressed in HEK 293 cells, the Drosophila receptor stimulates cAMP production in response to dopamine application. This effect was mimicked by SKF 38393, a specific D1 receptor agonist, but inhibited by dopaminergic antagonists such as butaclamol and flupentixol. In situ hybridization revealed that the Drosophila dopamine receptor is highly expressed in the somata of the optic lobes. This suggests that the receptor might be involved in the processing of visual information and/or visual learning in invertebrates.

  17. A family of cell-adhering peptides homologous to fibrinogen C-termini

    International Nuclear Information System (INIS)

    Levy-Beladev, Liron; Levdansky, Lilia; Gaberman, Elena; Friedler, Assaf; Gorodetsky, Raphael

    2010-01-01

    Research highlights: → Cell-adhesive sequences homologous to fibrinogen C-termini exist in other proteins. → The extended homologous cell-adhesive C-termini peptides family is termed Haptides. → In membrane-like environment random coiled Haptides adopt a helical conformation. → Replacing positively charged residues with alanine reduces Haptides activity. -- Abstract: A family of cell-adhesive peptides homologous to sequences on different chains of fibrinogen was investigated. These homologous peptides, termed Haptides, include the peptides Cβ, preCγ, and CαE, corresponding to sequences on the C-termini of fibrinogen chains β, γ, and αE, respectively. Haptides do not affect cell survival and rate of proliferation of the normal cell types tested. The use of new sensitive assays of cell adhesion clearly demonstrated the ability of Haptides, bound to inert matrices, to mediate attachment of different matrix-dependent cell types including normal fibroblasts, endothelial, and smooth muscle cells. Here we present new active Haptides bearing homologous sequences derived from the C-termini of other proteins, such as angiopoietin 1 and 2, tenascins C and X, and microfibril-associated glycoprotein-4. The cell adhesion properties of all the Haptides were found to be associated mainly with their 11 N-terminal residues. Mutated preCγ peptides revealed that positively charged residues account for their attachment effect. These results suggest a mechanism of direct electrostatic interaction of Haptides with the cell membrane. The extended Haptides family may be applied in modulating adhesion of cells to scaffolds for tissue regeneration and for enhancement of nanoparticulate transfection into cells.

  18. Induction of homologous recombination in Saccharomyces cerevisiae.

    Science.gov (United States)

    Simon, J R; Moore, P D

    1988-09-01

    We have investigated the effects of UV irradiation of Saccharomyces cerevisiae in order to distinguish whether UV-induced recombination results from the induction of enzymes required for homologous recombination, or the production of substrate sites for recombination containing regions of DNA damage. We utilized split-dose experiments to investigate the induction of proteins required for survival, gene conversion, and mutation in a diploid strain of S. cerevisiae. We demonstrate that inducing doses of UV irradiation followed by a 6 h period of incubation render the cells resistant to challenge doses of UV irradiation. The effects of inducing and challenge doses of UV irradiation upon interchromosomal gene conversion and mutation are strictly additive. Using the yeast URA3 gene cloned in non-replicating single- and double-stranded plasmid vectors that integrate into chromosomal genes upon transformation, we show that UV irradiation of haploid yeast cells and homologous plasmid DNA sequences each stimulate homologous recombination approximately two-fold, and that these effects are additive. Non-specific DNA damage has little effect on the stimulation of homologous recombination, as shown by studies in which UV-irradiated heterologous DNA was included in transformation/recombination experiments. We further demonstrate that the effect of competing single- and double-stranded heterologous DNA sequences differs in UV-irradiated and unirradiated cells, suggesting an induction of recombinational machinery in UV-irradiated S. cerevisiae cells.

  19. Designing a Bioengine for Detection and Analysis of Base String on an Affected Sequence in High-Concentration Regions

    Directory of Open Access Journals (Sweden)

    Debnath Bhattacharyya

    2013-01-01

    We design an algorithm for a bioengine. The program searches for optimal alignments between two sequences: the host sequence (normal plant) and the query sequence (virus). Searching for homologues of biological sequences has become a routine operation, carried out here in a 4 × 4 combination with different subsequence (word) sizes. The program takes advantage of the high degree of homology between such sequences to construct an alignment of the matching regions. The main aim is to detect overlapping reading frames. The program also makes it possible to find highly infected clones by selecting the highest matching regions with minimal gap or mismatch zones, as well as unique virus clone matches. It is a small, portable, interactive, front-end program intended to find the regions of matching between the host sequence and query subsequences. All operations are carried out in a fraction of a second, depending on the required task and on the sequence length.

  20. Designing a Bioengine for Detection and Analysis of Base String on an Affected Sequence in High-Concentration Regions

    Science.gov (United States)

    Mandal, Bijoy Kumar; Kim, Tai-hoon

    2013-01-01

    We design an algorithm for a bioengine. The program searches for optimal alignments between two sequences: the host sequence (normal plant) and the query sequence (virus). Searching for homologues of biological sequences has become a routine operation, carried out here in a 4 × 4 combination with different subsequence (word) sizes. The program takes advantage of the high degree of homology between such sequences to construct an alignment of the matching regions. The main aim is to detect overlapping reading frames. The program also makes it possible to find highly infected clones by selecting the highest matching regions with minimal gap or mismatch zones, as well as unique virus clone matches. It is a small, portable, interactive, front-end program intended to find the regions of matching between the host sequence and query subsequences. All operations are carried out in a fraction of a second, depending on the required task and on the sequence length. PMID:24000321
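
    Locating matching regions between a host sequence and query words of a fixed size can be sketched with a simple word index. This is a generic exact-match illustration, not the published bioengine; the function names, the default word size of 4 and the example sequences are assumptions.

```python
from collections import defaultdict


def index_words(seq, word_size):
    """Map every word of length word_size in seq to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - word_size + 1):
        index[seq[i:i + word_size]].append(i)
    return index


def matching_regions(host, query, word_size=4):
    """Return (host_start, query_start) pairs where a query word occurs exactly in host."""
    index = index_words(host, word_size)
    hits = []
    for j in range(len(query) - word_size + 1):
        word = query[j:j + word_size]
        for i in index.get(word, []):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequences, not plant or virus data.
    print(matching_regions("ACGTACGTTT", "CGTT"))
```

    Hits that lie on the same diagonal (equal difference between host and query positions) can then be merged into longer matching regions with few gaps or mismatches.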

  1. A recurrent translocation is mediated by homologous recombination between HERV-H elements

    Directory of Open Access Journals (Sweden)

    Hermetz Karen E

    2012-01-01

    Background: Chromosome rearrangements are caused by many mutational mechanisms; of these, recurrent rearrangements can be particularly informative for teasing apart DNA sequence-specific factors. Some recurrent translocations are mediated by homologous recombination between large blocks of segmental duplications on different chromosomes. Here we describe a recurrent unbalanced translocation caused by recombination between shorter homologous regions on chromosomes 4 and 18 in two unrelated children with intellectual disability. Results: Array CGH resolved the breakpoints of the 6.97-Megabase (Mb) loss of 18q and the 7.30-Mb gain of 4q. Sequencing across the translocation breakpoints revealed that both translocations occurred between 92%-identical human endogenous retrovirus (HERV) elements in the same orientation on chromosomes 4 and 18. In addition, we find sequence variation in the chromosome 4 HERV that makes one allele more like the chromosome 18 HERV. Conclusions: Homologous recombination between HERVs on the same chromosome is known to cause chromosome deletions, but this is the first report of interchromosomal HERV-HERV recombination leading to a translocation. It is possible that normal sequence variation in substrates of non-allelic homologous recombination (NAHR) affects the alignment of recombining segments and influences the propensity to chromosome rearrangement.

  2. Tracking TCRβ sequence clonotype expansions during antiviral therapy using high-throughput sequencing of the hypervariable region

    Directory of Open Access Journals (Sweden)

    Mark W Robinson

    2016-04-01

    To maintain a persistent infection, viruses such as hepatitis C virus (HCV) employ a range of mechanisms that subvert protective T cell responses. The suppression of antigen-specific T cell responses by HCV hinders efforts to profile T cell responses during chronic infection and antiviral therapy. Conventional methods of detecting antigen-specific T cells utilise either antigen stimulation (e.g. ELISpot, proliferation assays, cytokine production) or antigen-loaded tetramer staining. This limits the ability to profile T cell responses during chronic infection due to suppressed effector function and the requirement for prior knowledge of antigenic viral peptide sequences. Recently, high-throughput sequencing (HTS) technologies have been developed for the analysis of T cell repertoires. In the present study we have assessed the feasibility of HTS of the TCRβ complementarity determining region 3 (CDR3) to track T cell expansions in an antigen-independent manner. Using sequential blood samples from HCV-infected individuals undergoing antiviral therapy, we were able to measure the population frequencies of >35,000 TCRβ sequence clonotypes in each individual over the course of 12 weeks. TRBV/TRBJ gene segment usage varied markedly between individuals but remained relatively constant within individuals across the course of therapy. Despite this stable TRBV/TRBJ gene segment usage, a number of TCRβ sequence clonotypes showed dramatic changes in read frequency. These changes could not be linked to therapy outcomes in the present study; however, the TCRβ CDR3 sequences with the largest fold changes did include sequences with identical TRBV/TRBJ gene segment usage and high joining region homology to previously published CDR3 sequences from HCV-specific T cells targeting the HLA-B*0801-restricted 1395HSKKKCDEL1403 and HLA-A*0101–restricted 1435ATDALMTGY1443 epitopes. The pipeline developed in this proof of concept study provides a platform for the design of

  3. Regulation of homologous recombination at telomeres in budding yeast

    DEFF Research Database (Denmark)

    Eckert-Boulet, Nadine; Lisby, Michael

    2010-01-01

    Homologous recombination is suppressed at normal length telomere sequences. In contrast, telomere recombination is allowed when telomeres erode in the absence of telomerase activity or as a consequence of nucleolytic degradation or incomplete replication. Here, we review the mechanisms that contribute to regulating mitotic homologous recombination at telomeres and the role of these mechanisms in signalling short telomeres in the budding yeast Saccharomyces cerevisiae.

  4. Micropathogen Community Analysis in Hyalomma rufipes via High-Throughput Sequencing of Small RNAs

    Science.gov (United States)

    Luo, Jin; Liu, Min-Xuan; Ren, Qiao-Yun; Chen, Ze; Tian, Zhan-Cheng; Hao, Jia-Wei; Wu, Feng; Liu, Xiao-Cui; Luo, Jian-Xun; Yin, Hong; Wang, Hui; Liu, Guang-Yuan

    2017-01-01

    Ticks are important vectors in the transmission of a broad range of micropathogens to vertebrates, including humans. Because of the role of ticks in disease transmission, identifying and characterizing the micropathogen profiles of tick populations have become increasingly important. The objective of this study was to survey the micropathogens of Hyalomma rufipes ticks. Illumina HiSeq2000 technology was utilized to perform deep sequencing of small RNAs (sRNAs) extracted from field-collected H. rufipes ticks in Gansu Province, China. The resultant sRNA library data revealed that the surveyed tick populations produced reads that were homologous to St. Croix River Virus (SCRV) sequences. We also observed many reads that were homologous to microbial and/or pathogenic isolates, including bacteria, protozoa, and fungi. As part of this analysis, a phylogenetic tree was constructed to display the relationships among the homologous sequences that were identified. The study offered a unique opportunity to gain insight into the micropathogens of H. rufipes ticks. The effective control of arthropod vectors in the future will require knowledge of the micropathogen composition of vectors harboring infectious agents. Understanding the ecological factors that regulate vector propagation in association with the prevalence and persistence of micropathogen lineages is also imperative. These interactions may affect the evolution of micropathogen lineages, especially if the micropathogens rely on the vector or host for dispersal. The sRNA deep-sequencing approach used in this analysis provides an intuitive method to survey micropathogen prevalence in ticks and other vector species. PMID:28861401

  5. Evaluating the efficacy of a structure-derived amino acid substitution matrix in detecting protein homologs by BLAST and PSI-BLAST.

    Science.gov (United States)

    Goonesekere, Nalin Cw

    2009-01-01

    The large numbers of protein sequences generated by whole genome sequencing projects require rapid and accurate methods of annotation. The detection of homology through computational sequence analysis is a powerful tool in determining the complex evolutionary and functional relationships that exist between proteins. Homology search algorithms employ amino acid substitution matrices to detect similarity between protein sequences. The substitution matrices in common use today are constructed using sequences aligned without reference to protein structure. Here we present amino acid substitution matrices constructed from the alignment of a large number of protein domain structures from the Structural Classification of Proteins (SCOP) database. We show that when incorporated into the homology search algorithms BLAST and PSI-BLAST, the structure-based substitution matrices enhance the efficacy of detecting remote homologs.

  6. Evaluating the efficacy of a structure-derived amino acid substitution matrix in detecting protein homologs by BLAST and PSI-BLAST

    Directory of Open Access Journals (Sweden)

    Nalin CW Goonesekere

    2009-06-01

    Full Text Available Nalin CW Goonesekere, Department of Chemistry and Biochemistry, University of Northern Iowa, Cedar Falls, IA, USA. Abstract: The large numbers of protein sequences generated by whole genome sequencing projects require rapid and accurate methods of annotation. The detection of homology through computational sequence analysis is a powerful tool in determining the complex evolutionary and functional relationships that exist between proteins. Homology search algorithms employ amino acid substitution matrices to detect similarity between protein sequences. The substitution matrices in common use today are constructed using sequences aligned without reference to protein structure. Here we present amino acid substitution matrices constructed from the alignment of a large number of protein domain structures from the Structural Classification of Proteins (SCOP) database. We show that when incorporated into the homology search algorithms BLAST and PSI-BLAST, the structure-based substitution matrices enhance the efficacy of detecting remote homologs. Keywords: computational biology, protein homology, amino acid substitution matrix, protein structure

  7. Implication of the cause of differences in 3D structures of proteins with high sequence identity based on analyses of amino acid sequences and 3D structures.

    Science.gov (United States)

    Matsuoka, Masanari; Sugita, Masatake; Kikuchi, Takeshi

    2014-09-18

    Proteins that share a high sequence homology while exhibiting drastically different 3D structures are investigated in this study. Recently, artificial proteins related to the sequences of the GA and IgG binding GB domains of human serum albumin have been designed. These artificial proteins, referred to as GA and GB, share 98% amino acid sequence identity but exhibit different 3D structures, namely, a 3α bundle versus a 4β + α structure. Discriminating between their 3D structures based on their amino acid sequences is a very difficult problem. In the present work, in addition to using bioinformatics techniques, an analysis based on inter-residue average distance statistics is used to address this problem. It was hard to distinguish which structure a given sequence would take only with the results of ordinary analyses like BLAST and conservation analyses. However, in addition to these analyses, with the analysis based on the inter-residue average distance statistics and our sequence tendency analysis, we could infer which part would play an important role in its structural formation. The results suggest possible determinants of the different 3D structures for sequences with high sequence identity. The possibility of discriminating between the 3D structures based on the given sequences is also discussed.

  8. Homology groups for particles on one-connected graphs

    Science.gov (United States)

    Maciążek, Tomasz; Sawicki, Adam

    2017-06-01

    We present a mathematical framework for describing the topology of configuration spaces for particles on one-connected graphs. In particular, we compute the homology groups over integers for different classes of one-connected graphs. Our approach is based on some fundamental combinatorial properties of the configuration spaces, Mayer-Vietoris sequences for different parts of configuration spaces, and some limited use of discrete Morse theory. As one of the results, we derive the closed-form formulae for ranks of the homology groups for indistinguishable particles on tree graphs. We also give a detailed discussion of the second homology group of the configuration space of both distinguishable and indistinguishable particles. Our motivation is the search for new kinds of quantum statistics.

  9. Efficient Detection of Copy Number Mutations in PMS2 Exons with a Close Homolog.

    Science.gov (United States)

    Herman, Daniel S; Smith, Christina; Liu, Chang; Vaughn, Cecily P; Palaniappan, Selvi; Pritchard, Colin C; Shirts, Brian H

    2018-07-01

    Detection of 3' PMS2 copy-number mutations that cause Lynch syndrome is difficult because of highly homologous pseudogenes. To improve the accuracy and efficiency of clinical screening for these mutations, we developed a new method to analyze standard capture-based, next-generation sequencing data to identify deletions and duplications in PMS2 exons 9 to 15. The approach captures sequences using PMS2 targets, maps sequences randomly among regions with equal mapping quality, counts reads aligned to homologous exons and introns, and flags read count ratios outside of empirically derived reference ranges. The method was trained on 1352 samples, including 8 known positives, and tested on 719 samples, including 17 known positives. Clinical implementation of the first version of this method detected new mutations in the training (N = 7) and test (N = 2) sets that had not been identified by our initial clinical testing pipeline. The described final method showed complete sensitivity in both sample sets and false-positive rates of 5% (training) and 7% (test), dramatically decreasing the number of cases needing additional mutation evaluation. This approach leveraged the differences between gene and pseudogene to distinguish between PMS2 and PMS2CL copy-number mutations. These methods enable efficient and sensitive Lynch syndrome screening for 3' PMS2 copy-number mutations and may be applied similarly to other genomic regions with highly homologous pseudogenes. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
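    The flagging step described in the abstract above (count reads per homologous region, then flag read count ratios that fall outside a reference range) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `flag_exons`, the choice of a per-exon control count as the ratio denominator, the read counts, and the 0.8 to 1.2 range are all hypothetical.

```python
def flag_exons(sample_counts, control_counts, reference_ranges):
    """Return {exon: ratio} for exons whose read-count ratio is outside its range.

    sample_counts / control_counts: reads counted per exon (hypothetical inputs).
    reference_ranges: {exon: (low, high)} bounds, assumed to be derived
    empirically from a training set, as the abstract describes in general terms.
    """
    flagged = {}
    for exon, (low, high) in reference_ranges.items():
        ratio = sample_counts[exon] / control_counts[exon]
        if not (low <= ratio <= high):
            flagged[exon] = ratio
    return flagged


# Made-up counts for three of the PMS2 exons named in the abstract.
sample_counts = {"exon9": 480, "exon12": 260, "exon15": 510}
control_counts = {"exon9": 500, "exon12": 500, "exon15": 500}
reference_ranges = {exon: (0.8, 1.2) for exon in sample_counts}

flagged = flag_exons(sample_counts, control_counts, reference_ranges)
print(sorted(flagged))  # only exon12 (ratio about 0.52) falls outside the range
```

    A ratio near 0.5 in this toy setup would be the kind of signal sent on for additional mutation evaluation; how the real method normalizes counts between gene and pseudogene is not specified in the abstract.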

  10. p53 regulates the repair of DNA double-strand breaks by both homologous and non-homologous recombination

    International Nuclear Information System (INIS)

    Willers, H.; Powell, S.N.; Dahm-Daphi, J.

    2003-01-01

    Full text: p53 is known to suppress spontaneous homologous recombination (HR), while its role in non-homologous recombination (NHR) remains to be clarified. Here, we sought to determine the influence of p53 on the repair of chromosomal double-strand breaks (DSBs) by HR or NHR using specially designed recombination substrates that integrate into the genome. Isogenic mouse fibroblast pairs with or without expression of exogenous p53 protein were utilized. A reporter plasmid carrying a mutated XGPRT gene was chromosomally integrated and DSBs were generated within the plasmid by the I-SceI endonuclease. Subsequent homology-mediated repair from an episomal donor resulted in XGPRT reconstitution and cellular resistance to a selection antibiotic. Analogously, the repair of chromosomal I-SceI breaks by NHR using another novel reporter plasmid restored XGPRT translation. For p53-null cells, the mean frequency of I-SceI break repair via HR was 5.5 × 10⁻⁴. The p53-Val135 mutant, which previously has been shown to suppress spontaneous HR by 14-fold employing the same cell system and reporter gene, only caused a 2- to 3-fold suppression of break-induced HR. In contrast, a dramatic effect of p53 on repair via NHR was found. Preliminary sequence analysis indicated that there was at least a 1000-fold reduction of illegitimate repair events resulting in loss of sequence at the break sites. The observed effects were mediated by p53 mutants defective in regulation of the cell-cycle and apoptosis. The main findings were: (1) p53 virtually blocked illegitimate rejoining of chromosomal ends. (2) The suppression of homologous DSB repair was less pronounced than the inhibition of spontaneous HR. We hypothesize that p53 allows error-free homology-dependent repair to proceed to a certain extent, while blocking error-prone NHR. The data support and extend a previous model, in which p53 maintains genomic stability by regulating recombination independently of its transactivation function.

  11. Determination of 5'-leader sequences from radically disparate strains of porcine reproductive and respiratory syndrome virus reveals the presence of highly conserved sequence motifs

    DEFF Research Database (Denmark)

    Oleksiewicz, M.B.; Bøtner, Anette; Nielsen, Jens

    1999-01-01

    We determined the untranslated 5'-leader sequence for three different isolates of porcine reproductive and respiratory syndrome virus (PRRSV): pathogenic European- and American-types, as well as an American-type vaccine strain. 5'-leader from European- and American-type PRRSV differed in length...... (220 and 190 nt, respectively), and exhibited only approximately 50% nucleotide homology. Nevertheless, highly conserved areas were identified in the leader of all 3 PRRSV isolates, which constitute candidate motifs for binding of protein(s) involved in viral replication. These comparative data provide...

  12. Isolation and sequence of complementary DNA encoding human extracellular superoxide dismutase

    International Nuclear Information System (INIS)

    Hjalmarsson, K.; Marklund, S.L.; Engstroem, A.; Edlund, T.

    1987-01-01

    A complementary DNA (cDNA) clone from a human placenta cDNA library encoding extracellular superoxide dismutase has been isolated and the nucleotide sequence determined. The cDNA has a very high G + C content. EC-SOD is synthesized with a putative 18-amino acid signal peptide, preceding the 222 amino acids in the mature enzyme, indicating that the enzyme is a secretory protein. The first 95 amino acids of the mature enzyme show no sequence homology with other sequenced proteins and there is one possible N-glycosylation site (Asn-89). The amino acid sequence from residues 96-193 shows strong homology (∼ 50%) with the final two-thirds of the sequences of all known eukaryotic CuZn SODs, whereas the homology with the P. leiognathi CuZn SOD is clearly lower. The ligands to Cu and Zn, the cysteines forming the intrasubunit disulfide bridge in the CuZn SODs, and the arginine found in all CuZn SODs in the entrance to the active site can all be identified in EC-SOD. A comparison with bovine CuZn SOD, the three-dimensional structure of which is known, reveals that the homologies occur in the active site and the divergencies are in the part constituting the subunit contact area in CuZn SOD. Amino acid sequence 194-222 in the carboxyl-terminal end of EC-SOD is strongly hydrophilic and contains nine amino acids with a positive charge. This sequence probably confers the affinity of EC-SOD for heparin and heparan sulfate. An analysis of the amino acid sequence homologies with CuZn SODs from various species indicates that the EC-SODs may have evolved from the CuZn SODs before the evolution of fungi and plants.

  13. Comparison of the degree of homology of DNA and quantity of repeated sequences in an intact plant and cell structure

    International Nuclear Information System (INIS)

    Solov'yan, V.T.; Kunaleh, V.A.; Shumnyl, V.K.; Vershinin, A.V.

    1986-01-01

    This paper attempts to assess the quantity of repeated sequences and degree of homology of DNA in the intact plant and two lines of callus tissue of Rauwolfia serpentina Benth maintained for 20 years, which differ among themselves in the level of biosynthesis of the pharmacologically valuable alkaloid ajmaline. The tritium-labeled repeats of plants and calli were used in direct and reverse hybridization on nitrocellulose filters. Hybridization of ³H-labeled repeats with phage 17 DNA was used as control. The radioactivity of filters after washing was measured in a liquid scintillation counter.

  14. The lytic origin of herpesvirus papio is highly homologous to Epstein-Barr virus ori-Lyt: evolutionary conservation of transcriptional activation and replication signals.

    Science.gov (United States)

    Ryon, J J; Fixman, E D; Houchens, C; Zong, J; Lieberman, P M; Chang, Y N; Hayward, G S; Hayward, S D

    1993-01-01

    Herpesvirus papio (HVP) is a B-lymphotropic baboon virus with an estimated 40% homology to Epstein-Barr virus (EBV). We have cloned and sequenced ori-Lyt of herpesvirus papio and found a striking degree of nucleotide homology (89%) with ori-Lyt of EBV. Transcriptional elements form an integral part of EBV ori-Lyt. The promoter and enhancer domains of EBV ori-Lyt are conserved in herpesvirus papio. The EBV ori-Lyt promoter contains four binding sites for the EBV lytic cycle transactivator Zta, and the enhancer includes one Zta and two Rta response elements. All five of the Zta response elements and one of the Rta motifs are conserved in HVP ori-Lyt, and the HVP DS-L leftward promoter and the enhancer were activated in transient transfection assays by the EBV Zta and Rta transactivators. The EBV ori-Lyt enhancer contains a palindromic sequence, GGTCAGCTGACC, centered on a PvuII restriction site. This sequence, with a single base change, is also present in the HVP ori-Lyt enhancer. DNase I footprinting demonstrated that the PvuII sequence was bound by a protein present in a Raji nuclear extract. Mobility shift and competition assays using oligonucleotide probes identified this sequence as a binding site for the cellular transcription factor MLTF. Mutagenesis of the binding site indicated that MLTF contributes significantly to the constitutive activity of the ori-Lyt enhancer. The high degree of conservation of cis-acting signal sequences in HVP ori-Lyt was further emphasized by the finding that an HVP ori-Lyt-containing plasmid was replicated in Vero cells by a set of cotransfected EBV replication genes. The central domain of EBV ori-Lyt contains two related AT-rich palindromes, one of which is partially duplicated in the HVP sequence. The AT-rich palindromes are functionally important cis-acting motifs. Deletion of these palindromes severely diminished replication of an ori-Lyt target plasmid. PMID:8389916
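    The palindrome mentioned in the abstract above can be checked directly: a DNA sequence is palindromic in the molecular-biology sense when it equals its own reverse complement. The short sketch below tests the quoted 12-mer and locates the PvuII recognition sequence (CAGCTG) within it. The helper names are illustrative and not taken from the paper.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same on both strands (5'->3')."""
    return seq.upper() == reverse_complement(seq)


enhancer_motif = "GGTCAGCTGACC"  # sequence quoted in the abstract
pvuii_site = "CAGCTG"            # PvuII recognition sequence

print(is_palindromic(enhancer_motif))  # True

# Flanking lengths on each side of the PvuII site within the motif.
left = enhancer_motif.find(pvuii_site)
right = len(enhancer_motif) - left - len(pvuii_site)
print(left, right)  # 3 3, i.e. the site sits at the center of the 12-mer
```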

  15. Cloning and characterization of the ddc homolog encoding L-2,4-diaminobutyrate decarboxylase in Enterobacter aerogenes.

    Science.gov (United States)

    Yamamoto, S; Mutoh, N; Tsuzuki, D; Ikai, H; Nakao, H; Shinoda, S; Narimatsu, S; Miyoshi, S I

    2000-05-01

    L-2,4-diaminobutyrate decarboxylase (DABA DC) catalyzes the formation of 1,3-diaminopropane (DAP) from DABA. In the present study, the ddc gene encoding DABA DC from Enterobacter aerogenes ATCC 13048 was cloned and characterized. Determination of the nucleotide sequence revealed an open reading frame of 1470 bp encoding a 53659-Da protein of 490 amino acids, whose deduced NH2-terminal sequence was identical to that of purified DABA DC from E. aerogenes. The deduced amino acid sequence was highly similar to those of Acinetobacter baumannii and Haemophilus influenzae DABA DCs encoded by the ddc genes. The lysine-307 of the E. aerogenes DABA DC was identified as the pyridoxal 5'-phosphate binding residue by site-directed mutagenesis. Furthermore, PCR analysis revealed the distribution of E. aerogenes ddc homologs in some other species of Enterobacteriaceae. Such a relatively wide occurrence of the ddc homologs implies biological significance of DABA DC and its product DAP.

  16. Gene Discovery through Genomic Sequencing of Brucella abortus

    OpenAIRE

    Sánchez, Daniel O.; Zandomeni, Ruben O.; Cravero, Silvio; Verdún, Ramiro E.; Pierrou, Ester; Faccio, Paula; Diaz, Gabriela; Lanzavecchia, Silvia; Agüero, Fernán; Frasch, Alberto C. C.; Andersson, Siv G. E.; Rossetti, Osvaldo L.; Grau, Oscar; Ugalde, Rodolfo A.

    2001-01-01

    Brucella abortus is the etiological agent of brucellosis, a disease that affects bovines and humans. We generated DNA random sequences from the genome of B. abortus strain 2308 in order to characterize molecular targets that might be useful for developing immunological or chemotherapeutic strategies against this pathogen. The partial sequencing of 1,899 clones allowed the identification of 1,199 genomic sequence surveys (GSSs) with high homology (BLAST expect value < 10⁻⁵) to sequences deposit...

  17. Sequence of human protamine 2 cDNA

    Energy Technology Data Exchange (ETDEWEB)

    Domenjoud, L; Fronia, C; Uhde, F; Engel, W [Universitaet Goettingen (West Germany)]

    1988-08-11

    The authors report the cloning and sequencing of a cDNA clone for human protamine 2 (hp2), isolated from a human testis cDNA library cloned in the vector λgt11. A 66mer oligonucleotide that corresponds to an amino acid sequence which is highly conserved between hp2 and mouse protamine 2 (mp2) served as hybridization probe. The homology between the amino acid sequence deduced from our cDNA and the published amino acid sequence for hp2 is 100%.

  18. Therapeutic Potential of a Scorpion Venom-Derived Antimicrobial Peptide and Its Homologs Against Antibiotic-Resistant Gram-Positive Bacteria

    Directory of Open Access Journals (Sweden)

    Gaomin Liu

    2018-05-01

    Full Text Available The alarming rise in the prevalence of antibiotic resistance among pathogenic bacteria poses a unique challenge for the development of effective therapeutic agents. Antimicrobial peptides (AMPs) have attracted a great deal of attention as a possible solution to the increasing problem of antibiotic-resistant bacteria. Marcin-18 was identified from the scorpion Mesobuthus martensii at both DNA and protein levels. The genomic sequence revealed that the marcin-18 coding gene contains a phase-I intron with a GT-AG splice junction located in the DNA region encoding the N-terminal part of signal peptide. The peptide marcin-18 was also isolated from scorpion venom. A protein sequence homology search revealed that marcin-18 shares extremely high sequence identity to the AMPs meucin-18 and megicin-18. In vitro, chemically synthetic marcin-18 and its homologs (meucin-18 and megicin-18) showed highly potent inhibitory activity against Gram-positive bacteria, including some clinical antibiotic-resistant strains. Importantly, in a mouse acute peritonitis model, these peptides significantly decreased the bacterial load in ascites and rescued nearly all mice heavily infected with clinical methicillin-resistant Staphylococcus aureus from lethal bacteremia. Peptides exerted antimicrobial activity via a bactericidal mechanism and killed bacteria through membrane disruption. Taken together, marcin-18 and its homologs have potential for development as therapeutic agents for treating antibiotic-resistant, Gram-positive bacterial infections.

  19. Sequence homology: A poor predictive value for profilins cross-reactivity

    Directory of Open Access Journals (Sweden)

    Pazouki Nazanin

    2005-09-01

    Full Text Available Summary. Background: Profilins are highly cross-reactive allergens which bind IgE antibodies of almost 20% of plant-allergic patients. This study is aimed at investigating cross-reactivity of melon profilin with other plant profilins and the role of the linear and conformational epitopes in human IgE cross-reactivity. Methods: Seventeen patients with melon allergy were selected based on clinical history and a positive skin prick test to melon extract. Melon profilin has been cloned and expressed in E. coli. The IgE binding and cross-reactivity of the recombinant profilin were measured by ELISA and inhibition ELISA. The amino acid sequence of melon profilin was compared with other profilin sequences. A combination of chemical cleavage and immunoblotting techniques was used to define the role of conformational and linear epitopes in IgE binding. Comparative modeling was used to construct three-dimensional models of profilins and to assess the theoretical impact of amino acid differences on conformational structure. Results: Profilin was identified as a major IgE-binding component of melon. Alignment of amino acid sequences of melon profilin with other profilins showed the most identity with watermelon profilin. This melon profilin showed substantial cross-reactivity with the tomato, peach, grape and Cynodon dactylon (Bermuda grass) pollen profilins. Cantaloupe, watermelon, banana and Poa pratensis (Kentucky blue grass) displayed no notable inhibition. Our experiments also indicated that human IgE reacts only with complete melon profilin. Immunoblotting analysis with rabbit polyclonal antibody shows the reaction of the antibody to the fragmented and complete melon profilin. Although the well-known linear epitopes of profilins were identical in melon and watermelon, comparison of three-dimensional models of watermelon and melon profilins indicated amino acid differences influence the electric potential and accessibility of the solvent-accessible surface of

  20. Phylogenetic incongruence in E. coli O104: understanding the evolutionary relationships of emerging pathogens in the face of homologous recombination.

    Directory of Open Access Journals (Sweden)

    Weilong Hao

    Full Text Available Escherichia coli O104:H4 was identified as an emerging pathogen during the spring and summer of 2011 and was responsible for a widespread outbreak that resulted in the deaths of 50 people and sickened over 4075. Traditional phenotypic and genotypic assays, such as serotyping, pulsed field gel electrophoresis (PFGE), and multilocus sequence typing (MLST), permit identification and classification of bacterial pathogens, but cannot accurately resolve relationships among genotypically similar but pathotypically different isolates. To understand the evolutionary origins of E. coli O104:H4, we sequenced two strains isolated in Ontario, Canada. One was epidemiologically linked to the 2011 outbreak, and the second, unrelated isolate, was obtained in 2010. MLST analysis indicated that both isolates are of the same sequence type (ST678), but whole-genome sequencing revealed differences in chromosomal and plasmid content. Through comprehensive phylogenetic analysis of five O104:H4 ST678 genomes, we identified 167 genes in three gene clusters that have undergone homologous recombination with distantly related E. coli strains. These recombination events have resulted in unexpectedly high sequence diversity within the same sequence type. Failure to recognize or adjust for homologous recombination can result in phylogenetic incongruence. Understanding the extent of homologous recombination among different strains of the same sequence type may explain the pathotypic differences between the ON2010 and ON2011 strains and help shed new light on the emergence of this new pathogen.

  1. MolIDE: a homology modeling framework you can click with.

    Science.gov (United States)

    Canutescu, Adrian A; Dunbrack, Roland L

    2005-06-15

    Molecular Integrated Development Environment (MolIDE) is an integrated application designed to provide homology modeling tools and protocols under a uniform, user-friendly graphical interface. Its main purpose is to combine the most frequent modeling steps in a semi-automatic, interactive way, guiding the user from the target protein sequence to the final three-dimensional protein structure. The typical basic homology modeling process is composed of building sequence profiles of the target sequence family, secondary structure prediction, sequence alignment with PDB structures, assisted alignment editing, side-chain prediction and loop building. All of these steps are available through a graphical user interface. MolIDE's user-friendly and streamlined interactive modeling protocol allows the user to focus on the important modeling questions, hiding from the user the raw data generation and conversion steps. MolIDE was designed from the ground up as an open-source, cross-platform, extensible framework. This allows developers to integrate additional third-party programs into MolIDE. Availability: http://dunbrack.fccc.edu/molide/molide.php Contact: rl_dunbrack@fccc.edu.

  2. Investigating homology between proteins using energetic profiles.

    Science.gov (United States)

    Wrabl, James O; Hilser, Vincent J

    2010-03-26

    Accumulated experimental observations demonstrate that protein stability is often preserved upon conservative point mutation. In contrast, less is known about the effects of large sequence or structure changes on the stability of a particular fold. Almost completely unknown is the degree to which stability of different regions of a protein is generally preserved throughout evolution. In this work, these questions are addressed through thermodynamic analysis of a large representative sample of protein fold space based on remote, yet accepted, homology. More than 3,000 proteins were computationally analyzed using the structural-thermodynamic algorithm COREX/BEST. Estimated position-specific stability (i.e., local Gibbs free energy of folding) and its component enthalpy and entropy were quantitatively compared between all proteins in the sample according to all-vs.-all pairwise structural alignment. It was discovered that the local stabilities of homologous pairs were significantly more correlated than those of non-homologous pairs, indicating that local stability was indeed generally conserved throughout evolution. However, the position-specific enthalpy and entropy underlying stability were less correlated, suggesting that the overall regional stability of a protein was more important than the thermodynamic mechanism utilized to achieve that stability. Finally, two different types of statistically exceptional evolutionary structure-thermodynamic relationships were noted. First, many homologous proteins contained regions of similar thermodynamics despite localized structure change, suggesting a thermodynamic mechanism enabling evolutionary fold change. Second, some homologous proteins with extremely similar structures nonetheless exhibited different local stabilities, a phenomenon previously observed experimentally in this laboratory. These two observations, in conjunction with the principal conclusion that homologous proteins generally conserved local stability, may

  3. Investigating homology between proteins using energetic profiles.

    Directory of Open Access Journals (Sweden)

    James O Wrabl

    2010-03-01

    Full Text Available Accumulated experimental observations demonstrate that protein stability is often preserved upon conservative point mutation. In contrast, less is known about the effects of large sequence or structure changes on the stability of a particular fold. Almost completely unknown is the degree to which stability of different regions of a protein is generally preserved throughout evolution. In this work, these questions are addressed through thermodynamic analysis of a large representative sample of protein fold space based on remote, yet accepted, homology. More than 3,000 proteins were computationally analyzed using the structural-thermodynamic algorithm COREX/BEST. Estimated position-specific stability (i.e., local Gibbs free energy of folding) and its component enthalpy and entropy were quantitatively compared between all proteins in the sample according to all-vs.-all pairwise structural alignment. It was discovered that the local stabilities of homologous pairs were significantly more correlated than those of non-homologous pairs, indicating that local stability was indeed generally conserved throughout evolution. However, the position-specific enthalpy and entropy underlying stability were less correlated, suggesting that the overall regional stability of a protein was more important than the thermodynamic mechanism utilized to achieve that stability. Finally, two different types of statistically exceptional evolutionary structure-thermodynamic relationships were noted. First, many homologous proteins contained regions of similar thermodynamics despite localized structure change, suggesting a thermodynamic mechanism enabling evolutionary fold change. Second, some homologous proteins with extremely similar structures nonetheless exhibited different local stabilities, a phenomenon previously observed experimentally in this laboratory. These two observations, in conjunction with the principal conclusion that homologous proteins generally conserved

  4. GPCR-SSFE: A comprehensive database of G-protein-coupled receptor template predictions and homology models

    Directory of Open Access Journals (Sweden)

    Kreuchwig Annika

    2011-05-01

    Full Text Available Abstract. Background: G protein-coupled receptors (GPCRs) transduce a wide variety of extracellular signals to within the cell and therefore have a key role in regulating cell activity and physiological function. GPCR malfunction is responsible for a wide range of diseases including cancer, diabetes and hyperthyroidism and a large proportion of drugs on the market target these receptors. The three dimensional structure of GPCRs is important for elucidating the molecular mechanisms underlying these diseases and for performing structure-based drug design. Although structural data are restricted to only a handful of GPCRs, homology models can be used as a proxy for those receptors not having crystal structures. However, many researchers working on GPCRs are not experienced homology modellers and are therefore unable to benefit from the information that can be gleaned from such three-dimensional models. Here, we present a comprehensive database called the GPCR-SSFE, which provides initial homology models of the transmembrane helices for a large variety of family A GPCRs. Description: Extending on our previous theoretical work, we have developed an automated pipeline for GPCR homology modelling and applied it to a large set of family A GPCR sequences. Our pipeline is a fragment-based approach that exploits available family A crystal structures. The GPCR-SSFE database stores the template predictions, sequence alignments, identified sequence and structure motifs and homology models for 5025 family A GPCRs. Users are able to browse the GPCR dataset according to their pharmacological classification or search for results using a UniProt entry name. It is also possible for a user to submit a GPCR sequence that is not contained in the database for analysis and homology model building. The models can be viewed using a Jmol applet and are also available for download along with the alignments. Conclusions: The data provided by GPCR-SSFE are useful for investigating

  5. GPCR-SSFE: a comprehensive database of G-protein-coupled receptor template predictions and homology models.

    Science.gov (United States)

    Worth, Catherine L; Kreuchwig, Annika; Kleinau, Gunnar; Krause, Gerd

    2011-05-23

    G protein-coupled receptors (GPCRs) transduce a wide variety of extracellular signals to within the cell and therefore have a key role in regulating cell activity and physiological function. GPCR malfunction is responsible for a wide range of diseases including cancer, diabetes and hyperthyroidism and a large proportion of drugs on the market target these receptors. The three dimensional structure of GPCRs is important for elucidating the molecular mechanisms underlying these diseases and for performing structure-based drug design. Although structural data are restricted to only a handful of GPCRs, homology models can be used as a proxy for those receptors not having crystal structures. However, many researchers working on GPCRs are not experienced homology modellers and are therefore unable to benefit from the information that can be gleaned from such three-dimensional models. Here, we present a comprehensive database called the GPCR-SSFE, which provides initial homology models of the transmembrane helices for a large variety of family A GPCRs. Extending on our previous theoretical work, we have developed an automated pipeline for GPCR homology modelling and applied it to a large set of family A GPCR sequences. Our pipeline is a fragment-based approach that exploits available family A crystal structures. The GPCR-SSFE database stores the template predictions, sequence alignments, identified sequence and structure motifs and homology models for 5025 family A GPCRs. Users are able to browse the GPCR dataset according to their pharmacological classification or search for results using a UniProt entry name. It is also possible for a user to submit a GPCR sequence that is not contained in the database for analysis and homology model building. The models can be viewed using a Jmol applet and are also available for download along with the alignments. The data provided by GPCR-SSFE are useful for investigating general and detailed sequence-structure-function relationships

  6. Universal sequence map (USM) of arbitrary discrete sequences

    Directory of Open Access Journals (Sweden)

    Almeida Jonas S

    2002-02-01

    Full Text Available Abstract. Background: For over a decade the idea of representing biological sequences in a continuous coordinate space has maintained its appeal but not been fully realized. The basic idea is that any sequence of symbols may define trajectories in the continuous space conserving all its statistical properties. Ideally, such a representation would allow scale-independent sequence analysis – without the context of fixed memory length. A simple example would consist of being able to infer the homology between two sequences solely by comparing the coordinates of any two homologous units. Results: We have successfully identified such an iterative function for bijective mapping ψ of discrete sequences into objects of continuous state space that enable scale-independent sequence analysis. The technique, named Universal Sequence Mapping (USM), is applicable to sequences with an arbitrary length and arbitrary number of unique units and generates a representation where map distance estimates sequence similarity. The novel USM procedure is based on earlier work by these and other authors on the properties of Chaos Game Representation (CGR). The latter enables the representation of 4 unit type sequences (like DNA) as an order-free Markov Chain transition table. The properties of USM are illustrated with test data and can be verified for other data by using the accompanying web-based tool: http://bioinformatics.musc.edu/~jonas/usm/. Conclusions: USM is shown to enable a statistical mechanics approach to sequence analysis. The scale-independent representation frees sequence analysis from the need to assume a memory length in the investigation of syntactic rules.
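
    The iterative mapping that CGR is built on can be illustrated with a short sketch. This is not the USM procedure itself, only the classic forward CGR iteration for a four-letter alphabet: each new point lies halfway between the previous point and the corner assigned to the current symbol. The corner assignment, the starting point and the function name `cgr_coordinates` are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Assumed corner assignment on the unit square; any fixed assignment of the
# four symbols to the four corners behaves the same way.
CORNERS: Dict[str, Tuple[float, float]] = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_coordinates(seq: str,
                    start: Tuple[float, float] = (0.5, 0.5)
                    ) -> List[Tuple[float, float]]:
    """Return one CGR point per symbol of `seq` (hypothetical helper)."""
    x, y = start
    points = []
    for symbol in seq.upper():
        cx, cy = CORNERS[symbol]
        # Move halfway from the current point towards the symbol's corner.
        x = (x + cx) / 2.0
        y = (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    # Two sequences sharing the suffix "AG" end up close together: each shared
    # trailing symbol halves the distance between their final points.
    print(cgr_coordinates("CCAG")[-1])
    print(cgr_coordinates("TTAG")[-1])
```

    In this sketch the distance between the final points of two sequences shrinks by a factor of two for every trailing symbol they share, which is one simple way to see how a map distance can reflect sequence similarity.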

  7. Conservation of the glycoprotein B homologs of the Kaposi’s sarcoma-associated herpesvirus (KSHV/HHV8) and Old World primate rhadinoviruses of chimpanzees and macaques

    Science.gov (United States)

    Bruce, A. Gregory; Horst, Jeremy A.; Rose, Timothy M.

    2016-01-01

    The envelope-associated glycoprotein B (gB) is highly conserved within the Herpesviridae and plays a critical role in viral entry. We analyzed the evolutionary conservation of sequence and structural motifs within the Kaposi’s sarcoma-associated herpesvirus (KSHV) gB and homologs of Old World primate rhadinoviruses belonging to the distinct RV1 and RV2 rhadinovirus lineages. In addition to gB homologs of rhadinoviruses infecting the pig-tailed and rhesus macaques, we cloned and sequenced gB homologs of RV1 and RV2 rhadinoviruses infecting chimpanzees. A structural model of the KSHV gB was determined, and functional motifs and sequence variants were mapped to the model structure. Conserved domains and motifs were identified, including an “RGD” motif that plays a critical role in KSHV binding and entry through the cellular integrin αVβ3. The RGD motif was only detected in RV1 rhadinoviruses suggesting an important difference in cell tropism between the two rhadinovirus lineages. PMID:27070755

  8. MetaGO: Predicting Gene Ontology of Non-homologous Proteins Through Low-Resolution Protein Structure Prediction and Protein-Protein Network Mapping.

    Science.gov (United States)

    Zhang, Chengxin; Zheng, Wei; Freddolino, Peter L; Zhang, Yang

    2018-03-10

    Homology-based transferal remains the major approach to computational protein function annotations, but it becomes increasingly unreliable when the sequence identity between query and template decreases below 30%. We propose a novel pipeline, MetaGO, to deduce Gene Ontology attributes of proteins by combining sequence homology-based annotation with low-resolution structure prediction and comparison, and partner's homology-based protein-protein network mapping. The pipeline was tested on a large-scale set of 1000 non-redundant proteins from the CAFA3 experiment. Under the stringent benchmark conditions where templates with >30% sequence identity to the query are excluded, MetaGO achieves average F-measures of 0.487, 0.408, and 0.598, for Molecular Function, Biological Process, and Cellular Component, respectively, which are significantly higher than those achieved by other state-of-the-art function annotation methods. Detailed data analysis shows that the major advantage of MetaGO lies in the new functional homolog detections from partner's homology-based network mapping and structure-based local and global structure alignments, the confidence scores of which can be optimally combined through logistic regression. These data demonstrate the power of using a hybrid model incorporating protein structure and interaction networks to deduce new functional insights beyond traditional sequence homology-based referrals, especially for proteins that lack homologous function templates. The MetaGO pipeline is available at http://zhanglab.ccmb.med.umich.edu/MetaGO/.

  9. Evaluating the efficacy of a structure-derived amino acid substitution matrix in detecting protein homologs by BLAST and PSI-BLAST

    OpenAIRE

    Goonesekere, Nalin CW

    2009-01-01

    Nalin CW Goonesekere, Department of Chemistry and Biochemistry, University of Northern Iowa, Cedar Falls, IA, USA. Abstract: The large numbers of protein sequences generated by whole genome sequencing projects require rapid and accurate methods of annotation. The detection of homology through computational sequence analysis is a powerful tool in determining the complex evolutionary and functional relationships that exist between proteins. Homology search algorithms employ amino acid substitution ...

  10. Coding sequence of human rho cDNAs clone 6 and clone 9

    Energy Technology Data Exchange (ETDEWEB)

    Chardin, P; Madaule, P; Tavitian, A

    1988-03-25

    The authors have isolated human cDNAs including the complete coding sequence for two rho proteins corresponding to the incomplete isolates previously described as clone 6 and clone 9. The deduced a.a. sequences, when compared to the a.a. sequence deduced from clone 12 cDNA, show that there are at least three highly homologous rho genes in humans. They suggest that clone 12 be named rhoA, clone 6 rhoB and clone 9 rhoC. RhoA, B and C proteins display approx. 30% a.a. identity with ras proteins, mainly clustered in four highly homologous internal regions corresponding to the GTP binding site; however, at least one significant difference is found: the 3 rho proteins have an alanine in the position corresponding to ras glycine 13, suggesting that rho and ras proteins might have slightly different biochemical properties.

  11. Third-Generation Sequencing and Analysis of Four Complete Pig Liver Esterase Gene Sequences in Clones Identified by Screening BAC Library.

    Science.gov (United States)

    Zhou, Qiongqiong; Sun, Wenjuan; Liu, Xiyan; Wang, Xiliang; Xiao, Yuncai; Bi, Dingren; Yin, Jingdong; Shi, Deshi

    2016-01-01

    Pig liver carboxylesterase (PLE) gene sequences in GenBank are incomplete, which has led to difficulties in studying the genetic structure and regulation mechanisms of gene expression of PLE family genes. The aim of this study was to obtain and analyse complete gene sequences of the PLE family by screening a Rongchang pig BAC library and by third-generation PacBio gene sequencing. After a number of existing incomplete PLE isoform gene sequences were analysed, primers were designed based on conserved regions in PLE exons, and the whole pig genome was used as a template for polymerase chain reaction (PCR) amplification. Specific primers were then selected based on the PCR amplification results. A three-step PCR screening method was used to identify PLE-positive clones by screening a Rongchang pig BAC library, and PacBio third-generation sequencing was performed. BLAST comparisons and other bioinformatics methods were applied for sequence analysis. Five PLE-positive BAC clones, designated BAC-10, BAC-70, BAC-75, BAC-119 and BAC-206, were identified. Sequence analysis yielded the complete sequences of four PLE genes, PLE1, PLE-B9, PLE-C4, and PLE-G2. Complete PLE gene sequences were defined as those containing regulatory sequences, exons, and introns. It was found not only that the PLE exon sequences of the four genes show a high degree of homology, but also that the intron sequences were highly similar. Additionally, the regulatory region of the genes contained two 720 bp reverse-complement sequences that may have an important function in the regulation of PLE gene expression. This is the first report to confirm the complete sequences of four PLE genes. In addition, the study demonstrates that each PLE isoform is encoded by a single gene and that the various genes exhibit a high degree of sequence homology, suggesting that the PLE family evolved from a single ancestral gene. Obtaining the complete sequences of these PLE genes provides the necessary foundation for

  12. Nucleotide sequence of the hexA gene for DNA mismatch repair in Streptococcus pneumoniae and homology of hexA to mutS of Escherichia coli and Salmonella typhimurium

    International Nuclear Information System (INIS)

    Priebe, S.D.; Hadi, S.M.; Greenberg, B.; Lacks, S.A.

    1988-01-01

    The Hex system of heteroduplex DNA base mismatch repair operates in Streptococcus pneumoniae after transformation and replication to correct donor and nascent DNA strands, respectively. A functionally similar system, called Mut, operates in Escherichia coli and Salmonella typhimurium. The nucleotide sequence of a 3.8-kilobase segment from the S. pneumoniae chromosome that includes the 2.7-kilobase hexA gene was determined. Chromosomal DNA used as donor to measure Hex phenotype was irradiated with UV light. An open reading frame that could encode a 17-kilodalton polypeptide (OrfC) was located just upstream of the gene encoding a polypeptide of 95 kilodaltons corresponding to HexA. Shine-Dalgarno sequences and putative promoters were identified upstream of each protein start site. Insertion mutations showed that only HexA functioned in mismatch repair and that the promoter for hexA transcription was located within the OrfC-coding region. The HexA polypeptide contains a consensus sequence for ATP- or GTP-binding sites in proteins. Comparison of the entire HexA protein sequence to that of MutS of S. typhimurium showed the proteins to be homologous, inasmuch as 36% of their amino acid residues were identical. This homology indicates that the Hex and Mut systems of mismatch repair evolved from an ancestor common to the gram-positive streptococci and the gram-negative enterobacteria. It is the first direct evidence linking the two systems.
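
    Figures such as "36% of their amino acid residues were identical" are computed from a pairwise alignment. The sketch below shows one common way to do this for two already-aligned strings; the function name, the gap character `-`, and the choice of denominator (columns where neither sequence has a gap) are assumptions for illustration and not necessarily the convention used in the record above. The toy sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free alignment columns holding identical residues.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length (hypothetical helper for illustration).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # columns where those residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Invented toy alignment: 4 identical residues over 5 gap-free columns.
    print(percent_identity("MKT-LV", "MRTALV"))
```

    Other denominators are in use (full alignment length, or the length of the shorter sequence), so reported identities are only comparable when the convention is stated.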

  13. Fast and accurate taxonomic assignments of metagenomic sequences using MetaBin.

    Directory of Open Access Journals (Sweden)

    Vineet K Sharma

    Full Text Available Taxonomic assignment of sequence reads is a challenging task in metagenomic data analysis, for which the present methods mainly use either composition- or homology-based approaches. Though the homology-based methods are more sensitive and accurate, they suffer primarily due to the time needed to generate the Blast alignments. We developed the MetaBin program and web server for better homology-based taxonomic assignments using an ORF-based approach. By implementing Blat as the faster alignment method in place of Blastx, the analysis time has been reduced by severalfold. It is benchmarked using both simulated and real metagenomic datasets, and can be used for both single and paired-end sequence reads of varying lengths (≥45 bp). To our knowledge, MetaBin is the only available program that can be used for the taxonomic binning of short reads (<100 bp) with high accuracy and high sensitivity using a homology-based approach. The MetaBin web server can be used to carry out the taxonomic analysis, by either submitting reads or Blastx output. It provides several options including construction of taxonomic trees, creation of a composition chart, functional analysis using COGs, and comparative analysis of multiple metagenomic datasets. MetaBin web server and a standalone version for high-throughput analysis are available freely at http://metabin.riken.jp/.

  14. Statistical alignment: computational properties, homology testing and goodness-of-fit

    DEFF Research Database (Denmark)

    Hein, J; Wiuf, Carsten; Møller, Martin

    2000-01-01

    The model of insertions and deletions in biological sequences, first formulated by Thorne, Kishino, and Felsenstein in 1991 (the TKF91 model), provides a basis for performing alignment within a statistical framework. Here we investigate this model. Firstly, we show how to accelerate the statistical...... alignment algorithms several orders of magnitude. The main innovations are to confine likelihood calculations to a band close to the similarity based alignment, to get good initial guesses of the evolutionary parameters and to apply an efficient numerical optimisation algorithm for finding the maximum...... analysis. Secondly, we propose a new homology test based on this model, where homology means that an ancestor to a sequence pair can be found finitely far back in time. This test has statistical advantages relative to the traditional shuffle test for proteins. Finally, we describe a goodness-of-fit test...

  15. Comparative analysis of the prion protein gene sequences in African lion.

    Science.gov (United States)

    Wu, Chang-De; Pang, Wan-Yong; Zhao, De-Ming

    2006-10-01

    The prion protein gene of the African lion (Panthera leo) was cloned for the first time and screened for polymorphisms. The results suggest that the prion protein gene of eight African lions is highly homogeneous. The amino acid sequences of the prion protein (PrP) of all samples tested were identical. Four single nucleotide polymorphisms (C42T, C81A, C420T, T600C) in the prion protein gene (Prnp) of African lion were found, but no amino acid substitutions. Sequence analysis showed that the highest homology was observed with Felis catus (GenBank accession AF003087; 96.7%) and with sheep (GenBank accession M31313.1; 96.2%). With respect to all the mammalian prion protein sequences compared, the African lion prion protein sequence has three amino acid substitutions. The homology might in turn affect the potential intermolecular interactions critical for cross-species transmission of prion disease.

  16. Productive homologous and non-homologous recombination of hepatitis C virus in cell culture

    DEFF Research Database (Denmark)

    Scheel, Troels K H; Galli, Andrea; Li, Yi-Ping

    2013-01-01

    . In addition, recombination is an important regulatory mechanism of cytopathogenicity for the related pestiviruses. Here we describe recombination of HCV RNA in cell culture leading to production of infectious virus. Initially, hepatoma cells were co-transfected with a replicating JFH1ΔE1E2 genome (genotype 2a......) lacking functional envelope genes and strain J6 (2a), which has functional envelope genes but does not replicate in culture. After an initial decrease in the number of HCV positive cells, infection spread after 13-36 days. Sequencing of recovered viruses revealed non-homologous recombinants with J6...

  17. Sequencing BPS spectra

    Energy Technology Data Exchange (ETDEWEB)

    Gukov, Sergei [Walter Burke Institute for Theoretical Physics, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125 (United States); Max-Planck-Institut für Mathematik, Vivatsgasse 7, D-53111 Bonn (Germany)]; Nawata, Satoshi [Walter Burke Institute for Theoretical Physics, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125 (United States); Centre for Quantum Geometry of Moduli Spaces, University of Aarhus, Nordre Ringgade 1, DK-8000 (Denmark)]; Saberi, Ingmar [Walter Burke Institute for Theoretical Physics, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125 (United States)]; Stošić, Marko [CAMGSD, Departamento de Matemática, Instituto Superior Técnico, Av. Rovisco Pais, 1049-001 Lisbon (Portugal); Mathematical Institute SANU, Knez Mihajlova 36, 11000 Belgrade (Serbia)]; Sułkowski, Piotr [Walter Burke Institute for Theoretical Physics, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125 (United States); Faculty of Physics, University of Warsaw, ul. Pasteura 5, 02-093 Warsaw (Poland)]

    2016-03-02

    This paper provides both a detailed study of color-dependence of link homologies, as realized in physics as certain spaces of BPS states, and a broad study of the behavior of BPS states in general. We consider how the spectrum of BPS states varies as continuous parameters of a theory are perturbed. This question can be posed in a wide variety of physical contexts, and we answer it by proposing that the relationship between unperturbed and perturbed BPS spectra is described by a spectral sequence. These general considerations unify previous applications of spectral sequence techniques to physics, and explain from a physical standpoint the appearance of many spectral sequences relating various link homology theories to one another. We also study structural properties of colored HOMFLY homology for links and evaluate Poincaré polynomials in numerous examples. Among these structural properties is a novel “sliding” property, which can be explained by using (refined) modular S-matrix. This leads to the identification of modular transformations in Chern-Simons theory and 3d N=2 theory via the 3d/3d correspondence. Lastly, we introduce the notion of associated varieties as classical limits of recursion relations of colored superpolynomials of links, and study their properties.

  18. Sequencing BPS spectra

    International Nuclear Information System (INIS)

    Gukov, Sergei; Nawata, Satoshi; Saberi, Ingmar; Stošić, Marko; Sułkowski, Piotr

    2016-01-01

    This paper provides both a detailed study of color-dependence of link homologies, as realized in physics as certain spaces of BPS states, and a broad study of the behavior of BPS states in general. We consider how the spectrum of BPS states varies as continuous parameters of a theory are perturbed. This question can be posed in a wide variety of physical contexts, and we answer it by proposing that the relationship between unperturbed and perturbed BPS spectra is described by a spectral sequence. These general considerations unify previous applications of spectral sequence techniques to physics, and explain from a physical standpoint the appearance of many spectral sequences relating various link homology theories to one another. We also study structural properties of colored HOMFLY homology for links and evaluate Poincaré polynomials in numerous examples. Among these structural properties is a novel “sliding” property, which can be explained by using (refined) modular S-matrix. This leads to the identification of modular transformations in Chern-Simons theory and 3d N=2 theory via the 3d/3d correspondence. Lastly, we introduce the notion of associated varieties as classical limits of recursion relations of colored superpolynomials of links, and study their properties.

  19. Homotopic Chain Maps Have Equal s-Homology and d-Homology

    Directory of Open Access Journals (Sweden)

    M. Z. Kazemi-Baneh

    2016-01-01

    Full Text Available The homotopy of chain maps on preabelian categories is investigated and the equality of standard homologies and d-homologies of homotopic chain maps is established. As a special case, if X and Y are of the same homotopy type, then their nth d-homology R-modules are isomorphic, and if X is a contractible space, then its nth d-homology R-modules for n≠0 are trivial.

  20. BLAST and FASTA similarity searching for multiple sequence alignment.

    Science.gov (United States)

    Pearson, William R

    2014-01-01

    BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry (homology). The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look-back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted at distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
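
    The role of expectation values, and the reason a smaller database can improve sensitivity, can be sketched with the standard Karlin-Altschul relation E = K·m·n·exp(−λS), where S is the raw alignment score, m the query length and n the database length. The function names below are hypothetical, the λ and K values are illustrative placeholders (in practice they depend on the scoring matrix and gap penalties), and effective-length corrections applied by real search programs are ignored.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to a normalised bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score: float, query_len: int, db_len: int,
                 lam: float, k: float) -> float:
    """Expected number of chance alignments scoring at least `raw_score`."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


if __name__ == "__main__":
    lam, k = 0.267, 0.041      # illustrative values only (assumption)
    score, query = 120, 300    # invented raw score and query length

    for db_len in (10_000_000, 1_000_000_000):
        e = expect_value(score, query, db_len, lam, k)
        print(f"database of {db_len:>13,} residues: E = {e:.2e}")
```

    Because E grows linearly with database length, the same alignment score is less significant in a larger database, which is consistent with the abstract's advice to search smaller comprehensive databases when looking for distant homologs.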

  1. Isolation of Specific Clones from Nonarrayed BAC Libraries through Homologous Recombination

    Directory of Open Access Journals (Sweden)

    Mikhail Nefedov

    2011-01-01

    Full Text Available We have developed a new approach to screen bacterial artificial chromosome (BAC) libraries by recombination selection. To test this method, we constructed an orangutan BAC library using an E. coli strain (DY380) with temperature-inducible homologous recombination (HR) capability. We amplified one library segment, induced HR at 42 °C to make it recombination proficient, and prepared electrocompetent cells for transformation with a kanamycin cassette to target sequences in the orangutan genome through terminal recombineering homologies. Kanamycin-resistant colonies were tested for the presence of BACs containing the targeted genes by the use of a PCR assay to confirm the presence of the kanamycin insertion. The results indicate that this is an effective approach for screening clones. The advantage of recombination screening is that it avoids the high costs associated with the preparation, screening, and archival storage of arrayed BAC libraries. In addition, the screening can conceivably be combined with genetic engineering to create knockout and reporter constructs for functional studies.

  2. A multidimensional strategy to detect polypharmacological targets in the absence of structural and sequence homology.

    Science.gov (United States)

    Durrant, Jacob D; Amaro, Rommie E; Xie, Lei; Urbaniak, Michael D; Ferguson, Michael A J; Haapalainen, Antti; Chen, Zhijun; Di Guilmi, Anne Marie; Wunder, Frank; Bourne, Philip E; McCammon, J Andrew

    2010-01-22

    Conventional drug design embraces the "one gene, one drug, one disease" philosophy. Polypharmacology, which focuses on multi-target drugs, has emerged as a new paradigm in drug discovery. The rational design of drugs that act via polypharmacological mechanisms can produce compounds that exhibit increased therapeutic potency and against which resistance is less likely to develop. Additionally, identifying multiple protein targets is also critical for side-effect prediction. One third of potential therapeutic compounds fail in clinical trials or are later removed from the market due to unacceptable side effects often caused by off-target binding. In the current work, we introduce a multidimensional strategy for the identification of secondary targets of known small-molecule inhibitors in the absence of global structural and sequence homology with the primary target protein. To demonstrate the utility of the strategy, we identify several targets of 4,5-dihydroxy-3-(1-naphthyldiazenyl)-2,7-naphthalenedisulfonic acid, a known micromolar inhibitor of Trypanosoma brucei RNA editing ligase 1. As it is capable of identifying potential secondary targets, the strategy described here may play a useful role in future efforts to reduce drug side effects and/or to increase polypharmacology.

  3. Homologous SV40 RNA trans-splicing: Special case or prime example of viral RNA trans-splicing?

    Directory of Open Access Journals (Sweden)

    Sushmita Poddar

    2014-06-01

    Full Text Available To date, Simian Virus 40 (SV40) is the only proven example of a virus that recruits the mechanism of RNA trans-splicing to diversify its sequences and gene products. Thereby, two identical viral transcripts are efficiently joined by homologous trans-splicing, triggering the formation of a highly transforming 100 kDa super T antigen. Sequences of other viruses including HIV-1 and the human adenovirus type 5 were reported to be involved in heterologous trans-splicing towards cellular or viral sequences but the meaning of these events remains unclear. We computationally and experimentally investigated molecular features associated with viral RNA trans-splicing and identified a common pattern: viral RNA trans-splicing occurs between strong cryptic or regular viral splice sites and strong regular or cryptic splice sites of the trans-splice partner sequences. The majority of these splice sites are supported by exonic splice enhancers. Splice sites that could compete with the trans-splicing sites for cis-splice reactions are weaker or nonexistent. Finally, all but one of the trans-splice reactions seem to be facilitated by one or more complementary binding domains of 11 to 16 nucleotides in length which, however, occur with a statistical probability close to one for the given length of the involved sequences. The chimeric RNAs generated via heterologous viral RNA trans-splicing either did not lead to fusion proteins or led to proteins of unknown function. Our data suggest that distinct viral RNAs are highly susceptible to trans-splicing and that heterologous viral trans-splicing, unlike homologous SV40 trans-splicing, represents a chance event.

  4. Cloning, Expression, Sequence Analysis and Homology Modeling of the Prolyl Endoprotease from Eurygaster integriceps Puton

    Directory of Open Access Journals (Sweden)

    Ravi Chandra Yandamuri

    2014-10-01

    Full Text Available Eurygaster integriceps Puton, commonly known as sunn pest, is a major pest of wheat in Northern Africa, the Middle East and Eastern Europe. This insect injects a prolyl endoprotease into the wheat, destroying the gluten. The purpose of this study was to clone the full-length cDNA of the sunn pest prolyl endoprotease (spPEP) for expression in E. coli and to compare the amino acid sequence of the enzyme to other known PEPs in both phylogeny and potential tertiary structure. Sequence analysis shows that the 5′ UTR contains several putative binding sites for transcription factors known to be expressed in Drosophila that might be useful targets for inhibition of the enzyme. The spPEP was first identified as a prolyl endoprotease by Darkoh et al., 2010. The enzyme is a unique serine protease of the S9A family by way of its substrate recognition of the gluten proteins, which are greater than 30 kD in size. With a maximum identity of 51% to known PEPs, homology modeling was carried out using SWISS-MODEL: the porcine brain PEP (PDB: 2XWD) was selected from the database of known PEP structures, resulting in a predicted tertiary structure 99% identical to the porcine brain PEP structure. A Km for the recombinant spPEP was determined to be 210 ± 53 µM for the zGly-Pro-pNA substrate in 0.025 M ethanolamine, pH 8.5, containing 0.1 M NaCl at 37 °C with a turnover rate of 172 ± 47 µM Gly-Pro-pNA/s/µM of enzyme.
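
    To make the reported kinetic constants concrete, the sketch below evaluates the Michaelis-Menten rate law v = kcat·[E]·[S] / (Km + [S]). It assumes that the enzyme follows simple Michaelis-Menten kinetics and that the reported turnover rate can be read as kcat in s⁻¹; both are assumptions, as are the function name and the enzyme concentration used in the example. Only the central values of Km and the turnover rate from the abstract are used, not their uncertainties.

```python
def michaelis_menten_rate(substrate_um: float, enzyme_um: float,
                          km_um: float = 210.0,
                          kcat_per_s: float = 172.0) -> float:
    """Initial rate in µM product per second (hypothetical helper).

    Defaults are the central values quoted in the abstract; treating the
    reported turnover rate as kcat is an assumption.
    """
    return kcat_per_s * enzyme_um * substrate_um / (km_um + substrate_um)


if __name__ == "__main__":
    enzyme = 0.01  # µM, invented for illustration
    for s in (21.0, 210.0, 2100.0):
        v = michaelis_menten_rate(s, enzyme)
        print(f"[S] = {s:7.1f} µM -> v = {v:.3f} µM/s")
```

    At a substrate concentration equal to Km the rate is half of the maximum kcat·[E], which is the defining property of Km in this model.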

  5. Genes homologous to glycopeptide resistance vanA are widespread in soil microbial communities

    DEFF Research Database (Denmark)

    Guardabassi, L.; Agersø, Yvonne

    2006-01-01

    -Ala : D-Ala ligase genes unrelated to vanA. In order to enhance detection of vanA-homologous genes, a third PCR step was added using primers targeting vanA in soil Paenibacillus. Sequencing of 25 clones obtained by this method allowed recovery of 23 novel sequences having 86-100% identity with van...

  6. In vivo blunt-end cloning through CRISPR/Cas9-facilitated non-homologous end-joining

    Science.gov (United States)

    Geisinger, Jonathan M.; Turan, Sören; Hernandez, Sophia; Spector, Laura P.; Calos, Michele P.

    2016-01-01

    The CRISPR/Cas9 system facilitates precise DNA modifications by generating RNA-guided blunt-ended double-strand breaks. We demonstrate that guide RNA pairs generate deletions that are repaired with a high level of precision by non-homologous end-joining in mammalian cells. We present a method called knock-in blunt ligation for exploiting these breaks to insert exogenous PCR-generated sequences in a homology-independent manner without loss of additional nucleotides. This method is useful for making precise additions to the genome such as insertions of marker gene cassettes or functional elements, without the need for homology arms. We successfully utilized this method in human and mouse cells to insert fluorescent protein cassettes into various loci, with efficiencies up to 36% in HEK293 cells without selection. We also created versions of Cas9 fused to the FKBP12-L106P destabilization domain in an effort to improve Cas9 performance. Our in vivo blunt-end cloning method and destabilization-domain-fused Cas9 variant increase the repertoire of precision genome engineering approaches. PMID:26762978

  7. Direct Single-Molecule Observation of Mode and Geometry of RecA-Mediated Homology Search.

    Science.gov (United States)

    Lee, Andrew J; Endo, Masayuki; Hobbs, Jamie K; Wälti, Christoph

    2018-01-23

    Genomic integrity, when compromised by accrued DNA lesions, is maintained through efficient repair via homologous recombination. For this process the ubiquitous recombinase A (RecA), and its homologues such as the human Rad51, are of central importance, able to align and exchange homologous sequences within single-stranded and double-stranded DNA in order to swap out defective regions. Here, we directly observe the widely debated mechanism of RecA homology searching at a single-molecule level using high-speed atomic force microscopy (HS-AFM) in combination with tailored DNA origami frames to present the reaction targets in a way suitable for AFM-imaging. We show that RecA nucleoprotein filaments move along DNA substrates via short-distance facilitated diffusions, or slides, interspersed with longer-distance random moves, or hops. Importantly, from the specific interaction geometry, we find that the double-stranded substrate DNA resides in the secondary DNA binding-site within the RecA nucleoprotein filament helical groove during the homology search. This work demonstrates that tailored DNA origami, in conjunction with HS-AFM, can be employed to reveal directly conformational and geometrical information on dynamic protein-DNA interactions which was previously inaccessible at an individual single-molecule level.

  8. The genome BLASTatlas - a GeneWiz extension for visualization of whole-genome homology

    DEFF Research Database (Denmark)

    Hallin, Peter Fischer; Binnewies, Tim Terence; Ussery, David

    2008-01-01

    ://www.cbs.dtu.dk/ws/BLASTatlas), where programming examples are available in Perl. By providing an interoperable method to carry out whole genome visualization of homology, this service offers bioinformaticians as well as biologists an easy-to-adopt workflow that can be directly called from the programming language of the user, hence......The development of fast and inexpensive methods for sequencing bacterial genomes has led to a wealth of data, often with many genomes being sequenced of the same species or closely related organisms. Thus, there is a need for visualization methods that will allow easy comparison of many sequenced...... genomes to a defined reference strain. The BLASTatlas is one such tool that is useful for mapping and visualizing whole genome homology of genes and proteins within a reference strain compared to other strains or species of one or more prokaryotic organisms. We provide examples of BLASTatlases, including...

  9. SAAS: Short Amino Acid Sequence - A Promising Protein Secondary Structure Prediction Method of Single Sequence

    Directory of Open Access Journals (Sweden)

    Zhou Yuan Wu

    2013-07-01

    Full Text Available Statistical methods for predicting protein secondary structure often focus on the frequencies of single amino acids in α-helices, β-sheets, and so on, or on the influence of neighboring amino acids on whether a given amino acid forms a secondary structure. This paper instead treats a short sequence of amino acids (3, 4, 5 or 6 residues) as a single unit and computes the probability that each short sequence forms a given secondary structure. In addition, whereas many researchers select sequences of low homology as the statistical database, this paper uses the whole PDB database. We propose a strategy for predicting protein secondary structure using a simple statistical method. Numerical computation shows that treating short amino acid sequences as units makes the tendency of a short sequence to form a secondary structure easy to see, and that a large statistical database (the whole PDB, without filtering for homology) works well. The Q3 accuracy of the proposed simple statistical method is ca. 74%, whereas the accuracy of other statistical methods is below 70%.
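
    The short-sequence statistics described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names are hypothetical, the three states are written H/E/C, each window is scored by the state of its central residue, and unseen windows fall back to coil.

```python
from collections import Counter, defaultdict


def build_kmer_table(sequences, structures, k=3):
    """Count, for every k-residue window, the state (H/E/C) of its central residue."""
    table = defaultdict(Counter)
    for seq, ss in zip(sequences, structures):
        for i in range(len(seq) - k + 1):
            table[seq[i:i + k]][ss[i + k // 2]] += 1
    return table


def predict(seq, table, k=3, default="C"):
    """Give each residue the most frequent state seen for its window (assumed rule)."""
    pred = [default] * len(seq)
    for i in range(len(seq) - k + 1):
        counts = table.get(seq[i:i + k])  # .get avoids creating empty entries
        if counts:
            pred[i + k // 2] = counts.most_common(1)[0][0]
    return "".join(pred)


def q3(predicted, observed):
    """Q3 accuracy: percentage of residues whose three-state label is correct."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy training data (invented for the example, not taken from the PDB).
table = build_kmer_table(["AAAGGG"], ["HHHCCC"], k=3)
prediction = predict("AAAGGG", table, k=3)
```

    With k=3 the first and last residues have no centered window and keep the default state, which is one reason a real method would need an explicit rule for sequence ends.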

  10. Non-homologous isofunctional enzymes: a systematic analysis of alternative solutions in enzyme evolution.

    Science.gov (United States)

    Omelchenko, Marina V; Galperin, Michael Y; Wolf, Yuri I; Koonin, Eugene V

    2010-04-30

    Evolutionarily unrelated proteins that catalyze the same biochemical reactions are often referred to as analogous - as opposed to homologous - enzymes. The existence of numerous alternative, non-homologous enzyme isoforms presents an interesting evolutionary problem; it also complicates genome-based reconstruction of the metabolic pathways in a variety of organisms. In 1998, a systematic search for analogous enzymes resulted in the identification of 105 Enzyme Commission (EC) numbers that included two or more proteins without detectable sequence similarity to each other, including 34 EC nodes where proteins were known (or predicted) to have distinct structural folds, indicating independent evolutionary origins. In the past 12 years, many putative non-homologous isofunctional enzymes were identified in newly sequenced genomes. In addition, efforts in structural genomics resulted in a vastly improved structural coverage of proteomes, providing for definitive assessment of (non)homologous relationships between proteins. We report the results of a comprehensive search for non-homologous isofunctional enzymes (NISE) that yielded 185 EC nodes with two or more experimentally characterized - or predicted - structurally unrelated proteins. Of these NISE sets, only 74 were from the original 1998 list. Structural assignments of the NISE show over-representation of proteins with the TIM barrel fold and the nucleotide-binding Rossmann fold. From the functional perspective, the set of NISE is enriched in hydrolases, particularly carbohydrate hydrolases, and in enzymes involved in defense against oxidative stress. These results indicate that at least some of the non-homologous isofunctional enzymes were recruited relatively recently from enzyme families that are active against related substrates and are sufficiently flexible to accommodate changes in substrate specificity.

  11. Homological stabilizer codes

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Jonas T., E-mail: jonastyleranderson@gmail.com

    2013-03-15

    In this paper we define homological stabilizer codes on qubits which encompass codes such as Kitaev's toric code and the topological color codes. These codes are defined solely by the graphs they reside on. This feature allows us to use properties of topological graph theory to determine the graphs which are suitable as homological stabilizer codes. We then show that all toric codes are equivalent to homological stabilizer codes on 4-valent graphs. We show that the topological color codes and toric codes correspond to two distinct classes of graphs. We define the notion of label set equivalencies and show that under a small set of constraints the only homological stabilizer codes without local logical operators are equivalent to Kitaev's toric code or to the topological color codes. - Highlights: ► We show that Kitaev's toric codes are equivalent to homological stabilizer codes on 4-valent graphs. ► We show that toric codes and color codes correspond to homological stabilizer codes on distinct graphs. ► We find and classify all 2D homological stabilizer codes. ► We find optimal codes among the homological stabilizer codes.

  12. An aureobasidin A resistance gene isolated from Aspergillus is a homolog of yeast AUR1, a gene responsible for inositol phosphorylceramide (IPC) synthase activity.

    Science.gov (United States)

    Kuroda, M; Hashida-Okado, T; Yasumoto, R; Gomi, K; Kato, I; Takesako, K

    1999-03-01

    The AUR1 gene of Saccharomyces cerevisiae, mutations in which confer resistance to the antibiotic aureobasidin A, is necessary for inositol phosphorylceramide (IPC) synthase activity. We report the molecular cloning and characterization of the Aspergillus nidulans aurA gene, which is homologous to AUR1. A single point mutation in the aurA gene of A. nidulans confers a high level of resistance to aureobasidin A. The A. nidulans aurA gene was used to identify its homologs in other Aspergillus species, including A. fumigatus, A. niger, and A. oryzae. The deduced amino acid sequence of an aurA homolog from the pathogenic fungus A. fumigatus showed 87% identity to that of A. nidulans. The AurA proteins of A. nidulans and A. fumigatus shared common characteristics in primary structure, including sequence, hydropathy profile, and N-glycosylation sites, with their S. cerevisiae, Schizosaccharomyces pombe, and Candida albicans counterparts. These results suggest that the aureobasidin resistance gene is conserved evolutionarily in various fungi.

  13. Selective anticancer activity of a hexapeptide with sequence homology to a non-kinase domain of Cyclin Dependent Kinase 4

    Directory of Open Access Journals (Sweden)

    Agarwala Usha

    2011-06-01

    Full Text Available Abstract Background Cyclin-dependent kinases 2, 4 and 6 (Cdk2, Cdk4, Cdk6) are closely structurally homologous proteins which are classically understood to control the transition from the G1 to the S-phases of the cell cycle by combining with their appropriate cyclin D or cyclin E partners to form kinase-active holoenzymes. Deregulation of Cdk4 is widespread in human cancer; CDK4 gene knockout is highly protective against chemical and oncogene-mediated epithelial carcinogenesis, despite the continued presence of CDK2 and CDK6; and overexpression of Cdk4 promotes skin carcinogenesis. Surprisingly, however, Cdk4 kinase inhibitors have not yet fulfilled their expectation as 'blockbuster' anticancer agents. Resistance to inhibition of Cdk4 kinase in some cases could potentially be due to a non-kinase activity, as recently reported with epidermal growth factor receptor. Results A search for a potential functional site of non-kinase activity present in Cdk4 but not Cdk2 or Cdk6 revealed a previously-unidentified loop on the outside of the C'-terminal non-kinase domain of Cdk4, containing a central amino-acid sequence, Pro-Arg-Gly-Pro-Arg-Pro (PRGPRP). An isolated hexapeptide with this sequence and its cyclic amphiphilic congeners are selectively lethal at high doses to a wide range of human cancer cell lines whilst sparing normal diploid keratinocytes and fibroblasts. Treated cancer cells do not exhibit the wide variability of dose response typically seen with other anticancer agents. Cancer cell killing by PRGPRP, in a cyclic amphiphilic cassette, requires cells to be in cycle but does not perturb cell cycle distribution and is accompanied by altered relative Cdk4/Cdk1 expression and selective decrease in ATP levels. Morphological features of apoptosis are absent and cancer cell death does not appear to involve autophagy. Conclusion These findings suggest a potential new paradigm for the development of broad-spectrum cancer specific therapeutics with

  14. Quantitative RT-PCR based platform for rapid quantification of the transcripts of highly homologous multigene families and their members during grain development

    DEFF Research Database (Denmark)

    Kaczmarczyk, Agnieszka Ewa; Bowra, Steve; Elek, Zoltan

    2012-01-01

    expression combined with genetic variation in large multigene families with high homology among the alleles is very challenging. Results We designed a rapid qRT-PCR system with the aim of characterising the variation in the expression of hordein gene families. All the known D-, C-, B-, and gamma......-hordein sequences coding full length open reading frames were collected from commonly available databases. Phylogenetic analysis was performed and the members of the different hordein families were classified into subfamilies. Primer sets were designed to discriminate the gene expression level of whole families...... and its subgroups. Moreover, the results indicate genotype-specific gene expression. Conclusions Quantitative RT-PCR with SYBR Green labelling can be a useful technique to follow gene expression levels of large gene families with highly homologous members. We showed variation in the temporal...

  15. Structural insights into a high affinity nanobody:antigen complex by homology modelling

    DEFF Research Database (Denmark)

    Skottrup, Peter Durand

    2017-01-01

    Porphyromonas gingivalis is a major periodontitis-causing pathogen. P. gingivalis secretes a cysteine protease termed RgpB, which is specific for Arg-Xaa bonds in substrates. Recently, a nanobody-based assay was used to demonstrate that RgpB could represent a novel diagnostic target, thereby...... simplifying P. gingivalis detection. The nanobody, VHH7, had a high binding affinity and was specific for RgpB, when tested against the highly identical RgpA. In this study a homology model of VHH7 was built. The complementarity determining regions (CDR) comprising the paratope residues responsible for Rgp...

  16. Protein 3D structure computed from evolutionary sequence variation.

    Directory of Open Access Journals (Sweden)

    Debora S Marks

    Full Text Available The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7-4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of

  17. A computational approach to discovering the functions of bacterial phytochromes by analysis of homolog distributions

    Directory of Open Access Journals (Sweden)

    Lamparter Tilman

    2006-03-01

    Full Text Available Abstract Background Phytochromes are photoreceptors, discovered in plants, that control a wide variety of developmental processes. They have also been found in bacteria and fungi, but for many species their biological role remains obscure. This work concentrates on the phytochrome system of Agrobacterium tumefaciens, a non-photosynthetic soil bacterium with two phytochromes. To identify proteins that might share common functions with phytochromes, a co-distribution analysis was performed on the basis of protein sequences from 138 bacteria. Results A database of protein sequences from 138 bacteria was generated. Each sequence was BLASTed against the entire database. The homolog distribution of each query protein was then compared with the homolog distribution of every other protein (target protein) of the same species, and the target proteins were sorted according to their probability of co-distribution under random conditions. As query proteins, phytochromes from Agrobacterium tumefaciens, Pseudomonas aeruginosa, Deinococcus radiodurans and Synechocystis PCC 6803 were chosen along with several phytochrome-related proteins from A. tumefaciens. The Synechocystis photosynthesis protein D1 was selected as a control. In the D1 analyses, the ratio between photosynthesis-related proteins and those not related to photosynthesis among the top 150 in the co-distribution tables was > 3:1, showing that the method is appropriate for finding partner proteins with common functions. The co-distribution of phytochromes with other histidine kinases was remarkably high, although most co-distributed histidine kinases were not direct BLAST homologs of the query protein. This finding implies that phytochromes and other histidine kinases share common functions as parts of signalling networks. All phytochromes tested, with one exception, also revealed a remarkably high co-distribution with glutamate synthase and methionine synthase. This result implies a general role of
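
    The ranking step in this record (sorting target proteins by their probability of co-distribution under random conditions) can be sketched as follows. The abstract does not say which statistic was used, so the hypergeometric tail probability below is an assumption, as are the function names and the toy presence/absence profiles.

```python
from math import comb


def codistribution_pvalue(query_profile, target_profile):
    """Chance of at least the observed overlap if the target's homologs were
    placed at random among the genomes (hypergeometric tail; assumed model)."""
    n = len(query_profile)                      # genomes in the database
    q = sum(query_profile)                      # genomes with a query homolog
    t = sum(target_profile)                     # genomes with a target homolog
    shared = sum(a and b for a, b in zip(query_profile, target_profile))
    tail = sum(comb(q, k) * comb(n - q, t - k)
               for k in range(shared, min(q, t) + 1))
    return tail / comb(n, t)


def rank_targets(query_profile, target_profiles):
    """Sort target names so the least random-looking co-distribution comes first."""
    return sorted(target_profiles,
                  key=lambda name: codistribution_pvalue(query_profile,
                                                         target_profiles[name]))


# Toy profiles over four genomes: 1 = a BLAST homolog is present, 0 = absent.
query = [1, 1, 0, 0]
targets = {"same_pattern": [1, 1, 0, 0], "opposite_pattern": [0, 0, 1, 1]}
ranking = rank_targets(query, targets)
```

    In the study itself each profile would span 138 genomes and would come from BLAST hits above some cutoff; that cutoff is not given in the abstract and is left out here.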

  18. Human papilloma viruses and cervical tumours: mapping of integration sites and analysis of adjacent cellular sequences

    International Nuclear Information System (INIS)

    Klimov, Eugene; Vinokourova, Svetlana; Moisjak, Elena; Rakhmanaliev, Elian; Kobseva, Vera; Laimins, Laimonis; Kisseljov, Fjodor; Sulimova, Galina

    2002-01-01

    In cervical tumours the integration of human papilloma viruses (HPV) transcripts often results in the generation of transcripts that consist of hybrids of viral and cellular sequences. Mapping data obtained with a variety of techniques have demonstrated that HPV integration occurs without obvious specificity into the human genome. However, these techniques could not demonstrate whether integration resulted in the generation of transcripts encoding viral or viral-cellular sequences. The aim of this work was to map the integration sites of HPV DNA and to analyse the adjacent cellular sequences. Amplification of the INTs was done by the APOT technique. The APOT products were sequenced according to standard protocols. The analysis of the sequences was performed using the BLASTN program and public databases. To localise the INTs, PCR-based screening of the GeneBridge4-RH-panel was used. Twelve cellular sequences adjacent to integrated HPV16 (INT markers) expressed in squamous cell cervical carcinomas were isolated. For 11 INT markers homologous human genomic sequences were readily identified and 9 of these showed significant homologies to known genes/ESTs. Using the known locations of homologous cDNAs and the RH-mapping techniques, mapping studies showed that the INTs are distributed among different human chromosomes for each tumour sample and are located in regions with high levels of expression. Integration of HPV genomes occurs into different human chromosomes but into regions that contain highly transcribed genes. One interpretation of these studies is that integration of HPV occurs into decondensed regions, which are more accessible for integration of foreign DNA.

  19. Bioinformatic approach in the identification of arabidopsis gene homologous in amaranthus

    Directory of Open Access Journals (Sweden)

    Jana Žiarovská

    2015-05-01

    Full Text Available Bioinformatics offers efficient tools for molecular genetics applications, and sequence homology search algorithms have become an essential part of many research strategies. Appropriate handling of known data stored in publicly available databases can serve research in many ways. Here, we report the identification of the DNA sequence of an RmlC-like cupins superfamily protein, which is known in the Arabidopsis genome, in Amaranthus, a plant species for which this sequence had not yet been determined. A BLAST-based approach was used to identify the homologous sequences in the nucleotide database and to find suitable parts of the Arabidopsis sequence where primers can be designed. In total, 64 hits were found in the nucleotide database for the Arabidopsis RmlC-like cupins sequence. Query cover ranged from 10% to 100% among the RmlC-like cupins nucleotide sequence and its homologues currently stored in public nucleotide databases. The most conserved region was identified for matches spanning nucleotides 1506 to 1925 of the RmlC-like cupins DNA sequence stored in the database. The in silico approach was subsequently used in PCR analysis, where the specificity of the designed primers was confirmed. A unique, 250 bp long fragment was obtained for Amaranthus cruentus and a hybrid Amaranthus hypochondriacus x hybridus in our analysis. Bioinformatics-based analysis of unknown parts of plant genomes, as shown in this study, is a very good additional tool in PCR-based analysis of plant variability. This approach is suitable for plants where concrete genomic data are still missing for the relevant genes, as demonstrated for Amaranthus.

  20. Searching remote homology with spectral clustering with symmetry in neighborhood cluster kernels.

    Directory of Open Access Journals (Sweden)

    Ujjwal Maulik

    Full Text Available Remote homology detection among proteins utilizing only the unlabelled sequences is a central problem in comparative genomics. The existing cluster kernel methods based on neighborhoods and profiles and the Markov clustering algorithms are currently the most popular methods for protein family recognition. The deviation from random walks with inflation or dependency on hard threshold in similarity measure in those methods requires an enhancement for homology detection among multi-domain proteins. We propose to combine spectral clustering with neighborhood kernels in Markov similarity for enhancing sensitivity in detecting homology independent of "recent" paralogs. The spectral clustering approach with new combined local alignment kernels more effectively exploits the unsupervised protein sequences globally reducing inter-cluster walks. When combined with the corrections based on modified symmetry based proximity norm deemphasizing outliers, the technique proposed in this article outperforms other state-of-the-art cluster kernels among all twelve implemented kernels. The comparison with the state-of-the-art string and mismatch kernels also shows the superior performance scores provided by the proposed kernels. A similar performance improvement is also found over an existing large dataset. Therefore the proposed spectral clustering framework over combined local alignment kernels with modified symmetry based correction achieves superior performance for unsupervised remote homolog detection even in multi-domain and promiscuous domain proteins from Genolevures database families with better biological relevance. Source code available upon request: sarkar@labri.fr.

  1. In silico sequence analysis and homology modeling of predicted beta-amylase 7-like protein in Brachypodium distachyon L.

    Directory of Open Access Journals (Sweden)

    ERTUĞRUL FILIZ

    2014-04-01

    Full Text Available Beta-amylase (β-amylase, EC 3.2.1.2) is an enzyme that catalyses hydrolysis of glucosidic bonds in polysaccharides. In this study, we analyzed the protein sequence of the predicted beta-amylase 7-like protein in Brachypodium distachyon. The isoelectric point (pI) was found to be 5.23, indicating acidic character, while the instability index (II) was 50.28, classifying the protein as unstable. Subcellular localization prediction using CELLO v.2.5 indicated that the protein may reside in the chloroplast. The 3D structure of the protein was predicted by comparative homology modeling with SWISS-MODEL. The accuracy of the predicted 3D structure was checked by Ramachandran plot analysis, which showed 95.4% of residues in the favored region. The results of our study contribute to the understanding of β-amylase protein structure in grass species and will provide a scientific basis for 3D modeling of beta-amylase proteins in further studies.

  2. Molecular cloning and sequencing analysis of the interferon receptor (IFNAR-1) from Columba livia.

    Science.gov (United States)

    Li, Chao; Chang, Wei Shan

    2014-01-01

    This study reports the partial sequence cloning of the interferon receptor (IFNAR-1) of Columba livia. To obtain a gene fragment of a defined length (630 bp), a pair of primers was designed according to the conserved nucleotide sequence of the Gallus (EU477527.1) and Taeniopygia guttata (XM_002189232.1) IFNAR-1 gene fragments published in GenBank. Specific primers were designed for the RACE method to amplify the 3' terminal cDNA. The Columba livia IFNAR-1 displayed 88.5%, 80.5% and 73.8% nucleotide identity to Falco peregrinus, Gallus and Taeniopygia guttata, respectively. Phylogenetic analysis of the IFNAR-1 gene showed high homology among Columba livia, Falco peregrinus and chicken. We successfully obtained a partial sequence of the Columba livia IFNAR-1 gene. Analysis of the genetic tree showed that the IFNAR-1 genes of Columba livia and Falco peregrinus are highly homologous. This result can be used as a reference for further research and practical application.
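
    Nucleotide identity figures such as those quoted in this record are usually computed from a pairwise alignment. The sketch below shows one common convention, counting matches over aligned columns that contain no gap; the abstract does not say which convention or software was used, so the function name, the gap handling and the toy sequences are all assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


# Toy alignment (invented for the example): 7 of 8 columns match.
identity = percent_identity("ACGTACGT", "ACGTTCGT")
```

    Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values from different tools are not always directly comparable.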

  3. Molecular characterization, sequence analysis and tissue expression of a porcine gene – MOSPD2

    Directory of Open Access Journals (Sweden)

    Yang Jie

    2017-01-01

    Full Text Available The full-length cDNA sequence of a porcine gene, MOSPD2, was amplified using the rapid amplification of cDNA ends method based on a pig expressed sequence tag sequence which was highly homologous to the coding sequence of the human MOSPD2 gene. Sequence prediction analysis revealed that the open reading frame of this gene encodes a protein of 491 amino acids that has high homology with the motile sperm domain-containing protein 2 (MOSPD2) of five species: horse (89%), human (90%), chimpanzee (89%), rhesus monkey (89%) and mouse (85%); thus, it could be defined as a porcine MOSPD2 gene. This novel porcine gene was assigned GeneID: 100153601. This gene is structured in 15 exons and 14 introns as revealed by computer-assisted analysis. The phylogenetic analysis revealed that the porcine MOSPD2 gene has a closer genetic relationship with the MOSPD2 gene of horse. Tissue expression analysis indicated that the porcine MOSPD2 gene is generally and differentially expressed in the spleen, muscle, skin, kidney, lung, liver, fat and heart. Our experiment is the first to establish the primary foundation for further research on the porcine MOSPD2 gene.

  4. Structural insights into a high affinity nanobody:antigen complex by homology modelling.

    Science.gov (United States)

    Skottrup, Peter Durand

    2017-09-01

    Porphyromonas gingivalis is a major periodontitis-causing pathogen. P. gingivalis secretes a cysteine protease termed RgpB, which is specific for Arg-Xaa bonds in substrates. Recently, a nanobody-based assay was used to demonstrate that RgpB could represent a novel diagnostic target, thereby simplifying P. gingivalis detection. The nanobody, VHH7, had a high binding affinity and was specific for RgpB, when tested against the highly identical RgpA. In this study a homology model of VHH7 was built. The complementarity determining regions (CDR) comprising the paratope residues responsible for RgpB binding were identified and used as input to the docking. Furthermore, residues likely involved in the RgpB epitope were identified based upon RgpB:RgpA alignment and analysis of residue surface accessibility. CDR residues and putative RgpB epitope residues were used as input to an information-driven flexible docking approach using the HADDOCK server. Analysis of the VHH7:RgpB model demonstrated that the epitope was found in the immunoglobulin-like domain, and residue pairs located at the molecular paratope:epitope interface that are important for complex stability were identified. Collectively, the VHH7 homology model and VHH7:RgpB docking supply knowledge of the residues involved in the high affinity interaction. This information could prove valuable in the design of an antibody-drug conjugate for specific RgpB targeting.

  5. DockoMatic 2.0: high throughput inverse virtual screening and homology modeling.

    Science.gov (United States)

    Bullock, Casey; Cornia, Nic; Jacob, Reed; Remm, Andrew; Peavey, Thomas; Weekes, Ken; Mallory, Chris; Oxford, Julia T; McDougal, Owen M; Andersen, Timothy L

    2013-08-26

    DockoMatic is a free and open source application that unifies a suite of software programs within a user-friendly graphical user interface (GUI) to facilitate molecular docking experiments. Here we describe the release of DockoMatic 2.0; significant software advances include the ability to (1) conduct high throughput inverse virtual screening (IVS); (2) construct 3D homology models; and (3) customize the user interface. Users can now efficiently set up, start, and manage IVS experiments through the DockoMatic GUI by specifying receptor(s), ligand(s), grid parameter file(s), and docking engine (either AutoDock or AutoDock Vina). DockoMatic automatically generates the needed experiment input files and output directories and allows the user to manage and monitor job progress. Upon job completion, a summary of results is generated by DockoMatic to facilitate interpretation by the user. DockoMatic functionality has also been expanded to facilitate the construction of 3D protein homology models using the Timely Integrated Modeler (TIM) wizard. The TIM wizard provides an interface that accesses the basic local alignment search tool (BLAST) and MODELER programs and guides the user through the necessary steps to easily and efficiently create 3D homology models for biomacromolecular structures. The DockoMatic GUI can be customized by the user, and the software design makes it relatively easy to integrate additional docking engines, scoring functions, or third party programs. DockoMatic is a free comprehensive molecular docking software program for all levels of scientists in both research and education.

  6. Molecular Cloning And Sequencing Of Disintegrin Like Domain ...

    African Journals Online (AJOL)

    Disintegrin-like domain was cloned and sequenced from Cerastes cerastes venom gland tissue. Nested RT-PCR was performed using initial primers designed based on the homology of disintegrins from Trimeresurus flavoviridis, Glodius halys, Agkistrodon halys and Trimeresurus macrosquamatus. The homology was ...

  7. High-precision, whole-genome sequencing of laboratory strains facilitates genetic studies.

    Directory of Open Access Journals (Sweden)

    Anjana Srivatsan

    2008-08-01

    Full Text Available Whole-genome sequencing is a powerful technique for obtaining the reference sequence information of multiple organisms. Its use can be dramatically expanded to rapidly identify genomic variations, which can be linked with phenotypes to obtain biological insights. We explored these potential applications using the emerging next-generation sequencing platform Solexa Genome Analyzer, and the well-characterized model bacterium Bacillus subtilis. Combining sequencing with experimental verification, we first improved the accuracy of the published sequence of the B. subtilis reference strain 168, then obtained sequences of multiple related laboratory strains and different isolates of each strain. This provides a framework for comparing the divergence between different laboratory strains and between their individual isolates. We also demonstrated the power of Solexa sequencing by using its results to predict a defect in the citrate signal transduction pathway of a common laboratory strain, which we verified experimentally. Finally, we examined the molecular nature of spontaneously generated mutations that suppress the growth defect caused by deletion of the stringent response mediator relA. Using whole-genome sequencing, we rapidly mapped these suppressor mutations to two small homologs of relA. Interestingly, stable suppressor strains had mutations in both genes, with each mutation alone partially relieving the relA growth defect. This supports an intriguing three-locus interaction module that is not easily identifiable through traditional suppressor mapping. We conclude that whole-genome sequencing can drastically accelerate the identification of suppressor mutations and complex genetic interactions, and it can be applied as a standard tool to investigate the genetic traits of model organisms.

  8. The nucleotide sequence of satellite RNA in grapevine fanleaf virus, strain F13.

    Science.gov (United States)

    Fuchs, M; Pinck, M; Serghini, M A; Ravelonandro, M; Walter, B; Pinck, L

    1989-04-01

    The nucleotide sequence of cDNA copies of grapevine fanleaf virus (strain F13) satellite RNA has been determined. The primary structure obtained was 1114 nucleotides in length, excluding the poly(A) tail, and contained only one long open reading frame encoding a 341 residue, highly hydrophilic polypeptide of Mr 37275. The coding sequence was bordered by a leader of 14 nucleotides and a 3'-terminal non-coding region of 74 nucleotides. No homology has been found with small satellite RNAs associated with other nepoviruses. Two limited homologies of eight nucleotides have been detected between the satellite RNA in grapevine fanleaf virus and those in tomato black ring virus, and a consensus sequence U.G/UGAAAAU/AU/AU/A at the 5' end of nepovirus RNAs is reported. A less extended consensus exists in this region in comovirus and picornavirus RNA.
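
    The lengths quoted in this record can be cross-checked with a line of arithmetic. The sketch below assumes, which the abstract does not state, that the coding sequence bordered by the leader and the 3' non-coding region includes a single stop codon after the 341 sense codons; the constant and function names are illustrative.

```python
# Figures quoted in the abstract above.
LEADER_NT = 14            # 5' leader
ORF_RESIDUES = 341        # residues in the encoded polypeptide
TRAILER_NT = 74           # 3'-terminal non-coding region
REPORTED_TOTAL_NT = 1114  # length excluding the poly(A) tail

# Assumption (not stated in the abstract): one stop codon closes the reading frame.
STOP_CODONS = 1


def expected_length(leader, residues, trailer, stop_codons=STOP_CODONS):
    """Leader + three nucleotides per codon (sense and stop) + trailer."""
    return leader + 3 * (residues + stop_codons) + trailer


computed = expected_length(LEADER_NT, ORF_RESIDUES, TRAILER_NT)
print(computed, computed == REPORTED_TOTAL_NT)  # prints: 1114 True
```

    Under that assumption the parts add up to the reported total: 14 + 3 × 342 + 74 = 1114.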

  9. FBH1 helicase disrupts RAD51 filaments in vitro and modulates homologous recombination in mammalian cells

    DEFF Research Database (Denmark)

    Simandlova, Jitka; Zagelbaum, Jennifer; Payne, Miranda J

    2013-01-01

    Efficient repair of DNA double strand breaks and interstrand cross-links requires the homologous recombination (HR) pathway, a potentially error-free process that utilizes a homologous sequence as a repair template. A key player in HR is RAD51, the eukaryotic ortholog of bacterial RecA protein. RAD......51 can polymerize on DNA to form a nucleoprotein filament that facilitates both the search for the homologous DNA sequences and the subsequent DNA strand invasion required to initiate HR. Because of its pivotal role in HR, RAD51 is subject to numerous positive and negative regulatory influences...... filaments on DNA through its ssDNA translocase function. Consistent with this, a mutant mouse embryonic stem cell line with a deletion in the FBH1 helicase domain fails to limit RAD51 chromatin association and shows hyper-recombination. Our data are consistent with FBH1 restraining RAD51 DNA binding under...

  10. An RNA secondary structure bias for non-homologous reverse transcriptase-mediated deletions in vivo

    DEFF Research Database (Denmark)

    Duch, Mogens; Carrasco, Maria L; Jespersen, Thomas

    2004-01-01

    Murine leukemia viruses harboring an internal ribosome entry site (IRES)-directed translational cassette are able to replicate, but undergo loss of heterologous sequences upon continued passage. While complete loss of heterologous sequences is favored when these are flanked by a direct repeat......, deletion mutants with junction sites within the heterologous cassette may also be retrieved, in particular from vectors without flanking repeats. Such deletion mutants were here used to investigate determinants of reverse transcriptase-mediated non-homologous recombination. Based upon previous structural...... result from template switching during first-strand cDNA synthesis and that the choice of acceptor sites for non-homologous recombination are guided by non-paired regions. Our results may have implications for recombination events taking place within structured regions of retroviral RNA genomes...

  11. The primary structures of two yeast enolase genes. Homology between the 5' noncoding flanking regions of yeast enolase and glyceraldehyde-3-phosphate dehydrogenase genes.

    Science.gov (United States)

    Holland, M J; Holland, J P; Thill, G P; Jackson, K A

    1981-02-10

    Segments of yeast genomic DNA containing two enolase structural genes have been isolated by subculture cloning procedures using a cDNA hybridization probe synthesized from purified yeast enolase mRNA. Based on restriction endonuclease and transcriptional maps of these two segments of yeast DNA, each hybrid plasmid contains a region of extensive nucleotide sequence homology which forms hybrids with the cDNA probe. The DNA sequences which flank this homologous region in the two hybrid plasmids are nonhomologous, indicating that these sequences are nontandemly repeated in the yeast genome. The complete nucleotide sequence of the coding as well as the flanking noncoding regions of these genes has been determined. The amino acid sequence predicted from one reading frame of both structural genes is extremely similar to that determined for yeast enolase (Chin, C. C. Q., Brewer, J. M., Eckard, E., and Wold, F. (1981) J. Biol. Chem. 256, 1370-1376), confirming that these isolated structural genes encode yeast enolase. The nucleotide sequences of the coding regions of the genes are approximately 95% homologous, and neither gene contains an intervening sequence. Codon utilization in the enolase genes follows the same biased pattern previously described for two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes (Holland, J. P., and Holland, M. J. (1980) J. Biol. Chem. 255, 2596-2605). DNA blotting analysis confirmed that the isolated segments of yeast DNA are colinear with yeast genomic DNA and that there are two nontandemly repeated enolase genes per haploid yeast genome. The noncoding portions of the two enolase genes adjacent to the initiation and termination codons are approximately 70% homologous and contain sequences thought to be involved in the synthesis and processing of messenger RNA. Finally there are regions of extensive homology between the two enolase structural genes and two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes within the 5

  12. DNA Barcoding: Amplification and sequence analysis of rbcL and matK genome regions in three divergent plant species

    Directory of Open Access Journals (Sweden)

    Javed Iqbal Wattoo

    2016-11-01

    Background: DNA barcoding is a novel method of species identification based on nucleotide diversity of conserved sequences. The establishment and refining of plant DNA barcoding systems is more challenging due to high genetic diversity among different species. Therefore, targeting the conserved nuclear transcribed regions would be more reliable for plant scientists to reveal genetic diversity, species discrimination and phylogeny. Methods: In this study, we amplified and sequenced the chloroplast DNA regions (matK+rbcL) of Solanum nigrum, Euphorbia helioscopia and Dalbergia sissoo to study the functional annotation, homology modeling and sequence analysis to allow a more efficient utilization of these sequences among different plant species. These three species represent three families: Solanaceae, Euphorbiaceae and Fabaceae, respectively. Biological sequence homology and divergence of the amplified sequences were studied using the Basic Local Alignment Search Tool (BLAST). Results: Both primers (matK+rbcL) showed good amplification in the three species. The sequenced regions revealed conserved genome information for future identification of different medicinal plants belonging to these species. The amplified conserved barcodes revealed different levels of biological homology after sequence analysis. The results clearly showed that the use of these conserved DNA sequences as barcode primers would be an accurate way for species identification and discrimination. Conclusion: The amplification and sequencing of conserved genome regions identified a novel sequence of matK in native species of Solanum nigrum. The findings of the study would be applicable in the medicinal industry to establish DNA-based identification of different medicinal plant species to monitor adulteration.
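As a rough illustration of the kind of similarity figure a sequence comparison yields, the sketch below computes percent identity over two sequences that are assumed to be already aligned. The function name `percent_identity` and the toy sequences are hypothetical and are not data from the study; BLAST itself performs local alignment with scoring and significance statistics that this sketch does not attempt to reproduce.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must be rows of the same alignment (equal length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # compared columns where the residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are skipped in this simple definition
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: 9 ungapped columns, 8 of them identical.
    print(round(percent_identity("ATGGC-TTAC", "ATGACGTTAC"), 1))
```

Other definitions of identity exist (for example, counting gapped columns in the denominator), so a value from this sketch is only comparable to a reported figure when the same convention is used.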

  13. Molecular characterization of DnaJ 5 homologs in silkworm Bombyx mori and its expression during egg diapause.

    Science.gov (United States)

    Sirigineedi, Sasibhushan; Vijayagowri, Esvaran; Murthy, Geetha N; Rao, Guruprasada; Ponnuvel, Kangayam M

    2014-12-01

    A comparison of the cDNA sequence (1,056 bp) of the Bombyx mori DnaJ 5 homolog with the B. mori genome revealed that, unlike other Hsps, it has an intron of 234 bp. The DnaJ 5 homolog contains 351 amino acids, of which 70 make up the conserved DnaJ domain at the N-terminal end. This homolog of B. mori has all desirable functional domains similar to other insects, and the 13 different DnaJ homologs identified in the B. mori genome were distributed on different chromosomes. The expressed sequence tag database analysis of Hsp40 gene expression revealed higher expression in wing disc followed by diapause-induced eggs. Microarray analysis revealed higher expression of the DnaJ 5 homolog at 18 h after oviposition in diapause-induced eggs. Further validation of DnaJ 5 expression through qPCR in diapause-induced and nondiapause eggs at different time intervals revealed higher expression in diapause eggs at 18 and 24 h after oviposition, which coincided with the expression of Hsp70, as Hsp40 is its co-chaperone. This study thus provides an outline of the genome organization of the Hsp40 gene and its role in egg diapause induction in B. mori. © 2013 Institute of Zoology, Chinese Academy of Sciences.

  14. Murine mammary tumor virus pol-related sequences in human DNA: characterization and sequence comparison with the complete murine mammary tumor virus pol gene

    International Nuclear Information System (INIS)

    Deen, K.C.; Sweet, R.W.

    1986-01-01

    Sequences in the human genome with homology to the murine mammary tumor virus (MMTV) pol gene were isolated from a human phage library. Ten clones with extensive pol homology were shown to define five separate loci. These loci share common sequences immediately adjacent to the pol-like segments and, in addition, contain a related repeat element which bounds this region. This organization is suggestive of a proviral structure. The authors estimate that the human genome contains 30 to 40 copies of these pol-related sequences. The pol region of one of the cloned segments (HM16) and the complete MMTV pol gene were sequenced and compared. The nucleotide homology between these pol sequences is 52% and is concentrated in the terminal regions. The MMTV pol gene contains a single long open reading frame encoding 899 amino acids and is demarcated from the partially overlapping putative gag gene by termination codons and a shift in translational reading frame. The pol sequence of HM16 is multiply terminated but does contain open reading frames which encode 370, 105, and 112 amino acid residues in separate reading frames. The authors deduced a composite pol protein sequence for HM16 by aligning it to the MMTV pol gene and then compared these sequences with other retroviral pol protein sequences. Conserved sequences occur in both the amino and carboxyl regions, which lie within the polymerase and endonuclease domains of pol, respectively.

  15. A 1,681-locus consensus genetic map of cultivated cucumber including 67 NB-LRR resistance gene homolog and ten gene loci.

    Science.gov (United States)

    Yang, Luming; Li, Dawei; Li, Yuhong; Gu, Xingfang; Huang, Sanwen; Garcia-Mas, Jordi; Weng, Yiqun

    2013-03-25

    Cucumber is an important vegetable crop that is susceptible to many pathogens, but no disease resistance (R) genes have been cloned. The availability of whole genome sequences provides an excellent opportunity for systematic identification and characterization of the nucleotide binding and leucine-rich repeat (NB-LRR) type R gene homolog (RGH) sequences in the genome. Cucumber has a very narrow genetic base making it difficult to construct high-density genetic maps. Development of a consensus map by synthesizing information from multiple segregating populations is a method of choice to increase marker density. As such, the objectives of the present study were to identify and characterize NB-LRR type RGHs, and to develop a high-density, integrated cucumber genetic-physical map anchored with RGH loci. From the Gy14 draft genome, 70 NB-containing RGHs were identified and characterized. Most RGHs were in clusters with uneven distribution across seven chromosomes. In silico analysis indicated that all 70 RGHs had EST support for gene expression. Phylogenetic analysis classified 58 RGHs into two clades: CNL and TNL. Comparative analysis revealed high-degree sequence homology and synteny in chromosomal locations of these RGH members between the cucumber and melon genomes. Fifty-four molecular markers were developed to delimit 67 of the 70 RGHs, which were integrated into a genetic map through linkage analysis. A 1,681-locus cucumber consensus map including 10 gene loci and spanning 730.0 cM in seven linkage groups was developed by integrating three component maps with a bin-mapping strategy. Physically, 308 scaffolds with 193.2 Mbp total DNA sequences were anchored onto this consensus map that covered 52.6% of the 367 Mbp cucumber genome. Cucumber contains relatively few NB-LRR RGHs that are clustered and unevenly distributed in the genome. All RGHs seem to be transcribed and shared significant sequence homology and synteny with the melon genome suggesting conservation of

  16. A novel mutation in TFL1 homolog affecting determinacy in cowpea (Vigna unguiculata).

    Science.gov (United States)

    Dhanasekar, P; Reddy, K S

    2015-02-01

    Mutations in the widely conserved Arabidopsis Terminal Flower 1 (TFL1) gene and its homologs have been demonstrated to result in determinacy across genera, the knowledge of which is lacking in cowpea. Understanding the molecular events leading to determinacy of apical meristems could hasten development of cowpea varieties with suitable ideotypes. Isolation and characterization of a novel mutation in cowpea TFL1 homolog (VuTFL1) affecting determinacy is reported here for the first time. Cowpea TFL1 homolog was amplified using primers designed based on conserved sequences in related genera and sequence variation was analysed in three gamma ray-induced determinate mutants, their indeterminate parent "EC394763" and two indeterminate varieties. The analyses of sequence variation exposed a novel SNP distinguishing the determinate mutants from the indeterminate types. The non-synonymous point mutation in exon 4 at position 1,176 resulted from transversion of cytosine (C) to adenine (A) leading to an amino acid change (Pro-136 to His) in determinate mutants. The effect of the mutation on protein function and stability was predicted to be detrimental using different bioinformatics/computational tools. The functionally significant novel substitution mutation is hypothesized to affect determinacy in the cowpea mutants. Development of suitable regeneration protocols in this hitherto recalcitrant crop and subsequent complementation assay in mutants or over-expressing assay in parents could decisively conclude the role of the SNP in regulating determinacy in these cowpea mutants.
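The substitution described above can be checked against the standard genetic code. The sketch below classifies a base change as a transition or transversion and shows that a C-to-A change at the second position of a proline codon gives histidine when the third base is T or C. The specific codon `CCT` is a hypothetical choice for illustration: the abstract reports only the C-to-A change and the Pro-136 to His replacement, not the codon itself. The helper names are likewise assumptions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Only the entries of the standard genetic code needed for this example.
CODON_TABLE = {
    "CCT": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
    "CAT": "His", "CAC": "His", "CAA": "Gln", "CAG": "Gln",
}


def substitution_type(ref: str, alt: str) -> str:
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine) or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def mutate_codon(codon: str, position: int, alt: str) -> str:
    """Replace the base at a 0-based position within a three-base codon."""
    if len(codon) != 3 or not 0 <= position < 3:
        raise ValueError("codon must have 3 bases and position must be 0, 1 or 2")
    return codon[:position] + alt.upper() + codon[position + 1:]


if __name__ == "__main__":
    codon = "CCT"  # hypothetical proline codon, not taken from the source
    mutant = mutate_codon(codon, 1, "A")
    print(substitution_type("C", "A"), CODON_TABLE[codon], "->", CODON_TABLE[mutant])
```

Under the standard code, the same change in `CCA` or `CCG` would give glutamine rather than histidine, so a Pro-to-His replacement by a single C-to-A change implies a proline codon ending in T or C.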

  17. Conservation and co-option in developmental programmes: the importance of homology relationships

    Directory of Open Access Journals (Sweden)

    Becker May-Britt

    2005-10-01

    One of the surprising insights gained from research in evolutionary developmental biology (evo-devo) is that increasing diversity in body plans and morphology in organisms across animal phyla is not reflected in similarly dramatic changes at the level of gene composition of their genomes. For instance, simplicity at the tissue level of organization often contrasts with a high degree of genetic complexity. Also intriguing is the observation that the coding regions of several genes of invertebrates show high sequence similarity to those in humans. This lack of change (conservation) indicates that evolutionary novelties may arise more frequently through combinatorial processes, such as changes in gene regulation and the recruitment of novel genes into existing regulatory gene networks (co-option), and less often through adaptive evolutionary processes in the coding portions of a gene. As a consequence, it is of great interest to examine whether the widespread conservation of the genetic machinery implies the same developmental function in a last common ancestor, or whether homologous genes acquired new developmental roles in structures of independent phylogenetic origin. To distinguish between these two possibilities one must refer to current concepts of phylogeny reconstruction and carefully investigate homology relationships. Particularly problematic in terms of homology decisions is the use of gene expression patterns of a given structure. In the future, research on more organisms other than the typical model systems will be required since these can provide insights that are not easily obtained from comparisons among only a few distantly related model species.

  18. Competitive repair by naturally dispersed repetitive DNA during non-allelic homologous recombination

    Energy Technology Data Exchange (ETDEWEB)

    Hoang, Margaret L.; Tan, Frederick J.; Lai, David C.; Celniker, Sue E.; Hoskins, Roger A.; Dunham, Maitreya J.; Zheng, Yixian; Koshland, Douglas

    2010-08-27

    Genome rearrangements often result from non-allelic homologous recombination (NAHR) between repetitive DNA elements dispersed throughout the genome. Here we systematically analyze NAHR between Ty retrotransposons using a genome-wide approach that exploits unique features of Saccharomyces cerevisiae purebred and Saccharomyces cerevisiae/Saccharomyces bayanus hybrid diploids. We find that DNA double-strand breaks (DSBs) induce NAHR-dependent rearrangements using Ty elements located 12 to 48 kilobases distal to the break site. This break-distal recombination (BDR) occurs frequently, even when allelic recombination can repair the break using the homolog. Robust BDR-dependent NAHR demonstrates that sequences very distal to DSBs can effectively compete with proximal sequences for repair of the break. In addition, our analysis of NAHR partner choice between Ty repeats shows that intrachromosomal Ty partners are preferred despite the abundance of potential interchromosomal Ty partners that share higher sequence identity. This competitive advantage of intrachromosomal Tys results from the relative efficiencies of different NAHR repair pathways. Finally, NAHR generates deleterious rearrangements more frequently when DSBs occur outside rather than within a Ty repeat. These findings yield insights into mechanisms of repeat-mediated genome rearrangements associated with evolution and cancer.

  19. Allergenic characterization of a novel allergen, homologous to chymotrypsin, from german cockroach.

    Science.gov (United States)

    Jeong, Kyoung Yong; Son, Mina; Lee, Jae Hyun; Hong, Chein Soo; Park, Jung Won

    2015-05-01

    Cockroach feces are known to be rich in IgE-reactive components. Various protease allergens were identified by proteomic analysis of German cockroach fecal extract in a previous study. In this study, we characterized a novel allergen, a chymotrypsin-like serine protease. A cDNA sequence homologous to chymotrypsin was obtained by analysis of German cockroach expressed sequence tag (EST) clones. The recombinant chymotrypsins from the German cockroach and house dust mite (Der f 6) were expressed in Escherichia coli using the pEXP5NT/TOPO vector system, and their allergenicity was investigated by ELISA. The deduced amino acid sequence of German cockroach chymotrypsin showed 32.7 to 43.1% identity with mite group 3 (trypsin) and group 6 (chymotrypsin) allergens. Sera from 8 of 28 German cockroach allergy subjects (28.6%) showed IgE binding to the recombinant protein. IgE binding to the recombinant cockroach chymotrypsin was inhibited by house dust mite chymotrypsin Der f 6, while it minimally inhibited the German cockroach whole body extract. A novel allergen homologous to chymotrypsin was identified from the German cockroach and was cross-reactive with Der f 6.

  20. Competitive repair by naturally dispersed repetitive DNA during non-allelic homologous recombination.

    Directory of Open Access Journals (Sweden)

    Margaret L Hoang

    2010-12-01

    Genome rearrangements often result from non-allelic homologous recombination (NAHR) between repetitive DNA elements dispersed throughout the genome. Here we systematically analyze NAHR between Ty retrotransposons using a genome-wide approach that exploits unique features of Saccharomyces cerevisiae purebred and Saccharomyces cerevisiae/Saccharomyces bayanus hybrid diploids. We find that DNA double-strand breaks (DSBs) induce NAHR-dependent rearrangements using Ty elements located 12 to 48 kilobases distal to the break site. This break-distal recombination (BDR) occurs frequently, even when allelic recombination can repair the break using the homolog. Robust BDR-dependent NAHR demonstrates that sequences very distal to DSBs can effectively compete with proximal sequences for repair of the break. In addition, our analysis of NAHR partner choice between Ty repeats shows that intrachromosomal Ty partners are preferred despite the abundance of potential interchromosomal Ty partners that share higher sequence identity. This competitive advantage of intrachromosomal Tys results from the relative efficiencies of different NAHR repair pathways. Finally, NAHR generates deleterious rearrangements more frequently when DSBs occur outside rather than within a Ty repeat. These findings yield insights into mechanisms of repeat-mediated genome rearrangements associated with evolution and cancer.

  1. Pure homology of algebraic varieties

    OpenAIRE

    Weber, Andrzej

    2003-01-01

    We show that for a complete complex algebraic variety the pure component of homology coincides with the image of intersection homology. Therefore pure homology is topologically invariant. To obtain slightly more general results we introduce "image homology" for noncomplete varieties.

  2. Nature and distribution of feline sarcoma virus nucleotide sequences.

    Science.gov (United States)

    Frankel, A E; Gilbert, J H; Porzig, K J; Scolnick, E M; Aaronson, S A

    1979-01-01

    The genomes of three independent isolates of feline sarcoma virus (FeSV) were compared by molecular hybridization techniques. Using complementary DNAs prepared from two strains, SM- and ST-FeSV, common complementary DNAs were selected by sequential hybridization to FeSV and feline leukemia virus RNAs. These DNAs were shown to be highly related among the three independent sarcoma virus isolates. FeSV-specific complementary DNAs were prepared by selection for hybridization by the homologous FeSV RNA and against hybridization by feline leukemia virus RNA. Sarcoma virus-specific sequences of SM-FeSV were shown to differ from those of either ST- or GA-FeSV strains, whereas ST-FeSV-specific DNA shared extensive sequence homology with GA-FeSV. By molecular hybridization, each set of FeSV-specific sequences was demonstrated to be present in normal cat cellular DNA in approximately one copy per haploid genome and was conserved throughout Felidae. In contrast, FeSV-common sequences were present in multiple DNA copies and were found only in Mediterranean cats. The present results are consistent with the concept that each FeSV strain has arisen by a mechanism involving recombination between feline leukemia virus and cat cellular DNA sequences, the latter represented within the cat genome in a manner analogous to that of a cellular gene. PMID:225544

  3. [Preparation of monoclonal antibody against 4-amylphenol and homology modeling of its Fv fragment].

    Science.gov (United States)

    Cheng, Lei; Wu, Haizhen; Fei, Jing; Zhang, Lujia; Ye, Jiang; Zhang, Huizhan

    2017-03-01

    Objective: To prepare and characterize a monoclonal antibody (mAb) against 4-amylphenol (4-AP), clone its cDNA sequence and perform homology modeling of its Fv fragment. Methods: A high-affinity anti-4-AP mAb was generated from a hybridoma cell line, F10, obtained by electrofusion of splenocytes from an APA-BSA-immunized mouse with Sp2/0 myeloma cells. We then extracted the mRNA of F10 cells and cloned the cDNA of the mAb. Homology modeling and molecular docking of its Fv fragment were conducted with biological software. Results: Under the optimum conditions, the ic-ELISA equation was $y = A_2 + (A_1 - A_2)/\left(1 + (x/x_0)^p\right)$ ($A_1 = 1.28$; $A_2 = -0.066$; $x_0 = 12560.75$; $p = 0.74$) with a correlation coefficient ($R^2$) of 0.997. The lowest detectable limit was 0.65 μg/mL. The heavy and light chains of the mAb belonged to IgG1 and kappa, respectively. The homology modeling and molecular docking studies revealed that the binding of 4-AP to the mAb was attributable to hydrogen bonding and hydrophobic interactions. Conclusion: The study successfully established a stable 4-AP mAb-secreting hybridoma cell line. The study of the spatial structure of the Fv fragment using homology modeling provides a reference for the development and design of single chain variable fragments.
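The calibration curve reported in this abstract has the form of a four-parameter logistic function. The sketch below evaluates it with the parameter values quoted in the abstract; the function name is hypothetical, and the units of x are not stated in the abstract, so they are left unspecified here.

```python
def four_parameter_logistic(x: float,
                            a1: float = 1.28,
                            a2: float = -0.066,
                            x0: float = 12560.75,
                            p: float = 0.74) -> float:
    """Evaluate y = A2 + (A1 - A2) / (1 + (x / x0) ** p).

    Defaults are the fitted values quoted in the abstract. A1 is the response
    at x = 0, A2 the limiting response at large x, x0 the midpoint, and p the
    slope factor. The units of x are an open assumption.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    return a2 + (a1 - a2) / (1.0 + (x / x0) ** p)


if __name__ == "__main__":
    # At x = x0 the response is halfway between A1 and A2.
    for x in (0.0, 1000.0, 12560.75, 100000.0):
        print(x, round(four_parameter_logistic(x), 3))
```

With a positive slope factor the curve decreases as x grows, which matches the behavior expected of an indirect competitive ELISA, where more free analyte gives a lower signal.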

  4. The complete nucleotide sequence, genome organization, and origin of human adenovirus type 11

    International Nuclear Information System (INIS)

    Stone, Daniel; Furthmann, Anne; Sandig, Volker; Lieber, Andre

    2003-01-01

    The complete DNA sequence and transcription map of human adenovirus type 11 are reported here. This is the first published sequence for a subgenus B human adenovirus and demonstrates a genome organization highly similar to those of other human adenoviruses. All of the genes from the early, intermediate, and late regions are present in the expected locations of the genome for a human adenovirus. The genome is 34,794 bp in length and has a GC content of 48.9%. Sequence alignment with genomes of groups A (Ad12), C (Ad5), D (Ad17), E (Simian adenovirus 25), and F (Ad40) revealed homologies of 64, 54, 68, 75, and 52%, respectively. Detailed genomic analysis demonstrated that Ads 11 and 35 are highly conserved in all areas except the hexon hypervariable regions and fiber. Similarly, comparison of Ad11 with subgroup E SAV25 revealed poor homology between fibers but high homology in proteins encoded by all other areas of the genome. We propose an evolutionary model in which functional viruses can be reconstituted following fiber substitution from one serotype to another. According to this model either the Ad11 genome is a derivative of Ad35, from which the fiber was substituted with Ad7, or the Ad35 genome is the product of a fiber substitution from Ad21 into the Ad11 genome. This model also provides a possible explanation for the origin of group E Ads, which are evolutionarily derived from a group C fiber substitution into a group B genome.
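A GC content figure such as the one quoted above is the share of G and C bases among all called bases. A minimal sketch follows, assuming a plain DNA string as input; the function name and the toy sequences are hypothetical, and the decision to ignore ambiguous bases such as N is a choice made here, not something stated in the source.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a DNA string."""
    called = [base for base in sequence.upper() if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in called if base in "GC")
    return 100.0 * gc / len(called)


if __name__ == "__main__":
    # Hypothetical toy sequence; N is ignored in both numerator and denominator.
    print(gc_content("ATGCNNGGCA"))
```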

  5. High resolution sequence stratigraphy in China

    International Nuclear Information System (INIS)

    Zhang Shangfeng; Zhang Changmin; Yin Yanshi; Yin Taiju

    2008-01-01

    Since high resolution sequence stratigraphy was introduced into China by DENG Hong-wen in 1995, it has gone through two stages of development: an initial stage of theoretical research and of developing theory and application, and a stage of theoretical maturity and wide application, which it is now entering. Practice has shown that high resolution sequence stratigraphy plays an increasingly important role in the exploration and development of oil and gas in China's continental oil-bearing basins, and the research field has spread to the exploration of coal, uranium and other strata deposits. However, the theory of high resolution sequence stratigraphy still has shortcomings and should be improved in many respects. The authors point out that high resolution sequence stratigraphy should be characterized quantitatively and modeled using computer techniques. (authors)

  6. Lectures on functor homology

    CERN Document Server

    Touzé, Antoine

    2015-01-01

    This book features a series of lectures that explores three different fields in which functor homology (short for homological algebra in functor categories) has recently played a significant role. For each of these applications, the functor viewpoint provides both essential insights and new methods for tackling difficult mathematical problems. In the lectures by Aurélien Djament, polynomial functors appear as coefficients in the homology of infinite families of classical groups, e.g. general linear groups or symplectic groups, and their stabilization. Djament’s theorem states that this stable homology can be computed using only the homology with trivial coefficients and the manageable functor homology. The series includes an intriguing development of Scorichenko’s unpublished results. The lectures by Wilberd van der Kallen lead to the solution of the general cohomological finite generation problem, extending Hilbert’s fourteenth problem and its solution to the context of cohomology. The focus here is o...

  7. Epitopes of human testis-specific lactate dehydrogenase deduced from a cDNA sequence

    International Nuclear Information System (INIS)

    Millan, J.L.; Driscoll, C.E.; LeVan, K.M.; Goldberg, E.

    1987-01-01

    The sequence and structure of human testis-specific L-lactate dehydrogenase [LDHC4, LDHX; (L)-lactate:NAD+ oxidoreductase, EC 1.1.1.27] have been derived from analysis of a complementary DNA (cDNA) clone comprising the complete protein coding region of the enzyme. From the deduced amino acid sequence, human LDHC4 is as different from rodent LDHC4 (73% homology) as it is from human LDHA4 (76% homology) and porcine LDHB4 (68% homology). Subunit homologies are consistent with the conclusion that the LDHC gene arose by at least two independent duplication events. Furthermore, the lower degree of homology between mouse and human LDHC4 and the appearance of this isozyme late in evolution suggest a higher rate of mutation in the mammalian LDHC genes than in the LDHA and -B genes. Comparison of exposed amino acid residues of discrete antigenic determinants of mouse and human LDHC4 reveals significant differences. Knowledge of the human LDHC4 sequence will help design human-specific peptides useful in the development of a contraceptive vaccine.

  8. [Sequencing and analysis of the complete genome of a rabies virus isolate from Sika deer].

    Science.gov (United States)

    Zhao, Yun-Jiao; Guo, Li; Huang, Ying; Zhang, Li-Shi; Qian, Ai-Dong

    2008-05-01

    One DRV strain was isolated from sika deer brain and sequenced. Nine overlapping gene fragments were amplified by RT-PCR with the 3'-RACE and 5'-RACE methods, and the complete DRV genome sequence was assembled. The complete genome is 11,863 bp long. The DRV genome organization was similar to that of other rabies viruses, comprising five genes, and the initiation and termination sites were highly conserved. There were amino acid mutations in important antigenic sites of the nucleoprotein and glycoprotein. The nucleotide and amino acid homologies of the N, P, M, G and L genes were compared among strains with complete genome sequences. A phylogenetic tree was constructed by comparison with the N gene sequences of other typical rabies viruses. These results indicated that DRV belonged to genotype 1. The highest homology, 94%, was with the Chinese vaccine strain 3aG, and the lowest, 71%, was with WCBV. These findings provide a theoretical reference for further research on rabies virus.

  9. Identification and Partial Characterization of Potential FtsL and FtsQ Homologs of Chlamydia

    Directory of Open Access Journals (Sweden)

    Scot P Ouellette

    2015-11-01

    Chlamydia is amongst the rare bacteria that lack the critical cell division protein FtsZ. By annotation, Chlamydia also lacks several other essential cell division proteins including the FtsLBQ complex that links the early (e.g. FtsZ) and late (e.g. FtsI/Pbp3) components of the division machinery. Here, we report chlamydial FtsL and FtsQ homologs. Ct271 aligned well with E. coli FtsL and shared sequence homology with it, including a predicted leucine-zipper like motif. Based on in silico modeling, we show that Ct764 has structural homology to FtsQ in spite of little sequence similarity. Importantly, ct271/ftsL and ct764/ftsQ are present within all sequenced chlamydial genomes and are expressed during the replicative phase of the chlamydial developmental cycle, two key characteristics for a chlamydial cell division gene. GFP-Ct764 localized to the division septum of dividing transformed chlamydiae, and, importantly, over-expression inhibited chlamydial development. Using a bacterial two-hybrid approach, we show that Ct764 interacted with other components of the chlamydial division apparatus. However, Ct764 was not capable of complementing an E. coli FtsQ depletion strain in spite of its ability to interact with many of the same division proteins as E. coli FtsQ, suggesting that chlamydial FtsQ may function differently. We previously proposed that Chlamydia uses MreB and other rod-shape determining proteins as an alternative system for organizing the division site and its apparatus. Chlamydial FtsL and FtsQ homologs expand the number of identified chlamydial cell division proteins and suggest that Chlamydia has likely kept the late components of the division machinery while substituting the Mre system for the early components.

  10. Multiple Evolutionary Events Involved in Maintaining Homologs of Resistance to Powdery Mildew 8 in Brassica napus.

    Science.gov (United States)

    Li, Qin; Li, Jing; Sun, Jin-Long; Ma, Xian-Feng; Wang, Ting-Ting; Berkey, Robert; Yang, Hui; Niu, Ying-Ze; Fan, Jing; Li, Yan; Xiao, Shunyuan; Wang, Wen-Ming

    2016-01-01

    The Resistance to Powdery Mildew 8 (RPW8) locus confers broad-spectrum resistance to powdery mildew in Arabidopsis thaliana. There are four Homologous to RPW8s (BrHRs) in Brassica rapa and three in Brassica oleracea (BoHRs). Brassica napus (Bn) is derived from diploidization of a hybrid between B. rapa and B. oleracea, and thus should have seven homologs of RPW8 (BnHRs). It is unclear whether these genes have been maintained or lost in B. napus after diploidization and how they might have evolved. Here, we report the identification and sequence polymorphisms of BnHRs from a set of B. napus accessions. Our data indicate that while the BoHR copy from B. oleracea is highly conserved, the BrHR copy from B. rapa is relatively variable in the B. napus genome owing to multiple evolutionary events, such as gene loss, point mutation, insertion, deletion, and intragenic recombination. Given the overall high sequence homology of BnHR genes, it is not surprising that intragenic recombination was detected both between two orthologs and between two paralogs in B. napus, which may explain the loss of BoHR genes in some B. napus accessions. When ectopically expressed in Arabidopsis, C-terminally truncated versions of BnHRa and BnHRb, as well as the full-length BnHRd, each fused with YFP at the C-terminus, could trigger cell death in the absence of pathogens and enhanced resistance to powdery mildew disease. Moreover, subcellular localization analysis showed that both BnHRa-YFP and BnHRb-YFP were mainly localized to the extra-haustorial membrane encasing the haustorium of powdery mildew. Taken together, our data suggest that the duplicated BnHR genes might have been subjected to differential selection and that at least some may play a role in defense and could serve as a resistance resource in engineering disease-resistant plants.

  11. Protein secondary structure prediction for a single-sequence using hidden semi-Markov models

    Directory of Open Access Journals (Sweden)

    Borodovsky Mark

    2006-03-01

    Full Text Available Abstract Background The accuracy of protein secondary structure prediction has been improving steadily towards the 88% estimated theoretical limit. There are two types of prediction algorithms: Single-sequence prediction algorithms imply that information about other (homologous) proteins is not available, while algorithms of the second type imply that information about homologous proteins is available, and use it intensively. The single-sequence algorithms could make an important contribution to studies of proteins with no detected homologs; however, the accuracy of protein secondary structure prediction from a single sequence is not as high as when the additional evolutionary information is present. Results In this paper, we further refine and extend the hidden semi-Markov model (HSMM) initially considered in the BSPSS algorithm. We introduce an improved residue dependency model by considering the patterns of statistically significant amino acid correlation at structural segment borders. We also derive models that specialize on different sections of the dependency structure and incorporate them into HSMM. In addition, we implement an iterative training method to refine estimates of HSMM parameters. The three-state-per-residue accuracy and other accuracy measures of the new method, IPSSP, are shown to be comparable to or better than those for BSPSS as well as for PSIPRED, tested under the single-sequence condition. Conclusions We have shown that new dependency models and training methods bring further improvements to single-sequence protein secondary structure prediction. The results are obtained under cross-validation conditions using a dataset with no pair of sequences having significant sequence similarity. As new sequences are added to the database it is possible to augment the dependency structure and obtain even higher accuracy. Current and future advances should contribute to the improvement of function prediction for orphan proteins inscrutable
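
    The three-state-per-residue accuracy mentioned in this abstract is commonly reported as Q3: the fraction of residues whose predicted state (helix, strand or coil) equals the observed state. The sketch below only illustrates that measure; the function name and the two toy state strings are hypothetical and are not taken from the IPSSP paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state (H, E or C) matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("state strings must have equal length")
    if not observed:
        raise ValueError("state strings must not be empty")
    # Count positions where the two per-residue state strings agree.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical toy example: 17 residues, 2 of them mispredicted.
observed = "CCHHHHHHCCEEEEECC"
predicted = "CCHHHHHCCCEEEECCC"
print(f"Q3 = {q3_accuracy(predicted, observed):.3f}")
```

    In this toy case 15 of 17 residues agree, so Q3 is about 0.882.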

  12. Identification and characterization of microRNAs from peanut (Arachis hypogaea L.) by high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Xiaoyuan Chi

    Full Text Available BACKGROUND: MicroRNAs (miRNAs) are noncoding RNAs of approximately 21 nt that regulate gene expression in plants post-transcriptionally by endonucleolytic cleavage or translational inhibition. miRNAs play essential roles in numerous developmental and physiological processes and many of them are conserved across species. Extensive studies of miRNAs have been done in a few model plants; however, less is known about the diversity of these regulatory RNAs in peanut (Arachis hypogaea L.), one of the most important oilseed crops cultivated worldwide. RESULTS: A library of small RNA from peanut was constructed for deep sequencing. In addition to 126 known miRNAs from 33 families, 25 novel peanut miRNAs were identified. The miRNA* sequences of four novel miRNAs were discovered, providing additional evidence for the existence of miRNAs. Twenty of the novel miRNAs were considered to be species-specific because no homolog has been found for other plant species. qRT-PCR was used to analyze the expression of seven miRNAs in different tissues and in seed at different developmental stages and some showed tissue- and/or growth stage-specific expression. Furthermore, potential targets of these putative miRNAs were predicted on the basis of the sequence homology search. CONCLUSIONS: We have identified large numbers of miRNAs and their related target genes through deep sequencing of a small RNA library. This study of the identification and characterization of miRNAs in peanut can initiate further study on peanut miRNA regulation mechanisms, and help toward a greater understanding of the important roles of miRNAs in peanut.

  13. Mouse tetranectin: cDNA sequence, tissue-specific expression, and chromosomal mapping

    DEFF Research Database (Denmark)

    Ibaraki, K; Kozak, C A; Wewer, U M

    1995-01-01

    regulation, mouse tetranectin cDNA was cloned from a 16-day-old mouse embryo library. Sequence analysis revealed a 992-bp cDNA with an open reading frame of 606 bp, which is identical in length to the human tetranectin cDNA. The deduced amino acid sequence showed high homology to the human cDNA with 76......(s) of tetranectin. The sequence analysis revealed a difference in both sequence and size of the noncoding regions between mouse and human cDNAs. Northern analysis of the various tissues from mouse, rat, and cow showed the major transcript(s) to be approximately 1 kb, which is similar in size to that observed...

  14. Isolation and characterization of a FLOWERING LOCUS T homolog from pineapple (Ananas comosus (L.) Merr).

    Science.gov (United States)

    Lv, LingLing; Duan, Jun; Xie, JiangHui; Wei, ChangBin; Liu, YuGe; Liu, ShengHui; Sun, GuangMing

    2012-09-01

    FLOWERING LOCUS T (FT)-like genes are crucial regulators of flowering in angiosperms. A homolog of FT, designated as AcFT (GenBank ID: HQ343233), was isolated from pineapple cultivar Comte de Paris by reverse transcriptase polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE). The cDNA sequence of AcFT is 915 bp in length and contains an ORF of 534 bp, which encodes a protein of 177 aa. Molecular weight was 19.9 kDa and isoelectric point was 6.96. The deduced protein sequence of AcFT was 84% and 82% identical to homologs encoded by CgFT in Cymbidium goeringii and OgFT in Oncidium Gower Ramsey respectively. Quantitative real-time PCR (qRT-PCR) analyses showed that the expression of AcFT was high in flesh and none in leaves. qRT-PCR analyses in different stages indicated that the expression of AcFT reached the highest level on 40 d after flower inducing, when the multiple fruit and floral organs were forming. The 35S::AcFT transgenic Arabidopsis plants flowered earlier and had more inflorescences or branches than wild type plants. Copyright © 2012 Elsevier B.V. All rights reserved.
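
    Percent identity figures such as the 84% and 82% quoted above are computed from a pairwise alignment. The sketch below assumes two already-aligned sequences and counts identical residues over columns where neither sequence has a gap; other conventions for the denominator exist, and the abstract does not say which one was used. The function name and the toy sequences are hypothetical and are not AcFT data.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence carries a gap character.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in pairs)
    return 100.0 * identical / len(pairs)


# Hypothetical toy alignment: 6 ungapped columns, 5 of them identical.
print(f"{percent_identity('MKT-LLV', 'MKSALLV'):.1f}% identity")
```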

  15. Homology modelling and docking analysis of L-lactate dehydrogenase from Streptococcus thermophilus

    Directory of Open Access Journals (Sweden)

    Vukić Vladimir R.

    2016-01-01

    Full Text Available The aim of this research was to create a three-dimensional model of L-lactate dehydrogenase from the main yoghurt starter culture - Streptococcus thermophilus, to analyse its structural features and investigate substrate binding in the active site. NCBI BlastP was used against the Protein Data Bank database in order to identify the template for construction of homology models. Multiple sequence alignment was performed using the program MUSCLE within the UGENE 1.11.3 program. Homology models were constructed using the program Modeller v. 9.17. The obtained 3D model was verified by Ramachandran plots. Molecular docking simulations were performed using the program Surflex-Dock. The highest sequence similarity was observed with L-lactate dehydrogenase from Lactobacillus casei subsp. casei, with 69% identity. Therefore, its structure (PDB ID: 2ZQY:A) was selected as the template for homology modelling. The active residues predicted by sequence similarity are HIS181 in S. thermophilus and HIS179 in S. aureus. The binding energy of pyruvate to L-lactate dehydrogenase of S. thermophilus was -7.874 kcal/mol. Pyruvate in L-lactate dehydrogenase of S. thermophilus makes H bonds with the catalytic HIS181 (1.9 Å), as well as with THR235 (3.6 Å). Although our results indicate similar positions of the substrates in L-lactate dehydrogenase of S. thermophilus and S. aureus, differences in substrate distances and binding energy values could influence the reaction rate. Based on these results, the L-lactate dehydrogenase model proposed here could be used as a guide for further research, such as transition states of the reaction through molecular dynamics. [Project of the Ministry of Science of the Republic of Serbia, no. III 46009]

  16. MODexplorer: an integrated tool for exploring protein sequence, structure and function relationships.

    KAUST Repository

    Kosinski, Jan; Barbato, Alessandro; Tramontano, Anna

    2013-01-01

    SUMMARY: MODexplorer is an integrated tool aimed at exploring the sequence, structural and functional diversity in protein families useful in homology modeling and in analyzing protein families in general. It takes as input either the sequence or the structure of a protein and provides alignments with its homologs along with a variety of structural and functional annotations through an interactive interface. The annotations include sequence conservation, similarity scores, ligand-, DNA- and RNA-binding sites, secondary structure, disorder, crystallographic structure resolution and quality scores of models implied by the alignments to the homologs of known structure. MODexplorer can be used to analyze sequence and structural conservation among the structures of similar proteins, to find structures of homologs solved in different conformational state or with different ligands and to transfer functional annotations. Furthermore, if the structure of the query is not known, MODexplorer can be used to select the modeling templates taking all this information into account and to build a comparative model. AVAILABILITY AND IMPLEMENTATION: Freely available on the web at http://modorama.biocomputing.it/modexplorer. Website implemented in HTML and JavaScript with all major browsers supported. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

  17. MODexplorer: an integrated tool for exploring protein sequence, structure and function relationships.

    KAUST Repository

    Kosinski, Jan

    2013-02-08

    SUMMARY: MODexplorer is an integrated tool aimed at exploring the sequence, structural and functional diversity in protein families useful in homology modeling and in analyzing protein families in general. It takes as input either the sequence or the structure of a protein and provides alignments with its homologs along with a variety of structural and functional annotations through an interactive interface. The annotations include sequence conservation, similarity scores, ligand-, DNA- and RNA-binding sites, secondary structure, disorder, crystallographic structure resolution and quality scores of models implied by the alignments to the homologs of known structure. MODexplorer can be used to analyze sequence and structural conservation among the structures of similar proteins, to find structures of homologs solved in different conformational state or with different ligands and to transfer functional annotations. Furthermore, if the structure of the query is not known, MODexplorer can be used to select the modeling templates taking all this information into account and to build a comparative model. AVAILABILITY AND IMPLEMENTATION: Freely available on the web at http://modorama.biocomputing.it/modexplorer. Website implemented in HTML and JavaScript with all major browsers supported. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

  18. Comparative anatomy of the human APRT gene and enzyme: nucleotide sequence divergence and conservation of a nonrandom CpG dinucleotide arrangement

    International Nuclear Information System (INIS)

    Broderick, T.P.; Schaff, D.A.; Bertino, A.M.; Dush, M.K.; Tischfield, J.A.; Stambrook, P.J.

    1987-01-01

    The functional human adenine phosphoribosyltransferase (APRT) gene is <2.6 kilobases in length and contains five exons. The amino acid sequences of APRTs have been highly conserved throughout evolution. The human enzyme is 82%, 90%, and 40% identical to the mouse, hamster, and Escherichia coli enzymes, respectively. The promoter region of the human APRT gene, like that of several other housekeeping genes, lacks TATA and CCAAT boxes but contains five GC boxes that are potential binding sites for the Sp1 transcription factor. The distal three, however, are dispensable for gene expression. Comparison between human and mouse APRT gene nucleotide sequences reveals a high degree of homology within protein coding regions but an absence of significant homology in 5' flanking, 3' untranslated, and intron sequences, except for similarly positioned GC boxes in the promoter region and a 26-base-pair region in intron 3. This 26-base-pair sequence is 92% identical with a similarly positioned sequence in the mouse gene and is also found in intron 3 of the hamster gene, suggesting that its retention may be a consequence of stringent selection. The positions of all introns have been precisely retained in the human and both rodent genes. Retention of an elevated CpG dinucleotide content, despite loss of sequence homology, suggests that there may be selection for CpG dinucleotides in these regions and that their maintenance may be important for APRT gene function
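
    "Elevated CpG dinucleotide content" is often quantified as the ratio of observed to expected CpG dinucleotides, (number of CpG x sequence length) / (number of C x number of G). The sketch below computes that ratio as an illustration only; the abstract does not state which statistic the authors used, and the function name and toy sequence are hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    cpg_count = seq.count("CG")
    if c_count == 0 or g_count == 0:
        return 0.0
    return cpg_count * n / (c_count * g_count)


# Hypothetical toy sequence: 10 bases, 3 C, 4 G, 3 CpG dinucleotides.
print(cpg_obs_exp("ACGTCGCGGA"))
```

    A ratio near or above 1 means CpG occurs about as often as base composition alone would predict, whereas bulk vertebrate DNA is typically well below that.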

  19. Identification and Analysis of Red Sea Mangrove (Avicennia marina) microRNAs by High-Throughput Sequencing and Their Association with Stress Responses

    KAUST Repository

    Khraiwesh, Basel; Pugalenthi, Ganesan; Fedoroff, Nina V.

    2013-01-01

    Although RNA silencing has been studied primarily in model plants, advances in high-throughput sequencing technologies have enabled profiling of the small RNA components of many more plant species, providing insights into the ubiquity and conservatism of some miRNA-based regulatory mechanisms. Small RNAs of 20 to 24 nucleotides (nt) are important regulators of gene transcript levels by either transcriptional or by posttranscriptional gene silencing, contributing to genome maintenance and controlling a variety of developmental and physiological processes. Here, we used deep sequencing and molecular methods to create an inventory of the small RNAs in the mangrove species, Avicennia marina. We identified 26 novel mangrove miRNAs and 193 conserved miRNAs belonging to 36 families. We determined that 2 of the novel miRNAs were produced from known miRNA precursors and 4 were likely to be species-specific by the criterion that we found no homologs in other plant species. We used qRT-PCR to analyze the expression of miRNAs and their target genes in different tissue sets and some demonstrated tissue-specific expression. Furthermore, we predicted potential targets of these putative miRNAs based on a sequence homology and experimentally validated through endonucleolytic cleavage assays. Our results suggested that expression profiles of miRNAs and their predicted targets could be useful in exploring the significance of the conservation patterns of plants, particularly in response to abiotic stress. Because of their well-developed abilities in this regard, mangroves and other extremophiles are excellent models for such exploration. © 2013 Khraiwesh et al.

  20. Identification and analysis of red sea mangrove (Avicennia marina) microRNAs by high-throughput sequencing and their association with stress responses.

    Directory of Open Access Journals (Sweden)

    Basel Khraiwesh

    Full Text Available Although RNA silencing has been studied primarily in model plants, advances in high-throughput sequencing technologies have enabled profiling of the small RNA components of many more plant species, providing insights into the ubiquity and conservatism of some miRNA-based regulatory mechanisms. Small RNAs of 20 to 24 nucleotides (nt) are important regulators of gene transcript levels by either transcriptional or by posttranscriptional gene silencing, contributing to genome maintenance and controlling a variety of developmental and physiological processes. Here, we used deep sequencing and molecular methods to create an inventory of the small RNAs in the mangrove species, Avicennia marina. We identified 26 novel mangrove miRNAs and 193 conserved miRNAs belonging to 36 families. We determined that 2 of the novel miRNAs were produced from known miRNA precursors and 4 were likely to be species-specific by the criterion that we found no homologs in other plant species. We used qRT-PCR to analyze the expression of miRNAs and their target genes in different tissue sets and some demonstrated tissue-specific expression. Furthermore, we predicted potential targets of these putative miRNAs based on a sequence homology and experimentally validated through endonucleolytic cleavage assays. Our results suggested that expression profiles of miRNAs and their predicted targets could be useful in exploring the significance of the conservation patterns of plants, particularly in response to abiotic stress. Because of their well-developed abilities in this regard, mangroves and other extremophiles are excellent models for such exploration.

  1. Identification and Analysis of Red Sea Mangrove (Avicennia marina) microRNAs by High-Throughput Sequencing and Their Association with Stress Responses

    KAUST Repository

    Khraiwesh, Basel

    2013-04-08

    Although RNA silencing has been studied primarily in model plants, advances in high-throughput sequencing technologies have enabled profiling of the small RNA components of many more plant species, providing insights into the ubiquity and conservatism of some miRNA-based regulatory mechanisms. Small RNAs of 20 to 24 nucleotides (nt) are important regulators of gene transcript levels by either transcriptional or by posttranscriptional gene silencing, contributing to genome maintenance and controlling a variety of developmental and physiological processes. Here, we used deep sequencing and molecular methods to create an inventory of the small RNAs in the mangrove species, Avicennia marina. We identified 26 novel mangrove miRNAs and 193 conserved miRNAs belonging to 36 families. We determined that 2 of the novel miRNAs were produced from known miRNA precursors and 4 were likely to be species-specific by the criterion that we found no homologs in other plant species. We used qRT-PCR to analyze the expression of miRNAs and their target genes in different tissue sets and some demonstrated tissue-specific expression. Furthermore, we predicted potential targets of these putative miRNAs based on a sequence homology and experimentally validated through endonucleolytic cleavage assays. Our results suggested that expression profiles of miRNAs and their predicted targets could be useful in exploring the significance of the conservation patterns of plants, particularly in response to abiotic stress. Because of their well-developed abilities in this regard, mangroves and other extremophiles are excellent models for such exploration. © 2013 Khraiwesh et al.

  2. TIMPs of parasitic helminths - a large-scale analysis of high-throughput sequence datasets.

    Science.gov (United States)

    Cantacessi, Cinzia; Hofmann, Andreas; Pickering, Darren; Navarro, Severine; Mitreva, Makedonka; Loukas, Alex

    2013-05-30

    Tissue inhibitors of metalloproteases (TIMPs) are a multifunctional family of proteins that orchestrate extracellular matrix turnover, tissue remodelling and other cellular processes. In parasitic helminths, such as hookworms, TIMPs have been proposed to play key roles in the host-parasite interplay, including invasion of and establishment in the vertebrate animal hosts. Currently, knowledge of helminth TIMPs is limited to a small number of studies on canine hookworms, whereas no information is available on the occurrence of TIMPs in other parasitic helminths causing neglected diseases. In the present study, we conducted a large-scale investigation of TIMP proteins of a range of neglected human parasites including the hookworm Necator americanus, the roundworm Ascaris suum, the liver flukes Clonorchis sinensis and Opisthorchis viverrini, as well as the schistosome blood flukes. This entailed mining available transcriptomic and/or genomic sequence datasets for the presence of homologues of known TIMPs, predicting secondary structures of defined protein sequences, systematic phylogenetic analyses and assessment of differential expression of genes encoding putative TIMPs in the developmental stages of A. suum, N. americanus and Schistosoma haematobium which infect the mammalian hosts. A total of 15 protein sequences with high homology to known eukaryotic TIMPs were predicted from the complement of sequence data available for parasitic helminths and subjected to in-depth bioinformatic analyses. Supported by the availability of gene manipulation technologies such as RNA interference and/or transgenesis, this work provides a basis for future functional explorations of helminth TIMPs and, in particular, of their role/s in fundamental biological pathways linked to long-term establishment in the vertebrate hosts, with a view towards the development of novel approaches for the control of neglected helminthiases.

  3. Identification of microRNAs from Eugenia uniflora by high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Guzman, Frank; Almerão, Mauricio P; Körbes, Ana P; Loss-Morais, Guilherme; Margis, Rogerio

    2012-01-01

    microRNAs or miRNAs are small non-coding regulatory RNAs that play important functions in the regulation of gene expression at the post-transcriptional level by targeting mRNAs for degradation or inhibiting protein translation. Eugenia uniflora is a plant native to tropical America with pharmacological and ecological importance, and there have been no previous studies concerning its gene expression and regulation. To date, no miRNAs have been reported in Myrtaceae species. Small RNA and RNA-seq libraries were constructed to identify miRNAs and pre-miRNAs in Eugenia uniflora. Solexa technology was used to perform high throughput sequencing of the library, and the data obtained were analyzed using bioinformatics tools. From 14,489,131 small RNA clean reads, we obtained 1,852,722 mature miRNA sequences representing 45 conserved families that have been identified in other plant species. Further analysis using contigs assembled from RNA-seq allowed the prediction of secondary structures of 25 known and 17 novel pre-miRNAs. The expression of twenty-seven identified miRNAs was also validated using RT-PCR assays. Potential targets were predicted for the most abundant mature miRNAs in the identified pre-miRNAs based on sequence homology. This study is the first large scale identification of miRNAs and their potential targets from a species of the Myrtaceae family without genomic sequence resources. Our study provides more information about the evolutionary conservation of the regulatory network of miRNAs in plants and highlights species-specific miRNAs.

  4. Somatic association of telocentric chromosomes carrying homologous centromeres in common wheat.

    Science.gov (United States)

    Mello-Sampayo, T

    1973-01-01

    Measurements of distances between telocentric chromosomes, either homologous or representing the opposite arms of a metacentric chromosome (complementary telocentrics), were made at metaphase in root tip cells of common wheat carrying two homologous pairs of complementary telocentrics of chromosome 1 B or 6 B (double ditelosomic 1 B or 6 B). The aim was to elucidate the relative locations of the telocentric chromosomes within the cell. The data obtained strongly suggest that all four telocentrics of chromosome 1 B or 6 B are spatially and simultaneously co-associated. In plants carrying two complementary (6 B (S) and 6 B (L)) and a non-related (5 B (L)) telocentric, only the complementary chromosomes were found to be somatically associated. It is thought, therefore, that the somatic association of chromosomes may involve more than two chromosomes in the same association and, since complementary telocentrics are as much associated as homologous, that the homology between centromeres (probably the only homologous region that exists between complementary telocentrics) is a very important condition for somatic association of chromosomes. The spatial arrangement of chromosomes was studied at anaphase and prophase and the polar orientation of chromosomes at prophase was found to resemble anaphase orientation. This was taken as good evidence for the maintenance of the chromosome arrangement - the Rabl orientation - and of the peripheral location of the centromere and its association with the nuclear membrane. Within this general arrangement homologous telocentric chromosomes were frequently seen to have their centromeres associated or directed towards each other. The role of the centromere in somatic association as a spindle fibre attachment and chromosome binder is discussed. It is suggested that for non-homologous chromosomes to become associated in root tips, the only requirement needed should be the homology of centromeres such as exists between complementary

  5. The HMMER Web Server for Protein Sequence Similarity Search.

    Science.gov (United States)

    Prakash, Ananth; Jeffryes, Matt; Bateman, Alex; Finn, Robert D

    2017-12-08

    Protein sequence similarity search is one of the most commonly used bioinformatics methods for identifying evolutionarily related proteins. In general, sequences that are evolutionarily related share some degree of similarity, and sequence-search algorithms use this principle to identify homologs. The requirement for a fast and sensitive sequence search method led to the development of the HMMER software, which in the latest version (v3.1) uses a combination of sophisticated acceleration heuristics and mathematical and computational optimizations to enable the use of profile hidden Markov models (HMMs) for sequence analysis. The HMMER Web server provides a common platform by linking the HMMER algorithms to databases, thereby enabling the search for homologs, as well as providing sequence and functional annotation by linking external databases. This unit describes three basic protocols and two alternate protocols that explain how to use the HMMER Web server using various input formats and user defined parameters. © 2017 by John Wiley & Sons, Inc.

  6. A discriminative method for family-based protein remote homology detection that combines inductive logic programming and propositional models.

    Science.gov (United States)

    Bernardes, Juliana S; Carbone, Alessandra; Zaverucha, Gerson

    2011-03-23

    Remote homology detection is a hard computational problem. Most approaches have trained computational models by using either full protein sequences or multiple sequence alignments (MSA), including all positions. However, when we deal with proteins in the "twilight zone" we can observe that only some segments of sequences (motifs) are conserved. We introduce a novel logical representation that allows us to represent physico-chemical properties of sequences, conserved amino acid positions and conserved physico-chemical positions in the MSA. From this, Inductive Logic Programming (ILP) finds the most frequent patterns (motifs) and uses them to train propositional models, such as decision trees and support vector machines (SVM). We use the SCOP database to perform our experiments by evaluating protein recognition within the same superfamily. Our results show that our methodology, when using SVM, performs significantly better than some of the state-of-the-art methods and is comparable to others. In addition, our method provides a comprehensible set of logical rules that can help to understand what determines a protein function. The strategy of selecting only the most frequent patterns is effective for remote homology detection. This is possible through a suitable first-order logical representation of homologous properties, and through a set of frequent patterns, found by an ILP system, that summarizes essential features of protein functions.

  7. Gene Unprediction with Spurio: A tool to identify spurious protein sequences.

    Science.gov (United States)

    Höps, Wolfram; Jeffryes, Matt; Bateman, Alex

    2018-01-01

    We now have access to the sequences of tens of millions of proteins. These protein sequences are essential for modern molecular biology and computational biology. The vast majority of protein sequences are derived from gene prediction tools and have no experimental supporting evidence for their translation.  Despite the increasing accuracy of gene prediction tools there likely exists a large number of spurious protein predictions in the sequence databases.  We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes.  Spurio searches the query protein sequence against a prokaryotic nucleotide database using tblastn and identifies homologous sequences. The tblastn matches are used to score the query sequence's likelihood of being a spurious protein prediction using a Gaussian process model. The most informative feature is the appearance of stop codons within the presumed translation of homologous DNA sequences. Benchmarking shows that the Spurio tool is able to distinguish spurious from true proteins. However, transposon proteins are prone to be predicted as spurious because of the frequency of degraded homologs found in the DNA sequence databases. Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than the AntiFam resource. The Spurio software and source code is available under an MIT license at the following URL: https://bitbucket.org/bateman-group/spurio.
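
    The most informative feature named in this abstract, stop codons inside the presumed translation of homologous DNA, can be illustrated by counting in-frame stop codons in a nucleotide sequence. The sketch below is not Spurio's implementation (which works on tblastn matches and scores them with a Gaussian process model); it only shows the counting step, using the three standard stop codons. The function name and toy sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def count_internal_stops(dna: str, frame: int = 0) -> int:
    """Count in-frame stop codons, ignoring one terminal stop if present."""
    dna = dna.upper()
    # Split into complete codons starting at the chosen frame offset (0, 1 or 2).
    codons = [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]
    # A stop codon at the very end is the normal end of an ORF, not a disruption.
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]
    return sum(codon in STOP_CODONS for codon in codons)


# Hypothetical toy sequences: the first carries one premature TAG, the second none.
print(count_internal_stops("ATGAAATAGGGCTAA"))
print(count_internal_stops("ATGAAAGGCTAA"))
```

    A homologous region with many internal stops is unlikely to encode a functional protein, which is the intuition behind using this feature to flag spurious predictions.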

  8. Mod two homology and cohomology

    CERN Document Server

    Hausmann, Jean-Claude

    2014-01-01

    Cohomology and homology modulo 2 help the reader grasp more readily the basics of a major tool in algebraic topology. Compared to a more general approach to (co)homology, this refreshing approach has many pedagogical advantages: it leads more quickly to the essentials of the subject; an absence of signs and orientation considerations simplifies the theory; computations and advanced applications can be presented at an earlier stage; and (co)chains have simple geometrical interpretations. Mod 2 (co)homology was developed in the first quarter of the twentieth century as an alternative to integral homology, before both became particular cases of (co)homology with arbitrary coefficients. The first chapters of this book may serve as a basis for a graduate-level introductory course to (co)homology. Simplicial and singular mod 2 (co)homology are introduced, with their products and Steenrod squares, as well as equivariant cohomology. Classical applications include Brouwer's fixed point theorem, Poincaré duality, Borsuk-Ula...

  9. Mutagenic Organized Recombination Process by Homologous IN vivo Grouping (MORPHING) for directed enzyme evolution.

    Science.gov (United States)

    Gonzalez-Perez, David; Molina-Espeja, Patricia; Garcia-Ruiz, Eva; Alcalde, Miguel

    2014-01-01

    Approaches that depend on directed evolution require reliable methods to generate DNA diversity so that mutant libraries can focus on specific target regions. We took advantage of the high frequency of homologous DNA recombination in Saccharomyces cerevisiae to develop a strategy for domain mutagenesis aimed at introducing and in vivo recombining random mutations in defined segments of DNA. Mutagenic Organized Recombination Process by Homologous IN vivo Grouping (MORPHING) is a one-pot random mutagenic method for short protein regions that harnesses the in vivo recombination apparatus of yeast. Using this approach, libraries can be prepared with different mutational loads in DNA segments of less than 30 amino acids so that they can be assembled into the remaining unaltered DNA regions in vivo with high fidelity. As a proof of concept, we present two eukaryotic-ligninolytic enzyme case studies: i) the enhancement of the oxidative stability of a H2O2-sensitive versatile peroxidase by independent evolution of three distinct protein segments (Leu28-Gly57, Leu149-Ala174 and Ile199-Leu268); and ii) the heterologous functional expression of an unspecific peroxygenase by exclusive evolution of its native 43-residue signal sequence.

  10. High-throughput sequence alignment using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Trapnell Cole

    2007-12-01

    Full Text Available Background: The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results: This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion: MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU.

  11. HOMOLOGY BETWEEN SEGMENTS OF HUMAN HEMOSTATIC PROTEINS AND PROTEINS OF VIRUSES WHICH CAUSE ACUTE RESPIRATORY INFECTIONS OR DISEASES WITH SIMILAR SYMPTOMS

    Directory of Open Access Journals (Sweden)

    I. N. Zhilinskaya

    2017-01-01

    Full Text Available Objectives: To identify homologous segments of human hemostatic and viral proteins and to assess the role of human hemostatic proteins in viral replication. Materials and Methods: The following viruses were chosen for comparison: influenza B (B/Astrakhan/2/2017), coronaviruses (Hcov229E and SARS-Co), type 1 adenovirus (adenoid 71), measles (ICHINOSE-BA) and rubella (Therien). The primary structures of viral proteins and 41 human hemostatic proteins were obtained from the open-access www.ncbi.nlm.nih.gov and www.nextprot.org databases, respectively. Sequence homology was determined by comparing 12-amino-acid segments. Segments identical in ≥ 8 positions were considered homologous. Results: The analysis shows that viral proteins contain segments which mimic a number of human hemostatic proteins. Most of these segments, except those of adenovirus proteins, are homologous with coagulation factors. The increase in viral virulence, as in the case of SARS-Co, correlates with an increased number of segments homologous with hemostatic proteins. Conclusion: Hemostasis plays an important role in viral replication and pathogenesis. The conclusion is consistent with the literature data about the relationship of hemostasis and inflammatory response to viral infections.
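    The segment criterion stated in the Materials and Methods (12-residue segments, homologous when identical in at least 8 positions) might be sketched in Python as follows. The abstract does not say how segments were enumerated, so the ungapped all-against-all sliding window and the function name `homologous_segments` are assumptions made for illustration only.

```python
def homologous_segments(seq_a: str, seq_b: str, window: int = 12, min_identical: int = 8):
    """Return (start_a, start_b, identical) for every pair of ungapped windows
    that match in at least `min_identical` of `window` positions.

    Assumption: every window of seq_a is compared with every window of seq_b.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        seg_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            seg_b = seq_b[j:j + window]
            # Position-by-position identity count, no gaps allowed.
            identical = sum(x == y for x, y in zip(seg_a, seg_b))
            if identical >= min_identical:
                hits.append((i, j, identical))
    return hits
```

    With this threshold two 12-residue segments sharing exactly 8 residues are reported and two sharing 7 are not. The nested loop costs time proportional to the product of the two sequence lengths, which is acceptable for single proteins but would need indexing for database-scale comparisons.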

  12. FRESCO: Referential compression of highly similar sequences.

    Science.gov (United States)

    Wandelt, Sebastian; Leser, Ulf

    2013-01-01

    In many applications, sets of similar texts or sequences are of high importance. Prominent examples are revision histories of documents or genomic sequences. Modern high-throughput sequencing technologies are able to generate DNA sequences at an ever-increasing rate. In parallel to the decreasing experimental time and cost necessary to produce DNA sequences, computational requirements for analysis and storage of the sequences are steeply increasing. Compression is a key technology to deal with this challenge. Recently, referential compression schemes, storing only the differences between a to-be-compressed input and a known reference sequence, gained a lot of interest in this field. In this paper, we propose a general open-source framework to compress large amounts of biological sequence data called Framework for REferential Sequence COmpression (FRESCO). Our basic compression algorithm is shown to be one to two orders of magnitude faster than comparable related work, while achieving similar compression ratios. We also propose several techniques to further increase compression ratios, while still retaining the advantage in speed: 1) selecting a good reference sequence; and 2) rewriting a reference sequence to allow for better compression. In addition, we propose a new way of further boosting the compression ratios by applying referential compression to already referentially compressed files (second-order compression). This technique allows for compression ratios way beyond state of the art, for instance, 4,000:1 and higher for human genomes. We evaluate our algorithms on a large data set from three different species (more than 1,000 genomes, more than 3 TB) and on a collection of versions of Wikipedia pages. Our results show that real-time compression of highly similar sequences at high compression ratios is possible on modern hardware.
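    The general idea of referential compression, storing only how an input differs from a known reference, can be shown with a small Python sketch. This is a generic greedy seed-and-extend scheme written for illustration, not FRESCO's algorithm; the function names, the k-mer index and the `("M", start, length)` / `("L", text)` operation format are all assumptions.

```python
def compress(reference: str, target: str, k: int = 4):
    """Encode target as ('M', ref_start, length) matches against the reference
    and ('L', text) literals for stretches the reference does not cover."""
    # Index the first occurrence of every k-mer in the reference.
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], i)

    ops, literal, pos = [], [], 0
    while pos < len(target):
        start = index.get(target[pos:pos + k])
        if start is None:
            # No seed here: emit this character as part of a literal run.
            literal.append(target[pos])
            pos += 1
            continue
        # Extend the seed for as long as reference and target agree.
        length = k
        while (start + length < len(reference)
               and pos + length < len(target)
               and reference[start + length] == target[pos + length]):
            length += 1
        if literal:
            ops.append(("L", "".join(literal)))
            literal = []
        ops.append(("M", start, length))
        pos += length
    if literal:
        ops.append(("L", "".join(literal)))
    return ops


def decompress(reference: str, ops) -> str:
    """Rebuild the target from the reference and the operation list."""
    parts = []
    for op in ops:
        if op[0] == "M":
            parts.append(reference[op[1]:op[1] + op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)
```

    For two 15-base sequences that differ at a single position, this sketch produces two match operations around one single-character literal, which is why highly similar inputs compress so well against a good reference.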

  13. Automated cleaning and pre-processing of immunoglobulin gene sequences from high-throughput sequencing

    Directory of Open Access Journals (Sweden)

    Miri Michaeli

    2012-12-01

    Full Text Available High throughput sequencing (HTS) yields tens of thousands to millions of sequences that require a large amount of pre-processing work to clean various artifacts. Such cleaning cannot be performed manually. Existing programs are not suitable for immunoglobulin (Ig) genes, which are variable and often highly mutated. This paper describes Ig-HTS-Cleaner (Ig High Throughput Sequencing Cleaner), a program containing a simple cleaning procedure that successfully deals with pre-processing of Ig sequences derived from HTS, and Ig-Indel-Identifier (Ig Insertion – Deletion Identifier), a program for identifying legitimate and artifact insertions and/or deletions (indels). Our programs were designed for analyzing Ig gene sequences obtained by 454 sequencing, but they are applicable to all types of sequences and sequencing platforms. Ig-HTS-Cleaner and Ig-Indel-Identifier have been implemented in Java and saved as executable JAR files, supported on Linux and MS Windows. No special requirements are needed in order to run the programs, except for correctly constructing the input files as explained in the text. The programs' performance has been tested and validated on real and simulated data sets.

  14. Zea mI, the maize homolog of the allergen-encoding Lol pI gene of rye grass.

    Science.gov (United States)

    Broadwater, A H; Rubinstein, A L; Chay, C H; Klapper, D G; Bedinger, P A

    1993-09-15

    Sequence analysis of a pollen-specific cDNA from maize has identified a homolog (Zea mI) of the gene (Lol pI) encoding the major allergen of rye-grass pollen. The protein encoded by the partial cDNA sequence is 59.3% identical and 72.7% similar to the comparable region of the reported amino acid sequence of Lol pIA. Southern analysis indicates that this cDNA represents a member of a small multigene family in maize. Northern analysis shows expression only in pollen, not in vegetative or female floral tissues. The timing of expression is developmentally regulated, occurring at a low level prior to the first pollen mitosis and at a high level after this postmeiotic division. Western analysis detects a protein in maize pollen lysates using polyclonal antiserum and monoclonal antibodies directed against purified Lolium perenne allergen.

  15. Homology-integrated CRISPR-Cas (HI-CRISPR) system for one-step multigene disruption in Saccharomyces cerevisiae.

    Science.gov (United States)

    Bao, Zehua; Xiao, Han; Liang, Jing; Zhang, Lu; Xiong, Xiong; Sun, Ning; Si, Tong; Zhao, Huimin

    2015-05-15

    One-step multiple gene disruption in the model organism Saccharomyces cerevisiae is a highly useful tool for both basic and applied research, but it remains a challenge. Here, we report a rapid, efficient, and potentially scalable strategy based on the type II Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-CRISPR associated proteins (Cas) system to generate multiple gene disruptions simultaneously in S. cerevisiae. A 100 bp dsDNA mutagenizing homologous recombination donor is inserted between two direct repeats for each target gene in a CRISPR array consisting of multiple donor and guide sequence pairs. An ultrahigh copy number plasmid carrying iCas9, a variant of wild-type Cas9, trans-encoded RNA (tracrRNA), and a homology-integrated crRNA cassette is designed to greatly increase the gene disruption efficiency. As proof of concept, three genes, CAN1, ADE2, and LYP1, were simultaneously disrupted in 4 days with an efficiency ranging from 27 to 87%. Another three genes involved in an artificial hydrocortisone biosynthetic pathway, ATF2, GCY1, and YPR1, were simultaneously disrupted in 6 days with 100% efficiency. This homology-integrated CRISPR (HI-CRISPR) strategy represents a powerful tool for creating yeast strains with multiple gene knockouts.

  16. External and semi-internal controls for PCR amplification of homologous sequences in mixed templates.

    Science.gov (United States)

    Kalle, Elena; Gulevich, Alexander; Rensing, Christopher

    2013-11-01

    In a mixed template, the presence of homologous target DNA sequences creates environments that almost inevitably give rise to artifacts and biases during PCR. Heteroduplexes, chimeras, and skewed template-to-product ratios are the exclusive attributes of mixed template PCR and never occur in a single template assay. Yet, multi-template PCR has been used without appropriate attention to quality control and assay validation, in spite of the fact that such practice diminishes the reliability of results. External and internal amplification controls became obligatory elements of good laboratory practice in different PCR assays. We propose the inclusion of an analogous approach as a quality control system for multi-template PCR applications. The amplification controls must take into account the characteristics of multi-template PCR and be able to effectively monitor particular assay performance. This study demonstrated the efficiency of a model mixed template as an adequate external amplification control for a particular PCR application. The conditions of multi-template PCR do not allow implementation of a classic internal control; therefore we developed a convenient semi-internal control as an acceptable alternative. In order to evaluate the effects of inhibitors, a model multi-template mix was amplified in a mixture with DNAse-treated sample. Semi-internal control allowed establishment of intervals for robust PCR performance for different samples, thus enabling correct comparison of the samples. The complexity of the external and semi-internal amplification controls must be comparable with the assumed complexity of the samples. We also emphasize that amplification controls should be applied in multi-template PCR regardless of the post-assay method used to analyze products. © 2013 Elsevier B.V. All rights reserved.

  17. Highly immunogenic prime–boost DNA vaccination protects chickens against challenge with homologous and heterologous H5N1 virus

    Directory of Open Access Journals (Sweden)

    Anna Stachyra

    2014-01-01

    Full Text Available Highly pathogenic avian influenza viruses (HPAIVs) cause huge economic losses in the poultry industry because of the high mortality rate in infected flocks and trade restrictions. Protective antibodies, directed mainly against hemagglutinin (HA), are the primary means of protection against influenza outbreaks. A recombinant DNA vaccine based on the sequence of H5 HA from the H5N1/A/swan/Poland/305-135V08/2006 strain of HPAIV was prepared. Sequence manipulation included deletion of the proteolytic cleavage site to improve protein stability, codon usage optimization to improve translation and stability of RNA in host cells, and cloning into a commercially available vector to enable expression in animal cells. Naked plasmid DNA was complexed with a liposomal carrier and the immunization followed the prime–boost strategy. The immunogenic potential of the DNA vaccine was first proved in broilers in near-to-field conditions resembling a commercial farm. Next, the protective activity of the vaccine was confirmed in SPF layer-type chickens. Experimental infections (challenge experiments) indicated that 100% of vaccinated chickens were protected against H5N1 of the same clade and that 70% of them were protected against H5N1 influenza virus of a different clade. Moreover, the DNA vaccine significantly limited (or even eliminated) transmission of the virus to contact control chickens. Two intramuscular doses of DNA vaccine encoding H5 HA induced a strong protective response in immunized chickens. The effective protection lasted for a minimum of 8 weeks after the second dose of the vaccine and was not limited to the homologous H5N1 virus. In addition, the vaccine reduced shedding of the virus.

  18. Genomic DNA sequence and cytosine methylation changes of adult rice leaves after seeds space flight

    Science.gov (United States)

    Shi, Jinming

    In this study, cytosine methylation at CCGG sites and genomic DNA sequence changes in adult rice leaves after seed space flight were detected by the methylation-sensitive amplification polymorphism (MSAP) and amplified fragment length polymorphism (AFLP) techniques, respectively. Rice seeds were planted in the trial field after 4 days of space flight on the Shenzhou-6 spaceship of China. Adult leaves of space-treated rice, including 8 plants chosen randomly and 2 plants with phenotypic mutation, were used for AFLP and MSAP analysis. Polymorphism of both DNA sequence and cytosine methylation was detected. For MSAP analysis, the average polymorphic frequencies of the on-ground controls, space-treated plants and mutants are 1.3%, 3.1% and 11%, respectively. For AFLP analysis, the average polymorphic frequencies are 1.4%, 2.9% and 8%, respectively. A total of 27 and 22 polymorphic fragments were cloned and sequenced from the MSAP and AFLP analyses, respectively. Nine of the 27 fragments from MSAP analysis show homology to coding sequence. Of the 22 polymorphic fragments from AFLP analysis, none shows homology to mRNA sequence and eight show homology to repeat regions or retrotransposon sequence. These results suggest that although both genomic DNA sequence and cytosine methylation status can be affected by space flight, the genomic regions homologous to the fragments from the genomic DNA and cytosine methylation analyses were different.

  19. The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

    Energy Technology Data Exchange (ETDEWEB)

    Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika; Tanaka, Yoshihiro; Teranishi, Kristen S.; Sunagawa, Shinichi; Wong, Mike; Stillman, Jonathon H.

    2010-01-27

    Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66 percent of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in

  20. DSAP: deep-sequencing small RNA analysis pipeline.

    Science.gov (United States)

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.

  1. Prokaryotic caspase homologs: phylogenetic patterns and functional characteristics reveal considerable diversity.

    Directory of Open Access Journals (Sweden)

    Johannes Asplund-Samuelsson

    Full Text Available Caspases accomplish initiation and execution of apoptosis, a programmed cell death process specific to metazoans. The existence of prokaryotic caspase homologs, termed metacaspases, has been known for slightly more than a decade. Despite their potential connection to the evolution of programmed cell death in eukaryotes, the phylogenetic distribution and functions of these prokaryotic metacaspase sequences are largely uncharted, while a few experiments imply involvement in programmed cell death. Aiming at providing a more detailed picture of prokaryotic caspase homologs, we applied a computational approach based on Hidden Markov Model search profiles to identify and functionally characterize putative metacaspases in bacterial and archaeal genomes. Out of the total of 1463 analyzed genomes, merely 267 (18%) were identified to contain putative metacaspases, but their taxonomic distribution included most prokaryotic phyla and a few archaea (Euryarchaeota). Metacaspases were particularly abundant in Alphaproteobacteria, Deltaproteobacteria and Cyanobacteria, which harbor many morphologically and developmentally complex organisms, and a distinct correlation was found between abundance and phenotypic complexity in Cyanobacteria. Notably, Bacillus subtilis and Escherichia coli, known to undergo genetically regulated autolysis, lacked metacaspases. Pfam domain architecture analysis combined with operon identification revealed rich and varied configurations among the metacaspase sequences. These imply roles in programmed cell death, but also e.g. in signaling, various enzymatic activities and protein modification. Together our data show a wide and scattered distribution of caspase homologs in prokaryotes with structurally and functionally diverse sub-groups, and with a potentially intriguing evolutionary role. These features will help delineate future characterizations of death pathways in prokaryotes.

  2. De novo transcriptome sequencing and sequence analysis of the malaria vector Anopheles sinensis (Diptera: Culicidae)

    Science.gov (United States)

    2014-01-01

    Background: Anopheles sinensis is the major malaria vector in China and Southeast Asia. Vector control is one of the most effective measures to prevent malaria transmission. However, there is little transcriptome information available for the malaria vector. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to build a transcriptome dataset for functional genomics analysis by large-scale RNA sequencing (RNA-seq). Methods: To provide a more comprehensive and complete transcriptome of An. sinensis, RNA from eggs, larvae, pupae, male adults and female adults was pooled for cDNA preparation, sequenced using the Illumina paired-end sequencing technology and assembled into unigenes. These unigenes were then analyzed for genome mapping, functional annotation, homology, codon usage bias and simple sequence repeats (SSRs). Results: Approximately 51.6 million clean reads were obtained, trimmed, and assembled into 38,504 unigenes with an average length of 571 bp, an N50 of 711 bp, and an average GC content of 51.26%. Among them, 98.4% of unigenes could be mapped onto the reference genome, and 69% of unigenes could be annotated with known biological functions. Homology analysis identified certain numbers of An. sinensis unigenes that showed homology to, or were putative 1:1 orthologues in, the genomes of other Dipteran species. Codon usage bias was analyzed and 1,904 SSRs were detected, which will provide effective molecular markers for the population genetics of this species. Conclusions: Our data and analysis provide the most comprehensive transcriptomic resource and characteristics currently available for An. sinensis, and will facilitate genetic, genomic studies, and further vector control of An. sinensis. PMID:25000941
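    The Results quote both an average unigene length (571 bp) and an N50 (711 bp). As a reminder of why the two differ, here is a minimal Python sketch of the usual N50 definition; the function name `n50` and the toy length lists are illustrative and are not data from the study.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no lengths given")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
```

    Because N50 weights each sequence by its length, a handful of long unigenes pulls it above the arithmetic mean, which fits an assembly where N50 exceeds the average length.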

  3. Model SNP development for complex genomes based on hexaploid oat using high-throughput 454 sequencing technology

    Directory of Open Access Journals (Sweden)

    Chao Shiaoman

    2011-01-01

    Full Text Available Background: Genetic markers are pivotal to modern genomics research; however, discovery and genotyping of molecular markers in oat has been hindered by the size and complexity of the genome, and by a scarcity of sequence data. The purpose of this study was to generate oat expressed sequence tag (EST) information, develop a bioinformatics pipeline for SNP discovery, and establish a method for rapid, cost-effective, and straightforward genotyping of SNP markers in complex polyploid genomes such as oat. Results: Based on cDNA libraries of four cultivated oat genotypes, approximately 127,000 contigs were assembled from approximately one million Roche 454 sequence reads. Contigs were filtered through a novel bioinformatics pipeline to eliminate ambiguous polymorphism caused by subgenome homology, and 96 in silico SNPs were selected from 9,448 candidate loci for validation using high-resolution melting (HRM) analysis. Of these, 52 (54%) were polymorphic between parents of the Ogle1040 × TAM O-301 (OT) mapping population, with 48 segregating as single Mendelian loci, and 44 being placed on the existing OT linkage map. Ogle and TAM amplicons from 12 primers were sequenced for SNP validation, revealing complex polymorphism in seven amplicons but general sequence conservation within SNP loci. Whole-amplicon interrogation with HRM revealed insertions, deletions, and heterozygotes in secondary oat germplasm pools, generating multiple alleles at some primer targets. To validate marker utility, 36 SNP assays were used to evaluate the genetic diversity of 34 diverse oat genotypes. Dendrogram clusters corresponded generally to known genome composition and genetic ancestry. Conclusions: The high-throughput SNP discovery pipeline presented here is a rapid and effective method for identification of polymorphic SNP alleles in the oat genome. The current-generation HRM system is a simple and highly-informative platform for SNP genotyping. These techniques provide

  4. Phylogenetic analysis of the diacylglycerol kinase family of proteins and identification of multiple highly-specific conserved inserts and deletions within the catalytic domain that are distinctive characteristics of different classes of DGK homologs.

    Directory of Open Access Journals (Sweden)

    Radhey S Gupta

    Full Text Available The diacylglycerol kinase (DGK) family of proteins, which phosphorylates diacylglycerol into phosphatidic acid, plays an important role in controlling diverse cellular processes in eukaryotic organisms. Most vertebrate species contain 10 different DGK isozymes, which are grouped into 5 different classes based on the presence or absence of specific functional domains. However, the relationships among different DGK isozymes and how they have evolved from a common ancestor are unclear. The catalytic domain constitutes the single largest sequence element within the DGK proteins that is commonly and uniquely shared by all family members, but there is limited understanding of the overall function of this domain. In this work, we have used the catalytic domain sequences to construct a phylogenetic tree for the DGK family members from representatives of the main vertebrate classes and have also examined the distributions of various DGK isozymes in eukaryotic phyla. In a tree based on catalytic domain sequences, the DGK homologs belonging to different classes formed strongly supported clusters which were separated by long branches, and the different isozymes within each class also generally formed monophyletic groupings. Further, our analysis of the sequence alignments of catalytic domains has identified >10 novel sequence signatures consisting of conserved signature indels (inserts or deletions, CSIs) that are distinctive characteristics of either particular classes of DGK isozymes, or are commonly shared by members of two or more classes of DGK isozymes. The conserved indels in protein sequences are known to play important functional roles in the proteins/organisms where they are found. Thus, our identification of multiple highly specific CSIs that are distinguishing characteristics of different classes of DGK homologs points to the existence of important differences in the catalytic domain function among the DGK isozymes. The identified CSIs in conjunction with

  5. Multiscale analysis of nonlinear systems using computational homology

    Energy Technology Data Exchange (ETDEWEB)

    Konstantin Mischaikow; Michael Schatz; William Kalies; Thomas Wanner

    2010-05-24

    This is a collaborative project between the principal investigators. However, as is to be expected, different PIs have greater focus on different aspects of the project. This report lists these major directions of research which were pursued during the funding period: (1) Computational Homology in Fluids - For the computational homology effort in thermal convection, the focus of the work during the first two years of the funding period included: (1) A clear demonstration that homology can sensitively detect the presence or absence of an important flow symmetry, (2) An investigation of homology as a probe for flow dynamics, and (3) The construction of a new convection apparatus for probing the effects of large-aspect-ratio. (2) Computational Homology in Cardiac Dynamics - We have initiated an effort to test the use of homology in characterizing data from both laboratory experiments and numerical simulations of arrhythmia in the heart. Recently, the use of high speed, high sensitivity digital imaging in conjunction with voltage sensitive fluorescent dyes has enabled researchers to visualize electrical activity on the surface of cardiac tissue, both in vitro and in vivo. (3) Magnetohydrodynamics - A new research direction is to use computational homology to analyze results of large scale simulations of 2D turbulence in the presence of magnetic fields. Such simulations are relevant to the dynamics of black hole accretion disks. The complex flow patterns from simulations exhibit strong qualitative changes as a function of magnetic field strength. Efforts to characterize the pattern changes using Fourier methods and wavelet analysis have been unsuccessful. (4) Granular Flow - Two experts in the area of granular media are studying 2D model experiments of earthquake dynamics where the stress fields can be measured; these stress fields form complex patterns of 'force chains' that may be amenable to analysis using computational homology. (5) Microstructure

  6. Multiscale analysis of nonlinear systems using computational homology

    Energy Technology Data Exchange (ETDEWEB)

    Konstantin Mischaikow, Rutgers University/Georgia Institute of Technology, Michael Schatz, Georgia Institute of Technology, William Kalies, Florida Atlantic University, Thomas Wanner, George Mason University

    2010-05-19

    This is a collaborative project between the principal investigators. However, as is to be expected, different PIs have greater focus on different aspects of the project. This report lists these major directions of research which were pursued during the funding period: (1) Computational Homology in Fluids - For the computational homology effort in thermal convection, the focus of the work during the first two years of the funding period included: (1) A clear demonstration that homology can sensitively detect the presence or absence of an important flow symmetry, (2) An investigation of homology as a probe for flow dynamics, and (3) The construction of a new convection apparatus for probing the effects of large aspect ratio. (2) Computational Homology in Cardiac Dynamics - We have initiated an effort to test the use of homology in characterizing data from both laboratory experiments and numerical simulations of arrhythmia in the heart. Recently, the use of high speed, high sensitivity digital imaging in conjunction with voltage sensitive fluorescent dyes has enabled researchers to visualize electrical activity on the surface of cardiac tissue, both in vitro and in vivo. (3) Magnetohydrodynamics - A new research direction is to use computational homology to analyze results of large scale simulations of 2D turbulence in the presence of magnetic fields. Such simulations are relevant to the dynamics of black hole accretion disks. The complex flow patterns from simulations exhibit strong qualitative changes as a function of magnetic field strength. Efforts to characterize the pattern changes using Fourier methods and wavelet analysis have been unsuccessful. (4) Granular Flow - Two experts in the area of granular media are studying 2D model experiments of earthquake dynamics where the stress fields can be measured; these stress fields form complex patterns of 'force chains' that may be amenable to analysis using computational homology. (5) Microstructure
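
    As a rough illustration of what "computational homology" measures on a 2D pattern such as a convection or stress field, the sketch below thresholds a small grid and counts connected components (Betti number β0) and enclosed holes (β1). It is not the method of the report: the function names, the toy field, and the connectivity convention (8-connected foreground, 4-connected background, matching closed square pixels) are all assumptions made for this example.

```python
from collections import deque

# Neighbour offsets: 4-connectivity (edges) and 8-connectivity (edges + corners).
N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8 = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def _count_components(cells, neighbours):
    """Count connected components of a set of (row, col) cells."""
    remaining = set(cells)
    count = 0
    while remaining:
        count += 1
        queue = deque([remaining.pop()])
        while queue:
            r, c = queue.popleft()
            for dr, dc in neighbours:
                nxt = (r + dr, c + dc)
                if nxt in remaining:
                    remaining.remove(nxt)
                    queue.append(nxt)
    return count


def betti_numbers(field, threshold):
    """Return (beta0, beta1) of the region where field > threshold.

    Hypothetical helper: the field is a non-empty list of equal-length rows.
    The grid is padded with one ring of background cells so that the outer
    background always forms a single component.
    """
    rows, cols = len(field), len(field[0])
    foreground = {
        (r + 1, c + 1)
        for r in range(rows)
        for c in range(cols)
        if field[r][c] > threshold
    }
    padded = {(r, c) for r in range(rows + 2) for c in range(cols + 2)}
    background = padded - foreground
    beta0 = _count_components(foreground, N8)
    # Every background component except the outer one is an enclosed hole.
    beta1 = _count_components(background, N4) - 1
    return beta0, beta1


ring = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]
print(betti_numbers(ring, 0.5))
```

    Tracking how such counts change as the threshold or a control parameter varies is one simple way to turn a complex pattern into a few numbers; production work would normally rely on a dedicated homology package rather than a hand-written flood fill.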

  7. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-05-25

    The number of available protein sequences in public databases is increasing exponentially. However, a significant fraction of these sequences lack functional annotation, which is essential to our understanding of how biological systems and processes operate. In this master's thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching these predicted models, using global and local similarities, through three independent Enzyme Commission (EC) and Gene Ontology (GO) function libraries. The method was tested on 250 “hard” proteins, which lack homologous templates in both structure and function libraries. The results show that this method outperforms the conventional prediction methods based on sequence similarity or threading. Additionally, our method could be improved even further by incorporating protein-protein interaction information. Overall, the method we use provides an efficient approach for automated functional annotation of non-homologous proteins, starting from their sequence.

  8. The colocalization transition of homologous chromosomes at meiosis

    Science.gov (United States)

    Nicodemi, Mario; Panning, Barbara; Prisco, Antonella

    2008-06-01

    Meiosis is the specialized cell division required in sexual reproduction. During its early stages, in the mother cell nucleus, homologous chromosomes recognize each other and colocalize in a crucial step that remains one of the most mysterious of meiosis. Starting from recent discoveries on the system's molecular components and interactions, we discuss a statistical mechanics model of early chromosome pairing. Binding molecules mediate long-distance interaction of special DNA recognition sequences and, if their concentration exceeds a critical threshold, they induce a spontaneous colocalization transition of chromosomes that would otherwise diffuse independently.

  9. The ORF59 DNA polymerase processivity factor homologs of Old World primate RV2 rhadinoviruses are highly conserved nuclear antigens expressed in differentiated epithelium in infected macaques

    Directory of Open Access Journals (Sweden)

    Burnside Kellie L

    2009-11-01

    Full Text Available Background: The ORF59 DNA polymerase processivity factor of the human rhadinovirus, Kaposi's sarcoma-associated herpesvirus (KSHV), is required for efficient copying of the genome during virus replication. KSHV ORF59 is antigenic in the infected host and is used as a marker for virus activation and replication. Results: We cloned, sequenced and expressed the genes encoding related ORF59 proteins from the RV1 rhadinovirus homologs of KSHV from chimpanzee (PtrRV1) and three species of macaques (RFHVMm, RFHVMn and RFHVMf), and have compared them with ORF59 proteins obtained from members of the more distantly related RV2 rhadinovirus lineage infecting the same non-human primate species (PtrRV2, RRV, MneRV2, and MfaRV2, respectively). We found that ORF59 homologs of the RV1 and RV2 Old World primate rhadinoviruses are highly conserved with distinct phylogenetic clustering of the two rhadinovirus lineages. RV1 and RV2 ORF59 C-terminal domains exhibit a strong lineage-specific conservation. Rabbit antiserum was developed against a C-terminal polypeptide that is highly conserved between the macaque RV2 ORF59 sequences. This antiserum showed strong reactivity towards ORF59 encoded by the macaque RV2 rhadinoviruses, RRV (rhesus) and MneRV2 (pig-tail), with no cross reaction to human or macaque RV1 ORF59 proteins. Using this antiserum and RT-qPCR, we determined that RRV ORF59 is expressed early after permissive infection of both rhesus primary fetal fibroblasts and African green monkey kidney epithelial cells (Vero) in vitro. RRV- and MneRV2-infected foci showed strong nuclear expression of ORF59 that correlated with production of infectious progeny virus. Immunohistochemical studies of an MneRV2-infected macaque revealed strong nuclear expression of ORF59 in infected cells within the differentiating layer of epidermis, corroborating previous observations that differentiated epithelial cells are permissive for replication of KSHV-like rhadinoviruses

  10. Origin and spread of photosynthesis based upon conserved sequence features in key bacteriochlorophyll biosynthesis proteins.

    Science.gov (United States)

    Gupta, Radhey S

    2012-11-01

    The origin of photosynthesis and how this capability has spread to other bacterial phyla remain important unresolved questions. I describe here a number of conserved signature indels (CSIs) in key proteins involved in bacteriochlorophyll (Bchl) biosynthesis that provide important insights into these questions. The proteins BchL and BchX, which are essential for Bchl biosynthesis, are derived by gene duplication in a common ancestor of all phototrophs. A more ancient gene duplication gave rise to the BchX-BchL proteins and the NifH protein of the nitrogenase complex. The sequence alignment of NifH-BchX-BchL proteins contains two CSIs that are uniquely shared by all NifH and BchX homologs, but not by any BchL homologs. These CSIs and phylogenetic analysis of NifH-BchX-BchL protein sequences strongly suggest that the BchX homologs are ancestral to BchL and that the Bchl-based anoxygenic photosynthesis originated prior to the chlorophyll (Chl)-based photosynthesis in cyanobacteria. Another CSI in the BchX-BchL sequence alignment that is uniquely shared by all BchX homologs and the BchL sequences from Heliobacteriaceae, but absent in all other BchL homologs, suggests that the BchL homologs from Heliobacteriaceae are primitive in comparison to all other photosynthetic lineages. Several other identified CSIs in the BchN homologs are commonly shared by all proteobacterial homologs and a clade consisting of the marine unicellular Cyanobacteria (Clade C). These CSIs in conjunction with the results of phylogenetic analyses and pair-wise sequence similarity on the BchL, BchN, and BchB proteins, where the homologs from Clade C Cyanobacteria and Proteobacteria exhibited close relationship, provide strong evidence that these two groups have incurred lateral gene transfers. Additionally, phylogenetic analyses and several CSIs in the BchL-N-B proteins that are uniquely shared by all Chlorobi and Chloroflexi homologs provide evidence that the genes for these proteins have also been

  11. Nucleotide sequence of the coat protein gene of the Skierniewice isolate of plum pox virus (PPV)

    International Nuclear Information System (INIS)

    Wypijewski, K.; Musial, W.; Augustyniak, J.; Malinowski, T.

    1994-01-01

    The coat protein (CP) gene of the Skierniewice isolate of plum pox virus (PPV-S) has been amplified using the reverse transcription - polymerase chain reaction (RT-PCR), cloned and sequenced. The nucleotide sequence of the gene and the deduced amino-acid sequences of PPV-S CP were compared with those of other PPV strains. The nucleotide sequence showed very high homology to most of the published sequences. The motif: Asp-Ala-Gly (DAG), important for the aphid transmissibility, was present in the amino-acid sequence. Our isolate did not react in ELISA with monoclonal antibodies MAb06 supposed to be specific for PPV-D. (author). 32 refs, 1 fig., 2 tabs

  12. Evolution and virulence contributions of the autotransporter proteins YapJ and YapK of Yersinia pestis CO92 and their homologs in Y. pseudotuberculosis IP32953.

    Science.gov (United States)

    Lenz, Jonathan D; Temple, Brenda R S; Miller, Virginia L

    2012-10-01

    Yersinia pestis, the causative agent of plague, evolved from the gastrointestinal pathogen Yersinia pseudotuberculosis. Both species have numerous type Va autotransporters, most of which appear to be highly conserved. In Y. pestis CO92, the autotransporter genes yapK and yapJ share a high level of sequence identity. By comparing yapK and yapJ to three homologous genes in Y. pseudotuberculosis IP32953 (YPTB0365, YPTB3285, and YPTB3286), we show that yapK is conserved in Y. pseudotuberculosis, while yapJ is unique to Y. pestis. All of these autotransporters exhibit >96% identity in the C terminus of the protein and identities ranging from 58 to 72% in their N termini. By extending this analysis to include homologous sequences from numerous Y. pestis and Y. pseudotuberculosis strains, we determined that these autotransporters cluster into a YapK (YPTB3285) class and a YapJ (YPTB3286) class. The YPTB3286-like gene of most Y. pestis strains appears to be inactivated, perhaps in favor of maintaining yapJ. Since autotransporters are important for virulence in many bacterial pathogens, including Y. pestis, any change in autotransporter content should be considered for its impact on virulence. Using established mouse models of Y. pestis infection, we demonstrated that despite the high level of sequence identity, yapK is distinct from yapJ in its contribution to disseminated Y. pestis infection. In addition, a mutant lacking both of these genes exhibits an additive attenuation, suggesting nonredundant roles for yapJ and yapK in systemic Y. pestis infection. However, the deletion of the homologous genes in Y. pseudotuberculosis does not seem to impact the virulence of this organism in orogastric or systemic infection models.

  13. Cloning and sequencing of Indian Water buffalo (Bubalus bubalis) interleukin-3 cDNA

    KAUST Repository

    Sugumar, Thennarasu

    2011-12-12

    Full-length cDNA (435 bp) of the interleukin-3 (IL-3) gene of the Indian water buffalo was amplified by reverse transcriptase-polymerase chain reaction and sequenced. This sequence had 96% nucleotide identity and 92% amino acid identity with bovine IL-3. There are 10 amino acid substitutions in buffalo IL-3 compared with bovine IL-3. The amino acid sequence of buffalo IL-3 also showed very high identity with that of other ruminants, indicating functional cross-reactivity. Structural homology modelling of buffalo IL-3 protein with human IL-3 showed the presence of five helical structures.
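
    Identity figures such as those quoted above are computed from a pairwise alignment. The sketch below shows one common convention (matching columns divided by aligned columns, with gap-only columns ignored); the abstract does not say which convention or tool was used, and the function name and toy sequences are assumptions for illustration, not buffalo or bovine data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two already-aligned sequences.

    '-' marks a gap. Columns where both sequences have a gap are skipped;
    a column with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Toy example: three of four aligned positions agree.
print(percent_identity("ATGC", "ATGA"))
```

    Other denominators (shorter sequence length, ungapped columns only) give different percentages for the same alignment, so reported identities are only comparable when the convention is stated.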

  14. Gene mining a marama bean expressed sequence tags (ESTs ...

    African Journals Online (AJOL)

    The authors reported the identification of genes associated with embryonic development and microsatellite sequences. The future direction will entail characterization of these genes using gene over-expression and mutant assays. Key words: Namibia, simple sequence repeats (SSR), data mining, homology searches, ...

  15. Electron microscopic comparison of the sequences of single-stranded genomes of mammalian parvoviruses by heteroduplex mapping

    Energy Technology Data Exchange (ETDEWEB)

    Banerjee, P.T.; Olson, W.H.; Allison, D.P.; Bates, R.C.; Snyder, C.E.; Mitra, S.

    1983-01-01

    The sequence homologies among the linear single-stranded genomes of several mammalian parvoviruses have been studied by electron microscopic analysis of the heteroduplexes produced by reannealing the complementary strands of their DNAs. The genomes of Kilham rat virus, H-1, minute virus of mice and LuIII, which are antigenically distinct non-defective parvoviruses, have considerable homology: about 70% of their sequences are conserved. The homologous regions map at similar locations in the left halves (from the 3' ends) of the genomes. No sequence homology, however, is observed between the DNAs of these non-defective parvoviruses and that of bovine parvovirus, another non-defective virus, or that of defective adenoassociated virus, nor between the genomes of bovine parvovirus and adenoassociated virus. This suggests that only very short, if any, homologous regions are present. From these results, an evolutionary relationship among Kilham rat virus, H-1, minute virus of mice and LuIII is predicted. It is interesting to note that, although LuIII was originally isolated from a human cell line and is specific for human cells in vitro, its genome has sequences in common only with the rodent viruses Kilham rat virus, minute virus of mice and H-1, and not with the other two mammalian parvoviruses tested.

  16. GAWK, a novel human pituitary polypeptide: isolation, immunocytochemical localization and complete amino acid sequence.

    Science.gov (United States)

    Benjannet, S; Leduc, R; Lazure, C; Seidah, N G; Marcinkiewicz, M; Chrétien, M

    1985-01-16

    During the course of reverse-phase high pressure liquid chromatography (RP-HPLC) purification of a postulated big ACTH (1) from human pituitary gland extracts, a highly purified peptide bearing no resemblance to any known polypeptide was isolated. The complete sequence of this 74 amino acid polypeptide, called GAWK, has been determined. A search of a computer data bank for possible homology to any known protein or fragment, using a mutation data matrix, failed to reveal any homology greater than 30%. An antibody produced against a synthetic fragment allowed us to detect several immunoreactive forms. The antisera also enabled us to localize the polypeptide, by immunocytochemistry, in the anterior lobe of the pituitary gland.

  17. Hydroquinone: O-glucosyltransferase from cultivated Rauvolfia cells: enrichment and partial amino acid sequences.

    Science.gov (United States)

    Arend, J; Warzecha, H; Stöckigt, J

    2000-01-01

    Plant cell suspension cultures of Rauvolfia are able to produce a high amount of arbutin by glucosylation of exogenously added hydroquinone. A four-step purification procedure using anion exchange, hydrophobic interaction, hydroxyapatite chromatography and chromatofocusing delivered, in a yield of 0.5%, an approximately 390-fold enrichment of the glucosyltransferase involved. SDS-PAGE showed an M(r) for the enzyme of 52 kDa. Proteolysis of the pure enzyme with endoproteinase LysC revealed six peptide fragments with 9-23 amino acids, which were sequenced. Sequence alignment of the six peptides showed high homologies to glycosyltransferases from other higher plants.

  18. Salmon louse (Lepeophtheirus salmonis) transcriptomes during post molting maturation and egg production, revealed using EST-sequencing and microarray analysis

    Directory of Open Access Journals (Sweden)

    Jonassen Inge

    2008-03-01

    Full Text Available Background: Lepeophtheirus salmonis is an ectoparasitic copepod feeding on skin, mucus and blood from salmonid hosts. Initial analysis of EST sequences from pre-adult and adult stages of L. salmonis revealed a large proportion of novel transcripts. In order to link unknown transcripts to biological functions we have combined EST sequencing and microarray analysis to characterize female salmon louse transcriptomes during post molting maturation and egg production. Results: EST sequence analysis shows that 43% of the ESTs have no significant hits in GenBank. Sequenced ESTs assembled into 556 contigs and 1614 singletons, and whenever homologous genes were identified no clear correlation with homologous genes from any specific animal group was evident. Sequence comparison of 27 L. salmonis proteins with homologous proteins in humans, zebrafish, insects and crustaceans revealed an almost identical sequence identity with all species. Microarray analysis of maturing female adult salmon lice revealed two major transcription patterns: up-regulation during the final molting followed by down-regulation, and female-specific up-regulation during post molting growth and egg production. For a third minor group of ESTs, transcription decreased during molting from pre-adult II to immature adults. Genes regulated during molting typically gave hits with cuticula proteins, whilst transcripts up-regulated during post molting growth were female specific, including two vitellogenins. Conclusion: The copepod L. salmonis contains a high level of novel genes. Among analyzed L. salmonis proteins, sequence identities with homologous proteins in crustaceans are no higher than to homologous proteins in humans. Three distinct processes, molting, post molting growth and egg production, correlate with transcriptional regulation of three groups of transcripts; two including genes related to growth, one including genes related to egg production. The function of the regulated

  19. Multiple evolutionary events involved in maintaining homologs of Resistance to Powdery Mildew 8 in Brassica napus

    Directory of Open Access Journals (Sweden)

    Qin Li

    2016-07-01

    Full Text Available The Resistance to Powdery Mildew 8 (RPW8) locus confers broad-spectrum resistance to powdery mildew in Arabidopsis thaliana. There are four Homologous to RPW8s (BrHRs) in Brassica rapa and three in B. oleracea (BoHRs). B. napus (Bn) is derived from diploidization of a hybrid between B. rapa and B. oleracea, and thus should have seven homologs of RPW8 (BnHRs). It is unclear whether these genes are still maintained or lost in B. napus after diploidization and how they might have evolved. Here we report the identification and sequence polymorphisms of BnHRs from a set of B. napus accessions. Our data indicated that while the BoHR copy from B. oleracea is highly conserved, the BrHR copy from B. rapa is relatively variable in the B. napus genome owing to multiple evolutionary events, such as gene loss, point mutation, insertion, deletion and intragenic recombination. Given the overall high sequence homology of BnHR genes, it is not surprising that intragenic recombination both between two orthologs and between two paralogs was detected in B. napus, which may explain the loss of BoHR genes in some B. napus accessions. When ectopically expressed in Arabidopsis, C-terminally truncated versions of BnHRa and BnHRb, as well as the full-length BnHRd, fused with YFP at their C-termini, could trigger cell death in the absence of pathogens and enhanced resistance to powdery mildew disease. Moreover, subcellular localization analysis showed that both BnHRa-YFP and BnHRb-YFP were mainly localized to the extra-haustorial membrane (EHM) encasing the haustorium of powdery mildew. Taken together, our data suggest that the duplicated BnHR genes might have been subjected to differential selection and that at least some may play a role in defense and could serve as a resistance resource in engineering disease-resistant plants.

  20. The human homolog of S. cerevisiae CDC27, CDC27 Hs, is encoded by a highly conserved intronless gene present in multiple copies in the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Devor, E.J.; Dill-Devor, R.M. [Univ. of Iowa College of Medicine, Iowa City (United States)]

    1994-09-01

    We have obtained a number of unique sequences via PCR amplification of human genomic DNA using degenerate primers under low stringency (42 °C). One of these, an 853 bp product, has been identified as a partial genomic sequence of the human homolog of the S. cerevisiae CDC27 gene, CDC27Hs (GenBank No. U00001). This gene, reported by Turgendreich et al., is also designated EST00556 from Adams et al. We have undertaken a more detailed examination of our sequence, MCP34N, and have found that: 1. the genomic sequence is nearly identical to CDC27Hs over its entire 853 bp length; 2. an MCP34N-specific PCR assay of several non-human primate species reveals amplification products in chimpanzee and gorilla genomes having greater than 90% sequence identity with CDC27Hs; and 3. an MCP34N-specific PCR assay of the BIOS hybrid cell line panel gives a discordancy pattern suggesting multiple loci. Based upon these data, we present the following initial characterization: 1. the complete MCP34N sequence identity with CDC27Hs indicates that the latter is encoded by an intronless gene; 2. CDC27Hs is highly conserved among higher primates; and 3. CDC27Hs is present in multiple copies in the human genome. These characteristics, taken together with those initially reported for CDC27Hs, suggest that this is an old gene that carries out an important but, as yet, unknown function in the human brain.

  1. Homology among tet determinants in conjugative elements of streptococci

    Energy Technology Data Exchange (ETDEWEB)

    Smith, M.D.; Hazum, S.; Guild, W.R.

    1981-10-01

    A mutation to tetracycline sensitivity in a resistant strain of Streptococcus pneumoniae was shown by several criteria to be due to a point mutation in the conjugative Ω(cat-tet) element found in the chromosomes of strains derived from BM6001, a clinical strain resistant to tetracycline and chloramphenicol. Strains carrying the mutation were transformed back to tetracycline resistance with the high efficiency of a point marker by donor deoxyribonucleic acids from its ancestral strain and from nine other clinical isolates of pneumococcus and by deoxyribonucleic acids from Group D Streptococcus faecalis and Group B Streptococcus agalactiae strains that also carry conjugative tet elements in their chromosomes. It was not transformed to resistance by tet plasmid deoxyribonucleic acids from either gram-negative or gram-positive species, except for one that carried transposon Tn916, the conjugative tet element present in the chromosomes of some S. faecalis strains. The results showed that the tet determinants in conjugative elements of several streptococcal species share a high degree of deoxyribonucleic acid sequence homology and suggested that they differ from other tet genes.

  2. Complete amino acid sequence of bovine colostrum low-Mr cysteine proteinase inhibitor.

    Science.gov (United States)

    Hirado, M; Tsunasawa, S; Sakiyama, F; Niinobe, M; Fujii, S

    1985-07-01

    The complete amino acid sequence of bovine colostrum cysteine proteinase inhibitor was determined by sequencing native inhibitor and peptides obtained by cyanogen bromide degradation, Achromobacter lysylendopeptidase digestion and partial acid hydrolysis of reduced and S-carboxymethylated protein. Achromobacter peptidase digestion was successfully used to isolate two disulfide-containing peptides. The inhibitor consists of 112 amino acids with an Mr of 12787. Two disulfide bonds were established between Cys 66 and Cys 77 and between Cys 90 and Cys 110. A high degree of homology in the sequence was found between the colostrum inhibitor and human gamma-trace, human salivary acidic protein and chicken egg-white cystatin.

  3. Analysis of the role of homology arms in gene-targeting vectors in human cells.

    Directory of Open Access Journals (Sweden)

    Ayako Ishii

    Full Text Available Random integration of targeting vectors into the genome is the primary obstacle in human somatic cell gene targeting. Non-homologous end-joining (NHEJ), a major pathway for repairing DNA double-strand breaks, is thought to be responsible for most random integration events; however, absence of DNA ligase IV (LIG4), the critical NHEJ ligase, does not significantly reduce random integration frequency of targeting vector in human cells, indicating robust integration events occurring via a LIG4-independent mechanism. To gain insights into the mechanism and robustness of LIG4-independent random integration, we employed various types of targeting vectors to examine their integration frequencies in LIG4-proficient and deficient human cell lines. We find that the integration frequency of targeting vector correlates well with the length of homology arms and with the amount of repetitive DNA sequences, especially SINEs, present in the arms. This correlation was prominent in LIG4-deficient cells, but was also seen in LIG4-proficient cells, thus providing evidence that LIG4-independent random integration occurs frequently even when NHEJ is functionally normal. Our results collectively suggest that random integration frequency of conventional targeting vectors is substantially influenced by homology arms, which typically harbor repetitive DNA sequences that serve to facilitate LIG4-independent random integration in human cells, regardless of the presence or absence of functional NHEJ.

  4. Stringent homology-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions.

    Science.gov (United States)

    Zhou, Hufeng; Gao, Shangzhi; Nguyen, Nam Ninh; Fan, Mengyuan; Jin, Jingjing; Liu, Bing; Zhao, Liang; Xiong, Geng; Tan, Min; Li, Shijun; Wong, Limsoon

    2014-04-08

    H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are essential for understanding the infection mechanism of the formidable pathogen M. tuberculosis H37Rv. Computational prediction is an important strategy to fill the gap in experimental H. sapiens-M. tuberculosis H37Rv PPI data. Homology-based prediction is frequently used in predicting both intra-species and inter-species PPIs. However, some limitations are not properly resolved in several published works that predict eukaryote-prokaryote inter-species PPIs using intra-species template PPIs. We develop a stringent homology-based prediction approach by taking into account (i) differences between eukaryotic and prokaryotic proteins and (ii) differences between inter-species and intra-species PPI interfaces. We compare our stringent homology-based approach to a conventional homology-based approach for predicting host-pathogen PPIs, based on cellular compartment distribution analysis, disease gene list enrichment analysis, pathway enrichment analysis and functional category enrichment analysis. These analyses support the validity of our prediction result, and clearly show that our approach has better performance in predicting H. sapiens-M. tuberculosis H37Rv PPIs. Using our stringent homology-based approach, we have predicted a set of highly plausible H. sapiens-M. tuberculosis H37Rv PPIs which might be useful for many related studies. Based on our analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent homology-based approach, we have discovered several interesting properties which are reported here for the first time. We find that both host proteins and pathogen proteins involved in the host-pathogen PPIs tend to be hubs in their own intra-species PPI network. Also, both host and pathogen proteins involved in host-pathogen PPIs tend to have longer primary sequence, tend to have more domains, tend to be more hydrophilic, etc. And the protein domains from both

  5. High-Throughput Block Optical DNA Sequence Identification.

    Science.gov (United States)

    Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant

    2018-01-01

    Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm⁻¹. Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
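
    The idea of describing a sequence by the A, T, G and C content of its k-mer blocks, rather than letter by letter, can be sketched in a few lines. This is only an illustration of the lossy "content per block" representation; the function name, the use of non-overlapping blocks, and the toy sequences are assumptions and do not come from the paper.

```python
from collections import Counter


def block_composition(sequence: str, k: int) -> list[tuple[int, int, int, int]]:
    """Split a DNA sequence into non-overlapping k-mer blocks and return
    the (A, T, G, C) counts of each block.

    A trailing partial block shorter than k is dropped (an assumption made
    here for simplicity).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    sequence = sequence.upper()
    signature = []
    for start in range(0, len(sequence) - k + 1, k):
        counts = Counter(sequence[start:start + k])
        signature.append(tuple(counts.get(base, 0) for base in "ATGC"))
    return signature


# Two blocks of length 4: "ATGC" and "AATT".
print(block_composition("ATGCAATT", 4))

# The representation is lossy: reordering letters within a block
# leaves the signature unchanged.
print(block_composition("ATGC", 4) == block_composition("CGTA", 4))
```

    Because letter order inside a block is discarded, distinct sequences can share a signature; the abstract's "lossy genomic data compression" refers to this kind of trade-off between resolution and throughput.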

  6. Genome Sequence, Assembly and Characterization of Two Metschnikowia fructicola Strains Used as Biocontrol Agents of Postharvest Diseases

    Directory of Open Access Journals (Sweden)

    Edoardo Piombo

    2018-04-01

    Full Text Available The yeast Metschnikowia fructicola was reported as an efficient biological control agent of postharvest diseases of fruits and vegetables, and it is the basis of the commercial formulated product “Shemer.” Several mechanisms of action by which M. fructicola inhibits postharvest pathogens were suggested, including iron-binding compounds, induction of defense signaling genes, production of fungal cell wall degrading enzymes and relatively high amounts of superoxide anions. We assembled the whole genome sequence of two strains of M. fructicola using PacBio and Illumina shotgun sequencing technologies. Using PacBio, a high-quality draft genome consisting of 93 contigs, with an estimated genome size of approximately 26 Mb, was obtained. Comparative analysis of M. fructicola proteins with the other three available closely related genomes revealed a shared core of homologous proteins coded by 5,776 genes. Comparing the genomes of the two M. fructicola strains using a SNP calling approach resulted in the identification of 564,302 homologous SNPs with 2,004 predicted high impact mutations. The size of the genome is exceptionally high when compared with those of available closely related organisms, and the high rate of homology among M. fructicola genes points toward a recent whole-genome duplication event as the cause of this large genome. Based on the assembled genome, sequences were annotated with a gene description and gene ontology (GO) term and clustered in functional groups. Analysis of CAZymes family genes revealed 1,145 putative genes, and transcriptomic analysis of CAZyme expression levels in M. fructicola during its interaction with either grapefruit peel tissue or Penicillium digitatum revealed a high level of CAZyme gene expression when the yeast was placed in wounded fruit tissue.

  7. SPOT-ligand 2: improving structure-based virtual screening by binding-homology search on an expanded structural template library.

    Science.gov (United States)

    Litfin, Thomas; Zhou, Yaoqi; Yang, Yuedong

    2017-04-15

    The high cost of drug discovery motivates the development of accurate virtual screening tools. Binding-homology, which takes advantage of known protein-ligand binding pairs, has emerged as a powerful discrimination technique. In order to exploit all available binding data, modelled structures of ligand-binding sequences may be used to create an expanded structural binding template library. SPOT-Ligand 2 has demonstrated significantly improved screening performance over its previous version by expanding the template library 15 times over the previous one. It also performed better than or similar to other binding-homology approaches on the DUD and DUD-E benchmarks. The server is available online at http://sparks-lab.org. Contact: yaoqi.zhou@griffith.edu.au or yuedong.yang@griffith.edu.au. Supplementary data are available at Bioinformatics online.

  8. Membrane and Protein Interactions of the Pleckstrin Homology Domain Superfamily

    Directory of Open Access Journals (Sweden)

    Marc Lenoir

    2015-10-01

    Full Text Available The human genome encodes about 285 proteins that contain at least one annotated pleckstrin homology (PH) domain. As the first phosphoinositide binding module domain to be discovered, the PH domain recruits diverse protein architectures to cellular membranes. PH domains constitute one of the largest protein superfamilies, and have diverged to regulate many different signaling proteins and modules such as Dbl homology (DH) and Tec homology (TH) domains. The ligands of approximately 70 PH domains have been validated by binding assays and complexed structures, allowing meaningful extrapolation across the entire superfamily. Here the Membrane Optimal Docking Area (MODA) program is used at a genome-wide level to identify all membrane docking PH structures and map their lipid-binding determinants. In addition to the linear sequence motifs which are employed for phosphoinositide recognition, the three dimensional structural features that allow peripheral membrane domains to approach and insert into the bilayer are pinpointed and can be predicted ab initio. The analysis shows that conserved structural surfaces distinguish which PH domains associate with membrane from those that do not. Moreover, the results indicate that lipid-binding PH domains can be classified into different functional subgroups based on the type of membrane insertion elements they project towards the bilayer.

  9. Membrane and Protein Interactions of the Pleckstrin Homology Domain Superfamily.

    Science.gov (United States)

    Lenoir, Marc; Kufareva, Irina; Abagyan, Ruben; Overduin, Michael

    2015-10-23

    The human genome encodes about 285 proteins that contain at least one annotated pleckstrin homology (PH) domain. As the first phosphoinositide binding module domain to be discovered, the PH domain recruits diverse protein architectures to cellular membranes. PH domains constitute one of the largest protein superfamilies, and have diverged to regulate many different signaling proteins and modules such as Dbl homology (DH) and Tec homology (TH) domains. The ligands of approximately 70 PH domains have been validated by binding assays and complexed structures, allowing meaningful extrapolation across the entire superfamily. Here the Membrane Optimal Docking Area (MODA) program is used at a genome-wide level to identify all membrane docking PH structures and map their lipid-binding determinants. In addition to the linear sequence motifs which are employed for phosphoinositide recognition, the three dimensional structural features that allow peripheral membrane domains to approach and insert into the bilayer are pinpointed and can be predicted ab initio. The analysis shows that conserved structural surfaces distinguish which PH domains associate with membrane from those that do not. Moreover, the results indicate that lipid-binding PH domains can be classified into different functional subgroups based on the type of membrane insertion elements they project towards the bilayer.

  10. Escherichia coli promoter sequences predict in vitro RNA polymerase selectivity.

    Science.gov (United States)

    Mulligan, M E; Hawley, D K; Entriken, R; McClure, W R

    1984-01-11

    We describe a simple algorithm for computing a homology score for Escherichia coli promoters based on DNA sequence alone. The homology score was related to 31 values, measured in vitro, of RNA polymerase selectivity, which we define as the product K_B k_2, the apparent second-order rate constant for open complex formation. We found that promoter strength could be predicted to within a factor of ±4.1 in K_B k_2 over a range of 10^4 in the same parameter. The quantitative evaluation was linked to an automated (Apple II) procedure for searching and evaluating possible promoters in DNA sequence files.
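
    The idea of scoring a candidate promoter from sequence alone can be illustrated with a much simpler stand-in than the published algorithm. The sketch below only counts matches to the widely cited -35 (TTGACA) and -10 (TATAAT) consensus hexamers; the spacer range, the equal weighting of positions and all function names are assumptions made for illustration and are not taken from the abstract.

```python
# Illustrative consensus-match score; NOT the weighting scheme of the original paper.
MINUS_35 = "TTGACA"      # widely cited E. coli -35 consensus hexamer
MINUS_10 = "TATAAT"      # widely cited E. coli -10 consensus hexamer
SPACERS = (16, 17, 18)   # assumed spacer lengths between the two hexamers


def matches(window, consensus):
    """Count positions where the window agrees with the consensus."""
    return sum(a == b for a, b in zip(window, consensus))


def best_promoter_score(seq):
    """Return (score, start, spacer) for the best candidate found in seq.

    score is the number of consensus matches (0 to 12) over both hexamers;
    start is the 0-based index of the -35 window. Returns (-1, -1, -1)
    if the sequence is too short to hold any candidate.
    """
    seq = seq.upper()
    best = (-1, -1, -1)
    for start in range(len(seq)):
        for spacer in SPACERS:
            m10_start = start + 6 + spacer
            if m10_start + 6 > len(seq):
                continue  # the -10 window would run off the end
            score = (matches(seq[start:start + 6], MINUS_35)
                     + matches(seq[m10_start:m10_start + 6], MINUS_10))
            if score > best[0]:
                best = (score, start, spacer)
    return best
```

    Calling best_promoter_score on a DNA string scans every window, in the spirit of the automated promoter search the abstract mentions. A realistic score would weight each position by how often it is conserved across known promoters.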

  11. High-throughput sequencing of black pepper root transcriptome

    Science.gov (United States)

    2012-01-01

    Background: Black pepper (Piper nigrum L.) is one of the most popular spices in the world. It is used in cooking and the preservation of food and even has medicinal properties. Losses in production from disease are a major limitation in the culture of this crop. The major diseases are root rot and foot rot, which are results of root infection by Fusarium solani and Phytophthora capsici, respectively. Understanding the molecular interaction between the pathogens and the host’s root region is important for obtaining resistant cultivars by biotechnological breeding. Genetic and molecular data for this species, though, are limited. In this paper, RNA-Seq technology has been employed, for the first time, to describe the root transcriptome of black pepper. Results: The root transcriptome of black pepper was sequenced by the NGS SOLiD platform and assembled using the multiple-k method. Blast2Go and orthoMCL methods were used to annotate 10338 unigenes. The 4472 predicted proteins showed about 52% homology with the Arabidopsis proteome. Two root proteomes identified 615 proteins, which seem to define the plant’s root pattern. Simple-sequence repeats were identified that may be useful in studies of genetic diversity and may have applications in biotechnology and ecology. Conclusions: This dataset of 10338 unigenes is crucially important for the biotechnological breeding of black pepper and the ecogenomics of the Magnoliids, a major group of basal angiosperms. PMID:22984782

  12. High-throughput sequencing of black pepper root transcriptome

    Directory of Open Access Journals (Sweden)

    Gordo Sheila MC

    2012-09-01

    Full Text Available Background: Black pepper (Piper nigrum L.) is one of the most popular spices in the world. It is used in cooking and the preservation of food and even has medicinal properties. Losses in production from disease are a major limitation in the culture of this crop. The major diseases are root rot and foot rot, which are results of root infection by Fusarium solani and Phytophthora capsici, respectively. Understanding the molecular interaction between the pathogens and the host’s root region is important for obtaining resistant cultivars by biotechnological breeding. Genetic and molecular data for this species, though, are limited. In this paper, RNA-Seq technology has been employed, for the first time, to describe the root transcriptome of black pepper. Results: The root transcriptome of black pepper was sequenced by the NGS SOLiD platform and assembled using the multiple-k method. Blast2Go and orthoMCL methods were used to annotate 10338 unigenes. The 4472 predicted proteins showed about 52% homology with the Arabidopsis proteome. Two root proteomes identified 615 proteins, which seem to define the plant’s root pattern. Simple-sequence repeats were identified that may be useful in studies of genetic diversity and may have applications in biotechnology and ecology. Conclusions: This dataset of 10338 unigenes is crucially important for the biotechnological breeding of black pepper and the ecogenomics of the Magnoliids, a major group of basal angiosperms.

  13. High-Throughput Next-Generation Sequencing of Polioviruses

    Science.gov (United States)

    Montmayeur, Anna M.; Schmidt, Alexander; Zhao, Kun; Magaña, Laura; Iber, Jane; Castro, Christina J.; Chen, Qi; Henderson, Elizabeth; Ramos, Edward; Shaw, Jing; Tatusov, Roman L.; Dybdahl-Sissoko, Naomi; Endegue-Zanga, Marie Claire; Adeniji, Johnson A.; Oberste, M. Steven; Burns, Cara C.

    2016-01-01

    The poliovirus (PV) is currently targeted for worldwide eradication and containment. Sanger-based sequencing of the viral protein 1 (VP1) capsid region is currently the standard method for PV surveillance. However, the whole-genome sequence is sometimes needed for higher resolution global surveillance. In this study, we optimized whole-genome sequencing protocols for poliovirus isolates and FTA cards using next-generation sequencing (NGS), aiming for high sequence coverage, efficiency, and throughput. We found that DNase treatment of poliovirus RNA followed by random reverse transcription (RT), amplification, and the use of the Nextera XT DNA library preparation kit produced significantly better results than other preparations. The average viral reads per total reads, a measurement of efficiency, was as high as 84.2% ± 15.6%. PV genomes covering >99 to 100% of the reference length were obtained and validated with Sanger sequencing. A total of 52 PV genomes were generated, multiplexing as many as 64 samples in a single Illumina MiSeq run. This high-throughput, sequence-independent NGS approach facilitated the detection of a diverse range of PVs, especially vaccine-derived polioviruses (VDPV), circulating VDPV, and immunodeficiency-related VDPV. In contrast to results from previous studies on other viruses, our results showed that filtration and nuclease treatment did not discernibly increase the sequencing efficiency of PV isolates. However, DNase treatment after nucleic acid extraction to remove host DNA significantly improved the sequencing results. This NGS method has been successfully implemented to generate PV genomes for molecular epidemiology of the most recent PV isolates. Additionally, the ability to obtain full PV genomes from FTA cards will aid in facilitating global poliovirus surveillance. PMID:27927929

  14. In Silico Characterization of Pectate Lyase Protein Sequences from Different Source Organisms

    Directory of Open Access Journals (Sweden)

    Amit Kumar Dubey

    2010-01-01

    Full Text Available A total of 121 protein sequences of pectate lyases were subjected to homology search, multiple sequence alignment, phylogenetic tree construction, and motif analysis. The phylogenetic tree constructed revealed different clusters based on different source organisms representing bacterial, fungal, plant, and nematode pectate lyases. The multiple accessions of bacterial, fungal, nematode, and plant pectate lyase protein sequences were placed closely revealing a sequence level similarity. The multiple sequence alignment of these pectate lyase protein sequences from different source organisms showed conserved regions at different stretches with maximum homology from amino acid residues 439–467, 715–816, and 829–910 which could be used for designing degenerate primers or probes specific for pectate lyases. The motif analysis revealed a conserved Pec_Lyase_C domain uniformly observed in all pectate lyases irrespective of variable sources suggesting its possible role in structural and enzymatic functions.

  15. Acetylcholine Receptor: Complex of Homologous Subunits

    Science.gov (United States)

    Raftery, Michael A.; Hunkapiller, Michael W.; Strader, Catherine D.; Hood, Leroy E.

    1980-06-01

    The acetylcholine receptor from the electric ray Torpedo californica is composed of five subunits; two are identical and the other three are structurally related to them. Microsequence analysis of the four polypeptides demonstrates amino acid homology among the subunits. Further sequence analysis of both membrane-bound and Triton-solubilized, chromatographically purified receptor gave the stoichiometry of the four subunits (40,000:50,000:60,000:65,000 daltons) as 2:1:1:1, indicating that this protein is a pentameric complex with a molecular weight of 255,000 daltons. Genealogical analysis suggests that divergence from a common ancestral gene occurred early in the evolution of the receptor. This shared ancestry argues that each of the four subunits plays a functional role in the receptor's physiological action.
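
    The pentamer mass quoted above follows directly from the subunit masses and the 2:1:1:1 stoichiometry. A short check, using only the numbers given in the abstract (the variable names are illustrative):

```python
# Subunit masses (daltons) and copy numbers as quoted in the abstract.
subunit_mass = [40_000, 50_000, 60_000, 65_000]
copies = [2, 1, 1, 1]

# Total number of polypeptide chains in the complex.
n_subunits = sum(copies)

# Mass of the assembled complex: each subunit mass times its copy number.
complex_mass = sum(m * c for m, c in zip(subunit_mass, copies))
```

    With these inputs n_subunits is 5 and complex_mass is 255,000 daltons, matching the pentameric complex described in the abstract.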

  16. Homology analysis and cross-immunogenicity of OmpA from pathogenic Yersinia enterocolitica, Yersinia pseudotuberculosis and Yersinia pestis.

    Science.gov (United States)

    Chen, Yuhuang; Duan, Ran; Li, Xu; Li, Kewei; Liang, Junrong; Liu, Chang; Qiu, Haiyan; Xiao, Yuchun; Jing, Huaiqi; Wang, Xin

    2015-12-01

    The outer membrane protein A (OmpA) is one of the intra-species conserved proteins with immunogenicity widely found in the family Enterobacteriaceae. Here we first confirmed that OmpA is conserved in the three pathogenic Yersinia: Yersinia pestis, Yersinia pseudotuberculosis and pathogenic Yersinia enterocolitica, with high homology at both the nucleotide and amino acid sequence levels. The ompA sequence identities among 262 Y. pestis strains, 134 Y. pseudotuberculosis strains and 219 pathogenic Y. enterocolitica strains are 100%, 98.8% and 97.7%, respectively. The main patterns of OmpA of pathogenic Yersinia are 86.2% and 88.8% identical at the nucleotide and amino acid sequence levels, respectively. Immunological analysis showed the immunogenicity of each OmpA and the cross-immunogenicity of OmpA among pathogenic Yersinia, suggesting that OmpA may be a vaccine candidate for Y. pestis and other pathogenic Yersinia.

  17. Anti-HmuY antibodies specifically recognize Porphyromonas gingivalis HmuY protein but not homologous proteins in other periodontopathogens.

    Directory of Open Access Journals (Sweden)

    Michał Śmiga

    Full Text Available Given the emerging evidence of an association between periodontal infections and systemic conditions, the search for specific methods to detect the presence of P. gingivalis, a principal etiologic agent in chronic periodontitis, is of high importance. The aim of this study was to characterize antibodies raised against purified P. gingivalis HmuY protein and selected epitopes of the HmuY molecule. Since other periodontopathogens produce homologs of HmuY, we also aimed to characterize responses of antibodies raised against the HmuY protein or its epitopes to the closest homologous proteins from Prevotella intermedia and Tannerella forsythia. Rabbits were immunized with purified HmuY protein or three synthetic, KLH-conjugated peptides, derived from the P. gingivalis HmuY protein. The reactivity of anti-HmuY antibodies with purified proteins or bacteria was determined using Western blotting and ELISA assay. First, we found homologs of P. gingivalis HmuY in P. intermedia (PinO and PinA proteins) and T. forsythia (Tfo protein) and identified corrected nucleotide and amino acid sequences of Tfo. All proteins were overexpressed in E. coli and purified using ion-exchange chromatography, hydrophobic chromatography and gel filtration. We demonstrated that antibodies raised against P. gingivalis HmuY are highly specific to purified HmuY protein and HmuY attached to P. gingivalis cells. No reactivity between P. intermedia and T. forsythia or between purified HmuY homologs from these bacteria and anti-HmuY antibodies was detected. The results obtained in this study demonstrate that P. gingivalis HmuY protein may serve as an antigen for specific determination of serum antibodies raised against this bacterium.

  18. Identification of a novel MLPK homologous gene MLPKn1 and its expression analysis in Brassica oleracea.

    Science.gov (United States)

    Gao, Qiguo; Shi, Songmei; Liu, Yudong; Pu, Quanming; Liu, Xiaohuan; Zhang, Ying; Zhu, Liquan

    2016-09-01

    M locus protein kinase (MLPK), one of the SRK-interacting proteins, is a necessary positive regulator for the self-incompatibility response in Brassica. In B. rapa, MLPK is expressed as two different transcripts, MLPKf1 and MLPKf2, and either isoform can complement the mlpk/mlpk mutation. The AtAPK1B gene has been considered to be the ortholog of BrMLPK, and AtAPK1B has no role in the self-incompatibility (SI) response in A. thaliana SRK-SCR plants. Until now, what causes the functional difference between MLPK and APK1B during the SI response in Brassica and A. thaliana SRKb-SCRb plants has remained unknown. Here, in addition to the reported MLPKf1/2, we identified the new MLPKf1 homologous gene MLPKn1 from B. oleracea. BoMLPKn1 and BoMLPKf1 shared nucleotide sequence identity as high as 84.3%, and the most striking difference consisted in two fragment insertions in BoMLPKn1. BoMLPKn1 and BoMLPKf1 had a similar gene structure; both their deduced amino acid sequences contained a typical plant myristoylation consensus sequence and a Ser/Thr protein kinase domain. BoMLPKn1 was widely expressed in petal, sepal, anther, stigma and leaf. Genome-wide survey revealed that the B. oleracea genome contained three MLPK homologous genes: BoMLPKf1/2, BoMLPKn1 and Bol008343n. The B. rapa genome also contained three MLPK homologous genes, BrMLPKf1/2, BraMLPKn1 and Bra040929. Phylogenetic analysis revealed that BoMLPKf1/2 and BrMLPKf1/2 were phylogenetically more distant from AtAPK1A than Bol008343n, Bra040929, BraMLPKn1 and BoMLPKn1. Synteny analysis revealed that the B. oleracea chromosomal region containing BoMLPKn1 displayed high synteny with the A. thaliana chromosomal region containing APK1B, whereas the B. rapa chromosomal region containing BraMLPKn1 showed high synteny with the A. thaliana chromosomal region containing APK1B. Together, these results revealed that BoMLPKn1/BraMLPKn1, and not the formerly reported BoMLPKf1/2 (BrMLPKf1/2), were the orthologous genes of AtAPK1B, and no ortholog of Bo...

  19. Homologous series of induced early mutants in Indica rice. Pt.3: The relationship between the induction of homologous series of early mutants and its different pedigree

    International Nuclear Information System (INIS)

    Chen Xiulan; Yang Hefeng; He Zhentian; Han Yuepeng; Liu Xueyu

    2002-01-01

    The percentage of homologous series of early mutants (PHSEM) induced by irradiation was closely related to the pedigree of the variety. This study showed that PHSEM values for varieties with the same pedigree were similar, and that there were three different levels of dominance (high, low and normal) in the homologous series induced from different pedigrees. The PHSEM for varieties derived from distantly related parents was higher than that for varieties derived from closely related parents. Some pedigrees were dominant for the induction of homologous series of early mutants: IR8 (Peta x DGWG), IR127 (Cpslo x Sigadis) and IR24 (IR8 x IR127) were dominant pedigrees, and homologous series of early mutants could be easily induced in varieties derived from them.

  20. [Analysis of DNA-DNA homologies in obligate methylotrophic bacteria].

    Science.gov (United States)

    Doronina, N V; Govorukhina, N I; Lysenko, A M; Trotsenko, Iu A

    1988-01-01

    The genotypic affinity of 19 bacterial strains obligately dependent on methanol or methylamine as carbon and energy sources was studied by techniques of molecular DNA hybridization. The high homology level (35-88%) between motile strain Methylophilus methanolovorus V-1447D and nonmotile strain Methylobacillus sp. VSB-792 as well as other motile strains (Pseudomonas methanolica ATCC 21704, Methylomonas methanolica NRRL 5458, Pseudomonas sp. W6, strain A3) indicates that all of them belong to one genus. A rather high level of homology (62-63%) was found between Methylobacillus glycogenes ATCC 29475 and Pseudomonas insueta ATCC 21276 and strain G-10. The motile strain Methylophilus methylotrophus NCIB 10515 has a low homology (below 20%) to the other obligate methylobacteria studied. Therefore, at least two genetically different genera of obligate methylobacteria can be distinguished, namely Methylophilus and Methylobacillus, the latter being represented by both motile and nonmotile forms.

  1. Amino acid sequences of ribosomal proteins S11 from Bacillus stearothermophilus and S19 from Halobacterium marismortui. Comparison of the ribosomal protein S11 family.

    Science.gov (United States)

    Kimura, M; Kimura, J; Hatakeyama, T

    1988-11-21

    The complete amino acid sequences of ribosomal proteins S11 from the Gram-positive eubacterium Bacillus stearothermophilus and of S19 from the archaebacterium Halobacterium marismortui have been determined. A search for homologous sequences of these proteins revealed that they belong to the ribosomal protein S11 family. Homologous proteins have previously been sequenced from Escherichia coli as well as from chloroplast, yeast and mammalian ribosomes. A pairwise comparison of the amino acid sequences showed that Bacillus protein S11 shares 68% identical residues with S11 from Escherichia coli and a slightly lower homology (52%) with the homologous chloroplast protein. The halophilic protein S19 is more closely related to the eukaryotic (45-49%) than to the eubacterial counterparts (35%).
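
    Figures such as "68% identical residues" come from a pairwise comparison of aligned sequences. A minimal sketch of that calculation is given below, assuming the two sequences are already aligned with "-" as the gap character and that gapped columns are excluded from the denominator. The abstract does not say which denominator was used, so that choice, like the function name, is an assumption.

```python
def percent_identity(a, b, gap="-"):
    """Percent identical residues over aligned columns with no gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared
```

    For the toy alignment "MKT-AL" against "MRTGAL", five ungapped columns are compared and four agree, so the function returns 80.0. Other conventions (dividing by alignment length or by the shorter sequence) give different percentages for the same alignment.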

  2. 454 sequencing of pooled BAC clones on chromosome 3H of barley

    Directory of Open Access Journals (Sweden)

    Yamaji Nami

    2011-05-01

    Full Text Available Background: Genome sequencing of barley has been delayed due to its large genome size (ca. 5,000 Mbp). Among the fast sequencing systems, 454 liquid phase pyrosequencing provides the longest reads and is the most promising method for BAC clones. Here we report the results of pooled sequencing of BAC clones selected with ESTs genetically mapped to chromosome 3H. Results: We sequenced pooled barley BAC clones using a 454 parallel genome sequencer. A PCR screening system based on primer sets derived from genetically mapped ESTs on chromosome 3H was used for clone selection in a BAC library developed from cultivar "Haruna Nijo". The DNA samples of 10 or 20 BAC clones were pooled and used for shotgun library development. The homology between contig sequences generated in each pooled library and mapped EST sequences was studied. The number of contigs assigned on chromosome 3H was 372. Their lengths ranged from 1,230 bp to 58,322 bp with an average of 14,891 bp. Of these contigs, 240 showed homology and colinearity with the genome sequence of rice chromosome 1. A contig annotation browser supplemented with query search by unique sequence or genetic map position was developed. The identified contigs can be annotated with barley cDNAs and reference sequences on the browser. Homology analysis of these contigs with rice genes indicated that 1,239 rice genes can be assigned to barley contigs by the simple comparison of sequence lengths in both species. Of these genes, 492 are assigned to rice chromosome 1. Conclusions: We demonstrate the efficiency of sequencing gene rich regions from barley chromosome 3H, with special reference to syntenic relationships with rice chromosome 1.

  3. Antibodies against homologous microbial caseinolytic proteases P characterise primary biliary cirrhosis.

    Science.gov (United States)

    Bogdanos, Dimitrios-Petrou; Baum, Harold; Sharma, Umesh C; Grasso, Alessandro; Ma, Yun; Burroughs, Andrew K; Vergani, Diego

    2002-01-01

    Antibodies to caseinolytic protease P(177-194) (ClpP(177-194)) of the proteolytic subunit of the Clp complex of Escherichia coli (E. coli) are uniquely present in primary biliary cirrhosis (PBC). Molecular mimicry between the regulatory subunit ClpX and the principal T-cell epitope of pyruvate dehydrogenase complex (PDC-E2) in PBC, has been proposed to account for this. Since ClpP is highly conserved among bacteria we investigated whether the micro-organisms triggering these antibodies may be other than E. coli. E. coli ClpP(177-194) is homologous with ClpP peptides of Yersinia enterocolitica (YEREN) and Haemophilus influenzae (HAEIN). Enzyme linked immunosorbent assay (ELISA) reactivity to these peptides was tested in 45 patients with PBC, 44 pathological and 32 healthy controls. Reactivity to at least one of the ClpP peptides was observed in 21 (47%) PBC patients, 5.8% pathological and 3.1% healthy controls (PECOLI ClpP(177-194), alone or in association with YEREN and/or HAEIN peptides, compared to three (14.2%) reactive with YEREN, two (9.5%) with YEREN/HAEIN and one (4.7%) with HAEIN peptide. Simultaneous reactivity to homologous sequences was due to cross-reactivity as confirmed by competition ELISAs. The PBC-specificity of anti-microbial ClpP reactivity is confirmed: the questions as to primary trigger(s) and relevance to PBC pathogenesis remain open.

  4. Evolutionary conservation of nuclear and nucleolar targeting sequences in yeast ribosomal protein S6A

    International Nuclear Information System (INIS)

    Lipsius, Edgar; Walter, Korden; Leicher, Torsten; Phlippen, Wolfgang; Bisotti, Marc-Angelo; Kruppa, Joachim

    2005-01-01

    Over 1 billion years ago, the animal kingdom diverged from the fungi. Nevertheless, a high sequence homology of 62% exists between human ribosomal protein S6 and S6A of Saccharomyces cerevisiae. To investigate whether this similarity in primary structure is mirrored in corresponding functional protein domains, the nuclear and nucleolar targeting signals were delineated in yeast S6A and compared to the known human S6 signals. The complete sequence of S6A and cDNA fragments was fused to the 5'-end of the LacZ gene, the constructs were transiently expressed in COS cells, and the subcellular localization of the fusion proteins was detected by indirect immunofluorescence. One bipartite and two monopartite nuclear localization signals as well as two nucleolar binding domains were identified in yeast S6A, which are located at homologous regions in human S6 protein. Remarkably, the number, nature, and position of these targeting signals have been conserved, although their amino acid sequences have presumably undergone a process of co-evolution with their corresponding rRNAs.

  5. Homologous Recombination between Genetically Divergent Campylobacter fetus Lineages Supports Host-Associated Speciation

    Science.gov (United States)

    Duim, Birgitta; van der Graaf-van Bloois, Linda; Wagenaar, Jaap A; Zomer, Aldert L

    2018-01-01

    Homologous recombination is a major driver of bacterial speciation. Genetic divergence and host association are important factors influencing homologous recombination. Here, we study these factors for Campylobacter fetus, which shows a distinct intraspecific host dichotomy. Campylobacter fetus subspecies fetus (Cff) and venerealis are associated with mammals, whereas C. fetus subsp. testudinum (Cft) is associated with reptiles. Recombination between these genetically divergent C. fetus lineages is extremely rare. Previously it was impossible to show whether this barrier to recombination was determined by the differential host preferences, by the genetic divergence between both lineages or by other factors influencing recombination, such as restriction-modification, CRISPR/Cas, and transformation systems. Fortuitously, a distinct C. fetus lineage (ST69) was found, which was highly related to mammal-associated C. fetus, yet isolated from a chelonian. The whole genome sequences of two C. fetus ST69 isolates were compared with those of mammal- and reptile-associated C. fetus strains for phylogenetic and recombination analysis. In total, 5.1–5.5% of the core genome of both ST69 isolates showed signs of recombination. Of the predicted recombination regions, 80.4% were most closely related to Cft, 14.3% to Cff, and 5.6% to C. iguaniorum. Recombination from C. fetus ST69 to Cft was also detected, but to a lesser extent and only in chelonian-associated Cft strains. This study shows that despite substantial genetic divergence no absolute barrier to homologous recombination exists between two distinct C. fetus lineages when occurring in the same host type, which provides valuable insights into bacterial speciation and evolution. PMID:29608720

  6. Homologous Recombination and Xylella fastidiosa Host-Pathogen Associations in South America.

    Science.gov (United States)

    Coletta-Filho, Helvécio D; Francisco, Carolina S; Lopes, João R S; Muller, Christiane; Almeida, Rodrigo P P

    2017-03-01

    Homologous recombination affects the evolution of bacteria such as Xylella fastidiosa, a naturally competent plant pathogen that requires insect vectors for dispersal. This bacterial species is taxonomically divided into subspecies, with phylogenetic clusters within subspecies that are host specific. One subspecies, pauca, is primarily limited to South America, with the exception of recently reported strains in Europe and Costa Rica. Despite the economic importance of X. fastidiosa subsp. pauca in South America, little is known about its genetic diversity. Multilocus sequence typing (MLST) has previously identified six sequence types (ST) among plant samples collected in Brazil (both subsp. pauca and multiplex). Here, we report on a survey of X. fastidiosa genetic diversity (MLST based) performed in six regions in Brazil and two in Argentina, by sampling five different plant species. In addition to the six previously reported ST, seven new subsp. pauca and two new subsp. multiplex ST were identified. The presence of subsp. multiplex in South America is considered to be the consequence of a single introduction from its native range in North America more than 80 years ago. Different phylogenetic approaches clustered the South American ST into four groups, with strains infecting citrus (subsp. pauca); coffee and olive (subsp. pauca); coffee, hibiscus, and plum (subsp. pauca); and plum (subsp. multiplex). In areas where these different genetic clusters occurred sympatrically, we found evidence of homologous recombination in the form of bidirectional allelic exchange between subspp. pauca and multiplex. In fact, the only strain of subsp. pauca isolated from a plum host had an allele that originated from subsp. multiplex. These signatures of bidirectional homologous recombination between endemic and introduced ST indicate that gene flow occurs in short evolutionary time frames in X. fastidiosa, despite the ecological isolation (i.e., host plant species) of genotypes.

  7. Cloning of an E. coli RecA and yeast RAD51 homolog, radA, an allele of the uvsC in Aspergillus nidulans and its mutator effects.

    Science.gov (United States)

    Seong, K Y; Chae, S K; Kang, H S

    1997-04-30

    An E. coli RecA and yeast RAD51 homolog from Aspergillus nidulans, radA, has been cloned by screening genomic and cDNA libraries with a PCR-amplified probe. This probe was generated using primers carrying the conserved sequences of eukaryotic RecA homologs. The deduced amino acid sequence revealed two conserved Walker-A and -B type nucleotide-binding domains and exhibited 88%, 60%, and 53% identity with Mei-3 of Neurospora crassa, rhp51+ of Schizosaccharomyces pombe, and Rad51 of Saccharomyces cerevisiae, respectively. radA null mutants constructed by replacing the whole coding region with a selection marker showed high methyl methanesulfonate (MMS) sensitivity. Heterozygous diploids of radA disruptant with the uvsC114 mutant failed to complement with respect to MMS-sensitivity, indicating that radA is an allele of uvsC. In selecting spontaneous forward selenate resistant mutations, mutator effects were observed in radA null mutants similarly to those shown in uvsC114 mutant strains.

  8. Geometric homology revisited

    OpenAIRE

    Ruffino, Fabio Ferrari

    2013-01-01

    Given a cohomology theory, there is a well-known abstract way to define the dual homology theory using the theory of spectra. In [4] the author provides a more geometric construction of the homology theory, using a generalization of the bordism groups. Such a generalization involves in its definition the vector bundle modification, which is a particular case of the Gysin map. In this paper we provide a more natural variant of that construction, which replaces the vector bundle modification wi...

  9. TurboFold: Iterative probabilistic estimation of secondary structures for multiple RNA sequences

    Directory of Open Access Journals (Sweden)

    Sharma Gaurav

    2011-04-01

    Full Text Available. Abstract. Background: The prediction of secondary structure, i.e. the set of canonical base pairs between nucleotides, is a first step in developing an understanding of the function of an RNA sequence. The most accurate computational methods predict conserved structures for a set of homologous RNA sequences. These methods usually suffer from high computational complexity. In this paper, TurboFold, a novel and efficient method for secondary structure prediction for multiple RNA sequences, is presented. Results: TurboFold takes, as input, a set of homologous RNA sequences and outputs estimates of the base pairing probabilities for each sequence. The base pairing probabilities for a sequence are estimated by combining intrinsic information, derived from the sequence itself via the nearest neighbor thermodynamic model, with extrinsic information, derived from the other sequences in the input set. For a given sequence, the extrinsic information is computed by using pairwise-sequence-alignment-based probabilities for co-incidence with each of the other sequences, along with estimated base pairing probabilities, from the previous iteration, for the other sequences. The extrinsic information is introduced as free energy modifications for base pairing in a partition function computation based on the nearest neighbor thermodynamic model. This process yields updated estimates of base pairing probability. The updated base pairing probabilities in turn are used to recompute extrinsic information, resulting in the overall iterative estimation procedure that defines TurboFold. TurboFold is benchmarked on a number of ncRNA datasets and compared against alternative secondary structure prediction methods. The iterative procedure in TurboFold is shown to improve estimates of base pairing probability with each iteration, though only small gains are obtained beyond three iterations. Secondary structures composed of base pairs with estimated probabilities higher than a

  10. Feature Selection and the Class Imbalance Problem in Predicting Protein Function from Sequence

    NARCIS (Netherlands)

    Al-Shahib, A.; Breitling, R.; Gilbert, D.

    2005-01-01

    Abstract: When the standard approach to predict protein function by sequence homology fails, other alternative methods can be used that require only the amino acid sequence for predicting function. One such approach uses machine learning to predict protein function directly from amino acid sequence

  11. The PLAC1-homology region of the ZP domain is sufficient for protein polymerisation

    Directory of Open Access Journals (Sweden)

    Litscher Eveline S

    2006-04-01

    Full Text Available. Abstract. Background: Hundreds of extracellular proteins polymerise into filaments and matrices by using zona pellucida (ZP) domains. ZP domain proteins perform highly diverse functions, ranging from structural to receptorial, and mutations in their genes are responsible for a number of severe human diseases. Recently, PLAC1, Oosp1-3, Papillote and CG16798 proteins were identified that share sequence homology with the N-terminal half of the ZP domain (ZP-N), but not with its C-terminal half (ZP-C). The functional significance of this partial conservation is unknown. Results: By exploiting a highly engineered bacterial strain, we expressed in soluble form the PLAC1-homology region of mammalian sperm receptor ZP3 as a fusion to maltose binding protein. Mass spectrometry showed that the 4 conserved Cys residues within the ZP-N moiety of the fusion protein adopt the same disulfide bond connectivity as in full-length native ZP3, indicating that it is correctly folded, and electron microscopy and biochemical analyses revealed that it assembles into filaments. Conclusion: These findings provide a function for PLAC1-like proteins and, by showing that ZP-N is a biologically active folding unit, prompt a re-evaluation of the architecture of the ZP domain and its polymers. Furthermore, they suggest that ZP-C might play a regulatory role in the assembly of ZP domain protein complexes.

  12. Protein remote homology detection based on bidirectional long short-term memory.

    Science.gov (United States)

    Li, Shumin; Chen, Junjie; Liu, Bin

    2017-10-10

    Protein remote homology detection plays a vital role in studies of protein structures and functions. Almost all of the traditional machine learning methods require fixed-length features to represent the protein sequences. However, it is never an easy task to extract discriminative features with limited knowledge of proteins. On the other hand, deep learning techniques have demonstrated their advantage in automatically learning representations. It is worthwhile to explore the application of deep learning techniques to protein remote homology detection. In this study, we employ the Bidirectional Long Short-Term Memory (BLSTM) to learn effective features from pseudo proteins, and propose a predictor called ProDec-BLSTM, which comprises an input layer, a bidirectional LSTM, a time-distributed dense layer and an output layer. This neural network can automatically extract discriminative features by using the bidirectional LSTM and the time-distributed dense layer. Experimental results on a widely used benchmark dataset show that ProDec-BLSTM outperforms other related methods in terms of both the mean ROC and mean ROC50 scores. This promising result shows that ProDec-BLSTM is a useful tool for protein remote homology detection. Furthermore, the hidden patterns learnt by ProDec-BLSTM can be interpreted and visualized, and therefore additional useful information can be obtained.

  13. Management of High-Throughput DNA Sequencing Projects: Alpheus.

    Science.gov (United States)

    Miller, Neil A; Kingsmore, Stephen F; Farmer, Andrew; Langley, Raymond J; Mudge, Joann; Crow, John A; Gonzalez, Alvaro J; Schilkey, Faye D; Kim, Ryan J; van Velkinburgh, Jennifer; May, Gregory D; Black, C Forrest; Myers, M Kathy; Utsey, John P; Frost, Nicholas S; Sugarbaker, David J; Bueno, Raphael; Gullans, Stephen R; Baxter, Susan M; Day, Steve W; Retzel, Ernest F

    2008-12-26

    High-throughput DNA sequencing has enabled systems biology to begin to address areas in health, agricultural and basic biological research. Concomitant with the opportunities is an absolute necessity to manage significant volumes of high-dimensional and inter-related data and analysis. Alpheus is an analysis pipeline, database and visualization software for use with massively parallel DNA sequencing technologies that feature multi-gigabase throughput characterized by relatively short reads, such as Illumina-Solexa (sequencing-by-synthesis), Roche-454 (pyrosequencing) and Applied Biosystem's SOLiD (sequencing-by-ligation). Alpheus enables alignment to reference sequence(s), detection of variants and enumeration of sequence abundance, including expression levels in transcriptome sequence. Alpheus is able to detect several types of variants, including non-synonymous and synonymous single nucleotide polymorphisms (SNPs), insertions/deletions (indels), premature stop codons, and splice isoforms. Variant detection is aided by the ability to filter variant calls based on consistency, expected allele frequency, sequence quality, coverage, and variant type in order to minimize false positives while maximizing the identification of true positives. Alpheus also enables comparisons of genes with variants between cases and controls or bulk segregant pools. Sequence-based differential expression comparisons can be developed, with data export to SAS JMP Genomics for statistical analysis.
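    The variant-call filtering described in the abstract above can be illustrated with a minimal Python sketch. It is not Alpheus code: the `VariantCall` record, the `filter_calls` function, and every threshold value are hypothetical names and numbers chosen only to show filtering on coverage, sequence quality, allele frequency, and variant type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical record for one candidate variant (not an Alpheus type)."""
    position: int
    kind: str                # e.g. "SNP" or "indel"
    coverage: int            # number of reads spanning the site
    mean_quality: float      # mean base quality of the supporting reads
    allele_frequency: float  # fraction of reads supporting the variant


def filter_calls(calls, min_coverage=8, min_quality=20.0,
                 min_frequency=0.2, kinds=("SNP", "indel")):
    """Keep calls that pass every threshold; all thresholds are illustrative."""
    return [
        c for c in calls
        if c.coverage >= min_coverage
        and c.mean_quality >= min_quality
        and c.allele_frequency >= min_frequency
        and c.kind in kinds
    ]


calls = [
    VariantCall(101, "SNP", 30, 35.0, 0.48),    # passes all filters
    VariantCall(250, "SNP", 3, 38.0, 1.00),     # rejected: low coverage
    VariantCall(512, "indel", 22, 12.0, 0.50),  # rejected: low quality
    VariantCall(733, "indel", 15, 30.0, 0.40),  # passes all filters
]
kept = filter_calls(calls)
print([c.position for c in kept])
```

    Raising the thresholds trades sensitivity for fewer false positives, which is the balance the abstract refers to.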

  14. Homologous Recombination—Experimental Systems, Analysis and Significance

    Science.gov (United States)

    Kuzminov, Andrei

    2014-01-01

    Homologous recombination is the most complex of all recombination events that shape genomes and produce material for evolution. Homologous recombination events are exchanges between DNA molecules in the lengthy regions of shared identity, catalyzed by a group of dedicated enzymes. There is a variety of experimental systems in E. coli and Salmonella to detect homologous recombination events of several different kinds. Genetic analysis of homologous recombination reveals three separate phases of this process: pre-synapsis (the early phase), synapsis (homologous strand exchange) and post-synapsis (the late phase). In E. coli, there are at least two independent pathways of the early phase and at least two independent pathways of the late phase. All this complexity is incongruent with the originally ascribed role of homologous recombination as an accelerator of genome evolution: there is simply not enough duplication and repetition in enterobacterial genomes for homologous recombination to have a detectable evolutionary role, and therefore not enough selection to maintain such complexity. At the same time, the mechanisms of homologous recombination are uniquely suited for repair of complex DNA lesions called chromosomal lesions. In fact, the two major classes of chromosomal lesions are recognized and processed by the two individual pathways at the early phase of homologous recombination. It follows, therefore, that homologous recombination events are occasional reflections of the continual recombinational repair, made possible in cases of natural or artificial genome redundancy. PMID:26442506

  15. A PHF8 homolog in C. elegans promotes DNA repair via homologous recombination.

    Directory of Open Access Journals (Sweden)

    Changrim Lee

    Full Text Available. PHF8 is a JmjC domain-containing histone demethylase, defects in which are associated with X-linked mental retardation. In this study, we examined the roles of two PHF8 homologs, JMJD-1.1 and JMJD-1.2, in the model organism C. elegans in response to DNA damage. A deletion mutation in either of the genes led to hypersensitivity to interstrand DNA crosslinks (ICLs), while only mutation of jmjd-1.1 resulted in hypersensitivity to double-strand DNA breaks (DSBs). In response to ICLs, JMJD-1.1 did not affect the focus formation of FCD-2, a homolog of FANCD2, a key protein in the Fanconi anemia pathway. However, the dynamic behavior of RPA-1 and RAD-51 was affected by the mutation: the accumulations of both proteins at ICLs appeared normal, but their subsequent disappearance was retarded, suggesting that later steps of homologous recombination were defective. Similar changes in the dynamic behavior of RPA-1 and RAD-51 were seen in response to DSBs, supporting a role of JMJD-1.1 in homologous recombination. Such a role was also supported by our finding that the hypersensitivity of jmjd-1.1 worms to ICLs was rescued by knockdown of lig-4, a homolog of Ligase 4 active in nonhomologous end-joining. The hypersensitivity of jmjd-1.1 worms to ICLs was increased by rad-54 knockdown, suggesting that JMJD-1.1 acts in parallel with RAD-54 in modulating chromatin structure. Indeed, the level of histone H3 Lys9 tri-methylation, a marker of heterochromatin, was higher in jmjd-1.1 cells than in wild-type cells. We conclude that the histone demethylase JMJD-1.1 influences homologous recombination either by relaxing heterochromatin structure or by indirectly regulating the expression of multiple genes affecting DNA repair.

  16. Using SQL Databases for Sequence Similarity Searching and Analysis.

    Science.gov (United States)

    Pearson, William R; Mackey, Aaron J

    2017-09-13

    Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc.
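    To make the idea of loading similarity search results into a relational database concrete, here is a small sketch using Python's built-in sqlite3 module. The `protein` and `hit` tables, their columns, the example rows, and the E-value cutoff are all assumptions made for illustration; they are not the actual seqdb_demo or search_demo schemas, which the abstract does not describe.

```python
import sqlite3


def build_demo(conn):
    """Create two hypothetical tables and load a handful of made-up rows."""
    conn.executescript("""
        CREATE TABLE protein (id INTEGER PRIMARY KEY, name TEXT, taxon TEXT);
        CREATE TABLE hit (query_id INTEGER, subject_id INTEGER, evalue REAL);
    """)
    conn.executemany("INSERT INTO protein VALUES (?, ?, ?)", [
        (1, "RecA", "Escherichia coli"),
        (2, "LacZ", "Escherichia coli"),
        (3, "Rad51", "Saccharomyces cerevisiae"),
        (4, "RadA", "Methanococcus voltae"),
        (5, "ExampleX", "Homo sapiens"),
    ])
    conn.executemany("INSERT INTO hit VALUES (?, ?, ?)", [
        (1, 3, 1e-40),  # strong hit
        (1, 4, 1e-35),  # strong hit
        (2, 5, 0.5),    # weak hit, filtered out by a strict cutoff
    ])


def homologs_by_taxon(conn, max_evalue=1e-6):
    """Count distinct query proteins with a significant hit in each taxon."""
    return conn.execute("""
        SELECT s.taxon, COUNT(DISTINCT h.query_id)
        FROM hit AS h
        JOIN protein AS s ON s.id = h.subject_id
        WHERE h.evalue <= ?
        GROUP BY s.taxon
        ORDER BY s.taxon
    """, (max_evalue,)).fetchall()


conn = sqlite3.connect(":memory:")
build_demo(conn)
for taxon, n_queries in homologs_by_taxon(conn):
    print(taxon, n_queries)
```

    Once hits are stored this way, summaries across organisms reduce to a join and a GROUP BY, which is the kind of large-scale question the unit describes.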

  17. Phylo-mLogo: an interactive and hierarchical multiple-logo visualization tool for alignment of many sequences

    Directory of Open Access Journals (Sweden)

    Lee DT

    2007-02-01

    Full Text Available. Abstract. Background: When aligning several hundreds or thousands of sequences, such as epidemic virus sequences or homologous/orthologous sequences of some big gene families, to reconstruct the epidemiological history or their phylogenies, how to analyze and visualize the alignment results of many sequences has become a new challenge for computational biologists. Although there are several tools available for visualization of very long sequence alignments, few of them are applicable to the alignments of many sequences. Results: A multiple-logo alignment visualization tool, called Phylo-mLogo, is presented in this paper. Phylo-mLogo calculates the variabilities and homogeneities of alignment sequences by base frequencies or entropies. Different from the traditional representations of sequence logos, Phylo-mLogo not only displays the global logo patterns of the whole alignment of multiple sequences, but also demonstrates their local homologous logos for each clade hierarchically. In addition, Phylo-mLogo also allows the user to focus only on the analysis of some important, structurally or functionally constrained sites in the alignment selected by the user or by built-in automatic calculation. Conclusion: With Phylo-mLogo, the user can symbolically and hierarchically visualize hundreds of aligned sequences simultaneously and easily check the changes of their amino acid sites when analyzing many homologous/orthologous or influenza virus sequences. More information on Phylo-mLogo can be found at URL http://biocomp.iis.sinica.edu.tw/phylomlogo.
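    The per-site variability measure mentioned above (base frequencies or entropies) can be sketched as a Shannon entropy computed for each alignment column. This is a generic illustration, not Phylo-mLogo's implementation; the function name `column_entropies` and the choice to ignore gap characters are assumptions.

```python
import math
from collections import Counter


def column_entropies(alignment):
    """Shannon entropy in bits for each column of an alignment.

    `alignment` is a list of equal-length strings; '-' marks a gap and is
    ignored (an assumption; other gap treatments are possible).
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    entropies = []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c != "-")
        total = sum(counts.values())
        entropy = 0.0
        for n in counts.values():
            p = n / total
            entropy -= p * math.log2(p)
        entropies.append(entropy)
    return entropies


toy_alignment = ["ACGT", "ACGA", "ACCA", "AC-A"]
print([round(h, 3) for h in column_entropies(toy_alignment)])
```

    A fully conserved column scores 0 bits, and the score grows as residues become more evenly mixed, so low-entropy sites are the conserved ones a logo would emphasize.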

  18. Complete cDNA sequence of human complement C1s and close physical linkage of the homologous genes C1s and C1r

    International Nuclear Information System (INIS)

    Tosi, M.; Duponchel, C.; Meo, T.; Julier, C.

    1987-01-01

    Overlapping molecular clones encoding the complement subcomponent C1s were isolated from a human liver cDNA library. The nucleotide sequence reconstructed from these clones spans about 85% of the length of the liver C1s messenger RNAs, which occur in three distinct size classes around 3 kilobases in length. Comparisons with the sequence of C1r, the other enzymatic subcomponent of C1, reveal 40% amino acid identity and conservation of all the cysteine residues. Besides the serine protease domain, the following sequence motifs, previously described in C1r, were also found in C1s: (a) two repeats of the type found in the Ba fragment of complement factor B and in several other complement but also noncomplement proteins, (b) a cysteine-rich segment homologous to the repeats of epidermal growth factor precursor, and (c) a duplicated segment found only in C1r and C1s. Differences in each of these structural motifs provide significant clues for the interpretation of the functional divergence of these interacting serine protease zymogens. Hybridizations of C1r and C1s probes to restriction endonuclease fragments of genomic DNA demonstrate close physical linkage of the corresponding genes. The implications of this finding are discussed with respect to the evolution of C1r and C1s after their origin by tandem gene duplication and to the previously observed combined hereditary deficiencies of C1r and C1s.

  19. Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

    Science.gov (United States)

    Mackey, Aaron J; Pearson, William R

    2004-10-01

    Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.

  20. Direct selection of expressed sequences on a YAC clone revealed proline-rich-like genes and BARE-1 sequences physically linked to the complex Mla powdery mildew resistance locus of barley (Hordeum vulgare L.)

    DEFF Research Database (Denmark)

    Schwarz, G.; Michalek, W.; Jahoor, A.

    2002-01-01

    homology to the copia-like retroelement BARE-1 of barley, putatively involved in evolution of disease resistance loci. The high degree of clones representing barley rRNA sequences or false positives is a major disadvantage of direct selection of cDNAs in barley. (C) 2002 Elsevier Science Ireland Ltd. All...... gene. Of 22 selected cDNA clones, six were re-located on the YAC by southern analysis. Two of these clones are predicted to encode members of the hydroxyproline-rich glycoprotein and proline-rich protein gene families which have been implicated in plant defense response. Four sequences showed high...

  1. Lectures on homology with internal symmetries

    International Nuclear Information System (INIS)

    Solovyov, Yu.

    1993-09-01

    Homology with internal symmetries is a natural generalization of cyclic homology introduced, independently, by Connes and Tsygan, which has turned out to be a very useful tool in a number of problems of algebra, geometry, topology, analysis and mathematical physics. It suffices to say that cyclic homology and cohomology are successfully applied in the index theory of elliptic operators on foliations, in the description of the homotopy type of pseudoisotopy spaces, and in the theory of characteristic classes in algebraic K-theory. They are also applied in noncommutative differential geometry and in the cohomology of Lie algebras, the branches of mathematics which brought them to life in the first place. Essentially, we consider dihedral homology, which was successfully applied for the description of the homology type of groups of homeomorphisms and diffeomorphisms of simply connected manifolds. (author). 27 refs

  2. Nucleotide sequences of two genomic DNAs encoding peroxidase of Arabidopsis thaliana.

    Science.gov (United States)

    Intapruk, C; Higashimura, N; Yamamoto, K; Okada, N; Shinmyo, A; Takano, M

    1991-02-15

    The peroxidase (EC 1.11.1.7)-encoding gene of Arabidopsis thaliana was screened from a genomic library using a cDNA encoding a neutral isozyme of horseradish, Armoracia rusticana, peroxidase (HRP) as a probe, and two positive clones were isolated. From the comparison with the sequences of the HRP-encoding genes, we concluded that two clones contained peroxidase-encoding genes, and they were named prxCa and prxEa. Both genes consisted of four exons and three introns; the introns had consensus nucleotides, GT and AG, at the 5' and 3' ends, respectively. The lengths of each putative exon of the prxEa gene were the same as those of the HRP-basic-isozyme-encoding gene, prxC3, and coded for 349 amino acids (aa) with a sequence homology of 89% to that encoded by prxC3. The prxCa gene was very close to the HRP-neutral-isozyme-encoding gene, prxC1b, and coded for 354 aa with 91% homology to that encoded by prxC1b. The aa sequence homology was 64% between the two peroxidases encoded by prxCa and prxEa.
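    Percentages such as the 89%, 91% and 64% quoted above are typically obtained by counting matching positions in a pairwise alignment. The sketch below assumes that the abstract's "homology" figures are percent identities and that gapped columns are excluded from the denominator; neither convention is stated in the abstract, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must already be aligned (equal length). Excluding gapped
    columns is one of several conventions in use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned peptides (made up for illustration): 4 of 5 ungapped columns match.
print(percent_identity("MKV-LA", "MRVQLA"))
```

    Because the denominator convention changes the result, two reports of "percent homology" are only comparable when they use the same alignment and the same counting rule.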

  3. Improving model construction of profile HMMs for remote homology detection through structural alignment

    Directory of Open Access Journals (Sweden)

    Zaverucha Gerson

    2007-11-01

    Full Text Available. Abstract. Background: Remote homology detection is a challenging problem in Bioinformatics. Arguably, profile Hidden Markov Models (pHMMs) are one of the most successful approaches in addressing this important problem. pHMM packages present a relatively small computational cost, and perform particularly well at recognizing remote homologies. This raises the question of whether structural alignments could impact the performance of pHMMs trained from proteins in the Twilight Zone, as structural alignments are often more accurate than sequence alignments at identifying motifs and functional residues. Next, we assess the impact of using structural alignments in pHMM performance. Results: We used the SCOP database to perform our experiments. Structural alignments were obtained using the 3DCOFFEE and MAMMOTH-mult tools; sequence alignments were obtained using CLUSTALW, TCOFFEE, MAFFT and PROBCONS. We performed leave-one-family-out cross-validation over super-families. Performance was evaluated through ROC curves and paired two tailed t-test. Conclusion: We observed that pHMMs derived from structural alignments performed significantly better than pHMMs derived from sequence alignment in low-identity regions, mainly below 20%. We believe this is because structural alignment tools are better at focusing on the important patterns that are more often conserved through evolution, resulting in higher quality pHMMs. On the other hand, sensitivity of these tools is still quite low for these low-identity regions. Our results suggest a number of possible directions for improvements in this area.

  4. Improving model construction of profile HMMs for remote homology detection through structural alignment.

    Science.gov (United States)

    Bernardes, Juliana S; Dávila, Alberto M R; Costa, Vítor S; Zaverucha, Gerson

    2007-11-09

    Remote homology detection is a challenging problem in Bioinformatics. Arguably, profile Hidden Markov Models (pHMMs) are one of the most successful approaches in addressing this important problem. pHMM packages present a relatively small computational cost, and perform particularly well at recognizing remote homologies. This raises the question of whether structural alignments could impact the performance of pHMMs trained from proteins in the Twilight Zone, as structural alignments are often more accurate than sequence alignments at identifying motifs and functional residues. Next, we assess the impact of using structural alignments in pHMM performance. We used the SCOP database to perform our experiments. Structural alignments were obtained using the 3DCOFFEE and MAMMOTH-mult tools; sequence alignments were obtained using CLUSTALW, TCOFFEE, MAFFT and PROBCONS. We performed leave-one-family-out cross-validation over super-families. Performance was evaluated through ROC curves and paired two tailed t-test. We observed that pHMMs derived from structural alignments performed significantly better than pHMMs derived from sequence alignment in low-identity regions, mainly below 20%. We believe this is because structural alignment tools are better at focusing on the important patterns that are more often conserved through evolution, resulting in higher quality pHMMs. On the other hand, sensitivity of these tools is still quite low for these low-identity regions. Our results suggest a number of possible directions for improvements in this area.

  5. Compositional Homology and Creative Thinking

    Directory of Open Access Journals (Sweden)

    Salvatore Tedesco

    2015-05-01

    Full Text Available. The concept of homology is the most solid theoretical basis developed by morphological thinking over its history. The identification of some general criteria for the interpretation of homology is today a fundamental tool for the life sciences, and for reopening them to the question of qualitative innovation that arose so powerfully in the original Darwinian project. The aim of this paper is to examine the possible uses of the concept of compositional homology in order to provide an adequate understanding of the dynamics of creative thinking.

  6. Rational Homological Stability for Automorphisms of Manifolds

    DEFF Research Database (Denmark)

    Grey, Matthias

    In this thesis we prove rational homological stability for the classifying spaces of the homotopy automorphisms and block diffeomorphisms of iterated connected sums of products of spheres of a certain connectivity. The results in particular apply to the manifolds Npg,q = (#g(Sp × Sq)) - int...... with coefficients in the homology of the universal covering, which is studied using rational homology theory. The result for the block diffeomorphisms is deduced from the homological stability for the homotopy automorphisms upon using surgery theory. The main theorems of this thesis extend the homological stability...

  7. Homology with vesicle fusion mediator syntaxin-1a predicts determinants of epimorphin/syntaxin-2 function in mammary epithelial morphogenesis

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Connie S.; Nelson, Celeste M.; Khauv, Davitte; Bennett, Simone; Radisky, Evette S.; Hirai, Yohei; Bissell, Mina J.; Radisky, Derek C.

    2009-06-03

    We have shown that branching morphogenesis of mammary ductal structures requires the action of the morphogen epimorphin/syntaxin-2. Epimorphin, originally identified as an extracellular molecule, is identical to syntaxin-2, an intracellular molecule that is a member of the extensively investigated syntaxin family of proteins that mediate vesicle trafficking. We show here that although epimorphin/syntaxin-2 is highly homologous to syntaxin-1a, only epimorphin/syntaxin-2 can stimulate mammary branching morphogenesis. We construct a homology model of epimorphin/syntaxin-2 based on the published structure of syntaxin-1a, and we use this model to identify the structural motif responsible for the morphogenic activity. We identify four residues located within the cleft between helices B and C that differ between syntaxin-1a and epimorphin/syntaxin-2; through site-directed mutagenesis of these four amino acids, we confer the properties of epimorphin for cell adhesion, gene activation, and branching morphogenesis onto the inactive syntaxin-1a template. These results provide a dramatic demonstration of the use of structural information about one molecule to define a functional motif of a second molecule that is related at the sequence level but highly divergent functionally.

  8. Recovery of arrested replication forks by homologous recombination is error-prone.

    Directory of Open Access Journals (Sweden)

    Ismail Iraqui

    Full Text Available. Homologous recombination is a universal mechanism that allows repair of DNA and provides support for DNA replication. Homologous recombination is therefore a major pathway that suppresses non-homology-mediated genome instability. Here, we report that recovery of impeded replication forks by homologous recombination is error-prone. Using a fork-arrest-based assay in fission yeast, we demonstrate that a single collapsed fork can cause mutations and large-scale genomic changes, including deletions and translocations. Fork-arrest-induced gross chromosomal rearrangements are mediated by inappropriate ectopic recombination events at the site of collapsed forks. Inverted repeats near the site of fork collapse stimulate large-scale genomic changes up to 1,500 times over spontaneous events. We also show that the high accuracy of DNA replication during S-phase is impaired by impediments to fork progression, since fork-arrest-induced mutation is due to erroneous DNA synthesis during recovery of replication forks. The mutations caused are small insertions/duplications between short tandem repeats (micro-homology) indicative of replication slippage. Our data establish that collapsed forks, but not stalled forks, recovered by homologous recombination are prone to replication slippage. The inaccuracy of DNA synthesis does not rely on PCNA ubiquitination or trans-lesion-synthesis DNA polymerases, and it is not counteracted by mismatch repair. We propose that deletions/insertions, mediated by micro-homology, leading to copy number variations during replication stress may arise by progression of error-prone replication forks restarted by homologous recombination.

  9. Molecular cloning of cDNAs which are highly overexpressed in mitoxantrone-resistant cells

    DEFF Research Database (Denmark)

    Miyake, K; Mickley, L; Litman, Thomas

    1999-01-01

    mitoxantrone-resistant S1-M1-80 human colon carcinoma cells was screened by differential hybridization. Two cDNAs of different lengths were isolated and designated MXR1 and MXR2. Sequencing revealed a high degree of homology for the cDNAs with Expressed Sequence Tag sequences previously identified as belonging to an ATP binding cassette transporter. Homology to the Drosophila white gene and its homologues was found for the predicted amino acid sequence. Using either cDNA as a probe in a Northern analysis demonstrated high levels of expression in the S1-M1-80 cells and in the human breast cancer subline, MCF-7 AdVp3000. Levels were lower in earlier steps of selection, and in partial revertants. The gene is amplified 10-12-fold in the MCF-7 AdVp3000 cells, but not in the S1-M1-80 cells. These studies are consistent with the identification of a new ATP binding cassette transporter, which is overexpressed...

  10. Homologous gene targeting of a carotenoids biosynthetic gene in Rhodosporidium toruloides by Agrobacterium-mediated transformation.

    Science.gov (United States)

    Sun, Wenyi; Yang, Xiaobing; Wang, Xueying; Lin, Xinping; Wang, Yanan; Zhang, Sufang; Luan, Yushi; Zhao, Zongbao K

    2017-07-01

    To target a carotenoid biosynthetic gene in the oleaginous yeast Rhodosporidium toruloides by using the Agrobacterium-mediated transformation (AMT) method. The RHTO_04602 locus of R. toruloides NP11, previously assigned to code the carotenoid biosynthetic gene CRTI, was amplified from genomic DNA and cloned into the binary plasmid pZPK-mcs, resulting in pZPK-CRT. A HYG-expression cassette was inserted into the CRTI sequence of pZPK-CRT by utilizing the restriction-free cloning strategy. The resulting plasmid was used to transform R. toruloides cells according to the AMT method, leading to a few white transformants. Sequencing analysis of those transformants confirmed homologous recombination and insertional inactivation of CRTI. When the white variants were transformed with a CRTI-expression cassette, cells became red and produced carotenoids as did the wild-type strain NP11. Successful homologous targeting of the CrtI locus confirmed the function of RHTO_04602 in carotenoid biosynthesis in R. toruloides. It provided valuable information for metabolic engineering of this non-model yeast species.

  11. Reconstruction of ancestral RNA sequences under multiple structural constraints.

    Science.gov (United States)

    Tremblay-Savard, Olivier; Reinharz, Vladimir; Waldispühl, Jérôme

    2016-11-11

    Secondary structures form the scaffold of multiple sequence alignment of non-coding RNA (ncRNA) families. An accurate reconstruction of ancestral ncRNAs must use this structural signal. However, the inference of ancestors of a single ncRNA family with a single consensus structure may bias the results towards sequences with high affinity to this structure, which are far from the true ancestors. In this paper, we introduce achARNement, a maximum parsimony approach that, given two alignments of homologous ncRNA families with consensus secondary structures and a phylogenetic tree, simultaneously calculates ancestral RNA sequences for these two families. We test our methodology on simulated data sets, and show that achARNement outperforms classical maximum parsimony approaches in terms of accuracy, but also reduces by several orders of magnitude the number of candidate sequences. To conclude this study, we apply our algorithms on the Glm clan and the FinP-traJ clan from the Rfam database. Our results show that our methods reconstruct small sets of high-quality candidate ancestors with better agreement to the two target structures than with classical approaches. Our program is freely available at: http://csb.cs.mcgill.ca/acharnement .
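
    For orientation, the classical maximum parsimony baseline that the abstract compares against can be sketched with Fitch's algorithm on a single alignment column. This is not achARNement itself: it ignores secondary structure entirely, and the tree encoding (nested tuples of leaf names) and the function names `fitch` and `parsimony_score` are assumptions made for this illustration.

```python
def fitch(tree, leaf_states):
    """Fitch's small-parsimony pass for one alignment column.

    tree: a leaf name (str) or a pair (left_subtree, right_subtree).
    leaf_states: dict mapping each leaf name to its observed nucleotide.
    Returns (candidate ancestral states at the root, minimum number of changes).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: keep all candidates and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the Fitch cost over every column of an ungapped alignment."""
    length = len(next(iter(alignment.values())))
    total = 0
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        total += fitch(tree, column)[1]
    return total
```

    Each column is scored independently here, which is exactly the limitation a structure-aware method has to overcome: base-paired positions constrain each other and cannot be reconstructed one column at a time.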

  12. Ultra-fast sequence clustering from similarity networks with SiLiX

    Directory of Open Access Journals (Sweden)

    Duret Laurent

    2011-04-01

    Full Text Available Background: The number of gene sequences that are available for comparative genomics approaches is increasing extremely quickly. A current challenge is to be able to handle this huge amount of sequences in order to build families of homologous sequences in a reasonable time. Results: We present the software package SiLiX that implements a novel method which reconsiders single linkage clustering with a graph theoretical approach. A parallel version of the algorithms is also presented. As a demonstration of the ability of our software, we clustered more than 3 million sequences from about 2 billion BLAST hits in 7 minutes, with a high clustering quality, both in terms of sensitivity and specificity. Conclusions: Compared with state-of-the-art software, SiLiX presents the best up-to-date capabilities to face the problem of clustering large collections of sequences. SiLiX is freely available at http://lbbe.univ-lyon1.fr/SiLiX.
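
    Single linkage clustering viewed as a graph problem amounts to finding connected components of the similarity network. The sketch below does this with a union-find structure; it is a generic illustration, not the SiLiX implementation. It assumes the similarity pairs have already been filtered (for example from BLAST hits), and the function name `single_linkage_families` is hypothetical.

```python
def single_linkage_families(sequence_ids, similar_pairs):
    """Group sequences into families: connected components of the similarity graph.

    sequence_ids: iterable of sequence identifiers (graph nodes).
    similar_pairs: iterable of (id_a, id_b) pairs judged similar (graph edges).
    Returns a sorted list of families, each a sorted list of identifiers.
    """
    sequence_ids = list(sequence_ids)
    parent = {s: s for s in sequence_ids}

    def find(x):
        # Follow parent links to the representative, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two families

    families = {}
    for s in sequence_ids:
        families.setdefault(find(s), set()).add(s)
    return sorted(sorted(family) for family in families.values())
```

    Because each edge is processed once and never stored, the hits can be streamed from disk, which is one reason this formulation scales to very large numbers of pairwise hits.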

  13. Highly conserved non-coding sequences are associated with vertebrate development.

    Directory of Open Access Journals (Sweden)

    Adam Woolfe

    2005-01-01

    Full Text Available In addition to protein coding sequence, the human genome contains a significant amount of regulatory DNA, the identification of which is proving somewhat recalcitrant to both in silico and functional methods. An approach that has been used with some success is comparative sequence analysis, whereby equivalent genomic regions from different organisms are compared in order to identify both similarities and differences. In general, similarities in sequence between highly divergent organisms imply functional constraint. We have used a whole-genome comparison between humans and the pufferfish, Fugu rubripes, to identify nearly 1,400 highly conserved non-coding sequences. Given the evolutionary divergence between these species, it is likely that these sequences are found in, and furthermore are essential to, all vertebrates. Most, and possibly all, of these sequences are located in and around genes that act as developmental regulators. Some of these sequences are over 90% identical across more than 500 bases, being more highly conserved than coding sequence between these two species. Despite this, we cannot find any similar sequences in invertebrate genomes. In order to begin to functionally test this set of sequences, we have used a rapid in vivo assay system using zebrafish embryos that allows tissue-specific enhancer activity to be identified. Functional data is presented for highly conserved non-coding sequences associated with four unrelated developmental regulators (SOX21, PAX6, HLXB9, and SHH), in order to demonstrate the suitability of this screen to a wide range of genes and expression patterns. Of 25 sequence elements tested around these four genes, 23 show significant enhancer activity in one or more tissues. We have identified a set of non-coding sequences that are highly conserved throughout vertebrates. They are found in clusters across the human genome, principally around genes that are implicated in the regulation of development.

  14. Evolution of pH buffers and water homeostasis in eukaryotes: homology between humans and Acanthamoeba proteins.

    Science.gov (United States)

    Baig, Abdul M; Zohaib, R; Tariq, S; Ahmad, H R

    2018-02-01

    This study aimed to trace the evolution of acid-base buffers and water homeostasis in eukaryotes. Acanthamoeba castellanii was selected as a model unicellular eukaryote for this purpose. Homologies of proteins involved in pH and water regulatory mechanisms at the cellular level were compared between humans and A. castellanii. Amino acid sequence homology, structural homology, 3D modeling and docking prediction were used to show the extent of similarity between carbonic anhydrase 1 (CA1), aquaporin (AQP), band-3 protein and the H+ pump. Experimental assays were done with acetazolamide (AZM), brinzolamide and mannitol to observe their effects on the trophozoites of A. castellanii. The human CA1, AQP, band-3 protein and H+-transport proteins revealed similar proteins in Acanthamoeba. Docking showed the binding of AZM on amoebal AQP-like proteins. Acanthamoeba showed transient shape changes and encystation at differential doses of brinzolamide, mannitol and AZM. Conclusion: Water and pH regulating adapter proteins in Acanthamoeba and humans show significant homology; these mechanisms evolved early in the primitive unicellular eukaryotes and have remained conserved in multicellular eukaryotes.

  15. Calcium-Enhanced Twitching Motility in Xylella fastidiosa Is Linked to a Single PilY1 Homolog.

    Science.gov (United States)

    Cruz, Luisa F; Parker, Jennifer K; Cobine, Paul A; De La Fuente, Leonardo

    2014-12-01

    The plant-pathogenic bacterium Xylella fastidiosa is restricted to the xylem vessel environment, where mineral nutrients are transported through the plant host; therefore, changes in the concentrations of these elements likely impact the growth and virulence of this bacterium. Twitching motility, dependent on type IV pili (TFP), is required for movement against the transpiration stream that results in basipetal colonization. We previously demonstrated that calcium (Ca) increases the motility of X. fastidiosa, although the mechanism was unknown. PilY1 is a TFP structural protein recently shown to bind Ca and to regulate twitching and adhesion in bacterial pathogens of humans. Sequence analysis identified three pilY1 homologs in X. fastidiosa (PD0023, PD0502, and PD1611), one of which (PD1611) contains a Ca-binding motif. Separate deletions of PD0023 and PD1611 resulted in mutants that still showed twitching motility and were not impaired in attachment or biofilm formation. However, the response of increased twitching at higher Ca concentrations was lost in the pilY1-1611 mutant. Ca does not modulate the expression of any of the X. fastidiosa PilY1 homologs, although it increases the expression of the retraction ATPase pilT during active movement. The evidence presented here suggests functional differences between the PilY1 homologs, which may provide X. fastidiosa with an adaptive advantage in environments with high Ca concentrations, such as xylem sap. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

  16. Homology in Electromagnetic Boundary Value Problems

    Directory of Open Access Journals (Sweden)

    Pellikka Matti

    2010-01-01

    Full Text Available We discuss how homology computation can be exploited in computational electromagnetism. We present various cellular mesh reduction techniques, which enable the computation of generators of homology spaces in an acceptable time. Furthermore, we show how the generators can be used for setting up and analyzing an electromagnetic boundary value problem. The aim is to provide a rationale for homology computation in electromagnetic modeling software.
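
    As a small illustration of what a homology computation on a mesh yields, the sketch below computes Betti numbers of a simplicial complex from the ranks of its boundary matrices. It is a textbook construction, not the reduction techniques of the paper: coefficients are taken in GF(2) for simplicity (an assumption; practical solvers typically work over the integers), no mesh reduction is performed, and the names `rank_gf2` and `betti_numbers` are hypothetical.

```python
from itertools import combinations


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        # Clear that bit from every remaining row.
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def betti_numbers(top_simplices):
    """Betti numbers (over GF(2)) of the complex generated by the given simplices."""
    # Collect every face of every simplex, grouped by dimension.
    faces = {}
    for simplex in top_simplices:
        vertices = sorted(simplex)
        for k in range(1, len(vertices) + 1):
            for face in combinations(vertices, k):
                faces.setdefault(k - 1, set()).add(face)
    max_dim = max(faces)
    index = {d: {f: i for i, f in enumerate(sorted(faces[d]))} for d in faces}

    def boundary_rank(d):
        # Rank of the boundary map from d-simplices to (d-1)-simplices.
        if d == 0 or d > max_dim:
            return 0
        rows = []
        for simplex in faces[d]:
            mask = 0
            for face in combinations(simplex, d):
                mask |= 1 << index[d - 1][face]
            rows.append(mask)
        return rank_gf2(rows)

    # b_d = dim(kernel of boundary_d) - rank(boundary_{d+1})
    return [len(faces[d]) - boundary_rank(d) - boundary_rank(d + 1)
            for d in range(max_dim + 1)]
```

    A hollow triangle, `betti_numbers([(0, 1), (1, 2), (0, 2)])`, has one connected component and one independent loop; a loop of this kind around a hole in the domain is the sort of generator that matters when posing a boundary value problem.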

  17. [Cloning and sequencing of the papA gene from uropathogenic Escherichia coli 4030 strain].

    Science.gov (United States)

    Wu, Qinggang; Zhang, Jingping; Zhao, Chuncheng; Zhu, Jianguo

    2008-09-01

    The papA gene of the uropathogenic Escherichia coli strain 4030 (UPEC4030) was cloned and sequenced to compare its sequence with those of related genes and to determine whether it represents a new genotype. Cloning and sequencing methods were used to analyze the papA sequence of UPEC4030 in comparison with related sequences. Sequence analysis of papA revealed a 722 bp gene encoding a 192-amino-acid polypeptide. The overall homology between the papA gene of UPEC4030 and those of the standard strains of ten F types was 36.11%-77.95% at the nucleotide level and 22.20%-78.34% at the deduced amino acid level. The homology between the sequences of the reverse primers and the corresponding sequence of UPEC4030 papA was 10%-66.67%. The results confirmed that UPEC4030 contains a novel papA variant. UPEC4030 could contain an unknown papA variant or a novel genotype. The related pathogenic mechanism and epidemiology need further study.

  18. Quantification and Sequencing of Crossover Recombinant Molecules from Arabidopsis Pollen DNA.

    Science.gov (United States)

    Choi, Kyuha; Yelina, Nataliya E; Serra, Heïdi; Henderson, Ian R

    2017-01-01

    During meiosis, homologous chromosomes undergo recombination, which can result in formation of reciprocal crossover molecules. Crossover frequency is highly variable across the genome, typically occurring in narrow hotspots, which has a significant effect on patterns of genetic diversity. Here we describe methods to measure crossover frequency in plants at the hotspot scale (bp-kb), using allele-specific PCR amplification from genomic DNA extracted from the pollen of F1 heterozygous plants. We describe (1) titration methods that allow amplification, quantification and sequencing of single crossover molecules, (2) quantitative PCR methods to more rapidly measure crossover frequency, and (3) application of high-throughput sequencing for study of crossover distributions within hotspots. We provide detailed descriptions of key steps including pollen DNA extraction, prior identification of hotspot locations, allele-specific oligonucleotide design, and sequence analysis approaches. Together, these methods allow the rate and recombination topology of plant hotspots to be robustly measured and compared between varied genetic backgrounds and environmental conditions.

  19. Interspecies hybridization on DNA resequencing microarrays: efficiency of sequence recovery and accuracy of SNP detection in human, ape, and codfish mitochondrial DNA genomes sequenced on a human-specific MitoChip

    Directory of Open Access Journals (Sweden)

    Carr Steven M

    2007-09-01

    Full Text Available Background: Iterative DNA "resequencing" on oligonucleotide microarrays offers a high-throughput method to measure intraspecific biodiversity, one that is especially suited to SNP-dense gene regions such as vertebrate mitochondrial (mtDNA) genomes. However, costs of single-species design and microarray fabrication are prohibitive. A cost-effective, multi-species strategy is to hybridize experimental DNAs from diverse species to a common microarray that is tiled with oligonucleotide sets from multiple, homologous reference genomes. Such a strategy requires that cross-hybridization between the experimental DNAs and reference oligos from the different species not interfere with the accurate recovery of species-specific data. To determine the pattern and limits of such interspecific hybridization, we compared the efficiency of sequence recovery and accuracy of SNP identification by a 15,452-base human-specific microarray challenged with human, chimpanzee, gorilla, and codfish mtDNA genomes. Results: In the human genome, 99.67% of the sequence was recovered with 100.0% accuracy. Accuracy of SNP identification declines log-linearly with sequence divergence from the reference, from 0.067 to 0.247 errors per SNP in the chimpanzee and gorilla genomes, respectively. Efficiency of sequence recovery declines with the increase of the number of interspecific SNPs in the 25b interval tiled by the reference oligonucleotides. In the gorilla genome, which differs from the human reference by 10%, and in which 46% of these 25b regions contain 3 or more SNP differences from the reference, only 88% of the sequence is recoverable. In the codfish genome, which differs from the reference by > 30%, less than 4% of the sequence is recoverable, in short islands ≥ 12b that are conserved between primates and fish. Conclusion: Experimental DNAs bind inefficiently to homologous reference oligonucleotide sets on a re-sequencing microarray when their sequences differ by
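
    The statistic quoted above, the share of 25b intervals carrying 3 or more differences from the reference, can be sketched for two pre-aligned sequences as follows. This is an illustrative reconstruction only: it assumes an ungapped alignment of equal length and a window that slides one base at a time, neither of which is stated in the abstract, and the function name is hypothetical.

```python
def fraction_of_divergent_windows(reference, sample, window=25, min_differences=3):
    """Fraction of sliding windows with at least min_differences mismatches.

    reference, sample: aligned sequences of equal length (no gaps assumed).
    window: window length in bases; the window advances one base per step.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    if len(reference) < window:
        raise ValueError("sequences are shorter than the window")
    mismatches = [a != b for a, b in zip(reference, sample)]
    count = sum(mismatches[:window])  # mismatches in the first window
    divergent = 1 if count >= min_differences else 0
    total = 1
    for start in range(1, len(mismatches) - window + 1):
        # Slide by one base: add the entering position, drop the leaving one.
        count += mismatches[start + window - 1] - mismatches[start - 1]
        if count >= min_differences:
            divergent += 1
        total += 1
    return divergent / total
```

    Applied to a reference and an aligned sample genome with the default arguments, the function returns a value between 0 and 1 that can be read as the proportion of probe-sized intervals likely to hybridize poorly.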

  20. The cohesion protein SOLO associates with SMC1 and is required for synapsis, recombination, homolog bias and cohesion and pairing of centromeres in Drosophila Meiosis.

    Science.gov (United States)

    Yan, Rihui; McKee, Bruce D

    2013-01-01

    Cohesion between sister chromatids is mediated by cohesin and is essential for proper meiotic segregation of both sister chromatids and homologs. solo encodes a Drosophila meiosis-specific cohesion protein with no apparent sequence homology to cohesins that is required in male meiosis for centromere cohesion, proper orientation of sister centromeres and centromere enrichment of the cohesin subunit SMC1. In this study, we show that solo is involved in multiple aspects of meiosis in female Drosophila. Null mutations in solo caused the following phenotypes: 1) high frequencies of homolog and sister chromatid nondisjunction (NDJ) and sharply reduced frequencies of homolog exchange; 2) reduced transmission of a ring-X chromosome, an indicator of elevated frequencies of sister chromatid exchange (SCE); 3) premature loss of centromere pairing and cohesion during prophase I, as indicated by elevated foci counts of the centromere protein CID; 4) instability of the lateral elements (LE)s and central regions of synaptonemal complexes (SCs), as indicated by fragmented and spotty staining of the chromosome core/LE component SMC1 and the transverse filament protein C(3)G, respectively, at all stages of pachytene. SOLO and SMC1 are both enriched on centromeres throughout prophase I, co-align along the lateral elements of SCs and reciprocally co-immunoprecipitate from ovarian protein extracts. Our studies demonstrate that SOLO is closely associated with meiotic cohesin and required both for enrichment of cohesin on centromeres and stable assembly of cohesin into chromosome cores. These events underlie and are required for stable cohesion of centromeres, synapsis of homologous chromosomes, and a recombination mechanism that suppresses SCE to preferentially generate homolog crossovers (homolog bias). 
    We propose that SOLO is a subunit of a specialized meiotic cohesin complex that mediates both centromeric and axial arm cohesion and promotes homolog bias as a component of chromosome

  1. The genomic sequence of the Chinese hamster ovary (CHO)-K1 cell line

    DEFF Research Database (Denmark)

    Xu, Xun; Pan, Shengkai; Liu, Xin

    2011-01-01

    Chinese hamster ovary (CHO)-derived cell lines are the preferred host cells for the production of therapeutic proteins. Here we present a draft genomic sequence of the CHO-K1 ancestral cell line. The assembly comprises 2.45 Gb of genomic sequence, with 24,383 predicted genes. We associate most....... Homologs of most human glycosylation-associated genes are present in the CHO-K1 genome, although 141 of these homologs are not expressed under exponential growth conditions. Many important viral entry genes are also present in the genome but not expressed, which may explain the unusual viral resistance...... property of CHO cell lines. We discuss how the availability of this genome sequence may facilitate genome-scale science for the optimization of biopharmaceutical protein production....

  2. Development of versatile non-homologous end joining-based knock-in module for genome editing.

    Science.gov (United States)

    Sawatsubashi, Shun; Joko, Yudai; Fukumoto, Seiji; Matsumoto, Toshio; Sugano, Shigeo S

    2018-01-12

    CRISPR/Cas9-based genome editing has dramatically accelerated genome engineering. An important aspect of genome engineering is efficient knock-in technology. For improved knock-in efficiency, the non-homologous end joining (NHEJ) repair pathway has been used over the homology-dependent repair pathway, but there remains a need to reduce the complexity of the preparation of donor vectors. We developed the versatile NHEJ-based knock-in module for genome editing (VIKING). Using the consensus sequence of the time-honored pUC vector to cut donor vectors, any vector with a pUC backbone could be used as the donor vector without customization. Conditions required to minimize random integration rates of the donor vector were also investigated. We attempted to isolate null lines of the VDR gene in human HaCaT keratinocytes using knock-in/knock-out with a selection marker cassette, and found 75% of clones isolated were successfully knocked-in. Although HaCaT cells have hypotetraploid genome composition, the results suggest multiple clones have VDR null phenotypes. VIKING modules enabled highly efficient knock-in of any vectors harboring pUC vectors. Users now can insert various existing vectors into an arbitrary locus in the genome. VIKING will contribute to low-cost genome engineering.

  3. Homologous series of induced early mutants in indica rice. Pt.1. The production of homologous series of early mutants

    International Nuclear Information System (INIS)

    Chen Xiulan; Yang Hefeng; He Zhentian; Han Yuepeng; Liu Xueyu

    1999-01-01

    The percentage of homologous series of early mutants induced from the same indica rice variety was almost constant (1.37%∼1.64%) over 1983∼1993, but it differed among eco-typical varieties: 0.73% for the early variety, 1.51% for the mid variety, and 1.97% for the late variety. The percentages from varieties with the same pedigree and relationship were similar, but those from cognate varieties were lower than those from distant varieties. The homologous series of early mutants follow basic laws and characteristics: 1. The inhibited phenotype is the basis of the homologous series of early mutants; 2. The production of the homologous series of early mutants is closely related to the growing period of the parent; 3. Parallel mutations of the stem and leaves occur simultaneously with the variation toward early or late maturing; 4. The occurrence of the homologous series of early mutants is in a state of imbalance. According to the law of parallel variability, the production of homologous series of early mutants can be predicted as long as the parents' plant classification, pedigree and ecological type are identified. Therefore, breeding for earliness can be guided by the law of homologous series of early mutants.

  4. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure...... that the data produced is optimal. Although much of the procedure can be followed directly from the manufacturer's protocols, the key differences lie in the library preparation steps. This chapter presents an optimized protocol for the sequencing of fossil remains and museum specimens, commonly referred...

  5. Two tandemly repeated telomere-associated sequences in Nicotiana plumbaginifolia.

    Science.gov (United States)

    Chen, C M; Wang, C T; Wang, C J; Ho, C H; Kao, Y Y; Chen, C C

    1997-12-01

    Two tandemly repeated telomere-associated sequences, NP3R and NP4R, have been isolated from Nicotiana plumbaginifolia. The length of a repeating unit for NP3R and NP4R is 165 and 180 nucleotides respectively. The abundance of NP3R, NP4R and telomeric repeats is, respectively, 8.4 × 10^4, 6 × 10^3 and 1.5 × 10^6 copies per haploid genome of N. plumbaginifolia. Fluorescence in situ hybridization revealed that NP3R is located at the ends and/or in interstitial regions of all 10 chromosomes and NP4R on the terminal regions of three chromosomes in the haploid genome of N. plumbaginifolia. Sequence homology search revealed that not only are NP3R and NP4R homologous to HRS60 and GRS, respectively, two tandem repeats isolated from N. tabacum, but that NP3R and NP4R are also related to each other, suggesting that they originated from a common ancestral sequence. The role of these repeated sequences in chromosome healing is discussed based on the observation that two to three copies of a telomere-similar sequence were present in each repeating unit of NP3R and NP4R.

  6. The Resistome of Low-Impacted Marine Environments Is Composed by Distant Metallo-β-Lactamases Homologs

    Directory of Open Access Journals (Sweden)

    Erica L. Fonseca

    2018-04-01

    Full Text Available The worldwide dispersion and sudden emergence of new antibiotic resistance genes (ARGs) have created a need to uncover which environments contribute most as their source and reservoir. ARGs closely related to those currently found in human pathogens occur in the resistome of anthropogenically impacted environments. However, the role of pristine environments as the origin and source of ARGs remains underexplored and controversial, particularly for the marine environments represented by the oceans. Here, given the nature of the ocean, we hypothesized that the resistome of this pristine/low-impacted marine environment is represented by distant ARG homologs. To test this hypothesis, we performed an in silico analysis of the Global Ocean Sampling (GOS) metagenomic project dataset, focusing on the metallo-β-lactamases (MβLs) as the ARG model. MβLs have been a challenge to public health, since they hydrolyze the carbapenems, one of the last therapeutic choices in clinics. Using Hidden Markov Model (HMM) profiles, we identified a high diversity of distant MβL homologs related to the B1, B2, and B3 subclasses. The majority of them were distributed across the Atlantic, Indian, and Pacific Oceans and were related to the chromosomally encoded MβL GOB present in the genus Elizabethkingia. Only a small number of metagenomic sequence homologs were related to the acquired MβL enzymes (VIM, SPM-1, and AIM-1) that currently have an impact in clinics. Therefore, marine environments with low antibiotic impact, such as the ocean, are unlikely to be the source of the ARGs that have been posing an enormous threat to public health.

  7. The Resistome of Low-Impacted Marine Environments Is Composed by Distant Metallo-β-Lactamases Homologs.

    Science.gov (United States)

    Fonseca, Erica L; Andrade, Bruno G N; Vicente, Ana C P

    2018-01-01

    The worldwide dispersion and sudden emergence of new antibiotic resistance genes (ARGs) have created a need to uncover which environments contribute most as their source and reservoir. ARGs closely related to those currently found in human pathogens occur in the resistome of anthropogenically impacted environments. However, the role of pristine environments as the origin and source of ARGs remains underexplored and controversial, particularly for the marine environments represented by the oceans. Here, given the nature of the ocean, we hypothesized that the resistome of this pristine/low-impacted marine environment is represented by distant ARG homologs. To test this hypothesis, we performed an in silico analysis of the Global Ocean Sampling (GOS) metagenomic project dataset, focusing on the metallo-β-lactamases (MβLs) as the ARG model. MβLs have been a challenge to public health, since they hydrolyze the carbapenems, one of the last therapeutic choices in clinics. Using Hidden Markov Model (HMM) profiles, we identified a high diversity of distant MβL homologs related to the B1, B2, and B3 subclasses. The majority of them were distributed across the Atlantic, Indian, and Pacific Oceans and were related to the chromosomally encoded MβL GOB present in the genus Elizabethkingia. Only a small number of metagenomic sequence homologs were related to the acquired MβL enzymes (VIM, SPM-1, and AIM-1) that currently have an impact in clinics. Therefore, marine environments with low antibiotic impact, such as the ocean, are unlikely to be the source of the ARGs that have been posing an enormous threat to public health.

  8. Cloning, Sequencing, and Expression of the Pyruvate Carboxylase Gene in Lactococcus lactis subsp. lactis C2

    OpenAIRE

    Wang, H.; O'Sullivan, D. J.; Baldwin, K. A.; McKay, L. L.

    2000-01-01

    A functional pyc gene was isolated from Lactococcus lactis subsp. lactis C2 and was found to complement a Pyc defect in L. lactis KB4. The deduced lactococcal Pyc protein was highly homologous to Pyc sequences of other bacteria. The pyc gene was also detected in Lactococcus lactis subsp. cremoris and L. lactis subsp. lactis bv. diacetylactis strains.

  9. Tracing the Evolutionary History of the CAP Superfamily of Proteins Using Amino Acid Sequence Homology and Conservation of Splice Sites.

    Science.gov (United States)

    Abraham, Anup; Chandler, Douglas E

    2017-10-01

    Proteins of the CAP superfamily play numerous roles in reproduction, innate immune responses, cancer biology, and venom toxicology. Here we document the breadth of the CAP (Cysteine-Rich Secretory Protein (CRISP), Antigen 5, and Pathogenesis-Related) protein superfamily and trace the major events in its evolution using amino acid sequence homology and the positions of exon/intron borders within their genes. We find that, although this is seldom acknowledged in the literature, many of the CAP subfamilies present in mammals, where they were originally characterized, have distinct homologues in the invertebrate phyla. Early eukaryotic CAP genes contained only one exon inherited from prokaryotic predecessors, and as evolution progressed an increasing number of introns were inserted, reaching 2-5 in the invertebrate world and 5-15 in the vertebrate world. Focusing on the CRISP subfamily, we propose that these proteins evolved in three major steps: (1) origination of the CAP/PR/SCP domain in bacteria, (2) addition of a small Hinge domain to produce the two-domain SCP-like proteins found in roundworms and anthropoids, and (3) addition of an Ion Channel Regulatory domain, borrowed from invertebrate peptide toxins, to produce full length, three-domain CRISP proteins, first seen in insects and later to diversify into multiple subtypes in the vertebrate world.

  10. Directed PCR-free engineering of highly repetitive DNA sequences

    Directory of Open Access Journals (Sweden)

    Preissler Steffen

    2011-09-01

    Full Text Available Background: Highly repetitive nucleotide sequences are commonly found in nature, e.g. in telomeres, microsatellite DNA, polyadenine (poly(A)) tails of eukaryotic messenger RNA as well as in several inherited human disorders linked to trinucleotide repeat expansions in the genome. Therefore, studying repetitive sequences is of biological, biotechnological and medical relevance. However, cloning of such repetitive DNA sequences is challenging because specific PCR-based amplification is hampered by the lack of unique primer binding sites, resulting in unspecific products. Results: For the PCR-free generation of repetitive DNA sequences we used antiparallel oligonucleotides flanked by restriction sites of Type IIS endonucleases. The arrangement of recognition sites allowed for stepwise and seamless elongation of repetitive sequences. This facilitated the assembly of repetitive DNA segments and open reading frames encoding polypeptides with periodic amino acid sequences of any desired length. By this strategy we cloned a series of polyglutamine encoding sequences as well as highly repetitive polyadenine tracts. Such repetitive sequences can be used for diverse biotechnological applications. As an example, the polyglutamine sequences were expressed as His6-SUMO fusion proteins in Escherichia coli cells to study their aggregation behavior in vitro. The His6-SUMO moiety enabled affinity purification of the polyglutamine proteins, increased their solubility, and allowed controlled induction of the aggregation process. We successfully purified the fusion proteins and provide an example for their applicability in filter retardation assays. Conclusion: Our seamless cloning strategy is PCR-free and allows the directed and efficient generation of highly repetitive DNA sequences of defined lengths by simple standard cloning procedures.

  11. Identification of a thymidine kinase (RuTK1) homolog differentially expressed in blackberry (Rubus L.) prickles

    International Nuclear Information System (INIS)

    Zhang, C.; Yang, H.; Wang, X.

    2016-01-01

    Thymidine kinase (TK) is a key enzyme in controlling DNA synthesis and plays an important role in cell proliferation. However, our understanding of TK functions in plants is still limited. From an earlier comparative transcriptome analysis of the shoot apex of blackberry cv. Boysenberry and its bud mutant cv. Ningzhi 1, which has fewer and thinner prickles, we found a unigene homologous to TK, RuTK1, which was differentially expressed between them. In this study, the cDNA and genomic DNA (gDNA) sequences of RuTK1 were further analyzed. RuTK1 revealed an open reading frame (ORF) of 660 bp coding for 219 amino acid residues. The gDNA sequence, which contains four exons and three introns, is relatively conserved in most plant TK homologs. A phylogenetic analysis revealed that the TK proteins from plants were classified into three groups. In each group, TKs from the same family were relatively concentrated, and RuTK1 was placed in the dicotyledon group, closest to those from Rosaceae. RuTK1 was highly expressed in prickles at the early stage in Boysenberry compared to Ningzhi 1. In addition, RuTK1 expression was similarly greater in mature prickles at the late stage in both cultivars, which implies a possible involvement of RuTK1 in the cell cycle at the early stage of prickle formation. These results provide a novel foundation for the further elucidation of the blackberry prickle development mechanism and the functions of TKs in plants. (author)

  12. Insight into the transcriptome of Arthrobotrys conoides using high throughput sequencing.

    Science.gov (United States)

    Ramesh, Pandit; Reena, Patel; Amitbikram, Mohapatra; Chaitanya, Joshi; Anju, Kunjadia

    2015-12-01

    Arthrobotrys conoides is a nematode-trapping fungus belonging to the Orbiliales (Ascomycota) and traps prey nematodes by means of an adhesive network. The fungus has potential to be used as a biocontrol agent against plant parasitic nematodes. In the present study, we characterized the transcriptome of A. conoides using high-throughput sequencing technology and characterized its virulence unigenes. A total of 7,255 cDNA contigs with an average length of 425 bp were generated and 6184 (61.81%) transcripts were functionally annotated and characterized. The majority of unigenes were found analogous to the genes of plant pathogenic fungi. A total of 1749 transcripts were found to be orthologous with eukaryotic proteins of the KOG database. Several carbohydrate active enzymes and peptidases were identified. We also analyzed classically and nonclassically secreted proteins and confirmed them by BLASTP against a fungal secretome database. A total of 916 contigs were analogous to 556 unique proteins of the Pathogen Host Interaction (PHI) database. Further, we identified 91 unigenes homologous to entries in the database of fungal virulence factors (DFVF). A total of 104 putative protein kinase coding transcripts, which are major players in signaling pathways, were identified by BLASTP against the KinBase database. This study provides a comprehensive look at the transcriptome of A. conoides, and the identified unigenes might have a role in catching and killing prey nematodes by A. conoides. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Applying Agrep to r-NSA to solve multiple sequences approximate matching.

    Science.gov (United States)

    Ni, Bing; Wong, Man-Hon; Lam, Chi-Fai David; Leung, Kwong-Sak

    2014-01-01

    This paper addresses the approximate matching problem in a database consisting of multiple DNA sequences, where the proposed approach applies Agrep to a new truncated suffix array, r-NSA. The construction time of the structure is linear in the database size, and indexing a substring in the structure takes constant time. The number of characters processed in applying Agrep is analysed theoretically, and the theoretical upper bound closely approximates the empirical number of characters, which is obtained by enumerating the characters in the actual structure built. Experiments are carried out using (synthetic) random DNA sequences, as well as (real) genome sequences including Hepatitis-B Virus and X-chromosome. Experimental results show that, compared to the straightforward approach that applies Agrep to multiple sequences individually, the proposed approach solves the matching problem in much shorter time. The speed-up of our approach depends on the sequence patterns, and for highly similar homologous genome sequences, which are the common cases in real-life genomes, it can be up to several orders of magnitude.
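
    As an illustration of the approximate matching problem itself, the sketch below reports every position in a text where a pattern ends with at most k edits. It uses the classical column-wise dynamic programme (often attributed to Sellers), not Agrep's bit-parallel method and not the r-NSA index from the record above; the function name and the toy sequences are hypothetical.

```python
def approximate_matches(pattern, text, max_errors):
    """Return (end_index, distance) pairs where `pattern` matches a substring
    of `text` ending at end_index with at most `max_errors` edits
    (substitutions, insertions, deletions)."""
    m = len(pattern)
    # column[i] = smallest edit distance between pattern[:i] and any
    # substring of the text ending at the current text position.
    column = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text):
        prev_diag = column[0]   # column[i-1] as it was at the previous text position
        column[0] = 0           # a match may start anywhere in the text
        for i in range(1, m + 1):
            prev = column[i]
            cost = 0 if pattern[i - 1] == ch else 1
            column[i] = min(prev_diag + cost,   # match or substitution
                            prev + 1,           # extra character in the text
                            column[i - 1] + 1)  # pattern character missing
            prev_diag = prev
        if column[m] <= max_errors:
            hits.append((j, column[m]))
    return hits


if __name__ == "__main__":
    print(approximate_matches("ACGT", "TTACGTTT", 0))
    print(approximate_matches("ACGT", "ACCT", 1))
```

    Running this once per database sequence corresponds to the "straightforward approach" the abstract uses as its baseline; the indexed approach is described there as avoiding that repeated work.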

  14. Sequence Exchange between Homologous NB-LRR Genes Converts Virus Resistance into Nematode Resistance, and Vice Versa.

    Science.gov (United States)

    Slootweg, Erik; Koropacka, Kamila; Roosien, Jan; Dees, Robert; Overmars, Hein; Lankhorst, Rene Klein; van Schaik, Casper; Pomp, Rikus; Bouwman, Liesbeth; Helder, Johannes; Schots, Arjen; Bakker, Jaap; Smant, Geert; Goverse, Aska

    2017-09-01

    Plants have evolved a limited repertoire of NB-LRR disease resistance (R) genes to protect themselves against myriad pathogens. This limitation is thought to be counterbalanced by the rapid evolution of NB-LRR proteins, as only a few sequence changes have been shown to be sufficient to alter resistance specificities toward novel strains of a pathogen. However, little is known about the flexibility of NB-LRR R genes to switch resistance specificities between phylogenetically unrelated pathogens. To investigate this, we created domain swaps between the close homologs Gpa2 and Rx1, which confer resistance in potato (Solanum tuberosum) to the cyst nematode Globodera pallida and Potato virus X, respectively. The genetic fusion of the CC-NB-ARC of Gpa2 with the LRR of Rx1 (Gpa2CN/Rx1L) results in autoactivity, but lowering the protein levels restored its specific activation response, including extreme resistance to Potato virus X in potato shoots. The reciprocal chimera (Rx1CN/Gpa2L) shows a loss-of-function phenotype, but exchange of the first three LRRs of Gpa2 by the corresponding region of Rx1 was sufficient to regain a wild-type resistance response to G. pallida in the roots. These data demonstrate that exchanging the recognition moiety in the LRR is sufficient to convert extreme virus resistance in the leaves into mild nematode resistance in the roots, and vice versa. In addition, we show that the CC-NB-ARC can operate independently of the recognition specificities defined by the LRR domain, either aboveground or belowground. These data show the versatility of NB-LRR genes to generate resistance to unrelated pathogens with completely different lifestyles and routes of invasion. © 2017 American Society of Plant Biologists. All Rights Reserved.

  15. Draft genome sequence of the silver pomfret fish, Pampus argenteus.

    Science.gov (United States)

    AlMomin, Sabah; Kumar, Vinod; Al-Amad, Sami; Al-Hussaini, Mohsen; Dashti, Talal; Al-Enezi, Khaznah; Akbar, Abrar

    2016-01-01

    Silver pomfret, Pampus argenteus, is a fish species from coastal waters. Despite its high commercial value, this edible fish has not been sequenced. Hence, its genetic and genomic studies have been limited. We report the first draft genome sequence of the silver pomfret obtained using a Next Generation Sequencing (NGS) technology. We assembled 38.7 Gb of nucleotides into scaffolds of 350 Mb with N50 of about 1.5 kb, using high quality paired end reads. These scaffolds represent 63.7% of the estimated silver pomfret genome length. The newly sequenced and assembled genome has 11.06% repetitive DNA regions, and this percentage is comparable to that of the tilapia genome. The genome analysis predicted 16 322 genes. About 91% of these genes showed homology with known proteins. Many gene clusters were annotated to protein and fatty-acid metabolism pathways that may be important in the context of the meat texture and immune system developmental processes. The reference genome can pave the way for the identification of many other genomic features that could improve breeding and population-management strategies, and it can also help characterize the genetic diversity of P. argenteus.
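
    The N50 statistic quoted above is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name and the example lengths are hypothetical and are not taken from the pomfret assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths:
    the length L such that scaffolds of length >= L cover at least
    half of the total assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


if __name__ == "__main__":
    # Toy assembly of eight scaffolds (arbitrary units).
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

    An N50 of about 1.5 kb against a total of 350 Mb, as reported in the abstract, indicates a highly fragmented draft.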

  16. Exome Sequence Analysis of 14 Families With High Myopia

    DEFF Research Database (Denmark)

    Kloss, Bethany A.; Tompson, Stuart W.; Whisenhunt, Kristina N.

    2017-01-01

    Purpose: To identify causal gene mutations in 14 families with autosomal dominant (AD) high myopia using exome sequencing. Methods: Select individuals from 14 large Caucasian families with high myopia were exome sequenced. Gene variants were filtered to identify potential pathogenic changes. Sang...

  17. In vitro site selection of a consensus binding site for the Drosophila melanogaster Tbx20 homolog midline.

    Directory of Open Access Journals (Sweden)

    Nima Najand

    Full Text Available We employed in vitro site selection to identify a consensus binding sequence for the Drosophila melanogaster Tbx20 T-box transcription factor homolog Midline. We purified a bacterially expressed T-box DNA binding domain of Midline, and used it in four rounds of precipitation and polymerase-chain-reaction based amplification. We cloned and sequenced 54 random oligonucleotides selected by Midline. Electromobility shift-assays confirmed that 27 of these could bind the Midline T-box. Sequence alignment of these 27 clones suggests that Midline binds as a monomer to a consensus sequence that contains an AGGTGT core. Thus, the Midline consensus binding site we define in this study is similar to that defined for vertebrate Tbx20, but differs from a previously reported Midline binding sequence derived through site selection.
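
    Deriving a consensus from aligned selected sites can be sketched as a per-column majority vote. The example below is only an illustration of that idea: the function name is hypothetical and the three toy sites are invented, not the 27 clones from the study.

```python
from collections import Counter


def consensus(aligned_sites):
    """Return the most frequent base in each column of equal-length,
    pre-aligned binding-site sequences."""
    columns = zip(*aligned_sites)
    return "".join(Counter(column).most_common(1)[0][0] for column in columns)


if __name__ == "__main__":
    # Hypothetical aligned sites, chosen so that no column is tied.
    sites = ["AGGTGT", "AGGTGA", "TGGTGT"]
    print(consensus(sites))
```

    A real analysis would normally report per-position frequencies (for example as a position weight matrix) rather than a single string, since a majority vote hides how strongly each position is conserved.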

  18. The nucleotide sequence of 5S ribosomal RNA from Micrococcus lysodeikticus.

    Science.gov (United States)

    Hori, H; Osawa, S; Murao, K; Ishikura, H

    1980-01-01

    The nucleotide sequence of ribosomal 5S RNA from Micrococcus lysodeikticus is pGUUACGGCGGCUAUAGCGUGGGGGAAACGCCCGGCCGUAUAUCGAACCCGGAAGCUAAGCCCCAUAGCGCCGAUGGUUACUGUAACCGGGAGGUUGUGGGAGAGUAGGUCGCCGCCGUGAOH. When compared to other 5S RNAs, the sequence homology is greatest with Thermus aquaticus, and these two 5S RNAs reveal several features intermediate between those of typical gram-positive bacteria and gram-negative bacteria. PMID:6780979

  19. The cytochrome oxidase subunit I and subunit III genes in Oenothera mitochondria are transcribed from identical promoter sequences

    Science.gov (United States)

    Hiesel, Rudolf; Schobel, Werner; Schuster, Wolfgang; Brennicke, Axel

    1987-01-01

    Two loci encoding subunit III of the cytochrome oxidase (COX) in Oenothera mitochondria have been identified from a cDNA library of mitochondrial transcripts. A 657-bp sequence block upstream from the open reading frame is also present in the two copies of the COX subunit I gene and is presumably involved in homologous sequence rearrangement. The proximal points of sequence rearrangements are located 3 bp upstream from the COX I and 1139 bp upstream from the COX III initiation codons. The 5'-termini of both COX I and COX III mRNAs have been mapped in this common sequence confining the promoter region for the Oenothera mitochondrial COX I and COX III genes to the homologous sequence block. PMID:15981332

  20. Discovery of precursor and mature microRNAs and their putative gene targets using high-throughput sequencing in pineapple (Ananas comosus var. comosus).

    Science.gov (United States)

    Yusuf, Noor Hydayaty Md; Ong, Wen Dee; Redwan, Raimi Mohamed; Latip, Mariam Abd; Kumar, S Vijay

    2015-10-15

    MicroRNAs (miRNAs) are a class of small, endogenous non-coding RNAs that negatively regulate gene expression, resulting in the silencing of target mRNA transcripts through mRNA cleavage or translational inhibition. MiRNAs play significant roles in various biological and physiological processes in plants. However, the miRNA-mediated gene regulatory network in pineapple, the model tropical non-climacteric fruit, remains largely unexplored. Here, we report a complete list of pineapple mature miRNAs obtained from high-throughput small RNA sequencing and precursor miRNAs (pre-miRNAs) obtained from ESTs. Two small RNA libraries were constructed from pineapple fruits and leaves, respectively, using Illumina's Solexa technology. Sequence similarity analysis using miRBase revealed 579,179 reads homologous to 153 miRNAs from 41 miRNA families. In addition, a pineapple fruit transcriptome library consisting of approximately 30,000 EST contigs constructed using Solexa sequencing was used for the discovery of pre-miRNAs. In all, four pre-miRNAs were identified (MIR156, MIR399, MIR444 and MIR2673). Furthermore, the same pineapple transcriptome was used to dissect the function of the miRNAs in pineapple by predicting their putative targets in conjunction with their regulatory networks. In total, 23 metabolic pathways were found to be regulated by miRNAs in pineapple. The use of high-throughput sequencing in pineapples to unveil the presence of miRNAs and their regulatory pathways provides insight into the repertoire of miRNA regulation used exclusively in this non-climacteric model plant. Copyright © 2015 Elsevier B.V. All rights reserved.

  1. Characterization of Satellite DNA Sequences from the Commercially Important Marine Rotifers Brachionus rotundiformis and Brachionus plicatilis.

    Science.gov (United States)

    Boehm; Gibson; Lubzens

    2000-01-01

    This study was initiated to search for species-specific and strain-specific satellite DNA sequences for which oligonucleotide primers could be designed to differentiate between various commercially important strains of the marine monogonont rotifers Brachionus rotundiformis and Brachionus plicatilis. Two unrelated, highly reiterated satellite sequences were cloned and characterized. The eight sequenced monomers from B. rotundiformis and six from B. plicatilis had low intrarepeat variability and were similar in their overall lengths, A + T compositions, and high degrees of repeated motif substructure. However, hybridizations to 19 representative strains, sequence characterizations, and GenBank searches indicated that these two satellites are morphotype-specific and population-specific, respectively, and share little homology to each other or to other characterized sequences in the database. Primer pairs designed for the B. rotundiformis satellite confirmed hybridization specificities on polymerase chain reaction and could serve as a useful molecular diagnostic tool to identify strains belonging to the SS morphotype, which are gaining widespread usage as first feeds for marine fish in commercial production.

  2. Comparative sequence and structural analyses of G-protein-coupled receptor crystal structures and implications for molecular models.

    Directory of Open Access Journals (Sweden)

    Catherine L Worth

    Full Text Available BACKGROUND: Up until recently the only available experimental (high resolution) structure of a G-protein-coupled receptor (GPCR) was that of bovine rhodopsin. In the past few years the determination of GPCR structures has accelerated with three new receptors, as well as squid rhodopsin, being successfully crystallized. All share a common molecular architecture of seven transmembrane helices and can therefore serve as templates for building molecular models of homologous GPCRs. However, despite the common general architecture of these structures key differences do exist between them. The choice of which experimental GPCR structure(s) to use for building a comparative model of a particular GPCR is unclear and without detailed structural and sequence analyses, could be arbitrary. The aim of this study is therefore to perform a systematic and detailed analysis of sequence-structure relationships of known GPCR structures. METHODOLOGY: We analyzed in detail conserved and unique sequence motifs and structural features in experimentally-determined GPCR structures. Deeper insight into specific and important structural features of GPCRs as well as valuable information for template selection has been gained. Using key features a workflow has been formulated for identifying the most appropriate template(s) for building homology models of GPCRs of unknown structure. This workflow was applied to a set of 14 human family A GPCRs suggesting for each the most appropriate template(s) for building a comparative molecular model. CONCLUSIONS: The available crystal structures represent only a subset of all possible structural variation in family A GPCRs. Some GPCRs have structural features that are distributed over different crystal structures or which are not present in the templates suggesting that homology models should be built using multiple templates. This study provides a systematic analysis of GPCR crystal structures and a consistent method for identifying

  3. Comparative sequence and structural analyses of G-protein-coupled receptor crystal structures and implications for molecular models.

    Science.gov (United States)

    Worth, Catherine L; Kleinau, Gunnar; Krause, Gerd

    2009-09-16

    Up until recently the only available experimental (high resolution) structure of a G-protein-coupled receptor (GPCR) was that of bovine rhodopsin. In the past few years the determination of GPCR structures has accelerated with three new receptors, as well as squid rhodopsin, being successfully crystallized. All share a common molecular architecture of seven transmembrane helices and can therefore serve as templates for building molecular models of homologous GPCRs. However, despite the common general architecture of these structures key differences do exist between them. The choice of which experimental GPCR structure(s) to use for building a comparative model of a particular GPCR is unclear and without detailed structural and sequence analyses, could be arbitrary. The aim of this study is therefore to perform a systematic and detailed analysis of sequence-structure relationships of known GPCR structures. We analyzed in detail conserved and unique sequence motifs and structural features in experimentally-determined GPCR structures. Deeper insight into specific and important structural features of GPCRs as well as valuable information for template selection has been gained. Using key features a workflow has been formulated for identifying the most appropriate template(s) for building homology models of GPCRs of unknown structure. This workflow was applied to a set of 14 human family A GPCRs suggesting for each the most appropriate template(s) for building a comparative molecular model. The available crystal structures represent only a subset of all possible structural variation in family A GPCRs. Some GPCRs have structural features that are distributed over different crystal structures or which are not present in the templates suggesting that homology models should be built using multiple templates. This study provides a systematic analysis of GPCR crystal structures and a consistent method for identifying suitable templates for GPCR homology modelling that will

  4. CLONING AND SEQUENCING OF PGIP FROM ‘JIN SERIES’ ALMOND (PRUNUS DULCIS)

    Directory of Open Access Journals (Sweden)

    Yuhu Han

    2015-12-01

    Full Text Available Specific primers synthesized according to conserved regions of the polygalacturonase inhibiting protein (PGIP) gene were used to amplify Prunus dulcis genomic DNA by polymerase chain reaction (PCR). Six bands (pgip1, pgip2, pgip3, pgip4, pgip5 and pgip6) of genes were obtained and cloned into PBS-T vector. According to the length of the bands, the 717 bp, 864 bp and 796 bp fragments were A1 (pgip1, pgip2, pgip3), A2 (pgip4) and A4 (pgip5, pgip6), respectively. DNA sequences showed that the fragments taken together were the gene encoding PGIP. A2 and A3 contained two exons interrupted by one intron, which has a GT-AG sequence. Their DNA and amino acid sequences were highly homologous to those from Prunus persica, Prunus salicina, Prunus americana and Prunus mume. A conserved lencinerial fragment exists in the derived protein sequence.

  5. Sifting through genomes with iterative-sequence clustering produces a large, phylogenetically diverse protein-family resource.

    Science.gov (United States)

    Sharpton, Thomas J; Jospin, Guillaume; Wu, Dongying; Langille, Morgan G I; Pollard, Katherine S; Eisen, Jonathan A

    2012-10-13

    New computational resources are needed to manage the increasing volume of biological data from genome sequencing projects. One fundamental challenge is the ability to maintain a complete and current catalog of protein diversity. We developed a new approach for the identification of protein families that focuses on the rapid discovery of homologous protein sequences. We implemented fully automated and high-throughput procedures to de novo cluster proteins into families based upon global alignment similarity. Our approach employs an iterative clustering strategy in which homologs of known families are sifted out of the search for new families. The resulting reduction in computational complexity enables us to rapidly identify novel protein families found in new genomes and to perform efficient, automated updates that keep pace with genome sequencing. We refer to protein families identified through this approach as "Sifting Families," or SFams. Our analysis of ~10.5 million protein sequences from 2,928 genomes identified 436,360 SFams, many of which are not represented in other protein family databases. We validated the quality of SFam clustering through statistical as well as network topology-based analyses. We describe the rapid identification of SFams and demonstrate how they can be used to annotate genomes and metagenomes. The SFam database catalogs protein-family quality metrics, multiple sequence alignments, hidden Markov models, and phylogenetic trees. Our source code and database are publicly available and will be subject to frequent updates (http://edhar.genomecenter.ucdavis.edu/sifting_families/).

  6. Cloning of human and mouse genes homologous to RAD52, a yeast gene involved in DNA repair and recombination.

    NARCIS (Netherlands)

    D.F.R. Muris; O.Y. Bezzubova (Olga); J-M. Buerstedde; K. Vreeken; A.S. Balajee; C.J. Osgood; C. Troelstra (Christine); J.H.J. Hoeijmakers (Jan); K. Ostermann; H. Schmidt (Henning); A.T. Natarajan; J.C.J. Eeken; P.H.M. Lohmann (Paul); A. Pastink (Albert)

    1994-01-01

    The RAD52 gene of Saccharomyces cerevisiae is required for recombinational repair of double-strand breaks. Using degenerate oligonucleotides based on conserved amino acid sequences of RAD52 and rad22, its counterpart from Schizosaccharomyces pombe, RAD52 homologs from man and mouse were

  7. FASH: A web application for nucleotides sequence search

    Directory of Open Access Journals (Sweden)

    Chew Paul

    2008-05-01

    Full Text Available Abstract FASH (Fourier Alignment Sequence Heuristics) is a web application, based on the Fast Fourier Transform, for finding remote homologs within a long nucleic acid sequence. Given a query sequence and a long text-sequence (e.g., the human genome), FASH detects subsequences within the text that are remotely similar to the query. FASH offers an alternative approach to Blast/Fasta for querying long RNA/DNA sequences. FASH differs from these other approaches in that it does not depend on the existence of contiguous seed-sequences in its initial detection phase. The FASH web server is user friendly and very easy to operate. Availability FASH can be accessed at https://fash.bgu.ac.il:8443/fash/default.jsp (secured website)
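
    The general idea of Fourier-based sequence comparison can be sketched as follows: for each nucleotide, build 0/1 indicator vectors for the query and the text, and use the FFT to compute their cross-correlation, which gives the number of agreeing positions at every offset without requiring a contiguous seed. This is a textbook illustration under simplifying assumptions (ungapped comparison, a small pure-Python radix-2 FFT, hypothetical function names) and is not FASH's actual heuristic.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.
    With invert=True the result is the unnormalised inverse transform."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def match_counts(query, text):
    """For every offset of `query` inside `text`, count the positions
    at which the two sequences carry the same nucleotide."""
    m, n = len(query), len(text)
    size = 1
    while size < n + m:          # pad so the circular convolution is linear
        size *= 2
    counts = [0.0] * (n - m + 1)
    for letter in "ACGT":
        t = [1.0 if c == letter else 0.0 for c in text] + [0.0] * (size - n)
        # Reversing the query turns convolution into correlation.
        q = [1.0 if c == letter else 0.0 for c in reversed(query)] + [0.0] * (size - m)
        product = fft([x * y for x, y in zip(fft(t), fft(q))], invert=True)
        for offset in range(n - m + 1):
            counts[offset] += product[offset + m - 1].real / size
    return [round(c) for c in counts]


if __name__ == "__main__":
    print(match_counts("ACG", "ACGTACC"))
```

    Offsets with a high count are candidate regions of similarity; a practical tool would follow this screening step with a proper alignment.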

  8. Colored Kauffman homology and super-A-polynomials

    International Nuclear Information System (INIS)

    Nawata, Satoshi; Ramadevi, P.; Zodinmawia

    2014-01-01

    We study the structural properties of colored Kauffman homologies of knots. Quadruple-gradings play an essential role in revealing the differential structure of colored Kauffman homology. Using the differential structure, the Kauffman homologies carrying the symmetric tensor products of the vector representation for the trefoil and the figure-eight are determined. In addition, making use of relations from representation theory, we also obtain the HOMFLY homologies colored by rectangular Young tableaux with two rows for these knots. Furthermore, the notion of super-A-polynomials is extended in order to encompass two-parameter deformations of PSL(2,ℂ) character varieties

  9. Establishment of screening technique for mutant cell and analysis of base sequence in the mutation

    International Nuclear Information System (INIS)

    Sofuni, Toshio; Nomi, Takehiko; Yamada, Masami; Masumura, Kenichi

    2000-01-01

    This research project aimed to establish an easy and quick method for detecting radiation-induced mutations using molecular-biological techniques, together with an effective method for analyzing the resulting changes in base sequence. This year, Spi mutants derived from γ-irradiated mice were analyzed by PCR and DNA sequencing. Male transgenic mice were exposed to γ-rays at 5, 10, and 50 Gy, and the transgene was recovered from spleen genomic DNA by the in vivo packaging method. Spi mutant plaques were obtained by infecting E. coli with the recovered phage. Sequence analysis of the mutants was performed using an ALFred DNA sequencer and the SequiTherm TM Long-Red Cycle sequencing kit, and was carried out for 41 of the 50 independent Spi mutants obtained. The deletions were classified into 4 groups. Group 1 included 15 mutants characterized by a large deletion (43 bp-10 kb) with a short homologous sequence. Group 2 included 11 mutants with a large deletion and no homologous sequence at the junction. Group 3 included 11 mutants with a short deletion of less than 20 bp in the non-repetitive sequence of the gam gene, possibly caused by oxidative breakage of DNA or by recombination of DNA fragments produced by the breakage. Group 4 included 4 mutants with deletions of 20 bp or less in the repetitive sequence of the gam gene, resulting in an alteration of the reading frame. As a result, synthesis of the Gam protein was terminated by the appearance of TGA between codons 13 and 14 of the redB gene, leading to inactivation of the gam and redBA genes. These results indicated that most Spi mutants had a deletion in the red/gam region and that the deletions in more than half of the mutants occurred at homologous sequences as short as 8 bp. (M.N.)

  10. Protein thermostability prediction within homologous families using temperature-dependent statistical potentials.

    Directory of Open Access Journals (Sweden)

    Fabrizio Pucci

    Full Text Available The ability to rationally modify targeted physical and biological features of a protein of interest holds promise in numerous academic and industrial applications and paves the way towards de novo protein design. In particular, bioprocesses that utilize the remarkable properties of enzymes would often benefit from mutants that remain active at temperatures that are either higher or lower than the physiological temperature, while maintaining the biological activity. Many in silico methods have been developed in recent years for predicting the thermodynamic stability of mutant proteins, but very few have focused on thermostability. To bridge this gap, we developed an algorithm for predicting the best descriptor of thermostability, namely the melting temperature Tm, from the protein's sequence and structure. Our method is applicable when the Tm of proteins homologous to the target protein are known. It is based on the design of several temperature-dependent statistical potentials, derived from datasets consisting of either mesostable or thermostable proteins. Linear combinations of these potentials have been shown to yield an estimation of the protein folding free energies at low and high temperatures, and the difference of these energies, a prediction of the melting temperature. This particular construction, that distinguishes between the interactions that contribute more than others to the stability at high temperatures and those that are more stabilizing at low T, gives better performances compared to the standard approach based on T-independent potentials which predict the thermal resistance from the thermodynamic stability. Our method has been tested on 45 proteins of known Tm that belong to 11 homologous families. The standard deviation between experimental and predicted Tm's is equal to 13.6°C in cross validation, and decreases to 8.3°C if the 6 worst predicted proteins are excluded. Possible extensions of our approach are discussed.

  11. Cloning, sequencing, and expression of dnaK-operon proteins from the thermophilic bacterium Thermus thermophilus.

    Science.gov (United States)

    Osipiuk, J; Joachimiak, A

    1997-09-12

    We propose that the dnaK operon of Thermus thermophilus HB8 is composed of three functionally linked genes: dnaK, grpE, and dnaJ. The dnaK and dnaJ gene products are most closely related to their cyanobacterial homologs. The DnaK protein sequence places T. thermophilus in the plastid Hsp70 subfamily. In contrast, the grpE translated sequence is most similar to GrpE from Clostridium acetobutylicum, a Gram-positive anaerobic bacterium. A single promoter region, with homology to the Escherichia coli consensus promoter sequences recognized by the sigma70 and sigma32 transcription factors, precedes the postulated operon. This promoter is heat-shock inducible. The dnaK mRNA level increased more than 30 times upon 10 min of heat shock (from 70 degrees C to 85 degrees C). A strong transcription terminating sequence was found between the dnaK and grpE genes. The individual genes were cloned into pET expression vectors and the thermophilic proteins were overproduced at high levels in E. coli and purified to homogeneity. The recombinant T. thermophilus DnaK protein was shown to have a weak ATP-hydrolytic activity, with an optimum at 90 degrees C. The ATPase was stimulated by the presence of GrpE and DnaJ. Another open reading frame, coding for ClpB heat-shock protein, was found downstream of the dnaK operon.

  12. Induction of human immunodeficiency virus (HIV-1) envelope specific cell-mediated immunity by a non-homologous synthetic peptide.

    Directory of Open Access Journals (Sweden)

    Ammar Achour

    2007-11-01

    Full Text Available Cell mediated immunity, including efficient CTL response, is required to prevent HIV-1 from cell-to-cell transmission. In previous investigations, we have shown that B1 peptide derived by Fourier transformation of HIV-1 primary structures and sharing no sequence homology with the parent proteins was able to generate antiserum which recognizes envelope and Tat proteins. Here we have investigated cellular immune response towards a novel non-homologous peptide, referred to as cA1 peptide. The 20 amino acid sequence of cA1 peptide was predicted using the notion of peptide hydropathic properties; the peptide is encoded by the complementary anti-sense DNA strand to the sense strand of previously described non-homologous A1 peptide. In this report we demonstrate that the cA1 peptide can be a target for major histocompatibility complex (MHC) class I-restricted cytotoxic T lymphocytes in HIV-1-infected or envelope-immunized individuals. The cA1 peptide is recognized in association with different MHC class I allotypes and could prime in vitro CTLs, derived from gp160-immunized individuals, capable of recognizing virus variants. For the first time a theoretically designed immunogen involved in broad-based cell-immune memory activation is described. Our findings may thus contribute to the advance in vaccine research by describing a novel strategy to develop a synthetic AIDS vaccine.

  13. Genome-wide sequence variations among Mycobacterium avium subspecies paratuberculosis.

    Directory of Open Access Journals (Sweden)

    Chung-Yi eHsu

    2011-12-01

    Full Text Available Mycobacterium avium subspecies paratuberculosis (M. ap), the causative agent of Johne’s disease (JD), infects many farmed ruminants, wildlife animals and humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole genome sequences of several M. ap and M. avium subspecies avium (M. avium) strains isolated from various hosts and environments. Using Next-generation sequencing technology, all 6 M. ap isolates showed a high percentage of homology (98%) to the reference genome sequence of M. ap K-10 isolated from cattle. However, 2 M. avium isolates (DT 78 and Env 77) showed significant sequence diversity from the reference strain M. avium 104. The genomes of M. avium isolates DT 78 and Env 77 exhibited only 87% and 40% homology, respectively, to the M. avium 104 reference genome. Within the M. ap isolates, genomic rearrangements (insertions/deletions, Indels) were not detected, and only unique single nucleotide polymorphisms (SNPs) were observed among the 6 M. ap strains. While most of the SNPs (~100) in M. ap genomes were non-synonymous, a total of ~6000 SNPs were detected among M. avium genomes, most of them synonymous, suggesting a differential selective pressure between M. ap and M. avium isolates. In addition, SNP-based phylo-genomic analysis showed that isolates from goat and Oryx are closely related to the cattle (K-10) strain while the human isolate (M. ap 4B) is closely related to the environmental strains, indicating an environmental source of human infections. Overall, SNPs were the most common variations among M. ap isolates while SNPs in addition to Indels were prevalent among M. avium isolates. Genomic variations will be useful in designing host-specific markers for the analysis of mycobacterial evolution and for developing novel diagnostics directed against Johne’s disease in animals.

  14. Reconstruction of ancestral RNA sequences under multiple structural constraints

    Directory of Open Access Journals (Sweden)

    Olivier Tremblay-Savard

    2016-11-01

    Full Text Available Abstract Background Secondary structures form the scaffold of multiple sequence alignment of non-coding RNA (ncRNA) families. An accurate reconstruction of ancestral ncRNAs must use this structural signal. However, the inference of ancestors of a single ncRNA family with a single consensus structure may bias the results towards sequences with high affinity to this structure, which are far from the true ancestors. Methods In this paper, we introduce achARNement, a maximum parsimony approach that, given two alignments of homologous ncRNA families with consensus secondary structures and a phylogenetic tree, simultaneously calculates ancestral RNA sequences for these two families. Results We test our methodology on simulated data sets, and show that achARNement outperforms classical maximum parsimony approaches in terms of accuracy, but also reduces by several orders of magnitude the number of candidate sequences. To conclude this study, we apply our algorithms on the Glm clan and the FinP-traJ clan from the Rfam database. Conclusions Our results show that our methods reconstruct small sets of high-quality candidate ancestors with better agreement to the two target structures than with classical approaches. Our program is freely available at: http://csb.cs.mcgill.ca/acharnement .
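
    For context, the "classical maximum parsimony" baseline mentioned above can be illustrated with Fitch's bottom-up pass for a single alignment column on a rooted binary tree. The sketch below is that classical procedure only; it does not use secondary-structure constraints and is not achARNement. The nested-tuple tree encoding and the function name are assumptions made for the example.

```python
def fitch(node):
    """Bottom-up pass of Fitch's small-parsimony algorithm for one site.

    `node` is either a leaf state (a one-character string) or a
    (left, right) tuple of subtrees. Returns the set of candidate
    ancestral states at this node and the minimum number of
    substitutions required in the subtree below it."""
    if isinstance(node, str):
        return {node}, 0
    left_states, left_cost = fitch(node[0])
    right_states, right_cost = fitch(node[1])
    shared = left_states & right_states
    if shared:
        # The children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Otherwise keep every candidate and count one substitution.
    return left_states | right_states, left_cost + right_cost + 1


if __name__ == "__main__":
    tree = (("A", "A"), ("A", "G"))   # hypothetical four-leaf tree, one column
    print(fitch(tree))
```

    When the children disagree, the candidate set grows, which is one way to see why unconstrained parsimony can leave many equally good ancestral sequences across a whole alignment.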

  15. Statistical distributions of optimal global alignment scores of random protein sequences

    Directory of Open Access Journals (Sweden)

    Tang Jiaowei

    2005-10-01

    Full Text Available Abstract Background The inference of homology from statistically significant sequence similarity is a central issue in sequence alignments. So far the statistical distribution function underlying the optimal global alignments has not been completely determined. Results In this study, random and real but unrelated sequences prepared in six different ways were selected as reference datasets to obtain their respective statistical distributions of global alignment scores. All alignments were carried out with the Needleman-Wunsch algorithm and optimal scores were fitted to the Gumbel, normal and gamma distributions respectively. The three-parameter gamma distribution performs the best as the theoretical distribution function of global alignment scores, as it agrees perfectly well with the distribution of alignment scores. The normal distribution also agrees well with the score distribution frequencies when the shape parameter of the gamma distribution is sufficiently large, for this is the scenario when the normal distribution can be viewed as an approximation of the gamma distribution. Conclusion We have shown that the optimal global alignment scores of random protein sequences fit the three-parameter gamma distribution function. This would be useful for the inference of homology between sequences whose relationship is unknown, through the evaluation of gamma distribution significance between sequences.
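
    The global alignment scores discussed in this record come from the Needleman-Wunsch dynamic programme. A minimal sketch of the score computation follows. The match, mismatch and gap values are illustrative assumptions, not the scoring scheme used in the study, which the abstract does not state; the function name is likewise hypothetical.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Uses a linear gap penalty and a flat match/mismatch score
    (assumed values; a real protein study would normally use a
    substitution matrix instead).
    """
    # Row 0: aligning an empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against an empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + pair,   # align a[i-1] with b[j-1]
                            prev[j] + gap,        # gap in b
                            curr[j - 1] + gap))   # gap in a
        prev = curr
    return prev[-1]
```

    Scores collected this way over many random sequence pairs would form the empirical distribution to which the Gumbel, normal and gamma functions are fitted.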

  16. Identification and nucleotide sequence of the thymidine kinase gene of Shope fibroma virus

    International Nuclear Information System (INIS)

    Upton, C.; McFadden, G.

    1986-01-01

    The thymidine kinase (TK) gene of Shope fibroma virus (SFV), a tumorigenic leporipoxvirus, was localized within the viral genome with degenerate oligonucleotide probes. These probes were constructed to two regions of high sequence conservation between the vaccinia virus TK gene and those of several known eucaryotic cellular TK genes, including human, mouse, hamster, and chicken TK genes. The oligonucleotide probes initially localized the SFV TK gene 50 kilobases (kb) from the right terminus of the 160-kb SFV genome within the 9.5-kb BamHI-HindIII fragment E. Fine-mapping analysis indicated that the TK gene was within a 1.2-kb AvaI-HaeIII fragment, and DNA sequencing of this region revealed an open reading frame capable of encoding a polypeptide of 187 amino acids possessing considerable homology to the TK genes of the vaccinia, variola, and monkeypox orthopoxviruses and also to a variety of cellular TK genes. Homology matrix analysis and homology scores suggest that the SFV TK gene has diverged significantly from its counterpart members in the orthopoxvirus genus. Nevertheless, the presence of conserved upstream open reading frames on the 5' side of all of the poxvirus TK genes indicates a similarity of functional organization between the orthopoxviruses and leporipoxviruses. These data suggest a common ancestral origin for at least some of the unique internal regions of the leporipoxviruses and orthopoxviruses as exemplified by SFV and vaccinia virus, respectively.

  17. How to Choose the Suitable Template for Homology Modelling of GPCRs: 5-HT7 Receptor as a Test Case.

    Science.gov (United States)

    Shahaf, Nir; Pappalardo, Matteo; Basile, Livia; Guccione, Salvatore; Rayan, Anwar

    2016-09-01

    G protein-coupled receptors (GPCRs) are a super-family of membrane proteins that attract great pharmaceutical interest due to their involvement in almost every physiological activity, including extracellular stimuli, neurotransmission, and hormone regulation. Currently, structural information on many GPCRs is mainly obtained by the techniques of computer modelling in general and by homology modelling in particular. Based on a quantitative analysis of eighteen antagonist-bound, resolved structures of rhodopsin family "A" receptors - also used as templates to build 153 homology models - it was concluded that a higher sequence identity between two receptors does not guarantee a lower RMSD between their structures, especially when their pair-wise sequence identity (within trans-membrane domain and/or in binding pocket) lies between 25 % and 40 %. This study suggests that we should consider all template receptors having a sequence identity ≤50 % with the query receptor. In fact, most of the GPCRs, compared to the currently available resolved structures of GPCRs, fall within this range and lack a correlation between structure and sequence. When testing suitability for structure-based drug design, it was found that choosing as a template the most similar resolved protein, based on sequence resemblance only, led to unsound results in many cases. Molecular docking analyses were carried out, and enrichment factors as well as attrition rates were utilized as criteria for assessing suitability for structure-based drug design. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    Directory of Open Access Journals (Sweden)

    Jason D Thompson

    Full Text Available Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  19. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    Science.gov (United States)

    Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

    2012-01-01

    Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  20. Automated degenerate PCR primer design for high-throughput sequencing improves efficiency of viral sequencing

    Directory of Open Access Journals (Sweden)

    Li Kelvin

    2012-11-01

    Full Text Available Abstract Background In a high-throughput environment, to PCR amplify and sequence a large set of viral isolates from populations that are potentially heterogeneous and continuously evolving, the use of degenerate PCR primers is an important strategy. Degenerate primers allow for the PCR amplification of a wider range of viral isolates with only one set of pre-mixed primers, thus increasing amplification success rates and minimizing the necessity for genome finishing activities. To successfully select a large set of degenerate PCR primers necessary to tile across an entire viral genome and maximize their success, this process is best performed computationally. Results We have developed a fully automated degenerate PCR primer design system that plays a key role in the J. Craig Venter Institute’s (JCVI) high-throughput viral sequencing pipeline. A consensus viral genome, or a set of consensus segment sequences in the case of a segmented virus, is specified using IUPAC ambiguity codes in the consensus template sequence to represent the allelic diversity of the target population. PCR primer pairs are then selected computationally to produce a minimal amplicon set capable of tiling across the full length of the specified target region. As part of the tiling process, primer pairs are computationally screened to meet the criteria for successful PCR with one of two described amplification protocols. The actual sequencing success rates for designed primers for measles virus, mumps virus, human parainfluenza virus 1 and 3, human respiratory syncytial virus A and B and human metapneumovirus are described, where >90% of designed primer pairs were able to consistently successfully amplify >75% of the isolates. Conclusions Augmenting our previously developed and published JCVI Primer Design Pipeline, we achieved similarly high sequencing success rates with only minor software modifications. The recommended methodology for the construction of the consensus
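
    To illustrate what a degenerate primer written in IUPAC ambiguity codes stands for, the sketch below counts and enumerates the concrete primers it represents. It is not the JCVI pipeline; the function names `degeneracy` and `expand` are hypothetical, and only the standard IUPAC nucleotide code table is assumed.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct concrete sequences the degenerate primer encodes."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

def expand(primer):
    """List every concrete sequence encoded by the degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]
```

    A primer with high degeneracy covers more allelic variants but dilutes each concrete primer in the pre-mixed pool, which is one reason a design tool might screen candidates computationally.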

  1. Molecular cloning of chicken metallothionein. Deduction of the complete amino acid sequence and analysis of expression using cloned cDNA

    Energy Technology Data Exchange (ETDEWEB)

    Wei, D; Andrews, G K

    1988-01-25

    A cDNA library was constructed using RNA isolated from the livers of chickens which had been treated with zinc. This library was screened with an RNA probe complementary to mouse metallothionein-I (MT), and eight chicken MT cDNA clones were obtained. All of the cDNA clones contained nucleotide sequences homologous to regions of the longest (375 bp) cDNA clone. The latter contained an open reading frame of 189 bp, and the deduced amino acid sequence indicates a protein of 63 amino acids of which 20 are cysteine residues. Amino acid composition and partial amino acid sequence analyses of purified chicken MT protein agreed with the amino acid composition and sequence deduced from the cloned cDNA. Amino acid sequence comparisons establish that chicken MT shares extensive homology with mammalian MTs. Southern blot analysis of chicken DNA indicates that the chicken MT gene is not a part of a large family of related sequences, but rather is likely to be a unique gene sequence. In the chicken liver, levels of chicken MT mRNA were rapidly induced by metals (Cd2+, Zn2+, Cu2+), glucocorticoids and lipopolysaccharide. MT mRNA was present in low levels in embryonic liver and increased to high levels during the first week after hatching before decreasing again to the basal levels found in adult liver. The results of this study establish that MT is highly conserved between birds and mammals and is regulated in the chicken by agents which also regulate expression of mammalian MT genes. However, in contrast to the mammals, the results suggest the existence of a single isoform of MT in the chicken.

  2. Cloning and sequence analysis of cDNA coding for rat nucleolar protein C23

    International Nuclear Information System (INIS)

    Ghaffari, S.H.; Olson, M.O.J.

    1986-01-01

    Using synthetic oligonucleotides as primers and probes, the authors have isolated and sequenced cDNA clones encoding protein C23, a putative nucleolus organizer protein. Poly(A+) RNA was isolated from rat Novikoff hepatoma cells and enriched in C23 mRNA by sucrose density gradient ultracentrifugation. Two deoxyoligonucleotides, a 48- and a 27-mer, were synthesized on the basis of amino acid sequence from the C-terminal half of protein C23 and cDNA sequence data from CHO cell protein. The 48-mer was used as a primer for synthesis of cDNA, which was then inserted into plasmid pUC9. Transformed bacterial colonies were screened by hybridization with 32P-labeled 27-mer. Two clones among 5000 gave a strong positive signal. Plasmid DNAs from these clones were purified and characterized by blotting and nucleotide sequence analysis. The length of C23 mRNA was estimated to be 3200 bases in a northern blot analysis. The sequence of a 267-bp insert shows high homology with the CHO cDNA, with only 9 nucleotide differences and an identical amino acid sequence. These studies indicate that this region of the protein is highly conserved.

  3. Low-pass sequencing for microbial comparative genomics

    Directory of Open Access Journals (Sweden)

    Kennedy Sean

    2004-01-01

    Full Text Available Abstract Background We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the

  4. Escherichia coli promoter sequences predict in vitro RNA polymerase selectivity.

    OpenAIRE

    Mulligan, M E; Hawley, D K; Entriken, R; McClure, W R

    1984-01-01

    We describe a simple algorithm for computing a homology score for Escherichia coli promoters based on DNA sequence alone. The homology score was related to 31 values, measured in vitro, of RNA polymerase selectivity, which we define as the product K_B k_2, the apparent second order rate constant for open complex formation. We found that promoter strength could be predicted to within a factor of +/-4.1 in K_B k_2 over a range of 10^4 in the same parameter. The quantitative evaluation was linked to ...
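
    As a rough illustration of scoring a promoter from sequence alone, the sketch below counts matches to the well-known E. coli consensus hexamers (TTGACA at -35 and TATAAT at -10) and reports them as a percentage. This is a deliberate simplification and not the scoring scheme of the record above, whose details the truncated abstract does not give; the function names are hypothetical, and spacer length and position-specific weights are ignored.

```python
MINUS_35_CONSENSUS = "TTGACA"  # consensus -35 hexamer
MINUS_10_CONSENSUS = "TATAAT"  # consensus -10 hexamer

def hexamer_matches(site, consensus):
    """Count positions at which a hexamer agrees with the consensus."""
    return sum(1 for s, c in zip(site.upper(), consensus) if s == c)

def consensus_score(minus35, minus10):
    """Percentage of the 12 consensus positions matched (unweighted)."""
    if len(minus35) != 6 or len(minus10) != 6:
        raise ValueError("both sites must be hexamers")
    matched = (hexamer_matches(minus35, MINUS_35_CONSENSUS)
               + hexamer_matches(minus10, MINUS_10_CONSENSUS))
    return 100.0 * matched / 12
```

    A score of this kind could then be regressed against measured selectivity values; the unweighted count treats every position as equally informative, which is an assumption a real scoring scheme would likely relax.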

  5. Rfam: annotating families of non-coding RNA sequences.

    Science.gov (United States)

    Daub, Jennifer; Eberhardt, Ruth Y; Tate, John G; Burge, Sarah W

    2015-01-01

    The primary task of the Rfam database is to collate experimentally validated noncoding RNA (ncRNA) sequences from the published literature and facilitate the prediction and annotation of new homologues in novel nucleotide sequences. We group homologous ncRNA sequences into "families" and related families are further grouped into "clans." We collate and manually curate data cross-references for these families from other databases and external resources. Our Web site offers researchers a simple interface to Rfam and provides tools with which to annotate their own sequences using our covariance models (CMs), through our tools for searching, browsing, and downloading information on Rfam families. In this chapter, we will work through examples of annotating a query sequence, collating family information, and searching for data.

  6. PEST sequences in the malaria parasite Plasmodium falciparum: a genomic study

    Directory of Open Access Journals (Sweden)

    Bell Angus

    2003-06-01

    Full Text Available Abstract Background Inhibitors of the protease calpain are known to have selectively toxic effects on Plasmodium falciparum. The enzyme has a natural inhibitor calpastatin and in eukaryotes is responsible for turnover of proteins containing short sequences enriched in certain amino acids (PEST sequences. The genome of P. falciparum was searched for this protease, its natural inhibitor and putative substrates. Methods The publicly available P. falciparum genome was found to have too many errors to permit reliable analysis. An earlier annotation of chromosome 2 was instead examined. PEST scores were determined for all annotated proteins. The published genome was searched for calpain and calpastatin homologs. Results Typical PEST sequences were found in 13% of the proteins on chromosome 2, including a surprising number of cell-surface proteins. The annotated calpain gene has a non-biological "intron" that appears to have been created to avoid an unrecognized frameshift. Only the catalytic domain has significant similarity with the vertebrate calpains. No calpastatin homologs were found in the published annotation. Conclusion A calpain gene is present in the genome and many putative substrates of this enzyme have been found. Calpastatin homologs may be found once the re-annotation is completed. Given the selective toxicity of calpain inhibitors, this enzyme may be worth exploring further as a potential drug target.

  7. De novo sequencing of two novel peptides homologous to calcitonin-like peptides, from skin secretion of the Chinese Frog, Odorrana schmackeri

    NARCIS (Netherlands)

    Evaristo, Geisa P C; Pinkse, Martijn W H; Chen, Tianbao; Wang, Lei; Mohammed, Shabaz; Heck, Albert J R; Mathes, Isabella; Lottspeich, Friedrich; Shaw, Chris; Albar, Juan Pablo; Verhaert, Peter D E M

    2015-01-01

    An MS/MS based analytical strategy was followed to solve the complete sequence of two new peptides from frog (Odorrana schmackeri) skin secretion. This involved reduction and alkylation with two different alkylating agents followed by high resolution tandem mass spectrometry. De novo sequencing was

  8. Nucleotide sequence analysis of the Legionella micdadei mip gene, encoding a 30-kilodalton analog of the Legionella pneumophila Mip protein

    DEFF Research Database (Denmark)

    Bangsborg, Jette Marie; Cianciotto, N P; Hindersson, P

    1991-01-01

    After the demonstration of analogs of the Legionella pneumophila macrophage infectivity potentiator (Mip) protein in other Legionella species, the Legionella micdadei mip gene was cloned and expressed in Escherichia coli. DNA sequence analysis of the L. micdadei mip gene contained in the plasmid p...... homology with the mip-like genes of several Legionella species. Furthermore, amino acid sequence comparisons revealed significant homology to two eukaryotic proteins with isomerase activity (FK506-binding proteins)....

  9. Chromhome: a rich internet application for accessing comparative chromosome homology maps.

    Science.gov (United States)

    Nagarajan, Sridevi; Rens, Willem; Stalker, James; Cox, Tony; Ferguson-Smith, Malcolm A

    2008-03-26

    Comparative genomics has become a significant research area in recent years, following the availability of a number of sequenced genomes. The comparison of genomes is of great importance in the analysis of functionally important genome regions. It can also be used to understand the phylogenetic relationships of species and the mechanisms leading to rearrangement of karyotypes during evolution. Many species have been studied at the cytogenetic level by cross species chromosome painting. With the large amount of such information, it has become vital to computerize the data and make them accessible worldwide. Chromhome (http://www.chromhome.org) is a comprehensive web application that is designed to provide cytogenetic comparisons among species and to fulfil this need. The Chromhome application architecture is multi-tiered with an interactive client layer, business logic and database layers. The Enterprise Java platform with the open source framework OpenLaszlo is used to implement the Rich Internet Chromhome Application. Cross species comparative mapping raw data are collected and the processed information is stored in the MySQL Chromhome database. Chromhome Release 1.0 contains 109 homology maps from 51 species. The data cover species from 14 orders and 30 families. The homology map displays all the chromosomes of the compared species as one image, making comparisons among species easier. Inferred data also provide maps of homologous regions that could serve as a guideline for researchers involved in phylogenetic or evolution based studies. Chromhome provides a useful resource for comparative genomics, holding graphical homology maps of a wide range of species. It brings together cytogenetic data of many genomes under one roof. Inferred painting can often determine the chromosomal homologous regions between two species, if each has been compared with a common third species. Inferred painting greatly reduces the need to map entire genomes and helps focus only on relevant

  10. Chromhome: A rich internet application for accessing comparative chromosome homology maps

    Directory of Open Access Journals (Sweden)

    Cox Tony

    2008-03-01

    Full Text Available Abstract Background Comparative genomics has become a significant research area in recent years, following the availability of a number of sequenced genomes. The comparison of genomes is of great importance in the analysis of functionally important genome regions. It can also be used to understand the phylogenetic relationships of species and the mechanisms leading to rearrangement of karyotypes during evolution. Many species have been studied at the cytogenetic level by cross species chromosome painting. With the large amount of such information, it has become vital to computerize the data and make them accessible worldwide. Chromhome (http://www.chromhome.org) is a comprehensive web application that is designed to provide cytogenetic comparisons among species and to fulfil this need. Results The Chromhome application architecture is multi-tiered with an interactive client layer, business logic and database layers. The Enterprise Java platform with the open source framework OpenLaszlo is used to implement the Rich Internet Chromhome Application. Cross species comparative mapping raw data are collected and the processed information is stored in the MySQL Chromhome database. Chromhome Release 1.0 contains 109 homology maps from 51 species. The data cover species from 14 orders and 30 families. The homology map displays all the chromosomes of the compared species as one image, making comparisons among species easier. Inferred data also provide maps of homologous regions that could serve as a guideline for researchers involved in phylogenetic or evolution based studies. Conclusion Chromhome provides a useful resource for comparative genomics, holding graphical homology maps of a wide range of species. It brings together cytogenetic data of many genomes under one roof. Inferred painting can often determine the chromosomal homologous regions between two species, if each has been compared with a common third species. Inferred painting greatly reduces the need to

  11. Construction of a novel kind of expression plasmid by homologous recombination in Saccharomyces cerevisiae

    Institute of Scientific and Technical Information of China (English)

    CHEN; Xiangling

    2005-01-01


  12. On (co)homology of Frobenius Poisson algebras

    OpenAIRE

    Zhu, Can; Van Oystaeyen, Fred; ZHANG, Yinhuo

    2014-01-01

    In this paper, we study Poisson (co)homology of a Frobenius Poisson algebra. More precisely, we show that there exists a duality between Poisson homology and Poisson cohomology of Frobenius Poisson algebras, similar to that between Hochschild homology and Hochschild cohomology of Frobenius algebras. Then we use the non-degenerate bilinear form on a unimodular Frobenius Poisson algebra to construct a Batalin-Vilkovisky structure on the Poisson cohomology ring making it into a Batalin-Vilkovisk...

  13. N-terminal sequence of human leukocyte glycoprotein Mo1: conservation across species and homology to platelet IIb/IIIa.

    Science.gov (United States)

    Pierce, M W; Remold-O'Donnell, E; Todd, R F; Arnaout, M A

    1986-12-12

    Mo1 and gp160-gp93 are two surface membrane glycoprotein heterodimers present on granulocytes and monocytes derived from humans and guinea pigs, respectively. We purified both antigens and found that their alpha subunits had identical N-termini which were significantly homologous to the alpha subunit of the human adhesion platelet glycoprotein IIb/IIIa.

  14. Compilation and analysis of Escherichia coli promoter DNA sequences.

    OpenAIRE

    Hawley, D K; McClure, W R

    1983-01-01

    The DNA sequences of 168 promoter regions (-50 to +10) for Escherichia coli RNA polymerase were compiled. The complete listing was divided into two groups depending upon whether or not the promoter had been defined by genetic (promoter mutations) or biochemical (5' end determination) criteria. A consensus promoter sequence based on homologies among 112 well-defined promoters was determined that was in substantial agreement with previous compilations. In addition, we have tabulated 98 promoter ...

  15. The Coding and Effector Transfer of Movement Sequences

    Science.gov (United States)

    Kovacs, Attila J.; Muhlbauer, Thomas; Shea, Charles H.

    2009-01-01

    Three experiments utilizing a 14-element arm movement sequence were designed to determine if reinstating the visual-spatial coordinates, which require movements to the same spatial locations utilized during acquisition, results in better effector transfer than reinstating the motor coordinates, which require the same pattern of homologous muscle…

  16. Ecological genomics in Xanthomonas: the nature of genetic adaptation with homologous recombination and host shifts

    KAUST Repository

    Huang, Chao-Li; Pu, Pei-Hua; Huang, Hao-Jen; Sung, Huang-Mo; Liaw, Hung-Jiun; Chen, Yi-Min; Chen, Chien-Ming; Huang, Ming-Ban; Osada, Naoki; Gojobori, Takashi; Pai, Tun-Wen; Chen, Yu-Tin; Hwang, Chi-Chuan; Chiang, Tzen-Yuh

    2015-01-01

    Background: Comparative genomics provides insights into the diversification of bacterial species. Bacterial speciation usually takes place with lasting homologous recombination, which not only acts as a cohering force between diverging lineages but brings advantageous alleles favored by natural selection, and results in ecologically distinct species, e.g., frequent host shift in Xanthomonas pathogenic to various plants. Results: Using whole-genome sequences, we examined the genetic divergence in Xanthomonas campestris that infected Brassicaceae, and X. citri, pathogenic to a wider host range. Genetic differentiation between two incipient races of X. citri pv. mangiferaeindicae was attributable to a DNA fragment introduced by phages. In contrast to most portions of the genome that had nearly equivalent levels of genetic divergence between subspecies as a result of the accumulation of point mutations, 10% of the core genome involving with homologous recombination contributed to the diversification in Xanthomonas, as revealed by the correlation between homologous recombination and genomic divergence. Interestingly, 179 genes were under positive selection; 98 (54.7%) of these genes were involved in homologous recombination, indicating that foreign genetic fragments may have caused the adaptive diversification, especially in lineages with nutritional transitions. Homologous recombination may have provided genetic materials for the natural selection, and host shifts likely triggered ecological adaptation in Xanthomonas. To a certain extent, we observed positive selection nevertheless contributed to ecological divergence beyond host shifting. Conclusion: Altogether, mediated with lasting gene flow, species formation in Xanthomonas was likely governed by natural selection that played a key role in helping the deviating populations to explore novel niches (hosts) or respond to environmental cues, subsequently triggering species diversification. © Huang et al.

  17. Ecological genomics in Xanthomonas: the nature of genetic adaptation with homologous recombination and host shifts

    KAUST Repository

    Huang, Chao-Li

    2015-03-15

    Background: Comparative genomics provides insights into the diversification of bacterial species. Bacterial speciation usually takes place with lasting homologous recombination, which not only acts as a cohering force between diverging lineages but brings advantageous alleles favored by natural selection, and results in ecologically distinct species, e.g., frequent host shift in Xanthomonas pathogenic to various plants. Results: Using whole-genome sequences, we examined the genetic divergence in Xanthomonas campestris that infected Brassicaceae, and X. citri, pathogenic to a wider host range. Genetic differentiation between two incipient races of X. citri pv. mangiferaeindicae was attributable to a DNA fragment introduced by phages. In contrast to most portions of the genome that had nearly equivalent levels of genetic divergence between subspecies as a result of the accumulation of point mutations, 10% of the core genome involving with homologous recombination contributed to the diversification in Xanthomonas, as revealed by the correlation between homologous recombination and genomic divergence. Interestingly, 179 genes were under positive selection; 98 (54.7%) of these genes were involved in homologous recombination, indicating that foreign genetic fragments may have caused the adaptive diversification, especially in lineages with nutritional transitions. Homologous recombination may have provided genetic materials for the natural selection, and host shifts likely triggered ecological adaptation in Xanthomonas. To a certain extent, we observed positive selection nevertheless contributed to ecological divergence beyond host shifting. Conclusion: Altogether, mediated with lasting gene flow, species formation in Xanthomonas was likely governed by natural selection that played a key role in helping the deviating populations to explore novel niches (hosts) or respond to environmental cues, subsequently triggering species diversification. © Huang et al.

  18. Molecular cloning, sequence characterization and expression pattern of Rab18 gene from watermelon (Citrullus lanatus).

    Science.gov (United States)

    Xinli, Xiao; Lei, Peng

    2015-03-04

    The complete mRNA sequence of watermelon Rab18 gene was amplified through the rapid amplification of cDNA ends (RACE) method. The full-length mRNA was 1010 bp containing a 645 bp open reading frame, which encodes a protein of 214 amino acids. Sequence analysis revealed that watermelon Rab18 protein shares high homology with the Rab18 of cucumber (99%), muskmelon (98%), Morus notabilis (90%), tomato (89%), wine grape (89%) and potato (88%). Phylogenetic analysis revealed that watermelon Rab18 gene has a closer genetic relationship with Rab18 gene of cucumber and muskmelon. Tissue expression profile analysis indicated that watermelon Rab18 gene was highly expressed in root, stem and leaf, moderately expressed in flower and weakly expressed in fruit.
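
    The relation between the reported 645 bp open reading frame and the 214-residue protein can be checked with a few lines of arithmetic. This is a minimal sketch; the function name is hypothetical, and it assumes the ORF length includes the stop codon, which the abstract does not state explicitly.

```python
def orf_protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumption: the ORF length counts the stop codon unless told otherwise.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # The stop codon does not code for an amino acid.
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    # 645 bp -> 215 codons -> 214 amino acids plus one stop codon
    print(orf_protein_length(645))
    # Remaining untranslated sequence in the 1010 bp mRNA
    print(1010 - 645)
```

    Under that assumption the numbers in the abstract are mutually consistent: 215 codons, one of which is the stop codon.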

  19. Full-length cDNA sequences from Rhesus monkey placenta tissue: analysis and utility for comparative mapping

    Directory of Open Access Journals (Sweden)

    Lee Sang-Rae

    2010-07-01

    Full Text Available Abstract Background Rhesus monkeys (Macaca mulatta) are widely used as experimental animals in biomedical research and are closely related to other laboratory macaques, such as cynomolgus monkeys (Macaca fascicularis), and to humans, sharing a last common ancestor from about 25 million years ago. Although rhesus monkeys have been studied extensively under field and laboratory conditions, research has been limited by the lack of genetic resources. The present study generated placenta full-length cDNA libraries, characterized the resulting expressed sequence tags, and described their utility for comparative mapping with human RefSeq mRNA transcripts. Results From rhesus monkey placenta full-length cDNA libraries, 2000 full-length cDNA sequences were determined and 1835 rhesus placenta cDNA sequences longer than 100 bp were collected. These sequences were annotated based on homology to human genes. Homology search against human RefSeq mRNAs revealed that our collection included the sequences of 1462 putative rhesus monkey genes. Moreover, we identified 207 genes containing exon alterations in the coding region and the untranslated region of rhesus monkey transcripts, despite the highly conserved structure of the coding regions. Approximately 10% (187) of all full-length cDNA sequences did not represent any public human RefSeq mRNAs. Intriguingly, two rhesus monkey specific exons derived from the transposable elements of AluYRa2 (SINE family) and MER11B (LTR family) were also identified. Conclusion The 1835 rhesus monkey placenta full-length cDNA sequences described here could expand genomic resources and information of rhesus monkeys. This increased genomic information will greatly contribute to the development of evolutionary biology and biomedical research.

  20. Homologous Recombination in Protozoan Parasites and Recombinase Inhibitors

    Directory of Open Access Journals (Sweden)

    Andrew A. Kelso

    2017-09-01

    Full Text Available Homologous recombination (HR) is a DNA double-strand break (DSB) repair pathway that utilizes a homologous template to fully repair the damaged DNA. HR is critical to maintain genome stability and to ensure genetic diversity during meiosis. A specialized class of enzymes known as recombinases facilitates the exchange of genetic information between sister chromatids or homologous chromosomes with the help of numerous protein accessory factors. The majority of the HR machinery is highly conserved among eukaryotes. In many protozoan parasites, HR is an essential DSB repair pathway that allows these organisms to adapt to environmental conditions and evade host immune systems through genetic recombination. Therefore, small molecule inhibitors capable of disrupting HR in protozoan parasites represent potential therapeutic options. A number of small molecule inhibitors were identified that disrupt the activities of the human recombinase RAD51. Recent studies have examined the effect of two of these molecules on the Entamoeba recombinases. Here, we discuss the current understanding of HR in the protozoan parasites Trypanosoma, Leishmania, Plasmodium, and Entamoeba, and we review the small molecule inhibitors known to disrupt human RAD51 activity.

  1. Damping at high homologous temperature in pure Cd, In, Pb, and Sn

    International Nuclear Information System (INIS)

    Cook, L.S.; Lakes, R.S.

    1995-01-01

    Typically, if a material possesses the stiffness necessary to be considered a structural material, its damping is low. Conversely, materials with high damping usually do not possess the stiffness necessary to be considered a structural material. Candidate materials for the high stiffness-low damping phase exist in abundance, whereas candidate materials for the moderate stiffness-high damping phase remain to be identified. One possible class of candidate materials for the moderate stiffness-high damping phase is metals at high homologous temperatures. Shear moduli of the specimens at 100 Hz are as follows: 4.1 GPa for indium, 5.7 GPa for lead, 15.7 GPa for tin, and 20.7 GPa for cadmium. Considering the behavior typical of metals, one may think of In and Pb as relatively compliant, while Sn and Cd could be called moderately stiff. The results are of some technological interest in view of the utility of materials with moderately high stiffness and damping. The combination of moderate stiffness and reasonably high loss tangent makes Cd the most promising metal tested with respect to technological applications. The shear modulus of Cd was the highest of the metals tested (and very near that of aluminum (G = 27 GPa), which exhibits a loss tangent of about 0.001 at room temperature). The loss tangent of Cd at audio frequencies was as high as or higher than that of the other metals. In addition, the frequency dependence of its loss tangent was not as large as that observed in the other metals. No clear pattern relating damping to melting point emerged. An understanding in terms of viscoelastic mechanisms is not forthcoming at this time. Among the metals studied, cadmium exhibited a substantial loss tangent of 0.03 to 0.04 over much of the audio range, combined with a moderate stiffness, G = 20.7 GPa.

  2. FASTERp: A Feature Array Search Tool for Estimating Resemblance of Protein Sequences

    Energy Technology Data Exchange (ETDEWEB)

    Macklin, Derek; Egan, Rob; Wang, Zhong

    2014-03-14

    Metagenome sequencing efforts have provided a large pool of billions of genes for identifying enzymes with desirable biochemical traits. However, homology search with billions of genes in a rapidly growing database has become increasingly computationally impractical. Here we present our pilot efforts to develop a novel alignment-free algorithm for homology search. Specifically, we represent individual proteins as feature vectors that denote the presence or absence of short kmers in the protein sequence. Similarity between feature vectors is then computed using the Tanimoto score, a distance metric that can be rapidly computed on bit string representations of feature vectors. Preliminary results indicate good correlation with optimal alignment algorithms (Spearman r of 0.87, ~1,000,000 proteins from Pfam), as well as with heuristic algorithms such as BLAST (Spearman r of 0.86, ~1,000,000 proteins). Furthermore, a prototype of FASTERp implemented in Python runs approximately four times faster than BLAST on a small scale dataset (~1000 proteins). We are optimizing and scaling to improve FASTERp to enable rapid homology searches against billion-protein databases, thereby enabling more comprehensive gene annotation efforts.
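
    The idea described in this abstract, k-mer presence/absence vectors compared with a Tanimoto score on bit strings, can be illustrated as follows. This is a minimal sketch and not the FASTERp implementation: the function names, the choice of k = 2, and the use of a Python integer as the bit string are all assumptions made for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_index(k: int) -> dict:
    """Map every possible k-mer over the 20-letter alphabet to a bit position."""
    return {"".join(p): i for i, p in enumerate(product(AMINO_ACIDS, repeat=k))}


def feature_bits(seq: str, k: int, index: dict) -> int:
    """Encode presence/absence of each k-mer in seq as bits of one integer."""
    bits = 0
    for i in range(len(seq) - k + 1):
        pos = index.get(seq[i:i + k])
        if pos is not None:  # skip k-mers with non-standard residues
            bits |= 1 << pos
    return bits


def tanimoto(a: int, b: int) -> float:
    """Tanimoto score: shared set bits divided by bits set in either vector."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0
    return bin(a & b).count("1") / union


if __name__ == "__main__":
    idx = kmer_index(2)
    x = feature_bits("MKVLA", 2, idx)  # 2-mers: MK, KV, VL, LA
    y = feature_bits("MKVLG", 2, idx)  # 2-mers: MK, KV, VL, LG
    print(tanimoto(x, y))  # 3 shared / 5 total -> 0.6
```

    Because the score reduces to bitwise AND, OR and a population count, it needs no alignment step, which is the property the abstract relies on for speed.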

  3. Highly multiplexed targeted DNA sequencing from single nuclei.

    Science.gov (United States)

    Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E

    2016-02-01

    Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.

  4. Two zebrafish G2A homologs activate multiple intracellular signaling pathways in acidic environment

    Energy Technology Data Exchange (ETDEWEB)

    Ichijo, Yuta; Mochimaru, Yuta [Laboratory of Cell Signaling Regulation, Department of Life Sciences, School of Agriculture, Meiji University, Kawasaki 214-8571 (Japan); Azuma, Morio [Laboratory of Regulatory Biology, Graduate School of Science and Engineering, University of Toyama, 3190-Gofuku, Toyama 930-8555 (Japan); Satou, Kazuhiro; Negishi, Jun [Laboratory of Cell Signaling Regulation, Department of Life Sciences, School of Agriculture, Meiji University, Kawasaki 214-8571 (Japan); Nakakura, Takashi [Department of Anatomy, Graduate School of Medicine, Teikyo University, 2-11-1 Itabashi-Ku, Tokyo 173-8605 (Japan); Oshima, Natsuki [Laboratory of Cell Signaling Regulation, Department of Life Sciences, School of Agriculture, Meiji University, Kawasaki 214-8571 (Japan); Mogi, Chihiro; Sato, Koichi [Laboratory of Signal Transduction, Institute for Molecular and Cellular Regulation, Gunma University, Maebashi 371-8512 (Japan); Matsuda, Kouhei [Laboratory of Regulatory Biology, Graduate School of Science and Engineering, University of Toyama, 3190-Gofuku, Toyama 930-8555 (Japan); Okajima, Fumikazu [Laboratory of Signal Transduction, Institute for Molecular and Cellular Regulation, Gunma University, Maebashi 371-8512 (Japan); Tomura, Hideaki, E-mail: tomurah@meiji.ac.jp [Laboratory of Cell Signaling Regulation, Department of Life Sciences, School of Agriculture, Meiji University, Kawasaki 214-8571 (Japan)

    2016-01-01

    Human G2A is activated by various stimuli such as lysophosphatidylcholine (LPC), 9-hydroxyoctadecadienoic acid (9-HODE), and protons. The receptor is coupled to multiple intracellular signaling pathways, including the Gs-protein/cAMP/CRE, G12/13-protein/Rho/SRE, and Gq-protein/phospholipase C/NFAT pathways. In the present study, we examined whether zebrafish G2A homologs (zG2A-a and zG2A-b) could respond to these stimuli and activate multiple intracellular signaling pathways. We also examined whether a histidine residue and a basic amino acid residue in the N-terminus of the homologs play roles similar to those played by human G2A residues if the homologs sense protons. We found that zG2A-a showed high CRE, SRE, and NFAT activities; however, zG2A-b showed only high SRE activity under a pH of 8.0. Extracellular acidification from pH 7.4 to 6.3 ameliorated these activities in zG2A-a-expressing cells. On the other hand, acidification ameliorated the SRE activity but not the CRE and NFAT activities in zG2A-b-expressing cells. LPC or 9-HODE did not modify any activity of either homolog. The substitution of the histidine residue at the 174th position from the N-terminus of zG2A-a with an asparagine residue attenuated proton-induced CRE and NFAT activities but not SRE activity. The substitution of the arginine residue at the 32nd position from the N-terminus of zG2A-a with an alanine residue also attenuated its high and the proton-induced CRE and NFAT activities. On the contrary, the substitution did not attenuate SRE activity. The substitution of the arginine residue at the 10th position from the N-terminus of zG2A-b with an alanine residue also did not attenuate its high or the proton-induced SRE activity. These results indicate that zebrafish G2A homologs were activated by protons but not by LPC and 9-HODE, and the activation mechanisms of the homologs were similar to those of human G2A. - Highlights: • Zebrafish two G2A homologs are proton

  5. A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities

    OpenAIRE

    Maréchal Eric; Ortet Philippe; Roy Sylvaine; Bastien Olivier

    2005-01-01

    Abstract Background Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic recon...

  6. Sifting through genomes with iterative-sequence clustering produces a large, phylogenetically diverse protein-family resource

    Directory of Open Access Journals (Sweden)

    Sharpton Thomas J

    2012-10-01

    Full Text Available Abstract Background New computational resources are needed to manage the increasing volume of biological data from genome sequencing projects. One fundamental challenge is the ability to maintain a complete and current catalog of protein diversity. We developed a new approach for the identification of protein families that focuses on the rapid discovery of homologous protein sequences. Results We implemented fully automated and high-throughput procedures to de novo cluster proteins into families based upon global alignment similarity. Our approach employs an iterative clustering strategy in which homologs of known families are sifted out of the search for new families. The resulting reduction in computational complexity enables us to rapidly identify novel protein families found in new genomes and to perform efficient, automated updates that keep pace with genome sequencing. We refer to protein families identified through this approach as “Sifting Families,” or SFams. Our analysis of ~10.5 million protein sequences from 2,928 genomes identified 436,360 SFams, many of which are not represented in other protein family databases. We validated the quality of SFam clustering through statistical as well as network topology–based analyses. Conclusions We describe the rapid identification of SFams and demonstrate how they can be used to annotate genomes and metagenomes. The SFam database catalogs protein-family quality metrics, multiple sequence alignments, hidden Markov models, and phylogenetic trees. Our source code and database are publicly available and will be subject to frequent updates (http://edhar.genomecenter.ucdavis.edu/sifting_families/).
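
    The "sift, then cluster" strategy described in this abstract can be sketched roughly as below. This is an illustrative toy and not the SFam pipeline: the function names and the 0.8 threshold are hypothetical, and `difflib.SequenceMatcher` is only a stand-in for a real global-alignment similarity score.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in for a global-alignment similarity score (assumption)."""
    return SequenceMatcher(None, a, b).ratio()


def sift_and_cluster(known: dict, new_seqs: list, threshold: float = 0.8):
    """Assign new sequences to known families first, then cluster the rest.

    known maps a family name to its member list; the first member is
    treated as the family representative. Returns (known, new_families).
    """
    leftovers = []
    # Step 1: sift out homologs of families that are already known.
    for seq in new_seqs:
        for members in known.values():
            if similarity(seq, members[0]) >= threshold:
                members.append(seq)
                break
        else:
            leftovers.append(seq)

    # Step 2: greedily cluster only the sequences that were not sifted out.
    new_families = {}
    for seq in leftovers:
        for members in new_families.values():
            if similarity(seq, members[0]) >= threshold:
                members.append(seq)
                break
        else:
            new_families[f"new_{len(new_families) + 1}"] = [seq]
    return known, new_families


if __name__ == "__main__":
    known = {"fam1": ["AAAAKKKK"]}
    incoming = ["AAAAKKKR", "WWWWYYYY", "WWWWYYYF"]
    known, novel = sift_and_cluster(known, incoming)
    print(known)
    print(novel)
```

    The saving comes from step 1: only sequences that match no known family enter the more expensive search for new families.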

  7. Germline Chromothripsis Driven by L1-Mediated Retrotransposition and Alu/Alu Homologous Recombination

    DEFF Research Database (Denmark)

    Nazaryan-Petersen, Lusine; Bertelsen, Birgitte; Bak, Mads

    2016-01-01

    Chromothripsis (CTH) is a phenomenon where multiple localized double-stranded DNA breaks result in complex genomic rearrangements. Although the DNA-repair mechanisms involved in CTH have been described, the mechanisms driving the localized "shattering" process remain unclear. High-throughput sequence analysis of a familial germline CTH revealed an inserted SVAE retrotransposon associated with a 110-kb deletion displaying hallmarks of L1-mediated retrotransposition. Our analysis suggests that the SVAE insertion did not occur prior to or after, but concurrent with the CTH event. We also observed...... L1-endonuclease potential target sites in other breakpoints. In addition, we found four Alu elements flanking the 110-kb deletion and associated with an inversion. We suggest that chromatin looping mediated by homologous Alu elements may have brought distal DNA regions into close proximity...

  8. Exome sequencing identifies ZNF644 mutations in high myopia.

    Directory of Open Access Journals (Sweden)

    Yi Shi

    2011-06-01

    Full Text Available Myopia is the most common ocular disorder worldwide, and high myopia in particular is one of the leading causes of blindness. Genetic factors play a critical role in the development of myopia, especially high myopia. Recently, the exome sequencing approach has been successfully used for the disease gene identification of Mendelian disorders. Here we show a successful application of exome sequencing to identify a gene for an autosomal dominant disorder, and we have identified a gene potentially responsible for high myopia in a monogenic form. We captured exomes of two affected individuals from a Han Chinese family with high myopia and performed sequencing analysis by a second-generation sequencer with a mean coverage of 30× and sufficient depth to call variants at ∼97% of each targeted exome. The shared genetic variants of these two affected individuals in the family being studied were filtered against the 1000 Genomes Project and the dbSNP131 database. A mutation A672G in zinc finger protein 644 isoform 1 (ZNF644) was identified as being related to the phenotype of this family. After we performed sequencing analysis of the exons in the ZNF644 gene in 300 sporadic cases of high myopia, we identified an additional five mutations (I587V, R680G, C699Y, 3'UTR+12 C>G, and 3'UTR+592 G>A) in 11 different patients. All these mutations were absent in 600 normal controls. The ZNF644 gene was expressed in human retina and retinal pigment epithelium (RPE). Given that ZNF644 is predicted to be a transcription factor that may regulate genes involved in eye development, mutation may cause the axial elongation of the eyeball found in high myopia patients. Our results suggest that ZNF644 might be a causal gene for high myopia in a monogenic form.

  9. Integration of vectors by homologous recombination in the plant pathogen Glomerella cingulata.

    Science.gov (United States)

    Rikkerink, E H; Solon, S L; Crowhurst, R N; Templeton, M D

    1994-03-01

    A homologous transformation system has been developed for the plant pathogenic fungus Glomerella cingulata (Colletotrichum gloeosporioides). A transformation vector containing the G. cingulata gpdA promoter fused to the hygromycin phosphotransferase gene was constructed. Southern analyses indicated that this vector integrated at single sites in most transformants. A novel method of PCR amplification across the recombination junction point indicated that the integration event occurred by homologous recombination in more than 95% of the transformants. Deletion studies demonstrated that 505 bp (the minimum length of homologous promoter DNA analysed which was still capable of promoter function) was sufficient to target integration events. Homologous integration of the vector resulted in duplication of the gpdA promoter region. When transformants were grown without selective pressure, a high incidence of vector excision by recombination between the duplicated regions was evident. The significance of these recombination characteristics is discussed with reference to the feasibility of performing gene disruption experiments.

  10. Exome sequencing generates high quality data in non-target regions

    Directory of Open Access Journals (Sweden)

    Guo Yan

    2012-05-01

    Full Text Available Abstract Background Exome sequencing using next-generation sequencing technologies is a cost-efficient approach to selectively sequencing coding regions of the human genome for detection of disease variants. A significant amount of DNA fragments from the capture process fall outside target regions, and sequence data for positions outside target regions have been mostly ignored after alignment. Result We performed whole exome sequencing on 22 subjects using Agilent SureSelect capture reagent and 6 subjects using Illumina TruSeq capture reagent. We also downloaded sequencing data for 6 subjects from the 1000 Genomes Project Pilot 3 study. Using these data, we examined the quality of SNPs detected outside target regions by computing consistency rate with genotypes obtained from SNP chips or the HapMap database, transition-transversion (Ti/Tv) ratio, and percentage of SNPs inside dbSNP. For all three platforms, we obtained high-quality SNPs outside target regions, and some far from target regions. In our Agilent SureSelect data, we obtained 84,049 high-quality SNPs outside target regions compared to 65,231 SNPs inside target regions (129% of the in-target count). For our Illumina TruSeq data, we obtained 222,171 high-quality SNPs outside target regions compared to 95,818 SNPs inside target regions (232% of the in-target count). For the data from the 1000 Genomes Project, we obtained 7,139 high-quality SNPs outside target regions compared to 1,548 SNPs inside target regions (461% of the in-target count). Conclusions These results demonstrate that a significant amount of high quality genotypes outside target regions can be obtained from exome sequencing data. These data should not be ignored in genetic epidemiology studies.
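
    Two of the quantities in this abstract are simple to reproduce: the transition-transversion (Ti/Tv) ratio used as a quality metric, and the percentages quoted for each platform, which correspond to the out-of-target SNP count divided by the in-target count. The following is a minimal sketch; the function names and the (ref, alt) tuple representation of a SNP are assumptions for illustration, not the authors' code.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """A transition swaps purine<->purine or pyrimidine<->pyrimidine."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def titv_ratio(snps) -> float:
    """Ti/Tv ratio for an iterable of (ref, alt) single-base substitutions."""
    ti = tv = 0
    for ref, alt in snps:
        if ref == alt:
            continue  # not a variant
        if is_transition(ref, alt):
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions; Ti/Tv is undefined")
    return ti / tv


def percent_of(outside: int, inside: int) -> float:
    """Out-of-target SNP count as a percentage of the in-target count."""
    return 100.0 * outside / inside


if __name__ == "__main__":
    print(titv_ratio([("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]))  # 3.0
    # Counts taken from the abstract
    for out_n, in_n in [(84049, 65231), (222171, 95818), (7139, 1548)]:
        print(round(percent_of(out_n, in_n)))  # 129, 232, 461
```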

  11. Rapid Acquisition of Linezolid Resistance in Methicillin-Resistant Staphylococcus aureus: Role of Hypermutation and Homologous Recombination.

    Science.gov (United States)

    Iguchi, Shigekazu; Mizutani, Tomonori; Hiramatsu, Keiichi; Kikuchi, Ken

    2016-01-01

    We previously reported the case of a 64-year-old man with mediastinitis caused by Staphylococcus aureus in which the infecting bacterium acquired linezolid resistance after only 14 days treatment with linezolid. We therefore investigated relevant clinical isolates for possible mechanisms of this rapid acquisition of linezolid resistance. Using clinical S. aureus isolates, we assessed the in vitro mutation rate and performed stepwise selection for linezolid resistance. To investigate homologous recombination, sequences were determined for each of the 23S ribosomal RNA (23S rRNA) loci; analyzed sequences spanned the entirety of each 23S rRNA gene, including domain V, as well as the 16S-23S intergenic spacer regions. We additionally performed next-generation sequencing on clinical strains to identify single-nucleotide polymorphisms compared to the N315 genome. Strains isolated from the patient prior to linezolid exposure (M5-M7) showed higher-level linezolid resistance than N315, and the pre-exposure strain (M2) exhibited more rapid acquisition of linezolid resistance than did N315. However, the mutation rates of these and contemporaneous clinical isolates were similar to those of N315, and the isolates did not exhibit any mutations in hypermutation-related genes. Sequences of the 23S rRNA genes and 16S-23S intergenic spacer regions were identical among the pre- and post-exposure clinical strains. Notably, all of the pre-exposure isolates harbored a recQ missense mutation (Glu69Asp) with respect to N315; such a lesion may have affected short sequence recombination (facilitating, for example, recombination among rrn loci). We hypothesize that this mechanism contributed to rapid acquisition of linezolid resistance. Hypermutation and homologous recombination of the ribosomal RNA genes, including 23S rRNA genes, appear not to have been sources of the accelerated acquisition of linezolid resistance observed in our clinical case. Increased frequency of short sequence

  12. The Cloning and Functional Characterization of Peach CONSTANS and FLOWERING LOCUS T Homologous Genes PpCO and PpFT.

    Directory of Open Access Journals (Sweden)

    Xiang Zhang

    Full Text Available Flowering is an essential stage of plant growth and development. The successful transition to flowering not only ensures the completion of plant life cycles, it also serves as the basis for the production of economically important seeds and fruits. CONSTANS (CO) and FLOWERING LOCUS T (FT) are two genes playing critical roles in flowering time control in Arabidopsis. Through homology-based cloning and rapid amplification of cDNA ends (RACE), we obtained full-length cDNA sequences of Prunus persica CO (PpCO) and Prunus persica FT (PpFT) from peach (Prunus persica (L.) Batsch) and investigated their functions in flowering time regulation. PpCO and PpFT showed high homology to Arabidopsis CO and FT at the DNA, mRNA and protein levels. We showed that PpCO and PpFT were nucleus-localized and both showed transcriptional activation activities in yeast cells, consistent with their potential roles as transcription activators. Moreover, we established that the over-expression of PpCO could restore the late flowering phenotype of the Arabidopsis co-2 mutant, and the late flowering defect of the Arabidopsis ft-1 mutant can be rescued by the over-expression of PpFT, suggesting functional conservation of CO and FT genes in peach and Arabidopsis. Our results suggest that PpCO and PpFT are homologous genes of CO and FT in peach and they may function in regulating plant flowering time.

  13. Accurate protein structure modeling using sparse NMR data and homologous structure information.

    Science.gov (United States)

    Thompson, James M; Sgourakis, Nikolaos G; Liu, Gaohua; Rossi, Paolo; Tang, Yuefeng; Mills, Jeffrey L; Szyperski, Thomas; Montelione, Gaetano T; Baker, David

    2012-06-19

    While information from homologous structures plays a central role in X-ray structure determination by molecular replacement, such information is rarely used in NMR structure determination because it can be incorrect, both locally and globally, when evolutionary relationships are inferred incorrectly or there has been considerable evolutionary structural divergence. Here we describe a method that allows robust modeling of protein structures of up to 225 residues by combining ¹Hᴺ, ¹³C, and ¹⁵N backbone and ¹³Cβ chemical shift data, distance restraints derived from homologous structures, and a physically realistic all-atom energy function. Accurate models are distinguished from inaccurate models generated using incorrect sequence alignments by requiring that (i) the all-atom energies of models generated using the restraints are lower than models generated in unrestrained calculations and (ii) the low-energy structures converge to within 2.0 Å backbone rmsd over 75% of the protein. Benchmark calculations on known structures and blind targets show that the method can accurately model protein structures, even with very remote homology information, to a backbone rmsd of 1.2-1.9 Å relative to the conventionally determined NMR ensembles and of 0.9-1.6 Å relative to X-ray structures for well-defined regions of the protein structures. This approach facilitates the accurate modeling of protein structures using backbone chemical shift data without need for side-chain resonance assignments and extensive analysis of NOESY cross-peak assignments.
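
    The convergence criterion in this abstract is expressed as a backbone rmsd threshold. The sketch below shows how such a value is computed from two coordinate sets. It is a simplified illustration with hypothetical function names: it assumes the structures have already been optimally superimposed and restricted to the same set of backbone atoms, and it does not show the superposition itself or the selection of the converged 75% of residues.

```python
import math


def backbone_rmsd(coords_a, coords_b) -> float:
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumption: both structures are already superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def converged(models, reference, cutoff: float = 2.0) -> bool:
    """True if every model lies within cutoff (in Å) of the reference."""
    return all(backbone_rmsd(m, reference) <= cutoff for m in models)


if __name__ == "__main__":
    ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # every atom moved by 1 Å
    print(backbone_rmsd(ref, shifted))  # 1.0
    print(converged([shifted], ref))    # True
```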

  14. Morphological "primary homology" and expression of AG-subfamily MADS-box genes in pines, podocarps, and yews.

    Science.gov (United States)

    Englund, Marie; Carlsbecker, Annelie; Engström, Peter; Vergara-Silva, Francisco

    2011-01-01

    The morphological variation among reproductive organs of extant gymnosperms is remarkable, especially among conifers. Several hypotheses concerning morphological homology between various conifer reproductive organs have been put forward, in particular in relation to the pine ovuliferous scale. Here, we use the expression patterns of orthologs of the ABC-model MADS-box gene AGAMOUS (AG) for testing morphological homology hypotheses related to organs of the conifer female cone. To this end, we first developed a tailored 3'RACE procedure that allows reliable amplification of partial sequences highly similar to gymnosperm-derived members of the AG-subfamily of MADS-box genes. Expression patterns of two novel conifer AG orthologs cloned with this procedure (namely PodAG and TgAG, obtained from the podocarp Podocarpus reichei and the yew Taxus globosa, respectively) are then further characterized in the morphologically divergent female cones of these species. The expression patterns of PodAG and TgAG are compared with those of DAL2, a previously discovered Picea abies (Pinaceae) AG ortholog. By treating the expression patterns of DAL2, PodAG, and TgAG as character states mapped onto currently accepted cladogram topologies, we suggest that the epimatium (that is, the podocarp female cone organ previously postulated as a "modified" ovuliferous scale) and the canonical Pinaceae ovuliferous scale can be legitimately conceptualized as "primary homologs." Character state mapping for TgAG suggests in turn that the aril of Taxaceae should be considered as a different type of organ. This work demonstrates how the interaction between developmental-genetic data and formal cladistic theory could fruitfully contribute to gymnosperm systematics. © 2011 Wiley Periodicals, Inc.

  15. Unveiling novel RecO distant orthologues involved in homologous recombination.

    Directory of Open Access Journals (Sweden)

    Stéphanie Marsin

    2008-08-01

    Full Text Available The generation of a RecA filament on single-stranded DNA is a critical step in homologous recombination. Two main pathways leading to the formation of the nucleofilament have been identified in bacteria, based on the protein complexes mediating RecA loading: RecBCD (AddAB) and RecFOR. Many bacterial species seem to lack some of the components involved in these complexes. The current annotation of the Helicobacter pylori genome suggests that this highly diverse bacterial pathogen has a reduced set of recombination mediator proteins. While it is now clear that homologous recombination plays a critical role in generating H. pylori diversity by allowing genomic DNA rearrangements and integration through transformation of exogenous DNA into the chromosome, no complete mediator complex is deduced from the sequence of its genome. Here we show by bioinformatics analysis the presence of a RecO remote orthologue that allowed the identification of a new set of RecO proteins present in all bacterial species where a RecR but not RecO was previously identified. HpRecO shares less than 15% identity with previously characterized homologues. Genetic dissection of recombination pathways shows that this novel RecO and the remote RecB homologue present in H. pylori are functional in repair and in RecA-dependent intrachromosomal recombination, defining two initiation pathways with little overlap. We found, however, that neither RecOR nor RecB contributes to transformation, suggesting the presence of a third, specialized, RecA-dependent pathway responsible for the integration of transforming DNA into the chromosome of this naturally competent bacterium. These results provide insight into the mechanisms that this successful pathogen uses to generate genetic diversity and adapt to changing environments and new hosts.

  16. Gene expression of a green fluorescent protein homolog as a host-specific biomarker of heat stress within a reef-building coral.

    Science.gov (United States)

    Smith-Keune, C; Dove, S

    2008-01-01

    Recent incidences of mass coral bleaching indicate that major reef building corals are increasingly suffering thermal stress associated with climate-related temperature increases. The development of pulse amplitude modulated (PAM) fluorometry has enabled rapid detection of the onset of thermal stress within coral algal symbionts, but sensitive biomarkers of thermal stress specific to the host coral have been slower to emerge. Differential display reverse transcription polymerase chain reaction (DDRT-PCR) was used to produce fingerprints of gene expression for the reef-building coral Acropora millepora exposed to 33 degrees C. Changes in the expression of 23 out of 399 putative genes occurred within 144 h. Down-regulation of one host-specific gene (AmA1a) occurred within just 6 h. Full-length sequencing revealed the product of this gene to be an all-protein chromatophore (green fluorescent protein [GFP]-homolog). RT-PCR revealed consistent down-regulation of this GFP-homolog for three replicate colonies within 6 h at both 32 degrees C and 33 degrees C but not at lower temperatures. Down-regulation of this host gene preceded significant decreases in the photosynthetic activity of photosystem II (dark-adapted Fv/Fm) of algal symbionts as measured by PAM fluorometry. Gene expression of host-specific genes such as GFP-homologs may therefore prove to be highly sensitive indicators for the onset of thermal stress within host coral cells.

  17. A local homology theory for linearly compact modules

    International Nuclear Information System (INIS)

    Nguyen Tu Cuong; Tran Tuan Nam

    2004-11-01

    We introduce a local homology theory for linearly compact modules which is in some sense dual to the local cohomology theory of A. Grothendieck. Some basic properties of local homology modules are shown, such as the vanishing and non-vanishing and the noetherianness of local homology modules. By using duality, we extend some well-known results in the theory of local cohomology of A. Grothendieck. (author)

  18. A homology sound-based algorithm for speech signal interference

    Science.gov (United States)

    Jiang, Yi-jiao; Chen, Hou-jin; Li, Ju-peng; Zhang, Zhan-song

    2015-12-01

    Aiming at secure analog speech communication, a homology sound-based algorithm for speech signal interference is proposed in this paper. We first split the speech signal into phonetic fragments by a short-term energy method and establish an interference noise cache library with the phonetic fragments. Then we implement the homology sound interference by mixing the randomly selected interferential fragments and the original speech in real time. The computer simulation results indicated that the interference produced by this algorithm has the advantages of real-time operation, randomness, and high correlation with the original signal, compared with traditional noise interference methods such as white noise interference. After further studies, the proposed algorithm may be readily used in secure speech communication.

  19. A priori Considerations When Conducting High-Throughput Amplicon-Based Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Aditi Sengupta

    2016-03-01

    Full Text Available Amplicon-based sequencing strategies that include 16S rRNA and functional genes, alongside “meta-omics” analyses of communities of microorganisms, have allowed researchers to pose questions and find answers to “who” is present in the environment and “what” they are doing. Next-generation sequencing approaches that aid microbial ecology studies of agricultural systems are fast gaining popularity among agronomy, crop, soil, and environmental science researchers. Given the rapid development of these high-throughput sequencing techniques, researchers with no prior experience will desire information about the best practices that can be used before actually starting high-throughput amplicon-based sequence analyses. We have outlined items that need to be carefully considered in experimental design, sampling, basic bioinformatics, sequencing of mock communities and negative controls, acquisition of metadata, and in standardization of reaction conditions as per experimental requirements. Not all considerations mentioned here may pertain to a particular study. The overall goal is to inform researchers about considerations that must be taken into account when conducting high-throughput microbial DNA sequencing and sequence analysis.

  20. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies.

    Directory of Open Access Journals (Sweden)

    Holly J Atkinson

    Full Text Available The dramatic increase in heterogeneous types of biological data--in particular, the abundance of new protein sequences--requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity--GPCRs and kinases from humans, and the crotonase superfamily of enzymes--we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.

  1. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies.

    Science.gov (United States)

    Atkinson, Holly J; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C

    2009-01-01

    The dramatic increase in heterogeneous types of biological data--in particular, the abundance of new protein sequences--requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity--GPCRs and kinases from humans, and the crotonase superfamily of enzymes--we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.
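    The network construction described in this abstract can be illustrated with a minimal sketch. It assumes that pairwise similarity scores have already been computed by some alignment tool; the sequence names, the scores, and the function names `build_network` and `clusters` are hypothetical and do not come from the paper. An edge is kept when the score meets a threshold, and connected components are read off as candidate groups.

```python
def build_network(scores, threshold):
    """Return an adjacency map keeping only pairs whose score meets the threshold."""
    graph = {}
    for (a, b), score in scores.items():
        # Register both nodes even when no edge survives the cutoff.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clusters(graph):
    """Return connected components as sorted lists, found by depth-first search."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical pairwise scores (higher means more similar).
toy_scores = {
    ("kinaseA", "kinaseB"): 80,
    ("kinaseB", "kinaseC"): 45,
    ("gpcrX", "gpcrY"): 70,
    ("kinaseA", "gpcrX"): 5,
}

if __name__ == "__main__":
    for cutoff in (40, 60):
        print(cutoff, clusters(build_network(toy_scores, cutoff)))
```

    Raising the cutoff splits loosely connected members away from their group, which is one reason the choice of threshold matters when reading such networks.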

  2. Cloning and sequence of cDNA encoding 1-aminocyclopropane-1-carboxylate oxidase in Vanda flowers

    Directory of Open Access Journals (Sweden)

    Pattana Srifah Huehne

    2013-08-01

    Full Text Available The 1-aminocyclopropane-1-carboxylate oxidase (ACO) gene, which acts in the final step of ethylene biosynthesis, was isolated from ethylene-sensitive Vanda Miss Joaquim flowers. It consists of 1,242 base pairs (bp) encoding 326 amino acid residues. To investigate the specific divergence in orchid ACO sequences, the deduced Vanda ACO was aligned with five other orchid ACOs. The results reveal that the ACO sequences within Doritaenopsis, Phalaenopsis and Vanda are highly conserved, with almost 95% identity, while the ACOs isolated from Cymbidium, Dendrobium and Cattleya are 87-88% identical to Vanda ACO. In addition, the 2-oxoglutarate-Fe(II)_oxygenase (Oxy) domain of orchid ACOs shows a higher degree of amino acid conservation than the non-haem dioxygenase (DIOX_N) domain. The overall homology regions of Vanda ACO are commonly folded into 12 α-helices and 12 β-sheets, similar to the three-dimensional template structure of Petunia ACO. This cloned Vanda ACO gene is highly expressed in flower tissue compared with root and leaf tissues. In particular, ACO transcripts accumulate most abundantly in the column, followed by the lip and the perianth, of Vanda Miss Joaquim flowers at the fully-open stage.

  3. Purification and amino acid sequence of a bacteriocins produced by Lactobacillus salivarius K7 isolated from chicken intestine

    Directory of Open Access Journals (Sweden)

    Kenji Sonomoto

    2006-03-01

    Full Text Available A bacteriocin-producing strain, Lactobacillus K7, was isolated from a chicken intestine. The inhibitory activity was determined by the spot-on-lawn technique. Identification of the strain was performed on a morphological, biochemical (API 50 CH kit) and molecular genetic (16S rDNA) basis. Bacteriocin purification was carried out by amberlite adsorption, cation exchange and reverse-phase high performance liquid chromatography. N-terminal amino acid sequencing was performed by Edman degradation. Molecular mass was determined by electrospray-ionization (ESI) mass spectrometry (MS). Lactobacillus K7 showed inhibitory activity against Lactobacillus sakei subsp. sakei JCM 1157T, Leuconostoc mesenteroides subsp. mesenteroides JCM 6124T and Bacillus coagulans JCM 2257T. This strain was identified as Lb. salivarius. The antimicrobial substance was destroyed by proteolytic enzymes, indicating a proteinaceous structure, so it was designated a bacteriocin. The purification of the bacteriocin by amberlite adsorption, cation exchange, and reverse-phase chromatography resulted in a single active peak, which was designated FK22. The molecular weight of this fraction was 4331.70 Da. By amino acid sequence, this peptide was homologous to Abp 118 beta produced by Lb. salivarius UCC118. In addition, Lb. salivarius UCC118 produced a 2-peptide bacteriocin consisting of Abp 118 alpha and beta. Based on the partial amino acid sequences of Abp 118 beta, specific primers were designed from nucleotide sequences according to data from GenBank. The result showed that the deduced peptide was highly homologous to the 2-peptide bacteriocin, Abp 118 alpha and beta.

  4. The concept of homology as a basis for evaluating developmental mechanisms: exploring selective attention across the life-span.

    Science.gov (United States)

    Lickliter, Robert; Bahrick, Lorraine E

    2013-01-01

    Research with human infants as well as non-human animal embryos and infants has consistently demonstrated the benefits of intersensory redundancy for perceptual learning and memory for redundantly specified information during early development. Studies of infant affect discrimination, face discrimination, numerical discrimination, sequence detection, abstract rule learning, and word comprehension and segmentation have all shown that intersensory redundancy promotes earlier detection of these properties when compared to unimodal exposure to the same properties. Here we explore the idea that such intersensory facilitation is evident across the life-span and that this continuity is an example of a developmental behavioral homology. We present evidence that intersensory facilitation is most apparent during early phases of learning for a variety of tasks, regardless of developmental level, including domains that are novel or tasks that require discrimination of fine detail or speeded responses. Under these conditions, infants, children, and adults all show intersensory facilitation, suggesting a developmental homology. We discuss the challenge and propose strategies for establishing appropriate guidelines for identifying developmental behavioral homologies. We conclude that evaluating the extent to which continuities observed across development are homologous can contribute to a better understanding of the processes of development. Copyright © 2012 Wiley Periodicals, Inc.

  5. Detection of a Usp-like gene in Calotropis procera plant from the de novo assembled genome contigs of the high-throughput sequencing dataset

    KAUST Repository

    Shokry, Ahmed M.

    2014-02-01

    The wild plant species Calotropis procera (C. procera) has many potential applications and beneficial uses in medicine, industry and ornamental field. It also represents an excellent source of genes for drought and salt tolerance. Genes encoding proteins that contain the conserved universal stress protein (USP) domain are known to provide organisms like bacteria, archaea, fungi, protozoa and plants with the ability to respond to a plethora of environmental stresses. However, information on the possible occurrence of Usp in C. procera is not available. In this study, we uncovered and characterized a one-class A Usp-like (UspA-like, NCBI accession No. KC954274) gene in this medicinal plant from the de novo assembled genome contigs of the high-throughput sequencing dataset. A number of GenBank accessions for Usp sequences were blasted with the recovered de novo assembled contigs. Homology modelling of the deduced amino acids (NCBI accession No. AGT02387) was further carried out using Swiss-Model, accessible via the EXPASY. Superimposition of C. procera USPA-like full sequence model on Thermus thermophilus USP UniProt protein (PDB accession No. Q5SJV7) was constructed using RasMol and Deep-View programs. The functional domains of the novel USPA-like amino acids sequence were identified from the NCBI conserved domain database (CDD) that provide insights into sequence structure/function relationships, as well as domain models imported from a number of external source databases (Pfam, SMART, COG, PRK, TIGRFAM). © 2014 Académie des sciences.

  6. Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models.

    Directory of Open Access Journals (Sweden)

    2005-08-01

    Full Text Available The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB), it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

  7. Protein sequence comparison and protein evolution

    Energy Technology Data Exchange (ETDEWEB)

    Pearson, W.R. [Univ. of Virginia, Charlottesville, VA (United States). Dept. of Biochemistry

    1995-12-31

    This tutorial was one of eight tutorials selected to be presented at the Third International Conference on Intelligent Systems for Molecular Biology which was held in the United Kingdom from July 16 to 19, 1995. This tutorial examines how the information conserved during the evolution of a protein molecule can be used to reliably infer homology, and thus a shared protein fold and possibly a shared active site or function. The authors start by reviewing a geological/evolutionary time scale. Next they look at the evolution of several protein families. During the tutorial, these families will be used to demonstrate that homologous protein ancestry can be inferred with confidence. They also examine different modes of protein evolution and consider some hypotheses that have been presented to explain the very earliest events in protein evolution. The next part of the tutorial will examine the technical aspects of protein sequence comparison. Both optimal and heuristic algorithms and their associated parameters that are used to characterize protein sequence similarities are discussed. Perhaps more importantly, they survey the statistics of local similarity scores, and how these statistics can be used both to improve the selectivity of a search and to evaluate the significance of a match. They then examine distantly related members of three protein families, the serine proteases, the glutathione transferases, and the G-protein-coupled receptors (GCRs). Finally, they discuss how sequence similarity can be used to examine internal repeated or mosaic structures in proteins.
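    As an illustration of what an "optimal" local similarity score is, the sketch below computes a Smith-Waterman score by dynamic programming. The abstract does not name the algorithms it covers, so choosing Smith-Waterman is an assumption, and the match, mismatch and gap values are illustrative placeholders rather than a real substitution matrix or the tutorial's parameters.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                           # start a new local alignment
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # The shared core "ACGT" scores 4 matches x 2 = 8 despite different flanks.
    print(smith_waterman_score("XXACGTXX", "YYACGTYY"))
```

    Because the recurrence floors every cell at zero, unrelated flanking regions do not drag the score down. That is why local scores suit the detection of a conserved domain inside otherwise divergent proteins.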

  8. Immunoglobulin variable region sequences of two human monoclonal antibodies directed to an onco-developmental carbohydrate antigen, lactotetraosylceramide (LcOse4Cer).

    Science.gov (United States)

    Yago, K; Zenita, K; Ohwaki, I; Harada, R; Nozawa, S; Tsukazaki, K; Iwamori, M; Endo, N; Yasuda, N; Okuma, M

    1993-11-01

    A human monoclonal antibody, 11-50, was generated and was shown to recognize an onco-developmental carbohydrate antigen, LcOse4Cer. The isotype of this antibody was IgM, lambda, similar to the previously known human anti-LcOse4 antibodies, such as IgMWOO and HMST-1. We raised a murine anti-idiotypic antibody G3 (IgG1, kappa) against 11-50, and tested its reactivity towards the affinity purified human polyclonal anti-LcOse4 antibodies prepared from pooled human sera using a Gal beta 1-->3GlcNAc beta-immobilized column. The results indicated that at least a part of the human polyclonal anti-LcOse4 antibodies shared the G3 idiotype with 11-50. We further analyzed the sequence of variable regions of the two anti-LcOse4 antibodies, 11-50 and HMST-1. Sequence analysis of the heavy chain variable regions indicated that the VH regions of these two antibodies were highly homologous to each other (93.5% at the nucleic acid level), and these antibodies utilized the germline genes VH1.9III and hv3005f3 as the VH segments, which are closely related germline genes of the VHIII family. It was noted that these germline VH genes are frequently utilized in fetal B cells. The JH region of both antibodies was encoded by the JH4 gene. For the light chain, the V lambda segments of the two antibodies were 96.3% homologous to each other at the nucleic acid level. The V lambda segments of both antibodies showed the highest homology to the rearranged V lambda gene called V lambda II.DS among reported V lambda genes, while the exact germline V lambda genes encoding the two antibodies were not yet registered in available sequence databanks. The amino acid sequences of the J lambda segments of both antibodies were identical. These results indicate that the two human antibodies recognizing the onco-developmental carbohydrate antigen Lc4 are encoded by the same or very homologous germline genes.

  9. Structural protein descriptors in 1-dimension and their sequence-based predictions.

    Science.gov (United States)

    Kurgan, Lukasz; Disfani, Fatemeh Miri

    2011-09-01

    The last few decades observed an increasing interest in development and application of 1-dimensional (1D) descriptors of protein structure. These descriptors project 3D structural features onto 1D strings of residue-wise structural assignments. They cover a wide-range of structural aspects including conformation of the backbone, burying depth/solvent exposure and flexibility of residues, and inter-chain residue-residue contacts. We perform first-of-its-kind comprehensive comparative review of the existing 1D structural descriptors. We define, review and categorize ten structural descriptors and we also describe, summarize and contrast over eighty computational models that are used to predict these descriptors from the protein sequences. We show that the majority of the recent sequence-based predictors utilize machine learning models, with the most popular being neural networks, support vector machines, hidden Markov models, and support vector and linear regressions. These methods provide high-throughput predictions and most of them are accessible to a non-expert user via web servers and/or stand-alone software packages. We empirically evaluate several recent sequence-based predictors of secondary structure, disorder, and solvent accessibility descriptors using a benchmark set based on CASP8 targets. Our analysis shows that the secondary structure can be predicted with over 80% accuracy and segment overlap (SOV), disorder with over 0.9 AUC, 0.6 Matthews Correlation Coefficient (MCC), and 75% SOV, and relative solvent accessibility with PCC of 0.7 and MCC of 0.6 (0.86 when homology is used). We demonstrate that the secondary structure predicted from sequence without the use of homology modeling is as good as the structure extracted from the 3D folds predicted by top-performing template-based methods.
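    The Matthews Correlation Coefficient quoted above can be computed from a 2x2 confusion table. The sketch below uses the standard MCC formula; the per-residue label strings and the helper names `confusion` and `mcc` are hypothetical and are not taken from the review's benchmark.

```python
import math


def confusion(true_labels, predicted_labels, positive):
    """Count TP, TN, FP, FN for one class over two equal-length label strings."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label strings must have equal length")
    tp = tn = fp = fn = 0
    for t, p in zip(true_labels, predicted_labels):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p != positive:
            tn += 1
        elif p == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; defined as 0.0 when a marginal is empty."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical per-residue states: H = helix, C = coil.
    counts = confusion("HHHHCCCC", "HHHCCCCH", positive="H")
    print(counts, mcc(*counts))
```

    Unlike plain accuracy, MCC stays near zero for a predictor that always outputs the majority class. This makes it a common choice for imbalanced descriptors such as disorder.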

  10. Production of R,R-2,3-butanediol of ultra-high optical purity from Paenibacillus polymyxa ZJ-9 using homologous recombination.

    Science.gov (United States)

    Zhang, Li; Cao, Can; Jiang, Ruifan; Xu, Hong; Xue, Feng; Huang, Weiwei; Ni, Hao; Gao, Jian

    2018-08-01

    The present study describes the use of metabolic engineering to achieve the production of R,R-2,3-butanediol (R,R-2,3-BD) of ultra-high optical purity (>99.99%). To this end, the diacetyl reductase (DAR) gene (dud A) of Paenibacillus polymyxa ZJ-9 was knocked out via homologous recombination between the genome and the previously constructed targeting vector pRN5101-L'C in a process based on homologous single-crossover. PCR verification confirmed the successful isolation of the dud A gene disruption mutant P. polymyxa ZJ-9-△dud A. Moreover, fermentation results indicated that the optical purity of R,R-2,3-BD increased from about 98% to over 99.99%, with a titer of 21.62 g/L in Erlenmeyer flasks. The latter was further increased to 25.88 g/L by fed-batch fermentation in a 5-L bioreactor. Copyright © 2018 Elsevier Ltd. All rights reserved.

  11. Sequence analysis and molecular characterization of genes required for the biosynthesis of type 1 capsular polysaccharide in Staphylococcus aureus.

    Science.gov (United States)

    Lin, W S; Cunneen, T; Lee, C Y

    1994-11-01

    We previously cloned a 19.4-kb DNA region containing a cluster of genes affecting type 1 capsule production from Staphylococcus aureus M. Subcloning experiments showed that these capsule (cap) genes are localized in a 14.6-kb region. Sequencing analysis of the 14.6-kb fragment revealed 13 open reading frames (ORFs). Using complementation tests, we have mapped a collection of Cap- mutations in 10 of the 13 ORFs, indicating that these 10 genes are involved in capsule biosynthesis. The requirement for the remaining three ORFs in the synthesis of the capsule was demonstrated by constructing site-specific mutations corresponding to each of the three ORFs. Using an Escherichia coli S30 in vitro transcription-translation system, we clearly identified 7 of the 13 proteins predicted from the ORFs. Homology search between the predicted proteins and those in the data bank showed very high homology (52.3% identity) between capL and vipA, moderate homology (29% identity) between capI and vipB, and limited homology (21.8% identity) between capM and vipC. The vipA, vipB, and vipC genes have been shown to be involved in the biosynthesis of Salmonella typhi Vi antigen, a homopolymer polysaccharide consisting of N-acetylgalactosamino uronic acid, which is also one of the components of the staphylococcal type 1 capsule. The homology between these sets of genes therefore suggests that capL, capI, and capM may be involved in the biosynthesis of amino sugar, N-acetylgalactosamino uronic acid. In addition, the search showed that CapG aligned well with the consensus sequence of a family of acetyltransferases from various prokaryotic organisms, suggesting that CapG may be an acetyltransferase. Using the isogenic Cap- and Cap+ strains constructed in this study, we have confirmed that type 1 capsule is an important virulence factor in a mouse lethality test.
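    Percent identity figures such as those quoted for capL/vipA depend on how the alignment columns are counted. The sketch below shows one common convention, assumed here for illustration and not necessarily the one used in the paper: identical residues divided by the number of columns in which neither sequence has a gap. The aligned strings are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Invented pairwise alignment: 5 gap-free columns, 4 identical.
    print(percent_identity("MKT-LV", "MKSALV"))
```

    Dividing by the full alignment length or by the shorter sequence instead would give different percentages for the same alignment, so identity values from different studies are only comparable when the convention is known.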

  12. Rotation in moderate-mass pre-main-sequence radiative track G stars

    International Nuclear Information System (INIS)

    Mcnamara, B.

    1990-01-01

    Recent studies suggest that the observed high-mass radiative track velocity histograms for pre-main-sequence stars differ significantly. In the Vogel and Kuhi (1981) study, these stars were found to possess a rather broad distribution of rotational velocities with a moderate peak at low velocities. In contrast, Smith et al. (1983), found a very sharply peaked distribution located at low values of v sin i. The difference in these velocity distributions is shown to be due to inadequate allowance for field stars in the Smith, et al., work. Once these stars are removed, the high-mass velocity distributions of the two regions are remarkably similar. This result suggests that a unique velocity distribution might be used in modeling very young stars. Assuming that the Orion Ic proto-F stars continue to contract in a homologous fashion, their average current rotational velocity is in agreement with that expected for zero-age main sequence F stars. 27 refs

  13. Statistical Inference for Porous Materials using Persistent Homology.

    Energy Technology Data Exchange (ETDEWEB)

    Moon, Chul [Univ. of Georgia, Athens, GA (United States); Heath, Jason E. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Mitchell, Scott A. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-12-01

    We propose a porous materials analysis pipeline using persistent homology. We first compute persistent homology of binarized 3D images of sampled material subvolumes. For each image we compute sets of homology intervals, which are represented as summary graphics called persistence diagrams. We convert persistence diagrams into image vectors in order to analyze the similarity of the homology of the material images using the mature tools for image analysis. Each image is treated as a vector and we compute its principal components to extract features. We fit a statistical model using the loadings of principal components to estimate material porosity, permeability, anisotropy, and tortuosity. We also propose an adaptive version of the structural similarity index (SSIM), a similarity metric for images, as a measure to determine the statistical representative elementary volumes (sREV) for persistent homology. Thus we provide a capability for making a statistical inference of the fluid flow and transport properties of porous materials based on their geometry and connectivity.
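    The notion of homology intervals can be illustrated on a much smaller problem than the 3D images used in the report. The sketch below is a toy and not the authors' pipeline: it computes dimension-0 persistence intervals for the sublevel sets of a 1D sequence with a union-find structure, and the function name `persistence_0d` is hypothetical. Each local minimum gives birth to a component, and when two components merge the younger one dies (the elder rule).

```python
def persistence_0d(values):
    """Dimension-0 (birth, death) intervals for sublevel sets of a 1D sequence."""
    if not values:
        return []
    parent, birth, intervals = {}, {}, []

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Add positions in order of increasing value (the filtration).
    for i in sorted(range(len(values)), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if j not in parent:
                continue
            root_i, root_j = find(i), find(j)
            if root_i == root_j:
                continue
            # Elder rule: the component born later dies at the current value.
            if birth[root_i] <= birth[root_j]:
                old, young = root_i, root_j
            else:
                old, young = root_j, root_i
            if values[i] > birth[young]:  # skip zero-length intervals
                intervals.append((birth[young], values[i]))
            parent[young] = old
    # The component of the global minimum never dies.
    intervals.append((min(values), float("inf")))
    return sorted(intervals)


if __name__ == "__main__":
    print(persistence_0d([0, 3, 1, 4, 2]))
```

    Plotting each interval as a point (birth, death) gives the persistence diagram mentioned in the abstract. Real 3D image data would normally be handled by a dedicated persistent-homology library rather than code like this.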

  14. K-homology and K-cohomology constructions of relations

    International Nuclear Information System (INIS)

    Abd El-Sattar, A. Dabbour; Bayoumy, F.M.

    1990-08-01

    One of the important homology (cohomology) theories, based on systems of covering of the space, is the homology (cohomology) theory of relations. In the present work, by using the idea of K-homology and K-cohomology groups different varieties of the Dowker's theory are introduced and studied. These constructions are defined on the category of pairs of topological spaces and over a pair of coefficient groups. (author). 14 refs

  15. The sequence of the Helicoverpa armigera single nucleocapsid nucleopolyhedrovirus genome

    NARCIS (Netherlands)

    Chen, X.; IJkel, W.F.J.; Tarchini, R.; Sun, X.; Sandbrink, H.; Wang, H.; Peters, S.; Zuidema, D.; Klein Lankhorst, R.; Vlak, J.M.; Hu, Z.

    2001-01-01

    The nucleotide sequence of the Helicoverpa armigera single-nucleocapsid nucleopolyhedrovirus (HaSNPV) DNA genome was determined and analysed. The circular genome encompasses 131 403 bp, has a G+C content of 39.1 mol% and contains five homologous regions with a unique pattern of repeats.

  16. Isolation and characterization of rhamnose-binding lectins from eggs of steelhead trout (Oncorhynchus mykiss) homologous to low density lipoprotein receptor superfamily.

    Science.gov (United States)

    Tateno, H; Saneyoshi, A; Ogawa, T; Muramoto, K; Kamiya, H; Saneyoshi, M

    1998-07-24

    Two L-rhamnose-binding lectins named STL1 and STL2 were isolated from eggs of steelhead trout (Oncorhynchus mykiss) by affinity chromatography and ion exchange chromatography. The apparent molecular masses of purified STL1 and STL2 were estimated to be 84 and 68 kDa, respectively, by gel filtration chromatography. Sodium dodecyl sulfate polyacrylamide gel electrophoresis and matrix-assisted laser desorption ionization time of flight mass spectrometry of these lectins revealed that STL1 was composed of noncovalently linked trimer of 31.4-kDa subunits, and STL2 was noncovalently linked trimer of 21.5-kDa subunits. The minimum concentrations of STL1, a major component, and STL2, a minor component, needed to agglutinate rabbit erythrocytes were 9 and 0.2 microg/ml, respectively. The most effective saccharide in the hemagglutination inhibition assay for both STL1 and STL2 was L-rhamnose. Saccharides possessing the same configuration of hydroxyl groups at C2 and C4 as that in L-rhamnose, such as L-arabinose and D-galactose, also inhibited. The amino acid sequence of STL2 was determined by analysis of peptides generated by digestion of the S-carboxamidomethylated protein with Achromobacter protease I or Staphylococcus aureus V8 protease. The STL2 subunit of 195 amino acid residues proved to have a unique polypeptide architecture; that is, it was composed of two tandemly repeated homologous domains (STL2-N and STL2-C) with 52% internal homology. These two domains showed a sequence homology to the subunit (105 amino acid residues) of D-galactoside-specific sea urchin (Anthocidaris crassispina) egg lectin (37% for STL2-N and 46% for STL2-C, respectively). The N terminus of the STL1 subunit was blocked with an acetyl group. However, a partial amino acid sequence of the subunit showed a sequence similarity to STL2. Moreover, STL2 also showed a sequence homology to the ligand binding domain of the vitellogenin receptor. 
We have also employed surface plasmon resonance biosensor

  17. Complete Genome Sequences of Porcine Epidemic Diarrhea Virus Strains JSLS-1/2015 and JS-2/2015 Isolated from China.

    Science.gov (United States)

    Tao, Jie; Li, Benqiang; Zhang, Chunling; Liu, Huili

    2016-11-10

    Two porcine epidemic diarrhea virus (PEDV) strains, JSLS-1/2015 and JS-2/2015, were isolated from piglets with watery diarrhea in South China. Two genomic sequences were highly homologous to the attenuated DR13 strain. Furthermore, JSLS-1/2015 contains a 24-amino-acid deletion in open reading frame 1b, which was first reported in PEDV isolates. Copyright © 2016 Tao et al.

  18. Nucleotide sequence and genetic organization of barley stripe mosaic virus RNA gamma.

    Science.gov (United States)

    Gustafson, G; Hunter, B; Hanau, R; Armour, S L; Jackson, A O

    1987-06-01

    The complete nucleotide sequences of RNA gamma from the Type and ND18 strains of barley stripe mosaic virus (BSMV) have been determined. The sequences are 3164 (Type) and 2791 (ND18) nucleotides in length. Both sequences contain a 5'-noncoding region (87 or 88 nucleotides) which is followed by a long open reading frame (ORF1). A 42-nucleotide intercistronic region separates ORF1 from a second, shorter open reading frame (ORF2) located near the 3'-end of the RNA. There is a high degree of homology between the Type and ND18 strains in the nucleotide sequence of ORF1. However, the Type strain contains a 366 nucleotide direct tandem repeat within ORF1 which is absent in the ND18 strain. Consequently, the predicted translation product of Type RNA gamma ORF1 (mol wt 87,312) is significantly larger than that of ND18 RNA gamma ORF1 (mol wt 74,011). The amino acid sequence of the ORF1 polypeptide contains homologies with putative RNA polymerases from other RNA viruses, suggesting that this protein may function in replication of the BSMV genome. The nucleotide sequence of RNA gamma ORF2 is nearly identical in the Type and ND18 strains. ORF2 codes for a polypeptide with a predicted molecular weight of 17,209 (Type) or 17,074 (ND18) which is known to be translated from a subgenomic (sg) RNA. The initiation point of this sgRNA has been mapped to a location 27 nucleotides upstream of the ORF2 initiation codon in the intercistronic region between ORF1 and ORF2. The sgRNA is not coterminal with the 3'-end of the genomic RNA, but instead contains heterogeneous poly(A) termini up to 150 nucleotides long (J. Stanley, R. Hanau, and A. O. Jackson, 1984, Virology 139, 375-383). In the genomic RNA gamma, ORF2 is followed by a short poly(A) tract and a 238-nucleotide tRNA-like structure.

  19. High throughput 16S rRNA gene amplicon sequencing

    DEFF Research Database (Denmark)

    Nierychlo, Marta; Larsen, Poul; Jørgensen, Mads Koustrup

    16S rRNA gene amplicon sequencing has been developed over the past few years and is now ready to use for more comprehensive studies related to plant operation and optimization thanks to short analysis time, low cost, high throughput, and high taxonomic resolution. In this study we show how 16S rRNA gene amplicon sequencing can be used to reveal factors of importance for the operation of full-scale nutrient removal plants related to settling problems and floc properties. Using optimized DNA extraction protocols, indexed primers and our in-house Illumina platform, we prepared multiple samples...... be correlated to the presence of the species that are regarded as “strong” and “weak” floc formers. In conclusion, 16S rRNA gene amplicon sequencing provides a high throughput approach for a rapid and cheap community profiling of activated sludge that in combination with multivariate statistics can be used...

  20. Mutational analysis of the high-affinity zinc binding site validates a refined human dopamine transporter homology model.

    Directory of Open Access Journals (Sweden)

    Thomas Stockner

    Full Text Available The high-resolution crystal structure of the leucine transporter (LeuT) is frequently used as a template for homology models of the dopamine transporter (DAT). Although similar in structure, DAT differs considerably from LeuT in a number of ways: (i) when compared to LeuT, DAT has very long intracellular amino and carboxyl termini; (ii) LeuT and DAT share a rather low overall sequence identity (22%) and (iii) the extracellular loop 2 (EL2) of DAT is substantially longer than that of LeuT. Extracellular zinc binds to DAT and restricts the transporter's movement through the conformational cycle, thereby resulting in a decrease in substrate uptake. Residue H293 in EL2 participates in zinc binding and must be modelled correctly to allow for a full understanding of its effects. We exploited the high-affinity zinc binding site endogenously present in DAT to create a model of the complete transmembrane domain of DAT. The zinc binding site provided a DAT-specific molecular ruler for calibration of the model. Our DAT model places EL2 at the transporter lipid interface in the vicinity of the zinc binding site. Based on the model, D206 was predicted to represent a fourth co-ordinating residue, in addition to the three previously described zinc binding residues H193, H375 and E396. This prediction was confirmed by mutagenesis: substitution of D206 by lysine and cysteine affected the inhibitory potency of zinc and the maximum inhibition exerted by zinc, respectively. Conversely, the structural changes observed in the model allowed for rationalizing the zinc-dependent regulation of DAT: upon binding, zinc stabilizes the outward-facing state, because its first coordination shell can only be completed in this conformation. Thus, the model provides a validated solution to the long extracellular loop and may be useful to address other aspects of the transport cycle.

  1. Persistent homology of complex networks

    International Nuclear Information System (INIS)

    Horak, Danijela; Maletić, Slobodan; Rajković, Milan

    2009-01-01

    Long-lived topological features are distinguished from short-lived ones (considered as topological noise) in simplicial complexes constructed from complex networks. A new topological invariant, persistent homology, is determined and presented as a parameterized version of a Betti number. Complex networks with distinct degree distributions exhibit distinct persistent topological features. Persistent topological attributes, shown to be related to the robust quality of networks, also reflect the deficiency in certain connectivity properties of networks. Random networks, networks with exponential connectivity distribution and scale-free networks were considered for homological persistency analysis
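
    The distinction between long-lived and short-lived features can be illustrated in dimension 0 only, where a feature is a connected component. The sketch below is not the authors' construction (the abstract does not say how the simplicial complexes are built); it assumes a weighted graph whose edges enter a filtration in order of increasing weight, and the function name and the toy edge list are hypothetical.

```python
def zero_dim_persistence(num_vertices, weighted_edges):
    """Track connected components (H0) as edges are added by increasing weight.

    Every vertex is born at filtration value 0. A component "dies" at the
    weight of the edge that merges it into another component.
    Returns (death_times, number_of_components_that_never_die).
    """
    parent = list(range(num_vertices))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    deaths = []
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v   # two components merge: one of them dies
            deaths.append(weight)
    survivors = num_vertices - len(deaths)  # value of Betti-0 at the end
    return deaths, survivors


# Toy network (assumed data): (weight, vertex, vertex)
edges = [(0.2, 0, 1), (0.5, 1, 2), (0.9, 0, 2), (0.7, 3, 4)]
deaths, survivors = zero_dim_persistence(5, edges)
print(deaths, survivors)
```

    Components with late death times are the persistent ones; those that die early would be treated as topological noise. Higher-dimensional features (loops, voids) need a full persistent homology computation that this sketch does not attempt.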

  2. Application of high-throughput DNA sequencing in phytopathology.

    Science.gov (United States)

    Studholme, David J; Glover, Rachel H; Boonham, Neil

    2011-01-01

    The new sequencing technologies are already making a big impact in academic research on medically important microbes and may soon revolutionize diagnostics, epidemiology, and infection control. Plant pathology also stands to gain from exploiting these opportunities. This manuscript reviews some applications of these high-throughput sequencing methods that are relevant to phytopathology, with emphasis on the associated computational and bioinformatics challenges and their solutions. Second-generation sequencing technologies have recently been exploited in genomics of both prokaryotic and eukaryotic plant pathogens. They are also proving to be useful in diagnostics, especially with respect to viruses. Copyright © 2011 by Annual Reviews. All rights reserved.

  3. Survey of transposable elements in sugarcane expressed sequence tags (ESTs)

    Directory of Open Access Journals (Sweden)

    Rossi Magdalena

    2001-01-01

    Full Text Available The sugarcane expressed sequence tag (SUCEST) project has produced a large number of cDNA sequences from several plant tissues, some of which were subjected to different stress conditions. In this paper we report the result of a search for transposable elements (TEs) that revealed a surprising number of expressed TE homologues. Of the 260,781 sequences grouped in 81,223 fragment assembly program (Phrap) clusters, a total of 276 clones showed homology to previously reported TEs using a stringent cut-off value of e-50 or better. Clones homologous to the Copia/Ty1 and Gypsy/Ty3 groups of long terminal repeat (LTR) retrotransposons were found, but no non-LTR retroelements were identified. All major transposon families were represented in sugarcane, including Activator (Ac), Mutator (MuDR), Suppressor-mutator (En/Spm) and Mariner. In order to compare TE diversity in grass genomes, we carried out a search for TEs described in the sugarcane-related species O. sativa, Z. mays and S. bicolor. We also present preliminary results showing the potential use of TE insertion pattern polymorphism as molecular markers for cultivar identification.
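
    The stringent cut-off described above amounts to keeping only hits whose E-value is at most 1e-50. A minimal sketch follows, assuming the hits are available as tab-separated lines in the standard 12-column BLAST tabular layout (E-value in the 11th column); the abstract does not state which search tool or output format was used, and the clone and subject names are invented for illustration.

```python
def filter_hits(lines, max_evalue=1e-50):
    """Keep (query, subject, evalue) for hits at or below the E-value cut-off.

    Assumes 12-column tab-separated lines:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= max_evalue:
            kept.append((fields[0], fields[1], evalue))
    return kept


# Hypothetical example hits (not data from the study)
example = [
    "clone1\tCopia_Ty1\t85.0\t400\t60\t0\t1\t400\t1\t400\t1e-120\t430",
    "clone2\tMariner\t70.0\t150\t45\t0\t1\t150\t1\t150\t3e-12\t70",
    "clone3\tGypsy_Ty3\t90.0\t300\t30\t0\t1\t300\t1\t300\t1e-50\t200",
]
for query, subject, evalue in filter_hits(example):
    print(query, subject, evalue)
```

    A lower E-value means a less likely chance match, so "e-50 or better" is read here as "less than or equal to 1e-50".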

  4. Identification and Characterization of Wilt and Salt Stress-Responsive MicroRNAs in Chickpea through High-Throughput Sequencing

    Science.gov (United States)

    Deokar, Amit Atmaram; Bhardwaj, Ankur R.; Agarwal, Manu; Katiyar-Agarwal, Surekha; Srinivasan, Ramamurthy; Jain, Pradeep Kumar

    2014-01-01

    Chickpea (Cicer arietinum) is the second most widely grown legume worldwide and is the most important pulse crop in the Indian subcontinent. Chickpea productivity is adversely affected by a large number of biotic and abiotic stresses. MicroRNAs (miRNAs) have been implicated in the regulation of plant responses to several biotic and abiotic stresses. This study is the first attempt to identify chickpea miRNAs that are associated with biotic and abiotic stresses. The wilt infection that is caused by the fungus Fusarium oxysporum f.sp. ciceris is one of the major diseases severely affecting chickpea yields. Of late, increasing soil salinization has become a major problem in realizing these potential yields. Three chickpea libraries using fungal-infected, salt-treated and untreated seedlings were constructed and sequenced using next-generation sequencing technology. A total of 12,135,571 unique reads were obtained. In addition to 122 conserved miRNAs belonging to 25 different families, 59 novel miRNAs along with their star sequences were identified. Four legume-specific miRNAs, including miR5213, miR5232, miR2111 and miR2118, were found in all of the libraries. Poly(A)-based qRT-PCR (Quantitative real-time PCR) was used to validate eleven conserved and five novel miRNAs. miR530 was highly up regulated in response to fungal infection, which targets genes encoding zinc knuckle- and microtubule-associated proteins. Many miRNAs responded in a similar fashion under both biotic and abiotic stresses, indicating the existence of cross talk between the pathways that are involved in regulating these stresses. The potential target genes for the conserved and novel miRNAs were predicted based on sequence homologies. miR166 targets a HD-ZIPIII transcription factor and was validated by 5′ RLM-RACE. This study has identified several conserved and novel miRNAs in the chickpea that are associated with gene regulation following exposure to wilt and salt stress. PMID:25295754

  5. Cloning and characterization of a functional human homolog of Escherichia coli endonuclease III

    Science.gov (United States)

    Aspinwall, Richard; Rothwell, Dominic G.; Roldan-Arjona, Teresa; Anselmino, Catherine; Ward, Christopher J.; Cheadle, Jeremy P.; Sampson, Julian R.; Lindahl, Tomas; Harris, Peter C.; Hickson, Ian D.

    1997-01-01

    Repair of oxidative damage to DNA bases is essential to prevent mutations and cell death. Endonuclease III is the major DNA glycosylase activity in Escherichia coli that catalyzes the excision of pyrimidines damaged by ring opening or ring saturation, and it also possesses an associated lyase activity that incises the DNA backbone adjacent to apurinic/apyrimidinic sites. During analysis of the area adjacent to the human tuberous sclerosis gene (TSC2) in chromosome region 16p13.3, we identified a gene, OCTS3, that encodes a 1-kb transcript. Analysis of OCTS3 cDNA clones revealed an open reading frame encoding a predicted protein of 34.3 kDa that shares extensive sequence similarity with E. coli endonuclease III and a related enzyme from Schizosaccharomyces pombe, including a conserved active site region and an iron/sulfur domain. The product of the OCTS3 gene was therefore designated hNTH1 (human endonuclease III homolog 1). The hNTH1 protein was overexpressed in E. coli and purified to apparent homogeneity. The recombinant protein had spectral properties indicative of the presence of an iron/sulfur cluster, and exhibited DNA glycosylase activity on double-stranded polydeoxyribonucleotides containing urea and thymine glycol residues, as well as an apurinic/apyrimidinic lyase activity. Our data indicate that hNTH1 is a structural and functional homolog of E. coli endonuclease III, and that this class of enzymes, for repair of oxidatively damaged pyrimidines in DNA, is highly conserved in evolution from microorganisms to human cells. PMID:8990169

  6. Identification of a Flavivirus Sequence in a Marine Arthropod.

    Directory of Open Access Journals (Sweden)

    Michael J Conway

    Full Text Available Phylogenetic analysis has yet to uncover the early origins of flaviviruses. In this study, I mined a database of expressed sequence tags in order to discover novel flavivirus sequences. Flavivirus sequences were identified in a pool of mRNA extracted from the sea spider Endeis spinosa (Pycnogonida, Pantopoda). Reconstruction of the translated sequences and BLAST analysis matched the sequence to the flavivirus NS5 gene. Additional sequences corresponding to envelope and the NS5 MTase domain were also identified. Phylogenetic analysis of homologous NS5 sequences revealed that Endeis spinosa NS5 (ESNS5) is likely related to classical insect-specific flaviviruses. It is unclear if ESNS5 represents genetic material from an active viral infection or an integrated viral genome. These data raise the possibility that classical insect-specific flaviviruses, and perhaps medically relevant flaviviruses, evolved from progenitors that infected marine arthropods.

  7. WebPrInSeS: automated full-length clone sequence identification and verification using high-throughput sequencing data.

    Science.gov (United States)

    Massouras, Andreas; Decouttere, Frederik; Hens, Korneel; Deplancke, Bart

    2010-07-01

    High-throughput sequencing (HTS) is revolutionizing our ability to obtain cheap, fast and reliable sequence information. Many experimental approaches are expected to benefit from the incorporation of such sequencing features in their pipeline. Consequently, software tools that facilitate such an incorporation should be of great interest. In this context, we developed WebPrInSeS, a web server tool allowing automated full-length clone sequence identification and verification using HTS data. WebPrInSeS encompasses two separate software applications. The first is WebPrInSeS-C which performs automated sequence verification of user-defined open-reading frame (ORF) clone libraries. The second is WebPrInSeS-E, which identifies positive hits in cDNA or ORF-based library screening experiments such as yeast one- or two-hybrid assays. Both tools perform de novo assembly using HTS data from any of the three major sequencing platforms. Thus, WebPrInSeS provides a highly integrated, cost-effective and efficient way to sequence-verify or identify clones of interest. WebPrInSeS is available at http://webprinses.epfl.ch/ and is open to all users.

  8. Slow Replication Fork Velocity of Homologous Recombination-Defective Cells Results from Endogenous Oxidative Stress

    Science.gov (United States)

    Magdalou, Indiana; Machon, Christelle; Dardillac, Elodie; Técher, Hervé; Guitton, Jérôme; Debatisse, Michelle; Lopez, Bernard S.

    2016-01-01

    Replication forks are routinely hindered by different endogenous stresses. Because homologous recombination plays a pivotal role in the reactivation of arrested replication forks, defects in homologous recombination reveal the initial endogenous stress(es). Homologous recombination-defective cells consistently exhibit a spontaneously reduced replication speed, leading to mitotic extra centrosomes. Here, we identify oxidative stress as a major endogenous source of replication speed deceleration in homologous recombination-defective cells. The treatment of homologous recombination-defective cells with the antioxidant N-acetyl-cysteine or the maintenance of the cells at low O2 levels (3%) rescues both the replication fork speed, as monitored by single-molecule analysis (molecular combing), and the associated mitotic extra centrosome frequency. Reciprocally, the exposure of wild-type cells to H2O2 reduces the replication fork speed and generates mitotic extra centrosomes. Supplying deoxynucleotide precursors to H2O2-exposed cells rescued the replication speed. Remarkably, treatment with N-acetyl-cysteine strongly expanded the nucleotide pool, accounting for the replication speed rescue. Remarkably, homologous recombination-defective cells exhibit a high level of endogenous reactive oxygen species. Consistently, homologous recombination-defective cells accumulate spontaneous γH2AX or XRCC1 foci that are abolished by treatment with N-acetyl-cysteine or maintenance at 3% O2. Finally, oxidative stress stimulated homologous recombination, which is suppressed by supplying deoxynucleotide precursors. Therefore, the cellular redox status strongly impacts genome duplication and transmission. Oxidative stress should generate replication stress through different mechanisms, including DNA damage and nucleotide pool imbalance. These data highlight the intricacy of endogenous replication and oxidative stresses, which are both evoked during tumorigenesis and senescence initiation.

  9. Slow Replication Fork Velocity of Homologous Recombination-Defective Cells Results from Endogenous Oxidative Stress.

    Directory of Open Access Journals (Sweden)

    Therese Wilhelm

    2016-05-01

    Full Text Available Replication forks are routinely hindered by different endogenous stresses. Because homologous recombination plays a pivotal role in the reactivation of arrested replication forks, defects in homologous recombination reveal the initial endogenous stress(es). Homologous recombination-defective cells consistently exhibit a spontaneously reduced replication speed, leading to mitotic extra centrosomes. Here, we identify oxidative stress as a major endogenous source of replication speed deceleration in homologous recombination-defective cells. The treatment of homologous recombination-defective cells with the antioxidant N-acetyl-cysteine or the maintenance of the cells at low O2 levels (3%) rescues both the replication fork speed, as monitored by single-molecule analysis (molecular combing), and the associated mitotic extra centrosome frequency. Reciprocally, the exposure of wild-type cells to H2O2 reduces the replication fork speed and generates mitotic extra centrosomes. Supplying deoxynucleotide precursors to H2O2-exposed cells rescued the replication speed. Remarkably, treatment with N-acetyl-cysteine strongly expanded the nucleotide pool, accounting for the replication speed rescue. Remarkably, homologous recombination-defective cells exhibit a high level of endogenous reactive oxygen species. Consistently, homologous recombination-defective cells accumulate spontaneous γH2AX or XRCC1 foci that are abolished by treatment with N-acetyl-cysteine or maintenance at 3% O2. Finally, oxidative stress stimulated homologous recombination, which is suppressed by supplying deoxynucleotide precursors. Therefore, the cellular redox status strongly impacts genome duplication and transmission. Oxidative stress should generate replication stress through different mechanisms, including DNA damage and nucleotide pool imbalance. These data highlight the intricacy of endogenous replication and oxidative stresses, which are both evoked during tumorigenesis and

  10. Slow Replication Fork Velocity of Homologous Recombination-Defective Cells Results from Endogenous Oxidative Stress.

    Science.gov (United States)

    Wilhelm, Therese; Ragu, Sandrine; Magdalou, Indiana; Machon, Christelle; Dardillac, Elodie; Técher, Hervé; Guitton, Jérôme; Debatisse, Michelle; Lopez, Bernard S

    2016-05-01

    Replication forks are routinely hindered by different endogenous stresses. Because homologous recombination plays a pivotal role in the reactivation of arrested replication forks, defects in homologous recombination reveal the initial endogenous stress(es). Homologous recombination-defective cells consistently exhibit a spontaneously reduced replication speed, leading to mitotic extra centrosomes. Here, we identify oxidative stress as a major endogenous source of replication speed deceleration in homologous recombination-defective cells. The treatment of homologous recombination-defective cells with the antioxidant N-acetyl-cysteine or the maintenance of the cells at low O2 levels (3%) rescues both the replication fork speed, as monitored by single-molecule analysis (molecular combing), and the associated mitotic extra centrosome frequency. Reciprocally, the exposure of wild-type cells to H2O2 reduces the replication fork speed and generates mitotic extra centrosomes. Supplying deoxynucleotide precursors to H2O2-exposed cells rescued the replication speed. Remarkably, treatment with N-acetyl-cysteine strongly expanded the nucleotide pool, accounting for the replication speed rescue. Remarkably, homologous recombination-defective cells exhibit a high level of endogenous reactive oxygen species. Consistently, homologous recombination-defective cells accumulate spontaneous γH2AX or XRCC1 foci that are abolished by treatment with N-acetyl-cysteine or maintenance at 3% O2. Finally, oxidative stress stimulated homologous recombination, which is suppressed by supplying deoxynucleotide precursors. Therefore, the cellular redox status strongly impacts genome duplication and transmission. Oxidative stress should generate replication stress through different mechanisms, including DNA damage and nucleotide pool imbalance. These data highlight the intricacy of endogenous replication and oxidative stresses, which are both evoked during tumorigenesis and senescence initiation.

  11. The complete genome sequence of the Atlantic salmon paramyxovirus (ASPV)

    International Nuclear Information System (INIS)

    Nylund, Stian; Karlsen, Marius; Nylund, Are

    2008-01-01

    The complete RNA genome of the Atlantic salmon paramyxovirus (ASPV), isolated from Atlantic salmon suffering from proliferative gill inflammation (PGI), has been determined. The genome is 16,965 nucleotides in length and consists of six nonoverlapping genes in the order 3'- N - P/C/V - M - F - HN - L -5', coding for the nucleocapsid, phospho-, matrix, fusion, hemagglutinin-neuraminidase and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and trinucleotide intergenic regions similar to those of other Paramyxoviridae. The ASPV P-gene expression strategy is like that of the respiro- and morbilliviruses, which express the phosphoprotein from the primary transcript, and edit a portion of the mRNA to encode the accessory proteins V and W. It also encodes the C-protein by ribosomal choice of translation initiation. Pairwise comparisons of amino acid identities, and phylogenetic analysis of deduced ASPV protein sequences with homologous sequences from other Paramyxoviridae, show that ASPV has an affinity for the genus Respirovirus, but may represent a new genus within the subfamily Paramyxovirinae

  12. Microfluidic PCR Amplification and MiSeq Amplicon Sequencing Techniques for High-Throughput Detection and Genotyping of Human Pathogenic RNA Viruses in Human Feces, Sewage, and Oysters

    Directory of Open Access Journals (Sweden)

    Mamoru Oshiki

    2018-04-01

    Full Text Available Detection and genotyping of pathogenic RNA viruses in human and environmental samples are useful for monitoring the circulation and prevalence of these pathogens, whereas a conventional PCR assay followed by Sanger sequencing is time-consuming and laborious. The present study aimed to develop a high-throughput detection-and-genotyping tool for 11 human RNA viruses [Aichi virus; astrovirus; enterovirus; norovirus genogroup I (GI), GII, and GIV; hepatitis A virus; hepatitis E virus; rotavirus; sapovirus; and human parechovirus] using a microfluidic device and next-generation sequencer. Microfluidic nested PCR was carried out on a 48.48 Access Array chip, and the amplicons were recovered and used for MiSeq sequencing (Illumina, Tokyo, Japan); genotyping was conducted by homology searching and phylogenetic analysis of the obtained sequence reads. The detection limit of the 11 tested viruses ranged from 10^0 to 10^3 copies/μL in cDNA sample, corresponding to 10^1–10^4 copies/mL-sewage, 10^5–10^8 copies/g-human feces, and 10^2–10^5 copies/g-digestive tissues of oyster. The developed assay was successfully applied for simultaneous detection and genotyping of RNA viruses to samples of human feces, sewage, and artificially contaminated oysters. Microfluidic nested PCR followed by MiSeq sequencing enables efficient tracking of the fate of multiple RNA viruses in various environments, which is essential for a better understanding of the circulation of human pathogenic RNA viruses in the human population.

  13. Probabilistic Methods for Processing High-Throughput Sequencing Signals

    DEFF Research Database (Denmark)

    Sørensen, Lasse Maretty

    High-throughput sequencing has the potential to answer many of the big questions in biology and medicine. It can be used to determine the ancestry of species, to chart complex ecosystems and to understand and diagnose disease. However, going from raw sequencing data to biological or medical insig....... By estimating the genotypes on a set of candidate variants obtained from both a standard mapping-based approach as well as de novo assemblies, we are able to find considerably more structural variation than previous studies...... for reconstructing transcript sequences from RNA sequencing data. The method is based on a novel sparse prior distribution over transcript abundances and is markedly more accurate than existing approaches. The second chapter describes a new method for calling genotypes from a fixed set of candidate variants....... The method queries the reads using a graph representation of the variants and thereby mitigates the reference bias that characterises standard genotyping methods. In the last chapter, we apply this method to call the genotypes of 50 deeply sequenced parent-offspring trios from the GenomeDenmark project...

  14. Sources of PCR-induced distortions in high-throughput sequencing data sets

    Science.gov (United States)

    Kebschull, Justus M.; Zador, Anthony M.

    2015-01-01

    PCR permits the exponential and sequence-specific amplification of DNA, even from minute starting quantities. PCR is a fundamental step in preparing DNA samples for high-throughput sequencing. However, there are errors associated with PCR-mediated amplification. Here we examine the effects of four important sources of error—bias, stochasticity, template switches and polymerase errors—on sequence representation in low-input next-generation sequencing libraries. We designed a pool of diverse PCR amplicons with a defined structure, and then used Illumina sequencing to search for signatures of each process. We further developed quantitative models for each process, and compared predictions of these models to our experimental data. We find that PCR stochasticity is the major force skewing sequence representation after amplification of a pool of unique DNA amplicons. Polymerase errors become very common in later cycles of PCR but have little impact on the overall sequence distribution as they are confined to small copy numbers. PCR template switches are rare and confined to low copy numbers. Our results provide a theoretical basis for removing distortions from high-throughput sequencing data. In addition, our findings on PCR stochasticity will have particular relevance to quantification of results from single cell sequencing, in which sequences are represented by only one or a few molecules. PMID:26187991
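
    One way to picture PCR stochasticity is as a branching process in which each molecule is copied with some probability in every cycle. The sketch below is a generic illustration under that assumption, not the quantitative model developed by the authors; the function name, the efficiency value and the seed are all hypothetical.

```python
import random


def simulate_pcr(start_copies, cycles, efficiency, rng):
    """Simulate one amplicon through PCR as a simple branching process.

    In each cycle every existing molecule is duplicated independently with
    probability `efficiency` (an assumed, constant per-cycle value).
    """
    copies = start_copies
    for _ in range(cycles):
        duplicated = sum(1 for _ in range(copies) if rng.random() < efficiency)
        copies += duplicated
    return copies


rng = random.Random(42)  # fixed seed so the run is repeatable
# Ten distinct amplicons, each starting from a single molecule.
final_counts = [simulate_pcr(1, 10, 0.8, rng) for _ in range(10)]
print(final_counts)
```

    Although every amplicon starts from one molecule and shares the same efficiency, the final counts generally differ, because chance events in the first few cycles are amplified in all later cycles. This is the kind of skew that matters most when sequences are represented by only one or a few starting molecules.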

  15. Transformation of Aspergillus parasiticus with a homologous gene (pyrG) involved in pyrimidine biosynthesis

    International Nuclear Information System (INIS)

    Skory, C.D.; Horng, J.S.; Pestka, J.J.; Linz, J.E.

    1990-01-01

    The lack of efficient transformation methods for aflatoxigenic Aspergillus parasiticus has been a major constraint for the study of aflatoxin biosynthesis at the genetic level. A transformation system with efficiencies of 30 to 50 stable transformants per μg of DNA was developed for A. parasiticus by using the homologous pyrG gene. The pyrG gene from A. parasiticus was isolated by in situ plaque hybridization of a lambda genomic DNA library. Uridine auxotrophs of A. parasiticus ATCC 36537, a mutant blocked in aflatoxin biosynthesis, were isolated by selection on 5-fluoroorotic acid following nitrosoguanidine mutagenesis. Isolates with mutations in the pyrG gene resulting in elimination of orotidine monophosphate (OMP) decarboxylase activity were detected by assaying cell extracts for their ability to convert [14C]OMP to [14C]UMP. Transformation of A. parasiticus pyrG protoplasts with the homologous pyrG gene restored the fungal cells to prototrophy. Enzymatic analysis of cell extracts of transformant clones demonstrated that these extracts had the ability to convert [14C]OMP to [14C]UMP. Southern analysis of DNA purified from transformant clones indicated that both pUC19 vector sequences and pyrG sequences were integrated into the genome. The development of this pyrG transformation system should allow cloning of the aflatoxin-biosynthetic genes, which will be useful in studying the regulation of aflatoxin biosynthesis and may ultimately provide a means for controlling aflatoxin production in the field.

  16. Experimental design-based functional mining and characterization of high-throughput sequencing data in the sequence read archive.

    Directory of Open Access Journals (Sweden)

    Takeru Nakazato

    Full Text Available High-throughput sequencing technology, also called next-generation sequencing (NGS), has the potential to revolutionize the whole process of genome sequencing, transcriptomics, and epigenetics. Sequencing data is captured in a public primary data archive, the Sequence Read Archive (SRA). As of January 2013, data from more than 14,000 projects have been submitted to SRA, which is double that of the previous year. Researchers can download raw sequence data from the SRA website to perform further analyses and to compare with their own data. However, it is extremely difficult to search entries and download raw sequences of interest with SRA because the data structure is complicated, and experimental conditions along with raw sequences are partly described in natural language. Additionally, some sequences are of inconsistent quality because anyone can submit sequencing data to SRA with no quality check. Therefore, as a criterion of data quality, we focused on SRA entries that were cited in journal articles. We extracted SRA IDs and PubMed IDs (PMIDs) from SRA and full-text versions of journal articles and retrieved 2748 SRA ID-PMID pairs. We constructed a publication list referring to SRA entries. Since one of the main themes of -omics analyses is clarification of disease mechanisms, we also characterized SRA entries by disease keywords, according to the Medical Subject Headings (MeSH) extracted from articles assigned to each SRA entry. We obtained 989 SRA ID-MeSH disease term pairs, and constructed a disease list referring to SRA data. We previously developed feature profiles of diseases in a system called "Gendoo". We generated hyperlinks between diseases extracted from SRA and their feature profiles. The developed project, publication and disease lists resulting from this study are available at our web service, called "DBCLS SRA" (http://sra.dbcls.jp/). This service will improve accessibility to high-quality data from SRA.
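
    Pairing SRA IDs with PMIDs from article text can be pictured as a pattern-matching step. The sketch below is only an illustration and not the extraction pipeline used in the study, which the abstract does not describe; it assumes accessions of the form SRA/SRP/SRX/SRS/SRR (and the ERx/DRx equivalents) followed by at least six digits, and the sample sentence and function name are invented.

```python
import re

# Assumed accession shape: S/E/D + R + one of A, P, X, R, S + six or more digits.
SRA_ID = re.compile(r"\b[SED]R[APXRS]\d{6,}\b")


def sra_pmid_pairs(article_text, pmid):
    """Return (sra_id, pmid) pairs for every distinct accession in the text."""
    seen = []
    for accession in SRA_ID.findall(article_text):
        if accession not in seen:  # keep first-seen order, drop repeats
            seen.append(accession)
    return [(accession, pmid) for accession in seen]


# Hypothetical article sentence and PMID
text = ("Reads were deposited in the Sequence Read Archive under SRP012345; "
        "runs SRR0123456 and SRR0123457 belong to SRP012345.")
print(sra_pmid_pairs(text, "23456789"))
```

    A real pipeline would also need to handle accession ranges, line-broken IDs and false positives, none of which are covered here.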

  17. A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities.

    Science.gov (United States)

    Bastien, Olivier; Ortet, Philippe; Roy, Sylvaine; Maréchal, Eric

    2005-03-10

    Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic reconstruction. We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. Deduced trees, called TULIP trees, are consistent with multiple-alignment based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny. The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionarily consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after addition or removal of sequences. The CSHP and its assigned phylogenetic topology provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations.

  18. A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities

    Directory of Open Access Journals (Sweden)

    Maréchal Eric

    2005-03-01

    Full Text Available Abstract. Background: Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic reconstruction. Results: We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. Deduced trees, called TULIP trees, are consistent with multiple-alignment based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny. Conclusion: The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionarily consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after addition or removal of sequences. The CSHP and its assigned phylogenetic topology provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations.

  19. Generation and analysis of expressed sequence tags from Botrytis cinerea

    Directory of Open Access Journals (Sweden)

    EVELYN SILVA

    2006-01-01

    Full Text Available Botrytis cinerea is a filamentous plant pathogen of a wide range of plant species, and its infection may cause enormous damage both during plant growth and in the post-harvest phase. We have constructed a cDNA library from an isolate of B. cinerea and have sequenced 11,482 expressed sequence tags that were assembled into 1,003 contig sequences and 3,032 singletons. Approximately 81% of the unigenes showed significant similarity to genes coding for proteins with known functions: more than 50% of the sequences code for genes involved in cellular metabolism, 12% for transport of metabolites, and approximately 10% for cellular organization. Other functional categories include responses to biotic and abiotic stimuli, cell communication, cell homeostasis, and cell development. We carried out pair-wise comparisons with fungal databases to determine the B. cinerea unisequence set with relevant similarity to genes in other fungal pathogenic counterparts. Among the 4,035 non-redundant B. cinerea unigenes, 1,338 (23%) have significant homology with Fusarium verticillioides unigenes. Similar values were obtained for Saccharomyces cerevisiae and Aspergillus nidulans (22% and 24%, respectively). The lowest percentages of homology were with Magnaporthe grisea and Neurospora crassa (13% and 19%, respectively). Several genes involved in putative and known fungal virulence and general pathogenicity were identified. The results provide important information for future research on this fungal pathogen.

  20. Communicating the Benefits of a Full Sequence of High School Science Courses

    Science.gov (United States)

    Nicholas, Catherine Marie

    2014-01-01

    High school students are generally uninformed about the benefits of enrolling in a full sequence of science courses, therefore only about a third of our nation's high school graduates have completed the science sequence of Biology, Chemistry and Physics. The lack of students completing a full sequence of science courses contributes to the deficit…

  1. The K-homology of nets of C∗-algebras

    Science.gov (United States)

    Ruzzi, Giuseppe; Vasselli, Ezio

    2014-12-01

    Let X be a space, intended as a possibly curved space-time, and A a precosheaf of C∗-algebras on X. Motivated by algebraic quantum field theory, we study the Kasparov and Θ-summable K-homology of A interpreting them in terms of the holonomy equivariant K-homology of the associated C∗-dynamical system. This yields a characteristic class for K-homology cycles of A with values in the odd cohomology of X, that we interpret as a generalized statistical dimension.

  2. New Insight Into the Diversity of SemiSWEET Sugar Transporters and the Homologs in Prokaryotes

    Directory of Open Access Journals (Sweden)

    Baolei Jia

    2018-05-01

    Full Text Available Sugars will eventually be exported transporters (SWEETs) and SemiSWEETs represent a family of sugar transporters in eukaryotes and prokaryotes, respectively. SWEETs contain seven transmembrane helices (TMHs), while SemiSWEETs contain three. The functions of SemiSWEETs are less studied. In this perspective article, we analyzed the diversity and conservation of SemiSWEETs and further proposed their possible functions. 1,922 SemiSWEET homologs were retrieved from the UniProt database, a number that is not proportional to the number of sequenced prokaryotic genomes. However, these proteins are very diverse in sequence and can be classified into 19 clusters when >50% sequence identity is required. Moreover, a gene context analysis indicated that several SemiSWEETs are located in operons that are related to diverse carbohydrate metabolism. Several proteins with seven TMHs can be found in bacteria, and sequence alignment suggested that these bacterial proteins may have been formed by duplication and fusion. Multiple sequence alignments showed that the amino acids for sugar translocation are still conserved and coevolved, although the sequences show diversity. Among them, the functions of a few amino acids are still not clear. These findings highlight the challenges that exist in SemiSWEETs and provide future researchers the foundation to explore these uncharted areas.

  3. Evolution of biological sequences implies an extreme value distribution of type I for both global and local pairwise alignment scores.

    Science.gov (United States)

    Bastien, Olivier; Maréchal, Eric

    2008-08-07

    Confidence in pairwise alignments of biological sequences, obtained by various methods such as Blast or Smith-Waterman, is critical for automatic analyses of genomic data. Two statistical models have been proposed. In the asymptotic limit of long sequences, the Karlin-Altschul model is based on the computation of a P-value, assuming that the number of high scoring matching regions above a threshold is Poisson distributed. Alternatively, the Lipman-Pearson model is based on the computation of a Z-value from a random score distribution obtained by a Monte-Carlo simulation. Z-values allow the deduction of an upper bound of the P-value (1/Z-value²) following the TULIP theorem. Simulations of Z-value distributions are known to fit a Gumbel law. This remarkable property was not demonstrated and had no obvious biological support. We built a model of evolution of sequences based on aging, as meant in Reliability Theory, using the fact that the amount of information shared between an initial sequence and the sequences in its lineage (i.e., mutual information in Information Theory) is a decreasing function of time. This quantity is simply measured by a sequence alignment score. In systems aging, the failure rate is related to the system's longevity. The system can be a machine with structured components, or a living entity or population. "Reliability" refers to the ability to operate properly according to a standard. Here, the "reliability" of a sequence refers to the ability to conserve a sufficient functional level at the folded and maturated protein level (positive selection pressure). Homologous sequences were considered as systems 1) having a high redundancy of information reflected by the magnitude of their alignment scores, 2) whose components are the amino acids that can independently be damaged by random DNA mutations. From these assumptions, we deduced that information shared at each amino acid position evolved with a constant rate, corresponding to the
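
    The Z-value idea described in this record can be illustrated with a minimal sketch. It assumes a toy Smith-Waterman scoring scheme (match 2, mismatch -1, linear gap -2) and a fixed number of shuffles; the function names, parameters and scoring values are illustrative assumptions and are not taken from the cited work. Only the 1/Z² upper bound on the P-value comes from the abstract.

```python
import random
from statistics import mean, pstdev


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score (toy scoring, linear gaps)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def z_value(a, b, n_shuffles=200, seed=0):
    """Z-value of the real score against scores of shuffled copies of b."""
    rng = random.Random(seed)
    observed = local_alignment_score(a, b)
    letters = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # Monte Carlo step: same composition, random order
        shuffled_scores.append(local_alignment_score(a, "".join(letters)))
    sigma = pstdev(shuffled_scores)
    if sigma == 0:
        raise ValueError("shuffled scores have zero variance")
    return (observed - mean(shuffled_scores)) / sigma


def p_value_upper_bound(z):
    """Upper bound 1/Z^2 on the P-value; uninformative (1.0) when Z <= 1."""
    return min(1.0, 1.0 / z ** 2) if z > 0 else 1.0
```

    For example, a Z-value of 10 gives an upper bound of 0.01 on the P-value, whatever the shape of the underlying score distribution.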

  4. Scrutinizing virus genome termini by high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Shasha Li

    Full Text Available Analysis of genomic terminal sequences has been a major step in studies on viral DNA replication and packaging mechanisms. However, traditional methods to study genome termini are challenging because the protocols are time-consuming and inefficient, and critical details are easily lost. Recent advances in next generation sequencing (NGS) have made it a powerful tool to study genome termini. In this study, using NGS we sequenced one iridovirus genome and twenty phage genomes and confirmed for the first time that the high frequency sequences (HFSs) found in the NGS reads are indeed the terminal sequences of viral genomes. Further, we established a criterion to distinguish the type of termini and the viral packaging mode. We also obtained additional terminal details such as terminal repeats, multi-termini, and asymmetric termini. With this approach, we were able to simultaneously detect details of the genome termini as well as obtain the complete sequence of bacteriophage genomes. Theoretically, this application can be further extended to analyze larger and more complicated genomes of plant and animal viruses. This study proposed a novel and efficient method for research on viral replication, packaging, terminase activity, transcription regulation, and metabolism of the host cell.

  5. QUASAR--scoring and ranking of sequence-structure alignments.

    Science.gov (United States)

    Birzele, Fabian; Gewehr, Jan E; Zimmer, Ralf

    2005-12-15

    Sequence-structure alignments are a common means for protein structure prediction in the fields of fold recognition and homology modeling, and there is a broad variety of programs that provide such alignments based on sequence similarity, secondary structure or contact potentials. Nevertheless, finding the best sequence-structure alignment in a pool of alignments remains a difficult problem. QUASAR (quality of sequence-structure alignments ranking) provides a unifying framework for scoring sequence-structure alignments that aids finding well-performing combinations of well-known and custom-made scoring schemes. Those scoring functions can be benchmarked against widely accepted quality scores like MaxSub, TMScore, Touch and APDB, thus enabling users to test their own alignment scores against 'standard-of-truth' structure-based scores. Furthermore, individual score combinations can be optimized with respect to benchmark sets based on known structural relationships using QUASAR's in-built optimization routines.

  6. SUGAR: graphical user interface-based data refiner for high-throughput DNA sequencing.

    Science.gov (United States)

    Sato, Yukuto; Kojima, Kaname; Nariai, Naoki; Yamaguchi-Kabata, Yumi; Kawai, Yosuke; Takahashi, Mamoru; Mimori, Takahiro; Nagasaki, Masao

    2014-08-08

    Next-generation sequencers (NGSs) have become one of the main tools for current biology. To obtain useful insights from NGS data, it is essential to control low-quality portions of the data affected by technical errors such as air bubbles in sequencing fluidics. We developed the software SUGAR (subtile-based GUI-assisted refiner), which can handle ultra-high-throughput data with a user-friendly graphical user interface (GUI) and interactive analysis capability. SUGAR generates high-resolution quality heatmaps of the flowcell, enabling users to find possible signals of technical errors during the sequencing. The sequencing data generated from the error-affected regions of a flowcell can be selectively removed by automated analysis or GUI-assisted operations implemented in SUGAR. The automated data-cleaning function based on sequence read quality (Phred) scores was applied to public whole human genome sequencing data, and we showed that the overall mapping quality was improved. The detailed data evaluation and cleaning enabled by SUGAR would reduce technical problems in sequence read mapping, improving subsequent variant analysis that requires high-quality sequence data and mapping results. Therefore, the software will be especially useful for controlling the quality of variant calls for low-population cells, e.g., cancers, in a sample with technical errors of sequencing procedures.
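
    As a rough illustration of region-based cleaning by Phred score, the sketch below groups reads by a flowcell region identifier, computes the mean Phred score per region, and drops reads from regions below a cutoff. This is not SUGAR's actual algorithm or interface: the (region, quality string) input layout, the Phred+33 encoding, the function names and the cutoff of 20 are all assumptions made for the example.

```python
from collections import defaultdict


def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality]


def mean_quality_by_region(reads):
    """Mean Phred score per region; reads are (region_id, quality_string)."""
    totals = defaultdict(lambda: [0, 0])  # region -> [sum of scores, base count]
    for region, quality in reads:
        scores = phred_scores(quality)
        totals[region][0] += sum(scores)
        totals[region][1] += len(scores)
    return {region: s / n for region, (s, n) in totals.items() if n}


def drop_low_quality_regions(reads, min_mean_q=20.0):
    """Keep only reads that come from regions whose mean quality passes."""
    means = mean_quality_by_region(reads)
    return [(r, q) for r, q in reads if means.get(r, 0.0) >= min_mean_q]
```

    With reads `[("1101", "IIII"), ("1102", "####")]`, region 1101 has mean quality 40 and is kept, while region 1102 has mean quality 2 and is removed.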

  7. Natural Homologous Triploidization and DNA Methylation in SARII-628, a Twin-seedling Line of Rice (Oryza sativa)

    Directory of Open Access Journals (Sweden)

    Hai PENG

    2007-12-01

    Full Text Available A total of five pairs of diploid-triploid twin-seedlings (a diploid seedling and a triploid seedling emerging from one grain) were selected from 4500 pairs of seedlings from SARII-628, a twin-seedling rice line. SSR analysis found no difference between the diploid seedling and the corresponding triploid seedling of each twin-seedling pair at the 310 loci, indicating that there was no obvious change in DNA primary structure. A modified AFLP technique, MSAP (methylation-sensitive AFLP), was used to analyze methylation mutation. Although no methylation mutation was noted among the five diploids, 29 methylation mutation loci were found in the corresponding triploids. This suggested that methylation mutation happened rapidly in the M0 generation after natural homologous triploidization. The mutations were classified into 10 types, including 3 increased types, 3 decreased types and 4 undecided types of methylation degree. The bands of 22 loci were sequenced and the sequences were then searched online. The result showed that the methylation mutations involved the whole rice genome and all 12 pairs of chromosomes. The mutation trend was site-related and there were different mutation loci for different triploids, which suggested that SARII-628 would have different evolutionary fates after natural homologous triploidization.

  8. Transcription patterns of genes encoding four metallothionein homologs in Daphnia pulex exposed to copper and cadmium are time- and homolog-dependent

    International Nuclear Information System (INIS)

    Asselman, Jana; Shaw, Joseph R.; Glaholt, Stephen P.; Colbourne, John K.; De Schamphelaere, Karel A.C.

    2013-01-01

    Highlights: •Transcription patterns of 4 metallothionein isoforms in Daphnia pulex. •Under cadmium and copper stress these patterns are time-dependent. •Under cadmium and copper stress these patterns are homolog-dependent. •The results stress the complex regulation of metallothioneins. -- Abstract: Metallothioneins are proteins that play an essential role in metal homeostasis and detoxification in nearly all organisms studied to date. Yet discrepancies between outcomes of chronic and acute exposure experiments hamper the understanding of the regulatory mechanisms of their isoforms following metal exposure. Here, we investigated transcriptional differences among four identified homologs (mt1–mt4) in Daphnia pulex exposed across time to copper and cadmium relative to a control. Transcriptional upregulation of mt1 and mt3 was detected on day four following exposure to cadmium, whereas that of mt2 and mt4 was detected on day two and day eight following exposure to copper. These results confirm temporal and metal-specific differences in the transcriptional induction of genes encoding metallothionein homologs upon metal exposure which should be considered in ecotoxicological monitoring programs of metal-contaminated water bodies. Indeed, the mRNA expression patterns observed here illustrate the complex regulatory system associated with metallothioneins, as these patterns are not only dependent on the metal, but also on exposure time and the homolog studied. Further phylogenetic analysis and analysis of regulatory elements in upstream promoter regions revealed a high degree of similarity between metallothionein genes of Daphnia pulex and Daphnia magna, a species belonging to the same genus. These findings, combined with a limited amount of available expression data for D. magna metallothionein genes, tentatively suggest a potential generalization of the metallothionein response system between these Daphnia species

  9. Transcription patterns of genes encoding four metallothionein homologs in Daphnia pulex exposed to copper and cadmium are time- and homolog-dependent

    Energy Technology Data Exchange (ETDEWEB)

    Asselman, Jana, E-mail: jana.asselman@ugent.be [Laboratory of Environmental Toxicology and Aquatic Ecology, Ghent University, Ghent (Belgium); Shaw, Joseph R.; Glaholt, Stephen P. [The School of Public and Environmental Affairs, Indiana University, Bloomington, IN (United States); Colbourne, John K. [School of Biosciences, The University of Birmingham, Birmingham (United Kingdom); De Schamphelaere, Karel A.C. [Laboratory of Environmental Toxicology and Aquatic Ecology, Ghent University, Ghent (Belgium)

    2013-10-15

    Highlights: •Transcription patterns of 4 metallothionein isoforms in Daphnia pulex. •Under cadmium and copper stress these patterns are time-dependent. •Under cadmium and copper stress these patterns are homolog-dependent. •The results stress the complex regulation of metallothioneins. -- Abstract: Metallothioneins are proteins that play an essential role in metal homeostasis and detoxification in nearly all organisms studied to date. Yet discrepancies between outcomes of chronic and acute exposure experiments hamper the understanding of the regulatory mechanisms of their isoforms following metal exposure. Here, we investigated transcriptional differences among four identified homologs (mt1–mt4) in Daphnia pulex exposed across time to copper and cadmium relative to a control. Transcriptional upregulation of mt1 and mt3 was detected on day four following exposure to cadmium, whereas that of mt2 and mt4 was detected on day two and day eight following exposure to copper. These results confirm temporal and metal-specific differences in the transcriptional induction of genes encoding metallothionein homologs upon metal exposure which should be considered in ecotoxicological monitoring programs of metal-contaminated water bodies. Indeed, the mRNA expression patterns observed here illustrate the complex regulatory system associated with metallothioneins, as these patterns are not only dependent on the metal, but also on exposure time and the homolog studied. Further phylogenetic analysis and analysis of regulatory elements in upstream promoter regions revealed a high degree of similarity between metallothionein genes of Daphnia pulex and Daphnia magna, a species belonging to the same genus. These findings, combined with a limited amount of available expression data for D. magna metallothionein genes, tentatively suggest a potential generalization of the metallothionein response system between these Daphnia species.

  10. Construction and partial sequencing of a subtractive library in Calcutta 4 (Musa AA) in early stage of infection with Mycosphaerella fijiensis Morelet

    Directory of Open Access Journals (Sweden)

    Milady Mendoza-Rodríguez

    2006-10-01

    Full Text Available The study of genes involved in the plant defense response against pathogen attack is one of the most important steps leading to the elucidation of the molecular mechanisms of disease resistance. The generation of subtracted complementary deoxyribonucleic acid (cDNA) libraries by means of the suppression subtractive hybridization (SSH) technique has been used for this purpose. A subtractive hybridization was made between a cDNA population obtained from ‘Calcutta 4’ leaves inoculated with M. fijiensis (CCIBP-Pf83) and a mixture of cDNA from ‘Calcutta 4’ non-inoculated leaves and mycelium. Leaf samples were taken at 6, 10 and 12 days after inoculation. The subtracted library was obtained by cloning and transformation of subtracted products and, as a result, 600 recombinant clones were obtained. Sequence analysis of sixty-nine clones revealed redundancy of the expressed sequence tags; most of them showed no homology with sequences reported in databases, and only 13% had a high homology with metallothioneins. The results constitute a step forward in the molecular study of the Musa-Mycosphaerella fijiensis interaction. Key words: Banana-Mycosphaerella fijiensis interaction, Black Sigatoka, Musa spp., suppression subtractive hybridization

  11. Draft genome sequence of the Coccolithovirus Emiliania huxleyi virus 203.

    Science.gov (United States)

    Nissimov, Jozef I; Worthy, Charlotte A; Rooks, Paul; Napier, Johnathan A; Kimmance, Susan A; Henn, Matthew R; Ogata, Hiroyuki; Allen, Michael J

    2011-12-01

    The Coccolithoviridae are a recently discovered group of viruses that infect the marine coccolithophorid Emiliania huxleyi. Emiliania huxleyi virus 203 (EhV-203) has a 160- to 180-nm-diameter icosahedral structure and a genome of approximately 400 kbp, consisting of 464 coding sequences (CDSs). Here we describe the genomic features of EhV-203 together with a draft genome sequence and its annotation, highlighting the homology and heterogeneity of this genome in comparison with the EhV-86 reference genome.

  12. Direct evidence for sequence-dependent attraction between double-stranded DNA controlled by methylation.

    Science.gov (United States)

    Yoo, Jejoong; Kim, Hajin; Aksimentiev, Aleksei; Ha, Taekjip

    2016-03-22

    Although proteins mediate highly ordered DNA organization in vivo, theoretical studies suggest that homologous DNA duplexes can preferentially associate with one another even in the absence of proteins. Here we combine molecular dynamics simulations with single-molecule fluorescence resonance energy transfer experiments to examine the interactions between duplex DNA in the presence of spermine, a biological polycation. We find that AT-rich DNA duplexes associate more strongly than GC-rich duplexes, regardless of the sequence homology. Methyl groups of thymine act as a steric block, relocating spermine from major grooves to interhelical regions, thereby increasing DNA-DNA attraction. Indeed, methylation of cytosines makes attraction between GC-rich DNA as strong as that between AT-rich DNA. Recent genome-wide chromosome organization studies showed that remote contact frequencies are higher for AT-rich and methylated DNA, suggesting that the direct DNA-DNA interactions that we report here may play a role in chromosome organization and gene regulation.

  13. Homologous recombination and non-homologous end-joining repair pathways in bovine embryos with different developmental competence

    Energy Technology Data Exchange (ETDEWEB)

    Henrique Barreta, Marcos [Universidade Federal de Santa Catarina, Campus Universitario de Curitibanos, Curitibanos, SC (Brazil); Laboratorio de Biotecnologia e Reproducao Animal-BioRep, Universidade Federal de Santa Maria, Santa Maria, RS (Brazil); Garziera Gasperin, Bernardo; Braga Rissi, Vitor; Cesaro, Matheus Pedrotti de [Laboratorio de Biotecnologia e Reproducao Animal-BioRep, Universidade Federal de Santa Maria, Santa Maria, RS (Brazil); Ferreira, Rogerio [Centro de Educacao Superior do Oeste-Universidade do Estado de Santa Catarina, Chapeco, SC (Brazil); Oliveira, Joao Francisco de; Goncalves, Paulo Bayard Dias [Laboratorio de Biotecnologia e Reproducao Animal-BioRep, Universidade Federal de Santa Maria, Santa Maria, RS (Brazil); Bordignon, Vilceu, E-mail: vilceu.bordignon@mcgill.ca [Department of Animal Science, McGill University, Ste-Anne-De-Bellevue, QC (Canada)

    2012-10-01

    This study investigated the expression of genes controlling homologous recombination (HR) and non-homologous end-joining (NHEJ) DNA-repair pathways in bovine embryos of different developmental potential. It also evaluated whether bovine embryos can respond to DNA double-strand breaks (DSBs) induced with ultraviolet irradiation by regulating expression of genes involved in HR and NHEJ repair pathways. Embryos with high, intermediate or low developmental competence were selected based on the cleavage time after in vitro insemination and were removed from in vitro culture before (36 h), during (72 h) and after (96 h) the expected period of embryonic genome activation. All studied genes were expressed before, during and after the genome activation period regardless of the developmental competence of the embryos. Higher mRNA expression of 53BP1 and RAD52 was found before genome activation in embryos with low developmental competence. Expression of 53BP1, RAD51 and KU70 was downregulated at 72 h and upregulated at 168 h post-insemination in response to DSBs induced by ultraviolet irradiation. In conclusion, important genes controlling HR and NHEJ DNA-repair pathways are expressed in bovine embryos; however, genes participating in these pathways are only regulated after the period of embryo genome activation in response to ultraviolet-induced DSBs.

  14. Homologous recombination and non-homologous end-joining repair pathways in bovine embryos with different developmental competence

    International Nuclear Information System (INIS)

    Henrique Barreta, Marcos; Garziera Gasperin, Bernardo; Braga Rissi, Vitor; Cesaro, Matheus Pedrotti de; Ferreira, Rogério; Oliveira, João Francisco de; Gonçalves, Paulo Bayard Dias; Bordignon, Vilceu

    2012-01-01

    This study investigated the expression of genes controlling homologous recombination (HR) and non-homologous end-joining (NHEJ) DNA-repair pathways in bovine embryos of different developmental potential. It also evaluated whether bovine embryos can respond to DNA double-strand breaks (DSBs) induced with ultraviolet irradiation by regulating expression of genes involved in HR and NHEJ repair pathways. Embryos with high, intermediate or low developmental competence were selected based on the cleavage time after in vitro insemination and were removed from in vitro culture before (36 h), during (72 h) and after (96 h) the expected period of embryonic genome activation. All studied genes were expressed before, during and after the genome activation period regardless of the developmental competence of the embryos. Higher mRNA expression of 53BP1 and RAD52 was found before genome activation in embryos with low developmental competence. Expression of 53BP1, RAD51 and KU70 was downregulated at 72 h and upregulated at 168 h post-insemination in response to DSBs induced by ultraviolet irradiation. In conclusion, important genes controlling HR and NHEJ DNA-repair pathways are expressed in bovine embryos; however, genes participating in these pathways are only regulated after the period of embryo genome activation in response to ultraviolet-induced DSBs.

  15. Two RNAs or DNAs May Artificially Fuse Together at a Short Homologous Sequence (SHS) during Reverse Transcription or Polymerase Chain Reactions, and Thus Reporting an SHS-Containing Chimeric RNA Requires Extra Caution

    Science.gov (United States)

    Xie, Bingkun; Yang, Wei; Ouyang, Yongchang; Chen, Lichan; Jiang, Hesheng; Liao, Yuying; Liao, D. Joshua

    2016-01-01

    Tens of thousands of chimeric RNAs have been reported. Most of them contain a short homologous sequence (SHS) at the joining site of the two partner genes but are not associated with a fusion gene. We hypothesize that many of these chimeras may be technical artifacts derived from SHS-caused mis-priming in reverse transcription (RT) or polymerase chain reactions (PCR). We cloned six chimeric complementary DNAs (cDNAs) formed by human mitochondrial (mt) 16S rRNA sequences at an SHS, which were similar to several expressed sequence tags (ESTs). These chimeras, which could not be detected with cDNA protection assay, were likely formed because some regions of the 16S rRNA are reversely complementary to another region to form an SHS, which allows the downstream sequence to loop back and anneal at the SHS to prime the synthesis of its complementary strand, yielding a palindromic sequence that can form a hairpin-like structure. We identified a 16S rRNA that ended at the 4th nucleotide (nt) of the mt-tRNA-leu, which was dominant and thus should be the wild type. We also cloned a mouse Bcl2-Nek9 chimeric cDNA that contained a 5-nt unmatchable sequence between the two partners and two copies of the reverse primer in the same direction but did not contain the forward primer, making it unclear how this Bcl2-Nek9 was formed and amplified. Moreover, a cDNA was amplified because one primer had 4 nts matched to the template, suggesting that there may be many more artificial cDNAs than we have realized, because the nuclear and mt genomes have many more 4-nt than 5-nt or longer homologues. Altogether, the chimeric cDNAs we cloned are good examples suggesting that many cDNAs may be artifacts due to SHS-caused mis-priming and thus greater caution should be taken when a new sequence is obtained from a technique involving DNA polymerization. PMID:27148738
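
    The remark that 4-nt matches are far more common than 5-nt or longer ones can be checked on any pair of sequences with a simple k-mer comparison. The sketch below is only an illustration of that counting idea, not the authors' method; the function names and example sequences are hypothetical. It lists k-mers shared by two sequences, and k-mers of one sequence whose reverse complement also occurs in it (candidate sites for the loop-back self-priming described above).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Set of all substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(seq_a, seq_b, k):
    """k-mers present in both sequences (candidate short homologous sequences)."""
    return sorted(kmers(seq_a, k) & kmers(seq_b, k))


def self_complementary_kmers(seq, k):
    """k-mers whose reverse complement also occurs somewhere in the same sequence."""
    return sorted(kmers(seq, k) & kmers(reverse_complement(seq), k))
```

    For the toy sequences "ACGTTGCA" and "GGTTGCCC", two 4-mers are shared (GTTG, TTGC), one 5-mer (GTTGC) and no 6-mer, which mirrors the trend the abstract describes for whole genomes.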

  16. The nucleotide sequence of a Polish isolate of Tomato torrado virus.

    Science.gov (United States)

    Budziszewska, Marta; Obrepalska-Steplowska, Aleksandra; Wieczorek, Przemysław; Pospieszny, Henryk

    2008-12-01

    A new virus was isolated from greenhouse tomato plants showing symptoms of leaf and apex necrosis in Wielkopolska province in Poland in 2003. The observed symptoms and the virus morphology resembled those of viruses previously reported in Spain, called Tomato torrado virus (ToTV), and in Mexico, called Tomato marchitez virus (ToMarV). The complete genome of a Polish isolate Wal'03 was determined by RT-PCR amplification using oligonucleotide primers developed against the ToTV sequences deposited in GenBank, followed by cloning, sequencing, and comparison with the sequence of the type isolate. Phylogenetic analyses, performed on the basis of fragments of polyprotein sequences, established the relationship of Polish isolate Wal'03 with Spanish ToTV and Mexican ToMarV, as well as with other viruses from the Sequivirus, Sadwavirus, and Cheravirus genera, reported to be the most similar to the new tomato viruses. The Wal'03 genome strands have the same organization as, and very high homology with, the ToTV type isolate, showing only some nucleotide and deduced amino acid changes, in contrast to ToMarV, which was significantly different. The phylogenetic tree clustered the aforementioned viruses in the same group, indicating that they have a common origin.

  17. Complete Genome Sequences of Four Isolates of Plutella xylostella Granulovirus

    OpenAIRE

    Spence, Robert J.; Noune, Christopher; Hauxwell, Caroline

    2016-01-01

    Granuloviruses are widespread pathogens of Plutella xylostella L. (diamondback moth) and potential biopesticides for control of this global insect pest. We report the complete genomes of four Plutella xylostella granulovirus isolates from China, Malaysia, and Taiwan exhibiting pairs of noncoding, homologous repeat regions with significant sequence variation but equivalent length.

  18. Overlapping genomic sequences: a treasure trove of single-nucleotide polymorphisms.

    Science.gov (United States)

    Taillon-Miller, P; Gu, Z; Li, Q; Hillier, L; Kwok, P Y

    1998-07-01

    An efficient strategy to develop a dense set of single-nucleotide polymorphism (SNP) markers is to take advantage of the human genome sequencing effort currently under way. Our approach is based on the fact that bacterial artificial chromosomes (BACs) and P1-based artificial chromosomes (PACs) used in long-range sequencing projects come from diploid libraries. If the overlapping clones sequenced are from different lineages, one is comparing the sequences from 2 homologous chromosomes in the overlapping region. We have analyzed in detail every SNP identified while sequencing three sets of overlapping clones found on chromosome 5p15.2, 7q21-7q22, and 13q12-13q13. In the 200.6 kb of DNA sequence analyzed in these overlaps, 153 SNPs were identified. Computer analysis for repetitive elements and suitability for STS development yielded 44 STSs containing 68 SNPs for further study. All 68 SNPs were confirmed to be present in at least one of the three (Caucasian, African-American, Hispanic) populations studied. Furthermore, 42 of the SNPs tested (62%) were informative in at least one population, 32 (47%) were informative in two or more populations, and 23 (34%) were informative in all three populations. These results clearly indicate that developing SNP markers from overlapping genomic sequence is highly efficient and cost effective, requiring only the two simple steps of developing STSs around the known SNPs and characterizing them in the appropriate populations.
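
    The percentages quoted above follow directly from the counts given in the abstract. A short Python check of that arithmetic, using only those figures (the variable names are chosen here for illustration):

```python
# Figures taken from the abstract above.
overlap_kb = 200.6   # kb of overlapping sequence analyzed
snps_found = 153     # SNPs identified in the overlaps
snps_tested = 68     # SNPs carried forward in 44 STSs

# Number of tested SNPs that were informative, by breadth of populations.
informative = {
    "at least one population": 42,
    "two or more populations": 32,
    "all three populations": 23,
}

# Average spacing between SNPs in the analyzed overlaps.
kb_per_snp = overlap_kb / snps_found

# Share of tested SNPs in each category, rounded to whole percent.
percent_informative = {
    label: round(100 * count / snps_tested)
    for label, count in informative.items()
}

for label, pct in percent_informative.items():
    print(f"informative in {label}: {pct}%")
print(f"about one SNP per {kb_per_snp:.2f} kb")
```

    The rounded values reproduce the 62%, 47% and 34% reported, and the density works out to roughly one SNP per 1.3 kb of overlap.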

  19. Protein backbone chemical shifts predicted from searching a database for torsion angle and sequence homology

    International Nuclear Information System (INIS)

    Shen Yang; Bax, Ad

    2007-01-01

    Chemical shifts of nuclei in or attached to a protein backbone are exquisitely sensitive to their local environment. A computer program, SPARTA, is described that uses this correlation with local structure to predict protein backbone chemical shifts, given an input three-dimensional structure, by searching a newly generated database for triplets of adjacent residues that provide the best match in φ/ψ/χ1 torsion angles and sequence similarity to the query triplet of interest. The database contains 15N, 1HN, 1Hα, 13Cα, 13Cβ and 13C' chemical shifts for 200 proteins for which a high-resolution X-ray (≤2.4 Å) structure is available. The relative importance of the weighting factors for the φ/ψ/χ1 angles and sequence similarity was optimized empirically. The weighted, average secondary shifts of the central residues in the 20 best-matching triplets, after inclusion of nearest neighbor, ring current, and hydrogen bonding effects, are used to predict chemical shifts for the protein of known structure. Validation shows good agreement between the SPARTA-predicted and experimental shifts, with standard deviations of 2.52, 0.51, 0.27, 0.98, 1.07 and 1.08 ppm for 15N, 1HN, 1Hα, 13Cα, 13Cβ and 13C', respectively, including outliers.
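
    The final averaging step described above can be sketched in a few lines of Python. This is not SPARTA's code: the function name, the (score, shift) tuple layout and the use of the match score itself as the weight are assumptions made for illustration, and the nearest neighbor, ring current and hydrogen bonding corrections are left out.

```python
def predict_secondary_shift(matches, n_best=20):
    """Weighted mean secondary shift over the n_best highest-scoring matches.

    matches: iterable of (score, secondary_shift_ppm) pairs, one per database
    triplet, where a higher score means a better torsion-angle and sequence
    match. Using the score as the weight is an assumption of this sketch.
    """
    best = sorted(matches, key=lambda m: m[0], reverse=True)[:n_best]
    total_weight = sum(score for score, _ in best)
    return sum(score * shift for score, shift in best) / total_weight


# Invented toy data: three candidate triplets, keep the two best.
toy_matches = [(1.0, 2.0), (3.0, 4.0), (0.5, 100.0)]
toy_prediction = predict_secondary_shift(toy_matches, n_best=2)
```

    With the toy data the poorly matching third triplet is discarded, so its outlying shift does not affect the prediction.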

  20. cDNA, genomic sequence cloning and overexpression of ribosomal ...

    African Journals Online (AJOL)

    RPS16 of eukaryote is a component of the 40S small ribosomal subunit encoded by RPS16 gene and is also a homolog of prokaryotic RPS9. The cDNA and genomic sequence of RPS16 was cloned successfully for the first time from the Giant Panda (Ailuropoda melanoleuca) using reverse transcription-polymerase chain ...

  1. Computing Homology Group Generators of Images Using Irregular Graph Pyramids

    OpenAIRE

    Peltier, Samuel; Ion, Adrian; Haxhimusa, Yll; Kropatsch, Walter; Damiand, Guillaume

    2007-01-01

    We introduce a method for computing homology groups and their generators of a 2D image, using a hierarchical structure, i.e. an irregular graph pyramid. Starting from an image, a hierarchy of the image is built by two operations that preserve the homology of each region. Instead of computing homology generators in the base, where the number of entities (cells) is large, we first reduce the number of cells by a graph pyramid. Then homology generators are computed efficiently on...

  2. Homologs of the Acinetobacter baumannii AceI transporter represent a new family of bacterial multidrug efflux systems.

    Science.gov (United States)

    Hassan, Karl A; Liu, Qi; Henderson, Peter J F; Paulsen, Ian T

    2015-02-10

    Multidrug efflux systems are a major cause of resistance to antimicrobials in bacteria, including those pathogenic to humans, animals, and plants. These proteins are ubiquitous in these pathogens, and five families of bacterial multidrug efflux systems have been identified to date. By using transcriptomic and biochemical analyses, we recently identified the novel AceI (Acinetobacter chlorhexidine efflux) protein from Acinetobacter baumannii that conferred resistance to the biocide chlorhexidine, via an active efflux mechanism. Proteins homologous to AceI are encoded in the genomes of many other bacterial species and are particularly prominent within proteobacterial lineages. In this study, we expressed 23 homologs of AceI and examined their resistance and/or transport profiles. MIC analyses demonstrated that, like AceI, many of the homologs conferred resistance to chlorhexidine. Many of the AceI homologs conferred resistance to additional biocides, including benzalkonium, dequalinium, proflavine, and acriflavine. We conducted fluorimetric transport assays using the AceI homolog from Vibrio parahaemolyticus and confirmed that resistance to both proflavine and acriflavine was mediated by an active efflux mechanism. These results show that this group of AceI homologs represent a new family of bacterial multidrug efflux pumps, which we have designated the proteobacterial antimicrobial compound efflux (PACE) family of transport proteins. Bacterial multidrug efflux pumps are an important class of resistance determinants that can be found in every bacterial genome sequenced to date. These transport proteins have important protective functions for the bacterial cell but are a significant problem in the clinical setting, since a single efflux system can mediate resistance to many structurally and mechanistically diverse antibiotics and biocides. 
In this study, we demonstrate that proteins related to the Acinetobacter baumannii AceI transporter are a new class of multidrug

  3. New acute transforming feline retrovirus with fms homology specifies a C-terminally truncated version of the c-fms protein that is different from SM-feline sarcoma virus v-fms protein

    International Nuclear Information System (INIS)

    Besmer, P.; Lader, E.; George, P.C.; Bergold, P.J.; Qui, F.; Zuckerman, E.E.; Hardy, W.D.

    1986-01-01

    The HZ5-feline sarcoma virus (FeSV) is a new acute transforming feline retrovirus which was isolated from a multicentric fibrosarcoma of a domestic cat. The HZ5-FeSV transforms fibroblasts in vitro and is replication defective. A biologically active integrated HZ5-FeSV provirus was molecularly cloned from cellular DNA of HZ5-FeSV-infected FRE-3A rat cells. The HZ5-FeSV has oncogene homology with the fms sequences of the SM-FeSV. The genome organization of the 8.6-kilobase HZ5-FeSV provirus is 5' Δgag-fms-Δpol-Δenv 3'. The HZ5- and SM-FeSVs display indistinguishable in vitro transformation characteristics, and the structures of the gag-fms transforming genes in the two viruses are very similar. In the HZ5-FeSV and the SM-FeSV, identical c-fms and feline leukemia virus p10 sequences form the 5' gag-fms junction. With regard to v-fms the two viruses are homologous up to 11 amino acids before the C terminus of the SM-FeSV v-fms protein. In HZ5-FeSV a segment of 362 nucleotides then follows before the 3' recombination site with feline leukemia virus pol. The new 3' v-fms sequence encodes 27 amino acids before reaching a TGA termination signal. The relationship of this sequence with the recently characterized human c-fms sequence has been examined. The 3' HZ5-FeSV v-fms sequence is homologous with 3' c-fms sequences. A frameshift mutation (11-base-pair deletion) was found in the C-terminal fms coding sequence of the HZ5-FeSV. As a result, the HZ5-FeSV v-fms protein is predicted to be a C-terminally truncated version of c-fms. This frameshift mutation may determine the oncogenic properties of v-fms in the HZ5-FeSV

  4. New acute transforming feline retrovirus with fms homology specifies a C-terminally truncated version of the c-fms protein that is different from SM-feline sarcoma virus v-fms protein

    Energy Technology Data Exchange (ETDEWEB)

    Besmer, P.; Lader, E.; George, P.C.; Bergold, P.J.; Qui, F.; Zuckerman, E.E.; Hardy, W.D.

    1986-10-01

    The HZ5-feline sarcoma virus (FeSV) is a new acute transforming feline retrovirus which was isolated from a multicentric fibrosarcoma of a domestic cat. The HZ5-FeSV transforms fibroblasts in vitro and is replication defective. A biologically active integrated HZ5-FeSV provirus was molecularly cloned from cellular DNA of HZ5-FeSV-infected FRE-3A rat cells. The HZ5-FeSV has oncogene homology with the fms sequences of the SM-FeSV. The genome organization of the 8.6-kilobase HZ5-FeSV provirus is 5' Δgag-fms-Δpol-Δenv 3'. The HZ5- and SM-FeSVs display indistinguishable in vitro transformation characteristics, and the structures of the gag-fms transforming genes in the two viruses are very similar. In the HZ5-FeSV and the SM-FeSV, identical c-fms and feline leukemia virus p10 sequences form the 5' gag-fms junction. With regard to v-fms the two viruses are homologous up to 11 amino acids before the C terminus of the SM-FeSV v-fms protein. In HZ5-FeSV a segment of 362 nucleotides then follows before the 3' recombination site with feline leukemia virus pol. The new 3' v-fms sequence encodes 27 amino acids before reaching a TGA termination signal. The relationship of this sequence with the recently characterized human c-fms sequence has been examined. The 3' HZ5-FeSV v-fms sequence is homologous with 3' c-fms sequences. A frameshift mutation (11-base-pair deletion) was found in the C-terminal fms coding sequence of the HZ5-FeSV. As a result, the HZ5-FeSV v-fms protein is predicted to be a C-terminally truncated version of c-fms. This frameshift mutation may determine the oncogenic properties of v-fms in the HZ5-FeSV.

  5. Inverse statistical physics of protein sequences: a key issues review.

    Science.gov (United States)

    Cocco, Simona; Feinauer, Christoph; Figliuzzi, Matteo; Monasson, Rémi; Weigt, Martin

    2018-03-01

    In the course of evolution, proteins undergo important changes in their amino acid sequences, while their three-dimensional folded structure and their biological function remain remarkably conserved. Thanks to modern sequencing techniques, sequence data accumulate at unprecedented pace. This provides large sets of so-called homologous, i.e. evolutionarily related protein sequences, to which methods of inverse statistical physics can be applied. Using sequence data as the basis for the inference of Boltzmann distributions from samples of microscopic configurations or observables, it is possible to extract information about evolutionary constraints and thus protein function and structure. Here we give an overview over some biologically important questions, and how statistical-mechanics inspired modeling approaches can help to answer them. Finally, we discuss some open questions, which we expect to be addressed over the next years.

  6. Silencing the lettuce homologs of small rubber particle protein does not influence natural rubber biosynthesis in lettuce (Lactuca sativa).

    Science.gov (United States)

    Chakrabarty, Romit; Qu, Yang; Ro, Dae-Kyun

    2015-05-01

    Natural rubber, cis-1,4-polyisoprene, is an important raw material in chemical industries, but its biosynthetic mechanism remains elusive. Natural rubber is known to be synthesized in rubber particles suspended in laticifer cells in the Brazilian rubber tree (Hevea brasiliensis). In the rubber tree, rubber elongation factor (REF) and its homolog, small rubber particle protein (SRPP), were found to be the most abundant proteins in rubber particles, and they have been implicated in natural rubber biosynthesis. As lettuce (Lactuca sativa) can synthesize natural rubber, we utilized this annual, transformable plant to examine in planta roles of the lettuce REF/SRPP homologs by RNA interference. Among eight lettuce REF/SRPP homologs identified, transcripts of two genes (LsSRPP4 and LsSRPP8) accounted for more than 90% of total transcripts of REF/SRPP homologs in lettuce latex. LsSRPP4 displays a typical primary protein sequence as other REF/SRPP, while LsSRPP8 is twice as long as LsSRPP4. These two major LsSRPP transcripts were individually and simultaneously silenced by RNA interference, and relative abundance, polymer molecular weight, and polydispersity of natural rubber were analyzed from the LsSRPP4- and LsSRPP8-silenced transgenic lettuce. Despite previous data suggesting the implications of REF/SRPP in natural rubber biosynthesis, qualitative and quantitative alterations of natural rubber could not be observed in transgenic lettuce lines. It is concluded that lettuce REF/SRPP homologs are not critically important proteins in natural rubber biosynthesis in lettuce. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Down-regulation of the strawberry Bet v 1-homologous allergen in concert with the flavonoid biosynthesis pathway in colorless strawberry mutant

    DEFF Research Database (Denmark)

    Hjernø, Karin; Alm, Rikard; Canbäck, Björn

    2006-01-01

    Proteomic screening of strawberry (Fragaria ananassa) yielded a 58% success rate in protein identification in spite of the fact that no genomic sequence is available for this species. This was achieved by a combination of MALDI-MS/MS de novo sequencing of double-derivatized peptides and indel-tolerant searching against local protein databases built on both EST and full-length nucleotide sequences. The amino acid sequence of a strawberry allergen, homologous to the well-known major birch pollen allergen Bet v 1, was partially determined. This strawberry allergen, named Fra a 1 according to the nomenclature for allergen proteins, showed sequence identity of 54 and 77%, respectively, with corresponding allergens from birch and apple. Differential expression, as evaluated by 2-D DIGE, occurred in 10% of protein spots when red strawberries were compared to a colorless (white) strawberry mutant. White...

  8. A decline in transcript abundance for Heterodera glycines homologs of Caenorhabditis elegans uncoordinated genes accompanies its sedentary parasitic phase

    Directory of Open Access Journals (Sweden)

    Overall Christopher C

    2007-04-01

    Full Text Available Background: Heterodera glycines (soybean cyst nematode [SCN]), the major pathogen of Glycine max (soybean), undergoes muscle degradation (sarcopenia) as it becomes sedentary inside the root. Many genes encoding muscular and neuromuscular components belong to the uncoordinated (unc) family of genes originally identified in Caenorhabditis elegans. Previously, we reported a substantial decrease in transcript abundance for Hg-unc-87, the H. glycines homolog of unc-87 (calponin), during the adult sedentary phase of SCN. These observations implied that changes in the expression of specific muscle genes occurred during sarcopenia. Results: We developed a bioinformatics database that compares expressed sequence tag (EST) and genomic data of C. elegans and H. glycines (the CeHg database). We identify H. glycines homologs of C. elegans unc genes whose protein products are involved in muscle composition and regulation. RT-PCR reveals the transcript abundance of H. glycines unc homologs at mobile and sedentary stages of its lifecycle. A prominent reduction in transcript abundance occurs in samples from sedentary nematodes for homologs of actin, unc-60B (cofilin), unc-89, unc-15 (paramyosin), unc-27 (troponin I), unc-54 (myosin), and the potassium channel unc-110 (twk-18). Less reduction is observed for the focal adhesion complex gene Hg-unc-97. Conclusion: The CeHg bioinformatics database is shown to be useful in identifying homologs of genes whose protein products perform roles in specific aspects of H. glycines muscle biology. Our bioinformatics comparison of C. elegans and H. glycines genomic data and our Hg-unc-87 expression experiments demonstrate that the transcript abundance of specific H. glycines homologs of muscle genes decreases as the nematode becomes sedentary inside the root during its parasitic feeding stages.

  9. CDNA encoding a polypeptide including a hevein sequence

    Science.gov (United States)

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  10. The chlamydial functional homolog of KsgA confers kasugamycin sensitivity to Chlamydia trachomatis and impacts bacterial fitness

    Directory of Open Access Journals (Sweden)

    Maurelli Anthony T

    2009-12-01

    Full Text Available Background: rRNA adenine dimethyltransferases, represented by the Escherichia coli KsgA protein, are highly conserved phylogenetically and are generally not essential for growth. They are responsible for the post-transcriptional transfer of two methyl groups to two universally conserved adenosines located near the 3' end of the small subunit rRNA and participate in ribosome maturation. All sequenced genomes of Chlamydia reveal a ksgA homolog in each species, including C. trachomatis. Yet the absence of an S-adenosyl-methionine synthetase in Chlamydia, the conserved enzyme involved in the synthesis of the methyl donor S-adenosyl-L-methionine, raises a doubt concerning the activity of the KsgA homolog in these organisms. Results: Lack of the dimethylated adenosines following ksgA inactivation confers resistance to kasugamycin (KSM) in E. coli. Expression of the C. trachomatis L2 KsgA ortholog restored KSM sensitivity to the E. coli ksgA mutant, suggesting that the chlamydial KsgA homolog has specific rRNA dimethylase activity. C. trachomatis growth was sensitive to KSM, and we were able to isolate a KSM-resistant mutant of C. trachomatis containing a frameshift mutation in ksgA, which led to the formation of a shorter protein with no activity. Growth of the C. trachomatis ksgA mutant was negatively affected in cell culture, highlighting the importance of the methylase in the development of these obligate intracellular and as yet genetically intractable pathogens. Conclusion: The presence of a functional rRNA dimethylase enzyme belonging to the KsgA family in Chlamydia presents an excellent chemotherapeutic target with real potential. It also confirms the existence of S-adenosyl-methionine-dependent methylation reactions in Chlamydia, raising the question of how these organisms acquire this cofactor.

  11. I-SceI-mediated double-strand break does not increase the frequency of homologous recombination at the Dct locus in mouse embryonic stem cells.

    Science.gov (United States)

    Fenina, Myriam; Simon-Chazottes, Dominique; Vandormael-Pournin, Sandrine; Soueid, Jihane; Langa, Francina; Cohen-Tannoudji, Michel; Bernard, Bruno A; Panthier, Jean-Jacques

    2012-01-01

    Targeted induction of double-strand breaks (DSBs) at natural endogenous loci was shown to increase the rate of gene replacement by homologous recombination in mouse embryonic stem cells. The gene encoding dopachrome tautomerase (Dct) is specifically expressed in melanocytes and their precursors. To construct a genetic tool allowing the replacement of Dct gene by any gene of interest, we generated an embryonic stem cell line carrying the recognition site for the yeast I-SceI meganuclease embedded in the Dct genomic segment. The embryonic stem cell line was electroporated with an I-SceI expression plasmid, and a template for the DSB-repair process that carried sequence homologies to the Dct target. The I-SceI meganuclease was indeed able to introduce a DSB at the Dct locus in live embryonic stem cells. However, the level of gene targeting was not improved by the DSB induction, indicating a limited capacity of I-SceI to mediate homologous recombination at the Dct locus. These data suggest that homologous recombination by meganuclease-induced DSB may be locus dependent in mammalian cells.

  12. Heteromorphic Sex Chromosomes: Navigating Meiosis without a Homologous Partner

    OpenAIRE

    Checchi, Paula M.; Engebrecht, JoAnne

    2011-01-01

    Accurate chromosome segregation during meiosis relies on homology between the maternal and paternal chromosomes. Yet by definition, sex chromosomes of the heterogametic sex lack a homologous partner. Recent studies in a number of systems have shed light on the unique meiotic behavior of heteromorphic sex chromosomes, and highlight both the commonalities and differences in divergent species. During meiotic prophase, the homology-dependent processes of pairing, synapsis, and recombination have ...

  13. Induction and repair of DNA double strand breaks: The increasing spectrum of non-homologous end joining pathways

    International Nuclear Information System (INIS)

    Mladenov, Emil; Iliakis, George

    2011-01-01

    A defining characteristic of damage induced in the DNA by ionizing radiation (IR) is its clustered character that leads to the formation of complex lesions challenging the cellular repair mechanisms. The most widely investigated such complex lesion is the DNA double strand break (DSB). DSBs undermine chromatin stability and challenge the repair machinery because an intact template strand is lacking to assist restoration of integrity and sequence in the DNA molecule. Therefore, cells have evolved a sophisticated machinery to detect DSBs and coordinate a response on the basis of inputs from various sources. A central function of cellular responses to DSBs is the coordination of DSB repair. Two conceptually different mechanisms can in principle remove DSBs from the genome of cells of higher eukaryotes. Homologous recombination repair (HRR) uses as template a homologous DNA molecule and is therefore error-free; it functions preferentially in the S and G2 phases. Non-homologous end joining (NHEJ), on the other hand, simply restores DNA integrity by joining the two ends, is error prone as sequence is only fortuitously preserved and active throughout the cell cycle. The basis of DSB repair pathway choice remains unknown, but cells of higher eukaryotes appear programmed to utilize preferentially NHEJ. Recent work suggests that when the canonical DNA-PK dependent pathway of NHEJ (D-NHEJ), becomes compromised an alternative NHEJ pathway and not HRR substitutes in a quasi-backup function (B-NHEJ). Here, we outline aspects of DSB induction by IR and review the mechanisms of their processing in cells of higher eukaryotes. We place particular emphasis on backup pathways of NHEJ and summarize their increasing significance in various cellular processes, as well as their potential contribution to carcinogenesis.

  14. Description of durancin TW-49M, a novel enterocin B-homologous bacteriocin in carrot-isolated Enterococcus durans QU 49.

    Science.gov (United States)

    Hu, C-B; Zendo, T; Nakayama, J; Sonomoto, K

    2008-09-01

    To characterize the novel bacteriocin produced by Enterococcus durans. Enterococcus durans QU 49 was isolated from carrot and expressed bactericidal activity over 20-43 °C. Bacteriocins were purified to homogeneity using a three-step purification method; one of them, termed durancin TW-49M, was an enterocin B-homologous peptide with most identical residues occurring in the N-terminus. Durancin TW-49M was more tolerant of acidic than of alkaline conditions. DNA sequencing analysis revealed that durancin TW-49M was translated as a prepeptide of the double-glycine type. Durancin TW-49M and enterocin B expressed similar antimicrobial spectra, in which no significant variation due to the diversity in their C-termini was observed. Durancin TW-49M, a novel nonpediocin-like class II bacteriocin, was characterized at the amino acid and genetic levels. The diverse C-terminal parts of durancin TW-49M and enterocin B are unlikely to be the region that determines target cell specificity. This is the first comprehensive study of a novel bacteriocin produced by Ent. durans. The high homology of the N-terminal halves of durancin TW-49M and enterocin B makes them suitable for studying the structure-function relationship of bacteriocins and their immunity proteins.

  15. ChickVD: a sequence variation database for the chicken genome

    DEFF Research Database (Denmark)

    Wang, Jing; He, Ximiao; Ruan, Jue

    2005-01-01

    Working in parallel with the efforts to sequence the chicken (Gallus gallus) genome, the Beijing Genomics Institute led an international team of scientists from China, USA, UK, Sweden, The Netherlands and Germany to map extensive DNA sequence variation throughout the chicken genome by sampling DN...... on quantitative trait loci using data from collaborating institutions and public resources. Our data can be queried by search engine and homology-based BLAST searches. ChickVD is publicly accessible at http://chicken.genomics.org.cn. Publication date: 2005-Jan-1...

  16. Draft genome sequence of the coccolithovirus Emiliania huxleyi virus 202.

    Science.gov (United States)

    Nissimov, Jozef I; Worthy, Charlotte A; Rooks, Paul; Napier, Johnathan A; Kimmance, Susan A; Henn, Matthew R; Ogata, Hiroyuki; Allen, Michael J

    2012-02-01

    Emiliania huxleyi virus 202 (EhV-202) is a member of the Coccolithoviridae, a group of viruses that infect the marine coccolithophorid Emiliania huxleyi. EhV-202 has a 160- to 180-nm-diameter icosahedral structure and a genome of approximately 407 kbp, consisting of 485 coding sequences (CDSs). Here we describe the genomic features of EhV-202, together with a draft genome sequence and its annotation, highlighting the homology and heterogeneity of this genome in comparison with the EhV-86 reference genome.

  17. The application of the high throughput sequencing technology in the transposable elements.

    Science.gov (United States)

    Liu, Zhen; Xu, Jian-hong

    2015-09-01

    High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.

  18. The first insight into the salvia (lamiaceae) genome via bac library construction and high-throughput sequencing of target bac clones

    International Nuclear Information System (INIS)

    Hao, D.C.; Vautrin, S.; Berges, H.; Chen, S.L.

    2015-01-01

    Salvia is a representative genus of Lamiaceae, a eudicot family with significant species diversity and population adaptability. One of the key goals of Salvia genomics research is to identify genes of adaptive significance. This information may help to improve the conservation of adaptive genetic variation and the management of medicinal plants to increase their health and productivity. Large-insert genomic libraries are a fundamental tool for achieving this purpose. We report herein the construction, characterization and screening of a gridded BAC library for Salvia officinalis (sage). The S. officinalis BAC library consists of 17,764 clones and the average insert size is 107 kb, corresponding to 3 haploid genome equivalents. Seventeen positive clones (average insert size 115 kb) containing five terpene synthase (TPS) genes were identified by PCR screening, and 12 of them were subjected to Illumina HiSeq 2000 sequencing, which yielded 28,097,480 90-bp raw reads (2.53 Gb). Scaffolds containing sabinene synthase (Sab), a Sab homolog, TPS3 (kaurene synthase-like 2), copalyl diphosphate synthase 2 and one cytochrome P450 gene were retrieved via de novo assembly and annotation; these scaffolds also contain flanking noncoding sequences, including predicted promoters and repeat sequences. Among 2,638 repeat sequences, there are 330 amplifiable microsatellites. This BAC library provides a new resource for Lamiaceae genomic studies, including microsatellite marker development, physical mapping, comparative genomics and genome sequencing. Characterization of positive clones provided insights into the structure of the Salvia genome. These sequences will be used in the assembly of a future genome sequence for S. officinalis.
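
    The library and sequencing figures above can be cross-checked with simple arithmetic. The Python sketch below uses only numbers quoted in the abstract; the haploid genome size it derives is merely what those numbers imply, not a value reported in the record, and the variable names are chosen here for illustration.

```python
# Figures quoted in the abstract above.
n_clones = 17_764            # clones in the BAC library
mean_insert_kb = 107         # average insert size, kb
genome_equivalents = 3       # stated library coverage

n_reads = 28_097_480         # Illumina raw reads
read_length_bp = 90          # read length, bp

# Total DNA held in the library, in kb.
library_kb = n_clones * mean_insert_kb

# Haploid genome size implied by the stated coverage (derived, not reported).
implied_genome_mb = library_kb / genome_equivalents / 1000

# Raw sequencing yield in bases and in Gb.
raw_bases = n_reads * read_length_bp
raw_gb = raw_bases / 1e9

print(f"library size: {library_kb / 1e6:.2f} Gb")
print(f"implied haploid genome: {implied_genome_mb:.0f} Mb")
print(f"raw sequencing yield: {raw_gb:.2f} Gb")
```

    The raw yield reproduces the 2.53 Gb stated, and the library holds about 1.9 Gb of insert DNA in total.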

  19. Homologous Recombination as a Replication Fork Escort: Fork-Protection and Recovery

    Directory of Open Access Journals (Sweden)

    Audrey Costes

    2012-12-01

    Full Text Available Homologous recombination is a universal mechanism that allows DNA repair and ensures the efficiency of DNA replication. The substrate initiating the process of homologous recombination is a single-stranded DNA that promotes a strand exchange reaction resulting in a genetic exchange that promotes genetic diversity and DNA repair. The molecular mechanisms by which homologous recombination repairs a double-strand break have been extensively studied and are now well characterized. However, the mechanisms by which homologous recombination contributes to DNA replication in eukaryotes remain poorly understood. Studies in bacteria have identified multiple roles for the machinery of homologous recombination at replication forks. Here, we review our understanding of the molecular pathways involving the homologous recombination machinery to support the robustness of DNA replication. In addition to its role in fork-recovery and in rebuilding a functional replication fork apparatus, homologous recombination may also act as a fork-protection mechanism. We discuss the possibility that some of the fork-escort functions of homologous recombination are achieved by loading of the recombination machinery at inactivated forks without a need for a strand exchange step, as well as the consequences of such a model for the stability of eukaryotic genomes.

  20. Draft genome sequence of Cicer reticulatum L., the wild progenitor of chickpea provides a resource for agronomic trait improvement.

    Science.gov (United States)

    Gupta, Sonal; Nawaz, Kashif; Parween, Sabiha; Roy, Riti; Sahu, Kamlesh; Kumar Pole, Anil; Khandal, Hitaishi; Srivastava, Rishi; Kumar Parida, Swarup; Chattopadhyay, Debasis

    2017-02-01

    Cicer reticulatum L. is the wild progenitor of the fourth most important legume crop, chickpea (C. arietinum L.). We assembled short-read sequences into a 416 Mb draft genome of C. reticulatum and anchored 78% (327 Mb) of this assembly to eight linkage groups. Genome annotation predicted 25,680 protein-coding genes covering more than 90% of predicted gene space. The genome assembly shared substantial synteny and conservation of gene order with the genome of the model legume Medicago truncatula. Resistance gene homologs of wild and domesticated chickpeas showed high sequence homology and conserved synteny. Comparison of gene sequences and nucleotide diversity using 66 wild and domesticated chickpea accessions suggested that the desi type chickpea was genetically closer to the wild species than the kabuli type. Comparative analyses predicted gene flow between the wild and the cultivated species during domestication. Molecular diversity and population genetic structure determination using 15,096 genome-wide single nucleotide polymorphisms revealed an admixed domestication pattern among cultivated (desi and kabuli) and wild chickpea accessions belonging to three population groups, reflecting significant influence of parentage or geographical origin on their cultivar-specific population classification. The assembly and the polymorphic sequence resources presented here would facilitate the study of chickpea domestication and targeted use of wild Cicer germplasms for agronomic trait improvement in chickpea. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  1. A geometric model for Hochschild homology of Soergel bimodules

    DEFF Research Database (Denmark)

    Webster, Ben; Williamson, Geordie

    2008-01-01

    An important step in the calculation of the triply graded link homology of Khovanov and Rozansky is the determination of the Hochschild homology of Soergel bimodules for SL(n). We present a geometric model for this Hochschild homology for any simple group G, as B–equivariant intersection cohomology...... on generators whose degree is explicitly determined by the geometry of the orbit closure, and to describe its Hilbert series, proving a conjecture of Jacob Rasmussen....

  2. Intron loss from the NADH dehydrogenase subunit 4 gene of lettuce mitochondrial DNA: evidence for homologous recombination of a cDNA intermediate.

    Science.gov (United States)

    Geiss, K T; Abbas, G M; Makaroff, C A

    1994-04-01

    The mitochondrial gene coding for subunit 4 of the NADH dehydrogenase complex I (nad4) has been isolated and characterized from lettuce, Lactuca sativa. Analysis of nad4 genes in a number of plants by Southern hybridization had previously suggested that the intron content varied between species. Characterization of the lettuce gene confirms this observation. Lettuce nad4 contains two exons and one group IIA intron, whereas previously sequenced nad4 genes from turnip and wheat contain three group IIA introns. Northern analysis identified a transcript of 1600 nucleotides, which represents the mature nad4 mRNA and a primary transcript of 3200 nucleotides. Sequence analysis of lettuce and turnip nad4 cDNAs was used to confirm the intron/exon border sequences and to examine RNA editing patterns. Editing is observed at the 5' and 3' ends of the lettuce transcript, but is absent from sequences that correspond to exons two, three and the 5' end of exon four in turnip and wheat. In contrast, turnip transcripts are highly edited in this region, suggesting that homologous recombination of an edited and spliced cDNA intermediate was involved in the loss of introns two and three from an ancestral lettuce nad4 gene.

  3. Nicotiana small RNA sequences support a host genome origin of cucumber mosaic virus satellite RNA.

    Directory of Open Access Journals (Sweden)

    Kiran Zahid

    2015-01-01

    Full Text Available Satellite RNAs (satRNAs) are small noncoding subviral RNA pathogens in plants that depend on helper viruses for replication and spread. Despite many decades of research, the origin of satRNAs remains unknown. In this study we show that a β-glucuronidase (GUS) transgene fused with a Cucumber mosaic virus (CMV) Y satellite RNA (Y-Sat) sequence (35S-GUS:Sat) was transcriptionally repressed in N. tabacum in comparison to a 35S-GUS transgene that did not contain the Y-Sat sequence. This repression was not due to DNA methylation at the 35S promoter, but was associated with specific DNA methylation at the Y-Sat sequence. Both northern blot hybridization and small RNA deep sequencing detected 24-nt siRNAs in wild-type Nicotiana plants with sequence homology to Y-Sat, suggesting that the N. tabacum genome contains Y-Sat-like sequences that give rise to 24-nt sRNAs capable of guiding RNA-directed DNA methylation (RdDM) to the Y-Sat sequence in the 35S-GUS:Sat transgene. Consistent with this, Southern blot hybridization detected multiple DNA bands in Nicotiana plants that had sequence homology to Y-Sat, suggesting that Y-Sat-like sequences exist in the Nicotiana genome as repetitive DNA, a DNA feature associated with 24-nt sRNAs. Our results point to a host genome origin for CMV satRNAs, and suggest a novel approach of using small RNA sequences for finding the origin of other satRNAs.

  4. Molecular cloning of a human glycophorin B cDNA: nucleotide sequence and genomic relationship to glycophorin A

    International Nuclear Information System (INIS)

    Siebert, P.D.; Fukuda, M.

    1987-01-01

    The authors describe the isolation and nucleotide sequence of a human glycophorin B cDNA. The cDNA was identified by differential hybridization of synthetic oligonucleotide probes to a human erythroleukemic cell line (K562) cDNA library constructed in phage vector λgt10. The nucleotide sequence of the glycophorin B cDNA was compared with that of a previously cloned glycophorin A cDNA. The nucleotide sequences encoding the NH2-terminal leader peptide and first 26 amino acids of the two proteins are nearly identical. This homologous region is followed by areas specific to either glycophorin A or B and a number of small regions of homology, which in turn are followed by a very homologous region encoding the presumed membrane-spanning portion of the proteins. They used RNA blot hybridization with both cDNA and synthetic oligonucleotide probes to prove their previous hypothesis that glycophorin B is encoded by a single 0.5- to 0.6-kb mRNA and to show that glycophorins A and B are negatively and coordinately regulated by a tumor-promoting phorbol ester, phorbol 12-myristate 13-acetate. They established the intron/exon structure of the glycophorin A and B genes by oligonucleotide mapping; the results suggest a complex evolution of the glycophorin genes.

  5. The complete genome sequence of human adenovirus 84, a highly recombinant new Human mastadenovirus D type with a unique fiber gene.

    Science.gov (United States)

    Kaján, Győző L; Kajon, Adriana E; Pinto, Alexis Castillo; Bartha, Dániel; Arnberg, Niklas

    2017-10-15

    A novel human adenovirus was isolated from a pediatric case of acute respiratory disease in Panama City, Panama in 2011. The clinical isolate was initially identified as an intertypic recombinant based on hexon and fiber gene sequencing. Based on the analysis of its complete genome sequence, the novel complex recombinant Human mastadenovirus D (HAdV-D) strain was classified into a new HAdV type: HAdV-84, and it was designated Adenovirus D human/PAN/P309886/2011/84[P43H17F84]. HAdV-D types usually have an ocular or gastrointestinal tropism, and respiratory association is rarely reported. The virus has a novel fiber type, most closely related to, but still clearly distant from that of HAdV-36. The predicted fiber is hypothesised to bind sialic acid with lower affinity compared to HAdV-37. Bioinformatic analysis of the complete genomic sequence of HAdV-84 revealed multiple homologous recombination events and provided deeper insight into HAdV evolution. Copyright © 2017 Elsevier B.V. All rights reserved.

  6. VITAL NMR: using chemical shift derived secondary structure information for a limited set of amino acids to assess homology model accuracy

    Energy Technology Data Exchange (ETDEWEB)

    Brothers, Michael C.; Nesbitt, Anna E.; Hallock, Michael J. [University of Illinois at Urbana-Champaign, Department of Chemistry (United States); Rupasinghe, Sanjeewa G. [University of Illinois at Urbana-Champaign, Department of Cell and Developmental Biology (United States); Tang Ming [University of Illinois at Urbana-Champaign, Department of Chemistry (United States); Harris, Jason; Baudry, Jerome [University of Tennessee, Department of Biochemistry, Cellular and Molecular Biology (United States); Schuler, Mary A. [University of Illinois at Urbana-Champaign, Department of Cell and Developmental Biology (United States); Rienstra, Chad M., E-mail: rienstra@illinois.edu [University of Illinois at Urbana-Champaign, Department of Chemistry (United States)

    2012-01-15

    Homology modeling is a powerful tool for predicting protein structures, whose success depends on obtaining a reasonable alignment between a given structural template and the protein sequence being analyzed. In order to leverage greater predictive power for proteins with few structural templates, we have developed a method to rank homology models based upon their compliance to secondary structure derived from experimental solid-state NMR (SSNMR) data. Such data is obtainable in a rapid manner by simple SSNMR experiments (e.g., ¹³C-¹³C 2D correlation spectra). To test our homology model scoring procedure for various amino acid labeling schemes, we generated a library of 7,474 homology models for 22 protein targets culled from the TALOS+/SPARTA+ training set of protein structures. Using subsets of amino acids that are plausibly assigned by SSNMR, we discovered that pairs of the residues Val, Ile, Thr, Ala and Leu (VITAL) emulate an ideal dataset where all residues are site specifically assigned. Scoring the models with a predicted VITAL site-specific dataset and calculating secondary structure with the Chemical Shift Index resulted in a Pearson correlation coefficient (-0.75) commensurate to the control (-0.77), where secondary structure was scored site specifically for all amino acids (ALL 20) using STRIDE. This method promises to accelerate structure procurement by SSNMR for proteins with unknown folds through guiding the selection of remotely homologous protein templates and assessing model quality.

  7. VITAL NMR: Using Chemical Shift Derived Secondary Structure Information for a Limited Set of Amino Acids to Assess Homology Model Accuracy

    Energy Technology Data Exchange (ETDEWEB)

    Brothers, Michael C [University of Illinois, Urbana-Champaign; Nesbitt, Anna E [University of Illinois, Urbana-Champaign; Hallock, Michael J [University of Illinois, Urbana-Champaign; Rupasinghe, Sanjeewa [University of Illinois, Urbana-Champaign; Tang, Ming [University of Illinois, Urbana-Champaign; Harris, Jason B [ORNL; Baudry, Jerome Y [ORNL; Schuler, Mary A [University of Illinois, Urbana-Champaign; Rienstra, Chad M [University of Illinois, Urbana-Champaign

    2011-01-01

    Homology modeling is a powerful tool for predicting protein structures, whose success depends on obtaining a reasonable alignment between a given structural template and the protein sequence being analyzed. In order to leverage greater predictive power for proteins with few structural templates, we have developed a method to rank homology models based upon their compliance to secondary structure derived from experimental solid-state NMR (SSNMR) data. Such data is obtainable in a rapid manner by simple SSNMR experiments (e.g., ¹³C-¹³C 2D correlation spectra). To test our homology model scoring procedure for various amino acid labeling schemes, we generated a library of 7,474 homology models for 22 protein targets culled from the TALOS+/SPARTA+ training set of protein structures. Using subsets of amino acids that are plausibly assigned by SSNMR, we discovered that pairs of the residues Val, Ile, Thr, Ala and Leu (VITAL) emulate an ideal dataset where all residues are site specifically assigned. Scoring the models with a predicted VITAL site-specific dataset and calculating secondary structure with the Chemical Shift Index resulted in a Pearson correlation coefficient (-0.75) commensurate to the control (-0.77), where secondary structure was scored site specifically for all amino acids (ALL 20) using STRIDE. This method promises to accelerate structure procurement by SSNMR for proteins with unknown folds through guiding the selection of remotely homologous protein templates and assessing model quality.

  8. Homology of normal chains and cohomology of charges

    CERN Document Server

    Pauw, Th De; Pfeffer, W F

    2017-01-01

    The authors consider a category of pairs of compact metric spaces and Lipschitz maps where the pairs satisfy a linearly isoperimetric condition related to the solvability of the Plateau problem with partially free boundary. It properly includes all pairs of compact Lipschitz neighborhood retracts of a large class of Banach spaces. On this category the authors define homology and cohomology functors with real coefficients which satisfy the Eilenberg-Steenrod axioms, but reflect the metric properties of the underlying spaces. As an example they show that the zero-dimensional homology of a space in their category is trivial if and only if the space is path connected by arcs of finite length. The homology and cohomology of a pair are, respectively, locally convex and Banach spaces that are in duality. Ignoring the topological structures, the homology and cohomology extend to all pairs of compact metric spaces. For locally acyclic spaces, the authors establish a natural isomorphism between their cohomology and the ...

  9. The N-terminal sequence of ribosomal protein L10 from the archaebacterium Halobacterium marismortui and its relationship to eubacterial protein L6 and other ribosomal proteins.

    Science.gov (United States)

    Dijk, J; van den Broek, R; Nasiulas, G; Beck, A; Reinhardt, R; Wittmann-Liebold, B

    1987-08-01

    The amino-terminal sequence of ribosomal protein L10 from Halobacterium marismortui has been determined up to residue 54, using both a liquid- and a gas-phase sequenator. The two sequences are in good agreement. The protein is clearly homologous to protein HcuL10 from the related strain Halobacterium cutirubrum. Furthermore, a weaker but distinct homology to ribosomal protein L6 from Escherichia coli and Bacillus stearothermophilus can be detected. In addition to 7 identical amino acids in the first 36 residues in all four sequences a number of conservative replacements occurs, of mainly hydrophobic amino acids. In this common region the pattern of conserved amino acids suggests the presence of a beta-alpha fold as it occurs in ribosomal proteins L12 and L30. Furthermore, several potential cases of homology to other ribosomal components of the three ur-kingdoms have been found.

  10. Reference Genome-Directed Resolution of Homologous and Homeologous Relationships within and between Different Oat Linkage Maps

    Directory of Open Access Journals (Sweden)

    Juan J. Gutierrez-Gonzalez

    2011-11-01

    Full Text Available Genome research on oat (Avena sativa L.) has received less attention than wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.) because it is a less prominent component of the human food system. To assess the potential of the model grass Brachypodium distachyon (L.) P. Beauv. as a surrogate for oat genome research, the whole genome sequence (WGS) of B. distachyon was employed for comparative analysis with oat genetic linkage maps. Sequences of mapped molecular markers from one diploid Avena spp. and two hexaploid oat maps were aligned to the B. distachyon WGS to infer syntenic relationships. Diploid Avena and B. distachyon exhibit a high degree of synteny with 18 syntenic blocks covering 87% of the oat genome, which permitted postulation of an ancestral Avena spp. chromosome structure. Synteny between oat and B. distachyon was also prevalent, with 50 syntenic blocks covering 76.6% of the ‘Kanota’ × ‘Ogle’ linkage map. Coalignment of diploid and hexaploid maps to B. distachyon helped resolve homeologous relationships between different oat linkage groups but also revealed many major rearrangements in oat subgenomes. Extending the analysis to a second oat linkage map (‘Ogle’ × ‘TAM O-301’) allowed identification of several putative homologous linkage groups across the two oat populations. These results indicate that the B. distachyon genome sequence will be a useful resource to assist genetics and genomics research in oat. The analytical strategy employed here should be applicable for genome research in other temperate grass crops with modest amounts of genomic data.

  11. Development and Testing of New Gene-Homologous EST-SSRs for Eucalyptus gomphocephala (Myrtaceae)

    Directory of Open Access Journals (Sweden)

    Donna Bradbury

    2013-07-01

    Full Text Available Premise of the study: New microsatellite (simple sequence repeat [SSR] primers were developed from Eucalyptus expressed sequence tags (ESTs and optimized for genetic studies of the southwestern Australian tree E. gomphocephala, which is severely impacted by tree health decline and habitat fragmentation. Methods and Results: A total of 133 gene-homologous EST-SSR primer pairs were designed for Eucalyptus, and 44 were screened in E. gomphocephala. Of these, 17 produced reliable amplification products and 11 were polymorphic. Between two and 13 alleles were observed per locus, and observed heterozygosities ranged from 0.172 to 0.867. All 17 EST-SSRs that amplified E. gomphocephala cross-amplified to at least one of E. marginata, E. camaldulensis, and E. victrix. Conclusions: This set of EST-SSR primer pairs will be valuable tools for future population genetic studies of E. gomphocephala and other eucalypts, particularly for studying gene-linked variation and informing seed-sourcing strategies for ecological restoration.
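
    The two per-locus summary statistics reported above (number of alleles and observed heterozygosity) follow directly from diploid genotype calls. The sketch below shows one way to compute them, assuming genotypes are stored as pairs of allele sizes; the genotype values and function names are invented for illustration and are not data from the study.

```python
def allele_count(genotypes):
    """Number of distinct alleles seen at one locus across all individuals."""
    return len({allele for genotype in genotypes for allele in genotype})


def observed_heterozygosity(genotypes):
    """Fraction of individuals whose two alleles at the locus differ."""
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)


# Hypothetical SSR genotypes (allele sizes in bp) for four individuals.
locus = [(150, 150), (150, 154), (154, 158), (158, 158)]
print(allele_count(locus), observed_heterozygosity(locus))
```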

  12. Analysis of quality raw data of second generation sequencers with Quality Assessment Software.

    Science.gov (United States)

    Ramos, Rommel Tj; Carneiro, Adriana R; Baumbach, Jan; Azevedo, Vasco; Schneider, Maria Pc; Silva, Artur

    2011-04-18

    Second generation technologies have advantages over Sanger; however, they have resulted in new challenges for the genome construction process, especially because of the small size of the reads, despite the high degree of coverage. Independent of the program chosen for the construction process, DNA sequences are superimposed, based on identity, to extend the reads, generating contigs; mismatches indicate a lack of homology and are not included. This process improves our confidence in the sequences that are generated. We developed Quality Assessment Software, with which one can review graphs showing the distribution of quality values from the sequencing reads. This software allows us to adopt more stringent quality standards for sequence data, based on quality-graph analysis and estimated coverage after applying the quality filter, providing acceptable sequence coverage for genome construction from short reads. Quality filtering is a fundamental step in the process of constructing genomes, as it reduces the frequency of incorrect alignments that are caused by measuring errors, which can occur during the construction process due to the size of the reads, provoking misassemblies. Application of quality filters to sequence data, using the software Quality Assessment, along with graphing analyses, provided greater precision in the definition of cutoff parameters, which increased the accuracy of genome construction.
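
    The general idea of filtering reads by quality and then re-estimating coverage can be sketched in a few lines. This is not the authors' Quality Assessment Software: the Phred+33 quality encoding, the mean-quality cutoff of 20, and all function names are assumptions chosen for illustration.

```python
def mean_phred(quality_string, offset=33):
    """Mean Phred score of a read, assuming Phred+33 ASCII encoding."""
    return sum(ord(ch) - offset for ch in quality_string) / len(quality_string)


def filter_reads(reads, min_mean_quality=20):
    """Keep (sequence, quality) pairs whose mean quality meets the cutoff."""
    return [
        (seq, qual)
        for seq, qual in reads
        if qual and mean_phred(qual) >= min_mean_quality
    ]


def estimated_coverage(reads, genome_size_bp):
    """Estimated coverage: total retained bases divided by genome size."""
    return sum(len(seq) for seq, _ in reads) / genome_size_bp


# Toy reads: 'I' encodes Q40, '5' encodes Q20, '!' encodes Q0.
reads = [("ACGTACGT", "IIIIIIII"), ("ACGTACGT", "!!!!!!!!"), ("ACGT", "5555")]
kept = filter_reads(reads)
print(len(kept), estimated_coverage(kept, genome_size_bp=6))
```

    Raising the cutoff discards more reads and lowers the estimated coverage, which is the trade-off the abstract describes when choosing cutoff parameters.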

  13. PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.

    Science.gov (United States)

    Wimmer, Katharina; Wernstedt, Annekatrin

    2014-01-01

    The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method also has the advantage of effectively identifying deletions, splice mutations, and de novo retrotransposon insertions that escape detection by most DNA-based mutation analysis protocols.

  14. The rate of nonallelic homologous recombination in males is highly variable, correlated between monozygotic twins and independent of age.

    Directory of Open Access Journals (Sweden)

    Jacqueline A L MacArthur

    2014-03-01

    Full Text Available Nonallelic homologous recombination (NAHR) between highly similar duplicated sequences generates chromosomal deletions, duplications and inversions, which can cause diverse genetic disorders. Little is known about interindividual variation in NAHR rates and the factors that influence this. We estimated the rate of deletion at the CMT1A-REP NAHR hotspot in sperm DNA from 34 male donors, including 16 monozygotic (MZ) co-twins (8 twin pairs) aged 24 to 67 years old. The average NAHR rate was 3.5 × 10⁻⁵ with a seven-fold variation across individuals. Despite good statistical power to detect even a subtle correlation, we observed no relationship between age of unrelated individuals and the rate of NAHR in their sperm, likely reflecting the meiotic-specific origin of these events. We then estimated the heritability of deletion rate by calculating the intraclass correlation (ICC) within MZ co-twins, revealing a significant correlation between MZ co-twins (ICC = 0.784, p = 0.0039), with MZ co-twins being significantly more correlated than unrelated pairs. We showed that this heritability cannot be explained by variation in PRDM9, a known regulator of NAHR, or variation within the NAHR hotspot itself. We also did not detect any correlation between Body Mass Index (BMI), smoking status or alcohol intake and rate of NAHR. Our results suggest that other, as yet unidentified, genetic or environmental factors play a significant role in the regulation of NAHR and are responsible for the extensive variation in the population for the probability of fathering a child with a genomic disorder resulting from a pathogenic deletion.
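
    An intraclass correlation for twin pairs can be computed from a one-way analysis of variance. The abstract does not say which ICC form was used, so the sketch below assumes the one-way random-effects ICC with two measurements per pair; the function name and the example rates are hypothetical and are not data from the study.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC for paired measurements (k = 2 per pair).

    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the
    between-pair and within-pair mean squares. Undefined (division by
    zero) if every measurement is identical.
    """
    n, k = len(pairs), 2
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    pair_means = [(a + b) / 2 for a, b in pairs]
    msb = k * sum((m - grand_mean) ** 2 for m in pair_means) / (n - 1)
    msw = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, pair_means)
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical deletion rates (x 10^-5) for four twin pairs.
twin_rates = [(3.1, 3.4), (1.2, 1.0), (5.0, 4.6), (2.2, 2.5)]
print(round(icc_oneway(twin_rates), 3))
```

    Values near 1 mean co-twins resemble each other far more than unrelated individuals do, which is the pattern the abstract reports.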

  15. Highly accurate sequence imputation enables precise QTL mapping in Brown Swiss cattle.

    Science.gov (United States)

    Frischknecht, Mirjam; Pausch, Hubert; Bapst, Beat; Signer-Hasler, Heidi; Flury, Christine; Garrick, Dorian; Stricker, Christian; Fries, Ruedi; Gredler-Grandl, Birgit

    2017-12-29

    Within the last few years a large amount of genomic information has become available in cattle. Densities of genomic information vary from a few thousand variants up to whole genome sequence information. In order to combine genomic information from different sources and infer genotypes for a common set of variants, genotype imputation is required. In this study we evaluated the accuracy of imputation from high density chips to whole genome sequence data in Brown Swiss cattle. Using four popular imputation programs (Beagle, FImpute, Impute2, Minimac) and various compositions of reference panels, the accuracy of the imputed sequence variant genotypes was high and differences between the programs and scenarios were small. We imputed sequence variant genotypes for more than 1600 Brown Swiss bulls and performed genome-wide association studies for milk fat percentage at two stages of lactation. We found one and three quantitative trait loci for early and late lactation fat content, respectively. Known causal variants that were imputed from the sequenced reference panel were among the most significantly associated variants of the genome-wide association study. Our study demonstrates that whole-genome sequence information can be imputed at high accuracy in cattle populations. Using imputed sequence variant genotypes in genome-wide association studies may facilitate causal variant detection.

  16. Library Design-Facilitated High-Throughput Sequencing of Synthetic Peptide Libraries.

    Science.gov (United States)

    Vinogradov, Alexander A; Gates, Zachary P; Zhang, Chi; Quartararo, Anthony J; Halloran, Kathryn H; Pentelute, Bradley L

    2017-11-13

    A methodology to achieve high-throughput de novo sequencing of synthetic peptide mixtures is reported. The approach leverages shotgun nanoliquid chromatography coupled with tandem mass spectrometry-based de novo sequencing of library mixtures (up to 2000 peptides) as well as automated data analysis protocols to filter away incorrect assignments, noise, and synthetic side-products. For increasing the confidence in the sequencing results, mass spectrometry-friendly library designs were developed that enabled unambiguous decoding of up to 600 peptide sequences per hour while maintaining greater than 85% sequence identification rates in most cases. The reliability of the reported decoding strategy was additionally confirmed by matching fragmentation spectra for select authentic peptides identified from library sequencing samples. The methods reported here are directly applicable to screening techniques that yield mixtures of active compounds, including particle sorting of one-bead one-compound libraries and affinity enrichment of synthetic library mixtures performed in solution.

  17. Targeted Capture and High-Throughput Sequencing Using Molecular Inversion Probes (MIPs).

    Science.gov (United States)

    Cantsilieris, Stuart; Stessman, Holly A; Shendure, Jay; Eichler, Evan E

    2017-01-01

    Molecular inversion probes (MIPs) in combination with massively parallel DNA sequencing represent a versatile, yet economical tool for targeted sequencing of genomic DNA. Several thousand genomic targets can be selectively captured using long oligonucleotides containing unique targeting arms and universal linkers. The ability to append sequencing adaptors and sample-specific barcodes allows large-scale pooling and subsequent high-throughput sequencing at relatively low cost per sample. Here, we describe a "wet bench" protocol detailing the capture and subsequent sequencing of >2000 genomic targets from 192 samples, representative of a single lane on the Illumina HiSeq 2000 platform.

  18. Hybridization Capture Using Short PCR Products Enriches Small Genomes by Capturing Flanking Sequences (CapFlank)

    DEFF Research Database (Denmark)

    Tsangaras, Kyriakos; Wales, Nathan; Sicheritz-Pontén, Thomas

    2014-01-01

    , a non-negligible fraction of the resulting sequence reads are not homologous to the bait. We demonstrate that during capture, the bait-hybridized library molecules add additional flanking library sequences iteratively, such that baits limited to targeting relatively short regions (e.g. few hundred...... nucleotides) can result in enrichment across entire mitochondrial and bacterial genomes. Our findings suggest that some of the off-target sequences derived in capture experiments are non-randomly enriched, and that CapFlank will facilitate targeted enrichment of large contiguous sequences with minimal prior...

  19. The nucleotide sequence of human transition protein 1 cDNA

    Energy Technology Data Exchange (ETDEWEB)

    Luerssen, H; Hoyer-Fender, S; Engel, W [Universitaet Goettingen (West Germany)

    1988-08-11

    The authors have screened a human testis cDNA library with an 81-mer oligonucleotide prepared according to part of the published nucleotide sequence of the rat transition protein TP 1. They have isolated a cDNA clone 441 bp in length containing the 162-bp coding region for human transition protein 1. There is about 84% homology in the coding region of the sequence compared to rat. The human cDNA clone encodes a polypeptide of 54 amino acids, of which 7 differ from those of rat.

  20. Lion (Panthera leo) and cheetah (Acinonyx jubatus) IFN-gamma sequences.

    Science.gov (United States)

    Maas, Miriam; Van Rhijn, Ildiko; Allsopp, Maria T E P; Rutten, Victor P M G

    2010-04-15

    Cloning and sequencing of the full length lion and cheetah interferon-gamma (IFN-gamma) transcript will enable the expression of the recombinant cytokine, to be used for production of monoclonal antibodies and to set up lion and cheetah-specific IFN-gamma ELISAs. These are relevant in blood-based diagnosis of bovine tuberculosis, an important threat to lions in the Kruger National Park. Alignment of nucleotide and amino acid sequences of lion and cheetah and that of domestic cats showed homologies of 97-100%. Copyright 2009 Elsevier B.V. All rights reserved.

  1. Identification of three homologous latex-clearing protein (lcp) genes from the genome of Streptomyces sp. strain CFMR 7.

    Science.gov (United States)

    Nanthini, Jayaram; Ong, Su Yean; Sudesh, Kumar

    2017-09-10

    Rubber materials have greatly contributed to human civilization. However, being polymeric materials that do not decompose easily, they have caused huge environmental problems. Only a few bacteria are known to degrade rubber, and studies of them have focused intensively on the mechanism involved in microbial rubber degradation. The Streptomyces sp. strain CFMR 7, which was previously confirmed to possess rubber-degrading ability, was subjected to whole genome sequencing using the single molecule sequencing technology of the PacBio® RS II system. The genome was further analyzed and compared with previously reported rubber-degrading bacteria in order to identify the potential genes involved in rubber degradation. This led to the interesting discovery of three homologues of latex-clearing protein (Lcp) on the chromosome of this strain, which are probably responsible for rubber-degrading activities. Genes encoding oxidoreductase α-subunit (oxiA) and oxidoreductase β-subunit (oxiB) were also found downstream of two lcp genes which are located adjacent to each other. In silico analysis reveals genes that have been identified to be involved in the microbial degradation of rubber in the Streptomyces sp. strain CFMR 7. This is the first whole genome sequence of a clear-zone-forming natural rubber-degrading Streptomyces sp., which harbours three Lcp homologous genes with the presence of oxiA and oxiB genes compared to the previously reported Gordonia polyisoprenivorans strain VH2 (with two Lcp homologous genes) and Nocardia nova SH22a (with only one Lcp gene). Copyright © 2017 Elsevier B.V. All rights reserved.

  2. (D,A)∞-modules over (D,A)∞-algebras and spectral sequences

    International Nuclear Information System (INIS)

    Lapin, S V

    2002-01-01

    We introduce the construction of a (D,A)∞-(co)module over a (D,A)∞-(co)algebra and study its main homotopy properties. We establish a connection between (D,A)∞-(co)modules over (D,A)∞-(co)algebras and spectral sequences, and thus obtain the structure of an A∞-comodule over the Milnor A∞-coalgebra on the homology of any spectrum directly from the differentials of the Adams spectral sequence of this spectrum.

  3. The Complete Genome Sequence of Herpesvirus Papio 2 (Cercopithecine Herpesvirus 16) Shows Evidence of Recombination Events among Various Progenitor Herpesviruses

    Science.gov (United States)

    Tyler, Shaun D.; Severini, Alberto

    2006-01-01

    We have sequenced the entire genome of herpesvirus papio 2 (HVP-2; Cercopithecine herpesvirus 16) strain X313, a baboon herpesvirus with close homology to other primate alphaherpesviruses, such as SA8, monkey B virus, and herpes simplex virus (HSV) type 1 and type 2. The genome of HVP-2 is 156,487 bp in length, with an overall GC content of 76.5%. The genome organization is identical to that of the other members of the genus Simplexvirus, with a long and a short unique region, each bordered by inverted repeats which end with an “a” sequence. All of the open reading frames detected in this genome were homologous and colinear with those of SA8 and B virus. The HSV gene RL1 (γ134.5; neurovirulence factor) is not present in HVP-2, as is the case for SA8 and B virus. The HVP-2 genome is 85% homologous to its closest relative, SA8. However, segment-by-segment bootstrap analysis of the genome revealed at least two regions that display closer homology to the corresponding sequences of B virus. The first region comprises the UL41 to UL44 genes, and the second region is located within the UL36 gene. We hypothesize that this localized and defined shift in homology is due to recombination events between an SA8-like progenitor of HVP-2 and a herpesvirus species more closely related to the B virus. Since some of the genes involved in these putative recombination events are determinants of virulence, a comparative analysis of their function may provide insight into the pathogenic mechanism of simplexviruses. PMID:16414998

  4. The complete genome sequence of herpesvirus papio 2 (Cercopithecine herpesvirus 16) shows evidence of recombination events among various progenitor herpesviruses.

    Science.gov (United States)

    Tyler, Shaun D; Severini, Alberto

    2006-02-01

    We have sequenced the entire genome of herpesvirus papio 2 (HVP-2; Cercopithecine herpesvirus 16) strain X313, a baboon herpesvirus with close homology to other primate alphaherpesviruses, such as SA8, monkey B virus, and herpes simplex virus (HSV) type 1 and type 2. The genome of HVP-2 is 156,487 bp in length, with an overall GC content of 76.5%. The genome organization is identical to that of the other members of the genus Simplexvirus, with a long and a short unique region, each bordered by inverted repeats which end with an "a" sequence. All of the open reading frames detected in this genome were homologous and colinear with those of SA8 and B virus. The HSV gene RL1 (gamma(1)34.5; neurovirulence factor) is not present in HVP-2, as is the case for SA8 and B virus. The HVP-2 genome is 85% homologous to its closest relative, SA8. However, segment-by-segment bootstrap analysis of the genome revealed at least two regions that display closer homology to the corresponding sequences of B virus. The first region comprises the UL41 to UL44 genes, and the second region is located within the UL36 gene. We hypothesize that this localized and defined shift in homology is due to recombination events between an SA8-like progenitor of HVP-2 and a herpesvirus species more closely related to the B virus. Since some of the genes involved in these putative recombination events are determinants of virulence, a comparative analysis of their function may provide insight into the pathogenic mechanism of simplexviruses.

  5. MicroRNA from Moringa oleifera: Identification by High Throughput Sequencing and Their Potential Contribution to Plant Medicinal Value.

    Science.gov (United States)

    Pirrò, Stefano; Zanella, Letizia; Kenzo, Maurice; Montesano, Carla; Minutolo, Antonella; Potestà, Marina; Sobze, Martin Sanou; Canini, Antonella; Cirilli, Marco; Muleo, Rosario; Colizzi, Vittorio; Galgani, Andrea

    2016-01-01

    Moringa oleifera is a widespread plant with substantial nutritional and medicinal value. We postulated that microRNAs (miRNAs), which are endogenous, noncoding small RNAs regulating gene expression at the post-transcriptional level, might contribute to the medicinal properties of plants of this species after ingestion into the human body by regulating human gene expression. However, knowledge about miRNA in Moringa is scarce. To test the hypothesis on the potential pharmacological properties of miRNA, we conducted a high-throughput sequencing analysis using the Illumina platform. A total of 31,290,964 raw reads were produced from a library of small RNA isolated from M. oleifera seeds. We identified 94 conserved and two novel miRNAs that were validated by qRT-PCR assays. Results from qRT-PCR trials conducted on the expression of 20 Moringa miRNAs showed that they are conserved across multiple plant species, as determined by their detection in tissue of other common crop plants. In silico analyses predicted target genes for the conserved miRNAs, which in turn allowed us to relate the miRNAs to the regulation of physiological processes. Some of the predicted plant miRNAs have functional homology to their mammalian counterparts and regulated human genes when they were transfected into cell lines. To our knowledge, this is the first report of discovering M. oleifera miRNAs based on high-throughput sequencing and bioinformatics analysis, and we provide new insight into a potential cross-species control of human gene expression. The widespread cultivation and consumption of M. oleifera, for nutritional and medicinal purposes, brings humans into close contact with products and extracts of this plant species. The potential for miRNA transfer should be evaluated as one possible mechanism of action to account for beneficial properties of this valuable species.

  6. Very high resolution single pass HLA genotyping using amplicon sequencing on the 454 next generation DNA sequencers: Comparison with Sanger sequencing.

    Science.gov (United States)

    Yamamoto, F; Höglund, B; Fernandez-Vina, M; Tyan, D; Rastrou, M; Williams, T; Moonsamy, P; Goodridge, D; Anderson, M; Erlich, H A; Holcomb, C L

    2015-12-01

    Compared to Sanger sequencing, next-generation sequencing offers advantages for high resolution HLA genotyping including increased throughput, lower cost, and reduced genotype ambiguity. Here we describe an enhancement of the Roche 454 GS GType HLA genotyping assay to provide very high resolution (VHR) typing, by the addition of 8 primer pairs to the original 14, to genotype 11 HLA loci. These additional amplicons help resolve common and well-documented alleles and exclude commonly found null alleles in genotype ambiguity strings. Simplification of workflow to reduce the initial preparation effort using early pooling of amplicons or the Fluidigm Access Array™ is also described. Performance of the VHR assay was evaluated on 28 well characterized cell lines using Conexio Assign MPS software which uses genomic, rather than cDNA, reference sequence. Concordance was 98.4%; 1.6% had no genotype assignment. Of concordant calls, 53% were unambiguous. To further assess the assay, 59 clinical samples were genotyped and results compared to unambiguous allele assignments obtained by prior sequence-based typing supplemented with SSO and/or SSP. Concordance was 98.7% with 58.2% as unambiguous calls; 1.3% could not be assigned. Our results show that the amplicon-based VHR assay is robust and can replace current Sanger methodology. Together with software enhancements, it has the potential to provide even higher resolution HLA typing. Copyright © 2015. Published by Elsevier Inc.

  7. Cloning, analysis and functional annotation of expressed sequence tags from the Earthworm Eisenia fetida

    Science.gov (United States)

    Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping

    2007-01-01

    Background: Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results: A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms: Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable in a high-performance, web-based and user-friendly relational database called EST model database or ESTMD version 2. Conclusion: The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730

  8. On the optimal trimming of high-throughput mRNA sequence data

    Directory of Open Access Journals (Sweden)

    Matthew D MacManes

    2014-01-01

    Full Text Available The widespread and rapid adoption of high-throughput sequencing technologies has afforded researchers the opportunity to gain a deep understanding of genome level processes that underlie evolutionary change, and perhaps more importantly, the links between genotype and phenotype. In particular, researchers interested in functional biology and adaptation have used these technologies to sequence mRNA transcriptomes of specific tissues, which in turn are often compared to other tissues, or other individuals with different phenotypes. While these techniques are extremely powerful, careful attention to data quality is required. In particular, because high-throughput sequencing is more error-prone than traditional Sanger sequencing, quality trimming of sequence reads should be an important step in all data processing pipelines. While several software packages for quality trimming exist, no general guidelines for the specifics of trimming have been developed. Here, using empirically derived sequence data, I provide general recommendations regarding the optimal strength of trimming, specifically in mRNA-Seq studies. Although very aggressive quality trimming is common, this study suggests that a more gentle trimming, specifically of those nucleotides whose Phred score < 2 or < 5, is optimal for most studies across a wide variety of metrics.
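    The gentle trimming thresholds mentioned in the abstract above (Phred score < 2 or < 5) can be illustrated with a minimal sketch. It assumes Phred+33 quality encoding and trims only from the 3' end of a read; the abstract does not name the software or the exact trimming algorithm used in the study, so the function names `phred_scores` and `trim_3prime` are hypothetical and this is not the author's pipeline.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into Phred scores.

    Phred+33 encoding is an assumption of this sketch.
    """
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq: str, quality: str, threshold: int = 2) -> tuple[str, str]:
    """Remove trailing bases whose Phred score is below `threshold`.

    Trimming stops at the first base (scanning from the 3' end)
    whose score is at or above the threshold.
    """
    if len(seq) != len(quality):
        raise ValueError("sequence and quality string must have equal length")
    scores = phred_scores(quality)
    end = len(seq)
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return seq[:end], quality[:end]


if __name__ == "__main__":
    # 'I' encodes Phred 40, '#' encodes Phred 2, '!' encodes Phred 0.
    print(trim_3prime("ACGTACGT", "IIIIII!!", threshold=2))
    print(trim_3prime("ACGTAC", "IIII##", threshold=5))
```

    With a threshold of 2 only bases of essentially unknown quality are removed, whereas aggressive settings (for example a threshold of 20 or more) discard far more sequence; the abstract argues that the gentler choice is better for most mRNA-Seq studies.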

  9. Complete Genome Sequences of Four Isolates of Plutella xylostella Granulovirus.

    Science.gov (United States)

    Spence, Robert J; Noune, Christopher; Hauxwell, Caroline

    2016-06-30

    Granuloviruses are widespread pathogens of Plutella xylostella L. (diamondback moth) and potential biopesticides for control of this global insect pest. We report the complete genomes of four Plutella xylostella granulovirus isolates from China, Malaysia, and Taiwan exhibiting pairs of noncoding, homologous repeat regions with significant sequence variation but equivalent length. Copyright © 2016 Spence et al.

  10. On the relationship between residue structural environment and sequence conservation in proteins.

    Science.gov (United States)

    Liu, Jen-Wei; Lin, Jau-Ji; Cheng, Chih-Wen; Lin, Yu-Feng; Hwang, Jenn-Kang; Huang, Tsun-Tsao

    2017-09-01

    Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that sequence conservation is closely correlated with the weighted contact number (WCN), a measure of packing density for a residue's structural environment, calculated based only on the Cα positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment-related structural properties calculated based on different protein substructures, such as a protein's all atoms, backbone atoms, side-chain atoms, or side-chain centroid. To determine whether the Cα atomic positions are adequate to show the relationship between residue environment and sequence conservation, here we compared Cα atoms with other substructures in their contributions to the sequence conservation. Our results show that Cα positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between Cα atoms and the other substructures are high, yielding similar structure-conservation relationships. Taking the WCN as an example, the average overlapping contribution to sequence conservation is 87% between Cα and all-atom substructures. These results indicate that the Cα atoms of a protein structure alone could reflect sequence conservation at the residue level. © 2017 Wiley Periodicals, Inc.
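    As an illustration of the weighted contact number discussed above, the following sketch computes a WCN profile from Cα coordinates. The abstract does not give the formula; the inverse-square form used here (each residue sums 1/r² over all other residues) is the commonly used definition and is an assumption of this sketch, as is the function name `weighted_contact_number`.

```python
def weighted_contact_number(coords):
    """Return one WCN value per residue from a list of (x, y, z) C-alpha positions.

    Assumed definition: WCN_i = sum over j != i of 1 / r_ij**2,
    where r_ij is the distance between residues i and j.
    """
    wcn = []
    for i, (xi, yi, zi) in enumerate(coords):
        total = 0.0
        for j, (xj, yj, zj) in enumerate(coords):
            if i == j:
                continue
            dist_sq = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            if dist_sq == 0.0:
                raise ValueError("two residues share the same coordinates")
            total += 1.0 / dist_sq
        wcn.append(total)
    return wcn


if __name__ == "__main__":
    # Toy example: three collinear points 1 unit apart (not a real protein).
    toy = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(weighted_contact_number(toy))
```

    Residues buried in the core have many close neighbours and therefore a high WCN, which is why the quantity serves as a packing-density measure that can be compared against per-residue conservation scores.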

  11. Characterization of Rous sarcoma virus-related sequences in the Japanese quail.

    Science.gov (United States)

    Chambers, J A; Cywinski, A; Chen, P J; Taylor, J M

    1986-08-01

    We detected sequences related to the avian retrovirus Rous sarcoma virus within the genome of the Japanese quail, a species previously considered to be free of endogenous avian leukosis virus elements. Using low-stringency conditions of hybridization, we screened a quail genomic library for clones containing retrovirus-related information. Of five clones so selected, one, lambda Q48, contained sequence information related to the gag, pol, and env genes of Rous sarcoma virus arranged in a contiguous fashion and spanning a distance of approximately 5.8 kilobases. This organization is consistent with the presence of an endogenous retroviral element within the Japanese quail genome. Use of this element as a high-stringency probe on Southern blots of genomic digests of several quail DNAs demonstrated hybridization to a series of high-molecular-weight bands. By slot hybridization to quail DNA with a cloned probe, it was deduced that there were approximately 300 copies per diploid cell. In addition, the quail element also hybridized at low stringency to the DNA of the White Leghorn chicken and at high stringency to the DNAs of several species of jungle fowl and both true and ruffed pheasants. Limited nucleotide sequencing analysis of lambda Q48 revealed homologies of 65, 52, and 46% compared with the sequence of Rous sarcoma virus strain Prague C for the endonuclease domain of pol, the pol-env junction, and the 3'-terminal region of env, respectively. Comparisons at the amino acid level were also significant, thus confirming the retrovirus relatedness of the cloned quail element.

  12. Current status of grafts and implants in rhinoplasty: Part II. Homologous grafts and allogenic implants.

    Science.gov (United States)

    Sajjadian, Ali; Naghshineh, Nima; Rubinstein, Roee

    2010-03-01

    After reading this article, the participant should be able to: 1. Understand the challenges in restoring volume and structural integrity in rhinoplasty. 2. Identify the appropriate uses of various homologous grafts and allogenic implants in reconstruction, including: (a) freeze-dried acellular allogenic cadaveric dermis grafts, (b) irradiated cartilage grafts, (c) hydroxyapatite mineral matrix, (d) silicone implants, (e) high-density polyethylene implants, (f) polytetrafluoroethylene implants, and (g) injectable filler materials. 3. Identify the advantages and disadvantages of each of these biomaterials. 4. Understand the specific techniques that may aid in the use of these grafts or implants. This review specifically addresses the use of homologous grafts and allogenic implants in rhinoplasty. It is important to stress that autologous materials remain the preferred graft material for use in rhinoplasty, owing to their high biocompatibility and low risk of infection and extrusion. However, concerns of donor-site morbidity, graft availability, and graft resorption have motivated the development and use of homologous and allogenic implants.

  13. Several aspects of some techniques avoiding homologous blood transfusions

    NARCIS (Netherlands)

    E.C.S.M. van Woerkens (Liesbeth)

    1998-01-01

    The use of homologous blood products during anesthesia and surgery is not without risks. Complications due to homologous blood transfusions include transfusion reactions, isosensitization, transmission of infections (including HIV, hepatitis, CMV) and immunosuppression (resulting in

  14. Evolution of biological sequences implies an extreme value distribution of type I for both global and local pairwise alignment scores

    Directory of Open Access Journals (Sweden)

    Maréchal Eric

    2008-08-01

    Full Text Available Abstract Background: Confidence in pairwise alignments of biological sequences, obtained by various methods such as Blast or Smith-Waterman, is critical for automatic analyses of genomic data. Two statistical models have been proposed. In the asymptotic limit of long sequences, the Karlin-Altschul model is based on the computation of a P-value, assuming that the number of high scoring matching regions above a threshold is Poisson distributed. Alternatively, the Lipman-Pearson model is based on the computation of a Z-value from a random score distribution obtained by a Monte-Carlo simulation. Z-values allow the deduction of an upper bound of the P-value (1/Z-value²) following the TULIP theorem. Simulations of the Z-value distribution are known to fit a Gumbel law. This remarkable property was not demonstrated and had no obvious biological support. Results: We built a model of evolution of sequences based on aging, as meant in Reliability Theory, using the fact that the amount of information shared between an initial sequence and the sequences in its lineage (i.e., mutual information in Information Theory) is a decreasing function of time. This quantity is simply measured by a sequence alignment score. In systems aging, the failure rate is related to the system's longevity. The system can be a machine with structured components, or a living entity or population. "Reliability" refers to the ability to operate properly according to a standard. Here, the "reliability" of a sequence refers to the ability to conserve a sufficient functional level at the folded and maturated protein level (positive selection pressure). Homologous sequences were considered as systems (1) having a high redundancy of information reflected by the magnitude of their alignment scores, and (2) whose components are the amino acids that can independently be damaged by random DNA mutations. From these assumptions, we deduced that information shared at each amino acid position evolved with a
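    The Monte-Carlo Z-value described in the abstract above, and the P-value upper bound of 1/Z² that it attributes to the TULIP theorem, can be sketched as follows. This is a minimal illustration under stated assumptions: a simple Smith-Waterman score with match/mismatch/linear-gap parameters chosen arbitrarily for the example (not a substitution matrix), a null distribution built by shuffling the second sequence, and hypothetical function names throughout.

```python
import random
import statistics


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    The scoring parameters are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def z_value(a, b, n_shuffles=200, seed=0):
    """Z-value of the observed score against scores of shuffled versions of b."""
    rng = random.Random(seed)
    observed = local_alignment_score(a, b)
    letters = list(b)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # preserves composition, destroys order
        null_scores.append(local_alignment_score(a, "".join(letters)))
    sd = statistics.pstdev(null_scores)
    if sd == 0:
        raise ValueError("null score distribution has zero variance")
    return (observed - statistics.mean(null_scores)) / sd


def p_value_upper_bound(z):
    """Upper bound 1/Z**2 on the P-value, as quoted in the abstract.

    Capped at 1 and set to 1 for non-positive Z, where the bound is uninformative.
    """
    if z <= 0:
        return 1.0
    return min(1.0, 1.0 / (z * z))


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQ"  # arbitrary toy sequence
    z = z_value(seq, seq)
    print(z, p_value_upper_bound(z))
```

    Shuffling keeps the amino acid composition of the second sequence fixed, so the Z-value measures how far the real alignment score stands above what composition alone would produce.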

  15. The ubiquitin-homology protein, DAP-1, associates with tumor necrosis factor receptor (p60) death domain and induces apoptosis.

    Science.gov (United States)

    Liou, M L; Liou, H C

    1999-04-09

    The tumor necrosis factor receptor, p60 (TNF-R1), transduces death signals via the association of its cytoplasmic domain with several intracellular proteins. By screening a mammalian cDNA library using the yeast two-hybrid cloning technique, we isolated a ubiquitin-homology protein, DAP-1, which specifically interacts with the cytoplasmic death domain of TNF-R1. Sequence analysis reveals that DAP-1 shares striking sequence homology with the yeast SMT3 protein that is essential for the maintenance of chromosome integrity during mitosis (Meluh, P. B., and Koshland, D. (1995) Mol. Biol. Cell 6, 793-807). DAP-1 is nearly identical to PIC1, a protein that interacts with the PML tumor suppressor implicated in acute promyelocytic leukemia (Boddy, M. N., Howe, K., Etkin, L. D., Solomon, E., and Freemont, P. S. (1996) Oncogene 13, 971-982), and the sentrin protein, which associates with the Fas death receptor (Okura, T., Gong, L., Kamitani, T., Wada, T., Okura, I., Wei, C. F., Chang, H. M., and Yeh, E. T. (1996) J. Immunol. 157, 4277-4281). The in vivo interaction between DAP-1 and TNF-R1 was further confirmed in mammalian cells. In transient transfection assays, overexpression of DAP-1 suppresses NF-kappaB/Rel activity in 293T cells, a human kidney embryonic carcinoma cell line. Overexpression of either DAP-1 or sentrin causes apoptosis of the TNF-sensitive L929 fibroblast cell line, as well as the TNF-resistant osteosarcoma cell line, U2OS. Furthermore, the dominant negative Fas-associated death domain (FADD) protein blocks the cell death induced by either DAP-1 or FADD. Collectively, these observations strongly suggest a role for DAP-1 in mediating TNF-induced cell death signaling pathways, presumably through the recruitment of the FADD death effector.

  16. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  17. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 12 figs.

  18. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, Natasha V. (Okemos, MI); Broekaert, Willem F. (Dilbeek, BE); Chua, Nam-Hai (Scarsdale, NY); Kush, Anil (New York, NY)

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  19. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 11 figures.

  20. On the mutagenicity of homologous recombination and double-strand break repair in bacteriophage.

    Science.gov (United States)

    Shcherbakov, Victor P; Plugina, Lidiya; Shcherbakova, Tamara; Sizova, Svetlana; Kudryashova, Elena

    2011-01-02

    Double-strand break (DSB) repair via homologous recombination is generally construed as a high-fidelity process. However, some molecular genetic observations show that recombination and recombinational DSB repair may be mutagenic and even highly mutagenic. Here we developed an effective and precise method for studying the fidelity of DSB repair in vivo by combining DSBs produced site-specifically by the SegC endonuclease with the well-known advantages of the recombination analysis of bacteriophage T4 rII mutants. The method is based on the comparison of the rate of reversion of an rII mutation in the presence and in the absence of a DSB repair event initiated in the proximity of the mutation. We observed that DSB repair may moderately (up to 6-fold) increase the apparent reversion frequency, the effect being dependent on the mutation structure. We also studied the effect of the T4 recombinase deficiency (amber mutation in the uvsX gene) on the fidelity of DSB repair. We observed that DSBs are still repaired via homologous recombination in the uvsX mutants, and the apparent fidelity of this repair is higher than that seen in the wild-type background. The mutator effect of the DSB repair may look unexpected given that most of the normal DNA synthesis in bacteriophage T4 is performed via a recombination-dependent replication (RDR) pathway, which is thought to be indistinguishable from DSB repair. There are three possible explanations for the observed mutagenicity of DSB repair: (1) the origin-dependent (early) DNA replication may be more accurate than the RDR; (2) the step of replication initiation may be more mutagenic than the process of elongation; and (3) the apparent mutagenicity may just reflect some non-randomness in the pool of replicating DNA, i.e., preferential replication of the sequences already involved in replication. We discuss the DSB repair pathway in the absence of UvsX recombinase. Copyright © 2010 Elsevier B.V. All rights reserved.

  1. Integrative analysis of genomic alterations in triple-negative breast cancer in association with homologous recombination deficiency.

    Directory of Open Access Journals (Sweden)

    Masahito Kawazu

    2017-06-01

    Full Text Available Triple-negative breast cancer (TNBC) cells do not express estrogen receptors, progesterone receptors, or human epidermal growth factor receptor 2. Currently, apart from poly ADP-ribose polymerase inhibitors, there are few effective therapeutic options for this type of cancer. Here, we present comprehensive characterization of the genetic alterations in TNBC performed by high coverage whole genome sequencing together with transcriptome and whole exome sequencing. Silencing of the BRCA1 gene impaired the homologous recombination pathway in a subset of TNBCs, which exhibited similar phenotypes to tumors with BRCA1 mutations; they harbored many structural variations (SVs) with relative enrichment for tandem duplication. Clonal analysis suggested that TP53 mutations and methylation of CpG dinucleotides in the BRCA1 promoter were early events of carcinogenesis. SVs were associated with driver oncogenic events such as amplification of MYC, NOTCH2, or NOTCH3 and affected tumor suppressor genes including RB1, PTEN, and KMT2C. Furthermore, we identified putative TGFA enhancer regions. Recurrent SVs that affected the TGFA enhancer region led to enhanced expression of the TGFA oncogene that encodes one of the high affinity ligands for epidermal growth factor receptor. We also identified a variety of oncogenes that could transform 3T3 mouse fibroblasts, suggesting that individual TNBC tumors may undergo a unique driver event that can be targetable. Thus, we revealed several features of TNBC with clinically important implications.

  2. Nucleotide sequences of two cellulase genes from alkalophilic Bacillus sp. strain N-4 and their strong homology.

    OpenAIRE

    Fukumori, F; Sashihara, N; Kudo, T; Horikoshi, K

    1986-01-01

    Two genes for cellulases of alkalophilic Bacillus sp. strain N-4 (ATCC 21833) have been sequenced. From the DNA sequences the cellulases encoded in the plasmids pNK1 and pNK2 consist of 488 and 409 amino acids, respectively. The DNA and protein sequences of the pNK1-encoded cellulase are related to those of the pNK2-encoded cellulase. The pNK2-encoded cellulase lacks the direct repeat sequence of a stretch of 60 amino acids near the C-terminal end of the pNK1-encoded cellulase. The duplicatio...

  3. Expressed sequence tag analysis of functional genes associated with adventitious rooting in Liriodendron hybrids.

    Science.gov (United States)

    Zhong, Y D; Sun, X Y; Liu, E Y; Li, Y Q; Gao, Z; Yu, F X

    2016-06-24

    Liriodendron hybrids (Liriodendron chinense x L. tulipifera) are important landscaping and afforestation hardwood trees. To date, little genomic research on adventitious rooting has been reported in these hybrids, as well as in the genus Liriodendron. In the present study, we used adventitious roots to construct the first cDNA library for Liriodendron hybrids. A total of 5176 expressed sequence tags (ESTs) were generated and clustered into 2921 unigenes. Among these unigenes, 2547 had significant homology to the non-redundant protein database representing a wide variety of putative functions. Homologs of these genes regulated many aspects of adventitious rooting, including those for auxin signal transduction and root hair development. Results of quantitative real-time polymerase chain reaction showed that AUX1, IRE, and FB1 were highly expressed in adventitious roots and the expression of AUX1, ARF1, NAC1, RHD1, and IRE increased during the development of adventitious roots. Additionally, 181 simple sequence repeats were identified from 166 ESTs and more than 91.16% of these were dinucleotide and trinucleotide repeats. To the best of our knowledge, the present study reports the identification of the genes associated with adventitious rooting in the genus Liriodendron for the first time and provides a valuable resource for future genomic studies. Expression analysis of selected genes could allow us to identify regulatory genes that may be essential for adventitious rooting.
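
    The simple sequence repeat (SSR) screen mentioned above can be illustrated with a minimal sketch. The abstract does not name the tool or the thresholds that were used, so everything here is an assumption: the function name find_ssrs, the regular-expression approach, and the minimum copy numbers (6 for dinucleotide and 5 for trinucleotide motifs) are hypothetical choices made only for illustration.

```python
import re

# Hypothetical thresholds (not from the abstract):
# motif length -> minimum number of tandem copies to report.
MIN_REPEATS = {2: 6, 3: 5}

def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for di- and trinucleotide SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in MIN_REPEATS.items():
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                # Skip mononucleotide runs such as AAAAAA.
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

    Calling find_ssrs on each EST and tallying the motif lengths would give the share of dinucleotide and trinucleotide repeats; dedicated SSR tools apply more elaborate rules than this sketch.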

  4. Homologous recombination is a force in the evolution of canine distemper virus.

    Science.gov (United States)

    Yuan, Chaowen; Liu, Wenxin; Wang, Yingbo; Hou, Jinlong; Zhang, Liguo; Wang, Guoqing

    2017-01-01

    Canine distemper virus (CDV) is the causative agent of canine distemper (CD), a highly contagious, lethal, multisystemic viral disease of receptive carnivores. The prevalence of CDV is a major concern in susceptible animals. Presently, it is unclear whether intragenic recombination can contribute to gene mutations and segment reassortment in the virus. In this study, 25 full-length CDV genome sequences were subjected to phylogenetic and recombinational analyses. The results of phylogenetic analysis, intragenic recombination, and nucleotide selection pressure indicated that mutation and recombination occurred in the six individual gene segments (H, F, P, N, L, M) of the CDV genome. The analysis also revealed pronounced genetic diversity in the CDV genome according to the geographically distinct lineages (genotypes), namely Asia-1, Asia-2, Asia-3, Europe, America-1, and America-2. Six recombination events were detected using the SimPlot and RDP programs. The analysis of selection pressure demonstrated that a majority of the nucleotides in the individual CDV genes were under negative selection. Collectively, these data suggested that homologous recombination acts as a key force driving the genetic diversity and evolution of canine distemper virus.

  5. Khovanov homology for virtual knots with arbitrary coefficients

    International Nuclear Information System (INIS)

    Manturov, Vassily O

    2007-01-01

    The Khovanov homology theory over an arbitrary coefficient ring is extended to the case of virtual knots. We introduce a complex which is well-defined in the virtual case and is homotopy equivalent to the original Khovanov complex in the classical case. Unlike Khovanov's original construction, our definition of the complex does not use any additional prescription of signs to the edges of a cube. Moreover, our method enables us to construct a Khovanov homology theory for 'twisted virtual knots' in the sense of Bourgoin and Viro (including knots in three-dimensional projective space). We generalize a number of results of Khovanov homology theory (the Wehrli complex, minimality problems, Frobenius extensions) to virtual knots with non-orientable atoms

  6. DNA damage, homology-directed repair, and DNA methylation.

    Directory of Open Access Journals (Sweden)

    Concetta Cuozzo

    2007-07-01

    To explore the link between DNA damage and gene silencing, we induced a DNA double-strand break in the genome of HeLa or mouse embryonic stem (ES) cells using I-SceI restriction endonuclease. The I-SceI site lies within one copy of two inactivated tandem repeated green fluorescent protein (GFP) genes (DR-GFP). A total of 2%-4% of the cells generated a functional GFP by homology-directed repair (HR) and gene conversion. However, approximately 50% of these recombinants expressed GFP poorly. Silencing was rapid and associated with HR and DNA methylation of the recombinant gene, since it was prevented in HeLa cells by 5-aza-2'-deoxycytidine. ES cells deficient in DNA methyl transferase 1 yielded as many recombinants as wild-type cells, but most of these recombinants expressed GFP robustly. Half of the HR DNA molecules were de novo methylated, principally downstream of the double-strand break, and half were undermethylated relative to the uncut DNA. Methylation of the repaired gene was independent of the methylation status of the converting template. The methylation pattern of recombinant molecules derived from pools of cells carrying DR-GFP at different loci, or from an individual clone carrying DR-GFP at a single locus, was comparable. ClustalW analysis of the sequenced GFP molecules in HeLa and ES cells distinguished recombinant and nonrecombinant DNA solely on the basis of their methylation profile and indicated that HR superimposed novel methylation profiles on top of the old patterns. Chromatin immunoprecipitation and RNA analysis revealed that DNA methyl transferase 1 was bound specifically to HR GFP DNA and that methylation of the repaired segment contributed to the silencing of GFP expression. Taken together, our data support a mechanistic link between HR and DNA methylation and suggest that DNA methylation in eukaryotes marks homologous recombined segments.

  7. Activation of the polyomavirus enhancer by a murine activator protein 1 (AP1) homolog and two contiguous proteins.

    OpenAIRE

    Martin, M E; Piette, J; Yaniv, M; Tang, W J; Folk, W R

    1988-01-01

    The polyomavirus enhancer is composed of multiple DNA sequence elements serving as binding sites for proteins present in mouse nuclear extracts that activate transcription and DNA replication. We have identified three such proteins and their binding sites and correlate them with enhancer function. Mutation of nucleotide (nt) 5140 in the enhancer alters the binding site (TGACTAA, nt 5139-5145) for polyomavirus enhancer A binding protein 1 (PEA1), a murine homolog of the human transcription fac...

  8. Using high-throughput barcode sequencing to efficiently map connectomes.

    Science.gov (United States)

    Peikon, Ian D; Kebschull, Justus M; Vagin, Vasily V; Ravens, Diana I; Sun, Yu-Chi; Brouzes, Eric; Corrêa, Ivan R; Bressan, Dario; Zador, Anthony M

    2017-07-07

    The function of a neural circuit is determined by the details of its synaptic connections. At present, the only available method for determining a neural wiring diagram with single synapse precision, a 'connectome', is based on imaging methods that are slow, labor-intensive and expensive. Here, we present SYNseq, a method for converting the connectome into a form that can exploit the speed and low cost of modern high-throughput DNA sequencing. In SYNseq, each neuron is labeled with a unique random nucleotide sequence, an RNA 'barcode', which is targeted to the synapse using engineered proteins. Barcodes in pre- and postsynaptic neurons are then associated through protein-protein crosslinking across the synapse, extracted from the tissue, and joined into a form suitable for sequencing. Although our failure to develop an efficient barcode joining scheme precludes the widespread application of this approach, we expect that with further development SYNseq will enable tracing of complex circuits at high speed and low cost. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. High dimensional and high resolution pulse sequences for backbone resonance assignment of intrinsically disordered proteins

    Energy Technology Data Exchange (ETDEWEB)

    Zawadzka-Kazimierczuk, Anna; Kozminski, Wiktor, E-mail: kozmin@chem.uw.edu.pl [University of Warsaw, Faculty of Chemistry (Poland); Sanderova, Hana; Krasny, Libor [Institute of Microbiology, Academy of Sciences of the Czech Republic, Laboratory of Molecular Genetics of Bacteria, Department of Bacteriology (Czech Republic)

    2012-04-15

    Four novel 5D (HACA(N)CONH, HNCOCACB, (HACA)CON(CA)CONH, (H)NCO(NCA)CONH), and one 6D ((H)NCO(N)CACONH) NMR pulse sequences are proposed. The new experiments employ non-uniform sampling that enables achieving high resolution in indirectly detected dimensions. The experiments facilitate resonance assignment of intrinsically disordered proteins. The novel pulse sequences were successfully tested using the δ subunit (20 kDa) of Bacillus subtilis RNA polymerase, which has an 81-amino acid disordered part containing various repetitive sequences.

  10. A putative carbohydrate-binding domain of the lactose-binding Cytisus sessilifolius anti-H(O) lectin has a similar amino acid sequence to that of the L-fucose-binding Ulex europaeus anti-H(O) lectin.

    Science.gov (United States)

    Konami, Y; Yamamoto, K; Osawa, T; Irimura, T

    1995-04-01

    The complete amino acid sequence of a lactose-binding Cytisus sessilifolius anti-H(O) lectin II (CSA-II) was determined using a protein sequencer. After digestion of CSA-II with endoproteinase Lys-C or Asp-N, the resulting peptides were purified by reversed-phase high performance liquid chromatography (HPLC) and then subjected to sequence analysis. Comparison of the complete amino acid sequence of CSA-II with the sequences of other leguminous seed lectins revealed regions of extensive homology. The amino acid sequence of a putative carbohydrate-binding domain of CSA-II was found to be similar to those of several anti-H(O) leguminous lectins, especially to that of the L-fucose-binding Ulex europaeus lectin I (UEA-I).

  11. Parametric representation of centrifugal pump homologous curves

    International Nuclear Information System (INIS)

    Veloso, Marcelo A.; Mattos, Joao R.L. de

    2015-01-01

    Pump performance data are essential for any mathematical model designed to simulate flow transients caused by pump operations. The performance of a centrifugal pump is characterized by four basic quantities: the rotational speed, the volumetric flow rate, the dynamic head, and the hydraulic torque. The curves showing the relationships between these four variables are called the pump characteristic curves. The characteristic curves are empirically developed by the pump manufacturer and uniquely describe head and torque as functions of volumetric flow rate and rotational speed. Because they comprise a large number of points, this configuration is not suitable for computational purposes. However, it can be converted to a simpler form by developing the homologous curves, in which dynamic head and hydraulic torque ratios are expressed as functions of volumetric flow and rotational speed ratios. The numerical use of the complete set of homologous curves requires specification of sixteen partial curves, eight for the dynamic head and eight for the hydraulic torque. As a consequence, the handling of homologous curves is still somewhat complicated. In solving flow transient problems that require the pump characteristic data for all the operation zones, the parametric form appears to be the simplest way to deal with the homologous curves. In this approach, the complete characteristics of a pump can be described by only two closed curves, one for the dynamic head and the other for the hydraulic torque, both as functions of a single angular coordinate suitably defined in terms of the quotient between the volumetric flow ratio and the rotational speed ratio. The usefulness and advantages of this alternative method are demonstrated through a practical example in which the homologous curves for a pump of the type used in the main coolant loops of a pressurized water reactor (PWR) are transformed to the parametric form. (author)
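
    The parametric form described above can be sketched in a few lines. The abstract does not state the formulas, so this sketch assumes the widely used Suter-type transformation, which may differ in detail from the authors' definition: with speed ratio alpha, flow ratio v, head ratio h and torque ratio beta (each relative to its rated value), the angular coordinate is x = pi + atan2(v, alpha), and the two closed curves are WH = h / (alpha^2 + v^2) and WB = beta / (alpha^2 + v^2). The function name suter_point is hypothetical.

```python
import math

def suter_point(alpha, v, h, beta):
    """Map one homologous-curve point to an assumed Suter-type parametric form.

    alpha, v, h, beta are the speed, flow, head and torque ratios
    (actual value divided by rated value). Returns (x, WH, WB).
    """
    r2 = alpha * alpha + v * v
    if r2 == 0.0:
        raise ValueError("alpha and v cannot both be zero")
    # atan2 covers all four quadrants, so x spans a full turn (0, 2*pi].
    x = math.pi + math.atan2(v, alpha)
    return x, h / r2, beta / r2
```

    At the rated point (all four ratios equal to 1) this gives x = 5*pi/4 and WH = WB = 0.5. Using atan2 rather than a plain arctangent of v/alpha avoids a division by zero at alpha = 0 and keeps all operating zones on a single angular axis.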

  12. Gene repair of an Usher syndrome causing mutation by zinc-finger nuclease mediated homologous recombination.

    Science.gov (United States)

    Overlack, Nora; Goldmann, Tobias; Wolfrum, Uwe; Nagel-Wolfrum, Kerstin

    2012-06-26

    Human Usher syndrome (USH) is the most frequent cause of inherited deaf-blindness. It is clinically and genetically heterogeneous, assigned to three clinical types of which the most severe type is USH1. No effective treatment for the ophthalmic component of USH exists. Gene augmentation is an attractive strategy for hereditary retinal diseases. However, several USH genes, like USH1C, are expressed in various isoforms, hampering gene augmentation. As an alternative treatment strategy, we applied zinc-finger nuclease (ZFN) technology for targeted repair of a USH1C-causing mutation by homologous recombination. We designed ZFNs customized for the p.R31X nonsense mutation in Ush1c. We evaluated the ZFNs for DNA cleavage capability and analyzed their biocompatibility by XTT assays. We demonstrated ZFN-mediated gene repair at the genomic level by digestion assays and DNA sequencing, and at the protein level by indirect immunofluorescence and Western blot analyses. The specifically designed ZFNs did not show cytotoxic effects in a p.R31X cell line. We demonstrated that the ZFNs induced cleavage of their target sequence. We showed that simultaneous application of ZFN and rescue DNA induced gene repair of the disease-causing mutation at the genomic level, resulting in recovery of protein expression. In our present study, we analyzed for the first time ZFN-activated gene repair of an USH gene. The data highlight the ability of ZFNs to induce targeted homologous recombination and mediate gene repair in USH. We provide further evidence that the ZFN technology holds great potential to recover disease-causing mutations in inherited retinal disorders.

  13. Primary homologies of the circumorbital bones of snakes.

    Science.gov (United States)

    Palci, Alessandro; Caldwell, Michael W

    2013-09-01

    Some snakes have two circumorbital ossifications that in the current literature are usually referred to as the postorbital and supraorbital. We review the arguments that have been proposed to justify this interpretation and provide counter-arguments that reject those conjectures of primary homology based on the observation of 32 species of lizards and 81 species of snakes (both extant and fossil). We present similarity arguments, both topological and structural, for reinterpretation of the primary homologies of the dorsal and posterior orbital ossifications of snakes. Applying the test of similarity, we conclude that the posterior orbital ossification of snakes is topologically consistent as the homolog of the lacertilian jugal, and that the dorsal orbital ossification present in some snakes (e.g., pythons, Loxocemus, and Calabaria) is the homolog of the lacertilian postfrontal. We therefore propose that the terms postorbital and supraorbital should be abandoned as reference language for the circumorbital bones of snakes, and be replaced with the terms jugal and postfrontal, respectively. The primary homology claim for the snake "postorbital" fails the test of similarity, while the term "supraorbital" is an unnecessary and inaccurate application of the concept of a neomorphic ossification, for an element that passes the test of similarity as a postfrontal. This reinterpretation of the circumorbital bones of snakes is bound to have important repercussions for future phylogenetic analyses and consequently for our understanding of the origin and evolution of snakes. Copyright © 2013 Wiley Periodicals, Inc.

  14. Cloning and sequence analysis of serine proteinase of Gloydius ussuriensis venom gland

    International Nuclear Information System (INIS)

    Sun Dejun; Liu Shanshan; Yang Chunwei; Zhao Yizhuo; Chang Shufang; Yan Weiqun

    2005-01-01

    Objective: To construct a cDNA library using mRNA from the Gloydius ussuriensis (G. ussuriensis) venom gland, and to clone and analyze the serine proteinase gene from the cDNA library. Methods: Total RNA was isolated from the venom gland of G. ussuriensis, and mRNA was purified using an mRNA isolation kit. Full-length cDNA was synthesized by means of the SMART cDNA synthesis strategy and amplified by long-distance PCR; the cDNA was then cloned into the vector pBluescript-SK. The recombinant cDNA was transformed into E. coli DH5α. The cDNA of the serine proteinase gene in the venom gland of G. ussuriensis was detected and amplified using in situ hybridization. The cDNA fragment was inserted into the pGEMT vector and cloned, and its nucleotide sequence was determined. Results: The capacity of the venom gland cDNA library was above 2.3 × 10^6. The open reading frame was composed of 702 nucleotides and coded for a protein pre-zymogen of 234 amino acids. It contained 12 cysteine residues. The sequence analysis indicated that the deduced amino acid sequence of the cDNA fragment shared high identity with the thrombin-like enzyme genes of other snakes in GenBank. The query sequence exhibited strong amino acid sequence homology of 85% to the serine protease of T. gramineus, thrombin-like serine proteinase I of D. acutus and serine protease catroxase II of C. atrox, respectively. Based on the amino acid sequences of other thrombin-like enzymes, the catalytic residues and disulfide bridges of this thrombin-like enzyme were deduced as follows: catalytic residues, His41, Asp86, Ser180; and six disulfide bridges, Cys7-Cys139, Cys26-Cys42, Cys74-Cys232, Cys118-Cys186, Cys150-Cys165, Cys176-Cys201. Conclusion: The capacity of the venom gland cDNA library is above 2.3 × 10^6, exceeding the 10^5 capacity level. The constructed cDNA library of the G. ussuriensis venom gland should be a helpful platform for detecting new target genes and for further gene manipulation. The cloned serine

  15. Prolonged Particulate Hexavalent Chromium Exposure Suppresses Homologous Recombination Repair in Human Lung Cells.

    Science.gov (United States)

    Browning, Cynthia L; Qin, Qin; Kelly, Deborah F; Prakash, Rohit; Vanoli, Fabio; Jasin, Maria; Wise, John Pierce

    2016-09-01

    Genomic instability is one of the primary models of carcinogenesis and a feature of almost all cancers. Homologous recombination (HR) repair protects against genomic instability by maintaining high genomic fidelity during the repair of DNA double strand breaks. The defining step of HR repair is the formation of the Rad51 nucleofilament, which facilitates the search for a homologous sequence and invasion of the template DNA strand. Particulate hexavalent chromium (Cr(VI)), a human lung carcinogen, induces DNA double strand breaks and chromosome instability. Since the loss of HR repair increases Cr(VI)-induced chromosome instability, we investigated the effect of extended Cr(VI) exposure on HR repair. We show acute (24 h) Cr(VI) exposure induces a normal HR repair response. In contrast, prolonged (120 h) exposure to particulate Cr(VI) inhibited HR repair and Rad51 nucleofilament formation. Prolonged Cr(VI) exposure had a profound effect on Rad51, evidenced by reduced protein levels and Rad51 mislocalization to the cytoplasm. Proteins involved in Rad51 nuclear import and nucleofilament formation displayed varying responses to prolonged Cr(VI) exposure. BRCA2 formed nuclear foci after prolonged Cr(VI) exposure, while Rad51C foci formation was suppressed. These results suggest that particulate Cr(VI), a major chemical carcinogen, inhibits HR repair by targeting Rad51, causing DNA double strand breaks to be repaired by a low fidelity, Rad51-independent repair pathway. These results further enhance our understanding of the underlying mechanism of Cr(VI)-induced chromosome instability and thus, carcinogenesis. © The Author 2016. Published by Oxford University Press on behalf of the Society of Toxicology.

  16. Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing.

    Directory of Open Access Journals (Sweden)

    Sarah M Hykin

    For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens, particularly for use in phylogenetic analyses, have been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens

  17. Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing.

    Science.gov (United States)

    Hykin, Sarah M; Bi, Ke; McGuire, Jimmy A

    2015-01-01

    For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens, particularly for use in phylogenetic analyses, have been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for

  18. A complete mitochondrial genome sequence of Ogura-type male-sterile cytoplasm and its comparative analysis with that of normal cytoplasm in radish (Raphanus sativus L.)

    Directory of Open Access Journals (Sweden)

    Tanaka Yoshiyuki

    2012-07-01

    was found to be located at the edge of the largest unique region. BLAST analysis performed to assign the unique regions showed that about 80% of the region was covered by short sequences homologous to the mitochondrial sequences of normal-type radish or other reported Brassicaceae species, although no homology was found for the remaining 20% of sequences. Conclusions: The Ogura-type mitochondrial genome was highly rearranged compared with the normal-type genome by recombination through one large repeat and multiple short repeats. The rearrangement has produced four unique regions in the Ogura-type mitochondrial genome, and most of the unique regions are composed of known Brassicaceae mitochondrial sequences. This suggests that the regions unique to the Ogura-type genome were generated by integration and shuffling of pre-existing mitochondrial sequences during the evolution of Brassicaceae, and novel genes such as orf138 could have been created by the shuffling process of the mitochondrial genome.

  19. Sequence analysis of RNase MRP RNA reveals its origination from eukaryotic RNase P RNA

    Science.gov (United States)

    Zhu, Yanglong; Stribinskis, Vilius; Ramos, Kenneth S.; Li, Yong

    2006-01-01

    RNase MRP is a eukaryote-specific endoribonuclease that generates RNA primers for mitochondrial DNA replication and processes precursor rRNA. RNase P is a ubiquitous endoribonuclease that cleaves precursor tRNA transcripts to produce their mature 5′ termini. We found extensive sequence homology of catalytic domains and specificity domains between their RNA subunits in many organisms. In Candida glabrata, the internal loop of helix P3 is 100% conserved between MRP and P RNAs. The helix P8 of MRP RNA from microsporidia Encephalitozoon cuniculi is identical to that of P RNA. Sequence homology can be widely spread over the whole molecule of MRP RNA and P RNA, such as those from Dictyostelium discoideum. These conserved nucleotides between the MRP and P RNAs strongly support the hypothesis that the MRP RNA is derived from the P RNA molecule in early eukaryote evolution. PMID:16540690

  20. Plastid, nuclear and reverse transcriptase sequences in the mitochondrial genome of Oenothera: is genetic information transferred between organelles via RNA?

    Science.gov (United States)

    Schuster, W; Brennicke, A

    1987-01-01

    We describe an open reading frame (ORF) with high homology to reverse transcriptase in the mitochondrial genome of Oenothera. This ORF displays all the characteristics of an active plant mitochondrial gene with a possible ribosome binding site and 39% T in the third codon position. It is located between a sequence fragment from the plastid genome and one of nuclear origin downstream from the gene encoding subunit 5 of the NADH dehydrogenase. The nuclear derived sequence consists of 528 nucleotides from the small ribosomal RNA and contains an expansion segment unique to nuclear rRNAs. The plastid sequence contains part of the ribosomal protein S4 and the complete tRNA(Ser). The observation that only transcribed sequences have been found in more than one subcellular compartment in higher plants suggests that interorganellar transfer of genetic information may occur via RNA and subsequent local reverse transcription and genomic integration. PMID:14650433
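
    The "39% T in the third codon position" figure quoted above is a simple composition statistic, and a minimal sketch of how such a value can be computed follows. The function name third_position_fraction is hypothetical, the sketch assumes the sequence is given in frame starting at the first codon, and it is not the authors' procedure.

```python
def third_position_fraction(orf, base="T"):
    """Fraction of complete codons whose third position equals `base`.

    Assumes `orf` starts in frame; a trailing partial codon is ignored.
    """
    orf = orf.upper()
    n_codons = len(orf) // 3
    if n_codons == 0:
        raise ValueError("sequence shorter than one codon")
    # Positions 2, 5, 8, ... are the third bases of successive codons.
    thirds = orf[2:3 * n_codons:3]
    return thirds.count(base.upper()) / n_codons
```

    For the toy sequence ATGGCTTTT (codons ATG, GCT, TTT) the function returns 2/3, since two of the three codons end in T.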