WorldWideScience

Sample records for sequence analysis characterization

  1. Characterization and sequence analysis of cysteine and glycine-rich ...

    African Journals Online (AJOL)

    Primers specific for CSRP3 were designed using known cDNA sequences of Bos taurus published in database with different accession numbers. Polymerase chain reaction (PCR) was performed and products were purified and sequenced. Sequence analysis and alignment were carried out using CLUSTAL W (1.83).

  2. Molecular characterization of Giardia psittaci by multilocus sequence analysis.

    Science.gov (United States)

    Abe, Niichiro; Makino, Ikuko; Kojima, Atsushi

    2012-12-01

    Multilocus sequence analyses targeting small subunit ribosomal DNA (SSU rDNA), elongation factor 1 alpha (ef1α), glutamate dehydrogenase (gdh), and beta giardin (β-giardin) were performed on Giardia psittaci isolates from three Budgerigars (Melopsittacus undulatus) and four Barred parakeets (Bolborhynchus lineola) kept in individual households or imported from overseas. Nucleotide differences and phylogenetic analyses at four loci indicate the distinction of G. psittaci from the other known Giardia species: Giardia muris, Giardia microti, Giardia ardeae, and Giardia duodenalis assemblages. Furthermore, G. psittaci was related more closely to G. duodenalis than to the other known Giardia species, except for G. microti. Conflicting signals regarded as "double peaks" were found at the same nucleotide positions of the ef1α in all isolates. However, the sequences of the other three loci, including gdh and β-giardin, which are known to be highly variable, from all isolates were also mutually identical at every locus. They showed no double peaks. These results suggest that double peaks found in the ef1α sequences are caused not by mixed infection with genetically different G. psittaci isolates but by allelic sequence heterogeneity (ASH), which is observed in diplomonad lineages including G. duodenalis. No sequence difference was found in any G. psittaci isolates at the gdh and β-giardin loci, suggesting that G. psittaci is indeed not more diverse genetically than other Giardia species. This report is the first to provide evidence related to the genetic characteristics of G. psittaci obtained using multilocus sequence analysis. Copyright © 2012 Elsevier B.V. All rights reserved.

  3. Reverse transcriptase sequences from mulberry LTR retrotransposons: characterization analysis

    Directory of Open Access Journals (Sweden)

    Ma Bi

    2017-10-01

    Copia and Gypsy play important roles in the structural, functional and evolutionary dynamics of plant genomes. In this study, a total of 106 Copia and 101 Gypsy reverse transcriptase (rt) sequences were amplified from the Morus notabilis genome using degenerate primers. All sequences exhibited high levels of heterogeneity and were AT-rich, and Copia rt showed higher sequence divergence than Gypsy rt. Two reasons are likely to account for this phenomenon: (a) these elements often experience deletions or fragmentation by illegitimate or unequal homologous recombination in the transposition process; (b) strong purifying selective pressure drives the evolution of these elements through “selective silencing” with random mutation and eventual deletion from the host genome. Interestingly, mulberry rt clustered with other rt from distantly related taxa according to the phylogenetic analysis. This phenomenon did not result from horizontal transposable element transfer. Results obtained from fluorescence in situ hybridization revealed that most of the hybridization signals were preferentially concentrated in pericentromeric and distal regions of chromosomes, and these elements may play important roles in the regions in which they are found. Results of this study support the continued pursuit of further functional studies of Copia and Gypsy in the mulberry genome.

  4. Characterization and sequence analysis of cysteine and glycine-rich ...

    African Journals Online (AJOL)

    Tarek

    2011-04-18

    Apr 18, 2011 ... nucleotide alignment of both native buffalo and cattle CSRP3 cDNAs sequences ..... Exon III, Identities = 71/75 (94%), Gaps = 1/75 (1%) Strand=Plus/Plus ... Band MR, Larson JH, Rebeiz M, Green CA, Heyen DW, Donovan J,.

  5. Context based computational analysis and characterization of ARS consensus sequences (ACS) of Saccharomyces cerevisiae genome

    Directory of Open Access Journals (Sweden)

    Vinod Kumar Singh

    2016-09-01

    Genome-wide experimental studies in Saccharomyces cerevisiae reveal that an autonomous replicating sequence (ARS) requires an essential consensus sequence (ACS) for replication activity. Computational studies identified thousands of ACS-like patterns in the genome. However, only a few hundred of these sites act as replicating sites, and the rest are considered dormant or evolving sites. In a bid to understand the sequence makeup of replication sites, a content and context-based analysis was performed on a set of replicating ACS sequences that bind to the origin-recognition complex (ORC), denoted ORC-ACS, and non-replicating ACS sequences (nrACS) that are not bound by ORC. In this study, DNA properties such as base composition, correlation, sequence-dependent thermodynamic and DNA structural profiles, and their positions have been considered for characterizing ORC-ACS and nrACS. Analysis reveals that ORC-ACS show marked differences in nucleotide composition and context features in their vicinity compared to nrACS. Interestingly, an A-rich motif was also discovered in ORC-ACS sequences within their nucleosome-free region. Profound changes in conformational features, such as DNA helical twist, inclination angle and stacking energy, were observed between ORC-ACS and nrACS. Distribution of ACS motifs in the non-coding segments points to the locations of ORC-ACS, which are found far away from the adjacent gene start position compared to nrACS, thereby enabling an accessible environment for ORC proteins. Our attempt is novel in considering the contextual view of ACS and its flanking region along with nucleosome positioning in the S. cerevisiae genome and may be useful for any computational prediction scheme.

  6. Characterization of platelet adhesion under flow using microscopic image sequence analysis.

    Science.gov (United States)

    Machin, M; Santomaso, A; Cozzi, M R; Battiston, M; Mazzuccato, M; De Marco, L; Canu, P

    2005-07-01

    A method for quantitative analysis of platelet deposition under flow is discussed here. The model system is based upon perfusion of blood platelets over an adhesive substrate immobilized on a glass coverslip acting as the lower surface of a rectangular flow chamber. The perfusion apparatus is mounted onto an inverted microscope equipped with epifluorescent illumination and an intensified CCD video camera. Characterization is based on information obtained from a specific image analysis method applied to continuous sequences of microscopical images. Platelet recognition across the sequence of images is based on a time-dependent, bidimensional, gaussian-like pdf. Once a platelet is located, the variation of its position and shape as a function of time (i.e., the platelet history) can be determined. Analyzing the history, we can establish whether the platelet is moving on the surface, the frequency of this movement and the distance traveled before it resumes the velocity of a non-interacting cell. Therefore, we can determine how long the adhesion would last, which is correlated to the resistance of the platelet-substrate bond. This algorithm enables the dynamic quantification of trajectories, as well as residence times and arrest and release frequencies, for a large number of platelets at the same time. Statistically significant conclusions on platelet-surface interactions can then be obtained. An image analysis tool of this kind can dramatically help the investigation and characterization of the thrombogenic properties of artificial surfaces such as those used in artificial organs and biomedical devices.

  7. The Use of Next Generation Sequencing and Junction Sequence Analysis Bioinformatics to Achieve Molecular Characterization of Crops Improved Through Modern Biotechnology

    Directory of Open Access Journals (Sweden)

    David Kovalic

    2012-11-01

    The assessment of genetically modified (GM) crops for regulatory approval currently requires a detailed molecular characterization of the DNA sequence and integrity of the transgene locus. In addition, molecular characterization is a critical component of event selection and advancement during product development. Typically, molecular characterization has relied on Southern blot analysis to establish locus and copy number along with targeted sequencing of polymerase chain reaction products spanning any inserted DNA to complete the characterization process. Here we describe the use of next generation (NexGen) sequencing and junction sequence analysis bioinformatics in a new method for achieving full molecular characterization of a GM event without the need for Southern blot analysis. In this study, we examine a typical GM soybean [Glycine max (L.) Merr.] line and demonstrate that this new method provides molecular characterization equivalent to the current Southern blot-based method. We also examine an event containing in vivo DNA rearrangement of multiple transfer DNA inserts to demonstrate that the new method is effective at identifying complex cases. Next generation sequencing and bioinformatics offer certain advantages over current approaches, most notably the simplicity, efficiency, and consistency of the method, and provide a viable alternative for efficiently and robustly achieving molecular characterization of GM crops.

  8. Identification and characterization of earthquake clusters: a comparative analysis for selected sequences in Italy

    Science.gov (United States)

    Peresan, Antonella; Gentili, Stefania

    2017-04-01

    Identification and statistical characterization of seismic clusters may provide useful insights about the features of seismic energy release and their relation to physical properties of the crust within a given region. Moreover, a number of studies based on spatio-temporal analysis of main-shock occurrence require preliminary declustering of the earthquake catalogs. Since various methods, relying on different physical/statistical assumptions, may lead to diverse classifications of earthquakes into main events and related events, we aim to investigate the classification differences among different declustering techniques. Accordingly, a formal selection and comparative analysis of earthquake clusters is carried out for the most relevant earthquakes in North-Eastern Italy, as reported in the local OGS-CRS bulletins, compiled at the National Institute of Oceanography and Experimental Geophysics since 1977. The comparison is then extended to selected earthquake sequences associated with a different seismotectonic setting, namely to events that occurred in the region struck by the recent Central Italy destructive earthquakes, making use of INGV data. Various techniques, ranging from classical space-time window methods to ad hoc manual identification of aftershocks, are applied for detection of earthquake clusters. In particular, a statistical method based on nearest-neighbor distances of events in the space-time-energy domain is considered. Results from cluster identification by the nearest-neighbor method turn out to be quite robust with respect to the time span of the input catalogue, as well as to the minimum magnitude cutoff. The identified clusters for the largest events reported in North-Eastern Italy since 1977 are consistent with those reported in earlier studies, which were aimed at detailed manual aftershock identification. The study shows that the data-driven approach, based on the nearest-neighbor distances, can be satisfactorily applied to decompose the seismic

  9. CLONING, SEQUENCE ANALYSIS, AND CHARACTERIZATION OF PUTATIVE BETA-LACTAMASE OF STENOTROPHOMONAS MALTOPHILIA

    Directory of Open Access Journals (Sweden)

    Chong Seng Shueh

    2012-10-01

    The main objective of the current study was to explore the function of the chromosomal putative beta-lactamase gene (smlt 0115) in clinical Stenotrophomonas maltophilia. Antibiotic susceptibility test (AST) screening for current antimicrobial drugs was done, and the Minimum Inhibitory Concentration (MIC) level towards beta-lactams was determined by E-test. The putative beta-lactamase gene of S. maltophilia was amplified via PCR with specific primers, then cloned into the pET-15 expression plasmid and transformed into Escherichia coli BL21. The gene was sequenced and analyzed. The expressed protein was purified by affinity chromatography and the kinetic assay was performed. S. maltophilia ATCC 13637 was included in this experiment. In addition, a hospital strain that exhibited resistance to a series of beta-lactams, including cefepime, was identified via AST and MIC; it was named the S2 strain and was considered in this study. Sequencing results showed that the putative beta-lactamase genes obtained from the ATCC 13637 and S2 strains were predicted to have cephalosporinase activity by the National Center for Biotechnology Information (NCBI) BLAST program. Differences between the sequences of the ATCC 13637 and S2 strains were found via ClustalW alignment software. The kinetic assay confirmed a cephalosporinase characteristic for the protein produced by the E. coli BL21 clone that overexpressed the putative beta-lactamase gene cloned under the control of an external promoter. Moreover, the expressed protein from the S2 strain had high catalytic activity against beta-lactam antibiotics, 14-fold higher than that of the expressed protein from the ATCC 13637 strain. This study represents the characterization analysis of the putative beta-lactamase gene (smlt 0115) of S. maltophilia. The presence of this gene in the chromosome of S. maltophilia suggested that the putative beta-lactamase gene (smlt 0115) plays a role in beta-lactam resistance.

  10. Sequence analysis and molecular characterization of Wnt4 gene in metacestodes of Taenia solium.

    Science.gov (United States)

    Hou, Junling; Luo, Xuenong; Wang, Shuai; Yin, Cai; Zhang, Shaohua; Zhu, Xueliang; Dou, Yongxi; Cai, Xuepeng

    2014-04-01

    Wnt proteins are a family of secreted glycoproteins that are evolutionarily conserved and considered to be involved in extensive developmental processes in metazoan organisms. The characterization of wnt genes may improve understanding of the parasite's development. In the present study, a wnt4 gene encoding 491 amino acids was amplified from cDNA of metacestodes of Taenia solium using reverse transcription PCR (RT-PCR). Bioinformatics tools were used for sequence analysis. The conserved domain of the wnt gene family was predicted. The expression profile of Wnt4 was investigated using real-time PCR. Wnt4 expression was found to be dramatically increased in scolex evaginated cysticerci when compared to invaginated cysticerci. In situ hybridization showed that the wnt4 gene was distributed in the posterior end of the worm along the primary body axis in evaginated cysticerci. These findings indicated that wnt4 may take part in the process of cysticerci evagination and play a role in scolex/bladder development of cysticerci of T. solium.

  11. Molecular characterization, sequence analysis and tissue expression of a porcine gene – MOSPD2

    Directory of Open Access Journals (Sweden)

    Yang Jie

    2017-01-01

    The full-length cDNA sequence of a porcine gene, MOSPD2, was amplified using the rapid amplification of cDNA ends method based on a pig expressed sequence tag sequence which was highly homologous to the coding sequence of the human MOSPD2 gene. Sequence prediction analysis revealed that the open reading frame of this gene encodes a protein of 491 amino acids that has high homology with the motile sperm domain-containing protein 2 (MOSPD2) of five species: horse (89%), human (90%), chimpanzee (89%), rhesus monkey (89%) and mouse (85%); thus, it could be defined as a porcine MOSPD2 gene. This novel porcine gene was assigned GeneID: 100153601. This gene is structured in 15 exons and 14 introns as revealed by computer-assisted analysis. The phylogenetic analysis revealed that the porcine MOSPD2 gene has a closer genetic relationship with the MOSPD2 gene of horse. Tissue expression analysis indicated that the porcine MOSPD2 gene is generally and differentially expressed in the spleen, muscle, skin, kidney, lung, liver, fat and heart. Our experiment is the first to establish the primary foundation for further research on the porcine MOSPD2 gene.

  12. Characterization of Liaoning cashmere goat transcriptome: sequencing, de novo assembly, functional annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Hongliang Liu

    Liaoning cashmere goat is a famous goat breed for cashmere wool. In order to increase the transcriptome data and accelerate genetic improvement for this breed, we performed de novo transcriptome sequencing to generate the first expressed sequence tag dataset for the Liaoning cashmere goat, using next-generation sequencing technology. Transcriptome sequencing of Liaoning cashmere goat on a Roche 454 platform yielded 804,601 high-quality reads. Clustering and assembly of these reads produced a non-redundant set of 117,854 unigenes, comprising 13,194 isotigs and 104,660 singletons. Based on similarity searches with known proteins, 17,356 unigenes were assigned to 6,700 GO categories, and the terms were summarized into three main GO categories and 59 sub-categories. 3,548 and 46,778 unigenes had significant similarity to existing sequences in the KEGG and COG databases, respectively. Comparative analysis revealed that 42,254 unigenes were aligned to 17,532 different sequences in NCBI non-redundant nucleotide databases. 97,236 (82.51%) unigenes were mapped to the 30 goat chromosomes. 35,551 (30.17%) unigenes were matched to 11,438 reported goat protein-coding genes. The remaining non-matched unigenes were further compared with cattle and human reference genes, and 67 putative new goat genes were discovered. Additionally, 2,781 potential simple sequence repeats were initially identified from all unigenes. The transcriptome of Liaoning cashmere goat was deep sequenced, de novo assembled, and annotated, providing abundant data to better understand the Liaoning cashmere goat transcriptome. The potential simple sequence repeats provide a material basis for future genetic linkage and quantitative trait loci analyses.
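    As an illustration of how potential simple sequence repeats like those mentioned in the abstract above might be flagged in a unigene sequence, the following Python sketch uses a regular expression to find perfect tandem repeats. The function name `find_ssrs`, the motif length range (1 to 6 bp), the minimum repeat count, and the toy sequence are all assumptions made for this example; the abstract does not state which tool or thresholds the study used.

```python
import re

def find_ssrs(seq, min_repeats=5, max_motif_len=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Thresholds are illustrative assumptions, not the study's criteria.
    """
    seq = seq.upper()
    # Group 2 is the shortest motif (1..max_motif_len bases);
    # group 1 is the whole run: the motif followed by >= min_repeats - 1 copies.
    pattern = re.compile(
        r"(([ACGT]{1,%d}?)\2{%d,})" % (max_motif_len, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(2)
        hits.append((m.start(), motif, len(m.group(1)) // len(motif)))
    return hits

# Toy sequence (hypothetical): six AT repeats and five AGC repeats.
toy = "GGC" + "AT" * 6 + "CCG" + "AGC" * 5 + "TT"
print(find_ssrs(toy))
```

    The lazy quantifier makes the pattern prefer the shortest repeating unit, so a run such as ATATATAT is reported as an AT repeat rather than an ATAT repeat. Imperfect or compound repeats are not handled by this sketch.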

  13. Characterization and sequence analysis of the F2 promoter from corynephage BFK20

    International Nuclear Information System (INIS)

    Koptides, M.; Ugorcakova, J.; Baloghova, E.; Bukovska, G.; Timko, J.

    1994-01-01

    The F2 promoter from corynephage BFK20 was isolated and characterized. It was functional in Escherichia coli and Corynebacterium glutamicum. Cloning of the F2 promoter into the pJUP05 promoter probe vector caused an increase in the neomycin phosphotransferase II specific activity. According to Northern blot hybridization, the nptII gene was expressed from the cloned F2 promoter. The apparent transcription start point in E. coli and C. glutamicum was determined. The -35 region of the F2 promoter showed high similarity to that of the E. coli promoter consensus sequence, but its -10 region was G+C rich and had no significant homology to the consensus. (author)

  14. Genomic Characterization for Parasitic Weeds of the Genus Striga by Sample Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Matt C. Estep

    2012-03-01

    Generation of ∼2200 Sanger sequence reads or ∼10,000 454 reads for seven Striga Lour. DNA samples (five species) allowed identification of the highly repetitive DNA content in these genomes. The 14 most abundant repeats in these species were identified and partially assembled. Annotation indicated that they represent nine long terminal repeat (LTR) retrotransposon families, three tandem satellite repeats, one long interspersed element (LINE) retroelement, and one DNA transposon. All of these repeats are most closely related to repetitive elements in other closely related plants and are not products of horizontal transfer from their host species. These repeats were differentially abundant in each species, with the LTR retrotransposons and satellite repeats most responsible for variation in genome size. Each species had some repetitive elements that were more abundant and some less abundant than the other species examined, indicating that no single element or any unilateral growth or decrease trend in genome behavior was responsible for variation in genome size and composition. Genome sizes were determined by flow sorting, and the values of 615 Mb [S. asiatica (L.) Kuntze], 1330 Mb [S. gesnerioides (Willd.) Vatke], 1425 Mb [S. hermonthica (Delile) Benth.] and 2460 Mb (S. forbesii Benth.) suggest a ploidy series, a prediction supported by repetitive DNA sequence analysis. Phylogenetic analysis using six chloroplast loci indicated the ancestral relationships of the five most agriculturally important species, with the unexpected result that the one parasite of dicotyledonous plants (S. gesnerioides) was found to be more closely related to some of the grass parasites than many of the grass parasites are to each other.
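    The ploidy-series suggestion in the abstract above can be made concrete with a quick ratio check. The sketch below divides each reported genome size by the smallest one; the four values in Mb are taken from the abstract, while the variable names and the reading of the ratios as roughly 2x and 4x are assumptions for illustration, not results stated by the authors.

```python
# Genome sizes in Mb as reported in the abstract, smallest to largest.
genome_sizes_mb = [615, 1330, 1425, 2460]

# Express each genome size as a multiple of the smallest genome.
smallest = min(genome_sizes_mb)
ratios = [round(size / smallest, 2) for size in genome_sizes_mb]

for size, ratio in zip(genome_sizes_mb, ratios):
    print(f"{size} Mb is {ratio}x the smallest genome")
```

    Ratios close to whole-number multiples are what a ploidy series would predict, although genome size alone cannot distinguish polyploidy from repeat expansion, which is presumably why the abstract also cites the repetitive DNA analysis as support.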

  15. Sequence Analysis and Characterization of Active Human Alu Subfamilies Based on the 1000 Genomes Pilot Project.

    Science.gov (United States)

    Konkel, Miriam K; Walker, Jerilyn A; Hotard, Ashley B; Ranck, Megan C; Fontenot, Catherine C; Storer, Jessica; Stewart, Chip; Marth, Gabor T; Batzer, Mark A

    2015-08-29

    The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic "young" Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  16. Analysis of multilocus sequence typing and virulence characterization of Listeria monocytogenes isolates from Chinese retail ready-to-eat food

    Directory of Open Access Journals (Sweden)

    Shi Wu

    2016-02-01

    Eighty Listeria monocytogenes isolates were obtained from Chinese retail ready-to-eat (RTE) food and were previously characterized with serotyping and antibiotic susceptibility tests. The aim of this study was to characterize the subtype and virulence potential of these L. monocytogenes isolates by multilocus sequence typing (MLST), virulence-associated genes, epidemic clones (ECs) and sequence analysis of the important virulence factor internalin A (inlA). The result of MLST revealed that these L. monocytogenes isolates belonged to 14 different sequence types (STs). With the exception of four new STs (ST804, ST805, ST806 and ST807), all other STs observed in this study have been associated with human listeriosis and outbreaks to varying extents. Six virulence-associated genes (inlA, inlB, inlC, inlJ, hly and llsX) were selected and their presence was investigated using PCR. All strains carried inlA, inlB, inlC, inlJ, and hly, whereas 38.8% (31/80) of strains harbored the listeriolysin S gene (llsX). A multiplex PCR assay was used to evaluate the presence of markers specific to epidemic clones of L. monocytogenes and identified 26.3% (21/80) of ECI in the 4b-4d-4e strains. Further study of inlA sequencing revealed that most strains contained the full-length InlA required for host cell invasion, whereas three mutations led to premature stop codons (PMSCs), including a novel PMSC at position 326 (GAA→TAA). MLST and inlA sequence analysis results were concordant, and different virulence potentials within isolates were observed. These findings suggest that L. monocytogenes isolates from RTE food in China could be virulent and be capable of causing human illness. Furthermore, the STs and virulence profiles of L. monocytogenes isolates have significant implications for epidemiological and public health studies of this pathogen.

  17. Analysis of Multilocus Sequence Typing and Virulence Characterization of Listeria monocytogenes Isolates from Chinese Retail Ready-to-Eat Food.

    Science.gov (United States)

    Wu, Shi; Wu, Qingping; Zhang, Jumei; Chen, Moutong; Guo, Weipeng

    2016-01-01

    Eighty Listeria monocytogenes isolates were obtained from Chinese retail ready-to-eat (RTE) food and were previously characterized with serotyping and antibiotic susceptibility tests. The aim of this study was to characterize the subtype and virulence potential of these L. monocytogenes isolates by multilocus sequence typing (MLST), virulence-associated genes, epidemic clones (ECs), and sequence analysis of the important virulence factor internalin A (inlA). The result of MLST revealed that these L. monocytogenes isolates belonged to 14 different sequence types (STs). With the exception of four new STs (ST804, ST805, ST806, and ST807), all other STs observed in this study have been associated with human listeriosis and outbreaks to varying extents. Six virulence-associated genes (inlA, inlB, inlC, inlJ, hly, and llsX) were selected and their presence was investigated using PCR. All strains carried inlA, inlB, inlC, inlJ, and hly, whereas 38.8% (31/80) of strains harbored the listeriolysin S gene (llsX). A multiplex PCR assay was used to evaluate the presence of markers specific to epidemic clones of L. monocytogenes and identified 26.3% (21/80) of ECI in the 4b-4d-4e strains. Further study of inlA sequencing revealed that most strains contained the full-length InlA required for host cell invasion, whereas three mutations led to premature stop codons (PMSCs), including a novel PMSC at position 326 (GAA→TAA). MLST and inlA sequence analysis results were concordant, and different virulence potentials within isolates were observed. These findings suggest that L. monocytogenes isolates from RTE food in China could be virulent and be capable of causing human illness. Furthermore, the STs and virulence profiles of L. monocytogenes isolates have significant implications for epidemiological and public health studies of this pathogen.
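    To illustrate why a GAA→TAA change of the kind reported in the abstract above truncates a protein, the following Python sketch translates a toy coding sequence until it reaches the first stop codon. The partial codon table, the `translate` helper, and both toy sequences are hypothetical and are not taken from the inlA gene; only the substitution of GAA (glutamate) by TAA (stop) comes from the abstract.

```python
# Partial standard genetic code: only the codons used in the toy example.
CODON_TABLE = {
    "ATG": "M", "GAA": "E", "AAA": "K", "GGT": "G",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

def translate(cds):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

wild_type = "ATGAAAGAAGGTAAATAA"  # toy sequence: M K E G K stop
mutant = "ATGAAATAAGGTAAATAA"     # same sequence with GAA -> TAA at codon 3

print(translate(wild_type))
print(translate(mutant))
```

    In the toy mutant, translation ends two residues in because the glutamate codon has become a stop codon, which mirrors how a premature stop codon would be expected to yield a truncated internalin A.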

  18. Cloning, sequence analysis, expression of Cyathus bulleri laccase in Pichia pastoris and characterization of recombinant laccase

    Directory of Open Access Journals (Sweden)

    Garg Neha

    2012-10-01

    Background: Laccases are blue multi-copper oxidases and catalyze the oxidation of phenolic and non-phenolic compounds. There is considerable interest in using these enzymes for dye degradation as well as for synthesis of aromatic compounds. Laccases are produced at relatively low levels and, sometimes, as isozymes in the native fungi. The investigation of properties of individual enzymes therefore becomes difficult. The goal of this study was to over-produce a previously reported laccase from Cyathus bulleri using the well-established expression system of Pichia pastoris and examine and compare the properties of the recombinant enzyme with that of the native laccase. Results: In this study, complete cDNA encoding laccase (Lac) from the white rot fungus Cyathus bulleri was amplified by RACE-PCR, cloned and expressed in the culture supernatant of Pichia pastoris under the control of the alcohol oxidase (AOX1) promoter. The coding region consisted of 1,542 bp and encodes a protein of 513 amino acids with a signal peptide of 16 amino acids. The deduced amino acid sequence of the matured protein displayed high homology with laccases from Trametes versicolor and Coprinus cinereus. The sequence analysis indicated the presence of Glu 460 and Ser 113 and an LEL tripeptide at the position known to influence the redox potential of laccases, placing this enzyme as a high redox enzyme. Addition of copper sulfate to the production medium enhanced the level of laccase by about 12-fold to a final activity of 7200 U L⁻¹. The recombinant laccase (rLac) was purified by ~4-fold to a specific activity of ~85 U mg⁻¹ protein. A detailed study of thermostability, chloride and solvent tolerance of the rLac indicated improvement in the first two properties when compared to the native laccase (nLac). An altered glycosylation pattern, identified by peptide mass fingerprinting, was proposed to contribute to the altered properties of the rLac. Conclusion: Laccase of C. bulleri was

  19. Cloning, sequence analysis, expression of Cyathus bulleri laccase in Pichia pastoris and characterization of recombinant laccase.

    Science.gov (United States)

    Garg, Neha; Bieler, Nora; Kenzom, Tenzin; Chhabra, Meenu; Ansorge-Schumacher, Marion; Mishra, Saroj

    2012-10-23

    Laccases are blue multi-copper oxidases and catalyze the oxidation of phenolic and non-phenolic compounds. There is considerable interest in using these enzymes for dye degradation as well as for synthesis of aromatic compounds. Laccases are produced at relatively low levels and, sometimes, as isozymes in the native fungi. The investigation of properties of individual enzymes therefore becomes difficult. The goal of this study was to over-produce a previously reported laccase from Cyathus bulleri using the well-established expression system of Pichia pastoris and examine and compare the properties of the recombinant enzyme with that of the native laccase. In this study, complete cDNA encoding laccase (Lac) from the white rot fungus Cyathus bulleri was amplified by RACE-PCR, cloned and expressed in the culture supernatant of Pichia pastoris under the control of the alcohol oxidase (AOX1) promoter. The coding region consisted of 1,542 bp and encodes a protein of 513 amino acids with a signal peptide of 16 amino acids. The deduced amino acid sequence of the matured protein displayed high homology with laccases from Trametes versicolor and Coprinus cinereus. The sequence analysis indicated the presence of Glu 460 and Ser 113 and an LEL tripeptide at the position known to influence the redox potential of laccases, placing this enzyme as a high redox enzyme. Addition of copper sulfate to the production medium enhanced the level of laccase by about 12-fold to a final activity of 7200 U L⁻¹. The recombinant laccase (rLac) was purified by ~4-fold to a specific activity of ~85 U mg⁻¹ protein. A detailed study of thermostability, chloride and solvent tolerance of the rLac indicated improvement in the first two properties when compared to the native laccase (nLac). An altered glycosylation pattern, identified by peptide mass fingerprinting, was proposed to contribute to the altered properties of the rLac. Laccase of C. bulleri was successfully produced extra-cellularly to a high level of 7200

  20. Genome wide characterization of simple sequence repeats in watermelon genome and their application in comparative mapping and genetic diversity analysis.

    Science.gov (United States)

    Zhu, Huayu; Song, Pengyao; Koo, Dal-Hoe; Guo, Luqin; Li, Yanman; Sun, Shouru; Weng, Yiqun; Yang, Luming

    2016-08-05

    Microsatellite markers are one of the most informative and versatile DNA-based markers used in plant genetic research, but their development has traditionally been difficult and costly. Whole genome sequencing with next-generation sequencing (NGS) technologies provides large amounts of sequence data to develop numerous microsatellite markers at whole genome scale. SSR markers have great advantage in cross-species comparisons and allow investigation of karyotype and genome evolution through highly efficient computational approaches such as in silico PCR. Here we described genome wide development and characterization of SSR markers in the watermelon (Citrullus lanatus) genome, which were then used in comparative analysis with two other important crop species in the Cucurbitaceae family: cucumber (Cucumis sativus L.) and melon (Cucumis melo L.). We further applied these markers in evaluating the genetic diversity and population structure in watermelon germplasm collections. A total of 39,523 microsatellite loci were identified from the watermelon draft genome with an overall density of 111 SSRs/Mbp, and 32,869 SSR primers were designed with suitable flanking sequences. The dinucleotide SSRs were the most common type, representing 34.09 % of the total SSR loci, and the AT-rich motifs were the most abundant in all nucleotide repeat types. In silico PCR analysis identified 832 and 925 SSR markers with each having a single amplicon in the cucumber and melon draft genome, respectively. Comparative analysis with these cross-species SSR markers revealed complicated mosaic patterns of syntenic blocks among the genomes of the three species. In addition, genetic diversity analysis of 134 watermelon accessions with 32 highly informative SSR loci placed these lines into two groups with all accessions of C. lanatus var. citroides and three accessions of C. colocynthis clustered in one group and all accessions of C. lanatus var. lanatus and the remaining accessions of C. colocynthis
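    A minimal sketch of how perfect microsatellites could be located in a sequence, and how a locus count translates into a density per Mbp. The repeat thresholds, function names and toy sequence are illustrative assumptions, not the parameters or software used in the study.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    if min_repeats is None:
        # Hypothetical thresholds: motif length -> minimum number of repeats.
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (AA, ATAT, ...),
            # so each locus is reported once under its shortest motif.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def ssr_density_per_mbp(n_loci, genome_bp):
    """SSR loci per million base pairs of searched sequence."""
    return n_loci / (genome_bp / 1e6)


toy = "GGC" + "AT" * 8 + "CCT" + "AAG" * 5 + "T"
print(find_ssrs(toy))
```

    Dividing the reported 39,523 loci by the reported density of 111 SSRs/Mbp implies that roughly 356 Mbp of sequence were searched; the reported 32,869 primers correspond to about 83 % of the loci.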

  1. Cloning, sequence analysis, and characterization of the genes involved in isoprimeverose metabolism in Lactobacillus pentosus

    NARCIS (Netherlands)

    Chaillou, S.; Lokman, B.C.; Leer, R.J.; Posthuma, C.; Postma, P.W.; Pouwels, P.H.

    1998-01-01

    Two genes, xylP and xylQ, from the xylose regulon of Lactobacillus pentosus were cloned and sequenced. Together with the repressor gene of the regulon, xylR, the xylPQ genes form an operon which is inducible by xylose and which is transcribed from a promoter located 145 bp upstream of xylP. A

  2. Characterization of Sri Lanka rabies virus isolates using nucleotide sequence analysis of nucleoprotein gene.

    Science.gov (United States)

    Arai, Y T; Takahashi, H; Kameoka, Y; Shiino, T; Wimalaratne, O; Lodmell, D L

    2001-01-01

    Thirty-four suspected rabid brain samples from 2 humans, 24 dogs, 4 cats, 2 mongooses, 1 jackal and 1 water buffalo were collected in 1995-1996 in Sri Lanka. Total RNA was extracted directly from brain suspensions and examined using a one-step reverse transcription-polymerase chain reaction (RT-PCR) for the rabies virus nucleoprotein (N) gene. Twenty-eight samples were found positive for the virus N gene by RT-PCR and also for the virus antigens by fluorescent antibody (FA) test. Rabies virus isolates obtained from different animal species in different regions of Sri Lanka were genetically homogeneous. Sequences of 203 nucleotides (nt)-long RT-PCR products obtained from 16 of 27 samples were found identical. Sequences of 1350 nt of N genes of 14 RT-PCR products were determined. The Sri Lanka isolates under study formed a specific cluster that also included an earlier isolate from India but did not include the known isolates from China, Thailand, Malaysia, Israel, Iran, Oman, Saudi Arabia, Russia, Nepal, Philippines, Japan and several other countries. These results suggest that one type of rabies virus is circulating among humans, dogs, cats, mongooses, jackals and water buffalo living near Colombo City and in five other remote regions of Sri Lanka.

  3. Characterization of European Yersinia enterocolitica 1A strains using restriction fragment length polymorphism and multilocus sequence analysis.

    Science.gov (United States)

    Murros, A; Säde, E; Johansson, P; Korkeala, H; Fredriksson-Ahomaa, M; Björkroth, J

    2016-10-01

    Yersinia enterocolitica is currently divided into two subspecies: subsp. enterocolitica including highly pathogenic strains of biotype 1B and subsp. palearctica including nonpathogenic strains of biotype 1A and moderately pathogenic strains of biotypes 2-5. In this work, we characterized 162 Y. enterocolitica strains of biotype 1A and 50 strains of biotypes 2-4 isolated from human, animal and food samples by restriction fragment length polymorphism using the HindIII restriction enzyme. Phylogenetic relatedness of 20 representative Y. enterocolitica strains including 15 biotype 1A strains was further studied by the multilocus sequence analysis of four housekeeping genes (glnA, gyrB, recA and HSP60). In all the analyses, biotype 1A strains formed a separate genomic group, which differed from Y. enterocolitica subsp. enterocolitica and from the strains of biotypes 2-4 of Y. enterocolitica subsp. palearctica. Based on these results, biotype 1A strains considered nonpathogenic should not be included in subspecies palearctica containing pathogenic strains of biotypes 2-5. Yersinia enterocolitica strains are currently divided into six biotypes and two subspecies. Strains of biotype 1A, which are phenotypically and genotypically very heterogeneous, are classified as subspecies palearctica. In this study, European Y. enterocolitica 1A strains isolated from both human and nonhuman sources were characterized using restriction fragment length polymorphism and multilocus sequence analysis. The European biotype 1A strains formed a separate group, which differed from strains belonging to subspecies enterocolitica and palearctica. This may indicate that the current division between the two subspecies is not sufficient considering the strain diversity within Y. enterocolitica. © 2016 The Society for Applied Microbiology.

  4. Metagenomic analysis and functional characterization of the biogas microbiome using high throughput shotgun sequencing and a novel binning strategy.

    Science.gov (United States)

    Campanaro, Stefano; Treu, Laura; Kougias, Panagiotis G; De Francisci, Davide; Valle, Giorgio; Angelidaki, Irini

    2016-01-01

    Biogas production is an economically attractive technology that has gained momentum worldwide over the past years. Biogas is produced by a biologically mediated process, widely known as "anaerobic digestion." This process is performed by a specialized and complex microbial community, in which different members have distinct roles in the establishment of a collective organization. Deciphering the complex microbial community engaged in this process is interesting both for unraveling the network of bacterial interactions and for applicability potential to the derived knowledge. In this study, we dissect the bioma involved in anaerobic digestion by means of high throughput Illumina sequencing (~51 gigabases of sequence data), disclosing nearly one million genes and extracting 106 microbial genomes by a novel strategy combining two binning processes. Microbial phylogeny and putative taxonomy performed using >400 proteins revealed that the biogas community is a trove of new species. A new approach based on functional properties as per network representation was developed to assign roles to the microbial species. The organization of the anaerobic digestion microbiome is resembled by a funnel concept, in which the microbial consortium presents a progressive functional specialization while reaching the final step of the process (i.e., methanogenesis). Key microbial genomes encoding enzymes involved in specific metabolic pathways, such as carbohydrates utilization, fatty acids degradation, amino acids fermentation, and syntrophic acetate oxidation, were identified. Additionally, the analysis identified a new uncultured archaeon that was putatively related to Methanomassiliicoccales but surprisingly having a methylotrophic methanogenic pathway. This study is a pioneer research on the phylogenetic and functional characterization of the microbial community populating biogas reactors. By applying for the first time high-throughput sequencing and a novel binning strategy, the

  5. Identification, sequence analysis, and characterization of serine/threonine protein kinase 17A from Clonorchis sinensis.

    Science.gov (United States)

    Huang, Lisi; Lv, Xiaoli; Huang, Yan; Hu, Yue; Yan, Haiyan; Zheng, Minghui; Zeng, Hua; Li, Xuerong; Liang, Chi; Wu, Zhongdao; Yu, Xinbing

    2014-05-01

    This is the first report of a novel protein from Clonorchis sinensis (C. sinensis), serine/threonine protein kinase 17A (CsSTK17A), which is a member of the death-associated protein kinase (DAPK) family known to regulate diverse biological processes. The full-length sequence encoding CsSTK17A was isolated from a C. sinensis adult cDNA plasmid library. Two transcribed isoforms of the gene were identified from the genome of C. sinensis. CsSTK17A contains a kinase domain at the N-terminus that shares a degree of conservation with the DAPK families. In addition, the catalytic domain contains 11 subdomains conserved among STKs and shares the highest identity with the STK from Schistosoma mansoni (55.9%). The three-dimensional structure of CsSTK17A displays the canonical STK fold, including the helix C, P-loop, and the activation loop. We obtained recombinant CsSTK17A (rCsSTK17A) and anti-rCsSTK17A IgG. The rCsSTK17A could be probed by anti-rCsSTK17A rat serum, C. sinensis-infected rat serum and the sera from rats immunized with C. sinensis excretory-secretory products, indicating that it is a circulating antigen possessing strong immunocompetence. Moreover, quantitative RT-PCR and western blotting analyses revealed that CsSTK17A exhibited the highest mRNA and protein expression level in eggs, followed by metacercariae and adult worms. Intriguingly, in the immunolocalization assay, CsSTK17A was intensively localized to the operculum region of eggs in the uterus, as well as the vitelline gland of both adult worm and metacercaria, implying that the protein was associated with the reproduction and development of C. sinensis. Overall, these fundamental studies might contribute to further research on signaling systems of the parasite.

  6. Genetic characterization of Australian Mycoplasma bovis isolates through whole genome sequencing analysis

    DEFF Research Database (Denmark)

    Parker, Alysia M.; Shukla, Ankit; House, John K.

    2016-01-01

    Mycoplasma bovis is a major pathogen in cattle causing mastitis, arthritis and pneumonia. First isolated in Australian cattle in 1970, M. bovis has persisted causing serious disease in infected herds. To date, genetic analysis of Australian M. bovis isolates has not been performed. With whole gen...

  7. Biological sequence analysis

    DEFF Research Database (Denmark)

    Durbin, Richard; Eddy, Sean; Krogh, Anders Stærmose

    This book provides an up-to-date and tutorial-level overview of sequence analysis methods, with particular emphasis on probabilistic modelling. Discussed methods include pairwise alignment, hidden Markov models, multiple alignment, profile searches, RNA secondary structure analysis, and phylogene...

  8. Characterization of shark complement factor I gene(s): genomic analysis of a novel shark-specific sequence.

    Science.gov (United States)

    Shin, Dong-Ho; Webb, Barbara M; Nakao, Miki; Smith, Sylvia L

    2009-07-01

    Complement factor I is a crucial regulator of mammalian complement activity. Very little is known of complement regulators in non-mammalian species. We isolated and sequenced four highly similar complement factor I cDNAs from the liver of the nurse shark (Ginglymostoma cirratum), designated as GcIf-1, GcIf-2, GcIf-3 and GcIf-4 (previously referred to as nsFI-a, -b, -c and -d) which encode 689, 673, 673 and 657 amino acid residues, respectively. They share 95% (shark-specific sequence between the leader peptide (LP) and the factor I membrane attack complex (FIMAC) domain. The cDNA sequences differ only in the size and composition of the shark-specific region (SSR). Sequence analysis of each SSR has identified within the region two novel short sequences (SS1 and SS2) and three repeat sequences (RS1-3). Genomic analysis has revealed the existence of three introns between the leader peptide and the FIMAC domain, tentatively designated intron 1, intron 2, and intron 3 which span 4067, 2293 and 2082bp, respectively. Southern blot analysis suggests the presence of a single gene copy for each cDNA type. Phylogenetic analysis suggests that complement factor I of cartilaginous fish diverged prior to the emergence of mammals. All four GcIf cDNA species are expressed in four different tissues and the liver is the main tissue in which expression level of all four is high. This suggests that the expression of GcIf isotypes is tissue-dependent.

  9. Image sequence analysis

    CERN Document Server

    1981-01-01

    The processing of image sequences has a broad spectrum of important applications including target tracking, robot navigation, bandwidth compression of TV conferencing video signals, studying the motion of biological cells using microcinematography, cloud tracking, and highway traffic monitoring. Image sequence processing involves a large amount of data. However, because of the progress in computer, LSI, and VLSI technologies, we have now reached a stage when many useful processing tasks can be done in a reasonable amount of time. As a result, research and development activities in image sequence analysis have recently been growing at a rapid pace. An IEEE Computer Society Workshop on Computer Analysis of Time-Varying Imagery was held in Philadelphia, April 5-6, 1979. A related special issue of the IEEE Transactions on Pattern Analysis and Machine Intelligence was published in November 1980. The IEEE Computer magazine has also published a special issue on the subject in 1981. The purpose of this book ...

  10. Metagenomic analysis and functional characterization of the biogas microbiome using high throughput shotgun sequencing and a novel binning strategy

    DEFF Research Database (Denmark)

    Campanaro, Stefano; Treu, Laura; Kougias, Panagiotis

    2016-01-01

    Biogas production is an economically attractive technology that has gained momentum worldwide over the past years. Biogas is produced by a biologically mediated process, widely known as "anaerobic digestion." This process is performed by a specialized and complex microbial community, in which...... performed using >400 proteins revealed that the biogas community is a trove of new species. A new approach based on functional properties as per network representation was developed to assign roles to the microbial species. The organization of the anaerobic digestion microbiome is resembled by a funnel...... on the phylogenetic and functional characterization of the microbial community populating biogas reactors. By applying for the first time high-throughput sequencing and a novel binning strategy, the identified genes were anchored to single genomes providing a clear understanding of their metabolic pathways...

  11. Sequence analysis and characterization of rolling-circle replicating plasmid pVCM01 from Salmonella enterica

    Directory of Open Access Journals (Sweden)

    Penido, A. F. B.

    2013-12-01

    Aims: Characterization of cryptic plasmid pVCM01 (accession number JX133088) isolated from Salmonella enterica Enteritidis. Methodology and results: The complete sequence of pVCM01 was obtained. This plasmid possesses 1981 bp, with a G+C content of 57%, in agreement with the range of Salmonella genomic DNA. pVCM01 has a high degree of similarity to pB and pJ plasmids. It possesses six main open reading frames, only one of which has a very high degree of amino acid identity with a protein involved in rolling-circle-like replication (RCR). Based on the sequence similarities, the pVCM01 plasmid belongs to the pC194/pUB110 rolling-circle replicating plasmid family. The Rep protein of pVCM01 possesses the motifs FLTLTVRN, HPHFHTL and SGDGYVKHERW, which were present in all Rep proteins. Conclusion, significance and impact of study: The small size of the pVCM01 plasmid and its stability in E. coli cells make it an attractive candidate for developing new vectors, such as cloning and/or expression vectors.

  12. Characterization of the Complete Mitochondrial Genome Sequence of the Globose Head Whiptail Cetonurus globiceps (Gadiformes: Macrouridae) and Its Phylogenetic Analysis.

    Directory of Open Access Journals (Sweden)

    Xiaofeng Shi

    The particular environmental characteristics of deep water, such as its immense scale and high pressure systems, present technological problems that have hindered research aimed at broadening our knowledge of deep-sea fish. Here, we described the mitogenome sequence of a deep-sea fish, Cetonurus globiceps. The genome is 17,137 bp in length, with a standard set of 22 transfer RNA genes (tRNAs), two ribosomal RNA genes, 13 protein-coding genes, and two typical non-coding control regions. Additionally, a 70 bp tRNA(Thr)-tRNA(Pro) intergenic spacer is present. The C. globiceps mitogenome exhibited strand-specific asymmetry in nucleotide composition. The AT-skew and GC-skew values in the whole genome of C. globiceps were 0 and -0.2877, respectively, revealing that the H-strand had equal amounts of A and T and that the overall nucleotide composition was C skewed. All of the tRNA genes could be folded into cloverleaf secondary structures, while the secondary structure of tRNA(Ser)(AGY) lacked a discernible dihydrouridine stem. By comparing this genome sequence with the recognition sites in teleost species, several conserved sequence blocks were identified in the control region. However, the GTGGG-box, the typical characteristic of conserved sequence block E (CSB-E), was absent. Notably, tandem repeats were identified in the 3' portion of the control region. No similar repetitive motifs are present in most other gadiform species. Phylogenetic analysis based on 12 protein coding genes provided strong support that C. globiceps was the most derived in the clade. Some relationships, however, are in contrast with those presented in previous studies. This study enriches our knowledge of mitogenomes of the genus Cetonurus and provides valuable information on the evolution of Macrouridae mtDNA and deep-sea fish.
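    The skew values quoted above are conventionally computed as AT-skew = (A - T)/(A + T) and GC-skew = (G - C)/(G + C) over one strand. A minimal sketch of that calculation follows; the function names and the toy sequence are illustrative assumptions, and the record does not state which software the authors used.

```python
def skew(seq, x, y):
    """Compositional skew (x - y) / (x + y) for two bases on one strand."""
    seq = seq.upper()
    nx, ny = seq.count(x), seq.count(y)
    if nx + ny == 0:
        raise ValueError("sequence contains neither %s nor %s" % (x, y))
    return (nx - ny) / (nx + ny)


def at_skew(seq):
    return skew(seq, "A", "T")


def gc_skew(seq):
    return skew(seq, "G", "C")


# Toy strand, not real data: equal A and T, three times more C than G.
toy = "AATT" + "GCCC"
print(at_skew(toy), gc_skew(toy))
```

    An AT-skew of 0 means A and T occur equally often on the analysed strand, and a negative GC-skew means C outnumbers G, which matches the interpretation given in the abstract.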

  13. Sequence analysis-based characterization and identification of neurovirulence-associated variants of 36 EV71 strains from China.

    Science.gov (United States)

    Xu, Jun; Wang, Fang; Zhao, Desheng; Liu, Jiang; Su, Hong; Wang, Baolong

    2018-03-30

    Enterovirus 71 (EV71) is the main pathogen of hand-foot-mouth disease (HFMD) and causes several neurological complications. As new strains of EV71 are constantly discovered, it is important to understand the genomic characteristics of the viruses and the mechanism of virulence. Herein, we isolated five strains of EV71 from HFMD patients with or without neurovirulence and sequenced their whole genomes. We then performed whole genome sequence analysis of a total of 36 EV71 strains. The phylogenetic analysis of the VP1 region revealed that all five isolated strains clustered into C4a of the C4 subgenotype. In addition, by comparing the complete genome sequences of the 36 strains, 253 variable amino acid positions were found, 14 of which were identified to be associated with neurovirulence (P < 0.05). Moreover, a similar combination of amino acid variants was identified in four strains without neurovirulence, indicating that this type of variant pattern might be associated with avirulence. The strains with neurovirulence appeared to be distinguished from those without neurovirulence by the variants in the VP1 and P2 regions, implying that VP1 and P2 are the important regions associated with neurovirulence. Indeed, 3-D modeling of the VP1 and P2 regions of non-neurovirulent and neurovirulent strains revealed that the different variants resulted in different protein structures and amino acid composition of the ligand binding site, which might account for their difference in neurovirulence. In summary, our study reveals that 14 variable amino acid positions of the VP1, P2 and P3 regions are related to virulence and that mutations in the capsid proteins of EV71 might contribute to neurovirulence. © 2018 Wiley Periodicals, Inc.

  14. Identification and characterization of large DNA deletions affecting oil quality traits in soybean seeds through transcriptome sequencing analysis.

    Science.gov (United States)

    Goettel, Wolfgang; Ramirez, Martha; Upchurch, Robert G; An, Yong-Qiang Charles

    2016-08-01

    Identification and characterization of a 254-kb genomic deletion on a duplicated chromosome segment that resulted in a low level of palmitic acid in soybean seeds using transcriptome sequencing. A large number of soybean genotypes varying in seed oil composition and content have been identified. Understanding the molecular mechanisms underlying these variations is important for breeders to effectively utilize them as a genetic resource. Through design and application of a bioinformatics approach, we identified nine co-regulated gene clusters by comparing seed transcriptomes of nine soybean genotypes varying in oil composition and content. We demonstrated that four gene clusters in the genotypes M23, Jack and N0304-303-3 coincided with large-scale genome rearrangements. The co-regulated gene clusters in M23 and Jack mapped to a previously described 164-kb deletion and a copy number amplification of the Rhg1 locus, respectively. The coordinately down-regulated gene clusters in N0304-303-3 were caused by a 254-kb deletion containing 19 genes including a fatty acyl-ACP thioesterase B gene (FATB1a). This deletion was associated with reduced palmitic acid content in seeds and was the molecular cause of a previously reported nonfunctional FATB1a allele, fap nc. The M23 and N0304-303-3 deletions were located in duplicated genome segments retained from the Glycine-specific whole genome duplication that occurred 13 million years ago. The homoeologous genes in these duplicated regions shared a strong similarity in both their encoded protein sequences and transcript accumulation levels, suggesting that they may have conserved and important functions in seeds. The functional conservation of homoeologous genes may result in genetic redundancy and gene dosage effects for their associated seed traits, explaining why the large deletion did not cause lethal effects or completely eliminate palmitic acid in N0304-303-3.

  15. SSH analysis of endosperm transcripts and characterization of heat stress regulated expressed sequence tags in bread wheat

    Directory of Open Access Journals (Sweden)

    Suneha Goswami

    2016-08-01

    Heat stress is one of the major problems in agriculturally important cereal crops, especially wheat. Here, we have constructed a subtracted cDNA library from the endosperm of HS-treated (42°C for 2 h) wheat cv. HD2985 by suppression subtractive hybridization (SSH). We identified ~550 recombinant clones ranging from 200 to 500 bp with an average size of 300 bp. Sanger sequencing was performed with 205 positive clones to generate the differentially expressed sequence tags (ESTs). Most of the ESTs were observed to be localized on the long arm of chromosome 2A and associated with heat stress tolerance and metabolic pathways. Identified ESTs were searched by BLAST against the Ensembl, TriFLD and TIGR databases, and the predicted CDS were translated and aligned with the protein sequences available in the Pfam and InterProScan 5 databases to predict the differentially expressed proteins (DEPs). We observed eight different types of post-translational modifications (PTMs) in the DEPs corresponding to the cloned ESTs: 147 sites with phosphorylation, 21 sites with sumoylation, 237 with palmitoylation, 96 sites with S-nitrosylation, 3066 calpain cleavage sites, and 103 tyrosine nitration sites, predicted to sense the heat stress and regulate the expression of stress genes. Twelve DEPs were observed to have transmembrane helices (TMH) in their structure, predicted to play the role of sensors of HS. Quantitative Real-Time PCR of randomly selected ESTs showed very high relative expression of HSP17 under HS; up-regulation was observed more in wheat cv. HD2985 (thermotolerant) as compared to HD2329 (thermosusceptible) during grain-filling. The abundance of transcripts was further validated through northern blot analysis. The ESTs and their corresponding DEPs can be used as molecular markers for screening or targeted precision breeding programs. PTMs identified in the DEPs can be used to elucidate the thermotolerance mechanism of wheat – a novel step towards the development of

  16. Comparative genome analysis and characterization of the Salmonella Typhimurium strain CCRJ_26 isolated from swine carcasses using whole-genome sequencing approach.

    Science.gov (United States)

    Panzenhagen, P H N; Cabral, C C; Suffys, P N; Franco, R M; Rodrigues, D P; Conte-Junior, C A

    2018-04-01

    Salmonella pathogenicity relies on virulence factors, many of which are clustered within the Salmonella pathogenicity islands. Salmonella also harbours mobile genetic elements such as virulence plasmids, prophage-like elements and antimicrobial resistance genes which can contribute to increasing its pathogenicity. Here, we have genetically characterized a selected S. Typhimurium strain (CCRJ_26) from our previous study with a multidrug-resistant profile and a high-frequency PFGE clonal profile which apparently persists in the pork production centre of Rio de Janeiro State, Brazil. By whole-genome sequencing, we described the virulence content of the strain's genome and characterized the repertoire of bacterial plasmids, antibiotic resistance genes and prophage-like elements. Here, we have shown evidence that the genome of strain CCRJ_26 possibly represents a virulence-associated phenotype which may be potentially virulent in human infection. Whole-genome sequencing technologies are still costly and remain underexplored for applied microbiology in Brazil. Hence, this genomic description of S. Typhimurium strain CCRJ_26 will help in future molecular epidemiological studies. The analysis described here reveals a quick and useful pipeline for bacterial virulence characterization using a whole-genome sequencing approach. © 2018 The Society for Applied Microbiology.

  17. Characterization of X chromosome inactivation using integrated analysis of whole-exome and mRNA sequencing.

    Directory of Open Access Journals (Sweden)

    Szabolcs Szelinger

    In females, X chromosome inactivation (XCI) is an epigenetic, gene dosage compensatory mechanism by inactivation of one copy of X in cells. Random XCI of one of the parental chromosomes results in an approximately equal proportion of cells expressing alleles from either the maternally or paternally inherited active X, and is defined by the XCI ratio. A skewed XCI ratio is suggestive of non-random inactivation, which can play an important role in X-linked genetic conditions. Current methods rely on an indirect, semi-quantitative DNA methylation-based assay to estimate the XCI ratio. Here we report a direct approach to estimate the XCI ratio by integrated, family-trio based whole-exome and mRNA sequencing using phase-by-transmission of alleles coupled with allele-specific expression analysis. We applied this method to in silico data and to a clinical patient with mild cognitive impairment but no clear diagnosis or understanding of the molecular mechanism underlying the phenotype. Simulation showed that phased and unphased heterozygous allele expression can be used to estimate the XCI ratio. Segregation analysis of the patient's exome uncovered a de novo, interstitial, 1.7 Mb deletion on Xp22.31 that originated on the paternally inherited X and has previously been associated with a heterogeneous, neurological phenotype. Phased, allelic expression data suggested an 83∶20 moderately skewed XCI that favored the expression of the maternally inherited, cytogenetically normal X and suggested that the deleterious effect of the de novo event on the paternal copy may be offset by skewed XCI that favors expression of the wild-type X. This study shows the utility of an integrated sequencing approach in XCI ratio estimation.
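    To illustrate the idea of estimating an XCI ratio from phased allele-specific expression, the sketch below pools RNA-seq read counts over heterozygous X-linked sites whose alleles have already been assigned to the maternal or paternal chromosome. The input layout, the function name and the counts are hypothetical, and simple pooling is an assumption made for illustration; it is not the authors' published pipeline.

```python
def xci_ratio(site_counts):
    """Estimate the XCI ratio as (maternal %, paternal %) of expressed reads.

    site_counts: iterable of (maternal_reads, paternal_reads) tuples, one per
    phased heterozygous site on the X chromosome (hypothetical input layout).
    """
    sites = list(site_counts)
    maternal = sum(m for m, _ in sites)
    paternal = sum(p for _, p in sites)
    total = maternal + paternal
    if total == 0:
        raise ValueError("no allele-specific reads supplied")
    return 100 * maternal / total, 100 * paternal / total


# Made-up counts for two phased sites, for illustration only.
print(xci_ratio([(40, 10), (45, 5)]))
```

    Reads carrying the maternal allele are taken to come from cells in which the maternally inherited X is active, so a pooled share far from 50 % on either side would point to skewed inactivation.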

  18. Sequence analysis and molecular characterization of genes required for the biosynthesis of type 1 capsular polysaccharide in Staphylococcus aureus.

    Science.gov (United States)

    Lin, W S; Cunneen, T; Lee, C Y

    1994-11-01

    We previously cloned a 19.4-kb DNA region containing a cluster of genes affecting type 1 capsule production from Staphylococcus aureus M. Subcloning experiments showed that these capsule (cap) genes are localized in a 14.6-kb region. Sequencing analysis of the 14.6-kb fragment revealed 13 open reading frames (ORFs). Using complementation tests, we have mapped a collection of Cap- mutations in 10 of the 13 ORFs, indicating that these 10 genes are involved in capsule biosynthesis. The requirement for the remaining three ORFs in the synthesis of the capsule was demonstrated by constructing site-specific mutations corresponding to each of the three ORFs. Using an Escherichia coli S30 in vitro transcription-translation system, we clearly identified 7 of the 13 proteins predicted from the ORFs. Homology search between the predicted proteins and those in the data bank showed very high homology (52.3% identity) between capL and vipA, moderate homology (29% identity) between capI and vipB, and limited homology (21.8% identity) between capM and vipC. The vipA, vipB, and vipC genes have been shown to be involved in the biosynthesis of Salmonella typhi Vi antigen, a homopolymer polysaccharide consisting of N-acetylgalactosamino uronic acid, which is also one of the components of the staphylococcal type 1 capsule. The homology between these sets of genes therefore suggests that capL, capI, and capM may be involved in the biosynthesis of amino sugar, N-acetylgalactosamino uronic acid. In addition, the search showed that CapG aligned well with the consensus sequence of a family of acetyltransferases from various prokaryotic organisms, suggesting that CapG may be an acetyltransferase. Using the isogenic Cap- and Cap+ strains constructed in this study, we have confirmed that type 1 capsule is an important virulence factor in a mouse lethality test.

  19. Cell-bound lipases from Burkholderia sp. ZYB002: gene sequence analysis, expression, enzymatic characterization, and 3D structural model.

    Science.gov (United States)

    Shu, Zhengyu; Lin, Hong; Shi, Shaolei; Mu, Xiangduo; Liu, Yanru; Huang, Jianzhong

    2016-05-03

    The whole-cell lipase from Burkholderia cepacia has been used as a biocatalyst in organic synthesis. However, there is no report in the literature on the components or the gene sequence of the cell-bound lipase from this species. Qualitative analysis of the cell-bound lipase would help to illuminate the regulation mechanism of gene expression and further improve the yield of the cell-bound lipase by gene engineering. Three predicted cell-bound lipases, lipA, lipC21 and lipC24, from Burkholderia sp. ZYB002 were cloned and expressed in E. coli. Both LipA and LipC24 displayed lipase activity. LipC24 was a novel mesophilic enzyme and displayed a preference for medium-chain-length acyl groups (C10-C14). The 3D structural model of LipC24 revealed an open Y-type active site. LipA displayed 96 % amino acid sequence identity with the known extracellular lipase. lipA-inactivation and lipC24-inactivation decreased the total cell-bound lipase activity of Burkholderia sp. ZYB002 by 42 % and 14 %, respectively. The cell-bound lipase activity from Burkholderia sp. ZYB002 originated from a multi-enzyme mixture with LipA as the main component. LipC24 was a novel lipase and displayed enzymatic characteristics and a structural model different from those of LipA. Besides LipA and LipC24, other types of cell-bound lipases (or esterases) should exist.

  20. Molecular and phylogenetic characterizations of an Eimeria krijgsmanni Yakimoff & Gouseff, 1938 (Apicomplexa: Eimeriidae) mouse intestinal protozoan parasite by partial 18S ribosomal RNA gene sequence analysis.

    Science.gov (United States)

    Takeo, Toshinori; Tanaka, Tetsuya; Matsubayashi, Makoto; Maeda, Hiroki; Kusakisako, Kodai; Matsui, Toshihiro; Mochizuki, Masami; Matsuo, Tomohide

    2014-08-01

    Previously, we characterized an undocumented strain of Eimeria krijgsmanni by morphological and biological features. Here, we present a detailed molecular phylogenetic analysis of this organism. Namely, 18S ribosomal RNA gene (rDNA) sequences of E. krijgsmanni were analyzed to incorporate this species into a comprehensive Eimeria phylogeny. As a result, a partial 18S rDNA sequence from E. krijgsmanni was successfully determined, and two different types, Type A and Type B, that differed by 1 base pair were identified. E. krijgsmanni was originally isolated from a single oocyst, and thus the results suggest that the two types might reflect allelic sequence heterogeneity in the 18S rDNA. Based on phylogenetic analyses, the two types of E. krijgsmanni 18S rDNA formed one of two clades among murine Eimeria spp.; these Eimeria clades reflected morphological similarity among the Eimeria spp. This is the third molecular phylogenetic characterization of a murine Eimeria spp. in addition to E. falciformis and E. papillata. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  1. Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae).

    Science.gov (United States)

    Wang, Q Z; Huang, M; Downie, S R; Chen, Z X

    2016-05-23

    Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family.
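    The percentages quoted in the abstract above follow directly from the reported counts, and the gene-flow value is close to what a common estimator gives from GST. The following is a minimal sketch for checking that arithmetic. The estimator Nm = 0.5 * (1 - GST) / GST is an assumption here (the abstract does not say how Nm was computed), and all variable and function names are illustrative.

    ```python
    # Counts and statistics as reported in the abstract.
    TOTAL_ESTS = 5548        # Stevia rebaudiana ESTs screened
    SSR_ESTS = 133           # microsatellite-containing ESTs
    HEXA_MOTIFS = 56         # hexanucleotide repeat motifs
    TRI_MOTIFS = 50          # trinucleotide repeat motifs
    PRIMER_PAIRS = 24        # primer pairs designed
    POLYMORPHIC_PAIRS = 7    # primer pairs showing polymorphism
    GST = 0.2500             # coefficient of population differentiation


    def percent(part, whole):
        """Return part/whole as a percentage rounded to one decimal place."""
        return round(100.0 * part / whole, 1)


    def gene_flow_from_gst(gst):
        """Estimate Nm from Gst.

        Assumption: Nm = 0.5 * (1 - Gst) / Gst. The abstract does not state
        which estimator the authors used.
        """
        return 0.5 * (1.0 - gst) / gst


    ssr_pct = percent(SSR_ESTS, TOTAL_ESTS)
    hexa_pct = percent(HEXA_MOTIFS, SSR_ESTS)
    tri_pct = percent(TRI_MOTIFS, SSR_ESTS)
    poly_pct = percent(POLYMORPHIC_PAIRS, PRIMER_PAIRS)
    nm = gene_flow_from_gst(GST)

    print(ssr_pct, hexa_pct, tri_pct, poly_pct, nm)
    ```

    Under that assumed estimator, the rounded GST of 0.2500 gives Nm = 1.5, which is consistent with the reported 1.4996 if the authors worked from an unrounded GST.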

  2. Characterization of cereal cyst nematodes (Heterodera spp.) in Morocco based on morphology, morphometrics and rDNA-ITS sequence analysis

    Directory of Open Access Journals (Sweden)

    Mokrini Fouad

    2017-09-01

    Morphological and molecular diversity among 11 populations of cereal cyst nematodes from different wheat production areas in Morocco was investigated using light microscopy and species-specific primers, complemented by ITS-rDNA sequences. Morphometrics of cysts and second-stage juveniles (J2s) were generally within the expected ranges for Heterodera avenae; only the isolate from Aïn Jmaa showed morphometrics conforming to those of H. latipons. When using species-specific primers for H. avenae and H. latipons, the specific bands of 109 bp and 204 bp, respectively, confirmed the morphological identification. In addition, the internal transcribed spacer (ITS) regions were sequenced to study the diversity of the 11 populations. These sequences were compared with those of Heterodera species available in the GenBank database (www.ncbi.nlm.nih.gov) and confirmed again the identity of the species. Ten sequences of the ITS-rDNA were similar (99–100%) to the sequences of H. avenae published in GenBank and three sequences, corresponding with one population, were similar (97–99%) to H. latipons.

  3. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    Science.gov (United States)

    Guttikonda, Satish K; Marri, Pradeep; Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive and cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions.

  4. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    Directory of Open Access Journals (Sweden)

    Satish K Guttikonda

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive and cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions.

  5. Characterizing leader sequences of CRISPR loci

    DEFF Research Database (Denmark)

    Alkhnbashi, Omer; Shah, Shiraz Ali; Garrett, Roger Antony

    2016-01-01

    The CRISPR-Cas system is an adaptive immune system in many archaea and bacteria, which provides resistance against invading genetic elements. The first phase of CRISPR-Cas immunity is called adaptation, in which small DNA fragments are excised from genetic elements and are inserted into a CRISPR...... array generally adjacent to its so called leader sequence at one end of the array. It has been shown that transcription initiation and adaptation signals of the CRISPR array are located within the leader. However, apart from promoters, there is very little knowledge of sequence or structural motifs...... sequences by focusing on the consensus repeat of the adjacent CRISPR array and weak upstream conservation signals. We applied our tool to the analysis of a comprehensive genomic database and identified several characteristic properties of leader sequences specific to archaea and bacteria, ranging from...

  6. Identification, Characterization and Full-Length Sequence Analysis of a Novel Polerovirus Associated with Wheat Leaf Yellowing Disease.

    Science.gov (United States)

    Zhang, Peipei; Liu, Yan; Liu, Wenwen; Cao, Mengji; Massart, Sebastien; Wang, Xifeng

    2017-01-01

    To identify the pathogens responsible for leaf yellowing symptoms on wheat samples collected from Jinan, China, we tested for the presence of three known barley/wheat yellow dwarf viruses (BYDV-GAV, -PAV, WYDV-GPV) (most likely pathogens) using RT-PCR. A sample that tested negative for the three viruses was selected for small RNA sequencing. Twenty-five million sequences were generated, among which 5% were of viral origin. A novel polerovirus was discovered and temporarily named wheat leaf yellowing-associated virus (WLYaV). The full genome of WLYaV corresponds to 5,772 nucleotides (nt), with six AUG-initiated open reading frames, one non-AUG-initiated open reading frame, and three untranslated regions, showing typical features of the family Luteoviridae. Sequence comparison and phylogenetic analyses suggested that WLYaV had the closest relationship with sugarcane yellow leaf virus (ScYLV), but the identities of full genomic nucleotides and deduced amino acid sequence of coat protein (CP) were 64.9 and 86.2%, respectively, below the species demarcation thresholds (90%) in the family Luteoviridae. Furthermore, agroinoculation of Nicotiana benthamiana leaves with a cDNA clone of WLYaV caused yellowing symptoms on the plant. Our study adds a new polerovirus that is associated with wheat leaf yellowing disease, which would help to identify and control pathogens of wheat.

  7. Identification, Characterization and Full-Length Sequence Analysis of a Novel Polerovirus Associated with Wheat Leaf Yellowing Disease

    Directory of Open Access Journals (Sweden)

    Peipei Zhang

    2017-09-01

    To identify the pathogens responsible for leaf yellowing symptoms on wheat samples collected from Jinan, China, we tested for the presence of three known barley/wheat yellow dwarf viruses (BYDV-GAV, -PAV, WYDV-GPV) (most likely pathogens) using RT-PCR. A sample that tested negative for the three viruses was selected for small RNA sequencing. Twenty-five million sequences were generated, among which 5% were of viral origin. A novel polerovirus was discovered and temporarily named wheat leaf yellowing-associated virus (WLYaV). The full genome of WLYaV corresponds to 5,772 nucleotides (nt), with six AUG-initiated open reading frames, one non-AUG-initiated open reading frame, and three untranslated regions, showing typical features of the family Luteoviridae. Sequence comparison and phylogenetic analyses suggested that WLYaV had the closest relationship with sugarcane yellow leaf virus (ScYLV), but the identities of full genomic nucleotides and deduced amino acid sequence of coat protein (CP) were 64.9 and 86.2%, respectively, below the species demarcation thresholds (90%) in the family Luteoviridae. Furthermore, agroinoculation of Nicotiana benthamiana leaves with a cDNA clone of WLYaV caused yellowing symptoms on the plant. Our study adds a new polerovirus that is associated with wheat leaf yellowing disease, which would help to identify and control pathogens of wheat.

  8. Characterization of primary biogenic aerosol particles in urban, rural, and high-alpine air by DNA sequence and restriction fragment analysis of ribosomal RNA genes

    Directory of Open Access Journals (Sweden)

    V. R. Després

    2007-12-01

    This study explores the applicability of DNA analyses for the characterization of primary biogenic aerosol (PBA) particles in the atmosphere. Samples of fine particulate matter (PM2.5) and total suspended particulates (TSP) have been collected on different types of filter materials at urban, rural, and high-alpine locations along an altitude transect in the south of Germany (Munich, Hohenpeissenberg, Mt. Zugspitze).

    From filter segments loaded with about one milligram of air particulate matter, DNA could be extracted and DNA sequences could be determined for bacteria, fungi, plants and animals. Sequence analyses were used to determine the identity of biological organisms, and terminal restriction fragment length polymorphism analyses (T-RFLP) were applied to estimate diversities and relative abundances of bacteria. Investigations of blank and background samples showed that filter materials have to be decontaminated prior to use, and that the sampling and handling procedures have to be carefully controlled to avoid artifacts in the analyses.

    Mass fractions of DNA in PM2.5 were found to be around 0.05% in urban, rural, and high-alpine aerosols. The average concentration of DNA determined for urban air was on the order of ~7 ng m⁻³, indicating that human adults may inhale about one microgram of DNA per day (corresponding to ~10⁸ haploid bacterial genomes or ~10⁵ haploid human genomes, respectively).

    Most of the bacterial sequences found in PM2.5 were from Proteobacteria (42) and some from Actinobacteria (10) and Firmicutes (1). The fungal sequences were characteristic for Ascomycota (3) and Basidiomycota (1), which are known to actively discharge spores into the atmosphere. The plant sequences could be attributed to green plants (2) and moss spores (2), while animal DNA was found only for one unicellular eukaryote (protist).
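    The genome-equivalent figures quoted for one microgram of inhaled DNA can be checked with a short order-of-magnitude calculation. The sketch below is illustrative only: the average molar mass of 650 g/mol per base pair and the genome sizes (5 Mbp for a typical bacterium, 3.1 Gbp for a haploid human genome) are assumptions that do not come from the abstract, and the function names are hypothetical.

    ```python
    import math

    AVOGADRO = 6.02214076e23          # molecules per mole
    BP_MOLAR_MASS = 650.0             # g/mol per base pair (assumed average)

    BACTERIAL_GENOME_BP = 5.0e6       # assumed typical bacterial genome size
    HUMAN_HAPLOID_GENOME_BP = 3.1e9   # assumed haploid human genome size


    def genome_mass_g(genome_bp):
        """Mass in grams of one double-stranded genome copy."""
        return genome_bp * BP_MOLAR_MASS / AVOGADRO


    def genome_copies(dna_mass_g, genome_bp):
        """Number of genome copies contained in a given mass of DNA."""
        return dna_mass_g / genome_mass_g(genome_bp)


    inhaled_dna_g = 1.0e-6  # about one microgram per day, as stated in the abstract

    bacterial_copies = genome_copies(inhaled_dna_g, BACTERIAL_GENOME_BP)
    human_copies = genome_copies(inhaled_dna_g, HUMAN_HAPLOID_GENOME_BP)

    # Report only the order of magnitude, which is all the abstract claims.
    bacterial_order = math.floor(math.log10(bacterial_copies))
    human_order = math.floor(math.log10(human_copies))
    print(bacterial_order, human_order)
    ```

    With these assumed constants the two results fall in the 10^8 and 10^5 ranges, matching the orders of magnitude given in the abstract.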

  9. Single-cell sequencing analysis characterizes common and cell-lineage-specific mutations in a muscle-invasive bladder cancer

    DEFF Research Database (Denmark)

    Li, Yingrui; Xu, Xun; Song, Luting

    2012-01-01

    sequencing of 66 individual tumor cells from a muscle-invasive bladder transitional cell carcinoma (TCC). Analyses of the somatic mutant allele frequency spectrum and clonal structure revealed that the tumor cells were derived from a single ancestral cell, but that subsequent evolution occurred, leading...... to two distinct tumor cell subpopulations. By analyzing recurrently mutant genes in an additional cohort of 99 TCC tumors, we identified genes that might play roles in the maintenance of the ancestral clone and in the muscle-invasive capability of subclones of this bladder cancer, respectively...

  10. Integrated analysis of 454 and Illumina transcriptomic sequencing characterizes carbon flux and energy source for fatty acid synthesis in developing Lindera glauca fruits for woody biodiesel.

    Science.gov (United States)

    Lin, Zixin; An, Jiyong; Wang, Jia; Niu, Jun; Ma, Chao; Wang, Libing; Yuan, Guanshen; Shi, Lingling; Liu, Lili; Zhang, Jinsong; Zhang, Zhixiang; Qi, Ji; Lin, Shanzhi

    2017-01-01

    Lindera glauca fruit with high quality and quantity of oil has emerged as a novel potential source of biodiesel in China, but the molecular regulatory mechanism of carbon flux and energy source for oil biosynthesis in developing fruits is still unknown. To better develop fruit oils of L. glauca as woody biodiesel, a combination of two different sequencing platforms (454 and Illumina) and qRT-PCR analysis was used to define a minimal reference transcriptome of developing L. glauca fruits, and to construct carbon and energy metabolic model for regulation of carbon partitioning and energy supply for FA biosynthesis and oil accumulation. We first analyzed the dynamic patterns of growth tendency, oil content, FA compositions, biodiesel properties, and the contents of ATP and pyridine nucleotide of L. glauca fruits from seven different developing stages. Comprehensive characterization of transcriptome of the developing L. glauca fruit was performed using a combination of two different next-generation sequencing platforms, of which three representative fruit samples (50, 125, and 150 DAF) and one mixed sample from seven developing stages were selected for Illumina and 454 sequencing, respectively. The unigenes separately obtained from long and short reads (201, and 259, respectively, in total) were reconciled using TGICL software, resulting in a total of 60,031 unigenes (mean length = 1061.95 bp) to describe a transcriptome for developing L. glauca fruits. Notably, 198 genes were annotated for photosynthesis, sucrose cleavage, carbon allocation, metabolite transport, acetyl-CoA formation, oil synthesis, and energy metabolism, among which some specific transporters, transcription factors, and enzymes were identified to be implicated in carbon partitioning and energy source for oil synthesis by an integrated analysis of transcriptomic sequencing and qRT-PCR. Importantly, the carbon and energy metabolic model was well established for oil biosynthesis of developing L

  11. Direct, rapid RNA sequence analysis

    International Nuclear Information System (INIS)

    Peattie, D.A.

    1987-01-01

    The original methods of RNA sequence analysis were based on enzymatic production and chromatographic separation of overlapping oligonucleotide fragments from within an RNA molecule followed by identification of the mononucleotides comprising the oligomer. Over the past decade the field of nucleic acid sequencing has changed dramatically, however, and RNA molecules now can be sequenced in a variety of more streamlined fashions. Most of the more recent advances in RNA sequencing have involved one-dimensional electrophoretic separation of ³²P-end-labeled oligoribonucleotides on polyacrylamide gels. In this chapter the author discusses two of these methods for determining the nucleotide sequences of RNA molecules rapidly: the chemical method and the enzymatic method. Both methods are direct and degradative, i.e., they rely on fragmatic and chemical approaches should be utilized. The single-strand-specific ribonucleases (A, T1, T2, and S1) provide an efficient means to locate double-helical regions rapidly, and the chemical reactions provide a means to determine the RNA sequence within these regions. In addition, the chemical reactions allow one to assign interactions to specific atoms and to distinguish secondary interactions from tertiary ones. If the RNA molecule is small enough to be sequenced directly by the enzymatic or chemical method, the probing reactions can be done easily at the same time as sequencing reactions

  12. Mutation analysis and characterization of ATR sequence variants in breast cancer cases from high-risk French Canadian breast/ovarian cancer families

    Directory of Open Access Journals (Sweden)

    Pichette Roxane

    2006-09-01

    Background: Ataxia telangiectasia-mutated and Rad3-related (ATR) is a member of the PIK-related family which plays, along with ATM, a central role in cell-cycle regulation. ATR has been shown to phosphorylate several tumor suppressors like BRCA1, CHEK1 and TP53. ATR appears as a good candidate breast cancer susceptibility gene and the current study was designed to screen for ATR germline mutations potentially involved in breast cancer predisposition. Methods: ATR direct sequencing was performed using a fluorescent method while widely available programs were used for linkage disequilibrium (LD), haplotype analyses, and tagging SNP (tSNP) identification. Expression analyses were carried out using real-time PCR. Results: The complete sequence of all exons and flanking intronic sequences were analyzed in DNA samples from 54 individuals affected with breast cancer from non-BRCA1/2 high-risk French Canadian breast/ovarian families. Although no germline mutation has been identified in the coding region, we identified 41 sequence variants, including 16 coding variants, 3 of which are not reported in public databases. SNP haplotypes were established and tSNPs were identified in 73 healthy unrelated French Canadians, providing a valuable tool for further association studies involving the ATR gene, using large cohorts. Our analyses led to the identification of two novel alternative splice transcripts. In contrast to the transcript generated by an alternative splicing site in intron 41, the one resulting from a deletion of 121 nucleotides in exon 33 is widely expressed, at significant but relatively low levels, in both normal and tumoral cells including normal breast and ovarian tissue. Conclusion: Although no deleterious mutations were identified in the ATR gene, the current study provides a haplotype analysis of the ATR gene polymorphisms, which allowed the identification of a set of SNPs that could be used as tSNPs for large-scale association

  13. Integrated sequence analysis. Final report

    International Nuclear Information System (INIS)

    Andersson, K.; Pyy, P.

    1998-02-01

    The NKS/RAK subproject 3 'integrated sequence analysis' (ISA) was formulated with the overall objective to develop and to test integrated methodologies in order to evaluate event sequences with significant human action contribution. The term 'methodology' denotes not only technical tools but also methods for integration of different scientific disciplines. In this report, we first discuss the background of ISA and the surveys made to map methods in different application fields, such as man-machine system simulation software, human reliability analysis (HRA) and expert judgement. Specific event sequences were, after the surveys, selected for application and testing of a number of ISA methods. The event sequences discussed in the report were cold overpressure of BWR, shutdown LOCA of BWR, steam generator tube rupture of a PWR and BWR disturbed signal view in the control room after an external event. Different teams analysed these sequences by using different ISA and HRA methods. Two kinds of results were obtained from the ISA project: sequence specific and more general findings. The sequence specific results are discussed together with each sequence description. The general lessons are discussed under a separate chapter by using comparisons of different case studies. These lessons include areas ranging from plant safety management (design, procedures, instrumentation, operations, maintenance and safety practices) to methodological findings (ISA methodology, PSA, HRA, physical analyses, behavioural analyses and uncertainty assessment). Finally, a discussion of the project follows and conclusions are presented. An interdisciplinary study of complex phenomena is a natural way to produce valuable and innovative results. This project came up with structured ways to perform ISA and managed to apply them in practice. The project also highlighted some areas where more work is needed. In the HRA work, development is required for the use of simulators and expert judgement as

  14. Molecular cloning, sequence characterization and expression analysis of a CD63 homologue from the coleopteran beetle, Tenebrio molitor.

    Science.gov (United States)

    Patnaik, Bharat Bhusan; Kang, Seong Min; Seo, Gi Won; Lee, Hyo Jeong; Patnaik, Hongray Howrelia; Jo, Yong Hun; Tindwa, Hamisi; Lee, Yong Seok; Lee, Bok Luel; Kim, Nam Jung; Bang, In Seok; Han, Yeon Soo

    2013-10-15

    CD63, a member of the tetraspanin membrane protein family, plays a pivotal role in cell growth, motility, signal transduction, host-pathogen interactions and cancer. In this work, the cDNA encoding CD63 homologue (TmCD63) was cloned from larvae of a coleopteran beetle, Tenebrio molitor. The cDNA is comprised of an open reading frame of 705 bp, encoding a putative protein of 235 amino acid residues. In silico analysis shows that the protein has four putative transmembrane domains and one large extracellular loop. The characteristic "Cys-Cys-Gly" motif and "Cys188" residues are highly conserved in the large extracellular loop. Phylogenetic analysis of TmCD63 revealed that it belongs to the insect cluster with 50%-56% identity. Analysis of spatial expression patterns demonstrated that TmCD63 mRNA is mainly expressed in gut and Malpighian tubules of larvae and the testis of the adult. Developmental expression patterns of CD63 mRNA showed that TmCD63 transcripts are detected in late larval, pupal and adult stages. Interestingly, TmCD63 transcripts are upregulated to the maximum level of 4.5 fold, in response to DAP-type peptidoglycan during the first 6 h, although other immune elicitors also caused a significant increase in the transcript level at later time-points. These results suggest that CD63 might contribute to T. molitor immune response against various microbial pathogens.

  15. Molecular Cloning, Sequence Characterization and Expression Analysis of a CD63 Homologue from the Coleopteran Beetle, Tenebrio molitor

    Directory of Open Access Journals (Sweden)

    Yeon Soo Han

    2013-10-01

    CD63, a member of the tetraspanin membrane protein family, plays a pivotal role in cell growth, motility, signal transduction, host-pathogen interactions and cancer. In this work, the cDNA encoding CD63 homologue (TmCD63) was cloned from larvae of a coleopteran beetle, Tenebrio molitor. The cDNA is comprised of an open reading frame of 705 bp, encoding a putative protein of 235 amino acid residues. In silico analysis shows that the protein has four putative transmembrane domains and one large extracellular loop. The characteristic “Cys-Cys-Gly” motif and “Cys188” residues are highly conserved in the large extracellular loop. Phylogenetic analysis of TmCD63 revealed that it belongs to the insect cluster with 50%–56% identity. Analysis of spatial expression patterns demonstrated that TmCD63 mRNA is mainly expressed in gut and Malpighian tubules of larvae and the testis of the adult. Developmental expression patterns of CD63 mRNA showed that TmCD63 transcripts are detected in late larval, pupal and adult stages. Interestingly, TmCD63 transcripts are upregulated to the maximum level of 4.5 fold, in response to DAP-type peptidoglycan during the first 6 h, although other immune elicitors also caused a significant increase in the transcript level at later time-points. These results suggest that CD63 might contribute to T. molitor immune response against various microbial pathogens.

  16. Molecular Cloning, Sequence Characterization and Expression Analysis of a CD63 Homologue from the Coleopteran Beetle, Tenebrio molitor

    Science.gov (United States)

    Patnaik, Bharat Bhusan; Kang, Seong Min; Seo, Gi Won; Lee, Hyo Jeong; Patnaik, Hongray Howrelia; Jo, Yong Hun; Tindwa, Hamisi; Lee, Yong Seok; Lee, Bok Luel; Kim, Nam Jung; Bang, In Seok; Han, Yeon Soo

    2013-01-01

    CD63, a member of the tetraspanin membrane protein family, plays a pivotal role in cell growth, motility, signal transduction, host-pathogen interactions and cancer. In this work, the cDNA encoding CD63 homologue (TmCD63) was cloned from larvae of a coleopteran beetle, Tenebrio molitor. The cDNA is comprised of an open reading frame of 705 bp, encoding a putative protein of 235 amino acid residues. In silico analysis shows that the protein has four putative transmembrane domains and one large extracellular loop. The characteristic “Cys-Cys-Gly” motif and “Cys188” residues are highly conserved in the large extracellular loop. Phylogenetic analysis of TmCD63 revealed that it belongs to the insect cluster with 50%–56% identity. Analysis of spatial expression patterns demonstrated that TmCD63 mRNA is mainly expressed in gut and Malpighian tubules of larvae and the testis of the adult. Developmental expression patterns of CD63 mRNA showed that TmCD63 transcripts are detected in late larval, pupal and adult stages. Interestingly, TmCD63 transcripts are upregulated to the maximum level of 4.5 fold, in response to DAP-type peptidoglycan during the first 6 h, although other immune elicitors also caused a significant increase in the transcript level at later time-points. These results suggest that CD63 might contribute to T. molitor immune response against various microbial pathogens. PMID:24132157

  17. The characterization of twenty sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Kimberly Pelak

    2010-09-01

    We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten "case" genomes from individuals with severe hemophilia A and ten "control" genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly-penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 new variants per genome, after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop loss variants in genes representing a diverse set of pathways.

  18. Sequence analysis by iterated maps, a review.

    Science.gov (United States)

    Almeida, Jonas S

    2014-05-01

    Among alignment-free methods, Iterated Maps (IMs) are at a particular extreme: they are also scale free (order free). The use of IMs for sequence analysis is also distinct from other alignment-free methodologies in being rooted in statistical mechanics instead of computational linguistics. Both of these roots go back over two decades to the use of fractal geometry in the characterization of phase-space representations. The time series analysis origin of the field is betrayed by the title of the manuscript that started this alignment-free subdomain in 1990, 'Chaos Game Representation'. The clash between the analysis of sequences as continuous series and the better established use of Markovian approaches to discrete series was almost immediate, with a defining critique published in the same journal 2 years later. The rest of that decade would go by before the scale-free nature of the IM space was uncovered. The ensuing decade saw this scalability generalized for non-genomic alphabets as well as an interest in its use for graphic representation of biological sequences. Finally, in the past couple of years, in step with the emergence of BigData and MapReduce as a new computational paradigm, there is a surprising third act in the IM story. Multiple reports have described gains in computational efficiency of multiple orders of magnitude over more conventional sequence analysis methodologies. The stage appears to be now set for a recasting of IMs with a central role in processing nextgen sequencing results.
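    The Chaos Game Representation named in the abstract above is the simplest iterated map for DNA: each nucleotide moves the current point halfway toward that nucleotide's corner of the unit square. The following is a minimal sketch of that iteration, not the review's own code. The corner assignment (A, C, G, T placed counter-clockwise from the origin, with T at the lower right) is one common convention and is an assumption here, as are the function name and the choice to skip ambiguous symbols.

    ```python
    # Assumed corner layout of the unit square; other conventions exist.
    CORNERS = {
        "A": (0.0, 0.0),
        "C": (0.0, 1.0),
        "G": (1.0, 1.0),
        "T": (1.0, 0.0),
    }


    def cgr_points(sequence, start=(0.5, 0.5)):
        """Return the Chaos Game Representation coordinates of a DNA sequence.

        Each base moves the current point halfway toward its corner, so the
        point reached after k bases encodes the last k bases at resolution 2**-k.
        Symbols other than A, C, G, T (for example N) are skipped, which is a
        simplification made for this sketch.
        """
        x, y = start
        points = []
        for base in sequence.upper():
            if base not in CORNERS:
                continue
            corner_x, corner_y = CORNERS[base]
            x = (x + corner_x) / 2.0
            y = (y + corner_y) / 2.0
            points.append((x, y))
        return points


    example = cgr_points("ACGT")
    for base, point in zip("ACGT", example):
        print(base, point)
    ```

    Because every step halves the distance to a corner, the final point always lies in the quadrant of the last base, and the sub-quadrant within it identifies the base before that. This nesting is the scale-free property the abstract refers to.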

  19. Fractals in DNA sequence analysis

    Institute of Scientific and Technical Information of China (English)

    Yu Zu-Guo(喻祖国); Vo Anh; Gong Zhi-Min(龚志民); Long Shun-Chao(龙顺潮)

    2002-01-01

    Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance, and even in biology. There has been an increasing interest in unravelling the mysteries of DNA. For example, how to distinguish coding from noncoding sequences and how to classify organisms and infer their evolutionary relationships are key problems in bioinformatics. Although much research has taken into consideration the long-range correlations in DNA sequences, and other authors have used the global fractal dimension in such work, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view) to DNA sequence analysis. We have also used fractal dimension, correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.
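    To make the idea of treating a DNA sequence as a time series concrete, the sketch below maps a sequence to a purine/pyrimidine "DNA walk" and computes the rescaled-range (R/S) statistic that underlies classical Hurst-exponent estimation. This is a swapped-in illustration, not the time series model of the paper above, whose details the abstract does not give. The +1/-1 mapping, the function names, and the use of the population standard deviation are assumptions of this sketch.

    ```python
    import math

    PURINES = {"A", "G"}
    PYRIMIDINES = {"C", "T"}


    def dna_steps(sequence):
        """Map each base to a step: +1 for a purine, -1 for a pyrimidine.

        Other symbols are skipped (a simplification made for this sketch).
        """
        steps = []
        for base in sequence.upper():
            if base in PURINES:
                steps.append(1)
            elif base in PYRIMIDINES:
                steps.append(-1)
        return steps


    def dna_walk(sequence):
        """Cumulative sum of the steps, i.e. the position of the walker."""
        position = 0
        walk = []
        for step in dna_steps(sequence):
            position += step
            walk.append(position)
        return walk


    def rescaled_range(series):
        """Rescaled range R/S of a numeric series.

        R is the range of the cumulative deviations from the mean and S is the
        population standard deviation. For a self-similar series R/S grows
        roughly like n**H, where H is the Hurst exponent.
        """
        n = len(series)
        if n == 0:
            raise ValueError("series must not be empty")
        mean = sum(series) / n
        cumulative = []
        total = 0.0
        for value in series:
            total += value - mean
            cumulative.append(total)
        spread = max(cumulative) - min(cumulative)
        std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
        if std == 0.0:
            raise ValueError("series has zero variance")
        return spread / std


    print(dna_walk("AGCT"))
    print(rescaled_range(dna_steps("AGCT")))
    ```

    A Hurst exponent estimate would come from computing R/S over windows of several lengths n and fitting the slope of log(R/S) against log(n); that fitting step is left out here to keep the sketch short.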

  20. Characterization of the bovine pregnancy-associated glycoprotein gene family – analysis of gene sequences, regulatory regions within the promoter and expression of selected genes

    Directory of Open Access Journals (Sweden)

    Walker Angela M

    2009-04-01

    Background: The pregnancy-associated glycoproteins (PAGs) belong to a large family of aspartic peptidases expressed exclusively in the placenta of species in the Artiodactyla order. In cattle, the PAG gene family comprises at least 22 transcribed genes, as well as some variants. Phylogenetic analyses have shown that the PAG family segregates into 'ancient' and 'modern' groupings. Along with sequence differences between family members, there are clear distinctions in their spatio-temporal distribution and in their relative level of expression. In this report, (1) we performed an in silico analysis of the bovine genome to further characterize the PAG gene family, (2) we scrutinized proximal promoter sequences of the PAG genes to evaluate the evolutionary pressures operating on them and to identify putative regulatory regions, (3) we determined relative transcript abundance of selected PAGs during pregnancy and (4) we performed preliminary characterization of the putative regulatory elements for one of the candidate PAGs, bovine (bo) PAG-2. Results: From our analysis of the bovine genome, we identified 18 distinct PAG genes and 14 pseudogenes. We observed that the first 500 base pairs upstream of the translational start site contained multiple regions that are conserved among all boPAGs. However, a preponderance of conserved regions that harbor recognition sites for putative transcriptional factors (TFs) were found to be unique to the modern boPAG grouping, but not the ancient boPAGs. We gathered evidence by means of Q-PCR and screening of EST databases to show that boPAG-2 is the most abundant of all boPAG transcripts. Finally, we provided preliminary evidence for the role of ETS- and DDVL-related TFs in the regulation of the boPAG-2 gene. Conclusion: PAGs represent a relatively large gene family in the bovine genome. The proximal promoter regions of these genes display differences in putative TF binding sites, likely contributing to observed

  1. Integrated sequence analysis. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Andersson, K.; Pyy, P

    1998-02-01

    The NKS/RAK subproject 3 'integrated sequence analysis' (ISA) was formulated with the overall objective to develop and to test integrated methodologies in order to evaluate event sequences with significant human action contribution. The term 'methodology' denotes not only technical tools but also methods for integration of different scientific disciplines. In this report, we first discuss the background of ISA and the surveys made to map methods in different application fields, such as man machine system simulation software, human reliability analysis (HRA) and expert judgement. Specific event sequences were, after the surveys, selected for application and testing of a number of ISA methods. The event sequences discussed in the report were cold overpressure of BWR, shutdown LOCA of BWR, steam generator tube rupture of a PWR and BWR disturbed signal view in the control room after an external event. Different teams analysed these sequences by using different ISA and HRA methods. Two kinds of results were obtained from the ISA project: sequence specific and more general findings. The sequence specific results are discussed together with each sequence description. The general lessons are discussed under a separate chapter by using comparisons of different case studies. These lessons include areas ranging from plant safety management (design, procedures, instrumentation, operations, maintenance and safety practices) to methodological findings (ISA methodology, PSA, HRA, physical analyses, behavioural analyses and uncertainty assessment). Finally, a discussion about the project follows and conclusions are presented. An interdisciplinary study of complex phenomena is a natural way to produce valuable and innovative results. This project came up with structured ways to perform ISA and managed to apply them in practice. The project also highlighted some areas where more work is needed. In the HRA work, development is required for the use of simulators and expert judgement as

  2. Characterization and complete genome sequence analysis of a novel virulent Siphoviridae phage against Staphylococcus aureus isolated from bovine mastitis in Xinjiang, China.

    Science.gov (United States)

    Zhang, Qian; Xing, Shaozhen; Sun, Qiang; Pei, Guangqian; Cheng, Shi; Liu, Yannan; An, Xiaoping; Zhang, Xianglilan; Qu, Yonggang; Tong, Yigang

    2017-06-01

    Bovine mastitis is one of the most costly diseases in dairy cows worldwide. It can be caused by over 150 different microorganisms, of which Staphylococcus aureus is the most frequently isolated and a major pathogen responsible for heavy economic losses in the dairy industry. Although antibiotic therapy is most widely used, alternative treatments are necessary due to increasing antibiotic resistance. Using phage for pathogen control is a promising tool in the fight against antibiotic resistance. Mainly using high-throughput sequencing, bioinformatics and our proposed phage termini identification method, we have isolated and characterized a novel virulent phage, designated as vB_SauS_IMEP5, from manure collected from dairy farms in Shihezi, Xinjiang, China, for use as a biocontrol agent against Staphylococcus aureus infections. Its latent period was about 30 min and its burst size was approximately 272 PFU/cell. Phage vB_SauS_IMEP5 survives in a wide pH range between 3 and 12. A treatment at 70 °C for 20 min can inactivate the phage. Morphological analysis of vB_SauS_IMEP5 revealed that phage vB_SauS_IMEP5 morphologically resembles phages in the family Siphoviridae. Among our tested multiplicities of infection (MOIs), the optimal MOI of this phage was determined to be 0.001, suggesting that phage vB_SauS_IMEP5 has high bacteriolytic potential and good efficiency for reducing bacterial growth. The complete genome of IME-P5 is a 44,677-bp, linear, double-stranded DNA, with a G+C content of 34.26%, containing 69 putative ORFs. The termini of the genome were determined with next-generation sequencing data using our previously proposed termini identification method, which suggests that this phage has non-redundant termini with 9-nt 3' protruding cohesive ends. The genomic and proteomic characteristics of IMEP5 demonstrate that this phage does not belong to any of the previously recognized Siphoviridae Staphylococcus phage groups, suggesting the

  3. Genome sequence analysis of five Canadian isolates of strawberry mottle virus reveals extensive intra-species diversity and a longer RNA2 with increased coding capacity compared to a previously characterized European isolate.

    Science.gov (United States)

    Bhagwat, Basdeo; Dickison, Virginia; Ding, Xinlun; Walker, Melanie; Bernardy, Michael; Bouthillier, Michel; Creelman, Alexa; DeYoung, Robyn; Li, Yinzi; Nie, Xianzhou; Wang, Aiming; Xiang, Yu; Sanfaçon, Hélène

    2016-06-01

    In this study, we report the genome sequence of five isolates of strawberry mottle virus (family Secoviridae, order Picornavirales) from strawberry field samples with decline symptoms collected in Eastern Canada. The Canadian isolates differed from the previously characterized European isolate 1134 in that they had a longer RNA2, resulting in a 239-amino-acid extension of the C-terminal region of the polyprotein. Sequence analysis suggests that reassortment and recombination occurred among the isolates. Phylogenetic analysis revealed that the Canadian isolates are diverse, grouping in two separate branches along with isolates from Europe and the Americas.

  4. Genomic organization, sequence characterization and expression analysis of Tenebrio molitor apolipophorin-III in response to an intracellular pathogen, Listeria monocytogenes.

    Science.gov (United States)

    Noh, Ju Young; Patnaik, Bharat Bhusan; Tindwa, Hamisi; Seo, Gi Won; Kim, Dong Hyun; Patnaik, Hongray Howrelia; Jo, Yong Hun; Lee, Yong Seok; Lee, Bok Luel; Kim, Nam Jung; Han, Yeon Soo

    2014-01-25

    Apolipophorin III (apoLp-III) is a well-known hemolymph protein having a functional role in lipid transport and immune response of insects. We cloned full-length cDNA encoding putative apoLp-III from larvae of the coleopteran beetle, Tenebrio molitor (TmapoLp-III), by identification of clones corresponding to the partial sequence of TmapoLp-III, subsequently followed with full length sequencing by a clone-by-clone primer walking method. The complete cDNA consists of 890 nucleotides, including an ORF encoding 196 amino acid residues. Excluding a putative signal peptide of the first 20 amino acid residues, the 176-residue mature apoLp-III has a calculated molecular mass of 19,146Da. Genomic sequence analysis with respect to its cDNA showed that TmapoLp-III was organized into four exons interrupted by three introns. Several immune-related transcription factor binding sites were discovered in the putative 5'-flanking region. BLAST and phylogenetic analyses reveal that TmapoLp-III has high sequence identity (88%) with Tribolium castaneum apoLp-III but shares little sequence homologies (molitor. Copyright © 2013 Elsevier B.V. All rights reserved.

  5. Analysis of S-RNase alleles of almond (Prunus dulcis): characterization of new sequences, resolution of synonyms and evidence of intragenic recombination.

    Science.gov (United States)

    Ortega, Encarnación; Bosković, Radovan I; Sargent, Daniel J; Tobutt, Kenneth R

    2006-11-01

    Cross-compatibility relationships in almond are controlled by a gametophytically expressed incompatibility system partly mediated by stylar RNases, of which 29 have been reported. To resolve possible synonyms and to provide data for phylogenetic analysis, 21 almond S-RNase alleles were cloned and sequenced from SP (signal peptide region) or C1 (first conserved region) to C5, except for the S29 allele, which could be cloned only from SP to C1. Nineteen sequences (S4, S6, S11-S22, S25-S29) were potentially new whereas S10 and S24 had previously been published but with different labels. The sequences for S16 and S17 were identical to that for S1, published previously; likewise, S15 was identical to S5. In addition, S4 and S20 were identical, as were S13 and S19. A revised version of the standard table of almond incompatibility genotypes is presented. Several alleles had AT or GA tandem repeats in their introns. Sequences of the 23 distinct newly cloned or already published alleles were aligned. Sliding-window analysis of Ka/Ks identified regions where positive selection may operate; in contrast to the Maloideae, most of the region from the beginning of C3 to the beginning of RC4 appeared not to be under positive selection. Phylogenetic analysis indicated that four pairs of alleles had bootstrap support > 80%: S5/S10, S4/S8, S11/S24, and S3/S6. Various motifs up to 19 residues long occurred in at least two alleles, and their distributions were consistent with intragenic recombination, as were separate phylogenetic analyses of the 5' and 3' sections. Sequence comparison of phylogenetically related alleles indicated the significance of the region between RC4 and C5 in defining specificity.

  6. A second pectin lyase gene (pel2) from Aspergillus oryzae KBN616: its sequence analysis and overexpression, and characterization of the gene products.

    Science.gov (United States)

    Kitamoto, N; Yoshino-Yasuda, S; Ohmiya, K; Tsukagoshi, N

    2001-01-01

    A second pectin lyase gene, designated pel2, was isolated from a shoyu koji mold Aspergillus oryzae KBN616 and characterized. The structural gene comprised 1306 bp with three introns. The ORF encoded 375 amino acids with a signal peptide of 19 amino acids. The deduced amino acid sequence showed high similarity to those of A. oryzae Pel1, Aspergillus niger pectin lyases and Glomerella cingulata Pn1A. The pel2 gene was overexpressed under the control of the promoter of the A. oryzae TEF1 gene for purification and enzymatic characterization of its gene product. The gene product exhibited two molecular masses of 48 and 44 kDa due to different degrees of glycosylation. Both proteins had the same pH optimum of 6.0 and temperature optimum of 50 °C.

  7. Southern-by-Sequencing: A Robust Screening Approach for Molecular Characterization of Genetically Modified Crops

    Directory of Open Access Journals (Sweden)

    Gina M. Zastrow-Hayes

    2015-03-01

    Molecular characterization of events is an integral part of the advancement process during genetically modified (GM) crop product development. Assessment of these events is traditionally accomplished by polymerase chain reaction (PCR) and Southern blot analyses. Southern blot analysis can be time-consuming and comparatively expensive and does not provide sequence-level detail. We have developed a sequence-based application, Southern-by-Sequencing (SbS), utilizing sequence capture coupled with next-generation sequencing (NGS) technology to replace Southern blot analysis for event selection in a high-throughput molecular characterization environment. SbS is accomplished by hybridizing indexed and pooled whole-genome DNA libraries from GM plants to biotinylated probes designed to target the sequence of transformation plasmids used to generate events within the pool. This sequence capture process enriches the sequence data obtained for targeted regions of interest (transformation plasmid DNA). Taking advantage of the DNA adjacent to the targeted bases (referred to as next-to-target sequence) that accompanies the targeted transformation plasmid sequence, the data analysis detects plasmid-to-genome and plasmid-to-plasmid junctions introduced during insertion into the plant genome. Analysis of these junction sequences provides sequence-level information as to the following: the number of insertion loci including detection of unlinked, independently segregating, small DNA fragments; copy number; rearrangements, truncations, or deletions of the intended insertion DNA; and the presence of transformation plasmid backbone sequences. This molecular evidence from SbS analysis is used to characterize and select GM plants meeting optimal molecular characterization criteria. SbS technology has proven to be a robust event screening tool for use in a high-throughput molecular characterization environment.

  8. Molecular characterization and phylogenetic relationships among microsporidian isolates infecting silkworm, Bombyx mori using small subunit rRNA (SSU-rRNA) gene sequence analysis.

    Science.gov (United States)

    Nath, B Surendra; Gupta, S K; Bajpai, A K

    2012-12-01

    The life cycle, spore morphology, pathogenicity, tissue specificity, mode of transmission and small subunit rRNA (SSU-rRNA) gene sequence analysis of the five new microsporidian isolates, viz. NIWB-11bp, NIWB-12n, NIWB-13md, NIWB-14b and NIWB-15mb, identified from the silkworm, Bombyx mori, have been studied along with the type species, NIK-1s_mys. The life cycle of the microsporidians identified exhibited sequential developmental cycles similar to the general developmental cycle of the genus Nosema. The spores showed considerable variations in their shape, length and width. The pathogenicity observed was dose-dependent and differed for each of the microsporidian isolates; NIWB-15mb was found to be more virulent than the other isolates. All of the microsporidians were found to infect most of the tissues examined and showed gonadal infection and transovarial transmission in the infected silkworms. The SSU-rRNA sequence-based phylogenetic tree placed NIWB-14b, NIWB-12n and NIWB-11bp in a separate branch along with other Nosema species and Nosema bombycis, while NIWB-15mb and NIWB-13md together formed another cluster along with other Nosema species. NIK-1s_mys revealed a signature sequence similar to the standard type species, N. bombycis, indicating that NIK-1s_mys is similar to N. bombycis. Based on phylogenetic relationships, branch length information based on genetic distance and nucleotide differences, we conclude that the microsporidian isolates identified are distinctly different from the other known species and belong to the genus Nosema. This SSU-rRNA gene sequence analysis method is found to be a more useful approach for detecting different and closely related microsporidians of this economically important domestic insect.

  9. Characterization of Erwinia amylovora strains from different host plants using repetitive-sequences PCR analysis, and restriction fragment length polymorphism and short-sequence DNA repeats of plasmid pEA29.

    Science.gov (United States)

    Barionovi, D; Giorgi, S; Stoeger, A R; Ruppitsch, W; Scortichini, M

    2006-05-01

    The three main aims of the study were the assessment of the genetic relationship between a deviating Erwinia amylovora strain isolated from Amelanchier sp. (Maloideae) grown in Canada and other strains from Maloideae and Rosoideae, the investigation of the variability of the PstI fragment of the pEA29 plasmid using restriction fragment length polymorphism (RFLP) analysis and the determination of the number of short-sequence DNA repeats (SSR) by DNA sequence analysis in representative strains. Ninety-three strains obtained from 12 plant genera and different geographical locations were examined by repetitive-sequences PCR using Enterobacterial Repetitive Intergenic Consensus, BOX and Repetitive Extragenic Palindromic primer sets. Following unweighted pair group method with arithmetic mean analysis, a deviating strain from Amelanchier sp. was analysed using amplified ribosomal DNA restriction analysis (ARDRA) and sequencing of the 16S rDNA gene. This strain showed 99% similarity to other E. amylovora strains in the 16S gene and the same banding pattern with ARDRA. The RFLP analysis of the pEA29 plasmid using MspI and Sau3A restriction enzymes showed a higher variability than that previously observed and no clear-cut grouping of the strains was possible. The SSR units were reiterated two to 12 times. The strains obtained from pear orchards showing symptoms of fire blight for the first time had a low number of SSR units. The strains from Maloideae exhibit a wider genetic variability than previously thought. The RFLP analysis of a fragment of the pEA29 plasmid would not seem a reliable method for typing E. amylovora strains. A low number of SSR units was observed with first epidemics of fire blight. The current detection techniques are mainly based on the genetic similarities observed within the strains from the cultivated tree-fruit crops. For a more reliable detection of the fire blight pathogen also in wild and ornamental Rosaceous plants the genetic

  10. Biphasic Study to Characterize Agricultural Biogas Plants by High-Throughput 16S rRNA Gene Amplicon Sequencing and Microscopic Analysis.

    Science.gov (United States)

    Maus, Irena; Kim, Yong Sung; Wibberg, Daniel; Stolze, Yvonne; Off, Sandra; Antonczyk, Sebastian; Pühler, Alfred; Scherer, Paul; Schlüter, Andreas

    2017-02-28

    Process surveillance within agricultural biogas plants (BGPs) was concurrently studied by high-throughput 16S rRNA gene amplicon sequencing and an optimized quantitative microscopic fingerprinting (QMF) technique. In contrast to 16S rRNA gene amplicons, digitalized microscopy is a rapid and cost-effective method that facilitates enumeration and morphological differentiation of the most significant groups of methanogens regarding their shape and characteristic autofluorescent factor 420. Moreover, the fluorescence signal mirrors cell vitality. In this study, four different BGPs were investigated. The results indicated stable process performance in the mesophilic BGPs and in the thermophilic reactor. Bacterial subcommunity characterization revealed significant differences between the four BGPs. Most remarkably, the genera Defluviitoga and Halocella dominated the thermophilic bacterial subcommunity, whereas members of another taxon, Syntrophaceticus, were found to be abundant in the mesophilic BGP. The domain Archaea was dominated by the genus Methanoculleus in all four BGPs, followed by Methanosaeta in BGP1 and BGP3. In contrast, Methanothermobacter members were highly abundant in the thermophilic BGP4. Furthermore, a high consistency between the sequencing approach and the QMF method was shown, especially for the thermophilic BGP. The differences elucidated that using this biphasic approach for mesophilic BGPs provided novel insights regarding disaggregated single cells of Methanosarcina and Methanosaeta species. Both dominated the archaeal subcommunity and replaced coccoid Methanoculleus members belonging to the same group of Methanomicrobiales that have been frequently observed in similar BGPs. This work demonstrates that combining QMF and 16S rRNA gene amplicon sequencing is a complementary strategy to describe archaeal community structures within biogas processes.

  11. Sequence analysis and characterization of pyruvate kinase from Clonorchis sinensis, a 53.1-kDa homopentamer, implicated immune protective efficacy against clonorchiasis

    Directory of Open Access Journals (Sweden)

    Tingjin Chen

    2017-11-01

    Background: Clonorchis sinensis, the causative agent of clonorchiasis, is classified as one of the most neglected tropical diseases and affects more than 15 million people globally. This hepatobiliary disease is highly associated with cholangiocarcinoma. As key molecules in the infectivity and subsistence of trematodes, glycolytic enzymes have been targets for drug and vaccine development. Clonorchis sinensis pyruvate kinase (CsPK), a crucial glycolytic enzyme, was characterized in this research. Results: Differences were observed in the sequences and spatial structures of CsPK and PKs from humans, rats, mice and rabbits. CsPK possessed a characteristic active site signature (IKLIAKIENHEGV) and some unique sites but lacked the N-terminal domain. The predicted subunit molecular mass (Mr) of CsPK was 53.1 kDa. Recombinant CsPK (rCsPK) was a homopentamer with a Mr of approximately 290 kDa by both native PAGE and gel filtration chromatography. Significant differences in the protein and mRNA levels of CsPK were observed among four life stages of C. sinensis (egg, adult worm, excysted metacercaria and metacercaria), suggesting that these developmental stages may be associated with diverse energy demands. CsPK was widely distributed in adult worms. Moreover, an intense Th1-biased immune response was persistently elicited in rats immunized with rCsPK. Also, rat anti-rCsPK sera suppressed C. sinensis adult subsistence both in vivo and in vitro. Conclusions: The sequences and spatial structures, molecular mass, and expression profile of CsPK have been characterized. rCsPK was indicated to be a homopentamer. Rat anti-rCsPK sera suppressed C. sinensis adult subsistence both in vivo and in vitro. CsPK is worthy of further study as a promising target for drug and vaccine development.

  12. Sequence analysis and characterization of pyruvate kinase from Clonorchis sinensis, a 53.1-kDa homopentamer, implicated immune protective efficacy against clonorchiasis.

    Science.gov (United States)

    Chen, Tingjin; Jiang, Hongye; Sun, Hengchang; Xie, Zhizhi; Ren, Pengli; Zhao, Lu; Dong, Huimin; Shi, Mengchen; Lv, Zhiyue; Wu, Zhongdao; Li, Xuerong; Yu, Xinbing; Huang, Yan; Xu, Jin

    2017-11-09

    Clonorchis sinensis, the causative agent of clonorchiasis, is classified as one of the most neglected tropical diseases and affects more than 15 million people globally. This hepatobiliary disease is highly associated with cholangiocarcinoma. As key molecules in the infectivity and subsistence of trematodes, glycolytic enzymes have been targets for drug and vaccine development. Clonorchis sinensis pyruvate kinase (CsPK), a crucial glycolytic enzyme, was characterized in this research. Differences were observed in the sequences and spatial structures of CsPK and PKs from humans, rats, mice and rabbits. CsPK possessed a characteristic active site signature (IKLIAKIENHEGV) and some unique sites but lacked the N-terminal domain. The predicted subunit molecular mass (Mr) of CsPK was 53.1 kDa. Recombinant CsPK (rCsPK) was a homopentamer with a Mr of approximately 290 kDa by both native PAGE and gel filtration chromatography. Significant differences in the protein and mRNA levels of CsPK were observed among four life stages of C. sinensis (egg, adult worm, excysted metacercaria and metacercaria), suggesting that these developmental stages may be associated with diverse energy demands. CsPK was widely distributed in adult worms. Moreover, an intense Th1-biased immune response was persistently elicited in rats immunized with rCsPK. Also, rat anti-rCsPK sera suppressed C. sinensis adult subsistence both in vivo and in vitro. The sequences and spatial structures, molecular mass, and expression profile of CsPK have been characterized. rCsPK was indicated to be a homopentamer. CsPK is worthy of further study as a promising target for drug and vaccine development.

  13. Characterization of Fusobacterium varium Fv113-g1 isolated from a patient with ulcerative colitis based on complete genome sequence and transcriptome analysis.

    Directory of Open Access Journals (Sweden)

    Tsuyoshi Sekizuka

    Fusobacterium spp. present in the oral and gut flora are carcinogenic and are associated with the risk of pancreatic and colorectal cancers. Fusobacterium spp. are also implicated in a broad spectrum of human pathologies, including Crohn's disease and ulcerative colitis (UC). Here we report the complete genome sequence of Fusobacterium varium Fv113-g1 (genome size, 3.96 Mb) isolated from a patient with UC. Taken together, comparative genome analyses suggested that Fv113-g1 is basically assigned as F. varium; in particular, it could be reclassified as a notable F. varium subsp. similar to F. ulcerans because of partially shared orthologs. Compared with the genome sequences of F. varium ATCC 27725 (genome size, 3.30 Mb) and other strains of Fusobacterium spp., Fv113-g1 possesses many accessory pan-genome sequences with noteworthy multiple virulence factors, including 44 autotransporters (type V secretion system, T5SS) and 13 Fusobacterium adhesion (FadA) paralogs involved in potential mucosal inflammation. Indeed, transcriptome analysis demonstrated that Fv113-g1-specific accessory genes, such as multiple T5SS and fadA paralogs, showed notably increased expression with D-MEM cultivation compared with brain heart infusion broth. This implied that growth conditions may enhance the expression of such potential virulence factors, leading to remarkable survival against other gut microorganisms and to pathogenicity toward the human intestinal epithelium.

  14. Characterization of bovine ruminal epithelial bacterial communities using 16S rRNA sequencing, PCR-DGGE, and qRT-PCR analysis.

    Science.gov (United States)

    Li, Meiju; Zhou, Mi; Adamowicz, Elizabeth; Basarab, John A; Guan, Le Luo

    2012-02-24

    Currently, knowledge regarding the ecology and function of bacteria attached to the epithelial tissue of the rumen wall is limited. In this study, the diversity of the bacterial community attached to the rumen epithelial tissue was compared to the rumen content bacterial community using 16S rRNA gene sequencing, PCR-DGGE, and qRT-PCR analysis. Sequence analysis of 2785 randomly selected clones from six 16S rDNA (∼1.4 kb) libraries showed that the community structures of three rumen content libraries clustered together and were separated from the rumen tissue libraries. The diversity index of each library revealed that diversity in the ruminal content bacterial communities (4.12/4.42/4.88) was higher than in the ruminal tissue communities (2.90/2.73/3.23), based on 97% similarity. The phylum Firmicutes was predominant in the ruminal tissue communities, while the phylum Bacteroidetes was predominant in the ruminal content communities. The phyla Fibrobacteres, Planctomycetes, and Verrucomicrobia were only detected in the ruminal content communities. PCR-DGGE analysis of the bacterial profiles of the rumen content and ruminal epithelial tissue samples from 22 steers further confirmed that there is a distinct bacterial community that inhabits the rumen epithelium. The distinctive epimural bacterial communities suggest that Firmicutes, together with other epithelial-specific species, may have additional functions other than food digestion. Copyright © 2011 Elsevier B.V. All rights reserved.

  15. Computation Sequences: A Way to Characterize Classes of Attribute Grammars

    DEFF Research Database (Denmark)

    Nielson, Hanne Riis

    1983-01-01

    A computation sequence for a derivation tree specifies a way of walking through the tree evaluating all the attributes of all nodes. By requiring that each derivation tree has a computation sequence with a certain property, it is possible to give simple characterizations of well-known subclasses ...

  16. Novel algorithms for protein sequence analysis

    NARCIS (Netherlands)

    Ye, Kai

    2008-01-01

    Each protein is characterized by its unique sequential order of amino acids, the so-called protein sequence. Biology's paradigm is that this order of amino acids determines the protein's architecture and function. In this thesis, we introduce novel algorithms to analyze protein sequences. Chapter 1

  17. Sequence analysis of Leukemia DNA

    Science.gov (United States)

    Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad. Isa

    2018-03-01

    Cancer is a very deadly disease; one form is leukemia, better known as blood cancer. Cancer cells can be detected by taking DNA for laboratory testing. This study focused on local alignment of leukemia and non-leukemia data obtained from NCBI in the form of DNA sequences by using the Smith-Waterman algorithm. The Smith-Waterman algorithm was invented by T. F. Smith and M. S. Waterman in 1981. The algorithm tries to find the greatest possible similarity between a pair of sequences by giving a negative value to an unequal base pair (mismatch) and a positive value to an equal base pair (match). The maximum positive value then marks the end of the alignment, and the minimum value marks its start. This study will use sequences of leukemia and 3 sequences of non-leukemia.
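
    The scoring scheme described in the abstract above can be sketched as follows. This is a minimal illustration of Smith-Waterman local alignment under stated assumptions, not the study's code: the score values (match +2, mismatch -1), the linear gap penalty of -2 and the function name `smith_waterman` are chosen for this example only. In this standard formulation cell scores are floored at zero, so the traceback starts at the highest-scoring cell and stops at the first zero.

```python
from typing import List, Tuple


def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   gap: int = -2) -> Tuple[int, str, str]:
    """Return (best score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    # Score matrix; first row and column stay at zero.
    h: List[List[int]] = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(0,
                          h[i - 1][j - 1] + sub,  # match or mismatch
                          h[i - 1][j] + gap,      # gap in b
                          h[i][j - 1] + gap)      # gap in a
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a: List[str] = []
    out_b: List[str] = []
    while i > 0 and j > 0 and h[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if h[i][j] == h[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif h[i][j] == h[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTTACGTTT", "GGACGGG"))
```

    The quadratic score matrix keeps the sketch short; for sequences of genomic length a banded or linear-space variant would normally be preferred.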

  18. Characterization of Pasteurella multocida associated with ovine pneumonia using multi-locus sequence typing (MLST) and virulence-associated gene profile analysis and comparison with porcine isolates.

    Science.gov (United States)

    García-Alvarez, Andrés; Vela, Ana Isabel; San Martín, Elvira; Chaves, Fernando; Fernández-Garayzábal, José Francisco; Lucas, Domínguez; Cid, Dolores

    2017-05-01

    Pasteurella multocida is a pathogen causing disease in a wide range of hosts including sheep and pigs. Isolates from ovine pneumonia were characterized by MLST (Multi-host and RIRDC databases) and virulence-associated gene (VAG) typing and compared with porcine isolates. Ovine and porcine isolates did not share any STs as determined by both schemes and exhibited different VAG profiles. With the Multi-host database, sixteen STs were identified among 43 sheep isolates with two STs (ST50 and ST19) comprising 53.5% of the isolates, and seven MLST genotypes (ST3, ST11 and ST62 included 75% of the isolates) among the 48 pig isolates. The most frequent VAG profile among sheep isolates was tbpA+/toxA+ (69.8% of isolates) and pfhA+ (62.5%) and hgbB+ (33.3%) among pig isolates. Representative ovine and porcine isolates of those STs identified by the Multi-host scheme were further typed using the RIRDC scheme. Seven STs were identified among the ovine isolates (ST95 RIRDC, ST131 RIRDC, ST203 RIRDC, ST320 RIRDC, ST324 RIRDC, ST321 RIRDC, and ST323 RIRDC), with the latter four sequence types being new STs identified in this study, and six STs (ST9 RIRDC, ST13 RIRDC, ST27 RIRDC, ST50 RIRDC, and ST74 RIRDC, and a new sequence type, ST322 RIRDC) among the porcine isolates. STs identified among ovine isolates have been detected exclusively in small ruminants, suggesting an adaptation to these hosts, while the genotypes identified among pig isolates have been previously identified in multiple hosts and therefore they are not restricted to pigs. The differences in genotypes and VAG profiles between ovine and pig isolates suggest they could represent different subpopulations of P. multocida. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Genome Sequencing and Analysis Conference IV

    Energy Technology Data Exchange (ETDEWEB)

    1993-12-31

    J. Craig Venter and C. Thomas Caskey co-chaired Genome Sequencing and Analysis Conference IV, held at Hilton Head, South Carolina, from September 26-30, 1992. Venter opened the conference by noting that approximately 400 researchers from 16 nations were present, four times as many participants as at Genome Sequencing Conference I in 1989. Venter also introduced the Data Fair, a new component of the conference allowing exchange and on-site computer analysis of unpublished sequence data.

  20. Structural and functional characterization of the exonuclease I (sbcB) gene and gene product from Escherichia coli and a Markov chain analysis of DNA sequences

    International Nuclear Information System (INIS)

    Phillips, G.J.

    1987-01-01

    The nucleotide sequence for the structural gene for exonuclease I (sbcB) from Escherichia coli was determined. Two putative promoters for this gene were identified and were predicted to have weak transcription initiation activity. In addition, the sbcB coding region contains many non-optimal codons. These observations are consistent with the suggestion that sbcB is a poorly expressed gene. Several mutant exonuclease I genes were cloned onto pBR322 plasmids. These genes represented both sbcB and xonA mutations. One of the xonA mutations (xonA6) was associated with a 1.2-kb insertion of an IS-30 related mobile genetic element in the 3'-region of the gene. Two of the mutations (xonA2 and xonA6) encode unstable polypeptides. Determination of exonucleolytic activity on single-stranded DNA from cell extracts containing each of the cloned mutant genes revealed no correlation between residual exonucleolytic activity and the phenotypes of sbcB and xonA mutants. A proposal that the exonuclease I protein contains an additional activity besides its ability to degrade single-stranded DNA is presented. Characterization of E. coli strains which overproduce exonuclease I showed increased sensitivity to UV irradiation.

  1. Identification and characterization of microRNAs related to salt stress in broccoli, using high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Tian, Yunhong; Tian, Yunming; Luo, Xiaojun; Zhou, Tao; Huang, Zuoping; Liu, Ying; Qiu, Yihan; Hou, Bing; Sun, Dan; Deng, Hongyu; Qian, Shen; Yao, Kaitai

    2014-09-03

    MicroRNAs (miRNAs) are a new class of endogenous regulators of a broad range of physiological processes, which act by regulating gene expression post-transcriptionally. The brassica vegetable, broccoli (Brassica oleracea var. italica), is very popular with a wide range of consumers, but environmental stresses such as salinity are a problem worldwide in restricting its growth and yield. Little is known about the role of miRNAs in the response of broccoli to salt stress. In this study, broccoli subjected to salt stress and broccoli grown under control conditions were analyzed by high-throughput sequencing. Differential miRNA expression was confirmed by real-time reverse transcription polymerase chain reaction (RT-PCR). The prediction of miRNA targets was undertaken using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology (KO) database and Gene Ontology (GO)-enrichment analyses. Two libraries of small (or short) RNAs (sRNAs) were constructed and sequenced by high-throughput Solexa sequencing. A total of 24,511,963 and 21,034,728 clean reads, representing 9,861,236 (40.23%) and 8,574,665 (40.76%) unique reads, were obtained for control and salt-stressed broccoli, respectively. Furthermore, 42 putative known and 39 putative candidate miRNAs that were differentially expressed between control and salt-stressed broccoli were revealed by their read counts and confirmed by the use of stem-loop real-time RT-PCR. Amongst these, the putative conserved miRNAs, miR393 and miR855, and two putative candidate miRNAs, miR3 and miR34, were the most strongly down-regulated when broccoli was salt-stressed, whereas the putative conserved miRNA, miR396a, and the putative candidate miRNA, miR37, were the most up-regulated. Finally, analysis of the predicted gene targets of miRNAs using the GO and KO databases indicated that a range of metabolic and other cellular functions known to be associated with salt stress were up-regulated in broccoli treated with salt. A comprehensive

  2. Quantiprot - a Python package for quantitative analysis of protein sequences.

    Science.gov (United States)

    Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold

    2017-07-17

    The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties define a multidimensional solution space, where sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python, which provides a simple and consistent interface to multiple methods for quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates the distribution of n-grams and computes the Zipf's law coefficient. We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches, and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where a large number of sequences generated by the model can be compared to actually observed sequences.
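
    The n-gram distribution and Zipf's law coefficient mentioned in this record can be illustrated with a minimal sketch in plain Python. It does not use or reproduce the Quantiprot API; the function names `ngram_counts` and `zipf_coefficient`, the example peptide, and the choice of an ordinary least-squares fit on the log-log rank-frequency curve are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_counts(sequence, n):
    """Count overlapping n-grams (words of n letters) in a sequence string."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def zipf_coefficient(counts):
    """Slope of log(frequency) against log(rank), fitted by least squares.

    A value near -1 corresponds to the classic Zipf rank-frequency law.
    """
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct n-grams")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical peptide, used only to exercise the functions.
example = "MKVLAAGIVGLLLAQKVLAAG"
bigrams = ngram_counts(example, 2)
slope = zipf_coefficient(bigrams)
print(bigrams.most_common(3), round(slope, 3))
```

    Real sequence sets would typically be pooled before fitting, since a single short peptide contains too few distinct n-grams for a stable slope.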

  3. The characterization of a new set of EST-derived simple sequence repeat (SSR) markers as a resource for the genetic analysis of Phaseolus vulgaris

    Directory of Open Access Journals (Sweden)

    Borba Tereza CO

    2011-05-01

    Abstract. Background: Over recent years, a growing effort has been made to develop microsatellite markers for the genomic analysis of the common bean (Phaseolus vulgaris) to broaden the knowledge of the molecular genetic basis of this species. The availability of large sets of expressed sequence tags (ESTs) in public databases has given rise to an expedient approach for the identification of SSRs (Simple Sequence Repeats), specifically EST-derived SSRs. In the present work, a battery of new microsatellite markers was obtained from a search of the Phaseolus vulgaris EST database. The diversity, degree of transferability and polymorphism of these markers were tested. Results: From 9,583 valid ESTs, 4,764 had microsatellite motifs, from which 377 were used to design primers, and 302 (80.11%) showed good amplification quality. To analyze transferability, a group of 167 SSRs were tested, and the results showed that they were 82% transferable across at least one species. The highest amplification rates were observed between the species from the Phaseolus (63.7%), Vigna (25.9%), Glycine (19.8%), Medicago (10.2%), Dipterix (6%) and Arachis (1.8%) genera. The average PIC (Polymorphism Information Content) varied from 0.53 for genomic SSRs to 0.47 for EST-SSRs, and the average number of alleles per locus was 4 and 3, respectively. Among the 315 newly tested SSRs in the BJ (BAT93 X Jalo EEP558) population, 24% (76) were polymorphic. The integration of these segregant loci into a framework map composed of 123 previously obtained SSR markers yielded a total of 199 segregant loci, of which 182 (91.5%) were mapped to 14 linkage groups, resulting in a map length of 1,157 cM. Conclusions: A total of 302 newly developed EST-SSR markers, showing good amplification quality, are available for the genetic analysis of Phaseolus vulgaris. These markers showed satisfactory rates of transferability, especially between species that have great economic and genomic values. Their diversity
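
    The record reports average PIC values without stating how they were computed. A commonly used definition is PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i is the frequency of allele i at a locus. The sketch below assumes that definition; the function name `pic` and the example frequencies are hypothetical and are not taken from the study.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism Information Content for one locus.

    Assumes the common definition
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    with allele frequencies that sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    sum_sq = sum(p * p for p in allele_freqs)
    cross = sum(2.0 * p * p * q * q for p, q in combinations(allele_freqs, 2))
    return 1.0 - sum_sq - cross


# Hypothetical loci: two equally frequent alleles, and four equally frequent alleles.
print(pic([0.5, 0.5]))
print(pic([0.25, 0.25, 0.25, 0.25]))
```

    Under this definition a monomorphic locus scores 0, and the score rises as alleles become more numerous and more evenly distributed.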

  4. Robustness analysis of chiller sequencing control

    International Nuclear Information System (INIS)

    Liao, Yundan; Sun, Yongjun; Huang, Gongsheng

    2015-01-01

    Highlights: • Uncertainties with chiller sequencing control were systematically quantified. • Robustness of chiller sequencing control was systematically analyzed. • Different sequencing control strategies were sensitive to different uncertainties. • A numerical method was developed for easy selection of chiller sequencing control. - Abstract: Multiple-chiller plants are commonly employed in heating, ventilating and air-conditioning systems to increase operational feasibility and energy efficiency under part-load conditions. In a multiple-chiller plant, chiller sequencing control plays a key role in achieving overall energy efficiency without sacrificing the cooling sufficiency needed for indoor thermal comfort. Various sequencing control strategies have been developed and implemented in practice. Based on the observation that (i) uncertainty, which cannot be avoided in chiller sequencing control, has a significant impact on the control performance and may cause the control to fail to achieve the expected control and/or energy performance; and (ii) in the current literature few studies have systematically addressed this issue, this paper presents a study on robustness analysis of chiller sequencing control in order to understand the robustness of various chiller sequencing control strategies under different types of uncertainty. Based on the robustness analysis, a simple and applicable method is developed to select the most robust control strategy for a given chiller plant in the presence of uncertainties, which will be verified using case studies.

  5. Using a sequence characterized amplified region (SCAR) marker for ...

    African Journals Online (AJOL)

    GREGORY

    2010-09-13

    This work used a sequence characterized amplified region (SCAR) marker to detect the Bacillus cereus strain in strawberry fields. The purpose was to develop an effective molecular method for detecting the functional target microorganisms applied in agricultural fields. A 3×10^9 CFU/ml vegetative cell.

  6. Probabilistic accident sequence recovery analysis

    International Nuclear Information System (INIS)

    Stutzke, Martin A.; Cooper, Susan E.

    2004-01-01

    Recovery analysis is a method that considers alternative strategies for preventing accidents in nuclear power plants during probabilistic risk assessment (PRA). Consideration of possible recovery actions in PRAs has been controversial, and there seems to be a widely held belief among PRA practitioners, utility staff, plant operators, and regulators that the results of recovery analysis should be skeptically viewed. This paper provides a framework for discussing recovery strategies, thus lending credibility to the process and enhancing regulatory acceptance of PRA results and conclusions. (author)

  7. Identification and characterization of wheat long non-protein coding RNAs responsive to powdery mildew infection and heat stress by using microarray analysis and SBS sequencing

    Directory of Open Access Journals (Sweden)

    Peng Huiru

    2011-04-01

    Abstract. Background: Biotic and abiotic stresses, such as powdery mildew infection and high temperature, are important limiting factors for yield and grain quality in wheat production. Emerging evidence suggests that long non-protein coding RNAs (npcRNAs) are developmentally regulated and play roles in development and stress responses of plants. However, identification of long npcRNAs has been limited to a few plant species, such as Arabidopsis, rice and maize, and no systematic identification of long npcRNAs and their responses to abiotic and biotic stresses has been reported in wheat. Results: In this study, using computational analysis and an experimental approach, we identified 125 putative wheat stress-responsive long npcRNAs, which are not conserved among plant species. Among them, some were precursors of small RNAs such as microRNAs and siRNAs, two long npcRNAs were identified as signal recognition particle (SRP) 7S RNA variants, and three were characterized as U3 snoRNAs. We found that wheat long npcRNAs showed tissue-dependent expression patterns and were responsive to powdery mildew infection and heat stress. Conclusion: Our results indicated that diverse sets of wheat long npcRNAs were responsive to powdery mildew infection and heat stress, and could function in wheat responses to both biotic and abiotic stresses, which provides a starting point for understanding their functions and regulatory mechanisms in the future.

  8. Time fluctuation analysis of forest fire sequences

    Science.gov (United States)

    Vega Orozco, Carmen D.; Kanevski, Mikhaïl; Tonini, Marj; Golay, Jean; Pereira, Mário J. G.

    2013-04-01

    Forest fires are complex events involving both space and time fluctuations. Understanding of their dynamics and pattern distribution is of great importance in order to improve resource allocation and support fire management actions at local and global levels. This study aims at characterizing the temporal fluctuations of forest fire sequences observed in Portugal, which is the country that holds the largest wildfire land dataset in Europe. This research applies several exploratory data analysis measures to 302,000 forest fires that occurred from 1980 to 2007. The applied clustering measures are: Morisita clustering index, fractal and multifractal dimensions (box-counting), Ripley's K-function, Allan Factor, and variography. These algorithms enable a global time structural analysis describing the degree of clustering of a point pattern and defining whether the observed events occur randomly, in clusters or in a regular pattern. The considered methods are of general importance and can be used for other spatio-temporal events (e.g. crime, epidemiology, biodiversity, geomarketing, etc.). An important contribution of this research deals with the analysis and estimation of local measures of clustering that help in understanding their temporal structure. Each measure is described and executed for the raw data (forest fires geo-database) and results are compared to reference patterns generated under the null hypothesis of randomness (Poisson processes) embedded in the same time period of the raw data. This comparison enables estimating the degree of the deviation of the real data from a Poisson process. Generalizations to functional measures of these clustering methods, taking into account the phenomena, were also applied and adapted to detect time dependences in a measured variable (e.g. burned area). The time clustering of the raw data is compared several times with the Poisson processes at different thresholds of the measured function. Then, the clustering measure value
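
    Of the clustering measures listed in this record, the Allan Factor is simple to sketch. Under its usual definition, the observation period is cut into contiguous windows of length T, events are counted per window, and AF(T) is the mean squared difference between successive counts divided by twice the mean count. Values near 1 are expected for a homogeneous Poisson process, values above 1 indicate clustering, and values below 1 indicate regularity. The code below is an illustrative sketch under that definition, not the authors' implementation; all function names and the simulated event times are assumptions.

```python
import random


def window_counts(event_times, start, end, window):
    """Number of events in each contiguous window of the given length."""
    n_windows = int((end - start) // window)
    counts = [0] * n_windows
    for t in event_times:
        idx = int((t - start) // window)
        if 0 <= idx < n_windows:
            counts[idx] += 1
    return counts


def allan_factor(event_times, start, end, window):
    """AF(T) = mean((N[k+1] - N[k])^2) / (2 * mean(N[k]))."""
    counts = window_counts(event_times, start, end, window)
    if len(counts) < 2:
        raise ValueError("need at least two windows")
    mean_count = sum(counts) / len(counts)
    if mean_count == 0:
        raise ValueError("no events inside the observation period")
    sq_diffs = [(b - a) ** 2 for a, b in zip(counts, counts[1:])]
    return (sum(sq_diffs) / len(sq_diffs)) / (2.0 * mean_count)


def simulate_poisson(rate, end, seed=0):
    """Reference pattern: homogeneous Poisson event times on [0, end)."""
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)
        if t >= end:
            return times
        times.append(t)


# Hypothetical inputs: a perfectly regular pattern and a Poisson reference.
regular = [0.5 + k for k in range(10)]
reference = simulate_poisson(rate=5.0, end=1000.0)
print(allan_factor(regular, 0.0, 10.0, 1.0))
print(allan_factor(reference, 0.0, 1000.0, 10.0))
```

    Repeating the calculation over a range of window lengths gives the Allan Factor curve, which can then be compared with the curve obtained from simulated Poisson reference patterns over the same period.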

  9. The Rhizobium etli rpoN locus: DNA sequence analysis and phenotypical characterization of rpoN, ptsN, and ptsA mutants.

    Science.gov (United States)

    Michiels, J; Van Soom, T; D'hooghe, I; Dombrecht, B; Benhassine, T; de Wilde, P; Vanderleyden, J

    1998-04-01

    The rpoN region of Rhizobium etli was isolated by using the Bradyrhizobium japonicum rpoN1 gene as a probe. Nucleotide sequence analysis of a 5,600-bp DNA fragment of this region revealed the presence of four complete open reading frames (ORFs), ORF258, rpoN, ORF191, and ptsN, coding for proteins of 258, 520, 191, and 154 amino acids, respectively. The gene product of ORF258 is homologous to members of the ATP-binding cassette-type permeases. ORF191 and ptsN are homologous to conserved ORFs found downstream from rpoN genes in other bacterial species. Unlike in most other microorganisms, rpoN and ORF191 are separated by approximately 1.6 kb. The R. etli rpoN gene was shown to control in free-living conditions the production of melanin, the activation of nifH, and the metabolism of C4-dicarboxylic acids and several nitrogen sources (ammonium, nitrate, alanine, and serine). Expression of the rpoN gene was negatively autoregulated and occurred independently of the nitrogen source. Inactivation of the ptsN gene resulted in a decrease of melanin synthesis and nifH expression. In a search for additional genes controlling the synthesis of melanin, an R. etli mutant carrying a Tn5 insertion in ptsA, a gene homologous to the Escherichia coli gene coding for enzyme I of the phosphoenolpyruvate:sugar phosphotransferase system, was obtained. The R. etli ptsA mutant also displayed reduced expression of nifH. The ptsN and ptsA mutants also displayed increased sensitivity to the toxic effects of malate and succinate. Growth of both mutants was inhibited by these C4-dicarboxylates at 20 mM at pH 7.0, while wild-type cells grow normally under these conditions. The effect of malate occurred independently of the nitrogen source used. Growth inhibition was decreased by lowering the pH of the growth medium. 
These results suggest that ptsN and ptsA are part of the same regulatory cascade, the inactivation of which renders the cells sensitive to toxic effects of elevated concentrations of

  10. Characterization of the Pathogenicity of Streptococcus intermedius TYG1620 Isolated from a Human Brain Abscess Based on the Complete Genome Sequence with Transcriptome Analysis and Transposon Mutagenesis in a Murine Subcutaneous Abscess Model.

    Science.gov (United States)

    Hasegawa, Noriko; Sekizuka, Tsuyoshi; Sugi, Yutaka; Kawakami, Nobuhiro; Ogasawara, Yumiko; Kato, Kengo; Yamashita, Akifumi; Takeuchi, Fumihiko; Kuroda, Makoto

    2017-02-01

    Streptococcus intermedius is known to cause periodontitis and pyogenic infections in the brain and liver. Here we report the complete genome sequence of strain TYG1620 (genome size, 2,006,877 bp; GC content, 37.6%; 2,020 predicted open reading frames [ORFs]) isolated from a brain abscess in an infant. Comparative analysis of S. intermedius genome sequences suggested that TYG1620 carries a notable type VII secretion system (T7SS), two long repeat regions, and 19 ORFs for cell wall-anchored proteins (CWAPs). To elucidate the genes responsible for the pathogenicity of TYG1620, transcriptome analysis was performed in a murine subcutaneous abscess model. The results suggest that the levels of expression of small hypothetical proteins similar to phenol-soluble modulin β1 (PSMβ1), a staphylococcal virulence factor, significantly increased in the abscess model. In addition, an experiment in a murine subcutaneous abscess model with random transposon (Tn) mutant attenuation suggested that Tn mutants with mutations in 212 ORFs in the Tn mutant library were attenuated in the murine abscess model (629 ORFs were disrupted in total); the 212 ORFs are putatively essential for abscess formation. Transcriptome analysis identified 37 ORFs, including paralogs of the T7SS and a putative glucan-binding CWAP in long repeat regions, to be upregulated and attenuated in vivo. This study provides a comprehensive characterization of S. intermedius pathogenicity based on the complete genome sequence and a murine subcutaneous abscess model with transcriptome and Tn mutagenesis, leading to the identification of pivotal targets for vaccines or antimicrobial agents for the control of S. intermedius infections. Copyright © 2017 American Society for Microbiology.

  11. Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes

    Directory of Open Access Journals (Sweden)

    Rebecca M. Davidson

    2011-11-01

    Transcriptome sequencing is a powerful method for studying global expression patterns in large, complex genomes. Evaluation of sequence-based expression profiles during reproductive development would provide functional annotation to genes underlying agronomic traits. We generated transcriptome profiles for 12 diverse maize (Zea mays L.) reproductive tissues representing male, female, developing seed, and leaf tissues using high throughput transcriptome sequencing. Overall, ∼80% of annotated genes were expressed. Comparative analysis between sequence and hybridization-based methods demonstrated the utility of ribonucleic acid sequencing (RNA-seq) for expression determination and differentiation of paralogous genes (∼85% of maize genes). Analysis of 4975 gene families across reproductive tissues revealed expression divergence is proportional to family size. In all pairwise comparisons between tissues, 7% (pre- vs. postemergence cobs) to 48% (pollen vs. ovule) of genes were differentially expressed. Genes with expression restricted to a single tissue within this study were identified with the highest numbers observed in leaves, endosperm, and pollen. Coexpression network analysis identified 17 gene modules with complex and shared expression patterns containing many previously described maize genes. The data and analyses in this study provide valuable tools through improved gene annotation, gene family characterization, and a core set of candidate genes to further characterize maize reproductive development and improve grain yield potential.

  12. Characterizing the D2 statistic: word matches in biological sequences.

    Science.gov (United States)

    Forêt, Sylvain; Wilson, Susan R; Burden, Conrad J

    2009-01-01

    Word matches are often used in sequence comparison methods, either as a measure of sequence similarity or in the first search steps of algorithms such as BLAST or BLAT. The D2 statistic is the number of matches of words of k letters between two sequences. Recent advances have been made in the characterization of this statistic and in the approximation of its distribution. Here, these results are extended to the case of approximate word matches. We compute the exact value of the variance of the D2 statistic for the case of a uniform letter distribution, and introduce a method to provide accurate approximations of the variance in the remaining cases. This enables the distribution of D2 to be approximated for typical situations arising in biological research. We apply these results to the identification of cis-regulatory modules, and show that this method detects such sequences with a high accuracy. The ability to approximate the distribution of D2 for both exact and approximate word matches will enable the use of this statistic in a more precise manner for sequence comparison, database searches, and identification of transcription factor binding sites.
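
    As defined in this record, the D2 statistic is the number of matches of words of k letters between two sequences, counting every pair of positions whose words agree. A minimal sketch for the exact-match case follows; the function names and the example sequences are hypothetical, and the approximate-match extension discussed in the abstract is not covered.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count overlapping words of k letters in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def d2(seq_a, seq_b, k):
    """Number of exact k-word matches between two sequences.

    Equivalent to counting all position pairs (i, j) where the k-word
    starting at i in seq_a equals the k-word starting at j in seq_b.
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    return sum(n * counts_b[word] for word, n in counts_a.items())


# Hypothetical DNA fragments, for illustration only.
print(d2("ACGTACGT", "CGTACG", 3))
```

    Summing the product of per-word counts avoids the quadratic scan over all position pairs while giving the same value.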

  13. Cloning and nucleotide sequence analysis of pepV, a carnosinase gene from Lactobacillus delbrueckii subsp. lactis DSM 7290, and partial characterization of the enzyme.

    Science.gov (United States)

    Vongerichten, K F; Klein, J R; Matern, H; Plapp, R

    1994-10-01

    Cell extracts of Lactobacillus delbrueckii subsp. lactis DSM 7290 were found to exhibit unique peptolytic ability against unusual beta-alanyl-dipeptides. In order to clone the gene encoding this activity, designated pepV, a gene library of strain DSM 7290 genomic DNA, prepared in the low-copy-number plasmid pLG339, was screened for heterologous expression in Escherichia coli. Recombinant clones harbouring pepV were identified by their ability to allow the utilization of carnosine (beta-alanyl-histidine) as a source of histidine by the E. coli mutant strain UK197 (pepD, hisG). Complementation was observed in a colony harbouring a recombinant plasmid (pKV101), carrying pepV. A 2.4 kb fragment containing pepV was subcloned and its nucleotide sequence revealed an open reading frame (ORF) of 1413 nucleotides, corresponding to a protein with predicted molecular mass of 51998 Da. A single transcription initiation site 71 bp upstream of the ATG translational start codon was identified by primer extension. No significant homology was detected between pepV or its deduced amino acid sequence with any entry in the databases. The only similarity was found in a region conserved in the ArgE/DapE/CPG2/YscS family of proteins. This observation, and protease inhibitor studies, indicated that pepV is of the metalloprotease type. A second ORF present in the sequenced fragment showed extensive homology to a variety of amino acid permeases from E. coli and Saccharomyces cerevisiae.

  14. Preliminary hazard analysis using sequence tree method

    International Nuclear Information System (INIS)

    Huang Huiwen; Shih Chunkuan; Hung Hungchih; Chen Minghuei; Yih Swu; Lin Jiinming

    2007-01-01

    A system-level PHA using the sequence tree method was developed to perform SSA of safety-related digital I&C systems. The conventional PHA is a brainstorming session among experts on various portions of the system to identify hazards through discussions. However, the conventional PHA is not a systematic technique: the analysis results depend strongly on the experts' subjective opinions, and the analysis quality cannot be appropriately controlled. This research therefore developed a system-level sequence-tree-based PHA, which can clarify the relationships among the major digital I&C systems. This sequence-tree-based technique includes two major phases. The first phase uses a table to analyze each event in SAR Chapter 15 for a specific safety-related I&C system, such as the RPS. The second phase uses a sequence tree to recognize which I&C systems are involved in the event, how the safety-related systems work, and how the backup systems can be activated to mitigate the consequences if the primary safety systems fail. In the sequence tree, the defense-in-depth echelons, including the Control echelon, Reactor trip echelon, ESFAS echelon, and Indication and display echelon, are arranged to construct the sequence tree structure. All the related I&C systems, including the digital systems and the analog back-up systems, are allocated to their specific echelons. With this system-centric sequence-tree-based analysis, not only can preliminary hazards be identified systematically, but the vulnerability of the nuclear power plant can also be recognized. Therefore, an effective simplified D3 evaluation can be performed as well. (author)

  15. Molecular characterization and phylogenetic analysis of Explanatum explanatum in India based on nucleotide sequences of ribosomal ITS2 and the mitochondrial gene nad1.

    Science.gov (United States)

    Hayashi, Kei; Mohanta, Uday K; Ohari, Yuma; Neeraja, Tambireddy; Singh, T Shantikumar; Sugiyama, Hiromu; Itagaki, Tadashi

    2016-12-01

    The aim of this study was to analyze the phylogenetic relationship between Explanatum explanatum populations in India and other countries of the Indian subcontinent. Seventy liver amphistomes collected from four localities in India were identified as E. explanatum based on the nucleotide sequences of ribosomal ITS2. The flukes were then analyzed phylogenetically based on the nucleotide sequence of the mitochondrial gene nad1 in comparison with flukes from Bangladesh and Nepal. In the resulting phylogenetic tree, the nad1 haplotypes from India were divided into four clades, and the flukes showing the haplotypes of clades A and C were predominant in India. The haplotypes of the clades A and C have also been detected in Bangladesh and Nepal, and therefore, it seems they occur commonly throughout the Indian subcontinent. The results of AMOVA suggested that gene flow was likely to occur between E. explanatum populations in these countries. These countries are geographically close and have been historically and culturally connected to each other, and therefore, the movements of host ruminants among these countries might have been involved in the migration of the flukes and their gene flow.

  16. Characterization, Genome Sequence, and Analysis of Escherichia Phage CICC 80001, a Bacteriophage Infecting an Efficient L-Aspartic Acid Producing Escherichia coli.

    Science.gov (United States)

    Xu, Youqiang; Ma, Yuyue; Yao, Su; Jiang, Zengyan; Pei, Jiangsen; Cheng, Chi

    2016-03-01

    Escherichia phage CICC 80001 was isolated from the bacteriophage-contaminated medium of an Escherichia coli strain, HY-05C (CICC 11022S), which could produce L-aspartic acid. The phage had a head diameter of 45-50 nm and a tail of about 10 nm. The one-step growth curve showed a latent period of 10 min and a rise period of about 20 min. The average burst size was about 198 phage particles per infected cell. Tests were conducted on the plaques, multiplicity of infection, and host range. The genome of CICC 80001 was sequenced (length 38,810 bp) and annotated. The key proteins leading to host-cell lysis were phylogenetically analyzed. One protein belonged to class II holin, and the other two belonged to the endopeptidase family and N-acetylmuramoyl-L-alanine amidase family, respectively. The genome showed a sequence identity of 82.7% with that of Enterobacteria phage T7, and carried ten unique open reading frames. A bacteriophage-resistant E. coli strain, designated CICC 11021S, was bred, and its L-aspartase activity was 84.4% of that of CICC 11022S.

  17. Google matrix analysis of DNA sequences.

    Science.gov (United States)

    Kandiah, Vivek; Shepelyansky, Dima L

    2013-01-01

    For DNA sequences of various species we construct the Google matrix G of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of G is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.
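
    The construction described in this record can be sketched in plain Python under several stated assumptions: the sequence is cut into consecutive non-overlapping words of a fixed length, transitions are counted from each word to the word that follows it, columns with no outgoing transitions are replaced by a uniform column, and the standard form G = alpha * S + (1 - alpha) / N is used with a damping factor alpha = 0.85. None of these choices, nor the function names or the example sequence, is taken from the paper; they are illustrative only.

```python
from itertools import product


def google_matrix(sequence, word_len=2, alpha=0.85, alphabet="ACGT"):
    """Column-stochastic Google matrix of word-to-next-word transitions."""
    words = ["".join(p) for p in product(alphabet, repeat=word_len)]
    index = {w: i for i, w in enumerate(words)}
    n = len(words)
    # counts[j][i] holds the number of transitions from word i to word j.
    counts = [[0.0] * n for _ in range(n)]
    chunks = [sequence[p:p + word_len]
              for p in range(0, len(sequence) - word_len + 1, word_len)]
    for src, dst in zip(chunks, chunks[1:]):
        if src in index and dst in index:  # skip words with other symbols
            counts[index[dst]][index[src]] += 1.0
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        col_sum = sum(counts[j][i] for j in range(n))
        for j in range(n):
            # Dangling columns (no observed transitions) become uniform.
            s_ji = counts[j][i] / col_sum if col_sum > 0 else 1.0 / n
            g[j][i] = alpha * s_ji + (1.0 - alpha) / n
    return words, g


def pagerank(g, tol=1e-12, max_iter=1000):
    """Stationary vector of G by power iteration from the uniform vector."""
    n = len(g)
    p = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(g[j][i] * p[i] for i in range(n)) for j in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical short sequence, far too small for real statistics.
words, g = google_matrix("ACGTTGCAACGTAGCTAGCT", word_len=2)
ranks = pagerank(g)
top = sorted(zip(ranks, words), reverse=True)[:3]
print(top)
```

    The dense list-of-lists representation is only practical for short words; with word length m there are 4^m states, so longer words would call for a sparse representation.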

  18. Google matrix analysis of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Vivek Kandiah

    For DNA sequences of various species we construct the Google matrix G of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of G is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.

  19. Digital image sequence processing, compression, and analysis

    CERN Document Server

    Reed, Todd R

    2004-01-01

    Introduction, by Todd R. Reed; Content-Based Image Sequence Representation, by Pedro M. Q. Aguiar, Radu S. Jasinschi, José M. F. Moura, and Charnchai Pluempitiwiriyawej; The Computation of Motion, by Christoph Stiller, Sören Kammel, Jan Horn, and Thao Dang; Motion Analysis and Displacement Estimation in the Frequency Domain, by Luca Lucchese and Guido Maria Cortelazzo; Quality of Service Assessment in New Generation Wireless Video Communications, by Gaetano Giunta; Error Concealment in Digital Video, by Francesco G.B. De Natale; Image Sequence Restoration: A Wider Perspective, by Anil Kokaram; Video Summarization, by Cuneyt M. Taskiran and Edward

  20. CcMP-II, a new hemorrhagic metalloproteinase from Cerastes cerastes snake venom: purification, biochemical characterization and amino acid sequence analysis.

    Science.gov (United States)

    Boukhalfa-Abib, Hinda; Laraba-Djebari, Fatima

    2015-01-01

    Snake venom metalloproteinases (SVMPs) are the most abundant components in snake venoms. They are important in the induction of systemic alterations and local tissue damage after envenomation. CcMP-II, a metalloproteinase purified from Cerastes cerastes snake venom, was obtained by a combination of gel filtration, ion-exchange and affinity chromatography. It was homogeneous on SDS-PAGE, with a molecular mass estimated at 35 kDa and a pI of 5.6. CcMP-II has an N-terminal sequence of EDRHINLVSVADHRMXTKY, with high levels of homology with those of the members of class P-II of SVMPs, which comprises metalloproteinase and disintegrin-like domains together. This proteinase displayed fibrinogenolytic and hemorrhagic activities. The proteolytic and hemorrhagic activities of CcMP-II were inhibited by EDTA and 1,10-phenanthroline. However, these activities were not affected by aprotinin and PMSF, suggesting that CcMP-II is a zinc-dependent hemorrhagic metalloproteinase with an α-fibrinogenase activity. The hemorrhagic metalloproteinase CcMP-II was also able to hydrolyze extracellular matrix components, such as type IV collagen and laminin. These results indicate that CcMP-II is implicated in local and systemic bleeding, thus contributing to the toxicity of C. cerastes venom. Copyright © 2014 Elsevier Inc. All rights reserved.

  1. Comprehensive analysis of Salmonella sequence polymorphisms and development of a LDR-UA assay for the detection and characterization of selected serotypes.

    Science.gov (United States)

    Lauri, Andrea; Castiglioni, Bianca; Mariani, Paola

    2011-07-01

    Salmonella is a major cause of food-borne disease, and Salmonella enterica subspecies I includes the most clinically relevant serotypes. Salmonella serotype determination is important for disease etiology assessment and contamination source tracking. This task will be facilitated by the disclosure of Salmonella serotype sequence polymorphisms, here annotated in seven genes (sefA, safA, safC, bigA, invA, fimA, and phsB) from 139 S. enterica strains, of which 109 belong to 44 serotypes of subsp. I. One hundred nineteen polymorphic sites were scored and associated with single serotypes or with serotype groups belonging to S. enterica subsp. I. A diagnostic tool was constructed based on the Ligation Detection Reaction-Universal Array (LDR-UA) for the detection of polymorphic sites uniquely associated with serotypes of primary interest (Salmonella Hadar, Salmonella Infantis, Salmonella Enteritidis, Salmonella Typhimurium, Salmonella Gallinarum, Salmonella Virchow, and Salmonella Paratyphi B). The implementation of promiscuous probes allowed the diagnosis of ten further serotypes that could be associated with a unique hybridization pattern. Finally, the sensitivity and applicability of the tool were tested on target DNA dilutions and with controlled meat contamination, allowing the detection of one Salmonella CFU in 25 g of meat.

  2. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    Phylogenetic analysis suggests that our sequences are clustered with sequences reported from Japan. This is the first phylogenetic analysis of HCV core gene from Pakistani population. Our sequences and sequences from Japan are grouped into same cluster in the phylogenetic tree. Sequence comparison and ...

  3. [Complete genome sequencing and sequence analysis of BCG Tice].

    Science.gov (United States)

    Wang, Zhiming; Pan, Yuanlong; Wu, Jun; Zhu, Baoli

    2012-10-04

    The objective of this study is to obtain the complete genome sequence of Bacillus Calmette-Guerin Tice (BCG Tice), in order to provide more information about the molecular biology of BCG Tice and to design more reasonable vaccines to prevent tuberculosis. We assembled the data from high-throughput sequencing with SOAPdenovo software, obtaining many contigs and scaffolds. Many sequence gaps and physical gaps remained as a result of regional low coverage and low quality. We designed primers at the ends of contigs and performed PCR amplification in order to link these contigs and scaffolds. By using various enzymes for PCR amplification, adjusting the PCR reaction conditions, and combining these with clone construction for sequencing, all the gaps were closed. We obtained the complete genome sequence of BCG Tice and submitted it to GenBank of the National Center for Biotechnology Information (NCBI). The genome of BCG Tice is 4334064 base pairs in length, with a GC content of 65.65%. The problems and strategies during the finishing step of BCG Tice sequencing are described here, with the hope of offering some experience to those who are involved in the finishing step of genome sequencing. The microarray data were verified by our results.
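    As an illustration of how a GC content figure such as the 65.65% quoted above is typically computed from a finished sequence, here is a minimal sketch. The function name `gc_content` is hypothetical, and the choice to ignore ambiguous bases such as N in the denominator is an assumption, not something stated in the abstract.

```python
def gc_content(seq):
    """Return the GC content of a DNA sequence as a percentage.

    Assumption: only unambiguous bases (A, C, G, T) are counted, so
    characters such as N do not affect the result.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted
```

    For a whole genome the same function can be applied to the concatenated record, or the G, C and total counts can be accumulated record by record if the file is too large to hold in memory.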

  4. OTU analysis using metagenomic shotgun sequencing data.

    Directory of Open Access Journals (Sweden)

    Xiaolin Hao

    Because of technological limitations, the primer and amplification biases in targeted sequencing of 16S rRNA genes have veiled the true microbial diversity underlying environmental samples. However, the protocol of metagenomic shotgun sequencing provides 16S rRNA gene fragment data with natural immunity against the biases raised during priming and thus the potential of uncovering the true structure of the microbial community by giving more accurate predictions of operational taxonomic units (OTUs). Nonetheless, the lack of statistically rigorous comparison between 16S rRNA gene fragments and other data types makes it difficult to interpret previously reported results using 16S rRNA gene fragments. Therefore, in the present work, we established a standard analysis pipeline that would help confirm whether the differences in the data are true or are just due to potential technical bias. This pipeline is built by using simulated data to find optimal mapping and OTU prediction methods. The comparison between simulated datasets revealed a relationship between 16S rRNA gene fragments and full-length 16S rRNA sequences: a 16S rRNA gene fragment longer than 150 bp provides the same accuracy as a full-length 16S rRNA sequence using our proposed pipeline, which could serve as a good starting point for experimental design and make the comparison between 16S rRNA gene fragment-based and targeted 16S rRNA sequencing-based surveys possible.

  5. A comparison of 454 sequencing and clonal sequencing for the characterization of hepatitis C virus NS3 variants

    NARCIS (Netherlands)

    Ho, Cynthia K. Y.; Welkers, Matthijs R. A.; Thomas, Xiomara V.; Sullivan, James C.; Kieffer, Tara L.; Reesink, Henk W.; Rebers, Sjoerd P. H.; de Jong, Menno D.; Schinkel, Janke; Molenkamp, Richard

    2015-01-01

    We compared 454 amplicon sequencing with clonal sequencing for the characterization of intra-host hepatitis C virus (HCV) NS3 variants. Clonal and 454 sequences were obtained from 12 patients enrolled in a clinical phase I study for telaprevir, an NS3-4a protease inhibitor. Thirty-nine datasets were

  6. Characterization of Mycoplasma hyosynoviae strains by amplified fragment length polymorphism analysis, pulsed-field gel electrophoresis and 16S ribosomal DNA sequencing

    DEFF Research Database (Denmark)

    Kokotovic, Branko; Friis, N.F.; Ahrens, Peter

    2002-01-01

    , were investigated by analysis of amplified fragment length polymorphisms of the Bgl II and Mfe I restriction sites and by pulsed-field gel electrophoresis of a Bss HII digest of chromosomal DNA. Both methods allowed unambiguous differentiation of the analysed strains and showed similar discriminatory...

  7. Sequence analysis and molecular characterization of Clonorchis sinensis hexokinase, an unusual trimeric 50-kDa glucose-6-phosphate-sensitive allosteric enzyme.

    Directory of Open Access Journals (Sweden)

    Tingjin Chen

    Clonorchiasis, which is induced by the infection of Clonorchis sinensis (C. sinensis), is highly associated with cholangiocarcinoma. Because the available examination, treatment and interrupting transmission provide limited opportunities to prevent infection, it is urgent to develop integrated strategies to prevent and control clonorchiasis. Glycolytic enzymes are crucial molecules for trematode survival and have been targeted for drug development. Hexokinase of C. sinensis (CsHK), the first key regulatory enzyme of the glycolytic pathway, was characterized in this study. The calculated molecular mass (Mr) of CsHK was 50.0 kDa. The obtained recombinant CsHK (rCsHK) was a homotrimer with an Mr of approximately 164 kDa, as determined using native PAGE and gel filtration. The highest activity was obtained with 50 mM glycine-NaOH at pH 10 and 100 mM Tris-HCl at pH 8.5 and 10. The kinetics of rCsHK has a moderate thermal stability. Compared to that of the corresponding negative control, the enzymatic activity was significantly inhibited by praziquantel (PZQ) and anti-rCsHK serum. rCsHK was homotropically and allosterically activated by its substrates, including glucose, mannose, fructose, and ATP. ADP exhibited mixed allosteric effect on rCsHK with respect to ATP, while inorganic pyrophosphate (PPi) displayed net allosteric activation with various allosteric systems. Fructose behaved as a dose-dependent V activator with the substrate glucose. Glucose-6-phosphate (G6P) displayed net allosteric inhibition on rCsHK with respect to ATP or glucose with various allosteric systems in a dose-independent manner. There were differences in both mRNA and protein levels of CsHK among the life stages of adult worm, metacercaria, excysted metacercaria and egg of C. sinensis, suggesting different energy requirements during different development stages. Our study furthers the understanding of the biological functions of CsHK and supports the need to screen for small

  8. Sequence Analysis and Molecular Characterization of Clonorchis sinensis Hexokinase, an Unusual Trimeric 50-kDa Glucose-6-Phosphate-Sensitive Allosteric Enzyme

    Science.gov (United States)

    Chen, Tingjin; Ning, Dan; Sun, Hengchang; Li, Ran; Shang, Mei; Li, Xuerong; Wang, Xiaoyun; Chen, Wenjun; Liang, Chi; Li, Wenfang; Mao, Qiang; Li, Ye; Deng, Chuanhuan; Wang, Lexun; Wu, Zhongdao; Huang, Yan; Xu, Jin; Yu, Xinbing

    2014-01-01

    Clonorchiasis, which is induced by the infection of Clonorchis sinensis (C. sinensis), is highly associated with cholangiocarcinoma. Because the available examination, treatment and interrupting transmission provide limited opportunities to prevent infection, it is urgent to develop integrated strategies to prevent and control clonorchiasis. Glycolytic enzymes are crucial molecules for trematode survival and have been targeted for drug development. Hexokinase of C. sinensis (CsHK), the first key regulatory enzyme of the glycolytic pathway, was characterized in this study. The calculated molecular mass (Mr) of CsHK was 50.0 kDa. The obtained recombinant CsHK (rCsHK) was a homotrimer with an Mr of approximately 164 kDa, as determined using native PAGE and gel filtration. The highest activity was obtained with 50 mM glycine-NaOH at pH 10 and 100 mM Tris-HCl at pH 8.5 and 10. The kinetics of rCsHK has a moderate thermal stability. Compared to that of the corresponding negative control, the enzymatic activity was significantly inhibited by praziquantel (PZQ) and anti-rCsHK serum. rCsHK was homotropically and allosterically activated by its substrates, including glucose, mannose, fructose, and ATP. ADP exhibited mixed allosteric effect on rCsHK with respect to ATP, while inorganic pyrophosphate (PPi) displayed net allosteric activation with various allosteric systems. Fructose behaved as a dose-dependent V activator with the substrate glucose. Glucose-6-phosphate (G6P) displayed net allosteric inhibition on rCsHK with respect to ATP or glucose with various allosteric systems in a dose-independent manner. There were differences in both mRNA and protein levels of CsHK among the life stages of adult worm, metacercaria, excysted metacercaria and egg of C. sinensis, suggesting different energy requirements during different development stages. Our study furthers the understanding of the biological functions of CsHK and supports the need to screen for small molecule inhibitors

  9. Sequence Matching Analysis for Curriculum Development

    Directory of Open Access Journals (Sweden)

    Liem Yenny Bendatu

    2015-06-01

    Many organizations apply information technologies to support their business processes. Using these technologies, actual events are recorded and checked for conformance with a predefined model. Conformance checking is an approach to measuring the fitness and appropriateness between a process model and actual events. However, when there are multiple events with the same timestamp, the traditional approach is unfit to produce such measures. This study attempts to develop a sequence matching analysis. Taking conformance checking as its basis, the proposed approach utilizes the current control flow technique in the process mining domain. A case study in the field of educational processes has been conducted. This study also proposes a curriculum analysis framework to test the proposed approach. By considering the learning sequence of students, it yields some measurements for curriculum development. Finally, the result of the proposed approach has been verified by relevant instructors for further development.

  10. Characterization and perturbation of Gabor frame sequences with rational parameters

    DEFF Research Database (Denmark)

    Bownik, M.; Christensen, Ole

    2007-01-01

    Let $A \subset L^2(\mathbb{R})$ be at most countable, and $p, q \in \mathbb{N}$. We characterize various frame properties for Gabor systems of the form $G(1, p/q, A) = \{e^{2\pi i m x} g(x - np/q) : m, n \in \mathbb{Z},\ g \in A\}$ in terms of the corresponding frame properties for the row vectors in the Zibulski-Zeevi matrix. This extends...... work by [Ron and Shen, Weyl-Heisenberg systems and Riesz bases in $L^2(\mathbb{R}^d)$, Duke Math. J. 89 (1997) 237-282], who considered the case where $A$ is finite. As a consequence of the results, we obtain results concerning stability of Gabor frames under perturbation of the generators. We also introduce...... the concept of rigid frame sequences, which have the property that all sufficiently small perturbations with a lower frame bound above some threshold value automatically generate the same closed linear span. Finally, we characterize rigid Gabor frame sequences in terms of their Zibulski-Zeevi matrix....

  11. Differentiation of Shewanella putrefaciens and Shewanella alga on the basis of whole-cell protein profiles, ribotyping, phenotypic characterization, and 16S rRNA gene sequence analysis

    DEFF Research Database (Denmark)

    Vogel, Birte Fonnesbech; Jørgensen, K.; Christensen, H.

    1997-01-01

    Seventy-six presumed Shewanella putrefaciens isolates from fish, oil drillings, and clinical specimens, the type strain of Shewanella putrefaciens (ATCC 8071), the type strain of Shewanella alga (IAM 14159), and the type strain of Shewanella hanedai (ATCC 33224) were compared by several typing...... methods. Numerical analysis of sodium dodecyl sulfate-polyacrylamide gel electrophoresis of whole-cell protein and ribotyping patterns showed that the strains were separated into two distinct clusters with 56% +/- 10% and 40% +/- 14% similarity for whole-cell protein profiling and ribotyping......, respectively. One cluster consisted of 26 isolates with 52 to 55 mol% G + C and included 15 human isolates, mostly clinical specimens, 8 isolates from marine waters, and the type strain of S. alga. This homogeneous cluster of mesophilic, halotolerant strains was by all analyses identical to the recently...

  12. Network clustering coefficient approach to DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gerhardt, Guenther J.L. [Universidade Federal do Rio Grande do Sul-Hospital de Clinicas de Porto Alegre, Rua Ramiro Barcelos 2350/sala 2040/90035-003 Porto Alegre (Brazil); Departamento de Fisica e Quimica da Universidade de Caxias do Sul, Rua Francisco Getulio Vargas 1130, 95001-970 Caxias do Sul (Brazil); Lemke, Ney [Programa Interdisciplinar em Computacao Aplicada, Unisinos, Av. Unisinos, 950, 93022-000 Sao Leopoldo, RS (Brazil); Corso, Gilberto [Departamento de Biofisica e Farmacologia, Centro de Biociencias, Universidade Federal do Rio Grande do Norte, Campus Universitario, 59072 970 Natal, RN (Brazil)]. E-mail: corso@dfte.ufrn.br

    2006-05-15

    In this work we propose an alternative DNA sequence analysis tool based on graph theoretical concepts. The methodology investigates the path topology of an organism's genome through a triplet network. In this network, triplets in the DNA sequence are vertices, and two vertices are connected if they occur juxtaposed on the genome. We characterize this network topology by measuring the clustering coefficient. We test our methodology against two main biases: the guanine-cytosine (GC) content and the 3-bp (base pair) periodicity of the DNA sequence. We perform the test by constructing random networks with variable GC content and imposed 3-bp periodicity. A test group of some organisms is constructed and we investigate the methodology in the light of the constructed random networks. We conclude that the clustering coefficient is a valuable tool since it gives information that is not trivially contained in the 3-bp periodicity nor in the variable GC content.
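    The triplet network and its clustering coefficient can be sketched as follows. The abstract does not say how the genome is cut into triplets, so this sketch assumes non-overlapping triplets read in a single frame and an undirected, unweighted graph; the function names `triplet_network` and `average_clustering` are hypothetical.

```python
from itertools import combinations


def triplet_network(seq, frame=0):
    """Build an undirected graph whose vertices are DNA triplets.

    Assumption: the sequence is read as non-overlapping triplets
    starting at `frame`, and two triplets are linked when one
    directly follows the other.
    """
    seq = seq.upper()
    triplets = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    adj = {}
    for a, b in zip(triplets, triplets[1:]):
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if a != b:  # ignore self-loops
            adj[a].add(b)
            adj[b].add(a)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient over all vertices.

    Vertices with fewer than two neighbours contribute zero.
    """
    if not adj:
        return 0.0
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)
```

    Comparing `average_clustering(triplet_network(genome))` with the value obtained for shuffled sequences of matching GC content would mirror, in spirit, the random-network controls mentioned in the abstract.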

  13. Sequence analysis of the genome of carnation (Dianthus caryophyllus L.).

    Science.gov (United States)

    Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

    2014-06-01

    The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. 'Francesco' was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568,887,315 bp, consisting of 45,088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16,644 bp and 60,737 bp, respectively, and the longest scaffold was 1,287,144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNA and miRNA, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43,266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ∼98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp. © The Author 2013. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
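    For readers unfamiliar with the N50 statistic quoted above, the following minimal sketch shows one common way to compute it from a list of contig or scaffold lengths. The function name `n50` is hypothetical, and the convention used here (the length at which the running total of the sorted lengths first reaches half of the assembly size) is the usual definition rather than one stated in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Lengths are sorted from longest to shortest; the N50 is the length
    at which the running total first reaches half of the total.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no lengths given")
    half = sum(ordered) / 2.0
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
```

    The coverage figure in the abstract follows from simple arithmetic on the reported totals: 568,887,315 bp divided by the estimated 622 Mb genome is about 0.91, which matches the stated 91%.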

  14. Analysis and functional characterization of sequence variations in ligand binding domain of thyroid hormone receptors in autism spectrum disorder (ASD) patients.

    Science.gov (United States)

    Kalikiri, Mahesh Kumar; Mamidala, Madhu Poornima; Rao, Ananth N; Rajesh, Vidya

    2017-12-01

    Autism spectrum disorder (ASD) is a neurodevelopmental disorder, reported to be on the rise in the past two decades. Thyroid hormone T3 plays an important role in early embryonic and central nervous system development. T3 mediates its function by binding to thyroid hormone receptors, TRα and TRβ. Alterations in T3 levels and thyroid receptor mutations have been earlier implicated in neuropsychiatric disorders and have been linked to environmental toxins. Limited reports from earlier studies have shown promising results for T3 treatment in children with ASD, and that the thyroid hormone levels in these children were also normal. This makes it necessary to explore the genetic variations in the components of the thyroid hormone pathway in ASD children. To achieve this objective, we performed genetic analysis of the ligand binding domain of the THRA and THRB receptor genes in 30 ASD subjects and in age-matched controls from India. Our study for the first time reports novel single nucleotide polymorphisms in the THRA and THRB receptor genes of ASD individuals. Autism Res 2017, 10: 1919-1928. ©2017 International Society for Autism Research, Wiley Periodicals, Inc. Thyroid hormone (T3) and thyroid receptors (TRα and TRβ) are the major components of the thyroid hormone pathway. The link between the thyroid pathway and neuronal development is proven in clinical medicine. Since the thyroid hormone levels in autistic children are normal, variations in their receptors need to be explored. To achieve this objective, changes in the THRA and THRB receptor genes were studied in 30 ASD and normal children from India. The impact of some of these mutations on receptor function was also studied. © 2017 International Society for Autism Research, Wiley Periodicals, Inc.

  15. Analysis of Pteridium ribosomal RNA sequences by rapid direct sequencing.

    Science.gov (United States)

    Tan, M K

    1991-08-01

    A total of 864 bases from 5 regions interspersed in the 18S and 26S rRNA molecules from various clones of Pteridium covering the general geographical distribution of the genus was analysed using a rapid rRNA sequencing technique. No base difference has been detected amongst the three major lineages, two of which apparently separated before the breakup of the ancient supercontinent, Pangaea. These regions of the rRNA sequences have thus been conserved for at least 160 million years and are here compared with other eukaryotic, especially plant rRNAs.

  16. Sequencing and characterization of the guppy (Poecilia reticulata) transcriptome

    Directory of Open Access Journals (Sweden)

    Rodd F Helen

    2011-04-01

    Background: Next-generation sequencing is providing researchers with a relatively fast and affordable option for developing genomic resources for organisms that are not among the traditional genetic models. Here we present a de novo assembly of the guppy (Poecilia reticulata) transcriptome using 454 sequence reads, and we evaluate potential uses of this transcriptome, including detection of sex-specific transcripts and deployment as a reference for gene expression analysis in guppies and a related species. Guppies have been model organisms in ecology, evolutionary biology, and animal behaviour for over 100 years. An annotated transcriptome and other genomic tools will facilitate understanding the genetic and molecular bases of adaptation and variation in a vertebrate species with a uniquely well known natural history. Results: We generated approximately 336 Mbp of mRNA sequence data from male brain, male body, female brain, and female body. The resulting 1,162,670 reads assembled into 54,921 contigs, creating a reference transcriptome for the guppy with an average read depth of 28×. We annotated nearly 40% of this reference transcriptome by searching protein and gene ontology databases. Using this annotated transcriptome database, we identified candidate genes of interest to the guppy research community, putative single nucleotide polymorphisms (SNPs), and male-specific expressed genes. We also showed that our reference transcriptome can be used for RNA-sequencing-based analysis of differential gene expression. We identified transcripts that, in juveniles, are regulated differently in the presence and absence of an important predator, Rivulus hartii, including two genes implicated in stress response. For each sample in the RNA-seq study, >50% of high-quality reads mapped to unique sequences in the reference database with high confidence. In addition, we evaluated the use of the guppy reference transcriptome for gene expression analyses in

  17. Cloning, characterization and sequence comparison of the gene coding for IMP dehydrogenase from Pyrococcus furiosus.

    Science.gov (United States)

    Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E

    1996-10-03

    We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Pyrococcus furiosus (Pf), a hyperthermophilic archaeon. Sequence analysis of the Pf gene indicated an open reading frame specifying a protein of 485 amino acids (aa) with a calculated M(r) of 52900. Canonical archaeal promoter elements, Box A and Box B, are located -49 and -17 nucleotides (nt), respectively, upstream of the putative start codon. The sequence of the putative active-site region conforms to the IMPDH signature motif and contains a putative active-site cysteine. Phylogenetic relationships derived by using all available IMPDH sequences are consistent with trees developed for other molecules; they do not precisely resolve the history of Pf IMPDH but indicate a close similarity to bacterial IMPDH proteins. The phylogenetic analysis indicates that a gene duplication occurred prior to the division between rodents and humans, accounting for the Type I and II isoforms identified in mice and humans.

  18. Characterizing immunoglobulin repertoire from whole blood by a personal genome sequencer.

    Directory of Open Access Journals (Sweden)

    Fan Gao

    In the human immune system, V(D)J recombination produces an enormously large repertoire of immunoglobulins (Ig) so that they can tackle different antigens from bacteria, viruses and tumor cells. Several studies have demonstrated the utility of next-generation sequencers such as Roche 454 and Illumina Genome Analyzer to characterize the repertoire of immunoglobulins. However, these techniques typically require separation of the B cell population from whole blood and require a few weeks for running the sequencers, so it may not be practical to implement them in clinical settings. Recently, the Ion Torrent personal genome sequencer has emerged as a tabletop personal genome sequencer that can be operated in a time-efficient and cost-effective manner. In this study, we explored the technical feasibility of using multiplex PCR to amplify V(D)J recombination for IgH directly from whole blood, then sequencing the amplicons with the Ion Torrent sequencer. The whole process including data generation and analysis can be completed in one day. We tested the method in a pilot study on patients with benign, atypical and malignant meningiomas. Despite the noisy data, we were able to compare the samples by their usage frequencies of the V segment, as well as their somatic hypermutation rates. In summary, our study suggested that it is technically feasible to perform clinical monitoring of V(D)J recombination within a day by personal genome sequencers.

  19. Genome survey sequencing and genetic background characterization of Gracilariopsis lemaneiformis (Rhodophyta) based on next-generation sequencing.

    Science.gov (United States)

    Zhou, Wei; Hu, Yiyi; Sui, Zhenghong; Fu, Feng; Wang, Jinguo; Chang, Lianpeng; Guo, Weihua; Li, Binbin

    2013-01-01

    Gracilariopsis lemaneiformis has a high economic value and is one of the most important aquaculture species in China. Despite its economic importance, it has remained largely unstudied at the genomic level. In this study, we conducted a genome survey of Gp. lemaneiformis using next-generation sequencing (NGS) technologies. In total, 18.70 Gb of high-quality sequence data with an estimated genome size of 97 Mb were obtained by HiSeq 2000 sequencing for Gp. lemaneiformis. These reads were assembled into 160,390 contigs with an N50 length of 3.64 kb, which were further assembled into 125,685 scaffolds with a total length of 81.17 Mb. Genome analysis predicted 3490 genes and a GC content of 48%. The identified genes have an average transcript length of 1,429 bp, an average coding sequence size of 1,369 bp, 1.36 exons per gene, exon length of 1,008 bp, and intron length of 191 bp. From the initial assembled scaffold, transposable elements constituted 54.64% (44.35 Mb) of the genome, and 7737 simple sequence repeats (SSRs) were identified. Among these SSRs, the trinucleotide repeat type was the most abundant (up to 73.20% of total SSRs), followed by the di- (17.41%), tetra- (5.49%), hexa- (2.90%), and penta- (1.00%) nucleotide repeat type. These characteristics suggest that Gp. lemaneiformis is a model organism for genetic study. This is the first report of genome-wide characterization within this taxon.
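    The abstract does not name the tool or thresholds used to identify SSRs, so the following is only a minimal sketch of how di- to hexanucleotide repeats could be located with a regular expression. The function names `find_ssrs` and `is_primitive` and the minimum repeat counts are illustrative assumptions, not the authors' settings.

```python
import re

# Assumed minimum repeat counts per unit length (illustrative only).
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter motif."""
    for d in range(1, len(unit)):
        if len(unit) % d == 0 and unit == unit[:d] * (len(unit) // d):
            return False
    return True


def find_ssrs(seq, min_repeats=None):
    """Return (start, unit, repeat_count) for each SSR found in seq."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A unit followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_primitive(unit):  # e.g. skip ATAT, already counted as AT
                hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits
```

    Tallying the returned hits by `len(unit)` would give the kind of breakdown by repeat type reported above (tri-, di-, tetra-, hexa- and pentanucleotide shares).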

  20. Genome Survey Sequencing and Genetic Background Characterization of Gracilariopsis lemaneiformis (Rhodophyta) Based on Next-Generation Sequencing

    Science.gov (United States)

    Sui, Zhenghong; Fu, Feng; Wang, Jinguo; Chang, Lianpeng; Guo, Weihua; Li, Binbin

    2013-01-01

    Gracilariopsis lemaneiformis has a high economic value and is one of the most important aquaculture species in China. Despite its economic importance, it has remained largely unstudied at the genomic level. In this study, we conducted a genome survey of Gp. lemaneiformis using next-generation sequencing (NGS) technologies. In total, 18.70 Gb of high-quality sequence data with an estimated genome size of 97 Mb were obtained by HiSeq 2000 sequencing for Gp. lemaneiformis. These reads were assembled into 160,390 contigs with an N50 length of 3.64 kb, which were further assembled into 125,685 scaffolds with a total length of 81.17 Mb. Genome analysis predicted 3490 genes and a GC content of 48%. The identified genes have an average transcript length of 1,429 bp, an average coding sequence size of 1,369 bp, 1.36 exons per gene, exon length of 1,008 bp, and intron length of 191 bp. From the initial assembled scaffold, transposable elements constituted 54.64% (44.35 Mb) of the genome, and 7737 simple sequence repeats (SSRs) were identified. Among these SSRs, the trinucleotide repeat type was the most abundant (up to 73.20% of total SSRs), followed by the di- (17.41%), tetra- (5.49%), hexa- (2.90%), and penta- (1.00%) nucleotide repeat type. These characteristics suggest that Gp. lemaneiformis is a model organism for genetic study. This is the first report of genome-wide characterization within this taxon. PMID:23875008

  1. Molecular characterization of Taenia multiceps isolates from Gansu Province, China by sequencing of mitochondrial cytochrome C oxidase subunit 1.

    Science.gov (United States)

    Li, Wen Hui; Jia, Wan Zhong; Qu, Zi Gang; Xie, Zhi Zhou; Luo, Jian Xun; Yin, Hong; Sun, Xiao Lin; Blaga, Radu; Fu, Bao Quan

    2013-04-01

    A total of 16 Taenia multiceps isolates collected from naturally infected sheep or goats in Gansu Province, China were characterized by sequences of mitochondrial cytochrome c oxidase subunit 1 (cox1) gene. The complete cox1 gene was amplified for individual T. multiceps isolates by PCR, ligated to pMD18T vector, and sequenced. Sequence analysis indicated that out of 16 T. multiceps isolates 10 unique cox1 gene sequences of 1,623 bp were obtained with sequence variation of 0.12-0.68%. The results showed that the cox1 gene sequences were highly conserved among the examined T. multiceps isolates. However, they were quite different from those of the other Taenia species. Phylogenetic analysis based on complete cox1 gene sequences revealed that T. multiceps isolates were composed of 3 genotypes and distinguished from the other Taenia species.
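
    To give a sense of scale for the reported 0.12-0.68% variation, the sketch below computes an uncorrected pairwise percent difference between two aligned sequences of equal length. The abstract does not say how variation was calculated, so treating it as a simple mismatch proportion is an assumption, and the function name is hypothetical. Under that assumption, 0.12% and 0.68% of 1,623 bp correspond to roughly 2 and 11 differing sites.

```python
def percent_difference(seq_a, seq_b):
    """Uncorrected percent difference between two aligned sequences.

    Both sequences must already be aligned to the same length; every
    position where the characters differ counts as one mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


# Arithmetic check against the abstract's figures (1,623 bp cox1 gene).
gene_length = 1623
for differing_sites in (2, 11):
    pct = 100.0 * differing_sites / gene_length
    print(differing_sites, "sites ->", round(pct, 2), "%")
```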

  2. Sequence Ready Characterization of the Pericentromeric Region of 19p12

    Energy Technology Data Exchange (ETDEWEB)

    Evan E. Eichler

    2006-08-31

    Current mapping and sequencing strategies have been inadequate within the proximal portion of 19p12 due, in part, to the presence of a recently expanded ZNF (zinc-finger) gene family and the presence of large (25-50 kb) inverted beta-satellite repeat structures which bracket this tandemly duplicated gene family. The virtual absence of classically defined “unique” sequence within the region has hampered efforts to identify and characterize a suitable minimal tiling path of clones which can be used as templates required for finished sequencing of the region. The goal of this proposal is to develop and implement a novel sequence-anchor strategy to generate a contiguous BAC map of the most proximal portion of chromosome 19p12 for the purpose of complete sequence characterization. The target region will be an estimated 4.5 Mb of DNA extending from STS marker D19S450 (the beginning of the ZNF gene cluster) to the centromeric (alpha-satellite) junction of 19p11. The approach will entail 1) pre-selection of 19p12 BAC and cosmid clones (NIH approved library) utilizing both 19p12-unique and 19p12-specific repeat probes (Eichler et al., 1998); 2) the generation of a BAC/cosmid end-sequence map across the region with a density of one marker every 8 kb; 3) the development of a second generation of STS (sequence tagged sites) which will be used to identify and verify clonal overlap at the level of the sequence; 4) incorporation of these sequence-anchored overlapping clones into existing cosmid/BAC restriction maps developed at Livermore National Laboratory; and 5) validation of the organization of this region utilizing high-resolution FISH techniques (extended chromatin analysis) on monochromosomal 19 somatic cell hybrids and parental cell lines of source material. The data generated will be used in the selection of the most parsimonious tiling path of BAC clones to be sequenced as part of the JGI effort on chromosome 19 and should serve as a model for the sequence

  3. FAST: FAST Analysis of Sequences Toolbox

    Directory of Open Access Journals (Sweden)

    Travis J. Lawrence

    2015-05-01

    Full Text Available FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU’s Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought.
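
    The grep-style selection of sequence records described above can be illustrated in plain Python. This is a sketch of the general idea only: it is not the FAST implementation (which is written in Perl on BioPerl) and does not reproduce fasgrep's options. The function names and the example records are hypothetical.

```python
import re


def parse_fasta(text):
    """Parse Multi-FastA text into a list of (description, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def grep_records(records, pattern):
    """Keep records whose description line matches the regular expression."""
    regex = re.compile(pattern)
    return [(desc, seq) for desc, seq in records if regex.search(desc)]


# Hypothetical example input, for illustration only.
example = """>seq1 Giardia lamblia snoRNA
ACGTACGT
ACGT
>seq2 Trichomonas vaginalis rRNA
GGGCCC
"""

hits = grep_records(parse_fasta(example), r"Giardia")
for description, sequence in hits:
    print(description, len(sequence))
```

    Keeping parsing and filtering as separate functions mirrors the composable, one-task-per-tool style that the abstract attributes to the Textutils model.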

  4. Bayesian Correlation Analysis for Sequence Count Data.

    Directory of Open Access Journals (Sweden)

    Daniel Sánchez-Taltavull

    Full Text Available Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities' measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low, especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities' signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset.

  5. A basic analysis toolkit for biological sequences

    Directory of Open Access Journals (Sweden)

    Siragusa Enrico

    2007-09-01

    Full Text Available Abstract This paper presents a software library, nicknamed BATS, for some basic sequence analysis tasks: local alignments, via approximate string matching, and global alignments, via longest common subsequence and alignments with affine and concave gap cost functions. It also supports filtering operations to select strings from a set and establish their statistical significance, via z-score computation. None of the algorithms is new, but although they are generally regarded as fundamental for sequence analysis, they have not been implemented in a single and consistent software package, as we do here. Therefore, our main contribution is to fill this gap between algorithmic theory and practice by providing an extensible and easy-to-use software library that includes algorithms for the mentioned string matching and alignment problems. The library consists of C/C++ library functions as well as Perl library functions. It can be interfaced with Bioperl and can also be used as a stand-alone system with a GUI. The software is available at http://www.math.unipa.it/~raffaele/BATS/ under the GNU GPL.
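
    One of the tasks listed above, the longest common subsequence, can be sketched with the textbook dynamic-programming algorithm. The code below is a plain Python illustration and does not use or reproduce the BATS C/C++ or Perl interfaces; the function name and the test strings are hypothetical.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of strings a and b.

    table[i][j] holds the LCS length of a[:i] and b[:j]; the subsequence
    itself is recovered by tracing back from the bottom-right cell.
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Trace back through the table to collect the matched characters.
    matched = []
    i, j = len(a), len(b)
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            matched.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(matched))


print(longest_common_subsequence("AGGTAB", "GXTXAYB"))  # GTAB
```

    This version uses time and memory proportional to the product of the two lengths, which is fine for short strings but is the kind of cost a dedicated library would typically try to reduce.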

  6. Whole genome sequence analysis of Mycobacterium suricattae

    KAUST Repository

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; Van Der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah; Siame, Kabengele Keith; Gey Van Pittius, Nicolaas Claudius; Van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-01-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  7. Whole genome sequence analysis of Mycobacterium suricattae

    KAUST Repository

    Dippenaar, Anzaan

    2015-10-21

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  8. Comparison of next generation sequencing technologies for transcriptome characterization

    Directory of Open Access Journals (Sweden)

    Soltis Douglas E

    2009-08-01

    Full Text Available Abstract Background We have developed a simulation approach to help determine the optimal mixture of sequencing methods for the most complete and cost-effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. Results The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. Conclusion NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, the NG sequencing is a dramatic advance

  9. Biological characterization and complete nucleotide sequence of a Tunisian isolate of Moroccan watermelon mosaic virus.

    Science.gov (United States)

    Yakoubi, S; Desbiez, C; Fakhfakh, H; Wipf-Scheibel, C; Marrakchi, M; Lecoq, H

    2008-01-01

    During a survey conducted in October 2005, cucurbit leaf samples showing virus-like symptoms were collected from the major cucurbit-growing areas in Tunisia. DAS-ELISA showed the presence of Moroccan watermelon mosaic virus (MWMV, Potyvirus), detected for the first time in Tunisia, in samples from the region of Cap Bon (Northern Tunisia). MWMV isolate TN05-76 (MWMV-Tn) was characterized biologically and its full-length genome sequence was established. MWMV-Tn was found to have biological properties similar to those reported for the MWMV type strain from Morocco. Phylogenetic analysis including the comparison of complete amino-acid sequences of 42 potyviruses confirmed that MWMV-Tn is related (65% amino-acid sequence identity) to Papaya ringspot virus (PRSV) isolates but is a member of a distinct virus species. Sequence analysis on parts of the CP gene of MWMV isolates from different geographical origins revealed some geographic structure of MWMV variability, with three different clusters: one cluster including isolates from the Mediterranean region, a second including isolates from western and central Africa, and a third one including isolates from the southern part of Africa. A significant correlation was observed between geographic and genetic distances between isolates. Isolates from countries in the Mediterranean region where MWMV has recently emerged (France, Spain, Portugal) have highly conserved sequences, suggesting that they may have a common and recent origin. MWMV from Sudan, a highly divergent variant, may be considered an evolutionary intermediate between MWMV and PRSV.

  10. Computational analysis of sequence selection mechanisms.

    Science.gov (United States)

    Meyerguz, Leonid; Grasso, Catherine; Kleinberg, Jon; Elber, Ron

    2004-04-01

    Mechanisms leading to gene variations are responsible for the diversity of species and are important components of the theory of evolution. One constraint on gene evolution is that of protein foldability; the three-dimensional shapes of proteins must be thermodynamically stable. We explore the impact of this constraint and calculate properties of foldable sequences using 3660 structures from the Protein Data Bank. We seek a selection function that receives sequences as input, and outputs survival probability based on sequence fitness to structure. We compute the number of sequences that match a particular protein structure with energy lower than the native sequence, the density of the number of sequences, the entropy, and the "selection" temperature. The mechanism of structure selection for sequences longer than 200 amino acids is approximately universal. For shorter sequences, it is not. We speculate on concrete evolutionary mechanisms that show this behavior.

  11. Comparative analysis of sequences from PT 2013

    DEFF Research Database (Denmark)

    Mikkelsen, Susie Sommer

    Sheatfish and not EHNV. Generally, mistakes occurred at the ends of the sequences. This can be due to several factors. One is that the sequence has not been trimmed of the sequence primer sites. Another is the lack of quality control of the chromatogram. Finally, sequencing in just one direction can result...... diseases in Europe. As part of the EURL proficiency test for fish diseases it is required to sequence any RANA virus isolates found in any of the samples. It is also highly recommended to sequence the ISA virus to determine whether it be HPRΔ or HPR0. Furthermore, it is recommended that any VHSV and IHNV...... isolates be genotyped. As part of the evaluation of the proficiency results it was decided this year to look into the quality and similarity of the sequence results for selected viruses. Ampoule III in the proficiency test 2013 contained an EHNV isolate. The EURL received 43 sequences from 41 laboratories...

  12. SVAMP: Sequence variation analysis, maps and phylogeny

    KAUST Repository

    Naeem, Raeece

    2014-04-03

    Summary: SVAMP is a stand-alone desktop application to visualize genomic variants (in variant call format) in the context of geographical metadata. Users of SVAMP are able to generate phylogenetic trees and perform principal coordinate analysis in real time from variant call format (VCF) and associated metadata files. Allele frequency map, geographical map of isolates, Tajima's D metric, single nucleotide polymorphism density, GC and variation density are also available for visualization in real time. We demonstrate the utility of SVAMP in tracking a methicillin-resistant Staphylococcus aureus outbreak from published next-generation sequencing data across 15 countries. We also demonstrate the scalability and accuracy of our software on 245 Plasmodium falciparum malaria isolates from three continents. Availability and implementation: The Qt/C++ software code, binaries, user manual and example datasets are available at http://cbrc.kaust.edu.sa/svamp. © The Author 2014.

  13. Statistical analysis of next generation sequencing data

    CERN Document Server

    Nettleton, Dan

    2014-01-01

    Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...

  14. Movement Pattern Analysis Based on Sequence Signatures

    Directory of Open Access Journals (Sweden)

    Seyed Hossein Chavoshi

    2015-09-01

    Full Text Available Increased affordability and deployment of advanced tracking technologies have led researchers from various domains to analyze the resulting spatio-temporal movement data sets for the purpose of knowledge discovery. Two different approaches can be considered in the analysis of moving objects: quantitative analysis and qualitative analysis. This research focuses on the latter and uses the qualitative trajectory calculus (QTC), a type of calculus that represents qualitative data on moving point objects (MPOs), and establishes a framework to analyze the relative movement of multiple MPOs. A visualization technique called sequence signature (SESI) is used, which makes it possible to map QTC patterns in a 2D indexed rasterized space in order to evaluate the similarity of relative movement patterns of multiple MPOs. The applicability of the proposed methodology is illustrated by means of two practical examples of interacting MPOs: cars on a highway and body parts of a samba dancer. The results show that the proposed method can be effectively used to analyze interactions of multiple MPOs in different domains.

  15. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Full Text Available Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  16. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing

    Directory of Open Access Journals (Sweden)

    Muhammad Naveed

    2014-09-01

    Full Text Available In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Identification of bacterial isolates was made by 16S rRNA gene sequence analysis and taxonomical confirmation on EzTaxon Server. The identified bacterial strains belonged to 5 genera, i.e. Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relationship of bacterial strains with the respective genera. Based on phylogenetic analysis, some candidate novel species were also identified. The bacterial strains were also characterized by morphological, physiological and biochemical tests and for the glucose dehydrogenase (gdh) gene, which is involved in phosphate solubilization using the cofactor pyrroloquinoline quinone (PQQ). Seven rhizospheric and 3 root-nodulating strains were positive for the gdh gene. Furthermore, this study confirms a novel association between microbes and their hosts, such as field-grown crops and leguminous and non-leguminous plants. It was concluded that a diverse bacterial population exists in the rhizosphere and root nodules that might be useful in evaluating the mechanisms behind plant-microbe interactions, and that strains QAU-63 and QAU-68, with sequence similarities of 97% and 95%, respectively, might be declared novel after further taxonomic characterization.

  17. Noncoding sequence classification based on wavelet transform analysis: part I

    Science.gov (United States)

    Paredes, O.; Strojnik, M.; Romo-Vázquez, R.; Vélez Pérez, H.; Ranta, R.; Garcia-Torales, G.; Scholl, M. K.; Morales, J. A.

    2017-09-01

    DNA sequences in the human genome can be divided into coding and noncoding ones. Coding sequences are those that are read during transcription. The identification of coding sequences has been widely reported in the literature due to its much-studied periodicity. Noncoding sequences represent the majority of the human genome. They play an important role in gene regulation and differentiation among the cells. However, noncoding sequences do not exhibit periodicities that correlate to their functions. The ENCODE (Encyclopedia of DNA Elements) and Epigenomic Roadmap projects have cataloged the human noncoding sequences into specific functions. We study characteristics of noncoding sequences with wavelet analysis of genomic signals.

  18. Image sequence analysis workstation for multipoint motion analysis

    Science.gov (United States)

    Mostafavi, Hassan

    1990-08-01

    This paper describes an application-specific engineering workstation designed and developed to analyze motion of objects from video sequences. The system combines the software and hardware environment of a modern graphic-oriented workstation with digital image acquisition, processing and display techniques. In addition to automation and increase in throughput of data reduction tasks, the objective of the system is to provide less invasive methods of measurement by offering the ability to track objects that are more complex than reflective markers. Grey level image processing and spatial/temporal adaptation of the processing parameters are used for location and tracking of more complex features of objects under uncontrolled lighting and background conditions. The applications of such an automated and noninvasive measurement tool include analysis of the trajectory and attitude of rigid bodies such as human limbs, robots, aircraft in flight, etc. The system's key features are: 1) acquisition and storage of image sequences by digitizing and storing real-time video; 2) computer-controlled movie loop playback, freeze frame display, and digital image enhancement; 3) multiple leading edge tracking in addition to object centroids at up to 60 fields per second from both live input video or a stored image sequence; 4) model-based estimation and tracking of the six degrees of freedom of a rigid body; 5) field-of-view and spatial calibration; 6) image sequence and measurement data base management; and 7) offline analysis software for trajectory plotting and statistical analysis.

  19. Pig genome sequence - analysis and publication strategy

    DEFF Research Database (Denmark)

    Archibald, Alan L.; Bolund, Lars; Churcher, Carol

    2010-01-01

    preferentially selected for sequencing. In accordance with the Bermuda and Fort Lauderdale agreements and the more recent Toronto Statement the data have been released into public sequence repositories (Genbank/EMBL, NCBI/Ensembl trace repositories) in a timely manner and in advance of publication. CONCLUSIONS...

  20. Simple sequence repeat (SSR) vs. sequence-related amplified polymorphism (SRAP) markers for Cynara cardunculus characterization

    Energy Technology Data Exchange (ETDEWEB)

    Casadevall, R.; Martin, E.; Cravero, V.

    2011-07-01

    Little is known about the genetic variability present in globe artichoke and in cultivated and wild cardoons. This knowledge is very important for efficient utilization of genetic resources and for gaining a better understanding of the genetic structure of these botanical varieties. With the aims of determining genetic distances between Cynara cardunculus accessions and of comparing two molecular marker systems for their efficiency in distinguishing between botanical varieties, a molecular characterization of sixteen accessions from different geographical origins was performed. Seven SSR and seven SRAP markers were used to characterize the varieties and to calculate genetic distances between them. Both distance matrices were subjected to cluster analysis. Exclusive SSR alleles were found for globe artichoke and for wild cardoon, but no exclusive alleles were found for cultivated cardoon. For both marker systems two major groups were identified, one of which included mostly globe artichoke accessions while the other grouped mainly cardoons. The differences observed in the sub-cluster conformation with each marker system may be due to intrinsic characteristics of the markers. In conclusion, both kinds of molecular markers are valuable tools for studying genetic distances between C. cardunculus accessions, although they give different information. Nevertheless, SSR electrophoretic profiles are simpler to score than SRAP profiles because they consist of just a few bands. In addition, SSR bands are highly informative because of the great number of alleles existing in the population and because SSRs are codominant markers. The use of SSRs would also reduce time and costs. (Author) 31 refs.

  1. Sequence analysis of PROTEOLYSIS 6 from Solanum lycopersicum

    Science.gov (United States)

    Roslan, Nur Farhana; Chew, Bee Lyn; Goh, Hoe-Han; Isa, Nurulhikma Md

    2018-04-01

    The N-end rule pathway is a protein degradation pathway that relates the half-life of a protein to the identity of its N-terminal residue. A destabilizing N-terminal residue is created by enzymatic reaction or chemical modification. The destabilized substrate is recognized by the PROTEOLYSIS 6 (PRT6) protein, an E3 ligase enzyme, which results in substrate degradation by the proteasome. PRT6 has been studied in Arabidopsis thaliana and barley but has not yet been studied in fleshy fruit plants. Hence, this study was carried out in tomato, which is known as the model for fleshy fruit plants. BLASTX analysis identified Solyc09g010830 as encoding PRT6 in tomato, based on its sequence similarity with PRT6 in A. thaliana. In silico gene expression analysis shows that the PRT6 gene was highly expressed in tomato fruits at the breaker +5 stage. Co-expression analysis shows that PRT6 may be involved not only in abiotic stresses but also in biotic stresses. The objective is to analyze the sequence and characterize the PRT6 gene in tomato.

  2. Incident sequence analysis; event trees, methods and graphical symbols

    International Nuclear Information System (INIS)

    1980-11-01

    When analyzing incident sequences, unwanted events resulting from a certain cause are looked for. Graphical symbols and explanations of graphical representations are presented. The method applies to the analysis of incident sequences in all types of facilities. By means of the incident sequence diagram, incident sequences, i.e. the logical and chronological course of repercussions initiated by the failure of a component or by an operating error, can be presented and analyzed simply and clearly.

  3. Computer-aided visualization and analysis system for sequence evaluation

    Energy Technology Data Exchange (ETDEWEB)

    Chee, Mark S.; Wang, Chunwei; Jevons, Luis C.; Bernhart, Derek H.; Lipshutz, Robert J.

    2004-05-11

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  4. Characterizing ncRNAs in human pathogenic protists using high-throughput sequencing technology

    Directory of Open Access Journals (Sweden)

    Lesley Joan Collins

    2011-12-01

    Full Text Available ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, snoRNAs and long ncRNAs on a genomic scale making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases.

  5. Characterizing ncRNAs in Human Pathogenic Protists Using High-Throughput Sequencing Technology

    Science.gov (United States)

    Collins, Lesley Joan

    2011-01-01

    ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses, and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, small nucleolar RNAs (snoRNAs), and long ncRNAs on a genomic scale, making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational, and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases. PMID:22303390

  6. Characterization of race 65 of Colletotrichum lindemuthianum by sequencing ITS regions

    Directory of Open Access Journals (Sweden)

    Marcela Coelho

    2016-09-01

    Full Text Available The present work aimed to characterize isolates of C. lindemuthianum race 65 from different regions in Brazil by ITS sequencing. A total of 17 isolates of race 65, collected in the states of Mato Grosso, Minas Gerais, Paraná, Santa Catarina and São Paulo, were studied. Analysis of the sequences of isolates 8, 9, 12, 14 and 15 revealed the presence of two single nucleotide polymorphisms (SNPs) in the ITS1 region at the same positions. These isolates, when analyzed together with the sequence of isolate 17, revealed a SNP in the ITS2 region. The highest genetic dissimilarity, observed between isolates 11 and 3 and between isolates 11 and 10, was 0.772. In turn, isolates 7 and 2 were the most similar, with a value of 0.002 for genetic distance. The phylogenetic tree obtained based on the sequences of the ITS1 and ITS2 regions revealed the formation of two groups, one with a subgroup. The results reveal high molecular variability among isolates of race 65 of C. lindemuthianum.

  7. Experimental design-based functional mining and characterization of high-throughput sequencing data in the sequence read archive.

    Directory of Open Access Journals (Sweden)

    Takeru Nakazato

    Full Text Available High-throughput sequencing technology, also called next-generation sequencing (NGS), has the potential to revolutionize the whole process of genome sequencing, transcriptomics, and epigenetics. Sequencing data is captured in a public primary data archive, the Sequence Read Archive (SRA). As of January 2013, data from more than 14,000 projects have been submitted to SRA, which is double that of the previous year. Researchers can download raw sequence data from the SRA website to perform further analyses and to compare with their own data. However, it is extremely difficult to search entries and download raw sequences of interest with SRA because the data structure is complicated, and experimental conditions along with raw sequences are partly described in natural language. Additionally, some sequences are of inconsistent quality because anyone can submit sequencing data to SRA with no quality check. Therefore, as a criterion of data quality, we focused on SRA entries that were cited in journal articles. We extracted SRA IDs and PubMed IDs (PMIDs) from SRA and full-text versions of journal articles and retrieved 2748 SRA ID-PMID pairs. We constructed a publication list referring to SRA entries. Since one of the main themes of -omics analyses is clarification of disease mechanisms, we also characterized SRA entries by disease keywords, according to the Medical Subject Headings (MeSH) extracted from articles assigned to each SRA entry. We obtained 989 SRA ID-MeSH disease term pairs, and constructed a disease list referring to SRA data. We previously developed feature profiles of diseases in a system called "Gendoo". We generated hyperlinks between diseases extracted from SRA and their feature profiles. The developed project, publication and disease lists resulting from this study are available at our web service, called "DBCLS SRA" (http://sra.dbcls.jp/). This service will improve accessibility to high-quality data from SRA.

  8. Establishing a framework for comparative analysis of genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment. The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied on protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
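
    The "constrained column" idea can be illustrated with a minimal sketch that treats an alignment as a small abstract data type and reports gap-free columns whose residues vary but stay within one property class. The class name Alignment, the property table PROPERTY and the toy sequences are assumptions made for illustration only; the toolkit described in the abstract is written in Prolog, and its actual operations (including the use of the phylogenetic tree) are not reproduced here.

```python
from typing import Dict, List

# Hypothetical, simplified property classes for amino acids (an assumption,
# not the classification used in the paper).
PROPERTY: Dict[str, str] = {}
for residues, label in [("AVLIMFWC", "hydrophobic"),
                        ("STNQYGP", "polar"),
                        ("KRH", "basic"),
                        ("DE", "acidic")]:
    for aa in residues:
        PROPERTY[aa] = label


class Alignment:
    """A multiple sequence alignment stored as equal-length rows."""

    def __init__(self, rows: List[str]) -> None:
        if len({len(r) for r in rows}) > 1:
            raise ValueError("all aligned rows must have the same length")
        self.rows = rows

    def column(self, j: int) -> str:
        """Return column j as a string, one character per sequence."""
        return "".join(row[j] for row in self.rows)

    def constrained_columns(self) -> List[int]:
        """Indices of gap-free columns that mutate but keep one property."""
        hits = []
        for j in range(len(self.rows[0])):
            col = self.column(j)
            if "-" in col:
                continue  # skip columns containing gaps
            classes = {PROPERTY.get(aa) for aa in col}
            # One shared, known class, and at least two distinct residues.
            if len(classes) == 1 and None not in classes and len(set(col)) > 1:
                hits.append(j)
        return hits


aln = Alignment(["MKVLD", "MRILE", "MKLVD"])
print(aln.constrained_columns())  # -> [1, 2, 3, 4]
```

    Column 0 is excluded because it is fully conserved (no mutation), whereas columns 1 to 4 change residue but keep the same property class.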

  9. Scalable Kernel Methods and Algorithms for General Sequence Analysis

    Science.gov (United States)

    Kuksa, Pavel

    2011-01-01

    Analysis of large-scale sequential data has become an important task in machine learning and pattern recognition, inspired in part by numerous scientific and technological applications such as the document and text classification or the analysis of biological sequences. However, current computational methods for sequence comparison still lack…

  10. Recurrence plot analysis of DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Wu Zuobing [State Key Laboratory of Nonlinear Mechanics, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100080 (China)]. E-mail: wuzb@lnm.imech.ac.cn

    2004-11-15

    A recurrence plot technique for DNA sequences is established on a metric representation and employed to analyze the correlation structure of nucleotide strings. It is found that, in the transference of nucleotide strings, a human DNA fragment has a major correlation distance, whereas the correlation distance of a yeast chromosome increases constantly.

  11. Graph-based clustering and characterization of repetitive sequences in next-generation sequencing data

    Czech Academy of Sciences Publication Activity Database

    Novák, Petr; Neumann, Pavel; Macas, Jiří

    2010-01-01

    Roč. 11, č. 1 (2010), s. 378-389 ISSN 1471-2105 R&D Projects: GA MŠk(CZ) OC10037; GA MŠk(CZ) LC06004 Institutional research plan: CEZ:AV0Z50510513 Keywords : repetitive DNA * plant genome * next generation sequencing Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 3.028, year: 2010

  12. Analysis of Neuronal Sequences Using Pairwise Biases

    Science.gov (United States)

    2015-08-27

    semantic memory (knowledge of facts) and implicit memory (e.g., how to ride a bike ). Evidence for the participation of the hippocampus in the formation of...hippocampal formation in an attempt to be cured of severe epileptic seizures. Although the surgery was successful in regards to reducing the frequency and...very different from each other in many ways including duration and number of spikes. Still, these sequences share a similar trend in the general order

  13. Characterization of a desert soil sequence at Yucca Mountain, NV

    International Nuclear Information System (INIS)

    Guertal, W.R.; Hofmann, L.L.; Hudson, D.B.; Flint, A.L.

    1994-01-01

    Yucca Mountain, Nevada, is currently being evaluated as a potential site for a geologic repository for high level radioactive waste. Hydrologic evaluation of the unsaturated zone of Yucca Mountain is being conducted as an integrated set of surface and subsurface-based activities with a common objective to characterize the temporal and spatial distribution of water flux through the potential repository. Yucca Mountain is covered with a thin to thick layer of colluvial/alluvial materials wherever there are no bedrock outcrops. It is across this surface boundary that all infiltration and all exfiltration occurs. This surface boundary affects water movement through the unsaturated zone. Characterization of the hydrologic properties of surficial materials is then a necessary step for short term characterization goals and for long term modeling.

  14. In silico characterization of boron transporter (BOR1) protein sequences in Poaceae species

    Directory of Open Access Journals (Sweden)

    Ertuğrul Filiz

    2013-01-01

    Full Text Available Boron (B) is essential for plant growth and development, and its primary function is connected with formation of the cell wall. Moreover, boron toxicity is a shared problem in semiarid and arid regions. In this study, boron transporter protein (BOR1) sequences from some Poaceae species (Hordeum vulgare subsp. vulgare, Zea mays, Brachypodium distachyon, Oryza sativa subsp. japonica, Oryza sativa subsp. indica, Sorghum bicolor, Triticum aestivum) were evaluated by bioinformatics tools. Physicochemical analyses revealed that most BOR1 proteins were basic in character and generally had aliphatic amino acids. Analysis of the domains showed that transmembrane domains were identified constantly and three motifs of 50 amino acids in length were detected. Also, the motif SPNPWEPGSYDHWTVAKDMFNVPPAYIFGAFIPATMVAGLYYFDHSVASQ was found most frequently, with 25 repeats. The phylogenetic tree showed divergence into two main clusters. B. distachyon species were clustered separately. Finally, this study contributes to the new BOR1 protein characterization in grasses and creates a scientific base for in silico analysis in the future.

  15. Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing

    Directory of Open Access Journals (Sweden)

    Wadim L. Matochko

    2013-01-01

    Full Text Available Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low abundant clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical step in this process because these techniques have error rates >1%. Mechanisms that yield errors in Illumina and other techniques have been proposed, but no reports to date describe error analysis in phage libraries. Our paper focuses on error analysis of 7-mer peptide libraries sequenced by the Illumina method. The low theoretical complexity of this phage library, as compared to the complexity of long genetic reads and genomes, allowed us to describe this library using a convenient linear vector and operator framework. We describe a phage library as an N×1 frequency vector n = (n_i), where n_i is the copy number of the ith sequence and N is the theoretical diversity, that is, the total number of all possible sequences. Any manipulation of the library is an operator acting on n. Selection, amplification, or sequencing could be described as a product of an N×N matrix and a stochastic sampling operator (Sa). The latter is a random diagonal matrix that describes sampling of a library. In this paper, we focus on the properties of Sa and use them to define the sequencing operator (Seq). Sequencing without any bias and errors is Seq = Sa I_N, where I_N is an N×N unity matrix. Any bias in sequencing changes I_N to a nonunity matrix. We identified a diagonal censorship matrix (CEN), which describes elimination, or statistically significant downsampling, of specific reads during the sequencing process.
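
    The vector and operator description above can be made concrete with a small numerical sketch. It is an illustration under stated assumptions, not the authors' implementation: the sampling model (each copy kept independently with probability p), the function names sampling_operator and apply_diagonal, and the toy copy numbers and censorship values are all hypothetical. Diagonal matrices are stored as lists holding their diagonal entries, and the censored case is read here as replacing the unity matrix by CEN.

```python
import random
from typing import List


def sampling_operator(n: List[int], p: float, rng: random.Random) -> List[float]:
    """Diagonal of a random sampling matrix Sa (assumed model: each copy
    of a clone is kept independently with probability p)."""
    diag = []
    for copies in n:
        kept = sum(1 for _ in range(copies) if rng.random() < p)
        diag.append(kept / copies if copies else 0.0)
    return diag


def apply_diagonal(diag: List[float], vec: List[float]) -> List[float]:
    """Multiply a diagonal matrix (given by its diagonal) by a vector."""
    return [d * v for d, v in zip(diag, vec)]


rng = random.Random(0)
n = [1000, 200, 50, 0, 5]          # toy copy numbers, N = 5 possible sequences
identity = [1.0] * len(n)          # diagonal of the N x N unity matrix
cen = [1.0, 1.0, 0.0, 1.0, 0.2]    # hypothetical censorship diagonal

sa = sampling_operator(n, p=0.1, rng=rng)
unbiased = apply_diagonal(sa, apply_diagonal(identity, n))  # Sa times unity matrix
censored = apply_diagonal(sa, apply_diagonal(cen, n))       # unity replaced by CEN
print(unbiased)
print(censored)
```

    In this toy setup the third sequence is fully censored (diagonal entry 0), so it never appears in the censored read counts regardless of how the sampling falls out.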

  16. Characterizing Aftershock Sequences of the Recent Strong Earthquakes in Central Italy

    Science.gov (United States)

    Kossobokov, Vladimir G.; Nekrasova, Anastasia K.

    2017-10-01

    The recent strong earthquakes in Central Italy allow for a comparative analysis of their aftershocks from the viewpoint of the Unified Scaling Law for Earthquakes, USLE, which generalizes the Gutenberg-Richter relationship making use of naturally fractal distribution of earthquake sources of different size in a seismic region. In particular, we consider aftershocks as a sequence of avalanches in self-organized system of blocks-and-faults of the Earth lithosphere, each aftershock series characterized with the distribution of the USLE control parameter, η. We found the existence, in a long-term, of different, intermittent levels of rather steady seismic activity characterized with a near constant value of η, which switch, in mid-term, at times of transition associated with catastrophic events. On such a transition, seismic activity may follow different scenarios with inter-event time scaling of different kind, including constant, logarithmic, power law, exponential rise/decay or a mixture of those as observed in the case of the ongoing one associated with the three strong earthquakes in 2016. Evidently, our results do not support the presence of universality of seismic energy release, while providing constraints on modelling seismic sequences for earthquake physicists and supplying decision makers with information for improving local seismic hazard assessments.

  17. Characterization of methicillin-resistant Staphylococcus aureus Sequence Type 398

    DEFF Research Database (Denmark)

    Christiansen, Mette Theilgaard

    Staphylococcus aureus is an opportunistic pathogen that colonizes the nares and skin surfaces of several animal species, including man. S. aureus can cause a wide variety of infections ranging from superficial soft tissue and skin infections to severe and deadly systemic infections. Traditionally S....... aureus and methicillin-resistant Staphylococcus aureus (MRSA) have been associated with hospitals, but during the past decades MRSA has emerged in the community and now a new branch of MRSA has been found in association with livestock (LA-MRSA). A specific lineage (multilocus sequence type 398 (ST398...

  18. MultiLocus Sequence Analysis- and Amplified Fragment Length Polymorphism-based characterization of xanthomonads associated with bacterial spot of tomato and pepper and their relatedness to Xanthomonas species.

    Science.gov (United States)

    Hamza, A A; Robene-Soustrade, I; Jouen, E; Lefeuvre, P; Chiroleu, F; Fisher-Le Saux, M; Gagnevin, L; Pruvost, O

    2012-05-01

    MultiLocus Sequence Analysis (MLSA) and Amplified Fragment Length Polymorphism (AFLP) were used to measure the genetic relatedness of a comprehensive collection of xanthomonads pathogenic to solanaceous hosts to Xanthomonas species. The MLSA scheme was based on partial sequences of four housekeeping genes (atpD, dnaK, efp and gyrB). Globally, MLSA data unambiguously identified strains causing bacterial spot of tomato and pepper at the species level and was consistent with AFLP data. Genetic distances derived from both techniques showed a close relatedness of (i) X. euvesicatoria, X. perforans and X. alfalfae and (ii) X. gardneri and X. cynarae. Maximum likelihood tree topologies derived from each gene portion and the concatenated data set for species in the X. campestris 16S rRNA core (i.e. the species cluster comprising all strains causing bacterial spot of tomato and pepper) were not congruent, consistent with the detection of several putative recombination events in our data sets by several recombination search algorithms. One recombinant region in atpD was identified in most strains of X. euvesicatoria including the type strain. Copyright © 2012 Elsevier GmbH. All rights reserved.

  19. Cloning and sequence analysis of hyaluronoglucosaminidase (nagH) gene of Clostridium chauvoei

    Directory of Open Access Journals (Sweden)

    Saroj K. Dangi

    2017-09-01

    Full Text Available Aim: Blackleg disease is caused by Clostridium chauvoei in ruminants. Although virulence factors such as C. chauvoei toxin A, sialidase, and flagellin are well characterized, hyaluronidases of C. chauvoei are not characterized. The present study was aimed at cloning and sequence analysis of the hyaluronoglucosaminidase (nagH) gene of C. chauvoei. Materials and Methods: C. chauvoei strain ATCC 10092 was grown in ATCC 2107 media and confirmed by polymerase chain reaction (PCR) using primers specific for the 16-23S rDNA spacer region. The nagH gene of C. chauvoei was amplified and cloned into the pRham-SUMO vector and transformed into Escherichia cloni 10G cells. The construct was then transformed into E. cloni cells. Colony PCR was carried out to screen the colonies, followed by sequencing of the nagH gene in the construct. Results: PCR amplification yielded a nagH gene product of 1143 bp, which was cloned in a prokaryotic expression system. Colony PCR, as well as sequencing of the nagH gene, confirmed the presence of the insert. The sequence was then subjected to BLAST analysis at NCBI, which confirmed that the sequence was indeed that of the nagH gene of C. chauvoei. Phylogenetic analysis of the sequence showed that it is closely related to Clostridium perfringens and Clostridium paraputrificum. Conclusion: The gene for virulence factor nagH was cloned into a prokaryotic expression vector and confirmed by sequencing.

  20. Cloning and sequence analysis of benzo-a-pyreneinducible ...

    African Journals Online (AJOL)

    The phylogenetic tree based on the amino acid sequences clearly shows tilapia CYP1A and killifish CYP1A to be more closely related to each other than to the other CYP1A subfamilies. Sequence analysis of 3727 bp of genomic DNA showed that the clone obtained was the structural gene of CYP1A which consists of ...

  1. Biological sequence analysis: probabilistic models of proteins and nucleic acids

    National Research Council Canada - National Science Library

    Durbin, Richard

    1998-01-01

    ... analysis methods are now based on principles of probabilistic modelling. Examples of such methods include the use of probabilistically derived score matrices to determine the significance of sequence alignments, the use of hidden Markov models as the basis for profile searches to identify distant members of sequence families, and the inference...

  2. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences

    DEFF Research Database (Denmark)

    Svitashev, S.; Bryngelsson, T.; Vershinin, A.

    1994-01-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences...

  3. Parametric inference for biological sequence analysis.

    Science.gov (United States)

    Pachter, Lior; Sturmfels, Bernd

    2004-11-16

    One of the major successes in computational biology has been the unification, by using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.
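
    For a hidden Markov model, the sum-product algorithm mentioned above reduces to the familiar forward recursion, which sums the probability of an observed sequence over all hidden state paths. The sketch below shows that special case only; the two states and all probabilities are toy values assumed for illustration, and the polytope propagation algorithm introduced in the article is not implemented here.

```python
from typing import Dict

# Toy two-state model; state names and all numbers are assumptions.
states = ["exon", "intron"]
start: Dict[str, float] = {"exon": 0.5, "intron": 0.5}
trans = {"exon": {"exon": 0.9, "intron": 0.1},
         "intron": {"exon": 0.2, "intron": 0.8}}
emit = {"exon": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
        "intron": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}


def forward(obs: str) -> float:
    """Probability of the observed sequence, summed over all state paths."""
    # Initialisation: start probability times emission of the first symbol.
    f = {s: start[s] * emit[s][obs[0]] for s in states}
    # Recursion: sum over predecessor states, then multiply by the emission.
    for symbol in obs[1:]:
        f = {t: emit[t][symbol] * sum(f[s] * trans[s][t] for s in states)
             for t in states}
    return sum(f.values())


print(forward("ACGT"))
```

    A quick sanity check on such a sketch is that the probabilities of all sequences of a fixed length add up to 1.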

  4. Recurrence time statistics: versatile tools for genomic DNA sequence analysis.

    Science.gov (United States)

    Cao, Yinhe; Tung, Wen-Wen; Gao, J B

    2004-01-01

    With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.
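
    One simplified reading of a recurrence time statistic is the distance between consecutive occurrences of the same k-letter word; peaks in the histogram of these distances then point to periodicity or repeats. The sketch below follows that reading only. The function name recurrence_times, the word length and the toy sequence are assumptions, and neither the paper's exact definition nor its codon index is reproduced.

```python
from collections import Counter
from typing import Dict


def recurrence_times(seq: str, k: int) -> Counter:
    """Histogram of distances between consecutive occurrences of each k-mer."""
    last_seen: Dict[str, int] = {}
    hist: Counter = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in last_seen:
            hist[i - last_seen[word]] += 1   # distance back to previous hit
        last_seen[word] = i
    return hist


toy = "ATGATGATGCCATG"
hist = recurrence_times(toy, k=3)
print(sorted(hist.items()))  # -> [(3, 4), (5, 1)]
```

    The dominant distance of 3 in the toy sequence reflects its ATG tandem repeat; a single pass with a dictionary keeps the cost linear in the sequence length.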

  5. Molecular characterization, tissue expression and sequence variability of the barramundi (Lates calcarifer) myostatin gene

    Directory of Open Access Journals (Sweden)

    Smith-Keune Carolyn

    2008-02-01

    Full Text Available Abstract Background Myostatin (MSTN) is a member of the transforming growth factor-β superfamily that negatively regulates growth of skeletal muscle tissue. The gene encoding the MSTN peptide is a consolidated candidate for the enhancement of productivity in terrestrial livestock. This gene potentially represents an important target for growth improvement of cultured finfish. Results Here we report molecular characterization, tissue expression and sequence variability of the barramundi (Lates calcarifer) MSTN-1 gene. The barramundi MSTN-1 was encoded by three exons of 379, 371 and 381 bp in length and translated into a 376-amino acid peptide. Introns 1 and 2 were 412 and 819 bp in length and presented typical GT...AG splicing sites. The upstream region contained cis-regulatory elements such as TATA-box and E-boxes. A first assessment of sequence variability suggested that higher mutation rates are found in the 5' flanking region, with several SNPs present in this species. A putative micro RNA target site has also been observed in the 3'UTR (untranslated region) and is highly conserved across teleost fish. The deduced amino acid sequence was conserved across vertebrates and exhibited characteristic conserved putative functional residues including a cleavage motif of proteolysis (RXXR), nine cysteines and two glycosylation sites. A qualitative analysis of the barramundi MSTN-1 expression pattern revealed that, in adult fish, transcripts are differentially expressed in various tissues other than skeletal muscles including gill, heart, kidney, intestine, liver, spleen, eye, gonad and brain. Conclusion Our findings provide valuable insights such as sequence variation and genomic information which will aid the further investigation of the barramundi MSTN-1 gene in association with growth. The finding for the first time in finfish MSTN of a miRNA target site in the 3'UTR provides an opportunity for the identification of regulatory mutations on the
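
    The exon and peptide lengths reported in this record can be checked with simple arithmetic, and an RXXR cleavage motif can be located with a short pattern search. The sketch below assumes that the three reported exon lengths cover coding sequence only and that the final codon is a stop codon; the peptide fragment is a made-up string used purely to demonstrate the search and is not the barramundi sequence.

```python
import re

exons_bp = [379, 371, 381]          # exon lengths reported in the abstract
coding_bp = sum(exons_bp)           # assumes these lengths are coding sequence only
codons = coding_bp // 3
peptide_len = codons - 1            # last codon assumed to be the stop codon
print(coding_bp, coding_bp % 3, peptide_len)  # -> 1131 0 376

# Hypothetical peptide fragment (NOT the barramundi sequence), used only to
# show how an RXXR cleavage motif can be located. The lookahead allows
# overlapping matches to be reported.
fragment = "MHLSQIVLYLSLLIRSRRDAGCC"
sites = [m.start() for m in re.finditer(r"(?=R..R)", fragment)]
print(sites)  # -> [14]
```

    Under these assumptions the 1131 bp of exon sequence correspond to 377 codons, which agrees with the 376-amino acid peptide plus one stop codon.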

  6. RESEARCH NOTE Genome-based exome-sequencing analysis ...

    Indian Academy of Sciences (India)

    Navya

    2017-02-22

    Feb 22, 2017 ... Genome-based exome-sequencing analysis identifies GYG1, DIS3L, DDRGK1 genes ... Cardiology Division, Department of Internal Medicine, Severance .... with p values of <0.05 by analyzing differences in allele distribution.

  7. Editorial: Special Issue on Algorithms for Sequence Analysis and Storage

    Directory of Open Access Journals (Sweden)

    Veli Mäkinen

    2014-03-01

    Full Text Available This special issue of Algorithms is dedicated to approaches to biological sequence analysis that have algorithmic novelty and potential for fundamental impact in methods used for genome research.

  8. Tools for integrated sequence-structure analysis with UCSF Chimera

    Directory of Open Access Journals (Sweden)

    Huang Conrad C

    2006-07-01

    Full Text Available Abstract Background Comparing related structures and viewing the structures in the context of sequence alignments are important tasks in protein structure-function research. While many programs exist for individual aspects of such work, there is a need for interactive visualization tools that: (a) provide a deep integration of sequence and structure, far beyond mapping where a sequence region falls in the structure and vice versa; (b) facilitate changing data of one type based on the other (for example, using only sequence-conserved residues to match structures, or adjusting a sequence alignment based on spatial fit); (c) can be used with a researcher's own data, including arbitrary sequence alignments and annotations, closely or distantly related sets of proteins, etc.; and (d) interoperate with each other and with a full complement of molecular graphics features. We describe enhancements to UCSF Chimera to achieve these goals. Results The molecular graphics program UCSF Chimera includes a suite of tools for interactive analyses of sequences and structures. Structures automatically associate with sequences in imported alignments, allowing many kinds of crosstalk. A novel method is provided to superimpose structures in the absence of a pre-existing sequence alignment. The method uses both sequence and secondary structure, and can match even structures with very low sequence identity. Another tool constructs structure-based sequence alignments from superpositions of two or more proteins. Chimera is designed to be extensible, and mechanisms for incorporating user-specific data without Chimera code development are also provided. Conclusion The tools described here apply to many problems involving comparison and analysis of protein structures and their sequences. Chimera includes complete documentation and is intended for use by a wide range of scientists, not just those in the computational disciplines. UCSF Chimera is free for non-commercial use and is

  9. Characterization and phylogenetic analysis of α-gliadin gene ...

    Indian Academy of Sciences (India)

    Supplementary data: Characterization and phylogenetic analysis of α-gliadin gene sequences reveals significant genomic divergence in Triticeae species. Guang-Rong Li, Tao Lang, En-Nian Yang, Cheng Liu ... The MITE insertion at the 3' UTR is boxed. Figure 2. The secondary structure of MITE insertion in HM452949.

  10. Molecular characterization and diversity analysis in chilli pepper ...

    African Journals Online (AJOL)

    India is considered to be the secondary center of diversity of chilli pepper, especially of Capsicum annuum. Simple sequence repeats (SSRs) are the most widely used marker system for plant variety characterization and diversity analysis especially in cultivated species which have low levels of polymorphism. The diversity ...

  11. Molecular characterization and expression analysis of fat mass and ...

    Indian Academy of Sciences (India)

    Keywords. fat mass and obesity-associated gene (FTO); rabbit; mRNA expression patterns; sequence analysis; Oryctolagus cuniculus. ... In this work, the molecular characterization and expression features of rabbit (Oryctolagus cuniculus) FTO cDNA were analysed. The rabbit FTO cDNA with a size of 2158 bp was cloned, ...

  12. The DNA sequence, annotation and analysis of human chromosome 3

    DEFF Research Database (Denmark)

    Muzny, D.M.; Bolund, Lars (as part of the Chinese Human Genome Sequencing Consortium); et al.

    2006-01-01

    as numerous loci involved in multiple human cancers such as the gene encoding FHIT, which contains the most common constitutive fragile site in the genome, FRA3B. Using genomic sequence from chimpanzee and rhesus macaque, we were able to characterize the breakpoints defining a large pericentric inversion...

  13. Characterization of genomic sequence of a drought-resistant gene ...

    Indian Academy of Sciences (India)

    to study the genomics of polyploid plants, as most pro- genitors have been ... had been shown to constitute significant stress in pilot exper- iments. Untreated ... Southern blotting, real-time quantitative PCR and total soluble sugar analysis.

  14. Isolation and characterization of gene sequences expressed in cotton fiber

    Directory of Open Access Journals (Sweden)

    Taciana de Carvalho Coutinho

    2016-06-01

    Full Text Available ABSTRACT Cotton fibers are tubular cells which develop from the differentiation of the ovule epidermis. In addition to being one of the most important natural fibers of the textile group, cotton fibers afford an excellent experimental system for studying the cell wall. The aim of this work was to isolate and characterise the genes expressed in cotton fiber (Gossypium hirsutum L.) to be used in future work in cotton breeding. Fibers of the cotton cultivar CNPA ITA 90 II were used to extract RNA for the subsequent generation of a cDNA library. Seventeen sequences were obtained, of which 14 were already described in the NCBI database (National Centre for Biotechnology Information), such as those encoding the lipid transfer proteins (LTPs) and arabinogalactans (AGP). However, other cDNAs such as the B05 clone, which displays homology with the glycosyltransferases, have still not been described for this crop. Nevertheless, results showed that several clones obtained in this study are associated with cell wall proteins, wall-modifying enzymes and lipid transfer proteins directly involved in fiber development.

  15. Sequencing and Analysis of Neanderthal Genomic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo, Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2006-06-13

    Recovery and analysis of multiple Neanderthal autosomal sequences using a metagenomic approach reveals that modern humans and Neanderthals split ~400,000 years ago, without significant evidence of subsequent admixture.

  16. SVAMP: Sequence variation analysis, maps and phylogeny

    KAUST Repository

    Naeem, Raeece; Hidayah, Lailatul; Preston, Mark D.; Clark, Taane G.; Pain, Arnab

    2014-01-01

    Summary: SVAMP is a stand-alone desktop application to visualize genomic variants (in variant call format) in the context of geographical metadata. Users of SVAMP are able to generate phylogenetic trees and perform principal coordinate analysis

  17. DSAP: deep-sequencing small RNA analysis pipeline.

    Science.gov (United States)

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
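
    The first two steps listed above (cleanup and clustering) can be sketched in a few lines. This is only an illustration of the idea, not DSAP's actual code: the adaptor string, the function names and the rule of discarding pure homopolymer reads are all assumptions made for the example. The input follows the tab-delimited "sequence, copy number" layout the abstract describes.

```python
import re
from collections import Counter

# Hypothetical 3' adaptor fragment, chosen only for illustration.
ADAPTOR = "TCGTATGCCG"


def clean_tag(seq, adaptor=ADAPTOR):
    """Trim everything from the adaptor onward and reject homopolymer reads."""
    idx = seq.find(adaptor)
    if idx != -1:
        seq = seq[:idx]
    # Assumption: a read made of one repeated nucleotide (poly-A/T/C/G/N) is dropped.
    if not seq or re.fullmatch(r"A+|C+|G+|T+|N+", seq):
        return None
    return seq


def cluster_tags(lines):
    """Parse 'sequence<TAB>count' lines and sum the counts per cleaned sequence."""
    clusters = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        seq, count = line.split("\t")
        cleaned = clean_tag(seq.upper())
        if cleaned is not None:
            clusters[cleaned] += int(count)
    return clusters


example = [
    "TGAGGTAGTAGGTTGTATAGTTTCGTATGCCG\t120",  # tag followed by adaptor
    "TGAGGTAGTAGGTTGTATAGTT\t30",             # same tag, already trimmed
    "AAAAAAAAAAAAAAAAAAAA\t5",                # poly-A read, discarded
]
clusters = cluster_tags(example)
```

    After cleanup, the first two reads collapse into a single cluster whose count is the sum of their copy numbers. The ncRNA and miRNA matching steps would then operate on these clusters.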

  18. Human liver cell trafficking mutants: characterization and whole exome sequencing.

    Directory of Open Access Journals (Sweden)

    Fei Yuan

    Full Text Available The HuH7 liver cell mutant Trf1 is defective in membrane trafficking and is complemented by the casein kinase 2α subunit CK2α''. Here we identify characteristic morphologies, trafficking and mutational changes in six additional HuH7 mutants Trf2-Trf7. Trf1 cells were previously shown to be severely defective in gap junction functions. Using a Lucifer yellow transfer assay, remarkable attenuation of gap junction communication was revealed in each of the mutants Trf2-Trf7. Electron microscopy and light microscopy of thiamine pyrophosphatase showed that several mutants exhibited fragmented Golgi apparatus cisternae compared to parental HuH7 cells. Intracellular trafficking was investigated using assays of transferrin endocytosis and recycling and VSV G secretion. Surface binding of transferrin was reduced in all six Trf2-Trf7 mutants, which generally correlated with the degree of reduced expression of the transferrin receptor at the cell surface. The mutants displayed the same transferrin influx rates as HuH7, and for efflux rate, only Trf6 differed, having a slower transferrin efflux rate than HuH7. The kinetics of VSV G transport along the exocytic pathway were altered in Trf2 and Trf5 mutants. Genetic changes unique to particular Trf mutants were identified by exome sequencing, and one was investigated in depth. The novel mutation Ile34Phe in the GTPase RAB22A was identified in Trf4. RNA interference knockdown of RAB22A or overexpression of RAB22AI34F in HuH7 cells caused phenotypic changes characteristic of the Trf4 mutant. In addition, the Ile34Phe mutation reduced both guanine nucleotide binding and hydrolysis activities of RAB22A. Thus, the RAB22A Ile34Phe mutation appears to contribute to the Trf4 mutant phenotype.

  19. Probabilistic topic modeling for the analysis and classification of genomic sequences

    Science.gov (United States)

    2015-01-01

    Background: Studies on genomic sequences for classification and taxonomic identification have a leading role in the biomedical field and in the analysis of biodiversity. These studies focus on the so-called barcode genes, representing a well-defined region of the whole genome. Recently, alignment-free techniques have been gaining importance because they are able to overcome the drawbacks of sequence alignment techniques. In this paper a new alignment-free method for DNA sequence clustering and classification is proposed. The method is based on k-mer representation and text mining techniques. Methods: The presented method is based on Probabilistic Topic Modeling, a statistical technique originally proposed for text documents. Probabilistic topic models are able to find in a document corpus the topics (recurrent themes) characterizing classes of documents. This technique, applied to DNA sequences representing the documents, exploits the frequency of fixed-length k-mers and builds a generative model for a training group of sequences. This generative model, obtained through the Latent Dirichlet Allocation (LDA) algorithm, is then used to classify a large set of genomic sequences. Results and conclusions: We performed classification of over 7000 16S DNA barcode sequences taken from the Ribosomal Database Project (RDP) repository, training probabilistic topic models. The proposed method is compared to the RDP tool and the Support Vector Machine (SVM) classification algorithm in an extensive set of trials using both complete sequences and short sequence snippets (from 400 bp to 25 bp). Our method reaches very similar results to the RDP classifier and SVM for complete sequences. The most interesting results are obtained when short sequence snippets are considered. In these conditions the proposed method outperforms RDP and SVM with ultra-short sequences and it exhibits a smooth decrease of performance, at every taxonomic level, when the sequence length is decreased.
    PMID:25916734
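
    The k-mer representation that the abstract treats as a "document" can be illustrated as follows. This is a minimal sketch under stated assumptions: the function names and the choice of k are hypothetical, and the topic-model (LDA) training itself is not shown. Only the conversion of a sequence into fixed-length overlapping words and their frequencies is covered.

```python
from collections import Counter


def kmer_document(sequence, k=8):
    """Return the overlapping k-mers of a DNA sequence as a list of 'words'."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmer_counts(sequence, k=8):
    """Bag-of-words view of a sequence: k-mer -> number of occurrences."""
    return Counter(kmer_document(sequence, k))


# Toy example with k=4; real barcode sequences are hundreds of bases long.
words = kmer_document("ACGTACGT", k=4)
counts = kmer_counts("ACGTACGT", k=4)
```

    A collection of such count vectors, one per sequence, is the kind of corpus a topic-modeling library would take as input.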

  20. Sequence characterization of cotton leaf curl virus from Rajasthan: phylogenetic relationship with other members of geminiviruses and detection of recombination.

    Science.gov (United States)

    Kumar, A; Kumar, J; Khan, J A

    2010-04-01

    Diseased cotton plants showing typical leaf curl symptoms were collected from an experimental plot of the Agriculture Research Station-Sriganganagar, Rajasthan. The complete DNA-A components from samples taken from two areas were amplified through rolling circle amplification (RCA) using the TempliPhi kit (GE Healthcare) and characterized. The DNA-A of one isolate consists of 2751 nucleotides and that of the second isolate of 2759 nucleotides. Both sequences comprised six ORFs. The genome organization of the DNA-A of one isolate shows high sequence similarity with other characterized local begomovirus isolates of Rajasthan, while the other isolate shows high sequence similarity with CLCuV reported from Pakistan. The first isolate, CLCuV-SG01, shows the highest sequence identity with Cotton leaf curl Abohar (Rajasthan) virus, and the second isolate, CLCuV-SG02, shows the highest sequence identity with cotton leaf curl virus from Pakistan. The two isolates showed 85% similarity with each other. The sequence data revealed probable infiltration of some strains of Cotton leaf curl virus from Pakistan to India, or co-existence of different isolates under similar geographical conditions. While CLCuV-SG01 shows the highest nt sequence similarity with CLCuV Rajasthan (Abohar), the V1 ORF (encoding coat protein) of SG01 shows the highest nt identity (100%) with CLCuV Multan (Bhatinda) and Abohar virus, while the AC1 region also showed differences. The complete nucleotide sequence of SG01 shows only 86% similarity with CLCuV Multan virus. Similarity search revealed significant differences in the AV1 and AC1 regions with respect to DNA-A, suggesting an evolutionary history of recombination. Computer-based analysis with the Recombination Detection Program (RDP) supports the recombination hypothesis, indicating that recombination with other begomoviruses had taken place within the V1 ORF and AC1 ORF of CLCuV-SG01, the AC1 ORF of CLCuV-SG02, and also in the noncoding intergenic region (IR).

  1. Expressed sequence tags as a tool for phylogenetic analysis of placental mammal evolution.

    Directory of Open Access Journals (Sweden)

    Morgan Kullberg

    Full Text Available BACKGROUND: We investigate the usefulness of expressed sequence tags, ESTs, for establishing divergences within the tree of placental mammals. This is done on the example of the established relationships among primates (human), lagomorphs (rabbit), rodents (rat and mouse), artiodactyls (cow), carnivorans (dog) and proboscideans (elephant). METHODOLOGY/PRINCIPAL FINDINGS: We have produced 2000 ESTs (1.2 mega bases) from a marsupial mouse and characterized the data for their use in phylogenetic analysis. The sequences were used to identify putative orthologous sequences from whole genome projects. Although most ESTs stem from single sequence reads, the frequency of potential sequencing errors was found to be lower than allelic variation. Most of the sequences represented slowly evolving housekeeping-type genes, with an average amino acid distance of 6.6% between human and mouse. Positive Darwinian selection was identified at only a few single sites. Phylogenetic analyses of the EST data yielded trees that were consistent with those established from whole genome projects. CONCLUSIONS: The general quality of EST sequences and the general absence of positive selection in these sequences make ESTs an attractive tool for phylogenetic analysis. The EST approach allows, at reasonable costs, a fast extension of data sampling from species outside the genome projects.

  2. Purification and sequence characterization of chondroitin sulfate and dermatan sulfate from fishes.

    Science.gov (United States)

    Lin, Na; Mo, Xiaoli; Yang, Yang; Zhang, Hong

    2017-04-01

    Chondroitin sulfate (CS) and dermatan sulfate (DS) were extracted and purified from skins or bones of salmon (Salmo salar), snakehead (Channa argus), monkfish (Lophius litulon) and skipjack tuna (Katsuwonus pelamis). Size, structural sequences and sulfate groups of oligosaccharides in the purified CS and DS could be characterized and identified using high performance liquid chromatography (HPLC) combined with Orbitrap mass spectrometry. CS and DS chain structure varies depending on origin, but motif structure appears consistent. Structures of CS and DS oligosaccharides with different sizes and sulfate groups were compared between fishes and other animals, and results showed that some minor differences of special structures could be identified by hydrophilic interaction chromatography-liquid chromatography-Fourier transform-mass/mass spectrometry (HILIC-LC-FT-MS/MS). For example, data showed that salmon and skipjack CS had a higher percentage content of high-level sulfated oligosaccharides than porcine CS. In addition, structural information of CS and DS of different origins was analyzed by principal component analysis (PCA) and results showed that CS and DS samples could be differentiated according to their molecular conformation and oligosaccharide fragment information. Understanding CS and DS structure derived from different origins may lead to the production of CS or DS with unique disaccharide or oligosaccharide sequence composition and biological functions.

  3. Sequencing and analysis of the Mediterranean amphioxus (Branchiostoma lanceolatum) transcriptome.

    Directory of Open Access Journals (Sweden)

    Silvan Oulion

    Full Text Available BACKGROUND: The basally divergent phylogenetic position of amphioxus (Cephalochordata), as well as its conserved morphology, development and genetics, make it the best proxy for the chordate ancestor. Particularly, studies using the amphioxus model help our understanding of vertebrate evolution and development. Thus, interest for the amphioxus model led to the characterization of both the transcriptome and complete genome sequence of the American species, Branchiostoma floridae. However, recent technical improvements allowing induction of spawning in the laboratory during the breeding season on a daily basis with the Mediterranean species Branchiostoma lanceolatum have encouraged European Evo-Devo researchers to adopt this species as a model even though no genomic or transcriptomic data have been available. To fill this need we used the pyrosequencing method to characterize the B. lanceolatum transcriptome and then compared our results with the published transcriptome of B. floridae. RESULTS: Starting with total RNA from nine different developmental stages of B. lanceolatum, a normalized cDNA library was constructed and sequenced on Roche GS FLX (Titanium mode). Around 1.4 million reads were produced and assembled into 70,530 contigs (average length of 490 bp). Overall 37% of the assembled sequences were annotated by BlastX and their Gene Ontology terms were determined. These results were then compared to genomic and transcriptomic data of B. floridae to assess similarities and specificities of each species. CONCLUSION: We obtained a high-quality amphioxus (B. lanceolatum) reference transcriptome using a high throughput sequencing approach. We found that 83% of the predicted genes in the B. floridae complete genome sequence are also found in the B. lanceolatum transcriptome, while only 41% were found in the B. floridae transcriptome obtained with traditional Sanger based sequencing. Therefore, given the high degree of sequence conservation

  4. Phylogenetic characterization of Canine Parvovirus VP2 partial sequences from symptomatic dogs samples.

    Science.gov (United States)

    Zienius, D; Lelešius, R; Kavaliauskis, H; Stankevičius, A; Šalomskas, A

    2016-01-01

    The aim of the present study was to detect canine parvovirus (CPV) from faecal samples of clinically ill domestic dogs by polymerase chain reaction (PCR) followed by VP2 gene partial sequencing and molecular characterization of circulating strains in Lithuania. Eleven clinically and antigen-tested positive dog faecal samples, collected during the period of 2014-2015, were investigated by using PCR. The phylogenetic investigations indicated that the Lithuanian CPV VP2 partial sequences (3025-3706 cds) were closely related and showed 99.0-99.9% identity. All Lithuanian sequences were associated with one phylogroup, but grouped in different clusters. Ten of investigated Lithuanian CPV VP2 sequences were closely associated with CPV 2a antigenic variant (99.4% nt identity). Five CPV VP2 sequences from Lithuania were related to CPV-2a, but were rather divergent (6.8 nt differences). Only one CPV VP2 sequence from Lithuania was associated (99.3% nt identity) with CPV-2b VP2 sequences from France, Italy, USA and Korea. The four of eleven investigated Lithuanian dogs with CPV infection symptoms were vaccinated with CPV-2 vaccine, but their VP2 sequences were phylogenetically distantly associated with CPV vaccine strains VP2 sequences (11.5-15.8 nt differences). Ten Lithuanian CPV VP2 sequences had monophyletic relations among the close geographically associated samples, but five of them were rather divergent (1.0% less sequence similarity). The one Lithuanian CPV VP2 sequence was closely related with CPV-2b antigenic variant. All the Lithuanian CPV VP2 partial sequences were conservative and phylogenetically low associated with most commonly used CPV vaccine strains.

  5. Nonlinear analysis of river flow time sequences

    Science.gov (United States)

    Porporato, Amilcare; Ridolfi, Luca

    1997-06-01

    Within the field of chaos theory several methods for the analysis of complex dynamical systems have recently been proposed. In light of these ideas we study the dynamics which control the behavior over time of river flow, investigating the existence of a low-dimension deterministic component. The present article follows the research undertaken in the work of Porporato and Ridolfi [1996a] in which some clues as to the existence of chaos were collected. Particular emphasis is given here to the problem of noise and to nonlinear prediction. With regard to the latter, the benefits obtainable by means of the interpolation of the available time series are reported and the remarkable predictive results attained with this nonlinear method are shown.
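
    The abstract does not say which nonlinear prediction scheme was used. One common approach for time series of this kind is local nearest-neighbour prediction in a time-delay embedding, sketched below purely as an illustration. The embedding dimension, delay, neighbour count and function names are all assumptions for the example.

```python
import math


def delay_embed(series, dim, tau):
    """Build delay vectors (x[t], x[t - tau], ..., x[t - (dim - 1) * tau])."""
    start = (dim - 1) * tau
    return [
        tuple(series[t - j * tau] for j in range(dim))
        for t in range(start, len(series))
    ]


def predict_next(series, dim=3, tau=1, k=2):
    """Predict the next value as the mean successor of the k nearest past states."""
    vectors = delay_embed(series, dim, tau)
    if len(vectors) < 2:
        raise ValueError("series too short for the chosen embedding")
    start = (dim - 1) * tau
    query = vectors[-1]  # the current state, whose successor is unknown
    candidates = []
    for idx, vec in enumerate(vectors[:-1]):
        t = start + idx  # time index of the newest sample in vec
        candidates.append((math.dist(vec, query), series[t + 1]))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(successor for _, successor in nearest) / len(nearest)


# Toy periodic "flow" record; a real application would use measured discharges.
forecast = predict_next([1, 2, 3] * 4)
```

    On the toy periodic record the current state has exact past analogues, so the forecast simply repeats the cycle. With noisy data, the choice of dim, tau and k matters a great deal.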

  6. In Silico Characterization of Pectate Lyase Protein Sequences from Different Source Organisms

    Directory of Open Access Journals (Sweden)

    Amit Kumar Dubey

    2010-01-01

    Full Text Available A total of 121 protein sequences of pectate lyases were subjected to homology search, multiple sequence alignment, phylogenetic tree construction, and motif analysis. The phylogenetic tree constructed revealed different clusters based on different source organisms representing bacterial, fungal, plant, and nematode pectate lyases. The multiple accessions of bacterial, fungal, nematode, and plant pectate lyase protein sequences were placed closely revealing a sequence level similarity. The multiple sequence alignment of these pectate lyase protein sequences from different source organisms showed conserved regions at different stretches with maximum homology from amino acid residues 439–467, 715–816, and 829–910 which could be used for designing degenerate primers or probes specific for pectate lyases. The motif analysis revealed a conserved Pec_Lyase_C domain uniformly observed in all pectate lyases irrespective of variable sources suggesting its possible role in structural and enzymatic functions.

  7. Classification and characterization of species within the genus Lens using genotyping-by-sequencing (GBS).

    Directory of Open Access Journals (Sweden)

    Melissa M L Wong

    Full Text Available Lentil (Lens culinaris ssp. culinaris) is a nutritious and affordable pulse with an ancient crop domestication history. The genus Lens consists of seven taxa; however, there are many discrepancies in the taxon and gene pool classification of lentil and its wild relatives. Due to the narrow genetic basis of cultivated lentil, there is a need for a better understanding of the relationships amongst wild germplasm to assist introgression of favourable genes into lentil breeding programs. Genotyping-by-sequencing (GBS) is an easy and affordable method that allows multiplexing of up to 384 samples or more per library to generate genome-wide single nucleotide polymorphism (SNP) markers. In this study, we aimed to characterize our lentil germplasm collection using a two-enzyme GBS approach. We constructed two 96-plex GBS libraries with a total of 60 accessions where some accessions had several samples and each sample was sequenced in two technical replicates. We developed an automated GBS pipeline and detected a total of 266,356 genome-wide SNPs. After filtering low quality and redundant SNPs based on haplotype information, we constructed a maximum-likelihood tree using 5,389 SNPs. The phylogenetic tree grouped the germplasm collection into their respective taxa with strong support. Based on phylogenetic tree and STRUCTURE analysis, we identified four gene pools, namely L. culinaris/L. orientalis/L. tomentosus, L. lamottei/L. odemensis, L. ervoides and L. nigricans which form primary, secondary, tertiary and quaternary gene pools, respectively. We discovered sequencing bias problems likely due to DNA quality and observed severe run-to-run variation in the wild lentils. We examined the authenticity of the germplasm collection and identified 17% misclassified samples. Our study demonstrated that GBS is a promising and affordable tool for screening by plant breeders interested in crop wild relatives.

  8. Characterization of GM events by insert knowledge adapted re-sequencing approaches

    OpenAIRE

    Yang, Litao; Wang, Congmao; Holst-Jensen, Arne; Morisset, Dany; Lin, Yongjun; Zhang, Dabing

    2013-01-01

    Detection methods and data from molecular characterization of genetically modified (GM) events are needed by stakeholders of public risk assessors and regulators. Generally, the molecular characteristics of GM events are incomprehensively revealed by current approaches and biased towards detecting transformation vector derived sequences. GM events are classified based on available knowledge of the sequences of vectors and inserts (insert knowledge). Herein we present three insert knowledge-ad...

  9. Identification and characterization of Highlands J virus from a Mississippi sandhill crane using unbiased next-generation sequencing

    Science.gov (United States)

    Ip, Hon S.; Wiley, Michael R.; Long, Renee; Gustavo, Palacios; Shearn-Bochsler, Valerie; Whitehouse, Chris A.

    2014-01-01

    Advances in massively parallel DNA sequencing platforms, commonly termed next-generation sequencing (NGS) technologies, have greatly reduced time, labor, and cost associated with DNA sequencing. Thus, NGS has become a routine tool for new viral pathogen discovery and will likely become the standard for routine laboratory diagnostics of infectious diseases in the near future. This study demonstrated the application of NGS for the rapid identification and characterization of a virus isolated from the brain of an endangered Mississippi sandhill crane. This bird was part of a population restoration effort and was found in an emaciated state several days after Hurricane Isaac passed over the refuge in Mississippi in 2012. Post-mortem examination had identified trichostrongyliasis as the possible cause of death, but because a virus with morphology consistent with a togavirus was isolated from the brain of the bird, an arboviral etiology was strongly suspected. Because individual molecular assays for several known arboviruses were negative, unbiased NGS by Illumina MiSeq was used to definitively identify and characterize the causative viral agent. Whole genome sequencing and phylogenetic analysis revealed the viral isolate to be the Highlands J virus, a known avian pathogen. This study demonstrates the use of unbiased NGS for the rapid detection and characterization of an unidentified viral pathogen and the application of this technology to wildlife disease diagnostics and conservation medicine.

  10. Accident sequence analysis of human-computer interface design

    International Nuclear Information System (INIS)

    Fan, C.-F.; Chen, W.-H.

    2000-01-01

    It is important to predict potential accident sequences of human-computer interaction in a safety-critical computing system so that vulnerable points can be disclosed and removed. We address this issue by proposing a Multi-Context human-computer interaction Model along with its analysis techniques, an Augmented Fault Tree Analysis, and a Concurrent Event Tree Analysis. The proposed augmented fault tree can identify the potential weak points in software design that may induce unintended software functions or erroneous human procedures. The concurrent event tree can enumerate possible accident sequences due to these weak points

  11. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd and 4th y undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  12. Analysis and Visualization Tool for Targeted Amplicon Bisulfite Sequencing on Ion Torrent Sequencers.

    Directory of Open Access Journals (Sweden)

    Stephan Pabinger

    Full Text Available Targeted sequencing of PCR amplicons generated from bisulfite deaminated DNA is a flexible, cost-effective way to study methylation of a sample at single CpG resolution and perform subsequent multi-target, multi-sample comparisons. Currently, no platform specific protocol, support, or analysis solution is provided to perform targeted bisulfite sequencing on a Personal Genome Machine (PGM. Here, we present a novel tool, called TABSAT, for analyzing targeted bisulfite sequencing data generated on Ion Torrent sequencers. The workflow starts with raw sequencing data, performs quality assessment, and uses a tailored version of Bismark to map the reads to a reference genome. The pipeline visualizes results as lollipop plots and is able to deduce specific methylation-patterns present in a sample. The obtained profiles are then summarized and compared between samples. In order to assess the performance of the targeted bisulfite sequencing workflow, 48 samples were used to generate 53 different Bisulfite-Sequencing PCR amplicons from each sample, resulting in 2,544 amplicon targets. We obtained a mean coverage of 282X using 1,196,822 aligned reads. Next, we compared the sequencing results of these targets to the methylation level of the corresponding sites on an Illumina 450k methylation chip. The calculated average Pearson correlation coefficient of 0.91 confirms the sequencing results with one of the industry-leading CpG methylation platforms and shows that targeted amplicon bisulfite sequencing provides an accurate and cost-efficient method for DNA methylation studies, e.g., to provide platform-independent confirmation of Illumina Infinium 450k methylation data. TABSAT offers a novel way to analyze data generated by Ion Torrent instruments and can also be used with data from the Illumina MiSeq platform. It can be easily accessed via the Platomics platform, which offers a web-based graphical user interface along with sample and parameter storage
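
    The comparison between sequencing-derived methylation levels and array values reported above rests on the Pearson correlation coefficient. A minimal sketch of that calculation follows. The function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-CpG methylation fractions from two platforms.
sequencing = [0.10, 0.50, 0.90]
array = [0.20, 0.60, 1.00]
r = pearson(sequencing, array)
```

    In this toy case the array values are a constant offset of the sequencing values, so the coefficient is 1 up to floating-point rounding. Real data would give a value below 1, such as the 0.91 reported in the abstract.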

  13. Multilocus Sequence Analysis and rpoB Sequencing of Mycobacterium abscessus (Sensu Lato) Strains

    Science.gov (United States)

    Macheras, Edouard; Roux, Anne-Laure; Bastian, Sylvaine; Leão, Sylvia Cardoso; Palaci, Moises; Sivadon-Tardy, Valérie; Gutierrez, Cristina; Richter, Elvira; Rüsch-Gerdes, Sabine; Pfyffer, Gaby; Bodmer, Thomas; Cambau, Emmanuelle; Gaillard, Jean-Louis; Heym, Beate

    2011-01-01

    Mycobacterium abscessus, Mycobacterium bolletii, and Mycobacterium massiliense (Mycobacterium abscessus sensu lato) are closely related species that currently are identified by the sequencing of the rpoB gene. However, recent studies show that rpoB sequencing alone is insufficient to discriminate between these species, and some authors have questioned their current taxonomic classification. We studied here a large collection of M. abscessus (sensu lato) strains by partial rpoB sequencing (752 bp) and multilocus sequence analysis (MLSA). The final MLSA scheme developed was based on the partial sequences of eight housekeeping genes: argH, cya, glpK, gnd, murC, pgm, pta, and purH. The strains studied included the three type strains (M. abscessus CIP 104536(T), M. massiliense CIP 108297(T), and M. bolletii CIP 108541(T)) and 120 isolates recovered between 1997 and 2007 in France, Germany, Switzerland, and Brazil. The rpoB phylogenetic tree confirmed the existence of three main clusters, each comprising the type strain of one species. However, divergence values between the M. massiliense and M. bolletii clusters all were below 3% and between the M. abscessus and M. massiliense clusters were from 2.66 to 3.59%. The tree produced using the concatenated MLSA gene sequences (4,071 bp) also showed three main clusters, each comprising the type strain of one species. The M. abscessus cluster had a bootstrap value of 100% and was mostly compact. Bootstrap values for the M. massiliense and M. bolletii branches were much lower (71 and 61%, respectively), with the M. massiliense cluster having a fuzzy aspect. Mean (range) divergence values were 2.17% (1.13 to 2.58%) between the M. abscessus and M. massiliense clusters, 2.37% (1.5 to 2.85%) between the M. abscessus and M. bolletii clusters, and 2.28% (0.86 to 2.68%) between the M. massiliense and M. bolletii clusters. Adding the rpoB sequence to the MLSA-concatenated sequence (total sequence, 4,823 bp) had little effect on the clustering

  14. Multilocus sequence analysis and rpoB sequencing of Mycobacterium abscessus (sensu lato) strains.

    Science.gov (United States)

    Macheras, Edouard; Roux, Anne-Laure; Bastian, Sylvaine; Leão, Sylvia Cardoso; Palaci, Moises; Sivadon-Tardy, Valérie; Gutierrez, Cristina; Richter, Elvira; Rüsch-Gerdes, Sabine; Pfyffer, Gaby; Bodmer, Thomas; Cambau, Emmanuelle; Gaillard, Jean-Louis; Heym, Beate

    2011-02-01

    Mycobacterium abscessus, Mycobacterium bolletii, and Mycobacterium massiliense (Mycobacterium abscessus sensu lato) are closely related species that currently are identified by the sequencing of the rpoB gene. However, recent studies show that rpoB sequencing alone is insufficient to discriminate between these species, and some authors have questioned their current taxonomic classification. We studied here a large collection of M. abscessus (sensu lato) strains by partial rpoB sequencing (752 bp) and multilocus sequence analysis (MLSA). The final MLSA scheme developed was based on the partial sequences of eight housekeeping genes: argH, cya, glpK, gnd, murC, pgm, pta, and purH. The strains studied included the three type strains (M. abscessus CIP 104536(T), M. massiliense CIP 108297(T), and M. bolletii CIP 108541(T)) and 120 isolates recovered between 1997 and 2007 in France, Germany, Switzerland, and Brazil. The rpoB phylogenetic tree confirmed the existence of three main clusters, each comprising the type strain of one species. However, divergence values between the M. massiliense and M. bolletii clusters all were below 3% and between the M. abscessus and M. massiliense clusters were from 2.66 to 3.59%. The tree produced using the concatenated MLSA gene sequences (4,071 bp) also showed three main clusters, each comprising the type strain of one species. The M. abscessus cluster had a bootstrap value of 100% and was mostly compact. Bootstrap values for the M. massiliense and M. bolletii branches were much lower (71 and 61%, respectively), with the M. massiliense cluster having a fuzzy aspect. Mean (range) divergence values were 2.17% (1.13 to 2.58%) between the M. abscessus and M. massiliense clusters, 2.37% (1.5 to 2.85%) between the M. abscessus and M. bolletii clusters, and 2.28% (0.86 to 2.68%) between the M. massiliense and M. bolletii clusters. Adding the rpoB sequence to the MLSA-concatenated sequence (total sequence, 4,823 bp) had little effect on the
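
    The divergence values quoted for the concatenated MLSA sequences can be illustrated with a simple uncorrected distance. The sketch below assumes pre-aligned gene fragments of equal length and counts mismatches as a percentage of compared sites. The abstract does not state which distance measure the authors used, and the toy sequences and function names are hypothetical.

```python
def concatenate(loci, order):
    """Join per-gene aligned sequences in a fixed gene order."""
    return "".join(loci[gene] for gene in order)


def percent_divergence(seq_a, seq_b):
    """Uncorrected divergence (%) between two aligned sequences, skipping gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # ignore alignment gaps
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differences / compared


# Toy fragments for two of the eight genes; real loci are hundreds of bp long.
gene_order = ["argH", "cya"]
strain_a = {"argH": "ACGTACGTAC", "cya": "GGCCTTAAGG"}
strain_b = {"argH": "ACGTACGTAT", "cya": "GGCCTTAAGG"}
divergence = percent_divergence(
    concatenate(strain_a, gene_order), concatenate(strain_b, gene_order)
)
```

    Here one mismatch over 20 compared sites gives 5%. Applied to all strain pairs, such values could be summarized as the mean and range per cluster pair, as the abstract reports.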

  15. An optimum analysis sequence for environmental gamma-ray spectrometry

    Energy Technology Data Exchange (ETDEWEB)

    De la Torre, F.; Rios M, C.; Ruvalcaba A, M. G.; Mireles G, F.; Saucedo A, S.; Davila R, I.; Pinedo, J. L., E-mail: fta777@hotmail.co [Universidad Autonoma de Zacatecas, Centro Regional de Estudis Nucleares, Calle Cipres No. 10, Fracc. La Penuela, 98068 Zacatecas (Mexico)

    2010-10-15

    This work aims to obtain an optimum analysis sequence for environmental gamma-ray spectroscopy by means of Genie 2000 (Canberra). Twenty different analysis sequences were customized using different peak area percentages and different algorithms for: 1) peak finding, and 2) peak area determination, and with or without the use of a library -based on evaluated nuclear data- of common gamma-ray emitters in environmental samples. The use of an optimum analysis sequence with certified nuclear information avoids the problems originated by the significant variations in out-of-date nuclear parameters of commercial software libraries. Interference-free gamma ray energies with absolute emission probabilities greater than 3.75% were included in the customized library. The gamma-ray spectroscopy system (based on a Ge Re-3522 Canberra detector) was calibrated both in energy and shape by means of the IAEA-2002 reference spectra for software intercomparison. To test the performance of the analysis sequences, the IAEA-2002 reference spectrum was used. The z-score and the reduced χ² criteria were used to determine the optimum analysis sequence. The results show an appreciable variation in the peak area determinations and their corresponding uncertainties. Particularly, the combination of second derivative peak locate with simple peak area integration algorithms provides the greater accuracy. Lower accuracy comes from the combination of library directed peak locate algorithm and Genie's Gamma-M peak area determination. (Author)
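
    The two criteria named above can be written down compactly. The abstract does not give the exact definitions used, so the sketch below assumes the usual forms: a z-score as the deviation from the reference value in units of a combined uncertainty, and a reduced χ² as the mean squared z-score over the degrees of freedom. All names and numbers are hypothetical.

```python
def z_score(value, reference, sigma):
    """Deviation of a measured peak area from its reference, in units of sigma."""
    return (value - reference) / sigma


def reduced_chi_square(values, references, sigmas, n_fitted=0):
    """Sum of squared z-scores divided by the degrees of freedom."""
    dof = len(values) - n_fitted
    if dof <= 0:
        raise ValueError("not enough peaks for the number of fitted parameters")
    chi2 = sum(
        z_score(v, r, s) ** 2 for v, r, s in zip(values, references, sigmas)
    )
    return chi2 / dof


# Hypothetical peak areas (counts) for two peaks and their reference values.
measured = [105.0, 95.0]
reference = [100.0, 100.0]
uncertainty = [5.0, 5.0]  # assumed combined standard uncertainties
score = reduced_chi_square(measured, reference, uncertainty)
```

    A reduced χ² near 1 indicates deviations consistent with the stated uncertainties. Ranking the analysis sequences by such a figure is one plausible reading of how an optimum could be chosen.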

  16. An optimum analysis sequence for environmental gamma-ray spectrometry

    International Nuclear Information System (INIS)

    De la Torre, F.; Rios M, C.; Ruvalcaba A, M. G.; Mireles G, F.; Saucedo A, S.; Davila R, I.; Pinedo, J. L.

    2010-10-01

    This work aims to obtain an optimum analysis sequence for environmental gamma-ray spectroscopy by means of Genie 2000 (Canberra). Twenty different analysis sequences were customized using different peak area percentages and different algorithms for 1) peak finding and 2) peak area determination, with or without the use of a library (based on evaluated nuclear data) of common gamma-ray emitters in environmental samples. The use of an optimum analysis sequence with certified nuclear information avoids the problems caused by the significant variations in out-of-date nuclear parameters of commercial software libraries. Interference-free gamma-ray energies with absolute emission probabilities greater than 3.75% were included in the customized library. The gamma-ray spectroscopy system (based on a Ge Re-3522 Canberra detector) was calibrated both in energy and shape by means of the IAEA-2002 reference spectra for software intercomparison. To test the performance of the analysis sequences, the IAEA-2002 reference spectrum was used. The z-score and the reduced χ² criteria were used to determine the optimum analysis sequence. The results show an appreciable variation in the peak area determinations and their corresponding uncertainties. In particular, the combination of second-derivative peak locate with simple peak area integration algorithms provides the greatest accuracy. Lower accuracy comes from the combination of the library-directed peak locate algorithm and Genie's Gamma-M peak area determination. (Author)

  17. Establishment of screening technique for mutant cell and analysis of base sequence in the mutation

    International Nuclear Information System (INIS)

    Sofuni, Toshio; Nomi, Takehiko; Yamada, Masami; Masumura, Kenichi

    2000-01-01

    This research project aimed to establish an easy and quick detection method for radiation-induced mutation using molecular-biological techniques and an effective method for analyzing the molecular changes in base sequence. This year, Spi mutants derived from γ-irradiated mice were analyzed by PCR and DNA sequencing. Male transgenic mice were exposed to γ-rays at 5, 10, and 50 Gy, and the transgene was recovered from the genomic DNA of the spleen by the in vivo packaging method. Spi mutant plaques were obtained by infecting E. coli with the recovered phage. Sequence analysis of the mutants was performed using an ALFred DNA sequencer and the SequiTherm™ Long-Red Cycle sequencing kit. Sequence analysis was carried out for 41 of 50 independent Spi mutants obtained. The deletions were classified into 4 groups. Group 1 included 15 mutants characterized by a large deletion (43 bp-10 kb) with a short homologous sequence. Group 2 included 11 mutants with a large deletion having no homologous sequence at the connecting region. Group 3 included 11 mutants having a short deletion of less than 20 bp, which occurred in the non-repetitive sequence of the gam gene and was possibly caused by oxidative breakage of DNA or recombination of DNA fragments produced by the breakage. Group 4 included 4 mutants having deletions of 20 bp or less in the repetitive sequence of the gam gene, resulting in an alteration of the reading frame. Thus, the synthesis of Gam protein was terminated by the appearance of TGA between codons 13 and 14 of the redB gene, leading to inactivation of the gam gene and the redBA gene. These results indicated that most Spi mutants had a deletion in the red/gam region and that the deletions in more than half of the mutants occurred in homologous sequences as short as 8 bp. (M.N.)

  18. Molecular Characterization of Cultivated Bromeliad Accessions with Inter-Simple Sequence Repeat (ISSR) Markers

    Directory of Open Access Journals (Sweden)

    Yongming Yu

    2012-05-01

    Bromeliads are of great economic importance in flower production; however, little information is available so far on the genetic characterization of cultivated bromeliads. In the present study, a selection of cultivated bromeliads was characterized via inter-simple sequence repeat (ISSR) markers with an emphasis on genetic diversity and population structure. Twelve ISSR primers produced 342 bands, of which 287 (~84%) were polymorphic, with polymorphic bands per primer ranging from 17 to 34. The Jaccard’s similarity ranged from 0.08 to 0.89 and averaged ~0.30 for the investigated bromeliads. The Bayesian-based approach, together with the un-weighted paired group method with arithmetic average (UPGMA)-based clustering and the principal coordinate analysis (PCoA), distinctly grouped the bromeliads from Neoregelia, Guzmania, and Vriesea into three separate clusters, corresponding well with their botanical classifications, whereas the bromeliads of Aechmea other than the recently selected hybrids were not well assigned to a cluster. Additionally, ISSR markers proved efficient for the identification of hybrids and bud sports of cultivated bromeliads. These findings further our knowledge of the genetic variability within cultivated bromeliads and should therefore facilitate future breeding of new varieties.
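
    As an illustration of how the band statistics in this abstract are typically computed, the following Python sketch derives a Jaccard similarity and a polymorphic-band fraction from presence/absence band profiles. The accession names and 0/1 profiles are hypothetical; the study's own scoring software and data are not described in the abstract.

```python
def jaccard(a, b):
    """Jaccard similarity of two 0/1 band profiles: shared bands / bands in either."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same band positions")
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 0.0

def polymorphic_fraction(profiles):
    """Fraction of band positions that differ between at least two accessions."""
    n_bands = len(profiles[0])
    polymorphic = sum(
        1 for j in range(n_bands) if len({p[j] for p in profiles}) > 1
    )
    return polymorphic / n_bands

# Hypothetical ISSR band profiles (1 = band present, 0 = absent).
profiles = {
    "acc_A": [1, 1, 0, 1, 0, 1],
    "acc_B": [1, 0, 0, 1, 1, 1],
    "acc_C": [1, 0, 1, 0, 1, 1],
}

sim_ab = jaccard(profiles["acc_A"], profiles["acc_B"])
sim_ac = jaccard(profiles["acc_A"], profiles["acc_C"])
poly = polymorphic_fraction(list(profiles.values()))
print(sim_ab, round(sim_ac, 3), round(poly, 3))
```

    The same ratio applied to the counts reported above gives 287 / 342 ≈ 0.84, matching the stated ~84% polymorphic bands. A pairwise similarity matrix built this way is the usual input to UPGMA clustering or PCoA.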

  19. Strategies in protein sequencing and characterization: Multi-enzyme digestion coupled with alternate CID/ETD tandem mass spectrometry

    Energy Technology Data Exchange (ETDEWEB)

    Nardiello, Donatella; Palermo, Carmen, E-mail: carmen.palermo@unifg.it; Natale, Anna; Quinto, Maurizio; Centonze, Diego

    2015-01-07

    Highlights: • Multi-enzyme digestion for protein sequencing and characterization by CID/ETD. • Simultaneous use of trypsin/chymotrypsin for the maximization of sequence. • Identification of PTMs, sequence variants and species-specific residues. • Increase of accuracy in sequence assignments by orthogonal fragmentation techniques. - Abstract: A strategy based on a simultaneous multi-enzyme digestion coupled with electron transfer dissociation (ETD) and collision-induced dissociation (CID) was developed for protein sequencing and characterization, as a valid alternative platform in ion-trap based proteomics. The effect of different proteolytic procedures using chymotrypsin, trypsin, a combination of both, and Lys-C, was carefully evaluated in terms of number of identified peptides, protein coverage, and score distribution. A systematic comparison between CID and ETD is shown for the analysis of peptides originating from the in-solution digestion of standard caseins. The best results were achieved with a trypsin/chymotrypsin mix combined with CID and ETD operating in alternating mode. A post-database search validation of MS/MS dataset was performed, then, the matched peptides were cross checked by the evaluation of ion scores, rank, number of experimental product ions, and their relative abundances in the MS/MS spectrum. By integrated CID/ETD experiments, high quality-spectra have been obtained, thus allowing a confirmation of spectral information and an increase of accuracy in peptide sequence assignments. Overlapping peptides, produced throughout the proteins, reduce the ambiguity in mapping modifications between natural variants and animal species, and allow the characterization of post translational modifications. The advantages of using the enzymatic mix trypsin/chymotrypsin were confirmed by the nanoLC and CID/ETD tandem mass spectrometry of goat milk proteins, previously separated by two-dimensional gel electrophoresis.

  20. Sequence analysis of serum albumins reveals the molecular evolution of ligand recognition properties.

    Science.gov (United States)

    Fanali, Gabriella; Ascenzi, Paolo; Bernardi, Giorgio; Fasano, Mauro

    2012-01-01

    Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. The multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, thus showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching conserved domains revealed that most SA sequences are made up by three repeated domains (about 600 residues), as extensively characterized for human SA. On the contrary, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar of higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.

  1. Validation of Genotyping-By-Sequencing Analysis in Populations of Tetraploid Alfalfa by 454 Sequencing

    Science.gov (United States)

    Rocher, Solen; Jean, Martine; Castonguay, Yves; Belzile, François

    2015-01-01

    Genotyping-by-sequencing (GBS) is a relatively low-cost high throughput genotyping technology based on next generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids. PMID:26115486

  2. Mycobacterium malmesburyense sp. nov., a non-tuberculous species of the genus Mycobacterium revealed by multiple gene sequence characterization.

    Science.gov (United States)

    Gcebe, Nomakorinte; Rutten, Victor; Pittius, Nicolaas Gey van; Naicker, Brendon; Michel, Anita

    2017-04-01

    Non-tuberculous mycobacteria (NTM) are ubiquitous in the environment, and an increasing number of NTM species have been isolated and characterized from both humans and animals, highlighting the zoonotic potential of these bacteria. Host exposure to NTM may impact on cross-reactive immune responsiveness, which may affect diagnosis of bovine tuberculosis and may also play a role in the variability of the efficacy of Mycobacterium bovis BCG vaccination against tuberculosis. In this study we characterized 10 NTM isolates originating from water, soil, nasal swabs of cattle and African buffalo as well as bovine tissue samples. These isolates were previously identified during an NTM survey and were all found, using 16S rRNA gene sequence analysis to be closely related to Mycobacterium moriokaense. A polyphasic approach that included phenotypic characterization, antibiotic susceptibility profiling, mycolic acid profiling and phylogenetic analysis of four gene loci, 16S rRNA, hsp65, sodA and rpoB, was employed to characterize these isolates. Sequence data analysis of the four gene loci revealed that these isolates belong to a unique species of the genus Mycobacterium. This evidence was further supported by several differences in phenotypic characteristics between the isolates and the closely related species. We propose the name Mycobacterium malmesburyense sp. nov. for this novel species. The type strain is WCM 7299T (=ATCC BAA-2759T=CIP 110822T).

  3. Sequence characterization and glycosylation sites identification of donkey milk lactoferrin by multiple enzyme digestions and mass spectrometry

    DEFF Research Database (Denmark)

    Gallina, Serafina; Cunsolo, Vincenzo; Saletti, Rosaria

    2016-01-01

    Lactoferrin, a protein showing an array of biochemical properties, including immuno-modulation, iron-binding ability, as well as antioxidant, antibacterial and antiviral activities, but which may also represent a potential milk allergen, was isolated from donkey milk by ion exchange chromatography...... characterization of donkey lactoferrin sequence, that, at least for the covered sequence, differs from the horse genomic deduced sequence (UniProtKB Acc. Nr. O77811) by five point substitutions located at positions 91 (Arg → His), 328 (Thr → Ile/Leu), 466 (Ala → Gly), 642 (Asn → Ser) and 668 (Ser → Ala). Analysis...... of the glycosylated protein showed that glycans in donkey lactoferrin are linked to the protein backbone via an amide bond to asparagine residues located at the positions 137, 281 and 476....

  4. Genetic mutation analysis of human gastric adenocarcinomas using ion torrent sequencing platform.

    Directory of Open Access Journals (Sweden)

    Zhi Xu

    Gastric cancer is one of the major causes of cancer-related death, especially in Asia. Gastric adenocarcinoma, the most common type of gastric cancer, is heterogeneous, and its incidence and cause vary widely with geographical region, gender, ethnicity, and diet. Since unique mutations have been observed in individual human cancer samples, identification and characterization of the molecular alterations underlying individual gastric adenocarcinomas is a critical step for developing more effective, personalized therapies. Until recently, identifying genetic mutations on an individual basis by DNA sequencing remained a daunting task. Recent advances in next-generation DNA sequencing technologies, such as the semiconductor-based Ion Torrent sequencing platform, make DNA sequencing cheaper, faster, and more reliable. In this study, we aim to identify genetic mutations in genes that are targeted by drugs in clinical use or under development in individual human gastric adenocarcinoma samples using Ion Torrent sequencing. We sequenced 737 loci from 45 cancer-related genes in 238 human gastric adenocarcinoma samples using the Ion Torrent Ampliseq Cancer Panel. The sequencing analysis revealed a high occurrence of mutations along the TP53 locus (9.7%) in our sample set. Thus, this study indicates the utility of a cost- and time-efficient tool such as Ion Torrent sequencing to screen cancer mutations for the development of personalized cancer therapy.

  5. Sequence and phylogenetic analysis of chicken anaemia virus obtained from backyard and commercial chickens in Nigeria.

    Science.gov (United States)

    Oluwayelu, D O; Todd, D; Olaleye, O D

    2008-12-01

    This work reports the first molecular analysis study of chicken anaemia virus (CAV) in backyard chickens in Africa using molecular cloning and sequence analysis to characterize CAV strains obtained from commercial chickens and Nigerian backyard chickens. Partial VP1 gene sequences were determined for three CAVs from commercial chickens and for six CAV variants present in samples from a backyard chicken. Multiple alignment analysis revealed that the 6% and 4% nucleotide diversity obtained respectively for the commercial and backyard chicken strains translated to only 2% amino acid diversity for each breed. Overall, the amino acid composition of Nigerian CAVs was found to be highly conserved. Since the partial VP1 gene sequence of two backyard chicken cloned CAV strains (NGR/CI-8 and NGR/CI-9) were almost identical and evolutionarily closely related to the commercial chicken strains NGR-1, and NGR-4 and NGR-5, respectively, we concluded that CAV infections had crossed the farm boundary.

  6. Plastome Sequence Determination and Comparative Analysis for Members of the Lolium-Festuca Grass Species Complex

    Science.gov (United States)

    Hand, Melanie L.; Spangenberg, German C.; Forster, John W.; Cogan, Noel O. I.

    2013-01-01

    Chloroplast genome sequences are of broad significance in plant biology, due to frequent use in molecular phylogenetics, comparative genomics, population genetics, and genetic modification studies. The present study used a second-generation sequencing approach to determine and assemble the plastid genomes (plastomes) of four representatives from the agriculturally important Lolium-Festuca species complex of pasture grasses (Lolium multiflorum, Festuca pratensis, Festuca altissima, and Festuca ovina). Total cellular DNA was extracted from either roots or leaves, was sequenced, and the output was filtered for plastome-related reads. A comparison between sources revealed fewer plastome-related reads from root-derived template but an increase in incidental bacterium-derived sequences. Plastome assembly and annotation indicated high levels of sequence identity and a conserved organization and gene content between species. However, frequent deletions within the F. ovina plastome appeared to contribute to a smaller plastid genome size. Comparative analysis with complete plastome sequences from other members of the Poaceae confirmed conservation of most grass-specific features. Detailed analysis of the rbcL–psaI intergenic region, however, revealed a “hot-spot” of variation characterized by independent deletion events. The evolutionary implications of this observation are discussed. The complete plastome sequences are anticipated to provide the basis for potential organelle-specific genetic modification of pasture grasses. PMID:23550121

  7. Characterization of Satellite DNA Sequences from the Commercially Important Marine Rotifers Brachionus rotundiformis and Brachionus plicatilis.

    Science.gov (United States)

    Boehm; Gibson; Lubzens

    2000-01-01

    This study was initiated to search for species-specific and strain-specific satellite DNA sequences for which oligonucleotide primers could be designed to differentiate between various commercially important strains of the marine monogonont rotifers Brachionus rotundiformis and Brachionus plicatilis. Two unrelated, highly reiterated satellite sequences were cloned and characterized. The eight sequenced monomers from B. rotundiformis and six from B. plicatilis had low intrarepeat variability and were similar in their overall lengths, A + T compositions, and high degrees of repeated motif substructure. However, hybridizations to 19 representative strains, sequence characterizations, and GenBank searches indicated that these two satellites are morphotype-specific and population-specific, respectively, and share little homology to each other or to other characterized sequences in the database. Primer pairs designed for the B. rotundiformis satellite confirmed hybridization specificities on polymerase chain reaction and could serve as a useful molecular diagnostic tool to identify strains belonging to the SS morphotype, which are gaining widespread usage as first feeds for marine fish in commercial production.

  8. Rickettsia asembonensis Characterization by Multilocus Sequence Typing of Complete Genes, Peru.

    Science.gov (United States)

    Loyola, Steev; Flores-Mendoza, Carmen; Torre, Armando; Kocher, Claudine; Melendrez, Melanie; Luce-Fedrow, Alison; Maina, Alice N; Richards, Allen L; Leguia, Mariana

    2018-05-01

    While studying rickettsial infections in Peru, we detected Rickettsia asembonensis in fleas from domestic animals. We characterized 5 complete genomic regions (17kDa, gltA, ompA, ompB, and sca4) and conducted multilocus sequence typing and phylogenetic analyses. The molecular isolate from Peru is distinct from the original R. asembonensis strain from Kenya.

  9. Characterization of GM events by insert knowledge adapted re-sequencing approaches.

    Science.gov (United States)

    Yang, Litao; Wang, Congmao; Holst-Jensen, Arne; Morisset, Dany; Lin, Yongjun; Zhang, Dabing

    2013-10-03

    Detection methods and data from molecular characterization of genetically modified (GM) events are needed by stakeholders such as public risk assessors and regulators. Generally, the molecular characteristics of GM events are only partially revealed by current approaches, which are biased towards detecting transformation vector-derived sequences. GM events are classified based on available knowledge of the sequences of vectors and inserts (insert knowledge). Herein we present three insert knowledge-adapted approaches for characterizing GM events (TT51-1 and T1c-19 rice as examples) based on paired-end re-sequencing, with the advantages of comprehensiveness, accuracy, and automation. The comprehensive molecular characteristics of the two rice events were revealed, including additional unintended insertions compared with the results from PCR and Southern blotting. Comprehensive transgene characterization of TT51-1 and T1c-19 is shown to be independent of a priori knowledge of the insert and vector sequences when employing the developed approaches. This provides an opportunity to also identify and characterize unknown GM events.

  10. Genomic sequence around butterfly wing development genes: annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Inês C Conceição

    BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high

  11. Structural characterization of HDPE/LLDPE blend-based nanocomposites obtained by different blending sequence

    International Nuclear Information System (INIS)

    Passador, Fabio R.; Ruvolo Filho, Adhemar; Pessan, Luiz A.

    2011-01-01

    The blending sequence affects the morphology formation of the nanocomposites. In this work, the blending sequences were explored to determine their influence on the rheological behavior of HDPE/LLDPE/OMMT nanocomposites. The nanocomposites were obtained by melt intercalation using a mixture of LLDPE-g-MA and HDPE-g-MA as the compatibilizer system in a torque rheometer at 180 °C, and five blending sequences were studied. The structures of the materials were characterized by wide angle X-ray diffraction (WAXD) and by rheological properties. The addition of nanoclay increased the shear viscosity at low shear rates, changing the behavior of the HDPE/LLDPE matrix to a Bingham model behavior with an apparent yield stress. Intense interactions were obtained for the blending sequence where LLDPE and/or LLDPE-g-MA were first reinforced with organoclay, since the intercalation process occurs preferentially in the amorphous phase. (author)

  12. De novo transcriptome sequencing and sequence analysis of the malaria vector Anopheles sinensis (Diptera: Culicidae)

    Science.gov (United States)

    2014-01-01

    Background: Anopheles sinensis is the major malaria vector in China and Southeast Asia. Vector control is one of the most effective measures to prevent malaria transmission. However, there is little transcriptome information available for the malaria vector. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to build a transcriptome dataset for functional genomics analysis by large-scale RNA sequencing (RNA-seq). Methods: To provide a more comprehensive and complete transcriptome of An. sinensis, RNA from eggs, larvae, pupae, male adults and female adults was pooled for cDNA preparation, sequenced using the Illumina paired-end sequencing technology and assembled into unigenes. These unigenes were then analyzed for genome mapping, functional annotation, homology, codon usage bias and simple sequence repeats (SSRs). Results: Approximately 51.6 million clean reads were obtained, trimmed, and assembled into 38,504 unigenes with an average length of 571 bp, an N50 of 711 bp, and an average GC content of 51.26%. Among them, 98.4% of unigenes could be mapped onto the reference genome, and 69% of unigenes could be annotated with known biological functions. Homology analysis identified certain numbers of An. sinensis unigenes that showed homology or were putative 1:1 orthologues with genomes of other Dipteran species. Codon usage bias was analyzed and 1,904 SSRs were detected, which will provide effective molecular markers for the population genetics of this species. Conclusions: Our data and analysis provide the most comprehensive transcriptomic resource and characteristics currently available for An. sinensis, and will facilitate genetic and genomic studies and further vector control of An. sinensis. PMID:25000941
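
    The assembly statistics quoted in this abstract (average length, N50, GC content) follow standard definitions, sketched below in Python. The example sequences and lengths are hypothetical toy values, not the An. sinensis unigenes, and the function names are illustrative only.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no sequences given")

def gc_content(seqs):
    """Fraction of G and C bases over all sequences combined."""
    total = sum(len(s) for s in seqs)
    if total == 0:
        raise ValueError("no bases given")
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in seqs)
    return gc / total

# Hypothetical assembly: five contigs given by length, three toy sequences.
toy_lengths = [100, 200, 300, 400, 500]
toy_seqs = ["ATGC", "GGCC", "ATAT"]

toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
toy_gc = gc_content(toy_seqs)
print(toy_n50, toy_mean, toy_gc)
```

    In the toy case the N50 (400) exceeds the mean length (300), the same relationship as the reported 711 bp N50 versus 571 bp average, because N50 weights longer sequences more heavily.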

  13. Characterization and Development of EST-SSRs by Deep Transcriptome Sequencing in Chinese Cabbage (Brassica rapa L. ssp. pekinensis)

    Directory of Open Access Journals (Sweden)

    Qian Ding

    2015-01-01

    Simple sequence repeats (SSRs) are among the most important markers for population analysis and have been widely used in plant genetic mapping and molecular breeding. Expressed sequence tag-SSR (EST-SSR) markers, located in the coding regions, are potentially more efficient for QTL mapping, gene targeting, and marker-assisted breeding. In this study, we investigated 51,694 nonredundant unigenes, assembled from clean reads from deep transcriptome sequencing with a Solexa/Illumina platform, for identification and development of EST-SSRs in Chinese cabbage. In total, 10,420 EST-SSRs with over 12 bp were identified and characterized, among which 2744 EST-SSRs are new and 2317 are known ones showing polymorphism with previously reported SSRs. A total of 7877 PCR primer pairs for 1561 EST-SSR loci were designed, and primer pairs for twenty-four EST-SSRs were selected for primer evaluation. In nineteen EST-SSR loci (79.2%), amplicons were successfully generated with high quality. Seventeen (89.5%) showed polymorphism in twenty-four cultivars of Chinese cabbage. The polymorphic alleles of each polymorphic locus were sequenced, and the results showed that most polymorphisms were due to variations of SSR repeat motifs. The EST-SSRs identified and characterized in this study have important implications for developing new tools for genetics and molecular breeding in Chinese cabbage.
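
    A minimal Python sketch of how perfect SSRs can be located in a unigene sequence is given below. It is not the tool used in the study, which the abstract does not name; the 12 bp minimum total length echoes the threshold mentioned above, while the 1 to 6 bp motif range, the function name, and the test sequence are assumptions made for illustration.

```python
import math
import re

def find_ssrs(seq, min_total_len=12, max_motif_len=6):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif_len + 1):
        # Fewest repeats of a k-bp motif that reach the minimum total length.
        min_repeats = max(2, math.ceil(min_total_len / k))
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "ATAT"); the shorter motif length reports that locus.
            if motif in (motif + motif)[1:-1]:
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Hypothetical sequence with an (AT)6 and an (AAG)4 repeat.
example = "GGC" + "AT" * 6 + "CCGTC" + "AAG" * 4 + "TTT"
ssrs = find_ssrs(example)
print(ssrs)
```

    The sketch reports only perfect repeats; compound or interrupted SSRs, which dedicated SSR-mining tools usually handle, are outside its scope.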

  14. Sequence analysis corresponding to the PPE and PE proteins in ...

    Indian Academy of Sciences (India)

    Unknown

    AB repeats; Mycobacterium tuberculosis genome; PE-PPE domain; PPE, PE proteins; sequence analysis; surface antigens. J. Biosci. | Vol. ... bacterium tuberculosis genomes resulted in the identification of a previously uncharacterized 225 amino acid- ...... Vega Lopez F, Brooks L A, Dockrell H M, De Smet K A,. Thompson ...

  15. Molecular cloning, expression analysis and sequence prediction of ...

    African Journals Online (AJOL)

    CCAAT/enhancer-binding protein beta as an essential transcriptional factor, regulates the differentiation of adipocytes and the deposition of fat. Herein, we cloned the whole open reading frame (ORF) of bovine C/EBPβ gene and analyzed its putative protein structures via DNA cloning and sequence analysis. Then, the ...

  16. Multilocus sequence analysis of phytopathogenic species of the genus Streptomyces

    Science.gov (United States)

    The identification and classification of species within the genus Streptomyces is difficult because there are presently 576 validly described species and this number increases every year. The value of the application of multilocus sequence analysis scheme to the systematics of Streptomyces species h...

  17. Sequence symmetry analysis in pharmacovigilance and pharmacoepidemiologic studies

    DEFF Research Database (Denmark)

    Lai, Edward Chia Cheng; Pratt, Nicole; Hsieh, Cheng Yang

    2017-01-01

    Sequence symmetry analysis (SSA) is a method for detecting adverse drug events by utilizing computerized claims data. The method has been increasingly used to investigate safety concerns of medications and as a pharmacovigilance tool to identify unsuspected side effects. Validation studies have i...

  18. DNAApp: a mobile application for sequencing data analysis.

    Science.gov (United States)

    Nguyen, Phi-Vu; Verma, Chandra Shekhar; Gan, Samuel Ken-En

    2014-11-15

    There have been numerous applications developed for decoding and visualization of ab1 DNA sequencing files for Windows and Mac platforms, yet none exists for the increasingly popular smartphone operating systems. The ability to decode sequencing files cannot easily be carried out using browser-accessed Web tools. To overcome this hurdle, we have developed a new native app called DNAApp that can decode and display ab1 sequencing files on Android and iOS. In addition to in-built analysis tools such as reverse complementation, protein translation and searching for specific sequences, we have incorporated convenient functions that would facilitate the harnessing of online Web tools for a full range of analysis. Given the high usage of Android/iOS tablets and smartphones, such bioinformatics apps would raise productivity and facilitate the high demand for analyzing sequencing data in biomedical research. The Android version of DNAApp is available in Google Play Store as 'DNAApp', and the iOS version is available in the App Store. More details on the app can be found at www.facebook.com/APDLab; www.bii.a-star.edu.sg/research/trd/apd.php. The DNAApp user guide is available at http://tinyurl.com/DNAAppuser, and a video tutorial is available on Google Play Store and App Store, as well as on the Facebook page. Contact: samuelg@bii.a-star.edu.sg. © The Author 2014. Published by Oxford University Press.
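
    The three in-built analyses listed in this abstract (reverse complementation, protein translation, and sequence search) can be sketched in a few lines of Python. This is an illustrative sketch only, not DNAApp's implementation: it works on plain sequence strings rather than ab1 files, assumes the standard genetic code, and all function names are hypothetical.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}

def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]

def translate(seq):
    """Translate frame 1; '*' marks stop codons, 'X' marks unknown codons."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )

def find_all(seq, query):
    """Return 0-based start positions of every (possibly overlapping) match."""
    positions, start = [], seq.find(query)
    while start != -1:
        positions.append(start)
        start = seq.find(query, start + 1)
    return positions

print(reverse_complement("ATGGCC"), translate("ATGGCCTAA"), find_all("ATGATGA", "ATG"))
```

    Trailing bases that do not complete a codon are ignored by `translate`, which is one reasonable convention among several.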

  19. DNAApp: a mobile application for sequencing data analysis

    Science.gov (United States)

    Nguyen, Phi-Vu; Verma, Chandra Shekhar; Gan, Samuel Ken-En

    2014-01-01

    Summary: There have been numerous applications developed for decoding and visualization of ab1 DNA sequencing files for Windows and Mac platforms, yet none exists for the increasingly popular smartphone operating systems. Decoding sequencing files cannot easily be carried out using browser-accessed Web tools. To overcome this hurdle, we have developed a new native app called DNAApp that can decode and display ab1 sequencing files on Android and iOS. In addition to in-built analysis tools such as reverse complementation, protein translation and searching for specific sequences, we have incorporated convenient functions that would facilitate the harnessing of online Web tools for a full range of analysis. Given the high usage of Android/iOS tablets and smartphones, such bioinformatics apps would raise productivity and facilitate the high demand for analyzing sequencing data in biomedical research. Availability and implementation: The Android version of DNAApp is available in Google Play Store as ‘DNAApp’, and the iOS version is available in the App Store. More details on the app can be found at www.facebook.com/APDLab; www.bii.a-star.edu.sg/research/trd/apd.php. The DNAApp user guide is available at http://tinyurl.com/DNAAppuser, and a video tutorial is available on Google Play Store and App Store, as well as on the Facebook page. Contact: samuelg@bii.a-star.edu.sg. PMID:25095882

  20. Long-read sequencing data analysis for yeasts.

    Science.gov (United States)

    Yue, Jia-Xing; Liti, Gianni

    2018-06-01

    Long-read sequencing technologies have become increasingly popular due to their strengths in resolving complex genomic regions. As a leading model organism with small genome size and great biotechnological importance, the budding yeast Saccharomyces cerevisiae has many isolates currently being sequenced with long reads. However, analyzing long-read sequencing data to produce high-quality genome assembly and annotation remains challenging. Here, we present a modular computational framework named long-read sequencing data analysis for yeasts (LRSDAY), the first one-stop solution that streamlines this process. Starting from the raw sequencing reads, LRSDAY can produce chromosome-level genome assembly and comprehensive genome annotation in a highly automated manner with minimal manual intervention, which is not possible using any alternative tool available to date. The annotated genomic features include centromeres, protein-coding genes, tRNAs, transposable elements (TEs), and telomere-associated elements. Although tailored for S. cerevisiae, we designed LRSDAY to be highly modular and customizable, making it adaptable to virtually any eukaryotic organism. When applying LRSDAY to an S. cerevisiae strain, it takes ∼41 h to generate a complete and well-annotated genome from ∼100× Pacific Biosciences (PacBio) running the basic workflow with four threads. Basic experience working within the Linux command-line environment is recommended for carrying out the analysis using LRSDAY.

  1. Exploiting BAC-end sequences for the mining, characterization and utility of new short sequences repeat (SSR) markers in Citrus.

    Science.gov (United States)

    Biswas, Manosh Kumar; Chai, Lijun; Mayer, Christoph; Xu, Qiang; Guo, Wenwu; Deng, Xiuxin

    2012-05-01

    The aim of this study was to develop a large set of microsatellite markers based on publicly available BAC-end sequences (BESs), and to evaluate their transferability, genotype-discriminating capacity and mapping ability in Citrus. A set of 1,281 simple sequence repeat (SSR) markers was developed from the 46,339 Citrus clementina BESs, of which 20.67% contained SSRs longer than 20 bp, corresponding to roughly one perfect SSR per 2.04 kb. The most abundant motifs were di-nucleotide repeats (16.82%). Among all repeat motifs, (TA/AT)n is the most abundant (8.38%), followed by (AG/CT)n (4.51%). Most of the BES-SSRs are located in non-coding regions, but 1.3% were found to be associated with transposable elements (TEs). A total of 400 novel SSR primer pairs were synthesized and their transferability and polymorphism tested on a set of 16 Citrus species and Citrus relatives. Among these, 333 (83.25%) were successfully amplified and 260 (65.00%) showed cross-species transferability with Poncirus trifoliata and Fortunella sp. These cross-species transferable markers could be useful for cultivar identification and for genomic study of Citrus, Poncirus and Fortunella sp. The utility of the developed SSR markers was demonstrated by identifying a set of 118 markers each for construction of linkage maps of Citrus reticulata and Poncirus trifoliata. Genetic diversity and phylogenetic relationships among 40 Citrus and related species were assessed with 25 randomly selected SSR primer pairs, and the results revealed that citrus genomic SSRs are superior to genic SSRs for genetic diversity and germplasm characterization of Citrus spp.
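
    Mining perfect SSRs from sequence, as described in this abstract, can be sketched with a regular-expression scan. The tool the authors used is not named here, so the function below is a hypothetical illustration; the 20 bp minimum length and the 1-6 bp motif range are assumptions chosen to echo the abstract, not the study's actual parameters.

```python
import re


def find_perfect_ssrs(seq, min_len=20, max_motif=6):
    """Find perfect tandem repeats of motifs 1..max_motif bp long.

    Returns a sorted list of (start, motif, repeat_count, length) tuples,
    with 0-based start positions. Thresholds are illustrative assumptions.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-mer followed by one or more exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "TATA"), so each locus is reported once per true motif.
            if motif in (motif + motif)[1:-1]:
                continue
            length = match.end() - match.start()
            if length >= min_len:
                hits.append((match.start(), motif, length // k, length))
    return sorted(hits)
```

    Because the scan is non-overlapping, a short neighbouring repeat can shift the reported start of a locus by a base or two; a production tool would handle such boundaries more carefully.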

  2. Characterization of 47 MHC class I sequences in Filipino cynomolgus macaques

    Science.gov (United States)

    Campbell, Kevin J.; Detmer, Ann M.; Karl, Julie A.; Wiseman, Roger W.; Blasky, Alex J.; Hughes, Austin L.; Bimber, Benjamin N.; O’Connor, Shelby L.; O’Connor, David H.

    2009-01-01

    Cynomolgus macaques (Macaca fascicularis) provide increasingly common models for infectious disease research. Several geographically distinct populations of these macaques from Southeast Asia and the Indian Ocean island of Mauritius are available for pathogenesis studies. Though host genetics may profoundly impact results of such studies, similarities and differences between populations are often overlooked. In this study we identified 47 full-length MHC class I nucleotide sequences in 16 cynomolgus macaques of Filipino origin. The majority of MHC class I sequences characterized (39 of 47) were unique to this regional population. However, we discovered eight sequences with perfect identity and six sequences with close similarity to previously defined MHC class I sequences from other macaque populations. We identified two ancestral MHC haplotypes that appear to be shared between Filipino and Mauritian cynomolgus macaques, notably a Mafa-B haplotype that has previously been shown to protect Mauritian cynomolgus macaques against challenge with a simian/human immunodeficiency virus, SHIV89.6P. We also identified a Filipino cynomolgus macaque MHC class I sequence for which the predicted protein sequence differs from Mamu-B*17 by a single amino acid. This is important because Mamu-B*17 is strongly associated with protection against simian immunodeficiency virus (SIV) challenge in Indian rhesus macaques. These findings have implications for the evolutionary history of Filipino cynomolgus macaques as well as for the use of this model in SIV/SHIV research protocols. PMID:19107381

  3. Construction of an integrated database to support genomic sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  4. Population-Sequencing as a Biomarker of Burkholderia mallei and Burkholderia pseudomallei Evolution through Microbial Forensic Analysis

    Directory of Open Access Journals (Sweden)

    John P. Jakupciak

    2013-01-01

    Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders. Accurate characterization of metagenomic samples is dependent on accurate measurements of genetic variation between isolates with resolution down to strain level. Often single biomarker sensitivity is augmented by use of multiple biomarkers or panels of biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate use of direct sequencing, as do differences caused by sample preparation protocols, including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness, and also sought to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations.

  5. Cloning and sequence analysis of cDNA coding for rat nucleolar protein C23

    International Nuclear Information System (INIS)

    Ghaffari, S.H.; Olson, M.O.J.

    1986-01-01

    Using synthetic oligonucleotides as primers and probes, the authors have isolated and sequenced cDNA clones encoding protein C23, a putative nucleolus organizer protein. Poly(A+) RNA was isolated from rat Novikoff hepatoma cells and enriched in C23 mRNA by sucrose density gradient ultracentrifugation. Two deoxyoligonucleotides, a 48- and a 27-mer, were synthesized on the basis of amino acid sequence from the C-terminal half of protein C23 and cDNA sequence data from CHO cell protein. The 48-mer was used as a primer for synthesis of cDNA, which was then inserted into plasmid pUC9. Transformed bacterial colonies were screened by hybridization with 32P-labeled 27-mer. Two clones among 5000 gave a strong positive signal. Plasmid DNAs from these clones were purified and characterized by blotting and nucleotide sequence analysis. The length of C23 mRNA was estimated to be 3200 bases in a northern blot analysis. The sequence of a 267 b.p. insert shows high homology with the CHO cDNA, with only 9 nucleotide differences and an identical amino acid sequence. These studies indicate that this region of the protein is highly conserved.

  6. Illumina MiSeq Sequencing for Preliminary Analysis of Microbiome Causing Primary Endodontic Infections in Egypt

    Directory of Open Access Journals (Sweden)

    Sally Ali Tawfik

    2018-01-01

    The use of high throughput next generation technologies has allowed more comprehensive analysis than traditional Sanger sequencing. The specific aim of this study was to investigate the microbial diversity of primary endodontic infections in Egyptian patients using the Illumina MiSeq sequencing platform. Samples were collected from 19 patients in Suez Canal University Hospital (Endodontic Department) using a sterile #15 K-file and paper points. DNA was extracted using the Mo Bio PowerSoil DNA isolation kit, followed by PCR amplification and agarose gel electrophoresis. The microbiome was characterized on the basis of the V3 and V4 hypervariable regions of the 16S rRNA gene by paired-end sequencing on an Illumina MiSeq device. MOTHUR software was used for sequence filtration and analysis of the sequenced data. A total of 1858 operational taxonomic units at 97% similarity were assigned to 26 phyla, 245 families, and 705 genera. Four main phyla, Firmicutes, Bacteroidetes, Proteobacteria, and Synergistetes, were predominant in all samples. At genus level, Prevotella, Bacillus, Porphyromonas, Streptococcus, and Bacteroides were the most abundant. Illumina MiSeq platform sequencing can be used to investigate oral microbiome composition of endodontic infections. Elucidating the ecology of endodontic infections is a necessary step in developing effective intracanal antimicrobials.

  7. The Pinus taeda genome is characterized by diverse and highly diverged repetitive sequences

    Directory of Open Access Journals (Sweden)

    Yandell Mark

    2010-07-01

    Background: In today's age of genomic discovery, no attempt has been made to comprehensively sequence a gymnosperm genome. The largest genus in the coniferous family Pinaceae is Pinus, whose 110-120 species have extremely large genomes (c. 20-40 Gb, 2N = 24). The size and complexity of these genomes have prompted much speculation as to the feasibility of completing a conifer genome sequence. Conifer genomes are reputed to be highly repetitive, but there is little information available on the nature and identity of repetitive units in gymnosperms. The pines have extensive genetic resources, with approximately 329000 ESTs from eleven species and genetic maps in eight species, including a dense genetic map of the twelve linkage groups in Pinus taeda. Results: We present here the Sanger sequence and annotation of ten P. taeda BAC clones and Genome Analyzer II whole genome shotgun (WGS) sequences representing 7.5% of the genome. Computational annotation of ten BACs predicts three putative protein-coding genes and at least fifteen likely pseudogenes in nearly one megabase of sequence. We found three conifer-specific LTR retroelements in the BACs, and tentatively identified at least 15 others based on evidence from the distantly related angiosperms. Alignment of WGS sequences to the BACs indicates that 80% of BAC sequences have similar copies (≥ 75% nucleotide identity) elsewhere in the genome, but only 23% have identical copies (99% identity). The three most common repetitive elements in the genome were identified and, when combined, represent less than 5% of the genome. Conclusions: This study indicates that the majority of repeats in the P. taeda genome are 'novel' and will therefore require additional BAC or genomic sequencing for accurate characterization. The pine genome contains a very large number of diverged and probably defunct repetitive elements. This study also provides new evidence that sequencing a pine genome using a WGS approach is

  8. Molecular characterization of Fasciola gigantica from Mauritania based on mitochondrial and nuclear ribosomal DNA sequences.

    Science.gov (United States)

    Amor, Nabil; Farjallah, Sarra; Salem, Mohamed; Lamine, Dia Mamadou; Merella, Paolo; Said, Khaled; Ben Slimane, Badreddine

    2011-10-01

    Fasciolosis caused by Fasciola hepatica and Fasciola gigantica (Platyhelminthes: Trematoda: Digenea) is considered the most important helminth infection of ruminants in tropical countries, causing considerable socioeconomic problems. From Africa, F. gigantica has been previously characterized from Burkina Faso, Senegal, Kenya, Zambia and Mali, while F. hepatica has been reported from Morocco and Tunisia, and both species have been observed from Ethiopia and Egypt on the basis of morphometric differences, although the use of molecular markers is necessary to distinguish exactly between species. Samples identified morphologically as F. gigantica (n=60) from sheep and cattle from different geographical localities of Mauritania were genetically characterized by sequences of the first (ITS-1), the 5.8S, and second (ITS-2) Internal Transcribed Spacers (ITS) of nuclear ribosomal DNA (rDNA) genes and the mitochondrial Cytochrome c Oxidase I (COI) gene. Comparison of the sequences of the Mauritanian samples with sequences of Fasciola spp. from GenBank confirmed that all samples belong to the species F. gigantica. The nucleotide sequencing of ITS rDNA of F. gigantica showed no nucleotide variation in the ITS-1, 5.8S, and ITS-2 rDNA sequences among all samples examined and those from Burkina Faso, Kenya, Egypt and Iran. The phylogenetic trees based on the ITS-1 and ITS-2 sequences showed a close relationship of the Mauritanian samples with isolates of F. gigantica from different localities of Africa and Asia. The COI genotypes of the Mauritanian specimens of F. gigantica had a high level of diversity, and they belonged to the F. gigantica phylogenetically distinguishable clade. The present study is the first molecular characterization of F. gigantica in sheep and cattle from Mauritania, allowing a reliable approach for the genetic differentiation of Fasciola spp. and providing a basis for further studies on liver flukes in the African countries. Copyright © 2011 Elsevier Inc. All

  9. Analysis of Sequence Diagram Layout in Advanced UML Modelling Tools

    Directory of Open Access Journals (Sweden)

    Ņikiforova Oksana

    2016-05-01

    System modelling using the Unified Modelling Language (UML) is a task that must be solved in software development. The more complex software becomes, the higher the requirements for representing the system to be developed, especially its dynamic aspect, which in UML is expressed by a sequence diagram. In solving this task, the main attention is devoted to the graphical presentation of the system, where diagram layout plays the central role in information perception. Because of its specific structure, the UML sequence diagram is selected for a deeper analysis of element layout. The authors' research presents the abilities of modern UML modelling tools to lay out the UML sequence diagram automatically and analyses them according to criteria required for diagram perception.

  10. Evolutionary analysis of hepatitis C virus gene sequences from 1953

    Science.gov (United States)

    Gray, Rebecca R.; Tanaka, Yasuhito; Takebe, Yutaka; Magiorkinis, Gkikas; Buskell, Zelma; Seeff, Leonard; Alter, Harvey J.; Pybus, Oliver G.

    2013-01-01

    Reconstructing the transmission history of infectious diseases in the absence of medical or epidemiological records often relies on the evolutionary analysis of pathogen genetic sequences. The precision of evolutionary estimates of epidemic history can be increased by the inclusion of sequences derived from ‘archived’ samples that are genetically distinct from contemporary strains. Historical sequences are especially valuable for viral pathogens that circulated for many years before being formally identified, including HIV and the hepatitis C virus (HCV). However, surprisingly few HCV isolates sampled before discovery of the virus in 1989 are currently available. Here, we report and analyse two HCV subgenomic sequences obtained from infected individuals in 1953, which represent the oldest genetic evidence of HCV infection. The pairwise genetic diversity between the two sequences indicates a substantial period of HCV transmission prior to the 1950s, and their inclusion in evolutionary analyses provides new estimates of the common ancestor of HCV in the USA. To explore and validate the evolutionary information provided by these sequences, we used a new phylogenetic molecular clock method to estimate the date of sampling of the archived strains, plus the dates of four more contemporary reference genomes. Despite the short fragments available, we conclude that the archived sequences are consistent with a proposed sampling date of 1953, although statistical uncertainty is large. Our cross-validation analyses suggest that the bias and low statistical power observed here likely arise from a combination of high evolutionary rate heterogeneity and an unstructured, star-like phylogeny. We expect that attempts to date other historical viruses under similar circumstances will meet similar problems. PMID:23938759

  11. Genotypic Characterization of Bradyrhizobium Strains Nodulating Endemic Woody Legumes of the Canary Islands by PCR-Restriction Fragment Length Polymorphism Analysis of Genes Encoding 16S rRNA (16S rDNA) and 16S-23S rDNA Intergenic Spacers, Repetitive Extragenic Palindromic PCR Genomic Fingerprinting, and Partial 16S rDNA Sequencing

    Science.gov (United States)

    Vinuesa, Pablo; Rademaker, Jan L. W.; de Bruijn, Frans J.; Werner, Dietrich

    1998-01-01

    We present a phylogenetic analysis of nine strains of symbiotic nitrogen-fixing bacteria isolated from nodules of tagasaste (Chamaecytisus proliferus) and other endemic woody legumes of the Canary Islands, Spain. These and several reference strains were characterized genotypically at different levels of taxonomic resolution by computer-assisted analysis of 16S ribosomal DNA (rDNA) PCR-restriction fragment length polymorphisms (PCR-RFLPs), 16S-23S rDNA intergenic spacer (IGS) RFLPs, and repetitive extragenic palindromic PCR (rep-PCR) genomic fingerprints with BOX, ERIC, and REP primers. Cluster analysis of 16S rDNA restriction patterns with four tetrameric endonucleases grouped the Canarian isolates with the two reference strains, Bradyrhizobium japonicum USDA 110spc4 and Bradyrhizobium sp. strain (Centrosema) CIAT 3101, resolving three genotypes within these bradyrhizobia. In the analysis of IGS RFLPs with three enzymes, six groups were found, whereas rep-PCR fingerprinting revealed an even greater genotypic diversity, with only two of the Canarian strains having similar fingerprints. Furthermore, we show that IGS RFLPs and even very dissimilar rep-PCR fingerprints can be clustered into phylogenetically sound groupings by combining them with 16S rDNA RFLPs in computer-assisted cluster analysis of electrophoretic patterns. The DNA sequence analysis of a highly variable 264-bp segment of the 16S rRNA genes of these strains was found to be consistent with the fingerprint-based classification. Three different DNA sequences were obtained, one of which was not previously described, and all belonged to the B. japonicum/Rhodopseudomonas rDNA cluster. Nodulation assays revealed that none of the Canarian isolates nodulated Glycine max or Leucaena leucocephala, but all nodulated Acacia pendula, C. proliferus, Macroptilium atropurpureum, and Vigna unguiculata. PMID:9603820

  12. GntR family of regulators in Mycobacterium smegmatis: a sequence and structure based characterization

    Directory of Open Access Journals (Sweden)

    Ranjan Akash

    2007-08-01

    Background: Mycobacterium smegmatis is a fast-growing, non-pathogenic mycobacterium. This organism has been widely used as a model organism to study the biology of other virulent and extremely slow growing species like Mycobacterium tuberculosis. Based on the homology of the N-terminal DNA binding domain, the recently sequenced genome of M. smegmatis has been shown to possess several putative GntR regulators. A striking characteristic feature of this family of regulators is that they possess a conserved N-terminal DNA binding domain and a diverse C-terminal domain involved in effector binding and/or oligomerization. Since the physiological role of these regulators is critically dependent upon effector binding and operator sites, we have analysed and classified these regulators into their specific subfamilies and identified their potential binding sites. Results: The sequence analysis of M. smegmatis putative GntRs has revealed that FadR, HutC, MocR and the YtrA-like regulators are encoded by 45, 8, 8 and 1 genes respectively. Further, out of 45 FadR-like regulators, 19 were classified into the FadR group and 26 into the VanR group. All these proteins showed similar secondary structural elements specific to their respective subfamilies except MSMEG_3959, which showed additional secondary structural elements. Using reciprocal BLAST searches, we further identified the orthologs of these regulators in Bacillus subtilis and other mycobacteria. Since the expression of many regulators is auto-regulatory, we have identified potential operator sites for a number of these GntR regulators by analyzing the upstream sequences. Conclusion: This study helps in extending the annotation of M. smegmatis GntR proteins. It identifies the GntR regulators of M. smegmatis that could serve as a model for studying orthologous regulators from virulent as well as other saprophytic mycobacteria. This study also sheds some light on the nucleotide preferences in the

  13. Transcriptome characterization of the South African abalone Haliotis midae using sequencing-by-synthesis

    Directory of Open Access Journals (Sweden)

    Roodt-Wilding Rouvay

    2011-03-01

    Background: Worldwide, the genus Haliotis is represented by 56 extant species and several of these are commercially cultured. Among the six abalone species found in South Africa, Haliotis midae is the only aquacultured species. Despite its economic importance, genomic sequence resources for H. midae, and for abalone in general, are still scarce. Next generation sequencing technologies provide a fast and efficient tool to generate large sequence collections that can be used to characterize the transcriptome and identify expressed genes associated with economically important traits like growth and disease resistance. Results: More than 25 million short reads generated by the Illumina Genome Analyzer were de novo assembled into 22,761 contigs with an average size of 260 bp. With a stringent E-value threshold of 10^-10, 3,841 contigs (16.8%) had a BLAST homologous match against the GenBank non-redundant (NR) protein database. Most of these sequences were annotated using the gene ontology (GO) and eukaryotic orthologous groups of proteins (KOG) databases and assigned to various functional categories. According to annotation results, many gene families involved in immune response were identified. Thousands of simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) were detected. Setting stringent parameters to ensure a high probability of amplification, 420 primer pairs in 181 contigs containing SSR loci were designed. Conclusion: These data represent the most comprehensive genomic resource for the South African abalone H. midae to date. The amount of assembled sequence demonstrated the utility of the Illumina sequencing technology in the transcriptome characterization of a non-model species. It allowed the development of several markers and the identification of promising candidate genes for future studies on population and functional genomics in H. midae and in other abalone species.

  14. Using SQL Databases for Sequence Similarity Searching and Analysis.

    Science.gov (United States)

    Pearson, William R; Mackey, Aaron J

    2017-09-13

    Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. Copyright © 2017 John Wiley & Sons, Inc.
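
    The idea of loading similarity search results into a relational database and summarizing homologs by taxon can be sketched as follows. This does not reproduce the seqdb_demo or search_demo schemas, which are not given in the abstract; the table names, column names, accessions and E-values below are all made-up toy data, and SQLite (via Python's standard sqlite3 module) is assumed only so that the example is self-contained.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical tables and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE protein (acc TEXT PRIMARY KEY, taxon TEXT NOT NULL);
        CREATE TABLE search_hit (
            query_acc   TEXT NOT NULL REFERENCES protein(acc),
            subject_acc TEXT NOT NULL REFERENCES protein(acc),
            evalue      REAL NOT NULL
        );
    """)
    conn.executemany("INSERT INTO protein VALUES (?, ?)", [
        ("ecoli_1", "E. coli"), ("ecoli_2", "E. coli"),
        ("human_1", "H. sapiens"),
        ("yeast_1", "S. cerevisiae"), ("yeast_2", "S. cerevisiae"),
    ])
    conn.executemany("INSERT INTO search_hit VALUES (?, ?, ?)", [
        ("ecoli_1", "human_1", 1e-30), ("ecoli_1", "yeast_1", 1e-12),
        ("ecoli_1", "yeast_2", 1e-9), ("ecoli_1", "ecoli_2", 1e-20),
        ("ecoli_2", "yeast_2", 1e-8), ("ecoli_2", "human_1", 0.5),
    ])
    return conn


def homolog_counts(conn, query_taxon, max_evalue=1e-6):
    """Count query-taxon proteins with a significant hit in each other taxon."""
    sql = """
        SELECT s.taxon, COUNT(DISTINCT h.query_acc)
        FROM search_hit AS h
        JOIN protein AS q ON q.acc = h.query_acc
        JOIN protein AS s ON s.acc = h.subject_acc
        WHERE q.taxon = ? AND s.taxon <> ? AND h.evalue <= ?
        GROUP BY s.taxon
        ORDER BY s.taxon
    """
    return conn.execute(sql, (query_taxon, query_taxon, max_evalue)).fetchall()
```

    Keeping hits in a table lets the E-value cutoff and the taxonomic grouping be changed in the query rather than by re-running the search, which is the efficiency argument the abstract makes.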

  15. Systematic characterization of Bacillus Genetic Stock Center Bacillus thuringiensis strains using Multi-Locus Sequence Typing.

    Science.gov (United States)

    Wang, Kui; Shu, Changlong; Soberón, Mario; Bravo, Alejandra; Zhang, Jie

    2018-04-30

    The goal of this work was to perform a systematic characterization of Bacillus thuringiensis (Bt) strains from the Bacillus Genetic Stock Center (BGSC) collection using Multi-Locus Sequence Typing (MLST). Different genetic markers were analyzed in 158 Bt strains from 73 different serovars stored in the BGSC, representing 92% of the Bt serovars in the BGSC; the remaining 8% were not available. In addition, we analyzed 72 Bt strains from 18 serovars available at the pubMLST bcereus database, and Bt strains G03, HBF18 and Bt185, with no H serovars, provided by our laboratory. We performed a systematic MLST analysis using seven housekeeping genes (glpF, gmK, ilvD, pta, pur, pycA and tpi) and analyzed the correlation of the results of this analysis with strain serovars. The 233 Bt strains analyzed were assigned to 119 STs, of which 19 STs were new. Genetic relationships were established by phylogenetic analysis and showed that STs could be grouped into two major clusters containing 21 sub-groups. We found that a significant number of STs (101 in total) correlated with specific serovars, such as ST13, which corresponded to nine Bt isolates from B. thuringiensis serovar kenyae. However, other serovars showed high genetic variability and correlated with multiple STs; for example, B. thuringiensis serovar morrisoni correlated with 11 different STs. In addition, we found that 16 different STs correlated with multiple serovars (2-4 different serovars); for example, ST12 correlated with B. thuringiensis serovar alesti, dakota, palmanyolensis and sotto/dendrolimus. These data indicated that only partial correspondence between MLST and serotyping can be established. Copyright © 2018 Elsevier Inc. All rights reserved.
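
    The MLST bookkeeping described in this abstract, where each strain's allele numbers at the seven loci form a profile that maps to a sequence type (ST) and STs are then compared with serovars, can be sketched in Python. The function names and all allele numbers below are hypothetical; in practice new ST numbers are issued by the curated MLST database, so the locally incremented numbers here are placeholders only.

```python
from collections import defaultdict

# Locus names as listed in the abstract.
LOCI = ("glpF", "gmK", "ilvD", "pta", "pur", "pycA", "tpi")


def assign_sts(profiles, known_sts):
    """Map each strain's allele profile to an ST.

    profiles:  dict of strain name -> tuple of allele numbers, one per locus.
    known_sts: dict of allele profile -> existing ST number.
    Unseen profiles get the next free number as a local placeholder.
    Returns (ST by strain, set of placeholder STs created here).
    """
    table = dict(known_sts)
    next_st = max(table.values(), default=0) + 1
    st_by_strain, new_sts = {}, set()
    for strain in sorted(profiles):
        profile = tuple(profiles[strain])
        if len(profile) != len(LOCI):
            raise ValueError(f"{strain}: expected {len(LOCI)} alleles")
        if profile not in table:
            table[profile] = next_st
            new_sts.add(next_st)
            next_st += 1
        st_by_strain[strain] = table[profile]
    return st_by_strain, new_sts


def serovars_per_st(st_by_strain, serovar_by_strain):
    """Group serovars by ST to show where MLST and serotyping agree."""
    grouped = defaultdict(set)
    for strain, st in st_by_strain.items():
        grouped[st].add(serovar_by_strain[strain])
    return {st: sorted(names) for st, names in grouped.items()}
```

    An ST that maps to a single serovar corresponds to the one-to-one cases reported in the abstract, while an ST listing several serovars corresponds to the partial-correspondence cases.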

  16. Sequencing and characterizing the genome of Estrella lausannensis as an undergraduate project: training students and biological insights

    Directory of Open Access Journals (Sweden)

    Claire Bertelli

    2015-02-01

    With the widespread availability of high-throughput sequencing technologies, sequencing projects have become pervasive in the molecular life sciences. The huge bulk of data generated daily must be analyzed further by biologists with skills in bioinformatics and by embedded bioinformaticians, i.e., bioinformaticians integrated in wet lab research groups. Thus, students interested in molecular life sciences must be trained in the main steps of genomics: sequencing, assembly, annotation and analysis. To reach that goal, a practical course has been set up for master students at the University of Lausanne: the "Sequence a genome" class. At the beginning of the academic year, a few bacterial species whose genome is unknown are provided to the students, who sequence and assemble the genome(s) and perform manual annotation. Here, we report the progress of the first class from September 2010 to June 2011 and the results obtained by seven master students who specifically assembled and annotated the genome of Estrella lausannensis, an obligate intracellular bacterium related to Chlamydia. The draft genome of Estrella is composed of 29 scaffolds encompassing 2,819,825 bp that encode 2,233 putative proteins. Estrella also possesses a 9,136 bp plasmid that encodes 14 genes, among which we found an integrase and a toxin/antitoxin module. Like all other members of the Chlamydiales order, Estrella possesses a highly conserved type III secretion system, considered a key virulence factor. The annotation of the Estrella genome also allowed the characterization of the metabolic abilities of this strictly intracellular bacterium. Altogether, the students provided the scientific community with the Estrella genome sequence and a preliminary understanding of the biology of this recently discovered bacterial genus, while learning to use cutting-edge technologies for sequencing and to perform bioinformatics analyses.

  17. Now And Next Generation Sequencing Techniques: Future of Sequence Analysis using Cloud Computing

    Directory of Open Access Journals (Sweden)

    Radhe Shyam Thakur

    2012-12-01

    Full Text Available Advances in sequencing techniques have led to huge volumes of sequence data being produced at an ever faster rate. Maintaining the resulting databases is becoming cumbersome for datacenters. Data mining and sequence analysis approaches need to analyze the databases several times to reach any efficient conclusion. To cope with this burden on computer resources and to reach efficient and effective conclusions quickly, the virtualization of resources and computation on a pay-as-you-go basis was introduced and termed cloud computing. The datacenter’s hardware and software are collectively known as a cloud, which, when available publicly, is termed a public cloud. The datacenter’s resources are provided in a virtual mode to clients via a service provider such as Amazon, Google or Joyent, which charges on a pay-as-you-go basis. The workload is shifted to the provider, which maintains the required hardware and software upgrades. The service provider manages this by upgrading the resources in the virtual mode. Basically, a virtual environment is created according to the user's needs by requesting it from the datacenter via the internet; the task is performed and the environment is deleted after the task is over. In this discussion, we focus on the basics of cloud computing, the prerequisites and the overall working of clouds. Furthermore, the applications of cloud computing in biological systems, especially in comparative genomics, genome informatics and SNP detection, are briefly discussed with reference to the traditional workflow.

  18. Now and next-generation sequencing techniques: future of sequence analysis using cloud computing.

    Science.gov (United States)

    Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav

    2012-01-01

    Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed "cloud computing") has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows.

  19. SEQUENCING AND SEQUENCE ANALYSIS OF MYOSTATIN GENE IN THE EXON 1 OF THE CAMEL (CAMELUS DROMEDARIUS)

    Directory of Open Access Journals (Sweden)

    M. G. SHAH, A. S. QURESHI, M. REISSMANN AND H. J. SCHWARTZ

    2006-10-01

    Full Text Available Myostatin, also called growth differentiation factor-8 (GDF-8), is a member of the mammalian transforming growth factor-beta (TGF-beta) superfamily and is expressed specifically in developing and adult skeletal muscle. The muscular hypertrophy (mh) allele in double-muscled breeds involves a mutation within the myostatin gene. Genomic DNA was isolated from camel hair using the NucleoSpin Tissue kit. Two animals from each of six breeds, namely Marecha, Dhatti, Larri, Kohi, Sakrai and Cambelpuri, were used for sequencing. For PCR amplification of the gene, a primer pair was designed from homologous regions of already published sequences of farm animals from GenBank. Results showed that camel myostatin possessed more than 90% homology with that of cattle, sheep and pig. Camel formed a separate cluster from the pig in spite of high homology (98%) and showed 94% homology with cattle and sheep, as reported in the literature. Sequence analysis showed that the PCR-amplified part of exon 1 (256 bp) of the camel myostatin gene was identical among the six camel breeds.

  20. An Imaging And Graphics Workstation For Image Sequence Analysis

    Science.gov (United States)

    Mostafavi, Hassan

    1990-01-01

    This paper describes an application-specific engineering workstation designed and developed to analyze imagery sequences from a variety of sources. The system combines the software and hardware environment of the modern graphic-oriented workstations with the digital image acquisition, processing and display techniques. The objective is to achieve automation and high throughput for many data reduction tasks involving metric studies of image sequences. The applications of such an automated data reduction tool include analysis of the trajectory and attitude of aircraft, missile, stores and other flying objects in various flight regimes including launch and separation as well as regular flight maneuvers. The workstation can also be used in an on-line or off-line mode to study three-dimensional motion of aircraft models in simulated flight conditions such as wind tunnels. The system's key features are: 1) Acquisition and storage of image sequences by digitizing real-time video or frames from a film strip; 2) computer-controlled movie loop playback, slow motion and freeze frame display combined with digital image sharpening, noise reduction, contrast enhancement and interactive image magnification; 3) multiple leading edge tracking in addition to object centroids at up to 60 fields per second from both live input video or a stored image sequence; 4) automatic and manual field-of-view and spatial calibration; 5) image sequence data base generation and management, including the measurement data products; 6) off-line analysis software for trajectory plotting and statistical analysis; 7) model-based estimation and tracking of object attitude angles; and 8) interface to a variety of video players and film transport sub-systems.

  1. Multilocus sequence analysis of Treponema denticola strains of diverse origin

    Directory of Open Access Journals (Sweden)

    Mo Sisu

    2013-02-01

    Full Text Available Abstract Background The oral spirochete bacterium Treponema denticola is associated with both the incidence and severity of periodontal disease. Although the biological or phenotypic properties of a significant number of T. denticola isolates have been reported in the literature, their genetic diversity or phylogeny has never been systematically investigated. Here, we describe a multilocus sequence analysis (MLSA) of 20 of the most highly studied reference strains and clinical isolates of T. denticola, which were originally isolated from subgingival plaque samples taken from subjects from China, Japan, the Netherlands, Canada and the USA. Results The sequences of the 16S ribosomal RNA gene and 7 conserved protein-encoding genes (flaA, recA, pyrH, ppnK, dnaN, era and radC) were successfully determined for each strain. Sequence data was analyzed using a variety of bioinformatic and phylogenetic software tools. We found no evidence of positive selection or DNA recombination within the protein-encoding genes, where levels of intraspecific sequence polymorphism varied from 18.8% (flaA) to 8.9% (dnaN). Phylogenetic analysis of the concatenated protein-encoding gene sequence data (ca. 6,513 nucleotides for each strain) using Bayesian and maximum likelihood approaches indicated that the T. denticola strains were monophyletic, and formed 6 well-defined clades. All analyzed T. denticola strains appeared to have a genetic origin distinct from that of ‘Treponema vincentii’ or Treponema pallidum. No specific geographical relationships could be established, but several strains isolated from different continents appear to be closely related at the genetic level. Conclusions Our analyses indicate that previous biological and biophysical investigations have predominantly focused on a subset of T. denticola strains with a relatively narrow range of genetic diversity. Our methodology and results establish a genetic framework for the discrimination and phylogenetic

  2. Sirius PSB: a generic system for analysis of biological sequences.

    Science.gov (United States)

    Koh, Chuan Hock; Lin, Sharene; Jedd, Gregory; Wong, Limsoon

    2009-12-01

    Computational tools are essential components of modern biological research. For example, BLAST searches can be used to identify related proteins based on sequence homology, or when a new genome is sequenced, prediction models can be used to annotate functional sites such as transcription start sites, translation initiation sites and polyadenylation sites and to predict protein localization. Here we present Sirius Prediction Systems Builder (PSB), a new computational tool for sequence analysis, classification and searching. Sirius PSB has four main operations: (1) Building a classifier, (2) Deploying a classifier, (3) Search for proteins similar to query proteins, (4) Preliminary and post-prediction analysis. Sirius PSB supports all these operations via a simple and interactive graphical user interface. Besides being a convenient tool, Sirius PSB has also introduced two novelties in sequence analysis. Firstly, genetic algorithm is used to identify interesting features in the feature space. Secondly, instead of the conventional method of searching for similar proteins via sequence similarity, we introduced searching via features' similarity. To demonstrate the capabilities of Sirius PSB, we have built two prediction models - one for the recognition of Arabidopsis polyadenylation sites and another for the subcellular localization of proteins. Both systems are competitive against current state-of-the-art models based on evaluation of public datasets. More notably, the time and effort required to build each model is greatly reduced with the assistance of Sirius PSB. Furthermore, we show that under certain conditions when BLAST is unable to find related proteins, Sirius PSB can identify functionally related proteins based on their biophysical similarities. Sirius PSB and its related supplements are available at: http://compbio.ddns.comp.nus.edu.sg/~sirius.

  3. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    Directory of Open Access Journals (Sweden)

    Lunner Sigbjørn

    2009-10-01

    Full Text Available Abstract Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to being helpful in distinguishing between duplicate genome regions and in determining correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full-length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs, polyadenylation signal variation and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This

  4. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    Science.gov (United States)

    Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn

    2009-01-01

    Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to being helpful in distinguishing between duplicate genome regions and in determining correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full-length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs, polyadenylation signal variation and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS).
This suggests that the remaining c

  5. Whole genome sequence phylogenetic analysis of four Mexican rabies viruses isolated from cattle.

    Science.gov (United States)

    Bárcenas-Reyes, I; Loza-Rubio, E; Cantó-Alarcón, G J; Luna-Cozar, J; Enríquez-Vázquez, A; Barrón-Rodríguez, R J; Milián-Suazo, F

    2017-08-01

    Phylogenetic analysis of the rabies virus in molecular epidemiology has traditionally been performed on partial sequences of the genome, such as the N, G, and P genes; however, that approach raises concerns about its discriminatory power compared to whole genome sequencing. In this study we characterized four strains of the rabies virus isolated from cattle in Querétaro, Mexico by comparing the whole genome sequence to that of strains from the American, European and Asian continents. Four cattle brain samples positive for rabies and characterized as AgV11, genotype 1, were used in the study. A cDNA sequence was generated by reverse transcription PCR (RT-PCR) using oligo dT. cDNA samples were sequenced on an Illumina NextSeq 500 platform. The phylogenetic analysis was performed with MEGA 6.0. Minimum evolution phylogenetic trees were constructed with the Neighbor-Joining method and bootstrapped with 1000 replicates. Three large and seven small clusters were formed with the 26 sequences used. The largest cluster grouped strains from different species in South America: Brazil and French Guiana. The second cluster grouped five strains from Mexico. A Mexican strain reported in a different study was highly related to our four strains, suggesting a common source of infection. The phylogenetic analysis shows that the type of host is different for the different regions in the American Continent; rabies is more related to bats. It was concluded that the rabies virus in central Mexico is genetically stable and that it is transmitted by the vampire bat Desmodus rotundus. Copyright © 2017 Elsevier Ltd. All rights reserved.

  6. Short Communication Phylogenetic Characterization of HIV Type 1 CRF01_AE V3 Envelope Sequences in Pregnant Women in Northern Vietnam

    Science.gov (United States)

    Caridha, Rozina; Ha, Tran Thi Thanh; Gaseitsiwe, Simani; Hung, Pham Viet; Anh, Nguyen Mai; Bao, Nguyen Huy; Khang, Dinh Duy; Hien, Nguyen Tran; Cam, Phung Dac; Chiodi, Francesca

    2012-01-01

    Abstract Characterization of HIV-1 strains is important for surveillance of the HIV-1 epidemic. In Vietnam HIV-1-infected pregnant women often fail to receive the care they are entitled to. Here, we analyzed phylogenetically HIV-1 env sequences from 37 HIV-1-infected pregnant women from Ha Noi (n=22) and Hai Phong (n=15), where they delivered in 2005–2007. All carried CRF01_AE in the gp120 V3 region. In 21 women CRF01_AE was also found in the reverse transcriptase gene. We compared their env gp120 V3 sequences phylogenetically in a maximum likelihood tree to those of 198 other CRF01_AE sequences in Vietnam and 229 from neighboring countries, predominantly Thailand, from the HIV-1 database. Altogether 464 sequences were analyzed. All but one of the maternal sequences colocalized with sequences from northern Vietnam. The maternal sequences had evolved the least when compared to sequences collected in Ha Noi in 2002, as shown by analysis of synonymous and nonsynonymous changes, than to other Vietnamese sequences collected earlier and/or elsewhere. Since the HIV-1 epidemic in women in Vietnam may still be underestimated, characterization of HIV-1 in pregnant women is important to observe how HIV-1 has evolved and follow its molecular epidemiology. PMID:21936713

  7. Characterization of mutagen-activated cellular oncogenes that confer anchorage independence to human fibroblasts and tumorigenicity to NIH 3T3 cells: Sequence analysis of an enzymatically amplified mutant HRAS allele

    International Nuclear Information System (INIS)

    Stevens, C.W.; Manoharan, T.H.; Fahl, W.E.

    1988-01-01

    Treatment of diploid human fibroblasts with an alkylating mutagen has been shown to induce stable, anchorage-independent cell populations at frequencies consistent with an activating mutation. After treatment of human foreskin fibroblasts with the mutagen benzo[a]pyrene (±)anti-7,8-dihydrodiol 9,10-epoxide and selection in soft agar, 17 anchorage-independent clones were isolated and expanded, and their cellular DNA was used to cotransfect NIH 3T3 cells along with pSV2neo. DNA from 11 of the 17 clones induced multiple NIH 3T3 cell tumors in recipient nude mice. Southern blot analyses showed the presence of human Alu repetitive sequences in all of the NIH 3T3 tumor cell DNAs. Intact, human HRAS sequences were observed in 2 of the 11 tumor groups, whereas no hybridization was detected when human KRAS or NRAS probes were used. Slow-migrating ras p21 proteins, consistent with codon 12 mutations, were observed in the same two NIH 3T3 tumor cell groups that contained the human HRAS bands. Genomic DNA from one of these two human anchorage-independent cell populations (clone 21A) was used to enzymatically amplify a portion of exon 1 of the HRAS gene. The results demonstrate that exposure of normal human cells to a common environmental mutagen yields HRAS GC → TA codon 12 transversions that have been commonly observed in human tumors.

  8. Transcriptome sequencing and characterization for the sea cucumber Apostichopus japonicus (Selenka, 1867).

    Directory of Open Access Journals (Sweden)

    Huixia Du

    Full Text Available BACKGROUND: Sea cucumbers are a special group of marine invertebrates. They occupy a taxonomic position that is believed to be important for understanding the origin and evolution of deuterostomes. Some of them, such as Apostichopus japonicus, represent commercially important aquaculture species in Asian countries. Many efforts have been devoted to increasing the number of expressed sequence tags (ESTs) for A. japonicus, but a comprehensive characterization of its transcriptome remains lacking. Here, we performed large-scale transcriptome profiling and characterization by pyrosequencing diverse cDNA libraries from A. japonicus. RESULTS: In total, 1,061,078 reads were obtained by 454 sequencing of eight cDNA libraries representing different developmental stages and adult tissues in A. japonicus. These reads were assembled into 29,666 isotigs, which were further clustered into 21,071 isogroups. Nearly 40% of the isogroups showed significant matches to known proteins based on sequence similarity. Gene ontology (GO) and KEGG pathway analyses recovered diverse biological functions and processes. Candidate genes that were potentially involved in aestivation were identified. Transcriptome comparison with the sea urchin Strongylocentrotus purpuratus revealed similar patterns of GO term representation. In addition, 4,882 putative orthologous genes were identified, of which 202 were not present in the non-echinoderm organisms. More than 700 simple sequence repeats (SSRs) and 54,000 single nucleotide polymorphisms (SNPs) were detected in the A. japonicus transcriptome. CONCLUSION: Pyrosequencing was proven to be efficient in rapidly identifying a large set of genes for the sea cucumber A. japonicus. Through the large-scale transcriptome sequencing as well as public EST data integration, we performed a comprehensive characterization of the A. japonicus transcriptome and identified candidate aestivation-related genes. A large number of potential genetic

  9. CISAPS: Complex Informational Spectrum for the Analysis of Protein Sequences

    Directory of Open Access Journals (Sweden)

    Charalambos Chrysostomou

    2015-01-01

    Full Text Available Complex informational spectrum analysis for protein sequences (CISAPS) and its web-based server are developed and presented. Recent studies show that using only the absolute spectrum in informational spectrum analysis of protein sequences is insufficient. Therefore, CISAPS is developed to consider and provide results in three forms: the absolute, real, and imaginary spectra. Biologically relevant features, as shown in the case study of influenza A subtypes presented here, can also appear individually in either the real or the imaginary spectrum. The results show that protein classes can exhibit similarities or differences according to the features extracted from the CISAPS web server. These associations are likely to be related to the protein feature that the specific amino acid index represents. In addition, various technical issues that may affect the analysis, such as zero-padding and windowing, are also addressed. CISAPS performs the analysis using an expanded list of 611 unique amino acid indices, each representing a different property. This web-based server enables researchers with little knowledge of signal processing methods to apply complex informational spectrum analysis in their work.
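
The idea of reporting absolute, real, and imaginary spectra can be illustrated with a minimal Python sketch. It is not the CISAPS implementation: the index values in `TOY_INDEX`, the choice of a Hamming window, and all function names are assumptions made for illustration. A sequence is encoded numerically with an amino acid index, optionally windowed and zero-padded, and passed through a discrete Fourier transform.

```python
import cmath
import math

# Hypothetical amino acid index with made-up values (NOT one of the 611 CISAPS indices).
TOY_INDEX = {"A": 0.1, "C": 0.2, "G": 0.3, "K": 0.4, "L": 0.5}


def encode(seq, index):
    """Map each residue to its numeric index value."""
    return [index[aa] for aa in seq]


def hamming(n):
    """Hamming window of length n (one common windowing choice, assumed here)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrum(signal, pad_to=None, window=False):
    """Naive O(n^2) DFT of a numeric signal, with optional windowing and zero-padding."""
    x = list(signal)
    if window:
        x = [v * w for v, w in zip(x, hamming(len(x)))]
    if pad_to is not None and pad_to > len(x):
        x = x + [0.0] * (pad_to - len(x))  # zero-padding to a common length
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def three_forms(coeffs):
    """Split complex DFT coefficients into absolute, real, and imaginary spectra."""
    return {
        "absolute": [abs(c) for c in coeffs],
        "real": [c.real for c in coeffs],
        "imaginary": [c.imag for c in coeffs],
    }


if __name__ == "__main__":
    forms = three_forms(spectrum(encode("ACGKL", TOY_INDEX), pad_to=8, window=True))
    for name, values in forms.items():
        print(name, [round(v, 3) for v in values])
```

For the signal `[0, 1, 0, 0]` the absolute spectrum is flat (every value is 1), while the real and imaginary parts differ from frequency to frequency, which is a small-scale picture of why the absolute spectrum alone can hide information.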

  10. Characterization of bud emergence 46 (BEM46) protein: Sequence, structural, phylogenetic and subcellular localization analyses

    International Nuclear Information System (INIS)

    Kumar, Abhishek; Kollath-Leiß, Krisztina; Kempken, Frank

    2013-01-01

    Highlights: •All eukaryotes have at least a single copy of a bem46 ortholog. •The catalytic triad of BEM46 is illustrated using sequence and structural analysis. •We identified indels in the conserved domain of BEM46 protein. •Localization studies of BEM46 protein were carried out using GFP-fusion tagging. -- Abstract: The bud emergence 46 (BEM46) protein from Neurospora crassa belongs to the α/β-hydrolase superfamily. Recently, we have reported that the BEM46 protein is localized in the perinuclear ER and also forms spots close to the plasma membrane. The protein appears to be required for cell type-specific polarity formation in N. crassa. Furthermore, initial studies suggested that the BEM46 amino acid sequence is conserved in eukaryotes and is considered to be one of the widespread conserved “known unknown” eukaryotic genes. This warrants a comprehensive phylogenetic analysis of this superfamily to unravel the origin and molecular evolution of these genes in different eukaryotes. Herein, we observe that all eukaryotes have at least a single copy of a bem46 ortholog. Upon scanning of these proteins in various genomes, we find that there are expansions leading to several paralogs in vertebrates. Using comparative genomic analyses, we identified insertions/deletions (indels) in the conserved domain of BEM46 protein, which allow differentiation of fungal classes such as ascomycetes from basidiomycetes. We also find that exonic indels are able to differentiate BEM46 homologs of different eukaryotic lineages. Furthermore, using GFP-fusion tagging experiments, we reveal that BEM46 protein from N. crassa possesses a novel endoplasmic-retention signal (PEKK). We propose that three residues, namely a serine 188S, a histidine 292H and an aspartic acid 262D, are the most critical residues, forming a catalytic triad in BEM46 protein from N. crassa. We carried out a comprehensive study on bem46 genes from a molecular evolution perspective with combination of functional

  11. Characterization of bud emergence 46 (BEM46) protein: Sequence, structural, phylogenetic and subcellular localization analyses

    Energy Technology Data Exchange (ETDEWEB)

    Kumar, Abhishek; Kollath-Leiß, Krisztina; Kempken, Frank, E-mail: fkempken@bot.uni-kiel.de

    2013-08-30

    Highlights: •All eukaryotes have at least a single copy of a bem46 ortholog. •The catalytic triad of BEM46 is illustrated using sequence and structural analysis. •We identified indels in the conserved domain of BEM46 protein. •Localization studies of BEM46 protein were carried out using GFP-fusion tagging. -- Abstract: The bud emergence 46 (BEM46) protein from Neurospora crassa belongs to the α/β-hydrolase superfamily. Recently, we have reported that the BEM46 protein is localized in the perinuclear ER and also forms spots close to the plasma membrane. The protein appears to be required for cell type-specific polarity formation in N. crassa. Furthermore, initial studies suggested that the BEM46 amino acid sequence is conserved in eukaryotes and is considered to be one of the widespread conserved “known unknown” eukaryotic genes. This warrants a comprehensive phylogenetic analysis of this superfamily to unravel the origin and molecular evolution of these genes in different eukaryotes. Herein, we observe that all eukaryotes have at least a single copy of a bem46 ortholog. Upon scanning of these proteins in various genomes, we find that there are expansions leading to several paralogs in vertebrates. Using comparative genomic analyses, we identified insertions/deletions (indels) in the conserved domain of BEM46 protein, which allow differentiation of fungal classes such as ascomycetes from basidiomycetes. We also find that exonic indels are able to differentiate BEM46 homologs of different eukaryotic lineages. Furthermore, using GFP-fusion tagging experiments, we reveal that BEM46 protein from N. crassa possesses a novel endoplasmic-retention signal (PEKK). We propose that three residues, namely a serine 188S, a histidine 292H and an aspartic acid 262D, are the most critical residues, forming a catalytic triad in BEM46 protein from N. crassa. We carried out a comprehensive study on bem46 genes from a molecular evolution perspective with combination of functional

  12. Integrated mRNA and microRNA transcriptome sequencing characterizes sequence variants and mRNA–microRNA regulatory network in nasopharyngeal carcinoma model systems

    Directory of Open Access Journals (Sweden)

    Carol Ying-Ying Szeto

    2014-01-01

    Full Text Available Nasopharyngeal carcinoma (NPC) is a prevalent malignancy in Southeast Asia among the Chinese population. Aberrant regulation of transcripts has been implicated in many types of cancers including NPC. Herein, we characterized mRNA and miRNA transcriptomes by RNA sequencing (RNASeq) of NPC model systems. Matched total mRNA and small RNA of undifferentiated Epstein–Barr virus (EBV)-positive NPC xenograft X666 and its derived cell line C666, well-differentiated NPC cell line HK1, and the immortalized nasopharyngeal epithelial cell line NP460 were sequenced by Solexa technology. We found 2812 genes and 149 miRNAs (human and EBV) to be differentially expressed in NP460, HK1, C666 and X666 with RNASeq; 533 miRNA–mRNA target pairs were inversely regulated in the three NPC cell lines compared to NP460. Integrated mRNA/miRNA expression profiling and pathway analysis show that extracellular matrix organization, Beta-1 integrin cell surface interactions, and the PI3K/AKT, EGFR, ErbB, and Wnt pathways were potentially deregulated in NPC. Real-time quantitative PCR was performed on selected mRNA/miRNAs in order to validate their expression. Transcript sequence variants such as short insertions and deletions (INDEL), single nucleotide variants (SNV), and isomiRs were characterized in the NPC model systems. A novel TP53 transcript variant was identified in NP460, HK1, and C666. Three previously reported novel EBV-encoded BART miRNAs and their isomiRs were also detected. Meta-analysis of a model system to a clinical system aids the choice of different cell lines in NPC studies. This comprehensive characterization of mRNA and miRNA transcriptomes in NPC cell lines and the xenograft provides insights on miRNA regulation of mRNA and valuable resources on transcript variation and regulation in NPC, which are potentially useful for mechanistic and preclinical studies.
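
The notion of "inversely regulated" miRNA–mRNA target pairs can be sketched in a few lines of Python. The abstract does not state how the pairs were selected, so the rule below (opposite signs of log fold-change relative to a control), the function name, and all identifiers and values in the example are assumptions for illustration only.

```python
def inverse_pairs(mirna_lfc, mrna_lfc, targets):
    """Return predicted (miRNA, mRNA) pairs whose log fold-changes have opposite signs.

    mirna_lfc, mrna_lfc: dicts mapping an identifier to a log2 fold-change versus control.
    targets: iterable of (miRNA, mRNA) predicted target pairs.
    """
    selected = []
    for mir, gene in targets:
        if mir not in mirna_lfc or gene not in mrna_lfc:
            continue  # skip pairs without expression data on both sides
        if mirna_lfc[mir] * mrna_lfc[gene] < 0:  # one up, the other down
            selected.append((mir, gene))
    return selected


if __name__ == "__main__":
    # Hypothetical identifiers and values, not data from the study.
    mirna = {"miR-x": 2.1, "miR-y": -1.5}
    mrna = {"GENE1": -1.8, "GENE2": 0.9, "GENE3": -0.7}
    predicted = [("miR-x", "GENE1"), ("miR-x", "GENE2"), ("miR-y", "GENE3"), ("miR-y", "GENE2")]
    print(inverse_pairs(mirna, mrna, predicted))
```

A real analysis would typically also require each change to pass a significance threshold before a pair is counted; that filter is omitted here to keep the sketch short.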

  13. In Vivo Enhancer Analysis Chromosome 16 Conserved Noncoding Sequences

    Energy Technology Data Exchange (ETDEWEB)

    Pennacchio, Len A.; Ahituv, Nadav; Moses, Alan M.; Nobrega, Marcelo; Prabhakar, Shyam; Shoukry, Malak; Minovitsky, Simon; Visel, Axel; Dubchak, Inna; Holt, Amy; Lewis, Keith D.; Plajzer-Frick, Ingrid; Akiyama, Jennifer; De Val, Sarah; Afzal, Veena; Black, Brian L.; Couronne, Olivier; Eisen, Michael B.; Rubin, Edward M.

    2006-02-01

    The identification of enhancers with predicted specificities in vertebrate genomes remains a significant challenge that is hampered by a lack of experimentally validated training sets. In this study, we leveraged extreme evolutionary sequence conservation as a filter to identify putative gene regulatory elements and characterized the in vivo enhancer activity of human-fish conserved and ultraconserved noncoding elements on human chromosome 16 as well as such elements from elsewhere in the genome. We initially tested 165 of these extremely conserved sequences in a transgenic mouse enhancer assay and observed that 48 percent (79/165) functioned reproducibly as tissue-specific enhancers of gene expression at embryonic day 11.5. While driving expression in a broad range of anatomical structures in the embryo, the majority of the 79 enhancers drove expression in various regions of the developing nervous system. Studying a set of DNA elements that specifically drove forebrain expression, we identified DNA signatures specifically enriched in these elements and used these parameters to rank all ~3,400 human-fugu conserved noncoding elements in the human genome. The testing of the top predictions in transgenic mice resulted in a three-fold enrichment for sequences with forebrain enhancer activity. These data dramatically expand the catalogue of in vivo-characterized human gene enhancers and illustrate the future utility of such training sets for a variety of biological applications including decoding the regulatory vocabulary of the human genome.

  14. Cloning and sequence analysis of sucrose phosphate synthase gene from varieties of Pennisetum species.

    Science.gov (United States)

    Li, H C; Lu, H B; Yang, F Y; Liu, S J; Bai, C J; Zhang, Y W

    2015-03-31

    Sucrose phosphate synthase (SPS) is an enzyme used by higher plants for sucrose synthesis. In this study, three primer sets were designed on the basis of known SPS sequences from maize (GenBank: NM_001112224.1) and sugarcane (GenBank: JN584485.1), and five novel SPS genes were identified by RT-PCR from the genomes of Pennisetum spp (the hybrid P. americanum x P. purpureum, P. purpureum Schum., P. purpureum Schum. cv. Red, P. purpureum Schum. cv. Taiwan, and P. purpureum Schum. cv. Mott). The cloned sequences showed 99.9% identity and 80-88% similarity to the SPS sequences of other plants. The SPS gene of hybrid Pennisetum had one nucleotide and four amino acid polymorphisms compared to the other four germplasms, and cluster analysis was performed to assess genetic diversity in this species. Additional characterization of the SPS gene product can potentially allow Pennisetum to be exploited as a biofuel source.

  15. CAFE: aCcelerated Alignment-FrEe sequence analysis.

    Science.gov (United States)

    Lu, Yang Young; Tang, Kujin; Ren, Jie; Fuhrman, Jed A; Waterman, Michael S; Sun, Fengzhu

    2017-07-03

    Alignment-free genome and metagenome comparisons are increasingly important with the development of next generation sequencing (NGS) technologies. Recently developed state-of-the-art k-mer based alignment-free dissimilarity measures including CVTree, $d_2^*$ and $d_2^S$ are more computationally expensive than measures based solely on the k-mer frequencies. Here, we report a standalone software, aCcelerated Alignment-FrEe sequence analysis (CAFE), for efficient calculation of 28 alignment-free dissimilarity measures. CAFE allows for both assembled genome sequences and unassembled NGS shotgun reads as input, and wraps the output in a standard PHYLIP format. In downstream analyses, CAFE can also be used to visualize the pairwise dissimilarity measures, including dendrograms, heatmap, principal coordinate analysis and network display. CAFE serves as a general k-mer based alignment-free analysis platform for studying the relationships among genomes and metagenomes, and is freely available at https://github.com/younglululu/CAFE. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
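
    As a rough illustration of the simplest class of measures the abstract mentions (those based solely on k-mer frequencies), the sketch below compares two sequences by the Euclidean distance between their k-mer frequency vectors. This is not CAFE's code and it does not implement CVTree, $d_2^*$ or $d_2^S$, which additionally need a background model; the function names and the default k are assumptions made for the example.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, k):
    """Relative frequencies of all 4**k DNA k-mers, in lexicographic order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing characters other than A, C, G, T are ignored.
    total = sum(counts[w] for w in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[w] / total for w in kmers]


def euclidean_dissimilarity(seq_a, seq_b, k=2):
    """Euclidean distance between the k-mer frequency vectors of two sequences."""
    fa = kmer_frequencies(seq_a, k)
    fb = kmer_frequencies(seq_b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))
```

    Because the comparison only needs k-mer counts, the same function applies to an assembled genome or to concatenated reads, which is the property that makes alignment-free measures attractive for unassembled data.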

  16. Sedimentology, Sequence Stratigraphy and Reservoir Characterization of Samana Suk Formation Exposed in Namal Gorge Section, Salt Range, Mianwali, Punjab, Pakistan

    Directory of Open Access Journals (Sweden)

    Muhammad Hayat

    2016-06-01

    Samana Suk Formation of Bathonian-Callovian age, exposed in Namal Gorge, Salt Range, has been studied for microfacies and sequence stratigraphic investigation. The formation is mainly composed of limestone, with minor beds of sandstone and marl. The limestone is grey, yellowish and purple in color. Limestone is fine grained, thin to medium bedded and inter-bedded with algal laminations. The sandstone is light yellowish brown, brick red in color, calcareous and quartzose. Within Samana Suk Formation one 2nd-order sequence and two 3rd-order sequences have been identified. Their regional correlation through fine-tuned dating helped to develop a basin fill model and to understand facies dynamics. A facies belt comprising a wide belt of carbonate facies characterized by Peloidal Packstone microfacies represents an inner ramp setting, and Pelletal/Peloidal Wackestone, Mud-Wackestone and Mudstone microfacies represent the low energy lagoonal environment. The sandstone lithofacies represents a high energy beach environment, which indicates an aggrading to prograding pattern. The porosity analysis was done on different samples of limestone and sandstone using the ImageJ software. In limestone the porosity ranges up to 6%, while in sandstone the porosity ranges up to 18%. From the field and porosity analysis it is concluded that the Samana Suk Formation in the study area is a good reservoir.

  17. Mapping and characterizing N6-methyladenine in eukaryotic genomes using single molecule real-time sequencing.

    Science.gov (United States)

    Zhu, Shijia; Beaulaurier, John; Deikus, Gintaras; Wu, Tao; Strahl, Maya; Hao, Ziyang; Luo, Guanzheng; Gregory, James A; Chess, Andrew; He, Chuan; Xiao, Andrew; Sebra, Robert; Schadt, Eric E; Fang, Gang

    2018-05-15

    N6-methyladenine (m6dA) has been discovered as a novel form of DNA methylation prevalent in eukaryotes; however, methods for high resolution mapping of m6dA events are still lacking. Single-molecule real-time (SMRT) sequencing has enabled the detection of m6dA events at single-nucleotide resolution in prokaryotic genomes, but its application to detecting m6dA in eukaryotic genomes has not been rigorously examined. Herein, we identified unique characteristics of eukaryotic m6dA methylomes that fundamentally differ from those of prokaryotes. Based on these differences, we describe the first approach for mapping m6dA events using SMRT sequencing specifically designed for the study of eukaryotic genomes, and provide appropriate strategies for designing experiments and carrying out sequencing in future studies. We apply the novel approach to study two eukaryotic genomes. For green algae, we construct the first complete genome-wide map of m6dA at single nucleotide and single molecule resolution. For human lymphoblastoid cells (hLCLs), joint analyses of SMRT sequencing and independent sequencing data suggest that putative m6dA events are enriched in the promoters of young, full length LINE-1 elements (L1s). These analyses demonstrate a general method for rigorous mapping and characterization of m6dA events in eukaryotic genomes. Published by Cold Spring Harbor Laboratory Press.

  18. Genomic localization, sequence analysis, and transcription of the putative human cytomegalovirus DNA polymerase gene

    International Nuclear Information System (INIS)

    Heilbronn, T.; Jahn, G.; Buerkle, A.; Freese, U.K.; Fleckenstein, B.; Zur Hausen, H.

    1987-01-01

    The human cytomegalovirus (HCMV)-induced DNA polymerase has been well characterized biochemically and functionally, but its genomic location has not yet been assigned. To identify the coding sequence, cross-hybridization with the herpes simplex virus type 1 (HSV-1) polymerase gene was used, as suggested by the close similarity of the herpes group virus-induced DNA polymerases to the HCMV DNA polymerase. A cosmid and plasmid library of the entire HCMV genome was screened with the BamHI Q fragment of HSV-1 at different stringency conditions. One PstI-HincII restriction fragment of 850 base pairs mapping within the EcoRI M fragment of HCMV cross-hybridized at T_m - 25 °C. Sequence analysis revealed one open reading frame spanning the entire sequence. The amino acid sequence showed a highly conserved domain of 133 amino acids shared with the HSV and putative Epstein-Barr virus polymerase sequences. This domain maps within the C-terminal part of the HSV polymerase gene, which has been suggested to contain part of the catalytic center of the enzyme. Transcription analysis revealed one 5.4-kilobase early transcript in the sense orientation with respect to the open reading frame identified. This transcript appears to code for the 140-kilodalton HCMV polymerase protein.

  19. Environmental impact analysis for the main accidental sequences of ignitor

    International Nuclear Information System (INIS)

    Carpignano, A.; Francabandiera, S.; Vella, R.; Zucchetti, M.

    1996-01-01

    A safety analysis study has been applied to the Ignitor machine using Probabilistic Safety Assessment. The main initiating events have been identified, and accident sequences have been studied by means of traditional methods such as Failure Mode and Effect Analysis (FMEA), Fault Trees (FT) and Event Trees (ET). The consequences of the radioactive environmental releases have been assessed in terms of Effective Dose Equivalents (EDEs) to the Most Exposed Individuals (MEI) of the chosen site, by means of a population dose code. Results point out the low environmental impact of the machine. 13 refs., 1 fig., 3 tabs

  20. High-resolution characterization of sequence signatures due to non-random cleavage of cell-free DNA.

    Science.gov (United States)

    Chandrananda, Dineika; Thorne, Natalie P; Bahlo, Melanie

    2015-06-17

    High-throughput sequencing of cell-free DNA fragments found in human plasma has been used to non-invasively detect fetal aneuploidy, monitor organ transplants and investigate tumor DNA. However, many biological properties of this extracellular genetic material remain unknown. Research that further characterizes circulating DNA could substantially increase its diagnostic value by allowing the application of more sophisticated bioinformatics tools that lead to an improved signal to noise ratio in the sequencing data. In this study, we investigate various features of cell-free DNA in plasma using deep-sequencing data from two pregnant women (>70X, >50X) and compare them with matched cellular DNA. We utilize a descriptive approach to examine how the biological cleavage of cell-free DNA affects different sequence signatures such as fragment lengths, sequence motifs at fragment ends and the distribution of cleavage sites along the genome. We show that the size distributions of these cell-free DNA molecules are dependent on their autosomal and mitochondrial origin as well as the genomic location within chromosomes. DNA mapping to particular microsatellites and alpha repeat elements displays unique size signatures. We show how cell-free fragments occur in clusters along the genome, localizing to nucleosomal arrays and are preferentially cleaved at linker regions by correlating the mapping locations of these fragments with ENCODE annotation of chromatin organization. Our work further demonstrates that cell-free autosomal DNA cleavage is sequence dependent. The region spanning up to 10 positions on either side of the DNA cleavage site shows a consistent pattern of preference for specific nucleotides. This sequence motif is present in cleavage sites localized to nucleosomal cores and linker regions but is absent in nucleosome-free mitochondrial DNA. These background signals in cell-free DNA sequencing data stem from the non-random biological cleavage of these fragments. This

  1. Rapid Characterization of Insulin Modifications and Sequence Variations by Proteinase K Digestion and UHPLC-ESI-MS

    Science.gov (United States)

    Yang, Rong-Sheng; Tang, Weijuan; Sheng, Huaming; Meng, Fanyu

    2018-01-01

    Discovery of novel insulin analogs as therapeutics has remained an active area of research. Compared with native human insulin, insulin analog molecules normally incorporate either covalent modifications or amino acid sequence variations. From the drug discovery and development perspective, methods for efficient and detailed characterization of these primary structural changes are very important. In this report, we demonstrate that proteinase K digestion coupled with UPLC-ESI-MS analysis provides a simple and rapid approach to characterize the modifications and sequence variations of insulin molecules. A commercially available proteinase K digestion kit was used to process recombinant human insulin (RHI), insulin glargine, and fluorescein isothiocyanate-labeled recombinant human insulin (FITC-RHI) samples. The LC-MS data clearly showed that RHI and insulin glargine samples can be differentiated, and the FITC modifications in all three amine sites of the RHI molecule are well characterized. The end-to-end experiment and data interpretation were achieved within 60 min. This approach is fast and simple, and can be easily implemented in early drug discovery laboratories to facilitate research on more advanced insulin therapeutics.

  2. Rapid Characterization of Insulin Modifications and Sequence Variations by Proteinase K Digestion and UHPLC-ESI-MS

    Science.gov (United States)

    Yang, Rong-Sheng; Tang, Weijuan; Sheng, Huaming; Meng, Fanyu

    2018-05-01

    Discovery of novel insulin analogs as therapeutics has remained an active area of research. Compared with native human insulin, insulin analog molecules normally incorporate either covalent modifications or amino acid sequence variations. From the drug discovery and development perspective, methods for efficient and detailed characterization of these primary structural changes are very important. In this report, we demonstrate that proteinase K digestion coupled with UPLC-ESI-MS analysis provides a simple and rapid approach to characterize the modifications and sequence variations of insulin molecules. A commercially available proteinase K digestion kit was used to process recombinant human insulin (RHI), insulin glargine, and fluorescein isothiocyanate-labeled recombinant human insulin (FITC-RHI) samples. The LC-MS data clearly showed that RHI and insulin glargine samples can be differentiated, and the FITC modifications in all three amine sites of the RHI molecule are well characterized. The end-to-end experiment and data interpretation were achieved within 60 min. This approach is fast and simple, and can be easily implemented in early drug discovery laboratories to facilitate research on more advanced insulin therapeutics.

  3. Genome-Wide Characterization of Simple Sequence Repeat (SSR) Loci in Chinese Jujube and Jujube SSR Primer Transferability

    Science.gov (United States)

    Xiao, Jing; Zhao, Jin; Liu, Mengjun; Liu, Ping; Dai, Li; Zhao, Zhihui

    2015-01-01

    Chinese jujube (Ziziphus jujuba), an economically important species in the Rhamnaceae family, is a popular fruit tree in Asia. Here, we surveyed and characterized simple sequence repeats (SSRs) in the jujube genome. A total of 436,676 SSR loci were identified, with an average distance of 0.93 kb between the loci. A large proportion of the SSRs included mononucleotide, dinucleotide and trinucleotide repeat motifs, which accounted for 64.87%, 24.40%, and 8.74% of all repeats, respectively. Among the mononucleotide repeats, A/T was the most common, whereas AT/TA was the most common dinucleotide repeat. A total of 30,565 primer pairs were successfully designed and screened using a series of criteria. Moreover, 725 of 1,000 randomly selected primer pairs were effective among 6 cultivars, and 511 of these primer pairs were polymorphic. Sequencing the amplicons of two SSRs across three jujube cultivars revealed variations in the repeats. The transferability test of jujube SSR primers showed that 35/64 SSRs could be transferred across family boundaries. Using jujube SSR primers, clustering analysis results from 15 species were highly consistent with the Angiosperm Phylogeny Group (APGIII) System. The genome-wide characterization of SSRs in Chinese jujube is very valuable for whole-genome characterization and marker-assisted selection in jujube breeding. In addition, the transferability of jujube SSR primers could provide a solid foundation for their further utilization. PMID:26000739
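
    A genome-wide SSR survey of the kind described here boils down to scanning a sequence for short motifs repeated in tandem. The sketch below shows one simple way to do that with regular expressions for mono-, di- and trinucleotide motifs. It is an illustration only: the abstract does not say which tool or thresholds the authors used, so the function name and the minimum repeat counts are hypothetical.

```python
import re

# Hypothetical minimum repeat counts per motif length (not from the study).
MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Assumes motif lengths of 1 to 3, so the only redundant longer motifs
    are homopolymers such as "AA" or "AAA".
    """
    seq = seq.upper()
    hits = []
    for size, n in sorted(min_repeats.items()):
        # One captured motif followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if size > 1 and len(set(motif)) == 1:
                continue  # a run of one base is a mononucleotide repeat
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)
```

    Tallying the motif lengths in the returned list gives the kind of mono-, di- and trinucleotide proportions reported in the abstract.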

  4. Analysis of sequence diversity through internal transcribed spacers and simple sequence repeats to identify Dendrobium species.

    Science.gov (United States)

    Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y

    2014-04-08

    The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.

  5. Using Behavior Sequence Analysis to Map Serial Killers' Life Histories.

    Science.gov (United States)

    Keatley, David A; Golightly, Hayley; Shephard, Rebecca; Yaksic, Enzo; Reid, Sasha

    2018-03-01

    The aim of the current research was to provide a novel method for mapping the developmental sequences of serial killers' life histories. An in-depth biographical account of serial killers' lives, from birth through to conviction, was gained and analyzed using Behavior Sequence Analysis. The analyses highlight similarities in behavioral events across the serial killers' lives, indicating not only which risk factors occur, but the temporal order of these factors. Results focused on early childhood environment, indicating the role of parental abuse; behaviors and events surrounding criminal histories of serial killers, showing that many had previous convictions and were known to police for other crimes; behaviors surrounding their murders, highlighting differences in victim choice and modus operandi; and, finally, trial pleas and convictions. The present research, therefore, provides a novel approach to synthesizing large volumes of data on criminals and presenting results in accessible, understandable outcomes.

  6. RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

    Science.gov (United States)

    Brule, C E; Dean, K M; Grayhack, E J

    2016-01-01

    The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. © 2016 Elsevier Inc. All rights reserved.

  7. Multifractal analysis of 2001 Mw 7.7 Bhuj earthquake sequence in Gujarat, Western India

    Science.gov (United States)

    Aggarwal, Sandeep Kumar; Pastén, Denisse; Khan, Prosanta Kumar

    2017-12-01

    The 2001 Mw 7.7 Bhuj mainshock seismic sequence in the Kachchh area, occurring during 2001 to 2012, has been analyzed using mono-fractal and multi-fractal dimension spectrum analysis technique. This region was characterized by frequent moderate shocks of Mw ≥ 5.0 for more than a decade since the occurrence of the 2001 Bhuj earthquake. The present study is therefore important for precursory analysis using this sequence. The selected long sequence has been investigated for the first time for completeness magnitude Mc 3.0 using the maximum curvature method. Multi-fractal Dq spectrum (Dq ∼ q) analysis was carried out using an effective window length of 200 earthquakes with a moving window of 20 events overlapped by 180 events. The robustness of the analysis has been tested by applying the magnitude completeness correction term of 0.2 to Mc 3.0, giving Mc 3.2, and we have tested the error in the calculation of Dq for each magnitude threshold. On the other hand, the stability of the analysis has been investigated down to the minimum magnitude of Mw ≥ 2.6 in the sequence. The analysis shows that the multi-fractal dimension spectrum Dq decreases with increasing clustering of events with time before a moderate magnitude earthquake in the sequence, which alternatively accounts for non-randomness in the spatial distribution of epicenters and its self-organized criticality. Similar behavior is ubiquitous elsewhere around the globe, and warns of the proximity of a damaging seismic event in an area.
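
    The windowing scheme in this abstract (200 earthquakes per window, advanced 20 events at a time so that consecutive windows share 180 events) can be sketched as below. Only the window bookkeeping is shown; the Dq computation itself is not described in enough detail here to reproduce, and the function name is an assumption.

```python
def sliding_windows(events, window=200, step=20):
    """Yield successive windows of `window` events, advancing by `step` events.

    With the defaults, consecutive windows overlap by window - step = 180 events.
    A trailing partial window is dropped.
    """
    for start in range(0, len(events) - window + 1, step):
        yield events[start:start + window]
```

    Each yielded window would then be passed to whatever routine estimates Dq, giving one spectrum per 20-event advance through the catalogue.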

  8. Swab-to-Sequence: Real-time Data Analysis Platform for the Biomolecule Sequencer

    Data.gov (United States)

    National Aeronautics and Space Administration — DNA was successfully sequenced on the ISS in 2016, but the DNA sequenced was prepared on the ground. With FY’16 IRAD funds, the same team developed a...

  9. Nucleotide Sequence and Characterization of the Broad-Host-Range Lactococcal Plasmid pWVO1

    NARCIS (Netherlands)

    Leenhouts, Cornelis; Tolner, Berend; Bron, Sierd; Kok, Jan; Venema, Gerhardus; Seegers, Jozef

    The nucleotide sequence of the Lactococcus lactis broad-host-range plasmid pWVO1, replicating in both gram-positive and gram-negative bacteria, was determined. This analysis revealed four open reading frames (ORFs). ORF A appeared to encode a trans-acting 26.8-kDa protein (RepA), necessary for

  10. Genetic diversity analysis of Leuconostoc mesenteroides from Korean vegetables and food products by multilocus sequence typing.

    Science.gov (United States)

    Sharma, Anshul; Kaur, Jasmine; Lee, Sulhee; Park, Young-Seo

    2018-06-01

    In the present study, 35 Leuconostoc mesenteroides strains isolated from vegetables and food products from South Korea were studied by multilocus sequence typing (MLST) of seven housekeeping genes (atpA, groEL, gyrB, pheS, pyrG, rpoA, and uvrC). The fragment sizes of the seven amplified housekeeping genes ranged in length from 366 to 1414 bp. Sequence analysis indicated 27 different sequence types (STs) with 25 of them being represented by a single strain indicating high genetic diversity, whereas the remaining 2 were characterized by five strains each. In total, 220 polymorphic nucleotide sites were detected among seven housekeeping genes. The phylogenetic analysis based on the STs of the seven loci indicated that the 35 strains belonged to two major groups, A (28 strains) and B (7 strains). Split decomposition analysis showed that intraspecies recombination played a role in generating diversity among strains. The minimum spanning tree showed that the evolution of the STs was not correlated with food source. This study signifies that multilocus sequence typing is a valuable tool to assess the genetic diversity among L. mesenteroides strains from South Korea and can be used further to monitor evolutionary changes.
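
    In MLST, each strain is summarized by the allele number it carries at every locus, and strains with identical allele profiles share a sequence type. The sketch below shows that grouping step for the seven loci named in the abstract. The function name, the strain names in the usage note and the numbering of STs by order of first appearance are assumptions for illustration, not the authors' procedure.

```python
LOCI = ("atpA", "groEL", "gyrB", "pheS", "pyrG", "rpoA", "uvrC")


def assign_sequence_types(profiles):
    """Map each strain to an ST number; identical allele profiles share an ST.

    `profiles` maps strain name -> {locus: allele number}. STs are numbered
    in order of first appearance (an arbitrary choice for this sketch).
    """
    st_of_profile = {}
    st_of_strain = {}
    for strain, alleles in profiles.items():
        key = tuple(alleles[locus] for locus in LOCI)
        if key not in st_of_profile:
            st_of_profile[key] = len(st_of_profile) + 1
        st_of_strain[strain] = st_of_profile[key]
    return st_of_strain
```

    For example, two hypothetical strains that agree at all seven loci receive the same ST, while a third that differs only at gyrB receives a new one. Counting strains per ST then separates singleton STs from shared ones, as in the 25 single-strain STs reported above.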

  11. [Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence

    Science.gov (United States)

    Ortega, Maya

    2010-01-01

    My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and an as yet unidentified black cyanobacterium, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacterium. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene sequence can be used to identify different bacterial species. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.

  12. De novo sequencing, assembly and characterization of antennal transcriptome of Anomala corpulenta Motschulsky (Coleoptera: Rutelidae).

    Directory of Open Access Journals (Sweden)

    Haoliang Chen

    Anomala corpulenta is an important insect pest and can cause enormous economic losses in agriculture, horticulture and forestry. It is widely distributed in China, and both larvae and adults can cause serious damage. It is difficult to control this pest because the larvae live underground. Any new control strategy should exploit alternatives to heavily and frequently used chemical insecticides. However, little genetic research has been carried out on A. corpulenta due to the lack of genomic resources. Genomic resources could be produced by next generation sequencing technologies with low cost and in a short time. In this study, we performed de novo sequencing, assembly and characterization of the antennal transcriptome of A. corpulenta. Illumina sequencing technology was used to sequence the antennal transcriptome of A. corpulenta. Approximately 76.7 million total raw reads and about 68.9 million total clean reads were obtained, and then 35,656 unigenes were assembled. Of these unigenes, 21,463 could be annotated in the NCBI nr database, and, among the annotated unigenes, 11,154 and 6,625 unigenes could be assigned to GO and COG, respectively. Additionally, 16,350 unigenes could be annotated in the Swiss-Prot database, and 14,499 unigenes could map onto 258 pathways in the KEGG Pathway database. We also found 24 unigenes related to OBPs, 6 to CSPs, and in total 167 unigenes related to chemodetection. We analyzed 4 OBP and 3 CSP sequences, and their RT-qPCR results agreed well with their FPKM values. We produced the first large-scale antennal transcriptome of A. corpulenta, which is a species that has little genomic information in public databases. The identified chemodetection unigenes can promote the molecular mechanistic study of behavior in A. corpulenta. These findings provide a general sequence resource for molecular genetics research on A. corpulenta.

  13. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    2010-07-19

    Jul 19, 2010 ... and antisense primers, a single band of 573 base pairs .... Amino acid sequence alignment of Cluster I and Cluster II of phylogenetic tree. First ten sequences ... sequence weighting, position-specific gap penalties and weight.

  14. Characterization of Aftershock Sequences from Large Strike-Slip Earthquakes Along Geometrically Complex Faults

    Science.gov (United States)

    Sexton, E.; Thomas, A.; Delbridge, B. G.

    2017-12-01

    Large earthquakes often exhibit complex slip distributions and occur along non-planar fault geometries, resulting in variable stress changes throughout the region of the fault hosting aftershocks. To better discern the role of geometric discontinuities on aftershock sequences, we compare areas of enhanced and reduced Coulomb failure stress and mean stress for systematic differences in the time dependence and productivity of these aftershock sequences. In strike-slip faults, releasing structures, including stepovers and bends, experience an increase in both Coulomb failure stress and mean stress during an earthquake, promoting fluid diffusion into the region and further failure. Conversely, Coulomb failure stress and mean stress decrease in restraining bends and stepovers in strike-slip faults, and fluids diffuse away from these areas, discouraging failure. We examine spatial differences in seismicity patterns along structurally complex strike-slip faults which have hosted large earthquakes, such as the 1992 Mw 7.3 Landers, the 2010 Mw 7.2 El-Mayor Cucapah, the 2014 Mw 6.0 South Napa, and the 2016 Mw 7.0 Kumamoto events. We characterize the behavior of these aftershock sequences with the Epidemic Type Aftershock-Sequence Model (ETAS). In this statistical model, the total occurrence rate of aftershocks induced by an earthquake is λ(t) = λ_0 + \\sum_{i:t_i

  15. Genome-wide microsatellite characterization and marker development in the sequenced Brassica crop species.

    Science.gov (United States)

    Shi, Jiaqin; Huang, Shunmou; Zhan, Jiepeng; Yu, Jingyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

    2014-02-01

    Although much research has been conducted, the pattern of microsatellite distribution has remained ambiguous, and the development/utilization of microsatellite markers has still been limited/inefficient in Brassica, due to the lack of genome sequences. In view of this, we conducted genome-wide microsatellite characterization and marker development in three recently sequenced Brassica crops: Brassica rapa, Brassica oleracea and Brassica napus. The analysed microsatellite characteristics of these Brassica species were highly similar or almost identical, which suggests that the pattern of microsatellite distribution is likely conservative in Brassica. The genomic distribution of microsatellites was highly non-uniform and positively or negatively correlated with genes or transposable elements, respectively. Of the total of 115 869, 185 662 and 356 522 simple sequence repeat (SSR) markers developed with high frequencies (408.2, 343.8 and 356.2 per Mb or one every 2.45, 2.91 and 2.81 kb, respectively), most represented new SSR markers, the majority had determined physical positions, and a large number were genic or putative single-locus SSR markers. We also constructed a comprehensive database for the newly developed SSR markers, which was integrated with public Brassica SSR markers and annotated genome components. The genome-wide SSR markers developed in this study provide a useful tool to extend the annotated genome resources of sequenced Brassica species to genetic study/breeding in different Brassica species.

  16. Characterization of sequence diversity in Plasmodium falciparum SERA5 from Indian isolates

    Directory of Open Access Journals (Sweden)

    Rahul C.N

    2015-06-01

    Objective: To characterize the sequence diversity of blood-stage Plasmodium falciparum serine repeat antigen-5 (PfSERA5), information on which is lacking for a malaria-endemic country like India. Methods: In this study, parasitic DNA was obtained from field isolates collected from various geographic regions. Subsequently, the PfSERA5 gene sequence was PCR amplified and DNA sequenced. Results: We reported the existence of unique repeat polymorphisms and novel haplotypes for both the octamer repeat (OR) and serine repeat (SR) regions of the N-terminal fragment of PfSERA5 from Indian isolates. Several isolates from India were identical to low-frequency African haplotypes. A unique finding of our study was an Indian isolate showing a deletion in a perfectly conserved 14-mer sequence within the octamer repeat. Indian haplotypes reported in this study were found to be distributed into the three earlier classified allelic clusters of FCR3, K1 and Honduras, showcasing broad diversity as compared to worldwide haplotypes. Conclusions: This study is the first report on genetic diversity of PfSERA5 antigen from India. Further evaluation of these haplotypes by serotyping would provide useful information for investigating variant-specific immunity and aid in malaria vaccine research.

  17. Linear discriminant analysis of character sequences using occurrences of words

    KAUST Repository

    Dutta, Subhajit; Chaudhuri, Probal; Ghosh, Anil

    2014-01-01

    Classification of character sequences, where the characters come from a finite set, arises in disciplines such as molecular biology and computer science. For discriminant analysis of such character sequences, the Bayes classifier based on Markov models turns out to have class boundaries defined by linear functions of occurrences of words in the sequences. It is shown that for such classifiers based on Markov models with unknown orders, if the orders are estimated from the data using cross-validation, the resulting classifier has Bayes risk consistency under suitable conditions. Even when Markov models are not valid for the data, we develop methods for constructing classifiers based on linear functions of occurrences of words, where the word length is chosen by cross-validation. Such linear classifiers are constructed using ideas of support vector machines, regression depth, and distance weighted discrimination. We show that classifiers with linear class boundaries have certain optimal properties in terms of their asymptotic misclassification probabilities. The performance of these classifiers is demonstrated in various simulated and benchmark data sets.
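
    To make the phrase "linear functions of occurrences of words" concrete: for two first-order Markov models, the log-likelihood ratio of a sequence is a weighted sum of its two-letter word counts (plus an initial-state term and a prior term, both omitted here). The Python sketch below is a toy over the DNA alphabet, not the authors' implementation; the function names, the add-one smoothing, and the fixed model order of one are assumptions made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def word_counts(seq, k):
    """Count overlapping words of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def fit_transitions(seqs, pseudo=1.0):
    """Estimate first-order transition probabilities P(b | a), smoothed."""
    pair = Counter()
    for s in seqs:
        pair.update(word_counts(s, 2))
    probs = {}
    for a in ALPHABET:
        total = sum(pair[a + b] + pseudo for b in ALPHABET)
        for b in ALPHABET:
            probs[a + b] = (pair[a + b] + pseudo) / total
    return probs


def linear_weights(p1, p2):
    """One weight per two-letter word: log ratio of transition probabilities."""
    return {w: math.log(p1[w] / p2[w]) for w in p1}


def score(seq, weights):
    """Linear function of word occurrences; positive favours class 1."""
    counts = word_counts(seq, 2)
    return sum(weights.get(w, 0.0) * n for w, n in counts.items())


if __name__ == "__main__":
    p1 = fit_transitions(["ATATATATAT", "TATATATA"])
    p2 = fit_transitions(["GCGCGCGC", "CGCGCGCGCG"])
    w = linear_weights(p1, p2)
    print(score("ATATAT", w), score("GCGCGC", w))
```

    In the abstract the model order (and hence the word length) is chosen by cross-validation; the sketch fixes it at one only to keep the example short.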

  18. Planarian homeobox genes: cloning, sequence analysis, and expression.

    Science.gov (United States)

    Garcia-Fernàndez, J; Baguñà, J; Saló, E

    1991-01-01

    Freshwater planarians (Platyhelminthes, Turbellaria, and Tricladida) are acoelomate, triploblastic, unsegmented, and bilaterally symmetrical organisms that are mainly known for their ample power to regenerate a complete organism from a small piece of their body. To identify potential pattern-control genes in planarian regeneration, we have isolated two homeobox-containing genes, Dth-1 and Dth-2 [Dugesia (Girardia) tigrina homeobox], by using degenerate oligonucleotides corresponding to the most conserved amino acid sequence from helix-3 of the homeodomain. Dth-1 and Dth-2 homeodomains are closely related (68% at the nucleotide level and 78% at the protein level) and show the conserved residues characteristic of the homeodomains identified to date. Similarity with most homeobox sequences is low (30-50%), except with Drosophila NK homeodomains (80-82% with NK-2) and the rodent TTF-1 homeodomain (77-87%). Some unusual amino acid residues specific to NK-2, TTF-1, Dth-1, and Dth-2 can be observed in the recognition helix (helix-3) and may define a family of homeodomains. The deduced amino acid sequences from the cDNAs contain, in addition to the homeodomain, other domains also present in various homeobox-containing genes. The expression of both genes, detected by Northern blot analysis, appears slightly higher in cephalic regions than in the rest of the intact organism, while a slight increase is detected in the central period (5 days) of regeneration. PMID:1714599

  19. Analysis of correlations between sites in models of protein sequences

    International Nuclear Information System (INIS)

    Giraud, B.G.; Lapedes, A.; Liu, L.C.

    1998-01-01

    A criterion based on conditional probabilities, related to the concept of algorithmic distance, is used to detect correlated mutations at noncontiguous sites on sequences. We apply this criterion to the problem of analyzing correlations between sites in protein sequences; however, the analysis applies generally to networks of interacting sites with discrete states at each site. Elementary models, where explicit results can be derived easily, are introduced. The number of states per site considered ranges from 2, illustrating the relation to familiar classical spin systems, to 20 states, suitable for representing amino acids. Numerical simulations show that the criterion remains valid even when the genetic history of the data samples (e.g., protein sequences), as represented by a phylogenetic tree, introduces nonindependence between samples. Statistical fluctuations due to finite sampling are also investigated and do not invalidate the criterion. A subsidiary result is found: The more homogeneous a population, the more easily its average properties can drift from the properties of its ancestor. Copyright 1998 The American Physical Society

  20. Linear discriminant analysis of character sequences using occurrences of words

    KAUST Repository

    Dutta, Subhajit

    2014-02-01

    Classification of character sequences, where the characters come from a finite set, arises in disciplines such as molecular biology and computer science. For discriminant analysis of such character sequences, the Bayes classifier based on Markov models turns out to have class boundaries defined by linear functions of occurrences of words in the sequences. It is shown that for such classifiers based on Markov models with unknown orders, if the orders are estimated from the data using cross-validation, the resulting classifier has Bayes risk consistency under suitable conditions. Even when Markov models are not valid for the data, we develop methods for constructing classifiers based on linear functions of occurrences of words, where the word length is chosen by cross-validation. Such linear classifiers are constructed using ideas of support vector machines, regression depth, and distance weighted discrimination. We show that classifiers with linear class boundaries have certain optimal properties in terms of their asymptotic misclassification probabilities. The performance of these classifiers is demonstrated in various simulated and benchmark data sets.

  1. Galaxy Workflows for Web-based Bioinformatics Analysis of Aptamer High-throughput Sequencing Data

    Directory of Open Access Journals (Sweden)

    William H Thiel

    2016-01-01

    Development of RNA and DNA aptamers for diagnostic and therapeutic applications is a rapidly growing field. Aptamers are identified through iterative rounds of selection in a process termed SELEX (Systematic Evolution of Ligands by EXponential enrichment). High-throughput sequencing (HTS) revolutionized the modern SELEX process by identifying millions of aptamer sequences across multiple rounds of aptamer selection. However, these vast aptamer HTS datasets necessitated bioinformatics techniques. Herein, we describe a semiautomated approach to analyze aptamer HTS datasets using the Galaxy Project, a web-based open source collection of bioinformatics tools that were originally developed to analyze genome, exome, and transcriptome HTS data. Using a series of Workflows created in the Galaxy webserver, we demonstrate efficient processing of aptamer HTS data and compilation of a database of unique aptamer sequences. Additional Workflows were created to characterize the abundance and persistence of aptamer sequences within a selection and to filter sequences based on these parameters. A key advantage of this approach is that the online nature of the Galaxy webserver and its graphical interface allow for the analysis of HTS data without the need to compile code or install multiple programs.
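
    The idea of characterizing sequences by abundance and persistence and then filtering on those parameters can be sketched outside Galaxy in a few lines of Python. This is a minimal stand-in, not the published Workflows: here "abundance" is assumed to mean the total read count across rounds and "persistence" the number of rounds in which a sequence is seen, and the names `summarize_rounds` and `filter_sequences` are hypothetical.

```python
from collections import Counter


def summarize_rounds(rounds):
    """Map each unique sequence to (total_abundance, persistence).

    rounds: dict of round label -> list of read sequences.
    Definitions of abundance and persistence are assumptions (see lead-in).
    """
    summary = {}
    for reads in rounds.values():
        for seq, n in Counter(reads).items():
            total, seen = summary.get(seq, (0, 0))
            summary[seq] = (total + n, seen + 1)
    return summary


def filter_sequences(summary, min_abundance=2, min_persistence=2):
    """Keep sequences that meet both thresholds, sorted for stable output."""
    return sorted(
        seq
        for seq, (total, seen) in summary.items()
        if total >= min_abundance and seen >= min_persistence
    )


if __name__ == "__main__":
    rounds = {
        "R1": ["AAA", "CCC", "AAA"],
        "R2": ["AAA", "GGG"],
        "R3": ["AAA", "CCC"],
    }
    summary = summarize_rounds(rounds)
    print(summary)
    print(filter_sequences(summary))
```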

  2. Determining physical constraints in transcriptional initiation complexes using DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Shultzaberger, Ryan K.; Chiang, Derek Y.; Moses, Alan M.; Eisen, Michael B.

    2007-07-01

    Eukaryotic gene expression is often under the control of cooperatively acting transcription factors whose binding is limited by structural constraints. By determining these structural constraints, we can understand the "rules" that define functional cooperativity. Conversely, by understanding the rules of binding, we can infer structural characteristics. We have developed an information theory based method for approximating the physical limitations of cooperative interactions by comparing sequence analysis to microarray expression data. When applied to the coordinated binding of the sulfur amino acid regulatory protein Met4 by Cbf1 and Met31, we were able to create a combinatorial model that can correctly identify Met4 regulated genes.

  3. A Retrospective Examination of Feline Leukemia Subgroup Characterization: Viral Interference Assays to Deep Sequencing

    Directory of Open Access Journals (Sweden)

    Elliott S. Chiu

    2018-01-01

    Feline leukemia virus (FeLV) was the first feline retrovirus discovered, and is associated with multiple fatal disease syndromes in cats, including lymphoma. The original research conducted on FeLV employed classical virological techniques. As methods have evolved to allow FeLV genetic characterization, investigators have continued to unravel the molecular pathology associated with this fascinating agent. In this review, we discuss how FeLV classification, transmission, and disease-inducing potential have been defined sequentially by viral interference assays, Sanger sequencing, PCR, and next-generation sequencing. In particular, we highlight the influences of endogenous FeLV and host genetics that represent FeLV research opportunities on the near horizon.

  4. A Retrospective Examination of Feline Leukemia Subgroup Characterization: Viral Interference Assays to Deep Sequencing.

    Science.gov (United States)

    Chiu, Elliott S; Hoover, Edward A; VandeWoude, Sue

    2018-01-10

    Feline leukemia virus (FeLV) was the first feline retrovirus discovered, and is associated with multiple fatal disease syndromes in cats, including lymphoma. The original research conducted on FeLV employed classical virological techniques. As methods have evolved to allow FeLV genetic characterization, investigators have continued to unravel the molecular pathology associated with this fascinating agent. In this review, we discuss how FeLV classification, transmission, and disease-inducing potential have been defined sequentially by viral interference assays, Sanger sequencing, PCR, and next-generation sequencing. In particular, we highlight the influences of endogenous FeLV and host genetics that represent FeLV research opportunities on the near horizon.

  5. Molecular genetic characterization of the RD-114 gene family of endogenous feline retroviral sequences.

    Science.gov (United States)

    Reeves, R H; O'Brien, S J

    1984-01-01

    RD-114 is a replication-competent, xenotropic retrovirus which is homologous to a family of moderately repetitive DNA sequences present at ca. 20 copies in the normal cellular genome of domestic cats. To examine the extent and character of genomic divergence of the RD-114 gene family as well as to assess their positional association within the cat genome, we have prepared a series of molecular clones of endogenous RD-114 DNA segments from a genomic library of cat cellular DNA. Their restriction endonuclease maps were compared with each other as well as to that of the prototype-inducible RD-114 which was molecularly cloned from a chronically infected human cell line. The endogenous sequences analyzed were similar to each other in that they were colinear with RD-114 proviral DNA, were bounded by long terminal redundancies, and conserved many restriction sites in the gag and pol regions. However, the env regions of many of the sequences examined were substantially deleted. Several of the endogenous RD-114 genomes contained a novel envelope sequence which was unrelated to the env gene of the prototype RD-114 but which, like RD-114 and endogenous feline leukemia virus provirus, was found only in species of the genus Felis, and not in other closely related Felidae genera. The endogenous RD-114 sequences each had a distinct cellular flank, which indicates that these sequences are not tandem but dispersed nonspecifically throughout the genome. Southern analysis of cat cellular DNA confirmed the conclusions about conserved restriction sites in endogenous sequences and indicated that a single locus may be responsible for the production of the major inducible form of RD-114. PMID:6090693

  6. Sequencing and de novo analysis of a coral larval transcriptome using 454 GSFlx

    Directory of Open Access Journals (Sweden)

    Colbourne John K

    2009-05-01

    Background: New methods are needed for genomic-scale analysis of emerging model organisms that exemplify important biological questions but lack fully sequenced genomes. For example, there is an urgent need to understand the potential for corals to adapt to climate change, but few molecular resources are available for studying these processes in reef-building corals. To facilitate genomics studies in corals and other non-model systems, we describe methods for transcriptome sequencing using 454, as well as strategies for assembling a useful catalog of genes from the output. We have applied these methods to sequence the transcriptome of planulae larvae from the coral Acropora millepora. Results: More than 600,000 reads produced in a single 454 sequencing run were assembled into ~40,000 contigs with five-fold average sequencing coverage. Based on sequence similarity with known proteins, these analyses identified ~11,000 different genes expressed in a range of conditions including thermal stress and settlement induction. Assembled sequences were annotated with gene names, conserved domains, and Gene Ontology terms. Targeted searches using these annotations identified the majority of genes associated with essential metabolic pathways and conserved signaling pathways, as well as novel candidate genes for stress-related processes. Comparisons with the genome of the anemone Nematostella vectensis revealed ~8,500 pairs of orthologs and ~100 candidate coral-specific genes. More than 30,000 SNPs were detected in the coral sequences, and a subset of these was validated by re-sequencing. Conclusion: The methods described here for deep sequencing of the transcriptome should be widely applicable to generate catalogs of genes and genetic markers in emerging model organisms. Our data provide the most comprehensive sequence resource currently available for reef-building corals, and include an extensive collection of potential genetic markers for association and

  7. Molecular cloning, sequence characterization and expression pattern of Rab18 gene from watermelon (Citrullus lanatus).

    Science.gov (United States)

    Xinli, Xiao; Lei, Peng

    2015-03-04

    The complete mRNA sequence of watermelon Rab18 gene was amplified through the rapid amplification of cDNA ends (RACE) method. The full-length mRNA was 1010 bp containing a 645 bp open reading frame, which encodes a protein of 214 amino acids. Sequence analysis revealed that watermelon Rab18 protein shares high homology with the Rab18 of cucumber (99%), muskmelon (98%), Morus notabilis (90%), tomato (89%), wine grape (89%) and potato (88%). Phylogenetic analysis revealed that watermelon Rab18 gene has a closer genetic relationship with Rab18 gene of cucumber and muskmelon. Tissue expression profile analysis indicated that watermelon Rab18 gene was highly expressed in root, stem and leaf, moderately expressed in flower and weakly expressed in fruit.
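
    The reported lengths can be cross-checked with simple codon arithmetic. The short Python sketch below assumes that the 645 bp open reading frame includes the stop codon; the abstract does not say so explicitly, but the 214-residue protein is consistent with that reading. The function name `protein_length_from_orf` is illustrative.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp nucleotides."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    mrna_bp, orf_bp = 1010, 645  # lengths quoted in the abstract
    print(protein_length_from_orf(orf_bp))  # residues encoded
    print(mrna_bp - orf_bp)  # bases outside the ORF (5' and 3' combined)
```

    With these inputs the ORF yields 215 codons, one of which is the stop codon, giving 214 amino acids; the remaining 365 bp of the 1010 bp mRNA lie outside the ORF.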

  8. Impact of sequencing depth on the characterization of the microbiome and resistome.

    Science.gov (United States)

    Zaheer, Rahat; Noyes, Noelle; Ortega Polo, Rodrigo; Cook, Shaun R; Marinier, Eric; Van Domselaar, Gary; Belk, Keith E; Morley, Paul S; McAllister, Tim A

    2018-04-12

    Developments in high-throughput next generation sequencing (NGS) technology have rapidly advanced the understanding of overall microbial ecology as well as occurrence and diversity of specific genes within diverse environments. In the present study, we compared the ability of varying sequencing depths to generate meaningful information about the taxonomic structure and prevalence of antimicrobial resistance genes (ARGs) in the bovine fecal microbial community. Metagenomic sequencing was conducted on eight composite fecal samples originating from four beef cattle feedlots. Metagenomic DNA was sequenced to various depths, D1, D0.5 and D0.25, with average sample read counts of 117, 59 and 26 million, respectively. A comparative analysis of the relative abundance of reads aligning to different phyla and antimicrobial classes indicated that the relative proportions of read assignments remained fairly constant regardless of depth. However, the number of reads being assigned to ARGs as well as to microbial taxa increased significantly with increasing depth. We found a depth of D0.5 was suitable to describe the microbiome and resistome of cattle fecal samples. This study helps define a balance between cost and required sequencing depth to acquire meaningful results.
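
    The observation that relative proportions stay roughly constant while absolute assigned-read counts grow with depth can be reproduced on toy data by random subsampling. The Python sketch below is only an illustration: the labels "A", "B" and "C" are placeholder taxa, the read counts are invented for the example, and the fractions 0.5 and 0.25 merely echo the D0.5 and D0.25 naming in the abstract.

```python
import random
from collections import Counter


def subsample(labels, fraction, seed=0):
    """Draw a random subset of read assignments without replacement."""
    rng = random.Random(seed)
    return rng.sample(labels, int(len(labels) * fraction))


def relative_abundance(labels):
    """Fraction of reads assigned to each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    # Toy read assignments; not data from the study.
    full = ["A"] * 6000 + ["B"] * 3000 + ["C"] * 1000
    for fraction in (1.0, 0.5, 0.25):
        reads = subsample(full, fraction, seed=1)
        props = relative_abundance(reads)
        print(fraction, len(reads), {k: round(v, 3) for k, v in sorted(props.items())})
```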

  9. Molecular detection and sequence characterization of diverse rhabdoviruses in bats, China.

    Science.gov (United States)

    Xu, Lin; Wu, Jianmin; Jiang, Tinglei; Qin, Shaomin; Xia, Lele; Li, Xingyu; He, Biao; Tu, Changchun

    2018-01-15

    The Rhabdoviridae is among the most diverse families of RNA viruses and is currently classified into 18 genera, with some rhabdoviruses lethal to humans and other animals. Herein, we describe genetic characterization of three novel rhabdoviruses from bats in China. Of these, two viruses (Jinghong bat virus and Benxi bat virus) found in Rhinolophus bats showed a phylogenetic relationship with vesiculoviruses, and sequence analyses indicate that they represent two new species within the genus Vesiculovirus. The remaining virus, Yangjiang bat virus, found in Hipposideros larvatus bats, was only distantly related to currently known rhabdoviruses. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. A functional U-statistic method for association analysis of sequencing data.

    Science.gov (United States)

    Jadhav, Sneha; Tong, Xiaoran; Lu, Qing

    2017-11-01

    Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of the sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases for studying common genetic mechanisms. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease-associated genes, the different types of phenotypes pose challenges for association analysis. To address these challenges, we propose a nonparametric method, the functional U-statistic method (FU), for multivariate analysis of sequencing data. It first constructs smooth functions from individuals' sequencing data, and then tests the association of these functions with multiple phenotypes by using a U-statistic. The method provides a general framework for analyzing various types of phenotypes (e.g., binary and continuous phenotypes) with unknown distributions. Fitting the genetic variants within a gene using a smoothing function also allows us to capture complexities of gene structure (e.g., linkage disequilibrium, LD), which could potentially increase the power of association analysis. Through simulations, we compared our method to the multivariate outcome score test (MOST), and found that our test attained better performance than MOST. In a real data application, we applied our method to the sequencing data from the Minnesota Twin Study (MTS) and found potential associations of several nicotine receptor subunit (CHRN) genes, including CHRNB3, with nicotine dependence and/or alcohol dependence. © 2017 WILEY PERIODICALS, INC.

  11. Micropathogen Community Analysis in Hyalomma rufipes via High-Throughput Sequencing of Small RNAs

    Science.gov (United States)

    Luo, Jin; Liu, Min-Xuan; Ren, Qiao-Yun; Chen, Ze; Tian, Zhan-Cheng; Hao, Jia-Wei; Wu, Feng; Liu, Xiao-Cui; Luo, Jian-Xun; Yin, Hong; Wang, Hui; Liu, Guang-Yuan

    2017-01-01

    Ticks are important vectors in the transmission of a broad range of micropathogens to vertebrates, including humans. Because of the role of ticks in disease transmission, identifying and characterizing the micropathogen profiles of tick populations have become increasingly important. The objective of this study was to survey the micropathogens of Hyalomma rufipes ticks. Illumina HiSeq2000 technology was utilized to perform deep sequencing of small RNAs (sRNAs) extracted from field-collected H. rufipes ticks in Gansu Province, China. The resultant sRNA library data revealed that the surveyed tick populations produced reads that were homologous to St. Croix River Virus (SCRV) sequences. We also observed many reads that were homologous to microbial and/or pathogenic isolates, including bacteria, protozoa, and fungi. As part of this analysis, a phylogenetic tree was constructed to display the relationships among the homologous sequences that were identified. The study offered a unique opportunity to gain insight into the micropathogens of H. rufipes ticks. The effective control of arthropod vectors in the future will require knowledge of the micropathogen composition of vectors harboring infectious agents. Understanding the ecological factors that regulate vector propagation in association with the prevalence and persistence of micropathogen lineages is also imperative. These interactions may affect the evolution of micropathogen lineages, especially if the micropathogens rely on the vector or host for dispersal. The sRNA deep-sequencing approach used in this analysis provides an intuitive method to survey micropathogen prevalence in ticks and other vector species. PMID:28861401

  12. Genome sequencing of bacteria: sequencing, de novo assembly and rapid analysis using open source tools.

    Science.gov (United States)

    Kisand, Veljo; Lettieri, Teresa

    2013-04-01

    De novo genome sequencing of previously uncharacterized microorganisms has the potential to open up new frontiers in microbial genomics by providing insight into both functional capabilities and biodiversity. Until recently, Roche 454 pyrosequencing was the NGS method of choice for de novo assembly because it generates hundreds of thousands of long reads (tools for processing NGS data are increasingly free and open source and are often adopted for both their high quality and role in promoting academic freedom. The error rate of pyrosequencing the Alcanivorax borkumensis genome was such that thousands of insertions and deletions were artificially introduced into the finished genome. Despite a high coverage (~30 fold), it did not allow the reference genome to be fully mapped. Reads from regions with errors had low quality, low coverage, or were missing. The main defect of the reference mapping was the introduction of artificial indels into contigs through lower than 100% consensus and distracting gene calling due to artificial stop codons. No assembler was able to perform de novo assembly comparable to reference mapping. Automated annotation tools performed similarly on reference mapped and de novo draft genomes, and annotated most CDSs in the de novo assembled draft genomes. Free and open source software (FOSS) tools for assembly and annotation of NGS data are being developed rapidly to provide accurate results with less computational effort. Usability is not high priority and these tools currently do not allow the data to be processed without manual intervention. Despite this, genome assemblers now readily assemble medium short reads into long contigs (>97-98% genome coverage). A notable gap in pyrosequencing technology is the quality of base pair calling and conflicting base pairs between single reads at the same nucleotide position. Regardless, using draft whole genomes that are not finished and remain fragmented into tens of contigs allows one to characterize

  13. Streaming support for data intensive cloud-based sequence analysis.

    Science.gov (United States)

    Issa, Shadi A; Kienzler, Romeo; El-Kalioby, Mohamed; Tonellato, Peter J; Wall, Dennis; Bruggmann, Rémy; Abouelhoda, Mohamed

    2013-01-01

    Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of "resources-on-demand" and "pay-as-you-go", scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client's site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation.
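
    The core idea, processing independent reads as they arrive instead of waiting for the whole file, can be illustrated with a Python generator. This sketch does not use or reproduce the elastream package; it assumes plain four-line FASTQ records arriving on a text stream, and the per-read computation (GC fraction) is just a placeholder for any analysis that treats reads independently.

```python
import io


def fastq_records(stream):
    """Yield (header, sequence, quality) one record at a time from a text stream."""
    while True:
        header = stream.readline()
        if not header:
            return  # end of stream
        seq = stream.readline().rstrip("\n")
        stream.readline()  # '+' separator line, ignored
        qual = stream.readline().rstrip("\n")
        yield header.rstrip("\n"), seq, qual


def gc_fraction(seq):
    """Placeholder per-read analysis: fraction of G and C bases."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def process_while_receiving(stream):
    """Emit a result for each read as soon as that read has been received."""
    for header, seq, _qual in fastq_records(stream):
        yield header, gc_fraction(seq)


if __name__ == "__main__":
    # io.StringIO stands in for a network or pipe stream in this example.
    incoming = io.StringIO("@r1\nGGCC\n+\nIIII\n@r2\nATAT\n+\nIIII\n")
    for name, gc in process_while_receiving(incoming):
        print(name, gc)
```

    Because each record is handled as soon as its four lines are available, computation overlaps with transfer, which is the property the abstract relies on for tasks where reads can be processed independently.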

  14. Streaming Support for Data Intensive Cloud-Based Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Shadi A. Issa

    2013-01-01

    Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of “resources-on-demand” and “pay-as-you-go”, scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client’s site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation.

  15. Next-generation sequence analysis of cancer xenograft models.

    Directory of Open Access Journals (Sweden)

    Fernando J Rossello

    Next-generation sequencing (NGS) studies in cancer are limited by the amount, quality and purity of tissue samples. In this situation, primary xenografts have proven useful preclinical models. However, the presence of mouse-derived stromal cells represents a technical challenge to their use in NGS studies. We examined this problem in an established primary xenograft model of small cell lung cancer (SCLC), a malignancy often diagnosed from small biopsy or needle aspirate samples. Using an in silico strategy that assigns reads according to species-of-origin, we prospectively compared NGS data from primary xenograft models with matched cell lines and with published datasets. We show here that low-coverage whole-genome analysis demonstrated remarkable concordance between published genome data and internal controls, despite the presence of mouse genomic DNA. Exome capture sequencing revealed that this enrichment procedure was highly species-specific, with less than 4% of reads aligning to the mouse genome. Human-specific expression profiling with RNA-Seq replicated array-based gene expression experiments, whereas mouse-specific transcript profiles correlated with published datasets from human cancer stroma. We conclude that primary xenografts represent a useful platform for complex NGS analysis in cancer research for tumours with limited sample resources, or those with prominent stromal cell populations.
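
    One simple way to picture a species-of-origin assignment is to align each read against both the human and the mouse reference and keep the better hit. The Python sketch below is a generic illustration and not the strategy used in the study, whose details are not given in the abstract; the score values, the `margin` parameter and the function names are all hypothetical, and `None` stands for "no alignment found".

```python
def assign_species(human_score, mouse_score, margin=0):
    """Label a read by comparing its best alignment score to each reference."""
    if human_score is None and mouse_score is None:
        return "unaligned"
    if mouse_score is None:
        return "human"
    if human_score is None:
        return "mouse"
    if human_score - mouse_score > margin:
        return "human"
    if mouse_score - human_score > margin:
        return "mouse"
    return "ambiguous"  # scores too close to call


def mouse_fraction(labels):
    """Share of confidently assigned reads that were labelled mouse."""
    assigned = [x for x in labels if x in ("human", "mouse")]
    return assigned.count("mouse") / len(assigned) if assigned else 0.0


if __name__ == "__main__":
    # Hypothetical (human_score, mouse_score) pairs for five reads.
    scores = [(60, 20), (55, None), (30, 58), (40, 40), (None, None)]
    labels = [assign_species(h, m) for h, m in scores]
    print(labels, mouse_fraction(labels))
```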

  16. Streaming Support for Data Intensive Cloud-Based Sequence Analysis

    Science.gov (United States)

    Issa, Shadi A.; Kienzler, Romeo; El-Kalioby, Mohamed; Tonellato, Peter J.; Wall, Dennis; Bruggmann, Rémy; Abouelhoda, Mohamed

    2013-01-01

    Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of “resources-on-demand” and “pay-as-you-go”, scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client's site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation. PMID:23710461

  17. Phylogenetic characterization of a biogas plant microbial community integrating clone library 16S-rDNA sequences and metagenome sequence data obtained by 454-pyrosequencing.

    Science.gov (United States)

    Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas

    2009-06-01

    The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene product encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.

  18. Characterization of some sedimentary sequences from Cambay basin, India, by pyrolysis-GC

    Science.gov (United States)

    Philp, R. P.; Garg, A. K.

    Pyrolysis-gas chromatography of sedimentary sequences from a key exploratory well of the southern Cambay Basin, India, has been performed to characterize the nature of the source material and its maturity. In samples from the Eocene-Paleocene section (2960-3407 m), the pyrolysate is dominated by hydrocarbons in the lower molecular weight region, indicating a significant input of algal source material. The presence of various xylenes and phenols in the pyrograms is indicative of a significant input from higher plant material. The organic material in this section is interpreted to have been derived from marine-terrestrial source inputs deposited under swampy to marine and reducing environments. Good mature source rocks with type III kerogens, which are wet gas/gas condensate-prone, have been identified in this region. This paper discusses the characterization of source rocks using the pyrolysis-gas chromatography approach and the significance of the distribution of the pyrolysis products.

  19. Sequence characterization of heat shock protein gene of Cyclospora cayetanensis isolates from Nepal, Mexico, and Peru.

    Science.gov (United States)

    Sulaiman, Irshad M; Torres, Patricia; Simpson, Steven; Kerdahi, Khalil; Ortega, Ynes

    2013-04-01

    We have described the development of a 2-step nested PCR protocol based on the characterization of the 70-kDa heat shock protein (HSP70) gene for rapid detection of the human-pathogenic Cyclospora cayetanensis parasite. We tested and validated these newly designed primer sets by PCR amplification followed by nucleotide sequencing of PCR-amplified HSP70 fragments belonging to 16 human C. cayetanensis isolates from 3 different endemic regions that include Nepal, Mexico, and Peru. No genetic polymorphism was observed among the isolates at the characterized regions of the HSP70 locus. This newly developed HSP70 gene-based nested PCR protocol provides another useful genetic marker for the rapid detection of C. cayetanensis in the future.

  20. Extended -Regular Sequence for Automated Analysis of Microarray Images

    Directory of Open Access Journals (Sweden)

    Jin Hee-Jeong

    2006-01-01

    Full Text Available Microarray study enables us to obtain hundreds of thousands of expressions of genes or genotypes at once, and it is an indispensable technology for genome research. The first step is the analysis of scanned microarray images. This is the most important procedure for obtaining biologically reliable data. Currently most microarray image processing systems require burdensome manual block/spot indexing work. Since the amount of experimental data is increasing very quickly, automated microarray image analysis software becomes important. In this paper, we propose two automated methods for analyzing microarray images. First, we propose the extended -regular sequence to index blocks and spots, which enables a novel automatic gridding procedure. Second, we provide a methodology, hierarchical metagrid alignment, to allow reliable and efficient batch processing for a set of microarray images. Experimental results show that the proposed methods are more reliable and convenient than the commercial tools.

  1. Sequence Quality Analysis Tool for HIV Type 1 Protease and Reverse Transcriptase

    OpenAIRE

    DeLong, Allison K.; Wu, Mingham; Bennett, Diane; Parkin, Neil; Wu, Zhijin; Hogan, Joseph W.; Kantor, Rami

    2012-01-01

    Access to antiretroviral therapy is increasing globally and drug resistance evolution is anticipated. Currently, protease (PR) and reverse transcriptase (RT) sequence generation is increasing, including the use of in-house sequencing assays, and quality assessment prior to sequence analysis is essential. We created a computational HIV PR/RT Sequence Quality Analysis Tool (SQUAT) that runs in the R statistical environment. Sequence quality thresholds are calculated from a large dataset (46,802...

  2. Sequence and structural characterization of Trx-Grx type of monothiol glutaredoxins from Ashbya gossypii.

    Science.gov (United States)

    Yadav, Saurabh; Kumari, Pragati; Kushwaha, Hemant Ritturaj

    2013-01-01

    Glutaredoxins are small, ubiquitous, glutathione-dependent enzymatic antioxidants classified under the thioredoxin-fold superfamily. Glutaredoxins are classified into two types: dithiol and monothiol. Monothiol glutaredoxins, which carry the signature "CGFS" as a redox-active motif, are known for their role in oxidative stress inside the cell. In the present analysis, the 138-amino-acid monothiol glutaredoxin AgGRX1 from Ashbya gossypii was identified and used for the analysis. The multiple sequence alignment of the AgGRX1 protein sequence revealed the characteristic motif of typical monothiol glutaredoxins, as observed in various other organisms. The proposed structure of the AgGRX1 protein was used to analyze signature folds related to the thioredoxin superfamily. Further, the study highlighted the structural features pertaining to the complex mechanism of glutathione docking and the interacting residues.

  3. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    Directory of Open Access Journals (Sweden)

    Soichi Inagaki

    Full Text Available Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  4. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    Science.gov (United States)

    Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca

    2015-01-01

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  5. [Multilocus sequence-typing for characterization of Moscow strains of Haemophilus influenzae type b].

    Science.gov (United States)

    Platonov, A E; Mironov, K O; Iatsyshina, S B; Koroleva, I S; Platonova, O V; Gushchin, A E; Shipulin, G A

    2003-01-01

    Haemophilus influenzae type b (Hib) bacteria were genotyped by multilocus sequence typing (MLST) using 5 loci (adk, fucK, mdh, pgi, recA). 42 Moscow Hib strains (including 38 isolates from the cerebrospinal fluid of children who had purulent meningitis in 1999-2001, and 4 strains isolated from healthy carriers of Hib), as well as 2 strains from Yekaterinburg, were studied. In MLST, a strain is characterized by its alleles and their combination (an allele profile), also referred to as a sequence type (ST). 9 STs were identified within the Russian Hib bacteria: ST-1 was found in 25 strains (57%), ST-12 in 8 strains (18%), ST-11 in 4 strains (9%) and ST-15 in 2 strains (4.5%); all other STs (13, 14, 16, 17, 51) were found in isolated cases (2.3% each). A comparison of allelic profiles and of nucleotide sequences showed that 93% of Russian isolates, i.e. strains with ST-1, 11, 12, 13, 15 and 17, belong to one and the same clonal complex. Of the 7 foreign Hib strains studied to date, 2 isolates, from Norway and Sweden, can be described as belonging to the same clonal complex; the other 5 Hib strains were different from the Russian ones.
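
    The percentages quoted in this abstract can be checked against the reported strain counts. The sketch below is only an arithmetic check: the counts and the clonal-complex membership are taken from the abstract, while the dictionary layout and the helper name share are illustrative choices, not part of the study.

```python
# Strain counts per sequence type (ST), as reported in the abstract.
st_counts = {
    "ST-1": 25, "ST-12": 8, "ST-11": 4, "ST-15": 2,
    "ST-13": 1, "ST-14": 1, "ST-16": 1, "ST-17": 1, "ST-51": 1,
}

# 42 Moscow strains plus 2 Yekaterinburg strains.
total = sum(st_counts.values())


def share(count, total):
    """Percentage of strains, rounded to one decimal place."""
    return round(100 * count / total, 1)


# STs that the abstract assigns to a single clonal complex.
complex_sts = ["ST-1", "ST-11", "ST-12", "ST-13", "ST-15", "ST-17"]
complex_total = sum(st_counts[st] for st in complex_sts)

for st, n in st_counts.items():
    print(st, n, share(n, total))
print("clonal complex:", complex_total, share(complex_total, total))
```

    Under these counts the nine STs add up to the 44 strains studied, and the six STs of the clonal complex account for 41 of them, which matches the reported 93%.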

  6. Whole-exome sequencing analysis of Waardenburg syndrome in a Chinese family

    Science.gov (United States)

    Chen, Dezhong; Zhao, Na; Wang, Jing; Li, Zhuoyu; Wu, Changxin; Fu, Jie; Xiao, Han

    2017-01-01

    Waardenburg syndrome (WS) is a dominantly inherited, genetically heterogeneous auditory-pigmentary syndrome characterized by non-progressive sensorineural hearing loss and iris discoloration. By whole-exome sequencing (WES), we identified a nonsense mutation (c.598C>T) in the PAX3 gene, predicted to be disease-causing by in silico analysis. This is the first report of a genetically diagnosed case of WS with the PAX3 c.598C>T nonsense mutation in a family of Chinese ethnic origin, identified by WES and in silico functional prediction methods. PMID:28690861

  7. Analysis of breast cancer metastasis candidate genes from next generation-sequencing via systematic functional genomics

    DEFF Research Database (Denmark)

    Blomstrøm, Monica Marie

    2016-01-01

    several growth modulators and invasion modulators were identified and independently validated. These candidates revealed a group of genes with metastasis-related functions in vitro that are involved in RNA-related processes, such as RNA-processing. Moreover, a general feature was that proliferation......) and non-CSCs. The main goal of this project was to functionally characterize a set of candidate genes recovered from next-generation sequencing analysis for their role in breast cancer metastasis formation. The starting gene set comprised 104 gene variants; i.e. 57 wildtype and 47 mutated variants. During...

  8. Whole-exome sequencing analysis of Waardenburg syndrome in a Chinese family.

    Science.gov (United States)

    Chen, Dezhong; Zhao, Na; Wang, Jing; Li, Zhuoyu; Wu, Changxin; Fu, Jie; Xiao, Han

    2017-01-01

    Waardenburg syndrome (WS) is a dominantly inherited, genetically heterogeneous auditory-pigmentary syndrome characterized by non-progressive sensorineural hearing loss and iris discoloration. By whole-exome sequencing (WES), we identified a nonsense mutation (c.598C>T) in the PAX3 gene, predicted to be disease-causing by in silico analysis. This is the first report of a genetically diagnosed case of WS with the PAX3 c.598C>T nonsense mutation in a family of Chinese ethnic origin, identified by WES and in silico functional prediction methods.
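
    The variant name c.598C>T uses coding-sequence numbering, where position 1 is the A of the ATG start codon. A minimal sketch of how such a position maps to a codon, and which reference codons a C>T change at that codon position could turn into a stop codon, is given below. The function names are hypothetical, the standard genetic code is assumed, and the abstract does not state which codon is actually affected, so the output is a list of candidates rather than a finding of the study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def codon_position(cds_pos):
    """Map a 1-based coding-sequence position to (codon number, base within codon)."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1


def codons_made_stop(ref, alt, base_in_codon):
    """Reference codons that become a stop codon when `ref` at the given
    1-based position within the codon is replaced by `alt`."""
    hits = []
    for stop in sorted(STOP_CODONS):
        if stop[base_in_codon - 1] == alt:
            original = stop[:base_in_codon - 1] + ref + stop[base_in_codon:]
            if original not in STOP_CODONS:
                hits.append(original)
    return hits


codon, base = codon_position(598)
print(codon, base)
print(codons_made_stop("C", "T", base))
```

    Position 598 is the first base of codon 200, so a C>T nonsense change there is only possible for a reference codon of CAA, CAG or CGA.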

  9. Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora

    Directory of Open Access Journals (Sweden)

    Maria Eguiluz

    2017-11-01

    Full Text Available Plinia trunciflora is a Brazilian native fruit tree from the Myrtaceae family, also known as jaboticaba. This species has great potential by its fruit production. Due to the high content of essential oils in their leaves and of anthocyanins in the fruits, there is also an increasing interest by the pharmaceutical industry. Nevertheless, there are few studies focusing on its molecular biology and genetic characterization. We herein report the complete chloroplast (cp) genome of P. trunciflora using high-throughput sequencing and compare it to other previously sequenced Myrtaceae genomes. The cp genome of P. trunciflora is 159,512 bp in size, comprising inverted repeats of 26,414 bp and single-copy regions of 88,097 bp (LSC) and 18,587 bp (SSC). The genome contains 111 single-copy genes (77 protein-coding, 30 tRNA and four rRNA genes). Phylogenetic analysis using 57 cp protein-coding genes demonstrated that P. trunciflora, Eugenia uniflora and Acca sellowiana form a cluster with closer relationship to Syzygium cumini than with Eucalyptus. The complete cp sequence reported here can be used in evolutionary and population genetics studies, contributing to resolve the complex taxonomy of this species and fill the gap in genetic characterization.

  10. Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora

    Science.gov (United States)

    Eguiluz, Maria; Yuyama, Priscila Mary; Guzman, Frank; Rodrigues, Nureyev Ferreira; Margis, Rogerio

    2017-01-01

    Plinia trunciflora is a Brazilian native fruit tree from the Myrtaceae family, also known as jaboticaba. This species has great potential by its fruit production. Due to the high content of essential oils in their leaves and of anthocyanins in the fruits, there is also an increasing interest by the pharmaceutical industry. Nevertheless, there are few studies focusing on its molecular biology and genetic characterization. We herein report the complete chloroplast (cp) genome of P. trunciflora using high-throughput sequencing and compare it to other previously sequenced Myrtaceae genomes. The cp genome of P. trunciflora is 159,512 bp in size, comprising inverted repeats of 26,414 bp and single-copy regions of 88,097 bp (LSC) and 18,587 bp (SSC). The genome contains 111 single-copy genes (77 protein-coding, 30 tRNA and four rRNA genes). Phylogenetic analysis using 57 cp protein-coding genes demonstrated that P. trunciflora, Eugenia uniflora and Acca sellowiana form a cluster with closer relationship to Syzygium cumini than with Eucalyptus. The complete cp sequence reported here can be used in evolutionary and population genetics studies, contributing to resolve the complex taxonomy of this species and fill the gap in genetic characterization. PMID:29111566

  11. Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora.

    Science.gov (United States)

    Eguiluz, Maria; Yuyama, Priscila Mary; Guzman, Frank; Rodrigues, Nureyev Ferreira; Margis, Rogerio

    2017-01-01

    Plinia trunciflora is a Brazilian native fruit tree from the Myrtaceae family, also known as jaboticaba. This species has great potential by its fruit production. Due to the high content of essential oils in their leaves and of anthocyanins in the fruits, there is also an increasing interest by the pharmaceutical industry. Nevertheless, there are few studies focusing on its molecular biology and genetic characterization. We herein report the complete chloroplast (cp) genome of P. trunciflora using high-throughput sequencing and compare it to other previously sequenced Myrtaceae genomes. The cp genome of P. trunciflora is 159,512 bp in size, comprising inverted repeats of 26,414 bp and single-copy regions of 88,097 bp (LSC) and 18,587 bp (SSC). The genome contains 111 single-copy genes (77 protein-coding, 30 tRNA and four rRNA genes). Phylogenetic analysis using 57 cp protein-coding genes demonstrated that P. trunciflora, Eugenia uniflora and Acca sellowiana form a cluster with closer relationship to Syzygium cumini than with Eucalyptus. The complete cp sequence reported here can be used in evolutionary and population genetics studies, contributing to resolve the complex taxonomy of this species and fill the gap in genetic characterization.
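
    The region sizes and gene counts quoted for the P. trunciflora chloroplast genome are internally consistent, which a short check makes explicit. The sketch assumes the usual quadripartite layout of a chloroplast genome (one large single-copy region, one small single-copy region and two copies of the inverted repeat); the variable names are illustrative.

```python
# Region sizes in base pairs, as quoted in the abstract.
LSC = 88_097   # large single-copy region
SSC = 18_587   # small single-copy region
IR = 26_414    # one inverted-repeat copy

# Assumption: the inverted repeat is present in two copies.
genome_size = LSC + SSC + 2 * IR

# Single-copy gene counts, as quoted in the abstract.
protein_coding, trna, rrna = 77, 30, 4
gene_total = protein_coding + trna + rrna

print(genome_size, gene_total)
```

    Both sums reproduce the reported totals of 159,512 bp and 111 genes.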

  12. Preliminary Genomic Characterization of Ten Hardwood Tree Species from Multiplexed Low Coverage Whole Genome Sequencing.

    Directory of Open Access Journals (Sweden)

    Margaret Staton

    Full Text Available Forest health issues are on the rise in the United States, resulting from introduction of alien pests and diseases, coupled with abiotic stresses related to climate change. Increasingly, forest scientists are finding genetic/genomic resources valuable in addressing forest health issues. For a set of ten ecologically and economically important native hardwood tree species representing a broad phylogenetic spectrum, we used low coverage whole genome sequencing from multiplex Illumina paired ends to economically profile their genomic content. For six species, the genome content was further analyzed by flow cytometry in order to determine the nuclear genome size. Sequencing yielded a depth of 0.8X to 7.5X, from which in silico analysis yielded preliminary estimates of gene and repetitive sequence content in the genome for each species. Thousands of genomic SSRs were identified, with a clear predisposition toward dinucleotide repeats and AT-rich repeat motifs. Flanking primers were designed for SSR loci for all ten species, ranging from 891 loci in sugar maple to 18,167 in redbay. In summary, we have demonstrated that useful preliminary genome information including repeat content, gene content and useful SSR markers can be obtained at low cost and time input from a single lane of Illumina multiplex sequence.
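
    Identifying SSRs in low-coverage reads amounts to scanning sequence for short motifs repeated in tandem. The abstract does not say which tool or thresholds were used, so the following is only a minimal sketch for perfect dinucleotide repeats; the function name, the regular expression and the minimum of six repeat units are all assumptions made for illustration.

```python
import re


def find_dinucleotide_ssrs(seq, min_repeats=6):
    """Return (start, motif, repeat_count) for perfect dinucleotide repeats.

    `start` is a 0-based offset into `seq`. Homopolymer runs (AA, CC, ...)
    are skipped because they are not dinucleotide SSRs.
    """
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if motif[0] == motif[1]:
            continue  # homopolymer run, not a dinucleotide repeat
        hits.append((match.start(), motif, len(match.group(0)) // 2))
    return hits


# Hypothetical test sequence: an (AT)8 repeat, an A homopolymer, a (TC)6 repeat.
example = "GGC" + "AT" * 8 + "CCG" + "A" * 14 + "GT" + "TC" * 6 + "G"
print(find_dinucleotide_ssrs(example))
```

    Flanking primers, as described in the abstract, would then be designed in the sequence on either side of each reported start and end position.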

  13. Sequencing Infrastructure Investments under Deep Uncertainty Using Real Options Analysis

    Directory of Open Access Journals (Sweden)

    Nishtha Manocha

    2018-02-01

    Full Text Available The adaptation tipping point and adaptation pathway approaches developed to make decisions under deep uncertainty do not shed light on which among the multiple available pathways should be chosen as the preferred pathway. This creates the need to extend these approaches by means of suitable tools that can help sequence actions and subsequently enable the outlining of relevant policies. This paper presents two sequencing approaches, namely, the “Build to Target” and “Build Up” approaches, to aid in sub-selecting a set of preferred pathways. The two approaches differ in the levels of flexibility they offer. They are exemplified by means of two case studies wherein the Net Present Valuation and the Real Options Analysis are employed as selection criteria. The results demonstrate the benefit of these two approaches when used in conjunction with the adaptation pathways and show how the pathways selected by means of a Build to Target approach generally have a value greater than, or at least the same as, the pathways selected by the Build Up approach. Further, this paper also demonstrates the capacity of Real Options to quantify and capture the economic value of flexibility, which cannot be done by traditional valuation approaches such as Net Present Valuation.
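
    The contrast between Net Present Valuation and Real Options Analysis can be illustrated with a toy calculation. The sketch below does not reproduce the paper's case studies or its Build to Target and Build Up approaches; it assumes a deliberately simple option (a follow-up investment that is only made in futures where it pays off), and every cash flow, probability and function name is hypothetical.

```python
def npv(cash_flows, rate):
    """Net present value of yearly cash flows; index 0 is the present."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def committed_value(scenarios, rate):
    """Expected NPV when the follow-up investment is made in every scenario."""
    return sum(p * (npv(base, rate) + npv(extra, rate))
               for p, base, extra in scenarios)


def flexible_value(scenarios, rate):
    """Expected NPV when the follow-up investment is made only where it adds value."""
    return sum(p * (npv(base, rate) + max(npv(extra, rate), 0.0))
               for p, base, extra in scenarios)


# Hypothetical scenarios: (probability, base cash flows, follow-up cash flows).
scenarios = [
    (0.5, [-100, 60, 60], [0, -50, 80]),  # future in which the follow-up pays off
    (0.5, [-100, 60, 60], [0, -50, 20]),  # future in which it does not
]

rate = 0.05  # illustrative discount rate
flexibility = flexible_value(scenarios, rate) - committed_value(scenarios, rate)
print("value of flexibility:", flexibility)
```

    The difference between the two expected values is the part that a single fixed-plan NPV figure leaves out, which is the sense in which the abstract speaks of capturing the economic value of flexibility.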

  14. [Study on sequence characterized amplified region (SCAR) markers of Cornus officinalis].

    Science.gov (United States)

    Chen, Suiqing; Lu, Xiaolei; Wang, Lili

    2011-05-01

    To establish sequence characterized amplified region (SCAR) markers of Cornus officinalis and provide a scientific basis for molecular identification of C. officinalis. Random primers were screened through RAPD to obtain specific RAPD marker bands. The RAPD marker bands were separated, extracted, cloned and sequenced. Both ends of the sequence of the RAPD marker bands were determined. A pair of specific primers was designed for a conventional PCR reaction, and a SCAR marker was acquired. Four pairs of primers were designed based on the sequence of the RAPD marker bands. The DNA of the seven varieties of C. officinalis was amplified by using the YST38 and YST43 primers. The results showed that the seven varieties of C. officinalis were able to produce a single PCR product. It was an effective way to identify C. officinalis. The varieties with cylindrical and long-pear shape fruits amplified by YST38 showed a specific band, which could be used as evidence for variety identification. Seven varieties of C. officinalis were amplified by using primer YST39. But the size of the band of the variety with spindly shape fruit (35,0400 bp) was about 300 bp, which was shorter than those of the varieties with the other shape fruits of C. officinalis (650-700 bp). The variety with the spindly shape fruit could be identified through this difference. The primer YST92 could produce a fragment of 600-700 bp in the varieties with cylindrical and long-pear shape fruits, a fragment of 200-300 bp in the varieties with oval and short-cylindrical shape fruits, and no fragment in the varieties with long cylindrical, elliptic and short-pear shape fruits, which could be used to select the different shapes of C. officinalis. The SCAR marker is established and can be used as the basis for breeding and distinguishing the varieties of C. officinalis.

  15. A comparison of parallel pyrosequencing and sanger clone-based sequencing and its impact on the characterization of the genetic diversity of HIV-1.

    Directory of Open Access Journals (Sweden)

    Binhua Liang

    Full Text Available BACKGROUND: Pyrosequencing technology has the potential to rapidly sequence HIV-1 viral quasispecies without requiring the traditional approach of cloning. In this study, we investigated the utility of ultra-deep pyrosequencing to characterize genetic diversity of the HIV-1 gag quasispecies and assessed the possible contribution of pyrosequencing technology in studying HIV-1 biology and evolution. METHODOLOGY/PRINCIPAL FINDINGS: HIV-1 gag gene was amplified from 96 patients using nested PCR. The PCR products were cloned and sequenced using capillary based Sanger fluorescent dideoxy termination sequencing. The same PCR products were also directly sequenced using the 454 pyrosequencing technology. The two sequencing methods were evaluated for their ability to characterize quasispecies variation, and to reveal sites under host immune pressure for their putative functional significance. A total of 14,034 variations were identified by 454 pyrosequencing versus 3,632 variations by Sanger clone-based (SCB) sequencing. 11,050 of these variations were detected only by pyrosequencing. These undetected variations were located in the HIV-1 Gag region which is known to contain putative cytotoxic T lymphocyte (CTL) and neutralizing antibody epitopes, and sites related to virus assembly and packaging. Analysis of the positively selected sites derived by the two sequencing methods identified several differences. All of them were located within the CTL epitope regions. CONCLUSIONS/SIGNIFICANCE: Ultra-deep pyrosequencing has proven to be a powerful tool for characterization of HIV-1 genetic diversity with enhanced sensitivity, efficiency, and accuracy. It also improved reliability of downstream evolutionary and functional analysis of HIV-1 quasispecies.
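
    The three variation counts in this abstract imply how much the two methods overlap. The arithmetic below is a sketch under one assumption that the abstract does not state explicitly: that the variations found by each method can be compared as sets, so that every pyrosequencing variation is either unique to pyrosequencing or shared with Sanger clone-based (SCB) sequencing. The derived numbers are therefore implied values, not figures reported by the authors.

```python
# Counts quoted in the abstract.
pyro_total = 14_034   # variations identified by 454 pyrosequencing
scb_total = 3_632     # variations identified by SCB sequencing
pyro_only = 11_050    # variations detected only by pyrosequencing

# Implied counts, assuming the two result sets are directly comparable.
shared = pyro_total - pyro_only        # found by both methods
scb_only = scb_total - shared          # found only by SCB sequencing
union = pyro_only + shared + scb_only  # distinct variations overall

print(shared, scb_only, union)
print(round(100 * shared / scb_total, 1), "% of SCB variations also seen by pyrosequencing")
```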

  16. Oligo-Miocene reservoir sequence characterization and structuring in the Sisseb El Alem-Kalaa Kebira regions (Northeastern Tunisia)

    Science.gov (United States)

    Houatmia, Faten; Khomsi, Sami; Bédir, Mourad

    2015-11-01

    The Sisseb El Alem-Enfidha basin is located in northeastern Tunisia. It is bordered by the Nadhour-Saouaf syncline to the north, the Kairouan plain to the south, the Mediterranean Sea to the east and the Tunisian Atlassic "dorsale" to the west. Oligocene and Miocene deltaic deposits form the main potential deep aquifers in this basin, with high porosity (25%-30%). The interpretation of twenty seismic reflection profiles, calibrated by wireline logging data from twelve oil wells, hydraulic wells and geologic field sections, highlighted the impact of tectonics on the structuring geometry of the Oligo-Miocene sandstone reservoirs and their distribution in raised structures and subsurface depressions. Miocene seismostratigraphic analysis from the Ain Ghrab Formation (Langhian) to the Segui Formation (Quaternary) showed five third-order seismic sequence deposits and nine extended lenticular sandy reservoir bodies limited by toplap and downlap unconformity surfaces. Oligocene deposits also presented five third-order seismic sequences with five extended lenticular sandy reservoir bodies. The depth and thickness maps of these sequence reservoir packages exhibited the structuring of this basin into sub-basins characterized by important lateral and vertical variations in geometry and thickness. Correlation of petroleum well wireline logs with clay volume calculation showed heterogeneous multilayer Oligocene and Miocene reservoirs formed by the arrangement of fourteen sandstone bodies that may be good reservoirs, separated by impermeable clay packages and affected by faults. Reservoir levels correspond mainly to the lower system tract (LST) of the sequences. Intensive fracturing by deep-seated faults bounding the different sub-basins plays a great role in water surface recharge and inter-layer circulation between the affected reservoirs. The total pore volume of the Oligo-Miocene reservoir sandy bodies in the study area is estimated at about 4 × 10^12 m^3 and equivalent to 4

  17. Analysis of mutations in the entire coding sequence of the factor VIII gene

    Energy Technology Data Exchange (ETDEWEB)

    Bidichadani, S.I.; Lanyon, W.G.; Connor, J.M. [Glasgow Univ. (United Kingdom)] [and others

    1994-09-01

    Hemophilia A is a common X-linked recessive disorder of bleeding caused by deleterious mutations in the gene for clotting factor VIII. The large size of the factor VIII gene, the high frequency of de novo mutations and its tissue-specific expression complicate the detection of mutations. We have used a combination of RT-PCR of ectopic factor VIII transcripts and genomic DNA-PCRs to amplify the entire essential sequence of the factor VIII gene. This is followed by chemical mismatch cleavage analysis and direct sequencing in order to facilitate a comprehensive search for mutations. We describe the characterization of nine potentially pathogenic mutations, six of which are novel. In each case, a correlation of the genotype with the observed phenotype is presented. In order to evaluate the pathogenicity of the five missense mutations detected, we have analyzed them for evolutionary sequence conservation and for their involvement of sequence motifs catalogued in the PROSITE database of protein sites and patterns.

  18. Nonlinear analysis of sequence repeats of multi-domain proteins

    Energy Technology Data Exchange (ETDEWEB)

    Huang Yanzhao [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China); Li Mingfeng [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China); Xiao Yi [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China)]. E-mail: lmf_bill@sina.com

    2007-11-15

    Many multi-domain proteins have repetitive three-dimensional structures but nearly-random amino acid sequences. In the present paper, by using a modified recurrence plot proposed by us previously, we show that these amino acid sequences have hidden repetitions in fact. These results indicate that the repetitive domain structures are encoded by the repetitive sequences. This also gives a method to detect the repetitive domain structures directly from amino acid sequences.
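
    A recurrence plot marks every pair of positions at which a sequence repeats itself, so hidden periodicity shows up as diagonal lines at a fixed offset. The abstract refers to a modified recurrence plot whose details are not given here, so the following is only a sketch of the plain, exact-match version for a symbolic sequence; the window length, the function names and the example peptide are hypothetical.

```python
def recurrence_matrix(seq, window=3):
    """R[i][j] is 1 when the length-`window` words starting at i and j are identical."""
    n = len(seq) - window + 1
    words = [seq[i:i + window] for i in range(n)]
    return [[1 if words[i] == words[j] else 0 for j in range(n)]
            for i in range(n)]


def diagonal_counts(matrix):
    """Number of recurrence points on each diagonal at offset k > 0."""
    n = len(matrix)
    return {k: sum(matrix[i][i + k] for i in range(n - k))
            for k in range(1, n)}


# Hypothetical sequence made of one six-residue unit repeated twice.
seq = "MKVLAGMKVLAG"
matrix = recurrence_matrix(seq, window=3)
counts = diagonal_counts(matrix)
best_offset = max(counts, key=counts.get)
print(best_offset, counts[best_offset])
```

    In this toy case the densest diagonal sits at offset 6, the length of the repeated unit. Real protein repeats are rarely exact, which is presumably why a modified plot with a softer notion of similarity is needed for nearly random sequences.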

  19. Application of whole genome shotgun sequencing for detection and characterization of genetically modified organisms and derived products.

    Science.gov (United States)

    Holst-Jensen, Arne; Spilsberg, Bjørn; Arulandhu, Alfred J; Kok, Esther; Shi, Jianxin; Zel, Jana

    2016-07-01

    The emergence of high-throughput, massive or next-generation sequencing technologies has created a completely new foundation for molecular analyses. Various selective enrichment processes are commonly applied to facilitate detection of predefined (known) targets. Such approaches, however, inevitably introduce a bias and are prone to miss unknown targets. Here we review the application of high-throughput sequencing technologies and the preparation of fit-for-purpose whole genome shotgun sequencing libraries for the detection and characterization of genetically modified and derived products. The potential impact of these new sequencing technologies for the characterization, breeding selection, risk assessment, and traceability of genetically modified organisms and genetically modified products is yet to be fully acknowledged. The published literature is reviewed, and the prospects for future developments and use of the new sequencing technologies for these purposes are discussed.

  20. BlastKOALA and GhostKOALA: KEGG Tools for Functional Characterization of Genome and Metagenome Sequences.

    Science.gov (United States)

    Kanehisa, Minoru; Sato, Yoko; Morishima, Kanae

    2016-02-22

    BlastKOALA and GhostKOALA are automatic annotation servers for genome and metagenome sequences, which perform KO (KEGG Orthology) assignments to characterize individual gene functions and reconstruct KEGG pathways, BRITE hierarchies and KEGG modules to infer high-level functions of the organism or the ecosystem. Both servers are made freely available at the KEGG Web site (http://www.kegg.jp/blastkoala/). In BlastKOALA, the KO assignment is performed by a modified version of the internally used KOALA algorithm after the BLAST search against a non-redundant dataset of pangenome sequences at the species, genus or family level, which is generated from the KEGG GENES database by retaining the KO content of each taxonomic category. In GhostKOALA, which utilizes more rapid GHOSTX for database search and is suitable for metagenome annotation, the pangenome dataset is supplemented with Cd-hit clusters including those for viral genes. The result files may be downloaded and manipulated for further KEGG Mapper analysis, such as comparative pathway analysis using multiple BlastKOALA results. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  1. Human factors review for Severe Accident Sequence Analysis (SASA)

    International Nuclear Information System (INIS)

    Krois, P.A.; Haas, P.M.; Manning, J.J.; Bovell, C.R.

    1984-01-01

    The paper will discuss work being conducted during this human factors review including: (1) support of the Severe Accident Sequence Analysis (SASA) Program based on an assessment of operator actions, and (2) development of a descriptive model of operator severe accident management. Research by SASA analysts on the Browns Ferry Unit One (BF1) anticipated transient without scram (ATWS) was supported through a concurrent assessment of operator performance to demonstrate contributions to SASA analyses from human factors data and methods. A descriptive model was developed called the Function Oriented Accident Management (FOAM) model, which serves as a structure for bridging human factors, operations, and engineering expertise and which is useful for identifying needs/deficiencies in the area of accident management. The assessment of human factors issues related to ATWS required extensive coordination with SASA analysts. The analysis was consolidated primarily to six operator actions identified in the Emergency Procedure Guidelines (EPGs) as being the most critical to the accident sequence. These actions were assessed through simulator exercises, qualitative reviews, and quantitative human reliability analyses. The FOAM descriptive model assumes as a starting point that multiple operator/system failures exceed the scope of procedures and necessitate a knowledge-based emergency response by the operators. The FOAM model provides a functionally-oriented structure for assembling human factors, operations, and engineering data and expertise into operator guidance for unconventional emergency responses to mitigate severe accident progression and avoid/minimize core degradation. Operators must also respond to potential radiological release beyond plant protective barriers. Research needs in accident management and potential uses of the FOAM model are described. 11 references, 1 figure

  2. Sequence analysis of L RNA of Lassa virus

    International Nuclear Information System (INIS)

    Vieth, Simon; Torda, Andrew E.; Asper, Marcel; Schmitz, Herbert; Guenther, Stephan

    2004-01-01

    The L RNA of three Lassa virus strains originating from Nigeria, Ghana/Ivory Coast, and Sierra Leone was sequenced and the data subjected to structure predictions and phylogenetic analyses. The L gene products had 2218-2221 residues, diverged by 18% at the amino acid level, and contained several conserved regions. Only one region of 504 residues (positions 1043-1546) could be assigned a function, namely that of an RNA polymerase. Secondary structure predictions suggest that this domain is very similar to RNA-dependent RNA polymerases of known structure encoded by plus-strand RNA viruses, permitting a model to be built. Outside the polymerase region, there is little structural data, except for regions of strong alpha-helical content and probably a coiled-coil domain at the N terminus. No evidence for reassortment or recombination during Lassa virus evolution was found. The secondary structure-assisted alignment of the RNA polymerase region permitted a reliable reconstruction of the phylogeny of all negative-strand RNA viruses, indicating that Arenaviridae are most closely related to Nairoviruses. In conclusion, the data provide a basis for structural and functional characterization of the Lassa virus L protein and reveal new insights into the phylogeny of negative-strand RNA viruses

  3. Sequence analysis of cereal sucrose synthase genes and isolation ...

    African Journals Online (AJOL)

    SERVER

    2007-10-18

    Oct 18, 2007 ... sequencing of sucrose synthase gene fragment from sorghum using primers designed at their conserved exons. MATERIALS AND METHODS. Multiple sequence alignment. Sucrose synthase gene sequences of various cereals like rice, maize, and barley were accessed from NCBI GenBank database.

  4. Chimera: construction of chimeric sequences for phylogenetic analysis

    NARCIS (Netherlands)

    Leunissen, J.A.M.

    2003-01-01

    Chimera allows the construction of chimeric protein or nucleic acid sequence files by concatenating sequences from two or more sequence files in PHYLIP formats. It allows the user to interactively select genes and species from the input files. The concatenated result is stored to one single output

  5. Molecular Characterization of Five Potyviruses Infecting Korean Sweet Potatoes Based on Analyses of Complete Genome Sequences

    Directory of Open Access Journals (Sweden)

    Hae-Ryun Kwak

    2015-12-01

    Sweet potatoes (Ipomoea batatas L.) are grown extensively in tropical and temperate regions and are important food crops worldwide. In Korea, potyviruses, including Sweet potato feathery mottle virus (SPFMV), Sweet potato virus C (SPVC), Sweet potato virus G (SPVG), Sweet potato virus 2 (SPV2), and Sweet potato latent virus (SPLV), have been detected in sweet potato fields at a high (~95%) incidence. In the present work, complete genome sequences of 18 isolates, representing the five potyviruses mentioned above, were compared with previously reported genome sequences. The complete genomes consisted of 10,081 to 10,830 nucleotides, excluding the poly-A tails. Their genomic organizations were typical of the Potyvirus genus, including one target open reading frame coding for a putative polyprotein. Based on phylogenetic analyses and sequence comparisons, the Korean SPFMV isolates belonged to the strains RC and O with >98% nucleotide sequence identity. Korean SPVC isolates had 99% identity to the Japanese isolate SPVC-Bungo and 70% identity to the SPFMV isolates. The Korean SPVG isolates showed 99% identity to the three previously reported SPVG isolates. Korean SPV2 isolates had 97% identity to the SPV2 GWB-2 isolate from the USA. Korean SPLV isolates had a relatively low (88%) nucleotide sequence identity with the Taiwanese SPLV-TW isolates, and they were phylogenetically distantly related to SPFMV isolates. Recombination analysis revealed that possible recombination events occurred in the P1, HC-Pro and NIa-NIb regions of SPFMV and SPLV isolates, and these regions were identified as hotspots for recombination in the sweet potato potyviruses.

  6. Accident Sequence Evaluation Program: Human reliability analysis procedure

    Energy Technology Data Exchange (ETDEWEB)

    Swain, A.D.

    1987-02-01

    This document presents a shortened version of the procedure, models, and data for human reliability analysis (HRA) which are presented in the Handbook of Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications (NUREG/CR-1278, August 1983). This shortened version was prepared and tried out as part of the Accident Sequence Evaluation Program (ASEP) funded by the US Nuclear Regulatory Commission and managed by Sandia National Laboratories. The intent of this new HRA procedure, called the "ASEP HRA Procedure," is to enable systems analysts, with minimal support from experts in human reliability analysis, to make estimates of human error probabilities and other human performance characteristics which are sufficiently accurate for many probabilistic risk assessments. The ASEP HRA Procedure consists of a Pre-Accident Screening HRA, a Pre-Accident Nominal HRA, a Post-Accident Screening HRA, and a Post-Accident Nominal HRA. The procedure in this document includes changes made after tryout and evaluation of the procedure in four nuclear power plants by four different systems analysts and related personnel, including human reliability specialists. The changes consist of some additional explanatory material (including examples), and more detailed definitions of some of the terms. 42 refs.

  7. Accident Sequence Evaluation Program: Human reliability analysis procedure

    International Nuclear Information System (INIS)

    Swain, A.D.

    1987-02-01

    This document presents a shortened version of the procedure, models, and data for human reliability analysis (HRA) which are presented in the Handbook of Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications (NUREG/CR-1278, August 1983). This shortened version was prepared and tried out as part of the Accident Sequence Evaluation Program (ASEP) funded by the US Nuclear Regulatory Commission and managed by Sandia National Laboratories. The intent of this new HRA procedure, called the "ASEP HRA Procedure," is to enable systems analysts, with minimal support from experts in human reliability analysis, to make estimates of human error probabilities and other human performance characteristics which are sufficiently accurate for many probabilistic risk assessments. The ASEP HRA Procedure consists of a Pre-Accident Screening HRA, a Pre-Accident Nominal HRA, a Post-Accident Screening HRA, and a Post-Accident Nominal HRA. The procedure in this document includes changes made after tryout and evaluation of the procedure in four nuclear power plants by four different systems analysts and related personnel, including human reliability specialists. The changes consist of some additional explanatory material (including examples), and more detailed definitions of some of the terms. 42 refs.

  8. A Quantitative Accident Sequence Analysis for a VHTR

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Jintae; Lee, Joeun; Jae, Moosung [Hanyang University, Seoul (Korea, Republic of)

    2016-05-15

    In Korea, the basic design features of the VHTR are currently being discussed in various design concepts. Probabilistic risk assessment (PRA) offers a logical and structured method to assess risks of a large and complex engineered system, such as a nuclear power plant. It will be introduced at an early stage in the design, and will be upgraded at various design and licensing stages as the design matures and the design details are defined. Risk insights to be developed from the PRA are viewed as essential to developing a design that is optimized in meeting safety objectives and in interpreting the applicability of the existing demands to the safety design approach of the VHTR. In this study, initiating events which may occur in VHTRs were selected through the MLD method. The initiating events were then grouped into four categories for the accident sequence analysis. Initiating event frequencies and safety system failure rates were calculated by using reliability data obtained from the available sources and fault tree analysis. After quantification, uncertainty analysis was conducted. The SR and LR frequencies were calculated as 7.52E-10/RY and 7.91E-16/RY, respectively, which are lower than the core damage frequency of LWRs.

  9. Comparing methods of classifying life courses: Sequence analysis and latent class analysis

    NARCIS (Netherlands)

    Elzinga, C.H.; Liefbroer, Aart C.; Han, Sapphire

    2017-01-01

    We compare life course typology solutions generated by sequence analysis (SA) and latent class analysis (LCA). First, we construct an analytic protocol to arrive at typology solutions for both methodologies and present methods to compare the empirical quality of alternative typologies. We apply this

  10. Comparing methods of classifying life courses: sequence analysis and latent class analysis

    NARCIS (Netherlands)

    Han, Y.; Liefbroer, A.C.; Elzinga, C.

    2017-01-01

    We compare life course typology solutions generated by sequence analysis (SA) and latent class analysis (LCA). First, we construct an analytic protocol to arrive at typology solutions for both methodologies and present methods to compare the empirical quality of alternative typologies. We apply this

  11. Characterization of Campylobacter jejuni applying flaA short variable region sequencing, multilocus sequencing and Fourier transform infrared spectroscopy

    DEFF Research Database (Denmark)

    Josefsen, Mathilde Hartmann; Bonnichsen, Lise; Larsson, Jonas

    flaA short variable region sequencing and phenetic Fourier transform infrared (FTIR) spectroscopy were applied to a collection of 102 Campylobacter jejuni isolated from continuous sampling of organic, free range geese and chickens. FTIR has been shown to serve as a valuable tool in typing...

  12. Genome-Wide Analysis of Simple Sequence Repeats in Bitter Gourd (Momordica charantia)

    Directory of Open Access Journals (Sweden)

    Junjie Cui

    2017-06-01

    Bitter gourd (Momordica charantia) is widely cultivated as a vegetable and medicinal herb in many Asian and African countries. After the sequencing of the cucumber (Cucumis sativus), watermelon (Citrullus lanatus), and melon (Cucumis melo) genomes, bitter gourd became the fourth cucurbit species whose whole genome was sequenced. However, a comprehensive analysis of simple sequence repeats (SSRs) in bitter gourd, including a comparison with the three aforementioned cucurbit species, has not yet been published. Here, we identified a total of 188,091 and 167,160 SSR motifs in the genomes of the bitter gourd lines ‘Dali-11’ and ‘OHB3-1,’ respectively. Subsequently, the SSR content, motif lengths, and classified motif types were characterized for the bitter gourd genomes and compared among all the cucurbit genomes. Lastly, a large set of 138,727 unique in silico SSR primer pairs was designed for bitter gourd. Among these, 71 primers were selected, all of which successfully amplified SSRs from the two bitter gourd lines ‘Dali-11’ and ‘K44’. To further examine the utilization of unique SSR primers, 21 SSR markers were used to genotype a collection of 211 bitter gourd lines from all over the world. A model-based clustering method and phylogenetic analysis indicated a clear separation among the geographic groups. The genomic SSR markers developed in this study have considerable potential value in advancing bitter gourd research.

  13. RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

    Science.gov (United States)

    Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

    2012-01-01

    RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications for sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information such as the total number of nucleotide bases and ATGC base contents along with their respective percentages, and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available at http://www.cemb.edu.pk/sw.html. Abbreviations: RDNAnalyzer, Random DNA Analyser; GUI, graphical user interface; XAML, Extensible Application Markup Language.
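
    As an illustration of the Nussinov recurrence named in this record, here is a minimal Python sketch that counts the maximum number of nested Watson-Crick pairs in a DNA string. It is not RDNAnalyzer's code (the tool is written in C# and extends the algorithm in ways the abstract does not describe); the function name `nussinov_pairs` and the `min_loop` parameter are assumptions made for this example.

```python
def nussinov_pairs(seq, min_loop=3):
    """Maximum number of nested A-T / G-C pairs (basic Nussinov recurrence).

    min_loop is an assumed constraint: paired positions i < j must
    satisfy j - i > min_loop, i.e. at least min_loop bases lie between them.
    """
    pairs = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    # score[i][j] holds the best pair count for the subsequence seq[i..j].
    score = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1 and 2: position i or position j is left unpaired.
            best = max(score[i + 1][j], score[i][j - 1])
            # Case 3: i pairs with j.
            if (seq[i], seq[j]) in pairs:
                best = max(best, score[i + 1][j - 1] + 1)
            # Case 4: split into two independent substructures.
            for k in range(i + 1, j):
                best = max(best, score[i][k] + score[k + 1][j])
            score[i][j] = best
    return score[0][n - 1] if n else 0
```

    For example, `nussinov_pairs("GGGAAATCCC")` counts the three G-C pairs of a simple hairpin; recovering the pairing itself would need an additional traceback step, which is omitted here.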

  14. Characterization of transcriptome dynamics during watermelon fruit development: sequencing, assembly, annotation and gene expression profiles.

    Science.gov (United States)

    Guo, Shaogui; Liu, Jingan; Zheng, Yi; Huang, Mingyun; Zhang, Haiying; Gong, Guoyi; He, Hongju; Ren, Yi; Zhong, Silin; Fei, Zhangjun; Xu, Yong

    2011-09-21

    Cultivated watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai var. lanatus] is an important agricultural crop worldwide. The fruit of watermelon undergoes distinct stages of development with dramatic changes in its size, color, sweetness, texture and aroma. In order to better understand the genetic and molecular basis of these changes and significantly expand the watermelon transcript catalog, we have selected four critical stages of watermelon fruit development and used Roche/454 next-generation sequencing technology to generate a large expressed sequence tag (EST) dataset and a comprehensive transcriptome profile for watermelon fruit flesh tissues. We performed half a Roche/454 GS-FLX run for each of the four watermelon fruit developmental stages (immature white, white-pink flesh, red flesh and over-ripe) and obtained 577,023 high quality ESTs with an average length of 302.8 bp. De novo assembly of these ESTs together with 11,786 watermelon ESTs collected from GenBank produced 75,068 unigenes with a total length of approximately 31.8 Mb. Overall 54.9% of the unigenes showed significant similarities to known sequences in the GenBank non-redundant (nr) protein database and around two-thirds of them matched proteins of cucumber, the most closely-related species with a sequenced genome. The unigenes were further assigned gene ontology (GO) terms and mapped to biochemical pathways. More than 5,000 SSRs were identified from the EST collection. Furthermore we carried out digital gene expression analysis of these ESTs and identified 3,023 genes that were differentially expressed during watermelon fruit development and ripening, which provided novel insights into watermelon fruit biology and a comprehensive resource of candidate genes for future functional analysis. We then generated profiles of several interesting metabolites that are important to fruit quality including pigmentation and sweetness. Integrative analysis of metabolite and digital gene expression

  15. Discovery and characterization of 3000+ main-sequence binaries from APOGEE spectra

    Science.gov (United States)

    El-Badry, Kareem; Ting, Yuan-Sen; Rix, Hans-Walter; Quataert, Eliot; Weisz, Daniel R.; Cargile, Phillip; Conroy, Charlie; Hogg, David W.; Bergemann, Maria; Liu, Chao

    2018-05-01

    We develop a data-driven spectral model for identifying and characterizing spatially unresolved multiple-star systems and apply it to APOGEE DR13 spectra of main-sequence stars. Binaries and triples are identified as targets whose spectra can be significantly better fit by a superposition of two or three model spectra, drawn from the same isochrone, than any single-star model. From an initial sample of ~20 000 main-sequence targets, we identify ~2500 binaries in which both the primary and secondary stars contribute detectably to the spectrum, simultaneously fitting for the velocities and stellar parameters of both components. We additionally identify and fit ~200 triple systems, as well as ~700 velocity-variable systems in which the secondary does not contribute detectably to the spectrum. Our model simplifies the process of simultaneously fitting single- or multi-epoch spectra with composite models and does not depend on a velocity offset between the two components of a binary, making it sensitive to traditionally undetectable systems with periods of hundreds or thousands of years. In agreement with conventional expectations, almost all the spectrally identified binaries with measured parallaxes fall above the main sequence in the colour-magnitude diagram. We find excellent agreement between spectrally and dynamically inferred mass ratios for the ~600 binaries in which a dynamical mass ratio can be measured from multi-epoch radial velocities. We obtain full orbital solutions for 64 systems, including 14 close binaries within hierarchical triples. We make available catalogues of stellar parameters, abundances, mass ratios, and orbital parameters.

  16. Frame sequences analysis technique of linear objects movement

    Science.gov (United States)

    Oshchepkova, V. Y.; Berg, I. A.; Shchepkin, D. V.; Kopylova, G. V.

    2017-12-01

    Data obtained by noninvasive methods are often needed in many fields of science and engineering. Such data can be acquired through video recording at various frame rates and in various light spectra. Quantitative analysis of the movement of the objects being studied then becomes an important component of the research. This work discusses the analysis of the motion of linear objects on a two-dimensional plane. The complexity of this problem increases when the frame contains numerous objects whose images may overlap. This study uses a sequence containing 30 frames at a resolution of 62 × 62 pixels and a frame rate of 2 Hz. It was required to determine the average velocity of the objects' motion. This velocity was found as an average over 8-12 objects with an error of 15%. After processing, the dependencies of the average velocity on control parameters were found. The processing was performed in the software environment GMimPro, with subsequent approximation of the data obtained using the Hill equation.
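
    The record does not say which form of the Hill equation was fitted or what the control parameter was. Assuming the common form v(x) = v_max * x^h / (K^h + x^h), a minimal Python sketch of the model function looks as follows; the names `hill`, `v_max`, `k_half` and `h` are illustrative and do not come from the paper.

```python
def hill(x, v_max, k_half, h):
    """Hill equation in its common saturating form (assumed here).

    x      -- value of the control parameter (must be >= 0)
    v_max  -- plateau velocity reached at large x
    k_half -- value of x at which the response is v_max / 2
    h      -- Hill coefficient, controlling the steepness of the curve
    """
    x_h = x ** h
    return v_max * x_h / (k_half ** h + x_h)
```

    By construction `hill(k_half, v_max, k_half, h)` equals `v_max / 2` for any `h`, which is a quick sanity check before passing the function to a curve-fitting routine.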

  17. Transcriptome sequencing and positive selected genes analysis of Bombyx mandarina.

    Directory of Open Access Journals (Sweden)

    Tingcai Cheng

    The wild silkworm Bombyx mandarina is widely believed to be an ancestor of the domesticated silkworm, Bombyx mori. Silkworms are often used as a model for studying the mechanism of species domestication. Here, we performed transcriptome sequencing of the wild silkworm using an Illumina HiSeq2000 platform. We produced 100,004,078 high-quality reads and assembled them into 50,773 contigs with an N50 length of 1764 bp and a mean length of 941.62 bp. A total of 33,759 unigenes were identified, with 12,805 annotated in the Nr database, 8273 in the Pfam database, and 9093 in the Swiss-Prot database. Expression profile analysis found significant differential expression of 1308 unigenes between the middle silk gland (MSG) and posterior silk gland (PSG). Three sericin genes (sericin 1, sericin 2, and sericin 3) were expressed specifically in the MSG and three fibroin genes (fibroin-H, fibroin-L, and fibroin/P25) were expressed specifically in the PSG. In addition, 32,297 single-nucleotide polymorphisms (SNPs) and 361 insertion-deletions (INDELs) were detected. Comparison with the domesticated silkworm p50/Dazao identified 5,295 orthologous genes, among which 400 might have experienced or be experiencing positive selection according to Ka/Ks analysis. The data and analyses presented here provide insights into silkworm domestication and an invaluable resource for wild silkworm genomics research.
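
    The N50 statistic quoted in this record is the contig length at which the contigs of that length or longer together cover at least half of the assembly. A minimal Python sketch of that definition follows; the function name `n50` is an assumption, and the input is simply a list of contig lengths rather than the authors' assembly output.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed length of all contigs.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")
```

    For the toy input `[2, 3, 4, 5, 6, 7, 8, 9, 10]` the total is 54, and the three longest contigs (10 + 9 + 8 = 27) reach half of it, so the N50 is 8.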

  18. A tapping device for recording and quantitative characterization of rhythmic/auditory sequences.

    Science.gov (United States)

    Piazza, Caterina; Cesareo, Ambra; Caccia, Martina; Reni, Gianluigi; Lorusso, Maria L

    2017-07-01

    The processing of auditory stimuli is essential for the correct perception of language and deficits in this ability are often related to the presence or development of language disorders. The motor imitation (e.g. tapping or beating) of rhythmic sequences can be a very sensitive correlate of deficits in auditory processing. Thus, the study of the tapping performance, with the investigation of both temporal and intensity information, might be very useful. The present work is aimed at the development and preliminary testing of a tapping device to be used for the imitation and/or the production of rhythmic sequences, allowing the recording of both tapping duration and intensity. The device is essentially made up of a Force Sensing Resistor and an Arduino UNO board. It was validated using different sampling frequencies (fs) in a group of 10 young healthy adults investigating its efficacy in terms of touch and intensity detection by means of two testing procedures. Results demonstrated a good performance of the device when programmed with fs equal to 50 and 100 Hz. Moreover, both temporal and intensity parameters were extracted, thus supporting the potential use of the device for the analysis of the imitation or production of rhythmic sequences. This work represents a first step for the development of a useful, low cost tool to support the diagnosis, training and rehabilitation of language disorders.

  19. Cloning, sequencing, and sequence analysis of two novel plasmids from the thermophilic anaerobic bacterium Anaerocellum thermophilum

    DEFF Research Database (Denmark)

    Clausen, Anders; Mikkelsen, Marie Just; Schrøder, I.

    2004-01-01

    The nucleotide sequence of two novel plasmids isolated from the extreme thermophilic anaerobic bacterium Anaerocellum thermophilum DSM6725 (A. thermophilum), growing optimally at 70 °C, has been determined. pBAS2 was found to be a 3653 bp plasmid with a GC content of 43%, and the sequence re...... with highest similarity to DNA repair protein from Campylobacter jejuni (25% aa). Orf34 showed similarity to sigma factors with highest similarity (28% aa) to the sporulation specific Sigma factor, Sigma 28(K) from Bacillus thuringiensis....

  20. Sequence and phylogenetic analysis of chicken anaemia virus obtained from backyard and commercial chickens in Nigeria : research communication

    Directory of Open Access Journals (Sweden)

    D.O. Oluwayelu

    2008-09-01

    This work reports the first molecular analysis study of chicken anaemia virus (CAV) in backyard chickens in Africa using molecular cloning and sequence analysis to characterize CAV strains obtained from commercial chickens and Nigerian backyard chickens. Partial VP1 gene sequences were determined for three CAVs from commercial chickens and for six CAV variants present in samples from a backyard chicken. Multiple alignment analysis revealed that the 6% and 4% nucleotide diversity obtained respectively for the commercial and backyard chicken strains translated to only 2% amino acid diversity for each breed. Overall, the amino acid composition of Nigerian CAVs was found to be highly conserved. Since the partial VP1 gene sequences of two backyard chicken cloned CAV strains (NGR/Cl-8 and NGR/Cl-9) were almost identical and evolutionarily closely related to the commercial chicken strains NGR-1, and NGR-4 and NGR-5, respectively, we concluded that CAV infections had crossed the farm boundary.

  1. Characterization of Rous sarcoma virus-related sequences in the Japanese quail.

    Science.gov (United States)

    Chambers, J A; Cywinski, A; Chen, P J; Taylor, J M

    1986-08-01

    We detected sequences related to the avian retrovirus Rous sarcoma virus within the genome of the Japanese quail, a species previously considered to be free of endogenous avian leukosis virus elements. Using low-stringency conditions of hybridization, we screened a quail genomic library for clones containing retrovirus-related information. Of five clones so selected, one, lambda Q48, contained sequence information related to the gag, pol, and env genes of Rous sarcoma virus arranged in a contiguous fashion and spanning a distance of approximately 5.8 kilobases. This organization is consistent with the presence of an endogenous retroviral element within the Japanese quail genome. Use of this element as a high-stringency probe on Southern blots of genomic digests of several quail DNA demonstrated hybridization to a series of high-molecular-weight bands. By slot hybridization to quail DNA with a cloned probe, it was deduced that there were approximately 300 copies per diploid cell. In addition, the quail element also hybridized at low stringency to the DNA of the White Leghorn chicken and at high stringency to the DNAs of several species of jungle fowl and both true and ruffed pheasants. Limited nucleotide sequencing analysis of lambda Q48 revealed homologies of 65, 52, and 46% compared with the sequence of Rous sarcoma virus strain Prague C for the endonuclease domain of pol, the pol-env junction, and the 3'-terminal region of env, respectively. Comparisons at the amino acid level were also significant, thus confirming the retrovirus relatedness of the cloned quail element.

  2. Automatic analysis of the 2015 Gorkha earthquake aftershock sequence.

    Science.gov (United States)

    Baillard, C.; Lyon-Caen, H.; Bollinger, L.; Rietbrock, A.; Letort, J.; Adhikari, L. B.

    2016-12-01

    The Mw 7.8 Gorkha earthquake, which partially ruptured the Main Himalayan Thrust north of Kathmandu on 25 April 2015, was the largest and most catastrophic earthquake to strike Nepal since the great M8.4 1934 earthquake. This mainshock was followed by multiple aftershocks, among them two notable events that occurred on 12 May with magnitudes of 7.3 Mw and 6.3 Mw. Due to these recent events it became essential for the authorities and for the scientific community to better evaluate the seismic risk in the region through a detailed analysis of the earthquake catalog, amongst others, the spatio-temporal distribution of the Gorkha aftershock sequence. Here we complement this first study by doing a microseismic study using seismic data coming from the eastern part of the Nepalese Seismological Center network associated to one broadband station in Everest. Our primary goal is to deliver an accurate catalog of the aftershock sequence. Due to the exceptional number of events detected we performed an automatic picking/locating procedure which can be split into 4 steps: 1) coarse picking of the onsets using a classical STA/LTA picker, 2) phase association of picked onsets to detect and declare seismic events, 3) Kurtosis pick refinement around theoretical arrival times to increase picking and location accuracy, and 4) local magnitude calculation based on the amplitude of waveforms. This procedure is time efficient ( 1 sec/event), reduces considerably the location uncertainties ( 2 to 5 km errors) and increases the number of events detected compared to manual processing. Indeed, the automatic detection rate is 10 times higher than the manual detection rate. By comparing to the USGS catalog we were able to give a new attenuation law to compute local magnitudes in the region. A detailed analysis of the seismicity shows a clear migration toward the east of the region and a sudden decrease of seismicity 100 km east of Kathmandu which may reveal the presence of a tectonic

  3. Identification and characterization of novel serum microRNA candidates from deep sequencing in cervical cancer patients.

    Science.gov (United States)

    Juan, Li; Tong, Hong-li; Zhang, Pengjun; Guo, Guanghong; Wang, Zi; Wen, Xinyu; Dong, Zhennan; Tian, Ya-ping

    2014-09-03

    Small non-coding microRNAs (miRNAs) are involved in cancer development and progression, and serum profiles of cervical cancer patients may be useful for identifying novel miRNAs. We performed deep sequencing on serum pools of cervical cancer patients and healthy controls with 3 replicates and constructed a small RNA library. We used MIREAP to predict novel miRNAs and identified 2 putative novel miRNAs between serum pools of cervical cancer patients and healthy controls after filtering out pseudo-pre-miRNAs using Triplet-SVM analysis. The 2 putative novel miRNAs were validated by real time PCR and were significantly decreased in cervical cancer patients compared with healthy controls. One novel miRNA had an area under curve (AUC) of 0.921 (95% CI: 0.883, 0.959) with a sensitivity of 85.7% and a specificity of 88.2% when discriminating between cervical cancer patients and healthy controls. Our results suggest that characterizing serum profiles of cervical cancers by Solexa sequencing may be a good method for identifying novel miRNAs and that the validated novel miRNAs described here may be cervical cancer-associated biomarkers.

  4. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Fei Chen

    2003-01-01

    DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT), the bionic wavelet transform (BWT), for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the structural feature of the DNA sequence, was introduced into WT. It can adjust the weight value of each channel to maximise the useful energy distribution of the whole BWT output. The performance of the proposed BWT was examined by analysing synthetic and real DNA sequences. Results show that BWT performs better than traditional WT in presenting greater energy distribution. This new BWT method should be useful for the detection of the latent structural features in future DNA sequence analysis.
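
    The separation of a symbolic DNA sequence into four indicator channels, mentioned in this record, can be sketched in a few lines of Python. The weighted recombination below uses fixed, hypothetical weights only to show where a symbol-to-number mapping enters; the paper's adaptive weighting and the wavelet transform itself are not reproduced, and both function names are assumptions.

```python
def indicator_sequences(seq):
    """Split a DNA string into four binary channels, one per base.

    Channel b has a 1 at every position where the sequence contains
    base b and a 0 everywhere else.
    """
    seq = seq.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}


def weighted_signal(seq, weights):
    """Combine the four channels into one numeric signal.

    weights maps each base to a number; these values are placeholders
    chosen by the caller, not the adaptive weights from the paper.
    """
    channels = indicator_sequences(seq)
    return [
        sum(weights[base] * channels[base][i] for base in "ACGT")
        for i in range(len(seq))
    ]
```

    With `weights = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}`, the call `weighted_signal("ACG", weights)` gives one number per position, which is the kind of numeric sequence a wavelet transform could then be applied to.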

  5. Characterization of CG6178 gene product with high sequence similarity to firefly luciferase in Drosophila melanogaster.

    Science.gov (United States)

    Oba, Yuichi; Ojika, Makoto; Inouye, Satoshi

    2004-03-31

    This is the first identification of a long-chain fatty acyl-CoA synthetase in Drosophila by enzymatic characterization. The gene product of CG6178 (CG6178) in Drosophila melanogaster genome, which has a high sequence similarity to firefly luciferase, has been expressed and characterized. CG6178 showed long-chain fatty acyl-CoA synthetic activity in the presence of ATP, CoA and Mg(2+), suggesting a fatty acyl adenylate is an intermediate. Recently, it was revealed that firefly luciferase has two catalytic functions, monooxygenase (luciferase) and AMP-mediated CoA ligase (fatty acyl-CoA synthetase). However, unlike firefly luciferase, CG6178 did not show luminescence activity in the presence of firefly luciferin, ATP, CoA and Mg(2+). The enzymatic properties of CG6178 including substrate specificity, pH dependency and optimal temperature were close to those of firefly luciferase and rat fatty acyl-CoA synthetase. Further, phylogenic analyses strongly suggest that the firefly luciferase gene may have evolved from a fatty acyl-CoA synthetase gene as a common ancestral gene.

  6. Genotyping-by-Sequencing Analysis for Determining Population Structure of Finger Millet Germplasm of Diverse Origins

    Directory of Open Access Journals (Sweden)

    Anil Kumar

    2016-07-01

    Finger millet [ (L.) Gaertn.] is grown mainly by subsistence farmers in arid and semiarid regions of the world. To broaden its genetic base and to boost its production, it is of paramount importance to characterize and genotype the diverse gene pool of this important food and nutritional security crop. However, because the genome sequence of finger millet is not available, little progress has been made in understanding the molecular basis of the unique qualities of the crop. In the present investigation, attempts have been made to characterize the genetically diverse collection of 113 finger millet accessions through whole-genome genotyping-by-sequencing (GBS), which resulted in a genome-wide set of 23,000 single-nucleotide polymorphisms (SNPs) segregating across the entire collection and several thousand SNPs segregating within every accession. A model-based population structure analysis reveals the presence of three subpopulations among the finger millet accessions, which parallel the results of phylogenetic analysis. The observed population structure is consistent with the hypothesis that finger millet was domesticated first in Africa, and from there it was introduced to India some 3000 yr ago. A total of 1128 gene ontology (GO) terms were assigned to SNP-carrying genes for three main categories: biological process, cellular component, and molecular function. Facilitated access to high-throughput genotyping and sequencing technologies is likely to improve the breeding process in developing countries, and as such, these data will be very useful to breeders who are working for the genetic improvement of finger millet.

  7. Genotyping-by-Sequencing Analysis for Determining Population Structure of Finger Millet Germplasm of Diverse Origins.

    Science.gov (United States)

    Kumar, Anil; Sharma, Divya; Tiwari, Apoorv; Jaiswal, J P; Singh, N K; Sood, Salej

    2016-07-01

    Finger millet [ (L.) Gaertn.] is grown mainly by subsistence farmers in arid and semiarid regions of the world. To broaden its genetic base and to boost its production, it is of paramount importance to characterize and genotype the diverse gene pool of this important food and nutritional security crop. However, because the genome sequence of finger millet is not available, little progress has been made in understanding the molecular basis of the unique qualities of the crop. In the present investigation, attempts have been made to characterize the genetically diverse collection of 113 finger millet accessions through whole-genome genotyping-by-sequencing (GBS), which resulted in a genome-wide set of 23,000 single-nucleotide polymorphisms (SNPs) segregating across the entire collection and several thousand SNPs segregating within every accession. A model-based population structure analysis reveals the presence of three subpopulations among the finger millet accessions, which parallel the results of phylogenetic analysis. The observed population structure is consistent with the hypothesis that finger millet was domesticated first in Africa, and from there it was introduced to India some 3000 yr ago. A total of 1128 gene ontology (GO) terms were assigned to SNP-carrying genes for three main categories: biological process, cellular component, and molecular function. Facilitated access to high-throughput genotyping and sequencing technologies is likely to improve the breeding process in developing countries, and as such, these data will be very useful to breeders who are working for the genetic improvement of finger millet. Copyright © 2016 Crop Science Society of America.

  8. Sequencing and analysis of the gene-rich space of cowpea

    Directory of Open Access Journals (Sweden)

    Cheung Foo

    2008-02-01

    A total of 5,888 GSRs had homology to genes encoding transcription factors (TFs) and transcription associated factors (TAFs), representing about 5% of the total annotated sequences in the dataset. Sixty-two (62) of the 64 well-characterized plant transcription factor (TF) gene families are represented in the cowpea GSRs, and these families are of similar size and phylogenetic organization to those characterized in other plants. The cowpea GSRs also provide a rich source of genes involved in photoperiodic control, symbiosis, and defense-related responses. Comparisons to available databases revealed that about 74% of cowpea ESTs and 70% of all legume ESTs were represented in the GSR dataset. As approximately 12% of all GSRs contain an identifiable simple-sequence repeat, the dataset is a powerful resource for the design of microsatellite markers. Conclusion: The availability of extensive public genomic data for cowpea, a non-model legume with significant importance in the developing world, represents a significant step forward in legume research. Not only does the gene space sequence enable the detailed analysis of gene structure, gene family organization and phylogenetic relationships within cowpea, but it also facilitates the characterization of syntenic relationships with other cultivated and model legumes, and will contribute to determining patterns of chromosomal evolution in the Leguminosae. The micro- and macrosyntenic relationships detected between cowpea and other cultivated and model legumes should simplify the identification of informative markers for marker-assisted trait selection and map-based gene isolation necessary for cowpea improvement.

  9. Utility of Whole-Genome Sequencing in Characterizing Acinetobacter Epidemiology and Analyzing Hospital Outbreaks

    Science.gov (United States)

    Fitzpatrick, Margaret A.; Hauser, Alan R.

    2015-01-01

    Acinetobacter baumannii frequently causes nosocomial infections and outbreaks. Whole-genome sequencing (WGS) is a promising technique for strain typing and outbreak investigations. We compared the performance of conventional methods with WGS for strain typing clinical Acinetobacter isolates and analyzing a carbapenem-resistant A. baumannii (CRAB) outbreak. We performed two band-based typing techniques (pulsed-field gel electrophoresis and repetitive extragenic palindromic-PCR), multilocus sequence type (MLST) analysis, and WGS on 148 Acinetobacter calcoaceticus-A. baumannii complex bloodstream isolates collected from a single hospital from 2005 to 2012. Phylogenetic trees inferred from core-genome single nucleotide polymorphisms (SNPs) confirmed three Acinetobacter species within this collection. Four major A. baumannii clonal lineages (as defined by MLST) circulated during the study, three of which are globally distributed and one of which is novel. WGS indicated that a threshold of 2,500 core SNPs accurately distinguished A. baumannii isolates from different clonal lineages. The band-based techniques performed poorly in assigning isolates to clonal lineages and exhibited little agreement with sequence-based techniques. After applying WGS to a CRAB outbreak that occurred during the study, we identified a threshold of 2.5 core SNPs that distinguished nonoutbreak from outbreak strains. WGS was more discriminatory than the band-based techniques and was used to construct a more accurate transmission map that resolved many of the plausible transmission routes suggested by epidemiologic links. Our study demonstrates that WGS is superior to conventional techniques for A. baumannii strain typing and outbreak analysis. These findings support the incorporation of WGS into health care infection prevention efforts. PMID:26699703
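
    The threshold-based comparison described in this abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: it counts differing positions between aligned core-genome strings and reports the isolate pairs at or below a chosen threshold. The function names, the toy sequences and the inclusive comparison are assumptions; the abstract gives the thresholds but not how they are applied.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count positions at which two equal-length aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def pairs_within(isolates, threshold):
    """Return isolate pairs whose SNP distance is at or below the threshold."""
    return [
        (name1, name2)
        for (name1, seq1), (name2, seq2) in combinations(sorted(isolates.items()), 2)
        if snp_distance(seq1, seq2) <= threshold
    ]


# Toy aligned core-genome columns (hypothetical data, not from the study).
toy = {"iso1": "ACGTACGT", "iso2": "ACGTACGA", "iso3": "TTGTACCA"}
linked = pairs_within(toy, threshold=1)
```

    With real data the same function could be called twice, once with a lineage-level threshold and once with an outbreak-level threshold.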

  10. Image registration based on virtual frame sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Chen, H.; Ng, W.S. [Nanyang Technological University, Computer Integrated Medical Intervention Laboratory, School of Mechanical and Aerospace Engineering, Singapore (Singapore)]; Shi, D. [Nanyang Technological University, School of Computer Engineering, Singapore (Singapore)]; Wee, S.B. [Tan Tock Seng Hospital, Department of General Surgery, Singapore (Singapore)]

    2007-08-15

    This paper proposes a new framework for medical image registration with large nonrigid deformations, which remains one of the biggest challenges for image fusion and further analysis in many medical applications. The registration problem is formulated as recovering a deformation process whose initial and final states are known. To deal with large nonlinear deformations, virtual frames are inserted to model the deformation process. A time parameter is introduced, and the deformation between consecutive frames is described with a linear affine transformation. Experiments are conducted with simple geometric deformations as well as complex deformations present in MRI and ultrasound images. All the deformations are characterized by nonlinearity. The positive results demonstrate the effectiveness of this algorithm. The framework proposed in this paper can register medical images with large nonlinear deformations and is especially useful for sequential images. (orig.)
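
    The idea of stepping through intermediate frames, each linked to the next by an affine map, can be sketched as follows. This is a hedged illustration only: the function names and the three example transforms are assumptions, and the sketch uses one global 2D affine map per frame for brevity, so the composed result is itself affine. How the paper obtains nonlinear behaviour from its per-frame maps is not detailed in the abstract.

```python
def apply_affine(transform, point):
    """Apply a 2D affine map ((a, b, c, d), (tx, ty)) to a point (x, y)."""
    (a, b, c, d), (tx, ty) = transform
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


def track_point(point, frame_transforms):
    """Follow a point through consecutive frames; return its position in each."""
    path = [point]
    for transform in frame_transforms:
        path.append(apply_affine(transform, path[-1]))
    return path


# Hypothetical frame-to-frame transforms (not taken from the paper).
steps = [
    ((1.0, 0.0, 0.0, 1.0), (1.0, 0.0)),   # translate by (1, 0)
    ((2.0, 0.0, 0.0, 2.0), (0.0, 0.0)),   # uniform scale by 2
    ((0.0, -1.0, 1.0, 0.0), (0.0, 0.0)),  # rotate by 90 degrees
]
path = track_point((1.0, 1.0), steps)
```

    The list returned by track_point holds the point's position in the initial frame, in each intermediate frame, and in the final frame.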

  11. A genome-wide analysis of lentivector integration sites using targeted sequence capture and next generation sequencing technology.

    Science.gov (United States)

    Ustek, Duran; Sirma, Sema; Gumus, Ergun; Arikan, Muzaffer; Cakiris, Aris; Abaci, Neslihan; Mathew, Jaicy; Emrence, Zeliha; Azakli, Hulya; Cosan, Fulya; Cakar, Atilla; Parlak, Mahmut; Kursun, Olcay

    2012-10-01

    One application of next-generation sequencing (NGS) is the targeted resequencing of genes of interest, which has not been used in viral integration site analysis for gene therapy applications. Here, we combined a targeted sequence capture array and next-generation sequencing to address the whole-genome profiling of viral integration sites. Human 293T and K562 cells were transduced with an HIV-1-derived vector. Custom-made DNA probe sets targeting the pLVTHM vector were used to capture lentiviral vector/human genome junctions. The captured DNA was sequenced using the GS FLX platform. Seven thousand four hundred and eighty-four human genome sequences flanking the long terminal repeats (LTR) of pLVTHM fragment sequences matched with an identity of at least 98% and a minimum length of 50 bp in both cell lines. In total, 203 unique integration sites were identified. The integrations in both cell lines were distant from the CpG islands and from the transcription start sites and were preferentially located in introns. A comparison between the two cell lines showed that the lentiviral-transduced DNA does not have the same preferred regions in the two different cell lines. Copyright © 2012 Elsevier B.V. All rights reserved.
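
    The matching criteria quoted in this abstract (identity of at least 98% and a minimum of 50 bp) amount to a simple filter on alignment records. A minimal sketch, assuming each alignment is summarised by its aligned length and number of matching bases; the function name and that summary format are hypothetical and not taken from the study:

```python
def passes_filter(aligned_length, matches, min_identity_pct=98, min_length=50):
    """Keep an alignment only if it is long enough and similar enough.

    Integer arithmetic is used so that borderline cases (exactly 98%)
    are not affected by floating-point rounding.
    """
    if aligned_length < min_length:
        return False
    return matches * 100 >= min_identity_pct * aligned_length


# Hypothetical alignment summaries: (aligned_length, matches).
candidates = [(50, 49), (49, 49), (100, 97), (120, 120)]
kept = [c for c in candidates if passes_filter(*c)]
```

    Whether the thresholds are inclusive is an assumption here; the abstract says "at least 98%" and "minimum 50 bp", which the sketch reads as inclusive bounds.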

  12. CSReport: A New Computational Tool Designed for Automatic Analysis of Class Switch Recombination Junctions Sequenced by High-Throughput Sequencing.

    Science.gov (United States)

    Boyer, François; Boutouil, Hend; Dalloul, Iman; Dalloul, Zeinab; Cook-Moreau, Jeanne; Aldigier, Jean-Claude; Carrion, Claire; Herve, Bastien; Scaon, Erwan; Cogné, Michel; Péron, Sophie

    2017-05-15

    B cells ensure humoral immune responses due to the production of Ag-specific memory B cells and Ab-secreting plasma cells. In secondary lymphoid organs, Ag-driven B cell activation induces terminal maturation and Ig isotype class switch (class switch recombination [CSR]). CSR creates a virtually unique IgH locus in every B cell clone by intrachromosomal recombination between two switch (S) regions upstream of each C region gene. Amount and structural features of CSR junctions reveal valuable information about the CSR mechanism, and analysis of CSR junctions is useful in basic and clinical research studies of B cell functions. To provide an automated tool able to analyze large data sets of CSR junction sequences produced by high-throughput sequencing (HTS), we designed CSReport, a software program dedicated to support analysis of CSR recombination junctions sequenced with a HTS-based protocol (Ion Torrent technology). CSReport was assessed using simulated data sets of CSR junctions and then used for analysis of Sμ-Sα and Sμ-Sγ1 junctions from CH12F3 cells and primary murine B cells, respectively. CSReport identifies junction segment breakpoints on reference sequences and junction structure (blunt-ended junctions or junctions with insertions or microhomology). Besides the ability to analyze unprecedentedly large libraries of junction sequences, CSReport will provide a unified framework for CSR junction studies. Our results show that CSReport is an accurate tool for analysis of sequences from our HTS-based protocol for CSR junctions, thereby facilitating and accelerating their study. Copyright © 2017 by The American Association of Immunologists, Inc.
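
    The three junction structures named in this abstract (blunt-ended, with insertion, with microhomology) can be told apart from where the two aligned segments end and begin on a read. The sketch below is an illustration under stated assumptions and is not CSReport's algorithm: the function name and the input convention (read coordinates of the two aligned segments) are hypothetical.

```python
def classify_junction(read, upstream_end, downstream_start):
    """Classify a junction from the read coordinates of its two aligned segments.

    upstream_end: index one past the last read base aligned to the upstream region.
    downstream_start: index of the first read base aligned to the downstream region.
    """
    if downstream_start > upstream_end:
        # Unaligned bases sit between the two segments.
        return ("insertion", read[upstream_end:downstream_start])
    if downstream_start < upstream_end:
        # The same read bases align to both reference regions.
        return ("microhomology", read[downstream_start:upstream_end])
    return ("blunt", "")


# Hypothetical read used only to exercise the three cases.
read = "AAAACCTTTT"
with_insertion = classify_junction(read, 4, 6)
with_microhomology = classify_junction(read, 6, 4)
blunt = classify_junction(read, 5, 5)
```

    Each call returns the structure label together with the inserted or shared bases, which is the information a downstream summary would tabulate.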

  13. Sequencing and analysis of an Irish human genome.

    LENUS (Irish Health Repository)

    Tong, Pin

    2010-01-01

    Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.

  14. Exome Sequence Analysis of 14 Families With High Myopia

    DEFF Research Database (Denmark)

    Kloss, Bethany A.; Tompson, Stuart W.; Whisenhunt, Kristina N.

    2017-01-01

    Purpose: To identify causal gene mutations in 14 families with autosomal dominant (AD) high myopia using exome sequencing. Methods: Select individuals from 14 large Caucasian families with high myopia were exome sequenced. Gene variants were filtered to identify potential pathogenic changes. Sang...

  15. Database-driven primary analysis of raw sequencing data

    DEFF Research Database (Denmark)

    2014-01-01

    The present invention relates to methods for identifying the source of a biological sequence containing sample from raw sequencing reads. The method may be used to identify the source of unknown DNA and can be used for diagnostic, biodefense, food safety and quality, and hygiene applications...

  16. Isolation and sequence characterization of DNA-A genome of a new begomovirus strain associated with severe leaf curling symptoms of Jatropha curcas L.

    KAUST Repository

    Chauhan, Sushma

    2018-04-22

    Begomoviruses, which belong to the family Geminiviridae, are associated with several disease symptoms, such as mosaic and leaf curling, in Jatropha curcas. The molecular characterization of these viral strains will help in developing management strategies to control the disease. In this study, J. curcas plants that were infected with begomovirus and showed acute leaf curling symptoms were identified. The DNA-A segment from the pathogenic viral strain was isolated and sequenced. The sequenced genome was assembled and characterized in detail. The full-length DNA-A sequence was covered by primer walking. The genome sequence showed the general organization of DNA-A from begomovirus by the distribution of ORFs in both viral and anti-viral strands. The genome size ranged from 2844 to 2852 bp. Three strains with minor nucleotide variations were identified, and a phylogenetic analysis was performed by comparing the DNA-A segments with those from other reported begomovirus isolates. The maximum sequence similarity was observed with Euphorbia yellow mosaic virus (FN435995). In the phylogenetic tree, no clustering was observed with previously reported begomovirus strains isolated from the J. curcas host. The strains isolated in this study belong to a new begomoviral strain that elicits symptoms of leaf curling in J. curcas. The results indicate that the probable origin of the strains is Jatropha mosaic virus infecting J. gossypiifolia. The strains isolated in this study are referred to as Jatropha curcas leaf curl India virus (JCLCIV) based on the major symptoms exhibited by the host J. curcas.

  17. Accelerating next generation sequencing data analysis with system level optimizations.

    Science.gov (United States)

    Kathiresan, Nagarajan; Temanni, Ramzi; Almabrazi, Hakeem; Syed, Najeeb; Jithesh, Puthen V; Al-Ali, Rashid

    2017-08-22

    Next generation sequencing (NGS) data analysis is highly compute intensive. In-memory computing, vectorization, bulk data transfer, and CPU frequency scaling are some of the hardware features in modern computing architectures. To get the best execution time and utilize these hardware features, it is necessary to tune the system level parameters before running the application. We studied the GATK-HaplotypeCaller, which is part of common NGS workflows and consumes more than 43% of the total execution time. Multiple GATK 3.x versions were benchmarked and the execution time of HaplotypeCaller was optimized by various system level parameters, which included: (i) tuning the parallel garbage collection and kernel shared memory to simulate in-memory computing; (ii) architecture-specific tuning in the PairHMM library for vectorization; (iii) including Java 1.8 features through GATK source code compilation and building a runtime environment for parallel sorting and bulk data transfer; and (iv) replacing the default 'on-demand' mode of CPU frequency with 'performance-mode' to accelerate the Java multi-threads. As a result, the HaplotypeCaller execution time was reduced by 82.66% in GATK 3.3 and 42.61% in GATK 3.7. Overall, the execution time of NGS pipeline was reduced to 70.60% and 34.14% for GATK 3.3 and GATK 3.7 respectively.

  18. The sequence and analysis of a Chinese pig genome

    Directory of Open Access Journals (Sweden)

    Fang Xiaodong

    2012-11-01

    Background: The pig is an economically important food source, amounting to approximately 40% of all meat consumed worldwide. Pigs also serve as an important model organism because of their similarity to humans at the anatomical, physiological and genetic level, making them very useful for studying a variety of human diseases. A pig strain of particular interest is the miniature pig, specifically the Wuzhishan pig (WZSP), as it has been extensively inbred. Its high level of homozygosity offers increased ease for selective breeding for specific traits and a more straightforward understanding of the genetic changes that underlie its biological characteristics. WZSP also serves as a promising means for applications in surgery, tissue engineering, and xenotransplantation. Here, we report the sequencing and analysis of an inbred WZSP genome. Results: Our results reveal some unique genomic features, including a relatively high level of homozygosity in the diploid genome, an unusual distribution of heterozygosity, an over-representation of tRNA-derived transposable elements, a small amount of porcine endogenous retrovirus, and a lack of type C retroviruses. In addition, we carried out systematic research on gene evolution, together with a detailed investigation of the counterparts of human drug target genes. Conclusion: Our results provide the opportunity to more clearly define the genomic character of pig, which could enhance our ability to create more useful pig models.

  19. Analysis of expressed sequence tags from the Ulva prolifera (Chlorophyta)

    Science.gov (United States)

    Niu, Jianfeng; Hu, Haiyan; Hu, Songnian; Wang, Guangce; Peng, Guang; Sun, Song

    2010-01-01

    In 2008, a green tide broke out before the sailing competition of the 29th Olympic Games in Qingdao. The causative species was determined to be Enteromorpha prolifera (Ulva prolifera O. F. Müller), a familiar green macroalga along the coastline of China. Rapid accumulation of a large biomass of floating U. prolifera prompted research on different aspects of this species. In this study, we constructed a nonnormalized cDNA library from the thalli of U. prolifera and acquired 10 072 high-quality expressed sequence tags (ESTs). These ESTs were assembled into 3 519 nonredundant gene groups, including 1 446 clusters and 2 073 singletons. After annotation with the nr database, a large number of genes were found to be related to chloroplast and ribosomal proteins. GO functional classification showed that 1 418 ESTs participated in photosynthesis and 1 359 ESTs were responsible for the generation of precursor metabolites and energy. In addition, rather comprehensive carbon fixation pathways were found in U. prolifera using KEGG. Some stress-related and signal transduction-related genes were also found in this study. All the evidence indicated that U. prolifera has the substance and energy foundation for intense photosynthesis and rapid proliferation. Phylogenetic analysis of cytochrome c oxidase subunit I revealed that this green-tide causative species is most closely affiliated to Pseudendoclonium akinetum (Ulvophyceae).

  20. Sequence analysis of sub-genotype D hepatitis B surface antigens isolated from Jeddah, Saudi Arabia

    Directory of Open Access Journals (Sweden)

    Sahar EL Hadad

    2018-05-01

    Little is known about the prevalence of HBV genotypes/sub-genotypes in Jeddah province, although the hepatitis B virus (HBV) was identified as the most predominant type of hepatitis in Saudi Arabia. To characterize HBV genotypes/sub-genotypes, serum samples from 15 patients with chronic HBV were collected and subjected to HBsAg gene amplification and sequence analysis. Phylogenetic analysis of the HBsAg gene sequences revealed that 11 (48%) isolates belonged to HBV/D while 4 (18%) were associated with HBV/C. Notably, a HBV/D sub-genotype phylogenetic tree identified that eight current isolates (72%) belonged to HBV/D1, whereas three isolates (28%) appeared to be more closely related to HBV/D5, although they formed a novel cluster supported by a branch with 99% bootstrap value. Isolates belonging to D1 were grouped in one branch and seemed to be more closely related to various strains isolated from different countries. For further determination of whether the three current isolates belonged to HBV/D5 or represented a novel sub-genotype, HBV/DA, whole HBV genome sequences would be required. In the present study, we verified that HBV/D1 is the most prevalent HBV sub-genotype in Jeddah, and identified novel variant mutations suggesting that an additional sub-genotype designated HBV/DA should be proposed. Overall, the results of the present HBsAg sequence analyses provide us with insights regarding the nucleotide differences between the present HBsAg/D isolates identified in the populace of Jeddah, Saudi Arabia and those previously isolated worldwide. Additional studies with large numbers of subjects in other areas might lead to the discovery of the specific HBV strain genotypes or even additional new sub-genotypes that are circulating in Saudi Arabia. Keywords: Hepatitis B virus, HBV sub-genotypes, HBV/D, HBsAg, Viral isolates, Population studies

  1. Identification of similar regions of protein structures using integrated sequence and structure analysis tools

    Directory of Open Access Journals (Sweden)

    Heiland Randy

    2006-03-01

    Background: Understanding protein function from its structure is a challenging problem. Sequence-based approaches for finding homology have broad use for annotation of both structure and function. 3D structural information on protein domains and their interactions provides a view of structure-function relationships that is complementary to sequence information. We have developed a web site (http://www.sblest.org/) and an API of web services that enable users to submit protein structures and identify statistically significant neighbors and the underlying structural environments that make that match, using a suite of sequence and structure analysis tools. To do this, we have integrated S-BLEST, PSI-BLAST and HMMer based superfamily predictions to give a unique integrated view of the prediction of SCOP superfamilies, EC number, and GO term, as well as identification of the protein structural environments that are associated with that prediction. Additionally, we have extended UCSF Chimera and PyMOL to support our web services, so that users can characterize their own proteins of interest. Results: Users are able to submit their own queries or use a structure already in the PDB. Currently the databases that a user can query include the popular structural datasets ASTRAL 40 v1.69, ASTRAL 95 v1.69, CLUSTER50, CLUSTER70 and CLUSTER90 and PDBSELECT25. The results can be downloaded directly from the site and include function prediction, analysis of the most conserved environments and automated annotation of query proteins. These results reflect the hits found with PSI-BLAST, HMMer and S-BLEST. We have evaluated how well annotation transfer can be performed on SCOP IDs, Gene Ontology (GO) IDs and EC numbers. The method is very efficient and totally automated, generally taking around fifteen minutes for a 400 residue protein. Conclusion: With structural genomics initiatives determining structures with little, if any, functional characterization

  2. Draft Genome Sequencing and Comparative Analysis of Aspergillus sojae NBRC4239

    Science.gov (United States)

    Sato, Atsushi; Oshima, Kenshiro; Noguchi, Hideki; Ogawa, Masahiro; Takahashi, Tadashi; Oguma, Tetsuya; Koyama, Yasuji; Itoh, Takehiko; Hattori, Masahira; Hanya, Yoshiki

    2011-01-01

    We conducted genome sequencing of the filamentous fungus Aspergillus sojae NBRC4239 isolated from the koji used to prepare Japanese soy sauce. We used the 454 pyrosequencing technology and investigated the genome with respect to enzymes and secondary metabolites in comparison with other Aspergilli sequenced. Assembly of 454 reads generated a non-redundant sequence of 39.5 Mb possessing 13 033 putative genes and 65 scaffolds composed of 557 contigs. Of the 2847 open reading frames with Pfam domain scores of >150 found in A. sojae NBRC4239, 81.7% had a high degree of similarity with the genes of A. oryzae. Comparative analysis identified serine carboxypeptidase and aspartic protease genes unique to A. sojae NBRC4239. While A. oryzae possessed three copies of the α-amylase gene, A. sojae NBRC4239 possessed only a single copy. Comparison of 56 gene clusters for secondary metabolites between A. sojae NBRC4239 and A. oryzae revealed that 24 clusters were conserved, whereas 32 clusters differed between them, including a deletion of 18 508 bp containing mfs1, mao1, dmaT, and pks-nrps for cyclopiazonic acid (CPA) biosynthesis, which explains the lack of CPA production in A. sojae. The A. sojae NBRC4239 genome data will be useful to characterize functional features of the koji moulds used in Japanese industries. PMID:21659486

  3. Secure and robust cloud computing for high-throughput forensic microsatellite sequence analysis and databasing.

    Science.gov (United States)

    Bailey, Sarah F; Scheible, Melissa K; Williams, Christopher; Silva, Deborah S B S; Hoggan, Marina; Eichman, Christopher; Faith, Seth A

    2017-11-01

    Next-generation Sequencing (NGS) is a rapidly evolving technology with demonstrated benefits for forensic genetic applications, and the strategies to analyze and manage the massive NGS datasets are currently in development. Here, the computing, data storage, connectivity, and security resources of the Cloud were evaluated as a model for forensic laboratory systems that produce NGS data. A complete front-to-end Cloud system was developed to upload, process, and interpret raw NGS data using a web browser dashboard. The system was extensible, demonstrating analysis capabilities of autosomal and Y-STRs from a variety of NGS instrumentation (Illumina MiniSeq and MiSeq, and Oxford Nanopore MinION). NGS data for STRs were concordant with standard reference materials previously characterized with capillary electrophoresis and Sanger sequencing. The computing power of the Cloud was implemented with on-demand auto-scaling to allow multiple file analysis in tandem. The system was designed to store resulting data in a relational database, amenable to downstream sample interpretations and databasing applications following the most recent guidelines in nomenclature for sequenced alleles. Lastly, a multi-layered Cloud security architecture was tested and showed that industry standards for securing data and computing resources were readily applied to the NGS system without disadvantageous effects for bioinformatic analysis, connectivity or data storage/retrieval. The results of this study demonstrate the feasibility of using Cloud-based systems for secured NGS data analysis, storage, databasing, and multi-user distributed connectivity. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Event Sequence Analysis of the Air Intelligence Agency Information Operations Center Flight Operations

    National Research Council Canada - National Science Library

    Larsen, Glen

    1998-01-01

    This report applies Event Sequence Analysis, methodology adapted from aircraft mishap investigation, to an investigation of the performance of the Air Intelligence Agency's Information Operations Center (IOC...

  5. Identification and characterization of plastid-type proteins from sequence-attributed features using machine learning

    Science.gov (United States)

    2013-01-01

    Background: Plastids are an important component of plant cells, being the site of manufacture and storage of chemical compounds used by the cell, and contain pigments such as those used in photosynthesis, starch synthesis/storage, cell color etc. They are essential organelles of the plant cell, also present in algae. Recent advances in genomic technology and sequencing efforts are generating a huge amount of DNA sequence data every day. The predicted proteome of these genomes needs annotation at a faster pace. In view of this, one such annotation need is to develop an automated system that can distinguish between plastid and non-plastid proteins accurately, and further classify plastid types based on their functionality. We compared the amino acid compositions of plastid proteins with those of non-plastid ones and found significant differences, which were used as a basis to develop various feature-based prediction models using similarity search and machine learning. Results: In this study, we developed separate Support Vector Machine (SVM) trained classifiers for characterizing the plastids in two steps: first distinguishing the plastid vs. non-plastid proteins, and then classifying the identified plastids into their various types based on their function (chloroplast, chromoplast, etioplast, and amyloplast). Five diverse protein features (amino acid composition, dipeptide composition, pseudo amino acid composition, Nterminal-Center-Cterminal composition and protein physicochemical properties) are used to develop SVM models. Overall, the dipeptide composition-based module shows the best performance, with an accuracy of 86.80% and a Matthews Correlation Coefficient (MCC) of 0.74 in phase-I and 78.60% with an MCC of 0.44 in phase-II. On independent test data, this model also performs better, with an overall accuracy of 76.58% and 74.97% in phase-I and phase-II, respectively. The similarity-based PSI-BLAST module shows very low performance with about 50% prediction
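
    Two of the sequence-derived features named in this abstract, and the MCC used to report performance, have standard definitions that can be sketched briefly. The function names below are assumptions made for illustration, the example sequence is a toy string, and no SVM training is shown; the study's actual feature extraction code is not described in the abstract.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Fraction of each of the 20 standard residues (20 values)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each ordered residue pair among adjacent pairs (400 values)."""
    if len(seq) < 2:
        raise ValueError("sequence must have at least two residues")
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = len(seq) - 1
    return [pairs[a + b] / total for a in AMINO_ACIDS for b in AMINO_ACIDS]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy sequence, used only to show the shape of the feature vectors.
aac = amino_acid_composition("ACCA")
dpc = dipeptide_composition("ACCA")
```

    Vectors of this kind are what a classifier such as an SVM would take as input, one vector per protein.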

  6. Analysis of the Macaca mulatta transcriptome and the sequence divergence between Macaca and human.

    Science.gov (United States)

    Magness, Charles L; Fellin, P Campion; Thomas, Matthew J; Korth, Marcus J; Agy, Michael B; Proll, Sean C; Fitzgibbon, Matthew; Scherer, Christina A; Miner, Douglas G; Katze, Michael G; Iadonato, Shawn P

    2005-01-01

    We report the initial sequencing and comparative analysis of the Macaca mulatta transcriptome. Cloned sequences from 11 tissues, nine animals, and three species (M. mulatta, M. fascicularis, and M. nemestrina) were sampled, resulting in the generation of 48,642 sequence reads. These data represent an initial sampling of the putative rhesus orthologs for 6,216 human genes. Mean nucleotide diversity within M. mulatta and sequence divergence among M. fascicularis, M. nemestrina, and M. mulatta are also reported.

  7. RetroTector online, a rational tool for analysis of retroviral elements in small and medium size vertebrate genomic sequences.

    Science.gov (United States)

    Sperber, Göran; Lövgren, Anders; Eriksson, Nils-Einar; Benachenhou, Farid; Blomberg, Jonas

    2009-06-16

    The rapid accumulation of genomic information in databases necessitates rapid and specific algorithms for extracting biologically meaningful information. More or less complete retroviral sequences, also called proviral or endogenous retroviral sequences (ERVs), constitute at least 5% of vertebrate genomes. After infecting the host, these retroviruses have integrated in germ line cells, and have then been carried in genomes for at least several hundred million years. A better understanding of structure and function of these sequences can have profound biological and medical consequences. RetroTector (ReTe) is a platform-independent Java program for identification and characterization of proviral sequences in vertebrate genomes. The full ReTe requires a local installation with a MySQL database. Although not overly complicated, the installation may take some time. A "light" version of ReTe (RetroTector online; ROL), which does not require specific installation procedures, is provided via the World Wide Web. ROL (http://www.fysiologi.neuro.uu.se/jbgs/) was implemented under the Batchelor web interface (A Lövgren et al). It allows both GenBank accession number, file and FASTA cut-and-paste admission of sequences (5 to 10,000 kilobases). Up to ten submissions can be done simultaneously, allowing batch analysis. An analysis of any retroviral sequences found in the submitted sequence is graphically presented, exportable in standard formats. With the current server, a complete analysis of a 1 Megabase sequence is complete in 10 minutes. It is possible to mask nonretroviral repetitive sequences in the submitted sequence, using host genome specific "brooms", which increase specificity. Proviral sequences can be hard to recognize

  8. Sequence analysis of mitochondrial 16S ribosomal RNA gene ...

    Indian Academy of Sciences (India)

    Unknown

    For the understanding of their vectorial capacity, identification of disease carrying and refractory strains is essential. ... been widely used for phylogenetic studies and sequence differences in ... In order to fill up the internal gap, a new set.

  9. simple sequence repeat (SSR) markers in genetic analysis of

    African Journals Online (AJOL)

    Yomi

    2012-08-28

    1998). Cross- species amplification of soybean (Glycine max) simple sequence repeats (SSRs) within the genus and other legume genera: implications for the transferability of SSRs in plants. Mol. Biol. Evol. 15:1275-1287.

  10. Sequence and expression analysis of gaps in human chromosome 20

    DEFF Research Database (Denmark)

    Minocherhomji, Sheroy; Seemann, Stefan; Mang, Yuan

    2012-01-01

    The finished human genome-assemblies comprise several hundred un-sequenced euchromatic gaps, which may be rich in long polypurine/polypyrimidine stretches. Human chromosome 20 (chr 20) currently has three unfinished gaps remaining on its q-arm. All three gaps are within gene-dense regions and/or overlap disease-associated loci, including the DLGAP4 locus. In this study, we sequenced ~99% of all three unfinished gaps on human chr 20, determined their complete genomic sizes and assessed epigenetic profiles using a combination of Sanger sequencing, mate pair paired-end high-throughput sequencing and chromatin, methylation and expression analyses. We found histone 3 trimethylated at Lysine 27 to be distributed across all three gaps in immortalized B-lymphocytes. In one gap, five novel CpG islands were predominantly hypermethylated in genomic DNA from peripheral blood lymphocytes and human cerebellum...

  11. DELIMINATE--a fast and efficient method for loss-less compression of genomic sequences: sequence analysis.

    Science.gov (United States)

    Mohammed, Monzoorul Haque; Dutta, Anirban; Bose, Tungadri; Chadaram, Sudha; Mande, Sharmila S

    2012-10-01

    An unprecedented quantity of genome sequence data is currently being generated using next-generation sequencing platforms. This has necessitated the development of novel bioinformatics approaches and algorithms that not only facilitate a meaningful analysis of these data but also aid in efficient compression, storage, retrieval and transmission of huge volumes of the generated data. We present a novel compression algorithm (DELIMINATE) that can rapidly compress genomic sequence data in a loss-less fashion. Validation results indicate relatively higher compression efficiency of DELIMINATE when compared with popular general purpose compression algorithms, namely, gzip, bzip2 and lzma. Linux, Windows and Mac implementations (both 32 and 64-bit) of DELIMINATE are freely available for download at: http://metagenomics.atc.tcs.com/compression/DELIMINATE. Contact: sharmila@atc.tcs.com. Supplementary data are available at Bioinformatics online.
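
    To make the comparison against gzip, bzip2 and lzma concrete, the following Python sketch measures how the three general-purpose compressors from the standard library perform on a DNA string, next to a naive 2-bit packing baseline. It does not implement DELIMINATE; the helper names, the random demo sequence and the 2-bit baseline are assumptions for illustration only.

```python
import bz2
import gzip
import lzma
import random


def pack_2bit(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte (naive baseline).

    The sequence length must be stored separately to decode the last byte,
    and ambiguity codes such as N are not supported.
    """
    code = {"A": 0, "C": 1, "G": 2, "T": 3}
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | code[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)


def compressed_sizes(seq):
    """Return the size in bytes of the sequence under each encoding."""
    raw = seq.encode("ascii")
    return {
        "raw": len(raw),
        "gzip": len(gzip.compress(raw)),
        "bzip2": len(bz2.compress(raw)),
        "lzma": len(lzma.compress(raw)),
        "2bit": len(pack_2bit(seq)),
    }


random.seed(0)
demo = "".join(random.choice("ACGT") for _ in range(10000))
sizes = compressed_sizes(demo)
for name, size in sizes.items():
    print(name, size)
```

    On a uniformly random sequence the 2-bit baseline is hard to beat; real genomes contain repeats and skewed composition, which is where specialised genomic compressors aim to gain over general-purpose tools.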

  12. Analysis of 16S rRNA amplicon sequencing options on the Roche/454 next-generation titanium sequencing platform.

    Directory of Open Access Journals (Sweden)

    Hideyuki Tamaki

    Full Text Available BACKGROUND: The 16S rRNA gene pyrosequencing approach has revolutionized studies in microbial ecology. While primer selection and short read length can affect the resulting microbial community profile, little is known about the influence of pyrosequencing methods on the sequencing throughput and the outcome of microbial community analyses. The aim of this study is to compare differences in output, ease, and cost among three different amplicon pyrosequencing methods for the Roche/454 Titanium platform. METHODOLOGY/PRINCIPAL FINDINGS: The following three pyrosequencing methods for 16S rRNA genes were selected in this study: Method-1 (standard method) is the recommended method for bi-directional sequencing using the LIB-A kit; Method-2 is a new option designed in this study for unidirectional sequencing with the LIB-A kit; and Method-3 uses the LIB-L kit for unidirectional sequencing. In our comparison among these three methods using 10 different environmental samples, Method-2 and Method-3 produced 1.5-1.6 times more useable reads than the standard method (Method-1), after quality-based trimming, and did not compromise the outcome of microbial community analyses. Specifically, Method-3 is the most cost-effective unidirectional amplicon sequencing method as it provided the most reads and required the least effort in consumables management. CONCLUSIONS: Our findings clearly demonstrated that alternative pyrosequencing methods for 16S rRNA genes could drastically affect sequencing output (e.g. number of reads before and after trimming) but have little effect on the outcomes of microbial community analysis. This finding is important for both researchers and sequencing facilities utilizing 16S rRNA gene pyrosequencing for microbial ecological studies.

  13. Transcriptomic characterization of soybean (Glycine max) roots in response to rhizobium infection by RNA sequencing

    International Nuclear Information System (INIS)

    He, Q.; Li, Z.; Wang, S.; Huang, S.; Yang, H.

    2018-01-01

    The interaction of legumes with rhizobia to convert N2 into ammonia for plant use has attracted worldwide interest. However, the plant basal nitrogen fixation mechanisms induced in response to Rhizobium, giving differential gene expression of plants, are not yet fully understood. The differentially expressed genes of soybean between inoculated and mock-inoculated roots were analyzed by RNA-Seq. The sequencing results were aligned against the Williams 82 genome sequence, which contains 55,787 transcripts; 280 and 316 transcripts were found to be up- and down-regulated, respectively, for inoculated and mock-inoculated soybean roots at stage V1. Gene ontology (GO) analyses detected 104, 182 and 178 genes associated with the cell component category, molecular function category and biological process category, respectively. Pathway analysis revealed that 98 differentially expressed genes (115 transcripts) were involved in 169 biological pathways. We selected 19 differentially expressed genes and analyzed their expression in mock-inoculated roots and in roots inoculated with USDA110 and CCBAU45436 using qRT-PCR. The results were in accordance with those obtained from the RNA-Seq data of rhizobia-infected roots, indicating that the RNA-Seq results were reliable and broadly applicable. Additionally, this study showed some novel genes associated with the nitrogen fixation process in comparison to previously identified QTLs. (author)

  14. Compilation and analysis of Escherichia coli promoter DNA sequences.

    OpenAIRE

    Hawley, D K; McClure, W R

    1983-01-01

    The DNA sequences of 168 promoter regions (-50 to +10) for Escherichia coli RNA polymerase were compiled. The complete listing was divided into two groups depending upon whether or not the promoter had been defined by genetic (promoter mutations) or biochemical (5' end determination) criteria. A consensus promoter sequence based on homologies among 112 well-defined promoters was determined that was in substantial agreement with previous compilations. In addition, we have tabulated 98 promoter ...
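
    How a consensus sequence can be derived from a set of aligned promoter regions is sketched below in Python. This is a simple per-column majority rule, not necessarily the procedure used in the compilation; the function name, the `min_fraction` cutoff and the use of "N" for columns without a clear majority are assumptions.

```python
from collections import Counter


def consensus(aligned, min_fraction=0.5):
    """Return a majority-rule consensus for equal-length aligned sequences.

    A column whose most common base occurs in fewer than `min_fraction`
    of the sequences is reported as "N".
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all sequences must have the same length")
    result = []
    for column in zip(*aligned):
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(aligned) >= min_fraction else "N")
    return "".join(result)


# Toy hexamers used only to exercise the function.
print(consensus(["TATAAT", "TATGAT", "TACAAT"]))
```

    A position weight matrix would retain more information than a single consensus string, at the cost of a slightly larger data structure.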

  15. Sequence quality analysis tool for HIV type 1 protease and reverse transcriptase.

    Science.gov (United States)

    Delong, Allison K; Wu, Mingham; Bennett, Diane; Parkin, Neil; Wu, Zhijin; Hogan, Joseph W; Kantor, Rami

    2012-08-01

    Access to antiretroviral therapy is increasing globally and drug resistance evolution is anticipated. Currently, protease (PR) and reverse transcriptase (RT) sequence generation is increasing, including the use of in-house sequencing assays, and quality assessment prior to sequence analysis is essential. We created a computational HIV PR/RT Sequence Quality Analysis Tool (SQUAT) that runs in the R statistical environment. Sequence quality thresholds are calculated from a large dataset (46,802 PR and 44,432 RT sequences) from the published literature (http://hivdb.Stanford.edu). Nucleic acid sequences are read into SQUAT, identified, aligned, and translated. Nucleic acid sequences are flagged if they have >five 1-2-base insertions; >one 3-base insertion; >one deletion; >six PR or >18 RT ambiguous bases; >three consecutive PR or >four RT nucleic acid mutations; >zero stop codons; >three PR or >six RT ambiguous amino acids; >three consecutive PR or >four RT amino acid mutations; >zero unique amino acids; or 15% genetic distance from another submitted sequence. Thresholds are user modifiable. SQUAT output includes a summary report with detailed comments for troubleshooting of flagged sequences, histograms of pairwise genetic distances, neighbor joining phylogenetic trees, and aligned nucleic and amino acid sequences. SQUAT is a stand-alone, free, web-independent tool to ensure use of high-quality HIV PR/RT sequences in interpretation and reporting of drug resistance, while increasing awareness and expertise and facilitating troubleshooting of potentially problematic sequences.
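
    The threshold-based flagging described in this record can be pictured with a small Python sketch. SQUAT itself runs in R and its internals are not shown in the abstract, so the metric names, the data layout and the function below are hypothetical; only the numeric limits are taken from the text, and only a subset of the listed criteria is covered.

```python
# Limits copied from the abstract; a sequence is flagged when a count
# is strictly greater than the limit for its region (PR or RT).
THRESHOLDS = {
    "short_insertions": {"PR": 5, "RT": 5},       # 1-2-base insertions
    "three_base_insertions": {"PR": 1, "RT": 1},
    "deletions": {"PR": 1, "RT": 1},
    "ambiguous_bases": {"PR": 6, "RT": 18},
    "stop_codons": {"PR": 0, "RT": 0},
    "ambiguous_amino_acids": {"PR": 3, "RT": 6},
}


def flag_sequence(region, counts, thresholds=THRESHOLDS):
    """Return a list of reasons why a sequence would be flagged.

    `region` is "PR" or "RT"; `counts` maps metric names to observed
    counts, with missing metrics treated as zero.
    """
    reasons = []
    for metric, limits in thresholds.items():
        value = counts.get(metric, 0)
        if value > limits[region]:
            reasons.append(f"{metric}={value} exceeds {limits[region]}")
    return reasons


print(flag_sequence("PR", {"ambiguous_bases": 7, "stop_codons": 1}))
```

    Passing a modified copy of `THRESHOLDS` mirrors the statement that thresholds are user modifiable.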

  16. De novo assembly and characterization of the spleen transcriptome of common carp (Cyprinus carpio) using Illumina paired-end sequencing.

    Science.gov (United States)

    Li, Guoxi; Zhao, Yinli; Liu, Zhonghu; Gao, Chunsheng; Yan, Fengbin; Liu, Bianzhi; Feng, Jianxin

    2015-06-01

    Common carp (Cyprinus carpio) is one of the most important aquacultured species of the family Cyprinidae, and breeding this species for disease resistance is becoming more and more important. However, at the genome or transcriptome levels, study of the immunogenetics of disease resistance in the common carp is lacking. In this study, 60,316,906 and 75,200,328 paired-end clean reads were obtained from two cDNA libraries of the common carp spleen by Illumina paired-end sequencing technology. Totally, 130,293 unique transcript fragments (unigenes) were assembled, with an average length of 1400.57 bp. Approximately 105,612 (81.06%) unigenes could be annotated according to their homology with matches in the Nr, Nt, Swiss-Prot, COG, GO, or KEGG databases, and they were found to represent 46,747 non-redundant genes. Comparative analysis showed that 59.82% of the unigenes have significant similarity to zebrafish Refseq proteins. Gene expression comparison revealed that 10,432 and 6889 annotated unigenes were, respectively, up- and down-regulated with at least twofold changes between two developmental stages of the common carp spleen. Gene ontology and KEGG analysis were performed to classify all unigenes into functional categories for understanding gene functions and regulation pathways. In addition, 46,847 simple sequence repeats (SSRs) were detected from 35,618 unigenes, and a large number of single nucleotide polymorphism (SNP) and insertion/deletion (INDEL) sites were identified in the spleen transcriptome of common carp. This study has characterized the spleen transcriptome of the common carp for the first time, providing a valuable resource for a better understanding of the common carp immune system and defense mechanisms. This knowledge will also facilitate future functional studies on common carp immunogenetics that may eventually be applied in breeding programs. Copyright © 2015 Elsevier Ltd. All rights reserved.

  17. Identification and Mapping of Simple Sequence Repeat Markers from Common Bean (Phaseolus vulgaris L.) Bacterial Artificial Chromosome End Sequences for Genome Characterization and Genetic–Physical Map Integration

    Directory of Open Access Journals (Sweden)

    Juana M. Córdoba

    2010-11-01

    Full Text Available Microsatellite markers or simple sequence repeat (SSR) loci are useful for diversity characterization and genetic–physical mapping. Different in silico microsatellite search methods have been developed for mining bacterial artificial chromosome (BAC) end sequences for SSRs. The overall goal of this study was genome characterization based on SSRs in 89,017 BAC end sequences (BESs) from the G19833 common bean (Phaseolus vulgaris L.) library. Another objective was to identify new SSRs taking into account three tandem motif identification programs (Automated Microsatellite Marker Development [AMMD], Tandem Repeats Finder [TRF], and SSRLocator [SSRL]). Among the microsatellite search engines, SSRL identified the highest number of SSRs; however, when primer design was attempted, the number dropped due to poor primer design regions. Automated Microsatellite Marker Development software identified many SSRs with valuable AT/TA or AG/TC motifs, while TRF found fewer SSRs and produced no primers. A subgroup of 323 AT-rich, di-, and trinucleotide SSRs were selected from the AMMD results and used in a parental survey with DOR364 and G19833, of which 75 could be mapped in the corresponding population; these represented 4052 BAC clones. Together with 92 previously mapped BES- and 114 non-BES-derived markers, a total of 280 SSRs were included in the polymerase chain reaction (PCR)-based map, integrating a total of 8232 BAC clones in 162 contigs from the physical map.
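
    An in silico search for di- and trinucleotide SSRs of the kind described above can be sketched with a regular expression in Python. This is not the algorithm of AMMD, TRF or SSRLocator; the function name and the minimum repeat counts (six for dinucleotides, five for trinucleotides) are assumptions chosen for illustration.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    `min_repeats` maps motif length to the minimum number of tandem copies.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}  # assumed cutoffs, not from the source
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


print(find_ssrs("GG" + "AT" * 7 + "CC"))
```

    Primer design around each hit, which the record identifies as the step where many candidate SSRs were lost, would need flanking sequence of adequate quality and is outside this sketch.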

  18. Transuranic waste characterization sampling and analysis plan

    International Nuclear Information System (INIS)

    1994-01-01

    Los Alamos National Laboratory (the Laboratory) is located approximately 25 miles northwest of Santa Fe, New Mexico, situated on the Pajarito Plateau. Technical Area 54 (TA-54), one of the Laboratory's many technical areas, is a radioactive and hazardous waste management and disposal area located within the Laboratory's boundaries. The purpose of this transuranic waste characterization, sampling, and analysis plan (CSAP) is to provide a methodology for identifying, characterizing, and sampling approximately 25,000 containers of transuranic waste stored at Pads 1, 2, and 4, Dome 48, and the Fiberglass Reinforced Plywood Box Dome at TA-54, Area G, of the Laboratory. Transuranic waste currently stored at Area G was generated primarily from research and development activities, processing and recovery operations, and decontamination and decommissioning projects. This document was created to facilitate compliance with several regulatory requirements and program drivers that are relevant to waste management at the Laboratory, including concerns of the New Mexico Environment Department

  19. Characterization and Exergy Analysis of Triphenyl Borate

    International Nuclear Information System (INIS)

    Acarali, N. B.

    2015-01-01

    In this study, unlike in the literature, boron oxide, borax decahydrate, boric acid and borax pentahydrate were used as boron sources to synthesize Triphenyl Borate (TPB). The reactions of TPB were carried out successfully by using both phenol and various boron sources in an inert water-immiscible organic solvent. On the basis of the analyses (FT-IR, SEM, TGA/DSC) obtained, it was seen that phenol acted as a support to the borate structure framework, and thermal characterisation of the amorphous solid under determined conditions suggested that the use of different boron sources affected the glass transition temperature in TPB production. Exergy analysis was applied to the TPB production to determine efficiency. The exergy analysis showed that the highest exergy efficiency was obtained by using boron oxide as a boron source. Consequently, all analysis results showed that TPB was produced successfully. Accordingly, characterization and exergy analysis supported each other. (author)

  20. First fungal genome sequence from Africa: A preliminary analysis

    Directory of Open Access Journals (Sweden)

    Rene Sutherland

    2012-01-01

    Full Text Available Some of the most significant breakthroughs in the biological sciences this century will emerge from the development of next generation sequencing technologies. The ease of availability of DNA sequence made possible through these new technologies has given researchers opportunities to study organisms in a manner that was not possible with Sanger sequencing. Scientists will, therefore, need to embrace genomics, as well as develop and nurture the human capacity to sequence genomes and utilise the ’tsunami‘ of data that emerge from genome sequencing. In response to these challenges, we sequenced the genome of Fusarium circinatum, a fungal pathogen of pine that causes pitch canker, a disease of great concern to the South African forestry industry. The sequencing work was conducted in South Africa, making F. circinatum the first eukaryotic organism for which the complete genome has been sequenced locally. Here we report on the process that was followed to sequence, assemble and perform a preliminary characterisation of the genome. Furthermore, details of the computer annotation and manual curation of this genome are presented. The F. circinatum genome was found to be nearly 44 million bases in size, which is similar to that of four other Fusarium genomes that have been sequenced elsewhere. The genome contains just over 15 000 open reading frames, which is less than that of the related species, Fusarium oxysporum, but more than that for Fusarium verticillioides. Amongst the various putative gene clusters identified in F. circinatum, those encoding the secondary metabolites fumosin and fusarin appeared to harbour evidence of gene translocation. It is anticipated that similar comparisons of other loci will provide insights into the genetic basis for pathogenicity of the pitch canker pathogen. Perhaps more importantly, this project has engaged a relatively large group of scientists

  1. Using Whole Genome Analysis to Examine Recombination across Diverse Sequence Types of Staphylococcus aureus.

    Directory of Open Access Journals (Sweden)

    Elizabeth M Driebe

    Full Text Available Staphylococcus aureus is an important clinical pathogen worldwide and understanding this organism's phylogeny and, in particular, the role of recombination, is important both to understand the overall spread of virulent lineages and to characterize outbreaks. To further elucidate the phylogeny of S. aureus, 35 diverse strains were sequenced using whole genome sequencing. In addition, 29 publicly available whole genome sequences were included to create a single nucleotide polymorphism (SNP)-based phylogenetic tree encompassing 11 distinct lineages. All strains of a particular sequence type fell into the same clade with clear groupings of the major clonal complexes of CC8, CC5, CC30, CC45 and CC1. Using a novel analysis method, we plotted the homoplasy density and SNP density across the whole genome and found evidence of recombination throughout the entire chromosome, but when we examined individual clonal lineages we found very little recombination. However, when we analyzed three branches of multiple lineages, we saw intermediate and differing levels of recombination between them. These data demonstrate that in S. aureus, recombination occurs across major lineages that subsequently expand in a clonal manner. Estimated mutation rates for the CC8 and CC5 lineages were different from each other. While the CC8 lineage rate was similar to previous studies, the CC5 lineage was 100-fold greater. Fifty known virulence genes were screened in all genomes in silico to determine their distribution across major clades. Thirty-three genes were present variably across clades, most of which were not constrained by ancestry, indicating horizontal gene transfer or gene loss.

  2. REFGEN and TREENAMER: Automated Sequence Data Handling for Phylogenetic Analysis in the Genomic Era

    Science.gov (United States)

    Leonard, Guy; Stevens, Jamie R.; Richards, Thomas A.

    2009-01-01

    The phylogenetic analysis of nucleotide sequences and increasingly that of amino acid sequences is used to address a number of biological questions. Access to extensive datasets, including numerous genome projects, means that standard phylogenetic analyses can include many hundreds of sequences. Unfortunately, most phylogenetic analysis programs do not tolerate the sequence naming conventions of genome databases. Managing large numbers of sequences and standardizing sequence labels for use in phylogenetic analysis programs can be a time consuming and laborious task. Here we report the availability of an online resource for the management of gene sequences recovered from public access genome databases such as GenBank. These web utilities include the facility for renaming every sequence in a FASTA alignment file, with each sequence label derived from a user-defined combination of the species name and/or database accession number. This facility enables the user to keep track of the branching order of the sequences/taxa during multiple tree calculations and re-optimisations. Post phylogenetic analysis, these webpages can then be used to rename every label in the subsequent tree files (with a user-defined combination of species name and/or database accession number). Together these programs drastically reduce the time required for managing sequence alignments and labelling phylogenetic figures. Additional features of our platform include the automatic removal of identical accession numbers (recorded in the report file) and generation of species and accession number lists for use in supplementary materials or figure legends. PMID:19812722
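
    The relabelling step described above, in which every FASTA header is replaced by a combination of species name and accession number, might look like the following Python sketch. It is not the code behind REFGEN or TREENAMER; the function name, the assumption that the accession is the first token of each header, and the lookup table are hypothetical.

```python
def rename_fasta(lines, species_by_accession, sep="_"):
    """Relabel FASTA records as <species><sep><accession>.

    Records whose accession has already been seen are dropped, loosely
    mirroring the removal of identical accession numbers described above.
    """
    renamed = []
    seen = set()
    keep = True
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split()
            accession = parts[0] if parts else ""
            keep = accession not in seen
            if keep:
                seen.add(accession)
                species = species_by_accession.get(accession, "unknown")
                renamed.append(">" + species.replace(" ", sep) + sep + accession)
        elif keep:
            renamed.append(line)
    return renamed


records = [">AB123 some description", "ACGT", ">AB123 duplicate", "TTTT", ">XY9", "GG"]
print(rename_fasta(records, {"AB123": "Homo sapiens"}))
```

    Keeping labels free of spaces and punctuation is what lets the same names survive a round trip through tree-building programs and back into the tree files.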

  3. REFGEN and TREENAMER: Automated Sequence Data Handling for Phylogenetic Analysis in the Genomic Era

    Directory of Open Access Journals (Sweden)

    Guy Leonard

    2009-01-01

    Full Text Available The phylogenetic analysis of nucleotide sequences and increasingly that of amino acid sequences is used to address a number of biological questions. Access to extensive datasets, including numerous genome projects, means that standard phylogenetic analyses can include many hundreds of sequences. Unfortunately, most phylogenetic analysis programs do not tolerate the sequence naming conventions of genome databases. Managing large numbers of sequences and standardizing sequence labels for use in phylogenetic analysis programs can be a time consuming and laborious task. Here we report the availability of an online resource for the management of gene sequences recovered from public access genome databases such as GenBank. These web utilities include the facility for renaming every sequence in a FASTA alignment file, with each sequence label derived from a user-defined combination of the species name and/or database accession number. This facility enables the user to keep track of the branching order of the sequences/taxa during multiple tree calculations and re-optimisations. Post phylogenetic analysis, these webpages can then be used to rename every label in the subsequent tree files (with a user-defined combination of species name and/or database accession number). Together these programs drastically reduce the time required for managing sequence alignments and labelling phylogenetic figures. Additional features of our platform include the automatic removal of identical accession numbers (recorded in the report file) and generation of species and accession number lists for use in supplementary materials or figure legends.

  4. Sequencing and Characterization of Divergent Marbling Levels in the Beef Cattle ( Muscle Transcriptome

    Directory of Open Access Journals (Sweden)

    Dong Chen

    2015-02-01

    Full Text Available Marbling is an important trait regarding the quality of beef. Analysis of beef cattle transcriptome and its expression profile data are essential to extend the genetic information resources and would support further studies on beef cattle. RNA sequencing was performed in beef cattle using the Illumina High-Seq2000 platform. Approximately 251.58 million clean reads were generated from a high marbling (H) group and a low marbling (L) group. Approximately 80.12% of the 19,994 bovine genes (protein coding) were detected in all samples, and 749 genes exhibited differential expression between the H and L groups based on fold change (>1.5-fold, p<0.05). Multiple gene ontology terms and biological pathways were found significantly enriched among the differentially expressed genes. The transcriptome data will facilitate future functional studies on marbling formation in beef cattle and may be applied to improve breeding programs for cattle and closely related mammals.

  5. Characterizing spatial heterogeneity based on the b-value and fractal analyses of the 2015 Nepal earthquake sequence

    Science.gov (United States)

    Nampally, Subhadra; Padhy, Simanchal; Dimri, Vijay P.

    2018-01-01

    The nature of spatial distribution of heterogeneities in the source area of the 2015 Nepal earthquake is characterized based on the seismic b-value and fractal analysis of its aftershocks. The earthquake size distribution of aftershocks gives a b-value of 1.11 ± 0.08, possibly representing the highly heterogeneous and low stress state of the region. The aftershocks exhibit a fractal structure characterized by a spectrum of generalized dimensions, Dq varying from D2 = 1.66 to D22 = 0.11. The existence of a fractal structure suggests that the spatial distribution of aftershocks is not a random phenomenon, but it self-organizes into a critical state, exhibiting a scale-independent structure governed by a power-law scaling, where a small perturbation in stress is sufficient enough to trigger aftershocks. In order to obtain the bias in fractal dimensions resulting from finite data size, we compared the multifractal spectrum for the real data and random simulations. On comparison, we found that the lower limit of bias in D2 is 0.44. The similarity in their multifractal spectra suggests the lack of long-range correlation in the data, with an only weakly multifractal or a monofractal with a single correlation dimension D2 characterizing the data. The minimum number of events required for a multifractal process with an acceptable error is discussed. We also tested for a possible correlation between changes in D2 and energy released during the earthquakes. The values of D2 rise during the two largest earthquakes (M > 7.0) in the sequence. The b- and D2 values are related by D2 = 1.45 b that corresponds to the intermediate to large earthquakes. Our results provide useful constraints on the spatial distribution of b- and D2-values, which are useful for seismic hazard assessment in the aftershock area of a large earthquake.
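
    The abstract does not state how the b-value was estimated. One widely used option is the Aki maximum-likelihood estimator with a correction for magnitude binning, sketched below in Python under that assumption; the function name, the default bin width of 0.1 and the toy catalogue are illustrative only.

```python
from math import e, log10, sqrt


def b_value(magnitudes, mc, bin_width=0.1):
    """Estimate the Gutenberg-Richter b-value by maximum likelihood.

    Uses b = log10(e) / (mean(M) - (mc - bin_width / 2)) on events with
    M >= mc, and returns (b, b / sqrt(N)) as the estimate and a
    first-order standard error.
    """
    sample = [m for m in magnitudes if m >= mc]
    if len(sample) < 2:
        raise ValueError("need at least two events at or above mc")
    mean_mag = sum(sample) / len(sample)
    denom = mean_mag - (mc - bin_width / 2.0)
    if denom <= 0:
        raise ValueError("mean magnitude must exceed the corrected cutoff")
    b = log10(e) / denom
    return b, b / sqrt(len(sample))


# Toy catalogue, not data from the study.
b, sigma = b_value([4.0, 4.5, 5.0, 5.5], mc=4.0, bin_width=0.0)
print(b, sigma)
```

    Plugging the reported b = 1.11 into the stated relation D2 = 1.45 b gives about 1.61, which is close to the reported D2 = 1.66.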

  6. Deep sequencing and ecological characterization of gut microbial communities of diverse bumble bee species.

    Directory of Open Access Journals (Sweden)

    Haw Chuan Lim

    Full Text Available Gut bacterial communities of bumble bees are correlated with defense against pathogens. Further understanding this host-microbe association is vitally important as bumble bees are currently experiencing global population declines, potentially due in part to emergent diseases. In this study, we used pyrosequencing and community fingerprinting (ARISA) to characterize the gut microbial communities of nine bumble bee species from across the Bombus phylogeny. Overall, we delimited 74 bacterial taxa (operational taxonomic units or OTUs) belonging to Betaproteobacteria, Gammaproteobacteria, Bacilli, Actinobacteria, Flavobacteria and Alphaproteobacteria. Each bacterial community was taxonomically simple, containing an average of 1.9 common (relative abundance per sample > 5%) bacterial OTUs. The most abundant and prevalent (occurring in 92% of the samples) bacterial OTU, based on 16S rRNA sequences, closely matched that of the previously described Betaproteobacteria species Snodgrassella alvi. Bacteria that were first described in bee-related external environments dominated a number of gut bacterial communities, suggesting that they are not strictly dependent on the internal gut environment. The ARISA data showed a correlation between bacterial community structures and the geographic locations where the bees were sampled, suggesting that at least a subset of the bacterial species may be transmitted environmentally. Using light and fluorescent microscopy, we demonstrated that the gut bacteria form a biofilm on the internal epithelial surface of the ileum, corroborating results obtained from Apis mellifera.

  7. Genetic characterization of autochthonous grapevine cultivars from Eastern Turkey by simple sequence repeats (SSRs)

    Directory of Open Access Journals (Sweden)

    Sadiye Peral Eyduran

    2016-01-01

    Full Text Available In this research, two well-recognized standard grape cultivars, Cabernet Sauvignon and Merlot, together with eight historical autochthonous grapevine cultivars from Eastern Anatolia in Turkey, were genetically characterized by using 12 pairs of simple sequence repeat (SSR) primers in order to evaluate their genetic diversity and relatedness. All of the SSR primers used produced successful amplifications and revealed DNA polymorphisms, which were subsequently utilized to evaluate the genetic relatedness of the grapevine cultivars. Allele richness was implied by the identification of 69 alleles in 8 autochthonous cultivars with a mean value of 5.75 alleles per locus. The average expected heterozygosity and observed heterozygosity were found to be 0.749 and 0.739, respectively. Taking into account the generated alleles, the highest number was recorded in the VVC2C3 and VVS2 loci (nine and eight alleles per locus, respectively), whereas the lowest number was recorded in VrZAG83 (three alleles per locus). Two main clusters were produced by using the unweighted pair-group method with arithmetic mean dendrogram constructed on the basis of the SSR data. Only Cabernet Sauvignon and Merlot cultivars were included in the first cluster. The second cluster involved the rest of the autochthonous cultivars. The results obtained during the study illustrated clearly that SSR markers have proved to be an effective tool for fingerprinting grapevine cultivars and carrying out grapevine biodiversity studies. The obtained data are also meaningful references for grapevine domestication.
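
    The per-locus statistics reported in this record (number of alleles, expected and observed heterozygosity) can be computed as in the Python sketch below. It uses the plain definition He = 1 - sum(p_i^2) without a small-sample correction, which may differ from the software used in the study; the function name and the toy genotypes are assumptions.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (allele_count, expected_het, observed_het) for one locus.

    `genotypes` is a list of (allele_a, allele_b) pairs, one per diploid
    individual; alleles can be any hashable labels such as fragment sizes.
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    alleles = Counter(allele for pair in genotypes for allele in pair)
    total = sum(alleles.values())
    expected = 1.0 - sum((count / total) ** 2 for count in alleles.values())
    observed = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    return len(alleles), expected, observed


# Toy genotypes (fragment sizes in bp), not data from the study.
print(heterozygosity([(180, 184), (180, 180), (184, 184), (180, 184)]))
```

    The mean of 5.75 alleles per locus quoted above is simply the 69 alleles divided by the 12 SSR loci.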

  8. Characterization of promoter sequence of toll-like receptor genes in Vechur cattle

    Directory of Open Access Journals (Sweden)

    R. Lakshmi

    2016-06-01

    Aim: To compare the promoter sequences of toll-like receptor (TLR) genes in Vechur cattle, an indigenous breed of Kerala, with the sequence of Bos taurus and assess the differences that could be attributed to innate immune responses against bovine mastitis. Materials and Methods: Blood samples were collected from the jugular vein of Vechur cattle, maintained at the Vechur cattle conservation center of Kerala Veterinary and Animal Sciences University, using an acid-citrate-dextrose anticoagulant. The genomic DNA was extracted, and polymerase chain reaction was carried out to amplify the promoter region of TLRs. The amplified products of the TLR2, 4, and 9 promoter regions were sequenced by the Sanger enzymatic DNA sequencing technique. Results: The sequence of the promoter region of TLR2 of Vechur cattle showed 98% similarity with the B. taurus sequence present in GenBank and revealed variants for four sequence motifs. The sequence of the promoter region of TLR4 of Vechur cattle revealed 99% similarity with the B. taurus sequence but did not reveal significant variants in motif regions. However, two heterozygous loci were observed from the chromatogram. The promoter sequence of the TLR9 gene also showed 99% similarity to the B. taurus sequence and revealed variants for four sequence motifs. Conclusion: The results of this study indicate significant variation in the promoters of the TLR2 and TLR9 genes in the Vechur cattle breed, which may potentially influence the innate immune response against mastitis.

  9. Whole-Genome Sequencing in Microbial Forensic Analysis of Gamma-Irradiated Microbial Materials.

    Science.gov (United States)

    Broomall, Stacey M; Ait Ichou, Mohamed; Krepps, Michael D; Johnsky, Lauren A; Karavis, Mark A; Hubbard, Kyle S; Insalaco, Joseph M; Betters, Janet L; Redmond, Brady W; Rivers, Bryan A; Liem, Alvin T; Hill, Jessica M; Fochler, Edward T; Roth, Pierce A; Rosenzweig, C Nicole; Skowronski, Evan W; Gibbons, Henry S

    2016-01-15

    Effective microbial forensic analysis of materials used in a potential biological attack requires robust methods of morphological and genetic characterization of the attack materials in order to enable the attribution of the materials to potential sources and to exclude other potential sources. The genetic homogeneity and potential intersample variability of many of the category A to C bioterrorism agents offer a particular challenge to the generation of attributive signatures, potentially requiring whole-genome or proteomic approaches to be utilized. Currently, irradiation of mail is standard practice at several government facilities judged to be at particularly high risk. Thus, initial forensic signatures would need to be recovered from inactivated (nonviable) material. In the study described in this report, we determined the effects of high-dose gamma irradiation on forensic markers of bacterial biothreat agent surrogate organisms with a particular emphasis on the suitability of genomic DNA (gDNA) recovered from such sources as a template for whole-genome analysis. While irradiation of spores and vegetative cells affected the retention of Gram and spore stains and sheared gDNA into small fragments, we found that irradiated material could be utilized to generate accurate whole-genome sequence data on the Illumina and Roche 454 sequencing platforms. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  10. CCK-5: sequence analysis of a small cholecystokinin from canine brain and intestine

    International Nuclear Information System (INIS)

    Shively, J.; Reeve, J.R. Jr.; Eysselein, V.E.; Ben-Avram, C.; Vigna, S.R.; Walsh, J.H.

    1987-01-01

    The purpose of this study is to purify and to characterize chemically cholecystokinin (CCK)-like peptides present in brain and gut extracts that elute from gel filtration after the octapeptide. Canine small intestinal mucosa and brain were boiled in water and then extracted in cold trifluoroacetic acid, and cholecystokinin-like immunoreactivity was determined by carboxyl-terminal specific radioimmunoassay. Gel permeation chromatography on Sephadex G-50 revealed a form of CCK apparently smaller than CCK-8. Microsequence analysis showed that the amino terminal primary sequence of this small CCK was Gly-Trp-Met-Asp. Immunochemical and chromatographic analysis indicated that the carboxyl-terminal residue was Phe-NH2 and thus the full sequence is Gly-Trp-Met-Asp-Phe-NH2. An antibody that recognizes synthetic CCK-8, CCK-5, and CCK-4 equally did not reveal the presence of significant amounts of CCK-4. These results indicate that CCK-5 is the major CCK form smaller than the octapeptide present in brain and small intestine. This finding, coupled with the demonstration by others that CCK-5 interacts with high-affinity brain CCK receptors, indicates that CCK-5 may play a physiological role in brain function.

  11. [Study of human immunodeficiency virus transmission chains in Andalusia: analysis from baseline antiretroviral resistance sequences].

    Science.gov (United States)

    Pérez-Parra, Santiago; Chueca-Porcuna, Natalia; Álvarez-Estevez, Marta; Pasquau, Juan; Omar, Mohamed; Collado, Antonio; Vinuesa, David; Lozano, Ana Belen; García-García, Federico

    2015-11-01

    Protease and reverse transcriptase HIV-1 sequences provide useful information for patient clinical management, as well as information on resistance to antiretrovirals. The aim of this study is to evaluate transmission events and transmitted drug resistance, and to georeference subtypes among newly diagnosed patients referred to our center. A study was conducted on 693 patients diagnosed between 2005 and 2012 in Southern Spain. Protease and reverse transcriptase sequences were obtained for cART resistance analysis with the Trugene® HIV Genotyping Kit (Siemens, NAD). MEGA 5.2, Neighbor-Joining, ArcGIS and REGA were used for subsequent analysis. The results showed 298 patients clustered into 77 different transmission events. Most of the clusters were formed by pairs (n=49), of men having sex with men (n=26), Spanish (n=37), and below 45 years of age (73.5%). Urban areas of Granada, and the coastal areas of Almeria and Granada, showed the greatest subtype heterogeneity. Five clusters were formed by more than 10 patients, and 15 clusters had transmitted drug resistance. The study data demonstrate that the phylogenetic characterization of transmission clusters is a powerful tool to monitor the spread of HIV, and may contribute to the design of appropriate preventive measures to minimize it. Copyright © 2015 Elsevier España, S.L.U. y Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.

  12. Identification of somatic mutations in cancer through Bayesian-based analysis of sequenced genome pairs.

    Science.gov (United States)

    Christoforides, Alexis; Carpten, John D; Weiss, Glen J; Demeure, Michael J; Von Hoff, Daniel D; Craig, David W

    2013-05-04

    The field of cancer genomics has rapidly adopted next-generation sequencing (NGS) in order to study and characterize malignant tumors with unprecedented resolution. In particular for cancer, one is often trying to identify somatic mutations--changes specific to a tumor and not within an individual's germline. However, false positive and false negative detections often result from lack of sufficient variant evidence, contamination of the biopsy by stromal tissue, sequencing errors, and the erroneous classification of germline variation as tumor-specific. We have developed a generalized Bayesian analysis framework for matched tumor/normal samples with the purpose of identifying tumor-specific alterations such as single nucleotide mutations, small insertions/deletions, and structural variation. We describe our methodology, and discuss its application to other types of paired-tissue analysis such as the detection of loss of heterozygosity as well as allelic imbalance. We also demonstrate the high level of sensitivity and specificity in discovering simulated somatic mutations, for various combinations of a) genomic coverage and b) emulated heterogeneity. We present a Java-based implementation of our methods named Seurat, which is made available for free academic use. We have demonstrated and reported on the discovery of different types of somatic change by applying Seurat to an experimentally-derived cancer dataset using our methods; and have discussed considerations and practices regarding the accurate detection of somatic events in cancer genomes. Seurat is available at https://sites.google.com/site/seuratsomatic.

  13. Sequence analysis of Maturase K (matK): A chloroplast-encoding ...

    African Journals Online (AJOL)

    The application and utilization of sequence data have been found very informative in the characterization and phylogenetic relationship of different crop species. This study aimed to use bioinformatics tools to characterize the matK gene in some selected legumes with special reference to pigeon pea [Cajanus cajan ...

  14. Generation and Characterization of HIV-1 Transmitted and Founder Virus Consensus Sequence from Intravenous Drug Users in Xinjiang, China.

    Science.gov (United States)

    Li, Fan; Ma, Liying; Feng, Yi; Hu, Jing; Ni, Na; Ruan, Yuhua; Shao, Yiming

    2017-06-01

    HIV-1 transmission in intravenous drug users (IDUs) has been characterized by high genetic multiplicity and suggests a greater challenge for HIV-1 infection blocking. We investigated a total of 749 sequences of the full-length gp160 gene obtained by single genome sequencing (SGS) from 22 HIV-1 early infected IDUs in Xinjiang province, northwest China, and generated a transmitted and founder virus (T/F virus) consensus sequence (IDU.CON). The T/F virus was classified as subtype CRF07_BC and predicted to be a CCR5-tropic virus. The variable regions (V1, V2, and V4 loops) of IDU.CON showed length variation compared with the heterosexual T/F virus consensus sequence (HSX.CON) and homosexual T/F virus consensus sequence (MSM.CON). A total of 26 N-linked glycosylation sites were discovered in the IDU.CON sequence, which is fewer than in MSM.CON and HSX.CON. Characterization of T/F virus from IDUs highlights the genetic make-up and complexity of the virus near the moment of transmission or in early infection preceding systemic dissemination and is important for the development of effective HIV-1 preventive methods, including vaccines.

  15. Characterization and Sequencing of a Genotype XII Newcastle Disease Virus Isolated from a Peacock (Pavo cristatus) in Peru.

    Science.gov (United States)

    Chumbe, Ana; Izquierdo-Lara, Ray; Tataje-Lavanda, Luis; Figueroa, Aling; Segovia, Karen; Gonzalez, Rosa; Cribillero, Giovana; Montalvan, Angela; Fernández-Díaz, Manolo; Icochea, Eliana

    2015-07-30

    Here, we report the first complete sequence and biological characterization of a Newcastle disease virus (NDV) isolated from a peacock in South America (NDV/peacock/Peru/2011). This isolate, classified as genotype XII in class II, highlights the need for increased surveillance of noncommercial avian species. Copyright © 2015 Chumbe et al.

  16. Mycobacterium malmesburyense sp. nov., a non-tuberculous species of the genus Mycobacterium revealed by multiple gene sequence characterization

    CSIR Research Space (South Africa)

    Gcebe, N

    2017-04-01

    Journal of Systematic and Evolutionary Microbiology: DOI 10.1099/ijsem.0.001678 Mycobacterium malmesburyense sp. nov., a non-tuberculous species of the genus Mycobacterium revealed by multiple gene sequence characterization Gcebe N Rutten V Gey...

  17. Oral focal epithelial hyperplasia: report of 3 cases with human papillomavirus DNA sequencing analysis.

    Science.gov (United States)

    Gültekin, S E; Tokman Yildirim, Benay; Sarisoy, S

    2011-01-01

    Focal epithelial hyperplasia (FEH), or Heck's disease, is a benign proliferative viral infection of the oral mucosa that is related to human papillomavirus (HPV), mainly subtypes 13 and 32. Although this condition is known to exist in numerous populations and ethnic groups, the reported cases among Caucasians are relatively rare. It presents as asymptomatic papules or nodules on the oral mucosa, gingiva, tongue, and lips. Histopathologically, it is characterized by parakeratosis, epithelial hyperplasia, focal acanthosis, fusion, and horizontal outgrowth of epithelial ridges and the cells named mitozoids. The purpose of this case report was to present 3 cases of focal epithelial hyperplasia in a pediatric age group. Histopathological and clinical features of cases are discussed and DNA sequencing analysis is reported in which HPV 13, HPV 32, and HPV 11 genomes are detected.

  18. Molecular Cloning and Sequencing of Alkalophilic Cellulosimicrobium cellulans CKMX1 Xylanase Gene Isolated from Mushroom Compost and Characterization of the Gene Product

    Directory of Open Access Journals (Sweden)

    Abhishek Walia

    2015-12-01

    A xylanolytic bacterium was isolated from mushroom compost by using an enrichment technique. Results from metabolic fingerprinting, whole-cell fatty acid methyl ester analysis and 16S rDNA sequencing suggested the bacterium to be Cellulosimicrobium cellulans CKMX1. Due to the xylanolytic activity of this bacterium, isolation and characterization of the xylanase gene were attempted. A distinct fragment of about 1671 bp was successfully amplified using PCR and cloned into Escherichia coli DH5α. A BLAST search confirmed that the DNA sequence from the amplified fragment was endo-1,4-beta-xylanase, which was a member of glycoside hydrolase family 11. It showed 98% homology with the Cellulosimicrobium sp. xylanase gene (Accession no. FJ859907.1) reported from the gut of Eisenia fetida in Korea. In silico physico-chemical characterization of the amino acid sequence of xylanase showed an open reading frame encoding a 556 amino acid sequence with a molecular weight of 58 kDa and a theoretical isoelectric point (pI) of 4.46, computed using Expasy's ProtParam server. Secondary and homology-based 3D structure of xylanase was analysed using SOPMA and Swiss-Prot software.

  19. Analysis and characterization of heparin impurities.

    Science.gov (United States)

    Beni, Szabolcs; Limtiaco, John F K; Larive, Cynthia K

    2011-01-01

    This review discusses recent developments in analytical methods available for the sensitive separation, detection and structural characterization of heparin contaminants. The adulteration of raw heparin with oversulfated chondroitin sulfate (OSCS) in 2007-2008 spawned a global crisis resulting in extensive revisions to the pharmacopeia monographs on heparin and prompting the FDA to recommend the development of additional physicochemical methods for the analysis of heparin purity. The analytical chemistry community quickly responded to this challenge, developing a wide variety of innovative approaches, several of which are reported in this special issue. This review provides an overview of methods of heparin isolation and digestion, discusses known heparin contaminants, including OSCS, and summarizes recent publications on heparin impurity analysis using sensors, near-IR, Raman, and NMR spectroscopy, as well as electrophoretic and chromatographic separations.

  20. Expressed Sequence Tag-Simple Sequence Repeat (EST-SSR) Marker Resources for Diversity Analysis of Mango (Mangifera indica L.)

    Directory of Open Access Journals (Sweden)

    Natalie L. Dillon

    2014-01-01

    In this study, a collection of 24,840 expressed sequence tags (ESTs) generated from five mango (Mangifera indica L.) cDNA libraries was mined for EST-based simple sequence repeat (SSR) markers. Over 1,000 ESTs with SSR motifs were detected from more than 24,000 EST sequences, with di- and tri-nucleotide repeat motifs the most abundant. Of these, 25 EST-SSRs in genes involved in plant development, stress response, and fruit color and flavor development pathways were selected, developed into PCR markers and characterized in a population of 32 mango selections including M. indica varieties and related Mangifera species. Twenty-four of the 25 EST-SSR markers exhibited polymorphisms, identifying a total of 86 alleles with an average of 5.38 alleles per locus, and distinguished between all Mangifera selections. Private alleles were identified for Mangifera species. These newly developed EST-SSR markers enhance the current 11 SSR mango genetic identity panel utilized by the Australian Mango Breeding Program. The current panel has been used to identify progeny and parents for selection, and the application of this extended panel will further improve and help to design mango hybridization strategies for increased breeding efficiency.

  1. Whole Genome Sequencing Based Characterization of Extensively Drug-Resistant Mycobacterium tuberculosis Isolates from Pakistan

    KAUST Repository

    Ali, Asho; Hasan, Zahra; McNerney, Ruth; Mallard, Kim; Hill-Cawthorne, Grant A.; Coll, Francesc; Nair, Mridul; Pain, Arnab; Clark, Taane G.; Hasan, Rumina

    2015-01-01

    Improved molecular diagnostic methods for the detection of drug resistance in Mycobacterium tuberculosis (MTB) strains are required. Resistance to first- and second-line anti-tuberculous drugs has been associated with single nucleotide polymorphisms (SNPs) in particular genes. However, these SNPs can vary between MTB lineages; therefore, local data are required to describe different strain populations. We used whole genome sequencing (WGS) to characterize 37 extensively drug-resistant (XDR) MTB isolates from Pakistan and investigated 40 genes associated with drug resistance. Rifampicin resistance was attributable to SNPs in the rpoB hot-spot region. Isoniazid resistance was most commonly associated with the katG codon 315 (92%) mutation, followed by inhA S94A (8%); however, one strain did not have SNPs in katG, inhA or oxyR-ahpC. All strains were pyrazinamide resistant but only 43% had pncA SNPs. Ethambutol resistant strains predominantly had embB codon 306 (62%) mutations, but additional SNPs at embB codons 406, 378 and 328 were also present. Fluoroquinolone resistance was associated with gyrA codons 91-94 in 81% of strains; four strains had only gyrB mutations, while others did not have SNPs in either gyrA or gyrB. Streptomycin resistant strains had mutations in ribosomal RNA genes: rpsL codon 43 (42%), rrs 500 region (16%), and gidB (34%), while six strains did not have mutations in any of these genes. Amikacin/kanamycin/capreomycin resistance was associated with SNPs in rrs at nt1401 (78%) and nt1484 (3%), except in seven (19%) strains. We estimate that if only the common hot-spot region targets of current commercial assays were used, the concordance between phenotypic and genotypic testing for these XDR strains would vary between rifampicin (100%), isoniazid (92%), fluoroquinolones (81%), aminoglycosides (78%) and ethambutol (62%), while pncA sequencing would provide genotypic resistance in less than half the isolates. This work highlights the importance of expanded

  2. Whole Genome Sequencing Based Characterization of Extensively Drug-Resistant Mycobacterium tuberculosis Isolates from Pakistan

    KAUST Repository

    Ali, Asho

    2015-02-26

    Improved molecular diagnostic methods for the detection of drug resistance in Mycobacterium tuberculosis (MTB) strains are required. Resistance to first- and second-line anti-tuberculous drugs has been associated with single nucleotide polymorphisms (SNPs) in particular genes. However, these SNPs can vary between MTB lineages; therefore, local data are required to describe different strain populations. We used whole genome sequencing (WGS) to characterize 37 extensively drug-resistant (XDR) MTB isolates from Pakistan and investigated 40 genes associated with drug resistance. Rifampicin resistance was attributable to SNPs in the rpoB hot-spot region. Isoniazid resistance was most commonly associated with the katG codon 315 (92%) mutation, followed by inhA S94A (8%); however, one strain did not have SNPs in katG, inhA or oxyR-ahpC. All strains were pyrazinamide resistant but only 43% had pncA SNPs. Ethambutol resistant strains predominantly had embB codon 306 (62%) mutations, but additional SNPs at embB codons 406, 378 and 328 were also present. Fluoroquinolone resistance was associated with gyrA codons 91-94 in 81% of strains; four strains had only gyrB mutations, while others did not have SNPs in either gyrA or gyrB. Streptomycin resistant strains had mutations in ribosomal RNA genes: rpsL codon 43 (42%), rrs 500 region (16%), and gidB (34%), while six strains did not have mutations in any of these genes. Amikacin/kanamycin/capreomycin resistance was associated with SNPs in rrs at nt1401 (78%) and nt1484 (3%), except in seven (19%) strains. We estimate that if only the common hot-spot region targets of current commercial assays were used, the concordance between phenotypic and genotypic testing for these XDR strains would vary between rifampicin (100%), isoniazid (92%), fluoroquinolones (81%), aminoglycosides (78%) and ethambutol (62%), while pncA sequencing would provide genotypic resistance in less than half the isolates. This work highlights the importance of expanded

  3. Analysis of expressed sequence tags from Prunus mume flower and fruit and development of simple sequence repeat markers

    Directory of Open Access Journals (Sweden)

    Gao Zhihong

    2010-07-01

    Background: Expressed sequence tags (ESTs) have been a cost-effective tool in molecular biology and represent an abundant, valuable resource for genome annotation, gene expression, and comparative genomics in plants. Results: In this study, we constructed a cDNA library of Prunus mume flower and fruit, sequenced 10,123 clones of the library, and obtained 8,656 high-quality expressed sequence tag (EST) sequences. The ESTs were assembled into 4,473 unigenes composed of 1,492 contigs and 2,981 singletons that have been deposited in NCBI (accession IDs: GW868575 - GW873047), among which 1,294 unique ESTs had known or putative functions. Furthermore, we found 1,233 putative simple sequence repeats (SSRs) in the P. mume unigene dataset. We randomly tested 42 pairs of PCR primers flanking potential SSRs, and 14 pairs were identified as true-to-type SSR loci and could amplify polymorphic bands from 20 individual plants of P. mume. We further used the 14 EST-SSR primer pairs to test the transferability on peach and plum. The result showed that nearly 89% of the primer pairs produced target PCR bands in the two species. A high level of marker polymorphism was observed in the plum species (65%) and a lower level in the peach (46%), and the clustering analysis of the three species indicated that these SSR markers were useful in the evaluation of genetic relationships and diversity between and within the Prunus species. Conclusions: We have constructed the first cDNA library of P. mume flower and fruit, and our data provide sets of molecular biology resources for P. mume and other Prunus species. These resources will be useful for further study such as genome annotation, new gene discovery, gene functional analysis, molecular breeding, evolution and comparative genomics between Prunus species.
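
    The unigene tally in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that unigenes are simply contigs plus singletons, and that the deposited accession range is contiguous with one accession per unigene; the second assumption is not stated in the abstract. The helper names are hypothetical and chosen for illustration.

```python
def split_accession(accession: str) -> tuple[str, int]:
    """Split an accession such as 'GW868575' into ('GW', 868575)."""
    prefix = accession.rstrip("0123456789")
    digits = accession[len(prefix):]
    if not prefix or not digits:
        raise ValueError(f"unexpected accession format: {accession!r}")
    return prefix, int(digits)


def accession_range_size(first: str, last: str) -> int:
    """Number of accessions in an inclusive, contiguous range."""
    first_prefix, first_num = split_accession(first)
    last_prefix, last_num = split_accession(last)
    if first_prefix != last_prefix or last_num < first_num:
        raise ValueError("accessions do not form a valid range")
    return last_num - first_num + 1


if __name__ == "__main__":
    contigs, singletons = 1492, 2981  # counts reported in the abstract
    unigenes = contigs + singletons
    deposited = accession_range_size("GW868575", "GW873047")
    print(unigenes, deposited, unigenes == deposited)
```

    Under these assumptions the size of the accession range agrees with the reported number of unigenes.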

  4. Accident Sequence Precursor Analysis for SGTR by Using Dynamic PSA Approach

    International Nuclear Information System (INIS)

    Lee, Han Sul; Heo, Gyun Young; Kim, Tae Wan

    2016-01-01

    In order to address this issue, this study suggests a sequence tree model for analyzing accident sequences systematically. Using the sequence tree model, all possible scenarios that need a specific safety action to prevent core damage can be identified, and the success conditions of a safety action under complicated situations such as a combined accident can also be identified. A sequence tree is a branch model that divides plant conditions while considering the plant dynamics. Because the sequence tree model can reflect the plant dynamics arising from the interaction of different accident timings and plant conditions, and from the interaction between the operator action, mitigation system, and indicators for operation, it can be used to develop a dynamic event tree model easily. The target safety action for this study is a feed-and-bleed (F&B) operation. An F&B operation directly cools down the reactor cooling system (RCS) using the primary cooling system when residual heat removal by the secondary cooling system is not available. In this study, a TLOFW accident and a TLOFW accident with LOCA were the target accidents. Based on the conventional PSA model and indicators, the sequence tree model for a TLOFW accident was developed. Based on the results of a sampling analysis and data from the conventional PSA model, the CDF caused by Sequence no. 26 can be realistically estimated. For a TLOFW accident with LOCA, second accident timings were categorized according to plant condition. Indicators were selected as branch points using the flow chart and tables, and a corresponding sequence tree model was developed. If sampling analysis is performed, practical accident sequences can be identified based on the sequence analysis. If a realistic distribution for the variables can be obtained for sampling analysis, much more realistic accident sequences can be described. Moreover, if the initiating event frequency under a combined accident can be quantified, the sequence tree model

  5. Sequencing and phylogenetic analysis of Herpes simplex virus type ...

    African Journals Online (AJOL)

    To determine the genetic relationship of the HSV-2 glycoprotein G gene (gG) in Iran with those in other countries, a DNA fragment of 1100 bp corresponding to gG from six HSV-2 strains isolated from infected human sera samples in Iran was amplified by PCR and sequenced for determining ...

  6. Transcriptome analysis of blueberry using 454 EST sequencing

    Science.gov (United States)

    Blueberry (Vaccinium corymbosum) is a major berry crop in the United States, and one that has great nutritional and economic value. Next generation sequencing methodologies, such as 454, have been demonstrated to be successful and efficient in producing a snap-shot of transcriptional activities du...

  7. Functional analysis of bipartite begomovirus coat protein promoter sequences

    International Nuclear Information System (INIS)

    Lacatus, Gabriela; Sunter, Garry

    2008-01-01

    We demonstrate that the AL2 gene of Cabbage leaf curl virus (CaLCuV) activates the CP promoter in mesophyll and acts to derepress the promoter in vascular tissue, similar to that observed for Tomato golden mosaic virus (TGMV). Binding studies indicate that sequences mediating repression and activation of the TGMV and CaLCuV CP promoter specifically bind different nuclear factors common to Nicotiana benthamiana, spinach and tomato. However, chromatin immunoprecipitation demonstrates that TGMV AL2 can interact with both sequences independently. Binding of nuclear protein(s) from different crop species to viral sequences conserved in both bipartite and monopartite begomoviruses, including TGMV, CaLCuV, Pepper golden mosaic virus and Tomato yellow leaf curl virus suggests that bipartite begomoviruses bind common host factors to regulate the CP promoter. This is consistent with a model in which AL2 interacts with different components of the cellular transcription machinery that bind viral sequences important for repression and activation of begomovirus CP promoters

  8. Sequence analysis of mitochondrial 16S ribosomal RNA gene

    Indian Academy of Sciences (India)

    Mosquitoes are vectors for the transmission of many human pathogens that include viruses, nematodes and protozoa. For the understanding of their vectorial capacity, identification of disease carrying and refractory strains is essential. Recently, molecular taxonomic techniques have been utilized for this purpose. Sequence ...

  9. Illumina-based de novo transcriptome sequencing and analysis

    Indian Academy of Sciences (India)

    In the present study, we used Illumina HiSeq technology to perform de novo assembly of heart and musk gland transcriptomes from the Chinese forest musk deer. A total of 239,383 transcripts and 176,450 unigenes were obtained, of which 37,329 unigenes were matched to known sequences in the NCBI nonredundant ...

  10. Generation and analysis of expressed sequence tags from Botrytis cinerea

    Directory of Open Access Journals (Sweden)

    EVELYN SILVA

    2006-01-01

    Botrytis cinerea is a filamentous plant pathogen of a wide range of plant species, and its infection may cause enormous damage both during plant growth and in the post-harvest phase. We have constructed a cDNA library from an isolate of B. cinerea and have sequenced 11,482 expressed sequence tags that were assembled into 1,003 contig sequences and 3,032 singletons. Approximately 81% of the unigenes showed significant similarity to genes coding for proteins with known functions: more than 50% of the sequences code for genes involved in cellular metabolism, 12% for transport of metabolites, and approximately 10% for cellular organization. Other functional categories include responses to biotic and abiotic stimuli, cell communication, cell homeostasis, and cell development. We carried out pair-wise comparisons with fungal databases to determine the B. cinerea unisequence set with relevant similarity to genes in other fungal pathogenic counterparts. Among the 4,035 non-redundant B. cinerea unigenes, 1,338 (23%) have significant homology with Fusarium verticillioides unigenes. Similar values were obtained for Saccharomyces cerevisiae and Aspergillus nidulans (22% and 24%, respectively). The lower percentages of homology were with Magnaporthe grisea and Neurospora crassa (13% and 19%, respectively). Several genes involved in putative and known fungal virulence and general pathogenicity were identified. The results provide important information for future research on this fungal pathogen.

  11. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome seque...

  12. DNA sequence and prokaryotic expression analysis of vitellogenin ...

    African Journals Online (AJOL)

    In this study, the DNA sequence of vitellogenin from Antheraea pernyi (Ap-Vg) was identified and its functional domain (30-740 aa, Ap-Vg-1) was expressed in Escherichia coli BL21 (DE3) cells. The recombinant Ap-Vg-1 proteins were purified and used for antibody preparation. The results showed that the intact DNA ...

  13. Molecular cloning, sequence analysis and structure prediction of the ...

    African Journals Online (AJOL)

    AJL

    2012-04-19

    Apr 19, 2012 ... The primers were based on the rBAT sequences of other animals deposited in GenBank. .... fragment; M1, 2000 bp DNA ladder; M2, 1000 bp DNA ladder. spliced to obtain the ..... A traffic signal for heterodimeric amino acid.

  14. A bibliometric analysis of global research on genome sequencing ...

    African Journals Online (AJOL)

    The results show that disease and protein related researches were the leading research focuses, and comparative genomics and evolution related research had strong potential in the near future. Key words: Genome sequencing, research trend, scientometrics, science citation index expanded (SCI-Expanded), word cluster ...

  15. Cloning and sequence analysis of the defective in anther ...

    African Journals Online (AJOL)

    To clone the defective in anther dehiscence1 (DAD1) gene fragment of Chinese kale, about 700 bp product was obtained by PCR amplification using Chinese kale genomic DNA as the template and a pair of specific primers designed according to the conserved sequence of DAD1 genes of Arabidopsis thaliana and ...

  16. Sequence and comparative analysis of Leuconostoc dairy bacteriophages

    DEFF Research Database (Denmark)

    Kot, Witold; Hansen, Lars Henrik; Neve, Horst

    2014-01-01

    Bacteriophages attacking Leuconostoc species may significantly influence the quality of the final product. There is however limited knowledge of this group of phages in the literature. We have determined the complete genome sequences of nine Leuconostoc bacteriophages virulent to either Leuconostoc...

  17. Genetic Characterization of Fasciola Isolates from West Azerbaijan Province Iran Based on ITS1 and ITS2 Sequence of Ribosomal DNA

    Science.gov (United States)

    GALAVANI, Hossein; GHOLIZADEH, Saber; HAZRATI TAPPEH, Khosrow

    2016-01-01

    Background: Fascioliasis, caused by Fasciola hepatica and F. gigantica, has medical and economic importance in the world. Compared with traditional methods, molecular approaches to the identification and characterization of Fasciola spp. are precise and reliable. The aims of the current study were the molecular characterization of Fasciola spp. in West Azerbaijan Province, Iran, and their comparative analysis using GenBank sequences. Methods: A total of 580 isolates were collected in 2014 from different hosts in five cities of West Azerbaijan Province, from 90 slaughtered cattle (n=50) and sheep (n=40). After morphological identification and DNA extraction, specifically designed primers were used to amplify the ITS1, 5.8S and ITS2 regions, and 50 randomly selected samples were sequenced. Results: Using morphometric characters, 99.14% and 0.86% of isolates were identified as F. hepatica and F. gigantica, respectively. PCR amplification of a 1081 bp fragment and sequencing showed 100% similarity with F. hepatica in the ITS1 (428 bp), 5.8S (158 bp), and ITS2 (366 bp) regions. Sequence comparison between the sequences from the current study and GenBank data showed 98% identity with 11 nucleotide mismatches. However, in the phylogenetic tree, F. hepatica sequences from West Azerbaijan Province, Iran, were closely related to Iranian, Asian, and African isolates. Conclusions: Only F. hepatica is distributed among sheep and cattle in West Azerbaijan Province, Iran. However, the 5 and 6 bp variation in the ITS1 and ITS2 regions, respectively, is not enough to separate Fasciola spp. Therefore, more studies are essential for designing new molecular markers for correct species identification. PMID:27095969

  18. Sequencing and characterization of the complete mitochondrial genome of Japanese Swellshark (Cephalloscyllium umbratile).

    Science.gov (United States)

    Zhu, Ke-Cheng; Liang, Yin-Yin; Wu, Na; Guo, Hua-Yang; Zhang, Nan; Jiang, Shi-Gui; Zhang, Dian-Chang

    2017-11-10

    To further comprehend the genome features of Cephalloscyllium umbratile (Carcharhiniformes), an endangered species, the complete mitochondrial DNA (mtDNA) was firstly sequenced and annotated. The full-length mtDNA of C. umbratile was 16,697 bp and contained ribosomal RNA (rRNA) genes, 13 protein-coding genes (PCGs), 23 transfer RNA (tRNA) genes, and a major non-coding control region. Each PCG was initiated by an authoritative ATN codon, except for COX1 initiated by a GTG codon. Seven of 13 PCGs had a typical TAA termination codon, while others terminated with a single T or TA. Moreover, the relative synonymous codon usage of the 13 PCGs was consistent with that of other published Carcharhiniformes. All tRNA genes had typical clover-leaf secondary structures, except for tRNA-Ser (GCT), which lacked the dihydrouridine 'DHU' arm. Furthermore, the analysis of the average Ka/Ks in the 13 PCGs of three Carcharhiniformes species indicated a strong purifying selection within this group. In addition, phylogenetic analysis revealed that C. umbratile was closely related to Glyphis glyphis and Glyphis garricki. Our data supply a useful resource for further studies on genetic diversity and population structure of C. umbratile.
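
    The relative synonymous codon usage (RSCU) mentioned in the abstract above is commonly defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The abstract does not say how the authors computed it, so the following is only a minimal sketch under that common definition. The function name `rscu`, the `FAMILIES` table (two four-fold degenerate families chosen for illustration) and the short input sequence are all hypothetical.

```python
from collections import Counter


def rscu(cds, families):
    """Relative synonymous codon usage for every codon listed in `families`.

    cds      -- in-frame coding sequence (trailing partial codon is ignored)
    families -- dict mapping an amino-acid label to its synonymous codons
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    result = {}
    for amino_acid, synonyms in families.items():
        total = sum(counts[c] for c in synonyms)
        if total == 0:
            continue  # amino acid absent from the sequence: RSCU undefined
        expected = total / len(synonyms)  # equal-usage expectation per codon
        for codon in synonyms:
            result[codon] = counts[codon] / expected
    return result


# Hypothetical, partial codon table used only for this illustration.
FAMILIES = {
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
    "Pro": ["CCT", "CCC", "CCA", "CCG"],
}

example = rscu("GCTGCTGCAGCCCCACCA", FAMILIES)
```

    An RSCU above 1 marks a codon used more often than equal usage would predict, and a value below 1 marks one used less often. A real analysis would supply the full codon table appropriate to the genome.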

  19. Analysis of Comparative Sequence and Genomic Data to Verify Phylogenetic Relationship and Explore a New Subfamily of Bacterial Lipases.

    Directory of Open Access Journals (Sweden)

    Malihe Masomian

    Full Text Available Thermostable and organic solvent-tolerant enzymes have significant potential in a wide range of synthetic reactions in industry due to their inherent stability at high temperatures and their ability to endure harsh organic solvents. In this study, a novel gene encoding a true lipase was isolated by construction of a genomic DNA library of thermophilic Aneurinibacillus thermoaerophilus strain HZ in an Escherichia coli plasmid vector. Sequence analysis revealed that HZ lipase had 62% identity to a putative lipase from Bacillus pseudomycoides. The closest characterized lipases to the HZ lipase are thermostable Bacillus and Geobacillus lipases belonging to subfamily I.5, with ≤ 57% identity. The amino acid sequence analysis of HZ lipase identified a conserved pentapeptide containing the active serine, GHSMG, and a Ca(2+)-binding motif, GCYGSD, in the enzyme. Protein structure modeling showed that HZ lipase consisted of an α/β hydrolase fold and a lid domain. Protein sequence alignment, conserved regions analysis, clustal distance matrix and amino acid composition illustrated differences between HZ lipase and other thermostable lipases. Phylogenetic analysis revealed that this lipase represented a new subfamily of family I of bacterial true lipases, classified as family I.9. The HZ lipase was expressed under promoter Plac using IPTG and was characterized. The recombinant enzyme showed optimal activity at 65 °C and retained ≥ 97% activity after incubation at 50 °C for 1 h. The HZ lipase was stable in various polar and non-polar organic solvents.

  20. Extensive next-generation sequencing analysis in chronic lymphocytic leukemia at diagnosis: clinical and biological correlations

    Directory of Open Access Journals (Sweden)

    Gian Matteo Rigolin

    2016-09-01

    Full Text Available Background: In chronic lymphocytic leukemia (CLL), next-generation sequencing (NGS) analysis represents a sensitive, reproducible, and resource-efficient technique for routine screening of gene mutations. Methods: We performed an extensive biologic characterization of newly diagnosed CLL, including NGS analysis of 20 genes frequently mutated in CLL and karyotype analysis, to assess whether NGS and karyotype results could be of clinical relevance in the refinement of prognosis and assessment of risk of progression. The genomic DNA from peripheral blood samples of 200 consecutive CLL patients was analyzed using the Ion Torrent Personal Genome Machine, an NGS platform that uses semiconductor sequencing technology. Karyotype analysis was performed using efficient mitogens. Results: Mutations were detected in 42.0 % of cases, with 42.8 % of mutated patients presenting 2 or more mutations. The presence of mutations by NGS was associated with unmutated IGHV gene (p = 0.009), CD38 positivity (p = 0.010), risk stratification by fluorescence in situ hybridization (FISH) (p < 0.001), and the complex karyotype (p = 0.003). A high risk as assessed by FISH analysis was associated with mutations affecting TP53 (p = 0.012), BIRC3 (p = 0.003), and FBXW7 (p = 0.003), while the complex karyotype was significantly associated with TP53, ATM, and MYD88 mutations (p = 0.003, 0.018, and 0.001, respectively). By multivariate analysis, the multi-hit profile (≥2 mutations by NGS) was independently associated with a shorter time to first treatment (p = 0.004), along with TP53 disruption (p = 0.040), IGHV unmutated status (p < 0.001), and advanced stage (p < 0.001). Advanced stage (p = 0.010), TP53 disruption (p < 0.001), IGHV unmutated status (p = 0.020), and the complex karyotype (p = 0.007) were independently associated with a shorter overall survival. Conclusions: At diagnosis, an extensive biologic characterization including

  1. Characterization of the Burkholderia thailandensis SOS response by using whole-transcriptome shotgun sequencing.

    Science.gov (United States)

    Ulrich, Ricky L; Deshazer, David; Kenny, Tara A; Ulrich, Melanie P; Moravusova, Anna; Opperman, Timothy; Bavari, Sina; Bowlin, Terry L; Moir, Donald T; Panchal, Rekha G

    2013-10-01

    The bacterial SOS response is a well-characterized regulatory network encoded by most prokaryotic bacterial species and is involved in DNA repair. In addition to nucleic acid repair, the SOS response is involved in pathogenicity, stress-induced mutagenesis, and the emergence and dissemination of antibiotic resistance. Using high-throughput sequencing technology (SOLiD RNA-Seq), we analyzed the Burkholderia thailandensis global SOS response to the fluoroquinolone antibiotic, ciprofloxacin (CIP), and the DNA-damaging chemical, mitomycin C (MMC). We demonstrate that a B. thailandensis recA mutant (RU0643) is ∼4-fold more sensitive to CIP in contrast to the parental strain B. thailandensis DW503. Our RNA-Seq results show that CIP and MMC treatment (P SOS response were induced and include lexA, uvrA, dnaE, dinB, recX, and recA. At the genome-wide level, we found an overall decrease in gene expression, especially for genes involved in amino acid and carbohydrate transport and metabolism, following both CIP and MMC exposure. Interestingly, we observed the upregulation of several genes involved in bacterial motility and enhanced transcription of a B. thailandensis genomic island encoding a Siphoviridae bacteriophage designated E264. Using B. thailandensis plaque assays and PCR with B. mallei ATCC 23344 as the host, we demonstrate that CIP and MMC exposure in B. thailandensis DW503 induces the transcription and translation of viable bacteriophage in a RecA-dependent manner. This is the first report of the SOS response in Burkholderia spp. to DNA-damaging agents. We have identified both common and unique adaptive responses of B. thailandensis to chemical stress and DNA damage.

  2. Characterizing novel endogenous retroviruses from genetic variation inferred from short sequence reads

    DEFF Research Database (Denmark)

    Mourier, Tobias; Mollerup, Sarah; Vinner, Lasse

    2015-01-01

    From Illumina sequencing of DNA from brain and liver tissue from the lion, Panthera leo, and tumor samples from the pike-perch, Sander lucioperca, we obtained two assembled sequence contigs with similarity to known retroviruses. Phylogenetic analyses suggest that the pike-perch retrovirus belongs to the epsilonretroviruses, and the lion retrovirus to the gammaretroviruses. To determine if these novel retroviral sequences originate from an endogenous retrovirus or from a recently integrated exogenous retrovirus, we assessed the genetic diversity of the parental sequences from which the short Illumina reads...

  3. XplorSeq: a software environment for integrated management and phylogenetic analysis of metagenomic sequence data.

    Science.gov (United States)

    Frank, Daniel N

    2008-10-07

    Advances in automated DNA sequencing technology have accelerated the generation of metagenomic DNA sequences, especially environmental ribosomal RNA gene (rDNA) sequences. As the scale of rDNA-based studies of microbial ecology has expanded, need has arisen for software that is capable of managing, annotating, and analyzing the plethora of diverse data accumulated in these projects. XplorSeq is a software package that facilitates the compilation, management and phylogenetic analysis of DNA sequences. XplorSeq was developed for, but is not limited to, high-throughput analysis of environmental rRNA gene sequences. XplorSeq integrates and extends several commonly used UNIX-based analysis tools by use of a Macintosh OS-X-based graphical user interface (GUI). Through this GUI, users may perform basic sequence import and assembly steps (base-calling, vector/primer trimming, contig assembly), perform BLAST (Basic Local Alignment Search Tool) searches of NCBI and local databases, create multiple sequence alignments, build phylogenetic trees, assemble Operational Taxonomic Units, estimate biodiversity indices, and summarize data in a variety of formats. Furthermore, sequences may be annotated with user-specified meta-data, which then can be used to sort data and organize analyses and reports. A document-based architecture permits parallel analysis of sequence data from multiple clones or amplicons, with sequences and other data stored in a single file. XplorSeq should benefit researchers who are engaged in analyses of environmental sequence data, especially those with little experience using bioinformatics software. Although XplorSeq was developed for management of rDNA sequence data, it can be applied to almost any sequencing project. The application is available free of charge for non-commercial use at http://vent.colorado.edu/phyloware.

  4. Molecular characterizations of somatic hybrids developed between Pleurotus florida and Lentinus squarrosulus through inter-simple sequence repeat markers and sequencing of ribosomal RNA-ITS gene.

    Science.gov (United States)

    Mallick, Pijush; Chattaraj, Shruti; Sikdar, Samir Ranjan

    2017-10-01

    The 12 pfls somatic hybrids and 2 parents of Pleurotus florida and Lentinus squarrosulus were characterized by ISSR and sequencing of rRNA-ITS genes. Five ISSR primers were used and amplified a total of 54 reproducible fragments with 98.14% polymorphism among all the pfls hybrid populations and parental strains. UPGMA-based cluster exhibited a dendrogram with three major groups between the parents and pfls hybrids. Parent P. florida and L. squarrosulus showed different degrees of genetic distance with all the hybrid lines and they showed closeness to hybrid pfls 1m and pfls 1h, respectively. ITS1(F) and ITS4(R) amplified the rRNA-ITS gene with 611-867 bp sequence length. The nucleotide polymorphisms were found in the ITS1, ITS2 and 5.8S rRNA region with different number of bases. Based on rRNA-ITS sequence, UPGMA cluster exhibited three distinct groups between L. squarrosulus and pfls 1p, pfls 1m and pfls 1s, and pfls 1e and P. florida.

  5. Development and Characterization of Simple Sequence Repeat (SSR) Markers Based on RNA-Sequencing of Medicago sativa and In silico Mapping onto the M. truncatula Genome

    Science.gov (United States)

    Wang, Zan; Yu, Guohui; Shi, Binbin; Wang, Xuemin; Qiang, Haiping; Gao, Hongwen

    2014-01-01

    Sufficient codominant genetic markers are needed for various genetic investigations in alfalfa since the species is an outcrossing autotetraploid. With the newly developed next generation sequencing technology, a large amount of transcribed sequences of alfalfa have been generated and are available for identifying SSR markers by data mining. A total of 54,278 alfalfa non-redundant unigenes were assembled through the Illumina HiSeq™ 2000 sequencing technology. Based on 3,903 unigene sequences, 4,493 SSRs were identified. Tri-nucleotide repeats (56.71%) were the most abundant motif class while AG/CT (21.7%), AGG/CCT (19.8%), AAC/GTT (10.3%), ATC/ATG (8.8%), and ACC/GGT (6.3%) were the subsequent top five nucleotide repeat motifs. Eight hundred and thirty-seven EST-SSR primer pairs were successfully designed. Of these, 527 (63%) primer pairs yielded clear and scored PCR products and 372 (70.6%) exhibited polymorphisms. High transferability was observed: 99.2% (523) in ssp. falcata and 71.7% (378) in M. truncatula. In addition, 313 of 527 SSR marker sequences were in silico mapped onto the eight M. truncatula chromosomes. Thirty-six polymorphic SSR primer pairs were used in the genetic relatedness analysis of 30 Chinese alfalfa cultivated accessions, generating a total of 199 scored alleles. The mean observed heterozygosity and polymorphic information content were 0.767 and 0.635, respectively. The codominant markers not only enrich the current resources of molecular markers in alfalfa, but would also facilitate targeted investigations in marker-trait association, QTL mapping, and genetic diversity analysis in alfalfa. PMID:24642969
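
    Identifying SSRs by data mining, as described above, usually means scanning each unigene for short motifs repeated in tandem. The abstract does not name the tool or the thresholds used, so the following is only a sketch of the general idea using regular expressions. The function name `find_ssrs` and the minimum repeat counts (6 for di-, 5 for tri-, 4 for tetra-nucleotide motifs) are assumptions made for illustration, as is the test sequence.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem repeats.
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example AAA, or AGAG which is really an AG repeat).
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Illustrative sequence: an (AG)7 repeat and an (AAC)5 repeat.
example_seq = "TTT" + "AG" * 7 + "CCGT" + "AAC" * 5 + "G"
example_hits = find_ssrs(example_seq)
```

    Positions are 0-based. Dedicated SSR-mining tools also handle compound and imperfect repeats, which this sketch ignores.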

  6. Phenomenological uncertainty analysis of containment building pressure load caused by severe accident sequences

    International Nuclear Information System (INIS)

    Park, S.Y.; Ahn, K.I.

    2014-01-01

    Highlights: • Phenomenological uncertainty analysis has been applied to level 2 PSA. • The methodology provides an alternative to simple deterministic analyses and sensitivity studies. • A realistic evaluation provides a more complete characterization of risks. • Uncertain parameters of MAAP code for the early containment failure were identified. - Abstract: This paper illustrates an application of a severe accident analysis code, MAAP, to the uncertainty evaluation of early containment failure scenarios employed in the containment event tree (CET) model of a reference plant. An uncertainty analysis of containment pressure behavior during severe accidents has been performed for an optimum assessment of an early containment failure model. The present application is mainly focused on determining an estimate of the containment building pressure load caused by severe accident sequences of a nuclear power plant. Key modeling parameters and phenomenological models employed for the present uncertainty analysis are closely related to the in-vessel hydrogen generation, direct containment heating, and gas combustion. The basic approach of this methodology is to (1) develop severe accident scenarios for which containment pressure loads should be performed based on a level 2 PSA, (2) identify severe accident phenomena relevant to an early containment failure, (3) identify the MAAP input parameters, sensitivity coefficients, and modeling options that describe or influence the early containment failure phenomena, (4) prescribe the likelihood descriptions of the potential range of these parameters, and (5) evaluate the code predictions using a number of random combinations of parameter inputs sampled from the likelihood distributions
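
    Steps (4) and (5) of the approach above, prescribing likelihood distributions for uncertain parameters and evaluating the code over random combinations of them, follow the general pattern of Monte Carlo uncertainty propagation. The sketch below shows only that pattern. The function `toy_pressure_load` is a made-up stand-in and is not MAAP or any model from the paper; the parameter names, distributions, ranges and units are likewise hypothetical.

```python
import random


def toy_pressure_load(h2_fraction, dch_fraction, burn_completeness):
    """Made-up response function standing in for one code run (NOT MAAP)."""
    base = 0.30  # arbitrary baseline load, arbitrary units
    return base + 0.5 * h2_fraction * burn_completeness + 0.2 * dch_fraction


def sample_loads(n, seed=0):
    """Draw n random parameter combinations and return the sorted loads."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    loads = []
    for _ in range(n):
        h2 = rng.uniform(0.2, 0.8)           # hypothetical range
        dch = rng.triangular(0.0, 0.5, 0.1)  # low, high, mode (hypothetical)
        burn = rng.uniform(0.5, 1.0)         # hypothetical range
        loads.append(toy_pressure_load(h2, dch, burn))
    return sorted(loads)


def percentile(sorted_values, q):
    """Nearest-rank style percentile of an already sorted list, 0 <= q <= 1."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


loads = sample_loads(1000)
summary = {q: percentile(loads, q) for q in (0.05, 0.50, 0.95)}
```

    The spread between the 5th and 95th percentiles in `summary` is the kind of uncertainty band that replaces a single deterministic estimate.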

  7. Sequence analysis of the Legionella micdadei groELS operon

    DEFF Research Database (Denmark)

    Hindersson, P; Høiby, N; Bangsborg, Jette Marie

    1991-01-01

    A 2.7 kb DNA fragment encoding the 60 kDa common antigen (CA) and a 13 kDa protein of Legionella micdadei was sequenced. Two open reading frames of 57,677 and 10,456 Da were identified, corresponding to the heat shock proteins GroEL and GroES, respectively. Typical -35, -10, and Shine-Dalgarno heat...

  8. Copolymers of N-cyclohexylacrylamide and n-butyl acrylate: synthesis, characterization, monomer reactivity ratios and mean sequence length

    Directory of Open Access Journals (Sweden)

    2007-06-01

    Full Text Available Copolymerization of N-cyclohexylacrylamide (NCHA) and n-butyl acrylate (BA) was carried out in dimethylformamide at 55±1°C using azobisisobutyronitrile as a free radical initiator. The copolymers were characterized by 1H-NMR spectroscopy and the copolymer compositions were determined by 1H-NMR analysis. The reactivity ratios of the monomers were determined by both linear and non-linear methods: the linear methods were Fineman-Ross (r1 = 0.37 and r2 = 1.77), Kelen-Tudos (r1 = 0.38 and r2 = 1.77), extended Kelen-Tudos (r1 = 0.37 and r2 = 1.75) and Yezrielev-Brokhina-Roskin (r1 = 0.37 and r2 = 1.77), and the non-linear methods were Tidwell-Mortimer (r1 = 0.37 and r2 = 1.76) and ProCop (r1 = 0.36 and r2 = 1.82). The Q and e values for NCHA are 0.67 and 0.68, respectively. Mean sequence lengths of the copolymers were estimated from the r1 and r2 values. They show that the number of BA units in the polymer chain increases linearly as the concentration of BA in the monomer feed increases.
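
    Mean sequence lengths are typically obtained from reactivity ratios through the terminal (Mayo-Lewis) model: l1 = 1 + r1·[M1]/[M2] and l2 = 1 + r2·[M2]/[M1]. The abstract does not give the exact expressions used, so the sketch below assumes that model, and also assumes that monomer 1 is NCHA and monomer 2 is BA (the order in which they are named). The function names are hypothetical.

```python
def copolymer_composition(f1, r1, r2):
    """Mayo-Lewis instantaneous mole fraction of monomer 1 in the copolymer.

    f1 is the mole fraction of monomer 1 in the feed (0 < f1 < 1).
    """
    f2 = 1.0 - f1
    numerator = r1 * f1 ** 2 + f1 * f2
    denominator = r1 * f1 ** 2 + 2.0 * f1 * f2 + r2 * f2 ** 2
    return numerator / denominator


def mean_sequence_lengths(f1, r1, r2):
    """Mean run lengths (l1, l2) of monomer 1 and monomer 2 units."""
    f2 = 1.0 - f1
    return 1.0 + r1 * f1 / f2, 1.0 + r2 * f2 / f1


# Fineman-Ross values quoted in the abstract; equimolar feed for illustration.
R1_NCHA, R2_BA = 0.37, 1.77
l_ncha, l_ba = mean_sequence_lengths(0.5, R1_NCHA, R2_BA)
F_ncha = copolymer_composition(0.5, R1_NCHA, R2_BA)
```

    Under this model l2 depends linearly on the feed ratio [M2]/[M1], which is consistent with the linear growth of BA runs reported above.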

  9. The Matrix Method of Representation, Analysis and Classification of Long Genetic Sequences

    Directory of Open Access Journals (Sweden)

    Ivan V. Stepanyan

    2017-01-01

    Full Text Available The article is devoted to a matrix method of comparative analysis of long nucleotide sequences by means of presenting each sequence in the form of three digital binary sequences. This method uses a set of symmetries of biochemical attributes of nucleotides. It also uses the possibility of presentation of every whole set of N-mers as one of the members of a Kronecker family of genetic matrices. With this method, a long nucleotide sequence can be visually represented as an individual fractal-like mosaic or another regular mosaic of binary type. In contrast to natural nucleotide sequences, artificial random sequences give non-regular patterns. Examples of binary mosaics of long nucleotide sequences are shown, including cases of human chromosomes and penicillins. The obtained results are then discussed.
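
    One way to present a nucleotide sequence as three binary sequences is to use three two-way groupings of the bases by biochemical attribute. The abstract does not list the attributes used, so the groupings below (purine vs pyrimidine, amino vs keto, strong vs weak hydrogen bonding) are an assumption made for illustration, and the function name `binary_tracks` is hypothetical.

```python
PURINES = frozenset("AG")  # versus pyrimidines C and T
AMINO = frozenset("AC")    # versus keto bases G and T
STRONG = frozenset("CG")   # three hydrogen bonds, versus two for A and T


def binary_tracks(seq):
    """Return three 0/1 lists, one per attribute, for a DNA sequence."""
    seq = seq.upper()
    unknown = set(seq) - set("ACGT")
    if unknown:
        raise ValueError("unsupported symbols: %s" % sorted(unknown))
    return (
        [1 if base in PURINES else 0 for base in seq],
        [1 if base in AMINO else 0 for base in seq],
        [1 if base in STRONG else 0 for base in seq],
    )


purine_track, amino_track, strong_track = binary_tracks("ACGT")
```

    Each track can then be cut into N-mers and laid out as a two-dimensional binary mosaic; that layout step is not shown here.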

  10. OPTSDNA: Performance evaluation of an efficient distributed bioinformatics system for DNA sequence analysis.

    Science.gov (United States)

    Khan, Mohammad Ibrahim; Sheel, Chotan

    2013-01-01

    Storage of sequence data is a major concern because the amount of data generated at multiple locations grows exponentially. There is therefore a need for techniques that store data using a compression algorithm. Here we describe an optimal storage algorithm (OPTSDNA) for storing large numbers of DNA sequences of varying length. This paper provides a performance analysis of OPTSDNA within a distributed bioinformatics computing system for the analysis of DNA sequences. The algorithm was used to store DNA sequences ranging in size from very small to very large in a database, and both storage size and response time were calculated. The efficiency and performance of the algorithm, measured as percentage storage size, are high when compared with other known sequential approaches.
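
    The abstract does not describe how OPTSDNA encodes sequences. As a generic illustration of why compact DNA storage is possible at all, the sketch below packs each base into 2 bits, so four bases fit in one byte. This is a textbook technique and is not claimed to be the OPTSDNA algorithm; the names `pack` and `unpack` are hypothetical.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack an ACGT string into bytes; returns (length, packed_bytes)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return len(seq), bytes(out)


def unpack(length, data):
    """Inverse of pack: recover the original string of `length` bases."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


n_bases, packed = pack("ACGTAC")
restored = unpack(n_bases, packed)
```

    The original length is stored alongside the bytes so that padding in the last byte is not mistaken for bases. Storage drops to roughly a quarter of one byte per base for sequences that contain only A, C, G and T.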

  11. Targeted next-generation sequencing analysis identifies novel mutations in families with severe familial exudative vitreoretinopathy

    Science.gov (United States)

    Huang, Xiao-Yan; Zhuang, Hong; Wu, Ji-Hong; Li, Jian-Kang; Hu, Fang-Yuan; Zheng, Yu; Tellier, Laurent Christian Asker M.; Zhang, Sheng-Hai; Gao, Feng-Juan; Zhang, Jian-Guo

    2017-01-01

    Purpose: Familial exudative vitreoretinopathy (FEVR) is a genetically and clinically heterogeneous disease, characterized by failure of vascular development of the peripheral retina. The symptoms of FEVR vary widely among patients in the same family, and even between the two eyes of a given patient. This study was designed to identify the genetic defect in a patient cohort of ten Chinese families with a definitive diagnosis of FEVR. Methods: To identify the causative gene, next-generation sequencing (NGS)-based target capture sequencing was performed. Segregation analysis of the candidate variant was performed in additional family members by using Sanger sequencing and quantitative real-time PCR (QPCR). Results: Of the cohort of ten FEVR families, six pathogenic variants were identified, including four novel and two known heterozygous mutations. Of the variants identified, four were missense variants, and two were novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del]. The two novel heterozygous deletion mutations were not observed in the control subjects and could give rise to a relatively severe FEVR phenotype, which could be explained by the protein function prediction. Conclusions: We identified two novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del] using targeted NGS as causative mutations for FEVR. These genetic deletion variations exhibit a severe form of FEVR, with tractional retinal detachments compared with other known point mutations. The data further enrich the mutation spectrum of FEVR and enhance our understanding of genotype–phenotype correlations to provide useful information for disease diagnosis, prognosis, and effective genetic counseling. PMID:28867931

  12. Characterizing the Genetic Basis for Nicotine Induced Cancer Development: A Transcriptome Sequencing Study.

    Directory of Open Access Journals (Sweden)

    Jasmin H Bavarva

    Full Text Available Nicotine is a known risk factor for cancer development and has been shown to alter gene expression in cells and tissue upon exposure. We used Illumina® Next Generation Sequencing (NGS) technology to gain unbiased biological insight into the transcriptome of normal epithelial cells (MCF-10A) to nicotine exposure. We generated expression data from 54,699 transcripts using triplicates of control and nicotine stressed cells. As a result, we identified 138 differentially expressed transcripts, including 39 uncharacterized genes. Additionally, 173 transcripts that are primarily associated with DNA replication, recombination, and repair showed evidence for alternative splicing. We discovered the greatest nicotine stress response by HPCAL4 (up-regulated by 4.71 fold) and NPAS3 (down-regulated by -2.73 fold); both are genes that have not been previously implicated in nicotine exposure but are linked to cancer. We also discovered significant down-regulation (-2.3 fold) and alternative splicing of NEAT1 (lncRNA) that may have an important, yet undiscovered regulatory role. Gene ontology analysis revealed nicotine exposure influenced genes involved in cellular and metabolic processes. This study reveals previously unknown consequences of nicotine stress on the transcriptome of normal breast epithelial cells and provides insight into the underlying biological influence of nicotine on normal cells, marking the foundation for future studies.

  13. Characterization of microbial biofilms in a thermophilic biogas system by high-throughput metagenome sequencing.

    Science.gov (United States)

    Rademacher, Antje; Zakrzewski, Martha; Schlüter, Andreas; Schönberg, Mandy; Szczepanowski, Rafael; Goesmann, Alexander; Pühler, Alfred; Klocke, Michael

    2012-03-01

    DNAs of two biofilms of a thermophilic two-phase leach-bed biogas reactor fed with rye silage and winter barley straw were sequenced by 454-pyrosequencing technology to assess the biofilm-based microbial community and their genetic potential for anaerobic digestion. The studied biofilms matured on the surface of the substrates in the hydrolysis reactor (HR) and on the packing in the anaerobic filter reactor (AF). The classification of metagenome reads showed Clostridium as most prevalent bacteria in the HR, indicating a predominant role for plant material digestion. Notably, insights into the genetic potential of plant-degrading bacteria were determined as well as further bacterial groups, which may assist Clostridium in carbohydrate degradation. Methanosarcina and Methanothermobacter were determined as most prevalent methanogenic archaea. In consequence, the biofilm-based methanogenesis in this system might be driven by the hydrogenotrophic pathway but also by the aceticlastic methanogenesis depending on metabolite concentrations such as the acetic acid concentration. Moreover, bacteria, which are capable of acetate oxidation in syntrophic interaction with methanogens, were also predicted. Finally, the metagenome analysis unveiled a large number of reads with unidentified microbial origin, indicating that the anaerobic degradation process may also be conducted by up to now unknown species. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

  14. Isolation and characterization of microsatellite markers for Dendranthema morifolium (Asteraceae) using next-generation sequencing.

    Science.gov (United States)

    Yuan, W-J; Ye, S; Du, L-H; Li, S-M; Miao, X; Shang, F-D

    2016-10-05

    Dendranthema morifolium (Asteraceae) is a perennial herbaceous plant native to China. A long history of artificial crossings may have resulted in a complex genetic background and decreased genetic diversity. To protect the genetic diversity of D. morifolium and enable breeding of new D. morifolium cultivars, we developed a set of molecular markers. We used pyrosequencing of an enriched microsatellite library on the Roche 454 FLX+ platform to isolate D. morifolium simple sequence repeats (SSRs). A total of 32,863 raw reads containing 2251 SSRs were obtained. To test the effectiveness of these SSR markers, we designed primers by randomly selecting 100 novel SSRs, and amplified them across 60 cultivars representing five different petal shape groups. Sixteen SSRs were polymorphic, with the number of alleles ranging from 6 to 19, and their expected and observed heterozygosities ranging from 0.477 to 0.848 and from 0.250 to 0.804, respectively. The polymorphism information content ranged from 0.459 to 0.854 and the inbreeding coefficient ranged from -0.119 to 0.759. An unweighted pair-group method with arithmetic average (UPGMA) analysis was performed to survey the phylogenetic relationships of these 60 cultivars, and five clusters were identified. These markers can be used for investigating genetic relationships and identifying elite alleles through linkage and association analyses.
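
    The statistics reported above (expected heterozygosity, polymorphism information content and inbreeding coefficient) are all functions of allele frequencies at a locus. The abstract does not state which estimators were used, so the sketch below assumes the common textbook forms: He = 1 − Σp², PIC = He − ΣΣ 2p_i²p_j² over allele pairs i < j, and F = 1 − Ho/He, with no sample-size correction. The function names and the example frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = He minus the sum over allele pairs of 2 * p_i^2 * p_j^2."""
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return expected_heterozygosity(freqs) - cross


def inbreeding_coefficient(h_observed, h_expected):
    """F = 1 - Ho / He; negative values indicate excess heterozygotes."""
    return 1.0 - h_observed / h_expected


# Illustrative locus with two equally frequent alleles.
allele_freqs = [0.5, 0.5]
he = expected_heterozygosity(allele_freqs)
pic = polymorphism_information_content(allele_freqs)
f_is = inbreeding_coefficient(0.25, he)
```

    With more alleles at more even frequencies both He and PIC approach 1, which is why loci with 6 to 19 alleles can reach values above 0.8.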

  15. Characterization of Three Mycobacterium spp. with Potential Use in Bioremediation by Genome Sequencing and Comparative Genomics.

    Science.gov (United States)

    Das, Sarbashis; Pettersson, B M Fredrik; Behra, Phani Rama Krishna; Ramesh, Malavika; Dasgupta, Santanu; Bhattacharya, Alok; Kirsebom, Leif A

    2015-06-16

    We provide the genome sequences of the type strains of the polychlorophenol-degrading Mycobacterium chlorophenolicum (DSM43826), the degrader of chlorinated aliphatics Mycobacterium chubuense (DSM44219) and Mycobacterium obuense (DSM44075) that has been tested for use in cancer immunotherapy. The genome sizes of M. chlorophenolicum, M. chubuense, and M. obuense are 6.93, 5.95, and 5.58 Mb with GC-contents of 68.4%, 69.2%, and 67.9%, respectively. Comparative genomic analysis revealed that 3,254 genes are common and we predicted approximately 250 genes acquired through horizontal gene transfer from different sources including proteobacteria. The data also showed that the biodegrading Mycobacterium spp. NBB4, also referred to as M. chubuense NBB4, is distantly related to the M. chubuense type strain and should be considered a separate species; we suggest that it be named Mycobacterium ethylenense NBB4. Among different categories, we identified genes with potential roles in biodegradation of aromatic compounds and copper homeostasis. These are the first nonpathogenic Mycobacterium spp. found harboring genes involved in copper homeostasis. These findings would therefore provide insight into the role of this group of Mycobacterium spp. in bioremediation as well as the evolution of copper homeostasis within the Mycobacterium genus. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  16. Sequencing and Characterization of Novel PII Signaling Protein Gene in Microalga Haematococcus pluvialis

    Directory of Open Access Journals (Sweden)

    Ruijuan Ma

    2017-10-01

    Full Text Available The PII signaling protein is a key protein for controlling nitrogen assimilatory reactions in most organisms, but little information is reported on PII proteins of the green microalga Haematococcus pluvialis. Since H. pluvialis cells can produce a large amount of astaxanthin upon nitrogen starvation, its PII protein may represent an important factor in elevated production of Haematococcus astaxanthin. This study identified and isolated the coding gene (HpGLB1) from this microalga. The full length of HpGLB1 was 1222 bp, including a 621 bp coding sequence (CDS), a 103 bp 5′ untranslated region (5′ UTR), and a 498 bp 3′ untranslated region (3′ UTR). The CDS could encode a protein of 206 amino acids (HpPII). Its calculated molecular weight (Mw) was 22.4 kDa and the theoretical isoelectric point was 9.53. When H. pluvialis cells were exposed to nitrogen starvation, HpGLB1 expression increased 2.46 times in 48 h, concomitant with the rise in astaxanthin content. This study also used phylogenetic analysis to show that HpPII was homologous to the PII proteins of other green microalgae. The results form a fundamental basis for future study of HpPII and its potential physiological function in Haematococcus astaxanthin biosynthesis.
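
    The lengths quoted above are mutually consistent, which can be checked with simple arithmetic: the three regions sum to the full length, and a 621 bp CDS holds 207 codons, of which the last is presumably the stop codon, leaving 206 residues. The sketch below performs that check. The function name `transcript_summary` is hypothetical, and the assumption that the CDS length includes one stop codon is mine, not stated in the abstract.

```python
def transcript_summary(utr5_bp, cds_bp, utr3_bp):
    """Summarise a transcript from region lengths given in base pairs.

    Assumes the CDS length includes exactly one terminal stop codon.
    """
    if cds_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    codons = cds_bp // 3
    return {
        "total_bp": utr5_bp + cds_bp + utr3_bp,
        "codons": codons,
        "residues": codons - 1,  # stop codon does not encode a residue
    }


hpglb1 = transcript_summary(utr5_bp=103, cds_bp=621, utr3_bp=498)
```

    The same check is a quick way to catch transcription errors when region lengths are copied from an annotation into a report.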

  17. Prevalence, complete genome sequencing and phylogenetic analysis of porcine deltacoronavirus in South Korea, 2014-2016.

    Science.gov (United States)

    Jang, G; Lee, K-K; Kim, S-H; Lee, C

    2017-10-01

    Porcine deltacoronavirus (PDCoV) is a newly emerged enterotropic swine coronavirus that causes enteritis and diarrhoea in piglets. Here, a nested reverse transcription (RT)-PCR approach for the detection of PDCoV was developed to identify and characterize aetiologic agent(s) associated with diarrhoeal diseases in piglets in South Korea. A PCR-based method was applied to investigate the presence of PDCoV in 683 diarrhoeic samples collected from 449 commercial pig farms in South Korea from January 2014 to December 2016. The molecular-based survey indicated a relatively high prevalence of PDCoV (19.03%) in South Korea. Among those, the monoinfection of PDCoV (9.66%) and co-infection of PDCoV (6.30%) with porcine epidemic diarrhoea virus (PEDV) were predominant in diarrhoeal samples. The full-length genomes or the complete spike genes of the most recent strains identified in 2016 (KNU16-07, KNU16-08 and KNU16-11) were sequenced and analysed to characterize PDCoV currently prevalent in South Korea. We found a single insertion-deletion signature and dozens of genetic changes in the spike (S) genes of the KNU16 isolates. Phylogenetic analysis based on the entire genome and spike protein sequences of these strains indicated that they are most closely related to other Korean isolates grouped with the US strains. However, Korean PDCoV strains formed different branches within the same cluster, implying continuous evolution in the field. Our data will advance the understanding of the molecular epidemiology and evolutionary characteristics of PDCoV circulating in South Korea. © 2017 Blackwell Verlag GmbH.
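
    The abstract gives percentages but not sample counts. As a hedged illustration, the sketch below back-calculates the counts implied by each percentage, assuming all three refer to the 683 diarrhoeic samples; the counts are inferred, not reported, and the function names are hypothetical.

```python
TOTAL_SAMPLES = 683  # diarrhoeic samples, as stated in the abstract


def implied_count(percent: float, total: int) -> int:
    """Nearest whole number of samples matching a reported percentage."""
    return round(percent / 100 * total)


def prevalence(count: int, total: int) -> float:
    """Prevalence in percent, rounded to two decimals as in the abstract."""
    return round(100 * count / total, 2)


reported = [
    ("PDCoV positive", 19.03),
    ("PDCoV monoinfection", 9.66),
    ("PDCoV + PEDV co-infection", 6.30),
]

for label, percent in reported:
    n = implied_count(percent, TOTAL_SAMPLES)
    # Recompute the percentage from the inferred count as a consistency check.
    print(f"{label}: ~{n} samples, {prevalence(n, TOTAL_SAMPLES):.2f}%")
```

    Each inferred count reproduces the reported percentage to two decimals, which is consistent with a denominator of 683 but does not prove that this was the denominator used.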

  18. Characterization of expressed sequence tag-derived simple sequence repeat markers for Aspergillus flavus: emphasis on variability of isolates from the southern United States.

    Science.gov (United States)

    Wang, Xinwang; Wadl, Phillip A; Wood-Jones, Alicia; Windham, Gary; Trigiano, Robert N; Scruggs, Mary; Pilgrim, Candace; Baird, Richard

    2012-12-01

    Simple sequence repeat (SSR) markers were developed from the Aspergillus flavus expressed sequence tag (EST) database to conduct an analysis of genetic relationships of Aspergillus isolates from numerous host species and geographical regions, but primarily from the United States. Twenty-nine primers were designed from 362 tri-nucleotide EST-SSR sequences. Eighteen polymorphic loci were used to genotype 96 Aspergillus species isolates. The number of alleles detected per locus ranged from 2 to 24 with a mean of 8.2 alleles. Haploid diversity ranged from 0.28 to 0.91. A genetic distance matrix was used to perform principal coordinates analysis (PCA) and to generate dendrograms using the unweighted pair group method with arithmetic mean (UPGMA). Two principal coordinates explained more than 75% of the total variation among the isolates. One clade was identified for A. flavus isolates (n = 87) with the other Aspergillus species (n = 7) using PCA, but five distinct clusters were present when the other taxa were excluded from the analysis. Six groups were noted when the EST-SSR data were compared using UPGMA. However, the latter PCA or UPGMA comparison resulted in no direct associations with host species, geographical region or aflatoxin production. Furthermore, there was no direct correlation to visible morphological features such as sclerotial types. The isolates from the Mississippi Delta region, which contained the largest percentage of isolates, did not show any unusual clustering except for isolates K32, K55, and 199. Further studies of these three isolates are warranted to evaluate their pathogenicity, aflatoxin production potential, additional gene sequences (e.g., RPB2), and morphological comparisons.
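
    The abstract does not say how haploid diversity was calculated. A common definition is h = 1 - sum(p_i^2), where p_i is the frequency of allele i at a locus; the sketch below assumes that uncorrected form, and the allele sizes are made-up toy values, not data from the study.

```python
from collections import Counter


def haploid_diversity(alleles):
    """Uncorrected haploid diversity h = 1 - sum(p_i ** 2) for one locus.

    `alleles` holds one allele call per haploid isolate.
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("no allele calls supplied")
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical fragment sizes (bp) scored at one SSR locus in four isolates.
toy_locus = ["180", "180", "183", "186"]
print(haploid_diversity(toy_locus))
```

    Some software applies a small-sample correction factor of n / (n - 1); whether the study used it is not stated, so values from this sketch may not match the published range exactly.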

  19. RetroTector online, a rational tool for analysis of retroviral elements in small and medium size vertebrate genomic sequences

    Directory of Open Access Journals (Sweden)

    Benachenhou Farid

    2009-06-01

    Background: The rapid accumulation of genomic information in databases necessitates rapid and specific algorithms for extracting biologically meaningful information. More or less complete retroviral sequences, also called proviral or endogenous retroviral sequences (ERVs), constitute at least 5% of vertebrate genomes. After infecting the host, these retroviruses have integrated in germ line cells, and have then been carried in genomes for up to several hundred million years. A better understanding of the structure and function of these sequences can have profound biological and medical consequences. Methods: RetroTector© (ReTe) is a platform-independent Java program for identification and characterization of proviral sequences in vertebrate genomes. The full ReTe requires a local installation with a MySQL database. Although not overly complicated, the installation may take some time. A "light" version of ReTe (RetroTector online; ROL), which does not require specific installation procedures, is provided via the World Wide Web. Results: ROL (http://www.fysiologi.neuro.uu.se/jbgs/) was implemented under the Batchelor web interface (A Lövgren et al.). It accepts sequences (5 to 10 000 kilobases) by GenBank accession number, file, or FASTA cut-and-paste. Up to ten submissions can be done simultaneously, allowing batch analysis of ... Discussion: Proviral sequences can be hard to recognize, especially if the integration occurred many million years ago. Precise delineation of LTR, gag, pro, pol and env can be difficult, requiring manual work. ROL is a way of simplifying these tasks. Conclusion: ROL provides (1) annotation and presentation of known retroviral sequences and (2) detection of proviral chains in unknown genomic sequences, with up to 100 Mbase per submission.

  20. Analysis of xylem formation in pine by cDNA sequencing

    Science.gov (United States)

    Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.

    1998-01-01

    Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.
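
    The cluster counts in this abstract are internally consistent, as the short check below illustrates. The derived quantities (putative unique transcripts, mean cluster size, clustered fraction) are not reported in the abstract; they are computed here only as an illustration, and the variable names are hypothetical.

```python
# Counts reported in the abstract.
TOTAL_ESTS = 1097
CLUSTERS = 107
ESTS_IN_CLUSTERS = 361
SINGLETONS = 736

# Clustered ESTs plus singletons should account for every sequence.
accounted_for = ESTS_IN_CLUSTERS + SINGLETONS

# Derived values (not reported in the abstract): each cluster and each
# singleton is treated as one putative unique transcript.
putative_unique = CLUSTERS + SINGLETONS
mean_cluster_size = round(ESTS_IN_CLUSTERS / CLUSTERS, 2)
percent_clustered = round(100 * ESTS_IN_CLUSTERS / TOTAL_ESTS, 1)

print("all ESTs accounted for:", accounted_for == TOTAL_ESTS)
print("putative unique transcripts:", putative_unique)
print("mean cluster size:", mean_cluster_size)
print("percent of ESTs in clusters:", percent_clustered)
```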

  1. MiSeq: A Next Generation Sequencing Platform for Genomic Analysis.

    Science.gov (United States)

    Ravi, Rupesh Kanchi; Walton, Kendra; Khosroheidari, Mahdieh

    2018-01-01

    MiSeq, Illumina's integrated next generation sequencing instrument, uses reversible-terminator sequencing-by-synthesis technology to provide end-to-end sequencing solutions. The MiSeq instrument is one of the smallest benchtop sequencers that can perform onboard cluster generation, amplification, genomic DNA sequencing, and data analysis, including base calling, alignment and variant calling, in a single run. It performs both single- and paired-end runs with adjustable read lengths from 1 × 36 base pairs to 2 × 300 base pairs. A single run can produce output data of up to 15 Gb in as little as 4 h of runtime and can output up to 25 M single reads and 50 M paired-end reads. Thus, MiSeq provides an ideal platform for rapid turnaround time. MiSeq is also a cost-effective tool for various analyses focused on targeted gene sequencing (amplicon sequencing and target enrichment), metagenomics, and gene expression studies. For these reasons, MiSeq has become one of the most widely used next generation sequencing platforms. Here, we provide a protocol to prepare libraries for sequencing using the MiSeq instrument and basic guidelines for analysis of output data from the MiSeq sequencing run.
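
    The quoted maximum output can be reproduced from the read figures in the abstract. The sketch below assumes that "25 M single reads and 50 M paired-end reads" means about 25 million clusters, each sequenced once (single-end) or twice (paired-end); that reading is an interpretation, and `run_yield_gb` is a hypothetical helper, not an Illumina tool.

```python
def run_yield_gb(clusters: int, reads_per_cluster: int, read_length: int) -> float:
    """Total bases in gigabases (1 Gb = 1e9 bases) for one run."""
    return clusters * reads_per_cluster * read_length / 1e9


CLUSTERS = 25_000_000  # assumed from "25 M single reads"

# Longest configuration quoted in the abstract: 2 x 300 bp paired-end.
print("2 x 300 bp:", run_yield_gb(CLUSTERS, 2, 300), "Gb")

# Shortest configuration quoted in the abstract: 1 x 36 bp single-end.
print("1 x 36 bp:", run_yield_gb(CLUSTERS, 1, 36), "Gb")
```

    Under this assumption the 2 × 300 bp configuration gives 15 Gb, matching the "up to 15 Gb" figure, while the 1 × 36 bp configuration yields under 1 Gb.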

  2. Characterization of microflora in Latin-style cheeses by next-generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Lusk Tina S

    2012-11-01

    Background: Cheese contamination can occur at numerous stages in the manufacturing process including the use of improperly pasteurized or raw milk. Of concern is the potential contamination by Listeria monocytogenes and other pathogenic bacteria that find the high moisture levels and moderate pH of popular Latin-style cheeses like queso fresco a hospitable environment. In the investigation of a foodborne outbreak, samples typically undergo enrichment in broth for 24 hours followed by selective agar plating to isolate bacterial colonies for confirmatory testing. The broth enrichment step may also enable background microflora to proliferate, which can confound subsequent analysis if not inhibited by effective broth or agar additives. We used 16S rRNA gene sequencing to provide a preliminary survey of bacterial species associated with three brands of Latin-style cheeses after 24-hour broth enrichment. Results: Brand A showed a greater diversity than the other two cheese brands (Brands B and C) at nearly every taxonomic level except phylum. Brand B showed the least diversity and was dominated by a single bacterial taxon, Exiguobacterium, not previously reported in cheese. This genus was also found in Brand C, although Lactococcus was prominent, an expected finding since this bacterium belongs to the group of lactic acid bacteria (LAB) commonly found in fermented foods. Conclusions: The contrasting diversity observed in Latin-style cheese was surprising, demonstrating that despite similarity of cheese type, raw materials and cheese making conditions appear to play a critical role in the microflora composition of the final product. The high bacterial diversity associated with Brand A suggests it may have been prepared with raw materials of high bacterial diversity or influenced by the ecology of the processing environment. Additionally, the presence of Exiguobacterium in high proportions (96%) in Brand B and, to a lesser extent, Brand C (46%), may

  3. The smallest cells pose the biggest problems: high-performance computing and the analysis of metagenome sequence data

    International Nuclear Information System (INIS)

    Edwards, R A

    2008-01-01

    New high-throughput DNA sequencing technologies have revolutionized how scientists study the organisms around us. In particular, microbiology - the study of the smallest, unseen organisms that pervade our lives - has embraced these new techniques to characterize and analyze the cellular constituents and use this information to develop novel tools, techniques, and therapeutics. So-called next-generation DNA sequencing platforms have resulted in huge increases in the amount of raw data that can be rapidly generated. Argonne National Laboratory developed the premier platform for the analysis of these new data (mg-rast), which is used by microbiologists worldwide. This paper uses the accounting from the computational analysis of more than 10,000,000,000 bp of DNA sequence data, describes an analysis of the advanced computational requirements, and suggests the level of analysis that will be essential as microbiologists move to understand how these tiny organisms affect our everyday lives. The results from this analysis indicate that data analysis is a linear problem, but that most analyses are held up in queues. With sufficient resources, computations could be completed in a few hours for a typical dataset. These data also suggest execution times that delimit timely completion of computational analyses, and provide bounds for problematic processes.

  4. Generation and Analysis of Full-length cDNA Sequences from Elephant Shark (Callorhinchus milii)

    KAUST Repository

    Kodzius, Rimantas

    2009-03-17

    Cartilaginous fishes are the oldest living group of jawed vertebrates and are therefore an important group for understanding the evolution of vertebrate genomes including the human genome. Our laboratory has proposed elephant shark (C. milii) as a model cartilaginous fish genome because of its relatively small genome size (910 Mb). The whole genome of C. milii is being sequenced (the first cartilaginous fish genome to be sequenced completely). To characterize the transcriptome of C. milii and to assist in annotating exon-intron boundaries, transcriptional start sites and alternatively spliced transcripts, we are generating full-length cDNA sequences from C. milii.

  5. Analysis of Pre-Analytic Factors Affecting the Success of Clinical Next-Generation Sequencing of Solid Organ Malignancies

    International Nuclear Information System (INIS)

    Chen, Hui; Luthra, Rajyalakshmi; Goswami, Rashmi S.; Singh, Rajesh R.; Roy-Chowdhuri, Sinchita

    2015-01-01

    Application of next-generation sequencing (NGS) technology to routine clinical practice has enabled characterization of personalized cancer genomes to identify patients likely to have a response to targeted therapy. The proper selection of tumor sample for downstream NGS-based mutational analysis is critical to generate accurate results and to guide therapeutic intervention. However, multiple pre-analytic factors come into play in determining the success of NGS testing. In this review, we discuss pre-analytic requirements for AmpliSeq PCR-based sequencing using Ion Torrent Personal Genome Machine (PGM) (Life Technologies), an NGS platform that is often used by clinical laboratories for sequencing solid tumors because of its low input DNA requirement from formalin-fixed and paraffin-embedded tissue. The success of NGS mutational analysis is affected not only by the input DNA quantity but also by several other factors, including the specimen type, the DNA quality, and the tumor cellularity. Here, we review tissue requirements for solid tumor NGS-based mutational analysis, including procedure types, tissue types, tumor volume and fraction, decalcification, and treatment effects.

  6. Analysis of Pre-Analytic Factors Affecting the Success of Clinical Next-Generation Sequencing of Solid Organ Malignancies

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Hui [Department of Pathology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Houston, TX 77030 (United States); Luthra, Rajyalakshmi, E-mail: rluthra@mdanderson.org; Goswami, Rashmi S.; Singh, Rajesh R. [Department of Hematopathology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Houston, TX 77030 (United States); Roy-Chowdhuri, Sinchita [Department of Pathology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Houston, TX 77030 (United States)

    2015-08-28

    Application of next-generation sequencing (NGS) technology to routine clinical practice has enabled characterization of personalized cancer genomes to identify patients likely to have a response to targeted therapy. The proper selection of tumor sample for downstream NGS-based mutational analysis is critical to generate accurate results and to guide therapeutic intervention. However, multiple pre-analytic factors come into play in determining the success of NGS testing. In this review, we discuss pre-analytic requirements for AmpliSeq PCR-based sequencing using Ion Torrent Personal Genome Machine (PGM) (Life Technologies), an NGS platform that is often used by clinical laboratories for sequencing solid tumors because of its low input DNA requirement from formalin-fixed and paraffin-embedded tissue. The success of NGS mutational analysis is affected not only by the input DNA quantity but also by several other factors, including the specimen type, the DNA quality, and the tumor cellularity. Here, we review tissue requirements for solid tumor NGS-based mutational analysis, including procedure types, tissue types, tumor volume and fraction, decalcification, and treatment effects.

  7. Analysis of Pre-Analytic Factors Affecting the Success of Clinical Next-Generation Sequencing of Solid Organ Malignancies

    Directory of Open Access Journals (Sweden)

    Hui Chen

    2015-08-01

    Application of next-generation sequencing (NGS) technology to routine clinical practice has enabled characterization of personalized cancer genomes to identify patients likely to have a response to targeted therapy. The proper selection of tumor sample for downstream NGS-based mutational analysis is critical to generate accurate results and to guide therapeutic intervention. However, multiple pre-analytic factors come into play in determining the success of NGS testing. In this review, we discuss pre-analytic requirements for AmpliSeq PCR-based sequencing using Ion Torrent Personal Genome Machine (PGM) (Life Technologies), an NGS platform that is often used by clinical laboratories for sequencing solid tumors because of its low input DNA requirement from formalin-fixed and paraffin-embedded tissue. The success of NGS mutational analysis is affected not only by the input DNA quantity but also by several other factors, including the specimen type, the DNA quality, and the tumor cellularity. Here, we review tissue requirements for solid tumor NGS-based mutational analysis, including procedure types, tissue types, tumor volume and fraction, decalcification, and treatment effects.

  8. Isolation and sequencing analysis on the seed-specific promoter from soybean

    Institute of Scientific and Technical Information of China (English)

    CAIYIN Qinggele; LI Mingchun; WEI Dongsheng; CAI Yi; XING Laijun

    2007-01-01

    The low level of foreign gene expression in transgenic plants is a key factor that limits plant genetic engineering. Because of the critical regulatory activity of promoters on gene transcription, they are studied extensively to improve the efficiency of plant transgenic systems. Constitutive promoters, such as the CaMV 35S promoter, are usually used in plant genetic engineering. But those constitutive promoters continuously express their downstream genes during the whole life span in all the tissues of the host plants. This not only wastes the host plant's energy but is also harmful to the host plant and usually affects its agronomic characteristics. In contrast, a seed-specific promoter only expresses its downstream genes from the mid to late stage of seed maturation, and there is no expression or much lower expression in other tissues. Seed-specific promoters are therefore valued for the improvements they have brought to plant quality engineering. The aim of this article is to characterize a new seed-specific promoter and improve grain quality. The promoter region of the β-conglycinin α-subunit gene was isolated from the genomic DNA of soybean Jilin 43 by PCR, and this fragment was successfully extended by TAIL-PCR to obtain the promoter fragment BCSP666. Sequencing analysis showed that the cloned fragment BCSP666 contained all of the motifs, such as the RY repeat element, AG/CCCCA motif, TACACAT motif, ACGT motif, A/T-rich motif and E-box, which constitute the seed-specific promoter activity. Based on this sequencing analysis, the seed-specific promoter activity of the fragment BCSP666 was predicted. The seed-specific expression vector pBI121-666, which contained the GUS reporter gene, was then constructed with the fragment BCSP666. Arabidopsis thaliana plants were transformed with the recombinant vector pBI121-666 by the Agrobacterium-mediated floral-dip method. The transgenic plants were selected on kanamycin-containing MS medium
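
    Motif checks of the kind described for BCSP666 can be sketched with regular expressions. The example below is an illustration under stated assumptions, not the authors' method: the abstract names the motifs but not the exact patterns searched, so the RY repeat (CATGCA) and E-box (CANNTG) consensus forms are commonly cited ones supplied here, the A/T-rich motif is omitted, only the forward strand is scanned, and the toy sequence is invented rather than taken from BCSP666.

```python
import re

# Motif name -> regular expression (assumed consensus forms, see lead-in).
MOTIFS = {
    "RY repeat": "CATGCA",
    "AG/CCCCA motif": "A[GC]CCCA",
    "TACACAT motif": "TACACAT",
    "ACGT motif": "ACGT",
    "E-box": "CA[ACGT]{2}TG",
}


def find_motifs(sequence: str) -> dict:
    """Return 0-based start positions of each motif on the forward strand.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    seq = sequence.upper()
    return {
        name: [m.start() for m in re.finditer(f"(?=({pattern}))", seq)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical toy sequence, not a real promoter fragment.
toy_promoter = "GGCATGCATGTTACACATAGCCCAACGTCAAATGAA"
hits = find_motifs(toy_promoter)
for name, positions in hits.items():
    print(f"{name}: {positions}")
```

    A fuller scan would also search the reverse complement and use a curated cis-element database; this sketch only shows the basic pattern-matching step.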

  9. Genetic Diversity in Passiflora Species Assessed by Morphological and ITS Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Shiamala Devi Ramaiya

    2014-01-01

    This study used morphological characterization and phylogenetic analysis of the internal transcribed spacer (ITS) region of nuclear ribosomal DNA to investigate the phylogeny of Passiflora species. The samples were collected from various regions of East Malaysia, and discriminant function analysis based on linear combinations of morphological variables was used to classify the Passiflora species. The biplots generated five distinct groups discriminated by morphological variables. The group consisted of cultivars of P. edulis with high levels of genetic similarity; in contrast, P. foetida was highly divergent from other species in the morphological biplots. The final dataset of aligned sequences from nine studied Passiflora accessions and 30 other individuals obtained from the GenBank database (NCBI) yielded one most parsimonious tree with two strongly supported clades. The maximum parsimony (MP) tree showed that the phylogenetic relationships within subgenus Passiflora support the classification at the series level. The constructed phylogenetic tree also confirmed the divergence of P. foetida from all other species and the closeness of wild and cultivated species. The phylogenetic relationships were consistent with results of morphological assessments. The results of this study indicate that ITS region analysis represents a useful tool for evaluating genetic diversity in Passiflora at the species level.

  10. Isolation and sequence characterization of DNA-A genome of a new begomovirus strain associated with severe leaf curling symptoms of Jatropha curcas L.

    Science.gov (United States)

    Chauhan, Sushma; Rahman, Hifzur; Mastan, Shaik G; Pamidimarri, D V N Sudheer; Reddy, Muppala P

    2018-07-20

    Begomoviruses, which belong to the family Geminiviridae, are associated with several disease symptoms, such as mosaic and leaf curling in Jatropha curcas. The molecular characterization of these viral strains will help in developing management strategies to control the disease. In this study, J. curcas plants that were infected with begomovirus and showed acute leaf curling symptoms were identified. The DNA-A segment from the pathogenic viral strain was isolated and sequenced. The sequenced genome was assembled and characterized in detail. The full-length DNA-A sequence was covered by primer walking. The genome sequence showed the general organization of DNA-A from begomovirus by the distribution of ORFs in both viral and anti-viral strands. The genome size ranged from 2844 to 2852 bp. Three strains with minor nucleotide variations were identified, and a phylogenetic analysis was performed by comparing the DNA-A segments from other reported begomovirus isolates. The maximum sequence similarity was observed with Euphorbia yellow mosaic virus (FN435995). In the phylogenetic tree, no clustering was observed with previously reported begomovirus strains isolated from J. curcas host. The strains isolated in this study belong to a new begomoviral strain that elicits symptoms of leaf curling in J. curcas. The results indicate that the probable origin of the strains is from Jatropha mosaic virus infecting J. gossypifolia. The strains isolated in this study are referred to as Jatropha curcas leaf curl India virus (JCLCIV) based on the major symptoms exhibited by host J. curcas. Copyright © 2018 Elsevier B.V. All rights reserved.

  11. Maturity onset diabetes of youth (MODY) in Turkish children: sequence analysis of 11 causative genes by next generation sequencing.

    Science.gov (United States)

    Ağladıoğlu, Sebahat Yılmaz; Aycan, Zehra; Çetinkaya, Semra; Baş, Veysel Nijat; Önder, Aşan; Peltek Kendirci, Havva Nur; Doğan, Haldun; Ceylaner, Serdar

    2016-04-01

    Maturity-onset diabetes of the youth (MODY) is a genetically and clinically heterogeneous group of diseases and is often misdiagnosed as type 1 or type 2 diabetes. The aim of this study is to investigate both novel and proven mutations of 11 MODY genes in Turkish children by using targeted next generation sequencing. A panel of 11 MODY genes was screened in 43 children with MODY diagnosed by clinical criteria. Studies of index cases were done with MISEQ-ILLUMINA, and family screenings and confirmation studies of mutations were done by Sanger sequencing. We identified 28 (65%) point mutations among 43 patients. Eighteen patients have GCK mutations, four have HNF1A, one has HNF4A, one has HNF1B, two have NEUROD1, one has PDX1 gene variations and one patient has both HNF1A and HNF4A heterozygote mutations. This is the first study including molecular studies of 11 MODY genes in Turkish children. GCK is the most frequent type of MODY in our study population. The very high frequency of novel mutations (42%) in our study population supports the view that, in heterogeneous disorders like MODY, sequence analysis provides rapid, cost-effective and accurate genetic diagnosis.

  12. Whole genome sequencing and bioinformatics analysis of two Egyptian genomes.

    Science.gov (United States)

    ElHefnawi, Mahmoud; Jeon, Sungwon; Bhak, Youngjune; ElFiky, Asmaa; Horaiz, Ahmed; Jun, JeHoon; Kim, Hyunho; Bhak, Jong

    2018-05-15

    We report two Egyptian male genomes (EGP1 and EGP2) sequenced at ~30× sequencing depth. EGP1 had 4.7 million variants, of which 198,877 were novel, while EGP2 had 209,109 novel variants out of 4.8 million variants. The mitochondrial haplogroups of the two individuals were identified to be H7b1 and L2a1c, respectively. We also identified the Y haplogroup of EGP1 (R1b) and EGP2 (J1a2a1a2 > P58 > FGC11). EGP1 had a mutation in the mitochondrial NADH dehydrogenase gene ND4 (m.11778 G > A) that causes Leber's hereditary optic neuropathy. Some SNPs shared by the two genomes were associated with an increased level of cholesterol and triglycerides, probably related to obesity in Egyptians. Comparison of these genomes with African and Western-Asian genomes can provide insights on Egyptian ancestry and genetic history. This resource can be used to further understand genomic diversity and functional classification of variants as well as human migration and evolution across Africa and Western-Asia. Copyright © 2017. Published by Elsevier B.V.

  13. Accident sequence precursor analysis level 2/3 model development

    International Nuclear Information System (INIS)

    Lui, C.H.; Galyean, W.J.; Brownson, D.A.

    1997-01-01

    The US Nuclear Regulatory Commission's Accident Sequence Precursor (ASP) program currently uses simple Level 1 models to assess the conditional core damage probability for operational events occurring in commercial nuclear power plants (NPP). Since not all accident sequences leading to core damage will result in the same radiological consequences, it is necessary to develop simple Level 2/3 models that can be used to analyze the response of the NPP containment structure in the context of a core damage accident, estimate the magnitude of the resulting radioactive releases to the environment, and calculate the consequences associated with these releases. The simple Level 2/3 model development work was initiated in 1995, and several prototype models have been completed. Once developed, these simple Level 2/3 models are linked to the simple Level 1 models to provide risk perspectives for operational events. This paper describes the methods implemented for the development of these simple Level 2/3 ASP models, and the linkage process to the existing Level 1 models.

  14. Sequence analysis of putative swrW gene required for surfactant ...

    African Journals Online (AJOL)

    Serratia marcescens produces biosurfactant serrawettin, essential for its population migration behavior. Serrawettin W1 was revealed to be an antibiotic serratamolide that makes it significant for deoxyribonucleic acid (DNA) and protein sequence analysis. Four nucleotide and amino-acid sequences from local strains ...

  15. Cathepsin L of Triatoma brasiliensis (Reduviidae, Triatominae): sequence characterization, expression pattern and zymography.

    Science.gov (United States)

    Waniek, Peter J; Pacheco Costa, Juliana E; Jansen, Ana M; Costa, Jane; Araújo, Catarina A C

    2012-01-01

    Triatoma brasiliensis is considered one of the main vectors of Chagas disease commonly found in semi-arid areas of northeastern Brazil. These insects use proteases, such as carboxypeptidase B, aminopeptidases and different cathepsins for blood digestion. In the present study, two genes encoding cathepsin L from the midgut of T. brasiliensis were identified and characterized. Mature T. brasiliensis cathepsin L-like proteinases (TBCATL-1, TBCATL-2) showed a high level of identity to the cathepsin L-like proteinases of other insects, with highest similarity to Rhodnius prolixus. Both cathepsin L transcripts were highly abundant in the posterior midgut region, the main region of blood digestion. Determination of the pH in the whole intestine of unfed T. brasiliensis revealed alkaline conditions in the anterior midgut region (stomach) and acidic conditions in the posterior midgut region (small intestine). Gelatine in-gel zymography showed the activity of at least four distinct proteinases in the small intestine, and the cysteine proteinase inhibitor transepoxysuccinyl-l-leucylamido-(4-guanidino)butane (E-64) and the cathepsin B inhibitor N-(l-3-trans-propylcarbamoyl-oxirane-2-carbonyl)-l-isoleucyl-l-proline (CA-074) were employed to characterize enzymatic activity. E-64 fully inhibited cysteine proteinase activity, whereas in the samples treated with CA-074 residual proteinase activity was detectable. Thus, proteolytic activity could at least partially be ascribed to cathepsin L. Western blot analysis using specific anti-cathepsin L antibodies confirmed the presence of cathepsin L in the lumen of the small intestine of the insects. Copyright © 2011 Elsevier Ltd. All rights reserved.

  16. Repetitive DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula

    Directory of Open Access Journals (Sweden)

    Navrátilová Alice

    2007-11-01

    Background: Extraordinary size variation of higher plant nuclear genomes is in large part caused by differences in accumulation of repetitive DNA. This makes repetitive DNA of great interest for studying the molecular mechanisms shaping architecture and function of complex plant genomes. However, due to methodological constraints of conventional cloning and sequencing, a global description of repeat composition is available for only a very limited number of higher plants. In order to provide further data required for investigating evolutionary patterns of repeated DNA within and between species, we used a novel approach based on massive parallel sequencing which allowed a comprehensive repeat characterization in our model species, garden pea (Pisum sativum). Results: Analysis of 33.3 Mb sequence data resulted in quantification and partial sequence reconstruction of major repeat families occurring in the pea genome with at least thousands of copies. Our results showed that the pea genome is dominated by LTR-retrotransposons, estimated at 140,000 copies/1C. Ty3/gypsy elements are less diverse and accumulated to higher copy numbers than Ty1/copia. This is in part due to a large population of Ogre-like retrotransposons which alone make up over 20% of the genome. In addition to numerous types of mobile elements, we have discovered a set of novel satellite repeats and two additional variants of telomeric sequences. Comparative genome analysis revealed that there are only a few repeat sequences conserved between pea and soybean genomes. On the other hand, all major families of pea mobile elements are well represented in M. truncatula. Conclusion: We have demonstrated that even in a species with a relatively large genome like pea, where a single 454-sequencing run provided only 0.77% coverage, the generated sequences were sufficient to reconstruct and analyze major repeat families corresponding to a total of 35–48% of the genome. These data

  17. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Directory of Open Access Journals (Sweden)

    Wang Jintu

    2011-04-01

    Full Text Available Abstract Background: Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result: To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of the closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs in the zebrafish genome, which demonstrates the high similarity between the zebrafish and common carp genomes and indicates the feasibility of comparative mapping between zebrafish and common carp once a physical map of common carp is available. Conclusion: BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3

  18. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Science.gov (United States)

    2011-01-01

    Background: Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result: To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of the closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs in the zebrafish genome, which demonstrates the high similarity between the zebrafish and common carp genomes and indicates the feasibility of comparative mapping between zebrafish and common carp once a physical map of common carp is available. Conclusion: BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3,100 microsyntenies, covering over 50% of
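    The sequencing figures quoted in this record can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the cited study: it uses the three numbers stated in the abstract (65,720 clean BES, 42,522,168 bp, 2.5% of the genome), and the variable names are assumptions chosen for readability. The implied genome size is a back-calculation from the rounded 2.5% figure, so it should be read as approximate.

```python
# Figures quoted in the abstract above.
clean_bes = 65_720          # clean BAC-end sequences after processing
total_bp = 42_522_168       # total bases in those sequences
genome_fraction = 0.025     # "2.5% of the common carp genome" (rounded in the source)

# Mean read length implied by the totals; the abstract reports 647 bp.
mean_read_length = total_bp / clean_bes

# Genome size implied by the coverage fraction (approximate, since 2.5% is rounded).
implied_genome_bp = total_bp / genome_fraction

print(f"mean read length: {mean_read_length:.0f} bp")
print(f"implied genome size: {implied_genome_bp / 1e9:.2f} Gb")
```

    The mean read length recovered this way agrees with the 647 bp stated in the abstract, which suggests the totals are internally consistent.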

  19. Technical Overview of Ecological Risk Assessment - Analysis Phase: Exposure Characterization

    Science.gov (United States)

    Exposure Characterization is the second major component of the analysis phase of a risk assessment. For a pesticide risk assessment, the exposure characterization describes the potential or actual contact of a pesticide with a plant, animal, or media.

  20. EO-1 analysis applicable to coastal characterization

    Science.gov (United States)

    Burke, Hsiao-hua K.; Misra, Bijoy; Hsu, Su May; Griffin, Michael K.; Upham, Carolyn; Farrar, Kris

    2003-09-01

    The EO-1 satellite is part of NASA's New Millennium Program (NMP). It carries three imaging sensors: the multi-spectral Advanced Land Imager (ALI), Hyperion and the Atmospheric Corrector. Hyperion is a high-resolution hyperspectral imager capable of resolving 220 spectral bands (from 0.4 to 2.5 micron) with a 30 m resolution. The instrument images a 7.5 km by 100 km land area per image. Hyperion has been the only space-borne HSI data source since the launch of EO-1 in late 2000. The discussion begins with the unique capability of hyperspectral sensing for coastal characterization: (1) most ocean feature algorithms are semi-empirical retrievals, and HSI has all the spectral bands needed to provide legacy with previous sensors and to explore new information; (2) coastal features are more complex than those of the deep ocean, so coupled effects are best resolved with HSI; and (3) with contiguous spectral coverage, atmospheric compensation can be done with more accuracy and confidence, especially since atmospheric aerosol effects are most pronounced in the visible region where coastal features lie. EO-1 data from Chesapeake Bay from 19 February 2002 are analyzed. In this presentation, it is first illustrated that hyperspectral data inherently provide more information for feature extraction than multispectral data, even though Hyperion has a lower SNR than ALI. Chlorophyll retrievals are also shown. The results compare favorably with data from other sources. The analysis illustrates the potential value of Hyperion (and HSI in general) data for coastal characterization. Future measurement requirements (airborne and space-borne) are also discussed.

  1. Simultaneous digital quantification and fluorescence-based size characterization of massively parallel sequencing libraries.

    Science.gov (United States)

    Laurie, Matthew T; Bertout, Jessica A; Taylor, Sean D; Burton, Joshua N; Shendure, Jay A; Bielas, Jason H

    2013-08-01

    Due to the high cost of failed runs and suboptimal data yields, quantification and determination of fragment size range are crucial steps in the library preparation process for massively parallel sequencing (or next-generation sequencing). Current library quality control methods commonly involve quantification using real-time quantitative PCR and size determination using gel or capillary electrophoresis. These methods are laborious and subject to a number of significant limitations that can make library calibration unreliable. Herein, we propose and test an alternative method for quality control of sequencing libraries using droplet digital PCR (ddPCR). By exploiting a correlation we have discovered between droplet fluorescence and amplicon size, we achieve the joint quantification and size determination of target DNA with a single ddPCR assay. We demonstrate the accuracy and precision of applying this method to the preparation of sequencing libraries.

  2. Single-cell template strand sequencing by Strand-seq enables the characterization of individual homologs

    NARCIS (Netherlands)

    Sanders, Ashley D; Falconer, Ester; Hills, Mark; Spierings, Diana C J; Lansdorp, Peter M.

    The ability to distinguish between genome sequences of homologous chromosomes in single cells is important for studies of copy-neutral genomic rearrangements (such as inversions and translocations), building chromosome-length haplotypes, refining genome assemblies, mapping sister chromatid exchange

  3. Characterization of upstream sequences of the LIM2 gene that bind developmentally regulated and lens-specific proteins

    Institute of Scientific and Technical Information of China (English)

    HSU Heng; Robert L. CHURCH

    2004-01-01

    During lens development, lens epithelial cells differentiate into fiber cells. To date, four major lens fiber cell intrinsic membrane proteins (MIP) ranging in size from 70 kD to 19 kD have been characterized. The second most abundant lens fiber cell intrinsic membrane protein is MP19. This protein is probably involved in lens cell communication and is related to cataractogenesis. The aim of this research is to characterize upstream sequences of the MP19 (also called LIM2) gene that bind developmentally regulated and lens-specific proteins. We used gel mobility assays and corresponding competition experiments to identify and characterize cis elements within approximately 500 bases of LIM2 upstream sequences. Our studies locate the positions of some cis elements, including a "CA" repeat, a methylation Hha I island, an FnuD II site, and Ap1 and Ap2 consensus sequences, and identify some specific cis elements which relate to lens-specific transcription of LIM2. Our experiments also preliminarily identify trans factors which bind to specific cis elements of the LIM2 promoter and/or regulate transcription of LIM2. We conclude that developmental regulation and coordination of the MP19 gene in ocular lens fiber cells is controlled by the presence of specific cis elements that bind regulatory trans factors that affect LIM2 gene expression. DNA methylation is one mechanism of controlling LIM2 gene expression during lens development.

  4. A symbolic dynamics approach for the complexity analysis of chaotic pseudo-random sequences

    International Nuclear Information System (INIS)

    Xiao Fanghong

    2004-01-01

    By considering a chaotic pseudo-random sequence as a symbolic sequence, the authors present a symbolic dynamics approach for the complexity analysis of chaotic pseudo-random sequences. The method is applied to the logistic map and a one-way coupled map lattice to demonstrate how it works, and it is compared with the approximate entropy method. The results show that this method can distinguish the complexities of different chaotic pseudo-random sequences and that it is superior to the approximate entropy method.
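    The general idea of treating a chaotic orbit as a symbolic sequence can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the abstract does not specify the partition or the complexity measure, so the sketch assumes a binary partition of the logistic map at 0.5 and uses the Shannon entropy of fixed-length words (block entropy per symbol) as a stand-in complexity measure. The function names, parameter values, and word length are all hypothetical choices.

```python
import math
from collections import Counter


def logistic_sequence(r, x0, n, burn_in=1000):
    """Iterate x -> r*x*(1-x), discard a transient, and return n values."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1 - x)
    values = []
    for _ in range(n):
        x = r * x * (1 - x)
        values.append(x)
    return values


def symbolize(values, threshold=0.5):
    """Map each value to a binary symbol (assumed partition at 0.5)."""
    return [1 if v >= threshold else 0 for v in values]


def block_entropy_rate(symbols, word_len):
    """Shannon entropy (bits) of overlapping words, divided by word length."""
    words = Counter(
        tuple(symbols[i:i + word_len])
        for i in range(len(symbols) - word_len + 1)
    )
    total = sum(words.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in words.values())
    return entropy / word_len


# Compare a chaotic parameter value with a periodic one (both assumed examples).
chaotic = block_entropy_rate(symbolize(logistic_sequence(4.0, 0.3, 20_000)), 8)
periodic = block_entropy_rate(symbolize(logistic_sequence(3.5, 0.3, 20_000)), 8)
print(f"chaotic  (r=4.0): {chaotic:.3f} bits/symbol")
print(f"periodic (r=3.5): {periodic:.3f} bits/symbol")
```

    Under these assumptions the chaotic orbit should yield a clearly higher value than the periodic one, which is the kind of distinction a symbolic complexity measure is meant to draw. The one-way coupled map lattice mentioned in the abstract is not covered by this sketch.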

  5. Systematic Internal Transcribed Spacer Sequence Analysis for Identification of Clinical Mold Isolates in Diagnostic Mycology: a 5-Year Study

    Science.gov (United States)

    Ciardo, Diana E.; Lucke, Katja; Imhof, Alex; Bloemberg, Guido V.; Böttger, Erik C.

    2010-01-01

    The implementation of internal transcribed spacer (ITS) sequencing for routine identification of molds in the diagnostic mycology laboratory was analyzed in a 5-year study. All mold isolates (n = 6,900) recovered in our laboratory from 2005 to 2009 were included in this study. According to a defined work flow, which in addition to troublesome phenotypic identification takes clinical relevance into account, 233 isolates were subjected to ITS sequence analysis. Sequencing resulted in successful identification for 78.6% of the analyzed isolates (57.1% at species level, 21.5% at genus level). In comparison, extended in-depth phenotypic characterization of the isolates subjected to sequencing achieved taxonomic assignment for 47.6% of these, with a mere 13.3% at species level. Optimization of DNA extraction further improved the efficacy of molecular identification. This study is the first of its kind to testify to the systematic implementation of sequence-based identification procedures in the routine workup of mold isolates in the diagnostic mycology laboratory. PMID:20573873

  6. Systematic internal transcribed spacer sequence analysis for identification of clinical mold isolates in diagnostic mycology: a 5-year study.

    Science.gov (United States)

    Ciardo, Diana E; Lucke, Katja; Imhof, Alex; Bloemberg, Guido V; Böttger, Erik C

    2010-08-01

    The implementation of internal transcribed spacer (ITS) sequencing for routine identification of molds in the diagnostic mycology laboratory was analyzed in a 5-year study. All mold isolates (n = 6,900) recovered in our laboratory from 2005 to 2009 were included in this study. According to a defined work flow, which in addition to troublesome phenotypic identification takes clinical relevance into account, 233 isolates were subjected to ITS sequence analysis. Sequencing resulted in successful identification for 78.6% of the analyzed isolates (57.1% at species level, 21.5% at genus level). In comparison, extended in-depth phenotypic characterization of the isolates subjected to sequencing achieved taxonomic assignment for 47.6% of these, with a mere 13.3% at species level. Optimization of DNA extraction further improved the efficacy of molecular identification. This study is the first of its kind to testify to the systematic implementation of sequence-based identification procedures in the routine workup of mold isolates in the diagnostic mycology laboratory.
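    The percentages in this record can be turned into approximate isolate counts. The sketch below is an illustration only: it uses the figures stated in the abstract (233 sequenced isolates out of 6,900; 57.1% and 21.5% for ITS sequencing; 47.6% and 13.3% for phenotypic characterization). The counts are back-calculated from rounded percentages, so they are estimates rather than reported values, and the dictionary keys are hypothetical labels.

```python
# Figures quoted in the abstract above.
total_isolates = 6_900
sequenced = 233

percentages = {
    "ITS, species level": 57.1,
    "ITS, genus level": 21.5,
    "phenotypic, any level": 47.6,
    "phenotypic, species level": 13.3,
}

# Approximate isolate counts implied by each (rounded) percentage.
counts = {label: round(sequenced * pct / 100) for label, pct in percentages.items()}

# Species-level plus genus-level ITS identifications should match the 78.6% total.
its_total_pct = round(
    percentages["ITS, species level"] + percentages["ITS, genus level"], 1
)

for label, n in counts.items():
    print(f"{label}: about {n} of {sequenced} isolates")
print(f"ITS total: {its_total_pct}%")
print(f"share of all isolates sequenced: {100 * sequenced / total_isolates:.1f}%")
```

    Read this way, ITS sequencing identified roughly 183 of the 233 isolates, against roughly 111 for extended phenotypic characterization.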

  7. The sequence and analysis of duplication rich human chromosome 16

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Joel; Han, Cliff; Gordon, Laurie A.; Terry, Astrid; Prabhakar, Shyam; She, Xinwei; Xie, Gary; Hellsten, Uffe; Man Chan, Yee; Altherr, Michael; Couronne, Olivier; Aerts, Andrea; Bajorek, Eva; Black, Stacey; Blumer, Heather; Branscomb, Elbert; Brown, Nancy C.; Bruno, William J.; Buckingham, Judith M.; Callen, David F.; Campbell, Connie S.; Campbell, Mary L.; Campbell, Evelyn W.; Caoile, Chenier; Challacombe, Jean F.; Chasteen, Leslie A.; Chertkov, Olga; Chi, Han C.; Christensen, Mari; Clark, Lynn M.; Cohn, Judith D.; Denys, Mirian; Detter, John C.; Dickson, Mark; Dimitrijevic-Bussod, Mira; Escobar, Julio; Fawcett, Joseph J.; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstein, David; Goodwin, Lynne A.; Grady, Deborah L.; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Hildebrand, Carl E.; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Jewett, Phillip E.; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Krawczyk, Marie-Claude; Leyba, Tina; Longmire, Jonathan L.; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Ludeman, Thom; Mark, Graham A.; Mcmurray, Kimberly L.; Meincke, Linda J.; Morgan, Jenna; Moyzis, Robert K.; Mundt, Mark O.; Munk, A. Christine; Nandkeshwar, Richard D.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Parson-Quintana, Beverly; Ramirez, Lucia; Rash, Sam; Retterer, James; Ricke, Darryl O.; Robinson, Donna L.; Rodriguez, Alex; Salamov, Asaf; Saunders, Elizabeth H.; Scott, Duncan; Shough, Timothy; Stallings, Raymond L.; Stalvey, Malinda; Sutherland, Robert D.; Tapia, Roxanne; Tesmer, Judith G.; Thayer, Nina; Thompson, Linda S.; Tice, Hope; Torney, David C.; Tran-Gyamfi, Mary; Tsai, Ming; Ulanovsky, Levy E.; Ustaszewska, Anna; Vo, Nu; White, P. Scott; Williams, Albert L.; Wills, Patricia L.; Wu, Jung-Rung; Wu, Kevin; Yang, Joan; DeJong, Pieter; Bruce, David; Doggett, Norman; Deaven, Larry; Schmutz, Jeremy; Grimwood, Jane; Richardson, Paul; et al.

    2004-08-01

    We report here the 78,884,754 base pairs of finished human chromosome 16 sequence, representing over 99.9 percent of its euchromatin. Manual annotation revealed 880 protein coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes and 3 RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukemia. Several large-scale structural polymorphisms spanning hundreds of kilobasepairs were identified and result in gene content differences across humans. One of the unique features of chromosome 16 is its high level of segmental duplication, ranked among the highest of the human autosomes. While the segmental duplications are enriched in the relatively gene poor pericentromere of the p-arm, some are involved in recent gene duplication and conversion events which are likely to have had an impact on the evolution of primates and human disease susceptibility.

  8. Analysis of decision procedures for a sequence of inventory periods

    International Nuclear Information System (INIS)

    Avenhaus, R.

    1982-07-01

    Optimal test procedures for a sequence of inventory periods will be discussed. Starting with a game theoretical description of the conflict situation between the plant operator and the inspector, the objectives of the inspector as well as the general decision theoretical problem will be formulated. In the first part the objective of 'secure' detection will be emphasized which means that only at the end of the reference time a decision is taken by the inspector. In the second part the objective of 'timely' detection will be emphasized which will lead to sequential test procedures. At the end of the paper all procedures will be summarized, and in view of the multitude of procedures available at the moment some comments about future work will be given.

  9. The Sequence and Analysis of Duplication Rich Human Chromosome 16

    Science.gov (United States)

    Martin, Joel; Han, Cliff; Gordon, Laurie A.; Terry, Astrid; Prabhakar, Shyam; She, Xinwei; Xie, Gary; Hellsten, Uffe; Man Chan, Yee; Altherr, Michael; Couronne, Olivier; Aerts, Andrea; Bajorek, Eva; Black, Stacey; Blumer, Heather; Branscomb, Elbert; Brown, Nancy C.; Bruno, William J.; Buckingham, Judith M.; Callen, David F.; Campbell, Connie S.; Campbell, Mary L.; Campbell, Evelyn W.; Caoile, Chenier; Challacombe, Jean F.; Chasteen, Leslie A.; Chertkov, Olga; Chi, Han C.; Christensen, Mari; Clark, Lynn M.; Cohn, Judith D.; Denys, Mirian; Detter, John C.; Dickson, Mark; Dimitrijevic-Bussod, Mira; Escobar, Julio; Fawcett, Joseph J.; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstein, David; Goodwin, Lynne A.; Grady, Deborah L.; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Hildebrand, Carl E.; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Jewett, Phillip E.; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Krawczyk, Marie-Claude; Leyba, Tina; Longmire, Jonathan L.; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Ludeman, Thom; Mark, Graham A.; Mcmurray, Kimberly L.; Meincke, Linda J.; Morgan, Jenna; Moyzis, Robert K.; Mundt, Mark O.; Munk, A. Christine; Nandkeshwar, Richard D.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Parson-Quintana, Beverly; Ramirez, Lucia; Rash, Sam; Retterer, James; Ricke, Darryl O.; Robinson, Donna L.; Rodriguez, Alex; Salamov, Asaf; Saunders, Elizabeth H.; Scott, Duncan; Shough, Timothy; Stallings, Raymond L.; Stalvey, Malinda; Sutherland, Robert D.; Tapia, Roxanne; Tesmer, Judith G.; Thayer, Nina; Thompson, Linda S.; Tice, Hope; Torney, David C.; Tran-Gyamfi, Mary; Tsai, Ming; Ulanovsky, Levy E.; Ustaszewska, Anna; Vo, Nu; White, P. Scott; Williams, Albert L.; Wills, Patricia L.; Wu, Jung-Rung; Wu, Kevin; Yang, Joan; DeJong, Pieter; Bruce, David; Doggett, Norman; Deaven, Larry; Schmutz, Jeremy; Grimwood, Jane; Richardson, Paul; et al.

    2004-01-01

    We report here the 78,884,754 base pairs of finished human chromosome 16 sequence, representing over 99.9 percent of its euchromatin. Manual annotation revealed 880 protein coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes and 3 RNA pseudogenes. These genes include metallothionein, cadherin