WorldWideScience

Sample records for references sequence analysis

  1. Improvements and impacts of GRCh38 human reference on high throughput sequencing data analysis.

    Science.gov (United States)

    Guo, Yan; Dai, Yulin; Yu, Hui; Zhao, Shilin; Samuels, David C; Shyr, Yu

    2017-03-01

    Analysis of high-throughput sequencing data starts with alignment against a reference genome, which is the foundation for all re-sequencing data analyses. Each new release of the human reference genome has been augmented with improved accuracy and completeness. It is presumed that the latest release of the human reference genome, GRCh38, will contribute more to high-throughput sequencing data analysis by providing more accuracy, but the amount of improvement has not yet been quantified. We conducted a study to compare the genomic analysis results between the GRCh38 reference and its predecessor, GRCh37. Through analyses of alignment, single nucleotide polymorphisms, small insertions/deletions, copy number and structural variants, we show that GRCh38 offers overall more accurate analysis of human sequencing data. More importantly, GRCh38 produced fewer false positive structural variants. In conclusion, GRCh38 is not only an improvement over GRCh37 as a genome assembly but also yields more reliable genomic analysis results. Copyright © 2017. Published by Elsevier Inc.

  2. Sequence Factorization with Multiple References.

    Directory of Open Access Journals (Sweden)

    Sebastian Wandelt

    Full Text Available. The success of high-throughput sequencing has led to an increasing number of projects which sequence large populations of a species. Storage and analysis of sequence data is a key challenge in these projects because of the sheer size of the datasets. Compression is one simple technology to deal with this challenge. Referential factorization and compression schemes, which store only the differences between an input sequence and a reference sequence, have gained considerable interest in this field. Highly similar sequences, e.g., human genomes, can be compressed with a compression ratio of 1,000:1 and more, up to two orders of magnitude better than with standard compression techniques. Recently, it was shown that compression against multiple references from the same species can boost the compression ratio up to 4,000:1. However, a detailed analysis of using multiple references is lacking, e.g., for main memory consumption and optimality. In this paper, we describe one key technique for referential compression against multiple references: the factorization of sequences. Based on the notion of an optimal factorization, we propose optimization heuristics and identify parameter settings which greatly influence (1) the size of the factorization, (2) the time for factorization, and (3) the required amount of main memory. We evaluate a total of 30 setups with a varying number of references on data from three different species. Our results show a wide range of factorization sizes (optimal to an overhead of up to 300%), factorization speed (0.01 MB/s to more than 600 MB/s), and main memory usage (a few dozen MB to dozens of GB). Based on our evaluation, we identify the best configurations for common use cases. Our evaluation shows that multi-reference factorization is much better than single-reference factorization.
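
    As an illustration of the idea of referential factorization, the following is a minimal sketch, not the authors' implementation. It assumes a simple greedy longest-match strategy, and the names `factorize`, `reconstruct`, and the `min_len` cutoff are hypothetical. A sequence is rewritten as pointers `(reference index, start, length)` into one or more references, with single characters kept as literals when no sufficiently long match exists.

```python
from typing import List, Tuple, Union

# A factor is either a reference match (ref_index, start, length) or a literal character.
Factor = Union[Tuple[int, int, int], str]


def factorize(seq: str, refs: List[str], min_len: int = 4) -> List[Factor]:
    """Greedy referential factorization (illustrative only, not optimized)."""
    factors: List[Factor] = []
    i = 0
    while i < len(seq):
        best = (0, 0, 0)  # (ref_index, start, length) of the longest match so far
        for r, ref in enumerate(refs):
            start, length = 0, 0
            # Extend the candidate one character at a time while it still occurs in ref.
            while i + length < len(seq):
                pos = ref.find(seq[i:i + length + 1])
                if pos < 0:
                    break
                start, length = pos, length + 1
            if length > best[2]:
                best = (r, start, length)
        if best[2] >= min_len:
            factors.append(best)
            i += best[2]
        else:
            # Match too short to be worth a pointer: store the character itself.
            factors.append(seq[i])
            i += 1
    return factors


def reconstruct(factors: List[Factor], refs: List[str]) -> str:
    """Rebuild the original sequence from its factors and the references."""
    parts = []
    for f in factors:
        if isinstance(f, str):
            parts.append(f)
        else:
            r, start, length = f
            parts.append(refs[r][start:start + length])
    return "".join(parts)


refs = ["ACGTACGTTTGA", "GGGCCCAAATTT"]
seq = "ACGTTTGANGGGCCCA"
multi = factorize(seq, refs)        # uses both references
single = factorize(seq, refs[:1])   # uses only the first reference
```

    In this toy input the two-reference factorization needs fewer factors than the single-reference one, which is the effect the abstract reports at scale. A practical implementation would use an index (for example a suffix array) instead of repeated substring search.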

  3. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection.

    Science.gov (United States)

    Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S

    2018-01-01

    Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2 with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publicly available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have

  4. Analysis of the complete genome sequence of Nocardia seriolae UTF1, the causative agent of fish nocardiosis: The first reference genome sequence of the fish pathogenic Nocardia species.

    Science.gov (United States)

    Yasuike, Motoshige; Nishiki, Issei; Iwasaki, Yuki; Nakamura, Yoji; Fujiwara, Atushi; Shimahara, Yoshiko; Kamaishi, Takashi; Yoshida, Terutoyo; Nagai, Satoshi; Kobayashi, Takanori; Katoh, Masaya

    2017-01-01

    Nocardiosis caused by Nocardia seriolae is one of the major threats in the aquaculture of Seriola species (yellowtail; S. quinqueradiata, amberjack; S. dumerili and kingfish; S. lalandi) in Japan. Here, we report the complete nucleotide genome sequence of N. seriolae UTF1, isolated from a cultured yellowtail. The genome is a circular chromosome of 8,121,733 bp with a G+C content of 68.1% that encodes 7,697 predicted proteins. In the N. seriolae UTF1 predicted genes, we found orthologs of virulence factors of pathogenic mycobacteria and human clinical Nocardia isolates involved in host cell invasion, modulation of phagocyte function and survival inside the macrophages. The virulence factor candidates provide an essential basis for understanding their pathogenic mechanisms at the molecular level by the fish nocardiosis research community in future studies. We also found many potential antibiotic resistance genes on the N. seriolae UTF1 chromosome. Comparative analysis with the four existing complete genomes, N. farcinica IFM 10152, N. brasiliensis HUJEG-1 and N. cyriacigeorgica GUH-2 and N. nova SH22a, revealed that 2,745 orthologous genes were present in all five Nocardia genomes (core genes) and 1,982 genes were unique to N. seriolae UTF1. In particular, the N. seriolae UTF1 genome contains a greater number of mobile elements and genes of unknown function that comprise the differences in structure and gene content from the other Nocardia genomes. In addition, a lot of the N. seriolae UTF1-specific genes were assigned to the ABC transport system. Because of limited resources in ocean environments, these N. seriolae UTF1 specific ABC transporters might facilitate adaptation strategies essential for marine environment survival. Thus, the availability of the complete N. seriolae UTF1 genome sequence will provide a valuable resource for comparative genomic studies of N. seriolae isolates, as well as provide new insights into the ecological and functional diversity of

  5. Biological sequence analysis

    DEFF Research Database (Denmark)

    Durbin, Richard; Eddy, Sean; Krogh, Anders Stærmose

    This book provides an up-to-date and tutorial-level overview of sequence analysis methods, with particular emphasis on probabilistic modelling. Discussed methods include pairwise alignment, hidden Markov models, multiple alignment, profile searches, RNA secondary structure analysis, and phylogene...

  6. Transcriptome sequencing of the Microarray Quality Control (MAQC) RNA reference samples using next generation sequencing

    Directory of Open Access Journals (Sweden)

    Thierry-Mieg Danielle

    2009-06-01

    Full Text Available. Abstract. Background: Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis. As a preliminary evaluation of these promising technologies, we performed deep sequencing of cDNA synthesized from the Microarray Quality Control (MAQC) reference RNA samples using Roche's 454 Genome Sequencer FLX. Results: We generated more than 3.6 million sequence reads of average length 250 bp for the MAQC A and B samples and introduced a data analysis pipeline for translating cDNA read counts into gene expression levels. Using BLAST, 90% of the reads mapped to the human genome and 64% of the reads mapped to the RefSeq database of well annotated genes with e-values ≤ 10^-20. We measured gene expression levels in the A and B samples by counting the numbers of reads that mapped to individual RefSeq genes in multiple sequencing runs to evaluate the MAQC quality metrics for reproducibility, sensitivity, specificity, and accuracy, and compared the results with DNA microarrays and Quantitative RT-PCR (QRTPCR) from the MAQC studies. In addition, 88% of the reads were successfully aligned directly to the human genome using the AceView alignment programs with an average 90% sequence similarity to identify 137,899 unique exon junctions, including 22,193 new exon junctions not yet contained in the RefSeq database. Conclusion: Using the MAQC metrics for evaluating the performance of gene expression platforms, the ExpressSeq results for gene expression levels showed excellent reproducibility, sensitivity, and specificity that improved systematically with increasing shotgun sequencing depth, and quantitative accuracy that was comparable to DNA microarrays and QRTPCR. In addition, a careful mapping of the reads to the genome using the AceView alignment programs shed new light on the complexity of the human transcriptome, including the discovery of thousands of new splice variants.
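
    The step of translating mapped reads into expression levels can be sketched roughly as follows. This is an assumption-laden illustration and not the pipeline from the study: the gene names are made up, and the reads-per-million scaling is a common convention chosen here for the example rather than something the abstract specifies.

```python
from collections import Counter
from typing import Dict, Iterable


def count_reads_per_gene(read_to_gene: Iterable[str]) -> Counter:
    """Count how many mapped reads were assigned to each gene."""
    return Counter(read_to_gene)


def reads_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw counts by sequencing depth so that runs of different size are comparable."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: n * 1e6 / total for gene, n in counts.items()}


# Hypothetical gene assignments for four mapped reads from one sequencing run.
run = ["GENE_A", "GENE_A", "GENE_B", "GENE_A"]
counts = count_reads_per_gene(run)
expression = reads_per_million(counts)
```

    Comparing such per-gene values across repeated runs is one simple way to look at reproducibility of the kind the abstract evaluates.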

  7. The Release 6 reference sequence of the Drosophila melanogaster genome.

    Science.gov (United States)

    Hoskins, Roger A; Carlson, Joseph W; Wan, Kenneth H; Park, Soo; Mendez, Ivonne; Galle, Samuel E; Booth, Benjamin W; Pfeiffer, Barret D; George, Reed A; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V; Andreyeva, Evgeniya N; Boldyreva, Lidiya V; Marra, Marco; Carvalho, A Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F; Rubin, Gerald M; Karpen, Gary H; Celniker, Susan E

    2015-03-01

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. © 2015 Hoskins et al.; Published by Cold Spring Harbor Laboratory Press.

  8. Image sequence analysis

    CERN Document Server

    1981-01-01

    The processing of image sequences has a broad spectrum of important applications including target tracking, robot navigation, bandwidth compression of TV conferencing video signals, studying the motion of biological cells using microcinematography, cloud tracking, and highway traffic monitoring. Image sequence processing involves a large amount of data. However, because of the progress in computer, LSI, and VLSI technologies, we have now reached a stage when many useful processing tasks can be done in a reasonable amount of time. As a result, research and development activities in image sequence analysis have recently been growing at a rapid pace. An IEEE Computer Society Workshop on Computer Analysis of Time-Varying Imagery was held in Philadelphia, April 5-6, 1979. A related special issue of the IEEE Transactions on Pattern Analysis and Machine Intelligence was published in November 1980. The IEEE Computer magazine also published a special issue on the subject in 1981. The purpose of this book ...

  9. Reference genome sequence of the model plant Setaria.

    Science.gov (United States)

    Bennetzen, Jeffrey L; Schmutz, Jeremy; Wang, Hao; Percifield, Ryan; Hawkins, Jennifer; Pontaroli, Ana C; Estep, Matt; Feng, Liang; Vaughn, Justin N; Grimwood, Jane; Jenkins, Jerry; Barry, Kerrie; Lindquist, Erika; Hellsten, Uffe; Deshpande, Shweta; Wang, Xuewen; Wu, Xiaomei; Mitros, Therese; Triplett, Jimmy; Yang, Xiaohan; Ye, Chu-Yu; Mauro-Herrera, Margarita; Wang, Lin; Li, Pinghua; Sharma, Manoj; Sharma, Rita; Ronald, Pamela C; Panaud, Olivier; Kellogg, Elizabeth A; Brutnell, Thomas P; Doust, Andrew N; Tuskan, Gerald A; Rokhsar, Daniel; Devos, Katrien M

    2012-05-13

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ∼400-Mb assembly covers ∼80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  10. Reference genome sequence of the model plant Setaria

    Energy Technology Data Exchange (ETDEWEB)

    Bennetzen, Jeffrey L [ORNL; Schmutz, Jeremy [Hudson Alpha Institute of Biotechnology; Wang, Hao [University of Georgia, Athens, GA; Percifield, Ryan [University of Georgia, Athens, GA; Hawkins, Jennifer [University of Georgia, Athens, GA; Pontaroli, Ana C. [University of Georgia, Athens, GA; Estep, Matt [University of Georgia, Athens, GA; Feng, Liang [University of Georgia, Athens, GA; Vaughn, Justin N [ORNL; Grimwood, Jane [Hudson Alpha Institute of Biotechnology; Jenkins, Jerry [Hudson Alpha Institute of Biotechnology; Barry, Kerrie [U.S. Department of Energy, Joint Genome Institute; Lindquist, Erika [U.S. Department of Energy, Joint Genome Institute; Hellsten, Uffe [U.S. Department of Energy, Joint Genome Institute; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute; Wang, Xuewen [University of Georgia, Athens, GA; Wu, Xiaomei [University of Georgia, Athens, GA; Mitros, Therese [University of California, Berkeley; Triplett, Jimmy [University of Missouri, St. Louis; Yang, Xiaohan [ORNL; Ye, Chuyu [ORNL; Mauro-Herrera, Margarita [Oklahoma State University; Wang, Lin [Cornell University; Li, Pinghua [Cornell University; Sharma, Manoj [University of California, Davis; Sharma, Rita [University of California, Davis; Ronald, Pamela [University of California, Davis; Panaud, Olivier [Universite de Perpignan, Perpignan, France; Kellogg, Elizabeth A. [University of Missouri, St. Louis; Brutnell, Thomas P. [Cornell University; Doust, Andrew N. [Oklahoma State University; Tuskan, Gerald A [ORNL; Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute; Devos, Katrien M [ORNL

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  11. Reference genome sequence of the model plant Setaria

    Energy Technology Data Exchange (ETDEWEB)

    Bennetzen, Jeffrey L [ORNL; Yang, Xiaohan [ORNL; Ye, Chuyu [ORNL; Tuskan, Gerald A [ORNL

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  12. MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads

    DEFF Research Database (Denmark)

    Petersen, Thomas Nordahl; Lukjancenko, Oksana; Thomsen, Martin Christen Frølund

    2017-01-01

    number of false positive species annotations are a problem unless thresholds or post-processing are applied to differentiate between correct and false annotations. MGmapper is a package to process raw next generation sequence data and perform reference based sequence assignment, followed by a post...... pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper). A web-version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small fastq datasets....

  13. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

    Directory of Open Access Journals (Sweden)

    Manal Helal

    Full Text Available. BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance, of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm, linear mapping (LM) of the distance matrix. Principal Components Analysis was used for dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra
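
    The notion of a cluster 'centroid' as the sequence from which distances to all other members are minimized can be sketched as a medoid computation over a precomputed distance matrix. This is a minimal illustration, not the LM algorithm from the study; the function name `medoid` and the toy distance values are assumptions made for the example.

```python
from typing import Sequence


def medoid(cluster: Sequence[int], dist: Sequence[Sequence[float]]) -> int:
    """Return the cluster member with the smallest summed distance to all members."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))


# Hypothetical pairwise distances between four sequences (symmetric, zero diagonal).
dist = [
    [0.0, 0.1, 0.2, 0.9],
    [0.1, 0.0, 0.1, 0.8],
    [0.2, 0.1, 0.0, 0.7],
    [0.9, 0.8, 0.7, 0.0],
]

representative = medoid([0, 1, 2], dist)  # sequence 1 sits between 0 and 2
singleton = medoid([3], dist)             # a one-member cluster represents itself
```

    The selected member could then serve as the reference sequence for its species, in the spirit of the centroids described above.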

  14. Direct, rapid RNA sequence analysis

    International Nuclear Information System (INIS)

    Peattie, D.A.

    1987-01-01

    The original methods of RNA sequence analysis were based on enzymatic production and chromatographic separation of overlapping oligonucleotide fragments from within an RNA molecule followed by identification of the mononucleotides comprising the oligomer. Over the past decade the field of nucleic acid sequencing has changed dramatically, however, and RNA molecules now can be sequenced in a variety of more streamlined fashions. Most of the more recent advances in RNA sequencing have involved one-dimensional electrophoretic separation of 32P-end-labeled oligoribonucleotides on polyacrylamide gels. In this chapter the author discusses two of these methods for determining the nucleotide sequences of RNA molecules rapidly: the chemical method and the enzymatic method. Both methods are direct and degradative, i.e., they rely on fragmatic and chemical approaches should be utilized. The single-strand-specific ribonucleases (A, T1, T2, and S1) provide an efficient means to locate double-helical regions rapidly, and the chemical reactions provide a means to determine the RNA sequence within these regions. In addition, the chemical reactions allow one to assign interactions to specific atoms and to distinguish secondary interactions from tertiary ones. If the RNA molecule is small enough to be sequenced directly by the enzymatic or chemical method, the probing reactions can be done easily at the same time as sequencing reactions.

  15. Comparison of the Equine Reference Sequence with Its Sanger Source Data and New Illumina Reads.

    Directory of Open Access Journals (Sweden)

    Jovan Rebolledo-Mendez

    Full Text Available. The reference assembly for the domestic horse, EquCab2, published in 2009, was built using approximately 30 million Sanger reads from a Thoroughbred mare named Twilight. Contiguity in the assembly was facilitated using nearly 315 thousand BAC end sequences from Twilight's half brother Bravo. Since then, it has served as the foundation for many genome-wide analyses that include not only the modern horse, but ancient horses and other equid species as well. As data mapped to this reference have accumulated, consistent variation between mapped datasets and the reference, in terms of regions with no read coverage, single nucleotide variants, and small insertions/deletions, has become apparent. In many cases, it is not clear whether these differences are the result of true sequence variation between the research subjects' and Twilight's genome or due to errors in the reference. EquCab2 is regarded as "The Twilight Assembly." The objective of this study was to identify inconsistencies between the EquCab2 assembly and the source Twilight Sanger data used to build it. To that end, the original Sanger and BAC end reads have been mapped back to this equine reference and assessed with the addition of approximately 40X coverage of new Illumina Paired-End sequence data. The resulting mapped datasets identify those regions with low Sanger read coverage, as well as variation in genomic content that is not consistent with either the original Twilight Sanger data or the new genomic sequence data generated from Twilight on the Illumina platform. As the haploid EquCab2 reference assembly was created using Sanger reads derived largely from a single individual, the vast majority of variation detected in a mapped dataset composed of those same Sanger reads should be heterozygous. In contrast, homozygous variations would represent either errors in the reference or contributions from Bravo's BAC end sequences. Our analysis identifies 720,843 homozygous discrepancies

  16. Subsampled open-reference clustering creates consistent, comprehensive OTU definitions and scales to billions of sequences.

    Science.gov (United States)

    Rideout, Jai Ram; He, Yan; Navas-Molina, Jose A; Walters, William A; Ursell, Luke K; Gibbons, Sean M; Chase, John; McDonald, Daniel; Gonzalez, Antonio; Robbins-Pianka, Adam; Clemente, Jose C; Gilbert, Jack A; Huse, Susan M; Zhou, Hong-Wei; Knight, Rob; Caporaso, J Gregory

    2014-01-01

    We present a performance-optimized algorithm, subsampled open-reference OTU picking, for assigning marker gene (e.g., 16S rRNA) sequences generated on next-generation sequencing platforms to operational taxonomic units (OTUs) for microbial community analysis. This algorithm provides benefits over de novo OTU picking (clustering can be performed largely in parallel, reducing runtime) and closed-reference OTU picking (all reads are clustered, not only those that match a reference database sequence with high similarity). Because more of our algorithm can be run in parallel relative to "classic" open-reference OTU picking, it makes open-reference OTU picking tractable on massive amplicon sequence data sets (though on smaller data sets, "classic" open-reference OTU clustering is often faster). We illustrate that here by applying it to the first 15,000 samples sequenced for the Earth Microbiome Project (1.3 billion V4 16S rRNA amplicons). To the best of our knowledge, this is the largest OTU picking run ever performed, and we estimate that our new algorithm runs in less than 1/5 of the time that would be required by "classic" open-reference OTU picking. We show that subsampled open-reference OTU picking yields results that are highly correlated with those generated by "classic" open-reference OTU picking through comparisons on three well-studied datasets. An implementation of this algorithm is provided in the popular QIIME software package, which uses uclust for read clustering. All analyses were performed using QIIME's uclust wrappers, though we provide details (aided by the open-source code in our GitHub repository) that will allow implementation of subsampled open-reference OTU picking independently of QIIME (e.g., in a compiled programming language, where runtimes should be further reduced). Our analyses should generalize to other implementations of these OTU picking algorithms. Finally, we present a comparison of parameter settings in QIIME's OTU picking workflows and
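
    The difference between closed-reference and open-reference OTU picking described above can be sketched with a toy example. This is a simplified illustration and not QIIME's or uclust's implementation: it assumes equal-length, pre-aligned reads, a plain per-position identity measure, and hypothetical function names, and it shows the "classic" open-reference flow without the subsampling step.

```python
from typing import Dict, List, Tuple


def identity(a: str, b: str) -> float:
    """Fraction of matching positions; assumes equal-length, pre-aligned sequences."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closed_reference(reads: List[str], refs: Dict[str, str],
                     threshold: float) -> Tuple[Dict[str, List[str]], List[str]]:
    """Assign reads to reference sequences; reads without a hit are returned as failures."""
    otus: Dict[str, List[str]] = {}
    failures: List[str] = []
    for read in reads:
        hit = next((name for name, ref in refs.items()
                    if identity(read, ref) >= threshold), None)
        if hit is None:
            failures.append(read)
        else:
            otus.setdefault(hit, []).append(read)
    return otus, failures


def de_novo(reads: List[str], threshold: float) -> Dict[str, List[str]]:
    """Greedy clustering: a read joins the first matching centroid or founds a new OTU."""
    centroids: Dict[str, str] = {}
    otus: Dict[str, List[str]] = {}
    for read in reads:
        hit = next((name for name, c in centroids.items()
                    if identity(read, c) >= threshold), None)
        if hit is None:
            hit = f"denovo{len(centroids)}"
            centroids[hit] = read
        otus.setdefault(hit, []).append(read)
    return otus


def open_reference(reads: List[str], refs: Dict[str, str],
                   threshold: float = 0.9) -> Dict[str, List[str]]:
    """Closed-reference step first, then de novo clustering of the reads that failed."""
    otus, failures = closed_reference(reads, refs, threshold)
    otus.update(de_novo(failures, threshold))
    return otus


refs = {"ref1": "AAAAAAAAAA"}
reads = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC", "CCCCCCCCCG", "GGGGGGGGGG"]
closed_otus, failed = closed_reference(reads, refs, 0.9)
open_otus = open_reference(reads, refs, 0.9)
```

    Here the closed-reference step discards three reads that match no reference, while the open-reference flow keeps them in new OTUs. The closed-reference loop is independent per read and so parallelizes easily, whereas the greedy de novo loop is sequential, which is the bottleneck the subsampled variant aims to shrink.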

  17. Subsampled open-reference clustering creates consistent, comprehensive OTU definitions and scales to billions of sequences

    Directory of Open Access Journals (Sweden)

    Jai Ram Rideout

    2014-08-01

    Full Text Available. We present a performance-optimized algorithm, subsampled open-reference OTU picking, for assigning marker gene (e.g., 16S rRNA) sequences generated on next-generation sequencing platforms to operational taxonomic units (OTUs) for microbial community analysis. This algorithm provides benefits over de novo OTU picking (clustering can be performed largely in parallel, reducing runtime) and closed-reference OTU picking (all reads are clustered, not only those that match a reference database sequence with high similarity). Because more of our algorithm can be run in parallel relative to “classic” open-reference OTU picking, it makes open-reference OTU picking tractable on massive amplicon sequence data sets (though on smaller data sets, “classic” open-reference OTU clustering is often faster). We illustrate that here by applying it to the first 15,000 samples sequenced for the Earth Microbiome Project (1.3 billion V4 16S rRNA amplicons). To the best of our knowledge, this is the largest OTU picking run ever performed, and we estimate that our new algorithm runs in less than 1/5 of the time that would be required by “classic” open-reference OTU picking. We show that subsampled open-reference OTU picking yields results that are highly correlated with those generated by “classic” open-reference OTU picking through comparisons on three well-studied datasets. An implementation of this algorithm is provided in the popular QIIME software package, which uses uclust for read clustering. All analyses were performed using QIIME’s uclust wrappers, though we provide details (aided by the open-source code in our GitHub repository) that will allow implementation of subsampled open-reference OTU picking independently of QIIME (e.g., in a compiled programming language, where runtimes should be further reduced). Our analyses should generalize to other implementations of these OTU picking algorithms. Finally, we present a comparison of parameter settings in

  18. Reference genome-independent assessment of mutation density using restriction enzyme-phased sequencing

    Directory of Open Access Journals (Sweden)

    Monson-Miller Jennifer

    2012-02-01

    Full Text Available. Abstract. Background: The availability of low cost sequencing has spurred its application to discovery and typing of variation, including variation induced by mutagenesis. Mutation discovery is challenging as it requires a substantial amount of sequencing and analysis to detect very rare changes and distinguish them from noise. Also challenging are the cases when the organism of interest has not been sequenced or is highly divergent from the reference. Results: We describe the development of a simple method for reduced representation sequencing. Input DNA was digested with a single restriction enzyme and ligated to Y adapters modified to contain a sequence barcode and to provide a compatible overhang for ligation. We demonstrated the efficiency of this method at SNP discovery using rice and Arabidopsis. To test its suitability for the discovery of very rare SNPs, one control and three mutagenized rice individuals (1, 5 and 10 mM sodium azide) were used to prepare genomic libraries for Illumina sequencers by ligating barcoded adapters to NlaIII restriction sites. For genome-dependent discovery, 15-30 million 80-base reads per individual were aligned to the reference sequence, achieving individual sequencing coverage from 7 to 15×. We identified high-confidence base changes by comparing sequences across individuals and identified instances consistent with mutations, i.e. changes that were found in a single treated individual and were solely GC to AT transitions. For genome-independent discovery, 70-mers were extracted from the sequence of the control individual, and single-copy sequence was identified by comparing the 70-mers across samples to evaluate copy number and variation. This de novo "genome" was used to align the reads and identify mutations as above. Covering approximately 1/5 of the 380 Mb genome of rice, we detected mutation densities ranging from 0.6 to 4 per Mb of diploid DNA depending on the mutagenic treatment. Conclusions The
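
    The filtering rule stated above (a change present in a single treated individual and being a GC to AT transition) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the per-position base calls, the individual labels, and the function names are hypothetical, and the density helper simply divides a count by the number of covered megabases without any ploidy adjustment.

```python
from typing import Dict, List, Tuple

# GC-to-AT transitions as seen on either strand: G->A and C->T.
GC_TO_AT = {("G", "A"), ("C", "T")}


def candidate_mutations(calls: Dict[int, Dict[str, str]],
                        control: str) -> List[Tuple[int, str, str, str]]:
    """Return (position, individual, control_base, new_base) for changes that occur
    in exactly one non-control individual and are GC-to-AT transitions."""
    hits = []
    for pos, bases in sorted(calls.items()):
        ref = bases[control]
        changed = [(ind, b) for ind, b in bases.items() if ind != control and b != ref]
        if len(changed) == 1 and (ref, changed[0][1]) in GC_TO_AT:
            hits.append((pos, changed[0][0], ref, changed[0][1]))
    return hits


def density_per_mb(n_mutations: int, covered_bases: int) -> float:
    """Mutations per megabase of covered sequence."""
    return n_mutations / (covered_bases / 1e6)


# Hypothetical high-confidence base calls at four positions.
calls = {
    10: {"control": "G", "m1": "G", "m5": "A", "m10": "G"},  # single G->A: kept
    20: {"control": "C", "m1": "T", "m5": "T", "m10": "C"},  # shared by two: dropped
    30: {"control": "A", "m1": "A", "m5": "A", "m10": "G"},  # not GC->AT: dropped
    40: {"control": "C", "m1": "C", "m5": "C", "m10": "T"},  # single C->T: kept
}
hits = candidate_mutations(calls, control="control")
density = density_per_mb(len(hits), covered_bases=4_000_000)
```

    Changes shared between individuals are more likely to be pre-existing polymorphisms or systematic errors than induced mutations, which is why the single-individual condition is applied before counting.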

  19. Integrated sequence analysis. Final report

    International Nuclear Information System (INIS)

    Andersson, K.; Pyy, P.

    1998-02-01

    The NKS/RAK subproject 3 'integrated sequence analysis' (ISA) was formulated with the overall objective to develop and to test integrated methodologies in order to evaluate event sequences with significant human action contribution. The term 'methodology' denotes not only technical tools but also methods for integration of different scientific disciplines. In this report, we first discuss the background of ISA and the surveys made to map methods in different application fields, such as man machine system simulation software, human reliability analysis (HRA) and expert judgement. Specific event sequences were, after the surveys, selected for application and testing of a number of ISA methods. The event sequences discussed in the report were cold overpressure of BWR, shutdown LOCA of BWR, steam generator tube rupture of a PWR and BWR disturbed signal view in the control room after an external event. Different teams analysed these sequences by using different ISA and HRA methods. Two kinds of results were obtained from the ISA project: sequence specific and more general findings. The sequence specific results are discussed together with each sequence description. The general lessons are discussed under a separate chapter by using comparisons of different case studies. These lessons include areas ranging from plant safety management (design, procedures, instrumentation, operations, maintenance and safety practices) to methodological findings (ISA methodology, PSA, HRA, physical analyses, behavioural analyses and uncertainty assessment). Finally, the project is discussed and conclusions are presented. An interdisciplinary study of complex phenomena is a natural way to produce valuable and innovative results. This project came up with structured ways to perform ISA and managed to apply them in practice. The project also highlighted some areas where more work is needed. In the HRA work, development is required for the use of simulators and expert judgement as

  20. Computer-aided visualization and analysis system for sequence evaluation

    Energy Technology Data Exchange (ETDEWEB)

    Chee, Mark S.; Wang, Chunwei; Jevons, Luis C.; Bernhart, Derek H.; Lipshutz, Robert J.

    2004-05-11

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  1. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Full Text Available Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNPs but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  2. Fractals in DNA sequence analysis

    Institute of Scientific and Technical Information of China (English)

    Yu Zu-Guo(喻祖国); Vo Anh; Gong Zhi-Min(龚志民); Long Shun-Chao(龙顺潮)

    2002-01-01

    Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance, and even in biology. There has been an increasing interest in unravelling the mysteries of DNA; for example, how to distinguish coding and noncoding sequences, and the classification and evolutionary relationships of organisms, are key problems in bioinformatics. Although much research has been carried out by taking into consideration the long-range correlations in DNA sequences, and the global fractal dimension has been used in these works by other people, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view) to DNA sequence analysis. We have also used fractal dimension, correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.

  3. Effector-independent motor sequence representations exist in extrinsic and intrinsic reference frames.

    Science.gov (United States)

    Wiestler, Tobias; Waters-Metenier, Sheena; Diedrichsen, Jörn

    2014-04-02

    Many daily activities rely on the ability to produce meaningful sequences of movements. Motor sequences can be learned in an effector-specific fashion (such that benefits of training are restricted to the trained hand) or an effector-independent manner (meaning that learning also facilitates performance with the untrained hand). Effector-independent knowledge can be represented in extrinsic/world-centered or in intrinsic/body-centered coordinates. Here, we used functional magnetic resonance imaging (fMRI) and multivoxel pattern analysis to determine the distribution of intrinsic and extrinsic finger sequence representations across the human neocortex. Participants practiced four sequences with one hand for 4 d, and then performed these sequences during fMRI with both left and right hand. Between hands, these sequences were equivalent in extrinsic or intrinsic space, or were unrelated. In dorsal premotor cortex (PMd), we found that sequence-specific activity patterns correlated higher for extrinsic than for unrelated pairs, providing evidence for an extrinsic sequence representation. In contrast, primary sensory and motor cortices showed effector-independent representations in intrinsic space, with considerable overlap of the two reference frames in caudal PMd. These results suggest that effector-independent representations exist not only in world-centered, but also in body-centered coordinates, and that PMd may be involved in transforming sequential knowledge between the two. Moreover, although effector-independent sequence representations were found bilaterally, they were stronger in the hemisphere contralateral to the trained hand. This indicates that intermanual transfer relies on motor memories that are laid down during training in both hemispheres, but preferentially draws upon sequential knowledge represented in the trained hemisphere.

  4. Integrated sequence analysis. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Andersson, K.; Pyy, P

    1998-02-01

    The NKS/RAK subproject 3 'integrated sequence analysis' (ISA) was formulated with the overall objective to develop and to test integrated methodologies in order to evaluate event sequences with significant human action contribution. The term 'methodology' denotes not only technical tools but also methods for integration of different scientific disciplines. In this report, we first discuss the background of ISA and the surveys made to map methods in different application fields, such as man machine system simulation software, human reliability analysis (HRA) and expert judgement. Specific event sequences were, after the surveys, selected for application and testing of a number of ISA methods. The event sequences discussed in the report were cold overpressure of BWR, shutdown LOCA of BWR, steam generator tube rupture of a PWR and BWR disturbed signal view in the control room after an external event. Different teams analysed these sequences by using different ISA and HRA methods. Two kinds of results were obtained from the ISA project: sequence specific and more general findings. The sequence specific results are discussed together with each sequence description. The general lessons are discussed under a separate chapter by using comparisons of different case studies. These lessons include areas ranging from plant safety management (design, procedures, instrumentation, operations, maintenance and safety practices) to methodological findings (ISA methodology, PSA, HRA, physical analyses, behavioural analyses and uncertainty assessment). Finally, the project is discussed and conclusions are presented. An interdisciplinary study of complex phenomena is a natural way to produce valuable and innovative results. This project came up with structured ways to perform ISA and managed to apply them in practice. The project also highlighted some areas where more work is needed. In the HRA work, development is required for the use of simulators and expert judgement as

  5. Application of genotyping-by-sequencing on semiconductor sequencing platforms: a comparison of genetic and reference-based marker ordering in barley.

    Directory of Open Access Journals (Sweden)

    Martin Mascher

    Full Text Available The rapid development of next-generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach for whole-genome marker profiling in many species. With quickly developing sequencing technologies, adapting current GBS methodologies to new platforms will leverage these advancements for future studies. To test new semiconductor sequencing platforms for GBS, we genotyped a barley recombinant inbred line (RIL) population. Based on a previous GBS approach, we designed bar code and adapter sets for the Ion Torrent platforms. Four sets of 24-plex libraries were constructed consisting of 94 RILs and the two parents and sequenced on two Ion platforms. In parallel, a 96-plex library of the same RILs was sequenced on the Illumina HiSeq 2000. We applied two different computational pipelines to analyze sequencing data: the reference-independent TASSEL pipeline and a reference-based pipeline using SAMtools. Sequence contigs positioned on the integrated physical and genetic map were used for read mapping and variant calling. We found high agreement in genotype calls between the different platforms and high concordance between genetic and reference-based marker order. There was, however, a paucity in the number of SNPs that were jointly discovered by the different pipelines, indicating a strong effect of alignment and filtering parameters on SNP discovery. We show the utility of the current barley genome assembly as a framework for developing very low-cost genetic maps, facilitating high resolution genetic mapping and negating the need for developing de novo genetic maps for future studies in barley. Through demonstration of GBS on semiconductor sequencing platforms, we conclude that the GBS approach is amenable to a range of platforms and can easily be modified as new

  6. Reference voltage calculation method based on zero-sequence component optimisation for a regional compensation DVR

    Science.gov (United States)

    Jian, Le; Cao, Wang; Jintao, Yang; Yinge, Wang

    2018-04-01

    This paper describes the design of a dynamic voltage restorer (DVR) that can simultaneously protect several sensitive loads from voltage sags in a region of an MV distribution network. A novel reference voltage calculation method based on zero-sequence voltage optimisation is proposed for this DVR to optimise cost-effectiveness in compensation of voltage sags with different characteristics in an ungrounded neutral system. Based on a detailed analysis of the characteristics of voltage sags caused by different types of faults and the effect of the wiring mode of the transformer on these characteristics, the optimisation target of the reference voltage calculation is presented with several constraints. The reference voltages under all types of voltage sags are calculated by optimising the zero-sequence component, which can reduce the degree of swell in the phase-to-ground voltage after compensation to the maximum extent and can improve the symmetry degree of the output voltages of the DVR, thereby effectively increasing the compensation ability. The validity and effectiveness of the proposed method are verified by simulation and experimental results.

  7. Choice of reference sequence and assembler for alignment of Listeria monocytogenes short-read sequence data greatly influences rates of error in SNP analyses.

    Directory of Open Access Journals (Sweden)

    Arthur W Pightling

    Full Text Available The wide availability of whole-genome sequencing (WGS) and an abundance of open-source software have made detection of single-nucleotide polymorphisms (SNPs) in bacterial genomes an increasingly accessible and effective tool for comparative analyses. Thus, ensuring that real nucleotide differences between genomes (i.e., true SNPs) are detected at high rates and that the influences of errors (such as false positive SNPs, ambiguously called sites, and gaps) are mitigated is of utmost importance. The choices researchers make regarding the generation and analysis of WGS data can greatly influence the accuracy of short-read sequence alignments and, therefore, the efficacy of such experiments. We studied the effects of some of these choices, including: (i) depth of sequencing coverage, (ii) choice of reference-guided short-read sequence assembler, (iii) choice of reference genome, and (iv) whether to perform read-quality filtering and trimming, on our ability to detect true SNPs and on the frequencies of errors. We performed benchmarking experiments, during which we assembled simulated and real Listeria monocytogenes strain 08-5578 short-read sequence datasets of varying quality with four commonly used assemblers (BWA, MOSAIK, Novoalign, and SMALT), using reference genomes of varying genetic distances, and with or without read pre-processing (i.e., quality filtering and trimming). We found that assemblies of at least 50-fold coverage provided the most accurate results. In addition, MOSAIK yielded the fewest errors when reads were aligned to a nearly identical reference genome, while using SMALT to align reads against a reference sequence that is ∼0.82% distant from 08-5578 at the nucleotide level resulted in the detection of the greatest numbers of true SNPs and the fewest errors. Finally, we show that whether read pre-processing improves SNP detection depends upon the choice of reference sequence and assembler. In total, this study demonstrates that researchers

  8. Sequence-based comparative study of classical swine fever virus genogroup 2.2 isolate with pestivirus reference strains.

    Science.gov (United States)

    Kumar, Ravi; Rajak, Kaushal Kishor; Chandra, Tribhuwan; Muthuchelvan, Dhanavelu; Saxena, Arpit; Chaudhary, Dheeraj; Kumar, Ajay; Pandey, Awadh Bihari

    2015-09-01

    This study was undertaken with the aim to compare and establish the genetic relatedness between classical swine fever virus (CSFV) genogroup 2.2 isolate and pestivirus reference strains. The available complete genome sequences of CSFV/IND/UK/LAL-290 strain and other pestivirus reference strains were retrieved from GenBank. The complete genome sequence, complete open reading frame, 5' and 3' non-coding region (NCR) sequences were analyzed and compared with reference pestivirus strains. Clustal W model in MegAlign program of Lasergene 6.0 software was used for analysis of genetic heterogeneity. Phylogenetic analysis was carried out using MEGA 6.06 software package. The complete genome sequence alignment of CSFV/IND/UK/LAL-290 isolate and reference pestivirus strains showed 58.9-72% identities at the nucleotide level and 50.3-76.9% at amino acid level. Sequence homology of 5' and 3' NCRs was found to be 64.1-82.3% and 22.9-71.4%, respectively. In phylogenetic analysis, overall tree topology was found similar irrespective of sequences used in this study; however, whole genome phylogeny of pestivirus formed two main clusters, which further distinguished into the monophyletic clade of each pestivirus species. CSFV/IND/UK/LAL-290 isolate placed with the CSFV Eystrup strain in the same clade with close proximity to border disease virus and Aydin strains. CSFV/IND/UK/LAL-290 exhibited the analogous genomic organization to those of all reference pestivirus strains. Based on sequence identity and phylogenetic analysis, the isolate showed close homology to Aydin/04-TR virus and was distantly related to Bungowannah virus.

  9. No Reference Prediction of Quality Metrics for H.264 Compressed Infrared Image Sequences for UAV Applications

    DEFF Research Database (Denmark)

    Hossain, Kabir; Mantel, Claire; Forchhammer, Søren

    2018-01-01

    The framework for this research work is the acquisition of Infrared (IR) images from Unmanned Aerial Vehicles (UAV). In this paper we consider the No-Reference (NR) prediction of Full Reference Quality Metrics for Infrared (IR) video sequences which are compressed and thus distorted by an H.264...

  10. A STUDY ON DETERMINING THE REFERENCE SPREADING SEQUENCES FOR A DS/CDMA COMMUNICATION SYSTEM

    Directory of Open Access Journals (Sweden)

    Cebrail ÇİFTLİKLİ

    2002-02-01

    Full Text Available In a direct sequence/code division multiple access (DS/CDMA) system, the role of the spreading sequences (codes) is crucial since the multiple access interference (MAI) is the main performance limitation. In this study, we propose an accurate criterion which enables the determination of the reference spreading codes which yield lower bit error rates (BERs) in a given code set for a DS/CDMA system using despreading sequences weighted by stepping chip waveforms. The numerical results show that the spreading codes determined by the proposed criterion are the most suitable codes for use as references.

  11. Sequence analysis of Leukemia DNA

    Science.gov (United States)

    Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad. Isa

    2018-03-01

    Cancer is a very deadly disease; one form is leukemia, better known as blood cancer. Cancer cells can be detected by analyzing DNA in a laboratory test. This study focused on local alignment of leukemia and non-leukemia DNA sequence data obtained from NCBI, using the Smith-Waterman algorithm. The Smith-Waterman algorithm was invented by T. F. Smith and M. S. Waterman in 1981. The algorithm tries to find the greatest possible similarity between a pair of sequences by giving a negative value to unequal base pairs (mismatches) and a positive value to equal base pairs (matches). The maximum positive value then marks the end of the alignment, and the minimum value marks its start. This study will use sequences of leukemia and 3 sequences of non-leukemia.
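    The local-alignment scoring described in this abstract can be illustrated with a minimal Smith-Waterman sketch. The scoring values (match = 2, mismatch = -1, gap = -2) and the example sequences are assumptions chosen for illustration; the abstract does not state which parameters or sequences the study used.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are illustrative assumptions, not taken from the study.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below zero: that is what makes the alignment local
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the maximum cell until a zero cell is reached
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Hypothetical toy sequences sharing the segment "ACGT"
    print(smith_waterman("TTACGTTT", "GGACGTGG"))
```

    The traceback starts at the highest-scoring cell (the end of the alignment) and stops at the first zero cell (its start), which corresponds to the maximum and minimum values mentioned in the abstract.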

  12. Genome Sequencing and Analysis Conference IV

    Energy Technology Data Exchange (ETDEWEB)

    1993-12-31

    J. Craig Venter and C. Thomas Caskey co-chaired Genome Sequencing and Analysis Conference IV, held at Hilton Head, South Carolina, from September 26-30, 1992. Venter opened the conference by noting that approximately 400 researchers from 16 nations were present, four times as many participants as at Genome Sequencing Conference I in 1989. Venter also introduced the Data Fair, a new component of the conference allowing exchange and on-site computer analysis of unpublished sequence data.

  13. Routine Whole-Genome Sequencing for Outbreak Investigations of Staphylococcus aureus in a National Reference Center

    Directory of Open Access Journals (Sweden)

    Geraldine Durand

    2018-03-01

    Full Text Available The French National Reference Center for Staphylococci currently uses DNA arrays and spa typing for the initial epidemiological characterization of Staphylococcus aureus strains. We here describe the use of whole-genome sequencing (WGS) to investigate retrospectively four distinct and virulent S. aureus lineages [clonal complexes (CCs): CC1, CC5, CC8, CC30] involved in hospital and community outbreaks or sporadic infections in France. We used a WGS bioinformatics pipeline based on de novo assembly (reference-free approach), single nucleotide polymorphism analysis, and the inclusion of epidemiological markers. We examined the phylogeographic diversity of the French dominant hospital-acquired CC8-MRSA (methicillin-resistant S. aureus) Lyon clone through WGS analysis, which did not demonstrate evidence of large-scale geographic clustering. We analyzed sporadic cases along with two outbreaks of a CC1-MSSA (methicillin-susceptible S. aureus) clone containing the Panton–Valentine leukocidin (PVL), and results showed that two sporadic cases were closely related. We investigated an outbreak of PVL-positive CC30-MSSA in a school environment and were able to reconstruct the transmission history between eight families. We explored different outbreaks among newborns due to the CC5-MRSA Geraldine clone and we found evidence of an unsuspected link between two otherwise distinct outbreaks. Here, WGS provides the resolving power to disprove transmission events indicated by conventional methods (same sequence type, spa type, toxin profile, and antibiotic resistance profile) and, most importantly, WGS can reveal unsuspected transmission events. Therefore, WGS makes it possible to better describe and understand outbreaks and (inter)national dissemination of S. aureus lineages. Our findings underscore the importance of adding WGS for (inter)national surveillance of infections caused by virulent clones of S. aureus but also substantiate the fact that technological optimization at

  14. Time fluctuation analysis of forest fire sequences

    Science.gov (United States)

    Vega Orozco, Carmen D.; Kanevski, Mikhaïl; Tonini, Marj; Golay, Jean; Pereira, Mário J. G.

    2013-04-01

    Forest fires are complex events involving both space and time fluctuations. Understanding of their dynamics and pattern distribution is of great importance in order to improve the resource allocation and support fire management actions at local and global levels. This study aims at characterizing the temporal fluctuations of forest fire sequences observed in Portugal, which is the country that holds the largest wildfire land dataset in Europe. This research applies several exploratory data analysis measures to 302,000 forest fires that occurred from 1980 to 2007. The applied clustering measures are: Morisita clustering index, fractal and multifractal dimensions (box-counting), Ripley's K-function, Allan Factor, and variography. These algorithms enable a global time structural analysis describing the degree of clustering of a point pattern and defining whether the observed events occur randomly, in clusters or in a regular pattern. The considered methods are of general importance and can be used for other spatio-temporal events (e.g. crime, epidemiology, biodiversity, geomarketing, etc.). An important contribution of this research deals with the analysis and estimation of local measures of clustering that help in understanding their temporal structure. Each measure is described and executed for the raw data (forest fires geo-database) and results are compared to reference patterns generated under the null hypothesis of randomness (Poisson processes) embedded in the same time period of the raw data. This comparison enables estimating the degree of the deviation of the real data from a Poisson process. Generalizations to functional measures of these clustering methods, taking into account the phenomena, were also applied and adapted to detect time dependences in a measured variable (i.e. burned area). The time clustering of the raw data is compared several times with the Poisson processes at different thresholds of the measured function. Then, the clustering measure value

  15. Locus Reference Genomic sequences: An improved basis for describing human DNA variants

    KAUST Repository

    Dalgleish, Raymond; Flicek, Paul; Cunningham, Fiona; Astashyn, Alex; Tully, Raymond E; Proctor, Glenn; Chen, Yuan; McLaren, William M; Larsson, Pontus; Vaughan, Brendan W; Béroud, Christophe; Dobson, Glen; Lehväslaiho, Heikki; Taschner, Peter EM; den Dunnen, Johan T; Devereau, Andrew; Birney, Ewan; Brookes, Anthony J; Maglott, Donna R

    2010-01-01

    As our knowledge of the complexity of gene architecture grows, and we increase our understanding of the subtleties of gene expression, the process of accurately describing disease-causing gene variants has become increasingly problematic. In part, this is due to current reference DNA sequence formats that do not fully meet present needs. Here we present the Locus Reference Genomic (LRG) sequence format, which has been designed for the specific purpose of gene variant reporting. The format builds on the successful National Center for Biotechnology Information (NCBI) RefSeqGene project and provides a single-file record containing a uniquely stable reference DNA sequence along with all relevant transcript and protein sequences essential to the description of gene variants. In principle, LRGs can be created for any organism, not just human. In addition, we recognize the need to respect legacy numbering systems for exons and amino acids and the LRG format takes account of these. We hope that widespread adoption of LRGs - which will be created and maintained by the NCBI and the European Bioinformatics Institute (EBI) - along with consistent use of the Human Genome Variation Society (HGVS)-approved variant nomenclature will reduce errors in the reporting of variants in the literature and improve communication about variants affecting human health. Further information can be found on the LRG web site (http://www.lrg-sequence.org). 2010 Dalgleish et al.; licensee BioMed Central Ltd.

  16. Locus Reference Genomic sequences: An improved basis for describing human DNA variants

    KAUST Repository

    Dalgleish, Raymond

    2010-04-15

    As our knowledge of the complexity of gene architecture grows, and we increase our understanding of the subtleties of gene expression, the process of accurately describing disease-causing gene variants has become increasingly problematic. In part, this is due to current reference DNA sequence formats that do not fully meet present needs. Here we present the Locus Reference Genomic (LRG) sequence format, which has been designed for the specific purpose of gene variant reporting. The format builds on the successful National Center for Biotechnology Information (NCBI) RefSeqGene project and provides a single-file record containing a uniquely stable reference DNA sequence along with all relevant transcript and protein sequences essential to the description of gene variants. In principle, LRGs can be created for any organism, not just human. In addition, we recognize the need to respect legacy numbering systems for exons and amino acids and the LRG format takes account of these. We hope that widespread adoption of LRGs - which will be created and maintained by the NCBI and the European Bioinformatics Institute (EBI) - along with consistent use of the Human Genome Variation Society (HGVS)-approved variant nomenclature will reduce errors in the reporting of variants in the literature and improve communication about variants affecting human health. Further information can be found on the LRG web site (http://www.lrg-sequence.org). 2010 Dalgleish et al.; licensee BioMed Central Ltd.

  17. An optimum analysis sequence for environmental gamma-ray spectrometry

    Energy Technology Data Exchange (ETDEWEB)

    De la Torre, F.; Rios M, C.; Ruvalcaba A, M. G.; Mireles G, F.; Saucedo A, S.; Davila R, I.; Pinedo, J. L., E-mail: fta777@hotmail.co [Universidad Autonoma de Zacatecas, Centro Regional de Estudios Nucleares, Calle Cipres No. 10, Fracc. La Penuela, 98068 Zacatecas (Mexico)

    2010-10-15

    This work aims to obtain an optimum analysis sequence for environmental gamma-ray spectroscopy by means of Genie 2000 (Canberra). Twenty different analysis sequences were customized using different peak area percentages and different algorithms for: 1) peak finding, and 2) peak area determination, and with or without the use of a library (based on evaluated nuclear data) of common gamma-ray emitters in environmental samples. The use of an optimum analysis sequence with certified nuclear information avoids the problems caused by the significant variations in out-of-date nuclear parameters of commercial software libraries. Interference-free gamma ray energies with absolute emission probabilities greater than 3.75% were included in the customized library. The gamma-ray spectroscopy system (based on a Ge Re-3522 Canberra detector) was calibrated both in energy and shape by means of the IAEA-2002 reference spectra for software intercomparison. To test the performance of the analysis sequences, the IAEA-2002 reference spectrum was used. The z-score and the reduced χ² criteria were used to determine the optimum analysis sequence. The results show an appreciable variation in the peak area determinations and their corresponding uncertainties. In particular, the combination of second derivative peak locate with simple peak area integration algorithms provides the greatest accuracy. Lower accuracy comes from the combination of library directed peak locate algorithm and Genie's Gamma-M peak area determination. (Author)
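    As a rough illustration of how z-score and reduced χ² criteria can rank analysis sequences against reference peak areas, the sketch below uses one common formulation. The abstract does not give the formulas, so both definitions are assumptions: z = (measured - reference) / sqrt(u_measured² + u_reference²), and reduced χ² as the mean of the squared z-scores over the peaks. The function names and numbers are hypothetical.

```python
import math


def z_scores(measured, reference, u_measured, u_reference):
    """Per-peak z-scores: deviation from the reference area in units of the
    combined standard uncertainty (assumed definition, see lead-in)."""
    return [
        (m - r) / math.sqrt(um ** 2 + ur ** 2)
        for m, r, um, ur in zip(measured, reference, u_measured, u_reference)
    ]


def reduced_chi2(z, dof=None):
    """Sum of squared z-scores divided by the degrees of freedom.

    By default dof is the number of peaks (an assumption; pass dof
    explicitly if parameters were fitted to the same data).
    """
    dof = len(z) if dof is None else dof
    return sum(v * v for v in z) / dof


if __name__ == "__main__":
    # Hypothetical peak areas (counts) for two peaks
    z = z_scores([103.0, 98.0], [100.0, 100.0], [3.0, 2.0], [4.0, 0.0])
    print(z, reduced_chi2(z))
```

    Under these assumptions, an analysis sequence whose peak areas and uncertainties are consistent with the reference gives |z| values mostly below about 2 and a reduced χ² near 1; much larger values point to biased areas or underestimated uncertainties.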

  18. An optimum analysis sequence for environmental gamma-ray spectrometry

    International Nuclear Information System (INIS)

    De la Torre, F.; Rios M, C.; Ruvalcaba A, M. G.; Mireles G, F.; Saucedo A, S.; Davila R, I.; Pinedo, J. L.

    2010-10-01

    This work aims to obtain an optimum analysis sequence for environmental gamma-ray spectroscopy by means of Genie 2000 (Canberra). Twenty different analysis sequences were customized using different peak area percentages and different algorithms for: 1) peak finding, and 2) peak area determination, and with or without the use of a library (based on evaluated nuclear data) of common gamma-ray emitters in environmental samples. The use of an optimum analysis sequence with certified nuclear information avoids the problems caused by the significant variations in out-of-date nuclear parameters of commercial software libraries. Interference-free gamma ray energies with absolute emission probabilities greater than 3.75% were included in the customized library. The gamma-ray spectroscopy system (based on a Ge Re-3522 Canberra detector) was calibrated both in energy and shape by means of the IAEA-2002 reference spectra for software intercomparison. To test the performance of the analysis sequences, the IAEA-2002 reference spectrum was used. The z-score and the reduced χ² criteria were used to determine the optimum analysis sequence. The results show an appreciable variation in the peak area determinations and their corresponding uncertainties. In particular, the combination of second derivative peak locate with simple peak area integration algorithms provides the greatest accuracy. Lower accuracy comes from the combination of library directed peak locate algorithm and Genie's Gamma-M peak area determination. (Author)

  19. Diversity in non-repetitive human sequences not found in the reference genome.

    Science.gov (United States)

    Kehr, Birte; Helgadottir, Anna; Melsted, Pall; Jonsson, Hakon; Helgason, Hannes; Jonasdottir, Adalbjörg; Jonasdottir, Aslaug; Sigurdsson, Asgeir; Gylfason, Arnaldur; Halldorsson, Gisli H; Kristmundsdottir, Snaedis; Thorgeirsson, Gudmundur; Olafsson, Isleifur; Holm, Hilma; Thorsteinsdottir, Unnur; Sulem, Patrick; Helgason, Agnar; Gudbjartsson, Daniel F; Halldorsson, Bjarni V; Stefansson, Kari

    2017-04-01

    Genomes usually contain some non-repetitive sequences that are missing from the reference genome and occur only in a population subset. Such non-repetitive, non-reference (NRNR) sequences have remained largely unexplored in terms of their characterization and downstream analyses. Here we describe 3,791 breakpoint-resolved NRNR sequence variants called using PopIns from whole-genome sequence data of 15,219 Icelanders. We found that over 95% of the 244 NRNR sequences that are 200 bp or longer are present in chimpanzees, indicating that they are ancestral. Furthermore, 149 variant loci are in linkage disequilibrium (r² > 0.8) with a genome-wide association study (GWAS) catalog marker, suggesting disease relevance. Additionally, we report an association (P = 3.8 × 10⁻⁸, odds ratio (OR) = 0.92) with myocardial infarction (23,360 cases, 300,771 controls) for a 766-bp NRNR sequence variant. Our results underline the importance of including variation of all complexity levels when searching for variants that associate with disease.

  20. Robustness analysis of chiller sequencing control

    International Nuclear Information System (INIS)

    Liao, Yundan; Sun, Yongjun; Huang, Gongsheng

    2015-01-01

    Highlights: • Uncertainties in chiller sequencing control were systematically quantified. • Robustness of chiller sequencing control was systematically analyzed. • Different sequencing control strategies were sensitive to different uncertainties. • A numerical method was developed for easy selection of chiller sequencing control. Abstract: Multiple-chiller plants are commonly employed in heating, ventilating and air-conditioning systems to increase operational feasibility and energy efficiency under part-load conditions. In a multiple-chiller plant, chiller sequencing control plays a key role in achieving overall energy efficiency without sacrificing the cooling sufficiency needed for indoor thermal comfort. Various sequencing control strategies have been developed and implemented in practice. Two observations motivate this work: (i) uncertainty, which cannot be avoided in chiller sequencing control, has a significant impact on control performance and may cause the control to fail to achieve the expected control and/or energy performance; and (ii) few studies in the current literature have systematically addressed this issue. This paper therefore presents a robustness analysis of chiller sequencing control in order to understand the robustness of various chiller sequencing control strategies under different types of uncertainty. Based on the robustness analysis, a simple and applicable method is developed to select the most robust control strategy for a given chiller plant in the presence of uncertainties, which will be verified using case studies.

  1. Unraveling systematic inventory of Echinops (Asteraceae) with special reference to nrDNA ITS sequence-based molecular typing of Echinops abuzinadianus.

    Science.gov (United States)

    Ali, M A; Al-Hemaid, F M; Lee, J; Hatamleh, A A; Gyulai, G; Rahman, M O

    2015-10-02

    The present study explored the systematic inventory of Echinops L. (Asteraceae) of Saudi Arabia, with special reference to the molecular typing of Echinops abuzinadianus Chaudhary, an endemic species to Saudi Arabia, based on the internal transcribed spacer (ITS) sequences (ITS1-5.8S-ITS2) of nuclear ribosomal DNA. A sequence similarity search using BLAST and a phylogenetic analysis of the ITS sequence of E. abuzinadianus revealed a high level of sequence similarity with E. glaberrimus DC. (section Ritropsis). The novel primary sequence and the secondary structure of ITS2 of E. abuzinadianus could potentially be used for molecular genotyping.

  2. Differential DNA Methylation Analysis without a Reference Genome

    Directory of Open Access Journals (Sweden)

    Johanna Klughammer

    2015-12-01

    Genome-wide DNA methylation mapping uncovers epigenetic changes associated with animal development, environmental adaptation, and species evolution. To address the lack of high-throughput methods for DNA methylation analysis in non-model organisms, we developed an integrated approach for studying DNA methylation differences independent of a reference genome. Experimentally, our method relies on an optimized 96-well protocol for reduced representation bisulfite sequencing (RRBS), which we have validated in nine species (human, mouse, rat, cow, dog, chicken, carp, sea bass, and zebrafish). Bioinformatically, we developed the RefFreeDMA software to deduce ad hoc genomes directly from RRBS reads and to pinpoint differentially methylated regions between samples or groups of individuals (http://RefFreeDMA.computational-epigenetics.org). The identified regions are interpreted using motif enrichment analysis and/or cross-mapping to annotated genomes. We validated our method by reference-free analysis of cell-type-specific DNA methylation in the blood of human, cow, and carp. In summary, we present a cost-effective method for epigenome analysis in ecology and evolution, which enables epigenome-wide association studies in natural populations and species without a reference genome.

  3. Probabilistic accident sequence recovery analysis

    International Nuclear Information System (INIS)

    Stutzke, Martin A.; Cooper, Susan E.

    2004-01-01

    Recovery analysis is a method that considers alternative strategies for preventing accidents in nuclear power plants during probabilistic risk assessment (PRA). Consideration of possible recovery actions in PRAs has been controversial, and there seems to be a widely held belief among PRA practitioners, utility staff, plant operators, and regulators that the results of recovery analysis should be skeptically viewed. This paper provides a framework for discussing recovery strategies, thus lending credibility to the process and enhancing regulatory acceptance of PRA results and conclusions. (author)

  4. Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods.

    Science.gov (United States)

    Martin, Guillaume; Baurens, Franc-Christophe; Droc, Gaëtan; Rouard, Mathieu; Cenci, Alberto; Kilian, Andrzej; Hastie, Alex; Doležel, Jaroslav; Aury, Jean-Marc; Alberti, Adriana; Carreel, Françoise; D'Hont, Angélique

    2016-03-16

    Recent advances in genomics indicate functional significance of a majority of genome sequences and their long range interactions. As a detailed examination of genome organization and function requires very high quality genome sequence, the objective of this study was to improve reference genome assembly of banana (Musa acuminata). We have developed a modular bioinformatics pipeline to improve genome sequence assemblies, which can handle various types of data. The pipeline comprises several semi-automated tools. However, unlike classical automated tools that are based on global parameters, the semi-automated tools proposed an expert mode for a user who can decide on suggested improvements through local compromises. The pipeline was used to improve the draft genome sequence of Musa acuminata. Genotyping by sequencing (GBS) of a segregating population and paired-end sequencing were used to detect and correct scaffold misassemblies. Long insert size paired-end reads identified scaffold junctions and fusions missed by automated assembly methods. GBS markers were used to anchor scaffolds to pseudo-molecules with a new bioinformatics approach that avoids the tedious step of marker ordering during genetic map construction. Furthermore, a genome map was constructed and used to assemble scaffolds into super scaffolds. Finally, a consensus gene annotation was projected on the new assembly from two pre-existing annotations. This approach reduced the total Musa scaffold number from 7513 to 1532 (i.e. by 80%), with an N50 that increased from 1.3 Mb (65 scaffolds) to 3.0 Mb (26 scaffolds). 89.5% of the assembly was anchored to the 11 Musa chromosomes compared to the previous 70%. Unknown sites (N) were reduced from 17.3 to 10.0%. The release of the Musa acuminata reference genome version 2 provides a platform for detailed analysis of banana genome variation, function and evolution. Bioinformatics tools developed in this work can be used to improve genome sequence assemblies in
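
    The N50 figures quoted above (with the scaffold counts in parentheses) follow a standard assembly metric that a short sketch can make concrete. Sort the scaffolds from longest to shortest and accumulate their lengths. The N50 is the length of the scaffold at which the running total first reaches half of the assembly size, and the number of scaffolds needed to get there is often called the L50. The function name and the toy scaffold lengths below are hypothetical. Only the scaffold totals 7513 and 1532 come from the abstract.

```python
def n50(lengths):
    """Return (N50, L50) for a collection of scaffold lengths.

    N50: length of the scaffold at which the cumulative length, taken
         from the longest scaffold downwards, first reaches half of the
         total assembly length.
    L50: number of scaffolds needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("no scaffold lengths given")


# Toy assembly (lengths in kb), not data from the study.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))

# Relative reduction in scaffold number reported in the abstract.
reduction_pct = 100 * (7513 - 1532) / 7513
print(round(reduction_pct, 1))
```

    A higher N50 reached with fewer scaffolds, as reported for the new assembly, indicates that more of the genome sits in long contiguous pieces.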

  5. Nordic reference study on uncertainty and sensitivity analysis

    International Nuclear Information System (INIS)

    Hirschberg, S.; Jacobsson, P.; Pulkkinen, U.; Porn, K.

    1989-01-01

    This paper provides a review of the first phase of the Nordic reference study on uncertainty and sensitivity analysis. The main objective of this study is to use experience from previous Nordic Benchmark Exercises and reference studies concerning critical modeling issues, such as common cause failures and human interactions, and to demonstrate the impact of the associated uncertainties on the uncertainty of the investigated accident sequence. This has been done independently by three working groups, which used different approaches to modeling and to uncertainty analysis. The estimated uncertainty interval for the analyzed accident sequence is large. The discrepancies between the groups are also substantial but can be explained. The sensitivity analyses that have been carried out concern, e.g., the use of different CCF-quantification models, alternative handling of CCF data, time windows for operator actions and time dependences in phase mission operation, the impact of state-of-knowledge dependences, and the ranking of dominating uncertainty contributors. Specific findings with respect to these issues are summarized in the paper.

  6. A BRCA2 mutation incorrectly mapped in the original BRCA2 reference sequence, is a common West Danish founder mutation disrupting mRNA splicing

    DEFF Research Database (Denmark)

    Thomassen, Mads; Pedersen, Inge Søkilde; Vogel, Ida

    2011-01-01

    Inherited mutations in the tumor suppressor genes BRCA1 and BRCA2 predispose carriers to breast and ovarian cancer. The authors have identified a mutation in BRCA2, 7845+1G>A (c.7617+1G>A), not previously regarded as deleterious because of incorrect mapping of the splice junction in the originally published genomic reference sequence. This reference sequence is generally used in many laboratories and it maps the mutation 16 base pairs inside intron 15. However, according to the recent reference sequences the mutation is located in the consensus donor splice sequence. By reverse transcriptase analysis, loss of exon 15 in the final transcript interrupting the open reading frame was demonstrated. Furthermore, the mutation segregates with a cancer phenotype in 18 Danish families. By genetic analysis of more than 3,500 Danish breast/ovarian cancer risk families, the mutation was identified as the most...

  7. Sequence analysis by iterated maps, a review.

    Science.gov (United States)

    Almeida, Jonas S

    2014-05-01

    Among alignment-free methods, Iterated Maps (IMs) sit at a particular extreme: they are also scale free (order free). The use of IMs for sequence analysis is also distinct from other alignment-free methodologies in being rooted in statistical mechanics instead of computational linguistics. Both of these roots go back over two decades to the use of fractal geometry in the characterization of phase-space representations. The time series analysis origin of the field is betrayed by the title of the manuscript that started this alignment-free subdomain in 1990, 'Chaos Game Representation'. The clash between the analysis of sequences as continuous series and the better established use of Markovian approaches to discrete series was almost immediate, with a defining critique published in the same journal 2 years later. The rest of that decade would go by before the scale-free nature of the IM space was uncovered. The ensuing decade saw this scalability generalized for non-genomic alphabets as well as an interest in its use for graphic representation of biological sequences. Finally, in the past couple of years, in step with the emergence of BigData and MapReduce as a new computational paradigm, there is a surprising third act in the IM story. Multiple reports have described gains in computational efficiency of multiple orders of magnitude over more conventional sequence analysis methodologies. The stage appears to be now set for a recasting of IMs with a central role in processing nextgen sequencing results.
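
    The 'Chaos Game Representation' mentioned above can be sketched in a few lines. Each nucleotide is assigned a corner of the unit square, and the current position moves halfway towards the corner of each successive base. The corner assignment below is one common convention and is an assumption here, as are the function name and the starting point at the centre of the square. The review itself does not specify them.

```python
# Assumed corner assignment for the four nucleotides.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def chaos_game(sequence, start=(0.5, 0.5)):
    """Return the list of iterated-map coordinates for a DNA sequence.

    Each step moves the current point halfway towards the corner of the
    next base. Symbols other than A, C, G, T raise a KeyError.
    """
    x, y = start
    points = []
    for base in sequence.upper():
        corner_x, corner_y = CORNERS[base]
        x = (x + corner_x) / 2
        y = (y + corner_y) / 2
        points.append((x, y))
    return points


print(chaos_game("AG"))
```

    In this construction the quadrant holding the final point identifies the last base, the sub-quadrant identifies the last two bases, and so on. This nesting is one way to read the scale-free property that the abstract describes.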

  8. Validation of Genotyping-By-Sequencing Analysis in Populations of Tetraploid Alfalfa by 454 Sequencing

    Science.gov (United States)

    Rocher, Solen; Jean, Martine; Castonguay, Yves; Belzile, François

    2015-01-01

    Genotyping-by-sequencing (GBS) is a relatively low-cost high throughput genotyping technology based on next generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids. PMID:26115486

  9. Preliminary hazard analysis using sequence tree method

    International Nuclear Information System (INIS)

    Huang Huiwen; Shih Chunkuan; Hung Hungchih; Chen Minghuei; Yih Swu; Lin Jiinming

    2007-01-01

    A system-level PHA using the sequence tree method was developed to perform SSA of safety-related digital I&C systems. The conventional PHA is a brainstorming session among experts on various portions of the system to identify hazards through discussion. However, the conventional PHA is not a systematic technique: the analysis results depend strongly on the experts' subjective opinions, and the analysis quality cannot be appropriately controlled. This research therefore developed a system-level, sequence-tree-based PHA, which can clarify the relationship among the major digital I&C systems. The technique has two major phases. The first phase uses a table to analyze each event in SAR Chapter 15 for a specific safety-related I&C system, such as RPS. The second phase uses a sequence tree to recognize which I&C systems are involved in the event, how the safety-related systems work, and how the backup systems can be activated to mitigate the consequences if the primary safety systems fail. In the sequence tree, the defense-in-depth echelons, including the Control echelon, Reactor trip echelon, ESFAS echelon, and Indication and display echelon, are arranged to construct the tree structure. All the related I&C systems, including digital systems and the analog back-up systems, are allocated to their specific echelons. With this system-centric, sequence-tree-based analysis, not only can preliminary hazards be identified systematically, but the vulnerability of the nuclear power plant can also be recognized. Therefore, an effective simplified D3 evaluation can be performed as well. (author)

  10. Sequencing and de novo assembly of 150 genomes from Denmark as a population reference

    DEFF Research Database (Denmark)

    Maretty, Lasse; Jensen, Jacob Malte; Petersen, Bent

    2017-01-01

    Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits. Genetic variation is identified mainly by mapping short reads to the reference genome or by performing local assembly. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology. We use the assemblies to identify a rich set...

  12. Digital image sequence processing, compression, and analysis

    CERN Document Server

    Reed, Todd R

    2004-01-01

    Introduction, by Todd R. Reed; CONTENT-BASED IMAGE SEQUENCE REPRESENTATION, by Pedro M. Q. Aguiar, Radu S. Jasinschi, José M. F. Moura, and Charnchai Pluempitiwiriyawej; THE COMPUTATION OF MOTION, by Christoph Stiller, Sören Kammel, Jan Horn, and Thao Dang; MOTION ANALYSIS AND DISPLACEMENT ESTIMATION IN THE FREQUENCY DOMAIN, by Luca Lucchese and Guido Maria Cortelazzo; QUALITY OF SERVICE ASSESSMENT IN NEW GENERATION WIRELESS VIDEO COMMUNICATIONS, by Gaetano Giunta; ERROR CONCEALMENT IN DIGITAL VIDEO, by Francesco G.B. De Natale; IMAGE SEQUENCE RESTORATION: A WIDER PERSPECTIVE, by Anil Kokaram; VIDEO SUMMARIZATION, by Cuneyt M. Taskiran and Edward

  13. Iterative normalization technique for reference sequence generation for zero-tail discrete Fourier transform spread orthogonal frequency division multiplexing

    DEFF Research Database (Denmark)

    2017-01-01

    Systems, methods, apparatuses, and computer program products for generating sequences for zero-tail discrete Fourier transform (DFT)-spread-orthogonal frequency division multiplexing (OFDM) (ZT DFT-s-OFDM) reference signals. One method includes adding a zero vector to an input sequence...... of each of the elements, converting the sequence to time domain, generating a zero-padded sequence by forcing a zero head and tail of the sequence, and repeating the steps until a final sequence with zero-tail and flat frequency response is obtained....

  14. The development of rhythmic attending in auditory sequences: attunement, referent period, focal attending.

    Science.gov (United States)

    Drake, C; Jones, M R; Baruch, C

    2000-12-15

    This paper is divided into three sections. The first section is theoretical; it extends Dynamic Attending Theory (Jones, M. R. Psychological Review 83 (1976) 323; Jones, M. R. Perception and Psychophysics 41(6) (1987) 631; Jones, M. R. Psychomusicology 9(2) (1990) 193; Jones, M. R., & Boltz, M. Psychological Review 96(3) (1989) 459) to developmental questions concerning tempo and time hierarchies. Generally Dynamic Attending Theory proposes that, when listening to a complex auditory sequence, listeners spontaneously focus on events occurring at an intermediate rate (the referent level), and they then may shift attention to events occurring over longer or shorter time spans, that is at lower (faster) or higher (slower) hierarchical levels (focal attending). The second section of the paper is experimental. It examines maturational changes of three dynamic attending activities involving referent period and level, attunement, and focal attending. Tasks involve both motor tapping (including spontaneous motor tempo and synchronization with simple sequences and music) and tempo discrimination. We compare performances by 4-, 6-, 8-, and 10-year-old children and adults, with or without musical training. Results indicate three changes with increased age and musical training: (1) a slowing of the mean spontaneous tapping rate (a reflection of the referent period) and mean synchronization rate (a reflection of the referent level), (2) enhanced ability to synchronize tapping and discriminate tempo (improved attunement), and (3) an enlarged range of tapping rates towards slower rates and higher hierarchical levels (improved focal attending). A final section considers results in light of the theory proposed here. It is suggested that growth trends can be expressed in terms of listeners' engagement of slower attending oscillators with age and experience, accompanied by the passage from the initial use of a single oscillator towards the coupling of multiple oscillators.

  15. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    Phylogenetic analysis suggests that our sequences are clustered with sequences reported from Japan. This is the first phylogenetic analysis of HCV core gene from Pakistani population. Our sequences and sequences from Japan are grouped into same cluster in the phylogenetic tree. Sequence comparison and ...

  16. [Complete genome sequencing and sequence analysis of BCG Tice].

    Science.gov (United States)

    Wang, Zhiming; Pan, Yuanlong; Wu, Jun; Zhu, Baoli

    2012-10-04

    The objective of this study was to obtain the complete genome sequence of Bacillus Calmette-Guerin Tice (BCG Tice), in order to provide more information about the molecular biology of BCG Tice and to design more rational vaccines to prevent tuberculosis. We assembled the data from high-throughput sequencing with SOAPdenovo software, obtaining many contigs and scaffolds. Many sequence gaps and physical gaps remained as a result of regional low coverage and low quality. We designed primers at the ends of contigs and performed PCR amplification in order to link these contigs and scaffolds. By using various enzymes for PCR amplification, adjusting the PCR reaction conditions, and constructing clones for sequencing, we closed all the gaps. We obtained the complete genome sequence of BCG Tice and submitted it to GenBank at the National Center for Biotechnology Information (NCBI). The genome of BCG Tice is 4,334,064 base pairs in length, with a GC content of 65.65%. The problems and strategies of the finishing step of BCG Tice sequencing are described here, in the hope of offering some experience to those involved in the finishing step of genome sequencing. The microarray data were verified by our results.
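
    The genome length and GC content reported above are simple summary statistics of a finished sequence. The sketch below shows how such a percentage is usually computed. The function name and the toy sequence are hypothetical, and ignoring ambiguous symbols such as N is an assumption made here, not a detail taken from the study.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored (an assumption of this sketch).
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / acgt


toy_sequence = "GGCCAT"  # toy example, not BCG Tice data
print(len(toy_sequence), round(gc_content(toy_sequence), 2))
```

    Applied to the complete genome in place of the toy string, the same calculation would give the length in base pairs and a GC percentage comparable to the reported figure.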

  17. OTU analysis using metagenomic shotgun sequencing data.

    Directory of Open Access Journals (Sweden)

    Xiaolin Hao

    Because of technological limitations, the primer and amplification biases in targeted sequencing of 16S rRNA genes have veiled the true microbial diversity underlying environmental samples. However, metagenomic shotgun sequencing provides 16S rRNA gene fragment data that are naturally immune to the biases raised during priming, and thus has the potential to uncover the true structure of the microbial community by giving more accurate predictions of operational taxonomic units (OTUs). Nonetheless, the lack of statistically rigorous comparison between 16S rRNA gene fragments and other data types makes it difficult to interpret previously reported results using 16S rRNA gene fragments. Therefore, in the present work, we established a standard analysis pipeline that helps confirm whether the differences in the data are true or are just due to potential technical bias. This pipeline is built by using simulated data to find optimal mapping and OTU prediction methods. The comparison between simulated datasets revealed that, with our proposed pipeline, a 16S rRNA gene fragment longer than 150 bp provides the same accuracy as a full-length 16S rRNA sequence. This finding could serve as a good starting point for experimental design and makes comparison between 16S rRNA gene fragment-based and targeted 16S rRNA sequencing-based surveys possible.

  18. Analysis and Visualization Tool for Targeted Amplicon Bisulfite Sequencing on Ion Torrent Sequencers.

    Directory of Open Access Journals (Sweden)

    Stephan Pabinger

    Targeted sequencing of PCR amplicons generated from bisulfite deaminated DNA is a flexible, cost-effective way to study methylation of a sample at single CpG resolution and perform subsequent multi-target, multi-sample comparisons. Currently, no platform-specific protocol, support, or analysis solution is provided to perform targeted bisulfite sequencing on a Personal Genome Machine (PGM). Here, we present a novel tool, called TABSAT, for analyzing targeted bisulfite sequencing data generated on Ion Torrent sequencers. The workflow starts with raw sequencing data, performs quality assessment, and uses a tailored version of Bismark to map the reads to a reference genome. The pipeline visualizes results as lollipop plots and is able to deduce specific methylation patterns present in a sample. The obtained profiles are then summarized and compared between samples. In order to assess the performance of the targeted bisulfite sequencing workflow, 48 samples were used to generate 53 different Bisulfite-Sequencing PCR amplicons from each sample, resulting in 2,544 amplicon targets. We obtained a mean coverage of 282X using 1,196,822 aligned reads. Next, we compared the sequencing results of these targets to the methylation level of the corresponding sites on an Illumina 450k methylation chip. The calculated average Pearson correlation coefficient of 0.91 confirms the sequencing results with one of the industry-leading CpG methylation platforms and shows that targeted amplicon bisulfite sequencing provides an accurate and cost-efficient method for DNA methylation studies, e.g., to provide platform-independent confirmation of Illumina Infinium 450k methylation data. TABSAT offers a novel way to analyze data generated by Ion Torrent instruments and can also be used with data from the Illumina MiSeq platform. It can be easily accessed via the Platomics platform, which offers a web-based graphical user interface along with sample and parameter storage
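
    The comparison between sequencing-based and array-based methylation levels described above rests on the Pearson correlation coefficient. A minimal sketch of that calculation follows. It is not TABSAT code. The function name and the per-CpG methylation values are hypothetical. Only the sample and amplicon counts (48 and 53) come from the abstract.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical methylation levels (fraction methylated) at four CpG sites.
sequencing = [0.10, 0.45, 0.80, 0.95]
array_chip = [0.15, 0.40, 0.85, 0.90]
print(round(pearson(sequencing, array_chip), 3))

# Number of amplicon targets stated in the abstract: 48 samples x 53 amplicons.
print(48 * 53)
```

    A coefficient close to 1 means the two platforms rank and scale the methylation levels in nearly the same way. The abstract reports an average of 0.91 across its targets.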

  19. The zebrafish reference genome sequence and its relationship to the human genome

    Science.gov (United States)

    Howe, Kerstin; Clark, Matthew D.; Torroja, Carlos F.; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E.; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C.; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T.; Guerra-Assunção, José A.; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F.; Laird, Gavin K.; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; Woodmansey, Rebecca; Clark, Graham; Cooper, James; Tromans, 
Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M.; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Carter, Nigel P.; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M. J.; Enright, Anton; Geisler, Robert; Plasterk, Ronald H. A.; Lee, Charles; Westerfield, Monte; de Jong, Pieter J.; Zon, Leonard I.; Postlethwait, John H.; Nüsslein-Volhard, Christiane; Hubbard, Tim J. P.; Crollius, Hugues Roest; Rogers, Jane; Stemple, Derek L.

    2013-01-01

    Zebrafish have become a popular organism for the study of vertebrate gene function [1,2]. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease [3–5]. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes [6], the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination. PMID:23594743

  20. The zebrafish reference genome sequence and its relationship to the human genome.

    Science.gov (United States)

    Howe, Kerstin; Clark, Matthew D; Torroja, Carlos F; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T; Guerra-Assunção, José A; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F; Laird, Gavin K; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Elliot, David; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Begum, Sharmin; Mortimore, Beverley; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Lloyd, Christine; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; 
Woodmansey, Rebecca; Clark, Graham; Cooper, James D; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Lanz, Christa; Raddatz, Günter; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Schuster, Stephan C; Carter, Nigel P; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M J; Enright, Anton; Geisler, Robert; Plasterk, Ronald H A; Lee, Charles; Westerfield, Monte; de Jong, Pieter J; Zon, Leonard I; Postlethwait, John H; Nüsslein-Volhard, Christiane; Hubbard, Tim J P; Roest Crollius, Hugues; Rogers, Jane; Stemple, Derek L

    2013-04-25

    Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.

  1. Sequence Matching Analysis for Curriculum Development

    Directory of Open Access Journals (Sweden)

    Liem Yenny Bendatu

    2015-06-01

    Many organizations apply information technologies to support their business processes. With these technologies, actual events are recorded and checked for conformance with a predefined model. Conformance checking is an approach for measuring the fitness and appropriateness between a process model and the actual events. However, when multiple events share the same timestamp, the traditional approach is unable to produce such measures. This study attempts to develop a sequence matching analysis. Taking conformance checking as its basis, the proposed approach uses the current control-flow technique from the process mining domain. A case study in the field of educational processes was conducted. The study also proposes a curriculum analysis framework to test the proposed approach. By considering the learning sequences of students, it yields several measurements for curriculum development. Finally, the results of the proposed approach were verified by relevant instructors for further development.

  2. Certified reference materials for organic analysis

    International Nuclear Information System (INIS)

    QianChuanfan

    2002-01-01

    This presentation discusses the requirements for certified reference materials (CRMs) to be used in measurements of pesticide residues in food and environmental samples. It deals with the types of CRMs, matrix selection, sample preparation and representativeness. It also discusses the CRM validity period and gives some examples of CRM preparation.

  3. Analysis of Pteridium ribosomal RNA sequences by rapid direct sequencing.

    Science.gov (United States)

    Tan, M K

    1991-08-01

    A total of 864 bases from 5 regions interspersed in the 18S and 26S rRNA molecules from various clones of Pteridium covering the general geographical distribution of the genus was analysed using a rapid rRNA sequencing technique. No base difference has been detected amongst the three major lineages, two of which apparently separated before the breakup of the ancient supercontinent, Pangaea. These regions of the rRNA sequences have thus been conserved for at least 160 million years and are here compared with other eukaryotic, especially plant rRNAs.

  4. De novo clustering methods outperform reference-based methods for assigning 16S rRNA gene sequences to operational taxonomic units

    Directory of Open Access Journals (Sweden)

    Sarah L. Westcott

    2015-12-01

    Background. 16S rRNA gene sequences are routinely assigned to operational taxonomic units (OTUs) that are then used to analyze complex microbial communities. A number of methods have been employed to carry out the assignment of 16S rRNA gene sequences to OTUs, leading to confusion over which method is optimal. A recent study suggested that a clustering method should be selected based on its ability to generate stable OTU assignments that do not change as additional sequences are added to the dataset. In contrast, we contend that the quality of the OTU assignments, the ability of the method to properly represent the distances between the sequences, is more important. Methods. Our analysis implemented six de novo clustering algorithms (single linkage, complete linkage, average linkage, abundance-based greedy clustering, distance-based greedy clustering, and Swarm) as well as the open- and closed-reference methods. Using two previously published datasets, we used the Matthews correlation coefficient (MCC) to assess the stability and quality of OTU assignments. Results. The stability of OTU assignments did not reflect the quality of the assignments. Depending on the dataset being analyzed, the average linkage and the distance- and abundance-based greedy clustering methods generated OTUs that were more likely to represent the actual distances between sequences than the open- and closed-reference methods. We also demonstrated that for the greedy algorithms VSEARCH produced assignments that were comparable to those produced by USEARCH, making VSEARCH a viable free and open source alternative to USEARCH. Further interrogation of the reference-based methods indicated that when USEARCH or VSEARCH were used to identify the closest reference, the OTU assignments were sensitive to the order of the reference sequences because the reference sequences can be identical over the region being considered.
More troubling was the observation that while both USEARCH and
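The abstract above scores OTU quality with a Matthews correlation coefficient. A minimal sketch of one way such a score can be computed over sequence pairs follows. It assumes that a pair counts as a true positive when the two sequences share an OTU and lie within the distance threshold; the function name `pairwise_mcc`, the toy distances and the 0.03 threshold are illustrative and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt

def pairwise_mcc(otu_of, dist, threshold):
    """Matthews correlation coefficient over all unordered sequence pairs.

    otu_of: dict mapping sequence name -> OTU label
    dist: dict mapping frozenset({a, b}) -> pairwise distance
    threshold: distance at or below which two sequences should share an OTU
    """
    tp = tn = fp = fn = 0
    for a, b in combinations(sorted(otu_of), 2):
        close = dist[frozenset((a, b))] <= threshold
        together = otu_of[a] == otu_of[b]
        if together and close:
            tp += 1      # clustered together and genuinely similar
        elif together:
            fp += 1      # clustered together but too distant
        elif close:
            fn += 1      # similar but split across OTUs
        else:
            tn += 1      # distant and kept apart
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy data (hypothetical): a~b and c~d are close, every other pair is distant.
names = ["a", "b", "c", "d"]
toy_dist = {frozenset(p): 0.10 for p in combinations(names, 2)}
toy_dist[frozenset(("a", "b"))] = 0.01
toy_dist[frozenset(("c", "d"))] = 0.02

good = {"a": 1, "b": 1, "c": 2, "d": 2}    # matches the distances
lumped = {"a": 1, "b": 1, "c": 1, "d": 2}  # c is placed in the wrong OTU

print(pairwise_mcc(good, toy_dist, 0.03))
print(pairwise_mcc(lumped, toy_dist, 0.03))
```

An assignment that agrees with the distances on every pair scores 1, while the lumped assignment here scores 0, which is why the measure can separate assignment quality from assignment stability.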

  5. Mapping of Micro-Tom BAC-End Sequences to the Reference Tomato Genome Reveals Possible Genome Rearrangements and Polymorphisms

    Science.gov (United States)

    Asamizu, Erika; Shirasawa, Kenta; Hirakawa, Hideki; Sato, Shusei; Tabata, Satoshi; Yano, Kentaro; Ariizumi, Tohru; Shibata, Daisuke; Ezura, Hiroshi

    2012-01-01

    A total of 93,682 BAC-end sequences (BESs) were generated from a dwarf model tomato, cv. Micro-Tom. After removing repetitive sequences, the BESs were similarity searched against the reference tomato genome of a standard cultivar, “Heinz 1706.” By referring to the “Heinz 1706” physical map and by eliminating redundant or nonsignificant hits, 28,804 “unique pair ends” and 8,263 “unique ends” were selected to construct hypothetical BAC contigs. The total physical length of the BAC contigs was 495,833,423 bp, covering 65.3% of the entire genome. The average coverage of euchromatin and heterochromatin was 58.9% and 67.3%, respectively. From this analysis, two possible genome rearrangements were identified: one in chromosome 2 (inversion) and the other in chromosome 3 (inversion and translocation). Polymorphisms (SNPs and Indels) between the two cultivars were identified from the BLAST alignments. As a result, 171,792 polymorphisms were mapped on 12 chromosomes. Among these, 30,930 polymorphisms were found in euchromatin (1 per 3,565 bp) and 140,862 were found in heterochromatin (1 per 2,737 bp). The average polymorphism density in the genome was 1 polymorphism per 2,886 bp. To facilitate the use of these data in Micro-Tom research, the BAC contig and polymorphism information are available in the TOMATOMICS database. PMID:23227037
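The density figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the abstract. It assumes that the genome-wide density is the total contig length divided by the total polymorphism count, and that the euchromatic and heterochromatic spans together make up roughly the whole contig length; the variable names are illustrative.

```python
# Figures quoted in the abstract.
contig_bp = 495_833_423                    # total physical length of BAC contigs
eu_count, eu_bp_per = 30_930, 3_565        # euchromatin: count, bp per polymorphism
het_count, het_bp_per = 140_862, 2_737     # heterochromatin: count, bp per polymorphism

# The two compartments should add up to the reported total.
total_count = eu_count + het_count

# Genome-wide density, assuming it is contig length / polymorphism count.
genome_bp_per = round(contig_bp / total_count)

# Length implied by the per-compartment densities (assumed to partition the contigs).
implied_bp = eu_count * eu_bp_per + het_count * het_bp_per
relative_gap = abs(implied_bp - contig_bp) / contig_bp

print(total_count)     # 171792
print(genome_bp_per)   # 2886
print(f"{relative_gap:.5%}")
```

Both the total count and the genome-wide density reproduce the reported values, and the per-compartment densities imply a length within a small fraction of a percent of the stated contig length.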

  6. Improved taxonomic assignment of human intestinal 16S rRNA sequences by a dedicated reference database

    NARCIS (Netherlands)

    Ritari, Jarmo; Salojärvi, Jarkko; Lahti, Leo; Vos, de Willem M.

    2015-01-01

    Background: Current sequencing technology enables taxonomic profiling of microbial ecosystems at high resolution and depth by using the 16S rRNA gene as a phylogenetic marker. Taxonomic assignation of newly acquired data is based on sequence comparisons with comprehensive reference databases to

  7. De novo transcriptome sequencing and sequence analysis of the malaria vector Anopheles sinensis (Diptera: Culicidae)

    Science.gov (United States)

    2014-01-01

    Background Anopheles sinensis is the major malaria vector in China and Southeast Asia. Vector control is one of the most effective measures to prevent malaria transmission. However, there is little transcriptome information available for the malaria vector. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to build a transcriptome dataset for functional genomics analysis by large-scale RNA sequencing (RNA-seq). Methods To provide a more comprehensive and complete transcriptome of An. sinensis, RNA from eggs, larvae, pupae, male adults and female adults was pooled for cDNA preparation, sequenced using the Illumina paired-end sequencing technology and assembled into unigenes. These unigenes were then analyzed for genome mapping, functional annotation, homology, codon usage bias and simple sequence repeats (SSRs). Results Approximately 51.6 million clean reads were obtained, trimmed, and assembled into 38,504 unigenes with an average length of 571 bp, an N50 of 711 bp, and an average GC content of 51.26%. Among them, 98.4% of unigenes could be mapped onto the reference genome, and 69% of unigenes could be annotated with known biological functions. Homology analysis identified a number of An. sinensis unigenes that showed homology to, or were putative 1:1 orthologues of, genes in the genomes of other Dipteran species. Codon usage bias was analyzed and 1,904 SSRs were detected, which will provide effective molecular markers for the population genetics of this species. Conclusions Our data and analysis provide the most comprehensive transcriptomic resource and characteristics currently available for An. sinensis, and will facilitate genetic and genomic studies and further vector control of An. sinensis. PMID:25000941
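The abstract reports an N50 alongside the average unigene length. For readers unfamiliar with the statistic, the sketch below shows the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. The function name `n50` and the toy lengths are illustrative and are not drawn from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50 requires at least one length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum reaches half the total.
        if running >= half_total:
            return length

# Toy assembly (hypothetical lengths): total 54, half 27, reached by 10 + 9 + 8.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))  # 8
```

Because long sequences dominate the cumulative sum, N50 is typically larger than the mean length, as it is in the figures quoted above (711 bp against 571 bp).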

  8. Evolutionary analysis of hepatitis C virus gene sequences from 1953

    Science.gov (United States)

    Gray, Rebecca R.; Tanaka, Yasuhito; Takebe, Yutaka; Magiorkinis, Gkikas; Buskell, Zelma; Seeff, Leonard; Alter, Harvey J.; Pybus, Oliver G.

    2013-01-01

    Reconstructing the transmission history of infectious diseases in the absence of medical or epidemiological records often relies on the evolutionary analysis of pathogen genetic sequences. The precision of evolutionary estimates of epidemic history can be increased by the inclusion of sequences derived from ‘archived’ samples that are genetically distinct from contemporary strains. Historical sequences are especially valuable for viral pathogens that circulated for many years before being formally identified, including HIV and the hepatitis C virus (HCV). However, surprisingly few HCV isolates sampled before discovery of the virus in 1989 are currently available. Here, we report and analyse two HCV subgenomic sequences obtained from infected individuals in 1953, which represent the oldest genetic evidence of HCV infection. The pairwise genetic diversity between the two sequences indicates a substantial period of HCV transmission prior to the 1950s, and their inclusion in evolutionary analyses provides new estimates of the common ancestor of HCV in the USA. To explore and validate the evolutionary information provided by these sequences, we used a new phylogenetic molecular clock method to estimate the date of sampling of the archived strains, plus the dates of four more contemporary reference genomes. Despite the short fragments available, we conclude that the archived sequences are consistent with a proposed sampling date of 1953, although statistical uncertainty is large. Our cross-validation analyses suggest that the bias and low statistical power observed here likely arise from a combination of high evolutionary rate heterogeneity and an unstructured, star-like phylogeny. We expect that attempts to date other historical viruses under similar circumstances will meet similar problems. PMID:23938759

  9. CCF analysis of high redundancy systems safety/relief valve data analysis and reference BWR application

    International Nuclear Information System (INIS)

    Mankamo, T.; Bjoere, S.; Olsson, Lena

    1992-12-01

    Dependent failure analysis and modeling were developed for high redundancy systems. The study included a comprehensive data analysis of safety and relief valves at the Finnish and Swedish BWR plants, resulting in improved understanding of Common Cause Failure mechanisms in these components. The reference application on the Forsmark 1/2 reactor relief system, constituting of twelve safety/relief lines and two regulating relief lines, covered different safety criteria cases of reactor depressurization and overpressure protection function, and failure to re close sequences. For the quantification of dependencies, the Alpha Factor Model, the Binomial Probability Model and the Common Load Model were compared for applicability in high redundancy systems

  10. Molecular Findings Among Patients Referred for Clinical Whole-Exome Sequencing

    Science.gov (United States)

    Yang, Yaping; Muzny, Donna M.; Xia, Fan; Niu, Zhiyv; Person, Richard; Ding, Yan; Ward, Patricia; Braxton, Alicia; Wang, Min; Buhay, Christian; Veeraraghavan, Narayanan; Hawes, Alicia; Chiang, Theodore; Leduc, Magalie; Beuten, Joke; Zhang, Jing; He, Weimin; Scull, Jennifer; Willis, Alecia; Landsverk, Megan; Craigen, William J.; Bekheirnia, Mir Reza; Stray-Pedersen, Asbjorg; Liu, Pengfei; Wen, Shu; Alcaraz, Wendy; Cui, Hong; Walkiewicz, Magdalena; Reid, Jeffrey; Bainbridge, Matthew; Patel, Ankita; Boerwinkle, Eric; Beaudet, Arthur L.; Lupski, James R.; Plon, Sharon E.; Gibbs, Richard A.; Eng, Christine M.

    2015-01-01

    ), 65 (12.3%) X-linked, and 1 (0.2%) mitochondrial. Of 504 patients with a molecular diagnosis, 23 (4.6%) had blended phenotypes resulting from 2 single gene defects. About 30% of the positive cases harbored mutations in disease genes reported since 2011. There were 95 medically actionable incidental findings in genes unrelated to the phenotype but with immediate implications for management in 92 patients (4.6%), including 59 patients (3%) with mutations in genes recommended for reporting by the American College of Medical Genetics and Genomics. CONCLUSIONS AND RELEVANCE Whole-exome sequencing provided a potential molecular diagnosis for 25% of a large cohort of patients referred for evaluation of suspected genetic conditions, including detection of rare genetic events and new mutations contributing to disease. The yield of whole-exome sequencing may offer advantages over traditional molecular diagnostic approaches in certain patients. PMID:25326635

  11. Biological reference materials and analysis of toxic elements

    Energy Technology Data Exchange (ETDEWEB)

    Subramanian, R; Sukumar, A

    1988-12-01

    Biological monitoring of toxic metal pollution in the environment requires quality control analysis with use of standard reference materials. A variety of biological tissues are increasingly used for analysis of element bioaccumulation, but the available Certified Reference Materials (CRMs) are insufficient. An attempt is made to review the studies made using biological reference materials for animal and human tissues. The need to have inter-laboratory studies and CRM in the field of biological monitoring of toxic metals is also discussed.

  12. Multilocus sequence analysis of Treponema denticola strains of diverse origin

    Directory of Open Access Journals (Sweden)

    Mo Sisu

    2013-02-01

    Background The oral spirochete bacterium Treponema denticola is associated with both the incidence and severity of periodontal disease. Although the biological or phenotypic properties of a significant number of T. denticola isolates have been reported in the literature, their genetic diversity or phylogeny has never been systematically investigated. Here, we describe a multilocus sequence analysis (MLSA) of 20 of the most highly studied reference strains and clinical isolates of T. denticola, which were originally isolated from subgingival plaque samples taken from subjects from China, Japan, the Netherlands, Canada and the USA. Results The sequences of the 16S ribosomal RNA gene and 7 conserved protein-encoding genes (flaA, recA, pyrH, ppnK, dnaN, era and radC) were successfully determined for each strain. Sequence data was analyzed using a variety of bioinformatic and phylogenetic software tools. We found no evidence of positive selection or DNA recombination within the protein-encoding genes, where levels of intraspecific sequence polymorphism varied from 18.8% (flaA) to 8.9% (dnaN). Phylogenetic analysis of the concatenated protein-encoding gene sequence data (ca. 6,513 nucleotides for each strain) using Bayesian and maximum likelihood approaches indicated that the T. denticola strains were monophyletic, and formed 6 well-defined clades. All analyzed T. denticola strains appeared to have a genetic origin distinct from that of ‘Treponema vincentii’ or Treponema pallidum. No specific geographical relationships could be established, but several strains isolated from different continents appear to be closely related at the genetic level. Conclusions Our analyses indicate that previous biological and biophysical investigations have predominantly focused on a subset of T. denticola strains with a relatively narrow range of genetic diversity. Our methodology and results establish a genetic framework for the discrimination and phylogenetic

  13. Now And Next Generation Sequencing Techniques: Future of Sequence Analysis using Cloud Computing

    Directory of Open Access Journals (Sweden)

    Radhe Shyam Thakur

    2012-12-01

    Advancements in sequencing techniques have caused huge volumes of sequence data to be produced at an ever faster rate. It is becoming cumbersome for datacenters to maintain the databases. Data mining and sequence analysis approaches need to analyze the databases several times to reach any useful conclusion. To cope with this burden on computer resources and to reach efficient and effective conclusions quickly, the virtualization of resources and computation on a pay-as-you-go basis was introduced and termed cloud computing. The datacenter’s hardware and software are collectively known as a cloud, which, when available publicly, is termed a public cloud. The datacenter’s resources are provided in a virtual mode to clients via a service provider such as Amazon, Google or Joyent, which charges on a pay-as-you-go basis. The workload is shifted to the provider, which maintains the required hardware and software upgrades. The service provider manages this by upgrading the resources in the virtual mode. Basically, a virtual environment is created according to the needs of the user by requesting access from the datacenter via the internet; the task is performed, and the environment is deleted after the task is over. In this discussion, we focus on the basics of cloud computing, its prerequisites and the overall working of clouds. Furthermore, the applications of cloud computing in biological systems, especially in comparative genomics, genome informatics and SNP detection, are briefly discussed with reference to traditional workflows.

  14. Now and next-generation sequencing techniques: future of sequence analysis using cloud computing.

    Science.gov (United States)

    Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav

    2012-01-01

    Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed "cloud computing") has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows.

  15. FAST: FAST Analysis of Sequences Toolbox

    Directory of Open Access Journals (Sweden)

    Travis J. Lawrence

    2015-05-01

    FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU’s Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics makes FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought.

  16. Bayesian Correlation Analysis for Sequence Count Data.

    Directory of Open Access Journals (Sweden)

    Daniel Sánchez-Taltavull

    Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities' measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low, especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities' signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset.

  17. A basic analysis toolkit for biological sequences

    Directory of Open Access Journals (Sweden)

    Siragusa Enrico

    2007-09-01

    This paper presents a software library, nicknamed BATS, for some basic sequence analysis tasks, namely local alignments, via approximate string matching, and global alignments, via longest common subsequence and alignments with affine and concave gap cost functions. It also supports filtering operations to select strings from a set and establish their statistical significance, via z-score computation. None of the algorithms is new, but although they are generally regarded as fundamental for sequence analysis, they have not been implemented in a single and consistent software package, as we do here. Therefore, our main contribution is to fill this gap between algorithmic theory and practice by providing an extensible and easy to use software library that includes algorithms for the mentioned string matching and alignment problems. The library consists of C/C++ library functions as well as Perl library functions. It can be interfaced with Bioperl and can also be used as a stand-alone system with a GUI. The software is available at http://www.math.unipa.it/~raffaele/BATS/ under the GNU GPL.
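One of the tasks named in this abstract is global alignment via the longest common subsequence. As an illustration of that idea only, here is a textbook dynamic-programming sketch in Python. It is not the BATS interface, which the abstract describes as C/C++ and Perl library functions; the function name `lcs` and the example strings are assumptions made for illustration.

```python
def lcs(a, b):
    """Return one longest common subsequence of strings a and b."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Trace back from the bottom-right corner to recover the subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

print(lcs("AGGTAB", "GXTXAYB"))  # GTAB
```

The table costs time and memory proportional to the product of the two lengths, which is acceptable for short sequences; libraries aimed at long sequences usually add space-saving refinements that this sketch omits.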

  18. Whole genome sequence analysis of Mycobacterium suricattae

    KAUST Repository

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; Van Der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah; Siame, Kabengele Keith; Gey Van Pittius, Nicolaas Claudius; Van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-01-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  19. Whole genome sequence analysis of Mycobacterium suricattae

    KAUST Repository

    Dippenaar, Anzaan

    2015-10-21

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  20. Computational analysis of sequence selection mechanisms.

    Science.gov (United States)

    Meyerguz, Leonid; Grasso, Catherine; Kleinberg, Jon; Elber, Ron

    2004-04-01

    Mechanisms leading to gene variations are responsible for the diversity of species and are important components of the theory of evolution. One constraint on gene evolution is that of protein foldability; the three-dimensional shapes of proteins must be thermodynamically stable. We explore the impact of this constraint and calculate properties of foldable sequences using 3660 structures from the Protein Data Bank. We seek a selection function that receives sequences as input, and outputs survival probability based on sequence fitness to structure. We compute the number of sequences that match a particular protein structure with energy lower than the native sequence, the density of the number of sequences, the entropy, and the "selection" temperature. The mechanism of structure selection for sequences longer than 200 amino acids is approximately universal. For shorter sequences, it is not. We speculate on concrete evolutionary mechanisms that show this behavior.

  1. Comparative analysis of sequences from PT 2013

    DEFF Research Database (Denmark)

    Mikkelsen, Susie Sommer

    Sheatfish and not EHNV. Generally, mistakes occurred at the ends of the sequences. This can be due to several factors. One is that the sequence has not been trimmed of the sequence primer sites. Another is the lack of quality control of the chromatogram. Finally, sequencing in just one direction can result...... diseases in Europe. As part of the EURL proficiency test for fish diseases it is required to sequence any RANA virus isolates found in any of the samples. It is also highly recommended to sequence the ISA virus to determine whether it be HPRΔ or HPR0. Furthermore, it is recommended that any VHSV and IHNV...... isolates be genotyped. As part of the evaluation of the proficiency results it was decided this year to look into the quality and similarity of the sequence results for selected viruses. Ampoule III in the proficiency test 2013 contained an EHNV isolate. The EURL received 43 sequences from 41 laboratories...

  2. SVAMP: Sequence variation analysis, maps and phylogeny

    KAUST Repository

    Naeem, Raeece

    2014-04-03

    Summary: SVAMP is a stand-alone desktop application to visualize genomic variants (in variant call format) in the context of geographical metadata. Users of SVAMP are able to generate phylogenetic trees and perform principal coordinate analysis in real time from variant call format (VCF) and associated metadata files. Allele frequency map, geographical map of isolates, Tajima's D metric, single nucleotide polymorphism density, GC and variation density are also available for visualization in real time. We demonstrate the utility of SVAMP in tracking a methicillin-resistant Staphylococcus aureus outbreak from published next-generation sequencing data across 15 countries. We also demonstrate the scalability and accuracy of our software on 245 Plasmodium falciparum malaria isolates from three continents. Availability and implementation: The Qt/C++ software code, binaries, user manual and example datasets are available at http://cbrc.kaust.edu.sa/svamp. © The Author 2014.
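    The Tajima's D metric named in this summary can be illustrated with a minimal Python sketch. It is not SVAMP's Qt/C++ implementation: the function names `summarize` and `tajimas_d` and the toy haplotypes are assumptions made for illustration, and the formula is the standard one from Tajima (1989), computed from the sample size, the number of segregating sites and the mean number of pairwise differences.

```python
from itertools import combinations
from math import sqrt


def summarize(haplotypes):
    """Return (n, segregating sites, mean pairwise differences) for aligned strings."""
    n = len(haplotypes)
    length = len(haplotypes[0])
    # A site is segregating if more than one allele is observed in the sample.
    seg_sites = sum(1 for i in range(length) if len({h[i] for h in haplotypes}) > 1)
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    return n, seg_sites, pi


def tajimas_d(n, seg_sites, pi):
    """Tajima's D from sample size n, segregating sites S and mean pairwise differences pi."""
    if n < 4 or seg_sites == 0:
        # The variance term is zero for n < 4 and for S = 0, so D is undefined.
        raise ValueError("Tajima's D needs n >= 4 and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy haplotypes (hypothetical data, not taken from the SVAMP datasets).
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
print(tajimas_d(*summarize(toy)))
```

    In practice the inputs would be derived from the genotypes in a VCF file rather than from literal strings; that parsing step is left out of this sketch.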

  3. Statistical analysis of next generation sequencing data

    CERN Document Server

    Nettleton, Dan

    2014-01-01

    Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...

  4. Determination of ancient ceramics reference material by neutron activation analysis

    International Nuclear Information System (INIS)

    Li Huhou; Sun Jingxin; Wang Yuqi; Lu Liangcai

    1986-01-01

    Contents of trace elements in the reference material of ancient ceramics (KPS-1) were determined by means of activation analysis, using thermal neutron irradiation produced in a nuclear reactor. KPS-1 is well suited to the analysis of ancient ceramics because it contains not only many kinds of elements but also appropriate concentrations of them. The values presented here are reliable within the experimental precision, and show that the reference material has good homogeneity. KPS-1 can therefore be used as a suitable reference material for the analysis of ancient ceramics.

  5. Movement Pattern Analysis Based on Sequence Signatures

    Directory of Open Access Journals (Sweden)

    Seyed Hossein Chavoshi

    2015-09-01

    Increased affordability and deployment of advanced tracking technologies have led researchers from various domains to analyze the resulting spatio-temporal movement data sets for the purpose of knowledge discovery. Two different approaches can be considered in the analysis of moving objects: quantitative analysis and qualitative analysis. This research focuses on the latter and uses the qualitative trajectory calculus (QTC), a type of calculus that represents qualitative data on moving point objects (MPOs), and establishes a framework to analyze the relative movement of multiple MPOs. A visualization technique called sequence signature (SESI) is used, which makes it possible to map QTC patterns in a 2D indexed rasterized space in order to evaluate the similarity of relative movement patterns of multiple MPOs. The applicability of the proposed methodology is illustrated by means of two practical examples of interacting MPOs: cars on a highway and body parts of a samba dancer. The results show that the proposed method can be effectively used to analyze interactions of multiple MPOs in different domains.

  6. Noncoding sequence classification based on wavelet transform analysis: part I

    Science.gov (United States)

    Paredes, O.; Strojnik, M.; Romo-Vázquez, R.; Vélez Pérez, H.; Ranta, R.; Garcia-Torales, G.; Scholl, M. K.; Morales, J. A.

    2017-09-01

    DNA sequences in the human genome can be divided into coding and noncoding ones. Coding sequences are those that are read during transcription. The identification of coding sequences has been widely reported in the literature due to their much-studied periodicity. Noncoding sequences represent the majority of the human genome. They play an important role in gene regulation and differentiation among the cells. However, noncoding sequences do not exhibit periodicities that correlate to their functions. The ENCODE (Encyclopedia of DNA Elements) and Epigenomic Roadmap projects have cataloged the human noncoding sequences by specific function. We study characteristics of noncoding sequences with wavelet analysis of genomic signals.

  7. The profile analysis of attempted-suicide patients referred to ...

    African Journals Online (AJOL)

    The profile analysis of attempted-suicide patients referred to Pelonomi ... The main precipitating factors included problematic relationships (55.4%), ... physical – 18.2%), low self-esteem/ worthlessness/hopelessness/humiliation (16.7%), and

  8. Image sequence analysis workstation for multipoint motion analysis

    Science.gov (United States)

    Mostafavi, Hassan

    1990-08-01

    This paper describes an application-specific engineering workstation designed and developed to analyze the motion of objects from video sequences. The system combines the software and hardware environment of a modern graphics-oriented workstation with digital image acquisition, processing and display techniques. In addition to automating data reduction tasks and increasing their throughput, the objective of the system is to provide less invasive methods of measurement by offering the ability to track objects that are more complex than reflective markers. Grey-level image processing and spatial/temporal adaptation of the processing parameters are used for location and tracking of more complex features of objects under uncontrolled lighting and background conditions. The applications of such an automated and noninvasive measurement tool include analysis of the trajectory and attitude of rigid bodies such as human limbs, robots, aircraft in flight, etc. The system's key features are: 1) acquisition and storage of image sequences by digitizing and storing real-time video; 2) computer-controlled movie loop playback, freeze frame display, and digital image enhancement; 3) multiple leading edge tracking in addition to object centroids at up to 60 fields per second from either live input video or a stored image sequence; 4) model-based estimation and tracking of the six degrees of freedom of a rigid body; 5) field-of-view and spatial calibration; 6) image sequence and measurement database management; and 7) offline analysis software for trajectory plotting and statistical analysis.

  9. The wolf reference genome sequence (Canis lupus lupus) and its implications for Canis spp. population genomics

    DEFF Research Database (Denmark)

    Gopalakrishnan, Shyam; Samaniego Castruita, Jose Alfredo; Sinding, Mikkel Holger Strander

    2017-01-01

    Background An increasing number of studies are addressing the evolutionary genomics of dog domestication, principally through resequencing dog, wolf and related canid genomes. There is, however, only one de novo assembled canid genome currently available against which to map such data - that of a...... that regardless of the reference genome choice, most evolutionary genomic analyses yield qualitatively similar results, including those exploring the structure between the wolves and dogs using admixture and principal component analysis. However, we do observe differences in the genomic coverage of re-mapped...

  10. Analysis of E-mail Transactions in Virtual Reference Services

    Directory of Open Access Journals (Sweden)

    Astutik Nur Qomariyah

    2016-01-01

    Today, the traditional reference desk in academic libraries is rarely used, and reference services are expanding or even moving to virtual reference services. The minimum level of virtual reference service currently provided in academic libraries is generally electronic mail (e-mail). One academic library that specifically provides virtual reference services via e-mail is the Petra Christian University (PCU) Library (ref-desk@petra.ac.id). In such services, librarians assist users in finding information and answer their questions. This study aimed to analyze virtual reference service transactions conducted through e-mail at the PCU Library, looking at the types of questions by user background, the writing style of the language used in the interaction by user background, and the cultural values revealed behind the users in virtual reference services (e-mail). The study uses content analysis of the transcripts of e-mails received by reference service librarians from March 10 until June 16, 2015. The results showed that the types of questions asked in the virtual reference service (e-mail) at the UK Petra Library include specific searches, access to online resources, operation of online resources, policies and procedures for services, and library holdings, and that the users were students (PCU and non-PCU), faculty, and librarians. Across user backgrounds, the most frequent type of question in the virtual reference service (e-mail) concerned access to online resources, and such questions were generally submitted by students. The writing style of users' language in virtual reference service (e-mail) interactions tends to be formal, comprising a greeting, the message to be delivered, and a closing salutation, whether the user is a student (PCU or non-PCU), a lecturer, or a librarian. The cultural values revealed behind the users in virtual reference services (e-mail) are obedience, courtesy and

  11. Novel algorithms for protein sequence analysis

    NARCIS (Netherlands)

    Ye, Kai

    2008-01-01

    Each protein is characterized by its unique sequential order of amino acids, the so-called protein sequence. Biology's paradigm is that this order of amino acids determines the protein's architecture and function. In this thesis, we introduce novel algorithms to analyze protein sequences. Chapter 1

  12. Pig genome sequence - analysis and publication strategy

    DEFF Research Database (Denmark)

    Archibald, Alan L.; Bolund, Lars; Churcher, Carol

    2010-01-01

    preferentially selected for sequencing. In accordance with the Bermuda and Fort Lauderdale agreements and the more recent Toronto Statement the data have been released into public sequence repositories (Genbank/EMBL, NCBI/Ensembl trace repositories) in a timely manner and in advance of publication. CONCLUSIONS...

  13. Characterization and sequence analysis of cysteine and glycine-rich ...

    African Journals Online (AJOL)

    Primers specific for CSRP3 were designed using known cDNA sequences of Bos taurus published in database with different accession numbers. Polymerase chain reaction (PCR) was performed and products were purified and sequenced. Sequence analysis and alignment were carried out using CLUSTAL W (1.83).

  14. Analysis of E-mail Transactions in Virtual Reference Services

    Directory of Open Access Journals (Sweden)

    Astutik Nur Qomariyah

    2018-01-01

    Today, the traditional reference desk in academic libraries is rarely used, and reference services are expanding or even moving to virtual reference services. The minimum level of virtual reference service currently provided in academic libraries is generally electronic mail (e-mail). One academic library that specifically provides virtual reference services via e-mail is the Petra Christian University (PCU) Library (refdesk@petra.ac.id). In such services, librarians assist users in finding information and answer their questions. This study aimed to analyze virtual reference service transactions conducted through e-mail at the PCU Library, looking at the types of questions by user background, the writing style of the language used in the interaction by user background, and the cultural values revealed behind the users in virtual reference services (e-mail). The study uses content analysis of the transcripts of e-mails received by reference service librarians from March 10 until June 16, 2015. The results showed that the types of questions asked in the virtual reference service (e-mail) at the UK Petra Library include specific searches, access to online resources, operation of online resources, policies and procedures for services, and library holdings, and that the users were students (PCU and non-PCU), faculty, and librarians. Across user backgrounds, the most frequent type of question in the virtual reference service (e-mail) concerned access to online resources, and such questions were generally submitted by students. The writing style of users' language in virtual reference service (e-mail) interactions tends to be formal, comprising a greeting, the message to be delivered, and a closing salutation, whether the user is a student (PCU or non-PCU), a lecturer, or a librarian. The cultural values revealed behind the users in virtual reference services (e-mail) are obedience, courtesy and

  15. Recent advances in nanopore-based nucleic acid analysis and sequencing

    International Nuclear Information System (INIS)

    Shi, Jidong; Fang, Ying; Hou, Junfeng

    2016-01-01

    Nanopore-based sequencing platforms are transforming the field of genomic science. This review (containing 116 references) highlights some recent progress on nanopore-based nucleic acid analysis and sequencing. These studies are classified into three categories, biological, solid-state, and hybrid nanopores, according to their nanoporous materials. We begin with a brief description of the translocation-based detection mechanism of nanopores. Next, specific examples are given in nanopore-based nucleic acid analysis and sequencing, with an emphasis on identifying strategies that can improve the resolution of nanopores. This review concludes with a discussion of future research directions that will advance the practical applications of nanopore technology. (author)

  16. Incident sequence analysis; event trees, methods and graphical symbols

    International Nuclear Information System (INIS)

    1980-11-01

    When analyzing incident sequences, unwanted events resulting from a certain cause are looked for. Graphical symbols and explanations of graphical representations are presented. The method applies to the analysis of incident sequences in all types of facilities. By means of the incident sequence diagram, incident sequences, i.e. the logical and chronological course of repercussions initiated by the failure of a component or by an operating error, can be presented and analyzed simply and clearly

  17. Molecular Identification of Unusual Pathogenic Yeast Isolates by Large Ribosomal Subunit Gene Sequencing: 2 Years of Experience at the United Kingdom Mycology Reference Laboratory

    Science.gov (United States)

    Linton, Christopher J.; Borman, Andrew M.; Cheung, Grace; Holmes, Ann D.; Szekely, Adrien; Palmer, Michael D.; Bridge, Paul D.; Campbell, Colin K.; Johnson, Elizabeth M.

    2007-01-01

    Rapid identification of yeast isolates from clinical samples is particularly important given their innately variable antifungal susceptibility profiles. We present here an analysis of the utility of PCR amplification and sequence analysis of the hypervariable D1/D2 region of the 26S rRNA gene for the identification of yeast species submitted to the United Kingdom Mycology Reference Laboratory over a 2-year period. A total of 3,033 clinical isolates were received from 2004 to 2006 encompassing 50 different yeast species. While more than 90% of the isolates, corresponding to the most common Candida species, could be identified by using the AUXACOLOR2 yeast identification kit, 153 isolates (5%), comprised of 47 species, could not be identified by using this system and were subjected to molecular identification via 26S rRNA gene sequencing. These isolates included some common species that exhibited atypical biochemical and phenotypic profiles and also many rarer yeast species that are infrequently encountered in the clinical setting. All 47 species requiring molecular identification were unambiguously identified on the basis of D1/D2 sequences, and the molecular identities correlated well with the observed biochemical profiles of the various organisms. Together, our data underscore the utility of molecular techniques as a reference adjunct to conventional methods of yeast identification. Further, we show that PCR amplification and sequencing of the D1/D2 region reliably identifies more than 45 species of clinically significant yeasts and can also potentially identify new pathogenic yeast species. PMID:17251397

  18. The importance of reference materials in doping-control analysis.

    Science.gov (United States)

    Mackay, Lindsey G; Kazlauskas, Rymantas

    2011-08-01

    Currently a large range of pure substance reference materials are available for calibration of doping-control methods. These materials enable traceability to the International System of Units (SI) for the results generated by World Anti-Doping Agency (WADA)-accredited laboratories. Only a small number of prohibited substances have threshold limits for which quantification is highly important. For these analytes only the highest quality reference materials that are available should be used. Many prohibited substances have no threshold limits and reference materials provide essential identity confirmation. For these reference materials the correct identity is critical and the methods used to assess identity in these cases should be critically evaluated. There is still a lack of certified matrix reference materials to support many aspects of doping analysis. However, in key areas a range of urine matrix materials have been produced for substances with threshold limits, for example 19-norandrosterone and testosterone/epitestosterone (T/E) ratio. These matrix-certified reference materials (CRMs) are an excellent independent means of checking method recovery and bias and will typically be used in method validation and then regularly as quality-control checks. They can be particularly important in the analysis of samples close to threshold limits, in which measurement accuracy becomes critical. Some reference materials for isotope ratio mass spectrometry (IRMS) analysis are available and a matrix material certified for steroid delta values is currently under production. In other new areas, for example the Athlete Biological Passport, peptide hormone testing, designer steroids, and gene doping, reference material needs still need to be thoroughly assessed and prioritised.

  19. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree

    Directory of Open Access Journals (Sweden)

    Kodner Robin B

    2010-10-01

    Background: Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results: This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions: Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service.

  20. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome.

    Directory of Open Access Journals (Sweden)

    Byrappa Venkatesh

    2007-04-01

    Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4x coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element-like and long interspersed element-like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark contains four putative Hox clusters, indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.

  1. Overview of errors in the reference sequence and annotation of Mycobacterium tuberculosis H37Rv, and variation amongst its isolates

    KAUST Repository

    Köser, Claudio U.

    2012-06-01

    Since its publication in 1998, the genome sequence of the Mycobacterium tuberculosis H37Rv laboratory strain has acted as the cornerstone for the study of tuberculosis. In this review we address some of the practical aspects that have come to light relating to the use of H37Rv throughout the past decade which are of relevance for the ongoing genomic and laboratory studies of this pathogen. These include errors in the genome reference sequence and its annotation, as well as the recently detected variation amongst isolates of H37Rv from different laboratories. © 2011 Elsevier B.V.

  2. SEQUENCING AND DE NOVO DRAFT ASSEMBLIES OF A FATHEAD MINNOW (Pimephales promelas) reference genome

    Data.gov (United States)

    U.S. Environmental Protection Agency — The dataset provides the URLs for accessing the genome sequence data and two draft assemblies as well as fathead minnow genotyping data associated with estimating...

  3. Certification of biological candidates reference materials by neutron activation analysis

    Science.gov (United States)

    Kabanov, Denis V.; Nesterova, Yulia V.; Merkulov, Viktor G.

    2018-03-01

    The paper gives the results of interlaboratory certification of new biological candidate reference materials by neutron activation analysis recommended by the Institute of Nuclear Chemistry and Technology (Warsaw, Poland). The correctness and accuracy of the applied method was statistically estimated for the determination of trace elements in candidate reference materials. The procedure of irradiation in the reactor thermal fuel assembly without formation of fast neutrons was carried out. It excluded formation of interfering isotopes leading to false results. The concentration of more than 20 elements (e.g., Ba, Br, Ca, Co, Ce, Cr, Cs, Eu, Fe, Hf, La, Lu, Rb, Sb, Sc, Ta, Th, Tb, Yb, U, Zn) in candidate references of tobacco leaves and bottom sediment compared to certified reference materials were determined. It was shown that the average error of the applied method did not exceed 10%.

  4. A new certified reference material for size analysis of nanoparticles

    International Nuclear Information System (INIS)

    Braun, Adelina; Kestens, Vikram; Franks, Katrin; Roebben, Gert; Lamberty, Andrée; Linsinger, Thomas P. J.

    2012-01-01

    A certified reference material, ERM-FD100, for quality assurance and validation of various nanoparticle sizing methods, was developed by the Institute for Reference Materials and Measurements. The material was prepared from an industrially sourced colloidal silica containing nanoparticles with a nominal equivalent spherical diameter of 20 nm. The homogeneity and stability of the candidate reference material was assessed by means of dynamic light scattering and centrifugal liquid sedimentation. Certification of the candidate reference material was based on a global interlaboratory comparison in which 34 laboratories participated with various analytical methods (DLS, CLS, EM, SAXS, ELS). After scrutinising the interlaboratory comparison data, 4 different certified particle size values, specific for the corresponding analytical method, could be assigned. The good comparability of results allowed the certification of the colloidal silica material for nanoparticle size analysis.

  5. Establishing a framework for comparative analysis of genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment. The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied to protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
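    The notion of a constrained column, one whose residues mutate yet keep a shared property, can be sketched in a few lines of Python. This is an illustration only and not the Prolog toolkit described in the abstract: it ignores the phylogenetic tree and the parsimony analysis entirely, and the property classes in `PROPERTY_CLASSES` as well as the function name `constrained_columns` are assumptions chosen for the example.

```python
# Simplified, assumed grouping of amino acids by physicochemical property.
PROPERTY_CLASSES = {
    "hydrophobic": set("AVLIMFWC"),
    "polar": set("STNQYHG"),
    "charged": set("DEKR"),
}


def constrained_columns(alignment):
    """Return (column index, property) for columns that vary but stay in one class.

    `alignment` is a list of equal-length strings; "-" marks a gap.
    """
    results = []
    for idx in range(len(alignment[0])):
        column = {seq[idx] for seq in alignment}
        # Skip gapped columns and fully conserved columns (no mutation observed).
        if "-" in column or len(column) < 2:
            continue
        for name, members in PROPERTY_CLASSES.items():
            if column <= members:  # every residue shares this property
                results.append((idx, name))
                break
    return results


# Hypothetical three-sequence alignment.
print(constrained_columns(["AKS", "VKT", "LRS"]))
```

    Treating the alignment as a plain list of strings keeps the sketch short; the framework in the abstract instead models the alignment as an abstract data type with its own operations.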

  6. CSReport: A New Computational Tool Designed for Automatic Analysis of Class Switch Recombination Junctions Sequenced by High-Throughput Sequencing.

    Science.gov (United States)

    Boyer, François; Boutouil, Hend; Dalloul, Iman; Dalloul, Zeinab; Cook-Moreau, Jeanne; Aldigier, Jean-Claude; Carrion, Claire; Herve, Bastien; Scaon, Erwan; Cogné, Michel; Péron, Sophie

    2017-05-15

    B cells ensure humoral immune responses due to the production of Ag-specific memory B cells and Ab-secreting plasma cells. In secondary lymphoid organs, Ag-driven B cell activation induces terminal maturation and Ig isotype class switch (class switch recombination [CSR]). CSR creates a virtually unique IgH locus in every B cell clone by intrachromosomal recombination between two switch (S) regions upstream of each C region gene. Amount and structural features of CSR junctions reveal valuable information about the CSR mechanism, and analysis of CSR junctions is useful in basic and clinical research studies of B cell functions. To provide an automated tool able to analyze large data sets of CSR junction sequences produced by high-throughput sequencing (HTS), we designed CSReport, a software program dedicated to support analysis of CSR recombination junctions sequenced with a HTS-based protocol (Ion Torrent technology). CSReport was assessed using simulated data sets of CSR junctions and then used for analysis of Sμ-Sα and Sμ-Sγ1 junctions from CH12F3 cells and primary murine B cells, respectively. CSReport identifies junction segment breakpoints on reference sequences and junction structure (blunt-ended junctions or junctions with insertions or microhomology). Besides the ability to analyze unprecedentedly large libraries of junction sequences, CSReport will provide a unified framework for CSR junction studies. Our results show that CSReport is an accurate tool for analysis of sequences from our HTS-based protocol for CSR junctions, thereby facilitating and accelerating their study. Copyright © 2017 by The American Association of Immunologists, Inc.

  7. Analysis of the AD sequence in Zion plant using the March 1.1 code

    International Nuclear Information System (INIS)

    Oriolo, F.; Paci, S.

    1985-01-01

    This report presents the analyses of the AD sequences for the Zion power plant, carried out at Pisa University within the framework of its participation in the Source Term Working Group. After a short description of the plant and the sequence under analysis, the model used for the reference computation and the results obtained using the March 1.1 code are shown. Alongside the reference computation, a series of parametric tests was also carried out on some code input variables in order to ascertain their influence on the transient trend. The results of these analyses are shown in Appendix

  8. Scalable Kernel Methods and Algorithms for General Sequence Analysis

    Science.gov (United States)

    Kuksa, Pavel

    2011-01-01

    Analysis of large-scale sequential data has become an important task in machine learning and pattern recognition, inspired in part by numerous scientific and technological applications such as the document and text classification or the analysis of biological sequences. However, current computational methods for sequence comparison still lack…

  9. Preparation and analysis of a marble reference material

    International Nuclear Information System (INIS)

    Carmo Freitas, M.; Moens, L.; Seabra e Barros, J.

    1988-01-01

    A 7 kg stone of a Carrara marble was reduced to grains smaller than 100 μm, mixed and homogenized in order to prepare a marble reference material. The homogeneity was tested for 16 elements by instrumental neutron activation analysis (INAA). Through a one-way analysis of variance based on several analyses of each of 15 bottles and within the same bottle, it was concluded that the inter-bottle heterogeneity is not greater than the intra-bottle heterogeneity. Results on the concentration of major and trace elements in the marble reference material, obtained by different laboratories and different techniques, are given. The limestone certified reference material KALKSTEIN KH was used to evaluate measurement accuracy, to intercalibrate laboratories, and to provide compatibility of measurement data. (author) 10 refs.; 12 tabs

  10. Recurrence plot analysis of DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Wu Zuobing [State Key Laboratory of Nonlinear Mechanics, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100080 (China)]. E-mail: wuzb@lnm.imech.ac.cn

    2004-11-15

    A recurrence plot technique for DNA sequences is established on a metric representation and employed to analyze the correlation structure of nucleotide strings. It is found that, in the transference of nucleotide strings, a human DNA fragment has a major correlation distance, whereas the correlation distance of a yeast chromosome increases steadily.

  11. QTL analysis by sequencing of Water Use Efficiency (WUE) in potato

    DEFF Research Database (Denmark)

    Kaminski, Kacper Piotr; Sønderkær, Mads; Sørensen, Kirsten Kørup

    2013-01-01

    The traditional approach to potato breeding, the classical “mate and phenotype” approach, is relatively costly, and because phenotyping and growth capacity are limited, it is slowly being replaced by Marker Assisted Selection (MAS) breeding schemes. MAS is based on the presence of DNA polymorphic.......sparsipilum), phenotyped for water use efficiency. This population has also previously been phenotyped for the total glycoalkaloid (TGA) content....... and time consuming process. Here, a novel method for Quantitative Trait Locus (QTL) analysis has been developed that allows for the development of specific markers by use of genomic sequence reads and the recently published reference genome sequence for potato. Prior to sequencing the mapping population...

  12. Sequencing and De novo Draft Assemblies of the Fathead Minnow (Pimephales promelas) Reference Genome

    Science.gov (United States)

    This study was undertaken to develop genome-scale resources for the fathead minnow (Pimephales promelas), an important model organism widely used in both aquatic ecotoxicology research and in regulatory toxicity testing. We report on the first sequencing and two draft assemblies fo...

  13. Biological and environmental reference materials in neutron activation analysis work

    International Nuclear Information System (INIS)

    Guinn, V.P.; Gavrilas, M.

    1990-01-01

    The great usefulness of reference materials, especially ones of certified elemental composition, is discussed with particular attention devoted to their use in instrumental neutron activation analysis (INAA) work. Their use, including both certified and uncertified values, in calculations made by the INAA Advance Prediction Computer Program (APCP) is discussed. The main features of the APCP are described, and mention is made of the large number of reference materials run on the APCP (including the new personal computer version of the program), with NBS Oyster Tissue SRM-1566 used as the principal example. (orig.)

  14. Certification of biological reference materials by instrumental neutron activation analysis

    International Nuclear Information System (INIS)

    Lanjewar, Mamata R.; Lanjewar, R.B.

    2014-01-01

    A multielemental instrumental neutron activation analysis (INAA) method with short and long irradiation has been employed for the determination of 21 minor and trace elements in two standard Reference Materials, P-RBF and P-WBF, from the Institute of Radioecology and Applied Nuclear Techniques, Czechoslovakia. Some biological standards such as Bowen's kale and cabbage leaves (Poland), as well as wheat and rice flour samples of local origin, were also analysed. It is suggested that INAA is an ideal method for the certification of Reference Materials of Biological Matrices. (author)

  15. On criteria for examining analysis quality with standard reference material

    International Nuclear Information System (INIS)

    Yang Huating

    1997-01-01

    The advantages, disadvantages, and applicability of some criteria for examining analysis quality with standard reference material are discussed. How the uncertainties of the instrument under examination and of the reference material are combined should be determined case by case. When no data on the instrument's uncertainty are available, the standard deviation multiplied by a suitable factor may be substituted for the uncertainty. The examination should not cause routine measurements to be reported with more error than they actually have. Overly strict examination should also be avoided.

  16. Analysis of Neuronal Sequences Using Pairwise Biases

    Science.gov (United States)

    2015-08-27

    semantic memory (knowledge of facts) and implicit memory (e.g., how to ride a bike). Evidence for the participation of the hippocampus in the formation of...hippocampal formation in an attempt to be cured of severe epileptic seizures. Although the surgery was successful in regards to reducing the frequency and...very different from each other in many ways including duration and number of spikes. Still, these sequences share a similar trend in the general order

  17. Google matrix analysis of DNA sequences.

    Science.gov (United States)

    Kandiah, Vivek; Shepelyansky, Dima L

    2013-01-01

    For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.
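
    The construction described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes non-overlapping words of length k, restricts the matrix to words actually observed in the sequence, and uses the standard PageRank form G = alpha*S + (1 - alpha)/N with a damping factor alpha = 0.85. The function names `google_matrix` and `pagerank` are hypothetical.

```python
def google_matrix(seq, k=2, alpha=0.85):
    """Column-stochastic Google matrix of transitions between consecutive
    non-overlapping k-letter words (assumed construction, see lead-in)."""
    words = [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]
    vocab = sorted(set(words))
    index = {w: i for i, w in enumerate(vocab)}
    n = len(vocab)
    # counts[i][j] = number of transitions from word j to word i
    counts = [[0.0] * n for _ in range(n)]
    for src, dst in zip(words, words[1:]):
        counts[index[dst]][index[src]] += 1.0
    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        col_sum = sum(counts[i][j] for i in range(n))
        for i in range(n):
            # columns without outgoing transitions are replaced by a uniform column
            s = counts[i][j] / col_sum if col_sum > 0 else 1.0 / n
            G[i][j] = alpha * s + (1.0 - alpha) / n
    return vocab, G


def pagerank(G, iters=200):
    """PageRank vector obtained by power iteration p <- G p."""
    n = len(G)
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [sum(G[i][j] * p[j] for j in range(n)) for i in range(n)]
    return p


vocab, G = google_matrix("ACGTACGTTGCAACGT", k=2)
ranks = pagerank(G)
```

    Each column of G sums to one, so the iteration preserves the total probability; sorting `vocab` by `ranks` gives the PageRank ordering of the words in this toy sequence.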

  18. Google matrix analysis of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Vivek Kandiah

    For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.

  19. Human factors review for Severe Accident Sequence Analysis (SASA)

    International Nuclear Information System (INIS)

    Krois, P.A.; Haas, P.M.; Manning, J.J.; Bovell, C.R.

    1984-01-01

    The paper will discuss work being conducted during this human factors review including: (1) support of the Severe Accident Sequence Analysis (SASA) Program based on an assessment of operator actions, and (2) development of a descriptive model of operator severe accident management. Research by SASA analysts on the Browns Ferry Unit One (BF1) anticipated transient without scram (ATWS) was supported through a concurrent assessment of operator performance to demonstrate contributions to SASA analyses from human factors data and methods. A descriptive model was developed called the Function Oriented Accident Management (FOAM) model, which serves as a structure for bridging human factors, operations, and engineering expertise and which is useful for identifying needs/deficiencies in the area of accident management. The assessment of human factors issues related to ATWS required extensive coordination with SASA analysts. The analysis was consolidated primarily to six operator actions identified in the Emergency Procedure Guidelines (EPGs) as being the most critical to the accident sequence. These actions were assessed through simulator exercises, qualitative reviews, and quantitative human reliability analyses. The FOAM descriptive model assumes as a starting point that multiple operator/system failures exceed the scope of procedures and necessitate a knowledge-based emergency response by the operators. The FOAM model provides a functionally-oriented structure for assembling human factors, operations, and engineering data and expertise into operator guidance for unconventional emergency responses to mitigate severe accident progression and avoid/minimize core degradation. Operators must also respond to potential radiological release beyond plant protective barriers. Research needs in accident management and potential uses of the FOAM model are described. 11 references, 1 figure

  20. Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing

    Directory of Open Access Journals (Sweden)

    Wadim L. Matochko

    2013-01-01

    Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low abundant clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical step in this process because these techniques have error rates >1%. Mechanisms that yield errors in Illumina and other techniques have been proposed, but no reports to date describe error analysis in phage libraries. Our paper focuses on error analysis of 7-mer peptide libraries sequenced by the Illumina method. The low theoretical complexity of this phage library, as compared to the complexity of long genetic reads and genomes, allowed us to describe this library using a convenient linear vector and operator framework. We describe a phage library as an N×1 frequency vector n = (n_i), where n_i is the copy number of the ith sequence and N is the theoretical diversity, that is, the total number of all possible sequences. Any manipulation to the library is an operator acting on n. Selection, amplification, or sequencing could be described as a product of an N×N matrix and a stochastic sampling operator (Sa). The latter is a random diagonal matrix that describes sampling of a library. In this paper, we focus on the properties of Sa and use them to define the sequencing operator (Seq). Sequencing without any bias and errors is Seq = Sa I_N, where I_N is an N×N unity matrix. Any bias in sequencing changes I_N to a nonunity matrix. We identified a diagonal censorship matrix (CEN), which describes elimination, or statistically significant downsampling, of specific reads during the sequencing process.
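
    The operator picture in this abstract can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' implementation: the sampling operator Sa is modelled as drawing a fixed number of reads with replacement, with probability proportional to copy number, and biased sequencing is taken to be Seq = Sa·CEN with a made-up diagonal censorship matrix. All names and numbers (`sampling_operator`, `apply_diagonal`, `cen`, the toy library) are hypothetical.

```python
import random


def sampling_operator(n, depth, rng):
    """Diagonal of a random sampling operator Sa: applying it to n yields the
    read counts of `depth` reads drawn in proportion to copy number
    (sampling with replacement is a simplifying assumption)."""
    present = [i for i, copies in enumerate(n) if copies > 0]
    weights = [n[i] for i in present]
    reads = [0] * len(n)
    for i in rng.choices(present, weights=weights, k=depth):
        reads[i] += 1
    # diagonal entry = sampled reads / original copy number
    return [reads[i] / n[i] if n[i] > 0 else 0.0 for i in range(len(n))]


def apply_diagonal(diag, vec):
    """Product of a diagonal matrix (stored as its diagonal) and a vector."""
    return [d * v for d, v in zip(diag, vec)]


n = [500, 300, 150, 50, 0]          # toy library: copy number of each sequence
cen = [1.0, 1.0, 0.0, 0.5, 1.0]     # hypothetical censorship: clone 2 removed, clone 3 halved
rng = random.Random(0)

censored = apply_diagonal(cen, n)                      # CEN n
sa = sampling_operator(censored, depth=100, rng=rng)   # random diagonal Sa
observed = apply_diagonal(sa, censored)                # Seq n = Sa CEN n
```

    Setting every entry of `cen` to 1.0 recovers the unbiased case Seq = Sa I_N; a censored clone shows up as an entry of `observed` that stays at zero no matter how deep the sampling is.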

  1. Reference-quality genome sequence of Aegilops tauschii, the source of wheat D genome, shows that recombination shapes genome structure and evolution

    Science.gov (United States)

    Aegilops tauschii is the diploid progenitor of the D genome of hexaploid wheat and an important genetic resource for wheat. A reference-quality sequence for the Ae. tauschii genome was produced with a combination of ordered-clone sequencing, whole-genome shotgun sequencing, and BioNano optical geno...

  2. No-Reference Video Quality Assessment by HEVC Codec Analysis

    DEFF Research Database (Denmark)

    Huang, Xin; Søgaard, Jacob; Forchhammer, Søren

    2015-01-01

    This paper proposes a No-Reference (NR) Video Quality Assessment (VQA) method for videos subject to the distortion given by High Efficiency Video Coding (HEVC). The proposed assessment can be performed either as a Bitstream-Based (BB) method or as a Pixel-Based (PB) method. It extracts or estimates...... the transform coefficients, estimates the distortion, and assesses the video quality. The proposed scheme generates VQA features based on Intra coded frames, and then maps features using an Elastic Net to predict subjective video quality. A set of HEVC coded 4K UHD sequences are tested. Results show...... that the quality scores computed by the proposed method are highly correlated with the subjective assessment....

  3. Large sample neutron activation analysis of a reference inhomogeneous sample

    International Nuclear Information System (INIS)

    Vasilopoulou, T.; Athens National Technical University, Athens; Tzika, F.; Stamatelatos, I.E.; Koster-Ammerlaan, M.J.J.

    2011-01-01

    A benchmark experiment was performed for Neutron Activation Analysis (NAA) of a large inhomogeneous sample. The reference sample was developed in-house and consisted of a SiO2 matrix and an Al-Zn alloy 'inhomogeneity' body. Monte Carlo simulations were employed to derive appropriate correction factors for neutron self-shielding during irradiation as well as self-attenuation of gamma rays and sample geometry during counting. The large sample neutron activation analysis (LSNAA) results were compared against reference values and the trueness of the technique was evaluated. An agreement within ±10% was observed between LSNAA and reference elemental mass values, for all matrix and inhomogeneity elements except Samarium, provided that the inhomogeneity body was fully simulated. However, in cases that the inhomogeneity was treated as not known, the results showed a reasonable agreement for most matrix elements, while large discrepancies were observed for the inhomogeneity elements. This study provided a quantification of the uncertainties associated with inhomogeneity in large sample analysis and contributed to the identification of the needs for future development of LSNAA facilities for analysis of inhomogeneous samples. (author)

  4. Cloning and sequence analysis of benzo-a-pyreneinducible ...

    African Journals Online (AJOL)

    The phylogenetic tree based on the amino acid sequences clearly shows tilapia CYP1A and killifish CYP1A to be more closely related to each other than to the other CYP1A subfamilies. Sequence analysis of 3727 bp of genomic DNA showed that the clone obtained was the structural gene of CYP1A which consists of ...

  5. Biological sequence analysis: probabilistic models of proteins and nucleic acids

    National Research Council Canada - National Science Library

    Durbin, Richard

    1998-01-01

    ... analysis methods are now based on principles of probabilistic modelling. Examples of such methods include the use of probabilistically derived score matrices to determine the significance of sequence alignments, the use of hidden Markov models as the basis for profile searches to identify distant members of sequence families, and the inference...

  6. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences

    DEFF Research Database (Denmark)

    Svitashev, S.; Bryngelsson, T.; Vershinin, A.

    1994-01-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences...

  7. Newly developed standard reference materials for organic contaminant analysis

    Energy Technology Data Exchange (ETDEWEB)

    Poster, D.; Kucklick, J.; Schantz, M.; Porter, B.; Wise, S. [National Inst. of Stand. and Technol., Gaithersburg, MD (USA). Center for Anal. Chem.

    2004-09-15

    The National Institute of Standards and Technology (NIST) has issued a number of Standard Reference Materials (SRMs) for specified analytes. The SRMs comprise biota and biologically related materials, sediments, and particle-related materials. The compounds certified for analysis are polychlorinated biphenyls (PCBs), polycyclic aromatic hydrocarbons (PAHs) and their nitro-analogues, chlorinated pesticides, methylmercury, organic tin compounds, fatty acids, and polybrominated diphenyl ethers (PBDEs). The authors report on the origin of the materials and the analytical methods. (uke)

  8. Cost Analysis Sources and Documents Data Base Reference Manual (Update)

    Science.gov (United States)

    1989-06-01

    M: Reference Manual PRICE H: Training Course Workbook 11. Use in Cost Analysis. Important source of cost estimates for electronic and mechanical...Nature of Data. Contains many microeconomic time series by month or quarter. 5. Level of Detail. Very detailed. 6. Normalization Processes Required...Reference Manual. Moorestown, N.J.: GE Corporation, September 1986. 64. PRICE Training Course Workbook. Moorestown, N.J.: GE Corporation, February 1986

  9. Sequencing and de novo assembly of 150 genomes from Denmark as a population reference

    DEFF Research Database (Denmark)

    Maretty, Lasse; Jensen, Jacob Malte; Petersen, Bent

    2017-01-01

    or by performing local assembly. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high......-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology. We use the assemblies to identify a rich set...

  10. Parametric inference for biological sequence analysis.

    Science.gov (United States)

    Pachter, Lior; Sturmfels, Bernd

    2004-11-16

    One of the major successes in computational biology has been the unification, by using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.
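
    As a concrete instance of the sum-product algorithm mentioned above, the following sketch computes the likelihood of an observed sequence under a hidden Markov model with the forward recursion. The two-state model, its state names and all probabilities are invented for illustration and do not come from the article.

```python
def forward_likelihood(obs, states, start, trans, emit):
    """P(obs) under an HMM: sum over all hidden paths of the product of
    start, transition and emission probabilities, computed recursively."""
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        alpha = {
            s: emit[s][symbol] * sum(alpha[r] * trans[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical two-state model of AT-rich versus GC-rich regions.
states = ("AT_rich", "GC_rich")
start = {"AT_rich": 0.5, "GC_rich": 0.5}
trans = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.2, "GC_rich": 0.8},
}
emit = {
    "AT_rich": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
    "GC_rich": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
}

likelihood = forward_likelihood("GGCATA", states, start, trans, emit)
```

    Replacing the inner `sum` with `max` turns the same recursion into the Viterbi computation of the most probable hidden path, which is the sense in which one algebraic template covers several inference problems.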

  11. Recurrence time statistics: versatile tools for genomic DNA sequence analysis.

    Science.gov (United States)

    Cao, Yinhe; Tung, Wen-Wen; Gao, J B

    2004-01-01

    With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.
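
    The idea of a recurrence time can be sketched in a few lines. The version below is a simplified reading, not the authors' definition: it assumes that the recurrence time of a word of length k is the distance between two successive occurrences of that exact word, and it collects those distances into a histogram. The function name `recurrence_times` is hypothetical.

```python
from collections import Counter


def recurrence_times(seq, k):
    """Distances between successive occurrences of each k-letter word."""
    last_seen = {}
    times = []
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in last_seen:
            times.append(i - last_seen[word])
        last_seen[word] = i
    return times


# A perfect period-3 repeat: every word returns after exactly 3 positions.
histogram = Counter(recurrence_times("ACGACGACGACG", 3))
```

    In this toy sequence the histogram has a single peak at 3. On real data, peaks in such a histogram would point to periodicities and repeats, which is the kind of feature the abstract describes.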

  12. Establishment and analysis of a reference transcriptome for Spodoptera frugiperda.

    Science.gov (United States)

    Legeai, Fabrice; Gimenez, Sylvie; Duvic, Bernard; Escoubas, Jean-Michel; Gosselin Grenet, Anne-Sophie; Blanc, Florence; Cousserans, François; Séninet, Imène; Bretaudeau, Anthony; Mutuel, Doriane; Girard, Pierre-Alain; Monsempes, Christelle; Magdelenat, Ghislaine; Hilliou, Frédérique; Feyereisen, René; Ogliastro, Mylène; Volkoff, Anne-Nathalie; Jacquin-Joly, Emmanuelle; d'Alençon, Emmanuelle; Nègre, Nicolas; Fournier, Philippe

    2014-08-23

    Spodoptera frugiperda (Noctuidae) is a major agricultural pest throughout the American continent. The highly polyphagous larvae are frequently devastating crops of importance such as corn, sorghum, cotton and grass. In addition, the Sf9 cell line, widely used in biochemistry for in vitro protein production, is derived from S. frugiperda tissues. Many research groups are using S. frugiperda as a model organism to investigate questions such as plant adaptation, pest behavior or resistance to pesticides. In this study, we constructed a reference transcriptome assembly (Sf_TR2012b) of RNA sequences obtained from more than 35 S. frugiperda developmental time-points and tissue samples. We assessed the quality of this reference transcriptome by annotating a ubiquitous gene family--ribosomal proteins--as well as gene families that have a more constrained spatio-temporal expression and are involved in development, immunity and olfaction. We also provide a time-course of expression that we used to characterize the transcriptional regulation of the gene families studied. We conclude that the Sf_TR2012b transcriptome is a valid reference transcriptome. While its reliability decreases for the detection and annotation of genes under strong transcriptional constraint we still recover a fair percentage of tissue-specific transcripts. That allowed us to explore the spatial and temporal expression of genes and to observe that some olfactory receptors are expressed in antennae and palps but also in other non related tissues such as fat bodies. Similarly, we observed an interesting interplay of gene families involved in immunity between fat bodies and antennae.

  13. Neutron activation analysis for certification of standard reference materials

    International Nuclear Information System (INIS)

    Capote Rodriguez, G.; Perez Zayas, G.; Hernandez Rivero, A.; Ribeiro Guevara, S.

    1996-01-01

    Neutron activation analysis is used extensively as one of the analytical techniques in the certification of standard reference materials. The characteristics that make it valuable in this role are accuracy, multielemental capability to assess homogeneity, high sensitivity for many elements, and an essentially non-destructive nature. This paper reports the concentrations of 30 elements (major, minor and trace elements) in four Cuban samples. The samples were irradiated in a thermal neutron flux of 10^12 to 10^13 n·cm^-2·s^-1. The gamma-ray spectra were measured with HPGe detectors and analyzed using the ACTAN program developed at the Center of Applied Studies for Nuclear Development.

  14. No-Reference Video Quality Assessment using MPEG Analysis

    DEFF Research Database (Denmark)

    Søgaard, Jacob; Forchhammer, Søren; Korhonen, Jari

    2013-01-01

    We present a method for No-Reference (NR) Video Quality Assessment (VQA) for decoded video without access to the bitstream. This is achieved by extracting and pooling features from a NR image quality assessment method used frame by frame. We also present methods to identify the video coding...... and estimate the video coding parameters for MPEG-2 and H.264/AVC which can be used to improve the VQA. The analysis differs from most other video coding analysis methods since it is without access to the bitstream. The results show that our proposed method is competitive with other recent NR VQA methods...

  15. A methodology for uncertainty analysis of reference equations of state

    DEFF Research Database (Denmark)

    Cheung, Howard; Frutiger, Jerome; Bell, Ian H.

    We present a detailed methodology for the uncertainty analysis of reference equations of state (EOS) based on Helmholtz energy. In recent years there has been an increased interest in uncertainties of property data and process models of thermal systems. In the literature there are various...... for uncertainty analysis is suggested as a tool for EOS. The uncertainties of the EOS properties are calculated from the experimental values and the EOS model structure through the parameter covariance matrix and subsequent linear error propagation. This allows reporting the uncertainty range (95% confidence...
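
    The propagation step named in this abstract, from a parameter covariance matrix to the uncertainty of a computed property, can be sketched as first-order error propagation, var(y) = J Σ Jᵀ, with J the sensitivity of the property to each parameter. The sketch below is a generic illustration: the Jacobian is approximated by forward finite differences, and `toy_model` is a made-up two-parameter stand-in, not a Helmholtz-energy equation of state.

```python
def propagate_uncertainty(model, params, cov, x, rel_step=1e-6):
    """Return (value, standard uncertainty) of model(params, x) using
    linear error propagation with a finite-difference Jacobian."""
    y0 = model(params, x)
    jac = []
    for i, p in enumerate(params):
        h = rel_step * max(abs(p), 1.0)
        shifted = list(params)
        shifted[i] = p + h
        jac.append((model(shifted, x) - y0) / h)
    m = len(params)
    variance = sum(jac[i] * cov[i][j] * jac[j] for i in range(m) for j in range(m))
    return y0, variance ** 0.5


def toy_model(params, x):
    """Hypothetical property model y = a*x + b (illustration only)."""
    a, b = params
    return a * x + b


params = [2.0, 1.0]
cov = [[0.01, 0.0],
       [0.0, 0.04]]   # assumed parameter covariance matrix
value, std = propagate_uncertainty(toy_model, params, cov, x=3.0)
```

    Under a normality assumption, an approximate 95% range would be `value ± 1.96 * std`; off-diagonal covariance terms enter through the same double sum.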

  16. RESEARCH NOTE Genome-based exome-sequencing analysis ...

    Indian Academy of Sciences (India)

    Navya

    2017-02-22

    Feb 22, 2017 ... Genome-based exome-sequencing analysis identifies GYG1, DIS3L, DDRGK1 genes ... Cardiology Division, Department of Internal Medicine, Severance .... with p values of <0.05 by analyzing differences in allele distribution.

  17. Editorial: Special Issue on Algorithms for Sequence Analysis and Storage

    Directory of Open Access Journals (Sweden)

    Veli Mäkinen

    2014-03-01

    This special issue of Algorithms is dedicated to approaches to biological sequence analysis that have algorithmic novelty and potential for fundamental impact in methods used for genome research.

  18. Tools for integrated sequence-structure analysis with UCSF Chimera

    Directory of Open Access Journals (Sweden)

    Huang Conrad C

    2006-07-01

    Background: Comparing related structures and viewing the structures in the context of sequence alignments are important tasks in protein structure-function research. While many programs exist for individual aspects of such work, there is a need for interactive visualization tools that: (a) provide a deep integration of sequence and structure, far beyond mapping where a sequence region falls in the structure and vice versa; (b) facilitate changing data of one type based on the other (for example, using only sequence-conserved residues to match structures, or adjusting a sequence alignment based on spatial fit); (c) can be used with a researcher's own data, including arbitrary sequence alignments and annotations, closely or distantly related sets of proteins, etc.; and (d) interoperate with each other and with a full complement of molecular graphics features. We describe enhancements to UCSF Chimera to achieve these goals. Results: The molecular graphics program UCSF Chimera includes a suite of tools for interactive analyses of sequences and structures. Structures automatically associate with sequences in imported alignments, allowing many kinds of crosstalk. A novel method is provided to superimpose structures in the absence of a pre-existing sequence alignment. The method uses both sequence and secondary structure, and can match even structures with very low sequence identity. Another tool constructs structure-based sequence alignments from superpositions of two or more proteins. Chimera is designed to be extensible, and mechanisms for incorporating user-specific data without Chimera code development are also provided. Conclusion: The tools described here apply to many problems involving comparison and analysis of protein structures and their sequences. Chimera includes complete documentation and is intended for use by a wide range of scientists, not just those in the computational disciplines.
UCSF Chimera is free for non-commercial use and is

  19. PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.

    Science.gov (United States)

    Wimmer, Katharina; Wernstedt, Annekatrin

    2014-01-01

    The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method also has the advantage of effectively identifying deletions, splice mutations, and de novo retrotransposon insertions that escape detection by most DNA-based mutation analysis protocols.

  20. Sequencing and Analysis of Neanderthal Genomic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo, Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2006-06-13

    Recovery and analysis of multiple Neanderthal autosomal sequences using a metagenomic approach reveals that modern humans and Neanderthals split ~400,000 years ago, without significant evidence of subsequent admixture.

  1. SVAMP: Sequence variation analysis, maps and phylogeny

    KAUST Repository

    Naeem, Raeece; Hidayah, Lailatul; Preston, Mark D.; Clark, Taane G.; Pain, Arnab

    2014-01-01

    Summary: SVAMP is a stand-alone desktop application to visualize genomic variants (in variant call format) in the context of geographical metadata. Users of SVAMP are able to generate phylogenetic trees and perform principal coordinate analysis

  2. Sequencing and analysis of the Mediterranean amphioxus (Branchiostoma lanceolatum) transcriptome.

    Directory of Open Access Journals (Sweden)

    Silvan Oulion

    Full Text Available BACKGROUND: The basally divergent phylogenetic position of amphioxus (Cephalochordata), as well as its conserved morphology, development and genetics, make it the best proxy for the chordate ancestor. Particularly, studies using the amphioxus model help our understanding of vertebrate evolution and development. Thus, interest in the amphioxus model led to the characterization of both the transcriptome and complete genome sequence of the American species, Branchiostoma floridae. However, recent technical improvements allowing induction of spawning in the laboratory during the breeding season on a daily basis with the Mediterranean species Branchiostoma lanceolatum have encouraged European Evo-Devo researchers to adopt this species as a model even though no genomic or transcriptomic data have been available. To fill this need we used the pyrosequencing method to characterize the B. lanceolatum transcriptome and then compared our results with the published transcriptome of B. floridae. RESULTS: Starting with total RNA from nine different developmental stages of B. lanceolatum, a normalized cDNA library was constructed and sequenced on Roche GS FLX (Titanium mode). Around 1.4 million reads were produced and assembled into 70,530 contigs (average length of 490 bp). Overall 37% of the assembled sequences were annotated by BlastX and their Gene Ontology terms were determined. These results were then compared to genomic and transcriptomic data of B. floridae to assess similarities and specificities of each species. CONCLUSION: We obtained a high-quality amphioxus (B. lanceolatum) reference transcriptome using a high throughput sequencing approach. We found that 83% of the predicted genes in the B. floridae complete genome sequence are also found in the B. lanceolatum transcriptome, while only 41% were found in the B. floridae transcriptome obtained with traditional Sanger based sequencing. Therefore, given the high degree of sequence conservation

  3. Quality of Standard Reference Materials for Short Time Activation Analysis

    International Nuclear Information System (INIS)

    Ismail, S.S.; Oberleitner, W.

    2003-01-01

    Some environmental reference materials (CFA-1633 b, IAEA-SL-1, SARM-1, BCR-176, Coal-1635, IAEA-SL-3, BCR-146, and SRAM-5) were analysed by short-time activation analysis. The results show that these materials can be classified into three groups, according to their activities after irradiation. The obtained results were compared in order to create a quality index for determination of short-lived nuclides at high count rates. It was found that CFA is not a suitable standard for determining very short-lived nuclides (half-lives < 1 min) because the activity it produces is 15-fold higher than that of SL-3. Biological reference materials, such as SRM-1571, SRM-1573, SRM-1575, SRM-1577, IAEA-392, and IAEA-393, were also investigated by a higher counting efficiency system. The quality of this system and its well-type detector for investigating short-lived nuclides was discussed.

  4. DSAP: deep-sequencing small RNA analysis pipeline.

    Science.gov (United States)

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
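    The first two DSAP steps described above (cleanup and clustering) can be sketched roughly as follows. This is a minimal illustration, not DSAP's actual implementation: the adapter sequence, the minimum overlap, the homopolymer run length, the minimum tag length and all function names are hypothetical choices made for the example.

```python
from collections import Counter

# Hypothetical 3' adapter used only for illustration; DSAP's real adapter
# handling is not described in the abstract.
ADAPTER = "TCGTATGCCGTCTTCTGCTTG"


def trim_adapter(read, adapter=ADAPTER, min_overlap=6):
    """Cut the read at a full adapter match, or at a partial adapter prefix at the 3' end."""
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Look for the longest adapter prefix (>= min_overlap) that ends the read.
    for k in range(min(len(adapter), len(read)) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


def trim_homopolymer_tail(read, min_run=8):
    """Remove a trailing run of one repeated nucleotide (A/T/C/G/N) of at least min_run bases."""
    if not read:
        return read
    stripped = read.rstrip(read[-1])
    return stripped if len(read) - len(stripped) >= min_run else read


def cluster_tags(tag_counts, min_len=15):
    """Clean each (tag, copies) pair and sum copies over identical cleaned sequences."""
    clusters = Counter()
    for tag, copies in tag_counts:
        cleaned = trim_homopolymer_tail(trim_adapter(tag.upper()))
        if len(cleaned) >= min_len:
            clusters[cleaned] += copies
    return clusters
```

    The input mirrors the tab-delimited format mentioned in the abstract (one unique tag and its copy number per record); tags that differ only in how much adapter they carry collapse into the same cluster.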

  5. Quantiprot - a Python package for quantitative analysis of protein sequences.

    Science.gov (United States)

    Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold

    2017-07-17

    The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties define a multidimensional solution space, where sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python, which provides a simple and consistent interface to multiple methods for quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates the distribution of n-grams and computes the Zipf's law coefficient. We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches, and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where a large number of sequences generated by the model can be compared to actually observed sequences.
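    To make two of the measures named above concrete, the sketch below counts n-grams in a sequence and estimates a Zipf's law coefficient as the least-squares slope of log(frequency) against log(rank). It deliberately does not use the Quantiprot API; the function names are hypothetical, and the sign and fitting conventions are assumptions that may differ from those used in the package.

```python
import math
from collections import Counter


def ngram_counts(seq, n=2):
    """Count overlapping n-grams in a sequence string."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def zipf_slope(counts):
    """Least-squares slope of log(frequency) versus log(rank).

    A slope near -1 corresponds to the classic Zipf distribution.
    """
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct n-grams to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

    A call such as `zipf_slope(ngram_counts(protein_sequence, n=2))` yields one number per sequence, which is the kind of feature that could populate the quantitative feature space described in the abstract.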

  6. Certification of standard reference materials employing neutron activation analysis

    International Nuclear Information System (INIS)

    Capote Rodriguez, G.; Hernandez Rivero, A.; Molina Insfran, J.; Ribeiro Guevara, S.; Santana Encinosa, C.; Perez Zayas, G.

    1997-01-01

    Neutron activation analysis (NAA) is used extensively as one of the analytical techniques in the certification of standard reference materials (SRM). Characteristics of NAA which make it valuable in this role are: accuracy; multielemental capability; ability to assess homogeneity; high sensitivity for many elements; and its essentially non-destructive nature. This paper reports the concentrations of thirty elements (major, minor and trace elements) in four Cuban SRMs. The samples were irradiated in a thermal neutron flux of 10^12-10^13 neutrons cm^-2 s^-1. The gamma-ray spectra were measured by HPGe detectors and were analysed using the ACTAN program, developed at CEADEN. (author)

  7. Analyses of Tissue Culture Adaptation of Human Herpesvirus-6A by Whole Genome Deep Sequencing Redefines the Reference Sequence and Identifies Virus Entry Complex Changes.

    Science.gov (United States)

    Tweedy, Joshua G; Escriva, Eric; Topf, Maya; Gompels, Ursula A

    2017-12-31

    Tissue-culture adaptation of viruses can modulate infection. Laboratory passage and bacterial artificial chromosome (BAC)mid cloning of human cytomegalovirus, HCMV, resulted in genomic deletions and rearrangements altering genes encoding the virus entry complex, which affected cellular tropism, virulence, and vaccine development. Here, we analyse these effects on the reference genome for related betaherpesviruses, Roseolovirus, human herpesvirus 6A (HHV-6A) strain U1102. This virus is also naturally "cloned" by germline subtelomeric chromosomal-integration in approximately 1% of human populations, and accurate references are key to understanding pathological relationships between exogenous and endogenous virus. Using whole genome next-generation deep-sequencing Illumina-based methods, we compared the original isolate to tissue-culture passaged and the BACmid-cloned virus. This re-defined the reference genome showing 32 corrections and 5 polymorphisms. Furthermore, minor variant analyses of passaged and BACmid virus identified emerging populations of a further 32 single nucleotide polymorphisms (SNPs) in 10 loci, half non-synonymous indicating cell-culture selection. Analyses of the BAC-virus genome showed deletion of the BAC cassette via loxP recombination removing green fluorescent protein (GFP)-based selection. As shown for HCMV culture effects, select HHV-6A SNPs mapped to genes encoding mediators of virus cellular entry, including virus envelope glycoprotein genes gB and the gH/gL complex. Comparative models suggest stabilisation of the post-fusion conformation. These SNPs are essential to consider in vaccine-design, antimicrobial-resistance, and pathogenesis.

  8. galaxie--CGI scripts for sequence identification through automated phylogenetic analysis.

    Science.gov (United States)

    Nilsson, R Henrik; Larsson, Karl-Henrik; Ursing, Björn M

    2004-06-12

    The prevalent use of similarity searches like BLAST to identify sequences and species implicitly assumes the reference database to be of extensive sequence sampling. This is often not the case, restraining the correctness of the outcome as a basis for sequence identification. Phylogenetic inference outperforms similarity searches in retrieving correct phylogenies and consequently sequence identities, and a project was initiated to design a freely available script package for sequence identification through automated Web-based phylogenetic analysis. Three CGI scripts were designed to facilitate qualified sequence identification from a Web interface. Query sequences are aligned to pre-made alignments or to alignments made by ClustalW with entries retrieved from a BLAST search. The subsequent phylogenetic analysis is based on the PHYLIP package for inferring neighbor-joining and parsimony trees. The scripts are highly configurable. A service installation and a version for local use are found at http://andromeda.botany.gu.se/galaxiewelcome.html and http://galaxie.cgb.ki.se

  9. Nonlinear analysis of river flow time sequences

    Science.gov (United States)

    Porporato, Amilcare; Ridolfi, Luca

    1997-06-01

    Within the field of chaos theory several methods for the analysis of complex dynamical systems have recently been proposed. In light of these ideas we study the dynamics which control the behavior over time of river flow, investigating the existence of a low-dimension deterministic component. The present article follows the research undertaken in the work of Porporato and Ridolfi [1996a] in which some clues as to the existence of chaos were collected. Particular emphasis is given here to the problem of noise and to nonlinear prediction. With regard to the latter, the benefits obtainable by means of the interpolation of the available time series are reported and the remarkable predictive results attained with this nonlinear method are shown.
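    One common form of nonlinear prediction for a scalar time series is local prediction in a time-delay embedding: the most recent delay vector is compared with earlier ones, and the successors of its nearest neighbours are averaged. The sketch below illustrates that general idea only; it is not claimed to be the authors' method, and the embedding dimension, delay and neighbour count are arbitrary illustrative defaults.

```python
import math


def delay_embed(series, dim, tau=1):
    """Build delay vectors (x[t], x[t - tau], ..., x[t - (dim - 1) * tau])."""
    start = (dim - 1) * tau
    return [
        tuple(series[t - k * tau] for k in range(dim))
        for t in range(start, len(series))
    ]


def predict_next(series, dim=3, tau=1, k=3):
    """Predict the next value by averaging the successors of the k nearest delay vectors."""
    vectors = delay_embed(series, dim, tau)
    if len(vectors) < 2:
        raise ValueError("series too short for the chosen embedding")
    start = (dim - 1) * tau
    query = vectors[-1]
    candidates = []
    # Every vector except the last has a known successor in the series.
    for i, vec in enumerate(vectors[:-1]):
        successor = series[start + i + 1]
        candidates.append((math.dist(vec, query), successor))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(successor for _, successor in nearest) / len(nearest)
```

    For a daily river flow record, `predict_next(flows, dim=3, tau=1, k=5)` would give a one-step-ahead forecast; in practice the embedding parameters would have to be chosen from the data, and noise in the record limits how far ahead such a forecast remains useful.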

  10. Accident sequence analysis of human-computer interface design

    International Nuclear Information System (INIS)

    Fan, C.-F.; Chen, W.-H.

    2000-01-01

    It is important to predict potential accident sequences of human-computer interaction in a safety-critical computing system so that vulnerable points can be disclosed and removed. We address this issue by proposing a Multi-Context human-computer interaction Model along with its analysis techniques, an Augmented Fault Tree Analysis, and a Concurrent Event Tree Analysis. The proposed augmented fault tree can identify the potential weak points in software design that may induce unintended software functions or erroneous human procedures. The concurrent event tree can enumerate possible accident sequences due to these weak points

  11. New Approaches and Technologies to Sequence de novo Plant reference Genomes (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy

    2013-03-01

    Jeremy Schmutz of the HudsonAlpha Institute for Biotechnology on New approaches and technologies to sequence de novo plant reference genomes at the 8th Annual Genomics of Energy and Environment Meeting on March 27, 2013 in Walnut Creek, CA.

  12. Genomic sequence around butterfly wing development genes: annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Inês C Conceição

    Full Text Available BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: (1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and (2) the high

  13. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd- and 4th-year undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  14. Multilocus Sequence Analysis and rpoB Sequencing of Mycobacterium abscessus (Sensu Lato) Strains

    Science.gov (United States)

    Macheras, Edouard; Roux, Anne-Laure; Bastian, Sylvaine; Leão, Sylvia Cardoso; Palaci, Moises; Sivadon-Tardy, Valérie; Gutierrez, Cristina; Richter, Elvira; Rüsch-Gerdes, Sabine; Pfyffer, Gaby; Bodmer, Thomas; Cambau, Emmanuelle; Gaillard, Jean-Louis; Heym, Beate

    2011-01-01

    Mycobacterium abscessus, Mycobacterium bolletii, and Mycobacterium massiliense (Mycobacterium abscessus sensu lato) are closely related species that currently are identified by the sequencing of the rpoB gene. However, recent studies show that rpoB sequencing alone is insufficient to discriminate between these species, and some authors have questioned their current taxonomic classification. We studied here a large collection of M. abscessus (sensu lato) strains by partial rpoB sequencing (752 bp) and multilocus sequence analysis (MLSA). The final MLSA scheme developed was based on the partial sequences of eight housekeeping genes: argH, cya, glpK, gnd, murC, pgm, pta, and purH. The strains studied included the three type strains (M. abscessus CIP 104536T, M. massiliense CIP 108297T, and M. bolletii CIP 108541T) and 120 isolates recovered between 1997 and 2007 in France, Germany, Switzerland, and Brazil. The rpoB phylogenetic tree confirmed the existence of three main clusters, each comprising the type strain of one species. However, divergence values between the M. massiliense and M. bolletii clusters all were below 3% and between the M. abscessus and M. massiliense clusters were from 2.66 to 3.59%. The tree produced using the concatenated MLSA gene sequences (4,071 bp) also showed three main clusters, each comprising the type strain of one species. The M. abscessus cluster had a bootstrap value of 100% and was mostly compact. Bootstrap values for the M. massiliense and M. bolletii branches were much lower (71 and 61%, respectively), with the M. massiliense cluster having a fuzzy aspect. Mean (range) divergence values were 2.17% (1.13 to 2.58%) between the M. abscessus and M. massiliense clusters, 2.37% (1.5 to 2.85%) between the M. abscessus and M. bolletii clusters, and 2.28% (0.86 to 2.68%) between the M. massiliense and M. bolletii clusters. 
Adding the rpoB sequence to the MLSA-concatenated sequence (total sequence, 4,823 bp) had little effect on the clustering

  15. Multilocus sequence analysis and rpoB sequencing of Mycobacterium abscessus (sensu lato) strains.

    Science.gov (United States)

    Macheras, Edouard; Roux, Anne-Laure; Bastian, Sylvaine; Leão, Sylvia Cardoso; Palaci, Moises; Sivadon-Tardy, Valérie; Gutierrez, Cristina; Richter, Elvira; Rüsch-Gerdes, Sabine; Pfyffer, Gaby; Bodmer, Thomas; Cambau, Emmanuelle; Gaillard, Jean-Louis; Heym, Beate

    2011-02-01

    Mycobacterium abscessus, Mycobacterium bolletii, and Mycobacterium massiliense (Mycobacterium abscessus sensu lato) are closely related species that currently are identified by the sequencing of the rpoB gene. However, recent studies show that rpoB sequencing alone is insufficient to discriminate between these species, and some authors have questioned their current taxonomic classification. We studied here a large collection of M. abscessus (sensu lato) strains by partial rpoB sequencing (752 bp) and multilocus sequence analysis (MLSA). The final MLSA scheme developed was based on the partial sequences of eight housekeeping genes: argH, cya, glpK, gnd, murC, pgm, pta, and purH. The strains studied included the three type strains (M. abscessus CIP 104536(T), M. massiliense CIP 108297(T), and M. bolletii CIP 108541(T)) and 120 isolates recovered between 1997 and 2007 in France, Germany, Switzerland, and Brazil. The rpoB phylogenetic tree confirmed the existence of three main clusters, each comprising the type strain of one species. However, divergence values between the M. massiliense and M. bolletii clusters all were below 3% and between the M. abscessus and M. massiliense clusters were from 2.66 to 3.59%. The tree produced using the concatenated MLSA gene sequences (4,071 bp) also showed three main clusters, each comprising the type strain of one species. The M. abscessus cluster had a bootstrap value of 100% and was mostly compact. Bootstrap values for the M. massiliense and M. bolletii branches were much lower (71 and 61%, respectively), with the M. massiliense cluster having a fuzzy aspect. Mean (range) divergence values were 2.17% (1.13 to 2.58%) between the M. abscessus and M. massiliense clusters, 2.37% (1.5 to 2.85%) between the M. abscessus and M. bolletii clusters, and 2.28% (0.86 to 2.68%) between the M. massiliense and M. bolletii clusters. Adding the rpoB sequence to the MLSA-concatenated sequence (total sequence, 4,823 bp) had little effect on the

  16. Reference materials and interlaboratory comparison for actinide analysis

    International Nuclear Information System (INIS)

    Hanssens, Alain; Viallesoubranne, Carole; Roche, Claude; Liozon, Gerard

    2008-01-01

    Measurement quality is crucial for the safety of nuclear facilities and is a primary requirement for fissile material monitoring and accountancy. CETAMA (CEA Committee for the establishment of analysis methods), in collaboration with CEA and AREVA laboratories, fabricates certified reference materials and organizes interlaboratory comparison programs for plutonium and uranium assay in solution. A new plutonium metal measurement standard (MP3) is currently being prepared by CEA and is a subject of cooperative work in view of its certification and use by analysis laboratories. U and Pu interlaboratory comparisons are carried out at regular intervals on benchmark samples in coordination with working groups from French nuclear laboratories. These programs are supported by international cooperation. 'Chemical' methods (potentiometry, gravimetric analysis, etc.) generally provide the best accuracy. Coulometry is the benchmark technique for plutonium assay: its metrological qualities should be an incentive for wider use by laboratories performing precise control assays of plutonium as well as uranium. Gravimetric analysis provides excellent results for analysis of pure uranyl nitrate solutions. In view of its many advantages we encourage laboratories to employ this technique to assay pure U or Pu solutions. 'Physical' or 'physicochemical' methods are increasingly used, and their performance has improved. K-edge absorption spectrometry and isotope dilution mass spectrometry are capable of reaching measurement quality levels comparable to those of the best 'chemical' methods. (authors)

  17. Reference materials and interlaboratory comparison for actinide analysis

    Energy Technology Data Exchange (ETDEWEB)

    Hanssens, Alain; Viallesoubranne, Carole; Roche, Claude; Liozon, Gerard [Commissariat a l'Energie Atomique, Marcoule: BP 17171, 30207 Bagnols sur Ceze (France)]

    2008-07-01

    Measurement quality is crucial for the safety of nuclear facilities and is a primary requirement for fissile material monitoring and accountancy. CETAMA (CEA Committee for the establishment of analysis methods), in collaboration with CEA and AREVA laboratories, fabricates certified reference materials and organizes interlaboratory comparison programs for plutonium and uranium assay in solution. A new plutonium metal measurement standard (MP3) is currently being prepared by CEA and is a subject of cooperative work in view of its certification and use by analysis laboratories. U and Pu interlaboratory comparisons are carried out at regular intervals on benchmark samples in coordination with working groups from French nuclear laboratories. These programs are supported by international cooperation. 'Chemical' methods (potentiometry, gravimetric analysis, etc.) generally provide the best accuracy. Coulometry is the benchmark technique for plutonium assay: its metrological qualities should be an incentive for wider use by laboratories performing precise control assays of plutonium as well as uranium. Gravimetric analysis provides excellent results for analysis of pure uranyl nitrate solutions. In view of its many advantages we encourage laboratories to employ this technique to assay pure U or Pu solutions. 'Physical' or 'physicochemical' methods are increasingly used, and their performance has improved. K-edge absorption spectrometry and isotope dilution mass spectrometry are capable of reaching measurement quality levels comparable to those of the best 'chemical' methods. (authors)

  18. Effect of reference loads on fracture mechanics analysis of surface cracked pipe based on reference stress method

    International Nuclear Information System (INIS)

    Shim, Do Jun; Son, Beom Goo; Kim, Young Jin; Kim, Yun Jae

    2004-01-01

    To investigate the relevance of the definition of the reference stress to estimate J and C* for surface crack problems, this paper compares FE J and C* results for surface cracked pipes with those estimated according to the reference stress approach using various definitions of the reference stress. Pipes with a part-circumferential inner surface crack and a finite internal axial crack are considered, subject to internal pressure and global bending. The crack depth and aspect ratio are systematically varied. The reference stress is defined in four different ways using (I) the local limit load, (II) the global limit load, (III) the global limit load determined from the FE limit analysis, and (IV) the optimised reference load. It is found that the reference stress based on the local limit load gives overall excessively conservative estimates of J and C*. Use of the global limit load clearly reduces the conservatism, compared to that of the local limit load, although it can sometimes provide non-conservative estimates of J and C*. The use of the FE global limit load gives overall non-conservative estimates of J and C*. The reference stress based on the optimised reference load gives overall accurate estimates of J and C*, compared to other definitions of the reference stress. Based on the present finding, general guidance on the choice of the reference stress for surface crack problems is given.
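    For orientation, the reference stress approach is commonly written in the simplified form below. This is the standard textbook formulation, given here as an assumption, since the abstract does not state the exact expressions the authors used. The symbols are not from the abstract: P is the applied load, P_ref the reference load (any of definitions I to IV), sigma_y the yield strength, epsilon_ref the strain at sigma_ref on the uniaxial stress-strain curve, J_e the elastic J, K the stress intensity factor, E Young's modulus, and the dotted epsilon_ref the creep strain rate at sigma_ref.

```latex
\sigma_{\mathrm{ref}} = \frac{P}{P_{\mathrm{ref}}}\,\sigma_y ,
\qquad
\frac{J}{J_e} \approx \frac{E\,\varepsilon_{\mathrm{ref}}}{\sigma_{\mathrm{ref}}} ,
\qquad
C^{*} \approx \sigma_{\mathrm{ref}}\,\dot{\varepsilon}_{\mathrm{ref}}
\left(\frac{K}{\sigma_{\mathrm{ref}}}\right)^{2}
```

    Under these expressions a smaller reference load raises the reference stress and hence the estimated J and C*, which is consistent with the local limit load giving the most conservative estimates and the larger global limit loads reducing that conservatism.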

  19. Consistency and reproducibility of next-generation sequencing and other multigene mutational assays: A worldwide ring trial study on quantitative cytological molecular reference specimens.

    Science.gov (United States)

    Malapelle, Umberto; Mayo-de-Las-Casas, Clara; Molina-Vila, Miguel A; Rosell, Rafael; Savic, Spasenija; Bihl, Michel; Bubendorf, Lukas; Salto-Tellez, Manuel; de Biase, Dario; Tallini, Giovanni; Hwang, David H; Sholl, Lynette M; Luthra, Rajyalakshmi; Weynand, Birgit; Vander Borght, Sara; Missiaglia, Edoardo; Bongiovanni, Massimo; Stieber, Daniel; Vielh, Philippe; Schmitt, Fernando; Rappa, Alessandra; Barberis, Massimo; Pepe, Francesco; Pisapia, Pasquale; Serra, Nicola; Vigliar, Elena; Bellevicine, Claudio; Fassan, Matteo; Rugge, Massimo; de Andrea, Carlos E; Lozano, Maria D; Basolo, Fulvio; Fontanini, Gabriella; Nikiforov, Yuri E; Kamel-Reid, Suzanne; da Cunha Santos, Gilda; Nikiforova, Marina N; Roy-Chowdhuri, Sinchita; Troncone, Giancarlo

    2017-08-01

    Molecular testing of cytological lung cancer specimens includes, beyond epidermal growth factor receptor (EGFR), emerging predictive/prognostic genomic biomarkers such as Kirsten rat sarcoma viral oncogene homolog (KRAS), neuroblastoma RAS viral [v-ras] oncogene homolog (NRAS), B-Raf proto-oncogene, serine/threonine kinase (BRAF), and phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit α (PIK3CA). Next-generation sequencing (NGS) and other multigene mutational assays are suitable for cytological specimens, including smears. However, the current literature reflects single-institution studies rather than multicenter experiences. Quantitative cytological molecular reference slides were produced with cell lines designed to harbor concurrent mutations in the EGFR, KRAS, NRAS, BRAF, and PIK3CA genes at various allelic ratios, including low allele frequencies (AFs; 1%). This interlaboratory ring trial study included 14 institutions across the world that performed multigene mutational assays, from tissue extraction to data analysis, on these reference slides, with each laboratory using its own mutation analysis platform and methodology. All laboratories using NGS (n = 11) successfully detected the study's set of mutations with minimal variations in the means and standard errors of variant fractions at dilution points of 10% (P = .171) and 5% (P = .063) despite the use of different sequencing platforms (Illumina, Ion Torrent/Proton, and Roche). However, when mutations at a low AF of 1% were analyzed, the concordance of the NGS results was low, and this reflected the use of different thresholds for variant calling among the institutions. In contrast, laboratories using matrix-assisted laser desorption/ionization-time of flight (n = 2) showed lower concordance in terms of mutation detection and mutant AF quantification. Quantitative molecular reference slides are a useful tool for monitoring the performance of different multigene mutational

  20. Enhanced electricity system analysis for decision making - A reference book

    International Nuclear Information System (INIS)

    2000-01-01

    The objective of electricity system analysis in support of decision making is to provide comparative assessment results upon which relevant policy choices between alternative technology options and supply strategies can be based. This reference book offers analysts, planners and decision makers documented information on enhanced approaches to electricity system analysis, that can assist in achieving this objective. The book describes the main elements of comprehensive electricity system analysis and outlines an advanced integrated analysis and decision making framework for the electric power sector. Emphasis is placed on mechanisms for building consensus between interested and affected parties, and on aspects of planning that go beyond the traditional economic optimisation approach. The scope and contents of the book cover the topics to be addressed in decision making for the power sector and the process of integrating economic, social, health and environmental aspects in the comparative assessment of alternative options and strategies. The book describes and discusses overall frameworks, processes and state of the art methods and techniques available to analysts and planners for carrying out comparative assessment studies, in order to provide sound information to decision makers. This reference book is published as part of a series of technical reports and documents prepared in the framework of the inter-agency joint project (DECADES) on databases and methodologies for comparative assessment of different energy sources for electricity generation. The overall objective of the DECADES project is to enhance capabilities for incorporating economic, social, health and environmental issues in the comparative assessment of electricity generation options and strategies in the process of decision making for the power sector. 
The project, established in 1992, is carried out jointly by the European Commission (EC), the Economic and Social Commission for Asia and the Pacific

  1. An optimized protocol for generation and analysis of Ion Proton sequencing reads for RNA-Seq.

    Science.gov (United States)

    Yuan, Yongxian; Xu, Huaiqian; Leung, Ross Ka-Kit

    2016-05-26

    Previous studies compared running cost, time and other performance measures of popular sequencing platforms. However, a comprehensive assessment of library construction and analysis protocols for the Proton sequencing platform has been lacking. Unlike reads from Illumina sequencing platforms, Proton reads are heterogeneous in length and quality. When sequencing data from different platforms are combined, the resulting reads vary in length. Whether commonly used software handles such data satisfactorily is unknown. Using universal human reference RNA as the starting material, RNaseIII and chemical fragmentation methods in library construction gave similar results in the number of genes and junctions discovered and in the accuracy of estimated expression levels. In contrast, sequencing quality, read length and the choice of software affected mapping rate to a much larger extent. The unspliced aligner TMAP attained the highest mapping rate (97.27% to genome, 86.46% to transcriptome), though 47.83% of mapped reads were clipped. Long reads could paradoxically reduce mapping in junctions. With a reference annotation guide, the mapping rate of TopHat2 increased significantly from 75.79% to 92.09%, especially for long (>150 bp) reads. Sailfish, a k-mer based gene expression quantifier, attained results highly consistent with those of the TaqMan array and the highest sensitivity. We provide, for the first time, reference statistics of library preparation methods, gene detection and quantification, and junction discovery for RNA-Seq on the Ion Proton platform. Chemical fragmentation performed as well as the enzyme-based method. The optimal Ion Proton sequencing options and analysis software have been evaluated.

  2. Development and application of a multi-targeting reference plasmid as calibrator for analysis of five genetically modified soybean events.

    Science.gov (United States)

    Pi, Liqun; Li, Xiang; Cao, Yiwei; Wang, Canhua; Pan, Liangwen; Yang, Litao

    2015-04-01

    Reference materials are important for accurate analysis of genetically modified organism (GMO) contents in food/feeds, and the development of novel reference plasmids is a new trend in research on GMO reference materials. Herein, we constructed a novel multi-targeting plasmid, pSOY, which contained seven event-specific sequences of five GM soybeans (MON89788-5', A2704-12-3', A5547-127-3', DP356043-5', DP305423-3', A2704-12-5', and A5547-127-5') and the sequence of the soybean endogenous reference gene Lectin. We evaluated the specificity, limit of detection and quantification, and applicability of pSOY in both qualitative and quantitative PCR analyses. The limit of detection (LOD) was as low as 20 copies in qualitative PCR, and the limit of quantification (LOQ) in quantitative PCR was 10 copies. In quantitative real-time PCR analysis, the PCR efficiencies of all event-specific and Lectin assays were higher than 90%, and the squared regression coefficients (R²) were more than 0.999. The quantification bias varied from 0.21% to 19.29%, and the relative standard deviations ranged from 1.08% to 9.84% in the analysis of simulated samples. All the results demonstrated that the developed multi-targeting plasmid, pSOY, was a credible substitute for matrix reference materials, and could be used as a reliable reference calibrator in the identification and quantification of multiple GM soybean events.
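
    The PCR efficiency and R² figures quoted above are conventionally derived from a standard curve of threshold cycle (Ct) against log10 copy number. The sketch below illustrates that calculation only; it is not taken from the paper. The function name `standard_curve` and the dilution series are hypothetical, the Ct values are made up, and the efficiency formula E = 10^(-1/slope) - 1 is the usual textbook definition rather than something the abstract states.

```python
import math

def standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept.

    Returns (slope, intercept, r_squared, efficiency), where efficiency
    uses the conventional qPCR definition E = 10**(-1/slope) - 1.
    """
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, r_squared, efficiency

# Hypothetical ten-fold dilution series with made-up Ct values.
copies = [1e5, 1e4, 1e3, 1e2, 1e1]
ct = [18.0, 21.4, 24.8, 28.2, 31.6]

slope, intercept, r2, eff = standard_curve(copies, ct)
print(f"slope={slope:.2f}  R^2={r2:.4f}  efficiency={eff:.1%}")
```

    A slope of about -3.32 corresponds to 100% efficiency (perfect doubling per cycle), so an assay reporting more than 90% efficiency has a slope no steeper than roughly -3.59.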

  3. An analysis of uncertainties in the reference resonance absorption calculations

    International Nuclear Information System (INIS)

    Milosevic, M.; Pesic, M.

    1997-05-01

    A recently introduced generation of design-oriented methods, which allow computation of the space and energy dependence of resonance absorption inside the fuel rod, raises a new problem: validating results obtained with improved resonance treatments. Because no experimental results are available on the spatial and energy distribution of resonance absorption, detailed reference calculations were generated with continuous-energy Monte Carlo and energy-pointwise slowing-down codes. The accuracy of these calculations depends on various influences. In this paper an analysis of some influences, such as differences in nuclear data libraries and the philosophy of reproducing the cross section data, is presented. An example application is given for a calculation benchmark that consists of determining resonance absorption by 238U in typical PWR pin cell geometry (author)

  4. Neutron activation analysis of new botanical reference materials. Pt. 2

    International Nuclear Information System (INIS)

    Kucera, J.; Soukal, L.

    1993-01-01

    The certified, information, and other values of elemental contents were compared with results of neutron activation analysis (NAA) for the new Czechoslovak botanical reference materials (RMs) Green Algae 12-02-02, Lucerne 12-02-03, Wheat Bread Flour 12-02-04, and Rye Bread Flour 12-02-05. These were prepared by the Institute of Radioecology and Applied Nuclear Techniques (IRANT), Kosice, and statistically evaluated after interlaboratory comparisons. For the majority of elements, very good agreement was found between the IRANT values and the results of NAA. In several cases, however, significant differences were detected; possible analytical reasons for the differences are discussed, as is the suitability of a purely statistical evaluation of intercomparison results, without analytical considerations, for RM certification. (orig.)

  5. No-Reference Video Quality Assessment using Codec Analysis

    DEFF Research Database (Denmark)

    Søgaard, Jacob; Forchhammer, Søren; Korhonen, Jari

    2015-01-01

    A no-reference video quality assessment (VQA) method is presented for videos distorted by H.264/AVC and MPEG-2. The assessment is performed without access to the bit-stream. Instead we analyze and estimate coefficients based on decoded pixels. The approach involves distinguishing between the two types of videos, estimating the level of quantization used in the I-frames, and exploiting this information to assess the video quality. In order to do this for H.264/AVC, the distribution of the DCT-coefficients after intra-prediction and deblocking are modeled. To obtain VQA features for H.264/AVC, we propose a novel estimation method of the quantization in H.264/AVC videos without bitstream access, which can also be used for Peak Signal-to-Noise Ratio (PSNR) estimation. The results from the MPEG-2 and H.264/AVC analysis are mapped to a perceptual measure of video quality by Support Vector Regression...
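
    For orientation, the quantity that the method estimates without a reference, PSNR, has a standard full-reference definition: 10·log10(MAX² / MSE). The sketch below computes that standard definition from two pixel lists. It does not reproduce the no-reference estimator described in the abstract; the function name `psnr` and the toy pixel values are hypothetical.

```python
import math

def psnr(reference, decoded, max_value=255):
    """Full-reference PSNR in dB between two equal-length pixel sequences."""
    if not reference or len(reference) != len(decoded):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between reference and decoded pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy 8-bit pixel values (made up for illustration).
reference_pixels = [100, 100, 100, 100]
decoded_pixels = [101, 99, 100, 100]

value = psnr(reference_pixels, decoded_pixels)
print(f"PSNR = {value:.2f} dB")
```

    A no-reference method has no `reference` argument available, which is why the abstract describes inferring the quantization level from the decoded pixels instead.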

  6. Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes

    Directory of Open Access Journals (Sweden)

    Rebecca M. Davidson

    2011-11-01

    Full Text Available Transcriptome sequencing is a powerful method for studying global expression patterns in large, complex genomes. Evaluation of sequence-based expression profiles during reproductive development would provide functional annotation to genes underlying agronomic traits. We generated transcriptome profiles for 12 diverse maize (Zea mays L.) reproductive tissues representing male, female, developing seed, and leaf tissues using high throughput transcriptome sequencing. Overall, ∼80% of annotated genes were expressed. Comparative analysis between sequence and hybridization-based methods demonstrated the utility of ribonucleic acid sequencing (RNA-seq) for expression determination and differentiation of paralogous genes (∼85% of maize genes). Analysis of 4975 gene families across reproductive tissues revealed expression divergence is proportional to family size. In all pairwise comparisons between tissues, 7 (pre- vs. postemergence cobs) to 48% (pollen vs. ovule) of genes were differentially expressed. Genes with expression restricted to a single tissue within this study were identified with the highest numbers observed in leaves, endosperm, and pollen. Coexpression network analysis identified 17 gene modules with complex and shared expression patterns containing many previously described maize genes. The data and analyses in this study provide valuable tools through improved gene annotation, gene family characterization, and a core set of candidate genes to further characterize maize reproductive development and improve grain yield potential.

  7. Genome sequencing of bacteria: sequencing, de novo assembly and rapid analysis using open source tools.

    Science.gov (United States)

    Kisand, Veljo; Lettieri, Teresa

    2013-04-01

    De novo genome sequencing of previously uncharacterized microorganisms has the potential to open up new frontiers in microbial genomics by providing insight into both functional capabilities and biodiversity. Until recently, Roche 454 pyrosequencing was the NGS method of choice for de novo assembly because it generates hundreds of thousands of long reads (tools for processing NGS data are increasingly free and open source and are often adopted for both their high quality and role in promoting academic freedom. The error rate of pyrosequencing the Alcanivorax borkumensis genome was such that thousands of insertions and deletions were artificially introduced into the finished genome. Despite a high coverage (~30 fold), it did not allow the reference genome to be fully mapped. Reads from regions with errors had low quality, low coverage, or were missing. The main defect of the reference mapping was the introduction of artificial indels into contigs through lower than 100% consensus and distracting gene calling due to artificial stop codons. No assembler was able to perform de novo assembly comparable to reference mapping. Automated annotation tools performed similarly on reference mapped and de novo draft genomes, and annotated most CDSs in the de novo assembled draft genomes. Free and open source software (FOSS) tools for assembly and annotation of NGS data are being developed rapidly to provide accurate results with less computational effort. Usability is not high priority and these tools currently do not allow the data to be processed without manual intervention. Despite this, genome assemblers now readily assemble medium short reads into long contigs (>97-98% genome coverage). A notable gap in pyrosequencing technology is the quality of base pair calling and conflicting base pairs between single reads at the same nucleotide position. Regardless, using draft whole genomes that are not finished and remain fragmented into tens of contigs allows one to characterize

  8. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    KAUST Repository

    Doan, Ryan; Cohen, Noah D; Sawyer, Jason; Ghaffari, Noushin; Johnson, Charlie D; Dindot, Scott V

    2012-01-01

    BACKGROUND: The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. RESULTS: Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. CONCLUSIONS: This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.
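
    As a quick consistency check on the figures above, mean coverage is commonly defined as total sequenced bases divided by genome size. The sketch below inverts that relation. It assumes this simple definition applies here (the abstract does not say how coverage was computed), and the function name `implied_genome_size` is hypothetical.

```python
def implied_genome_size(total_bases, mean_coverage):
    """Genome size implied by coverage = total_bases / genome_size."""
    return total_bases / mean_coverage

total_bases = 59.6e9   # 59.6 Gb of sequence, from the abstract
mean_coverage = 24.7   # 24.7X average coverage, from the abstract

size = implied_genome_size(total_bases, mean_coverage)
print(f"Implied genome size: {size / 1e9:.2f} Gb")
```

    Under that assumption the two reported numbers imply a genome of roughly 2.4 Gb.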

  9. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    KAUST Repository

    Doan, Ryan

    2012-02-17

    BACKGROUND: The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. RESULTS: Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. CONCLUSIONS: This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.

  10. An outline of reference materials for analysis techniques in China

    Energy Technology Data Exchange (ETDEWEB)

    Yuanxun, Zhang; Yine, Qian; Yongping, Zhang; Yongpeng, Tong [Shanghai Institute of Nuclear Research, Academia Sinica (China)

    1994-07-01

    This paper provides background information on developments in the field of reference materials in China. The major considerations in the development of reference materials include homogeneity, stability, handling procedures and certification. It further discusses plans to develop, in the near future, specific natural-matrix reference materials containing low levels of trace elements and having a high degree of homogeneity.

  11. An outline of reference materials for analysis techniques in China

    International Nuclear Information System (INIS)

    Zhang Yuanxun; Qian Yine; Zhang Yongping; Tong Yongpeng

    1994-01-01

    This paper provides background information on developments in the field of reference materials in China. The major considerations in the development of reference materials include homogeneity, stability, handling procedures and certification. It further discusses plans to develop, in the near future, specific natural-matrix reference materials containing low levels of trace elements and having a high degree of homogeneity.

  12. MRI of the temporo-mandibular joint: which sequence is best suited to assess the cortical bone of the mandibular condyle? A cadaveric study using micro-CT as the standard of reference.

    Science.gov (United States)

    Karlo, Christoph A; Patcas, Raphael; Kau, Thomas; Watzal, Helmut; Signorelli, Luca; Müller, Lukas; Ullrich, Oliver; Luder, Hans-Ulrich; Kellenberger, Christian J

    2012-07-01

    To determine the best suited sagittal MRI sequence out of a standard temporo-mandibular joint (TMJ) imaging protocol for the assessment of the cortical bone of the mandibular condyles of cadaveric specimens using micro-CT as the standard of reference. Sixteen TMJs in 8 human cadaveric heads (mean age, 81 years) were examined by MRI. On all sagittal sequences, two observers measured the cortical bone thickness (CBT) of the anterior, superior and posterior portions of the mandibular condyles (i.e. objective analysis), and assessed for the presence of cortical bone thinning, erosions or surface irregularities as well as subcortical bone cysts and anterior osteophytes (i.e. subjective analysis). Micro-CT of the condyles was performed to serve as the standard of reference for statistical analysis. Inter-observer agreements for objective (r = 0.83-0.99, P < 0.01) and subjective (κ = 0.67-0.88) analyses were very good. Mean CBT measurements were most accurate, and cortical bone thinning, erosions, surface irregularities and subcortical bone cysts were best depicted on the 3D fast spoiled gradient echo recalled sequence (3D FSPGR). The most reliable MRI sequence to assess the cortical bone of the mandibular condyles on sagittal imaging planes is the 3D FSPGR sequence. MRI may be used to assess the cortical bone of the TMJ. • Depiction of cortical bone is best on 3D FSPGR sequences. • MRI can assess treatment response in patients with TMJ abnormalities.

  13. Sequence analysis of the genome of carnation (Dianthus caryophyllus L.).

    Science.gov (United States)

    Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

    2014-06-01

    The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. 'Francesco' was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568,887,315 bp, consisting of 45,088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16,644 bp and 60,737 bp, respectively, and the longest scaffold was 1,287,144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNAs and miRNAs, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43,266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ∼98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp. © The Author 2013. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
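
    The N50 statistic quoted above is the length L such that sequences of length L or longer contain at least half of the total assembled bases. The sketch below shows one common way to compute it, together with the coverage fraction implied by the abstract's totals. The function name `n50` and the toy scaffold lengths are hypothetical; only the 568,887,315 bp and 622 Mb figures come from the abstract.

```python
def n50(lengths):
    """Length at which the running total of sorted lengths reaches half the sum."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Toy scaffold lengths (made up), total 300.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)
print("toy N50:", toy_n50)

# Fraction of the k-mer-estimated genome covered by the assembly.
assembled_bp = 568_887_315
estimated_genome_bp = 622_000_000
coverage_fraction = assembled_bp / estimated_genome_bp
print(f"assembly covers {coverage_fraction:.0%} of the estimated genome")
```

    In the toy example the two longest scaffolds (80 + 70 = 150) already hold half of the 300 bases, so the N50 is the shorter of the two.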

  14. Sequence analysis corresponding to the PPE and PE proteins in ...

    Indian Academy of Sciences (India)

    Unknown

    AB repeats; Mycobacterium tuberculosis genome; PE-PPE domain; PPE, PE proteins; sequence analysis; surface antigens. J. Biosci. | Vol. ... bacterium tuberculosis genomes resulted in the identification of a previously uncharacterized 225 amino acid- ...... Vega Lopez F, Brooks L A, Dockrell H M, De Smet K A,. Thompson ...

  15. Molecular cloning, expression analysis and sequence prediction of ...

    African Journals Online (AJOL)

    CCAAT/enhancer-binding protein beta as an essential transcriptional factor, regulates the differentiation of adipocytes and the deposition of fat. Herein, we cloned the whole open reading frame (ORF) of bovine C/EBPβ gene and analyzed its putative protein structures via DNA cloning and sequence analysis. Then, the ...

  16. Multilocus sequence analysis of phytopathogenic species of the genus Streptomyces

    Science.gov (United States)

    The identification and classification of species within the genus Streptomyces is difficult because there are presently 576 validly described species and this number increases every year. The value of the application of multilocus sequence analysis scheme to the systematics of Streptomyces species h...

  17. Sequence symmetry analysis in pharmacovigilance and pharmacoepidemiologic studies

    DEFF Research Database (Denmark)

    Lai, Edward Chia Cheng; Pratt, Nicole; Hsieh, Cheng Yang

    2017-01-01

    Sequence symmetry analysis (SSA) is a method for detecting adverse drug events by utilizing computerized claims data. The method has been increasingly used to investigate safety concerns of medications and as a pharmacovigilance tool to identify unsuspected side effects. Validation studies have i...

  18. DNAApp: a mobile application for sequencing data analysis.

    Science.gov (United States)

    Nguyen, Phi-Vu; Verma, Chandra Shekhar; Gan, Samuel Ken-En

    2014-11-15

    There have been numerous applications developed for decoding and visualization of ab1 DNA sequencing files for Windows and MAC platforms, yet none exists for the increasingly popular smartphone operating systems. The ability to decode sequencing files cannot easily be carried out using browser accessed Web tools. To overcome this hurdle, we have developed a new native app called DNAApp that can decode and display ab1 sequencing files on Android and iOS. In addition to in-built analysis tools such as reverse complementation, protein translation and searching for specific sequences, we have incorporated convenient functions that would facilitate the harnessing of online Web tools for a full range of analysis. Given the high usage of Android/iOS tablets and smartphones, such bioinformatics apps would raise productivity and facilitate the high demand for analyzing sequencing data in biomedical research. The Android version of DNAApp is available in Google Play Store as 'DNAApp', and the iOS version is available in the App Store. More details on the app can be found at www.facebook.com/APDLab; www.bii.a-star.edu.sg/research/trd/apd.php. The DNAApp user guide is available at http://tinyurl.com/DNAAppuser, and a video tutorial is available on Google Play Store and App Store, as well as on the Facebook page. Contact: samuelg@bii.a-star.edu.sg. © The Author 2014. Published by Oxford University Press.
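
    The three built-in operations named above (reverse complementation, protein translation and sequence search) can be illustrated in a few lines. This is a generic sketch under stated assumptions, not DNAApp's code: the function names are hypothetical, translation uses the standard genetic code in reading frame 1, and ambiguous or incomplete codons are reported as "X".

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Complement each base, then reverse the strand."""
    return seq.translate(_COMPLEMENT)[::-1]

def translate(seq):
    """Translate frame 1 codon by codon; '*' marks stop, 'X' unknown codons."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def find_all(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions

print(reverse_complement("ATGC"))
print(translate("ATGGCCTAA"))
print(find_all("ATGATGA", "ATGA"))
```

    Decoding the ab1 trace format itself is a separate step that this sketch does not attempt.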

  19. DNAApp: a mobile application for sequencing data analysis

    Science.gov (United States)

    Nguyen, Phi-Vu; Verma, Chandra Shekhar; Gan, Samuel Ken-En

    2014-01-01

    Summary: There have been numerous applications developed for decoding and visualization of ab1 DNA sequencing files for Windows and MAC platforms, yet none exists for the increasingly popular smartphone operating systems. The ability to decode sequencing files cannot easily be carried out using browser accessed Web tools. To overcome this hurdle, we have developed a new native app called DNAApp that can decode and display ab1 sequencing files on Android and iOS. In addition to in-built analysis tools such as reverse complementation, protein translation and searching for specific sequences, we have incorporated convenient functions that would facilitate the harnessing of online Web tools for a full range of analysis. Given the high usage of Android/iOS tablets and smartphones, such bioinformatics apps would raise productivity and facilitate the high demand for analyzing sequencing data in biomedical research. Availability and implementation: The Android version of DNAApp is available in Google Play Store as ‘DNAApp’, and the iOS version is available in the App Store. More details on the app can be found at www.facebook.com/APDLab; www.bii.a-star.edu.sg/research/trd/apd.php. The DNAApp user guide is available at http://tinyurl.com/DNAAppuser, and a video tutorial is available on Google Play Store and App Store, as well as on the Facebook page. Contact: samuelg@bii.a-star.edu.sg PMID:25095882

  20. Long-read sequencing data analysis for yeasts.

    Science.gov (United States)

    Yue, Jia-Xing; Liti, Gianni

    2018-06-01

    Long-read sequencing technologies have become increasingly popular due to their strengths in resolving complex genomic regions. As a leading model organism with small genome size and great biotechnological importance, the budding yeast Saccharomyces cerevisiae has many isolates currently being sequenced with long reads. However, analyzing long-read sequencing data to produce high-quality genome assembly and annotation remains challenging. Here, we present a modular computational framework named long-read sequencing data analysis for yeasts (LRSDAY), the first one-stop solution that streamlines this process. Starting from the raw sequencing reads, LRSDAY can produce chromosome-level genome assembly and comprehensive genome annotation in a highly automated manner with minimal manual intervention, which is not possible using any alternative tool available to date. The annotated genomic features include centromeres, protein-coding genes, tRNAs, transposable elements (TEs), and telomere-associated elements. Although tailored for S. cerevisiae, we designed LRSDAY to be highly modular and customizable, making it adaptable to virtually any eukaryotic organism. When applying LRSDAY to an S. cerevisiae strain, it takes ∼41 h to generate a complete and well-annotated genome from ∼100× Pacific Biosciences (PacBio) running the basic workflow with four threads. Basic experience working within the Linux command-line environment is recommended for carrying out the analysis using LRSDAY.

  1. GMOMETHODS: the European Union database of reference methods for GMO analysis.

    Science.gov (United States)

    Bonfini, Laura; Van den Bulcke, Marc H; Mazzara, Marco; Ben, Enrico; Patak, Alexandre

    2012-01-01

    In order to provide reliable and harmonized information on methods for GMO (genetically modified organism) analysis we have published a database called "GMOMETHODS" that supplies information on PCR assays validated according to the principles and requirements of ISO 5725 and/or the International Union of Pure and Applied Chemistry protocol. In addition, the database contains methods that have been verified by the European Union Reference Laboratory for Genetically Modified Food and Feed in the context of compliance with an European Union legislative act. The web application provides search capabilities to retrieve primers and probes sequence information on the available methods. It further supplies core data required by analytical labs to carry out GM tests and comprises information on the applied reference material and plasmid standards. The GMOMETHODS database currently contains 118 different PCR methods allowing identification of 51 single GM events and 18 taxon-specific genes in a sample. It also provides screening assays for detection of eight different genetic elements commonly used for the development of GMOs. The application is referred to by the Biosafety Clearing House, a global mechanism set up by the Cartagena Protocol on Biosafety to facilitate the exchange of information on Living Modified Organisms. The publication of the GMOMETHODS database can be considered an important step toward worldwide standardization and harmonization in GMO analysis.

  2. Consistent Feature Extraction From Vector Fields: Combinatorial Representations and Analysis Under Local Reference Frames

    Energy Technology Data Exchange (ETDEWEB)

    Bhatia, Harsh [Univ. of Utah, Salt Lake City, UT (United States)

    2015-05-01

    This dissertation presents research on addressing some of the contemporary challenges in the analysis of vector fields, an important type of scientific data useful for representing a multitude of physical phenomena, such as wind flow and ocean currents. In particular, new theories and computational frameworks to enable consistent feature extraction from vector fields are presented. One of the most fundamental challenges in the analysis of vector fields is that their features are defined with respect to reference frames. Unfortunately, there is no single “correct” reference frame for analysis, and an unsuitable frame may cause features of interest to remain undetected, thus creating serious physical consequences. This work develops new reference frames that enable extraction of localized features that other techniques and frames fail to detect. As a result, these reference frames objectify the notion of “correctness” of features for certain goals by revealing the phenomena of importance from the underlying data. An important consequence of using these local frames is that the analysis of unsteady (time-varying) vector fields can be reduced to the analysis of sequences of steady (time-independent) vector fields, which can be performed using simpler and scalable techniques that allow better data management by accessing the data on a per-time-step basis. Nevertheless, the state-of-the-art analysis of steady vector fields is not robust, as most techniques are numerical in nature. The residual numerical errors can violate consistency with the underlying theory by breaching important fundamental laws, which may lead to serious physical consequences. This dissertation considers consistency as the most fundamental characteristic of computational analysis that must always be preserved, and presents a new discrete theory that uses combinatorial representations and algorithms to provide consistency guarantees during vector field analysis along with the uncertainty

  3. Improved Ancestry Estimation for both Genotyping and Sequencing Data using Projection Procrustes Analysis and Genotype Imputation

    Science.gov (United States)

    Wang, Chaolong; Zhan, Xiaowei; Liang, Liming; Abecasis, Gonçalo R.; Lin, Xihong

    2015-01-01

    Accurate estimation of individual ancestry is important in genetic association studies, especially when a large number of samples are collected from multiple sources. However, existing approaches developed for genome-wide SNP data do not work well with modest amounts of genetic data, such as in targeted sequencing or exome chip genotyping experiments. We propose a statistical framework to estimate individual ancestry in a principal component ancestry map generated by a reference set of individuals. This framework extends and improves upon our previous method for estimating ancestry using low-coverage sequence reads (LASER 1.0) to analyze either genotyping or sequencing data. In particular, we introduce a projection Procrustes analysis approach that uses high-dimensional principal components to estimate ancestry in a low-dimensional reference space. Using extensive simulations and empirical data examples, we show that our new method (LASER 2.0), combined with genotype imputation on the reference individuals, can substantially outperform LASER 1.0 in estimating fine-scale genetic ancestry. Specifically, LASER 2.0 can accurately estimate fine-scale ancestry within Europe using either exome chip genotypes or targeted sequencing data with off-target coverage as low as 0.05×. Under the framework of LASER 2.0, we can estimate individual ancestry in a shared reference space for samples assayed at different loci or by different techniques. Therefore, our ancestry estimation method will accelerate discovery in disease association studies not only by helping model ancestry within individual studies but also by facilitating combined analysis of genetic data from multiple sources. PMID:26027497
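
    To make the Procrustes idea concrete, the sketch below fits an ordinary two-dimensional Procrustes alignment: the translation, rotation and uniform scale that best map one set of points onto another in the least-squares sense. This is a deliberate simplification and not the projection Procrustes procedure of LASER 2.0, which maps high-dimensional principal components into a lower-dimensional reference space. The function name `procrustes_2d` and the example coordinates are hypothetical.

```python
import math

def procrustes_2d(source, target):
    """Least-squares similarity alignment of paired 2D points (no reflection).

    Returns (theta, scale, aligned), where aligned is source mapped onto target.
    """
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(p[0] for p in target) / n
    ty = sum(p[1] for p in target) / n
    dot = cross = norm = 0.0
    for (x1, x2), (y1, y2) in zip(source, target):
        u1, u2 = x1 - sx, x2 - sy      # centered source point
        v1, v2 = y1 - tx, y2 - ty      # centered target point
        dot += u1 * v1 + u2 * v2
        cross += u1 * v2 - u2 * v1
        norm += u1 * u1 + u2 * u2
    theta = math.atan2(cross, dot)           # optimal rotation angle
    scale = math.hypot(dot, cross) / norm    # optimal uniform scale
    c, s = math.cos(theta), math.sin(theta)
    aligned = [
        (tx + scale * (c * (x1 - sx) - s * (x2 - sy)),
         ty + scale * (s * (x1 - sx) + c * (x2 - sy)))
        for x1, x2 in source
    ]
    return theta, scale, aligned

# Toy example: target is source rotated 90 degrees, scaled by 2, shifted by (3, 4).
study_coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
reference_coords = [(3.0, 4.0), (3.0, 6.0), (1.0, 4.0)]

theta, scale, aligned = procrustes_2d(study_coords, reference_coords)
print(f"rotation = {math.degrees(theta):.1f} deg, scale = {scale:.2f}")
```

    Because the toy target is an exact similarity transform of the source, the aligned points coincide with the reference coordinates; with real data a residual would remain.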

  4. Construction of an integrated database to support genomic sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  5. Reference indices of hip structural analysis in Ukrainian women

    Directory of Open Access Journals (Sweden)

    N.V. Grygorieva

    2017-10-01

    Background. Nowadays, a comprehensive assessment of osteoporosis and the risk of osteoporotic fractures involves the combined use of bone mineral density (BMD), 10-year probability of major osteoporotic fractures (Fracture Risk Assessment Tool), Trabecular Bone Score, and parameters of hip structural analysis. In recent years, reference data on the first three methods have been developed for the Ukrainian population, but no such data exist for the last of these methods. The objective of the study was to assess the age characteristics of hip structural analysis parameters in Ukrainian women and to offer their reference values for use in clinical practice. Materials and methods. Using dual energy X-ray absorptiometry, we examined 690 healthy women aged 20–89 years without osteoporosis or other clinically significant diseases and conditions affecting bone metabolism, and without other accompanying pathology of the hip joint. Results. The study showed a significant effect of age on femoral strength index (FSI), cross-sectional moment of inertia (CSMI), cross-sectional area (CSA), distance from center of femoral head to center of femoral neck (d1), distance from center of femoral head to inter-trochanteric line (d2), mean femoral neck diameter (d3), distance from center of mass of femoral neck to superior neck margin (y), shaft angle (α) and hip axis length (HAL) indices, but not on neck/shaft angle (θ). A significant decrease of FSI with age was established against a background of increasing CSMI, CSA and HAL parameters. Height and body weight were reliably related to CSMI, CSA and HAL. FSI was significantly related to body weight, but not to height. In addition, it reliably correlated with BMD measured at the femoral neck and, to a lesser extent, at the total hip and lumbar spine. HAL did not significantly correlate with any of the measured BMD values, which confirms its independent role in prediction of

  6. Survey of methods for integrated sequence analysis with emphasis on man-machine interaction

    Energy Technology Data Exchange (ETDEWEB)

    Kahlbom, U; Holmgren, P [RELCON, Stockholm (Sweden)

    1995-05-01

    This report presents a literature study concerning recently developed monotonic methodologies in the human reliability area. The work was performed by RELCON AB on commission by NKS/RAK-1, subproject 3. The topic of subproject 3 is 'Integrated Sequence Analysis with Emphasis on Man-Machine Interaction'. The purpose of the study was to compile recently developed methodologies and to propose some of these methodologies for use in the sequence analysis task. The report describes mainly non-dynamic (monotonic) methodologies. One exception is HITLINE, which is a semi-dynamic method. Reference provides a summary of approaches to dynamic analysis of man-machine-interaction, and explains the differences between monotonic and dynamic methodologies. (au) 21 refs.

  7. Survey of methods for integrated sequence analysis with emphasis on man-machine interaction

    International Nuclear Information System (INIS)

    Kahlbom, U.; Holmgren, P.

    1995-05-01

    This report presents a literature study concerning recently developed monotonic methodologies in the human reliability area. The work was performed by RELCON AB on commission by NKS/RAK-1, subproject 3. The topic of subproject 3 is 'Integrated Sequence Analysis with Emphasis on Man-Machine Interaction'. The purpose of the study was to compile recently developed methodologies and to propose some of these methodologies for use in the sequence analysis task. The report describes mainly non-dynamic (monotonic) methodologies. One exception is HITLINE, which is a semi-dynamic method. Reference provides a summary of approaches to dynamic analysis of man-machine-interaction, and explains the differences between monotonic and dynamic methodologies. (au) 21 refs.

  8. Characterization of Liaoning cashmere goat transcriptome: sequencing, de novo assembly, functional annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Hongliang Liu

    The Liaoning cashmere goat is a breed famous for its cashmere wool. In order to increase the transcriptome data and accelerate genetic improvement for this breed, we performed de novo transcriptome sequencing to generate the first expressed sequence tag dataset for the Liaoning cashmere goat, using next-generation sequencing technology. Transcriptome sequencing of Liaoning cashmere goat on a Roche 454 platform yielded 804,601 high-quality reads. Clustering and assembly of these reads produced a non-redundant set of 117,854 unigenes, comprising 13,194 isotigs and 104,660 singletons. Based on similarity searches with known proteins, 17,356 unigenes were assigned to 6,700 GO categories, and the terms were summarized into three main GO categories and 59 sub-categories. 3,548 and 46,778 unigenes had significant similarity to existing sequences in the KEGG and COG databases, respectively. Comparative analysis revealed that 42,254 unigenes were aligned to 17,532 different sequences in NCBI non-redundant nucleotide databases. 97,236 (82.51%) unigenes were mapped to the 30 goat chromosomes. 35,551 (30.17%) unigenes were matched to 11,438 reported goat protein-coding genes. The remaining non-matched unigenes were further compared with cattle and human reference genes, and 67 putative new goat genes were discovered. Additionally, 2,781 potential simple sequence repeats were initially identified from all unigenes. The transcriptome of Liaoning cashmere goat was deeply sequenced, de novo assembled, and annotated, providing abundant data to better understand the Liaoning cashmere goat transcriptome. The potential simple sequence repeats provide a material basis for future genetic linkage and quantitative trait loci analyses.

  9. Analysis of Sequence Diagram Layout in Advanced UML Modelling Tools

    Directory of Open Access Journals (Sweden)

    Ņikiforova Oksana

    2016-05-01

    System modelling using the Unified Modelling Language (UML) is a task that must be solved in software development. The more complex software becomes, the higher the requirements for presenting the system to be developed, especially its dynamic aspect, which in UML is represented by a sequence diagram. To solve this task, the main attention is devoted to the graphical presentation of the system, where diagram layout plays the central role in information perception. The UML sequence diagram, due to its specific structure, is selected for a deeper analysis of the layout of its elements. The authors' research examines the abilities of modern UML modelling tools to offer automatic layout of the UML sequence diagram and analyses them according to criteria required for diagram perception.

  10. Network clustering coefficient approach to DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gerhardt, Guenther J.L. [Universidade Federal do Rio Grande do Sul-Hospital de Clinicas de Porto Alegre, Rua Ramiro Barcelos 2350/sala 2040/90035-003 Porto Alegre (Brazil); Departamento de Fisica e Quimica da Universidade de Caxias do Sul, Rua Francisco Getulio Vargas 1130, 95001-970 Caxias do Sul (Brazil); Lemke, Ney [Programa Interdisciplinar em Computacao Aplicada, Unisinos, Av. Unisinos, 950, 93022-000 Sao Leopoldo, RS (Brazil); Corso, Gilberto [Departamento de Biofisica e Farmacologia, Centro de Biociencias, Universidade Federal do Rio Grande do Norte, Campus Universitario, 59072 970 Natal, RN (Brazil)]. E-mail: corso@dfte.ufrn.br

    2006-05-15

    In this work we propose an alternative DNA sequence analysis tool based on graph theoretical concepts. The methodology investigates the path topology of an organism's genome through a triplet network. In this network, triplets in the DNA sequence are vertices, and two vertices are connected if they occur juxtaposed on the genome. We characterize this network topology by measuring the clustering coefficient. We test our methodology against two main biases: the guanine-cytosine (GC) content and the 3-bp (base pair) periodicity of the DNA sequence. We perform the test by constructing random networks with variable GC content and imposed 3-bp periodicity. A test group of organisms is constructed and we investigate the methodology in the light of the constructed random networks. We conclude that the clustering coefficient is a valuable tool since it gives information that is not trivially contained in the 3-bp periodicity nor in the variable GC content.
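
    The triplet network described in this record can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes the sequence is read in non-overlapping triplets, that the graph is undirected and has no self-loops, and that vertices with fewer than two neighbours contribute zero to the average. The function names are hypothetical.

```python
from itertools import combinations


def triplet_network(seq, step=3):
    """Build an undirected graph whose vertices are triplets.

    Two triplets are linked when they are adjacent in the reading of `seq`.
    Assumption: non-overlapping reading (step=3); trailing bases that do
    not fill a whole triplet are ignored.
    """
    seq = seq.upper()
    triplets = [seq[i:i + 3] for i in range(0, len(seq) - 2, step)]
    adj = {t: set() for t in triplets}
    for a, b in zip(triplets, triplets[1:]):
        if a != b:  # assumption: self-loops are dropped
            adj[a].add(b)
            adj[b].add(a)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient over all vertices."""
    if not adj:
        return 0.0
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue  # such vertices contribute 0 to the average
        # Count links that exist between pairs of neighbours.
        links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


if __name__ == "__main__":
    # AAA-CCC, CCC-GGG and GGG-AAA form a triangle.
    print(average_clustering(triplet_network("AAACCCGGGAAA")))
```

    A comparison against randomized sequences with matched GC content, as the abstract describes, would reuse the same two functions on shuffled input.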

  11. Iterative normalization technique for reference sequence generation for zero-tail discrete fourier transform spread orthogonal frequency division multiplexing

    DEFF Research Database (Denmark)

    2017-01-01

    , and performing an iterative manipulation of the input sequence. The performing of the iterative manipulation of the input sequence may include, for example: computing frequency domain response of the sequence, normalizing elements of the computed frequency domain sequence to unitary power while maintaining phase...
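
    The single step named in this fragment, normalizing each frequency-domain element to unit power while keeping its phase, can be illustrated as follows. This is only a sketch of that one step under stated assumptions: the rest of the iteration is elided in the record and is not reconstructed here, the plain DFT stands in for whatever transform the original uses, and assigning phase zero to empty bins is an arbitrary choice. The function names are hypothetical.

```python
import cmath


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def normalize_unit_power(spectrum, eps=1e-12):
    """Scale every element to magnitude 1 while keeping its phase.

    Assumption: bins with (near-)zero magnitude have no defined phase
    and are set to 1+0j.
    """
    out = []
    for c in spectrum:
        mag = abs(c)
        out.append(c / mag if mag > eps else 1 + 0j)
    return out


if __name__ == "__main__":
    spectrum = dft([1, 2, 3, -1])
    for before, after in zip(spectrum, normalize_unit_power(spectrum)):
        print(abs(after), cmath.phase(before), cmath.phase(after))
```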

  12. Using SQL Databases for Sequence Similarity Searching and Analysis.

    Science.gov (United States)

    Pearson, William R; Mackey, Aaron J

    2017-09-13

    Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc.
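
    The idea of loading similarity search results into a relational database and summarizing them by kingdom can be sketched with Python's built-in sqlite3 module. The schema below is hypothetical and is not the seqdb_demo or search_demo schema from the unit; the table names, column names, accessions and E-values are toy values invented for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a toy (hypothetical) schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE protein (acc TEXT PRIMARY KEY, organism TEXT, kingdom TEXT);
        CREATE TABLE hit (query_acc TEXT, subject_acc TEXT, evalue REAL);
        """
    )
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?)",
        [
            ("ec1", "E. coli", "Bacteria"),
            ("ec2", "E. coli", "Bacteria"),
            ("bs1", "B. subtilis", "Bacteria"),
            ("hs1", "H. sapiens", "Eukaryota"),
            ("mj1", "M. jannaschii", "Archaea"),
        ],
    )
    conn.executemany(
        "INSERT INTO hit VALUES (?, ?, ?)",
        [
            ("ec1", "bs1", 1e-50),
            ("ec1", "hs1", 1e-30),
            ("ec1", "mj1", 1e-8),
            ("ec2", "bs1", 1e-10),
            ("ec2", "mj1", 0.5),  # too weak to count as a homolog
        ],
    )
    return conn


def homolog_counts_by_kingdom(conn, organism, max_evalue=1e-6):
    """Count query proteins of `organism` with a significant hit per kingdom."""
    rows = conn.execute(
        """
        SELECT s.kingdom, COUNT(DISTINCT h.query_acc)
        FROM hit AS h
        JOIN protein AS q ON q.acc = h.query_acc
        JOIN protein AS s ON s.acc = h.subject_acc
        WHERE q.organism = ? AND s.organism <> q.organism AND h.evalue <= ?
        GROUP BY s.kingdom
        ORDER BY s.kingdom
        """,
        (organism, max_evalue),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(homolog_counts_by_kingdom(build_demo_db(), "E. coli"))
```

    Keeping the hits in a table means the E-value threshold and the taxonomic grouping become query parameters rather than steps that require re-running the search.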

  13. Reactor neutron activation analysis on reference materials from intercomparison runs

    International Nuclear Information System (INIS)

    Pantelica, A.; Salagean, M.

    2003-01-01

    A review is presented of the use of the Instrumental Neutron Activation Analysis (INAA) technique in our laboratory to determine major, minor and trace elements in mineral and biological samples from international intercomparison runs organised by IAEA Vienna, IAEA-MEL Monaco, 'pb-anal' Kosice, INCT Warszawa and IPNT Krakow. Neutron irradiation was carried out at the WWR-S reactor in Bucharest (short and long irradiation) during 1982-1997 and at the TRIGA reactor in Pitesti (long irradiation) during the later period. The following types of materials were analysed: soils, marine sediments, uranium phosphate ore, water sludge, copper flue dust, whey powder, yeast, cereal flour (rye and wheat), marine animal tissue (mussel, garfish and tuna fish), as well as vegetal tissue (seaweed, cabbage, spinach, alfalfa, algae, tea leaves and herbs). The following elements could, in general, be determined: Ag, As, Au, Ba, Br, Ca, Ce, Co, Cr, Cs, Eu, Fe, Hf, Hg, K, La, Lu, Mo, Na, Nd, Ni, Rb, Sb, Sc, Se, Sm, Sr, Ta, Tb, Th, U, W, Yb and Zn from long-lived radionuclides, as well as Al, Ca, Cl, Cu, Mg, Mn, and Ti from short-lived radionuclides. Data obtained in our laboratory for samples of various matrices are presented and compared with the intercomparison certified values. The intercomparison exercises offer the participating laboratories the opportunity to test the accuracy of their analytical methods as well as to acquire valuable Reference Materials/standards for future analytical applications. (authors)

  14. Nuclear microprobe analysis of the standard reference materials

    International Nuclear Information System (INIS)

    Jaksic, M.; Fazinic, S.; Bogdanovic, I.; Tadic, T.

    2002-01-01

    Most existing Standard Reference Materials (SRMs) for nuclear analytical methods are certified for analyzed masses on the order of a few hundred mg. The typical mass of a sample analyzed by PIXE or XRF methods is very often below 1 mg. With the development of focused proton or x-ray beams, the masses that can typically be analyzed go down to the μg or even ng level. It is difficult to make biological or environmental SRMs with the desired homogeneity at such a small scale. However, the use of fundamental parameter quantitative evaluation procedures (the absolute method) minimizes the need for SRMs. In the PIXE and micro-PIXE setup at our Institute, the fundamental parameter approach is used. For exact calibration of the quantitative analysis procedure, just one standard sample is needed. In our case, glass standards that showed homogeneity down to the micron scale were used. Of course, it is desirable to use SRMs for quality assurance, so the need for homogeneous materials can be justified even for the micro-PIXE method. This presentation gives a brief overview of PIXE setup calibration, along with some recent results of tests of several SRMs.

  15. SEQUENCING AND SEQUENCE ANALYSIS OF MYOSTATIN GENE IN THE EXON 1 OF THE CAMEL (CAMELUS DROMEDARIUS)

    Directory of Open Access Journals (Sweden)

    M. G. SHAH, A. S. QURESHI, M. REISSMANN AND H. J. SCHWARTZ

    2006-10-01

    Myostatin, also called growth differentiation factor-8 (GDF-8), is a member of the transforming growth factor beta (TGF-beta) superfamily and is expressed specifically in developing and adult skeletal muscle. The muscular hypertrophy (mh) allele in double-muscled breeds involves a mutation within the myostatin gene. Genomic DNA was isolated from camel hair using the NucleoSpin Tissue kit. Two animals from each of six breeds, namely Marecha, Dhatti, Larri, Kohi, Sakrai and Cambelpuri, were used for sequencing. For PCR amplification of the gene, a primer pair was designed from homologous regions of already published sequences of farm animals from GenBank. Results showed that camel myostatin possessed more than 90% homology with that of cattle, sheep and pig. Camel formed a separate cluster from the pig in spite of having high homology (98%) and showed 94% homology with cattle and sheep, as reported in the literature. The sequence of the PCR-amplified part of exon 1 (256 bp) of camel myostatin was identical among the six camel breeds.

  16. Complete genome sequence analysis of novel human bocavirus reveals genetic recombination between human bocavirus 2 and human bocavirus 4.

    Science.gov (United States)

    Khamrin, Pattara; Okitsu, Shoko; Ushijima, Hiroshi; Maneekarn, Niwat

    2013-07-01

    Epidemiological surveillance of human bocavirus (HBoV) was conducted on fecal specimens collected from hospitalized children with diarrhea in Chiang Mai, Thailand in 2011. By partial sequence analysis of the VP1 gene, an unusual strain of HBoV (CMH-S011-11) was initially identified as HBoV4. The complete genome sequence of CMH-S011-11 was determined and analyzed further to clarify whether it was a recombinant strain or a new HBoV variant. Analysis of the complete genome sequence revealed that the coding sequence from NS1 through NP1 to VP1/VP2 was 4795 nucleotides long. Interestingly, the nucleotide sequence of the NS1 gene of CMH-S011-11 was most closely related to the HBoV2 reference strains detected in Pakistan, which contradicted the initial genotyping result based on the partial VP1 region in the previous study. In addition, comparison of the NP1 nucleotide sequence of CMH-S011-11 with those of other HBoV1-4 reference strains also revealed a high level of sequence identity with HBoV2. On the other hand, the nucleotide sequence of the VP1/VP2 gene of CMH-S011-11 was most closely related to those of HBoV4 reference strains detected in Nigeria. The overall full-length sequence analysis revealed that CMH-S011-11 was grouped within the HBoV4 species, but located in a separate branch from other HBoV4 prototype strains. Recombination analysis revealed that CMH-S011-11 was the result of recombination between HBoV2 and HBoV4 strains, with the break point located near the start codon of VP2. Copyright © 2013 Elsevier B.V. All rights reserved.

  17. sRNAnalyzer-a flexible and customizable small RNA sequencing data analysis pipeline.

    Science.gov (United States)

    Wu, Xiaogang; Kim, Taek-Kyun; Baxter, David; Scherler, Kelsey; Gordon, Aaron; Fong, Olivia; Etheridge, Alton; Galas, David J; Wang, Kai

    2017-12-01

    Although many tools have been developed to analyze small RNA sequencing (sRNA-Seq) data, it remains challenging to accurately analyze the small RNA population, mainly due to multiple sequence ID assignment caused by short read length. Additional issues in small RNA analysis include low consistency of microRNA (miRNA) measurement results across different platforms, miRNA mapping associated with miRNA sequence variation (isomiR) and RNA editing, and the origin of those unmapped reads after screening against all endogenous reference sequence databases. To address these issues, we built a comprehensive and customizable sRNA-Seq data analysis pipeline-sRNAnalyzer, which enables: (i) comprehensive miRNA profiling strategies to better handle isomiRs and summarization based on each nucleotide position to detect potential SNPs in miRNAs, (ii) different sequence mapping result assignment approaches to simulate results from microarray/qRT-PCR platforms and a local probabilistic model to assign mapping results to the most-likely IDs, (iii) comprehensive ribosomal RNA filtering for accurate mapping of exogenous RNAs and summarization based on taxonomy annotation. We evaluated our pipeline on both artificial samples (including synthetic miRNA and Escherichia coli cultures) and biological samples (human tissue and plasma). sRNAnalyzer is implemented in Perl and available at: http://srnanalyzer.systemsbiology.net/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. sRNAnalyzer—a flexible and customizable small RNA sequencing data analysis pipeline

    Science.gov (United States)

    Kim, Taek-Kyun; Baxter, David; Scherler, Kelsey; Gordon, Aaron; Fong, Olivia; Etheridge, Alton; Galas, David J.

    2017-01-01

    Although many tools have been developed to analyze small RNA sequencing (sRNA-Seq) data, it remains challenging to accurately analyze the small RNA population, mainly due to multiple sequence ID assignment caused by short read length. Additional issues in small RNA analysis include low consistency of microRNA (miRNA) measurement results across different platforms, miRNA mapping associated with miRNA sequence variation (isomiR) and RNA editing, and the origin of those unmapped reads after screening against all endogenous reference sequence databases. To address these issues, we built a comprehensive and customizable sRNA-Seq data analysis pipeline, sRNAnalyzer, which enables: (i) comprehensive miRNA profiling strategies to better handle isomiRs and summarization based on each nucleotide position to detect potential SNPs in miRNAs, (ii) different sequence mapping result assignment approaches to simulate results from microarray/qRT-PCR platforms and a local probabilistic model to assign mapping results to the most-likely IDs, (iii) comprehensive ribosomal RNA filtering for accurate mapping of exogenous RNAs and summarization based on taxonomy annotation. We evaluated our pipeline on both artificial samples (including synthetic miRNA and Escherichia coli cultures) and biological samples (human tissue and plasma). sRNAnalyzer is implemented in Perl and available at: http://srnanalyzer.systemsbiology.net/. PMID:29069500

  19. Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud.

    Directory of Open Access Journals (Sweden)

    Malachi Griffith

    2015-08-01

    Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki.

  20. PseudoMLSA: a database for multigenic sequence analysis of Pseudomonas species

    Directory of Open Access Journals (Sweden)

    Lalucat Jorge

    2010-04-01

    Background: The genus Pseudomonas comprises more than 100 species of environmental, clinical, agricultural, and biotechnological interest. Although the recommended method for discriminating bacterial species is DNA-DNA hybridisation, alternative techniques based on multigenic sequence analysis are becoming common practice in bacterial species discrimination studies. Since there is no general criterion for determining which genes are more useful for species resolution, the number of strains and genes analysed is increasing continuously. As a result, sequences of different genes are dispersed throughout several databases. This sequence information needs to be collected in a common database in order to be useful for future identification-based projects. Description: The PseudoMLSA Database is a comprehensive database of multiple gene sequences from strains of Pseudomonas species. The core of the database is composed of selected gene sequences from all Pseudomonas type strains validly assigned to the genus through 2008. The database is intended to be useful for MultiLocus Sequence Analysis (MLSA) procedures, for the identification and characterisation of any Pseudomonas bacterial isolate. The sequences are available for download via a direct connection to the National Center for Biotechnology Information (NCBI). Additionally, the database includes an online BLAST interface for flexible nucleotide queries and similarity searches with the user's datasets, and provides a user-friendly output for easily parsing, navigating, and analysing BLAST results. Conclusions: The PseudoMLSA database amasses strains and sequence information of validly described Pseudomonas species, and allows free querying of the database via a user-friendly, web-based interface available at http://www.uib.es/microbiologiaBD/Welcome.html. The web-based platform enables easy retrieval at strain or gene sequence information level; including references to published peer

  1. An Imaging And Graphics Workstation For Image Sequence Analysis

    Science.gov (United States)

    Mostafavi, Hassan

    1990-01-01

    This paper describes an application-specific engineering workstation designed and developed to analyze imagery sequences from a variety of sources. The system combines the software and hardware environment of the modern graphic-oriented workstations with the digital image acquisition, processing and display techniques. The objective is to achieve automation and high throughput for many data reduction tasks involving metric studies of image sequences. The applications of such an automated data reduction tool include analysis of the trajectory and attitude of aircraft, missile, stores and other flying objects in various flight regimes including launch and separation as well as regular flight maneuvers. The workstation can also be used in an on-line or off-line mode to study three-dimensional motion of aircraft models in simulated flight conditions such as wind tunnels. The system's key features are: 1) Acquisition and storage of image sequences by digitizing real-time video or frames from a film strip; 2) computer-controlled movie loop playback, slow motion and freeze frame display combined with digital image sharpening, noise reduction, contrast enhancement and interactive image magnification; 3) multiple leading edge tracking in addition to object centroids at up to 60 fields per second from both live input video or a stored image sequence; 4) automatic and manual field-of-view and spatial calibration; 5) image sequence data base generation and management, including the measurement data products; 6) off-line analysis software for trajectory plotting and statistical analysis; 7) model-based estimation and tracking of object attitude angles; and 8) interface to a variety of video players and film transport sub-systems.

  2. Sirius PSB: a generic system for analysis of biological sequences.

    Science.gov (United States)

    Koh, Chuan Hock; Lin, Sharene; Jedd, Gregory; Wong, Limsoon

    2009-12-01

    Computational tools are essential components of modern biological research. For example, BLAST searches can be used to identify related proteins based on sequence homology, or when a new genome is sequenced, prediction models can be used to annotate functional sites such as transcription start sites, translation initiation sites and polyadenylation sites and to predict protein localization. Here we present Sirius Prediction Systems Builder (PSB), a new computational tool for sequence analysis, classification and searching. Sirius PSB has four main operations: (1) building a classifier, (2) deploying a classifier, (3) searching for proteins similar to query proteins, and (4) preliminary and post-prediction analysis. Sirius PSB supports all these operations via a simple and interactive graphical user interface. Besides being a convenient tool, Sirius PSB also introduces two novelties in sequence analysis. Firstly, a genetic algorithm is used to identify interesting features in the feature space. Secondly, instead of the conventional method of searching for similar proteins via sequence similarity, we introduce searching via feature similarity. To demonstrate the capabilities of Sirius PSB, we have built two prediction models, one for the recognition of Arabidopsis polyadenylation sites and another for the subcellular localization of proteins. Both systems are competitive against current state-of-the-art models based on evaluation of public datasets. More notably, the time and effort required to build each model is greatly reduced with the assistance of Sirius PSB. Furthermore, we show that under certain conditions when BLAST is unable to find related proteins, Sirius PSB can identify functionally related proteins based on their biophysical similarities. Sirius PSB and its related supplements are available at: http://compbio.ddns.comp.nus.edu.sg/~sirius.

  3. HPV-QUEST: A highly customized system for automated HPV sequence analysis capable of processing Next Generation sequencing data set.

    Science.gov (United States)

    Yin, Li; Yao, Jiqiang; Gardner, Brent P; Chang, Kaifen; Yu, Fahong; Goodenow, Maureen M

    2012-01-01

    Next Generation sequencing (NGS) applied to human papilloma viruses (HPV) can provide sensitive methods to investigate the molecular epidemiology of multiple-type HPV infection. Currently a genotyping system with a comprehensive collection of updated HPV reference sequences and a capacity to handle NGS data sets is lacking. HPV-QUEST was developed as an automated and rapid HPV genotyping system. The web-based HPV-QUEST subtyping algorithm was developed using HTML, PHP, the Perl scripting language, and MySQL as the database backend. HPV-QUEST includes a database of annotated HPV reference sequences with updated nomenclature covering 5 genera, 14 species and 150 mucosal and cutaneous types to genotype BLAST-searched query sequences. HPV-QUEST processes up to 10 megabases of sequences within 1 to 2 minutes. Results are reported in HTML, text and Excel formats and display e-value, BLAST score, and local and coverage identities; provide genus, species, type, infection site and risk for the best matched reference HPV sequence; and produce results ready for additional analyses.

  4. MRI of the temporo-mandibular joint: which sequence is best suited to assess the cortical bone of the mandibular condyle? A cadaveric study using micro-CT as the standard of reference

    International Nuclear Information System (INIS)

    Karlo, Christoph A.; Patcas, Raphael; Signorelli, Luca; Mueller, Lukas; Kau, Thomas; Watzal, Helmut; Kellenberger, Christian J.; Ullrich, Oliver; Luder, Hans-Ulrich

    2012-01-01

    To determine the best suited sagittal MRI sequence out of a standard temporo-mandibular joint (TMJ) imaging protocol for the assessment of the cortical bone of the mandibular condyles of cadaveric specimens using micro-CT as the standard of reference. Sixteen TMJs in 8 human cadaveric heads (mean age, 81 years) were examined by MRI. Upon all sagittal sequences, two observers measured the cortical bone thickness (CBT) of the anterior, superior and posterior portions of the mandibular condyles (i.e. objective analysis), and assessed for the presence of cortical bone thinning, erosions or surface irregularities as well as subcortical bone cysts and anterior osteophytes (i.e. subjective analysis). Micro-CT of the condyles was performed to serve as the standard of reference for statistical analysis. Inter-observer agreements for objective (r = 0.83-0.99, P < 0.01) and subjective (κ = 0.67-0.88) analyses were very good. Mean CBT measurements were most accurate, and cortical bone thinning, erosions, surface irregularities and subcortical bone cysts were best depicted on the 3D fast spoiled gradient echo recalled sequence (3D FSPGR). The most reliable MRI sequence to assess the cortical bone of the mandibular condyles on sagittal imaging planes is the 3D FSPGR sequence. (orig.)

  5. MRI of the temporo-mandibular joint: which sequence is best suited to assess the cortical bone of the mandibular condyle? A cadaveric study using micro-CT as the standard of reference

    Energy Technology Data Exchange (ETDEWEB)

    Karlo, Christoph A. [University Hospital Zurich, Department of Diagnostic and Interventional Radiology, Zurich (Switzerland); University Children's Hospital Zurich, Department of Diagnostic Imaging, Zurich (Switzerland); Patcas, Raphael; Signorelli, Luca; Mueller, Lukas [University of Zurich, Clinic for Orthodontics and Pediatric Dentistry, Center of Dental Medicine, Zurich (Switzerland); Kau, Thomas; Watzal, Helmut; Kellenberger, Christian J. [University Children's Hospital Zurich, Department of Diagnostic Imaging, Zurich (Switzerland); Ullrich, Oliver [University of Zurich, Institute of Anatomy, Faculty of Medicine, Zurich (Switzerland); Luder, Hans-Ulrich [University of Zurich, Section of Orofacial Structures and Development, Center of Dental Medicine, Zurich (Switzerland)

    2012-07-15

    To determine the best suited sagittal MRI sequence out of a standard temporo-mandibular joint (TMJ) imaging protocol for the assessment of the cortical bone of the mandibular condyles of cadaveric specimens using micro-CT as the standard of reference. Sixteen TMJs in 8 human cadaveric heads (mean age, 81 years) were examined by MRI. Upon all sagittal sequences, two observers measured the cortical bone thickness (CBT) of the anterior, superior and posterior portions of the mandibular condyles (i.e. objective analysis), and assessed for the presence of cortical bone thinning, erosions or surface irregularities as well as subcortical bone cysts and anterior osteophytes (i.e. subjective analysis). Micro-CT of the condyles was performed to serve as the standard of reference for statistical analysis. Inter-observer agreements for objective (r = 0.83-0.99, P < 0.01) and subjective (κ = 0.67-0.88) analyses were very good. Mean CBT measurements were most accurate, and cortical bone thinning, erosions, surface irregularities and subcortical bone cysts were best depicted on the 3D fast spoiled gradient echo recalled sequence (3D FSPGR). The most reliable MRI sequence to assess the cortical bone of the mandibular condyles on sagittal imaging planes is the 3D FSPGR sequence. (orig.)

  6. Molecular characterization of Giardia psittaci by multilocus sequence analysis.

    Science.gov (United States)

    Abe, Niichiro; Makino, Ikuko; Kojima, Atsushi

    2012-12-01

    Multilocus sequence analyses targeting small subunit ribosomal DNA (SSU rDNA), elongation factor 1 alpha (ef1α), glutamate dehydrogenase (gdh), and beta giardin (β-giardin) were performed on Giardia psittaci isolates from three Budgerigars (Melopsittacus undulatus) and four Barred parakeets (Bolborhynchus lineola) kept in individual households or imported from overseas. Nucleotide differences and phylogenetic analyses at four loci indicate the distinction of G. psittaci from the other known Giardia species: Giardia muris, Giardia microti, Giardia ardeae, and Giardia duodenalis assemblages. Furthermore, G. psittaci was related more closely to G. duodenalis than to the other known Giardia species, except for G. microti. Conflicting signals regarded as "double peaks" were found at the same nucleotide positions of the ef1α in all isolates. However, the sequences of the other three loci, including gdh and β-giardin, which are known to be highly variable, were mutually identical across all isolates at every locus. They showed no double peaks. These results suggest that double peaks found in the ef1α sequences are caused not by mixed infection with genetically different G. psittaci isolates but by allelic sequence heterogeneity (ASH), which is observed in diplomonad lineages including G. duodenalis. No sequence difference was found in any G. psittaci isolates at the gdh and β-giardin loci, suggesting that G. psittaci is indeed not more diverse genetically than other Giardia species. This report is the first to provide evidence related to the genetic characteristics of G. psittaci obtained using multilocus sequence analysis. Copyright © 2012 Elsevier B.V. All rights reserved.

  7. CISAPS: Complex Informational Spectrum for the Analysis of Protein Sequences

    Directory of Open Access Journals (Sweden)

    Charalambos Chrysostomou

    2015-01-01

    Complex informational spectrum analysis for protein sequences (CISAPS) and its web-based server are developed and presented. As recent studies show, using only the absolute spectrum in informational spectrum analysis of protein sequences is insufficient. Therefore, CISAPS is developed to consider and provide results in three forms: the absolute, real, and imaginary spectra. Biologically related features in the analysis of influenza A subtypes, presented here as a case study, can also appear individually in either the real or the imaginary spectrum. As the results show, protein classes can present similarities or differences according to the features extracted from the CISAPS web server. These associations are likely related to the protein feature that the specific amino acid index represents. In addition, various technical issues that may affect the analysis, such as zero-padding and windowing, are also addressed. CISAPS uses an expanded list of 611 unique amino acid indices, each representing a different property, to perform the analysis. This web-based server enables researchers with little knowledge of signal processing methods to apply complex informational spectrum analysis in their work.

  8. Hunting down frame shifts: Ecological analysis of diverse functional gene sequences

    Directory of Open Access Journals (Sweden)

    Michal Strejcek

    2015-11-01

    Functional gene ecological analyses using amplicon sequencing can be challenging as translated sequences are often burdened with shifted reading frames. The aim of this work was to evaluate several bioinformatics tools designed to correct errors which arise during sequencing in an effort to reduce the number of frame-shifts (FS). Genes encoding alpha subunits of biphenyl (bphA) and benzoate (benA) dioxygenases were used as model sequences. FrameBot, a FS correction tool, was able to reduce the number of detected FS to zero. However, up to 43.1% of sequences were discarded by FrameBot as non-specific targets. Therefore, we proposed a de novo mode of FrameBot for FS correction, which works on a similar basis as common chimera identifying platforms and is not dependent on reference sequences. By nature of the FrameBot de novo design, it is crucial to provide it with data as error free as possible. We tested the ability of several publicly available correction tools to decrease the number of errors in the data sets. The combination of Maximum Expected Error (MEE) filtering and single linkage pre-clustering (SLP) proved the most efficient read processing. Applying FrameBot de novo on the processed data enabled analysis of BphA sequences with minimal losses of potentially functional sequences not homologous to those previously known. This experiment also demonstrated the extensive diversity of dioxygenases in soil. A script which performs FrameBot de novo is presented in the supplementary material to the study and the tool was implemented into FunGene Pipeline available at http://fungene.cme.msu.edu/FunGenePipeline/ and https://github.com/rdpstaff/Framebot.
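    The Maximum Expected Error (MEE) filtering step mentioned above can be illustrated with a minimal sketch. It assumes the commonly used definition of expected errors as the sum of per-base error probabilities derived from Phred scores, p = 10^(-Q/10); the function names, the threshold of 1.0 and the toy reads are hypothetical and are not taken from the study or from FrameBot.

```python
def expected_errors(phred_scores):
    """Expected number of errors in a read: sum of p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def mee_filter(reads, max_ee=1.0):
    """Keep (sequence, qualities) pairs whose expected error count is <= max_ee.

    max_ee = 1.0 is an illustrative threshold, not a value from the study.
    """
    return [(seq, quals) for seq, quals in reads
            if expected_errors(quals) <= max_ee]


# Toy input: one high-quality read and one low-quality read (hypothetical data).
reads = [
    ("ACGT", [40, 40, 40, 40]),  # expected errors = 4 * 1e-4
    ("ACGT", [3, 3, 3, 3]),      # expected errors is roughly 2.0
]
kept = mee_filter(reads, max_ee=1.0)
print(len(kept))
```

    Filtering on the summed error probability rather than on the mean quality penalises long reads with many mediocre bases, which is one reason this kind of filter is often applied before clustering steps.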

  9. FDSTools: A software package for analysis of massively parallel sequencing data with the ability to recognise and correct STR stutter and other PCR or sequencing noise.

    Science.gov (United States)

    Hoogenboom, Jerry; van der Gaag, Kristiaan J; de Leeuw, Rick H; Sijen, Titia; de Knijff, Peter; Laros, Jeroen F J

    2017-03-01

    Massively parallel sequencing (MPS) is on the advent of a broad scale application in forensic research and casework. The improved capability to analyse evidentiary traces representing unbalanced mixtures is often mentioned as one of the major advantages of this technique. However, most of the available software packages that analyse forensic short tandem repeat (STR) sequencing data are not well suited for high throughput analysis of such mixed traces. The largest challenge is the presence of stutter artefacts in STR amplifications, which are not readily discerned from minor contributions. FDSTools is an open-source software solution developed for this purpose. The level of stutter formation is influenced by various aspects of the sequence, such as the length of the longest uninterrupted stretch occurring in an STR. When MPS is used, STRs are evaluated as sequence variants that each have particular stutter characteristics which can be precisely determined. FDSTools uses a database of reference samples to determine stutter and other systemic PCR or sequencing artefacts for each individual allele. In addition, stutter models are created for each repeating element in order to predict stutter artefacts for alleles that are not included in the reference set. This information is subsequently used to recognise and compensate for the noise in a sequence profile. The result is a better representation of the true composition of a sample. Using Promega Powerseq™ Auto System data from 450 reference samples and 31 two-person mixtures, we show that the FDSTools correction module decreases stutter ratios above 20% to below 3%. Consequently, much lower levels of contributions in the mixed traces are detected. FDSTools contains modules to visualise the data in an interactive format allowing users to filter data with their own preferred thresholds. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  10. CAFE: aCcelerated Alignment-FrEe sequence analysis.

    Science.gov (United States)

    Lu, Yang Young; Tang, Kujin; Ren, Jie; Fuhrman, Jed A; Waterman, Michael S; Sun, Fengzhu

    2017-07-03

    Alignment-free genome and metagenome comparisons are increasingly important with the development of next generation sequencing (NGS) technologies. Recently developed state-of-the-art k-mer based alignment-free dissimilarity measures including CVTree, $d_2^*$ and $d_2^S$ are more computationally expensive than measures based solely on the k-mer frequencies. Here, we report a standalone software, aCcelerated Alignment-FrEe sequence analysis (CAFE), for efficient calculation of 28 alignment-free dissimilarity measures. CAFE allows for both assembled genome sequences and unassembled NGS shotgun reads as input, and wraps the output in a standard PHYLIP format. In downstream analyses, CAFE can also be used to visualize the pairwise dissimilarity measures, including dendrograms, heatmap, principal coordinate analysis and network display. CAFE serves as a general k-mer based alignment-free analysis platform for studying the relationships among genomes and metagenomes, and is freely available at https://github.com/younglululu/CAFE. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
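    As an illustration of what a measure "based solely on the k-mer frequencies" looks like, the following minimal sketch computes a Euclidean dissimilarity between the normalised k-mer frequency vectors of two sequences. It is not CAFE's implementation and does not reproduce CVTree, $d_2^*$ or $d_2^S$; the function names and the default k = 3 are assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_freqs(seq, k):
    """Normalise k-mer counts to relative frequencies."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than k")
    return {word: c / total for word, c in counts.items()}


def euclidean_dissimilarity(seq_a, seq_b, k=3):
    """Euclidean distance between two k-mer frequency vectors."""
    fa, fb = kmer_freqs(seq_a, k), kmer_freqs(seq_b, k)
    words = set(fa) | set(fb)
    return sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in words))


# Identical sequences have zero dissimilarity; disjoint k-mer sets do not.
print(euclidean_dissimilarity("ACGTACGT", "ACGTACGT"))
print(euclidean_dissimilarity("AAAA", "CCCC", k=2))
```

    Measures of this kind need only one pass over each sequence, which is why the more elaborate background-corrected measures named in the abstract are comparatively expensive.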

  11. Reference model analysis of suitability for logistics management

    Directory of Open Access Journals (Sweden)

    Cezary Mańkowski

    2011-12-01

    Reference models are one of the many instruments aspiring to join the set of concepts, methods and techniques used in logistics management. Therefore, the aim of this paper is to present the results of assessing the suitability of reference models for solving logistical problems. This evaluation indicates that they are universal and support the realization of all logistics management functions in various areas, such as the logistics of manufacturing glass products.

  12. Study of the fast inversion recovery pulse sequence. With reference to fast fluid attenuated inversion recovery and fast short TI inversion recovery pulse sequence

    International Nuclear Information System (INIS)

    Tsuchihashi, Toshio; Maki, Toshio; Suzuki, Takeshi

    1997-01-01

    The fast inversion recovery (fast IR) pulse sequence was evaluated. We compared the fast fluid attenuated inversion recovery (fast FLAIR) pulse sequence, in which inversion time (TI) was established as equal to the water null point for the purpose of the water-suppressed T2-weighted image, with the fast short TI inversion recovery (fast STIR) pulse sequence, in which TI was established as equal to the fat null point for the purpose of fat suppression. In the fast FLAIR pulse sequence, the water null point was increased by making TR longer. In the FLAIR pulse sequence, the longitudinal magnetization contrast is determined by TI. If TI is increased, T2-weighted contrast improves in the same way as increasing TR for the SE pulse sequence. Therefore, images should be taken with long TR and long TI, which are longer than TR and longer than the water null point. On the other hand, the fat null point is not affected by TR in the fast STIR pulse sequence. However, effective TE was affected by variation of the null point. This increased in proportion to the increase in effective TE. Our evaluation indicated that the fast STIR pulse sequence can control the extensive signals from fat in a short time. (author)

  13. Analysis of decision procedures for a sequence of inventory periods

    International Nuclear Information System (INIS)

    Avenhaus, R.

    1982-07-01

    Optimal test procedures for a sequence of inventory periods will be discussed. Starting with a game theoretical description of the conflict situation between the plant operator and the inspector, the objectives of the inspector as well as the general decision theoretical problem will be formulated. In the first part the objective of 'secure' detection will be emphasized which means that only at the end of the reference time a decision is taken by the inspector. In the second part the objective of 'timely' detection will be emphasized which will lead to sequential test procedures. At the end of the paper all procedures will be summarized, and in view of the multitude of procedures available at the moment some comments about future work will be given. (orig./HP)

  14. Environmental impact analysis for the main accidental sequences of ignitor

    International Nuclear Information System (INIS)

    Carpignano, A.; Francabandiera, S.; Vella, R.; Zucchetti, M.

    1996-01-01

    A safety analysis study has been applied to the Ignitor machine using Probabilistic Safety Assessment. The main initiating events have been identified, and accident sequences have been studied by means of traditional methods such as Failure Mode and Effect Analysis (FMEA), Fault Trees (FT) and Event Trees (ET). The consequences of the radioactive environmental releases have been assessed in terms of Effective Dose Equivalents (EDEs) to the Most Exposed Individuals (MEI) of the chosen site, by means of a population dose code. Results point out the low environmental impact of the machine. 13 refs., 1 fig., 3 tabs

  15. Use of gamma spectrometry for analysis of three reference materials

    International Nuclear Information System (INIS)

    Kinova, L.

    2004-01-01

    All reference materials (Reference material A: weight = 49.23 g; Reference material B: weight = 36.08 g; Reference material C: weight = 26.18 g) were packed in 50 cm³ polypropylene vials, sealed and measured consecutively three times at intervals averaging 25 days. A low-background gamma spectrometry system was used: an HPGe detector with high energy resolution (FWHM at the 1332 keV line of Co-60 is 1.9 keV; relative counting efficiency at the same energy is 21%). Results: All materials are of low activity and must be measured for a long time. The highest specific activity of the man-made radionuclides Cs-137 and Am-241 is in material A. An instrumentally measurable activity of Pb-210 can also be observed in this material. Medium values are found in material B. Judging by its specific activity, reference material C appears to be a low natural radioactivity material, with the highest activity from the natural nuclides Th-232 and Pa-234 (progeny of U-238). Conclusions: Gamma spectrometry is a useful tool for initial measurement of materials with low radioactivity. Such measurements give an orientation on the nuclide content and approximate activity in the material for the subsequent radiochemical determinations.

  16. Methylobacterium genome sequences: a reference blueprint to investigate microbial metabolism of C1 compounds from natural and industrial sources.

    Science.gov (United States)

    Vuilleumier, Stéphane; Chistoserdova, Ludmila; Lee, Ming-Chun; Bringel, Françoise; Lajus, Aurélie; Zhou, Yang; Gourion, Benjamin; Barbe, Valérie; Chang, Jean; Cruveiller, Stéphane; Dossat, Carole; Gillett, Will; Gruffaz, Christelle; Haugen, Eric; Hourcade, Edith; Levy, Ruth; Mangenot, Sophie; Muller, Emilie; Nadalig, Thierry; Pagni, Marco; Penny, Christian; Peyraud, Rémi; Robinson, David G; Roche, David; Rouy, Zoé; Saenampechek, Channakhone; Salvignol, Grégory; Vallenet, David; Wu, Zaining; Marx, Christopher J; Vorholt, Julia A; Olson, Maynard V; Kaul, Rajinder; Weissenbach, Jean; Médigue, Claudine; Lidstrom, Mary E

    2009-01-01

    Methylotrophy describes the ability of organisms to grow on reduced organic compounds without carbon-carbon bonds. The genomes of two pink-pigmented facultative methylotrophic bacteria of the Alpha-proteobacterial genus Methylobacterium, the reference species Methylobacterium extorquens strain AM1 and the dichloromethane-degrading strain DM4, were compared. The 6.88 Mb genome of strain AM1 comprises a 5.51 Mb chromosome, a 1.26 Mb megaplasmid and three plasmids, while the 6.12 Mb genome of strain DM4 features a 5.94 Mb chromosome and two plasmids. The chromosomes are highly syntenic and share a large majority of genes, while plasmids are mostly strain-specific, with the exception of a 130 kb region of the strain AM1 megaplasmid which is syntenic to a chromosomal region of strain DM4. Both genomes contain large sets of insertion elements, many of them strain-specific, suggesting an important potential for genomic plasticity. Most of the genomic determinants associated with methylotrophy are nearly identical, with two exceptions that illustrate the metabolic and genomic versatility of Methylobacterium. A 126 kb dichloromethane utilization (dcm) gene cluster is essential for the ability of strain DM4 to use DCM as the sole carbon and energy source for growth and is unique to strain DM4. The methylamine utilization (mau) gene cluster is only found in strain AM1, indicating that strain DM4 employs an alternative system for growth with methylamine. The dcm and mau clusters represent two of the chromosomal genomic islands (AM1: 28; DM4: 17) that were defined. The mau cluster is flanked by mobile elements, but the dcm cluster disrupts a gene annotated as chelatase and for which we propose the name "island integration determinant" (iid). These two genome sequences provide a platform for intra- and interspecies genomic comparisons in the genus Methylobacterium, and for investigations of the adaptive mechanisms which allow bacterial lineages to acquire methylotrophic

  17. Analysis of sequence diversity through internal transcribed spacers and simple sequence repeats to identify Dendrobium species.

    Science.gov (United States)

    Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y

    2014-04-08

    The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.

  18. Using Behavior Sequence Analysis to Map Serial Killers' Life Histories.

    Science.gov (United States)

    Keatley, David A; Golightly, Hayley; Shephard, Rebecca; Yaksic, Enzo; Reid, Sasha

    2018-03-01

    The aim of the current research was to provide a novel method for mapping the developmental sequences of serial killers' life histories. An in-depth biographical account of serial killers' lives, from birth through to conviction, was gained and analyzed using Behavior Sequence Analysis. The analyses highlight similarities in behavioral events across the serial killers' lives, indicating not only which risk factors occur, but the temporal order of these factors. Results focused on early childhood environment, indicating the role of parental abuse; behaviors and events surrounding criminal histories of serial killers, showing that many had previous convictions and were known to police for other crimes; behaviors surrounding their murders, highlighting differences in victim choice and modus operandi; and, finally, trial pleas and convictions. The present research, therefore, provides a novel approach to synthesizing large volumes of data on criminals and presenting results in accessible, understandable outcomes.

  19. Reference analysis of the signal + background model in counting experiments

    Science.gov (United States)

    Casadei, D.

    2012-01-01

    The model representing two independent Poisson processes, labelled as "signal" and "background" and both contributing additively to the total number of counted events, is considered from a Bayesian point of view. This is a widely used model for the searches of rare or exotic events in the presence of a background source, as for example in the searches performed by high-energy physics experiments. Under the assumption of prior knowledge about the background yield, a reference prior is obtained for the signal alone and its properties are studied. Finally, the properties of the full solution, the marginal reference posterior, are illustrated with a few examples.
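    The signal + background counting model can be made concrete with a small numerical sketch. Note the swapped-in simplifications: the background yield is treated as exactly known, and a flat prior on the signal is used instead of the reference prior derived in the paper, so this illustrates the likelihood structure only. The function name, grid settings and example counts are hypothetical.

```python
from math import exp, lgamma, log


def signal_posterior(n_obs, background, s_max=50.0, steps=5000):
    """Grid posterior density for the signal yield s >= 0.

    Assumes n_obs ~ Poisson(s + background) with background > 0 known exactly
    and a flat prior on s (NOT the reference prior of the paper).
    """
    if background <= 0:
        raise ValueError("background must be positive in this sketch")
    ds = s_max / steps
    grid = [i * ds for i in range(steps + 1)]

    def log_likelihood(s):
        mu = s + background
        return n_obs * log(mu) - mu - lgamma(n_obs + 1)

    weights = [exp(log_likelihood(s)) for s in grid]
    norm = sum(weights) * ds  # simple Riemann-sum normalisation
    return grid, [w / norm for w in weights]


# Example: 10 observed events with an expected background of 3.
grid, density = signal_posterior(n_obs=10, background=3.0)
mode = grid[max(range(len(density)), key=density.__getitem__)]
print(mode)
```

    With a flat prior the posterior mode sits where s + background equals the observed count, which gives a quick sanity check before a more careful choice of prior is substituted.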

  20. Content Analysis of Virtual Reference Data: Reshaping Library Website Design.

    Science.gov (United States)

    Fan, Suhua Caroline; Welch, Jennifer M

    2016-01-01

    An academic health sciences library wanted to redesign its website to provide better access to health information in the community. Virtual reference data were used to provide information about user searching behavior. This study analyzed three years (2012-2014) of virtual reference data, including e-mail questions, text messaging, and live chat transcripts, to evaluate the library website for redesigning, especially in areas such as the home page, patrons' terminology, and issues prompting patrons to ask for help. A coding system based on information links in the current library website was created to analyze the data.

  1. Swab-to-Sequence: Real-time Data Analysis Platform for the Biomolecule Sequencer

    Data.gov (United States)

    National Aeronautics and Space Administration — DNA was successfully sequenced on the ISS in 2016, but the DNA sequenced was prepared on the ground. With FY’16 IRAD funds, the same team developed a...

  2. Sensitivity analysis of reference evapotranspiration to sensor accuracy

    Science.gov (United States)

    Meteorological sensor networks are often used across agricultural regions to calculate the ASCE Standardized Reference ET Equation, and inaccuracies in individual sensors can lead to inaccuracies in ET estimates. Multiyear datasets from the semi-arid Colorado Agricultural Meteorological (CoAgMet) an...

  3. Student Teacher Letters of Reference: A Critical Analysis

    Science.gov (United States)

    Mason, Richard W.; Schroeder, Mark P.

    2012-01-01

    Letters of reference are commonly used in acquiring a job in education. Despite serious issues of validity and reliability in writing and evaluating letters, there is a dearth of research that systematically examines the evaluation process and defines the constructs that define high quality letters. The current study used NVivo to examine 160…

  4. MerCat: a versatile k-mer counter and diversity estimator for database-independent property analysis obtained from metagenomic and/or metatranscriptomic sequencing data

    Energy Technology Data Exchange (ETDEWEB)

    White, Richard A.; Panyala, Ajay R.; Glass, Kevin A.; Colby, Sean M.; Glaesemann, Kurt R.; Jansson, Georg C.; Jansson, Janet K.

    2017-02-21

    MerCat is a parallel, highly scalable and modular property software package for robust analysis of features in next-generation sequencing data. MerCat inputs include assembled contigs and raw sequence reads from any platform resulting in feature abundance counts tables. MerCat allows for direct analysis of data properties without reference sequence database dependency commonly used by search tools such as BLAST and/or DIAMOND for compositional analysis of whole community shotgun sequencing (e.g. metagenomes and metatranscriptomes).

  5. Detailed tail proteomic analysis of axolotl (Ambystoma mexicanum) using an mRNA-seq reference database.

    Science.gov (United States)

    Demircan, Turan; Keskin, Ilknur; Dumlu, Seda Nilgün; Aytürk, Nilüfer; Avşaroğlu, Mahmut Erhan; Akgün, Emel; Öztürk, Gürkan; Baykal, Ahmet Tarık

    2017-01-01

    Salamander axolotl has been emerging as an important model for stem cell research due to its powerful regenerative capacity. Several advantages, such as the high capability of advanced tissue, organ, and appendages regeneration, promote axolotl as an ideal model system to extend our current understanding on the mechanisms of regeneration. Acknowledging the common molecular pathways between amphibians and mammals, there is a great potential to translate the messages from axolotl research to mammalian studies. However, the utilization of axolotl is hindered due to the lack of reference databases of genomic, transcriptomic, and proteomic data. Here, we introduce the proteome analysis of the axolotl tail section searched against an mRNA-seq database. We translated axolotl mRNA sequences to protein sequences and annotated these to process the LC-MS/MS data and identified 1001 nonredundant proteins. Functional classification of identified proteins was performed by gene ontology searches. The presence of some of the identified proteins was validated by in situ antibody labeling. Furthermore, we have analyzed the proteome expressional changes postamputation at three time points to evaluate the underlying mechanisms of the regeneration process. Taken together, this work expands the proteomics data of axolotl to contribute to its establishment as a fully utilized model. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  6. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    STORAGESEVER

    2010-07-19

    Jul 19, 2010 ... and antisense primers, a single band of 573 base pairs .... Amino acid sequence alignment of Cluster I and Cluster II of phylogenetic tree. First ten sequences ... sequence weighting, position-specific gap penalties and weight.

  7. Linear discriminant analysis of character sequences using occurrences of words

    KAUST Repository

    Dutta, Subhajit; Chaudhuri, Probal; Ghosh, Anil

    2014-01-01

    Classification of character sequences, where the characters come from a finite set, arises in disciplines such as molecular biology and computer science. For discriminant analysis of such character sequences, the Bayes classifier based on Markov models turns out to have class boundaries defined by linear functions of occurrences of words in the sequences. It is shown that for such classifiers based on Markov models with unknown orders, if the orders are estimated from the data using cross-validation, the resulting classifier has Bayes risk consistency under suitable conditions. Even when Markov models are not valid for the data, we develop methods for constructing classifiers based on linear functions of occurrences of words, where the word length is chosen by cross-validation. Such linear classifiers are constructed using ideas of support vector machines, regression depth, and distance weighted discrimination. We show that classifiers with linear class boundaries have certain optimal properties in terms of their asymptotic misclassification probabilities. The performance of these classifiers is demonstrated in various simulated and benchmark data sets.

  8. Planarian homeobox genes: cloning, sequence analysis, and expression.

    Science.gov (United States)

    Garcia-Fernàndez, J; Baguñà, J; Saló, E

    1991-01-01

    Freshwater planarians (Platyhelminthes, Turbellaria, and Tricladida) are acoelomate, triploblastic, unsegmented, and bilaterally symmetrical organisms that are mainly known for their ample power to regenerate a complete organism from a small piece of their body. To identify potential pattern-control genes in planarian regeneration, we have isolated two homeobox-containing genes, Dth-1 and Dth-2 [Dugesia (Girardia) tigrina homeobox], by using degenerate oligonucleotides corresponding to the most conserved amino acid sequence from helix-3 of the homeodomain. Dth-1 and Dth-2 homeodomains are closely related (68% at the nucleotide level and 78% at the protein level) and show the conserved residues characteristic of the homeodomains identified to date. Similarity with most homeobox sequences is low (30-50%), except with Drosophila NK homeodomains (80-82% with NK-2) and the rodent TTF-1 homeodomain (77-87%). Some unusual amino acid residues specific to NK-2, TTF-1, Dth-1, and Dth-2 can be observed in the recognition helix (helix-3) and may define a family of homeodomains. The deduced amino acid sequences from the cDNAs contain, in addition to the homeodomain, other domains also present in various homeobox-containing genes. The expression of both genes, detected by Northern blot analysis, appears slightly higher in cephalic regions than in the rest of the intact organism, while a slight increase is detected in the central period (5 days) of regeneration. PMID: 1714599

  9. Analysis of correlations between sites in models of protein sequences

    International Nuclear Information System (INIS)

    Giraud, B.G.; Lapedes, A.; Liu, L.C.

    1998-01-01

    A criterion based on conditional probabilities, related to the concept of algorithmic distance, is used to detect correlated mutations at noncontiguous sites on sequences. We apply this criterion to the problem of analyzing correlations between sites in protein sequences; however, the analysis applies generally to networks of interacting sites with discrete states at each site. Elementary models, where explicit results can be derived easily, are introduced. The number of states per site considered ranges from 2, illustrating the relation to familiar classical spin systems, to 20 states, suitable for representing amino acids. Numerical simulations show that the criterion remains valid even when the genetic history of the data samples (e.g., protein sequences), as represented by a phylogenetic tree, introduces nonindependence between samples. Statistical fluctuations due to finite sampling are also investigated and do not invalidate the criterion. A subsidiary result is found: The more homogeneous a population, the more easily its average properties can drift from the properties of its ancestor. copyright 1998 The American Physical Society

  10. Linear discriminant analysis of character sequences using occurrences of words

    KAUST Repository

    Dutta, Subhajit

    2014-02-01

    Classification of character sequences, where the characters come from a finite set, arises in disciplines such as molecular biology and computer science. For discriminant analysis of such character sequences, the Bayes classifier based on Markov models turns out to have class boundaries defined by linear functions of occurrences of words in the sequences. It is shown that for such classifiers based on Markov models with unknown orders, if the orders are estimated from the data using cross-validation, the resulting classifier has Bayes risk consistency under suitable conditions. Even when Markov models are not valid for the data, we develop methods for constructing classifiers based on linear functions of occurrences of words, where the word length is chosen by cross-validation. Such linear classifiers are constructed using ideas of support vector machines, regression depth, and distance weighted discrimination. We show that classifiers with linear class boundaries have certain optimal properties in terms of their asymptotic misclassification probabilities. The performance of these classifiers is demonstrated in various simulated and benchmark data sets.

  11. Analysis of the Reference Systems of Modern Selenographic Systems

    Science.gov (United States)

    Nefedyev, Yuri; Petrova, Natalia; Andreev, Alexey; Demina, Natalya

    2016-07-01

    This work analyzes the reference systems of modern selenographic systems. The position of the Moon's center of mass relative to its center of figure was determined from the data of the "Clementine" and "Kaguya" missions and the "ULCN" and "KSC-1162" catalogues. Knowledge of the Moon's center of mass position relative to its center of figure is important for research on lunar origin, structure and evolution, and for precise solutions of circumlunar navigation tasks. At present this task is the most relevant and in demand for lunar space missions. Spherical harmonic expansions of degree and order N=5 of the lunar function h(λ, β) were computed using the package ASNI USTU. A module for extending a local area of the surface to the full sphere was used. The parameters of space missions are given for comparison (SAI; Bills, Ferrari). The normalized coefficients of the expansions are obtained for eight sources of hypsometric information: "Clementine" (N=40), "KSC-1162" (N=5), "Kiev" (N=5), "SAI" (N=10; Chuikova (1975)), "Bills, Ferrari", "Kaguya" (Selena, Japan mission), "ULCN" (The Unified Lunar Control Network 2005). The displacements of the lunar center of figure relative to the lunar center of mass were defined from the equations (Chuikova (1975)): $\Delta\xi = \sqrt{3}\,C_{11}$, $\Delta\eta = \sqrt{3}\,S_{11}$, $\Delta\zeta = \sqrt{3}\,C_{10}$, where ξ is the axis directed towards the Earth, η is the equatorial axis perpendicular to ξ, ζ is the rotation axis of the Moon, and $C_{11}$, $S_{11}$, $C_{10}$ are the normalized amplitudes of the first-order harmonics of the relief expansion. We then considered: mathematical models in the form of expansions in spherical functions; methods for estimating the model parameters; and information technology for data processing. The model describing the behavior of the relief on the lunar sphere is the expansion of the height in a series of spherical harmonics (Sagitov (1979)) in the form of a regression model
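
    The displacement relations quoted from Chuikova (1975) amount to scaling each normalized first-order coefficient by the square root of 3. A minimal sketch follows; the function name and the sample coefficient values are hypothetical, and the result carries whatever length unit the coefficients are expressed in.

```python
import math


def figure_offset(c11, s11, c10):
    """Return (d_xi, d_eta, d_zeta) from normalized first-order coefficients.

    d_xi   = sqrt(3) * C11  (axis directed towards the Earth)
    d_eta  = sqrt(3) * S11  (equatorial axis perpendicular to xi)
    d_zeta = sqrt(3) * C10  (rotation axis of the Moon)
    """
    root3 = math.sqrt(3.0)
    return (root3 * c11, root3 * s11, root3 * c10)


# Hypothetical coefficient values, for illustration only.
print(figure_offset(1.0, 0.0, -2.0))
```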

  12. Sequence analysis of PROTEOLYSIS 6 from Solanum lycopersicum

    Science.gov (United States)

    Roslan, Nur Farhana; Chew, Bee Lyn; Goh, Hoe-Han; Isa, Nurulhikma Md

    2018-04-01

    The N-end rule pathway is a protein degradation pathway that relates the half-life of a protein to the identity of its N-terminal residue. A destabilizing N-terminal residue is created by enzymatic reaction or chemical modification. The destabilized substrate is recognized by the PROTEOLYSIS 6 (PRT6) protein, an E3 ligase, which results in degradation of the substrate by the proteasome. PRT6 has been studied in Arabidopsis thaliana and barley but not yet in fleshy fruit plants. Hence, this study was carried out in tomato, which is known as the model for fleshy fruit plants. BLASTX analysis identified Solyc09g010830 as encoding PRT6 in tomato, based on its sequence similarity with PRT6 in A. thaliana. In silico gene expression analysis shows that the PRT6 gene is highly expressed in tomato fruits at breaker +5. Co-expression analysis shows that PRT6 may be involved not only in abiotic stresses but also in biotic stresses. The objective is to analyze the sequence and characterize the PRT6 gene in tomato.

  13. Determining physical constraints in transcriptional initiation complexes using DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Shultzaberger, Ryan K.; Chiang, Derek Y.; Moses, Alan M.; Eisen, Michael B.

    2007-07-01

    Eukaryotic gene expression is often under the control of cooperatively acting transcription factors whose binding is limited by structural constraints. By determining these structural constraints, we can understand the "rules" that define functional cooperativity. Conversely, by understanding the rules of binding, we can infer structural characteristics. We have developed an information theory based method for approximating the physical limitations of cooperative interactions by comparing sequence analysis to microarray expression data. When applied to the coordinated binding of the sulfur amino acid regulatory protein Met4 by Cbf1 and Met31, we were able to create a combinatorial model that can correctly identify Met4 regulated genes.

  14. Analysis of Periodic Errors for Synthesized-Reference-Wave Holography

    Directory of Open Access Journals (Sweden)

    V. Schejbal

    2009-12-01

    Full Text Available Synthesized-reference-wave holographic techniques offer relatively simple and cost-effective measurement of antenna radiation characteristics and reconstruction of complex aperture fields using near-field intensity-pattern measurement. These methods allow utilization of advantages of methods for probe compensations for amplitude and phasing near-field measurements for the planar and cylindrical scanning including accuracy analyses. The paper analyzes periodic errors, which can be created during scanning, using both theoretical results and numerical simulations.

  15. Methylobacterium genome sequences: a reference blueprint to investigate microbial metabolism of C1 compounds from natural and industrial sources.

    Directory of Open Access Journals (Sweden)

    Stéphane Vuilleumier

    Full Text Available Methylotrophy describes the ability of organisms to grow on reduced organic compounds without carbon-carbon bonds. The genomes of two pink-pigmented facultative methylotrophic bacteria of the Alpha-proteobacterial genus Methylobacterium, the reference species Methylobacterium extorquens strain AM1 and the dichloromethane-degrading strain DM4, were compared. The 6.88 Mb genome of strain AM1 comprises a 5.51 Mb chromosome, a 1.26 Mb megaplasmid and three plasmids, while the 6.12 Mb genome of strain DM4 features a 5.94 Mb chromosome and two plasmids. The chromosomes are highly syntenic and share a large majority of genes, while plasmids are mostly strain-specific, with the exception of a 130 kb region of the strain AM1 megaplasmid which is syntenic to a chromosomal region of strain DM4. Both genomes contain large sets of insertion elements, many of them strain-specific, suggesting an important potential for genomic plasticity. Most of the genomic determinants associated with methylotrophy are nearly identical, with two exceptions that illustrate the metabolic and genomic versatility of Methylobacterium. A 126 kb dichloromethane utilization (dcm) gene cluster is essential for the ability of strain DM4 to use DCM as the sole carbon and energy source for growth and is unique to strain DM4. The methylamine utilization (mau) gene cluster is only found in strain AM1, indicating that strain DM4 employs an alternative system for growth with methylamine. The dcm and mau clusters represent two of the chromosomal genomic islands (AM1: 28; DM4: 17) that were defined. The mau cluster is flanked by mobile elements, but the dcm cluster disrupts a gene annotated as chelatase and for which we propose the name "island integration determinant" (iid). These two genome sequences provide a platform for intra- and interspecies genomic comparisons in the genus Methylobacterium, and for investigations of the adaptive mechanisms which allow bacterial lineages to acquire

  16. mPUMA: a computational approach to microbiota analysis by de novo assembly of operational taxonomic units based on protein-coding barcode sequences.

    Science.gov (United States)

    Links, Matthew G; Chaban, Bonnie; Hemmingsen, Sean M; Muirhead, Kevin; Hill, Janet E

    2013-08-15

    Formation of operational taxonomic units (OTU) is a common approach to data aggregation in microbial ecology studies based on amplification and sequencing of individual gene targets. The de novo assembly of OTU sequences has been recently demonstrated as an alternative to widely used clustering methods, providing robust information from experimental data alone, without any reliance on an external reference database. Here we introduce mPUMA (microbial Profiling Using Metagenomic Assembly, http://mpuma.sourceforge.net), a software package for identification and analysis of protein-coding barcode sequence data. It was developed originally for Cpn60 universal target sequences (also known as GroEL or Hsp60). Using an unattended process that is independent of external reference sequences, mPUMA forms OTUs by DNA sequence assembly and is capable of tracking OTU abundance. mPUMA processes microbial profiles both in terms of the direct DNA sequence as well as in the translated amino acid sequence for protein coding barcodes. By forming OTUs and calculating abundance through an assembly approach, mPUMA is capable of generating inputs for several popular microbiota analysis tools. Using SFF data from sequencing of a synthetic community of Cpn60 sequences derived from the human vaginal microbiome, we demonstrate that mPUMA can faithfully reconstruct all expected OTU sequences and produce compositional profiles consistent with actual community structure. mPUMA enables analysis of microbial communities while empowering the discovery of novel organisms through OTU assembly.

  17. Reference material for trace analysis by radioanalytical methods: Bowen's Kale

    International Nuclear Information System (INIS)

    Wainerdi, R.E.

    1979-01-01

    A fairly large volume of published data on 'Bowen's Kale' has been examined critically in order to develop recommendations for the use of this preparation as a 'reference material' in the standardisation and evaluation of the reliability of analytical procedures. Values are now recommended for the contents of twelve elements present in major to trace concentrations in 'Bowen's Kale'. 'Indicated values' for another 16 elements are provided. Values for 15 more elements are listed with no recommendation. The criteria adopted in categorising elements into these groups are discussed. (author)

  18. Standard reference materials analysis for MINT Radiocarbon Laboratory

    International Nuclear Information System (INIS)

    Noraishah Othman; Kamisah Alias; Nasasni Nasrul

    2004-01-01

    As a follow-up to the setting up of the MINT Radiocarbon Dating facility, an exercise on the IAEA standard reference materials was carried out. Radiocarbon laboratories frequently use these 8 natural samples to verify their systems. The materials were either pretreated or analysed directly to determine the 14C activity of the five samples, expressed in % Modern (pMC) terms, and to make recommendations on further use of these materials. We present the results for the five materials and discuss the analyses that were undertaken. (Author)

  19. EPA flow reference method testing and analysis: Findings report

    International Nuclear Information System (INIS)

    1999-06-01

    This report describes an experimental program sponsored by the US Environmental Protection Agency (EPA) to evaluate potential improvements to the Agency's current reference method for measuring volumetric flow (Method 2, 40 CFR Part 60, Appendix B). Method 2 (Determination of Stack Gas Velocity and Volumetric Flow Rate (Type S Pitot Tube)) specifies measurements to determine volumetric flow, but does not prescribe specific procedures to account for yaw or pitch angles of flow when the flow in the stack is not axial. Method 2 also allows the use of only two probe types, the Type S and the Prandtl

  20. Streaming support for data intensive cloud-based sequence analysis.

    Science.gov (United States)

    Issa, Shadi A; Kienzler, Romeo; El-Kalioby, Mohamed; Tonellato, Peter J; Wall, Dennis; Bruggmann, Rémy; Abouelhoda, Mohamed

    2013-01-01

    Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of "resources-on-demand" and "pay-as-you-go", scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client's site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation.
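
    The idea of processing reads while they are still arriving, which relies on each read being independent of the others, can be sketched with a generator. This is a generic illustration and not the elastream API: it assumes four-line FASTQ records, and the per-read task (GC fraction) and all function names are hypothetical.

```python
import io


def fastq_records(stream):
    """Yield (name, sequence, quality) one record at a time from a text stream.

    Assumes the simple four-line FASTQ layout; nothing is buffered beyond
    the current record, so work can start before the transfer finishes.
    """
    while True:
        header = stream.readline()
        if not header:
            return
        seq = stream.readline().rstrip("\n")
        stream.readline()  # '+' separator line, ignored
        qual = stream.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual


def gc_fraction(seq):
    """Fraction of G and C bases; stands in for any per-read analysis."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def process_stream(stream):
    """Apply the per-read task to each record as soon as it is available."""
    for name, seq, _qual in fastq_records(stream):
        yield name, gc_fraction(seq)


# io.StringIO stands in for a network stream in this sketch.
incoming = io.StringIO("@r1\nGGCC\n+\nIIII\n@r2\nATAT\n+\nIIII\n")
for name, gc in process_stream(incoming):
    print(name, gc)
```

    Because each record is handled independently, the same loop could in principle sit behind any file-like object, for example a socket wrapped in a text reader.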

  1. Streaming Support for Data Intensive Cloud-Based Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Shadi A. Issa

    2013-01-01

    Full Text Available Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of “resources-on-demand” and “pay-as-you-go”, scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client’s site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation.

  2. Next-generation sequence analysis of cancer xenograft models.

    Directory of Open Access Journals (Sweden)

    Fernando J Rossello

    Full Text Available Next-generation sequencing (NGS) studies in cancer are limited by the amount, quality and purity of tissue samples. In this situation, primary xenografts have proven useful preclinical models. However, the presence of mouse-derived stromal cells represents a technical challenge to their use in NGS studies. We examined this problem in an established primary xenograft model of small cell lung cancer (SCLC), a malignancy often diagnosed from small biopsy or needle aspirate samples. Using an in silico strategy that assigns reads according to species-of-origin, we prospectively compared NGS data from primary xenograft models with matched cell lines and with published datasets. We show here that low-coverage whole-genome analysis demonstrated remarkable concordance between published genome data and internal controls, despite the presence of mouse genomic DNA. Exome capture sequencing revealed that this enrichment procedure was highly species-specific, with less than 4% of reads aligning to the mouse genome. Human-specific expression profiling with RNA-Seq replicated array-based gene expression experiments, whereas mouse-specific transcript profiles correlated with published datasets from human cancer stroma. We conclude that primary xenografts represent a useful platform for complex NGS analysis in cancer research for tumours with limited sample resources, or those with prominent stromal cell populations.
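
    The abstract does not state how reads are assigned to a species. One common approach, sketched here purely as an assumption, is to align every read to both the human and the mouse genome and keep the better-scoring assignment; the function name, the score inputs and the margin parameter are all hypothetical.

```python
def assign_species(human_score, mouse_score, margin=0):
    """Classify one read from its best alignment score against each genome.

    A score of None means the read did not align to that genome. Reads whose
    scores differ by no more than `margin` are reported as ambiguous rather
    than forced into either species.
    """
    if human_score is None and mouse_score is None:
        return "unmapped"
    if mouse_score is None:
        return "human"
    if human_score is None:
        return "mouse"
    if human_score - mouse_score > margin:
        return "human"
    if mouse_score - human_score > margin:
        return "mouse"
    return "ambiguous"


# Hypothetical per-read score pairs (human, mouse).
reads = {"r1": (60, 20), "r2": (None, 50), "r3": (40, 40), "r4": (None, None)}
for name, (h, m) in reads.items():
    print(name, assign_species(h, m))
```

    Keeping an explicit ambiguous class avoids counting reads from conserved regions towards either genome.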

  3. Streaming Support for Data Intensive Cloud-Based Sequence Analysis

    Science.gov (United States)

    Issa, Shadi A.; Kienzler, Romeo; El-Kalioby, Mohamed; Tonellato, Peter J.; Wall, Dennis; Bruggmann, Rémy; Abouelhoda, Mohamed

    2013-01-01

    Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of “resources-on-demand” and “pay-as-you-go”, scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client's site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation. PMID:23710461

  4. Extended -Regular Sequence for Automated Analysis of Microarray Images

    Directory of Open Access Journals (Sweden)

    Jin Hee-Jeong

    2006-01-01

    Full Text Available Microarray study enables us to obtain hundreds of thousands of expressions of genes or genotypes at once, and it is an indispensable technology for genome research. The first step is the analysis of scanned microarray images. This is the most important procedure for obtaining biologically reliable data. Currently most microarray image processing systems require burdensome manual block/spot indexing work. Since the amount of experimental data is increasing very quickly, automated microarray image analysis software becomes important. In this paper, we propose two automated methods for analyzing microarray images. First, we propose the extended -regular sequence to index blocks and spots, which enables a novel automatic gridding procedure. Second, we provide a methodology, hierarchical metagrid alignment, to allow reliable and efficient batch processing for a set of microarray images. Experimental results show that the proposed methods are more reliable and convenient than the commercial tools.

  5. Sequence Quality Analysis Tool for HIV Type 1 Protease and Reverse Transcriptase

    OpenAIRE

    DeLong, Allison K.; Wu, Mingham; Bennett, Diane; Parkin, Neil; Wu, Zhijin; Hogan, Joseph W.; Kantor, Rami

    2012-01-01

    Access to antiretroviral therapy is increasing globally and drug resistance evolution is anticipated. Currently, protease (PR) and reverse transcriptase (RT) sequence generation is increasing, including the use of in-house sequencing assays, and quality assessment prior to sequence analysis is essential. We created a computational HIV PR/RT Sequence Quality Analysis Tool (SQUAT) that runs in the R statistical environment. Sequence quality thresholds are calculated from a large dataset (46,802...

  6. Testing and reference model analysis of FTTH system

    Science.gov (United States)

    Feng, Xiancheng; Cui, Wanlong; Chen, Ying

    2009-08-01

    With the rapid development of the Internet and broadband access networks, xDSL, FTTx+LAN and WLAN technologies are finding more applications, and new network services keep emerging, especially network gaming, video conferencing and video on demand. With its enormous bandwidth, FTTH supports all present and future services, including traditional telecommunication, data and TV services as well as future digital TV and VOD. Because of this bandwidth, FTTH is regarded as the final solution for broadband networks and the ultimate goal of optical access network development. Fiber to the Home (FTTH) will be the goal of telecommunications cable broadband access. In accordance with the development trend of telecommunication services, to enhance the capacity of the integrated access network and to achieve triple-play (voice, data, image), optical fiber can be extended from the existing Fiber to the Curb (FTTC), Fiber to the Zone (FTTZ) and Fiber to the Building (FTTB) user optical cable networks to the end user as an FTTH system by using EPON technology. The article first introduces the basic components of an FTTH system, then explains the reference model and reference points for testing the FTTH system. Finally, using test connection diagrams, test procedures and expected results, it analyzes SNI interface testing, PON interface testing, Ethernet performance testing, UNI interface testing, Ethernet functional testing, PON functional testing, equipment functional testing, telephone functional testing, operational support capability testing and other tests of the FTTH system. ...

  7. Complete genome sequencing and phylogenetic analysis of dengue type 1 virus isolated from Jeddah, Saudi Arabia.

    Science.gov (United States)

    Azhar, Esam I; Hashem, Anwar M; El-Kafrawy, Sherif A; Abol-Ela, Said; Abd-Alla, Adly M M; Sohrab, Sayed Sartaj; Farraj, Suha A; Othman, Norah A; Ben-Helaby, Huda G; Ashshi, Ahmed; Madani, Tariq A; Jamjoom, Ghazi

    2015-01-16

    Dengue viruses (DENVs) are mosquito-borne viruses which can cause disease ranging from mild fever to severe dengue infection. These viruses are endemic in several tropical and subtropical regions. Multiple outbreaks of DENV serotypes 1, 2 and 3 (DENV-1, DENV-2 and DENV-3) have been reported from the western region in Saudi Arabia since 1994. Strains from at least two genotypes of DENV-1 (Asia and America/Africa genotypes) were circulating in western Saudi Arabia until 2006. However, all previous studies reported from Saudi Arabia were based on partial sequencing data of the envelope (E) gene without any reports of full genome sequences for any DENV serotypes circulating in Saudi Arabia. Here, we report the isolation and the first complete genome sequence of a DENV-1 strain (DENV-1-Jeddah-1-2011) isolated from a patient from Jeddah, Saudi Arabia in 2011. Whole genome sequence alignment and phylogenetic analysis showed high similarity between DENV-1-Jeddah-1-2011 strain and D1/H/IMTSSA/98/606 isolate (Asian genotype) reported from Djibouti in 1998. Further analysis of the full envelope gene revealed a close relationship between DENV-1-Jeddah-1-2011 strain and isolates reported between 2004 and 2006 from Jeddah as well as recent isolates from Somalia, suggesting wide spread of the Asian genotype in this region. These data suggest that strains belonging to the Asian genotype might have been introduced into Saudi Arabia long before 2004, most probably by African pilgrims, and continued to circulate in western Saudi Arabia at least until 2011. Most importantly, these results indicate that pilgrims from dengue endemic regions can play an important role in the spread of new DENVs in Saudi Arabia and the rest of the world. Therefore, availability of complete genome sequences would serve as a reference for future epidemiological studies of DENV-1 viruses.

  8. Neutron activation analysis of trace elements in IAEA reference materials

    International Nuclear Information System (INIS)

    Cheema, M.N.; Hasany, S.M.; Hanif, I.; Chaudhry, M.S.; Qureshi, I.H.

    1978-09-01

    Analytical Chemistry Group of Nuclear Chemistry Division at PINSTECH has been participating in IAEA Intercomparison programme of analytical quality control since 1972. So far fifteen samples of a variety of materials received from the Agency have been analyzed for different minor and trace elements. Mostly destructive and non-destructive neutron activation analysis techniques have been used for elemental analysis. In this report the description of the samples and the experimental procedures employed have been mentioned. The results of elemental analysis have been reported and compared with IAEA values which are based on the average computed from the results of different participating laboratories. (authors)

  9. Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.

    Science.gov (United States)

    Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro

    2010-05-07

    Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.

  10. Molecular cloning and sequencing analysis of the interferon receptor (IFNAR-1) from Columba livia.

    Science.gov (United States)

    Li, Chao; Chang, Wei Shan

    2014-01-01

    Partial sequence cloning of interferon receptor (IFNAR-1) of Columba livia. In order to obtain a certain length (630 bp) of gene, a pair of primers was designed according to the conserved nucleotide sequence of Gallus (EU477527.1) and Taeniopygia guttata (XM_002189232.1) IFNAR-1 gene fragment that was published by GenBank. Special primers were designed by the Race method to amplify the 3'terminal cDNA. The Columba livia IFNAR-1 displayed 88.5%, 80.5% and 73.8% nucleotide identity to Falco peregrinus, Gallus and Taeniopygia guttata, respectively. Phylogenetic analysis of the IFNAR1 gene showed that the relationship of Columba livia, Falco peregrinus and chicken had high homology. We successfully obtained a Columba livia IFNAR-1 gene partial sequence. Analysis of the genetic tree showed that the relationship of Columba livia and Falco peregrinus IFNAR-1 had high homology. This result can be used as reference for further research and practical application.

  11. Sequencing Infrastructure Investments under Deep Uncertainty Using Real Options Analysis

    Directory of Open Access Journals (Sweden)

    Nishtha Manocha

    2018-02-01

    Full Text Available The adaptation tipping point and adaptation pathway approaches, developed to make decisions under deep uncertainty, do not shed light on which of the multiple available pathways should be chosen as the preferred pathway. This creates the need to extend these approaches by means of suitable tools that can help sequence actions and subsequently enable the outlining of relevant policies. This paper presents two sequencing approaches, namely the “Build to Target” and “Build Up” approaches, to aid in sub-selecting a set of preferred pathways. The two approaches differ in the levels of flexibility they offer. They are exemplified by means of two case studies wherein Net Present Valuation and Real Options Analysis are employed as selection criteria. The results demonstrate the benefit of these two approaches when used in conjunction with the adaptation pathways and show how the pathways selected by means of a Build to Target approach generally have a value greater than, or at least the same as, the pathways selected by the Build Up approach. Further, this paper also demonstrates the capacity of Real Options to quantify and capture the economic value of flexibility, which cannot be done by traditional valuation approaches such as Net Present Valuation.
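
    Why a real-options view can assign value to flexibility that a plain net present value misses can be shown with a toy calculation. This is a generic one-period deferral example under assumed scenarios, not the Build to Target or Build Up procedure from the paper; the function names and cash flows are hypothetical.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[t] is received t periods from now."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def value_commit_now(rate, scenarios):
    """Expected NPV when the investment is made before uncertainty resolves.

    scenarios is a list of (probability, cash_flows) pairs.
    """
    return sum(p * npv(rate, cfs) for p, cfs in scenarios)


def value_with_deferral(rate, scenarios):
    """Expected value when the decision can wait one period.

    After waiting, the scenario is known and the investment is made only if
    its NPV is positive; the result is discounted back by one period.
    """
    expected = sum(p * max(npv(rate, cfs), 0.0) for p, cfs in scenarios)
    return expected / (1.0 + rate)


# Hypothetical scenarios: invest 100, receive 150 or 60 one period later.
scenarios = [(0.5, [-100.0, 150.0]), (0.5, [-100.0, 60.0])]
print(value_commit_now(0.0, scenarios))
print(value_with_deferral(0.0, scenarios))
```

    The difference between the two values is the value of the option to wait, which is the kind of flexibility a fixed NPV comparison does not capture.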

  12. Reverse transcriptase sequences from mulberry LTR retrotransposons: characterization analysis

    Directory of Open Access Journals (Sweden)

    Ma Bi

    2017-10-01

    Full Text Available Copia and Gypsy play important roles in structural, functional and evolutionary dynamics of plant genomes. In this study, a total of 106 Copia and 101 Gypsy reverse transcriptase (rt) sequences were amplified from the Morus notabilis genome using degenerate primers. All sequences exhibited high levels of heterogeneity and were rich in AT, and Copia rt showed higher sequence divergence than Gypsy rt. Two reasons are likely to account for this phenomenon: (a) these elements often experience deletions or fragmentation by illegitimate or unequal homologous recombination in the transposition process; (b) strong purifying selective pressure drives the evolution of these elements through “selective silencing” with random mutation and eventual deletion from the host genome. Interestingly, mulberry rt clustered with other rt from distantly related taxa according to the phylogenetic analysis. This phenomenon did not result from horizontal transposable element transfer. Results obtained from fluorescence in situ hybridization revealed that most of the hybridization signals were preferentially concentrated in pericentromeric and distal regions of chromosomes, and these elements may play important roles in the regions in which they are found. Results of this study support the continued pursuit of further functional studies of Copia and Gypsy in the mulberry genome.

  13. Integrated Reliability and Risk Analysis System (IRRAS), Version 2.5: Reference manual

    International Nuclear Information System (INIS)

    Russell, K.D.; McKay, M.K.; Sattison, M.B.; Skinner, N.L.; Wood, S.T.; Rasmuson, D.M.

    1991-03-01

    The Integrated Reliability and Risk Analysis System (IRRAS) is a state-of-the-art, microcomputer-based probabilistic risk assessment (PRA) model development and analysis tool to address key nuclear plant safety issues. IRRAS is an integrated software tool that gives the user the ability to create and analyze fault trees and accident sequences using a microcomputer. This program provides functions that range from graphical fault tree construction to cut set generation and quantification. Version 1.0 of the IRRAS program was released in February of 1987. Since that time, many user comments and enhancements have been incorporated into the program providing a much more powerful and user-friendly system. This version has been designated IRRAS 2.5 and is the subject of this Reference Manual. Version 2.5 of IRRAS provides the same capabilities as Version 1.0 and adds a relational data base facility for managing the data, improved functionality, and improved algorithm performance. 7 refs., 348 figs

  14. First research coordination meeting on reference database for neutron activation analysis. Summary report

    International Nuclear Information System (INIS)

    Firestone, R.B.; Trkov, A.

    2005-10-01

    Potential problems associated with nuclear data for neutron activation analysis were identified, the scope of the work to be undertaken was defined together with its priorities, and tasks were assigned to participants. Data testing and measurements refer to gamma spectrum peak evaluations, detector efficiency calibration, neutron spectrum characteristics and reference materials analysis. (author)

  15. Use of neutron activation in dietary reference material analysis

    Energy Technology Data Exchange (ETDEWEB)

    Woittiez, J R.W.; Iyengar, G V

    1988-12-01

    Results for a number of trace elements in a total human diet material (USDIET-1), obtained by the application of both INAA and RNAA, are presented. Several dietary reference materials such as NBS SRM 1577A and BCR CRM Single Cell Protein were also analyzed, and these results are also given. Combining measurements on short and long lived radionuclides, the INAA approach is useful for the determination of about 20 elements. In order to expand the elemental coverage or improve detection limits, RNAA was also explored in two modes: separation of radionuclides using organic ion exchange resins and the use of hydrated manganese dioxide. This combination is applicable to 15 trace elements. For example, using RNAA, the following results were obtained for USDIET-1: Cd=31.8, Mo=280, Cr=71, Ag=4, As=117 and Sb=9.4 µg/kg. In the INAA mode, special attention was given to Al, F and Se. The F content of USDIET-1 was found to be 840 mg/kg, a rather high value, resulting from handling USDIET-1 with Teflon tools. By applying INAA and RNAA under two different laboratory conditions, it has been demonstrated that, even for the so-called difficult to determine elements like Cr, As or Mo, consistent results can be obtained. Thus, NAA promises to be a strong tool for human nutritional studies.

  16. An Analysis of Delay-based and Integrator-based Sequence Detectors for Grid-Connected Converters

    DEFF Research Database (Denmark)

    Khazraj, Hesam; Silva, Filipe Miguel Faria da; Bak, Claus Leth

    2017-01-01

    Detecting and separating positive and negative sequence components of the grid voltage or current is of vital importance in the control of grid-connected power converters, HVDC systems, etc. To this end, several techniques have been proposed in recent years. These techniques can be broadly classified into two main classes: the integrator-based techniques and the delay-based techniques. The complex-coefficient filter-based technique, the dual second-order generalized integrator-based method, and the multiple reference frame approach are the main members of the integrator-based sequence detectors, and the delay-signal cancellation operators are the main members of the delay-based sequence detectors. The aim of this paper is to provide a theoretical and experimental comparative study between integrator-based and delay-based sequence detectors. The theoretical analysis is conducted based on the small-signal modelling...
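
    As a rough illustration of the delay-based class named in this record, the Python sketch below applies a quarter-cycle delayed signal cancellation to a sampled voltage vector written as a complex number (alpha + j*beta). The function name, the sampling grid and the unbalanced test signal are assumptions made for illustration; they are not taken from the paper, which may use a different operator or discretisation.

```python
import cmath
import math

def dsc_positive_sequence(v, samples_per_cycle):
    """Estimate the positive-sequence part of a complex (alpha + j*beta) signal
    with a quarter-cycle delay: v_pos[k] = 0.5 * (v[k] + 1j * v[k - N/4])."""
    if samples_per_cycle % 4 != 0:
        raise ValueError("samples_per_cycle must be divisible by 4")
    delay = samples_per_cycle // 4
    # The output is only defined once a quarter cycle of history is available.
    return [0.5 * (v[k] + 1j * v[k - delay]) for k in range(delay, len(v))]

# Hypothetical unbalanced signal: positive sequence of amplitude 1.0
# plus a negative sequence of amplitude 0.3, sampled 64 times per cycle.
n = 64
signal = [1.0 * cmath.exp(1j * 2 * math.pi * k / n)
          + 0.3 * cmath.exp(-1j * 2 * math.pi * k / n)
          for k in range(2 * n)]
estimate = dsc_positive_sequence(signal, n)
```

    With an ideal fundamental-frequency input the negative-sequence term cancels exactly, so every sample of `estimate` has magnitude close to 1.0. Frequency drift or harmonics would leave a residual that this sketch does not address.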

  17. Nonlinear analysis of sequence repeats of multi-domain proteins

    Energy Technology Data Exchange (ETDEWEB)

    Huang Yanzhao [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China); Li Mingfeng [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China); Xiao Yi [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China)]. E-mail: lmf_bill@sina.com

    2007-11-15

    Many multi-domain proteins have repetitive three-dimensional structures but nearly random amino acid sequences. In the present paper, by using a modified recurrence plot that we proposed previously, we show that these amino acid sequences in fact contain hidden repetitions. These results indicate that the repetitive domain structures are encoded by the repetitive sequences. This also gives a method to detect the repetitive domain structures directly from amino acid sequences.

  18. Sequence analysis of cereal sucrose synthase genes and isolation ...

    African Journals Online (AJOL)

    SERVER

    2007-10-18

    Oct 18, 2007 ... sequencing of sucrose synthase gene fragment from sorghum using primers designed at their conserved exons. MATERIALS AND METHODS. Multiple sequence alignment. Sucrose synthase gene sequences of various cereals like rice, maize, and barley were accessed from NCBI GenBank database.

  19. Chimera: construction of chimeric sequences for phylogenetic analysis

    NARCIS (Netherlands)

    Leunissen, J.A.M.

    2003-01-01

    Chimera allows the construction of chimeric protein or nucleic acid sequence files by concatenating sequences from two or more sequence files in PHYLIP format. It allows the user to interactively select genes and species from the input files. The concatenated result is stored to one single output

  20. The Sequence of Acquisition of Personal Pronoun Case and Person Reference among 6 Year Old Children in Two Selected Malaysian Kindergartens

    Directory of Open Access Journals (Sweden)

    Arshad Abd Samad

    2017-03-01

    Pronoun case and person reference refer to the position of the pronoun in the sentence and the person the pronoun refers to, respectively. Examining the acquisition of pronoun case and person reference among young children can be insightful as, besides their obvious relevance to language development, both these constructs can have implications for other aspects of child development. Attention given by children to these various constructs may indicate the importance children place on the concept of ego and self as well as on social relations. The sequence of acquisition of personal pronouns among these children is therefore an important phenomenon to examine, as it can reflect linguistic and socio-cognitive development. This largely descriptive study examines the sequence of acquisition of the English pronouns among forty 6-year-old Malaysian children learning ESL in two kindergartens. The children in the study were presented with 33 drawings to assess their familiarity with case and person reference expressed through English personal pronouns. They were required to select the correct pronoun from three pronouns that were used to describe each drawing. This paper reports on the accuracy rates for each pronoun and assumes that high accuracy rates indicate a more complete acquisition of the pronoun. Error forms produced by the children were also identified and examined. Data obtained were compared to acquisition sequences in the literature, and general implications related to the acquisition of personal pronouns among children in an ESL setting in Malaysia are discussed.

  1. Accident Sequence Evaluation Program: Human reliability analysis procedure

    Energy Technology Data Exchange (ETDEWEB)

    Swain, A.D.

    1987-02-01

    This document presents a shortened version of the procedure, models, and data for human reliability analysis (HRA) which are presented in the Handbook of Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications (NUREG/CR-1278, August 1983). This shortened version was prepared and tried out as part of the Accident Sequence Evaluation Program (ASEP) funded by the US Nuclear Regulatory Commission and managed by Sandia National Laboratories. The intent of this new HRA procedure, called the "ASEP HRA Procedure," is to enable systems analysts, with minimal support from experts in human reliability analysis, to make estimates of human error probabilities and other human performance characteristics which are sufficiently accurate for many probabilistic risk assessments. The ASEP HRA Procedure consists of a Pre-Accident Screening HRA, a Pre-Accident Nominal HRA, a Post-Accident Screening HRA, and a Post-Accident Nominal HRA. The procedure in this document includes changes made after tryout and evaluation of the procedure in four nuclear power plants by four different systems analysts and related personnel, including human reliability specialists. The changes consist of some additional explanatory material (including examples), and more detailed definitions of some of the terms. 42 refs.

  2. Accident Sequence Evaluation Program: Human reliability analysis procedure

    International Nuclear Information System (INIS)

    Swain, A.D.

    1987-02-01

    This document presents a shortened version of the procedure, models, and data for human reliability analysis (HRA) which are presented in the Handbook of Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications (NUREG/CR-1278, August 1983). This shortened version was prepared and tried out as part of the Accident Sequence Evaluation Program (ASEP) funded by the US Nuclear Regulatory Commission and managed by Sandia National Laboratories. The intent of this new HRA procedure, called the "ASEP HRA Procedure," is to enable systems analysts, with minimal support from experts in human reliability analysis, to make estimates of human error probabilities and other human performance characteristics which are sufficiently accurate for many probabilistic risk assessments. The ASEP HRA Procedure consists of a Pre-Accident Screening HRA, a Pre-Accident Nominal HRA, a Post-Accident Screening HRA, and a Post-Accident Nominal HRA. The procedure in this document includes changes made after tryout and evaluation of the procedure in four nuclear power plants by four different systems analysts and related personnel, including human reliability specialists. The changes consist of some additional explanatory material (including examples), and more detailed definitions of some of the terms. 42 refs

  3. A Quantitative Accident Sequence Analysis for a VHTR

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Jintae; Lee, Joeun; Jae, Moosung [Hanyang University, Seoul (Korea, Republic of)

    2016-05-15

    In Korea, the basic design features of the VHTR are currently being discussed across various design concepts. Probabilistic risk assessment (PRA) offers a logical and structured method to assess the risks of a large and complex engineered system, such as a nuclear power plant. PRA will be introduced at an early stage in the design and will be upgraded at various design and licensing stages as the design matures and the design details are defined. Risk insights to be developed from the PRA are viewed as essential to developing a design that is optimized in meeting safety objectives and to interpreting the applicability of the existing requirements to the safety design approach of the VHTR. In this study, initiating events which may occur in VHTRs were selected through the MLD method. The initiating events were then grouped into four categories for the accident sequence analysis. Initiating event frequencies and safety system failure rates were calculated using reliability data obtained from the available sources and fault tree analysis. After quantification, uncertainty analysis was conducted. The SR and LR frequencies were calculated as 7.52E-10/RY and 7.91E-16/RY, respectively, which are lower than the core damage frequency of LWRs.
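
    The quantification step described in this record can be sketched in a few lines of Python: the frequency of one event-tree sequence is the initiating event frequency multiplied by the failure or success probability of each safety system along the path. The function name, the independence assumption and all numbers below are hypothetical and are not taken from the paper.

```python
from math import prod

def sequence_frequency(initiating_event_freq, failure_probs, outcomes):
    """Frequency (per reactor-year) of one event-tree sequence.

    outcomes[i] is True when safety system i fails in this sequence and
    False when it succeeds; the systems are assumed to be independent."""
    factors = [p if failed else 1.0 - p
               for p, failed in zip(failure_probs, outcomes)]
    return initiating_event_freq * prod(factors)

# Hypothetical inputs: one initiating event and two safety systems.
ie_freq = 1e-2               # events per reactor-year
probs = [1e-3, 1e-4]         # failure probability on demand of each system
both_fail = sequence_frequency(ie_freq, probs, [True, True])
```

    Summing the frequencies of all four success and failure combinations returns the initiating event frequency, which is a quick sanity check on an event tree built this way. In practice the branch probabilities would come from fault tree analysis, as the abstract notes.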

  4. Example Work Domain Analysis for a Reference Sodium Fast Reactor

    Energy Technology Data Exchange (ETDEWEB)

    Hugo, Jacques [Idaho National Lab. (INL), Idaho Falls, ID (United States); Oxstrand, Johanna [Idaho National Lab. (INL), Idaho Falls, ID (United States)

    2015-01-01

    The nuclear industry is currently designing and building a new generation of reactors that will include different structural, functional, and environmental aspects, all of which are likely to have a significant impact on the way these plants are operated. In order to meet economic and safety objectives, these new reactors will all use advanced technologies to some extent, including new materials and advanced digital instrumentation and control systems. New technologies will affect not only operational strategies, but will also require a new approach to how functions are allocated to humans or machines to ensure optimal performance. Uncertainty about the effect of large scale changes in plant design will remain until sound technical bases are developed for new operational concepts and strategies. Up-to-date models and guidance are required for the development of operational concepts for complex socio-technical systems. This report describes how the classical Work Domain Analysis method was adapted to develop operational concept frameworks for new plants. This adaptation of the method is better able to deal with the uncertainty and incomplete information typical of first-of-a-kind designs. Practical examples are provided of the systematic application of the method in the operational analysis of sodium-cooled reactors. Insights from this application and its utility are reviewed and arguments for the formal adoption of Work Domain Analysis as a value-added part of the Systems Engineering process are presented.

  5. Comparing methods of classifying life courses: Sequence analysis and latent class analysis

    NARCIS (Netherlands)

    Elzinga, C.H.; Liefbroer, Aart C.; Han, Sapphire

    2017-01-01

    We compare life course typology solutions generated by sequence analysis (SA) and latent class analysis (LCA). First, we construct an analytic protocol to arrive at typology solutions for both methodologies and present methods to compare the empirical quality of alternative typologies. We apply this

  6. Comparing methods of classifying life courses: sequence analysis and latent class analysis

    NARCIS (Netherlands)

    Han, Y.; Liefbroer, A.C.; Elzinga, C.

    2017-01-01

    We compare life course typology solutions generated by sequence analysis (SA) and latent class analysis (LCA). First, we construct an analytic protocol to arrive at typology solutions for both methodologies and present methods to compare the empirical quality of alternative typologies. We apply this

  7. Reference respiratory waveforms by minimum jerk model analysis

    Energy Technology Data Exchange (ETDEWEB)

    Anetai, Yusuke, E-mail: anetai@radonc.med.osaka-u.ac.jp; Sumida, Iori; Takahashi, Yutaka; Yagi, Masashi; Mizuno, Hirokazu; Ogawa, Kazuhiko [Department of Radiation Oncology, Osaka University Graduate School of Medicine, Yamadaoka 2-2, Suita-shi, Osaka 565-0871 (Japan); Ota, Seiichi [Department of Medical Technology, Osaka University Hospital, Yamadaoka 2-15, Suita-shi, Osaka 565-0871 (Japan)

    2015-09-15

    Purpose: CyberKnife® robotic surgery system has the ability to deliver radiation to a tumor subject to respiratory movements using Synchrony® mode with less than 2 mm tracking accuracy. However, rapid and rough motion tracking causes mechanical tracking errors and puts mechanical stress on the robotic joint, leading to unexpected radiation delivery errors. During clinical treatment, patient respiratory motions are much more complicated, suggesting the need for patient-specific modeling of respiratory motion. The purpose of this study was to propose a novel method that provides a reference respiratory wave to enable smooth tracking for each patient. Methods: The minimum jerk model, which mathematically derives smoothness by means of jerk (the third derivative of position, i.e. the derivative of acceleration with respect to time, which is proportional to the time rate of change of force), was introduced to model a patient-specific respiratory motion wave to provide smooth motion tracking using CyberKnife®. To verify that patient-specific minimum jerk respiratory waves were being tracked smoothly by Synchrony® mode, a tracking laser projection from CyberKnife® was optically analyzed every 0.1 s using a webcam and a calibrated grid on a motion phantom whose motion was in accordance with three pattern waves (cosine, typical free-breathing, and minimum jerk theoretical wave models) for the clinically relevant superior–inferior directions from six volunteers assessed on the same node of the same isocentric plan. Results: Tracking discrepancy from the center of the grid to the beam projection was evaluated. The minimum jerk theoretical wave reduced the maximum-peak amplitude of radial tracking discrepancy compared with that of the waveforms modeled by cosine and typical free-breathing model by 22% and 35%, respectively, and provided smooth tracking for radial direction. Motion tracking constancy as indicated by radial tracking discrepancy
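
    To make the notion of jerk in this record concrete, the Python sketch below evaluates the standard fifth-order polynomial that minimises integrated squared jerk for a rest-to-rest movement, together with its third time derivative. This is a generic textbook profile used here as an assumption; the patient-specific respiratory wave in the paper may be constructed differently, and the amplitude and duration in the example are hypothetical.

```python
def minimum_jerk_position(x0, xf, duration, t):
    """Position at time t for a rest-to-rest minimum-jerk move from x0 to xf."""
    tau = min(max(t / duration, 0.0), 1.0)  # normalised time, clipped to [0, 1]
    return x0 + (xf - x0) * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)

def minimum_jerk_jerk(x0, xf, duration, t):
    """Third time derivative of the profile above, valid for 0 <= t <= duration."""
    tau = t / duration
    return (xf - x0) * (60 - 360 * tau + 360 * tau**2) / duration**3

# Hypothetical half-breath: 10 mm superior-inferior excursion over 2 s,
# sampled every 0.1 s.
times = [0.1 * k for k in range(21)]
wave = [minimum_jerk_position(0.0, 10.0, 2.0, t) for t in times]
```

    The profile starts and ends with zero velocity and zero acceleration, which is what makes it gentler on a tracking mechanism than a waveform with abrupt changes in acceleration.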

  8. Reference respiratory waveforms by minimum jerk model analysis

    International Nuclear Information System (INIS)

    Anetai, Yusuke; Sumida, Iori; Takahashi, Yutaka; Yagi, Masashi; Mizuno, Hirokazu; Ogawa, Kazuhiko; Ota, Seiichi

    2015-01-01

    Purpose: CyberKnife® robotic surgery system has the ability to deliver radiation to a tumor subject to respiratory movements using Synchrony® mode with less than 2 mm tracking accuracy. However, rapid and rough motion tracking causes mechanical tracking errors and puts mechanical stress on the robotic joint, leading to unexpected radiation delivery errors. During clinical treatment, patient respiratory motions are much more complicated, suggesting the need for patient-specific modeling of respiratory motion. The purpose of this study was to propose a novel method that provides a reference respiratory wave to enable smooth tracking for each patient. Methods: The minimum jerk model, which mathematically derives smoothness by means of jerk (the third derivative of position, i.e. the derivative of acceleration with respect to time, which is proportional to the time rate of change of force), was introduced to model a patient-specific respiratory motion wave to provide smooth motion tracking using CyberKnife®. To verify that patient-specific minimum jerk respiratory waves were being tracked smoothly by Synchrony® mode, a tracking laser projection from CyberKnife® was optically analyzed every 0.1 s using a webcam and a calibrated grid on a motion phantom whose motion was in accordance with three pattern waves (cosine, typical free-breathing, and minimum jerk theoretical wave models) for the clinically relevant superior–inferior directions from six volunteers assessed on the same node of the same isocentric plan. Results: Tracking discrepancy from the center of the grid to the beam projection was evaluated. The minimum jerk theoretical wave reduced the maximum-peak amplitude of radial tracking discrepancy compared with that of the waveforms modeled by cosine and typical free-breathing model by 22% and 35%, respectively, and provided smooth tracking for radial direction. Motion tracking constancy as indicated by radial tracking discrepancy affected by respiratory

  9. Overview of errors in the reference sequence and annotation of Mycobacterium tuberculosis H37Rv, and variation amongst its isolates

    KAUST Repository

    Köser, Claudio U.; Niemann, Stefan; Summers, David K.; Archer, John A.C.

    2012-01-01

    Since its publication in 1998, the genome sequence of the Mycobacterium tuberculosis H37Rv laboratory strain has acted as the cornerstone for the study of tuberculosis. In this review we address some of the practical aspects that have come to light

  10. Complete Sequence Analysis and Antiviral Screening of Medicinal Plants for Human Coxsackievirus A16 Isolated in Korea

    OpenAIRE

    Song, Jae-Hyoung; Park, Kwisung; Shim, Aeri; Kwon, Bo-Eun; Ahn, Jae-Hee; Choi, Young Jin; Kim, Jae Kyung; Yeo, Sang-Gu; Yoon, Kyungah; Ko, Hyun-Jeong

    2015-01-01

    Objectives Coxsackievirus A group 16 strain (CVA16) is one of the predominant causative agents of hand, foot, and mouth disease (HFMD). Methods Using a specimen from a male patient with HFMD, we isolated and sequenced the Korean CVA16 strain and compared it with a G10 reference strain. We also investigated the effects of medicinal plant extracts on cytopathic effects (CPE) using a CPE reduction assay against Korean CVA16. Results Phylogenetic analysis showed that the Korean ...

  11. RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

    Science.gov (United States)

    Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

    2012-01-01

    RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or the user can upload sequences of interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information (total number of nucleotide bases and A, T, G, C base contents along with their respective percentages) and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation, and it provides a user-friendly environment for sequence analysis. It is freely available. http://www.cemb.edu.pk/sw.html RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.
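
    For readers unfamiliar with the Nussinov algorithm mentioned in this record, the following Python sketch shows its basic form for DNA: a dynamic programme that maximises the number of nested A-T and G-C pairs. It is not RDNAnalyzer's code (the tool is written in C# and extends the algorithm in ways the abstract does not describe); the function names and the minimum loop length of 3 are assumptions for illustration. A reverse complement helper is included because the abstract lists that operation.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested Watson-Crick pairs (A-T, G-C) in a DNA string,
    requiring at least min_loop unpaired bases between paired positions."""
    pairs = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    dp = [[0] * n for _ in range(n)]  # dp[i][j]: best score for seq[i..j]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1] if n else 0

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string containing only A, C, G, T."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))
```

    For example, `nussinov_max_pairs("GGGAAACCC")` finds the three nested G-C pairs of a small hairpin. The table fill takes cubic time in the sequence length, which is acceptable for short sequences but slow for long ones.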

  12. CloVR-Comparative: automated, cloud-enabled comparative microbial genome sequence analysis pipeline.

    Science.gov (United States)

    Agrawal, Sonia; Arze, Cesar; Adkins, Ricky S; Crabtree, Jonathan; Riley, David; Vangala, Mahesh; Galens, Kevin; Fraser, Claire M; Tettelin, Hervé; White, Owen; Angiuoli, Samuel V; Mahurkar, Anup; Fricke, W Florian

    2017-04-27

    The benefit of increasing genomic sequence data to the scientific community depends on easy-to-use, scalable bioinformatics support. CloVR-Comparative combines commonly used bioinformatics tools into an intuitive, automated, and cloud-enabled analysis pipeline for comparative microbial genomics. CloVR-Comparative runs on annotated complete or draft genome sequences that are uploaded by the user or selected via a taxonomic tree-based user interface and downloaded from NCBI. CloVR-Comparative runs reference-free multiple whole-genome alignments to determine unique, shared and core coding sequences (CDSs) and single nucleotide polymorphisms (SNPs). Output includes short summary reports and detailed text-based results files, graphical visualizations (phylogenetic trees, circular figures), and a database file linked to the Sybil comparative genome browser. Data up- and download, pipeline configuration and monitoring, and access to Sybil are managed through CloVR-Comparative web interface. CloVR-Comparative and Sybil are distributed as part of the CloVR virtual appliance, which runs on local computers or the Amazon EC2 cloud. Representative datasets (e.g. 40 draft and complete Escherichia coli genomes) are processed in genomics projects, while eliminating the need for on-site computational resources and expertise.

  13. Genomic analysis of expressed sequence tags in American black bear Ursus americanus.

    Science.gov (United States)

    Zhao, Sen; Shao, Chunxuan; Goropashnaya, Anna V; Stewart, Nathan C; Xu, Yichi; Tøien, Øivind; Barnes, Brian M; Fedorov, Vadim B; Yan, Jun

    2010-03-26

    Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.

  14. Genomic analysis of expressed sequence tags in American black bear Ursus americanus

    Science.gov (United States)

    2010-01-01

    Background Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Results Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. Conclusion We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes. PMID:20338065

  15. Frame sequences analysis technique of linear objects movement

    Science.gov (United States)

    Oshchepkova, V. Y.; Berg, I. A.; Shchepkin, D. V.; Kopylova, G. V.

    2017-12-01

    Obtaining data by noninvasive methods is often needed in many fields of science and engineering. This is achieved through video recording at various frame rates and in various light spectra. In doing so, quantitative analysis of the movement of the objects being studied becomes an important component of the research. This work discusses the analysis of the motion of linear objects on a two-dimensional plane. The complexity of this problem increases when the frame contains numerous objects whose images may overlap. This study uses a sequence containing 30 frames at a resolution of 62 × 62 pixels and a frame rate of 2 Hz. It was required to determine the average velocity of the objects' motion. This velocity was found as an average velocity for 8-12 objects with an error of 15%. After processing, the dependencies of the average velocity on control parameters were found. The processing was performed in the software environment GMimPro with subsequent approximation of the data obtained using the Hill equation.
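
    The two calculations this record mentions, an average velocity from tracked positions and an approximation by the Hill equation, can be sketched in Python as follows. The function names, the per-frame track format and the parameter names (`v_max`, `k_half`, `n`) are assumptions for illustration; the abstract does not state which parametrisation of the Hill equation or which tracking output format was used.

```python
import math

def mean_speed(track, frame_rate_hz):
    """Average speed (pixels per second) of one object, given its (x, y)
    position in each consecutive frame."""
    steps = [math.dist(p, q) for p, q in zip(track, track[1:])]
    return sum(steps) / len(steps) * frame_rate_hz

def hill(x, v_max, k_half, n):
    """Hill equation in its common form: the response rises from 0 towards
    v_max and equals v_max / 2 at x = k_half (x is assumed non-negative)."""
    return v_max * x**n / (k_half**n + x**n)

# Hypothetical track of one object over three frames recorded at 2 Hz.
speed = mean_speed([(0, 0), (3, 4), (6, 8)], frame_rate_hz=2)
half_response = hill(2.0, v_max=10.0, k_half=2.0, n=3)
```

    Fitting `v_max`, `k_half` and `n` to measured velocities would require a nonlinear least-squares routine, which is left out of this sketch.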

  16. Transcriptome sequencing and positive selected genes analysis of Bombyx mandarina.

    Directory of Open Access Journals (Sweden)

    Tingcai Cheng

    The wild silkworm Bombyx mandarina is widely believed to be an ancestor of the domesticated silkworm, Bombyx mori. Silkworms are often used as a model for studying the mechanism of species domestication. Here, we performed transcriptome sequencing of the wild silkworm using an Illumina HiSeq2000 platform. We produced 100,004,078 high-quality reads and assembled them into 50,773 contigs with an N50 length of 1764 bp and a mean length of 941.62 bp. A total of 33,759 unigenes were identified, with 12,805 annotated in the Nr database, 8273 in the Pfam database, and 9093 in the Swiss-Prot database. Expression profile analysis found significant differential expression of 1308 unigenes between the middle silk gland (MSG) and posterior silk gland (PSG). Three sericin genes (sericin 1, sericin 2, and sericin 3) were expressed specifically in the MSG and three fibroin genes (fibroin-H, fibroin-L, and fibroin/P25) were expressed specifically in the PSG. In addition, 32,297 single-nucleotide polymorphisms (SNPs) and 361 insertion-deletions (INDELs) were detected. Comparison with the domesticated silkworm p50/Dazao identified 5,295 orthologous genes, among which 400 might have experienced or be experiencing positive selection according to Ka/Ks analysis. These data and analyses provide insights into silkworm domestication and an invaluable resource for wild silkworm genomics research.
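
    The N50 statistic quoted in this record summarises assembly contiguity. A minimal Python sketch of the usual definition is given below; the function name and the example lengths are hypothetical, and the assembler used in the study may report N50 with slightly different tie-breaking.

```python
def n50(lengths):
    """Largest length L such that contigs of length >= L together cover
    at least half of the total assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")

# Hypothetical contig lengths in bp.
example = n50([2, 3, 4, 5, 6])
```

    Here the total is 20 bp; the two longest contigs (6 and 5 bp) already cover 11 bp, so the N50 of this toy assembly is 5.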

  17. Complete genome sequence analysis identifies a new genotype of brassica yellows virus that infects cabbage and radish in China.

    Science.gov (United States)

    Zhang, Xiao-Yan; Xiang, Hai-Ying; Zhou, Cui-Ji; Li, Da-Wei; Yu, Jia-Lin; Han, Cheng-Gui

    2014-08-01

    For brassica yellows virus (BrYV), proposed to be a member of a new polerovirus species, two clearly distinct genotypes (BrYV-A and BrYV-B) have been described. In this study, the complete nucleotide sequences of two BrYV isolates from radish and Chinese cabbage were determined. Sequence analysis suggested that these isolates represent a new genotype, referred to here as BrYV-C. The full-length sequences of the two BrYV-C isolates shared 93.4-94.8 % identity with BrYV-A and BrYV-B. Further phylogenetic analysis showed that the BrYV-C isolates formed a subgroup that was distinct from the BrYV-A and BrYV-B isolates based on all of the proteins except P5.

  18. Secure and robust cloud computing for high-throughput forensic microsatellite sequence analysis and databasing.

    Science.gov (United States)

    Bailey, Sarah F; Scheible, Melissa K; Williams, Christopher; Silva, Deborah S B S; Hoggan, Marina; Eichman, Christopher; Faith, Seth A

    2017-11-01

    Next-generation Sequencing (NGS) is a rapidly evolving technology with demonstrated benefits for forensic genetic applications, and the strategies to analyze and manage the massive NGS datasets are currently in development. Here, the computing, data storage, connectivity, and security resources of the Cloud were evaluated as a model for forensic laboratory systems that produce NGS data. A complete front-to-end Cloud system was developed to upload, process, and interpret raw NGS data using a web browser dashboard. The system was extensible, demonstrating analysis capabilities of autosomal and Y-STRs from a variety of NGS instrumentation (Illumina MiniSeq and MiSeq, and Oxford Nanopore MinION). NGS data for STRs were concordant with standard reference materials previously characterized with capillary electrophoresis and Sanger sequencing. The computing power of the Cloud was implemented with on-demand auto-scaling to allow multiple file analysis in tandem. The system was designed to store resulting data in a relational database, amenable to downstream sample interpretations and databasing applications following the most recent guidelines in nomenclature for sequenced alleles. Lastly, a multi-layered Cloud security architecture was tested and showed that industry standards for securing data and computing resources were readily applied to the NGS system without disadvantageous effects for bioinformatic analysis, connectivity or data storage/retrieval. The results of this study demonstrate the feasibility of using Cloud-based systems for secured NGS data analysis, storage, databasing, and multi-user distributed connectivity. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Cloning, sequencing, and sequence analysis of two novel plasmids from the thermophilic anaerobic bacterium Anaerocellum thermophilum

    DEFF Research Database (Denmark)

    Clausen, Anders; Mikkelsen, Marie Just; Schrøder, I.

    2004-01-01

    The nucleotide sequence of two novel plasmids isolated from the extreme thermophilic anaerobic bacterium Anaerocellum thermophilum DSM6725 (A. thermophilum), growing optimally at 70 °C, has been determined. pBAS2 was found to be a 3653 bp plasmid with a GC content of 43%, and the sequence re...... with highest similarity to DNA repair protein from Campylobacter jejuni (25% aa). Orf34 showed similarity to sigma factors with highest similarity (28% aa) to the sporulation specific Sigma factor, Sigma 28(K) from Bacillus thuringiensis....

  20. Multielement analysis of rice flour-unpolished reference material by instrumental neutron activation analysis

    International Nuclear Information System (INIS)

    Suzuki, Shogo; Hirai, Shoji

    1990-01-01

    Trace elements in NIES certified reference material No. 10-a∼10-c Rice Flour-Unpolished, prepared by the National Institute for Environmental Studies of Japan (NIES), were determined by instrumental neutron activation analysis (INAA). A set of three samples with different Cd concentration levels was analyzed. Portions of each sample (ca. 200∼1000 mg) were irradiated, either with thermal neutrons without a cadmium filter or with epithermal neutrons with a cadmium filter, in the Musashi Institute of Technology Research Reactor (MITRR). The activated samples were analyzed by three methods: conventional γ-ray spectrometry using a coaxial Ge detector, anticoincidence counting spectrometry, and coincidence counting spectrometry using a coaxial Ge detector and a well type NaI(Tl) detector. Concentrations of 26∼28 elements were determined by these methods. The values obtained for many elements, except for Mg and K, were in good agreement with the NIES certified and reference values. Concentrations of 10 elements (S, Sc, V, Ag, Sb, Cs, Ba, La, Sm, Th), whose certified or reference values are not available from NIES, were also determined in this work. (author)

  1. Automatic analysis of the 2015 Gorkha earthquake aftershock sequence.

    Science.gov (United States)

    Baillard, C.; Lyon-Caen, H.; Bollinger, L.; Rietbrock, A.; Letort, J.; Adhikari, L. B.

    2016-12-01

    The Mw 7.8 Gorkha earthquake, which partially ruptured the Main Himalayan Thrust north of Kathmandu on 25 April 2015, was the largest and most catastrophic earthquake to strike Nepal since the great M8.4 earthquake of 1934. This mainshock was followed by multiple aftershocks, among them two notable events that occurred on 12 May with magnitudes of 7.3 Mw and 6.3 Mw. Because of these recent events it became essential for the authorities and for the scientific community to better evaluate the seismic risk in the region through a detailed analysis of the earthquake catalog, including the spatio-temporal distribution of the Gorkha aftershock sequence. Here we complement this first study with a microseismic study using seismic data from the eastern part of the Nepalese Seismological Center network combined with one broadband station at Everest. Our primary goal is to deliver an accurate catalog of the aftershock sequence. Because of the exceptional number of events detected, we performed an automatic picking/locating procedure that can be split into 4 steps: 1) coarse picking of the onsets using a classical STA/LTA picker, 2) phase association of picked onsets to detect and declare seismic events, 3) kurtosis pick refinement around theoretical arrival times to increase picking and location accuracy, and 4) local magnitude calculation based on the amplitude of waveforms. This procedure is time efficient (1 sec/event), considerably reduces the location uncertainties (2 to 5 km errors) and increases the number of events detected compared to manual processing. Indeed, the automatic detection rate is 10 times higher than the manual detection rate. By comparing to the USGS catalog we were able to give a new attenuation law to compute local magnitudes in the region. A detailed analysis of the seismicity shows a clear migration toward the east of the region and a sudden decrease of seismicity 100 km east of Kathmandu which may reveal the presence of a tectonic
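
    Step 1 of the procedure above (coarse STA/LTA picking) can be illustrated with a minimal sketch. This is not the authors' picker: the window lengths, the on/off thresholds, and the synthetic trace are all assumptions chosen only to show the short-term-average / long-term-average idea.

```python
def sta_lta(signal, n_sta, n_lta):
    """Ratio of short-term to long-term average signal energy, per sample.

    Samples before a full long-term window is available are left at 0.0.
    Window lengths (in samples) are illustrative assumptions.
    """
    csum = [0.0]                      # prefix sums of squared amplitude
    for x in signal:
        csum.append(csum[-1] + x * x)
    ratios = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = (csum[i + 1] - csum[i + 1 - n_sta]) / n_sta
        lta = (csum[i + 1] - csum[i + 1 - n_lta]) / n_lta
        ratios[i] = sta / lta if lta > 0 else 0.0
    return ratios


def pick_onsets(ratios, on=3.0, off=1.5):
    """Declare a pick when the ratio rises to `on`; re-arm once it drops below `off`."""
    picks, armed = [], True
    for i, r in enumerate(ratios):
        if armed and r >= on:
            picks.append(i)
            armed = False
        elif not armed and r < off:
            armed = True
    return picks


# Synthetic trace (hypothetical): low-level background with a burst at samples 200-229.
trace = [0.1 if i % 2 == 0 else -0.1 for i in range(400)]
for i in range(200, 230):
    trace[i] *= 50

picks = pick_onsets(sta_lta(trace, n_sta=5, n_lta=100))
print(picks)
```

    In a full pipeline such coarse picks would then be associated into events and refined, as in steps 2 to 4 of the abstract.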

  2. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Fei Chen

    2003-01-01

    Full Text Available DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT), the bionic wavelet transform (BWT), for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the structural feature of the DNA sequence, was introduced into WT. It can adjust the weight value of each channel to maximise the useful energy distribution of the whole BWT output. The performance of the proposed BWT was examined by analysing synthetic and real DNA sequences. Results show that BWT performs better than traditional WT in presenting greater energy distribution. This new BWT method should be useful for the detection of the latent structural features in future DNA sequence analysis.
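
    The separation of a symbolic DNA sequence into four indicator channels can be sketched as follows. The channel weights below are placeholders, not the adaptive mapping described in the paper, and the function names are hypothetical.

```python
def indicator_sequences(dna):
    """Split a DNA string into four binary channels, one per base (A, C, G, T)."""
    dna = dna.upper()
    return {base: [1 if c == base else 0 for c in dna] for base in "ACGT"}


def weighted_signal(dna, weights):
    """Combine the four channels into one numeric signal using per-channel weights.

    The weights are an assumption for illustration; the paper derives them
    adaptively from structural features, which is not reproduced here.
    """
    channels = indicator_sequences(dna)
    return [sum(weights[b] * channels[b][i] for b in "ACGT") for i in range(len(dna))]


channels = indicator_sequences("ACGTTA")
signal = weighted_signal("ACGTTA", {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0})
print(channels["A"], channels["T"], signal)
```

    A numeric signal of this kind is what a wavelet transform would then take as input.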

  3. HYBRID SULFUR PROCESS REFERENCE DESIGN AND COST ANALYSIS

    Energy Technology Data Exchange (ETDEWEB)

    Gorensek, M.; Summers, W.; Boltrunis, C.; Lahoda, E.; Allen, D.; Greyvenstein, R.

    2009-05-12

    This report documents a detailed study to determine the expected efficiency and product costs for producing hydrogen via water-splitting using energy from an advanced nuclear reactor. It was determined that the overall efficiency from nuclear heat to hydrogen is high, and the cost of hydrogen is competitive under a high energy cost scenario. It would require over 40% more nuclear energy to generate an equivalent amount of hydrogen using conventional water-cooled nuclear reactors combined with water electrolysis compared to the proposed plant design described herein. There is a great deal of interest worldwide in reducing dependence on fossil fuels, while also minimizing the impact of the energy sector on global climate change. One potential opportunity to contribute to this effort is to replace the use of fossil fuels for hydrogen production by the use of water-splitting powered by nuclear energy. Hydrogen production is required for fertilizer (e.g. ammonia) production, oil refining, synfuels production, and other important industrial applications. It is typically produced by reacting natural gas, naphtha or coal with steam, which consumes significant amounts of energy and produces carbon dioxide as a byproduct. In the future, hydrogen could also be used as a transportation fuel, replacing petroleum. New processes are being developed that would permit hydrogen to be produced from water using only heat or a combination of heat and electricity produced by advanced, high temperature nuclear reactors. The U.S. Department of Energy (DOE) is developing these processes under a program known as the Nuclear Hydrogen Initiative (NHI). The Republic of South Africa (RSA) also is interested in developing advanced high temperature nuclear reactors and related chemical processes that could produce hydrogen fuel via water-splitting. 
This report focuses on the analysis of a nuclear hydrogen production system that combines the Pebble Bed Modular Reactor (PBMR), under development by

  4. What place is this time? Semiotics and the analysis of historical reference in landscape architecture

    NARCIS (Netherlands)

    Assche, van K.A.M.; Duineveld, M.; Jong, de H.C.; Zoest, van A.

    2012-01-01

    This paper revisits the potential contribution of semiotic analysis to heritage design. A case study analyzes lay interpretations of a number of urban landscape designs, displaying different ways to refer to the invisible (archaeological) past. A total of 12 draft designs were produced referring to

  5. A genome-wide analysis of lentivector integration sites using targeted sequence capture and next generation sequencing technology.

    Science.gov (United States)

    Ustek, Duran; Sirma, Sema; Gumus, Ergun; Arikan, Muzaffer; Cakiris, Aris; Abaci, Neslihan; Mathew, Jaicy; Emrence, Zeliha; Azakli, Hulya; Cosan, Fulya; Cakar, Atilla; Parlak, Mahmut; Kursun, Olcay

    2012-10-01

    One application of next-generation sequencing (NGS) is the targeted resequencing of genes of interest, which has not previously been used in viral integration site analysis for gene therapy applications. Here, we combined a targeted sequence capture array with next-generation sequencing to profile viral integration sites across the whole genome. Human 293T and K562 cells were transduced with an HIV-1-derived vector. A custom-made DNA probe set targeting the pLVTHM vector was used to capture lentiviral vector/human genome junctions. The captured DNA was sequenced using the GS FLX platform. Seven thousand four hundred and eighty-four human genome sequences flanking the long terminal repeats (LTR) of pLVTHM fragment sequences matched with an identity of at least 98% and a minimum length of 50 bp in both cell lines. In total, 203 unique integration sites were identified. The integrations in both cell lines were distant from CpG islands and from transcription start sites, and were preferentially located in introns. A comparison between the two cell lines showed that the lentiviral-transduced DNA does not have the same preferred regions in the two different cell lines. Copyright © 2012 Elsevier B.V. All rights reserved.
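
    The filtering criteria quoted above (at least 98% identity, at least 50 bp) followed by collapsing to unique sites can be sketched as below. The `Hit` record layout and the example coordinates are hypothetical; the study's actual pipeline is not described in this abstract.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One flanking-sequence alignment to the human genome (hypothetical layout)."""
    chrom: str
    pos: int
    strand: str
    identity: float   # percent identity of the alignment
    aligned_bp: int   # aligned length in base pairs


def unique_integration_sites(hits, min_identity=98.0, min_bp=50):
    """Keep hits passing both thresholds, then collapse to unique (chrom, pos, strand)."""
    kept = [h for h in hits if h.identity >= min_identity and h.aligned_bp >= min_bp]
    return sorted({(h.chrom, h.pos, h.strand) for h in kept})


hits = [
    Hit("chr1", 1000, "+", 99.1, 80),
    Hit("chr1", 1000, "+", 98.5, 62),   # duplicate site, collapses with the first
    Hit("chr2", 500, "-", 97.0, 120),   # fails the identity threshold
    Hit("chr3", 42, "+", 99.9, 30),     # fails the length threshold
]
sites = unique_integration_sites(hits)
print(sites)
```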

  6. Sequencing and analysis of an Irish human genome.

    LENUS (Irish Health Repository)

    Tong, Pin

    2010-01-01

    Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.

  7. Exome Sequence Analysis of 14 Families With High Myopia

    DEFF Research Database (Denmark)

    Kloss, Bethany A.; Tompson, Stuart W.; Whisenhunt, Kristina N.

    2017-01-01

    Purpose: To identify causal gene mutations in 14 families with autosomal dominant (AD) high myopia using exome sequencing. Methods: Select individuals from 14 large Caucasian families with high myopia were exome sequenced. Gene variants were filtered to identify potential pathogenic changes. Sang...

  8. Database-driven primary analysis of raw sequencing data

    DEFF Research Database (Denmark)

    2014-01-01

    The present invention relates to methods for identifying the source of a biological sequence containing sample from raw sequencing reads. The method may be used to identify the source of unknown DNA and can be used for diagnostic, biodefense, food safety and quality, and hygiene applications...

  9. Accelerating next generation sequencing data analysis with system level optimizations.

    Science.gov (United States)

    Kathiresan, Nagarajan; Temanni, Ramzi; Almabrazi, Hakeem; Syed, Najeeb; Jithesh, Puthen V; Al-Ali, Rashid

    2017-08-22

    Next generation sequencing (NGS) data analysis is highly compute intensive. In-memory computing, vectorization, bulk data transfer, and CPU frequency scaling are some of the hardware features of modern computing architectures. To get the best execution time and utilize these hardware features, it is necessary to tune the system level parameters before running the application. We studied the GATK-HaplotypeCaller, a part of common NGS workflows that consumes more than 43% of the total execution time. Multiple GATK 3.x versions were benchmarked and the execution time of HaplotypeCaller was optimized by tuning various system level parameters, which included: (i) tuning the parallel garbage collection and kernel shared memory to simulate in-memory computing, (ii) architecture-specific tuning in the PairHMM library for vectorization, (iii) including Java 1.8 features through GATK source code compilation and building a runtime environment for parallel sorting and bulk data transfer, and (iv) switching the CPU frequency setting from the default 'on-demand' mode to 'performance-mode' to accelerate the Java multi-threads. As a result, the HaplotypeCaller execution time was reduced by 82.66% in GATK 3.3 and 42.61% in GATK 3.7. Overall, the execution time of the NGS pipeline was reduced to 70.60% and 34.14% for GATK 3.3 and GATK 3.7 respectively.
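
    As a worked example of the arithmetic, a "reduced by p%" figure converts to a speedup factor of 1 / (1 - p/100). The helper name is hypothetical; only the two HaplotypeCaller percentages come from the abstract.

```python
def speedup_from_reduction(percent_reduction):
    """Convert 'execution time reduced by p%' into a speedup factor old_time / new_time."""
    if not 0 <= percent_reduction < 100:
        raise ValueError("reduction must be in [0, 100)")
    return 1.0 / (1.0 - percent_reduction / 100.0)


gatk33 = speedup_from_reduction(82.66)  # HaplotypeCaller, GATK 3.3
gatk37 = speedup_from_reduction(42.61)  # HaplotypeCaller, GATK 3.7
print(f"GATK 3.3: {gatk33:.2f}x, GATK 3.7: {gatk37:.2f}x")
```

    Under this reading, the reported reductions correspond to roughly 5.8-fold and 1.7-fold faster HaplotypeCaller runs.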

  10. The sequence and analysis of a Chinese pig genome

    Directory of Open Access Journals (Sweden)

    Fang Xiaodong

    2012-11-01

    Full Text Available Abstract Background The pig is an economically important food source, amounting to approximately 40% of all meat consumed worldwide. Pigs also serve as an important model organism because of their similarity to humans at the anatomical, physiological and genetic level, making them very useful for studying a variety of human diseases. A pig strain of particular interest is the miniature pig, specifically the Wuzhishan pig (WZSP), as it has been extensively inbred. Its high level of homozygosity offers increased ease for selective breeding for specific traits and a more straightforward understanding of the genetic changes that underlie its biological characteristics. WZSP also serves as a promising means for applications in surgery, tissue engineering, and xenotransplantation. Here, we report the sequencing and analysis of an inbreeding WZSP genome. Results Our results reveal some unique genomic features, including a relatively high level of homozygosity in the diploid genome, an unusual distribution of heterozygosity, an over-representation of tRNA-derived transposable elements, a small amount of porcine endogenous retrovirus, and a lack of type C retroviruses. In addition, we carried out systematic research on gene evolution, together with a detailed investigation of the counterparts of human drug target genes. Conclusion Our results provide the opportunity to more clearly define the genomic character of pig, which could enhance our ability to create more useful pig models.

  11. Analysis of expressed sequence tags from the Ulva prolifera (Chlorophyta)

    Science.gov (United States)

    Niu, Jianfeng; Hu, Haiyan; Hu, Songnian; Wang, Guangce; Peng, Guang; Sun, Song

    2010-01-01

    In 2008, a green tide broke out before the sailing competition of the 29th Olympic Games in Qingdao. The causative species was determined to be Enteromorpha prolifera (Ulva prolifera O. F. Müller), a familiar green macroalga along the coastline of China. Rapid accumulation of a large biomass of floating U. prolifera prompted research on different aspects of this species. In this study, we constructed a nonnormalized cDNA library from the thalli of U. prolifera and acquired 10 072 high-quality expressed sequence tags (ESTs). These ESTs were assembled into 3 519 nonredundant gene groups, including 1 446 clusters and 2 073 singletons. After annotation with the nr database, a large number of genes were found to be related to chloroplast and ribosomal proteins. GO functional classification showed that 1 418 ESTs participated in photosynthesis and 1 359 ESTs were responsible for the generation of precursor metabolites and energy. In addition, rather comprehensive carbon fixation pathways were found in U. prolifera using KEGG. Some stress-related and signal transduction-related genes were also found in this study. All the evidence indicated that U. prolifera has the material and energy basis for intense photosynthesis and rapid proliferation. Phylogenetic analysis of cytochrome c oxidase subunit I revealed that this green-tide causative species is most closely affiliated to Pseudendoclonium akinetum (Ulvophyceae).

  12. MinION Analysis and Reference Consortium: Phase 2 data release and analysis of R9.0 chemistry.

    Science.gov (United States)

    Jain, Miten; Tyson, John R; Loose, Matthew; Ip, Camilla L C; Eccles, David A; O'Grady, Justin; Malla, Sunir; Leggett, Richard M; Wallerman, Ola; Jansen, Hans J; Zalunin, Vadim; Birney, Ewan; Brown, Bonnie L; Snutch, Terrance P; Olsen, Hugh E

    2017-01-01

    Long-read sequencing is rapidly evolving and reshaping the suite of opportunities for genomic analysis. For the MinION in particular, as both the platform and chemistry develop, the user community requires reference data to set performance expectations and maximally exploit third-generation sequencing. We performed an analysis of MinION data derived from whole genome sequencing of Escherichia coli K-12 using the R9.0 chemistry, comparing the results with the older R7.3 chemistry. We computed the error-rate estimates for insertions, deletions, and mismatches in MinION reads. Run-time characteristics of the flow cell and run scripts for R9.0 were similar to those observed for R7.3 chemistry, but with an 8-fold increase in bases per second (from 30 bps in R7.3 and SQK-MAP005 library preparation, to 250 bps in R9.0) processed by individual nanopores, and less drop-off in yield over time. The 2-dimensional ("2D") N50 read length was unchanged from the prior chemistry. Using the proportion of alignable reads as a measure of base-call accuracy, 99.9% of "pass" template reads from 1-dimensional ("1D") experiments were mappable and ~97% from 2D experiments. The median identity of reads was ~89% for 1D and ~94% for 2D experiments. The total error rate (miscall + insertion + deletion) decreased for 2D "pass" reads from 9.1% in R7.3 to 7.5% in R9.0 and for template "pass" reads from 26.7% in R7.3 to 14.5% in R9.0. These Phase 2 MinION experiments serve as a baseline by providing estimates for read quality, throughput, and mappability. The datasets further enable the development of bioinformatic tools tailored to the new R9.0 chemistry and the design of novel biological applications for this technology. K: thousand, Kb: kilobase (one thousand base pairs), M: million, Mb: megabase (one million base pairs), Gb: gigabase (one billion base pairs).
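
    A total error rate of the form miscall + insertion + deletion can be computed from per-read alignment counts, as in the sketch below. The choice of denominator (all alignment columns) is an assumption; conventions differ between tools, and the abstract does not state the exact definition used. The counts are invented for illustration.

```python
def error_profile(matches, mismatches, insertions, deletions):
    """Per-class and total error rates over all alignment columns.

    Assumption: the denominator is matches + mismatches + insertions + deletions.
    """
    aligned = matches + mismatches + insertions + deletions
    if aligned == 0:
        raise ValueError("empty alignment")
    return {
        "mismatch": mismatches / aligned,
        "insertion": insertions / aligned,
        "deletion": deletions / aligned,
        "total": (mismatches + insertions + deletions) / aligned,
        "identity": matches / aligned,
    }


profile = error_profile(matches=850, mismatches=60, insertions=40, deletions=50)
print(profile)
```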

  13. [Study of human immunodeficiency virus transmission chains in Andalusia: analysis from baseline antiretroviral resistance sequences].

    Science.gov (United States)

    Pérez-Parra, Santiago; Chueca-Porcuna, Natalia; Álvarez-Estevez, Marta; Pasquau, Juan; Omar, Mohamed; Collado, Antonio; Vinuesa, David; Lozano, Ana Belen; García-García, Federico

    2015-11-01

    Protease and reverse transcriptase HIV-1 sequences provide useful information for patient clinical management, as well as information on resistance to antiretrovirals. The aim of this study is to evaluate transmission events and transmitted drug resistance, and to georeference subtypes among newly diagnosed patients referred to our center. A study was conducted on 693 patients diagnosed between 2005 and 2012 in Southern Spain. Protease and reverse transcriptase sequences were obtained for analysis of resistance to cART with the Trugene® HIV Genotyping Kit (Siemens, NAD). MEGA 5.2, Neighbor-Joining, ArcGIS and REGA were used for subsequent analysis. The results showed 298 patients clustered into 77 different transmission events. Most of the clusters were formed by pairs (n=49), of men having sex with men (n=26), Spanish (n=37), and below 45 years of age (73.5%). Urban areas from Granada, and the coastal areas of Almeria and Granada showed the greatest subtype heterogeneity. Five clusters were formed by more than 10 patients, and 15 clusters had transmitted drug resistance. The study data demonstrate how the phylogenetic characterization of transmission clusters is a powerful tool to monitor the spread of HIV, and may help in designing appropriate preventive measures to minimize it. Copyright © 2015 Elsevier España, S.L.U. y Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.

  14. Data Link Test and Analysis System/ATCRBS Transponder Test System Technical Reference

    Science.gov (United States)

    1990-05-01

    This document references material for personnel using or making software changes to the Data Link Test and Analysis System (DATAS) for Air Traffic Control Radar Beacon System (ATCRBS) transponder testing and data collection. This is one of a se...

  15. Review of neutron activation analysis in the standardization and study of reference materials, including its application to radionuclide reference materials

    International Nuclear Information System (INIS)

    Byrne, A.R.

    1993-01-01

    Neutron activation analysis (NAA) plays a very important role in the certification of reference materials (RMs) and their characterization, including homogeneity testing. The features of the method are briefly reviewed, particularly aspects relating to its completely independent nuclear basis, its virtual freedom from blank problems, and its capacity for self-verification. This last aspect, arising from the essentially isotopic character of NAA, can be exploited by using different nuclear reactions and induced nuclides, and the possibility of employing two modes, one instrumental (nondestructive), the other radiochemical (destructive). This enables the derivation of essentially independent analytical information and the unique capacity of NAA for self-validation. The application of NAA to quantify natural or man-made radionuclides such as uranium, thorium, 237Np, 129I and 230Th is discussed, including its advantages over conventional radiometric methods and its usefulness in providing independent data for nuclides where other confirmatory analyses are impossible, or are only recently becoming available through newer 'atom counting' techniques. Certain additional, prospective uses of NAA in the study of RMs and potential RMs are mentioned, including transmutation reactions, creation of endogenously radiolabelled matrices for production and study of RMs (such as dissolution and leaching tests, use as incorporated radiotracers for chemical recovery correction), and the possibility of molecular activation analysis for specification. (orig.)

  16. Assembly of the Lactuca sativa, L. cv. Tizian draft genome sequence reveals differences within major resistance complex 1 as compared to the cv. Salinas reference genome.

    Science.gov (United States)

    Verwaaijen, Bart; Wibberg, Daniel; Nelkner, Johanna; Gordin, Miriam; Rupp, Oliver; Winkler, Anika; Bremges, Andreas; Blom, Jochen; Grosch, Rita; Pühler, Alfred; Schlüter, Andreas

    2018-02-10

    Lettuce (Lactuca sativa, L.) is an important annual plant of the family Asteraceae (Compositae). The commercial lettuce cultivar Tizian has been used in various scientific studies investigating the interaction of the plant with phytopathogens or biological control agents. Here, we present the de novo draft genome sequencing and gene prediction for this specific cultivar derived from transcriptome sequence data. The assembled scaffolds amount to a size of 2.22 Gb. Based on RNAseq data, 31,112 transcript isoforms were identified. Functional predictions for these transcripts were determined within the GenDBE annotation platform. Comparison with the cv. Salinas reference genome revealed a high degree of sequence similarity on genome and transcriptome levels, with an average amino acid identity of 99%. Furthermore, it was observed that two large regions are either missing or are highly divergent within the cv. Tizian genome compared to cv. Salinas. One of these regions covers the major resistance complex 1 region of cv. Salinas. The cv. Tizian draft genome sequence provides a valuable resource for future functional and transcriptome analyses focused on this lettuce cultivar. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Instrumental neutron activation analysis of marine sediment in-house reference material

    International Nuclear Information System (INIS)

    Nazaratul Ashifa Abdullah Salim; Mohd Suhaimi Hamzah; Mohd Suhaimi Elias; Siong, W.B.; Shamsiah Abdul Rahman; Azian Hashim; Shakirah Abdul Shukor

    2013-01-01

    Reference materials play an important role in demonstrating the quality and reliability of analytical data. The advantage of using in-house reference materials is that they provide a relatively cheap option as compared to using commercially available certified reference material (CRM) and can closely resemble the laboratory routine test sample. A marine sediment sample was designed as an in-house reference material, in the framework of quality assurance and control (QA/QC) program of the Neutron Activation Analysis (NAA) Laboratory at Nuclear Malaysia. The NAA technique was solely used for the homogeneity test of the marine sediment sample. The CRM of IAEA-Soil 7 and IAEA-SL1 (Lake Sediment) were applied in the analysis as compatible matrix based reference materials for QA purposes. (Author)

  18. Event Sequence Analysis of the Air Intelligence Agency Information Operations Center Flight Operations

    National Research Council Canada - National Science Library

    Larsen, Glen

    1998-01-01

    This report applies Event Sequence Analysis, methodology adapted from aircraft mishap investigation, to an investigation of the performance of the Air Intelligence Agency's Information Operations Center (IOC...

  19. Analysis of the Macaca mulatta transcriptome and the sequence divergence between Macaca and human.

    Science.gov (United States)

    Magness, Charles L; Fellin, P Campion; Thomas, Matthew J; Korth, Marcus J; Agy, Michael B; Proll, Sean C; Fitzgibbon, Matthew; Scherer, Christina A; Miner, Douglas G; Katze, Michael G; Iadonato, Shawn P

    2005-01-01

    We report the initial sequencing and comparative analysis of the Macaca mulatta transcriptome. Cloned sequences from 11 tissues, nine animals, and three species (M. mulatta, M. fascicularis, and M. nemestrina) were sampled, resulting in the generation of 48,642 sequence reads. These data represent an initial sampling of the putative rhesus orthologs for 6,216 human genes. Mean nucleotide diversity within M. mulatta and sequence divergence among M. fascicularis, M. nemestrina, and M. mulatta are also reported.

  20. Sequence analysis of mitochondrial 16S ribosomal RNA gene ...

    Indian Academy of Sciences (India)

    Unknown

    For the understanding of their vectorial capacity, identification of disease carrying and refractory strains is essential. ... been widely used for phylogenetic studies and sequence differences in ... In order to fill up the internal gap, a new set.

  1. simple sequence repeat (SSR) markers in genetic analysis of

    African Journals Online (AJOL)

    Yomi

    2012-08-28

    1998). Cross-species amplification of soybean (Glycine max) simple sequence repeats (SSRs) within the genus and other legume genera: implications for the transferability of SSRs in plants. Mol. Biol. Evol. 15:1275-1287.

  2. Sequence and expression analysis of gaps in human chromosome 20

    DEFF Research Database (Denmark)

    Minocherhomji, Sheroy; Seemann, Stefan; Mang, Yuan

    2012-01-01

    The finished human genome-assemblies comprise several hundred un-sequenced euchromatic gaps, which may be rich in long polypurine/polypyrimidine stretches. Human chromosome 20 (chr 20) currently has three unfinished gaps remaining on its q-arm. All three gaps are within gene-dense regions and/or overlap disease-associated loci, including the DLGAP4 locus. In this study, we sequenced ~99% of all three unfinished gaps on human chr 20, determined their complete genomic sizes and assessed epigenetic profiles using a combination of Sanger sequencing, mate pair paired-end high-throughput sequencing and chromatin, methylation and expression analyses. We found histone 3 trimethylated at Lysine 27 to be distributed across all three gaps in immortalized B-lymphocytes. In one gap, five novel CpG islands were predominantly hypermethylated in genomic DNA from peripheral blood lymphocytes and human cerebellum...

  3. DELIMINATE--a fast and efficient method for loss-less compression of genomic sequences: sequence analysis.

    Science.gov (United States)

    Mohammed, Monzoorul Haque; Dutta, Anirban; Bose, Tungadri; Chadaram, Sudha; Mande, Sharmila S

    2012-10-01

    An unprecedented quantity of genome sequence data is currently being generated using next-generation sequencing platforms. This has necessitated the development of novel bioinformatics approaches and algorithms that not only facilitate a meaningful analysis of these data but also aid in efficient compression, storage, retrieval and transmission of huge volumes of the generated data. We present a novel compression algorithm (DELIMINATE) that can rapidly compress genomic sequence data in a loss-less fashion. Validation results indicate relatively higher compression efficiency of DELIMINATE when compared with popular general purpose compression algorithms, namely, gzip, bzip2 and lzma. Linux, Windows and Mac implementations (both 32 and 64-bit) of DELIMINATE are freely available for download at: http://metagenomics.atc.tcs.com/compression/DELIMINATE. Contact: sharmila@atc.tcs.com. Supplementary data are available at Bioinformatics online.
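
    The general-purpose baselines named above (gzip, bzip2, lzma) are all available in the Python standard library, so a baseline comparison can be sketched as below. This does not implement DELIMINATE; the random test sequence and the naive 2-bit packing are assumptions added only to show why sequence-aware encodings have room to beat byte-oriented compressors.

```python
import bz2
import gzip
import lzma
import random

CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def compression_ratios(data):
    """Original size divided by compressed size for three general-purpose codecs."""
    codecs = (("gzip", gzip.compress), ("bzip2", bz2.compress), ("lzma", lzma.compress))
    return {name: len(data) / len(fn(data)) for name, fn in codecs}


def pack_2bit(seq):
    """Naive loss-less packing of an A/C/G/T string at 2 bits per base (4 bases per byte).

    The sequence length must be stored separately to unpack a short final chunk.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))   # left-align a short final chunk
        out.append(byte)
    return bytes(out)


random.seed(0)
seq = "".join(random.choice("ACGT") for _ in range(100_000))  # synthetic, not real genomic data
ratios = compression_ratios(seq.encode("ascii"))
packed = pack_2bit(seq)
print(ratios, len(seq) / len(packed))
```

    Real genomes are far from random, so measured ratios on actual data will differ from this synthetic case.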

  4. Analysis of 16S rRNA amplicon sequencing options on the Roche/454 next-generation titanium sequencing platform.

    Directory of Open Access Journals (Sweden)

    Hideyuki Tamaki

    Full Text Available BACKGROUND: 16S rRNA gene pyrosequencing approach has revolutionized studies in microbial ecology. While primer selection and short read length can affect the resulting microbial community profile, little is known about the influence of pyrosequencing methods on the sequencing throughput and the outcome of microbial community analyses. The aim of this study is to compare differences in output, ease, and cost among three different amplicon pyrosequencing methods for the Roche/454 Titanium platform. METHODOLOGY/PRINCIPAL FINDINGS: The following three pyrosequencing methods for 16S rRNA genes were selected in this study: Method-1 (standard method) is the recommended method for bi-directional sequencing using the LIB-A kit; Method-2 is a new option designed in this study for unidirectional sequencing with the LIB-A kit; and Method-3 uses the LIB-L kit for unidirectional sequencing. In our comparison among these three methods using 10 different environmental samples, Method-2 and Method-3 produced 1.5-1.6 times more useable reads than the standard method (Method-1) after quality-based trimming, and did not compromise the outcome of microbial community analyses. Specifically, Method-3 is the most cost-effective unidirectional amplicon sequencing method as it provided the most reads and required the least effort in consumables management. CONCLUSIONS: Our findings clearly demonstrated that alternative pyrosequencing methods for 16S rRNA genes could drastically affect sequencing output (e.g. number of reads before and after trimming) but have little effect on the outcomes of microbial community analysis. This finding is important for both researchers and sequencing facilities utilizing 16S rRNA gene pyrosequencing for microbial ecological studies.

  5. Compilation and analysis of Escherichia coli promoter DNA sequences.

    OpenAIRE

    Hawley, D K; McClure, W R

    1983-01-01

    The DNA sequences of 168 promoter regions (-50 to +10) for Escherichia coli RNA polymerase were compiled. The complete listing was divided into two groups depending upon whether or not the promoter had been defined by genetic (promoter mutations) or biochemical (5' end determination) criteria. A consensus promoter sequence based on homologies among 112 well-defined promoters was determined that was in substantial agreement with previous compilations. In addition, we have tabulated 98 promoter ...
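
    Deriving a consensus from aligned sequences can be sketched as a per-column majority vote. The four input hexamers are toy data invented for illustration, not entries from the compilation, and the 50% cut-off is an assumption.

```python
from collections import Counter


def consensus(aligned, min_fraction=0.5):
    """Most frequent base per column; 'n' where no base reaches `min_fraction`."""
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    out = []
    for column in zip(*aligned):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(aligned) >= min_fraction else "n")
    return "".join(out)


toy_sites = ["TTGACA", "TTGACT", "TAGACA", "TTGCCA"]  # hypothetical aligned sites
result = consensus(toy_sites)
print(result)
```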

  6. Sequence quality analysis tool for HIV type 1 protease and reverse transcriptase.

    Science.gov (United States)

    Delong, Allison K; Wu, Mingham; Bennett, Diane; Parkin, Neil; Wu, Zhijin; Hogan, Joseph W; Kantor, Rami

    2012-08-01

    Access to antiretroviral therapy is increasing globally and drug resistance evolution is anticipated. Currently, protease (PR) and reverse transcriptase (RT) sequence generation is increasing, including the use of in-house sequencing assays, and quality assessment prior to sequence analysis is essential. We created a computational HIV PR/RT Sequence Quality Analysis Tool (SQUAT) that runs in the R statistical environment. Sequence quality thresholds are calculated from a large dataset (46,802 PR and 44,432 RT sequences) from the published literature (http://hivdb.Stanford.edu). Nucleic acid sequences are read into SQUAT, identified, aligned, and translated. Nucleic acid sequences are flagged if they have >five 1-2-base insertions; >one 3-base insertion; >one deletion; >six PR or >18 RT ambiguous bases; >three consecutive PR or >four RT nucleic acid mutations; >zero stop codons; >three PR or >six RT ambiguous amino acids; >three consecutive PR or >four RT amino acid mutations; >zero unique amino acids; or 15% genetic distance from another submitted sequence. Thresholds are user modifiable. SQUAT output includes a summary report with detailed comments for troubleshooting of flagged sequences, histograms of pairwise genetic distances, neighbor joining phylogenetic trees, and aligned nucleic and amino acid sequences. SQUAT is a stand-alone, free, web-independent tool to ensure use of high-quality HIV PR/RT sequences in interpretation and reporting of drug resistance, while increasing awareness and expertise and facilitating troubleshooting of potentially problematic sequences.
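
    The threshold rules in this abstract can be pictured as a small lookup of per-region limits. The Python sketch below is illustrative only: SQUAT itself runs in R, and the names `THRESHOLDS` and `flag_sequence` as well as the count keys are hypothetical, not part of SQUAT. Only a subset of the quoted limits is encoded, and the insertion and deletion limits are assumed to apply to both regions.

```python
# Illustrative sketch only; not SQUAT's code. All names here are hypothetical.
# Limits are taken from the abstract: a count strictly above its limit is flagged.
COMMON_LIMITS = {
    "small_insertions": 5,        # 1-2-base insertions
    "three_base_insertions": 1,
    "deletions": 1,
    "stop_codons": 0,
}

THRESHOLDS = {
    "PR": {**COMMON_LIMITS, "ambiguous_bases": 6, "ambiguous_amino_acids": 3},
    "RT": {**COMMON_LIMITS, "ambiguous_bases": 18, "ambiguous_amino_acids": 6},
}


def flag_sequence(region, counts, thresholds=THRESHOLDS):
    """Return the names of all quality checks that a sequence fails.

    region -- "PR" or "RT"
    counts -- dict mapping check name to the observed count (missing = 0)
    """
    limits = thresholds[region]
    return [name for name, limit in limits.items() if counts.get(name, 0) > limit]
```

    Passing a modified `thresholds` dictionary mirrors the abstract's statement that thresholds are user modifiable.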

  7. First fungal genome sequence from Africa: A preliminary analysis

    Directory of Open Access Journals (Sweden)

    Rene Sutherland

    2012-01-01

    Full Text Available Some of the most significant breakthroughs in the biological sciences this century will emerge from the development of next generation sequencing technologies. The ease of availability of DNA sequence made possible through these new technologies has given researchers opportunities to study organisms in a manner that was not possible with Sanger sequencing. Scientists will, therefore, need to embrace genomics, as well as develop and nurture the human capacity to sequence genomes and utilise the ‘tsunami’ of data that emerge from genome sequencing. In response to these challenges, we sequenced the genome of Fusarium circinatum, a fungal pathogen of pine that causes pitch canker, a disease of great concern to the South African forestry industry. The sequencing work was conducted in South Africa, making F. circinatum the first eukaryotic organism for which the complete genome has been sequenced locally. Here we report on the process that was followed to sequence, assemble and perform a preliminary characterisation of the genome. Furthermore, details of the computer annotation and manual curation of this genome are presented. The F. circinatum genome was found to be nearly 44 million bases in size, which is similar to that of four other Fusarium genomes that have been sequenced elsewhere. The genome contains just over 15 000 open reading frames, which is less than that of the related species, Fusarium oxysporum, but more than that for Fusarium verticillioides. Amongst the various putative gene clusters identified in F. circinatum, those encoding the secondary metabolites fumosin and fusarin appeared to harbour evidence of gene translocation. It is anticipated that similar comparisons of other loci will provide insights into the genetic basis for pathogenicity of the pitch canker pathogen. Perhaps more importantly, this project has engaged a relatively large group of scientists

  8. REFGEN and TREENAMER: Automated Sequence Data Handling for Phylogenetic Analysis in the Genomic Era

    Science.gov (United States)

    Leonard, Guy; Stevens, Jamie R.; Richards, Thomas A.

    2009-01-01

    The phylogenetic analysis of nucleotide sequences and increasingly that of amino acid sequences is used to address a number of biological questions. Access to extensive datasets, including numerous genome projects, means that standard phylogenetic analyses can include many hundreds of sequences. Unfortunately, most phylogenetic analysis programs do not tolerate the sequence naming conventions of genome databases. Managing large numbers of sequences and standardizing sequence labels for use in phylogenetic analysis programs can be a time consuming and laborious task. Here we report the availability of an online resource for the management of gene sequences recovered from public access genome databases such as GenBank. These web utilities include the facility for renaming every sequence in a FASTA alignment file, with each sequence label derived from a user-defined combination of the species name and/or database accession number. This facility enables the user to keep track of the branching order of the sequences/taxa during multiple tree calculations and re-optimisations. Post phylogenetic analysis, these webpages can then be used to rename every label in the subsequent tree files (with a user-defined combination of species name and/or database accession number). Together these programs drastically reduce the time required for managing sequence alignments and labelling phylogenetic figures. Additional features of our platform include the automatic removal of identical accession numbers (recorded in the report file) and generation of species and accession number lists for use in supplementary materials or figure legends. PMID:19812722
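
    The renaming step described in this abstract can be sketched in a few lines of Python. This is not the REFGEN implementation: the function name `rename_fasta`, the assumption that the accession number is the first whitespace-delimited token of each header, and the `Species_name_ACCESSION` label layout are all choices made here for illustration.

```python
# Hypothetical sketch of FASTA relabelling; not the REFGEN code.
def rename_fasta(lines, species_by_accession):
    """Relabel FASTA headers as Species_name_ACCESSION.

    Records whose accession was already seen are dropped, and their
    accessions are returned separately (a stand-in for a report file).
    Assumes the accession is the first token after '>' in each header.
    """
    seen = set()
    renamed = []
    removed = []
    keep_record = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">"):
            tokens = line[1:].split()
            accession = tokens[0] if tokens else ""
            if accession in seen:
                removed.append(accession)
                keep_record = False
                continue
            seen.add(accession)
            keep_record = True
            species = species_by_accession.get(accession, "unknown")
            renamed.append(">" + species.replace(" ", "_") + "_" + accession)
        elif keep_record:
            renamed.append(line)
    return renamed, removed
```

    Because each new label still carries the accession number, the same mapping can be applied later to the labels in a tree file.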

  9. REFGEN and TREENAMER: Automated Sequence Data Handling for Phylogenetic Analysis in the Genomic Era

    Directory of Open Access Journals (Sweden)

    Guy Leonard

    2009-01-01

    Full Text Available The phylogenetic analysis of nucleotide sequences and increasingly that of amino acid sequences is used to address a number of biological questions. Access to extensive datasets, including numerous genome projects, means that standard phylogenetic analyses can include many hundreds of sequences. Unfortunately, most phylogenetic analysis programs do not tolerate the sequence naming conventions of genome databases. Managing large numbers of sequences and standardizing sequence labels for use in phylogenetic analysis programs can be a time consuming and laborious task. Here we report the availability of an online resource for the management of gene sequences recovered from public access genome databases such as GenBank. These web utilities include the facility for renaming every sequence in a FASTA alignment file, with each sequence label derived from a user-defined combination of the species name and/or database accession number. This facility enables the user to keep track of the branching order of the sequences/taxa during multiple tree calculations and re-optimisations. Post phylogenetic analysis, these webpages can then be used to rename every label in the subsequent tree files (with a user-defined combination of species name and/or database accession number). Together these programs drastically reduce the time required for managing sequence alignments and labelling phylogenetic figures. Additional features of our platform include the automatic removal of identical accession numbers (recorded in the report file) and generation of species and accession number lists for use in supplementary materials or figure legends.

  10. Optimizing a massive parallel sequencing workflow for quantitative miRNA expression analysis.

    Directory of Open Access Journals (Sweden)

    Francesca Cordero

    Full Text Available BACKGROUND: Massive Parallel Sequencing methods (MPS) can extend and improve the knowledge obtained by conventional microarray technology, both for mRNAs and short non-coding RNAs, e.g. miRNAs. The processing methods used to extract and interpret the information are an important aspect of dealing with the vast amounts of data generated from short read sequencing. Although the number of computational tools for MPS data analysis is constantly growing, their strengths and weaknesses as part of a complex analytical pipe-line have not yet been well investigated. PRIMARY FINDINGS: A benchmark MPS miRNA dataset, resembling a situation in which miRNAs are spiked in biological replication experiments was assembled by merging a publicly available MPS spike-in miRNAs data set with MPS data derived from healthy donor peripheral blood mononuclear cells. Using this data set we observed that short reads counts estimation is strongly under estimated in case of duplicates miRNAs, if whole genome is used as reference. Furthermore, the sensitivity of miRNAs detection is strongly dependent by the primary tool used in the analysis. Within the six aligners tested, specifically devoted to miRNA detection, SHRiMP and MicroRazerS show the highest sensitivity. Differential expression estimation is quite efficient. Within the five tools investigated, two of them (DESseq, baySeq) show a very good specificity and sensitivity in the detection of differential expression. CONCLUSIONS: The results provided by our analysis allow the definition of a clear and simple analytical optimized workflow for miRNAs digital quantitative analysis.

  11. Optimizing a massive parallel sequencing workflow for quantitative miRNA expression analysis.

    Science.gov (United States)

    Cordero, Francesca; Beccuti, Marco; Arigoni, Maddalena; Donatelli, Susanna; Calogero, Raffaele A

    2012-01-01

    Massive Parallel Sequencing methods (MPS) can extend and improve the knowledge obtained by conventional microarray technology, both for mRNAs and short non-coding RNAs, e.g. miRNAs. The processing methods used to extract and interpret the information are an important aspect of dealing with the vast amounts of data generated from short read sequencing. Although the number of computational tools for MPS data analysis is constantly growing, their strengths and weaknesses as part of a complex analytical pipe-line have not yet been well investigated. A benchmark MPS miRNA dataset, resembling a situation in which miRNAs are spiked in biological replication experiments was assembled by merging a publicly available MPS spike-in miRNAs data set with MPS data derived from healthy donor peripheral blood mononuclear cells. Using this data set we observed that short reads counts estimation is strongly under estimated in case of duplicates miRNAs, if whole genome is used as reference. Furthermore, the sensitivity of miRNAs detection is strongly dependent by the primary tool used in the analysis. Within the six aligners tested, specifically devoted to miRNA detection, SHRiMP and MicroRazerS show the highest sensitivity. Differential expression estimation is quite efficient. Within the five tools investigated, two of them (DESseq, baySeq) show a very good specificity and sensitivity in the detection of differential expression. The results provided by our analysis allow the definition of a clear and simple analytical optimized workflow for miRNAs digital quantitative analysis.

  12. Phenomenological uncertainty analysis of containment building pressure load caused by severe accident sequences

    International Nuclear Information System (INIS)

    Park, S.Y.; Ahn, K.I.

    2014-01-01

    Highlights: • Phenomenological uncertainty analysis has been applied to level 2 PSA. • The methodology provides an alternative to simple deterministic analyses and sensitivity studies. • A realistic evaluation provides a more complete characterization of risks. • Uncertain parameters of MAAP code for the early containment failure were identified. - Abstract: This paper illustrates an application of a severe accident analysis code, MAAP, to the uncertainty evaluation of early containment failure scenarios employed in the containment event tree (CET) model of a reference plant. An uncertainty analysis of containment pressure behavior during severe accidents has been performed for an optimum assessment of an early containment failure model. The present application is mainly focused on determining an estimate of the containment building pressure load caused by severe accident sequences of a nuclear power plant. Key modeling parameters and phenomenological models employed for the present uncertainty analysis are closely related to the in-vessel hydrogen generation, direct containment heating, and gas combustion. The basic approach of this methodology is to (1) develop severe accident scenarios for which containment pressure loads should be performed based on a level 2 PSA, (2) identify severe accident phenomena relevant to an early containment failure, (3) identify the MAAP input parameters, sensitivity coefficients, and modeling options that describe or influence the early containment failure phenomena, (4) prescribe the likelihood descriptions of the potential range of these parameters, and (5) evaluate the code predictions using a number of random combinations of parameter inputs sampled from the likelihood distributions
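
    Steps (4) and (5) of the approach above, assigning likelihood distributions to uncertain inputs and evaluating predictions over random combinations of them, amount to Monte Carlo propagation. The Python sketch below shows only that general pattern. It does not call or reproduce MAAP: `toy_peak_pressure` is a made-up placeholder response, and the parameter names, ranges, and distribution shapes are assumptions chosen for illustration.

```python
# Minimal Monte Carlo propagation sketch. Everything here is hypothetical:
# the response function is a placeholder, not a MAAP model, and the
# parameter ranges are invented for illustration.
import random


def toy_peak_pressure(hydrogen_fraction, dch_fraction):
    """Placeholder response: peak pressure (arbitrary units) from two inputs."""
    return 0.3 + 0.4 * hydrogen_fraction + 0.5 * dch_fraction


def propagate(n_samples, seed=0):
    """Sample the uncertain inputs and summarize the resulting pressures."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        # Step 4: likelihood descriptions of each uncertain parameter.
        hydrogen_fraction = rng.uniform(0.2, 0.8)
        dch_fraction = rng.triangular(0.0, 0.5, 0.1)  # low, high, mode
        # Step 5: evaluate the prediction for this random combination.
        results.append(toy_peak_pressure(hydrogen_fraction, dch_fraction))
    results.sort()
    return {
        "mean": sum(results) / len(results),
        "p95": results[int(0.95 * (len(results) - 1))],
        "max": results[-1],
    }
```

    In an actual study each sample would be a full code run, so the number of samples is limited by computing cost rather than chosen freely.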

  13. Whole transcriptome analysis using next-generation sequencing of model species Setaria viridis to support C4 photosynthesis research.

    Science.gov (United States)

    Xu, Jiajia; Li, Yuanyuan; Ma, Xiuling; Ding, Jianfeng; Wang, Kai; Wang, Sisi; Tian, Ye; Zhang, Hui; Zhu, Xin-Guang

    2013-09-01

    Setaria viridis is an emerging model species for genetic studies of C4 photosynthesis. Many basic molecular resources need to be developed to support research on this species. In this paper, we performed a comprehensive transcriptome analysis from multiple developmental stages and tissues of S. viridis using next-generation sequencing technologies. Sequencing of the transcriptome from multiple tissues across three developmental stages (seed germination, vegetative growth, and reproduction) yielded a total of 71 million single end 100 bp long reads. Reference-based assembly using the Setaria italica genome as a reference generated 42,754 transcripts. De novo assembly generated 60,751 transcripts. In addition, 9,576 and 7,056 potential simple sequence repeats (SSRs) covering the S. viridis genome were identified when using the reference-based assembled transcripts and the de novo assembled transcripts, respectively. The transcripts and SSRs identified in this study can be used for both reverse and forward genetic studies based on S. viridis.

  14. Multilocus sequence analysis of Echinococcus granulosus strains isolated from humans and animals in Iran.

    Science.gov (United States)

    Nikmanesh, Bahram; Mirhendi, Hossein; Mahmoudi, Shahram; Rokni, Mohammad Bagher

    2017-12-01

    Echinococcus granulosus is now considered a complex consisting of at least four species and ten genotypes. Different molecular targets have been described for molecular characterization of E. granulosus; however, in almost all studies only one or two of the targets have been used, and only limited data is available on the utilization of multiple loci. Therefore, we investigated the genetic diversity among 64 strains isolated from 138 cyst specimens of human and animal isolates, using a set of nuclear and mitochondrial genes; i.e., cytochrome c oxidase subunit 1 (cox1), NADH dehydrogenase subunit 1 (nad1), ATPase subunit 6 (atp6), 12S rRNA (12S), and Actin II (act II). In comparison to the use of molecular reference targets (nad1 + cox1), using singular target (act II or 12S or atp6) yielded lower discriminatory power. Act II and 12S genes could accurately discriminate the G6 genotype, but they were not able to differentiate between G1 and G3 genotypes. As the G1 and G3 genotypes belong to the E. granulosus sensu stricto, low intra-species variation was observed for act II and 12S. The atp6 gene could identify the G3 genotype but could not differentiate G6 and G1 genotypes. Using concatenated sequence of five genes (cox1 + nad1 + atp6 + 12S + act II), genotypes were identified accurately, and markedly higher resolution was obtained in comparison with the use of reference markers (nad1 + cox1) only. Application of multilocus sequence analysis (MLSA) to large-scale studies could provide valuable epidemiological data to make efficient control and management measures for cystic echinococcosis. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. Cloning and evaluation of reference genes for quantitative real-time PCR analysis in Amorphophallus

    Directory of Open Access Journals (Sweden)

    Kai Wang

    2017-04-01

    Full Text Available Quantitative real-time reverse transcription PCR (RT-qPCR) has been widely used in the detection and quantification of gene expression levels because of its high accuracy, sensitivity, and reproducibility as well as its large dynamic range. However, the reliability and accuracy of RT-qPCR depend on accurate transcript normalization using stably expressed reference genes. Amorphophallus is a perennial plant with a high content of konjac glucomannan (KGM) in its corm. This crop has been used as a food source and as a traditional medicine for thousands of years. Without adequate knowledge of gene expression profiles, there has been no report of validated reference genes in Amorphophallus. In this study, nine genes that are usually used as reference genes in other crops were selected as candidate reference genes. The putative sequences of these genes in Amorphophallus were cloned by the use of degenerate primers. The expression stability of each gene was assessed in different tissues and under two abiotic stresses (heat and waterlogging) in A. albus and A. konjac. Three distinct algorithms were used to evaluate the expression stability of the candidate reference genes. The results demonstrated that EF1-α, EIF4A, H3 and UBQ were the best reference genes under heat stress in Amorphophallus. Furthermore, EF1-α, EIF4A, TUB, and RP were the best reference genes in waterlogged conditions. By comparing different tissues from all samples, we determined that EF1-α, EIF4A, and CYP were stable in these sets. In addition, the suitability of these reference genes was confirmed by validating the expression of a gene encoding the small heat shock protein SHSP, which is related to heat stress in Amorphophallus. In sum, EF1-α and EIF4A were the two best reference genes for normalizing mRNA levels in different tissues and under various stress treatments, and we suggest using one of these genes in combination with 1 or 2 reference genes associated with different

  16. Food adulteration analysis without laboratory prepared or determined reference food adulterant values.

    Science.gov (United States)

    Kalivas, John H; Georgiou, Constantinos A; Moira, Marianna; Tsafaras, Ilias; Petrakis, Eleftherios A; Mousdis, George A

    2014-04-01

    Quantitative analysis of food adulterants is an important health and economic issue that needs to be fast and simple. Spectroscopy has significantly reduced analysis time. However, still needed are preparations of analyte calibration samples matrix matched to prediction samples which can be laborious and costly. Reported in this paper is the application of a newly developed pure component Tikhonov regularization (PCTR) process that does not require laboratory prepared or reference analysis methods, and hence, is a greener calibration method. The PCTR method requires an analyte pure component spectrum and non-analyte spectra. As a food analysis example, synchronous fluorescence spectra of extra virgin olive oil samples adulterated with sunflower oil is used. Results are shown to be better than those obtained using ridge regression with reference calibration samples. The flexibility of PCTR allows including reference samples and is generic for use with other instrumental methods and food products. Copyright © 2013 Elsevier Ltd. All rights reserved.

  17. A de novo transcriptome of European pollen beetle populations and its analysis, with special reference to insecticide action and resistance.

    Science.gov (United States)

    Zimmer, C T; Maiwald, F; Schorn, C; Bass, C; Ott, M-C; Nauen, R

    2014-08-01

    The pollen beetle Meligethes aeneus is the most important coleopteran pest in European oilseed rape cultivation, annually infesting millions of hectares and responsible for substantial yield losses if not kept under economic damage thresholds. This species is primarily controlled with insecticides but has recently developed high levels of resistance to the pyrethroid class. The aim of the present study was to provide a transcriptomic resource to investigate mechanisms of resistance. cDNA was sequenced on both Roche (Indianapolis, IN, USA) and Illumina (LGC Genomics, Berlin, Germany) platforms, resulting in a total of ∼53 m reads which assembled into 43 396 expressed sequence tags (ESTs). Manual annotation revealed good coverage of genes encoding insecticide target sites and detoxification enzymes. A total of 77 nonredundant cytochrome P450 genes were identified. Mapping of Illumina RNAseq sequences (from susceptible and pyrethroid-resistant strains) against the reference transcriptome identified a cytochrome P450 (CYP6BQ23) as highly overexpressed in pyrethroid resistance strains. Single-nucleotide polymorphism analysis confirmed the presence of a target-site resistance mutation (L1014F) in the voltage-gated sodium channel of one resistant strain. Our results provide new insights into the important genes associated with pyrethroid resistance in M. aeneus. Furthermore, a comprehensive EST resource is provided for future studies on insecticide modes of action and resistance mechanisms in pollen beetle. © 2014 The Royal Entomological Society.

  18. Non-linear Analysis of Scalp EEG by Using Bispectra: The Effect of the Reference Choice

    Directory of Open Access Journals (Sweden)

    Federico Chella

    2017-05-01

    Full Text Available Bispectral analysis is a signal processing technique that makes it possible to capture the non-linear and non-Gaussian properties of the EEG signals. It has found various applications in EEG research and clinical practice, including the assessment of anesthetic depth, the identification of epileptic seizures, and more recently, the evaluation of non-linear cross-frequency brain functional connectivity. However, the validity and reliability of the indices drawn from bispectral analysis of EEG signals are potentially biased by the use of a non-neutral EEG reference. The present study aims at investigating the effects of the reference choice on the analysis of the non-linear features of EEG signals through bicoherence, as well as on the estimation of cross-frequency EEG connectivity through two different non-linear measures, i.e., the cross-bicoherence and the antisymmetric cross-bicoherence. To this end, four commonly used reference schemes were considered: the vertex electrode (Cz), the digitally linked mastoids, the average reference, and the Reference Electrode Standardization Technique (REST). The reference effects were assessed both in simulations and in a real EEG experiment. The simulations allowed us to investigate: (i) the effects of the electrode density on the performance of the above references in the estimation of bispectral measures; and (ii) the effects of the head model accuracy on the performance of the REST. For real data, the EEG signals recorded from 10 subjects during eyes open resting state were examined, and the distortions induced by the reference choice in the patterns of alpha-beta bicoherence, cross-bicoherence, and antisymmetric cross-bicoherence were assessed. The results showed significant differences in the findings depending on the chosen reference, with the REST providing better performance than all the other references in approximating the ideal neutral reference. In conclusion, this study highlights the importance of

  19. Analysis of expressed sequence tags from Prunus mume flower and fruit and development of simple sequence repeat markers

    Directory of Open Access Journals (Sweden)

    Gao Zhihong

    2010-07-01

    Full Text Available BACKGROUND: Expressed Sequence Tag (EST) has been a cost-effective tool in molecular biology and represents an abundant valuable resource for genome annotation, gene expression, and comparative genomics in plants. RESULTS: In this study, we constructed a cDNA library of Prunus mume flower and fruit, sequenced 10,123 clones of the library, and obtained 8,656 expressed sequence tag (EST) sequences with high quality. The ESTs were assembled into 4,473 unigenes composed of 1,492 contigs and 2,981 singletons, which have been deposited in NCBI (accession IDs: GW868575 - GW873047), among which 1,294 unique ESTs were with known or putative functions. Furthermore, we found 1,233 putative simple sequence repeats (SSRs) in the P. mume unigene dataset. We randomly tested 42 pairs of PCR primers flanking potential SSRs, and 14 pairs were identified as true-to-type SSR loci and could amplify polymorphic bands from 20 individual plants of P. mume. We further used the 14 EST-SSR primer pairs to test the transferability on peach and plum. The result showed that nearly 89% of the primer pairs produced target PCR bands in the two species. A high level of marker polymorphism was observed in the plum species (65%) and low in the peach (46%), and the clustering analysis of the three species indicated that these SSR markers were useful in the evaluation of genetic relationships and diversity between and within the Prunus species. CONCLUSIONS: We have constructed the first cDNA library of P. mume flower and fruit, and our data provide sets of molecular biology resources for P. mume and other Prunus species. These resources will be useful for further study such as genome annotation, new gene discovery, gene functional analysis, molecular breeding, evolution and comparative genomics between Prunus species.

  20. Characterization of shark complement factor I gene(s): genomic analysis of a novel shark-specific sequence.

    Science.gov (United States)

    Shin, Dong-Ho; Webb, Barbara M; Nakao, Miki; Smith, Sylvia L

    2009-07-01

    Complement factor I is a crucial regulator of mammalian complement activity. Very little is known of complement regulators in non-mammalian species. We isolated and sequenced four highly similar complement factor I cDNAs from the liver of the nurse shark (Ginglymostoma cirratum), designated as GcIf-1, GcIf-2, GcIf-3 and GcIf-4 (previously referred to as nsFI-a, -b, -c and -d) which encode 689, 673, 673 and 657 amino acid residues, respectively. They share 95% (shark-specific sequence between the leader peptide (LP) and the factor I membrane attack complex (FIMAC) domain. The cDNA sequences differ only in the size and composition of the shark-specific region (SSR). Sequence analysis of each SSR has identified within the region two novel short sequences (SS1 and SS2) and three repeat sequences (RS1-3). Genomic analysis has revealed the existence of three introns between the leader peptide and the FIMAC domain, tentatively designated intron 1, intron 2, and intron 3 which span 4067, 2293 and 2082bp, respectively. Southern blot analysis suggests the presence of a single gene copy for each cDNA type. Phylogenetic analysis suggests that complement factor I of cartilaginous fish diverged prior to the emergence of mammals. All four GcIf cDNA species are expressed in four different tissues and the liver is the main tissue in which expression level of all four is high. This suggests that the expression of GcIf isotypes is tissue-dependent.

  1. Species-level analysis of DNA sequence data from the NIH Human Microbiome Project.

    Science.gov (United States)

    Conlan, Sean; Kong, Heidi H; Segre, Julia A

    2012-01-01

    Outbreaks of antibiotic-resistant bacterial infections emphasize the importance of surveillance of potentially pathogenic bacteria. Genomic sequencing of clinical microbiological specimens expands our capacity to study cultivable, fastidious and uncultivable members of the bacterial community. Herein, we compared the primary data collected by the NIH's Human Microbiome Project (HMP) with published epidemiological surveillance data of Staphylococcus aureus. The HMP's initial dataset contained microbial survey data from five body regions (skin, nares, oral cavity, gut and vagina) of 242 healthy volunteers. A significant component of the HMP dataset was deep sequencing of the 16S ribosomal RNA gene, which contains variable regions enabling taxonomic classification. Since species-level identification is essential in clinical microbiology, we built a reference database and used phylogenetic placement followed by most recent common ancestor classification to look at the species distribution for Staphylococcus, Klebsiella and Enterococcus. We show that selecting the accurate region of the 16S rRNA gene to sequence is analogous to carefully selecting culture conditions to distinguish closely related bacterial species. Analysis of the HMP data showed that Staphylococcus aureus was present in the nares of 36% of healthy volunteers, consistent with culture-based epidemiological data. Klebsiella pneumoniae and Enterococcus faecalis were found less frequently, but across many habitats. This work demonstrates that large 16S rRNA survey studies can be used to support epidemiological goals in the context of an increasing awareness that microbes flourish and compete within a larger bacterial community. This study demonstrates how genomic techniques and information could be critically important to trace microbial evolution and implement hospital infection control.

  2. Species-level analysis of DNA sequence data from the NIH Human Microbiome Project.

    Directory of Open Access Journals (Sweden)

    Sean Conlan

    Full Text Available BACKGROUND: Outbreaks of antibiotic-resistant bacterial infections emphasize the importance of surveillance of potentially pathogenic bacteria. Genomic sequencing of clinical microbiological specimens expands our capacity to study cultivable, fastidious and uncultivable members of the bacterial community. Herein, we compared the primary data collected by the NIH's Human Microbiome Project (HMP) with published epidemiological surveillance data of Staphylococcus aureus. METHODS: The HMP's initial dataset contained microbial survey data from five body regions (skin, nares, oral cavity, gut and vagina) of 242 healthy volunteers. A significant component of the HMP dataset was deep sequencing of the 16S ribosomal RNA gene, which contains variable regions enabling taxonomic classification. Since species-level identification is essential in clinical microbiology, we built a reference database and used phylogenetic placement followed by most recent common ancestor classification to look at the species distribution for Staphylococcus, Klebsiella and Enterococcus. MAIN RESULTS: We show that selecting the accurate region of the 16S rRNA gene to sequence is analogous to carefully selecting culture conditions to distinguish closely related bacterial species. Analysis of the HMP data showed that Staphylococcus aureus was present in the nares of 36% of healthy volunteers, consistent with culture-based epidemiological data. Klebsiella pneumoniae and Enterococcus faecalis were found less frequently, but across many habitats. CONCLUSIONS: This work demonstrates that large 16S rRNA survey studies can be used to support epidemiological goals in the context of an increasing awareness that microbes flourish and compete within a larger bacterial community. This study demonstrates how genomic techniques and information could be critically important to trace microbial evolution and implement hospital infection control.

  3. Evidence-based design and evaluation of a whole genome sequencing clinical report for the reference microbiology laboratory

    Science.gov (United States)

    Crisan, Anamaria; McKee, Geoffrey; Munzner, Tamara

    2018-01-01

    Background Microbial genome sequencing is now being routinely used in many clinical and public health laboratories. Understanding how to report complex genomic test results to stakeholders who may have varying familiarity with genomics—including clinicians, laboratorians, epidemiologists, and researchers—is critical to the successful and sustainable implementation of this new technology; however, there are no evidence-based guidelines for designing such a report in the pathogen genomics domain. Here, we describe an iterative, human-centered approach to creating a report template for communicating tuberculosis (TB) genomic test results. Methods We used Design Study Methodology—a human centered approach drawn from the information visualization domain—to redesign an existing clinical report. We used expert consults and an online questionnaire to discover various stakeholders’ needs around the types of data and tasks related to TB that they encounter in their daily workflow. We also evaluated their perceptions of and familiarity with genomic data, as well as its utility at various clinical decision points. These data shaped the design of multiple prototype reports that were compared against the existing report through a second online survey, with the resulting qualitative and quantitative data informing the final, redesigned, report. Results We recruited 78 participants, 65 of whom were clinicians, nurses, laboratorians, researchers, and epidemiologists involved in TB diagnosis, treatment, and/or surveillance. Our first survey indicated that participants were largely enthusiastic about genomic data, with the majority agreeing on its utility for certain TB diagnosis and treatment tasks and many reporting some confidence in their ability to interpret this type of data (between 58.8% and 94.1%, depending on the specific data type). When we compared our four prototype reports against the existing design, we found that for the majority (86.7%) of design

  4. Evidence-based design and evaluation of a whole genome sequencing clinical report for the reference microbiology laboratory

    Directory of Open Access Journals (Sweden)

    Anamaria Crisan

    2018-01-01

    Background Microbial genome sequencing is now being routinely used in many clinical and public health laboratories. Understanding how to report complex genomic test results to stakeholders who may have varying familiarity with genomics, including clinicians, laboratorians, epidemiologists, and researchers, is critical to the successful and sustainable implementation of this new technology; however, there are no evidence-based guidelines for designing such a report in the pathogen genomics domain. Here, we describe an iterative, human-centered approach to creating a report template for communicating tuberculosis (TB) genomic test results. Methods We used Design Study Methodology, a human-centered approach drawn from the information visualization domain, to redesign an existing clinical report. We used expert consults and an online questionnaire to discover various stakeholders’ needs around the types of data and tasks related to TB that they encounter in their daily workflow. We also evaluated their perceptions of and familiarity with genomic data, as well as its utility at various clinical decision points. These data shaped the design of multiple prototype reports that were compared against the existing report through a second online survey, with the resulting qualitative and quantitative data informing the final, redesigned, report. Results We recruited 78 participants, 65 of whom were clinicians, nurses, laboratorians, researchers, and epidemiologists involved in TB diagnosis, treatment, and/or surveillance. Our first survey indicated that participants were largely enthusiastic about genomic data, with the majority agreeing on its utility for certain TB diagnosis and treatment tasks and many reporting some confidence in their ability to interpret this type of data (between 58.8% and 94.1%, depending on the specific data type). When we compared our four prototype reports against the existing design, we found that for the majority (86.7%) of

  5. Evidence-based design and evaluation of a whole genome sequencing clinical report for the reference microbiology laboratory.

    Science.gov (United States)

    Crisan, Anamaria; McKee, Geoffrey; Munzner, Tamara; Gardy, Jennifer L

    2018-01-01

    Microbial genome sequencing is now being routinely used in many clinical and public health laboratories. Understanding how to report complex genomic test results to stakeholders who may have varying familiarity with genomics, including clinicians, laboratorians, epidemiologists, and researchers, is critical to the successful and sustainable implementation of this new technology; however, there are no evidence-based guidelines for designing such a report in the pathogen genomics domain. Here, we describe an iterative, human-centered approach to creating a report template for communicating tuberculosis (TB) genomic test results. We used Design Study Methodology, a human-centered approach drawn from the information visualization domain, to redesign an existing clinical report. We used expert consults and an online questionnaire to discover various stakeholders' needs around the types of data and tasks related to TB that they encounter in their daily workflow. We also evaluated their perceptions of and familiarity with genomic data, as well as its utility at various clinical decision points. These data shaped the design of multiple prototype reports that were compared against the existing report through a second online survey, with the resulting qualitative and quantitative data informing the final, redesigned, report. We recruited 78 participants, 65 of whom were clinicians, nurses, laboratorians, researchers, and epidemiologists involved in TB diagnosis, treatment, and/or surveillance. Our first survey indicated that participants were largely enthusiastic about genomic data, with the majority agreeing on its utility for certain TB diagnosis and treatment tasks and many reporting some confidence in their ability to interpret this type of data (between 58.8% and 94.1%, depending on the specific data type). When we compared our four prototype reports against the existing design, we found that for the majority (86.7%) of design comparisons, participants preferred the

  6. Identification of reference genes for quantitative expression analysis using large-scale RNA-seq data of Arabidopsis thaliana and model crop plants.

    Science.gov (United States)

    Kudo, Toru; Sasaki, Yohei; Terashima, Shin; Matsuda-Imai, Noriko; Takano, Tomoyuki; Saito, Misa; Kanno, Maasa; Ozaki, Soichi; Suwabe, Keita; Suzuki, Go; Watanabe, Masao; Matsuoka, Makoto; Takayama, Seiji; Yano, Kentaro

    2016-10-13

    In quantitative gene expression analysis, normalization using a reference gene as an internal control is frequently performed for appropriate interpretation of the results. Efforts have been devoted to exploring superior novel reference genes using microarray transcriptomic data and to evaluating commonly used reference genes by targeting analysis. However, because the number of specifically detectable genes is totally dependent on probe design in the microarray analysis, exploration using microarray data may miss some of the best choices for the reference genes. Recently emerging RNA sequencing (RNA-seq) provides an ideal resource for comprehensive exploration of reference genes since this method is capable of detecting all expressed genes, in principle including even unknown genes. We report the results of a comprehensive exploration of reference genes using public RNA-seq data from plants such as Arabidopsis thaliana (Arabidopsis), Glycine max (soybean), Solanum lycopersicum (tomato) and Oryza sativa (rice). To select reference genes suitable for the broadest experimental conditions possible, candidates were surveyed by the following four steps: (1) evaluation of the basal expression level of each gene in each experiment; (2) evaluation of the expression stability of each gene in each experiment; (3) evaluation of the expression stability of each gene across the experiments; and (4) selection of top-ranked genes, after ranking according to the number of experiments in which the gene was expressed stably. Employing this procedure, 13, 10, 12 and 21 top candidates for reference genes were proposed in Arabidopsis, soybean, tomato and rice, respectively. Microarray expression data confirmed that the expression of the proposed reference genes under broad experimental conditions was more stable than that of commonly used reference genes. These novel reference genes will be useful for analyzing gene expression profiles across experiments carried out under various

  7. Exploring Valid Reference Genes for Quantitative Real-time PCR Analysis in Plutella xylostella (Lepidoptera: Plutellidae)

    Science.gov (United States)

    Fu, Wei; Xie, Wen; Zhang, Zhuo; Wang, Shaoli; Wu, Qingjun; Liu, Yong; Zhou, Xiaomao; Zhou, Xuguo; Zhang, Youjun

    2013-01-01

    Abstract: Quantitative real-time PCR (qRT-PCR), a primary tool in gene expression analysis, requires an appropriate normalization strategy to control for variation among samples. The best option is to compare the mRNA level of a target gene with that of reference gene(s) whose expression level is stable across various experimental conditions. In this study, expression profiles of eight candidate reference genes from the diamondback moth, Plutella xylostella, were evaluated under diverse experimental conditions. RefFinder, a web-based analysis tool, integrates four major computational programs including geNorm, Normfinder, BestKeeper, and the comparative ΔCt method to comprehensively rank the tested candidate genes. Elongation factor 1 (EF1) was the most suited reference gene for the biotic factors (development stage, tissue, and strain). In contrast, although appropriate reference gene(s) do exist for several abiotic factors (temperature, photoperiod, insecticide, and mechanical injury), we were not able to identify a single universal reference gene. Nevertheless, a suite of candidate reference genes were specifically recommended for selected experimental conditions. Our finding is the first step toward establishing a standardized qRT-PCR analysis of this agriculturally important insect pest. PMID:23983612
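    The comparative ΔCt ranking mentioned in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the RefFinder implementation: the function name `delta_ct_stability`, the gene labels other than EF1, and all Ct values are hypothetical. For every pair of candidate genes the per-sample Ct difference is computed, and a gene whose differences vary little against all other candidates is treated as more stable.

```python
from itertools import combinations
from statistics import mean, stdev


def delta_ct_stability(ct):
    """Rank candidate reference genes by a comparative delta-Ct score.

    ct maps a gene name to its Ct values, one per sample, in the same
    sample order for every gene. Returns (gene, score) pairs sorted from
    most stable (lowest score) to least stable.
    """
    pair_sd = {gene: [] for gene in ct}
    for a, b in combinations(ct, 2):
        # Per-sample Ct difference between the two genes.
        deltas = [x - y for x, y in zip(ct[a], ct[b])]
        sd = stdev(deltas)
        pair_sd[a].append(sd)
        pair_sd[b].append(sd)
    # Score = mean standard deviation against all other candidates.
    scores = [(gene, mean(sds)) for gene, sds in pair_sd.items()]
    return sorted(scores, key=lambda item: (item[1], item[0]))


# Made-up Ct values for three candidates across three samples.
example_ct = {
    "EF1": [20.0, 21.0, 22.0],
    "GENE_B": [18.0, 19.0, 20.0],   # hypothetical gene, tracks EF1
    "GENE_C": [15.0, 19.0, 14.0],   # hypothetical gene, unstable
}
ranking = delta_ct_stability(example_ct)
```

    With these invented values the two genes that move together share the lowest score and the erratic one ranks last; real data would need replicate handling and PCR efficiency checks that this sketch leaves out.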

  8. Accident Sequence Precursor Analysis for SGTR by Using Dynamic PSA Approach

    International Nuclear Information System (INIS)

    Lee, Han Sul; Heo, Gyun Young; Kim, Tae Wan

    2016-01-01

    To address this issue, this study proposes a sequence tree model for analyzing accident sequences systematically. With the sequence tree model, all scenarios that require a specific safety action to prevent core damage can be identified, along with the success conditions of that safety action under complicated situations such as combined accidents. The sequence tree is a branching model that divides plant conditions according to the plant dynamics. Because the model reflects the plant dynamics that arise from the interaction of accident timing and plant condition, and from the interaction between operator actions, mitigation systems, and the indicators for operation, it can readily be used to develop a dynamic event tree model. The target safety action for this study is a feed-and-bleed (F&B) operation. An F&B operation directly cools down the reactor cooling system (RCS) using the primary cooling system when residual heat removal by the secondary cooling system is not available. The target accidents were a TLOFW accident and a TLOFW accident with LOCA. Based on the conventional PSA model and indicators, the sequence tree model for a TLOFW accident was developed. Based on the results of a sampling analysis and data from the conventional PSA model, the CDF caused by Sequence no. 26 can be realistically estimated. For a TLOFW accident with LOCA, second accident timings were categorized according to plant condition. Indicators were selected as branch points using the flow chart and tables, and a corresponding sequence tree model was developed. If sampling analysis is performed, practical accident sequences can be identified based on the sequence analysis. If a realistic distribution for the variables can be obtained for sampling analysis, much more realistic accident sequences can be described. Moreover, if the initiating event frequency under a combined accident can be quantified, the sequence tree model

  9. Sequencing and phylogenetic analysis of Herpes simplex virus type ...

    African Journals Online (AJOL)

    To determine the genetic relationship of the HSV-2 glycoprotein G gene (gG) in Iran with those in other countries, a 1100 bp DNA fragment corresponding to gG was isolated from six HSV-2 strains obtained from infected human sera samples in Iran, amplified by PCR, and sequenced for determining ...

  10. Transcriptome analysis of blueberry using 454 EST sequencing

    Science.gov (United States)

    Blueberry (Vaccinium corymbosum) is a major berry crop in the United States, and one that has great nutritional and economic value. Next generation sequencing methodologies, such as 454, have been demonstrated to be successful and efficient in producing a snap-shot of transcriptional activities du...

  11. Characterization and sequence analysis of cysteine and glycine-rich ...

    African Journals Online (AJOL)

    Tarek

    2011-04-18

    Apr 18, 2011 ... nucleotide alignment of both native buffalo and cattle CSRP3 cDNAs sequences ..... Exon III, Identities = 71/75 (94%), Gaps = 1/75 (1%) Strand=Plus/Plus ... Band MR, Larson JH, Rebeiz M, Green CA, Heyen DW, Donovan J,.

  12. Functional analysis of bipartite begomovirus coat protein promoter sequences

    International Nuclear Information System (INIS)

    Lacatus, Gabriela; Sunter, Garry

    2008-01-01

    We demonstrate that the AL2 gene of Cabbage leaf curl virus (CaLCuV) activates the CP promoter in mesophyll and acts to derepress the promoter in vascular tissue, similar to that observed for Tomato golden mosaic virus (TGMV). Binding studies indicate that sequences mediating repression and activation of the TGMV and CaLCuV CP promoter specifically bind different nuclear factors common to Nicotiana benthamiana, spinach and tomato. However, chromatin immunoprecipitation demonstrates that TGMV AL2 can interact with both sequences independently. Binding of nuclear protein(s) from different crop species to viral sequences conserved in both bipartite and monopartite begomoviruses, including TGMV, CaLCuV, Pepper golden mosaic virus and Tomato yellow leaf curl virus suggests that bipartite begomoviruses bind common host factors to regulate the CP promoter. This is consistent with a model in which AL2 interacts with different components of the cellular transcription machinery that bind viral sequences important for repression and activation of begomovirus CP promoters

  13. The DNA sequence, annotation and analysis of human chromosome 3

    DEFF Research Database (Denmark)

    Muzny, D.M.; Bolund, Lars; et al. (as part of the Chinese Human Genome Sequencing Consortium)

    2006-01-01

    as numerous loci involved in multiple human cancers such as the gene encoding FHIT, which contains the most common constitutive fragile site in the genome, FRA3B. Using genomic sequence from chimpanzee and rhesus macaque, we were able to characterize the breakpoints defining a large pericentric inversion...

  14. Sequence analysis of mitochondrial 16S ribosomal RNA gene

    Indian Academy of Sciences (India)

    Mosquitoes are vectors for the transmission of many human pathogens that include viruses, nematodes and protozoa. For the understanding of their vectorial capacity, identification of disease carrying and refractory strains is essential. Recently, molecular taxonomic techniques have been utilized for this purpose. Sequence ...

  15. Illumina-based de novo transcriptome sequencing and analysis

    Indian Academy of Sciences (India)

    In the present study, we used Illumina HiSeq technology to perform de novo assembly of heart and musk gland transcriptomes from the Chinese forest musk deer. A total of 239,383 transcripts and 176,450 unigenes were obtained, of which 37,329 unigenes were matched to known sequences in the NCBI nonredundant ...

  16. Generation and analysis of expressed sequence tags from Botrytis cinerea

    Directory of Open Access Journals (Sweden)

    Evelyn Silva

    2006-01-01

    Botrytis cinerea is a filamentous plant pathogen of a wide range of plant species, and its infection may cause enormous damage both during plant growth and in the post-harvest phase. We have constructed a cDNA library from an isolate of B. cinerea and have sequenced 11,482 expressed sequence tags that were assembled into 1,003 contig sequences and 3,032 singletons. Approximately 81% of the unigenes showed significant similarity to genes coding for proteins with known functions: more than 50% of the sequences code for genes involved in cellular metabolism, 12% for transport of metabolites, and approximately 10% for cellular organization. Other functional categories include responses to biotic and abiotic stimuli, cell communication, cell homeostasis, and cell development. We carried out pair-wise comparisons with fungal databases to determine the B. cinerea unisequence set with relevant similarity to genes in other fungal pathogenic counterparts. Among the 4,035 non-redundant B. cinerea unigenes, 1,338 (23%) have significant homology with Fusarium verticillioides unigenes. Similar values were obtained for Saccharomyces cerevisiae and Aspergillus nidulans (22% and 24%, respectively). The lower percentages of homology were with Magnaporthe grisea and Neurospora crassa (13% and 19%, respectively). Several genes involved in putative and known fungal virulence and general pathogenicity were identified. The results provide important information for future research on this fungal pathogen.

  17. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome seque...

  18. DNA sequence and prokaryotic expression analysis of vitellogenin ...

    African Journals Online (AJOL)

    In this study, the DNA sequence of vitellogenin from Antheraea pernyi (Ap-Vg) was identified and its functional domain (30-740 aa, Ap-Vg-1) was expressed in Escherichia coli BL21 (DE3) cells. The recombinant Ap-Vg-1 proteins were purified and used for antibody preparation. The results showed that the intact DNA ...

  19. Molecular cloning, sequence analysis and structure prediction of the ...

    African Journals Online (AJOL)

    AJL

    2012-04-19

    Apr 19, 2012 ... The primers were based on the rBAT sequences of other animals deposited in GenBank. .... fragment; M1, 2000 bp DNA ladder; M2, 1000 bp DNA ladder. spliced to obtain the ..... A traffic signal for heterodimeric amino acid.

  20. A bibliometric analysis of global research on genome sequencing ...

    African Journals Online (AJOL)

    The results show that disease and protein related researches were the leading research focuses, and comparative genomics and evolution related research had strong potential in the near future. Key words: Genome sequencing, research trend, scientometrics, science citation index expanded (SCI-Expanded), word cluster ...

  1. Cloning and sequence analysis of the defective in anther ...

    African Journals Online (AJOL)

    To clone the defective in anther dehiscence1 (DAD1) gene fragment of Chinese kale, about 700 bp product was obtained by PCR amplification using Chinese kale genomic DNA as the template and a pair of specific primers designed according to the conserved sequence of DAD1 genes of Arabidopsis thaliana and ...

  2. Sequence and comparative analysis of Leuconostoc dairy bacteriophages

    DEFF Research Database (Denmark)

    Kot, Witold; Hansen, Lars Henrik; Neve, Horst

    2014-01-01

    Bacteriophages attacking Leuconostoc species may significantly influence the quality of the final product. There is however limited knowledge of this group of phages in the literature. We have determined the complete genome sequences of nine Leuconostoc bacteriophages virulent to either Leuconostoc...

  3. Cytosolic Glutamine Synthetase is Important for Photosynthetic Efficiency and Water Use Efficiency in Potato as Revealed by High Throughput Sequencing QTL analysis

    DEFF Research Database (Denmark)

    Kaminski, Kacper Piotr; Sørensen, Kirsten Kørup; Andersen, Mathias Neumann

    2015-01-01

    was observed. Two extreme WUE bulks of clones were identified and pools of genomic DNA from them as well as the parents were sequenced and mapped to reference potato genome. Following a novel data analysis approach, two highly resolved QTLs were found on chromosome 1 and 9. Interestingly, three genes encoding...

  4. [International reference prices and cost minimization analysis for the regulation of medicine prices in Colombia].

    Science.gov (United States)

    Vacca, Claudia; Acosta, Angela; Rodriguez, Ivan

    2011-01-01

    The aim was to propose a decision-making scheme for pricing medicines that fall under the Free Regulated Regime, one of the regulatory categories of the pharmaceutical pricing policy in Colombia. The scheme includes two regulation tools: international reference prices and a cost minimization analysis methodology. Following the current pricing policy, international reference prices were built with data from five countries for selected medicines under the Free Regulated Regime. The cost minimization analysis methodology includes selection of those medicines under the Free Regulated Regime that have possible comparable medicines, selection of the comparable medicines, and evaluation of treatment costs. The estimate of international reference prices showed that four medicines had a higher price in the domestic pharmaceutical market than the reference price. A decision-making scheme was designed containing two possible regulation tools for medicines under the Free Regulated Regime: estimation of international reference prices and cost minimization analysis. This scheme would be useful to assist the pricing regulation of the Free Regulated Regime in Colombia. As the present results show, international reference prices make clear when domestic prices are higher than those of reference countries. In the current regulation of pharmaceutical prices in Colombia, the international reference price has been applied to four medicines. It would be suitable to extend this methodology to other medicines with a high impact on pharmaceutical expenditure, in particular those covered by public funding. The availability of primary sources on treatment costs in Colombia needs to be improved as a requirement for developing pharmaco-economic evidence. SISMED is an official database that represents an important primary source of medicine prices in Colombia. Nevertheless, taking into account that SISMED represents an important advance in the transparency of medicine prices, it needs to be improved in quality and data

  5. XplorSeq: a software environment for integrated management and phylogenetic analysis of metagenomic sequence data.

    Science.gov (United States)

    Frank, Daniel N

    2008-10-07

    Advances in automated DNA sequencing technology have accelerated the generation of metagenomic DNA sequences, especially environmental ribosomal RNA gene (rDNA) sequences. As the scale of rDNA-based studies of microbial ecology has expanded, need has arisen for software that is capable of managing, annotating, and analyzing the plethora of diverse data accumulated in these projects. XplorSeq is a software package that facilitates the compilation, management and phylogenetic analysis of DNA sequences. XplorSeq was developed for, but is not limited to, high-throughput analysis of environmental rRNA gene sequences. XplorSeq integrates and extends several commonly used UNIX-based analysis tools by use of a Macintosh OS X-based graphical user interface (GUI). Through this GUI, users may perform basic sequence import and assembly steps (base-calling, vector/primer trimming, contig assembly), perform BLAST (Basic Local Alignment Search Tool) searches of NCBI and local databases, create multiple sequence alignments, build phylogenetic trees, assemble Operational Taxonomic Units, estimate biodiversity indices, and summarize data in a variety of formats. Furthermore, sequences may be annotated with user-specified meta-data, which then can be used to sort data and organize analyses and reports. A document-based architecture permits parallel analysis of sequence data from multiple clones or amplicons, with sequences and other data stored in a single file. XplorSeq should benefit researchers who are engaged in analyses of environmental sequence data, especially those with little experience using bioinformatics software. Although XplorSeq was developed for management of rDNA sequence data, it can be applied to almost any sequencing project. The application is available free of charge for non-commercial use at http://vent.colorado.edu/phyloware.

  6. Transcriptome Sequencing, De Novo Assembly and Differential Gene Expression Analysis of the Early Development of Acipenser baeri.

    Directory of Open Access Journals (Sweden)

    Wei Song

    The molecular mechanisms that drive the development of the endangered fossil fish species Acipenser baeri are difficult to study due to the lack of genomic data. Recent advances in sequencing technologies and the falling cost of sequencing offer exceptional opportunities for exploring important molecular mechanisms underlying specific biological processes. This manuscript describes the large scale sequencing and analyses of mRNA from Acipenser baeri collected at five development time points using the Illumina HiSeq 2000 platform. The sequencing reads were de novo assembled and clustered into 278167 unigenes, of which 57346 (20.62%) had 45837 known homologous proteins in Uniprot protein databases while 11509 proteins matched with at least one sequence of assembled unigenes. The remaining 79.38% of unigenes could stand for non-coding unigenes or unigenes specific to A. baeri. A total of 43062 unigenes were annotated into functional categories via Gene Ontology (GO) annotation whereas 29526 unigenes were associated with 329 pathways by mapping to the KEGG database. Subsequently, 3479 differentially expressed genes were identified across developmental stages and clustered into 50 gene expression profiles. Genes preferentially expressed at each stage were also identified. Through GO and KEGG pathway enrichment analysis, relevant physiological variations during the early development of A. baeri could be better understood. Accordingly, the present study gives insights into the transcriptome profile of the early development of A. baeri, and the information contained in this large scale transcriptome will provide substantial references for A. baeri developmental biology and promote its aquaculture research.

  7. Sequence analysis of the Legionella micdadei groELS operon

    DEFF Research Database (Denmark)

    Hindersson, P; Høiby, N; Bangsborg, Jette Marie

    1991-01-01

    A 2.7 kb DNA fragment encoding the 60 kDa common antigen (CA) and a 13 kDa protein of Legionella micdadei was sequenced. Two open reading frames of 57,677 and 10,456 Da were identified, corresponding to the heat shock proteins GroEL and GroES, respectively. Typical -35, -10, and Shine-Dalgarno heat...

  8. The Matrix Method of Representation, Analysis and Classification of Long Genetic Sequences

    Directory of Open Access Journals (Sweden)

    Ivan V. Stepanyan

    2017-01-01

    The article is devoted to a matrix method of comparative analysis of long nucleotide sequences by means of presenting each sequence in the form of three digital binary sequences. This method uses a set of symmetries of biochemical attributes of nucleotides. It also uses the possibility of presentation of every whole set of N-mers as one of the members of a Kronecker family of genetic matrices. With this method, a long nucleotide sequence can be visually represented as an individual fractal-like mosaic or another regular mosaic of binary type. In contrast to natural nucleotide sequences, artificial random sequences give non-regular patterns. Examples of binary mosaics of long nucleotide sequences are shown, including cases of human chromosomes and penicillins. The obtained results are then discussed.
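    The idea of presenting one nucleotide sequence as three binary sequences can be illustrated with a short sketch. The abstract does not list the attributes used, so the three chosen here (purine vs. pyrimidine, amino vs. keto, strong vs. weak hydrogen bonding) are an assumption based on standard nucleotide chemistry, and the function name `binary_tracks` is hypothetical. The Kronecker-matrix and mosaic steps of the article are not reproduced.

```python
# Assumed binary attributes of the four DNA bases (not taken from the abstract).
PURINE = frozenset("AG")   # 1 = purine (A, G), 0 = pyrimidine (C, T)
AMINO = frozenset("AC")    # 1 = amino (A, C), 0 = keto (G, T)
STRONG = frozenset("CG")   # 1 = three hydrogen bonds (C, G), 0 = two (A, T)


def binary_tracks(sequence):
    """Return three 0/1 lists, one per assumed attribute, for a DNA string."""
    sequence = sequence.upper()
    unknown = set(sequence) - set("ACGT")
    if unknown:
        raise ValueError(f"unsupported symbols: {sorted(unknown)}")
    return tuple(
        [1 if base in attribute else 0 for base in sequence]
        for attribute in (PURINE, AMINO, STRONG)
    )


purine_track, amino_track, strong_track = binary_tracks("ACGT")
```

    Each track has the same length as the input, so the three lists together still determine the original sequence under this encoding.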

  9. Sequence analysis of Maturase K (matK): A chloroplast-encoding ...

    African Journals Online (AJOL)

    The application and utilization of sequence data has been found very informative in the characterization and phylogenetic relationship of different crop species. This study aimed to use bioinformatics tools to characterize the matK gene in some selected legumes with special reference to pigeon pea [Cajanus cajan ...

  10. OPTSDNA: Performance evaluation of an efficient distributed bioinformatics system for DNA sequence analysis.

    Science.gov (United States)

    Khan, Mohammad Ibrahim; Sheel, Chotan

    2013-01-01

    Storage of sequence data is a major concern because the amount of data generated at multiple locations is growing exponentially. There is therefore a need for techniques that store data using compression algorithms. Here we describe an optimal storage algorithm (OPTSDNA) for storing large numbers of DNA sequences of varying length. This paper provides a performance analysis of OPTSDNA within a distributed bioinformatics computing system for the analysis of DNA sequences. The algorithm was used to store DNA sequences ranging in size from very small to very large in a database. Both the storage size and the response time were calculated. In terms of storage size, reported as a percentage, the algorithm is more efficient than other known sequential approaches.
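    The abstract above does not describe how OPTSDNA encodes sequences, so the following is not that algorithm. It is only a generic sketch of why compact DNA storage is possible: with four bases, each base fits in 2 bits instead of an 8-bit character. The names `pack` and `unpack` are hypothetical, and the big-integer approach is chosen for brevity rather than speed on very long inputs.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(sequence):
    """Pack an A/C/G/T string into bytes at 2 bits per base.

    Returns (length, data); the length is kept so that leading 'A'
    bases (code 0) are not lost when unpacking.
    """
    value = 0
    for base in sequence.upper():
        value = (value << 2) | CODE[base]
    n_bytes = (2 * len(sequence) + 7) // 8
    return len(sequence), value.to_bytes(n_bytes, "big")


def unpack(length, data):
    """Invert pack(): recover the original upper-case sequence."""
    value = int.from_bytes(data, "big")
    return "".join(
        BASES[(value >> (2 * (length - 1 - i))) & 3] for i in range(length)
    )


length, packed = pack("ACGTACGT")
restored = unpack(length, packed)
# Packed size as a percentage of one byte per base.
ratio_percent = 100 * len(packed) / length
```

    For this eight-base example the packed form occupies two bytes, a quarter of the plain-text size; ambiguity codes such as N would need a wider alphabet than this sketch supports.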

  11. A content analysis of displayed alcohol references on a social networking web site.

    Science.gov (United States)

    Moreno, Megan A; Briner, Leslie R; Williams, Amanda; Brockman, Libby; Walker, Leslie; Christakis, Dimitri A

    2010-08-01

    Exposure to alcohol use in media is associated with adolescent alcohol use. Adolescents frequently display alcohol references on Internet media, such as social networking web sites. The purpose of this study was to conduct a theoretically based content analysis of older adolescents' displayed alcohol references on a social networking web site. We evaluated 400 randomly selected public MySpace profiles of self-reported 17- to 20-year-olds from zip codes, representing urban, suburban, and rural communities in one Washington county. Content was evaluated for alcohol references, suggesting: (1) explicit versus figurative alcohol use, (2) alcohol-related motivations, associations, and consequences, including references that met CRAFFT problem drinking criteria. We compared profiles from four target zip codes for prevalence and frequency of alcohol display. Of 400 profiles, 225 (56.3%) contained 341 references to alcohol. Profile owners who displayed alcohol references were mostly male (54.2%) and white (70.7%). The most frequent reference category was explicit use (49.3%); the most commonly displayed alcohol use motivation was peer pressure (4.7%). Few references met CRAFFT problem drinking criteria (3.2%). There were no differences in prevalence or frequency of alcohol display among the four sociodemographic communities. Despite alcohol use being illegal and potentially stigmatizing in this population, explicit alcohol use is frequently referenced on adolescents' MySpace profiles across several sociodemographic communities. Motivations, associations, and consequences regarding alcohol use referenced on MySpace appear consistent with previous studies of adolescent alcohol use. These references may be a potent source of influence on adolescents, particularly given that they are created and displayed by peers. (c) 2010 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.

  12. RNA-sequence data normalization through in silico prediction of reference genes: the bacterial response to DNA damage as case study.

    Science.gov (United States)

    Berghoff, Bork A; Karlsson, Torgny; Källman, Thomas; Wagner, E Gerhart H; Grabherr, Manfred G

    2017-01-01

    Measuring how gene expression changes in the course of an experiment assesses how an organism responds on a molecular level. Sequencing of RNA molecules, and their subsequent quantification, aims to assess global gene expression changes on the RNA level (transcriptome). While advances in high-throughput RNA-sequencing (RNA-seq) technologies allow for inexpensive data generation, accurate post-processing and normalization across samples is required to eliminate any systematic noise introduced by the biochemical and/or technical processes. Existing methods thus either normalize on selected known reference genes that are invariant in expression across the experiment, assume that the majority of genes are invariant, or assume that the effects of up- and down-regulated genes cancel each other out during the normalization. Here, we present a novel method, moose2, which predicts invariant genes in silico through a dynamic programming (DP) scheme and applies a quadratic normalization based on this subset. The method allows for specifying a set of known or experimentally validated invariant genes, which guides the DP. We experimentally verified the predictions of this method in the bacterium Escherichia coli, and show how moose2 is able to (i) estimate the expression value distances between RNA-seq samples, (ii) reduce the variation of expression values across all samples, and (iii) subsequently reveal new functional groups of genes during the late stages of DNA damage. We further applied the method to three eukaryotic data sets, on which its performance compares favourably to other methods. The software is implemented in C++ and is publicly available from http://grabherr.github.io/moose2/. The proposed RNA-seq normalization method, moose2, is a valuable alternative to existing methods, with two major advantages: (i) in silico prediction of invariant genes provides a list of potential reference genes for downstream analyses, and (ii) non-linear artefacts in RNA-seq data

  13. Analysis of xylem formation in pine by cDNA sequencing

    Science.gov (United States)

    Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.

    1998-01-01

    Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.

  14. MiSeq: A Next Generation Sequencing Platform for Genomic Analysis.

    Science.gov (United States)

    Ravi, Rupesh Kanchi; Walton, Kendra; Khosroheidari, Mahdieh

    2018-01-01

    MiSeq, Illumina's integrated next generation sequencing instrument, uses reversible-terminator sequencing-by-synthesis technology to provide end-to-end sequencing solutions. The MiSeq instrument is one of the smallest benchtop sequencers that can perform onboard cluster generation, amplification, genomic DNA sequencing, and data analysis, including base calling, alignment and variant calling, in a single run. It performs both single- and paired-end runs with adjustable read lengths from 1 × 36 base pairs to 2 × 300 base pairs. A single run can produce output data of up to 15 Gb in as little as 4 h of runtime and can output up to 25 M single reads and 50 M paired-end reads. Thus, MiSeq provides an ideal platform for rapid turnaround time. MiSeq is also a cost-effective tool for various analyses focused on targeted gene sequencing (amplicon sequencing and target enrichment), metagenomics, and gene expression studies. For these reasons, MiSeq has become one of the most widely used next generation sequencing platforms. Here, we provide a protocol to prepare libraries for sequencing using the MiSeq instrument and basic guidelines for analysis of output data from the MiSeq sequencing run.

  15. Viral metagenomics: Analysis of begomoviruses by illumina high-throughput sequencing

    KAUST Repository

    Idris, Ali

    2014-03-12

    Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and are ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes) (genus, Begomovirus; family, Geminiviridae) were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA). Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS). CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and reference-guided contig assembly of viral-satellite sequences. The authenticity of the begomoviral sequences, and the reproducibility of the Illumina-NGS approach for begomoviral deep sequencing projects, were validated by comparing NGS results with those obtained using traditional molecular cloning and Sanger sequencing of viral components and satellite DNAs, also enriched by RCA or amplified by polymerase chain reaction. As the use of NGS approaches, together with advances in software development, makes possible deep sequence coverage at a lower cost, the approach described herein will streamline the exploration of begomovirus diversity and population structure from naturally infected plants, irrespective of viral abundance. This is the first report of the implementation of Illumina-NGS to explore the diversity and identify begomoviral-satellite SNPs directly from plants naturally infected with begomoviruses under field conditions. © 2014 by the authors; licensee MDPI, Basel, Switzerland.

  16. Viral Metagenomics: Analysis of Begomoviruses by Illumina High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Ali Idris

    2014-03-01

    Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and are ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes) (genus, Begomovirus; family, Geminiviridae) were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA). Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS). CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and reference-guided contig assembly of viral-satellite sequences. The authenticity of the begomoviral sequences, and the reproducibility of the Illumina-NGS approach for begomoviral deep sequencing projects, were validated by comparing NGS results with those obtained using traditional molecular cloning and Sanger sequencing of viral components and satellite DNAs, also enriched by RCA or amplified by polymerase chain reaction. As the use of NGS approaches, together with advances in software development, makes possible deep sequence coverage at a lower cost, the approach described herein will streamline the exploration of begomovirus diversity and population structure from naturally infected plants, irrespective of viral abundance. This is the first report of the implementation of Illumina-NGS to explore the diversity and identify begomoviral-satellite SNPs directly from plants naturally infected with begomoviruses under field conditions.

  17. Maturity onset diabetes of youth (MODY) in Turkish children: sequence analysis of 11 causative genes by next generation sequencing.

    Science.gov (United States)

    Ağladıoğlu, Sebahat Yılmaz; Aycan, Zehra; Çetinkaya, Semra; Baş, Veysel Nijat; Önder, Aşan; Peltek Kendirci, Havva Nur; Doğan, Haldun; Ceylaner, Serdar

    2016-04-01

    Maturity-onset diabetes of the youth (MODY) is a genetically and clinically heterogeneous group of diseases and is often misdiagnosed as type 1 or type 2 diabetes. The aim of this study is to investigate both novel and proven mutations of 11 MODY genes in Turkish children by using targeted next generation sequencing. A panel of 11 MODY genes was screened in 43 children with MODY diagnosed by clinical criteria. Studies of index cases were done with MISEQ-ILLUMINA, and family screenings and confirmation studies of mutations were done by Sanger sequencing. We identified 28 (65%) point mutations among 43 patients. Eighteen patients have GCK mutations, four have HNF1A, one has HNF4A, one has HNF1B, two have NEUROD1, one has PDX1 gene variations and one patient has both HNF1A and HNF4A heterozygote mutations. This is the first study including molecular studies of 11 MODY genes in Turkish children. GCK is the most frequent type of MODY in our study population. The very high frequency of novel mutations (42%) in our study population supports the view that, in heterogeneous disorders like MODY, sequence analysis provides rapid, cost-effective and accurate genetic diagnosis.

  18. Mathematical Practice in Textbooks Analysis: Praxeological Reference Models, the Case of Proportion

    Science.gov (United States)

    Wijayanti, Dyana; Winsløw, Carl

    2017-01-01

    We present a new method in textbook analysis, based on so-called praxeological reference models focused on specific content at task level. This method implies that the mathematical contents of a textbook (or textbook part) is analyzed in terms of the tasks and techniques which are exposed to or demanded from readers; this can then be interpreted…

  19. Whole genome sequencing and bioinformatics analysis of two Egyptian genomes.

    Science.gov (United States)

    ElHefnawi, Mahmoud; Jeon, Sungwon; Bhak, Youngjune; ElFiky, Asmaa; Horaiz, Ahmed; Jun, JeHoon; Kim, Hyunho; Bhak, Jong

    2018-05-15

    We report two Egyptian male genomes (EGP1 and EGP2) sequenced at ~30× sequencing depth. EGP1 had 4.7 million variants, of which 198,877 were novel, while EGP2 had 209,109 novel variants out of 4.8 million variants. The mitochondrial haplogroups of the two individuals were identified as H7b1 and L2a1c, respectively. We also identified the Y haplogroup of EGP1 (R1b) and EGP2 (J1a2a1a2 > P58 > FGC11). EGP1 had a mutation in the NADH gene of the mitochondrial genome ND4 (m.11778 G > A) that causes Leber's hereditary optic neuropathy. Some SNPs shared by the two genomes were associated with an increased level of cholesterol and triglycerides, probably related to obesity in Egyptians. Comparison of these genomes with African and Western-Asian genomes can provide insights on Egyptian ancestry and genetic history. This resource can be used to further understand genomic diversity and functional classification of variants as well as human migration and evolution across Africa and Western-Asia. Copyright © 2017. Published by Elsevier B.V.

  20. Accident sequence precursor analysis level 2/3 model development

    International Nuclear Information System (INIS)

    Lui, C.H.; Galyean, W.J.; Brownson, D.A.

    1997-01-01

    The US Nuclear Regulatory Commission's Accident Sequence Precursor (ASP) program currently uses simple Level 1 models to assess the conditional core damage probability for operational events occurring in commercial nuclear power plants (NPP). Since not all accident sequences leading to core damage will result in the same radiological consequences, it is necessary to develop simple Level 2/3 models that can be used to analyze the response of the NPP containment structure in the context of a core damage accident, estimate the magnitude of the resulting radioactive releases to the environment, and calculate the consequences associated with these releases. The simple Level 2/3 model development work was initiated in 1995, and several prototype models have been completed. Once developed, these simple Level 2/3 models are linked to the simple Level 1 models to provide risk perspectives for operational events. This paper describes the methods implemented for the development of these simple Level 2/3 ASP models, and the linkage process to the existing Level 1 models

  1. In Vivo Enhancer Analysis Chromosome 16 Conserved Noncoding Sequences

    Energy Technology Data Exchange (ETDEWEB)

    Pennacchio, Len A.; Ahituv, Nadav; Moses, Alan M.; Nobrega, Marcelo; Prabhakar, Shyam; Shoukry, Malak; Minovitsky, Simon; Visel, Axel; Dubchak, Inna; Holt, Amy; Lewis, Keith D.; Plajzer-Frick, Ingrid; Akiyama, Jennifer; De Val, Sarah; Afzal, Veena; Black, Brian L.; Couronne, Olivier; Eisen, Michael B.; Rubin, Edward M.

    2006-02-01

    The identification of enhancers with predicted specificities in vertebrate genomes remains a significant challenge that is hampered by a lack of experimentally validated training sets. In this study, we leveraged extreme evolutionary sequence conservation as a filter to identify putative gene regulatory elements and characterized the in vivo enhancer activity of human-fish conserved and ultraconserved noncoding elements on human chromosome 16 as well as such elements from elsewhere in the genome. We initially tested 165 of these extremely conserved sequences in a transgenic mouse enhancer assay and observed that 48 percent (79/165) functioned reproducibly as tissue-specific enhancers of gene expression at embryonic day 11.5. While driving expression in a broad range of anatomical structures in the embryo, the majority of the 79 enhancers drove expression in various regions of the developing nervous system. Studying a set of DNA elements that specifically drove forebrain expression, we identified DNA signatures specifically enriched in these elements and used these parameters to rank all ~3,400 human-fugu conserved noncoding elements in the human genome. The testing of the top predictions in transgenic mice resulted in a three-fold enrichment for sequences with forebrain enhancer activity. These data dramatically expand the catalogue of in vivo-characterized human gene enhancers and illustrate the future utility of such training sets for a variety of biological applications including decoding the regulatory vocabulary of the human genome.

  2. Sequence analysis of putative swrW gene required for surfactant ...

    African Journals Online (AJOL)

    Serratia marcescens produces biosurfactant serrawettin, essential for its population migration behavior. Serrawettin W1 was revealed to be an antibiotic serratamolide that makes it significant for deoxyribonucleic acid (DNA) and protein sequence analysis. Four nucleotide and amino-acid sequences from local strains ...

  3. Selection of reference genes for expression analysis in the entomophthoralean fungus Pandora neoaphidis.

    Science.gov (United States)

    Chen, Chun; Xie, Tingna; Ye, Sudan; Jensen, Annette Bruun; Eilenberg, Jørgen

    2016-01-01

    The selection of suitable reference genes is crucial for accurate quantification of gene expression and can add to our understanding of host-pathogen interactions. To identify suitable reference genes in Pandora neoaphidis, an obligate aphid pathogenic fungus, the expression of three traditional candidate genes, 18S rRNA (18S), 28S rRNA (28S) and elongation factor 1 alpha-like protein (EF1), was measured by quantitative polymerase chain reaction at different developmental stages (conidia, conidia with germ tubes, short hyphae and elongated hyphae), and under different nutritional conditions. We calculated the expression stability of candidate reference genes using four algorithms: geNorm, NormFinder, BestKeeper and Delta Ct. The analysis revealed that the comprehensive ranking of candidate reference genes from the most stable to the least stable was 18S (1.189), 28S (1.414) and EF1 (3). 18S was, therefore, the most suitable reference gene for real-time RT-PCR analysis of gene expression under all conditions. These results will support further studies on gene expression in P. neoaphidis. Copyright © 2015 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.
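
    The combined values quoted above (1.189, 1.414 and 3) are consistent with a geometric mean of per-algorithm ranks, although the abstract does not say how the comprehensive ranking was computed. The sketch below works under that assumption; the individual rank vectors are hypothetical, chosen only because they reproduce the reported values, and are not taken from the paper.

```python
from math import prod

def comprehensive_rank(ranks):
    """Geometric mean of per-algorithm ranks (lower means more stable)."""
    return prod(ranks) ** (1.0 / len(ranks))

# Hypothetical ranks in the order geNorm, NormFinder, BestKeeper, Delta Ct.
# The abstract reports only the combined values, not these individual ranks.
hypothetical_ranks = {
    "18S": [1, 1, 1, 2],
    "28S": [1, 2, 2, 1],
    "EF1": [3, 3, 3, 3],
}

scores = {gene: comprehensive_rank(r) for gene, r in hypothetical_ranks.items()}
ordered = sorted(scores, key=scores.get)  # most stable gene first

for gene in ordered:
    print(f"{gene}: {scores[gene]:.3f}")
```

    Under this assumption a gene ranked first by three of the four algorithms and second by the remaining one scores 2^(1/4), which rounds to the 1.189 reported for 18S.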

  4. Selection of reference genes for expression analysis in the entomophthoralean fungus Pandora neoaphidis

    Directory of Open Access Journals (Sweden)

    Chun Chen

    2016-03-01

    The selection of suitable reference genes is crucial for accurate quantification of gene expression and can add to our understanding of host–pathogen interactions. To identify suitable reference genes in Pandora neoaphidis, an obligate aphid pathogenic fungus, the expression of three traditional candidate genes, 18S rRNA (18S), 28S rRNA (28S) and elongation factor 1 alpha-like protein (EF1), was measured by quantitative polymerase chain reaction at different developmental stages (conidia, conidia with germ tubes, short hyphae and elongated hyphae), and under different nutritional conditions. We calculated the expression stability of candidate reference genes using four algorithms: geNorm, NormFinder, BestKeeper and Delta Ct. The analysis revealed that the comprehensive ranking of candidate reference genes from the most stable to the least stable was 18S (1.189), 28S (1.414) and EF1 (3). 18S was, therefore, the most suitable reference gene for real-time RT-PCR analysis of gene expression under all conditions. These results will support further studies on gene expression in P. neoaphidis.

  5. Initial testing of a neutron activation analysis system by analysing standard reference materials

    International Nuclear Information System (INIS)

    Suhaimi Hamzah; Roslan Idris; Abdul Khalik Haji Wood; Che Seman Mahmood; Abdul Rahim Mohamad Noor.

    1983-01-01

    This paper describes the data acquisition and processing system in our laboratories (ND6600), the methods of activation analysis and the results obtained from our analysis of IAEA standard reference material (SL-1 lake sediments and NBS coal ash 1632a). These standards were analysed in order to check the capability of the system, which was designed in such a way as to enable the user to independently collect and process data from multiple radiation detectors. (author)

  6. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Directory of Open Access Journals (Sweden)

    Wang Jintu

    2011-04-01

    Background: Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result: To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on the zebrafish genome, which demonstrates the high similarity between zebrafish and common carp genomes and indicates the feasibility of comparative mapping between zebrafish and common carp once we have a physical map of common carp. Conclusion: BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3

  7. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Science.gov (United States)

    2011-01-01

    Background: Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result: To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on the zebrafish genome, which demonstrates the high similarity between zebrafish and common carp genomes and indicates the feasibility of comparative mapping between zebrafish and common carp once we have a physical map of common carp. Conclusion: BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3,100 microsyntenies, covering over 50% of
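
    The read statistics quoted in this record can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the abstract; the variable names are illustrative, and the implied genome size is a back-of-the-envelope estimate derived from the rounded 2.5% figure, not a value reported by the authors.

```python
# Figures quoted in the abstract.
n_bes = 65_720                  # clean BAC end sequences
mean_len_bp = 647               # reported average read length (rounded)
total_bp_reported = 42_522_168  # reported total sequence length
genome_fraction = 0.025         # "2.5% of the common carp genome"

# Total length implied by the rounded mean read length.
total_bp_estimated = n_bes * mean_len_bp

# Mean read length implied by the reported total.
mean_len_implied = total_bp_reported / n_bes

# Rough genome size implied by the stated coverage fraction (an estimate only).
implied_genome_bp = total_bp_reported / genome_fraction

print(total_bp_estimated)
print(round(mean_len_implied, 2))
print(round(implied_genome_bp / 1e9, 2), "Gb")
```

    The small gap between the estimated and reported totals is what one would expect if the 647 bp average had been rounded to the nearest base pair.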

  8. Multilocus sequence typing and phylogenetic analysis of Propionibacterium acnes

    DEFF Research Database (Denmark)

    Kilian, Mogens; Scholz, Christian F. P.; Lomholt, Hans B.

    2012-01-01

    Propionibacterium acnes is a commensal of human skin but is also implicated in the pathogenesis of acne vulgaris, in biofilm-associated infections of medical devices and endophthalmitis, and in infections of bone and dental root canals. Recent studies associate P. acnes with prostate cancer...... schemes were compared with reference to a phylogenetic tree based on 78 P. acnes genomes and their gene contents. Further support for a basically clonal population structure of P. acnes and a scenario of the global spread of epidemic clones of P. acnes was obtained. Compared to the Belfast scheme...

  9. Multilocus Sequence Typing (MLST) and Phylogenetic Analysis of Propionibacterium acnes

    DEFF Research Database (Denmark)

    Kilian, Mogens; Scholz, Christian; Lomholt, Hans B

    2011-01-01

    Propionibacterium acnes is a commensal of human skin but is also implicated in the pathogenesis of acne vulgaris and in biofilm-associated infections of medical devices and endophthalmitis, and in infections of bone and dental root canals. Recent studies associate P. acnes with prostate cancer...... with reference to a phylogenetic tree based on 78 P. acnes genomes and their gene contents. Further support for a basically clonal population structure of P. acnes and a scenario of global spread of epidemic clones of P. acnes was obtained. Compared with the Belfast scheme, the Aarhus MLST scheme (http...

  10. Application of synthetic DNA probes to the analysis of DNA sequence variants in man

    International Nuclear Information System (INIS)

    Wallace, R.B.; Petz, L.D.; Yam, P.Y.

    1986-01-01

    Oligonucleotide probes provide a tool to discriminate between any two alleles on the basis of hybridization. Random sampling of the genome with different oligonucleotide probes should reveal polymorphism in a certain percentage of the cases. In the hope of identifying polymorphic regions more efficiently, we chose to take advantage of the proposed hypermutability of repeated DNA sequences and the specificity of oligonucleotide hybridization. Since, under appropriate conditions, oligonucleotide probes require complete base pairing for hybridization to occur, they will only hybridize to a subset of the members of a repeat family when all members of the family are not identical. The results presented here suggest that oligonucleotide hybridization can be used to extend the genomic sequences that can be tested for the presence of RFLPs. This expands the tools available to human genetics. In addition, the results suggest that repeated DNA sequences are indeed more polymorphic than single-copy sequences. 28 references, 2 figures

  11. [Sequencing and analysis of the complete genome of a rabies virus isolate from Sika deer].

    Science.gov (United States)

    Zhao, Yun-Jiao; Guo, Li; Huang, Ying; Zhang, Li-Shi; Qian, Ai-Dong

    2008-05-01

    One DRV strain was isolated from Sika deer brain and sequenced. Nine overlapping gene fragments were amplified by RT-PCR through the 3'-RACE and 5'-RACE methods, and the complete DRV genome sequence was assembled. The length of the complete genome is 11,863 bp. The DRV genome organization was similar to that of other rabies viruses, being composed of five genes, and the initiation and termination sites were highly conserved. There were mutated amino acids in important antigenic sites of the nucleoprotein and glycoprotein. The nucleotide and amino acid homologies of the N, P, M, G and L genes in strains with completed genomic sequencing were compared. A phylogenetic tree was established by comparison with the N gene sequences of other typical rabies viruses. These results indicated that DRV belonged to genotype 1. The highest homology, compared with Chinese vaccine strain 3aG, was 94%, and the lowest, compared with WCBV, was 71%. These findings provide a theoretical reference for further research on rabies virus.

  12. A symbolic dynamics approach for the complexity analysis of chaotic pseudo-random sequences

    International Nuclear Information System (INIS)

    Xiao Fanghong

    2004-01-01

    By considering a chaotic pseudo-random sequence as a symbolic sequence, the authors present a symbolic dynamics approach for the complexity analysis of chaotic pseudo-random sequences. The method is applied to the cases of the Logistic map and a one-way coupled map lattice to demonstrate how it works, and a comparison is made between it and the approximate entropy method. The results show that this method can distinguish the complexities of different chaotic pseudo-random sequences, and that it is superior to the approximate entropy method.
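
    To make the idea of treating a chaotic sequence as a symbolic sequence concrete, the sketch below binarizes a Logistic-map orbit and scores it with a block Shannon entropy. The abstract does not specify the authors' complexity measure, so block entropy is used here only as a stand-in; the threshold of 0.5, the parameter values, the word length and the function names are all assumptions made for illustration.

```python
from collections import Counter
from math import log2

def logistic_symbols(x0, r, n, threshold=0.5, burn_in=100):
    """Iterate x -> r*x*(1-x) and emit 1 if x >= threshold, else 0."""
    x = x0
    for _ in range(burn_in):          # discard the initial transient
        x = r * x * (1.0 - x)
    symbols = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        symbols.append(1 if x >= threshold else 0)
    return symbols

def block_entropy(symbols, m):
    """Shannon entropy (bits) of the distribution of length-m words."""
    words = Counter(tuple(symbols[i:i + m]) for i in range(len(symbols) - m + 1))
    total = sum(words.values())
    return -sum(c / total * log2(c / total) for c in words.values())

m = 4
chaotic = logistic_symbols(0.3, 4.0, 5000)    # fully chaotic regime
periodic = logistic_symbols(0.3, 3.2, 5000)   # period-2 regime

h_chaotic = block_entropy(chaotic, m) / m     # bits per symbol
h_periodic = block_entropy(periodic, m) / m

print(f"chaotic:  {h_chaotic:.3f} bits/symbol")
print(f"periodic: {h_periodic:.3f} bits/symbol")
```

    A sequence suitable for pseudo-random use should score close to 1 bit per symbol under this stand-in measure, whereas a periodic orbit scores near 0.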

  13. Sequence Analysis of IncA/C and IncI1 Plasmids Isolated from Multidrug-Resistant Salmonella Newport Using Single-Molecule Real-Time Sequencing.

    Science.gov (United States)

    Cao, Guojie; Allard, Marc; Hoffmann, Maria; Muruvanda, Tim; Luo, Yan; Payne, Justin; Meng, Kevin; Zhao, Shaohua; McDermott, Patrick; Brown, Eric; Meng, Jianghong

    2018-04-05

    Multidrug-resistant (MDR) plasmids play an important role in disseminating antimicrobial resistance genes. To elucidate the antimicrobial resistance gene compositions in A/C incompatibility complex (IncA/C) plasmids carried by animal-derived MDR Salmonella Newport, and to investigate the spread mechanism of IncA/C plasmids, this study characterizes the complete nucleotide sequences of IncA/C plasmids by comparative analysis. Complete nucleotide sequencing of plasmids and chromosomes of six MDR Salmonella Newport strains was performed using PacBio RSII. Open reading frames were assigned using prokaryotic genome annotation pipeline (PGAP). To understand genomic diversity and evolutionary relationships among Salmonella Newport IncA/C plasmids, we included three complete IncA/C plasmid sequences with similar backbones from Salmonella Newport and Escherichia coli: pSN254, pAM04528, and peH4H, and additional 200 draft chromosomes. With the exception of canine isolate CVM22462, which contained an additional IncI1 plasmid, each of the six MDR Salmonella Newport strains contained only the IncA/C plasmid. These IncA/C plasmids (including references) ranged in size from 80.1 (pCVM21538) to 176.5 kb (pSN254) and carried various resistance genes. Resistance genes floR, tetA, tetR, strA, strB, sul, and mer were identified in all IncA/C plasmids. Additionally, bla CMY-2 and sugE were present in all IncA/C plasmids, excepting pCVM21538. Plasmid pCVM22462 was capable of being transferred by conjugation. The IncI1 plasmid pCVM22462b in CVM22462 carried bla CMY-2 and sugE. Our data showed that MDR Salmonella Newport strains carrying similar IncA/C plasmids clustered together in the phylogenetic tree using chromosome sequences and the IncA/C plasmids from animal-derived Salmonella Newport contained diverse resistance genes. In the current study, we analyzed genomic diversities and phylogenetic relationships among MDR Salmonella Newport using complete plasmids and chromosome

  14. Reference miRNAs for miRNAome analysis of urothelial carcinomas.

    Directory of Open Access Journals (Sweden)

    Nadine Ratert

    BACKGROUND/OBJECTIVE: Reverse transcription quantitative real-time PCR (RT-qPCR) is widely used in microRNA (miRNA) expression studies on cancer. To compensate for the analytical variability produced by the multiple steps of the method, relative quantification of the measured miRNAs is required, which is based on normalization to endogenous reference genes. No study has been performed so far on reference miRNAs for normalization of miRNA expression in urothelial carcinoma. The aim of this study was to identify suitable reference miRNAs for miRNA expression studies by RT-qPCR in urothelial carcinoma. METHODS: Candidate reference miRNAs were selected from 24 urothelial carcinoma and normal bladder tissue samples by miRNA microarrays. The usefulness of these candidate reference miRNAs, together with the small nuclear RNAs RNU6B, RNU48, and Z30 that are commonly used for normalization, was thereafter validated by RT-qPCR in 58 tissue samples and analyzed by the algorithms geNorm, NormFinder, and BestKeeper. PRINCIPAL FINDINGS: Based on the miRNA microarray data, a total of 16 miRNAs were identified as putative reference genes. After validation by RT-qPCR, miR-101, miR-125a-5p, miR-148b, miR-151-5p, miR-181a, miR-181b, miR-29c, miR-324-3p, miR-424, miR-874, RNU6B, RNU48, and Z30 were used for geNorm, NormFinder, and BestKeeper analyses that gave different combinations of recommended reference genes for normalization. CONCLUSIONS: The present study provided the first systematic analysis for identifying suitable reference miRNAs for miRNA expression studies of urothelial carcinoma by RT-qPCR. Different combinations of reference genes resulted in reliable expression data for both strongly and less strongly altered miRNAs. Notably, RNU6B, which is the most frequently used reference gene for miRNA studies, gave inaccurate normalization. The combination of four (miR-101, miR-125a-5p, miR-148b, and miR-151-5p) or three (miR-148b, miR-181b, and miR-874

  15. The sequence and analysis of duplication rich human chromosome 16

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Joel; Han, Cliff; Gordon, Laurie A.; Terry, Astrid; Prabhakar, Shyam; She, Xinwei; Xie, Gary; Hellsten, Uffe; Man Chan, Yee; Altherr, Michael; Couronne, Olivier; Aerts, Andrea; Bajorek, Eva; Black, Stacey; Blumer, Heather; Branscomb, Elbert; Brown, Nancy C.; Bruno, William J.; Buckingham, Judith M.; Callen, David F.; Campbell, Connie S.; Campbell, Mary L.; Campbell, Evelyn W.; Caoile, Chenier; Challacombe, Jean F.; Chasteen, Leslie A.; Chertkov, Olga; Chi, Han C.; Christensen, Mari; Clark, Lynn M.; Cohn, Judith D.; Denys, Mirian; Detter, John C.; Dickson, Mark; Dimitrijevic-Bussod, Mira; Escobar, Julio; Fawcett, Joseph J.; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstein, David; Goodwin, Lynne A.; Grady, Deborah L.; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Hildebrand, Carl E.; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Jewett, Phillip E.; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Krawczyk, Marie-Claude; Leyba, Tina; Longmire, Jonathan L.; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Ludeman, Thom; Mark, Graham A.; Mcmurray, Kimberly L.; Meincke, Linda J.; Morgan, Jenna; Moyzis, Robert K.; Mundt, Mark O.; Munk, A. Christine; Nandkeshwar, Richard D.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Parson-Quintana, Beverly; Ramirez, Lucia; Rash, Sam; Retterer, James; Ricke, Darryl O.; Robinson, Donna L.; Rodriguez, Alex; Salamov, Asaf; Saunders, Elizabeth H.; Scott, Duncan; Shough, Timothy; Stallings, Raymond L.; Stalvey, Malinda; Sutherland, Robert D.; Tapia, Roxanne; Tesmer, Judith G.; Thayer, Nina; Thompson, Linda S.; Tice, Hope; Torney, David C.; Tran-Gyamfi, Mary; Tsai, Ming; Ulanovsky, Levy E.; Ustaszewska, Anna; Vo, Nu; White, P. Scott; Williams, Albert L.; Wills, Patricia L.; Wu, Jung-Rung; Wu, Kevin; Yang, Joan; DeJong, Pieter; Bruce, David; Doggett, Norman; Deaven, Larry; Schmutz, Jeremy; Grimwood, Jane; Richardson, Paul; et al.

    2004-08-01

    We report here the 78,884,754 base pairs of finished human chromosome 16 sequence, representing over 99.9 percent of its euchromatin. Manual annotation revealed 880 protein coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes and 3 RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukemia. Several large-scale structural polymorphisms spanning hundreds of kilobasepairs were identified and result in gene content differences across humans. One of the unique features of chromosome 16 is its high level of segmental duplication, ranked among the highest of the human autosomes. While the segmental duplications are enriched in the relatively gene poor pericentromere of the p-arm, some are involved in recent gene duplication and conversion events which are likely to have had an impact on the evolution of primates and human disease susceptibility.

  16. The Sequence and Analysis of Duplication Rich Human Chromosome 16

    Science.gov (United States)

    Martin, Joel; Han, Cliff; Gordon, Laurie A.; Terry, Astrid; Prabhakar, Shyam; She, Xinwei; Xie, Gary; Hellsten, Uffe; Man Chan, Yee; Altherr, Michael; Couronne, Olivier; Aerts, Andrea; Bajorek, Eva; Black, Stacey; Blumer, Heather; Branscomb, Elbert; Brown, Nancy C.; Bruno, William J.; Buckingham, Judith M.; Callen, David F.; Campbell, Connie S.; Campbell, Mary L.; Campbell, Evelyn W.; Caoile, Chenier; Challacombe, Jean F.; Chasteen, Leslie A.; Chertkov, Olga; Chi, Han C.; Christensen, Mari; Clark, Lynn M.; Cohn, Judith D.; Denys, Mirian; Detter, John C.; Dickson, Mark; Dimitrijevic-Bussod, Mira; Escobar, Julio; Fawcett, Joseph J.; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstein, David; Goodwin, Lynne A.; Grady, Deborah L.; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Hildebrand, Carl E.; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Jewett, Phillip E.; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Krawczyk, Marie-Claude; Leyba, Tina; Longmire, Jonathan L.; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Ludeman, Thom; Mark, Graham A.; Mcmurray, Kimberly L.; Meincke, Linda J.; Morgan, Jenna; Moyzis, Robert K.; Mundt, Mark O.; Munk, A. Christine; Nandkeshwar, Richard D.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Parson-Quintana, Beverly; Ramirez, Lucia; Rash, Sam; Retterer, James; Ricke, Darryl O.; Robinson, Donna L.; Rodriguez, Alex; Salamov, Asaf; Saunders, Elizabeth H.; Scott, Duncan; Shough, Timothy; Stallings, Raymond L.; Stalvey, Malinda; Sutherland, Robert D.; Tapia, Roxanne; Tesmer, Judith G.; Thayer, Nina; Thompson, Linda S.; Tice, Hope; Torney, David C.; Tran-Gyamfi, Mary; Tsai, Ming; Ulanovsky, Levy E.; Ustaszewska, Anna; Vo, Nu; White, P. Scott; Williams, Albert L.; Wills, Patricia L.; Wu, Jung-Rung; Wu, Kevin; Yang, Joan; DeJong, Pieter; Bruce, David; Doggett, Norman; Deaven, Larry; Schmutz, Jeremy; Grimwood, Jane; Richardson, Paul; et al.

    2004-01-01

    We report here the 78,884,754 base pairs of finished human chromosome 16 sequence, representing over 99.9 percent of its euchromatin. Manual annotation revealed 880 protein coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes and 3 RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukemia. Several large-scale structural polymorphisms spanning hundreds of kilobasepairs were identified and result in gene content differences across humans. One of the unique features of chromosome 16 is its high level of segmental duplication, ranked among the highest of the human autosomes. While the segmental duplications are enriched in the relatively gene poor pericentromere of the p-arm, some are involved in recent gene duplication and conversion events which are likely to have had an impact on the evolution of primates and human disease susceptibility.

  17. Factoring local sequence composition in motif significance analysis.

    Science.gov (United States)

    Ng, Patrick; Keich, Uri

    2008-01-01

    We recently introduced a biologically realistic and reliable significance analysis of the output of a popular class of motif finders. In this paper we further improve our significance analysis by incorporating local base composition information. Relying on realistic biological data simulation, as well as on FDR analysis applied to real data, we show that our method is significantly better than the increasingly popular practice of using the normal approximation to estimate the significance of a finder's output. Finally we turn to leveraging our reliable significance analysis to improve the actual motif finding task. Specifically, endowing a variant of the Gibbs Sampler with our improved significance analysis we demonstrate that de novo finders can perform better than has been perceived. Significantly, our new variant outperforms all the finders reviewed in a recently published comprehensive analysis of the Harbison genome-wide binding location data. Interestingly, many of these finders incorporate additional information such as nucleosome positioning and the significance of binding data.

  18. Peptide Pattern Recognition for high-throughput protein sequence analysis and clustering

    DEFF Research Database (Denmark)

    Busk, Peter Kamp

    2017-01-01

    Large collections of protein sequences with divergent sequences are tedious to analyze for understanding their phylogenetic or structure-function relation. Peptide Pattern Recognition is an algorithm that was developed to facilitate this task, but the previous version allows only a limited number of sequences as input. I implemented Peptide Pattern Recognition as a multithread software designed to handle large numbers of sequences and perform analysis in a reasonable time frame. Benchmarking showed that the new implementation of Peptide Pattern Recognition is twenty times faster than the previous implementation on a small protein collection with 673 MAP kinase sequences. In addition, the new implementation could analyze a large protein collection with 48,570 Glycosyl Transferase family 20 sequences without reaching its upper limit on a desktop computer. Peptide Pattern Recognition...

  19. Information-Theoretical Analysis of EEG Microstate Sequences in Python

    Directory of Open Access Journals (Sweden)

    Frederic von Wegner

    2018-06-01

    We present an open-source Python package to compute information-theoretical quantities for electroencephalographic data. Electroencephalography (EEG) measures the electrical potential generated by the cerebral cortex and the set of spatial patterns projected by the brain's electrical potential on the scalp surface can be clustered into a set of representative maps called EEG microstates. Microstate time series are obtained by competitively fitting the microstate maps back into the EEG data set, i.e., by substituting the EEG data at a given time with the label of the microstate that has the highest similarity with the actual EEG topography. As microstate sequences consist of non-metric random variables, e.g., the letters A–D, we recently introduced information-theoretical measures to quantify these time series. In wakeful resting state EEG recordings, we found new characteristics of microstate sequences such as periodicities related to EEG frequency bands. The algorithms used are here provided as an open-source package and their use is explained in a tutorial style. The package is self-contained and the programming style is procedural, focusing on code intelligibility and easy portability. Using a sample EEG file, we demonstrate how to perform EEG microstate segmentation using the modified K-means approach, and how to compute and visualize the recently introduced information-theoretical tests and quantities. The time-lagged mutual information function is derived as a discrete symbolic alternative to the autocorrelation function for metric time series and confidence intervals are computed from Markov chain surrogate data. The software package provides an open-source extension to the existing implementations of the microstate transform and is specifically designed to analyze resting state EEG recordings.
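
    The time-lagged mutual information mentioned in this abstract can be illustrated with a minimal sketch. This is not code from the package described; the function name and the plain plug-in (frequency-count) estimator are assumptions chosen for illustration, and the surrogate-based confidence intervals are not shown.

```python
import math
from collections import Counter


def lagged_mutual_information(seq, lag):
    """Plug-in estimate of I(X_t; X_{t+lag}) in bits for a symbolic sequence."""
    if lag < 1 or lag >= len(seq):
        raise ValueError("lag must satisfy 1 <= lag < len(seq)")
    # Pair each symbol with the symbol observed `lag` steps later.
    pairs = list(zip(seq[:-lag], seq[lag:]))
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (left[a] * right[b]))
    return mi
```

    For a strictly alternating label sequence such as "ABAB..." the estimate at lag 1 approaches 1 bit, while a constant sequence gives 0, which is the behaviour expected of a symbolic analogue of the autocorrelation function.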

  20. Massively parallel sequencing and analysis of the Necator americanus transcriptome.

    Directory of Open Access Journals (Sweden)

    Cinzia Cantacessi

    2010-05-01

    The blood-feeding hookworm Necator americanus infects hundreds of millions of people worldwide. In order to elucidate fundamental molecular biological aspects of this hookworm, the transcriptome of the adult stage of Necator americanus was explored using next-generation sequencing and bioinformatic analyses. A total of 19,997 contigs were assembled from the sequence data; 6,771 of these contigs had known orthologues in the free-living nematode Caenorhabditis elegans, and most of them encoded proteins with WD40 repeats (10.6%), proteinase inhibitors (7.8%) or calcium-binding EF-hand proteins (6.7%). Bioinformatic analyses inferred that the C. elegans homologues are involved mainly in biological pathways linked to ribosome biogenesis (70%), oxidative phosphorylation (63%) and/or proteases (60%); most of these molecules were predicted to be involved in more than one biological pathway. Comparative analyses of the transcriptomes of N. americanus and the canine hookworm, Ancylostoma caninum, revealed qualitative and quantitative differences. For instance, proteinase inhibitors were inferred to be highly represented in the former species, whereas SCP/Tpx-1/Ag5/PR-1/Sc7 proteins (= SCP/TAPS or Ancylostoma-secreted proteins) were predominant in the latter. In N. americanus, essential molecules were predicted using a combination of orthology mapping and functional data available for C. elegans. Further analyses allowed the prioritization of 18 predicted drug targets which did not have homologues in the human host. These candidate targets were inferred to be linked to mitochondrial (e.g., processing proteins) or amino acid metabolism (e.g., asparagine t-RNA synthetase). This study has provided detailed insights into the transcriptome of the adult stage of N. americanus and examines similarities and differences between this species and A. caninum. Future efforts should focus on comparative transcriptomic and proteomic investigations of the other predominant human

  1. [Investigation of reference intervals of blood gas and acid-base analysis assays in China].

    Science.gov (United States)

    Zhang, Lu; Wang, Wei; Wang, Zhiguo

    2015-10-01

    The aim was to investigate and analyze the upper and lower limits of reference intervals, and their sources, in blood gas and acid-base analysis assays. Reference interval data were collected from the first run of the 2014 External Quality Assessment (EQA) program in blood gas and acid-base analysis assays performed by the National Center for Clinical Laboratories (NCCL). All abnormal values and errors were eliminated. Statistical analysis of the upper and lower limits of the reference intervals and their sources was performed with SPSS 13.0 and Excel 2007 for 7 blood gas and acid-base analysis assays, i.e. pH value, partial pressure of carbon dioxide (PCO2), partial pressure of oxygen (PO2), Na+, K+, Ca2+ and Cl-. Values were further grouped by instrument system and the differences between groups were analyzed. A total of 225 laboratories submitted information on the reference intervals they had been using. The three main sources of reference intervals were the National Guide to Clinical Laboratory Procedures [37.07% (400/1 079)], instructions of instrument manufacturers [31.23% (337/1 079)] and instructions of reagent manufacturers [23.26% (251/1 079)]. Approximately 35.1% (79/225) of the laboratories had validated the reference intervals they used. The difference of upper and lower limits in most assays among 7 laboratories was moderate, both minimum and maximum (e.g. the upper limits of pH value were 7.00-7.45, the lower limits of Na+ were 130.00-156.00 mmol/L), and mean and median (e.g. the upper limits of K+ were 5.04 mmol/L and 5.10 mmol/L, the upper limits of PCO2 were 45.65 mmHg and 45.00 mmHg, 1 mmHg = 0.133 kPa), as well as the difference in P2.5 and P97.5 between each instrument system group. The Kruskal-Wallis method showed that the P values of the upper and lower limits of all the parameters were lower than 0.001, except for the lower limits of Na+, with a P value of 0.029. The Mann-Whitney test showed that statistical differences were found among instrument

  2. Total body calcium by neutron activation analysis. Reference data for children

    International Nuclear Information System (INIS)

    Ellis, K.J.; Shypailo, R.J.

    2001-01-01

    There is a paucity of data on the chemical composition of the human body during growth. Total body calcium (TBCa) has been reported for only one male child, aged 4 1/2 yr. TBCa values for 25 children and 27 young women using in vivo neutron activation analysis have been obtained. TBCa results were lower than those reported for the one male cadaver, as well as the estimates derived for the 'Reference Man' model. It was concluded that the reference values for TBCa may need to be adjusted to appropriately describe skeletal mineralization of contemporary children. (author)

  3. Instrumental neutron activation analysis for the certification of biological reference materials

    International Nuclear Information System (INIS)

    Ambulkar, M.N.; Chutke, N.L.; Garg, A.N.

    1992-01-01

    A multielemental instrumental neutron activation analysis (INAA) method by short and long irradiation has been employed for the determination of 22 minor and trace constituents in two proposed Standard Reference Materials P-RBF and P-WBF from Institute of Radioecology and Applied Nuclear Techniques, Czechoslovakia. Also some biological standards such as Bowen's Kale, Cabbage leaves (Poland) including wheat and rice flour samples of local origin were analysed. It is suggested that INAA is an ideal method for the certification of reference materials of biological matrices. (author). 7 refs., 1 tab

  4. Systems Analysis Programs for Hands-on Integrated Reliability Evaluations (SAPHIRE), Version 5.0: Integrated Reliability and Risk Analysis System (IRRAS) reference manual. Volume 2

    International Nuclear Information System (INIS)

    Russell, K.D.; Kvarfordt, K.J.; Skinner, N.L.; Wood, S.T.; Rasmuson, D.M.

    1994-07-01

    The Systems Analysis Programs for Hands-on Integrated Reliability Evaluations (SAPHIRE) refers to a set of several microcomputer programs that were developed to create and analyze probabilistic risk assessments (PRAs), primarily for nuclear power plants. The Integrated Reliability and Risk Analysis System (IRRAS) is a state-of-the-art, microcomputer-based probabilistic risk assessment (PRA) model development and analysis tool to address key nuclear plant safety issues. IRRAS is an integrated software tool that gives the user the ability to create and analyze fault trees and accident sequences using a microcomputer. This program provides functions that range from graphical fault tree construction to cut set generation and quantification to report generation. Version 1.0 of the IRRAS program was released in February of 1987. Since then, many user comments and enhancements have been incorporated into the program providing a much more powerful and user-friendly system. This version has been designated IRRAS 5.0 and is the subject of this Reference Manual. Version 5.0 of IRRAS provides the same capabilities as earlier versions, adds the ability to perform location transformations and seismic analysis, and provides enhancements to the user interface as well as improved algorithm performance. Additionally, version 5.0 contains new alphanumeric fault tree and event used for event tree rules, recovery rules, and end state partitioning
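
    As a rough illustration of what "cut set quantification" means in this context, the sketch below computes a top-event probability from minimal cut sets using the common minimal-cut-set upper bound, assuming independent basic events. It is not IRRAS code; the function names, event names and probabilities are hypothetical.

```python
from math import prod


def cut_set_probability(cut_set, basic_event_prob):
    """Probability that every basic event in one cut set occurs (independence assumed)."""
    return prod(basic_event_prob[event] for event in cut_set)


def min_cut_upper_bound(cut_sets, basic_event_prob):
    """Minimal-cut-set upper bound: 1 - product over cut sets of (1 - P(cut set))."""
    survive = 1.0
    for cut_set in cut_sets:
        survive *= 1.0 - cut_set_probability(cut_set, basic_event_prob)
    return 1.0 - survive


# Hypothetical basic events and minimal cut sets, for illustration only.
example_probs = {"PUMP_A_FAILS": 0.1, "PUMP_B_FAILS": 0.2, "VALVE_FAILS": 0.05}
example_cut_sets = [["PUMP_A_FAILS", "PUMP_B_FAILS"], ["VALVE_FAILS"]]
top_event_estimate = min_cut_upper_bound(example_cut_sets, example_probs)
```

    With these made-up numbers the two cut sets contribute 0.02 and 0.05, and the bound evaluates to about 0.069.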

  5. Combined DECS Analysis and Next-Generation Sequencing Enable Efficient Detection of Novel Plant RNA Viruses

    Directory of Open Access Journals (Sweden)

    Hironobu Yanagisawa

    2016-03-01

    The presence of high molecular weight double-stranded RNA (dsRNA) within plant cells is an indicator of infection with RNA viruses as these possess genomic or replicative dsRNA. DECS (dsRNA isolation, exhaustive amplification, cloning, and sequencing) analysis has been shown to be capable of detecting unknown viruses. We postulated that a combination of DECS analysis and next-generation sequencing (NGS) would improve detection efficiency and usability of the technique. Here, we describe a model case in which we efficiently detected the presumed genome sequence of Blueberry shoestring virus (BSSV), a member of the genus Sobemovirus, which has not so far been reported. dsRNAs were isolated from BSSV-infected blueberry plants using the dsRNA-binding protein, reverse-transcribed, amplified, and sequenced using NGS. A contig of 4,020 nucleotides (nt) that shared similarities with sequences from other Sobemovirus species was obtained as a candidate of the BSSV genomic sequence. Reverse transcription (RT)-PCR primer sets based on sequences from this contig enabled the detection of BSSV in all BSSV-infected plants tested but not in healthy controls. A recombinant protein encoded by the putative coat protein gene was bound by the BSSV-antibody, indicating that the candidate sequence was that of BSSV itself. Our results suggest that a combination of DECS analysis and NGS, designated here as “DECS-C,” is a powerful method for detecting novel plant viruses.

  6. Certified reference materials for analytical quality control in neutron activation analysis

    International Nuclear Information System (INIS)

    Wee Boon Siong; Abdul Khalik Wood; Mohd Suhaimi Hamzah; Shamsiah Abdul Rahman; Mohd Suhaimi Elias; Nazaratul Ashifa Abdul Salim

    2007-01-01

    Analytical quality control in neutron activation analysis (NAA) requires the use of certified reference materials (CRM) in order to produce reliable analytical results. It is essential to evaluate the performance of the NAA method when analyzing various sample matrices. Therefore, the CRM selected for an analysis should be suitable for the type of samples. Many aspects, such as concentration range, matrix match, sample size and uncertainty, need to be considered when selecting a suitable CRM. Finally, the results of CRM analyses were plotted into control charts in order to evaluate the quality of the data. This is to ensure that the results are within the 95 % confidence interval as stipulated in the certificate of the CRM. Thus, this article aims to discuss the uses of certified reference materials for quality control purposes in NAA involving various sample matrices. (author)
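
    The control-chart check described above can be sketched as follows. This is an illustrative assumption about how such a check might be coded, not the authors' procedure; the function names and all numbers are hypothetical, and the certificate is assumed to give a certified value with a symmetric 95 % confidence half-width.

```python
def within_certified_interval(measured, certified, half_width):
    """True if a measured value lies inside certified +/- half_width."""
    return abs(measured - certified) <= half_width


def control_chart_flags(results, certified, half_width):
    """Return (run index, value) pairs for results outside the certified interval."""
    return [
        (index, value)
        for index, value in enumerate(results)
        if not within_certified_interval(value, certified, half_width)
    ]


# Hypothetical CRM: certified 10.0 mg/kg with a 95 % half-width of 0.5 mg/kg.
example_results = [10.1, 9.8, 11.5]
example_flags = control_chart_flags(example_results, certified=10.0, half_width=0.5)
```

    Here only the third run falls outside the interval, so it is the one that would be flagged for investigation.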

  7. Data Analysis of Sequences and qPCR for Microbial Communities during Algal Blooms

    Science.gov (United States)

    A training opportunity is open to a highly microbial-research-motivated student to conduct sequence analysis, explore novel genes and metabolic pathways, validate resultant findings using qPCR/RT-qPCR and summarize the findings

  8. Sequence analysis of the N-acetyltransferase 2 gene (NAT2) among ...

    African Journals Online (AJOL)

    Yazun Bashir Jarrar

    2017-11-26

    Nov 26, 2017 ... Sequence analysis of the N-acetyltransferase 2 gene (NAT2) among Jordanian volunteers, Libyan. Journal of Medicine .... For molecular modeling of NAT2 protein, visualized ..... cal clustering. .... cular dynamics simulation.

  9. Analysis of common SHOX gene sequence variants and ∼4.9-kb ...

    Indian Academy of Sciences (India)

    [Solc R., Hirschfeldova K., Kebrdlova V. and Baxova A. 2014 Analysis of common SHOX gene sequence variants ... based on a Gibbs sampling strategy were done using .... SHOX (short stature homeobox) are an important cause of growth.

  10. The Absolute Stability Analysis in Fuzzy Control Systems with Parametric Uncertainties and Reference Inputs

    Science.gov (United States)

    Wu, Bing-Fei; Ma, Li-Shan; Perng, Jau-Woei

    This study analyzes the absolute stability in P and PD type fuzzy logic control systems with both certain and uncertain linear plants. Stability analysis includes the reference input, actuator gain and interval plant parameters. For certain linear plants, the stability (i.e. the stable equilibriums of error) in P and PD types is analyzed with the Popov or linearization methods under various reference inputs and actuator gains. The steady state errors of fuzzy control systems are also addressed in the parameter plane. The parametric robust Popov criterion for parametric absolute stability based on Lur'e systems is also applied to the stability analysis of P type fuzzy control systems with uncertain plants. The PD type fuzzy logic controller in our approach is a single-input fuzzy logic controller and is transformed into the P type for analysis. In our work, the absolute stability analysis of fuzzy control systems is given with respect to a non-zero reference input and an uncertain linear plant with the parametric robust Popov criterion unlike previous works. Moreover, a fuzzy current controlled RC circuit is designed with PSPICE models. Both numerical and PSPICE simulations are provided to verify the analytical results. Furthermore, the oscillation mechanism in fuzzy control systems is specified with various equilibrium points of view in the simulation example. Finally, the comparisons are also given to show the effectiveness of the analysis method.

  11. Comparative sequence analysis of Sordaria macrospora and Neurospora crassa as a means to improve genome annotation.

    Science.gov (United States)

    Nowrousian, Minou; Würtz, Christian; Pöggeler, Stefanie; Kück, Ulrich

    2004-03-01

    One of the most challenging parts of large scale sequencing projects is the identification of functional elements encoded in a genome. Recently, studies of genomes of up to six different Saccharomyces species have demonstrated that a comparative analysis of genome sequences from closely related species is a powerful approach to identify open reading frames and other functional regions within genomes [Science 301 (2003) 71, Nature 423 (2003) 241]. Here, we present a comparison of selected sequences from Sordaria macrospora to their corresponding Neurospora crassa orthologous regions. Our analysis indicates that due to the high degree of sequence similarity and conservation of overall genomic organization, S. macrospora sequence information can be used to simplify the annotation of the N. crassa genome.

  12. Probabilistic topic modeling for the analysis and classification of genomic sequences

    Science.gov (United States)

    2015-01-01

    Background: Studies on genomic sequences for classification and taxonomic identification have a leading role in the biomedical field and in the analysis of biodiversity. These studies focus on the so-called barcode genes, representing a well defined region of the whole genome. Recently, alignment-free techniques have been gaining importance because they are able to overcome the drawbacks of sequence alignment techniques. In this paper a new alignment-free method for DNA sequence clustering and classification is proposed. The method is based on k-mers representation and text mining techniques. Methods: The presented method is based on Probabilistic Topic Modeling, a statistical technique originally proposed for text documents. Probabilistic topic models are able to find in a document corpus the topics (recurrent themes) characterizing classes of documents. This technique, applied on DNA sequences representing the documents, exploits the frequency of fixed-length k-mers and builds a generative model for a training group of sequences. This generative model, obtained through the Latent Dirichlet Allocation (LDA) algorithm, is then used to classify a large set of genomic sequences. Results and conclusions: We performed classification of over 7000 16S DNA barcode sequences taken from the Ribosomal Database Project (RDP) repository, training probabilistic topic models. The proposed method is compared to the RDP tool and the Support Vector Machine (SVM) classification algorithm in an extensive set of trials using both complete sequences and short sequence snippets (from 400 bp to 25 bp). Our method reaches very similar results to the RDP classifier and SVM for complete sequences. The most interesting results are obtained when short sequence snippets are considered. In these conditions the proposed method outperforms RDP and SVM with ultra short sequences and it exhibits a smooth decrease of performance, at every taxonomic level, when the sequence length is decreased.
PMID:25916734
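
    The k-mer representation that lets a DNA sequence be treated as a text document can be sketched as below. Only the counting step is shown; the topic model (LDA) itself is not implemented here, and the function names are illustrative assumptions rather than anything taken from the paper.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count overlapping fixed-length k-mers in a DNA sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def sequence_to_document(sequence, k):
    """Render a sequence as a space-separated 'document' whose words are k-mers."""
    sequence = sequence.upper()
    return " ".join(sequence[i:i + k] for i in range(len(sequence) - k + 1))
```

    A count vector of this kind (one per sequence) is the sort of bag-of-words input that a topic model such as LDA would typically be trained on.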

  13. Analysis of marine sediment and lobster hepatopancreas reference materials by instrumental photon activation

    International Nuclear Information System (INIS)

    Landsberger, S.; Davidson, W.F.

    1985-01-01

    By use of instrumental photon activation analysis, twelve trace (As, Ba, Cr, Co, Mn, Ni, Pb, Sb, Sr, U, Zn, and Zr) and eight minor (C, Na, Mg, Co, K, Ca, Tl, and Fe) elements were determined in a certified marine sediment standard reference material as well as eight trace (Mn, Ni, Cu, Zn, As, Sr, Cd, and Pb) and four minor (Na, Mg, Cl, and Ca) elements in a certified marine tissue (lobster hepatopancreas) standard reference material. The precision and accuracy of the present results when compared to the accepted values clearly demonstrate the reliability of this nondestructive technique and its applicability to marine environmental or marine geochemical studies. 24 references, 4 figures, 3 tables

  14. Dosimetric analysis at ICRU reference points in HDR-brachytherapy of cervical carcinoma.

    Science.gov (United States)

    Eich, H T; Haverkamp, U; Micke, O; Prott, F J; Müller, R P

    2000-01-01

    In vivo dosimetry in bladder and rectum, as well as determining doses at the reference points suggested in ICRU report 38, contributes to quality assurance in HDR-brachytherapy of cervical carcinoma, especially to minimize side effects. To gain information on the radiation exposure at ICRU reference points in rectum, bladder, ureter and regional lymph nodes, the doses at these points were calculated by digitising orthogonal radiographs of 11 applications in patients with cervical carcinoma who received primary radiotherapy. In addition, the dose at the ICRU rectum reference point was compared to the results of in vivo measurements in the rectum. The in vivo measurements were lower by a factor of 1.5 than the doses determined for the ICRU rectum reference point (4.05 +/- 0.68 Gy versus 6.11 +/- 1.63 Gy). Reasons for this were: calibration errors, non-orthogonal radiographs, movement of applicator and probe in the time span between X-ray and application, missing connection of probe and anterior rectal wall. The standard deviation of calculations at ICRU reference points was on average +/- 30%. Possible reasons for the relatively large standard deviation were difficulties in defining the points, identifying them on radiographs and the different locations of the applicators. Although 3D CT-, US- or MR-based treatment planning using dose volume histogram analysis is increasingly established, this simple procedure of marking and digitising the ICRU reference points lengthened treatment planning by only 5 to 10 minutes. The advantages of in vivo dosimetry are easy practicability and the possibility to determine rectum doses during radiation. The advantages of computer-aided planning at ICRU reference points are that calculations are available before radiation and that they can still be taken into account for treatment planning. Both methods should be applied in HDR-brachytherapy of cervical carcinoma.

  15. Total RNA Sequencing Analysis of DCIS Progressing to Invasive Breast Cancer

    Science.gov (United States)

    2017-09-01

    AWARD NUMBER: W81XWH-14-1-0080 TITLE: Total RNA Sequencing Analysis of DCIS Progressing to Invasive Breast Cancer. PRINCIPAL INVESTIGATOR...TITLE AND SUBTITLE Total RNA Sequencing Analysis of DCIS Progressing to Invasive Breast Cancer. 5a. CONTRACT NUMBER 5b. GRANT NUMBER GRANT11489...institutional, NIH-funded study of genetic and epigenetic alterations of pre-invasive DCIS that did or did not progress to invasive breast cancer, with an

  16. Seismically induced accident sequence analysis of the advanced test reactor

    International Nuclear Information System (INIS)

    Khericha, S.T.; Henry, D.M.; Ravindra, M.K.; Hashimoto, P.S.; Griffin, M.J.; Tong, W.H.; Nafday, A.M.

    1991-01-01

    A seismic probabilistic risk assessment (PRA) was performed for the Department of Energy (DOE) Advanced Test Reactor (ATR) as part of the external events analysis. The risk from seismic events to the fuel in the core and in the fuel storage canal was evaluated. The key elements of this paper are the integration of seismically induced internal flood and internal fire, and the modeling of human error rates as a function of the magnitude of earthquake. The systems analysis was performed by EG&G Idaho, Inc. and the fragility analysis and quantification were performed by EQE International, Inc. (EQE)

  17. Transaction Analysis of Interactions at the Reference Desk of a Small Academic Library

    Directory of Open Access Journals (Sweden)

    Heather Empey

    2011-01-01

    As discussions continue about the changing nature of reference service in academic libraries, the Geoffrey R. Weller Library determined that more detailed information on what was happening at the Reference Desk was needed. During the 2006/07 academic year, transactions at the Reference Desk were analyzed to determine when they occurred (both during the week and during the academic year), the length of time the transactions took, the categories of the transactions, what sources were used and whether or not instruction was provided as part of the transaction. Another round of data was gathered in September 2009 to determine if use patterns had changed. Transactions at the Reference Desk were generally conducted in person, took either <1 min. or between 1-5 min. to answer, and occurred most often on Mon-Thurs between 11am-5pm. Between September 2006 and September 2009, specific title and research categories of questions decreased by 6% and directional and technical help categories of questions increased by 9%. There was also a decrease in the level of instruction being given. As a result of this research, service hours have been reduced and the on-going data collection at the Reference Desk has become more detailed to allow for ongoing analysis.

  18. Evaluation of reference genes for gene expression analysis using quantitative RT-PCR in Azospirillum brasilense.

    Science.gov (United States)

    McMillan, Mary; Pereg, Lily

    2014-01-01

    Azospirillum brasilense is a nitrogen fixing bacterium that has been shown to have various beneficial effects on plant growth and yield. Under normal conditions A. brasilense exists in a motile flagellated form, which, under starvation or stress conditions, can undergo differentiation into an encapsulated, cyst-like form. Quantitative RT-PCR can be used to analyse changes in gene expression during this differentiation process. The accuracy of quantification of mRNA levels by qRT-PCR relies on the normalisation of data against stably expressed reference genes. No suitable set of reference genes has yet been described for A. brasilense. Here we evaluated the expression of ten candidate reference genes (16S rRNA, gapB, glyA, gyrA, proC, pykA, recA, recF, rpoD, and tpiA) in wild-type and mutant A. brasilense strains under different culture conditions, including conditions that induce differentiation. Analysis with the software programs BestKeeper, NormFinder and GeNorm indicated that gyrA, glyA and recA are the most stably expressed reference genes in A. brasilense. The results also suggested that the use of two reference genes (gyrA and glyA) is sufficient for effective normalisation of qRT-PCR data.

  19. Evaluation of reference genes for gene expression analysis using quantitative RT-PCR in Azospirillum brasilense.

    Directory of Open Access Journals (Sweden)

    Mary McMillan

    Azospirillum brasilense is a nitrogen fixing bacterium that has been shown to have various beneficial effects on plant growth and yield. Under normal conditions A. brasilense exists in a motile flagellated form, which, under starvation or stress conditions, can undergo differentiation into an encapsulated, cyst-like form. Quantitative RT-PCR can be used to analyse changes in gene expression during this differentiation process. The accuracy of quantification of mRNA levels by qRT-PCR relies on the normalisation of data against stably expressed reference genes. No suitable set of reference genes has yet been described for A. brasilense. Here we evaluated the expression of ten candidate reference genes (16S rRNA, gapB, glyA, gyrA, proC, pykA, recA, recF, rpoD, and tpiA) in wild-type and mutant A. brasilense strains under different culture conditions, including conditions that induce differentiation. Analysis with the software programs BestKeeper, NormFinder and GeNorm indicated that gyrA, glyA and recA are the most stably expressed reference genes in A. brasilense. The results also suggested that the use of two reference genes (gyrA and glyA) is sufficient for effective normalisation of qRT-PCR data.

  20. Preparation and certification of rice flour reference materials for trace elements analysis

    International Nuclear Information System (INIS)

    Cho, Kyung Haeng; Park, Chang Joon; Woo, Jin Choon; Suh, Jung Ki; Han, Myung Sub; Lee, Jong Hae

    1998-01-01

    Rice flour reference materials were prepared from unpolished rice grown in Korea and certified for elemental composition. The reference materials consist of two samples containing normal and high levels. The reference material at the elevated level was prepared by spiking the normal rice flour with six toxic elements (As, Cd, Cu, Cr, Hg and Pb) at 1.0 μg/g on a dry weight basis. Homogeneity of the prepared materials was evaluated through the determination of Ca, Cu, Fe, Mn and Zn by instrumental neutron activation analysis (INAA) and atomic absorption spectrometry (AAS). Small variance of elemental composition among inter-bottled samples assured homogeneity of the prepared materials. The materials were decomposed by high pressure digestion and microwave digestion methods. INAA, AAS, inductively coupled plasma-atomic emission spectrometry (ICP-AES), ICP-mass spectrometry (MS) and vapour generation techniques were employed to analyze the reference materials. From these independent analytical results, the certified or reference values were determined for As, Ca, Cd, Cr, Cu, Fe, Hg, K, Mg, Mn, Mo, Na, P, Pb, Se and Zn.

  1. Microscopic Analysis and Modeling of Airport Surface Sequencing, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — The complexity and interdependence of operations on the airport surface motivate the need for a comprehensive and detailed, yet flexible and validated analysis and...

  2. BioMatriX: Sequence analysis, structure visualization, phylogenetics ...

    African Journals Online (AJOL)

    bmx-biomatrix.blogspot.com) developed for biological science community to augment scientific research regarding genomics, proteomics, phylogenetics and linkage analysis in one platform. BioMatriX offers multi-functional services to perform ...

  3. An internal reference model-based PRF temperature mapping method with Cramer-Rao lower bound noise performance analysis.

    Science.gov (United States)

    Li, Cheng; Pan, Xinyi; Ying, Kui; Zhang, Qiang; An, Jing; Weng, Dehe; Qin, Wen; Li, Kuncheng

    2009-11-01

    The conventional phase difference method for MR thermometry suffers from disturbances caused by the presence of lipid protons, motion-induced error, and field drift. A signal model is presented with multi-echo gradient echo (GRE) sequence using a fat signal as an internal reference to overcome these problems. The internal reference signal model is fit to the water and fat signals by the extended Prony algorithm and the Levenberg-Marquardt algorithm to estimate the chemical shifts between water and fat which contain temperature information. A noise analysis of the signal model was conducted using the Cramer-Rao lower bound to evaluate the noise performance of various algorithms, the effects of imaging parameters, and the influence of the water:fat signal ratio in a sample on the temperature estimate. Comparison of the calculated temperature map and thermocouple temperature measurements shows that the maximum temperature estimation error is 0.614 degrees C, with a standard deviation of 0.06 degrees C, confirming the feasibility of this model-based temperature mapping method. The influence of sample water:fat signal ratio on the accuracy of the temperature estimate is evaluated in a water-fat mixed phantom experiment with an optimal ratio of approximately 0.66:1. (c) 2009 Wiley-Liss, Inc.

  4. Sequence analysis of L RNA of Lassa virus

    International Nuclear Information System (INIS)

    Vieth, Simon; Torda, Andrew E.; Asper, Marcel; Schmitz, Herbert; Guenther, Stephan

    2004-01-01

    The L RNA of three Lassa virus strains originating from Nigeria, Ghana/Ivory Coast, and Sierra Leone was sequenced and the data subjected to structure predictions and phylogenetic analyses. The L gene products had 2218-2221 residues, diverged by 18% at the amino acid level, and contained several conserved regions. Only one region of 504 residues (positions 1043-1546) could be assigned a function, namely that of an RNA polymerase. Secondary structure predictions suggest that this domain is very similar to RNA-dependent RNA polymerases of known structure encoded by plus-strand RNA viruses, permitting a model to be built. Outside the polymerase region, there is little structural data, except for regions of strong alpha-helical content and probably a coiled-coil domain at the N terminus. No evidence for reassortment or recombination during Lassa virus evolution was found. The secondary structure-assisted alignment of the RNA polymerase region permitted a reliable reconstruction of the phylogeny of all negative-strand RNA viruses, indicating that Arenaviridae are most closely related to Nairoviruses. In conclusion, the data provide a basis for structural and functional characterization of the Lassa virus L protein and reveal new insights into the phylogeny of negative-strand RNA viruses

  5. Antimicrobial susceptibility among clinical Nocardia species identified by multilocus sequence analysis.

    Science.gov (United States)

    McTaggart, Lisa R; Doucet, Jennifer; Witkowska, Maria; Richardson, Susan E

    2015-01-01

    Antimicrobial susceptibility patterns of 112 clinical isolates, 28 type strains, and 9 reference strains of Nocardia were determined using the Sensititre Rapmyco microdilution panel (Thermo Fisher, Inc.). Isolates were identified by highly discriminatory multilocus sequence analysis and were chosen to represent the diversity of species recovered from clinical specimens in Ontario, Canada. Susceptibility to the most commonly used drug, trimethoprim-sulfamethoxazole, was observed in 97% of isolates. Linezolid and amikacin were also highly effective; 100% and 99% of all isolates demonstrated a susceptible phenotype. For the remaining antimicrobials, resistance was species specific with isolates of Nocardia otitidiscaviarum, N. brasiliensis, N. abscessus complex, N. nova complex, N. transvalensis complex, N. farcinica, and N. cyriacigeorgica displaying the traditional characteristic drug pattern types. In addition, the antimicrobial susceptibility profiles of a variety of rarely encountered species isolated from clinical specimens are reported for the first time and were categorized into four additional drug pattern types. Finally, MICs for the control strains N. nova ATCC BAA-2227, N. asteroides ATCC 19247(T), and N. farcinica ATCC 23826 were robustly determined to demonstrate method reproducibility and suitability of the commercial Sensititre Rapmyco panel for antimicrobial susceptibility testing of Nocardia spp. isolated from clinical specimens. The reported values will facilitate quality control and standardization among laboratories. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  6. Full-length genome sequence analysis of four subgroup J avian leukosis virus strains isolated from chickens with clinical hemangioma.

    Science.gov (United States)

    Lin, Lulu; Wang, Peikun; Yang, Yongli; Li, Haijuan; Huang, Teng; Wei, Ping

    2017-12-01

    Since 2014, cases of hemangioma associated with avian leukosis virus subgroup J (ALV-J) have been emerging in commercial chickens in Guangxi. In this study, four strains of ALV-J, named GX14HG01, GX14HG04, GX14LT07, and GX14ZS14, were isolated from chickens with clinical hemangioma in 2014 by DF-1 cell culture and then identified by ELISA detection of the ALV group specific antigen p27, subtype specific PCR and indirect immunofluorescence assay (IFA) with an ALV-J specific monoclonal antibody. The complete genomes of the isolates were sequenced and it was found that gag and pol were relatively conserved, while env was variable, especially the gp85 gene. Homology analysis of the env gene sequences showed that the env gene of all four isolates had higher similarity with the hemangioma (HE)-type reference strains than with the myeloid leukosis (ML)-type strains; moreover, the HE-type strains' specific deletion of a 205-bp sequence covering the rTM and DR1 in the 3'UTR fragment was also found in the four isolates. Further analysis of the sequences of the subunits of the env gene revealed an interesting finding: the gp85 of isolates GX14ZS14 and GX14HG04 had a higher similarity with HPRS-103 and much lower similarity with the HE-type reference strains, resulting in GX14ZS14, GX14HG04, and HPRS-103 being clustered in the same branch, while gp37 had higher similarity with the HE-type reference strains than with HPRS-103, resulting in GX14ZS14, GX14HG04, and the HE-type reference strains being clustered in the same branch. The results suggested that isolates GX14ZS14 and GX14HG04 may be recombinant strains of the foreign strain HPRS-103 and the local epidemic HE-type strains of ALV-J.

  7. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  8. Recognizing mild cognitive impairment based on network connectivity analysis of resting EEG with zero reference

    International Nuclear Information System (INIS)

    Xu, Peng; Xiong, Xiu Chun; Tian, Yin; Zhang, Rui; Li, Pei Yang; Yao, De Zhong; Xue, Qing; Wang, Yu Ping; Peng, Yueheng

    2014-01-01

    The diagnosis of mild cognitive impairment (MCI) is very helpful for early therapeutic interventions of Alzheimer's disease (AD). MCI has been proven to be correlated with disorders in multiple brain areas. In this paper, we used information from resting brain networks at different EEG frequency bands to reliably recognize MCI. Because EEG network analysis is influenced by the reference that is used, we also evaluate the effect of the reference choices on the resting scalp EEG network-based MCI differentiation. The conducted study reveals two aspects: (1) the network-based MCI differentiation is superior to the previously reported classification that uses coherence in the EEG; and (2) the used EEG reference influences the differentiation performance, and the zero approximation technique (reference electrode standardization technique, REST) can construct a more accurate scalp EEG network, which results in a higher differentiation accuracy for MCI. This study indicates that the resting scalp EEG-based network analysis could be valuable for MCI recognition in the future. (paper)

  9. Reference values for muscle strength: a systematic review with a descriptive meta-analysis.

    Science.gov (United States)

    Benfica, Poliana do Amaral; Aguiar, Larissa Tavares; Brito, Sherindan Ayessa Ferreira de; Bernardino, Luane Helena Nunes; Teixeira-Salmela, Luci Fuscaldi; Faria, Christina Danielli Coelho de Morais

    2018-05-03

    Muscle strength is an important component of health. To describe and evaluate the studies which have established the reference values for muscle strength in healthy individuals and to synthesize these values with a descriptive meta-analysis approach. A systematic review was performed in MEDLINE, LILACS, and SciELO databases. Studies that investigated the reference values for muscle strength of two or more appendicular/axial muscle groups of healthy individuals were included. Methodological quality, including risk of bias, was assessed by the QUADAS-2. Data extracted included: country of the study, sample size, population characteristics, equipment/method used, and muscle groups evaluated. Of the 414 studies identified, 46 were included. Most of the studies had adequate methodological quality. Included studies evaluated: appendicular (80.4%) and axial (36.9%) muscles; adults (78.3%), elderly (58.7%), adolescents (43.5%), children (23.9%); isometric (91.3%) and isokinetic (17.4%) strength. Six studies (13%) with similar procedures were synthesized with meta-analysis. Generally, the coefficient of variation values that resulted from the meta-analysis ranged from 20.1% to 30% and were similar to those reported by the original studies. The meta-analysis synthesized the reference values of isometric strength of 14 muscle groups of the dominant/non-dominant sides of the upper/lower limbs of adults/elderly from developed countries, using dynamometers/myometers. Most of the included studies had adequate methodological quality. The meta-analysis provided reference values for the isometric strength of 14 appendicular muscle groups of the dominant/non-dominant sides, measured with dynamometers/myometers, of men/women, of adults/elderly. These data may be used to interpret the results of the evaluations and establish appropriate treatment goals. Copyright © 2018 Associação Brasileira de Pesquisa e Pós-Graduação em Fisioterapia. Publicado por Elsevier Editora Ltda. All rights

  10. Reproducible analysis of sequencing-based RNA structure probing data with user-friendly tools

    DEFF Research Database (Denmark)

    Kielpinski, Lukasz Jan; Sidiropoulos, Nikos; Vinther, Jeppe

    2015-01-01

    time also made analysis of the data challenging for scientists without formal training in computational biology. Here, we discuss different strategies for data analysis of massive parallel sequencing-based structure-probing data. To facilitate reproducible and standardized analysis of this type of data...

  11. Stratigraphical analysis of the neoproterozoic sedimentary sequences of the Sao Francisco Basin

    International Nuclear Information System (INIS)

    Martins, Mariela; Lemos, Valesca Brasil

    2007-01-01

    A stratigraphic analysis was performed under the principles of Sequence Stratigraphy on the Neoproterozoic sedimentary sequences of the Sao Francisco Basin (Central Brazil). Three periods of deposition separated by unconformities were recognized in the Sao Francisco Megasequence: (1) Sequences 1 and 2, a Cryogenian glaciogenic sequence, followed by a distal scarp carbonate ramp, developed during stable conditions; (2) Sequence 3, an Upper Cryogenian stack of homoclinal ramps with mixed carbonate-siliciclastic sedimentation, deposited under a progressive influence of compressional stresses of the Brasiliano Cycle; (3) Sequence 4, a Lower Ediacaran shallow platform dominated by siliciclastic sedimentation of molassic nature, the erosion product of the nearby uplifted thrust sheets. Each of the carbonate-bearing sequences presents a distinct δ13C isotopic signature. The superposition to the global curve for carbon isotopic variation allowed the recognition of a major depositional hiatus between the Paranoa and Sao Francisco Megasequences, and suggested that the glacial diamictite deposition (Jequitai Formation) took place most probably around 800 Ma. This constrains the Sao Francisco Megasequence deposition to the interval between 800 and 600 Ma (the known ages of the Brasiliano Orogeny define the upper limit). A minor depositional hiatus (700-680 Ma) was also identified separating Sequences 2 and 3. Isotopic analyses suggest that from then on, more restricted environmental conditions were established in the basin, probably associated with a first order global event, which prevailed throughout deposition of Sequence 3. (author)

  12. Isolation and sequence analysis of a cDNA clone encoding the fifth complement component

    DEFF Research Database (Denmark)

    Lundwall, Åke B; Wetsel, Rick A; Kristensen, Torsten

    1985-01-01

    DNA clone of 1.85 kilobase pairs was isolated. Hybridization of the mixed-sequence probe to the complementary strand of the plasmid insert and sequence analysis by the dideoxy method predicted the expected protein sequence of C5a (positions 1-12), amino-terminal to the anticipated priming site. The sequence......, subcloned into M13 mp8, and sequenced at random by the dideoxy technique, thereby generating a contiguous sequence of 1703 base pairs. This clone contained coding sequence for the C-terminal 262 amino acid residues of the beta-chain, the entire C5a fragment, and the N-terminal 98 residues of the alpha......'-chain. The 3' end of the clone had a polyadenylated tail preceded by a polyadenylation recognition site, a 3'-untranslated region, and base pairs homologous to the human Alu consensus sequence. Comparison of the derived partial human C5 protein sequence with that previously determined for murine C3 and human...

  13. Oasis: online analysis of small RNA deep sequencing data.

    Science.gov (United States)

    Capece, Vincenzo; Garcia Vizcaino, Julio C; Vidal, Ramon; Rahman, Raza-Ur; Pena Centeno, Tonatiuh; Shomroni, Orr; Suberviola, Irantzu; Fischer, Andre; Bonn, Stefan

    2015-07-01

    Oasis is a web application that allows for the fast and flexible online analysis of small-RNA-seq (sRNA-seq) data. It was designed for the end user in the lab, providing an easy-to-use web frontend including video tutorials, demo data and best practice step-by-step guidelines on how to analyze sRNA-seq data. Oasis' exclusive selling points are a differential expression module that allows for the multivariate analysis of samples, a classification module for robust biomarker detection and an advanced programming interface that supports the batch submission of jobs. Both modules include the analysis of novel miRNAs, miRNA targets and functional analyses including GO and pathway enrichment. Oasis generates downloadable interactive web reports for easy visualization, exploration and analysis of data on a local system. Finally, Oasis' modular workflow enables the rapid (re-)analysis of data. Oasis is implemented in Python, R, Java, PHP, C++ and JavaScript. It is freely available at http://oasis.dzne.de. Contact: stefan.bonn@dzne.de. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  14. Establishment of screening technique for mutant cell and analysis of base sequence in the mutation

    International Nuclear Information System (INIS)

    Sofuni, Toshio; Nomi, Takehiko; Yamada, Masami; Masumura, Kenichi

    2000-01-01

    This research project aimed to establish an easy and quick detection method for radiation-induced mutation using molecular-biological techniques and an effective method for analyzing the molecular changes in base sequence. In this year, Spi mutants derived from γ-irradiated mice were analyzed by PCR and DNA sequencing. Male transgenic mice were exposed to γ-rays at 5, 10 and 50 Gy and the transgene was recovered from the genomic DNA of the spleen by the in vivo packaging method. Spi mutant plaques were obtained by infecting E. coli with the recovered phage. Sequence analysis of the mutants was performed using an ALFred DNA sequencer and the SequiTherm(TM) Long-Read Cycle sequencing kit. Sequence analysis was carried out for 41 of the 50 independent Spi mutants obtained. The deletions were classified into 4 groups. Group 1 included 15 mutants characterized by a large deletion (43 bp-10 kb) with a short homologous sequence. Group 2 included 11 mutants with a large deletion having no homologous sequence at the connecting region. Group 3 included 11 mutants having a short deletion of less than 20 bp, which occurred in the non-repetitive sequence of the gam gene and was possibly caused by oxidative breakage of DNA or recombination of DNA fragments produced by the breakage. Group 4 included 4 mutants having deletions of 20 bp or less in the repetitive sequence of the gam gene, resulting in an alteration of the reading frame. Thus, the synthesis of Gam protein was terminated by the appearance of TGA between codons 13 and 14 of the redB gene, leading to inactivation of the gam gene and the redBA gene. These results indicated that most Spi mutants had a deletion in the red/gam region and that the deletions in more than half of the mutants occurred in homologous sequences as short as 8 bp. (M.N.)

  15. The BsaHI restriction-modification system: Cloning, sequencing and analysis of conserved motifs

    Directory of Open Access Journals (Sweden)

    Roberts Richard J

    2008-05-01

    Background: Restriction and modification enzymes typically recognise short DNA sequences of between two and eight bases in length. Understanding the mechanism of this recognition represents a significant challenge that we begin to address for the BsaHI restriction-modification system, which recognises the six base sequence GRCGYC. Results: The DNA sequences of the genes for the BsaHI methyltransferase, bsaHIM, and restriction endonuclease, bsaHIR, have been determined (GenBank accession #EU386360), cloned and expressed in E. coli. Both the restriction endonuclease and methyltransferase enzymes share significant similarity with a group of 6 other enzymes comprising the restriction-modification systems HgiDI and HgiGI and the putative HindVP, NlaCORFDP, NpuORFC228P and SplZORFNP restriction-modification systems. A sequence alignment of these homologues shows that their amino acid sequences are largely conserved and highlights several motifs of interest. We target one such conserved motif, reading SPERRFD, at the C-terminal end of the bsaHIR gene. A mutational analysis of these amino acids indicates that the motif is crucial for enzymatic activity. Sequence alignment of the methyltransferase gene reveals a short motif within the target recognition domain that is conserved among enzymes recognising the same sequences. Thus, this motif may be used as a diagnostic tool to define the recognition sequences of the cytosine C5 methyltransferases. Conclusion: We have cloned and sequenced the BsaHI restriction and modification enzymes. We have identified a region of the R.BsaHI enzyme that is crucial for its activity. Analysis of the amino acid sequence of the BsaHI methyltransferase enzyme led us to propose two new motifs that can be used in the diagnosis of the recognition sequence of the cytosine C5-methyltransferases.
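
    The six-base recognition sequence GRCGYC quoted in this record is written with IUPAC ambiguity codes (R = A or G, Y = C or T), so it stands for four concrete sequences. As an illustration only, and not the authors' method, a minimal Python sketch of locating such a degenerate site in a DNA string could look like the following; the function names and the example sequence are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes; only a subset is listed for this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}


def site_to_regex(site):
    """Translate a degenerate recognition site into a regex pattern string."""
    return "".join(IUPAC[base] for base in site.upper())


def find_sites(sequence, site):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead is used so that overlapping occurrences are not skipped.
    pattern = re.compile("(?=(" + site_to_regex(site) + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical example sequence containing GACGTC and GGCGCC.
    print(find_sites("TTGACGTCAAGGCGCCTT", "GRCGYC"))
```

    The sketch only scans the strand it is given and covers only the ambiguity codes listed in the table.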

  16. A base composition analysis of natural patterns for the preprocessing of metagenome sequences.

    Science.gov (United States)

    Bonham-Carter, Oliver; Ali, Hesham; Bastola, Dhundy

    2013-01-01

    On the pretext that sequence reads and contigs often exhibit the same kinds of base usage that is also observed in the sequences from which they are derived, we offer a base composition analysis tool. Our tool uses these natural patterns to determine relatedness across sequence data. We introduce spectrum sets (sets of motifs) which are permutations of bacterial restriction sites and the base composition analysis framework to measure their proportional content in sequence data. We suggest that this framework will increase the efficiency during the pre-processing stages of metagenome sequencing and assembly projects. Our method is able to differentiate organisms and their reads or contigs. The framework shows how to successfully determine the relatedness between these reads or contigs by comparison of base composition. In particular, we show that two types of organismal-sequence data are fundamentally different by analyzing their spectrum set motif proportions (coverage). By the application of one of the four possible spectrum sets, encompassing all known restriction sites, we provide the evidence to claim that each set has a different ability to differentiate sequence data. Furthermore, we show that the spectrum set selection having relevance to one organism, but not to the others of the data set, will greatly improve performance of sequence differentiation even if the fragment size of the read, contig or sequence is not lengthy. We show the proof of concept of our method by its application to ten trials of two or three freshly selected sequence fragments (reads and contigs) for each experiment across the six organisms of our set. Here we describe a novel and computationally effective pre-processing step for metagenome sequencing and assembly tasks. Furthermore, our base composition method has applications in phylogeny where it can be used to infer evolutionary distances between organisms based on the notion that related organisms often have much conserved code.
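
    The abstract describes measuring the proportional content of a set of motifs in sequence data and comparing sequences on that basis, but it does not give the exact definition used. The Python sketch below is therefore an assumption-laden illustration: it takes "proportion" to mean the fraction of overlapping windows equal to a motif and compares two profiles by Euclidean distance. Both choices, and all names, are hypothetical.

```python
def motif_proportions(sequence, motifs):
    """Map each motif to the fraction of overlapping windows that equal it.

    Assumption: this window-based definition is illustrative; the source
    abstract does not specify how proportional content is computed.
    """
    sequence = sequence.upper()
    result = {}
    for motif in motifs:
        k = len(motif)
        windows = len(sequence) - k + 1
        if k == 0 or windows <= 0:
            result[motif] = 0.0
            continue
        hits = sum(1 for i in range(windows) if sequence[i:i + k] == motif.upper())
        result[motif] = hits / windows
    return result


def profile_distance(profile_a, profile_b):
    """Euclidean distance between two profiles built over the same motif set."""
    return sum((profile_a[m] - profile_b[m]) ** 2 for m in profile_a) ** 0.5


if __name__ == "__main__":
    motifs = ["GAATTC", "GGATCC"]  # example restriction-site motifs
    a = motif_proportions("GAATTCGAATTC", motifs)
    b = motif_proportions("GGATCCAAAAAA", motifs)
    print(a, b, profile_distance(a, b))
```

    A smaller distance between two profiles would suggest more similar motif usage under these assumptions.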

  17. Expressed sequence tags as a tool for phylogenetic analysis of placental mammal evolution.

    Directory of Open Access Journals (Sweden)

    Morgan Kullberg

    BACKGROUND: We investigate the usefulness of expressed sequence tags, ESTs, for establishing divergences within the tree of placental mammals. This is done on the example of the established relationships among primates (human), lagomorphs (rabbit), rodents (rat and mouse), artiodactyls (cow), carnivorans (dog) and proboscideans (elephant). METHODOLOGY/PRINCIPAL FINDINGS: We have produced 2000 ESTs (1.2 mega bases) from a marsupial mouse and characterized the data for their use in phylogenetic analysis. The sequences were used to identify putative orthologous sequences from whole genome projects. Although most ESTs stem from single sequence reads, the frequency of potential sequencing errors was found to be lower than allelic variation. Most of the sequences represented slowly evolving housekeeping-type genes, with an average amino acid distance of 6.6% between human and mouse. Positive Darwinian selection was identified at only a few single sites. Phylogenetic analyses of the EST data yielded trees that were consistent with those established from whole genome projects. CONCLUSIONS: The general quality of EST sequences and the general absence of positive selection in these sequences make ESTs an attractive tool for phylogenetic analysis. The EST approach allows, at reasonable costs, a fast extension of data sampling from species outside the genome projects.

  18. Multivariate methods for analysis of environmental reference materials using laser-induced breakdown spectroscopy

    Directory of Open Access Journals (Sweden)

    Shikha Awasthi

    2017-06-01

    Analysis of emission from laser-induced plasma has a unique capability for quantifying the major and minor elements present in any type of sample under optimal analysis conditions. Chemometric techniques are very effective and reliable tools for quantification of multiple components in complex matrices. The feasibility of laser-induced breakdown spectroscopy (LIBS) in combination with multivariate analysis was investigated for the analysis of environmental reference materials (RMs). In the present work, different (Certified/Standard) Reference Materials of soil and plant origin were analyzed using LIBS and the presence of Al, Ca, Mg, Fe, K, Mn and Si was identified in the LIBS spectra of these materials. Multivariate statistical methods (Partial Least Square Regression and Partial Least Square Discriminant Analysis) were employed for quantitative analysis of the constituent elements using the LIBS spectral data. Calibration models were used to predict the concentrations of the different elements of test samples and subsequently, the concentrations were compared with certified concentrations to check the authenticity of the models. The non-destructive analytical method of Instrumental Neutron Activation Analysis (INAA), using high flux reactor neutrons and high resolution gamma-ray spectrometry, was also used for intercomparison of the LIBS results for two RMs.

  19. No-reference analysis of decoded MPEG images for PSNR estimation and post-processing

    DEFF Research Database (Denmark)

    Forchhammer, Søren; Li, Huiying; Andersen, Jakob Dahl

    2011-01-01

    We propose no-reference analysis and processing of DCT (Discrete Cosine Transform) coded images based on estimation of selected MPEG parameters from the decoded video. The goal is to assess MPEG video quality and perform post-processing without access to either the original stream or the code...... stream. Solutions are presented for MPEG-2 video. A method to estimate the quantization parameters of DCT coded images and MPEG I-frames at the macro-block level is presented. The results of this analysis are used for deblocking and deringing artifact reduction and no-reference PSNR estimation without...... code stream access. An adaptive deringing method using texture classification is presented. On the test set, the quantization parameters in MPEG-2 I-frames are estimated with an overall accuracy of 99.9% and the PSNR is estimated with an overall average error of 0.3dB. The deringing and deblocking...
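
    For orientation, the PSNR that this record estimates without a reference is conventionally defined against the original image as 10·log10(peak² / MSE). The Python sketch below computes that standard full-reference value for 8-bit pixel data; it is not the no-reference estimator of the paper, and the function name and flat-list pixel representation are assumptions made for illustration.

```python
import math


def psnr(reference, decoded, peak=255):
    """Full-reference PSNR in dB between two equal-length pixel sequences.

    Assumption: pixels are given as flat sequences of numbers with a known
    peak value (255 for 8-bit data).
    """
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    decoded = [101, 99, 101, 99]  # every pixel off by one, so MSE = 1
    print(round(psnr(original, decoded), 2))
```

    A no-reference method has to predict this quantity from the decoded data alone, which is why the abstract reports its accuracy as an average error in dB.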

  20. Aluminium-gold reference material for the k0-standardisation of neutron activation analysis

    International Nuclear Information System (INIS)

    Ingelbrecht, C.; Peetermans, F.; Corte, F. de; Wispelaere, A. de; Vandecasteele, C.; Courtijn, E.; Hondt, P. d'

    1991-01-01

    Gold is an excellent comparator material for the k0-standardisation of neutron activation analysis because of its convenient and well defined nuclear properties. The most suitable form for a reference material is a dilute aluminium-gold alloy, for which the self-shielding effect for neutrons is small. Castings of composition Al-0.1 wt.% Au were prepared by crucible-less levitation melting, which gives close control of ingot composition with minimal contamination of the melt. The alloy composition was checked using induction-coupled plasma source emission spectrometry. The homogeneity of the alloy was measured by neutron activation analysis and a relative standard deviation of the gold content of 0.30% was found (10 mg samples). Metallography revealed a homogeneous distribution of AuAl2 particles. The alloy was certified as Reference Material CBNM-530, with certified gold mass fraction 0.100±0.002 wt.%. (orig.)

  1. Homogeneity study on biological candidate reference materials: the role of neutron activation analysis

    Energy Technology Data Exchange (ETDEWEB)

    Silva, Daniel P.; Moreira, Edson G., E-mail: dsilva.pereira@usp.br [Instituto de Pesquisas Energeticas e Nucleares (IPEN/CNEN-SP), Sao Paulo, SP (Brazil)]

    2015-07-01

    Instrumental Neutron Activation Analysis (INAA) is a mature nuclear analytical technique able to accurately determine chemical elements without the need for sample digestion and, hence, without the associated problems of analyte loss or contamination. This feature, along with its potential use as a primary method of analysis, makes it an important tool for the characterization of new reference materials and for the assessment of their homogeneity status. In this study, the ability of the comparative method of INAA to assess the within-bottle homogeneity of K, Mg, Mn and V in a mussel reference material was investigated. Method parameters, such as irradiation time, sample decay time and distance from sample to detector, were varied in order to allow element determination in subsamples of different sample masses in duplicate. Sample masses were in the range of 1 to 250 mg, and the limitations imposed by the detection limit for small sample masses and by dead time distortions for large sample masses were investigated. (author)

  2. Complete chloroplast genome sequence of MD-2 pineapple and its comparative analysis among nine other plants from the subclass Commelinidae.

    Science.gov (United States)

    Redwan, R M; Saidin, A; Kumar, S V

    2015-08-12

    Pineapple (Ananas comosus var. comosus) is known as the king of fruits for its crown and is the third most important tropical fruit after banana and citrus. The plant, which is indigenous to South America, is the most important species in the Bromeliaceae family and is largely traded for fresh fruit consumption. Here, we report the complete chloroplast sequence of the MD-2 pineapple that was sequenced using the PacBio sequencing technology. In this study, the high error rate of PacBio long sequence reads of A. comosus's total genomic DNA was reduced by leveraging the highly accurate but short Illumina reads for error correction via the latest error correction module from Novocraft. Error-corrected long PacBio reads were assembled using a single tool to produce a contig representing the pineapple chloroplast genome. The genome, 159,636 bp in length, features the conserved quadripartite structure of chloroplasts, containing a large single copy region (LSC) of 87,482 bp, a small single copy region (SSC) of 18,622 bp and two inverted repeat regions (IRA and IRB) of 26,766 bp each. Overall, the genome contained 117 unique coding regions, of which 30 were repeated in the IR region, with gene content, structure and arrangement similar to those of its sister taxon, Typha latifolia. A total of 35 repeat structures were detected in both the coding and non-coding regions, the majority being tandem repeats. In addition, 205 SSRs were detected in the genome, with six protein-coding genes containing more than two SSRs. Comparative chloroplast genomes from the subclass Commelinidae revealed a conserved protein-coding gene, albeit located in a highly divergent region. Analysis of selection pressure on protein-coding genes using the Ka/Ks ratio showed significant positive selection exerted on the rps7 gene of the pineapple chloroplast, with P less than 0.05. Phylogenetic analysis confirmed the recent taxonomical relation among the member of
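
    The region sizes reported in this record can be checked against the stated genome length with simple arithmetic, since a quadripartite chloroplast genome consists of one LSC, one SSC and two inverted repeat copies. The following is a minimal sketch of that check. The function name is an illustrative assumption, and the numbers are those given in the abstract.

```python
def quadripartite_length(lsc, ssc, inverted_repeat):
    """Total length of a quadripartite genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * inverted_repeat


# Region sizes (bp) as reported in the abstract above.
total_bp = quadripartite_length(lsc=87_482, ssc=18_622, inverted_repeat=26_766)
print(total_bp)  # 159636, matching the reported genome length of 159,636 bp
```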

  3. Cloning and sequence analysis of hyaluronoglucosaminidase (nagH) gene of Clostridium chauvoei

    Directory of Open Access Journals (Sweden)

    Saroj K. Dangi

    2017-09-01

    Aim: Blackleg disease is caused by Clostridium chauvoei in ruminants. Although virulence factors such as C. chauvoei toxin A, sialidase, and flagellin are well characterized, the hyaluronidases of C. chauvoei are not. The present study was aimed at cloning and sequence analysis of the hyaluronoglucosaminidase (nagH) gene of C. chauvoei. Materials and Methods: C. chauvoei strain ATCC 10092 was grown in ATCC 2107 media and confirmed by polymerase chain reaction (PCR) using primers specific for the 16-23S rDNA spacer region. The nagH gene of C. chauvoei was amplified and cloned into pRham-SUMO vector and transformed into Escherichia cloni 10G cells. The construct was then transformed into E. cloni cells. Colony PCR was carried out to screen the colonies, followed by sequencing of the nagH gene in the construct. Results: PCR amplification yielded a nagH gene product of 1143 bp, which was cloned in a prokaryotic expression system. Colony PCR, as well as sequencing of the nagH gene, confirmed the presence of the insert. The sequence was then subjected to NCBI BLAST analysis, which confirmed that the sequence was indeed that of the nagH gene of C. chauvoei. Phylogenetic analysis of the sequence showed that it is closely related to Clostridium perfringens and Clostridium paraputrificum. Conclusion: The gene for virulence factor nagH was cloned into a prokaryotic expression vector and confirmed by sequencing.

  4. Analysis of Multiple Genomic Sequence Alignments: A Web Resource, Online Tools, and Lessons Learned From Analysis of Mammalian SCL Loci

    Science.gov (United States)

    Chapman, Michael A.; Donaldson, Ian J.; Gilbert, James; Grafham, Darren; Rogers, Jane; Green, Anthony R.; Göttgens, Berthold

    2004-01-01

    Comparative analysis of genomic sequences is becoming a standard technique for studying gene regulation. However, only a limited number of tools are currently available for the analysis of multiple genomic sequences. An extensive data set for the testing and training of such tools is provided by the SCL gene locus. Here we have expanded the data set to eight vertebrate species by sequencing the dog SCL locus and by annotating the dog and rat SCL loci. To provide a resource for the bioinformatics community, all SCL sequences and functional annotations, comprising a collation of the extensive experimental evidence pertaining to SCL regulation, have been made available via a Web server. A Web interface to new tools specifically designed for the display and analysis of multiple sequence alignments was also implemented. The unique SCL data set and new sequence comparison tools allowed us to perform a rigorous examination of the true benefits of multiple sequence comparisons. We demonstrate that multiple sequence alignments are, overall, superior to pairwise alignments for identification of mammalian regulatory regions. In the search for individual transcription factor binding sites, multiple alignments markedly increase the signal-to-noise ratio compared to pairwise alignments. PMID:14718377

  5. A reference methylome database and analysis pipeline to facilitate integrative and comparative epigenomics.

    Directory of Open Access Journals (Sweden)

    Qiang Song

    DNA methylation is implicated in a surprising diversity of regulatory, evolutionary processes and diseases in eukaryotes. The introduction of whole-genome bisulfite sequencing has enabled the study of DNA methylation at a single-base resolution, revealing many new aspects of DNA methylation and highlighting the usefulness of methylome data in understanding a variety of genomic phenomena. As the number of publicly available whole-genome bisulfite sequencing studies reaches into the hundreds, reliable and convenient tools for comparing and analyzing methylomes become increasingly important. We present MethPipe, a pipeline for both low and high-level methylome analysis, and MethBase, an accompanying database of annotated methylomes from the public domain. Together these resources enable researchers to extract interesting features from methylomes and compare them with those identified in public methylomes in our database.

  6. A comparative neutron activation analysis study of common generic manipulated and reference medicines commercialized in Brazil

    International Nuclear Information System (INIS)

    Leal, A.S.; Menezes, M.A.B.C.; Rodrigues, R.R.; Andonie, O.; Vermaercke, P.; Sneyers, L.

    2008-01-01

    In this work, a comparative study of neutron activation analysis (NAA) was performed by the nuclear institutes: CDTN/CNEN-Brazil, CCHEN-Chile and the SCK.CEN-Belgium aiming to investigate some generic, manipulated and reference medicines largely commercialized in Brazil. Some impurities such as: As, Ba, Br, Ce, Co, Cr, Eu, Fe, Hf, Sb, Sc, Sm, Ti and Zn were found, and the heterogeneity of the samples pointed out the lack of an efficient public system of quality control

  7. Neutron activation analysis of reference materials by the k0 standardization and relative methods

    Energy Technology Data Exchange (ETDEWEB)

    Freitas, M C; Martinho, E [LNETI/ICEN, Sacavem (Portugal)]

    1989-04-15

    Instrumental neutron activation analysis with the k0-standardization method was applied to eight geological, environmental and biological reference materials, including leaves, blood, fish, sediments, soils and limestone. To a first approximation, the results were normally distributed around the certified values with a standard deviation of 10%. Results obtained by using the relative method based on well characterized multi-element standards for IAEA CRM Soil-7 are reported.

  8. Third research coordination meeting on reference database for neutron activation analysis. Summary report

    International Nuclear Information System (INIS)

    Kellett, M.A.

    2009-12-01

    The third meeting of the Co-ordinated Research Project on 'Reference Database for Neutron Activation Analysis' was held at the IAEA, Vienna from 17-19 November 2008. A summary is given of the presentations made by participants, together with reports on specific tasks and the subsequent discussions. With the aim of finalising the work of this CRP and in order to meet initial objectives, outputs were discussed and detailed task assignments agreed upon. (author)

  9. Precision and Accuracy of k0-NAA Method for Analysis of Multi Elements in Reference Samples

    International Nuclear Information System (INIS)

    Sri-Wardani

    2004-01-01

    The accuracy and precision of the k0-NAA method can be determined through the analysis of multiple elements contained in reference samples. The analysis of multiple elements in the SRM 1633b sample gave optimum results with a bias of 20%, which is nevertheless reported as good accuracy and precision. The analysis of As, Cd and Zn in the CCQM-P29 rice flour sample gave very good results, with a bias of 0.5 - 5.6%. (author)
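
    The abstract does not define how bias was computed. A common convention, assumed here, is the relative difference between the measured and the certified value, expressed as a percentage. The sketch below illustrates that convention only. The function name and the example numbers are hypothetical and are not taken from the study.

```python
def relative_bias_percent(measured, certified):
    """Relative bias of a measured value against a certified value, in percent.

    Assumption: bias = 100 * (measured - certified) / certified.
    """
    if certified == 0:
        raise ValueError("certified value must be non-zero")
    return 100.0 * (measured - certified) / certified


# Hypothetical example: a measured mass fraction of 10.5 against a certified 10.0.
example_bias = relative_bias_percent(measured=10.5, certified=10.0)
```

    With this convention a positive result means the method overestimates the certified value, and a negative result means it underestimates it.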

  10. Evaluation of positive Rift Valley fever virus formalin-fixed paraffin embedded samples as a source of sequence data for retrospective phylogenetic analysis.

    Science.gov (United States)

    Mubemba, B; Thompson, P N; Odendaal, L; Coetzee, P; Venter, E H

    2017-05-01

    Rift Valley fever (RVF), caused by an arthropod borne Phlebovirus in the family Bunyaviridae, is a haemorrhagic disease that affects ruminants and humans. Due to the zoonotic nature of the virus, a biosafety level 3 laboratory is required for isolation of the virus. Fresh and frozen samples are the preferred sample type for isolation and acquisition of sequence data. However, these samples are scarce in addition to posing a health risk to laboratory personnel. Archived formalin-fixed, paraffin-embedded (FFPE) tissue samples are safe and readily available, however FFPE derived RNA is in most cases degraded and cross-linked in peptide bonds and it is unknown whether the sample type would be suitable as reference material for retrospective phylogenetic studies. A RT-PCR assay targeting a 490 nt portion of the structural GN glycoprotein encoding gene of the RVFV M-segment was applied to total RNA extracted from archived RVFV positive FFPE samples. Several attempts to obtain target amplicons were unsuccessful. FFPE samples were then analysed using next generation sequencing (NGS), i.e. TruSeq® (Illumina) and sequenced on the MiSeq® genome analyser (Illumina). Using reference mapping, gapped virus sequence data of varying degrees of shallow depth was aligned to a reference sequence. However, the NGS did not yield long enough contigs that consistently covered the same genome regions in all samples to allow phylogenetic analysis. Copyright © 2017 Elsevier B.V. All rights reserved.

  11. Metagenomic Analysis of Slovak Bryndza Cheese Using Next-Generation 16S rDNA Amplicon Sequencing

    Directory of Open Access Journals (Sweden)

    Planý Matej

    2016-06-01

    Knowledge about diversity and taxonomic structure of the microbial population present in traditional fermented foods plays a key role in starter culture selection, safety improvement and quality enhancement of the end product. The aim of this study was to investigate microbial consortia composition in Slovak bryndza cheese. For this purpose, we used a culture-independent approach based on 16S rDNA amplicon sequencing using a next generation sequencing platform. Results obtained by the analysis of three commercial (produced on industrial scale in winter season) and one traditional (artisanal, most valued, produced in May) Slovak bryndza cheese samples were compared. A diverse prokaryotic microflora composed mostly of the genera Lactococcus, Streptococcus, Lactobacillus, and Enterococcus was identified. Lactococcus lactis subsp. lactis and Lactococcus lactis subsp. cremoris were the dominant taxa in all tested samples. The second most abundant species, detected in all bryndza cheeses, were Lactococcus fujiensis and Lactococcus taiwanensis, identified independently by two different approaches using different reference 16S rRNA gene databases (Greengenes and NCBI, respectively). They have been detected in bryndza cheese samples in substantial amounts for the first time. The narrowest microbial diversity was observed in a sample made with a starter culture from pasteurised milk. Metagenomic analysis by high-throughput sequencing using 16S rRNA genes seems to be a powerful tool for studying the structure of the microbial population in cheeses.

  12. WebMGA: a customizable web server for fast metagenomic sequence analysis.

    Science.gov (United States)

    Wu, Sitao; Zhu, Zhengwei; Fu, Liming; Niu, Beifang; Li, Weizhong

    2011-09-07

    The new field of metagenomics studies microorganism communities by culture-independent sequencing. With the advances in next-generation sequencing techniques, researchers are facing tremendous challenges in metagenomic data analysis due to huge quantity and high complexity of sequence data. Analyzing large datasets is extremely time-consuming; also metagenomic annotation involves a wide range of computational tools, which are difficult to be installed and maintained by common users. The tools provided by the few available web servers are also limited and have various constraints such as login requirement, long waiting time, inability to configure pipelines etc. We developed WebMGA, a customizable web server for fast metagenomic analysis. WebMGA includes over 20 commonly used tools such as ORF calling, sequence clustering, quality control of raw reads, removal of sequencing artifacts and contaminations, taxonomic analysis, functional annotation etc. WebMGA provides users with rapid metagenomic data analysis using fast and effective tools, which have been implemented to run in parallel on our local computer cluster. Users can access WebMGA through web browsers or programming scripts to perform individual analysis or to configure and run customized pipelines. WebMGA is freely available at http://weizhongli-lab.org/metagenomic-analysis. WebMGA offers to researchers many fast and unique tools and great flexibility for complex metagenomic data analysis.

  13. WebMGA: a customizable web server for fast metagenomic sequence analysis

    Directory of Open Access Journals (Sweden)

    Niu Beifang

    2011-09-01

    Background: The new field of metagenomics studies microorganism communities by culture-independent sequencing. With the advances in next-generation sequencing techniques, researchers are facing tremendous challenges in metagenomic data analysis due to huge quantity and high complexity of sequence data. Analyzing large datasets is extremely time-consuming; also metagenomic annotation involves a wide range of computational tools, which are difficult to be installed and maintained by common users. The tools provided by the few available web servers are also limited and have various constraints such as login requirement, long waiting time, inability to configure pipelines etc. Results: We developed WebMGA, a customizable web server for fast metagenomic analysis. WebMGA includes over 20 commonly used tools such as ORF calling, sequence clustering, quality control of raw reads, removal of sequencing artifacts and contaminations, taxonomic analysis, functional annotation etc. WebMGA provides users with rapid metagenomic data analysis using fast and effective tools, which have been implemented to run in parallel on our local computer cluster. Users can access WebMGA through web browsers or programming scripts to perform individual analysis or to configure and run customized pipelines. WebMGA is freely available at http://weizhongli-lab.org/metagenomic-analysis. Conclusions: WebMGA offers to researchers many fast and unique tools and great flexibility for complex metagenomic data analysis.

  14. Fit Gap Analysis – The Role of Business Process Reference Models

    Directory of Open Access Journals (Sweden)

    Dejan Pajk

    2013-12-01

    Enterprise resource planning (ERP) systems support solutions for standard business processes such as financial, sales, procurement and warehouse. In order to improve the understandability and efficiency of their implementation, ERP vendors have introduced reference models that describe the processes and underlying structure of an ERP system. To select and successfully implement an ERP system, the capabilities of that system have to be compared with a company’s business needs. Based on this comparison, all of the fits and gaps must be identified and further analysed. This step usually forms part of ERP implementation methodologies and is called fit gap analysis. The paper gives a theoretical overview of methods for applying reference models and describes fit gap analysis processes in detail. The paper’s first contribution is its presentation of a fit gap analysis using standard business process modelling notation. The second contribution is the demonstration of a process-based comparison approach between a supply chain process and an ERP system process reference model. In addition to its theoretical contributions, the results can also be practically applied to projects involving the selection and implementation of ERP systems.

  15. Multivariate reference technique for quantitative analysis of fiber-optic tissue Raman spectroscopy.

    Science.gov (United States)

    Bergholt, Mads Sylvest; Duraipandian, Shiyamala; Zheng, Wei; Huang, Zhiwei

    2013-12-03

    We report a novel method making use of multivariate reference signals of fused silica and sapphire Raman signals generated from a ball-lens fiber-optic Raman probe for quantitative analysis of in vivo tissue Raman measurements in real time. Partial least-squares (PLS) regression modeling is applied to extract the characteristic internal reference Raman signals (e.g., shoulder of the prominent fused silica boson peak (~130 cm⁻¹); distinct sapphire ball-lens peaks (380, 417, 646, and 751 cm⁻¹)) from the ball-lens fiber-optic Raman probe for quantitative analysis of fiber-optic Raman spectroscopy. To evaluate the analytical value of this novel multivariate reference technique, a rapid Raman spectroscopy system coupled with a ball-lens fiber-optic Raman probe is used for in vivo oral tissue Raman measurements (n = 25 subjects) under 785 nm laser excitation powers ranging from 5 to 65 mW. An accurate linear relationship (R² = 0.981) with a root-mean-square error of cross validation (RMSECV) of 2.5 mW can be obtained for predicting the laser excitation power changes based on a leave-one-subject-out cross-validation, which is superior to the normal univariate reference method (RMSE = 6.2 mW). A root-mean-square error of prediction (RMSEP) of 2.4 mW (R² = 0.985) can also be achieved for laser power prediction in real time when we applied the multivariate method independently on the five new subjects (n = 166 spectra). We further apply the multivariate reference technique for quantitative analysis of gelatin tissue phantoms that gives rise to an RMSEP of ~2.0% (R² = 0.998) independent of laser excitation power variations. This work demonstrates that multivariate reference technique can be advantageously used to monitor and correct the variations of laser excitation power and fiber coupling efficiency in situ for standardizing the tissue Raman intensity to realize quantitative analysis of tissue Raman measurements in vivo, which is particularly appealing in

  16. Validation of suitable reference genes for quantitative gene expression analysis in Panax ginseng

    Directory of Open Access Journals (Sweden)

    Meizhen Wang

    2016-01-01

    Reverse transcription-qPCR (RT-qPCR) has become a popular method for gene expression studies. Its results require data normalization by housekeeping genes. No single gene has been proved to be stably expressed under all experimental conditions. Therefore, systematic evaluation of reference genes is necessary. With the aim of identifying optimum reference genes for RT-qPCR analysis of gene expression in different tissues of Panax ginseng and in seedlings grown under heat stress, we investigated the expression stability of eight candidate reference genes, including elongation factor 1-beta (EF1-β), elongation factor 1-gamma (EF1-γ), eukaryotic translation initiation factor 3G (IF3G), eukaryotic translation initiation factor 3B (IF3B), actin (ACT), actin11 (ACT11), glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and cyclophilin ABH-like protein (CYC), using four widely used computational programs: geNorm, Normfinder, BestKeeper, and the comparative ΔCt method. The results were then integrated using the web-based tool RefFinder. As a result, EF1-γ, IF3G and EF1-β were the three most stable genes in different tissues of P. ginseng, while IF3G, ACT11 and GAPDH were the top three-ranked genes in seedlings treated with heat. Using the three better reference genes alone or in combination as internal controls, we examined the expression profiles of MAR, a multiple function-associated mRNA-like non-coding RNA (mlncRNA) in P. ginseng. Taken together, we recommended EF1-γ/IF3G and IF3G/ACT11 as the suitable pairs of reference genes for RT-qPCR analysis of gene expression in different tissues of P. ginseng and in seedlings grown under heat stress, respectively. The results serve as a foundation for future studies on P. ginseng functional genomics.
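
    Of the four programs named above, the comparative ΔCt method is the simplest to illustrate. As commonly described, each candidate gene is compared pairwise with every other candidate: the ΔCt between the two genes is computed per sample, and a gene's stability score is the mean of the standard deviations of those ΔCt series (lower is more stable). The sketch below follows that description under stated assumptions. The gene labels and Ct values are hypothetical placeholders, not data from the study, and the function is not the RefFinder implementation.

```python
from statistics import mean, stdev


def delta_ct_stability(ct_values):
    """Rank candidate reference genes by the comparative delta-Ct approach.

    ct_values maps a gene label to its Ct values, one per sample, with all
    genes measured on the same samples in the same order. Needs at least
    two genes and two samples. Returns (gene, score) pairs, most stable first.
    """
    scores = {}
    for gene, cts in ct_values.items():
        pairwise_sds = []
        for other, other_cts in ct_values.items():
            if other == gene:
                continue
            # Per-sample difference in Ct between the two genes.
            deltas = [a - b for a, b in zip(cts, other_cts)]
            pairwise_sds.append(stdev(deltas))
        scores[gene] = mean(pairwise_sds)
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical Ct values for three placeholder genes across three samples.
example_ranking = delta_ct_stability({
    "geneA": [20.0, 21.0, 22.0],
    "geneB": [18.0, 19.0, 20.0],
    "geneC": [25.0, 23.0, 27.0],
})
```

    In this toy input, geneA and geneB move together across samples, so their mutual ΔCt is constant and both score lower (more stable) than geneC.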

  17. The scale analysis sequence for LWR fuel depletion

    International Nuclear Information System (INIS)

    Hermann, O.W.; Parks, C.V.

    1991-01-01

    The SCALE (Standardized Computer Analyses for Licensing Evaluation) code system is used extensively to perform away-from-reactor safety analysis (particularly criticality safety, shielding, heat transfer analyses) for spent light water reactor (LWR) fuel. Spent fuel characteristics such as radiation sources, heat generation sources, and isotopic concentrations can be computed within SCALE using the SAS2 control module. A significantly enhanced version of the SAS2 control module, which is denoted as SAS2H, has been made available with the release of SCALE-4. For each time-dependent fuel composition, SAS2H performs one-dimensional (1-D) neutron transport analyses (via XSDRNPM-S) of the reactor fuel assembly using a two-part procedure with two separate unit-cell-lattice models. The cross sections derived from a transport analysis at each time step are used in a point-depletion computation (via ORIGEN-S) that produces the burnup-dependent fuel composition to be used in the next spectral calculation. A final ORIGEN-S case is used to perform the complete depletion/decay analysis using the burnup-dependent cross sections. The techniques used by SAS2H and two recent applications of the code are reviewed in this paper. 17 refs., 5 figs., 5 tabs

  18. Sequence determination and analysis of the NSs genes of two tospoviruses.

    Science.gov (United States)

    Hallwass, Mariana; Leastro, Mikhail O; Lima, Mirtes F; Inoue-Nagata, Alice K; Resende, Renato O

    2012-03-01

    The tospoviruses groundnut ringspot virus (GRSV) and zucchini lethal chlorosis virus (ZLCV) cause severe losses in many crops, especially in solanaceous and cucurbit species. In this study, the non-structural NSs gene and the 5'UTRs of these two biologically distinct tospoviruses were cloned and sequenced. The NSs sequences of GRSV and ZLCV were both 1,404 nucleotides long. Pairwise comparison showed that the NSs amino acid sequence of GRSV shared 69.6% identity with that of ZLCV and 75.9% identity with that of TSWV, while the NSs sequences of ZLCV and TSWV shared 67.9% identity. Phylogenetic analysis based on NSs sequences confirmed that these viruses cluster in the American clade.
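
    Pairwise identity figures such as those above are typically computed from an alignment of the two sequences. The sketch below shows one common convention, assumed here because the abstract does not state which was used: identical columns divided by all aligned columns, skipping columns where both sequences have a gap. The function name and the short example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: the denominator counts every column except those where both
    sequences carry a gap. Other conventions (e.g. dividing by the shorter
    sequence length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments: three of four columns are identical.
example_identity = percent_identity("MKT-", "MKTA")
```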

  19. Sequencing and phylogenetic analysis of tobacco virus 2, a polerovirus from Nicotiana tabacum.

    Science.gov (United States)

    Zhou, Benguo; Wang, Fang; Zhang, Xuesong; Zhang, Lina; Lin, Huafeng

    2017-07-01

    The complete genome sequence of a new virus, provisionally named tobacco virus 2 (TV2), was determined and identified from leaves of tobacco (Nicotiana tabacum) exhibiting leaf mosaic, yellowing, and deformity, in Anhui Province, China. The genome sequence of TV2 comprises 5,979 nucleotides, with 87% nucleotide sequence identity to potato leafroll virus (PLRV). Its genome organization is similar to that of PLRV, containing six open reading frames (ORFs) that potentially encode proteins with putative functions in cell-to-cell movement and suppression of RNA silencing. Phylogenetic analysis of the nucleotide sequence placed TV2 alongside members of the genus Polerovirus in the family Luteoviridae. To the best of our knowledge, this study is the first report of a complete genome sequence of a new polerovirus identified in tobacco.

  20. An analysis of LOCA sequences in the development of severe accident analysis DB

    International Nuclear Information System (INIS)

    Choi, Young; Park, Soo Yong; Ahn, Kwang-Il; Kim, D.H.

    2006-01-01

    A Level 2 PSA was performed for the Korean Standard Power Plants (KSNPs), and it considered the sequences necessary for an assessment of containment integrity and source term analysis. In terms of accident management, however, more cases causing severe core damage need to be analyzed and arranged systematically for easy access to the results. At present, KAERI is calculating severe accident sequences intensively for various initiating events and generating a database of accident progression, including thermal hydraulic and source term behaviours. The developed Database (DB) system includes a graphical display of plant and equipment status, previous research results obtained by a knowledge-base technique, and the expected plant behaviour. The plant model used in this paper is oriented to LOCA-related severe accident phenomena and thus can simulate the plant behaviour during a severe accident. The developed system may therefore play a central role as an information source for decision-making in severe accident management, and will be used as a training simulator for severe accident management. (author)

  1. Complete sequence analysis reveals two distinct poleroviruses infecting cucurbits in China.

    Science.gov (United States)

    Xiang, Hai-ying; Shang, Qiao-xia; Han, Cheng-gui; Li, Da-wei; Yu, Jia-lin

    2008-01-01

    The complete RNA genomes of a Chinese isolate of cucurbit aphid-borne yellows virus (CABYV-CHN) and a new polerovirus tentatively referred to as melon aphid-borne yellows virus (MABYV) were determined. The entire genome of CABYV-CHN shared 89.0% nucleotide sequence identity with the French CABYV isolate. In contrast, nucleotide sequence identities between MABYV and CABYV and other poleroviruses were in the range of 50.7-74.2%, with amino acid sequence identities ranging from 24.8 to 82.9% for individual gene products. We propose that CABYV-CHN is a strain of CABYV and that MABYV is a member of a tentative distinct species within the genus Polerovirus.

  2. Sequence analysis of serum albumins reveals the molecular evolution of ligand recognition properties.

    Science.gov (United States)

    Fanali, Gabriella; Ascenzi, Paolo; Bernardi, Giorgio; Fasano, Mauro

    2012-01-01

    Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. The multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, thus showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching conserved domains revealed that most SA sequences are made up by three repeated domains (about 600 residues), as extensively characterized for human SA. On the contrary, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar of higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.

  3. Analysis of FDA in-house food reference materials with anticoincidence INAA

    International Nuclear Information System (INIS)

    Anderson, D.L.; Cunningham, W.C.

    2013-01-01

    In-house reference material (IRM) cocoa powder (CCP) has been in use at US Food and Drug Administration laboratories for about 15 years. A single lot of commercial material was originally characterized for 32 elements by several laboratories and five techniques. A unique approach for basis weight determination based upon ambient relative humidity was developed for CCP, eliminating the need for dry weight determinations. The CCP Reference Sheet is updated by incorporating new results approximately every 5 years. The last update occurred in 2006. As part of an effort to revalidate and update values for CCP, anticoincidence instrumental neutron activation analysis (INAA) was used to determine mass fractions for 16 of the originally characterized elements, as well as to provide information on 16 other elements. Results were in very good agreement with 2006 Reference Sheet values. A new candidate IRM, fresh-frozen swordfish (FFSF) powder, was produced by adding inorganic As, Cd, Cr, Hg, Pb, Sb, and Se to liquid nitrogen-frozen commercial swordfish filets which were then homogenized. Portions of FFSF were analyzed by INAA to provide mass fraction and homogeneity information for As, Cd, Cr, Hg, Sb, and Se as well as for eight other elements occurring naturally in the material. Non-homogeneities were ≤2.5 % for As, Br, Cd, and Cs, and ≤1.8 % for Cr, Hg, Rb, Sb, and Se. Certified reference materials DORM-3 Fish Protein powder and fresh-frozen SRM 1947 Lake Michigan Fish Tissue were analyzed as controls. (author)

  4. Single nucleotide polymorphism analysis of Korean native chickens using next generation sequencing data.

    Science.gov (United States)

    Seo, Dong-Won; Oh, Jae-Don; Jin, Shil; Song, Ki-Duk; Park, Hee-Bok; Heo, Kang-Nyeong; Shin, Younhee; Jung, Myunghee; Park, Junhyung; Jo, Cheorun; Lee, Hak-Kyo; Lee, Jun-Heon

    2015-02-01

    There are five native chicken lines in Korea, which are mainly classified by plumage colors (black, white, red, yellow, gray). These five lines are very important genetic resources in the Korean poultry industry. Based on a next generation sequencing technology, whole genome sequence and reference assemblies were performed using Gallus_gallus_4.0 (NCBI) with whole genome sequences from these lines to identify common and novel single nucleotide polymorphisms (SNPs). We obtained 36,660,731,136 ± 1,257,159,120 bp of raw sequence and average 26.6-fold of 25-29 billion reference assembly sequences representing 97.288 % coverage. Also, 4,006,068 ± 97,534 SNPs were observed from 29 autosomes and the Z chromosome and, of these, 752,309 SNPs are the common SNPs across lines. Among the identified SNPs, the number of novel- and known-location assigned SNPs was 1,047,951 ± 14,956 and 2,948,648 ± 81,414, respectively. The number of unassigned known SNPs was 1,181 ± 150 and unassigned novel SNPs was 8,238 ± 1,019. Synonymous SNPs, non-synonymous SNPs, and SNPs having character changes were 26,266 ± 1,456, 11,467 ± 604, 8,180 ± 458, respectively. Overall, 443,048 ± 26,389 SNPs in each bird were identified by comparing with dbSNP in NCBI. The presently obtained genome sequence and SNP information in Korean native chickens have wide applications for further genome studies such as genetic diversity studies to detect causative mutations for economic and disease related traits.

  5. Generation and analysis of expressed sequence tags from the ciliate protozoan parasite Ichthyophthirius multifiliis

    Directory of Open Access Journals (Sweden)

    Arias Covadonga

    2007-06-01

    Background: The ciliate protozoan Ichthyophthirius multifiliis (Ich) is an important parasite of freshwater fish that causes 'white spot disease' leading to significant losses. A genomic resource for large-scale studies of this parasite has been lacking. To study gene expression involved in Ich pathogenesis and virulence, our goal was to generate expressed sequence tags (ESTs) for the development of a powerful microarray platform for the analysis of global gene expression in this species. Here, we initiated a project to sequence and analyze over 10,000 ESTs. Results: We sequenced 10,368 EST clones using a normalized cDNA library made from pooled samples of the trophont, tomont, and theront life-cycle stages, and generated 9,769 sequences (94.2% success rate). Post-sequencing processing led to 8,432 high quality sequences. Clustering analysis of these ESTs allowed identification of 4,706 unique sequences containing 976 contigs and 3,730 singletons. These unique sequences represent over two million base pairs (~10% of Plasmodium falciparum genome, a phylogenetically related protozoan). BLASTX searches produced 2,518 significant (E-value -5 hits and further Gene Ontology (GO) analysis annotated 1,008 of these genes. The ESTs were analyzed comparatively against the genomes of the related protozoa Tetrahymena thermophila and P. falciparum, allowing putative identification of additional genes. All the EST sequences were deposited by dbEST in GenBank (GenBank: EG957858–EG966289). Gene discovery and annotations are presented and discussed. Conclusion: This set of ESTs represents a significant proportion of the Ich transcriptome, and provides a material basis for the development of microarrays useful for gene expression studies concerning Ich development, pathogenesis, and virulence.
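
    The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the "redundancy" figure is a derived quantity that does not appear in the abstract and is included purely as an illustration.

```python
# Counts quoted in the abstract above.
clones_sequenced = 10_368
sequences_generated = 9_769
high_quality = 8_432
contigs = 976
singletons = 3_730


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Sequencing success rate: generated sequences over clones attempted.
success_rate = percent(sequences_generated, clones_sequenced)

# Unique sequences are the contigs plus the singletons left after clustering.
unique_sequences = contigs + singletons

# Illustrative only (not stated in the abstract): share of high-quality
# reads that collapsed into an already-seen sequence during clustering.
redundancy = percent(high_quality - unique_sequences, high_quality)

print(success_rate, unique_sequences, redundancy)
```

    The first two values reproduce the 94.2% success rate and the 4,706 unique sequences given in the abstract.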

  6. Genetic mutation analysis of human gastric adenocarcinomas using ion torrent sequencing platform.

    Directory of Open Access Journals (Sweden)

    Zhi Xu

    Gastric cancer is one of the major causes of cancer-related death, especially in Asia. Gastric adenocarcinoma, the most common type of gastric cancer, is heterogeneous, and its incidence and cause vary widely with geographical region, gender, ethnicity, and diet. Since unique mutations have been observed in individual human cancer samples, identification and characterization of the molecular alterations underlying individual gastric adenocarcinomas is a critical step for developing more effective, personalized therapies. Until recently, identifying genetic mutations on an individual basis by DNA sequencing remained a daunting task. Recent advances in next-generation DNA sequencing technologies, such as the semiconductor-based Ion Torrent sequencing platform, make DNA sequencing cheaper, faster, and more reliable. In this study, we aim to identify genetic mutations in the genes which are targeted by drugs in clinical use or are under development in individual human gastric adenocarcinoma samples using Ion Torrent sequencing. We sequenced 737 loci from 45 cancer-related genes in 238 human gastric adenocarcinoma samples using the Ion Torrent Ampliseq Cancer Panel. The sequencing analysis revealed a high occurrence of mutations along the TP53 locus (9.7%) in our sample set. Thus, this study indicates the utility of a cost and time efficient tool such as Ion Torrent sequencing to screen cancer mutations for the development of personalized cancer therapy.

  7. Plastome Sequence Determination and Comparative Analysis for Members of the Lolium-Festuca Grass Species Complex

    Science.gov (United States)

    Hand, Melanie L.; Spangenberg, German C.; Forster, John W.; Cogan, Noel O. I.

    2013-01-01

    Chloroplast genome sequences are of broad significance in plant biology, due to frequent use in molecular phylogenetics, comparative genomics, population genetics, and genetic modification studies. The present study used a second-generation sequencing approach to determine and assemble the plastid genomes (plastomes) of four representatives from the agriculturally important Lolium-Festuca species complex of pasture grasses (Lolium multiflorum, Festuca pratensis, Festuca altissima, and Festuca ovina). Total cellular DNA was extracted from either roots or leaves, was sequenced, and the output was filtered for plastome-related reads. A comparison between sources revealed fewer plastome-related reads from root-derived template but an increase in incidental bacterium-derived sequences. Plastome assembly and annotation indicated high levels of sequence identity and a conserved organization and gene content between species. However, frequent deletions within the F. ovina plastome appeared to contribute to a smaller plastid genome size. Comparative analysis with complete plastome sequences from other members of the Poaceae confirmed conservation of most grass-specific features. Detailed analysis of the rbcL–psaI intergenic region, however, revealed a “hot-spot” of variation characterized by independent deletion events. The evolutionary implications of this observation are discussed. The complete plastome sequences are anticipated to provide the basis for potential organelle-specific genetic modification of pasture grasses. PMID:23550121

  8. Sequence length variation, indel costs, and congruence in sensitivity analysis

    DEFF Research Database (Denmark)

    Aagesen, Lone; Petersen, Gitte; Seberg, Ole

    2005-01-01

    The behavior of two topological and four character-based congruence measures was explored using different indel treatments in three empirical data sets, each with different alignment difficulties. The analyses were done using direct optimization within a sensitivity analysis framework in which the cost of indels was varied. Indels were treated either as a fifth character state, or strings of contiguous gaps were considered single events by using linear affine gap cost. Congruence consistently improved when indels were treated as single events, but no congruence measure appeared as the obviously preferable one. However, when combining enough data, all congruence measures clearly tended to select the same alignment cost set as the optimal one. Disagreement among congruence measures was mostly caused by a dominant fragment or a data partition that included all or most of the length variation...
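
    The two indel treatments contrasted in this abstract can be illustrated with a small scoring sketch. It is not the authors' direct-optimization procedure: the function names, the example sequence, and the convention that an affine gap costs an opening charge plus a per-position extension charge are all assumptions made for illustration.

```python
import re


def gap_runs(aligned):
    """Return the lengths of all maximal runs of '-' in an aligned sequence."""
    return [len(run) for run in re.findall(r"-+", aligned)]


def fifth_state_cost(aligned, indel_cost):
    """Charge every gap position independently, like a fifth character state."""
    return indel_cost * sum(gap_runs(aligned))


def affine_cost(aligned, gap_open, gap_extend):
    """Treat each contiguous run of gaps as one event.

    Assumed convention: cost of a run = gap_open + gap_extend * run length.
    """
    return sum(gap_open + gap_extend * n for n in gap_runs(aligned))


# Hypothetical aligned sequence with one 3-position gap and one 1-position gap.
row = "AC---GT-A"
print(gap_runs(row))                 # lengths of the gap runs
print(fifth_state_cost(row, 2))      # four gap positions, each charged 2
print(affine_cost(row, 3, 1))        # two gap events, each opened once
```

    Under the fifth-state treatment a long gap is penalized once per position, whereas under the affine treatment most of the penalty falls on opening the gap, so a single long indel is cheaper than several short ones of the same total length.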

  9. Accident sequences and causes analysis in a hydrogen production process

    Energy Technology Data Exchange (ETDEWEB)

    Jae, Moo Sung; Hwang, Seok Won; Kang, Kyong Min; Ryu, Jung Hyun; Kim, Min Soo; Cho, Nam Chul; Jeon, Ho Jun; Jung, Gun Hyo; Han, Kyu Min; Lee, Seng Woo [Hanyang Univ., Seoul (Korea, Republic of)

    2006-03-15

    Since a hydrogen production facility using the IS process requires the high temperature of a nuclear power plant, a safety assessment should be performed to guarantee the safety of the facility. First, accident cases in hydrogen production and utilization were surveyed. Based on the results, risk factors that can arise in a hydrogen production facility were identified. In addition, the correlations between risk factors were schematized using an influence diagram. Initiating events of the hydrogen production facility were also identified, and accident scenario development and quantification were performed. PSA methodology was used to identify initiating events, and a master logic diagram was used as the method for selecting initiating events. Event tree analysis was used to quantify accident scenarios. The sum of all the leakage frequencies is 1.22×10^-4, which is similar to the value (1.0×10^-4) for core damage frequency that the International Nuclear Safety Advisory Group of the IAEA suggested as a criterion.
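
    Event tree quantification of the kind mentioned in this abstract multiplies an initiating-event frequency by the branch probabilities along each accident sequence and then sums the sequences of interest. The sketch below shows only that arithmetic; the initiating frequency, the branch probabilities, and the sequence names are hypothetical and are not taken from the study.

```python
from math import prod


def sequence_frequency(initiating_frequency, branch_probabilities):
    """Frequency of one accident sequence: initiator times all branch probabilities."""
    return initiating_frequency * prod(branch_probabilities)


# Hypothetical inputs, chosen only to illustrate the calculation.
initiating_frequency = 1.0e-2   # initiating events per year (assumed)
p_detection_fails = 0.1         # assumed branch probability
p_isolation_fails = 0.05        # assumed branch probability

# Each sequence lists the branch probabilities along its path in the tree.
sequences = {
    "detected and isolated": [1 - p_detection_fails, 1 - p_isolation_fails],
    "detected, isolation fails": [1 - p_detection_fails, p_isolation_fails],
    "not detected": [p_detection_fails],
}

frequencies = {
    name: sequence_frequency(initiating_frequency, probabilities)
    for name, probabilities in sequences.items()
}

# Sum only the sequences that end in a leak.
leak_frequency = (
    frequencies["detected, isolation fails"] + frequencies["not detected"]
)
print(frequencies, leak_frequency)
```

    Because the branches at each node are exhaustive, the sequence frequencies add up to the initiating-event frequency, which is a convenient consistency check on a hand-built tree.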

  10. Image registration based on virtual frame sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Chen, H.; Ng, W.S. [Nanyang Technological University, Computer Integrated Medical Intervention Laboratory, School of Mechanical and Aerospace Engineering, Singapore (Singapore)]; Shi, D. [Nanyang Technological University, School of Computer Engineering, Singapore (Singapore)]; Wee, S.B. [Tan Tock Seng Hospital, Department of General Surgery, Singapore (Singapore)]

    2007-08-15

    This paper proposes a new framework for medical image registration with large nonrigid deformations, which remains one of the biggest challenges for image fusion and further analysis in many medical applications. The registration problem is formulated as recovering a deformation process with known initial and final states. To deal with large nonlinear deformations, virtual frames are inserted to model the deformation process. A time parameter is introduced, and the deformation between consecutive frames is described with a linear affine transformation. Experiments were conducted with simple geometric deformations as well as complex deformations present in MRI and ultrasound images. All the deformations are characterized by nonlinearity. The positive results demonstrate the effectiveness of this algorithm. The framework proposed in this paper is feasible for registering medical images with large nonlinear deformations and is especially useful for sequential images. (orig.)
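
    The idea of describing a large deformation as a chain of small affine steps between consecutive frames can be sketched as follows. This is not the authors' registration algorithm: it only shows how per-frame affine transformations compose, using a hypothetical example in which a 90-degree rotation is split into nine equal steps. All function names are illustrative.

```python
import math


def affine(a, b, c, d, tx, ty):
    """Build a 2D affine transformation as a 3x3 homogeneous matrix."""
    return [[a, b, tx], [c, d, ty], [0.0, 0.0, 1.0]]


def compose(second, first):
    """Return the matrix that applies `first` and then `second`."""
    return [
        [sum(second[i][k] * first[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def apply(matrix, point):
    """Apply an affine transformation to an (x, y) point."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


def rotation_step(angle):
    """Small rotation used here as the transformation between two frames."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return affine(cos_a, -sin_a, sin_a, cos_a, 0.0, 0.0)


def chain(steps):
    """Accumulate frame-to-frame steps into one overall transformation."""
    total = affine(1.0, 0.0, 0.0, 1.0, 0.0, 0.0)  # identity
    for step in steps:
        total = compose(step, total)
    return total


n_frames = 9  # hypothetical number of steps between initial and final state
steps = [rotation_step((math.pi / 2) / n_frames) for _ in range(n_frames)]
end_point = apply(chain(steps), (1.0, 0.0))
print(end_point)
```

    In this toy case every step is the same global transformation, so the chain is itself affine; the sketch is meant only to show the bookkeeping of accumulating frame-to-frame steps along a time parameter.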

  11. Next-generation sequencing of multiple individuals per barcoded library by deconvolution of sequenced amplicons using endonuclease fragment analysis

    DEFF Research Database (Denmark)

    Andersen, Jeppe D; Pereira, Vania; Pietroni, Carlotta

    2014-01-01

    The simultaneous sequencing of samples from multiple individuals increases the efficiency of next-generation sequencing (NGS) while also reducing costs. Here we describe a novel and simple approach for sequencing DNA from multiple individuals per barcode. Our strategy relies on the endonuclease digestion of PCR amplicons prior to library preparation, creating a specific fragment pattern for each individual that can be resolved after sequencing. By using both barcodes and restriction fragment patterns, we demonstrate the ability to sequence the human melanocortin 1 receptor (MC1R) genes from 72 individuals using only 24 barcoded libraries.

  12. Software reference for SaTool - a Tool for Structural Analysis of Automated Systems

    DEFF Research Database (Denmark)

    Lorentzen, Torsten; Blanke, Mogens

    2004-01-01

    This software reference details the functions of SaTool – a tool for structural analysis of technical systems. SaTool is intended to be used as part of an industrial systems design cycle. Structural analysis is a graph-based technique where principal relations between variables express the system’s... The list of such variables and functional relations constitutes the system’s structure graph. Normal operation means all functional relations are intact. Should faults occur, one or more functional relations cease to be valid. In a structure graph, this is seen as the disappearance of one or more nodes of the graph. SaTool analyzes the structure graph to provide knowledge about fundamental properties of the system in normal and faulty conditions. Salient features of SaTool include rapid analysis of the possibility to diagnose faults and the ability to make autonomous recovery should faults occur.

  13. VisRseq: R-based visual framework for analysis of sequencing data

    OpenAIRE

    Younesy, Hamid; Möller, Torsten; Lorincz, Matthew C; Karimi, Mohammad M; Jones, Steven JM

    2015-01-01

    Background Several tools have been developed to enable biologists to perform initial browsing and exploration of sequencing data. However the computational tool set for further analyses often requires significant computational expertise to use and many of the biologists with the knowledge needed to interpret these data must rely on programming experts. Results We present VisRseq, a framework for analysis of sequencing datasets that provides a computationally rich and accessible framework for ...

  14. Targeted DNA Methylation Analysis by High Throughput Sequencing in Porcine Peri-attachment Embryos

    OpenAIRE

    MORRILL, Benson H.; COX, Lindsay; WARD, Anika; HEYWOOD, Sierra; PRATHER, Randall S.; ISOM, S. Clay

    2013-01-01

    Abstract The purpose of this experiment was to implement and evaluate the effectiveness of a next-generation sequencing-based method for DNA methylation analysis in porcine embryonic samples. Fourteen discrete genomic regions were amplified by PCR using bisulfite-converted genomic DNA derived from day 14 in vivo-derived (IVV) and parthenogenetic (PA) porcine embryos as template DNA. Resulting PCR products were subjected to high-throughput sequencing using the Illumina Genome Analyzer IIx plat...

  15. CloVR-Comparative: automated, cloud-enabled comparative microbial genome sequence analysis pipeline

    OpenAIRE

    Agrawal, Sonia; Arze, Cesar; Adkins, Ricky S.; Crabtree, Jonathan; Riley, David; Vangala, Mahesh; Galens, Kevin; Fraser, Claire M.; Tettelin, Hervé; White, Owen; Angiuoli, Samuel V.; Mahurkar, Anup; Fricke, W. Florian

    2017-01-01

    Background The benefit of increasing genomic sequence data to the scientific community depends on easy-to-use, scalable bioinformatics support. CloVR-Comparative combines commonly used bioinformatics tools into an intuitive, automated, and cloud-enabled analysis pipeline for comparative microbial genomics. Results CloVR-Comparative runs on annotated complete or draft genome sequences that are uploaded by the user or selected via a taxonomic tree-based user interface and downloaded from NCBI. ...

  16. Criteria of reference radionuclides for safety analysis of spent fuel waste disposal

    International Nuclear Information System (INIS)

    Suryanto

    1998-01-01

    A study was carried out on the criteria for selecting reference radionuclides for the assessment of spent fuel disposal. In this study, reference radionuclides are the radionuclides predicted to contribute most of the radiological effect on man if spent fuel waste is disposed of in a deep geological formation. The research critically examined the parameters used to evaluate a radionuclide, in particular those relevant to the disposal of spent fuel waste in a deep geological formation. It was assumed that the spent fuel is disposed of at a depth of 500-1000 meters below the land surface. The migration scenario of radionuclides from the waste form to man was assumed to be a normal release, in which radionuclides leave the waste form and pass in series through the container, buffer, and geological rock to a fracture (fault), then move with groundwater to the biosphere and into the human body. For this scenario, parameters such as radionuclide inventory, half-life, heat generation, and hazard index based on maximum permissible concentration (MPC) or annual limit on intake (ALI) were developed as criteria for reference radionuclide selection. The research concluded that radionuclide inventory, half-life, heat generation, and hazard index based on MPC or ALI can be used as criteria for selecting reference radionuclides. The main radionuclides predicted to give the greatest radiological effect on humans are Cs-137, Sr-90, I-129, Am-243, Cm-244, Pu-238, Pu-239, and Pu-240. These radionuclides are reasonable to use as reference radionuclides in the safety analysis of spent fuel disposal. (author)

  17. Instrumental neutron activation analysis of proposed marine sediment reference material (IAEA-158)

    International Nuclear Information System (INIS)

    Siddique, N.; Waheed, S.

    2009-01-01

    IAEA-158, a sediment prepared by the International Atomic Energy Agency - Marine Environmental Laboratory (IAEA-MEL), Monaco, was received under the IAEA Analytical Quality Control Services (AQCS) Intercomparison Programme. Instrumental Neutron Activation Analysis (INAA) was used to determine Al, As, Br, Ce, Co, Cr, Cs, Eu, Fe, Hf, K, La, Lu, Mn, Na, Nd, Rb, Sb, Sc, Se, Sm, Ta, Tb, Th, V, Yb and Zn in this proposed reference material (RM). Four different irradiation protocols were adopted using a miniature neutron source reactor (MNSR) by varying the irradiation, cooling and counting times. IAEA-405 (Estuarine Sediment) and IAEA-SL-1 (Lake Sediment) were used as compatible matrix reference materials for quality assurance (QA) purposes. Good agreement between our data and IAEA certified values was obtained, providing confidence in the reported data. (author)

  18. Third-Generation Sequencing and Analysis of Four Complete Pig Liver Esterase Gene Sequences in Clones Identified by Screening BAC Library.

    Science.gov (United States)

    Zhou, Qiongqiong; Sun, Wenjuan; Liu, Xiyan; Wang, Xiliang; Xiao, Yuncai; Bi, Dingren; Yin, Jingdong; Shi, Deshi

    2016-01-01

    Pig liver carboxylesterase (PLE) gene sequences in GenBank are incomplete, which has led to difficulties in studying the genetic structure and regulation mechanisms of gene expression of PLE family genes. The aim of this study was to obtain and analyze complete gene sequences of the PLE family by screening a Rongchang pig BAC library and third-generation PacBio gene sequencing. After a number of existing incomplete PLE isoform gene sequences were analysed, primers were designed based on conserved regions in PLE exons, and the whole pig genome was used as a template for polymerase chain reaction (PCR) amplification. Specific primers were then selected based on the PCR amplification results. A three-step PCR screening method was used to identify PLE-positive clones by screening a Rongchang pig BAC library, and PacBio third-generation sequencing was performed. BLAST comparisons and other bioinformatics methods were applied for sequence analysis. Five PLE-positive BAC clones, designated BAC-10, BAC-70, BAC-75, BAC-119 and BAC-206, were identified. Sequence analysis yielded the complete sequences of four PLE genes, PLE1, PLE-B9, PLE-C4, and PLE-G2. Complete PLE gene sequences were defined as those containing regulatory sequences, exons, and introns. It was found not only that the PLE exon sequences of the four genes show a high degree of homology, but also that the intron sequences are highly similar. Additionally, the regulatory region of the genes contained two 720-bp reverse complement sequences that may have an important function in the regulation of PLE gene expression. This is the first report to confirm the complete sequences of four PLE genes. In addition, the study demonstrates that each PLE isoform is encoded by a single gene and that the various genes exhibit a high degree of sequence homology, suggesting that the PLE family evolved from a single ancestral gene. Obtaining the complete sequences of these PLE genes provides the necessary foundation for

  19. Prepare of microanalysis reference material for nuclear analysis of Chinese ancient ceramic

    International Nuclear Information System (INIS)

    Feng Songlin; Xu Qing; Feng Xiangqian; Fan Dongyu; Lei Yong; Cheng Lin

    2005-01-01

    Some analytical techniques can play an important role in identifying the provenance and age of ceramic ware. However, destructive analysis of a valuable intact porcelain ware is usually not allowed. Analysis methods such as X-ray Fluorescence (XRF), Proton Induced X-ray Emission (PIXE), and Synchrotron Radiation X-ray Fluorescence (SRXRF) are suitable for nondestructive analysis of ancient ceramic wares. To compare the analytical data obtained by different measuring methods and to identify provenance and age accurately, the effective way is to calibrate the elemental concentrations in the body and glaze of the ceramic ware. A microanalysis reference material (MRM) of ancient ceramic has to be prepared to achieve quantitative analysis. A solid powder, 99% of it at a size of 500 mesh, has been prepared as an MRM at the Institute of High Energy Physics. A minimum analytical mass of 1 mg was determined by Neutron Activation Analysis (NAA) for the elements Sc, Cr, Co, Rb, Cs, La, Ce, Nd, Sm, Tb, Yb, Lu, Hf, Ta, Th, and U, and by SRXRF for the elements K, Ca, Ti, Mn, Fe, Zn, Rb, and Sr.

  20. Multicomponent quantitative spectroscopic analysis without reference substances based on ICA modelling.

    Science.gov (United States)

    Monakhova, Yulia B; Mushtakova, Svetlana P

    2017-05-01

    A fast and reliable spectroscopic method for multicomponent quantitative analysis of targeted compounds with overlapping signals in complex mixtures has been established. The innovative analytical approach is based on the preliminary chemometric extraction of qualitative and quantitative information from UV-vis and IR spectral profiles of a calibration system using independent component analysis (ICA). Using this quantitative model and ICA resolution results of spectral profiling of "unknown" model mixtures, the absolute analyte concentrations in multicomponent mixtures and authentic samples were then calculated without reference solutions. Good recoveries generally between 95% and 105% were obtained. The method can be applied to any spectroscopic data that obey the Beer-Lambert-Bouguer law. The proposed method was tested on analysis of vitamins and caffeine in energy drinks and aromatic hydrocarbons in motor fuel with 10% error. The results demonstrated that the proposed method is a promising tool for rapid simultaneous multicomponent analysis in the case of spectral overlap and the absence/inaccessibility of reference materials.
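
    The additivity assumption behind this record (spectra that obey the Beer-Lambert-Bouguer law) can be illustrated with a two-component, two-wavelength example. The sketch does not implement ICA and is not the authors' method: it assumes the absorptivities are already known, and all numerical values are hypothetical.

```python
def mixture_absorbance(absorptivities, concentrations, path_length=1.0):
    """Beer-Lambert additivity: A(wavelength) = l * sum(eps_i * c_i).

    absorptivities has one row per wavelength and one column per component.
    """
    return [
        path_length * sum(eps * c for eps, c in zip(row, concentrations))
        for row in absorptivities
    ]


def solve_two_components(absorptivities, absorbance, path_length=1.0):
    """Recover two concentrations from absorbances at two wavelengths."""
    (a, b), (c, d) = absorptivities
    det = a * d - b * c
    if det == 0:
        raise ValueError("the two component spectra are not distinguishable")
    y1, y2 = (value / path_length for value in absorbance)
    # Cramer's rule for the 2x2 linear system.
    return ((y1 * d - b * y2) / det, (a * y2 - c * y1) / det)


# Hypothetical absorptivities and "true" concentrations.
eps = [[2.0, 0.5], [0.4, 1.5]]
true_conc = (0.3, 0.8)

measured = mixture_absorbance(eps, true_conc)
found = solve_two_components(eps, measured)

# Recovery in percent, the figure of merit quoted in the abstract.
recoveries = [100.0 * f / t for f, t in zip(found, true_conc)]
print(found, recoveries)
```

    With noise-free synthetic data the recoveries are essentially 100%; the 95% to 105% range reported in the abstract reflects real measurements, where the component spectra themselves are estimated from the data.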

  1. Genome sequencing and analysis of BCG vaccine strains.

    Directory of Open Access Journals (Sweden)

    Wen Zhang

    BACKGROUND: Although the Bacillus Calmette-Guérin (BCG) vaccine against tuberculosis (TB) has been available for more than 75 years, one third of the world's population is still infected with Mycobacterium tuberculosis and approximately 2 million people die of TB every year. To reduce this immense TB burden, a clearer understanding of the functional genes underlying the action of BCG and the development of new vaccines are urgently needed. METHODS AND FINDINGS: Comparative genomic analysis of 19 M. tuberculosis complex strains showed that BCG strains underwent repeated human manipulation, had higher region of deletion rates than those of natural M. tuberculosis strains, and lost several essential components such as T-cell epitopes. A total of 188 BCG strain T-cell epitopes were lost to various degrees. The non-virulent BCG Tokyo strain, which has the largest number of T-cell epitopes (359), lost 124. Here we propose that BCG strain protection variability results from different epitopes. This study is the first to present BCG as a model organism for genetics research. BCG strains have a very well-documented history and now detailed genome information. Genome comparison revealed the selection process of BCG strains under human manipulation (1908-1966). CONCLUSIONS: Our results revealed the cause of BCG vaccine strain protection variability at the genome level and supported the hypothesis that the restoration of lost BCG Tokyo epitopes is a useful future vaccine development strategy. Furthermore, these detailed BCG vaccine genome investigation results will be useful in microbial genetics, microbial engineering and other research fields.

  2. Selection of reference genes for transcriptional analysis of edible tubers of potato (Solanum tuberosum L.).

    Directory of Open Access Journals (Sweden)

    Roberta Fogliatto Mariot

    Potato (Solanum tuberosum) yield has increased dramatically over the last 50 years and this has been achieved by a combination of improved agronomy and biotechnology efforts. Gene studies are taking place to improve new qualities and develop new cultivars. Reverse transcriptase quantitative polymerase chain reaction (RT-qPCR) is a bench-marking analytical tool for gene expression analysis, but its accuracy is highly dependent on a reliable normalization strategy using invariant reference genes. For this reason, the goal of this work was to select and validate reference genes for transcriptional analysis of edible tubers of potato. To do so, RT-qPCR primers were designed for ten genes with relatively stable expression in potato tubers as observed in RNA-Seq experiments. Primers were designed across exon boundaries to avoid genomic DNA contamination. Differences were observed in the ranking of candidate genes identified by geNorm, NormFinder and BestKeeper algorithms. The ranks determined by geNorm and NormFinder were very similar and for all samples the most stable candidates were C2, exocyst complex component sec3 (SEC3) and ATCUL3/ATCUL3A/CUL3/CUL3A (CUL3A). According to BestKeeper, the importin alpha and ubiquitin-associated/ts-n genes were the most stable. Three genes were selected as reference genes for potato edible tubers in RT-qPCR studies. The first one, called C2, was selected in common by NormFinder and geNorm, the second one is SEC3, selected by NormFinder, and the third one is CUL3A, selected by geNorm. Appropriate reference genes identified in this work will help to improve the accuracy of gene expression quantification analyses by taking into account differences that may be observed in RNA quality or reverse transcription efficiency across the samples.

  3. Selection of reference genes for transcriptional analysis of edible tubers of potato (Solanum tuberosum L.).

    Science.gov (United States)

    Mariot, Roberta Fogliatto; de Oliveira, Luisa Abruzzi; Voorhuijzen, Marleen M; Staats, Martijn; Hutten, Ronald C B; Van Dijk, Jeroen P; Kok, Esther; Frazzon, Jeverson

    2015-01-01

    Potato (Solanum tuberosum) yield has increased dramatically over the last 50 years and this has been achieved by a combination of improved agronomy and biotechnology efforts. Gene studies are taking place to improve new qualities and develop new cultivars. Reverse transcriptase quantitative polymerase chain reaction (RT-qPCR) is a bench-marking analytical tool for gene expression analysis, but its accuracy is highly dependent on a reliable normalization strategy using invariant reference genes. For this reason, the goal of this work was to select and validate reference genes for transcriptional analysis of edible tubers of potato. To do so, RT-qPCR primers were designed for ten genes with relatively stable expression in potato tubers as observed in RNA-Seq experiments. Primers were designed across exon boundaries to avoid genomic DNA contamination. Differences were observed in the ranking of candidate genes identified by geNorm, NormFinder and BestKeeper algorithms. The ranks determined by geNorm and NormFinder were very similar and for all samples the most stable candidates were C2, exocyst complex component sec3 (SEC3) and ATCUL3/ATCUL3A/CUL3/CUL3A (CUL3A). According to BestKeeper, the importin alpha and ubiquitin-associated/ts-n genes were the most stable. Three genes were selected as reference genes for potato edible tubers in RT-qPCR studies. The first one, called C2, was selected in common by NormFinder and geNorm, the second one is SEC3, selected by NormFinder, and the third one is CUL3A, selected by geNorm. Appropriate reference genes identified in this work will help to improve the accuracy of gene expression quantification analyses by taking into account differences that may be observed in RNA quality or reverse transcription efficiency across the samples.
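
    How a geNorm-style ranking arises can be sketched with the stability measure M as it is commonly described: for each candidate gene, the average over all other candidates of the standard deviation of the log2 expression ratios across samples. The sketch below is an illustration under that assumption, not the geNorm software itself, and the gene names and expression values are hypothetical.

```python
from math import log2
from statistics import mean, stdev


def stability_m(quantities):
    """Compute a geNorm-style M value per gene (lower means more stable).

    quantities maps a gene name to its relative expression quantities,
    one value per sample, in the same sample order for every gene.
    """
    genes = list(quantities)
    m_values = {}
    for gene in genes:
        pairwise_variation = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [
                log2(a / b)
                for a, b in zip(quantities[gene], quantities[other])
            ]
            pairwise_variation.append(stdev(log_ratios))
        m_values[gene] = mean(pairwise_variation)
    return m_values


# Hypothetical data: gene_A and gene_B keep a constant ratio across
# samples, while gene_C varies independently of both.
example = {
    "gene_A": [1.0, 2.0, 4.0, 8.0],
    "gene_B": [2.0, 4.0, 8.0, 16.0],
    "gene_C": [1.0, 1.0, 8.0, 2.0],
}

m = stability_m(example)
ranking = sorted(m, key=m.get)  # most stable first
print(m, ranking)
```

    Two genes whose ratio is constant across samples contribute zero pairwise variation to each other, so they rank as the most stable pair. NormFinder and BestKeeper use different statistics, which is one reason the three rankings in the abstract do not fully agree.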

  4. A Framework for Establishing Standard Reference Scale of Texture by Multivariate Statistical Analysis Based on Instrumental Measurement and Sensory Evaluation.

    Science.gov (United States)

    Zhi, Ruicong; Zhao, Lei; Xie, Nan; Wang, Houyin; Shi, Bolin; Shi, Jingye

    2016-01-13

    A framework for establishing a standard reference scale (texture) is proposed, using multivariate statistical analysis based on instrumental measurement and sensory evaluation. Multivariate statistical analysis is conducted to rapidly select typical reference samples with characteristics of universality, representativeness, stability, substitutability, and traceability. The reasonableness of the framework method is verified by establishing a standard reference scale for a texture attribute (hardness) with well-known Chinese foods. More than 100 food products in 16 categories were tested using instrumental measurement (TPA test), and the results were analyzed with clustering analysis, principal component analysis, relative standard deviation, and analysis of variance. As a result, nine kinds of foods were selected to construct the hardness standard reference scale. The results indicate that the regression coefficient between the estimated sensory value and the instrumentally measured value is significant (R² = 0.9765), which fits well with Stevens's theory. The research provides a reliable theoretical basis and practical guide for establishing quantitative standard reference scales for food texture characteristics.
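
    Assuming that "Stevens's theory" here refers to Stevens's power law (perceived intensity S = k·I^n for a physical intensity I), the relation between sensory and instrumental values can be checked with a straight-line fit on log-log axes. The sketch below is an illustration only: the data are synthetic and the function names are hypothetical, not taken from the paper.

```python
from math import log10


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept, with R squared."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def fit_power_law(instrumental, sensory):
    """Fit S = k * I**n by linear regression of log10(S) on log10(I)."""
    slope, intercept, r_squared = fit_line(
        [log10(value) for value in instrumental],
        [log10(value) for value in sensory],
    )
    return slope, 10 ** intercept, r_squared  # exponent n, constant k, R^2


# Synthetic data that follow S = 2 * I**0.5 exactly.
instrumental = [1.0, 10.0, 100.0, 1000.0]
sensory = [2.0 * value ** 0.5 for value in instrumental]

exponent, constant, r_squared = fit_power_law(instrumental, sensory)
print(exponent, constant, r_squared)
```

    For real panel data the points scatter around the fitted line, and an R² close to 1, such as the 0.9765 reported above, indicates that a power-law relation describes the scale well.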

  5. mESAdb: microRNA expression and sequence analysis database.

    Science.gov (United States)

    Kaya, Koray D; Karakülah, Gökhan; Yakicier, Cengiz M; Acar, Aybar C; Konu, Ozlen

    2011-01-01

    microRNA expression and sequence analysis database (http://konulab.fen.bilkent.edu.tr/mirna/) (mESAdb) is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language. The database primarily comprises mature microRNA sequences and their target data, along with selected human, mouse and zebrafish expression data sets. mESAdb analysis modules allow (i) mining of microRNA expression data sets for subsets of microRNAs selected manually or by motif; (ii) pair-wise multivariate analysis of expression data sets within and between taxa; and (iii) association of microRNA subsets with annotation databases, HUGE Navigator, KEGG and GO. The use of existing and customized R packages facilitates future addition of data sets and analysis tools. Furthermore, the ability to upload and analyze user-specified data sets makes mESAdb an interactive and expandable analysis tool for microRNA sequence and expression data.

  6. Saprolegniaceae identified on amphibian eggs throughout the Pacific Northwest, USA, by internal transcribed spacer sequences and phylogenetic analysis.

    Science.gov (United States)

    Petrisko, Jill E; Pearl, Christopher A; Pilliod, David S; Sheridan, Peter P; Williams, Charles F; Peterson, Charles R; Bury, R Bruce

    2008-01-01

    We assessed the diversity and phylogeny of Saprolegniaceae on amphibian eggs from the Pacific Northwest, with particular focus on Saprolegnia ferax, a species implicated in high egg mortality. We identified isolates from eggs of six amphibians with the internal transcribed spacer (ITS) and 5.8S gene regions and BLAST of the GenBank database. We identified 68 sequences as Saprolegniaceae and 43 sequences as true fungi from at least nine genera. Our phylogenetic analysis of the Saprolegniaceae included isolates within the genera Saprolegnia, Achlya and Leptolegnia. Our phylogeny grouped S. semihypogyna with Achlya rather than with the Saprolegnia reference sequences. We found only one isolate that grouped closely with S. ferax, and this came from a hatchery-raised salmon (Idaho) that we sampled opportunistically. We had representatives of 7-12 species and three genera of Saprolegniaceae on our amphibian eggs. Further work on the ecological roles of different species of Saprolegniaceae is needed to clarify their potential importance in amphibian egg mortality and potential links to population declines.

  7. Novel primer specific false terminations during DNA sequencing reactions: danger of inaccuracy of mutation analysis in molecular diagnostics

    Science.gov (United States)

    Anwar, R; Booth, A; Churchill, A J; Markham, A F

    1996-01-01

    The determination of nucleotide sequence is fundamental to the identification and molecular analysis of genes. Direct sequencing of PCR products is now becoming a commonplace procedure for haplotype analysis, and for defining mutations and polymorphism within genes, particularly for diagnostic purposes. A previously unrecognised phenomenon, primer related variability, observed in sequence data generated using Taq cycle sequencing and T7 Sequenase sequencing, is reported. This suggests that caution is necessary when interpreting DNA sequence data. This is particularly important in situations where treatment may be dependent on the accuracy of the molecular diagnosis. PMID:16696096

  8. Preparation and evaluation of reference materials for accountancy analysis. (1) Preparation and evaluation method

    International Nuclear Information System (INIS)

    Takamatsu, Mai; Kacchi, Tomokazu; Murakami, Toshiki; Ai, Hironobu; Sumi, Mika; Abe, Katsuo; Kageyama, Tomio; Nakazawa, Hiroaki

    2009-01-01

    The isotope dilution mass spectrometry method used for accountancy analysis at nuclear fuel facilities requires standard materials called LSD (Large Size Dried) spikes. Generally, LSD spikes are prepared from certified reference materials (CRMs) that are supplied by foreign laboratories. However, importing Pu CRMs is becoming increasingly difficult. For safeguards it is important to attain and maintain highly reliable accountancy analysis, so a stable supply of LSD spikes is essential. Therefore, in order to conserve CRMs, several types of LSD spike were prepared in collaboration between JAEA and JNFL, for example spikes with a reduced amount of nuclear material per LSD spike. Practical tests with actual samples were performed at the JNFL Rokkasho reprocessing plant, and the results were compared with those obtained using LSD spikes supplied by a foreign laboratory. The preparation and verification analysis of the LSD spikes and the evaluation of uncertainty based on ISO-GUM will be presented. (author)

  9. A priori Considerations When Conducting High-Throughput Amplicon-Based Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Aditi Sengupta

    2016-03-01

    Amplicon-based sequencing strategies that include 16S rRNA and functional genes, alongside “meta-omics” analyses of communities of microorganisms, have allowed researchers to pose questions and find answers to “who” is present in the environment and “what” they are doing. Next-generation sequencing approaches that aid microbial ecology studies of agricultural systems are fast gaining popularity among agronomy, crop, soil, and environmental science researchers. Given the rapid development of these high-throughput sequencing techniques, researchers with no prior experience will want information about the best practices to follow before actually starting high-throughput amplicon-based sequence analyses. We have outlined items that need to be carefully considered in experimental design, sampling, basic bioinformatics, sequencing of mock communities and negative controls, acquisition of metadata, and standardization of reaction conditions as per experimental requirements. Not all considerations mentioned here may pertain to a particular study. The overall goal is to inform researchers about considerations that must be taken into account when conducting high-throughput microbial DNA sequencing and sequence analysis.

  10. Regularized rare variant enrichment analysis for case-control exome sequencing data.

    Science.gov (United States)

    Larson, Nicholas B; Schaid, Daniel J

    2014-02-01

    Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research. © 2013 WILEY PERIODICALS, INC.

  11. Molecular characterization, sequence analysis and tissue expression of a porcine gene – MOSPD2

    Directory of Open Access Journals (Sweden)

    Yang Jie

    2017-01-01

    The full-length cDNA sequence of a porcine gene, MOSPD2, was amplified using the rapid amplification of cDNA ends method based on a pig expressed sequence tag sequence which was highly homologous to the coding sequence of the human MOSPD2 gene. Sequence prediction analysis revealed that the open reading frame of this gene encodes a protein of 491 amino acids that has high homology with the motile sperm domain-containing protein 2 (MOSPD2) of five species: horse (89%), human (90%), chimpanzee (89%), rhesus monkey (89%) and mouse (85%); thus, it could be defined as a porcine MOSPD2 gene. This novel porcine gene was assigned GeneID: 100153601. Computer-assisted analysis revealed that the gene is structured in 15 exons and 14 introns. The phylogenetic analysis revealed that the porcine MOSPD2 gene has a closer genetic relationship with the MOSPD2 gene of horse. Tissue expression analysis indicated that the porcine MOSPD2 gene is generally and differentially expressed in the spleen, muscle, skin, kidney, lung, liver, fat and heart. Our experiment is the first to establish the primary foundation for further research on the porcine MOSPD2 gene.

  12. Sequence and phylogenetic analysis of chicken anaemia virus obtained from backyard and commercial chickens in Nigeria.

    Science.gov (United States)

    Oluwayelu, D O; Todd, D; Olaleye, O D

    2008-12-01

    This work reports the first molecular analysis study of chicken anaemia virus (CAV) in backyard chickens in Africa, using molecular cloning and sequence analysis to characterize CAV strains obtained from commercial chickens and Nigerian backyard chickens. Partial VP1 gene sequences were determined for three CAVs from commercial chickens and for six CAV variants present in samples from a backyard chicken. Multiple alignment analysis revealed that the 6% and 4% nucleotide diversity obtained respectively for the commercial and backyard chicken strains translated to only 2% amino acid diversity for each breed. Overall, the amino acid composition of Nigerian CAVs was found to be highly conserved. Since the partial VP1 gene sequences of two backyard chicken cloned CAV strains (NGR/CI-8 and NGR/CI-9) were almost identical and evolutionarily closely related to the commercial chicken strains NGR-1, and NGR-4 and NGR-5, respectively, we concluded that CAV infections had crossed the farm boundary.

  13. An update to the analysis of the Canadian Spatial Reference System

    Science.gov (United States)

    Ferland, R.; Piraszewski, M.; Craymer, M.

    2015-12-01

    The primary objective of the Canadian Spatial Reference System (CSRS) is to provide users access to a consistent geo-referencing infrastructure over the Canadian landmass. Global Navigation Satellite System (GNSS) positioning accuracy requirements range from the meter level to the mm level (e.g. crustal deformation). The highest level of the Canadian infrastructure consists of a network of continually operating GPS and GNSS receivers, referred to as active control stations. The network includes all Canadian public active control stations, some bordering US CORS and Alaska stations, Greenland active control stations, as well as a selection of IGS reference frame stations. The Bernese analysis software is used for the daily processing and the combination into weekly solutions, which form the basis for this analysis. IGS weekly final orbit, Earth rotation parameter (ERP) and coordinate products are used in the processing. For the more demanding users, the time-dependent changes of station coordinates are often more important. All station coordinate estimates and related covariance information are used in this analysis. For each input solution, variance factor, translation, rotation and scale (and, if needed, their rates) or subsets of these are estimated. In the combination of these weekly solutions, station positions and velocities are estimated. Since the time series from the stations in these networks often experience changes in behavior, new (or reused) parameters are generally introduced in these situations. As is often the case with real data, unrealistic coordinates may occur. Automatic detection and removal of outliers is used in these cases. For the transformation, position and velocity parameters, loose a priori estimates and uncertainties are provided. Alignment using the usual Helmert transformation to the latest IGb08 realization of ITRF is also performed during the adjustment.

  14. Challenges in the size analysis of a silica nanoparticle mixture as candidate certified reference material

    International Nuclear Information System (INIS)

    Kestens, Vikram; Roebben, Gert; Herrmann, Jan; Jämting, Åsa; Coleman, Victoria; Minelli, Caterina; Clifford, Charles; Temmerman, Pieter-Jan De; Mast, Jan; Junjie, Liu; Babick, Frank; Cölfen, Helmut; Emons, Hendrik

    2016-01-01

    A new certified reference material for quality control of nanoparticle size analysis methods has been developed and produced by the Institute for Reference Materials and Measurements of the European Commission’s Joint Research Centre. The material, ERM-FD102, consists of an aqueous suspension of a mixture of silica nanoparticle populations of distinct particle size and origin. The characterisation relied on an interlaboratory comparison study in which 30 laboratories of demonstrated competence participated with a variety of techniques for particle size analysis. After scrutinising the received datasets, certified and indicative values for different method-defined equivalent diameters that are specific for dynamic light scattering (DLS), centrifugal liquid sedimentation (CLS), scanning and transmission electron microscopy (SEM and TEM), atomic force microscopy (AFM), particle tracking analysis (PTA) and asymmetrical-flow field-flow fractionation (AF4) were assigned. The value assignment was a particular challenge because metrological concepts were not always interpreted uniformly across all participating laboratories. This paper presents the main elements and results of the ERM-FD102 characterisation study and discusses in particular the key issues of measurand definition and the estimation of measurement uncertainty.

  15. Usability Testing Analysis on The Bana Game as Education Game Design References on Junior High School

    Directory of Open Access Journals (Sweden)

    F. Adnan

    2017-04-01

    Learning media are one of the important elements in the learning process. Technological development makes learning media more varied. Using digital technology as a learning medium has a better and more effective impact than other approaches. Increasing students’ interest in learning requires the support of interesting learning media. The use of gaming applications as learning media can improve learning outcomes. Getting the maximum benefit from an application depends on its design. The Bana game aims to increase the critical thinking ability of junior high school students. A usability-testing analysis of the Bana game application is used to obtain a design reference for educational game development. The game is used as the object of the analysis because it has the same characteristics and goals as the game application to be developed. Usability Testing is a method used to measure the ease of use of an application by users. The Usability Testing consists of learnability, efficiency, memorability, errors, and satisfaction. The results of the analysis will be used as a reference for the educational game applications that will be developed.

  16. Challenges in the size analysis of a silica nanoparticle mixture as candidate certified reference material

    Energy Technology Data Exchange (ETDEWEB)

    Kestens, Vikram, E-mail: vikram.kestens@ec.europa.eu; Roebben, Gert [Joint Research Centre (JRC), European Commission, Institute for Reference Materials and Measurements (IRMM) (Belgium); Herrmann, Jan; Jämting, Åsa; Coleman, Victoria [National Measurement Institute Australia, Nanometrology Section (Australia); Minelli, Caterina; Clifford, Charles [National Physical Laboratory, Analytical Science Division (United Kingdom); Temmerman, Pieter-Jan De; Mast, Jan [Service Electron Microscopy, Veterinary and Agrochemical Research Centre (CODA-CERVA) (Belgium); Junjie, Liu [National Institute of Metrology, Division of Nanoscale Measurement and Advanced Materials (China); Babick, Frank [Technische Universität Dresden, Institut für Verfahrens- und Umwelttechnik (Germany); Cölfen, Helmut [University of Konstanz, Physical Chemistry, Department of Chemistry (Germany); Emons, Hendrik [Joint Research Centre (JRC), European Commission, Institute for Reference Materials and Measurements (IRMM) (Belgium)

    2016-06-15

    A new certified reference material for quality control of nanoparticle size analysis methods has been developed and produced by the Institute for Reference Materials and Measurements of the European Commission’s Joint Research Centre. The material, ERM-FD102, consists of an aqueous suspension of a mixture of silica nanoparticle populations of distinct particle size and origin. The characterisation relied on an interlaboratory comparison study in which 30 laboratories of demonstrated competence participated with a variety of techniques for particle size analysis. After scrutinising the received datasets, certified and indicative values for different method-defined equivalent diameters that are specific for dynamic light scattering (DLS), centrifugal liquid sedimentation (CLS), scanning and transmission electron microscopy (SEM and TEM), atomic force microscopy (AFM), particle tracking analysis (PTA) and asymmetrical-flow field-flow fractionation (AF4) were assigned. The value assignment was a particular challenge because metrological concepts were not always interpreted uniformly across all participating laboratories. This paper presents the main elements and results of the ERM-FD102 characterisation study and discusses in particular the key issues of measurand definition and the estimation of measurement uncertainty.

  17. The use of reference materials in the elemental analysis of biological samples

    International Nuclear Information System (INIS)

    Bowen, H.J.M.

    1975-01-01

    Reference materials (RMs) are useful to compare the accuracy and precision of laboratories and techniques. The desirable properties of biological reference materials are listed, and the problems of production, homogenization and storage described. At present there are only 10 biological RMs available compared with 213 geological and 520 metallurgical RMs. There is a need for more biological RMs including special materials for microprobe analysis and for in vivo activation analysis. A study of 650 mean values for elements in RM Kale, analysed by many laboratories, leads to the following conclusions. 61% of the values lie within ±10% of the best mean, and 80% lie within ±20% of the best mean. Atomic absorption spectrometry gives results that are 5-30% high for seven elements, while instrumental neutron activation analysis gives low and imprecise results for K. Other techniques with poor interlaboratory precision include neutron activation for Mg, polarography for Zn and arc-spectrometry for many elements. More than half the values for elements in Kale were obtained by neutron activation, confirming the importance of this technique and the need for RMs. As a rough estimate, 6 × 10^9 elemental analyses of biological materials are carried out each year, mostly by medical, agricultural and food scientists. It seems likely that a substantial percentage of these are inaccurate, a situation that might be improved by quality control using standard RMs. (author)
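
    The kind of summary quoted above (the share of reported values lying within ±10% or ±20% of a best mean) can be sketched as follows. This is only an illustration of the calculation, not the procedure used in the study; the reported values and the best mean below are invented, and fraction_within is a hypothetical helper name.

```python
def fraction_within(values, reference, tolerance):
    """Fraction of values lying within +/- tolerance (relative) of reference.

    tolerance is a relative fraction, e.g. 0.10 for +/-10%.
    """
    if not values:
        raise ValueError("values must not be empty")
    limit = tolerance * abs(reference)
    hits = sum(1 for v in values if abs(v - reference) <= limit)
    return hits / len(values)


# Invented laboratory means for one element (arbitrary units).
reported = [95.0, 102.0, 108.0, 115.0, 125.0, 88.0, 99.0, 131.0, 100.0, 91.0]
best_mean = 100.0  # assumed consensus value for the example

within_10 = fraction_within(reported, best_mean, 0.10)
within_20 = fraction_within(reported, best_mean, 0.20)
print(f"within 10%: {within_10:.0%}, within 20%: {within_20:.0%}")
```

    With the invented data, six of the ten values fall within ±10% and eight within ±20% of the assumed best mean.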

  18. Comparative analysis of catfish BAC end sequences with the zebrafish genome

    Directory of Open Access Journals (Sweden)

    Abernathy Jason

    2009-12-01

    Background: Comparative mapping is a powerful tool to transfer genomic information from sequenced genomes to closely related species for which whole genome sequence data are not yet available. However, such an approach is still very limited in catfish, the most important aquaculture species in the United States. This project was initiated to generate additional BAC end sequences and demonstrate their applications in comparative mapping in catfish. Results: We reported the generation of 43,000 BAC end sequences and their applications for comparative genome analysis in catfish. Using these and the additional 20,000 existing BAC end sequences as a resource, along with linkage mapping and the existing physical map, conserved syntenic regions were identified between the catfish and zebrafish genomes. A total of 10,943 catfish BAC end sequences (17.3%) had significant BLAST hits to the zebrafish genome (cutoff value ≤ e-5), of which 3,221 were unique gene hits, providing a platform for comparative mapping based on the locations of these genes in catfish and zebrafish. Genetic linkage mapping of microsatellites associated with contigs allowed identification of large conserved genomic segments and construction of super scaffolds. Conclusion: BAC end sequences and their associated polymorphic markers are great resources for comparative genome analysis in catfish. Highly conserved chromosomal regions were identified between catfish and zebrafish. However, it appears that the level of conservation at local genomic regions is high, while a high level of chromosomal shuffling and rearrangement exists between the catfish and zebrafish genomes. Orthologous regions established through comparative analysis should facilitate both structural and functional genome analysis in catfish.

  19. An integrative variant analysis suite for whole exome next-generation sequencing data

    Directory of Open Access Journals (Sweden)

    Challis Danny

    2012-01-01

    Background: Whole exome capture sequencing allows researchers to cost-effectively sequence the coding regions of the genome. Although exome capture sequencing methods have become routine and well established, there is currently a lack of tools specialized for variant calling in this type of data. Results: Using statistical models trained on validated whole-exome capture sequencing data, the Atlas2 Suite is an integrative variant analysis pipeline optimized for variant discovery on all three of the widely used next generation sequencing platforms (SOLiD, Illumina, and Roche 454). The suite employs logistic regression models in conjunction with user-adjustable cutoffs to accurately separate true SNPs and INDELs from sequencing and mapping errors with high sensitivity (96.7%). Conclusion: We have implemented the Atlas2 Suite and applied it to 92 whole exome samples from the 1000 Genomes Project. The Atlas2 Suite is available for download at http://sourceforge.net/projects/atlas2/. In addition to a command line version, the suite has been integrated into the Genboree Workbench, allowing biomedical scientists with minimal informatics expertise to remotely call, view, and further analyze variants through a simple web interface. The existing genomic databases displayed via the Genboree browser also streamline the process from variant discovery to functional genomics analysis, resulting in an off-the-shelf toolkit for the broader community.
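
    The general idea of combining a logistic regression model with a user-adjustable cutoff can be sketched as below. This is not the Atlas2 model: the feature names, weights and intercept are hypothetical values chosen only to make the example run, and the real suite's trained coefficients and inputs are not given in the abstract.

```python
import math


def logistic_probability(features, weights, intercept):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(w_i * x_i))))."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def call_variant(features, weights, intercept, cutoff=0.5):
    """Accept a candidate variant when its model probability reaches the cutoff."""
    return logistic_probability(features, weights, intercept) >= cutoff


# Hypothetical coefficients; a real model would be trained on validated calls.
HYPOTHETICAL_WEIGHTS = {
    "base_quality": 0.10,
    "variant_read_fraction": 4.0,
    "strand_bias": -2.0,
}
HYPOTHETICAL_INTERCEPT = -4.0

# One invented candidate site.
candidate = {"base_quality": 30.0, "variant_read_fraction": 0.5, "strand_bias": 0.1}

p = logistic_probability(candidate, HYPOTHETICAL_WEIGHTS, HYPOTHETICAL_INTERCEPT)
lenient = call_variant(candidate, HYPOTHETICAL_WEIGHTS, HYPOTHETICAL_INTERCEPT, cutoff=0.5)
strict = call_variant(candidate, HYPOTHETICAL_WEIGHTS, HYPOTHETICAL_INTERCEPT, cutoff=0.9)
print(round(p, 3), lenient, strict)
```

    Raising the cutoff trades sensitivity for specificity: the same candidate is accepted at a cutoff of 0.5 but rejected at 0.9.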

  20. Cloning and sequence analysis of cDNA coding for rat nucleolar protein C23

    International Nuclear Information System (INIS)

    Ghaffari, S.H.; Olson, M.O.J.

    1986-01-01

    Using synthetic oligonucleotides as primers and probes, the authors have isolated and sequenced cDNA clones encoding protein C23, a putative nucleolus organizer protein. Poly(A+) RNA was isolated from rat Novikoff hepatoma cells and enriched in C23 mRNA by sucrose density gradient ultracentrifugation. Two deoxyoligonucleotides, a 48- and a 27-mer, were synthesized on the basis of amino acid sequence from the C-terminal half of protein C23 and cDNA sequence data from CHO cell protein. The 48-mer was used as a primer for synthesis of cDNA, which was then inserted into plasmid pUC9. Transformed bacterial colonies were screened by hybridization with 32P-labeled 27-mer. Two clones among 5000 gave a strong positive signal. Plasmid DNAs from these clones were purified and characterized by blotting and nucleotide sequence analysis. The length of C23 mRNA was estimated to be 3200 bases in a northern blot analysis. The sequence of a 267 bp insert shows high homology with the CHO cDNA, with only 9 nucleotide differences and an identical amino acid sequence. These studies indicate that this region of the protein is highly conserved.

  1. Illumina MiSeq Sequencing for Preliminary Analysis of Microbiome Causing Primary Endodontic Infections in Egypt

    Directory of Open Access Journals (Sweden)

    Sally Ali Tawfik

    2018-01-01

    The use of high throughput next generation technologies has allowed more comprehensive analysis than traditional Sanger sequencing. The specific aim of this study was to investigate the microbial diversity of primary endodontic infections using Illumina MiSeq sequencing platform in Egyptian patients. Samples were collected from 19 patients in Suez Canal University Hospital (Endodontic Department using sterile # 15K file and paper points. DNA was extracted using Mo Bio power soil DNA isolation extraction kit followed by PCR amplification and agarose gel electrophoresis. The microbiome was characterized on the basis of the V3 and V4 hypervariable region of the 16S rRNA gene by using paired-end sequencing on Illumina MiSeq device. MOTHUR software was used in sequence filtration and analysis of sequenced data. A total of 1858 operational taxonomic units at 97% similarity were assigned to 26 phyla, 245 families, and 705 genera. Four main phyla Firmicutes, Bacteroidetes, Proteobacteria, and Synergistetes were predominant in all samples. At genus level, Prevotella, Bacillus, Porphyromonas, Streptococcus, and Bacteroides were the most abundant. Illumina MiSeq platform sequencing can be used to investigate oral microbiome composition of endodontic infections. Elucidating the ecology of endodontic infections is a necessary step in developing effective intracanal antimicrobials.

  2. Comparative analysis of codon usage bias and codon context patterns between dipteran and hymenopteran sequenced genomes.

    Directory of Open Access Journals (Sweden)

    Susanta K Behura

    BACKGROUND: Codon bias is a phenomenon of non-uniform usage of codons, whereas codon context generally refers to sequential pairs of codons in a gene. Although genome sequencing of multiple species of dipteran and hymenopteran insects has been completed, only a few of these species have been analyzed for codon usage bias. METHODS AND PRINCIPAL FINDINGS: Here, we use bioinformatics approaches to analyze codon usage bias and codon context patterns in a genome-wide manner among 15 dipteran and 7 hymenopteran insect species. Results show that GAA is the most frequent codon in the dipteran species whereas GAG is the most frequent codon in the hymenopteran species. The data reveal that codons ending with C or G are frequently used in the dipteran genomes whereas codons ending with A or T are frequently used in the hymenopteran genomes. Synonymous codon usage orders (SCUO) vary within genomes in a pattern that seems to be distinct for each species. Based on comparison of 30 one-to-one orthologous genes among 17 species, the fruit fly Drosophila willistoni shows the least codon usage bias whereas the honey bee (Apis mellifera) shows the highest bias. Analysis of codon context patterns of these insects shows that specific codons are frequently used as the 3'- and 5'-context of start and stop codons, respectively. CONCLUSIONS: Codon bias pattern is distinct between dipteran and hymenopteran insects. While codon bias is favored by high GC content of dipteran genomes, high AT content of genes favors biased usage of synonymous codons in the hymenopteran insects. Also, codon context patterns vary among these species largely according to their phylogeny.
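
    Two of the quantities discussed above, the most frequent codon and the share of codons ending in G or C, can be computed from a coding sequence as in the following minimal sketch. It is an illustration only, not the authors' genome-wide method; the toy sequence is invented, and codon_counts and gc3_fraction are hypothetical helper names.

```python
from collections import Counter


def codon_counts(cds):
    """Count in-frame codons of a coding sequence (trailing partial codon ignored)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def gc3_fraction(counts):
    """Fraction of codons whose third position is G or C."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no codons to analyse")
    gc3 = sum(n for codon, n in counts.items() if codon[2] in "GC")
    return gc3 / total


# Invented toy coding sequence: ATG GAA GAG GAA GAA TAA
toy_cds = "ATGGAAGAGGAAGAATAA"

counts = codon_counts(toy_cds)
most_common_codon, most_common_n = counts.most_common(1)[0]
gc3 = gc3_fraction(counts)
print(most_common_codon, most_common_n, round(gc3, 3))
```

    In the toy sequence GAA is the most frequent codon (three of six), and two of the six codons (ATG and GAG) end in G or C.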

  3. Multielement comparison of instrumental neutron activation analysis techniques using reference materials

    International Nuclear Information System (INIS)

    Ratner, R.T.; Vernetson, W.G.

    1995-01-01

    Several instrumental neutron activation analysis techniques (parametric, comparative, and k0-standardization) are evaluated using three reference materials. Each technique is applied to National Institute of Standards and Technology standard reference materials, SRM 1577a (Bovine Liver) and SRM 2704 (Buffalo River Sediment), and the United States Geological Survey standard BHVO-1 (Hawaiian Basalt Rock). Identical (but not optimum) irradiation, decay, and counting schemes are employed with each technique to provide a basis for comparison and to determine sensitivities in a routine irradiation scheme. Fifty-one elements are used in this comparison; however, several elements are not detected in the reference materials due to rigid analytical conditions (e.g. insufficient length of irradiation or activity for radioisotope of interest decaying below the lower limit of detection before counting interval). Most elements are normally distributed around certified or consensus values with a standard deviation of 10%. For some elements, discrepancies are observed and discussed. The accuracy, precision, and sensitivity of each technique are discussed by comparing the analytical results to consensus values for the Hawaiian Basalt Rock to demonstrate the diversity of multielement applications. (author) 4 refs.; 2 tabs

  4. Characterisation of candidate reference materials by PIXE analysis and nuclear microprobe PIXE imaging

    International Nuclear Information System (INIS)

    Jaksic, M.; Pastuovic, Z.; Bogdanovic, I.; Tadic, T.

    2002-01-01

    In order to test whether some candidate reference materials show homogeneity that can satisfy quality control of the PIXE technique, six bottles of each of the two candidate RMs, Lichen (IAEA 338) and Algae (IAEA 413), were tested. Four different tests were performed. First, two pellets from each bottle were prepared and analysed using broad beam (φ = 5 mm) PIXE. The second and third tests analysed homogeneity using a scanning focussed beam at the nuclear microprobe. Scans of 50 × 50 μm² and 240 × 260 μm² were performed. Finally, individual grains with composition differing from the rest of the sample were analysed using PIXE and RBS. (author)

  5. Exploring Valid Reference Genes for Quantitative Real-Time PCR Analysis in Sesamia inferens (Lepidoptera: Noctuidae)

    OpenAIRE

    Sun, Meng; Lu, Ming-Xing; Tang, Xiao-Tian; Du, Yu-Zhou

    2015-01-01

    The pink stem borer, Sesamia inferens, which is endemic in China and other parts of Asia, is a major pest of rice and causes significant yield loss in this host plant. Very few studies have addressed gene expression in S. inferens. Quantitative real-time PCR (qRT-PCR) is currently the most accurate and sensitive method for gene expression analysis. In qRT-PCR, data are normalized using reference genes, which help control for internal differences and reduce error between samples. In this study...

  6. Comparative analysis of the prion protein gene sequences in African lion.

    Science.gov (United States)

    Wu, Chang-De; Pang, Wan-Yong; Zhao, De-Ming

    2006-10-01

    The prion protein gene of African lion (Panthera leo) was first cloned and its polymorphisms were screened. The results suggest that the prion protein gene of eight African lions is highly homogeneous. The amino acid sequences of the prion protein (PrP) of all samples tested were identical. Four single nucleotide polymorphisms (C42T, C81A, C420T, T600C) were found in the prion protein gene (Prnp) of African lion, but none caused an amino acid substitution. Sequence analysis showed higher homology to Felis catus (GenBank accession AF003087; 96.7%) and to sheep (GenBank accession M31313.1; 96.2%). Compared with all the mammalian prion protein sequences examined, the African lion prion protein sequence has three amino acid substitutions. The homology might in turn affect the potential intermolecular interactions critical for cross-species transmission of prion disease.

  7. Genetic Diversity and Phylogenetic Analysis of the Iranian Leishmania Parasites Based on HSP70 Gene PCR-RFLP and Sequence Analysis.

    Science.gov (United States)

    Nemati, Sara; Fazaeli, Asghar; Hajjaran, Homa; Khamesipour, Ali; Anbaran, Mohsen Falahati; Bozorgomid, Arezoo; Zarei, Fatah

    2017-08-01

    Despite the broad distribution of leishmaniasis among Iranians and animals across the country, little is known about the genetic characteristics of the causative agents. Applying both HSP70 PCR-RFLP and sequence analyses, this study aimed to evaluate the genetic diversity and phylogenetic relationships among Leishmania spp. isolated from Iranian endemic foci and available reference strains. A total of 36 Leishmania isolates from almost all districts across the country were genetically analyzed for the HSP70 gene using both PCR-RFLP and sequence analysis. The original HSP70 gene sequences were aligned along with homologous Leishmania sequences retrieved from NCBI, and subjected to the phylogenetic analysis. Basic parameters of genetic diversity were also estimated. The HSP70 PCR-RFLP presented 3 different electrophoretic patterns, with no further intraspecific variation, corresponding to 3 Leishmania species available in the country, L. tropica, L. major, and L. infantum. Phylogenetic analyses presented 5 major clades, corresponding to 5 species complexes. Iranian lineages, including L. major, L. tropica, and L. infantum, were distributed among 3 complexes L. major, L. tropica, and L. donovani. However, within the L. major and L. donovani species complexes, the HSP70 phylogeny was not able to distinguish clearly between the L. major and L. turanica isolates, and between the L. infantum, L. donovani, and L. chagasi isolates, respectively. Our results indicated that both HSP70 PCR-RFLP and sequence analyses are medically applicable tools for identification of Leishmania species in Iranian patients. However, the reduced genetic diversity of the target gene makes it inevitable that its phylogeny only resolves the major groups, namely, the species complexes.

  8. Capillary electrophoresis fragment analysis and clone sequencing in detection of dynamic mutations of spinocerebellar ataxia

    Directory of Open Access Journals (Sweden)

    Yuan-yuan CHEN

    2018-04-01

    Objective: To estimate the accuracy and stability of capillary electrophoresis fragment analysis and clone sequencing in detecting dynamic mutations of spinocerebellar ataxia (SCA). Methods: Capillary electrophoresis fragment analysis and clone sequencing were used to detect trinucleotide repeat sequences in 14 SCA patients (3 cases of SCA2, 2 cases of SCA7, 7 cases of SCA8 and 2 cases of SCA17). Results: Capillary electrophoresis fragment analysis of the 3 SCA2 cases showed that the expanded cytosine-adenine-guanine (CAG) repeats were 31, 30 and 32, and the copy numbers obtained by clone sequencing of 3 colonies in each case were 37/40/40, 37/38/39 and 38/39/40, respectively. Capillary electrophoresis fragment analysis