WorldWideScience

Sample records for annotation pipelines differences

  1. JGI Plant Genomics Gene Annotation Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Shu, Shengqiang; Rokhsar, Dan; Goodstein, David; Hayes, David; Mitros, Therese

    2014-07-14

    Plant genomes vary in size and are highly complex, with abundant repeats, whole-genome duplications and tandem duplications. Genes encode a wealth of information useful in studying an organism, so high-quality, stable gene annotation is critical. Thanks to advances in sequencing technology, the genomes and transcriptomes of many plant species have been sequenced. To turn these very large amounts of sequence data into gene annotations or re-annotations in a timely fashion, an automatic pipeline is needed. The JGI plant genomics gene annotation pipeline, called Integrated Gene Call (IGC), is our effort toward this aim, aided by an RNA-seq transcriptome assembly pipeline. It utilizes several gene predictors based on homolog peptides and transcript ORFs (see Methods for details). Here we present the genome annotations of the JGI flagship green plants produced by this pipeline, plus Arabidopsis and rice, except for Chlamydomonas, which was annotated by a third party. The genome annotations of these and other species are used in our gene-family build pipeline and are accessible via the JGI Phytozome portal, whose URL and front-page snapshot are shown below.
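
    The transcript-ORF evidence mentioned above can be illustrated with a minimal open-reading-frame scan; the single-strand, three-frame simplification and the function name are illustrative only, not part of IGC.

```python
# Minimal sketch: extract the longest open reading frame (ORF) from a
# transcript, the kind of evidence an IGC-style pipeline feeds to its
# gene predictors. Scans the three forward frames only (illustrative).
STOPS = {"TAA", "TAG", "TGA"}

def longest_orf(seq):
    """Return the longest ATG..stop ORF (stop codon included) in seq."""
    best = ""
    for frame in range(3):
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        start = None
        for idx, codon in enumerate(codons):
            if start is None and codon == "ATG":
                start = idx
            elif start is not None and codon in STOPS:
                orf = "".join(codons[start:idx + 1])
                if len(orf) > len(best):
                    best = orf
                start = None
    return best
```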

  2. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes

    Energy Technology Data Exchange (ETDEWEB)

    Brettin, Thomas; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Olsen, Gary J.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Thomason, James A.; Stevens, Rick; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang

    2015-02-10

    The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.
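
    The customizable, batch-oriented design described above can be sketched as stage composition: an annotation protocol is an ordered list of stages applied to each genome. The stage names and the dict-based genome record are hypothetical, not RASTtk's actual API.

```python
# Sketch of a RASTtk-style customizable pipeline: each stage is a
# function that takes and returns a genome record (a plain dict here).
def call_rnas(genome):
    genome.setdefault("features", []).append("rRNA")
    return genome

def call_cds(genome):
    genome.setdefault("features", []).append("CDS")
    return genome

def run_pipeline(genome, stages):
    for stage in stages:
        genome = stage(genome)
    return genome

def run_batch(genomes, stages):
    # Batch submission: the same custom protocol applied to every genome.
    return [run_pipeline(dict(g), stages) for g in genomes]
```

    A custom protocol is then just a different stage list, and a custom feature type is one more stage function.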

  4. MitoFish and MitoAnnotator: a mitochondrial genome database of fish with an accurate and automatic annotation pipeline.

    Science.gov (United States)

    Iwasaki, Wataru; Fukunaga, Tsukasa; Isagozawa, Ryota; Yamada, Koichiro; Maeda, Yasunobu; Satoh, Takashi P; Sado, Tetsuya; Mabuchi, Kohji; Takeshima, Hirohiko; Miya, Masaki; Nishida, Mutsumi

    2013-11-01

    MitoFish is a database of fish mitochondrial genomes (mitogenomes) that includes powerful and precise de novo annotations for mitogenome sequences. Fish occupy an important position in the evolution of vertebrates and the ecology of the hydrosphere, and mitogenomic sequence data have served as a rich source of information for resolving fish phylogenies and identifying new fish species. The importance of a mitogenomic database continues to grow at a rapid pace as massive amounts of mitogenomic data are generated with the advent of new sequencing technologies. A severe bottleneck seems likely to occur with regard to mitogenome annotation because of the overwhelming pace of data accumulation and the intrinsic difficulties in annotating sequences with degenerating transfer RNA structures, divergent start/stop codons of the coding elements, and the overlapping of adjacent elements. To ease this data backlog, we developed an annotation pipeline named MitoAnnotator. MitoAnnotator automatically annotates a fish mitogenome with a high degree of accuracy in approximately 5 min; thus, it is readily applicable to data sets of dozens of sequences. MitoFish also contains re-annotations of previously sequenced fish mitogenomes, enabling researchers to refer to them when they find annotations that are likely to be erroneous or while conducting comparative mitogenomic analyses. For users who need more information on the taxonomy, habitats, phenotypes, or life cycles of fish, MitoFish provides links to related databases. MitoFish and MitoAnnotator are freely available at http://mitofish.aori.u-tokyo.ac.jp/ (last accessed August 28, 2013); all of the data can be batch downloaded, and the annotation pipeline can be used via a web interface.

  5. Transcriptator: An Automated Computational Pipeline to Annotate Assembled Reads and Identify Non Coding RNA.

    Directory of Open Access Journals (Sweden)

    Kumar Parijat Tripathi

    RNA-seq is a new tool that measures RNA transcript counts using high-throughput sequencing with extraordinary accuracy, providing quantitative means to explore the transcriptome of an organism of interest. However, interpreting these extremely large data into biological knowledge is a problem, and biologist-friendly tools are lacking. In our lab, we developed Transcriptator, a web application based on a computational Python pipeline with a user-friendly Java interface. This pipeline uses the web services available for BLAST (Basic Local Alignment Search Tool), QuickGO and DAVID (Database for Annotation, Visualization and Integrated Discovery). It offers a report on statistical analysis of functional and Gene Ontology (GO) annotation enrichment. It helps users to identify enriched biological themes, particularly GO terms, pathways, domains, gene/protein features and protein-protein interaction-related information. It clusters the transcripts based on functional annotations and generates a tabular report of functional and gene ontology annotations for each transcript submitted to the web server. The implementation of QuickGO web services in our pipeline enables users to carry out GO-Slim analysis, whereas the integration of PORTRAIT (Prediction of transcriptomic ncRNA by ab initio methods) helps to identify non-coding RNAs and their regulatory role in the transcriptome. In summary, Transcriptator is a useful software for both NGS and array data. It helps users to characterize de novo assembled reads obtained from NGS experiments on non-referenced organisms, and it also performs functional enrichment analysis of differentially expressed transcripts/genes for both RNA-seq and micro-array experiments. It generates easy-to-read tables and interactive charts for a better understanding of the data. The pipeline is modular in nature, and provides an opportunity to add new plugins in the future.
    Web application is

  6. Pairagon+N-SCAN_EST: a model-based gene annotation pipeline

    DEFF Research Database (Denmark)

    Arumugam, Manimozhiyan; Wei, Chaochun; Brown, Randall H

    2006-01-01

    This paper describes Pairagon+N-SCAN_EST, a gene annotation pipeline that uses only native alignments. For each expressed sequence it chooses the best genomic alignment. Systems like ENSEMBL and ExoGean rely on trans alignments, in which expressed sequences are aligned to the genomic loci of putative homologs. Trans alignments contain a high proportion of mismatches, gaps, and/or apparently unspliceable introns, compared to alignments of cDNA sequences to their native loci. The Pairagon+N-SCAN_EST pipeline's first stage is Pairagon, a cDNA-to-genome alignment program based on a PairHMM probability model. This model relies on prior knowledge, such as the fact that introns must begin with GT, GC, or AT and end with AG or AC. It produces very precise alignments of high quality cDNA sequences. In the genomic regions between Pairagon's cDNA alignments, the pipeline combines EST alignments...

  7. DFAST: a flexible prokaryotic genome annotation pipeline for faster genome publication.

    Science.gov (United States)

    Tanizawa, Yasuhiro; Fujisawa, Takatomo; Nakamura, Yasukazu

    2018-03-15

    We developed a prokaryotic genome annotation pipeline, DFAST, that also supports genome submission to public sequence databases. DFAST was originally started as an on-line annotation server, and to date, over 7000 jobs have been processed since its first launch in 2016. Here, we present a newly implemented background annotation engine for DFAST, which is also available as a standalone command-line program. The new engine can annotate a typical-sized bacterial genome within 10 min, with rich information such as pseudogenes, translation exceptions and orthologous gene assignment between given reference genomes. In addition, the modular framework of DFAST allows users to customize the annotation workflow easily and will also facilitate extensions for new functions and incorporation of new tools in the future. The software is implemented in Python and runs in both Python 2.7 and 3.4 on Macintosh and Linux systems. It is freely available at https://github.com/nigyta/dfast_core/ under the GPLv3 license with external binaries bundled in the software distribution. An on-line version is also available at https://dfast.nig.ac.jp/. Contact: yn@nig.ac.jp. Supplementary data are available at Bioinformatics online.

  8. Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus.

    Science.gov (United States)

    Comeau, Donald C; Liu, Haibin; Islamaj Doğan, Rezarta; Wilbur, W John

    2014-01-01

    BioC is a new format and associated code libraries for sharing text and annotations. We have implemented BioC natural language preprocessing pipelines in two popular programming languages: C++ and Java. The current implementations interface with the well-known MedPost and Stanford natural language processing tool sets. The pipeline functionality includes sentence segmentation, tokenization, part-of-speech tagging, lemmatization and sentence parsing. These pipelines can be easily integrated, along with other BioC programs, into any BioC-compliant text mining system. As an application, we converted the NCBI disease corpus to BioC format, and the pipelines have successfully run on this corpus to demonstrate their functionality. Code and data can be downloaded from http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net. © The Author(s) 2014. Published by Oxford University Press.
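
    The stages listed above chain naturally. As a rough illustration, here is a naive Python sketch of the first two stages; the real pipelines are C++ and Java and wrap the MedPost/Stanford tools, and BioC annotations carry character offsets, as below.

```python
# Naive sentence segmentation and offset-preserving tokenization,
# illustrating the first two stages of such a pipeline.
import re

def segment_sentences(text):
    """Split on sentence-final punctuation; returns (offset, sentence)."""
    out, start = [], 0
    for m in re.finditer(r"[.!?](?:\s+|$)", text):
        out.append((start, text[start:m.end()].strip()))
        start = m.end()
    if start < len(text):
        out.append((start, text[start:].strip()))
    return out

def tokenize(sentence, offset=0):
    """Word/punctuation tokens with document-absolute offsets."""
    return [(offset + m.start(), m.group())
            for m in re.finditer(r"\w+|[^\w\s]", sentence)]
```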

  9. VISPA2: a scalable pipeline for high-throughput identification and annotation of vector integration sites.

    Science.gov (United States)

    Spinozzi, Giulio; Calabria, Andrea; Brasca, Stefano; Beretta, Stefano; Merelli, Ivan; Milanesi, Luciano; Montini, Eugenio

    2017-11-25

    Bioinformatics tools designed to identify lentiviral or retroviral vector insertion sites in the genome of host cells are used to address the safety and long-term efficacy of hematopoietic stem cell gene therapy applications and to study the clonal dynamics of hematopoietic reconstitution. The increasing number of gene therapy clinical trials, combined with the increasing amount of Next Generation Sequencing data aimed at identifying integration sites, requires highly accurate and efficient computational software able to correctly process "big data" in a reasonable computational time. Here we present VISPA2 (Vector Integration Site Parallel Analysis, version 2), the latest optimized computational pipeline for integration site identification and analysis, with the following features: (1) the sequence analysis for integration site processing is fully compliant with paired-end reads and includes a sequence quality filter before and after alignment to the target genome; (2) a heuristic algorithm that reduces false-positive integration sites at the nucleotide level, limiting the impact of Polymerase Chain Reaction or trimming/alignment artifacts; (3) a classification and annotation module for integration sites; (4) a user-friendly web interface as the researcher front-end for performing integration site analyses without computational skills; (5) the speedup of all steps through parallelization (Hadoop free). We tested VISPA2's performance using simulated and real datasets of lentiviral vector integration sites, previously obtained from patients enrolled in a hematopoietic stem cell gene therapy clinical trial, and compared the results with other preexisting tools for integration site analysis. On the computational side, VISPA2 showed a >6-fold speedup and improved precision and recall metrics (1 and 0.97, respectively) compared to previously developed computational pipelines. These performances indicate that VISPA2 is a fast, reliable and user-friendly tool for
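
    The nucleotide-level false-positive reduction (feature 2 above) amounts to collapsing near-duplicate sites; a minimal sketch follows. The fixed window and the read-count tie-break are assumptions for illustration, not VISPA2's exact heuristic.

```python
# Collapse integration sites on the same chromosome and strand that lie
# within a few bases of each other (PCR and trimming/alignment artifacts
# shift a true site slightly), keeping the best-supported position and
# pooling the read counts.
def collapse_sites(sites, window=3):
    """sites: iterable of (chrom, strand, pos, reads); returns merged list."""
    merged = []
    for chrom, strand, pos, reads in sorted(sites):
        last = merged[-1] if merged else None
        if (last and last[0] == chrom and last[1] == strand
                and pos - last[2] <= window):
            best = max((last, (chrom, strand, pos, reads)),
                       key=lambda s: s[3])
            merged[-1] = (best[0], best[1], best[2], last[3] + reads)
        else:
            merged.append((chrom, strand, pos, reads))
    return merged
```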

  10. A document processing pipeline for annotating chemical entities in scientific documents.

    Science.gov (United States)

    Campos, David; Matos, Sérgio; Oliveira, José L

    2015-01-01

    The recognition of drugs and chemical entities in text is a very important task within the field of biomedical information extraction, given the rapid growth in the amount of published texts (scientific papers, patents, patient records) and the relevance of these and other related concepts. If done effectively, this could allow exploiting such textual resources to automatically extract or infer relevant information, such as drug profiles, relations and similarities between drugs, or associations between drugs and potential drug targets. The objective of this work was to develop and validate a document processing and information extraction pipeline for the identification of chemical entity mentions in text. We used the BioCreative IV CHEMDNER task data to train and evaluate a machine-learning based entity recognition system. Using a combination of two conditional random field models, a selected set of features, and a post-processing stage, we achieved F-measure results of 87.48% in the chemical entity mention recognition task and 87.75% in the chemical document indexing task. We present a machine learning-based solution for automatic recognition of chemical and drug names in scientific documents. The proposed approach applies a rich feature set, including linguistic, orthographic, morphological, dictionary matching and local context features. Post-processing modules are also integrated, performing parentheses correction, abbreviation resolution and filtering erroneous mentions using an exclusion list derived from the training data. The developed methods were implemented as a document annotation tool and web service, freely available at http://bioinformatics.ua.pt/becas-chemicals/.
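
    The parentheses-correction step mentioned above can be illustrated with a toy repair: a tagger sometimes clips an entity so that it carries an unmatched bracket. The edge-trimming rule below is an assumed simplification, not the paper's actual module.

```python
# Trim a dangling bracket at either edge of a recognized mention,
# e.g. "aspirin)" -> "aspirin" and "(S" -> "S"; balanced mentions
# such as "(S)-ibuprofen" are left untouched.
def fix_parens(mention):
    opens, closes = mention.count("("), mention.count(")")
    if closes > opens and mention.endswith(")"):
        return mention[:-1]
    if opens > closes and mention.startswith("("):
        return mention[1:]
    return mention
```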

  12. Tunnels: different construction methods and its use for pipelines installation

    Energy Technology Data Exchange (ETDEWEB)

    Mattos, Tales; Soares, Ana Cecilia; Assis, Slow de; Bolsonaro, Ralfo; Sanandres, Simon [Petroleo Brasileiro S.A. (PETROBRAS), Rio de Janeiro, RJ (Brazil)]

    2009-07-01

    In a country of continental dimensions like Brazil, the pipeline modal faces the challenge of opening rights-of-way (ROWs) in the most varied kinds of soils and geomorphologies. To safely fulfill the pipeline construction demand, ROW opening uses all available techniques in earthworks and route definition and, where necessary, trenchless techniques such as horizontal directional drilling and micro-tunneling, as well as full-size tunnels designed for pipeline installation in high-relief terrains to avoid geotechnical risks. PETROBRAS has already used the tunnel technique to cross high terrains of great construction difficulty, and mainly to make pipeline maintenance and operation easier. For the GASBOL Project, in the Aparados da Serra region, and for GASYRG, in Bolivia, two tunnels were opened, approximately 700 meters and 2,000 meters long, respectively. The GASBOL Project had the particularity of being a gallery with only one excavation face, ending under the hill; from this point a vertical shaft was drilled up to the top to install the pipeline section, while in the GASYRG Project the tunnel had two excavation faces. Currently, two projects with tunnels are under development: the Caraguatatuba-Taubate gas pipeline (GASTAU), with a 5 km tunnel following the same concepts as the GASBOL tunnel (a gallery to be opened with a TBM (Tunnel Boring Machine) and a shaft to the surface), and the Cabiunas-Reduc III gas pipeline (GASDUC III), under construction with a 3.7 km tunnel with two faces, like the GASYRG tunnel. This paper presents the main tunnel excavation methods, conventional and mechanized, presenting the most relevant characteristics of both and, in particular, the use of tunnels for pipeline installation. (author)

  13. WGSSAT: A High-Throughput Computational Pipeline for Mining and Annotation of SSR Markers From Whole Genomes.

    Science.gov (United States)

    Pandey, Manmohan; Kumar, Ravindra; Srivastava, Prachi; Agarwal, Suyash; Srivastava, Shreya; Nagpure, Naresh S; Jena, Joy K; Kushwaha, Basdeo

    2018-03-16

    Mining and characterization of Simple Sequence Repeat (SSR) markers from whole genomes provide valuable information about the biological significance of SSR distribution and also facilitate the development of markers for genetic analysis. The Whole Genome Sequencing SSR Annotation Tool (WGSSAT) is a graphical-user-interface pipeline, developed using Java NetBeans and Perl scripts, which simplifies the process of SSR mining and characterization. WGSSAT takes input in FASTA format and automates the prediction of genes, noncoding RNA (ncRNA), core genes, repeats and SSRs from whole genomes, followed by mapping of the predicted SSRs onto the genome (classified according to genes, ncRNA, repeats, and exonic, intronic, and core gene regions) along with primer identification and mining of cross-species markers. The program also generates a detailed statistical report along with a visualization of the mapped SSRs, genes, core genes, and RNAs. The features of WGSSAT were demonstrated using Takifugu rubripes data. This yielded a total of 139 057 SSRs, out of which 113 703 SSR primer pairs amplified uniquely in silico against the T. rubripes (fugu) genome. Of the 113 703 mined SSRs, 81 463 were from gene regions (including 4286 exonic and 77 177 intronic), 7 from RNA and 267 from core genes of fugu, whereas 105 641 SSRs and 601 SSR primer pairs mapped uniquely onto the medaka genome. WGSSAT is tested under Ubuntu Linux. The source code, documentation, user manual, example dataset and scripts are available online at https://sourceforge.net/projects/wgssat-nbfgr.
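
    The SSR-mining core of such a tool can be sketched with a backreference regex that finds perfect microsatellites; the motif-length and repeat-count thresholds below are illustrative defaults, not WGSSAT's.

```python
# Find perfect SSRs: motifs of 1-6 bases repeated at least 4 times.
import re

SSR_RE = re.compile(r"(([ACGT]{1,6}?)\2{3,})")

def find_ssrs(seq):
    """Return (start, motif, repeat_count) for each SSR in seq."""
    out = []
    for m in SSR_RE.finditer(seq):
        motif = m.group(2)
        out.append((m.start(), motif, len(m.group(1)) // len(motif)))
    return out
```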

  14. Fish the ChIPs: a pipeline for automated genomic annotation of ChIP-Seq data

    Directory of Open Access Journals (Sweden)

    Minucci Saverio

    2011-10-01

    Background: High-throughput sequencing is generating massive amounts of data at a pace that largely exceeds the throughput of data analysis routines. Here we introduce Fish the ChIPs (FC), a computational pipeline aimed at a broad public of users and designed to perform complete ChIP-Seq data analysis of an unlimited number of samples, thus increasing throughput and reproducibility while saving time. Results: Starting from short-read sequences, FC performs the following steps: (1) quality controls, (2) alignment to a reference genome, (3) peak calling, (4) genomic annotation, (5) generation of raw signal tracks for visualization on the UCSC and IGV genome browsers. FC exploits some of the fastest and most effective tools available today. Installation on a Mac platform requires very basic computational skills, while configuration and usage are supported by a user-friendly graphical user interface. Alternatively, FC can be compiled from the source code on any Unix machine and then run with the possibility of customizing each single parameter through a simple configuration text file, which can be generated using a dedicated user-friendly web form. Considering the execution time, FC can be run on a desktop machine, even though the use of a computer cluster is recommended for analyses of large batches of data. FC is perfectly suited to work with data coming from Illumina Solexa Genome Analyzers or ABI SOLiD, and its usage can potentially be extended to any sequencing platform. Conclusions: Compared to existing tools, FC has two main advantages that make it suitable for a broad range of users. First of all, it can be installed and run by wet biologists on a Mac machine. Besides, it can handle an unlimited number of samples, being convenient for large analyses. In this context, computational biologists can increase the reproducibility of their ChIP-Seq data analyses while saving time for downstream analyses. Reviewers: This article was reviewed by Gavin Huttley, George

  15. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects.

    Science.gov (United States)

    Holt, Carson; Yandell, Mark

    2011-12-22

    Second-generation sequencing technologies are precipitating major shifts with regards to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation, because unlike first-generation projects, there are no pre-existing 'gold-standard' gene-models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies. We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training-data are limited, of low quality or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality; and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review. MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets.

  16. Interaction of Buried Pipeline with Soil Under Different Loading Cases

    Directory of Open Access Journals (Sweden)

    Magura Martin

    2016-09-01

    Gas pipelines pass through different topographies. Their stress level is influenced not only by gas pressure, but also by the adjacent soil, the thickness of any covering layers, and soil movements (sinking, landslides). The stress may be unevenly spread over a pipe due to these causes, and errors may occur when evaluating experimental measurements. The value of the resistance reserve of the steel can be refined by a detailed analysis of the loading. This reserve can be used in the assessment of a pipeline's actual state or in reconstructions. A detailed analysis of such loading and its comparison with simple elasticity theory is presented in this article.

  17. Different Approaches to Automatic Polarity Annotation at Synset Level

    NARCIS (Netherlands)

    Maks, E.; Vossen, P.T.J.M.; Sagot, B.

    2011-01-01

    In this paper we explore two approaches for the automatic annotation of polarity (positive, negative and neutral) of adjective synsets in Dutch. Both approaches focus on the creation of a Dutch polarity lexicon at word sense level using wordnet as a lexical resource. The first method is based upon

  18. DomSign: a top-down annotation pipeline to enlarge enzyme space in the protein universe.

    Science.gov (United States)

    Wang, Tianmin; Mori, Hiroshi; Zhang, Chong; Kurokawa, Ken; Xing, Xin-Hui; Yamada, Takuji

    2015-03-21

    Computational predictions of catalytic function are vital for in-depth understanding of enzymes. Because several novel approaches performing better than the common BLAST tool are rarely applied in research, we hypothesized that there is a large gap between the number of known annotated enzymes and the actual number in the protein universe, which significantly limits our ability to extract additional biologically relevant functional information from the available sequencing data. To reliably expand the enzyme space, we developed DomSign, a highly accurate domain signature-based enzyme functional prediction tool to assign Enzyme Commission (EC) digits. DomSign is a top-down prediction engine that yields results comparable, or superior, to those from many benchmark EC number prediction tools, including BLASTP, when a homolog with an identity >30% is not available in the database. Performance tests showed that DomSign is a highly reliable enzyme EC number annotation tool. After multiple tests, the accuracy is thought to be greater than 90%. Thus, DomSign can be applied to large-scale datasets, with the goal of expanding the enzyme space with high fidelity. Using DomSign, we successfully increased the percentage of EC-tagged enzymes from 12% to 30% in UniProt-TrEMBL. In the Kyoto Encyclopedia of Genes and Genomes bacterial database, the percentage of EC-tagged enzymes for each bacterial genome could be increased from 26.0% to 33.2% on average. Metagenomic mining was also efficient, as exemplified by the application of DomSign to the Human Microbiome Project dataset, recovering nearly one million new EC-labeled enzymes. Our results offer preliminary confirmation of the existence of the hypothesized huge number of "hidden enzymes" in the protein universe, the identification of which could substantially further our understanding of the metabolisms of diverse organisms and also facilitate bioengineering by providing a richer enzyme resource. Furthermore, our results
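
    The top-down, domain-signature idea can be sketched as a lookup from a protein's domain architecture to an EC number; the signature table below is invented for illustration (the two-domain entry mimics a GAPDH-like architecture) and is not DomSign's actual data or algorithm.

```python
# Map an ordered tuple of domain IDs (a domain signature) to an EC
# number; unknown architectures get no assignment. Table entries are
# illustrative placeholders, not real DomSign signatures.
SIGNATURES = {
    ("PF00044", "PF02800"): "1.2.1.12",  # GAPDH-like architecture
}

def assign_ec(domains):
    return SIGNATURES.get(tuple(domains))
```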

  19. Exploiting "Subjective" Annotations

    NARCIS (Netherlands)

    Reidsma, Dennis; op den Akker, Hendrikus J.A.; Artstein, R.; Boleda, G.; Keller, F.; Schulte im Walde, S.

    2008-01-01

    Many interesting phenomena in conversation can only be annotated as a subjective task, requiring interpretative judgements from annotators. This leads to data which is annotated with lower levels of agreement not only due to errors in the annotation, but also due to the differences in how annotators

  20. Detection of Variable Stars in the Open Cluster M11 Using Difference Image Analysis Pipeline

    Directory of Open Access Journals (Sweden)

    Chung-Uk Lee

    2010-12-01

    We developed a photometric pipeline to be used for a wide-field survey. This pipeline employs the difference image analysis (DIA) method, appropriate for photometry of star-dense fields such as the Galactic bulge. To verify the performance of the pipeline, the observed dataset of the open cluster M11 was re-processed. One hundred seventy-eight variable stars were newly discovered by analyzing the light curves, whose photometric accuracy was improved through the DIA. The total number of variable stars in the M11 observation region is 335, including 157 variable stars discovered by previous studies. We present the catalogue and light curves for the 178 new variable stars. This study shows that the photometric pipeline using the DIA is very useful for the detection of variable stars in a cluster.
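
    The DIA step at the heart of the pipeline can be sketched in one dimension: degrade a sharper reference frame so its point-spread function matches the science frame, subtract, and flag variables as strong residuals. The toy pixel arrays and fixed kernel are assumptions; real DIA fits the matching kernel per sub-frame.

```python
# 1-D difference imaging: convolve the reference with a PSF-matching
# kernel, subtract from the science frame, and threshold the residuals.
def convolve_same(signal, kernel):
    """Same-size 1-D convolution (correlation form; identical for the
    symmetric kernels used here). Kernel length is assumed odd."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += k * signal[idx]
        out.append(acc)
    return out

def difference_image(science, reference, kernel):
    matched = convolve_same(reference, kernel)
    return [s - m for s, m in zip(science, matched)]

def detect_variables(diff, threshold):
    return [i for i, d in enumerate(diff) if abs(d) > threshold]
```

    Constant stars cancel in the subtraction, so only pixels whose flux changed between the frames survive the threshold.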

  1. A Protocol for Annotating Parser Differences. Research Report. ETS RR-16-02

    Science.gov (United States)

    Bruno, James V.; Cahill, Aoife; Gyawali, Binod

    2016-01-01

    We present an annotation scheme for classifying differences in the outputs of syntactic constituency parsers when a gold standard is unavailable or undesired, as in the case of texts written by nonnative speakers of English. We discuss its automated implementation and the results of a case study that uses the scheme to choose a parser best suited…

  2. Corrosion Behavior of Pipeline Carbon Steel under Different Iron Oxide Deposits in the District Heating System

    Directory of Open Access Journals (Sweden)

    Yong-Sang Kim

    2017-05-01

    The corrosion behavior of pipeline steel covered by iron oxides (α-FeOOH, Fe3O4 and Fe2O3) was investigated in simulated district heating water. In potentiodynamic polarization tests, the corrosion rate of pipeline steel increased under the iron oxides, but the rate of increase differed owing to the different chemical reactions of the covering oxides. Pitting corrosion was only observed on the α-FeOOH-covered specimen, caused by crevice corrosion under the α-FeOOH. From Mott-Schottky and X-ray diffraction results, the surface reaction and oxide layer were dependent on the kind of iron oxide. Iron oxide deposits increase the failure risk of the pipeline, and localized corrosion can occur under the α-FeOOH-covered region of the pipeline. Thus, prevention methods against iron oxide deposits in the district heating pipeline system, such as filtering or periodic chemical cleaning, are needed.

  3. Evaluation of a hybrid approach using UBLAST and BLASTX for metagenomic sequences annotation of specific functional genes.

    Directory of Open Access Journals (Sweden)

    Ying Yang

    Full Text Available The fast development of next-generation sequencing (NGS) has dramatically increased the application of metagenomics in various aspects. Functional annotation is a major step in metagenomics studies. Fast annotation of functional genes has been a challenge because of the deluge of NGS data and expanding databases. A hybrid annotation pipeline proposed previously for taxonomic assignments was evaluated in this study for metagenomic sequence annotation of specific functional genes, such as antibiotic resistance genes, arsenic resistance genes and key genes in nitrogen metabolism. The hybrid approach using UBLAST and BLASTX is 44-177 times faster than direct BLASTX in annotation against the small protein databases for the specific functional genes, at the cost of missing a small portion (<1.8%) of target sequences compared with direct BLASTX hits. Unlike direct BLASTX, the time required for specific functional gene annotation with the hybrid pipeline depends on the abundance of the target genes. This hybrid annotation pipeline is therefore better suited to specific functional gene annotation than to comprehensive functional gene annotation.
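The speed-up of the hybrid approach comes from a cheap pre-filter followed by an expensive confirmation step applied only to the survivors. The toy sketch below illustrates that two-stage structure with a shared-k-mer screen standing in for UBLAST and exact substring containment standing in for BLASTX; the sequences and thresholds are invented for illustration.

```python
def kmers(seq, k=4):
    """All k-length substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def fast_prefilter(read, targets, min_shared=2, k=4):
    """Cheap stage (UBLAST's role): keep targets sharing >= min_shared k-mers."""
    rk = kmers(read, k)
    return [t for t in targets if len(rk & kmers(t, k)) >= min_shared]

def precise_hit(read, targets):
    """Expensive stage (BLASTX's role): here just exact substring containment."""
    return [t for t in targets if read in t or t in read]

targets = ["ATGGCGTACGTTAA", "CCCCGGGGTTTTAA", "ATGGCGTACG"]
read = "ATGGCGTACG"

candidates = fast_prefilter(read, targets)   # most targets drop out cheaply
hits = precise_hit(read, candidates)         # costly check on few survivors
print(hits)
```

Because the expensive stage only sees pre-filtered candidates, total runtime scales with the abundance of near-matches, mirroring the abstract's observation that hybrid runtime depends on target-gene abundance.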

  4. Structure-Promiscuity Relationship Puzzles-Extensively Assayed Analogs with Large Differences in Target Annotations.

    Science.gov (United States)

    Hu, Ye; Jasial, Swarit; Gilberg, Erik; Bajorath, Jürgen

    2017-05-01

    Publicly available screening data were systematically searched for extensively assayed structural analogs with large differences in the number of targets they were active against. Screening compounds with potential chemical liabilities that may give rise to assay artifacts were identified and excluded from the analysis. "Promiscuity cliffs" were frequently identified, defined here as pairs of structural analogs with a difference of at least 20 target annotations across all assays they were tested in. New assay indices were introduced to prioritize cliffs formed by screening compounds that were extensively tested in comparably large numbers of assays including many shared assays. In these cases, large differences in promiscuity degrees were not attributable to differences in assay frequency and/or lack of assay overlap. Such analog pairs have high priority for further exploring molecular origins of multi-target activities. Therefore, these promiscuity cliffs and associated target annotations are made freely available. The corresponding analogs often represent equally puzzling and interesting examples of structure-promiscuity relationships.
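The cliff definition above (pairs of structural analogs whose target-annotation counts differ by at least 20) is straightforward to operationalize once analog pairs and per-compound counts are in hand. A minimal sketch with invented compound names and counts:

```python
# Hypothetical data: compound -> number of target annotations across all assays.
target_counts = {"cpd_A": 2, "cpd_B": 30, "cpd_C": 5, "cpd_D": 27}

# Pairs already established to be structural analogs (e.g., matched molecular pairs).
analog_pairs = [("cpd_A", "cpd_B"), ("cpd_A", "cpd_C"), ("cpd_C", "cpd_D")]

def promiscuity_cliffs(pairs, counts, min_diff=20):
    """A pair of analogs forms a cliff if their target counts differ by >= min_diff."""
    return [(a, b) for a, b in pairs
            if abs(counts[a] - counts[b]) >= min_diff]

cliffs = promiscuity_cliffs(analog_pairs, target_counts)
print(cliffs)  # [('cpd_A', 'cpd_B'), ('cpd_C', 'cpd_D')]
```

The paper's additional assay indices (comparable assay totals, shared-assay overlap) would be applied as further filters on these pairs.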

  5. Comparative Metagenomic Analysis of Human Gut Microbiome Composition Using Two Different Bioinformatic Pipelines

    Directory of Open Access Journals (Sweden)

    Valeria D’Argenio

    2014-01-01

    Full Text Available Technological advances in next-generation sequencing-based approaches have greatly impacted the analysis of microbial community composition. In particular, 16S rRNA-based methods have been widely used to analyze the whole set of bacteria present in a target environment. As a consequence, several specific bioinformatic pipelines have been developed to manage these data. MetaGenome Rapid Annotation using Subsystem Technology (MG-RAST) and Quantitative Insights Into Microbial Ecology (QIIME) are two freely available tools for metagenomic analyses that have been used in a wide range of studies. Here, we report the comparative analysis of the same dataset with both QIIME and MG-RAST in order to evaluate their accuracy in taxonomic assignment and in diversity analysis. We found that taxonomic assignment was more accurate with QIIME, which, at the family level, assigned a significantly higher number of reads. Thus, QIIME generated a more accurate BIOM file, which in turn improved the diversity analysis output. Finally, although informatics skills are needed to install QIIME, it offers a wide range of metrics that are useful for downstream applications and, no less important, it does not depend on server processing times.
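A comparison like the one reported can be reduced to simple summary statistics, for example the fraction of reads given a family-level assignment by each pipeline and the agreement on reads assigned by both. A hedged sketch with made-up per-read calls (not the study's data):

```python
def family_assignment_rate(assignments):
    """Fraction of reads with a non-empty family-level taxonomic assignment."""
    assigned = sum(1 for fam in assignments.values() if fam is not None)
    return assigned / len(assignments)

# Hypothetical per-read family calls from two pipelines on the same dataset.
pipeline_a = {"r1": "Lachnospiraceae", "r2": "Bacteroidaceae",
              "r3": None, "r4": "Ruminococcaceae"}
pipeline_b = {"r1": "Lachnospiraceae", "r2": None,
              "r3": None, "r4": "Ruminococcaceae"}

rate_a = family_assignment_rate(pipeline_a)  # pipeline A assigns more reads
rate_b = family_assignment_rate(pipeline_b)

# Agreement restricted to reads that both pipelines assigned:
both = [r for r in pipeline_a if pipeline_a[r] and pipeline_b[r]]
agree = sum(pipeline_a[r] == pipeline_b[r] for r in both) / len(both)
print(rate_a, rate_b, agree)
```

Higher assignment rate at a given rank is exactly the kind of difference that propagates into the BIOM table and the downstream diversity metrics the abstract mentions.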

  6. The use of VLD (vive la difference) in the molecular-replacement approach: a pipeline.

    Science.gov (United States)

    Carrozzini, Benedetta; Cascarano, Giovanni Luca; Comunale, Giuliana; Giacovazzo, Carmelo; Mazzone, Annamaria

    2013-06-01

    VLD (vive la difference) is a novel ab initio phasing approach that is able to drive random phases to the correct values. It has been applied to small, medium and protein structures, provided that the data resolution was atomic. It has never been used for non-ab initio cases in which some phase information is available but the data resolution is usually very far from 1 Å. In this paper, the potential of VLD is tested for the first time on a classical non-ab initio problem: molecular replacement. Good preliminary experimental results encouraged the construction of a pipeline for leading partial molecular-replacement models with errors to refined solutions in a fully automated way. The pipeline modules and their interaction are described, together with applications to a wide set of test cases.

  7. Modelling and Simulation of Free Floating Pig for Different Pipeline Inclination Angles

    Directory of Open Access Journals (Sweden)

    Woldemichael Dereje Engida

    2016-01-01

    Full Text Available This paper presents the modelling and simulation of a free-floating pig to determine the flow parameters required to avoid pig stalling during pigging operations. A free-floating spherical pig was designed and equipped with the necessary sensors to detect leaks along the pipeline. The free-floating pig has no internal or external power supply to navigate through the pipeline; instead, it is driven by the flowing medium. To avoid stalling of the pig, it is essential to conduct simulations to determine the necessary flow parameters for different inclination angles. Accordingly, pipeline sections with inclinations of 0°, 15°, 30°, 45°, 60°, 75°, and 90° were modelled and simulated using ANSYS FLUENT 15.0 with water and oil as the working media. For each case, the minimum velocity required to propel the free-floating pig through the inclined section was determined. In addition, the trajectory of the free-floating pig was visualized in the simulation.
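Independently of the CFD model used in the paper, the stalling condition can be approximated by a quasi-static force balance: the drag on the sphere must exceed its submerged-weight component along the inclined pipe. The sketch below uses this simplification with illustrative parameter values; it is not the ANSYS FLUENT model, and all numbers are assumptions.

```python
import math

def min_velocity(theta_deg, m=5.0, d=0.15, rho=1000.0, cd=0.47):
    """Minimum flow velocity (m/s) to push a free-floating spherical pig up an incline.

    Simplified force balance: drag 0.5*rho*cd*A*v^2 must exceed the submerged
    weight component along the pipe axis, (m - rho*V)*g*sin(theta).
    Parameter values (mass, diameter, drag coefficient) are illustrative.
    """
    g = 9.81
    area = math.pi * d**2 / 4.0      # frontal area of the sphere
    vol = math.pi * d**3 / 6.0       # sphere volume (buoyancy)
    axial_weight = (m - rho * vol) * g * math.sin(math.radians(theta_deg))
    if axial_weight <= 0:            # buoyant or horizontal: any flow carries it
        return 0.0
    return math.sqrt(2.0 * axial_weight / (rho * cd * area))

for angle in (0, 15, 30, 45, 60, 75, 90):
    print(angle, round(min_velocity(angle), 3))
```

The required velocity grows with inclination angle, which matches the qualitative motivation for simulating each angle separately.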

  8. AIGO: Towards a unified framework for the Analysis and the Inter-comparison of GO functional annotations

    Directory of Open Access Journals (Sweden)

    Defoin-Platel Michael

    2011-11-01

    Full Text Available Abstract Background In response to the rapid growth of available genome sequences, efforts have been made to develop automatic inference methods to functionally characterize them. Pipelines that infer functional annotation are now routinely used to produce new annotations at a genome scale and for a broad variety of species. These pipelines differ widely in their inference algorithms, confidence thresholds and data sources for reasoning. This heterogeneity makes a comparison of the relative merits of each approach extremely complex. The evaluation of the quality of the resultant annotations is also challenging given that there is often no existing gold standard against which to evaluate precision and recall. Results In this paper, we present a pragmatic approach to the study of functional annotations. An ensemble of 12 metrics, describing various aspects of functional annotations, is defined and implemented in a unified framework, which facilitates their systematic analysis and inter-comparison. The use of this framework is demonstrated on three illustrative examples: analysing the outputs of state-of-the-art inference pipelines, comparing electronic versus manual annotation methods, and monitoring the evolution of publicly available functional annotations. The framework is part of the AIGO library (http://code.google.com/p/aigo) for the Analysis and the Inter-comparison of the products of Gene Ontology (GO) annotation pipelines. The AIGO library also provides functionalities to easily load, analyse, manipulate and compare functional annotations, and to plot and export the results of the analysis in various formats. Conclusions This work is a step toward developing a unified framework for the systematic study of GO functional annotations. The framework has been designed so that new metrics on GO functional annotations can be added in a very straightforward way.
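The 12 AIGO metrics themselves are not reproduced here, but the flavor of set-based inter-comparison can be shown with one simple measure: the Jaccard similarity between the GO term sets that two pipelines assign to the same gene. The GO IDs and gene names below are purely illustrative.

```python
def jaccard(set_a, set_b):
    """One simple overlap metric on two GO term sets for the same gene product."""
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

# Hypothetical annotations of the same genes by two pipelines.
pipeline_x = {"gene1": {"GO:0003677", "GO:0006355"}, "gene2": {"GO:0016301"}}
pipeline_y = {"gene1": {"GO:0003677"}, "gene2": {"GO:0016301", "GO:0004672"}}

scores = {g: jaccard(pipeline_x[g], pipeline_y[g]) for g in pipeline_x}
print(scores)  # {'gene1': 0.5, 'gene2': 0.5}
```

A real metric on GO annotations would also account for the ontology's hierarchy (an ancestor term is partially right, not simply wrong), which is part of what makes a dedicated framework like AIGO useful.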

  9. The Microstructure Evolution of Dual-Phase Pipeline Steel with Plastic Deformation at Different Strain Rates

    Science.gov (United States)

    Ji, L. K.; Xu, T.; Zhang, J. M.; Wang, H. T.; Tong, M. X.; Zhu, R. H.; Zhou, G. S.

    2017-07-01

    Tensile properties of the high-deformability dual-phase ferrite-bainite X70 pipeline steel have been investigated at room temperature under strain rates of 2.5 × 10⁻⁵, 1.25 × 10⁻⁴, 2.5 × 10⁻³, and 1.25 × 10⁻² s⁻¹. The microstructures at different amounts of plastic deformation were examined using scanning and transmission electron microscopy. Generally, the ductility of typical body-centered cubic steels is reduced as the strain rate increases. However, we observed a different ductility dependence on strain rate in the dual-phase X70 pipeline steel. The uniform elongation (UEL%) and elongation to fracture (EL%) at the strain rate of 2.5 × 10⁻³ s⁻¹ increase by about 54 and 74%, respectively, compared with those at 2.5 × 10⁻⁵ s⁻¹. The UEL% and EL% reach their maximum at the strain rate of 2.5 × 10⁻³ s⁻¹. This phenomenon is explained by the observed grain structures and dislocation configurations. Whether the ductility can be enhanced with increasing strain rate depends on the competition between the homogenization of plastic deformation among the microconstituents (ultra-fine ferrite grains, relatively coarse ferrite grains and bainite) and the progress of cracks formed as a consequence of localized, inconsistent plastic deformation.

  10. Graph-based sequence annotation using a data integration approach

    Directory of Open Access Journals (Sweden)

    Pesch Robert

    2008-06-01

    Full Text Available The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and predicting the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO) and the pathway database AraCyc, which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation.

  11. Graph-based sequence annotation using a data integration approach.

    Science.gov (United States)

    Pesch, Robert; Lysenko, Artem; Hindle, Matthew; Hassani-Pak, Keywan; Thiele, Ralf; Rawlings, Christopher; Köhler, Jacob; Taubert, Jan

    2008-08-25

    The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and predicting the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO and the pathway database AraCyc) which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation. The methods and algorithms presented in this publication are an integral part of the ONDEX system which is freely available from http://ondex.sf.net/.
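The core idea, scoring a function assignment by the network of information supporting it, can be caricatured as counting the independent reference sources behind each gene-to-GO-term link. This is a simplification of the ONDEX-based approach described above, and the evidence triples are invented.

```python
from collections import defaultdict

# Hypothetical evidence: (gene, GO term, source) triples from an integrated graph.
evidence = [
    ("geneA", "GO:0005524", "UniProt"),
    ("geneA", "GO:0005524", "PFAM"),
    ("geneA", "GO:0005524", "PDB"),
    ("geneA", "GO:0016301", "PFAM"),
    ("geneB", "GO:0003677", "UniProt"),
]

def score_assignments(triples):
    """Confidence = number of independent reference sources linking gene to term."""
    support = defaultdict(set)
    for gene, term, source in triples:
        support[(gene, term)].add(source)
    return {key: len(sources) for key, sources in support.items()}

scores = score_assignments(evidence)
print(scores[("geneA", "GO:0005524")])  # supported by 3 independent sources
```

An assignment corroborated by several linked databases would be ranked above one resting on a single prediction, which is the plausibility signal the abstract argues sequential pipelines throw away.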

  12. Correction: Electrochemical Investigation of the Corrosion of Different Microstructural Phases of X65 Pipeline Steel under Saturated Carbon Dioxide Conditions. Materials 2015, 8, 2635–2649

    Directory of Open Access Journals (Sweden)

    Yuanfeng Yang

    2015-12-01

    Full Text Available In the published manuscript “Electrochemical Investigation of the Corrosion of Different Microstructural Phases of X65 Pipeline Steel under Saturated Carbon Dioxide Conditions. [...

  13. Effect of different microstructural parameters on hydrogen induced cracking in an API X70 pipeline steel

    Science.gov (United States)

    Mohtadi-Bonab, M. A.; Eskandari, M.; Karimdadashi, R.; Szpunar, J. A.

    2017-07-01

    In this study, the surface and cross-section of an as-received API X70 pipeline steel were studied by SEM and EDS techniques in order to categorize the shape and morphology of inclusions. Electrochemical hydrogen charging in a mixed solution of 0.2 M sulfuric acid and 3 g/l ammonium thiocyanate was then used to create hydrogen cracks in the X70 steel. After the hydrogen charging experiments, the cross-section of the steel was carefully examined by SEM to locate hydrogen cracks. The regions around hydrogen cracks were investigated by SEM and EBSD techniques to assess the role of different microstructural parameters in the hydrogen-induced cracking (HIC) phenomenon. The results showed that inclusions were randomly distributed in the cross-sections of the tested specimens, and different types of inclusions were found in the as-received X70 steel. However, only inclusions that were hard, brittle and incoherent with the metal matrix, such as manganese sulfide and carbonitride precipitates, were found to be detrimental with respect to HIC. HIC cracks propagated dominantly in a transgranular manner through differently oriented grains with no clear preferential trend. In addition, a different type of HIC crack, deviating about 15-20 degrees from the rolling direction, was found and studied by the EBSD technique, and the role of micro-texture parameters in HIC was discussed.

  14. Re-annotation of the woodland strawberry (Fragaria vesca) genome.

    Science.gov (United States)

    Darwish, Omar; Shahan, Rachel; Liu, Zhongchi; Slovin, Janet P; Alkharouf, Nadim W

    2015-01-27

    Fragaria vesca is a low-growing, small-fruited diploid strawberry species commonly called woodland strawberry. It is native to temperate regions of Eurasia and North America and, while it produces edible fruits, it is most useful as an experimental perennial plant system that can serve as a model for the agriculturally important Rosaceae family. A draft of the F. vesca genome sequence was published in 2011 [Nat Genet 43:223, 2011]. The first-generation annotation (version 1.1) was developed using GeneMark-ES+ [Nuc Acids Res 33:6494, 2005], a self-training gene prediction tool that relies primarily on combining ab initio predictions with mapping of high-confidence ESTs, in addition to mapping gene deserts from transposable elements. Based on over 25 different tissue transcriptomes, we have revised the F. vesca genome annotation, providing several improvements over version 1.1. The new annotation, which was achieved using Maker, describes many more predicted protein-coding genes than the GeneMark-generated annotation currently hosted at the Genome Database for Rosaceae (http://www.rosaceae.org/). Our new annotation also increases the overall total coding length and the number of coding regions found. The total number of gene predictions that do not overlap the previous annotation is 2286, most of which are homologous to other plant genes. We experimentally verified one of the new gene model predictions to validate our results. Using the RNA-Seq transcriptome sequences from 25 diverse tissue types, the re-annotation pipeline improved existing annotations by increasing annotation accuracy based on extensive transcriptome data. It uncovered new genes, added exons to current genes, and extended or merged exons. This complete genome re-annotation will significantly benefit functional genomic studies of the strawberry and other members of the Rosaceae.
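One mechanical part of such a re-annotation, extending or merging exons when transcript evidence bridges them, amounts to interval merging. A minimal sketch (coordinates are invented, and this is not the Maker pipeline itself):

```python
def merge_exons(exons):
    """Merge overlapping or adjacent exon intervals, as RNA-Seq evidence can
    extend or join exons during re-annotation. Intervals are (start, end), inclusive."""
    merged = []
    for start, end in sorted(exons):
        if merged and start <= merged[-1][1] + 1:      # overlap or adjacency
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Existing model exons plus new transcript-supported intervals:
old_exons = [(100, 200), (300, 400)]
rnaseq_support = [(180, 310), (500, 600)]   # joins the first two, adds one
print(merge_exons(old_exons + rnaseq_support))  # [(100, 400), (500, 600)]
```

Here the transcript-supported interval (180, 310) bridges two previously separate exons into one, and (500, 600) becomes a newly added exon.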

  15. Anodic Dissolution of API X70 Pipeline Steel in Arabian Gulf Seawater after Different Exposure Intervals

    OpenAIRE

    El-Sayed M. Sherif; Abdulhakim A. Almajid

    2014-01-01

    The anodic dissolution of API X70 pipeline steel in Arabian Gulf seawater (AGSW) was investigated using open-circuit potential (OCP), electrochemical impedance spectroscopy (EIS), cyclic potentiodynamic polarization (CPP), and current-time measurements. The electrochemical experiments revealed that the X70 pipeline steel suffers both general and pitting corrosion in the AGSW solution. It was found that the general corrosion decreases as a result of decreasing the corrosion current density (jc...

  16. Pipeline engineering

    CERN Document Server

    Liu, Henry

    2003-01-01

    PART I: PIPE FLOWS. INTRODUCTION: Definition and Scope; Brief History of Pipelines; Existing Major Pipelines; Importance of Pipelines; Freight (Solids) Transport by Pipelines; Types of Pipelines; Components of Pipelines; Advantages of Pipelines; References. SINGLE-PHASE INCOMPRESSIBLE NEWTONIAN FLUID: Introduction; Flow Regimes; Local Mean Velocity and Its Distribution (Velocity Profile); Flow Equations for One-Dimensional Analysis; Hydraulic and Energy Grade Lines; Cavitation in Pipeline Systems; Pipe in Series and Parallel; Interconnected Reservoirs; Pipe Network; Unsteady Flow in Pipe. SINGLE-PHASE COMPRESSIBLE FLOW IN PIPE: Flow Ana

  17. Semantic annotation of mutable data.

    Directory of Open Access Journals (Sweden)

    Robert A Morris

    Full Text Available Electronic annotation of scientific data is very similar to annotation of documents. Both types of annotation amplify the original object, add related knowledge to it, and dispute or support assertions in it. In each case, annotation is a framework for discourse about the original object, and, in each case, an annotation needs to clearly identify its scope and its own terminology. However, electronic annotation of data differs from annotation of documents: the content of the annotations, including expectations and supporting evidence, is more often shared among members of networks. Any consequent actions taken by the holders of the annotated data could be shared as well. But even those current annotation systems that admit data as their subject often make it difficult or impossible to annotate at fine-enough granularity to use the results in this way for data quality control. We address these kinds of issues by offering simple extensions to an existing annotation ontology and describe how the results support an interest-based distribution of annotations. We are using the result to design and deploy a platform that supports annotation services overlaid on networks of distributed data, with particular application to data quality control. Our initial instance supports a set of natural science collection metadata services. An important application is the support for data quality control and provision of missing data. A previous proof of concept demonstrated such use based on data annotations modeled with XML-Schema.
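The fine-granularity requirement discussed above suggests an annotation object that can target a whole record or a single field within it. The data model below is hypothetical, a sketch of the idea rather than the ontology extension the paper defines; all class and field names are invented.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DataAnnotation:
    """An annotation anchored at record or field granularity (illustrative model)."""
    dataset: str
    record_id: str
    field_name: Optional[str]     # None = the whole record is the subject
    assertion: str                # e.g. a disputed or corrected value
    evidence: str
    interested_parties: list = field(default_factory=list)

a = DataAnnotation(
    dataset="herbarium_specimens",
    record_id="SPEC-0042",
    field_name="collection_date",
    assertion="date should read 1912-05-03, not 1921-05-03",
    evidence="handwritten label scan",
    interested_parties=["curator@example.org"],
)
print(a.field_name)  # field-level anchoring enables targeted quality control
```

Field-level anchoring is what allows the interest-based distribution the abstract describes: holders of the annotated data can subscribe only to annotations touching the fields they care about.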

  18. Ubiquitous Annotation Systems

    DEFF Research Database (Denmark)

    Hansen, Frank Allan

    2006-01-01

    Ubiquitous annotation systems allow users to annotate physical places, objects, and persons with digital information. Especially in the field of location-based information systems much work has been done to implement adaptive and context-aware systems, but few efforts have focused on the general requirements for linking information to objects in both physical and digital space. This paper surveys annotation techniques from open hypermedia systems, Web-based annotation systems, and mobile and augmented reality systems to illustrate different approaches to four central challenges ubiquitous annotation systems have to deal with: anchoring, structuring, presentation, and authoring. Through a number of examples each challenge is discussed and HyCon, a context-aware hypermedia framework developed at the University of Aarhus, Denmark, is used to illustrate an integrated approach to ubiquitous annotations…

  19. A graph-based approach for designing extensible pipelines.

    Science.gov (United States)

    Rodrigues, Maíra R; Magalhães, Wagner C S; Machado, Moara; Tarazona-Santos, Eduardo

    2012-07-12

    In bioinformatics, it is important to build extensible and low-maintenance systems that are able to deal with the new tools and data formats that are constantly being developed. The traditional and simplest implementation of pipelines involves hardcoding the execution steps into programs or scripts. This approach can lead to problems when a pipeline is expanding because the incorporation of new tools is often error prone and time consuming. Current approaches to pipeline development such as workflow management systems focus on analysis tasks that are systematically repeated without significant changes in their course of execution, such as genome annotation. However, more dynamism on the pipeline composition is necessary when each execution requires a different combination of steps. We propose a graph-based approach to implement extensible and low-maintenance pipelines that is suitable for pipeline applications with multiple functionalities that require different combinations of steps in each execution. Here pipelines are composed automatically by compiling a specialised set of tools on demand, depending on the functionality required, instead of specifying every sequence of tools in advance. We represent the connectivity of pipeline components with a directed graph in which components are the graph edges, their inputs and outputs are the graph nodes, and the paths through the graph are pipelines. To that end, we developed special data structures and a pipeline system algorithm. We demonstrate the applicability of our approach by implementing a format conversion pipeline for the fields of population genetics and genetic epidemiology, but our approach is also helpful in other fields where the use of multiple software is necessary to perform comprehensive analyses, such as gene expression and proteomics analyses. The project code, documentation and the Java executables are available under an open source license at http://code.google.com/p/dynamic-pipeline. 
The system
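The composition scheme described above (formats as graph nodes, converter components as edges, pipelines as paths) can be sketched with a breadth-first search over the format graph. The converter and format names below are hypothetical examples from the population-genetics domain, not tools shipped by the dynamic-pipeline project.

```python
from collections import deque

# Components are edges between data formats (nodes), per the paper's design.
converters = {
    ("vcf", "plink"): "vcf2plink",
    ("plink", "eigenstrat"): "plink2eigenstrat",
    ("vcf", "fasta"): "vcf2fasta",
}

def compose_pipeline(graph, source, target):
    """BFS over the format graph; the path of edges is the pipeline to run."""
    adj = {}
    for (a, b), tool in graph.items():
        adj.setdefault(a, []).append((b, tool))
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        fmt, path = queue.popleft()
        if fmt == target:
            return path
        for nxt, tool in adj.get(fmt, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [tool]))
    return None  # no chain of components connects the two formats

print(compose_pipeline(converters, "vcf", "eigenstrat"))
# ['vcf2plink', 'plink2eigenstrat']
```

Adding a new converter is just adding one edge; no existing pipeline definition has to be rewritten, which is the low-maintenance property the abstract emphasizes.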

  20. A graph-based approach for designing extensible pipelines

    Directory of Open Access Journals (Sweden)

    Rodrigues Maíra R

    2012-07-01

    Full Text Available Abstract Background In bioinformatics, it is important to build extensible and low-maintenance systems that are able to deal with the new tools and data formats that are constantly being developed. The traditional and simplest implementation of pipelines involves hardcoding the execution steps into programs or scripts. This approach can lead to problems when a pipeline is expanding because the incorporation of new tools is often error prone and time consuming. Current approaches to pipeline development such as workflow management systems focus on analysis tasks that are systematically repeated without significant changes in their course of execution, such as genome annotation. However, more dynamism on the pipeline composition is necessary when each execution requires a different combination of steps. Results We propose a graph-based approach to implement extensible and low-maintenance pipelines that is suitable for pipeline applications with multiple functionalities that require different combinations of steps in each execution. Here pipelines are composed automatically by compiling a specialised set of tools on demand, depending on the functionality required, instead of specifying every sequence of tools in advance. We represent the connectivity of pipeline components with a directed graph in which components are the graph edges, their inputs and outputs are the graph nodes, and the paths through the graph are pipelines. To that end, we developed special data structures and a pipeline system algorithm. We demonstrate the applicability of our approach by implementing a format conversion pipeline for the fields of population genetics and genetic epidemiology, but our approach is also helpful in other fields where the use of multiple software is necessary to perform comprehensive analyses, such as gene expression and proteomics analyses. 
The project code, documentation and the Java executables are available under an open source license at http://code.google.com/p/dynamic-pipeline

  1. OMIGA: Optimized Maker-Based Insect Genome Annotation.

    Science.gov (United States)

    Liu, Jinding; Xiao, Huamei; Huang, Shuiqing; Li, Fei

    2014-08-01

    Insects are one of the largest classes of animals on Earth and constitute more than half of all living species. The i5k initiative has begun sequencing more than 5,000 insect genomes, which should greatly help in exploring insect resources and in pest control. Insect genome annotation remains challenging because many insects have high levels of heterozygosity. To improve the quality of insect genome annotation, we developed a pipeline, named Optimized Maker-Based Insect Genome Annotation (OMIGA), to predict protein-coding genes from insect genomes. We first mapped RNA-Seq reads to genomic scaffolds with Bowtie to determine transcribed regions, and the putative transcripts were assembled using Cufflinks. We then selected highly reliable transcripts with intact coding sequences to train de novo gene prediction software, including Augustus. The re-trained software was used to predict genes from insect genomes, and Exonerate was used to refine gene structure and to determine near-exact exon/intron boundaries in the genome. Finally, we used the software Maker to integrate data from RNA-Seq, de novo gene prediction, and protein alignment to produce an official gene set. The OMIGA pipeline was used to annotate the draft genome of an important insect pest, Chilo suppressalis, yielding 12,548 genes. Different strategies were compared, demonstrating that OMIGA had the best performance. In summary, we present a comprehensive pipeline for identifying genes in insect genomes that can be widely used to improve annotation quality in insects. OMIGA is provided at http://ento.njau.edu.cn/omiga.html.
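One step in the pipeline, selecting highly reliable transcripts with intact coding sequences for training, can be caricatured as an ORF-intactness test. The check below is a crude stand-in for illustration (first ATG, in-frame stop), not OMIGA's actual selection criteria.

```python
STOPS = {"TAA", "TAG", "TGA"}

def has_intact_orf(transcript):
    """Crude check that a transcript carries a complete ORF: an ATG start
    followed, in frame, by a stop codon. Illustrative only."""
    start = transcript.find("ATG")
    if start == -1:
        return False
    for i in range(start + 3, len(transcript) - 2, 3):
        if transcript[i:i + 3] in STOPS:
            return True
    return False

print(has_intact_orf("GGATGAAACCCTAAGG"))  # True: ATG ... TAA in frame
print(has_intact_orf("GGGCCCGGG"))         # False: no start codon
```

Filtering training transcripts this way keeps fragmentary assemblies out of the gene-predictor training set, which is the point of the selection step in the abstract.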

  2. Anodic Dissolution of API X70 Pipeline Steel in Arabian Gulf Seawater after Different Exposure Intervals

    Directory of Open Access Journals (Sweden)

    El-Sayed M. Sherif

    2014-01-01

    Full Text Available The anodic dissolution of API X70 pipeline steel in Arabian Gulf seawater (AGSW) was investigated using open-circuit potential (OCP), electrochemical impedance spectroscopy (EIS), cyclic potentiodynamic polarization (CPP), and current-time measurements. The electrochemical experiments revealed that the X70 pipeline steel suffers both general and pitting corrosion in the AGSW solution. It was found that general corrosion decreases with increasing exposure time, as shown by the decreasing corrosion current density (jcorr), corrosion rate (Rcorr) and absolute currents, as well as the increasing polarization resistance of X70. On the other hand, pitting corrosion was found to increase with increasing immersion time. This was confirmed by the increase of current with time and by the SEM images obtained on the steel surface after 20 h of immersion followed by the application of −0.35 V versus Ag/AgCl for 1 h.

  3. FIGENIX: Intelligent automation of genomic annotation: expertise integration in a new software platform

    Directory of Open Access Journals (Sweden)

    Pontarotti Pierre

    2005-08-01

    Full Text Available Abstract Background Two of the main objectives of the genomic and post-genomic era are to structurally and functionally annotate genomes, which consists of detecting genes' positions and structures and inferring their functions (as well as other features of genomes). Structural and functional annotation both require the complex chaining of numerous different software tools, algorithms and methods under the supervision of a biologist. The automation of these pipelines is necessary to manage the huge amounts of data released by sequencing projects. Several pipelines already automate some of this complex chaining but still require a substantial contribution from biologists to supervise and control the results at various steps. Results Here we propose an innovative automated platform, FIGENIX, which includes an expert system capable of substituting for human expertise at several key steps. FIGENIX currently automates complex pipelines of structural and functional annotation under the supervision of the expert system (which allows, for example, key decisions to be made, intermediate results to be checked, or the dataset to be refined). The quality of the results produced by FIGENIX is comparable to that obtained by expert biologists, with a drastic gain in terms of time costs and avoidance of errors due to human manipulation of data. Conclusion The core engine and expert system of the FIGENIX platform currently handle complex annotation processes of broad interest for the genomic community. They could easily be adapted to new or more specialized pipelines, such as the annotation of miRNAs, the classification of complex multigenic families, and the annotation of regulatory elements and other genomic features of interest.

  4. THE DIFFERENCE IMAGING PIPELINE FOR THE TRANSIENT SEARCH IN THE DARK ENERGY SURVEY

    Energy Technology Data Exchange (ETDEWEB)

    Kessler, R.; Marriner, J.; Childress, M.; Covarrubias, R.; D’Andrea, C. B.; Finley, D. A.; Fischer, J.; Foley, R. J.; Goldstein, D.; Gupta, R. R.; Kuehn, K.; Marcha, M.; Nichol, R. C.; Papadopoulos, A.; Sako, M.; Scolnic, D.; Smith, M.; Sullivan, M.; Wester, W.; Yuan, F.; Abbott, T.; Abdalla, F. B.; Allam, S.; Benoit-Lévy, A.; Bernstein, G. M.; Bertin, E.; Brooks, D.; Rosell, A. Carnero; Kind, M. Carrasco; Castander, F. J.; Crocce, M.; Costa, L. N. da; Desai, S.; Diehl, H. T.; Eifler, T. F.; Neto, A. Fausti; Flaugher, B.; Frieman, J.; Gerdes, D. W.; Gruen, D.; Gruendl, R. A.; Honscheid, K.; James, D. J.; Kuropatkin, N.; Li, T. S.; Maia, M. A. G.; Marshall, J. L.; Martini, P.; Miller, C. J.; Miquel, R.; Nord, B.; Ogando, R.; Plazas, A. A.; Reil, K.; Romer, A. K.; Roodman, A.; Sanchez, E.; Sevilla-Noarbe, I.; Smith, R. C.; Soares-Santos, M.; Sobreira, F.; Tarle, G.; Thaler, J.; Thomas, R. C.; Tucker, D.; Walker, A. R.

    2015-11-06

    We describe the operation and performance of the difference imaging pipeline (DiffImg) used to detect transients in deep images from the Dark Energy Survey Supernova program (DES-SN) in its first observing season, from 2013 August through 2014 February. DES-SN is a search for transients in which ten 3 deg² fields are repeatedly observed in the g, r, i, z passbands with a cadence of about 1 week. The observing strategy has been optimized to measure high-quality light curves and redshifts for thousands of Type Ia supernovae (SNe Ia) with the goal of measuring dark energy parameters. The essential DiffImg functions are to align each search image to a deep reference image, do a pixel-by-pixel subtraction, and then examine the subtracted image for significant positive detections of point-source objects. The vast majority of detections are subtraction artifacts, but after selection requirements and image filtering with an automated scanning program, there are ~130 detections per deg² per observation in each band, of which only ~25% are artifacts. Of the ~7500 transients discovered by DES-SN in its first observing season, each requiring a detection on at least two separate nights, Monte Carlo (MC) simulations predict that 27% are SNe Ia or core-collapse SNe. Another ~30% of the transients are artifacts in which a small number of observations satisfy the selection criteria for a single-epoch detection. Spectroscopic analysis shows that most of the remaining transients are AGNs and variable stars. Fake SNe Ia are overlaid onto the images to rigorously evaluate detection efficiencies and to understand the DiffImg performance. The DiffImg efficiency measured with fake SNe agrees well with expectations from an MC simulation that uses analytical calculations of the fluxes and their uncertainties. In our 8 "shallow" fields with single-epoch 50% completeness depth ~23.5, the SN Ia efficiency falls to 1/2 at
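    The core subtract-and-detect step described above can be sketched in a few lines. This is a toy illustration only, assuming pre-aligned images: the real DiffImg performs astrometric alignment, PSF matching, and calibrated point-source detection, and the flat toy sky and simple global-noise threshold below are inventions of the sketch.

    ```python
    # Minimal sketch of difference imaging: subtract a reference image from a
    # search image pixel by pixel, then flag pixels that stand out above the
    # noise as candidate transient detections. Shapes, values, and the 5-sigma
    # cut are illustrative assumptions, not DES-SN parameters.
    from statistics import mean, pstdev

    def detect_transients(search, reference, nsigma=5.0):
        """Return (row, col) pixels whose difference exceeds nsigma above the noise."""
        diff = [[s - r for s, r in zip(srow, rrow)]
                for srow, rrow in zip(search, reference)]
        flat = [v for row in diff for v in row]
        mu, sigma = mean(flat), pstdev(flat)
        threshold = mu + nsigma * sigma
        return [(i, j) for i, row in enumerate(diff)
                for j, v in enumerate(row) if v > threshold]

    # Toy example: a flat sky with one injected "fake" source at (2, 3).
    ref = [[10.0] * 6 for _ in range(5)]
    sea = [row[:] for row in ref]
    sea[2][3] += 500.0
    print(detect_transients(sea, ref))  # → [(2, 3)]
    ```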

  5. The effect of different types of hypertext annotations on vocabulary recall, text comprehension, and knowledge transfer in learning from scientific texts

    Science.gov (United States)

    Wallen, Erik Stanley

    The instructional uses of hypertext and multimedia are widespread, but there are still many questions about how to maximize learning from these technologies. The purpose of this research was to determine whether providing learners with a basic science text together with hypertext annotations designed to support the cognitive processes of selection, organization, and integration (Mayer, 1997) would result in different types of learning. Learning was measured using instruments corresponding to each of the three processes. For the purposes of this study, selection-level learning was defined analogously to Bloom's (1956) knowledge level of learning and was measured with a recognition test. Organization-level learning was defined analogously to Bloom's (1956) comprehension level and was measured with a short-answer recall test. Integration-level learning was defined analogously to Bloom's (1956) levels of analysis and synthesis and was measured with a transfer test. In experiment one, participants read a text describing how cell phones work and viewed either no annotations (control) or annotations designed to support the selection, organization, or integration of information. As predicted, participants who viewed the selection-level annotations did significantly better than control participants on the recognition test. The results indicate that, for this group of novice learners, lower-level annotations were the most helpful for all levels of learning. In experiment two, participants read the text and viewed either no annotations (control) or combinations of annotations: selection and organization, organization and integration, or selection and integration. No significant differences were found between groups in this experiment. The results are discussed in terms of both multimedia learning theory and text comprehension theory, and a new visualization of the generative theory of multimedia learning is offered.

  6. Pipeline system operability review

    Energy Technology Data Exchange (ETDEWEB)

    Eriksson, Kjell [Det Norske Veritas (Norway); Davies, Ray [CC Technologies, Dublin, OH (United States)

    2005-07-01

    Pipeline operators are continuously working to improve the safety of their systems and operations. In the US, both liquid and gas pipeline operators have worked with the regulators over many years to develop more systematic approaches to pipeline integrity management. To successfully manage pipeline integrity, vast amounts of data from different sources need to be collected, overlaid and analyzed in order to assess the current condition and predict future degradation. The efforts undertaken by the operators have had a significant impact on pipeline safety; nevertheless, in recent years we have seen a number of major high-profile accidents. One can therefore ask how effective the pipeline integrity management systems and processes are. This paper presents a methodology, the 'Pipeline System Operability Review', that can evaluate and rate the effectiveness of both the management systems and procedures and the technical condition of the hardware. The results of the review can be used to compare the performance of different pipelines within one operating company, as well as to benchmark against international best practices. (author)

  7. Leadership Pipeline

    DEFF Research Database (Denmark)

    Elmholdt, Claus Westergård

    2012-01-01

    The article analyses the foundations of the Leadership Pipeline model with a view to assessing the substance behind the model and the prospects for generalizing the model to a Danish organizational context.

  8. Electrochemical Investigation of the Corrosion of Different Microstructural Phases of X65 Pipeline Steel under Saturated Carbon Dioxide Conditions

    Directory of Open Access Journals (Sweden)

    Yuanfeng Yang

    2015-05-01

    The aim of this research was to investigate the influence of metallurgy on the corrosion behaviour of separate weld zone (WZ) and parent plate (PP) regions of X65 pipeline steel in a solution of deionised water saturated with CO2, at two different temperatures (55 °C and 80 °C) and at an initial pH of ~4.0. In addition, a non-electrochemical immersion experiment was performed at 80 °C in CO2 on a sample portion of X65 pipeline containing part of a weld section together with the adjacent heat-affected zones (HAZ) and parent material. Electrochemical impedance spectroscopy (EIS) was used to evaluate the corrosion behaviour of the separate weld and parent plate samples. This study seeks to understand how the different microstructures within the different zones of the welded X65 pipe affect corrosion performance in CO2 environments, with particular attention given to the formation of surface scales and their composition/significance. The results obtained from grazing incidence X-ray diffraction (GIXRD) measurements suggest that, post immersion, the parent plate substrate is scale free, with only features arising from ferrite (α-Fe) and cementite (Fe3C) apparent. In contrast, at 80 °C, GIXRD from the weld zone substrate, and from the weld zone/heat-affected zone of the non-electrochemical sample, indicates the presence of siderite (FeCO3) and chukanovite (Fe2(CO3)(OH)2) phases. Scanning electron microscopy (SEM) on this surface confirmed the presence of characteristic discrete cube-shaped crystallites of siderite together with plate-like clusters of chukanovite.

  9. Microstructure and mechanical properties of hard zone in friction stir welded X80 pipeline steel relative to different heat input

    International Nuclear Information System (INIS)

    Aydin, Hakan; Nelson, Tracy W.

    2013-01-01

    This study was conducted to investigate the microstructure and mechanical properties of the hard zone in friction stir welded X80 pipeline steel at different heat inputs. Microstructural analysis of the welds was carried out using optical microscopy, transmission electron microscopy, and microhardness measurements. Heat input during the friction stir welding process had a significant influence on the microstructure and mechanical properties in the hard zone along the advancing side of the weld nugget. Based on the results, linear relationships between heat input and post-weld microstructures and mechanical properties in the hard zone of friction stir welded X80 steels were established. It can be concluded that as heat input decreases, the bainitic structure in the hard zone becomes finer and hard-zone strength therefore increases.

  10. Xylella fastidiosa comparative genomic database is an information resource to explore the annotation, genomic features, and biology of different strains

    Directory of Open Access Journals (Sweden)

    Alessandro M. Varani

    2012-01-01

    The Xylella fastidiosa comparative genomic database is a scientific resource that aims to provide a user-friendly interface for accessing high-quality manually curated genomic annotation and comparative sequence analysis, as well as for identifying and mapping prophage-like elements, a marked feature of Xylella genomes. Here we describe a database and tools for exploring the biology of this important plant pathogen. The hallmarks of this database are the high-quality genomic annotation, the functional and comparative genomic analysis, and the identification and mapping of prophage-like elements. It is available from the web site http://www.xylella.lncc.br.

  11. Using Nonexperts for Annotating Pharmacokinetic Drug-Drug Interaction Mentions in Product Labeling: A Feasibility Study.

    Science.gov (United States)

    Hochheiser, Harry; Ning, Yifan; Hernandez, Andres; Horn, John R; Jacobson, Rebecca; Boyce, Richard D

    2016-04-11

    Because vital details of potential pharmacokinetic drug-drug interactions are often described in free-text structured product labels, manual curation is a necessary but expensive step in the development of electronic drug-drug interaction information resources. Using nonexperts to annotate potential drug-drug interaction (PDDI) mentions in drug product labels may be a means of lessening the burden of manual curation. Our goal was to explore the practicality of using nonexpert participants to annotate drug-drug interaction descriptions from structured product labels. By presenting annotation tasks to both pharmacy experts and relatively naïve participants, we hoped to demonstrate the feasibility of using nonexpert annotators for drug-drug information annotation. We were also interested in exploring whether and to what extent natural language processing (NLP) preannotation helped improve task completion time, accuracy, and subjective satisfaction. Two experts and 4 nonexperts were asked to annotate 208 structured product label sections under 4 conditions completed sequentially: (1) no NLP assistance, (2) preannotation of drug mentions, (3) preannotation of drug mentions and PDDIs, and (4) a repeat of the no-annotation condition. Results were evaluated within the 2 groups and relative to an existing gold standard. Participants were asked to report the time required to complete tasks and their perceptions of task difficulty. One of the experts and 3 of the nonexperts completed all tasks. Annotation results from the nonexpert group were relatively strong in every scenario and better than the performance of the NLP pipeline. The expert and 2 of the nonexperts were able to complete most tasks in less than 3 hours. Usability perceptions were generally positive (3.67 for the expert, mean of 3.33 for nonexperts). The results suggest that nonexpert annotation might be a feasible option for comprehensive labeling of annotated PDDIs across a broader
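    Evaluating annotations "relative to an existing gold standard", as above, typically reduces to span-level precision, recall, and F1. A minimal sketch, assuming an illustrative (start, end, type) span format and exact-match scoring rather than the study's actual annotation schema:

    ```python
    # Score a set of predicted annotation spans against a gold standard using
    # exact-match precision/recall/F1. Spans and offsets below are invented.
    def prf1(predicted, gold):
        tp = len(set(predicted) & set(gold))
        precision = tp / len(predicted) if predicted else 0.0
        recall = tp / len(gold) if gold else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        return precision, recall, f1

    gold = {(0, 7, "drug"), (25, 34, "drug"), (0, 34, "PDDI")}
    nonexpert = {(0, 7, "drug"), (25, 34, "drug"), (10, 20, "PDDI")}
    p, r, f = prf1(nonexpert, gold)
    print(round(p, 2), round(r, 2), round(f, 2))  # → 0.67 0.67 0.67
    ```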

  12. Accurate annotation of protein-coding genes in mitochondrial genomes.

    Science.gov (United States)

    Al Arab, Marwa; Höner Zu Siederdissen, Christian; Tout, Kifah; Sahyoun, Abdullah H; Stadler, Peter F; Bernt, Matthias

    2017-01-01

    Mitochondrial genome sequences are available in large numbers, and new sequences are published at an increasing pace. Fast, automatic, consistent, and high-quality annotations are a prerequisite for downstream analyses. Therefore, we present an automated pipeline for fast de novo annotation of mitochondrial protein-coding genes. The annotation is based on enhanced phylogeny-aware hidden Markov models (HMMs). The pipeline builds taxon-specific enhanced multiple sequence alignments (MSAs) of already annotated sequences and corresponding HMMs using an approximation of the phylogeny. The MSAs are enhanced by fixing unannotated frameshifts, purging wrong sequences, and removing non-conserved columns from both ends. A comparison with reference annotations highlights the high quality of the results. The frameshift correction method predicts a large number of frameshifts, many of which are unknown. A detailed analysis of the frameshifts in nad3 of the Archosauria-Testudines group has been conducted. Copyright © 2016 Elsevier Inc. All rights reserved.
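    The idea of labelling a sequence with an HMM can be illustrated with Viterbi decoding over a toy two-state model. This is far simpler than the taxon-specific, phylogeny-aware profile HMMs the pipeline actually builds; the states and all probabilities below are invented for the sketch.

    ```python
    import math

    # Toy two-state HMM ("coding"/"noncoding") decoded with the Viterbi
    # algorithm in log space. All parameters are illustrative assumptions.
    states = ("coding", "noncoding")
    trans = {"coding": {"coding": 0.9, "noncoding": 0.1},
             "noncoding": {"coding": 0.1, "noncoding": 0.9}}
    emit = {"coding": {"G": 0.3, "C": 0.3, "A": 0.2, "T": 0.2},
            "noncoding": {"G": 0.2, "C": 0.2, "A": 0.3, "T": 0.3}}

    def viterbi(seq):
        v = [{s: math.log(0.5) + math.log(emit[s][seq[0]]) for s in states}]
        back = []
        for ch in seq[1:]:
            col, ptr = {}, {}
            for s in states:
                best = max(states, key=lambda p: v[-1][p] + math.log(trans[p][s]))
                col[s] = v[-1][best] + math.log(trans[best][s]) + math.log(emit[s][ch])
                ptr[s] = best
            v.append(col)
            back.append(ptr)
        state = max(states, key=lambda s: v[-1][s])
        path = [state]
        for ptr in reversed(back):
            state = ptr[state]
            path.append(state)
        return path[::-1]

    # GC-rich prefix decodes as "coding", AT-rich suffix as "noncoding".
    print(viterbi("GCGCGCAATATT"))
    ```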

  13. Annotated bibliography

    International Nuclear Information System (INIS)

    1997-08-01

    Under a cooperative agreement with the U.S. Department of Energy's Office of Science and Technology, the Waste Policy Institute (WPI) is conducting a five-year research project to develop a research-based approach for integrating communication products in stakeholder involvement related to innovative technology. As part of the research, WPI developed this annotated bibliography, which contains almost 100 citations of articles, books, and resources on topics related to the communication and public involvement aspects of deploying innovative cleanup technology. To compile the bibliography, WPI performed on-line literature searches (e.g., Dialog, International Association of Business Communicators, Public Relations Society of America, Chemical Manufacturers Association, etc.), consulted proceedings from past years of major environmental waste cleanup conferences (e.g., Waste Management), networked with professional colleagues and DOE sites to gather reports or case studies, and received input during the August 1996 Research Design Team meeting held to discuss the project's research methodology. Articles were selected for annotation based upon their perceived usefulness to the broad range of public involvement and communication practitioners.

  14. Leadership Pipeline

    DEFF Research Database (Denmark)

    Elmholdt, Claus Westergård

    2013-01-01

    The article examines the empirical basis of the Leadership Pipeline. First, the Leadership Pipeline model of leadership passages and crossroads in upward transitions between organizational levels of management is described (Freedman, 1998; Charan, Drotter and Noel, 2001). Next, the focus turns to the ... the relationship between continuity and discontinuity in leadership competences across organizational levels is presented and discussed. Finally, the limitations of a competence-based approach to the Leadership Pipeline are discussed, and it is suggested that successful leadership depends just as much on ...

  15. MetaStorm: A Public Resource for Customizable Metagenomics Annotation

    Science.gov (United States)

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S.; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution. PMID:27632579

  16. Building Simple Annotation Tools

    OpenAIRE

    Lin, Gordon

    2016-01-01

    The right annotation tool does not always exist for processing a particular natural language task. In these scenarios, researchers are required to build new annotation tools to fit the tasks at hand. However, developing new annotation tools is difficult and inefficient. There has not been careful consideration of software complexity in current annotation tools. Due to the problems of complexity, new annotation tools must reimplement common annotation features despite the availability of imple...

  17. Evaluating Hierarchical Structure in Music Annotations.

    Science.gov (United States)

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.
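    The "flat" comparison that the paper generalizes can be illustrated at a single time scale: sample each annotation onto a common frame grid and measure the fraction of frame pairs the two annotators group identically. The segment format, hop size, and labels below are invented; the paper's contribution is extending this kind of pairwise comparison to hierarchies.

    ```python
    # Pairwise frame-clustering agreement between two flat structural
    # annotations, each a list of (start, end, label) segments. Toy data only.
    from itertools import combinations

    def frames(segments, hop=1.0):
        """Sample segment labels onto a regular frame grid."""
        end = max(e for _, e, _ in segments)
        labels, t = [], 0.0
        while t < end:
            labels.append(next(lab for s, e, lab in segments if s <= t < e))
            t += hop
        return labels

    def pairwise_agreement(a, b):
        fa, fb = frames(a), frames(b)
        pairs = list(combinations(range(min(len(fa), len(fb))), 2))
        same = sum((fa[i] == fa[j]) == (fb[i] == fb[j]) for i, j in pairs)
        return same / len(pairs)

    # Same grouping under different label vocabularies → perfect agreement.
    ann1 = [(0, 4, "A"), (4, 8, "B")]
    ann2 = [(0, 4, "verse"), (4, 8, "chorus")]
    print(pairwise_agreement(ann1, ann2))  # → 1.0
    ```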

  18. A computational genomics pipeline for prokaryotic sequencing projects.

    Science.gov (United States)

    Kislyuk, Andrey O; Katz, Lee S; Agrawal, Sonia; Hagen, Matthew S; Conley, Andrew B; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C; Sammons, Scott A; Govil, Dhwani; Mair, Raydel D; Tatti, Kathleen M; Tondella, Maria L; Harcourt, Brian H; Mayer, Leonard W; Jordan, I King

    2010-08-01

    New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems.
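    The "gene predictor combining" step mentioned above can be illustrated with a simple vote over predicted gene coordinates. The predictor names, coordinates, and exact-match voting rule are illustrative assumptions of this sketch; real combiners also reconcile partially overlapping calls.

    ```python
    # Keep a candidate gene call if at least min_votes predictors report the
    # same (start, end, strand) coordinates. All example calls are invented.
    from collections import Counter

    def combine_calls(predictions, min_votes=2):
        """predictions: list (one per predictor) of sets of (start, end, strand)."""
        votes = Counter(call for calls in predictions for call in calls)
        return sorted(call for call, n in votes.items() if n >= min_votes)

    # Hypothetical output from three predictors on the same contig.
    predictor_a = {(100, 400, "+"), (600, 900, "-")}
    predictor_b = {(100, 400, "+"), (1200, 1500, "+")}
    predictor_c = {(100, 400, "+"), (600, 900, "-")}
    print(combine_calls([predictor_a, predictor_b, predictor_c]))
    # → [(100, 400, '+'), (600, 900, '-')]
    ```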

  19. GENE-counter: a computational pipeline for the analysis of RNA-Seq data for gene expression differences.

    Science.gov (United States)

    Cumbie, Jason S; Kimbrel, Jeffrey A; Di, Yanming; Schafer, Daniel W; Wilhelm, Larry J; Fox, Samuel E; Sullivan, Christopher M; Curzon, Aron D; Carrington, James C; Mockler, Todd C; Chang, Jeff H

    2011-01-01

    GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.
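    The flavor of a negative-binomial differential expression test can be sketched as a likelihood-ratio comparison of "one shared mean" versus "separate means per condition" for a single gene's counts. This is not GENE-counter's actual statistic: the dispersion is fixed here for simplicity, whereas the real packages estimate it from the data, and all counts are invented.

    ```python
    import math

    # Likelihood-ratio test for one gene under a negative binomial model with
    # fixed dispersion phi (size r = 1/phi). Toy counts, illustrative only.
    def nb_loglik(counts, mu, phi):
        """Log-likelihood of counts under NB with mean mu and dispersion phi."""
        r = 1.0 / phi
        total = 0.0
        for k in counts:
            total += (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
                      + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))
        return total

    def lr_statistic(group1, group2, phi=0.1):
        mu0 = sum(group1 + group2) / len(group1 + group2)
        null = nb_loglik(group1 + group2, mu0, phi)
        alt = (nb_loglik(group1, sum(group1) / len(group1), phi)
               + nb_loglik(group2, sum(group2) / len(group2), phi))
        return 2 * (alt - null)  # ~ chi-squared with 1 df under the null

    # Counts for one gene in control vs. treated replicates (toy numbers);
    # a statistic above ~3.84 exceeds the 5% chi-squared cutoff.
    print(lr_statistic([20, 25, 22], [80, 95, 88]))
    ```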

  20. GENE-counter: a computational pipeline for the analysis of RNA-Seq data for gene expression differences.

    Directory of Open Access Journals (Sweden)

    Jason S Cumbie

    GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.

  1. Burst strength behaviour of an aging subsea gas pipeline elbow in different external and internal corrosion-damaged positions

    Directory of Open Access Journals (Sweden)

    Geon Ho Lee

    2015-05-01

    Evaluation of the performance of aging structures is essential in the oil and gas industry, where inaccurate prediction of structural performance can have significantly hazardous consequences. The risk of structural failure due to significant reduction in wall thickness, which determines the burst strength, makes it very complicated for pipeline operators to maintain pipeline serviceability. In other words, the serviceability of gas pipelines and elbows needs to be predicted and assessed to ensure that the maximum allowable operating pressure remains below the burst or collapse strength capacities of the structures. In this study, several positions of corrosion in a subsea elbow made of API X42 steel were evaluated using both design formulas and numerical analysis. The most hazardous corrosion position of the aging elbow was then determined to assess its serviceability. The results of this study are applicable to the operational and elbow serviceability needs of subsea pipelines and can help predict more accurate replacement or repair times.
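    As an illustration of the kind of "design formulas" mentioned above, a generic Modified B31G burst-pressure estimate for a corroded straight pipe section can be sketched as follows. This is a standard industry criterion applied here under invented geometry, not necessarily the formula or the numbers used in this study, and it does not capture elbow-specific effects.

    ```python
    import math

    # Modified B31G failure-pressure estimate for a corrosion defect.
    # API 5L X42: SMYS ≈ 290 MPa; flow stress = SMYS + 69 MPa per Modified B31G.
    # Geometry below is an invented example, not data from the study.
    def modified_b31g_burst(D, t, d, L, smys=290.0):
        """Failure pressure (MPa). D: outer diameter, t: wall thickness,
        d: defect depth, L: defect length (all mm). Valid for L^2/(D*t) <= 50."""
        z = L * L / (D * t)
        M = math.sqrt(1 + 0.6275 * z - 0.003375 * z * z)  # Folias bulging factor
        flow = smys + 69.0
        return (2 * flow * t / D) * (1 - 0.85 * d / t) / (1 - 0.85 * d / t / M)

    # 30% wall loss over a 100 mm long defect on a 324 mm OD, 12.7 mm wall pipe.
    print(round(modified_b31g_burst(D=324.0, t=12.7, d=0.3 * 12.7, L=100.0), 1))
    ```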

  2. Gene Ontology annotations and resources.

    Science.gov (United States)

    Blake, J A; Dolan, M; Drabkin, H; Hill, D P; Li, Ni; Sitnikov, D; Bridges, S; Burgess, S; Buza, T; McCarthy, F; Peddinti, D; Pillai, L; Carbon, S; Dietze, H; Ireland, A; Lewis, S E; Mungall, C J; Gaudet, P; Chrisholm, R L; Fey, P; Kibbe, W A; Basu, S; Siegele, D A; McIntosh, B K; Renfro, D P; Zweifel, A E; Hu, J C; Brown, N H; Tweedie, S; Alam-Faruque, Y; Apweiler, R; Auchinchloss, A; Axelsen, K; Bely, B; Blatter, M -C; Bonilla, C; Bouguerleret, L; Boutet, E; Breuza, L; Bridge, A; Chan, W M; Chavali, G; Coudert, E; Dimmer, E; Estreicher, A; Famiglietti, L; Feuermann, M; Gos, A; Gruaz-Gumowski, N; Hieta, R; Hinz, C; Hulo, C; Huntley, R; James, J; Jungo, F; Keller, G; Laiho, K; Legge, D; Lemercier, P; Lieberherr, D; Magrane, M; Martin, M J; Masson, P; Mutowo-Muellenet, P; O'Donovan, C; Pedruzzi, I; Pichler, K; Poggioli, D; Porras Millán, P; Poux, S; Rivoire, C; Roechert, B; Sawford, T; Schneider, M; Stutz, A; Sundaram, S; Tognolli, M; Xenarios, I; Foulgar, R; Lomax, J; Roncaglia, P; Khodiyar, V K; Lovering, R C; Talmud, P J; Chibucos, M; Giglio, M Gwinn; Chang, H -Y; Hunter, S; McAnulla, C; Mitchell, A; Sangrador, A; Stephan, R; Harris, M A; Oliver, S G; Rutherford, K; Wood, V; Bahler, J; Lock, A; Kersey, P J; McDowall, D M; Staines, D M; Dwinell, M; Shimoyama, M; Laulederkind, S; Hayman, T; Wang, S -J; Petri, V; Lowry, T; D'Eustachio, P; Matthews, L; Balakrishnan, R; Binkley, G; Cherry, J M; Costanzo, M C; Dwight, S S; Engel, S R; Fisk, D G; Hitz, B C; Hong, E L; Karra, K; Miyasato, S R; Nash, R S; Park, J; Skrzypek, M S; Weng, S; Wong, E D; Berardini, T Z; Huala, E; Mi, H; Thomas, P D; Chan, J; Kishore, R; Sternberg, P; Van Auken, K; Howe, D; Westerfield, M

    2013-01-01

    The Gene Ontology (GO) Consortium (GOC, http://www.geneontology.org) is a community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the GOC has implemented several processes to increase the quantity, quality and specificity of GO annotations. First, the number of manual, literature-based annotations has grown at an increasing rate. Second, as a result of a new 'phylogenetic annotation' process, manually reviewed, homology-based annotations are becoming available for a broad range of species. Third, the quality of GO annotations has been improved through a streamlined process for, and automated quality checks of, GO annotations deposited by different annotation groups. Fourth, the consistency and correctness of the ontology itself has increased by using automated reasoning tools. Finally, the GO has been expanded not only to cover new areas of biology through focused interaction with experts, but also to capture greater specificity in all areas of the ontology using tools for adding new combinatorial terms. The GOC works closely with other ontology developers to support integrated use of terminologies. The GOC supports its user community through the use of e-mail lists, social media and web-based resources.
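    A property that consumers of GO annotations rely on is that the ontology is a directed acyclic graph in which annotation to a term implies annotation to all of its ancestors (the "true path rule"). A minimal sketch of propagating annotations up a toy fragment of the graph; the term IDs and edges are invented stand-ins, not real GO identifiers.

    ```python
    # Propagate direct GO annotations to all ancestor terms in a toy DAG.
    def ancestors(term, parents):
        seen, stack = set(), [term]
        while stack:
            t = stack.pop()
            for p in parents.get(t, ()):
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return seen

    # Invented is-a edges: child -> list of parents.
    parents = {"GO:toy_kinase": ["GO:toy_transferase"],
               "GO:toy_transferase": ["GO:toy_catalytic"],
               "GO:toy_catalytic": ["GO:toy_molecular_function"]}
    direct = {"geneX": {"GO:toy_kinase"}}
    implied = {g: terms | set().union(*(ancestors(t, parents) for t in terms))
               for g, terms in direct.items()}
    print(sorted(implied["geneX"]))
    # → ['GO:toy_catalytic', 'GO:toy_kinase', 'GO:toy_molecular_function', 'GO:toy_transferase']
    ```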

  3. Plann: A command-line application for annotating plastome sequences.

    Science.gov (United States)

    Huang, Daisie I; Cronk, Quentin C B

    2015-08-01

    Plann automates the process of annotating a plastome sequence in GenBank format, either for downstream processing or for GenBank submission, by annotating a new plastome based on a similar, well-annotated reference plastome. Plann is a Perl script to be executed on the command line. Plann compares a new plastome sequence to the features annotated in a reference plastome and then shifts the intervals of any matching features to their locations in the new plastome. Plann's output can be used in the National Center for Biotechnology Information's tbl2asn to create a Sequin file for GenBank submission. Unlike Web-based annotation packages, Plann is a locally executable script that will accurately annotate a plastome sequence against a locally specified reference plastome. Because it executes from the command line, it is ready to use in other software pipelines and can be easily rerun as a draft plastome is improved.
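    The interval-shifting idea behind Plann can be sketched as: locate each reference feature's sequence in the new plastome and transfer the annotation to the matched coordinates. Plann itself does considerably more (GenBank parsing, fuzzy matching, tbl2asn-ready output); the exact-substring matching and toy sequences below are illustrative simplifications.

    ```python
    # Transfer (name, start, end) annotations from a reference sequence to a
    # new sequence by exact substring matching. Toy data; 0-based half-open
    # intervals are an assumption of the sketch.
    def transfer_annotations(ref_seq, ref_features, new_seq):
        transferred = []
        for name, start, end in ref_features:
            pos = new_seq.find(ref_seq[start:end])
            if pos != -1:
                transferred.append((name, pos, pos + (end - start)))
        return transferred

    ref = "AAAATGCATGGGTTTCCC"
    features = [("rbcL", 3, 9)]              # "ATGCAT" in the reference
    new = "GGGGGATGCATGGGTT"                 # same feature, shifted downstream
    print(transfer_annotations(ref, features, new))  # → [('rbcL', 5, 11)]
    ```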

  4. On the relevance of sophisticated structural annotations for disulfide connectivity pattern prediction.

    Directory of Open Access Journals (Sweden)

    Julien Becker

    Disulfide bridges strongly constrain the native structure of many proteins, and predicting their formation is therefore a key sub-problem of protein structure and function inference. Most recently proposed approaches for this prediction problem adopt the following pipeline: first, they enrich the primary sequence with structural annotations; second, they apply a binary classifier to each candidate pair of cysteines to predict disulfide bonding probabilities; and finally, they use a maximum weight graph matching algorithm to derive the predicted disulfide connectivity pattern of a protein. In this paper, we adopt this three-step pipeline and propose an extensive study of the relevance of various structural annotations and feature encodings. In particular, we consider five kinds of structural annotations, among which three are novel in the context of disulfide bridge prediction. So as to be usable by machine learning algorithms, these annotations must be encoded into features. For this purpose, we propose four different feature encodings based on local windows and on different kinds of histograms. The combination of structural annotations with these possible encodings leads to a large number of possible feature functions. In order to identify a minimal subset of relevant feature functions among those, we propose an efficient and interpretable feature function selection scheme, designed so as to avoid any form of overfitting. We apply this scheme on top of three supervised learning algorithms: k-nearest neighbors, support vector machines, and extremely randomized trees. Our results indicate that the use of only the PSSM (position-specific scoring matrix) together with the CSP (cysteine separation profile) is sufficient to construct a high-performance disulfide pattern predictor, and that extremely randomized trees reach a disulfide pattern prediction accuracy of [Formula: see text] on the benchmark dataset SPX[Formula: see text], which corresponds to
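    The final step of the pipeline described above, deriving a connectivity pattern from pairwise bonding probabilities via maximum weight matching, can be sketched with brute force, which is adequate for the handful of cysteines typical of one protein. The cysteine positions and probabilities below are invented.

    ```python
    from itertools import permutations

    # Pick the maximum-weight perfect matching of cysteines as the predicted
    # disulfide connectivity pattern. Brute force over permutations; toy data.
    def best_pattern(cysteines, prob):
        """prob: dict mapping frozenset({i, j}) -> predicted bonding probability."""
        half = len(cysteines) // 2
        best, best_score = None, float("-inf")
        for perm in permutations(cysteines):
            pairs = [frozenset(perm[2 * k: 2 * k + 2]) for k in range(half)]
            score = sum(prob[p] for p in pairs)
            if score > best_score:
                best, best_score = pairs, score
        return sorted(tuple(sorted(p)) for p in best)

    cys = [5, 14, 30, 47]   # invented residue positions
    prob = {frozenset({5, 14}): 0.2, frozenset({5, 30}): 0.9,
            frozenset({5, 47}): 0.1, frozenset({14, 30}): 0.3,
            frozenset({14, 47}): 0.8, frozenset({30, 47}): 0.4}
    print(best_pattern(cys, prob))  # → [(5, 30), (14, 47)]
    ```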

  5. Northern pipelines : backgrounder

    International Nuclear Information System (INIS)

    2002-04-01

    Most analysts agree that demand for natural gas in North America will continue to grow. Favourable market conditions created by rising demand and declining production have sparked renewed interest in northern natural gas development. The 2002 Annual Energy Outlook forecasted U.S. consumption to increase at an annual average rate of 2 per cent from 22.8 trillion cubic feet to 33.8 TCF by 2020, mostly due to rapid growth in demand for electric power generation. Natural gas prices are also expected to increase at an annual average rate of 1.6 per cent, reaching $3.26 per thousand cubic feet in 2020. There are currently 3 proposals for pipelines to move northern gas to US markets. They include a stand-alone Mackenzie Delta Project, the Alaska Highway Pipeline Project, and an offshore route that would combine Alaskan and Canadian gas in a pipeline across the floor of the Beaufort Sea. Current market conditions and demand suggest that the projects are not mutually exclusive, but complementary. The factors that differentiate northern pipeline proposals are reserves, preparedness for market, costs, engineering, and environmental differences. Canada has affirmed its role to provide the regulatory and fiscal certainty needed by industry to make investment decisions. The Government of the Yukon does not believe that the Alaska Highway Project will shut in Mackenzie Delta gas, but will instead pave the way for development of a new northern natural gas industry. The Alaska Highway Pipeline Project will bring significant benefits for the Yukon, the Northwest Territories and the rest of Canada. Unresolved land claims are one of the challenges that have to be addressed for both Yukon and the Northwest Territories, as the proposed Alaska Highway Pipeline will travel through traditional territories of several Yukon First Nations. 1 tab., 4 figs

  6. SmashCommunity: A metagenomic annotation and analysis tool

    DEFF Research Database (Denmark)

    Arumugam, Manimozhiyan; Harrington, Eoghan D; Foerstner, Konrad U

    2010-01-01

    SUMMARY: SmashCommunity is a stand-alone metagenomic annotation and analysis pipeline suitable for data from Sanger and 454 sequencing technologies. It supports state-of-the-art software for essential metagenomic tasks such as assembly and gene prediction. It provides tools to estimate the quanti......

  7. Comparison of a semi-automatic annotation tool and a natural language processing application for the generation of clinical statement entries.

    Science.gov (United States)

    Lin, Ching-Heng; Wu, Nai-Yuan; Lai, Wei-Shao; Liou, Der-Ming

    2015-01-01

    Electronic medical records with encoded entries should enhance the semantic interoperability of document exchange. However, it remains a challenge to encode the narrative concept and to transform the coded concepts into a standard entry-level document. This study aimed to use a novel approach for the generation of entry-level interoperable clinical documents. Using HL7 clinical document architecture (CDA) as the example, we developed three pipelines to generate entry-level CDA documents. The first approach was a semi-automatic annotation pipeline (SAAP), the second was a natural language processing (NLP) pipeline, and the third merged the above two pipelines. We randomly selected 50 test documents from the i2b2 corpora to evaluate the performance of the three pipelines. The 50 randomly selected test documents contained 9365 words, including 588 Observation terms and 123 Procedure terms. For the Observation terms, the merged pipeline had a significantly higher F-measure than the NLP pipeline (0.89 vs 0.80, p<0.0001), but a similar F-measure to that of the SAAP (0.89 vs 0.87). For the Procedure terms, the F-measure was not significantly different among the three pipelines. The combination of a semi-automatic annotation approach and the NLP application seems to be a solution for generating entry-level interoperable clinical documents. © The Author 2014. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
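
    The F-measures compared above are the harmonic mean of precision and recall. A minimal sketch of the computation, using invented counts rather than the study's actual confusion matrix:

    ```python
    def f_measure(tp, fp, fn):
        """F1 score: harmonic mean of precision and recall from raw counts."""
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        return 2 * precision * recall / (precision + recall)

    # Illustrative counts only: suppose a pipeline recovers 530 of 588
    # Observation terms and additionally reports 60 spurious ones.
    score = f_measure(tp=530, fp=60, fn=58)  # roughly 0.90
    ```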

  8. Annotating Logical Forms for EHR Questions.

    Science.gov (United States)

    Roberts, Kirk; Demner-Fushman, Dina

    2016-05-01

    This paper discusses the creation of a semantically annotated corpus of questions about patient data in electronic health records (EHRs). The goal is to provide the training data necessary for semantic parsers to automatically convert EHR questions into a structured query. A layered annotation strategy is used which mirrors a typical natural language processing (NLP) pipeline. First, questions are syntactically analyzed to identify multi-part questions. Second, medical concepts are recognized and normalized to a clinical ontology. Finally, logical forms are created using a lambda calculus representation. We use a corpus of 446 questions asking for patient-specific information. From these, 468 specific questions are found containing 259 unique medical concepts and requiring 53 unique predicates to represent the logical forms. We further present detailed characteristics of the corpus, including inter-annotator agreement results, and describe the challenges automatic NLP systems will face on this task.

  9. Diverse Image Annotation

    KAUST Repository

    Wu, Baoyuan

    2017-11-09

    In this work we study the task of image annotation, whose goal is to describe an image using a few tags. Instead of predicting the full list of tags, here we aim to provide a short list of tags under a limited number (e.g., 3), to cover as much information as possible of the image. The tags in such a short list should be representative and diverse. That is, they must not only correspond to the contents of the image, but also differ from each other. To this end, we treat the image annotation as a subset selection problem based on the conditional determinantal point process (DPP) model, which formulates the representation and diversity jointly. We further explore the semantic hierarchy and synonyms among the candidate tags, and require that two tags in a semantic hierarchy or in a pair of synonyms should not be selected simultaneously. This requirement is then embedded into the sampling algorithm according to the learned conditional DPP model. Besides, we find that traditional metrics for image annotation (e.g., precision, recall and F1 score) only consider the representation, but ignore the diversity. Thus we propose new metrics to evaluate the quality of the selected subset (i.e., the tag list), based on the semantic hierarchy and synonyms. A human study conducted through Amazon Mechanical Turk verifies that the proposed metrics are closer to human judgment than traditional metrics. Experiments on two benchmark datasets show that the proposed method can produce more representative and diverse tags, compared with existing image annotation methods.
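
    A common way to approximate DPP-based subset selection of the kind described above is greedy MAP inference: repeatedly add the tag that most increases the determinant of the selected kernel submatrix, so that redundant (similar) tags are penalized. The sketch below is a generic illustration with a hand-made three-tag kernel (diagonal entries encode relevance, off-diagonal entries similarity), not the authors' learned conditional model.

    ```python
    def det(m):
        """Determinant by Laplace expansion along the first row (tiny matrices only)."""
        if len(m) == 1:
            return m[0][0]
        return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
                   for j in range(len(m)))

    def greedy_dpp(L, k):
        """Greedily pick k item indices maximizing det of the kernel submatrix."""
        chosen = []
        for _ in range(k):
            best = max((i for i in range(len(L)) if i not in chosen),
                       key=lambda i: det([[L[a][b] for b in chosen + [i]]
                                          for a in chosen + [i]]))
            chosen.append(best)
        return chosen

    # Hypothetical kernel: "dog" and "puppy" are relevant but near-synonymous,
    # "grass" is less relevant but dissimilar to both.
    tags = ["dog", "puppy", "grass"]
    L = [[1.0, 0.9, 0.1],
         [0.9, 0.9, 0.1],
         [0.1, 0.1, 0.8]]
    selected = [tags[i] for i in greedy_dpp(L, 2)]  # diversity beats redundancy
    ```

    Picking the two largest diagonal entries would return "dog" and "puppy"; the determinant objective instead selects "dog" and "grass", which is the diversity effect the paper formalizes.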

  10. Mining GO annotations for improving annotation consistency.

    Directory of Open Access Journals (Sweden)

    Daniel Faria

    Full Text Available Despite the structure and objectivity provided by the Gene Ontology (GO), the annotation of proteins is a complex task that is subject to errors and inconsistencies. Electronically inferred annotations in particular are widely considered unreliable. However, given that manual curation of all GO annotations is unfeasible, it is imperative to improve the quality of electronically inferred annotations. In this work, we analyze the full GO molecular function annotation of UniProtKB proteins, and discuss some of the issues that affect their quality, focusing particularly on the lack of annotation consistency. Based on our analysis, we estimate that 64% of the UniProtKB proteins are incompletely annotated, and that inconsistent annotations affect 83% of the protein functions and at least 23% of the proteins. Additionally, we present and evaluate a data mining algorithm, based on the association rule learning methodology, for identifying implicit relationships between molecular function terms. The goal of this algorithm is to assist GO curators in updating GO and correcting and preventing inconsistent annotations. Our algorithm predicted 501 relationships with an estimated precision of 94%, whereas the basic association rule learning methodology predicted 12,352 relationships with a precision below 9%.
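
    Association rule learning over annotations, as described above, can be illustrated with plain support and confidence counts over per-protein term sets: a rule "term A implies term B" is kept when the pair occurs often enough and A rarely appears without B. The thresholds, terms, and data below are invented for the example; the paper's actual methodology is considerably more elaborate.

    ```python
    from collections import Counter
    from itertools import combinations

    def mine_rules(annotations, min_support=2, min_confidence=0.9):
        """Find rules 'term A implies term B' from per-protein GO term sets."""
        term_counts = Counter(t for terms in annotations for t in terms)
        pair_counts = Counter()
        for terms in annotations:
            pair_counts.update(combinations(sorted(terms), 2))
        rules = []
        for (a, b), n in pair_counts.items():
            if n < min_support:
                continue
            if n / term_counts[a] >= min_confidence:  # a rarely occurs without b
                rules.append((a, b, n / term_counts[a]))
            if n / term_counts[b] >= min_confidence:  # b rarely occurs without a
                rules.append((b, a, n / term_counts[b]))
        return rules

    # Toy annotation sets with hypothetical terms: every protein annotated
    # with 'kinase' is also annotated with 'ATP binding', but not vice versa.
    proteins = [{"kinase", "ATP binding"},
                {"kinase", "ATP binding", "membrane"},
                {"ATP binding"},
                {"membrane"}]
    rules = mine_rules(proteins)  # one rule: kinase -> ATP binding
    ```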

  11. Genomic variant annotation workflow for clinical applications [version 2; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    Thomas Thurnherr

    2016-10-01

    Full Text Available Annotation and interpretation of DNA aberrations identified through next-generation sequencing is becoming an increasingly important task. Even more so in the context of data analysis pipelines for medical applications, where genomic aberrations are associated with phenotypic and clinical features. Here we describe a workflow to identify potential gene targets in aberrated genes or pathways and their corresponding drugs. To this end, we provide the R/Bioconductor package rDGIdb, an R wrapper to query the drug-gene interaction database (DGIdb). DGIdb accumulates drug-gene interaction data from 15 different resources and allows filtering on different levels. The rDGIdb package makes these resources and tools available to R users. Moreover, rDGIdb queries can be automated through incorporation of the rDGIdb package into NGS sequencing pipelines.

  12. Annotation Of Novel And Conserved MicroRNA Genes In The Build 10 Sus scrofa Reference Genome And Determination Of Their Expression Levels In Ten Different Tissues

    DEFF Research Database (Denmark)

    Thomsen, Bo; Nielsen, Mathilde; Hedegaard, Jakob

    The DNA template used in the pig genome sequencing project was provided by a Duroc pig named TJ Tabasco. In an effort to annotate microRNA (miRNA) genes in the reference genome we have conducted deep sequencing to determine the miRNA transcriptomes in ten different tissues isolated from Pinky...... against miRBase, we identified more than 600 conserved known miRNA/miRNA*, which is a significant increase relative to the 211 porcine miRNA/miRNA* deposited in the current version of miRBase. Furthermore, the genome-wide transcript profiles provided important information on the relative abundance...... and tissue-specificity of miRNA expression. In addition, we are currently analyzing our data using miRDeep for de novo discovery and annotation of the pig genome with both conserved and novel miRNAs. So far this analysis revealed the identity and genomic position of 535 miRNA genes of which 97 were novel...

  13. Sequence-based feature prediction and annotation of proteins

    DEFF Research Database (Denmark)

    Juncker, Agnieszka; Jensen, Lars J.; Pierleoni, Andrea

    2009-01-01

    A recent trend in computational methods for annotation of protein function is that many prediction tools are combined in complex workflows and pipelines to facilitate the analysis of feature combinations, for example, the entire repertoire of kinase-binding motifs in the human proteome....

  14. Removable pipeline plug

    International Nuclear Information System (INIS)

    Vassalotti, M.; Anastasi, F.

    1984-01-01

    A removable plugging device for a pipeline, and particularly for pressure testing a steam pipeline in a boiling water reactor, wherein an inflatable annular sealing member seals off the pipeline and characterized by radially movable shoes for holding the plug in place, each shoe being pivotally mounted for self-adjusting engagement with even an out-of-round pipeline interior

  15. Annotating Coloured Petri Nets

    DEFF Research Database (Denmark)

    Lindstrøm, Bo; Wells, Lisa Marie

    2002-01-01

    a method which makes it possible to associate auxiliary information, called annotations, with tokens without modifying the colour sets of the CP-net. Annotations are pieces of information that are not essential for determining the behaviour of the system being modelled, but are rather added to support...... a certain use of the CP-net. We define the semantics of annotations by describing a translation from a CP-net and the corresponding annotation layers to another CP-net where the annotations are an integrated part of the CP-net....

  16. Underground pipeline corrosion

    CERN Document Server

    Orazem, Mark

    2014-01-01

    Underground pipelines transporting liquid petroleum products and natural gas are critical components of civil infrastructure, making corrosion prevention an essential part of asset-protection strategy. Underground Pipeline Corrosion provides a basic understanding of the problems associated with corrosion detection and mitigation, and of the state of the art in corrosion prevention. The topics covered in part one include: basic principles for corrosion in underground pipelines, AC-induced corrosion of underground pipelines, significance of corrosion in onshore oil and gas pipelines, n

  17. Dynamically constrained pipeline for tracking neural progenitor cells

    DEFF Research Database (Denmark)

    Vestergaard, Jacob Schack; Dahl, Anders; Holm, Peter

    2013-01-01

    tracking methods are fundamental building blocks of setting up multi purpose pipelines. Segmentation by discriminative dictionary learning and a graph formulated tracking method constraining the allowed topology changes are combined here to accommodate for highly irregular cell shapes and movement patterns...... pipeline by tracking pig neural progenitor cells through a time lapse experiment consisting of 825 images collected over 69 hours. Each step of the tracking pipeline is validated separately by comparison with manual annotations. The number of tracked cells increase from approximately 350 to 650 during...

  18. GenePRIMP: A GENE PRediction IMprovement Pipeline for Prokaryotic genomes

    Energy Technology Data Exchange (ETDEWEB)

    Pati, Amrita; Ivanova, Natalia N.; Mikhailova, Natalia; Ovchinnikova, Galina; Hooper, Sean D.; Lykidis, Athanasios; Kyrpides, Nikos C.

    2010-04-01

    We present 'gene prediction improvement pipeline' (GenePRIMP; http://geneprimp.jgi-psf.org/), a computational process that performs evidence-based evaluation of gene models in prokaryotic genomes and reports anomalies including inconsistent start sites, missed genes and split genes. We found that manual curation of gene models using the anomaly reports generated by GenePRIMP improved their quality, and demonstrate the applicability of GenePRIMP in improving finishing quality and comparing different genome-sequencing and annotation technologies.

  19. GRaSP: A Multilayered Annotation Scheme for Perspectives

    NARCIS (Netherlands)

    van Son, C.M.; Caselli, T.; Fokkens, A.S.; Maks, E.; Morante Vallejo, R.; Aroyo, L.M.; Vossen, P.T.J.M.

    2016-01-01

    This paper presents a framework and methodology for the annotation of perspectives in text. In the last decade, different aspects of linguistic encoding of perspectives have been targeted as separated phenomena through different annotation initiatives. We propose an annotation scheme that integrates

  20. Phylogenetic molecular function annotation

    International Nuclear Information System (INIS)

    Engelhardt, Barbara E; Jordan, Michael I; Repo, Susanna T; Brenner, Steven E

    2009-01-01

    It is now easier to discover thousands of protein sequences in a new microbial genome than it is to biochemically characterize the specific activity of a single protein of unknown function. The molecular functions of protein sequences have typically been predicted using homology-based computational methods, which rely on the principle that homologous proteins share a similar function. However, some protein families include groups of proteins with different molecular functions. A phylogenetic approach for predicting molecular function (sometimes called 'phylogenomics') is an effective means to predict protein molecular function. These methods incorporate functional evidence from all members of a family that have functional characterizations using the evolutionary history of the protein family to make robust predictions for the uncharacterized proteins. However, they are often difficult to apply on a genome-wide scale because of the time-consuming step of reconstructing the phylogenies of each protein to be annotated. Our automated approach for function annotation using phylogeny, the SIFTER (Statistical Inference of Function Through Evolutionary Relationships) methodology, uses a statistical graphical model to compute the probabilities of molecular functions for unannotated proteins. Our benchmark tests showed that SIFTER provides accurate functional predictions on various protein families, outperforming other available methods.

  1. Stress corrosion cracking of X80 pipeline steel exposed to high pH solutions with different concentrations of bicarbonate

    Science.gov (United States)

    Fan, Lin; Du, Cui-wei; Liu, Zhi-yong; Li, Xiao-gang

    2013-07-01

    Susceptibilities to stress corrosion cracking (SCC) of X80 pipeline steel in high pH solutions with various concentrations of HCO₃⁻ at a passive potential of -0.2 V vs. SCE were investigated by slow strain rate tensile (SSRT) test. The SCC mechanism and the effect of HCO₃⁻ were discussed with the aid of electrochemical techniques. It is indicated that X80 steel shows enhanced susceptibility to SCC with the concentration of HCO₃⁻ increasing from 0.15 to 1.00 mol/L, and the susceptibility can be evaluated in terms of current density at -0.2 V vs. SCE. The SCC behavior is controlled by the dissolution-based mechanism in these circumstances. Increasing the concentration of HCO₃⁻ not only increases the risk of rupture of passive films but also promotes the anodic dissolution of crack tips. Besides, little susceptibility to SCC is found in dilute solution containing 0.05 mol/L HCO₃⁻ for X80 steel. This can be attributed to the inhibited repassivation of passive films, manifesting as a more intensive dissolution in the non-crack tip areas than at the crack tips.

  2. The GATO gene annotation tool for research laboratories

    Directory of Open Access Journals (Sweden)

    A. Fujita

    2005-11-01

    Full Text Available Large-scale genome projects have generated a rapidly increasing number of DNA sequences. Therefore, development of computational methods to rapidly analyze these sequences is essential for progress in genomic research. Here we present an automatic annotation system for preliminary analysis of DNA sequences. The gene annotation tool (GATO) is a Bioinformatics pipeline designed to facilitate routine functional annotation and easy access to annotated genes. It was designed in view of the frequent need of genomic researchers to access data pertaining to a common set of genes. In the GATO system, annotation is generated by querying some of the Web-accessible resources and the information is stored in a local database, which keeps a record of all previous annotation results. GATO may be accessed from everywhere through the internet or may be run locally if a large number of sequences are going to be annotated. It is implemented in PHP and Perl and may be run on any suitable Web server. Usually, installation and application of annotation systems require experience and are time consuming, but GATO is simple and practical, allowing anyone with basic skills in informatics to access it without any special training. GATO can be downloaded at [http://mariwork.iq.usp.br/gato/]. Minimum computer free space required is 2 MB.

  3. The GATO gene annotation tool for research laboratories.

    Science.gov (United States)

    Fujita, A; Massirer, K B; Durham, A M; Ferreira, C E; Sogayar, M C

    2005-11-01

    Large-scale genome projects have generated a rapidly increasing number of DNA sequences. Therefore, development of computational methods to rapidly analyze these sequences is essential for progress in genomic research. Here we present an automatic annotation system for preliminary analysis of DNA sequences. The gene annotation tool (GATO) is a Bioinformatics pipeline designed to facilitate routine functional annotation and easy access to annotated genes. It was designed in view of the frequent need of genomic researchers to access data pertaining to a common set of genes. In the GATO system, annotation is generated by querying some of the Web-accessible resources and the information is stored in a local database, which keeps a record of all previous annotation results. GATO may be accessed from everywhere through the internet or may be run locally if a large number of sequences are going to be annotated. It is implemented in PHP and Perl and may be run on any suitable Web server. Usually, installation and application of annotation systems require experience and are time consuming, but GATO is simple and practical, allowing anyone with basic skills in informatics to access it without any special training. GATO can be downloaded at [http://mariwork.iq.usp.br/gato/]. Minimum computer free space required is 2 MB.

  4. High-throughput proteogenomics of Ruegeria pomeroyi: seeding a better genomic annotation for the whole marine Roseobacter clade

    Directory of Open Access Journals (Sweden)

    Christie-Oleza Joseph A

    2012-02-01

    Full Text Available Abstract Background The structural and functional annotation of genomes is now heavily based on data obtained using automated pipeline systems. The key for an accurate structural annotation consists of blending similarities between closely related genomes with biochemical evidence of the genome interpretation. In this work we applied high-throughput proteogenomics to Ruegeria pomeroyi, a member of the Roseobacter clade, an abundant group of marine bacteria, as a seed for the annotation of the whole clade. Results A large dataset of peptides from R. pomeroyi was obtained after searching over 1.1 million MS/MS spectra against a six-frame translated genome database. We identified 2006 polypeptides, of which thirty-four were encoded by open reading frames (ORFs) that had not previously been annotated. From the pool of 'one-hit-wonders', i.e. those ORFs specified by only one peptide detected by tandem mass spectrometry, we could confirm the probable existence of five additional new genes after proving that the corresponding RNAs were transcribed. We also identified the most-N-terminal peptide of 486 polypeptides, of which sixty-four had originally been wrongly annotated. Conclusions By extending these re-annotations to the other thirty-six Roseobacter isolates sequenced to date (twenty different genera), we propose the correction of the assigned start codons of 1082 homologous genes in the clade. In addition, we also report the presence of novel genes within operons encoding determinants of the important tricarboxylic acid cycle, a feature that seems to be characteristic of some Roseobacter genomes. The detection of their corresponding products in large amounts raises the question of their function. Their discoveries point to a possible theory for protein evolution that will rely on high expression of orphans in bacteria: their putative poor efficiency could be counterbalanced by a higher level of expression. Our proteogenomic analysis will increase
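
    Searching spectra against a six-frame translated genome database, as above, presumes translating the genome in every reading frame on both strands. A minimal sketch using the standard genetic code (the example sequence is arbitrary):

    ```python
    # Standard genetic code, packed in TCAG order (stop codons as '*').
    BASES = "TCAG"
    AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
    CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
                   for i, a in enumerate(BASES)
                   for j, b in enumerate(BASES)
                   for k, c in enumerate(BASES)}

    def revcomp(dna):
        """Reverse complement of an uppercase DNA string."""
        return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

    def six_frame(dna):
        """Translate a DNA string in all six reading frames (3 forward, 3 reverse)."""
        frames = []
        for seq in (dna, revcomp(dna)):
            for offset in range(3):
                codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
                frames.append("".join(CODON_TABLE[c] for c in codons))
        return frames

    frames = six_frame("ATGGCC")  # frame 0 reads ATG GCC -> "MA"
    ```

    A real proteogenomics pipeline would additionally split the translations at stop codons into candidate ORFs before building the search database.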

  5. SIMPLEX: Cloud-Enabled Pipeline for the Comprehensive Analysis of Exome Sequencing Data

    Science.gov (United States)

    Fischer, Maria; Snajder, Rene; Pabinger, Stephan; Dander, Andreas; Schossig, Anna; Zschocke, Johannes; Trajanoski, Zlatko; Stocker, Gernot

    2012-01-01

    In recent studies, exome sequencing has proven to be a successful screening tool for the identification of candidate genes causing rare genetic diseases. Although underlying targeted sequencing methods are well established, necessary data handling and focused, structured analysis still remain demanding tasks. Here, we present a cloud-enabled autonomous analysis pipeline, which comprises the complete exome analysis workflow. The pipeline combines several in-house developed and published applications to perform the following steps: (a) initial quality control, (b) intelligent data filtering and pre-processing, (c) sequence alignment to a reference genome, (d) SNP and DIP detection, (e) functional annotation of variants using different approaches, and (f) detailed report generation during various stages of the workflow. The pipeline connects the selected analysis steps, exposes all available parameters for customized usage, performs required data handling, and distributes computationally expensive tasks either on a dedicated high-performance computing infrastructure or on the Amazon cloud environment (EC2). The presented application has already been used in several research projects including studies to elucidate the role of rare genetic diseases. The pipeline is continuously tested and is publicly available under the GPL as a VirtualBox or Cloud image at http://simplex.i-med.ac.at; additional supplementary data is provided at http://www.icbi.at/exome. PMID:22870267
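
    Steps (a) through (f) above form a linear workflow in which each stage consumes the previous stage's output. A toy sketch of such step chaining (the step names and transformations here are invented stand-ins, not SIMPLEX's actual components):

    ```python
    def run_pipeline(data, steps):
        """Run named analysis steps in order, recording which stages executed."""
        log = []
        for name, step in steps:
            data = step(data)
            log.append(name)
        return data, log

    # Hypothetical stages: filter short reads, "align" them, then count variants.
    steps = [
        ("quality_control", lambda reads: [r for r in reads if len(r) >= 5]),
        ("alignment",       lambda reads: {"aligned": reads}),
        ("variant_calling", lambda aln: {"snps": len(aln["aligned"])}),
    ]
    result, log = run_pipeline(["ACGTACGT", "ACG", "TTTTTTTT"], steps)
    ```

    Exposing the step list as data, rather than hard-coding the sequence, is what lets a pipeline like the one described expose "all available parameters for customized usage".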

  6. SIMPLEX: cloud-enabled pipeline for the comprehensive analysis of exome sequencing data.

    Directory of Open Access Journals (Sweden)

    Maria Fischer

    Full Text Available In recent studies, exome sequencing has proven to be a successful screening tool for the identification of candidate genes causing rare genetic diseases. Although underlying targeted sequencing methods are well established, necessary data handling and focused, structured analysis still remain demanding tasks. Here, we present a cloud-enabled autonomous analysis pipeline, which comprises the complete exome analysis workflow. The pipeline combines several in-house developed and published applications to perform the following steps: (a) initial quality control, (b) intelligent data filtering and pre-processing, (c) sequence alignment to a reference genome, (d) SNP and DIP detection, (e) functional annotation of variants using different approaches, and (f) detailed report generation during various stages of the workflow. The pipeline connects the selected analysis steps, exposes all available parameters for customized usage, performs required data handling, and distributes computationally expensive tasks either on a dedicated high-performance computing infrastructure or on the Amazon cloud environment (EC2). The presented application has already been used in several research projects including studies to elucidate the role of rare genetic diseases. The pipeline is continuously tested and is publicly available under the GPL as a VirtualBox or Cloud image at http://simplex.i-med.ac.at; additional supplementary data is provided at http://www.icbi.at/exome.

  7. Combined evidence annotation of transposable elements in genome sequences.

    Directory of Open Access Journals (Sweden)

    Hadi Quesneville

    2005-07-01

    Full Text Available Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced from combining multiple independent sources of computational evidence. To elevate the quality of TE annotations to a level comparable to that of gene models, we have developed a combined evidence-model TE annotation pipeline, analogous to systems used for gene annotation, by integrating results from multiple homology-based and de novo TE identification methods. As proof of principle, we have annotated "TE models" in Drosophila melanogaster Release 4 genomic sequences using the combined computational evidence derived from RepeatMasker, BLASTER, TBLASTX, all-by-all BLASTN, RECON, TE-HMM and the previous Release 3.1 annotation. Our system is designed for use with the Apollo genome annotation tool, allowing automatic results to be curated manually to produce reliable annotations. The euchromatic TE fraction of D. melanogaster is now estimated at 5.3% (cf. 3.86% in Release 3.1), and we found a substantially higher number of TEs (n = 6,013) than previously identified (n = 1,572). Most of the new TEs derive from small fragments of a few hundred nucleotides long and highly abundant families not previously annotated (e.g., INE-1). We also estimated that 518 TE copies (8.6%) are inserted into at least one other TE, forming a nest of elements. The pipeline allows rapid and thorough annotation of even the most complex TE models, including highly deleted and/or nested elements such as those often found in heterochromatic sequences. Our pipeline can be easily adapted to other genome sequences, such as those of the D. melanogaster heterochromatin or other
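
    Combining hits from multiple TE detectors, as the pipeline above does, reduces in its simplest form to merging overlapping intervals per scaffold. The coordinates and detector assignments below are hypothetical, and the actual pipeline reconciles evidence far more carefully (strand, family identity, nesting):

    ```python
    def merge_evidence(tracks):
        """Merge overlapping (start, end) hits from several programs into models."""
        hits = sorted(h for track in tracks for h in track)
        merged = []
        for start, end in hits:
            if merged and start <= merged[-1][1]:
                # Overlaps or touches the previous model: extend it.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        return merged

    # Hypothetical hits on one scaffold from three detectors.
    repeatmasker = [(100, 400), (900, 1200)]
    blaster      = [(350, 600)]
    recon        = [(1150, 1300)]
    models = merge_evidence([repeatmasker, blaster, recon])
    ```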

  8. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.

    2015-08-18

    Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/

  9. BEACON: automated tool for Bacterial GEnome Annotation ComparisON.

    Science.gov (United States)

    Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B

    2015-08-18

    Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .
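
    The core of a comparison tool like the one described can be sketched as set operations over gene-to-function maps, with the extended annotation built by combining the individual ones. The gene IDs and functions below are hypothetical, and BEACON's real comparison and merging logic is more involved:

    ```python
    def compare_annotations(a, b):
        """Compare two gene->function maps and build a combined extended annotation."""
        agree    = {g for g in a.keys() & b.keys() if a[g] == b[g]}
        conflict = {g for g in a.keys() & b.keys() if a[g] != b[g]}
        # Extended annotation: union of both maps, preferring a's call on conflict.
        extended = {**b, **a}
        return agree, conflict, extended

    # Two hypothetical annotation methods for the same four-gene genome.
    am1 = {"g1": "kinase", "g2": "transporter", "g3": "hypothetical protein"}
    am2 = {"g1": "kinase", "g3": "protease", "g4": "ligase"}
    agree, conflict, extended = compare_annotations(am1, am2)
    ```

    Note how the extended map assigns a putative function to "g4", which the first method missed entirely; this is the mechanism by which combining annotations reduces the number of genes without any function assignment.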

  10. Evaluation of Three Automated Genome Annotations for Halorhabdus utahensis

    DEFF Research Database (Denmark)

    Bakke, Peter; Carney, Nick; DeLoache, Will

    2009-01-01

    Genome annotations are accumulating rapidly and depend heavily on automated annotation systems. Many genome centers offer annotation systems but no one has compared their output in a systematic way to determine accuracy and inherent errors. Errors in the annotations are routinely deposited...... in databases such as NCBI and used to validate subsequent annotation errors. We submitted the genome sequence of halophilic archaeon Halorhabdus utahensis to be analyzed by three genome annotation services. We have examined the output from each service in a variety of ways in order to compare the methodology...... and effectiveness of the annotations, as well as to explore the genes, pathways, and physiology of the previously unannotated genome. The annotation services differ considerably in gene calls, features, and ease of use. We had to manually identify the origin of replication and the species-specific consensus...

  11. SigReannot-mart: a query environment for expression microarray probe re-annotations

    Science.gov (United States)

    Moreews, François; Rauffet, Gaelle; Dehais, Patrice; Klopp, Christophe

    2011-01-01

    Expression microarrays are commonly used to study transcriptomes. Most of the arrays are now based on oligo-nucleotide probes. Probe design being a tedious task, it often takes place once at the beginning of the project. The oligo set is then used for several years. During this time period, the knowledge gathered by the community on the genome and the transcriptome increases and gets more precise. Therefore re-annotating the set is essential to supply the biologists with up-to-date annotations. SigReannot-mart is a query environment populated with regularly updated annotations for different oligo sets. It stores the results of the SigReannot pipeline that has mainly been used on farm and aquaculture species. It permits easy extraction in different formats using filters. It is used to compare probe sets on different criteria, to choose the set for a given experiment to mix probe sets in order to create a new one. Database URL: http://sigreannot-mart.toulouse.inra.fr/ PMID:21930501

  12. North America pipeline map

    International Nuclear Information System (INIS)

    Anon.

    2005-01-01

    This map presents details of pipelines currently in place throughout North America. Fifty-nine natural gas pipelines are presented, as well as 16 oil pipelines. The map also identifies six proposed natural gas pipelines. Major cities, roads and highways are included as well as state and provincial boundaries. The National Petroleum Reserve is identified, as well as the Arctic National Wildlife Refuge. The following companies placed advertisements on the map with details of the services they provide relating to pipeline management and construction: Ferus Gas Industries Trust; Proline; SulfaTreat Direct Oxidation; and TransGas. 1 map

  13. Facilitating functional annotation of chicken microarray data

    Directory of Open Access Journals (Sweden)

    Gresham Cathy R

    2009-10-01

    Full Text Available Abstract Background Modeling results from chicken microarray studies is challenging for researchers due to the little functional annotation associated with these arrays. The Affymetrix GeneChip chicken genome array, one of the biggest arrays and a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to Gene Ontology (GO). However, the GO annotation data presented by Affymetrix are incomplete; for example, they do not show references linked to manually annotated functions. In addition, there is no tool that allows microarray researchers to directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers considerable time searching multiple GO databases for functional information. Results We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on the Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets we developed an Array GO Mapper (AGOM) tool to help researchers quickly retrieve the corresponding functional information for their datasets. Conclusion Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly model their microarray datasets into more reliable biological functional information by using the AGOM tool. The diseases, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and development of therapies.
The GO annotation data generated will be available for public use via AgBase website and

  14. IMG ER: A System for Microbial Genome Annotation Expert Review and Curation

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M.; Mavromatis, Konstantinos; Ivanova, Natalia N.; Chen, I-Min A.; Chu, Ken; Kyrpides, Nikos C.

    2009-05-25

    A rapidly increasing number of microbial genomes are sequenced by organizations worldwide and are eventually included into various public genome data resources. The quality of the annotations depends largely on the original dataset providers, with erroneous or incomplete annotations often carried over into the public resources and difficult to correct. We have developed an Expert Review (ER) version of the Integrated Microbial Genomes (IMG) system, with the goal of supporting systematic and efficient revision of microbial genome annotations. IMG ER provides tools for the review and curation of annotations of both new and publicly available microbial genomes within IMG's rich integrated genome framework. New genome datasets are included into IMG ER prior to their public release either with their native annotations or with annotations generated by IMG ER's annotation pipeline. IMG ER tools allow addressing annotation problems detected with IMG's comparative analysis tools, such as genes missed by gene prediction pipelines or genes without an associated function. Over the past year, IMG ER was used for improving the annotations of about 150 microbial genomes.

  15. AGeS: A Software System for Microbial Genome Sequence Annotation

    Science.gov (United States)

    Kumar, Kamal; Desai, Valmik; Cheng, Li; Khitrov, Maxim; Grover, Deepak; Satya, Ravi Vijaya; Yu, Chenggang; Zavaljevski, Nela; Reifman, Jaques

    2011-01-01

    Background The annotation of genomes from next-generation sequencing platforms needs to be rapid, high-throughput, and fully integrated and automated. Although a few Web-based annotation services have recently become available, they may not be the best solution for researchers that need to annotate a large number of genomes, possibly including proprietary data, and store them locally for further analysis. To address this need, we developed a standalone software application, the Annotation of microbial Genome Sequences (AGeS) system, which incorporates publicly available and in-house-developed bioinformatics tools and databases, many of which are parallelized for high-throughput performance. Methodology The AGeS system supports three main capabilities. The first is the storage of input contig sequences and the resulting annotation data in a central, customized database. The second is the annotation of microbial genomes using an integrated software pipeline, which first analyzes contigs from high-throughput sequencing by locating genomic regions that code for proteins, RNA, and other genomic elements through the Do-It-Yourself Annotation (DIYA) framework. The identified protein-coding regions are then functionally annotated using the in-house-developed Pipeline for Protein Annotation (PIPA). The third capability is the visualization of annotated sequences using GBrowse. To date, we have implemented these capabilities for bacterial genomes. AGeS was evaluated by comparing its genome annotations with those provided by three other methods. Our results indicate that the software tools integrated into AGeS provide annotations that are in general agreement with those provided by the compared methods. This is demonstrated by a >94% overlap in the number of identified genes, a significant number of identical annotated features, and a >90% agreement in enzyme function predictions. PMID:21408217

  16. Semantic annotation of consumer health questions.

    Science.gov (United States)

    Kilicoglu, Halil; Ben Abacha, Asma; Mrabet, Yassine; Shooshan, Sonya E; Rodriguez, Laritza; Masterton, Kate; Demner-Fushman, Dina

    2018-02-06

    Consumers increasingly use online resources for their health information needs. While current search engines can address these needs to some extent, they generally do not take into account that most health information needs are complex and can only fully be expressed in natural language. Consumer health question answering (QA) systems aim to fill this gap. A major challenge in developing consumer health QA systems is extracting relevant semantic content from the natural language questions (question understanding). To develop effective question understanding tools, question corpora semantically annotated for relevant question elements are needed. In this paper, we present a two-part consumer health question corpus annotated with several semantic categories: named entities, question triggers/types, question frames, and question topic. The first part (CHQA-email) consists of relatively long email requests received by the U.S. National Library of Medicine (NLM) customer service, while the second part (CHQA-web) consists of shorter questions posed to MedlinePlus search engine as queries. Each question has been annotated by two annotators. The annotation methodology is largely the same between the two parts of the corpus; however, we also explain and justify the differences between them. Additionally, we provide information about corpus characteristics, inter-annotator agreement, and our attempts to measure annotation confidence in the absence of adjudication of annotations. The resulting corpus consists of 2614 questions (CHQA-email: 1740, CHQA-web: 874). Problems are the most frequent named entities, while treatment and general information questions are the most common question types. Inter-annotator agreement was generally modest: question types and topics yielded highest agreement, while the agreement for more complex frame annotations was lower. Agreement in CHQA-web was consistently higher than that in CHQA-email. Pairwise inter-annotator agreement proved most
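The "modest" inter-annotator agreement figures reported for corpora like CHQA are typically computed with a chance-corrected statistic; the abstract does not name the exact measure, so the sketch below uses Cohen's kappa as a representative choice, with invented labels:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two annotators'
    category labels (e.g. question types assigned to the same questions)."""
    n = len(labels_a)
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # expected agreement if both annotators labeled at random with
    # their own marginal label frequencies
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)

# Two annotators label four questions as (t)reatment or (i)nformation
kappa = cohens_kappa(["t", "t", "i", "i"], ["t", "i", "i", "i"])   # → 0.5
```

Values near 1 indicate near-perfect agreement; values in the 0.4–0.6 range are what "generally modest" agreement usually means in practice.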

  17. Pipeline repair technology damage and repair assessment of pipelines with high residual stresses

    OpenAIRE

    Høie, Øyvind

    2015-01-01

    Master's thesis in Offshore technology : subsea technology Today in the offshore industry, there is an increasing number of pipelines that require both maintenance and repair. A wide spectrum of research in pipeline repair technology is available. Damage to a pipeline can be a complex event to analyze, due to the many possible combinations of internal pipe stresses and damage types. Standards such as DNV and ASTM provide experiment-based assessment methods for evaluating many of t...

  18. Report on Boeing pipeline leak detection system

    International Nuclear Information System (INIS)

    Aichele, W.T.

    1978-08-01

    Testing was performed on both simulated (test) and existing (water) pipelines to evaluate the Boeing leak detection technique. This technique uses a transformer mounted around the pipe to induce a voltage level onto the pipeline. The induced ground potential is measured from a distant ground probe, inserted into the surrounding soil, with respect to the excited pipeline. The induced voltage level will depend on the soil characteristics, the distance from the excited pipeline, and the probe types. If liquid should leak from the excited pipeline, the escaping liquid will modify the induced potential of the soil surrounding the excited pipeline. This will change the response of the quiescent soil characteristics and cause the voltage level on the detecting probes in the area of the leak to increase. This voltage increase will indicate a soil anomaly. However, the liquid does not have to reach the detection probe to reveal an anomalous soil condition. Several different detection probes were used and evaluated for sensitivity and response time. Although not evaluated during this test, results indicate that a wire laid parallel to the pipe axis may be the best probe configuration. A general sensitivity figure for any of the probes cannot be made from these tests; however, the technique used will reliably detect a pipeline leak of ten gallons. An additional test was performed using the Boeing pipeline leak detection technique to locate the position and depth of an underground pipeline. This test showed that the location and depth of an excited pipeline could be determined from above the ground where other methods for pipeline location had previously failed
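The detection logic described — a leak modifies soil potential and raises the induced voltage on probes near the leak above their quiescent level — can be sketched as a simple threshold check. The probe names and the 20% fractional threshold below are invented for illustration; the report states no specific algorithm:

```python
def flag_leak_probes(baseline, readings, threshold=0.20):
    """Flag probes whose induced-voltage reading rose more than
    `threshold` (as a fraction of the quiescent baseline), indicating
    a soil anomaly near the excited pipeline."""
    flagged = []
    for probe, quiescent in baseline.items():
        rise = (readings[probe] - quiescent) / quiescent
        if rise > threshold:
            flagged.append(probe)
    return flagged

# Quiescent levels vs. readings after a simulated leak near probe P2
baseline = {"P1": 1.00, "P2": 0.95, "P3": 1.10}
readings = {"P1": 1.02, "P2": 1.40, "P3": 1.12}
leaky = flag_leak_probes(baseline, readings)   # → ["P2"]
```

Note that, as the report observes, the liquid need not reach a probe: any change in the surrounding soil's induced potential shows up as a rise relative to the quiescent baseline.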

  19. Optical fiber sensing technology in the pipeline industry

    Energy Technology Data Exchange (ETDEWEB)

    Braga, A.M.B.; Llerena, R.W.A. [Pontificia Univ. Catolica do Rio de Janeiro, RJ (Brazil). Dept. de Engenharia Mecanica]. E-mail: abraga@mec.puc-rio.br; roberan@mec.puc-rio.br; Valente, L.C.G.; Regazzi, R.D. [Gavea Sensors, Rio de Janeiro, RJ (Brazil)]. E-mail: guedes@gaveasensors.com; regazzi@gaveasensors.com

    2003-07-01

    This paper is concerned with applications of optical fiber sensors to pipeline monitoring. The basic principles of optical fiber sensors are briefly reviewed, with particular attention to fiber Bragg grating technology. Different potential applications in the pipeline industry are discussed, and an example of a pipeline strain monitoring system based on optical fiber Bragg grating sensors is presented. (author)

  20. Optimization for trenchless reconstruction of pipelines

    Directory of Open Access Journals (Sweden)

    Zhmakov Gennadiy Nikolaevich

    2015-01-01

    Full Text Available Today the technologies of trenchless pipeline reconstruction are becoming more and more widely used in Russia and abroad. One of the most promising methods is shock-free destruction of the old pipeline being replaced, using hydraulic installations whose working mechanism is a cutting unit with knife disks and a conic expander. A working-mechanism design that allows trenchless reconstruction of pipelines of different diameters has been optimized and patented, and a developmental prototype has been manufactured. The dependence of the pipeline cutting force on the bluntness of the working mechanism's knives was determined: the force needed to cut old steel pipelines increases in proportion to the degree of knife bluntness. Two laboratory stands for knife endurance testing are proposed and patented.

  1. Large-scale prokaryotic gene prediction and comparison to genome annotation

    DEFF Research Database (Denmark)

    Nielsen, Pernille; Krogh, Anders Stærmose

    2005-01-01

    -annotated. These results are based on the difference between the number of annotated genes not found by EasyGene and the number of predicted genes that are not annotated in GenBank. We argue that the average performance of our standardized and fully automated method is slightly better than the annotation....... genefinder EasyGene. Comparison of the GenBank and RefSeq annotations with the EasyGene predictions reveals that in some genomes up to 60% of the genes may have been annotated with a wrong start codon, especially in the GC-rich genomes. The fractional difference between annotated and predicted confirms...
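The start-codon discrepancy that EasyGene revealed can be illustrated with a toy comparison: two gene calls that share a stop coordinate are taken to be the same gene, and a differing start coordinate counts as a disagreement. The coordinates and helper function below are invented for illustration and are not EasyGene's actual output format:

```python
def start_codon_disagreements(annotated, predicted):
    """Count genes whose annotated and predicted calls share a stop
    coordinate (i.e. are the same gene) but differ in start coordinate --
    the kind of discrepancy EasyGene exposed in GenBank annotations."""
    ann_start_by_stop = {stop: start for start, stop in annotated}
    disagree = 0
    for start, stop in predicted:
        if stop in ann_start_by_stop and ann_start_by_stop[stop] != start:
            disagree += 1
    return disagree

# (start, stop) pairs; the first gene's start differs by 30 bp
annotated = [(100, 400), (500, 900), (1000, 1300)]
predicted = [(130, 400), (500, 900), (1000, 1300)]
n_wrong_starts = start_codon_disagreements(annotated, predicted)   # → 1
```

Comparing starts while matching on stops works because alternative in-frame start codons upstream or downstream of the true start share the same stop, which is exactly why GC-rich genomes (with many spurious upstream ATG/GTG codons) show the highest wrong-start rates.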

  2. MEETING: Chlamydomonas Annotation Jamboree - October 2003

    Energy Technology Data Exchange (ETDEWEB)

    Grossman, Arthur R

    2007-04-13

    Shotgun sequencing of the nuclear genome of Chlamydomonas reinhardtii (Chlamydomonas throughout) was performed at an approximate 10X coverage by JGI. Roughly half of the genome is now contained on 26 scaffolds, all of which are at least 1.6 Mb, and the coverage of the genome is ~95%. There are now over 200,000 cDNA sequence reads that we have generated as part of the Chlamydomonas genome project (Grossman, 2003; Shrager et al., 2003; Grossman et al. 2007; Merchant et al., 2007); other sequences have also been generated by the Kasuza sequence group (Asamizu et al., 1999; Asamizu et al., 2000) or individual laboratories that have focused on specific genes. Shrager et al. (2003) placed the reads into distinct contigs (an assemblage of reads with overlapping nucleotide sequences), and contigs that group together as part of the same genes have been designated ACEs (assembly of contigs generated from EST information). All of the reads have also been mapped to the Chlamydomonas nuclear genome and the cDNAs and their corresponding genomic sequences have been reassembled, and the resulting assemblage is called an ACEG (an Assembly of contiguous EST sequences supported by genomic sequence) (Jain et al., 2007). Most of the unique genes or ACEGs are also represented by gene models that have been generated by the Joint Genome Institute (JGI, Walnut Creek, CA). These gene models have been placed onto the DNA scaffolds and are presented as a track on the Chlamydomonas genome browser associated with the genome portal (http://genome.jgi-psf.org/Chlre3/Chlre3.home.html). Ultimately, the meeting grant awarded by DOE has helped enormously in the development of an annotation pipeline (a set of guidelines used in the annotation of genes) and resulted in high quality annotation of over 4,000 genes; the annotators were from both Europe and the USA. Some of the people who led the annotation initiative were Arthur Grossman, Olivier Vallon, and Sabeeha Merchant (with many individual

  3. Contributions to In Silico Genome Annotation

    KAUST Repository

    Kalkatawi, Manal M.

    2017-11-30

    Genome annotation is an important topic since it provides information for the foundation of downstream genomic and biological research. It is considered as a way of summarizing part of existing knowledge about the genomic characteristics of an organism. Annotating different regions of a genome sequence is known as structural annotation, while identifying functions of these regions is considered as a functional annotation. In silico approaches can facilitate both tasks that otherwise would be difficult and time-consuming. This study contributes to genome annotation by introducing several novel bioinformatics methods, some based on machine learning (ML) approaches. First, we present Dragon PolyA Spotter (DPS), a method for accurate identification of the polyadenylation signals (PAS) within human genomic DNA sequences. For this, we derived a novel feature-set able to characterize properties of the genomic region surrounding the PAS, enabling development of high accuracy optimized ML predictive models. DPS considerably outperformed the state-of-the-art results. The second contribution concerns developing generic models for structural annotation, i.e., the recognition of different genomic signals and regions (GSR) within eukaryotic DNA. We developed DeepGSR, a systematic framework that facilitates generating ML models to predict GSR with high accuracy. To the best of our knowledge, no available generic and automated method exists for such a task that could facilitate the studies of newly sequenced organisms. The prediction module of DeepGSR uses deep learning algorithms to derive highly abstract features that depend mainly on proper data representation and hyperparameter calibration. DeepGSR, which was evaluated on recognition of PAS and translation initiation sites (TIS) in different organisms, yields a simpler and more precise representation of the problem under study, compared to some other hand-tailored models, while producing high accuracy prediction results.
Finally

  4. A framework for annotating human genome in disease context.

    Science.gov (United States)

    Xu, Wei; Wang, Huisong; Cheng, Wenqing; Fu, Dong; Xia, Tian; Kibbe, Warren A; Lin, Simon M

    2012-01-01

    Identification of gene-disease associations is crucial to understanding disease mechanisms. A rapid increase in biomedical literature, driven by advances in genome-scale technologies, poses a challenge for manually curated annotation databases seeking to characterize gene-disease associations effectively and in a timely manner. We propose an automatic method, the Disease Ontology Annotation Framework (DOAF), to provide a comprehensive annotation of the human genome using the computable Disease Ontology (DO), the NCBO Annotator service, and NCBI Gene Reference Into Function (GeneRIF). DOAF keeps the resulting knowledgebase current by periodically executing an automatic pipeline that re-annotates the human genome using the latest DO and GeneRIF releases at any frequency, such as daily or monthly. Further, DOAF provides a computable and programmable environment which enables large-scale and integrative analysis by working with external analytic software or online service platforms. A user-friendly web interface (doa.nubic.northwestern.edu) is implemented to allow users to efficiently query, download, and view disease annotations and the underlying evidence.
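The core of one re-annotation pass — mapping GeneRIF free text to Disease Ontology terms — can be sketched with naive keyword matching standing in for the NCBO Annotator service that DOAF actually calls. The matching logic and example data are illustrative only (the DOIDs shown are real DO identifiers, but their use here is an assumption):

```python
def reannotate(generifs, disease_ontology):
    """Map each gene's GeneRIF free-text statements to Disease Ontology
    terms by simple substring matching on term names -- a toy stand-in
    for a concept-recognition service such as the NCBO Annotator."""
    annotations = {}
    for gene, texts in generifs.items():
        hits = set()
        for do_id, name in disease_ontology.items():
            if any(name.lower() in text.lower() for text in texts):
                hits.add(do_id)
        annotations[gene] = sorted(hits)
    return annotations

do_terms = {"DOID:1612": "breast cancer", "DOID:9352": "type 2 diabetes"}
rifs = {"BRCA1": ["Mutations predispose to breast cancer."],
        "TCF7L2": ["Variants associated with type 2 diabetes risk."]}
result = reannotate(rifs, do_terms)
# → {"BRCA1": ["DOID:1612"], "TCF7L2": ["DOID:9352"]}
```

Re-running such a pass against the latest DO and GeneRIF releases on a schedule is what keeps the resulting knowledgebase current.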

  5. A Novel Method to Enhance Pipeline Trajectory Determination Using Pipeline Junctions

    Directory of Open Access Journals (Sweden)

    Hussein Sahli

    2016-04-01

    Full Text Available Pipeline inspection gauges (pigs) have been used for many years to perform various maintenance operations in oil and gas pipelines. Different pipeline parameters can be inspected during the pig's journey. Although pigs use many sensors to detect the required pipeline parameters, matching these data with the corresponding pipeline location is a critical task. High-end, tactical-grade inertial measurement units (IMUs) are used in pigging applications to locate the problems detected by the other sensors and to reconstruct the trajectory of the pig. These IMUs are accurate; however, their high cost and large size limit their use in small diameter pipelines (8″ or less). This paper describes a new methodology that uses MEMS-based IMUs with an extended Kalman filter (EKF) and the pipeline junctions to increase position accuracy and to reduce the total RMS errors even during the unavailability of above ground markers (AGMs). The results of this new proposed method using a micro-electro-mechanical systems (MEMS)-based IMU revealed that the position RMS errors were reduced by approximately 85% compared to the standard EKF solution. This approach will therefore enable the mapping of small diameter pipelines, which was not possible before.
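The idea of using junctions as position fixes reduces to a simple picture: when the pig passes a junction whose as-built chainage is known from pipeline records, a Kalman measurement update pulls the drifting inertial estimate back toward that known position. The one-dimensional toy below stands in for the paper's full EKF; all state, variance, and position values are invented:

```python
def junction_update(x, P, junction_pos, r=0.25):
    """Scalar Kalman measurement update: treat a detected pipeline
    junction of known as-built position `junction_pos` as a position
    fix with measurement variance `r`, correcting the drifting
    inertial estimate (state x, variance P)."""
    k = P / (P + r)                    # Kalman gain in [0, 1)
    x_new = x + k * (junction_pos - x) # pull estimate toward the fix
    P_new = (1.0 - k) * P              # uncertainty shrinks after the fix
    return x_new, P_new

# INS-only estimate has drifted to 1003.0 m with large uncertainty;
# the pig then passes a junction surveyed at 1000.0 m
x, P = junction_update(x=1003.0, P=4.0, junction_pos=1000.0)
```

Because the gain depends on the ratio of the estimate's variance to the fix's variance, a well-surveyed junction (small `r`) corrects most of the accumulated drift, which is how junctions substitute for above ground markers between AGM sites.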

  6. Annotation Graphs: A Graph-Based Visualization for Meta-Analysis of Data Based on User-Authored Annotations.

    Science.gov (United States)

    Zhao, Jian; Glueck, Michael; Breslav, Simon; Chevalier, Fanny; Khan, Azam

    2017-01-01

    User-authored annotations of data can support analysts in the activity of hypothesis generation and sensemaking, where it is not only critical to document key observations, but also to communicate insights between analysts. We present annotation graphs, a dynamic graph visualization that enables meta-analysis of data based on user-authored annotations. The annotation graph topology encodes annotation semantics, which describe the content of and relations between data selections, comments, and tags. We present a mixed-initiative approach to graph layout that integrates an analyst's manual manipulations with an automatic method based on similarity inferred from the annotation semantics. Various visual graph layout styles reveal different perspectives on the annotation semantics. Annotation graphs are implemented within C8, a system that supports authoring annotations during exploratory analysis of a dataset. We apply principles of Exploratory Sequential Data Analysis (ESDA) in designing C8, and further link these to an existing task typology in the visualization literature. We develop and evaluate the system through an iterative user-centered design process with three experts, situated in the domain of analyzing HCI experiment data. The results suggest that annotation graphs are effective as a method of visually extending user-authored annotations to data meta-analysis for discovery and organization of ideas.

  7. OligoRAP - an Oligo Re-Annotation Pipeline to improve annotation and estimate target specificity

    NARCIS (Netherlands)

    Neerincx, P.B.T.; Rauwerda, H.; Nie, H.; Groenen, M.A.M.; Breit, T.M.; Leunissen, J.A.M.

    2009-01-01

    Background: High throughput gene expression studies using oligonucleotide microarrays depend on the specificity of each oligonucleotide (oligo or probe) for its target gene. However, target specific probes can only be designed when a reference genome of the species at hand were completely sequenced,

  8. Military Petroleum Pipeline Systems

    Science.gov (United States)

    1978-06-01

    Army has standardized for this purpose. A tactical pipeline system may be temporary or semipermanent and is emplaced rapidly to maintain the pipe ...with operating pressure. Pipe is the highest cost item in a pipeline system, normally representing more than half the total investment in materiel

  9. Slurry pipeline hydrostatic testing

    Energy Technology Data Exchange (ETDEWEB)

    Betinol, Roy G.; Navarro Rojas, Luis Alejandro [BRASS Chile S.A., Santiago (Chile)

    2009-07-01

    The transportation of concentrates and tailings through long distance pipelines has been proven in recent years to be the most economical, environmentally friendly and secure means of transporting mine products. This success has led to an increase in demand for long distance pipelines throughout the mining industry. In 2007 alone, over 500 km of pipeline was installed in South America, and over 800 km are in the planning stages. As more pipelines are installed, the need to ensure their operating integrity is ever increasing. Hydrostatic testing of long distance pipelines is one of the most economical and expeditious ways of proving the operational integrity of the pipe. The intent of this paper is to show the sound reasoning behind construction hydro testing and the economic benefit it presents. It shows how hydro test pressures are determined based on ASME B31.11 criteria. (author)
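As a rough illustration of how a test pressure follows from a design pressure: a factor of 1.25 is the figure commonly cited for liquid pipeline hydrotests. Treat the factor below as an assumption for illustration only; the governing value for a real slurry line must be taken from ASME B31.11 itself:

```python
def hydrotest_pressure(design_pressure, factor=1.25):
    """Hydrostatic test pressure as a multiple of internal design
    pressure.  The 1.25 factor is illustrative; the code (ASME B31.11)
    governs the actual value and hold-time requirements."""
    return factor * design_pressure

p_test = hydrotest_pressure(10.0)   # 10.0 MPa design → 12.5 MPa test
```

Testing above the operating pressure is what makes a successful hold economically meaningful: it demonstrates a margin of integrity before the line ever carries slurry.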

  10. Slurry pipeline design approach

    Energy Technology Data Exchange (ETDEWEB)

    Betinol, Roy; Navarro R, Luis [Brass Chile S.A., Santiago (Chile)

    2009-12-19

    Compared to other engineering technologies, the design of commercial long distance slurry pipelines is a relatively new engineering concept that gained recognition in the mid-1960s. Slurry pipelines were first introduced to reduce the cost of transporting coal to power generating units. Since then this technology has caught on worldwide for transporting other minerals such as limestone, copper, zinc and iron. In South America, pipelines are commonly used to transport copper (Chile, Peru and Argentina), iron (Chile and Brazil), zinc (Peru) and bauxite (Brazil). As more mining operations expand and new mine facilities open, long distance slurry pipelines will continue to present a commercially viable option. The intent of this paper is to present the design process and discuss new techniques and approaches used today to ensure a better, safer and more economical slurry pipeline. (author)

  11. Annotating individual human genomes.

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A; Topol, Eric J; Schork, Nicholas J

    2011-10-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. Copyright © 2011 Elsevier Inc. All rights reserved.

  12. ANNOTATING INDIVIDUAL HUMAN GENOMES*

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A.; Topol, Eric J.; Schork, Nicholas J.

    2014-01-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. PMID:21839162

  13. GSV Annotated Bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)

    2010-09-14

    The following annotated bibliography was developed as part of the geospatial algorithm verification and validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models. Many other papers were studied during the course of the investigation; the annotations for these articles can be found in the paper "On the verification and validation of geospatial image analysis algorithms".

  14. Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2009-04-17

    Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycete) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12% of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9% rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90% of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.

  15. Pollution from pipelines

    International Nuclear Information System (INIS)

    1991-01-01

    During the 1980s, over 3,900 spills from land-based pipelines released nearly 20 million gallons of oil into U.S. waters, almost twice as much as was released by the March 1989 Exxon Valdez oil spill. Although the Department of Transportation is responsible for preventing water pollution from petroleum pipelines, GAO found that it has not established a program to prevent such pollution. DOT has instead delegated this responsibility to the Coast Guard, which has a program to stop water pollution from ships, but not from pipelines. This paper reports that, in the absence of any federal program to prevent water pollution from pipelines, both the Coast Guard and the Environmental Protection Agency have taken steps to plan for and respond to oil spills, including those from pipelines, as required by the Clean Water Act. The Coast Guard cannot, however, adequately plan for or ensure a timely response to pipeline spills because it generally is unaware of the specific locations and operators of pipelines.

  16. 77 FR 19799 - Pipeline Safety: Pipeline Damage Prevention Programs

    Science.gov (United States)

    2012-04-02

    ... rates translate to increased public and worker safety and decreased repair and outage costs for pipeline... April 2, 2012 Part III Department of Transportation Pipeline and Hazardous Materials Safety Administration 49 CFR Parts 196 and 198 Pipeline Safety: Pipeline Damage Prevention Programs; Proposed Rule...

  17. High temperature pipeline design

    Energy Technology Data Exchange (ETDEWEB)

    Greenslade, J.G. [Colt Engineering, Calgary, AB (Canada). Pipelines Dept.; Nixon, J.F. [Nixon Geotech Ltd., Calgary, AB (Canada); Dyck, D.W. [Stress Tech Engineering Inc., Calgary, AB (Canada)

    2004-07-01

    It is impractical to transport bitumen and heavy oil by pipelines at ambient temperature unless diluents are added to reduce the viscosity. A diluted bitumen pipeline is commonly referred to as a dilbit pipeline. The diluent routinely used is natural gas condensate. Since natural gas condensate is limited in supply, it must be recovered and reused at high cost. This paper presented an alternative to the use of diluent to reduce the viscosity of heavy oil or bitumen. The following two basic design issues for a hot bitumen (hotbit) pipeline were presented: (1) modelling the restart problem, and (2) establishing the maximum practical operating temperature. The transient behaviour during restart of a high temperature pipeline carrying viscous fluids was modelled using the concept of flow capacity. Although the design conditions were hypothetical, they could be encountered in the Athabasca oilsands. It was shown that environmental disturbances occur when the fluid is cooled during shut down because the ground temperature near the pipeline rises. This can change growing conditions, even near deeply buried insulated pipelines. Axial thermal loads also constrain the design and operation of a buried pipeline as higher operating temperatures are considered. As such, strain-based design provides the opportunity to design for higher operating temperatures than allowable stress-based design methods. Expansion loops can partially relieve the thermal stress at a given temperature. As the design temperature increases, there is a point at which above-grade pipelines become attractive options, although the materials and welding procedures must be suitable for low temperature service. 3 refs., 1 tab., 10 figs.

  18. Annotating Emotions in Meetings

    NARCIS (Netherlands)

    Reidsma, Dennis; Heylen, Dirk K.J.; Ordelman, Roeland J.F.

    We present the results of two trials testing procedures for the annotation of emotion and mental state of the AMI corpus. The first procedure is an adaptation of the FeelTrace method, focusing on a continuous labelling of emotion dimensions. The second method is centered around more discrete

  19. Annotation of Regular Polysemy

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector

    Regular polysemy has received a lot of attention from the theory of lexical semantics and from computational linguistics. However, there is no consensus on how to represent the sense of underspecified examples at the token level, namely when annotating or disambiguating senses of metonymic words...

  20. Pipeline operators training and certification using thermohydraulic simulators

    Energy Technology Data Exchange (ETDEWEB)

    Barreto, Claudio V.; Plasencia C, Jose [Pontificia Universidade Catolica (PUC-Rio), Rio de Janeiro, RJ (Brazil). Nucleo de Simulacao Termohidraulica de Dutos (SIMDUT); Montalvao, Filipe; Costa, Luciano [TRANSPETRO - PETROBRAS Transporte S.A., Rio de Janeiro, RJ (Brazil)

    2009-07-01

    The continuous training and certification of pipeline operators at TRANSPETRO's Pipeline National Operations Control Center (CNCO) is an essential task, aiming at the efficiency and safety of oil and derivatives transport operations through the Brazilian pipeline network. For this objective, a hydraulic simulator is considered an excellent tool that allows the creation of different operational scenarios, both for training on the pipeline's hydraulic behavior and for testing the operator's responses to normal and abnormal real-time operational conditions. The hydraulic simulator is developed on top of pipeline simulation software that supplies the hydraulic responses normally acquired from the pipeline remote units in the field. The pipeline simulation software has a communication interface system that sends and receives data to the SCADA supervisory system database. The SCADA graphical interface is used to create and customize human machine interfaces (HMI) from which the operator/instructor has total control of the pipeline/system and instrumentation by sending commands. Therefore, it is possible to have realistic training outside of the real production systems, while acquiring experience during training hours with the operation of a real pipeline. A pilot project was initiated at TRANSPETRO - CNCO to evaluate the advantages of hydraulic simulators in pipeline operator training and certification programs. The first part of the project was the development of three simulators for different pipelines. The excellent results permitted the project's expansion to a total of twenty different pipelines, implemented in training programs for pipelines presently operated by CNCO as well as for the new ones that are being migrated. The main objective of this paper is to present an overview of the implementation process and the development of a training environment through a pipe simulation environment using commercial software. This paper also presents

  1. eggNOG v2.0: extending the evolutionary genealogy of genes with enhanced non-supervised orthologous groups, species and functional annotations

    DEFF Research Database (Denmark)

    Muller, J; Szklarczyk, D; Julien, P

    2010-01-01

    clustering. We applied this procedure to 630 complete genomes (529 bacteria, 46 archaea and 55 eukaryotes), which is a 2-fold increase relative to the previous version. The pipeline yielded 224,847 OGs, including 9724 extended versions of the original COG and KOG. We computed OGs for different levels...... of the tree of life; in addition to the species groups included in our first release (i.e. fungi, metazoa, insects, vertebrates and mammals), we have now constructed OGs for archaea, fishes, rodents and primates. We automatically annotate the non-supervised orthologous groups (NOGs) with functional...
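    The orthologous-group (OG) construction such pipelines perform can be sketched with its classic core step: genes linked by reciprocal best hits across genomes are merged into groups. This is a minimal illustration, not the eggNOG triangle-based algorithm itself; gene names and scores are invented:

```python
# Sketch of OG construction: merge genes connected by reciprocal best
# hits using union-find. `best_hit` maps each gene to its top-scoring
# hit in another genome (illustrative data, not eggNOG's).
def build_groups(best_hit):
    parent = {g: g for g in best_hit}
    def find(g):                          # union-find with path halving
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g
    for a, b in best_hit.items():
        if best_hit.get(b) == a:          # only reciprocal hits define an edge
            parent[find(a)] = find(b)
    groups = {}
    for g in best_hit:
        groups.setdefault(find(g), set()).add(g)
    return sorted(map(sorted, groups.values()))

hits = {"hsa1": "mmu1", "mmu1": "hsa1",
        "hsa2": "mmu2", "mmu2": "hsa3", "hsa3": "mmu2"}
print(build_groups(hits))
```

The real pipeline additionally builds OGs per taxonomic level and extends the curated COG/KOG seeds, but the reciprocal-best-hit merge above is the shared backbone.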

  2. Chechnya: the pipeline front

    Energy Technology Data Exchange (ETDEWEB)

    Anon.

    1999-11-01

    This article examines the impact of the Russian campaign against Chechnya on projects for oil and gas pipelines from the new Caspian republics, which are seeking financial support. Topics discussed include the pipeline transport of oil from Azerbaijan through Chechnya to the Black Sea, the use of oil money to finance the war, the push for non-Russian export routes, the financing of pipelines, the impact of the war on the supply of Russian and Turkmenistan gas to Turkey, the proposed construction of the Trans Caspian pipeline, the weakening of trust between Russia and its neighbours, and the potential for trans Caucasus republics to look to western backers due to the instability of the North Caucasus. (UK)

  3. Natural Gas Liquid Pipelines

    Data.gov (United States)

    Department of Homeland Security — Natural gas interstate and intrastate pipelines in the United States. Based on a variety of sources with varying scales and levels of accuracy and therefore accuracy...

  4. Seismic response of buried pipelines: a state-of-the-art review

    International Nuclear Information System (INIS)

    Datta, T.K.

    1999-01-01

    A state-of-the-art review of the seismic response of buried pipelines is presented. The review includes modeling of the soil-pipe system and seismic excitation, methods of response analysis of buried pipelines, seismic behavior of buried pipelines under different parametric variations, seismic stresses at the bends and intersections of networks of pipelines, pipe damage in earthquakes, and seismic risk analysis of buried pipelines. Based on the review, the future scope of work on the subject is outlined. (orig.)

  5. Metannogen: annotation of biological reaction networks.

    Science.gov (United States)

    Gille, Christoph; Hübner, Katrin; Hoppe, Andreas; Holzhütter, Hermann-Georg

    2011-10-01

    Semantic annotations of the biochemical entities constituting a biological reaction network are indispensable to create biologically meaningful networks. They further facilitate the efficient exchange, reuse and merging of existing models, which are increasingly common concerns in present-day systems biology research. Two types of tools for the reconstruction of biological networks currently exist: (i) several sophisticated programs support graphical network editing and visualization; (ii) data management systems permit reconstruction and curation of huge networks in a team of scientists, including data integration, annotation and cross-referencing. We sought ways to combine the advantages of both approaches. Metannogen, which was previously developed for network reconstruction, has been considerably improved. From now on, Metannogen provides SBML import and annotation of networks created elsewhere. This permits users of other network reconstruction platforms or modeling software to annotate their networks using Metannogen's advanced information management. We implemented word-autocompletion, multipattern highlighting, spell check, brace-expansion and publication management, and improved annotation, cross-referencing and team work support. Unspecific enzymes and transporters acting on a spectrum of different substrates are efficiently handled. The network can be exported in SBML format, where the annotations are embedded in line with the MIRIAM standard. For more comfort, Metannogen may be tightly coupled with the network editor such that Metannogen becomes an additional view for the focused reaction in the network editor. Finally, Metannogen provides local single-user, shared password-protected multiuser or public access to the annotation data. Metannogen is available free of charge at: http://www.bioinformatics.org/strap/metannogen/ or http://3d-alignment.eu/metannogen/. christoph.gille@charite.de Supplementary data are available at Bioinformatics online.
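    The MIRIAM-style cross-referencing mentioned above boils down to attaching resolvable identifiers.org URIs to each network entity before they are embedded in the SBML export. A minimal sketch of that convention (the species ID and cross-references below are illustrative, not taken from Metannogen):

```python
# Sketch of MIRIAM-style annotation: turn (collection, accession) pairs
# into identifiers.org URIs, the form embedded in SBML <annotation> blocks.
def miriam_uris(xrefs):
    return ["https://identifiers.org/%s/%s" % (coll, acc) for coll, acc in xrefs]

# Hypothetical species entry: glucose cross-referenced to ChEBI and KEGG.
species = {"glc": [("chebi", "CHEBI:17234"), ("kegg.compound", "C00293")]}
for sid, xrefs in species.items():
    for uri in miriam_uris(xrefs):
        print(sid, uri)
```

In a full SBML document these URIs sit inside RDF `bqbiol:is` qualifiers; the sketch only shows the URI convention itself.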

  6. Annotation of selection strengths in viral genomes

    DEFF Research Database (Denmark)

    McCauley, Stephen; de Groot, Saskia; Mailund, Thomas

    2007-01-01

    - and intergenomic regions. The presence of multiple coding regions complicates the concept of Ka/Ks ratio, and thus begs for an alternative approach when investigating selection strengths. Building on the paper by McCauley & Hein (2006), we develop a method for annotating a viral genome coding in overlapping...... may thus achieve an annotation both of coding regions as well as selection strengths, allowing us to investigate different selection patterns and hypotheses. Results: We illustrate our method by applying it to a multiple alignment of four HIV2 sequences, as well as four Hepatitis B sequences. We...... obtain an annotation of the coding regions, as well as a posterior probability for each site of the strength of selection acting on it. From this we may deduce the average posterior selection acting on the different genes. Whilst we are encouraged to see in HIV2, that the known to be conserved genes gag...

  7. GSV Annotated Bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)

    2011-06-14

    The following annotated bibliography was developed as part of the Geospatial Algorithm Verification and Validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and Validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: Image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models.

  8. Suggested Books for Children: An Annotated Bibliography

    Science.gov (United States)

    NHSA Dialog, 2008

    2008-01-01

    This article provides an annotated bibliography of various children's books. It includes listings of books that illustrate the dynamic relationships within the natural environment, economic context, racial and cultural identities, cross-group similarities and differences, gender, different abilities and stories of injustice and resistance.

  9. Taxonomic and functional annotation of gut bacterial communities of Eisenia foetida and Perionyx excavatus.

    Science.gov (United States)

    Singh, Arjun; Singh, Dushyant P; Tiwari, Rameshwar; Kumar, Kanika; Singh, Ran Vir; Singh, Surender; Prasanna, Radha; Saxena, Anil K; Nain, Lata

    2015-06-01

    Epigeic earthworms can significantly hasten the decomposition of organic matter, which is known to be mediated by gut associated microflora. However, there is scanty information on the abundance and diversity of the gut bacterial flora in different earthworm genera fed with a similar diet, particularly Eisenia foetida and Perionyx excavatus. In this context, a 16S rDNA based clonal survey of gut metagenomic DNA was assessed after growth of these two earthworms on lignocellulosic biomass. A set of 67 clonal sequences belonging to E. foetida and 75 to P. excavatus were taxonomically annotated using the MG-RAST and RDP pipeline servers. The highest number of sequences were annotated to Proteobacteria (38-44%), followed by unclassified bacteria (14-18%) and Firmicutes (9.3-11%). Comparative analyses revealed significantly higher abundance of Actinobacteria and Firmicutes in the gut of P. excavatus. The functional annotation for the 16S rDNA clonal libraries of both the metagenomes revealed a high abundance of xylan degraders (12.1-24.1%). However, chitin degraders (16.7%), ammonia oxidizers (24.1%) and nitrogen fixers (7.4%) were relatively higher in E. foetida, while in P. excavatus, sulphate reducers and sulphate oxidizers (12.1-29.6%) were more abundant. Lignin degradation was detected in 3.7% of E. foetida clones, while cellulose degraders represented 1.7%. The gut microbiomes showed relative abundance of dehalogenators (17.2-22.2%) and aromatic hydrocarbon degraders (1.7-5.6%), illustrating their role in bioremediation. This study highlights the significance of differences in the inherent microbiome of these two earthworms in shaping the metagenome for effective degradation of different types of biomass under tropical conditions. Copyright © 2015 Elsevier GmbH. All rights reserved.
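    The percentage figures quoted in such clone-library comparisons are simple relative abundances: counts of annotated clones per taxon divided by the library total. A minimal sketch (the counts below are invented for illustration, not the paper's data):

```python
# Sketch of a clone-library abundance summary: convert per-taxon clone
# counts into percentages of the library. Counts are illustrative.
def relative_abundance(counts):
    total = sum(counts.values())
    return {taxon: round(100.0 * n / total, 1) for taxon, n in counts.items()}

e_foetida = {"Proteobacteria": 26, "Firmicutes": 7, "unclassified": 12}
print(relative_abundance(e_foetida))
```

Comparing two such dictionaries taxon by taxon is what underlies statements like "significantly higher abundance of Firmicutes in P. excavatus" (the significance test itself is a separate step).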

  10. Transcript annotation in FANTOM3: mouse gene catalog based on physical cDNAs.

    Directory of Open Access Journals (Sweden)

    Norihiro Maeda

    2006-04-01

    The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that a number of cDNAs still remained to be collected and identified. To pursue the complete gene catalog that covers all predicted mouse genes, cloning and sequencing of full-length enriched cDNAs has been continued since FANTOM2. In FANTOM3, 42,031 newly isolated cDNAs were subjected to functional annotation, and the annotation of 4,347 FANTOM2 cDNAs was updated. To accomplish accurate functional annotation, we improved our automated annotation pipeline by introducing new coding sequence prediction programs and developed a Web-based annotation interface for simplifying the annotation procedures to reduce manual annotation errors. Automated coding sequence and function prediction was followed with manual curation and review by expert curators. A total of 102,801 full-length enriched mouse cDNAs were annotated. Out of 102,801 transcripts, 56,722 were functionally annotated as protein coding (including partial or truncated transcripts), providing to our knowledge the greatest current coverage of the mouse proteome by full-length cDNAs. The total number of distinct non-protein-coding transcripts increased to 34,030. The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome, and could be applied to the transcriptomes of other species.
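    A core primitive inside any automated CDS prediction step of a cDNA annotation pipeline is finding candidate open reading frames. A minimal sketch, scanning the three forward frames for the longest ATG-to-stop ORF (the sequence is invented; real pipelines such as FANTOM3's combine several predictors plus manual curation):

```python
# Sketch of the simplest CDS predictor: the longest forward-strand ORF.
STOPS = {"TAA", "TAG", "TGA"}

def longest_orf(seq):
    best = ""
    for frame in range(3):
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        start = None
        for i, c in enumerate(codons):
            if c == "ATG" and start is None:
                start = i                        # open a candidate ORF
            elif c in STOPS and start is not None:
                orf = "".join(codons[start:i + 1])
                if len(orf) > len(best):
                    best = orf
                start = None                     # close it and keep scanning
    return best

print(longest_orf("CCATGAAATTTTAGGG"))
```

Extending this with the reverse strand, codon-usage scores and homology evidence is what separates a toy ORF finder from a production predictor.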

  11. Investigation on potential SCC in gas transmission pipeline in China

    Energy Technology Data Exchange (ETDEWEB)

    Jian, S. [Petroleum Univ., Beijing (China); Zupei, Y.; Yunxin, M. [China Petroleum Pipeline Corp., Beijing (China). Science and Technology Center

    2004-07-01

    Stress corrosion cracking (SCC) is a common phenomenon that occurs on the outer surfaces of buried pipelines. This paper investigated aspects of SCC on 3 transmission pipelines of the West-East Gas Pipeline Project in China. The study comprised 3 different investigations: (1) an investigation of SCC cases on constructed pipelines; (2) an evaluation of the SCC sensitivity of pipeline steels in typical soil environments; and (3) an analysis of soil environments and operating conditions of western pipelines. The study included a review of pipeline corrosion investigations, as well as an examination of pipeline failure cases. Investigative digs were conducted at 21 sites to test soil chemistries. Slow strain rate tests were conducted to evaluate the SCC sensitivity of steel pipelines used in China. Potentiodynamic polarization tests were conducted to characterize the electrochemical behaviour of the X70 line pipe steel in different soil environments. Results of the study showed that the environmental conditions in many locations in China contributed to SCC in pipelines. SCC was observed on the surface of X70 steel pipe specimens in both marsh and saline environments. Seasonal temperature changes also imposed additional stress on pipelines. The movement of soil bodies in mountainous areas also contributed to stress and coating damage. It was concluded that proper cathodic protection can alleviate concentrations of local solutions under disbonded coatings, while cathodic overprotection will accelerate the growth of cracks and the degradation of coatings. Samples gathered from the solutions found under the disbonded coatings of pipelines will be used to form part of a reference database for predicting SCC in oil and gas pipelines in the future. 2 refs., 4 tabs., 5 figs.

  12. A software pipeline for processing and identification of fungal ITS sequences

    Directory of Open Access Journals (Sweden)

    Kristiansson Erik

    2009-01-01

    Abstract Background Fungi from environmental samples are typically identified to species level through DNA sequencing of the nuclear ribosomal internal transcribed spacer (ITS region for use in BLAST-based similarity searches in the International Nucleotide Sequence Databases. These searches are time-consuming and regularly require a significant amount of manual intervention and complementary analyses. We here present software – in the form of an identification pipeline for large sets of fungal ITS sequences – developed to automate the BLAST process and several additional analysis steps. The performance of the pipeline was evaluated on a dataset of 350 ITS sequences from fungi growing as epiphytes on building material. Results The pipeline was written in Perl and uses a local installation of NCBI-BLAST for the similarity searches of the query sequences. The variable subregion ITS2 of the ITS region is extracted from the sequences and used for additional searches of higher sensitivity. Multiple alignments of each query sequence and its closest matches are computed, and query sequences sharing at least 50% of their best matches are clustered to facilitate the evaluation of hypothetically conspecific groups. The pipeline proved to speed up the processing, as well as enhance the resolution, of the evaluation dataset considerably, and the fungi were found to belong chiefly to the Ascomycota, with Penicillium and Aspergillus as the two most common genera. The ITS2 was found to indicate a different taxonomic affiliation than did the complete ITS region for 10% of the query sequences, though this figure is likely to vary with the taxonomic scope of the query sequences. Conclusion The present software readily assigns large sets of fungal query sequences to their respective best matches in the international sequence databases and places them in a larger biological context. The output is highly structured to be easy to process, although it still needs
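    The clustering step described above (grouping query sequences that share a large fraction of their best BLAST matches into hypothetically conspecific groups) can be sketched as follows. This is an illustrative reimplementation under the assumption that "sharing at least 50%" means 50% of the smaller best-match set; accession names are invented:

```python
# Sketch of best-match clustering: queries whose sets of top BLAST hits
# overlap by >= `threshold` of the smaller set are merged via union-find.
def cluster_queries(best_matches, threshold=0.5):
    names = list(best_matches)
    parent = list(range(len(names)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            a, b = best_matches[names[i]], best_matches[names[j]]
            if len(a & b) >= threshold * min(len(a), len(b)):
                parent[find(i)] = find(j)
    clusters = {}
    for i, n in enumerate(names):
        clusters.setdefault(find(i), []).append(n)
    return sorted(clusters.values())

matches = {"q1": {"m1", "m2"}, "q2": {"m2", "m3"}, "q3": {"m9"}}
print(cluster_queries(matches))
```

The real pipeline works from parsed BLAST reports rather than in-memory sets, but the grouping criterion is the same shape.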

  13. Protecting a pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Gray, D.H. (Univ. of Michigan, Ann Arbor, MI (United States)); Garcia-Lopez, M. (Ingenieria y Geotecnia Ltda., Santafe de Bogota (Colombia))

    1994-12-01

    This article describes some of the difficulties in constructing an oil pipeline in Colombia across a forested mountain range that has erosion-prone slopes. Engineers are finding ways to protect the pipeline against slope failures and severe erosion problems while contending with threats of guerrilla attacks. Torrential rainfall, precipitous slopes, unstable soils, unfavorable geology and difficult access make construction of an oil pipeline in Colombia a formidable undertaking. Add the threat of guerrilla attacks, and the project takes on a new dimension. In the country's central uplands, a 76 cm pipeline traverses some of the most daunting and formidable terrain in the world. The right-of-way crosses rugged mountains with vertical elevations ranging from 300 m to 2,000 m above sea level over a distance of some 30 km. The pipeline snakes up and down steep forested inclines in some spots and crosses streams and faults in others, carrying the country's major export, petroleum, from the Cusiana oil field, located in Colombia's lowland interior, to the coast.

  14. De novo assembly and annotation of the Asian tiger mosquito (Aedes albopictus) repeatome with dnaPipeTE from raw genomic reads and comparative analysis with the yellow fever mosquito (Aedes aegypti).

    Science.gov (United States)

    Goubert, Clément; Modolo, Laurent; Vieira, Cristina; ValienteMoro, Claire; Mavingui, Patrick; Boulesteix, Matthieu

    2015-03-11

    Repetitive DNA, including transposable elements (TEs), is found throughout eukaryotic genomes. Annotating and assembling the "repeatome" during genome-wide analysis often poses a challenge. To address this problem, we present dnaPipeTE, a new bioinformatics pipeline that uses a sample of raw genomic reads. It produces precise estimates of repeated DNA content and TE consensus sequences, as well as the relative ages of TE families. We show that dnaPipeTE performs well using very low coverage sequencing in different genomes, losing accuracy only with old TE families. We applied this pipeline to the genome of the Asian tiger mosquito Aedes albopictus, an invasive species of human health interest, for which the genome size is estimated to be over 1 Gbp. Using dnaPipeTE, we showed that this species harbors a large (50% of the genome) and potentially active repeatome with an overall TE class and order composition similar to that of Aedes aegypti, the yellow fever mosquito. However, intraorder dynamics show clear distinctions between the two species, with differences at the TE family level. Our pipeline's ability to manage the repeatome annotation problem will make it helpful for new or ongoing assembly projects, and our results will benefit future genomic studies of A. albopictus. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
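    The core intuition behind estimating repeat content from a low-coverage read sample is that k-mers from repetitive DNA recur far more often than single-copy k-mers even in a tiny sample. The sketch below illustrates that idea only; it is not dnaPipeTE's actual algorithm (which assembles TE consensus sequences with dedicated tools), and the reads, k-mer size and thresholds are invented:

```python
# Illustrative repeat-content estimate from sampled reads: a read is
# called repetitive if most of its k-mers are high-copy in the sample.
from collections import Counter

def repeat_fraction(reads, k=4, min_count=3):
    kmers = Counter(r[i:i + k] for r in reads for i in range(len(r) - k + 1))
    def repetitive(read):
        counts = [kmers[read[i:i + k]] for i in range(len(read) - k + 1)]
        return sum(c >= min_count for c in counts) > len(counts) / 2
    return sum(map(repetitive, reads)) / len(reads)

# Three copies of a "repeat" read plus one unique read.
print(repeat_fraction(["ACGTACGT"] * 3 + ["GGCTTAAC"]))
```

With genome-scale data, k would be much larger (e.g. 21+) and the thresholds calibrated against coverage, but the high-copy-k-mer signal is the same.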

  15. Multi-Atlas Segmentation using Partially Annotated Data: Methods and Annotation Strategies.

    Science.gov (United States)

    Koch, Lisa M; Rajchl, Martin; Bai, Wenjia; Baumgartner, Christian F; Tong, Tong; Passerat-Palmbach, Jonathan; Aljabar, Paul; Rueckert, Daniel

    2017-08-22

    Multi-atlas segmentation is a widely used tool in medical image analysis, providing robust and accurate results by learning from annotated atlas datasets. However, the availability of fully annotated atlas images for training is limited due to the time required for the labelling task. Segmentation methods requiring only a proportion of each atlas image to be labelled could therefore reduce the workload on expert raters tasked with annotating atlas images. To address this issue, we first re-examine the labelling problem common in many existing approaches and formulate its solution in terms of a Markov Random Field energy minimisation problem on a graph connecting atlases and the target image. This provides a unifying framework for multi-atlas segmentation. We then show how modifications in the graph configuration of the proposed framework enable the use of partially annotated atlas images and investigate different partial annotation strategies. The proposed method was evaluated on two Magnetic Resonance Imaging (MRI) datasets for hippocampal and cardiac segmentation. Experiments were performed aimed at (1) recreating existing segmentation techniques with the proposed framework and (2) demonstrating the potential of employing sparsely annotated atlas data for multi-atlas segmentation.
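    The simplest label-fusion baseline subsumed by the MRF framework above is per-voxel majority voting, where voxels left unlabelled in a partially annotated atlas simply abstain. A minimal sketch (flattened label lists stand in for image volumes; labels and atlases are illustrative):

```python
# Sketch of majority-vote label fusion with partially annotated atlases:
# each atlas votes per voxel; `unlabelled` entries abstain.
from collections import Counter

def fuse_labels(atlas_labels, unlabelled=None):
    fused = []
    for votes in zip(*atlas_labels):                 # one tuple per voxel
        valid = [v for v in votes if v != unlabelled]
        fused.append(Counter(valid).most_common(1)[0][0] if valid else unlabelled)
    return fused

atlases = [[1, 0, None, 2],     # atlas 1, third voxel unannotated
           [1, 0, 2, 2],        # atlas 2, fully annotated
           [0, 0, 2, None]]     # atlas 3, last voxel unannotated
print(fuse_labels(atlases))
```

The paper's contribution is precisely to replace this independent per-voxel vote with a graph-based MRF energy that propagates information between atlases and the target, which handles sparse annotation far more gracefully.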

  16. ANNOTATION SUPPORTED OCCLUDED OBJECT TRACKING

    Directory of Open Access Journals (Sweden)

    Devinder Kumar

    2012-08-01

    Tracking occluded objects at different depths has become an extremely important component of study for any video sequence, with wide applications in object tracking, scene recognition, coding, video editing and mosaicking. The paper studies the ability of annotation to track the occluded object based on pyramids with variation in depth, further establishing a threshold at which the ability of the system to track the occluded object fails. Image annotation is applied on 3 similar video sequences varying in depth. In the experiment, one bike occludes the other at a depth of 60 cm, 80 cm and 100 cm respectively. Another experiment is performed on tracking humans at similar depths to corroborate the results. The paper also computes the frame-by-frame error incurred by the system, supported by detailed simulations. This system can be effectively used to analyze the error in motion tracking and further correct the error, leading to flawless tracking. This can be of great interest to computer scientists designing surveillance systems.

  17. Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger

    Directory of Open Access Journals (Sweden)

    Grigoriev Igor V

    2009-02-01

    Abstract Background Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate over a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institute (JGI), together with a second predicted protein set from another A. niger sequence. Tandem mass spectra (MS/MS) were acquired from 1D gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS) and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR). Results 405 identified peptide sequences were mapped to 214 different A. niger genomic loci, to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6%) of these loci either had no preferred predicted gene model, or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. Conclusion This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines, much as expressed sequence tag (EST) data has been. A comparison with the published genome of another strain of A. niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method.
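    The discrimination step described above, choosing the candidate gene model at a locus that most parsimoniously explains the identified peptides, can be sketched very simply: score each model's predicted protein by how many peptides it contains. The model names, protein sequences and peptides below are invented for illustration:

```python
# Sketch of peptide-based gene-model discrimination: at one locus, prefer
# the candidate model whose predicted protein contains the most of the
# confidently identified peptides.
def best_model(models, peptides):
    """models: name -> predicted protein sequence; returns the best name."""
    def hits(protein):
        return sum(pep in protein for pep in peptides)
    return max(models.items(), key=lambda kv: hits(kv[1]))[0]

models = {"m1": "MKLVVDEALSR", "m2": "MKLVVQQQ"}   # two candidate models
peptides = ["KLVVD", "EALSR"]                      # identified MS/MS peptides
print(best_model(models, peptides))
```

In practice the comparison also has to respect tryptic boundaries and intron structure, but substring containment captures the parsimony idea.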

  18. 76 FR 29333 - Pipeline Safety: Meetings of the Technical Pipeline Safety Standards Committee and the Technical...

    Science.gov (United States)

    2011-05-20

    ... pipeline rehabilitation, replacement and repair initiatives along with all forum-related comments. PHMSA... Pipeline and Hazardous Materials Safety Administration Pipeline Safety: Meetings of the Technical Pipeline Safety Standards Committee and the Technical Hazardous Liquid Pipeline Safety Standards Committee AGENCY...

  19. CPL: Common Pipeline Library

    Science.gov (United States)

    ESO CPL Development Team

    2014-02-01

    The Common Pipeline Library (CPL) is a set of ISO-C libraries that provide a comprehensive, efficient and robust software toolkit to create automated astronomical data reduction pipelines. Though initially developed as a standardized way to build VLT instrument pipelines, the CPL may be more generally applied to any similar application. The code also provides a variety of general purpose image- and signal-processing functions, making it an excellent framework for the creation of more generic data handling packages. The CPL handles low-level data types (images, tables, matrices, strings, property lists, etc.) and medium-level data access methods (a simple data abstraction layer for FITS files). It also provides table organization and manipulation, keyword/value handling and management, and support for dynamic loading of recipe modules using programs such as EsoRex (ascl:1504.003).

  20. Validation of pig operations through pipelines

    Energy Technology Data Exchange (ETDEWEB)

    Tolmasquim, Sueli Tiomno [TRANSPETRO - PETROBRAS Transporte S.A., Rio de Janeiro, RJ (Brazil); Nieckele, Angela O. [Pontificia Univ. Catolica do Rio de Janeiro, RJ (Brazil). Dept. de Engenharia Mecanica

    2005-07-01

    In the oil industry, pigging operations in pipelines have been widely applied for different purposes: pipe cleaning, inspection, liquid removal and product separation, among others. An efficient and safe pigging operation requires that a number of operational parameters, such as maximum and minimum pressures in the pipeline and pig velocity, be well evaluated during the planning stage and maintained within stipulated limits while the operation is carried out. With the objective of providing an efficient tool to assist in the control and design of pig operations through pipelines, a numerical code was developed, based on a finite difference scheme, which allows the simulation of two-fluid transient flow (liquid-liquid, gas-gas or liquid-gas) in the pipeline. Modules to automatically control process variables were included to employ different strategies to reach an efficient operation. Different test cases were investigated to corroborate the robustness of the methodology. To validate the methodology, the results obtained with the code were compared with a real liquid displacement operation in a section of the OSPAR oil pipeline, belonging to PETROBRAS, with 30'' diameter and 60 km length, showing good agreement. (author)

  1. Impingement: an annotated bibliography

    International Nuclear Information System (INIS)

    Uziel, M.S.; Hannon, E.H.

    1979-04-01

    This bibliography of 655 annotated references on impingement of aquatic organisms at intake structures of thermal-power-plant cooling systems was compiled from the published and unpublished literature. The bibliography includes references from 1928 to 1978 on impingement monitoring programs; impingement impact assessment; applicable law; location and design of intake structures, screens, louvers, and other barriers; fish behavior and swim speed as related to impingement susceptibility; and the effects of light, sound, bubbles, currents, and temperature on fish behavior. References are arranged alphabetically by author or corporate author. Indexes are provided for author, keywords, subject category, geographic location, taxon, and title

  2. Influence of remanent magnetization on pitting corrosion in pipeline steel

    Energy Technology Data Exchange (ETDEWEB)

    Espina-Hernandez, J. H. [ESIME Zacatenco, SEPI Electronica Instituto Politecnico Nacional Mexico, D. F. (Mexico); Caleyo, F.; Hallen, J. M. [DIM-ESIQIE, Instituto Politecnico Nacional Mexico D. F. (Mexico); Lopez-Montenegro, A.; Perez-Baruch, E. [Pemex Exploracion y Produccion, Region Sur Villahermosa, Tabasco (Mexico)

    2010-07-01

    Statistical studies performed in Mexico indicate that leakage due to external pitting corrosion is the most likely cause of failure of buried pipelines. When pipelines are inspected with the routinely used magnetic flux leakage (MFL) technology, the magnetization level of every part of the pipeline changes as the MFL tool travels through it. Remanent magnetization stays in the pipeline wall after inspection, at levels that may differ from one point to the next. This paper studies the influence of the magnetic field on pitting corrosion. Experiments were carried out on grade 52 steel under a level of remanent magnetization and other laboratory conditions that imitated the conditions of a pipeline after an MFL inspection. Non-magnetized control samples and magnetized samples were subjected to pitting by immersion in a solution containing chloride and sulfide ions for seven days, and then inspected with optical microscopy. Results show that the magnetic field in the pipeline wall significantly increases pitting corrosion.

  3. Predicting word sense annotation agreement

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector; Johannsen, Anders Trærup; Lopez de Lacalle, Oier

    2015-01-01

    High agreement is a common objective when annotating data for word senses. However, a number of factors make perfect agreement impossible, e.g. the limitations of the sense inventories, the difficulty of the examples or the interpretation preferences of the annotators. Estimating potential agreement is thus a relevant task to supplement the evaluation of sense annotations. In this article we propose two methods to predict agreement on word-annotation instances. We experiment with a continuous representation and a three-way discretization of observed agreement. In spite of the difficulty...

  4. Supplementary Material for: BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.

    2015-01-01

    Abstract Background Genome annotation is one way of summarizing the existing knowledge about the genomic characteristics of an organism. There has been increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and the development of new ones introduce the need to: (a) compare different annotations for a single genome, and (b) generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. To illustrate BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated with putative functions by up to 27%, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/
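
    The core of such a comparison, merging gene-function assignments from several AMs and setting disagreements aside for review, can be sketched as follows (a hypothetical illustration of the idea, not BEACON's actual code):

```python
def combine_annotations(per_method):
    """Merge gene-function annotations from several annotation methods (AMs):
    a gene gains a putative function if any method assigns one; conflicting
    assignments are collected separately for review."""
    combined, conflicts = {}, {}
    for method, annotations in per_method.items():
        for gene, function in annotations.items():
            if function is None:
                continue  # this AM assigned no function to the gene
            if gene in combined and combined[gene] != function:
                conflicts.setdefault(gene, set()).update({combined[gene], function})
            else:
                combined.setdefault(gene, function)
    return combined, conflicts

# Two toy AMs: AM2 supplies a function AM1 lacks (g2) and disagrees on g3.
am1 = {"g1": "kinase", "g2": None, "g3": "transporter"}
am2 = {"g1": "kinase", "g2": "hydrolase", "g3": "permease"}
combined, conflicts = combine_annotations({"AM1": am1, "AM2": am2})
```

    In this toy case the extended annotation gains a putative function for g2, while g3 is flagged as a disagreement between the two methods.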

  5. sRNAnalyzer-a flexible and customizable small RNA sequencing data analysis pipeline.

    Science.gov (United States)

    Wu, Xiaogang; Kim, Taek-Kyun; Baxter, David; Scherler, Kelsey; Gordon, Aaron; Fong, Olivia; Etheridge, Alton; Galas, David J; Wang, Kai

    2017-12-01

    Although many tools have been developed to analyze small RNA sequencing (sRNA-Seq) data, it remains challenging to accurately analyze the small RNA population, mainly due to multiple sequence ID assignment caused by short read length. Additional issues in small RNA analysis include low consistency of microRNA (miRNA) measurement results across different platforms, miRNA mapping associated with miRNA sequence variation (isomiR) and RNA editing, and the origin of those unmapped reads after screening against all endogenous reference sequence databases. To address these issues, we built a comprehensive and customizable sRNA-Seq data analysis pipeline-sRNAnalyzer, which enables: (i) comprehensive miRNA profiling strategies to better handle isomiRs and summarization based on each nucleotide position to detect potential SNPs in miRNAs, (ii) different sequence mapping result assignment approaches to simulate results from microarray/qRT-PCR platforms and a local probabilistic model to assign mapping results to the most-likely IDs, (iii) comprehensive ribosomal RNA filtering for accurate mapping of exogenous RNAs and summarization based on taxonomy annotation. We evaluated our pipeline on both artificial samples (including synthetic miRNA and Escherichia coli cultures) and biological samples (human tissue and plasma). sRNAnalyzer is implemented in Perl and available at: http://srnanalyzer.systemsbiology.net/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
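
    The probabilistic ID-assignment step described in (ii) can be illustrated with a much-simplified model, in which each multi-mapped read is assigned to the candidate ID with the strongest unique-read support; the names and logic below are ours, not sRNAnalyzer's implementation:

```python
from collections import Counter

def assign_multimapped(unique_hits, multi_hits):
    """Assign each multi-mapped read to the candidate ID with the highest
    unique-read support (ties broken alphabetically)."""
    support = Counter(unique_hits)
    assigned = {}
    for read, candidates in multi_hits.items():
        assigned[read] = max(sorted(candidates), key=lambda c: support[c])
    return assigned

# Uniquely mapped reads establish per-ID support; multi-mapped reads
# then go to the most-likely ID.
unique_hits = ["miR-21", "miR-21", "miR-21", "let-7a"]
multi_hits = {"read1": ["miR-21", "miR-21-isoform"],
              "read2": ["let-7a", "let-7b"]}
assigned = assign_multimapped(unique_hits, multi_hits)
```

    A real implementation would weight by mapping quality and normalize into probabilities rather than taking a hard maximum.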

  6. Diagnosing in building main pipelines

    Energy Technology Data Exchange (ETDEWEB)

    Telegin, L.G.; Gorelov, A.S.; Kurepin, B.N.; Orekhov, V.I.; Vasil' yev, G.G.; Yakovlev, Ye. I.

    1984-01-01

    General principles are examined for technical diagnosis in building main pipelines. A technique is presented for diagnosis during construction, as well as diagnosis of the technical state of the pipeline-construction machines and mechanisms. The survey materials could be used to set up construction of main pipelines.

  7. APV6 Pipeline Emulations

    CERN Document Server

    Millmore, Martin

    1997-01-01

    The data volume from the CMS inner tracker is large enough that data cannot be read out for every bunch crossing, so data are stored in the front end readout chips until a first level trigger signal is received, after which the interesting data are read out. This will reduce the data rate from 40 MHz to 100 kHz. For the silicon microstrips, the data are read out using the APV6 chip, which holds the data in an analogue pipeline for up to 3.2 µs. Up to 6 events may be stored in the pipeline at any one time, and data are read out asynchronously. In any system where data arrive with a random distribution in time, a finite sized memory can become full, causing data to be lost. Because of the complex nature of the APV6 pipeline logic, a true estimate of the proportion of data which will be lost can only be achieved by running a computer emulation of the pipeline logic, with a Poisson distribution of trigger signals. The emulation has also been modified to study the effect of other possible logic designs.
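
    The flavor of such an emulation, a finite event buffer fed by randomly distributed triggers, can be captured with a simple discrete-time simulation. The parameters below are illustrative (only the 6-event capacity comes from the abstract), and the logic is far simpler than the real APV6 pipeline:

```python
import random

def simulate_buffer(capacity, trigger_prob, readout_prob, n_steps, seed=1):
    """Discrete-time emulation of a finite event buffer: each step a trigger
    arrives with probability trigger_prob, and one stored event is read out
    with probability readout_prob. Triggers arriving on a full buffer are lost."""
    rng = random.Random(seed)
    stored = lost = accepted = 0
    for _ in range(n_steps):
        if rng.random() < trigger_prob:
            if stored < capacity:
                stored += 1
                accepted += 1
            else:
                lost += 1  # buffer full: this event's data are lost
        if stored and rng.random() < readout_prob:
            stored -= 1
    total = accepted + lost
    return lost / total if total else 0.0

# Buffer of 6 events (as in the APV6), with made-up trigger/readout rates.
loss_fraction = simulate_buffer(capacity=6, trigger_prob=0.3,
                                readout_prob=0.4, n_steps=100_000)
```

    Sweeping trigger_prob and readout_prob in such a model shows how the loss fraction depends on buffer depth and readout speed, which is the kind of question the real emulation answers.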

  8. Overview of interstate hydrogen pipeline systems.

    Energy Technology Data Exchange (ETDEWEB)

    Gillette, J .L.; Kolpa, R. L

    2008-02-01

    The following discussion will focus on the similarities and differences between the two pipeline networks. Hydrogen production is currently concentrated in refining centers along the Gulf Coast and in the Farm Belt. These locations have ready access to natural gas, which is used in the steam methane reforming process to make bulk hydrogen in this country. Production centers could possibly shift to lie along coastlines, rivers, lakes, or rail lines, should nuclear power or coal become a significant energy source for hydrogen production processes. Should electrolysis become a dominant process for hydrogen production, water availability would be an additional factor in the location of production facilities. Once produced, hydrogen must be transported to markets. A key obstacle to making hydrogen fuel widely available is the scale of expansion needed to serve additional markets. Developing a hydrogen transmission and distribution infrastructure would be one of the challenges to be faced if the United States is to move toward a hydrogen economy. Initial uses of hydrogen are likely to involve a variety of transmission and distribution methods. Smaller users would probably use truck transport, with the hydrogen in either liquid or gaseous form. Larger users, however, would likely consider using pipelines. This option would require specially constructed pipelines and the associated infrastructure. Pipeline transmission of hydrogen dates back to the late 1930s. These pipelines have generally operated at less than 1,000 pounds per square inch (psi), with a good safety record. Estimates of the existing hydrogen transmission system in the United States range from about 450 to 800 miles. Estimates for Europe range from about 700 to 1,100 miles (Mohipour et al. 2004; Amos 1998). These seemingly large ranges result from using differing criteria in determining pipeline distances.
For example, some analysts consider only pipelines above a certain diameter as transmission lines

  9. Overview of interstate hydrogen pipeline systems

    International Nuclear Information System (INIS)

    Gillette, J.L.; Kolpa, R.L.

    2008-01-01

    The following discussion will focus on the similarities and differences between the two pipeline networks. Hydrogen production is currently concentrated in refining centers along the Gulf Coast and in the Farm Belt. These locations have ready access to natural gas, which is used in the steam methane reforming process to make bulk hydrogen in this country. Production centers could possibly shift to lie along coastlines, rivers, lakes, or rail lines, should nuclear power or coal become a significant energy source for hydrogen production processes. Should electrolysis become a dominant process for hydrogen production, water availability would be an additional factor in the location of production facilities. Once produced, hydrogen must be transported to markets. A key obstacle to making hydrogen fuel widely available is the scale of expansion needed to serve additional markets. Developing a hydrogen transmission and distribution infrastructure would be one of the challenges to be faced if the United States is to move toward a hydrogen economy. Initial uses of hydrogen are likely to involve a variety of transmission and distribution methods. Smaller users would probably use truck transport, with the hydrogen in either liquid or gaseous form. Larger users, however, would likely consider using pipelines. This option would require specially constructed pipelines and the associated infrastructure. Pipeline transmission of hydrogen dates back to the late 1930s. These pipelines have generally operated at less than 1,000 pounds per square inch (psi), with a good safety record. Estimates of the existing hydrogen transmission system in the United States range from about 450 to 800 miles. Estimates for Europe range from about 700 to 1,100 miles (Mohipour et al. 2004; Amos 1998). These seemingly large ranges result from using differing criteria in determining pipeline distances.
For example, some analysts consider only pipelines above a certain diameter as transmission lines

  10. Improving microbial genome annotations in an integrated database context.

    Directory of Open Access Journals (Sweden)

    I-Min A Chen

    Full Text Available Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency concerns the biological coherence of annotations, while completeness concerns the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule-based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems is available at http://img.jgi.doe.gov/.
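
    The essence of such rule-based phenotype prediction, inferring a phenotype when all of the reactions its rule requires are present in the annotation, can be sketched as follows (toy rules and names of our own invention, not IMG's actual rule set):

```python
def predict_phenotypes(genome_annotations, rules):
    """Infer phenotypes: a phenotype is predicted when every reaction
    its rule requires is present in the genome's functional annotation."""
    present = set(genome_annotations)
    return {phenotype for phenotype, required in rules.items()
            if required <= present}  # rule satisfied: required set is a subset

# Hypothetical rules mapping a phenotype to its required enzyme annotations.
rules = {
    "lactose utilization": {"beta-galactosidase", "lactose permease"},
    "denitrification": {"nitrate reductase", "nitrite reductase"},
}
annotations = {"beta-galactosidase", "lactose permease", "nitrate reductase"}
predicted = predict_phenotypes(annotations, rules)
```

    Comparing `predicted` against experimentally observed phenotypes then flags either a wrong observation or, more usefully, a missing or spurious functional annotation.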

  11. Annotation in Digital Scholarly Editions

    NARCIS (Netherlands)

    Boot, P.; Haentjens Dekker, R.

    2016-01-01

    Annotation in digital scholarly editions (of historical documents, literary works, letters, etc.) has long been recognized as an important desideratum, but has also proven to be an elusive ideal. In so far as annotation functionality is available, it is usually developed for a single edition and

  12. Mesotext. Framing and exploring annotations

    NARCIS (Netherlands)

    Boot, P.; Boot, P.; Stronks, E.

    2007-01-01

    From the introduction: Annotation is an important item on the wish list for digital scholarly tools. It is one of John Unsworth's primitives of scholarship (Unsworth 2000). Especially in linguistics, a number of tools have been developed that facilitate the creation of annotations to source material

  13. Diagnostics and reliability of pipeline systems

    CERN Document Server

    Timashev, Sviatoslav

    2016-01-01

    The book contains solutions to fundamental problems which arise from the logic of development of specific branches of science related to pipeline safety, but which are mainly subordinate to the needs of pipeline transportation. It addresses important but as yet unsolved aspects of reliability and safety assurance of pipeline systems, vital not only for the oil and gas industry and the fuel and energy industries in general, but also for virtually all contemporary industries and technologies. The volume will be useful to specialists and experts in the field of diagnostics/inspection, monitoring, reliability and safety of critical infrastructures. First and foremost, it will be useful to decision makers: operators of different types of pipelines, pipeline diagnostics/inspection vendors, designers of in-line inspection (ILI) tools, and industrial and ecological safety specialists, as well as to researchers and graduate students.

  14. Statistical mechanics of ontology based annotations

    Science.gov (United States)

    Hoyle, David C.; Brass, Andrew

    2016-01-01

    We present a statistical mechanical theory of the process of annotating an object with terms selected from an ontology. The term selection process is formulated as an ideal lattice gas model, but in a highly structured inhomogeneous field. The model enables us to explain patterns recently observed in real-world annotation data sets, in terms of the underlying graph structure of the ontology. By relating the external field strengths to the information content of each node in the ontology graph, the statistical mechanical model also allows us to propose a number of practical metrics for assessing the quality of both the ontology, and the annotations that arise from its use. Using the statistical mechanical formalism we also study an ensemble of ontologies of differing size and complexity; an analysis not readily performed using real data alone. Focusing on regular tree ontology graphs we uncover a rich set of scaling laws describing the growth in the optimal ontology size as the number of objects being annotated increases. In doing so we provide a further possible measure for assessment of ontologies.
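
    For orientation, the mean occupation of a node in an ideal lattice gas in an inhomogeneous external field takes the standard grand-canonical form below; this is our notation for the generic model, and the paper's specific formulation may differ:

```latex
% Mean occupation of ontology node i, with inverse temperature \beta,
% chemical potential \mu, and node-specific field strength h_i
% (related in the paper to the information content of the node):
\langle n_i \rangle \;=\; \frac{1}{1 + e^{-\beta(\mu + h_i)}}
```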

  15. A modified GC-specific MAKER gene annotation method reveals improved and novel gene predictions of high and low GC content in Oryza sativa.

    Science.gov (United States)

    Bowman, Megan J; Pulman, Jane A; Liu, Tiffany L; Childs, Kevin L

    2017-11-25

    Accurate structural annotation depends on well-trained gene prediction programs. Training data for gene prediction programs are often chosen randomly from a subset of high-quality genes that ideally represent the variation found within a genome. One aspect of gene variation is GC content, which differs across species and is bimodal in grass genomes. When gene prediction programs are trained on a subset of grass genes with random GC content, they are effectively being trained on two classes of genes at once, and this can be expected to result in poor results when genes are predicted in new genome sequences. We find that gene prediction programs trained on grass genes with random GC content do not completely predict all grass genes with extreme GC content. We show that gene prediction programs that are trained with grass genes with high or low GC content can make both better and unique gene predictions compared to gene prediction programs that are trained on genes with random GC content. By separately training gene prediction programs with genes from multiple GC ranges and using the programs within the MAKER genome annotation pipeline, we were able to improve the annotation of the Oryza sativa genome compared to using the standard MAKER annotation protocol. Gene structure was improved in over 13% of genes, and 651 novel genes were predicted by the GC-specific MAKER protocol. We present a new GC-specific MAKER annotation protocol to predict new and improved gene models and assess the biological significance of this method in Oryza sativa. We expect that this protocol will also be beneficial for gene prediction in any organism with bimodal or other unusual gene GC content.
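
    The first step of such a protocol, partitioning the training genes by GC content so that separate predictors can be trained on each class, can be sketched as follows; the cutoffs and names here are illustrative, not the published thresholds:

```python
def gc_content(seq):
    """Fraction of G and C bases in a coding sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def split_training_set(genes, low_cut=0.55, high_cut=0.65):
    """Partition training genes into low-, mid- and high-GC sets so that
    a separate gene predictor can be trained on each class."""
    bins = {"low": [], "mid": [], "high": []}
    for gene_id, cds in genes.items():
        gc = gc_content(cds)
        key = "low" if gc < low_cut else "high" if gc >= high_cut else "mid"
        bins[key].append(gene_id)
    return bins

# Two toy coding sequences at the extremes of the bimodal GC distribution.
genes = {"LOC_A": "ATGAATTTTAAAGATTGA",   # AT-rich
         "LOC_B": "ATGGCCGGCGGCGCGTGA"}   # GC-rich
bins = split_training_set(genes)
```

    Each bin would then be used to train its own predictor (e.g. within MAKER), and the per-class predictions merged afterwards.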

  16. Current and future trends in marine image annotation software

    Science.gov (United States)

    Gomes-Pereira, Jose Nuno; Auger, Vincent; Beisiegel, Kolja; Benjamin, Robert; Bergmann, Melanie; Bowden, David; Buhl-Mortensen, Pal; De Leo, Fabio C.; Dionísio, Gisela; Durden, Jennifer M.; Edwards, Luke; Friedman, Ariell; Greinert, Jens; Jacobsen-Stout, Nancy; Lerner, Steve; Leslie, Murray; Nattkemper, Tim W.; Sameoto, Jessica A.; Schoening, Timm; Schouten, Ronald; Seager, James; Singh, Hanumant; Soubigou, Olivier; Tojeira, Inês; van den Beld, Inge; Dias, Frederico; Tempera, Fernando; Santos, Ricardo S.

    2016-12-01

    Given the need to describe, analyze and index large quantities of marine imagery data for exploration and monitoring activities, a range of specialized image annotation tools have been developed worldwide. Image annotation - the process of transposing objects or events represented in a video or still image to the semantic level - may involve human interactions and computer-assisted solutions. Marine image annotation software (MIAS) has enabled over 500 publications to date. We review the functioning, application trends and developments by comparing general and advanced features of 23 different tools utilized in underwater image analysis. MIAS requiring human input are basically a graphical user interface with a video player or image browser that recognizes a specific time code or image code, allowing events to be logged in a time-stamped (and/or geo-referenced) manner. MIAS differ from similar software in their capability to integrate data associated with video collection, the simplest being the position coordinates of the video recording platform. MIAS have three main modes of operation: annotating events in real time, annotating after acquisition, and interacting with a database. These range from simple annotation interfaces to full onboard data management systems with a variety of toolboxes. Advanced packages allow input and display of data from multiple sensors or multiple annotators via intranet or internet. Posterior human-mediated annotation often includes tools for data display and image analysis, e.g. length, area, image segmentation, point count, and in a few cases the possibility of browsing and editing previous dive logs or analyzing the annotations. The interaction with a database allows the automatic integration of annotations from different surveys, repeated annotation and collaborative annotation of shared datasets, and browsing and querying of data. Progress in the field of automated annotation is mostly in post processing, for stable platforms or still images

  17. Instrumented Pipeline Initiative

    Energy Technology Data Exchange (ETDEWEB)

    Thomas Piro; Michael Ream

    2010-07-31

    This report summarizes technical progress achieved during the cooperative agreement between Concurrent Technologies Corporation (CTC) and the U.S. Department of Energy to address the need for a low-cost monitoring and inspection sensor system, as identified in the Department of Energy (DOE) Natural Gas Infrastructure Research & Development (R&D) Delivery Reliability Program Roadmap. The Instrumented Pipeline Initiative (IPI) achieved this objective by researching technologies for monitoring pipeline delivery integrity through a ubiquitous network of sensors and controllers to detect and diagnose incipient defects, leaks, and failures. This report is organized by tasks, as detailed in the Statement of Project Objectives (SOPO). Each section states the objective and approach before detailing the results of the work.

  18. The Dark Energy Survey Image Processing Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Morganson, E.; et al.

    2018-01-09

    The Dark Energy Survey (DES) is a five-year optical imaging campaign with the goal of understanding the origin of cosmic acceleration. DES performs a 5000 square degree survey of the southern sky in five optical bands (g,r,i,z,Y) to a depth of ~24th magnitude. Contemporaneously, DES performs a deep, time-domain survey in four optical bands (g,r,i,z) over 27 square degrees. DES exposures are processed nightly with an evolving data reduction pipeline and evaluated for image quality to determine if they need to be retaken. Difference imaging and transient source detection are also performed in the time domain component nightly. On a bi-annual basis, DES exposures are reprocessed with a refined pipeline and coadded to maximize imaging depth. Here we describe the DES image processing pipeline in support of DES science, as a reference for users of archival DES data, and as a guide for future astronomical surveys.
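
    The difference-imaging step mentioned above reduces, in essence, to subtracting a template (reference) image from a science image and thresholding the residuals. The bare-bones sketch below uses made-up pixel values; a real pipeline such as the DES one first astrometrically aligns and PSF-matches the images:

```python
def difference_image(science, template, threshold):
    """Subtract a template image from a science image (given as lists of
    pixel rows) and flag pixels whose residual exceeds the threshold
    as transient candidates, returned as (x, y) positions."""
    detections = []
    for y, (srow, trow) in enumerate(zip(science, template)):
        for x, (s, t) in enumerate(zip(srow, trow)):
            if s - t > threshold:
                detections.append((x, y))
    return detections

# A 3x3 toy image pair: a bright new source appears at pixel (1, 1).
template = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
science  = [[11, 10, 10], [10, 60, 10], [10, 10, 12]]
transients = difference_image(science, template, threshold=20)
```

    The threshold plays the role of a significance cut; in practice it would be set per pixel from the noise model rather than globally.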

  19. Pipeline network and environment

    International Nuclear Information System (INIS)

    Oliveira Nascimento, I.; Wagner, J.; Silveira, T.

    2012-01-01

    Rio de Janeiro is one of the 27 federative units of Brazil. It is located in the eastern portion of the Southeast region and occupies an area of 43,696.054 km², making it the third-smallest state in Brazil. In recent years this state has suffered from erosion problems caused by the deployment of the pipeline network. Pipeline deployment, part of the activities related to the oil industry, has caused an increasingly intense conflict between the environment and economic activities, modifying the soil structure and the distribution of surface and subsurface flows. This study aimed to analyze the erosion caused by the removal of soil for the deployment of transportation pipelines, which has resulted in the emergence of numerous gullies, landslides and silting of rivers. For the development of this study, bibliographic research, field work, mapping, and digital preparation of the initial diagnosis of active processes and the consequent environmental impacts were carried out. For these reasons, we conclude that the problems could be avoided or mitigated if prior geological risk management were in place. (author)

  20. United States petroleum pipelines: An empirical analysis of pipeline sizing

    Science.gov (United States)

    Coburn, L. L.

    1980-12-01

    The undersizing theory hypothesizes that integrated oil companies have a strong economic incentive to size the petroleum pipelines they own and ship over in such a way that some of the demand must utilize higher-cost alternatives. The DOJ theory posits that excess or monopoly profits are earned due to the natural monopoly characteristics of petroleum pipelines and the existence of market power in some pipelines at either the upstream or downstream market. The theory holds that independent petroleum pipelines, owned by companies not otherwise affiliated with the petroleum industry, do not have these incentives, and all the efficiencies of pipeline transportation are passed to the ultimate consumer. Integrated oil companies, on the other hand, keep these cost efficiencies for themselves in the form of excess profits.

  1. Chado controller: advanced annotation management with a community annotation system.

    Science.gov (United States)

    Guignon, Valentin; Droc, Gaëtan; Alaux, Michael; Baurens, Franc-Christophe; Garsmeur, Olivier; Poiron, Claire; Carver, Tim; Rouard, Mathieu; Bocs, Stéphanie

    2012-04-01

    We developed a controller that is compliant with the Chado database schema, GBrowse and genome annotation-editing tools such as Artemis and Apollo. It enables the management of public and private data, monitors manual annotation (with controlled vocabularies, structural and functional annotation controls) and stores versions of annotation for all modified features. The Chado controller uses PostgreSQL and Perl. The Chado Controller package is available for download at http://www.gnpannot.org/content/chado-controller and runs on any Unix-like operating system; documentation is available at http://www.gnpannot.org/content/chado-controller-doc. The system can be tested using the GNPAnnot Sandbox at http://www.gnpannot.org/content/gnpannot-sandbox-form. Contact: valentin.guignon@cirad.fr; stephanie.sidibe-bocs@cirad.fr. Supplementary data are available at Bioinformatics online.

  2. Energy geopolitics and Iran-Pakistan-India gas pipeline

    International Nuclear Information System (INIS)

    Verma, Shiv Kumar

    2007-01-01

    With the growing energy demands in India and its neighboring countries, the Iran-Pakistan-India (IPI) gas pipeline assumes special significance. Energy-deficient countries such as India, China, and Pakistan are vying to acquire gas fields in different parts of the world. This has led to two conspicuous developments: first, they are competing against each other, and secondly, a situation is emerging in which they might have to confront the US and the western countries in the near future in their attempt to control energy bases. The proposed IPI pipeline is an attempt to acquire such a base. However, Pakistan is playing its own game to maximize its leverage. Pakistan, which refuses to establish even normal trading ties with India, hopes to earn hundreds of millions of dollars in transit fees and other annual royalties from a gas pipeline which runs from Iran's South Pars fields to Barmer in western India. Pakistan promises to subsidize its gas imports from Iran and thus also become a major forex earner. It is willing to give pipeline-related 'international guarantees' notwithstanding its record of covert actions in breach of international law (such as the export of terrorism) and its reluctance to reciprocally provide India what World Trade Organization (WTO) rules obligate it to: Most Favored Nation (MFN) status. India is looking at the possibility of using a set of norms for securing gas supply through pipelines, as the European Union has already initiated a discussion on the issue. The key point relevant to India's plan to build a pipeline to source gas from Iran concerns national treatment for pipelines. Under the principle of national treatment, which also figures in relation to foreign direct investment (FDI), the country through which a pipeline transits should provide the same level of security to the transiting pipeline as it would provide to its domestic pipelines.
This paper will endeavor to analyze, first, the significance of this pipeline for India

  3. Identifying and annotating human bifunctional RNAs reveals their versatile functions.

    Science.gov (United States)

    Chen, Geng; Yang, Juan; Chen, Jiwei; Song, Yunjie; Cao, Ruifang; Shi, Tieliu; Shi, Leming

    2016-10-01

    Bifunctional RNAs that possess both protein-coding and noncoding functional properties have been little explored and are poorly understood. Here we systematically explored the characteristics and functions of such human bifunctional RNAs by integrating tandem mass spectrometry and RNA-seq data. We first constructed a pipeline to identify and annotate bifunctional RNAs, leading to the characterization of 132 high-confidence bifunctional RNAs. Our analyses indicate that bifunctional RNAs may be involved in human embryonic development and can be functional in diverse tissues. Moreover, bifunctional RNAs could interact with multiple miRNAs and RNA-binding proteins to exert their corresponding roles. Bifunctional RNAs may also function as competing endogenous RNAs, regulating the expression of many genes by competing for common targeting miRNAs. Finally, somatic mutations of diverse carcinomas may have harmful effects on the corresponding bifunctional RNAs. Collectively, our study not only provides a pipeline for identifying and annotating bifunctional RNAs but also reveals their important gene-regulatory functions.

  4. System reliability of corroding pipelines

    International Nuclear Information System (INIS)

    Zhou Wenxing

    2010-01-01

    A methodology is presented in this paper to evaluate the time-dependent system reliability of a pipeline segment that contains multiple active corrosion defects and is subjected to stochastic internal pressure loading. The pipeline segment is modeled as a series system with three distinctive failure modes due to corrosion, namely small leak, large leak and rupture. The internal pressure is characterized as a simple discrete stochastic process that consists of a sequence of independent and identically distributed random variables each acting over a period of one year. The magnitude of a given sequence follows the annual maximum pressure distribution. The methodology is illustrated through a hypothetical example. Furthermore, the impact of the spatial variability of the pressure loading and pipe resistances associated with different defects on the system reliability is investigated. The analysis results suggest that the spatial variability of pipe properties has a negligible impact on the system reliability. On the other hand, the spatial variability of the internal pressure, initial defect sizes and defect growth rates can have a significant impact on the system reliability.
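
    The series-system formulation above lends itself to a simple Monte Carlo check. The sketch below is a toy version under assumed distributions, not the paper's limit-state functions: one defect, an annual-maximum pressure drawn each year, and failure declared when either the defect depth exceeds 80% of the wall thickness or an illustrative burst criterion is violated.

```python
import random

def simulate_failure_prob(n_years=10, n_trials=20000, seed=1):
    """Toy Monte Carlo estimate of the cumulative failure probability of a
    pipeline segment with one active corrosion defect, treated as a series
    system (small leak / large leak / rupture). All distributions and limit
    states here are illustrative assumptions, not the paper's models."""
    random.seed(seed)
    failures = 0
    for _ in range(n_trials):
        depth = random.gauss(0.40, 0.05)          # initial depth / wall thickness
        growth = abs(random.gauss(0.03, 0.008))   # annual growth (fraction of wt)
        for year in range(1, n_years + 1):
            d = depth + growth * year
            p = random.gauss(1.0, 0.1)            # annual max / design pressure
            if d >= 0.8 or p > 2.0 - d:           # leak or burst criterion
                failures += 1
                break
    return failures / n_trials

pf = simulate_failure_prob()
```

    Spatial variability, as studied in the paper, would enter by correlating (or decorrelating) the defect and pressure draws across segments rather than sampling them independently.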

  5. Planned and proposed pipeline regulations

    International Nuclear Information System (INIS)

    De Leon, C.

    1992-01-01

    The Research and Special Programs Administration administers the Natural Gas Pipeline Safety Act of 1968 (NGPSA) and the Hazardous Liquid Pipeline Safety Act of 1979 (HLPSA). The RSPA issues and enforces design, construction, operation and maintenance regulations for natural gas pipelines and hazardous liquid pipelines. This paper discusses a number of proposed and pending safety regulations and legislative initiatives currently being considered by the RSPA and the US Congress. Some new regulations have already been enacted. The next few years will see a great deal of regulatory activity regarding natural gas and hazardous liquid pipelines, much of it resulting from legislative requirements. The Office of Pipeline Safety is currently conducting a study to streamline its operations, analyzing the office's business, social and technical operations with the goal of improving overall efficiency, effectiveness, productivity and job satisfaction to meet the challenges of the future.

  6. Pipeline integrity : control by coatings

    Energy Technology Data Exchange (ETDEWEB)

    Khanna, A.S. [Indian Inst. of Technology, Bombay (India)

    2008-07-01

    This presentation provided background information on the history of cross-country pipelines in India and discussed the major uses of gas. The key users were the power and fertilizer industries, followed by vehicles using compressed natural gas to replace liquid fuels and thereby reduce pollution. The presentation also addressed the integrity of pipelines in terms of high production, safety, and monitoring. Integrity issues were discussed with reference to basic design, control of corrosion, and periodic health monitoring. Other topics included integrity by corrosion control; integrity by health monitoring; coatings requirements; classification of UCC pipeline coatings; and how the pipeline integrity approach can help to achieve coatings that give design life without any failure. Surface cleanliness, coating conditions, and the relationship between the temperature of the epoxy coating and the adhesive coating time were also discussed. Last, the presentation provided the results of an audit of the HBJ pipeline conducted from 1999 to 2000. tabs., figs.

  7. 49 CFR 195.210 - Pipeline location.

    Science.gov (United States)

    2010-10-01

    ... 49 Transportation 3 2010-10-01 2010-10-01 false Pipeline location. 195.210 Section 195.210 Transportation Other Regulations Relating to Transportation (Continued) PIPELINE AND HAZARDOUS MATERIALS SAFETY... PIPELINE Construction § 195.210 Pipeline location. (a) Pipeline right-of-way must be selected to avoid, as...

  8. Evidence-based gene models for structural and functional annotations of the oil palm genome.

    Science.gov (United States)

    Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie

    2017-09-08

    Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years), has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools. Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon), with over half of the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures. We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA
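
    The GC3 statistic used in this analysis is straightforward to compute from a coding sequence; a minimal sketch (the 0.75286 cutoff above is the paper's threshold for "GC3-rich"):

```python
def gc3(cds):
    """Fraction of G or C at the third position of each codon.
    `cds` is a coding sequence whose length is a multiple of 3."""
    third = cds.upper()[2::3]
    return sum(b in "GC" for b in third) / len(third)

rich = gc3("ATGGCCAAGTAC")   # every codon ends in G or C
poor = gc3("ATGAAATTTTAA")   # only the start codon ends in G
```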

  9. Slurry pipeline technology: an overview

    Energy Technology Data Exchange (ETDEWEB)

    Chapman, Jay P. [Pipeline Systems Incorporated (PSI), Belo Horizonte, MG (Brazil); Lima, Rafael; Pinto, Daniel; Vidal, Alisson [Ausenco do Brasil Engenharia Ltda., Nova Lima, MG (Brazil). PSI Div.

    2009-12-19

    Slurry pipelines represent an economical and environmentally friendly transportation means for many solid materials. This paper provides an overview of the technology, its evolution and current Brazilian activity. Mineral resources are increasingly moving farther away from ports, processing plants and end use points, and slurry pipelines are an important mode of solids transport. Application guidelines are discussed. State-of-the-art technical solutions such as pipeline system simulation, pipe materials, pumps, valves, automation, telecommunications, and construction techniques that have made the technology successful are presented. A discussion of where long-distance slurry pipelines fit in a picture that also includes thickened and paste material pipelining is included. (author)

  10. Automated annotation of mobile antibiotic resistance in Gram-negative bacteria: the Multiple Antibiotic Resistance Annotator (MARA) and database.

    Science.gov (United States)

    Partridge, Sally R; Tsafnat, Guy

    2018-04-01

    Multiresistance in Gram-negative bacteria is often due to the acquisition of several different antibiotic resistance genes, each associated with a different mobile genetic element, that tend to cluster together in complex conglomerations. Accurate, consistent annotation of resistance genes, the boundaries and fragments of mobile elements, and signatures of insertion, such as direct repeats (DR), facilitates comparative analysis of complex multiresistance regions and plasmids to better understand their evolution and how resistance genes spread. To extend the Repository of Antibiotic resistance Cassettes (RAC) web site, which includes a database of 'features', and the Attacca automatic DNA annotation system, to encompass additional resistance genes and all types of associated mobile elements. Antibiotic resistance genes and mobile elements were added to RAC, from existing registries where possible. Attacca grammars were extended to accommodate the expanded database, to allow overlapping features to be annotated, and to identify and annotate features such as composite transposons and DR. The Multiple Antibiotic Resistance Annotator (MARA) database includes antibiotic resistance genes and selected mobile elements from Gram-negative bacteria, distinguishing important variants. Sequences can be submitted to the MARA web site for annotation. A list of the positions and orientations of annotated features, indicating those that are truncated, DR, and potential composite transposons, is provided for each sequence, together with a diagram showing annotated features approximately to scale. The MARA web site (http://mara.spokade.com) provides a comprehensive database for mobile antibiotic resistance in Gram-negative bacteria and accurately annotates resistance genes and associated mobile elements in submitted sequences to facilitate comparative analysis.

  11. INTERNAL REPAIR OF PIPELINES

    Energy Technology Data Exchange (ETDEWEB)

    Bill Bruce; Nancy Porter; George Ritter; Matt Boring; Mark Lozev; Ian Harris; Bill Mohr; Dennis Harwig; Robin Gordon; Chris Neary; Mike Sullivan

    2005-07-20

    The two broad categories of fiber-reinforced composite liner repair and deposited weld metal repair technologies were reviewed and evaluated for potential application for internal repair of gas transmission pipelines. Both are used to some extent for other applications and could be further developed for internal, local, structural repair of gas transmission pipelines. Principal conclusions from a survey of natural gas transmission industry pipeline operators can be summarized in terms of the following performance requirements for internal repair: (1) Use of internal repair is most attractive for river crossings, under other bodies of water, in difficult soil conditions, under highways, under congested intersections, and under railway crossings. (2) Internal pipe repair offers a strong potential advantage to the high cost of horizontal direct drilling when a new bore must be created to solve a leak or other problem. (3) Typical travel distances can be divided into three distinct groups: up to 305 m (1,000 ft.); between 305 m and 610 m (1,000 ft. and 2,000 ft.); and beyond 914 m (3,000 ft.). All three groups require pig-based systems. A despooled umbilical system would suffice for the first two groups which represents 81% of survey respondents. The third group would require an onboard self-contained power unit for propulsion and welding/liner repair energy needs. (4) The most common size range for 80% to 90% of operators surveyed is 508 mm (20 in.) to 762 mm (30 in.), with 95% using 558.8 mm (22 in.) pipe. Evaluation trials were conducted on pipe sections with simulated corrosion damage repaired with glass fiber-reinforced composite liners, carbon fiber-reinforced composite liners, and weld deposition. Additional un-repaired pipe sections were evaluated in the virgin condition and with simulated damage. 
Hydrostatic failure pressures for pipe sections repaired with glass fiber-reinforced composite liner were only marginally greater than that of pipe sections without

  12. MIRACLE’s Naive Approach to Medical Images Annotation

    OpenAIRE

    Villena Román, Julio; González Cristóbal, José Carlos; Goñi Menoyo, José Miguel; Martínez Fernández, José Luis

    2005-01-01

    One of the proposed tasks of the ImageCLEF 2005 campaign has been an Automatic Annotation Task. The objective is to provide the classification of a given set of 1,000 previously unseen medical (radiological) images according to 57 predefined categories covering different medical pathologies. 9,000 classified training images are given which can be used in any way to train a classifier. The Automatic Annotation task uses no textual information, but image-content information only. This paper des...

  13. JAABA: interactive machine learning for automatic annotation of animal behavior

    OpenAIRE

    Kabra, Mayank; Robie, Alice A; Rivera-Alba, Marta; Branson, Steven; Branson, Kristin

    2013-01-01

    We present a machine learning-based system for automatically computing interpretable, quantitative measures of animal behavior. Through our interactive system, users encode their intuition about behavior by annotating a small set of video frames. These manual labels are converted into classifiers that can automatically annotate behaviors in screen-scale data sets. Our general-purpose system can create a variety of accurate individual and social behavior classifiers for different organisms, in...

  14. Improved methods and resources for paramecium genomics: transcription units, gene annotation and gene expression.

    Science.gov (United States)

    Arnaiz, Olivier; Van Dijk, Erwin; Bétermier, Mireille; Lhuillier-Akakpo, Maoussi; de Vanssay, Augustin; Duharcourt, Sandra; Sallet, Erika; Gouzy, Jérôme; Sperling, Linda

    2017-06-26

    The 15 sibling species of the Paramecium aurelia cryptic species complex emerged after a whole genome duplication that occurred tens of millions of years ago. Given the extensive knowledge of the genetics and epigenetics of Paramecium acquired over the last century, this species complex offers a uniquely powerful system for investigating the consequences of whole genome duplication in a unicellular eukaryote, as well as the genetic and epigenetic mechanisms that drive speciation. High-quality Paramecium gene models are important for research using this system. The major aim of the work reported here was to build an improved gene annotation pipeline for the Paramecium lineage. We generated oriented RNA-Seq transcriptome data across the sexual process of autogamy for the model species Paramecium tetraurelia. We determined, for the first time in a ciliate, candidate P. tetraurelia transcription start sites using an adapted Cap-Seq protocol. We developed TrUC, multi-threaded Perl software that, in conjunction with TopHat mapping of RNA-Seq data to a reference genome, predicts transcription units for the annotation pipeline. We used EuGene software to combine annotation evidence. The high-quality gene structural annotations obtained for P. tetraurelia were used as evidence to improve published annotations for three other Paramecium species. The RNA-Seq data were also used for differential gene expression analysis, providing a gene expression atlas that is more sensitive than the previously established microarray resource. We have developed a gene annotation pipeline tailored for the compact genomes and tiny introns of Paramecium species. A novel component of this pipeline, TrUC, predicts transcription units using Cap-Seq and oriented RNA-Seq data. TrUC could prove useful beyond Paramecium, especially in the case of high gene density. Accurate predictions of 3' and 5' UTR will be particularly valuable for studies of gene expression (e.g. 
nucleosome positioning, identification of cis

  15. Optimal Energy Consumption Analysis of Natural Gas Pipeline

    OpenAIRE

    Liu, Enbin; Li, Changjun; Yang, Yi

    2014-01-01

    There are many compressor stations along long-distance natural gas pipelines. Natural gas can be transported using different boot programs and import pressures, combined with temperature control parameters. Moreover, different transport methods have correspondingly different energy consumptions. At present, the operating parameters of many pipelines are determined empirically by dispatchers, resulting in high energy consumption. This practice does not abide by energy reduction policies. There...

  16. Annotating functional RNAs in genomes using Infernal.

    Science.gov (United States)

    Nawrocki, Eric P

    2014-01-01

    Many different types of functional non-coding RNAs participate in a wide range of important cellular functions, but the large majority of these RNAs are not routinely annotated in published genomes. Several programs have been developed for identifying RNAs, including specific tools tailored to a particular RNA family as well as more general ones designed to work for any family. Many of these tools utilize covariance models (CMs), statistical models of the conserved sequence and structure of an RNA family. In this chapter, as an illustrative example, the Infernal software package and CMs from the Rfam database are used to identify RNAs in the genome of the archaeon Methanobrevibacter ruminantium, uncovering some additional RNAs not present in the genome's initial annotation. Analysis of the results and comparison with family-specific methods demonstrate some important strengths and weaknesses of this general approach.
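
    Infernal's search programs can emit hits in a whitespace-delimited tabular format via the --tblout option, which is convenient for downstream annotation scripts. A minimal parser sketch is below; the column layout follows the Infernal user guide for cmsearch/cmscan tables, but the sample hit line is fabricated for illustration.

```python
def parse_tblout(lines):
    """Parse hit lines from Infernal's --tblout output into dicts.
    Lines starting with '#' are comments; fields are whitespace-separated,
    and only the final description field may itself contain spaces."""
    hits = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        f = line.split(None, 17)                  # at most 18 columns
        hits.append({
            "target": f[0], "query": f[2],
            "seq_from": int(f[7]), "seq_to": int(f[8]),
            "strand": f[9], "bit_score": float(f[14]),
            "evalue": float(f[15]),
        })
    return hits

sample = [
    "# fabricated example of a --tblout hit line",
    "tRNA - Mrum_genome - cm 1 71 1024 1094 + no 1 0.55 0.0 65.2 1.2e-12 ! tRNA",
]
hits = parse_tblout(sample)
```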

  17. Objective-guided image annotation.

    Science.gov (United States)

    Mao, Qi; Tsang, Ivor Wai-Hung; Gao, Shenghua

    2013-04-01

    Automatic image annotation, which is usually formulated as a multi-label classification problem, is one of the major tools used to enhance the semantic understanding of web images. Many multimedia applications (e.g., tag-based image retrieval) can greatly benefit from image annotation. However, the insufficient performance of image annotation methods prevents these applications from being practical. On the other hand, specific measures are usually designed to evaluate how well one annotation method performs for a specific objective or application, but most image annotation methods do not consider optimization of these measures, so they are inevitably trapped in suboptimal performance on these objective-specific measures. To address this issue, we first summarize a variety of objective-guided performance measures under a unified representation. Our analysis reveals that macro-averaging measures are very sensitive to infrequent keywords, and the Hamming measure is easily affected by skewed distributions. We then propose a unified multi-label learning framework that directly optimizes a variety of objective-specific measures of multi-label learning tasks. Specifically, we first present a multilayer hierarchical structure of learning hypotheses for multi-label problems, based on which a variety of loss functions with respect to objective-guided measures are defined. We then formulate these loss functions as relaxed surrogate functions and optimize them by structural SVMs. Given the analysis of various measures and the high time complexity of optimizing micro-averaging measures, in this paper we focus on example-based measures that are tailor-made for image annotation tasks but are seldom explored in the literature.
Experiments show consistency with the formal analysis on two widely used multi-label datasets, and demonstrate the superior performance of our proposed method over state-of-the-art baseline methods in terms of example-based measures on four
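
    As a concrete illustration of the example-based measures discussed above, precision, recall, and F1 can be computed per image over its predicted and true label sets and then averaged across images. This is a generic sketch, not the paper's exact definitions:

```python
def example_based_scores(y_true, y_pred):
    """Example-based precision/recall/F1 for multi-label annotation:
    compute each score per example over its label sets, then average
    across examples (macro-averaging would instead average per label)."""
    p_sum = r_sum = f_sum = 0.0
    for t, p in zip(y_true, y_pred):
        inter = len(t & p)
        prec = inter / len(p) if p else 1.0
        rec = inter / len(t) if t else 1.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        p_sum, r_sum, f_sum = p_sum + prec, r_sum + rec, f_sum + f1
    n = len(y_true)
    return p_sum / n, r_sum / n, f_sum / n

truth = [{"sky", "sea"}, {"cat"}]
pred = [{"sky"}, {"cat", "dog"}]
prec, rec, f1 = example_based_scores(truth, pred)
```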

  18. Fluid pipeline system leak detection based on neural network and pattern recognition

    International Nuclear Information System (INIS)

    Tang Xiujia

    1998-01-01

    The mechanism of stress-wave propagation along the pipeline system of an NPP, caused by turbulent ejection from a pipeline leak, is researched. A series of characteristic indices is described in the time or frequency domain, and a compression algorithm is developed for original-data reduction. A back-propagation neural network (BPNN), with an input matrix composed of stress-wave characteristics in the time or frequency domain, is first proposed to classify various situations of the pipeline, in order to detect leakage in fluid-flow pipelines. The capability of the new method was demonstrated by experiments and finally used to design a handy instrument for pipeline leakage detection. A pipeline system usually has many inner branches and often operates in changing dynamic conditions, making it difficult for traditional pipeline diagnosis facilities to distinguish normal inner-pipeline operation from a pipeline fault. The author first proposed identifying pipeline wave propagation by pattern recognition to diagnose pipeline leaks. A series of pattern primitives such as peaks, valleys, horizon lines, capstan peaks, dominant relations, and slave relations is used to extract features of the negative pressure waveform. A context-free grammar gives a symbolic representation of the negative waveform, and a parsing system for structural pattern recognition based on this representation is first proposed to detect and localize leaks in fluid pipelines.
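
    The peak/valley primitives mentioned above can be illustrated with a minimal extraction pass over a sampled waveform. This toy sketch labels each interior sample; the paper's primitive set is richer and feeds a grammar-based parser:

```python
def wave_primitives(samples, eps=0.0):
    """Label each interior sample of a (negative pressure) waveform as a
    peak, valley, or horizon-line point -- a toy version of the pattern
    primitives used in syntactic leak detection."""
    prims = []
    for i in range(1, len(samples) - 1):
        left, x, right = samples[i - 1], samples[i], samples[i + 1]
        if x > left + eps and x > right + eps:
            prims.append("peak")
        elif x < left - eps and x < right - eps:
            prims.append("valley")
        else:
            prims.append("horizon")
    return prims

prims = wave_primitives([0.0, 1.0, 0.2, -0.5, -0.1, -0.1])
```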

  19. Plann: A command-line application for annotating plastome sequences

    Science.gov (United States)

    Huang, Daisie I.; Cronk, Quentin C. B.

    2015-01-01

    Premise of the study: Plann automates the process of annotating a plastome sequence in GenBank format for either downstream processing or for GenBank submission by annotating a new plastome based on a similar, well-annotated plastome. Methods and Results: Plann is a Perl script to be executed on the command line. Plann compares a new plastome sequence to the features annotated in a reference plastome and then shifts the intervals of any matching features to the locations in the new plastome. Plann’s output can be used in the National Center for Biotechnology Information’s tbl2asn to create a Sequin file for GenBank submission. Conclusions: Unlike Web-based annotation packages, Plann is a locally executable script that will accurately annotate a plastome sequence to a locally specified reference plastome. Because it executes from the command line, it is ready to use in other software pipelines and can be easily rerun as a draft plastome is improved. PMID:26312193
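
    Plann's core operation, shifting the intervals of matched reference features to their locations in the new plastome, can be caricatured in a few lines. This sketch applies a single fixed alignment offset; the real script locates each feature by sequence similarity to the reference:

```python
def shift_features(features, offset):
    """Shift annotated feature intervals from a reference plastome onto a
    new plastome by a fixed alignment offset -- a toy stand-in for Plann's
    interval transfer. Feature names and coordinates are invented."""
    return [(name, start + offset, end + offset) for name, start, end in features]

ref = [("rbcL", 100, 1540), ("matK", 2000, 3530)]
shifted = shift_features(ref, 25)
```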

  20. Learning the image processing pipeline.

    Science.gov (United States)

    Jiang, Haomiao; Tian, Qiyuan; Farrell, Joyce; Wandell, Brian

    2017-06-08

    Many creative ideas are being proposed for image sensor designs, and these may be useful in applications ranging from consumer photography to computer vision. To understand and evaluate each new design, we must create a corresponding image processing pipeline that transforms the sensor data into a form that is appropriate for the application. Designing and optimizing these pipelines is time-consuming and costly. We explain a method that combines machine learning and image systems simulation to automate the pipeline design. The approach is based on a new way of thinking of the image processing pipeline as a large collection of local linear filters. We illustrate how the method has been used to design pipelines for novel sensor architectures in consumer photography applications.
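
    The "collection of local linear filters" view can be illustrated by fitting a single filter with least squares: each row is a flattened sensor neighborhood, and the target is the desired output pixel. The patch size and data below are invented for illustration; the learned pipeline uses many such filters, one per local-pattern class:

```python
import numpy as np

def learn_linear_filter(patches, targets):
    """Least-squares fit of one local linear filter mapping flattened
    sensor neighborhoods (rows of `patches`) to output pixel values."""
    w, *_ = np.linalg.lstsq(patches, targets, rcond=None)
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))            # 200 synthetic 3x3 sensor patches
true_w = np.array([0.0, 0.1, 0.0, 0.1, 0.6, 0.1, 0.0, 0.1, 0.0])
y = X @ true_w                           # noiseless targets for the sketch
w = learn_linear_filter(X, y)
```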

  1. Image annotation under X Windows

    Science.gov (United States)

    Pothier, Steven

    1991-08-01

    A mechanism for attaching graphic and overlay annotation to multiple bits/pixel imagery, while providing levels of performance approaching that of native-mode graphics systems, is presented. This mechanism isolates programming complexity from the application programmer through software encapsulation under the X Window System. It ensures display accuracy throughout operations on the imagery and annotation, including zooms, pans, and modifications of the annotation. Trade-offs that affect speed of display, consumption of memory, and system functionality are explored. The use of resource files to tune the display system is discussed. The mechanism makes use of an abstraction consisting of four parts: a graphics overlay, a dithered overlay, an image overlay, and a physical display window. Data structures are maintained that retain the distinction between the four parts so that they can be modified independently, providing system flexibility. A unique technique for associating user color preferences with annotation is introduced. An interface that allows interactive modification of the mapping between image value and color is discussed. A procedure that provides for the colorization of imagery on 8-bit display systems using pixel dithering is explained. Finally, the application of annotation mechanisms to various applications is discussed.

  2. Studying Oogenesis in a Non-model Organism Using Transcriptomics: Assembling, Annotating, and Analyzing Your Data.

    Science.gov (United States)

    Carter, Jean-Michel; Gibbs, Melanie; Breuker, Casper J

    2016-01-01

    This chapter provides a guide to processing and analyzing RNA-Seq data in a non-model organism. This approach was implemented for studying oogenesis in the Speckled Wood Butterfly Pararge aegeria. We focus in particular on how to perform a more informative primary annotation of your non-model organism by implementing our multi-BLAST annotation strategy. We also provide a general guide to other essential steps in the next-generation sequencing analysis workflow. Before undertaking these methods, we recommend you familiarize yourself with command line usage and fundamental concepts of database handling. Most of the operations in the primary annotation pipeline can be performed in Galaxy (or equivalent standalone versions of the tools) and through the use of common database operations (e.g. to remove duplicates) but other equivalent programs and/or custom scripts can be implemented for further automation.

  3. Nonlinear Deep Kernel Learning for Image Annotation.

    Science.gov (United States)

    Jiu, Mingyuan; Sahbi, Hichem

    2017-02-08

    Multiple kernel learning (MKL) is a widely used technique for kernel design. Its principle consists in learning, for a given support vector classifier, the most suitable convex (or sparse) linear combination of standard elementary kernels. However, these combinations are shallow and often powerless to capture the actual similarity between highly semantic data, especially for challenging classification tasks such as image annotation. In this paper, we redefine multiple kernels using deep multi-layer networks. In this new contribution, a deep multiple kernel is recursively defined as a multi-layered combination of nonlinear activation functions, each involving a combination of several elementary or intermediate kernels, resulting in a positive semi-definite deep kernel. We propose four different frameworks in order to learn the weights of these networks: supervised, unsupervised, kernel-based semi-supervised, and Laplacian-based semi-supervised. When plugged into support vector machines (SVMs), the resulting deep kernel networks show clear gains over several shallow kernels on the task of image annotation. Extensive experiments and analysis on the challenging ImageCLEF photo annotation benchmark, the COREL5k database and the Banana dataset validate the effectiveness of the proposed method.
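
    A two-layer toy version of such a deep kernel can be written down directly: a nonnegative combination of elementary kernels passed through an elementwise exponential, which preserves positive semi-definiteness. The fixed weights and the choice of nonlinearity here are illustrative stand-ins for the networks learned in the paper:

```python
import numpy as np

def rbf(X, Y, gamma=0.5):
    """Elementary Gaussian (RBF) kernel."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def deep_kernel(X, Y, w=(0.5, 0.5)):
    """Toy two-layer deep kernel: a nonnegative combination of a linear and
    an RBF kernel fed through an elementwise exponential. The elementwise
    exponential of a PSD kernel is PSD, so the composition stays a valid
    kernel for SVMs."""
    K = w[0] * (X @ Y.T) + w[1] * rbf(X, Y)
    return np.exp(K)

X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
K = deep_kernel(X, X)
min_eig = np.linalg.eigvalsh(K).min()    # should be (numerically) nonnegative
```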

  4. The meaning of "significance" for different types of research [translated and annotated by Eric-Jan Wagenmakers, Denny Borsboom, Josine Verhagen, Rogier Kievit, Marjan Bakker, Angelique Cramer, Dora Matzke, Don Mellenbergh, and Han L. J. van der Maas]. 1969.

    Science.gov (United States)

    de Groot, A D

    2014-05-01

    Adrianus Dingeman de Groot (1914-2006) was one of the most influential Dutch psychologists. He became famous for his work "Thought and Choice in Chess", but his main contribution was methodological--De Groot co-founded the Department of Psychological Methods at the University of Amsterdam (together with R. F. van Naerssen), founded one of the leading testing and assessment companies (CITO), and wrote the monograph "Methodology" that centers on the empirical-scientific cycle: observation-induction-deduction-testing-evaluation. Here we translate one of De Groot's early articles, published in 1956 in the Dutch journal Nederlands Tijdschrift voor de Psychologie en Haar Grensgebieden. This article is more topical now than it was almost 60 years ago. De Groot stresses the difference between exploratory and confirmatory ("hypothesis testing") research and argues that statistical inference is only sensible for the latter: "One 'is allowed' to apply statistical tests in exploratory research, just as long as one realizes that they do not have evidential impact". De Groot may have also been one of the first psychologists to argue explicitly for preregistration of experiments and the associated plan of statistical analysis. The appendix provides annotations that connect De Groot's arguments to the current-day debate on transparency and reproducibility in psychological science. Copyright © 2014 Elsevier B.V. All rights reserved.

  5. INNOVATIVE ELECTROMAGNETIC SENSORS FOR PIPELINE CRAWLERS

    Energy Technology Data Exchange (ETDEWEB)

    J. Bruce Nestleroth

    2004-11-05

    Internal inspection of pipelines is an important tool for ensuring safe and reliable delivery of fossil energy products. Current inspection systems that are propelled through the pipeline by the product flow cannot be used to inspect all pipelines because of the various physical barriers they encounter. Recent development efforts include a new generation of powered inspection platforms that crawl slowly inside a pipeline and are able to maneuver past the physical barriers that can limit inspection. At Battelle, innovative electromagnetic sensors are being designed and tested for these new pipeline crawlers. The various sensor types can be used to assess a wide range of pipeline anomalies including corrosion, mechanical damage, and cracks. The Applied Energy Systems Group at Battelle is concluding the first year of work on a projected three-year development effort. In this first year, two innovative electromagnetic inspection technologies were designed and tested. Both were based on moving high-strength permanent magnets to generate inspection energy. One system involved translating permanent magnets towards the pipe. A pulse of electric current would be induced in the pipe to oppose the magnetization according to Lenz's Law. The decay of this pulse would indicate the presence of defects in the pipe wall. This inspection method is similar to pulsed eddy current inspection methods, with the fundamental difference being the manner in which the current is generated. Details of this development effort were reported in the first semiannual report on this project. This second semiannual report focuses on the development of a second inspection methodology, based on rotating permanent magnets. During this period, a rotating permanent magnet exciter was designed and built. The exciter unit produces strong eddy currents in the pipe wall. The tests have shown that at distances of a pipe diameter or more, the currents flow circumferentially, and that these circumferential

  6. INTERNAL REPAIR OF PIPELINES

    Energy Technology Data Exchange (ETDEWEB)

    Robin Gordon; Bill Bruce; Ian Harris; Dennis Harwig; George Ritter; Bill Mohr; Matt Boring; Nancy Porter; Mike Sullivan; Chris Neary

    2004-12-31

    The two broad categories of fiber-reinforced composite liner repair and deposited weld metal repair technologies were reviewed and evaluated for potential application for internal repair of gas transmission pipelines. Both are used to some extent for other applications and could be further developed for internal, local, structural repair of gas transmission pipelines. Principal conclusions from a survey of natural gas transmission industry pipeline operators can be summarized in terms of the following performance requirements for internal repair: (1) Use of internal repair is most attractive for river crossings, under other bodies of water, in difficult soil conditions, under highways, under congested intersections, and under railway crossings. (2) Internal pipe repair offers a strong potential advantage to the high cost of horizontal direct drilling when a new bore must be created to solve a leak or other problem. (3) Typical travel distances can be divided into three distinct groups: up to 305 m (1,000 ft.); between 305 m and 610 m (1,000 ft. and 2,000 ft.); and beyond 914 m (3,000 ft.). All three groups require pig-based systems. A despooled umbilical system would suffice for the first two groups, which represent 81% of survey respondents. The third group would require an onboard self-contained power unit for propulsion and welding/liner repair energy needs. (4) The most common size range for 80% to 90% of operators surveyed is 508 mm (20 in.) to 762 mm (30 in.), with 95% using 558.8 mm (22 in.) pipe. Evaluation trials were conducted on pipe sections with simulated corrosion damage repaired with glass fiber-reinforced composite liners, carbon fiber-reinforced composite liners, and weld deposition. Additional un-repaired pipe sections were evaluated in the virgin condition and with simulated damage. Hydrostatic failure pressures for pipe sections repaired with glass fiber-reinforced composite liner were only marginally greater than those of pipe sections without

  7. Shipping Information Pipeline

    DEFF Research Database (Denmark)

    Jensen, Thomas; Vatrapu, Ravi

    2015-01-01

    This paper presents a design science approach to solving persistent problems in the international shipping ecosystem by creating the missing common information infrastructures. Specifically, this paper reports on an ongoing dialogue between stakeholders in the shipping industry and information systems researchers engaged in the design and development of a prototype for an innovative IT-artifact called Shipping Information Pipeline, which is a kind of “an internet” for shipping information. The instrumental aim is to enable information to seamlessly cross the organizational boundaries and national borders within international shipping, which is a rather complex domain. The intellectual objective is to generate and evaluate the efficacy and effectiveness of design principles for inter-organizational information infrastructures in the international shipping domain that can have positive...

  8. INTERNAL REPAIR OF PIPELINES

    Energy Technology Data Exchange (ETDEWEB)

    Robin Gordon; Bill Bruce; Ian Harris; Dennis Harwig; Nancy Porter; Mike Sullivan; Chris Neary

    2004-04-12

    The two broad categories of deposited weld metal repair and fiber-reinforced composite liner repair technologies were reviewed for potential application for internal repair of gas transmission pipelines. Both are used to some extent for other applications and could be further developed for internal, local, structural repair of gas transmission pipelines. Preliminary test programs were developed for both deposited weld metal repair and for fiber-reinforced composite liner repair. Evaluation trials have been conducted using a modified fiber-reinforced composite liner provided by RolaTube and pipe sections without liners. All pipe section specimens failed in areas of simulated damage. Pipe sections containing fiber-reinforced composite liners failed at pressures marginally greater than the pipe sections without liners. The next step is to evaluate a liner material with a modulus of elasticity approximately 95% of the modulus of elasticity for steel. Preliminary welding parameters were developed for deposited weld metal repair in preparation of the receipt of Pacific Gas & Electric's internal pipeline welding repair system (that was designed specifically for 559 mm (22 in.) diameter pipe) and the receipt of 559 mm (22 in.) pipe sections from Panhandle Eastern. The next steps are to transfer welding parameters to the PG&E system and to pressure test repaired pipe sections to failure. A survey of pipeline operators was conducted to better understand the needs and performance requirements of the natural gas transmission industry regarding internal repair. Completed surveys contained the following principal conclusions: (1) Use of internal weld repair is most attractive for river crossings, under other bodies of water, in difficult soil conditions, under highways, under congested intersections, and under railway crossings. (2) Internal pipe repair offers a strong potential advantage to the high cost of horizontal direct drilling (HDD) when a new bore must be created

  9. INTERNAL REPAIR OF PIPELINES

    Energy Technology Data Exchange (ETDEWEB)

    Robin Gordon; Bill Bruce; Ian Harris; Dennis Harwig; George Ritter; Bill Mohr; Matt Boring; Nancy Porter; Mike Sullivan; Chris Neary

    2004-08-17

    The two broad categories of fiber-reinforced composite liner repair and deposited weld metal repair technologies were reviewed and evaluated for potential application for internal repair of gas transmission pipelines. Both are used to some extent for other applications and could be further developed for internal, local, structural repair of gas transmission pipelines. Principal conclusions from a survey of natural gas transmission industry pipeline operators can be summarized in terms of the following performance requirements for internal repair: (1) Use of internal repair is most attractive for river crossings, under other bodies of water, in difficult soil conditions, under highways, under congested intersections, and under railway crossings. (2) Internal pipe repair offers a strong potential advantage to the high cost of horizontal direct drilling when a new bore must be created to solve a leak or other problem. (3) Typical travel distances can be divided into three distinct groups: up to 305 m (1,000 ft.); between 305 m and 610 m (1,000 ft. and 2,000 ft.); and beyond 914 m (3,000 ft.). All three groups require pig-based systems. A despooled umbilical system would suffice for the first two groups, which represent 81% of survey respondents. The third group would require an onboard self-contained power unit for propulsion and welding/liner repair energy needs. (4) The most common size range for 80% to 90% of operators surveyed is 508 mm (20 in.) to 762 mm (30 in.), with 95% using 558.8 mm (22 in.) pipe. Evaluation trials were conducted on pipe sections with simulated corrosion damage repaired with glass fiber-reinforced composite liners, carbon fiber-reinforced composite liners, and weld deposition. Additional un-repaired pipe sections were evaluated in the virgin condition and with simulated damage. Hydrostatic failure pressures for pipe sections repaired with glass fiber-reinforced composite liner were only marginally greater than those of pipe sections without liners

  10. Virtual Pipeline System Testbed to Optimize the U.S. Natural Gas Transmission Pipeline System

    Energy Technology Data Exchange (ETDEWEB)

    Kirby S. Chapman; Prakash Krishniswami; Virg Wallentine; Mohammed Abbaspour; Revathi Ranganathan; Ravi Addanki; Jeet Sengupta; Liubo Chen

    2005-06-01

    The goal of this project is to develop a Virtual Pipeline System Testbed (VPST) for natural gas transmission. This study uses a fully implicit finite difference method to analyze transient, nonisothermal compressible gas flow through a gas pipeline system. The inertia term of the momentum equation is included in the analysis. The testbed simulates compressor stations, the pipe that connects these compressor stations, the supply sources, and the end-user demand markets. The compressor station is described by identifying the make, model, and number of engines, gas turbines, and compressors. System operators and engineers can analyze the impact of system changes on the dynamic deliverability of gas and on the environment.
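The report's exact formulation is not reproduced here, but a fully implicit scheme of this kind typically discretizes the standard one-dimensional equations for transient compressible pipe flow; the following is that standard form (the inertia term the study retains is the first term of the momentum balance):

```latex
% Continuity and momentum for 1-D transient gas flow in a pipe of diameter D,
% friction factor f, inclination theta; closed by a real-gas law p = z rho R T.
\frac{\partial \rho}{\partial t} + \frac{\partial (\rho u)}{\partial x} = 0,
\qquad
\underbrace{\frac{\partial (\rho u)}{\partial t}}_{\text{inertia term}}
  + \frac{\partial (\rho u^{2})}{\partial x}
  + \frac{\partial p}{\partial x}
  = -\frac{f\,\rho\,u\,\lvert u\rvert}{2D} - \rho g \sin\theta
```

The nonisothermal case analyzed in the study adds an energy equation in the same variables; an implicit method evaluates the spatial terms at the new time level, which is what permits large time steps on long transmission systems.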

  11. Accident Prevention and Diagnostics of Underground Pipeline Systems

    Science.gov (United States)

    Trokhimchuk, M.; Bakhracheva, Y.

    2017-11-01

    Up to forty thousand accidents involving underground pipelines occur annually due to corrosion. A comparison of methods for assessing the quality of anti-corrosion coatings is provided. It is proposed to use a tie-in device for existing pipelines that offers higher functionality than other device types because it can be tied in to pipelines of different diameters. The existing technologies and applied materials allow us to organize industrial production of the proposed device.

  12. Energy cost reduction in oil pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Limeira, Fabio Machado; Correa, Joao Luiz Lavoura; Costa, Luciano Macedo Josino da; Silva, Jose Luiz da; Henriques, Fausto Metzger Pessanha [Petrobras Transporte S.A. (TRANSPETRO), Rio de Janeiro, RJ (Brazil)

    2012-07-01

    One of the key questions of modern society concerns the rational use of the planet's natural resources and energy. Due to the lack of energy, many companies are forced to reduce their workload, especially during peak hours, because residential demand reaches its top and there is not enough energy to fulfill the needs of all users, which affects major industries. Therefore, using energy more wisely has become a strategic issue for any company, due to the limited supply and also to the excessive cost it represents. With the objective of saving energy and reducing costs for oil pipelines, it has been identified that the increase in energy consumption is primarily related to pumping stations and also to the way many facilities are operated, that is, differently from what was originally designed. To exploit this opportunity and optimize the process, this article examines the possible gains from alternatives involving changes to the pump scheme configuration and the non-use of pump stations at peak hours. Initially, an oil pipeline with potential to reduce energy costs was chosen, followed by an analysis of its operating history in order to confirm that there was sufficient room to change the operation mode. After confirming the pipeline choice, the system is briefly described and the literature is reviewed, explaining how the energy cost is calculated and the main characteristics of pumping systems in series and in parallel. Next, technically feasible alternatives are studied, both for operating the system and for negotiating the energy demand contract. Finally, costs are calculated to identify the most economical alternative, for a scenario with no increase in the actual transported volume of the pipeline and for another scenario that considers an increase of about 20%. The conclusion of this study indicates that the chosen pipeline can achieve a reduction in energy costs of up to 25% without the need for investments in new
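The peak-hour arithmetic behind such a study can be sketched numerically. The following toy illustration is not the article's actual data: the tariff rates, pump power, and operating window are all invented. It only shows why moving the same number of pumping hours out of the peak tariff window cuts the energy bill.

```python
# Toy peak-shifting comparison; all tariff and power figures are invented.
PEAK_HOURS = set(range(18, 21))          # assumed 18:00-20:59 peak window
PEAK_RATE, OFFPEAK_RATE = 0.30, 0.10     # $/kWh, assumed tariffs

def daily_cost(pump_kw, operating_hours):
    """Energy cost for one day given the set of hours the pumps run."""
    return sum(pump_kw * (PEAK_RATE if h in PEAK_HOURS else OFFPEAK_RATE)
               for h in operating_hours)

# Scenario A: run 20 h/day including the 3 peak hours.
always_on = daily_cost(2000, range(4, 24))

# Scenario B: the same 20 pumping hours, but the peak window is avoided
# (throughput made up outside the window, e.g. via line-pack flexibility).
peak_shifted = daily_cost(2000, [h for h in range(24) if h not in PEAK_HOURS][:20])

savings = 1 - peak_shifted / always_on   # fractional cost reduction
```

With these invented numbers the shift alone yields roughly a 23% cost reduction, the same order as the up-to-25% figure the study reports for its pipeline.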

  13. INNOVATIVE ELECTROMAGNETIC SENSORS FOR PIPELINE CRAWLERS

    Energy Technology Data Exchange (ETDEWEB)

    J. Bruce Nestleroth

    2005-11-30

    Internal inspection of pipelines is an important tool for ensuring safe and reliable delivery of fossil energy products. Current inspection systems that are propelled through the pipeline by the product flow cannot be used to inspect all pipelines because of the various physical barriers they encounter. Recent development efforts include a new generation of powered inspection platforms that crawl slowly inside a pipeline and are able to maneuver past the physical barriers that can limit inspection. At Battelle, innovative electromagnetic sensors are being designed and tested for these new pipeline crawlers. The various sensor types can be used to assess a wide range of pipeline anomalies including corrosion, mechanical damage, and cracks. Battelle has completed the second year of work on a projected three-year development effort. In the first year, two innovative electromagnetic inspection technologies were designed and tested. Both were based on moving high-strength permanent magnets to generate inspection energy. One system involved translating permanent magnets towards the pipe. A pulse of electric current would be induced in the pipe to oppose the magnetization according to Lenz's Law. The decay of this pulse would indicate the presence of defects in the pipe wall. This inspection method is similar to pulsed eddy current inspection methods, with the fundamental difference being the manner in which the current is generated. Details of this development effort were reported in the first semiannual report on this project. The second inspection methodology is based on rotating permanent magnets. The rotating exciter unit produces strong eddy currents in the pipe wall. At distances of a pipe diameter or more from the rotating exciter, the currents flow circumferentially. These circumferential currents are deflected by pipeline defects such as corrosion and axially aligned cracks. Simple sensors are used to detect the change in current densities in the pipe wall

  14. A survey on annotation tools for the biomedical literature.

    Science.gov (United States)

    Neves, Mariana; Leser, Ulf

    2014-03-01

    New approaches to biomedical text mining crucially depend on the existence of comprehensive annotated corpora. Such corpora, commonly called gold standards, are important for learning patterns or models during the training phase, for evaluating and comparing the performance of algorithms, and also for better understanding, through examples, the information being sought. Gold standards depend on human understanding and manual annotation of natural language text. This process is very time-consuming and expensive because it requires high intellectual effort from domain experts. Accordingly, the lack of gold standards is considered one of the main bottlenecks for developing novel text mining methods. This situation led to the development of tools that support humans in annotating texts. Such tools should be intuitive to use, should support a range of different input formats, should include visualization of annotated texts and should generate an easy-to-parse output format. Today, a range of tools which implement some of these functionalities are available. Here, we present a comprehensive survey of tools for supporting the annotation of biomedical texts. Altogether, we considered almost 30 tools, 13 of which were selected for an in-depth comparison. The comparison was performed using predefined criteria and was accompanied by hands-on experiences whenever possible. Our survey shows that current tools can support many of the tasks in biomedical text annotation in a satisfying manner, but also that no tool can be considered a truly comprehensive solution.

  15. Alignment-Annotator web server: rendering and annotating sequence alignments.

    Science.gov (United States)

    Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas

    2014-07-01

    Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed at server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (from BioDAS servers, UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip-archive containing the HTML files. Because of the use of HTML, the resulting interactive alignment can be viewed on any platform including Windows, Mac OS X, Linux, Android and iOS in any standard web browser. Importantly, no plugins or Java are required, and therefore Alignment-Annotator represents the first interactive browser-based alignment visualization. http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. SeqAnt: A web service to rapidly identify and annotate DNA sequence variations

    Directory of Open Access Journals (Sweden)

    Patel Viren

    2010-09-01

    Abstract Background The enormous throughput and low cost of second-generation sequencing platforms now allow research and clinical geneticists to routinely perform single experiments that identify tens of thousands to millions of variant sites. Existing methods to annotate variant sites using information from publicly available databases via web browsers are too slow to be useful for the large sequencing datasets being routinely generated by geneticists. Because sequence annotation of variant sites is required before functional characterization can proceed, the lack of a high-throughput pipeline to efficiently annotate variant sites can act as a significant bottleneck in genetics research. Results SeqAnt (Sequence Annotator) is an open source web service and software package that rapidly annotates DNA sequence variants and identifies recessive or compound heterozygous loci in human, mouse, fly, and worm genome sequencing experiments. Variants are characterized with respect to their functional type, frequency, and evolutionary conservation. Annotated variants can be viewed on a web browser, downloaded in a tab-delimited text file, or directly uploaded in a BED format to the UCSC genome browser. To demonstrate the speed of SeqAnt, we annotated a series of publicly available datasets that ranged in size from 37 to 3,439,107 variant sites. The total time to completely annotate these data ranged from 0.17 seconds to 28 minutes 49.8 seconds. Conclusion SeqAnt is an open source web service and software package that overcomes a critical bottleneck facing research and clinical geneticists using second-generation sequencing platforms. SeqAnt will prove especially useful for those investigators who lack dedicated bioinformatics personnel or infrastructure in their laboratories.
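The core step of any such variant annotator is classifying each variant position against a table of annotated genomic intervals. The sketch below is not SeqAnt's implementation; the feature names and coordinates are invented, and it shows only the interval-lookup idea (sorted intervals plus binary search) that makes per-variant annotation fast.

```python
import bisect

# Invented feature table: (start, end, label), sorted and non-overlapping.
features = sorted([(100, 200, "exon:GENE1"), (201, 500, "intron:GENE1"),
                   (800, 950, "exon:GENE2")])
starts = [f[0] for f in features]

def annotate(pos):
    """Return the label of the feature containing pos, or 'intergenic'."""
    i = bisect.bisect_right(starts, pos) - 1      # rightmost start <= pos
    if i >= 0 and features[i][0] <= pos <= features[i][1]:
        return features[i][2]
    return "intergenic"

annotations = {pos: annotate(pos) for pos in (150, 300, 700, 900)}
```

Each lookup is O(log n) in the number of features, so millions of variant sites can be classified in seconds, which is the scaling behavior the SeqAnt timings above reflect.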

  17. An Approach to Function Annotation for Proteins of Unknown Function (PUFs) in the Transcriptome of Indian Mulberry.

    Directory of Open Access Journals (Sweden)

    K H Dhanyalakshmi

    Modern sequencing technologies are generating large volumes of information at the transcriptome and genome level. Translation of this information into biological meaning lags far behind, due to which a significant portion of the proteins discovered remain as proteins of unknown function (PUFs). Attempts to uncover the functional significance of PUFs are limited due to the lack of easy, high-throughput functional annotation tools. Here, we report an approach to assign putative functions to PUFs identified in the transcriptome of mulberry, a perennial tree commonly cultivated as the host of the silkworm. We utilized the mulberry PUFs generated from leaf tissues exposed to drought stress at the whole-plant level. A sequence- and structure-based computational analysis predicted the probable function of the PUFs. For rapid and easy annotation of PUFs, we developed an automated pipeline by integrating diverse bioinformatics tools, designated the PUFs Annotation Server (PUFAS), which also provides a web service API (Application Programming Interface) for large-scale analysis up to a genome. The expression analysis of three selected PUFs annotated by the pipeline revealed abiotic stress responsiveness of the genes, and hence their potential role in stress acclimation pathways. The automated pipeline developed here could be extended to assign functions to PUFs from any organism. The PUFAS web server is available at http://caps.ncbs.res.in/pufas/ and the web service is accessible at http://capservices.ncbs.res.in/help/pufas.

  18. Allele Workbench: transcriptome pipeline and interactive graphics for allele-specific expression.

    Directory of Open Access Journals (Sweden)

    Carol A Soderlund

    Sequencing the transcriptome can answer various questions such as determining the transcripts expressed in a given species for a specific tissue or condition, evaluating differential expression, discovering variants, and evaluating allele-specific expression. Differential expression evaluates the expression differences between different strains, tissues, and conditions. Allele-specific expression evaluates expression differences between parental alleles. Both differential expression and allele-specific expression have been studied for heterosis (hybrid vigor), where the hybrid has improved performance over the parents for one or more traits. The Allele Workbench software was developed for a heterosis study that evaluated allele-specific expression for a mouse F1 hybrid using libraries from multiple tissues with biological replicates. This software has been made into a distributable package, which includes a pipeline, a Java interface to build the database, and a Java interface for query and display of the results. The required input is a reference genome, annotation file, and one or more RNA-Seq libraries with optional replicates. It evaluates allelic imbalance at the SNP and transcript level and flags transcripts with significant opposite directional allele-specific expression. The Java interface allows the user to view data from libraries, replicates, genes, transcripts, exons, and variants, including queries on allele imbalance for selected libraries. To determine the impact of allele-specific SNPs on protein folding, variants are annotated with their effect (e.g., missense), and the parental protein sequences may be exported for protein folding analysis. The Allele Workbench processing results in transcript files and read counts that can be used as input to the previously published Transcriptome Computational Workbench, which has a new algorithm for determining a trimmed set of gene ontology terms. The software with demo files is available
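Evaluating allelic imbalance at a SNP typically comes down to testing whether the reference and alternate allele read counts depart from the 50:50 ratio expected under balanced expression. The abstract does not give Allele Workbench's exact statistic, so the sketch below uses a common stand-in: a two-sided binomial test (doubled one-tail form) implemented with only the standard library; the read counts are invented.

```python
from math import comb

def binom_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def allele_imbalance_pvalue(ref_reads, alt_reads):
    """Two-sided binomial test of 50:50 allelic expression
    (doubled one-tail approximation, capped at 1)."""
    n = ref_reads + alt_reads
    k = ref_reads
    lower = binom_cdf(k, n)                        # P(X <= k)
    upper = 1 - binom_cdf(k - 1, n) if k > 0 else 1.0   # P(X >= k)
    return min(1.0, 2 * min(lower, upper))

balanced = allele_imbalance_pvalue(18, 22)   # near 50:50 -> large p-value
skewed = allele_imbalance_pvalue(35, 5)      # strong imbalance -> tiny p-value
```

In a real pipeline the per-SNP p-values would then be corrected for multiple testing before flagging transcripts, and SNP-level calls aggregated to the transcript level as described above.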

  19. 75 FR 13342 - Pipeline Safety: Workshop on Distribution Pipeline Construction

    Science.gov (United States)

    2010-03-19

    ... SUPPLEMENTARY INFORMATION for additional information. Privacy Act Statement: Anyone may search the electronic... patterns of similar findings. Poor construction quality has led to short and long term pipeline integrity...

  20. Natural gas pipeline technology overview.

    Energy Technology Data Exchange (ETDEWEB)

    Folga, S. M.; Decision and Information Sciences

    2007-11-01

    The United States relies on natural gas for one-quarter of its energy needs. In 2001 alone, the nation consumed 21.5 trillion cubic feet of natural gas. A large portion of natural gas pipeline capacity within the United States is directed from major production areas in Texas and Louisiana, Wyoming, and other states to markets in the western, eastern, and midwestern regions of the country. In the past 10 years, increasing levels of gas from Canada have also been brought into these markets (EIA 2007). The United States has several major natural gas production basins and an extensive natural gas pipeline network, with almost 95% of U.S. natural gas imports coming from Canada. At present, the gas pipeline infrastructure is more developed between Canada and the United States than between Mexico and the United States. Gas flows from Canada to the United States through several major pipelines feeding U.S. markets in the Midwest, Northeast, Pacific Northwest, and California. Some key examples are the Alliance Pipeline, the Northern Border Pipeline, the Maritimes & Northeast Pipeline, the TransCanada Pipeline System, and Westcoast Energy pipelines. Major connections join Texas and northeastern Mexico, with additional connections to Arizona and between California and Baja California, Mexico (INGAA 2007). Of the natural gas consumed in the United States, 85% is produced domestically. Figure 1.1-1 shows the complex North American natural gas network. The pipeline transmission system--the 'interstate highway' for natural gas--consists of 180,000 miles of high-strength steel pipe varying in diameter, normally between 30 and 36 inches in diameter. The primary function of the transmission pipeline company is to move huge amounts of natural gas thousands of miles from producing regions to local natural gas utility delivery points. These delivery points, called 'city gate stations', are usually owned by distribution companies, although some are owned by

  1. Effort problem of chemical pipelines

    Energy Technology Data Exchange (ETDEWEB)

    Okrajni, J.; Ciesla, M.; Mutwil, K. [Silesian Technical University, Katowice (Poland)

    1998-12-31

    The problem of assessing the technical state of chemical pipelines working under mechanical and thermal loading is presented in the paper. The effort of the pipelines after a long operating period is analysed. The material, geometrical, and loading conditions of the crack initiation and crack growth process in the chosen object are discussed. Areas of maximal effort are determined. Changes in the material structure after the long operating period are described. Mechanisms of crack initiation and crack growth in the pipeline elements are analysed, and the mutual relations between the chemical and mechanical influences are shown. (orig.) 16 refs.

  2. Subsea pipeline operational risk management

    Energy Technology Data Exchange (ETDEWEB)

    Bell, R.L.; Lanan, G.A.

    1996-12-31

    Resources used for inspection, maintenance, and repair of a subsea pipeline must be allocated efficiently in order to operate it in the most cost effective manner. Operational risk management aids in resource allocation through the use of risk assessments and cost/benefit analyses. It identifies those areas where attention must be focused in order to reduce risk. When they are identified, a company's resources (i.e., personnel, equipment, money, and time) can then be used for inspection, maintenance, and/or repair of the pipeline. The results are cost effective risk reduction and pipeline operation with minimum expenditure.
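The ranking step such a risk assessment feeds can be sketched in a few lines. This is a toy illustration, not the paper's method: segment names, failure probabilities, and consequence costs are all invented; it shows only the likelihood-times-consequence ordering used to focus inspection resources.

```python
# Rank pipeline segments by expected annual loss (likelihood x consequence);
# all figures are invented for illustration.
segments = [
    ("KP 0-10",  0.02,  5_000_000),   # (name, annual failure prob., consequence $)
    ("KP 10-25", 0.001, 8_000_000),
    ("KP 25-40", 0.01,  1_000_000),
]

# Highest risk first: these segments get inspection budget first.
ranked = sorted(segments, key=lambda s: s[1] * s[2], reverse=True)
top = ranked[0][0]
```

A cost/benefit analysis would then compare, per segment, the cost of an inspection or repair against the reduction in this expected loss it buys.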

  3. A Study on Optimal Sizing of Pipeline Transporting Equi-sized Particulate Solid-Liquid Mixture

    International Nuclear Information System (INIS)

    Asim, Taimoor; Mishra, Rakesh; Pradhan, Suman; Ubbi, Kuldip

    2012-01-01

    Pipelines transporting solid-liquid mixtures are of practical interest to the oil and pipe industry throughout the world. Such pipelines are known as slurry pipelines, where the solid medium of the flow is commonly known as slurry. The optimal design of such pipelines is of commercial interest for their widespread acceptance. A methodology has been evolved for the optimal sizing of a pipeline transporting solid-liquid mixture. The least-cost principle has been used in sizing such pipelines, which involves determining the pipe diameter corresponding to the minimum cost for a given solid throughput. A detailed analysis of the transportation of slurry having solids of uniformly graded particle size has been included. The proposed methodology can be used for designing a pipeline for transporting any solid material at different solid throughputs.
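The least-cost principle can be illustrated with a minimal sketch: for a fixed throughput, capital cost grows with diameter while friction (and hence pumping energy) falls steeply with it, so the total annual cost has a minimum at some intermediate diameter. All cost coefficients below are invented, and the friction factor is held constant; this is not the paper's cost model.

```python
from math import pi

Q = 0.5                   # throughput, m^3/s (assumed)
LENGTH = 10_000.0         # pipeline length, m (assumed)
CAPITAL_PER_M2 = 900.0    # $/yr per m of length per m of diameter (invented)
ENERGY_COEFF = 40.0       # $/yr per unit of the head-loss term (invented)
FRICTION = 0.02           # Darcy friction factor, assumed constant

def annual_cost(d):
    """Annualized capital plus pumping-energy cost for diameter d (m)."""
    v = Q / (pi * d**2 / 4)                        # mean velocity
    head_loss = FRICTION * LENGTH / d * v**2 / 2   # ~ 1/d^5 for fixed Q
    return CAPITAL_PER_M2 * LENGTH * d + ENERGY_COEFF * head_loss

diameters = [round(0.1 + 0.05 * i, 2) for i in range(20)]  # 0.10-1.05 m grid
best = min(diameters, key=annual_cost)
```

For a real slurry the friction term would also depend on solids concentration and particle size (and a minimum velocity must be respected to avoid deposition), but the diameter-selection logic is the same.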

  4. Pipeline oil fire detection with MODIS active fire products

    Science.gov (United States)

    Ogungbuyi, M. G.; Martinez, P.; Eckardt, F. D.

    2017-12-01

    We investigate 85 129 MODIS satellite active fire events from 2007 to 2015 in the Niger Delta of Nigeria. The region is the oil base of the Nigerian economy and the hub of oil exploration where oil facilities (i.e. flowlines, flow stations, trunklines, oil wells and oil fields) are domiciled, and from where crude oil and refined products are transported to different Nigerian locations through a network of pipeline systems. Pipelines and other oil facilities are consistently susceptible to oil leaks due to operational or maintenance error, and to acts of deliberate sabotage of the pipeline equipment, which often result in explosions and fire outbreaks. We used ground oil spill reports obtained from the National Oil Spill Detection and Response Agency (NOSDRA) database (see www.oilspillmonitor.ng) to validate MODIS satellite data. The NOSDRA database shows an estimate of 10 000 spill events from 2007 - 2015. The spill events were filtered to include the largest spills by volume and events occurring only in the Niger Delta (i.e. 386 spills). By projecting both MODIS fire and spill events as 'input vector' layers with 'Points' geometry, and the Nigerian pipeline networks as 'from vector' layers with 'LineString' geometry in a geographical information system, we extracted the MODIS events nearest to the pipelines (i.e. 2192 events within a distance of 1000 m) in a spatial vector analysis. The extraction process that defined the nearest distance to the pipelines is based on the global Right of Way (ROW) practices in pipeline management, which earmark a 30 m strip of land for the pipeline. The KML files of the extracted fires in a Google map validated that their sources were oil facilities. Land cover mapping confirmed the fire anomalies. The aim of the study is to propose near-real-time monitoring of spill events along pipeline routes using the 250 m spatial resolution of the MODIS active fire detection sensor when such spills are accompanied by fire events in the study location.
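The extraction step described above (keep only fire detections within a threshold distance of a pipeline polyline) reduces to point-to-segment distance tests. The sketch below is a minimal stand-in for the GIS operation, with invented coordinates in a projected metre-based CRS; a real workflow would use the GIS's own nearest-feature tools on the full layers.

```python
from math import hypot

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:                       # degenerate segment
        return hypot(px - ax, py - ay)
    # Project p onto the segment, clamped to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))

def near_pipeline(fires, pipeline, threshold=1000.0):
    """Keep fire points within `threshold` metres of the pipeline polyline."""
    return [f for f in fires
            if min(point_segment_distance(f, a, b)
                   for a, b in zip(pipeline, pipeline[1:])) <= threshold]

pipeline = [(0, 0), (5000, 0), (5000, 5000)]        # polyline vertices (invented)
fires = [(2500, 800), (2500, 1200), (5600, 4000)]   # detections (invented)
kept = near_pipeline(fires, pipeline)
```

Here the middle detection falls 1200 m from the line and is dropped, while the other two fall within the 1000 m buffer and are retained.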

  5. 49 CFR 195.422 - Pipeline repairs.

    Science.gov (United States)

    2010-10-01

    49 CFR 195.422 (Operation and Maintenance, Pipeline repairs): (a) Each operator shall, in repairing its pipeline systems, insure that the repairs are made in a safe manner and are made so as to prevent damage to...

  6. Fishing intensity around the BBL pipeline

    NARCIS (Netherlands)

    Hintzen, Niels

    2016-01-01

    Wageningen Marine Research was requested by ACRB B.V. to investigate the fishing activities around the BBL pipeline. This gas pipeline crosses the southern North Sea from Balgzand (near Den Helder) in the Netherlands to Bacton in the UK (230km). This pipeline is abbreviated as the BBL pipeline. Part

  7. Instructional Materials Centers; Annotated Bibliography.

    Science.gov (United States)

    Poli, Rosario, Comp.

    An annotated bibliography lists 74 articles and reports on instructional materials centers (IMC) which appeared from 1967-70. The articles deal with such topics as the purposes of an IMC, guidelines for setting up an IMC, and the relationship of an IMC to technology. Most articles deal with use of an IMC on an elementary or secondary level, but…

  8. Designing Annotation Before It's Needed

    NARCIS (Netherlands)

    F.-M. Nack (Frank); W. Putz

    2001-01-01

    This paper considers the automated and semi-automated annotation of audiovisual media in a new type of production framework, A4SM (Authoring System for Syntactic, Semantic and Semiotic Modelling). We present the architecture of the framework and outline the underlying XML-Schema based

  9. Image annotation using clickthrough data

    NARCIS (Netherlands)

    T. Tsikrika (Theodora); C. Diou; A.P. de Vries (Arjen); A. Delopoulos

    2009-01-01

    Automatic image annotation using supervised learning is performed by concept classifiers trained on labelled example images. This work proposes the use of clickthrough data collected from search logs as a source for the automatic generation of concept training data, thus avoiding the

  10. Logistics aspects of petroleum pipeline operations

    Directory of Open Access Journals (Sweden)

    W. J. Pienaar

    2010-11-01

    The paper identifies, assesses and describes the logistics aspects of the commercial operation of petroleum pipelines. The nature of petroleum-product supply chains, in which pipelines play a role, is outlined and the types of petroleum pipeline systems are described. An outline is presented of the nature of the logistics activities of petroleum pipeline operations. The reasons for the cost efficiency of petroleum pipeline operations are given. The relative modal service effectiveness of petroleum pipeline transport, based on the most pertinent service performance measures, is offered. The segments in the petroleum-products supply chain where pipelines can play an efficient and effective role are identified.

  11. Pipeline integrity handbook risk management and evaluation

    CERN Document Server

    Singh, Ramesh

    2013-01-01

    Based on over 40 years of experience in the field, Ramesh Singh goes beyond corrosion control, providing techniques for addressing present and future integrity issues. Pipeline Integrity Handbook provides pipeline engineers with the tools to evaluate and inspect pipelines, safeguard the life cycle of their pipeline asset and ensure that they are optimizing delivery and capability. Presented in easy-to-use, step-by-step order, Pipeline Integrity Handbook is a quick reference for day-to-day use in identifying key pipeline degradation mechanisms and threats to pipeline integrity. The book begins

  12. Learning Intelligent Dialogs for Bounding Box Annotation

    OpenAIRE

    Konyushkova, Ksenia; Uijlings, Jasper; Lampert, Christoph; Ferrari, Vittorio

    2017-01-01

    We introduce Intelligent Annotation Dialogs for bounding box annotation. We train an agent to automatically choose a sequence of actions for a human annotator to produce a bounding box in a minimal amount of time. Specifically, we consider two actions: box verification [37], where the annotator verifies a box generated by an object detector, and manual box drawing. We explore two kinds of agents, one based on predicting the probability that a box will be positively verified, and the other bas...

  13. Pipelines programming paradigms: Prefab plumbing

    International Nuclear Information System (INIS)

    Boeheim, C.

    1991-08-01

    Mastery of CMS Pipelines is a process of learning increasingly sophisticated tools and techniques that can be applied to your problem. This paper presents a compilation of techniques that can be used as a reference for solving similar problems

  14. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    Science.gov (United States)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of any sequencing project is to identify genes in the genome. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present the discrepancies and dependencies arising from the application of different bioinformatics programs for structural annotation, performed on the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We used Fgenesh, GenScan and GeneMark for automated structural annotation, and the results were compared to the reference annotation.
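Discrepancies between predictors of this kind are often quantified by intersecting predicted gene intervals with the reference annotation. A toy sketch (coordinates and per-tool predictions are invented; real comparisons work on GFF features and usually allow partial overlap):

```python
# Reference gene intervals (start, end) and each predictor's calls.
reference = {(100, 500), (800, 1200), (2000, 2600)}
predictions = {
    "Fgenesh": {(100, 500), (800, 1200), (2100, 2600)},
    "GenScan": {(100, 500), (790, 1200)},
}

# Fraction of reference genes recovered by exact coordinate match.
for tool, preds in sorted(predictions.items()):
    recovered = len(reference & preds) / len(reference)
    print(tool, round(recovered, 2))
```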

  15. OligoRAP – an Oligo Re-Annotation Pipeline to improve annotation and estimate target specificity

    NARCIS (Netherlands)

    Neerincx, P.; Rauwerda, H.; Nie, H.; Groenen, M.A.M.; Breit, T.M.; Leunissen, J.A.M.

    2009-01-01

    Background - High throughput gene expression studies using oligonucleotide microarrays depend on the specificity of each oligonucleotide (oligo or probe) for its target gene. However, target specific probes can only be designed when a reference genome of the species at hand were completely

  16. Annotating images by mining image search results

    NARCIS (Netherlands)

    Wang, X.J.; Zhang, L.; Li, X.; Ma, W.Y.

    2008-01-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search

  17. Pipeline for Contraceptive Development

    Science.gov (United States)

    Blithe, Diana L.

    2016-01-01

    The high rates of unplanned pregnancy reflect unmet need for effective contraceptive methods for women, especially for individuals with health risks such as obesity, diabetes, hypertension, and other conditions that may contraindicate use of an estrogen-containing product. Improvements in safety, user convenience, acceptability and availability of products remain important goals of the contraceptive development program. Another important goal is to minimize the impact of the products on the environment. Development of new methods for male contraception has the potential to address many of these issues with regard to safety for women who have contraindications to effective contraceptive methods but want to protect against pregnancy. It will also address a huge unmet need for men who want to control their fertility. Products under development for men would not introduce eco-toxic hormones in the waste water. Investment in contraceptive research to identify new products for women has been limited in the pharmaceutical industry relative to investment in drug development for other indications. Pharmaceutical R&D for male contraception was active in the 1990s but was abandoned over a decade ago. The Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) has supported a contraceptive development program since 1969. Through a variety of programs including research grants and contracts, NICHD has developed a pipeline of new targets/products for male and female contraception. A number of lead candidates are under evaluation in the NICHD Contraceptive Clinical Trials Network (CCTN) (1–3). PMID:27523300

  18. Developing eThread Pipeline Using SAGA-Pilot Abstraction for Large-Scale Structural Bioinformatics

    Directory of Open Access Journals (Sweden)

    Anjani Ragothaman

    2014-01-01

    While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because the predicted structural information could uncover the underlying function. However, threading tools are generally compute-intensive, and the number of protein sequences from even small genomes such as prokaryotes is large, typically many thousands, prohibiting their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread, a meta-threading protein structure modeling tool, that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data- and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present a runtime analysis to characterize the computational complexity of eThread and the EC2 infrastructure. Based on the results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, particularly amenable for small genomes such as prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure.
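The task-level parallelism described above treats each protein sequence as an independent job. A local sketch of that pattern (the worker function and sequences are placeholders; the real pipeline dispatches eThread runs to SAGA-Pilot-managed workers on EC2, not a local pool):

```python
from concurrent.futures import ThreadPoolExecutor

# Toy stand-in for a compute-intensive threading job: the real pipeline
# would invoke eThread on one protein sequence per task.
def thread_sequence(seq):
    return seq, len(seq)  # placeholder "structure model" result

sequences = ["MKTAYIAKQR", "MSDNE", "MALWMRLLPL"]

# Task-level parallelism: independent tasks drawn from a shared queue,
# mirroring the pilot abstraction's late-binding scheduling.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = dict(pool.map(thread_sequence, sequences))

print(results["MSDNE"])  # -> 5
```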

  20. State of art of seismic design and seismic hazard analysis for oil and gas pipeline system

    Science.gov (United States)

    Liu, Aiwen; Chen, Kun; Wu, Jian

    2010-06-01

    The purpose of this paper is to adopt the uniform confidence method in both water pipeline design and oil-gas pipeline design. Based on the importance of a pipeline and the consequences of its failure, oil and gas pipelines can be classified into three pipe classes, with exceedance probabilities over 50 years of 2%, 5% and 10%, respectively. Performance-based design requires more information about ground motion, which should be obtained by evaluating seismic safety at the pipeline engineering site. Different from a city's water pipeline network, a long-distance oil and gas pipeline system is a spatially linearly distributed system. For uniform confidence in seismic safety, a long-distance oil and gas pipeline formed of pump stations and different-class pipe segments should be considered as a whole system when analyzing seismic risk. Considering the uncertainty of earthquake magnitude, design-basis fault displacements corresponding to the different pipeline classes are proposed to improve deterministic seismic hazard analysis (DSHA). A new empirical relationship between the maximum fault displacement and the surface-wave magnitude is obtained with the supplemented earthquake data in East Asia. The estimation of fault displacement for a refined oil pipeline in the Wenchuan MS 8.0 earthquake is introduced as an example in this paper.
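Empirical displacement-magnitude relationships of this kind are typically log-linear. A sketch of the general form, using the Wells & Coppersmith (1994) global coefficients for maximum displacement as stand-ins, since the paper's own East Asia fit for surface-wave magnitude is not reproduced here:

```python
def max_fault_displacement(mag, a=-5.46, b=0.82):
    """Log-linear scaling log10(MD) = a + b * M, returning MD in metres.

    Defaults are the Wells & Coppersmith (1994) all-slip-type fit for
    maximum displacement vs. moment magnitude, used only to illustrate
    the functional form; the paper derives different coefficients.
    """
    return 10 ** (a + b * mag)

# Illustrative evaluation at magnitude 8.0 (roughly 12.6 m here).
print(round(max_fault_displacement(8.0), 2))
```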

  1. Challenges in the development of market-based pipeline investments

    International Nuclear Information System (INIS)

    Von Bassenheim, G.; Mohitpour, M.; Klaudt, D.; Jenkins, A.

    2000-01-01

    The challenges, risks and uncertainties that the natural gas industry faces in developing market-based pipeline projects were discussed. Market-based pipeline investments are fundamentally different from user-driven projects. Market-based projects involve finding enough energy users and linking them with a pipeline infrastructure to viable supplies of natural gas. Each unique project is developed individually and requires a strong corporate vision and support before it can be successfully implemented. The three phases of a pipeline investment include the business development phase, the project development phase, and the implementation/operations phase. Market-based companies will need a clear vision of long-term goals and the desire to succeed. The company will have to prepare a detailed strategy and policies that clearly define geographic areas of operations, risk tolerance, availability of capital and expected project performance. 3 refs., 3 tabs., 2 figs

  2. Sustainability of social-environmental programs along pipelines

    Energy Technology Data Exchange (ETDEWEB)

    Doebereiner, Christian [Shell Southern Cone Gas and Power (Brazil); Herrera, Brigitte [Transredes S.A. (Bolivia)

    2005-07-01

    The sustainability of social and environmental programs along pipelines has proven to be a major challenge. Gas pipelines in Bolivia and Brazil operate in a diversity of environments and communities with different cultures, values and expectations. However, the pipeline network can also provide opportunities for contributing to regional development and working with local populations on topics of mutual interest. Many of these are quite strategic, because they arise from topics of mutual interest for both the company and neighboring populations, and because they provide opportunities for achieving results of mutual benefit. These opportunities could include helping to make gas available to local communities, contributions to urban planning, hiring local services and other initiatives. Sustainable and integrated social and environmental programs are therefore key to a successful pipeline operation. These opportunities are often missed or undervalued. Some successful examples are presented from Transredes S.A., Bolivia. (author)

  3. Sensor network architectures for monitoring underwater pipelines.

    Science.gov (United States)

    Mohamed, Nader; Jawhar, Imad; Al-Jaroodi, Jameela; Zhang, Liren

    2011-01-01

    This paper develops and compares different sensor network architecture designs that can be used for monitoring underwater pipeline infrastructures. These architectures are underwater wired sensor networks, underwater acoustic wireless sensor networks, RF (radio frequency) wireless sensor networks, integrated wired/acoustic wireless sensor networks, and integrated wired/RF wireless sensor networks. The paper also discusses the reliability challenges and enhancement approaches for these network architectures. The reliability evaluation, characteristics, advantages, and disadvantages among these architectures are discussed and compared. Three reliability factors are used for the discussion and comparison: the network connectivity, the continuity of power supply for the network, and the physical network security. In addition, the paper also develops and evaluates a hierarchical sensor network framework for underwater pipeline monitoring.
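The three reliability factors named above can be combined in a simple series-system model: monitoring works only if connectivity, power supply, and physical security all hold. A sketch with invented component probabilities (the paper's actual evaluation model and numbers are not reproduced here):

```python
# Series-system reliability: the product of independent component
# reliabilities. All probabilities below are illustrative only.
def series_reliability(*component_reliabilities):
    r = 1.0
    for c in component_reliabilities:
        r *= c
    return r

# (connectivity, power continuity, physical security) per architecture.
wired = series_reliability(0.99, 0.95, 0.90)
acoustic_wireless = series_reliability(0.90, 0.99, 0.99)
print(round(wired, 4), round(acoustic_wireless, 4))
```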

  4. Key Design Properties for Shipping Information Pipeline

    DEFF Research Database (Denmark)

    Jensen, Thomas; Tan, Yao-Hua

    2015-01-01

    This paper reports on the use of key design properties for development of a new approach towards a solution for sharing shipping information in the supply chain for international trade. Information exchange in international supply chains is extremely inefficient, rather uncoordinated, based largely on paper, e-mail, phone and text message, and far too costly. This paper explores the design properties for a shared information infrastructure to exchange information between all parties in the supply chain, commercial parties as well as authorities, which is called a Shipping Information Pipeline … Infrastructures. The paper argues why the previous attempts are inadequate to address the issues in the domain of international supply chains. Instead, a different set of key design properties are proposed for the Shipping Information Pipeline. The solution has been developed in collaboration with a network …

  5. Annotation-Based Whole Genomic Prediction and Selection

    DEFF Research Database (Denmark)

    Kadarmideen, Haja; Do, Duy Ngoc; Janss, Luc

    Genomic selection is widely used in both animal and plant species; however, it is performed with no input from the known genomic or biological role of genetic variants, and is therefore a black-box approach in the genomic era. This study investigated the role of different genomic regions and detected QTLs … in their contribution to estimated genomic variances and in prediction of genomic breeding values by applying SNP annotation approaches to feed efficiency. Ensembl Variant Predictor (EVP) and the Pig QTL database were used as the source of genomic annotation for the 60K chip. Genomic prediction was performed using the Bayes … classes. Predictive accuracy was 0.531, 0.532, 0.302, and 0.344 for DFI, RFI, ADG and BF, respectively. The contribution per SNP to total genomic variance was similar among annotated classes across different traits. Predictive performance of SNP classes did not significantly differ from randomized SNP …

  6. MicroScope: a platform for microbial genome annotation and comparative genomics.

    Science.gov (United States)

    Vallenet, D; Engelen, S; Mornico, D; Cruveiller, S; Fleury, L; Lajus, A; Rouy, Z; Roche, D; Salvignol, G; Scarpelli, C; Médigue, C

    2009-01-01

    The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope's rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of

  7. AnnoLnc: a web server for systematically annotating novel human lncRNAs.

    Science.gov (United States)

    Hou, Mei; Tang, Xing; Tian, Feng; Shi, Fangyuan; Liu, Fenglin; Gao, Ge

    2016-11-16

    Long noncoding RNAs (lncRNAs) have been shown to play essential roles in almost every important biological process through multiple mechanisms. Although the repertoire of human lncRNAs has rapidly expanded, their biological function and regulation remain largely elusive, calling for a systematic and integrative annotation tool. Here we present AnnoLnc ( http://annolnc.cbi.pku.edu.cn ), a one-stop portal for systematically annotating novel human lncRNAs. Based on more than 700 data sources and various tool chains, AnnoLnc enables a systematic annotation covering genomic location, secondary structure, expression patterns, transcriptional regulation, miRNA interaction, protein interaction, genetic association and evolution. An intuitive web interface is available for interactive analysis through both desktops and mobile devices, and programmers can further integrate AnnoLnc into their pipeline through standard JSON-based Web Service APIs. To the best of our knowledge, AnnoLnc is the only web server to provide on-the-fly and systematic annotation for newly identified human lncRNAs. Compared with similar tools, the annotation generated by AnnoLnc covers a much wider spectrum with intuitive visualization. Case studies demonstrate the power of AnnoLnc in not only rediscovering known functions of human lncRNAs but also inspiring novel hypotheses.
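The JSON-based Web Service integration mentioned above amounts to fetching and parsing a structured annotation record. A minimal parsing sketch (the payload shape and field names here are hypothetical, not AnnoLnc's documented schema):

```python
import json

# Hypothetical response with the general shape a REST annotation
# service might return for one lncRNA query.
response_text = """
{"id": "lnc_demo_001",
 "annotations": {"expression": "brain-enriched",
                 "secondary_structure": "predicted"}}
"""

# Decode the JSON payload into nested dicts for downstream pipeline use.
record = json.loads(response_text)
print(record["annotations"]["expression"])
```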

  8. Dictionary-driven protein annotation.

    Science.gov (United States)

    Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

    2002-09-01

    Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were

  9. Automatic annotation of lecture videos for multimedia driven pedagogical platforms

    Directory of Open Access Journals (Sweden)

    Ali Shariq Imran

    2016-12-01

    Today’s eLearning websites are heavily loaded with multimedia contents, which are often unstructured, unedited, unsynchronized, and lack inter-links among different multimedia components. Hyperlinking different media modalities may provide a solution for quick navigation and easy retrieval of pedagogical content in media-driven eLearning websites. In addition, finding meta-data information to describe and annotate media content in eLearning platforms is a challenging, laborious, error-prone, and time-consuming task. Thus annotations for multimedia, especially of lecture videos, became an important part of video learning objects. To address this issue, this paper proposes three major contributions, namely automated video annotation, the 3-Dimensional (3D) tag cloud, and the hyper interactive presenter (HIP) eLearning platform. Combining existing state-of-the-art SIFT together with tag clouds, a novel approach for automatic lecture video annotation for the HIP is proposed. New video annotations are implemented automatically, providing the needed random access in lecture videos within the platform, and a 3D tag cloud is proposed as a new user interaction mechanism. A preliminary study of the usefulness of the system has been carried out, and the initial results suggest that 70% of the students opted for using HIP as their preferred eLearning platform at Gjøvik University College (GUC).

  10. Phenex: ontological annotation of phenotypic diversity.

    Directory of Open Access Journals (Sweden)

    James P Balhoff

    2010-05-01

    Phenotypic differences among species have long been systematically itemized and described by biologists in the process of investigating phylogenetic relationships and trait evolution. Traditionally, these descriptions have been expressed in natural language within the context of individual journal publications or monographs. As such, this rich store of phenotype data has been largely unavailable for statistical and computational comparisons across studies or integration with other biological knowledge. Here we describe Phenex, a platform-independent desktop application designed to facilitate efficient and consistent annotation of phenotypic similarities and differences using Entity-Quality syntax, drawing on terms from community ontologies for anatomical entities, phenotypic qualities, and taxonomic names. Phenex can be configured to load only those ontologies pertinent to a taxonomic group of interest. The graphical user interface was optimized for evolutionary biologists accustomed to working with lists of taxa, characters, character states, and character-by-taxon matrices. Annotation of phenotypic data using ontologies and globally unique taxonomic identifiers will allow biologists to integrate phenotypic data from different organisms and studies, leveraging decades of work in systematics and comparative morphology.

  11. Training nuclei detection algorithms with simple annotations

    Directory of Open Access Journals (Sweden)

    Henning Kost

    2017-01-01

    Background: Generating good training datasets is essential for machine learning-based nuclei detection methods. However, creating exhaustive nuclei contour annotations, to derive optimal training data from, is often infeasible. Methods: We compared different approaches for training nuclei detection methods solely based on nucleus center markers. Such markers contain less accurate information, especially with regard to nuclear boundaries, but can be produced much easier and in greater quantities. The approaches use different automated sample extraction methods to derive image positions and class labels from nucleus center markers. In addition, the approaches use different automated sample selection methods to improve the detection quality of the classification algorithm and reduce the run time of the training process. We evaluated the approaches based on a previously published generic nuclei detection algorithm and a set of Ki-67-stained breast cancer images. Results: A Voronoi tessellation-based sample extraction method produced the best performing training sets. However, subsampling of the extracted training samples was crucial. Even simple class balancing improved the detection quality considerably. The incorporation of active learning led to a further increase in detection quality. Conclusions: With appropriate sample extraction and selection methods, nuclei detection algorithms trained on the basis of simple center marker annotations can produce comparable quality to algorithms trained on conventionally created training sets.
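The Voronoi tessellation-based extraction mentioned in Results amounts to assigning every pixel to its nearest nucleus center marker, yielding per-nucleus regions to sample from. A toy sketch (grid size and centers are invented; the published method operates on real image data):

```python
# Nucleus center markers as (x, y) pixel coordinates, invented here.
centers = [(2, 2), (7, 6)]

def nearest_center(x, y):
    """Index of the closest center marker: the pixel's Voronoi cell."""
    return min(range(len(centers)),
               key=lambda i: (centers[i][0] - x) ** 2 + (centers[i][1] - y) ** 2)

# Label every pixel of a 10x8 toy image with its Voronoi cell index.
label_map = [[nearest_center(x, y) for x in range(10)] for y in range(8)]
print(label_map[2][2], label_map[6][7])  # the centers label themselves
```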

  12. Metafier - a Tool for Annotating and Structuring Building Metadata

    DEFF Research Database (Denmark)

    Holmegaard, Emil; Johansen, Aslak; Kjærgaard, Mikkel Baun

    2017-01-01

    , describing the instrumentation of the building. We have created Metafier, a tool for annotating and structuring metadata for buildings. Metafier optimizes the workflow of establishing metadata for buildings by enabling a human-in-the-loop to validate, search and group points. We have evaluated Metafier...... for two buildings, with different sizes, locations, ages and purposes. The evaluation was performed as a user test with three subjects with different backgrounds. The evaluation results indicates that the tool enabled the users to validate, search and group points while annotating metadata. One challenge...... is to get users to understand the concept of metadata for the tool to be useable. Based on our evaluation, we have listed guidelines for creating a tool for annotating building metadata....

  13. Assembly, Annotation, and Analysis of Multiple Mycorrhizal Fungal Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Initiative Consortium, Mycorrhizal Genomics; Kuo, Alan; Grigoriev, Igor; Kohler, Annegret; Martin, Francis

    2013-03-08

    Mycorrhizal fungi play critical roles in host plant health, soil community structure and chemistry, and carbon and nutrient cycling, all areas of intense interest to the US Dept. of Energy (DOE) Joint Genome Institute (JGI). To this end we are building on our earlier sequencing of the Laccaria bicolor genome by partnering with INRA-Nancy and the mycorrhizal research community in the MGI to sequence and analyze dozens of mycorrhizal genomes of all Basidiomycota and Ascomycota orders and multiple ecological types (ericoid, orchid, and ectomycorrhizal). JGI has developed and deployed high-throughput sequencing techniques, and Assembly, RNASeq, and Annotation Pipelines. In 2012 alone we sequenced, assembled, and annotated 12 draft or improved genomes of mycorrhizae, and predicted ~232,831 genes and ~15,011 multigene families. All of this data is publicly available on JGI MycoCosm (http://jgi.doe.gov/fungi/), which provides access to both the genome data and tools with which to analyze the data. Preliminary comparisons of the current total of 14 public mycorrhizal genomes suggest that 1) short secreted proteins potentially involved in symbiosis are more enriched in some orders than in others amongst the mycorrhizal Agaricomycetes, 2) there are wide ranges of numbers of genes involved in certain functional categories, such as signal transduction and post-translational modification, and 3) novel gene families are specific to some ecological types.

  14. Crowded country: Gentry pipeline faces infrastructure web

    Energy Technology Data Exchange (ETDEWEB)

    Jaremko, D.

    2003-12-01

    In order to support the growing potential production of leases, some Alberta oil and natural gas companies are obliged to pile their infrastructure on top of each other. To illustrate the overcrowding and the problems encountered by some of the companies, a case history describing the trials and tribulations of Gentry Resources, a Calgary-based junior company, is provided. The case history recalls that during the construction of a 11.2 km four-inch pipeline in the Princess/Tide Lake region to tie in their Princess Nisku gas well, the company had to cross 60 different existing pipelines and roads, including an abandoned rail bed. The greatest challenge in building the pipeline was getting the approval of each of many companies active in the region, not to mention the regulatory approvals which also included both environmental and historical assessments. Because of the combination of water and high carbon dioxide content of the gas, Gentry also had to install a dehydration plant to take water out of the gas train to avoid corrosion. Current production is about 2,000 boe/d; future growth in production is likely to be constrained because growth in gas production will strain the existing infrastructure, and space for adding more infrastructure is not available.

  15. Marky: a tool supporting annotation consistency in multi-user and iterative document annotation projects.

    Science.gov (United States)

    Pérez-Pérez, Martín; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Lourenço, Anália

    2015-02-01

    Document annotation is a key task in the development of Text Mining methods and applications. High quality annotated corpora are invaluable, but their preparation requires a considerable amount of resources and time. Although the existing annotation tools offer good user interaction interfaces to domain experts, project management and quality control abilities are still limited. Therefore, the current work introduces Marky, a new Web-based document annotation tool equipped to manage multi-user and iterative projects, and to evaluate annotation quality throughout the project life cycle. At the core, Marky is a Web application based on the open source CakePHP framework. User interface relies on HTML5 and CSS3 technologies. Rangy library assists in browser-independent implementation of common DOM range and selection tasks, and Ajax and JQuery technologies are used to enhance user-system interaction. Marky grants solid management of inter- and intra-annotator work. Most notably, its annotation tracking system supports systematic and on-demand agreement analysis and annotation amendment. Each annotator may work over documents as usual, but all the annotations made are saved by the tracking system and may be further compared. So, the project administrator is able to evaluate annotation consistency among annotators and across rounds of annotation, while annotators are able to reject or amend subsets of annotations made in previous rounds. As a side effect, the tracking system minimises resource and time consumption. Marky is a novel environment for managing multi-user and iterative document annotation projects. Compared to other tools, Marky offers a similar visually intuitive annotation experience while providing unique means to minimise annotation effort and enforce annotation quality, and therefore corpus consistency. Marky is freely available for non-commercial use at http://sing.ei.uvigo.es/marky. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
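The round-by-round agreement analysis that an annotation tracking system of this kind supports can be illustrated with a minimal, self-contained sketch. The Cohen's kappa computation below is a generic illustration of inter-annotator agreement, not part of Marky's actual API; the function names and entity labels are invented:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labelled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if the two annotators labelled independently at random.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Two annotators tagging the same five text spans with entity types.
round1_a = ["GENE", "GENE", "DISEASE", "NONE", "GENE"]
round1_b = ["GENE", "DISEASE", "DISEASE", "NONE", "GENE"]
kappa = cohens_kappa(round1_a, round1_b)
```

A project administrator could compute such a statistic per annotation round to decide which annotation subsets need amendment.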

  16. Tools for the annotation of diachronic corpora [Werkzeuge zur Annotation diachroner Korpora]

    OpenAIRE

    Burghardt, Manuel; Wolff, Christian

    2009-01-01

    We first discuss the problems of (syntactic) annotation of diachronic corpora and then present an evaluation study in which more than 50 annotation tools and frameworks were assessed against a functional and software-ergonomic requirements profile based on the quality models of ISO/IEC 9126-1:2001 (Software engineering – Product quality – Part 1: Quality model) and ISO/IEC 25000:2005 (Software Engineering – Software product Quality Requirements and Evaluat...

  17. The restoration of dilapidated pipelines using compressed plastic pipes

    Directory of Open Access Journals (Sweden)

    Orlov Vladimir Aleksandrovich

    2014-02-01

    Full Text Available The article provides information on a promising trenchless repair technology named Swagelining, in which a new polymer pipe is pulled into the old pipeline after preliminary thermo-mechanical compression and is then allowed to straighten. The coauthors present the results of calculations determining the thickness of the polyethylene pipes after compression and straightening in the old pipeline, depending on the initial diameter for different ratios of diameter to wall thickness (SDR), and the dynamics of the changes in hydraulic performance after repair work on the pipeline using the Swagelining method. The concept of the energy saving potential, and its magnitude, is formulated in addition to the no-dig repair of pressure piping and water supply systems. On the basis of the research results, the authors formulate the principles of the energy efficiency potential after implementation of the trenchless technology of drawing new polymer pipes into the old pipeline with their preliminary thermo-mechanical compression and subsequent area enlargement. The Swagelining technology is described and the authors develop a mathematical model that illustrates the behavior of the pipeline during the shrink operations. The parameters analyzed include the change of the pipeline diameter under thermo-mechanical compression, the hydraulic parameters of the new (polymer) and old (steel) pipelines, and the energy savings per meter of pipeline length. The calculated values of the electric power economy over the whole length of the repaired pipeline section are given for a corresponding flow of transported water. The characteristics and capabilities of the Swagelining trenchless renovation technology allow achieving simultaneously the effects of resource saving (elimination of defects and, as a consequence, of water leakage) and energy saving (reduction in the water transportation cost). A numerical example of the old steel pipeline renovation shows the

  18. Energy geopolitics and Iran-Pakistan-India gas pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Verma, Shiv Kumar [Political Geography Division, Center for International Politics, Organization and Disarmament, School of International Studies, Jawaharlal Nehru University, New Delhi 110067 (India)]. E-mail: vermajnu@gmail.com

    2007-06-15

    With the growing energy demands in India and its neighboring countries, the Iran-Pakistan-India (IPI) gas pipeline assumes special significance. Energy-deficient countries such as India, China, and Pakistan are vying to acquire gas fields in different parts of the world. This has led to two conspicuous developments: first, they are competing against each other and secondly, a situation is emerging where they might have to confront the US and the western countries in the near future in their attempt to control energy bases. The proposed IPI pipeline is an attempt to acquire such a base. However, Pakistan is playing its own game to maximize its leverage. Pakistan, which refuses to establish even normal trading ties with India, craves to earn hundreds of millions of dollars in transit fees and other annual royalties from a gas pipeline which runs from Iran's South Pars fields to Barmer in western India. Pakistan promises to subsidize its gas imports from Iran and thus also become a major forex earner. It is willing to give pipeline related 'international guarantees' notwithstanding its record of covert actions in breach of international law (such as the export of terrorism) and its reluctance to reciprocally provide India what World Trade Organization (WTO) rules obligate it to do: Most Favored Nation (MFN) status. India is looking at the possibility of using some set of norms for securing gas supply through pipeline as the European Union has already initiated a discussion on the issue. The key point that is relevant to India's plan to build a pipeline to source gas from Iran relates to national treatment for pipeline. Under the principle of national treatment, which also figures in relation to foreign direct investment (FDI), the country through which a pipeline transits should provide the same level of security to the transiting pipeline as it would have provided to its domestic pipelines. This paper will endeavor to analyze, first, the significance of this

  19. AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Giltae Song

    Full Text Available The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.
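The classification of genes by their presence or absence across strains can be sketched generically. This is a simplified illustration of the idea, not the AGAPE pipeline itself; the strain and gene identifiers are invented:

```python
def classify_pan_genome(gene_presence):
    """Split genes into core (present in every strain), accessory, and strain-unique."""
    strains = set(gene_presence)
    gene_to_strains = {}
    for strain, genes in gene_presence.items():
        for gene in genes:
            gene_to_strains.setdefault(gene, set()).add(strain)
    core = {g for g, s in gene_to_strains.items() if s == strains}
    unique = {g for g, s in gene_to_strains.items() if len(s) == 1}
    accessory = set(gene_to_strains) - core - unique
    return core, accessory, unique

# Hypothetical gene presence calls for three strains.
presence = {
    "S288C": {"YFL001", "YFL002", "YFL003", "YONLY"},
    "SK1":   {"YFL001", "YFL002", "YNEW1"},
    "Sigma": {"YFL001", "YFL003", "YNEW1"},
}
core, accessory, unique = classify_pan_genome(presence)
```

Each resulting group could then be characterized against known functional and phenotypic features, as the abstract describes.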

  20. Pipeline coating inspection in Mexico applying surface electromagnetic technology

    Energy Technology Data Exchange (ETDEWEB)

    Delgado, O.; Mousatov, A.; Nakamura, E.; Villarreal, J.M. [Instituto Mexicano del Petroleo (IMP), Mexico City (Mexico); Shevnin, V. [Moscow State University (Russian Federation); Cano, B. [Petroleos Mexicanos (PEMEX), Mexico City (Mexico)

    2009-07-01

    The main problems in the pipeline systems in Mexico include: extremely aggressive soil characterized by a high clay content and low resistivity; interconnection between several pipes, including electrical contacts of active pipelines with out-of-service pipes; and short distances between pipes in comparison with their depths, which reduce the resolution of coating inspection. The results presented in this work show the efficiency of the Surface Electromagnetic Pipeline Inspection (SEMPI) technology in determining the technical condition of pipelines in the situations mentioned above. The SEMPI technology includes two stages: regional and detailed measurements. The regional stage consists of magnetic field measurements along the pipeline using large distances (10 - 100 m) between observation points to delimit zones with damaged coating. For quantitative assessment of the leakage and coating resistances along the pipeline, additional voltage and soil resistivity measurements are performed. The second stage includes detailed measurements of the electric field on the pipe intervals with anomalous technical conditions identified in the regional stage. Based on the distribution of the coating electric resistance and the subsoil resistivity values, the zones with different grades of coating quality and soil aggressiveness are delimited. (author)

  1. sRNAnalyzer—a flexible and customizable small RNA sequencing data analysis pipeline

    Science.gov (United States)

    Kim, Taek-Kyun; Baxter, David; Scherler, Kelsey; Gordon, Aaron; Fong, Olivia; Etheridge, Alton; Galas, David J.

    2017-01-01

    Although many tools have been developed to analyze small RNA sequencing (sRNA-Seq) data, it remains challenging to accurately analyze the small RNA population, mainly due to multiple sequence ID assignment caused by short read length. Additional issues in small RNA analysis include low consistency of microRNA (miRNA) measurement results across different platforms, miRNA mapping associated with miRNA sequence variation (isomiR) and RNA editing, and the origin of those unmapped reads after screening against all endogenous reference sequence databases. To address these issues, we built a comprehensive and customizable sRNA-Seq data analysis pipeline—sRNAnalyzer, which enables: (i) comprehensive miRNA profiling strategies to better handle isomiRs and summarization based on each nucleotide position to detect potential SNPs in miRNAs, (ii) different sequence mapping result assignment approaches to simulate results from microarray/qRT-PCR platforms and a local probabilistic model to assign mapping results to the most-likely IDs, (iii) comprehensive ribosomal RNA filtering for accurate mapping of exogenous RNAs and summarization based on taxonomy annotation. We evaluated our pipeline on both artificial samples (including synthetic miRNA and Escherichia coli cultures) and biological samples (human tissue and plasma). sRNAnalyzer is implemented in Perl and available at: http://srnanalyzer.systemsbiology.net/. PMID:29069500

  2. Statistical algorithms for ontology-based annotation of scientific literature.

    Science.gov (United States)

    Chakrabarti, Chayan; Jones, Thomas B; Luger, George F; Xu, Jiawei F; Turner, Matthew D; Laird, Angela R; Turner, Jessica A

    2014-01-01

    Ontologies encode relationships within a domain in robust data structures that can be used to annotate data objects, including scientific papers, in ways that ease tasks such as search and meta-analysis. However, the annotation process requires significant time and effort when performed by humans. Text mining algorithms can facilitate this process, but they render an analysis mainly based upon keyword, synonym and semantic matching. They do not leverage information embedded in an ontology's structure. We present a probabilistic framework that facilitates the automatic annotation of literature by indirectly modeling the restrictions among the different classes in the ontology. Our research focuses on annotating human functional neuroimaging literature within the Cognitive Paradigm Ontology (CogPO). We use an approach that combines the stochastic simplicity of naïve Bayes with the formal transparency of decision trees. Our data structure is easily modifiable to reflect changing domain knowledge. We compare our results across naïve Bayes, Bayesian Decision Trees, and Constrained Decision Tree classifiers that keep a human expert in the loop, in terms of the quality measure of the F1-micro score. Unlike traditional text mining algorithms, our framework can model the knowledge encoded by the dependencies in an ontology, albeit indirectly. We successfully exploit the fact that CogPO has explicitly stated restrictions, and implicit dependencies in the form of patterns in the expert curated annotations.
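A toy version of the naïve Bayes component can make the approach concrete. The sketch below is not the authors' implementation; the paradigm labels and tokens are invented for illustration:

```python
import math
from collections import defaultdict

def train_nb(docs):
    """docs: list of (tokens, label). Returns class counts, token counts, vocabulary."""
    label_counts = defaultdict(int)
    token_counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        for t in tokens:
            token_counts[label][t] += 1
            vocab.add(t)
    return label_counts, token_counts, vocab

def predict_nb(tokens, label_counts, token_counts, vocab):
    """Pick the label maximizing log P(label) + sum log P(token|label)."""
    total = sum(label_counts.values())
    best, best_lp = None, float("-inf")
    for label, count in label_counts.items():
        lp = math.log(count / total)
        denom = sum(token_counts[label].values()) + len(vocab)
        for t in tokens:
            # Laplace (add-one) smoothing for unseen tokens.
            lp += math.log((token_counts[label][t] + 1) / denom)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

train = [
    (["stroop", "color", "word"], "StroopParadigm"),
    (["nback", "working", "memory"], "NBackParadigm"),
]
lc, tc, vocab = train_nb(train)
label = predict_nb(["stroop", "word"], lc, tc, vocab)
```

The paper's contribution layers ontology-derived restrictions on top of this kind of base classifier; the sketch shows only the probabilistic core.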

  3. Regulating access of oil production and refining objects to pipeline networks: Russian and German experience

    OpenAIRE

    GULIYEV I.; LITVINYUK I.; ZINCHENKO O.

    2016-01-01

    In Russia the right to access oil and oil product pipeline networks is governed by a vast number of laws and subordinate legislation. On one hand, this imperative method is justified in order to ensure indiscriminate access to pipelines. On the other hand, international practices show that these objectives can also be achieved using a permissive legal regulation method more broadly. Therefore it is pertinent to analyze and compare the practice of pipeline access regulation in different states...

  4. Environmental audit guidelines for pipelines

    International Nuclear Information System (INIS)

    1991-01-01

    Environmental auditing is a form of management control which provides an objective basis by which a company can measure the degree of compliance with environmental regulations. Other benefits of this type of auditing include improved environmental management, furthering communication on environmental issues of concern within the company, and provision of documentation on environmental diligence. A series of environmental audit guidelines for pipelines is presented in the form of lists of questions to be asked during an environmental audit followed by recommended actions in response to those questions. The questions are organized into seven main categories: environmental management and planning; operating procedures; spill prevention; management of wastes and hazardous materials; environmental monitoring; construction of pipelines; and pipeline abandonment, decommissioning and site reclamation

  5. Emergency preparedness of OSBRA Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Magalhaes, Milton P.; Torres, Carlos A.R.; Almeida, Francisco J.C. [TRANSPETRO, Rio de Janeiro, RJ (Brazil)

    2009-07-01

    This paper presents the experience of PETROBRAS Transporte S. A. - TRANSPETRO in the preparation for emergencies in the OSBRA pipeline, showing specific aspects and solutions developed. The company has a standardized approach to emergency management, based on risk analysis studies, a risk management plan and contingency plans. To cover almost 1,000 km of pipeline, the company avails itself of Emergency Response Centers and an Environmental Defense Center, located at strategic points. In order to achieve preparation, fire fighting training and oil leakage elimination training are provided. Additionally, simulation exercises are performed, following a schedule worked out according to specific criteria and guidelines. As a conclusion, a picture is presented of the evolution of the preparation for emergencies in the OSBRA System, which bears the enormous responsibility of transporting flammable products for almost 1,000 km of pipeline, crossing 40 municipalities, 3 states and the Federal District. (author)

  6. Semantic annotation in biomedicine: the current landscape.

    Science.gov (United States)

    Jovanović, Jelena; Bagheri, Ebrahim

    2017-09-22

    The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators. Over the last dozen years, the biomedical research community has invested significant efforts in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state-of-the-art biomedical semantic annotators, focusing particularly on general purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements of today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.
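The core of a dictionary-based semantic annotator can be sketched in a few lines. This greedy longest-match illustration is generic, not any specific annotator under review, and the concept IDs are invented placeholders, not real UMLS CUIs:

```python
def annotate(text, lexicon):
    """Greedy longest-match dictionary annotation over a whitespace tokenization."""
    tokens = text.lower().split()
    annotations = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first so "heart attack" beats "heart".
        for j in range(len(tokens), i, -1):
            phrase = " ".join(tokens[i:j])
            if phrase in lexicon:
                annotations.append((phrase, lexicon[phrase]))
                i = j
                break
        else:
            i += 1
    return annotations

# Toy mention-to-concept lexicon (concept IDs are invented).
lexicon = {"heart attack": "C:MI", "aspirin": "C:ASA"}
anns = annotate("Patient given aspirin after heart attack", lexicon)
```

Real annotators add normalization, abbreviation handling, and word-sense disambiguation on top of this matching core, which is what the reviewed tools differ on.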

  7. AISO: Annotation of Image Segments with Ontologies.

    Science.gov (United States)

    Lingutla, Nikhil Tej; Preece, Justin; Todorovic, Sinisa; Cooper, Laurel; Moore, Laura; Jaiswal, Pankaj

    2014-01-01

    Large quantities of digital images are now generated for biological collections, including those developed in projects premised on the high-throughput screening of genome-phenome experiments. These images often carry annotations on taxonomy and observable features, such as anatomical structures and phenotype variations often recorded in response to the environmental factors under which the organisms were sampled. At present, most of these annotations are described in free text, may involve limited use of non-standard vocabularies, and rarely specify precise coordinates of features on the image plane such that a computer vision algorithm could identify, extract and annotate them. Therefore, researchers and curators need a tool that can identify and demarcate features in an image plane and allow their annotation with semantically contextual ontology terms. Such a tool would generate data useful for inter- and intra-specific comparison and encourage the integration of curation standards. In the future, quality annotated image segments may provide training data sets for developing machine learning applications for automated image annotation. We developed a novel image segmentation and annotation software application, "Annotation of Image Segments with Ontologies" (AISO). The tool enables researchers and curators to delineate portions of an image into multiple highlighted segments and annotate them with an ontology-based controlled vocabulary. AISO is a freely available Java-based desktop application and runs on multiple platforms. It can be downloaded at http://www.plantontology.org/software/AISO. AISO enables curators and researchers to annotate digital images with ontology terms in a manner which ensures the future computational value of the annotated images. We foresee uses for such data-encoded image annotations in biological data mining, machine learning, predictive annotation, semantic inference, and comparative analyses.

  8. Computational algorithms to predict Gene Ontology annotations.

    Science.gov (United States)

    Pinoli, Pietro; Chicco, Davide; Masseroli, Marco

    2015-01-01

    Gene function annotations, which are associations between a gene and a term of a controlled vocabulary describing gene functional features, are of paramount importance in modern biology. Datasets of these annotations, such as the ones provided by the Gene Ontology Consortium, are used to design novel biological experiments and interpret their results. Despite their importance, these sources of information have some known issues. They are incomplete, since biological knowledge is far from being definitive and it rapidly evolves, and some erroneous annotations may be present. Since the curation process of novel annotations is a costly procedure, both in economical and time terms, computational tools that can reliably predict likely annotations, and thus quicken the discovery of new gene annotations, are very useful. We used a set of computational algorithms and weighting schemes to infer novel gene annotations from a set of known ones. We used the latent semantic analysis approach, implementing two popular algorithms (Latent Semantic Indexing and Probabilistic Latent Semantic Analysis) and propose a novel method, the Semantic IMproved Latent Semantic Analysis, which adds a clustering step on the set of considered genes. Furthermore, we propose the improvement of these algorithms by weighting the annotations in the input set. We tested our methods and their weighted variants on the Gene Ontology annotation sets of three model organisms (Bos taurus, Danio rerio and Drosophila melanogaster). The methods showed their ability in predicting novel gene annotations and the weighting procedures demonstrated to lead to a valuable improvement, although the obtained results vary according to the dimension of the input annotation set and the considered algorithm. Out of the three considered methods, the Semantic IMproved Latent Semantic Analysis is the one that provides better results. In particular, when coupled with a proper weighting policy, it is able to predict a
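Predicting likely new annotations from a set of known ones can be illustrated with a much-simplified, neighbor-based sketch. This is not the paper's LSI/pLSA implementation, and the gene and GO term identifiers are invented:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse annotation vectors (dicts)."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def predict_annotations(target, known, top_k=1):
    """Rank unseen terms for `target` by similarity-weighted votes of near genes."""
    sims = sorted(
        ((cosine(known[target], known[g]), g) for g in known if g != target),
        reverse=True,
    )[:top_k]
    scores = {}
    for sim, g in sims:
        for term in known[g]:
            if term not in known[target]:
                scores[term] = scores.get(term, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)

# Binary gene-to-GO-term profiles (identifiers invented for illustration).
known = {
    "geneA": {"GO:1": 1, "GO:2": 1},
    "geneB": {"GO:1": 1, "GO:2": 1, "GO:3": 1},
    "geneC": {"GO:4": 1},
}
suggested = predict_annotations("geneA", known)
```

The latent semantic methods in the paper go further by factorizing the full gene-term matrix, which lets similarity be computed in a reduced concept space rather than on raw annotation vectors.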

  9. Scaling-up NLP Pipelines to Process Large Corpora of Clinical Notes.

    Science.gov (United States)

    Divita, G; Carter, M; Redd, A; Zeng, Q; Gupta, K; Trautner, B; Samore, M; Gundlapalli, A

    2015-01-01

    This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". This paper describes the scale-up efforts at the VA Salt Lake City Health Care System to address processing large corpora of clinical notes through a natural language processing (NLP) pipeline. The use case described is a current project focused on detecting the presence of an indwelling urinary catheter in hospitalized patients and subsequent catheter-associated urinary tract infections. An NLP algorithm using v3NLP was developed to detect the presence of an indwelling urinary catheter in hospitalized patients. The algorithm was tested on a small corpus of notes on patients for whom the presence or absence of a catheter was already known (reference standard). In planning for a scale-up, we estimated that the original algorithm would have taken 2.4 days to run on a larger corpus of notes for this project (550,000 notes), and 27 days for a corpus of 6 million records representative of a national sample of notes. We approached scaling-up NLP pipelines through three techniques: pipeline replication via multi-threading, intra-annotator threading for tasks that can be further decomposed, and remote annotator services which enable annotator scale-out. The scale-up resulted in reducing the average time to process a record from 206 milliseconds to 17 milliseconds, a 12-fold increase in performance when applied to a corpus of 550,000 notes. Purposely simplistic in nature, these scale-up efforts are the straightforward evolution from small-scale NLP processing to larger-scale extraction without incurring the associated complexities that are inherited by the use of the underlying UIMA framework. These efforts represent generalizable and widely applicable techniques that will aid other computationally complex NLP pipelines that need to be scaled out for processing and analyzing big data.
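The first of the three techniques, pipeline replication via multi-threading, can be sketched generically: each worker runs a full copy of a stateless pipeline over its own share of the notes. This is not v3NLP or UIMA code, and the catheter-detection heuristic is a deliberately trivial stand-in:

```python
from concurrent.futures import ThreadPoolExecutor

def annotate_note(note):
    """Stand-in for one full NLP pipeline run: flag notes mentioning a catheter."""
    return {"id": note["id"], "catheter": "catheter" in note["text"].lower()}

def run_pipeline(notes, workers=4):
    # Pipeline replication: every thread applies the same stateless annotator,
    # so the corpus can be split across workers with no shared mutable state.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(annotate_note, notes))

notes = [
    {"id": 1, "text": "Indwelling urinary catheter placed on admission."},
    {"id": 2, "text": "No lines or drains present."},
]
results = run_pipeline(notes)
```

The same skeleton extends to the article's other two techniques by decomposing `annotate_note` into finer-grained threaded tasks, or by replacing it with a call to a remote annotator service.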

  10. A color image processing pipeline for digital microscope

    Science.gov (United States)

    Liu, Yan; Liu, Peng; Zhuang, Zhefeng; Chen, Enguo; Yu, Feihong

    2012-10-01

    Digital microscopes have found wide application in the fields of biology, medicine, etc. A digital microscope differs from a traditional optical microscope in that there is no need to observe the sample through an eyepiece directly, because the optical image is projected directly on the CCD/CMOS camera. However, because of the imaging difference between the human eye and the sensor, a color image processing pipeline is needed for the digital microscope electronic eyepiece to obtain a fine image. The color image pipeline for a digital microscope, comprising the procedures that convert the RAW image data captured by the sensor into a real color image, is of great concern to the quality of the microscopic image. The color pipeline for a digital microscope differs from that of digital still cameras and video cameras because of the specific requirements of microscopic images, which should have high dynamic range, keep the same color as the objects observed, and support a variety of image post-processing. In this paper, a new color image processing pipeline is proposed to satisfy the requirements of digital microscope images. The algorithm of each step in the color image processing pipeline is designed and optimized with the purpose of getting high quality images and accommodating diverse user preferences. With the proposed pipeline implemented on the digital microscope platform, the output color images meet the various analysis requirements of images in the medicine and biology fields very well. The major steps of the proposed color imaging pipeline are: black level adjustment, defective pixel removal, noise reduction, linearization, white balance, RGB color correction, tone scale correction and gamma correction.
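Two of the listed steps, white balance and gamma correction, can be sketched on toy 8-bit RGB data. The gains and gamma value below are illustrative assumptions, not the paper's calibrated parameters:

```python
def white_balance(pixels, gains):
    """Apply per-channel gains (e.g. from a gray-world or preset estimate)."""
    return [tuple(min(255, round(c * g)) for c, g in zip(p, gains)) for p in pixels]

def gamma_correct(pixels, gamma=2.2):
    """Standard display gamma encoding applied to 8-bit RGB values."""
    return [tuple(round(255 * (c / 255) ** (1 / gamma)) for c in p) for p in pixels]

raw = [(64, 128, 32), (200, 100, 50)]        # toy RGB pixels after demosaicing
balanced = white_balance(raw, gains=(1.2, 1.0, 1.5))
final = gamma_correct(balanced)
```

In a real pipeline these operations run per pixel on sensor arrays (and gamma is usually a lookup table for speed), but the per-channel arithmetic is the same.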

  11. Transformation pipelines for PROJ.4

    Science.gov (United States)

    Knudsen, Thomas; Evers, Kristian

    2017-04-01

    For more than 2 decades, PROJ.4 has been the globally leading map projection library for open source (and probably also closed source) geospatial software. While focusing on mathematically well defined 2D projections from geographical to planar coordinates, PROJ.4 has nevertheless, since its introduction in the 1980s, provided limited support for more general geodetic datum transformations, and has gradually introduced a higher degree of support for 3D coordinate data and reference systems. The support has, however, been implemented over a long period of time, as need became evident and opportunity was found, by a number of different people, with different needs and at different times. Hence, the PROJ.4 3D support has been the result of neither deep geodetic nor careful code-architectural considerations. This has resulted in a library that supports only a subset of commonly occurring geodetic transformations. To be more specific: it supports any datum shift that can be completed by a combination of two Helmert shifts (to and from a pivot datum) and, potentially, also a non-linear planar correction derived from interpolation in a correction grid. While this is sufficient for most small scale mapping activities, it is not at all sufficient for operational geodetic use, nor for many of the rapidly emerging high accuracy geospatial applications in agriculture, construction, transportation and utilities. To improve this situation, we have introduced a new framework for implementation of geodetic transformations, which will appear in the next release of the PROJ.4 library. Before describing the details, let us first remark that most cases of geodetic transformations can be expressed as a series of elementary operations, the output of one operation being the input of the next. E.g. when going from UTM zone 32, datum ED50, to UTM zone 32, datum ETRS89, one must, in the simplest case, go through 5 steps: Back-project the UTM coordinates to geographic coordinates
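The step-chaining idea, where the output of one elementary operation feeds the next, can be sketched as function composition. The sketch below is not PROJ.4 code, and the Helmert parameters are placeholders, not real ED50/ETRS89 values:

```python
def helmert_shift(dx, dy, dz):
    """A 3-parameter (translation-only) Helmert step; real pipelines also
    support 7-parameter shifts with rotation and scale."""
    def step(xyz):
        x, y, z = xyz
        return (x + dx, y + dy, z + dz)
    return step

def pipeline(*steps):
    """Chain elementary operations: each step's output is the next step's input."""
    def run(coord):
        for step in steps:
            coord = step(coord)
        return coord
    return run

# Shift into a pivot datum and out again (placeholder parameters).
to_pivot = helmert_shift(-87.0, -98.0, -121.0)
from_pivot = helmert_shift(0.1, 0.2, 0.3)
transform = pipeline(to_pivot, from_pivot)
result = transform((4000000.0, 600000.0, 4800000.0))
```

PROJ's actual framework expresses the same composition declaratively, as a `+proj=pipeline` string whose `+step` clauses name the elementary operations in order.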

  12. Modal testing of hydraulic pipeline systems

    Science.gov (United States)

    Mikota, Gudrun; Manhartsgruber, Bernhard; Kogler, Helmut; Hammerle, Franz

    2017-11-01

    Dynamic models of fluid power systems require accurate descriptions of hydraulic pipeline systems. For laminar flow in a rigid pipeline, modal approximations of transcendental transfer functions lead to a multi-degrees-of-freedom description. This suggests the application of experimental modal analysis to investigate fluid dynamics in hydraulic pipeline systems. The concept of modal testing is adapted accordingly and demonstrated for a straight pipeline, the same pipeline with a single side branch, and a pipeline system with three side branches. Frequency response functions are determined by injecting a defined flow rate excitation and measuring pressure responses along the pipelines. The underlying theory is confirmed by comparisons between calculated transcendental, measured, and estimated rational frequency response functions. Natural frequencies, damping ratios, and pressure mode shapes are identified. Although the experiments are made for low flow rates and stiff pipeline walls, they indicate the way to perform modal testing in practical applications of fluid power.

  13. Natural disasters and the gas pipeline system.

    Science.gov (United States)

    1996-11-01

    Episodic descriptions are provided of the effects of the Loma Prieta earthquake (1989) on the gas pipeline systems of Pacific Gas & Electric Company and the City of Palo Alto and of the Northridge earthquake (1994) on Southern California Gas' pipeline...

  14. Improved structural annotation of protein-coding genes in the Meloidogyne hapla genome using RNA-Seq

    Science.gov (United States)

    Guo, Yuelong; Bird, David McK; Nielsen, Dahlia M

    2014-01-01

    As high-throughput cDNA sequencing (RNA-Seq) is increasingly applied to hypothesis-driven biological studies, the prediction of protein-coding genes based on these data is usurping strictly in silico approaches. Compared with computationally derived gene predictions, structural annotation is more accurate when based on biological evidence, particularly RNA-Seq data. Here, we refine the current genome annotation for the Meloidogyne hapla genome utilizing RNA-Seq data. Published structural annotation defines 14 420 protein-coding genes in the M. hapla genome. Of these, 25% (3751) were found to exhibit some incongruence with RNA-Seq data. Manual annotation enabled these discrepancies to be resolved. Our analysis revealed 544 new gene models that were missing from the prior annotation. Additionally, 1457 transcribed regions were newly identified on the ends of as-yet-unjoined contigs. We also searched for trans-spliced leaders, and based on RNA-Seq data, identified genes that appear to be trans-spliced. Four 22-bp trans-spliced leaders were identified using our pipeline, including the known trans-spliced leader, which is the M. hapla ortholog of SL1. In silico predictions of trans-splicing were validated by comparison with earlier results derived from an independent cDNA library constructed to capture trans-spliced transcripts. The new annotation, which we term HapPep5, is publicly available at www.hapla.org. PMID:25254153

  15. Centrifuge modelling of lateral displacement of buried pipelines [Physical centrifuge modelling of lateral buckling of pipelines]

    Energy Technology Data Exchange (ETDEWEB)

    Oliveira, Jose Renato Moreira da Silva de; Almeida, Marcio de Souza Soares de; Marques, Maria Esther Soares; Almeida, Maria Cascao Ferreira de [Universidade Federal do Rio de Janeiro (UFRJ), RJ (Brazil). Coordenacao dos Programas de Pos-graduacao de Engenharia (COPPE); Costa, Alvaro Maia da [PETROBRAS, Rio de Janeiro, RJ (Brazil). Centro de Pesquisas (CENPES)

    2003-07-01

    This work discusses soil-structure interaction applied to the buckling phenomena of buried pipelines subjected to heated oil flow. A set of physical modelling tests on lateral buckling of pipelines buried in soft clay is presented using the COPPE/UFRJ geotechnical centrifuge. A 1:30 pipeline model was moved sideways through a soft clay layer during centrifuge flight, varying the burial depth, in order to simulate lateral buckling in a plane strain condition. The results show different behaviour concerning the horizontal and vertical forces measured at pipeline level due to soil reaction. (author)

  16. Annotated checklist of fungi in Cyprus Island. 1. Larger Basidiomycota

    Directory of Open Access Journals (Sweden)

    Miguel Torrejón

    2014-06-01

    An annotated checklist of wild fungi of Cyprus Island has been compiled, bringing together the information collected from the different works dealing with fungi in this area throughout the three centuries of mycology in Cyprus. This part contains 363 taxa of macroscopic Basidiomycota.

  17. Communication in a Diverse Classroom: An Annotated Bibliographic Review

    Science.gov (United States)

    Brown, Rachelle

    2016-01-01

    Students have social and personal needs to fulfill and communicate these needs in different ways. This annotated bibliographic review examined communication studies to provide educators of diverse classrooms with ideas to build an environment that contributes to student well-being. Participants in the studies ranged in age, ability, and cultural…

  18. House dust mites in Brazil - an annotated bibliography

    Directory of Open Access Journals (Sweden)

    Binotti Raquel S

    2001-01-01

    House dust mites have been reported to be the most important allergen in human dwellings. Several articles have already shown the presence of different mite species in homes in Brazil, with Pyroglyphidae, Glycyphagidae and Cheyletidae being the most important families found. This paper is an annotated bibliography that will lead to a better knowledge of the house dust mite fauna in Brazil.

  19. Protein annotation in the era of personal genomics

    DEFF Research Database (Denmark)

    Holberg Blicher, Thomas; Gupta, Ramneek; Wesolowska, Agata

    2010-01-01

    Protein annotation provides a condensed and systematic view on the function of individual proteins. It has traditionally dealt with sorting proteins into functional categories, which for example has proven to be successful for the comparison of different species. However, if we are to understand...

  20. Integrity Evaluation of Oil and Gas Pipelines

    Energy Technology Data Exchange (ETDEWEB)

    Choi, Jae Boong [Sungkyunkwan University, Suwon (Korea, Republic of)

    2001-02-15

    The length of oil and gas pipelines has increased considerably for economic and practical reasons, and pipeline construction in Asia and Europe has recently expanded across the region. The integrity of these pipelines must be managed because of their explosion and environmental pollution risks. This paper deals with the major defect types and the integrity evaluation methods used in developed countries, providing basic data for working out integrity evaluation countermeasures for domestic pipelines.

  1. California Natural Gas Pipelines: A Brief Guide

    Energy Technology Data Exchange (ETDEWEB)

    Neuscamman, Stephanie [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Price, Don [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pezzola, Genny [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Glascoe, Lee [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2013-01-22

    The purpose of this document is to familiarize the reader with the general configuration and operation of the natural gas pipelines in California and to discuss potential LLNL contributions that would support the Partnership for the 21st Century collaboration. First, pipeline infrastructure will be reviewed. Then, recent pipeline events will be examined. Selected current pipeline industry research will be summarized. Finally, industry acronyms are listed for reference.

  2. 76 FR 73570 - Pipeline Safety: Miscellaneous Changes to Pipeline Safety Regulations

    Science.gov (United States)

    2011-11-29

    ... DEPARTMENT OF TRANSPORTATION Pipeline and Hazardous Materials Safety Administration 49 CFR Parts 191, 192, 195 and 198 [Docket No. PHMSA-2010-0026] RIN 2137-AE59 Pipeline Safety: Miscellaneous Changes to Pipeline Safety Regulations AGENCY: Pipeline and Hazardous Materials Safety Administration...

  3. 76 FR 303 - Pipeline Safety: Safety of On-Shore Hazardous Liquid Pipelines

    Science.gov (United States)

    2011-01-04

    ... DEPARTMENT OF TRANSPORTATION Pipeline and Hazardous Materials Safety Administration 49 CFR Part 195 [Docket ID PHMSA-2010-0229] RIN 2137-AE66 Pipeline Safety: Safety of On-Shore Hazardous Liquid Pipelines AGENCY: Pipeline and Hazardous Materials Safety Administration (PHMSA), DOT. ACTION: Notice of...

  4. 75 FR 4134 - Pipeline Safety: Leak Detection on Hazardous Liquid Pipelines

    Science.gov (United States)

    2010-01-26

    .... PHMSA-2009-0421] Pipeline Safety: Leak Detection on Hazardous Liquid Pipelines AGENCY: Pipeline and... INFORMATION: Background Pipeline leak detection is one of the many layers of protection in PHMSA's approach to... of these interconnected layers of protections, including advances in leak detection systems. These...

  5. 75 FR 5244 - Pipeline Safety: Integrity Management Program for Gas Distribution Pipelines; Correction

    Science.gov (United States)

    2010-02-02

    ... DEPARTMENT OF TRANSPORTATION Pipeline and Hazardous Materials Safety Administration 49 CFR Part 192 [Docket No. PHMSA-RSPA-2004-19854; Amdt. 192-113] RIN 2137-AE15 Pipeline Safety: Integrity Management Program for Gas Distribution Pipelines; Correction AGENCY: Pipeline and Hazardous Materials Safety...

  6. Annotation of regular polysemy and underspecification

    DEFF Research Database (Denmark)

    Martínez Alonso, Héctor; Pedersen, Bolette Sandford; Bel, Núria

    2013-01-01

    We present the result of an annotation task on regular polysemy for a series of semantic classes or dot types in English, Danish and Spanish. This article describes the annotation process, the results in terms of inter-encoder agreement, and the sense distributions obtained with two methods: majority voting with a theory-compliant backoff strategy, and MACE, an unsupervised system to choose the most likely sense from all the annotations.
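The first of the two methods, majority voting with a backoff strategy, can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the sense labels and the choice of a "literal" sense as the backoff are assumptions for the example.

```python
# Majority voting over per-annotator sense labels, with a backoff for ties.
# The backoff sense ("literal") is an assumed stand-in for the paper's
# theory-compliant default; labels below are invented examples.
from collections import Counter

def majority_vote(labels, backoff="literal"):
    """Return the majority label; on a tie, back off to a default sense."""
    counts = Counter(labels)
    best, best_n = counts.most_common(1)[0]
    tied = [lab for lab, n in counts.items() if n == best_n]
    if len(tied) > 1:
        # Tie: prefer the designated backoff sense when it is among the tied labels.
        return backoff if backoff in tied else tied[0]
    return best

print(majority_vote(["literal", "metonymic", "literal"]))  # literal (clear majority)
print(majority_vote(["literal", "metonymic"]))             # literal (tie -> backoff)
```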

  7. CGKB: an annotation knowledge base for cowpea (Vigna unguiculata L.) methylation filtered genomic genespace sequences

    Directory of Open Access Journals (Sweden)

    Spraggins Thomas A

    2007-04-01

    potential domains on annotated GSS were analyzed using the HMMER package against the Pfam database. The annotated GSS were also assigned Gene Ontology annotation terms and integrated with 228 curated plant metabolic pathways from the Arabidopsis Information Resource (TAIR) knowledge base. The UniProtKB-Swiss-Prot ENZYME database was used to assign putative enzymatic function to each GSS. Each GSS was also analyzed with the Tandem Repeat Finder (TRF) program in order to identify potential SSRs for molecular marker discovery. The raw sequence data, processed annotation, and SSR results were stored in relational tables designed in key-value pair fashion using a PostgreSQL relational database management system. The biological knowledge derived from the sequence data and processed results is represented as views or materialized views in the relational database management system. All materialized views are indexed for quick data access and retrieval. Data processing and analysis pipelines were implemented using the Perl programming language. The web interface was implemented in JavaScript and Perl CGI running on an Apache web server. The CPU-intensive data processing and analysis pipelines were run on a computer cluster of more than 30 dual-processor Apple XServes. A job management system called Vela was created as a robust way to submit large numbers of jobs to the Portable Batch System (PBS). Conclusion: CGKB is an integrated and annotated resource for cowpea GSS with features of homology-based and HMM-based annotations, enzyme and pathway annotations, GO term annotation, toolkits, and a large number of other facilities to perform complex queries. The cowpea GSS, chloroplast sequences, mitochondrial sequences, retroelements, and SSR sequences are available as FASTA formatted files and downloadable at CGKB. This database and web interface are publicly accessible at http://cowpeagenomics.med.virginia.edu/CGKB/.
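The key-value storage pattern described above can be illustrated compactly. The sketch below uses SQLite as a stand-in for the PostgreSQL system the abstract names; the table, column, and identifier names are hypothetical.

```python
# Minimal sketch of key-value annotation storage in the style described for
# CGKB, using SQLite in place of PostgreSQL. All names are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE gss_annotation (
        gss_id TEXT NOT NULL,   -- genespace sequence identifier
        key    TEXT NOT NULL,   -- annotation type, e.g. 'pfam_domain'
        value  TEXT NOT NULL    -- annotation payload
    )
""")
conn.executemany(
    "INSERT INTO gss_annotation VALUES (?, ?, ?)",
    [
        ("GSS0001", "pfam_domain", "PF00069"),
        ("GSS0001", "go_term", "GO:0004672"),
        ("GSS0002", "ssr", "(AT)12"),
    ],
)

# A per-sequence "view" of its annotations, as CGKB exposes via database views:
result = conn.execute(
    "SELECT key, value FROM gss_annotation WHERE gss_id = ? ORDER BY key",
    ("GSS0001",),
).fetchall()
print(result)  # [('go_term', 'GO:0004672'), ('pfam_domain', 'PF00069')]
```

The appeal of the key-value layout is that new annotation types (a new tool's output, say) need no schema change: they are simply new `key` values.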

  8. BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments.

    Science.gov (United States)

    López-Fernández, H; Reboiro-Jato, M; Glez-Peña, D; Aparicio, F; Gachet, D; Buenaga, M; Fdez-Riverola, F

    2013-07-01

    Automatic term annotation from biomedical documents and external information linking are becoming a necessary prerequisite in modern computer-aided medical learning systems. In this context, this paper presents BioAnnote, a flexible and extensible open-source platform for automatically annotating biomedical resources. Apart from other valuable features, the software platform includes (i) a rich client enabling users to annotate multiple documents in a user-friendly environment, (ii) an extensible and embeddable annotation meta-server allowing for the annotation of documents with local or remote vocabularies and (iii) a simple client/server protocol which facilitates the use of the meta-server from any other third-party application. In addition, BioAnnote implements a powerful scripting engine able to perform advanced batch annotations. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  9. Pipeline bottoming cycle study. Final report

    Energy Technology Data Exchange (ETDEWEB)

    1980-06-01

    The technical and economic feasibility of applying bottoming cycles to the prime movers that drive the compressors of natural gas pipelines was studied. These bottoming cycles convert some of the waste heat from the exhaust gas of the prime movers into shaft power and conserve gas. Three typical compressor station sites were selected, each on a different pipeline. Although the prime movers were different, they were similar enough in exhaust gas flow rate and temperature that a single bottoming cycle system could be designed, with some modifications, for all three sites. Preliminary design included selection of the bottoming cycle working fluid, optimization of the cycle, and design of the components, such as turbine, vapor generator and condensers. Installation drawings were made and hardware and installation costs were estimated. The results of the economic assessment of retrofitting bottoming cycle systems on the three selected sites indicated that profitability was strongly dependent upon the site-specific installation costs, how the energy was used and the yearly utilization of the apparatus. The study indicated that bottoming cycles are a competitive investment alternative for certain applications in the pipeline industry. Bottoming cycles are technically feasible. It was concluded that proper design and operating practices would reduce the environmental and safety hazards to acceptable levels. The amount of gas that could be saved through the year 2000 by the adoption of bottoming cycles was estimated, for two different supply projections, at from 0.296 trillion ft³ for a low supply projection to 0.734 trillion ft³ for a high supply projection. The potential market for bottoming cycle equipment for the two supply projections varied from 170 to 500 units of varying size. Finally, a demonstration program plan was developed.

  10. Coreference annotation and resolution in the Colorado Richly Annotated Full Text (CRAFT) corpus of biomedical journal articles.

    Science.gov (United States)

    Cohen, K Bretonnel; Lanfranchi, Arrick; Choi, Miji Joo-Young; Bada, Michael; Baumgartner, William A; Panteleyeva, Natalya; Verspoor, Karin; Palmer, Martha; Hunter, Lawrence E

    2017-08-17

    Coreference resolution is the task of finding strings in text that have the same referent as other strings. Failures of coreference resolution are a common cause of false negatives in information extraction from the scientific literature. In order to better understand the nature of the phenomenon of coreference in biomedical publications and to increase performance on the task, we annotated the Colorado Richly Annotated Full Text (CRAFT) corpus with coreference relations. The corpus was manually annotated with coreference relations, including identity and appositives for all coreferring base noun phrases. The OntoNotes annotation guidelines, with minor adaptations, were used. Interannotator agreement ranges from 0.480 (entity-based CEAF) to 0.858 (Class-B3), depending on the metric that is used to assess it. The resulting corpus adds nearly 30,000 annotations to the previous release of the CRAFT corpus. Differences from related projects include a much broader definition of markables, connection to extensive annotation of several domain-relevant semantic classes, and connection to complete syntactic annotation. Tool performance was benchmarked on the data. A publicly available out-of-the-box, general-domain coreference resolution system achieved an F-measure of 0.14 (B3), while a simple domain-adapted rule-based system achieved an F-measure of 0.42. An ensemble of the two reached an F-measure of 0.46. Following the IDENTITY chains in the data would add 106,263 additional named entities in the full 97-paper corpus, for an increase of 76% in the semantic classes of the eight ontologies that have been annotated in earlier versions of the CRAFT corpus. The project produced a large data set for further investigation of coreference and coreference resolution in the scientific literature. The work raised issues in the phenomenon of reference in this domain and genre, and the paper proposes that many mentions that would be considered generic in the general domain are not

  11. Incorporating Non-Coding Annotations into Rare Variant Analysis.

    Directory of Open Access Journals (Sweden)

    Tom G Richardson

    The success of collapsing methods which investigate the combined effect of rare variants on complex traits has so far been limited. The manner in which variants within a gene are selected prior to analysis has a crucial impact on this success, which has resulted in analyses conventionally filtering variants according to their consequence. This study investigates whether an alternative approach to filtering, using annotations from recently developed bioinformatics tools, can aid these types of analyses in comparison to conventional approaches. We conducted a candidate gene analysis using the UK10K sequence and lipids data, filtering according to functional annotations using the resource CADD (Combined Annotation-Dependent Depletion) and contrasting results with 'nonsynonymous' and 'loss of function' consequence analyses. Using CADD allowed the inclusion of potentially deleterious intronic variants, which was not possible when filtering by consequence. Overall, different filtering approaches provided similar evidence of association, although filtering according to CADD identified evidence of association between ANGPTL4 and High Density Lipoproteins (P = 0.02, N = 3,210) which was not observed in the other analyses. We also undertook genome-wide analyses to determine how filtering in this manner compared to conventional approaches for gene regions. Results suggested that filtering by annotations according to CADD, as well as other tools known as FATHMM-MKL and DANN, identified association signals not detected when filtering by variant consequence and vice versa. Incorporating variant annotations from non-coding bioinformatics tools should prove to be a valuable asset for rare variant analyses in the future. Filtering by variant consequence is only possible in coding regions of the genome, whereas utilising non-coding bioinformatics annotations provides an opportunity to discover unknown causal variants in non-coding regions as well.
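The contrast between consequence-based and score-based filtering can be shown with a toy variant filter. All variant records and the CADD-style threshold below are invented for illustration; this is not the study's code.

```python
# Toy contrast between two ways of selecting rare variants for a collapsing
# test: by annotated consequence vs. by a CADD-style deleteriousness score.
# All records, the MAF cutoff, and the score threshold are invented.
variants = [
    {"id": "var1", "maf": 0.002, "consequence": "missense",   "cadd": 22.1},
    {"id": "var2", "maf": 0.004, "consequence": "intronic",   "cadd": 18.7},
    {"id": "var3", "maf": 0.001, "consequence": "synonymous", "cadd": 1.3},
    {"id": "var4", "maf": 0.080, "consequence": "missense",   "cadd": 25.0},  # too common
]

def rare(v, maf_cutoff=0.01):
    return v["maf"] < maf_cutoff

by_consequence = [v["id"] for v in variants
                  if rare(v) and v["consequence"] in {"missense", "loss_of_function"}]
by_cadd = [v["id"] for v in variants if rare(v) and v["cadd"] >= 15]

print(by_consequence)  # ['var1'] -- the intronic variant is excluded
print(by_cadd)         # ['var1', 'var2'] -- score-based filtering keeps it
```

The toy output mirrors the abstract's point: a score-based filter can retain a potentially deleterious intronic variant that a consequence filter necessarily discards.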

  12. Fishing activity near offshore pipelines, 2017

    NARCIS (Netherlands)

    Machiels, Marcel

    2018-01-01

    On the North Sea bottom lie numerous pipelines linking offshore oil and gas drilling units, platforms and processing stations on land. Although pipeline tubes are coated and covered with protective layers, the pipelines risk being damaged through man-made hazards like anchor dropping and fishing

  13. Lay Pipeline Abandonment Head during Some

    African Journals Online (AJOL)

    2016-12-01

    A fracture mechanics approach is used on an API 5L X52 pipeline structure with a wall thickness of 0.5 inches. The pipeline was failed in a ... barge and the hydrodynamic loading on the pipeline itself. The Response Amplitude Operator is simply a measure of the Heave, Surge and Pitch of the barge relative to wave ...

  14. Annotating temporal information in clinical narratives.

    Science.gov (United States)

    Sun, Weiyi; Rumshisky, Anna; Uzuner, Ozlem

    2013-12-01

    Temporal information in clinical narratives plays an important role in patients' diagnosis, treatment and prognosis. In order to represent narrative information accurately, medical natural language processing (MLP) systems need to correctly identify and interpret temporal information. To promote research in this area, the Informatics for Integrating Biology and the Bedside (i2b2) project developed a temporally annotated corpus of clinical narratives. This corpus contains 310 de-identified discharge summaries, with annotations of clinical events, temporal expressions and temporal relations. This paper describes the process followed for the development of this corpus and discusses annotation guideline development, annotation methodology, and corpus quality. Copyright © 2013 Elsevier Inc. All rights reserved.

  15. Versatile annotation and publication quality visualization of protein complexes using POLYVIEW-3D

    Directory of Open Access Journals (Sweden)

    Meller Jaroslaw

    2007-08-01

    Background: Macromolecular visualization as well as automated structural and functional annotation tools play an increasingly important role in the post-genomic era, contributing significantly towards the understanding of molecular systems and processes. For example, three-dimensional (3D) models help in exploring protein active sites and functional hot spots that can be targeted in drug design. Automated annotation and visualization pipelines can also reveal other functionally important attributes of macromolecules. These goals are dependent on the availability of advanced tools that better integrate the existing databases, annotation servers and other resources with state-of-the-art rendering programs. Results: We present a new tool for protein structure analysis, with the focus on annotation and visualization of protein complexes, which is an extension of our previously developed POLYVIEW web server. By integrating the web technology with state-of-the-art software for macromolecular visualization, such as the PyMol program, POLYVIEW-3D enables combining versatile structural and functional annotations with a simple web-based interface for creating publication quality structure rendering, as well as animated images for Powerpoint™, web sites and other electronic resources. The service is platform independent and no plug-ins are required. Several examples of how POLYVIEW-3D can be used for structural and functional analysis in the context of protein-protein interactions are presented to illustrate the available annotation options. Conclusion: The POLYVIEW-3D server features PyMol image rendering that provides detailed and high quality presentation of macromolecular structures, with an easy to use web-based interface. POLYVIEW-3D also provides a wide array of options for automated structural and functional analysis of proteins and their complexes. Thus, the POLYVIEW-3D server may become an important resource for researchers and educators in

  16. LS-SNP/PDB: annotated non-synonymous SNPs mapped to Protein Data Bank structures.

    Science.gov (United States)

    Ryan, Michael; Diekhans, Mark; Lien, Stephanie; Liu, Yun; Karchin, Rachel

    2009-06-01

    LS-SNP/PDB is a new WWW resource for genome-wide annotation of human non-synonymous (amino acid changing) SNPs. It serves high-quality protein graphics rendered with UCSF Chimera molecular visualization software. The system is kept up-to-date by an automated, high-throughput build pipeline that systematically maps human nsSNPs onto Protein Data Bank structures and annotates several biologically relevant features. LS-SNP/PDB is available at http://ls-snp.icm.jhu.edu/ls-snp-pdb and via links from Protein Data Bank (PDB) biology and chemistry tabs, UCSC Genome Browser Gene Details and SNP Details pages, and PharmGKB Gene Variants Downloads/Cross-References pages.

  17. Customer service drives pipelines' reorganization

    International Nuclear Information System (INIS)

    Share, J.

    1997-01-01

    The concept behind formation of Enron Transportation and Storage tells plenty about this new gas industry. When executives at the Enron Gas Pipeline Group considered plans last year to streamline operations by merging the support functions of Transwestern Pipeline and their other wholly owned pipeline company, Northern Natural Gas, seamless customer service was foremost on their agenda. Instead of worrying about whether employees would favor one pipeline over the other, perhaps to the detriment of customers, they simply created a new organization that everyone would swear the same allegiance to. The 17,000-mile, 4.1 Bcf/d Northern system serves the upper Midwest market, and two major expansion projects were completed there last year. Transwestern is a 2,700-mile system with an eastward capacity of 1 Bcf/d and a westward capacity of 1.5 Bcf/d that traditionally served California markets. It also ties into Texas intrastate markets and, thanks to expansion of the San Juan lateral, to southern Rocky Mountain supplies. Although Enron Corp. continues to position itself as a full-service energy company, the Gas Pipeline Group continues to fuel much of corporate's net income, which was $584 million last year. With ET and S comprising a significant portion of GPG's income, it was vital that the merger of Northern's 950 employees with Transwestern's 250 indeed be a seamless one. It was not easy, either psychologically or geographically, with main offices in Omaha, NE and Houston, as well as operations centers in Minneapolis, MN; Amarillo, TX; W. Des Moines, IA; and Albuquerque, NM. But the results have been gratifying, according to William R. Cordes, President of ET and S, and Nancy L. Gardner, Executive Vice President of Strategic Initiatives.

  18. Analysis and comparison of very large metagenomes with fast clustering and functional annotation

    Directory of Open Access Journals (Sweden)

    Li Weizhong

    2009-10-01

    Background: The remarkable advance of metagenomics presents significant new challenges in data analysis. Metagenomic datasets (metagenomes) are large collections of sequencing reads from anonymous species within particular environments. Computational analyses for very large metagenomes are extremely time-consuming, and there are often many novel sequences in these metagenomes that are not fully utilized. The number of available metagenomes is rapidly increasing, so fast and efficient metagenome comparison methods are in great demand. Results: The new metagenomic data analysis method Rapid Analysis of Multiple Metagenomes with a Clustering and Annotation Pipeline (RAMMCAP) was developed using an ultra-fast sequence clustering algorithm, fast protein family annotation tools, and a novel statistical metagenome comparison method that employs a unique graphic interface. RAMMCAP processes extremely large datasets with only moderate computational effort. It identifies raw read clusters and protein clusters that may include novel gene families, and compares metagenomes using clusters or functional annotations calculated by RAMMCAP. In this study, RAMMCAP was applied to the two largest available metagenomic collections, the "Global Ocean Sampling" and the "Metagenomic Profiling of Nine Biomes". Conclusion: RAMMCAP is a very fast method that can cluster and annotate one million metagenomic reads in only hundreds of CPU hours. It is available from http://tools.camera.calit2.net/camera/rammcap/.
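Greedy incremental clustering of the kind such ultra-fast tools (CD-HIT and relatives) are built on can be sketched as follows. This is a deliberately naive stand-in: real tools use word filters and alignment-based identity rather than the positional match used here.

```python
# Sketch of greedy incremental sequence clustering (CD-HIT-style): process
# sequences longest-first; join the first cluster whose representative is
# similar enough, else found a new cluster. The identity measure below is a
# naive positional stand-in for a real alignment-based one.
def identity(a, b):
    """Fraction of matching positions over the shorter sequence (naive)."""
    n = min(len(a), len(b))
    return sum(a[i] == b[i] for i in range(n)) / n

def greedy_cluster(seqs, threshold=0.9):
    reps, clusters = [], []
    for seq in sorted(seqs, key=len, reverse=True):
        for i, rep in enumerate(reps):
            if identity(seq, rep) >= threshold:
                clusters[i].append(seq)
                break
        else:  # no representative was close enough: start a new cluster
            reps.append(seq)
            clusters.append([seq])
    return clusters

reads = ["ACGTACGTAC", "ACGTACGTAA", "TTTTGGGGCC", "ACGTACGT"]
result = greedy_cluster(reads)
print(result)  # [['ACGTACGTAC', 'ACGTACGTAA', 'ACGTACGT'], ['TTTTGGGGCC']]
```

Processing longest-first means every later sequence is compared against full-length representatives, which is what makes the single greedy pass workable.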

  19. ONEMercury: Towards Automatic Annotation of Earth Science Metadata

    Science.gov (United States)

    Tuarob, S.; Pouchard, L. C.; Noy, N.; Horsburgh, J. S.; Palanisamy, G.

    2012-12-01

    Earth sciences have become more data-intensive, requiring access to heterogeneous data collected from multiple places, times, and thematic scales. For example, research on climate change may involve exploring and analyzing observational data such as the migration of animals and temperature shifts across the earth, as well as various model-observation inter-comparison studies. Recently, DataONE, a federated data network built to facilitate access to and preservation of environmental and ecological data, has come into existence. ONEMercury has recently been implemented as part of the DataONE project to serve as a portal for discovering and accessing environmental and observational data across the globe. ONEMercury harvests metadata from the data hosted by multiple data repositories and makes it searchable via a common search interface built upon cutting-edge search engine technology, allowing users to interact with the system, intelligently filter the search results on the fly, and fetch the data from distributed data sources. Linking data from heterogeneous sources always has a cost. A problem that ONEMercury faces is the different levels of annotation in the harvested metadata records. Poorly annotated records tend to be missed during the search process as they lack meaningful keywords. Furthermore, such records would not be compatible with the advanced search functionality offered by ONEMercury, as the interface requires a metadata record to be semantically annotated. The explosion of the number of metadata records harvested from an increasing number of data repositories makes it impossible to annotate the harvested records manually, underscoring the need for a tool capable of automatically annotating poorly curated metadata records. In this paper, we propose a topic-model (TM) based approach for automatic metadata annotation. Our approach mines topics in the set of well annotated records and suggests keywords for poorly annotated records based on topic similarity.
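The proposed approach can be approximated in miniature: find the well-annotated record most similar to a poorly annotated one and borrow its keywords. The sketch below substitutes word-count cosine similarity for learned topic distributions, and all records and keywords are invented.

```python
# Miniature stand-in for topic-model based metadata annotation: suggest
# keywords for a poorly annotated record from its most similar well-annotated
# record. Cosine similarity over raw word counts replaces topic similarity;
# the records and keyword lists are invented examples.
import math
from collections import Counter

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Well-annotated records: description text plus curated keywords.
well_annotated = {
    "rec1": ("surface temperature anomaly ocean model", ["climate", "ocean"]),
    "rec2": ("bird migration tracking telemetry", ["ecology", "migration"]),
}

def suggest_keywords(text):
    vec = Counter(text.split())
    best = max(
        well_annotated,
        key=lambda r: cosine(vec, Counter(well_annotated[r][0].split())),
    )
    return well_annotated[best][1]

print(suggest_keywords("ocean temperature observations"))  # ['climate', 'ocean']
```

A topic model improves on this by matching records that share themes rather than literal words, but the suggest-by-nearest-neighbor structure is the same.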

  20. Automated Radiology Report Summarization Using an Open-Source Natural Language Processing Pipeline.

    Science.gov (United States)

    Goff, Daniel J; Loehfelm, Thomas W

    2017-10-30

    Diagnostic radiologists are expected to review and assimilate findings from prior studies when constructing their overall assessment of the current study. Radiology information systems facilitate this process by presenting the radiologist with a subset of prior studies that are more likely to be relevant to the current study, usually by comparing anatomic coverage of both the current and prior studies. It is incumbent on the radiologist to review the full text report and/or images from those prior studies, a process that is time-consuming and confers substantial risk of overlooking a relevant prior study or finding. This risk is compounded when patients have dozens or even hundreds of prior imaging studies. Our goal is to assess the feasibility of natural language processing techniques to automatically extract asserted and negated disease entities from free-text radiology reports as a step towards automated report summarization. We compared automatically extracted disease mentions to a gold-standard set of manual annotations for 50 radiology reports from CT abdomen and pelvis examinations. The automated report summarization pipeline found perfect or overlapping partial matches for 86% of the manually annotated disease mentions (sensitivity 0.86, precision 0.66, accuracy 0.59, F1 score 0.74). The performance of the automated pipeline was good, and the overall accuracy was similar to the interobserver agreement between the two manual annotators.
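The figures reported above combine standard retrieval metrics. A small sketch of how precision, recall (sensitivity), and F1 relate, using hypothetical counts rather than the study's data:

```python
# Standard extraction metrics from hypothetical counts (not the study's data):
# precision = TP / (TP + FP), recall (sensitivity) = TP / (TP + FN),
# F1 = harmonic mean of precision and recall.
def prf(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = prf(tp=86, fp=44, fn=14)  # invented counts for illustration
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.66 0.86 0.75
```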

  1. Rnnotator: an automated de novo transcriptome assembly pipeline from stranded RNA-Seq reads

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey; Bruno, Vincent M.; Fang, Zhide; Meng, Xiandong; Blow, Matthew; Zhang, Tao; Sherlock, Gavin; Snyder, Michael; Wang, Zhong

    2010-11-19

    Background: Comprehensive annotation and quantification of transcriptomes are outstanding problems in functional genomics. While high-throughput mRNA sequencing (RNA-Seq) has emerged as a powerful tool for addressing these problems, its success is dependent upon the availability and quality of reference genome sequences, thus limiting the organisms to which it can be applied. Results: Here, we describe Rnnotator, an automated software pipeline that generates transcript models by de novo assembly of RNA-Seq data without the need for a reference genome. We have applied the Rnnotator assembly pipeline to two yeast transcriptomes and compared the results to the reference gene catalogs of these organisms. The contigs produced by Rnnotator are highly accurate (95%) and reconstruct full-length genes for the majority of the existing gene models (54.3%). Furthermore, our analyses revealed many novel transcribed regions that are absent from well annotated genomes, suggesting Rnnotator serves as a complementary approach to analysis based on a reference genome for comprehensive transcriptomics. Conclusions: These results demonstrate that the Rnnotator pipeline is able to reconstruct full-length transcripts in the absence of a complete reference genome.
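De novo assembly of the kind Rnnotator performs at scale can be illustrated with a toy de Bruijn graph walk: build a graph of overlapping k-mers from the reads, then extend unambiguous paths into contigs. This sketch assumes error-free reads and a single unambiguous path, which real assemblers cannot.

```python
# Toy de novo assembly via a de Bruijn graph (illustrative only; Rnnotator's
# actual assembler additionally handles errors, strands, and coverage).
from collections import defaultdict

def assemble(reads, k=5):
    graph = defaultdict(list)   # (k-1)-mer -> list of successor (k-1)-mers
    indeg = defaultdict(int)
    for read in reads:
        for i in range(len(read) - k + 1):
            left, right = read[i:i + k - 1], read[i + 1:i + k]
            if right not in graph[left]:
                graph[left].append(right)
                indeg[right] += 1
    # Start from nodes with no incoming edge; extend while the path is unambiguous.
    starts = [n for n in list(graph) if indeg[n] == 0]
    contigs = []
    for node in starts:
        contig = node
        while len(graph[node]) == 1:
            node = graph[node][0]
            contig += node[-1]
        contigs.append(contig)
    return contigs

# Two invented overlapping reads covering the sequence ATGGCGTGCAATGCC:
reads = ["ATGGCGTGCA", "GTGCAATGCC"]
print(assemble(reads))  # ['ATGGCGTGCAATGCC']
```

Walking stops at any branch or dead end, so each contig corresponds to one unambiguous stretch of the transcriptome, which is also why assembly quality degrades around repeats.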

  2. Ten steps to get started in Genome Assembly and Annotation [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Victoria Dominguez Del Angel

    2018-02-01

    As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR).

  3. IIS--Integrated Interactome System: a web-based platform for the annotation, analysis and visualization of protein-metabolite-gene-drug interactions by integrating a variety of data sources and tools.

    Science.gov (United States)

    Carazzolle, Marcelo Falsarella; de Carvalho, Lucas Miguel; Slepicka, Hugo Henrique; Vidal, Ramon Oliveira; Pereira, Gonçalo Amarante Guimarães; Kobarg, Jörg; Meirelles, Gabriela Vaz

    2014-01-01

    High-throughput screening of physical, genetic and chemical-genetic interactions brings important perspectives in the Systems Biology field, as the analysis of these interactions provides new insights into protein/gene function, cellular metabolic variations and the validation of therapeutic targets and drug design. However, such analysis depends on a pipeline connecting different tools that can automatically integrate data from diverse sources and result in a more comprehensive dataset that can be properly interpreted. We describe here the Integrated Interactome System (IIS), an integrative platform with a web-based interface for the annotation, analysis and visualization of the interaction profiles of proteins/genes, metabolites and drugs of interest. IIS works in four connected modules: (i) Submission module, which receives raw data derived from Sanger sequencing (e.g. two-hybrid system); (ii) Search module, which enables the user to search for the processed reads to be assembled into contigs/singlets, or for lists of proteins/genes, metabolites and drugs of interest, and add them to the project; (iii) Annotation module, which assigns annotations from several databases for the contigs/singlets or lists of proteins/genes, generating tables with automatic annotation that can be manually curated; and (iv) Interactome module, which maps the contigs/singlets or the uploaded lists to entries in our integrated database, building networks that gather novel identified interactions, protein and metabolite expression/concentration levels, subcellular localization and computed topological metrics, GO biological processes and KEGG pathways enrichment. This module generates an XGMML file that can be imported into Cytoscape or be visualized directly on the web. We have developed IIS by the integration of diverse databases to meet the need for appropriate tools for a systematic analysis of physical, genetic and chemical-genetic interactions. IIS was validated with yeast two

  4. annot8r: GO, EC and KEGG annotation of EST datasets.

    Science.gov (United States)

    Schmid, Ralf; Blaxter, Mark L

    2008-04-09

    The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets poses a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways. annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and also in a relational postgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools. annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non-model species EST-sequencing projects.
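    The core annotation-transfer step described above (a BLAST search against a GO/EC/KEGG-tagged UniProt subset, followed by retrieval of the hit's annotations) can be sketched roughly as follows. The function names and the best-hit-only policy are illustrative assumptions, not annot8r's actual code.

```python
# Hypothetical sketch of BLAST-based annotation transfer: parse BLAST
# tabular output (-outfmt 6, 12 columns, bitscore in column 12), keep the
# best-scoring hit per query, and look up that hit's GO terms in a
# UniProt-accession-to-GO reference mapping.

def parse_blast_best_hits(blast_lines):
    """Return {query: (best_subject, bitscore)} from BLAST tabular lines."""
    best = {}
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best

def annotate(best_hits, uniprot_to_go):
    """Transfer GO terms from each query's best UniProt hit."""
    return {q: uniprot_to_go.get(subj, [])
            for q, (subj, _) in best_hits.items()}
```

    A real pipeline would add an E-value cutoff and store the results in a database, as annot8r does with PostgreSQL, but the lookup logic is essentially this.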

  5. PSPP: a protein structure prediction pipeline for computing clusters.

    Directory of Open Access Journals (Sweden)

    Michael S Lee

    2009-07-01

    Full Text Available Protein structures are critical for understanding the mechanisms of biological systems and, subsequently, for drug and vaccine design. Unfortunately, protein sequence data exceed structural data by a factor of more than 200 to 1. This gap can be partially filled by using computational protein structure prediction. While structure prediction Web servers are a notable option, they often restrict the number of sequence queries and/or provide a limited set of prediction methodologies. Therefore, we present a standalone protein structure prediction software package suitable for high-throughput structural genomic applications that performs all three classes of prediction methodologies: comparative modeling, fold recognition, and ab initio. This software can be deployed on a user's own high-performance computing cluster. The pipeline consists of a Perl core that integrates more than 20 individual software packages and databases, most of which are freely available from other research laboratories. The query protein sequences are first divided into domains either by domain boundary recognition or Bayesian statistics. The structures of the individual domains are then predicted using template-based modeling or ab initio modeling. The predicted models are scored with a statistical potential and an all-atom force field. The top-scoring ab initio models are annotated by structural comparison against the Structural Classification of Proteins (SCOP) fold database. Furthermore, secondary structure, solvent accessibility, transmembrane helices, and structural disorder are predicted. The results are generated in text, tab-delimited, and hypertext markup language (HTML) formats. So far, the pipeline has been used to study viral and bacterial proteomes. The standalone pipeline that we introduce here, unlike protein structure prediction Web servers, allows users to devote their own computing assets to process a potentially unlimited number of queries as well as perform

  6. Oil pipeline valve automation for spill reduction

    Energy Technology Data Exchange (ETDEWEB)

    Mohitpour, Mo; Trefanenko, Bill [Enbridge Technology Inc, Calgary (Canada); Tolmasquim, Sueli Tiomno; Kossatz, Helmut [TRANSPETRO - PETROBRAS Transporte S.A., Rio de Janeiro, RJ (Brazil)

    2003-07-01

    Liquid pipeline codes generally stipulate placement of block valves along liquid transmission pipelines such as on each side of major river crossings where environmental hazards could cause or are foreseen to potentially cause serious consequences. Codes, however, do not stipulate any requirement for block valve spacing for low vapour pressure petroleum transportation, nor for remote pipeline valve operations to reduce spills. A review of pipeline codes for valve requirement and spill limitation in high consequence areas is thus presented along with criteria for an acceptable spill volume that could be caused by pipeline leak/full rupture. A technique for deciding economically and technically effective pipeline block valve automation for remote operation to reduce oil spills and control hazards is also provided. In this review, industry practice is highlighted and application of the criteria for maximum permissible oil spill and the technique for deciding valve automation thus developed, as applied to the ORSUB pipeline, is presented. ORSUB is one of the three initially selected pipelines that have been studied. These pipelines represent about 14% of the total length of petroleum transmission lines operated by PETROBRAS Transporte S.A. (TRANSPETRO) in Brazil. Based on the implementation of valve motorization on these three pipelines, motorization of block valves for remote operation on the remaining pipelines is intended, depending on the success of these implementations, on historical records of failure and appropriate ranking. (author)

  7. What, me worry? Pipeline certification at the FERC

    International Nuclear Information System (INIS)

    Schneider, J.D.

    1997-01-01

    Some of the new major pipeline projects that will bring Canadian gas into Chicago and the United States northeastern market were described. The seven projects discussed were: Independence Pipeline Company, Columbia's Millennium Project, Alliance Pipeline, Viking Voyageur, Pan Energy's Spectrum, Tennessee's Eastern Express, and National Fuel Expansion. The need for all this new capacity was questioned. The US FERC Commissioners are willing to let the market decide the need for this capacity. It was noted that the serious regulatory issues pipelines will face lie in the radically different rate treatment for pipelines, depending on the support the market shows for their respective projects. Various past decisions of the Federal Energy Regulatory Commission (FERC) were reviewed to illustrate the direction of the Commission's thinking in terms of regulatory approval, and to examine the question of whether FERC's approval policy makes good sense. The author's view is that the 'at risk' conditions of approval, coupled with incremental rate treatment by the Commission does, in fact, provide protection for consumers

  8. Bicycle: a bioinformatics pipeline to analyze bisulfite sequencing data.

    Science.gov (United States)

    Graña, Osvaldo; López-Fernández, Hugo; Fdez-Riverola, Florentino; González Pisano, David; Glez-Peña, Daniel

    2018-04-15

    High-throughput sequencing of bisulfite-converted DNA is a technique used to measure DNA methylation levels. Although a considerable number of computational pipelines have been developed to analyze such data, none of them tackles all the peculiarities of the analysis together, revealing limitations that can force the user to manually perform additional steps needed for a complete processing of the data. This article presents bicycle, an integrated, flexible analysis pipeline for bisulfite sequencing data. Bicycle analyzes whole genome bisulfite sequencing data, targeted bisulfite sequencing data and hydroxymethylation data. To show how bicycle outperforms other available pipelines, we compared them on a defined number of features that are summarized in a table. We also tested bicycle with both simulated and real datasets, to show its level of performance, and compared it to different state-of-the-art methylation analysis pipelines. Bicycle is publicly available under GNU LGPL v3.0 license at http://www.sing-group.org/bicycle. Users can also download a customized Ubuntu LiveCD including bicycle and other bisulfite sequencing data pipelines compared here. In addition, a docker image with bicycle and its dependencies, which allows a straightforward use of bicycle in any platform (e.g. Linux, OS X or Windows), is also available. ograna@cnio.es or dgpena@uvigo.es. Supplementary data are available at Bioinformatics online.

  9. Grasping at Straws: Comments on the Alberta Pipeline Safety Review

    Directory of Open Access Journals (Sweden)

    Jennifer Winter

    2013-09-01

    Full Text Available The release last month of the Alberta Pipeline Safety Review was meant to be a symbol of the province’s renewed commitment to environmental responsibility as it aims for new export markets. The report’s authors, Group 10 Engineering, submitted 17 recommendations covering public safety and pipeline incidents, pipeline integrity management and pipeline safety near bodies of water — and many of them run the gamut from the obvious to the unhelpful to the contradictory. That the energy regulator ought to be staffed to do its job should go without saying; in fact, staffing levels were never identified as an issue. The recommendation that record retention and transfer requirements be defined for mergers and acquisitions, sales and takeovers is moot. There is no reason a purchasing party would not want all relevant documents, and no real way to enforce transparency if the seller opts to withhold information. Harmonizing regulations between provinces could reduce companies’ cost of doing business, but could also prove challenging if different jurisdictions use performance-based regulations — which is what the Review recommended Alberta consider. This very brief paper pries apart the Review’s flaws and recommends that the province go back to the drawing board. Safety is a serious issue; a genuine statistical review linking pipeline characteristics to failures and risk-mitigation activities would be a better alternative by far.

  10. Black English Annotations for Elementary Reading Programs.

    Science.gov (United States)

    Prasad, Sandre

    This report describes a program that uses annotations in the teacher's editions of existing reading programs to indicate the characteristics of black English that may interfere with the reading process of black children. The first part of the report provides a rationale for the annotation approach, explaining that the discrepancy between written…

  11. Ground Truth Annotation in T Analyst

    DEFF Research Database (Denmark)

    2015-01-01

    This video shows how to annotate the ground truth tracks in the thermal videos. The ground truth tracks are produced to be able to compare them to tracks obtained from a Computer Vision tracking approach. The program used for annotation is T-Analyst, which is developed by Aliaksei Laureshyn, Ph...

  12. Towards the Automated Annotation of Process Models

    NARCIS (Netherlands)

    Leopold, H.; Meilicke, C.; Fellmann, M.; Pittke, F.; Stuckenschmidt, H.; Mendling, J.

    2016-01-01

    Many techniques for the advanced analysis of process models build on the annotation of process models with elements from predefined vocabularies such as taxonomies. However, the manual annotation of process models is cumbersome and sometimes even hardly manageable taking the size of taxonomies into

  13. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop.

    Science.gov (United States)

    Brister, James Rodney; Bao, Yiming; Kuiken, Carla; Lefkowitz, Elliot J; Le Mercier, Philippe; Leplae, Raphael; Madupu, Ramana; Scheuermann, Richard H; Schobel, Seth; Seto, Donald; Shrivastava, Susmita; Sterk, Peter; Zeng, Qiandong; Klimke, William; Tatusova, Tatiana

    2010-10-01

    Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world's biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  14. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop

    Directory of Open Access Journals (Sweden)

    Qiandong Zeng

    2010-10-01

    Full Text Available Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world’s biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  15. Creating Gaze Annotations in Head Mounted Displays

    DEFF Research Database (Denmark)

    Mardanbeigi, Diako; Qvarfordt, Pernilla

    2015-01-01

    To facilitate distributed communication in mobile settings, we developed GazeNote for creating and sharing gaze annotations in head mounted displays (HMDs). With gaze annotations it is possible to point out objects of interest within an image and add a verbal description. To create an annotation, the user simply captures an image using the HMD’s camera, looks at an object of interest in the image, and speaks out the information to be associated with the object. The gaze location is recorded and visualized with a marker. The voice is transcribed using speech recognition. Gaze annotations can be shared. Our study showed that users found that gaze annotations add precision and expressiveness compared to annotations of the image as a whole.

  16. Ion implantation: an annotated bibliography

    International Nuclear Information System (INIS)

    Ting, R.N.; Subramanyam, K.

    1975-10-01

    Ion implantation is a technique for introducing controlled amounts of dopants into target substrates, and has been successfully used for the manufacture of silicon semiconductor devices. Ion implantation is superior to other methods of doping such as thermal diffusion and epitaxy, in view of its advantages such as high degree of control, flexibility, and amenability to automation. This annotated bibliography of 416 references consists of journal articles, books, and conference papers in English and foreign languages published during 1973-74, on all aspects of ion implantation including range distribution and concentration profile, channeling, radiation damage and annealing, compound semiconductors, structural and electrical characterization, applications, equipment and ion sources. Earlier bibliographies on ion implantation, and national and international conferences in which papers on ion implantation were presented have also been listed separately

  17. Pipeline FFT Architectures Optimized for FPGAs

    Directory of Open Access Journals (Sweden)

    Bin Zhou

    2009-01-01

    Full Text Available This paper presents optimized implementations of two different pipeline FFT processors on Xilinx Spartan-3 and Virtex-4 FPGAs. Different optimization techniques and rounding schemes were explored. The implementation results achieved better performance with lower resource usage than prior art. The 16-bit 1024-point FFT with the R22SDF architecture had a maximum clock frequency of 95.2 MHz and used 2802 slices on the Spartan-3, a throughput per area ratio of 0.034 Msamples/s/slice. The R4SDC architecture ran at 123.8 MHz and used 4409 slices on the Spartan-3, a throughput per area ratio of 0.028 Msamples/s/slice. On Virtex-4, the 16-bit 1024-point R22SDF architecture ran at 235.6 MHz and used 2256 slices, giving a 0.104 Msamples/s/slice ratio; the 16-bit 1024-point R4SDC architecture ran at 219.2 MHz and used 3064 slices, giving a 0.072 Msamples/s/slice ratio. The R22SDF was more efficient than the R4SDC in terms of throughput per area due to a simpler controller and an easier balanced rounding scheme. This paper also shows that balanced stage rounding is an appropriate rounding scheme for pipeline FFT processors.
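    The throughput-per-area figures quoted above can be reproduced directly, assuming each pipeline accepts one sample per clock cycle so that throughput in Msamples/s equals the clock frequency in MHz:

```python
# Recomputing throughput per area (Msamples/s/slice) for the four
# reported designs, assuming one input sample per clock cycle.
designs = {
    "R22SDF/Spartan-3": (95.2, 2802),
    "R4SDC/Spartan-3":  (123.8, 4409),
    "R22SDF/Virtex-4":  (235.6, 2256),
    "R4SDC/Virtex-4":   (219.2, 3064),
}
for name, (mhz, slices) in designs.items():
    print(f"{name}: {mhz / slices:.3f} Msamples/s/slice")
```

    Rounded to three decimals, these ratios match the values reported in the abstract (0.034, 0.028, 0.104 and 0.072).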

  18. Nova Gas's pipeline to Asia

    International Nuclear Information System (INIS)

    Lea, N.

    1996-01-01

    The involvement of the Calgary-based company NOVA Gas International (NGI) in Malaysia's peninsular gas utilization (PGU) project, was described. Phase I and II of the project involved linking onshore gas processing plants with a natural gas transmission system. Phase III of the PGU project was a gas transmission pipeline that began midway up the west coast of peninsular Malaysia to the Malaysia-Thailand border. The complex 549 km pipeline included route selection, survey and soil investigation, archaeological study, environmental impact assessment, land acquisition, meter-station construction, telecommunication systems and office buildings. NGI was the prime contractor on the project through a joint venture with OGP Technical Services, jointly owned by NGI and Petronas, the Thai state oil company. Much of NGI's success was attributed to excellent interpersonal skills, particularly NGI's ability to build confidence and credibility with its Thai partners

  19. Concept annotation in the CRAFT corpus.

    Science.gov (United States)

    Bada, Michael; Eckert, Miriam; Evans, Donald; Garcia, Kristin; Shipley, Krista; Sitnikov, Dmitry; Baumgartner, William A; Cohen, K Bretonnel; Verspoor, Karin; Blake, Judith A; Hunter, Lawrence E

    2012-07-09

    Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. 
The corpus, annotation guidelines, and other associated resources are freely available at http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.

  20. Teaching and Learning Communities through Online Annotation

    Science.gov (United States)

    van der Pluijm, B.

    2016-12-01

    What do colleagues do with your assigned textbook? What do they say or think about the material? Want students to be more engaged in their learning experience? If so, online materials that complement the standard lecture format provide a new opportunity through managed, online group annotation that leverages the ubiquity of internet access while personalizing learning. The concept is illustrated with the new online textbook "Processes in Structural Geology and Tectonics", by Ben van der Pluijm and Stephen Marshak, which offers a platform for sharing of experiences, supplementary materials and approaches, including readings, mathematical applications, exercises, challenge questions, quizzes, alternative explanations, and more. The annotation framework used is Hypothes.is, which offers a free, open platform markup environment for annotation of websites and PDF postings. The annotations can be public, grouped or individualized, as desired, including export access and download of annotations. A teacher group, hosted by a moderator/owner, limits access to members of a user group of teachers, so that its members can use, copy or transcribe annotations for their own lesson material. Likewise, an instructor can host a student group that encourages sharing of observations, questions and answers among students and instructor. Also, the instructor can create one or more closed groups that offer study help and hints to students. Options galore, all of which aim to engage students and to promote greater responsibility for their learning experience. Beyond new capacity, the ability to analyze student annotation supports individual learners and their needs. For example, student notes can be analyzed for key phrases and concepts, and identify misunderstandings, omissions and problems. Also, example annotations can be shared to enhance notetaking skills and to help with studying. 
Lastly, online annotation allows active application to lecture posted slides, supporting real-time notetaking

  1. Concept annotation in the CRAFT corpus

    Science.gov (United States)

    2012-01-01

    Background Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. Results This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. Conclusions As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. 
The corpus, annotation guidelines, and other associated resources are freely available at http

  2. Automatic annotation of head velocity and acceleration in Anvil

    DEFF Research Database (Denmark)

    Jongejan, Bart

    2012-01-01

    We describe an automatic face tracker plugin for the ANVIL annotation tool. The face tracker produces data for velocity and for acceleration in two dimensions. We compare the annotations generated by the face tracking algorithm with independently made manual annotations for head movements. The annotations are a useful supplement to manual annotations and may help human annotators to quickly and reliably determine the onset of head movements and to suggest which kind of head movement is taking place.
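    Velocity and acceleration of this kind can be derived from tracked head positions with finite differences; a minimal sketch (not the ANVIL plugin itself, and the function name is illustrative):

```python
# Derive per-frame 2D velocity and acceleration from tracked head
# positions using central finite differences (one-sided at the edges).
import numpy as np

def velocity_acceleration(xy, fps):
    """xy: (n, 2) array of head positions per frame; returns (vel, acc)."""
    dt = 1.0 / fps
    vel = np.gradient(xy, dt, axis=0)   # position units per second
    acc = np.gradient(vel, dt, axis=0)  # position units per second^2
    return vel, acc
```

    Thresholding the velocity magnitude is then a simple way to suggest candidate onsets of head movements to a human annotator.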

  3. Ontology-Based Annotation and Ranking Service for Geoscience

    Science.gov (United States)

    Sainju, R.; Ramachandran, R.; Li, X.; McEniry, M.; Kulkarni, A.; Conover, H.

    2012-12-01

    There is a need to automatically annotate information using either a controlled vocabulary or an ontology, to make the information not only easily discoverable but also linkable to other information based on these semantic annotations. We present an ontology annotation and ranking service designed to address this need. The service can be configured to use an ontology describing a specific application domain. Given text inputs, this service generates annotations whenever the service finds terms that intersect both in the text and the ontology. The service is also capable of ranking the different inputs based on the "contextual" similarity to the information captured in the ontology. To rank a given input, the service uses a specialized algorithm which calculates both an ontological score, based on precomputed weights of the intersecting terms from the ontology, and a statistical score using the traditional term frequency-inverse document frequency (TF-IDF) approach. Both these scores are normalized and combined to generate the final ranking. An example application of this service is finding relevant datasets for studying Hurricanes within NASA's data catalog. A hurricane ontology is used to index and rank all the dataset descriptions from the metadata catalog, and only the datasets that rank high are presented to the end users as contextually relevant for studying Hurricanes.
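    The combined ranking idea described above can be sketched roughly: an ontology score from precomputed term weights plus a TF-IDF score, each min-max normalized and averaged. The function names and the equal weighting of the two scores are illustrative assumptions, not the service's actual algorithm.

```python
# Rough sketch of combining an ontology-weight score with a TF-IDF score.
import math

def tf_idf_score(doc_terms, term, all_docs):
    """Classic TF-IDF for one term in one tokenized document."""
    tf = doc_terms.count(term) / len(doc_terms)
    df = sum(1 for d in all_docs if term in d)
    idf = math.log(len(all_docs) / df) if df else 0.0
    return tf * idf

def rank(docs, ontology_weights):
    """docs: list of token lists; returns one combined score per doc."""
    scores = []
    for doc in docs:
        onto = sum(ontology_weights.get(t, 0.0) for t in set(doc))
        tfidf = sum(tf_idf_score(doc, t, docs)
                    for t in ontology_weights if t in doc)
        scores.append((onto, tfidf))

    def norm(vals):  # min-max normalization to [0, 1]
        lo, hi = min(vals), max(vals)
        return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in vals]

    onto_n = norm([s[0] for s in scores])
    tfidf_n = norm([s[1] for s in scores])
    return [(o + t) / 2 for o, t in zip(onto_n, tfidf_n)]
```

    Documents mentioning more (and more heavily weighted) ontology terms then rank above documents that merely mention a term once.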

  4. Upgrading Algeria-Italy trans-Mediterranean natural gas pipeline

    International Nuclear Information System (INIS)

    Stella, G.

    1992-01-01

    The first trans-Mediterranean pipeline system, which went into service in 1983, had to be doubled in capacity in order to meet increased European demand for Algerian natural gas. After a brief review of the contractual, planning and construction history of the first pipeline, this paper discusses the strategies taken which led to the decision to double the line's capacity. Descriptions are then given of the different construction phases realized in Tunisia, the Sicilian Channel and Italian mainland. Focus is on construction schedules, problems and solutions. The report comes complete with details of project financing, organizing, materials supply programs, innovative technology applications, design philosophy and construction techniques

  5. Silicon compiler design of combinational and pipeline adder integrated circuits

    Science.gov (United States)

    Froede, A. O., III

    1985-06-01

    The architecture and structures used by the MacPitts silicon compiler to design integrated circuits are described, and the capabilities and limitations of the compiler are discussed. The performance of several combinational and pipeline adders designed by MacPitts and a hand-crafted pipeline adder are compared. Several different MacPitts design errors are documented. Tutorial material is presented to aid in using the MacPitts interpreter and to illustrate timing analysis of MacPitts-designed circuits using the program Crystal.

  6. Sustainable management of leakage from wastewater pipelines.

    Science.gov (United States)

    DeSilva, D; Burn, S; Tjandraatmadja, G; Moglia, M; Davis, P; Wolf, L; Held, I; Vollertsen, J; Williams, W; Hafskjold, L

    2005-01-01

    Wastewater pipeline leakage is an emerging concern in Europe, especially with regards to the potential effect of leaking effluent on groundwater contamination and the effects infiltration has on the management of sewer reticulation systems. This paper describes efforts by Australia, in association with several European partners, towards the development of decision support tools to prioritize proactive rehabilitation of wastewater pipe networks to account for leakage. In the fundamental models for the decision support system, leakage is viewed as a function of pipeline system deterioration. The models rely on soil type identification across the service area to determine the aggressiveness of the pipe environment and for division of the area into zones based on pipe properties and operational conditions. By understanding the interaction between pipe materials, operating conditions, and the pipe environment in the mechanisms leading to pipe deterioration, the models allow the prediction of leakage rates in different zones across a network. The decision support system utilizes these models to predict the condition of pipes in individual zones, and to optimize the utilization of rehabilitation resources by targeting the areas with the highest leakage rates.

  7. PIPELINE CORROSION CONTROL IN OIL AND GAS INDUSTRY: A ...

    African Journals Online (AJOL)

    user

compared to other methods and thus constant monitoring is needed to achieve optimum efficiency. Keywords: Corrosion, Cathodic ... no impressed current. This shows the difference between the pipe/soil potential and the natural potential. Table 1: Material/Specification for system 2A pipeline.

  8. Pressure Transient Model of Water-Hydraulic Pipelines with Cavitation

    Directory of Open Access Journals (Sweden)

    Dan Jiang

    2018-03-01

Full Text Available Transient pressure investigation of water-hydraulic pipelines is a challenge in the fluid transmission field: the flow continuity and momentum equations are partial differential equations, vaporous cavitation is highly dynamic, and the frictional force caused by fluid viscosity is especially uncertain. In this study, because the transient pressure dynamics differ between the upstream and downstream pipelines, the finite difference method (FDM) is adopted to handle pressure transients with and without cavitation, as well as steady friction and frequency-dependent unsteady friction. Unlike the traditional method of characteristics (MOC), the FDM offers simple and convenient computation. Furthermore, the mechanisms of cavitation growth and collapse are captured both upstream and downstream of the water-hydraulic pipeline, i.e., the cavitation start time, the end time, the duration, the maximum volume, and the corresponding time points. The simulation results of the two computation methods are verified against the experimental results of two previous works on water-hydraulic pipelines, which indicates that the finite difference method shows better data consistency than the MOC.
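This record contrasts an explicit finite-difference scheme with the method of characteristics. As a rough illustration of the FDM side only — not the authors' scheme, and without the cavitation and unsteady-friction terms they model — here is a minimal Lax-type discretization of the classical water-hammer equations for a reservoir-pipe-valve system; all parameter values are hypothetical:

```python
import numpy as np

# Illustrative explicit FDM for the classical water-hammer equations
# (no cavitation, steady friction only). All parameter values are hypothetical.
L, N = 100.0, 101          # pipe length [m], grid nodes
a, g = 1000.0, 9.81        # pressure wave speed [m/s], gravity [m/s^2]
D, f = 0.1, 0.02           # diameter [m], Darcy friction factor
H0, V0 = 50.0, 1.0         # reservoir head [m], initial flow velocity [m/s]

dx = L / (N - 1)
dt = 0.9 * dx / a          # CFL condition for the explicit scheme

H = np.full(N, H0)         # piezometric head along the pipe
V = np.full(N, V0)         # velocity along the pipe

max_head = H0
for _ in range(int(0.2 / dt)):          # ~one round trip of the pressure wave
    Hn, Vn = H.copy(), V.copy()
    # Lax scheme on interior nodes for dH/dt = -(a^2/g) dV/dx and
    # dV/dt = -g dH/dx - f V|V| / (2D)
    Hn[1:-1] = 0.5 * (H[2:] + H[:-2]) - (a**2 / g) * dt / (2 * dx) * (V[2:] - V[:-2])
    Vn[1:-1] = (0.5 * (V[2:] + V[:-2]) - g * dt / (2 * dx) * (H[2:] - H[:-2])
                - dt * f * V[1:-1] * np.abs(V[1:-1]) / (2 * D))
    Hn[0], Vn[0] = H0, Vn[1]            # upstream reservoir: fixed head
    Vn[-1] = 0.0                        # valve closed instantly at t = 0
    Hn[-1] = H[-2] + (a / g) * V[-2]    # C+ characteristic relation at the valve
    H, V = Hn, Vn
    max_head = max(max_head, H[-1])

surge = max_head - H0
print(f"surge at valve: {surge:.1f} m (Joukowsky estimate: {a * V0 / g:.1f} m)")
```

After instantaneous valve closure, the head rise at the valve should approach the Joukowsky surge aV0/g (about 102 m for these numbers) until the relief wave returns from the reservoir.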

  9. Making web annotations persistent over time

    Energy Technology Data Exchange (ETDEWEB)

    Sanderson, Robert [Los Alamos National Laboratory; Van De Sompel, Herbert [Los Alamos National Laboratory

    2010-01-01

    As Digital Libraries (DL) become more aligned with the web architecture, their functional components need to be fundamentally rethought in terms of URIs and HTTP. Annotation, a core scholarly activity enabled by many DL solutions, exhibits a clearly unacceptable characteristic when existing models are applied to the web: due to the representations of web resources changing over time, an annotation made about a web resource today may no longer be relevant to the representation that is served from that same resource tomorrow. We assume the existence of archived versions of resources, and combine the temporal features of the emerging Open Annotation data model with the capability offered by the Memento framework that allows seamless navigation from the URI of a resource to archived versions of that resource, and arrive at a solution that provides guarantees regarding the persistence of web annotations over time. More specifically, we provide theoretical solutions and proof-of-concept experimental evaluations for two problems: reconstructing an existing annotation so that the correct archived version is displayed for all resources involved in the annotation, and retrieving all annotations that involve a given archived version of a web resource.
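The Memento framework referenced above (later standardized as RFC 7089) performs datetime negotiation over plain HTTP: the client asks a TimeGate for the archived version closest to a desired time via the Accept-Datetime request header. A minimal sketch of the client side follows; the TimeGate URI layout is archive-specific, and the one below is hypothetical:

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def timegate_request(timegate_base: str, target_uri: str, when: datetime):
    """Build the URL and headers for Memento datetime negotiation.

    Accept-Datetime carries the desired snapshot time in RFC 1123 format,
    as required by RFC 7089.
    """
    url = timegate_base.rstrip("/") + "/" + target_uri
    headers = {
        "Accept-Datetime": format_datetime(when.astimezone(timezone.utc), usegmt=True),
    }
    return url, headers

# Hypothetical TimeGate and target resource, for illustration only.
url, headers = timegate_request(
    "http://archive.example.org/timegate",
    "http://www.example.com/page",
    datetime(2010, 1, 1, tzinfo=timezone.utc),
)
```

A real client would issue a GET on `url` with these headers and follow the redirect to the selected Memento, whose `Memento-Datetime` response header states the snapshot time actually served.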

  10. Experimental Study on the Cause of Inorganic Scale Formation in the Water Injection Pipeline of Tarim Oilfield

    Directory of Open Access Journals (Sweden)

    Guihong Pei

    2014-01-01

    Full Text Available Scale formation of water injection pipeline will cause the pipeline to be corroded and increase frictional drag, which will induce the quality and quantity cannot meet the need of oil production process. The cause of scale formation in different oilfield is different because of the complex formation conditions. Taking one operation area of Tazhong oilfield as research object, the authors studied the water quality in different point along water injection pipeline through experiment studies, and analyzed the cause of inorganic scale formation and influence factors. The research results can provide theoretical guidance to anticorrosion and antiscale of oilfield pipeline.

  11. Missing genes in the annotation of prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Feng Wu-chun

    2010-03-01

Full Text Available Abstract Background Protein-coding gene detection in prokaryotic genomes is considered a much simpler problem than in intron-containing eukaryotic genomes. However, there have been reports that prokaryotic gene finder programs have problems with small genes (either over-predicting or under-predicting them). Therefore, the question arises as to whether current genome annotations are systematically missing small genes. Results We have developed a high-performance computing methodology to investigate this problem. In this methodology we compare all ORFs larger than or equal to 33 aa from all fully-sequenced prokaryotic replicons. Based on that comparison, and using conservative criteria requiring a minimum taxonomic diversity between conserved ORFs in different genomes, we have discovered 1,153 candidate genes that are missing from current genome annotations. These missing genes are similar only to each other and do not have any strong similarity to gene sequences in public databases, implying that these ORFs belong to missing gene families. We also uncovered 38,895 intergenic ORFs readily identified as putative genes by similarity to currently annotated genes (we call these absent annotations). The vast majority of the missing genes found are small (less than 100 aa). A comparison of select examples with GeneMark, EasyGene and Glimmer predictions yields evidence that some of these genes are escaping detection by these programs. Conclusions Prokaryotic gene finders and prokaryotic genome annotations require improvement for accurate prediction of small genes. The number of missing gene families found is likely a lower bound on the actual number, due to the conservative criteria used to determine whether an ORF corresponds to a real gene.
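The comparison described in this record starts from an exhaustive enumeration of ORFs of at least 33 aa. A minimal sketch of that first step, using the common start-to-stop convention and the standard genetic code (the study's actual ORF-calling rules may differ):

```python
# Enumerate open reading frames (ORFs) of at least `min_aa` codons in all
# six reading frames, using the start-to-stop convention. Illustrative only.
STOPS = {"TAA", "TAG", "TGA"}

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (uppercase ACGT assumed)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_orfs(seq: str, min_aa: int = 33):
    """Return (strand, start, end, length_aa) for every ATG-to-stop ORF
    of at least `min_aa` codons, scanning both strands in all three frames.
    Coordinates are relative to the scanned strand."""
    orfs = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if codon == "ATG" and start is None:
                    start = i                      # first start codon in frame
                elif codon in STOPS and start is not None:
                    aa_len = (i - start) // 3      # codons before the stop
                    if aa_len >= min_aa:
                        orfs.append((strand, start, i + 3, aa_len))
                    start = None
    return orfs

# A synthetic sequence with one 41-codon ORF on the forward strand.
demo_seq = "ATG" + "GCA" * 40 + "TAA"
demo_orfs = find_orfs(demo_seq)
```

Scaling this all-against-all across every sequenced replicon is what makes the study a high-performance computing problem rather than a scripting exercise.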

  12. Strength layout of pipelines

    International Nuclear Information System (INIS)

    Moczall, K.; Schmid, H.J.

    1976-01-01

Different properties are required of the pipes of nuclear power plants according to their respective importance for the safety and the availability of a plant. For reasons of expediency and profitability, it has therefore been necessary to assign five requirement stages to these and to all other components. Of the five stages, the first three cover components of major, medium, or minor importance for the safety of nuclear plants. The 4th stage generally covers components important for the availability of a plant, whereas the 5th stage is provided for the remaining pipes. The requirements of conventional technical regulations are generally fulfilled in the 4th stage, whereas correspondingly graded, more stringent requirements are set up and specified in the first three stages. These requirements refer to securing the workability of plant components, to (redundant or diverse) layout, and to the quality of components (qualification of producers and production processes, preliminary checking of production documents, supervision of production, material and construction testing, documentation), as well as to strength verification (pressure test, stress-strain measurements, stress and fatigue analysis). (orig.) [de

  13. Development of high productivity pipeline girth welding

    International Nuclear Information System (INIS)

    Yapp, David; Liratzis, Theocharis

    2010-01-01

The trend of increasing oil and gas consumption implies growth in long-distance pipeline installations. Welding is a critical factor in the installation of pipelines, both onshore and offshore, and the rate at which a pipeline can be laid is generally determined by the speed of welding. This has resulted in substantial developments in pipeline welding techniques. Arc welding is still the dominant process used in practice; forge welding processes have had limited successful application to date, in spite of large investments in process development. Power beam processes have also been investigated in detail, and the latest laser systems now show promise for practical application. In recent years the use of high strength steels has substantially reduced the cost of pipeline installation, with X70 and X80 being commonly used, and high strength linepipe produced by thermomechanical processing has also been researched. All of these processes must meet three requirements: high productivity, satisfactory weld properties, and weld quality

  14. Crowdsourcing and annotating NER for Twitter #drift

    DEFF Research Database (Denmark)

    Fromreide, Hege; Hovy, Dirk; Søgaard, Anders

    2014-01-01

We present two new NER datasets for Twitter: a manually annotated set of 1,467 tweets (kappa=0.942) and a set of 2,975 expert-corrected, crowdsourced NER annotated tweets from the dataset described in Finin et al. (2010). In our experiments with these datasets, we observe two important points: (a) language drift on Twitter is significant, and while off-the-shelf systems have been reported to perform well on in-sample data, they often perform poorly on new samples of tweets; (b) state-of-the-art performance across various datasets can be obtained from crowdsourced annotations, making it more feasible...

  15. DFAST and DAGA: web-based integrated genome annotation tools and resources.

    Science.gov (United States)

    Tanizawa, Yasuhiro; Fujisawa, Takatomo; Kaminuma, Eli; Nakamura, Yasukazu; Arita, Masanori

    2016-01-01

Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission can be done seamlessly online. The online workspace would be especially useful for users not familiar with bioinformatics. In addition, we have developed a genome repository, DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, Lactobacillus and Pediococcus, obtained from both DDBJ/ENA/GenBank and the Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess the taxonomic position based on genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power to determine whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in Lactobacillus gasseri and Lactobacillus jensenii, whose variation between subgroups is larger than the well-accepted ANI threshold of 95% to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp.
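ANI is typically computed by aligning fragments of one genome against another with BLAST or MUMmer and averaging the identities of the matching fragments. A deliberately naive toy version — not DFAST's implementation — illustrates the fragment-and-average structure behind the 95% species threshold:

```python
# Toy average nucleotide identity (ANI): chop genome A into fixed-length
# fragments, score each against the best ungapped window of genome B, and
# average. Real ANI tools use BLAST/MUMmer alignments; this is a sketch.

def fragment_identity(frag: str, ref: str) -> float:
    """Best ungapped fractional identity of `frag` against any window of `ref`."""
    best = 0.0
    for i in range(len(ref) - len(frag) + 1):
        window = ref[i:i + len(frag)]
        ident = sum(a == b for a, b in zip(frag, window)) / len(frag)
        best = max(best, ident)
    return best

def toy_ani(genome_a: str, genome_b: str, frag_len: int = 100,
            min_ident: float = 0.3) -> float:
    """Average identity (percent) over fragments of genome_a that find a
    match above `min_ident` — a crude stand-in for BLAST hit filtering."""
    idents = []
    for start in range(0, len(genome_a) - frag_len + 1, frag_len):
        ident = fragment_identity(genome_a[start:start + frag_len], genome_b)
        if ident >= min_ident:
            idents.append(ident)
    return 100.0 * sum(idents) / len(idents) if idents else 0.0

# Deterministic check: a 400-bp periodic 'genome' vs. a copy with 5
# substituted bases in the first fragment.
demo = "ACGTTGCA" * 50
mutated = demo[:10] + "AAAAA" + demo[15:]
ani_self = toy_ani(demo, demo)       # exact copy gives 100.0
ani_mut = toy_ani(mutated, demo)
```

With real ANI tools, two genomes of the same species typically score above ~95%; the toy version only mimics the overall computation, not the alignment quality.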

  16. Lessons Learned from Developing and Operating the Kepler Science Pipeline and Building the TESS Science Pipeline

    Science.gov (United States)

    Jenkins, Jon M.

    2017-01-01

The experience acquired through the development, implementation, and operation of the Kepler/K2 science pipelines provides lessons learned for the development of science pipelines for other missions, such as NASA's Transiting Exoplanet Survey Satellite and ESA's PLATO mission.

  17. Chile's pipelines - who's out in the cold?

    International Nuclear Information System (INIS)

    Bellhouse, G.

    1998-01-01

    There is a battle on in Northern Chile to supply the region with gas and electricity. Two pipelines and a transmission line are being built, but there is insufficient demand to merit the construction of all of these projects. It is widely believed that the first pipeline to be finished will be the overall winner, but the situation is not that simple. A more sensible conclusion could be the merger of the two pipeline projects, rationalising supply of gas to the region. (Author)

  18. Pipeline dreams face up to reality

    International Nuclear Information System (INIS)

    Ryan, Orla

    1999-01-01

    This article gives details of two gas pipelines which are expected to be built in Turkey to meet the estimated demand for gas. The Bluestream joint ENI/Gasprom project pipeline will convey Russian gas across the Black Sea to Turkey, and the PSG joint Bechtel/General Electric venture will bring gas from Turkmenistan to Turkey across the Caspian Sea. Construction of the pipelines and financing aspects are discussed. (uk)

  19. Computational annotation of genes differentially expressed along olive fruit development

    Directory of Open Access Journals (Sweden)

    Martinelli Federico

    2009-10-01

Full Text Available Abstract Background Olea europaea L. is a traditional tree crop of the Mediterranean basin with a high worldwide economic impact. Unlike other fruit tree species, little is known about the physiological and molecular basis of olive fruit development, and only a few sequences of genes and gene products are available for olive in public databases. This study deals with the identification of large sets of differentially expressed genes in developing olive fruits and their subsequent computational annotation by means of different software. Results mRNA from fruits of the cv. Leccino sampled at three different stages [i.e., initial fruit set (stage 1), completed pit hardening (stage 2) and veraison (stage 3)] was used for the identification of differentially expressed genes putatively involved in main processes along fruit development. Four subtractive hybridization libraries were constructed: forward and reverse between stages 1 and 2 (libraries A and B), and stages 2 and 3 (libraries C and D). All sequenced clones (1,132 in total) were analyzed through BlastX against non-redundant NCBI databases, and about 60% of them showed similarity to known proteins. A total of 89 out of 642 differentially expressed unique sequences were further investigated by Real-Time PCR, validating the SSH results at a rate as high as 69%. Library-specific cDNA repertories were annotated according to the three main vocabularies of the gene ontology (GO): cellular component, biological process and molecular function. BlastX analysis, GO term mapping and annotation analysis were performed using the Blast2GO software, a research tool designed with the main purpose of enabling GO-based data mining on sequence sets for which no GO annotation is yet available. Bioinformatic analysis pointed out a significantly different distribution of the annotated sequences for each GO category, when comparing the three fruit developmental stages. The olive fruit-specific transcriptome dataset was

  20. Snpdat: Easy and rapid annotation of results from de novo snp discovery projects for model and non-model organisms

    Directory of Open Access Journals (Sweden)

    Doran Anthony G

    2013-02-01

    Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs are the most abundant genetic variant found in vertebrates and invertebrates. SNP discovery has become a highly automated, robust and relatively inexpensive process allowing the identification of many thousands of mutations for model and non-model organisms. Annotating large numbers of SNPs can be a difficult and complex process. Many tools available are optimised for use with organisms densely sampled for SNPs, such as humans. There are currently few tools available that are species non-specific or support non-model organism data. Results Here we present SNPdat, a high throughput analysis tool that can provide a comprehensive annotation of both novel and known SNPs for any organism with a draft sequence and annotation. Using a dataset of 4,566 SNPs identified in cattle using high-throughput DNA sequencing we demonstrate the annotations performed and the statistics that can be generated by SNPdat. Conclusions SNPdat provides users with a simple tool for annotation of genomes that are either not supported by other tools or have a small number of annotated SNPs available. SNPdat can also be used to analyse datasets from organisms which are densely sampled for SNPs. As a command line tool it can easily be incorporated into existing SNP discovery pipelines and fills a niche for analyses involving non-model organisms that are not supported by many available SNP annotation tools. SNPdat will be of great interest to scientists involved in SNP discovery and analysis projects, particularly those with limited bioinformatics experience.

  1. Acoustic system for communication in pipelines

    Science.gov (United States)

    Martin, II, Louis Peter; Cooper, John F [Oakland, CA

    2008-09-09

    A system for communication in a pipe, or pipeline, or network of pipes containing a fluid. The system includes an encoding and transmitting sub-system connected to the pipe, or pipeline, or network of pipes that transmits a signal in the frequency range of 3-100 kHz into the pipe, or pipeline, or network of pipes containing a fluid, and a receiver and processor sub-system connected to the pipe, or pipeline, or network of pipes containing a fluid that receives said signal and uses said signal for a desired application.

  2. Fluid mixing with a pipeline tee

    Energy Technology Data Exchange (ETDEWEB)

Sroka, L.M.; Forney, L.J. (Georgia Inst. of Tech., Atlanta, GA (USA))

    1988-01-01

    The pipeline mixing of two fluid streams by turbulent jet injection normal to the pipeline has been studied theoretically and experimentally. A simple scaling law for the second moment of the tracer concentration within the pipeline is proposed for the first fifteen pipe diameters downstream from the injection point. The similarity solution is derived by assuming that the tracer diffuses in a weak compound jet moving parallel to the pipeline axis. The theoretical results are correlated with all of the available experimental measurements. The results indicate that the second moment of the tracer concentration decreases with increasing jet momentum and distance from the injection point.

  3. Transmission pipeline calculations and simulations manual

    CERN Document Server

    Menon, E Shashi

    2014-01-01

    Transmission Pipeline Calculations and Simulations Manual is a valuable time- and money-saving tool to quickly pinpoint the essential formulae, equations, and calculations needed for transmission pipeline routing and construction decisions. The manual's three-part treatment starts with gas and petroleum data tables, followed by self-contained chapters concerning applications. Case studies at the end of each chapter provide practical experience for problem solving. Topics in this book include pressure and temperature profile of natural gas pipelines, how to size pipelines for specified f

  4. Tubular lining material for pipelines having bends

    Energy Technology Data Exchange (ETDEWEB)

    Moringa, A.; Sakaguchi, Y.; Hyodo, M.; Yagi, I.

    1987-03-24

A tubular lining material for pipelines having bends or curved portions comprises a tubular textile jacket made of warps and wefts woven in a tubular form overlaid with a coating of a flexible synthetic resin. It is applicable onto the inner surface of a pipeline having bends or curved portions in such a manner that the tubular lining material, with a binder applied onto the inner surface thereof, is inserted into the pipeline and allowed to advance within the pipeline, with or without the aid of a leading rope-like elongated element, while turning the tubular lining material inside out under fluid pressure. In this manner the tubular lining material is applied onto the inner surface of the pipeline with the binder being interposed between the pipeline and the tubular lining material. The lining material is characterized in that part or all of the warps are composed of an elastic yarn around which, over its full length, a synthetic fiber yarn or yarns have been left- and/or right-handedly coiled. This tubular lining material is particularly suitable for lining a pipeline having an inner diameter of 25-200 mm and a plurality of bends, such as gas service pipelines or house pipelines, without the occurrence of wrinkles in the lining material at a bend.

  5. Thermal expansion absorbing structure for pipeline

    International Nuclear Information System (INIS)

    Nagata, Takashi; Yamashita, Takuya.

    1995-01-01

A thermal expansion absorbing structure for a pipeline is disposed at the end of the pipeline, forming a U-shaped cross section that connects a semi-circular torus shell and a short double-walled cylindrical tube. The U-shaped longitudinal cross-section deforms in accordance with the shrinking deformation of the pipeline and thereby absorbs thermal expansion. Namely, since the center lines of the outer and inner tubes of the double-walled cylindrical tube incline as the pipeline deforms under thermal expansion, the expansion can be absorbed by a simple structure, which contributes to ensuring safety. The entire length of the pipeline can be greatly shortened by applying the structure to pipelines operating at high temperature, compared with the conventional method of routing a pipeline using only elbows. Especially when applied to the pipeline of an FBR-type reactor, the construction cost of primary system facilities can be greatly reduced. In addition, it can be applied to pipelines in ordinary chemical plants and to any other structures requiring absorption of deformation. (N.H.)

  6. East, West German gas pipeline grids linked

    International Nuclear Information System (INIS)

    Anon.

    1992-01-01

    This paper reports that Ruhrgas AG, Essen, has started up the first large diameter gas pipeline linking the gas grids of former East and West Germany. Ruhrgas last month placed in service a 40 in., 70 km line at Vitzeroda, near Eisenach, linking a new Ruhrgas pipeline in Hesse state with a 330 km gas pipeline built last year in Thuringia and Saxony states by Erdgasversorgungs GmbH (EVG), Leipzig. The new link enables pipeline operator EVG to receive 70 bcf/year of western European gas via Ruhrgas, complementing the 35 bcf/year of gas coming from the Commonwealth of Independent States via Verbundnetz Gas AG (VNG), Leipzig

  7. A quick guide to pipeline engineering

    CERN Document Server

    Alkazraji, D

    2008-01-01

    Pipeline engineering requires an understanding of a wide range of topics. Operators must take into account numerous pipeline codes and standards, calculation approaches, and reference materials in order to make accurate and informed decisions.A Quick Guide to Pipeline Engineering provides concise, easy-to-use, and accessible information on onshore and offshore pipeline engineering. Topics covered include: design; construction; testing; operation and maintenance; and decommissioning.Basic principles are discussed and clear guidance on regulations is provided, in a way that will

  8. Meteor showers an annotated catalog

    CERN Document Server

    Kronk, Gary W

    2014-01-01

    Meteor showers are among the most spectacular celestial events that may be observed by the naked eye, and have been the object of fascination throughout human history. In “Meteor Showers: An Annotated Catalog,” the interested observer can access detailed research on over 100 annual and periodic meteor streams in order to capitalize on these majestic spectacles. Each meteor shower entry includes details of their discovery, important observations and orbits, and gives a full picture of duration, location in the sky, and expected hourly rates. Armed with a fuller understanding, the amateur observer can better view and appreciate the shower of their choice. The original book, published in 1988, has been updated with over 25 years of research in this new and improved edition. Almost every meteor shower study is expanded, with some original minor showers being dropped while new ones are added. The book also includes breakthroughs in the study of meteor showers, such as accurate predictions of outbursts as well ...

  9. Environmental, public health, and safety assessment of fuel pipelines and other freight transportation modes

    International Nuclear Information System (INIS)

    Strogen, Bret; Bell, Kendon; Breunig, Hanna; Zilberman, David

    2016-01-01

Highlights: • Externalities are examined for pipelines, truck, rail, and barge. • Safety impact factors include incidences of injuries, illnesses, and fatalities. • Environmental impact factors include CO2-eq emissions and air pollution disease burden. • Externalities are estimated for constructing and operating a large domestic pipeline. • A large pipeline has lower cumulative impacts than other modes within ten years. - Abstract: The construction of pipelines along high-throughput fuel corridors can alleviate demand for rail, barge, and truck transportation. Pipelines have a very different externality profile than other freight transportation modes due to differences in construction, operation, and maintenance requirements; labor, energy, and material input intensity; location and profile of emissions from operations; and frequency and magnitude of environmental and safety incidents. Therefore, public policy makers have a strong justification to influence the economic viability of pipelines. We use data from prior literature and U.S. government statistics to estimate environmental, public health, and safety characterization factors for pipelines and other modes. In 2008, two pipeline companies proposed the construction of an ethanol pipeline from the Midwest to the Northeast United States. This proposed project informs our case study of a 2735-km, $3.5 billion (2009 USD) pipeline, for which we evaluate potential long-term societal impacts including life-cycle costs, greenhouse gas emissions, employment, injuries, fatalities, and public health impacts. Although it may take decades to break even economically, and would result in lower cumulative employment, such a pipeline would likely have fewer safety incidents, pollution emissions, and health damages than the alternative multimodal system in less than ten years; these results stand even if comparing future cleaner ground transport modes to a pipeline that utilizes electricity produced from coal

  10. Annotation of glycoproteins in the SWISS-PROT database.

    Science.gov (United States)

    Jung, E; Veuthey, A L; Gasteiger, E; Bairoch, A

    2001-02-01

    SWISS-PROT is a protein sequence database, which aims to be nonredundant, fully annotated and highly cross-referenced. Most eukaryotic gene products undergo co- and/or post-translational modifications, and these need to be included in the database in order to describe the mature protein. SWISS-PROT includes information on many types of different protein modifications. As glycosylation is the most common type of post-translational protein modification, we are currently placing an emphasis on annotation of protein glycosylation in SWISS-PROT. Information on the position of the sugar within the polypeptide chain, the reducing terminal linkage as well as additional information on biological function of the sugar is included in the database. In this paper we describe how we account for the different types of protein glycosylation, namely N-linked glycosylation, O-linked glycosylation, proteoglycans, C-linked glycosylation and the attachment of glycosyl-phosphatidylinosital anchors to proteins.

  11. Image annotation by deep neural networks with attention shaping

    Science.gov (United States)

    Zheng, Kexin; Lv, Shaohe; Ma, Fang; Chen, Fei; Jin, Chi; Dou, Yong

    2017-07-01

Image annotation is the task of assigning semantic labels to an image. Recently, deep neural networks with visual attention have been utilized successfully in many computer vision tasks. In this paper, we show that the conventional attention mechanism is easily misled by the salient class, i.e., the attended region always contains part of the image area describing the content of the salient class at different attention iterations. To this end, we propose a novel attention shaping mechanism, which aims to maximize the non-overlapping area between consecutive attention processes by taking into account the history of previous attention vectors. Several weighting policies are studied to utilize the history information in different manners. On two benchmark datasets, PASCAL VOC2012 and MIRFlickr-25k, the average precision is improved by up to 10% in comparison with state-of-the-art annotation methods.
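The abstract does not spell out its weighting policies, so the following NumPy sketch is a guessed "max over history" policy, purely to illustrate the mechanism of suppressing previously attended regions before renormalizing:

```python
import numpy as np

def shape_attention(scores, history, strength=1.0):
    """Attention over regions, penalized by coverage from previous attention
    vectors. `history` is a list of earlier attention distributions; the
    'max over history' weighting policy here is one illustrative choice."""
    penalty = np.max(history, axis=0) if len(history) else np.zeros_like(scores)
    shaped = scores - strength * penalty      # suppress already-attended regions
    e = np.exp(shaped - shaped.max())         # stable softmax
    return e / e.sum()

# Two attention iterations over a 5-region image representation (toy numbers):
scores = np.array([2.0, 1.0, 0.5, 0.2, 0.1])
a1 = shape_attention(scores, [])              # first pass: plain softmax
a2 = shape_attention(scores, [a1], strength=5.0)  # second pass avoids region 0
```

With a large enough penalty strength, the second attention vector shifts away from the salient region picked in the first pass, which is the non-overlap behavior the paper's shaping objective encourages.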

  12. PipelineDog: a simple and flexible graphic pipeline construction and maintenance tool.

    Science.gov (United States)

    Zhou, Anbo; Zhang, Yeting; Sun, Yazhou; Xing, Jinchuan

    2017-11-23

Analysis pipelines are an essential part of bioinformatics research, and ad hoc pipelines are frequently created by researchers for prototyping and proof-of-concept purposes. However, most existing pipeline management systems or workflow engines are too complex for rapid prototyping or for learning the pipeline concept. A lightweight, user-friendly, and flexible solution is thus desirable. In this study, we developed a new pipeline construction and maintenance tool, PipelineDog. This is a web-based integrated development environment with a modern web graphical user interface. It offers cross-platform compatibility, project management capabilities, code formatting and error checking functions, and an online repository. It uses an easy-to-read/write script system that encourages code reuse. With the online repository, it also encourages the sharing of pipelines, which enhances analysis reproducibility and accountability. For most users, PipelineDog requires no software installation. Overall, this web application provides a way to rapidly create and easily manage pipelines. The PipelineDog web app is freely available at http://web.pipeline.dog. The command line version is available at http://www.npmjs.com/package/pipelinedog, and the online repository at http://repo.pipeline.dog. Contact: ysun@kean.edu or xing@biology.rutgers.edu. Supplementary data are available at Bioinformatics online.

  13. Addressing the workforce pipeline challenge

    Energy Technology Data Exchange (ETDEWEB)

    Leonard Bond; Kevin Kostelnik; Richard Holman

    2006-11-01

A secure and affordable energy supply is essential to U.S. national security, to continued prosperity, and to laying the foundations for future economic growth. To meet this goal, the next-generation energy workforce in the U.S., in particular the workforce needed to support instrumentation, controls, and advanced operations and maintenance, is a critical element. The workforce is aging, and a new workforce pipeline, to support both the current generation of plants and new builds, has yet to be established. This paper reviews the challenges and some actions being taken to address this need.

  14. An Informally Annotated Bibliography of Sociolinguistics.

    Science.gov (United States)

    Tannen, Deborah

    This annotated bibliography of sociolinguistics is divided into the following sections: speech events, ethnography of speaking and anthropological approaches to analysis of conversation; discourse analysis (including analysis of conversation and narrative), ethnomethodology and nonverbal communication; sociolinguistics; pragmatics (including…

  15. Annotation and retrieval in protein interaction databases

    Science.gov (United States)

    Cannataro, Mario; Hiram Guzzi, Pietro; Veltri, Pierangelo

    2014-06-01

    Biological databases have been developed with a special focus on the efficient retrieval of single records or the efficient computation of specialized bioinformatics algorithms against the overall database, as in sequence alignment. The continuous production of biological knowledge, spread across several biological databases and ontologies such as Gene Ontology, and the availability of efficient techniques to handle such knowledge, such as annotation and semantic similarity measures, enable the development of novel bioinformatics applications that explicitly use and integrate such knowledge. After introducing the annotation process and the main semantic similarity measures, this paper shows how annotations and semantic similarity can be exploited to improve the extraction and analysis of biologically relevant data from protein interaction databases. As case studies, the paper presents two novel software tools, OntoPIN and CytoSeVis, both based on the use of Gene Ontology annotations, for the advanced querying of protein interaction databases and for the enhanced visualization of protein interaction networks.
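
    One of the simplest annotation-based semantic similarity measures mentioned in passing above is set overlap between the GO terms annotating two proteins. The sketch below uses the Jaccard index; OntoPIN and CytoSeVis may well use more sophisticated, ontology-aware measures, and the protein names and GO IDs are illustrative only.

    ```python
    # Minimal annotation-based similarity: Jaccard index over the sets
    # of GO terms annotating two proteins (illustrative IDs).

    def jaccard_similarity(terms_a, terms_b):
        """Fraction of shared annotations between two GO term sets."""
        a, b = set(terms_a), set(terms_b)
        if not a and not b:
            return 0.0
        return len(a & b) / len(a | b)

    annotations = {
        "proteinA": {"GO:0005524", "GO:0004672", "GO:0006468"},
        "proteinB": {"GO:0005524", "GO:0004672", "GO:0007165"},
    }

    sim = jaccard_similarity(annotations["proteinA"], annotations["proteinB"])
    print(round(sim, 2))  # 2 shared / 4 total = 0.5
    ```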

  16. SASL: A Semantic Annotation System for Literature

    Science.gov (United States)

    Yuan, Pingpeng; Wang, Guoyin; Zhang, Qin; Jin, Hai

    Due to ambiguity, search engines for scientific literature may not return the right search results. One efficient solution to this problem is to automatically annotate the literature and attach semantic information to it. Generally, semantic annotation requires identifying entities before attaching semantic information to them. However, due to abbreviations and other factors, it is very difficult to identify entities correctly. This paper presents a Semantic Annotation System for Literature (SASL), which uses Wikipedia as a knowledge base to annotate literature. SASL mainly attaches semantics to terminology, academic institutions, conferences, journals, etc. Many of these are usually abbreviations, which introduces ambiguity. SASL uses regular expressions to extract the mapping between the full names of entities and their abbreviations. Since the full names of several entities may map to a single abbreviation, SASL introduces a Hidden Markov Model to perform name disambiguation. Finally, the paper presents experimental results, which confirm that SASL achieves good performance.
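
    The regex-based mapping step described above can be sketched as follows: find "Full Name (ABBR)" patterns and keep those where the abbreviation matches the initials of the preceding capitalized words. This is a guess at the kind of pattern involved, not SASL's actual rules, and the HMM disambiguation step is not reproduced here.

    ```python
    import re

    # Hedged sketch of regex-based abbreviation extraction: match runs of
    # capitalized words followed by a parenthesized all-caps abbreviation,
    # and accept the mapping only when the initials line up.

    def extract_abbreviations(text):
        mappings = {}
        for match in re.finditer(r"((?:[A-Z][\w-]*\s+)+)\(([A-Z]{2,})\)", text):
            words, abbr = match.group(1).split(), match.group(2)
            initials = "".join(w[0].upper() for w in words)
            if initials == abbr:
                mappings[abbr] = " ".join(words)
        return mappings

    text = "We use a Hidden Markov Model (HMM) for name disambiguation."
    print(extract_abbreviations(text))  # {'HMM': 'Hidden Markov Model'}
    ```

    The ambiguity problem the abstract raises appears exactly when two different full names produce the same key in this dictionary, which is where a disambiguation model becomes necessary.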

  17. Temporal Annotation in the Clinical Domain

    Science.gov (United States)

    Styler, William F.; Bethard, Steven; Finan, Sean; Palmer, Martha; Pradhan, Sameer; de Groen, Piet C; Erickson, Brad; Miller, Timothy; Lin, Chen; Savova, Guergana; Pustejovsky, James

    2014-01-01

    This article discusses the requirements of a formal specification for the annotation of temporal information in clinical narratives. We discuss the implementation and extension of ISO-TimeML for annotating a corpus of clinical notes, known as the THYME corpus. To reflect the information task and the heavily inference-based reasoning demands in the domain, a new annotation guideline has been developed, “the THYME Guidelines to ISO-TimeML (THYME-TimeML)”. To clarify what relations merit annotation, we distinguish between linguistically-derived and inferentially-derived temporal orderings in the text. We also apply a top-performing TempEval 2013 system against this new resource to measure the difficulty of adapting systems to the clinical domain. The corpus is available to the community and has been proposed for use in a SemEval 2015 task. PMID:29082229

  18. WormBase: Annotating many nematode genomes.

    Science.gov (United States)

    Howe, Kevin; Davis, Paul; Paulini, Michael; Tuli, Mary Ann; Williams, Gary; Yook, Karen; Durbin, Richard; Kersey, Paul; Sternberg, Paul W

    2012-01-01

    WormBase (www.wormbase.org) has been serving the scientific community for over 11 years as the central repository for genomic and genetic information for the soil nematode Caenorhabditis elegans. The resource has evolved from its beginnings as a database housing the genomic sequence and genetic and physical maps of a single species, and now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase's role of genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation, and discuss the impact on annotation quality of large functional genomics projects such as modENCODE.

  19. Annotated Tsunami bibliography: 1962-1976

    International Nuclear Information System (INIS)

    Pararas-Carayannis, G.; Dong, B.; Farmer, R.

    1982-08-01

    This compilation contains annotated citations to nearly 3000 tsunami-related publications from 1962 to 1976 in English and several other languages. The foreign-language citations have English titles and abstracts

  20. Annotating Human P-Glycoprotein Bioassay Data.

    Science.gov (United States)

    Zdrazil, Barbara; Pinto, Marta; Vasanthanathan, Poongavanam; Williams, Antony J; Balderud, Linda Zander; Engkvist, Ola; Chichester, Christine; Hersey, Anne; Overington, John P; Ecker, Gerhard F

    2012-08-01

    Huge amounts of small-compound bioactivity data have been entering the public domain as a consequence of open innovation initiatives. It is now time to carefully analyse existing bioassay data and give it a systematic structure. Our study aims to annotate prominent in vitro assays used for the determination of bioactivities of human P-glycoprotein inhibitors and substrates as they are represented in the ChEMBL and TP-search open-source databases. Furthermore, the extent to which data determined in different assays can be combined with each other is explored. As a result of this study, it is suggested that for inhibitors of human P-glycoprotein it is possible to combine data coming from the same assay type, provided the cell lines used are identical and the fluorescent or radiolabeled substrates have overlapping binding sites. In addition, the study demonstrates a need for larger, chemically diverse datasets measured in a panel of different assays. This would certainly aid the search for further inter-correlations between bioactivity data yielded by different assay setups.

  1. Pydpiper: A Flexible Toolkit for Constructing Novel Registration Pipelines

    Directory of Open Access Journals (Sweden)

    Miriam eFriedel

    2014-07-01

    Using neuroimaging technologies to elucidate the relationship between genotype and phenotype, and between brain and behavior, will be a key contribution to biomedical research in the twenty-first century. Among the many methods for analyzing neuroimaging data, image registration deserves particular attention due to its wide range of applications. Finding strategies to register many images together and analyze the differences between them can be a challenge, particularly given that different experimental designs require different registration strategies. Moreover, writing software that can handle different types of image registration pipelines in a flexible, reusable and extensible way can be challenging. In response to this challenge, we have created Pydpiper, a neuroimaging registration toolkit written in Python. Pydpiper is an open-source, freely available pipeline framework that provides multiple modules for various image registration applications. Pydpiper offers five key innovations: (1) a robust file handling class that allows access to outputs from all stages of registration at any point in the pipeline; (2) the ability of the framework to eliminate duplicate stages; (3) reusable, easy-to-subclass modules; (4) a development toolkit written for non-developers; (5) four complete applications that run complex image registration pipelines "out-of-the-box." In this paper, we discuss both the general Pydpiper framework and the various ways in which component modules can be pieced together to easily create new registration pipelines. This includes a discussion of the core principles motivating code development and a comparison of Pydpiper with other available toolkits. We also provide a comprehensive, line-by-line example to orient users with limited programming knowledge and highlight some of the most useful features of Pydpiper. In addition, we present the four current applications of the code.
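
    One of the innovations listed above, eliminating duplicate stages, can be illustrated with a toy pipeline class in which a stage is reduced to a hashable (command, inputs, outputs) record and re-adding an identical stage is a no-op. Pydpiper's real Pipeline and stage classes are far richer than this; the MINC-style command names are just placeholders.

    ```python
    # Illustrative duplicate-stage elimination: identical stages are
    # detected by their (command, inputs, outputs) signature.

    class Pipeline:
        def __init__(self):
            self._stages = []
            self._seen = set()

        def add_stage(self, command, inputs, outputs):
            key = (command, tuple(inputs), tuple(outputs))
            if key in self._seen:        # duplicate stage: skip it
                return False
            self._seen.add(key)
            self._stages.append(key)
            return True

        def __len__(self):
            return len(self._stages)

    p = Pipeline()
    p.add_stage("mincblur", ["img1.mnc"], ["img1_blur.mnc"])
    p.add_stage("mincblur", ["img1.mnc"], ["img1_blur.mnc"])  # duplicate
    p.add_stage("minctracc", ["img1_blur.mnc"], ["img1_lin.xfm"])
    print(len(p))  # 2: the duplicate blur stage was eliminated
    ```

    Deduplicating by signature matters in registration pipelines because many subject pairs share identical preprocessing steps; running each only once saves substantial compute.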

  2. 77 FR 16052 - Information Collection Activities: Pipelines and Pipeline Rights-of-Way; Submitted for Office of...

    Science.gov (United States)

    2012-03-19

    ... Modification (ROW)-- $3,865. Section 250.1008(e)--Pipeline Repair Notification--$360. Section 250.1015(a... Bureau of Safety and Environmental Enforcement Information Collection Activities: Pipelines and Pipeline... regulations under Subpart J, ``Pipelines and Pipeline Rights-of-Way.'' This notice also provides the public a...

  3. 78 FR 53190 - Pipeline Safety: Notice to Operators of Hazardous Liquid and Natural Gas Pipelines of a Recall on...

    Science.gov (United States)

    2013-08-28

    ... Pipeline and Hazardous Materials Safety Administration Pipeline Safety: Notice to Operators of Hazardous Liquid and Natural Gas Pipelines of a Recall on Leak Repair Clamps Due to Defective Seal AGENCY: Pipeline... to Operators of Natural Gas and Hazardous Liquid Pipelines of a Recall on Leak Repair Clamps Due to...

  4. Fluid Annotations in an Open World

    DEFF Research Database (Denmark)

    Zellweger, Polle Trescott; Bouvin, Niels Olof; Jehøj, Henning

    2001-01-01

    Fluid Documents use animated typographical changes to provide a novel and appealing user experience for hypertext browsing and for viewing document annotations in context. This paper describes an effort to broaden the utility of Fluid Documents by using the open hypermedia Arakne Environment to layer fluid annotations and links on top of arbitrary HTML pages on the World Wide Web. Changes to both Fluid Documents and Arakne are required.

  5. Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N; King, Benjamin L; Polson, Shawn W; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F; Page, Shallee T; Rendino, Marc Farnum; Thomas, William Kelley; Udwary, Daniel W; Wu, Cathy H

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome.

  7. A Java-based fMRI processing pipeline evaluation system for assessment of univariate general linear model and multivariate canonical variate analysis-based pipelines.

    Science.gov (United States)

    Zhang, Jing; Liang, Lichen; Anderson, Jon R; Gatewood, Lael; Rottenberg, David A; Strother, Stephen C

    2008-01-01

    As functional magnetic resonance imaging (fMRI) becomes widely used, the demands for evaluation of fMRI processing pipelines and validation of fMRI analysis results are increasing rapidly. The current NPAIRS package, an IDL-based fMRI processing pipeline evaluation framework, lacks system interoperability and the ability to evaluate general linear model (GLM)-based pipelines using prediction metrics. Thus, it cannot fully evaluate fMRI analytical software modules such as FSL.FEAT and NPAIRS.GLM. To overcome these limitations, a Java-based fMRI processing pipeline evaluation system was developed. It integrated YALE (a machine learning environment) into Fiswidgets (an fMRI software environment) to obtain system interoperability, and applied an algorithm to measure GLM prediction accuracy. The results demonstrated that the system can evaluate fMRI processing pipelines with univariate GLM and multivariate canonical variates analysis (CVA)-based models on real fMRI data, based on prediction accuracy (classification accuracy) and statistical parametric image (SPI) reproducibility. In addition, a preliminary study was performed in which four fMRI processing pipelines with GLM and CVA modules, such as FSL.FEAT and NPAIRS.CVA, were evaluated with the system. The results indicated that (1) the system can compare different fMRI processing pipelines with heterogeneous models (NPAIRS.GLM, NPAIRS.CVA and FSL.FEAT) and rank their performance by automatic performance scoring, and (2) the ranking of pipeline performance is highly dependent on the preprocessing operations. These results suggest that the system will be of value for the comparison, validation, standardization and optimization of functional neuroimaging software packages and fMRI processing pipelines.
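
    The two evaluation metrics named above can be sketched concretely: prediction accuracy is ordinary classification accuracy, and SPI reproducibility is commonly summarized as the Pearson correlation of voxel values between SPIs computed from independent data splits. The numbers below are toy values, not NPAIRS output.

    ```python
    # Sketch of the two metrics: classification accuracy and a split-half
    # Pearson correlation between two statistical parametric images.

    def accuracy(true_labels, predicted):
        return sum(t == p for t, p in zip(true_labels, predicted)) / len(true_labels)

    def pearson(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        vx = sum((a - mx) ** 2 for a in x) ** 0.5
        vy = sum((b - my) ** 2 for b in y) ** 0.5
        return cov / (vx * vy)

    acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])          # 3 of 4 correct
    repro = pearson([0.1, 0.5, 0.9], [0.2, 0.4, 1.0])   # SPI split-half r
    print(acc)  # 0.75
    ```

    A pipeline that scores high on prediction but low on reproducibility (or vice versa) is exactly the trade-off the NPAIRS framework was designed to expose.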

  8. Annotation Method (AM): SE41_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE41_AM1 PowerGet annotation In annotation process, KEGG, KNApSAcK and LipidMAPS ar..., predicted molecular formulas are used for the annotation. MS/MS patterns was used to suggest functional gr...-MS Fragment Viewer (http://webs2.kazusa.or.jp/msmsfragmentviewer/) are used for annotation and identification of the compounds. ...

  9. Annotation of microsporidian genomes using transcriptional signals.

    Science.gov (United States)

    Peyretaillade, Eric; Parisot, Nicolas; Polonais, Valérie; Terrat, Sébastien; Denonfoux, Jérémie; Dugat-Bony, Eric; Wawrzyniak, Ivan; Biderre-Petit, Corinne; Mahul, Antoine; Rimour, Sébastien; Gonçalves, Olivier; Bornes, Stéphanie; Delbac, Frédéric; Chebance, Brigitte; Duprat, Simone; Samson, Gaëlle; Katinka, Michael; Weissenbach, Jean; Wincker, Patrick; Peyret, Pierre

    2012-01-01

    High-quality annotation of microsporidian genomes is essential for understanding the biological processes that govern the development of these parasites. Here we present an improved structural annotation method using transcriptional DNA signals. We apply this method to re-annotate four previously annotated genomes, allowing us to detect annotation errors and identify a significant number of unpredicted genes. We then annotate the newly sequenced genome of Anncaliia algerae. A comparative genomic analysis of A. algerae permits the identification of not only microsporidian core genes, but also potentially highly expressed genes encoding membrane-associated proteins, which represent good candidates involved in the spore architecture, the invasion process and the microsporidian-host relationships. Furthermore, we find that the ten-fold variation in microsporidian genome sizes is not due to gene number, size or complexity, but instead stems from the presence of transposable elements. Such elements, along with kinase regulatory pathways and specific transporters, appear to be key factors in microsporidian adaptive processes.

  10. Annotating the human genome with Disease Ontology

    Science.gov (United States)

    Osborne, John D; Flatow, Jared; Holko, Michelle; Lin, Simon M; Kibbe, Warren A; Zhu, Lihua (Julie); Danila, Maria I; Feng, Gang; Chisholm, Rex L

    2009-01-01

    Background: The human genome has been extensively annotated with Gene Ontology for biological functions, but minimally computationally annotated for diseases. Results: We used the Unified Medical Language System (UMLS) MetaMap Transfer tool (MMTx) to discover gene-disease relationships from the GeneRIF database. We utilized a comprehensive subset of UMLS, which is disease-focused and structured as a directed acyclic graph (the Disease Ontology), to filter and interpret results from MMTx. The results were validated against the Homayouni gene collection using recall and precision measurements. We compared our results with the widely used Online Mendelian Inheritance in Man (OMIM) annotations. Conclusion: The validation data set suggests a 91% recall rate and 97% precision rate of disease annotation using GeneRIF, in contrast with a 22% recall and 98% precision using OMIM. Our thesaurus-based approach allows for comparisons to be made between disease-containing databases and allows for increased accuracy in disease identification through synonym matching. The much higher recall rate of our approach demonstrates that annotating the human genome with Disease Ontology and GeneRIF for diseases dramatically increases the coverage of the disease annotation of the human genome. PMID:19594883
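
    The recall and precision figures quoted above are computed against a validation set of known gene-disease pairs. A minimal sketch of the two measures, using toy annotation pairs rather than the Homayouni collection itself:

    ```python
    # Recall = fraction of reference annotations recovered;
    # precision = fraction of predicted annotations that are correct.

    def recall_precision(predicted, reference):
        tp = len(predicted & reference)       # true positives
        recall = tp / len(reference)
        precision = tp / len(predicted)
        return recall, precision

    reference = {("BRCA1", "breast cancer"), ("CFTR", "cystic fibrosis"),
                 ("HTT", "Huntington disease"), ("APOE", "Alzheimer disease")}
    predicted = {("BRCA1", "breast cancer"), ("CFTR", "cystic fibrosis"),
                 ("HTT", "Huntington disease")}

    r, p = recall_precision(predicted, reference)
    print(r, p)  # recall 0.75, precision 1.0
    ```

    The GeneRIF-vs-OMIM comparison in the abstract is exactly this trade-off: both sources are precise, but GeneRIF recovers far more of the reference annotations.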

  11. ADEPt, a semantically-enriched pipeline for extracting adverse drug events from free-text electronic health records.

    Directory of Open Access Journals (Sweden)

    Ehtesham Iqbal

    Adverse drug events (ADEs) are unintended responses to medical treatment. They can greatly affect a patient's quality of life and present a substantial burden on healthcare. Although electronic health records (EHRs) document a wealth of information relating to ADEs, it is frequently stored in unstructured or semi-structured free-text narrative, requiring natural language processing (NLP) techniques to mine the relevant information. Here we present a rule-based ADE detection and classification pipeline built and tested on a large psychiatric corpus comprising 264k patients, using the de-identified EHRs of four UK-based psychiatric hospitals. The pipeline uses characteristics specific to psychiatric EHRs to guide the annotation process, and distinguishes: (a) the temporal value associated with the ADE mention (whether it is historical or present), (b) the categorical value of the ADE (whether it is assertive, hypothetical, retrospective or a general discussion) and (c) the implicit contextual value, where the status of the ADE is deduced from surrounding indicators rather than explicitly stated. We manually created the rulebase in collaboration with clinicians and pharmacists by studying ADE mentions in various types of clinical notes. We evaluated the open-source Adverse Drug Event annotation Pipeline (ADEPt) using 19 ADEs specific to antipsychotic and antidepressant medication. The ADEs chosen vary in severity, regularity and persistence. The average F-measure and accuracy achieved by our tool across all tested ADEs were both 0.83. In addition to annotation power, the ADEPt pipeline presents an improvement to the state-of-the-art context-discerning algorithm, ConText.
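
    The temporal and categorical distinctions described above are typically implemented as trigger-word rules. The miniature below classifies an ADE mention as historical vs. present and hypothetical vs. assertive; the trigger lists and rules are invented for illustration only — ADEPt's rulebase was built with clinicians and is far more extensive.

    ```python
    import re

    # Hypothetical trigger-based context rules (illustrative, not ADEPt's).
    HISTORICAL = re.compile(r"\b(previously|history of|in the past)\b", re.I)
    HYPOTHETICAL = re.compile(r"\b(if|may cause|risk of|watch for)\b", re.I)

    def classify_mention(sentence):
        """Assign a temporal and a categorical value to an ADE mention."""
        temporal = "historical" if HISTORICAL.search(sentence) else "present"
        status = "hypothetical" if HYPOTHETICAL.search(sentence) else "assertive"
        return temporal, status

    print(classify_mention("History of weight gain on olanzapine."))
    print(classify_mention("Watch for tremor if lithium is restarted."))
    ```

    This is the same family of technique as the ConText algorithm the abstract compares against: scoped trigger terms that modify the status of a nearby clinical mention.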

  12. Electrical fingerprint of pipeline defects

    International Nuclear Information System (INIS)

    Mica, Isabella; Polignano, Maria Luisa; Marco, Cinzia De

    2004-01-01

    Pipeline defects are dislocations that connect the source region of a transistor with the drain region. They have been widely reported to occur in CMOS and BiCMOS devices, and recently in SOI technologies. They can reduce device yield either by affecting device functionality or by increasing current consumption under stand-by conditions. In this work the electrical fingerprint of these dislocations is studied, with the purpose of enabling these defects to be identified as the ones responsible for device failure. It is shown that pipeline defects are responsible for a source-to-drain leakage current in the transistors. This leakage has a resistive characteristic and is only lightly modulated by the body bias. It is not sensitive to temperature, whereas the off-current of a good transistor exhibits the well-known exponential dependence on 1/T. The emission spectrum of these defects was studied and compared with the spectrum of a good transistor. The paper aims to show that the spectrum of a defective transistor is quite peculiar: it shows well-defined peaks, whereas the spectrum of a good transistor under saturation conditions is characterized by a broad spectral light emission distribution. Finally, deep-level transient spectroscopy (DLTS) was applied to defective diodes.

  13. 77 FR 61826 - Pipeline Safety: Communication During Emergency Situations

    Science.gov (United States)

    2012-10-11

    .... PHMSA-2012-0201] Pipeline Safety: Communication During Emergency Situations AGENCY: Pipeline and... this Advisory Bulletin to reiterate the importance of immediate dialogue between pipeline facility...: Communication During Emergency Situations Advisory: To further enhance the Department's safety efforts, PHMSA is...

  14. Oil and Natural Gas Pipelines, North America, 2010, Platts

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Oil and Natural Gas Pipeline geospatial data layer contains gathering, interstate, and intrastate natural gas pipelines, crude and product oil pipelines, and...

  15. Increase of ecological safety of the pipeline

    International Nuclear Information System (INIS)

    Dr Movsumov, Sh.N.; Prof Aliyev, F.G.

    2005-01-01

    To increase the ecological safety of a pipeline, it is necessary to decrease the damage (risk) the pipeline poses to the surrounding natural environment, which depends on: the frequency of pipeline failure; the volume of spilled oil; and the sensitivity factor of the environment where the oil spill occurs. The frequency of pipeline failure depends on the physico-chemical properties of the pipeline material, its technical characteristics (wall thickness, pipe length, working pressure), the seismicity of the district the pipeline crosses, and the method of laying the pipeline (underground or above ground). The volume of spilled oil depends on the diameter of the damage, the resistance of the pipeline to mechanical and other external actions, the ambient temperature, the capacity of the pipeline, the distance between the valves installed in the pipeline, and the time necessary for their full closure. The sensitivity factor of the environment depends on the geological structure and landscape of the district (mountains, rivers, settlements) the pipeline crosses. The report considers questions of increasing the ecological safety of the pipeline at the design stage and during its construction and operation. To improve the ecological safety of the pipeline, the following actions are necessary: ecological education of the public living along the oil pipeline route; ecological monitoring; and development of a public oil-spill response plan. Ecological education of the public requires: informing the public on all (technical, ecological, socio-economic and legal) questions connected to the oil pipeline, and on methods of defending their rights when participating in environmentally significant decisions; creation of public groups to monitor compliance with the legislation and to prevent risks; exposure of hot
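
    The abstract enumerates three determinants of pipeline environmental risk: failure frequency, spilled volume, and an environmental sensitivity factor. A common way to combine such factors is a multiplicative expected-damage index; the formula and all numbers below are an illustrative assumption, not taken from the paper.

    ```python
    # Assumed multiplicative risk index combining the three factors the
    # abstract names (failure frequency, spill volume, sensitivity).

    def risk_index(failure_freq_per_km_yr, length_km, spill_volume_m3, sensitivity):
        """Expected annual damage index for one pipeline segment (toy model)."""
        return failure_freq_per_km_yr * length_km * spill_volume_m3 * sensitivity

    # Segment crossing a river valley (high sensitivity) vs. open plain.
    river = risk_index(2e-4, 10.0, 500.0, sensitivity=3.0)
    plain = risk_index(2e-4, 10.0, 500.0, sensitivity=1.0)
    print(river, plain)  # the river segment scores 3x higher
    ```

    Such an index is only a ranking tool: it lets a designer compare routing or valve-spacing options segment by segment, which is how the factors in the abstract would feed into design decisions.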

  16. Unique features of the loblolly pine (Pinus taeda L.) megagenome revealed through sequence annotation.

    Science.gov (United States)

    Wegrzyn, Jill L; Liechty, John D; Stevens, Kristian A; Wu, Le-Shin; Loopstra, Carol A; Vasquez-Gross, Hans A; Dougherty, William M; Lin, Brian Y; Zieve, Jacob J; Martínez-García, Pedro J; Holt, Carson; Yandell, Mark; Zimin, Aleksey V; Yorke, James A; Crepeau, Marc W; Puiu, Daniela; Salzberg, Steven L; Dejong, Pieter J; Mockaitis, Keithanne; Main, Doreen; Langley, Charles H; Neale, David B

    2014-03-01

    The largest genus in the conifer family Pinaceae is Pinus, with over 100 species. The size and complexity of their genomes (∼20-40 Gb, 2n = 24) have delayed the arrival of a well-annotated reference sequence. In this study, we present the annotation of the first whole-genome shotgun assembly of loblolly pine (Pinus taeda L.), which comprises 20.1 Gb of sequence. The MAKER-P annotation pipeline combined evidence-based alignments and ab initio predictions to generate 50,172 gene models, of which 15,653 are classified as high confidence. Clustering these gene models with 13 other plant species resulted in 20,646 gene families, of which 1554 are predicted to be unique to conifers. Among the conifer gene families, 159 are composed exclusively of loblolly pine members. The gene models for loblolly pine have the highest median and mean intron lengths of 24 fully sequenced plant genomes. Conifer genomes are full of repetitive DNA, with the most significant contributions from long-terminal-repeat retrotransposons. In-depth analysis of the tandem and interspersed repetitive content yielded a combined estimate of 82%.

  17. Annotation and analysis of the genome of Phycomyces blakesleeanus, a model photoresponsive zygomycete

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Salamov, Asaf; Pangilinan, Jasmyn; Lindquist, Erika; Shapiro, Harris; Baker, Scott; Corrochano, Luis; Grigoriev, Igor

    2007-03-19

    Light induces in P. blakesleeanus multiple developmental and biochemical responses (sporangiophore growth and development, beta-carotene synthesis). P. blakesleeanus is an intensively studied, experimentally tractable model organism, and whole-genome analysis is expected to further elucidate the signaling pathways underlying its photoregulation. To this end, the genome was sequenced to 7.49X depth and assembled into 475 scaffolds totaling 56 Mbp, and 47847 ESTs were assembled from cDNAs of light and dark cultures. We combined into a single annotation pipeline a variety of gene modeling methods (homology-based, EST-based, and ab initio), and predicted 14792 protein-coding genes. Many of these gene predictions are supported by homology in nr (68%), by Pfam domains (44%), or by ESTs (35%). We next assigned GO terms to 41% of the proteins and EC numbers to 16%. We then distributed these annotations to the Phycomyces consortium, along with tools to curate them manually. We expect that the annotation will provide a solid platform for expression analysis. In addition to its value as a model organism, P. blakesleeanus is the second zygomycete with a sequenced genome, after the related Rhizopus oryzae. We therefore will present preliminary results of comparative analysis between the two zygomycetes.

  18. annot8r: GO, EC and KEGG annotation of EST datasets

    Directory of Open Access Journals (Sweden)

    Schmid Ralf

    2008-04-01

    Background: The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets poses a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO terms, EC numbers and KEGG pathways. Results: annot8r automatically downloads all files relevant to the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation, and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated, and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and in a relational PostgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools. Conclusion: annot8r is a tool that assigns GO, EC and KEGG annotations to datasets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non
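
    The final annot8r step described above, parsing BLAST results and retrieving the matching annotations from a reference table, can be sketched with a toy BLAST tabular result and an in-memory GO lookup. Real annot8r parses full BLAST reports into PostgreSQL; the accessions, GO IDs, and the e-value cutoff here are illustrative assumptions.

    ```python
    # Toy -outfmt-6-style BLAST output: query, subject, %identity,
    # alignment length, e-value (tab-separated).
    blast_tabular = """\
    est_001\tP12345\t92.1\t210\t1e-50
    est_002\tQ67890\t88.0\t180\t2e-30
    """

    go_reference = {           # UniProt accession -> GO terms (illustrative)
        "P12345": ["GO:0005524", "GO:0004672"],
        "Q67890": ["GO:0003677"],
    }

    def annotate(blast_output, reference, max_evalue=1e-10):
        """Map each query to the GO terms of its significant best hit."""
        annotations = {}
        for line in blast_output.strip().splitlines():
            query, subject, _ident, _length, evalue = line.strip().split("\t")
            if float(evalue) <= max_evalue:
                annotations[query] = reference.get(subject, [])
        return annotations

    print(annotate(blast_tabular, go_reference))
    ```

    Keeping the annotation lookup separate from the similarity search is the design the abstract describes: the BLAST step finds homologs, and the reference database supplies their curated GO/EC/KEGG terms.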

  19. Construction of a public CHO cell line transcript database using versatile bioinformatics analysis pipelines.

    Directory of Open Access Journals (Sweden)

    Oliver Rupp

    Full Text Available Chinese hamster ovary (CHO) cell lines represent the most commonly used mammalian expression system for the production of therapeutic proteins. In this context, detailed knowledge of the CHO cell transcriptome might help to improve biotechnological processes conducted by specific cell lines. Nevertheless, very few assembled cDNA sequences of CHO cells were publicly released until recently, which placed a severe limitation on biotechnological research. Two extended annotation systems and web-based tools, one for browsing eukaryotic genomes (GenDBE) and one for viewing eukaryotic transcriptomes (SAMS), were established as the first step towards a publicly usable CHO cell genome/transcriptome analysis platform. This is complemented by the development of a new strategy to assemble the ca. 100 million reads, sequenced from a broad range of diverse transcripts, into a high-quality CHO cell transcript set. The cDNA libraries were constructed from different CHO cell lines grown under various culture conditions and sequenced using Roche/454 and Illumina sequencing technologies, in addition to sequencing reads from a previous study. Two pipelines to extend and improve the CHO cell line transcripts were established. First, de novo assemblies were carried out with the Trinity and Oases assemblers, using varying k-mer sizes. The resulting contigs were screened for potential CDS using ESTScan. Redundant contigs were filtered out using cd-hit-est. The remaining CDS contigs were re-assembled with CAP3. Second, a reference-based assembly with the TopHat/Cufflinks pipeline was performed, using the recently published draft genome sequence of CHO-K1 as reference. Additionally, the de novo contigs were mapped to the reference genome using GMAP and merged with the Cufflinks assembly using the cuffmerge software. With this approach 28,874 transcripts located on 16,492 gene loci could be assembled. Combining the results of both approaches, 65,561 transcripts were identified
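The redundancy-filtering step in the de novo pipeline above (cd-hit-est) can be illustrated with a toy greedy clustering sketch. This shows only the idea of keeping one representative per redundant group; cd-hit-est's real algorithm uses word-based identity clustering with tunable thresholds, and the sequences below are made up.

```python
def filter_redundant(contigs):
    """Toy greedy redundancy filter: process contigs longest-first and drop any
    sequence fully contained in an already-kept representative. This mimics the
    purpose of cd-hit-est in the pipeline, not its actual algorithm, which
    clusters by approximate identity rather than exact containment."""
    kept = []
    for seq in sorted(contigs, key=len, reverse=True):
        if not any(seq in representative for representative in kept):
            kept.append(seq)
    return kept

contigs = ["ATGCCGTA", "GCCGT", "ATGCCGTAAT", "TTTTAA"]
print(filter_redundant(contigs))
```

Both shorter sequences are substrings of the longest contig and are dropped; only the two non-redundant representatives survive.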

  20. Considerations about the Urucu-Manaus gas pipeline design; Consideracoes sobre o projeto do gasoduto Urucu-Manaus

    Energy Technology Data Exchange (ETDEWEB)

    Villela, Claudio Henrique Lobianco G.; Correia, Luiz de Carvalho Dias [PETROBRAS S.A., Rio de Janeiro, RJ (Brazil)

    2005-07-01

    This paper presents the characteristics that shaped the design of the Urucu-Manaus Gas Pipeline Project and the differences between this pipeline and others already installed in the Amazon region. The project emphasized aspects related to route definition; mapping technologies not previously used in our pipeline projects; the crossing of vast flooded areas, which required specific studies; and the minimization of environmental impacts, in this case because of animal species found only in this region. Another differentiating factor was the Rio Negro crossing, where the pipeline will be installed in the riverbed. The know-how gained from this project further consolidates the activity of building pipelines in tropical forest regions. (author)

  1. Analysis of buried pipelines at Kozloduy

    International Nuclear Information System (INIS)

    Asfura, A.

    1999-01-01

    This paper describes the analysis of the buried pipelines at Kozloduy NPP. It includes a description of the studied pipelines and their properties, a detailed description of the methodology applied, the evaluation of the soil strain field, and a graphical representation of the results obtained.

  2. The LOFAR Known Pulsar Data Pipeline

    NARCIS (Netherlands)

    Alexov, A.; Hessels, J.W.T.; Mol, J.D.; Stappers, B.; van Leeuwen, J.

    2010-01-01

    Abstract: Transient radio phenomena and pulsars are one of six LOFAR Key Science Projects (KSPs). As part of the Transients KSP, the Pulsar Working Group (PWG) has been developing the LOFAR Pulsar Data Pipelines to both study known pulsars as well as search for new ones. The pipelines are being

  3. Testing the School-to-Prison Pipeline

    Science.gov (United States)

    Owens, Emily G.

    2017-01-01

    The School-to-Prison Pipeline is a social phenomenon where students become formally involved with the criminal justice system as a result of school policies that use law enforcement, rather than discipline, to address behavioral problems. A potentially important part of the School-to-Prison Pipeline is the use of sworn School Resource Officers…

  4. The MIRI Medium Resolution Spectrometer calibration pipeline

    NARCIS (Netherlands)

    Labiano, A.; Azzollini, R.; Bailey, J.; Beard, S.; Dicken, D.; García-Marín, M.; Geers, V.; Glasse, A.; Glauser, A.; Gordon, K.; Justtanont, K.; Klaassen, P.; Lahuis, F.; Law, D.; Morrison, J.; Müller, M.; Rieke, G.; Vandenbussche, B.; Wright, G.

    2016-01-01

    The Mid-Infrared Instrument (MIRI) Medium Resolution Spectrometer (MRS) is the only mid-IR Integral Field Spectrometer on board James Webb Space Telescope. The complexity of the MRS requires a very specialized pipeline, with some specific steps not present in other pipelines of JWST instruments,

  5. Oil pipeline energy consumption and efficiency

    Energy Technology Data Exchange (ETDEWEB)

    Hooker, J.N.

    1981-01-01

    This report describes an investigation of energy consumption and efficiency of oil pipelines in the US in 1978. It is based on a simulation of the actual movement of oil on a very detailed representation of the pipeline network, and it uses engineering equations to calculate the energy that pipeline pumps must have exerted on the oil to move it in this manner. The efficiencies of pumps and drivers are estimated so as to arrive at the amount of energy consumed at pumping stations. The throughput in each pipeline segment is estimated by distributing each pipeline company's reported oil movements over its segments in proportions predicted by regression equations that show typical throughput and throughput capacity as functions of pipe diameter. The form of the equations is justified by a generalized cost-engineering study of pipelining, and their parameters are estimated using new techniques developed for the purpose. A simplified model of flow scheduling is chosen on the basis of actual energy use data obtained from a few companies. The study yields energy consumption and intensiveness estimates for crude oil trunk lines, crude oil gathering lines and oil products lines, for the nation as well as by state and by pipe diameter. It characterizes the efficiency of typical pipelines of various diameters operating at capacity. Ancillary results include estimates of oil movements by state and by diameter and approximate pipeline capacity utilization nationwide.
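The engineering calculation at the heart of the report above (the pumping energy required to move oil through a pipeline segment) can be sketched with the standard Darcy-Weisbach pressure-drop relation and the Swamee-Jain friction factor. The diameter, length, fluid properties and pump efficiency below are hypothetical illustrations; the report's network simulation and regression-based throughput estimates are far more detailed.

```python
import math

def pumping_power_w(flow_m3_s, length_m, diameter_m, roughness_m=4.5e-5,
                    density=870.0, viscosity=0.01, pump_eff=0.8):
    """Hydraulic power to move oil through one pipeline segment, using the
    Darcy-Weisbach pressure drop with the Swamee-Jain explicit friction factor.
    Fluid properties and pump efficiency here are illustrative assumptions."""
    area = math.pi * diameter_m ** 2 / 4.0
    v = flow_m3_s / area                           # mean velocity, m/s
    re = density * v * diameter_m / viscosity      # Reynolds number
    f = 0.25 / math.log10(roughness_m / (3.7 * diameter_m) + 5.74 / re ** 0.9) ** 2
    dp = f * (length_m / diameter_m) * 0.5 * density * v ** 2   # pressure drop, Pa
    return dp * flow_m3_s / pump_eff               # shaft power at the pump, W

# A 0.5 m diameter, 100 km segment moving 0.3 m^3/s of crude:
print(pumping_power_w(0.3, 100e3, 0.5) / 1e6, "MW")
```

Summing this quantity over every segment and station of a network, with measured pump and driver efficiencies, is essentially how segment-level energy intensiveness estimates of the kind described above are built up.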

  6. The School-to-Prison Pipeline

    Science.gov (United States)

    Elias, Marilyn

    2013-01-01

    Policies that encourage police presence at schools, harsh tactics including physical restraint, and automatic punishments that result in suspensions and out-of-class time are huge contributors to the school-to-prison pipeline, but the problem is more complex than that. The school-to-prison pipeline starts (or is best avoided) in the classroom.…

  7. Protection of pipelines affected by surface subsidence

    International Nuclear Information System (INIS)

    Luo, Y.; Peng, S.S.; Chen, H.J.

    1998-01-01

    Surface subsidence resulting from underground coal mining can cause problems for buried pipelines. A technique for assessing the level of stress on a subsidence-affected pipeline is introduced. The main contributors to the stress are identified, and mitigation techniques for reducing the stress are proposed. The proposed mitigation techniques were then successfully tested. 13 refs., 8 figs., 2 tabs

  8. Saudi Aramco experience towards establishing Pipelines Integrity Management System (PIMS)

    Energy Technology Data Exchange (ETDEWEB)

    Al-Ahmari, Saad A. [Saudi Aramco, Dhahran (Saudi Arabia)

    2009-07-01

    Saudi Aramco's pipeline network transports hydrocarbons to export terminals, processing plants and domestic users. This network has faced several safety- and operations-related challenges that require a more effective Pipelines Integrity Management System (PIMS). Saudi Aramco therefore decided to develop its PIMS on the basis of geographical information system (GIS) support through different phases, i.e., establishing the integrity management framework and the risk-calculation approach, conducting a gap analysis toward the envisioned PIMS, establishing the required scope of work, screening the PIMS applications market, selecting suitable tools that satisfy the expected deliverables, and implementing the PIMS applications. Saudi Aramco expects great benefits from implementing PIMS, e.g., enhancing safety, enhancing pipeline network robustness, optimizing inspection and maintenance expenditures, and facilitating pipeline management and the decision-making process. Saudi Aramco's new experience in adopting PIMS includes many challenges and lessons learned associated with all of the PIMS development phases. These challenges include performing the gap analysis, conducting QA/QC sensitivity analysis for the acquired data, establishing the scope of work, selecting the appropriate applications and implementing PIMS. (author)

  10. Vibration analysis of liquid-filled pipelines with elastic constraints

    Science.gov (United States)

    Liu, Gongmin; Li, Yanhua

    2011-06-01

    In this paper, a transfer matrix method (TMM) in the frequency domain considering fluid-structure interaction of liquid-filled pipelines with elastic constraints is proposed. The time-domain equations considering fluid-structure interaction are transformed into the frequency domain by Laplace transformation, and then twelve fourth-order ordinary differential equations and two second-order ordinary differential equations are deduced from the frequency-domain equations. The solutions of the fourteen frequency-domain equations are assembled into a transfer matrix, which represents the motion of a single pipe section. Combined with point matrices that describe specified boundary conditions, an overall transfer matrix for the liquid-filled pipeline system can be assembled. Using the method, pipelines with no constraints or with rigid constraints can be easily calculated by simply setting the stiffness of the restraining springs anywhere from zero to a very large value. Taking into account longitudinal, transverse and torsional vibration, the proposed method can also be used to analyze pipelines with bends. Several numerical examples with different constraints are presented to illustrate the application of the proposed method. The results are validated against measured and simulation data. Through the numerical examples, it is shown that the proposed method is efficient.
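The transfer-matrix idea behind the method above can be shown on a much simpler case: harmonic axial vibration of a uniform rod, whose 2x2 section matrix relates the state vector (displacement, axial force) at the two ends. This is a sketch of the general technique only, not the paper's coupled 14-equation liquid-filled-pipe formulation; the material values are hypothetical.

```python
import math

def axial_transfer_matrix(omega, L, E, A, rho):
    """2x2 transfer matrix relating the state (u, N) at the two ends of a
    uniform rod section in harmonic axial vibration. Illustrates the
    transfer-matrix idea; the liquid-filled pipe model of the paper couples
    fourteen frequency-domain equations and is not reproduced here."""
    c = math.sqrt(E / rho)          # axial wave speed
    k = omega / c                   # wavenumber
    return [[math.cos(k * L),              math.sin(k * L) / (E * A * k)],
            [-E * A * k * math.sin(k * L), math.cos(k * L)]]

def chain(matrices):
    """Overall transfer matrix of sections in series: product of section matrices."""
    M = [[1.0, 0.0], [0.0, 1.0]]
    for m in matrices:
        M = [[m[0][0]*M[0][0] + m[0][1]*M[1][0], m[0][0]*M[0][1] + m[0][1]*M[1][1]],
             [m[1][0]*M[0][0] + m[1][1]*M[1][0], m[1][0]*M[0][1] + m[1][1]*M[1][1]]]
    return M

# Fixed-free steel rod: boundary conditions u(0) = 0 and N(L) = 0 give the
# frequency condition M[1][1](omega) = 0, i.e. cos(kL) = 0, so f1 = c / (4L).
E, A, rho, L = 210e9, 1e-3, 7800.0, 2.0
c = math.sqrt(E / rho)
f1 = c / (4 * L)
M = chain([axial_transfer_matrix(2 * math.pi * f1, L, E, A, rho)])
print(abs(M[1][1]))  # vanishes at the first natural frequency
```

Boundary conditions enter exactly as in the abstract: they pick out which entry of the chained matrix must vanish, and root-finding on that entry over frequency yields the natural frequencies.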

  11. Field sludge characterization obtained from inner of pipelines

    Energy Technology Data Exchange (ETDEWEB)

    Nava, N.; Sosa, E.; Alamilla, J.L. [Instituto Mexicano del Petroleo, Programa de Integridad de Ductos, Eje Central Lazaro Cardenas Norte 152, San Bartolo Atepehuacan, C.P. 07730 (Mexico); Knigth, C. [PEMEX Refinacion, Avenida Marina Nacional 329, Edificio B-2, Piso 11, C.P. 11311 (Mexico); Contreras, A. [Instituto Mexicano del Petroleo, Programa de Integridad de Ductos, Eje Central Lazaro Cardenas Norte 152, San Bartolo Atepehuacan, C.P. 07730 (Mexico)], E-mail: acontrer@imp.mx

    2009-11-15

    Physicochemical characterization of sludge obtained from a refined-hydrocarbon transmission pipeline was carried out through Moessbauer spectroscopy and X-ray diffraction. The Moessbauer and X-ray patterns indicate the presence of corrosion products composed of different iron oxide and sulfide phases. Hematite (α-Fe₂O₃), magnetite (Fe₃O₄), maghemite (γ-Fe₂O₃), magnetic and superparamagnetic goethite (α-FeOOH), pyrrhotite (Fe₁₋ₓS), akaganeite (β-FeOOH), and lepidocrocite (γ-FeOOH) were identified as corrosion products in samples obtained from the pipeline transporting Magna and Premium gasoline. For the diesel transmission pipeline, hematite, magnetite, and magnetic goethite were identified. The corrosion products follow a simple reaction mechanism of steel dissolution in aerated aqueous media at near-neutral pH. The chemical composition of the corrosion products depends on H₂O and sulfur inherent in the fluids (traces). These results can be useful for decision-making with regard to pipeline corrosion control.

  12. Optimal inspection planning for onshore pipelines subject to external corrosion

    International Nuclear Information System (INIS)

    Gomes, Wellison J.S.; Beck, André T.; Haukaas, Terje

    2013-01-01

    Continuous operation of pipeline systems involves significant expenditures in inspection and maintenance activities. The cost-effective safety management of such systems involves allocating the optimal amount of resources to inspection and maintenance activities, in order to control risks (expected costs of failure). In this context, this article addresses the optimal inspection planning for onshore pipelines subject to external corrosion. The investigation addresses a challenging problem of practical relevance, and strives to use the best available models to describe random corrosion growth and the relevant limit state functions. A single pipeline segment is considered in this paper. Expected numbers of failures and repairs are evaluated by Monte Carlo sampling, and a novel procedure is employed to evaluate sensitivities of the objective function with respect to design parameters. This procedure is shown to be accurate and more efficient than finite differences. The optimum inspection interval is found for an example problem, and the robustness of this optimum to the assumed inspection and failure costs is investigated. It is shown that the optimum total expected costs found herein are not highly sensitive to the assumed costs of inspection and failure. -- Highlights: • Inspection, repair and failure costs of pipeline systems considered. • Optimum inspection schedule (OIS) obtained by minimizing total expected life-cycle costs. • Robustness of OIS evaluated w.r.t. estimated costs of inspection and failure. • Accurate non-conservative models of corrosion growth employed
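The core trade-off studied above (inspect too often and inspection costs dominate; too rarely and expected failure costs dominate) can be illustrated with a toy Monte Carlo life-cycle cost model. The cost figures, linear corrosion-growth model, repair rule and thresholds below are illustrative assumptions, not the paper's models, which use far more realistic corrosion growth and limit state functions.

```python
import random

def expected_lifecycle_cost(interval_yr, horizon_yr=40, n_sims=2000,
                            c_inspect=50e3, c_repair=200e3, c_fail=5e6,
                            fail_depth=8.0, repair_depth=5.0,
                            rate_mean=0.2, rate_sd=0.05, seed=42):
    """Toy Monte Carlo estimate of expected life-cycle cost for one pipeline
    segment with random linear corrosion growth (mm/yr). Defects deeper than
    repair_depth found at inspection are repaired; crossing fail_depth between
    inspections counts as a failure. All parameter values are illustrative."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        rate = max(0.0, rng.gauss(rate_mean, rate_sd))
        depth = cost = t = 0.0
        while t < horizon_yr:
            t += interval_yr
            depth += rate * interval_yr       # growth since the last inspection
            if depth >= fail_depth:           # wall breached before detection
                cost += c_fail
                break
            cost += c_inspect
            if depth >= repair_depth:         # deep defect found and repaired
                cost += c_repair
                depth = 0.0
        total += cost
    return total / n_sims

candidates = [2, 5, 10, 20]
costs = {T: expected_lifecycle_cost(T) for T in candidates}
best = min(costs, key=costs.get)
print(best, {T: round(c / 1e6, 2) for T, c in costs.items()})
```

Even this crude model reproduces the qualitative result: total expected cost is convex in the inspection interval, with an interior optimum between over-inspection and excessive failure risk.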

  13. STRESS AND STRAIN STATE OF REPAIRING SECTION OF PIPELINE

    Directory of Open Access Journals (Sweden)

    V. V. Nikolaev

    2015-01-01

    Full Text Available The reliability of continuous pipeline operation is a pressing problem. For this reason, an effective system for warning of failures and accidents on main pipelines should be developed, covering not only design and operation but also selective repair. A change from the linear position, unloaded by bending, changes the stress and strain state of the pipeline, and this state should be determined and controlled while the repair works are carried out. The article presents a mathematical model of the straining of a pipeline section in a viscoelastic setting, taking into account soil creep and the high-speed stress state of the pipeline, with the purpose of evaluating the stresses and the load-supporting capacity of the section under repair as functions of time. The stress and strain state analysis of the pipeline includes the calculation of longitudinal and circumferential stresses with account of axially asymmetric straining, and was carried out on the basis of the momentless theory of shells. To prove the consistency of the data, the calculation results were compared with solutions obtained by analytical methods for several cases: strain of a long pipeline section under cross-axis load only; strain of a long pipeline section under longitudinal stress; and strain of a long pipeline section resting on an elastic foundation under cross-axis load. The comparison shows that the calculation error is not more than 3%. An analysis of the change in the stress-strain state of the pipeline section was carried out with this model, indicating an enlargement of the span deflection in comparison with the solution of the problem in the elastic approach. It is also shown that, for a consistent assessment of pipeline maintenance conditions, it is necessary to consider the areas of rheological processes in the soils. On the basis of a complex analysis of pipelines there were determined stresses and time

  14. Offshore Pipeline Locations in the Gulf of Mexico, Geographic NAD27, MMS (2007) [pipelines_vectors_mms_2007]

    Data.gov (United States)

    Louisiana Geographic Information Center — Offshore Minerals Management Pipeline Locations for the Gulf of Mexico (GOM). Contains the lines of the pipeline in the GOM. All pipelines existing in the databases...

  15. Offshore Pipeline Locations in the Gulf of Mexico, Geographic NAD27, MMS (2007) [pipelines_points_mms_2007]

    Data.gov (United States)

    Louisiana Geographic Information Center — Offshore Minerals Management Pipeline Locations for the Gulf of Mexico (GOM). Contains the points of the pipeline in the GOM. All pipelines existing in the databases...

  16. The Hyper Suprime-Cam software pipeline

    Science.gov (United States)

    Bosch, James; Armstrong, Robert; Bickerton, Steven; Furusawa, Hisanori; Ikeda, Hiroyuki; Koike, Michitaro; Lupton, Robert; Mineo, Sogo; Price, Paul; Takata, Tadafumi; Tanaka, Masayuki; Yasuda, Naoki; AlSayyad, Yusra; Becker, Andrew C.; Coulton, William; Coupon, Jean; Garmilla, Jose; Huang, Song; Krughoff, K. Simon; Lang, Dustin; Leauthaud, Alexie; Lim, Kian-Tat; Lust, Nate B.; MacArthur, Lauren A.; Mandelbaum, Rachel; Miyatake, Hironao; Miyazaki, Satoshi; Murata, Ryoma; More, Surhud; Okura, Yuki; Owen, Russell; Swinbank, John D.; Strauss, Michael A.; Yamada, Yoshihiko; Yamanoi, Hitomi

    2018-01-01

    In this paper, we describe the optical imaging data processing pipeline developed for the Subaru Telescope's Hyper Suprime-Cam (HSC) instrument. The HSC Pipeline builds on the prototype pipeline being developed by the Large Synoptic Survey Telescope's Data Management system, adding customizations for HSC, large-scale processing capabilities, and novel algorithms that have since been reincorporated into the LSST codebase. While designed primarily to reduce HSC Subaru Strategic Program (SSP) data, it is also the recommended pipeline for reducing general-observer HSC data. The HSC pipeline includes high-level processing steps that generate coadded images and science-ready catalogs as well as low-level detrending and image characterizations.

  17. Efficiency improvements in pipeline transportation systems

    Energy Technology Data Exchange (ETDEWEB)

    Banks, W. F.; Horton, J. F.

    1977-09-09

    This report identifies potential energy-conservative pipeline innovations that are most energy- and cost-effective and formulates recommendations for the R, D, and D programs needed to exploit those opportunities. From a candidate field of over twenty classes of efficiency improvements, eight systems are recommended for pursuit. Most of these possess two highly important attributes: large potential energy savings and broad applicability outside the pipeline industry. The R, D, and D program for each improvement and the recommended immediate next step are described. The eight technologies recommended for R, D, and D are gas-fired combined cycle compressor station; internally cooled internal combustion engine; methanol-coal slurry pipeline; methanol-coal slurry-fired and coal-fired engines; indirect-fired coal-burning combined-cycle pump station; fuel-cell pump station; drag-reducing additives in liquid pipelines; and internal coatings in pipelines.

  18. Millennium Pipeline Presentation : a new northeast passage

    International Nuclear Information System (INIS)

    Wolnik, J.

    1997-01-01

    Routes of the proposed Millennium Pipeline project were presented. The pipeline is to originate at the Empress gas field in Alberta and link up to eastern markets in the United States. One of the key advantages of the pipeline is that it will have the lowest proposed rates from Empress to Chicago and through links via affiliates to New York and other eastern markets. It will include 380 miles of new 36-inch pipeline and have a capacity of 650 million cubic feet per day. In many instances it will follow existing rights-of-way. The pipeline is expected to be in service for the 1999 winter heating season. The project sponsors are Columbia Gas Transmission, CMS Energy, MCN Energy, and Westcoast Energy. 6 figs

  19. The local heat treatment equipment and technology of the pipelines welded joints

    International Nuclear Information System (INIS)

    Korol'kov, P.M.

    1998-01-01

    The principal methods and equipment for local heat treatment of pipeline welded joints in different industry branches are described. Recommendations on the application of heat treatment equipment and technology are given

  20. Pipeline integrity: ILI baseline data for QRA

    Energy Technology Data Exchange (ETDEWEB)

    Porter, Todd R. [Tuboscope Pipeline Services, Houston, TX (United States)]. E-mail: tporter@varco.com; Silva, Jose Augusto Pereira da [Pipeway Engenharia, Rio de Janeiro, RJ (Brazil)]. E-mail: guto@pipeway.com; Marr, James [MARR and Associates, Calgary, AB (Canada)]. E-mail: jmarr@marr-associates.com

    2003-07-01

    The initial phase of a pipeline integrity management program (IMP) is conducting a baseline assessment of the pipeline system and segments as part of Quantitative Risk Assessment (QRA). This gives the operator's integrity team the opportunity to identify critical areas and deficiencies in the protection, maintenance, and mitigation strategies. As a part of data gathering and integration of a wide variety of sources, in-line inspection (ILI) data is a key element. In order to move forward in the integrity program development and execution, the baseline geometry of the pipeline must be determined with accuracy and confidence. From this, all subsequent analysis and conclusions will be derived. Tuboscope Pipeline Services (TPS), in conjunction with Pipeway Engenharia of Brazil, operate ILI inertial navigation system (INS) and Caliper geometry tools, to address this integrity requirement. This INS and Caliper ILI tool data provides pipeline trajectory at centimeter level resolution and sub-metre 3D position accuracy along with internal geometry - ovality, dents, misalignment, and wrinkle/buckle characterization. Global strain can be derived from precise INS curvature measurements and departure from the initial pipeline state. Accurate pipeline elevation profile data is essential in the identification of sag/overbend sections for fluid dynamic and hydrostatic calculations. This data, along with pipeline construction, operations, direct assessment and maintenance data is integrated in LinaViewPRO™, a pipeline data management system for decision support functions, and subsequent QRA operations. This technology provides the baseline for an informed, accurate and confident integrity management program. This paper/presentation will detail these aspects of an effective IMP, and experience will be presented, showing the benefits for liquid and gas pipeline systems. (author)
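For pure bending, the strain-from-curvature derivation mentioned above reduces to the standard beam relation between outer-fiber strain, pipe diameter and centerline curvature, ε = (D/2)·κ. A minimal sketch follows; the pipe size and bend radius are hypothetical, and a vendor's actual global-strain algorithm also accounts for axial effects.

```python
def bending_strain(outer_diameter_m, curvature_per_m):
    """Outer-fiber bending strain of a pipe from its centerline curvature:
    strain = (D/2) * kappa = (D/2) / R. Standard beam-bending relation;
    real ILI strain processing also handles axial strain components."""
    return 0.5 * outer_diameter_m * curvature_per_m

# A 0.508 m (20-inch) pipe bent to a 500 m radius of curvature:
kappa = 1.0 / 500.0
strain = bending_strain(0.508, kappa)
print(strain)  # dimensionless strain; multiply by 100 for percent
```

Applied point-by-point along the INS-measured trajectory, and differenced against the as-built state, this is the basis for flagging segments whose bending strain exceeds an integrity threshold.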

  1. Semi-Semantic Annotation: A guideline for the URDU.KON-TB treebank POS annotation

    Directory of Open Access Journals (Sweden)

    Qaiser ABBAS

    2016-12-01

    Full Text Available This work elaborates the semi-semantic part-of-speech annotation guidelines for the URDU.KON-TB treebank, an annotated corpus. A hierarchical annotation scheme was designed to label the parts of speech and then applied to the corpus. The raw corpus was collected from the Urdu Wikipedia and the Jang newspaper and then annotated with the proposed semi-semantic part-of-speech labels. The corpus contains text from local and international news, social stories, sports, culture, finance, religion, travel, etc. This exercise finally contributed a part-of-speech annotation layer to the URDU.KON-TB treebank. Twenty-two main part-of-speech categories are divided into subcategories, which encode the morphological and semantic information. This article mainly reports the annotation guidelines; however, it also briefly describes the development of the URDU.KON-TB treebank, including the raw corpus collection, the design and employment of the annotation scheme and, finally, its statistical evaluation and results. The guidelines presented here will be useful for the linguistic community in annotating sentences not only in the national language Urdu but also in other indigenous languages such as Punjabi, Sindhi, Pashto, etc.

  2. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    Directory of Open Access Journals (Sweden)

    Shu-Chuan Chen

    Full Text Available The MixtureTree Annotator, written in Java, allows the user to automatically color any phylogenetic tree in Newick format generated by any phylogeny reconstruction program and to output the corresponding Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator offers a unique advantage over other programs that perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. In order to visualize the resulting output file, a modified version of FigTree is used. Certain popular tools that lack good built-in visualization support, for example MEGA, Mesquite, PHY-FI, TreeView, TreeGraph and Geneious, may give results with human errors, either because colors must be added to each node manually or because of other limitations, for example only coloring based on a numerical value such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy to use, while still allowing the user full control over the coloring and annotating process.
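The core idea of coloring leaves by sequence name can be sketched in a few lines: extract leaf labels from a Newick string and assign one color per name group. This is an illustration of the concept only, not the MixtureTree Annotator's code; the grouping rule (prefix before the first underscore), the palette and the example tree are all hypothetical, and the real tool emits a Nexus file consumed by a modified FigTree.

```python
import re

PALETTE = ["#e41a1c", "#377eb8", "#4daf4a", "#984ea3"]

def leaf_names(newick):
    """Extract leaf labels from a Newick string (tokens following '(' or ',')."""
    return re.findall(r"[(,]([A-Za-z0-9_.|-]+)", newick)

def color_by_prefix(newick):
    """Assign one palette color per name prefix (text before the first '_'),
    so sequences from the same group share a color automatically."""
    colors, groups = {}, {}
    for name in leaf_names(newick):
        prefix = name.split("_")[0]
        if prefix not in groups:
            groups[prefix] = PALETTE[len(groups) % len(PALETTE)]
        colors[name] = groups[prefix]
    return colors

tree = "((popA_1:0.1,popA_2:0.2):0.05,(popB_1:0.3,popB_2:0.1):0.02);"
print(color_by_prefix(tree))
```

Automating this mapping is exactly what removes the per-node manual coloring (and its attendant human error) that the abstract criticizes in other viewers.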

  3. Extracting Cross-Ontology Weighted Association Rules from Gene Ontology Annotations.

    Science.gov (United States)

    Agapito, Giuseppe; Milano, Marianna; Guzzi, Pietro Hiram; Cannataro, Mario

    2016-01-01

    Gene Ontology (GO) is a structured repository of concepts (GO terms) that are associated with one or more gene products through a process referred to as annotation. The analysis of annotated data is an important opportunity for bioinformatics. There are different approaches to such analysis; among them, the use of association rules (AR) provides useful knowledge by discovering biologically relevant associations between terms of GO that were not previously known. In a previous work, we introduced GO-WAR (Gene Ontology-based Weighted Association Rules), a methodology for extracting weighted association rules from ontology-based annotated datasets. We here adapt the GO-WAR algorithm to mine cross-ontology association rules, i.e., rules that involve GO terms present in the three sub-ontologies of GO. We conduct a deep performance evaluation of GO-WAR by mining publicly available GO annotated datasets, showing how GO-WAR outperforms current state-of-the-art approaches.
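The cross-ontology rule-mining idea can be sketched with a toy support/confidence miner over gene-to-term annotations, where each gene product carries terms from the three GO sub-ontologies (here tagged MF/BP/CC). This illustrates plain association-rule mining only; GO-WAR additionally weights terms (e.g. by information content), and the annotations and thresholds below are hypothetical.

```python
from itertools import combinations

# Hypothetical gene products annotated with terms from the three GO sub-ontologies.
annotations = {
    "geneA": {"MF:kinase", "BP:signaling", "CC:membrane"},
    "geneB": {"MF:kinase", "BP:signaling"},
    "geneC": {"MF:kinase", "BP:metabolism", "CC:cytoplasm"},
    "geneD": {"BP:signaling", "CC:membrane"},
}

def mine_rules(annotations, min_support=0.5, min_confidence=0.6):
    """Mine pairwise rules 'antecedent -> consequent' whose co-occurrence
    support and confidence exceed the given thresholds. Unweighted sketch of
    the AR idea; GO-WAR's weighting by term informativeness is not shown."""
    n = len(annotations)
    count = {}
    for terms in annotations.values():
        for t in terms:
            count[t] = count.get(t, 0) + 1
        for a, b in combinations(sorted(terms), 2):
            count[(a, b)] = count.get((a, b), 0) + 1
    rules = []
    for key, c in count.items():
        if isinstance(key, tuple) and c / n >= min_support:
            a, b = key
            for ante, cons in ((a, b), (b, a)):
                confidence = c / count[ante]
                if confidence >= min_confidence:
                    rules.append((ante, cons, c / n, confidence))
    return rules

for ante, cons, sup, conf in mine_rules(annotations):
    print(f"{ante} -> {cons}  support={sup:.2f} confidence={conf:.2f}")
```

Rules whose antecedent and consequent come from different sub-ontologies (e.g. a CC term implying a BP term) are exactly the cross-ontology rules the abstract targets.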

  4. Active learning reduces annotation time for clinical concept extraction.

    Science.gov (United States)

    Kholghi, Mahnoosh; Sitbon, Laurianne; Zuccon, Guido; Nguyen, Anthony

    2017-10-01

    To investigate: (1) the annotation time savings achieved by various active learning query strategies compared to supervised learning and a random sampling baseline, and (2) the benefits of active learning-assisted pre-annotations in accelerating the manual annotation process compared to de novo annotation. There are 73 and 120 discharge summary reports, provided by the Beth Israel institute, in the training and test sets of the concept extraction task in the i2b2/VA 2010 challenge, respectively. The 73 reports were used in user-study experiments for manual annotation. First, all sequences within the 73 reports were manually annotated from scratch. Next, active learning models were built to generate pre-annotations for the sequences selected by a query strategy. The annotation/reviewing time per sequence was recorded. The 120 test reports were used to measure the effectiveness of the active learning models. When annotating from scratch, active learning reduced the annotation time by up to 35% and 28% compared to a fully supervised approach and a random sampling baseline, respectively. Reviewing active learning-assisted pre-annotations resulted in a further 20% reduction of the annotation time compared to de novo annotation. The number of concepts that require manual annotation is a good indicator of the annotation time for various active learning approaches, as demonstrated by the high correlation between time rate and concept annotation rate. Active learning has a key role in reducing the time required to manually annotate domain concepts from clinical free text, either when annotating from scratch or when reviewing active learning-assisted pre-annotations. Copyright © 2017 Elsevier B.V. All rights reserved.
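The pool-based active-learning loop the study builds on can be sketched as: score the unlabeled pool with the current model, send the least-confident sequences for manual annotation (optionally pre-annotated by the model), retrain, and repeat. The model, its confidence scores and the pool below are toy stand-ins, not the paper's clinical concept-extraction setup.

```python
import random

rng = random.Random(0)
pool = [f"sequence_{i}" for i in range(20)]

def model_confidence(seq):
    """Stand-in for a trained model's confidence on a sequence (0..1).
    A real system would score model predictions for the sequence."""
    return rng.random()

def least_confident(candidates, batch_size=3):
    """Uncertainty sampling: query the lowest-confidence sequences first."""
    return sorted(candidates, key=model_confidence)[:batch_size]

annotated = []
for round_no in range(3):
    batch = least_confident([s for s in pool if s not in annotated])
    # In the real workflow, an annotator reviews model pre-annotations here,
    # which is where the reported reviewing-time savings come from.
    annotated.extend(batch)
print(len(annotated), "sequences manually annotated after 3 rounds")
```

The point the abstract quantifies is that this query order front-loads the informative sequences, so fewer concepts need manual annotation for the same model quality.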

  5. Pipeline heating method based on optimal control and state estimation

    Energy Technology Data Exchange (ETDEWEB)

    Vianna, F.L.V. [Dept. of Subsea Technology. Petrobras Research and Development Center - CENPES, Rio de Janeiro, RJ (Brazil)], e-mail: fvianna@petrobras.com.br; Orlande, H.R.B. [Dept. of Mechanical Engineering. POLI/COPPE, Federal University of Rio de Janeiro - UFRJ, Rio de Janeiro, RJ (Brazil)], e-mail: helcio@mecanica.ufrj.br; Dulikravich, G.S. [Dept. of Mechanical and Materials Engineering. Florida International University - FIU, Miami, FL (United States)], e-mail: dulikrav@fiu.edu

    2010-07-01

    In the production of oil and gas wells in deep waters, the flow of hydrocarbons through pipelines is a challenging problem. This environment presents high hydrostatic pressures and low seabed temperatures, which can favor the formation of solid deposits that, in critical operating conditions such as unplanned shutdowns, may result in a pipeline blockage and consequently incur large financial losses. There are different methods to protect the system, but nowadays thermal insulation and chemical injection are the standard solutions normally used. An alternative method of flow assurance is to heat the pipeline. This concept, known as an active heating system, aims at keeping the produced fluid temperature above a safe reference level in order to avoid the formation of solid deposits. The objective of this paper is to introduce a Bayesian statistical approach for the state estimation problem, in which the state variables are the transient temperatures within a pipeline cross-section, and to use optimal control theory as a design tool for a typical heating system during a simulated shutdown condition. An application example is presented to illustrate how Bayesian filters can be used to reconstruct the temperature field from temperature measurements supposedly available on the external surface of the pipeline. The temperatures predicted with the Bayesian filter are then utilized in a control approach for a heating system used to maintain the temperature within the pipeline above the critical temperature of formation of solid deposits. The physical problem consists of a pipeline cross-section represented by a circular domain with four points over the pipe wall representing heating cables. The fluid is considered stagnant, homogeneous, isotropic and with constant thermo-physical properties. The mathematical formulation governing the direct problem was solved with the finite volume method and for the solution of the state estimation problem
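The Bayesian state-estimation idea above can be shown in its simplest form: estimate an (assumed scalar) pipeline temperature during shutdown cool-down from noisy external-surface measurements with a linear Kalman filter. The paper estimates a spatial temperature field with a finite-volume direct model and more general Bayesian filters; the cooling rate, noise levels and synthetic measurements here are illustrative assumptions only.

```python
import random

def kalman_1d(measurements, a=0.95, q=0.01, r=4.0, x0=80.0, p0=25.0):
    """Scalar Kalman filter for the linear model
    x_k = a * x_{k-1} + w (process noise variance q),
    z_k = x_k + v (measurement noise variance r)."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # predict: exponential cooling toward ambient as the shutdown proceeds
        x, p = a * x, a * a * p + q
        # update with the noisy external-surface measurement
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

# Synthetic "truth" and noisy surface measurements (temperatures above ambient).
rng = random.Random(1)
true = [80.0]
for _ in range(49):
    true.append(0.95 * true[-1])
meas = [t + rng.gauss(0.0, 2.0) for t in true]
est = kalman_1d(meas)
print(round(true[-1], 2), round(est[-1], 2))
```

In the paper's control loop, this filtered estimate (rather than the raw noisy measurement) is what the heating controller compares against the critical deposit-formation temperature.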

  6. DOOp: DAOSPEC Output Optimizer pipeline

    Science.gov (United States)

    Cantat-Gaudin, Tristan; Donati, Paolo; Pancino, Elena; Bragaglia, Angela; Vallenari, Antonella; Friel, Eileen D.; Sordo, Rosanna; Jacobson, Heather R.; Magrini, Laura

    2017-09-01

    The DAOSPEC Output Optimizer pipeline (DOOp) performs efficient and convenient equivalent width measurements on batches of hundreds of spectra. It uses a series of BASH scripts to work as a wrapper for the FORTRAN code DAOSPEC (ascl:1011.002) and uses IRAF (ascl:9911.002) to automatically fix some of the parameters that are usually set by hand when using DAOSPEC. This allows batch-processing of quantities of spectra that would be impossible to deal with by hand. DOOp was originally built for the large quantity of UVES and GIRAFFE spectra produced by the Gaia-ESO Survey, but just like DAOSPEC, it can be used on any high-resolution and high signal-to-noise ratio spectrum binned on a linear wavelength scale.

  7. The Very Large Array Data Processing Pipeline

    Science.gov (United States)

    Kent, Brian R.; Masters, Joseph S.; Chandler, Claire J.; Davis, Lindsey E.; Kern, Jeffrey S.; Ott, Juergen; Schinzel, Frank K.; Medlin, Drew; Muders, Dirk; Williams, Stewart; Geers, Vincent C.; Momjian, Emmanuel; Butler, Bryan J.; Nakazato, Takeshi; Sugimoto, Kanako

    2018-01-01

    We present the VLA Pipeline, software that is part of the larger pipeline processing framework used by the Karl G. Jansky Very Large Array (VLA) and the Atacama Large Millimeter/sub-millimeter Array (ALMA) for both interferometric and single dish observations. Through a collection of base code jointly used by the VLA and ALMA, the pipeline builds a hierarchy of classes to execute individual atomic pipeline tasks within the Common Astronomy Software Applications (CASA) package. Each pipeline task contains heuristics designed by the team to actively decide the best processing path and execution parameters for calibration and imaging. The pipeline code is developed and written in Python and uses a "context" structure for tracking the heuristic decisions and processing results. The pipeline "weblog" acts as the user interface for verifying the quality assurance of each calibration and imaging stage. The majority of VLA scheduling blocks above 1 GHz are now processed with the standard continuum recipe of the pipeline and offer a calibrated measurement set as a basic data product to observatory users. In addition, the pipeline is used for processing data from the VLA Sky Survey (VLASS), a seven-year community-driven endeavor started in September 2017 to survey the entire sky down to a declination of -40 degrees at S-band (2-4 GHz). This 5500 hour next-generation large radio survey will explore the time and spectral domains, relying on pipeline processing to generate calibrated measurement sets, polarimetry, and imaging data products that are available to the astronomical community with no proprietary period. Here we present an overview of the pipeline design philosophy, heuristics, and calibration and imaging results produced by the pipeline.
Future development will include the testing of spectral line recipes, low signal-to-noise heuristics, and serving as a testing platform for science ready data products.The pipeline is developed as part of the CASA software package by an

  8. SOIL-PIPE INTERACTION OF FAULT CROSSING SEGMENTED BURIED DUCTILE IRON PIPELINES SUBJECTED TO DIP FAULTINGS

    Science.gov (United States)

    Erami, Mohammad Hossein; Miyajima, Masakatsu; Kaneko, Shougo

    This study investigates the necessity of considering different soil resistances against pipeline relative movement in the upward and downward directions. To this end, the results of FEM analyses are verified by experimental tests on a segmented ductile iron pipeline, 93 mm in diameter and 15 m in length, installed at a 60 cm depth from the ground surface in a moderately dense sand backfill condition. The fault movement, 35 cm in total, was applied in three equal steps of reverse faulting at an intersection angle of 60 degrees with the pipe. This study demonstrates how assuming the same soil resistance against both upward and downward relative movements of the pipeline, as suggested in the JGA guideline, results in imprecise FEM models.

  9. Estimating the Density of Fluid in a Pipeline System with an Electropump

    DEFF Research Database (Denmark)

    Sadeghi, H.; Poshtan, J.; Poulsen, Niels Kjølstad

    2018-01-01

    To transfer petroleum products, a common pipeline is often used to continuously transfer various products in batches. Separating the different products requires detecting the interface between the batches at the storage facilities or pump stations along the pipelines. The conventional technique...... to detect the product in the pipeline is to sample the fluid in a laboratory and perform an offline measurement of its physical characteristics. The measurement requires sophisticated laboratory equipment and can be time-consuming and susceptible to human error. In this paper, for performing the online...

  10. Comparing MapReduce and Pipeline Implementations for Counting Triangles

    Directory of Open Access Journals (Sweden)

    Edelmira Pasarella

    2017-01-01

    Full Text Available A common method to define a parallel solution for a computational problem consists in finding a way to use the Divide and Conquer paradigm so that processors act on their own data and are scheduled in a parallel fashion. MapReduce is a programming model that follows this paradigm, and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the input data and combining the results of each step to produce final results. Albeit used for the implementation of a wide variety of computational problems, MapReduce performance can be negatively affected whenever the replication factor grows or the size of the input is larger than the resources available at each processor. In this paper we show an alternative approach to implement the Divide and Conquer paradigm, named dynamic pipeline. The main features of dynamic pipelines are illustrated on a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting either when the input graph does not fit in memory or is dynamically generated. To evaluate the properties of dynamic pipelines, a dynamic pipeline of processes and an ad-hoc version of MapReduce are implemented in the language Go, exploiting its ability to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different topologies, sizes, and densities. Observed results suggest that dynamic pipelines allow for an efficient implementation of the problem of counting triangles in a graph, particularly in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation.
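    As a point of reference for the benchmark task above, here is a minimal sequential triangle counter over adjacency sets. It is a plain Python sketch of the counting problem itself, not the paper's Go dynamic-pipeline or MapReduce implementation.

```python
def count_triangles(edges):
    # Build adjacency sets for an undirected graph given as (u, v) edge pairs.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    count = 0
    # Each triangle {u, v, w} is counted exactly once by enforcing u < v < w.
    for u in adj:
        for v in adj[u]:
            if v > u:
                count += sum(1 for w in (adj[u] & adj[v]) if w > v)
    return count
```

    The set-intersection step is the part that MapReduce implementations shuffle across workers and that the dynamic-pipeline version streams through channels.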

  11. Current pipelines for neglected diseases.

    Science.gov (United States)

    di Procolo, Paolo; Jommi, Claudio

    2014-09-01

    This paper scrutinises pipelines for Neglected Diseases (NDs), through freely accessible and at-least-weekly updated trials databases. It updates to 2012 the data provided by recent publications, and integrates these analyses with information on the location of trial coordinators and patient recruitment status. Additionally, it provides (i) disease-specific information to better understand the rationale of investments in NDs, and (ii) yearly data, to understand the investment trends. The search identified 650 clinical studies. Leishmaniasis, Arbovirus infection, and Dengue are the top three diseases by number of clinical studies. Disease diffusion risk seems to be the most important driver of the clinical trials target choice, whereas the role played by disease prevalence and unmet need is controversial. The number of trials is stable between 2005 and 2010, with an increase in the last two years. Patient recruitment was completed for most studies (57.6%), and Phases II and III account for 35% and 28% of trials, respectively. The primary purpose of clinical investigations is prevention (49.3%), especially for infectious diseases with mosquitoes and sand flies as the vector, and treatment (43.2%), which is the primary target for parasitic diseases. Research centres and public organisations are the most important clinical study sponsors (58.9%), followed by the pharmaceutical industry (24.1%), and foundations and non-governmental organisations (9.3%). Many coordinator centres are located in less affluent countries (43.7%), whereas OECD countries and BRICS account for 34.7% and 17.5% of trials, respectively. Information was partially missing for some parameters. Notwithstanding, and despite its descriptive nature, this research has enhanced the evidence of the literature on pipelines for NDs. Future contributions may further investigate whether trials metrics are consistent with the characteristics of the interested countries and the explicative variables of trials location, target

  12. Current trend of annotating single nucleotide variation in humans--A case study on SNVrap.

    Science.gov (United States)

    Li, Mulin Jun; Wang, Junwen

    2015-06-01

    As high-throughput methods, such as whole genome genotyping arrays, whole exome sequencing (WES) and whole genome sequencing (WGS), have detected huge numbers of genetic variants associated with human diseases, functional annotation of these variants is an indispensable step in understanding disease etiology. Large-scale functional genomics projects, such as The ENCODE Project and the Roadmap Epigenomics Project, provide genome-wide profiling of functional elements across different human cell types and tissues. With the urgent demand for identification of disease-causal variants, a comprehensive and easy-to-use annotation tool is in high demand. Here we review and discuss current progress and trends in the variant annotation field. Furthermore, we introduce a comprehensive web portal for annotating human genetic variants. We use gene-based features and the latest functional genomics datasets to annotate single nucleotide variants (SNVs) in human, at whole genome scale. We further apply several function prediction algorithms to annotate SNVs that might affect different biological processes, including transcriptional gene regulation, alternative splicing, post-transcriptional regulation, translation and post-translational modifications. The SNVrap web portal is freely available at http://jjwanglab.org/snvrap. Copyright © 2014 Elsevier Inc. All rights reserved.

  13. Annotation of the Protein Coding Regions of the Equine Genome.

    Directory of Open Access Journals (Sweden)

    Matthew S Hestand

    Full Text Available Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced mRNA from a pool of forty-three different tissues. From these, we derived the structures of 68,594 transcripts. In addition, we identified 301,829 positions with SNPs or small indels within these transcripts relative to EquCab2. Interestingly, 780 variants extend the open reading frame of the transcript and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross-species transcriptional and genomic comparisons.

  14. Annotating the biomedical literature for the human variome.

    Science.gov (United States)

    Verspoor, Karin; Jimeno Yepes, Antonio; Cavedon, Lawrence; McIntosh, Tara; Herten-Crabb, Asha; Thomas, Zoë; Plazzer, John-Paul

    2013-01-01

    This article introduces the Variome Annotation Schema, a schema that aims to capture the core concepts and relations relevant to cataloguing and interpreting human genetic variation and its relationship to disease, as described in the published literature. The schema was inspired by the needs of the database curators of the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) database, but is intended to have application to genetic variation information in a range of diseases. The schema has been applied to a small corpus of full text journal publications on the subject of inherited colorectal cancer. We show that the inter-annotator agreement on annotation of this corpus ranges from 0.78 to 0.95 F-score across different entity types when exact matching is measured, and improves to a minimum F-score of 0.87 when boundary matching is relaxed. Relations show more variability in agreement, but several are reliable, with the highest, cohort-has-size, reaching 0.90 F-score. We also explore the relevance of the schema to the InSiGHT database curation process. The schema and the corpus represent an important new resource for the development of text mining solutions that address relationships among patient cohorts, disease and genetic variation, and therefore, we also discuss the role text mining might play in the curation of information related to the human variome. The corpus is available at http://opennicta.com/home/health/variome.
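    The exact-match agreement figures quoted above are span-level F-scores; the sketch below shows one common way such a score is computed, using made-up annotation spans for illustration (not the Variome corpus data or its exact evaluation code).

```python
def annotation_f_score(gold, predicted):
    # gold, predicted: iterables of (start, end, entity_type) spans.
    # Exact matching: a predicted span counts as a true positive only if
    # both its boundaries and its type agree with a gold span.
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)
```

    Relaxed boundary matching, which raised the minimum F-score to 0.87 in the paper, would replace the set intersection with an overlap test between spans of the same type.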

  15. Development of ecologically safe method for main oil and gas pipeline trenching

    Directory of Open Access Journals (Sweden)

    Akhmedov Asvar Mikdadovich

    2014-05-01

    Full Text Available Constructive, technical and technological reliability of a major pipeline ensures ecological safety at different stages of the life cycle - beginning with project preparation activities up to the end of major pipeline operation. Even in the transition into a new life cycle stage, whether the pipeline needs major repairs or reconstruction, technical and technological solutions should be found that preserve the ecological stability of the natural-anthropogenic system. The development of ecology protection technologies for the construction, reconstruction and major repairs of main pipelines is of great importance not only for a region, but also ensures ecological safety across the globe. The article presents a new way of trenching the main oil and gas pipeline, and of preserving and increasing ecological safety during its service. An updated technological plan is given in the paper for overhaul of the main oil and gas pipeline using the new technology of pipeline trenching. The suggested technical solution contributes to environment preservation with the help of deteriorating shells - the shells’ material decomposes into environment-friendly components: carbon dioxide, water and humus. The quantity of polluting agents in the atmosphere decreases with the decrease of the construction term and the quantity of technical equipment.

  16. Putative drug and vaccine target protein identification using comparative genomic analysis of KEGG annotated metabolic pathways of Mycoplasma hyopneumoniae.

    Science.gov (United States)

    Damte, Dereje; Suh, Joo-Won; Lee, Seung-Jin; Yohannes, Sileshi Belew; Hossain, Md Akil; Park, Seung-Chun

    2013-07-01

    In the present study, a computational comparative and subtractive genomic/proteomic analysis aimed at the identification of putative therapeutic target and vaccine candidate proteins from Kyoto Encyclopedia of Genes and Genomes (KEGG) annotated metabolic pathways of Mycoplasma hyopneumoniae was performed for drug design and vaccine production pipelines against M. hyopneumoniae. The employed comparative genomic and metabolic pathway analysis, with a predefined computational systemic workflow, extracted a total of 41 annotated metabolic pathways from KEGG, among which five were unique to M. hyopneumoniae. A total of 234 proteins were identified to be involved in these metabolic pathways. Although 125 non-homologous and predicted essential proteins were found from the total that could serve as potential drug targets and vaccine candidates, additional prioritizing parameters characterized 21 proteins as vaccine candidates, while the druggability of each of the identified proteins, evaluated by the DrugBank database, prioritized 42 proteins as suitable drug targets. Copyright © 2013 Elsevier Inc. All rights reserved.
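    The subtractive step described above, filtering for proteins that lack host homologs and are predicted essential, reduces to simple set operations once the homology and essentiality calls exist. The sketch below uses toy placeholder inputs; a real pipeline would derive them from BLAST searches and essentiality predictions, as in the paper.

```python
def prioritize_targets(pathogen_proteins, host_homolog_ids, essential_ids):
    # Basic subtractive-genomics filter: keep pathogen proteins that have no
    # host homolog (so drugs are less likely to hit the host) and that are
    # predicted essential (so inhibiting them should impair the pathogen).
    return {pid: ann for pid, ann in pathogen_proteins.items()
            if pid not in host_homolog_ids and pid in essential_ids}
```

    Further prioritization (vaccine-candidate surface exposure, druggability against DrugBank) would be additional filters applied to this reduced set.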

  17. Pipe locator for imaging underground pipelines (abstract)

    Science.gov (United States)

    Miyamoto, Y.; Wasa, Y.; Mori, K.; Kondo, Y.

    1988-11-01

    Recently, it has become more important to locate complex piping patterns such as tees, bends, risers, and others with high accuracy for the maintenance and protection of city gas pipelines. Hence, we have developed a new pipe locator system for imaging complex underground pipelines using magnetic remote sensing techniques. The main framework of this development is the application of pattern recognition of the magnetic field distribution to the location of buried pipelines in urban areas. The first step for imaging the complex pipelines is to measure, with high accuracy, the three-dimensional magnetic field distribution generated by the passage of an alternating signal current through the buried pipeline. For this purpose we developed a portable trolley unit capable of scanning the ground to collect data, ten three-axis coil sensors with a sensitivity of 1 μG aligned in the unit, and a filter system using an FFT signal processor which eliminates urban magnetic noise as high as 10 mG in some cases. The second step is to process the magnetic field distribution data, to extract the features of the underground pipeline using the contour diagram and the three-dimensional drawing of the magnetic field, and to identify the complex piping patterns. Further, we found that a nonlinear least-squares algorithm for calculating the pipeline's position was useful to improve the location accuracy.
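    The least-squares position step can be illustrated with a toy inversion: for an infinite line current, the field magnitude at horizontal offset x from a pipe at position x0 and depth d falls off as B(x) = C / sqrt((x - x0)^2 + d^2). The grid search below is an illustrative substitute for the paper's (unspecified) nonlinear least-squares algorithm.

```python
import numpy as np

def locate_pipe(xs, b_meas, x_range, d_range):
    # Least-squares fit of pipe position x0 and depth d to field magnitudes
    # b_meas measured at surface positions xs. For each candidate (x0, d)
    # the optimal amplitude C has a closed form, so a coarse grid search
    # over position and depth suffices for this sketch.
    xs, b = np.asarray(xs, float), np.asarray(b_meas, float)
    best = (np.inf, None, None)
    for x0 in x_range:
        for d in d_range:
            inv_r = 1.0 / np.hypot(xs - x0, d)
            c = (b @ inv_r) / (inv_r @ inv_r)   # least-squares amplitude
            resid = np.sum((b - c * inv_r) ** 2)
            if resid < best[0]:
                best = (resid, x0, d)
    return best[1], best[2]
```

    In practice a gradient-based solver would refine the grid-search result, and the model would be extended for tees, bends and risers.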

  18. Gastrointestinal hormone research - with a Scandinavian annotation.

    Science.gov (United States)

    Rehfeld, Jens F

    2015-06-01

    Gastrointestinal hormones are peptides released from neuroendocrine cells in the digestive tract. More than 30 hormone genes are currently known to be expressed in the gut, which makes it the largest hormone-producing organ in the body. Modern biology makes it feasible to conceive the hormones under five headings: The structural homology groups a majority of the hormones into nine families, each of which is assumed to originate from one ancestral gene. The individual hormone gene often has multiple phenotypes due to alternative splicing, tandem organization or differentiated posttranslational maturation of the prohormone. By a combination of these mechanisms, more than 100 different hormonally active peptides are released from the gut. Gut hormone genes are also widely expressed outside the gut, some only in extraintestinal endocrine cells and cerebral or peripheral neurons but others also in other cell types. The extraintestinal cells may release different bioactive fragments of the same prohormone due to cell-specific processing pathways. Moreover, endocrine cells, neurons, cancer cells and, for instance, spermatozoa secrete gut peptides in different ways, so the same peptide may act as a blood-borne hormone, a neurotransmitter, a local growth factor or a fertility factor. The targets of gastrointestinal hormones are specific G-protein-coupled receptors that are expressed in the cell membranes also outside the digestive tract. Thus, gut hormones not only regulate digestive functions, but also constitute regulatory systems operating in the whole organism. This overview of gut hormone biology is supplemented with an annotation on some Scandinavian contributions to gastrointestinal hormone research.

  19. Motion lecture annotation system to learn Naginata performances

    Science.gov (United States)

    Kobayashi, Daisuke; Sakamoto, Ryota; Nomura, Yoshihiko

    2013-12-01

    This paper describes a learning assistant system using motion capture data and annotation to teach "Naginata-jutsu" (a skill to practice Japanese halberd) performance. There are some video annotation tools such as YouTube. However these video based tools have only single angle of view. Our approach that uses motion-captured data allows us to view any angle. A lecturer can write annotations related to parts of body. We have made a comparison of effectiveness between the annotation tool of YouTube and the proposed system. The experimental result showed that our system triggered more annotations than the annotation tool of YouTube.

  20. Integrating sustainable growth into export pipeline projects

    International Nuclear Information System (INIS)

    Jeniffer, Barringer; William, Lukens; Patricia, Wild

    2002-01-01

    Full text: Sustainable growth in the energy industry is rapidly expanding beyond the conceptual stage. Policies addressing the three principles of Sustainable Development are being established, and strategies to execute these policies are being developed and implemented in the field. Conoco is developing a strong corporate culture around sustainable growth, and pipeline systems play a vital role in delivering triple bottom line results for our stakeholders. This paper will highlight some of the key focal points of Conoco Inc. in each phase of pipeline project development, execution, and operation that make pipeline projects a contributor to Conoco's sustainable growth success, and will share some lessons learned

  1. ARTIP: Automated Radio Telescope Image Processing Pipeline

    Science.gov (United States)

    Sharma, Ravi; Gyanchandani, Dolly; Kulkarni, Sarang; Gupta, Neeraj; Pathak, Vineet; Pande, Arti; Joshi, Unmesh

    2018-02-01

    The Automated Radio Telescope Image Processing Pipeline (ARTIP) automates the entire process of flagging, calibrating, and imaging for radio-interferometric data. ARTIP starts with raw data, i.e. a measurement set and goes through multiple stages, such as flux calibration, bandpass calibration, phase calibration, and imaging to generate continuum and spectral line images. Each stage can also be run independently. The pipeline provides continuous feedback to the user through various messages, charts and logs. It is written using standard python libraries and the CASA package. The pipeline can deal with datasets with multiple spectral windows and also multiple target sources which may have arbitrary combinations of flux/bandpass/phase calibrators.

  2. Prospects for coal slurry pipelines in California

    Science.gov (United States)

    Lynch, J. F.

    1978-01-01

    The coal slurry pipeline segment of the transport industry is emerging in the United States. If accepted, it will play a vital role in meeting America's urgent energy requirements without public subsidy, tax relief, or federal grants. It is proven technology, ideally suited for transporting an abundant energy resource over thousands of miles to energy-short industrial centers at more than competitive costs. Briefly discussed are the following: (1) history of pipelines; (2) California market potential; (3) slurry technology; (4) environmental benefits; (5) market competition; and (6) a proposed pipeline.

  3. Optimal hub location in pipeline networks

    Energy Technology Data Exchange (ETDEWEB)

    Dott, D.R.; Wirasinghe, S.C.; Chakma, A. [Univ. of Calgary, Alberta (Canada)

    1996-12-31

    This paper discusses optimization strategies and techniques for the location of natural gas marketing hubs in the North American gas pipeline network. A hub is a facility at which inbound and outbound network links meet and freight is redirected towards its destination. Common examples of hubs used in the gas pipeline industry include gas plants, interconnects and market centers. Characteristics of the gas pipeline industry which are relevant to the optimization of transportation costs using hubs are presented. Allocation techniques for solving location-allocation problems are discussed. An outline of the research in progress by the authors in the field of optimal gas hub location concludes the paper.
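    The cost structure behind such hub-location problems can be shown with a toy single-hub model: route every shipment through one hub and pick the hub minimizing total flow-weighted transport cost. This is an illustrative simplification, not the allocation techniques the paper surveys.

```python
def best_hub(candidates, flows, cost):
    # candidates: candidate hub ids; flows: dict (origin, destination) -> volume;
    # cost: dict (a, b) -> per-unit transport cost, with cost[(x, x)] == 0.
    # A toy single-hub median: every shipment travels origin -> hub -> destination.
    def total(h):
        return sum(v * (cost[(o, h)] + cost[(h, d)])
                   for (o, d), v in flows.items())
    return min(candidates, key=total)
```

    Realistic formulations allow multiple hubs and partial allocation, which turns this enumeration into a location-allocation problem of the kind discussed in the abstract.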

  4. New in protein structure and function annotation: hotspots, single nucleotide polymorphisms and the 'Deep Web'.

    Science.gov (United States)

    Bromberg, Yana; Yachdav, Guy; Ofran, Yanay; Schneider, Reinhard; Rost, Burkhard

    2009-05-01

    The rapidly increasing quantity of protein sequence data continues to widen the gap between available sequences and annotations. Comparative modeling suggests some aspects of the 3D structures of approximately half of all known proteins; homology- and network-based inferences annotate some aspect of function for a similar fraction of the proteome. For most known protein sequences, however, there is detailed knowledge about neither their function nor their structure. Comprehensive efforts towards the expert curation of sequence annotations have failed to meet the demand of the rapidly increasing number of available sequences. Only the automated prediction of protein function in the absence of homology can close the gap between available sequences and annotations in the foreseeable future. This review focuses on two novel methods for automated annotation, and briefly presents an outlook on how modern web software may revolutionize the field of protein sequence annotation. First, predictions of protein binding sites and functional hotspots, and the evolution of these into the most successful type of prediction of protein function from sequence will be discussed. Second, a new tool, comprehensive in silico mutagenesis, which contributes important novel predictions of function and at the same time prepares for the onset of the next sequencing revolution, will be described. While these two new sub-fields of protein prediction represent the breakthroughs that have been achieved methodologically, it will then be argued that a different development might further change the way biomedical researchers benefit from annotations: modern web software can connect the worldwide web in any browser with the 'Deep Web' (ie, proprietary data resources). The availability of this direct connection, and the resulting access to a wealth of data, may impact drug discovery and development more than any existing method that contributes to protein annotation.

  5. An Annotated Dataset of 14 Meat Images

    DEFF Research Database (Denmark)

    Stegmann, Mikkel Bille

    2002-01-01

    This note describes a dataset consisting of 14 annotated images of meat. Points of correspondence are placed on each image. As such, the dataset can be readily used for building statistical models of shape. Further, format specifications and terms of use are given.

  6. Software for computing and annotating genomic ranges.

    Science.gov (United States)

    Lawrence, Michael; Huber, Wolfgang; Pagès, Hervé; Aboyoun, Patrick; Carlson, Marc; Gentleman, Robert; Morgan, Martin T; Carey, Vincent J

    2013-01-01

    We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
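    The core overlap query that such range infrastructure provides can be sketched in a few lines. This is an illustrative stand-alone Python version, not the IRanges/GenomicRanges implementation, which uses more sophisticated data structures on the R side.

```python
import bisect

def find_overlaps(query, subject):
    # query, subject: lists of (start, end) half-open ranges, as in a
    # GRanges-style overlap query. Sort subjects once by start, then
    # binary-search each query; returns (query_index, subject_index) pairs.
    order = sorted(range(len(subject)), key=lambda i: subject[i][0])
    starts = [subject[i][0] for i in order]
    hits = []
    for qi, (qs, qe) in enumerate(query):
        # Only subjects starting before the query end can overlap it.
        hi = bisect.bisect_left(starts, qe)
        for k in range(hi):
            si = order[k]
            if subject[si][1] > qs:        # overlap test for half-open ranges
                hits.append((qi, si))
    return hits
```

    Coverage vectors and nearest-neighbor queries mentioned in the abstract are variations on the same sorted-interval machinery.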

  7. Software for computing and annotating genomic ranges.

    Directory of Open Access Journals (Sweden)

    Michael Lawrence

    Full Text Available We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

  8. Novel leak localization in pressurized pipeline networks using acoustic emission and geometric connectivity

    International Nuclear Information System (INIS)

    Ozevin, Didem; Harding, James

    2012-01-01

    Time dependent aging and instantaneous threats can cause the initiation of damage in buried and on-ground pipelines. Damage may propagate through the structural thickness and cause leaking. Detecting leakage in oil, water, gas or steam pipeline networks before the network becomes structurally unstable is important to prevent catastrophic failures. A leak in a pressurized pipeline causes turbulent flow at its location, which generates solid particles or gas bubbles impacting on the pipeline material. The impact energy causes propagating elastic waves that can be detected by sensors mounted on the pipeline. The method is called Acoustic Emission, which can be used for real time detection of damage caused by unintentional or intentional sources in pipeline networks. In this paper, a new leak localization approach is proposed for pipeline networks spread in a two dimensional configuration. The approach is to determine arrival time differences using the cross correlation function, and to introduce geometric connectivity in order to identify the path that the leak waves should propagate along to reach the AE sensors. The leak location in multi-dimensional space is identified in an effective approach using an array of sensors spread on the pipeline network. The approach is successfully demonstrated on laboratory scale polypropylene pipeline networks. - Highlights: ► Leak is identified in 2D using the 1D algorithm and geometric connectivity. ► The methodology is applicable if the source to sensor path is not straight. ► The hit sequence based on average signal level improves the source location. ► The leak localization in viscoelastic materials is high due to attenuation.
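    The arrival-time-difference step above can be sketched with NumPy's cross-correlation; the sampling rate and signals in the test are synthetic stand-ins for acoustic emission waveforms, not the paper's laboratory data.

```python
import numpy as np

def arrival_time_difference(sig_a, sig_b, fs):
    # Delay estimate (in seconds) from the peak of the full cross-correlation
    # of two sensor signals sampled at fs Hz. A positive result means sig_a
    # arrived later than sig_b, i.e. sensor A is farther from the leak.
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag = int(np.argmax(corr)) - (len(sig_b) - 1)
    return lag / fs
```

    Combined with the wave speed and the geometric connectivity of the network, such pairwise delays constrain the leak position along the admissible propagation paths.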

  9. An Updated Functional Annotation of Protein-Coding Genes in the Cucumber Genome

    Directory of Open Access Journals (Sweden)

    Hongtao Song

    2018-03-01

    Full Text Available Background: Although the cucumber reference genome and its annotation were published several years ago, the functional annotation of predicted genes, particularly protein-coding genes, still requires further improvement. In general, accurately determining orthologous relationships between genes allows for better and more robust functional assignments of predicted genes. As one of the most reliable strategies, the determination of collinearity information may facilitate reliable orthology inferences among genes from multiple related genomes. Currently, the identification of collinear segments has mainly been based on conservation of gene order and orientation. Over the course of plant genome evolution, various evolutionary events have disrupted or distorted the order of genes along chromosomes, making it difficult to use those genes as genome-wide markers for plant genome comparisons. Results: Using the localized LASTZ/MULTIZ analysis pipeline, we aligned 15 genomes, including cucumber and other related angiosperm plants, and identified a set of genomic segments that are short in length, stable in structure, uniform in distribution and highly conserved across all 15 plants. Compared with protein-coding genes, these conserved segments were more suitable for use as genomic markers for detecting collinear segments among distantly divergent plants. Guided by this set of identified collinear genomic segments, we inferred 94,486 orthologous protein-coding gene pairs (OPPs) between cucumber and 14 other angiosperm species, which were used as proxies for transferring functional terms to cucumber genes from the annotations of the other 14 genomes. In total, 10,885 protein-coding genes were assigned Gene Ontology (GO) terms, which was nearly 1,300 more than the results collected in the Uniprot proteomic database. Our results showed that annotation accuracy would be improved compared with other existing approaches. Conclusions: In this study, we provided an
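    At its simplest, the transfer of functional terms via orthologous pairs reduces to a union over each gene's orthologs. The sketch below uses hypothetical gene identifiers and a toy GO mapping; the paper's pipeline additionally derives the pairs from collinear genomic segments.

```python
def transfer_go_terms(ortholog_pairs, source_go):
    # ortholog_pairs: (target_gene, source_gene) pairs, e.g. inferred from
    # collinear segments; source_go: dict source_gene -> set of GO term ids.
    # Each target gene receives the union of its orthologs' GO terms.
    annotation = {}
    for target, source in ortholog_pairs:
        annotation.setdefault(target, set()).update(source_go.get(source, set()))
    return annotation
```

    Real pipelines would also record evidence codes and filter weakly supported transfers rather than taking a plain union.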

  10. CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines.

    Science.gov (United States)

    Soysal, Ergin; Wang, Jingqi; Jiang, Min; Wu, Yonghui; Pakhomov, Serguei; Liu, Hongfang; Xu, Hua

    2017-11-24

    Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community.

  11. Data Pipeline Development for Grain Boundary Structures Classification

    OpenAIRE

    Li, Bingxi

    2017-01-01

    Grain boundaries govern many properties of polycrystalline materials, including the vast majority of engineering materials. Evolutionary algorithms can be applied to predict grain boundary structures in different systems. However, recognizing and classifying thousands of predicted structures by visual inspection is very challenging in terms of efficiency and accuracy. A data pipeline is developed to accelerate the classification and recognition of grain boundary structures pr...

  12. Knowledge Pipeline: A Task Oriented Way to Implement Knowledge Management

    International Nuclear Information System (INIS)

    Pan Jiajie

    2014-01-01

    Concept of knowledge pipeline: There are many pipelines, named after tasks or business processes, in an organization. Knowledge contributors put knowledge into the corresponding pipelines. A maintenance team keeps the knowledge in the pipelines clear and valid. Users can then draw knowledge according to their tasks or business processes, just like opening a faucet

  13. 49 CFR 192.513 - Test requirements for plastic pipelines.

    Science.gov (United States)

    2010-10-01

    ... 49 Transportation 3 2010-10-01 2010-10-01 false Test requirements for plastic pipelines. 192.513 Section 192.513 Transportation Other Regulations Relating to Transportation (Continued) PIPELINE AND... Test requirements for plastic pipelines. (a) Each segment of a plastic pipeline must be tested in...

  14. Ranking Biomedical Annotations with Annotator’s Semantic Relevancy

    Directory of Open Access Journals (Sweden)

    Aihua Wu

    2014-01-01

    Full Text Available Biomedical annotation is a common and effective artifact for researchers to discuss, express opinions, and share discoveries. It has become increasingly popular in many online research communities and carries much useful information. Ranking biomedical annotations is a critical problem for data users seeking information efficiently. As the annotator’s knowledge about the annotated entity normally determines the quality of the annotations, we evaluate that knowledge, that is, the semantic relationship between annotator and entity, in two ways. The first is extracting relational information from credible websites by mining association rules between an annotator and a biomedical entity. The second is frequent pattern mining from historical annotations, which reveals common features of biomedical entities that an annotator can annotate with high quality. We propose a weighted and concept-extended RDF model to represent an annotator, a biomedical entity, and their background attributes, and merge information from the two ways as the context of an annotator. Based on that, we present a method to rank the annotations by evaluating their correctness according to users’ votes and the semantic relevancy between the annotator and the annotated entity. The experimental results show that the approach is applicable and efficient even when the data set is large.
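The ranking idea of combining vote-based correctness with annotator-entity relevancy can be sketched minimally. The linear weighting, the field names, and the example scores below are assumptions for illustration, not the paper's actual model.

```python
# Minimal sketch: rank annotations by a weighted blend of normalized user
# votes and a precomputed annotator-entity semantic relevancy score.
# Weights and score fields are illustrative assumptions.

annotations = [
    {"id": "a1", "votes": 12, "relevancy": 0.9},
    {"id": "a2", "votes": 30, "relevancy": 0.2},
    {"id": "a3", "votes": 5,  "relevancy": 0.95},
]

def rank(annotations, w_votes=0.5, w_rel=0.5):
    max_votes = max(a["votes"] for a in annotations) or 1
    def score(a):
        # Normalize votes to [0, 1] so both signals share one scale.
        return w_votes * a["votes"] / max_votes + w_rel * a["relevancy"]
    return sorted(annotations, key=score, reverse=True)

ordered = [a["id"] for a in rank(annotations)]
```

Here "a1" wins despite middling votes because its annotator's relevancy to the entity is high, which is the behavior the approach above is after.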

  15. Qcorp: an annotated classification corpus of Chinese health questions.

    Science.gov (United States)

    Guo, Haihong; Na, Xu; Li, Jiao

    2018-03-22

    Health question-answering (QA) systems have become a typical application scenario of Artificial Intelligence (AI). An annotated question corpus is a prerequisite for training machines to understand the health information needs of users. Thus, we aimed to develop an annotated classification corpus of Chinese health questions (Qcorp) and make it openly accessible. We developed a two-layered classification schema and corresponding annotation rules on the basis of our previous work. Using the schema, we annotated 5000 questions that were randomly selected from 5 Chinese health websites within 6 broad sections. Eight annotators participated in the annotation task, and the inter-annotator agreement was evaluated to ensure the corpus quality. Furthermore, the distribution and relationships of the annotated tags were measured by descriptive statistics and a social network map. The questions were annotated using 7101 tags that cover 29 topic categories in the two-layered schema. In our released corpus, the distribution of questions over the top-layered categories was: treatment, 64.22%; diagnosis, 37.14%; epidemiology, 14.96%; healthy lifestyle, 10.38%; and health provider choice, 4.54%. Both the annotated health questions and the annotation schema are openly accessible on the Qcorp website. Users can download the annotated Chinese questions in CSV, XML, and HTML formats. We developed a Chinese health question corpus including 5000 manually annotated questions. It is openly accessible and will contribute to intelligent health QA system development.

  16. BIOCAT: a pattern recognition platform for customizable biological image classification and annotation.

    Science.gov (United States)

    Zhou, Jie; Lamichhane, Santosh; Sterne, Gabriella; Ye, Bing; Peng, Hanchuan

    2013-10-04

    Pattern recognition algorithms are useful in bioimage informatics applications such as quantifying cellular and subcellular objects, annotating gene expressions, and classifying phenotypes. To provide effective and efficient image classification and annotation for the ever-increasing volume of microscopic images, it is desirable to have tools that can combine and compare various algorithms and build customizable solutions for different biological problems. However, current tools often offer only limited support for generating user-friendly and extensible tools for annotating higher-dimensional images that correspond to multiple complicated categories. We developed the BIOimage Classification and Annotation Tool (BIOCAT). It is able to apply pattern recognition algorithms to two- and three-dimensional biological image sets, as well as regions of interest (ROIs) in individual images, for automatic classification and annotation. We also propose a 3D anisotropic wavelet feature extractor for extracting textural features from 3D images with xy-z resolution disparity. The extractor is one of roughly 20 built-in feature extraction, selection, and classification algorithms in BIOCAT. The algorithms are modularized so that they can be "chained" in a customizable way to form adaptive solutions for various problems, and the plugin-based extensibility gives the tool an open architecture for incorporating future algorithms. We have applied BIOCAT to classification and annotation of images and ROIs of different properties, with applications in cell biology and neuroscience. BIOCAT provides a user-friendly, portable platform for pattern-recognition-based biological image classification of two- and three-dimensional images and ROIs. We show, via diverse case studies, that different algorithms and their combinations have different suitability for various problems. The customizability of BIOCAT is thus expected to be useful for providing effective and efficient solutions for a variety of biological

  17. Report of study group 4.1 ''pipeline ageing and rehabilitation''

    Energy Technology Data Exchange (ETDEWEB)

    Serena, L.

    2000-07-01

    This report describes the work on the subject 'pipeline ageing and rehabilitation' carried out by Study Group 4.1 during the triennium 1997 - 2000. The report is focused on ageing and rehabilitation of natural gas transmission pipelines, and in more detail on the following topics: - Definition of pipeline ageing; - Different ageing elements; - Main causes of ageing; - Inspections and monitoring; - Repair methods on ageing pipelines; - Programmes and strategies for pipeline maintenance and rehabilitation. The report includes the state of the art of the different techniques used to assess pipeline ageing, such as pig inspection and landslide-area monitoring, as well as advanced monitoring methods used nowadays by pipeline operators; a clarification of the concepts for different maintenance approaches is also presented. In addition, the report gives some information regarding repair methods in use, the methodologies to evaluate defects, and the philosophy on which each repair system is based. The remaining topics deal with strategies for pipeline and coating rehabilitation, focus attention on the economic and technical considerations beyond the ageing concept, and describe in detail the main causes of ageing as indicated by operators. A questionnaire on these topics was distributed and the results obtained are included in this report. (author)

  18. Impact of Pipeline Construction on Air Environment

    Science.gov (United States)

    Tomareva, I. A.; Kozlovtseva, E. Yu; Perfilov, V. A.

    2017-11-01

    The article investigates the causes of environmental imbalance in the construction of pipelines. On the basis of the generalized data, a block diagram of a comprehensive model for reducing the negative environmental impact of pipeline construction is presented. It allowed us to identify the parameters of the subgoals describing the characteristics of measures to ensure environmental balance, as well as the parameters related to risk factors. The analysis of hazards made it possible to determine their sources and the probability of occurrence of a negative effect, and to create a tree of causes of atmospheric pollution due to pipeline construction. The research pays special attention to the impact of inorganic and abrasive particulate dust on the air during pipeline construction.

  19. Citizenship program in near communities of pipelines

    Energy Technology Data Exchange (ETDEWEB)

    Mascarenhas, Carina R.; Vilas Boas, Ianne P. [TELSAN Engenharia, Belo Horizonte, MG (Brazil); Bourscheid, Pitagoras [PETROBRAS S.A., Rio de Janeiro, RJ (Brazil)

    2009-12-19

    During the construction of a pipeline, the IENE - Engineering Unit of PETROBRAS, responsible for the construction and erection of pipelines and related plants in northeastern Brazil - crossed more than 7 states and 250 counties and implemented a social responsibility program, in particular a citizenship program. This action resulted from studies of communities located near the pipelines' AID - Direct Influence Area (438 yards to the right and left of the pipeline) - and from evidence that those communities were poor, lacking personal documents and a citizen's standing in society. This paper intends to share IENE's experience with its citizenship program, which worked along three main lines: community mobilization; citizenship qualification; and a citizenship board. The last makes it possible for people to obtain their personal documents and exercise their citizenship to the full. (author)

  20. Regular pipeline maintenance of gas pipeline using technical operational diagnostics methods

    Energy Technology Data Exchange (ETDEWEB)

    Volentic, J. [Gas Transportation Department, Slovensky plynarensky priemysel, Slovak Gas Industry, Bratislava (Slovakia)

    1997-12-31

    Slovensky plynarensky priemysel (SPP) operated 17 487 km of gas pipelines in 1995. The length of the long-line pipelines reached 5 191 km; the distribution network was 12 296 km. The international transit system of long-line gas pipelines comprised 1 939 km of pipelines of various dimensions. The described scale of the transport and distribution system represents a multibillion investment stored in the ground, exposed to environmental influences and to pipeline operational stresses. In spite of all the technical and maintenance arrangements performed on operating gas pipelines, gradual ageing takes place anyway, expressed in degradation processes both in the steel tube and in the anti-corrosion coating. Within a certain time horizon, a consistent and regular application of methods and means of in-service technical diagnostics and rehabilitation of existing pipeline systems makes it possible to save substantial investment funds, postponing the need for funds for a complete or partial reconstruction or new construction of a specific gas section. The purpose of this presentation is to report on the implementation of the programme of in-service technical diagnostics of gas pipelines within the framework of regular maintenance of SPP s.p. Bratislava high pressure gas pipelines. (orig.) 6 refs.

  1. Pipelines in Louisiana, Geographic NAD83, USGS (1999) [pipelines_la_usgs_1999

    Data.gov (United States)

    Louisiana Geographic Information Center — This dataset contains vector line map information of various pipelines throughout the State of Louisiana. The vector data contain selected base categories of...

  2. Optimal processor assignment for pipeline computations

    Science.gov (United States)

    Nicol, David M.; Simha, Rahul; Choudhury, Alok N.; Narahari, Bhagirath

    1991-01-01

    The availability of large-scale multitasked parallel architectures introduces the following processor assignment problem for pipelined computations. Given a set of tasks and their precedence constraints, along with their experimentally determined individual response times for different processor sizes, find an assignment of processors to tasks. Two objectives are of interest: minimal response time given a throughput requirement, and maximal throughput given a response time requirement. These assignment problems differ considerably from the classical mapping problem, in which several tasks share a processor; instead, it is assumed that a large number of processors are to be assigned to a relatively small number of tasks. Efficient assignment algorithms were developed for different classes of task structures. For a p-processor system and a series-parallel precedence graph with n constituent tasks, an O(np^2) algorithm is provided that finds the optimal assignment for the response time optimization problem; the assignment optimizing the constrained throughput can be found in O(np^2 log p) time. Special cases of linear, independent, and tree graphs are also considered.
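The flavor of such an assignment algorithm can be shown with a small dynamic program. This sketch handles only the simplest case, a chain of tasks with measured response times per processor count, minimizing total response time in O(n p^2); the paper's algorithms additionally handle series-parallel graphs and throughput constraints. The data and function name are invented for illustration.

```python
# Hedged sketch: assign p processors across a chain of pipeline tasks to
# minimize the summed response time, given measured times resp[i][k].
# Assumes every task receives at least one processor (so n <= p).

def optimal_assignment(resp, p):
    """resp[i][k] = response time of task i on k+1 processors (k = 0..p-1)."""
    n = len(resp)
    INF = float("inf")
    # best[j] = minimal total response time using exactly j processors so far
    best = [0.0] + [INF] * p
    for i in range(n):
        new = [INF] * (p + 1)
        for used in range(p + 1):
            if best[used] == INF:
                continue
            for k in range(1, p - used + 1):  # give k processors to task i
                cand = best[used] + resp[i][k - 1]
                if cand < new[used + k]:
                    new[used + k] = cand
        best = new
    return min(x for x in best if x < INF)

# Two tasks, three processors: either split (1,2) or (2,1), both cost 14.
answer = optimal_assignment([[10, 6, 5], [8, 4, 3]], 3)
```

The triple loop is the O(n p^2) bound the abstract mentions: n tasks, and for each task at most p * p (used, k) combinations.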

  3. Offshore pipeline influence on Middle East religion

    Directory of Open Access Journals (Sweden)

    Delistoian Dmitri

    2017-12-01

    Full Text Available The Middle East is responsible for nearly 30% of world oil production. Oil export is possible only using oil tankers and pipelines. Syria has the most developed pipeline network. At the same time, Syria is a gateway to the European oil market. Bearing in mind the decline in oil prices, the fight over oil markets becomes more pronounced. The consequences of this struggle are growing religious tensions between Shiites and Sunnis. Syria has become a global battlefield for influence over Arabian oil.

  4. On-the-fly pipeline parallelism

    OpenAIRE

    Lee, I-Ting Angelina; Leiserson, Charles E.; Sukha, Jim; Zhang, Zhunping; Schardl, Tao Benjamin

    2013-01-01

    Pipeline parallelism organizes a parallel program as a linear sequence of s stages. Each stage processes elements of a data stream, passing each processed data element to the next stage, and then taking on a new element before the subsequent stages have necessarily completed their processing. Pipeline parallelism is used especially in streaming applications that perform video, audio, and digital signal processing. Three out of 13 benchmarks in PARSEC, a popular software benchmark suite design...
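The stage structure described above can be sketched with threads and FIFO queues: each stage consumes from its inbox and hands results to the next stage while continuing to accept new elements. This is a minimal illustration of the pattern, not the on-the-fly scheduling contribution of the paper; the stage functions are invented.

```python
# Minimal pipeline parallelism: a linear sequence of stages connected by
# queues, each stage running in its own thread. Order is preserved because
# each stage is a single consumer draining a FIFO queue.
import queue
import threading

SENTINEL = object()  # marks end of the data stream

def stage(func, inbox, outbox):
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown downstream
            return
        outbox.put(func(item))

def run_pipeline(items, funcs):
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(SENTINEL)
    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results

out = run_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3])
```

While stage 3 is finishing element 1, stage 1 can already be working on element 3, which is exactly the overlap that makes pipeline parallelism pay off for streaming workloads.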

  5. 76 FR 75894 - Information Collection Activities: Pipelines and Pipeline Rights-of-Way; Submitted for Office of...

    Science.gov (United States)

    2011-12-05

    ... repair report 3 1008(f) Submit report of pipeline failure analysis...... 30 1008(g) Submit plan of.... Sec. 250.1000(b)--Pipeline Application Modification (ROW)--$3,865. Sec. 250.1008(e)--Pipeline Repair... Bureau of Safety and Environmental Enforcement (BSEE) Information Collection Activities: Pipelines and...

  6. Computer systems for annotation of single molecule fragments

    Science.gov (United States)

    Schwartz, David Charles; Severin, Jessica

    2016-07-19

    There are provided computer systems for visualizing and annotating single molecule images. Annotation systems in accordance with this disclosure allow a user to mark and annotate single molecules of interest and their restriction enzyme cut sites thereby determining the restriction fragments of single nucleic acid molecules. The markings and annotations may be automatically generated by the system in certain embodiments and they may be overlaid translucently onto the single molecule images. An image caching system may be implemented in the computer annotation systems to reduce image processing time. The annotation systems include one or more connectors connecting to one or more databases capable of storing single molecule data as well as other biomedical data. Such diverse array of data can be retrieved and used to validate the markings and annotations. The annotation systems may be implemented and deployed over a computer network. They may be ergonomically optimized to facilitate user interactions.
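The image-caching idea mentioned above (caching processed single-molecule images to reduce processing time) is commonly realized as an LRU cache. The sketch below, with invented names like `render_tile`, shows one minimal way to do it; it is not the patented system's implementation.

```python
# Hedged sketch of an image cache: reprocess an image tile only on a cache
# miss, evicting the least-recently-used entry when full.
from collections import OrderedDict

class LRUImageCache:
    def __init__(self, capacity=128):
        self.capacity = capacity
        self._store = OrderedDict()

    def get_or_render(self, key, render):
        if key in self._store:
            self._store.move_to_end(key)     # mark as recently used
            return self._store[key]
        value = render(key)                  # expensive image processing
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict least recently used
        return value

calls = []
def render_tile(key):
    calls.append(key)  # track how often we actually "process" an image
    return f"tile:{key}"

cache = LRUImageCache(capacity=2)
cache.get_or_render("m1", render_tile)
cache.get_or_render("m2", render_tile)
cache.get_or_render("m1", render_tile)  # served from cache, no new render
cache.get_or_render("m3", render_tile)  # evicts m2, the least recently used
```

Only three renders occur for four requests; in an annotation UI this is the difference between instant panning and reprocessing every tile.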

  7. Bibliografia de Aztlan: An Annotated Chicano Bibliography.

    Science.gov (United States)

    Barrios, Ernie, Ed.

    More than 300 books and articles published from 1920 to 1971 are reviewed in this annotated bibliography of literature on the Chicano. The citations and reviews are categorized by subject area and deal with contemporary Chicano history, education, health, history of Mexico, literature, native Americans, philosophy, political science, pre-Columbian…

  8. DIMA – Annotation guidelines for German intonation

    DEFF Research Database (Denmark)

    Kügler, Frank; Smolibocki, Bernadett; Arnold, Denis

    2015-01-01

    This paper presents newly developed guidelines for prosodic annotation of German as a consensus system agreed upon by German intonologists. The DIMA system is rooted in the framework of autosegmental-metrical phonology. One important goal of the consensus is to make exchanging data between groups...

  9. Structuring and presenting annotated media repositories

    NARCIS (Netherlands)

    L. Rutledge (Lloyd); J.R. van Ossenbruggen (Jacco); L. Hardman (Lynda)

    2004-01-01

    The Semantic Web envisions a Web that is both human readable and machine processible. In practice, however, there is still a large conceptual gap between annotated content repositories on the one hand, and coherent, human readable Web pages on the other. To bridge this conceptual gap,

  10. Canonical Processes of Semantically Annotated Media Production

    NARCIS (Netherlands)

    Hardman, L.; Obrenović, Ž.; Nack, F.; Troncy, R.; Huet, B.; Schenk, S.

    2011-01-01

    While many multimedia systems allow the association of semantic annotations with media assets, there is no agreed way of sharing these among systems. This chapter identifies a small number of fundamental processes of media production, which the author terms canonical processes, which can be

  11. Canonical processes of semantically annotated media production

    NARCIS (Netherlands)

    L. Hardman (Lynda); Z. Obrenovic; F.-M. Nack (Frank); B. Kerhervé; K. Piersol

    2008-01-01

    While many multimedia systems allow the association of semantic annotations with media assets, there is no agreed-upon way of sharing these among systems. As an initial step within the multimedia community, we identify a small number of fundamental processes of media production, which we

  12. Canonical processes of semantically annotated media production

    NARCIS (Netherlands)

    Hardman, L.; Obrenović, Ž.; Nack, F.; Kerhervé, B.; Piersol, K.

    2008-01-01

    While many multimedia systems allow the association of semantic annotations with media assets, there is no agreed-upon way of sharing these among systems. As an initial step within the multimedia community, we identify a small number of fundamental processes of media production, which we term

  13. Teaching Creative Writing: A Selective, Annotated Bibliography.

    Science.gov (United States)

    Bishop, Wendy; And Others

    Focusing on pedagogical issues in creative writing, this annotated bibliography reviews 149 books, articles, and dissertations in the fields of creative writing and composition, and, selectively, feminist and literary theory. Anthologies of original writing and reference books are not included. (MM)

  14. An Annotated Bibliography in Financial Therapy

    Directory of Open Access Journals (Sweden)

    Dorothy B. Durband

    2010-10-01

    Full Text Available The following annotated bibliography contains a summary of articles and websites, as well as a list of books related to financial therapy. The resources were compiled through e-mail solicitation from members of the Financial Therapy Forum in November 2008. Members of the forum are marked with an asterisk.

  15. Just-in-time : on strategy annotations

    NARCIS (Netherlands)

    J.C. van de Pol (Jaco)

    2001-01-01

    A simple kind of strategy annotations is investigated, giving rise to a class of strategies, including leftmost-innermost. It is shown that under certain restrictions, an interpreter can be written which computes the normal form of a term in a bottom-up traversal. The main contribution

  16. Automating Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob L.; Hohimer, Ryan E.; White, Amanda M.

    2006-01-22

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.
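The class recognition step described above (assigning concept classes to words via a lexical resource) can be illustrated with a toy lookup. The tiny hand-built lexicon here is a stand-in for the WordNet-derived ontology; a real system would also disambiguate word senses in context.

```python
# Toy class recognition: tag each word with a concept class from a lexical
# resource. The lexicon below is an invented stand-in for a WordNet-based
# ontology of concept classes.

concept_lexicon = {
    "dog": "Animal",
    "cat": "Animal",
    "truck": "Vehicle",
    "car": "Vehicle",
}

def annotate_text(text, lexicon):
    """Return (word, concept_class) pairs; unknown words get 'Unknown'."""
    return [(w, lexicon.get(w.lower(), "Unknown")) for w in text.split()]

tags = annotate_text("the dog chased a truck", concept_lexicon)
```

The hard problems the abstract points at, managing the number of concept classes and disambiguating polysemous words, live inside what this sketch reduces to a single dictionary lookup.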

  17. Multimedia Annotations on the Semantic Web

    NARCIS (Netherlands)

    Stamou, G.; Ossenbruggen, J.R.; Pan, J.; Schreiber, A.T.

    2006-01-01

    Multimedia in all forms (images, video, graphics, music, speech) is exploding on the Web. The content needs to be annotated and indexed to enable effective search and retrieval. However, recent standards and best practices for multimedia metadata don't provide semantically rich descriptions of

  18. Multimedia Annotations on the Semantic Web

    NARCIS (Netherlands)

    G. Stamou; J.R. van Ossenbruggen (Jacco); J.Z. Pan (Jeff); G. Schreiber (Guus)

    2006-01-01

    Multimedia in all forms (images, video, graphics, music, speech) is exploding on the Web. The content needs to be annotated and indexed to enable effective search and retrieval. However, recent standards and best practices for multimedia metadata don't provide semantically rich

  19. La Mujer Chicana: An Annotated Bibliography, 1976.

    Science.gov (United States)

    Chapa, Evey, Ed.; And Others

    Intended to provide interested persons, researchers, and educators with information about "la mujer Chicana", this annotated bibliography cites 320 materials published between 1916 and 1975, with the majority being between 1960 and 1975. The 12 sections cover the following subject areas: Chicana publications; Chicana feminism and…

  20. Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob; Hohimer, Ryan E.; White, Amanda M.

    2006-06-06

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  1. Annotated Bibliography of EDGE2D Use

    International Nuclear Information System (INIS)

    Strachan, J.D.; Corrigan, G.

    2005-01-01

    This annotated bibliography is intended to help EDGE2D users, and particularly new users, find existing published literature that has used EDGE2D. Our idea is that a person can find existing studies which may relate to his intended use, as well as gain ideas about other possible applications by scanning the attached tables

  2. Male-Female Sexuality: An Annotated Bibliography.

    Science.gov (United States)

    Wilson, Janice

    This annotated bibliography contains over 500 sources on the historical and contemporary development and expression of male and female sexuality. There are 68 topic headings which provide easy access for subject areas. A major portion of the bibliography is devoted to contemporary male-female sexuality. These materials consist of research findings…

  3. Mulligan Concept manual therapy: standardizing annotation.

    Science.gov (United States)

    McDowell, Jillian Marie; Johnson, Gillian Margaret; Hetherington, Barbara Helen

    2014-10-01

    Quality technique documentation is integral to the practice of manual therapy, ensuring uniform application and reproducibility of treatment. Manual therapy techniques are described by annotations utilizing a range of acronyms, abbreviations and universal terminology based on biomechanical and anatomical concepts. The various combinations of therapist- and patient-generated forces utilized in a variety of weight-bearing positions, which are synonymous with the Mulligan Concept, challenge practitioners' existing annotation skills. An annotation framework with recording rules adapted to the Mulligan Concept is proposed in which the abbreviations incorporate established manual therapy tenets and are detailed in the following sequence: starting position, side, joint/s, method of application, glide/s, Mulligan technique, movement (or function), whether an assistant is used, overpressure (and by whom), and number of repetitions or time and sets. Therapist or patient application of overpressure and utilization of treatment belts or manual techniques must be recorded to capture the complete description. The adoption of the Mulligan Concept annotation framework for documentation purposes will provide uniformity and clarity of information transfer for the future purposes of teaching, clinical practice and audit for its practitioners.
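A fixed-order annotation framework like this is essentially a record serialized field by field. The sketch below composes an annotation string from its ordered components; the field names and example abbreviations are placeholders for illustration, not the official Mulligan notation.

```python
# Illustrative sketch: compose a manual-therapy annotation string from
# ordered fields. Field names and abbreviations are invented placeholders.

FIELD_ORDER = [
    "starting_position", "side", "joint", "method", "glide",
    "technique", "movement", "assistant", "overpressure", "dose",
]

def annotate(**fields):
    """Join the supplied fields in the framework's fixed recording order."""
    parts = [str(fields[f]) for f in FIELD_ORDER if f in fields]
    return " ".join(parts)

note = annotate(starting_position="sit", side="R", joint="C5/6",
                technique="SNAG", movement="Rot", dose="x6")
```

Enforcing the order in code rather than by convention is what gives such a framework its promised uniformity: two practitioners recording the same treatment produce the same string.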

  4. An Annotated Publications List on Homelessness.

    Science.gov (United States)

    Tutunjian, Beth Ann

    This annotated publications list on homelessness contains citations for 19 publications, most of which deal with problems of alcohol or drug abuse among homeless persons. Citations are listed alphabetically by author and cover the topics of homelessness and alcoholism, drug abuse, public policy, research methodologies, mental illness, alcohol- and…

  5. Book Reviews, Annotation, and Web Technology.

    Science.gov (United States)

    Schulze, Patricia

    From reading texts to annotating web pages, grade 6-8 students rely on group cooperation and individual reading and writing skills in this research project that spans six 50-minute lessons. Student objectives for this project are that they will: read, discuss, and keep a journal on a book in literature circles; understand the elements of and…

  6. Genotyping and annotation of Affymetrix SNP arrays

    DEFF Research Database (Denmark)

    Lamy, Philippe; Andersen, Claus Lindbjerg; Wikman, Friedrik

    2006-01-01

    allows us to annotate SNPs that have poor performance, either because of poor experimental conditions or because for one of the alleles the probes do not behave in a dose-response manner. Generally, our method agrees well with a method developed by Affymetrix. When both methods make a call they agree...

  7. Snap: an integrated SNP annotation platform

    DEFF Research Database (Denmark)

    Li, Shengting; Ma, Lijia; Li, Heng

    2007-01-01

    Snap (Single Nucleotide Polymorphism Annotation Platform) is a server designed to comprehensively analyze single genes and relationships between genes based on SNPs in the human genome. The aim of the platform is to facilitate the study of SNP finding and analysis within the framework of medical...

  8. Annotating State of Mind in Meeting Data

    NARCIS (Netherlands)

    Heylen, Dirk K.J.; Reidsma, Dennis; Ordelman, Roeland J.F.; Devillers, L.; Martin, J-C.; Cowie, R.; Batliner, A.

    We discuss the annotation procedure for mental state and emotion that is under development for the AMI (Augmented Multiparty Interaction) corpus. The categories that were found to be most appropriate relate not only to emotions but also to (meta-)cognitive states and interpersonal variables. The

  9. ePNK Applications and Annotations

    DEFF Research Database (Denmark)

    Kindler, Ekkart

    2017-01-01

    new applications for the ePNK and, in particular, visualizing the result of an application in the graphical editor of the ePNK by using annotations, and interacting with the end user using these annotations. In this paper, we give an overview of the concepts of ePNK applications by discussing the implementation...

  10. Evaluating automatically annotated treebanks for linguistic research

    NARCIS (Netherlands)

    Bloem, J.; Bański, P.; Kupietz, M.; Lüngen, H.; Witt, A.; Barbaresi, A.; Biber, H.; Breiteneder, E.; Clematide, S.

    2016-01-01

    This study discusses evaluation methods for linguists to use when employing an automatically annotated treebank as a source of linguistic evidence. While treebanks are usually evaluated with a general measure over all the data, linguistic studies often focus on a particular construction or a group

  11. Indiana Newspaper History: An Annotated Bibliography.

    Science.gov (United States)

    Popovich, Mark, Comp.; And Others

    The purposes of this bibliography are to bring together materials that relate to the history of newspapers in Indiana and to assess, in a general way, the value of the material. The bibliography contains 415 entries, with descriptive annotations, arranged in seven sections: books; special materials; general newspaper histories and lists of…

  12. Multiview Hessian regularization for image annotation.

    Science.gov (United States)

    Liu, Weifeng; Tao, Dacheng

    2013-07-01

    The rapid development of computer hardware and Internet technology makes large-scale data-dependent models computationally tractable, and opens a bright avenue for annotating images through innovative machine learning algorithms. Semisupervised learning (SSL) has therefore received intensive attention in recent years and has been successfully deployed in image annotation. One representative work in SSL is Laplacian regularization (LR), which smoothes the conditional distribution for classification along the manifold encoded in the graph Laplacian. However, LR biases the classification function toward a constant function, which can result in poor generalization. In addition, LR is developed to handle uniformly distributed data (or single-view data), although instances or objects, such as images and videos, are usually represented by multiview features, such as color, shape, and texture. In this paper, we present multiview Hessian regularization (mHR) to address these two problems in LR-based image annotation. In particular, mHR optimally combines multiple HR terms, each of which is obtained from a particular view of the instances, and steers the classification function so that it varies linearly along the data manifold. We apply mHR to kernel least squares and support vector machines as two examples for image annotation. Extensive experiments on the PASCAL VOC'07 dataset validate the effectiveness of mHR by comparing it with baseline algorithms, including LR and HR.

  13. Annotated Bibliography of English for Special Purposes.

    Science.gov (United States)

    Allix, Beverley, Comp.

    This annotated bibliography covers the following types of materials of use to teachers of English for Special Purposes: (1) books, monographs, reports, and conference papers; (2) periodical articles and essays in collections; (3) theses and dissertations; (4) bibliographies; (5) dictionaries; and (6) textbooks in series by publisher. Section (1)…

  14. Great Basin Experimental Range: Annotated bibliography

    Science.gov (United States)

    E. Durant McArthur; Bryce A. Richardson; Stanley G. Kitchen

    2013-01-01

    This annotated bibliography documents the research that has been conducted on the Great Basin Experimental Range (GBER, also known as the Utah Experiment Station, Great Basin Station, the Great Basin Branch Experiment Station, Great Basin Experimental Center, and other similar name variants) over the 102 years of its existence. Entries were drawn from the original...

  15. Chemical Principles Revisited: Annotating Reaction Equations.

    Science.gov (United States)

    Tykodi, R. J.

    1987-01-01

    Urges chemistry teachers to have students annotate the chemical reactions in aqueous-solutions that they see in their textbooks and witness in the laboratory. Suggests this will help students recognize the reaction type more readily. Examples are given for gas formation, precipitate formation, redox interaction, acid-base interaction, and…

  16. Frost heave and pipeline upheaval buckling

    Energy Technology Data Exchange (ETDEWEB)

    Palmer, A. C. [University Engineering Department, Cambridge (United Kingdom); Williams, P. J. [Carleton Univ., Geotechnical Science Laboratories, Ottawa, ON (Canada)

    2003-10-01

    The interaction between frost heave and upheaval buckling and the potential effect of these two phenomena on the safety of Arctic pipelines is discussed. When soils freeze, ice forms within the pores between the particles. If the surface is free to move, it heaves, because of the expansion that accompanies freezing. Upheaval buckling occurs in longitudinally constrained buried pipelines which can lead to large upward movements of a pipeline. The driving force for upheaval is the longitudinal compressive force induced by operation of the pipeline. While uniform vertical movement does not affect functionality, movements that induce curvature can overstress a pipeline to a dangerous extent. This paper examines the adverse interaction between the longitudinal variability of frost heave and the propensity of heave-induced overbends, and the conditions under which they might lead to upheaval. Results obtained suggest that discontinuities in frost heave can be sufficient to destabilize a high-pressure pipeline and induce upheaval, even when the original 'as-laid' profile is perfectly straight and level. The heave is most likely to occur in winter, with the upheaval following in the summer when the operating temperature is higher and the uplift resistance is reduced. 27 refs., 2 figs.

  17. Outlook '98 - Gas and oil pipelines

    International Nuclear Information System (INIS)

    Curtis, B.

    1998-01-01

    Due to rising North American demand, especially from the United States, by the end of 1997 there were plans to build 15 new pipelines over the next three years, at an estimated cost of $17 billion. Canada's proximity to the United States, combined with huge Canadian reserves and the fact that Canada already supplies some 15 per cent of U.S. requirements, makes Canada the obvious choice for filling future demand. This explains why most, if not all, current pipeline expansion projects are targeting markets in the U.S. Market forces will determine which of the projects will actually go forward. From the point of view of the Canadian Energy Pipeline Association, pipeline regulatory reform, pipeline safety, integrity and climate change will be the Association's key concerns during 1998. To that end, the Association is cooperating with the National Energy Board in a multi-million dollar study of stress corrosion cracking. The Association has also developed a Manual of Recommended Practices for the use of member companies, to assist them in tailoring stress corrosion cracking practices to their own operations. Meeting Canada's commitment at the Kyoto Conference for greenhouse gas emissions of six per cent below 1990 levels by the years 2008 to 2012 (in effect a 25 per cent reduction from the level anticipated in the year 2000), a very difficult task according to industry experts, is also among the high-priority items on the pipeline industry's agenda for 1998.

  18. Detecting method and device for pipeline

    International Nuclear Information System (INIS)

    Hirano, Akihiko; Hirano, Atsuya; Otaka, Masahiro; Kanno, Satoshi; Amano, Kazuo; Miyazaki, Katsumasa; Hattori, Shigeo; Yamamoto, Michiyoshi

    1998-01-01

    The present invention provides a method and a device for diagnosing the integrity of pipelines whose surfaces are exposed to a corrosive atmosphere in a nuclear power plant. Through holes are perforated in the pipeline, and a predetermined area inside the pipeline is scanned by laser light through these holes. The luminance distribution of the laser light reflected from the inner circumference of the pipeline is photographed by a CCD camera. The images of the luminance distribution are processed to pick out a failed state, and the image of the failed state is measured to determine the dimensions of the failure. Alternatively, an inspection module capable of irradiating laser light and detecting the reflected light is inserted, through a section of the pipeline system that can be removed for inspection, to the position of the object to be examined. A predetermined area is scanned by laser light, the luminance distribution of the laser light reflected in the pipeline is photographed by the CCD camera, the images of the luminance distribution are processed to pick out the failed state, and the image of the failed state is measured to determine the dimensions of the failure. (I.S.)

  19. Astronomical pipeline processing using fuzzy logic

    Science.gov (United States)

    Shamir, Lior

    In the past few years, pipelines providing astronomical data have become increasingly important. The wide use of robotic telescopes has led to significant discoveries, and sky survey projects such as SDSS and the future LSST are now considered among the premier projects in the field of astronomy. The huge amount of data produced by these pipelines raises the need for automatic processing. Astronomical pipelines introduce several well-defined problems such as astronomical image compression, cosmic-ray hit rejection, transient detection, meteor triangulation and association of point sources with their corresponding known stellar objects. We developed and applied soft computing algorithms that provide new or improved solutions to these growing problems in the field of pipeline processing of astronomical data. One new approach that we use is fuzzy logic-based algorithms, which enable automatic analysis of astronomical pipelines and allow mining the data for not-yet-known astronomical discoveries such as optical transients and variable stars. The developed algorithms have been tested with excellent results on the NightSkyLive sky survey, which provides a pipeline of 150 astronomical pictures per hour and covers almost the entire global night sky.
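As an illustration of the fuzzy-logic approach the abstract describes, here is a minimal sketch of a fuzzy rule for cosmic-ray hit rejection. The membership functions, thresholds and feature names are illustrative assumptions, not the paper's actual rules:

```python
def tri(x, a, b, c):
    """Triangular fuzzy membership function: 0 outside [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def cosmic_ray_likelihood(peak_ratio, width_px):
    """Fuzzy rule: a detection that is much brighter than its local
    background AND much narrower than the PSF is likely a cosmic-ray hit.
    Thresholds below are illustrative assumptions, not the paper's values."""
    bright = tri(peak_ratio, 2.0, 10.0, 1e9)  # "bright spike" membership
    narrow = tri(width_px, -1.0, 0.0, 3.0)    # "narrow profile" membership
    return min(bright, narrow)                # fuzzy AND (minimum t-norm)
```

Because the output is a graded degree of membership rather than a hard yes/no, borderline detections (e.g. a moderately bright, slightly extended spike) receive intermediate scores that a downstream threshold or further rules can act on.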

  20. Generation of silver standard concept annotations from biomedical texts with special relevance to phenotypes.

    Directory of Open Access Journals (Sweden)

    Anika Oellrich

    Full Text Available Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess its quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO Annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and to their quality (best F1-measure of 33%), independent of the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when they are combined with the NCBO Annotator and cTAKES due to low recall. In conclusion, the performances of individual systems need to be improved independently of the text types, and the leveraging strategies used to best take advantage of individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and the assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources.
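The silver-standard idea of combining several systems' outputs can be sketched as a simple vote over annotation tuples. This is a simplified illustration under assumed data structures, not the harmonization procedure used in the paper:

```python
from collections import Counter

def silver_standard(system_annotations, min_votes=2):
    """Keep a (start, end, concept) annotation in the silver standard if at
    least min_votes of the systems produced it (one vote per system).
    The tuple encoding of annotations here is an illustrative assumption."""
    votes = Counter()
    for annotations in system_annotations:
        votes.update(set(annotations))  # set() ensures one vote per system
    return {ann for ann, n in votes.items() if n >= min_votes}
```

Raising `min_votes` trades recall for precision, which mirrors the paper's observation that adding low-recall systems to the vote can raise precision while lowering the combined F1-measure.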

  1. Simulation and Experiment Research on Fatigue Life of High Pressure Air Pipeline Joint

    Science.gov (United States)

    Shang, Jin; Xie, Jianghui; Yu, Jian; Zhang, Deman

    2017-12-01

    High-pressure air pipeline joints are important parts of high-pressure air systems, and their reliability is related to the safety and stability of the system. This work developed a new-type high-pressure air pipeline joint, carried out dynamics research on the CB316-1995 joint and the new-type joint with the finite element method, and analysed in depth the joint forms of different design schemes and the effect of materials on the stress, tightening torque and fatigue life of the joint. The research team set up a vibration/pulse test bench and carried out a comparative joint fatigue life test. The results show that the maximum stress of the joint occurs on the inner side of the outer sleeve nut, which is consistent with the failure mode of cracks on the outer sleeve nut observed in practice. The fatigue life and tightening torque of the new-type high-pressure air pipeline joint are better than those of the CB316-1995 joint in both simulation and experiment.

  2. Method to reduce arc blow during DC arc welding of pipelines

    Energy Technology Data Exchange (ETDEWEB)

    Espina-Hernandez, J. H.; Rueda-Morales, G.L.; Caleyo, F.; Hallen, J. M. [Instituto Politecnico Nacional, Mexico, (Mexico); Lopez-Montenegro, A.; Perz-Baruch, E. [Pemex Exploracion y Produccion, Tabasco, (Mexico)

    2010-07-01

    Steel pipelines are huge ferromagnetic structures and can be easily subjected to arc blow during the DC arc welding process. The development of methods to avoid arc blow during pipeline DC arc welding is a major objective in the pipeline industry. This study developed a simple procedure to compensate the residual magnetic field in the groove during DC arc welding. A Gaussmeter was used to perform magnetic flux density measurements in pipelines in southern Mexico. These data were used to perform magnetic finite element simulations using FEMM. Different variables were studied such as the residual magnetic field in the groove or the position of the coil with respect to the groove. An empirical predictive equation was developed from these trials to compensate for the residual magnetic field. A new method of compensating for the residual magnetic field in the groove by selecting the number of coil turns and the position of the coil with respect to the groove was established.

  3. Integrity assessment of pipelines - additional remarks; Avaliacao da integridade de dutos - observacoes adicionais

    Energy Technology Data Exchange (ETDEWEB)

    Alves, Luis F.C. [PETROBRAS S.A., Salvador, BA (Brazil). Unidade de Negocios. Exploracao e Producao

    2005-07-01

    Integrity assessment of pipelines is part of a process that aims to enhance the operating safety of pipelines. During this task, questions related to the interpretation of inspection reports and the way of regarding the impact of several parameters on the pipeline integrity normally come up. In order to satisfactorily answer such questions, the integrity assessment team must be able to suitably approach different subjects such as corrosion control and monitoring, assessment of metal loss and geometric anomalies, and third party activities. This paper presents additional remarks on some of these questions based on the integrity assessment of almost fifty pipelines that has been done at PETROBRAS/E&P Bahia over the past eight years. (author)

  4. Drive Control System for Pipeline Crawl Robot Based on CAN Bus

    Energy Technology Data Exchange (ETDEWEB)

    Chen, H J [Department of Electrical Engineering, Harbin Institute of Technology Harbin, Heilongjiang, 150001 (China); Gao, B T [Department of Electrical Engineering, Harbin Institute of Technology Harbin, Heilongjiang, 150001 (China); Zhang, X H [Department of Electrical Engineering, Harbin Institute of Technology Harbin, Heilongjiang, 150001 (China); Deng, Z Q [School of Mechanical Engineering, Harbin Institute of Technology Harbin, Heilongjiang, 150001 (China)

    2006-10-15

    The drive control system plays an important role in a pipeline robot. In order to inspect flaws and corrosion in seabed crude oil pipelines, an original mobile pipeline robot is developed, comprising a crawler drive unit, a power and monitoring unit, a central control unit, and an ultrasonic inspection device. A CAN bus connects these function units and provides a reliable information channel. Considering the limited space, a compact hardware system is designed based on an ARM processor with two CAN controllers. With a made-to-order CAN protocol for the crawl robot, an intelligent drive control system is developed. The implementation of the crawl robot demonstrates that the presented drive control scheme can meet the motion control requirements of the underwater pipeline crawl robot.

  5. Drive Control System for Pipeline Crawl Robot Based on CAN Bus

    International Nuclear Information System (INIS)

    Chen, H J; Gao, B T; Zhang, X H; Deng, Z Q

    2006-01-01

    The drive control system plays an important role in a pipeline robot. In order to inspect flaws and corrosion in seabed crude oil pipelines, an original mobile pipeline robot is developed, comprising a crawler drive unit, a power and monitoring unit, a central control unit, and an ultrasonic inspection device. A CAN bus connects these function units and provides a reliable information channel. Considering the limited space, a compact hardware system is designed based on an ARM processor with two CAN controllers. With a made-to-order CAN protocol for the crawl robot, an intelligent drive control system is developed. The implementation of the crawl robot demonstrates that the presented drive control scheme can meet the motion control requirements of the underwater pipeline crawl robot.

  6. SAND: an automated VLBI imaging and analysing pipeline - I. Stripping component trajectories

    Science.gov (United States)

    Zhang, M.; Collioud, A.; Charlot, P.

    2018-02-01

    We present our implementation of an automated very long baseline interferometry (VLBI) data-reduction pipeline that is dedicated to interferometric data imaging and analysis. The pipeline can handle massive VLBI data efficiently, which makes it an appropriate tool to investigate multi-epoch multiband VLBI data. Compared to traditional manual data reduction, our pipeline provides more objective results as less human interference is involved. The source extraction is carried out in the image plane, while deconvolution and model fitting are performed in both the image plane and the uv plane for parallel comparison. The output from the pipeline includes catalogues of CLEANed images and reconstructed models, polarization maps, proper motion estimates, core light curves and multiband spectra. We have developed a regression STRIP algorithm to automatically detect linear or non-linear patterns in the jet component trajectories. This algorithm offers an objective method to match jet components at different epochs and to determine their proper motions.

  7. A proposal of multi-objective function for submarine rigid pipelines route optimization via evolutionary algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Fernandes, D.H.; Medeiros, A.R. [Subsea7, Niteroi, RJ (Brazil); Jacob, B.P.; Lima, B.S.L.P.; Albrecht, C.H. [Universidade Federaldo Rio de Janeiro (COPPE/UFRJ), RJ (Brazil). Coordenacao de Programas de Pos-graduacao em Engenharia

    2009-07-01

    This work presents studies regarding the determination of optimal pipeline routes for offshore applications. The assembly of an objective function is presented; this function can be later associated with Evolutionary Algorithm to implement a computational tool for the automatic determination of the most advantageous pipeline route for a given scenario. This tool may reduce computational overheads, avoid mistakes with route interpretation, and minimize costs with respect to submarine pipeline design and installation. The following aspects can be considered in the assembly of the objective function: Geophysical and geotechnical data obtained from the bathymetry and sonography; the influence of the installation method, total pipeline length and number of free spans to be mitigated along the routes as well as vessel time for both cases. Case studies are presented to illustrate the use of the proposed objective function, including a sensitivity analysis intended to identify the relative influence of selected parameters in the evaluation of different routes. (author)
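A weighted-sum objective of the kind described can be sketched as follows. The route encoding, the chosen objectives and all weights are illustrative assumptions, not the authors' actual function:

```python
def route_cost(route, weights=(1.0, 50.0, 10.0)):
    """Score a candidate route as a weighted sum of competing objectives:
    total length, number of free spans to mitigate, and a terrain penalty
    (e.g. derived from bathymetry/sonography). All weights and the dict
    encoding of a route are illustrative assumptions."""
    w_len, w_span, w_terrain = weights
    return (w_len * route["length_km"]
            + w_span * route["free_spans"]
            + w_terrain * route["terrain_penalty"])

def best_route(candidates, weights=(1.0, 50.0, 10.0)):
    """Pick the lowest-cost route; in practice this evaluation would be the
    fitness function driving an evolutionary algorithm over many candidates."""
    return min(candidates, key=lambda r: route_cost(r, weights))
```

Scalarizing the objectives into one cost makes the trade-off explicit: here a single free span is worth as much as 50 km of extra length, and a sensitivity analysis amounts to re-ranking candidates under different weight vectors.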

  8. Markov chain modelling of pitting corrosion in underground pipelines

    International Nuclear Information System (INIS)

    Caleyo, F.; Velazquez, J.C.; Valor, A.; Hallen, J.M.

    2009-01-01

    A continuous-time, non-homogeneous linear growth (pure birth) Markov process has been used to model external pitting corrosion in underground pipelines. The closed-form solution of Kolmogorov's forward equations for this type of Markov process is used to describe the transition probability function in a discrete pit-depth space. The transition probability function can be identified by correlating the stochastic pit-depth mean with the deterministic mean obtained experimentally. Previously reported Monte-Carlo simulations have been used to predict the time evolution of the mean value of the pit-depth distribution for different soil textural classes. The simulated distributions have been used to create an empirical Markov chain-based stochastic model for predicting the evolution of pitting corrosion depth and rate distributions from the observed properties of the soil. The proposed model has also been applied to pitting corrosion data from repeated pipeline in-line inspections and laboratory immersion experiments.
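The linear-growth (pure birth) process can be checked with a short Monte-Carlo sketch. All parameter values below are illustrative assumptions, not the paper's soil-calibrated rates; for a linear pure-birth process with per-state rate nλ, the mean state grows as d0·e^(λt):

```python
import random

def simulate_pit_depth(t_end, lam=0.5, d0=1, rng=None):
    """One pure-birth trajectory: a pit in discrete depth state n jumps to
    n+1 with rate n*lam (linear growth), starting from state d0."""
    rng = rng or random
    t, depth = 0.0, d0
    while True:
        wait = rng.expovariate(depth * lam)  # state-dependent exponential wait
        if t + wait > t_end:
            return depth
        t += wait
        depth += 1

def mean_depth(t_end, n_runs=2000, lam=0.5, d0=1, seed=0):
    """Monte-Carlo estimate of the mean pit-depth state at time t_end."""
    rng = random.Random(seed)
    return sum(simulate_pit_depth(t_end, lam, d0, rng)
               for _ in range(n_runs)) / n_runs
```

Correlating the simulated mean with an experimentally observed deterministic mean, as the abstract describes, then amounts to choosing λ so that d0·e^(λt) tracks the measured pit-depth growth curve.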

  9. Automatic Semantic Annotation of Music with Harmonic Structure

    OpenAIRE

    Weyde, T.

    2007-01-01

    This paper presents an annotation model for harmonic structure of a piece of music, and a rule system that supports the automatic generation of harmonic annotations. Musical structure has so far received relatively little attention in the context of musical metadata and annotation, although it is highly relevant for musicians, musicologists and indirectly for music listeners. Activities in semantic annotation of music have so far mostly concentrated on features derived from audio data and fil...

  10. An optimized algorithm for detecting and annotating regional differential methylation.

    Science.gov (United States)

    Li, Sheng; Garrett-Bakelman, Francine E; Akalin, Altuna; Zumbo, Paul; Levine, Ross; To, Bik L; Lewis, Ian D; Brown, Anna L; D'Andrea, Richard J; Melnick, Ari; Mason, Christopher E

    2013-01-01

    DNA methylation profiling reveals important differentially methylated regions (DMRs) of the genome that are altered during development or perturbed by disease. To date, few programs exist for regional analysis of enriched or whole-genome bisulfite conversion sequencing data, even though such data are increasingly common. Here, we describe an open-source, optimized method for determining empirically based DMRs (eDMR) from high-throughput sequence data that is applicable to enriched whole-genome methylation profiling datasets, as well as other globally enriched epigenetic modification data. We show that our bimodal distribution model and weighted cost function for optimized regional methylation analysis provide accurate boundaries of regions harboring significant epigenetic modifications. Our algorithm takes the spatial distribution of CpGs into account for the enrichment assay, allowing optimization of the definition of empirical regions for differential methylation. Combined with the dependent adjustment for regional p-value combination and DMR annotation, we provide a method that may be applied to a variety of datasets for rapid DMR analysis. Our method classifies both the directionality of DMRs and their genome-wide distribution, and we have observed that it shows clinical relevance through correct stratification of two acute myeloid leukemia (AML) tumor subtypes. Our weighted optimization algorithm eDMR for calling DMRs extends an established DMR R pipeline (methylKit) and provides a needed resource in epigenomics. Our method enables an accurate and scalable way of finding DMRs in high-throughput methylation sequencing experiments. eDMR is available for download at http://code.google.com/p/edmr/.
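A much-simplified sketch of regional DMR calling: group CpG sites by spatial distance and threshold the mean methylation difference. The real eDMR uses a bimodal distribution model and weighted cost function; the parameters and data layout below are illustrative assumptions:

```python
def call_dmrs(sites, max_gap=100, min_sites=3, min_diff=0.25):
    """sites: (position, meth_case, meth_control) tuples on one chromosome.
    Group sites into candidate regions wherever consecutive CpGs lie within
    max_gap bp, then call a region a DMR if it has at least min_sites CpGs
    and the mean methylation difference exceeds min_diff. The sign of the
    difference gives the DMR's directionality (hyper- vs hypomethylated)."""
    sites = sorted(sites)
    regions, current = [], []
    for site in sites:
        if current and site[0] - current[-1][0] > max_gap:
            regions.append(current)  # gap too large: close current region
            current = []
        current.append(site)
    if current:
        regions.append(current)

    dmrs = []
    for region in regions:
        if len(region) < min_sites:
            continue
        diff = sum(case - ctrl for _, case, ctrl in region) / len(region)
        if abs(diff) >= min_diff:
            dmrs.append((region[0][0], region[-1][0], diff))
    return dmrs
```

Making the region boundaries depend on the observed CpG spacing, rather than on fixed genomic windows, is the spirit of the "empirical regions" the abstract describes.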

  11. A SANE approach to annotation in the digital edition

    NARCIS (Netherlands)

    Boot, P.; Braungart, Georg; Jannidis, Fotis; Gendolla, Peter

    2007-01-01

    Robinson and others have recently called for dynamic and collaborative digital scholarly editions. Annotation is a key component for editions that are not merely passive, read-only repositories of knowledge. Annotation facilities (both annotation creation and display), however, require complex

  12. Annotation Method (AM): SE22_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available SE22_AM1 Annotation based on a grading system Collected mass spectral features, together with predicted molecular formulae and putative structures, were provided as metabolite annotations. Comparison with public databases was performed. A grading system was introduced to describe the evidence supporting the annotations. ...

  13. Annotation of the protein coding regions of the equine genome

    DEFF Research Database (Denmark)

    Hestand, Matthew S.; Kalbfleisch, Theodore S.; Coleman, Stephen J.

    2015-01-01

    Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced m...

  14. Web and personal image annotation by mining label correlation with relaxed visual graph embedding.

    Science.gov (United States)

    Yang, Yi; Wu, Fei; Nie, Feiping; Shen, Heng Tao; Zhuang, Yueting; Hauptmann, Alexander G

    2012-03-01

    The number of digital images rapidly increases, and it becomes an important challenge to organize these resources effectively. As a way to facilitate image categorization and retrieval, automatic image annotation has received much research attention. Considering that there are a great number of unlabeled images available, it is beneficial to develop an effective mechanism to leverage unlabeled images for large-scale image annotation. Meanwhile, a single image is usually associated with multiple labels, which are inherently correlated to each other. A straightforward method of image annotation is to decompose the problem into multiple independent single-label problems, but this ignores the underlying correlations among different labels. In this paper, we propose a new inductive algorithm for image annotation by integrating label correlation mining and visual similarity mining into a joint framework. We first construct a graph model according to image visual features. A multilabel classifier is then trained by simultaneously uncovering the shared structure common to different labels and the visual graph embedded label prediction matrix for image annotation. We show that the globally optimal solution of the proposed framework can be obtained by performing generalized eigen-decomposition. We apply the proposed framework to both web image annotation and personal album labeling using the NUS-WIDE, MSRA MM 2.0, and Kodak image data sets, and the AUC evaluation metric. Extensive experiments on large-scale image databases collected from the web and personal album show that the proposed algorithm is capable of utilizing both labeled and unlabeled data for image annotation and outperforms other algorithms.

  15. Redefining the Data Pipeline Using GPUs

    Science.gov (United States)

    Warner, C.; Eikenberry, S. S.; Gonzalez, A. H.; Packham, C.

    2013-10-01

    There are two major challenges facing the next generation of data processing pipelines: 1) handling an ever increasing volume of data as array sizes continue to increase and 2) the desire to process data in near real-time to maximize observing efficiency by providing rapid feedback on data quality. Combining the power of modern graphics processing units (GPUs), relational database management systems (RDBMSs), and extensible markup language (XML) to re-imagine traditional data pipelines will allow us to meet these challenges. Modern GPUs contain hundreds of processing cores, each of which can process hundreds of threads concurrently. Technologies such as Nvidia's Compute Unified Device Architecture (CUDA) platform and the PyCUDA (http://mathema.tician.de/software/pycuda) module for Python allow us to write parallel algorithms and easily link GPU-optimized code into existing data pipeline frameworks. This approach has produced speed gains of over a factor of 100 compared to CPU implementations for individual algorithms and overall pipeline speed gains of a factor of 10-25 compared to traditionally built data pipelines for both imaging and spectroscopy (Warner et al., 2011). However, there are still many bottlenecks inherent in the design of traditional data pipelines. For instance, file input/output of intermediate steps is now a significant portion of the overall processing time. In addition, most traditional pipelines are not designed to be able to process data on-the-fly in real time. We present a model for a next-generation data pipeline that has the flexibility to process data in near real-time at the observatory as well as to automatically process huge archives of past data by using a simple XML configuration file. XML is ideal for describing both the dataset and the processes that will be applied to the data. 
Meta-data for the datasets would be stored using an RDBMS (such as mysql or PostgreSQL) which could be easily and rapidly queried and file I/O would be
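The XML-configured pipeline idea can be sketched as a registry of processing steps driven by a configuration file that describes both the dataset and the processes applied to it. The step names and parameters here are hypothetical examples, not from the authors' system:

```python
import xml.etree.ElementTree as ET

# registry of named processing steps; names are hypothetical examples
STEPS = {
    "bias_subtract": lambda frame, bias=0.0: [x - bias for x in frame],
    "normalize": lambda frame: [x / max(frame) for x in frame],
}

CONFIG = """
<pipeline>
  <dataset name="night1"/>
  <step name="bias_subtract" bias="10"/>
  <step name="normalize"/>
</pipeline>
"""

def run_pipeline(xml_text, frame):
    """Apply the <step> elements of an XML pipeline description, in order,
    to a frame of pixel values; element attributes become keyword arguments."""
    root = ET.fromstring(xml_text)
    for step in root.findall("step"):
        func = STEPS[step.get("name")]
        kwargs = {k: float(v) for k, v in step.attrib.items() if k != "name"}
        frame = func(frame, **kwargs)
    return frame
```

Because the XML file, not the code, defines which steps run and with which parameters, the same engine can process data on-the-fly at the observatory or re-process an archive by simply swapping configuration files.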

  16. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    Directory of Open Access Journals (Sweden)

    Cieślik Marcin

    2011-02-01

    Full Text Available Abstract Background Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and
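The PaPy API itself is not reproduced in the record. The dataflow idea it describes, user-written, data-coupled components mapped over pooled workers, can be sketched generically as follows (this is NOT the PaPy API; the stage functions are invented):

```python
from multiprocessing import Pool

# Each stage is a plain function; stages are data-coupled, and batches of
# input items are mapped over a worker pool, stage by stage.

def parse(item):
    # e.g., normalize a raw sequence record
    return item.strip().upper()

def gc_content(seq):
    # e.g., a per-item transformation producing (sequence, GC fraction)
    return (seq, (seq.count("G") + seq.count("C")) / len(seq))

def run_pipeline(items, stages, workers=2):
    with Pool(workers) as pool:
        for stage in stages:            # successive map transformations
            items = pool.map(stage, items)
    return items

if __name__ == "__main__":
    out = run_pipeline(["acgt", "ggcc"], [parse, gc_content])
    print(out)  # [('ACGT', 0.5), ('GGCC', 1.0)]
```

In PaPy proper the stages form a directed acyclic graph rather than a simple chain, and batch size controls the parallelism/laziness trade-off the abstract mentions.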

  17. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines.

    Science.gov (United States)

    Cieślik, Marcin; Mura, Cameron

    2011-02-25

    Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive documentation and annotated usage

  18. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    Science.gov (United States)

    2011-01-01

    Background Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive

  19. 76 FR 44985 - Pipeline Safety: Potential for Damage to Pipeline Facilities Caused by Flooding

    Science.gov (United States)

    2011-07-27

    ... of underwater pipe should include the use of visual inspection by divers or instrumented detection... operators of gas and hazardous liquid pipelines to communicate the potential for damage to pipeline... facilities to determine and take appropriate action concerning changes in class location, failures, leakage...

  20. 77 FR 32631 - Lion Oil Trading & Transportation, Inc., Magnolia Pipeline Company, and El Dorado Pipeline...

    Science.gov (United States)

    2012-06-01

    ... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Docket No. OR12-13-000] Lion Oil... of the Commission's Rules of Practice and Procedure, 18 CFR 385.202 (2011), Lion Oil Trading & Transportation, Inc., Magnolia Pipeline Company, and El Dorado Pipeline Company, collectively, Lion Companies...

  1. The Dangers of Pipeline Thinking: How the School-to-Prison Pipeline Metaphor Squeezes out Complexity

    Science.gov (United States)

    McGrew, Ken

    2016-01-01

    In this essay Ken McGrew critically examines the "school-to-prison pipeline" metaphor and associated literature. The origins and influence of the metaphor are compared with the origins and influence of the competing "prison industrial complex" concept. Specific weaknesses in the "pipeline literature" are examined…

  2. Marine Environmental Protection and Transboundary Pipeline Projects: A Case Study of the Nord Stream Pipeline

    NARCIS (Netherlands)

    Lott, Alexander

    2011-01-01

    The Nord Stream transboundary submarine pipeline, significant for its impact on the EU energy policy, has been a heavily debated issue in the Baltic Sea region during the past decade. This is partly due to the concerns over the effects that the pipeline might have on the Baltic Sea as a

  3. Genome re-annotation of the wild strawberry Fragaria vesca using extensive Illumina- and SMRT-based RNA-seq datasets.

    Science.gov (United States)

    Li, Yongping; Wei, Wei; Feng, Jia; Luo, Huifeng; Pi, Mengting; Liu, Zhongchi; Kang, Chunying

    2017-09-23

    The genome of the wild diploid strawberry species Fragaria vesca, an ideal model system of cultivated strawberry (Fragaria × ananassa, octoploid) and other Rosaceae family crops, was first published in 2011 and followed by a new assembly (Fvb). However, the annotation for Fvb mainly relied on ab initio predictions and included only predicted coding sequences; therefore, an improved annotation is highly desirable. Here, a new annotation version named v2.0.a2 was created for the Fvb genome by a pipeline utilizing one PacBio library, 90 Illumina RNA-seq libraries, and 9 small RNA-seq libraries. Altogether, 18,641 genes (55.6% out of 33,538 genes) were augmented with information on the 5' and/or 3' UTRs, 13,168 (39.3%) protein-coding genes were modified or newly identified, and 7,370 genes were found to possess alternative isoforms. In addition, 1,938 long non-coding RNAs, 171 miRNAs, and 51,714 small RNA clusters were integrated into the annotation. This new annotation of F. vesca is substantially improved in both accuracy and integrity of gene predictions, beneficial to gene functional studies in strawberry and to the comparative genomic analysis of other horticultural crops in the Rosaceae family. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
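The quoted percentages can be cross-checked arithmetically from the gene counts in the abstract:

```python
# Quick arithmetic check of the annotation statistics quoted above.
total_genes = 33538
utr_augmented = 18641      # genes gaining 5' and/or 3' UTR information
modified_or_new = 13168    # protein-coding genes modified or newly identified

print(round(100 * utr_augmented / total_genes, 1))   # 55.6
print(round(100 * modified_or_new / total_genes, 1)) # 39.3
```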

  4. Integrative structural annotation of de novo RNA-Seq provides an accurate reference gene set of the enormous genome of the onion (Allium cepa L.).

    Science.gov (United States)

    Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil

    2015-02-01

    The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
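ISGAP itself combines reference proteins and ab initio gene models; purely as an illustration of the underlying task of separating coding from non-coding sequence in an assembled transcript, here is a toy longest-ORF finder (an assumption for illustration, not the ISGAP algorithm):

```python
# Minimal longest-ORF finder: a toy stand-in for the kind of coding-region
# extraction ISGAP performs on assembled transcripts (NOT the actual ISGAP
# method, which also leverages reference proteins and ab initio models).
STOPS = {"TAA", "TAG", "TGA"}

def longest_orf(seq):
    best = ""
    for frame in range(3):                     # scan all three reading frames
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        i = 0
        while i < len(codons):
            if codons[i] == "ATG":             # start codon found
                j = i
                while j < len(codons) and codons[j] not in STOPS:
                    j += 1                     # extend until a stop codon
                orf = "".join(codons[i:j])
                if len(orf) > len(best):
                    best = orf
                i = j
            i += 1
    return best

print(longest_orf("CCATGAAATTTTGACC"))  # ATGAAATTT
```

A real pipeline would additionally validate ORFs against reference proteins, which is what lets ISGAP reject spurious short ORFs in non-coding sequence.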

  5. Skaha Lake crossing, innovations in pipeline installation

    International Nuclear Information System (INIS)

    Fernandez, M.L.; Bryce, P.W.; Smith, J.D.

    1995-01-01

    This paper describes the construction of a 10.8 km long NPS16 (406 mm, 16 inch diameter) pipeline across Skaha Lake, in the south Okanagan valley, British Columbia, Canada. The water crossing is part of the 32 km South Okanagan Natural Gas Pipeline Project (SONG) operated by BC Gas. The pipeline is located in a region dependent on year-round tourism; therefore, the design and construction were influenced by sensitive environmental and land use concerns. From earlier studies, BC Gas identified surface tow or lay as the preferred installation methods. The contractor, Fraser River Pile and Dredge, departed from a conventional laybarge methodology after evaluating environmental data and assessing locally available equipment. The contractor proposed a surface tow with multiple surface tie-ins. This approach is a modification of the "Surface Tow and Buoy Release Method" (STBRM) used previously with success on relatively short underwater pipelines. A total of 10 pipe strings, up to 1 km long, were towed into position on the lake and tied-in using a floating platform. The joined pipeline was lowered to the lakebed by divers releasing buoys while tension was maintained from a winch barge at the free end of the pipeline. From analysis and field-verified measurements, the installation stresses were well below the allowable limits during all phases of construction. The entire construction, including mobilization and demobilization, lasted less than three months, and actual pipelaying less than three weeks. Installation was completed within budget and on schedule, without any environmental or safety related incidents. The SONG pipeline became operational in December 1994.

  6. Pipeline risk assessment and control: Nigerian National Petroleum Pipeline network experience

    Energy Technology Data Exchange (ETDEWEB)

    Adubi, F.A.; Egho, P.I. [Pipelines and Products Marketing Company Ltd., Nigerian National Petroleum Corporation, Lagos (Nigeria)

    1992-12-31

    Third party encroachment and corrosion were identified as major causes of pipeline failure in Nigeria. The multi-faceted approach developed by the Nigerian National Petroleum Pipeline Corporation for effective assessment and control of risks to pipelines is described. In essence, information provided by each activity is used to complement information from other activities. This approach led to a better understanding of pipeline status and reduced risk of failures. Aerial surveillance was intensified in order to detect illegal activities, halt them, and remedy any damage. Efforts were also made to intensify corrosion monitoring and pipeline integrity surveys to avoid premature failures. Cathodic protection equipment was found to be only partially effective due to vandalism and uncontrolled bush burning. 10 figs., 1 ref.

  7. Cost-allocation principles for pipeline capacity and usage

    International Nuclear Information System (INIS)

    Salant, D.J.; Watkins, G.C.

    1996-01-01

    The issue of cost sharing among multiple users of transmission facilities, such as pipelines, was discussed. The various ways in which a fair and reasonable pipeline cost-allocation scheme can be implemented were examined. It was suggested that no method exists for allocating costs that will achieve all major policy goals. The advantages and disadvantages of a system of uniform rates, such as postage stamp tolls, were discussed in the context of a natural gas pipeline system. A postage stamp system is one in which all users pay the same amount per unit of capacity, regardless of transport distance. This rate structure, while sometimes appropriate, is inefficient if total costs are distance-sensitive or if there is significant variation in the sources of demand. Two commonly accepted minimal principles that a cost allocation should satisfy are: (1) the stand-alone cost test, and (2) the incremental cost test. This means that no one should pay costs in excess of their stand-alone costs and no one should pay less than their incremental costs. Postage stamp tolls were found to fail this minimal set of commonly applied principles. Better ways to allocate pipeline network costs among users were presented. The nucleolus is a unique cost allocation that is consistent, symmetric and homogeneous. Another effective cost allocation system is the Shapley value, which can be derived from a set of axioms that differ slightly from those that identify the nucleolus. 2 refs., 1 tab
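The Shapley value mentioned above can be computed directly from its marginal-cost definition. A sketch with hypothetical coalition costs (the numbers are invented for illustration, not taken from the paper):

```python
import math
from itertools import permutations

# Shapley cost allocation for a toy three-user pipeline game. cost[S] is
# the stand-alone cost of building capacity for coalition S (hypothetical).
cost = {
    frozenset(): 0,
    frozenset("A"): 60, frozenset("B"): 60, frozenset("C"): 90,
    frozenset("AB"): 90, frozenset("AC"): 120, frozenset("BC"): 120,
    frozenset("ABC"): 140,
}

def shapley(players, cost):
    # Average each user's marginal cost over every possible arrival order.
    shares = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            shares[p] += cost[coalition | {p}] - cost[coalition]
            coalition = coalition | {p}
    n_orders = math.factorial(len(players))
    return {p: v / n_orders for p, v in shares.items()}

shares = shapley(["A", "B", "C"], cost)
print({p: round(v, 2) for p, v in sorted(shares.items())})
# {'A': 36.67, 'B': 36.67, 'C': 66.67} -- the shares sum to the full cost
# of 140, and each lies between that user's incremental and stand-alone cost,
# so both minimal principles from the abstract are satisfied.
```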

  8. A study on an autonomous pipeline maintenance robot, 5

    International Nuclear Information System (INIS)

    Fukuda, Toshio; Hosokai, Hidemi; Otsuka, Masashi.

    1989-01-01

    Path planning is very important for the pipeline maintenance robot because pipelines carry many obstacles, such as flanges and T-joints, and because pipelines are constructed as a connected network in a very complicated way. Furthermore, the maintenance robot Mark III reported previously has the ability to transit from one pipe to another, which the path planner should take into account. The expert system aimed specifically at path planning, named PPES (Path Planning Expert System), is described in this paper. A human operator has only to give tasks to this system; the system automatically replies with the optimal path, based on the calculation of task levels, and a list of control commands. The task level is the criterion used to determine the optimal path. It consists of the difference in potential energies, the static joint torques, the velocity of the robot, and the number of steps of the grippers' or body's movement that the robot requires. The system also provides graphic illustrations, so that the operator can easily check and understand the plant map and the result of the path planning. (author)
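The abstract does not disclose PPES's rules. As a generic illustration of selecting an optimal path over a pipe network with a scalar "task level" cost per traverse, here is a Dijkstra sketch (the graph, node names and weights are invented; PPES itself is a rule-based expert system, not this algorithm):

```python
import heapq

# Minimal Dijkstra search over a toy pipe-network graph. Nodes stand for
# flanges/T-joints, edges for pipe runs, and the hypothetical edge weights
# play the role of the scalar "task level" cost of each traverse.
GRAPH = {
    "A": {"B": 2, "C": 5},
    "B": {"C": 1, "D": 4},
    "C": {"D": 1},
    "D": {},
}

def cheapest_path(graph, start, goal):
    queue = [(0, start, [start])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, w in graph[node].items():
            if nxt not in seen:
                heapq.heappush(queue, (cost + w, nxt, path + [nxt]))
    return None

print(cheapest_path(GRAPH, "A", "D"))  # (4, ['A', 'B', 'C', 'D'])
```

In the paper's setting the weight would be a composite of potential energy difference, joint torque, speed and step count, rather than a single hand-set number.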

  9. Pipeline politics—A study of India's proposed cross border gas projects

    International Nuclear Information System (INIS)

    Nathan, Hippu Salk Kristle; Kulkarni, Sanket Sudhir; Ahuja, Dilip R.

    2013-01-01

    India's energy situation is characterized by increasing energy demand, high fossil fuel dependency, large import shares, and a significant portion of the population deprived of modern energy services. At this juncture, natural gas, being the cleanest fossil fuel with high efficiency and cost effectiveness, is expected to play an important role. India, with only 0.6% of proven world reserves, is not endowed with adequate natural gas domestically. Nevertheless, there are gas reserves in neighbouring regions, which gives rise to the prospects of three cross border gas pipeline projects, namely, Iran–Pakistan–India, Turkmenistan–Afghanistan–Pakistan–India, and Myanmar–Bangladesh–India. This study is a political analysis of these pipeline projects. First, it provides justification for the use of natural gas and the promotion of cross border energy trade. Then it examines these three pipeline projects and analyses the security concerns, the roles of different actors, their positions, shifting goals, and strategies. The study develops scenarios on the basis of changing circumstances and discusses some of the pertinent issues, such as technology options for underground/underwater pipelines and the role of private players. It also explores the impact of India's broader foreign relations and the role of SAARC on the future of the pipelines, and proposes energy induced mutually assured protection (MAP) as a concept for regional security. -- Highlights: •We justify the need for cross border energy trade through gas pipelines for India. •We examine prospective pipeline projects—IPI, TAPI, MBI and their security issues. •We develop scenarios and analyze the role of actors, their positions, and strategies. •We discuss technology and policy options for realizing these gas pipelines. •We propose energy induced mutually assured protection (MAP) for regional security

  10. GPA: a statistical approach to prioritizing GWAS results by integrating pleiotropy and annotation.

    Directory of Open Access Journals (Sweden)

    Dongjun Chung

    2014-11-01

    Full Text Available Results from Genome-Wide Association Studies (GWAS) have shown that complex diseases are often affected by many genetic variants with small or moderate effects. Identification of these risk variants remains a very challenging problem. There is a need to develop more powerful statistical methods to leverage available information to improve upon traditional approaches that focus on a single GWAS dataset without incorporating additional data. In this paper, we propose a novel statistical approach, GPA (Genetic analysis incorporating Pleiotropy and Annotation), to increase statistical power to identify risk variants through joint analysis of multiple GWAS data sets and annotation information because: (1) accumulating evidence suggests that different complex diseases share common risk bases, i.e., pleiotropy; and (2) functionally annotated variants have been consistently demonstrated to be enriched among GWAS hits. GPA can integrate multiple GWAS datasets and functional annotations to seek association signals, and it can also perform hypothesis testing to test the presence of pleiotropy and enrichment of functional annotation. Statistical inference of the model parameters and SNP ranking is achieved through an EM algorithm that can handle genome-wide markers efficiently. When we applied GPA to jointly analyze five psychiatric disorders with annotation information, not only did GPA identify many weak signals missed by the traditional single phenotype analysis, but it also revealed relationships in the genetic architecture of these disorders. Using our hypothesis testing framework, statistically significant pleiotropic effects were detected among these psychiatric disorders, and the markers annotated in the central nervous system genes and eQTLs from the Genotype-Tissue Expression (GTEx) database were significantly enriched. We also applied GPA to a bladder cancer GWAS data set with the ENCODE DNase-seq data from 125 cell lines. GPA was able to detect cell
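The abstract does not give GPA's model equations. Purely as a generic illustration of the kind of EM-based latent-class inference it describes, here is a toy two-component mixture fit on simulated association z-scores (the model, data and parameters are assumptions, not GPA itself):

```python
import random, math

# Toy EM: z-scores modelled as a mixture of a null N(0,1) component and an
# "associated" N(mu,1) component; EM estimates the mixing weight pi1 and mu.
random.seed(0)
z = [random.gauss(0, 1) for _ in range(800)] + \
    [random.gauss(3, 1) for _ in range(200)]

def norm_pdf(x, mu):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

pi1, mu = 0.5, 1.0                      # initial guesses
for _ in range(50):                     # EM iterations
    # E-step: posterior probability each marker is from the associated class
    post = [pi1 * norm_pdf(x, mu) /
            (pi1 * norm_pdf(x, mu) + (1 - pi1) * norm_pdf(x, 0.0))
            for x in z]
    # M-step: update mixing weight and associated-class mean
    s = sum(post)
    pi1 = s / len(z)
    mu = sum(p * x for p, x in zip(post, z)) / s

print(round(pi1, 2), round(mu, 2))  # roughly 0.2 and 3.0 for this simulation
```

GPA extends this idea to multiple GWAS datasets jointly, with annotation-dependent mixture weights, which is what allows the pleiotropy and enrichment hypothesis tests.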

  11. A Flexible Object-of-Interest Annotation Framework for Online Video Portals

    Directory of Open Access Journals (Sweden)

    Robert Sorschag

    2012-02-01

    Full Text Available In this work, we address the use of object recognition techniques to annotate what is shown where in online video collections. These annotations are suitable to retrieve specific video scenes for object-related text queries, which is not possible with the manually generated metadata used by current portals. We are not the first to present object annotations that are generated with content-based analysis methods. However, the proposed framework possesses some outstanding features that offer good prospects for its application in real video portals. Firstly, it can be easily used as a background module in any video environment. Secondly, it is not based on a fixed analysis chain but on an extensive recognition infrastructure that can be used with all kinds of visual features, matching and machine learning techniques. New recognition approaches can be integrated into this infrastructure with low development costs, and the recognition approaches used can even be reconfigured on a running system. Thus, this framework might also benefit from future advances in computer vision. Thirdly, we present an automatic selection approach to support the use of different recognition strategies for different objects. Last but not least, visual analysis can be performed efficiently on distributed, multi-processor environments, and a database schema is presented to store the resulting video annotations as well as the off-line generated low-level features in a compact form. We achieve promising results in an annotation case study and the instance search task of the TRECVID 2011 challenge.

  12. Pipeline four-dimension management is the trend of pipeline integrity management in the future

    Energy Technology Data Exchange (ETDEWEB)

    Shaohua, Dong; Feifan; Zhongchen, Han [China National Petroleum Corporation (CNPC), Beijing (China)

    2009-07-01

    Pipeline integrity management (PIM) is essential for today's operators to run their pipelines safely and cost-effectively. The latest developments in pipeline integrity management around the world involve changes in regulation and industry standards as well as innovation in technology. What is the future trend of PIM? This question is answered in the paper. As a result, the concept of Pipeline 4-Dimension Management (P4DM) is introduced here for the first time. The paper analyzes pipeline HSE management, pipeline integrity management (PIM) and asset integrity management (AIM), identifies the management problems involved, and puts forward the theory of P4DM. From the hierarchy of P4DM, the management elements, fields, space and time are analyzed. The main idea is that P4DM integrates geographic location and time to control and manage the pipeline system throughout the whole process, anywhere and anytime. It covers pipeline integrity, pipeline operation and emergency response, integrated through an IT system, so that ideas, solutions, technology, organization and managers jointly and intelligently control the management process. The paper addresses the definition of pipeline 4D management, the research and development of P4DM, the theory of P4DM, the relationship between P4DM and PIM, the technological basis of P4DM, how to implement P4DM, and conclusions. P4DM provides a development direction for PIM in the future, as well as new ideas for PetroChina in the fields of technology and management. (author)

  13. Development of a design methodology for hydraulic pipelines carrying rectangular capsules

    International Nuclear Information System (INIS)

    Asim, Taimoor; Mishra, Rakesh; Abushaala, Sufyan; Jain, Anuj

    2016-01-01

    The scarcity of fossil fuels is affecting the efficiency of established modes of cargo transport within the transportation industry. Efforts have been made to develop innovative modes of transport that can be adopted for economical and environmentally friendly operating systems. Solid material, for instance, can be packed in rectangular containers (commonly known as capsules), which can then be transported in different concentrations very effectively using the fluid energy in pipelines. For an economical and efficient design of such systems, both the local flow characteristics and the global performance parameters need to be carefully investigated. Published literature is severely limited in establishing the effects of local flow features on system characteristics of Hydraulic Capsule Pipelines (HCPs). The present study focuses on using a well-validated Computational Fluid Dynamics (CFD) tool to numerically simulate the solid-liquid mixture flow in both on-shore and off-shore HCP applications, including bends. Discrete Phase Modelling (DPM) has been employed to calculate the velocity of the rectangular capsules. Numerical predictions have been used to develop novel semi-empirical prediction models for pressure drop in HCPs, which have then been embedded into a robust and user-friendly pipeline optimisation methodology based on the Least-Cost Principle. - Highlights: • Local flow characteristics in a pipeline transporting rectangular capsules. • Development of prediction models for the pressure drop contribution of capsules. • Methodology developed for sizing of Hydraulic Capsule Pipelines. • Implementation of the developed methodology to obtain optimal pipeline diameter.
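The Least-Cost Principle behind the sizing methodology can be sketched with a toy trade-off: capital cost rises with pipe diameter while pumping cost falls with it, so total cost has an interior minimum. All coefficients below are hypothetical, and a plain Darcy-Weisbach-style pressure drop stands in for the paper's CFD-derived capsule-flow correlations:

```python
# Toy least-cost pipeline sizing (all coefficients invented for illustration).
def annual_cost(d, length_m=1000.0, flow_m3s=0.05):
    capital = 30.0 * d ** 1.5 * length_m                 # amortised pipe cost, $/yr
    v = flow_m3s / (3.14159 / 4.0 * d ** 2)              # mean velocity, m/s
    dp = 0.02 * (length_m / d) * (500.0 / 2.0) * v ** 2  # Darcy-Weisbach-like drop, Pa
    kwh = dp * flow_m3s * 8760.0 / 1000.0                # annual pump energy, kWh
    return capital + 0.10 * kwh / 0.7                    # + energy at $0.10/kWh, 70% eff.

candidates = [0.15, 0.20, 0.25, 0.30, 0.35, 0.40]        # diameters, m
best = min(candidates, key=annual_cost)
print(best)  # 0.25 for these coefficients
```

Because pumping cost scales roughly with d^-5 and capital with d^1.5, the optimum is sensitive to the pressure-drop model, which is why the paper invests in capsule-specific semi-empirical correlations.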

  14. Latvijas gaze. Inspection of pipeline. Final report. Cathodic protection, phase 2

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-10-01

    The purpose of this project was an intensive field survey of Latvian gas pipelines with the purpose of locating areas with coating defects as the pipe surface could have been and might still be exposed to corrosion from stray currents originating from a nearby DC-electrified railway and/or from lack of effective cathodic protection. The field survey had to be started as early as possible - before the oncoming winter. A second aim of the activities was know-how transfer combined with the delivery of a cathodic protection instrument package. This report also describes the outcome of an intensive survey, measurements of influence from the nearby DC-electrified railway, insulating flange measurements, soil resistivity measurements, anode bed resistance measurements and spot-checks of pipe potentials at different measuring posts, all carried out as part of the training. The scope of work has been: to examine the Sloka branch pipeline history and documentation, to locate areas of corrosion concern in the northern end of the pipeline, to provide Latvijas Gaze staff with cathodic protection measuring instruments and knowledge of how to use the instruments, to investigate the interaction from the nearby DC-electrified railway on the pipeline, as severe corrosion attacks were found on the pipeline, to state the condition of the cathodic protection and to give an explanation of the causes of the corrosion attacks on the pipeline, to give a recommendation on how to achieve effective cathodic protection. (EG)

  15. Neuroimaging study designs, computational analyses and data provenance using the LONI pipeline.

    Directory of Open Access Journals (Sweden)

    Ivo Dinov

    2010-09-01

    Full Text Available Modern computational neuroscience employs diverse software tools and multidisciplinary expertise to analyze heterogeneous brain data. The classical problems of gathering meaningful data, fitting specific models, and discovering appropriate analysis and visualization tools give way to a new class of computational challenges--management of large and incongruous data, integration and interoperability of computational resources, and data provenance. We designed, implemented and validated a new paradigm for addressing these challenges in the neuroimaging field. Our solution is based on the LONI Pipeline environment [3], [4], a graphical workflow environment for constructing and executing complex data processing protocols. We developed study-design, database and visual language programming functionalities within the LONI Pipeline that enable the construction of complete, elaborate and robust graphical workflows for analyzing neuroimaging and other data. These workflows facilitate open sharing and communication of data and metadata, concrete processing protocols, result validation, and study replication among different investigators and research groups. The LONI Pipeline features include distributed grid-enabled infrastructure, virtualized execution environment, efficient integration, data provenance, validation and distribution of new computational tools, automated data format conversion, and an intuitive graphical user interface. We demonstrate the new LONI Pipeline features using large scale neuroimaging studies based on data from the International Consortium for Brain Mapping [5] and the Alzheimer's Disease Neuroimaging Initiative [6]. User guides, forums, instructions and downloads of the LONI Pipeline environment are available at http://pipeline.loni.ucla.edu.

  16. Neuroimaging study designs, computational analyses and data provenance using the LONI pipeline.

    Science.gov (United States)

    Dinov, Ivo; Lozev, Kamen; Petrosyan, Petros; Liu, Zhizhong; Eggert, Paul; Pierce, Jonathan; Zamanyan, Alen; Chakrapani, Shruthi; Van Horn, John; Parker, D Stott; Magsipoc, Rico; Leung, Kelvin; Gutman, Boris; Woods, Roger; Toga, Arthur

    2010-09-28

    Modern computational neuroscience employs diverse software tools and multidisciplinary expertise to analyze heterogeneous brain data. The classical problems of gathering meaningful data, fitting specific models, and discovering appropriate analysis and visualization tools give way to a new class of computational challenges--management of large and incongruous data, integration and interoperability of computational resources, and data provenance. We designed, implemented and validated a new paradigm for addressing these challenges in the neuroimaging field. Our solution is based on the LONI Pipeline environment [3], [4], a graphical workflow environment for constructing and executing complex data processing protocols. We developed study-design, database and visual language programming functionalities within the LONI Pipeline that enable the construction of complete, elaborate and robust graphical workflows for analyzing neuroimaging and other data. These workflows facilitate open sharing and communication of data and metadata, concrete processing protocols, result validation, and study replication among different investigators and research groups. The LONI Pipeline features include distributed grid-enabled infrastructure, virtualized execution environment, efficient integration, data provenance, validation and distribution of new computational tools, automated data format conversion, and an intuitive graphical user interface. We demonstrate the new LONI Pipeline features using large scale neuroimaging studies based on data from the International Consortium for Brain Mapping [5] and the Alzheimer's Disease Neuroimaging Initiative [6]. User guides, forums, instructions and downloads of the LONI Pipeline environment are available at http://pipeline.loni.ucla.edu.

  17. Simulations of severe slugging during depressurization of an oil/gas pipeline

    Directory of Open Access Journals (Sweden)

    M. Nordsveen

    1997-01-01

    Dynamic simulators for pipelines with multiphase flow have proved to be important computational tools for both design and operational support of oil and gas production systems. One important aim of such simulators is to predict the arrival time and magnitude of outlet liquid transients after production changes made by an operator of a pipeline. A multiphase flow simulator (OLGA-94.1) with a two-fluid model has been applied to simulate depressurization of a pipeline during a shutdown procedure. During depressurization, liquid slugs may form and propagate towards the outlet. The importance of the numerical method for predicting such transients is demonstrated by using an Eulerian, finite difference, implicit, upwind scheme both with and without a front tracking scheme. First, the initial conditions for the depressurization are established from a shut-down simulation in which the production at the inlet is closed down and the liquid comes to rest at low points along the pipeline. A realistic depressurization is then simulated by opening a choke at the outlet of the pressurized pipeline. The numerical scheme without front tracking (the standard scheme) gives outlet gas and liquid flow rates that are smeared out in time due to numerical diffusion. Simulations with the front tracking scheme give intermittent gas-liquid flow arriving as sharp fronts at the outlet. The total fluid remaining in the pipeline after the depressurization is larger when the standard scheme is used.
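
    The numerical diffusion that smears the standard scheme's output can be illustrated with a toy calculation. The sketch below is a hypothetical first-order upwind discretization of 1D advection of a sharp liquid-fraction front (explicit rather than implicit, and all grid parameters are invented); it is not the OLGA model, but it shows how an upwind scheme without front tracking spreads a front that should stay sharp.

```python
import numpy as np

# Hypothetical 1D advection of a sharp liquid-fraction front at speed u > 0.
# First-order upwind (the "standard scheme") smears the front through
# numerical diffusion; the exact solution merely translates it intact.
nx, nt = 200, 100
dx, dt, u = 1.0, 0.4, 1.0                        # CFL = u*dt/dx = 0.4 < 1, stable
alpha = np.where(np.arange(nx) < 50, 1.0, 0.0)   # sharp front at cell 50

for _ in range(nt):
    # explicit first-order upwind: a_i <- a_i - CFL * (a_i - a_{i-1})
    alpha[1:] -= u * dt / dx * (alpha[1:] - alpha[:-1])

# Exact front position after nt steps: cell 50 + u*nt*dt/dx = cell 90.
# The numerical front is instead spread over many cells:
spread = int(np.sum((alpha > 0.01) & (alpha < 0.99)))
print("cells in smeared front:", spread)
```

A front-tracking scheme avoids this by advecting the discontinuity itself rather than resolving it on the fixed grid, which is why the front-tracking runs in the abstract deliver sharp fronts at the outlet.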

  18. Magnetic Flux Leakage and Principal Component Analysis for metal loss approximation in a pipeline

    International Nuclear Information System (INIS)

    Ruiz, M; Mujica, L E; Quintero, M; Florez, J; Quintero, S

    2015-01-01

    Safety and reliability of hydrocarbon transportation pipelines are critical for the oil and gas industry. Pipeline failures caused by corrosion, external agents, and other factors can develop into leaks or even ruptures, which can negatively affect the population, the natural environment, infrastructure, and the economy. It is therefore imperative to have accurate inspection tools traveling through the pipeline to diagnose its integrity. Over the last few years, different techniques under the concept of structural health monitoring (SHM) have been in continuous development. This work is based on a hybrid methodology that combines the Magnetic Flux Leakage (MFL) and Principal Component Analysis (PCA) approaches. The MFL technique induces a magnetic field in the pipeline's walls, and sensors record the leakage magnetic field in segments with metal loss caused by cracking, corrosion, and similar defects. The data come from a gas pipeline with approximately 15 years of operation, a diameter of 20 inches, and a total length of 110 km (with several changes in topography). PCA, in turn, is a well-known technique that compresses data and extracts the most relevant information, facilitating the detection of damage in a variety of structures. The goal of this work is to detect and localize critical metal loss in a pipeline that is currently in operation.
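
    The PCA step can be sketched in a few lines. The example below is a generic, hypothetical PCA anomaly detector of the kind used in SHM: a low-dimensional model is fitted to baseline readings and the squared reconstruction residual (the Q-statistic) flags segments the model cannot explain. All dimensions and synthetic signals are illustrative assumptions, not values from the actual survey.

```python
import numpy as np

# Synthetic MFL-style data: rows are pipe segments, columns are sensors.
rng = np.random.default_rng(42)
n_segments, n_sensors = 500, 64
X_train = rng.normal(size=(n_segments, n_sensors))   # healthy baseline readings
mu = X_train.mean(axis=0)

# PCA via SVD of the mean-centered baseline; keep k principal components.
_, _, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
k = 5
P = Vt[:k].T                                         # (n_sensors, k) loading matrix

def q_statistic(x):
    """Squared reconstruction residual: a high Q flags an anomalous segment."""
    xc = x - mu
    r = xc - P @ (P.T @ xc)
    return float(r @ r)

x_ok = rng.normal(size=n_sensors)                    # healthy test segment
x_bad = rng.normal(size=n_sensors)
x_bad[20:30] += 5.0                                  # localized leakage-field anomaly

q_ok, q_bad = q_statistic(x_ok), q_statistic(x_bad)
print(f"Q healthy = {q_ok:.1f}, Q damaged = {q_bad:.1f}")
```

Fitting the model on healthy data only, then scoring new segments, is the standard SHM arrangement: the damaged segment's residual is far larger because the localized anomaly lies outside the subspace spanned by the baseline components.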

  19. Annotating Mutational Effects on Proteins and Protein Interactions: Designing Novel and Revisiting Existing Protocols.

    Science.gov (United States)

    Li, Minghui; Goncearenco, Alexander; Panchenko, Anna R

    2017-01-01

    In this review we describe a protocol for annotating the effects of missense mutations on proteins, their functions, stability, and binding. For this purpose we present a collection of the most comprehensive databases that store different types of sequencing data on missense mutations, and we discuss their relationships, possible intersections, and unique features. Next, we suggest an annotation workflow using state-of-the-art methods and highlight their usability, advantages, and limitations for different cases. Finally, we address the particularly difficult problem of deciphering the molecular mechanisms by which mutations act on proteins and protein complexes, in order to understand the origins and mechanisms of diseases.

  20. Underwater pipeline impact localization using piezoceramic transducers

    Science.gov (United States)

    Zhu, Junxiao; Ho, Siu Chun Michael; Patil, Devendra; Wang, Ning; Hirsch, Rachel; Song, Gangbing

    2017-10-01

    Reports indicate that impact events account for 47% of offshore pipeline failures, which calls for impact detection and localization for subsea pipelines. This paper describes an innovative method for rapid localization of impacts on underwater pipelines that uses a novel technique for determining both the arrival time and group velocity (ATGV) of ultrasonic guided waves measured with lead zirconate titanate (PZT) transducers. PZT transducers mounted on the outer surface of a model pipeline measured the ultrasonic guided waves generated by impact events. Based on the signals from the PZT sensors, the ATGV technique integrates wavelet decomposition, the Hilbert transform, and statistical analysis to pinpoint the arrival time of the designated ultrasonic guided waves with a specific group velocity. Experimental results verified the effectiveness and localization accuracy for eight impact points along a model underwater pipeline. All estimation errors were small and comparable to the wavelength of the designated ultrasonic guided waves. Furthermore, the method is robust against the low-frequency structural vibration introduced by other external forces.
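
    The core geometric idea can be sketched with synthetic signals. The example below is a simplified, hypothetical version of the approach: two sensors bracket the impact, each arrival time is picked from the Hilbert envelope of its signal, and the arrival-time difference together with an assumed group velocity gives the impact position. All numbers (sampling rate, group velocity, geometry, burst shape) are invented for illustration; the actual ATGV technique additionally uses wavelet decomposition and statistical analysis.

```python
import numpy as np

fs = 100_000          # sampling rate, Hz (assumed)
c = 3000.0            # assumed group velocity of the guided wave, m/s
L = 12.0              # spacing between sensors A and B along the pipe, m
d_true = 4.0          # impact location measured from sensor A, m

def burst(arrival_s, n=2048):
    """Decaying sine burst beginning at arrival_s seconds (synthetic impact wave)."""
    t = np.arange(n) / fs
    x = np.zeros(n)
    on = t >= arrival_s
    x[on] = np.exp(-(t[on] - arrival_s) / 2e-3) * np.sin(2 * np.pi * 20e3 * (t[on] - arrival_s))
    return x

def envelope(x):
    """Hilbert envelope (analytic-signal magnitude) computed via the FFT."""
    n = len(x)
    H = np.zeros(n)
    H[0] = 1.0
    H[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        H[n // 2] = 1.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * H))

def arrival_index(x, frac=0.2):
    """First sample where the envelope exceeds a fraction of its maximum."""
    env = envelope(x)
    return int(np.argmax(env >= frac * env.max()))

xa = burst(d_true / c)              # wave reaches sensor A first
xb = burst((L - d_true) / c)        # then sensor B
dt = (arrival_index(xa) - arrival_index(xb)) / fs
d_est = (L + c * dt) / 2.0          # from t_A - t_B = (2*d - L) / c
print(f"estimated impact position: {d_est:.2f} m")
```

The localization formula follows directly from t_A = d/c and t_B = (L - d)/c, and the resolution is set by how precisely the arrival times can be picked, which is consistent with the abstract's observation that errors were comparable to the wavelength.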