WorldWideScience

Sample records for automated genome mining

  1. Automated genome mining for natural products

    Directory of Open Access Journals (Sweden)

    Zajkowski James

    2009-06-01

    Full Text Available Abstract Background Discovery of new medicinal agents from natural sources has largely been an adventitious process based on screening of plant and microbial extracts combined with bioassay-guided identification and natural product structure elucidation. Increasingly rapid and more cost-effective genome sequencing technologies coupled with advanced computational power have converged to transform this trend toward a more rational and predictive pursuit. Results We have developed a rapid method of scanning genome sequences for multiple polyketide, nonribosomal peptide, and mixed combination natural products with output in a text format that can be readily converted to two and three dimensional structures using conventional software. Our open-source and web-based program can assemble various small molecules composed of twenty standard amino acids and twenty two other chain-elongation intermediates used in nonribosomal peptide systems, and four acyl-CoA extender units incorporated into polyketides by reading a hidden Markov model of DNA. This process evaluates and selects the substrate specificities along the assembly line of nonribosomal synthetases and modular polyketide synthases. Conclusion Using this approach we have predicted the structures of natural products from a diverse range of bacteria based on a limited number of signature sequences. In accelerating direct DNA to metabolomic analysis, this method bridges the interface between chemists and biologists and enables rapid scanning for compounds with potential therapeutic value.
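    To make the assembly-line logic of the record above concrete, the following sketch shows only the final step, mapping per-module substrate predictions (as would be obtained from profile-HMM scans of adenylation and acyltransferase domains) to a linear monomer string; the signature-to-monomer table and module signatures are illustrative placeholders, not the published specificity codes of the tool.

```python
# Minimal sketch: turn per-module substrate predictions into the linear
# text form described in the abstract. The signature-to-monomer table is
# illustrative, not the tool's published specificity code.

SIGNATURE_TO_MONOMER = {
    "DAWTIAAIC": "Phe",              # hypothetical A-domain signature
    "DLTKVGHIG": "Ser",              # hypothetical A-domain signature
    "HAFH": "methylmalonyl-CoA",     # hypothetical AT-domain motif
    "HAFG": "malonyl-CoA",           # hypothetical AT-domain motif
}

def predict_monomers(module_signatures):
    """Map extracted domain signatures (e.g. from profile-HMM hits) to
    their most likely substrates; unknown signatures become 'X'."""
    return [SIGNATURE_TO_MONOMER.get(sig, "X") for sig in module_signatures]

def assembly_line_product(module_signatures):
    """Join per-module substrates into a linear text string that downstream
    chemistry software can convert to a 2D/3D structure."""
    return "-".join(predict_monomers(module_signatures))

if __name__ == "__main__":
    modules = ["DAWTIAAIC", "HAFG", "DLTKVGHIG"]     # order along the assembly line
    print(assembly_line_product(modules))            # Phe-malonyl-CoA-Ser
```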

  2. Analyzing and mining automated imaging experiments.

    Science.gov (United States)

    Berlage, Thomas

    2007-04-01

    Image mining is the application of computer-based techniques that extract and exploit information from large image sets to support human users in generating knowledge from these sources. This review focuses on biomedical applications of this technique, in particular automated imaging at the cellular level. Due to increasing automation and the availability of integrated instruments, biomedical users are becoming increasingly confronted with the problem of analyzing such data. Image database applications need to combine data management, image analysis and visual data mining. The main point of such a system is a software layer that represents objects within an image and the ability to use a large spectrum of quantitative and symbolic object features. Image analysis needs to be adapted to each particular experiment; therefore, 'end user programming' will be desired to make the technology more widely applicable.

  3. Sensing for advancing mining automation capability:A review of underground automation technology development

    Institute of Scientific and Technical Information of China (English)

    Ralston Jonathon; Reid David; Hargrave Chad; Hainsworth David

    2014-01-01

    This paper highlights the role of automation technologies for improving the safety, productivity, and environmental sustainability of underground coal mining processes. This is accomplished by reviewing the impact that the introduction of automation technology has made through the longwall shearer automation research program of the Longwall Automation Steering Committee (LASC). This result has been achieved through close integration of sensing, processing, and control technologies into the longwall mining process. Key to the success of the automation solution has been the development of new sensing methods to accurately measure the location of longwall equipment and the spatial configuration of coal seam geology. The relevance of system interoperability and open communications standards for facilitating effective automation is also discussed. Importantly, the insights gained through the longwall automation development process are now leading to new technology transfer activity to benefit other underground mining processes.

  4. Text mining from ontology learning to automated text processing applications

    CERN Document Server

    Biemann, Chris

    2014-01-01

    This book comprises a set of articles that specify the methodology of text mining, describe the creation of lexical resources in the framework of text mining and use text mining for various tasks in natural language processing (NLP). The analysis of large amounts of textual data is a prerequisite to build lexical resources such as dictionaries and ontologies and also has direct applications in automated text processing in fields such as history, healthcare and mobile applications, just to name a few. This volume gives an update in terms of the recent gains in text mining methods and reflects

  5. Present situation and developing trend of coal mine automation and communication technology

    Institute of Scientific and Technical Information of China (English)

    HU Sui-yan

    2008-01-01

    Introduces the development process of coal mine automation and communication technology, analyzes the present features and characteristics of coal mine automation and communication technology, and puts forward a few key technical problems that need to be solved.

  6. DGIdb - Mining the druggable genome

    Science.gov (United States)

    Coffman, Adam C.; Weible, James V.; McMichael, Josh F.; Spies, Nicholas C.; Koval, James; Das, Indraniel; Callaway, Matthew B.; Eldred, James M.; Miller, Christopher A.; Subramanian, Janakiraman; Govindan, Ramaswamy; Kumar, Runjun D.; Bose, Ron; Ding, Li; Walker, Jason R.; Larson, David E.; Dooling, David J.; Smith, Scott M.; Ley, Timothy J.; Mardis, Elaine R.; Wilson, Richard K.

    2013-01-01

    The Drug-Gene Interaction database (DGIdb) mines existing resources that generate hypotheses about how mutated genes might be targeted therapeutically or prioritized for drug development. It provides an interface for searching lists of genes against a compendium of drug-gene interactions and potentially druggable genes. DGIdb can be accessed at dgidb.org. PMID:24122041
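    DGIdb can also be queried programmatically. The sketch below shows the general pattern for searching a gene list against the interaction compendium over HTTP; the endpoint URL and response fields used here are assumptions for illustration and should be checked against the current API documentation at dgidb.org.

```python
# Hedged sketch: query a list of genes against DGIdb over HTTP.
# ENDPOINT and the response fields accessed below are assumptions;
# consult dgidb.org for the currently supported API.
import requests

ENDPOINT = "https://dgidb.org/api/v2/interactions.json"   # assumed endpoint

def drug_gene_interactions(genes):
    resp = requests.get(ENDPOINT, params={"genes": ",".join(genes)}, timeout=30)
    resp.raise_for_status()
    data = resp.json()
    # Assumed layout: one record per matched gene, each with its interactions.
    for record in data.get("matchedTerms", []):
        gene = record.get("geneName")
        for interaction in record.get("interactions", []):
            yield gene, interaction.get("drugName"), interaction.get("interactionTypes")

if __name__ == "__main__":
    for gene, drug, types in drug_gene_interactions(["BRAF", "ERBB2"]):
        print(gene, drug, types)
```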

  7. DGIdb - Mining the druggable genome

    OpenAIRE

    Griffith, Malachi; Griffith, Obi L; Coffman, Adam C.; Weible, James V.; McMichael, Josh F; Nicholas C Spies; Koval, James; Das, Indraniel; Callaway, Matthew B.; Eldred, James M.; Miller, Christopher A.; Subramanian, Janakiraman; Govindan, Ramaswamy; Runjun D Kumar; Bose, Ron

    2013-01-01

    The Drug-Gene Interaction database (DGIdb) mines existing resources that generate hypotheses about how mutated genes might be targeted therapeutically or prioritized for drug development. It provides an interface for searching lists of genes against a compendium of drug-gene interactions and potentially druggable genes. DGIdb can be accessed at dgidb.org.

  8. Restriction enzyme mining for SNPs in genomes.

    Science.gov (United States)

    Chuang, Li-Yeh; Yang, Cheng-Hong; Tsui, Ke-Hung; Cheng, Yu-Huei; Chang, Phei-Lang; Wen, Cheng-Hao; Chang, Hsueh-Wei

    2008-01-01

    Many different methods for genotyping single nucleotide polymorphisms (SNPs) have been developed recently; however, most of them are expensive. Using restriction enzymes for SNP genotyping is a cost-effective alternative, but mining a genome sequence for suitable restriction enzymes is still challenging for researchers who do not have a background in genomics and bioinformatics. In this review, the basic bioinformatics tools used for restriction enzyme mining for SNP genotyping are summarized and described. The objectives of this paper include: i) the introduction of SNPs, genotyping and PCR-restriction fragment length polymorphism (RFLP); ii) a review of components for genotyping software, including tools for primer design only or restriction enzyme mining only; iii) a review of software providing the flanking sequence for primer design; iv) recent advances in PCR-RFLP tools and natural and mutagenic PCR-RFLP; v) highlighting the strategy for restriction enzyme mining for SNP genotyping; vi) a discussion of potential problems for multiple PCR-RFLP. The different implications for restriction enzymes on sense and antisense strands are also discussed. Our PCR-RFLP freeware, SNP-RFLPing, is included in this review to illustrate many characteristics of PCR-RFLP software design. Future developments will include further sophistication of PCR-RFLP software in order to provide better visualization and a more interactive environment for SNP genotyping and to integrate the software with other tools used in association studies.
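    The central step of PCR-RFLP design, finding an enzyme whose recognition site spans the SNP in exactly one allele, can be sketched in a few lines; the enzyme table below is a tiny illustrative subset rather than the full database used by SNP-RFLPing.

```python
# Sketch of restriction-enzyme mining for a SNP given its flanking sequence.
# An enzyme is a candidate if its recognition site overlapping the SNP
# position is present in exactly one of the two alleles (allele-specific cut).
ENZYMES = {          # illustrative subset; real tools use full REBASE data
    "EcoRI": "GAATTC",
    "TaqI":  "TCGA",
    "HhaI":  "GCGC",
}

def sites(seq, motif):
    """Return start positions of every occurrence of motif in seq."""
    return {i for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif}

def candidate_enzymes(flank5, alleles, flank3):
    """Find enzymes that cut across the SNP in one allele only."""
    snp_pos = len(flank5)
    for name, motif in ENZYMES.items():
        cut = []
        for allele in alleles:
            seq = flank5 + allele + flank3
            # keep only recognition sites that actually span the SNP base
            spanning = {s for s in sites(seq, motif)
                        if s <= snp_pos < s + len(motif)}
            cut.append(bool(spanning))
        if cut[0] != cut[1]:
            yield name, motif

if __name__ == "__main__":
    # SNP written as [C/T]: ...CCGAATT[C/T]GGCATT...
    for name, motif in candidate_enzymes("CCGAATT", ("C", "T"), "GGCATT"):
        print(name, motif)   # EcoRI cuts the C allele only
```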

  9. Automated design of genomic Southern blot probes

    Directory of Open Access Journals (Sweden)

    Komiyama Noboru H

    2010-01-01

    experimentally validate a number of these automated designs by Southern blotting. The majority of probes we tested performed well confirming our in silico prediction methodology and the general usefulness of the software for automated genomic Southern probe design. Conclusions Software and supplementary information are freely available at: http://www.genes2cognition.org/software/southern_blot

  10. Pneumatic automation systems in coal mines

    Energy Technology Data Exchange (ETDEWEB)

    Shmatkov, N.A.; Kiklevich, Yu.N.

    1981-04-01

    Giprougleavtomatizatsiya, Avtomatgormash, Dongiprouglemash, VNIIGD and other plants develop 30 new pneumatic systems for mine machines and equipment control each year. The plants produce about 200 types of pneumatic systems. Major pneumatic systems for face systems, machines and equipment are reviewed: Sirena system for remote control of ANShch and AShchM face systems for steep coal seams, UPS control systems for pump stations, PAUZA control system for stowing machines, remote control system of B100-200 drilling machines, PUSK control system for coal cutter loaders with pneumatic drive (A-70, Temp), PUVSh control system for ventilation barriers activated from moving electric locomotives, PAZ control system for skip hoist loading. Specifications of the systems are given. Economic benefits produced by the pneumatic control systems are evaluated (from 1,500 to 40,000 rubles/year). Using the systems increases productivity of face machines and other machines used in black coal mines by 5 to 30%.

  11. An automated Genomes-to-Natural Products platform (GNP) for the discovery of modular natural products.

    Science.gov (United States)

    Johnston, Chad W; Skinnider, Michael A; Wyatt, Morgan A; Li, Xiang; Ranieri, Michael R M; Yang, Lian; Zechel, David L; Ma, Bin; Magarvey, Nathan A

    2015-09-28

    Bacterial natural products are a diverse and valuable group of small molecules, and genome sequencing indicates that the vast majority remain undiscovered. The prediction of natural product structures from biosynthetic assembly lines can facilitate their discovery, but highly automated, accurate, and integrated systems are required to mine the broad spectrum of sequenced bacterial genomes. Here we present a genome-guided natural products discovery tool to automatically predict, combinatorialize and identify polyketides and nonribosomal peptides from biosynthetic assembly lines using LC-MS/MS data of crude extracts in a high-throughput manner. We detail the directed identification and isolation of six genetically predicted polyketides and nonribosomal peptides using our Genome-to-Natural Products platform. This highly automated, user-friendly programme provides a means of realizing the potential of genetically encoded natural products.

  12. Automation and remote control at Mining 85 in Birmingham

    Energy Technology Data Exchange (ETDEWEB)

    Czauderna, N.

    1985-09-12

    Looking round the exhibition showed that more and more manufacturers and users take advantage of the new measuring, control and regulatory techniques with economic profit. Process computers and micro-computers make a general use of these techniques possible. The backbone of all automation systems is data transmission between widely distributed stations on individual machines. Here, too, it was clear that in mining multiple data transmission on the basis of highly integrated components and own microprocessors is occupying an ever more prominent place in communication techniques. BUS systems are more and more in evidence and already installed in some plants. For the purpose of this report and ease of reference to the large field of automation and remote control at Mining 85, the products have been grouped under sensor, transmission and automation technologies. (orig./MOS).

  13. Genome Mining for antibiotics biosynthesis pathways with antiSMASH 3

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk; Blin, Kai

    2014-01-01

    Microorganisms are the most important source of natural products with antimicrobial or antitumor activity. These natural products are the main source for anti-infectives; 80% of antibiotics currently in medical use are derived from this class of compounds. In the past, functional screenings ... the biological sources for novel drug candidates. For high-throughput genome mining, sophisticated software is required, which allows the prediction of putative biosynthetic products based on genomic data. Here, we present the new version 3 of the software antiSMASH (http://antismash.secondarymetabolites.org). antiSMASH 3 currently is the most comprehensive automated genome mining platform for natural product biosynthetic pathways. It automatically screens genomic data of bacteria and fungi for the presence of 24 different types of secondary metabolite biosynthetic pathways. For different classes of secondary...

  14. Comparative genomics using data mining tools

    Indian Academy of Sciences (India)

    Tannistha Nandi; Chandrika B-Rao; Srinivasan Ramachandran

    2002-02-01

    We have analysed the genomes of representatives of three kingdoms of life, namely, archaea, eubacteria and eukaryota using data mining tools based on compositional analyses of the protein sequences. The representatives chosen in this analysis were Methanococcus jannaschii, Haemophilus influenzae and Saccharomyces cerevisiae. We have identified the common and different features between the three genomes in the protein evolution patterns. M. jannaschii has been seen to have a greater number of proteins with more charged amino acids whereas S. cerevisiae has been observed to have a greater number of hydrophilic proteins. Despite the differences in intrinsic compositional characteristics between the proteins from the different genomes we have also identified certain common characteristics. We have carried out exploratory Principal Component Analysis of the multivariate data on the proteins of each organism in an effort to classify the proteins into clusters. Interestingly, we found that most of the proteins in each organism cluster closely together, but there are a few ‘outliers’. We focus on the outliers for the functional investigations, which may aid in revealing any unique features of the biology of the respective organisms.
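    A minimal sketch of the compositional approach described above, computing amino-acid composition vectors for a few toy sequences and projecting them with Principal Component Analysis; in the study this is applied proteome-wide and the outliers are then examined for functional clues.

```python
# Sketch: amino-acid composition vectors + Principal Component Analysis,
# the kind of exploratory analysis described in the abstract.
import numpy as np
from sklearn.decomposition import PCA

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Fraction of each of the 20 standard amino acids in a protein."""
    seq = seq.upper()
    counts = np.array([seq.count(aa) for aa in AMINO_ACIDS], dtype=float)
    return counts / max(len(seq), 1)

if __name__ == "__main__":
    proteins = {            # toy sequences; real input is a whole proteome
        "p1": "MKKLLLLAAAVVVAAKKRRDE",
        "p2": "MSTSTSTSSSSNNNQQQDDDEE",
        "p3": "MWWWCCCYYYFFFLLLIIIVVV",
    }
    X = np.vstack([composition(s) for s in proteins.values()])
    pcs = PCA(n_components=2).fit_transform(X)
    for name, (pc1, pc2) in zip(proteins, pcs):
        print(f"{name}: PC1={pc1:+.3f} PC2={pc2:+.3f}")
```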

  15. Automated Mineral Analysis to Characterize Metalliferous Mine Waste

    Science.gov (United States)

    Hensler, Ana-Sophie; Lottermoser, Bernd G.; Vossen, Peter; Langenberg, Lukas C.

    2016-10-01

    The objective of this study was to investigate the applicability of automated QEMSCAN® mineral analysis combined with bulk geochemical analysis to evaluate the environmental risk of non-acid producing mine waste present at the historic Albertsgrube Pb-Zn mine site, Hastenrath, North Rhine-Westphalia, Germany. Geochemical analyses revealed elevated average abundances of As, Cd, Cu, Mn, Pb, Sb and Zn and near neutral to slightly alkaline paste pH values. Mineralogical analyses using the QEMSCAN® revealed diverse mono- and polymineralic particles across all samples, with grain sizes ranging from a few μm up to 2000 μm. Calcite and dolomite (up to 78 %), smithsonite (up to 24 %) and Ca sulphate (up to 11.5 %) are present mainly as coarse-grained particles. By contrast, significant amounts of quartz, muscovite/illite, sphalerite (up to 10.8 %), galena (up to 1 %), pyrite (up to 3.4 %) and cerussite/anglesite (up to 4.3 %) are present as fine-grained (<500 μm) particles. QEMSCAN® analysis also identified disseminated sauconite, coronadite/chalcophanite, chalcopyrite, jarosite, apatite, rutile, K-feldspar, biotite, Fe (hydr)oxides/carbonates and unknown Zn-Pb (Fe) and Zn-Pb-Ca (Fe-Ti) phases. Many of the metal-bearing sulphide grains occur as separate particles with exposed surface areas and thus may be a matter of environmental concern, because such mineralogical hosts will continue to release metals and metalloids (As, Cd, Sb, Zn) at near neutral pH into ground and surface waters. QEMSCAN® mineral analysis allows acquisition of fully quantitative data on the mineralogical composition, textural characteristics and grain size estimation of mine waste material and permits the recognition of mine waste as “high-risk” material that would have otherwise been classified by traditional geochemical tests as benign.

  16. BAGEL2 : mining for bacteriocins in genomic data

    NARCIS (Netherlands)

    de Jong, Anne; van Heel, Auke J.; Kok, Jan; Kuipers, Oscar P.

    2010-01-01

    Mining bacterial genomes for bacteriocins is a challenging task due to the substantial structure and sequence diversity, and generally small sizes, of these antimicrobial peptides. Major progress in the research of antimicrobial peptides and the ever-increasing quantities of genomic data, varying fr

  17. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal Matoq Saeed

    2015-08-18

    Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/
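    The comparison-and-combination idea can be illustrated with a small sketch that is not BEACON itself: given two gene-to-function annotation tables, it reports agreements and conflicts and builds an extended annotation in which a gene lacking a function in one method inherits the other method's putative assignment.

```python
# Sketch (not BEACON itself): compare two functional annotations of the
# same genome and build an extended annotation by combining them.

def compare_and_extend(ann_a, ann_b, unknown="hypothetical protein"):
    extended, agree, conflict = {}, [], []
    for gene in set(ann_a) | set(ann_b):
        fa, fb = ann_a.get(gene, unknown), ann_b.get(gene, unknown)
        if fa == fb:
            agree.append(gene)
            extended[gene] = fa
        elif unknown in (fa, fb):
            # one method offers a putative function the other lacks
            extended[gene] = fb if fa == unknown else fa
        else:
            conflict.append(gene)
            extended[gene] = f"{fa} | {fb}"     # keep both for manual review
    return extended, agree, conflict

if __name__ == "__main__":
    method1 = {"g1": "DNA gyrase subunit A", "g2": "hypothetical protein"}
    method2 = {"g1": "DNA gyrase subunit A", "g2": "ABC transporter", "g3": "recombinase"}
    ext, agree, conflict = compare_and_extend(method1, method2)
    print(ext)
```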

  18. The evolution of genome mining in microbes – a review

    DEFF Research Database (Denmark)

    Ziemert, Nadine; Alanjary, Mohammad; Weber, Tilmann

    2016-01-01

    Covering: 2006 to 2016. The computational mining of genomes has become an important part in the discovery of novel natural products as drug leads. Thousands of bacterial genome sequences are publicly available these days, containing an even larger number and diversity of secondary metabolite gene clusters that await linkage to their encoded natural products. With the development of high-throughput sequencing methods and the wealth of DNA data available, a variety of genome mining methods and tools have been developed to guide discovery and characterisation of these compounds. This article reviews...

  19. Highlights of recent articles on data mining in genomics & proteomics

    Science.gov (United States)

    This editorial elaborates on investigations consisting of different “OMICS” technologies and their application to biological sciences. In addition, advantages and recent development of the proteomic, genomic and data mining technologies are discussed. This information will be useful to scientists ...

  20. Mining nematode genome data for novel drug targets.

    Science.gov (United States)

    Foster, Jeremy M; Zhang, Yinhua; Kumar, Sanjay; Carlow, Clotilde K S

    2005-03-01

    Expressed sequence tag projects have currently produced over 400 000 partial gene sequences from more than 30 nematode species and the full genomic sequences of selected nematodes are being determined. In addition, functional analyses in the model nematode Caenorhabditis elegans have addressed the role of almost all genes predicted by the genome sequence. This recent explosion in the amount of available nematode DNA sequences, coupled with new gene function data, provides an unprecedented opportunity to identify pre-validated drug targets through efficient mining of nematode genomic databases. This article describes the various information sources available and strategies that can expedite this process.

  1. Large-scale data mining pilot project in human genome

    Energy Technology Data Exchange (ETDEWEB)

    Musick, R.; Fidelis, R.; Slezak, T.

    1997-05-01

    This whitepaper briefly describes a new, aggressive effort in large-scale data mining at Livermore National Labs. The implications of `large-scale` will be clarified in a later section. In the short term, this effort will focus on several mission-critical questions of the Genome project. We will adapt current data mining techniques to the Genome domain, quantify the accuracy of inference results, and lay the groundwork for a more extensive effort in large-scale data mining. A major aspect of the approach is that we will be supported by a fully-staffed data warehousing effort in the human Genome area. The long term goal is a strong applications-oriented research program in large-scale data mining. The tools and skill set gained will be directly applicable to a wide spectrum of tasks involving large spatial and multidimensional data. This includes applications in ensuring non-proliferation, stockpile stewardship, enabling Global Ecology (Materials Database Industrial Ecology), advancing the Biosciences (Human Genome Project), and supporting data for others (Battlefield Management, Health Care).

  2. Digital Coal Mine Integrated Automation System Based on ControlNet

    Institute of Scientific and Technical Information of China (English)

    CHEN Jin-yun; ZHANG Shen; ZUO Wei-ran

    2007-01-01

    A three-layer model for digital communication in a mine is proposed. Two basic platforms are discussed: A uniform transmission network and a uniform data warehouse. An actual, ControlNet based, transmission network platform suitable for the Jining No.3 coal mine is presented. This network is an information superhighway intended to integrate all existing and new automation subsystems. Its standard interface can be used with future subsystems. The network, data structure and management decision-making all employ this uniform hardware and software. This effectively avoids the problems of system and information islands seen in traditional mine-automation systems. The construction of the network provides a stable foundation for digital communication in the Jining No.3 coal mine.

  3. Chapter 13: Mining electronic health records in the genomics era.

    Directory of Open Access Journals (Sweden)

    Joshua C Denny

    Full Text Available The combination of improved genomic analysis methods, decreasing genotyping costs, and increasing computing resources has led to an explosion of clinical genomic knowledge in the last decade. Similarly, healthcare systems are increasingly adopting robust electronic health record (EHR systems that not only can improve health care, but also contain a vast repository of disease and treatment data that could be mined for genomic research. Indeed, institutions are creating EHR-linked DNA biobanks to enable genomic and pharmacogenomic research, using EHR data for phenotypic information. However, EHRs are designed primarily for clinical care, not research, so reuse of clinical EHR data for research purposes can be challenging. Difficulties in use of EHR data include: data availability, missing data, incorrect data, and vast quantities of unstructured narrative text data. Structured information includes billing codes, most laboratory reports, and other variables such as physiologic measurements and demographic information. Significant information, however, remains locked within EHR narrative text documents, including clinical notes and certain categories of test results, such as pathology and radiology reports. For relatively rare observations, combinations of simple free-text searches and billing codes may prove adequate when followed by manual chart review. However, to extract the large cohorts necessary for genome-wide association studies, natural language processing methods to process narrative text data may be needed. Combinations of structured and unstructured textual data can be mined to generate high-validity collections of cases and controls for a given condition. Once high-quality cases and controls are identified, EHR-derived cases can be used for genomic discovery and validation. Since EHR data includes a broad sampling of clinically-relevant phenotypic information, it may enable multiple genomic investigations upon a single set of genotyped
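    A toy sketch of the case/control idea discussed above: a patient is called a case only when a billing code and a supporting mention in the narrative text agree; the codes, keyword pattern and decision rule are illustrative, and real pipelines replace the keyword test with natural language processing.

```python
# Toy sketch of the case/control idea: require a billing code AND a
# supporting mention in narrative text before calling a patient a case.
# Real pipelines replace the keyword test with natural language processing.
import re

CASE_CODES = {"250.00", "E11.9"}          # illustrative diabetes billing codes
CASE_PATTERN = re.compile(r"\btype\s*2\s*diabetes\b", re.IGNORECASE)

def classify(patient):
    has_code = bool(CASE_CODES & set(patient["billing_codes"]))
    has_text = any(CASE_PATTERN.search(note) for note in patient["notes"])
    if has_code and has_text:
        return "case"
    if not has_code and not has_text:
        return "control"
    return "review"                        # conflicting evidence -> manual chart review

if __name__ == "__main__":
    pt = {"billing_codes": ["E11.9"], "notes": ["Pt with Type 2 Diabetes, on metformin."]}
    print(classify(pt))   # case
```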

  4. Data mining and the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Abarbanel, Henry [The MITRE Corporation, McLean, VA (US). JASON Program Office; Callan, Curtis [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dally, William [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dyson, Freeman [The MITRE Corporation, McLean, VA (US). JASON Program Office; Hwa, Terence [The MITRE Corporation, McLean, VA (US). JASON Program Office; Koonin, Steven [The MITRE Corporation, McLean, VA (US). JASON Program Office; Levine, Herbert [The MITRE Corporation, McLean, VA (US). JASON Program Office; Rothaus, Oscar [The MITRE Corporation, McLean, VA (US). JASON Program Office; Schwitters, Roy [The MITRE Corporation, McLean, VA (US). JASON Program Office; Stubbs, Christopher [The MITRE Corporation, McLean, VA (US). JASON Program Office; Weinberger, Peter [The MITRE Corporation, McLean, VA (US). JASON Program Office

    2000-01-07

    As genomics research moves from an era of data acquisition to one of both acquisition and interpretation, new methods are required for organizing and prioritizing the data. These methods would allow an initial level of data analysis to be carried out before committing resources to a particular genetic locus. This JASON study sought to delineate the main problems that must be faced in bioinformatics and to identify information technologies that can help to overcome those problems. While the current influx of data greatly exceeds what biologists have experienced in the past, other scientific disciplines and the commercial sector have been handling much larger datasets for many years. Powerful datamining techniques have been developed in other fields that, with appropriate modification, could be applied to the biological sciences.

  5. Understanding social collaboration between actors and technology in an automated and digitised deep mining environment.

    Science.gov (United States)

    Sanda, M-A; Johansson, J; Johansson, B; Abrahamsson, L

    2011-10-01

    The purpose of this article is to develop knowledge and learning on the best way to automate organisational activities in deep mines that could lead to the creation of harmony between the human, technical and the social system, towards increased productivity. The findings showed that though the introduction of high-level technological tools in the work environment disrupted the social relations developed over time amongst the employees in most situations, the technological tools themselves became substitute social collaborative partners to the employees. It is concluded that, in developing a digitised mining production system, knowledge of the social collaboration between the humans (miners) and the technology they use for their work must be developed. By implication, knowledge of the human's subject-oriented and object-oriented activities should be considered as an important integral resource for developing a better technological, organisational and human interactive subsystem when designing the intelligent automation and digitisation systems for deep mines. STATEMENT OF RELEVANCE: This study focused on understanding the social collaboration between humans and the technologies they use to work in underground mines. The learning provides an added knowledge in designing technologies and work organisations that could better enhance the human-technology interactive and collaborative system in the automation and digitisation of underground mines.

  6. AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Giltae Song

    Full Text Available The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.

  7. AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae.

    Science.gov (United States)

    Song, Giltae; Dickins, Benjamin J A; Demeter, Janos; Engel, Stacia; Gallagher, Jennifer; Choe, Kisurb; Dunn, Barbara; Snyder, Michael; Cherry, J Michael

    2015-01-01

    The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.
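    One step of such a pan-genome analysis, classifying genes by their presence or absence across strains into core, accessory and strain-specific sets, can be sketched as follows; the strain and gene names are illustrative only.

```python
# Sketch of one pan-genome analysis step: classify genes by their presence
# or absence across strains into core, accessory and strain-specific sets.

def classify_pan_genome(strain_genes):
    """strain_genes: dict mapping strain name -> set of gene identifiers."""
    n = len(strain_genes)
    all_genes = set().union(*strain_genes.values())
    counts = {g: sum(g in genes for genes in strain_genes.values()) for g in all_genes}
    core = {g for g, c in counts.items() if c == n}
    unique = {g for g, c in counts.items() if c == 1}
    accessory = all_genes - core - unique
    return core, accessory, unique

if __name__ == "__main__":
    strains = {                              # illustrative strains and genes
        "S288C": {"ADE2", "FLO1", "GAL4"},
        "SK1":   {"ADE2", "GAL4", "RTM1"},
        "Sigma": {"ADE2", "FLO1", "GAL4", "MAL63"},
    }
    core, accessory, unique = classify_pan_genome(strains)
    print("core:", sorted(core))
    print("accessory:", sorted(accessory))
    print("strain-specific:", sorted(unique))
```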

  8. A web server for mining Comparative Genomic Hybridization (CGH) data

    Science.gov (United States)

    Liu, Jun; Ranka, Sanjay; Kahveci, Tamer

    2007-11-01

    Advances in cytogenetics and molecular biology have established that chromosomal alterations are critical in the pathogenesis of human cancer. Recurrent chromosomal alterations provide cytological and molecular markers for the diagnosis and prognosis of disease. They also facilitate the identification of genes that are important in carcinogenesis, which in the future may help in the development of targeted therapy. A large and growing amount of cancer genetic data is now publicly available. There is a need for public domain tools that allow users to analyze their data and visualize the results. This chapter describes a web-based software tool that will allow researchers to analyze and visualize Comparative Genomic Hybridization (CGH) datasets. It employs novel data mining methodologies for clustering and classification of CGH datasets as well as algorithms for identifying important markers (small sets of genomic intervals with aberrations) that are potentially cancer signatures. The developed software will help in understanding the relationships between genomic aberrations and cancer types.

  9. Automated training for algorithms that learn from genomic data.

    Science.gov (United States)

    Cilingir, Gokcen; Broschat, Shira L

    2015-01-01

    Supervised machine learning algorithms are used by life scientists for a variety of objectives. Expert-curated public gene and protein databases are major resources for gathering data to train these algorithms. While these data resources are continuously updated, generally, these updates are not incorporated into published machine learning algorithms which thereby can become outdated soon after their introduction. In this paper, we propose a new model of operation for supervised machine learning algorithms that learn from genomic data. By defining these algorithms in a pipeline in which the training data gathering procedure and the learning process are automated, one can create a system that generates a classifier or predictor using information available from public resources. The proposed model is explained using three case studies on SignalP, MemLoci, and ApicoAP in which existing machine learning models are utilized in pipelines. Given that the vast majority of the procedures described for gathering training data can easily be automated, it is possible to transform valuable machine learning algorithms into self-evolving learners that benefit from the ever-changing data available for gene products and to develop new machine learning algorithms that are similarly capable.
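    A minimal sketch of the proposed model of operation: gather the latest labelled examples, retrain, and persist the classifier so the pipeline can be re-run whenever the source data are updated. The fetch_training_data function is a stand-in for an automated query against a curated public resource, and the 3-mer feature representation is only one simple choice.

```python
# Minimal sketch of a self-updating pipeline: fetch the latest labelled
# examples, then retrain and persist the classifier. fetch_training_data()
# is a stand-in for an automated query against an expert-curated database.
import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def fetch_training_data():
    # Stand-in for retrieving, e.g., signal-peptide-positive and -negative
    # sequences from a curated public resource; labels are illustrative.
    sequences = ["MKKTAIAIAVALAGFATVAQA", "MSTNPKPQRKTKRNTNRRPQD"]
    labels = [1, 0]          # 1 = has signal peptide (illustrative)
    return sequences, labels

def retrain(model_path="classifier.joblib"):
    sequences, labels = fetch_training_data()
    # 3-mer character features as a simple sequence representation
    model = make_pipeline(CountVectorizer(analyzer="char", ngram_range=(3, 3)),
                          MultinomialNB())
    model.fit(sequences, labels)
    joblib.dump(model, model_path)
    return model

if __name__ == "__main__":
    clf = retrain()
    print(clf.predict(["MKKTAIAIAVALAGFATVAQA"]))
```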

  10. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Indian Academy of Sciences (India)

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-04-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  11. Automated methods and control when mining seams prone to outburst

    Energy Technology Data Exchange (ETDEWEB)

    Kolesov, O.A.; Agaphonov, A.V.; Kolchin, G.I. [Makeyevka Safety in Mines Research Institute (Ukraine)

    1995-12-31

    Drawbacks in existing methods of predicting outburst zones in Donbas thin coal seams led specialists at MakNII to investigate methods based on artificially excited acoustic signals, with processing by personal computers. The paper describes investigations to correlate different acoustic signal parameters with the stress-strain state of the massif ahead of the face. The method proved reliable in determining the relief zone in 12 Donbas mines. The paper goes on to describe development of a control method for another widely used method of coal and gas outburst prevention in Donbas, that of water injection into the coal seam known as `hydroripping`. This method includes recording acoustic signals and determining the parameters of the zone ahead of the face during drilling of the infusion holes, and recording and processing, in real time, the acoustic signal created during water infusion. 8 refs.

  12. Evaluation of automated underground mapping solutions for mining and civil engineering applications

    Science.gov (United States)

    Eyre, Matthew; Wetherelt, Andrew; Coggan, John

    2016-10-01

    The extractive and construction industries rely heavily on accurate geospatial data to control position, location, alignment, and orientation of planned excavations. Recent advancements in the survey industry, through the use of terrestrial laser scanning, can now provide engineering teams with three-dimensional (3-D) data in unprecedented detail via georeferenced point clouds. Furthermore, equipment is now available that provides fully mobile automated mapping solutions, independent of satellite positioning, utilizing simultaneous localization and mapping. This paper evaluates the surveying capability of three fully mobile automated mapping solutions against a benchmark laser scanning survey undertaken at the underground Camborne School of Mines Test Mine facility. The study highlights that handheld automated mapping solutions, in which closed-loops can be formed, have the potential to provide quicker data collection and processing time, as well as the required accuracy for underground surveying applications. However, the automated solution was unable to produce the necessary point cloud density to identify low-angled discontinuities that may have a major safety implication, leading to potential rockfall.

  13. Comprehensive automation of work on delivering and laying rail tracks in mine roadways

    Energy Technology Data Exchange (ETDEWEB)

    Mazin, S.P.; Volkov, V.Yu.; Ignatova, E.M.; Puleev, S.F.

    1988-07-01

    Presents the KD-1 and NPSh-900 sets of equipment for automation of track laying work in underground mine roadways. Analysis of track laying in mines in the USSR shows that about 200 km of track are laid annually, with energy of 1600 kJ/m expended in the form of manual labor. The KD-1 equipment, developed by VNIIOMShS, is based on a track delivery container holding 10-12 rails 8-12 m long. The NPSh-900 set is a set of tools and accessories for track laying - a rail cutting unit, a rail drilling unit, a rail bender, ballast cars, grips, jacks, etc. All this equipment is delivered to the track laying site on a special rail/wheeled trolley. Both sets of equipment have been successfully tested in Donbass mines.

  14. Data mining for regulatory elements in yeast genome.

    Science.gov (United States)

    Brazma, A; Vilo, J; Ukkonen, E; Valtonen, K

    1997-01-01

    We have examined methods and developed a general software tool for finding and analyzing combinations of transcription factor binding sites that occur relatively often in gene upstream regions (putative promoter regions) in the yeast genome. Such frequently occurring combinations may be essential parts of possible promoter classes. The regions upstream to all genes were first isolated from the yeast genome database MIPS using the information in the annotation files of the database. The ones that do not overlap with coding regions were chosen for further studies. Next, all occurrences of the yeast transcription factor binding sites, as given in the IMD database, were located in the genome and in the selected regions in particular. Finally, by using a general purpose data mining software in combination with our own software, which parametrizes the search, we can find the combinations of binding sites that occur in the upstream regions more frequently than would be expected on the basis of the frequency of individual sites. The procedure also finds so-called association rules present in such combinations. The developed tool is available for use through the WWW.
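    The core counting step can be sketched as follows: for every pair of binding sites, compare how often the pair co-occurs in upstream regions with the frequency expected if the sites occurred independently; ratios well above 1 flag candidate combinations for association-rule analysis.

```python
# Sketch of the core counting step: how often does a pair of transcription
# factor binding sites co-occur in upstream regions, compared with the
# frequency expected if the sites occurred independently?
from itertools import combinations

def pair_enrichment(upstream_sites):
    """upstream_sites: list of sets, one set of site names per upstream region."""
    n = len(upstream_sites)
    single = {}
    for sites in upstream_sites:
        for s in sites:
            single[s] = single.get(s, 0) + 1
    results = {}
    for a, b in combinations(sorted(single), 2):
        observed = sum(1 for sites in upstream_sites if a in sites and b in sites) / n
        expected = (single[a] / n) * (single[b] / n)
        if expected > 0:
            results[(a, b)] = observed / expected    # >1 means over-represented
    return results

if __name__ == "__main__":
    regions = [{"MCB", "SCB"}, {"MCB", "SCB"}, {"MCB"}, {"STRE"}]   # toy upstream regions
    for pair, ratio in sorted(pair_enrichment(regions).items()):
        print(pair, round(ratio, 2))
```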

  15. Event-based text mining for biology and functional genomics.

    Science.gov (United States)

    Ananiadou, Sophia; Thompson, Paul; Nawaz, Raheel; McNaught, John; Kell, Douglas B

    2015-05-01

    The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of 'events', i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research.

  16. Automated comparative auditing of NCIT genomic roles using NCBI.

    Science.gov (United States)

    Cohen, Barry; Oren, Marc; Min, Hua; Perl, Yehoshua; Halper, Michael

    2008-12-01

    Biomedical research has identified many human genes and various knowledge about them. The National Cancer Institute Thesaurus (NCIT) represents such knowledge as concepts and roles (relationships). Due to the rapid advances in this field, it is to be expected that the NCIT's Gene hierarchy will contain role errors. A comparative methodology to audit the Gene hierarchy with the use of the National Center for Biotechnology Information's (NCBI's) Entrez Gene database is presented. The two knowledge sources are accessed via a pair of Web crawlers to ensure up-to-date data. Our algorithms then compare the knowledge gathered from each, identify discrepancies that represent probable errors, and suggest corrective actions. The primary focus is on two kinds of gene-roles: (1) the chromosomal locations of genes, and (2) the biological processes in which genes play a role. Regarding chromosomal locations, the discrepancies revealed are striking and systematic, suggesting a structurally common origin. In regard to the biological processes, difficulties arise because genes frequently play roles in multiple processes, and processes may have many designations (such as synonymous terms). Our algorithms make use of the roles defined in the NCIT Biological Process hierarchy to uncover many probable gene-role errors in the NCIT. These results show that automated comparative auditing is a promising technique that can identify a large number of probable errors and corrections for them in a terminological genomic knowledge repository, thus facilitating its overall maintenance.
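    A simplified sketch of the comparative audit for the chromosomal-location roles: compare gene-to-location maps taken from the two sources and flag disagreements as probable errors; the real system gathers these maps live from NCIT and NCBI Entrez Gene via web crawlers, and the entries below are illustrative.

```python
# Simplified sketch of the comparative audit: flag genes whose chromosomal
# location differs between two knowledge sources. The real system pulls
# these maps live from NCIT and NCBI Entrez Gene via web crawlers.

def audit_locations(ncit_loc, ncbi_loc):
    discrepancies = []
    for gene in set(ncit_loc) & set(ncbi_loc):
        if ncit_loc[gene] != ncbi_loc[gene]:
            discrepancies.append((gene, ncit_loc[gene], ncbi_loc[gene]))
    missing = sorted(set(ncbi_loc) - set(ncit_loc))
    return discrepancies, missing

if __name__ == "__main__":
    ncit = {"TP53": "17p13.1", "BRCA2": "13q12.3"}              # illustrative entries
    ncbi = {"TP53": "17p13.1", "BRCA2": "13q13.1", "KRAS": "12p12.1"}
    probs, missing = audit_locations(ncit, ncbi)
    print("probable errors:", probs)      # BRCA2 location mismatch
    print("missing from NCIT:", missing)  # KRAS
```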

  17. Automating slope monitoring in mines with terrestrial lidar scanners

    Science.gov (United States)

    Conforti, Dario

    2014-05-01

    Static terrestrial laser scanners (TLS) have been an important component of slope monitoring for some time, and many solutions for monitoring the progress of a slide have been devised over the years. However, all of these solutions have required users to operate the lidar equipment in the field, creating a high cost in time and resources, especially if the surveys must be performed very frequently. This paper presents a new solution for monitoring slides, developed using a TLS and an automated data acquisition, processing and analysis system. In this solution, a TLS is permanently mounted within sight of the target surface and connected to a control computer. The control software on the computer automatically triggers surveys according to a user-defined schedule, parses data into point clouds, and compares data against a baseline. The software can base the comparison against either the original survey of the site or the most recent survey, depending on whether the operator needs to measure the total or recent movement of the slide. If the displacement exceeds a user-defined safety threshold, the control computer transmits alerts via SMS text messaging and/or email, including graphs and tables describing the nature and size of the displacement. The solution can also be configured to trigger the external visual/audio alarm systems. If the survey areas contain high-traffic areas such as roads, the operator can mark them for exclusion in the comparison to prevent false alarms. To improve usability and safety, the control computer can connect to a local intranet and allow remote access through the software's web portal. This enables operators to perform most tasks with the TLS from their office, including reviewing displacement reports, downloading survey data, and adjusting the scan schedule. This solution has proved invaluable in automatically detecting and alerting users to potential danger within the monitored areas while lowering the cost and work required for
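    The decision step of such a monitoring loop can be sketched as follows, using nearest-neighbour distances between a new scan and the baseline cloud as a simple stand-in for a full cloud-to-cloud comparison, and a print statement in place of the SMS/email alerting described above.

```python
# Schematic sketch of the alerting decision: compare a new scan against a
# baseline point cloud and trigger an alert when displacement exceeds a
# user-defined threshold. Nearest-neighbour distance is a simple stand-in
# for a full cloud-to-cloud comparison.
import numpy as np
from scipy.spatial import cKDTree

def max_displacement(baseline, new_scan):
    """baseline, new_scan: (N, 3) arrays of x, y, z points."""
    tree = cKDTree(baseline)
    distances, _ = tree.query(new_scan)
    return float(distances.max())

def check_slope(baseline, new_scan, threshold_m=0.05):
    d = max_displacement(baseline, new_scan)
    if d > threshold_m:
        print(f"ALERT: displacement {d:.3f} m exceeds {threshold_m} m")  # real system: SMS/email
    return d

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.uniform(0, 10, size=(1000, 3))
    moved = base + np.array([0.0, 0.0, 0.08])   # simulated 8 cm drop
    check_slope(base, moved)
```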

  18. Discovery of secondary metabolites from Bacillus spp. biocontrol strains using genome mining and mass spectroscopy

    Science.gov (United States)

    Genome sequencing, data mining and mass spectrometry were used to identify secondary metabolites produced by several Bacillus spp. biocontrol strains. These biocontrol strains have shown promise in managing Fusarium head blight in wheat. Draft genomes were produced and screened in silico using genom...

  19. Joint Genome Institute's Automation Approach and History

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Simon

    2006-07-05

    Department of Energy/Joint Genome Institute (DOE/JGI) collaborates with DOE national laboratories and community users, to advance genome science in support of the DOE missions of clean bio-energy, carbon cycling, and bioremediation.

  20. Lunar surface mining for automated acquisition of helium-3: Methods, processes, and equipment

    Science.gov (United States)

    Li, Y. T.; Wittenberg, L. J.

    1992-09-01

    In this paper, several techniques considered for mining and processing the regolith on the lunar surface are presented. These techniques have been proposed and evaluated based primarily on the following criteria: (1) mining operations should be relatively simple; (2) procedures of mineral processing should be few and relatively easy; (3) transferring tonnages of regolith on the Moon should be minimized; (4) operations outside the lunar base should be readily automated; (5) all equipment should be maintainable; and (6) economic benefit should be sufficient for commercial exploitation. The economic benefits are not addressed in this paper; however, the energy benefits have been estimated to be between 250 and 350 times the mining energy. A mobile mining scheme is proposed that meets most of the mining objectives. This concept uses a bucket-wheel excavator for excavating the regolith, several mechanical electrostatic separators for beneficiation of the regolith, a fast-moving fluidized bed reactor to heat the particles, and a palladium diffuser to separate H2 from the other solar wind gases. At the final stage of the miner, the regolith 'tailings' are deposited directly into the ditch behind the miner and cylinders of the valuable solar wind gases are transported to a central gas processing facility. During the production of He-3, large quantities of valuable H2, H2O, CO, CO2, and N2 are produced for utilization at the lunar base. For larger production of He-3 the utilization of multiple-miners is recommended rather than increasing their size. Multiple miners permit operations at more sites and provide redundancy in case of equipment failure.

  1. Automating the Analysis of Spatial Grids A Practical Guide to Data Mining Geospatial Images for Human & Environmental Applications

    CERN Document Server

    Lakshmanan, Valliappa

    2012-01-01

    The ability to create automated algorithms to process gridded spatial data is increasingly important as remotely sensed datasets increase in volume and frequency. Whether in business, social science, ecology, meteorology or urban planning, the ability to create automated applications to analyze and detect patterns in geospatial data is increasingly important. This book provides students with a foundation in topics of digital image processing and data mining as applied to geospatial datasets. The aim is for readers to be able to devise and implement automated techniques to extract information from spatial grids such as radar, satellite or high-resolution survey imagery.

  2. Comments on Sensory Mine Internet of Things and Mine Comprehensive Automation

    Institute of Scientific and Technical Information of China (English)

    张申; 赵小虎

    2012-01-01

    This paper systematically analyses the relationship and distinction between the sensory mine Internet of Things and mine comprehensive automation. Starting from the concept of mine comprehensive automation and implemented cases, it reviews the achievements of more than ten years of building and operating comprehensive automation systems and summarizes the remaining problems: traditional and limited sensing means, the lack of a ubiquitous sensing network, emphasis on hardware integration over software integration, insufficient cross-disciplinary work, and inadequate standards. The sensory mine Internet of Things is an effective way to solve these problems. The paper proposes that construction of the mine Internet of Things should cover four aspects: a network platform, an application platform, three core sensing targets (personnel, equipment and disasters) and application systems. It points out that comprehensive automation is the foundation of the mine Internet of Things, and that the mine Internet of Things is a sublimation of the comprehensive automation concept.

  3. Clinic-Genomic Association Mining for Colorectal Cancer Using Publicly Available Datasets

    Directory of Open Access Journals (Sweden)

    Fang Liu

    2014-01-01

    Full Text Available In recent years, a growing number of researchers have begun to focus on how to establish associations between clinical and genomic data. However, up to now, there has been a lack of research mining clinic-genomic associations by comprehensively analysing available gene expression data for a single disease. Colorectal cancer is one of the malignant tumours, and a number of genetic syndromes have been proven to be associated with it. This paper presents our research on mining clinic-genomic associations for colorectal cancer in a biomedical big data environment. The proposed method is engineered with multiple technologies, including extracting clinical concepts using the Unified Medical Language System (UMLS), extracting genes through literature mining, and mining clinic-genomic associations through statistical analysis. We applied this method to datasets extracted from both the Gene Expression Omnibus (GEO) and the Genetic Association Database (GAD). A total of 23517 clinic-genomic associations between 139 clinical concepts and 7914 genes were obtained, of which 3474 associations between 31 clinical concepts and 1689 genes were identified as highly reliable ones. Evaluation and interpretation were performed using UMLS, KEGG, and Gephi, and potential new discoveries were explored. The proposed method is effective in mining valuable knowledge from available biomedical big data and achieves a good performance in bridging clinical data with genomic data for colorectal cancer.

  4. New development of longwall mining equipment based on automation and intelligent technology for thin seam coal

    Institute of Scientific and Technical Information of China (English)

    Guo-fa WANG

    2013-01-01

    The paper introduces the complete sets of automatic equipment and technology used at thin seam coal faces, and proposes comprehensive mechanization and automation of safe and high-efficiency mining models based on the thin seam drum shearer. The key technologies of a short-length, high-power thin seam drum shearer and a new type of roof support with a big extension ratio and plate canopy are introduced. The new research achievements on the automatic control system of the complete sets of equipment for thin seam coal, which is composed of an electro-hydraulic system, compact thin seam roof supports and a high-efficiency shearer with an intelligent control system, and is characterized by automatic follow-up and remote control technology, are described in this paper.

  5. Automated interpretation of optic nerve images: a data mining framework for glaucoma diagnostic support.

    Science.gov (United States)

    Abidi, Syed S R; Artes, Paul H; Yun, Sanjan; Yu, Jin

    2007-01-01

    Confocal Scanning Laser Tomography (CSLT) techniques capture high-quality images of the optic disc (the retinal region where the optic nerve exits the eye) that are used in the diagnosis and monitoring of glaucoma. We present a hybrid framework, combining image processing and data mining methods, to support the interpretation of CSLT optic nerve images. Our framework features (a) Zernike moment methods to derive shape information from optic disc images; (b) classification of optic disc images, based on shape information, to distinguish between healthy and glaucomatous optic discs. We apply Multi Layer Perceptrons, Support Vector Machines and Bayesian Networks for feature sub-set selection and image classification; and (c) clustering of optic disc images, based on shape information, using Self-Organizing Maps to visualize sub-types of glaucomatous optic disc damage. Our framework offers an automated and objective analysis of optic nerve images that can potentially support both diagnosis and monitoring of glaucoma.
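    A schematic sketch of the classification step: a crude radial shape descriptor stands in for the Zernike moments used in the paper, and a support vector machine separates round ("healthy") from vertically elongated ("cupped") toy discs; all data and parameters are illustrative.

```python
# Schematic sketch of the classification step. A crude radial shape
# descriptor stands in for Zernike moments; a support vector machine then
# separates the two classes. All data are synthetic and illustrative.
import numpy as np
from sklearn.svm import SVC

def radial_profile(mask, n_bins=8):
    """Stand-in shape descriptor: mean pixel radius from the centroid in
    n_bins angular sectors, normalised by the overall mean radius."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    r = np.hypot(ys - cy, xs - cx)
    theta = np.arctan2(ys - cy, xs - cx)
    bins = np.digitize(theta, np.linspace(-np.pi, np.pi, n_bins + 1)) - 1
    profile = np.array([r[bins == b].mean() if np.any(bins == b) else 0.0
                        for b in range(n_bins)])
    return profile / (r.mean() + 1e-9)

def make_disc(shape=(64, 64), ry=20, rx=20):
    """Toy binary mask of an ellipse standing in for a segmented optic disc."""
    cy, cx = shape[0] / 2, shape[1] / 2
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    return ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0

if __name__ == "__main__":
    # Toy data: round "healthy" discs vs. vertically elongated "cupped" discs.
    X = [radial_profile(make_disc(ry=20, rx=20)) for _ in range(5)] + \
        [radial_profile(make_disc(ry=26, rx=16)) for _ in range(5)]
    y = [0] * 5 + [1] * 5
    clf = SVC(kernel="rbf").fit(X, y)
    print(clf.predict([radial_profile(make_disc(ry=25, rx=17))]))   # expected: [1]
```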

  6. PLAN: a web platform for automating high-throughput BLAST searches and for managing and mining results

    Directory of Open Access Journals (Sweden)

    Zhao Xuechun

    2007-02-01

    Full Text Available Abstract Background BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming. Results Personal BLAST Navigator (PLAN) is a versatile web platform that helps users to carry out various personalized pre- and post-BLAST tasks, including: (1) query and target sequence database management, (2) automated high-throughput BLAST searching, (3) indexing and searching of results, (4) filtering results online, (5) managing results of personal interest in favorite categories, and (6) automated sequence annotation (such as NCBI NR and ontology-based annotation). PLAN integrates, by default, the Decypher hardware-based BLAST solution provided by Active Motif Inc. with a greatly improved efficiency over conventional BLAST software. BLAST results are visualized by spreadsheets and graphs and are full-text searchable. BLAST results and sequence annotations can be exported, in part or in full, in various formats including Microsoft Excel and FASTA. Sequences and BLAST results are organized in projects, the data publication levels of which are controlled by the registered project owners. In addition, all analytical functions are provided to public users without registration. Conclusion PLAN has proved a valuable addition to the community for automated high-throughput BLAST searches, and, more importantly, for knowledge discovery, management and sharing based on sequence alignment results.
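    For readers wanting a flavour of the post-BLAST filtering that platforms like PLAN automate, the sketch below parses standard BLAST+ tabular output (-outfmt 6) and keeps the best hit per query above an E-value cutoff. It is not PLAN's own code, and the file name is hypothetical.

```python
# Minimal post-BLAST filtering sketch (not PLAN itself): parse BLAST+
# tabular output (-outfmt 6) and keep the best hit per query by bit score,
# subject to an E-value cutoff. "results.tab" is a hypothetical file name.
import csv

FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def best_hits(path, max_evalue=1e-5):
    best = {}
    with open(path) as handle:
        for row in csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t"):
            evalue, bitscore = float(row["evalue"]), float(row["bitscore"])
            if evalue > max_evalue:
                continue
            current = best.get(row["qseqid"])
            if current is None or bitscore > float(current["bitscore"]):
                best[row["qseqid"]] = row
    return best

if __name__ == "__main__":
    for query, hit in best_hits("results.tab").items():
        print(query, hit["sseqid"], hit["evalue"])
```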

  7. Application of the Deformation Information System for automated analysis and mapping of mining terrain deformations - case study from SW Poland

    Science.gov (United States)

    Blachowski, Jan; Grzempowski, Piotr; Milczarek, Wojciech; Nowacka, Anna

    2015-04-01

    Monitoring, mapping and modelling of mining-induced terrain deformations are important tasks for quantifying and minimising the threats that arise from the underground extraction of useful minerals and that affect surface infrastructure, human safety, the environment and the security of the mining operation itself. The range of methods and techniques used for monitoring and analysis of mining terrain deformations is wide and expanding with the progress in geographical information technologies. These include, for example: terrestrial geodetic measurements, Global Navigation Satellite Systems, remote sensing, GIS-based modelling and spatial statistics, finite element method modelling, geological modelling, empirical modelling using e.g. the Knothe theory, artificial neural networks, fuzzy logic calculations and others. The presentation shows the results of numerical modelling and mapping of mining terrain deformations for two underground mining sites in SW Poland, a hard coal mine (abandoned) and a copper ore mine (active), using the functionalities of the Deformation Information System (DIS) (Blachowski et al, 2014 @ http://meetingorganizer.copernicus.org/EGU2014/EGU2014-7949.pdf). The functionalities of the spatial data modelling module of DIS are presented, and its applications in modelling, mapping and visualising mining terrain deformations based on the processing of measurement data (geodetic and GNSS) for these two cases are characterised and compared. These include automation procedures, self-developed and implemented in DIS, for calculating mining terrain subsidence with different interpolation techniques, for calculating other mining deformation parameters (i.e. tilt, horizontal displacement, horizontal strain and curvature), and for mapping mining terrain categories based on classification of the values of these parameters as used in Poland. Acknowledgments. This work has been financed from the National Science Centre Project "Development of a numerical method of
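    The deformation parameters listed above, such as tilt and curvature, are in essence spatial derivatives of the subsidence surface. The sketch below shows how such parameters could be approximated on a regular subsidence grid with NumPy; it is an illustration of the general calculation, not the DIS implementation, and the grid spacing and synthetic trough are invented.

```python
# Illustrative calculation of mining deformation parameters from a gridded
# subsidence surface (not the DIS implementation). Subsidence w is in metres
# on a regular grid with spacing dx, dy in metres.
import numpy as np

def deformation_parameters(w, dx=10.0, dy=10.0):
    dw_dy, dw_dx = np.gradient(w, dy, dx)          # first derivatives
    tilt = np.hypot(dw_dx, dw_dy)                  # slope magnitude (m/m)
    d2w_dx2 = np.gradient(dw_dx, dx, axis=1)       # second derivatives
    d2w_dy2 = np.gradient(dw_dy, dy, axis=0)
    curvature = d2w_dx2 + d2w_dy2                  # Laplacian used as a simple proxy (1/m)
    return tilt, curvature

# Synthetic bowl-shaped subsidence trough for demonstration.
y, x = np.mgrid[0:100, 0:100]
w = -2.0 * np.exp(-(((x - 50) ** 2 + (y - 50) ** 2) / (2 * 15.0 ** 2)))
tilt, curvature = deformation_parameters(w)
print(f"max tilt = {tilt.max():.4f} m/m, max curvature = {curvature.max():.6f} 1/m")
```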

  8. A hybrid approach for the automated finishing of bacterial genomes.

    Science.gov (United States)

    Bashir, Ali; Klammer, Aaron A; Robins, William P; Chin, Chen-Shan; Webster, Dale; Paxinos, Ellen; Hsu, David; Ashby, Meredith; Wang, Susana; Peluso, Paul; Sebra, Robert; Sorenson, Jon; Bullard, James; Yen, Jackie; Valdovino, Marie; Mollova, Emilia; Luong, Khai; Lin, Steven; LaMay, Brianna; Joshi, Amruta; Rowe, Lori; Frace, Michael; Tarr, Cheryl L; Turnsek, Maryann; Davis, Brigid M; Kasarskis, Andrew; Mekalanos, John J; Waldor, Matthew K; Schadt, Eric E

    2012-07-01

    Advances in DNA sequencing technology have improved our ability to characterize most genomic diversity. However, accurate resolution of large structural events is challenging because of the short read lengths of second-generation technologies. Third-generation sequencing technologies, which can yield longer multikilobase reads, have the potential to address limitations associated with genome assembly. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at >99.9% accuracy. Complex regions with clinically relevant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 cholera reference strain, we obtained 14 scaffolds of greater than 1 kb for the experimental data and 8 scaffolds of greater than 1 kb for the simulated data, which allowed us to correct several errors in contigs assembled from the short-read data alone. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly.

  9. Automated extraction of precise protein expression patterns in lymphoma by text mining abstracts of immunohistochemical studies

    Directory of Open Access Journals (Sweden)

    Jia-Fu Chang

    2013-01-01

    Full Text Available Background: In general, surgical pathology reviews report protein expression by tumors in a semi-quantitative manner, that is, -, -/+, +/-, +. At the same time, the experimental pathology literature provides multiple examples of precise expression levels determined by immunohistochemical (IHC) tissue examination of populations of tumors. Natural language processing (NLP) techniques enable the automated extraction of such information through text mining. We propose establishing a database linking quantitative protein expression levels with specific tumor classifications through NLP. Materials and Methods: Our method takes advantage of typical forms of representing experimental findings in terms of percentages of protein expression manifest by the tumor population under study. Characteristically, percentages are represented straightforwardly with the % symbol or as the number of positive findings of the total population. Such text is readily recognized using regular expressions and templates permitting extraction of sentences containing these forms for further analysis using grammatical structures and rule-based algorithms. Results: Our pilot study is limited to the extraction of such information related to lymphomas. We achieved a satisfactory level of retrieval as reflected in scores of 69.91% precision and 57.25% recall with an F-score of 62.95%. In addition, we demonstrate the utility of a web-based curation tool for confirming and correcting our findings. Conclusions: The experimental pathology literature represents a rich source of pathobiological information, which has been relatively underutilized. There has been a combinatorial explosion of knowledge within the pathology domain as represented by increasing numbers of immunophenotypes and disease subclassifications. NLP techniques support practical text mining techniques for extracting this knowledge and organizing it in forms appropriate for pathology decision support systems.
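    The regular-expression step described in the Methods can be illustrated as follows; the patterns and the example sentence are simplified stand-ins, not the authors' actual templates.

```python
# Simplified illustration of the regular-expression step: recover protein
# expression percentages from IHC-style sentences. Patterns and the example
# sentence are stand-ins, not the authors' actual templates.
import re

PERCENT = re.compile(r"(?P<marker>CD\d+|[A-Z]{2,}\d*)[^.%]*?(?P<pct>\d{1,3}(?:\.\d+)?)\s*%")
RATIO = re.compile(r"(?P<pos>\d+)\s+of\s+(?P<total>\d+)\s+(?:cases|patients|tumou?rs)")

sentence = ("CD20 was expressed in 85% of the lymphoma cases examined, "
            "and BCL2 positivity was observed in 27 of 45 cases.")

for m in PERCENT.finditer(sentence):
    print(f"{m.group('marker')}: {float(m.group('pct')):.1f}% positive")

for m in RATIO.finditer(sentence):
    pct = 100.0 * int(m.group("pos")) / int(m.group("total"))
    print(f"ratio mention: {m.group('pos')}/{m.group('total')} = {pct:.1f}%")
```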

  10. Mining for Single Nucleotide Polymorphisms in Pig genome sequence data

    NARCIS (Netherlands)

    Kerstens, H.H.D.; Kollers, S.; Kommandath, A.; Rosario, del M.; Dibbits, B.W.; Kinders, S.M.; Crooijmans, R.P.M.A.; Groenen, M.A.M.

    2009-01-01

    Background - Single nucleotide polymorphisms (SNPs) are ideal genetic markers due to their high abundance and the highly automated way in which SNPs are detected and SNP assays are performed. The number of SNPs identified in the pig thus far is still limited. Results - A total of 4.8 million whole g

  11. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome

    Science.gov (United States)

    Pinto, Ameet J.; Sharp, Jonathan O.; Yoder, Michael J.

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  12. Ask and Ye Shall Receive? Automated Text Mining of Michigan Capital Facility Finance Bond Election Proposals to Identify Which Topics Are Associated with Bond Passage and Voter Turnout

    Science.gov (United States)

    Bowers, Alex J.; Chen, Jingjing

    2015-01-01

    The purpose of this study is to bring together recent innovations in the research literature around school district capital facility finance, municipal bond elections, statistical models of conditional time-varying outcomes, and data mining algorithms for automated text mining of election ballot proposals to examine the factors that influence the…

  13. KAIKObase: An integrated silkworm genome database and data mining tool

    Directory of Open Access Journals (Sweden)

    Nagaraju Javaregowda

    2009-10-01

    Full Text Available Abstract Background The silkworm, Bombyx mori, is one of the most economically important insects in many developing countries owing to its large-scale cultivation for silk production. With the development of genomic and biotechnological tools, B. mori has also become an important bioreactor for production of various recombinant proteins of biomedical interest. In 2004, two genome sequencing projects for B. mori were reported independently by Chinese and Japanese teams; however, the datasets were insufficient for building long genomic scaffolds which are essential for unambiguous annotation of the genome. Now, both the datasets have been merged and assembled through a joint collaboration between the two groups. Description Integration of the two data sets of silkworm whole-genome-shotgun sequencing by the Japanese and Chinese groups together with newly obtained fosmid- and BAC-end sequences produced the best continuity (~3.7 Mb in N50 scaffold size) among the sequenced insect genomes and provided a high degree of nucleotide coverage (88%) of all 28 chromosomes. In addition, a physical map of BAC contigs constructed by fingerprinting BAC clones and a SNP linkage map constructed using BAC-end sequences were available. In parallel, proteomic data from two-dimensional polyacrylamide gel electrophoresis in various tissues and developmental stages were compiled into a silkworm proteome database. Finally, a Bombyx trap database was constructed for documenting insertion positions and expression data of transposon insertion lines. Conclusion For efficient usage of genome information for functional studies, genomic sequences, physical and genetic map information and EST data were compiled into KAIKObase, an integrated silkworm genome database which consists of 4 map viewers, a gene viewer, and sequence, keyword and position search systems to display results and data at the level of nucleotide sequence, gene, scaffold and chromosome. Integration of the

  14. VirSorter: mining viral signal from microbial genomic data

    Directory of Open Access Journals (Sweden)

    Simon Roux

    2015-05-01

    Full Text Available Viruses of microbes impact all ecosystems where microbes drive key energy and substrate transformations including the oceans, humans and industrial fermenters. However, despite this recognized importance, our understanding of viral diversity and impacts remains limited by too few model systems and reference genomes. One way to fill these gaps in our knowledge of viral diversity is through the detection of viral signal in microbial genomic data. While multiple approaches have been developed and applied for the detection of prophages (viral genomes integrated in a microbial genome), new types of microbial genomic data are emerging that are more fragmented and larger scale, such as Single-cell Amplified Genomes (SAGs) of uncultivated organisms or genomic fragments assembled from metagenomic sequencing. Here, we present VirSorter, a tool designed to detect viral signal in these different types of microbial sequence data in both a reference-dependent and reference-independent manner, leveraging probabilistic models and extensive virome data to maximize detection of novel viruses. Performance testing shows that VirSorter’s prophage prediction capability compares to that of available prophage predictors for complete genomes, but is superior in predicting viral sequences outside of a host genome (i.e., from extrachromosomal prophages, lytic infections, or partially assembled prophages). Furthermore, VirSorter outperforms existing tools for fragmented genomic and metagenomic datasets, and can identify viral signal in assembled sequence (contigs) as short as 3 kb, while providing near-perfect identification (>95% Recall and 100% Precision) on contigs of at least 10 kb. Because VirSorter scales to large datasets, it can also be used in “reverse” to more confidently identify viral sequence in viral metagenomes by sorting away cellular DNA whether derived from gene transfer agents, generalized transduction or contamination. Finally, VirSorter is made

  15. Correcting Inconsistencies and Errors in Bacterial Genome Metadata Using an Automated Curation Tool in Excel (AutoCurE).

    Science.gov (United States)

    Schmedes, Sarah E; King, Jonathan L; Budowle, Bruce

    2015-01-01

    Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes with relative ease and in less time, with a substantially reduced cost per nucleotide and hence cost per genome. More than 3,000 bacterial genomes have been sequenced and are available at the finished status. Publicly available genomes can be readily downloaded; however, there are challenges in verifying the specific supporting data contained within the download and in identifying errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of the supporting data that accompany genomes downloaded from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors, comparing the downloaded supporting data to the genome reports to verify genome names, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and the genome reports or if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation of large-scale genome data prior to downstream analyses.
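    The flagging logic described above can be sketched generically: join a local table of downloaded records to the corresponding genome-report table and flag fields that disagree or are missing. The column names and values below are hypothetical, and AutoCurE itself is implemented in Excel rather than Python.

```python
# Generic sketch of metadata consistency checking in the spirit of AutoCurE
# (which itself is implemented in Excel). Column names and values are
# hypothetical. Requires pandas.
import pandas as pd

downloaded = pd.DataFrame({
    "genome_name": ["Vibrio cholerae N16961", "Bacillus subtilis 168"],
    "refseq_accession": ["NC_002505.1", "NC_000964.3"],
    "bioproject": ["PRJNA36", "PRJNA76"],
})
genome_report = pd.DataFrame({
    "genome_name": ["Vibrio cholerae N16961", "Bacillus subtilis 168"],
    "refseq_accession": ["NC_002505.1", "NC_000964.2"],   # version mismatch
    "bioproject": ["PRJNA36", None],                      # missing value
})

merged = downloaded.merge(genome_report, on="genome_name", suffixes=("_local", "_report"))
for field in ["refseq_accession", "bioproject"]:
    mismatch = merged[f"{field}_local"] != merged[f"{field}_report"]
    missing = merged[f"{field}_report"].isna()
    merged[f"flag_{field}"] = mismatch | missing   # True means the field needs review

print(merged.filter(like="flag_"))
```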

  16. Precursor-centric genome-mining approach for lasso peptide discovery.

    Science.gov (United States)

    Maksimov, Mikhail O; Pelczer, István; Link, A James

    2012-09-18

    Lasso peptides are a class of ribosomally synthesized posttranslationally modified natural products found in bacteria. Currently known lasso peptides have a diverse set of pharmacologically relevant activities, including inhibition of bacterial growth, receptor antagonism, and enzyme inhibition. The biosynthesis of lasso peptides is specified by a cluster of three genes encoding a precursor protein and two enzymes. Here we develop a unique genome-mining algorithm to identify lasso peptide gene clusters in prokaryotes. Our approach involves pattern matching to a small number of conserved amino acids in precursor proteins, and thus allows for a more global survey of lasso peptide gene clusters than does homology-based genome mining. Of more than 3,000 currently sequenced prokaryotic genomes, we found 76 organisms that are putative lasso peptide producers. These organisms span nine bacterial phyla and an archaeal phylum. To provide validation of the genome-mining method, we focused on a single lasso peptide predicted to be produced by the freshwater bacterium Asticcacaulis excentricus. Heterologous expression of an engineered, minimal gene cluster in Escherichia coli led to the production of a unique lasso peptide, astexin-1. At 23 aa, astexin-1 is the largest lasso peptide isolated to date. It is also highly polar, in contrast to many lasso peptides that are primarily hydrophobic. Astexin-1 has modest antimicrobial activity against its phylogenetic relative Caulobacter crescentus. The solution structure of astexin-1 was determined revealing a unique topology that is stabilized by hydrogen bonding between segments of the peptide.
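    A toy version of the precursor-centric idea is sketched below: scan short predicted protein sequences for a hand-written pattern of conserved residues. The regular expression and the toy proteome used here are purely illustrative and are not the actual lasso peptide pattern or data from the paper.

```python
# Toy sketch of precursor-centric genome mining: scan short predicted proteins
# for a pattern of conserved residues. The regex below is purely illustrative
# and is NOT the actual lasso peptide pattern from the paper.
import re

ILLUSTRATIVE_PATTERN = re.compile(r"T.{1,3}G.{5,8}[DE]")  # hypothetical motif
MAX_PRECURSOR_LENGTH = 100  # lasso precursors are short proteins

def candidate_precursors(proteins):
    """proteins: dict mapping protein id -> amino acid sequence."""
    for pid, seq in proteins.items():
        if len(seq) > MAX_PRECURSOR_LENGTH:
            continue  # ignore proteins too long to be precursors
        match = ILLUSTRATIVE_PATTERN.search(seq)
        if match:
            yield pid, match.start(), match.group()

toy_proteome = {
    "orf_0001": "MKKQYEAPTLVELGQASVETLGDGNVFAEDKIVRQLT",
    "orf_0002": "MSTRELLNKAQAQKGGFGGAGDDE",
    "orf_0003": "M" + "A" * 300,  # too long to be a precursor
}
for pid, pos, motif in candidate_precursors(toy_proteome):
    print(f"{pid}: motif '{motif}' at position {pos}")
```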

  17. Toward the automated generation of genome-scale metabolic networks in the SEED

    Directory of Open Access Journals (Sweden)

    Gould John

    2007-04-01

    Full Text Available Abstract Background Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. Results We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative
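    Gap identification of the kind described above largely boils down to finding metabolites that a draft network can produce but never consume, or vice versa. The sketch below illustrates that check on a toy reaction list; it is not the SEED code, the reactions are invented, and in practice exchange/boundary metabolites would be exempted.

```python
# Toy illustration of one gap-finding check used when refining draft metabolic
# networks: find "dead-end" metabolites that are only produced or only
# consumed. Reactions are invented; this is not the SEED implementation.
toy_network = {
    "rxn_glycolysis_last": {"substrates": ["PEP", "ADP"], "products": ["PYR", "ATP"]},
    "rxn_pyr_transport":   {"substrates": ["PYR"],        "products": ["PYR_mito"]},
    "rxn_orphan":          {"substrates": ["PYR_mito"],   "products": ["ACCOA", "CO2"]},
}

def dead_end_metabolites(network):
    consumed, produced = set(), set()
    for rxn in network.values():
        consumed.update(rxn["substrates"])
        produced.update(rxn["products"])
    only_produced = produced - consumed   # made but never used
    only_consumed = consumed - produced   # used but never made
    return only_produced, only_consumed

# In a real reconstruction, exchange/boundary metabolites would be excluded
# before flagging; here everything is treated as internal for simplicity.
only_produced, only_consumed = dead_end_metabolites(toy_network)
print("produced but never consumed:", sorted(only_produced))
print("consumed but never produced:", sorted(only_consumed))
```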

  18. Mining of simple sequence repeats in the Genome of Gentianaceae

    Directory of Open Access Journals (Sweden)

    R Sathishkumar

    2011-01-01

    Full Text Available Simple sequence repeats (SSRs), or short tandem repeats, are short repeat motifs that show a high level of length polymorphism due to insertion or deletion mutations of one or more repeat units. Here, we present the detection and abundance of microsatellites or SSRs in nucleotide sequences of the Gentianaceae family. A total of 545 SSRs were mined from 4698 nucleotide sequences downloaded from the National Center for Biotechnology Information (NCBI). Among the SSR sequences, the repeat types comprised 429 mononucleotide, 99 dinucleotide, 15 trinucleotide, and 2 hexanucleotide repeats. Mononucleotide repeats were the most abundant repeat type, at about 78%, followed by dinucleotide repeats (18.16%). An attempt was made to design primer pairs for the 545 identified SSRs, but suitable primers were found for only 169 sequences.
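    Microsatellite detection of the kind reported here is commonly implemented with back-referencing regular expressions. The sketch below finds perfect mono- to hexanucleotide repeats in a toy sequence; the minimum repeat counts are arbitrary illustrative thresholds, not the ones used in this study.

```python
# Generic SSR (microsatellite) detection with back-referencing regular
# expressions. Minimum repeat counts are arbitrary illustrative thresholds,
# not the ones used in the study.
import re

MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}  # per motif length

def find_ssrs(sequence):
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1))
        for m in pattern.finditer(sequence):
            motif = m.group(2)
            if motif_len > 1 and len(set(motif)) == 1:
                continue  # skip homopolymer runs re-detected as longer motifs
            hits.append((m.start(), motif, len(m.group(1)) // motif_len))
    return hits

toy_seq = "ATGC" + "A" * 12 + "GGC" + "GA" * 7 + "TTACG" + "CTT" * 6 + "GATC"
for start, motif, repeats in find_ssrs(toy_seq):
    print(f"position {start}: ({motif})x{repeats}")
```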

  19. Automated whole-genome multiple alignment of rat, mouse, and human

    Energy Technology Data Exchange (ETDEWEB)

    Brudno, Michael; Poliakov, Alexander; Salamov, Asaf; Cooper, Gregory M.; Sidow, Arend; Rubin, Edward M.; Solovyev, Victor; Batzoglou, Serafim; Dubchak, Inna

    2004-07-04

    We have built a whole genome multiple alignment of the three currently available mammalian genomes using a fully automated pipeline which combines the local/global approach of the Berkeley Genome Pipeline and the LAGAN program. The strategy is based on progressive alignment, and consists of two main steps: (1) alignment of the mouse and rat genomes; and (2) alignment of human to either the mouse-rat alignments from step 1, or the remaining unaligned mouse and rat sequences. The resulting alignments demonstrate high sensitivity, with 87% of all human gene-coding areas aligned in both mouse and rat. The specificity is also high: <7% of the rat contigs are aligned to multiple places in human and 97% of all alignments with human sequence > 100kb agree with a three-way synteny map built independently using predicted exons in the three genomes. At the nucleotide level <1% of the rat nucleotides are mapped to multiple places in the human sequence in the alignment; and 96.5% of human nucleotides within all alignments agree with the synteny map. The alignments are publicly available online, with visualization through the novel Multi-VISTA browser that we also present.

  20. Precursor-centric genome-mining approach for lasso peptide discovery

    OpenAIRE

    Maksimov, Mikhail O.; Pelczer, István; Link, A. James

    2012-01-01

    Lasso peptides are a class of ribosomally synthesized posttranslationally modified natural products found in bacteria. Currently known lasso peptides have a diverse set of pharmacologically relevant activities, including inhibition of bacterial growth, receptor antagonism, and enzyme inhibition. The biosynthesis of lasso peptides is specified by a cluster of three genes encoding a precursor protein and two enzymes. Here we develop a unique genome-mining algorithm to identify lasso peptide gen...

  1. The design of aided software for osmotic stress responding genes mining in plant genome

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    A software tool and algorithm, based on a random sequence model, that uses osmotic stress responsive cis-elements from existing biological information sources was designed. It can infer the downstream function of Arabidopsis thaliana genes by analyzing their promoter regions, and it offers effective computer-aided analysis for mining osmotic stress responsive genes in the Arabidopsis thaliana genome. Practical application shows that this software can help analyze large amounts of gene data and provide important supporting evidence.
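    As a rough illustration of the kind of promoter-region analysis such a tool performs, the sketch below scans a promoter fragment for two commonly cited osmotic/drought-responsive core motifs (ABRE and DRE/CRT). The motif definitions and the promoter sequence are simplified examples, not the tool's actual cis-element library.

```python
# Rough illustration of promoter scanning for osmotic stress responsive
# cis-elements. The two core motifs below (ABRE, DRE/CRT) are commonly cited
# simplified examples; they are not the tool's actual element library.
import re

CIS_ELEMENTS = {
    "ABRE_core": re.compile(r"ACGTG[GT]C"),
    "DRE_CRT_core": re.compile(r"[AG]CCGAC"),
}

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def scan_promoter(promoter):
    promoter = promoter.upper()
    hits = []
    for strand, seq in (("+", promoter), ("-", reverse_complement(promoter))):
        for name, pattern in CIS_ELEMENTS.items():
            hits.extend((name, strand, m.start()) for m in pattern.finditer(seq))
    return hits

# Hypothetical upstream fragment containing one ABRE core and one DRE core.
promoter = "TTGATCACGTGGCATTAAGGCCGACCTTGTAAAC"
for name, strand, pos in scan_promoter(promoter):
    print(f"{name} on strand {strand} at offset {pos}")
```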

  2. Genome mining offers a new starting point for parasitology research.

    Science.gov (United States)

    Lv, Zhiyue; Wu, Zhongdao; Zhang, Limei; Ji, Pengyu; Cai, Yifeng; Luo, Shiqi; Wang, Hongxi; Li, Hao

    2015-02-01

    Parasites, including helminths, protozoa, and medically important arthropod vectors, are a major cause of global infectious diseases, affecting one-sixth of the world's population; they are responsible for enormous levels of morbidity and mortality and remain important impediments to economic development, especially in tropical countries. Prevalent drug resistance and the lack of highly effective and practical vaccines, as well as of specific and sensitive diagnostic markers, are proving to be challenging problems in parasitic disease control in most parts of the world. The impressive progress recently made in genome-wide analysis of parasites of medical importance, including the trematodes Clonorchis sinensis, Opisthorchis viverrini, Schistosoma haematobium, S. japonicum, and S. mansoni; the nematodes Brugia malayi, Loa loa, Necator americanus, Trichinella spiralis, and Trichuris suis; the cestodes Echinococcus granulosus, E. multilocularis, and Taenia solium; the protozoa Babesia bovis, B. microti, Cryptosporidium hominis, Eimeria falciformis, E. histolytica, Giardia intestinalis, Leishmania braziliensis, L. donovani, L. major, Plasmodium falciparum, P. vivax, Trichomonas vaginalis, Trypanosoma brucei and T. cruzi; and the medical arthropod vectors Aedes aegypti, Anopheles darlingi, A. sinensis, and Culex quinquefasciatus, is systematically covered in this review to provide a comprehensive understanding of the genetic information contained in the nuclear, mitochondrial, kinetoplast, plastid, or endosymbiotic bacterial genomes of parasites, to offer further valuable insight into parasite-host interactions, and to support the development of promising novel drug and vaccine candidates and better diagnostic tools, thereby underpinning the prevention and control of parasitic diseases.

  3. Mining the pig genome to investigate the domestication process.

    Science.gov (United States)

    Ramos-Onsins, S E; Burgos-Paz, W; Manunza, A; Amills, M

    2014-12-01

    Pig domestication began around 9000 YBP in the Fertile Crescent and Far East, involving marked morphological and genetic changes that occurred in a relatively short window of time. Identifying the alleles that drove the behavioural and physiological transformation of wild boars into pigs through artificial selection constitutes a formidable challenge that can only be faced from an interdisciplinary perspective. Indeed, although basic facts regarding the demography of pig domestication and dispersal have been uncovered, the biological substrate of these processes remains enigmatic. Considerable hope has been placed on new approaches, based on next-generation sequencing, which allow whole-genome variation to be analyzed at the population level. In this review, we provide an outline of the current knowledge on pig domestication by considering both archaeological and genetic data. Moreover, we discuss several potential scenarios of genome evolution under the complex mixture of demography and selection forces at play during domestication. Finally, we highlight several technical and methodological approaches that may represent significant advances in resolving the conundrum of livestock domestication.

  4. A pipeline for automated annotation of yeast genome sequences by a conserved-synteny approach

    Directory of Open Access Journals (Sweden)

    Proux-Wéra Estelle

    2012-09-01

    Full Text Available Abstract Background Yeasts are a model system for exploring eukaryotic genome evolution. Next-generation sequencing technologies are poised to vastly increase the number of yeast genome sequences, both from resequencing projects (population studies) and from de novo sequencing projects (new species). However, the annotation of genomes presents a major bottleneck for de novo projects, because it still relies on a process that is largely manual. Results Here we present the Yeast Genome Annotation Pipeline (YGAP), an automated system designed specifically for new yeast genome sequences lacking transcriptome data. YGAP does automatic de novo annotation, exploiting homology and synteny information from other yeast species stored in the Yeast Gene Order Browser (YGOB) database. The basic premises underlying YGAP's approach are that data from other species already tells us what genes we should expect to find in any particular genomic region and that we should also expect that orthologous genes are likely to have similar intron/exon structures. Additionally, it is able to detect probable frameshift sequencing errors and can propose corrections for them. YGAP searches intelligently for introns, and detects tRNA genes and Ty-like elements. Conclusions In tests on Saccharomyces cerevisiae and on the genomes of Naumovozyma castellii and Tetrapisispora blattae newly sequenced with Roche-454 technology, YGAP outperformed another popular annotation program (AUGUSTUS). For S. cerevisiae and N. castellii, 91-93% of YGAP's predicted gene structures were identical to those in previous manually curated gene sets. YGAP has been implemented as a webserver with a user-friendly interface at http://wolfe.gen.tcd.ie/annotation.

  5. Chapter 10: Mining genome-wide genetic markers.

    Directory of Open Access Journals (Sweden)

    Xiang Zhang

    Full Text Available Genome-wide association study (GWAS) aims to discover genetic factors underlying phenotypic traits. The large number of genetic factors poses both computational and statistical challenges. Various computational approaches have been developed for large-scale GWAS. In this chapter, we will discuss several widely used computational approaches in GWAS. The following topics will be covered: (1) an introduction to the background of GWAS; (2) the existing computational approaches that are widely used in GWAS, covering single-locus, epistasis detection, and machine learning methods that have been recently developed in the biology, statistics, and computer science communities (this part is the main focus of the chapter); and (3) the limitations of current approaches and future directions.
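    To make the single-locus idea concrete, the sketch below tests one SNP for association with case/control status using a chi-square test on genotype counts. The counts are fabricated for illustration, and multiple-testing correction across the genome is only noted in a comment.

```python
# Concrete single-locus association test for one SNP: chi-square test on a
# 2x3 table of genotype counts in cases vs controls. Counts are fabricated
# for illustration; multiple-testing correction across SNPs is omitted.
from scipy.stats import chi2_contingency

#                 AA   Aa   aa
case_counts    = [220, 510, 270]
control_counts = [300, 490, 210]

chi2, p_value, dof, expected = chi2_contingency([case_counts, control_counts])
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.3g}")

# A genome-wide scan would repeat this test per SNP and then adjust the
# p-values, e.g. with a Bonferroni threshold of 0.05 / number_of_snps.
```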

  6. A framework: Cluster detection and multidimensional visualization of automated data mining using intelligent agents

    CERN Document Server

    Jayabrabu, R; Vivekanandan, K

    2012-01-01

    Data mining techniques play a vital role in tasks such as extracting required knowledge and finding unsuspected information to support strategic decision-making in a novel way that is understandable by domain experts. A generalized framework is proposed that takes non-domain experts into account during the mining process, for better understanding, better decision-making and better discovery of new patterns, by selecting suitable data mining techniques based on the user profile by means of intelligent agents. KEYWORDS: Data Mining Techniques, Intelligent Agents, User Profile, Multidimensional Visualization, Knowledge Discovery.

  7. SeqMule: automated pipeline for analysis of human exome/genome sequencing data.

    Science.gov (United States)

    Guo, Yunfei; Ding, Xiaolei; Shen, Yufeng; Lyon, Gholson J; Wang, Kai

    2015-09-18

    Next-generation sequencing (NGS) technology has greatly helped us identify disease-contributory variants for Mendelian diseases. However, users are often faced with issues such as software compatibility, complicated configuration, and no access to a high-performance computing facility. Discrepancies exist among aligners and variant callers. We developed a computational pipeline, SeqMule, to perform automated variant calling from NGS data on human genomes and exomes. SeqMule integrates computational-cluster-free parallelization capability built on top of the variant callers, and facilitates normalization/intersection of variant calls to generate a consensus set with high confidence. SeqMule integrates 5 alignment tools and 5 variant calling algorithms and accepts various combinations, all via a one-line command, thereby allowing highly flexible yet fully automated variant calling. On a modern machine (2 Intel Xeon X5650 CPUs, 48 GB memory), when fast turn-around is needed, SeqMule generates annotated VCF files in a day from a 30X whole-genome sequencing data set; when more accurate calling is needed, SeqMule generates a consensus call set that improves over single callers, as measured by both Mendelian error rate and consistency. SeqMule supports Sun Grid Engine for parallel processing, offers a turn-key solution for deployment on Amazon Web Services, and allows quality checks, Mendelian error checks, consistency evaluation, and HTML-based reports. SeqMule is available at http://seqmule.openbioinformatics.org.
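    The consensus step can be illustrated in miniature: normalize each caller's variants to (chrom, pos, ref, alt) keys and keep those supported by a majority of callers. The sketch below is a simplified stand-in, not SeqMule's code, and the variant records are invented.

```python
# Miniature illustration of consensus variant calling (not SeqMule's code):
# normalize each caller's variants to (chrom, pos, ref, alt) keys and keep
# those supported by a majority of callers. Records below are invented.
from collections import Counter

calls_by_caller = {
    "caller_A": {("chr1", 101, "A", "G"), ("chr1", 250, "T", "C"), ("chr2", 42, "G", "T")},
    "caller_B": {("chr1", 101, "A", "G"), ("chr2", 42, "G", "T")},
    "caller_C": {("chr1", 101, "A", "G"), ("chr1", 250, "T", "C")},
}

def consensus(calls_by_caller, min_support=2):
    # Count how many callers report each normalized variant key.
    support = Counter(v for calls in calls_by_caller.values() for v in calls)
    return {variant for variant, n in support.items() if n >= min_support}

for variant in sorted(consensus(calls_by_caller)):
    print(variant)   # variants seen by at least two of the three callers
```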

  8. Design of Coal Mine Integrated Automation System Based on NetLinx

    Institute of Scientific and Technical Information of China (English)

    DING En-jie; ZHANG Shen

    2003-01-01

    A network structure for a coal mine integrated automation system based on NetLinx was proposed. The features of the three-layer network structure were discussed in detail, and the mechanism of network timing determinism was analyzed. A design example of the integrated automation system for a real coal mine was presented.

  9. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing

    Science.gov (United States)

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed the novel software package Genome-wide Microsatellite Analyzing Tool Package (GMATA), which integrates SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability assessment, and enables the simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allows it to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA to plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are the GA/TC dimer, the A/T monomer and the GCG/CGC trimer, rather than G/C-rich motifs. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar. PMID:27679641

  10. GMATA: an integrated software package for genome-scale SSR mining, marker development and viewing

    Directory of Open Access Journals (Sweden)

    Xuewen Wang

    2016-09-01

    Full Text Available Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed the novel software package Genome-wide Microsatellite Analyzing Tool Package (GMATA), which integrates SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability assessment, and enables the simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allows it to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA to plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are the GA/TC dimer, the A/T monomer and the GCG/CGC trimer, rather than G/C-rich motifs. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar.

  11. Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomics Data.

    Science.gov (United States)

    Bolser, Dan; Staines, Daniel M; Pritchard, Emily; Kersey, Paul

    2016-01-01

    Ensembl Plants ( http://plants.ensembl.org ) is an integrative resource presenting genome-scale information for a growing number of sequenced plant species (currently 33). Data provided includes genome sequence, gene models, functional annotation, and polymorphic loci. Various additional information is provided for variation data, including population structure, individual genotypes, linkage, and phenotype data. In each release, comparative analyses are performed on whole genome and protein sequences, and genome alignments and gene trees are made available that show the implied evolutionary history of each gene family. Access to the data is provided through a genome browser incorporating many specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These access routes are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests, and pollinators. Ensembl Plants is updated 4-5 times a year and is developed in collaboration with our international partners in the Gramene ( http://www.gramene.org ) and transPLANT ( http://www.transplantdb.org ) projects.

  12. Phylogeny-guided (meta)genome mining approach for the targeted discovery of new microbial natural products.

    Science.gov (United States)

    Kang, Hahk-Soo

    2017-02-01

    Genomics-based methods are now commonplace in natural products research. A phylogeny-guided mining approach provides a means to quickly screen a large number of microbial genomes or metagenomes in search of new biosynthetic gene clusters of interest. In this approach, biosynthetic genes serve as molecular markers, and phylogenetic trees built with known and unknown marker gene sequences are used to quickly prioritize biosynthetic gene clusters for characterization of their metabolites. An increase in the use of this approach has been observed over the last couple of years, along with the emergence of low-cost sequencing technologies. The aim of this review is to discuss the basic concept of a phylogeny-guided mining approach, and also to provide examples in which this approach was successfully applied to discover new natural products from microbial genomes and metagenomes. I believe that the phylogeny-guided mining approach will continue to play an important role in genomics-based natural products research.
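    The core of the approach, building a tree from marker-gene sequences so that sequences falling outside known clades can be prioritized, can be sketched with Biopython's distance-based tree construction. The "marker gene" fragments below are made up and already aligned; real use would start from full-length sequences and a proper aligner.

```python
# Sketch of the phylogeny-guided prioritization step using Biopython's
# distance-based tree construction. The marker fragments are made up and
# pre-aligned; real use would align full-length sequences first.
from Bio.Align import MultipleSeqAlignment
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.Phylo.TreeConstruction import DistanceCalculator, DistanceTreeConstructor
from Bio import Phylo

alignment = MultipleSeqAlignment([
    SeqRecord(Seq("MKTAYIAKQRQISFVKSHFSR"), id="known_compoundA"),
    SeqRecord(Seq("MKTAYLAKQRQISFVKAHFSR"), id="known_compoundB"),
    SeqRecord(Seq("MRTGYIPKQKQLSFIKSHWTR"), id="metagenome_bin_17"),
    SeqRecord(Seq("MKTAYIAKQRQLSFVKSHFSR"), id="isolate_G412"),
])

distances = DistanceCalculator("identity").get_distance(alignment)
tree = DistanceTreeConstructor().nj(distances)
Phylo.draw_ascii(tree)   # divergent leaves (e.g. metagenome_bin_17) get prioritized
```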

  13. A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation.

    Science.gov (United States)

    Aubrey, Wayne; Riley, Michael C; Young, Michael; King, Ross D; Oliver, Stephen G; Clare, Amanda

    2015-01-01

    Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel openly available deletion tool that consists of: 1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and 2) software to design the method's primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from S. pombe genome, and 85% of the ORFs from the L. lactis genome.
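    Scar-free deletion designs of this general kind rely on primers built from sequence immediately flanking the target element. The sketch below extracts hypothetical upstream and downstream flanks around an ORF to serve as primer design templates; it is a generic illustration, not the authors' primer design software, and the coordinates and sequence are invented.

```python
# Generic illustration (not the authors' primer-design software): extract the
# sequence immediately flanking a target ORF as templates for scar-free
# deletion primers. Coordinates and sequence are hypothetical.
import random

def revcomp(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def deletion_flanks(genome, orf_start, orf_end, arm_length=45):
    """Return (upstream_arm, downstream_arm) around a 0-based, end-exclusive ORF."""
    upstream = genome[max(0, orf_start - arm_length):orf_start]
    downstream = genome[orf_end:orf_end + arm_length]
    return upstream, downstream

# Hypothetical 300-bp genome fragment with a target "ORF" at 120..240.
random.seed(0)
genome = "".join(random.choice("ACGT") for _ in range(300))
up, down = deletion_flanks(genome, 120, 240)
forward_primer_template = up            # anneals upstream of the ORF
reverse_primer_template = revcomp(down) # anneals downstream, on the reverse strand
print(len(forward_primer_template), len(reverse_primer_template))
```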

  14. A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation.

    Directory of Open Access Journals (Sweden)

    Wayne Aubrey

    Full Text Available Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel openly available deletion tool that consists of: 1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and 2) software to design the method's primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from S. pombe genome, and 85% of the ORFs from the L. lactis genome.

  15. A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation

    Science.gov (United States)

    Aubrey, Wayne; Riley, Michael C.; Young, Michael; King, Ross D.; Oliver, Stephen G.; Clare, Amanda

    2015-01-01

    Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel openly available deletion tool that consists of: 1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and 2) software to design the method’s primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from S. pombe genome, and 85% of the ORFs from the L. lactis genome. PMID:26630677

  16. BioCreative Workshops for DOE Genome Sciences: Text Mining for Metagenomics

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Cathy H. [Univ. of Delaware, Newark, DE (United States). Center for Bioinformatics and Computational Biology; Hirschman, Lynette [The MITRE Corporation, Bedford, MA (United States)

    2016-10-29

    The objective of this project was to host BioCreative workshops to define and develop text mining tasks to meet the needs of the Genome Sciences community, focusing on metadata information extraction in metagenomics. Following the successful introduction of metagenomics at the BioCreative IV workshop, members of the metagenomics community and BioCreative communities continued discussion to identify candidate topics for a BioCreative metagenomics track for BioCreative V. Of particular interest was the capture of environmental and isolation source information from text. The outcome was to form a “community of interest” around work on the interactive EXTRACT system, which supported interactive tagging of environmental and species data. This experiment is included in the BioCreative V virtual issue of Database. In addition, there was broad participation by members of the metagenomics community in the panels held at BioCreative V, leading to valuable exchanges between the text mining developers and members of the metagenomics research community. These exchanges are reflected in a number of the overview and perspective pieces also being captured in the BioCreative V virtual issue. Overall, this conversation has exposed the metagenomics researchers to the possibilities of text mining, and educated the text mining developers to the specific needs of the metagenomics community.

  17. Multiplex Automated Genome Engineering

    Institute of Scientific and Technical Information of China (English)

    李丹; 高海军

    2015-01-01

    Genome editing technology is widely used in genome engineering research; site-specific nuclease technologies and the CRISPR/Cas system have made outstanding contributions to single-gene editing, but owing to the huge size of the genome these technologies still have certain limitations. Multiplex Automated Genome Engineering (MAGE) is a new genome editing technology that can act on multiple genes simultaneously; it is fast and efficient, and has been used for gene knockout and gene replacement in Escherichia coli. This review mainly introduces the principle, operating protocol and technical progress of MAGE, and discusses its development trend in the light of its applications.

  18. Strategies for the discovery of new natural products by genome mining.

    Science.gov (United States)

    Zerikly, Malek; Challis, Gregory L

    2009-03-02

    Natural products have a very broad spectrum of applications. Many natural products are used clinically as antibacterial, antifungal, antiparasitic, anticancer and immunosuppressive agents and are therefore of utmost importance for our society. When in the 1940s the golden age of antibiotics was ushered in, a "gold rush fever" of natural product discovery in the pharmaceutical industry ensued for many decades. However, the traditional process of discovering new bioactive natural products is generally long and laborious, and known natural products are frequently rediscovered. A mass-withdrawal of pharmaceutical companies from new natural product discovery and natural products research has thus occurred in recent years. In this article, the concept of genome mining for novel natural product discovery, which promises to provide a myriad of new bioactive natural compounds, is summarized and discussed. Genome mining for new natural product discovery exploits the huge and constantly increasing quantity of DNA sequence data from a wide variety of organisms that is accumulating in publicly accessible databases. Genes encoding enzymes likely to be involved in natural product biosynthesis can be readily located in sequenced genomes by use of computational sequence comparison tools. This information can be exploited in a variety of ways in the search for new bioactive natural products.

  19. Perspective: NanoMine: A material genome approach for polymer nanocomposites analysis and design

    Science.gov (United States)

    Zhao, He; Li, Xiaolin; Zhang, Yichi; Schadler, Linda S.; Chen, Wei; Brinson, L. Catherine

    2016-05-01

    Polymer nanocomposites are a designer class of materials where nanoscale particles, functional chemistry, and polymer resin combine to provide materials with unprecedented combinations of physical properties. In this paper, we introduce NanoMine, a data-driven web-based platform for analysis and design of polymer nanocomposite systems under the material genome concept. This open data resource strives to curate experimental and computational data on nanocomposite processing, structure, and properties, as well as to provide analysis and modeling tools that leverage curated data for material property prediction and design. With a continuously expanding dataset and toolkit, NanoMine encourages community feedback and input to construct a sustainable infrastructure that benefits nanocomposite material research and development.

  20. Discovery of phosphonic acid natural products by mining the genomes of 10,000 actinomycetes.

    Science.gov (United States)

    Ju, Kou-San; Gao, Jiangtao; Doroghazi, James R; Wang, Kwo-Kwang A; Thibodeaux, Christopher J; Li, Steven; Metzger, Emily; Fudala, John; Su, Joleen; Zhang, Jun Kai; Lee, Jaeheon; Cioni, Joel P; Evans, Bradley S; Hirota, Ryuichi; Labeda, David P; van der Donk, Wilfred A; Metcalf, William W

    2015-09-29

    Although natural products have been a particularly rich source of human medicines, activity-based screening results in a very high rate of rediscovery of known molecules. Based on the large number of natural product biosynthetic genes in microbial genomes, many have proposed "genome mining" as an alternative approach for discovery efforts; however, this idea has yet to be performed experimentally on a large scale. Here, we demonstrate the feasibility of large-scale, high-throughput genome mining by screening a collection of over 10,000 actinomycetes for the genetic potential to make phosphonic acids, a class of natural products with diverse and useful bioactivities. Genome sequencing identified a diverse collection of phosphonate biosynthetic gene clusters within 278 strains. These clusters were classified into 64 distinct groups, of which 55 are likely to direct the synthesis of unknown compounds. Characterization of strains within five of these groups resulted in the discovery of a new archetypical pathway for phosphonate biosynthesis, the first (to our knowledge) dedicated pathway for H-phosphinates, and 11 previously undescribed phosphonic acid natural products. Among these compounds are argolaphos, a broad-spectrum antibacterial phosphonopeptide composed of aminomethylphosphonate in peptide linkage to a rare amino acid N(5)-hydroxyarginine; valinophos, an N-acetyl l-Val ester of 2,3-dihydroxypropylphosphonate; and phosphonocystoximate, an unusual thiohydroximate-containing molecule representing a new chemotype of sulfur-containing phosphonate natural products. Analysis of the genome sequences from the remaining strains suggests that the majority of the phosphonate biosynthetic repertoire of Actinobacteria has been captured at the gene level. This dereplicated strain collection now provides a reservoir of numerous, as yet undiscovered, phosphonate natural products.

  1. Genome mining of the Streptomyces avermitilis genome and development of genome-minimized hosts for heterologous expression of biosynthetic gene clusters.

    Science.gov (United States)

    Ikeda, Haruo; Kazuo, Shin-ya; Omura, Satoshi

    2014-02-01

    To date, several actinomycete genomes have been completed and annotated. Among them, Streptomyces microorganisms are of major pharmaceutical interest because they are a rich source of numerous secondary metabolites. S. avermitilis is an industrial microorganism used for the production of an anthelmintic agent, avermectin, which is a commercially important antiparasitic agent in human and veterinary medicine and in agricultural pesticides. Genome analysis of S. avermitilis provides significant information not only for industrial applications but also for understanding the features of this genus. Genome mining of S. avermitilis has revealed that the microorganism harbors at least 38 secondary metabolite gene clusters and 46 insertion sequence (IS)-like sequences on the genome, which had not previously been surveyed. A significant use of the genome data of Streptomyces microorganisms is the construction of a versatile host for heterologous expression of exogenous biosynthetic gene clusters by genetic engineering. Since S. avermitilis is used as an industrial microorganism, it is already optimized for the efficient supply of primary metabolic precursors and biochemical energy to support multistep biosynthesis. The feasibility of large-deletion mutants of S. avermitilis has been confirmed by heterologous expression of more than 20 exogenous biosynthetic gene clusters.

  2. Towards fully automated structure-based function prediction in structural genomics: a case study.

    Science.gov (United States)

    Watson, James D; Sanderson, Steve; Ezersky, Alexandra; Savchenko, Alexei; Edwards, Aled; Orengo, Christine; Joachimiak, Andrzej; Laskowski, Roman A; Thornton, Janet M

    2007-04-13

    As the global Structural Genomics projects have picked up pace, the number of structures annotated in the Protein Data Bank as hypothetical protein or unknown function has grown significantly. A major challenge now involves the development of computational methods to assign functions to these proteins accurately and automatically. As part of the Midwest Center for Structural Genomics (MCSG) we have developed a fully automated functional analysis server, ProFunc, which performs a battery of analyses on a submitted structure. The analyses combine a number of sequence-based and structure-based methods to identify functional clues. After the first stage of the Protein Structure Initiative (PSI), we review the success of the pipeline and the importance of structure-based function prediction. As a dataset, we have chosen all structures solved by the MCSG during the 5 years of the first PSI. Our analysis suggests that two of the structure-based methods are particularly successful and provide examples of local similarity that is difficult to identify using current sequence-based methods. No one method is successful in all cases, so, through the use of a number of complementary sequence and structural approaches, the ProFunc server increases the chances that at least one method will find a significant hit that can help elucidate function. Manual assessment of the results is a time-consuming process and subject to individual interpretation and human error. We present a method based on the Gene Ontology (GO) schema using GO-slims that can allow the automated assessment of hits with a success rate approaching that of expert manual assessment.

  3. Genome mining expands the chemical diversity of the cyanobactin family to include highly modified linear peptides.

    Science.gov (United States)

    Leikoski, Niina; Liu, Liwei; Jokela, Jouni; Wahlsten, Matti; Gugger, Muriel; Calteau, Alexandra; Permi, Perttu; Kerfeld, Cheryl A; Sivonen, Kaarina; Fewer, David P

    2013-08-22

    Ribosomal peptides are produced through the posttranslational modification of short precursor peptides. Cyanobactins are a growing family of cyclic ribosomal peptides produced by cyanobacteria. However, a broad systematic survey of the genetic capacity to produce cyanobactins is lacking. Here we report the identification of 31 cyanobactin gene clusters from 126 genomes of cyanobacteria. Genome mining suggested a complex evolutionary history defined by horizontal gene transfer and rapid diversification of precursor genes. Extensive chemical analyses demonstrated that some cyanobacteria produce short linear cyanobactins with a chain length ranging from three to five amino acids. The linear peptides were N-prenylated and O-methylated on the N and C termini, respectively, and named aeruginosamide and viridisamide. These findings broaden the structural diversity of the cyanobactin family to include highly modified linear peptides with rare posttranslational modifications.

  4. Genome mining reveals unlocked bioactive potential of marine Gram-negative bacteria

    DEFF Research Database (Denmark)

    Machado, Henrique; Sonnenschein, Eva; Melchiorsen, Jette;

    2015-01-01

    of bioactive compounds leading to successful applications in pharmaceutical and biotech industries. Marine bacteria have so far not been exploited to the same extent; however, they are believed to harbor a multitude of novel bioactive chemistry. To explore this potential, genomes of 21 marine Alpha- and Gammaproteobacteria collected during the Galathea 3 expedition were sequenced and mined for natural product encoding gene clusters. Results: Independently of genome size, bacteria of all tested genera carried a large number of clusters encoding different potential bioactivities, especially within the Vibrionaceae and Pseudoalteromonadaceae families. A very high potential was identified in pigmented pseudoalteromonads with up to 20 clusters in a single strain, mostly NRPSs and NRPS-PKS hybrids. Furthermore, regulatory elements in bioactivity-related pathways including chitin metabolism, quorum sensing and iron scavenging systems were...

  5. Mining clinical attributes of genomic variants through assisted literature curation in Egas.

    Science.gov (United States)

    Matos, Sérgio; Campos, David; Pinho, Renato; Silva, Raquel M; Mort, Matthew; Cooper, David N; Oliveira, José Luís

    2016-01-01

    The veritable deluge of biological data over recent years has led to the establishment of a considerable number of knowledge resources that compile curated information extracted from the literature and store it in structured form, facilitating its use and exploitation. In this article, we focus on the curation of inherited genetic variants and associated clinical attributes, such as zygosity, penetrance or inheritance mode, and describe the use of Egas for this task. Egas is a web-based platform for text-mining assisted literature curation that focuses on usability through modern design solutions and simple user interactions. Egas offers a flexible and customizable tool that allows defining the concept types and relations of interest for a given annotation task, as well as the ontologies used for normalizing each concept type. Further, annotations may be performed on raw documents or on the results of automated concept identification and relation extraction tools. Users can inspect, correct or remove automatic text-mining results, manually add new annotations, and export the results to standard formats. Egas is compatible with the most recent versions of Google Chrome, Mozilla Firefox, Internet Explorer and Safari and is available for use at https://demo.bmd-software.com/egas/. Database URL: https://demo.bmd-software.com/egas/.

  6. Mining

    Directory of Open Access Journals (Sweden)

    Khairullah Khan

    2014-09-01

    Full Text Available Opinion mining is an interesting area of research because of its applications in various fields. Collecting opinions of people about products and about social and political events and problems through the Web is becoming increasingly popular every day. The opinions of users are helpful for the public and for stakeholders when making certain decisions. Opinion mining is a way to retrieve information through search engines, Web blogs and social networks. Because of the huge number of reviews in the form of unstructured text, it is impossible to summarize the information manually. Accordingly, efficient computational methods are needed for mining and summarizing the reviews from corpuses and Web documents. This study presents a systematic literature survey regarding the computational techniques, models and algorithms for mining opinion components from unstructured reviews.

  7. Genome mining and functional genomics for siderophore production in Aspergillus niger.

    Science.gov (United States)

    Franken, Angelique C W; Lechner, Beatrix E; Werner, Ernst R; Haas, Hubertus; Lokman, B Christien; Ram, Arthur F J; van den Hondel, Cees A M J J; de Weert, Sandra; Punt, Peter J

    2014-11-01

    Iron is an essential metal for many organisms, but the biologically relevant form of iron is scarce because of rapid oxidation resulting in low solubility. Simultaneously, excessive accumulation of iron is toxic. Consequently, iron uptake is a highly controlled process. In most fungal species, siderophores play a central role in iron handling. Siderophores are small iron-specific chelators that can be secreted to scavenge environmental iron or bind intracellular iron with high affinity. A second high-affinity iron uptake mechanism is reductive iron assimilation (RIA). As shown in Aspergillus fumigatus and Aspergillus nidulans, synthesis of siderophores in Aspergilli is predominantly under control of the transcription factors SreA and HapX, which are connected by a negative transcriptional feedback loop. Abolishing this fine-tuned regulation corroborates iron homeostasis, including heme biosynthesis, which could be biotechnologically of interest, e.g. the heterologous production of heme-dependent peroxidases. Aspergillus niger genome inspection identified orthologues of several genes relevant for RIA and siderophore metabolism, as well as sreA and hapX. Interestingly, genes related to synthesis of the common fungal extracellular siderophore triacetylfusarinine C were absent. Reverse-phase high-performance liquid chromatography (HPLC) confirmed the absence of triacetylfusarinine C, and demonstrated that the major secreted siderophores of A. niger are coprogen B and ferrichrome, which is also the dominant intracellular siderophore. In A. niger wild type grown under iron-replete conditions, the expression of genes involved in coprogen biosynthesis and RIA was low in the exponential growth phase but significantly induced during ascospore germination. Deletion of sreA in A. niger resulted in elevated iron uptake and increased cellular ferrichrome accumulation. Increased sensitivity toward phleomycin and high iron concentration reflected the toxic effects of excessive

  8. Epigenetic genome mining of an endophytic fungus leads to the pleiotropic biosynthesis of natural products.

    Science.gov (United States)

    Mao, Xu-Ming; Xu, Wei; Li, Dehai; Yin, Wen-Bing; Chooi, Yit-Heng; Li, Yong-Quan; Tang, Yi; Hu, Youcai

    2015-06-22

    The small-molecule biosynthetic potential of most filamentous fungi has remained largely unexplored and represents an attractive source for the discovery of new compounds. Genome sequencing of Calcarisporium arbuscula, a mushroom-endophytic fungus, revealed 68 core genes that are involved in natural product biosynthesis. This is in sharp contrast to the predominant production of the ATPase inhibitors aurovertin B and D in the wild-type fungus. Inactivation of a histone H3 deacetylase led to pleiotropic activation and overexpression of more than 75% of the biosynthetic genes. Sampling of the overproduced compounds led to the isolation of ten compounds, four of which contained new structures, including the cyclic peptides arbumycin and arbumelin, the diterpenoid arbuscullic acid A, and the meroterpenoid arbuscullic acid B. Such epigenetic modifications therefore provide a rapid and global approach to mine the chemical diversity of endophytic fungi.

  9. Genomic typing of Escherichia coli O157:H7 by semi-automated fluorescent AFLP analysis.

    Science.gov (United States)

    Zhao, S; Mitchell, S E; Meng, J; Kresovich, S; Doyle, M P; Dean, R E; Casa, A M; Weller, J W

    2000-02-01

    Escherichia coli serotype O157:H7 isolates were analyzed using a relatively new DNA fingerprinting method, amplified fragment length polymorphism (AFLP). Total genomic DNA was digested with two restriction endonucleases (EcoRI and MseI), and compatible oligonucleotide adapters were ligated to the ends of the resulting DNA fragments. Subsets of fragments from the total pool of cleaved DNA were then amplified by the polymerase chain reaction (PCR) using selective primers that extended beyond the adapter and restriction site sequences. One of the primers from each set was labeled with a fluorescent dye, which enabled amplified fragments to be detected and sized automatically on a DNA sequencer. Three AFLP primer sets generated a total of 37 unique genotypes among the 48 E. coli O157:H7 isolates tested. Prior fingerprinting analysis of large restriction fragments from these same isolates by pulsed-field gel electrophoresis (PFGE) resulted in only 21 unique DNA profiles. Also, AFLP fingerprinting was successful for one DNA sample that was not typable by PFGE, presumably because of template degradation. AFLP analysis, therefore, provided greater genetic resolution and was less sensitive to DNA quality than PFGE. Consequently, this DNA typing technology should be very useful for genetic subtyping of bacterial pathogens in epidemiologic studies.
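
    The record above describes the wet-lab AFLP workflow; the fragment-generation step can also be approximated in silico. The sketch below is a simplified illustration rather than the authors' pipeline: it cuts a sequence at EcoRI (G^AATTC) and MseI (T^TAA) recognition sites and reports fragment lengths, while adapter ligation and selective PCR are not modelled.

        # In silico double digest with EcoRI (G^AATTC) and MseI (T^TAA):
        # a simplified sketch of AFLP fragment generation.
        import re

        ENZYMES = {           # recognition site -> cut offset within the site
            "GAATTC": 1,      # EcoRI cuts G^AATTC
            "TTAA": 1,        # MseI cuts T^TAA
        }

        def digest(seq):
            """Return fragment lengths after a complete double digest."""
            seq = seq.upper()
            cuts = {0, len(seq)}
            for site, offset in ENZYMES.items():
                for m in re.finditer(site, seq):
                    cuts.add(m.start() + offset)
            positions = sorted(cuts)
            return [b - a for a, b in zip(positions, positions[1:])]

        if __name__ == "__main__":
            demo = "ACGTGAATTCGGTTAACCGAATTCTTAAGGC"   # toy sequence
            print(digest(demo))                        # -> [5, 8, 6, 6, 6]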

  10. Human genomic DNA analysis using a semi-automated sample preparation, amplification, and electrophoresis separation platform.

    Science.gov (United States)

    Raisi, Fariba; Blizard, Benjamin A; Raissi Shabari, Akbar; Ching, Jesus; Kintz, Gregory J; Mitchell, Jim; Lemoff, Asuncion; Taylor, Mike T; Weir, Fred; Western, Linda; Wong, Wendy; Joshi, Rekha; Howland, Pamela; Chauhan, Avinash; Nguyen, Peter; Petersen, Kurt E

    2004-03-01

    The growing importance of analyzing the human genome to detect hereditary and infectious diseases associated with specific DNA sequences has motivated us to develop automated devices to integrate sample preparation, real-time PCR, and microchannel electrophoresis (MCE). In this report, we present results from an optimized compact system capable of processing a raw sample of blood, extracting the DNA, and performing a multiplexed PCR reaction. Finally, an innovative electrophoretic separation was performed on the post-PCR products using a unique MCE system. The sample preparation system extracted and lysed white blood cells (WBC) from whole blood, producing DNA of sufficient quantity and quality for a polymerase chain reaction (PCR). Separation of multiple amplicons was achieved in a microfabricated channel 30 µm × 100 µm in cross section and 85 mm in length, filled with a replaceable methyl cellulose matrix operated under denaturing conditions at 50 °C. By incorporating fluorescent-labeled primers in the PCR, the amplicons were identified by a two-color (multiplexed) fluorescence detection system. Two base-pair resolution of single-stranded DNA (PCR products) was achieved. We believe that this integrated system provides a unique solution for DNA analysis.

  11. Automated Personal Email Organizer with Information Management and Text Mining Application

    Directory of Open Access Journals (Sweden)

    Dr. Sanjay Tanwani

    2012-04-01

    Full Text Available Email is one of the most ubiquitous applications, used regularly by millions of people worldwide. Professionals have to manage hundreds of emails on a daily basis, sometimes leading to overload and stress. Many emails go unanswered and sometimes remain unattended as time passes. Managing every single email takes considerable effort, especially when the email transaction log is very large. This work is focused on creating better ways of automatically organizing personal email messages. In this paper, a methodology for automated event information extraction from incoming email messages is proposed. The proposed methodology and the software based on it have helped to improve email management, reducing stress and enabling timely responses to emails.

  12. New strategies for medical data mining, part 3: automated workflow analysis and optimization.

    Science.gov (United States)

    Reiner, Bruce

    2011-02-01

    The practice of evidence-based medicine calls for the creation of "best practice" guidelines, leading to improved clinical outcomes. One of the primary factors limiting evidence-based medicine in radiology today is the relative paucity of standardized databases. The creation of standardized medical imaging databases offers the potential to enhance radiologist workflow and diagnostic accuracy through objective data-driven analytics, which can be categorized in accordance with specific variables relating to the individual examination, patient, provider, and technology being used. In addition to this "global" database analysis, "individual" radiologist workflow can be analyzed through the integration of electronic auditing tools into the PACS. The combination of these individual and global analyses can ultimately identify best practice patterns, which can be adapted to the individual attributes of end users and ultimately used in the creation of automated evidence-based medicine workflow templates.

  13. Genome mining demonstrates the widespread occurrence of gene clusters encoding bacteriocins in cyanobacteria.

    Science.gov (United States)

    Wang, Hao; Fewer, David P; Sivonen, Kaarina

    2011-01-01

    Cyanobacteria are a rich source of natural products with interesting biological activities. Many of these are peptides and the end products of a non-ribosomal pathway. However, several cyanobacterial peptide classes were recently shown to be produced through the proteolytic cleavage and post-translational modification of short precursor peptides. A new class of bacteriocins produced through the proteolytic cleavage and heterocyclization of precursor proteins was recently identified from marine cyanobacteria. Here we show the widespread occurrence of bacteriocin gene clusters in cyanobacteria through comparative analysis of 58 cyanobacterial genomes. A total of 145 bacteriocin gene clusters were discovered through genome mining. These clusters encoded 290 putative bacteriocin precursors. They ranged in length from 28 to 164 amino acids with very little sequence conservation of the core peptide. The gene clusters could be classified into seven groups according to their gene organization and domain composition. This classification is supported by phylogenetic analysis, which further indicated independent evolutionary trajectories of gene clusters in different groups. Our data suggests that cyanobacteria are a prolific source of low-molecular weight post-translationally modified peptides.
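
    Given the precursor lengths reported above (28 to 164 amino acids), a first-pass scan for candidate precursors can be illustrated with a few lines of code. The sketch below is not the authors' method: it only scans the forward strand of a DNA sequence for short open reading frames in that size range, while real precursor prediction would also use leader-peptide motifs and the context of the surrounding gene cluster.

        # Minimal sketch (not the study's pipeline): forward-strand ORFs whose
        # products fall in the 28-164 aa range reported for the precursors.
        STOPS = {"TAA", "TAG", "TGA"}

        def short_orfs(seq, min_aa=28, max_aa=164):
            seq = seq.upper()
            hits = []
            for frame in range(3):                      # forward frames only
                for i in range(frame, len(seq) - 2, 3):
                    if seq[i:i + 3] != "ATG":
                        continue
                    for j in range(i + 3, len(seq) - 2, 3):
                        if seq[j:j + 3] in STOPS:
                            aa_len = (j - i) // 3       # codons incl. initiator Met
                            if min_aa <= aa_len <= max_aa:
                                hits.append((i, j + 3, aa_len))
                            break
            return hits

        print(short_orfs("ATG" + "GCT" * 30 + "TAA"))   # -> [(0, 96, 31)]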

  14. Genome mining demonstrates the widespread occurrence of gene clusters encoding bacteriocins in cyanobacteria.

    Directory of Open Access Journals (Sweden)

    Hao Wang

    Full Text Available Cyanobacteria are a rich source of natural products with interesting biological activities. Many of these are peptides and the end products of a non-ribosomal pathway. However, several cyanobacterial peptide classes were recently shown to be produced through the proteolytic cleavage and post-translational modification of short precursor peptides. A new class of bacteriocins produced through the proteolytic cleavage and heterocyclization of precursor proteins was recently identified from marine cyanobacteria. Here we show the widespread occurrence of bacteriocin gene clusters in cyanobacteria through comparative analysis of 58 cyanobacterial genomes. A total of 145 bacteriocin gene clusters were discovered through genome mining. These clusters encoded 290 putative bacteriocin precursors. They ranged in length from 28 to 164 amino acids with very little sequence conservation of the core peptide. The gene clusters could be classified into seven groups according to their gene organization and domain composition. This classification is supported by phylogenetic analysis, which further indicated independent evolutionary trajectories of gene clusters in different groups. Our data suggests that cyanobacteria are a prolific source of low-molecular weight post-translationally modified peptides.

  15. Discovery of defense- and neuropeptides in social ants by genome-mining.

    Directory of Open Access Journals (Sweden)

    Christian W Gruber

    Full Text Available Natural peptides of great number and diversity occur in all organisms, but analyzing their peptidome is often difficult. With natural product drug discovery in mind, we devised a genome-mining approach to identify defense- and neuropeptides in the genomes of social ants from Atta cephalotes (leaf-cutter ant), Camponotus floridanus (carpenter ant) and Harpegnathos saltator (basal genus). Numerous peptide-encoding genes of defense peptides, in particular defensins, and neuropeptides or regulatory peptide hormones, such as allatostatins and tachykinins, were identified and analyzed. Most interestingly we annotated genes that encode oxytocin/vasopressin-related peptides (inotocins) and their putative receptors. This is the first piece of evidence for the existence of this nonapeptide hormone system in ants (Formicidae) and supports recent findings in Tribolium castaneum (red flour beetle) and Nasonia vitripennis (parasitoid wasp), and therefore its confinement to some basal holometabolous insects. By contrast, the absence of the inotocin hormone system in Apis mellifera (honeybee), another closely-related member of the eusocial Hymenoptera clade, establishes the basis for future studies on the molecular evolution and physiological function of oxytocin/vasopressin-related peptides (vasotocin nonapeptide family) and their receptors in social insects. Particularly the identification of ant inotocin and defensin peptide sequences will provide a basis for future pharmacological characterization in the quest for potent and selective lead compounds of therapeutic value.

  16. Discovery of defense- and neuropeptides in social ants by genome-mining.

    Science.gov (United States)

    Gruber, Christian W; Muttenthaler, Markus

    2012-01-01

    Natural peptides of great number and diversity occur in all organisms, but analyzing their peptidome is often difficult. With natural product drug discovery in mind, we devised a genome-mining approach to identify defense- and neuropeptides in the genomes of social ants from Atta cephalotes (leaf-cutter ant), Camponotus floridanus (carpenter ant) and Harpegnathos saltator (basal genus). Numerous peptide-encoding genes of defense peptides, in particular defensins, and neuropeptides or regulatory peptide hormones, such as allatostatins and tachykinins, were identified and analyzed. Most interestingly we annotated genes that encode oxytocin/vasopressin-related peptides (inotocins) and their putative receptors. This is the first piece of evidence for the existence of this nonapeptide hormone system in ants (Formicidae) and supports recent findings in Tribolium castaneum (red flour beetle) and Nasonia vitripennis (parasitoid wasp), and therefore its confinement to some basal holometabolous insects. By contrast, the absence of the inotocin hormone system in Apis mellifera (honeybee), another closely-related member of the eusocial Hymenoptera clade, establishes the basis for future studies on the molecular evolution and physiological function of oxytocin/vasopressin-related peptides (vasotocin nonapeptide family) and their receptors in social insects. Particularly the identification of ant inotocin and defensin peptide sequences will provide a basis for future pharmacological characterization in the quest for potent and selective lead compounds of therapeutic value.

  17. RiceGeneThresher: a web-based application for mining genes underlying QTL in rice genome.

    Science.gov (United States)

    Thongjuea, Supat; Ruanjaichon, Vinitchan; Bruskiewich, Richard; Vanavichit, Apichart

    2009-01-01

    RiceGeneThresher is a public online resource for mining genes underlying genome regions of interest or quantitative trait loci (QTL) in the rice genome. It is a compendium of rice genomic resources consisting of genetic markers, genome annotation, expressed sequence tags (ESTs), protein domains, gene ontology, plant stress-responsive genes, metabolic pathways and prediction of protein-protein interactions. RiceGeneThresher integrates these diverse data sources and provides powerful web-based applications and flexible tools for delivering customized sets of biological data on rice. The system supports whole-genome gene mining for QTL by querying with DNA marker intervals or genomic loci. RiceGeneThresher provides biologically supported evidence that is essential for targeting groups or networks of genes involved in controlling traits underlying QTL. Users can use it to discover and assign the most promising candidate genes in preparation for further gene function validation analysis. The web-based application is freely available at http://rice.kps.ku.ac.th.
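
    The core query described above, returning annotated genes that fall within a marker-delimited QTL interval, reduces to a simple interval-overlap filter. The sketch below illustrates only that idea; the gene identifiers and coordinates are hypothetical placeholders, not records from the RiceGeneThresher database.

        # Interval-overlap query: genes overlapping a marker-delimited QTL
        # interval.  Gene records are hypothetical placeholders.
        from collections import namedtuple

        Gene = namedtuple("Gene", "id chrom start end description")

        ANNOTATION = [
            Gene("gene_A", "chr1", 2903, 10817, "placeholder annotation"),
            Gene("gene_B", "chr1", 11218, 12435, "placeholder annotation"),
            Gene("gene_C", "chr2", 5000, 9000, "placeholder annotation"),
        ]

        def genes_in_interval(chrom, start, end, annotation=ANNOTATION):
            """IDs of genes whose coordinates overlap [start, end] on chrom."""
            return [g.id for g in annotation
                    if g.chrom == chrom and g.start <= end and g.end >= start]

        print(genes_in_interval("chr1", 10000, 12000))   # -> ['gene_A', 'gene_B']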

  18. EST Pipeline System: Detailed and Automated EST Data Processing and Mining

    Institute of Scientific and Technical Information of China (English)

    Hao Xu; Liang Zhang; Hong Yu; Yan Zhou; Ling He; Yuanzhong Zhu; Wei Huang; Lijun Fang; Lin Tao; Yuedong Zhu; Lin Cai; Huayong Xu

    2003-01-01

    Expressed sequence tags (ESTs) are widely used in gene survey research. The EST Pipeline System, software developed by Hangzhou Genomics Institute (HGI), can automatically analyze EST sequence sets of different scales with suitable methods. All the analysis reports, including those of vector masking, sequence assembly, gene annotation, Gene Ontology classification, and some other analyses, can be browsed and searched, as well as downloaded in Excel format, from the web interface, sparing researchers routine data processing and letting them focus on the biological rules embedded in the data.

  19. Automated integration of genomic physical mapping data via parallel simulated annealing

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T.

    1994-06-01

    The Human Genome Center at the Lawrence Livermore National Laboratory (LLNL) is nearing closure on a high-resolution physical map of human chromosome 19. We have built automated tools to assemble 15,000 fingerprinted cosmid clones into 800 contigs with minimal spanning paths identified. These islands are being ordered, oriented, and spanned by a variety of other techniques including: fluorescence in situ hybridization (FISH) at 3 levels of resolution, EcoRI restriction fragment mapping across all contigs, and a multitude of different hybridization and PCR techniques to link cosmid, YAC, BAC, PAC, and P1 clones. The FISH data provide us with partial order and distance data as well as orientation. We made the observation that map builders need a much rougher presentation of data than do map readers; the former wish to see raw data since these can expose errors or interesting biology. We further noted that by ignoring our length and distance data we could simplify our problem into one that could be readily attacked with optimization techniques. The data integration problem could then be seen as an M x N ordering of our N cosmid clones which "intersect" M larger objects, by defining "intersection" to mean either contig/map membership or hybridization results. Clearly, the goal of making an integrated map is now to rearrange the N cosmid clone "columns" such that the number of gaps on the object "rows" is minimized. Our FISH partially-ordered cosmid clones provide us with a set of constraints that cannot be violated by the rearrangement process. We solved the optimization problem via simulated annealing performed on a network of 40+ Unix machines in parallel, using a server/client model built on explicit socket calls. For current maps we can create a map in about 4 hours on the parallel net versus 4+ days on a single workstation. Our biologists are now using this software on a daily basis to guide their efforts toward final closure.
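
    The annealing formulation sketched in this abstract, reorder the clone "columns" so that each object "row" has as few gaps as possible, can be illustrated with a small serial implementation. The code below is a toy sketch of that framing only; the FISH ordering constraints and the parallel client/server machinery described above are not modelled.

        # Toy serial sketch: columns are clones, rows are larger objects
        # (sets of clone indices); the cost of an ordering is the number of
        # gaps interrupting each row's members.
        import math, random

        def row_gaps(row, order):
            runs, prev = 0, False
            for member in (c in row for c in order):
                if member and not prev:
                    runs += 1
                prev = member
            return max(runs - 1, 0)          # gaps = runs of members minus one

        def cost(rows, order):
            return sum(row_gaps(r, order) for r in rows)

        def anneal(rows, n_cols, steps=20000, t0=2.0, cooling=0.9995, seed=0):
            rng = random.Random(seed)
            order = list(range(n_cols))
            best, cur = list(order), cost(rows, order)
            best_cost, t = cur, t0
            for _ in range(steps):
                i, j = rng.sample(range(n_cols), 2)
                order[i], order[j] = order[j], order[i]
                new = cost(rows, order)
                if new <= cur or rng.random() < math.exp((cur - new) / t):
                    cur = new
                    if cur < best_cost:
                        best_cost, best = cur, list(order)
                else:
                    order[i], order[j] = order[j], order[i]   # reject: undo swap
                t *= cooling
            return best, best_cost

        # toy example: 6 clones, 2 objects whose members should end up contiguous
        rows = [{0, 2, 4}, {1, 3, 5}]
        print(anneal(rows, 6))   # best ordering groups each set contiguously (cost 0)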

  20. Automated image processing and analysis of cartilage MRI: enabling technology for data mining applied to osteoarthritis

    Science.gov (United States)

    Tameem, Hussain Z.; Sinha, Usha S.

    2011-01-01

    Osteoarthritis (OA) is a heterogeneous and multi-factorial disease characterized by the progressive loss of articular cartilage. Magnetic Resonance Imaging has been established as an accurate technique to assess cartilage damage through both cartilage morphology (volume and thickness) and cartilage water mobility (Spin-lattice relaxation, T2). The Osteoarthritis Initiative, OAI, is a large scale serial assessment of subjects at different stages of OA including those with pre-clinical symptoms. The electronic availability of the comprehensive data collected as part of the initiative provides an unprecedented opportunity to discover new relationships in complex diseases such as OA. However, imaging data, which provides the most accurate non-invasive assessment of OA, is not directly amenable for data mining. Changes in morphometry and relaxivity with OA disease are both complex and subtle, making manual methods extremely difficult. This chapter focuses on the image analysis techniques to automatically localize the differences in morphometry and relaxivity changes in different population sub-groups (normal and OA subjects segregated by age, gender, and race). The image analysis infrastructure will enable automatic extraction of cartilage features at the voxel level; the ultimate goal is to integrate this infrastructure to discover relationships between the image findings and other clinical features. PMID:21785520

  1. Automated image processing and analysis of cartilage MRI: enabling technology for data mining applied to osteoarthritis.

    Science.gov (United States)

    Tameem, Hussain Z; Sinha, Usha S

    2007-01-01

    Osteoarthritis (OA) is a heterogeneous and multi-factorial disease characterized by the progressive loss of articular cartilage. Magnetic Resonance Imaging has been established as an accurate technique to assess cartilage damage through both cartilage morphology (volume and thickness) and cartilage water mobility (Spin-lattice relaxation, T2). The Osteoarthritis Initiative, OAI, is a large scale serial assessment of subjects at different stages of OA including those with pre-clinical symptoms. The electronic availability of the comprehensive data collected as part of the initiative provides an unprecedented opportunity to discover new relationships in complex diseases such as OA. However, imaging data, which provides the most accurate non-invasive assessment of OA, is not directly amenable for data mining. Changes in morphometry and relaxivity with OA disease are both complex and subtle, making manual methods extremely difficult. This chapter focuses on the image analysis techniques to automatically localize the differences in morphometry and relaxivity changes in different population sub-groups (normal and OA subjects segregated by age, gender, and race). The image analysis infrastructure will enable automatic extraction of cartilage features at the voxel level; the ultimate goal is to integrate this infrastructure to discover relationships between the image findings and other clinical features.

  2. Automated web usage data mining and recommendation system using K-Nearest Neighbor (KNN) classification method

    Directory of Open Access Journals (Sweden)

    D.A. Adeniyi

    2016-01-01

    Full Text Available The major problem of many on-line web sites is the presentation of many choices to the client at a time; this usually results in a strenuous and time-consuming task when finding the right product or information on the site. In this work, we present a study of automatic web usage data mining and a recommendation system based on the current user's behavior through his/her click-stream data on the newly developed Really Simple Syndication (RSS) reader website, in order to provide relevant information to the individual without explicitly asking for it. The K-Nearest-Neighbor (KNN) classification method has been trained to be used on-line and in real time to identify clients'/visitors' click-stream data, matching it to a particular user group and recommending a tailored browsing option that meets the need of the specific user at a particular time. To achieve this, the web users' RSS address file was extracted, cleansed, formatted and grouped into meaningful sessions, and a data mart was developed. Our results show that the K-Nearest Neighbor classifier is transparent, consistent, straightforward, simple to understand, has a high tendency to possess desirable qualities, and is easier to implement than most other machine learning techniques, specifically when there is little or no prior knowledge about the data distribution.
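
    A minimal illustration of the classification step follows, using scikit-learn's KNeighborsClassifier purely for convenience; the study describes its own implementation and feature engineering, and the per-session feature vectors and group labels below are invented toy data, not the paper's RSS data mart.

        # Toy KNN classification of click-stream sessions into user groups.
        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier

        # columns: [news, sport, tech] page counts per session (toy data)
        X_train = np.array([[8, 0, 1], [7, 1, 0], [0, 9, 2],
                            [1, 8, 1], [0, 1, 9], [2, 0, 7]])
        y_train = np.array(["news_reader", "news_reader", "sport_fan",
                            "sport_fan", "tech_fan", "tech_fan"])

        model = KNeighborsClassifier(n_neighbors=3)
        model.fit(X_train, y_train)

        new_session = np.array([[1, 0, 8]])
        print(model.predict(new_session))   # -> ['tech_fan']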

  3. An Automated Real-Time System for Opinion Mining using a Hybrid Approach

    Directory of Open Access Journals (Sweden)

    Indrajit Mukherjee

    2016-07-01

    Full Text Available In this paper, a novel idea is presented to perform opinion mining in a simple and efficient manner with the help of the One-Level-Tree (OLT) based approach, in order to recognize opinions specific to features in customer reviews in which a variety of features are commingled with diverse emotions. Unlike some previous efforts that relied entirely on one-time structured or filtered data, this work is based solely on unstructured data obtained in real time from Twitter. The hybrid approach utilizes the associations defined in dependency parsing grammar and fully employs Double Propagation to extract new features and related new opinions within the review. A dictionary-based approach is used to expand the opinion lexicon. Within the dependency parsing relations, a new relation is proposed to more effectively catch the associations between opinions and features. Three new methods are proposed, termed Double Positive Double Negative (DPDN), Catch-Phrase Method (CPM) and Negation Check (NC), for performing criteria-specific evaluations. The OLT approach conveniently displays the relationship between the features and their opinions in an elementary fashion in the form of a graph. The proposed system achieves high accuracy across all domains and also performs better than state-of-the-art systems.

  4. Semi-automated curation of protein subcellular localization: a text mining-based approach to Gene Ontology (GO) Cellular Component curation

    Directory of Open Access Journals (Sweden)

    Chan Juancarlos

    2009-07-01

    Full Text Available Abstract Background Manual curation of experimental data from the biomedical literature is an expensive and time-consuming endeavor. Nevertheless, most biological knowledge bases still rely heavily on manual curation for data extraction and entry. Text mining software that can semi- or fully automate information retrieval from the literature would thus provide a significant boost to manual curation efforts. Results We employ the Textpresso category-based information retrieval and extraction system http://www.textpresso.org, developed by WormBase, to explore how Textpresso might improve the efficiency with which we manually curate C. elegans proteins to the Gene Ontology's Cellular Component Ontology. Using a training set of sentences that describe results of localization experiments in the published literature, we generated three new curation task-specific categories (Cellular Components, Assay Terms, and Verbs) containing words and phrases associated with reports of experimentally determined subcellular localization. We compared the results of manual curation to that of Textpresso queries that searched the full text of articles for sentences containing terms from each of the three new categories plus the name of a previously uncurated C. elegans protein, and found that Textpresso searches identified curatable papers with recall and precision rates of 79.1% and 61.8%, respectively (F-score of 69.5%), when compared to manual curation. Within those documents, Textpresso identified relevant sentences with recall and precision rates of 30.3% and 80.1% (F-score of 44.0%). From returned sentences, curators were able to make 66.2% of all possible experimentally supported GO Cellular Component annotations with 97.3% precision (F-score of 78.8%). Measuring the relative efficiencies of Textpresso-based versus manual curation we find that Textpresso has the potential to increase curation efficiency by at least 8-fold, and perhaps as much as 15-fold, given...
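
    The retrieval idea, keeping a sentence only if it mentions the protein of interest plus at least one term from each task-specific category, and the precision/recall bookkeeping can be sketched in a few lines. The category word lists and the example sentences below are invented placeholders, not Textpresso's actual categories or corpus.

        # Category-based sentence filter plus precision/recall bookkeeping.
        # Category term lists and example sentences are placeholders only.
        CATEGORIES = {
            "cellular_component": {"nucleus", "mitochondria", "plasma membrane"},
            "assay_terms": {"gfp", "antibody", "immunostaining"},
            "verbs": {"localizes", "localized", "expressed"},
        }

        def sentence_matches(sentence, protein):
            s = sentence.lower()
            return protein.lower() in s and all(
                any(term in s for term in terms) for terms in CATEGORIES.values()
            )

        def precision_recall(retrieved, relevant):
            retrieved, relevant = set(retrieved), set(relevant)
            tp = len(retrieved & relevant)
            precision = tp / len(retrieved) if retrieved else 0.0
            recall = tp / len(relevant) if relevant else 0.0
            return precision, recall

        sentences = [
            "UNC-1::GFP localizes to the plasma membrane of neurons.",
            "unc-1 mutants move slowly on plates.",
        ]
        hits = [s for s in sentences if sentence_matches(s, "unc-1")]
        print(precision_recall(hits, relevant=[sentences[0]]))   # -> (1.0, 1.0)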

  5. Comparative analysis of a cryptic thienamycin-like gene cluster identified in Streptomyces flavogriseus by genome mining.

    Science.gov (United States)

    Blanco, Gloria

    2012-06-01

    In silico database searches allowed the identification in the S. flavogriseus ATCC 33331 genome of a carbapenem gene cluster highly related to the thienamycin cluster of S. cattleya. This is the second cluster found for a complex, highly substituted carbapenem. Comparative analysis revealed that both gene clusters display a high degree of synteny in gene organization and in protein conservation. Although the cluster appears to be silent under our laboratory conditions, the putative metabolic product was predicted from bioinformatics analyses using sequence comparison tools. These data, together with previous reports concerning epithienamycin production by S. flavogriseus strains, suggest that the cluster's metabolic product might be a thienamycin-like carbapenem, possibly the epimeric epithienamycin. This finding might help in understanding the biosynthetic pathway to thienamycin and other highly substituted carbapenems. It also provides another example of genome mining in sequenced Streptomyces genomes as a powerful approach for novel antibiotic discovery.

  6. Draft genome sequence of Sinorhizobium meliloti CCNWSX0020, a nitrogen-fixing symbiont with copper tolerance capability isolated from lead-zinc mine tailings.

    Science.gov (United States)

    Li, Zhefei; Ma, Zhanqiang; Hao, Xiuli; Wei, Gehong

    2012-03-01

    Sinorhizobium meliloti CCNWSX0020, which can establish a symbiotic relationship with Medicago species, was isolated from Medicago lupulina plants growing in lead-zinc mine tailings. The genome of this bacterium also contains a number of protein-coding sequences related to metal tolerance. We anticipate that the genome sequence will provide valuable information for exploring environmental bioremediation.

  7. Evaluation of Three Automated Genome Annotations for Halorhabdus utahensis

    DEFF Research Database (Denmark)

    Bakke, Peter; Carney, Nick; DeLoache, Will

    2009-01-01

    in databases such as NCBI and used to validate subsequent annotation errors. We submitted the genome sequence of halophilic archaeon Halorhabdus utahensis to be analyzed by three genome annotation services. We have examined the output from each service in a variety of ways in order to compare the methodology...

  8. Data Mining Approaches for Genomic Biomarker Development: Applications Using Drug Screening Data from the Cancer Genome Project and the Cancer Cell Line Encyclopedia.

    Science.gov (United States)

    Covell, David G

    2015-01-01

    Developing reliable biomarkers of tumor cell drug sensitivity and resistance can guide hypothesis-driven basic science research and influence pre-therapy clinical decisions. A popular strategy for developing biomarkers uses characterizations of human tumor samples against a range of cancer drug responses that correlate with genomic change; developed largely from the efforts of the Cancer Cell Line Encyclopedia (CCLE) and Sanger Cancer Genome Project (CGP). The purpose of this study is to provide an independent analysis of this data that aims to vet existing and add novel perspectives to biomarker discoveries and applications. Existing and alternative data mining and statistical methods will be used to a) evaluate drug responses of compounds with similar mechanism of action (MOA), b) examine measures of gene expression (GE), copy number (CN) and mutation status (MUT) biomarkers, combined with gene set enrichment analysis (GSEA), for hypothesizing biological processes important for drug response, c) conduct global comparisons of GE, CN and MUT as biomarkers across all drugs screened in the CGP dataset, and d) assess the positive predictive power of CGP-derived GE biomarkers as predictors of drug response in CCLE tumor cells. The perspectives derived from individual and global examinations of GEs, MUTs and CNs confirm existing and reveal unique and shared roles for these biomarkers in tumor cell drug sensitivity and resistance. Applications of CGP-derived genomic biomarkers to predict the drug response of CCLE tumor cells finds a highly significant ROC, with a positive predictive power of 0.78. The results of this study expand the available data mining and analysis methods for genomic biomarker development and provide additional support for using biomarkers to guide hypothesis-driven basic science research and pre-therapy clinical decisions.
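
    The two summary statistics mentioned above, the ROC and the positive predictive power of a biomarker-based sensitivity call, can be illustrated on toy data as follows; this is not the study's analysis, and the 0.5 threshold is an arbitrary choice for the example.

        # ROC AUC and positive predictive value for a toy biomarker score.
        import numpy as np
        from sklearn.metrics import roc_auc_score

        y_true = np.array([1, 1, 1, 0, 0, 1, 0, 0])       # 1 = drug-sensitive line
        y_score = np.array([0.9, 0.8, 0.4, 0.3, 0.2, 0.7, 0.6, 0.1])  # biomarker score

        print("ROC AUC:", roc_auc_score(y_true, y_score))

        threshold = 0.5                                   # arbitrary cut-off
        y_pred = (y_score >= threshold).astype(int)
        tp = int(np.sum((y_pred == 1) & (y_true == 1)))
        fp = int(np.sum((y_pred == 1) & (y_true == 0)))
        ppv = tp / (tp + fp) if (tp + fp) else float("nan")
        print("Positive predictive value at 0.5:", ppv)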

  9. Data Mining Approaches for Genomic Biomarker Development: Applications Using Drug Screening Data from the Cancer Genome Project and the Cancer Cell Line Encyclopedia.

    Directory of Open Access Journals (Sweden)

    David G Covell

    Full Text Available Developing reliable biomarkers of tumor cell drug sensitivity and resistance can guide hypothesis-driven basic science research and influence pre-therapy clinical decisions. A popular strategy for developing biomarkers uses characterizations of human tumor samples against a range of cancer drug responses that correlate with genomic change; developed largely from the efforts of the Cancer Cell Line Encyclopedia (CCLE) and Sanger Cancer Genome Project (CGP). The purpose of this study is to provide an independent analysis of this data that aims to vet existing and add novel perspectives to biomarker discoveries and applications. Existing and alternative data mining and statistical methods will be used to a) evaluate drug responses of compounds with similar mechanism of action (MOA), b) examine measures of gene expression (GE), copy number (CN) and mutation status (MUT) biomarkers, combined with gene set enrichment analysis (GSEA), for hypothesizing biological processes important for drug response, c) conduct global comparisons of GE, CN and MUT as biomarkers across all drugs screened in the CGP dataset, and d) assess the positive predictive power of CGP-derived GE biomarkers as predictors of drug response in CCLE tumor cells. The perspectives derived from individual and global examinations of GEs, MUTs and CNs confirm existing and reveal unique and shared roles for these biomarkers in tumor cell drug sensitivity and resistance. Applications of CGP-derived genomic biomarkers to predict the drug response of CCLE tumor cells finds a highly significant ROC, with a positive predictive power of 0.78. The results of this study expand the available data mining and analysis methods for genomic biomarker development and provide additional support for using biomarkers to guide hypothesis-driven basic science research and pre-therapy clinical decisions.

  10. Genome-wide mining, characterization, and development of microsatellite markers in Marsupenaeus japonicus by genome survey sequencing

    Science.gov (United States)

    Lu, Xia; Luan, Sheng; Kong, Jie; Hu, Longyang; Mao, Yong; Zhong, Shengping

    2017-01-01

    The kuruma prawn, Marsupenaeus japonicus, is one of the most cultivated and consumed species of shrimp. However, very few molecular genetic/genomic resources are publicly available for it. Thus, the characterization and distribution of simple sequence repeats (SSRs) remains ambiguous and the use of SSR markers in genomic studies and marker-assisted selection is limited. The goal of this study is to characterize and develop genome-wide SSR markers in M. japonicus by genome survey sequencing for application in comparative genomics and breeding. A total of 326 945 perfect SSRs were identified, among which dinucleotide repeats were the most frequent class (44.08%), followed by mononucleotides (29.67%), trinucleotides (18.96%), tetranucleotides (5.66%), hexanucleotides (1.07%), and pentanucleotides (0.56%). In total, 151 541 SSR loci primers were successfully designed. A subset of 30 SSR primer pairs were synthesized and tested in 42 individuals from a wild population, of which 27 loci (90.0%) were successfully amplified with specific products and 24 (80.0%) were polymorphic. For the amplified polymorphic loci, the alleles ranged from 5 to 17 (with an average of 9.63), and the average PIC value was 0.796. A total of 58 256 SSR-containing sequences had significant Gene Ontology annotation; these are good functional molecular marker candidates for association studies and comparative genomic analysis. The newly identified SSRs significantly contribute to the M. japonicus genomic resources and will facilitate a number of genetic and genomic studies, including high density linkage mapping, genome-wide association analysis, marker-aided selection, comparative genomics analysis, population genetics, and evolution.
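
    A minimal sketch of perfect-SSR detection of the kind described above follows: a backreference regex finds motifs of 1 to 6 bp repeated in tandem. The 12 bp minimum total length is an illustrative threshold, not the exact criteria used in the study.

        # Regex-based scan for perfect SSRs with motif lengths of 1-6 bp.
        # The 12 bp minimum total length is an illustrative threshold only.
        import re

        SSR_RE = re.compile(r"([ACGT]{1,6}?)\1+")

        def find_ssrs(seq, min_len=12):
            hits = []
            for m in SSR_RE.finditer(seq.upper()):
                motif, total = m.group(1), len(m.group(0))
                if total >= min_len:
                    hits.append((m.start(), motif, total // len(motif)))
            return hits

        demo = "GGCT" + "AG" * 10 + "TTTTTTTTTTTTT" + "ACGTA"
        print(find_ssrs(demo))   # -> [(4, 'AG', 10), (24, 'T', 13)]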

  11. antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters

    DEFF Research Database (Denmark)

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth

    2015-01-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products...

  12. Identification and activation of novel biosynthetic gene clusters by genome mining in the kirromycin producer Streptomyces collinus Tü 365

    DEFF Research Database (Denmark)

    Iftime, Dumitrita; Kulik, Andreas; Härtner, Thomas;

    2016-01-01

    Streptomycetes are prolific sources of novel biologically active secondary metabolites with pharmaceutical potential. S. collinus Tü 365 is a Streptomyces strain, isolated 1972 from Kouroussa (Guinea). It is best known as producer of the antibiotic kirromycin, an inhibitor of the protein biosynthesis interacting with elongation factor EF-Tu. Genome mining revealed 32 gene clusters encoding the biosynthesis of diverse secondary metabolites in the genome of Streptomyces collinus Tü 365, indicating an enormous biosynthetic potential of this strain. The structural diversity of secondary metabolisms predicted for S. collinus Tü 365 includes PKS, NRPS, PKS-NRPS hybrids, a lanthipeptide, terpenes and siderophores. While some of these gene clusters were found to contain genes related to known secondary metabolites, which also could be detected in HPLC–MS analyses, most of the uncharacterized...

  13. Discovery of novel targets for multi-epitope vaccines: Screening of HIV-1 genomes using association rule mining

    Directory of Open Access Journals (Sweden)

    Piontkivska Helen

    2009-07-01

    Full Text Available Abstract Background Studies have shown that in the genome of human immunodeficiency virus (HIV-1), regions responsible for interactions with the host's immune system, namely cytotoxic T-lymphocyte (CTL) epitopes, tend to cluster together in relatively conserved regions. On the other hand, "epitope-less" regions or regions with relatively low density of epitopes tend to be more variable. However, very little is known about relationships among epitopes from different genes, in other words, whether particular epitopes from different genes would occur together in the same viral genome. To identify CTL epitopes in different genes that co-occur in HIV genomes, association rule mining was used. Results Using a set of 189 best-defined HIV-1 CTL/CD8+ epitopes from 9 different protein-coding genes, as described by Frahm, Linde & Brander (2007), we examined the complete genomic sequences of 62 reference HIV sequences (including 13 subtypes and sub-subtypes, with approximately 4 representative sequences for each subtype or sub-subtype, and 18 circulating recombinant forms). The results showed that despite inclusion of recombinant sequences that would be expected to break up associations of epitopes in different genes when two different genomes are recombined, there exist particular combinations of epitopes (epitope associations) that occur repeatedly across the world-wide population of HIV-1. For example, Pol epitope LFLDGIDKA is found to be significantly associated with epitopes GHQAAMQML and FLKEKGGL from Gag and Nef, respectively, and this association rule is observed even among circulating recombinant forms. Conclusion We have identified CTL epitope combinations co-occurring in HIV-1 genomes including different subtypes and recombinant forms. Such co-occurrence has important implications for the design of complex vaccines (multi-epitope vaccines) and/or drugs that would target multiple HIV-1 regions at once and, thus, may be expected to overcome challenges...
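
    The association-rule bookkeeping behind such results can be sketched directly: treat each genome as a "transaction" of the epitopes it carries and compute support and confidence for candidate rules. The epitope sequences below are taken from the record above, but the presence/absence table is toy data, and the study applied full rule mining with significance filtering rather than this brute-force pass.

        # Support and confidence for candidate epitope co-occurrence rules.
        # Genome names and the presence/absence table are toy placeholders.
        from itertools import combinations

        genomes = {
            "subtype_B_1": {"LFLDGIDKA", "GHQAAMQML", "FLKEKGGL"},
            "subtype_B_2": {"LFLDGIDKA", "GHQAAMQML"},
            "subtype_C_1": {"LFLDGIDKA", "FLKEKGGL"},
            "CRF01_AE_1":  {"GHQAAMQML"},
        }

        def rule_stats(a, b, transactions):
            n = len(transactions)
            with_a = [t for t in transactions.values() if a in t]
            with_ab = [t for t in with_a if b in t]
            support = len(with_ab) / n
            confidence = len(with_ab) / len(with_a) if with_a else 0.0
            return support, confidence

        epitopes = sorted(set().union(*genomes.values()))
        for a, b in combinations(epitopes, 2):
            sup, conf = rule_stats(a, b, genomes)
            print(f"{a} -> {b}: support={sup:.2f} confidence={conf:.2f}")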

  14. Chicken genome mapping - Constructing part of a road map for mining this bird's DNA

    NARCIS (Netherlands)

    Aerts, J.

    2005-01-01

    The aim of the research presented in this thesis was to aid in the international chicken genome mapping effort. To this purpose, a significant contribution was made to the construction of the chicken whole-genome BAC-based physical map (presented in Chapter A). An important aspect of this constructi

  15. Genome data mining of lactic acid bacteria: the impact of bioinformatics

    NARCIS (Netherlands)

    Siezen, R.J.; Enckevort, F.H.J. van; Kleerebezem, M.; Teusink, B.

    2004-01-01

    Lactic acid bacteria (LAB) have been widely used in food fermentations and, more recently, as probiotics in health-promoting food products. Genome sequencing and functional genomics studies of a variety of LAB are now rapidly providing insights into their diversity and evolution and revealing the mo

  16. Mining non-model genomic libraries for microsatellites: BAC versus EST libraries and the generation of allelic richness

    Directory of Open Access Journals (Sweden)

    Shaw Kerry L

    2010-07-01

    Full Text Available Abstract Background Simple sequence repeats (SSRs) are tandemly repeated sequence motifs common in genomic nucleotide sequence that often harbor significant variation in repeat number. Frequently used as molecular markers, SSRs are increasingly identified via in silico approaches. Two common classes of genomic resources that can be mined are bacterial artificial chromosome (BAC) libraries and expressed sequence tag (EST) libraries. Results 288 SSR loci were screened in the rapidly radiating Hawaiian swordtail cricket genus Laupala. SSRs were more densely distributed and contained longer repeat structures in BAC library-derived sequence than in EST library-derived sequence, although neither repeat density nor length was exceptionally elevated despite the relatively large genome size of Laupala. A non-random distribution favoring AT-rich SSRs was observed. Allelic diversity of SSRs was positively correlated with repeat length and was generally higher in AT-rich repeat motifs. Conclusion The first large-scale survey of Orthopteran SSR allelic diversity is presented. Selection contributes more strongly to the size and density distributions of SSR loci derived from EST library sequence than from BAC library sequence, although all SSRs likely are subject to similar physical and structural constraints, such as slippage of DNA replication machinery, that may generate increased allelic diversity in AT-rich sequence motifs. Although in silico approaches work well for SSR locus identification in both EST and BAC libraries, BAC library sequence and AT-rich repeat motifs are generally superior SSR development resources for most applications.

  17. Identification and activation of novel biosynthetic gene clusters by genome mining in the kirromycin producer Streptomyces collinus Tü 365.

    Science.gov (United States)

    Iftime, Dumitrita; Kulik, Andreas; Härtner, Thomas; Rohrer, Sabrina; Niedermeyer, Timo Horst Johannes; Stegmann, Evi; Weber, Tilmann; Wohlleben, Wolfgang

    2016-03-01

    Streptomycetes are prolific sources of novel biologically active secondary metabolites with pharmaceutical potential. S. collinus Tü 365 is a Streptomyces strain, isolated 1972 from Kouroussa (Guinea). It is best known as producer of the antibiotic kirromycin, an inhibitor of the protein biosynthesis interacting with elongation factor EF-Tu. Genome Mining revealed 32 gene clusters encoding the biosynthesis of diverse secondary metabolites in the genome of Streptomyces collinus Tü 365, indicating an enormous biosynthetic potential of this strain. The structural diversity of secondary metabolisms predicted for S. collinus Tü 365 includes PKS, NRPS, PKS-NRPS hybrids, a lanthipeptide, terpenes and siderophores. While some of these gene clusters were found to contain genes related to known secondary metabolites, which also could be detected in HPLC-MS analyses, most of the uncharacterized gene clusters are not expressed under standard laboratory conditions. With this study we aimed to characterize the genome information of S. collinus Tü 365 to make use of gene clusters, which previously have not been described for this strain. We were able to connect the gene clusters of a lanthipeptide, a carotenoid, five terpenoid compounds, an ectoine, a siderophore and a spore pigment-associated gene cluster to their respective biosynthesis products.

  18. Evaluating the Strengths and Weaknesses of Mining Audit Data for Automated Models for Intrusion Detection in Tcpdump and Basic Security Module Data

    Directory of Open Access Journals (Sweden)

    A. Arul Lawrence Selvakumar

    2012-01-01

    Full Text Available Problem statement: Intrusion Detection Systems (IDS) have become an important component of the infrastructure protection mechanisms that secure current and emerging networks, their services and applications by detecting, alerting and taking necessary actions against malicious activities. Network size, technology diversity and security policies make networks more challenging, and hence there is a requirement for an IDS that is very accurate, adaptive, extensible and more reliable. Although a novel framework for this requirement exists, namely Mining Audit Data for Automated Models for Intrusion Detection (MADAM ID), it has some performance shortfalls in processing the audit data. Approach: A few experiments were conducted on tcpdump data of DARPA and on BSM audit files by applying the algorithms and tools of MADAM ID to process audit data, mine patterns, construct features and build RIPPER classifiers. Putting it all together, four main categories of attacks, namely DOS, R2L, U2R and PROBING, were simulated. Results: This study outlines the experimental results of MADAM ID in testing the DARPA and BSM data in a simulated network environment. Conclusion: The strengths and weaknesses of MADAM ID have been identified through the experiments conducted on tcpdump data and also on Pascal-based audit files of the Basic Security Module (BSM). This study also gives some additional directions for future applications of MADAM ID.

  19. Draft Genome Sequence of "Acidibacillus ferrooxidans" ITV01, a Novel Acidophilic Firmicute Isolated from a Chalcopyrite Mine Drainage Site in Brazil.

    Science.gov (United States)

    Dall'Agnol, Hivana; Ñancucheo, Ivan; Johnson, D Barrie; Oliveira, Renato; Leite, Laura; Pylro, Victor S; Holanda, Roseanne; Grail, Barry; Carvalho, Nelson; Nunes, Gisele Lopes; Tzotzos, George; Fernandes, Gabriel Rocha; Dutra, Julliane; Orellana, Sara Cuadros; Oliveira, Guilherme

    2016-03-17

    Here, we report the draft genome sequence of "Acidibacillus ferrooxidans" strain ITV01, a ferrous iron- and sulfide-mineral-oxidizing, obligate heterotrophic, and acidophilic bacterium affiliated with the phylum Firmicutes. Strain ITV01 was isolated from neutral drainage from a low-grade chalcopyrite from a mine in northern Brazil.

  20. Draft Genome Sequence of “Acidibacillus ferrooxidans” ITV01, a Novel Acidophilic Firmicute Isolated from a Chalcopyrite Mine Drainage Site in Brazil

    Science.gov (United States)

    Dall’Agnol, Hivana; Ñancucheo, Ivan; Johnson, D. Barrie; Oliveira, Renato; Leite, Laura; Holanda, Roseanne; Grail, Barry; Carvalho, Nelson; Nunes, Gisele Lopes; Tzotzos, George; Fernandes, Gabriel Rocha; Dutra, Julliane; Orellana, Sara Cuadros

    2016-01-01

    Here, we report the draft genome sequence of “Acidibacillus ferrooxidans” strain ITV01, a ferrous iron- and sulfide-mineral-oxidizing, obligate heterotrophic, and acidophilic bacterium affiliated with the phylum Firmicutes. Strain ITV01 was isolated from neutral drainage from a low-grade chalcopyrite from a mine in northern Brazil. PMID:26988062

  1. A framework for automated enrichment of functionally significant inverted repeats in whole genomes

    Directory of Open Access Journals (Sweden)

    Frank Ronald L

    2010-10-01

    Full Text Available Abstract Background RNA transcripts from genomic sequences showing dyad symmetry typically adopt hairpin-like, cloverleaf, or similar structures that act as recognition sites for proteins. Such structures often are the precursors of non-coding RNA (ncRNA) sequences like microRNA (miRNA) and small-interfering RNA (siRNA) that have recently garnered more functional significance than in the past. Genomic DNA contains hundreds of thousands of such inverted repeats (IRs) with varying degrees of symmetry. But by collecting statistically significant information from a known set of ncRNA, we can sort these IRs into those that are likely to be functional. Results A novel method was developed to scan genomic DNA for partially symmetric inverted repeats and the resulting set was further refined to match miRNA precursors (pre-miRNA) with respect to their density of symmetry, statistical probability of the symmetry, length of stems in the predicted hairpin secondary structure, and the GC content of the stems. This method was applied on the Arabidopsis thaliana genome and validated against the set of 190 known Arabidopsis pre-miRNA in the miRBase database. A preliminary scan for IRs identified 186 of the known pre-miRNA but with 714700 pre-miRNA candidates. This large number of IRs was further refined to 483908 candidates with 183 pre-miRNA identified and further still to 165371 candidates with 171 pre-miRNA identified (i.e. with 90% of the known pre-miRNA retained). Conclusions 165371 candidates for potentially functional miRNA is still too large a set to warrant wet lab analyses, such as northern blotting, on all of them. Hence additional filters are needed to further refine the number of candidates while still retaining most of the known miRNA. These include detection of promoters and terminators, homology analyses, location of candidate relative to coding regions, and better secondary structure prediction algorithms. The software developed is designed to easily...
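
    The initial IR scan described above can be illustrated with a simple seed-and-match pass: for each window, look for its reverse complement a bounded distance downstream (a candidate hairpin stem plus loop). The stem and loop bounds below are illustrative only, and the symmetry-density, statistical and secondary-structure filters used in the paper are not implemented here.

        # Seed-and-match scan for inverted repeats: for every window, look for
        # its reverse complement a bounded distance downstream (candidate
        # hairpin stem + loop).  Stem/loop bounds are illustrative only.
        COMPLEMENT = str.maketrans("ACGT", "TGCA")

        def revcomp(s):
            return s.translate(COMPLEMENT)[::-1]

        def inverted_repeats(seq, stem=8, min_loop=3, max_loop=40):
            seq = seq.upper()
            hits = []
            for i in range(len(seq) - 2 * stem - min_loop + 1):
                left = seq[i:i + stem]
                target = revcomp(left)
                lo = i + stem + min_loop                         # earliest right-arm start
                hi = min(i + stem + max_loop, len(seq) - stem)   # latest right-arm start
                j = seq.find(target, lo, hi + stem)
                if j != -1:
                    # (left-arm start, right-arm start, stem sequence, loop sequence)
                    hits.append((i, j, left, seq[i + stem:j]))
            return hits

        demo = "TTTT" + "GGCATGCA" + "AACCA" + revcomp("GGCATGCA") + "TTTT"
        print(inverted_repeats(demo))   # -> [(4, 17, 'GGCATGCA', 'AACCA')]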

  2. Automation in mining. Sensor for measuring the free through cross section. Final report; Automatisierung im Bergbau. Sensor zur Erfassung des freien Durchgangsquerschnittes. Schlussbericht

    Energy Technology Data Exchange (ETDEWEB)

    Breitgraf, H.U.; Wenzel, C.; Plaschke, H.; Hambrock, A.

    1994-06-01

    In this research project, completely new sensors were developed, as there was nothing comparable on the market. The development was carried out by the partners Sick and Weidmueller, with the support of MB DATA for mining-specific details and mining-law authorisation. Among the developed sensors, the optical sensor from the firm of Sick is a very capable device. By contrast, the ultrasonic sensors have not yet been developed to the point of authorisation, but the management of the firm of Weidmueller states that development will continue in 1994. The sensor system described here was developed in the context of the research project "Automation in mining / sensor for measuring the free through cross-section". The aim of this research contract was to develop a device capable of detecting any obstacles within the path of a vehicle. In the case of the optical sensor, this project succeeded in doing so. (orig.)

  3. High-throughput automated microfluidic sample preparation for accurate microbial genomics

    Science.gov (United States)

    Kim, Soohong; De Jonghe, Joachim; Kulesa, Anthony B.; Feldman, David; Vatanen, Tommi; Bhattacharyya, Roby P.; Berdy, Brittany; Gomez, James; Nolan, Jill; Epstein, Slava; Blainey, Paul C.

    2017-01-01

    Low-cost shotgun DNA sequencing is transforming the microbial sciences. Sequencing instruments are so effective that sample preparation is now the key limiting factor. Here, we introduce a microfluidic sample preparation platform that integrates the key steps in cells-to-sequence-library sample preparation for up to 96 samples and reduces DNA input requirements 100-fold while maintaining or improving data quality. The general-purpose microarchitecture we demonstrate supports workflows with arbitrary numbers of reaction and clean-up or capture steps. By reducing the sample quantity requirements, we enabled low-input (∼10,000 cells) whole-genome shotgun (WGS) sequencing of Mycobacterium tuberculosis and soil micro-colonies with superior results. We also leveraged the enhanced throughput to sequence ∼400 clinical Pseudomonas aeruginosa libraries and demonstrate excellent single-nucleotide polymorphism detection performance that explained phenotypically observed antibiotic resistance. Fully-integrated lab-on-chip sample preparation overcomes technical barriers to enable broader deployment of genomics across many basic research and translational applications. PMID:28128213

  4. Genome mining of the hitachimycin biosynthetic gene cluster: involvement of a phenylalanine-2,3-aminomutase in biosynthesis.

    Science.gov (United States)

    Kudo, Fumitaka; Kawamura, Koichi; Uchino, Asuka; Miyanaga, Akimasa; Numakura, Mario; Takayanagi, Ryuichi; Eguchi, Tadashi

    2015-04-13

    Hitachimycin is a macrolactam antibiotic with (S)-β-phenylalanine (β-Phe) at the starter position of its polyketide skeleton. To understand the incorporation mechanism of β-Phe and the modification mechanism of the unique polyketide skeleton, the biosynthetic gene cluster for hitachimycin in Streptomyces scabrisporus was identified by genome mining. The identified gene cluster contains a putative phenylalanine-2,3-aminomutase (PAM), five polyketide synthases, four β-amino-acid-carrying enzymes, and a characteristic amidohydrolase. A hitA knockout mutant showed no hitachimycin production, but antibiotic production was restored by feeding with (S)-β-Phe. We also confirmed the enzymatic activity of the HitA PAM. The results suggest that the identified gene cluster is responsible for the biosynthesis of hitachimycin. A plausible biosynthetic pathway for hitachimycin, including a unique polyketide skeletal transformation mechanism, is proposed.

  5. Genome sequence of the acid-tolerant Desulfovibrio sp. DV isolated from the sediments of a Pb-Zn mine tailings dam in the Chita region, Russia

    Directory of Open Access Journals (Sweden)

    Anastasiia Kovaliova

    2017-03-01

    Full Text Available Here we report the draft genome sequence of the acid-tolerant Desulfovibrio sp. DV isolated from the sediments of a Pb-Zn mine tailings dam in the Chita region, Russia. The draft genome has a size of 4.9 Mb and encodes multiple K+-transporters and proton-consuming decarboxylases. The phylogenetic analysis based on concatenated ribosomal proteins revealed that strain DV clusters together with the acid-tolerant Desulfovibrio sp. TomC and Desulfovibrio magneticus. The draft genome sequence and annotation have been deposited at GenBank under the accession number MLBG00000000.

  6. Development of direct methanol fuel cells for the applications in mining and tunnelling. Automation and power conditioning of a fuel cell-battery hybrid system

    Energy Technology Data Exchange (ETDEWEB)

    Kulakarni, Sreekantha Rao

    2012-07-01

    appropriate option for applications in underground mining and tunnelling. The specific advantages of DMFCs are a simple structure, the higher energy density of the fuel (i.e. methanol), low operating temperature, lower weight, and clean and quiet operation. Methanol is in liquid form, so it is easy to transport and store. Moreover, methanol is a renewable fuel that can be produced from biomass. This doctoral research work focused on the construction of a DMFC stack of 30 W electrical power and the testing of the fuel cell stack in underground mining for the applications discussed above. Not only the stack itself but also the automation system for the fuel cell-battery hybrid system was developed. For automation of the system, a micro-controller monitoring system was developed, which uses sensors for voltage, current, temperature, methanol concentration and liquid level. The development and testing of the methanol concentration sensor was considered the heart of the research work. Last but not least, the power conditioning of the fuel cell stack as well as the battery charging techniques developed were also part of the research work.

  7. EST2uni: an open, parallel tool for automated EST analysis and database creation, with a data mining web interface and microarray expression data integration

    Directory of Open Access Journals (Sweden)

    Nuez Fernando

    2008-01-01

    Full Text Available Abstract Background Expressed sequence tag (EST) collections are composed of a high number of single-pass, redundant, partial sequences, which need to be processed, clustered, and annotated to remove low-quality and vector regions, eliminate redundancy and sequencing errors, and provide biologically relevant information. In order to provide a suitable way of performing the different steps in the analysis of the ESTs, flexible computation pipelines adapted to the local needs of specific EST projects have to be developed. Furthermore, EST collections must be stored in highly structured relational databases available to researchers through user-friendly interfaces which allow efficient and complex data mining, thus offering maximum capabilities for their full exploitation. Results We have created EST2uni, an integrated, highly-configurable EST analysis pipeline and data mining software package that automates the pre-processing, clustering, annotation, database creation, and data mining of EST collections. The pipeline uses standard EST analysis tools and the software has a modular design to facilitate the addition of new analytical methods and their configuration. Currently implemented analyses include functional and structural annotation, SNP and microsatellite discovery, integration of previously known genetic marker data and gene expression results, and assistance in cDNA microarray design. It can be run in parallel in a PC cluster in order to reduce the time necessary for the analysis. It also creates a web site linked to the database, showing collection statistics, with complex query capabilities and tools for data mining and retrieval. Conclusion The software package presented here provides an efficient and complete bioinformatics tool for the management of EST collections which is very easy to adapt to the local needs of different EST projects. The code is freely available under the GPL license and can be obtained at http

  8. InCoB2014: mining biological data from genomics for transforming industry and health.

    Science.gov (United States)

    Schönbach, Christian; Tan, Tin; Ranganathan, Shoba

    2014-01-01

    The 13th International Conference on Bioinformatics (InCoB2014) was held for the first time in Australia, in Sydney, from July 31 to August 2, 2014. InCoB is the annual scientific gathering of the Asia-Pacific Bioinformatics Network (APBioNet), hosted since 2002 in the Asia-Pacific region. Of 106 full papers submitted to the BMC track of InCoB2014, 50 (47.2%) were accepted in BMC Bioinformatics, BMC Genomics and BMC Systems Biology supplements, with three papers in a new BMC Medical Genomics supplement. While the majority of presenters and authors were from Asia and Australia, the increasing number of US and European conference attendees augurs well for the international flavour of InCoB. Next year's InCoB will be held jointly with the Genome Informatics Workshop (GIW), September 9-11, 2015 in Tokyo, Japan, with a view to integrating bioinformatics communities in the region.

  9. Genomic analyses of metal resistance genes in three plant growth promoting bacteria of legume plants in Northwest mine tailings, China

    Institute of Scientific and Technical Information of China (English)

    Pin Xie; Xiuli Hao; Martin Herzberg; Yantao Luo; Dietrich H.Nies; Gehong Wei

    2015-01-01

    To better understand the diversity of metal resistance genetic determinants from microbes that survive in metal tailings in northwest China, a region with highly elevated levels of heavy metals, genomic analysis was conducted using the genome sequences of three native metal-resistant plant growth promoting bacteria (PGPB). It shows that Mesorhizobium amorphae CCNWGS0123 contains metal transporters from the P-type ATPase, CDF (Cation Diffusion Facilitator), HupE/UreJ and CHR (chromate ion transporter) families involved in copper, zinc, nickel as well as chromate resistance and homeostasis. Meanwhile, the putative CopA/CueO system is expected to mediate copper resistance in Sinorhizobium meliloti CCNWSX0020, while the ZntA transporter, assisted by the putative CzcD, determines zinc tolerance in Agrobacterium tumefaciens CCNWGS0286. The greenhouse experiment provides consistent evidence of the plant growth promoting effects of these microbes on their hosts through nitrogen fixation and/or indoleacetic acid (IAA) secretion, indicating a potential for in-situ phytoremediation in the mine tailing regions of China.

  10. antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters.

    Science.gov (United States)

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko; Medema, Marnix H

    2015-07-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software.
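    Since antiSMASH writes its cluster annotations into GenBank output, a result file can be summarised with a few lines of Biopython. The sketch below is illustrative and not part of antiSMASH itself; it assumes clusters appear as features of type "cluster" (as in version 3/4 outputs) or "region" (later versions) carrying a "product" qualifier, and the file name is a placeholder.

```python
# Hedged sketch: summarize biosynthetic gene clusters from an antiSMASH-annotated
# GenBank file. Assumes clusters are encoded as features of type "cluster"
# (antiSMASH 3/4) or "region" (later versions) with a "product" qualifier;
# the input file name is a placeholder.
from collections import Counter
from Bio import SeqIO  # Biopython

def summarize_clusters(genbank_path: str) -> Counter:
    counts = Counter()
    for record in SeqIO.parse(genbank_path, "genbank"):
        for feature in record.features:
            if feature.type in ("cluster", "region"):
                products = feature.qualifiers.get("product", ["unknown"])
                for product in products:
                    counts[product] += 1
                print(record.id, feature.location, products)
    return counts

if __name__ == "__main__":
    print(summarize_clusters("streptomyces_antismash_output.gbk"))
```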

  11. Machine learning and data mining in complex genomic data--a review on the lessons learned in Genetic Analysis Workshop 19.

    Science.gov (United States)

    König, Inke R; Auerbach, Jonathan; Gola, Damian; Held, Elizabeth; Holzinger, Emily R; Legault, Marc-André; Sun, Rui; Tintle, Nathan; Yang, Hsin-Chou

    2016-02-03

    In the analysis of current genomic data, application of machine learning and data mining techniques has become more attractive given the rising complexity of the projects. As part of the Genetic Analysis Workshop 19, approaches from this domain were explored, mostly motivated from two starting points. First, assuming an underlying structure in the genomic data, data mining might identify this and thus improve downstream association analyses. Second, computational methods for machine learning need to be developed further to efficiently deal with the current wealth of data. In the course of discussing results and experiences from the machine learning and data mining approaches, six common messages were extracted. These depict the current state of these approaches in the application to complex genomic data. Although some challenges remain for future studies, important forward steps were taken in the integration of different data types and the evaluation of the evidence. Mining the data for underlying genetic or phenotypic structure and using this information in subsequent analyses proved to be extremely helpful and is likely to become of even greater use with more complex data sets.

  12. Genome mining unveils widespread natural product biosynthetic capacity in human oral microbe Streptococcus mutans

    Science.gov (United States)

    Liu, Liwei; Hao, Tingting; Xie, Zhoujie; Horsman, Geoff P.; Chen, Yihua

    2016-01-01

    Streptococcus mutans is a major pathogen causing human dental caries. As a Gram-positive bacterium with a small genome (about 2 Mb) it is considered a poor source of natural products. Due to a recent explosion in genomic data available for S. mutans strains, we were motivated to explore the natural product production potential of this organism. Bioinformatic characterization of 169 publicly available genomes of S. mutans from human dental caries revealed a surprisingly rich source of natural product biosynthetic gene clusters. antiSMASH analysis identified one nonribosomal peptide synthetase (NRPS) gene cluster, seven polyketide synthase (PKS) gene clusters and 136 hybrid PKS/NRPS gene clusters. In addition, 211 ribosomally synthesized and post-translationally modified peptides (RiPPs) clusters and 615 bacteriocin precursors were identified by a combined analysis using BAGEL and antiSMASH. S. mutans harbors a rich and diverse natural product genetic capacity, which underscores the importance of probing the human microbiome and revisiting species that have traditionally been overlooked as “poor” sources of natural products. PMID:27869143

  13. Nuclear species-diagnostic SNP markers mined from 454 amplicon sequencing reveal admixture genomic structure of modern citrus varieties.

    Science.gov (United States)

    Curk, Franck; Ancillo, Gema; Ollitrault, Frédérique; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Garcia-Lor, Andres; Navarro, Luis; Ollitrault, Patrick

    2015-01-01

    Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species-diagnostic SNP

  14. Handling method for PLC power down in a mine integrated automation system

    Institute of Scientific and Technical Information of China (English)

    唐日宏; 陈程; 燕飞雄

    2013-01-01

    To address the problem in mine integrated automation systems where an unexpected PLC power loss in the belt transportation system leaves the belts in the counter-coal-flow direction (i.e. the belts feeding coal toward the failed belt) unable to stop in time, causing coal-piling accidents, a software-level method for handling PLC power loss is proposed. The method detects the status of the ENBT communication module of each remote I/O substation to judge whether the substation has lost power and then, according to the interlocking relationships, stops the belts in the counter-coal-flow direction, so that the relevant belts stop in time and coal-piling accidents are avoided.
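    The interlock logic described in this record can be illustrated with a few lines of Python (the actual implementation runs in the PLC and monitoring software, not in Python). The belt names, the station mapping and the polling callback below are hypothetical.

```python
# Illustrative sketch (not the paper's PLC code): detect loss of communication
# with a remote I/O substation's ENBT module and stop every belt upstream of
# the affected belt (the belts feeding coal toward it). Names are hypothetical.
from typing import Callable, Dict, List

# Belts listed in coal-flow order: coal moves from index 0 towards the end.
BELT_CHAIN: List[str] = ["belt_A", "belt_B", "belt_C", "belt_D"]
# Which ENBT-equipped I/O substation controls each belt (hypothetical mapping).
BELT_TO_STATION: Dict[str, str] = {b: f"enbt_{b}" for b in BELT_CHAIN}

def handle_power_down(enbt_alive: Callable[[str], bool],
                      stop_belt: Callable[[str], None]) -> List[str]:
    """Return the belts stopped because a substation lost power.

    enbt_alive(station) -> True if the ENBT communication module responds;
    stop_belt(belt) issues the stop command. Both are supplied by the caller.
    """
    stopped = []
    for i, belt in enumerate(BELT_CHAIN):
        if not enbt_alive(BELT_TO_STATION[belt]):
            # Interlock: stop the failed belt (best effort) and every belt
            # upstream of it, so coal is not piled onto a stalled belt.
            for upstream in BELT_CHAIN[:i + 1]:
                stop_belt(upstream)
                stopped.append(upstream)
            break
    return stopped

if __name__ == "__main__":
    # Simulate a power loss at the substation of belt_C.
    print(handle_power_down(lambda s: s != "enbt_belt_C", lambda b: None))
```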

  15. CGMIM: Automated text-mining of Online Mendelian Inheritance in Man (OMIM to identify genetically-associated cancers and candidate genes

    Directory of Open Access Journals (Sweden)

    Jones Steven

    2005-03-01

    Full Text Available Abstract Background Online Mendelian Inheritance in Man (OMIM) is a computerized database of information about genes and heritable traits in human populations, based on information reported in the scientific literature. Our objective was to establish an automated text-mining system for OMIM that will identify genetically-related cancers and cancer-related genes. We developed the computer program CGMIM to search for entries in OMIM that are related to one or more cancer types. We performed manual searches of OMIM to verify the program results. Results In the OMIM database on September 30, 2004, CGMIM identified 1943 genes related to cancer. BRCA2 (OMIM *600185), BRAF (OMIM *164757) and CDKN2A (OMIM *600160) were each related to 14 types of cancer. There were 45 genes related to cancer of the esophagus, 121 genes related to cancer of the stomach, and 21 genes related to both. If the two annotations were independent, fewer than three gene entries in OMIM would be expected to mention both, and the more than seven-fold discrepancy suggests cancers of the esophagus and stomach are more genetically related than the current literature suggests. Conclusion CGMIM identifies genetically-related cancers and cancer-related genes. Cancers found to share genetic etiology are anticipated to lead to further etiologic hypotheses and advances regarding environmental agents. CGMIM results are posted monthly and the source code can be obtained free of charge from the BC Cancer Research Centre website http://www.bccrc.ca/ccr/CGMIM.
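    The "fewer than three" figure follows from an independence assumption: with 1943 cancer-related genes, 45 linked to esophageal cancer and 121 to stomach cancer, the expected number of genes mentioning both is 45 x 121 / 1943, roughly 2.8, against 21 observed. A two-line check:

```python
# Worked check of the abstract's independence argument: if esophageal- and
# stomach-cancer annotations were independent across the 1943 cancer-related
# genes, how many genes would be expected to mention both?
genes_total = 1943
esophagus, stomach, observed_both = 45, 121, 21

expected_both = esophagus * stomach / genes_total   # ~2.8
fold = observed_both / expected_both                # ~7.5

print(f"expected under independence: {expected_both:.1f}")
print(f"observed: {observed_both} ({fold:.1f}-fold higher)")
```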

  16. Novel LanT associated lantibiotic clusters identified by genome database mining.

    Directory of Open Access Journals (Sweden)

    Mangal Singh

    Full Text Available BACKGROUND: Frequent use of antibiotics has led to the emergence of antibiotic resistance in bacteria. Lantibiotic compounds are ribosomally synthesized antimicrobial peptides against which bacteria are not able to produce resistance, hence making them a good alternative to antibiotics. Nisin, the oldest and most widely used lantibiotic, has been employed in food preservation without any significant resistance developing against it. Given their antimicrobial potential and limited number, there is a need to identify novel lantibiotics. METHODOLOGY/FINDINGS: Identification of novel lantibiotic biosynthetic clusters from an ever increasing database of bacterial genomes can provide a major lead in this direction. In order to achieve this, a strategy was adopted to identify novel lantibiotic biosynthetic clusters by screening the sequenced genomes for LanT homologs, LanT being a conserved lantibiotic transporter specific to type IB clusters. This strategy resulted in the identification of 54 bacterial strains containing LanT homologs that are not known lantibiotic producers. Of these, 24 strains were subjected to a detailed bioinformatic analysis to identify genes encoding precursor peptides, modification enzymes, and immunity and quorum sensing proteins. Eight clusters having two LanM determinants, similar to haloduracin and lichenicidin, were identified, along with 13 clusters having a single LanM determinant as in the mersacidin biosynthetic cluster. Besides these, orphan LanT homologs were also identified, which might be associated with novel bacteriocins encoded elsewhere in the genome. Three identified gene clusters had a C39 domain containing LanT transporter, associated with the LanBC proteins and double glycine type precursor peptides, the only known example of such a cluster being that of salivaricin. CONCLUSION: This study led to the identification of 8 novel putative two-component lantibiotic clusters along with 13 having a single LanM and
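    The screening step (finding LanT homologs across sequenced genomes) can be mimicked by filtering a standard blastp tabular report. The sketch below is an illustration under assumed file naming, identifier format, thresholds and producer list; it is not the authors' pipeline.

```python
# Hedged sketch of the screening step: given blastp results of a known LanT
# transporter against predicted proteomes (BLAST tabular output, -outfmt 6),
# keep strains with a sufficiently similar hit and drop known producers.
import csv

KNOWN_PRODUCERS = {"Lactococcus lactis", "Bacillus subtilis"}  # placeholder names

def candidate_strains(blast_tab: str, min_identity=30.0, min_length=300):
    """Yield subject identifiers whose LanT hit passes the thresholds."""
    with open(blast_tab) as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # outfmt 6 columns: qseqid sseqid pident length mismatch gapopen
            #                   qstart qend sstart send evalue bitscore
            sseqid, pident, length = row[1], float(row[2]), int(row[3])
            strain = sseqid.split("|")[0]  # assumes "strain|protein" identifiers
            if (pident >= min_identity and length >= min_length
                    and strain not in KNOWN_PRODUCERS):
                yield strain

if __name__ == "__main__":
    print(sorted(set(candidate_strains("lanT_vs_genomes.blastp.tsv"))))
```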

  17. An Integrated Metabolomic and Genomic Mining Workflow to Uncover the Biosynthetic Potential of Bacteria

    DEFF Research Database (Denmark)

    Månsson, Maria; Vynne, Nikolaj Grønnegaard; Klitgaard, Andreas

    2016-01-01

    considerable diversity: only 2% of the chemical features and 7% of the biosynthetic genes were common to all strains, while 30% of all features and 24% of the genes were unique to single strains. The list of chemical features was reduced to 50 discriminating features using a genetic algorithm and support vector machines. Features were dereplicated by tandem mass spectrometry (MS/MS) networking to identify molecular families of the same biosynthetic origin, and the associated pathways were probed using comparative genomics. Most of the discriminating features were related to antibacterial compounds...

  18. Genome mining of mycosporine-like amino acid (MAA) synthesizing and non-synthesizing cyanobacteria: A bioinformatics study.

    Science.gov (United States)

    Singh, Shailendra P; Klisch, Manfred; Sinha, Rajeshwar P; Häder, Donat-P

    2010-02-01

    Mycosporine-like amino acids (MAAs) are a family of more than 20 compounds having absorption maxima between 310 and 362 nm. These compounds are well known for their UV-absorbing/screening role in various organisms and seem to have evolutionary significance. In the present investigation we tested four cyanobacteria, namely Anabaena variabilis PCC 7937, Anabaena sp. PCC 7120, Synechocystis sp. PCC 6803 and Synechococcus sp. PCC 6301, for their ability to synthesize MAAs and conducted genomic and phylogenetic analyses to identify the possible set of genes that might be involved in the biosynthesis of these compounds. Out of the four investigated species, only A. variabilis PCC 7937 was able to synthesize MAAs. Genome mining identified a combination of genes, YP_324358 (predicted DHQ synthase) and YP_324357 (O-methyltransferase), which were present only in A. variabilis PCC 7937 and missing in the other studied cyanobacteria. Phylogenetic analysis revealed that these two genes were transferred from a cyanobacterial donor to dinoflagellates and finally to metazoa by a lateral gene transfer event. All other cyanobacteria that have these two genes also had another copy of the DHQ synthase gene. The predicted protein structure for YP_324358 also suggested that this product is different from the chemically characterized DHQ synthase of Aspergillus nidulans, in contrast to YP_324879, which was predicted to be similar to the DHQ synthase. The present study provides a first insight into the genes of cyanobacteria involved in MAA biosynthesis and thus widens the field of research for molecular, bioinformatics and phylogenetic analysis of these evolutionarily and industrially important compounds. Based on the results we propose that the YP_324358 and YP_324357 gene products are involved in the biosynthesis of the common core (deoxygadusol) of all MAAs.
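    The comparative step (genes present in the MAA producer but absent from the non-producers) reduces to a set difference once presence/absence calls are available. A toy sketch, with placeholder gene labels standing in for the accessions named above:

```python
# Minimal sketch of comparative presence/absence mining: find genes present in
# the MAA-producing strain but absent from all non-producing genomes.
# The gene sets below are toy placeholders, not real annotation calls.
producer = "Anabaena variabilis PCC 7937"
gene_sets = {
    "Anabaena variabilis PCC 7937": {"pred_DHQ_synthase", "O_methyltransferase", "DHQ_synthase_2"},
    "Anabaena sp. PCC 7120":        {"DHQ_synthase_2"},
    "Synechocystis sp. PCC 6803":   {"DHQ_synthase_2"},
    "Synechococcus sp. PCC 6301":   set(),
}

non_producers = [genes for strain, genes in gene_sets.items() if strain != producer]
candidates = gene_sets[producer].difference(*non_producers)
print(sorted(candidates))  # -> ['O_methyltransferase', 'pred_DHQ_synthase']
```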

  19. Applied genomics: data mining reveals species-specific malaria diagnostic targets more sensitive than 18S rRNA.

    Science.gov (United States)

    Demas, Allison; Oberstaller, Jenna; DeBarry, Jeremy; Lucchi, Naomi W; Srinivasamoorthy, Ganesh; Sumari, Deborah; Kabanywanyi, Abdunoor M; Villegas, Leopoldo; Escalante, Ananias A; Kachur, S Patrick; Barnwell, John W; Peterson, David S; Udhayakumar, Venkatachalam; Kissinger, Jessica C

    2011-07-01

    Accurate and rapid diagnosis of malaria infections is crucial for implementing species-appropriate treatment and saving lives. Molecular diagnostic tools are the most accurate and sensitive method of detecting Plasmodium, differentiating between Plasmodium species, and detecting subclinical infections. Despite available whole-genome sequence data for Plasmodium falciparum and P. vivax, the majority of PCR-based methods still rely on the 18S rRNA gene targets. Historically, this gene has served as the best target for diagnostic assays. However, it is limited in its ability to detect mixed infections in multiplex assay platforms without the use of nested PCR. New diagnostic targets are needed. Ideal targets will be species specific, highly sensitive, and amenable to both single-step and multiplex PCRs. We have mined the genomes of P. falciparum and P. vivax to identify species-specific, repetitive sequences that serve as new PCR targets for the detection of malaria. We show that these targets (Pvr47 and Pfr364) exist in 14 to 41 copies and are more sensitive than 18S rRNA when utilized in a single-step PCR. Parasites are routinely detected at levels of 1 to 10 parasites/μl. The reaction can be multiplexed to detect both species in a single reaction. We have examined 7 P. falciparum strains and 91 P. falciparum clinical isolates from Tanzania and 10 P. vivax strains and 96 P. vivax clinical isolates from Venezuela, and we have verified a sensitivity and specificity of ∼100% for both targets compared with a nested 18S rRNA approach. We show that bioinformatics approaches can be successfully applied to identify novel diagnostic targets and improve molecular methods for pathogen detection. These novel targets provide a powerful alternative molecular diagnostic method for the detection of P. falciparum and P. vivax in conventional or multiplex PCR platforms.

  20. Transcriptome analysis in Concholepas concholepas (Gastropoda, Muricidae): mining and characterization of new genomic and molecular markers.

    Science.gov (United States)

    Cárdenas, Leyla; Sánchez, Roland; Gomez, Daniela; Fuenzalida, Gonzalo; Gallardo-Escárate, Cristián; Tanguy, Arnaud

    2011-09-01

    The marine gastropod Concholepas concholepas, locally known as the "loco", is the main target species of the benthonic Chilean fisheries. Genetic and genomic tools are necessary to study the genome of this species in order to understand the molecular basis of its development, growth, and other key traits to improve the management strategies and to identify local adaptation to prevent loss of biodiversity. Here, we use pyrosequencing technologies to generate the first transcriptomic database from adult specimens of the loco. After trimming, a total of 140,756 Expressed Sequence Tag sequences were achieved. Clustering and assembly analysis identified 19,219 contigs and 105,435 singleton sequences. BlastN analysis showed a significant identity with Expressed Sequence Tags of different gastropod species available in public databases. Similarly, BlastX results showed that only 895 out of the total 124,654 had significant hits and may represent novel genes for marine gastropods. From this database, simple sequence repeat motifs were also identified and a total of 38 primer pairs were designed and tested to assess their potential as informative markers and to investigate their cross-species amplification in different related gastropod species. This dataset represents the first publicly available 454 data for a marine gastropod endemic to the southeastern Pacific coast, providing a valuable transcriptomic resource for future efforts of gene discovery and development of functional markers in other marine gastropods.

  1. Retrofit design of a mine integrated automation system based on industrial Ethernet

    Institute of Scientific and Technical Information of China (English)

    卜玉明

    2016-01-01

    In the automation and informatization retrofitting of non-ferrous metallurgical mines, the lack of overall planning leads to independent systems that cannot communicate with each other, creating information islands in mine production, preventing centralized monitoring, and resulting in high construction investment and a heavy maintenance burden. Combining the requirements that the digital mine places on information construction, this paper presents a comprehensive automation retrofit design for non-ferrous metallurgical mines. Practice shows that the design solves the ERP/MES/PCS communication integration problem of metallurgical mines, realizes centralized monitoring of large metallurgical mines, reduces engineering investment and maintenance workload, and has great value for popularization and application.

  2. Identification of fluorinases from Streptomyces sp. MA37, Nocardia brasiliensis, and Actinoplanes sp. N902-109 by genome mining.

    Science.gov (United States)

    Deng, Hai; Ma, Long; Bandaranayaka, Nouchali; Qin, Zhiwei; Mann, Greg; Kyeremeh, Kwaku; Yu, Yi; Shepherd, Thomas; Naismith, James H; O'Hagan, David

    2014-02-10

    The fluorinase is an enzyme that catalyses the combination of S-adenosyl-L-methionine (SAM) and a fluoride ion to generate 5'-fluoro-5'-deoxyadenosine (FDA) and L-methionine through a nucleophilic substitution reaction with the fluoride ion as the nucleophile. It is the only native fluorination enzyme that has been characterised. The fluorinase was isolated in 2002 from Streptomyces cattleya, and, to date, this has been the only source of the fluorinase enzyme. Herein, we report three new fluorinase isolates that have been identified by genome mining. The novel fluorinases from Streptomyces sp. MA37, Nocardia brasiliensis, and an Actinoplanes sp. have high homology (80-87 % identity) to the original S. cattleya enzyme. They all possess a characteristic 21-residue loop. The three newly identified genes were overexpressed in E. coli and shown to be fluorination enzymes. An X-ray crystallographic study of the Streptomyces sp. MA37 enzyme demonstrated that it is almost identical in structure to the original fluorinase. Culturing of the Streptomyces sp. MA37 strain demonstrated that it not only elaborates the fluorometabolites fluoroacetate and 4-fluorothreonine, as S. cattleya does, but also produces a range of unidentified fluorometabolites. These are the first new fluorinases to be reported since the first isolate, over a decade ago, and their identification extends the range of fluorination genes available for fluorination biotechnology.

  3. Target recognition, resistance, immunity and genome mining of class II bacteriocins from Gram-positive bacteria.

    Science.gov (United States)

    Kjos, Morten; Borrero, Juan; Opsata, Mona; Birri, Dagim J; Holo, Helge; Cintas, Luis M; Snipen, Lars; Hernández, Pablo E; Nes, Ingolf F; Diep, Dzung B

    2011-12-01

    Due to their very potent antimicrobial activity against diverse food-spoiling bacteria and pathogens and their favourable biochemical properties, peptide bacteriocins from Gram-positive bacteria have long been considered promising for applications in food preservation or medical treatment. To take advantage of bacteriocins in different applications, it is crucial to have detailed knowledge on the molecular mechanisms by which these peptides recognize and kill target cells, how producer cells protect themselves from their own bacteriocin (self-immunity) and how target cells may develop resistance. In this review we discuss some important recent progress in these areas for the non-lantibiotic (class II) bacteriocins. We also discuss some examples of how the current wealth of genome sequences provides an invaluable source in the search for novel class II bacteriocins.

  4. Analysis of regulatory protease sequences identified through bioinformatic data mining of the Schistosoma mansoni genome

    Directory of Open Access Journals (Sweden)

    Minchella Dennis J

    2009-10-01

    Full Text Available Abstract Background New chemotherapeutic agents against Schistosoma mansoni, an etiological agent of human schistosomiasis, are a priority due to the emerging drug resistance and the inability of current drug treatments to prevent reinfection. Proteases have been under scrutiny as targets of immunological or chemotherapeutic anti-Schistosoma agents because of their vital role in many stages of the parasitic life cycle. Function has been established for only a handful of identified S. mansoni proteases, and the vast majority of these are the digestive proteases; very few of the conserved classes of regulatory proteases have been identified from Schistosoma species, despite their vital role in numerous cellular processes. To that end, we identified protease protein coding genes from the S. mansoni genome project and EST library. Results We identified 255 protease sequences from five catalytic classes using predicted proteins of the S. mansoni genome. The vast majority of these show significant similarity to proteins in KEGG and the Conserved Domain Database. Proteases include calpains, caspases, cytosolic and mitochondrial signal peptidases, proteases that interact with ubiquitin and ubiquitin-like molecules, and proteases that perform regulated intramembrane proteolysis. Comparative analysis of classes of important regulatory proteases find conserved active site domains, and where appropriate, signal peptides and transmembrane helices. Phylogenetic analysis provides support for inferring functional divergence among regulatory aspartic, cysteine, and serine proteases. Conclusion Numerous proteases are identified for the first time in S. mansoni. We characterized important regulatory proteases and focus analysis on these proteases to complement the growing knowledge base of digestive proteases. This work provides a foundation for expanding knowledge of proteases in Schistosoma species and examining their diverse function and potential as targets

  5. Genome mining of fungal lipid-degrading enzymes for industrial applications.

    Science.gov (United States)

    Vorapreeda, Tayvich; Thammarongtham, Chinae; Cheevadhanarak, Supapon; Laoteng, Kobkul

    2015-08-01

    Lipases are enzymes that play important roles in maintaining lipid homeostasis and cellular metabolism. Using available genome data, seven lipase families of oleaginous and non-oleaginous yeast and fungi were categorized based on the similarity of their amino acid sequences and conserved structural domains. Of them, the triacylglycerol lipase (patatin-domain-containing protein) and steryl ester hydrolase (abhydro_lipase-domain-containing protein) families were ubiquitous enzymes found in all species studied. The two essential lipase families showed signature characteristics of integral membrane proteins that might be targeted to lipid monolayer particles. At least one of the extracellular lipase families existed in each species of yeast and fungi. We found that the diversity of lipase families and the number of genes in individual families of oleaginous strains were greater than those identified in non-oleaginous species, which might play a role in nutrient acquisition from surrounding hydrophobic substrates and contribute to their obese phenotype. The gene/enzyme catalogue and related data on the lipases provided by this study are not only a valuable toolbox for investigating the biological roles of these lipases, but also hold potential for various industrial applications.

  6. Genomic mining for novel FADH₂-dependent halogenases in marine sponge-associated microbial consortia.

    Science.gov (United States)

    Bayer, Kristina; Scheuermayer, Matthias; Fieseler, Lars; Hentschel, Ute

    2013-02-01

    Many marine sponges (Porifera) are known to contain large amounts of phylogenetically diverse microorganisms. Sponges are also known for their large arsenal of natural products, many of which are halogenated. In this study, 36 different FADH₂-dependent halogenase gene fragments were amplified from various Caribbean and Mediterranean sponges using newly designed degenerate PCR primers. Four unique halogenase-positive fosmid clones, all containing the highly conserved amino acid motif "GxGxxG", were identified in the microbial metagenome of Aplysina aerophoba. Sequence analysis of one halogenase-bearing fosmid revealed notably two open reading frames with high homologies to efflux and multidrug resistance proteins. Single cell genomic analysis allowed for a taxonomic assignment of the halogenase genes to specific symbiotic lineages. Specifically, the halogenase cluster S1 is predicted to be produced by a deltaproteobacterial symbiont and halogenase cluster S2 by a poribacterial sponge symbiont. An additional halogenase gene is possibly produced by an actinobacterial symbiont of marine sponges. The identification of three novel, phylogenetically, and possibly also functionally distinct halogenase gene clusters indicates that the microbial consortia of sponges are a valuable resource for novel enzymes involved in halogenation reactions.

  7. In silico mining of putative microsatellite markers from whole genome sequence of water buffalo (Bubalus bubalis and development of first BuffSatDB

    Directory of Open Access Journals (Sweden)

    Sarika

    2013-01-01

    Full Text Available Abstract Background Although India has sequenced the water buffalo genome, the draft assembly is based on the cattle genome BTau 4.0, so a de novo chromosome-wise assembly remains a major pending task for the global community. The existing buffalo radiation hybrid panel and the STRs reported here can be used for final gap plugging and “finishing” of the expected de novo genome assembly. QTL and gene mapping require putative STRs mined from the buffalo genome at regular intervals on each chromosome. Such markers have a potential role in the improvement of desirable characteristics, such as high milk yield, disease resistance and high growth rate. Mining STRs from the whole genome and developing a user-friendly database were still needed to reap the benefits of the whole genome sequence. Description By in silico microsatellite mining of the whole genome, we have developed the first STR database of water buffalo, BuffSatDb (Buffalo MicroSatellite Database; http://cabindb.iasri.res.in/buffsatdb/), a web-based relational database of 910,529 microsatellite markers, developed using PHP and a MySQL database. The microsatellite markers were generated using the MIcroSAtellite tool. The database offers a simple and systematic web-based search for customised retrieval of chromosome-wise and genome-wide microsatellites. Searches can be based on chromosome, motif type (mono- to hexa-nucleotide), repeat motif and repeat kind (simple or composite), and may be further customised by limiting the location of the STR on the chromosome as well as the number of markers in that range. This is a novel approach that has not been implemented in any existing marker database. The database has been further appended with Primer3 for primer design of the selected markers, enabling researchers to select markers of choice at a desired interval over the chromosome. The unique add-on of degenerate bases further helps in resolving the presence of degenerate bases in the current buffalo assembly. Conclusion Being the first buffalo STR database in the world
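    The core mining step (locating perfect mono- to hexa-nucleotide repeats) can be sketched with regular expressions as below. This is not the MIcroSAtellite tool used for BuffSatDb, and the minimum repeat counts are illustrative assumptions.

```python
# Hedged sketch of microsatellite (STR) mining: find perfect mono- to
# hexa-nucleotide repeats in a DNA string with regular expressions.
import re

# minimum number of repeat units per motif length (illustrative thresholds)
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def find_ssrs(seq: str):
    seq = seq.upper()
    for motif_len, min_rep in MIN_REPEATS.items():
        # group 2 is the motif; \2{n,} requires at least min_rep total copies
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            repeat, motif = match.group(1), match.group(2)
            # skip homopolymers reported under a longer motif length, e.g. "AA"
            if motif_len > 1 and len(set(motif)) == 1:
                continue
            yield match.start(), motif, len(repeat) // motif_len

if __name__ == "__main__":
    demo = "GGCT" + "AT" * 8 + "CCGATG" + "CAG" * 6 + "TTT"
    for start, motif, copies in find_ssrs(demo):
        print(f"{motif} x {copies} at position {start}")
```

    Real pipelines also detect compound/composite repeats and normalise motifs to a canonical form, which this sketch omits.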

  8. Genome sequence of the copper resistant and acid-tolerant Desulfosporosinus sp. BG isolated from the tailings of a molybdenum-tungsten mine in the Transbaikal area

    Directory of Open Access Journals (Sweden)

    Olga V. Karnachuk

    2017-03-01

    Full Text Available Here, we report on the draft genome of a copper-resistant and acidophilic Desulfosporosinus sp. BG, isolated from the tailings of a molybdenum-tungsten mine in Transbaikal area. The draft genome has a size of 4.52 Mb and encodes transporters of heavy metals. The phylogenetic analysis based on concatenated ribosomal proteins revealed that strain BG clusters together with the other acidophilic copper-resistant strains Desulfosporosinus sp. OT and Desulfosporosinus sp. I2. The K+-ATPase, Na+/H+ antiporter and amino acid decarboxylases may participate in enabling growth at low pH. The draft genome sequence and annotation have been deposited at GenBank under the accession number NZ_MASS00000000.

  9. Process mining

    DEFF Research Database (Denmark)

    van der Aalst, W.M.P.; Rubin, V.; Verbeek, H.M.W.

    2010-01-01

    Process mining includes the automated discovery of processes from event logs. Based on observed events (e.g., activities being executed or messages being exchanged) a process model is constructed. One of the essential problems in process mining is that one cannot assume to have seen all possible behavior. At best, one has seen a representative subset. Therefore, classical synthesis techniques are not suitable as they aim at finding a model that is able to exactly reproduce the log. Existing process mining techniques try to avoid such “overfitting” by generalizing the model to allow for more behavior. This generalization is often driven by the representation language and very crude assumptions about completeness. As a result, parts of the model are “overfitting” (allow only for what has actually been observed) while other parts may be “underfitting” (allow for much more behavior without strong
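    Most discovery algorithms begin from the directly-follows relation extracted from the log, which is straightforward to compute; the sketch below uses a toy three-trace log and ignores case attributes and timestamps.

```python
# Minimal sketch of process discovery from an event log: build the
# directly-follows relation that many discovery algorithms start from.
# The log format (a list of traces, each a list of activity names) is a
# simplification of real logs, which carry case ids and timestamps.
from collections import Counter
from itertools import pairwise  # Python 3.10+

event_log = [
    ["register", "check", "decide", "notify"],
    ["register", "check", "recheck", "decide", "notify"],
    ["register", "decide", "notify"],
]

directly_follows = Counter(
    pair for trace in event_log for pair in pairwise(trace)
)

for (a, b), count in directly_follows.most_common():
    print(f"{a} -> {b}: {count}")
```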

  10. Asteroid mining

    Science.gov (United States)

    Gertsch, Richard E.

    1992-01-01

    The earliest studies of asteroid mining proposed retrieving a main belt asteroid. Because of the very long travel times to the main asteroid belt, attention has shifted to the asteroids whose orbits bring them fairly close to the Earth. In these schemes, the asteroids would be bagged and then processed during the return trip, with the asteroid itself providing the reaction mass to propel the mission homeward. A mission to one of these near-Earth asteroids would be shorter, involve less weight, and require a somewhat lower change in velocity. Since these asteroids apparently contain a wide range of potentially useful materials, our study group considered only them. The topics covered include asteroid materials and properties, asteroid mission selection, manned versus automated missions, mining in zero gravity, and a conceptual mining method.

  11. Quantification of Operational Risk Using A Data Mining

    Science.gov (United States)

    Perera, J. Sebastian

    1999-01-01

    What is Data Mining? - Data Mining is the process of finding actionable information hidden in raw data. - Data Mining helps find hidden patterns, trends, and important relationships often buried in a sea of data - Typically, automated software tools based on advanced statistical analysis and data modeling technology can be utilized to automate the data mining process

  12. A novel data mining method to identify assay-specific signatures in functional genomic studies

    Directory of Open Access Journals (Sweden)

    Guidarelli Jack W

    2006-08-01

    Full Text Available Abstract Background: The highly dimensional data produced by functional genomic (FG) studies makes it difficult to visualize relationships between gene products and experimental conditions (i.e., assays). Although dimensionality reduction methods such as principal component analysis (PCA) have been very useful, their application to identify assay-specific signatures has been limited by the lack of appropriate methodologies. This article proposes a new and powerful PCA-based method for the identification of assay-specific gene signatures in FG studies. Results: The proposed method (PM) is unique for several reasons. First, it is the only one, to our knowledge, that uses gene contribution, a product of the loading and expression level, to obtain assay signatures. The PM develops and exploits two types of assay-specific contribution plots, which are new to the application of PCA in the FG area. The first type plots the assay-specific gene contribution against the given order of the genes and reveals variations in distribution between assay-specific gene signatures as well as outliers within assay groups indicating the degree of importance of the most dominant genes. The second type plots the contribution of each gene in ascending or descending order against a constantly increasing index. This type of plot reveals assay-specific gene signatures defined by the inflection points in the curve. In addition, sharp regions within the signature define the genes that contribute the most to the signature. We proposed and used the curvature as an appropriate metric to characterize these sharp regions, thus identifying the subset of genes contributing the most to the signature. Finally, the PM uses the full dataset to determine the final gene signature, thus eliminating the chance of gene exclusion by poor screening in earlier steps. The strengths of the PM are demonstrated using a simulation study, and two studies of real DNA microarray data – a study of
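    The central quantity of the proposed method, gene contribution as the product of PCA loading and expression level, can be sketched as follows. The toy matrix, the choice of the first component and the ranking by absolute contribution are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch of the "gene contribution" idea: the contribution of a gene to
# an assay is taken as the product of its PCA loading and its expression level
# in that assay. Toy data only; not the published implementation.
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_assays = 200, 6
X = rng.normal(size=(n_genes, n_assays))          # toy expression matrix

# PCA via SVD of the gene-centred matrix; columns of U are gene loadings.
Xc = X - X.mean(axis=1, keepdims=True)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
loadings = U[:, 0]                                 # loadings on the first PC

# contribution[g, a] = loading[g] * expression[g, a]
contribution = loadings[:, None] * X

# assay-specific signature: genes ranked by |contribution| for one assay
assay = 0
top = np.argsort(-np.abs(contribution[:, assay]))[:10]
print("top contributing genes for assay", assay, ":", top)
```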

  13. Design of Integrated Automation System Platform of Coal Mine Based on Wonderware IAS and Its Application

    Institute of Scientific and Technical Information of China (English)

    顾对芳; 王为广; 李红霞; 丁继存

    2012-01-01

    Based on the actual situation of the Jining No. 3 Coal Mine, a design scheme for an integrated coal mine automation system platform based on Wonderware IAS is proposed. The platform structure, the development of the IAS, the communication methods with the interface protocols of the mine's other subsystems, and the HMI design method based on the InTouch configuration software are introduced. Practical application shows that the platform runs stably, effectively improves the degree of informatization of the coal mine, and overcomes the previous drawback of each system running and being managed separately, achieving the goal of increasing efficiency while reducing staff.

  14. Fish the ChIPs: a pipeline for automated genomic annotation of ChIP-Seq data

    Directory of Open Access Journals (Sweden)

    Minucci Saverio

    2011-10-01

    Full Text Available Abstract Background High-throughput sequencing is generating massive amounts of data at a pace that largely exceeds the throughput of data analysis routines. Here we introduce Fish the ChIPs (FC), a computational pipeline aimed at a broad public of users and designed to perform complete ChIP-Seq data analysis of an unlimited number of samples, thus increasing throughput, reproducibility and saving time. Results Starting from short read sequences, FC performs the following steps: 1) quality controls, 2) alignment to a reference genome, 3) peak calling, 4) genomic annotation, 5) generation of raw signal tracks for visualization on the UCSC and IGV genome browsers. FC exploits some of the fastest and most effective tools today available. Installation on a Mac platform requires very basic computational skills while configuration and usage are supported by a user-friendly graphic user interface. Alternatively, FC can be compiled from the source code on any Unix machine and then run with the possibility of customizing each single parameter through a simple configuration text file that can be generated using a dedicated user-friendly web-form. Considering the execution time, FC can be run on a desktop machine, even though the use of a computer cluster is recommended for analyses of large batches of data. FC is perfectly suited to work with data coming from Illumina Solexa Genome Analyzers or ABI SOLiD and its usage can potentially be extended to any sequencing platform. Conclusions Compared to existing tools, FC has two main advantages that make it suitable for a broad range of users. First of all, it can be installed and run by wet biologists on a Mac machine. Besides, it can handle an unlimited number of samples, being convenient for large analyses. In this context, computational biologists can increase reproducibility of their ChIP-Seq data analyses while saving time for downstream analyses. Reviewers This article was reviewed by Gavin Huttley, George
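    Step 4 (genomic annotation) typically amounts to assigning each called peak to the nearest gene feature. FC wraps dedicated tools for this, so the following is only a conceptual stand-in with toy coordinates and a hypothetical gene list.

```python
# Illustrative sketch of the genomic-annotation step: assign each peak to the
# gene with the nearest transcription start site (TSS) on the same chromosome.
# Peak and gene coordinates are toy data, not real annotation.
from bisect import bisect_left

genes = {  # chromosome -> sorted list of (TSS, gene name)
    "chr1": sorted([(11_873, "DDX11L1"), (65_419, "OR4F5"), (450_703, "geneX")]),
}
peaks = [("chr1", 64_000, 64_400), ("chr1", 300_000, 300_500)]

def nearest_gene(chrom, start, end):
    tss_list = genes.get(chrom, [])
    if not tss_list:
        return None
    mid = (start + end) // 2
    i = bisect_left(tss_list, (mid, ""))
    # the nearest TSS is either just before or just after the peak midpoint
    best = min(tss_list[max(0, i - 1):i + 1], key=lambda t: abs(t[0] - mid))
    return best[1], abs(best[0] - mid)

for chrom, start, end in peaks:
    print(chrom, start, end, "->", nearest_gene(chrom, start, end))
```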

  15. Commercial Data Mining Software

    Science.gov (United States)

    Zhang, Qingyu; Segall, Richard S.

    This chapter discusses selected commercial software for data mining, supercomputing data mining, text mining, and web mining. The selected software packages are compared on their features and also applied to available data sets. The software for data mining are SAS Enterprise Miner, Megaputer PolyAnalyst 5.0, PASW (formerly SPSS Clementine), IBM Intelligent Miner, and BioDiscovery GeneSight. The software for supercomputing are Avizo by Visualization Science Group and JMP Genomics from SAS Institute. The software for text mining are SAS Text Miner and Megaputer PolyAnalyst 5.0. The software for web mining are Megaputer PolyAnalyst and SPSS Clementine. Background on related literature and software is presented. Screen shots of each of the selected software are presented, as are conclusions and future directions.

  16. Draft Genome Sequence of Mesorhizobium sp. UFLA 01-765, a Multitolerant, Efficient Symbiont and Plant Growth-Promoting Strain Isolated from Zn-Mining Soil Using Leucaena leucocephala as a Trap Plant.

    Science.gov (United States)

    Rangel, Wesley Melo; Thijs, Sofie; Moreira, Fatima Maria de Souza; Weyens, Nele; Vangronsveld, Jaco; Van Hamme, Jonathan D; Bottos, Eric M; Rineau, Francois

    2016-03-10

    We report the 7.4-Mb draft genome sequence of Mesorhizobium sp. strain UFLA 01-765, a Gram-negative bacterium of the Phyllobacteriaceae isolated from Zn-mining soil in Minas Gerais, Brazil. This strain promotes plant growth, efficiently fixes N2 in symbiosis with Leucaena leucocephala on multicontaminated soil, and has potential for application in bioremediation of marginal lands.

  17. Genome mining of the genetic diversity in the Aspergillus genus - from a collection of more than 30 Aspergillus species

    DEFF Research Database (Denmark)

    Rasmussen, Jane Lind Nybo; Vesth, Tammi Camilla; Theobald, Sebastian;

    In the era of high-throughput sequencing, comparative genomics can be applied for evaluating species diversity. In this project we aim to compare the genomes of 300 species of filamentous fungi from the Aspergillus genus, a complex task. To be able to define species, clade, and core features, this project uses BLAST on the amino acid level to discover orthologs. With a potential of 300 Aspergillus species each having ~12,000 annotated genes, traditional clustering will demand supercomputing. Instead, our approach reduces the search space by identifying isoenzymes within each genome creating intragenomic protein families (iPFs), and then connecting iPFs across all genomes. The initial findings in a set of 31 species show that ~48% of the annotated genes are core genes (genes shared between all species) and 2-24% of the genes are defining the individual species. The methods presented here

  18. Identification of novel target genes for safer and more specific control of root-knot nematodes from a pan-genome mining.

    Directory of Open Access Journals (Sweden)

    Etienne G J Danchin

    2013-10-01

    Full Text Available Root-knot nematodes are globally the most aggressive and damaging plant-parasitic nematodes. Chemical nematicides have so far constituted the most efficient control measures against these agricultural pests. Because of their toxicity for the environment and danger for human health, these nematicides have now been banned from use. Consequently, new and more specific control means, safe for the environment and human health, are urgently needed to avoid worldwide proliferation of these devastating plant-parasites. Mining the genomes of root-knot nematodes through an evolutionary and comparative genomics approach, we identified and analyzed 15,952 nematode genes conserved in genomes of plant-damaging species but absent from non target genomes of chordates, plants, annelids, insect pollinators and mollusks. Functional annotation of the corresponding proteins revealed a relative abundance of putative transcription factors in this parasite-specific set compared to whole proteomes of root-knot nematodes. This may point to important and specific regulators of genes involved in parasitism. Because these nematodes are known to secrete effector proteins in planta, essential for parasitism, we searched and identified 993 such effector-like proteins absent from non-target species. Aiming at identifying novel targets for the development of future control methods, we biologically tested the effect of inactivation of the corresponding genes through RNA interference. A total of 15 novel effector-like proteins and one putative transcription factor compatible with the design of siRNAs were present as non-redundant genes and had transcriptional support in the model root-knot nematode Meloidogyne incognita. Infestation assays with siRNA-treated M. incognita on tomato plants showed significant and reproducible reduction of the infestation for 12 of the 16 tested genes compared to control nematodes. These 12 novel genes, showing efficient reduction of parasitism when

  19. Identification of novel target genes for safer and more specific control of root-knot nematodes from a pan-genome mining.

    Science.gov (United States)

    Danchin, Etienne G J; Arguel, Marie-Jeanne; Campan-Fournier, Amandine; Perfus-Barbeoch, Laetitia; Magliano, Marc; Rosso, Marie-Noëlle; Da Rocha, Martine; Da Silva, Corinne; Nottet, Nicolas; Labadie, Karine; Guy, Julie; Artiguenave, François; Abad, Pierre

    2013-10-01

    Root-knot nematodes are globally the most aggressive and damaging plant-parasitic nematodes. Chemical nematicides have so far constituted the most efficient control measures against these agricultural pests. Because of their toxicity for the environment and danger for human health, these nematicides have now been banned from use. Consequently, new and more specific control means, safe for the environment and human health, are urgently needed to avoid worldwide proliferation of these devastating plant-parasites. Mining the genomes of root-knot nematodes through an evolutionary and comparative genomics approach, we identified and analyzed 15,952 nematode genes conserved in genomes of plant-damaging species but absent from non target genomes of chordates, plants, annelids, insect pollinators and mollusks. Functional annotation of the corresponding proteins revealed a relative abundance of putative transcription factors in this parasite-specific set compared to whole proteomes of root-knot nematodes. This may point to important and specific regulators of genes involved in parasitism. Because these nematodes are known to secrete effector proteins in planta, essential for parasitism, we searched and identified 993 such effector-like proteins absent from non-target species. Aiming at identifying novel targets for the development of future control methods, we biologically tested the effect of inactivation of the corresponding genes through RNA interference. A total of 15 novel effector-like proteins and one putative transcription factor compatible with the design of siRNAs were present as non-redundant genes and had transcriptional support in the model root-knot nematode Meloidogyne incognita. Infestation assays with siRNA-treated M. incognita on tomato plants showed significant and reproducible reduction of the infestation for 12 of the 16 tested genes compared to control nematodes. These 12 novel genes, showing efficient reduction of parasitism when silenced, constitute

  20. Integrative Genomic Data Mining for Discovery of Potential Blood-Borne Biomarkers for Early Diagnosis of Cancer

    OpenAIRE

    Yongliang Yang; Pavel Pospisil; Iyer, Lakshmanan K.; S. James Adelstein; Amin I. Kassis

    2008-01-01

    BACKGROUND: With the arrival of the postgenomic era, there is increasing interest in the discovery of biomarkers for the accurate diagnosis, prognosis, and early detection of cancer. Blood-borne cancer markers are favored by clinicians, because blood samples can be obtained and analyzed with relative ease. We have used a combined mining strategy based on an integrated cancer microarray platform, Oncomine, and the biomarker module of the Ingenuity Pathways Analysis (IPA) program to identify po...

  1. A Study of the Service Level Agreement for an Integrated Mine-Wide Automation System

    Institute of Scientific and Technical Information of China (English)

    陈建伟; 竺金光

    2011-01-01

    To address the current problems of disordered service levels and long service cycles in integrated mine-wide automation systems, a scheme is proposed that uses service level agreements (SLAs) to classify and grade the management of during-sale and after-sale services. The classified management and content of the service level agreement are introduced in detail from five aspects: installation and commissioning services, system promotion services, customization and upgrade services, technical support services, and daily operation services. The service level agreement carries the ideas of IT service management into the contract signing and implementation process of an integrated mine-wide automation system, and will play a significant role in the contract implementation period, project acceptance, and contract payment collection.

  2. Automated cell analysis tool for a genome-wide RNAi screen with support vector machine based supervised learning

    Science.gov (United States)

    Remmele, Steffen; Ritzerfeld, Julia; Nickel, Walter; Hesser, Jürgen

    2011-03-01

    RNAi-based high-throughput microscopy screens have become an important tool in the biological sciences for deciphering the mostly unknown biological functions of human genes. However, manual analysis is impossible for such screens, since the number of image data sets can often be in the hundreds of thousands. Reliable automated tools are thus required to analyse fluorescence microscopy image data sets usually containing two or more reaction channels. The image analysis tool presented here is designed to analyse an RNAi screen investigating the intracellular trafficking and targeting of acylated Src kinases. In this specific screen, a data set consists of three reaction channels, and the investigated cells can appear in different phenotypes. The main issues of the image processing task are an automatic cell segmentation, which has to be robust and accurate for all the different phenotypes, and a successive phenotype classification. The cell segmentation is done in two steps, by segmenting the cell nuclei first and then using a classifier-enhanced region growing on the basis of the cell nuclei to segment the cells. The classification of the cells is realized by a support vector machine, which has to be trained manually using supervised learning. Furthermore, the tool is brightness invariant, allowing different staining qualities, and it provides a quality control that copes with typical defects during preparation and acquisition. A first version of the tool has already been successfully applied to an RNAi screen containing three hundred thousand image data sets, and the SVM-extended version is designed for additional screens.
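
    The two-step segmentation (nuclei first, then cell regions grown from the nuclei) followed by SVM phenotype classification can be illustrated with a minimal sketch. This is not the authors' tool: it assumes two single-channel floating-point images and manually labelled feature vectors, and it substitutes Otsu thresholding plus a nucleus-seeded watershed for the classifier-enhanced region growing described above.

        # Minimal sketch of nucleus-seeded cell segmentation followed by SVM
        # phenotype classification. Illustrative only; image arrays and
        # training labels are assumed inputs, not the original screen data.
        import numpy as np
        from skimage.filters import threshold_otsu
        from skimage.measure import label, regionprops
        from skimage.segmentation import watershed
        from sklearn.svm import SVC

        def segment_cells(nucleus_channel, cell_channel):
            """Step 1: label nuclei; step 2: grow cell regions from the nuclei."""
            nuclei = label(nucleus_channel > threshold_otsu(nucleus_channel))
            cell_mask = cell_channel > threshold_otsu(cell_channel)
            # Watershed from nucleus seeds stands in for the classifier-enhanced
            # region growing used in the original tool.
            return watershed(-cell_channel, markers=nuclei, mask=cell_mask)

        def cell_features(cells, intensity_image):
            """Simple per-cell features: area, mean intensity, eccentricity."""
            props = regionprops(cells, intensity_image=intensity_image)
            return np.array([[p.area, p.mean_intensity, p.eccentricity] for p in props])

        def train_phenotype_classifier(features, labels):
            """Supervised phenotype classification; labels come from manual annotation."""
            clf = SVC(kernel="rbf", C=1.0, gamma="scale")
            clf.fit(features, labels)
            return clf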

  3. Text Mining Applications and Theory

    CERN Document Server

    Berry, Michael W

    2010-01-01

    Text Mining: Applications and Theory presents the state-of-the-art algorithms for text mining from both the academic and industrial perspectives.  The contributors span several countries and scientific domains: universities, industrial corporations, and government laboratories, and demonstrate the use of techniques from machine learning, knowledge discovery, natural language processing and information retrieval to design computational models for automated text analysis and mining. This volume demonstrates how advancements in the fields of applied mathematics, computer science, machine learning

  4. Genome mining of the sordarin biosynthetic gene cluster from Sordaria araneosa Cain ATCC 36386: characterization of cycloaraneosene synthase and GDP-6-deoxyaltrose transferase.

    Science.gov (United States)

    Kudo, Fumitaka; Matsuura, Yasunori; Hayashi, Takaaki; Fukushima, Masayuki; Eguchi, Tadashi

    2016-07-01

    Sordarin is a glycoside antibiotic with a unique tetracyclic diterpene aglycone structure called sordaricin. To understand its intriguing biosynthetic pathway, which may include a Diels-Alder-type [4+2] cycloaddition, genome mining of the gene cluster from the draft genome sequence of the producer strain, Sordaria araneosa Cain ATCC 36386, was carried out. A contiguous 67 kb gene cluster consisting of 20 open reading frames, encoding a putative diterpene cyclase, a glycosyltransferase, a type I polyketide synthase, and six cytochrome P450 monooxygenases, was identified. In vitro enzymatic analysis of the putative diterpene cyclase SdnA showed that it catalyzes the transformation of geranylgeranyl diphosphate to cycloaraneosene, a known biosynthetic intermediate of sordarin. Furthermore, a putative glycosyltransferase SdnJ was found to catalyze the glycosylation of sordaricin in the presence of GDP-6-deoxy-d-altrose to give 4'-O-demethylsordarin. These results suggest that the identified sdn gene cluster is responsible for the biosynthesis of sordarin. Based on the isolated potential biosynthetic intermediates and bioinformatics analysis, a plausible biosynthetic pathway for sordarin is proposed.

  5. Genomic data mining of the marine actinobacteria Streptomyces sp. H-KF8 unveils insights into multi-stress related genes and metabolic pathways involved in antimicrobial synthesis.

    Science.gov (United States)

    Undabarrena, Agustina; Ugalde, Juan A; Seeger, Michael; Cámara, Beatriz

    2017-01-01

    Streptomyces sp. H-KF8 is an actinobacterial strain isolated from marine sediments of a Chilean Patagonian fjord. Morphological characterization together with antibacterial activity was assessed in various culture media, revealing a carbon-source-dependent activity mainly against Gram-positive bacteria (S. aureus and L. monocytogenes). Genome mining of this antibacterial-producing bacterium revealed the presence of 26 biosynthetic gene clusters (BGCs) for secondary metabolites, 81% of which have low similarity to known BGCs. In addition, a genomic search in Streptomyces sp. H-KF8 unveiled the presence of a wide variety of genetic determinants related to heavy metal resistance (49 genes), oxidative stress (69 genes) and antibiotic resistance (97 genes). This study revealed that the marine-derived Streptomyces sp. H-KF8 bacterium has the capability to tolerate a diverse set of heavy metals such as copper, cobalt, mercury, chromate and nickel, as well as the highly toxic tellurite, a feature described here for the first time for Streptomyces. In addition, Streptomyces sp. H-KF8 possesses greater resistance to oxidative stress than the soil reference strain Streptomyces violaceoruber A3(2). Moreover, Streptomyces sp. H-KF8 showed resistance to 88% of the antibiotics tested, indicating, overall, a strong response to several abiotic stressors. The combination of these biological traits confirms the metabolic versatility of Streptomyces sp. H-KF8, a genetically well-prepared microorganism with the ability to confront the dynamics of the fjord-unique marine environment.

  6. CSIR: Mining Technology annual review 1996/97

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-12-31

    CSIR: Mining Technology works in close collaboration and strategic partnership with the mining industry, government institutions and employee organizations by acquiring, developing and transferring technologies to improve the safety and health of their employees, and to improve the profitability of the mining industry. The annual report describes achievements over the year in the areas of: rock engineering (including rockburst control, mine layout, stope and gully support, and coal mining); environmental safety and health, on topics such as occupational hygiene services, methane explosions and blasting techniques; and mining systems (orebody information, hydraulic transport, mine mechanization, engineering design and automation, and mine services). A list of Mining Technology's 1996/97 publications is given.

  7. In Silico Mining of Microsatellites in Coding Sequences of the Date Palm (Arecaceae) Genome, Characterization, and Transferability

    Directory of Open Access Journals (Sweden)

    Frédérique Aberlenc-Bertossi

    2014-01-01

    Premise of the study: To complement existing sets of primarily dinucleotide microsatellite loci from noncoding sequences of date palm, we developed primers for tri- and hexanucleotide microsatellite loci identified within genes. Due to their conserved genomic locations, the primers should be useful in other palm taxa, and their utility was tested in seven other Phoenix species and in Chamaerops, Livistona, and Hyphaene. Methods and Results: Tandem repeat motifs of 3–6 bp were searched using a simple sequence repeat (SSR) pipeline package in coding portions of the date palm draft genome sequence. Fifteen loci produced highly consistent amplification, intraspecific polymorphisms, and stepwise mutation patterns. Conclusions: These microsatellite loci showed sufficient levels of variability and transferability to make them useful for population genetic, selection signature, and interspecific gene flow studies in Phoenix and other Coryphoideae genera.
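
    A tri- to hexanucleotide tandem-repeat search of the kind described here can be sketched with a short regular-expression scan. This is a generic illustration, not the SSR pipeline used in the study; the minimum of four tandem copies is an arbitrary assumption.

        # Generic microsatellite (SSR) scan for 3-6 bp motifs repeated in tandem.
        # The motif-length range follows the abstract; the minimum copy number
        # is an arbitrary assumption for the example.
        import re

        def find_ssrs(sequence, min_motif=3, max_motif=6, min_repeats=4):
            """Return (start, motif, copy_number) for perfect tandem repeats."""
            hits = []
            for k in range(min_motif, max_motif + 1):
                # ([ACGT]{k}) captures a motif; \1{n,} requires further tandem copies.
                pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
                for m in pattern.finditer(sequence.upper()):
                    motif = m.group(1)
                    copies = len(m.group(0)) // k
                    hits.append((m.start(), motif, copies))
            return hits

        print(find_ssrs("ATGCAGCAGCAGCAGCAGTTTGACGACGACGACTGA"))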

  8. High-quality draft genome sequence of Kocuria marina SO9-6, an actinobacterium isolated from a copper mine

    Science.gov (United States)

    Castro, Daniel B.A.; Pereira, Letícia Bianca; Silva, Marcus Vinícius M. e; Silva, Bárbara P. da; Palermo, Bruna Rafaella Z.; Carlos, Camila; Belgini, Daiane R.B.; Limache, Elmer Erasmo G.; Lacerda, Gileno V. Jr; Nery, Mariana B.P.; Gomes, Milene B.; Souza, Salatiel S. de; Silva, Thiago M. da; Rodrigues, Viviane D.; Paulino, Luciana C.; Vicentini, Renato; Ferraz, Lúcio F.C.; Ottoboni, Laura M.M.

    2015-01-01

    An actinobacterial strain, designated SO9-6, was isolated from a copper iron sulfide mineral. The organism is Gram-positive, facultatively anaerobic, and coccoid. Chemotaxonomic and phylogenetic properties were consistent with its classification in the genus Kocuria. Here, we report the first draft genome sequence of Kocuria marina SO9-6 under accession JROM00000000 (http://www.ncbi.nlm.nih.gov/nuccore/725823918), which provides insights for heavy metal bioremediation and production of compounds of biotechnological interest. PMID:26484219

  9. Integrative genomic data mining for discovery of potential blood-borne biomarkers for early diagnosis of cancer.

    Directory of Open Access Journals (Sweden)

    Yongliang Yang

    BACKGROUND: With the arrival of the postgenomic era, there is increasing interest in the discovery of biomarkers for the accurate diagnosis, prognosis, and early detection of cancer. Blood-borne cancer markers are favored by clinicians, because blood samples can be obtained and analyzed with relative ease. We have used a combined mining strategy based on an integrated cancer microarray platform, Oncomine, and the biomarker module of the Ingenuity Pathways Analysis (IPA) program to identify potential blood-based markers for six common human cancer types. METHODOLOGY/PRINCIPAL FINDINGS: In the Oncomine platform, the genes overexpressed in cancer tissues relative to their corresponding normal tissues were filtered by Gene Ontology keywords, with the extracellular environment stipulated and a corrected Q value (false discovery rate) cut-off implemented. The identified genes were imported into the IPA biomarker module to separate out those genes encoding putative secreted or cell-surface proteins as blood-borne (blood/serum/plasma) cancer markers. The filtered potential indicators were ranked and prioritized according to normalized absolute Student t values. The retrieval of numerous marker genes that are already clinically useful or under active investigation confirmed the effectiveness of our mining strategy. To identify the biomarkers that are unique for each cancer type, the upregulated marker genes in common between each two tumor types across the six human tumors were also analyzed by the IPA biomarker comparison function. CONCLUSION/SIGNIFICANCE: The upregulated marker genes shared among the six cancer types may serve as a molecular tool to complement histopathologic examination, and the combination of the commonly upregulated and unique biomarkers may serve as differentiating markers for a specific cancer. This approach will be increasingly useful to discover diagnostic signatures as the mass of microarray data continues to grow in the
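
    The filter-and-rank logic of this mining strategy (keep over-expressed genes that pass a false-discovery-rate cut-off and carry an "extracellular" annotation, then prioritize by absolute t statistic) can be sketched generically with pandas. The column names and the 0.05 cut-off below are assumptions for illustration, not the study's actual Oncomine/IPA parameters.

        # Generic sketch of the filter-and-rank step; not Oncomine or IPA code.
        # Assumed columns: gene, t_stat (tumor vs normal), q_value (FDR),
        # go_terms (Gene Ontology annotations as a free-text string).
        import pandas as pd

        def rank_candidate_markers(df, q_cutoff=0.05, keyword="extracellular"):
            overexpressed = df[(df["t_stat"] > 0) & (df["q_value"] < q_cutoff)]
            secreted = overexpressed[
                overexpressed["go_terms"].str.contains(keyword, case=False, na=False)
            ]
            # Prioritize by normalized absolute Student t value, as in the abstract.
            return secreted.assign(abs_t=secreted["t_stat"].abs()).sort_values(
                "abs_t", ascending=False
            )

        genes = pd.DataFrame({
            "gene": ["MUC1", "GAPDH", "SPP1"],
            "t_stat": [8.2, 1.1, 6.7],
            "q_value": [0.001, 0.2, 0.004],
            "go_terms": ["extracellular region", "cytoplasm", "extracellular matrix"],
        })
        print(rank_candidate_markers(genes)[["gene", "abs_t"]])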

  10. Warehouse automation

    OpenAIRE

    Pogačnik, Jure

    2017-01-01

    An automated high-bay warehouse is commonly used for storing a large number of materials with high throughput. In an automated warehouse, pallet movements are mainly performed by a number of automated devices such as conveyor systems, trolleys, and stacker cranes. From the introduction of the material into the automated warehouse system to its dispatch, the system requires no operator input or intervention, since all material movements are done automatically. This allows the automated warehouse to op...

  11. MacSyFinder: a program to mine genomes for molecular systems with an application to CRISPR-Cas systems.

    Directory of Open Access Journals (Sweden)

    Sophie S Abby

    Biologists often wish to use their knowledge of a few experimental models of a given molecular system to identify homologs in genomic data. We developed a generic tool for this purpose. Macromolecular System Finder (MacSyFinder) provides a flexible framework to model the properties of molecular systems (cellular machinery or pathways), including their components, evolutionary associations with other systems and genetic architecture. Modelled features also include functional analogs and the multiple uses of the same component by different systems. Models are used to search for molecular systems in complete genomes or in unstructured data like metagenomes. The components of the systems are searched for by sequence similarity using hidden Markov model (HMM) protein profiles. The assignment of hits to a given system is decided based on compliance with the content and organization of the system model. A graphical interface, MacSyView, facilitates the analysis of the results by showing overviews of component content and genomic context. To exemplify the use of MacSyFinder we built models to detect and classify CRISPR-Cas systems following a previously established classification. We show that MacSyFinder makes it easy to define an accurate "Cas-finder" using publicly available protein profiles. MacSyFinder is a standalone application implemented in Python. It requires Python 2.7, Hmmer and makeblastdb (version 2.2.28 or higher). It is freely available with its source code under a GPLv3 license at https://github.com/gem-pasteur/macsyfinder. It is compatible with all platforms supporting Python and Hmmer/makeblastdb. The "Cas-finder" (models and HMM profiles) is distributed as a compressed tarball archive as Supporting Information.
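
    The core decision step, accepting a candidate system only when the HMM hits satisfy the model's content rules, can be illustrated with a small generic sketch. This is not MacSyFinder code; the system definition, E-value threshold and quorum rule below are hypothetical.

        # Generic sketch of model-based assignment of HMM hits to a molecular
        # system, in the spirit of (but not taken from) MacSyFinder. The system
        # definition and thresholds below are hypothetical placeholders.
        SYSTEM_MODEL = {
            "name": "cas_system_example",
            "mandatory": {"cas1", "cas2"},
            "accessory": {"cas4", "cas6"},
            "min_components": 3,   # minimal quorum of distinct components
        }

        def assign_system(hmm_hits, model=SYSTEM_MODEL):
            """hmm_hits: dict gene_name -> best E-value from an HMM profile search."""
            present = {g for g, evalue in hmm_hits.items() if evalue < 1e-5}
            components = present & (model["mandatory"] | model["accessory"])
            has_mandatory = model["mandatory"].issubset(present)
            if has_mandatory and len(components) >= model["min_components"]:
                return model["name"], sorted(components)
            return None, sorted(components)

        print(assign_system({"cas1": 1e-40, "cas2": 3e-12, "cas6": 2e-9, "recA": 1e-80}))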

  12. Mining and robotized equipment

    Energy Technology Data Exchange (ETDEWEB)

    Krisztian, B.

    1984-01-01

    The general concepts concerning the expediency of using industrial robots in mining, and the most rational fields for their use, are presented. Achievements in creating industrial robots for the needs of the mining industry in the USSR, Sweden (the ASEA Company), the United States (the Westinghouse Electric and Cincinnati Milacron Companies) and Japan (the Fujitsu Fanuc Company) are noted. The necessity, in a whole number of cases, of fundamentally restructuring production processes for the planned introduction of industrial robots into mining enterprises is stressed. Questions associated with the changes that must be introduced into systems for automating industrial processes when industrial robots are introduced into them are also discussed. The prospects for the development, creation and introduction of industrial robots in the Hungarian mining industry are indicated in conclusion.

  13. Transcriptome Analysis of Two Vicia sativa Subspecies: Mining Molecular Markers to Enhance Genomic Resources for Vetch Improvement

    Directory of Open Access Journals (Sweden)

    Tae-Sung Kim

    2015-11-01

    The vetch (Vicia sativa) is one of the most important annual forage legumes globally due to its multiple uses and high nutritional content. Despite these agronomic benefits, many drawbacks, including a cyano-alanine toxin, have reduced the agronomic value of vetch varieties. Here, we used 454 technology to sequence the two V. sativa subspecies (ssp. sativa and ssp. nigra) to enrich functional information and genetic marker resources for the vetch research community. A total of 86,532 and 47,103 reads produced 35,202 and 18,808 unigenes with average lengths of 735 and 601 bp for V. sativa sativa and V. sativa nigra, respectively. Gene Ontology annotations and the cluster of orthologous gene classes were used to annotate the function of the Vicia transcriptomes. The Vicia transcriptome sequences were then mined for simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers. About 13% and 3% of the Vicia unigenes contained putative SSR and SNP sequences, respectively. Among those SSRs, 100 were chosen for validation and a polymorphism test using the Vicia germplasm set. Thus, our approach takes advantage of the utility of transcriptomic data to expedite a vetch breeding program.

  14. Research on Text Data Mining in Human Genome Sequencing

    Institute of Scientific and Technical Information of China (English)

    于跃; 潘玮; 王丽伟; 王伟

    2012-01-01

    Literature related to human genome sequencing published between 1 January 2001 and 11 May 2011 was retrieved from the PubMed database. Bibliographic information was extracted and subjected to co-word cluster analysis: high-frequency subject headings were extracted, and a term-document matrix, a co-occurrence matrix and co-word clusters were generated. The results show that text data mining can effectively reflect the development status and research hotspots of a discipline, and can thus provide valuable information to researchers.

  15. Genome mining and metabolic profiling of the rhizosphere bacterium Pseudomonas sp. SH-C52 for antimicrobial compounds

    Directory of Open Access Journals (Sweden)

    Menno van der Voort

    2015-07-01

    The plant microbiome represents an enormous untapped resource for discovering novel genes and bioactive compounds. Previously, we isolated Pseudomonas sp. SH-C52 from the rhizosphere of sugar beet plants grown in a soil suppressive to the fungal pathogen Rhizoctonia solani and showed that its antifungal activity is, in part, attributed to the production of the chlorinated 9-amino-acid lipopeptide thanamycin (Mendes et al., 2011, Science). To get more insight into its biosynthetic repertoire, the genome of Pseudomonas sp. SH-C52 was sequenced and subjected to in silico, mutational and functional analyses. The sequencing revealed a genome size of 6.3 Mb and 5,579 predicted ORFs. Phylogenetic analysis placed strain SH-C52 within the Pseudomonas corrugata clade. In silico analysis for secondary metabolites revealed a total of six nonribosomal peptide synthetase (NRPS) gene clusters, including the two previously described NRPS clusters for thanamycin and the two-amino-acid antibacterial lipopeptide brabantamide. Here we show that thanamycin also has activity against an array of other fungi and that brabantamide A exhibits anti-oomycete activity and affects phospholipases of the late blight pathogen Phytophthora infestans. Most notably, mass spectrometry led to the discovery of a third lipopeptide, designated thanapeptin, with a 22-amino-acid peptide moiety. Seven structural variants of thanapeptin were found with varying degrees of activity against P. infestans. Of the remaining four NRPS clusters, one was predicted to encode yet another, as yet unknown, lipopeptide with a predicted peptide moiety of 8 amino acids. Collectively, these results show an enormous metabolic potential for Pseudomonas sp. SH-C52, with at least three structurally diverse lipopeptides, each with a different antimicrobial activity spectrum.

  16. iSubgraph: integrative genomics for subgroup discovery in hepatocellular carcinoma using graph mining and mixture models.

    Directory of Open Access Journals (Sweden)

    Bahadir Ozdemir

    High tumor heterogeneity makes it very challenging to identify key tumorigenic pathways as therapeutic targets. The integration of multiple omics data is a promising approach to identify driving regulatory networks in patient subgroups. Here, we propose a novel conceptual framework to discover patterns of miRNA-gene networks observed frequently up- or down-regulated in a group of patients and to use such networks for patient stratification in hepatocellular carcinoma (HCC). We developed an integrative subgraph mining approach, called iSubgraph, and identified altered regulatory networks frequently observed in HCC patients. The miRNA and gene expression profiles were jointly analyzed in a graph structure. We defined a method to transform microarray data into a graph representation that encodes miRNA and gene expression levels and the interactions between them as well. The iSubgraph algorithm was capable of detecting cooperative regulation of miRNAs and genes even if it occurred only in some patients. Next, the miRNA-mRNA modules were used in an unsupervised class prediction model to discover HCC subgroups via patient clustering by mixture models. The robustness analysis of the mixture model showed that the class predictions are highly stable. Moreover, Kaplan-Meier survival analysis revealed that the HCC subgroups identified by the algorithm have different survival characteristics. The pathway analyses of the miRNA-mRNA co-modules identified by the algorithm demonstrate key roles of Myc, E2F1, let-7, TGFB1, TNF and EGFR in HCC subgroups. Thus, our method can integrate various omics data derived from different platforms and with different dynamic scales to better define molecular tumor subtypes. iSubgraph is available as MATLAB code at http://www.cs.umd.edu/~ozdemir/isubgraph/.
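
    The transformation of paired miRNA and gene expression profiles into a graph, with discretized expression levels encoded on the nodes and regulatory interactions as edges, can be sketched as follows. This is an illustrative reconstruction in Python, not the iSubgraph MATLAB implementation; the discretization thresholds and interaction list are assumptions.

        # Sketch of encoding per-patient miRNA/gene expression and their
        # interactions as a labelled graph, in the spirit of the approach above.
        # Thresholds and the interaction list are placeholder assumptions.
        import networkx as nx

        def discretize(value, low=-1.0, high=1.0):
            """Map a log-ratio expression value to 'down', 'normal' or 'up'."""
            if value <= low:
                return "down"
            if value >= high:
                return "up"
            return "normal"

        def build_patient_graph(mirna_expr, gene_expr, interactions):
            """interactions: iterable of (miRNA, target gene) pairs."""
            g = nx.Graph()
            for name, value in mirna_expr.items():
                g.add_node(name, kind="miRNA", level=discretize(value))
            for name, value in gene_expr.items():
                g.add_node(name, kind="gene", level=discretize(value))
            g.add_edges_from((m, t) for m, t in interactions
                             if m in mirna_expr and t in gene_expr)
            return g

        g = build_patient_graph({"let-7": -1.4}, {"MYC": 1.8, "TP53": 0.1},
                                [("let-7", "MYC")])
        print(g.nodes(data=True), list(g.edges))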

  17. Accounting Automation

    OpenAIRE

    Laynebaril1

    2017-01-01

    Accounting Automation. Please respond to the following: Imagine you are a consultant hired to convert a manual accounting system to an automated system. Suggest the key advantages and disadvantages of automating a manual accounting system. Identify the most important step in the conversion process. Provide a rationale for your response. ...

  18. Home Automation

    OpenAIRE

    Ahmed, Zeeshan

    2010-01-01

    In this paper I briefly discuss the importance of home automation systems. Going into the details, I briefly present a designed and implemented real-time, software- and hardware-oriented house automation research project, capable of automating a house's electricity and providing a security system to detect the presence of unexpected behavior.

  19. Carotenoid metabolic profiling and transcriptome-genome mining reveal functional equivalence among blue-pigmented copepods and appendicularia

    KAUST Repository

    Mojib, Nazia

    2014-06-01

    The tropical oligotrophic oceanic areas are characterized by high water transparency and annual solar radiation. Under these conditions, a large number of phylogenetically diverse mesozooplankton species living in the surface waters (neuston) are found to be blue pigmented. In the present study, we focused on understanding the metabolic and genetic basis of the observed blue phenotype functional equivalence between the blue-pigmented organisms from the phylum Arthropoda, subclass Copepoda (Acartia fossae) and the phylum Chordata, class Appendicularia (Oikopleura dioica) in the Red Sea. Previous studies have shown that carotenoid–protein complexes are responsible for blue coloration in crustaceans. Therefore, we performed carotenoid metabolic profiling using both targeted and nontargeted (high-resolution mass spectrometry) approaches in four different blue-pigmented genera of copepods and one blue-pigmented species of appendicularia. Astaxanthin was found to be the principal carotenoid in all the species. The pathway analysis showed that all the species can synthesize astaxanthin from β-carotene, ingested from dietary sources, via 3-hydroxyechinenone, canthaxanthin, zeaxanthin, adonirubin or adonixanthin. Further, using de novo assembled transcriptome of blue A. fossae (subclass Copepoda), we identified highly expressed homologous β-carotene hydroxylase enzymes and putative carotenoid-binding proteins responsible for astaxanthin formation and the blue phenotype. In blue O. dioica (class Appendicularia), corresponding putative genes were identified from the reference genome. Collectively, our data provide molecular evidences for the bioconversion and accumulation of blue astaxanthin–protein complexes underpinning the observed ecological functional equivalence and adaptive convergence among neustonic mesozooplankton.

  20. Genome mining of astaxanthin biosynthetic genes from Sphingomonas sp. ATCC 55669 for heterologous overproduction in Escherichia coli.

    Science.gov (United States)

    Ma, Tian; Zhou, Yuanjie; Li, Xiaowei; Zhu, Fayin; Cheng, Yongbo; Liu, Yi; Deng, Zixin; Liu, Tiangang

    2016-02-01

    As a highly valued keto-carotenoid, astaxanthin is widely used in nutritional supplements and pharmaceuticals. Therefore, the demand for biosynthetic astaxanthin and improved efficiency of astaxanthin biosynthesis has driven the investigation of metabolic engineering of native astaxanthin producers and heterologous hosts. However, microbial resources for astaxanthin are limited. In this study, we found that the α-Proteobacterium Sphingomonas sp. ATCC 55669 could produce astaxanthin naturally. We used whole-genome sequencing to identify the astaxanthin biosynthetic pathway using a combined PacBio-Illumina approach. The putative astaxanthin biosynthetic pathway in Sphingomonas sp. ATCC 55669 was predicted. For further confirmation, a high-efficiency targeted engineering carotenoid synthesis platform was constructed in E. coli for identifying the functional roles of candidate genes. All genes involved in astaxanthin biosynthesis showed discrete distributions on the chromosome. Moreover, the overexpression of exogenous E. coli idi in Sphingomonas sp. ATCC 55669 increased astaxanthin production by 5.4-fold. This study described a new astaxanthin producer and provided more biosynthesis components for bioengineering of astaxanthin in the future.

  1. Genome mining in Streptomyces avermitilis: cloning and characterization of SAV_76, the synthase for a new sesquiterpene, avermitilol.

    Science.gov (United States)

    Chou, Wayne K W; Fanizza, Immacolata; Uchiyama, Takuma; Komatsu, Mamoru; Ikeda, Haruo; Cane, David E

    2010-07-07

    The terpene synthase encoded by the sav76 gene of Streptomyces avermitilis was expressed in Escherichia coli as an N-terminal His6-tag protein, using a codon-optimized synthetic gene. Incubation of the recombinant protein, SAV_76, with farnesyl diphosphate (1, FPP) in the presence of Mg²⁺ gave a new sesquiterpene alcohol, avermitilol (2), whose structure and stereochemistry were determined by a combination of ¹H, ¹³C, COSY, HMQC, HMBC, and NOESY NMR, along with minor amounts of germacrene A (3), germacrene B (4), and viridiflorol (5). The absolute configuration of 2 was assigned by ¹H NMR analysis of the corresponding (R)- and (S)-Mosher esters. The steady-state kinetic parameters were kcat = 0.040 ± 0.001 s⁻¹ and Km = 1.06 ± 0.11 μM. Individual incubations of recombinant avermitilol synthase with [1,1-²H₂]FPP (1a), (1S)-[1-²H]FPP (1b), and (1R)-[1-²H]FPP (1c), and NMR analysis of the resulting avermitilols, supported a cyclization mechanism involving the loss of H-1(re) to generate the intermediate bicyclogermacrene (7), which then undergoes proton-initiated anti-Markovnikov cyclization and capture of water to generate 2. A copy of the sav76 gene was reintroduced into S. avermitilis SUKA17, a large-deletion mutant from which the genes for the major endogenous secondary metabolites had been removed, and expressed under control of the native S. avermitilis promoter rpsJp (sav4925). The resultant transformants generated avermitilol (2) as well as the derived ketone, avermitilone (8), along with small amounts of 3, 4, and 5. The biochemical functions of all four terpene synthases found in the S. avermitilis genome have now been determined.
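
    From the reported steady-state parameters, the catalytic efficiency follows directly; a quick check gives kcat/Km ≈ 0.040 s⁻¹ / 1.06 µM ≈ 3.8 × 10⁴ M⁻¹ s⁻¹.

        # Catalytic efficiency of avermitilol synthase from the reported
        # steady-state parameters (kcat = 0.040 s^-1, Km = 1.06 microM).
        kcat = 0.040            # s^-1
        km = 1.06e-6            # M (1.06 microM)
        efficiency = kcat / km  # M^-1 s^-1
        print(f"kcat/Km = {efficiency:.2e} M^-1 s^-1")  # ~3.8e+04 M^-1 s^-1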

  2. Two non-synonymous markers in PTPN21, identified by genome-wide association study data-mining and replication, are associated with schizophrenia.

    LENUS (Irish Health Repository)

    Chen, Jingchun

    2011-09-01

    We conducted data-mining analyses of genome wide association (GWA) studies of the CATIE and MGS-GAIN datasets, and found 13 markers in the two physically linked genes, PTPN21 and EML5, showing nominally significant association with schizophrenia. Linkage disequilibrium (LD) analysis indicated that all 7 markers from PTPN21 shared high LD (r² > 0.8), including rs2274736 and rs2401751, the two non-synonymous markers with the most significant association signals (rs2401751, P = 1.10 × 10⁻³ and rs2274736, P = 1.21 × 10⁻³). In a meta-analysis of all 13 replication datasets with a total of 13,940 subjects, we found that the two non-synonymous markers are significantly associated with schizophrenia (rs2274736, OR = 0.92, 95% CI: 0.86-0.97, P = 5.45 × 10⁻³ and rs2401751, OR = 0.92, 95% CI: 0.86-0.97, P = 5.29 × 10⁻³). One SNP (rs7147796) in EML5 is also significantly associated with the disease (OR = 1.08, 95% CI: 1.02-1.14, P = 6.43 × 10⁻³). These 3 markers remain significant after Bonferroni correction. Furthermore, haplotype-conditioned analyses indicated that the association signals observed between rs2274736/rs2401751 and rs7147796 are statistically independent. Given the results that 2 non-synonymous markers in PTPN21 are associated with schizophrenia, further investigation of this locus is warranted.

  3. RGS2 expression predicts amyloid-β sensitivity, MCI and Alzheimer's disease: genome-wide transcriptomic profiling and bioinformatics data mining

    Science.gov (United States)

    Hadar, A; Milanesi, E; Squassina, A; Niola, P; Chillotti, C; Pasmanik-Chor, M; Yaron, O; Martásek, P; Rehavi, M; Weissglas-Volkov, D; Shomron, N; Gozes, I; Gurwitz, D

    2016-01-01

    Alzheimer's disease (AD) is the most frequent cause of dementia. Misfolded protein pathological hallmarks of AD are brain deposits of amyloid-β (Aβ) plaques and phosphorylated tau neurofibrillary tangles. However, doubts about the role of Aβ in AD pathology have been raised as Aβ is a common component of extracellular brain deposits found, also by in vivo imaging, in non-demented aged individuals. It has been suggested that some individuals are more prone to Aβ neurotoxicity and hence more likely to develop AD when aging brains start accumulating Aβ plaques. Here, we applied genome-wide transcriptomic profiling of lymphoblastoid cell lines (LCLs) from healthy individuals and AD patients for identifying genes that predict sensitivity to Aβ. Real-time PCR validation identified 3.78-fold lower expression of RGS2 (regulator of G-protein signaling 2; P=0.0085) in LCLs from healthy individuals exhibiting high vs low Aβ sensitivity. Furthermore, RGS2 showed 3.3-fold lower expression (P=0.0008) in AD LCLs compared with controls. Notably, RGS2 expression in AD LCLs correlated with the patients' cognitive function. Lower RGS2 expression levels were also discovered in published expression data sets from postmortem AD brain tissues as well as in mild cognitive impairment and AD blood samples compared with controls. In conclusion, Aβ sensitivity phenotyping followed by transcriptomic profiling and published patient data mining identified reduced peripheral and brain expression levels of RGS2, a key regulator of G-protein-coupled receptor signaling and neuronal plasticity. RGS2 is suggested as a novel AD biomarker (alongside other genes) toward early AD detection and future disease modifying therapeutics. PMID:27701409

  4. Ontology for Genome Comparison and Genomic Rearrangements

    Directory of Open Access Journals (Sweden)

    Anil Wipat

    2006-04-01

    We present an ontology for describing genomes, genome comparisons, their evolution and biological function. This ontology will support the development of novel genome comparison algorithms and aid the community in discussing genomic evolution. It provides a framework for communication about comparative genomics, and a basis upon which further automated analysis can be built. The nomenclature defined by the ontology will foster clearer communication between biologists, and also standardize terms used by data publishers in the results of analysis programs. The overriding aim of this ontology is the facilitation of consistent annotation of genomes through computational methods, rather than human annotators. To this end, the ontology includes definitions that support computer analysis and automated transfer of annotations between genomes, rather than relying upon human mediation.

  5. Automated High Throughput Drug Target Crystallography

    Energy Technology Data Exchange (ETDEWEB)

    Rupp, B

    2005-02-18

    The molecular structures of drug target proteins and receptors form the basis for 'rational' or structure-guided drug design. The majority of target structures are experimentally determined by protein X-ray crystallography, which has evolved into a highly automated, high throughput drug discovery and screening tool. Process automation has accelerated tasks from parallel protein expression, fully automated crystallization, and rapid data collection to highly efficient structure determination methods. A thoroughly designed automation technology platform supported by a powerful informatics infrastructure forms the basis for optimal workflow implementation and the data mining and analysis tools to generate new leads from experimental protein drug target structures.

  6. Whole genome sequencing of Streptococcus pneumoniae: development, evaluation and verification of targets for serogroup and serotype prediction using an automated pipeline

    Directory of Open Access Journals (Sweden)

    Georgia Kapatai

    2016-09-01

    Full Text Available Streptococcus pneumoniae typically express one of 92 serologically distinct capsule polysaccharide (cps types (serotypes. Some of these serotypes are closely related to each other; using the commercially available typing antisera, these are assigned to common serogroups containing types that show cross-reactivity. In this serotyping scheme, factor antisera are used to allocate serotypes within a serogroup, based on patterns of reactions. This serotyping method is technically demanding, requires considerable experience and the reading of the results can be subjective. This study describes the analysis of the S. pneumoniae capsular operon genetic sequence to determine serotype distinguishing features and the development, evaluation and verification of an automated whole genome sequence (WGS-based serotyping bioinformatics tool, PneumoCaT (Pneumococcal Capsule Typing. Initially, WGS data from 871 S. pneumoniae isolates were mapped to reference cps locus sequences for the 92 serotypes. Thirty-two of 92 serotypes could be unambiguously identified based on sequence similarities within the cps operon. The remaining 60 were allocated to one of 20 ‘genogroups’ that broadly correspond to the immunologically defined serogroups. By comparing the cps reference sequences for each genogroup, unique molecular differences were determined for serotypes within 18 of the 20 genogroups and verified using the set of 871 isolates. This information was used to design a decision-tree style algorithm within the PneumoCaT bioinformatics tool to predict to serotype level for 89/94 (92 + 2 molecular types/subtypes from WGS data and to serogroup level for serogroups 24 and 32, which currently comprise 2.1% of UK referred, invasive isolates submitted to the National Reference Laboratory (NRL, Public Health England (June 2014–July 2015. PneumoCaT was evaluated with an internal validation set of 2065 UK isolates covering 72/92 serotypes, including 19 non-typeable isolates

  7. The hydrogen mine introduction initiative

    Energy Technology Data Exchange (ETDEWEB)

    Betournay, M.C.; Howell, B. [Natural Resources Canada, Ottawa, ON (Canada). CANMET Mining and Mineral Sciences Laboratories

    2009-07-01

    In an effort to address air quality concerns in underground mines, the mining industry is considering the use of fuel cells instead of diesel to power mine production vehicles. The immediate issues and opportunities associated with fuel cell use include a reduction in harmful greenhouse gas emissions; reduction in ventilation operating costs; reduction in energy consumption; improved health benefits; automation; and high productivity. The objective of the hydrogen mine introduction initiative (HMII) is to develop and test the range of fundamental and needed operational technology, specifications and best practices for underground hydrogen power applications. Although proof-of-concept studies have shown high potential for fuel cell use, safety considerations must be addressed, including hydrogen behaviour in confined conditions. This presentation highlighted the issues in meeting operational requirements, notably hydrogen production; delivery and storage; mine regulations; and hydrogen behaviour underground. tabs., figs.

  8. Introduction to Space Resource Mining

    Science.gov (United States)

    Mueller, Robert P.

    2013-01-01

    There are vast amounts of resources in the solar system that will be useful to humans in space and possibly on Earth. None of these resources can be exploited without the first necessary step of extra-terrestrial mining. The necessary technologies for tele-robotic and autonomous mining have not matured sufficiently yet. The current state of technology was assessed for terrestrial and extraterrestrial mining and a taxonomy of robotic space mining mechanisms was presented which was based on current existing prototypes. Terrestrial and extra-terrestrial mining methods and technologies are on the cusp of massive changes towards automation and autonomy for economic and safety reasons. It is highly likely that these industries will benefit from mutual cooperation and technology transfer.

  9. Longwall mining

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1995-03-14

    As part of EIA's program to provide information on coal, this report, Longwall Mining, describes longwall mining and compares it with other underground mining methods. Using data from EIA and private sector surveys, the report describes major changes in the geologic, technological, and operating characteristics of longwall mining over the past decade. Most important, the report shows how these changes led to dramatic improvements in longwall mining productivity. For readers interested in the history of longwall mining and greater detail on recent developments affecting longwall mining, the report includes a bibliography.

  10. Antimicrobials of Bacillus species: mining and engineering

    OpenAIRE

    Zhao, Xin

    2016-01-01

    Bacillus spp. have been successfully used to suppress various bacterial and fungal pathogens. Due to the wide availability of whole genome sequence data and the development of genome mining tools, novel antimicrobials are being discovered and updated, not only bacteriocins but also NRPs and PKs. A new classification system of known and putative antimicrobial compounds of Bacillus by genome mining is presented in Chapter 2. Importantly, predicting, isolating and screening of Bacillus strains w...

  11. Library Automation

    OpenAIRE

    Dhakne, B. N.; Giri, V. V.; Waghmode, S. S.

    2010-01-01

    New technologies provide libraries with several new materials, media and modes of storing and communicating information. Library automation reduces the drudgery of repeated manual efforts in library routines. Library automation supports collection, storage, administration, processing, preservation and communication, etc.

  12. Genome mining for new α-amylase and glucoamylase encoding sequences and high level expression of a glucoamylase from Talaromyces stipitatus for potential raw starch hydrolysis.

    Science.gov (United States)

    Xiao, Zhizhuang; Wu, Meiqun; Grosse, Stephan; Beauchemin, Manon; Lévesque, Michelle; Lau, Peter C K

    2014-01-01

    Mining fungal genomes for glucoamylase and α-amylase encoding sequences led to the selection of 23 candidates, two of which (designated TSgam-2 and NFamy-2) were advanced to testing for cooked or raw starch hydrolysis. TSgam-2 is a 66-kDa glucoamylase recombinantly produced in Pichia pastoris and originally derived from Talaromyces stipitatus. When harvested in a 20-L bioreactor at high cell density (OD600 > 200), the secreted TSgam-2 enzyme activity from P. pastoris strain GS115 reached 800 U/mL. In a 6-L working volume of a 10-L fermentation, the TSgam-2 protein yield was estimated to be ∼8 g with a specific activity of 360 U/mg. In contrast, the highest activity of NFamy-2, a 70-kDa α-amylase originally derived from Neosartorya fischeri and expressed in P. pastoris KM71, only reached 8 U/mL. Both proteins were purified and characterized in terms of pH and temperature optima, kinetic parameters, and thermostability. TSgam-2 was more thermostable than NFamy-2, with respective half-lives (t1/2) of >300 min at 55 °C and >200 min at 40 °C. The kinetic parameters for raw starch adsorption of TSgam-2 and NFamy-2 were also determined. A combination of NFamy-2 and TSgam-2 hydrolyzed cooked potato and triticale starch into glucose with yields (71-87 %) that are competitive with commercially available α-amylases. In the hydrolysis of raw starch, the best hydrolysis condition was a sequential addition of 40 U of a thermostable Bacillus globigii amylase (BgAmy)/g starch at 80 °C for 16 h, and 40 U TSgam-2/g starch at 45 °C for 24 h. The glucose released was 8.7 g/10 g of triticale starch and 7.9 g/10 g of potato starch, representing starch degradation rates of 95 and 86 %, respectively.

  13. Automation or De-automation

    Science.gov (United States)

    Gorlach, Igor; Wessel, Oliver

    2008-09-01

    In the global automotive industry, for decades, vehicle manufacturers have continually increased the level of automation of production systems in order to be competitive. However, there is a new trend to decrease the level of automation, especially in final car assembly, for reasons of economy and flexibility. In this research, the final car assembly lines at three production sites of Volkswagen are analysed in order to determine the best level of automation for each, in terms of manufacturing costs, productivity, quality and flexibility. The case study is based on the methodology proposed by the Fraunhofer Institute. The results of the analysis indicate that fully automated assembly systems are not necessarily the best option in terms of cost, productivity and quality combined, which is attributed to high complexity of final car assembly systems; some de-automation is therefore recommended. On the other hand, the analysis shows that low automation can result in poor product quality due to reasons related to plant location, such as inadequate workers' skills, motivation, etc. Hence, the automation strategy should be formulated on the basis of analysis of all relevant aspects of the manufacturing process, such as costs, quality, productivity and flexibility in relation to the local context. A more balanced combination of automated and manual assembly operations provides better utilisation of equipment, reduces production costs and improves throughput.

  14. Evolutionary Data Mining Approach to Creating Digital Logic

    Science.gov (United States)

    2010-01-01

    A data mining based procedure for automated reverse engineering has been developed. The data mining algorithm for reverse engineering uses a genetic program (GP) as a data mining function. A genetic program is an algorithm based on the theory of evolution that automatically evolves populations of ... based data mining is then conducted. This procedure incorporates not only the experts' rules into the fitness function, but also the information in the
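
    The key idea here, a fitness function that scores an evolved candidate both against measured data and against expert rules, can be sketched in a few lines. This is a generic illustration, not the reported system; the 0.7/0.3 weighting and the example rule are arbitrary assumptions.

        # Sketch of a fitness function for GP-based reverse engineering that
        # combines data fit with expert-rule compliance. Generic illustration;
        # the weighting and the example rule are arbitrary assumptions.
        def fitness(candidate, test_vectors, expert_rules, w_data=0.7, w_rules=0.3):
            """candidate: callable mapping an input tuple to a Boolean output."""
            data_score = sum(candidate(x) == y for x, y in test_vectors) / len(test_vectors)
            rule_score = sum(rule(candidate) for rule in expert_rules) / len(expert_rules)
            return w_data * data_score + w_rules * rule_score

        def candidate(x):
            # An evolved individual for a hypothetical 2-input logic block.
            return x[0] and not x[1]

        truth_table = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 0)]
        rules = [lambda f: f((1, 1)) == 0]   # expert rule: output must be 0 when both inputs are 1
        print(fitness(candidate, truth_table, rules))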

  15. Technological applications of robotics in mining

    Energy Technology Data Exchange (ETDEWEB)

    Konyukh, V. [Institute of Coal, Kemerovo (Russian Federation)

    1996-12-31

    There are objective preconditions for the use of automated multifunctional machines in mining, and some methods for evaluating the robotizability of mining operations are offered. Functional modeling and Petri nets were used for the synthesis and dynamic simulation of mining robotics systems, which enables the construction of robotics-based technologies for underground mining. Trials of a rail robocar and a teleloader in Siberian ore mines showed that post-trial learning of the control system is necessary. A database of about 90 applications of robotics in coal and other mining has been created and is used for the choice and synthesis of robotized technologies. There are 6 direct and 8 indirect sources of robotics efficiency in mining, which are evaluated overall by comparing materialized and current labor expenses. 4 refs., 7 figs.

  16. Analysis of segmental duplications, mouse genome synteny and recurrent cancer-associated amplicons in human chromosome 6p21-p12.

    Science.gov (United States)

    Martin, J W; Yoshimoto, M; Ludkovski, O; Thorner, P S; Zielenska, M; Squire, J A; Nuin, P A S

    2010-06-01

    It has been proposed that regions of microhomology in the human genome could facilitate genomic rearrangements, copy number transitions, and rapid genomic change during tumor progression. To investigate this idea, this study examines the role of repetitive sequence elements, and corresponding syntenic mouse genomic features, in targeting cancer-associated genomic instability of specific regions of the human genome. Automated database-mining algorithms designed to search for frequent copy number transitions and genomic breakpoints were applied to 2 publicly-available online databases and revealed that 6p21-p12 is one of the regions of the human genome most frequently involved in tumor-specific alterations. In these analyses, 6p21-p12 exhibited the highest frequency of genomic amplification in osteosarcomas. Analysis of repetitive elements in regions of homology between human chromosome 6p and the syntenic regions of the mouse genome revealed a strong association between the location of segmental duplications greater than 5 kilobase-pairs and the position of discontinuities at the end of the syntenic region. The presence of clusters of segmental duplications flanking these syntenic regions also correlated with a high frequency of amplification and genomic alteration. Collectively, the experimental findings, in silico analyses, and comparative genomic studies presented here suggest that segmental duplications may facilitate cancer-associated copy number transitions and rearrangements at chromosome 6p21-p12. This process may involve homology-dependent DNA recombination and/or repair, which may also contribute towards the overall plasticity of the human genome.

  17. Text Mining.

    Science.gov (United States)

    Trybula, Walter J.

    1999-01-01

    Reviews the state of research in text mining, focusing on newer developments. The intent is to describe the disparate investigations currently included under the term text mining and provide a cohesive structure for these efforts. A summary of research identifies key organizations responsible for pushing the development of text mining. A section…

  18. The automation of science.

    Science.gov (United States)

    King, Ross D; Rowland, Jem; Oliver, Stephen G; Young, Michael; Aubrey, Wayne; Byrne, Emma; Liakata, Maria; Markham, Magdalena; Pir, Pinar; Soldatova, Larisa N; Sparkes, Andrew; Whelan, Kenneth E; Clare, Amanda

    2009-04-03

    The basis of science is the hypothetico-deductive method and the recording of experiments in sufficient detail to enable reproducibility. We report the development of Robot Scientist "Adam," which advances the automation of both. Adam has autonomously generated functional genomics hypotheses about the yeast Saccharomyces cerevisiae and experimentally tested these hypotheses by using laboratory automation. We have confirmed Adam's conclusions through manual experiments. To describe Adam's research, we have developed an ontology and logical language. The resulting formalization involves over 10,000 different research units in a nested treelike structure, 10 levels deep, that relates the 6.6 million biomass measurements to their logical description. This formalization describes how a machine contributed to scientific knowledge.

  19. Reference Based Genome Compression

    OpenAIRE

    Chern, Bobbie; Ochoa, Idoia; Manolakos, Alexandros; No, Albert; Venkat, Kartik; Weissman, Tsachy

    2012-01-01

    DNA sequencing technology has advanced to a point where storage is becoming the central bottleneck in the acquisition and mining of more data. Large amounts of data are vital for genomics research, and generic compression tools, while viable, cannot offer the same savings as approaches tuned to inherent biological properties. We propose an algorithm to compress a target genome given a known reference genome. The proposed algorithm first generates a mapping from the reference to the target gen...
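
    The core idea of reference-based compression, expressing the target as copy operations against the reference plus literal insertions, can be sketched with a naive greedy matcher. This toy encoder/decoder pair is only an illustration of the general principle, not the paper's algorithm.

        # Toy reference-based encoder: represent the target genome as COPY
        # (offset, length) operations against the reference plus literal bases.
        # Naive O(n*m) greedy matching -- illustrative only, not the paper's method.
        def encode(reference, target, min_match=4):
            ops, i = [], 0
            while i < len(target):
                best_off, best_len = -1, 0
                for off in range(len(reference)):
                    length = 0
                    while (off + length < len(reference) and i + length < len(target)
                           and reference[off + length] == target[i + length]):
                        length += 1
                    if length > best_len:
                        best_off, best_len = off, length
                if best_len >= min_match:
                    ops.append(("COPY", best_off, best_len))
                    i += best_len
                else:
                    ops.append(("LITERAL", target[i]))
                    i += 1
            return ops

        def decode(reference, ops):
            out = []
            for op in ops:
                if op[0] == "COPY":
                    _, off, length = op
                    out.append(reference[off:off + length])
                else:
                    out.append(op[1])
            return "".join(out)

        ref = "ACGTACGTTTGACCA"
        tgt = "ACGTACGTAAGACCA"
        ops = encode(ref, tgt)
        assert decode(ref, ops) == tgt
        print(ops)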

  20. Data, Text and Web Mining for Business Intelligence : A Survey

    Directory of Open Access Journals (Sweden)

    Abdul-Aziz Rashid Al-Azmi

    2013-04-01

    The Information and Communication Technologies revolution brought a digital world with huge amounts of data available. Enterprises use mining technologies to search vast amounts of data for vital insight and knowledge. Mining tools such as data mining, text mining, and web mining are used to find hidden knowledge in large databases or the Internet. Mining tools are automated software tools used to achieve business intelligence by finding hidden relations, and predicting future events from vast amounts of data. This uncovered knowledge helps in gaining competitive advantages, better customer relationships, and even fraud detection. In this survey, we'll describe how these techniques work and how they are implemented. Furthermore, we shall discuss how business intelligence is achieved using these mining tools. We then look into some case studies of success stories using mining tools. Finally, we shall demonstrate some of the main challenges to the mining technologies that limit their potential.

  1. DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY

    Directory of Open Access Journals (Sweden)

    Abdul-Aziz Rashid

    2013-03-01

    The Information and Communication Technologies revolution brought a digital world with huge amounts of data available. Enterprises use mining technologies to search vast amounts of data for vital insight and knowledge. Mining tools such as data mining, text mining, and web mining are used to find hidden knowledge in large databases or the Internet. Mining tools are automated software tools used to achieve business intelligence by finding hidden relations, and predicting future events from vast amounts of data. This uncovered knowledge helps in gaining competitive advantages, better customer relationships, and even fraud detection. In this survey, we'll describe how these techniques work and how they are implemented. Furthermore, we shall discuss how business intelligence is achieved using these mining tools. We then look into some case studies of success stories using mining tools. Finally, we shall demonstrate some of the main challenges to the mining technologies that limit their potential.

  2. Enhancements for a Dynamic Data Warehousing and Mining System for Large-scale HSCB Data

    Science.gov (United States)

    2016-08-29

    Intelligent Automation Incorporated. Enhancements for a Dynamic Data Warehousing and Mining System for Large-scale HSCB Data. Monthly Report No. 5. Reporting Period: July 20, 2016 – Aug 19, 2016. Contract No. N00014-16-P-3014.

  3. Automating Finance

    Science.gov (United States)

    Moore, John

    2007-01-01

    In past years, higher education's financial management side has been riddled with manual processes and aging mainframe applications. This article discusses schools that have taken advantage of an array of technologies that automate billing, payment processing, and refund processing in the case of overpayment. The investments are well worth it:…

  4. Use of a Pan–Genomic DNA Microarray in Determination of the Phylogenetic Relatedness among Cronobacter spp. and Its Use as a Data Mining Tool to Understand Cronobacter Biology

    Directory of Open Access Journals (Sweden)

    Ben D. Tall

    2017-03-01

    Cronobacter (previously known as Enterobacter sakazakii) is a genus of Gram-negative, facultatively anaerobic, oxidase-negative, catalase-positive, rod-shaped bacteria of the family Enterobacteriaceae. These organisms cause a variety of illnesses such as meningitis, necrotizing enterocolitis, and septicemia in neonates and infants, and urinary tract, wound, abscesses or surgical site infections, septicemia, and pneumonia in adults. The total gene content of 379 strains of Cronobacter spp. and taxonomically-related isolates was determined using a recently reported DNA microarray. The Cronobacter microarray as a genotyping tool gives the global food safety community a rapid method to identify and capture the total genomic content of outbreak isolates for food safety, environmental, and clinical surveillance purposes. It was able to differentiate the seven Cronobacter species from one another and from non-Cronobacter species. The microarray was also able to cluster strains within each species into well-defined subgroups. These results also support previous studies on the phylogenic separation of species members of the genus and clearly highlight the evolutionary sequence divergence among each species of the genus compared to phylogenetically-related species. This review extends these studies and illustrates how the microarray can also be used as an investigational tool to mine genomic data sets from strains. Three case studies describing the use of the microarray are shown and include: (1) the determination of allelic differences among Cronobacter sakazakii strains possessing the virulence plasmid pESA3; (2) mining of malonate and myo-inositol alleles among subspecies of Cronobacter dublinensis strains to determine subspecies identity; and (3) lastly, using the microarray to demonstrate sequence divergence and phylogenetic relatedness trends for 13 outer-membrane protein alleles among 240 Cronobacter and phylogenetically-related strains. The goal of

  5. Heating automation

    OpenAIRE

    Tomažič, Tomaž

    2013-01-01

    This degree paper presents the usage and operation of peripheral devices with a microcontroller for heating automation. The main goal is to build a quality control system for heating three floors of a house and, in doing so, increase the efficiency of the heating devices and lower heating expenses. A heat pump, furnace, boiler pump, two floor-heating pumps and two radiator pumps need to be controlled by this system. For this work, we chose the STM32F4-Discovery development kit with five temperature sensors, an LCD disp...

  6. Automation Security

    OpenAIRE

    Mirzoev, Dr. Timur

    2014-01-01

    Web-based Automated Process Control systems are a new type of application that uses the Internet to control industrial processes with access to real-time data. Supervisory control and data acquisition (SCADA) networks contain computers and applications that perform key functions in providing essential services and commodities (e.g., electricity, natural gas, gasoline, water, waste treatment, transportation) to all Americans. As such, they are part of the nation's critical infrastructu...

  7. Marketing automation

    OpenAIRE

    Raluca Dania TODOR

    2017-01-01

    The automation of the marketing process seems to be, nowadays, the only solution to face the major changes brought by the fast evolution of technology and the continuous increase in supply and demand. In order to achieve the desired marketing results, businesses have to employ digital marketing and communication services. These services are efficient and measurable thanks to the marketing technology used to track, score and implement each campaign. Due to the...

  8. Use of genome-wide expression data to mine the "gray zone" of GWA studies leads to novel candidate obesity genes

    NARCIS (Netherlands)

    J. Naukkarinen (Jussi); I. Surakka (Ida); K.H. Pietilainen (Kirsi Hannele); A. Rissanen (Aila); V. Salomaa (Veikko); S. Ripatti (Samuli); H. Yki-Jarvinen (Hannele); C.M. van Duijn (Cock); H.E. Wichmann (Heinz Erich); J. Kaprio (Jaakko); M. Taskinen (Marja Riitta); L. Peltonen (Leena Johanna)

    2010-01-01

    To get beyond the "low-hanging fruits" so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity.

  9. Text mining for systems biology.

    Science.gov (United States)

    Fluck, Juliane; Hofmann-Apitius, Martin

    2014-02-01

    Scientific communication in biomedicine is, by and large, still text based. Text mining technologies for the automated extraction of useful biomedical information from unstructured text that can be directly used for systems biology modelling have been substantially improved over the past few years. In this review, we underline the importance of named entity recognition and relationship extraction as fundamental approaches that are relevant to systems biology. Furthermore, we emphasize the role of publicly organized scientific benchmarking challenges that reflect the current status of text-mining technology and are important in moving the entire field forward. Given further interdisciplinary development of systems biology-orientated ontologies and training corpora, we expect a steadily increasing impact of text-mining technology on systems biology in the future.
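
    The named entity recognition and relationship extraction steps highlighted above can be illustrated with a minimal dictionary-based sketch in Python; the gene names, trigger words and sentence below are invented placeholders, and production pipelines rely on curated lexicons and trained models rather than this toy matching.

    import re
    from itertools import combinations

    # Toy dictionaries standing in for curated biomedical lexicons.
    GENES = {"TP53", "MDM2", "AKT1"}
    RELATION_TRIGGERS = {"activates", "inhibits", "binds"}

    def extract_relations(sentence):
        """Dictionary-based NER plus simple co-occurrence relation extraction."""
        tokens = re.findall(r"[A-Za-z0-9]+", sentence)
        entities = [t for t in tokens if t.upper() in GENES]
        triggers = [t.lower() for t in tokens if t.lower() in RELATION_TRIGGERS]
        relations = []
        # Propose a candidate relation for every entity pair that co-occurs
        # with a trigger word in the same sentence.
        if triggers:
            for a, b in combinations(entities, 2):
                relations.append((a, triggers[0], b))
        return entities, relations

    print(extract_relations("MDM2 inhibits TP53 in many tumour types."))
    # (['MDM2', 'TP53'], [('MDM2', 'inhibits', 'TP53')])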

  10. Educational Data Mining Model Using Rattle

    Directory of Open Access Journals (Sweden)

    Sadiq Hussain

    2014-07-01

    Full Text Available Data mining is the extraction of knowledge from large databases. Data mining has affected fields ranging from combating terror attacks to human genome databases. R programming plays a key role in many kinds of data analysis. Rattle, an effective GUI for R, is used extensively for generating reports based on current models such as random forests and support vector machines. It is otherwise hard to decide which model to choose for the data that needs to be mined. This paper proposes a method using Rattle for the selection of an educational data mining model.
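
    Rattle itself is a GUI on top of R, but the model-comparison workflow it automates can be sketched outside R as well; the following Python/scikit-learn snippet is only an analogue for illustration, with a synthetic data set standing in for real educational data and the model choices assumed rather than taken from the paper.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    # Synthetic stand-in for an educational data set.
    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    models = {
        "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
        "support vector machine": SVC(kernel="rbf", gamma="scale"),
    }

    # Cross-validated accuracy makes the "which model should I choose?" question explicit.
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: mean accuracy {scores.mean():.3f}")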

  11. Antimicrobials of Bacillus species: mining and engineering

    NARCIS (Netherlands)

    Zhao, Xin

    2016-01-01

    Bacillus sp. have been successfully used to suppress various bacterial and fungal pathogens. Due to the wide availability of whole genome sequence data and the development of genome mining tools, novel antimicrobials are being discovered and updated; not only bacteriocins, but also NRPs and PKs. A n

  12. The Aspergillus Mine - publishing bioinformatics

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla; Rasmussen, Jane Lind Nybo; Theobald, Sebastian

    so with no computational specialist. Here we present a setup for analysis and publication of genome data of 70 species of Aspergillus fungi. The platform is based on R and Python and uses the RShiny framework to create interactive web‐applications. It allows all participants to create interactive...... analyses which can be shared with the team and in connection with publications. We present analyses for the investigation of genetic diversity, secondary and primary metabolism and general data overview. The platform, the Aspergillus Mine, is a collection of analysis tools based on data from collaboration...... with the Joint Genome Institute. The Aspergillus Mine is not intended as a genomic data sharing service but instead focuses on creating an environment where the results of bioinformatic analysis are made available for inspection. The data and code are public upon request and figures can be obtained directly from

  13. Social big data mining

    CERN Document Server

    Ishikawa, Hiroshi

    2015-01-01

    Social Media. Big Data and Social Data. Hypotheses in the Era of Big Data. Social Big Data Applications. Basic Concepts in Data Mining. Association Rule Mining. Clustering. Classification. Prediction. Web Structure Mining. Web Content Mining. Web Access Log Mining, Information Extraction and Deep Web Mining. Media Mining. Scalability and Outlier Detection.

  14. Mining a database of single amplified genomes from Red Sea brine pool extremophiles-improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA).

    Science.gov (United States)

    Grötzinger, Stefan W; Alam, Intikhab; Ba Alawi, Wail; Bajic, Vladimir B; Stingl, Ulrich; Eppinger, Jörg

    2014-01-01

    Reliable functional annotation of genomic data is the key step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and poor homology of novel extremophiles' genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the Integrated Data Warehouse of Microbial Genomes (INDIGO) data warehouse. Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile and Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO) terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2577 enzyme commission (E.C.) numbers was translated into 171 GO terms and 49 consensus patterns. A subset of INDIGO sequences consisting of 58 SAGs from six different taxa of bacteria and archaea was selected from six different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable if at least two relevant descriptors (GO terms and/or consensus patterns) are present. Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website.
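
    The reliability rule at the core of the PPM strategy (accept a predicted function only when at least two relevant descriptors support it) can be sketched as follows; the GO terms, PROSITE IDs and gene annotations are placeholders rather than INDIGO data, and the real pipeline works on full InterPro annotations.

    # Minimal sketch of the profile-and-pattern filter, assuming each gene's
    # annotation has already been reduced to sets of GO terms and PROSITE IDs.
    RELEVANT_GO = {"GO:0016787", "GO:0004806"}   # function profile (placeholder IDs)
    RELEVANT_PROSITE = {"PS00120"}               # consensus pattern (placeholder ID)

    genes = {
        "gene_A": {"go": {"GO:0016787"}, "prosite": {"PS00120"}},
        "gene_B": {"go": {"GO:0016787"}, "prosite": set()},
        "gene_C": {"go": set(), "prosite": set()},
    }

    def is_reliable(annotation):
        """A hit is reliable if at least two relevant descriptors are present."""
        profile_hits = annotation["go"] & RELEVANT_GO
        pattern_hits = annotation["prosite"] & RELEVANT_PROSITE
        return len(profile_hits) + len(pattern_hits) >= 2

    print([g for g, ann in genes.items() if is_reliable(ann)])   # ['gene_A']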

  15. Mining a database of single amplified genomes from Red Sea brine pool extremophiles-improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA).

    KAUST Repository

    Grötzinger, Stefan W.

    2014-04-07

    Reliable functional annotation of genomic data is the key step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and poor homology of novel extremophiles' genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the Integrated Data Warehouse of Microbial Genomes (INDIGO) data warehouse. Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile and Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO) terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2577 enzyme commission (E.C.) numbers was translated into 171 GO terms and 49 consensus patterns. A subset of INDIGO sequences consisting of 58 SAGs from six different taxa of bacteria and archaea was selected from six different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable if at least two relevant descriptors (GO terms and/or consensus patterns) are present. Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website.

  16. Life in an arsenic-containing gold mine: genome and physiology of the autotrophic arsenite-oxidizing bacterium rhizobium sp. NT-26.

    Science.gov (United States)

    Andres, Jérémy; Arsène-Ploetze, Florence; Barbe, Valérie; Brochier-Armanet, Céline; Cleiss-Arnold, Jessica; Coppée, Jean-Yves; Dillies, Marie-Agnès; Geist, Lucie; Joublin, Aurélie; Koechler, Sandrine; Lassalle, Florent; Marchal, Marie; Médigue, Claudine; Muller, Daniel; Nesme, Xavier; Plewniak, Frédéric; Proux, Caroline; Ramírez-Bahena, Martha Helena; Schenowitz, Chantal; Sismeiro, Odile; Vallenet, David; Santini, Joanne M; Bertin, Philippe N

    2013-01-01

    Arsenic is widespread in the environment and its presence is a result of natural or anthropogenic activities. Microbes have developed different mechanisms to deal with toxic compounds such as arsenic, either by resisting or by metabolizing the compound. Here, we present the first reference set of genomic, transcriptomic and proteomic data for an Alphaproteobacterium isolated from an arsenic-containing gold mine: Rhizobium sp. NT-26. Although phylogenetically related to the plant-associated bacteria, this organism has lost the major colonizing capabilities needed for symbiosis with legumes. In contrast, the genome of Rhizobium sp. NT-26 comprises a megaplasmid containing the various genes which enable it to metabolize arsenite. Remarkably, although the genes required for arsenite oxidation and flagellar motility/biofilm formation are carried by the megaplasmid and the chromosome, respectively, a coordinate regulation of these two mechanisms was observed. Taken together, these processes illustrate the impact environmental pressure can have on the evolution of bacterial genomes, improving the fitness of bacterial strains by the acquisition of novel functions.

  17. Discussion of characteristic specialty construction for the mechanical engineering and automation specialty based on the characteristics of coal mining

    Institute of Scientific and Technical Information of China (English)

    赵四海

    2016-01-01

    Characteristic specialty construction in universities is an important measure by which higher education improves the quality of personnel training, professional level and distinctive character. Based on the characteristics of coal mining, this paper introduces the main content of the characteristic specialty construction of the mechanical engineering and automation specialty at China University of Mining and Technology (Beijing) in terms of four aspects: the professional training scheme, course and teaching material construction, faculty and team building, and practice teaching.

  18. Automated cognome construction and semi-automated hypothesis generation.

    Science.gov (United States)

    Voytek, Jessica B; Voytek, Bradley

    2012-06-30

    Modern neuroscientific research stands on the shoulders of countless giants. PubMed alone contains more than 21 million peer-reviewed articles with 40-50,000 more published every month. Understanding the human brain, cognition, and disease will require integrating facts from dozens of scientific fields spread amongst millions of studies locked away in static documents, making any such integration daunting, at best. The future of scientific progress will be aided by bridging the gap between the millions of published research articles and modern databases such as the Allen brain atlas (ABA). To that end, we have analyzed the text of over 3.5 million scientific abstracts to find associations between neuroscientific concepts. From the literature alone, we show that we can blindly and algorithmically extract a "cognome": relationships between brain structure, function, and disease. We demonstrate the potential of data-mining and cross-platform data-integration with the ABA by introducing two methods for semi-automated hypothesis generation. By analyzing statistical "holes" and discrepancies in the literature we can find understudied or overlooked research paths. That is, we have added a layer of semi-automation to a part of the scientific process itself. This is an important step toward fundamentally incorporating data-mining algorithms into the scientific method in a manner that is generalizable to any scientific or medical field.
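
    The core of the "cognome" extraction described above is counting how often pairs of neuroscientific terms co-occur across abstracts; a minimal co-occurrence counter is sketched below with an invented term list and abstracts, whereas the actual study adds statistical testing on top of counts from millions of abstracts.

    from collections import Counter
    from itertools import combinations

    # Placeholder vocabulary of brain structures, functions and diseases.
    TERMS = {"hippocampus", "memory", "dopamine", "parkinson"}

    abstracts = [
        "The hippocampus supports episodic memory consolidation.",
        "Dopamine loss in Parkinson disease alters basal ganglia output.",
        "Hippocampus volume predicts memory performance.",
    ]

    pair_counts = Counter()
    for abstract in abstracts:
        found = {t for t in TERMS if t in abstract.lower()}
        # Count every unordered pair of terms mentioned in the same abstract.
        for pair in combinations(sorted(found), 2):
            pair_counts[pair] += 1

    print(pair_counts.most_common())
    # [(('hippocampus', 'memory'), 2), (('dopamine', 'parkinson'), 1)]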

  19. Mining a database of single amplified genomes from Red Sea brine pool extremophiles – Improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA)

    Directory of Open Access Journals (Sweden)

    Stefan Wolfgang Grötzinger

    2014-04-01

    Full Text Available Reliable functional annotation of genomic data is the key step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and poor homology of novel extremophiles' genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the INDIGO data warehouse. Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile & Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO) terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2,577 E.C. numbers was translated into 171 GO terms and 49 consensus patterns. A subset of INDIGO sequences consisting of 58 SAGs from six different taxa of bacteria and archaea was selected from 6 different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable if at least two relevant descriptors (GO terms and/or consensus patterns) are present. Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website.

  20. Use of Genome-Wide Expression Data to Mine the “Gray Zone” of GWA Studies Leads to Novel Candidate Obesity Genes

    Science.gov (United States)

    Naukkarinen, Jussi; Surakka, Ida; Pietiläinen, Kirsi H.; Rissanen, Aila; Salomaa, Veikko; Ripatti, Samuli; Yki-Järvinen, Hannele; van Duijn, Cornelia M.; Wichmann, H.-Erich; Kaprio, Jaakko; Taskinen, Marja-Riitta; Peltonen, Leena

    2010-01-01

    To get beyond the “low-hanging fruits” so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity. Here we describe a novel integrative method for complex disease gene identification utilizing both genome-wide transcript profiling of adipose tissue samples and consequent analysis of genome-wide association data generated in large SNP scans. We infer causality of genes with obesity by employing a unique set of monozygotic twin pairs discordant for BMI (n = 13 pairs, age 24–28 years, 15.4 kg mean weight difference) and contrast the transcript profiles with those from a larger sample of non-related adult individuals (N = 77). Using this approach, we were able to identify 27 genes with possibly causal roles in determining the degree of human adiposity. Testing for association of SNP variants in these 27 genes in the population samples of the large ENGAGE consortium (N = 21,000) revealed a significant deviation of P-values from the expected (P = 4×10⁻⁴). A total of 13 genes contained SNPs nominally associated with BMI. The top finding was blood coagulation factor F13A1 identified as a novel obesity gene also replicated in a second GWA set of ∼2,000 individuals. This study presents a new approach to utilizing gene expression studies for informing choice of candidate genes for complex human phenotypes, such as obesity. PMID:20532202

  1. Use of genome-wide expression data to mine the "Gray Zone" of GWA studies leads to novel candidate obesity genes.

    Directory of Open Access Journals (Sweden)

    Jussi Naukkarinen

    2010-06-01

    Full Text Available To get beyond the "low-hanging fruits" so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity. Here we describe a novel integrative method for complex disease gene identification utilizing both genome-wide transcript profiling of adipose tissue samples and consequent analysis of genome-wide association data generated in large SNP scans. We infer causality of genes with obesity by employing a unique set of monozygotic twin pairs discordant for BMI (n = 13 pairs, age 24-28 years, 15.4 kg mean weight difference) and contrast the transcript profiles with those from a larger sample of non-related adult individuals (N = 77). Using this approach, we were able to identify 27 genes with possibly causal roles in determining the degree of human adiposity. Testing for association of SNP variants in these 27 genes in the population samples of the large ENGAGE consortium (N = 21,000) revealed a significant deviation of P-values from the expected (P = 4×10⁻⁴). A total of 13 genes contained SNPs nominally associated with BMI. The top finding was blood coagulation factor F13A1 identified as a novel obesity gene also replicated in a second GWA set of approximately 2,000 individuals. This study presents a new approach to utilizing gene expression studies for informing choice of candidate genes for complex human phenotypes, such as obesity.

  2. Status and Outlook of Key Technologies for Automated Coal Mining Faces in Thin Seams

    Institute of Scientific and Technical Information of China (English)

    田成金

    2011-01-01

    For the shearer-based and plough-based automated coal mining faces used in thin seams in China, the application status of the related key automation technologies and the factors restricting them are briefly analyzed and summarized. The key automated mining technologies that urgently need breakthroughs in current Chinese automated working faces are analyzed, including coal-rock interface identification, automatic alignment of working face equipment, posture detection and control, plough head detection, and plough head control. Technical problems that need to be solved in the next stage of development of automated working faces are identified, their current research status is summarized, and further research directions are proposed.

  3. Automated Budget System

    Data.gov (United States)

    Department of Transportation — The Automated Budget System (ABS) automates management and planning of the Mike Monroney Aeronautical Center (MMAC) budget by providing enhanced capability to plan,...

  4. Genomic data mining of the marine actinobacteria Streptomyces sp. H-KF8 unveils insights into multi-stress related genes and metabolic pathways involved in antimicrobial synthesis

    Directory of Open Access Journals (Sweden)

    Agustina Undabarrena

    2017-02-01

    Full Text Available Streptomyces sp. H-KF8 is an actinobacterial strain isolated from marine sediments of a Chilean Patagonian fjord. Morphological characterization together with antibacterial activity was assessed in various culture media, revealing a carbon-source dependent activity mainly against Gram-positive bacteria (S. aureus and L. monocytogenes). Genome mining of this antibacterial-producing bacterium revealed the presence of 26 biosynthetic gene clusters (BGCs) for secondary metabolites, of which 81% have low similarities with known BGCs. In addition, a genomic search in Streptomyces sp. H-KF8 unveiled the presence of a wide variety of genetic determinants related to heavy metal resistance (49 genes), oxidative stress (69 genes) and antibiotic resistance (97 genes). This study revealed that the marine-derived Streptomyces sp. H-KF8 bacterium has the capability to tolerate a diverse set of heavy metals such as copper, cobalt, mercury, chromate and nickel, as well as the highly toxic tellurite, a feature described for the first time in Streptomyces. In addition, Streptomyces sp. H-KF8 possesses a major resistance towards oxidative stress, in comparison to the soil reference strain Streptomyces violaceoruber A3(2). Moreover, Streptomyces sp. H-KF8 showed resistance to 88% of the antibiotics tested, indicating, overall, a strong response to several abiotic stressors. The combination of these biological traits confirms the metabolic versatility of Streptomyces sp. H-KF8, a genetically well-prepared microorganism with the ability to confront the dynamics of the fjord-unique marine environment.

  5. Genomic data mining of the marine actinobacteria Streptomyces sp. H-KF8 unveils insights into multi-stress related genes and metabolic pathways involved in antimicrobial synthesis

    Science.gov (United States)

    Undabarrena, Agustina; Ugalde, Juan A.; Seeger, Michael

    2017-01-01

    Streptomyces sp. H-KF8 is an actinobacterial strain isolated from marine sediments of a Chilean Patagonian fjord. Morphological characterization together with antibacterial activity was assessed in various culture media, revealing a carbon-source dependent activity mainly against Gram-positive bacteria (S. aureus and L. monocytogenes). Genome mining of this antibacterial-producing bacterium revealed the presence of 26 biosynthetic gene clusters (BGCs) for secondary metabolites, of which 81% have low similarities with known BGCs. In addition, a genomic search in Streptomyces sp. H-KF8 unveiled the presence of a wide variety of genetic determinants related to heavy metal resistance (49 genes), oxidative stress (69 genes) and antibiotic resistance (97 genes). This study revealed that the marine-derived Streptomyces sp. H-KF8 bacterium has the capability to tolerate a diverse set of heavy metals such as copper, cobalt, mercury, chromate and nickel, as well as the highly toxic tellurite, a feature described for the first time in Streptomyces. In addition, Streptomyces sp. H-KF8 possesses a major resistance towards oxidative stress, in comparison to the soil reference strain Streptomyces violaceoruber A3(2). Moreover, Streptomyces sp. H-KF8 showed resistance to 88% of the antibiotics tested, indicating, overall, a strong response to several abiotic stressors. The combination of these biological traits confirms the metabolic versatility of Streptomyces sp. H-KF8, a genetically well-prepared microorganism with the ability to confront the dynamics of the fjord-unique marine environment. PMID:28229018

  6. Mining whole genomes and transcriptomes of Jatropha (Jatropha curcas) and Castor bean (Ricinus communis) for NBS-LRR genes and defense response associated transcription factors.

    Science.gov (United States)

    Sood, Archit; Jaiswal, Varun; Chanumolu, Sree Krishna; Malhotra, Nikhil; Pal, Tarun; Chauhan, Rajinder Singh

    2014-11-01

    Jatropha (Jatropha curcas L.) and Castor bean (Ricinus communis) are oilseed crops of the family Euphorbiaceae with the potential of producing high quality biodiesel and having industrial value. Both bioenergy plants are becoming susceptible to various biotic stresses directly affecting oil quality and content. No report exists as of today on analysis of the Nucleotide Binding Site-Leucine Rich Repeat (NBS-LRR) gene repertoire and defense response transcription factors in these plant species. In silico analysis of whole genomes and transcriptomes identified 47 new NBS-LRR genes in the two species and 122 and 318 defense response related transcription factors in Jatropha and Castor bean, respectively. The identified NBS-LRR genes and defense response transcription factors were mapped onto the respective genomes. Common and unique NBS-LRR genes and defense related transcription factors were identified in both plant species. All NBS-LRR genes in both species were classified into Toll/interleukin-1 receptor NBS-LRRs (TNLs) and coiled-coil NBS-LRRs (CNLs) and characterized in terms of position on contigs, gene clusters, and motif and domain distribution. Transcript abundance or expression values were measured for all NBS-LRR genes and defense response transcription factors, suggesting their functional role. The current study provides a repertoire of NBS-LRR genes and transcription factors which can be used not only in dissecting the molecular basis of the disease resistance phenotype but also in developing disease resistant genotypes in Jatropha and Castor bean through transgenic or molecular breeding approaches.

  7. Genome mining of lipolytic exoenzymes from Bacillus safensis S9 and Pseudomonas alcaliphila ED1 isolated from a dairy wastewater lagoon.

    Science.gov (United States)

    Ficarra, Florencia A; Santecchia, Ignacio; Lagorio, Sebastián H; Alarcón, Sergio; Magni, Christian; Espariz, Martín

    2016-11-01

    Dairy production plants produce highly polluted wastewaters rich in organic molecules such as lactose, proteins and fats. Fats generally lead to low overall performance of the treatment system. In this study, a wastewater dairy lagoon was used as a microbial source and different screening strategies were conducted to select 58 lipolytic microorganisms. Exoenzyme and RAPD analyses revealed genetic and phenotypic diversity among the isolates. Bacillus safensis, Pseudomonas alcaliphila and the potential pathogens B. cereus, Aeromonas and Acinetobacter were identified by 16S-rRNA, gyrA, oprI and/or oprL sequence analyses. Five out of 10 selected isolates produced lipolytic enzymes and grew in dairy wastewater. Based on these abilities and their safety, B. safensis S9 and P. alcaliphila ED1 were selected and their genome sequences determined. The genomes of strains S9 and ED1 consisted of 3,794,315 and 5,239,535 bp and encoded 3990 and 4844 genes, respectively. Putative extracellular enzymes with lipolytic (12 and 16), proteolytic (20) or hydrolytic (10 and 15) activity were identified for the S9 and ED1 strains, respectively. These bacteria also encoded other technologically relevant proteins such as amylases, proteases, glucanases, xylanases and pectate lyases.

  8. Automation 2017

    CERN Document Server

    Zieliński, Cezary; Kaliczyńska, Małgorzata

    2017-01-01

    This book consists of papers presented at Automation 2017, an international conference held in Warsaw from March 15 to 17, 2017. It discusses research findings associated with the concepts behind INDUSTRY 4.0, with a focus on offering a better understanding of and promoting participation in the Fourth Industrial Revolution. Each chapter presents a detailed analysis of a specific technical problem, in most cases followed by a numerical analysis, simulation and description of the results of implementing the solution in a real-world context. The theoretical results, practical solutions and guidelines presented are valuable for both researchers working in the area of engineering sciences and practitioners looking for solutions to industrial problems.

  9. Marketing automation

    Directory of Open Access Journals (Sweden)

    TODOR Raluca Dania

    2017-01-01

    Full Text Available The automation of the marketing process seems to be, nowadays, the only solution to face the major changes brought by the fast evolution of technology and the continuous increase in supply and demand. In order to achieve the desired marketing results, businesses have to employ digital marketing and communication services. These services are efficient and measurable thanks to the marketing technology used to track, score and implement each campaign. Due to technical progress, marketing fragmentation and the demand for customized products and services on one side, and the need to achieve constructive dialogue with customers, immediate and flexible response, and the necessity to measure investments and results on the other side, the classical marketing approach has changed and continues to improve substantially.

  10. Beegle: from literature mining to disease-gene discovery.

    Science.gov (United States)

    ElShal, Sarah; Tranchevent, Léon-Charles; Sifrim, Alejandro; Ardeshirdavani, Amin; Davis, Jesse; Moreau, Yves

    2016-01-29

    Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/.

  11. ONTOLOGY BASED DATA MINING METHODOLOGY FOR DISCRIMINATION PREVENTION

    Directory of Open Access Journals (Sweden)

    Nandana Nagabhushana

    2014-09-01

    Full Text Available Data mining is being increasingly used in the field of automation of decision making processes, which involve extraction and discovery of information hidden in large volumes of collected data. Nonetheless, negative perceptions such as privacy invasion and potential discrimination hinder the use of data mining methodologies in software systems employing automated decision making. Loan granting, employment, insurance premium calculation, admissions in educational institutions, etc., can make use of data mining to effectively prevent human biases pertaining to certain attributes like gender, nationality and race in critical decision making. The proposed methodology prevents discriminatory rules ensuing from the presence of certain information regarding sensitive discriminatory attributes in the data itself. The proposal is novel in two respects: first, the rule mining technique is based on ontologies; second, mined rules that are quantified as discriminatory are generalized and transformed into non-discriminatory ones.

  12. Reference Based Genome Compression

    CERN Document Server

    Chern, Bobbie; Manolakos, Alexandros; No, Albert; Venkat, Kartik; Weissman, Tsachy

    2012-01-01

    DNA sequencing technology has advanced to a point where storage is becoming the central bottleneck in the acquisition and mining of more data. Large amounts of data are vital for genomics research, and generic compression tools, while viable, cannot offer the same savings as approaches tuned to inherent biological properties. We propose an algorithm to compress a target genome given a known reference genome. The proposed algorithm first generates a mapping from the reference to the target genome, and then compresses this mapping with an entropy coder. As an illustration of the performance: applying our algorithm to James Watson's genome with hg18 as a reference, we are able to reduce the 2991 megabyte (MB) genome down to 6.99 MB, while Gzip compresses it to 834.8 MB.
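
    The two-stage idea (map the target onto the reference, then entropy-code the mapping) can be sketched with standard-library tools; the snippet below is a toy stand-in rather than the authors' algorithm, using difflib to build the mapping and zlib as a generic entropy coder, on sequences far shorter than real genomes.

    import difflib
    import json
    import zlib

    def compress_against_reference(reference: str, target: str) -> bytes:
        """Encode only the edit operations needed to rebuild the target from the reference."""
        ops = difflib.SequenceMatcher(None, reference, target).get_opcodes()
        # Keep replacement text only where the target differs from the reference.
        mapping = [(tag, i1, i2, target[j1:j2] if tag != "equal" else "")
                   for tag, i1, i2, j1, j2 in ops]
        return zlib.compress(json.dumps(mapping).encode())

    def decompress(reference: str, blob: bytes) -> str:
        pieces = []
        for tag, i1, i2, repl in json.loads(zlib.decompress(blob)):
            pieces.append(reference[i1:i2] if tag == "equal" else repl)
        return "".join(pieces)

    ref = "ACGTACGTACGTTTGACCA"
    tgt = "ACGTACGAACGTTTGACCA"   # one substitution relative to the reference
    blob = compress_against_reference(ref, tgt)
    assert decompress(ref, blob) == tgt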

  13. Coal Mines, Active - Longwall Mining Panels

    Data.gov (United States)

    NSGIC GIS Inventory (aka Ramona) — Coal mining has occurred in Pennsylvania for over a century. A method of coal mining known as Longwall Mining has become more prevalent in recent decades. Longwall...

  14. Automated Event Service: Efficient and Flexible Searching for Earth Science Phenomena Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Develop an Automated Event Service system that: Methodically mines custom-defined events in the reanalysis data sets of global atmospheric models. Enables...

  15. Data mining strategies to improve multiplex microbead immunoassay tolerance in a mouse model of infectious diseases.

    Science.gov (United States)

    Mani, Akshay; Ravindran, Resmi; Mannepalli, Soujanya; Vang, Daniel; Luciw, Paul A; Hogarth, Michael; Khan, Imran H; Krishnan, Viswanathan V

    2015-01-01

    Multiplex methodologies, especially those with high-throughput capabilities, generate large volumes of data. Accumulation of such data (e.g., genomics, proteomics, metabolomics etc.) is fast becoming more common and thus requires the development and implementation of effective data mining strategies designed for biological and clinical applications. Multiplex microbead immunoassay (MMIA), on the xMAP or MagPix platform (Luminex), which is amenable to automation, offers a major advantage over conventional methods such as Western blot or ELISA for increasing the efficiency of serodiagnosis of infectious diseases. MMIA allows detection of antibodies and/or antigens efficiently for a wide range of infectious agents simultaneously in host blood samples, in one reaction vessel. In the process, MMIA generates large volumes of data. In this report we demonstrate how the application of data mining tools to these inherently large data volumes can improve assay tolerance (measured in terms of sensitivity and specificity), through analysis of experimental data accumulated over a span of two years. The combination of prior knowledge with machine learning tools provides an efficient approach to improve the diagnostic power of the assay on a continuous basis. Furthermore, this study provides an in-depth knowledge base for studying pathological trends of infectious agents in mouse colonies on a multivariate scale. Data mining techniques using serodetection of infections in mice, developed in this study, can be used as a general model for more complex applications in epidemiology and clinical translational research.
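
    Assay tolerance is measured here as sensitivity and specificity; a minimal helper for computing both from labelled assay calls is sketched below, with invented truth labels and calls, while the study itself layers machine-learning classifiers on top of such measures.

    def sensitivity_specificity(truth, calls):
        """truth/calls are equal-length sequences of booleans: True = infected / positive call."""
        tp = sum(t and c for t, c in zip(truth, calls))
        tn = sum((not t) and (not c) for t, c in zip(truth, calls))
        fp = sum((not t) and c for t, c in zip(truth, calls))
        fn = sum(t and (not c) for t, c in zip(truth, calls))
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        return sensitivity, specificity

    # Toy example: six mouse samples with one false negative and one false positive.
    truth = [True, True, True, False, False, False]
    calls = [True, True, False, True, False, False]
    print(sensitivity_specificity(truth, calls))   # approximately (0.667, 0.667)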

  16. Bovine Genome Database: new tools for gleaning function from the Bos taurus genome.

    Science.gov (United States)

    Elsik, Christine G; Unni, Deepak R; Diesh, Colin M; Tayal, Aditi; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

    2016-01-01

    We report an update of the Bovine Genome Database (BGD) (http://BovineGenome.org). The goal of BGD is to support bovine genomics research by providing genome annotation and data mining tools. We have developed new genome and annotation browsers using JBrowse and WebApollo for two Bos taurus genome assemblies, the reference genome assembly (UMD3.1.1) and the alternate genome assembly (Btau_4.6.1). Annotation tools have been customized to highlight priority genes for annotation, and to aid annotators in selecting gene evidence tracks from 91 tissue specific RNAseq datasets. We have also developed BovineMine, based on the InterMine data warehousing system, to integrate the bovine genome, annotation, QTL, SNP and expression data with external sources of orthology, gene ontology, gene interaction and pathway information. BovineMine provides powerful query building tools, as well as customized query templates, and allows users to analyze and download genome-wide datasets. With BovineMine, bovine researchers can use orthology to leverage the curated gene pathways of model organisms, such as human, mouse and rat. BovineMine will be especially useful for gene ontology and pathway analyses in conjunction with GWAS and QTL studies.
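
    BovineMine is built on InterMine, which also ships a Python client; a hedged sketch of a programmatic query is shown below. The service URL, class name and constraint value are illustrative assumptions, so the BovineMine documentation should be checked for the exact endpoint and data model before use.

    from itertools import islice
    from intermine.webservice import Service

    # Assumed endpoint; verify against the BovineMine site before relying on it.
    service = Service("http://bovinegenome.org/bovinemine/service")

    query = service.new_query("Gene")
    query.add_view("primaryIdentifier", "symbol", "chromosome.primaryIdentifier")
    query.add_constraint("symbol", "=", "LEP", code="A")   # hypothetical gene of interest

    # Print the first few matching rows.
    for row in islice(query.rows(), 5):
        print(row)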

  17. Data mining

    CERN Document Server

    Gorunescu, Florin

    2011-01-01

    The knowledge discovery process is as old as Homo sapiens. Until some time ago, this process was solely based on the 'natural personal' computer provided by Mother Nature. Fortunately, in recent decades the problem has begun to be solved based on the development of the Data mining technology, aided by the huge computational power of the 'artificial' computers. Digging intelligently in different large databases, data mining aims to extract implicit, previously unknown and potentially useful information from data, since 'knowledge is power'. The goal of this book is to provide, in a friendly way

  18. Mining Review

    Science.gov (United States)

    ,

    2013-01-01

    In 2012, the estimated value of mineral production increased in the United States for the third consecutive year. Production and prices increased for most industrial mineral commodities mined in the United States. While production for most metals remained relatively unchanged, with the notable exception of gold, the prices for most metals declined. Minerals remained fundamental to the U.S. economy, contributing to the real gross domestic product (GDP) at several levels, including mining, processing and manufacturing finished products. Minerals’ contribution to the GDP increased for the second consecutive year.

  19. Data mining and education.

    Science.gov (United States)

    Koedinger, Kenneth R; D'Mello, Sidney; McLaughlin, Elizabeth A; Pardos, Zachary A; Rosé, Carolyn P

    2015-01-01

    An emerging field of educational data mining (EDM) is building on and contributing to a wide variety of disciplines through analysis of data coming from various educational technologies. EDM researchers are addressing questions of cognition, metacognition, motivation, affect, language, social discourse, etc. using data from intelligent tutoring systems, massive open online courses, educational games and simulations, and discussion forums. The data include detailed action and timing logs of student interactions in user interfaces such as graded responses to questions or essays, steps in rich problem solving environments, games or simulations, discussion forum posts, or chat dialogs. They might also include external sensors such as eye tracking, facial expression, body movement, etc. We review how EDM has addressed the research questions that surround the psychology of learning with an emphasis on assessment, transfer of learning and model discovery, the role of affect, motivation and metacognition on learning, and analysis of language data and collaborative learning. For example, we discuss (1) how different statistical assessment methods were used in a data mining competition to improve prediction of student responses to intelligent tutor tasks, (2) how better cognitive models can be discovered from data and used to improve instruction, (3) how data-driven models of student affect can be used to focus discussion in a dialog-based tutoring system, and (4) how machine learning techniques applied to discussion data can be used to produce automated agents that support student learning as they collaborate in a chat room or a discussion board.

  20. A genome-wide survey of maize lipid-related genes: candidate gene mining, digital gene expression profiling and co-location with QTL for maize kernel oil

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    Lipids play an important role in plants due to their abundance and their extensive participation in many metabolic processes. Genes involved in lipid metabolism have been extensively studied in Arabidopsis and other plant species. In this study, a total of 1003 maize lipid-related genes were cloned and annotated, including 42 genes with experimental validation, 732 genes with full-length cDNA and protein sequences in public databases and 229 newly cloned genes. Ninety-seven maize lipid-related genes with tissue-preferential expression were discovered by in silico gene expression profiling based on 1,984,483 maize Expressed Sequence Tags collected from 182 cDNA libraries. Meanwhile, 70 QTL clusters for maize kernel oil were identified, covering 34.5% of the maize genome. Fifty-nine (84%) QTL clusters co-located with at least one lipid-related gene, and the total number of these genes amounted to 147. Interestingly, thirteen genes with kernel-preferential expression profiles fell within QTL clusters for maize kernel oil content. All the maize lipid-related genes identified here may provide good targets for maize kernel oil QTL cloning and thus help us to better understand the molecular mechanism of maize kernel oil accumulation.

  1. Mining Method

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Young Shik; Lee, Kyung Woon; Kim, Oak Hwan; Kim, Dae Kyung [Korea Institute of Geology Mining and Materials, Taejon (Korea, Republic of)

    1996-12-01

    The shrinking coal market has been forcing the coal industry to make exceptional rationalization and restructuring efforts since the end of the eighties. To the competition from crude oil and natural gas has been added the growing pressure from rising wages and rising production costs as the workings get deeper. To improve the competitive position of the coal mines against oil and gas through cost reduction, studies to improve the mining system have been carried out. To find the fields requiring improvement most, the technologies used in Tae Bak Colliery, which was selected as one of the long-running mines, were investigated and analyzed. The mining method appeared to be the field most in need of improvement to reduce production cost. The present method, the so-called inseam roadway caving method, is currently used to extract the steep and thick seam. However, this method has several drawbacks. To solve the problems, two mining methods are suggested, one for the long term and one for the short term. The inseam roadway caving method with long-hole blasting is a variant of the present inseam roadway caving method, modified by replacing timber sets with steel arch sets and the shovel loaders with chain conveyors; long-hole blasting is introduced to promote caving. The pillar caving method with chock supports uses chock supports set in the cross-cut from the hanging wall to the footwall. Two single chain conveyors are needed: one is installed in front of the chock supports to clear coal from the cutting face, and the other is installed behind the supports to transport caved coal from behind. This method is superior to the previous one in terms of safety from water inrushes, production rate and productivity. The only drawback is that it needs more investment. (author). 14 tabs., 34 figs.

  2. Data mining concepts and techniques

    CERN Document Server

    Han, Jiawei

    2005-01-01

    Our ability to generate and collect data has been increasing rapidly. Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generates data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge. Like the first edition, voted the most popular data mining book by KD Nuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and app...

  3. Analyzing and mining image databases.

    Science.gov (United States)

    Berlage, Thomas

    2005-06-01

    Image mining is the application of computer-based techniques that extract and exploit information from large image sets to support human users in generating knowledge from these sources. This review focuses on biomedical applications, in particular automated imaging at the cellular level. An image database is an interactive software application that combines data management, image analysis and visual data mining. The main characteristic of such a system is a layer that represents objects within an image, and that represents a large spectrum of quantitative and semantic object features. The image analysis needs to be adapted to each particular experiment, so 'end-user programming' will be desirable to make the technology more widely applicable.

  4. Planning the Mine and Mining the Plan

    Science.gov (United States)

    Boucher, D. S.; Chen, N.

    2016-11-01

    Overview of best practices used in the terrestrial mining industry when developing a mine site towards production. The intent is to guide planners towards an effective and well constructed roadmap for the development of ISRU mining activities. A strawman scenario is presented as an illustration for lunar mining of water ice.

  5. In silico genome wide mining of conserved and novel miRNAs in the brain and pineal gland of Danio rerio using small RNA sequencing data.

    Science.gov (United States)

    Agarwal, Suyash; Nagpure, Naresh Sahebrao; Srivastava, Prachi; Kushwaha, Basdeo; Kumar, Ravindra; Pandey, Manmohan; Srivastava, Shreya

    2016-03-01

    MicroRNAs (miRNAs) are small, non-coding RNA molecules that bind to the mRNA of the target genes and regulate the expression of the gene at the post-transcriptional level. Zebrafish is an economically important freshwater fish species globally considered as a good predictive model for studying human diseases and development. The present study focused on uncovering known as well as novel miRNAs, target prediction of the novel miRNAs and the differential expression of the known miRNA using the small RNA sequencing data of the brain and pineal gland (dark and light treatments) obtained from NCBI SRA. A total of 165, 151 and 145 known zebrafish miRNAs were found in the brain, pineal gland (dark treatment) and pineal gland (light treatment), respectively. Chromosomes 4 and 5 of zebrafish reference assembly GRCz10 were found to contain maximum number of miR genes. The miR-181a and miR-182 were found to be highly expressed in terms of number of reads in the brain and pineal gland, respectively. Other ncRNAs, such as tRNA, rRNA and snoRNA, were curated against Rfam. Using GRCz10 as reference, the subsequent bioinformatic analyses identified 25, 19 and 9 novel miRNAs from the brain, pineal gland (dark treatment) and pineal gland (light treatment), respectively. Targets of the novel miRNAs were identified, based on sequence complementarity between miRNAs and mRNA, by searching for antisense hits in the 3'-UTR of reference RNA sequences of the zebrafish. The discovery of novel miRNAs and their targets in the zebrafish genome can be a valuable scientific resource for further functional studies not only in zebrafish but also in other economically important fishes.
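
    Target prediction by sequence complementarity, as used above, amounts to searching 3'-UTRs for the antisense of the miRNA seed region; the simplified sketch below uses invented sequences and an exact-match rule, whereas real tools add mismatch tolerance and thermodynamic scoring.

    COMPLEMENT = str.maketrans("ACGU", "UGCA")

    def reverse_complement(rna: str) -> str:
        return rna.translate(COMPLEMENT)[::-1]

    def find_seed_targets(mirna: str, utr: str, seed_len: int = 7):
        """Return 0-based UTR positions that are exactly antisense to the miRNA seed."""
        # Seed region: positions 2-8 of the miRNA (1-based numbering).
        seed = mirna[1:1 + seed_len]
        site = reverse_complement(seed)
        return [i for i in range(len(utr) - len(site) + 1) if utr[i:i + len(site)] == site]

    mirna = "UGGAAUGUAAAGAAGUAUGUAU"          # miR-1-like sequence, for illustration only
    utr = "AAAACAUUCCAAGGGACAUUCCUUU"         # invented 3'-UTR fragment
    print(find_seed_targets(mirna, utr))      # [3, 15]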

  6. Text mining for the biocuration workflow.

    Science.gov (United States)

    Hirschman, Lynette; Burns, Gully A P C; Krallinger, Martin; Arighi, Cecilia; Cohen, K Bretonnel; Valencia, Alfonso; Wu, Cathy H; Chatr-Aryamontri, Andrew; Dowell, Karen G; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on 'Text Mining for the BioCuration Workflow' at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provide a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community.

  7. Application of Modern Tools and Techniques for Mine Safety & Disaster Management

    Science.gov (United States)

    Kumar, Dheeraj

    2016-04-01

    The implementation of novel systems and the adoption of improved equipment in mines help mining companies in two important ways: enhanced mine productivity and improved worker safety. There is a substantial need for the adoption of state-of-the-art automation technologies in mines to ensure the safety and protect the health of mine workers. With the advent of new autonomous equipment used in the mine, inefficiencies are reduced by limiting human inconsistencies and error. The desired increase in productivity at a mine can sometimes be achieved by changing only a few simple variables. Significant developments have been made in the areas of surface and underground communication, robotics, smart sensors, tracking systems, mine gas monitoring systems, ground movements, etc. Advancements in information technology in the form of the internet, GIS, remote sensing, satellite communication, etc., have proved to be important tools for hazard reduction and disaster management. This paper is mainly focused on issues pertaining to mine safety and disaster management and some of the recent innovations in mine automation that could be deployed in mines for safe mining operations and for avoiding any unforeseen mine disaster.

  8. Manufacturing and automation

    Directory of Open Access Journals (Sweden)

    Ernesto Córdoba Nieto

    2010-04-01

    Full Text Available The article presents concepts and definitions from different sources concerning automation. The work approaches automation by virtue of the author's experience in manufacturing production; why and how automation projects are embarked upon is considered. Technological reflection regarding the progressive advances or stages of automation in the production area is stressed. Coriat and Freyssenet's thoughts about and approaches to the problem of automation and its current state are taken and examined, especially those referring to the problem's relationship with reconciling the level of automation with the flexibility and productivity demanded by competitive, worldwide manufacturing.

  9. Development of opencast mines

    Energy Technology Data Exchange (ETDEWEB)

    Szebenyi, F.

    1987-01-01

    The role and work of the Central Institute for Mining Development and its legal predecessors, the Mining Research Institute and the Mines Design Institute, in relation to opencast lignite mining in Hungary are summarized. Investigations aimed at determining the heating properties of lignites are reviewed. Different lignite mines, their geological features, production possibilities and development conditions are outlined.

  10. Solutions for data integration in functional genomics: a critical assessment and case study.

    Science.gov (United States)

    Smedley, Damian; Swertz, Morris A; Wolstencroft, Katy; Proctor, Glenn; Zouberakis, Michael; Bard, Jonathan; Hancock, John M; Schofield, Paul

    2008-11-01

    The torrent of data emerging from the application of new technologies to functional genomics and systems biology can no longer be contained within the traditional modes of data sharing and publication, with the consequence that data are being deposited in, distributed across and disseminated through an increasing number of databases. The resulting fragmentation poses serious problems for the model organism community, which increasingly relies on data mining and computational approaches that require gathering of data from a range of sources. In the light of these problems, the European Commission has funded a coordination action, CASIMIR (coordination and sustainability of international mouse informatics resources), with a remit to assess the technical and social aspects of database interoperability that currently prevent the full realization of the potential of data integration in mouse functional genomics. In this article, we assess the current problems with interoperability, with particular reference to mouse functional genomics, and critically review the technologies that can be deployed to overcome them. We describe a typical use-case where an investigator wishes to gather data on variation, genomic context and metabolic pathway involvement for genes discovered in a genome-wide screen. We go on to develop an automated approach involving an in silico experimental workflow tool, Taverna, using web services, BioMart and MOLGENIS technologies for data retrieval. Finally, we focus on the current impediments to adopting such an approach in a wider context, and strategies to overcome them.
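
    The data-gathering step of such a workflow can also be scripted directly against a BioMart web service; the sketch below is a rough illustration in which the Ensembl martservice URL follows the public pattern, but the dataset, filter and attribute names are assumptions that would need to be checked against the mart's own metadata before use.

    import requests

    # XML query in the standard BioMart format; dataset, filter and attribute
    # names here are illustrative and must be verified for the mart being used.
    BIOMART_URL = "https://www.ensembl.org/biomart/martservice"
    QUERY_XML = """<?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE Query>
    <Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="1">
      <Dataset name="mmusculus_gene_ensembl" interface="default">
        <Filter name="chromosome_name" value="19"/>
        <Attribute name="ensembl_gene_id"/>
        <Attribute name="external_gene_name"/>
      </Dataset>
    </Query>"""

    response = requests.get(BIOMART_URL, params={"query": QUERY_XML}, timeout=60)
    response.raise_for_status()
    for line in response.text.splitlines()[:5]:   # show only the first few records
        print(line)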

  11. Clone Detection Using DIFF Algorithm For Aspect Mining

    Directory of Open Access Journals (Sweden)

    Rowyda Mohammed Abd El-Aziz

    2012-08-01

    Full Text Available Aspect mining is a reverse engineering process that aims at mining legacy systems to discover crosscutting concerns to be refactored into aspects. This process improves system reusability and maintainability. However, locating crosscutting concerns in legacy systems manually is very difficult and error-prone, so there is a need for automated techniques that can discover crosscutting concerns in source code. Aspect mining approaches are automated techniques that vary according to the type of crosscutting-concern symptoms they search for. Code duplication is one such symptom, and it puts software maintenance and evolution at risk. Many code clone detection techniques have therefore been proposed to find this duplicated code in legacy systems. In this paper, we present a clone detection technique to extract exact clones from object-oriented source code using the Differential File Comparison Algorithm (DIFF) to improve system reusability and maintainability, which is a major objective of aspect mining.
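    The core of diff-based exact clone detection is finding maximal runs of identical lines shared between two files. The sketch below approximates that idea with Python's standard difflib rather than the paper's DIFF implementation; the file names and the six-line threshold are illustrative assumptions.

```python
# Sketch: report runs of identical lines shared by two source files, in the
# spirit of diff-based exact clone detection (threshold is illustrative).
import difflib

def find_exact_clones(path_a: str, path_b: str, min_lines: int = 6):
    """Yield (start_a, start_b, length) for identical line runs >= min_lines."""
    with open(path_a, encoding="utf-8") as fa, open(path_b, encoding="utf-8") as fb:
        lines_a = [line.rstrip() for line in fa]
        lines_b = [line.rstrip() for line in fb]

    matcher = difflib.SequenceMatcher(a=lines_a, b=lines_b, autojunk=False)
    for block in matcher.get_matching_blocks():
        if block.size >= min_lines:
            yield block.a + 1, block.b + 1, block.size  # 1-based line numbers

if __name__ == "__main__":
    # Hypothetical file names; any two source files would do.
    for start_a, start_b, size in find_exact_clones("Foo.java", "Bar.java"):
        print(f"clone of {size} lines: Foo.java:{start_a} <-> Bar.java:{start_b}")
```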

  12. An automated swimming respirometer

    DEFF Research Database (Denmark)

    STEFFENSEN, JF; JOHANSEN, K; BUSHNELL, PG

    1984-01-01

    An automated respirometer is described that can be used for computerized respirometry of trout and sharks.

  13. Configuration Management Automation (CMA) -

    Data.gov (United States)

    Department of Transportation — Configuration Management Automation (CMA) will provide an automated, integrated enterprise solution to support CM of FAA NAS and Non-NAS assets and investments. CMA...

  14. Autonomy and Automation

    Science.gov (United States)

    Shively, Jay

    2017-01-01

    A significant level of debate and confusion has surrounded the meaning of the terms autonomy and automation. Automation is a multi-dimensional concept, and we propose that Remotely Piloted Aircraft Systems (RPAS) automation should be described with reference to the specific system and task that has been automated, the context in which the automation functions, and other relevant dimensions. In this paper, we present definitions of automation, pilot in the loop, pilot on the loop and pilot out of the loop. We further propose that in future, the International Civil Aviation Organization (ICAO) RPAS Panel avoids the use of the terms autonomy and autonomous when referring to automated systems on board RPA. Work Group 7 proposes to develop, in consultation with other workgroups, a taxonomy of Levels of Automation for RPAS.

  15. Automated alignment-based curation of gene models in filamentous fungi

    OpenAIRE

    2014-01-01

    Background Automated gene-calling is still an error-prone process, particularly for the highly plastic genomes of fungal species. Improvement through quality control and manual curation of gene models is a time-consuming process that requires skilled biologists and is only marginally performed. The wealth of available fungal genomes has not yet been exploited by an automated method that applies quality control of gene models in order to obtain more accurate genome annotations. Results We prov...

  16. Lessons learnt from mining meter data of residential consumers

    OpenAIRE

    Blazakis, K; Davarzani, S; G. Stavrakakis; Pisica, I

    2016-01-01

    Tracking end-users' usage patterns can enable more accurate demand forecasting and the automation of demand response execution. Accordingly, more advanced applications, such as electricity market design, integration of distributed generation and theft detection can be developed. By employing data mining techniques on smart meter recordings, the suppliers can efficiently investigate the load patterns of consumers. This paper presents applications where data mining of energy usage can derive us...

  17. Exploration and Mining Roadmap

    Energy Technology Data Exchange (ETDEWEB)

    none,

    2002-09-01

    This Exploration and Mining Technology Roadmap represents the third roadmap for the Mining Industry of the Future. It is based upon the results of the Exploration and Mining Roadmap Workshop held May 10–11, 2001.

  18. Coal Mine Permit Boundaries

    Data.gov (United States)

    Earth Data Analysis Center, University of New Mexico — ESRI ArcView shapefile depicting New Mexico coal mines permitted under the Surface Mining Control and Reclamation Act of 1977 (SMCRA), by either the NM Mining these...

  19. Workflow automation architecture standard

    Energy Technology Data Exchange (ETDEWEB)

    Moshofsky, R.P.; Rohen, W.T. [Boeing Computer Services Co., Richland, WA (United States)

    1994-11-14

    This document presents an architectural standard for application of workflow automation technology. The standard includes a functional architecture, process for developing an automated workflow system for a work group, functional and collateral specifications for workflow automation, and results of a proof of concept prototype.

  20. Mining review

    Science.gov (United States)

    McCartan, L.; Morse, D.E.; Plunkert, P.A.; Sibley, S.F.

    2004-01-01

    The average annual growth rate of real gross domestic product (GDP) from the third quarter of 2001 through the second quarter of 2003 in the United States was about 2.6 percent. GDP growth rates in the third and fourth quarters of 2003 were about 8 percent and 4 percent, respectively. The upward trends in many sectors of the U.S. economy in 2003, however, were shared by few of the mineral materials industries. Annual output declined in most nonfuel mining and mineral processing industries, although there was an upward turn toward yearend as prices began to increase.

  1. Coal Mines, Abandoned - Digitized Mined Areas

    Data.gov (United States)

    NSGIC GIS Inventory (aka Ramona) — Coal mining has occurred in Pennsylvania for over a century. The maps to these coal mines are stored at many various public and private locations (if they still...

  2. Automation in Clinical Microbiology

    Science.gov (United States)

    Ledeboer, Nathan A.

    2013-01-01

    Historically, the trend toward automation in clinical pathology laboratories has largely bypassed the clinical microbiology laboratory. In this article, we review the historical impediments to automation in the microbiology laboratory and offer insight into the reasons why we believe that we are on the cusp of a dramatic change that will sweep a wave of automation into clinical microbiology laboratories. We review the currently available specimen-processing instruments as well as the total laboratory automation solutions. Lastly, we outline the types of studies that will need to be performed to fully assess the benefits of automation in microbiology laboratories. PMID:23515547

  3. Process Mining: A Two-Step Approach to Balance Between Underfitting and Overfitting

    DEFF Research Database (Denmark)

    van der Aalst, W.M.P.; Rubin, V.; Verbeek, H.M.W.

    Process mining includes the automated discovery of processes from event logs. Based on observed events (e.g., activities being executed or messages being exchanged) a process model is constructed. One of the essential problems in process mining is that one cannot assume to have seen all possible...

  4. Wikipedia Mining

    Science.gov (United States)

    Nakayama, Kotaro; Ito, Masahiro; Erdmann, Maike; Shirakawa, Masumi; Michishita, Tomoyuki; Hara, Takahiro; Nishio, Shojiro

    Wikipedia, a collaborative Wiki-based encyclopedia, has become a huge phenomenon among Internet users. It covers a huge number of concepts from various fields such as arts, geography, history, science, sports and games. As a corpus for knowledge extraction, Wikipedia's impressive characteristics are not limited to its scale, but also include the dense link structure, URL-based word sense disambiguation, and brief anchor texts. Because of these characteristics, Wikipedia has become a promising corpus and a new frontier for research. In the past few years, a considerable amount of research has been conducted in various areas such as semantic relatedness measurement, bilingual dictionary construction, and ontology construction. Extracting machine-understandable knowledge from Wikipedia to enhance the intelligence of computational systems is the main goal of "Wikipedia Mining," a project on CREP (Challenge for Realizing Early Profits) in JSAI. In this paper, we take a comprehensive, panoramic view of Wikipedia Mining research and the current status of our challenge. After that, we discuss the future vision of this challenge.
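    One concrete way the dense link structure is exploited for semantic relatedness is an inlink-overlap measure in the style of Milne and Witten. The sketch below is a hedged illustration of that general idea, not code from the Wikipedia Mining project; the inlink sets are toy data and would normally be extracted from a Wikipedia link dump.

```python
# Sketch of a link-overlap relatedness measure over Wikipedia inlink sets
# (in the style of Milne and Witten); the inlink sets here are toy data.
import math

def link_relatedness(inlinks_a: set, inlinks_b: set, total_articles: int) -> float:
    """Return a 0..1 relatedness score from shared inlinks (higher = more related)."""
    overlap = len(inlinks_a & inlinks_b)
    if overlap == 0:
        return 0.0
    distance = (math.log(max(len(inlinks_a), len(inlinks_b))) - math.log(overlap)) / (
        math.log(total_articles) - math.log(min(len(inlinks_a), len(inlinks_b)))
    )
    return max(0.0, 1.0 - distance)

if __name__ == "__main__":
    # Toy inlink sets keyed by article ID; a real system would read a link dump.
    cat = {1, 2, 3, 4, 5}
    dog = {2, 3, 4, 6, 7}
    print(round(link_relatedness(cat, dog, total_articles=1_000_000), 3))
```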

  5. Image Mining: Review and New Challenges

    Directory of Open Access Journals (Sweden)

    Barbora Zahradnikova

    2015-07-01

    Full Text Available Besides new technology, a huge volume of data in various forms has become available to people. Image data represents a keystone of many research areas including medicine, forensic criminology, robotics and industrial automation, meteorology and geography, as well as education. Therefore, obtaining specific information from image databases has become of great importance. Images, as a special category of data, differ from text data both in their nature and in how they are stored and retrieved. Image Mining as a research field is an interdisciplinary area combining methodologies and knowledge of many branches including data mining, computer vision, image processing, image retrieval, statistics, recognition, machine learning, artificial intelligence, etc. This review focuses on current image mining approaches and techniques, aiming at widening the possibilities of facial image analysis. This paper aims at reviewing the current state of IM as well as at describing challenges and identifying directions of future research in the field.

  6. Automated update, revision, and quality control of the maize genome annotations using MAKER-P improves the B73 RefGen_v3 gene models and identifies new genes

    Science.gov (United States)

    The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-...

  7. Proceedings: Fourth Workshop on Mining Scientific Datasets

    Energy Technology Data Exchange (ETDEWEB)

    Kamath, C

    2001-07-24

    Commercial applications of data mining in areas such as e-commerce, market-basket analysis, text-mining, and web-mining have taken on a central focus in the JCDD community. However, there is a significant amount of innovative data mining work taking place in the context of scientific and engineering applications that is not well represented in the mainstream KDD conferences. For example, scientific data mining techniques are being developed and applied to diverse fields such as remote sensing, physics, chemistry, biology, astronomy, structural mechanics, computational fluid dynamics etc. In these areas, data mining frequently complements and enhances existing analysis methods based on statistics, exploratory data analysis, and domain-specific approaches. On the surface, it may appear that data from one scientific field, say genomics, is very different from another field, such as physics. However, despite their diversity, there is much that is common across the mining of scientific and engineering data. For example, techniques used to identify objects in images are very similar, regardless of whether the images came from a remote sensing application, a physics experiment, an astronomy observation, or a medical study. Further, with data mining being applied to new types of data, such as mesh data from scientific simulations, there is the opportunity to apply and extend data mining to new scientific domains. This one-day workshop brings together data miners analyzing science data and scientists from diverse fields to share their experiences, learn how techniques developed in one field can be applied in another, and better understand some of the newer techniques being developed in the KDD community. This is the fourth workshop on the topic of Mining Scientific Data sets; for information on earlier workshops, see http://www.ahpcrc.org/conferences/. This workshop continues the tradition of addressing challenging problems in a field where the diversity of applications is

  8. Adaptive Multi-Modal Data Mining and Fusion for Autonomous Intelligence Discovery

    Science.gov (United States)

    2009-03-01

    Final report, dates covered 15-12-2006 to 15-12-2007. Title: Adaptive Multi-Modal Data Mining and Fusion for Autonomous Intelligence Discovery. The recoverable fragments of the report describe work including ... as well as geospatial mapping of documents and images. Subject terms: automated data mining, streaming data, geospatial Internet localization, Arabic ... streaming text data mining. A particularly useful component under development was a mixed language text database search.

  9. Automated Single Cell Data Decontamination Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Tennessen, Kristin [Lawrence Berkeley National Lab. (LBNL), Walnut Creek, CA (United States). Dept. of Energy Joint Genome Inst.; Pati, Amrita [Lawrence Berkeley National Lab. (LBNL), Walnut Creek, CA (United States). Dept. of Energy Joint Genome Inst.

    2014-03-21

    Recent technological advancements in single-cell genomics have encouraged the classification and functional assessment of microorganisms from a wide span of the biosphere's phylogeny [1,2]. Environmental processes of interest to the DOE, such as bioremediation and carbon cycling, can be elucidated through the genomic lens of these unculturable microbes. However, contamination can occur at various stages of the single-cell sequencing process. Contaminated data can lead to wasted time and effort on meaningless analyses, inaccurate or erroneous conclusions, and pollution of public databases. A fully automated decontamination tool is necessary to prevent these instances and increase the throughput of the single-cell sequencing process.
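    Decontamination screens of this kind often include a composition check that flags contigs whose sequence signature diverges from the rest of the assembly. The sketch below shows only that generic idea using GC content; it is an illustrative heuristic with an arbitrary threshold, not the JGI pipeline described here.

```python
# Illustrative heuristic only (not the JGI pipeline): flag contigs whose GC
# content deviates strongly from the assembly-wide median.
from statistics import median

def gc_content(seq: str) -> float:
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / max(1, len(seq))

def flag_outlier_contigs(contigs: dict, max_deviation: float = 0.10) -> list:
    """Return names of contigs whose GC content differs from the median by more
    than max_deviation (an arbitrary illustrative threshold)."""
    gc = {name: gc_content(seq) for name, seq in contigs.items()}
    centre = median(gc.values())
    return [name for name, value in gc.items() if abs(value - centre) > max_deviation]

if __name__ == "__main__":
    toy = {
        "contig_1": "ATGCATGCGTGCATGC" * 50,
        "contig_2": "ATGCATGCATGCATGC" * 50,
        "contig_3": "ATATATATATTTATAT" * 50,  # compositionally divergent
    }
    print(flag_outlier_contigs(toy))
```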

  10. National Underground Mines Inventory

    Science.gov (United States)

    1983-10-01

    Excerpt of tabulated inventory records (mine identification numbers, county codes and mine names), including entries for Long Park, Geo Mine, Bessie Mine, Paystreak, Bueno Mill, Questa Mine, Rudy No. 1 & 2, Mt. Taylor, Marquez Shaft and Mariano Lake Mine.

  11. Automated DNA Sequencing System

    Energy Technology Data Exchange (ETDEWEB)

    Armstrong, G.A.; Ekkebus, C.P.; Hauser, L.J.; Kress, R.L.; Mural, R.J.

    1999-04-25

    Oak Ridge National Laboratory (ORNL) is developing a core DNA sequencing facility to support biological research endeavors at ORNL and to conduct basic sequencing automation research. This facility is novel because its development is based on existing standard biology laboratory equipment; thus, the development process is of interest to the many small laboratories trying to use automation to control costs and increase throughput. Before automation, biology laboratory personnel purified DNA, completed cycle sequencing, and prepared 96-well sample plates with commercially available hardware designed specifically for each step in the process. Following purification and thermal cycling, an automated sequencing machine was used for the sequencing. A technician handled all movement of the 96-well sample plates between machines. To automate the process, ORNL is adding a CRS Robotics A-465 arm, an ABI 377 sequencing machine, an automated centrifuge, an automated refrigerator, and possibly an automated SpeedVac. The entire system will be integrated with one central controller that will direct each machine and the robot. The goal of this system is to completely automate the sequencing procedure from bacterial cell samples through ready-to-be-sequenced DNA and ultimately to completed sequence. The system will be flexible and will accommodate different chemistries than existing automated sequencing lines. The system will be expanded in the future to include colony picking and/or actual sequencing. This discrete-event DNA sequencing system will demonstrate that smaller sequencing labs can automate cost-effectively as the laboratory grows.

  12. Mining lore : Bankhead, mining for coal

    Energy Technology Data Exchange (ETDEWEB)

    Nichiporuk, A.

    2007-09-15

    Bankhead, Alberta was one of the first communities to be established because of mining. It was founded in 1903 by the Canadian Pacific Railway (CPR) on Cascade Mountain in the Bow River Valley of Banff National Park. In 1904, Mine No. 80 was opened by the Pacific Coal Company to fuel CPR's steam engines. In order to avoid flooding the mine, the decision was made to mine up the steep seams instead of down. The mine entered full production in 1905. This article described the working conditions and pay scale for the mine workers, noting that there was not much in terms of safety equipment. There were many accidents and 15 men lost their lives at the mine. During the mine's 20-year operation, miners went on strike 6 times. The last strike marked the closure of the mine in June 1922 and the end of industry in national parks. CPR was ordered to clear out and move the mining equipment as well as the houses, buildings and essentially the entire town. During its peak production, Mine No. 80 produced about a half million tons of coal. 1 ref., 1 fig.

  13. Genome bioinformatics of tomato and potato

    NARCIS (Netherlands)

    Datema, E.

    2011-01-01

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been

  14. Mining and environment

    Energy Technology Data Exchange (ETDEWEB)

    Kisgyorgy, S.

    1986-01-01

    The realization of new mining projects should be preceded by detailed studies on the impact of mining activities on the environment. For defining the conditions of environmental protection and for making proper financial plans the preparation of an information system is needed. The possible social effects of the mining investments have to be estimated, first of all from the points of view of waste disposal, mining hydrology, subsidence due to underground mining etc.

  15. Advances in Computer, Communication, Control and Automation

    CERN Document Server

    2011 International Conference on Computer, Communication, Control and Automation

    2012-01-01

    The volume includes a set of selected papers extended and revised from the 2011 International Conference on Computer, Communication, Control and Automation (3CA 2011), which was held in Zhuhai, China, November 19-20, 2011. Topics covered in this volume include signal and image processing, speech and audio processing, video processing and analysis, artificial intelligence, computing and intelligent systems, machine learning, sensor and neural networks, knowledge discovery and data mining, fuzzy mathematics and applications, knowledge-based systems, hybrid systems modeling and design, risk analysis and management, and system modeling and simulation. We hope that researchers, graduate students and other interested readers benefit scientifically from the proceedings and also find it stimulating in the process.

  16. Uses of antimicrobial genes from microbial genome

    Science.gov (United States)

    Sorek, Rotem; Rubin, Edward M.

    2013-08-20

    We describe a method for mining microbial genomes to discover antimicrobial genes and proteins having broad spectrum of activity. Also described are antimicrobial genes and their expression products from various microbial genomes that were found using this method. The products of such genes can be used as antimicrobial agents or as tools for molecular biology.

  17. Mission-Critical Mobile Broadband Communications in Open Pit Mines

    DEFF Research Database (Denmark)

    Uzeda Garcia, Luis Guilherme; Portela Lopes de Almeida, Erika; Barbosa, Viviane S. B.

    2016-01-01

    The need for continuous safety improvements and increased operational efficiency is driving the mining industry through a transition towards automated operations. From a communications perspective, this transition introduces a new set of high-bandwidth business- and mission-critical applications...

  18. Robotics and automation for oil sands bitumen production and maintenance

    Energy Technology Data Exchange (ETDEWEB)

    Lipsett, M.G. [Alberta Univ., Edmonton, AB (Canada). Dept. of Mechanical Engineering

    2008-07-01

    This presentation examined technical and commercial challenges related to robotics and automation processes in the mining and oil sands industries. The oil sands industry faces ongoing cost pressures. Challenges include the depths to which miners must travel, as well as problems related to equipment reliability and safety. Surface mines must operate in all weather conditions with a variety of complex systems. Barriers for new technologies include high capital and operating expenses. It has also proven difficult to integrate new technologies within established mining practices. However, automation has the potential to improve mineral processing, production, and maintenance processes. Step changes can be placed in locations that are hazardous or inaccessible. Automated sizing, material handling, and ventilation systems, as well as tele-operated equipment, can also be implemented. Prototypes currently being developed include advanced systems for cutting; rock bolting; loose rock detection; lump size estimation; unstructured environment sensing; environment modelling; and automatic task execution. Enabling technologies are now being developed for excavation, haulage, material handling systems, mining and reclamation methods, and integrated control and reliability. tabs., figs.

  19. Data mining for ontology development.

    Energy Technology Data Exchange (ETDEWEB)

    Davidson, George S.; Strasburg, Jana (Pacific Northwest National Laboratory, Richland, WA); Stampf, David (Brookhaven National Laboratory, Upton, NY); Neymotin,Lev (Brookhaven National Laboratory, Upton, NY); Czajkowski, Carl (Brookhaven National Laboratory, Upton, NY); Shine, Eugene (Savannah River National Laboratory, Aiken, SC); Bollinger, James (Savannah River National Laboratory, Aiken, SC); Ghosh, Vinita (Brookhaven National Laboratory, Upton, NY); Sorokine, Alexandre (Oak Ridge National Laboratory, Oak Ridge, TN); Ferrell, Regina (Oak Ridge National Laboratory, Oak Ridge, TN); Ward, Richard (Oak Ridge National Laboratory, Oak Ridge, TN); Schoenwald, David Alan

    2010-06-01

    A multi-laboratory ontology construction effort during the summer and fall of 2009 prototyped an ontology for counterfeit semiconductor manufacturing. This effort included an ontology development team and an ontology validation methods team. Here the third team of the Ontology Project, the Data Analysis (DA) team reports on their approaches, the tools they used, and results for mining literature for terminology pertinent to counterfeit semiconductor manufacturing. A discussion of the value of ontology-based analysis is presented, with insights drawn from other ontology-based methods regularly used in the analysis of genomic experiments. Finally, suggestions for future work are offered.

  20. Laboratory Automation and Middleware.

    Science.gov (United States)

    Riben, Michael

    2015-06-01

    The practice of surgical pathology is under constant pressure to deliver the highest quality of service, reduce errors, increase throughput, and decrease turnaround time while at the same time dealing with an aging workforce, increasing financial constraints, and economic uncertainty. Although not able to implement total laboratory automation, great progress continues to be made in workstation automation in all areas of the pathology laboratory. This report highlights the benefits and challenges of pathology automation, reviews middleware and its use to facilitate automation, and reviews the progress so far in the anatomic pathology laboratory.

  1. DIYA: A Bacterial Annotation Pipeline for any Genomics Lab

    Science.gov (United States)

    2009-02-12

    ... microbial genomes overnight (Mardis, 2008). These technologies have created many new small 'genome centers' (Zwick, 2005). DIYA (Do-It-Yourself ... References: The development of PIPA: an integrated and automated pipeline for genome-wide protein function annotation, BMC Bioinformatics 9:52 (2008); Zwick, M.E. ...

  2. Automating checks of plan check automation.

    Science.gov (United States)

    Halabi, Tarek; Lu, Hsiao-Ming

    2014-07-08

    While a few physicists have designed new plan check automation solutions for their clinics, fewer, if any, managed to adapt existing solutions. As complex and varied as the systems they check, these programs must gain the full confidence of those who would run them on countless patient plans. The present automation effort, planCheck, therefore focuses on versatility and ease of implementation and verification. To demonstrate this, we apply planCheck to proton gantry, stereotactic proton gantry, stereotactic proton fixed beam (STAR), and IMRT treatments.

  3. Automation in Warehouse Development

    NARCIS (Netherlands)

    Hamberg, R.; Verriet, J.

    2012-01-01

    The warehouses of the future will come in a variety of forms, but with a few common ingredients. Firstly, human operational handling of items in warehouses is increasingly being replaced by automated item handling. Extended warehouse automation counteracts the scarcity of human operators and support

  4. Automate functional testing

    Directory of Open Access Journals (Sweden)

    Ramesh Kalindri

    2014-06-01

    Full Text Available Currently, software engineers are increasingly turning to the option of automating functional tests, but they are not always successful in this endeavor. Reasons range from poor planning to cost overruns in the process. Some principles that can guide teams in automating these tests are described in this article.

  5. More Benefits of Automation.

    Science.gov (United States)

    Getz, Malcolm

    1988-01-01

    Describes a study that measured the benefits of an automated catalog and automated circulation system from the library user's point of view in terms of the value of time saved. Topics discussed include patterns of use, access time, availability of information, search behaviors, and the effectiveness of the measures used. (seven references)…

  6. Text Classification using Data Mining

    CERN Document Server

    Kamruzzaman, S M; Hasan, Ahmed Ryadh

    2010-01-01

    Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and of text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms for automatic text classification need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using data mining that requires fewer documents for training. Instead of single words, word relations, i.e. association rules derived from these words, are used to build the feature set from pre-classified text documents. The concept of a Naive Bayes classifier is then applied to the derived features, and finally a single concept of a Genetic Algorithm has been added for final classification. A system based on the...
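    The distinguishing idea is to use co-occurring word pairs, rather than single words, as features for the Naive Bayes step. The sketch below illustrates that idea on toy data; the add-one smoothing and the tiny training set are assumptions, and the association-rule mining and genetic-algorithm stages of the paper are omitted.

```python
# Sketch: naive Bayes over word-pair features (illustrative; the paper's full
# pipeline also mines association rules and applies a genetic algorithm).
from collections import Counter, defaultdict
from itertools import combinations
import math

def pair_features(text: str) -> set:
    words = sorted(set(text.lower().split()))
    return set(combinations(words, 2))

def train(documents):
    """documents: iterable of (text, label). Returns class priors and pair counts."""
    pair_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in documents:
        label_counts[label] += 1
        pair_counts[label].update(pair_features(text))
    return label_counts, pair_counts

def classify(text, label_counts, pair_counts):
    features = pair_features(text)
    vocab = {p for counts in pair_counts.values() for p in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)
        denom = sum(pair_counts[label].values()) + len(vocab) + 1
        for pair in features:
            score += math.log((pair_counts[label][pair] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

if __name__ == "__main__":
    training = [
        ("open pit mining trucks haul ore", "mining"),
        ("longwall mining shearer cuts coal", "mining"),
        ("gene sequence annotation pipeline", "genomics"),
        ("genome assembly produces contigs", "genomics"),
    ]
    labels, pairs = train(training)
    print(classify("shearer cuts coal at the mining face", labels, pairs))
```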

  7. Advances in inspection automation

    Science.gov (United States)

    Weber, Walter H.; Mair, H. Douglas; Jansen, Dion; Lombardi, Luciano

    2013-01-01

    This new session at QNDE reflects the growing interest in inspection automation. Our paper describes a newly developed platform that makes the complex NDE automation possible without the need for software programmers. Inspection tasks that are tedious, error-prone or impossible for humans to perform can now be automated using a form of drag and drop visual scripting. Our work attempts to rectify the problem that NDE is not keeping pace with the rest of factory automation. Outside of NDE, robots routinely and autonomously machine parts, assemble components, weld structures and report progress to corporate databases. By contrast, components arriving in the NDT department typically require manual part handling, calibrations and analysis. The automation examples in this paper cover the development of robotic thickness gauging and the use of adaptive contour following on the NRU reactor inspection at Chalk River.

  8. Automation in immunohematology.

    Science.gov (United States)

    Bajpai, Meenu; Kaur, Ravneet; Gupta, Ekta

    2012-07-01

    There have been rapid technological advances in blood banking in South Asian region over the past decade with an increasing emphasis on quality and safety of blood products. The conventional test tube technique has given way to newer techniques such as column agglutination technique, solid phase red cell adherence assay, and erythrocyte-magnetized technique. These new technologies are adaptable to automation and major manufacturers in this field have come up with semi and fully automated equipments for immunohematology tests in the blood bank. Automation improves the objectivity and reproducibility of tests. It reduces human errors in patient identification and transcription errors. Documentation and traceability of tests, reagents and processes and archiving of results is another major advantage of automation. Shifting from manual methods to automation is a major undertaking for any transfusion service to provide quality patient care with lesser turnaround time for their ever increasing workload. This article discusses the various issues involved in the process.

  9. Automated model building

    CERN Document Server

    Caferra, Ricardo; Peltier, Nicholas

    2004-01-01

    This is the first book on automated model building, a discipline of automated deduction that is of growing importance. Although models and their construction are important per se, automated model building has appeared as a natural enrichment of automated deduction, especially in the attempt to capture the human way of reasoning. The book provides an historical overview of the field of automated deduction, and presents the foundations of different existing approaches to model construction, in particular those developed by the authors. Finite and infinite model building techniques are presented. The main emphasis is on calculi-based methods, and relevant practical results are provided. The book is of interest to researchers and graduate students in computer science, computational logic and artificial intelligence. It can also be used as a textbook in advanced undergraduate courses.

  10. Automation in Warehouse Development

    CERN Document Server

    Verriet, Jacques

    2012-01-01

    The warehouses of the future will come in a variety of forms, but with a few common ingredients. Firstly, human operational handling of items in warehouses is increasingly being replaced by automated item handling. Extended warehouse automation counteracts the scarcity of human operators and supports the quality of picking processes. Secondly, the development of models to simulate and analyse warehouse designs and their components facilitates the challenging task of developing warehouses that take into account each customer’s individual requirements and logistic processes. Automation in Warehouse Development addresses both types of automation from the innovative perspective of applied science. In particular, it describes the outcomes of the Falcon project, a joint endeavour by a consortium of industrial and academic partners. The results include a model-based approach to automate warehouse control design, analysis models for warehouse design, concepts for robotic item handling and computer vision, and auton...

  11. Automation in Immunohematology

    Directory of Open Access Journals (Sweden)

    Meenu Bajpai

    2012-01-01

    Full Text Available There have been rapid technological advances in blood banking in South Asian region over the past decade with an increasing emphasis on quality and safety of blood products. The conventional test tube technique has given way to newer techniques such as column agglutination technique, solid phase red cell adherence assay, and erythrocyte-magnetized technique. These new technologies are adaptable to automation and major manufacturers in this field have come up with semi and fully automated equipments for immunohematology tests in the blood bank. Automation improves the objectivity and reproducibility of tests. It reduces human errors in patient identification and transcription errors. Documentation and traceability of tests, reagents and processes and archiving of results is another major advantage of automation. Shifting from manual methods to automation is a major undertaking for any transfusion service to provide quality patient care with lesser turnaround time for their ever increasing workload. This article discusses the various issues involved in the process.

  12. Canadian suppliers of mining goods and services: Links between Canadian mining companies and selected sectors of the Canadian economy

    Energy Technology Data Exchange (ETDEWEB)

    Lemieux, A. [Natural Resources Canada, Ottawa, ON (Canada)

    2000-07-01

    Economic links between Canada's minerals and metals industry and Canadian suppliers of mining goods and services are examined to provide an insight into the interdependencies of these two key resource-related components of Canada's economy. The impact of globalization of the mining industry, estimates of its economic potential and the potential for exporting goods and services in conjunction with Canadian mining projects abroad are also assessed. The study concludes that the links between Canadian mining companies and the rest of the economy are difficult to quantify, due to the absence of statistical data that would differentiate supplier transactions with mining companies from those with other areas of the economy. At best, the approaches used in this study give but an imperfect understanding of the complex relationships between mining companies and their suppliers. It is clear, however, that as much of the demand for mining products is global, so is the supply. Therefore, while globalization of the mining industry creates unprecedented opportunities for Canadian suppliers to provide expertise, goods and services to Canadian and other customers offshore, the fact remains that mining multinationals buy a lot of their supplies locally. As a result, only some of the opportunities created by mining companies based in Canada and elsewhere will translate into sales for Canadian suppliers. Nevertheless, Canadian suppliers appear to have considerable depth in products related to underground mining, environment protection, exploration, feasibility studies, mineral processing, and mine automation. There appear to be considerable opportunities to derive further benefits from these areas of expertise. Appendices contain information about methodological aspects of the survey. 8 tabs., 32 figs., 6 appendices.

  13. De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies.

    Science.gov (United States)

    Karamitros, Timokratis; Harrison, Ian; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

    2016-01-01

    Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from <1% to 53% of amino acids in each gene exhibiting at least one substitution within the pool of samples. The UL23 gene had one of the highest genetic variabilities at 35.2% in keeping with its role in development of drug resistance. The assembly of accurate, full-length HHV-1 genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal.
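    Because the improvement above is reported in terms of N(G)50 and N(G)75, a short sketch of how such statistics are derived from contig lengths may be helpful; the contig lengths below are invented, and only the approximate 152 kbp HHV-1 genome size is taken from the record.

```python
# Sketch: Nx/NGx statistics from contig lengths. For Nx the reference total is
# the assembly size; for NGx it is the (estimated) genome size.
def nx(contig_lengths, reference_total, fraction=0.5):
    """Smallest contig length such that contigs at least that long cover
    `fraction` of reference_total (N50/NG50 when fraction=0.5)."""
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= fraction * reference_total:
            return length
    return 0  # assembly does not reach the requested fraction

if __name__ == "__main__":
    contigs = [90_000, 60_000, 30_000, 15_000, 5_000]  # illustrative lengths
    assembly_size = sum(contigs)
    genome_size = 152_000                              # approx. HHV-1 genome
    print("N50 :", nx(contigs, assembly_size))
    print("NG50:", nx(contigs, genome_size))
    print("NG75:", nx(contigs, genome_size, fraction=0.75))
```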

  14. First Mexican coal mine recovery after mine fire, Esmeralda Mine

    Energy Technology Data Exchange (ETDEWEB)

    Santillan, M.A. [Minerales Monclova, SA de CV, Palau Coahuila (Mexico)

    2005-07-01

    The fire started on 8 May 1998 in the development section from methane released into the mine through a roof-bolt hole. The flames spread quickly as the coal was ignited. After eight hours the Safety Department decided to seal the vertical ventilation shafts and the slopes. The coal in the Esmeralda Mine is of very high quality, and Minerales Monclova (MIMOSA) decided to recover the facilities. However, the Esmeralda Mine coals have a very high gas content of 12 m³/t. During the next 2.5 months, MIMOSA staff and specialists observed and analysed the gas behaviour supported by a chromatograph. With the results of the observations and analyses, MIMOSA in consultation with the specialists developed a recovery plan based on flooding the area in which fire might have propagated and in which rekindling was highly probable. At the same time MIMOSA trained rescue teams. By 20 August 1998, the mine command centre had re-opened the slopes seal. Using a 'Step-by-Step' system, the rescue team began the recovery process by employing cross-cuts and using an auxiliary fan to establish the ventilation circuit. The MIMOSA team advanced into the mine as far as allowed by the water level and was able to recover the main fan. The official mine recovery date was 30 November 1998. Esmeralda Mine was back in operation in December 1998. 1 ref., 3 figs.

  15. Mines and Mineral Resources

    Data.gov (United States)

    Department of Homeland Security — Mines in the United States According to the Homeland Security Infrastructure Program Tiger Team Report Table E-2.V.1 Sub-Layer Geographic Names, a mine is defined as...

  16. Gold-Mining

    DEFF Research Database (Denmark)

    Raaballe, J.; Grundy, B.D.

    2002-01-01

    Based on standard option pricing arguments and assumptions (including no convenience yield and sustainable property rights), we will not observe operating gold mines. We find that asymmetric information on the reserves in the gold mine is a necessary and sufficient condition for the existence of operating gold mines. Asymmetric information on the reserves in the mine implies that, at a high enough price of gold, a manager of high type finds the extraction value of the company to be higher than the current market value of the non-operating gold mine. Due to this undervaluation the maxim of market ... sooner than a manager of lower type. Third, a non-operating gold mine is valued as being of the lowest type in the pool and, all else equal, high-asymmetry mines are valued lower than low-asymmetry mines. In a qualitative sense these results are robust with respect to different assumptions (re cost ...

  17. Recent advances in remote coal mining machine sensing, guidance, and teleoperation

    Energy Technology Data Exchange (ETDEWEB)

    Ralston, J.C.; Hainsworth, D.W.; Reid, D.C.; Anderson, D.L.; McPhee, R.J. [CSIRO Exploration & Minerals, Kenmore, Qld. (Australia)

    2001-10-01

    Some recent applications of sensing, guidance and telerobotic technology in the coal mining industry are presented. Of special interest is the development of semi or fully autonomous systems to provide remote guidance and communications for coal mining equipment. The use of radar and inertial-based sensors is considered in an attempt to solve the horizontal and lateral guidance problems associated with mining equipment automation. Also described is a novel teleoperated robot vehicle with unique communications capabilities, called the Numbat, which is used in underground mine safety and reconnaissance missions.

  18. Mining in El Salvador

    DEFF Research Database (Denmark)

    Pacheco Cueva, Vladimir

    2014-01-01

    In this guest article, Vladimir Pacheco, a social scientist who has worked on mining and human rights shares his perspectives on a current campaign against mining in El Salvador – Central America’s smallest but most densely populated country.

  19. Systematic review automation technologies

    Science.gov (United States)

    2014-01-01

    Systematic reviews, a cornerstone of evidence-based medicine, are not produced quickly enough to support clinical practice. The cost of production, availability of the requisite expertise and timeliness are often quoted as major contributors to the delay. This detailed survey of the state of the art of information systems designed to support or automate individual tasks in the systematic review, and in particular systematic reviews of randomized controlled clinical trials, reveals trends that see the convergence of several parallel research projects. We surveyed literature describing informatics systems that support or automate the processes of systematic review or each of the tasks of the systematic review. Several projects focus on automating, simplifying and/or streamlining specific tasks of the systematic review. Some tasks are already fully automated while others are still largely manual. In this review, we describe each task and the effect that its automation would have on the entire systematic review process, summarize the existing information system support for each task, and highlight where further research is needed for realizing automation for the task. Integration of the systems that automate systematic review tasks may lead to a revised systematic review workflow. We envisage that the optimized workflow will lead to a system in which each systematic review is described as a computer program that automatically retrieves relevant trials, appraises them, extracts and synthesizes data, evaluates the risk of bias, performs meta-analysis calculations, and produces a report in real time. PMID:25005128

  20. Distributed Framework for Data Mining As a Service on Private Cloud

    Directory of Open Access Journals (Sweden)

    Shraddha Masih

    2014-11-01

    Full Text Available Data mining research faces two great challenges: i. automated mining, ii. mining of distributed data. Conventional mining techniques are centralized and the data needs to be accumulated at a central location. A mining tool needs to be installed on the computer before performing data mining, so extra time is incurred in collecting the data. Mining is done by specialized analysts who have access to mining tools. This approach is not optimal when the data is distributed over the network. To perform data mining in a distributed scenario, we need to design a different framework to improve efficiency. Also, the size of accumulated data grows exponentially with time and is difficult to mine using a single computer. Personal computers have limitations in terms of computation capability and storage capacity. Cloud computing can be exploited for compute-intensive and data-intensive applications. Data mining algorithms are both compute and data intensive, therefore cloud-based tools can provide an infrastructure for distributed data mining. This paper uses cloud computing to support distributed data mining. We propose a cloud-based data mining model which provides the facility of mass data storage along with a distributed data mining facility. This paper provides a solution for distributed data mining on the Hadoop framework using an interface to run the algorithm on a specified number of nodes without any user-level configuration. Hadoop is configured over private servers and clients can process their data through a common framework from anywhere in the private network. Data to be mined can either be chosen from the cloud data server or can be uploaded from private computers on the network. It is observed that the framework is helpful in processing large data sets in less time as compared to a single system.
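    A minimal way to picture the Hadoop side of such a framework is a Hadoop Streaming mapper/reducer pair; counting item frequencies is the simplest building block of frequent-itemset mining. The comma-separated record layout below is an assumption, and in practice the two scripts would be submitted through the Hadoop Streaming jar with the cluster's input and output paths rather than run locally.

```python
# mapper.py -- Hadoop Streaming mapper: emit "item\t1" for every item in each
# record. The comma-separated record layout is an illustrative assumption.
import sys

for line in sys.stdin:
    for item in line.strip().split(","):
        item = item.strip()
        if item:
            print(f"{item}\t1")
```

```python
# reducer.py -- Hadoop Streaming reducer: input arrives sorted by key, so
# counts for one item can be summed in a single pass.
import sys

current_item, current_count = None, 0
for line in sys.stdin:
    item, count = line.rstrip("\n").split("\t")
    if item != current_item:
        if current_item is not None:
            print(f"{current_item}\t{current_count}")
        current_item, current_count = item, 0
    current_count += int(count)
if current_item is not None:
    print(f"{current_item}\t{current_count}")
```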

  1. American mines, methods and men

    Energy Technology Data Exchange (ETDEWEB)

    Walker, S.C.A. (Thames Water Utilities (UK))

    1992-04-01

    The paper is based on the author's visits to a number of American mines, to see their mining machinery and to discuss with mine management their industrial relations problems. The paper gives a brief review of American mines, methods and men and is in the form of a diary. Mines visited are: Ohio Valley Coal Company; Big John Mine; Pittsburgh Research Center of the US Bureau of Mines; Martinka Mine; Robin Hood Complex No 9 Mine (Boone County, West Virginia), Green Briar Mine (Virginia); Martin County Coal (Kentucky); Wabash Mine (Keensburgh, Illinois); Galatia Mine (Harrisburgh, Illinois); and William Station Mine (Sturgis, Kentucky). Details given include mining methods productivity and staffing levels. The mining machinery is described in detail in a separate article. 5 figs.

  2. Data Mining for CRM

    Science.gov (United States)

    Thearling, Kurt

    Data Mining technology allows marketing organizations to better understand their customers and respond to their needs. This chapter describes how Data Mining can be combined with customer relationship management to help drive improved interactions with customers. An example showing how to use Data Mining to drive customer acquisition activities is presented.

  3. A MINE alternative to D-optimal designs for the linear model.

    Directory of Open Access Journals (Sweden)

    Amanda M Bouffier

    Full Text Available Doing large-scale genomics experiments can be expensive, and so experimenters want to get the most information out of each experiment. To this end the Maximally Informative Next Experiment (MINE) criterion for experimental design was developed. Here we explore this idea in a simplified context, the linear model. Four variations of the MINE method for the linear model were created: MINE-like, MINE, MINE with random orthonormal basis, and MINE with random rotation. Each method varies in how it maximizes the MINE criterion. Theorem 1 establishes sufficient conditions for the maximization of the MINE criterion under the linear model. Theorem 2 establishes when the MINE criterion is equivalent to the classic design criterion of D-optimality. By simulation under the linear model, we establish that MINE with random orthonormal basis and MINE with random rotation are faster at discovering the true linear relation with p regression coefficients and n observations when p>>n. We also establish in simulations with n<100, p=1000, σ=0.01 and 1000 replicates that these two variations of MINE display a lower false positive rate than the MINE-like method and, for a majority of the experiments, than the MINE method.
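    The connection to D-optimality can be made concrete with a greedy sequential design step: among candidate experiments, pick the one that maximizes det(X^T X) of the augmented design matrix. The sketch below illustrates that classical criterion only, not the MINE variants themselves; the candidate pool and dimensions are invented.

```python
# Sketch: greedy sequential selection of the next experiment that maximizes
# det(X^T X) of the augmented design matrix (the D-optimality criterion).
import numpy as np

def next_experiment(X: np.ndarray, candidates: np.ndarray) -> int:
    """Return the index of the candidate row that maximizes the determinant of
    the augmented information matrix."""
    best_index, best_det = -1, -np.inf
    for i, x in enumerate(candidates):
        X_aug = np.vstack([X, x])
        det = np.linalg.det(X_aug.T @ X_aug)
        if det > best_det:
            best_index, best_det = i, det
    return best_index

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = 2                                       # number of regression coefficients
    X = rng.normal(size=(4, p))                 # experiments run so far
    candidates = rng.uniform(-1, 1, size=(50, p))  # pool of possible next experiments
    chosen = next_experiment(X, candidates)
    print("next experiment:", chosen, candidates[chosen])
```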

  4. Chef infrastructure automation cookbook

    CERN Document Server

    Marschall, Matthias

    2013-01-01

    Chef Infrastructure Automation Cookbook contains practical recipes on everything you will need to automate your infrastructure using Chef. The book is packed with illustrated code examples to automate your server and cloud infrastructure.The book first shows you the simplest way to achieve a certain task. Then it explains every step in detail, so that you can build your knowledge about how things work. Eventually, the book shows you additional things to consider for each approach. That way, you can learn step-by-step and build profound knowledge on how to go about your configuration management

  5. Automated systems to identify relevant documents in product risk management

    Directory of Open Access Journals (Sweden)

    Wee Xue

    2012-03-01

    Full Text Available Abstract Background Product risk management involves critical assessment of the risks and benefits of health products circulating in the market. One of the important sources of safety information is the primary literature, especially for newer products with which regulatory authorities have relatively little experience. Although the primary literature provides vast and diverse information, only a small proportion of it is useful for product risk assessment work. Hence, the aim of this study is to explore the possibility of using text mining to automate the identification of useful articles, which will reduce the time taken for literature search and hence improve work efficiency. In this study, term-frequency inverse document-frequency values were computed for predictors extracted from the titles and abstracts of articles related to three tumour necrosis factor-alpha blockers. A general automated system was developed using only general predictors and was tested for its generalizability using articles related to four other drug classes. Several specific automated systems were developed using both general and specific predictors and training sets of different sizes in order to determine the minimum number of articles required for developing such systems. Results The general automated system had an area under the curve value of 0.731 and was able to rank 34.6% and 46.2% of the total number of 'useful' articles among the first 10% and 20% of the articles presented to the evaluators when tested on the generalizability set. However, its use may be limited by the subjective definition of useful articles. For the specific automated system, it was found that only 20 articles were required to develop a specific automated system with a prediction performance (AUC 0.748) that was better than that of the general automated system. Conclusions Specific automated systems can be developed rapidly and avoid problems caused by subjective definition of useful
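    A minimal sketch of the TF-IDF-plus-classifier idea is shown below using scikit-learn; the toy titles, labels and the logistic regression choice are illustrative assumptions and do not reproduce the predictors or evaluation of the study.

```python
# Sketch: rank candidate articles by predicted usefulness using TF-IDF features
# and a logistic regression classifier (toy data; illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labelled = [
    ("serious adverse events reported during anti-TNF therapy", 1),
    ("case report of infection in a patient on TNF-alpha blocker", 1),
    ("crystal structure of an unrelated bacterial enzyme", 0),
    ("cost analysis of hospital staffing models", 0),
]
texts, labels = zip(*labelled)

vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
X_train = vectorizer.fit_transform(texts)
model = LogisticRegression(max_iter=1000).fit(X_train, labels)

unscreened = [
    "randomized trial of adverse events with TNF-alpha blocker therapy",
    "soil nutrient mapping in agricultural fields",
]
scores = model.predict_proba(vectorizer.transform(unscreened))[:, 1]
for score, title in sorted(zip(scores, unscreened), reverse=True):
    print(f"{score:.2f}  {title}")
```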

  6. The Challenge of Wireless Connectivity to Support Intelligent Mines

    DEFF Research Database (Denmark)

    Barbosa, Viviane S. B.; Garcia, Luis G. U.; Caldwell, George

    2016-01-01

    and increase productivity, from extraction all the way to the delivery of a processed product to the customer. In this context, one of the key enablers is wireless connectivity since it allows mining equipment to be remotely monitored and controlled. Simply put, dependable wireless connectivity is essential......, but mines change by definition on a daily-basis. Therefore, a careful and continuous effort must be made to ensure the wireless network keeps up with the topographic and operational changes in order to provide the necessary network availability, reliability, capacity and coverage needed to support a new...... by mining automation and discuss the consequences of not providing connectivity for all applications. The work also discusses how the careful positioning of the heavy communications infrastructure (tall towers) from the early stages of the mine site project can make the provision of incremental capacity...

  7. Application of fuzzy logic for determining of coal mine mechanization

    Institute of Scientific and Technical Information of China (English)

    HOSSEINI SAA; ATAEI M; HOSSEINI S M; AKHYANI M

    2012-01-01

    The fundamental task of mining engineers is to produce more coal at a given level of labour input and material costs, for optimum quality and maximum efficiency. To achieve these goals, it is necessary to automate and mechanize mining operations. Mechanization is an objective that can result in significant cost reduction and higher levels of profitability for underground mines. To analyze the potential for mechanization, some important factors such as seam inclination and thickness, geological disturbances, seam floor conditions and roof conditions should be considered. In this study we used fuzzy logic and membership functions, created fuzzy rule-based methods, and considered the ultimate objective: mechanization of mining. As a case study, the mechanization of the Tazare coal seams in the Shahroud area of Iran was investigated. The results show a low potential for mechanization in most of the Tazare coal seams.
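    As a hedged illustration of fuzzy rule-based scoring of this kind, the sketch below combines two of the listed factors (seam thickness and inclination) with triangular membership functions and min/max operators; the breakpoints, rules and weights are invented for illustration and are not taken from the Tazare study.

```python
# Illustrative fuzzy scoring of mechanization potential from seam thickness and
# inclination; membership breakpoints and rules are invented for illustration.
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def mechanization_potential(thickness_m: float, inclination_deg: float) -> float:
    favourable_thickness = triangular(thickness_m, 1.0, 2.5, 4.0)
    low_inclination = triangular(inclination_deg, -1.0, 0.0, 25.0)
    steep_inclination = triangular(inclination_deg, 20.0, 60.0, 90.0)

    # Rule 1: favourable thickness AND low inclination -> high potential
    high = min(favourable_thickness, low_inclination)
    # Rule 2: steep inclination -> low potential
    low = steep_inclination

    # Simple weighted defuzzification onto a 0..1 scale.
    if high + low == 0:
        return 0.0
    return (high * 1.0 + low * 0.1) / (high + low)

if __name__ == "__main__":
    print(round(mechanization_potential(thickness_m=2.0, inclination_deg=10.0), 2))
    print(round(mechanization_potential(thickness_m=2.0, inclination_deg=55.0), 2))
```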

  8. Application of text mining in the biomedical domain.

    Science.gov (United States)

    Fleuren, Wilco W M; Alkema, Wynand

    2015-03-01

    In recent years the amount of experimental data that is produced in biomedical research and the number of papers that are being published in this field have grown rapidly. In order to keep up to date with developments in their field of interest and to interpret the outcome of experiments in light of all available literature, researchers turn more and more to the use of automated literature mining. As a consequence, text mining tools have evolved considerably in number and quality and nowadays can be used to address a variety of research questions ranging from de novo drug target discovery to enhanced biological interpretation of the results from high-throughput experiments. In this paper we introduce the most important techniques that are used for text mining and give an overview of the text mining tools that are currently being used and the types of problems they are typically applied to.

  9. Mining text data

    CERN Document Server

    Aggarwal, Charu C

    2012-01-01

    Text mining applications have experienced tremendous advances because of web 2.0 and social networking applications. Recent advances in hardware and software technology have led to a number of unique scenarios where text mining algorithms are learned. "Mining Text Data" introduces an important niche in the text analytics field, and is an edited volume contributed by leading international researchers and practitioners focused on social networks & data mining. This book contains a wide swath of topics across social networks & data mining. Each chapter contains a comprehensive survey including

  10. Automation of a single-DNA molecule stretching device

    DEFF Research Database (Denmark)

    Sørensen, Kristian Tølbøl; Lopacinska, Joanna M.; Tommerup, Niels

    2015-01-01

    We automate the manipulation of genomic-length DNA in a nanofluidic device based on real-time analysis of fluorescence images. In our protocol, individual molecules are picked from a microchannel and stretched with pN forces using pressure driven flows. The millimeter-long DNA fragments free...

  11. Technology development for remote, computer-assisted operation of a continuous mining machine

    Energy Technology Data Exchange (ETDEWEB)

    Schnakenberg, G.H. [Pittsburgh Research Center, PA (United States)

    1993-12-31

    The U.S. Bureau of Mines was created to conduct research to improve the health, safety, and efficiency of the coal and metal mining industries. In 1986, the Bureau embarked on a new, major research effort to develop the technology that would enable the relocation of workers from hazardous areas to areas of relative safety. This effort is in contrast to historical efforts by the Bureau to control or reduce the hazardous agent or to provide protection to the worker. The technologies associated with automation, robotics, and computer software and hardware systems had progressed to the point that their use to develop computer-assisted operation of mobile mining equipment appeared to be a cost-effective and accomplishable task. At the first International Symposium on Mine Mechanization and Automation, an overview of the Bureau's computer-assisted mining program for underground coal mining was presented. The elements included providing computer-assisted tele-remote operation of continuous mining machines, haulage systems and roof bolting machines. Areas of research included sensors for machine guidance and for coal interface detection. Additionally, the research included computer hardware and software architectures, which are extremely important in developing technology that is transferable to industry and is flexible enough to accommodate the variety of machines used in coal mining today. This paper provides an update of the research under the computer-assisted mining program.

  12. Automated Vehicles Symposium 2015

    CERN Document Server

    Beiker, Sven

    2016-01-01

    This edited book comprises papers about the impacts, benefits and challenges of connected and automated cars. It is the third volume of the LNMOB series dealing with Road Vehicle Automation. The book comprises contributions from researchers, industry practitioners and policy makers, covering perspectives from the U.S., Europe and Japan. It is based on the Automated Vehicles Symposium 2015, which was jointly organized by the Association for Unmanned Vehicle Systems International (AUVSI) and the Transportation Research Board (TRB) in Ann Arbor, Michigan, in July 2015. The topical spectrum includes, but is not limited to, public sector activities, human factors, ethical and business aspects, energy and technological perspectives, vehicle systems and transportation infrastructure. This book is an indispensable source of information for academic researchers, industrial engineers and policy makers interested in the topic of road vehicle automation.

  13. I-94 Automation FAQs

    Data.gov (United States)

    Department of Homeland Security — In order to increase efficiency, reduce operating costs and streamline the admissions process, U.S. Customs and Border Protection has automated Form I-94 at air and...

  14. Automated Vehicles Symposium 2014

    CERN Document Server

    Beiker, Sven; Road Vehicle Automation 2

    2015-01-01

    This paper collection is the second volume of the LNMOB series on Road Vehicle Automation. The book contains a comprehensive review of current technical, socio-economic, and legal perspectives written by experts coming from public authorities, companies and universities in the U.S., Europe and Japan. It originates from the Automated Vehicle Symposium 2014, which was jointly organized by the Association for Unmanned Vehicle Systems International (AUVSI) and the Transportation Research Board (TRB) in Burlingame, CA, in July 2014. The contributions discuss the challenges arising from the integration of highly automated and self-driving vehicles into the transportation system, with a focus on human factors and different deployment scenarios. This book is an indispensable source of information for academic researchers, industrial engineers, and policy makers interested in the topic of road vehicle automation.

  15. Hydrometeorological Automated Data System

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Office of Hydrologic Development of the National Weather Service operates HADS, the Hydrometeorological Automated Data System. This data set contains the last 48...

  16. Automating the Media Center.

    Science.gov (United States)

    Holloway, Mary A.

    1988-01-01

    Discusses the need to develop more efficient information retrieval skills by the use of new technology. Lists four stages used in automating the media center. Describes North Carolina's pilot programs. Proposes benefits and looks at the media center's future. (MVL)

  17. De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies

    Science.gov (United States)

    Karamitros, Timokratis; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

    2016-01-01

    Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from <1% to 53% of amino acids in each gene exhibiting at least one substitution within the pool of samples. The UL23 gene had one of the highest genetic variabilities at 35.2% in keeping with its role in development of drug resistance. The assembly of accurate, full-length HHV-1 genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal. PMID:27309375
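
    As an aside, the contiguity statistics quoted in this abstract (N50, NG50, NG75) can be computed from a list of contig lengths as in the following Python sketch; the contig lengths are hypothetical, and only the ~152 kbp genome size comes from the abstract.

    # Minimal sketch of the contiguity statistics cited above (N50 / NG50 / NG75).
    # Contig lengths are illustrative inputs; the genome size is approximate.

    def nx(contig_lengths, fraction, total=None):
        """Return the length L such that contigs >= L cover `fraction` of `total`.
        With total = assembly size this is Nx; with total = genome size it is NGx."""
        if total is None:
            total = sum(contig_lengths)
        threshold = fraction * total
        covered = 0
        for length in sorted(contig_lengths, reverse=True):
            covered += length
            if covered >= threshold:
                return length
        return 0  # assembly does not cover the requested fraction of the genome

    contigs = [62000, 41000, 23000, 11000, 7000, 4000]   # hypothetical contig lengths (bp)
    genome_size = 152000                                  # approximate HHV-1 genome size (bp)
    print("N50 :", nx(contigs, 0.50))
    print("NG50:", nx(contigs, 0.50, total=genome_size))
    print("NG75:", nx(contigs, 0.75, total=genome_size))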

  18. Data mining in radiology.

    Science.gov (United States)

    Kharat, Amit T; Singh, Amarjit; Kulkarni, Vilas M; Shah, Digish

    2014-04-01

    Data mining facilitates the study of radiology data in various dimensions. It converts large patient image and text datasets into useful information that helps in improving patient care and provides informative reports. Data mining technology analyzes data within the Radiology Information System and Hospital Information System using specialized software which assesses relationships and agreement in available information. By using similar data analysis tools, radiologists can make informed decisions and predict the future outcome of a particular imaging finding. Data, information and knowledge are the components of data mining. Classes, Clusters, Associations, Sequential patterns, Classification, Prediction and Decision tree are the various types of data mining. Data mining has the potential to make delivery of health care affordable and ensure that the best imaging practices are followed. It is a tool for academic research. Data mining is considered to be ethically neutral; however, concerns regarding privacy and legality exist and need to be addressed to ensure the success of data mining.

  19. Data mining in radiology

    Directory of Open Access Journals (Sweden)

    Amit T Kharat

    2014-01-01

    Full Text Available Data mining facilitates the study of radiology data in various dimensions. It converts large patient image and text datasets into useful information that helps in improving patient care and provides informative reports. Data mining technology analyzes data within the Radiology Information System and Hospital Information System using specialized software which assesses relationships and agreement in available information. By using similar data analysis tools, radiologists can make informed decisions and predict the future outcome of a particular imaging finding. Data, information and knowledge are the components of data mining. Classes, Clusters, Associations, Sequential patterns, Classification, Prediction and Decision tree are the various types of data mining. Data mining has the potential to make delivery of health care affordable and ensure that the best imaging practices are followed. It is a tool for academic research. Data mining is considered to be ethically neutral; however, concerns regarding privacy and legality exist and need to be addressed to ensure the success of data mining.

  20. Disassembly automation automated systems with cognitive abilities

    CERN Document Server

    Vongbunyong, Supachai

    2015-01-01

    This book presents a number of aspects to be considered in the development of disassembly automation, including the mechanical system, vision system and intelligent planner. The implementation of cognitive robotics increases the flexibility and degree of autonomy of the disassembly system. Disassembly, as a step in the treatment of end-of-life products, can allow the recovery of embodied value left within disposed products, as well as the appropriate separation of potentially-hazardous components. In the end-of-life treatment industry, disassembly has largely been limited to manual labor, which is expensive in developed countries. Automation is one possible solution for economic feasibility. The target audience primarily comprises researchers and experts in the field, but the book may also be beneficial for graduate students.

  1. ACCOUNTING AUTOMATIONS RISKS

    OpenAIRE

    Муравський, В. В.; Хома, Н. Г.

    2015-01-01

    The accountant plays an active role in organizing automated accounting when information systems are introduced into enterprise activity. Effective accounting automation requires the identification of and protection against organizational risks. The authors researched, classified and generalized the risks of introducing information accounting systems. Ways of eliminating the sources of organizational risks and minimizing their consequences are given. The method of the effective con...

  2. Instant Sikuli test automation

    CERN Document Server

    Lau, Ben

    2013-01-01

    Get to grips with a new technology, understand what it is and what it can do for you, and then get to work with the most important features and tasks. A concise guide written in an easy-to-follow style using the Starter guide approach. This book is aimed at automation and testing professionals who want to use Sikuli to automate GUIs. Some Python programming experience is assumed.

  3. Automated security management

    CERN Document Server

    Al-Shaer, Ehab; Xie, Geoffrey

    2013-01-01

    In this contributed volume, leading international researchers explore configuration modeling and checking, vulnerability and risk assessment, configuration analysis, and diagnostics and discovery. The authors equip readers to understand automated security management systems and techniques that increase overall network assurability and usability. These constantly changing networks defend against cyber attacks by integrating hundreds of security devices such as firewalls, IPSec gateways, IDS/IPS, authentication servers, authorization/RBAC servers, and crypto systems. Automated Security Managemen

  4. Automation of Diagrammatic Reasoning

    OpenAIRE

    Jamnik, Mateja; Bundy, Alan; Green, Ian

    1997-01-01

    Theorems in automated theorem proving are usually proved by logical formal proofs. However, there is a subset of problems which humans can prove in a different way by the use of geometric operations on diagrams, so called diagrammatic proofs. Insight is more clearly perceived in these than in the corresponding algebraic proofs: they capture an intuitive notion of truthfulness that humans find easy to see and understand. We are identifying and automating this diagrammatic reasoning on mathemat...

  5. Automated Lattice Perturbation Theory

    Energy Technology Data Exchange (ETDEWEB)

    Monahan, Christopher

    2014-11-01

    I review recent developments in automated lattice perturbation theory. Starting with an overview of lattice perturbation theory, I focus on the three automation packages currently "on the market": HiPPy/HPsrc, Pastor and PhySyCAl. I highlight some recent applications of these methods, particularly in B physics. In the final section I briefly discuss the related, but distinct, approach of numerical stochastic perturbation theory.

  6. Marketing automation supporting sales

    OpenAIRE

    Sandell, Niko

    2016-01-01

    The past couple of decades have been a time of major changes in marketing. Digitalization has become a permanent part of marketing and at the same time enabled efficient collection of data. Personalization and customization of content are playing a crucial role in marketing when new customers are acquired. This has also created a need for automation to facilitate the distribution of targeted content. As a result of successful marketing automation more information about customers is gathered ...

  7. Informationization of coal enterprises and digital mine

    Energy Technology Data Exchange (ETDEWEB)

    Lu, Jian-jun; Wang, Xiao-lu; Ma, Li; Zhao, An-xin [Xi'an University of Science and Technology, Xi'an (China). School of Communication and Information Engineering

    2008-09-15

    The main problems found in the current conditions of informationization in coal enterprises in China were analysed. The paper clarified how to achieve informationization in coal mining and put forward a general configuration of informationization construction in which informationization in coal enterprises was divided into two parts: informationization of safety production and informationization of management. A platform for the integrated management of informationization in coal enterprises was planned. Ultimately, it was considered that an overall integrated digital mine is the way to achieve the goal of informationization in coal enterprises, which can promote the application of automation, digitalization, networking and informationization towards intellectualization. At the same time, the competitiveness of enterprises can be improved entirely and a new type of coal industry can be supported by information technology. 8 refs., 4 figs.

  8. Informationization of coal enterprises and digital mine

    Institute of Scientific and Technical Information of China (English)

    LU Jian-jun; WANG Xiao-lu; MA Li; ZHAO An-xin

    2008-01-01

    This paper analyzed the main problems found in the current conditions of informationization in coal enterprises. It clarified how to achieve informationization in coal mines and put forward a general configuration of informationization construction in which informationization in coal enterprises was divided into two parts: informationization of safety production and informationization of management. It planned a platform for the integrated management of informationization in coal enterprises. Ultimately, it brought forward that an overall integrated digital mine is the way to achieve the goal of informationization in coal enterprises, which can promote the application of automation, digitalization, networking and informationization towards intellectualization. At the same time, the competitiveness of enterprises can be improved entirely, and a new type of coal industry can be supported by information technology.

  9. Opinion mining and summarization for customer reviews

    Directory of Open Access Journals (Sweden)

    Sanjeev kumar Chauhan

    2012-08-01

    Full Text Available Opinion mining aims to detect the opinion of the author expressed in a document. The primary task in the field of opinion mining is subjectivity analysis, which determines whether a document is subjective or objective. Subjectivity indicates that the document contains some opinionated content, while objectivity indicates that it contains no opinionated content, i.e. it expresses no sentiment. The next task is sentiment polarity analysis, which differentiates documents according to their positivity or negativity. But presently there is no automated system which can perform this task. We are developing a system which can find the degree of polarity of each document and, according to it, assign a human-like rating to that document. Finally, it generates a summary of the review which contains only the highly subjective and feature-related parts of the document.
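
    For orientation, the sketch below shows a generic lexicon-based polarity score mapped to a star rating. It is a common baseline rather than the system described in this abstract, and the word lists are illustrative.

    # Minimal lexicon-based sketch of sentiment polarity scoring and a derived
    # star rating. Word lists and the rating mapping are illustrative choices.
    POSITIVE = {"good", "great", "excellent", "love", "amazing"}
    NEGATIVE = {"bad", "poor", "terrible", "hate", "disappointing"}

    def polarity(review):
        """Return a polarity score in [-1, 1] from counts of opinion words."""
        words = review.lower().split()
        pos = sum(w.strip(".,!?") in POSITIVE for w in words)
        neg = sum(w.strip(".,!?") in NEGATIVE for w in words)
        total = pos + neg
        return 0.0 if total == 0 else (pos - neg) / total

    def star_rating(review):
        """Map the polarity score onto a human-like 1-5 rating."""
        return round(1 + 2 * (polarity(review) + 1))  # -1 -> 1 star, +1 -> 5 stars

    print(star_rating("Great battery life, excellent screen, but poor speakers."))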

  10. Reality mining of animal social systems.

    Science.gov (United States)

    Krause, Jens; Krause, Stefan; Arlinghaus, Robert; Psorakis, Ioannis; Roberts, Stephen; Rutz, Christian

    2013-09-01

    The increasing miniaturisation of animal-tracking technology has made it possible to gather exceptionally detailed machine-sensed data on the social dynamics of almost entire populations of individuals, in both terrestrial and aquatic study systems. Here, we review important issues concerning the collection of such data, and their processing and analysis, to identify the most promising approaches in the emerging field of 'reality mining'. Automated technologies can provide data sensing at time intervals small enough to close the gap between social patterns and their underlying processes, providing insights into how social structures arise and change dynamically over different timescales. Especially in conjunction with experimental manipulations, reality mining promises significant advances in basic and applied research on animal social systems.

  11. Elements of EAF automation processes

    Science.gov (United States)

    Ioana, A.; Constantin, N.; Dragna, E. C.

    2017-01-01

    Our article presents elements of Electric Arc Furnace (EAF) automation. We present and analyze in detail two automation schemes: the scheme of the electrical EAF automation system and the scheme of the thermal EAF automation system. The results of applying these automation schemes consist in: a significant reduction in the specific consumption of electrical energy by the Electric Arc Furnace, increased productivity of the Electric Arc Furnace, improved quality of the produced steel, and increased durability of the structural elements of the Electric Arc Furnace.

  12. Data mining, mining data : energy consumption modelling

    Energy Technology Data Exchange (ETDEWEB)

    Dessureault, S. [Arizona Univ., Tucson, AZ (United States)

    2007-09-15

    Most modern mining operations are accumulating large amounts of data on production and business processes. Data, however, provides value only if it can be translated into information that appropriate users can utilize. This paper emphasized that a new technological focus should emerge, notably how to concentrate data into information; analyze information sufficiently to become knowledge; and, act on that knowledge. Researchers at the Mining Information Systems and Operations Management (MISOM) laboratory at the University of Arizona have created a method to transform data into action. The data-to-action approach was exercised in the development of an energy consumption model (ECM), in partnership with a major US-based copper mining company, 2 software companies, and the MISOM laboratory. The approach begins by integrating several key data sources using data warehousing techniques, and increasing the existing level of integration and data cleaning. An online analytical processing (OLAP) cube was also created to investigate the data and identify a subset of several million records. Data mining algorithms were applied using the information that was isolated by the OLAP cube. The data mining results showed that traditional cost drivers of energy consumption are poor predictors. A comparison was made between traditional methods of predicting energy consumption and the prediction formed using data mining. Traditionally, in the mines for which data were available, monthly averages of tons and distance are used to predict diesel fuel consumption. However, this article showed that new information technology can be used to incorporate many more variables into the budgeting process, resulting in more accurate predictions. The ECM helped mine planners improve the prediction of energy use through more data integration, measure development, and workflow analysis. 5 refs., 11 figs.
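
    The contrast drawn above between averaging-based budgeting and multi-variable prediction can be illustrated with the following Python sketch. The operational data are synthetic, and the least-squares fit merely stands in for the data mining algorithms used in the ECM.

    # Minimal sketch contrasting a traditional average-based fuel estimate with a
    # multi-variable regression. All numbers below are synthetic.
    import numpy as np

    # columns: tons hauled, haul distance (km), average grade (%), idle hours
    X = np.array([
        [1200, 4.1, 8.0, 3.0],
        [1500, 3.8, 9.5, 2.0],
        [ 900, 5.0, 7.5, 5.0],
        [1700, 4.4, 8.8, 1.5],
        [1100, 4.6, 8.2, 4.0],
        [1400, 4.0, 9.0, 2.5],
    ])
    fuel = np.array([5100, 5600, 4300, 6200, 4900, 5400])   # litres of diesel consumed

    # Traditional approach: litres per ton-kilometre applied to tons and distance only.
    l_per_tonkm = fuel.sum() / (X[:, 0] * X[:, 1]).sum()
    naive = l_per_tonkm * X[:, 0] * X[:, 1]

    # Data-mining style: least-squares fit over all available variables.
    A = np.column_stack([X, np.ones(len(X))])               # add an intercept term
    coef, *_ = np.linalg.lstsq(A, fuel, rcond=None)
    fitted = A @ coef

    print("naive error :", np.abs(naive - fuel).mean())
    print("model error :", np.abs(fitted - fuel).mean())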

  13. Genome Modeling System: A Knowledge Management Platform for Genomics.

    Directory of Open Access Journals (Sweden)

    Malachi Griffith

    2015-07-01

    Full Text Available In this work, we present the Genome Modeling System (GMS), an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395) and matched lymphoblastoid line (HCC1395BL). These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms.

  14. Automated discovery of single nucleotide polymorphism and simple sequence repeat molecular genetic markers.

    Science.gov (United States)

    Batley, Jacqueline; Jewell, Erica; Edwards, David

    2007-01-01

    Molecular genetic markers represent one of the most powerful tools for the analysis of genomes. Molecular marker technology has developed rapidly over the last decade, and two forms of sequence-based markers, simple sequence repeats (SSRs), also known as microsatellites, and single nucleotide polymorphisms (SNPs), now predominate applications in modern genetic analysis. The availability of large sequence data sets permits mining for SSRs and SNPs, which may then be applied to genetic trait mapping and marker-assisted selection. Here, we describe Web-based automated methods for the discovery of these SSRs and SNPs from sequence data. SSRPrimer enables the real-time discovery of SSRs within submitted DNA sequences, with the concomitant design of PCR primers for SSR amplification. Alternatively, users may browse the SSR Taxonomy Tree to identify predetermined SSR amplification primers for any species represented within the GenBank database. SNPServer uses a redundancy-based approach to identify SNPs within DNA sequence data. Following submission of a sequence of interest, SNPServer uses BLAST to identify similar sequences, CAP3 to cluster and assemble these sequences, and then the SNP discovery software autoSNP to detect SNPs and insertion/deletion (indel) polymorphisms.
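
    A minimal Python sketch of the repeat-detection idea behind SSR discovery is given below; it is not the SSRPrimer implementation, and the motif lengths and repeat threshold are arbitrary.

    # Minimal sketch of SSR (microsatellite) detection with a regular expression.
    # Overlapping and nested repeats are reported; production tools post-filter these.
    import re

    def find_ssrs(sequence, motif_len=(2, 3, 4), min_repeats=4):
        """Yield (start, motif, repeat_count) for perfect tandem repeats."""
        for k in motif_len:
            pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (k, min_repeats - 1))
            for m in pattern.finditer(sequence):
                repeat, motif = m.group(1), m.group(2)
                yield m.start(), motif, len(repeat) // k

    seq = "TTGACACACACACAGGTCTAGCTAGCTAGCTAGCAA"
    for start, motif, count in find_ssrs(seq):
        print(f"pos {start}: ({motif})x{count}")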

  15. Data mining approach to model the diagnostic service management.

    Science.gov (United States)

    Lee, Sun-Mi; Lee, Ae-Kyung; Park, Il-Su

    2006-01-01

    Korea has a National Health Insurance Program operated by the government-owned National Health Insurance Corporation, and diagnostic services are provided every two years for the insured and their family members. Developing a customer relationship management (CRM) system using data mining technology would be useful to improve the performance of diagnostic service programs. Under these circumstances, this study developed a model for diagnostic service management taking into account the characteristics of subjects using a data mining approach. This study could be further used to develop an automated CRM system contributing to the increase in the rate of receiving diagnostic services.

  16. Coal mine site reclamation

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2013-02-15

    Coal mine sites can have significant effects on local environments. In addition to the physical disruption of land forms and ecosystems, mining can also leave behind a legacy of secondary detrimental effects due to leaching of acid and trace elements from discarded materials. This report looks at the remediation of both deep mine and opencast mine sites, covering reclamation methods, back-filling issues, drainage and restoration. Examples of national variations in the applicable legislation and in the definition of rehabilitation are compared. Ultimately, mine site rehabilitation should return sites to conditions where land forms, soils, hydrology, and flora and fauna are self-sustaining and compatible with surrounding land uses. Case studies are given to show what can be achieved and how some landscapes can actually be improved as a result of mining activity.

  17. Implementation of Paste Backfill Mining Technology in Chinese Coal Mines

    Directory of Open Access Journals (Sweden)

    Qingliang Chang

    2014-01-01

    Full Text Available Implementation of clean mining technology at coal mines is crucial to protect the environment and maintain balance among energy resources, consumption, and ecology. After reviewing present coal clean mining technology, we introduce the technology principles and technological process of paste backfill mining in coal mines and discuss the components and features of backfill materials, the constitution of the backfill system, and the backfill process. Specific implementation of this technology and its application are analyzed for paste backfill mining in Daizhuang Coal Mine; a practical implementation shows that paste backfill mining can improve the safety and excavation rate of coal mining, which can effectively resolve surface subsidence problems caused by underground mining activities, by utilizing solid waste such as coal gangues as a resource. Therefore, paste backfill mining is an effective clean coal mining technology, which has widespread application.

  18. Implementation of paste backfill mining technology in Chinese coal mines.

    Science.gov (United States)

    Chang, Qingliang; Chen, Jianhang; Zhou, Huaqiang; Bai, Jianbiao

    2014-01-01

    Implementation of clean mining technology at coal mines is crucial to protect the environment and maintain balance among energy resources, consumption, and ecology. After reviewing present coal clean mining technology, we introduce the technology principles and technological process of paste backfill mining in coal mines and discuss the components and features of backfill materials, the constitution of the backfill system, and the backfill process. Specific implementation of this technology and its application are analyzed for paste backfill mining in Daizhuang Coal Mine; a practical implementation shows that paste backfill mining can improve the safety and excavation rate of coal mining, which can effectively resolve surface subsidence problems caused by underground mining activities, by utilizing solid waste such as coal gangues as a resource. Therefore, paste backfill mining is an effective clean coal mining technology, which has widespread application.

  19. Developing image processing meta-algorithms with data mining of multiple metrics.

    Science.gov (United States)

    Leung, Kelvin; Cunha, Alexandre; Toga, A W; Parker, D Stott

    2014-01-01

    People often use multiple metrics in image processing, but here we take a novel approach of mining the values of batteries of metrics on image processing results. We present a case for extending image processing methods to incorporate automated mining of multiple image metric values. Here by a metric we mean any image similarity or distance measure, and in this paper we consider intensity-based and statistical image measures and focus on registration as an image processing problem. We show how it is possible to develop meta-algorithms that evaluate different image processing results with a number of different metrics and mine the results in an automated fashion so as to select the best results. We show that the mining of multiple metrics offers a variety of potential benefits for many image processing problems, including improved robustness and validation.
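
    The meta-algorithm idea described above can be sketched as follows: score candidate results with a battery of metrics and keep the consensus best. The metrics, images and ranking rule below are illustrative choices, not those of the paper.

    # Minimal sketch: evaluate candidate image processing results with several
    # metrics and select the candidate with the best average rank.
    import numpy as np

    def mse(a, b):
        return float(np.mean((a - b) ** 2))                      # lower is better

    def neg_correlation(a, b):
        return -float(np.corrcoef(a.ravel(), b.ravel())[0, 1])   # lower is better

    def best_candidate(reference, candidates, metrics):
        """Rank every candidate under every metric and return the best average rank."""
        scores = np.array([[m(reference, c) for c in candidates] for m in metrics])
        ranks = scores.argsort(axis=1).argsort(axis=1)           # per-metric ranks, 0 = best
        return int(ranks.mean(axis=0).argmin())

    rng = np.random.default_rng(0)
    reference = rng.random((16, 16))
    candidates = [reference + rng.normal(0, s, reference.shape) for s in (0.05, 0.2, 0.5)]
    print("selected candidate:", best_candidate(reference, candidates, [mse, neg_correlation]))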

  20. Land Mines (Landminen)

    Science.gov (United States)

    1978-02-02

    making contact with the safety pin of the pull fuze 42. Two locking bolts held the upper and lower case in position during transport, so that there... safety pin out of the extended striker, thus releasing it. These mines were filled with 200 g of explosives. This type of mine was the model for the...by inserting the detonator slide. However, the mine is not fully armed until the safety pin is removed and reinserted until it makes contact with the

  1. A REVIEW ON TEXT MINING IN DATA MINING

    OpenAIRE

    2016-01-01

    Data mining is knowledge discovery in databases, and its goal is to extract patterns and knowledge from large amounts of data. An important area within data mining is text mining. Text mining extracts high-quality information from text. Statistical pattern learning is used to obtain this high-quality information. High quality in text mining refers to the combination of relevance, novelty and interestingness. Tasks in text mining are text categorization, text clustering, entity extraction and sentim...

  2. Materials Testing and Automation

    Science.gov (United States)

    Cooper, Wayne D.; Zweigoron, Ronald B.

    1980-07-01

    The advent of automation in materials testing has been in large part responsible for recent radical changes in the materials testing field: Tests virtually impossible to perform without a computer have become more straightforward to conduct. In addition, standardized tests may be performed with enhanced efficiency and repeatability. A typical automated system is described in terms of its primary subsystems — an analog station, a digital computer, and a processor interface. The processor interface links the analog functions with the digital computer; it includes data acquisition, command function generation, and test control functions. Features of automated testing are described with emphasis on calculated variable control, control of a variable that is computed by the processor and cannot be read directly from a transducer. Three calculated variable tests are described: a yield surface probe test, a thermomechanical fatigue test, and a constant-stress-intensity range crack-growth test. Future developments are discussed.
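
    The constant-stress-intensity-range test mentioned above is a good example of calculated variable control. The sketch below assumes the textbook centre-crack relation delta K = delta sigma * sqrt(pi * a), not any specific rig or specimen geometry, and the numbers are illustrative.

    # Minimal sketch of "calculated variable control" for a constant delta K
    # crack-growth test: the controlled quantity (delta K) is computed from
    # transducer readings (crack length) rather than read directly.
    import math

    def stress_range_for_constant_dK(dK_target, crack_length):
        """Stress range (MPa) that keeps delta K at dK_target (MPa*sqrt(m))."""
        return dK_target / math.sqrt(math.pi * crack_length)

    dK_target = 20.0          # MPa*sqrt(m), test setpoint (illustrative)
    crack_length = 0.005      # m, measured at the start of the test (illustrative)
    for cycle_block in range(5):
        stress_range = stress_range_for_constant_dK(dK_target, crack_length)
        print(f"a = {crack_length*1000:5.2f} mm -> apply delta sigma = {stress_range:6.1f} MPa")
        crack_length *= 1.15  # placeholder for the crack growth measured each block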

  3. Automation of Taxiing

    Directory of Open Access Journals (Sweden)

    Jaroslav Bursík

    2017-01-01

    Full Text Available The article focuses on the possibility of automation of taxiing, which is the part of a flight which, under adverse weather conditions, greatly reduces the operational usability of an airport, and is the only part of a flight that has not yet been affected by automation. Taxiing is currently handled manually by the pilot, who controls the airplane based on information from visual perception. The article primarily deals with possible ways of obtaining navigational information and its automatic transfer to the controls. Analyzed and assessed were currently available technologies such as computer vision, Light Detection and Ranging and Global Navigation Satellite System, which are useful for navigation, and their general implementation into an airplane was designed. Obstacles to the implementation were identified, too. The result is a proposed combination of systems along with their installation into the airplane’s systems so that it is possible to use automated taxiing.

  4. Physics Mining of Multi-Source Data Sets

    Science.gov (United States)

    Helly, John; Karimabadi, Homa; Sipes, Tamara

    2012-01-01

    Powerful new parallel data mining algorithms can produce diagnostic and prognostic numerical models and analyses from observational data. These techniques yield higher-resolution measures than ever before of environmental parameters by fusing synoptic imagery and time-series measurements. These techniques are general and relevant to observational data, including raster, vector, and scalar, and can be applied in all Earth- and environmental science domains. Because they can be highly automated and are parallel, they scale to large spatial domains and are well suited to change and gap detection. This makes it possible to analyze spatial and temporal gaps in information, and facilitates within-mission replanning to optimize the allocation of observational resources. The basis of the innovation is the extension of a recently developed set of algorithms packaged into MineTool to multi-variate time-series data. MineTool is unique in that it automates the various steps of the data mining process, thus making it amenable to autonomous analysis of large data sets. Unlike techniques such as Artificial Neural Nets, which yield a blackbox solution, MineTool's outcome is always an analytical model in parametric form that expresses the output in terms of the input variables. This has the advantage that the derived equation can then be used to gain insight into the physical relevance and relative importance of the parameters and coefficients in the model. This is referred to as physics-mining of data. The capabilities of MineTool are extended to include both supervised and unsupervised algorithms, handle multi-type data sets, and parallelize it.
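
    The "analytical model in parametric form" idea can be illustrated with a simple least-squares fit that returns an explicit equation rather than a black box. The data below are synthetic and MineTool itself is not reproduced here.

    # Minimal sketch of the "physics-mining" idea: fit an explicit parametric
    # equation whose coefficients can be inspected and interpreted.
    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.uniform(0, 10, 50)                                     # observed driver variable
    y = 3.0 * x**2 - 2.0 * x + 5.0 + rng.normal(0, 2.0, x.size)    # noisy measured response

    coeffs = np.polyfit(x, y, deg=2)       # least-squares fit of y = a*x^2 + b*x + c
    a, b, c = coeffs
    print(f"recovered model: y = {a:.2f}*x^2 + {b:+.2f}*x + {c:+.2f}")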

  5. Statistical data analytics foundations for data mining, informatics, and knowledge discovery

    CERN Document Server

    Piegorsch, Walter W

    2015-01-01

    A comprehensive introduction to statistical methods for data mining and knowledge discovery. Applications of data mining and 'big data' increasingly take center stage in our modern, knowledge-driven society, supported by advances in computing power, automated data acquisition, social media development and interactive, linkable internet software. This book presents a coherent, technical introduction to modern statistical learning and analytics, starting from the core foundations of statistics and probability. It includes an overview of probability and statistical distributions, basic

  6. Automating the CMS DAQ

    CERN Document Server

    Bauer, Gerry; Behrens, Ulf; Branson, James; Chaze, Olivier; Cittolin, Sergio; Coarasa Perez, Jose Antonio; Darlea, Georgiana Lavinia; Deldicque, Christian; Dobson, Marc; Dupont, Aymeric; Erhan, Samim; Gigi, Dominique; Glege, Frank; Gomez Ceballos, Guillelmo; Gomez-Reino Garrido, Robert; Hartl, Christian; Hegeman, Jeroen Guido; Holzner, Andre Georg; Masetti, Lorenzo; Meijers, Franciscus; Meschi, Emilio; Mommsen, Remigius; Morovic, Srecko; Nunez Barranco Fernandez, Carlos; O'Dell, Vivian; Orsini, Luciano; Ozga, Wojciech Andrzej; Paus, Christoph Maria Ernst; Petrucci, Andrea; Pieri, Marco; Racz, Attila; Raginel, Olivier; Sakulin, Hannes; Sani, Matteo; Schwick, Christoph; Spataru, Andrei Cristian; Stieger, Benjamin Bastian; Sumorok, Konstanty; Veverka, Jan; Wakefield, Christopher Colin; Zejdl, Petr

    2014-01-01

    We present the automation mechanisms that have been added to the Data Acquisition and Run Control systems of the Compact Muon Solenoid (CMS) experiment during Run 1 of the LHC, ranging from the automation of routine tasks to automatic error recovery and context-sensitive guidance to the operator. These mechanisms helped CMS to maintain a data taking efficiency above 90% and to even improve it to 95% towards the end of Run 1, despite an increase in the occurrence of single-event upsets in sub-detector electronics at high LHC luminosity.

  7. Automating the CMS DAQ

    Energy Technology Data Exchange (ETDEWEB)

    Bauer, G.; et al.

    2014-01-01

    We present the automation mechanisms that have been added to the Data Acquisition and Run Control systems of the Compact Muon Solenoid (CMS) experiment during Run 1 of the LHC, ranging from the automation of routine tasks to automatic error recovery and context-sensitive guidance to the operator. These mechanisms helped CMS to maintain a data taking efficiency above 90% and to even improve it to 95% towards the end of Run 1, despite an increase in the occurrence of single-event upsets in sub-detector electronics at high LHC luminosity.

  8. Altering users' acceptance of automation through prior automation exposure.

    Science.gov (United States)

    Bekier, Marek; Molesworth, Brett R C

    2016-08-22

    Air navigation service providers worldwide see increased use of automation as one solution to overcome the capacity constraints imbedded in the present air traffic management (ATM) system. However, increased use of automation within any system is dependent on user acceptance. The present research sought to determine if the point at which an individual is no longer willing to accept or cooperate with automation can be manipulated. Forty participants underwent training on a computer-based air traffic control programme, followed by two ATM exercises (order counterbalanced), one with and one without the aid of automation. Results revealed after exposure to a task with automation assistance, user acceptance of high(er) levels of automation ('tipping point') decreased; suggesting it is indeed possible to alter automation acceptance. Practitioner Summary: This paper investigates whether the point at which a user of automation rejects automation (i.e. 'tipping point') is constant or can be manipulated. The results revealed after exposure to a task with automation assistance, user acceptance of high(er) levels of automation decreased; suggesting it is possible to alter automation acceptance.

  9. Genome bioinformatics of tomato and potato

    OpenAIRE

    E Datema

    2011-01-01

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been deciphered and are being exploited for fundamental research and applied to improve their breeding programs. The developments in sequencing technologies have also impacted the associated bioinformat...

  10. Integrated efficient solution : mines adopting energy management system

    Energy Technology Data Exchange (ETDEWEB)

    Lopez-Pacheco, A.

    2010-12-15

    This article discussed a multi-faceted approach to energy consumption optimization (ECO) in underground mines. BESTECH, a provider of engineering, software and environmental monitoring services, developed NRG1-ECO to manage the many pieces of automated equipment in a mine. This complete energy management system can be applied to processes such as compressors, pumps and other systems in a mine that could benefit from lower energy use. A mine's ventilation system usually operates continuously at peak capacity. The ventilation-on-demand (VOD) module of NRG1-ECO can reduce ventilation costs by up to 30 percent by enabling the mine to instantly control the air flow to where it is needed. BESTECH developed the technology in collaboration with a consortium of mine industry experts to establish best practices and standards. There are 5 control strategies in NRG1-ECO, including time-of-day scheduling; real-time control; environmental monitoring control; real-time air flow monitoring and adjustment; and the Intelligent Zone Controller (IZC), which increases system responsiveness, as data can be analyzed and processed internally and does not have to be transmitted to the surface for decision making. Each control parameter is based on the needs of individual mines and designed according to their specifications. NRG1-ECO also stores all the data it processes for real-time monitoring. The system will be tested in a two-stage approach in the Hoyle Pond Mine in Timmins, Ontario, and will expand to the rest of the mine if that proves to be satisfactory. 1 fig.
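
    Purely as an illustration of how such control strategies might combine, the sketch below mixes a time-of-day schedule with an environmental override for a fan setpoint. The thresholds and speed levels are invented and are not NRG1-ECO parameters.

    # Minimal sketch of ventilation-on-demand style control combining two of the
    # strategies listed above (time-of-day scheduling and environmental monitoring).
    def fan_setpoint(hour, co_ppm, equipment_active):
        """Return fan speed as a fraction of full capacity."""
        if co_ppm > 25:                  # environmental override always wins
            return 1.0
        if not equipment_active:         # no diesel equipment in the zone
            return 0.3                   # background air flow only
        if 6 <= hour < 22:               # production shift: time-of-day schedule
            return 0.9
        return 0.5                       # maintenance / off-peak hours

    print(fan_setpoint(hour=14, co_ppm=8, equipment_active=True))    # 0.9
    print(fan_setpoint(hour=2,  co_ppm=30, equipment_active=False))  # 1.0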

  11. Text Mining for Protein Docking.

    Directory of Open Access Journals (Sweden)

    Varsha D Badal

    2015-12-01

    Full Text Available The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound
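
    The abstract-filtering step described here (bag-of-words features plus a Support Vector Machine) can be sketched with scikit-learn as below; the training sentences and labels are invented for illustration and do not reproduce the authors' models.

    # Minimal sketch of filtering sentences for docking-relevant constraint
    # information with a bag-of-words representation and a linear SVM.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC
    from sklearn.pipeline import make_pipeline

    texts = [
        "Mutation of residue Arg45 abolished binding to the partner protein.",
        "The interface is stabilized by a salt bridge between Lys12 and Asp78.",
        "The protein was expressed in E. coli and purified by affinity chromatography.",
        "Crystals were grown at 18 degrees Celsius in PEG 4000.",
    ]
    labels = [1, 1, 0, 0]   # 1 = contains docking-relevant constraint information

    classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    classifier.fit(texts, labels)

    query = "Alanine substitution at Glu33 disrupted complex formation."
    print("relevant" if classifier.predict([query])[0] == 1 else "irrelevant")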

  12. Mining anaerobic digester consortia metagenomes for secreted carbohydrate active enzymes

    DEFF Research Database (Denmark)

    Wilkens, Casper; Busk, Peter Kamp; Pilgaard, Bo

    was done with the Peptide Pattern Recognition (PPR) program (Busk and Lange, 2013), which is a novel non-alignment-based approach that can predict the function of, e.g., CAZymes. PPR identifies a set of short conserved sequences, which can be used as a fingerprint when mining genomes for novel enzymes. In both...
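
    A minimal sketch of the fingerprint idea attributed to PPR is shown below: count how many short conserved peptides of a family occur in a candidate sequence. The peptide set and sequence are invented, and this is not the PPR program.

    # Minimal sketch of fingerprint-style classification with short conserved peptides.
    FINGERPRINT = {"GWHQG", "DVVLN", "NEPH", "WGGQ"}   # hypothetical conserved peptides

    def fingerprint_hits(sequence, peptides=FINGERPRINT):
        """Return the conserved peptides found in the sequence."""
        return {p for p in peptides if p in sequence}

    candidate = "MKLAVGWHQGTTSDVVLNAARPLLWGGQKEEL"      # invented protein sequence
    hits = fingerprint_hits(candidate)
    print(f"{len(hits)}/{len(FINGERPRINT)} fingerprint peptides found:", sorted(hits))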

  13. Microcontroller for automation application

    Science.gov (United States)

    Cooper, H. W.

    1975-01-01

    The description of a microcontroller currently being developed for automation application was given. It is basically an 8-bit microcomputer with a 40K byte random access memory/read only memory, and can control a maximum of 12 devices through standard 15-line interface ports.

  14. Automated Composite Column Wrapping

    OpenAIRE

    ECT Team, Purdue

    2007-01-01

    The Automated Composite Column Wrapping is performed by a patented machine known as Robo-Wrapper. Currently there are three versions of the machine available for bridge retrofit work depending on the size of the columns being wrapped. Composite column retrofit jacket systems can be structurally just as effective as conventional steel jacketing in improving the seismic response characteristics of substandard reinforced concrete columns.

  15. Automated Web Applications Testing

    Directory of Open Access Journals (Sweden)

    Alexandru Dan CĂPRIŢĂ

    2009-01-01

    Full Text Available Unit tests are a vital part of several software development practices and processes such as Test-First Programming, Extreme Programming and Test-Driven Development. This article briefly presents software quality and testing concepts as well as an introduction to an automated unit testing framework for PHP web-based applications.

  16. Automated Student Model Improvement

    Science.gov (United States)

    Koedinger, Kenneth R.; McLaughlin, Elizabeth A.; Stamper, John C.

    2012-01-01

    Student modeling plays a critical role in developing and improving instruction and instructional technologies. We present a technique for automated improvement of student models that leverages the DataShop repository, crowd sourcing, and a version of the Learning Factors Analysis algorithm. We demonstrate this method on eleven educational…

  17. Automated Accounting. Instructor Guide.

    Science.gov (United States)

    Moses, Duane R.

    This curriculum guide was developed to assist business instructors using Dac Easy Accounting College Edition Version 2.0 software in their accounting programs. The module consists of four units containing assignment sheets and job sheets designed to enable students to master competencies identified in the area of automated accounting. The first…

  18. ERGONOMICS AND PROCESS AUTOMATION

    OpenAIRE

    Carrión Muñoz, Rolando; Docente de la FII - UNMSM

    2014-01-01

    The article shows the role that ergonomics plays in the automation of processes and its importance for Industrial Engineering.

  19. Mechatronic Design Automation

    DEFF Research Database (Denmark)

    Fan, Zhun

    successfully design analogue filters, vibration absorbers, micro-electro-mechanical systems, and vehicle suspension systems, all in an automatic or semi-automatic way. It also investigates the very important issue of co-designing plant-structures and dynamic controllers in automated design of Mechatronic...

  20. Protokoller til Home Automation

    DEFF Research Database (Denmark)

    Kjær, Kristian Ellebæk

    2008-01-01

    computer that can switch between predefined settings. Sometimes the computer can be controlled remotely over the internet, so that the status of the home can be viewed from a computer or perhaps even from a mobile phone. While the applications mentioned are classic examples of home automation, additional functionality has emerged...

  1. Myths in test automation

    Directory of Open Access Journals (Sweden)

    Jazmine Francis

    2015-01-01

    Full Text Available Myths in the automation of software testing are an issue of discussion that echoes throughout the software validation service industry. Probably the first thought that appears in a knowledgeable reader's mind would be: Why this old topic again? What is new to discuss on the matter? But, for the first time, everyone agrees that automation testing today is not what it used to be ten or fifteen years ago, because it has evolved in scope and magnitude. What began as simple linear scripts for web applications today has a complex architecture and a hybrid framework to facilitate the implementation of testing applications developed with various platforms and technologies. Undoubtedly automation has advanced, but so have the myths associated with it. The change in people's perspective and knowledge of automation has altered the terrain. This article reflects the points of view and experience of the author regarding the transformation of the original myths into new versions, and how they are derived; it also provides his thoughts on the new generation of myths.

  2. Automating Shallow Seismic Imaging

    Energy Technology Data Exchange (ETDEWEB)

    Steeples, Don W.

    2004-12-09

    This seven-year, shallow-seismic reflection research project had the aim of improving geophysical imaging of possible contaminant flow paths. Thousands of chemically contaminated sites exist in the United States, including at least 3,700 at Department of Energy (DOE) facilities. Imaging technologies such as shallow seismic reflection (SSR) and ground-penetrating radar (GPR) sometimes are capable of identifying geologic conditions that might indicate preferential contaminant-flow paths. Historically, SSR has been used very little at depths shallower than 30 m, and even more rarely at depths of 10 m or less. Conversely, GPR is rarely useful at depths greater than 10 m, especially in areas where clay or other electrically conductive materials are present near the surface. Efforts to image the cone of depression around a pumping well using seismic methods were only partially successful (for complete references of all research results, see the full Final Technical Report, DOE/ER/14826-F), but peripheral results included development of SSR methods for depths shallower than one meter, a depth range that had not been achieved before. Imaging at such shallow depths, however, requires geophone intervals of the order of 10 cm or less, which makes such surveys very expensive in terms of human time and effort. We also showed that SSR and GPR could be used in a complementary fashion to image the same volume of earth at very shallow depths. The primary research focus of the second three-year period of funding was to develop and demonstrate an automated method of conducting two-dimensional (2D) shallow-seismic surveys with the goal of saving time, effort, and money. Tests involving the second generation of the hydraulic geophone-planting device dubbed the "Autojuggie" showed that large numbers of geophones can be placed quickly and automatically and can acquire high-quality data, although not under rough topographic conditions. In some easy

  3. Data mining for service

    CERN Document Server

    2014-01-01

    Virtually all nontrivial and modern service related problems and systems involve data volumes and types that clearly fall into what is presently meant as "big data", that is, are huge, heterogeneous, complex, distributed, etc. Data mining is a series of processes which include collecting and accumulating data, modeling phenomena, and discovering new information, and it is one of the most important steps to scientific analysis of the processes of services.  Data mining application in services requires a thorough understanding of the characteristics of each service and knowledge of the compatibility of data mining technology within each particular service, rather than knowledge only in calculation speed and prediction accuracy. Varied examples of services provided in this book will help readers understand the relation between services and data mining technology. This book is intended to stimulate interest among researchers and practitioners in the relation between data mining technology and its application to ...

  4. Mining Deployment Optimization

    Science.gov (United States)

    Čech, Jozef

    2016-09-01

    The deployment problem, researched primarily in the military sector, is emerging in some other industries, mining included. The principal decision is how to deploy some activities in space and time to achieve a desired outcome while complying with certain requirements or limits. Requirements and limits are on the side of constraints, while minimizing costs or maximizing benefits is on the side of objectives. A model with application to the mining of a polymetallic deposit is presented. The main intention of the computer-based tool is to give a mining engineer quick and immediate decision solutions together with possibilities for experimentation. The task is to determine the strategic deployment of mining activities on a deposit, meeting the planned output from the mine while at the same time complying with limited reserves and haulage capacities. Priorities and benefits can be formulated by the planner.
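
    One way to sketch such a deployment decision is as a small linear program: mine enough tonnes to meet planned output without exceeding block reserves or haulage capacity, at minimum cost. All figures below are illustrative and do not come from the paper.

    # Minimal sketch of the deployment decision as a linear program.
    from scipy.optimize import linprog

    cost = [12.0, 9.5, 14.0]            # cost per tonne mined in blocks A, B, C
    reserves = [8000, 5000, 10000]      # tonnes available in each block
    planned_output = 12000              # tonnes required for the period
    haulage_capacity = 15000            # tonnes the haulage fleet can move

    # linprog minimises cost @ x subject to A_ub @ x <= b_ub.
    A_ub = [[-1, -1, -1],               # -(total mined) <= -planned_output
            [ 1,  1,  1]]               # total mined <= haulage capacity
    b_ub = [-planned_output, haulage_capacity]
    bounds = [(0, r) for r in reserves]

    result = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    print("tonnes per block:", result.x, "total cost:", result.fun)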

  5. Cancer genomics

    DEFF Research Database (Denmark)

    Norrild, Bodil; Guldberg, Per; Ralfkiær, Elisabeth Methner

    2007-01-01

    Almost all cells in the human body contain a complete copy of the genome with an estimated number of 25,000 genes. The sequences of these genes make up about three percent of the genome and comprise the inherited set of genetic information. The genome also contains information that determines whe...

  6. Automating spectral measurements

    Science.gov (United States)

    Goldstein, Fred T.

    2008-09-01

    This paper discusses the architecture of software utilized in spectroscopic measurements. As optical coatings become more sophisticated, there is mounting need to automate data acquisition (DAQ) from spectrophotometers. Such need is exacerbated when 100% inspection is required, ancillary devices are utilized, cost reduction is crucial, or security is vital. While instrument manufacturers normally provide point-and-click DAQ software, an application programming interface (API) may be missing. In such cases automation is impossible or expensive. An API is typically provided in libraries (*.dll, *.ocx) which may be embedded in user-developed applications. Users can thereby implement DAQ automation in several Windows languages. Another possibility, developed by FTG as an alternative to instrument manufacturers' software, is the ActiveX application (*.exe). ActiveX, a component of many Windows applications, provides means for programming and interoperability. This architecture permits a point-and-click program to act as automation client and server. Excel, for example, can control and be controlled by DAQ applications. Most importantly, ActiveX permits ancillary devices such as barcode readers and XY-stages to be easily and economically integrated into scanning procedures. Since an ActiveX application has its own user-interface, it can be independently tested. The ActiveX application then runs (visibly or invisibly) under DAQ software control. Automation capabilities are accessed via a built-in spectro-BASIC language with industry-standard (VBA-compatible) syntax. Supplementing ActiveX, spectro-BASIC also includes auxiliary serial port commands for interfacing programmable logic controllers (PLC). A typical application is automatic filter handling.

  7. Innovative management techniques to deal with mine water issues in the Sydney coal field, Nova Scotia, Canada

    Energy Technology Data Exchange (ETDEWEB)

    Shea, J. [Enterprise Cape Breton Corp., Sydney, NS (Canada)

    2010-07-01

    There are currently 20 mine pools that have flooded to an equilibrium point and are discharging water at the Sydney Coalfield in Nova Scotia (NS). This paper discussed a new mine water management technique that is being used at 3 of these mine pools. An emergency active treatment plant was constructed at one of the mine shafts to prevent uncontrolled discharges. A drilling program was also conducted in the flooded zones of the mine to test the quality of the rising mine water. Pump tests were conducted to allow for the discharge of better quality mine water into a receiving stream without treatment. An automated and remote-controlled pumping system was installed. A passive treatment system consisting of aeration cascades, a 1.2 hectare settling pond and a 1.1 hectare reed bed wetland was constructed. The mine water flow through the pond was designed using a simple piston flow theory that provided a 50 hour retention time for the mine water. Floating pond curtains were also installed. Boreholes were drilled to combine mine waters from other pools into the passive treatment plant. It is expected that mine water issues at the site will be resolved within the next 5 years. 3 refs., 4 figs.
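
    The piston-flow design mentioned above reduces to retention time = pond volume / flow. In the sketch below the 1.2 ha area comes from the abstract, while the depth and flow rate are assumed values chosen only to reproduce the quoted 50 hour retention time.

    # Minimal sketch of the piston-flow sizing check: retention time = volume / flow.
    pond_area_m2 = 12000        # 1.2 ha settling pond (from the abstract)
    average_depth_m = 1.5       # assumed depth
    flow_m3_per_hour = 360      # assumed mine water flow to the passive system

    volume_m3 = pond_area_m2 * average_depth_m
    retention_hours = volume_m3 / flow_m3_per_hour
    print(f"retention time: {retention_hours:.0f} hours")   # -> 50 hours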

  8. Homology and phylogeny and their automated inference

    Science.gov (United States)

    Fuellen, Georg

    2008-06-01

    The analysis of the ever-increasing amount of biological and biomedical data can be pushed forward by comparing the data within and among species. For example, an integrative analysis of data from the genome sequencing projects for various species traces the evolution of the genomes and identifies conserved and innovative parts. Here, I review the foundations and advantages of this “historical” approach and evaluate recent attempts at automating such analyses. Biological data is comparable if a common origin exists (homology), as is the case for members of a gene family originating via duplication of an ancestral gene. If the family has relatives in other species, we can assume that the ancestral gene was present in the ancestral species from which all the other species evolved. In particular, describing the relationships among the duplicated biological sequences found in the various species is often possible by a phylogeny, which is more informative than homology statements. Detecting and elaborating on common origins may answer how certain biological sequences developed, and predict what sequences are in a particular species and what their function is. Such knowledge transfer from sequences in one species to the homologous sequences of the other is based on the principle of ‘my closest relative looks and behaves like I do’, often referred to as ‘guilt by association’. To enable knowledge transfer on a large scale, several automated ‘phylogenomics pipelines’ have been developed in recent years, and seven of these will be described and compared. Overall, the examples in this review demonstrate that homology and phylogeny analyses, done on a large (and automated) scale, can give insights into function in biology and biomedicine.
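
    One widely used building block of such automated pipelines is the reciprocal-best-hit test for orthology. The sketch below applies it to invented hit tables that stand in for parsed BLAST output; it is a generic illustration, not one of the seven pipelines reviewed.

    # Minimal sketch of reciprocal best hits (RBH) between two species as a
    # simple orthology signal for knowledge transfer by 'guilt by association'.
    def best_hits(hits):
        """Map each query to its highest-scoring subject. hits: (query, subject, score)."""
        best = {}
        for query, subject, score in hits:
            if query not in best or score > best[query][1]:
                best[query] = (subject, score)
        return {q: s for q, (s, _) in best.items()}

    def reciprocal_best_hits(a_vs_b, b_vs_a):
        """Return pairs (gene_a, gene_b) that are each other's best hit."""
        ab, ba = best_hits(a_vs_b), best_hits(b_vs_a)
        return [(a, b) for a, b in ab.items() if ba.get(b) == a]

    a_vs_b = [("geneA1", "geneB7", 310.0), ("geneA1", "geneB2", 90.0), ("geneA3", "geneB2", 250.0)]
    b_vs_a = [("geneB7", "geneA1", 305.0), ("geneB2", "geneA9", 120.0)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))   # -> [('geneA1', 'geneB7')]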

  9. Tellurium Mobility Through Mine Environments

    Science.gov (United States)

    Dorsk, M.

    2015-12-01

    Tellurium is a rare metalloid that has received minimal research regarding environmental mobility. Observations of tellurium mobility are mainly based on observations of related metalloids such as selenium and beryllium, yet little research has been done on the specific behavior of tellurium. This laboratory work established the environmental controls that influence tellurium mobility and chemical speciation in aqueous-driven systems. Theoretical simulations show possible mobility of Te as Te(OH)3[+] under highly oxidizing and acidic conditions. Movement as TeO3[2-] under more basic conditions may also be possible at elevated Eh. Mobility in reducing environments is theoretically less likely. For a practical approach to investigating mobility conditions for Te, a site with known tellurium content was chosen in Colorado. Composite samples were selected from the top, center, and bottom of a tailings pile for elution experiments. These samples were disintegrated using a rock crusher and pulverized with an automated mortar and pestle. The material was then classified to 70 microns. A 10 g sample split was digested in concentrated HNO3 and HF and analyzed by atomic absorption spectroscopy to determine initial Te concentrations. Additional 10 g splits from each location were subjected to elution in 100 mL of each of the following solutions: nitric acid to a pH of 1.0, sulfuric acid to a pH of 2.0, sodium hydroxide to a pH of 12, ammonium hydroxide to a pH of 10, a pine needle/soil tea from material within the vicinity of the collection site to a pH of 3.5, and lastly distilled water, with a pH of 7, to serve as a control. Sulfuric acid was purposefully chosen to simulate acid mine drainage from the decomposition of pyrite within the mine tailings. Sample subsets were also inundated with 10 mL of a 3% hydrogen peroxide solution to induce oxidizing conditions. All collected eluates were then analyzed by atomic absorption spectroscopy (AAS) to measure tellurium concentrations in

  10. Radioecological challenges for mining

    Energy Technology Data Exchange (ETDEWEB)

    Vesterbacka, P.; Ikaeheimonen, T.K.; Solatie, D. [Radiation and Nuclear Safety Authority (Finland)

    2014-07-01

    In Finland, mining became popular in the mid-1990s when amendments to the mining law made mining activities easier for foreign companies. The price of minerals also rose, and mining in Finland became economically profitable. The expanding mining industry brought new challenges for radiation safety, since radioactive substances occur in nearly all minerals. In Finnish soil and bedrock the average crustal abundances of uranium and thorium are 2.8 ppm and 10 ppm, respectively. It cannot be predicted beforehand how radionuclides will behave in mining processes, which is why they need to be taken into account in mining activities. The Radiation and Nuclear Safety Authority (STUK) has issued a national guide, ST 12.1, based on the Finnish Radiation Act. The guide sets the limits for radiation doses to the public, including those from mining activities. In general, no measures to limit radiation exposure are needed if the dose from an operation liable to cause exposure to natural radiation is no greater than 0.1 mSv per year above the natural background radiation dose. If the exposure of the public may be higher than 0.1 mSv per year, the responsible party must provide STUK with a plan describing the measures by which the radiation exposure is to be kept as low as is reasonably achievable. In that case the responsible mining company has to carry out a radiological baseline study. The baseline study must focus on the environment that the mining activities may impact. The study describes the occurrence of natural radioactivity in the environment before any mining activities are started. The baseline study usually lasts two to three years under natural circumstances. Based on the baseline study measurements, detailed information on the existing levels of radioactivity in the environment can be obtained. Once the mining activities begin, it is important that limits are set for wastewater discharges to the environment and environmental surveillance in the vicinity of

  11. An ISU study of asteroid mining

    Science.gov (United States)

    Burke, J. D.

    1991-01-01

    During the 1990 summer session of the International Space University, 59 graduate students from 16 countries carried out a design project on using the resources of near-earth asteroids. The results of the project, whose full report is now available from ISU, are summarized. The student team included people in these fields: architecture, business and management, engineering, life sciences, physical sciences, policy and law, resources and manufacturing, and satellite applications. They designed a project for transporting equipment and personnel to a near-earth asteroid, setting up a mining base there, and hauling products back for use in cislunar space. In addition, they outlined the needed precursor steps, beginning with expansion of present ground-based programs for finding and characterizing near-earth asteroids and continuing with automated flight missions to candidate bodies. (To limit the summer project's scope the actual design of these flight-mission precursors was excluded.) The main conclusions were that asteroid mining may provide an important complement to the future use of lunar resources, with the potential to provide large amounts of water and carbonaceous materials for use off earth. However, the recovery of such materials from presently known asteroids did not show an economic gain under the study assumptions; therefore, asteroid mining cannot yet be considered a prospective business.

  12. Spatiotemporal Data Mining: A Computational Perspective

    Directory of Open Access Journals (Sweden)

    Shashi Shekhar

    2015-10-01

    Full Text Available Explosive growth in geospatial and temporal data as well as the emergence of new technologies emphasize the need for automated discovery of spatiotemporal knowledge. Spatiotemporal data mining studies the process of discovering interesting and previously unknown, but potentially useful, patterns from large spatiotemporal databases. It has broad application domains including ecology and environmental management, public safety, transportation, earth science, epidemiology, and climatology. The complexity of spatiotemporal data and their intrinsic relationships limits the usefulness of conventional data science techniques for extracting spatiotemporal patterns. In this survey, we review recent computational techniques and tools in spatiotemporal data mining, focusing on several major pattern families: spatiotemporal outliers, spatiotemporal coupling and tele-coupling, spatiotemporal prediction, spatiotemporal partitioning and summarization, spatiotemporal hotspots, and change detection. Compared with other surveys in the literature, this paper emphasizes the statistical foundations of spatiotemporal data mining and provides comprehensive coverage of computational approaches for various pattern families. We also list popular software tools for spatiotemporal data analysis. The survey concludes with a look at future research needs.

  13. Preprocessing Techniques for Image Mining on Biopsy Images

    Directory of Open Access Journals (Sweden)

    Ms. Nikita Ramrakhiani

    2015-08-01

    Full Text Available Biomedical imaging has been undergoing rapid technological advancements over the last several decades and has seen the development of many new applications. A single image can give all the details about an organ, from the cellular level to the whole-organ level. Biomedical imaging is becoming increasingly important as an approach to synthesize, extract, and translate useful information from the large multidimensional databases accumulated in research frontiers such as functional genomics, proteomics, and functional imaging. Image mining can be used to support this approach: it bridges the gap by extracting and translating semantically meaningful information from biomedical images and applying it to test for and detect anomalies in the target organ. The essential components of image mining are identifying similar objects in different images and finding correlations between them. Integration of image mining and the biomedical field can result in many real-world applications.

  14. The Challenge of Wireless Connectivity to Support Intelligent Mines

    DEFF Research Database (Denmark)

    Barbosa, Viviane S. B.; Garcia, Luis G. U.; Portela Lopes de Almeida, Erika;

    2016-01-01

    ... for unmanned mine operations. Although voice and narrowband data radios have been used for years to support several types of mining activities, such as fleet management (dispatch) and telemetry, the use of automated equipment introduces a new set of connectivity requirements and poses a set of challenges in terms of network planning, management and optimization. For example, the data rates required to support unmanned equipment, e.g. a teleoperated bulldozer, shift from a few kilobits/second to megabits/second due to live video feeds. This traffic volume is well beyond the capabilities of Professional Mobile Radio narrowband systems and mandates the deployment of broadband systems. Furthermore, the (data) traffic requirements of a mine also vary in time as the fleet expands. Additionally, wireless networks are planned according to the characteristics of the scenario in which they will be deployed...

  15. Automated Wildfire Detection Through Artificial Neural Networks

    Science.gov (United States)

    Miller, Jerry; Borne, Kirk; Thomas, Brian; Huang, Zhenping; Chi, Yuechen

    2005-01-01

    We have tested and deployed Artificial Neural Network (ANN) data mining techniques to analyze remotely sensed multi-channel imaging data from MODIS, GOES, and AVHRR. The goal is to train the ANN to learn the signatures of wildfires in remotely sensed data in order to automate the detection process. We train the ANN using the set of human-detected wildfires in the U.S., which are provided by the Hazard Mapping System (HMS) wildfire detection group at NOAA/NESDIS. The ANN is trained to mimic the behavior of fire detection algorithms and the subjective decision-making by NOAA HMS Fire Analysts. We use a local extremum search in order to isolate fire pixels, and then we extract a 7x7 pixel array around that location in 3 spectral channels. The corresponding 147 pixel values are used to populate a 147-dimensional input vector that is fed into the ANN. The ANN accuracy is tested and overfitting is avoided by using a subset of the training data that is set aside as a test data set. We have achieved an automated fire detection accuracy of 80-92%, depending on a variety of ANN parameters and for different instrument channels among the 3 satellites. We believe that this system can be deployed worldwide or for any region to detect wildfires automatically in satellite imagery of those regions. These detections can ultimately be used to provide thermal inputs to climate models.
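    The feature construction described above (a 7x7 window in 3 channels flattened into a 147-dimensional vector) can be sketched as follows. The sketch trains a small scikit-learn network on synthetic data purely to show the data flow; it does not reproduce the ANN architecture, channels, or labels used in the study.

```python
# Sketch of building 147-dimensional window features and training a small
# neural network on synthetic data. Labels and imagery here are random stand-ins.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)


def extract_window(image, row, col, size=7):
    """image: (channels, H, W) array; return the flattened size x size window."""
    half = size // 2
    window = image[:, row - half:row + half + 1, col - half:col + half + 1]
    return window.reshape(-1)  # 3 channels * 7 * 7 = 147 values


# Synthetic 3-channel scene and synthetic candidate-pixel locations.
image = rng.normal(size=(3, 200, 200))
rows = rng.integers(3, 197, size=400)
cols = rng.integers(3, 197, size=400)
X = np.stack([extract_window(image, r, c) for r, c in zip(rows, cols)])
y = rng.integers(0, 2, size=400)  # placeholder labels; real labels come from HMS analysts

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=0)
clf.fit(X_train, y_train)
print(f"Held-out accuracy on synthetic data: {clf.score(X_test, y_test):.2f}")
```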

  16. Implementation of Paste Backfill Mining Technology in Chinese Coal Mines

    OpenAIRE

    Qingliang Chang; Jianhang Chen; Huaqiang Zhou; Jianbiao Bai

    2014-01-01

    Implementation of clean mining technology at coal mines is crucial to protect the environment and maintain balance among energy resources, consumption, and ecology. After reviewing present coal clean mining technology, we introduce the technology principles and technological process of paste backfill mining in coal mines and discuss the components and features of backfill materials, the constitution of the backfill system, and the backfill process. Specific implementation of this technology a...

  17. A jewel in the desert: BHP Billiton's San Juan underground mine

    Energy Technology Data Exchange (ETDEWEB)

    Buchsbaum, L.

    2007-12-15

    The Navajo Nation is America's largest Native American tribe by population and acreage, and is blessed with large tracts of good coal deposits. BHP Billiton's New Mexico Coal Co. is the largest operation in the Navajo regeneration area. The holdings comprise the San Juan underground mine, the La Plata surface mine, now in reclamation, and the expanding Navajo surface mine. The article recounts the recent history of the mines. It stresses the emphasis on sensitivity to and helping to sustain tribal culture, and also on safety. San Juan's longwall system is unique in the nation, having started up as an automated system from the outset. Problems caused by hydrogen sulfide are being tackled. San Juan has a bleederless ventilation system to minimise the risk of spontaneous combustion of methane, and the atmospheric conditions in the mine are heavily monitored, especially within the gob areas. 3 photos.

  18. ATLAS Distributed Computing Automation

    CERN Document Server

    Schovancova, J; The ATLAS collaboration; Borrego, C; Campana, S; Di Girolamo, A; Elmsheuser, J; Hejbal, J; Kouba, T; Legger, F; Magradze, E; Medrano Llamas, R; Negri, G; Rinaldi, L; Sciacca, G; Serfon, C; Van Der Ster, D C

    2012-01-01

    The ATLAS Experiment benefits from computing resources distributed worldwide at more than 100 WLCG sites. The ATLAS Grid sites provide over 100k CPU job slots and over 100 PB of storage space on disk or tape. Monitoring the status of such a complex infrastructure is essential. The ATLAS Grid infrastructure is monitored 24/7 by two teams of shifters distributed world-wide, by the ATLAS Distributed Computing experts, and by site administrators. In this paper we summarize automation efforts performed within the ATLAS Distributed Computing team in order to reduce manpower costs and improve the reliability of the system. Different aspects of the automation process are described: from the ATLAS Grid site topology provided by the ATLAS Grid Information System, via automatic site testing by HammerCloud, to automatic exclusion from production or analysis activities.

  19. Rapid automated nuclear chemistry

    Energy Technology Data Exchange (ETDEWEB)

    Meyer, R.A.

    1979-05-31

    Rapid Automated Nuclear Chemistry (RANC) can be thought of as the Z-separation of Neutron-rich Isotopes by Automated Methods. The range of RANC studies of fission and its products is large. In a sense, the studies can be categorized into various energy ranges from the highest where the fission process and particle emission are considered, to low energies where nuclear dynamics are being explored. This paper presents a table which gives examples of current research using RANC on fission and fission products. The remainder of this text is divided into three parts. The first contains a discussion of the chemical methods available for the fission product elements, the second describes the major techniques, and in the last section, examples of recent results are discussed as illustrations of the use of RANC.

  20. Automatic annotation of organellar genomes with DOGMA

    Energy Technology Data Exchange (ETDEWEB)

    Wyman, Stacia; Jansen, Robert K.; Boore, Jeffrey L.

    2004-06-01

    Dual Organellar GenoMe Annotator (DOGMA) automates the annotation of extra-nuclear organellar (chloroplast and animal mitochondrial) genomes. It is a web-based package that uses comparative BLAST searches to identify and annotate genes in a genome. DOGMA presents a list of putative genes to the user in a graphical format for viewing and editing. Annotations are stored on our password-protected server. Complete annotations can be extracted for direct submission to GenBank. Furthermore, intergenic regions of specified length can be extracted, as well as the nucleotide sequences and amino acid sequences of the genes.
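    One of the export features mentioned above, extraction of intergenic regions of at least a specified length, can be sketched from a list of gene coordinates. The coordinates and sequence below are placeholders; DOGMA's own implementation is not reproduced here.

```python
# Minimal sketch: extract intergenic regions of at least `min_len` bases from a
# genome sequence given sorted gene coordinates (0-based, half-open).
def intergenic_regions(genome_seq, genes, min_len=100):
    """genes: list of (start, end) coordinates; returns (start, end, sequence) tuples."""
    regions = []
    prev_end = 0
    for start, end in sorted(genes):
        if start - prev_end >= min_len:
            regions.append((prev_end, start, genome_seq[prev_end:start]))
        prev_end = max(prev_end, end)
    # Trailing region after the last gene.
    if len(genome_seq) - prev_end >= min_len:
        regions.append((prev_end, len(genome_seq), genome_seq[prev_end:]))
    return regions


# Placeholder genome and gene coordinates purely for demonstration.
genome = "A" * 1000
genes = [(50, 200), (450, 600)]
for start, end, seq in intergenic_regions(genome, genes, min_len=150):
    print(f"intergenic {start}-{end} ({len(seq)} bp)")
```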

  1. The Automated Medical Office

    OpenAIRE

    1990-01-01

    With shock and surprise many physicians learned in the 1980s that they must change the way they do business. Competition for patients, increasing government regulation, and the rapidly escalating risk of litigation forces physicians to seek modern remedies in office management. The author describes a medical clinic that strives to be paperless using electronic innovation to solve the problems of medical practice management. A computer software program to automate information management in a c...

  2. Automation of printing machine

    OpenAIRE

    Sušil, David

    2016-01-01

    This bachelor thesis is focused on the automation of a printing machine and a comparison of two types of printing machines. The first chapter deals with the history of printing, typesetting, printing techniques and various kinds of bookbinding. The second chapter describes the difference between sheet-fed printing machines and offset printing machines, the difference between two representatives of rotary machines, the technological process of the products on these machines, the description of the mac...

  3. Automated Cooperative Trajectories

    Science.gov (United States)

    Hanson, Curt; Pahle, Joseph; Brown, Nelson

    2015-01-01

    This presentation is an overview of the Automated Cooperative Trajectories project. An introduction to the phenomena of wake vortices is given, along with a summary of past research into the possibility of extracting energy from the wake by flying close parallel trajectories. Challenges and barriers to adoption of civilian automatic wake surfing technology are identified. A hardware-in-the-loop simulation is described that will support future research. Finally, a roadmap for future research and technology transition is proposed.

  4. Automation in biological crystallization.

    Science.gov (United States)

    Stewart, Patrick Shaw; Mueller-Dieckmann, Jochen

    2014-06-01

    Crystallization remains the bottleneck in the crystallographic process leading from a gene to a three-dimensional model of the encoded protein or RNA. Automation of the individual steps of a crystallization experiment, from the preparation of crystallization cocktails for initial or optimization screens to the imaging of the experiments, has been the response to address this issue. Today, large high-throughput crystallization facilities, many of them open to the general user community, are capable of setting up thousands of crystallization trials per day. It is thus possible to test multiple constructs of each target for their ability to form crystals on a production-line basis. This has improved success rates and made crystallization much more convenient. High-throughput crystallization, however, cannot relieve users of the task of producing samples of high quality. Moreover, the time gained from eliminating manual preparations must now be invested in the careful evaluation of the increased number of experiments. The latter requires a sophisticated data and laboratory information-management system. A review of the current state of automation at the individual steps of crystallization with specific attention to the automation of optimization is given.

  5. Automation in biological crystallization

    Science.gov (United States)

    Shaw Stewart, Patrick; Mueller-Dieckmann, Jochen

    2014-01-01

    Crystallization remains the bottleneck in the crystallographic process leading from a gene to a three-dimensional model of the encoded protein or RNA. Automation of the individual steps of a crystallization experiment, from the preparation of crystallization cocktails for initial or optimization screens to the imaging of the experiments, has been the response to address this issue. Today, large high-throughput crystallization facilities, many of them open to the general user community, are capable of setting up thousands of crystallization trials per day. It is thus possible to test multiple constructs of each target for their ability to form crystals on a production-line basis. This has improved success rates and made crystallization much more convenient. High-throughput crystallization, however, cannot relieve users of the task of producing samples of high quality. Moreover, the time gained from eliminating manual preparations must now be invested in the careful evaluation of the increased number of experiments. The latter requires a sophisticated data and laboratory information-management system. A review of the current state of automation at the individual steps of crystallization with specific attention to the automation of optimization is given. PMID:24915074

  6. MTGD: The Medicago truncatula genome database.

    Science.gov (United States)

    Krishnakumar, Vivek; Kim, Maria; Rosen, Benjamin D; Karamycheva, Svetlana; Bidwell, Shelby L; Tang, Haibao; Town, Christopher D

    2015-01-01

    Medicago truncatula, a close relative of alfalfa (Medicago sativa), is a model legume used for studying symbiotic nitrogen fixation, mycorrhizal interactions and legume genomics. J. Craig Venter Institute (JCVI; formerly TIGR) has been involved in M. truncatula genome sequencing and annotation since 2002 and has maintained a web-based resource providing data to the community for this entire period. The website (http://www.MedicagoGenome.org) has seen major updates in the past year, where it currently hosts the latest version of the genome (Mt4.0), associated data and legacy project information, presented to users via a rich set of open-source tools. A JBrowse-based genome browser interface exposes tracks for visualization. Mutant gene symbols originally assembled and curated by the Frugoli lab are now hosted at JCVI and tie into our community annotation interface, Medicago EuCAP (to be integrated soon with our implementation of WebApollo). Literature pertinent to M. truncatula is indexed and made searchable via the Textpresso search engine. The site also implements MedicMine, an instance of InterMine that offers interconnectivity with other plant 'mines' such as ThaleMine and PhytoMine, and other model organism databases (MODs). In addition to these new features, we continue to provide keyword- and locus identifier-based searches served via a Chado-backed Tripal Instance, a BLAST search interface and bulk downloads of data sets from the iPlant Data Store (iDS). Finally, we maintain an E-mail helpdesk, facilitated by a JIRA issue tracking system, where we receive and respond to questions about the website and requests for specific data sets from the community.

  7. Automated expert modeling for automated student evaluation.

    Energy Technology Data Exchange (ETDEWEB)

    Abbott, Robert G.

    2006-01-01

    The 8th International Conference on Intelligent Tutoring Systems provides a leading international forum for the dissemination of original results in the design, implementation, and evaluation of intelligent tutoring systems and related areas. The conference draws researchers from a broad spectrum of disciplines ranging from artificial intelligence and cognitive science to pedagogy and educational psychology. The conference explores intelligent tutoring systems' increasing real-world impact on an increasingly global scale. Improved authoring tools and learning-object standards enable fielding systems and curricula in real-world settings on an unprecedented scale. Researchers deploy ITSs in ever larger studies and increasingly use data from real students, tasks, and settings to guide new research. With high volumes of student interaction data, data mining, and machine learning, tutoring systems can learn from experience and improve their teaching performance. The increasing number of realistic evaluation studies also broadens researchers' knowledge about the educational contexts for which ITSs are best suited. At the same time, researchers explore how to expand and improve ITS/student communications, for example, how to achieve more flexible and responsive discourse with students, help students integrate Web resources into learning, use mobile technologies and games to enhance student motivation and learning, and address multicultural perspectives.

  8. Genotype-Specific Genomic Markers Associated with Primary Hepatomas, Based on Complete Genomic Sequencing of Hepatitis B Virus▿

    OpenAIRE

    Sung, Joseph J. Y.; Tsui, Stephen K. W.; Tse, Chi-Hang; Ng, Eddie Y. T.; Leung, Kwong-Sak; Lee, Kin-Hong; Mok, Tony S. K.; Bartholomeusz, Angeline; Au, Thomas C. C.; Tsoi, Kelvin K. F.; Locarnini, Stephen; Chan, Henry L. Y.

    2008-01-01

    We aimed to identify genomic markers in hepatitis B virus (HBV) that are associated with hepatocellular carcinoma (HCC) development by comparing the complete genomic sequences of HBVs among patients with HCC and those without. One hundred patients with HBV-related HCC and 100 age-matched HBV-infected non-HCC patients (controls) were studied. HBV DNA from serum was directly sequenced to study the whole viral genome. Data mining and rule learning were employed to develop diagnostic algorithms. ...

  9. Information and diagnostic tools of objective control as means to improve performance of mining machines

    Science.gov (United States)

    Zvonarev, I. E.; Shishlyannikov, D. I.

    2017-02-01

    The paper justifies the relevance of developing and implementing automated onboard systems for operation data and maintenance recording in heading-and-winning machines. The advantages and disadvantages of existing automated onboard systems for operation data and maintenance recording in heading-and-winning machines for potassium mines are analyzed. The basic technical requirements for the design, operating algorithms and functions of recording systems of mining machines for potassium mines are formulated. A method of controlling operating parameters is presented, and the concept of the onboard automated recording system for the Ural heading-and-winning machine is outlined. The results of experimental studies of variations in the loading of the Ural-20R miner's operating member drives, using the VATUR portable measuring complex, are given. It is shown that existing means of objective control of the operating parameters of the Ural-20R heading-and-winning machine do not ensure its optimal operation. The authors present a technique for analyzing the data provided by parameter recorders that allows the efficiency of mechanical complexes to be increased by determining numerical values characterizing the technical and technological level of potassium ore production organization. Efficiency assessment criteria for the engineering and maintenance departments of mining enterprises are advanced. A technology for continuous automated monitoring of a potassium mine's outburst hazard is described.

  10. Contaminant analysis automation demonstration proposal

    Energy Technology Data Exchange (ETDEWEB)

    Dodson, M.G.; Schur, A.; Heubach, J.G.

    1993-10-01

    The nation-wide and global need for environmental restoration and waste remediation (ER&WR) presents significant challenges to the analytical chemistry laboratory. The expansion of ER&WR programs forces an increase in the volume of samples processed and the demand for analysis data. To handle this expanding volume, productivity must be increased. However, the need for significantly increased productivity runs up against a contaminant analysis process that is costly in time, labor, equipment, and safety protection. Laboratory automation offers a cost-effective approach to meeting current and future contaminant analytical laboratory needs. The proposed demonstration will present a proof-of-concept automated laboratory conducting varied sample preparations. This automated process also highlights a graphical user interface that provides supervisory control and monitoring of the automated process. The demonstration provides affirming answers to the following questions about laboratory automation: Can preparation of contaminants be successfully automated? Can a full-scale working proof-of-concept automated laboratory be developed that is capable of preparing contaminant and hazardous chemical samples? Can the automated processes be seamlessly integrated and controlled? Can the automated laboratory be customized through readily convertible design? And can automated sample preparation concepts be extended to the other phases of the sample analysis process? To fully reap the benefits of automation, four human factors areas should be studied and the outputs used to increase the efficiency of laboratory automation. These areas include: (1) laboratory configuration, (2) procedures, (3) receptacles and fixtures, and (4) the human-computer interface for the fully automated system and complex laboratory information management systems.

  11. VRLane: a desktop virtual safety management program for underground coal mine

    Science.gov (United States)

    Li, Mei; Chen, Jingzhu; Xiong, Wei; Zhang, Pengpeng; Wu, Daozheng

    2008-10-01

    VR technologies, which generate immersive, interactive, three-dimensional (3D) environments, are seldom applied to coal mine safety management. In this paper, a new method that combines VR technologies with an underground mine safety management system was explored, and a desktop virtual safety management program for underground coal mines, called VRLane, was developed. The paper mainly concerns current research advances in VR, the system design, key techniques, and system application. Two important techniques are introduced in the paper. Firstly, an algorithm was designed and implemented with which the 3D laneway models and equipment models can be built automatically on the basis of the latest 2D mine drawings, whereas common VR programs establish the 3D environment using 3DS Max or other 3D modeling software packages, with which laneway models are built manually and laboriously. Secondly, VRLane realized system integration with underground industrial automation. VRLane not only describes a realistic 3D laneway environment, but also describes the status of coal mining, with functions for displaying the running states and related parameters of equipment, pre-alarming abnormal mining events, and animating mine cars, mine workers, or longwall shearers. The system, which is cheap, dynamic, and easy to maintain, provides a useful tool for safety production management in coal mines.

  12. Ensemble Data Mining Methods

    Data.gov (United States)

    National Aeronautics and Space Administration — Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods that leverage the power of multiple models to achieve...

  13. Data mining in agriculture

    CERN Document Server

    Mucherino, Antonio; Pardalos, Panos M

    2009-01-01

    Data Mining in Agriculture represents a comprehensive effort to provide graduate students and researchers with an analytical text on data mining techniques applied to agriculture and environmental related fields. This book presents both theoretical and practical insights with a focus on presenting the context of each data mining technique rather intuitively with ample concrete examples represented graphically and with algorithms written in MATLAB®. Examples and exercises with solutions are provided at the end of each chapter to facilitate the comprehension of the material. For each data mining technique described in the book variants and improvements of the basic algorithm are also given. Also by P.J. Papajorgji and P.M. Pardalos: Advances in Modeling Agricultural Systems, 'Springer Optimization and its Applications' vol. 25, ©2009.

  14. Acid mine drainage

    Science.gov (United States)

    Bigham, Jerry M.; Cravotta, Charles A.

    2016-01-01

    Acid mine drainage (AMD) consists of metal-laden solutions produced by the oxidative dissolution of iron sulfide minerals exposed to air, moisture, and acidophilic microbes during the mining of coal and metal deposits. The pH of AMD is usually in the range of 2–6, but mine-impacted waters at circumneutral pH (5–8) are also common. Mine drainage usually contains elevated concentrations of sulfate, iron, aluminum, and other potentially toxic metals leached from rock that hydrolyze and coprecipitate to form rust-colored encrustations or sediments. When AMD is discharged into surface waters or groundwaters, degradation of water quality, injury to aquatic life, and corrosion or encrustation of engineered structures can occur for substantial distances. Prevention and remediation strategies should consider the biogeochemical complexity of the system, the longevity of AMD pollution, the predictive power of geochemical modeling, and the full range of available field technologies for problem mitigation.

  15. International mining forum 2004, new technologies in underground mining, safety in mines proceedings

    Energy Technology Data Exchange (ETDEWEB)

    Jerzy Kicki; Eugeniusz Sobczyk (eds.)

    2004-01-15

    The book comprises technical papers that were presented at the International Mining Forum 2004. This event aims to bring together scientists and engineers in mining, rock mechanics, and computer engineering, with a view to explore and discuss international developments in the field. Topics discussed in this book are: trends in the mining industry; new solutions and tendencies in underground mines; rock engineering problems in underground mines; utilization and exploitation of methane; prevention measures for the control of rock bursts in Polish mines; and current problems in Ukrainian coal mines.

  16. Applied data mining

    CERN Document Server

    Xu, Guandong

    2013-01-01

    Data mining has witnessed substantial advances in recent decades. New research questions and practical challenges have arisen from emerging areas and applications within the various fields closely related to human daily life, e.g. social media and social networking. This book aims to bridge the gap between traditional data mining and the latest advances in newly emerging information services. It explores the extension of well-studied algorithms and approaches into these new research arenas.

  17. Web Mining: An Overview

    Directory of Open Access Journals (Sweden)

    P. V. G. S. Mudiraj, B. Jabber, K. David Raju

    2011-12-01

    Full Text Available Web usage mining is a main research area in Web mining focused on learning about Web users and their interactions with Web sites. The motive of mining is to find users' access models automatically and quickly from the vast Web log data, such as frequent access paths, frequent access page groups and user clustering. Through web usage mining, the server log, registration information and other relative information left by users provide a foundation for decision making in organizations. This article provides a survey and analysis of current Web usage mining systems and technologies. There are generally three tasks in Web usage mining: preprocessing, pattern analysis and knowledge discovery. Preprocessing cleans the server log file by removing entries such as errors or failures and repeated requests for the same URL from the same host. The main task of pattern analysis is to filter uninteresting information and to visualize and interpret the interesting patterns for users. The statistics collected from the log file can help to discover knowledge. This knowledge can be used to make decisions, for example classifying users and web pages as excellent, medium, or weak based on the hit counts of the web pages in the web site. The design of the website is then restructured based on user behavior or hit counts, which provides quick responses to web users, saves memory space on servers, and thus reduces HTTP requests and bandwidth utilization. This paper addresses challenges in the three phases of Web usage mining along with Web structure mining. This paper also discusses an application of WUM, an online recommender system that dynamically generates links to pages that have not yet been visited by a user and might be of potential interest. Differently from the recommender systems proposed so far, ONLINE MINER does not make use of any off-line component and is able to manage Web sites made up of dynamically generated pages.
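    The preprocessing and hit-count classification steps described above can be sketched as follows; the log fields, thresholds, and labels are illustrative assumptions rather than the authors' actual rules.

```python
# Sketch of log preprocessing (drop failed requests and immediate repeats from
# the same host) followed by a simple hit-count ranking of pages.
from collections import Counter


def preprocess(entries):
    """entries: list of dicts with 'host', 'url', 'status'."""
    cleaned, last_seen = [], {}
    for e in entries:
        if e["status"] >= 400:                    # remove error/failure entries
            continue
        if last_seen.get(e["host"]) == e["url"]:  # repeated request from the same host
            continue
        last_seen[e["host"]] = e["url"]
        cleaned.append(e)
    return cleaned


def page_ranking(entries):
    hits = Counter(e["url"] for e in entries)
    # Three-way split of pages by hit count; the thresholds are arbitrary examples.
    return {url: ("excellent" if n >= 100 else "medium" if n >= 10 else "weak")
            for url, n in hits.items()}


log = [{"host": "10.0.0.1", "url": "/index", "status": 200},
       {"host": "10.0.0.1", "url": "/index", "status": 200},
       {"host": "10.0.0.2", "url": "/missing", "status": 404}]
print(page_ranking(preprocess(log)))
```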

  18. Asteroid Mining and Prospecting

    OpenAIRE

    Esty, Thomas

    2013-01-01

    There has been a recent increase in interest in the idea of mining asteroids, as seen from the founding of multiple companies who seek to make this science fiction idea science fact. We analyzed a number of prior papers on asteroids to make an estimate as to whether mining asteroids is within the realm of possibility. Existing information on asteroid number, composition, and orbit from past research was synthesized with a new analysis using binomial statistics of the number of probes that wou...

  19. MINING INDUSTRY IN CROATIA

    Directory of Open Access Journals (Sweden)

    Slavko Vujec

    1996-12-01

    Full Text Available The trends of the world and European mining industry are presented in a short introductory review. The mining industry is very important to the economy of Croatia, because it covers most of the needed petroleum and natural gas, all construction raw materials, and industrial non-metallic raw minerals. A detailed quantitative presentation of mineral raw material production is compared with the pre-war situation. The value of annual production is given for each raw mineral (the paper is published in Croatian).

  20. Investigation and characterization of mining subsidence in Kaiyang Phosphorus Mine

    Institute of Scientific and Technical Information of China (English)

    DENG Jian; BIAN Li

    2007-01-01

    In Kaiyang Phosphorus Mine, serious environmental and safety problems have been caused by large-scale mining activities over the past 40 years. These problems include mining subsidence, low recovery ratio, too much dead ore left in pillars, and pollution from phosphorus gypsum. Mining subsidence falls into four categories: curved ground and mesa, ground cracks and collapse holes, spalling and eboulement, and slope slide and creep. Measures to treat the mining subsidence were put forward: finding out and managing abandoned stopes, optimizing the mining method (cut-and-fill mining), selecting proper backfilling materials (phosphogypsum mixtures), avoiding disorderly mining operations, and treating highway slopes. These investigations and engineering treatment methods are believed to contribute to the safe extraction of ore and sustainable development in Kaiyang Phosphorus Mine.

  1. Automated Eukaryotic Gene Structure Annotation Using EVidenceModeler and the Program to Assemble Spliced Alignments

    Energy Technology Data Exchange (ETDEWEB)

    Haas, B J; Salzberg, S L; Zhu, W; Pertea, M; Allen, J E; Orvis, J; White, O; Buell, C R; Wortman, J R

    2007-12-10

    EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.
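    The weighted-consensus idea can be sketched by scoring each candidate gene structure by the summed weights of the evidence tracks that support its exons. The weights, tracks, and scoring below are invented for illustration and do not reproduce the actual EVM algorithm.

```python
# Toy weighted-consensus scoring: each evidence track has a weight, and a
# candidate gene structure is scored by how many of its exons each track supports.
EVIDENCE_WEIGHTS = {"ab_initio": 1.0, "protein_alignment": 5.0, "transcript_alignment": 10.0}


def consensus_score(candidate_exons, evidence):
    """candidate_exons: set of (start, end); evidence: {track: set of supported exons}."""
    score = 0.0
    for track, exons in evidence.items():
        supported = len(candidate_exons & exons)
        score += EVIDENCE_WEIGHTS.get(track, 0.0) * supported
    return score


candidate = {(100, 250), (400, 520)}
evidence = {"ab_initio": {(100, 250)},
            "transcript_alignment": {(100, 250), (400, 520)}}
print(f"Consensus score: {consensus_score(candidate, evidence):.1f}")
```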

  2. Introduction of fibre-optic technology in an opencast lignite mine; Einfuehrung von LWL-Technik in einem Braunkohlentagebau

    Energy Technology Data Exchange (ETDEWEB)

    Paus, K.H. [RWE Power AG, Ressort Braunkohlenbergbau/Veredlung, Technikzentrum Tagebaue/HW, Elektrotechnik und technische Vergabe PDV, Kommunikationsanlagen (PBZ-KK), Frechen (Germany); Hehlert, H.A. [RWE Power AG, Ressort Braunkohlenbergbau/Veredlung, Technikzentrum Tagebaue/HW, Elektrotechnik und technische Vergabe Foerderanlagen (PBZ-KP), Frechen (Germany); Andres, M. [RWE Power AG, Ressort Braunkohlenbergbau/Veredlung, Tagebau Hambach, Infrastruktur Prozessdatenverarbeitung (PBH-IP), Niederzier (Germany)

    2006-06-15

    The introduction of fibre-optic technology for the communications infrastructure in our opencast mines entailed the envisaged improvements in automation and operations management. A motivated project team prepared to face new technologies, adopt and use them in day-to-day operations and optimize them has helped sensitive fibre-optic technology to stand the 'acid test' of opencast mine operations. The concepts and operating equipment developed within the scope of the pilot project in the Hambach mine have meanwhile become a standard applied in all RWE Power mines. The whole Garzweiler II mine, for example, has been erected on the basis of these standards. And the upcoming new installation of the belt conveyors in the Inden II mine will be executed in line with these standards as well. (orig.)

  3. AUTOMATED DETECTION OF STRUCTURAL ALERTS (CHEMICAL FRAGMENTS) IN (ECO)TOXICOLOGY

    Directory of Open Access Journals (Sweden)

    Alban Lepailleur

    2013-02-01

    Full Text Available This mini-review describes the evolution of different algorithms dedicated to the automated discovery of chemical fragments associated with (eco)toxicological endpoints. These structural alerts correspond to one of the most interesting approaches of in silico toxicology, due to their direct link with specific toxicological mechanisms. A number of expert systems are already available but, since the first work in this field, which considered a binomial distribution of chemical fragments between two datasets, new data miners have been developed and applied with success in chemoinformatics. The frequency of a chemical fragment in a dataset is often at the core of the process for defining its toxicological relevance. However, recent progress in data mining provides new insights into the automated discovery of new rules. In particular, this review highlights the notion of Emerging Patterns, which can capture contrasts between classes of data.
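    The frequency-based reasoning mentioned above, testing whether a fragment is over-represented in the toxic set relative to its overall frequency, can be sketched with a one-sided binomial test (SciPy >= 1.7). The counts below are made up for illustration.

```python
# Illustrative enrichment test for a chemical fragment between two datasets.
from scipy.stats import binomtest


def fragment_enrichment(k_toxic, n_toxic, k_total, n_total):
    """k_toxic/n_toxic: fragment hits and size of the toxic set;
    k_total/n_total: fragment hits and size of the pooled dataset."""
    background_rate = k_total / n_total
    result = binomtest(k_toxic, n_toxic, background_rate, alternative="greater")
    return result.pvalue


# Fragment seen in 40 of 200 toxic compounds vs 60 of 1000 compounds overall (made up).
p = fragment_enrichment(40, 200, 60, 1000)
print(f"Enrichment p-value: {p:.3g}")
```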

  4. AUTOMATION OF REMEDY TICKETS CATEGORIZATION USING BUSINESS INTELLIGENCE TOOLS

    Directory of Open Access Journals (Sweden)

    DR. M RAJASEKHARA BABU

    2012-06-01

    Full Text Available The work log of an issue is often the primary source of information for predicting its cause. Mining patterns from work logs is an important issue-management task. This paper aims at developing an application which categorizes issues into problem areas using a clustering algorithm. The algorithm clusters the issues by mining patterns from the work log files. Standard reports can be generated for root cause analysis. The whole process is automated using Business Intelligence tools. This approach can help minimize the recurrence of issues by informing technical decision makers about the impact of the issues on the system and thus providing a permanent fix.
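    A minimal sketch of the clustering step, grouping work-log texts into problem areas with TF-IDF features and k-means, is shown below; the sample texts, vectorizer settings, and number of clusters are placeholders rather than the paper's configuration.

```python
# Cluster issue work logs into problem areas using TF-IDF features and k-means.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

work_logs = [
    "database connection timeout during nightly batch",
    "user cannot reset password, account locked",
    "batch job failed, database deadlock detected",
    "password expiry email not delivered",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(work_logs)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
for text, label in zip(work_logs, kmeans.labels_):
    print(label, "-", text)
```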

  5. Automated Menu Recommendation System Based on Past Preferences

    Directory of Open Access Journals (Sweden)

    Daniel Simon Sanz

    2014-08-01

    Full Text Available Data mining plays an important role in e-commerce in today's world. Time is critical when it comes to shopping, as options are unlimited and making a choice can be tedious. This study presents an application of data mining in the form of an Android application that can provide the user with automated suggestions based on past preferences. The application helps a person choose what food they might want to order in a specific restaurant. The application learns user behavior with each order - what they order for each kind of meal and which products they select together. After gathering enough information, the application can suggest to the user the most frequently selected dish, both in the recent past and since the application started learning. Applications such as these can play a major role in helping make a decision based on past preferences, thereby reducing the user's involvement in decision making.
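    The past-preference logic described above can be sketched as per-meal frequency counting; the order history below is invented for illustration, and the real application presumably tracks richer context.

```python
# Count past orders per meal type and suggest the most frequently chosen dish.
from collections import Counter, defaultdict


def build_profile(order_history):
    """order_history: list of (meal_type, dish) tuples."""
    profile = defaultdict(Counter)
    for meal_type, dish in order_history:
        profile[meal_type][dish] += 1
    return profile


def suggest(profile, meal_type):
    counts = profile.get(meal_type)
    return counts.most_common(1)[0][0] if counts else None


history = [("lunch", "pad thai"), ("lunch", "pad thai"), ("lunch", "green curry"),
           ("dinner", "ramen")]
profile = build_profile(history)
print("Lunch suggestion:", suggest(profile, "lunch"))
```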

  6. New solutions of mining tools for hard rock mining

    Energy Technology Data Exchange (ETDEWEB)

    Kotwica, K.; Dasgupta, S. [University of Mining and Metallurgy, Cracow (Poland). Dept. of Mining, Dressing and Transportation Machines

    2002-12-01

    This article presents new solutions of mining tools for hard rock mining and the results of tests carried out on the laboratory stand constructed at the University of Mining and Metallurgy in Cracow for cutting artificial rock samples with the new mining tools. New designs of rotary picks and non-symmetric disc cutters have been used. During the studies, pick edge wear, forces and mining effect were measured for several selected mining parameters. Results obtained with the new bell-type pick and disc cutters have proved very encouraging. 2 refs., 12 figs., 1 tab.

  7. Literature classification for semi-automated updating of biological knowledgebases

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Kudahl, Ulrich Johan; Winther, Ole;

    2013-01-01

    ... types of biological data, such as sequence data, are extensively stored in biological databases, functional annotations, such as immunological epitopes, are found primarily in semi-structured formats or free text embedded in primary scientific literature. Results: We defined and applied a machine ... abstracts yielded classification accuracy of 0.95, thus showing significant value in support of data extraction from the literature. Conclusion: We here propose a conceptual framework for semi-automated extraction of epitope data embedded in scientific literature using principles from text mining...

  8. Greater Buyer Effectiveness through Automation

    Science.gov (United States)

    1989-01-01

    FOB = free on board; FPAC = Federal Procurement Automation Council; FPDS = Federal Procurement Data System; 4GL = fourth generation language; GAO = General ... Procurement Automation Council (FPAC), entitled Compendium of Automated Procurement Systems in Federal Agencies. The FPAC inventory attempted to identify ... In some cases we have updated descriptions of systems identified by the FPAC study, but many of the newer systems are identified here for the first ...

  9. 78 FR 66039 - Modification of National Customs Automation Program Test Concerning Automated Commercial...

    Science.gov (United States)

    2013-11-04

    ... SECURITY U.S. Customs and Border Protection Modification of National Customs Automation Program Test... National Customs Automation Program (NCAP) test concerning the Simplified Entry functionality in the...'s (CBP's) National Customs Automation Program (NCAP) test concerning Automated...

  10. 77 FR 48527 - National Customs Automation Program (NCAP) Test Concerning Automated Commercial Environment (ACE...

    Science.gov (United States)

    2012-08-14

    ... SECURITY U.S. Customs and Border Protection National Customs Automation Program (NCAP) Test Concerning...: General notice. SUMMARY: This notice announces modifications to the National Customs Automation Program...) National Customs Automation Program (NCAP) test concerning Automated Commercial Environment...

  11. Circos: an information aesthetic for comparative genomics.

    Science.gov (United States)

    Krzywinski, Martin; Schein, Jacqueline; Birol, Inanç; Connors, Joseph; Gascoyne, Randy; Horsman, Doug; Jones, Steven J; Marra, Marco A

    2009-09-01

    We created a visualization tool called Circos to facilitate the identification and analysis of similarities and differences arising from comparisons of genomes. Our tool is effective in displaying variation in genome structure and, generally, any other kind of positional relationships between genomic intervals. Such data are routinely produced by sequence alignments, hybridization arrays, genome mapping, and genotyping studies. Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements. Circos is capable of displaying data as scatter, line, and histogram plots, heat maps, tiles, connectors, and text. Bitmap or vector images can be created from GFF-style data inputs and hierarchical configuration files, which can be easily generated by automated tools, making Circos suitable for rapid deployment in data analysis and reporting pipelines.
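    As a small illustration of generating Circos inputs from automated tools, the sketch below writes a karyotype file and a link file in the general Circos text layout. The coordinates and colors are placeholders, and the exact file formats should be checked against the Circos documentation before use.

```python
# Write a toy karyotype file and a toy link file that a Circos configuration
# could reference. Values below are placeholders for illustration only.
karyotype = [
    ("chr1", 0, 249_250_621, "red"),
    ("chr2", 0, 243_199_373, "blue"),
]
links = [
    ("chr1", 1_000_000, 1_050_000, "chr2", 5_000_000, 5_050_000),
]

with open("karyotype.txt", "w") as fh:
    for name, start, end, color in karyotype:
        # General karyotype layout: "chr - ID LABEL START END COLOR"
        fh.write(f"chr - {name} {name} {start} {end} {color}\n")

with open("links.txt", "w") as fh:
    for c1, s1, e1, c2, s2, e2 in links:
        # One link per line: two genomic intervals to be joined by a ribbon.
        fh.write(f"{c1} {s1} {e1} {c2} {s2} {e2}\n")

print("Wrote karyotype.txt and links.txt for a Circos configuration to reference.")
```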

  12. Microbial species delineation using whole genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Mukherjee, Supratim; Ivanova, Natalia; Mavrommatics, Kostas; Pati, Amrita; Konstantinidis, Konstantinos

    2014-10-20

    Species assignments in prokaryotes use a manual, poly-phasic approach utilizing both phenotypic traits and sequence information of phylogenetic marker genes. With thousands of genomes being sequenced every year, an automated, uniform and scalable approach exploiting the rich genomic information in whole genome sequences is desired, at least for the initial assignment of species to an organism. We have evaluated pairwise genome-wide Average Nucleotide Identity (gANI) values and alignment fractions (AFs) for nearly 13,000 genomes using our fast implementation of the computation, identifying robust and widely applicable hard cut-offs for species assignments based on AF and gANI. Using these cutoffs, we generated stable species-level clusters of organisms, which enabled the identification of several species mis-assignments and facilitated the assignment of species for organisms without species definitions.
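    The cut-off-based species grouping described above can be sketched as single-linkage clustering over pairwise gANI/AF values. The thresholds and pairwise numbers below are illustrative placeholders, not the cut-offs derived in the study.

```python
# Group genomes into species-level clusters: link two genomes whenever their
# pairwise gANI and alignment fraction (AF) both exceed hard cut-offs.
GANI_CUTOFF = 96.5   # assumed illustrative threshold (%)
AF_CUTOFF = 0.6      # assumed illustrative threshold


def species_clusters(genomes, pairwise):
    """pairwise: {(g1, g2): (gANI, AF)}; single-linkage grouping via union-find."""
    parent = {g: g for g in genomes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for (g1, g2), (gani, af) in pairwise.items():
        if gani >= GANI_CUTOFF and af >= AF_CUTOFF:
            parent[find(g1)] = find(g2)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return list(clusters.values())


genomes = ["A", "B", "C"]
pairwise = {("A", "B"): (98.2, 0.8), ("A", "C"): (80.1, 0.3), ("B", "C"): (79.5, 0.2)}
print(species_clusters(genomes, pairwise))
```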

  13. Data mining in healthcare: decision making and precision

    Directory of Open Access Journals (Sweden)

    Ionuţ ŢĂRANU

    2016-05-01

    Full Text Available The application of data mining in healthcare is increasing because the health sector is rich with information, and data mining has become a necessity. Healthcare organizations generate and collect large volumes of information on a daily basis. The use of information technology enables automation of data mining and knowledge discovery, revealing interesting patterns: it eliminates manual tasks, allows data to be extracted directly from electronic records, and supports a secure electronic transfer system for medical records that can save lives and reduce the cost of medical services, as well as enabling early detection of infectious diseases on the basis of advanced data collection. Data mining can enable healthcare organizations to anticipate trends in the patient's medical condition and behaviour by analyzing different perspectives and making connections between seemingly unrelated information. The raw data from healthcare organizations are voluminous and heterogeneous; they need to be collected and stored in an organized form, and their integration allows the formation of a unified medical information system. Data mining in health offers unlimited possibilities for analyzing data patterns that are less visible or hidden to common analysis techniques. These patterns can be used by healthcare practitioners to make forecasts, establish diagnoses, and set treatments for patients in healthcare organizations.

  14. World-wide distribution automation systems

    Energy Technology Data Exchange (ETDEWEB)

    Devaney, T.M.

    1994-12-31

    A worldwide power distribution automation system is outlined. Distribution automation is defined and the status of utility automation is discussed. Other topics discussed include a distribution management system, substation feeder, and customer functions, potential benefits, automation costs, planning and engineering considerations, automation trends, databases, system operation, computer modeling of system, and distribution management systems.

  15. Microfluidic system with integrated microinjector for automated Drosophila embryo injection.

    Science.gov (United States)

    Delubac, Daniel; Highley, Christopher B; Witzberger-Krajcovic, Melissa; Ayoob, Joseph C; Furbee, Emily C; Minden, Jonathan S; Zappe, Stefan

    2012-11-21

    Drosophila is one of the most important model organisms in biology. Knowledge derived from the recently sequenced 12 genomes of various Drosophila species can today be combined with the results of more than 100 years of research to systematically investigate Drosophila biology at the molecular level. In order to enable automated, high-throughput manipulation of Drosophila embryos, we have developed a microfluidic system based on a Pyrex-silicon-Pyrex sandwich structure with integrated, surface-micromachined silicon nitride injector for automated injection of reagents. Our system automatically retrieves embryos from an external reservoir, separates potentially clustered embryos through a sheath flow mechanisms, passively aligns an embryo with the integrated injector through geometric constraints, and pushes the embryo onto the injector through flow drag forces. Automated detection of an embryo at injection position through an external camera triggers injection of reagents and subsequent ejection of the embryo to an external reservoir. Our technology can support automated screens based on Drosophila embryos as well as creation of transgenic Drosophila lines. Apart from Drosophila embryos, the layout of our system can be easily modified to accommodate injection of oocytes, embryos, larvae, or adults of other species and fills an important technological gap with regard to automated manipulation of multicellular organisms.

  16. Automating CPM-GOMS

    Science.gov (United States)

    John, Bonnie; Vera, Alonso; Matessa, Michael; Freed, Michael; Remington, Roger

    2002-01-01

    CPM-GOMS is a modeling method that combines the task decomposition of a GOMS analysis with a model of human resource usage at the level of cognitive, perceptual, and motor operations. CPM-GOMS models have made accurate predictions about skilled user behavior in routine tasks, but developing such models is tedious and error-prone. We describe a process for automatically generating CPM-GOMS models from a hierarchical task decomposition expressed in a cognitive modeling tool called Apex. Resource scheduling in Apex automates the difficult task of interleaving the cognitive, perceptual, and motor resources underlying common task operators (e.g. mouse move-and-click). Apex's UI automatically generates PERT charts, which allow modelers to visualize a model's complex parallel behavior. Because interleaving and visualization is now automated, it is feasible to construct arbitrarily long sequences of behavior. To demonstrate the process, we present a model of automated teller interactions in Apex and discuss implications for user modeling. ... available to model human users, the Goals, Operators, Methods, and Selection (GOMS) method [6, 21] has been the most widely used, providing accurate, often zero-parameter, predictions of the routine performance of skilled users in a wide range of procedural tasks [6, 13, 15, 27, 28]. GOMS is meant to model routine behavior. The user is assumed to have methods that apply sequences of operators to achieve a goal. Selection rules are applied when there is more than one method to achieve a goal. Many routine tasks lend themselves well to such decomposition. Decomposition produces a representation of the task as a set of nested goal states that include an initial state and a final state. The iterative decomposition into goals and nested subgoals can terminate in primitives of any desired granularity, the choice of level of detail dependent on the predictions required. Although GOMS has proven useful in HCI, tools to support the ...

  17. AUTOMATED API TESTING APPROACH

    Directory of Open Access Journals (Sweden)

    SUNIL L. BANGARE

    2012-02-01

    Full Text Available Software testing is an investigation conducted to provide stakeholders with information about the quality of the product or service under test. With the help of software testing we can verify or validate the software product. Normally testing is done after the software has been developed, but testing can also be performed during the development process. This paper gives a brief introduction to an automated API testing tool. Such a testing tool removes a great deal of effort after the software has been developed, and it saves both time and money. This type of testing is also helpful in industry and in colleges.

  18. The automated medical office.

    Science.gov (United States)

    Petreman, M

    1990-08-01

    With shock and surprise many physicians learned in the 1980s that they must change the way they do business. Competition for patients, increasing government regulation, and the rapidly escalating risk of litigation forces physicians to seek modern remedies in office management. The author describes a medical clinic that strives to be paperless using electronic innovation to solve the problems of medical practice management. A computer software program to automate information management in a clinic shows that practical thinking linked to advanced technology can greatly improve office efficiency.

  19. [Automated anesthesia record system].

    Science.gov (United States)

    Zhu, Tao; Liu, Jin

    2005-12-01

    Based on a client/server architecture, an automated anesthesia record system running under the Windows operating system and over a network has been developed and programmed with Microsoft Visual C++ 6.0, Visual Basic 6.0 and SQL Server. The system manages the patient's information throughout anesthesia. It automatically collects and integrates data in real time from several kinds of medical equipment, such as monitors, infusion pumps and anesthesia machines, and then generates the anesthesia sheets automatically. The record system makes the anesthesia record more accurate and complete and can raise the anesthesiologist's working efficiency.

  20. Mapping extent and change in surface mines within the United States for 2001 to 2006

    Science.gov (United States)

    Soulard, Christopher E.; Acevedo, William; Stehman, Stephen V.; Parker, Owen P.

    2016-01-01

    A complete, spatially explicit dataset illustrating the 21st century mining footprint for the conterminous United States does not exist. To address this need, we developed a semi-automated procedure to map the country's mining footprint (30-m pixel) and establish a baseline to monitor changes in mine extent over time. The process uses mine seed points derived from the U.S. Energy Information Administration (EIA), U.S. Geological Survey (USGS) Mineral Resources Data System (MRDS), and USGS National Land Cover Dataset (NLCD) and recodes patches of barren land that meet a “distance to seed” requirement and a patch area requirement before mapping a pixel as mining. Seed points derived from EIA coal points, an edited MRDS point file, and 1992 NLCD mine points were used in three separate efforts using different distance and patch area parameters for each. The three products were then merged to create a 2001 map of moderate-to-large mines in the United States, which was subsequently manually edited to reduce omission and commission errors. This process was replicated using NLCD 2006 barren pixels as a base layer to create a 2006 mine map and a 2001–2006 mine change map focusing on areas with surface mine expansion. In 2001, 8,324 km2 of surface mines were mapped. The footprint increased to 9,181 km2 in 2006, representing a 10.3% increase over 5 years. These methods exhibit merit as a timely approach to generate wall-to-wall, spatially explicit maps representing the recent extent of a wide range of surface mining activities across the country.
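
    The patch-recoding rule described above (retain a barren patch only if it is large enough and lies close enough to a mine seed point) can be sketched with standard raster tools. A minimal illustration using NumPy and SciPy follows; the array names, thresholds, and pixel size are assumptions for the example, not the parameters used in the study.

```python
# Illustrative recoding of barren-land patches into a mine map.
# Inputs are placeholders: `barren` is a boolean raster of barren pixels,
# `seeds` a boolean raster of mine seed points (both on the same 30-m grid).
import numpy as np
from scipy import ndimage

def map_mines(barren, seeds, max_seed_dist_m=1000.0, min_patch_area_m2=9e4, pixel_m=30.0):
    # Distance (in metres) from every pixel to the nearest seed point.
    dist_to_seed = ndimage.distance_transform_edt(~seeds) * pixel_m

    # Label connected patches of barren land and measure their areas.
    labels, n = ndimage.label(barren)
    patch_area = ndimage.sum(barren, labels, index=np.arange(1, n + 1)) * pixel_m ** 2

    mine = np.zeros_like(barren, dtype=bool)
    for patch_id in range(1, n + 1):
        patch = labels == patch_id
        # Keep the patch if it is big enough and touches the seed buffer.
        if patch_area[patch_id - 1] >= min_patch_area_m2 and \
           dist_to_seed[patch].min() <= max_seed_dist_m:
            mine |= patch
    return mine

if __name__ == "__main__":
    # Tiny synthetic grid: a 100-pixel barren patch containing one seed point.
    barren = np.zeros((50, 50), dtype=bool); barren[10:20, 10:20] = True
    seeds = np.zeros((50, 50), dtype=bool); seeds[15, 15] = True
    print("mapped mine pixels:", map_mines(barren, seeds).sum())
```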

  1. Data mining in Cloud Computing

    Directory of Open Access Journals (Sweden)

    Ruxandra-Ştefania PETRE

    2012-10-01

    This paper describes how data mining is used in cloud computing. Data mining is used to extract potentially useful information from raw data, and the integration of data mining techniques into normal day-to-day activities has become commonplace. Every day people are confronted with targeted advertising, and data mining techniques help businesses become more efficient by reducing costs. Data mining techniques and applications are much needed in the cloud computing paradigm: implementing data mining through cloud computing allows users to retrieve meaningful information from a virtually integrated data warehouse, which reduces the costs of infrastructure and storage.

  2. Recent advances in genome-based polyketide discovery.

    Science.gov (United States)

    Helfrich, Eric J N; Reiter, Silke; Piel, Jörn

    2014-10-01

    Polyketides are extraordinarily diverse secondary metabolites of great pharmacological value and interesting ecological functions. The post-genomics era has led to fundamental changes in natural product research by inverting the workflow of secondary metabolite discovery. As opposed to traditional bioactivity-guided screenings, genome mining is an in silico method to screen and analyze sequenced genomes for natural product biosynthetic gene clusters. Since genes for known compounds can be recognized at the early computational stage, genome mining presents an opportunity for dereplication. This review highlights recent progress in bioinformatics, pathway engineering and chemical analytics to extract the biosynthetic secrets hidden in the genomes of both well-known natural product sources and previously neglected bacteria.

  3. Unsupervised Tensor Mining for Big Data Practitioners.

    Science.gov (United States)

    Papalexakis, Evangelos E; Faloutsos, Christos

    2016-09-01

    Multiaspect data are ubiquitous in modern Big Data applications. For instance, different aspects of a social network are the different types of communication between people, the time stamp of each interaction, and the location associated with each individual. How can we jointly model all those aspects and leverage the additional information that they introduce to our analysis? Tensors, which are multidimensional extensions of matrices, are a principled and mathematically sound way of modeling such multiaspect data. In this article, our goal is to popularize tensors and tensor decompositions to Big Data practitioners by demonstrating their effectiveness, outlining challenges that pertain to their application in Big Data scenarios, and presenting our recent work that tackles those challenges. We view this work as a step toward a fully automated, unsupervised tensor mining tool that can be easily and broadly adopted by practitioners in academia and industry.
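
    For readers who want a concrete feel for what a basic tensor decomposition looks like, here is a compact, illustrative CP (CANDECOMP/PARAFAC) decomposition by alternating least squares on a small synthetic 3-way tensor. It is a teaching sketch under simplifying assumptions, not the authors' tensor mining tool.

```python
# Minimal CP (CANDECOMP/PARAFAC) decomposition via alternating least squares.
# Synthetic data only; this is an illustrative sketch, not a production tensor-mining tool.
import numpy as np

def unfold(T, mode):
    """Matricize tensor T along `mode` (rows indexed by that mode)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def khatri_rao(A, B):
    """Column-wise Khatri-Rao product of A (I x R) and B (J x R) -> (I*J x R)."""
    R = A.shape[1]
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, R)

def cp_als(T, rank, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in T.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Khatri-Rao of the other two factors, ordered to match unfold().
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            factors[mode] = unfold(T, mode) @ np.linalg.pinv(kr.T)
    return factors

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A, B, C = (rng.standard_normal((d, 2)) for d in (4, 5, 6))
    T = np.einsum('ir,jr,kr->ijk', A, B, C)          # exact rank-2 ground truth
    Ah, Bh, Ch = cp_als(T, rank=2)
    T_hat = np.einsum('ir,jr,kr->ijk', Ah, Bh, Ch)
    print("relative reconstruction error:", np.linalg.norm(T - T_hat) / np.linalg.norm(T))
```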

  4. Clinical pertinence metric enables hypothesis-independent genome-phenome analysis for neurologic diagnosis.

    Science.gov (United States)

    Segal, Michael M; Abdellateef, Mostafa; El-Hattab, Ayman W; Hilbush, Brian S; De La Vega, Francisco M; Tromp, Gerard; Williams, Marc S; Betensky, Rebecca A; Gleeson, Joseph

    2015-06-01

    We describe an "integrated genome-phenome analysis" that combines both genomic sequence data and clinical information for genomic diagnosis. It is novel in that it uses robust diagnostic decision support and combines the clinical differential diagnosis and the genomic variants using a "pertinence" metric. This allows the analysis to be hypothesis-independent, not requiring assumptions about mode of inheritance, number of genes involved, or which clinical findings are most relevant. Using 20 genomic trios with neurologic disease, we find that pertinence scores averaging 99.9% identify the causative variant under conditions in which a genomic trio is analyzed and family-aware variant calling is done. The analysis takes seconds, and pertinence scores can be improved by clinicians adding more findings. The core conclusion is that automated genome-phenome analysis can be accurate, rapid, and efficient. We also conclude that an automated process offers a methodology for quality improvement of many components of genomic analysis.

  5. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    Science.gov (United States)

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/.

  6. String Mining in Bioinformatics

    Science.gov (United States)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has not been explicitly embraced in either the data mining or the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
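
    As a small illustration of the string-mining view of sequence analysis, the sketch below finds segments shared between two DNA strings by indexing k-mers; the sequences and the value of k are made up for the example.

```python
# Illustrative k-mer based detection of segments shared by two sequences.
# Sequences and k are arbitrary example values.
from collections import defaultdict

def shared_kmers(seq_a, seq_b, k=5):
    """Return {kmer: (positions in seq_a, positions in seq_b)} for shared k-mers."""
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)
    hits = {}
    for j in range(len(seq_b) - k + 1):
        kmer = seq_b[j:j + k]
        if kmer in index:
            hits.setdefault(kmer, (index[kmer], []))[1].append(j)
    return hits

if __name__ == "__main__":
    a = "ACGTACGTGGTTACGTA"
    b = "TTACGTACGTCC"
    for kmer, (pos_a, pos_b) in shared_kmers(a, b).items():
        print(kmer, "in A at", pos_a, "in B at", pos_b)
```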

  7. Automating quantum experiment control

    Science.gov (United States)

    Stevens, Kelly E.; Amini, Jason M.; Doret, S. Charles; Mohler, Greg; Volin, Curtis; Harter, Alexa W.

    2017-03-01

    The field of quantum information processing is rapidly advancing. As the control of quantum systems approaches the level needed for useful computation, the physical hardware underlying the quantum systems is becoming increasingly complex. It is already becoming impractical to manually code control for the larger hardware implementations. In this chapter, we will employ an approach to the problem of system control that parallels compiler design for a classical computer. We will start with a candidate quantum computing technology, the surface electrode ion trap, and build a system instruction language which can be generated from a simple machine-independent programming language via compilation. We incorporate compile time generation of ion routing that separates the algorithm description from the physical geometry of the hardware. Extending this approach to automatic routing at run time allows for automated initialization of qubit number and placement and additionally allows for automated recovery after catastrophic events such as qubit loss. To show that these systems can handle real hardware, we present a simple demonstration system that routes two ions around a multi-zone ion trap and handles ion loss and ion placement. While we will mainly use examples from transport-based ion trap quantum computing, many of the issues and solutions are applicable to other architectures.

  8. Automated Postediting of Documents

    CERN Document Server

    Knight, K; Knight, Kevin; Chander, Ishwar

    1994-01-01

    Large amounts of low- to medium-quality English texts are now being produced by machine translation (MT) systems, optical character readers (OCR), and non-native speakers of English. Most of this text must be postedited by hand before it sees the light of day. Improving text quality is tedious work, but its automation has not received much research attention. Anyone who has postedited a technical report or thesis written by a non-native speaker of English knows the potential of an automated postediting system. For the case of MT-generated text, we argue for the construction of postediting modules that are portable across MT systems, as an alternative to hardcoding improvements inside any one system. As an example, we have built a complete self-contained postediting module for the task of article selection (a, an, the) for English noun phrases. This is a notoriously difficult problem for Japanese-English MT. Our system contains over 200,000 rules derived automatically from online text resources. We report on l...

  9. Automated Test Case Generation

    CERN Document Server

    CERN. Geneva

    2015-01-01

    I would like to present the concept of automated test case generation. I work on it as part of my PhD and I think it would also be interesting for other people. It is also the topic of a workshop paper that I am presenting in Paris (abstract below). Please note that the talk itself will be more general and not about the specifics of my PhD, but about the broad field of automated test case generation. I will introduce the main approaches (combinatorial testing, symbolic execution, adaptive random testing) and their advantages and problems (oracle problem, combinatorial explosion, ...). Abstract of the paper: Over the last decade code-based test case generation techniques such as combinatorial testing or dynamic symbolic execution have seen growing research popularity. Most algorithms and tool implementations are based on finding assignments for input parameter values in order to maximise the execution branch coverage. Only a few of them consider dependencies from outside the Code Under Test's scope such...
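
    Of the approaches mentioned in the abstract, adaptive random testing is the easiest to illustrate: each new test input is chosen from a pool of random candidates so that it lies as far as possible from the tests already selected. The sketch below is a generic illustration over a numeric input domain, not tied to any particular tool; the domain bounds and candidate-pool size are arbitrary.

```python
# Illustrative adaptive random testing: pick each new test case as the random
# candidate farthest from all previously selected test cases.
import math
import random

def adaptive_random_tests(n_tests, domain=((0.0, 100.0), (0.0, 100.0)),
                          candidates_per_step=10, seed=42):
    rng = random.Random(seed)
    sample = lambda: tuple(rng.uniform(lo, hi) for lo, hi in domain)
    selected = [sample()]                         # first test is purely random
    while len(selected) < n_tests:
        pool = [sample() for _ in range(candidates_per_step)]
        # Choose the candidate whose nearest selected test is farthest away.
        best = max(pool, key=lambda c: min(math.dist(c, s) for s in selected))
        selected.append(best)
    return selected

if __name__ == "__main__":
    for test in adaptive_random_tests(5):
        print(tuple(round(x, 1) for x in test))
```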

  10. Maneuver Automation Software

    Science.gov (United States)

    Uffelman, Hal; Goodson, Troy; Pellegrin, Michael; Stavert, Lynn; Burk, Thomas; Beach, David; Signorelli, Joel; Jones, Jeremy; Hahn, Yungsun; Attiyah, Ahlam; Illsley, Jeannette

    2009-01-01

    The Maneuver Automation Software (MAS) automates the process of generating commands for maneuvers to keep the spacecraft of the Cassini-Huygens mission on a predetermined prime mission trajectory. Before MAS became available, a team of approximately 10 members had to work about two weeks to design, test, and implement each maneuver in a process that involved running many maneuver-related application programs and then serially handing off data products to other parts of the team. MAS enables a three-member team to design, test, and implement a maneuver in about one-half hour after Navigation has processed tracking data. MAS accepts more than 60 parameters and 22 files as input directly from users. MAS consists of Practical Extraction and Reporting Language (PERL) scripts that link, sequence, and execute the maneuver-related application programs: "Pushing a single button" on a graphical user interface causes MAS to run navigation programs that design a maneuver; programs that create sequences of commands to execute the maneuver on the spacecraft; and a program that generates predictions about maneuver performance and generates reports and other files that enable users to quickly review and verify the maneuver design. MAS can also generate presentation materials, initiate electronic command request forms, and archive all data products for future reference.

  11. Automated digital magnetofluidics

    Energy Technology Data Exchange (ETDEWEB)

    Schneider, J; Garcia, A A; Marquez, M [Harrington Department of Bioengineering Arizona State University, Tempe AZ 85287-9709 (United States)], E-mail: tony.garcia@asu.edu

    2008-08-15

    Drops can be moved in complex patterns on superhydrophobic surfaces using a reconfigured computer-controlled x-y metrology stage with a high degree of accuracy, flexibility, and reconfigurability. The stage employs a DMC-4030 controller which has a RISC-based, clock multiplying processor with DSP functions, accepting encoder inputs up to 22 MHz, provides servo update rates as high as 32 kHz, and processes commands at rates as fast as 40 milliseconds. A 6.35 mm diameter cylindrical NdFeB magnet is translated by the stage causing water drops to move by the action of induced magnetization of coated iron microspheres that remain in the drop and are attracted to the rare earth magnet through digital magnetofluidics. Water drops are easily moved in complex patterns in automated digital magnetofluidics at an average speed of 2.8 cm/s over a superhydrophobic polyethylene surface created by solvent casting. With additional components, some potential uses for this automated microfluidic system include characterization of superhydrophobic surfaces, water quality analysis, and medical diagnostics.

  12. Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    CERN Document Server

    Birkholtz, L -M; Wells, G; Grando, D; Joubert, F; Kasam, V; Zimmermann, M; Ortet, P; Jacq, N; Roy, S; Hoffmann-Apitius, M; Breton, V; Louw, A I; Maréchal, E

    2006-01-01

    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but using also the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences including compositionally atypical malaria sequences, 2) the high throughput reconstruction of molecular phylogenies, 3) the representation of biological processes particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained fro...

  13. The genome BLASTatlas - a GeneWiz extension for visualization of whole-genome homology

    DEFF Research Database (Denmark)

    Hallin, Peter Fischer; Binnewies, Tim Terence; Ussery, David

    2008-01-01

    The development of fast and inexpensive methods for sequencing bacterial genomes has led to a wealth of data, often with many genomes being sequenced of the same species or closely related organisms. Thus, there is a need for visualization methods that will allow easy comparison of many sequenced...... enabling automation of repeated tasks. This tool can be relevant in many pangenomic as well as in metagenomic studies, by giving a quick overview of clusters of insertion sites, genomic islands and overall homology between a reference sequence and a data set....

  14. Journey from Data Mining to Web Mining to Big Data

    OpenAIRE

    Gupta, Richa

    2014-01-01

    This paper describes the journey of big data, starting from data mining, to web mining, to big data. It discusses each of these methods in brief and also provides their applications. It states the importance of mining big data today using fast and novel approaches.

  15. Get smart! automate your house!

    NARCIS (Netherlands)

    Van Amstel, P.; Gorter, N.; De Rouw, J.

    2016-01-01

    This "designers' manual" is made during the TIDO-course AR0531 Innovation and Sustainability This manual will help you in reducing both energy usage and costs by automating your home. It gives an introduction to a number of home automation systems that every homeowner can install.

  16. Opening up Library Automation Software

    Science.gov (United States)

    Breeding, Marshall

    2009-01-01

    Throughout the history of library automation, the author has seen a steady advancement toward more open systems. In the early days of library automation, when proprietary systems dominated, the need for standards was paramount since other means of inter-operability and data exchange weren't possible. Today's focus on Application Programming…

  17. Classification of Automated Search Traffic

    Science.gov (United States)

    Buehrer, Greg; Stokes, Jack W.; Chellapilla, Kumar; Platt, John C.

    As web search providers seek to improve both relevance and response times, they are challenged by the ever-increasing tax of automated search query traffic. Third-party systems interact with search engines for a variety of reasons, such as monitoring a web site's rank, augmenting online games, or possibly maliciously altering click-through rates. In this paper, we investigate automated traffic (sometimes referred to as bot traffic) in the query stream of a large search engine provider. We define automated traffic as any search query not generated by a human in real time. We first provide examples of different categories of query logs generated by automated means. We then develop many different features that distinguish between queries generated by people searching for information and those generated by automated processes. We categorize these features into two classes: interpretations of a physical model of human interaction, and behavioral patterns of automated interaction. Using these detection features, we next classify the query stream using multiple binary classifiers. In addition, a multiclass classifier is then developed to identify subclasses of both normal and automated traffic. An active learning algorithm is used to suggest which user sessions to label to improve the accuracy of the multiclass classifier, while also seeking to discover new classes of automated traffic. A performance analysis is then provided. Finally, the multiclass classifier is used to predict the subclass distribution for the search query stream.
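
    A minimal version of the binary classification step might look like the sketch below, which trains a scikit-learn classifier on hypothetical per-session features (query rate, click-through rate, distinct-query ratio). The feature distributions and labels are synthetic placeholders, not data from the study.

```python
# Illustrative binary classifier for human vs. automated search sessions.
# Features and labels are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000

# Hypothetical session features: [queries per minute, click-through rate, distinct-query ratio]
human = np.column_stack([rng.normal(2, 1, n), rng.uniform(0.2, 0.8, n), rng.uniform(0.5, 1.0, n)])
bots = np.column_stack([rng.normal(30, 10, n), rng.uniform(0.0, 0.1, n), rng.uniform(0.0, 0.3, n)])

X = np.vstack([human, bots])
y = np.array([0] * n + [1] * n)          # 0 = human, 1 = automated

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test), target_names=["human", "automated"]))
```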

  18. Translation: Aids, Robots, and Automation.

    Science.gov (United States)

    Andreyewsky, Alexander

    1981-01-01

    Examines electronic aids to translation both as ways to automate it and as an approach to solve problems resulting from shortage of qualified translators. Describes the limitations of robotic MT (Machine Translation) systems, viewing MAT (Machine-Aided Translation) as the only practical solution and the best vehicle for further automation. (MES)

  19. Automated Methods Of Corrosion Measurements

    DEFF Research Database (Denmark)

    Bech-Nielsen, Gregers; Andersen, Jens Enevold Thaulov; Reeve, John Ch

    1997-01-01

    The chapter describes the following automated measurements: Corrosion Measurements by Titration, Imaging Corrosion by Scanning Probe Microscopy, Critical Pitting Temperature and Application of the Electrochemical Hydrogen Permeation Cell.

  20. Mining Available Data from the United States Environmental ...

    Science.gov (United States)

    Demands for quick and accurate life cycle assessments create a need for methods to rapidly generate reliable life cycle inventories (LCI). Data mining is a suitable tool for this purpose, especially given the large amount of available governmental data. These data are typically applied to LCIs on a case-by-case basis. As linked open data becomes more prevalent, it may be possible to automate LCI using data mining by establishing a reproducible approach for identifying, extracting, and processing the data. This work proposes a method for standardizing and eventually automating the discovery and use of publicly available data at the United States Environmental Protection Agency for chemical-manufacturing LCI. The method is developed using a case study of acetic acid. The data quality and gap analyses for the generated inventory found that the selected data sources can provide information with equal or better reliability and representativeness on air, water, hazardous waste, on-site energy usage, and production volumes but with key data gaps including material inputs, water usage, purchased electricity, and transportation requirements. A comparison of the generated LCI with existing data revealed that the data mining inventory is in reasonable agreement with existing data and may provide a more-comprehensive inventory of air emissions and water discharges. The case study highlighted challenges for current data management practices that must be overcome to successfu

  1. Imitating manual curation of text-mined facts in biomedicine.

    Directory of Open Access Journals (Sweden)

    Raul Rodriguez-Esteban

    2006-09-01

    Text-mining algorithms make mistakes in extracting facts from natural-language texts. In biomedical applications, which rely on the use of text-mined data, it is critical to assess the quality (the probability that the message is correctly extracted) of individual facts in order to resolve data conflicts and inconsistencies. Using a large set of almost 100,000 manually produced evaluations (most facts were independently reviewed more than once, producing independent evaluations), we implemented and tested a collection of algorithms that mimic human evaluation of facts provided by an automated information-extraction system. The performance of our best automated classifiers closely approached that of our human evaluators (ROC score close to 0.95). Our hypothesis is that, were we to use a larger number of human experts to evaluate any given sentence, we could implement an artificial-intelligence curator that would perform the classification job at least as accurately as an average individual human evaluator. We illustrated our analysis by visualizing the predicted accuracy of the text-mined relations involving the term cocaine.

  2. Data mining methods

    CERN Document Server

    Chattamvelli, Rajan

    2015-01-01

    DATA MINING METHODS, Second Edition discusses both the theoretical foundations and practical applications of data mining in a wide range of fields including banking, e-commerce, medicine, engineering and management. The book starts by introducing data and information, basic data types, data categories and applications of data mining. The second chapter briefly reviews data visualization technology and its importance in data mining. Fundamentals of probability and statistics are discussed in chapter 3, and a novel algorithm for sample covariances is derived. The next two chapters give an in-depth and useful discussion of data warehousing and OLAP. Decision trees are clearly explained and a new tabular method for decision tree building is discussed. The chapter on association rules discusses popular algorithms and compares various algorithms in summary table form. An interesting application of genetic algorithms is introduced in the next chapter. Foundations of neural networks are built from scratch and the back propagation algorithm is derived...

  3. Recent Developments of Genomic Research in Soybean

    Institute of Scientific and Technical Information of China (English)

    Ching Chan; Xinpeng Qi; Man-Wah Li; Fuk-Ling Wong; Hon-Ming Lam

    2012-01-01

    Soybean is an important cash crop with unique and important traits such as high seed protein and oil contents and the ability to perform symbiotic nitrogen fixation. A reference genome of cultivated soybean was established in 2010, followed by whole-genome re-sequencing of wild and cultivated soybean accessions. These efforts revealed unique features of the soybean genome and helped to understand its evolution. Mapping of variations between wild and cultivated soybean genomes was performed; these genomic variations may be related to the process of domestication and human selection. Wild soybean germplasms exhibit high genomic diversity and hence may be an important source of novel genes/alleles. Accumulation of genomic data will help to refine genetic maps and expedite the identification of functional genes. In this review, we summarize the major findings from the whole-genome sequencing projects and discuss their possible impacts on soybean research and breeding programs. Some emerging areas such as transcriptomic and epigenomic studies are introduced. In addition, we also tabulate some useful bioinformatics tools that will help the mining of soybean genomic data.

  4. Personal continuous route pattern mining

    Institute of Scientific and Technical Information of China (English)

    Qian YE; Ling CHEN; Gen-cai CHEN

    2009-01-01

    In daily life, people often repeat regular routes in certain periods. In this paper, a mining system is developed to find continuous route patterns in a person's past trips. To account for the diversity of personal movement, the mining system employs adaptive GPS data recording and five data filters to ensure clean trip data. The system uses a client/server architecture to protect personal privacy and to reduce the computational load: the server conducts the main mining procedure but with insufficient information to recover real personal routes. To improve the scalability of sequential pattern mining, a novel pattern mining algorithm, continuous route pattern mining (CRPM), is proposed. This algorithm can tolerate the different disturbances in real routes and extract the frequent patterns. Experimental results based on nine persons' trips show that CRPM can extract route patterns more than twice as long as those found by traditional route pattern mining algorithms.
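
    To give a concrete feel for what mining continuous (contiguous) route patterns involves, the sketch below counts contiguous subsequences of road segments that recur across several trips and keeps those above a support threshold. The trip data and threshold are invented, and this is a simplified illustration, not the CRPM algorithm itself.

```python
# Illustrative mining of frequent contiguous route patterns from trip sequences.
# Trips are lists of road-segment IDs; data and support threshold are made up.
from collections import Counter

def contiguous_patterns(trips, min_support=2, min_len=2):
    counts = Counter()
    for trip in trips:
        seen_in_trip = set()
        for i in range(len(trip)):
            for j in range(i + min_len, len(trip) + 1):
                seen_in_trip.add(tuple(trip[i:j]))
        counts.update(seen_in_trip)        # count each pattern at most once per trip
    return {p: c for p, c in counts.items() if c >= min_support}

if __name__ == "__main__":
    trips = [
        ["A", "B", "C", "D", "E"],
        ["X", "B", "C", "D", "Y"],
        ["A", "B", "C", "Z"],
    ]
    for pattern, support in sorted(contiguous_patterns(trips).items(), key=lambda kv: -kv[1]):
        print(pattern, "support =", support)
```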

  5. Scientific Data Mining in Astronomy

    OpenAIRE

    Borne, Kirk

    2009-01-01

    We describe the application of data mining algorithms to research problems in astronomy. We posit that data mining has always been fundamental to astronomical research, since data mining is the basis of evidence-based discovery, including classification, clustering, and novelty discovery. These algorithms represent a major set of computational tools for discovery in large databases, which will be increasingly essential in the era of data-intensive astronomy. Historical examples of data mining...

  6. Land reclamation beautifies coal mines

    Energy Technology Data Exchange (ETDEWEB)

    Coblentz, B. [MSU Ag Communications (United States)

    2009-07-15

    The article explains how the Mississippi Agricultural and Forestry Experiment Station (MAFES) has helped prepare land exploited by strip mining at North American Coal Corporation's Red Hills Mine. The 5,800 acre lignite mine is over 200 ft deep and uncovers six layers of coal. About 100 acres of land a year is mined and reclaimed, mostly as pine plantations. 5 photos.

  7. Road construction in underground mines

    Energy Technology Data Exchange (ETDEWEB)

    Benke, L.; Benkovics, I.

    1985-01-01

    The need for and reasons behind road construction for rubber-tyred vehicles in various mine sections are examined. A detailed analysis is given of the direct and indirect influences of underground haulage ways and transport roads on the parameters of mine performance. The various mine road construction technologies are overviewed. Experiences are presented with road construction in the Mecsek Ore Mines Company, Plant 3, Hungary. The cost factors of four construction technologies are compared.

  8. Data mining mobile devices

    CERN Document Server

    Mena, Jesus

    2013-01-01

    With today's consumers spending more time on their mobiles than on their PCs, new methods of empirical stochastic modeling have emerged that can provide marketers with detailed information about the products, content, and services their customers desire.Data Mining Mobile Devices defines the collection of machine-sensed environmental data pertaining to human social behavior. It explains how the integration of data mining and machine learning can enable the modeling of conversation context, proximity sensing, and geospatial location throughout large communities of mobile users

  9. Mining the Blazar Sky

    CERN Document Server

    Padovani, P; Padovani, Paolo; Giommi, Paolo

    2000-01-01

    We present the results of our methods to "mine" the blazar sky, i.e., select blazar candidates with very high efficiency. These are based on the cross-correlation between public radio and X-ray catalogs and have resulted in two surveys, the Deep X-ray Radio Blazar Survey (DXRBS) and the "Sedentary" BL Lac survey. We show that data mining is vital to select sizeable, deep samples of these rare active galactic nuclei and we touch upon the identification problems which deeper surveys will face.

  10. Data mining for dummies

    CERN Document Server

    Brown, Meta S

    2014-01-01

    Delve into your data for the key to success. Data mining is quickly becoming integral to creating value and business momentum. The ability to detect unseen patterns hidden in the numbers exhaustively generated by day-to-day operations allows savvy decision-makers to exploit every tool at their disposal in the pursuit of better business. By creating models and testing whether patterns hold up, it is possible to discover new intelligence that could change your business's entire paradigm for a more successful outcome. Data Mining for Dummies shows you why it doesn't take a data scientist to gain

  11. Mining the social mediome.

    Science.gov (United States)

    Asch, David A; Rader, Daniel J; Merchant, Raina M

    2015-09-01

    The experiences and behaviors revealed in our everyday lives provide as much insight into health and disease as any analysis of our genome could ever produce. These characteristics are not found in the genome, but may be revealed in our online activities, which make up our social mediome.

  12. WEB MINING BASED FRAMEWORK FOR ONTOLOGY LEARNING

    Directory of Open Access Journals (Sweden)

    C.Ramesh

    2015-07-01

    Today, the notion of the Semantic Web has emerged as a prominent solution to the problem of organizing the immense information provided by the World Wide Web, and its focus on supporting better co-operation between humans and machines is noteworthy. Ontology forms the major component in realizing the Semantic Web. However, manual ontology construction is time-consuming, costly, error-prone and inflexible to change; in addition, it requires the full participation of a knowledge engineer or domain expert. To address this issue, researchers have hoped that a semi-automatic or automatic process would result in faster and better ontology construction and enrichment. Ontology learning has recently become a major area of research, whose goal is to facilitate the construction of ontologies and reduce the effort of developing an ontology for a new domain. However, few research studies attempt to construct ontologies from semi-structured Web pages. In this paper, we present a complete framework for ontology learning that facilitates the semi-automatic construction and enrichment of web site ontologies from semi-structured Web pages. The proposed framework employs Web content mining and Web usage mining to extract conceptual relationships from the Web. The main idea is to incorporate the web author's ideas as well as web users' intentions in the development and evolution of the ontology.

  13. Automated Standard Hazard Tool

    Science.gov (United States)

    Stebler, Shane

    2014-01-01

    The current system used to generate standard hazard reports is considered cumbersome and iterative. This study defines a structure for this system's process in a clear, algorithmic way so that standard hazard reports and basic hazard analysis may be completed using a centralized, web-based computer application. To accomplish this task, a test server is used to host a prototype of the tool during development. The prototype is configured to integrate easily into NASA's current server systems with minimal alteration. Additionally, the tool is easily updated and provides NASA with a system that may grow to accommodate future requirements and, possibly, different applications. The success of this project is reflected in positive, subjective reviews completed by payload providers and NASA Safety and Mission Assurance personnel. Ideally, this prototype will increase interest in the concept of standard hazard automation and lead to the full-scale production of a user-ready application.

  14. Robust automated knowledge capture.

    Energy Technology Data Exchange (ETDEWEB)

    Stevens-Adams, Susan Marie; Abbott, Robert G.; Forsythe, James Chris; Trumbo, Michael Christopher Stefan; Haass, Michael Joseph; Hendrickson, Stacey M. Langfitt

    2011-10-01

    This report summarizes research conducted through the Sandia National Laboratories Robust Automated Knowledge Capture Laboratory Directed Research and Development project. The objective of this project was to advance scientific understanding of the influence of individual cognitive attributes on decision making. The project developed a quantitative model known as RumRunner that has proven effective in predicting the propensity of an individual to shift strategies on the basis of task- and experience-related parameters. Three separate studies are described which have validated the basic RumRunner model. This work provides a basis for better understanding human decision making in high-consequence national security applications and, in particular, the individual characteristics that underlie adaptive thinking.

  15. [From automation to robotics].

    Science.gov (United States)

    1985-01-01

    The introduction of automation into the biology laboratory seems to be unavoidable. But at what cost, if it is necessary to purchase a new machine for every new application? Fortunately the same image processing techniques, belonging to a theoretical framework called Mathematical Morphology, may be used in visual inspection tasks, both in the car industry and in the biology lab. Since the market for industrial robotics applications is much larger than the market for biomedical applications, the price of image processing devices drops, and sometimes becomes less than the price of a complete microscope setup. The power of the image processing methods of Mathematical Morphology will be illustrated by various examples, such as automatic silver grain counting in autoradiography, determination of HLA genotype, electrophoretic gel analysis, automatic screening of cervical smears... Thus several heterogeneous applications may share the same image processing device, provided there is a separate and dedicated workstation for each of them.

  16. Automated electronic filter design

    CERN Document Server

    Banerjee, Amal

    2017-01-01

    This book describes a novel, efficient and powerful scheme for designing and evaluating the performance characteristics of any electronic filter designed to predefined specifications. The author explains techniques that enable readers to eliminate the complicated manual, and thus error-prone and time-consuming, steps of traditional design techniques. The presentation includes a demonstration of efficient automation using an ANSI C language program, which accepts any filter design specification (e.g. Chebyshev low-pass filter, cut-off frequency, pass-band ripple, etc.) as input and generates as output a SPICE (Simulation Program with Integrated Circuit Emphasis) format netlist. Readers can then use this netlist to run simulations with any version of the popular SPICE simulator, increasing the accuracy of the final results without violating any of the key principles of the traditional design scheme.
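
    As a loose analogue of turning a filter specification into something simulatable, the sketch below uses SciPy to design an analog Chebyshev type-I low-pass filter from a cut-off frequency and pass-band ripple and prints its transfer-function coefficients. It does not emit a SPICE netlist as the book's tool does, and all parameter values are illustrative.

```python
# Illustrative analog filter design from a specification (not the book's ANSI C tool).
# Spec values below are arbitrary examples.
import numpy as np
from scipy import signal

order = 4                  # filter order
ripple_db = 1.0            # pass-band ripple in dB
cutoff_hz = 1_000.0        # cut-off frequency in Hz

# Analog Chebyshev type-I low-pass design; Wn is the angular cut-off (rad/s).
b, a = signal.cheby1(order, ripple_db, 2 * np.pi * cutoff_hz, btype="low", analog=True)
print("numerator coefficients:  ", b)
print("denominator coefficients:", a)

# Check the gain at a few frequencies.
freqs_hz = np.array([100.0, 1_000.0, 10_000.0])
w, h = signal.freqs(b, a, worN=2 * np.pi * freqs_hz)
for f, mag in zip(freqs_hz, 20 * np.log10(np.abs(h))):
    print(f"gain at {f:7.1f} Hz: {mag:7.2f} dB")
```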

  17. 76 FR 70075 - Proximity Detection Systems for Continuous Mining Machines in Underground Coal Mines

    Science.gov (United States)

    2011-11-10

    ... Mining Machines in Underground Coal Mines AGENCY: Mine Safety and Health Administration, Labor. ACTION... addressing Proximity Detection Systems for Continuous Mining Machines in Underground Coal Mines. This... Continuous Mining Machines in Underground Coal Mines. MSHA conducted hearings on October 18, October...

  18. Automated Motivic Analysis

    DEFF Research Database (Denmark)

    Lartillot, Olivier

    2016-01-01

    of the successive notes and intervals, various sets of musical parameters may be invoked. In this chapter, a method is presented that allows for these heterogeneous patterns to be discovered. Motivic repetition with local ornamentation is detected by reconstructing, on top of “surface-level” monodic voices, longer...... for lossless compression. The structural complexity resulting from successive repetitions of patterns can be controlled through a simple modelling of cycles. Generally, motivic patterns cannot always be defined solely as sequences of descriptions in a fixed set of dimensions: throughout the descriptions......-term relations between non-adjacent notes related to deeper structures, and by tracking motives on the resulting syntagmatic network. These principles are integrated into a computational framework, the MiningSuite, developed in Matlab....

  19. Automated Quantitative Rare Earth Elements Mineralogy by Scanning Electron Microscopy

    Science.gov (United States)

    Sindern, Sven; Meyer, F. Michael

    2016-09-01

    Increasing industrial demand of rare earth elements (REEs) stems from the central role they play for advanced technologies and the accelerating move away from carbon-based fuels. However, REE production is often hampered by the chemical, mineralogical as well as textural complexity of the ores with a need for better understanding of their salient properties. This is not only essential for in-depth genetic interpretations but also for a robust assessment of ore quality and economic viability. The design of energy and cost-efficient processing of REE ores depends heavily on information about REE element deportment that can be made available employing automated quantitative process mineralogy. Quantitative mineralogy assigns numeric values to compositional and textural properties of mineral matter. Scanning electron microscopy (SEM) combined with a suitable software package for acquisition of backscatter electron and X-ray signals, phase assignment and image analysis is one of the most efficient tools for quantitative mineralogy. The four different SEM-based automated quantitative mineralogy systems, i.e. FEI QEMSCAN and MLA, Tescan TIMA and Zeiss Mineralogic Mining, which are commercially available, are briefly characterized. Using examples of quantitative REE mineralogy, this chapter illustrates capabilities and limitations of automated SEM-based systems. Chemical variability of REE minerals and analytical uncertainty can reduce performance of phase assignment. This is shown for the REE phases parisite and synchysite. In another example from a monazite REE deposit, the quantitative mineralogical parameters surface roughness and mineral association derived from image analysis are applied for automated discrimination of apatite formed in a breakdown reaction of monazite and apatite formed by metamorphism prior to monazite breakdown. SEM-based automated mineralogy fulfils all requirements for characterization of complex unconventional REE ores that will become

  20. Automated Essay Scoring

    Directory of Open Access Journals (Sweden)

    Semire DIKLI

    2006-01-01

    The impacts of computers on writing have been widely studied for three decades. Even basic computer functions, i.e. word processing, have been of great assistance to writers in modifying their essays. Research on Automated Essay Scoring (AES) has revealed that computers have the capacity to function as a more effective cognitive tool (Attali, 2004). AES is defined as the computer technology that evaluates and scores written prose (Shermis & Barrera, 2002; Shermis & Burstein, 2003; Shermis, Raymat, & Barrera, 2003). Revision and feedback are essential aspects of the writing process. Students need to receive feedback in order to increase their writing quality. However, responding to student papers can be a burden for teachers; particularly if they have large numbers of students and assign frequent writing assignments, providing individual feedback on student essays can be quite time consuming. AES systems can be very useful because they can provide the student with a score as well as feedback within seconds (Page, 2003). Four types of AES systems are widely used by testing companies, universities, and public schools: Project Essay Grader (PEG), Intelligent Essay Assessor (IEA), E-rater, and IntelliMetric. AES is a developing technology. Many AES systems are used to overcome time, cost, and generalizability issues in writing assessment. The accuracy and reliability of these systems have been proven to be high. The search for excellence in machine scoring of essays is continuing, and numerous studies are being conducted to improve the effectiveness of AES systems.

  1. Genomic Sequence Comparisons, 1987-2003 Final Report

    Energy Technology Data Exchange (ETDEWEB)

    George M. Church

    2004-07-29

    This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988) and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC, then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

  2. Genome-scale genetic engineering in Escherichia coli.

    Science.gov (United States)

    Jeong, Jaehwan; Cho, Namjin; Jung, Daehee; Bang, Duhee

    2013-11-01

    Genome engineering has been developed to create useful strains for biological studies and industrial uses. However, a continuous challenge remained in the field: technical limitations in high-throughput screening and precise manipulation of strains. Today, technical improvements have made genome engineering more rapid and efficient. This review introduces recent advances in genome engineering technologies applied to Escherichia coli as well as multiplex automated genome engineering (MAGE), a recent technique proposed as a powerful toolkit due to its straightforward process, rapid experimental procedures, and highly efficient properties.

  3. Internet technologies in the mining industry. Towards unattended mining systems

    Energy Technology Data Exchange (ETDEWEB)

    Krzykawski, Michal [FAMUR Group, Katowice (Poland)

    2009-08-27

    Global suppliers of longwall systems focus mainly on maximising the efficiency of the equipment they manufacture. Given the fact that, since 2004, coal demand on world markets has been constantly on the increase, even during an economic downturn, this endeavour seems fully justified. However, it should be remembered that maximum efficiency must be accompanied by maximum safety of all underground operations. This statement is based on the belief that the mining industry, which exploits increasingly deep and dangerous coal beds, faces the necessity to implement comprehensive IT systems for managing all mining processes and, in the near future, to use unmanned mining systems, fully controllable from the mine surface. The computerisation of mines is an indispensable element of the development of the world mining industry, a belief which has been put into practice with e-mine, developed by the FAMUR Group. (orig.)

  4. Comparative genomic paleontology across plant kingdom reveals the dynamics of TE-driven genome evolution.

    Science.gov (United States)

    El Baidouri, Moaine; Panaud, Olivier

    2013-01-01

    Long terminal repeat-retrotransposons (LTR-RTs) are the most abundant class of transposable elements (TEs) in plants. They strongly impact the structure, function, and evolution of their host genome, and, in particular, their role in genome size variation has been clearly established. However, the dynamics of the process through which LTR-RTs have differentially shaped plant genomes is still poorly understood because of a lack of comparative studies. Using a new robust and automated family classification procedure, we exhaustively characterized the LTR-RTs in eight plant genomes for which a high-quality sequence is available (i.e., Arabidopsis thaliana, A. lyrata, grapevine, soybean, rice, Brachypodium distachyon, sorghum, and maize). This allowed us to perform a comparative genome-wide study of the retrotranspositional landscape in these eight plant lineages from both monocots and dicots. We show that retrotransposition has recurrently occurred in all plant genomes investigated, regardless of their size, and through bursts rather than a continuous process. Moreover, in each genome, only one or a few LTR-RT families have been active in the recent past, and the difference in genome size among the species studied could thus mostly be accounted for by the extent of the latest transpositional burst(s). Following these bursts, LTR-RTs are efficiently eliminated from their host genomes through recombination and deletion, but we show that the removal rate is not lineage specific. These new findings lead us to propose a new model of TE-driven genome evolution in plants.

  5. Simplified process model discovery based on role-oriented genetic mining.

    Science.gov (United States)

    Zhao, Weidong; Liu, Xi; Dai, Weihui

    2014-01-01

    Process mining is the automated acquisition of process models from event logs. Although many process mining techniques have been developed, most of them are based on control flow. Meanwhile, the existing role-oriented process mining methods focus on the correctness and integrity of roles while ignoring the role complexity of the process model, which directly impacts the understandability and quality of the model. To address these problems, we propose a genetic programming approach to mine simplified process models. Using a new metric of process complexity in terms of roles as the fitness function, we can find simpler process models. The new role complexity metric of process models is designed from role cohesion and coupling, and applied to discover roles in process models. Moreover, the higher fitness derived from the role complexity metric also provides a guideline for redesigning process models. Finally, we conduct a case study and experiments, comparing with related studies, to show that the proposed method is more effective at streamlining the process.
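
    A fitness function built from role cohesion and coupling could, in spirit, look like the toy sketch below, where coupling counts hand-overs between roles observed in the log and cohesion counts consecutive work that stays inside one role. The scoring formula and example data are hypothetical illustrations, not the metric defined in the paper.

```python
# Toy role-complexity fitness: reward cohesive roles, penalize hand-overs between roles.
# The formula and example data are hypothetical illustrations.

def role_fitness(role_of, traces):
    """role_of: activity -> role; traces: list of activity sequences from an event log."""
    same_role = cross_role = 0
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            if role_of[a] == role_of[b]:
                same_role += 1          # consecutive work stays inside one role (cohesion)
            else:
                cross_role += 1         # hand-over between roles (coupling)
    total = same_role + cross_role
    if total == 0:
        return 0.0
    return same_role / total - cross_role / total   # higher = simpler in this toy metric

if __name__ == "__main__":
    role_of = {"register": "clerk", "check": "clerk", "approve": "manager", "pay": "finance"}
    traces = [["register", "check", "approve", "pay"],
              ["register", "check", "check", "approve", "pay"]]
    print("fitness:", round(role_fitness(role_of, traces), 3))
```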

  6. GPS based checking survey and precise DEM development in Open mine

    Institute of Scientific and Technical Information of China (English)

    XU Ai-gong

    2008-01-01

    The checking survey in an open mine is one of the most frequent and important tasks; it forms the connecting link between open mine planning and production. The traditional checking method has disadvantages such as long time consumption, heavy workload, a complicated calculation process, and a low level of automation. We used GPS and GIS technologies to systematically study the core issues of checking surveys in open mines. A detailed GPS data acquisition coding scheme is presented, and based on this scheme an algorithm for semi-automatic computer cartography was developed. Three methods for eliminating gross errors from the raw data needed to create the DEM are discussed. Two algorithms were developed and implemented to create a fine DEM model of the open mine under constrained conditions and to update the model dynamically. A precision analysis and evaluation of the created model were carried out.

  7. Automated methods of predicting the function of biological sequences using GO and BLAST

    Directory of Open Access Journals (Sweden)

    Baumann Ute

    2005-11-01

    Background: With the exponential increase in genomic sequence data there is a need to develop automated approaches to deducing the biological functions of novel sequences with high accuracy. Our aim is to demonstrate how accuracy benchmarking can be used in a decision-making process evaluating competing designs of biological function predictors. We utilise the Gene Ontology (GO), a directed acyclic graph of functional terms, to annotate sequences with functional information describing their biological context. Initially we examine the effect on accuracy scores of increasing the allowed distance between predicted terms and a test set of curator-assigned terms. Next we evaluate several annotator methods using accuracy benchmarking. Given an unannotated sequence we use the Basic Local Alignment Search Tool (BLAST) to find similar sequences that have already been assigned GO terms by curators. A number of methods were developed that utilise terms associated with the best five matching sequences. These methods were compared against a benchmark method of simply using terms associated with the best BLAST-matched sequence (the best BLAST approach). Results: The precision and recall of estimates increase rapidly as the amount of distance permitted between a predicted term and a correct term assignment increases. Accuracy benchmarking allows a comparison of annotation methods. A covering-graph approach performs poorly, except where the term assignment rate is high. A term-distance concordance approach has similar accuracy to the best BLAST approach, demonstrating lower precision but higher recall. However, a discriminant function method has higher precision and recall than the best BLAST approach and the other methods shown here. Conclusion: Allowing term predictions to be counted correct if closely related to a correct term decreases the reliability of the accuracy score. As such we recommend using accuracy measures that require exact matching of predicted
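
    The idea of counting a prediction as correct when it lies within a given distance of a curator-assigned term can be illustrated on a small directed acyclic graph of GO-like terms. The sketch below uses networkx and made-up term identifiers; it is an illustration of the scoring idea, not the evaluation code from the paper.

```python
# Illustrative accuracy scoring where a predicted term counts as correct if it lies
# within `max_dist` edges of a curated term in a toy ontology DAG. Term IDs are made up.
import networkx as nx

# Toy ontology: edges point from child term to parent term.
ontology = nx.DiGraph([
    ("GO:0001", "GO:0000"),   # hypothetical terms
    ("GO:0002", "GO:0000"),
    ("GO:0003", "GO:0001"),
    ("GO:0004", "GO:0001"),
])
undirected = ontology.to_undirected()

def within_distance(predicted, curated, max_dist):
    try:
        return nx.shortest_path_length(undirected, predicted, curated) <= max_dist
    except nx.NetworkXNoPath:
        return False

def precision_at_distance(predictions, curated_terms, max_dist):
    correct = sum(
        any(within_distance(p, c, max_dist) for c in curated_terms)
        for p in predictions
    )
    return correct / len(predictions) if predictions else 0.0

if __name__ == "__main__":
    predicted = ["GO:0003", "GO:0002"]
    curated = ["GO:0004"]
    for d in (0, 1, 2, 3):
        print(f"allowed distance {d}: precision = {precision_at_distance(predicted, curated, d):.2f}")
```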

  8. High-content screening of functional genomic libraries.

    Science.gov (United States)

    Rines, Daniel R; Tu, Buu; Miraglia, Loren; Welch, Genevieve L; Zhang, Jia; Hull, Mitchell V; Orth, Anthony P; Chanda, Sumit K

    2006-01-01

    Recent advances in functional genomics have enabled genome-wide genetic studies in mammalian cells. These include the establishment of high-throughput transfection and viral propagation methodologies, the production of large-scale cDNA and siRNA libraries, and the development of sensitive assay detection processes and instrumentation. The latter has been significantly facilitated by the implementation of automated microscopy and quantitative image analysis, collectively referred to as high-content screening (HCS), toward cell-based functional genomics applications. This technology can be applied to whole-genome analysis of discrete molecular and phenotypic events at the level of individual cells and promises to significantly expand the scope of functional genomic analyses in mammalian cells. This chapter provides a comprehensive guide for curating and preparing functional genomics libraries and performing HCS at the level of the genome.

  9. Frequent pattern mining

    CERN Document Server

    Aggarwal, Charu C

    2014-01-01

    Proposes numerous methods to solve some of the most fundamental problems in data mining and machine learning Presents various simplified perspectives, providing a range of information to benefit both students and practitioners Includes surveys on key research content, case studies and future research directions

  10. Mining Your Own Data

    Science.gov (United States)

    Clark, Maurice

    2014-05-01

    Conducting asteroid photometry frequently requires imaging one area of the sky for many hours. Apart from the asteroid being studied, there may be many other objects of interest buried in the data. The value of mining your own asteroid data is discussed, using examples from observations made by the author, primarily at the Preston Gott Observatory at Texas Tech University.

  11. Contextual Text Mining

    Science.gov (United States)

    Mei, Qiaozhu

    2009-01-01

    With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with all kinds of contextual information. Those contexts can be explicit, such as the time and the location where a blog article is written, and the…

  12. Computer monitors mine conditions

    Energy Technology Data Exchange (ETDEWEB)

    Brezovec, D.

    1981-08-01

    At Cape Breton Development Corp's No. 26 Colliery in Canada, a Transmitton microprocessor-based system monitors methane concentrations, air velocities and pressures, fan vibration, machine temperatures and pump pressures continuously. Longwall mining at the colliery operating under the ocean is briefly described.

  13. Coal Mines Security System

    Directory of Open Access Journals (Sweden)

    Ankita Guhe

    2012-05-01

    Full Text Available Geological conditions in mines are extremely complicated and present many hidden hazards. Coal is illicitly lifted by musclemen from coal stocks, coal washeries, and coal transfer and loading points, and also along transport routes by tampering with truck weighing. CIL (Coal India Ltd) is under the control of the mafia, and a large number of irregularities can be attributed to the coal mafia. The Intelligent Coal Mine Security System described here uses a data acquisition approach that combines sensor, automatic detection, communication and microcontroller technologies to capture the operational parameters of the mining area. The data acquisition terminal is built around a PIC 16F877A microcontroller for sensing and communicates with the main control machine over an RS232 interface, realising intelligent monitoring. The data management system uses an EEPROM chip as a black box to store data permanently and a CCTV camera to record conditions underground. The system implements real-time monitoring and display of underground data; query, deletion and maintenance of historical data; graphical statistics; report printing; expert diagnosis; and decision-making support. Its research, development and wider application will safeguard mine pit control with accuracy, real-time capability and high reliability.
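
    A minimal sketch of the kind of host-side acquisition loop such a system implies: the microcontroller terminal streams readings over RS232 and the monitoring PC logs them and raises alarms on threshold violations. The frame format, thresholds, port name and use of the pyserial library are assumptions made for illustration; the paper's own protocol is not specified here.

```python
# Illustrative host-side monitor for a serial-connected data-acquisition terminal.
# Assumes newline-terminated ASCII frames like "CH4=0.8,TEMP=31.2,LOAD=12.5"
# (frame format, thresholds and port are hypothetical).
import serial  # pyserial

THRESHOLDS = {"CH4": 1.25, "TEMP": 60.0}   # example alarm limits

def parse_frame(frame: str) -> dict:
    readings = {}
    for field in frame.strip().split(","):
        if "=" in field:
            key, value = field.split("=", 1)
            try:
                readings[key] = float(value)
            except ValueError:
                pass  # ignore malformed fields
    return readings

def monitor(port="COM3", baud=9600):
    with serial.Serial(port, baud, timeout=2) as link, open("mine_log.csv", "a") as log:
        while True:
            frame = link.readline().decode("ascii", errors="replace")
            if not frame:
                continue  # read timed out, no data this cycle
            readings = parse_frame(frame)
            log.write(frame.strip() + "\n")
            for key, limit in THRESHOLDS.items():
                if readings.get(key, 0.0) > limit:
                    print(f"ALARM: {key}={readings[key]} exceeds {limit}")

if __name__ == "__main__":
    monitor()
```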

  14. Design and implementation of data mining tools

    CERN Document Server

    Thuraisingham, Bhavani; Awad, Mamoun

    2009-01-01

    DATA MINING TECHNIQUES AND APPLICATIONS: Introduction; Trends; Data Mining Techniques and Applications; Data Mining for Cyber Security: Intrusion Detection; Data Mining for Web: Web Page Surfing Prediction; Data Mining for Multimedia: Image Classification; Organization of This Book; Next Steps. DATA MINING TECHNIQUES: Introduction; Overview of Data Mining Tasks and Techniques; Artificial Neural Networks; Support Vector Machines; Markov Model; Association Rule Mining (ARM); Multiclass Problem; Image Mining; Summary. DATA MINING APPLICATIONS: Introduction; Intrusion Detection; Web Page Surfing Prediction; Image Classification; Summary. DATA MI...

  15. The Automatic Drilling System of 6R-2P Mining Drill Jumbos

    Directory of Open Access Journals (Sweden)

    Yujun Wang

    2015-02-01

    Full Text Available In order to improve the efficiency of underground mining and tunneling operations and to realize automatic drilling, it is necessary to develop an automation system for large drill jumbos. This work focuses on one such mining drill jumbo, which is effectively a redundant robotic manipulator with eight degrees of freedom: six revolute joints and two prismatic joints. To realize autonomous drilling, algorithms are proposed to calculate the desired pose of the end-effector and to solve the inverse kinematics of the drill jumbo, which is one of the key issues in developing the automation system. A control strategy is then proposed to independently control the eight joint variables using PID feedback control. The simulation model is developed in Simulink. Because the closed-loop controllers for the individual joints are local and independent of each other, the system as a whole is not under closed-loop feedback control. In order to estimate the possible maximum pose error, an analysis of the pose error caused by errors in the joint variables is conducted. The results are satisfactory for mining applications, and the developed automation system is being applied in the drill jumbos built by Mining Technologies International Inc.
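
    A minimal sketch of the joint-level control structure described above: eight independent PID loops, one per joint variable, each closed locally around its own measurement. The gains, setpoints and the toy first-order plant are placeholder assumptions, not the jumbo's actual dynamics.

```python
# Independent PID loops for eight joint variables (6 revolute + 2 prismatic),
# each controller local to its own joint, as in the scheme described above.
# Gains, setpoints and the toy first-order plant are illustrative assumptions.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def simulate(setpoints, steps=500, dt=0.01):
    """Drive each joint toward its setpoint with a crude first-order response."""
    controllers = [PID(kp=8.0, ki=2.0, kd=0.5, dt=dt) for _ in setpoints]
    positions = [0.0] * len(setpoints)
    for _ in range(steps):
        for i, (pid, target) in enumerate(zip(controllers, setpoints)):
            command = pid.step(target, positions[i])
            positions[i] += command * dt   # toy plant: joint velocity = command
    return positions

if __name__ == "__main__":
    # Desired values for the eight joint variables (radians / metres, hypothetical).
    targets = [0.5, -0.3, 1.2, 0.0, 0.8, -1.0, 1.5, 0.4]
    print(simulate(targets))
```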

  16. Strategies for complete plastid genome sequencing.

    Science.gov (United States)

    Twyford, Alex D; Ness, Rob W

    2016-10-28

    Plastid sequencing is an essential tool in the study of plant evolution. This high-copy organelle is one of the most technically accessible regions of the genome, and its sequence conservation makes it a valuable region for comparative genome evolution, phylogenetic analysis and population studies. Here, we discuss recent innovations and approaches for de novo plastid assembly that harness genomic tools. We focus on technical developments including low-cost sequence library preparation approaches for genome skimming, enrichment via hybrid baits and methylation-sensitive capture, sequence platforms with higher read outputs and longer read lengths, and automated tools for assembly. These developments allow for a much more streamlined assembly than via conventional short-range PCR. Although newer methods make complete plastid sequencing possible for any land plant or green alga, there are still challenges for producing finished plastomes particularly from herbarium material or from structurally divergent plastids such as those of parasitic plants.

  17. LTR retrotransposons contribute to genomic gigantism in plethodontid salamanders.

    Science.gov (United States)

    Sun, Cheng; Shepard, Donald B; Chong, Rebecca A; López Arriaza, José; Hall, Kathryn; Castoe, Todd A; Feschotte, Cédric; Pollock, David D; Mueller, Rachel Lockridge

    2012-01-01

    Among vertebrates, most of the largest genomes are found within the salamanders, a clade of amphibians that includes 613 species. Salamander genome sizes range from ~14 to ~120 Gb. Because genome size is correlated with nucleus and cell sizes, as well as other traits, morphological evolution in salamanders has been profoundly affected by genomic gigantism. However, the molecular mechanisms driving genomic expansion in this clade remain largely unknown. Here, we present the first comparative analysis of transposable element (TE) content in salamanders. Using high-throughput sequencing, we generated genomic shotgun data for six species from the Plethodontidae, the largest family of salamanders. We then developed a pipeline to mine TE sequences from shotgun data in taxa with limited genomic resources, such as salamanders. Our summaries of overall TE abundance and diversity for each species demonstrate that TEs make up a substantial portion of salamander genomes, and that all of the major known types of TEs are represented in salamanders. The most abundant TE superfamilies found in the genomes of our six focal species are similar, despite substantial variation in genome size. However, our results demonstrate a major difference between salamanders and other vertebrates: salamander genomes contain much larger amounts of long terminal repeat (LTR) retrotransposons, primarily Ty3/gypsy elements. Thus, the extreme increase in genome size that occurred in salamanders was likely accompanied by a shift in TE landscape. These results suggest that increased proliferation of LTR retrotransposons was a major molecular mechanism contributing to genomic expansion in salamanders.
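
    For orientation, a small sketch of the kind of per-superfamily abundance summary such a pipeline produces, assuming a hypothetical table of classified shotgun reads; it is not the authors' pipeline.

```python
# Summarise transposable-element abundance from a table of classified shotgun reads.
# Assumes a tab-separated file with columns read_id <TAB> TE_superfamily
# (e.g. "Ty3/gypsy", "L1", or "none"); the file and its format are hypothetical.
from collections import Counter

def te_abundance(path, total_reads=None):
    counts = Counter()
    n = 0
    with open(path) as fh:
        for line in fh:
            n += 1
            read_id, superfamily = line.rstrip("\n").split("\t")[:2]
            if superfamily != "none":
                counts[superfamily] += 1
    total = total_reads or n
    return {fam: c / total for fam, c in counts.most_common()}

if __name__ == "__main__":
    for family, fraction in te_abundance("read_classifications.tsv").items():
        print(f"{family}\t{fraction:.3%}")
```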

  18. Mining Class-Correlated Patterns for Sequence Labeling

    Science.gov (United States)

    Hopf, Thomas; Kramer, Stefan

    Sequence labeling is the task of assigning a label sequence to an observation sequence. Since many methods to solve this problem depend on the specification of predictive features, automated methods for their derivation are desirable. Unlike in other areas of pattern-based classification, however, no algorithm to directly mine class-correlated patterns for sequence labeling has been proposed so far. We introduce the novel task of mining class-correlated sequence patterns for sequence labeling and present a supervised pattern growth algorithm to find all patterns in a set of observation sequences, which correlate with the assignment of a fixed sequence label no less than a user-specified minimum correlation constraint. From the resulting set of patterns, features for a variety of classifiers can be obtained in a straightforward manner. The efficiency of the approach and the influence of important parameters are shown in experiments on several biological datasets.
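
    A brute-force stand-in for the mining step described above: enumerate contiguous k-grams of the observation sequences and keep those whose phi correlation with a fixed class label meets a user-specified minimum. The paper's method is a pattern-growth algorithm over labeled positions; the sequence-level labels, k-gram candidates and the choice of the phi coefficient here are simplifying assumptions.

```python
# Keep k-gram patterns whose phi correlation with a binary class label meets a
# minimum threshold. This brute-force, sequence-level version only illustrates
# the correlation constraint; it is not the paper's pattern-growth algorithm.
from math import sqrt

def phi(n11, n10, n01, n00):
    """Phi coefficient of a 2x2 table (pattern present x label positive)."""
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0

def candidate_kgrams(sequences, k):
    grams = set()
    for seq in sequences:
        grams.update(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))
    return grams

def mine_correlated_patterns(sequences, labels, k=2, min_corr=0.3):
    """Return {pattern: phi} for k-grams correlated with the positive label."""
    results = {}
    for gram in candidate_kgrams(sequences, k):
        n11 = n10 = n01 = n00 = 0
        for seq, label in zip(sequences, labels):
            present = any(tuple(seq[i:i + k]) == gram for i in range(len(seq) - k + 1))
            if present and label:
                n11 += 1
            elif present:
                n10 += 1
            elif label:
                n01 += 1
            else:
                n00 += 1
        score = phi(n11, n10, n01, n00)
        if score >= min_corr:
            results[gram] = score
    return results

if __name__ == "__main__":
    seqs = ["ACGTAC", "ACGG", "TTGCA", "GGTTA"]   # toy observation sequences
    labels = [1, 1, 0, 0]                          # hypothetical binary class labels
    print(mine_correlated_patterns(seqs, labels, k=2, min_corr=0.5))
```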

  19. Mining for Strategic Competitive Intelligence Foundations and Applications

    CERN Document Server

    Ziegler, Cai-Nicolas

    2012-01-01

    The textbook at hand aims to provide an introduction to the use of automated methods for gathering strategic competitive intelligence. The text does not describe a single research discipline in its own right, such as machine learning or Web mining; rather, it contemplates an application scenario, namely the gathering of knowledge that is of paramount importance to organizations, e.g., companies and corporations. To this end, the book first summarizes the range of research disciplines that contribute to addressing the issue, extracting from each those elements that are of utmost relevance to the depicted application scope. Moreover, the book presents systems that put these techniques to practical use (e.g., reputation monitoring platforms) and takes an inductive approach to defining the gestalt of mining for strategic competitive intelligence by selecting major use cases that are laid out and explained in detail. These pieces form the first part of the book. Each of those use cases is backed by a nu...

  20. Mechatronics in the mining industrie. With (development) method towards success; Mechatronik im Bergbau. Mit (Entwicklungs-) Methode zum Erfolg

    Energy Technology Data Exchange (ETDEWEB)

    Brandt, Thorsten; Bruckmann, Tobias [Mercatronics GmbH, Duisburg (Germany)

    2009-10-01

    Germany is a high-wage country. Internationally competitive extraction of raw materials in Germany can therefore only be ensured by highly efficient working processes. Tackling the associated extreme requirements on roadway drivage, coal winning and transport equipment has earned the German mining industry and its suppliers the role of an international technology leader. To safeguard this position in the future, the successful mechanisation of the industry will now be followed by its mechatronisation. Efficiency will be increased by (partial) automation and assistance systems. This contribution is the first in a series of articles that explain the principles of mechatronic development methods in the mining industry and aim to make development engineers in the mines aware of the high potential of mechatronics. (orig.)

  1. Visualizing data mining results with the Brede tools

    DEFF Research Database (Denmark)

    Nielsen, Finn Årup

    2009-01-01

    A few neuroinformatics databases now exist that record results from neuroimaging studies in the form of brain coordinates in stereotaxic space. The Brede Toolbox was originally developed to extract, analyze and visualize data from one of them, the BrainMap database. Since then the Brede Toolbox has expanded and now includes its own database with coordinates along with ontologies for brain regions and functions: the Brede Database. With the Brede Toolbox and Database combined we set up automated workflows for extraction of data, mass meta-analytic data mining and visualizations. Most of the Web...

  2. FSRM: A Fast Algorithm for Sequential Rule Mining

    Directory of Open Access Journals (Sweden)

    Anjali Paliwal

    2014-10-01

    Full Text Available Recent developments in computing and automation technologies have resulted in the computerization of business and scientific applications in various areas. Turning the massive amounts of accumulated information into knowledge is attracting researchers in numerous domains, such as databases, machine learning and statistics. From the viewpoint of data researchers, the emphasis is on discovering meaningful patterns hidden in massive data sets. Hence, a central issue for knowledge discovery in databases, and the main focus of this paper, is to develop efficient and scalable mining algorithms as integrated tools for database management systems.

  3. Survey of the Euro Currency Fluctuation by Using Data Mining

    Directory of Open Access Journals (Sweden)

    M. Baan

    2013-08-01

    Full Text Available Data mining, or Knowledge Discovery in Databases (KDD), is a new field in information technology that emerged from progress in the creation and maintenance of large databases, combining statistical and artificial-intelligence methods with database management. Data mining is used to recognize hidden patterns and provide relevant information for decision making on complex problems where conventional methods are inefficient or too slow. Data mining can be used as a powerful tool to predict future trends and behaviors, and this prediction allows proactive, knowledge-driven decisions in businesses. Since the automated prospective analyses offered by data mining move beyond the analyses of past events provided by retrospective tools, it can answer business questions that are traditionally time-consuming to resolve. Because of this advantage, it is of growing interest to government, industry and commerce. In this paper we use this tool to investigate Euro currency fluctuation. For this investigation we use three different algorithms: K*, IBk and MLP, and we extract Euro currency volatility using the same criteria for all algorithms. The dataset used has 21,084 records and was collected from daily price fluctuations of the Euro in the period from 10/2006 to 04/2010.
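
    A rough sketch of the lagged-feature setup with scikit-learn stand-ins for two of the algorithms named above (IBk corresponds to k-nearest neighbours and MLP to a multilayer perceptron; K* has no direct scikit-learn equivalent). The CSV file name, column name and five-day lag window are assumptions for illustration.

```python
# Sketch: predict the next day's EUR rate from a window of previous days, using
# scikit-learn stand-ins for two of the paper's Weka algorithms (IBk ~ k-NN,
# MLP ~ MLPRegressor). File name, column and 5-day lag window are hypothetical.
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

def lagged_dataset(prices, window=5):
    X = np.array([prices[i:i + window] for i in range(len(prices) - window)])
    y = np.array(prices[window:])
    return X, y

def evaluate(model, X_train, X_test, y_train, y_test):
    model.fit(X_train, y_train)
    return np.mean(np.abs(model.predict(X_test) - y_test))   # mean absolute error

if __name__ == "__main__":
    prices = pd.read_csv("euro_daily.csv")["rate"].to_numpy()   # hypothetical file/column
    X, y = lagged_dataset(prices)
    # keep chronological order: do not shuffle the time series
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
    knn = KNeighborsRegressor(n_neighbors=5)
    mlp = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
    for name, model in [("k-NN (IBk analogue)", knn), ("MLP", mlp)]:
        print(name, "MAE:", evaluate(model, X_train, X_test, y_train, y_test))
```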

  4. Mining-Induced Coal Permeability Change Under Different Mining Layouts

    Science.gov (United States)

    Zhang, Zetian; Zhang, Ru; Xie, Heping; Gao, Mingzhong; Xie, Jing

    2016-09-01

    To comprehensively understand the mining-induced coal permeability change, a series of laboratory unloading experiments are conducted based on a simplifying assumption of the actual mining-induced stress evolution processes of three typical longwall mining layouts in China, i.e., non-pillar mining (NM), top-coal caving mining (TCM) and protective coal-seam mining (PCM). A theoretical expression of the mining-induced permeability change ratio (MPCR) is derived and validated by laboratory experiments and in situ observations. The mining-induced coal permeability variation under the three typical mining layouts is quantitatively analyzed using the MPCR based on the test results. The experimental results show that the mining-induced stress evolution processes of different mining layouts do have an influence on the mechanical behavior and evolution of MPCR of coal. The coal mass in the PCM simulation has the lowest stress concentration but the highest peak MPCR (approximately 4000 %), whereas the opposite trends are observed for the coal mass under NM. The results of the coal mass under TCM fall between those for PCM and NM. The evolution of the MPCR of coal under different layouts can be divided into three sections, i.e., stable increasing section, accelerated increasing section and reducing section, but the evolution processes are slightly different for the different mining layouts. A coal bed gas intensive extraction region is recommended based on the MPCR distribution of coal seams obtained by simplifying assumptions and the laboratory testing results. The presented results are also compared with existing conventional triaxial compression test results to fully comprehend the effect of actual mining-induced stress evolution on coal property tests.
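
    The abstract does not reproduce the derived MPCR expression. For orientation only, peak values of roughly 4000 % are consistent with a simple relative-change definition of the form below, which is an assumed generic formulation rather than the authors' theoretical result (k is the current permeability, k_0 the initial pre-mining permeability).

```latex
% Assumed generic form of a mining-induced permeability change ratio (MPCR):
% relative change of current permeability k with respect to the initial
% (pre-mining) permeability k_0, expressed as a percentage.
\[
\mathrm{MPCR} = \frac{k - k_0}{k_0} \times 100\,\%
\]
```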

  5. Automated alignment-based curation of gene models in filamentous fungi

    NARCIS (Netherlands)

    Burgt, van der A.; Severing, E.I.; Collemare, J.A.R.; Wit, de P.J.G.M.

    2014-01-01

    Background Automated gene-calling is still an error-prone process, particularly for the highly plastic genomes of fungal species. Improvement through quality control and manual curation of gene models is a time-consuming process that requires skilled biologists and is only marginally performed. The

  6. Automated analysis and annotation of basketball video

    Science.gov (United States)

    Saur, Drew D.; Tan, Yap-Peng; Kulkarni, Sanjeev R.; Ramadge, Peter J.

    1997-01-01

    Automated analysis and annotation of video sequences are important for digital video libraries, content-based video browsing and data mining projects. A successful video annotation system should provide users with a useful video content summary in a reasonable processing time. Given the wide variety of video genres available today, automatically extracting meaningful video content for annotation still remains hard using currently available techniques. However, a wide range of video has inherent structure, so some prior knowledge about the video content can be exploited to improve our understanding of the high-level video semantic content. In this paper, we develop tools and techniques for analyzing structured video by using the low-level information available directly from MPEG compressed video. Being able to work directly in the video compressed domain can greatly reduce the processing time and enhance storage efficiency. As a testbed, we have developed a basketball annotation system which combines the low-level information extracted from the MPEG stream with prior knowledge of basketball video structure to provide high-level content analysis, annotation and browsing for events such as wide-angle and close-up views, fast breaks, steals, potential shots, number of possessions and possession times. We expect our approach can also be extended to structured video in other domains.
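
    As a simplified stand-in for the compressed-domain analysis described above, the sketch below flags candidate scene cuts by thresholding frame-to-frame colour-histogram differences with OpenCV on decoded frames; the paper itself works directly on MPEG compressed-domain features, and the video file name and threshold here are assumptions.

```python
# Simplified stand-in for compressed-domain video structuring: flag candidate
# scene cuts by thresholding frame-to-frame colour-histogram similarity.
# Video path and threshold are illustrative assumptions.
import cv2

def detect_cuts(video_path, threshold=0.5):
    cap = cv2.VideoCapture(video_path)
    cuts, prev_hist, frame_idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
        hist = cv2.normalize(hist, hist).flatten()
        if prev_hist is not None:
            # correlation near 1 means similar frames; a sharp drop suggests a cut
            similarity = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL)
            if similarity < threshold:
                cuts.append(frame_idx)
        prev_hist = hist
        frame_idx += 1
    cap.release()
    return cuts

if __name__ == "__main__":
    print(detect_cuts("basketball.mpg"))
```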

  7. Automated CD-SEM metrology for efficient TD and HVM

    Science.gov (United States)

    Starikov, Alexander; Mulapudi, Satya P.

    2008-03-01

    CD-SEM is the metrology tool of choice for patterning process development and production process control. We can make these applications more efficient by extracting more information from each CD-SEM image. This enables direct monitoring of key process parameters, such as lithography dose and focus, or prediction of processing outcomes, such as etched dimensions or electrical parameters. Automating CD-SEM recipes at the early stages of process development can accelerate technology characterization, segmentation of variance and process improvements. This leverages the engineering effort, reduces development costs and helps to manage the risks inherent in new technology. Automating CD-SEM for manufacturing enables efficient operations. A novel SEM Alarm Time Indicator (SATI) makes this task manageable. SATI pulls together data mining, trend charting of the key recipe and Operations (OPS) indicators, Pareto analysis of OPS losses and inputs for root cause analysis. This approach proved natural to our FAB personnel. After minimal initial training, we applied the new methods in 65 nm FLASH manufacturing. This resulted in significant lasting improvements in CD-SEM recipe robustness, portability and automation, increased CD-SEM capacity and MT productivity.

  8. Building Automation Using Wired Communication.

    Directory of Open Access Journals (Sweden)

    Ms. Supriya Gund*,

    2014-04-01

    Full Text Available In this paper, we present the design and implementation of a building automation system based on wired LAN communication technology. The paper mainly focuses on controlling home appliances remotely and providing security when the user is away. The system provides an ideal solution to problems faced by home owners in daily life: it provides security against intrusion and automates various home appliances over the LAN. To demonstrate the feasibility and effectiveness of the proposed system, devices such as a fire sensor, gas sensor, panic switch and intruder switch, along with a smart card, were developed and evaluated with the building automation system. These techniques are merged into a single building automation system that offers a complete, low-cost, powerful and user-friendly way of real-time monitoring and remote control of a building.

  9. Evolution of Home Automation Technology

    Directory of Open Access Journals (Sweden)

    Mohd. Rihan

    2009-01-01

    Full Text Available In modern society home and office automation has become increasingly important, providing ways to interconnect various home appliances. This interconnection results in faster transfer of information within homes/offices, leading to better home management and an improved user experience. Home automation, in essence, is a technology that integrates the various electrical systems of a home to provide enhanced comfort and security. Users are granted convenient and complete control over all the electrical home appliances and are relieved from tasks that previously required manual control. This paper tracks the development of home automation technology over the last two decades. Various home automation technologies are explained briefly, giving a chronological account of the evolution of one of the most talked-about technologies of recent times.

  10. Home automation with Intel Galileo

    CERN Document Server

    Dundar, Onur

    2015-01-01

    This book is for anyone who wants to learn Intel Galileo for home automation and cross-platform software development. No knowledge of programming with Intel Galileo is assumed, but knowledge of the C programming language is essential.

  11. Automating the Purple Crow Lidar

    Directory of Open Access Journals (Sweden)

    Hicks Shannon

    2016-01-01

    Full Text Available The Purple Crow LiDAR (PCL) was built to measure short- and long-term coupling between the lower, middle, and upper atmosphere. The initial component of my MSc. project is to automate two key elements of the PCL: the rotating liquid mercury mirror and the Zaber alignment mirror. In addition to the automation of the Zaber alignment mirror, it is also necessary to describe the mirror’s movement and positioning errors. Its properties will then be added into the alignment software. Once the alignment software has been completed, we will compare the new alignment method with the previous manual procedure. This is the first among several projects that will culminate in a fully-automated lidar. Eventually, we will be able to work remotely, thereby increasing the amount of data we collect. This paper will describe the motivation for automation, the methods we propose, preliminary results for the Zaber alignment error analysis, and future work.

  12. Network based automation for SMEs

    DEFF Research Database (Denmark)

    Shahabeddini Parizi, Mohammad; Radziwon, Agnieszka

    2017-01-01

    The implementation of appropriate automation concepts that increase productivity in Small and Medium Sized Enterprises (SMEs) requires a lot of effort, due to their limited resources. Therefore, it is strongly recommended for small firms to open up to external sources of knowledge, which...... automation solutions. The empirical data collection involved applying a combination of the comparative case study method with action research elements. This article provides an outlook on the challenges in implementing technological improvements and the way they could be resolved in collaboration...... with other members of the same regional ecosystem. The findings highlight two main automation-related areas where manufacturing SMEs could leverage external sources of knowledge: assistance in defining the automation problem as well as appropriate solution and provider selection. Consequently...

  13. National Automated Conformity Inspection Process -

    Data.gov (United States)

    Department of Transportation — The National Automated Conformity Inspection Process (NACIP) Application is intended to expedite the workflow process as it pertains to the FAA Form 81 0-10 Request...

  14. Listeria Genomics

    Science.gov (United States)

    Cabanes, Didier; Sousa, Sandra; Cossart, Pascale

    The opportunistic intracellular foodborne pathogen Listeria monocytogenes has become a paradigm for the study of host-pathogen interactions and bacterial adaptation to mammalian hosts. Analysis of L. monocytogenes infection has provided considerable insight into how bacteria invade cells, move intracellularly, and disseminate in tissues, as well as tools to address fundamental processes in cell biology. Moreover, the vast amount of knowledge that has been gathered through in-depth comparative genomic analyses and in vivo studies makes L. monocytogenes one of the most well-studied bacterial pathogens. This chapter provides an overview of progress in the exploration of genomic, transcriptomic, and proteomic data in Listeria spp. to understand genome evolution and diversity, as well as physiological aspects of metabolism used by bacteria when growing in diverse environments, in particular in infected hosts.

  15. Marine genomics

    DEFF Research Database (Denmark)

    Oliveira Ribeiro, Ângela Maria; Foote, Andrew D.; Kupczok, Anne

    2017-01-01

    Marine ecosystems occupy 71% of the surface of our planet, yet we know little about their diversity. Although the inventory of species is continually increasing, as registered by the Census of Marine Life program, only about 10% of the estimated two million marine species are known. This lag... High-throughput sequencing approaches have been helping to improve our knowledge of marine biodiversity, from the rich microbial biota that forms the base of the tree of life to a wealth of plant and animal species. In this review, we present an overview of the applications of genomics to the study of marine life, from the evolutionary biology of non-model organisms to species of commercial relevance for fishing, aquaculture and biomedicine. Instead of providing an exhaustive list of available genomic data, we rather set out to present contextualized examples that best represent the current status of the field of marine genomics.

  16. Evolution of Home Automation Technology

    OpenAIRE

    Mohd. Rihan; M. Salim Beg

    2009-01-01

    In modern society home and office automation has become increasingly important, providing ways to interconnect various home appliances. This interconnection results in faster transfer of information within homes/offices, leading to better home management and an improved user experience. Home automation, in essence, is a technology that integrates the various electrical systems of a home to provide enhanced comfort and security. Users are granted convenient and complete control over all the electrical home appl...

  17. Technology modernization assessment flexible automation

    Energy Technology Data Exchange (ETDEWEB)

    Bennett, D.W.; Boyd, D.R.; Hansen, N.H.; Hansen, M.A.; Yount, J.A.

    1990-12-01

    The objectives of this report are: to present technology assessment guidelines to be considered, in conjunction with defense regulations, before an automation project is developed; to give examples showing how the assessment guidelines may be applied to a current project; and to present several potential areas where automation might be applied successfully in the depot system. Depots perform primarily repair and remanufacturing operations, with limited small-batch manufacturing runs. While certain activities (such as Management Information Systems and warehousing) are directly applicable to either environment, the majority of applications will require combining existing and emerging technologies in different ways to meet the special needs of the depot remanufacturing environment. Industry generally enjoys the ability to revise its product lines seasonally, followed by batch runs of thousands or more. Depot batch runs are in the tens, at best the hundreds, of parts, with a potential for large variation in product mix; reconfiguration may be required on a week-to-week basis. This need for a higher degree of flexibility suggests a higher level of operator interaction and, in turn, control systems that go beyond the state of the art for less flexible automation and industry in general. This report investigates the benefits and barriers to automation and concludes that, while significant benefits do exist, depots must be prepared to carefully investigate the technical feasibility of each opportunity and the life-cycle costs associated with implementation. Implementation is suggested in two ways: (1) develop an implementation plan for automation technologies based on the results of small demonstration automation projects; (2) use phased implementation for both these and later-stage automation projects to allow major technical and administrative risk issues to be addressed. 10 refs., 2 figs., 2 tabs. (JF)

  18. Aprendizaje automático

    OpenAIRE

    Moreno, Antonio

    1994-01-01

    This book introduces the basic concepts of one of the most actively studied branches of artificial intelligence: machine learning. It covers topics such as inductive learning, analogical reasoning, explanation-based learning, neural networks, genetic algorithms, case-based reasoning and theoretical approaches to machine learning.

  19. 2015 Chinese Intelligent Automation Conference

    CERN Document Server

    Li, Hongbo

    2015-01-01

    Proceedings of the 2015 Chinese Intelligent Automation Conference presents selected research papers from the CIAC’15, held in Fuzhou, China. The topics include adaptive control, fuzzy control, neural network based control, knowledge based control, hybrid intelligent control, learning control, evolutionary mechanism based control, multi-sensor integration, failure diagnosis, reconfigurable control, etc. Engineers and researchers from academia, industry and the government can gain valuable insights into interdisciplinary solutions in the field of intelligent automation.

  20. Automated Supernova Discovery (Abstract)

    Science.gov (United States)

    Post, R. S.

    2015-12-01

    (Abstract only) We are developing a system of robotic telescopes for automatic recognition of supernovae as well as other transient events, in collaboration with the Puckett Supernova Search Team. At the SAS2014 meeting, the discovery program, SNARE, was first described. Since then, it has been continuously improved to handle searches under a wide variety of atmospheric conditions. Currently, two telescopes are used to build a reference library while searching for PSNs with a partial library. Since data are taken on every cloud-free night, we must deal with varying atmospheric conditions and high background illumination from the moon. The software is configured to identify a PSN and re-image it for verification, with options to change the run plan to acquire photometric or spectroscopic data. The telescopes are 24-inch CDK24s with Alta U230 cameras, one in CA and one in NM. Images and run plans are sent between sites so the CA telescope can search while photometry is done in NM. Our goal is to find bright PSNs of magnitude 17.5 or brighter, which is the limit of our planned spectroscopy. We present results from our first automated PSN discoveries and plans for PSN data acquisition.
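
    The abstract does not detail SNARE's detection algorithm. As a generic illustration of the reference-library idea, the sketch below subtracts an aligned reference image from a new exposure and flags pixels whose residual exceeds a sigma threshold as transient candidates; the FITS file names and the threshold are assumptions.

```python
# Generic difference-imaging sketch: subtract an aligned reference image from a
# new exposure and flag pixels whose residual exceeds a sigma threshold as
# transient candidates. Not the SNARE pipeline; file names are hypothetical.
import numpy as np
from astropy.io import fits

def transient_candidates(new_path, ref_path, nsigma=5.0):
    new = fits.getdata(new_path).astype(float)
    ref = fits.getdata(ref_path).astype(float)
    diff = new - ref                      # assumes the images are already registered
    sigma = np.std(diff)
    ys, xs = np.where(diff > nsigma * sigma)
    return list(zip(xs.tolist(), ys.tolist()))

if __name__ == "__main__":
    candidates = transient_candidates("new_exposure.fits", "reference.fits")
    print(f"{len(candidates)} candidate pixels above threshold")
```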